Web titans race to put books online

Worth A Look, Disruption, Frontier Centre

A race has begun to make "all human knowledge" accessible with just a few clicks of a computer mouse.

Two technology giants are charting different courses to rush the content of the world's libraries onto the Internet and make it freely available to everyone on-line.

The latest effort, announced yesterday and led by Yahoo Inc., aims to have complete electronic copies of thousands of literary classics, videos and speeches ready for download before the year's end.

The Internet services firm is drawing on a team of partners, including the University of Toronto, to bring a higher purpose to a technology that in its first decade of popular use has been used to peddle everything from Pez candy dispensers to porn.

The goal is to "share, use and expand all human knowledge," said David Mandelbrot, vice-president of search content at Yahoo.

The Sunnyvale, Calif.-based company, together with several archive organizations, technology firms and academic institutions, has created an international consortium called the Open Content Alliance that is working to expand the reaches of the Web.

Last year, Yahoo's nemesis, Google Inc., began scanning books as part of a massive effort to create its own searchable and freely accessible global electronic library. Where Google has incurred the wrath of many authors who claim the company is violating copyright rules, Yahoo says the Open Content Alliance will only scan material that is out of copyright or has the approval of its creator.

"It's really important that the content creators have a say in how their material is used on-line," Mr. Mandelbrot said.

Last month, the Authors Guild, which represents 8,000 writers in the United States, filed suit against Google, claiming "massive copyright infringement." Google says its Library Project falls under the "fair use" provisions of copyright law. In addition, the company says authors are able to opt out of its project.

The efforts of both Yahoo and Google could feed each other, said Carole Moore, chief librarian at the University of Toronto.

In the past decade, people have come to expect newspapers and magazines to be available electronically. It's only natural that they should want books on-line, not necessarily to read, but to search, Ms. Moore said.

Yahoo's involvement with the Open Content Alliance is less proprietary than Google's early efforts with its library project. Yahoo is initially funding only the scanning and electronic conversion of 18,000 pieces of classic U.S. literature. Although Mr. Mandelbrot said the company is also developing a new search engine for the digital library, it won't be shouldering all the costs. "Our announcement today is really a call for participation," Mr. Mandelbrot said. "We're hoping for more funding from other organizations."

The U of T has been digitizing content for the past decade, using federal funding to scan some 20,000 books that are out of copyright. A year ago, the library began experimenting with an automated scanning process in collaboration with the San Francisco-based Internet Archive. The project has cut copying costs to about 10 cents a page today, about a tenth of what they were a year ago, Ms. Moore said.

The Internet Archive, which has already amassed a huge collection of old Web pages, is storing the library content, which is indexed with Yahoo's search technology and will be made available to other search engines, including Google's.

"The opportunity before all of us is living up to the dream of the Library of Alexandria and then taking it a step further — universal access to all knowledge. Interestingly, it is now technically doable," Brewster Kahle, founder of the Internet Archive, said on a Yahoo blog posting. Even as steps are taken to digitize the world's most important words, speeches and videos, the best ways to generate money and pay for the projects are still being worked out. Google, for instance, sells ads displayed around search results.

Mr. Mandelbrot said Yahoo continues to talk to copyright holders about the best way to include content. One idea under discussion is a micropayment system, which would let people pay electronically for individual pieces of content.

Initial membership of the Open Content Alliance includes Yahoo, Adobe Systems Inc., Hewlett-Packard Co., the Internet Archive, the European Archive, the National Archives of England, O'Reilly Media Inc., Prelinger Archives, the University of California, and the U of T.