Herkovic: Google Books is ‘too useful to fail’

Stanford is partnering with Google to help digitize millions of books for the online Google Books database, an effort with immense potential to democratize access to literature and knowledge. Google has scanned more than 1.7 million books owned by Stanford in the last five years and hopes to scan millions more over the next decade. More than two dozen major libraries around the world have signed on to the project.

The University last month confirmed its support of the Google Book Search Settlement Agreement, which Google and copyright owners had brought before a federal court. The new settlement made modifications to the original agreement, which was reached in 2008 as a resolution to separate lawsuits filed by the Authors Guild and the Association of American Publishers in 2005 accusing Google of “massive copyright infringement.”

Under the 303-page amended settlement, Google would make a one-time payment of $125 million to rights holders, authors and publishers to negate liability for materials that have already been scanned, searched and made available online. Part of the payment would also be devoted to the establishment of an independent non-profit entity called the Book Rights Registry, which would collect revenue from third-party users of Google Books content and transfer that revenue to rights holders.

The settlement would enable Google to continue scanning and displaying books under the condition that the company turn over 63 percent of its net revenues from advertising to rights holders. One of the key features of the agreement would be Google’s ability to make considerable use of out-of-print and “orphan” works, which are out-of-print but still protected under copyright law; revenues from these works would be mediated through the new Book Rights Registry.

The settlement also contains provisions for the establishment of institutional and public access subscription systems, which would enable companies, colleges and individuals to have access to the Google Books catalog.

If the settlement is approved, Stanford would become a Fully Participating Library in the Google Books effort, which would expand its current access to the Google Books catalog.

Andrew Herkovic, director of communications and development at Stanford Libraries, called the Google Books project “too useful to fail.” “I believe that something useful, if not perfect, will come out of the court,” he said.

Herkovic spoke with The Daily about Stanford’s relationship with Google, the new settlement and the future of the Google Books project. Here is an edited excerpt.

The Stanford Daily (TSD): How did Stanford first begin its relationship with Google Book Search?

Andrew Herkovic (AH): As you know, Stanford Libraries has been digitizing materials for many years, but on a very small scale — generally for special purposes or projects. In about 2002, University Librarian Mike Keller was at a retreat at the home of Paul Allen, the Microsoft co-founder, and found himself in conversation with Larry Page [M.S. ’98], the co-founder of Google, in which they found that they were both very interested in the idea of the mass digitization of books to unleash the information sitting on library shelves. That may not have itself started the Google Book Search project, but it put us in a position so that when Google got serious about it in 2003, they immediately started talking to Stanford about our participation and discussion about how to go about such a plan. So we were one of the first five libraries that were on board when Google went public with this in 2004.

TSD: How does the Google Book Search subscription work?

AH: If the settlement goes through, the Fully Participating Libraries will have access to at least all of the content that they provided. Furthermore, under the settlement, every library will have the right to one terminal, so to speak [i.e. for one simultaneous user]. Individuals unaffiliated with the institutions would be able to buy access to Google Books on a pay-as-you-go basis or through various, not yet fully articulated, subscription schemes. If the settlement is approved, Stanford students on campus would not need to pay anything for access.

Someone in the community unaffiliated with Stanford would have the opportunity to seek a terminal in a Palo Alto library or on the Stanford campus and would have rights equivalent to those of anyone on campus. I believe that to be true.

TSD: Could you describe some of the features of the Google Book Search database and what is convenient about using the database?

AH: The great power of Google Book Search is in search. And the power to search any book is a powerful tool. If the settlement goes through, there will be a lot of full texts available, depending on how authors and publishers individually react. In general, the public will be able to read, instead of snippets, something like up to 20 percent of the text of the particular book that is under copyright, and they would be able to read 100 percent of the text of the book that is out of copyright. The terms of the settlement have been limited to books published in the United States, Canada, the United Kingdom and Australia.

Unfortunately, we’re not talking about every book. We’re only talking about books published in those countries where the author or publisher has not opted out. That’s a huge limitation on the power of the system. We’re looking at access to a lot of full texts: not as many as we would hope for in an ideal system, but a lot more than would have been possible under the original settlement with Google.

TSD: How has the Stanford alumni connection to Google affected Stanford’s relationship with Google?

AH: The fact that Larry and Sergey [Brin M.S. ’95 Ph.D. ’98] are closely associated with the Stanford computer science department — Google was, in fact, born here — certainly played a role. Among other things, it made it easier to develop and maintain good communication between the Stanford Libraries and Google. The other university that is most similar to Stanford in commitment and scope in Google Book Search is the University of Michigan. They are quantitatively ahead of us, and they have been very public in support of this. It may be no coincidence that Larry Page himself was an undergraduate there. The possibility that personal relationships and contacts greased the wheels cannot be dismissed.

TSD: Since Stanford became involved in Google Book Search in 2003, how has that relationship developed over time?

AH: All the library partners of Google Book Search meet twice a year, and it’s a really interesting gathering to look at the problems and the opportunities and solutions. And Stanford has participated wholeheartedly in those partner meetings. There is kind of a level playing field among all of the library partners, except that many of those partners have pretty tight restrictions on how much they are willing to do and how far out on a limb they are willing to go. Harvard and the New York Public Library, which were two of the original five, were very, very cautious in terms of what material they were willing to provide. Stanford has been much more gung-ho. Stanford has been deeply involved in forming the public-good side of the Book Search settlement. We’ve been working with Google in a pretty authentic partnership to make the most of that. Our relationship has matured in the sense of providing a great deal of advice and involvement in how Google is going forward.

TSD: Are there any negative ways in which the Google Books agreement affects Stanford?

AH: Are there downsides for us? The major downside to the settlement as we understand it is that it remains silent on the subject of books that remain outside of the scope of the settlement, which is to say books published everywhere except in English-speaking countries. We have material at Stanford that is not in English, and we’re scanning that material. But how freely that material becomes accessible on the open Web is less certain. That is a definite downside of the settlement. Whether that will definitely bite Stanford, I don’t know.

TSD: How has the scanning process been organized? Have there been problems?

AH: It’s a considerable logistical nightmare to move thousands of books from campus over to Google, return them and get them back on the shelf. And, in fact, each book, after it has been selected off of the shelf, is checked out as if Google were a person … Google itself has its own controls, but the labor is being conducted mainly by Google people with a considerable amount of oversight and review by Stanford [Libraries]. There have certainly been glitches. Our books are in so many places; there have certainly been access issues. Elevators that are smaller or larger than our book trucks have really been logistical problems in a really large-scale project. I would say that overall we’ve had a very successful working relationship with Google staff.

TSD: Do you think the growth of Google Book Search reflects the development of a more autodidactic society?

AH: In its ideal state, [the] Google Book Search service, making vast numbers of books available to the public to be searched and, to some extent, to be read, can be understood as a huge step forward in the democratization of information. And doing so essentially means that if you are a student at East Podunk Community College, you will have access to millions of books, as does the Stanford student. There is a powerful argument that that really makes possible the improvement of that person’s potential horizons and prospects for education. And if they’re not a student at all, but rather a citizen, it could be very empowering for it to be assumed that vast numbers of books are at their fingertips. […]

We really hope that it will have the effect of making people better aware of differences in quality in available information. A lot of the stuff on the Web is hard to distinguish in terms of its quality of truthfulness. Books, of course, have the same problem, but to know that a book was held by Stanford or the University of Michigan gives it a certain authenticity that a Web site or Joe Moe’s blog cannot.