Stanford and Google Book Search
Statement of Support and Participation
In December 2004, Stanford University announced its intention to provide Google with access to its book collections for inclusion in Google Book Search. Stanford issues this statement of support and participation to describe the public benefits that will result from this monumental undertaking to provide search and discovery access to the world’s printed works.
GOOGLE BOOK SEARCH
Google Book Search is part of Google’s overarching company goal to “organize the world’s information.” As part of that mission, Google determined that it would scan the world’s printed books and make them word searchable through Google’s Internet site. A search for a particular term, such as “recombinant DNA,” will return bibliographic information about every work available to Google that includes the searched term. The bibliographic information includes standard online catalog information, such as title, author, date and place of publication, publisher, and size. It also includes information about how to access the full text of the physical work, either through purchase or through a library.
Google Book Search has two key components: the Google Books Partner Program and the Google Books Library Project. Under the Partner Program, Google accepts digital works directly from publishers, and, according to contractual terms with the publishers, the search results will include not only bibliographic information but also some text from the digital work.
Under the Library Project, Google is scanning works in libraries. The partner libraries include the University of Michigan, Harvard, Oxford, the New York Public Library, and Stanford. For works in the public domain – those works not subject to copyright – Google will display the full text of the book with the search result. For those works in copyright, Google will provide only bibliographic information and a few “snippets” from the actual digital text, usually not more than a few lines including the search term; no full-text pages will be displayed.
WHY STANFORD IS PARTICIPATING IN THE LIBRARY PROJECT
Full-text searching of all literatures is an extremely powerful and useful tool for helping people identify and find texts of interest to them; this is the discovery process. Discovery of information has made the Internet the spectacularly successful resource it has become. Providing similar discovery tools for the tens of millions of printed books held in libraries will allow a very large and culturally important fraction of the world’s information to retain its rightful place as a trusted source of information and expression, and will magnify its value in the processes of teaching, learning, and research. The Google Books Library Project, as well as Stanford’s other digitization efforts, is a major step in assuring that Stanford’s huge investment in its millions of library books can provide returns to readers of all ages, backgrounds, and locations. Keyword searching of the contents of the Stanford library collection – as well as the other books in the Google Book Search index, whether from publishers or other partner libraries – will dramatically enhance the discovery process, not only for Stanford students and researchers, but for everyone around the country and throughout the world with access to an Internet portal.
Many of the books on Stanford’s shelves (as well as those of the other participants) are believed to be protected by copyright. For this reason, Google will not display the full text of copyrighted books to readers. In other words, in most cases, this project primarily supports the discovery process, not the delivery process. However, and very importantly, for those books in the public domain – those not protected by copyright – Google will display the full text to readers. This will make a great deal of literature and information available to all readers at no charge, a major public good that will be of particular value to children and teachers in primary and secondary education, as well as to those unable to use physical libraries due to disability, location, or other challenges.
To provide the world’s information seekers the means to discover content, and in the case of public domain materials to access content, is one of the primary reasons that Stanford chose to participate in the Library Project.
WHERE STANFORD IS IN THE SCANNING PROCESS
Google began scanning works from Stanford in approximately March of 2005. Stanford has selected its federal government collection as the first set of works to be scanned under the project; these works are in the public domain. Once this large collection is scanned, Stanford will focus on providing works to Google that were published in the United States up to 1964 and that are believed to be in the public domain. Stanford’s current focus is on older works in the public domain because Google will make the entire texts of these works available to readers and researchers, and because many of these older works are deteriorating at a rapid pace and it is a priority to digitize these works while they remain physically sound.
USE OF STANFORD FILES
Another major reason Stanford is part of the Google Books Library Project is so that it can access the digital copies Google makes of books in its collections. Although Stanford is currently focusing on works in the public domain, eventually Stanford would like to bring more works, including copyrighted works, within the project. Understandably, content owners and copyright holders want information and assurances regarding Stanford’s plans for these digital files. Stanford University respects not only copyright law but also the principles driving that law. Stanford’s uses of any digital works obtained through this project will comply with both the letter and spirit of copyright law.
WHAT STANFORD DOES INTEND TO DO WITH FILES
PRESERVATION is one of the vital social roles that libraries, particularly research libraries such as Stanford’s, fulfill. Stanford’s primary intent in obtaining a digital copy is to ensure the preservation of our library’s resources. Library collections are vulnerable to catastrophic losses from fire, flood, or earthquake. For example, Stanford libraries suffered flooding in 1978 and 1998, resulting in damage or loss to many thousands of books. These losses were minor compared to the more recent and tragic flooding of the University of Hawaii Library and of the public and university libraries of New Orleans, or to the 1986 arson fire at the Los Angeles Public Library. A digital copy of works ensures the future of not only Stanford Libraries’ resources, but of the collected works of our society, our civilization.
EVEN BETTER DISCOVERY TOOLS. Stanford hopes to build better tools to discover information, such as through taxonomic mapping, associative searching, hyperlinking, and data-mining techniques.
LINKING TO STANFORD’S ONLINE CATALOG. Stanford will add links from Stanford University Libraries’ online catalog to public domain works displayed in Google.
DELIVERY OF FULL-TEXT DIGITAL CONTENT TO CAMPUS. Eventually, Stanford would like to explore legal ways to maximize our campus readers’ use of digitized texts. Currently, Stanford purchases or contracts access to thousands of digital publications held in copyright by others – electronic journals being the most common case. It honors contractual and legal limits on its use of these publications. Stanford’s policy with regard to books digitized by Google is exactly the same: it will respect any copyrights and licenses and prevent abuse (including hacking, mass downloads, and the like) by others.
WHAT STANFORD DOES NOT INTEND TO DO WITH FILES
Stanford does not intend to violate the legitimate rights of content owners to control the distribution and exploitation of works under copyright.
STANFORD’S REACTION TO LITIGATION AGAINST THE GOOGLE BOOKS LIBRARY PROJECT
Stanford is saddened by the decision of various publishers and the Authors Guild to file suit against Google in an effort to curtail the Google Books Library Project – a project that has the potential to provide an invaluable social good.
Stanford believes that courts reviewing these cases will conclude that making a digital copy for the purpose of indexing and searching works is a fair use. Historically, copyright law has allowed the copying of works without permission where there is no harm to the copyright holder and where the end use will benefit society. Here, there would be nothing objectionable under copyright law if Google were able to hire a legion of researchers to comb through every text on the Stanford University Libraries’ shelves to identify each work that includes the term “recombinant DNA.” Nor would there be anything objectionable in those researchers then sharing the results of their efforts and providing bibliographic information about all works in Stanford’s libraries that include this term. Through the application of well-engineered digital technologies, Google can simulate that legion of researchers electronically through algorithms that can return results in seconds. The entire work must be digitized for Google to make possible the word searching that is of such value to readers, but copyright law allows that digitized copy as a fair use. Thus, keyword searching of copyrighted texts and providing references to those texts is permitted by existing copyright law. The digital copies produced through this project are necessary to automate the searching process.
Stanford has nearly 9 million volumes in its collection, many of which are still in copyright but out of print, with no continuing commercial viability. These so-called Orphan Works have no champions at all: no author or publisher is available to ensure their continuing accessibility to researchers, scholars, students, and readers. While the publishers point out in their lawsuit that they have created a mechanism to preserve their catalogs of in-print books through various digital projects, what of the Orphan Works, which represent a significant portion of works in university and research libraries? If the Google Books Library Project is restricted as the plaintiffs are requesting, these Orphan Works will remain relatively inaccessible and their contents undiscovered except through laborious manual searching by people present at our library locations. This result would not be in the public interest.
Google provides a mechanism by which publishers and authors may opt out of the Library Project, so copyright owners have the ability to bypass this project. But Orphan Works have no champions to opt in to the project. And, if the plaintiffs win the day, the American public’s ability to discover these works will not be realized.
It has been stated in the press and the “blogosphere” that Google Book Search heralds a new age of access to the world’s literatures, that it is a truly transformative development. Stanford entered into the Library Project because it will revolutionize everyone’s ability to discover information, from elementary school students through post-graduate researchers at great centers of learning. Stanford’s hope for this project is that its 9 million books will be discoverable to everyone with access to an Internet portal. Stanford is proud to be part of this monumental undertaking.