
Welcome to the new DLSS Home Page

DLSS is the information technology production arm of the Stanford Libraries; it serves as the digitization, digital preservation and access systems provider for SULAIR; and it is the research and development unit for new technologies, standards and methodologies related to library systems.


SMPL Renews Partnership with California Audiovisual Preservation Project

In an important collaboration this month, Stanford Media Preservation Lab and the Department of Special Collections & University Archives are participating in the California Audiovisual Preservation Project, a pioneering statewide initiative, for a third round in a row. The CAVPP is providing funds to reformat film and video selections from SULAIR’s collections, including newly resurrected video from the Stanford Prison Experiment and the Stanford University Film Collection. These items will be sent to an outside vendor with the equipment necessary to capture preservation-quality digital files from these unique materials in obsolete formats. The digitized content will be preserved in the Stanford Digital Repository and made broadly available to the public through the California Light and Sound collection at the Internet Archive.

The CAVPP works to provide digitization and access for historic California audiovisual recordings, as well as establish low-cost and practical standards to help organizations reformat their analog content. The Internet Archive and the Online Archive of California contribute to the project with storage for preservation, as well as by providing online access to teachers, researchers and students. There are nineteen California library and archive partners participating in the project, with funding provided through the California State Library and the Moving Image Archiving and Preservation Program at New York University.

Recently awarded a major grant from the NEH, the CAVPP will be directly responsible for preserving over two hundred audio and moving image recordings this year through the digitization services they offer to partner organizations.

Stanford Media Preservation Lab has supported the CAVPP since its start, providing consultation and assistance with quality control of digitized content. The first two rounds of materials nominated by SULAIR provided reformatting for film and video media from the Buckminster Fuller Papers, the Apple Computer Inc. Records, and the Ampex Collection.

A demo video of "Lisa", an early personal computer developed by Apple

A demonstration of the Videofile Information System developed at Ampex in 1968


New Collections Added to Stanford Digital Repository in June, 2012

In June, approximately 68,000 images representing nearly 300 items across several collections were accessioned to the Stanford Digital Repository (SDR). The items include:
* Archives Parlementaires (81 books, 64,800 pages)
* Classic Papyri (44 fragments, 88 images)
* Stanford Oral History Project (140 interviews, 2,110 files)
* Special Collections Materials (18 photo collections, 900 images)

While many of these objects are already discoverable via SearchWorks, others will get SearchWorks records in the coming months. However, all materials are currently available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).
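The idea behind a persistent URL can be sketched in a few lines of Python. This is an illustration only, not how the Stanford PURL service is implemented: the `PurlResolver` class and the `storage-a`/`storage-b` locations are hypothetical, and only the `http://purl.stanford.edu/` prefix and the sample identifier come from this post.

```python
# Sketch only: the resolver table and storage URLs are hypothetical.
PURL_BASE = "http://purl.stanford.edu/"


class PurlResolver:
    """Maps a stable identifier to wherever the content currently lives."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, location):
        # Called on accession, and again whenever the files move.
        self._locations[identifier] = location

    def purl(self, identifier):
        # The URL that gets cited; it does not change when files move.
        return PURL_BASE + identifier

    def resolve(self, purl):
        if not purl.startswith(PURL_BASE):
            raise ValueError("not a PURL: " + purl)
        return self._locations[purl[len(PURL_BASE):]]


resolver = PurlResolver()
resolver.register("wn600cz2133", "https://storage-a.example.org/books/wn600cz2133")
cited = resolver.purl("wn600cz2133")

# Later the files move; only the resolver entry is updated.
resolver.register("wn600cz2133", "https://storage-b.example.org/objects/wn600cz2133")
print(cited, "->", resolver.resolve(cited))
```

The cited URL stays the same before and after the move; only the lookup behind it changes.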

Details below.

  • Archives Parlementaires, 81 volumes
  • We have now accessioned all 83 out-of-copyright volumes of the Archives Parlementaires. Part of the larger French Revolution Digital Archive, this material serves as a main point-of-entry for scholars and researchers interested in this important historical period. Online images of these books are available in a book viewer that enables scholars to flip backwards and forwards through the pages of bound materials in a much more intuitive online experience.
    Example image at: http://purl.stanford.edu/wn600cz2133
    Collection contact: Sarah Sussman

  • Classic Papyri, 44 fragments
  • These fragments are part of the Classics Department Papyri Collection, which is currently on deposit (with access to the original fragments by special request only) in Special Collections. In the 1980s, the department purchased about 75 Egyptian papyri for use as a teaching and research collection. The provenance of the papyri is uncertain, but like many others they could have been found in ancient garbage dumps or mummy wrappings (cartonnage). Hellenistic in origin, nearly all date from 250-150 BCE. Their texts, written in ancient Greek and Demotic, are documentary in nature: official letters, receipts, accounts, contracts, petitions (including one from jail), land measurement, public sale at auction, and lists of names.
    Example image at: http://purl.stanford.edu/dg606yd7146
    Collection contact: David Jordan

  • Stanford Oral History Project, 140 interviews
  • As University Archivist Daniel Hartwig recently announced in the SULAIR News article "Stanford Oral History Project Interviews Now Streaming Online", more than 100 interviews related to Stanford are now available.
    Example image at: http://www.oac.cdlib.org/findaid/ark:/13030/kt3c603546
    Collection contact: Daniel Hartwig

  • Special Collections Materials, 18 photo collections
  • In coordination with University Archivist Daniel Hartwig, eighteen photograph collections of Stanford-related content (the Stanford family, campus architecture, student life, etc.) were added to SDR. Online access to these materials greatly increases the ability of the Stanford community to explore and appreciate Stanford's rich history. Three of the more interesting collections are:

    • Muybridge Photos. This collection of 95 images includes original Muybridge glass plate negatives and prints covering the Stanfords' residences, as well as printed cards from his horse in motion series.
    • Morley Baer Photos. This is a set of 89 photographs of Stanford University buildings created by Baer for the architects or for the University.
    • Ira Nowinski Photos of 2006 Stanford Powwow. 312 digital photographs and prints of the Stanford Powwow taken by photographer Ira Nowinski.

    Inclusion in the Stanford Digital Repository ensures that these materials are available to researchers and scholars (while upholding appropriate access restrictions), now and in the future through a secure, sustainable stewardship environment.

Questions about the Stanford Digital Repository service should be directed to sdr-contact@lists.stanford.edu.


Source code available for Argo and dor-services

DLSS has released the source code for two of its library infrastructure projects:

Argo, Stanford's administrative "hydra head" for Fedora, provides a viewing, reporting and administrative interface for objects in a Fedora repository. It is also coupled with Stanford's lightweight and engine-free workflow system ("WorkDo") to provide a workflow visualization and control mechanism. WorkDo is a Hydra- and Fedora-compatible system that chains small scripts ("robots") and microservices into complex processes to complete both human- and machine-based task flows.
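The notion of chaining small "robots" into a larger process can be illustrated with a minimal Python sketch. It is not WorkDo code: the robot names (`register`, `add_checksum`, `publish`), the dictionary-based object state, and `run_workflow` are all assumptions made up for this example.

```python
import hashlib


def register(state):
    # Hypothetical robot: mark the object as registered.
    return {**state, "registered": True}


def add_checksum(state):
    # Hypothetical robot: record a checksum of the content bytes.
    digest = hashlib.sha256(state["content"]).hexdigest()
    return {**state, "checksum": digest}


def publish(state):
    # Hypothetical robot: refuse to publish without a checksum.
    if "checksum" not in state:
        raise RuntimeError("cannot publish before add_checksum has run")
    return {**state, "published": True}


def run_workflow(obj, robots):
    """Run each robot in order, logging the names of completed steps."""
    state = {**obj, "completed": []}
    for robot in robots:
        state = robot(state)
        state["completed"] = state["completed"] + [robot.__name__]
    return state


result = run_workflow({"content": b"page image bytes"},
                      [register, add_checksum, publish])
print(result["completed"])
```

Each robot does one small job and hands the object on, so the list of completed steps doubles as the kind of status record a workflow view could display.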

dor-services is a Ruby gem that exposes Stanford’s Fedora-based Digital Object Registry (DOR) services and content models to both Hydra and non-Hydra processes. In addition to functional access to DOR’s Registration, Workflow, Identifier, Search, Metadata, Digital Stacks, and Preservation Ingest services, the dor-services library also defines a number of discrete modules that can be mixed into Hydra object models to extend their functionality. Each module is named according to a salient characteristic that it imparts to a digital object, and defines both object methods (what the object can do) as well as expectations (what metadata the object needs to provide) in order to properly represent that characteristic.
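The mix-in pattern described above can be sketched as follows. dor-services itself is a Ruby gem; this Python sketch only mirrors the idea, and the module names (`Describable`, `Identifiable`), the `required_metadata` attribute, and the `Book` class are hypothetical rather than taken from the library.

```python
class Describable:
    """Hypothetical module: the object can produce a display label."""
    required_metadata = ("title",)          # expectation

    def label(self):                         # object method
        return self.metadata["title"]


class Identifiable:
    """Hypothetical module: the object can report its identifier."""
    required_metadata = ("identifier",)

    def identifier(self):
        return self.metadata["identifier"]


class DigitalObject:
    def __init__(self, metadata):
        self.metadata = dict(metadata)

    def missing_metadata(self):
        # Collect the expectations declared by every mixed-in module.
        missing = []
        for cls in type(self).__mro__:
            for field in cls.__dict__.get("required_metadata", ()):
                if field not in self.metadata:
                    missing.append(field)
        return missing


class Book(Describable, Identifiable, DigitalObject):
    """An object model built by mixing in two characteristics."""


book = Book({"title": "Archives Parlementaires"})
print(book.label(), book.missing_metadata())
```

Here each module contributes both a capability and a metadata expectation, and the object model can report which expectations it does not yet meet.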


New Materials Added to Stanford Digital Repository in May, 2012

In May, approximately 1,400 images representing eighteen mostly 15th- and 16th-century books were accessioned to the Stanford Digital Repository (SDR). These items are part of Special Collections' goal to digitize and make more accessible materials considered "Beautiful Books". John Mustain is the collection contact for the materials listed below.

All of these books were previously discoverable via SearchWorks but required a visit to Special Collections to view these non-circulating materials. Access to digitized images of these books is now available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).

Inclusion in the Stanford Digital Repository ensures that these materials are available to researchers and scholars (while upholding appropriate access restrictions), now and in the future through a secure, sustainable stewardship environment.

Questions about the Stanford Digital Repository service should be directed to sdr-contact@lists.stanford.edu


New Digital Production resource - Image Defects page

The latest version of the Stanford University Libraries and Academic Information Resources Quality Assurance Image Defects page is now “live” and made freely available to the cultural heritage and library communities.


This is a long-awaited tool that serves a range of production, development, and training needs. It includes sample images of common (and uncommon) defects, causes/sources, and potential remedies.

https://lib.stanford.edu/digital-production-services/quality-assurance-image-defects

This page complements the outstanding and hugely popular AV Artifact site that was produced by our own Stanford Media Preservation Lab team.

http://preservation.bavc.org/artifactatlas/index.php/Table_of_Contents

Future work on the Image Defects page will include contributions from Imaging Scientist Don Williams, and content from the Library of Congress and Federal Agencies Digitization Guidelines Initiative.


New Collections Added to Stanford Digital Repository in April 2012

In April, approximately 41,000 images representing just over 1,300 items across several collections were accessioned to the Stanford Digital Repository (SDR).

  • R. Stuart Hummel collection: ~1,000 items (~35,000 images)
  • Stanford Medieval Manuscripts: 280 manuscripts (560 images)
  • Bibliothèque nationale de France: 3 manuscripts (~1,300 images)
  • Reid Dennis California Lithographs: 47 lithographs (47 images)
  • Archives Parlementaires: 2 books (1,600 images)
  • Special Collections Requests: 19 items (~2,800 images)

While many of these objects are already discoverable via SearchWorks, others will get SearchWorks records in the coming months. However, all materials are currently available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).

  • R. Stuart Hummel collection
    The Hummel collection documents the Stuart and Hummel families' life and work in China as Methodist missionaries. These 1,000 items complete the efforts to accession this collection into SDR.
    Example image: http://purl.stanford.edu/vr836bv9896
    Collection contact: Glynn Edwards
  • Stanford Medieval Manuscripts
    The Stanford medieval manuscript collections represent a diverse set of materials acquired over the years that include everything from fragmentary materials to sumptuous bound codices, and drawn from diverse geographic areas and historical periods. While acquisition (and digitization) is an ongoing activity, the current set of materials ingested into the SDR includes three sets of materials frequently used in paleography courses at Stanford (M0297, M0299 and M0389) which provide students with examples of handwritten artifacts from the 9th to the 16th centuries.
    Example image: http://purl.stanford.edu/hp561td3079
    Collection contacts: David Jordan and John Mustain
  • Bibliothèque nationale de France Manuscripts
    These manuscripts were added to the Stanford digital collections to satisfy a Mellon-funded grant examining the material context of the transmission of Guillaume de Machaut's works in the fourteenth and fifteenth centuries. They will be available to Stanford users through a specialized manuscript discovery and viewing portal and available for use with third-party transcription and annotation tools.
    Collection contact: Ben Albritton
  • The Reid W. Dennis collection of California lithographs, 1850-1906
    This collection of mostly lithograph prints primarily portrays San Francisco-based city views, public buildings, and landscapes as well as images related to Stanford University, San Jose and other California-based images. Available in SearchWorks. Example image at: http://purl.stanford.edu/sr100hp0609
    Collection contact: Glynn Edwards
  • Archives Parlementaires
    A small initial portion of this 101-volume compilation of French Revolution material has been accessioned into SDR. Part of the larger French Revolution Digital Archive, this material serves as a main point-of-entry for research into this important historical period.
    Example image at: http://purl.stanford.edu/jt959wc5586
    Collection contact: Sarah Sussman
  • Special Collections & Patron Requests
    In coordination with Special Collections, several patron-requested items were added to SDR. These include 13 boxes of 19th-century manuscript materials and three 19th-century books. Some of the items are:
    • Rankin Manuscripts
      Manuscript of Rev. Adam Lowry Rankin’s autobiography. Available in SearchWorks. Example image available at: http://purl.stanford.edu/qz269xh4040
      Collection contact: Glynn Edwards
    • Ferrario: Il costume antico e moderno
      This monumental and lavishly illustrated work covers the history of costume in different parts of the world and also includes military customs, images of natural history, and architecture, providing a vast pictorial encyclopedia of much of the world. This remarkable compilation is part of Special Collections' goal to digitize and make more accessible materials considered "Beautiful Books".
      Example image at: http://purl.stanford.edu/vw497ky5190
      Collection contact: Special Collections
    • Salam, 1926-27
      200+ page Ottoman Turkish language newspaper printed in Rhodes. Available in SearchWorks. Example image at: http://purl.stanford.edu/vc308zs5684
      Collection contact: John Eilts

Inclusion in the Stanford Digital Repository ensures that these materials are available to researchers and scholars (while upholding appropriate access restrictions), now and in the future through a secure, sustainable stewardship environment.

Questions about the Stanford Digital Repository service should be directed to sdr-contact@lists.stanford.edu


New Collections Added to Stanford Digital Repository in March, 2012

In March, approximately 2,100 objects representing three collections were accessioned to the Stanford Digital Repository (SDR).

  • R. Stuart Hummel collection: ~ 2,100 items
  • The Life of Saint Catherine, Codex M0381: 1 manuscript
  • Special collection requests: 1 thesis

More details, including links to sample images, are listed below.

While many of these objects are already discoverable via SearchWorks, others will get SearchWorks records in the coming months. However, all materials are currently available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).

  • R. Stuart Hummel collection
    The Hummel collection documents the Stuart and Hummel families' life and work in China as Methodist missionaries. Approximately 2,100 additional images from this collection have been accessioned into SDR. Example image: http://purl.stanford.edu/vr836bv9896
    Collection contact: Glynn Edwards

  • The Life of Saint Catherine, Codex M0381
    This medieval manuscript is the working copy used by the Venetian printer Giovanni Tacuine for the first printed edition (1501) of SANCTAE CATHARINAE SENENSIS. Available in SearchWorks. Example image at: http://purl.stanford.edu/rm504km6504
    Collection contact: John Mustain

  • Special Collections & Patron Requests
    In coordination with Special Collections, some patron-requested items are digitized and added to SDR.
    • “Hatzaad Harishon: Integration, Black Power and Black Jews in New York, 1964-1972”,
      Jacob S Dorman’s 1996 Department of History honors thesis. Available in SearchWorks. Example image at http://purl.stanford.edu/zc256bg7718
      Collection contact: Daniel Hartwig

Questions about the Stanford Digital Repository service should be directed to sdr-contact@lists.stanford.edu


New Collections Added to Stanford Digital Repository in February, 2012

In February, approximately 7,000 objects representing six collections were accessioned to the Stanford Digital Repository (SDR), bringing the total number of objects in SDR to nearly 250,000.

  1. Buckminster Fuller collection: 5,200 slides
  2. Kitai topographical maps: 1,600 maps
  3. McLaughlin Maps, California as an Island: 114 maps
  4. R. Stuart Hummel collection: 52 items
  5. Eliasaf Robinson collection addendum: 1 gazette
  6. Islamic prayer book, 1228 H: 1 manuscript

More details, including links to sample images, are listed below.

Inclusion in the Stanford Digital Repository ensures that these materials are available to researchers and scholars (while upholding appropriate access restrictions), now and in the future through a secure, sustainable stewardship environment.

While many of these objects are already discoverable via SearchWorks, others will get SearchWorks records in the coming months. However, all materials are currently available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).

  • Buckminster Fuller collection
    Approximately 5,200 slide images & related indexes were added to complement the existing Fuller collection already available at the R. Buckminster Fuller Collection site. The slide images and indexes are undergoing quality control, are in queue for discovery via SearchWorks, and will be made available in the near future.
    Collection contact: Roberto Trujillo or Glen Worthey

  • Kitai topographical maps
    This collection of 1,600 Russian military sheet maps of China represents about 3,600 images. These maps are in queue for discovery via SearchWorks. Example image: http://purl.stanford.edu/fd349tp3788
    Collection contact: Julie Sweetkind-Singer

  • McLaughlin Maps, California as an Island
    A portion of the larger McLaughlin map collection, this subset highlights early cartographers’ view of California as an island. The collection now includes 671 maps, of which 114 are newly digitized and added to SDR this month. These maps are in queue for discovery via SearchWorks. Example image: http://purl.stanford.edu/jy409qg0248
    Collection contact: Salim Mohammed


  • R. Stuart Hummel collection
    The Hummel collection documents the Stuart and Hummel families' life and work in China as Methodist missionaries in the late nineteenth and early twentieth centuries. A sample set of 52 images has been accessioned, and after preliminary metadata QA is completed the entire collection (multiple thousands of images) will be accessioned into SDR.
    Collection contact: Glynn Edwards


  • Iton ha-Mizraḥ, Levant Gazette
    Very fragile Hebrew/English newspaper and an addendum to the Eliasaf Robinson Collection. Example image: http://purl.stanford.edu/zt646cr7630
    Collection contact: Zachary Baker


  • Islamic prayer book, 1228 H
    This nearly 500-page manuscript is an Islamic prayer book dated 1813 (1228 H). Example image: http://purl.stanford.edu/yr183sf1341
    Collection contact: John Eilts


  • Special Collections & Patron Requests
    In coordination with Special Collections, several patron-requested items were digitized and recently added to SDR. In January, in coordination with University Archivist Daniel Hartwig, several University Archives-owned photo albums related to student life were added to SDR. Two of these items are:

Development and support of the Stanford Digital Repository is the responsibility of the Digital Library Systems and Services (DLSS) group. DLSS provides services and supporting technology and tools to SULAIR archivists, curators, and selectors who create or acquire digital collections in support of scholarly research at Stanford and beyond.

Questions about the Stanford Digital Repository service should be directed to sdr-contact@lists.stanford.edu


Upcoming Project featuring the Papers of Europe's First Female Professor

The Digital Production Group is very excited about an upcoming project featuring the personal papers of "Laura Bassi, a noted 18th-century Italian scientist and Europe's first female professor," with Project Manager Cathy Aster at the helm.

More information to come, but in the meantime take a look at this recent article in the Stanford University News.


Stopwords in SearchWorks - to be or not to be

We've been examining whether or not to restore stopwords to the SearchWorks index. Stopwords are words ignored by a search engine when matching queries to results. Any list of terms can be a stopword list; most often the stopwords comprise the most commonly occurring words in a language, occasionally limited to certain functions (articles, prepositions vs. verbs, nouns).

The original usage of stopwords in search engines was to improve index performance (query matching time and disk usage) without degrading result relevancy (and possibly improving it!). It is common practice for search engines to employ stopwords; in fact Solr (http://lucene.apache.org/solr), the search engine behind SearchWorks, has English stopwords turned on as the default setting.

In our implementation of SearchWorks, there was no compelling reason to change most of the default Solr settings; thus, since SearchWorks's inception we have been using the following stopword list: a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, s, such, t, that, the, their, then, there, these, they, this, to, was, will, with.
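To make the effect of this list concrete, here is a minimal Python sketch of stopword filtering. It is a simplification: real Solr analysis also involves tokenizers and other filters, and the `significant_terms` function is a hypothetical helper written for this post, not part of Solr or SearchWorks.

```python
# The stopword list quoted above, as a set for fast lookup.
STOPWORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or s "
    "such t that the their then there these they this to was will with".split()
)


def significant_terms(query, use_stopwords=True):
    """Return the query terms the search engine would actually match on."""
    terms = query.lower().split()
    if use_stopwords:
        return [t for t in terms if t not in STOPWORDS]
    return terms


for q in ["to be or not to be", "the pearl", "pearl", "there will be blood"]:
    print(repr(q), "->", significant_terms(q), "|",
          significant_terms(q, use_stopwords=False))
```

With the list active, "to be or not to be" has no significant terms at all, and "the pearl" becomes indistinguishable from "pearl"; with every word significant, the queries stay distinct.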

What follows is an analysis of how stopwords are currently affecting SearchWorks, and what might happen if we restore stopwords to the SearchWorks index, making every word significant for every search.

Executive Summary of Analysis

The SearchWorks metadata group (see https://consul.stanford.edu/display/NGDE/SearchWorks) believes that restoring stopwords to SearchWorks could improve results in up to 18% of the searches, and will degrade results only in the small number of searches with more than 6 terms.

How Many Terms are there in User Queries (not including facet link clicking)?

Over 50% of the query strings for SearchWorks are 1 or 2 terms.
Over 75% of the query strings are 1, 2 or 3 terms.
Over 90% of the query strings for SearchWorks have 6 or fewer terms.

These figures include any stopwords occurring in queries.


Source: Google Analytics for Oct 2011, analyzed by Casey Mullin (https://consul.stanford.edu/display/NGDE/Log+Analysis+Workspace - link at the bottom).

What Percentage of Query Strings have Stopwords?

For November 2011:
There were 142,869 searches
Stopwords appeared in searches 26,076 times

So, stopwords appeared in roughly 18% of searches.
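The arithmetic behind that figure, as a quick Python check using the two November 2011 counts quoted above (the function name `share_of_searches` is just a label for this sketch):

```python
def share_of_searches(hits, total):
    """Fraction of searches in which the counted event occurred."""
    return hits / total


share = share_of_searches(26_076, 142_869)
print(f"{share:.1%}")
```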


(per analysis by Casey, sent in email to gryph-search on Dec 14, 2011; this information will be in the analytics on consul, once they are updated for November 2011).

Do the Stopwords Currently Used in Queries Imply the Users are Trying Boolean Searches?

The 10 stopwords appearing most often in queries are:

November 2011:
the -- 7578 occurrences
of -- 6582
and -- 4106
in -- 2298
a -- 1137
to -- 1033
for -- 695
on -- 685
an -- 289
with -- 231

"or" and "not" do not appear in many queries, and "and" is neither the most frequent stopword nor close to it in occurrences. I interpret this to mean stopwords in queries are NOT intended as boolean operators.

(per analysis by Casey, sent in email to gryph-search on Dec 14, 2011; this information will be in the analytics on consul, once they are updated for November 2011).

What About "Minimum Must Match"?

Solr allows us to cleverly fudge boolean AND or OR with a setting it calls "mm" or "Minimum Must Match". Our mm value says that if the query has 4 terms or fewer, all must match, but if a query has 5 or more terms, 90% (rounded down) must match. It was suggested a while back that we increase the mm threshold to 6 (from 4). Restoring stopwords to significance makes it more important to increase the mm threshold, as there are more significant words in our queries. Given that over 90% of queries have 6 or fewer terms, 6 seemed an appropriate threshold.

See https://consul.stanford.edu/display/NGDE/How+Search+Works+in+SearchWorks to learn more about mm.
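The mm rule described above can be written out as a small Python function. This is a sketch of the rule as stated in this post (all terms up to a threshold, then 90% rounded down), not Solr's own code; `required_matches` is a hypothetical name. In Solr's mm syntax, a rule of this shape is typically written along the lines of `4<90%`.

```python
def required_matches(num_terms, threshold=4, percent=90):
    """How many query terms must match under the mm rule described above."""
    if num_terms <= threshold:
        return num_terms                    # short queries: all terms required
    return (num_terms * percent) // 100     # longer queries: percent, rounded down


for n in range(1, 9):
    print(n, "terms:",
          required_matches(n, threshold=4), "required with threshold 4,",
          required_matches(n, threshold=6), "with threshold 6")
```

Raising the threshold from 4 to 6 changes the outcome only for 5- and 6-term queries, which would then need every term to match instead of all but one.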

What is Improved by Restoring Stopwords to the Index?

1. searches comprised only of stopwords now retrieve results (improved recall)
examples:
- to be or not to be (with or without quotes)

2. precision is greatly improved for short searches including stopwords
examples:
- pearl vs. the pearl
- the one
- a zukofsky (author Zukofsky, title "a")
- there will be blood
Prod: 12678 (as a title search, 5013)
Test: 31 (as a title search, 5)
- OR spectrum (a periodical)
- Jazz: an introduction

3. subject links distinguish "in" from "and", etc.
archaeology in literature is no longer conflated with archaeology and literature

4. improved results for languages having lexical words overlapping English stopwords

What is Degraded by Restoring Stopwords to the Index?

1. long queries (over 6 terms) with a lot of stopwords have reduced precision
BUT: the words occurring as a phrase float to the top.
example: Lectures on the Calculus of Variations and Optimal Control Theory

What Else Have Testers Reported?

Note that "test" has stopwords restored (every word is significant), while "prod" still uses the stopword list.

Kathy Kerns (email of Dec 2): known item searches - in all cases, test kept the correct result at the top or tied or improved its relevance.

Casey Mullin (email of Dec 2): children in literature as a subject search - test is much better.

Phyllis Kayten (feedback of Dec 5):
searched for: dorothy and the wizard of oz (known item search) - title sought was actually dorothy and the wizard *IN* Oz; test did not retrieve it (due to increased precision), but prod did.
searched for: the man from nowhere - test is much better
searched for: death and taxes - first result on test is worse, but next results are good. First result on prod isn't perfect either.

Linda Yamamoto (feedback of Dec 9):
searched for: Lectures on the Calculus of Variations and Optimal Control Theory (known item search) - correct result is first in both prod and test. The other results: prod is better (but this is one of those long title searches).
C# and C++ searches don't work (unrelated -- this is a special character searching issue and has nothing to do with stopwords)

Vitus Tang (feedback of Dec 2):
"A potential problem of the stopword change is that title access points (aka uniform title) constructed according to AACR2 are without initial articles. So, for instance, the access point for the series "The NASA history series" is "NASA history series". A query that includes the initial article will not affect the search result in current production SW because "the" is eliminated as a stopword, but will affect the search result when stopwords are treated as significant words. On test, a phrase title search for "The NASA history series" retrieves 76 records. The same search on production retrieves 125 records. The test search still retrieves some of the records that belong to this series because the transcribed series statement, which is in the 490 field, includes the initial article, but not all of them do. The series access points in the 830 field are all without the initial article. [Symphony browse series retrieves 94 results.]"

Naomi's notes on Vitus's feedback: in gryphon-search, many of the records we examined had the "wrong" information in the field (it included the initial article, and it shouldn't have). Sooo … our data is dirty -- shocking, but true. It would be nice to know if series searches are common outside of Library Staff.

Additional Comments

SearchWorks employing stopwords gives imperfect search results. SearchWorks restoring stopwords, so that every term is significant, gives different imperfect search results. Socrates gives yet different imperfect search results. The back end algorithms for determining what results match a query will always be fairly opaque to the end users - the algorithms are complicated. Moreover, users will have typos and other mistakes in their queries no matter what we do, and it seems unlikely we can consistently rescue them from themselves.

Solr gives us incredible control over our search engine's algorithm. There are many many knobs we can twiddle in our quest to improve the relevancy of search results. A few of the possibilities include:

a. tweak mm -- require a higher percentage of matching terms when there are more than 6 terms in the query

b. increase phrase boosting -- this would float results to the top that have the query terms occurring close together (and presumably in the same order). The current boost *seems* high enough, but we have never performed any empirical tests.

c. reduce phrase slop (currently 3) -- implies words need to occur closer together in the results. Not clear exactly how phrase boosting and phrase slop interact.

d. adjust the relative boosting of fields (give even more weight to title field matches, etc.)

e. adjust the situations where the length of the INDEXED string affects the score of matches. (query "my cat" scores higher for title "my cat" than for "my cat and dog")
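The length effect in item (e) can be illustrated with the length normalization used by classic Lucene scoring, which is 1/sqrt(number of terms in the field). This Python sketch assumes that formula applies to the title field and ignores everything else that feeds into a real score (term frequency, boosts, and the lossy way norms are stored); `length_norm` is a name chosen for this example.

```python
import math


def length_norm(num_terms):
    """Classic Lucene-style length normalization: shorter fields score higher."""
    return 1.0 / math.sqrt(num_terms)


short_title = "my cat".split()
long_title = "my cat and dog".split()   # 4 terms when every word is indexed

print(len(short_title), "terms ->", round(length_norm(len(short_title)), 3))
print(len(long_title), "terms ->", round(length_norm(len(long_title)), 3))
```

All else being equal, a match in the two-term title gets a larger length factor than the same match in the four-term title, which is why the query "my cat" ranks the exact title first.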

(See https://consul.stanford.edu/display/NGDE/Glossary for more information.)


Parker on the Web - Version 1.4 Release Announcement

We're pleased to announce the release of Version 1.4 of Parker on the Web, the fourth incremental site release since the launch of Version 1.0 in Fall 2009.

Version 1.4 constitutes the most substantial corrective content release to date. More than 500 additional image reshoots are now integrated into the site, along with a host of sequencing corrections -- which have impacted 151 manuscripts (27 per cent of the online collection). The reshoots either replace existing images with better quality versions, or provide images for selected manuscript pages that had been previously overlooked for digitization. This brings the images to a state of 99 per cent or better accuracy across the total of over 200,000 images. Corrections to manuscript descriptions, summaries and bibliography are also incorporated in this release. Along with these content corrections, approximately 100 new bibliographic citations have been added to the site as well.

This release constituted a significant technical challenge requiring numerous QC passes and analysis of image rendering problems for the web application. DLSS generated JP2 derivatives from the Cambridge-produced TIFF reshoot masters, but JPEGs were consistently not being rendered from the JP2s due to an unsupported bit depth error. Pair the resolution of this problem with the complex interleaving and replacement of selected images -- along with detailed sequence file corrections -- and you have a set of interlocking issues that required lots of time and attention to detail to resolve.

Kudos for a job well done to Chris Jesudurai, Doris Cheung and Tony Calavano, along with Suzanne Paul from Corpus Christi College. A great team effort!


KEEP - Keeping Emulation Environments Portable

I recently attended a workshop of the KEEP project (Keeping Emulation Environments Portable) in Rome. KEEP is an EU-funded project to develop software that virtualizes old computer hardware and software environments. This allows you to run old operating systems and the applications that were designed for them on modern computers. The KEEP project is a multi-partner project that includes a consortium of national libraries (BNF, Koninklijke Bibliotheek), the University of Portsmouth, a computer history museum (Computerspiele Museum), commercial partners (Tessella), and the European Game Developers Association.

The project is scheduled to end in February 2012 and has already released software version 1.0.0 on SourceForge ( http://emuframework.sourceforge.net/ ). This version supports:
* 5 platforms: x86, C64, Amiga, BBC Micro, Amstrad
* 6 emulators included: Dioscuri, Qemu, VICE, UAE, BeebEm, JavaCPC
* 22 file formats supported: PDF, TXT, XML, JPG, TIFF, PNG, BMP, Quark, ARJ, EXE, disk/tape images and more
* Integration with format identification FITS
* Web services for software and emulator archives

The development team is currently working on some significant refinements of the interface that are scheduled to be released at the end of the current project in February 2012. The current list of supported platforms and operating systems can be expanded. In addition, the framework is designed so that you can add support for legacy applications if you have the software and licenses to do so.

The KEEP emulator UI has a wizard mode that allows the end user to select the disk image or application file and then let the software guess the correct platform, OS and application using FITS. There is also support for the end user to manually choose the platform, OS, and application if the wizard isn't able to determine the required environment.
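
The wizard-then-manual flow described above might look roughly like the sketch below. Everything in it is hypothetical: the lookup table, the function names, and the format labels are illustrative only and are not the KEEP framework's or FITS's actual API. The platform and emulator names are taken from the version 1.0.0 list above.

```python
from typing import Optional, Tuple

Environment = Tuple[str, str]  # (platform, emulator)

# Hypothetical lookup table: identified format -> (platform, emulator).
ENVIRONMENTS = {
    "d64": ("C64", "VICE"),
    "adf": ("Amiga", "UAE"),
    "exe": ("x86", "Dioscuri"),
}

def guess_environment(identified_format: Optional[str]) -> Optional[Environment]:
    """Wizard step: return an environment if the format is recognized."""
    if identified_format is None:
        return None
    return ENVIRONMENTS.get(identified_format.lower())

def choose_environment(identified_format: Optional[str],
                       manual_choice: Optional[Environment] = None) -> Environment:
    """Prefer the wizard's guess; fall back to the user's manual choice."""
    guess = guess_environment(identified_format)
    if guess is not None:
        return guess
    if manual_choice is not None:
        return manual_choice
    raise ValueError("environment could not be determined; manual selection needed")

print(choose_environment("D64"))
print(choose_environment("unknown", manual_choice=("BBC Micro", "BeebEm")))
```

The point of the design is that identification failure is not an error in itself; it only becomes one when the user has not supplied a manual choice either.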

I am very excited by the potential of the KEEP emulation framework. I envision it potentially satisfying multiple SULAIR use cases for providing access to legacy applications (games) and older digital manuscript materials (Word Perfect files, old operating systems). In the coming months we hope to test the KEEP framework to determine its suitability for our collections.

Related Websites or Documents:

http://www.keep-project.eu/ezpub2/index.php
http://sourceforge.net/projects/emuframework/
http://emuframework.sourceforge.net/docs/System-User-Guide_1.0.pdf


Currently in the labs - Materials from the Monuments of Printing Exhibition, Part 1

Re-Posted from the Special Collections and Archives Exhibits Program listing -

The Monuments of Printing Exhibition highlights the first 250 years of printing in the West

Johannes Gutenberg's printing of a Bible from movable type in Mainz, Germany in 1455 marked the beginning of a communication revolution in the West. Printers were able to reproduce texts efficiently in quantities virtually unimaginable to a scribe. Monuments of Printing: from Gutenberg through the Renaissance, the first of two exhibitions spanning five hundred years of printing history, demonstrates the development of typography and printing in Europe over a 250-year period as seen in selected works in the rare book collections of the Stanford University Libraries. The exhibition will open Monday, August 1, in the Peterson Gallery and Munger Rotunda on the second floor of the Bing Wing of Green Library, Stanford University, and is free and open to the public.

A leaf from the Gutenberg Bible, displayed with a lectern bible manuscript exemplar, and a volume printed in 1702 with type designed for Louis XIV, bookend the exhibition of some forty titles. In between, the exhibition explores the roots of and influences on letterforms, printing, and book design in Europe in the fifteenth through seventeenth centuries, and shows work by some of the great European printers, Nicolas Jenson, Aldus Manutius, the Estienne family, and Simon de Colines among them. Highlights include Euclid's Elements (1482) by German printer Erhard Ratdolt; the Hypnerotomachia Poliphili, printed in Venice in 1499 by Aldus Manutius and considered one of the most beautiful early printed books; the Complutensian Polyglot Bible (1514-1517), printed in parallel columns in Hebrew, Latin, and Greek type; and the 1641 Virgil printed by the Imprimerie royale, one of many fine productions of France's state press.

Monuments of Printing: from Gutenberg through the Renaissance will be on display from August 1 through November 27, 2011. The second exhibition, Monuments of Printing: from Caslon through the Book Arts Revival, will be on display December 5, 2011 through March 18, 2012.

Exhibit cases are illuminated Monday through Saturday from 10 a.m. to 6 p.m. and Sunday from 1 to 6 p.m. The gallery is accessible whenever Green Library is open and hours vary with the academic schedule. For Library hours, call 650-723-0931.

NOTE: first-time visitors must register at the south entrance portal to Green Library's East Wing to gain access to the exhibition in the Bing (west) Wing. For a map of campus and transportation information, go to www.stanford.edu/home/visitors/maps.html


New Images of Rare Books and Digitization Devices

New images have been added to the DPG Flickr site!  Click the photos below, and take a look at our Rare Book section, and the section devoted to the Atiz Book Digitization Device.

The lay of the last minstrel, a poem; by Walter Scott.
Atiz Book Digitization Device

These photos were taken by DPG's Wayne Vanderkuil and Doris Cheung.


Presidents Factor Heavily in the Wake of Presidents Day

It seems, as of late, that the Green Library has been abuzz with rare books and ephemera of a Presidential persuasion. This is to be expected, as the current Library Exhibition focuses on The American Enlightenment, and features a copy of John Milton’s Paradise Lost, which has the signatures of both Thomas Jefferson and James Madison. It also highlights some other noteworthy items from the Special Collections, which are displayed in the cases along the Library’s rotunda and halls. American History Professor Caroline Winterer, Special Collections' Exhibition Manager and Designer Elizabeth Fischbach, and Curator of Rare Books John Mustain selected every item to help flesh out an understanding of how certain aspects of the Enlightenment in Europe were interpreted across the seas -- ranging from fashion, to science, art and architecture and all other areas of life -- during that particular time period. The various display cases serve to illustrate different facets of these new ways of thinking, and also serve as a framework for the incredibly beautiful and well researched exhibition catalog and accompanying exhibition website. Indeed, the exhibition has been receiving a lot of attention from visitors and scholars, and was recently featured in an article by the San Jose Mercury News.

In a lecture that Professor Winterer gave on the curatorial preparation that went into the exhibition, she described how difficult it was to decide which books to include. She explained how the exhibition team selected items based on their content and the contributions that they made to the overall themes, not based purely on their outward aesthetic, as the old adage would remind us.

"The books are quirky like people," she said, "some of them are boring until you get to know them."

And the exhibition does just that -- it allows visitors to fully engage with the materials they selected to showcase, and encourages them to allow themselves to be fully immersed in what were once such revolutionary concepts. As with the visitors, it was a treat for the Digital Production Group to be so personally engaged with these remarkable historical items while we provided the images needed to execute the exhibition team's vision. All of the images produced for the exhibition and accompanying catalog may be viewed in the American Enlightenment section of the Stanford University Libraries Image Gallery space. While almost all of the books were quite rare and fragile, one of the most lovely was also one of the most unwieldy to work with because of its enormous size: Mark Catesby's 1754 The Natural History of Carolina, Florida and the Bahama Islands was definitely a two-person job. Almost two feet in height, the book needed a very sturdy support structure to hold it open at the perfect angle to capture the image -- while being careful not to hold it so far open that the pages lying flat would start to pop up. The weight of the book alone made it challenging even to locate the selected pages, so that one person would have to hold the accumulating pages while the other would turn them. Working with these kinds of materials always requires very slow, focused, and deliberate movements, with a consciousness of where one's hands and fingertips are at all times. Often, when imaging the most fragile items, we will have one person who is exclusively "hands on," and another who is on the computer inputting file names and making sure the images are perfect. With hundreds of files, multi-leveled metadata, and images going into a web environment, the project was far more complex to execute than the seemingly simple piles of beautiful old books would immediately suggest.

In addition to the momentous American Enlightenment Exhibition preparation, the Digital Production Group was also recently involved in an effort to digitize the personal papers of President Abraham Lincoln. Originating in a request made on behalf of the Illinois Historic Preservation Agency and Abraham Lincoln Presidential Library and Museum, the project will eventually include any noteworthy Lincoln document held by a large sampling of institutions, and will be featured on a specially tailored website, The Papers of Abraham Lincoln. The twenty-six items requested range from large-scale legal appointment documents with detailed embellishments, to small notes, seemingly written in great haste. Some were printed on curling vellum with a waxy feel to it, and some were on brittle and yellowing paper seemingly ripped from a notebook. Cropping the images was sometimes problematic, as the items often had inconsistent angles or deckled edges. The specifications for the project also required use of a specific color bar, which they provided, in order to ensure image accuracy as they gather items from multiple holdings. The color bar was placed close to each object, in every shot, and was analyzed to confirm that all the colors in the documents were depicted correctly. But in the minds of our photographers, the best part of all was trying to decipher the intricate cursive handwriting in these various items. Lincoln’s scrawled and slanting “A” became almost commonplace, and the lovely script of his letters’ transcriptions made all the more real an era before scanners, photocopiers, and PDFs.

Interested in hearing more about the project, Palo Alto Online featured a Presidents Day article that included an interview with Special Collections’ Head of Public Service and Processing Manuscripts Librarian, Mattie Taormina. Mattie was the person who gathered the items from their various collections, and was present to observe these important documents as they were being digitized for the project. During the shoot, DPG's Production Coordinator and Digitization Specialist, Doris Cheung, was keeping track of files and naming the images, while Rare Book and Special Collections Digitization Specialist Astrid Smith carefully handled each of the items and operated the camera. Though the photography sessions only lasted a few hours, the post-production and image quality assurance steps took much longer. Palo Alto Online’s Karla Kane describes the results of these combined efforts, saying, “The Lincoln project is one of many ways in which the university's archives contribute to ongoing scholarship and interest in historical figures.”

In terms of the “hype” that surrounds Special Collections items that are associated with particular historical figures, part of what made contributing to the Lincoln Project such a special undertaking was the mere act of coming into physical contact with these documents.  Even just seeing them in scale on the screen made President Abraham Lincoln all the more real – as though he stepped down from his famous larger-than-life marble seat and entered into the realm of personal experience. And that is what making these and other important historical documents readily available is, ultimately, all about.


* * *
To see some more candid shots of the Lincoln Papers, visit our flickr page.

The documents included in this entry are: M0002, Partially Printed Document Signed, Appointment of William Clarke as Assistant Adjutant General of Volunteers with the rank of Captain, signed by Abraham Lincoln and Edwin Stanton, 3/4/1863, and M0206, Autograph Letter Signed, John A. Dahlgren to Abraham Lincoln, 6/10/1861.

Information on visiting the American Enlightenment Exhibition: The physical exhibition will be on display in the Peterson Gallery and Munger Rotunda, Green Library Bing Wing, Stanford University, February 7 through May 15, 2011. The gallery is accessible whenever Green Library is open; case lights are on Monday–Saturday from 10 a.m. to 6 p.m. and Sunday from 12 to 6 p.m. Building hours vary with the academic schedule, so it's a good idea to call the Green Library hours recording line at 650-723-0931 or go to http://library.stanford.edu/depts/green/about/grnhours.html before you make the trip.  NOTE: Visitors without a Stanford University i.d. must register at the south entrance portal to Green Library’s East Wing to gain access to the exhibition. For a map of campus and transportation information, go to www.stanford.edu/home/visitors/maps.html


DLSS Video Lab Opens

DLSS has a new lab! In late September, under the roof of the Stanford Media Preservation Lab located at SULAIR's site on Page Mill Road, we installed equipment to support the digitization of video collections held at Stanford Libraries. Two digitization workstations, a host of analog video tape players and supporting system components, and tools for cleaning and repairing aging videotapes and other recording media are installed and in production. To put it all in operation, Michael Angeletti started as Stanford's first Moving Image Digitization Specialist. The lab is already humming with a handful of patron access requests and active planning for reformatting projects to be undertaken in the coming months.

With this expansion of the media lab -- we've had an audio digitization studio in production since 2008 -- SULAIR has completed a major step in a multi-phase effort to build internal capacity for digitally preserving its sound and moving image collections. The gear and staff expertise are in place. Now we will focus our attention on refining workflows and developing tools to support them, as well as on establishing best practices, so that the lab produces high-quality work efficiently and reliably.

Interested in a tour of the media lab? Let me know!

