May 14-15, 2010

Digitized medieval manuscripts represent a potentially discipline-changing resource for scholars, especially if they can become more than simply images on the screen. This meeting is intended to bring technically-minded individuals and projects together to discuss roadmaps, wishlists, and the potential for converging plans for repositories holding digital manuscript collections, and the tool-makers creating software to use those manuscript collections (not necessarily mutually exclusive groups).

We hope that our two days together will provide an interchange of ideas and information that can lead to defining the technical needs for a digital manuscript community, produce documentation of those needs, and perhaps spark collaboration to begin to build an infrastructure that will support such a community.

In the links below, you will find information pertaining to the background and agenda of this meeting as we envisaged it in the planning stages - but a true working meeting will likely take some unexpected turns. Please feel free to make suggestions!

Over the last decade, energy and resources have been invested in numerous projects to digitize, in whole or in part, significant collections of medieval manuscripts, and to apply computer-based tools and methodologies to the study of medieval manuscripts. These investments have been made on both sides of the Atlantic. The Andrew W. Mellon Foundation has been an important participant in this process, supporting collection-centric projects like the Cambridge-Stanford collaboration to create the Parker Library on the Web, cross-collection projects like e-codices, a successor to CESG based at the Université de Fribourg, and Roman de la Rose, involving The Johns Hopkins University and the Bibliothèque nationale de France, plus tool-development projects like Edition Production and Presentation Technology, initially based at the University of Kentucky. Other important work has been funded by the Deutsche Forschungsgemeinschaft in Germany, the Joint Information Systems Committee in the UK, the National Endowment for the Humanities in the United States, and by other private foundations and individual institutions. It is important to note, however, that despite these investments, only a tiny fraction of surviving manuscripts has been digitized to date.

The workshop on Uses of Digitized Collections of Medieval Manuscripts and Interoperation, organized by Stanford at the Andrew W. Mellon Foundation’s request, held in January 2010, constitutes an interesting report card on progress to this point. Reports to the workshop indicate that purpose-driven and opportunistic digitization of the medieval documentary record seems likely to continue for the foreseeable future, gradually expanding the limited range of materials now available in digital form. Workshop participants generally agreed that more content enables more (and more interesting) uses. Alongside the content, transcription and annotation tools (e.g., Digital Mappaemundi and EASEE, both reported to the workshop) are objects of considerable interest, and seem likely to remain so. It is significant that two projects not specifically reported to the workshop but in some ways similar to EASEE and Digital Mappaemundi -- the German TextGrid project and the Maryland Institute for Technology in the Humanities’ Text-Image Linking Environment (TILE) -- have recently been funded or re-funded, and will therefore be active at least until 2012. Though neither is focused on medieval manuscripts, both have known applications in this space.

On the other hand, some participants differentiated application projects from research per se. EASEE and Digital Mappaemundi both demonstrated the ability to apply computer-based tools to the evidence in digitized medieval manuscripts, superimposing annotations and transcription grids on manuscript pages. Future use-cases might invoke computer-based applications, but focus on the pursuit of answers to important and consequential questions about many topics currently of interest to scholars, e.g., communities of readership, dialects, textual transmission, the process of manuscript production and dispersal, diachronic development of literary, musical, and iconographic genres, liturgical and legal developments.

The testimony of the workshop also suggested that genuine interoperability across several collections, or between applications and collections, remains imperfect, awkward and inefficient, and that custom work is too often required from both application and repository to support each instance of interaction. Presenters and respondents generally agreed that the next 2-3 years are ripe for substantial progress to simplify the links between digitized manuscript resources and new computer-based applications used to organize, compare, annotate and transcribe manuscripts. Several presentations underscored the central importance of interoperation, defined both as easy and relatively transparent user access to resources in multiple repositories, and as interoperation between data and user-driven applications, where users seek not just to view primary data, but to manipulate it for comparison and analysis in user-managed environments. In this perspective, better interoperation not only enables better scholarship, but is also a means to mitigate inter-project redundancy. It therefore follows that better interoperation should also advance sustainability.

Although the workshop demonstrated that research requires interoperation among digital resources and applications, its composition precluded serious discussion of technical solutions to problems like diverse programming platforms, uneven adoption of standards for imaging and metadata, long-term sustainability, and the reality of heterogeneous business models. These issues as they relate to medieval manuscript materials are the focus for the technical meeting we propose here. Solutions could apply beyond medieval manuscripts, however. Medieval source materials present a number of special challenges that are nonetheless echoed in other fields. Solutions for medieval materials could inform more generally applicable solutions. If, for example, one could solve the problems presented by annotation across medieval manuscripts found in multiple repositories, such solutions might be applied to datasets like nineteenth-century ephemera where some problems are similar.

Agenda and Ideas

  • Identify common approaches, differences and overlaps in the technical architectures of major content repositories. Repositories will continue to have diverse approaches to storing their data and designing their local navigation and display environments. The identification of similarities and differences in technical architecture is a prerequisite to the definition of best practices for data sharing.
  • Compare the functions of the applications – both “what” and “how” – so that participants understand clearly if and how the various applications access and ingest content, if and how they transform data, and what the applications require of the target data.
  • Analyze development roadmaps and wish lists, paying special attention to timelines. The core repositories for this meeting – Parker on the Web, e-codices, Roman de la Rose, BnF – all have ongoing development roadmaps. This is also true of the major tool-development projects. By comparing lists, we will identify areas where interests and, importantly, timing align. That is, what similar problems are being addressed simultaneously by more than one party? This will expose prospects for fruitful near-term collaboration.
  • Analyze metadata schemas, protocols for exposing data and methods of aggregation. Two elements underlie all of the repository projects: images and descriptive metadata. Tools must, by necessity, address both of these elements. While it is not necessary to agree upon a single metadata schema, or methods for exposing and indexing that data, very little interoperation can occur without an understanding of the methods by which data is produced and shared, and strategies that would enable disparate datasets to be absorbed into a general search and discovery index.
  • Analyze image-related standards including file formats, naming conventions, and methods of delivering images. In an interoperable environment, digital images are not simply facsimiles, but objects of research. To enable the research, repositories and tools must ensure that a core level of reliable image data lies at the heart of all activity. To annotate, transcribe, map, or otherwise add data to this core level, images must be created in ways that make them usable. Interoperability between image-based projects depends upon well-articulated best practices for image creation and delivery.
  • Analyze technologies for annotation, schemas for representing annotations, and approaches to personal, group and public sharing of annotations. We recognize that a large number of tools are being created that perform similar tasks focused on linking user-supplied data to one or more specific regions in one or more digital images. The importance of annotation to scholarship is illustrated by the Foundation’s support for the Open Annotation Collaboration. In the spirit of the OAC, and with OAC in attendance, our meeting will help situate manuscript-related annotation tools in the broader landscape and define how DMSS repositories can best support this open annotation activity.
  • Begin high-level requirements definition for interoperability among digital manuscript repositories and applications and identify high-level design options, including persistent identifiers, core services and common APIs. These initial steps are prerequisite to delivering the types of services that have been identified by first-generation scholarly use-cases, and can only be developed through a collaborative effort between the repositories providing content and tool makers able to exploit content for research purposes. Our meeting is a first step in this process.
  • Develop interoperability strategies that are flexible enough to accommodate different business models and access rules. The current set of repository partners already represents heterogeneous business models. Different rules for access to data and services need not constitute an impediment to interoperation. This will be a premise of the meeting.
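
To make the annotation item above more concrete, here is a minimal Python sketch of one way an annotation record might link user-supplied data to regions in images held by more than one repository. It is offered only as a discussion aid, not as a proposed schema: the class names, the "repository:object" identifier format, the rectangular pixel regions, and the visibility values are all assumptions made for illustration and do not describe any participating project.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Region:
    """A rectangular region of an image, in pixel coordinates (assumed shape)."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class Target:
    """One annotated region of one image."""
    image_id: str  # hypothetical persistent identifier, "repository:object"
    region: Region


@dataclass
class Annotation:
    """User-supplied data linked to one or more image regions."""
    body: str
    creator: str
    visibility: str = "personal"  # assumed values: "personal", "group", "public"
    targets: List[Target] = field(default_factory=list)

    def repositories(self) -> List[str]:
        """Return the distinct repository prefixes referenced, in order of first use."""
        seen: List[str] = []
        for target in self.targets:
            prefix = target.image_id.split(":", 1)[0]
            if prefix not in seen:
                seen.append(prefix)
        return seen


# A single note pointing at regions in two (made-up) repositories.
note = Annotation(
    body="Same marginal gloss appears in both witnesses.",
    creator="example-scholar",
    visibility="group",
    targets=[
        Target("repoA:ms-001/f12r", Region(120, 340, 400, 60)),
        Target("repoB:ms-417/f3v", Region(80, 510, 380, 55)),
    ],
)
```

Keeping the targets as identifiers rather than embedded image data means the annotation can be stored and shared independently of either repository, which is the kind of separation between content and tools that the agenda items above are meant to explore.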

A set of modest next steps has the potential, in our view, to create a basis for cost-effective specialization among projects in the medieval manuscript space. With interoperation, resource-oriented projects would be able to concentrate on content that is exposed to external applications. In their turn, application-based projects would be able to assume easy (if not always free of charge) access to growing corpora of base data. In this way, expensive and largely unsustainable “silo” projects combining data and applications could be minimized, not only eliminating barriers to scholarship but potentially also reducing costs. A consistent, interoperable approach to content and annotation could also give rise to a reliable, decentralized strategy for persistent access to annotations and their underlying “primary” content, making computer-based manuscript scholarship repeatedly reusable and persistently citable.


  • Albritton, Benjamin (Stanford University)
  • Aster, Cathy (Stanford University)
  • Bonicel, Matthieu (Bibliothèque nationale de France)
  • Chartrand, James (TAPoR, PACE project)
  • Cramer, Tom (Stanford University)
  • Deering, Jon (EASEE project, Saint Louis University)
  • Gietz, Peter (TextGrid Project)
  • Jefferies, Neil (University of Oxford)
  • Jesudurai, Christopher (Stanford University)
  • Kim, Douglas (Stanford University)
  • McAulay, Elizabeth (UCLA)
  • Patton, Mark (Johns Hopkins University)
  • Sanderson, Rob (Open Annotation Collaboration, Los Alamos National Laboratory)
  • Schwemmer, Rafael (e-codices)
  • Snydman, Stu (Stanford University)


Meeting Location:

The meeting will take place in Classroom 127 of Wallenberg Hall, from 9 am to 5 pm each day, with a break for a catered lunch at midday and coffee breaks in the morning and afternoon. The location of the hall, and directions for parking (if necessary), can be found here.

For those staying at the Stanford Guest House:


You will be staying at the Stanford Guest House, 2575 Sand Hill Road, Menlo Park, California. (650) 926-2800.
Your room has been booked, under your name, and charged to our meeting account. You can check in anytime after 3 pm on Thursday, May 13.

Maps and directions to the Guest House can be obtained here:

If you are arriving directly from SFO, we recommend the Super Shuttle service for a door-to-door trip (approximately $30 USD each way, for which you will be reimbursed).

Your room comes with a continental breakfast, an internet connection, laundry facilities, and many of the usual amenities found in a hotel.


The Stanford Guest House is located off-campus. In order to get to campus, we recommend:

1) The Marguerite Shuttle. The Marguerite is the Stanford campus bus service and runs on a regular schedule Monday to Friday. For your purposes, then, this will be the easiest way to get to campus on Friday morning. You will be taking the SLAC line, which will be running to campus approximately every 20 minutes starting at 7:50 or so. To make the 9:00 am start time, I suggest taking either the 8:10 or 8:30 bus. This is a free shuttle service. The full schedule for the SLAC line is at:

2) If you have rented a car, on-campus parking information can be found here, or see the recommended parking instructions under “Meeting Location” above.

3) It is also possible to walk, though it is quite a long walk, from the Guest House to campus. I make the walk once or twice a week – at a fairly brisk pace you can count on an hour from door to door. Ask the Guest House staff for directions if you are going to walk.

On Saturday: we have booked the Stanford Guest House van to transport all meeting attendees staying at the Guest House. The van will leave at 8:30 am.

On Friday, we will discuss transportation options for returning to the Guest House after the group dinner on Friday, and any evening events on Saturday.

For those staying at the Stanford Faculty Club:


You will be staying at the Stanford Faculty Club, 439 Lagunita Drive, Stanford, California. (650) 723-9313.
Your room has been booked, under your name, and charged to our meeting account. You can check in anytime after 12 pm on Thursday, May 13. If you are going to be arriving after 6 pm, please let me know – there will be additional instructions to follow.

Maps and directions to the Faculty Club can be obtained here:

If you are arriving directly from SFO, we recommend the Super Shuttle service for a door-to-door trip (approximately $30 USD each way, for which you will be reimbursed).

Your room comes with a continental breakfast, an internet connection, laundry facilities, and many of the usual amenities found in a hotel.


The Faculty Club is approximately a ten-minute walk from the meeting location. If you have mobility issues, please let me know so that alternate transportation arrangements can be made.

Note: On Friday, we will discuss transportation options for returning to the Faculty Club after the group dinner on Friday, and any evening events on Saturday.


Information about reimbursements to follow.

