What do you want to get out of the meeting?

Here is a rough (incomplete) list of what I heard you want to get out of
the next two days:

* learn how the community can benefit from the TextGrid infrastructure
* can web services be useful for interoperability?
* what cool technologies are out there?
* what concrete steps can we take to expose our data to other applications?
* how can we prevent reinvention of the wheel?
* learn how the OAC framework can benefit our work
* strategies for cleaning up siloed collections
* can the same data models be used for medieval manuscripts and more
modern formats?

General Annotation/Transcription Use-Case (Parker/TPEN/OAC)


A step-by-step walk-through of the interactions (repository, transcription/annotation tool, dissemination and aggregation of annotations, ingest back into the repository).


Diagram refined by Lisa McAulay

Dictionary of Old English Use Case

The Dictionary of Old English
Centre for Medieval Studies
University of Toronto

- DOE is zooming into Parker images in the Parker environment and
storing links to the Parker pages
- Links are to specific words within the context of a page of a
manuscript that are problematic or impossible to transcribe
- DOE is a subscription service with login that could link to other
repositories (e.g. e-codices, Bodleian, Vatican, Rose)

- Possible features
* stream images out to DOE's environment
* create bounding boxes around text
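
A link to a specific word on a page, as in the DOE use case above, can be sketched as a page address plus a bounding box. This is only an illustration: the `RegionLink` class, the `repository.example.org` address and the pixel values are hypothetical, and the `#xywh=x,y,w,h` fragment follows the W3C Media Fragments syntax rather than anything Parker or DOE is known to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionLink:
    """A link to a rectangular region of one page image (hypothetical shape)."""
    page_url: str  # address of the page image in the host repository
    x: int         # left edge of the box, in image pixels
    y: int         # top edge of the box, in image pixels
    width: int
    height: int

    def to_url(self) -> str:
        # Media Fragments spatial syntax: #xywh=x,y,w,h (pixels by default)
        return f"{self.page_url}#xywh={self.x},{self.y},{self.width},{self.height}"


def parse_region_link(url: str) -> RegionLink:
    """Recover the page address and bounding box from a region link."""
    page_url, _, fragment = url.partition("#xywh=")
    if not fragment:
        raise ValueError("no xywh fragment in URL")
    x, y, w, h = (int(part) for part in fragment.split(","))
    return RegionLink(page_url, x, y, w, h)


# Placeholder address and coordinates, for illustration only.
link = RegionLink("https://repository.example.org/ms/0001/page/12", 340, 1210, 180, 60)
print(link.to_url())
```

Because the box travels inside the link itself, a subscriber service could store the link without copying the image.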

Web Services Mind Map (Pretty)


Rafael Schwemmer

Web Services and Resource Maps

Web Services
  get image
  Specific MSS
    Resource maps (structural & technical metadata)
      Map A
        page labels
      Map B
    All description (descriptive metadata)
      Description A
        author of description
        format (RDF, TEI, DC, HTML)
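
One possible reading of the outline above, written as data structures. Everything here is an assumption made for illustration: the class and field names are hypothetical, and the nesting (resource maps and descriptions both hanging off a specific manuscript) is only one interpretation of the mind map.

```python
from dataclasses import dataclass, field
from typing import List

# Description formats named in the outline.
ALLOWED_FORMATS = {"RDF", "TEI", "DC", "HTML"}


@dataclass
class ResourceMap:
    """Structural and technical metadata for one manuscript."""
    name: str
    page_labels: List[str] = field(default_factory=list)


@dataclass
class Description:
    """Descriptive metadata for one manuscript."""
    author: str
    format: str


@dataclass
class ManuscriptRecord:
    identifier: str
    resource_maps: List[ResourceMap] = field(default_factory=list)
    descriptions: List[Description] = field(default_factory=list)

    def descriptions_in(self, fmt: str) -> List[Description]:
        """Return only the descriptions available in the requested format."""
        if fmt not in ALLOWED_FORMATS:
            raise ValueError(f"unknown description format: {fmt}")
        return [d for d in self.descriptions if d.format == fmt]


# Placeholder record, for illustration only.
record = ManuscriptRecord(
    identifier="ms-0001",
    resource_maps=[ResourceMap("Map A", ["1r", "1v", "2r"]), ResourceMap("Map B")],
    descriptions=[Description("Cataloguer A", "TEI"), Description("Cataloguer B", "DC")],
)
print([d.author for d in record.descriptions_in("TEI")])
```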

Friday Whiteboards

Friday Morning Idea-Catcher

Friday Morning Agenda Planning and Potential Partnerships

Potential partnerships: SLU/Rose miniature parsing; ontology for ms sequencing

Friday Afternoon Outline, Saturday Planning, and 3-Step Program for Saturday Work

1) Requirements (Use-Cases)
2) Solution formulated as API (Web Services)
3) Technology Needed for Solution Implementation (winning technologies)

Page Turners and Viewing Environments

Image Viewing/Browsing, Page Turner:
• Multiple images brought together
• "viewer requirements"
○ Thumbnails
○ Side-by-side images, or multiple images "non-contiguous"
○ Zoom
○ Pan
○ Rotate
○ Select a bounding box
○ Reading "directionality" (in multi-image and thumbnail) (W3C rec.)
○ Resize viewport
○ Full-screen support (FSI, Zoomify)
○ Thumbnails of other set (additional menu display functionality)
○ Embeddable
○ Addressable
○ JavaScript hooks (observe what the user is doing, and control/record it)
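
The side-by-side and reading "directionality" requirements interact: the same sequence of images has to be paired differently for left-to-right and right-to-left books. A minimal sketch, assuming a codex whose first image is a single page shown on its own; the `spreads` function and the folio labels are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Spread = Tuple[Optional[str], Optional[str]]  # (left image, right image)


def spreads(pages: Sequence[str], direction: str = "ltr") -> List[Spread]:
    """Group page images, given in reading order, into two-page openings."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    if not pages:
        return []
    # In a left-to-right book the first page sits alone on the right.
    openings: List[Spread] = [(None, pages[0])]
    # Remaining pages pair up as (earlier page, later page).
    for i in range(1, len(pages), 2):
        later = pages[i + 1] if i + 1 < len(pages) else None
        openings.append((pages[i], later))
    if direction == "rtl":
        # Mirror every opening so the earlier page sits on the right.
        openings = [(right, left) for left, right in openings]
    return openings


pages = ["1r", "1v", "2r", "2v"]
print(spreads(pages, "ltr"))
print(spreads(pages, "rtl"))
```

Top-to-bottom layouts would need a further case that this sketch does not cover.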

Fedora Repositories

We have all three groups of Fedora users here: active users, prospective users, and abandoners.

- Uses Fedora API and object model, but nothing else from Fedora
- Uses Fedora in a more expendable way
- Have a custom REST layer on top of Fedora
- Uses Solr to handle the majority of queries

- Is moving towards Fedora
- Currently using Fedora for building a digital repository
- Uses Fedora content and object models
- Parker Library on the web content is not in Fedora


Persistence Issues

Various systems for uniquely identifying resources: ARK, DOI, TinyURL, UUID

A repository can provide more than one reference, using more than one system, for the same resource. It can use one identifier internally (e.g. a UUID) and also expose others externally, such as a TinyURL and a DOI.

Human-readable identifiers can be helpful, for example to verify that a link is working.
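
The "one internal identifier, several external ones" idea can be sketched as a small lookup table. This is an illustration under stated assumptions, not a description of any participant's system: the `IdentifierRegistry` class is hypothetical, and the ARK and DOI values are placeholders.

```python
import uuid
from typing import Dict, List, Optional, Tuple

Alias = Tuple[str, str]  # (scheme, value), e.g. ("doi", "10.5555/example")


class IdentifierRegistry:
    """Maps external identifiers of any scheme onto one internal UUID."""

    def __init__(self) -> None:
        self._aliases: Dict[Alias, uuid.UUID] = {}

    def mint(self) -> uuid.UUID:
        """Create a new internal identifier for a resource."""
        return uuid.uuid4()

    def add_alias(self, internal_id: uuid.UUID, scheme: str, value: str) -> None:
        """Register an external identifier that points at internal_id."""
        key = (scheme.lower(), value)
        existing = self._aliases.get(key)
        if existing is not None and existing != internal_id:
            raise ValueError(f"{scheme}:{value} already names another resource")
        self._aliases[key] = internal_id

    def resolve(self, scheme: str, value: str) -> Optional[uuid.UUID]:
        """Return the internal identifier for an external one, if known."""
        return self._aliases.get((scheme.lower(), value))

    def aliases_for(self, internal_id: uuid.UUID) -> List[Alias]:
        """List every external identifier registered for a resource."""
        return sorted(k for k, v in self._aliases.items() if v == internal_id)


registry = IdentifierRegistry()
resource_id = registry.mint()
# Placeholder identifier values, for illustration only.
registry.add_alias(resource_id, "ARK", "ark:/99999/fk4example")
registry.add_alias(resource_id, "DOI", "10.5555/example")
print(registry.aliases_for(resource_id))
```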


Ben provides a summary of experience collecting metadata from 6
repositories for inclusion into a common index.

* received only XML, but in more than one schema:

** EAD
** Dublin Core

What are the common elements across the different corpora?

For EAD, the challenge is navigating up and down the EAD hierarchy
to produce the index at the right level of analysis.

The BnF concatenates all levels to crosswalk to Dublin Core to expose
via OAI.

The general challenge is transforming a structured (hierarchical) format into a flat one.
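
The "concatenate all levels" approach to flattening can be sketched as follows. This is an assumption-laden illustration, not the BnF's actual crosswalk: the sample is a heavily simplified EAD-like fragment with made-up titles, and the flat record has a single `title` field standing in for a Dublin Core title.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Simplified, EAD-like sample; real EAD has namespaces and many more elements.
SAMPLE = """
<archdesc>
  <did><unittitle>Example Collection</unittitle></did>
  <c>
    <did><unittitle>MS 1</unittitle></did>
    <c><did><unittitle>Psalter, ff. 1r-80v</unittitle></did></c>
    <c><did><unittitle>Calendar, ff. 81r-86v</unittitle></did></c>
  </c>
</archdesc>
""".strip()


def flatten(node: ET.Element, inherited: List[str]) -> List[Dict[str, str]]:
    """Emit one flat record per lowest-level component.

    Each record's title concatenates the titles of all its ancestors,
    so the context held by the hierarchy survives in the flat form.
    """
    title = (node.findtext("did/unittitle") or "").strip()
    path = inherited + [title] if title else inherited
    children = node.findall("c")
    if not children:
        return [{"title": " / ".join(path)}]
    records: List[Dict[str, str]] = []
    for child in children:
        records.extend(flatten(child, path))
    return records


records = flatten(ET.fromstring(SAMPLE), [])
for record in records:
    print(record["title"])
```

Indexing at a different level of analysis would mean stopping the recursion higher up the tree instead of at the leaves.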

Image Presentation


Our discussion revolved largely around a standard way of capturing the sequence
of images (ordinal numbering) versus naming/titling the images (cardinal:
roman, arabic, 1r, 1v, etc.).

Here is our list of topics that we wanted to discuss re: images; * marks the
ones we actually talked about:

1. Format: TIFF vs. JPEG 2000
2. File-naming conventions *
3. Sequence and organization / foliation *
4. Technical metadata *
5. Technology for streaming and delivering
6. Page turning *
7. Capture standards; organization *
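
The relationship between image sequence numbers and folio labels can be sketched for the simplest case. The two functions below are hypothetical and assume a perfectly regular book: image 1 is folio 1 recto, and every leaf has both sides photographed. Real manuscripts with covers, flyleaves, or missing leaves would need an explicit lookup table instead.

```python
def folio_label(sequence: int) -> str:
    """Map a 1-based image sequence number to a folio label (1 -> 1r, 2 -> 1v)."""
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    folio = (sequence + 1) // 2
    side = "r" if sequence % 2 == 1 else "v"
    return f"{folio}{side}"


def sequence_number(label: str) -> int:
    """Inverse of folio_label: map '2r' back to its image sequence number."""
    folio, side = int(label[:-1]), label[-1]
    if folio < 1 or side not in ("r", "v"):
        raise ValueError(f"not a folio label: {label}")
    return 2 * folio - (1 if side == "r" else 0)


print([folio_label(n) for n in range(1, 7)])
```

Keeping the ordinal sequence as the stored value and deriving or looking up the label keeps page turners independent of how a given manuscript is foliated.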



APIs and Services

Main thought: smaller chunks of interlinked services (e.g. James, Neil, Tom)

Transcription: REST calls (JSON) to save. Each API is a REST call that does
one function and can be implemented in any way at the back end.
Broadcast list of MS descriptions
Expose full/brief metadata
Image streaming (dimensions, format etc)
Service registry
Citation (down to page region)
Text search
Access Control (Auth'n and Interoperability)
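
The "one REST call, one function" idea for saving a transcription can be sketched as a function that builds the request without sending it. Everything here is hypothetical: the path layout, the JSON field names and the sample values are assumptions for illustration, not an agreed API.

```python
import json
from typing import Tuple


def build_save_request(
    manuscript: str, page: str, region: Tuple[int, int, int, int], text: str
) -> Tuple[str, str, bytes]:
    """Build (method, path, body) for saving one transcribed line.

    The region is (x, y, width, height) in image pixels, so the saved
    text can be cited down to a page region.
    """
    x, y, width, height = region
    body = {
        "manuscript": manuscript,
        "page": page,
        "region": {"x": x, "y": y, "width": width, "height": height},
        "text": text,
    }
    path = f"/manuscripts/{manuscript}/pages/{page}/transcriptions"
    return "POST", path, json.dumps(body, sort_keys=True).encode("utf-8")


# Placeholder values, for illustration only.
method, path, payload = build_save_request("ms-0001", "12r", (340, 1210, 180, 60), "sample line")
print(method, path)
print(payload.decode("utf-8"))
```

The caller only sees the method, path and JSON body, so the back end is free to store the result in a repository, a database or flat files.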

Neil: Objects in system for registry
Rafael: What is a common API? (to be discussed)

Technology implications:

Medieval Manuscripts or More?

Mediaeval only? Gives focus on use cases

Physical, digitally described, cultural?

Manuscript catalogs are pretty unique

Need to keep in touch with the broader community for tool re-use possibilities

Eastern manuscripts blur the written/print distinction
"Mediaeval" is only well defined for Western materials

RTL, Top-to-bottom visualisations (page turners etc.)

Folio-recto-verso vs page numbering

Stay with Western MSS for the present discussion; look to expand in the
future, possibly via SIGs


Neil Jefferies

Day 1 - Morning Summary

Items to share/discuss:

- matrix or registry of tools
- page turner
- extract folio structure (TPEN)
- transcription/editor
- search term highlighting
- list of preferred technologies
- (internal) API list
- merits & demerits of using a Fedora repository
- standards: TEI, EAD, etc
- PID/PURL issues
- broader applicability to other content types in addition to digitized medieval manuscripts
- what do we need to do to our image resources (what are the requirements) for image streaming?
- metadata streaming flavors? (what data do consumers/partners want?)
- OAC reference implementation

Rob's Architectural List of Possible Styles

Rob's architectural list of possible styles:

Authentication: Shibboleth vs OpenID/OAuth
Web API style: Grid/SOA[P] vs REST/Linked Data
User Interface: Flash vs HTML5
Client Type: Rich Client vs Web
Server Type: Repo vs Simple


Rob Sanderson
