Our Digital Archivist (Peter) and I spent some time working with EnCase Forensic software and experimented with running a few search strings against a Robert Creeley computer disk. We added the words “reference” and “SSN” to the search script and got back ten hits across hundreds of files. All ten hits were letters of reference Robert Creeley wrote on behalf of colleagues and students. The results were very interesting, and I’m beginning to see how we might incorporate this sort of technology into the daily activities of our digital archivist. Digital archivists will need to sift through large volumes of data and demonstrate a good faith effort to redact any files containing financial, student, or health records. EnCase Forensic is one tool that can provide us with this capability; we’ll need to run similar experiments with other software tools.
One particularly interesting bit of data I gleaned from today’s test was the time it took to run the search. A single 1.44 MB floppy disk required almost 4 minutes to search. We will need to develop a list of search terms that flag files for a digital archivist’s review and run this list against large bodies of data in an automated way. EnCase already ships with built-in GREP search strings for finding credit card data, telephone numbers and IP addresses. We’ll clearly need to tailor these to meet our archival needs.
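To make the idea of an automated term list a little more concrete, here is a rough Python sketch of what such a scan could look like outside of EnCase. It is not EnCase's own search syntax or scripting language, and it only reads ordinary files in a folder, whereas a forensic tool works against the whole disk image. The keyword list holds just the two terms from our test, and the SSN pattern (digits grouped 3-2-4 with hyphens) and the function names `scan_bytes` and `scan_tree` are my own assumptions for illustration.

```python
import re
from pathlib import Path

# Hypothetical term list; our test only used "reference" and "SSN".
KEYWORDS = ["reference", "SSN"]

# Assumed pattern: U.S. Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")

# Whole-word, case-insensitive match for any keyword, applied to raw bytes
# so that files in unknown formats can still be scanned.
KEYWORD_PATTERN = re.compile(
    rb"\b(" + b"|".join(re.escape(k.encode("ascii")) for k in KEYWORDS) + rb")\b",
    re.IGNORECASE,
)


def scan_bytes(data: bytes) -> list[str]:
    """Return a sorted list of labels for each kind of hit found in data."""
    labels = set()
    for match in KEYWORD_PATTERN.finditer(data):
        labels.add("keyword:" + match.group(1).decode("ascii").lower())
    if SSN_PATTERN.search(data):
        labels.add("pattern:ssn")
    return sorted(labels)


def scan_tree(root: str) -> dict[str, list[str]]:
    """Scan every regular file under root; return {path: labels} for flagged files."""
    flagged = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            labels = scan_bytes(path.read_bytes())
            if labels:
                flagged[str(path)] = labels
    return flagged
```

Calling `scan_tree` on a folder of files copied from a disk would return only the files that need a closer look, along with the reason each one was flagged. A hit is a prompt for the archivist to review the file, not a decision to redact it.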
All of this experimentation with forensic software is occurring at just the right time. I’ve been working with my colleague Cathy Aster to develop digitization lab software requirements for all of our digital labs (image, media preservation, forensics). I’m just now beginning to see where commonalities in workflow might exist across all of these lab spaces. I'll talk about what I'm learning as insights appear.
