This week’s readings on the preservation of digital materials seem to speak more to the concerns of librarians and archivists than to those of humanities scholars themselves. They reminded me a lot of the discussions I had while studying for my master’s in library science, during which I also focused on archival work. What really struck me during those studies, and in the readings this week, is the sheer amount and ephemeral quality of born-digital materials. Despite the fact that most of us acknowledge the superior capabilities of digital formats for creating data, projects, etc., it is also true that we have not come up with a better medium than paper for long-term storage. Not only does paper not need an appropriate “reader”, such as a CD drive, floppy disk drive, VCR, etc., it is also highly stable in most cases and still readable after sustaining some damage. Moreover, preservation of paper is mostly passive (keeping it out of the way of water, fire, acid, etc.), while preservation of digital materials requires constant recopying, either to the same type of media (CDs begin to deteriorate after about 10-15 years) or to a completely new medium (if we aren’t going to keep a museum’s worth of old readers, we need to eliminate data storage on old media types). This requires tons of human power, funding, time, planning, etc. Of course, paper isn’t a cure-all either, especially for born-digital projects. Obviously, no one is going to print out every single one of their thousands of emails for posterity, and there are many digital works that aren’t simply text, so they cannot feasibly be stored in paper format. In some ways, it feels like we’ve opened a Pandora’s box with the creation of such an overwhelming amount of born-digital material, but of course all we can do is adapt and try to intelligently create best practices as we go along.
The authors this week have obviously thought a great deal about these issues, but they certainly don’t offer a cure-all for these problems; it is heartening, however, that they have offered plans for a way forward. I especially like the goals or steps laid out by The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials:
- Identifying the data to be preserved
- Adopting standards for file formats
- Adopting standards for storage media
- Storing data on and off site in environmentally secure locations
- Migrating data
- Refreshing data
- Putting organizational policy and procedures in place
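To make the “refreshing data” step a little more concrete, here is a minimal sketch of what recopying a file to new media might look like in Python. This is my own illustration, not something taken from the NINCH Guide: the function names `sha256_of` and `refresh_file` are hypothetical, and the use of a SHA-256 checksum to confirm that the copy matches the original is an assumption about how one might verify the result.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_file(source: Path, destination: Path) -> str:
    """Copy source to destination and confirm the copy is bit-identical.

    Hypothetical helper: returns the shared checksum, or raises
    ValueError if the copy does not match the original.
    """
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies contents and, where possible, metadata
    after = sha256_of(destination)
    if before != after:
        raise ValueError(f"Checksum mismatch after copying {source}")
    return before
```

Even a sketch this small hints at why digital preservation is so labor-intensive: every file has to be read, copied, and read again, and someone has to decide where the checksums are recorded and what happens when one fails.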
The ethical and technological issues raised by Matthew G. Kirschenbaum in “Digital Forensics and Born Digital Content in Cultural Heritage Collections” about mining a donor’s or subject’s computer for historically pertinent information show that librarians, archivists, and scholars will not only need the technological capabilities to engage in this activity, but will also need to seriously consider the ramifications of having access to data that may not have been intended for public view. Of course, this is not necessarily a new problem; as we have seen in recent years with revelations about Thomas Jefferson’s dealings with slaves, for example, even manuscript or printed materials created during a person’s life do not necessarily leave the legacy he or she intended. Issues of provenance or authenticity when it comes to born-digital data also have a basis in the techniques and policies for dealing with physical media; while techniques such as materials analysis, handwriting analysis, etc. may not be applicable, chain of ownership, word-usage analysis, etc. will still be valuable tools in the arsenal. In fact, the text mining techniques that we have discussed in other weeks could be an increasingly valuable tool for analyzing and determining the authenticity of bodies of writing.
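As a rough illustration of what word-usage analysis could look like in practice, here is a small Python sketch that compares two texts by how often they use a handful of common function words. None of this comes from Kirschenbaum’s piece: the word list, the function names `profile` and `cosine_similarity`, and the choice of cosine similarity as the comparison are all my own assumptions, and a real authorship study would need far more text and far more care.

```python
import math
import re
from collections import Counter

# Assumed, illustrative list of common English function words.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no function words found in at least one text
    return dot / (norm_a * norm_b)
```

A score close to 1 would only mean that two texts use these particular words in similar proportions; it would be one piece of evidence to weigh alongside chain of ownership, not proof of authorship.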
As for concerns about terminology for collections of online scholarship or documents discussed by Kenneth M. Price in “Edition, Project, Database, Archive, Thematic Research Collection: What is in a Name?” and Kate Theimer in “The Problem with the Scholar as ‘Archivist,’ or is there a Problem,” and “Archives in Context and as Context,” I think it is important both to acknowledge the correct usage of terms and to acknowledge that the nature of language is that words evolve and change meaning over time. However, as a librarian, I also fully understand Theimer’s concern about the implicit disregard for her profession when the term “archive” is used very loosely. Librarians and archivists both have a lot of trouble communicating their worth and professional status to the outside world, even to scholars. It would be ideal if the scholarly community banded together with librarians and archivists to express the worth of our collective field in the face of ever-increasing budget cuts and the disparagement of cultural institutions and academia in society.