Rethinking Oversize Materials in Archival Collections


Written by Deena Gorland, ICFA intern (Fall 2014); Edited by ICFA staff

Due to previous experience working at the Smithsonian Institution and National Geographic, I was relatively well-prepared for the challenges inherent in processing substantial quantities of oversize materials in the Image Collections and Fieldwork Archives (ICFA) of Dumbarton Oaks. Certainly, I was cognizant of how large-format materials present a unique challenge to archives, since their physical size creates different organizational and storage needs than normal-sized documents (e.g., personal papers and correspondence). In addition, the oversize items in ICFA had been intellectually separated from their parent collections; therefore, the context of and relationships between the items had been lost and needed to be restored.

Starting in 2011, ICFA staff conducted a re-assessment of its oversize architectural drawings, tracings, and rubbings, primarily to evaluate their current storage environments and state of preservation, as well as to determine their history and relationship…


Preservation Practicum and Prototyping Databases: A Review

As part of the stipulations of our grant funding, we are required to complete a practicum project at a cultural heritage information institution here in the Washington, DC area. I opted to complete my project over the summer at the Library of Congress Preservation Research and Testing Division (which is primarily why you have seen next to no blog activity from me during the whole summer). I’ve just completed my project and will summarize some of the larger points of my project.

The project was a portion of the CLASS-D development initiative that is ongoing at the Library of Congress PRTD. CLASS-D, which is an acronym for the Center for the Library’s Analytical Science Samples — Digital, is tasked with developing a functioning database prototype to provide access to sample and analysis metadata for the materials contained within the PRTD’s CLASS collection.

A little background: The CLASS collection is a small body of diverse materials that have been set aside for preservation laboratory research via both noninvasive and invasive (i.e. destructive) techniques. These materials include books (the Barrow books, acquired from the W.J. Barrow Research Laboratory), standard paper samples, TAPPI fiber samples, magnetic tape, and more. These samples undergo a wide variety of laboratory analysis such as microscopic imaging, environmental scanning electron microscope imaging, pH analysis, spectrometry, accelerated aging, &c. Through this ongoing analysis, these samples have generated a great deal of important information on the physical characteristics and aging profiles of a variety of materials of different ages, periods, and production techniques, providing important information on the preservation of cultural heritage materials.

The problem: Now that PRTD has all of this information, how do we disseminate it?

Initial work on the project was completed by Doug Emery. This work ultimately produced a final report filed with the LOC that makes recommendations for data modeling and DB architecture. Based on the work completed for this report, I was tasked with completing the initial prototyping of the actual database in order to prove 1) the appropriateness of the initial data modeling work and 2) the feasibility of the database itself.

Through quite a bit of on-the-job review of DB design methodology, data wrangling, and ham-handed SQL coding, I was able to produce a functional database architecture model that accommodated the variety of sample metadata for all sample types to be included in the database. While I could go into the nitty-gritty details of the database architecture (which I totally could, but won’t), I think a more valuable point would be some of the things that I took away from my experience at the PRTD:

Be Your Own Manager & Advocate

For those of you unfamiliar with what summer at the LOC is like, let me set the scene: imagine scores of junior fellows and interns meandering the halls, using laboratory space, needing supervisory assistance, all working on different projects across all departments at the LOC. Sounds crazy, right? Within the Preservation department, there were at least a dozen scholars on-site throughout the summer working on a variety of in-depth research projects. While this sort of jam-packed work environment is great for innovation and learning new things, it’s not that great for being able to meet one-on-one with a supervisor. Dr. Fenella France, my most gracious host at the LOC, was pulled a million different ways throughout the summer due to her own professional responsibilities and the overabundance of junior researchers. For me, this was a bit of a wake-up call as my professional background has been in environments in which supervisors exerted close control over the work being done by their underlings. For the first time, I found myself doing a large amount of self-guided work without regular in-depth check-ins from the higher-ups. This meant that I not only had to consciously guide my own schedule and progress, but also had to push to have my work reviewed in order to ensure that my work and our overall project goals were in alignment. While this took a bit of schedule wrangling on my part, it did lead me to realize that you have to campaign for yourself and your project so that others will provide you with the proper attention and consideration that is required.

Don’t Expect Non-LIS Professionals to Care as Much as You Do About LIS Topics

The PRTD is largely a laboratory research institution. And while they have made huge strides towards serving their information needs and those of external researchers (i.e. CLASS-D), others within the institution simply are not as conscious of LIS issues as I am. This means that when specific researchers are asked for sample metadata, they aren’t necessarily going to provide it in neat, ordered, standard-compliant formats. This, at first, was extremely frustrating because it meant a great deal of data massaging and manual ingest in order to introduce the pilot data into the prototype. Is there a way to raise awareness on these issues? Yes. Can you use those tactics in every scenario? No. Do you need to understand that part of interdisciplinary system development is going to be dealing with others’ disciplinary focuses? Absolutely. Rather than being standoffish on these topics, think of yourself as a conduit through which data can be organized and provided with usable value. Be willing to communicate openly with others in order to help create the greatest level of user service.
And be prepared for a lot of blank looks over lunch as you try to explain data modeling to chemists…

Check Your Data Model. Check it Again. Putting in Pilot Data? Check it Again.

So you’ve created your database architecture and you’re ready to ingest your pilot data? You’ve rigorously designed your data model and all the appropriate queries to automate record creation. So, you get cracking on ingesting ~1,900 sample records with nearly 93,000 associated records for physical characteristics. What’s this? You accidentally combined one of the characteristic fields, thereby invalidating all of the physical characteristics you just ingested for over 900 books? Bad words and exasperated looks ensue…

It may go without saying, but always, always, always double, triple, and quadruple check your model before you begin ingesting actual data. You never know what simple mistake you’ve made that will ultimately require several hours of undoing down the road. Luckily, this mistake only ate about 8 hours of my time. But I would rather not have to waste time because of a simple oversight.
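
To make the "check it again" advice concrete, here is a minimal sketch in Python. It assumes a hypothetical two-table layout (`sample` and `characteristic`) in SQLite and a made-up list of allowed characteristic names; none of this is the actual CLASS-D schema. The idea is to load a small pilot batch inside a transaction and roll the whole thing back if any characteristic name falls outside the agreed list, which is the kind of check that might catch an accidentally combined field before it reaches 900 books.

```python
import sqlite3

# Hypothetical controlled list of characteristic names (not the real CLASS-D fields).
ALLOWED_CHARACTERISTICS = {"ph", "thickness_mm", "fold_endurance"}


def create_schema(conn):
    """Create a toy sample/characteristic schema for the pilot check."""
    conn.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE characteristic ("
        " sample_id INTEGER NOT NULL REFERENCES sample(id),"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL)"
    )


def ingest_pilot(conn, samples, characteristics):
    """Insert a pilot batch; return a list of unexpected names (empty if all is well).

    If any unexpected characteristic name turns up, the whole batch is rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO sample (id, label) VALUES (?, ?)", samples
            )
            conn.executemany(
                "INSERT INTO characteristic (sample_id, name, value) VALUES (?, ?, ?)",
                characteristics,
            )
            found = {row[0] for row in conn.execute(
                "SELECT DISTINCT name FROM characteristic"
            )}
            unexpected = sorted(found - ALLOWED_CHARACTERISTICS)
            if unexpected:
                raise ValueError(unexpected)
    except ValueError as err:
        return err.args[0]
    return []
```

Running this on a handful of records first, and only then on the full set, keeps a modeling mistake from costing more than a few minutes.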

You’ll Never Expect to Find Interdisciplinary Collaboration… Until You Do.

To be perfectly honest, I did not want to work at the LOC. I do not have a science background and was much more interested in several projects more associated with codicology, rare book cataloging, and the like. However, having accepted the LOC project out of necessity, I later discovered that here — in the field of laboratory science — is fertile ground for collaboration with the LIS discipline. The innovative approaches that our field is bringing to the development of information systems and to the practice of data sharing have found a great deal of support and buy-in from the humanities and social science disciplines. And while there is not a lack of interest from the life sciences, there exists a gap between our professional discourse and our applied exercises. Once I arrived at the PRTD, I found several people who were extremely interested in taking advantage of new approaches to data management and sharing. However, given the disciplinary focus of their institute and their own professional responsibilities, they had not yet been able to seek out partnerships or support from LIS professionals. After talking with several PRTD representatives about the possible implementations of the CLASS-D initiative, I found that they were extremely interested in the benefits that could come from having an open access database and the possibilities of implementing RDF-compliant data modeling to promote innovative reuse of data. Despite coming into the workspace with my own professional preconceptions, I found the PRTD to be an excellent institution, filled with possibilities for creative collaboration between the laboratory science and LIS disciplines. While we may harbor hopes and dreams for what type of institution we may end up working for, be sure to remain open to unforeseen opportunities that may offer you a chance to dramatically impact the mission of a collection, an institution, or a profession.

There will be more to come from the CLASS-D project. Up next, Nick Schwartz will begin working on modeling for the analysis metadata architecture that will attach to the existing architecture. Keep an eye out for more updates!

If you are interested in learning more about the CLASS-D project, feel free to contact me at

Library of Congress’s Cultural Heritage Archives Symposium

This past week, the Library of Congress’s American Folklife Center hosted a two-day conference titled Cultural Heritage Archives: Networks, Innovation & Collaboration. With attendees from all across the country and the world, this symposium was a fertile arena for insight and discourse on many pertinent issues related to the design, management, and administration of cultural heritage archives. A wide variety of speakers from all sectors of the archival community presented inspiring papers on numerous topics. Really, there is too much to say about this symposium; so much so that justice cannot really be done in a blog post format. This is a polite way of saying you really should have been there.

To provide a brief summary of the symposium, here are the major bullet points:

  • Danna Bell-Russel, Educational Resource Specialist at LOC and SAA President, presented the first keynote address on the first day, focusing on ways that archivists can bridge connections between institutions and disciplines.
  • The first session saw many papers on use and users of cultural heritage archives from U. Oregon, Oxford, U. Colorado Boulder, Universite Paris Diderot, U. Alberta, and U. North Texas.
    • FULL DISCLOSURE: I wasn’t able to make the first part of the day’s festivities, so I’m basing this off of the symposium program.
  • The second session of the day raised some great questions about the approach to archival description, such as:
    • How should the EAC-CPF standard be applied to link archival metadata?
    • How should social media be used to expose archival collections, especially regarding collections that have significant cultural importance?
    • What’s the best way to catalog music archives with regards to quick access and use in educational settings?
  • A poster session featured many great studies on archival programming, but the most interesting was a poster on Traditional Knowledge Licensing and Labeling presented by Jane Anderson. Learn more about TK Licenses here.
  • The second day kicked off with an absolutely fabulous keynote address by Sita Reddy about the decolonization efforts of indigenous peoples regarding their cultural wisdom as captured in the Hortus Malabaricus. The abstract of her presentation can be seen here.
  • On a silly note, apparently there’s no LCSH heading for bourbon.

To be sure, I could go on and on and on about the wonderful talks presented at the symposium, but that could take ages.

HOWEVER, I will say that what was most stirring about this symposium was the continued recognition of the importance of collaboration in the cultural heritage community. Be it with ethnic or cultural groups, be it with archival users or audiences, artists, collectors, IT staff, archive administrators, or what have you, the repeated anthem of the two-day symposium was that archivists must constantly seek out new and innovative ways to collaborate with internal and external parties to ensure that the collections survive in perpetuity and gain new life through continued access and use.

One point was brought up that I thought also bears mentioning. During a Q&A session, Timothy Powell of the American Philosophical Society declared that he felt that a group was missing from the day’s proceedings and that the group was digital humanists. This struck me as odd, as the statement appeared to come from an “us-them” perspective that set up digital humanists as external to archivists. I feel, however, that the digital humanities is a broad, reflexively-inclusive term that has within its scope all who work with the humanities, be they scholars, archivists, librarians, &c. Considering the increased prevalence of digital formats of preservation and access in the archival community, one is hard-pressed to find a humanist who doesn’t, in one form or another, operate in the digital world. In this regard, I disagree that the digital humanities were unrepresented; rather, many — if not all — of the attendees at the symposium are a part of the digital humanities, they just might not know it yet.

To wrap up, the symposium was a great forum of ideas on a wide array of topics in the cultural heritage archival field. Great perspectives were shared and I hope to see many excellent collaborations emerge from the proceedings.


Be sure to keep your ear to the ground as the LOC will likely make videos of the symposium sessions available online through the webcasts site.

For even MORE cultural heritage archives fun, be sure to check out the symposium’s twitter feed at #chas13.

Digital Directions 2013 Takeaways

Written by Joseph Koivisto


Writing for the POWRR Blog, Aaisha Haykal — University Archivist for Chicago State University — put together a great post about the Fundamentals of Creating and Managing Digital Collections Conference at the University of Michigan, Ann Arbor. In her post, she discusses the variety of sessions and workshops that occurred over the three-day event and shares some pictures from a tour of the Digital Conversion Unit or of the Technology Lab.

The most interesting part of the post is a list of conference takeaways, points that can be applied to any digital conservation environments. They are as follows:

  1. Know your institution, in terms of risk management (is some loss acceptable to you? who will be doing the metadata, how specific will it be?), budget, staffing (whose responsibility is what), formats, mission, etc.
  2. It does not take much to get started with digital preservation; every little bit helps
  3. You really cannot do it alone (get assistance at every stage of the process)
  4. Modify standards, guidelines, and best practices to your institution; sometimes just good enough works
  5. Make your metadata interoperable and specific (ex. downstate and Illinois versus just downstate), so that when you merge records it is clear
  6. Approach stakeholders with a tailored message; this can be done through workshops and one-on-one sessions. When involving IT, do not let them take over the project; this is your territory.
  7. Assessment of digital collections has to be done, either qualitative or quantitative.
  8. Document what you have done to the collections so that 1) those in the future can know and 2) that data was not lost in the transitions (bit count)
  9. Within the conversation of digital preservation we need to make clear the difference between preservation and access copies
  10. Learned more about the environment that digitization should be taking place in, in terms of lighting, monitors, and equipment.
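
Point 8's "bit count" check can be sketched in a few lines. This is only an illustration in Python, assuming that a byte count plus a SHA-256 checksum is an acceptable fixity record for your institution; the function names are hypothetical and do not come from the conference or the POWRR post.

```python
import hashlib


def fixity_record(path, chunk_size=1 << 20):
    """Return (byte_count, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    byte_count = 0
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large master files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            byte_count += len(chunk)
    return byte_count, digest.hexdigest()


def unchanged_after_transfer(source, copy):
    """True when the copy has the same size and checksum as the source."""
    return fixity_record(source) == fixity_record(copy)
```

Recording the output of `fixity_record` alongside the file at each migration gives future staff something to compare against, which covers both halves of point 8.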

Of this list, I found two points to be most insightful. First, standards, guidelines, and practices need to be custom tailored to the institution. Each institutional repository has unique needs and serves a particular collection and audience. Therefore, information professionals must design project standards and practices around the needs of the collection and the identified end users.

Second, information professionals must be wary of ceding control to IT staff that have been brought in to work on digital collections. Considering the increasingly tech-centric nature of conservation initiatives, information professionals need to make sure that governance does not change hands during the project time frame. How we do this is — again — something that will vary from institution to institution and project to project. However, acknowledging the issue prepares us to better address problems as they arise.

The original POWRR blog post can be found here.

2014 National Agenda for Digital Stewardship

Written by Joseph Koivisto


The National Digital Stewardship Alliance — along with support from DuraSpace — has released their 2014 National Agenda for Digital Stewardship. In this report, the NDSA aimed to…

highlight emerging technology trends, identify gaps in digital stewardship capacity, and provide funders and decision-makers with insight into the work needed to ensure that today’s valuable digital content remains accessible, useful, and comprehensible in the future, supporting a thriving economy, a robust democracy, and a rich cultural heritage.

Concurrent with the ongoing trends in digital information facilitation, curation, and preservation, the agenda details important elements in digital content areas, technical infrastructure development, and a variety of research priorities. By clearly articulating its priorities, the NDSA has created an outline for future development that will guide their work and provide external organizations with insight into the new horizons of digital stewardship research and implementation.

While much of the report covers ground that has been discussed before (the diversity of digital content areas, development topics for technical infrastructure), the discussion of research priorities presents a list of fascinating regions for new study. These include…

  • Applied Research for Cost Modeling and Audit Modeling
  • Information Equivalence and Significance
  • Policy Research on Trust Frameworks
  • Preservation at Scale
  • Strengthening the Evidence Base for Digital Preservation

These topics present fertile ground for future scholarly work and may be worth investigating for personal research (and maybe an article or conference presentation…).

Regardless of whether or not you intend to plunder NDSA’s agenda for personal research topics, it is an important bellwether for digital data professionals and all forms of digital scholarship.

Photo Albums, Digital Preservation, and Familial Sensemaking

Written by Joseph Koivisto


Who hasn’t had the experience of sitting down to thumb through old photo albums with family, laughing at baby pictures, wondering at your parents’ microscopic first apartment, horrified at the realization that you look exactly like your father? This fairly universal occurrence is so familiar to many of us that we often don’t think about it. It is a socio-familial cliche, a roughly scripted interaction that is fodder for romantic comedies. And yet, the family photo album is a microcosm of larger cultural heritage institutions, recapitulating the intrinsic concepts and concerns of archives, museums, galleries, and libraries.

I recently found this article featured on NPR that discusses the transition from traditional physical photo albums to digital photo albums scattered across numerous applications (Facebook, Flickr, Tumblr, &c.).  It raises several good points and — along with some helpful insights from Bill LeFurgy — makes a good case for practicing proactive digital preservation.

Old family photo collections tend to have a somewhat haphazard storage and preservation profile. While many of us have nicely bound binders full of acid-free photo sleeves, there is the counterpart: the shoebox stuffed with unorganized photos that’s crammed into a closet somewhere. Be your photos in an album or a shoebox, they are relatively safe. Except for disasters such as fires, floods, or accidental Spring Cleaning collateral damage, physical photos will persist and don’t require a great deal of attention. Considering that the album/shoebox model is the tradition from which we are coming, it is no wonder that the average individual’s attitude towards active preservation falls somewhere between disinterested and horrifically negligent.

Just like the issues posed to collecting institutions, the private citizen has begun to switch to a new model of photo collection: the digital album. And just like larger institutions, issues of digital preservation are upon us.  Just like we learned from our Digital Curation coursework, digital objects are not preserved by accident and we must therefore change our attitudes towards the storage and preservation of our photographic records. There is a huge array of available technologies to help us preserve our digital photos. Articles and essays abound on how to best curate and save your photos. Clearly, society is beginning the slow process of accepting a new proactive attitude towards saving photos.

But the issue of digital photo albums is not limited to approaches for storage and preservation. The implications of changing from an analog to a digital medium also impact the ways in which we conceive of our personal and familial identities insomuch as they are reflected in the collections that we keep. The family photo album is a special type of object that serves as a vehicle of sensemaking. Think of Kwame Anthony Appiah’s article “Whose Culture Is It?” that presents archives and museums as institutions that help to form cultural identities, their curatorial practices and object collections connecting us to ourselves as a people. In a similar fashion, the pictures we keep are imbued with a personal/cultural significance and indicate not only what we’ve done, but what we have chosen to remember.

The family — or the individual — is like a museum. We make curatorial decisions about what to ingest into our collections, deciding which record is worth keeping based on our idiosyncratic and subjective criteria. With the emergence of digital photo albums as the likely new standard for collecting and preserving our photographic records, we cannot ignore the influence that technology will play on our own curatorial behaviors. Do we save all of the pictures we take on vacation, a number that seems to quickly inch into the hundreds if not thousands? Do we save the best pictures and, if we do, do we run the risk of presenting an artificially rose-colored version of ourselves? Do we save the pictures of us that look ugly? Do we share our photos online? With whom do we share them?

Technology has very visibly changed the way we photograph ourselves. From taking a snapshot, to saving a picture, to preserving an album, the ways in which we interact with photos (and with ourselves through photos) have been irrevocably altered. By remaining conscious of these and future changes, we can actively engage with emerging trends and, hopefully, not wind up in a shoebox.