Project Andvari Workshop: constructing a thesaurus

An excellent blog post by CUA’s very own Kevin Gunn about the Project Andvari controlled vocabulary workshop. Do yourself a favor and check it out!

Digital Scholarship and beyond

The Project Andvari team met November 7th and 8th, 2014. The task of the workshop was to create a basic thesaurus, fulfill a ‘proof of concept’ requirement (i.e. create a pilot project), and discuss future steps in the evolution of the project.

When completed, Project Andvari will be an online database for scholars and the public to search for pre-Christian images from the Medieval Norse, Nordic, Anglo-Saxon, and northern European traditions, covering roughly AD 400 to 1200. For more background information regarding the history and parameters of the project, grants received and submitted, individuals involved, and other delightful information, go to the blog.

 Friday, November 7th, 2014

Lilla Kopár, co-project director, convened the meeting and gave an itinerary for the next two days. Her first big question was: can we do this project at all? In the 2013 workshop we had discussed the digital tool we wanted to…

View original post 1,977 more words


SAA 2014: Archives, Activism, and a Whole Lot of Twitter

Oh, the most wonderful time of the year: conference season! When professionals the whole world over experience the joy of free continental breakfast, standing-room-only panel discussions, and odd luggage necessary to safely transport misshapen posters and displays through the TSA gauntlet. For me, the beginning and end of my conference whirlwind consisted of the Society of American Archivists 2014 meeting. Luckily, this year’s host city was Washington, DC. In lieu of a boarding pass, I grabbed my SmarTrip card and hopped down to Woodley Park for days filled with archival fun.

Full disclosure: I am not, per se, an archivist. I am a graduate student specializing in cultural heritage information management with a focus on rare books, manuscripts, and medieval material culture. However, if my coursework has taught me anything, it is that the ongoing convergence of library, archives, and museum professionals — coupled with the ever-increasing technological synergy between these disparate institutions — means that it is incumbent upon us as information professionals to be engaged on several scholarly fronts. With this in mind, I felt that my participation in the SAA conference would not only help me in my own interdisciplinary efforts, but would also add a unique voice to the archival conversation that would occur at the conference.

As writing about the chaos of conference life in some semblance of linear fashion is a herculean task, I will segment my comments by events, panels, discussions, or other relevant dimensions.

  • Before the official kick-off of SAA, a pre-conference workshop was held that explored the use of open-access applications for optical character recognition of non-standard texts. Led by Matthew Christy of the Early Modern OCR Project from Texas A&M, this workshop provided extremely helpful insight into the workflow for training Tesseract to identify and convert early modern print types into computer-usable text. On a personal note, this was a whirlwind of new information to the uninitiated OCRer (i.e. me). However, knowing what I do now, I think that this was an excellent professional development experience that will be useful on future projects.
  • FOIA and Access: The plenary session featured a lively discussion on the importance of FOIA to the realm of investigative journalism. A fantastic – and timely – conversation that highlighted the role of archivists as both holders of information and conduits of access.
  • Integrating Digital Objects and Finding Aids: As with all panels focusing on digital materials, this one was packed. Focusing on the Northwest Digital Archives, the panel presented great ideas on maintaining object-collection hierarchies; using publicly available resources as service hubs for private collections; and approaches to user testing.
  • SNAC: Representatives from the SNAC project led a great discussion on the development of linked EAC-CPF records to help unify entity identification across distributed record-holding institutions. Again, another jam-packed session due to the digital orientation of the topic. Still, a great opportunity to learn about ongoing initiatives.
  • HIV/AIDS Archives: In this panel, a fascinating conversation occurred in which the difficulties associated with archiving an ongoing social phenomenon were illuminated. In particular, the NYPL archivist of the AIDS/HIV Collection recounted conflicts between their collection and the ACT UP activist organization due to the public perception of the historiographic activities of archivists. The difficulty arises from convincing the public that archives are not only collections of things that ‘have occurred’, but are rather ongoing records of individual, organizational, and societal events, continually being reappraised, reassessed, and reinterpreted. The quotable takeaway is the ongoing conversation between the competing concepts of “AIDS History” and “AIDS is History.”
  • Poster Session:

    36″x40″ of glory!

    The poster session was an excellent opportunity to meet a variety of scholars and professionals and to give them an introduction to my work on Project Andvari. A lot of very fruitful conversations occurred. A couple even led to possible partnerships for future collaboration and data sharing (and a possible job opportunity, but let’s not get too hopeful). All in all, it was a great chance to practice my presenting skills and to get my face out there as a participating member of the larger scholarly community.

Conferences are always hectic (and exhausting). There is always far too much for one person to experience, but the net effect is one of great professional development and scholarly sharing. This year’s SAA conference was no exception. I walked away from this experience enlivened with a renewed energy for my professional field. While I couldn’t attend every session, I was extremely grateful to my fellow conference attendees for their dogged upkeep of the #saa14 thread, which allowed me to follow the numerous concurrent sessions I had to miss. As I near the end of my graduate coursework, I am excited to more fully enter into my chosen profession, knowing that the field is populated with such energetic and innovative professionals.

Preservation Practicum and Prototyping Databases: A Review

As part of the stipulations of our grant funding, we are required to complete a practicum project at a cultural heritage information institution here in the Washington, DC area. I opted to complete my project over the summer at the Library of Congress Preservation Research and Testing Division (which is primarily why you have seen next to no blog activity from me during the whole summer). I’ve just completed my project and will summarize some of the larger points of my project.

The project was a portion of the CLASS-D development initiative ongoing at the Library of Congress PRTD. CLASS-D, an acronym for the Center for the Library’s Analytical Science Samples — Digital, is tasked with developing a functioning database prototype to provide access to sample and analysis metadata for the materials contained within the PRTD’s CLASS collection.

A little background: The CLASS collection is a small body of diverse materials that have been set aside for preservation laboratory research via both noninvasive and invasive (i.e. destructive) techniques. These materials include books (the Barrow books, acquired from the W.J. Barrow Research Laboratory), standard paper samples, TAPPI fiber samples, magnetic tape, and more. These samples undergo a wide variety of laboratory analysis such as microscopic imaging, environmental scanning electron microscope imaging, pH analysis, spectrometry, accelerated aging, &c. Through this ongoing analysis, these samples have generated a great deal of important information on the physical characteristics and aging profiles of a variety of materials of different ages, periods, and production techniques, providing important information on the preservation of cultural heritage materials.

The problem: Now that PRTD has all of this information, how do we disseminate it?

Initial work on the project was completed by Doug Emery. This work ultimately produced a final report filed with the LOC that makes recommendations for data modeling and DB architecture. Based on the work completed for this report, I was tasked with completing the initial prototyping of the actual database in order to prove 1) the appropriateness of the initial data modeling work and 2) the feasibility of the database itself.
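To give a flavor of what that prototyping involved, here is a heavily simplified sketch of the kind of one-to-many model at the heart of the work: one table of samples, one table of their measured physical characteristics. All table names, field names, and values below are invented for illustration; the actual CLASS-D data model is far more involved.

```python
import sqlite3

# Minimal, hypothetical sketch: each sample row can own many
# physical-characteristic rows (one-to-many via a foreign key).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (
    sample_id   INTEGER PRIMARY KEY,
    sample_type TEXT NOT NULL,      -- e.g. 'Barrow book', 'TAPPI fiber'
    description TEXT
);
CREATE TABLE characteristic (
    char_id   INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    name      TEXT NOT NULL,        -- e.g. 'pH'
    value     TEXT,
    units     TEXT
);
""")
conn.execute("INSERT INTO sample VALUES (1, 'Barrow book', 'test volume')")
conn.execute(
    "INSERT INTO characteristic (sample_id, name, value, units) "
    "VALUES (1, 'pH', '4.2', 'pH')"
)
# Join samples to their characteristics, as a user query would.
rows = conn.execute("""
    SELECT s.sample_type, c.name, c.value
    FROM sample s JOIN characteristic c USING (sample_id)
""").fetchall()
print(rows)  # [('Barrow book', 'pH', '4.2')]
```

The point of the separate characteristic table is exactly the flexibility the project needed: new measurement types can be added without altering the schema, at the cost of more careful validation at ingest time.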

Through quite a bit of on-the-job review of DB design methodology, data wrangling, and ham-handed SQL coding, I was able to produce a functional database architecture model that accommodated the variety of sample metadata for all sample types to be included in the database. While I could go into the nitty-gritty details of the database architecture (I totally could, but won’t), I think a more valuable exercise would be to share some of the things that I took away from my experience at the PRTD:

Be Your Own Manager & Advocate

For those of you unfamiliar with what summer at the LOC is like, let me set the scene: imagine scores of junior fellows and interns meandering the halls, using laboratory space, needing supervisory assistance, all working on different projects across all departments at the LOC. Sounds crazy, right? Within the Preservation department, there were at least a dozen scholars on-site throughout the summer working on a variety of in-depth research projects. While this sort of jam-packed work environment is great for innovation and learning new things, it’s not that great for being able to meet one-on-one with a supervisor. Dr. Fenella France, my most gracious host at the LOC, was pulled a million different ways throughout the summer due to her own professional responsibilities and the overabundance of junior researchers. For me, this was a bit of a wake-up call, as my professional background has been in environments in which supervisors exerted close control over the work being done by their underlings. For the first time, I found myself doing a large amount of self-guided work without regular in-depth check-ins from the higher-ups. This meant that I not only had to consciously guide my own schedule and progress, but also had to push to have my work reviewed in order to ensure that my work and our overall project goals were in alignment. While this took a bit of schedule wrangling on my part, it did lead me to realize that you have to campaign for yourself and your project so that others will give you the attention and consideration your work requires.

Don’t Expect Non-LIS Professionals to Care as Much as You Do About LIS Topics

The PRTD is largely a laboratory research institution. And while they have made huge strides toward serving their information needs and those of external researchers (i.e. CLASS-D), others within the institution simply are not as conscious of LIS issues as I am. This means that when specific researchers are asked for sample metadata, they aren’t necessarily going to provide it in neat, ordered, standards-compliant formats. This, at first, was extremely frustrating because it meant a great deal of data massaging and manual ingest in order to introduce the pilot data into the prototype. Is there a way to raise awareness on these issues? Yes. Can you use those tactics in every scenario? No. Do you need to understand that part of interdisciplinary system development is going to be dealing with others’ disciplinary focuses? Absolutely. Rather than being standoffish on these topics, think of yourself as a conduit through which data can be organized and given usable value. Be willing to communicate openly with others in order to help create the greatest level of user service.
And be prepared for a lot of blank looks over lunch as you try to explain data modeling to chemists…
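As a concrete (and entirely invented) illustration of what that data massaging looks like in practice, consider a sketch in Python: headers arrive with stray whitespace, identifiers with inconsistent casing, numbers as strings, and everything has to be normalized before ingest. The field names here are hypothetical, not the actual CLASS-D fields.

```python
import csv
import io

# Simulated hand-off from a researcher: messy headers, stray spaces,
# inconsistent casing, blank cells. (Invented data for illustration.)
raw = io.StringIO(
    "Sample ID ,ph Value,Notes\n"
    " B-001,4.2 ,acidic\n"
    "b-002, 5.10,\n"
)

def normalize(row):
    """Map one messy CSV row onto clean, typed fields."""
    return {
        "sample_id": row["Sample ID "].strip().upper(),  # note the messy key
        "ph": float(row["ph Value"].strip()),
        "notes": row["Notes"].strip() or None,
    }

cleaned = [normalize(r) for r in csv.DictReader(raw)]
print(cleaned)
```

Multiply this by dozens of spreadsheets in dozens of personal formats and you have a fair picture of the manual ingest work involved.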

Check Your Data Model. Check it Again. Putting in Pilot Data? Check it Again.

So you’ve created your database architecture and you’re ready to ingest your pilot data? You’ve rigorously designed your data model and all the appropriate queries to automate record creation. So, you get cracking on ingesting ~1,900 sample records with nearly 93,000 associated records for physical characteristics. What’s this? You accidentally combined one of the characteristic fields, thereby invalidating all of the physical characteristics you just ingested for over 900 books? Bad words and exasperated looks ensue…

It may go without saying, but always, always, always double, triple, and quadruple check your model before you begin ingesting actual data. You never know what simple mistake you’ve made that will ultimately require several hours of undoing down the road. Luckily, this mistake only ate about 8 hours of my time. But, I would rather not have to waste time because of simple oversight.
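By way of illustration, here is a minimal sketch of the kind of pre-ingest sanity check that would have caught my merged-field mistake. The field names and rules are invented; the point is simply to validate every record against the expected model before committing anything.

```python
# Hypothetical set of characteristic fields the data model expects.
EXPECTED_FIELDS = {"pH", "fold_endurance", "tensile_strength"}

def validate_record(record):
    """Return a list of problems; an empty list means the record is safe."""
    problems = []
    unknown = set(record) - EXPECTED_FIELDS
    if unknown:
        problems.append(f"unexpected fields: {sorted(unknown)}")
    for field, value in record.items():
        if value is None or value == "":
            problems.append(f"empty value for {field}")
    return problems

good = {"pH": 4.2}
bad = {"pH_fold_endurance": 3.1}   # two fields accidentally combined
print(validate_record(good))       # []
print(validate_record(bad))        # flags the merged field
```

Running a check like this over the whole batch takes seconds; undoing a bad ingest of 93,000 records takes hours.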

You Never Expect to Find Interdisciplinary Collaboration… Until You Do.

To be perfectly honest, I did not want to work at the LOC. I do not have a science background and was much more interested in several projects more closely associated with codicology, rare book cataloging, and the like. However, having accepted the LOC project out of necessity, I later discovered that here — in the field of laboratory science — is fertile ground for collaboration with the LIS discipline. The innovative approaches that our field is bringing to the development of information systems and to the practice of data sharing have found a great deal of support and buy-in from the humanities and social science disciplines. And while there is no lack of interest from the life sciences, there exists a gap between our professional discourse and our applied exercises. Once I arrived at the PRTD, I found several people who were extremely interested in taking advantage of new approaches to data management and sharing. However, given the disciplinary focus of their institute and their own professional responsibilities, they had not yet been able to seek out partnerships or support from LIS professionals. After talking with several PRTD representatives about the possible implementations of the CLASS-D initiative, I found that they were extremely interested in the benefits that could come from having an open access database and the possibilities of implementing RDF-compliant data modeling to promote innovative reuse of data. Despite coming into the workspace with my own professional preconceptions, I found the PRTD to be an excellent institution, filled with possibilities for creative collaboration between the laboratory science and LIS disciplines. While we may harbor hopes and dreams for what type of institution we may end up working for, be sure to remain open to unforeseen opportunities that may offer you a chance to dramatically impact the mission of a collection, an institution, or a profession.

There will be more to come from the CLASS-D project. Up next, Nick Schwartz will begin working on modeling for the analysis metadata architecture that will attach to the existing architecture. Keep an eye out for more updates!

If you are interested in learning more about the CLASS-D project, feel free to contact me at

Cultural Heritage Information Management Forum Next June!

The Department of Library and Information Science has just announced that it will host the Cultural Heritage Information Management Forum in Washington, DC on June 5, 2015. This forum, which will serve as an arena for CHIM practicum project presentations, addresses the growing body of research and scholarship in the digital cultural heritage discipline.

The Program Planning Committee invites poster proposals on topics related to the forum theme. These include, but are not limited to:

  • Infrastructure for collection sharing, research, and access
  • Creation of digital collections
  • Access to digital cultural heritage collections
  • Outreach and engagement of users
  • Stewardship of cultural heritage collections
  • Partnerships and collaboration
  • Sustainability and funding models

Submissions will be accepted between February 2 and March 30, 2015, and are open to all researchers, practitioners, and students in the cultural heritage discipline.

Stay tuned for more information about this upcoming forum!



Bitcoin: how do we display the intangible?

A really fascinating concept from a curatorial perspective: neo/crypto-numismatics and how to curate objects that are, by their very definition, intangible. As we move ever forward towards an increasingly digital landscape, how do we develop curatorial methodology and foundations for something with no true instantiation?

British Museum blog

Benjamin Alsop, curator, British Museum

The Citi Money Gallery charts over four millennia’s worth of monetary history. The Department of Coins and Medals cares for over one million objects in the Museum’s collection and like any museum with a growing collection, the most pressing questions are what should we collect and where should we put it all? Yet a recent concern for me as the curator of the Citi Money Gallery is not which objects should I select from our vast collection for a new display, but whether we had any suitable objects at all. This may sound like the murmurings of an eccentric curator, but let me explain myself.

Bitcoin token, designed by Mike Caldwell (CM 2012,4040.4)

If the gallery is to be a record of the changing nature and form of money through the ages, then it is just as important to reflect the modern world as it…

View original post 600 more words