Archiving, Accessing and Analyzing Digitized Newspapers

This is the second in a series of posts about the teams who will be attending the Institute in November, and their projects. This was submitted by Pam Lach and Stewart Varner.

What are the goals of your project, and how do they fit the Scholarship and the Crowd theme of SCI 2014?

Image of newspaper from 1921

Roanoke News (Weldon, N.C.), 1921

The North Carolina Collection at UNC’s Wilson Library has partnered with the Digital Innovation Lab and to digitize all of the pre-1923 historic North Carolina newspapers in the collection. As a result of that partnership, UNC will have millions of digital page images, text files (from OCR) and page-level metadata. Once it is complete, the total output will be 80 terabytes of data. Our immediate goal for SCI will be to identify potential research uses for this collection that draw on big data methods currently being explored by digital humanists. We will also determine how to store and provide access to the collection in a way that facilitates the use of these methods. The sheer size of the collection will necessitate extensive collaboration across multiple university units. Therefore, designing this collaboration to take advantage of everyone’s expertise while being mindful of everyone’s limited capacity will also be an important goal for the team. Finally, recognizing that many research libraries are facing similar issues (or soon will be), the team hopes to generate and articulate models, guides and recommendations for others to consult.

Who is on your team, and what are you hoping they will contribute to the project?

Mike Barker
University of North Carolina
Assistant Vice Chancellor for Research Computing and Learning Technologies

Mike leads the Research Computing unit of UNC’s Information Technology Services. Its mission is to provide a world-class computing infrastructure, along with other technology tools and capabilities, to support the research needs of UNC faculty and staff. Mike’s bird’s-eye view of campus IT will give the team the perspective it needs to design a workable plan.

Brent Carter
Director of Business Development

Newspapers is responsible for digitizing the North Carolina Collection’s pre-1923 newspaper holdings, as well as providing access to the UNC community during a three-year embargo period. Brent is UNC’s contact at Newspapers and is interested in learning how to enhance their delivery platform to support data-driven research at scale.

Nick Graham
University of North Carolina
NC Digital Heritage Center

Nick has worked in special collections libraries and archives for the past 15 years, focusing on public services and digital collections. He is the founding Program Coordinator for the North Carolina Digital Heritage Center, a statewide digital library program that works with more than 140 institutions around the state and serves as North Carolina’s service hub for the Digital Public Library of America. The Center is leading newspaper digitization efforts in North Carolina. Nick is intimately familiar with the collection and has been UNC’s representative in the partnership.

Pam Lach
University of North Carolina
Digital Innovation Lab Associate Director

Pam Lach is the Associate Director of the Digital Innovation Lab at the University of North Carolina at Chapel Hill. She works with faculty, staff, and students to develop digital projects, and helps faculty integrate digital technologies into the classroom. She is also project manager for DH Press, a digital humanities visualization toolkit built as a WordPress plugin.

Stewart Varner
University of North Carolina
Digital Scholarship Librarian

Stewart Varner is the Digital Scholarship Librarian at the University of North Carolina at Chapel Hill. He works with scholars across campus to incorporate technology into their research. He has years of experience designing library digital resources for advanced research in the humanities.

Stephanie Williams
University of North Carolina
Digital Project Programmer for the North Carolina Digital Heritage Center, UNC

What do you look forward to most from SCI, and what do you hope to accomplish through the Institute?

The team is very excited about the opportunity for all of us to concentrate on this project together for a sustained period. We see this as a chance to develop closer relationships between team members and to expand our networks as we get to know the other teams at the Institute. We also look forward to sharing ideas with the other teams and learning from the wealth of experience they will bring to SCI.

Do you have plans for next steps after the Institute, or will you wait to see what emerges from the days together in November?

Our primary goal is to complete the Institute with a concrete plan for our next steps. We will also be launching a faculty-student working group next year to develop project-based uses of the collection.

Is there anything else you’d like to say about your project or participation in the Institute?

In advance of the Institute, we will be collecting ideas for how researchers may want to use the collection. We have some events planned for the UNC community but we would welcome input from anyone. Please let us know if you have project ideas that could use the collections or examples of existing collections that may be useful for us to consider.

Page image of newspaper from 1916

The Roanoke Beacon (Plymouth, N.C.), 1916
