A snapshot of the Mapping Texts project's visualization search tool.
Users of The Portal to Texas History at UNT are able to browse pages of historical Texas newspapers online, thanks to the website Chronicling America: Historic American Newspapers, which is produced by the National Digital Newspaper Program.
Two UNT faculty members have discovered a way for users to access interactive visualizations that offer better ways to explore the content of these historical newspapers than typing words into a search engine.
Andrew Torget, assistant professor of history; Rada Mihalcea, associate professor of computer science and engineering; and their partners at Stanford University's Bill Lane Center for the American West received a $50,000 Digital Humanities Start-Up Grant from the National Endowment for the Humanities (NEH) for the project, called Mapping Texts.
The project, Torget says, "came about, in part, because UNT has established itself as one of the lead universities for the Chronicling America program."
UNT Libraries became a partner in the program in 2007 when it received the first of several NEH grants to digitize the newspapers and put them online. The hundreds of thousands of pages date from the 1820s through the 2000s, offering glimpses of daily life in Texas communities from those decades. Torget and the research team used a sample of 250,000 pages of the newspapers.
"When you can explore hundreds of millions of words, a basic text search simply isn't enough. For instance, if I search the 250,000 newspaper pages for 'cotton' because I'm trying to find historical information as background for a book, I get more than 71,000 results," Torget says. "Mapping Texts is about solving a big data problem. We need to find new ways for people to make sense of the overwhelming abundance of information being made available in the digital age."
Mapping Texts includes two interactive visualizations built by the UNT-Stanford team:
- Mapping Newspaper Quality maps a quantitative survey of the newspapers, plotting both the quantity and quality of information available
- Mapping Language Patterns maps a qualitative survey of the newspapers, plotting major language patterns embedded in the collection
Mark Phillips, assistant dean for digital libraries, calls Mapping Texts a tool that puts new lenses on the content for researchers.
"It's a great example of how UNT researchers are using the content in innovative and creative ways, and shows that libraries need to think about how faculty members can more easily access a large amount of content," he says.
Brett Bobley, NEH chief information officer and director of the NEH's Office of Digital Humanities, says that because all National Digital Newspaper Program pages are created using the same standards, work like the Mapping Texts project could, in theory, scale beyond the Texas newspapers to other states or even nationally.
"As we scan millions of pages of newspapers and other humanities materials, new methods for searching and analyzing the materials will become critical to scholarship," he says.
Nancy Kolsti with UNT News Service can be reached at Nancy.Kolsti@unt.edu.