Welcome to the HathiTrust Research Center Sandbox!

About Us

The HathiTrust Research Center (HTRC) provides research access to the public domain text of the HathiTrust Digital Library. The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library, to help researchers meet the technical challenges of working with massive amounts of digital text. It develops software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge.

The HTRC provides an infrastructure to search, collect, analyze, and visualize the full text of nearly 3 million public domain works and is intended for nonprofit and educational researchers.

What is the Sandbox?

The HTRC Sandbox is distinct from the main production portal of the HTRC. It is meant to be an arena where users can try out experiments and do exploratory work. The dataset available in the Sandbox is a much smaller subset of the one available through the HTRC’s main production portal: it consists of the non-Google-digitized public domain volumes (approximately 250,000 volumes) from the HathiTrust corpus.

The HTRC Data API is available for experimentation, and several additional feature datasets and tools, such as HTRC-Bookworm, are being connected to this data for exploratory analysis. HTRC users can write their own client programs, in the programming languages of their choice, that access the data programmatically through the HTRC Data API.
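
As a rough illustration of what such a client program might look like, the Python sketch below only builds a request and does not send it. Everything specific in it is an assumption made for illustration, not documented HTRC behavior: the base URL is a placeholder, and the "volumes" path, the "volumeIDs" parameter, the pipe-separated ID list, and the bearer-token header are hypothetical. Consult the HTRC Data API documentation for the actual endpoint, parameters, and authentication scheme.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address, NOT a real HTRC endpoint; substitute the documented one.
BASE_URL = "https://sandbox.example.org/data-api"


def build_volume_request(volume_ids, token, base_url=BASE_URL):
    """Build (but do not send) a GET request for the given volume IDs.

    The path, query parameter name, ID separator, and auth header used
    here are hypothetical and chosen only to illustrate the client pattern.
    """
    if not volume_ids:
        raise ValueError("at least one volume ID is required")
    # urlencode percent-escapes the separator and any reserved characters.
    query = urlencode({"volumeIDs": "|".join(volume_ids)})
    return Request(
        f"{base_url}/volumes?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    req = build_volume_request(["vol.1", "vol.2"], token="TOKEN")
    print(req.get_method(), req.full_url)
```

Once the real endpoint and credentials are filled in, a request built this way could be sent with urllib.request.urlopen(req) and the response body read from the returned object.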

What can you do with the Sandbox?