HathiTrust digital library text mining capabilities

The HathiTrust Research Center (HTRC) has built tools that let you run algorithms on the public domain works in the HathiTrust digital library. Once you select a workset of public domain works, you can run several algorithms on it that, among other things, help with the following tasks:

  1. Extract named entities from the workset (i.e., location, person, and organization references)
  2. Establish the timeline of events in the selected workset
  3. Extract topics in an automated way from the workset
  4. Classify the volumes in the workset according to categories pre-established by the user

This will not be a hands-on session; I only plan to demonstrate these capabilities. If anybody would like to explore them in more depth, I’ll be happy to meet with you to talk about the possibilities the HTRC tools offer or to brainstorm approaches for presenting them to students and faculty.

Wednesday, August 2, 2017
12:00pm - 1:00pm
Time Zone: Central Time - US & Canada
JTR 109 (Lincoln Park)
Lincoln Park Library
  Meeting/Non-Instructional Event  
Registration has closed.

Event Organizer

Ana Lucic