JCDL 2006 Conference Notes

Day 3 – First Session – Time and Space

Talk 1 – Supporting Literary Scholars with Data Mining and Visual Interfaces:

visual interfaces: accessible, provocative

text mining just beginning in the humanities

nora project: http://www.noraproject.org

systems today provide access, but not necessarily text analysis

text analysis – new area; classification problems; scholars typically need assistance;

other work being done to visualize metadata;

users: small group of computer programmers; broad base of scholars uninterested in computational tools themselves, but doing the work

users' needs: classifying documents; reading; finding indicators – what makes a document fall into one class or another

case study: Emily Dickinson's letters; 300 XML-encoded documents


manual classification -> automatic classification -> correlations with document metadata

manually rate documents through the system; this serves as the training set for the data mining classifier

start analysis -> data mining algorithm determines the likelihood, and the ratio, of a document being in one class or another

manual classification takes a bit of time;

found that the word indicators were not as helpful as the computational probability

after classification, want to understand the relationships between the documents you've classified; look for correlations

uses naive bayes algorithm;
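
The workflow above (hand-rated documents as a training set, then a naive Bayes classifier reporting a probability and a ratio for each class) can be sketched roughly as follows. This is only an illustration, not the nora project's code: the function names, the whitespace tokenizer, the add-one smoothing, and the four toy training texts with their "personal"/"business" labels are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokens; a real system would do more.
    return text.lower().split()


def train(labeled_docs):
    """Build counts from (text, label) pairs that a scholar rated by hand."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def word_prob(model, word, label):
    # Add-one (Laplace) smoothed estimate of P(word | label).
    _, word_counts, vocab = model
    counts = word_counts[label]
    return (counts[word] + 1) / (sum(counts.values()) + len(vocab))


def log_scores(model, text):
    """Unnormalised log P(label) + sum of log P(word | label) per class."""
    class_counts, _, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log(word_prob(model, word, label))
        scores[label] = score
    return scores


def probabilities(model, text):
    """Normalise the log scores into class probabilities."""
    scores = log_scores(model, text)
    top = max(scores.values())
    exp = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exp.values())
    return {label: value / total for label, value in exp.items()}


def log_ratio(model, text, label, other):
    """Log ratio of the two class scores; positive favours `label`."""
    scores = log_scores(model, text)
    return scores[label] - scores[other]


def indicator_words(model, label, other, n=5):
    """Words whose smoothed probability most favours `label` over `other`."""
    _, _, vocab = model
    def ratio(word):
        return word_prob(model, word, label) / word_prob(model, word, other)
    return sorted(vocab, key=ratio, reverse=True)[:n]


# Hypothetical hand-rated training set (invented for illustration only).
training = [
    ("dear heart love", "personal"),
    ("love sweet dear", "personal"),
    ("invoice payment due", "business"),
    ("payment receipt due", "business"),
]
model = train(training)
print(probabilities(model, "dear love"))
print(indicator_words(model, "personal", "business", n=2))
```

The per-word ratios from indicator_words correspond to the "word indicators" in the notes, while probabilities gives the overall class probability that the speakers found more helpful.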

Talk 2 – Time Period Directories

search in humanities – chronology, geo, bio, subject

trying to develop search capabilities to search 4 facets

want to try to use metadata as infrastructure; search across genres

what metadata to use for temporal aspect? chronology?

date/time standards, hard to put on a timeline

named time period problems: unstable; multiple names; ambiguous; how to disambiguate between periods and dates; all problems occur with places as well

place name gazetteer; use structure – associate with a date and associate where it happened and the time of event -> this becomes the time period directory

this was then put into an xml schema
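
A rough sketch of what one entry in such a directory might hold, modelled on a gazetteer record: a name, variant names, a date range, and a place. The notes do not reproduce the actual XML schema, so the class name, the field names, and the find_periods helper are all hypothetical, and the two sample entries are illustrative rather than taken from the prototype.

```python
from dataclasses import dataclass, field


@dataclass
class PeriodEntry:
    # Hypothetical fields; the real schema is not given in these notes.
    name: str
    start_year: int
    end_year: int
    place: str
    variant_names: list = field(default_factory=list)


def find_periods(directory, name, place=None):
    """Return entries whose name or a variant contains `name`.

    Passing `place` narrows an ambiguous period name to one location.
    """
    wanted = name.lower()
    hits = []
    for entry in directory:
        names = [entry.name] + entry.variant_names
        if any(wanted in candidate.lower() for candidate in names):
            if place is None or entry.place == place:
                hits.append(entry)
    return hits


def periods_covering(directory, year):
    """Return entries whose date range includes `year`."""
    return [e for e in directory if e.start_year <= year <= e.end_year]


# Illustrative entries only: one ambiguous name, two places and date ranges.
directory = [
    PeriodEntry("American Civil War", 1861, 1865, "United States",
                variant_names=["Civil War", "War Between the States"]),
    PeriodEntry("English Civil War", 1642, 1651, "England",
                variant_names=["Civil War"]),
]

for entry in find_periods(directory, "civil war"):
    print(entry.name, entry.start_year, entry.end_year, entry.place)
```

Tying each named period to a place as well as a date range is what lets a query such as find_periods(directory, "civil war", place="England") resolve the ambiguity noted above.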

prototype developed from LCSH (Library of Congress Subject Headings) authority records


map interface: takes location data and puts it on a map

timeline browse

country browse – list




