This session is motivated by a recent mock keynote debate, "A Matter of Scale," presented by Matt Jockers and Julia Flanders as part of the Boston Area Days of Digital Humanities Conference, and by the imperative that librarians involved with many things "digital" learn not only how to build tools, in this case for textual analysis, but also how to leverage existing tools to support teaching and research endeavors rooted in the text. Coming from the tool-building perspective and tradition, I seldom have time to explore existing tools for textual analysis. This is partly because at IU we are so invested in textual markup following the TEI Guidelines, for which few external tools exist that act on the markup (thus our focus on building). But as is the case with many academic libraries attempting to balance the scale of digital production, we are not always in a position to build boutique interfaces, tools, and functions for hand-crafted markup. Further, early research inquiries can often be better defined, if not answered, by first playing and experimenting with raw data sets before embarking on markup. Finally, after many years of leading e-text initiatives and championing the TEI, I would love to sit around with folks and compare and contrast not just the possibilities but also the outcomes of the real research inquiries that formed the basis for many of the TEI collections I am offering up to the community for experimentation.
The other motivator for this session is two-fold. At IU we've always exposed the TEI/XML, but only at the most atomic level. I am exploring workflows moving forward in which we batch not only the TEI but also other versions of the data, primarily plain text, for easier harvesting and re-purposing. One of many good reasons for doing this is that we want to demonstrate to our faculty partners the many possibilities of sharing data in this way. The content can and should be analyzed, parsed, and remixed outside the context of its collection site for broader impact and exposure. I am hoping, with your help, to figure out how best to push versions of this data into the flow, initially around a more formal call to the digital humanities community-at-large, so I can track the various morphings and instantiations of this data and share them back with the IU community, especially my faculty partners.
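As a rough illustration of what batching plain-text versions alongside the TEI could look like, here is a minimal Python sketch. It is not IU's actual workflow. The function names and directory layout are hypothetical, and it assumes each file is well-formed XML with a `text` element, either in the TEI P5 namespace or un-namespaced as in older files. Files that rely on DTD-defined entities would need extra handling that is not shown here.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# TEI P5 namespace; older (P4-style) files are assumed to be un-namespaced.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def tei_to_plain_text(root):
    """Return the text content of a TEI document with all markup dropped.

    Only the <text> element is used, so header metadata is excluded.
    Runs of whitespace are collapsed to single spaces.
    """
    text_el = root.find(f".//{TEI_NS}text")
    if text_el is None:
        text_el = root.find(".//text")  # un-namespaced fallback
    if text_el is None:
        text_el = root  # last resort: use the whole document
    raw = "".join(text_el.itertext())
    return " ".join(raw.split())


def batch_convert(src_dir, out_dir):
    """Write a .txt file for every .xml file in src_dir (hypothetical layout)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for xml_path in sorted(Path(src_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        txt_path = out / (xml_path.stem + ".txt")
        txt_path.write_text(tei_to_plain_text(root), encoding="utf-8")
        written.append(txt_path)
    return written
```

Flattening everything under `text` discards structural cues such as paragraph and division boundaries. That may be fine for word-frequency style tools, but it is a design choice worth revisiting for tools that care about document structure.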
I recently blogged about this very concern on Day of DH 2013.
This session is by no means limited to the following e-text data I will provide (access details forthcoming):

Serve up or use your own data of interest, even if only snippets.

Nor is it limited by the following tools I have identified for starters:

In fact, it would be best to partner up with folks who are familiar with a particular tool (come to this session, claim a tool!).

All data will be posted (in progress) on this public-facing wiki page: 


TEI and Plain Text Data from IU Libraries (In Progress)
