Digital Library Infrastructure

The eXtensible Text Framework (XTF), developed by the California Digital Library, is essentially a wrapper around Lucene that provides some functionality for handling XML and standard digital library formats.

XTF has been adopted to deliver text-based collections at the DLP. The IU Board of Trustees Minutes and IU Finding Aids are currently delivered with XTF (see Collections delivered with XTF).

There is a test version of XTF installed on rhyme (sample query: apartheid), and a test version being used for Newton.

XTF has a clean architecture with three main modules:

  • Indexing (textIndexer)
    • A command-line tool that initiates Lucene indexing of files in a given directory.
    • Can use custom XSLT both to select which documents get indexed and to pre-process documents for indexing.
    • Automatically detects which documents have changed, so it can perform incremental updates.
  • Query processing (crossQuery)
    • Can use custom XSLT both to transform the query and to render the result list.
    • Has a very simple native query language, but also supports SRU/CQL.
  • Document rendering (dynaXML)
    • Can use custom XSLT to transform document ID numbers into file locations, and to render the resultant files.
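The incremental-update behavior of the indexer can be illustrated with a small sketch. This is not XTF's actual mechanism, which this page does not describe; it only shows one common way to detect changed files, by comparing each file's modification time and size against a manifest saved on the previous run. The function names and the JSON manifest format are hypothetical.

```python
import json
import os


def find_changed_files(data_dir, manifest_path):
    """Return (changed, manifest) for all files under data_dir.

    A file counts as changed if it is new or if its (mtime, size) pair
    differs from the one recorded in the manifest from the previous run.
    Assumption: this is an illustration only, not how textIndexer works.
    """
    try:
        with open(manifest_path) as f:
            previous = json.load(f)
    except FileNotFoundError:
        previous = {}  # first run: every file is treated as changed

    current = {}
    changed = []
    for root, _dirs, files in os.walk(data_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            st = os.stat(path)
            # Stored as a list so it compares equal after a JSON round trip.
            current[path] = [st.st_mtime_ns, st.st_size]
            if previous.get(path) != current[path]:
                changed.append(path)
    return changed, current


def save_manifest(manifest_path, manifest):
    """Persist the manifest so the next run can compare against it."""
    with open(manifest_path, "w") as f:
        json.dump(manifest, f)
```

A caller would index only the files in `changed` and then call `save_manifest`. The manifest should live outside `data_dir` so that it is not itself picked up as a document.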

It is not yet clear how powerful the query processing module is; we may need to extend it.
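One way to probe the SRU/CQL support is to send standard SRU searchRetrieve requests and inspect the responses. The sketch below only builds such a request URL using the parameter names from the SRU 1.1 specification. The base URL is a hypothetical placeholder, and whether a given XTF installation exposes an SRU endpoint at any particular path is an assumption to verify.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query.

    base_url is a hypothetical endpoint; the parameter names
    (operation, version, query, maximumRecords) come from the SRU standard.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, using the sample query term from this page.
print(build_sru_url("http://example.org/xtf/SRU", "apartheid"))
```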

The XTF developers have tied their implementation to a particular version of Lucene by modifying the Lucene code in ways that have not been merged into the main Lucene CVS repository. This is not a big problem, but we would want to keep a separate, unmodified copy of Lucene if we need searching capabilities that XTF cannot handle.
