The rise of mass digitization projects is one of the most important shifts in library and archival practice, from both a preservation and an access standpoint. As analog media continue to deteriorate, digitization is the best hope for their long-term storage. However, the sheer volume of material created by digitization makes it infeasible for catalogers to catalog everything by hand. Making matters worse, many of these materials come with only scant metadata, meaning a cataloger would need to watch or listen to each item to generate metadata. Using machine learning to automate metadata generation is therefore a promising solution. Yet while machine learning algorithms have improved significantly over the years, particularly in the last decade with the sudden viability of deep neural networks, they still have difficulties and biases that pose problems for metadata in terms of usefulness, accuracy, and, relatedly, fairness. The goal of AMP is to introduce human intervention into its machine learning pipelines to circumvent some of these limitations, so that the metadata it generates is more correct, or at least more useful.