
List of MGMs

Audio and Text

  1. Speech-to-text
    1. AWS Transcribe
    2. Kaldi
      1. Local
      2. HPC
  2. Forced Alignment
    1. Gentle
  3. Named Entity Recognition
    1. AWS Comprehend
    2. spaCy
  4. Vocabulary Tagging
  5. Segmentation
    1. INA Speech Segmenter
  6. Music Program OCR
  7. Applause Detection
    1. Acoustic Classification Segmentation

Video

  1. Video OCR
    1. MS Azure Video Indexer
    2. Tesseract + FFmpeg
  2. Shot Detection
    1. MS Azure Video Indexer
    2. PySceneDetect
  3. Facial Recognition
    1. Python face_recognition
  4. Contact Sheet Generation
    1. Based on a given time interval
    2. Based on a total number of evenly spaced frames
    3. Based on output of Shot Detection (the middle frame of each shot)
    4. Based on output of Facial Recognition
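
The first three contact sheet strategies above differ only in how frame timestamps are chosen. The following is a minimal sketch of that selection step, not the evaluated implementation. The function names are hypothetical, shots are assumed to arrive as (start, end) pairs in seconds, and "evenly spaced" is assumed to mean the centre of each equal-length slice of the video. The Facial Recognition variant is left out because the format of that output is not described here.

```python
from typing import List, Tuple


def times_by_interval(duration: float, interval: float) -> List[float]:
    """One timestamp every `interval` seconds, starting at 0 (hypothetical helper)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    times = []
    k = 0
    # Stop before the end of the video so every timestamp is a valid frame position.
    while k * interval < duration:
        times.append(k * interval)
        k += 1
    return times


def times_evenly_spaced(duration: float, count: int) -> List[float]:
    """`count` timestamps, each at the centre of an equal-length slice (assumption)."""
    if count <= 0:
        raise ValueError("count must be positive")
    step = duration / count
    return [step * (i + 0.5) for i in range(count)]


def times_from_shots(shots: List[Tuple[float, float]]) -> List[float]:
    """Middle timestamp of each shot, given (start, end) pairs in seconds."""
    return [(start + end) / 2 for start, end in shots]
```

The resulting timestamps could then be handed to any frame extractor, such as FFmpeg, to grab the thumbnails that make up the sheet.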

Other Documentation

Evaluation Criteria

Evaluation Template



This page will grow as more MGMs are evaluated.
