  • About
    • The Forced Alignment tool is a wrapper around Gentle, a Python forced-alignment tool and API.
  • Source Code
    • AMP's fork of the Gentle repo: https://github.com/AudiovisualMetadataPlatform/gentle
      • Only a slight change was made to the Kaldi installation bash script. 
    • Singularity container to build the Gentle tool: https://github.com/AudiovisualMetadataPlatform/gentle-singularity
      • The Singularity wrapper around Gentle avoids having to install Gentle's numerous build dependencies. It also removes the need to keep a port open for the Gentle API.
    • /srv/amp/gentle-singularity/gentle-singularity.sif: Singularity container file for the forced alignment code, with all required dependencies built in
    • galaxy/tools/amp_stt/gentle_forced_alignment.py: Python wrapper script that runs forced alignment in the Singularity container

  • Dependencies
    • All dependencies are included in the Singularity .sif file; no extra installation is needed.

  • Usage: See details on how to install, build, and run at https://github.com/AudiovisualMetadataPlatform/gentle

  • Parameters
    • $input_audio_file: Input audio file in WAV format
    • $input_transcript_file: Input transcript file in AMP STT JSON format
  • Output
    • amp_transcript: JSON file in AMP Transcript format
  • Notes
    • In some instances, a word in the input transcript cannot be found in the audio. Gentle then produces a JSON node like this:

      {
         "case": "not-found-in-audio",
         "endOffset": 60941,
         "startOffset": 60937,
         "word": "type"
      }
    • To accommodate this, with input from the MGM team, we implemented an algorithm that checks how far ahead in the transcript the next "successful" match is.
      • We add the average time per word, (Next Success Start - Last Success End) / # of words away, to the previous word's time to get the new start/end times for the unfound words.
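    • The interpolation described above can be sketched as follows. This is a minimal illustration, not the code in gentle_forced_alignment.py. It assumes each successfully aligned word carries "case": "success" with "start" and "end" times in seconds, and it reads "# of words away" as the number of unfound words in the gap, so the gap is split evenly among them. The function name fill_unfound_times is hypothetical, as is the choice to leave trailing unfound words untouched and to treat the audio as starting at 0 when no earlier match exists.

```python
def fill_unfound_times(words):
    """Return a copy of `words` with estimated start/end on unfound entries.

    Assumption: aligned entries have case == "success" plus numeric
    "start" and "end" fields; every other entry counts as unfound.
    """
    filled = [dict(w) for w in words]  # do not mutate the caller's list
    last_end = 0.0  # assumption: use 0 if there is no earlier success
    i = 0
    while i < len(filled):
        if filled[i].get("case") == "success":
            last_end = filled[i]["end"]
            i += 1
            continue

        # Find the end of this run of unfound words.
        j = i
        while j < len(filled) and filled[j].get("case") != "success":
            j += 1
        if j == len(filled):
            break  # no later success to interpolate toward; leave as-is

        # Average time per unfound word in the gap.
        step = (filled[j]["start"] - last_end) / (j - i)
        for k in range(i, j):
            filled[k]["start"] = last_end + (k - i) * step
            filled[k]["end"] = last_end + (k - i + 1) * step
        i = j
    return filled


words = [
    {"case": "success", "word": "we", "start": 0.0, "end": 1.0},
    {"case": "not-found-in-audio", "word": "type"},
    {"case": "not-found-in-audio", "word": "here"},
    {"case": "success", "word": "now", "start": 2.0, "end": 2.5},
]
for w in fill_unfound_times(words):
    print(w["word"], w.get("start"), w.get("end"))
```

      • In this example the one-second gap between the two successful words is split in half, so each unfound word is assigned a half-second window that begins where the previous word ends.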