• About
    • The InaSpeechSegmenter is a Python-based tool that takes a binary file as input and produces a list of audio segments. Each segment contains a start, end, and label. This tool was designed to run in IU's HPC environment.
  • Source Code
    • galaxy/tools/amp_segment/ina_speech_segmenter_hpc.xml
      Tool configuration detailing tool execution, input file, output file, and labeling.
    • galaxy/tools/amp_segment/ina_speech_segmenter_hpc.py
      Python script that calls the speech segmenter via its API and conforms the resulting JSON to the schema.

  • Dependencies
    • The Python script uses the InaSpeechSegmenter tool. The source code, dependencies, and documentation can be found here: https://github.com/ina-foss/inaSpeechSegmenter. To run on HPC, the tools for transporting and running batch jobs can be found here: https://github.com/AudiovisualMetadataPlatform/hpc_batch

  • Running the tool
    • The tool can be invoked from the Galaxy UI like any other tool. Before running it, the user must ingest the input file into Galaxy with the Get Data / Upload from computer tool.
      When ingesting, choose binary (the default) as the file format. The file is then copied into a designated location in the Galaxy file system. When invoked, the HPC tool creates a job and executes it on HPC.

  • Parameters
    • $input_file: the audio file to run the segmentation on.
    • $json_file: the output JSON file.
  • Output
    • JSON file conforming to the schema located at https://wiki.dlib.indiana.edu/display/AMP/MGM+-+Segmentation
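
As a rough illustration of the "conform JSON to schema" step, the sketch below turns a list of segments, each with a label, start, and end, into a JSON document. It is not the code in ina_speech_segmenter_hpc.py. The function name segments_to_json, the field names ("media", "filename", "segments", "label", "start", "end"), the file name input.wav, and the sample segments are all assumptions made for illustration; the authoritative field names are those in the schema linked above.

```python
import json


def segments_to_json(media_path, segments):
    """Build a JSON-serializable dict from (label, start, end) tuples.

    The key names used here are hypothetical; check them against the
    AMP segmentation schema before relying on this layout.
    """
    return {
        "media": {"filename": media_path},
        "segments": [
            {"label": label, "start": float(start), "end": float(end)}
            for label, start, end in segments
        ],
    }


# Illustrative segments only; times are in seconds.
example = [("noEnergy", 0.0, 0.5), ("speech", 0.5, 4.2), ("music", 4.2, 9.0)]
result = segments_to_json("input.wav", example)
print(json.dumps(result, indent=2))
```

In the real tool the dict would be written to the path passed as $json_file rather than printed.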