Category description and use cases
Workflow example:
Speech-to-Text > Transcript Editor > Forced Aligner
Output standard
Summary:
JSON Schema
Sample output
Recommended tool(s)
Gentle
Official documentation: Gentle on GitHub
Language: REST API or Python on command line
Description:
Cost: Free (open source)
Social impact:
Notes:
Installation & requirements
Two options for installation:
- Install the Docker image to run the web server, then use the REST API
- Download the source code and run the bash installation script, then use Gentle as a command-line Python program
Parameters
Input formats
Audio (mp3, wav, possibly other formats) and transcript (plain text).
Example Usage
Gentle Example
curl -F "audio=@audio.mp3" -F "transcript=@words.txt" "http://localhost:8765/transcriptions?async=false"
# OR
python3 align.py audio.mp3 words.txt
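The API call can also be scripted. The following is a minimal Python sketch, not part of Gentle itself: it shells out to the same curl command and parses the JSON response. The function names gentle_curl_command and align_via_api are hypothetical, the -s (silent) flag is an addition, and the sketch assumes a Gentle web server is already listening on localhost:8765 as in the example.

```python
import json
import subprocess


def gentle_curl_command(audio_path, transcript_path, base_url="http://localhost:8765"):
    """Build the curl invocation from the usage example as an argument list.

    Hypothetical helper; base_url defaults to the address used in the example.
    """
    return [
        "curl",
        "-s",  # addition: suppress curl's progress meter so stdout holds only JSON
        "-F", f"audio=@{audio_path}",
        "-F", f"transcript=@{transcript_path}",
        f"{base_url}/transcriptions?async=false",
    ]


def align_via_api(audio_path, transcript_path, base_url="http://localhost:8765"):
    """Run the request and return the parsed alignment (assumes the server is up)."""
    cmd = gentle_curl_command(audio_path, transcript_path, base_url)
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


if __name__ == "__main__":
    # Only prints the command; call align_via_api(...) once a server is running.
    print(" ".join(gentle_curl_command("audio.mp3", "words.txt")))
```

Passing the command as a list avoids shell quoting problems when file names contain spaces.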
Example Output
Gentle Output
{
  "transcript": "Now, let me looking at the Congress, uh, as one of the institutions in trouble, uh, to some degree, not the same degree as others, perhaps, but still part of the whole mail.",
  "words": [
    {
      "alignedWord": "now",
      "case": "success",
      "end": 38.29,
      "endOffset": 3,
      "phones": [
        { "duration": 0.12, "phone": "n_B" },
        { "duration": 0.01, "phone": "aw_E" }
      ],
      "start": 38.16,
      "startOffset": 0,
      "word": "Now"
    },
    {
      "alignedWord": "let",
      "case": "success",
      "end": 38.65,
      "endOffset": 8,
      "phones": [
        { "duration": 0.05, "phone": "l_B" },
        ...
  ]
}
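To use this output downstream, the word-level timings usually need to be pulled out of the JSON. The sketch below is illustrative only: word_timings and phone_intervals are hypothetical helper names, and only the fields visible in the sample above are used. It treats "success" as the only aligned case and skips anything else. The phone intervals assume that phones are contiguous and start at the word's start time. The sample is consistent with that assumption (0.12 + 0.01 spans 38.16 to 38.29), but it is not confirmed here.

```python
def word_timings(result):
    """Return (word, start, end) for every successfully aligned word."""
    timings = []
    for entry in result.get("words", []):
        # Entries whose case is not "success" may lack start/end, so skip them.
        if entry.get("case") != "success":
            continue
        timings.append((entry["word"], entry["start"], entry["end"]))
    return timings


def phone_intervals(entry):
    """Return (phone, start, end) per phone, accumulating durations.

    Assumption: phones are contiguous and begin at the word's start time.
    """
    intervals = []
    cursor = entry["start"]
    for phone in entry.get("phones", []):
        end = cursor + phone["duration"]
        intervals.append((phone["phone"], cursor, end))
        cursor = end
    return intervals


# First word of the sample output above, copied verbatim.
sample = {
    "words": [
        {
            "alignedWord": "now",
            "case": "success",
            "end": 38.29,
            "endOffset": 3,
            "phones": [
                {"duration": 0.12, "phone": "n_B"},
                {"duration": 0.01, "phone": "aw_E"},
            ],
            "start": 38.16,
            "startOffset": 0,
            "word": "Now",
        }
    ]
}

if __name__ == "__main__":
    for word, start, end in word_timings(sample):
        print(word, start, end)
```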
Other evaluated tools
Tool Name
Official documentation: <link>
Language:
Description:
Cost: <$ OR Free (open source)>
Social impact:
Notes:
Installation & requirements
Parameters
Input formats
Example Usage
<tool name> Example
Example Output
<tool name> Output