
Category description and use cases

Workflow example:

Speech-to-Text > Transcript Editor > Forced Aligner

Output standard

Summary: 

JSON Schema

Schema
{
 
}

Sample output

Sample Output
{

}


Recommended tool(s)

Gentle

Official documentation: Gentle on Github

Language: REST API or Python on command line

Description: 

Cost:  Free (open source)

Social impact: 

Notes: 

Installation & requirements

Two options for installation:

  1. Run the Docker image to start the web server, then use the REST API
  2. Download the source code and run the bash installation script, then use it as a command-line Python program

Parameters

Input formats

Audio (mp3, wav, possibly other formats) and transcript (plain text).

Example Usage

Gentle Example
curl -F "audio=@audio.mp3" -F "transcript=@words.txt" "http://localhost:8765/transcriptions?async=false"
# OR
python3 align.py audio.mp3 words.txt
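For callers who prefer Python over curl, the same request could be sketched roughly as follows. This is a minimal sketch, not Gentle's own client: the helper names `transcription_url` and `align_via_api` are hypothetical, the third-party `requests` package is assumed to be installed, and the server is assumed to be running on localhost:8765 and to answer with JSON like the example output.

```python
from urllib.parse import urlencode


def transcription_url(host="localhost", port=8765):
    """Build the endpoint URL used in the curl example (async=false)."""
    query = urlencode({"async": "false"})
    return f"http://{host}:{port}/transcriptions?{query}"


def align_via_api(audio_path, transcript_path, url=None):
    """POST both files as multipart form fields, as curl -F does.

    Hypothetical helper: assumes a Gentle web server is already running
    and that it responds with a JSON body.
    """
    # Third-party dependency; imported here so the URL helper works without it.
    import requests

    with open(audio_path, "rb") as audio, open(transcript_path, "rb") as transcript:
        response = requests.post(
            url or transcription_url(),
            files={"audio": audio, "transcript": transcript},
        )
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()
```

With the server up, `align_via_api("audio.mp3", "words.txt")` would be the equivalent of the curl call.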

Example Output

Gentle Output
{
"transcript": "Now, let me looking at the Congress, uh, as one of the institutions in trouble, uh, to some degree, not the same degree as others, perhaps, but still part of the whole mail.",
"words": [
	{
		"alignedWord": "now",
		"case": "success",
		"end": 38.29,
		"endOffset": 3,
		"phones": [
			{
				"duration": 0.12,
				"phone": "n_B"
			},
			{
				"duration": 0.01,
				"phone": "aw_E"
			}
		],
		"start": 38.16,
		"startOffset": 0,
		"word": "Now"
	},
	{
		"alignedWord": "let",
		"case": "success",
		"end": 38.65,
		"endOffset": 8,
		"phones": [
			{
				"duration": 0.05,
				"phone": "l_B"
			},

	...

]
}
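To show how the output above might be consumed downstream, here is a small sketch that pulls per-word timings out of the JSON. The function names `word_timings` and `phone_duration` are illustrative, the sample is a trimmed copy of the first word shown above, and skipping entries whose `case` is not `"success"` is an assumption (such entries may not carry `start` and `end`).

```python
import json

# Trimmed copy of the Gentle output above: first word only, transcript shortened.
SAMPLE = """
{
  "transcript": "Now, let me",
  "words": [
    {
      "alignedWord": "now",
      "case": "success",
      "end": 38.29,
      "endOffset": 3,
      "phones": [
        {"duration": 0.12, "phone": "n_B"},
        {"duration": 0.01, "phone": "aw_E"}
      ],
      "start": 38.16,
      "startOffset": 0,
      "word": "Now"
    }
  ]
}
"""


def word_timings(result):
    """Return (word, start, end) tuples for successfully aligned words."""
    rows = []
    for entry in result.get("words", []):
        if entry.get("case") != "success":
            continue  # assumption: unaligned entries may lack start/end
        rows.append((entry["word"], entry["start"], entry["end"]))
    return rows


def phone_duration(entry):
    """Sum the phone durations of one word entry, in seconds."""
    return sum(phone["duration"] for phone in entry.get("phones", []))


if __name__ == "__main__":
    result = json.loads(SAMPLE)
    for word, start, end in word_timings(result):
        print(f"{word}\t{start:.2f}\t{end:.2f}")
```

In this sample, `startOffset` and `endOffset` index into the `transcript` string, and the phone durations add up to the word's `end - start`.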

Other evaluated tools

Tool Name

Official documentation: <link>

Language: 

Description: 

Cost: <$ OR Free (open source)>

Social impact: 

Notes: 

Installation & requirements


Parameters


Input formats


Example Usage

<tool name> Example
 

Example Output

<tool name> Output
 

Evaluation summary

