  • About
    • The AWS Comprehend adapter is a Python-based tool that takes speech segmentation JSON as input and produces a list of entities. Each entity contains a start, end, type, label, and score.
  • Source Code
    • galaxy/tools/aws/aws_comprehend.xml
      Tool configuration detailing tool execution, input file, output file, and labeling.
    • galaxy/tools/aws/aws_comprehend.py
      Python script that calls AWS Comprehend through its API and conforms the returned JSON to the schema
    • galaxy/tools/amp_json_schema/entity_extraction_schema.py
      Set of classes representing the entity extraction schema
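
As a rough illustration of the entity shape described under About, the sketch below models one entity with the five listed fields and serializes a list of them to JSON. The class name `Entity`, the helper `entities_to_json`, the field types, and the sample values are all assumptions made for illustration; the actual classes in entity_extraction_schema.py may be organized differently.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for one extracted entity (not the real schema class)."""
    start: float  # assumed numeric; the page does not state the unit
    end: float    # assumed numeric; the page does not state the unit
    type: str
    label: str
    score: float


def entities_to_json(entities):
    """Serialize a list of Entity objects to a JSON array string."""
    return json.dumps([asdict(e) for e in entities], indent=2)


# Made-up sample values, for illustration only.
example = Entity(start=12.4, end=13.1, type="PERSON", label="Ada Lovelace", score=0.98)
print(entities_to_json([example]))
```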
  • Installation:
            $ pip install boto3

  • Running the tool
    • The tool can be invoked from the Galaxy UI like any other tool. The user supplies the input data as standardized speech-to-text output.
  • Parameters
    • $input_file: the speech-to-text output.
    • $json_file: the output JSON file.
    • $bucketName: the AWS S3 bucket used to store input and output files. For testing purposes, the bucket amp-dev-test can be used.
    • $dataAccessRoleArn: ARN of the IAM role that allows Comprehend to access S3.
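
Because the tool takes an S3 bucket and a data access role, one plausible reading is that it submits an asynchronous Comprehend entities detection job. The sketch below shows how `$bucketName` and `$dataAccessRoleArn` might map onto boto3's `start_entities_detection_job` request. This is an assumption about the approach, not a copy of aws_comprehend.py: the helper names, the object keys, the job name, the `ONE_DOC_PER_FILE` input format, and the `en` language code are all hypothetical.

```python
def build_job_request(bucket_name, input_key, output_prefix, data_access_role_arn,
                      job_name="amp-entity-extraction"):
    """Build keyword arguments for Comprehend's start_entities_detection_job.

    input_key, output_prefix, and job_name are hypothetical; the real tool
    may name its S3 objects and jobs differently.
    """
    return {
        "JobName": job_name,
        "LanguageCode": "en",  # assumed: English transcripts
        "DataAccessRoleArn": data_access_role_arn,
        "InputDataConfig": {
            "S3Uri": f"s3://{bucket_name}/{input_key}",
            "InputFormat": "ONE_DOC_PER_FILE",  # assumed: one transcript per file
        },
        "OutputDataConfig": {
            "S3Uri": f"s3://{bucket_name}/{output_prefix}",
        },
    }


def submit_job(client, request):
    """Submit the job with a boto3 Comprehend client and return its job ID."""
    response = client.start_entities_detection_job(**request)
    return response["JobId"]


# Usage (requires boto3 and valid AWS credentials, so it is left as a comment):
#   import boto3
#   client = boto3.client("comprehend")
#   request = build_job_request("amp-dev-test", "input/transcript.txt",
#                               "output/", "<your dataAccessRoleArn>")
#   job_id = submit_job(client, request)
```

Passing the client in as an argument keeps the request-building logic testable without AWS access.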

