The Audio Video Annotations Analysis Toolkit

Corpus analysis is a complex task: it requires learning editors for different file formats and multiple tools, many of which are command-line based or assume programming knowledge.

AVAA Toolkit makes it easy to create pipelines that connect these ecosystems to process raw data (automated transcription, format conversion, etc.) and to query large corpora of annotations coming from various sources, in order to extract advanced statistics and generate beautiful, always up-to-date charts and timelines.

AVAA Toolkit is also a flexible converter: it takes as input XML files describing the style and operations used to generate an HTML document, and it exports only the relevant portions of videos and their thumbnail snapshots, minimizing the final document size and the load times if the document is hosted online.
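To make the idea of exporting only the relevant portion of a video more concrete, here is a minimal sketch in Python. It is not AVAA Toolkit's own code: the function names `clip_command` and `thumbnail_command` are hypothetical, and the sketch assumes FFmpeg (which the toolkit credits for most of its media processing) is the tool doing the cutting. It only builds the command lines and does not run them.

```python
from typing import List


def clip_command(src: str, start: float, duration: float, dst: str) -> List[str]:
    """Build an ffmpeg argument list that copies `duration` seconds of `src`,
    starting at `start` seconds, into `dst` (hypothetical helper)."""
    return [
        "ffmpeg", "-y",             # -y: overwrite the output file if it exists
        "-ss", f"{start:.3f}",      # seek to the start position before reading input
        "-i", src,
        "-t", f"{duration:.3f}",    # keep only this many seconds
        "-c", "copy",               # stream copy: no re-encoding, cuts snap to keyframes
        dst,
    ]


def thumbnail_command(src: str, at: float, dst: str) -> List[str]:
    """Build an ffmpeg argument list that saves one frame of `src`,
    taken at `at` seconds, as an image in `dst` (hypothetical helper)."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{at:.3f}",
        "-i", src,
        "-frames:v", "1",           # write a single video frame
        dst,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so no media file is needed.
    print(" ".join(clip_command("interview.mp4", 12.5, 4.0, "clip.mp4")))
    print(" ".join(thumbnail_command("interview.mp4", 12.5, "clip.jpg")))
```

The argument lists can be passed to `subprocess.run` as they are. Stream copy is fast but cuts on keyframes, so a real pipeline that needs frame-accurate boundaries would probably re-encode instead.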

Annotations Formats

AVAA Toolkit understands the following file formats

TEI, CHA and TEXTGRID formats are available thanks to the TEI-CORPO project.

Media Formats

AVAA Toolkit can also process the following media types:

  • audio: MP3, AAC, OGG, WAV, OPUS, FLAC
  • video: MOV, MKV, MP4, AVI, MTS
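
As an illustration of how the two lists above could be used, the following Python sketch sorts file names into audio and video by extension. The function `media_kind` is a hypothetical helper, not part of AVAA Toolkit, and relying on the file extension alone is an assumption made for brevity; the toolkit may well inspect the file contents instead.

```python
from pathlib import Path

# Extensions taken from the lists above, in lower case.
AUDIO_EXTENSIONS = {"mp3", "aac", "ogg", "wav", "opus", "flac"}
VIDEO_EXTENSIONS = {"mov", "mkv", "mp4", "avi", "mts"}


def media_kind(filename: str) -> str:
    """Return "audio", "video" or "unknown" based on the file extension only."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    if extension in VIDEO_EXTENSIONS:
        return "video"
    return "unknown"


if __name__ == "__main__":
    for name in ["interview.MP4", "session.flac", "notes.txt"]:
        print(name, media_kind(name))
```

The comparison is case-insensitive, so `interview.MP4` and `interview.mp4` are treated the same way.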

Most media processing is made possible by FFmpeg.