Architecture

Overview

While AVAA Toolkit's core is written in Java, most features are implemented in JavaScript.

All these scripts are stored in the scripts folder and processed when necessary as the XML-to-HTML conversion advances.

The file name itself indicates both the type of the component and its unique id.
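
As an illustration of this naming convention, the sketch below splits a script file name into a component type and an id. It assumes a `<type>.<id>.js` pattern, inferred from names such as view.my-custom-view.js; the `parseScriptName` helper is hypothetical and not part of AVAA Toolkit.

```javascript
// Hypothetical helper, not part of AVAA Toolkit: splits a script file name
// into its component type and id, assuming a "<type>.<id>.js" pattern.
function parseScriptName(fileName) {
    const match = /^([a-z]+)\.(.+)\.js$/.exec(fileName);
    if (!match) return null; // name does not follow the assumed pattern
    return { type: match[1], id: match[2] };
}

console.log(parseScriptName('view.my-custom-view.js'));
```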

This modularity allows anyone to create custom scripts to extend AVAA Toolkit and solve specific tasks, without the need for further core AVAA Toolkit development.

Scripts of general use can then be shared by their authors for the benefit of the community at large.

Scriptable Components

Views

Views are in charge of rendering a selection of annotations (or sometimes data) into HTML.

The view component is an object with the following structure:

list(Array<Annotation>) : String A method for rendering an array of annotations. Must return the final HTML.
data(Object) : String A method for rendering generic data. If this method is defined, it will be called instead of the list method.
css() : String A method returning default CSS to be embedded in the HTML page.
js() A function added to the final HTML page, and executed once to perform browser-side initialization.
includes : Array<String> An optional array of files from the include folder, to reference in the HTML page.


// file view.my-custom-view.js
VIEW = {
    list:list=>{
        let html='';
        for(const a of list){
            html+=`<div class="annotation">${a.tier.id}: ${a.value}</div>`;
        }
        return html;
    },
    css:()=>`
        #view div {
                color:green;
                margin:10px;
                border:2px dashed gray;
        }
    `
}

Note: in the css() method, any occurrence of #view will be replaced automatically by the actual view selector.
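
The data method can be sketched in the same way. The example below is a minimal illustration, assuming the generic data is a plain object of label/count pairs (that shape is an assumption made for this example). A local constant named `exampleView` is used so the snippet runs on its own; in an actual script file the object would be assigned to VIEW as shown above.

```javascript
// Sketch of a view that renders generic data instead of annotations.
// `exampleView` is a hypothetical name; a real script assigns to VIEW.
const exampleView = {
    // Called instead of list() when defined. The shape of `data`
    // (an object of label/count pairs) is an assumption for this example.
    data: data => {
        let html = '<table>';
        for (const [label, count] of Object.entries(data)) {
            html += `<tr><td>${label}</td><td>${count}</td></tr>`;
        }
        return html + '</table>';
    },
    // "#view" is replaced by the actual view selector, per the note above.
    css: () => `
        #view td {
            padding:4px;
            border:1px solid gray;
        }
    `
};

console.log(exampleView.data({ greeting: 3, question: 5 }));
```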

The view script file accepts these specific meta-definitions:

@input
@attr
@param
@dependency

The view component runs with these objects available in scope:

log
api
attr
params
data
variables

Operations

Operations take input data and may either produce side effects or return modified data.

The data is usually the current selection, that is, an array of annotations.

The operation component is a simple function with the following arguments in order:

value The string value of the operation's attribute itself
data The provided data to work on (an array of annotations if a selection is available, or any plain JS data)

The operation function should return either an array of annotations, which becomes the current selection, or any other data that is not a string.

If a string is returned, it is treated as another script to run in order to obtain the final effect or result (legacy behavior).


/**
    @novalue
    A simple operation to reverse an input array
**/
function(value, data){
    if(!Array.isArray(data)){
        log.error(`input must be an array`);
        return data;
    }
    return data.reverse();
}
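
An operation that does use its value argument could look like the following sketch. It assumes annotation objects expose a string `value` property, as in the view example; the `keepMatching` name exists only so the snippet runs on its own, since a real script file holds an anonymous function as shown above.

```javascript
// Sketch of an operation that keeps only the annotations whose value
// contains the operation's attribute value (case-insensitive).
// Assumption: each annotation exposes a string `value` property.
const keepMatching = function(value, data){
    if(!Array.isArray(data)){
        return data; // not a selection: hand the data back untouched
    }
    const needle = String(value).toLowerCase();
    // filter() returns a new array, so the input selection is not mutated
    return data.filter(a => String(a.value).toLowerCase().includes(needle));
};

const selection = [{value:'Hello there'}, {value:'Goodbye'}, {value:'hello again'}];
console.log(keepMatching('hello', selection).length);
```

Returning a new array rather than modifying the input in place keeps the previous selection intact if it is still referenced elsewhere.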

The operation script file accepts these specific meta-definitions:

@input
@output
@attr
@novalue
@dependency

The operation component runs with these objects available in scope:

log
api
attr
variables

Processors

Processors offer a wide range of possibilities, like manipulating the corpus or running external programs.

They are the components of choice for long-running tasks and advanced log feedback.

The processor component is an object with the following structure:

pipe(Pipeline) : Any A method called during pipeline execution, with the current pipeline as argument

If the pipe method returns false, the pipeline execution will be aborted.

If an exception is raised during the pipe method call, the pipeline execution will be aborted.


PROC = {
    pipe:pipe=>{
        for(let f of pipe.corpus){
            let step = log.addStep(f);
            for(let mf of f.media){
                step.file(mf, "Found media file:");
            }
        }
    }
}
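
The two abort rules can be illustrated with the sketch below. The `log` object is a stand-in so the snippet runs outside AVAA Toolkit, and `runPipeline` is a hypothetical runner written only to show the rules; it is not how AVAA Toolkit is actually implemented.

```javascript
// Minimal stand-in for the AVAA logger so the snippet runs on its own.
const log = { error: msg => console.error(msg) };

// Sketch of a processor that stops the pipeline when the corpus is empty.
const exampleProc = {
    pipe: pipe => {
        let count = 0;
        for (const _file of pipe.corpus) count++;
        if (count === 0) {
            log.error('No corpus file to process');
            return false; // returning false aborts the pipeline execution
        }
        return count; // any value other than false does not abort
    }
};

// Hypothetical runner, only to illustrate the abort rules.
function runPipeline(processors, pipe) {
    for (const p of processors) {
        try {
            if (p.pipe(pipe) === false) return 'aborted';
        } catch (e) {
            return 'aborted'; // an exception also aborts the pipeline
        }
    }
    return 'completed';
}

console.log(runPipeline([exampleProc], { corpus: [] }));
```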

The processor script file accepts these specific meta-definitions:

@input
@output
@attr
@param
@dependency

The processor component runs with these objects available in scope:

log
api
attr
variables
annotations
data
dataJSON
pipe

Mods

Mods (short for Modifiers) are special components that can hook into any stage of HTML generation.

A mod can add, remove, or modify HTML code at each HTML output point.

This allows complete modification of the final HTML page, for instance to add interactive features or visual effects.

The mod component is an object with the following structure:

hook(String) : String A method called at each HTML output step, with the step hook as argument

The following tag hooks are triggered:

head
body

The component must register itself on the available mod global object, for instance:


                    TODO

The mod script file accepts these specific meta-definitions:

@attr
@param
@dependency

The mod component runs with these objects available in scope:

log
api
attr
variables

Meta Definitions

Each script file can begin with doc comments /** ... **/ containing meta definitions.

A definition is a line beginning with a keyword prefixed by the @ character.

The following definitions are common to all script files:

@author Script file author
@license A license notice
@log A log entry, usually to log major changes
@todo A todo note
@sample Sample usage for the doc
@image Sample image (taken from dependencies/images folder)
@experimental Indicates the component is a work in progress
@hidden Whether to hide this component from the user (providers registering an extension use it to hide themselves from the providers' list)

The following definitions apply only to some components:

@input Allowed input type
@output Expected output type
@attr Custom attribute
@param Custom parameter
@dependency Custom JS file to load from the dependencies folder before loading the script file itself
@novalue Indicates the operation does not need an associated value

If a line does not begin with the @ character, it will be considered part of the component's description.
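
The rules above can be sketched as a small parser. This is an illustration only, not AVAA Toolkit's own parser: the `parseMeta` function and the shape of its result are hypothetical, and the author name in the sample is a placeholder.

```javascript
// Illustrative parser, not AVAA's own: extracts meta definitions and the
// description from a leading /** ... **/ doc comment.
function parseMeta(source) {
    const result = { definitions: [], description: [] };
    const comment = /^\s*\/\*\*([\s\S]*?)\*\*\//.exec(source);
    if (!comment) return result; // no leading doc comment
    for (const raw of comment[1].split('\n')) {
        const line = raw.trim();
        if (!line) continue;
        // "@keyword rest of line" is a definition
        const m = /^@(\S+)\s*(.*)$/.exec(line);
        if (m) {
            result.definitions.push({ keyword: m[1], text: m[2] });
        } else {
            // anything else is part of the component's description
            result.description.push(line);
        }
    }
    return result;
}

const script = `/**
    @author Jane Doe
    @novalue
    A simple operation to reverse an input array
**/
function(value, data){ return data; }`;

console.log(parseMeta(script));
```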

Component Execution Scope

When a component is executed, some helper objects are available in its scripting scope.

The following objects are available in all the components:

log A logger to raise feedback to the user
doc Current AVAA document being processed
api The API to access core functions
bag An object persistent across full conversion of the document
variables An object with current user-defined global variables

The following objects are available depending on the component:

data Current data being processed
dataJSON Current data as a JSON string, if possible
annotations Current selection (List) of annotations being processed, if available
attr An object with all the defined attributes and their values
params An array of parameter Element objects
pipe Currently executing Pipeline
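
As an example of using a scope object, the sketch below relies on bag to keep a running total across several calls of the same operation. The `bag` constant is a stand-in so the snippet runs outside AVAA Toolkit, and the `seenCount` key and `countAnnotations` name are hypothetical.

```javascript
// Stand-in for the scope object so the snippet runs on its own;
// inside AVAA Toolkit, `bag` is provided by the execution scope.
const bag = {};

// Sketch of an operation that counts, across the whole conversion,
// how many annotations went through it. The `seenCount` key is hypothetical.
const countAnnotations = function(value, data){
    if(Array.isArray(data)){
        bag.seenCount = (bag.seenCount || 0) + data.length;
    }
    return data; // the selection itself is left unchanged
};

countAnnotations('', [{}, {}]);
countAnnotations('', [{}]);
console.log(bag.seenCount);
```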

External Programs

AVAA Toolkit can execute other programs and read their output, via the processor component.

Generic Process

The ScriptAPI exposes an exec(args:Array) method to prepare a process execution.


let step = log.addStep({name:`Getting ffmpeg version`});
let proc = api.exec(['ffmpeg','-version'])
proc.onError(msg=>{
    step.error(`A process error occurred: ${msg}`)
})
proc.onOutput(msg=>{
    step.info(msg)
})

// monitor execution and allow the process to be killed from the editor
step.monitor(proc);

// run the process and get its exit status
let exitStatus = proc.run();
let benched = Math.ceil(proc.duration/1000)+1;
if(!exitStatus){
    step.info(`Completed in ${benched} seconds`);
    step.ok();
}else{
    step.error(`Process exited with error code ${exitStatus} after ${benched} seconds`);
}
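
When the output itself is needed afterwards, the same calls can be wrapped in a small helper. The sketch below relies only on the calls shown above (api.exec, onOutput, onError, monitor, run); the `runAndCollect` name and the shape of the returned object are hypothetical, and `fakeApi` and `fakeStep` exist only so the snippet can run outside AVAA Toolkit.

```javascript
// Sketch of a helper collecting the messages emitted by a process.
// Hypothetical helper; it only uses the process calls shown above.
function runAndCollect(api, step, args) {
    const lines = [];
    const errors = [];
    const proc = api.exec(args);
    proc.onOutput(msg => lines.push(msg));
    proc.onError(msg => errors.push(msg));
    step.monitor(proc);
    const exitStatus = proc.run();
    // a falsy exit status is treated as success, as in the example above
    return { ok: !exitStatus, exitStatus, lines, errors };
}

// Fake api and step, only so the snippet can run outside AVAA Toolkit.
const fakeApi = {
    exec: args => {
        let onOut = () => {};
        return {
            onOutput: cb => { onOut = cb; },
            onError: () => {},
            run: () => { onOut(`ran: ${args.join(' ')}`); return 0; }
        };
    }
};
const fakeStep = { monitor: () => {} };

const result = runAndCollect(fakeApi, fakeStep, ['ffmpeg', '-version']);
console.log(result.ok, result.lines);
```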

FFmpeg

It is possible to run ffmpeg with the Generic Process approach, but reading progress feedback and other results that way is cumbersome.

To this end, AVAA Toolkit provides a wrapper script that must be included in the component via the @dependency proc-ffmpeg-wrapper.js meta definition.

ffmpeg can then be called as follows:


let step = log.addStep({name:`Custom Video Effect`});
let ffargs = [
    '-i', inputFile.getAbsolutePath(),
    '-filter_complex', `"gblur=sigma=42:steps=6;format=yuv420p"`,
    outputFile.getAbsolutePath()
]
step.info(`Applying Gaussian Blur...`)
step.work()
let result = ffmpegWrapper(step, ffargs)
if(!result) throw `Failed applying the video filter!`

The wrapper has additional options not yet covered in this guide.

Python

AVAA Toolkit also provides a wrapper and a protocol to simplify calling Python programs and reading their output.

Because it is not possible to understand the output of all potential Python programs, authors wishing to interface a Python program with AVAA Toolkit have to prefix its output according to the following rules:

#E: to indicate an error, like
```
print('#E:missing input')
```
#W: to send a warning, like
```
print('#W:file will be overwritten')
```
#I: to log information, like
```
print('#I:Temperature =', temperature)
```
#F: to link a file, like
```
print('#F:Input =', args.input)
```
#D: for download progress, like
```
print('#D:20%')
```
#P: for process progress, like
```
print('#P:6/50/Encoding')
```
#R: for the result, like
```
print('#R:'+json.dumps(result))
```
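
On the receiving side, lines following these rules could be interpreted along the lines of the sketch below. It is not the wrapper shipped with AVAA Toolkit: the `parseOutputLine` function, the returned object shape, and the reading of the #P: payload as current/total/label are assumptions made for illustration.

```javascript
// Illustrative parser for the prefixed output lines listed above.
// Not the actual AVAA wrapper; the result shape is an assumption.
const KINDS = { E:'error', W:'warning', I:'info', F:'file', D:'download', P:'progress', R:'result' };

function parseOutputLine(line) {
    const match = /^#([EWIFDPR]):(.*)$/.exec(line);
    if (!match) return { kind: 'raw', text: line }; // unprefixed output
    const kind = KINDS[match[1]];
    const text = match[2];
    if (kind === 'progress') {
        // assumed layout: current/total/label, e.g. "6/50/Encoding"
        const [current, total, ...label] = text.split('/');
        return { kind, current: Number(current), total: Number(total), label: label.join('/') };
    }
    if (kind === 'download') {
        return { kind, percent: parseFloat(text) }; // "20%" -> 20
    }
    return { kind, text };
}

console.log(parseOutputLine('#P:6/50/Encoding'));
console.log(parseOutputLine('#E:missing input'));
```

The result payload is left as plain text here, so the caller decides whether to parse it as JSON or use it as-is.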

To implement the processor that calls the Python program, these two dependencies must be included:

@dependency proc-exec-output-wrapper.js This dependency provides the wrapper that interprets the output of the program.
This output wrapper is not specific to Python programs; it can be used for any program that needs to interface with AVAA Toolkit.

@dependency proc-python-wrapper.js

pip

Let's see how this all fits together with some code taken from the speech-to-text-whisper processor:


let req = log.addStep({name:'Requirements'})
let pythonVersion = pythonCheck(req);
if(!pythonVersion) return;
if(!pipCheck("torch",req)) return;
if(!pipCheck("openai-whisper",req)) return;

// checks that the avaa-whisper.py script is working
// the file avaa-whisper.py is in the dependencies folder
// it's a python script made to interface Whisper with AVAA
let whisperWrapper = pythonWrapper("avaa-whisper",req);

let step = log.addStep('Transcription');
let xargs = ['python','-u',whisperWrapper.getAbsolutePath(),
    '-i',inputFile.getAbsolutePath(),
    '-m',whisperModel,
    '-t',whisperTemp,
    '-l',whisperLang
];
let proc = api.exec(xargs);
step.monitor(proc);
proc.onError(s=>{
    step.error(`Process Error: ${s}`)
});
// here we use the output wrapper
proc.onOutput(execOutputWrapper(step,result=>{
    step.info(`Got results from transcription:`);
    let a = JSON.parse(result);
    // ...
}));
let exitStatus = proc.run();

R

AVAA Toolkit makes it easy to call R programs via the r-script processor.

The processor executes an R file (or R source code) and integrates the resulting data into the final HTML page. The R output can be graphic files (jpg, png, gif, svg) or tabular text data.

The arguments provided to the R script are, in order:

temp directory path to work with and create result files
path to a JSON file consisting of the selection (annotations) or data provided to the processor

R programs can send feedback messages to AVAA by following the same output wrapper protocol described in the Python section above.

When working directly with R source code (and not an R program file), the r-script processor will automatically add helper code:

args <- commandArgs(trailingOnly = TRUE)
avaaTempDir <- args[1]
avaaInput <- fromJSON(txt=args[2])
avaaTemp <- function(ext){
	return(tempfile(pattern = "r-output-", fileext = paste(".",ext,sep=""), tmpdir = avaaTempDir))
}
avaaReturn <- function(output){
	cat(paste("#R:",output,"\n", sep=""))
}
avaaLogFile <- function(f,s=''){
	cat(paste("#F:",s," =", f, "\n", sep=""))
}

Example usage in source:

cat("#I:Demo Plot Script\n")
# Create x and y values
x <- 1:6
y <- x^2

# Plot to a png file
outputFile <- avaaTemp("png")
png(file = outputFile, width = 256*3, height = 256*3)

# Linear regression model y = A + B * x
model <- lm(y ~ x)

# Create a 2 by 2 layout for figures
par(mfrow = c(2, 2))

plot(model)
dev.off()

# Log and return file to AVAA
avaaLogFile(outputFile,'Plot File')
avaaReturn(outputFile)

AVAA Toolkit Scripting Guide

Introduction