ActiveAnno

ActiveAnno is a web-based, responsive, highly configurable open source document annotation tool.

Demo

Live Demo

The demo is currently running at http://ltdemos.informatik.uni-hamburg.de/activeanno. Log in with the username "testuser" and the password "testuser". Since the location of the demo might change in the future, the current address will always be referenced here in the GitHub README.

Demo Video

https://youtu.be/ryCi4XeReDg

Quick start

Use docker-compose to start the application with frontend, backend and the database. Run the following command from the top level directory of the project:

docker-compose -f ./deployments/local.docker-compose.yml up -d

By default, this starts three containers: a MongoDB container, a Ktor backend service and an nginx server hosting a React application. Next, go to localhost:3000 to open the web UI. You can change the port by editing local.docker-compose.yml and changing the activeannofrontend.ports mapping. (In that case, you also need to add the new URL to the activeannobackend.CORS_HOSTS_LIST env variable.)

Login

By default, local.docker-compose.yml is configured for local use only, not for production. Therefore, the authentication mechanism is disabled, so any username will automatically generate a superuser account. When you open localhost:3000, you are redirected to localhost:3000/login and can enter any username you want. If you want to see an example project right away, choose the username admin. For this user, an existing project with a project configuration and some example data is provided.

Login page on mobile

<img src="screenshots/AA_login_mobile.png" alt="Login" width="300"/>

Navigation / Usage

Overview page on desktop / tablet

<img src="screenshots/AA_landingpage.png" alt="Overview Desktop" width="600"/>

Overview page on mobile

<img src="screenshots/AA_home_mobile.png" alt="Overview Mobile" width="300"/>

Manage page on desktop

<img src="screenshots/AA_manage_overview.png" alt="Manage Desktop" width="600"/>

Annotate page on desktop / tablet (before annotating)

<img src="screenshots/AA_annotate.png" alt="Annotate Desktop" width="600"/>

Annotate page on desktop / tablet (after annotating)

<img src="screenshots/AA_annotate_annotated.png" alt="Annotate Desktop selected" width="600"/>

Curate page

<img src="screenshots/AA_curate_full.png" alt="Curate full view" width="700"/>

Analyze results page

<img src="screenshots/AA_analyze_full.png" alt="Analyze results full view" width="700"/>

Use cases

API

In addition to the backend endpoints used by the frontend, there are further endpoints that are relevant, especially when ActiveAnno is used in a microservice context. These endpoints are documented here, together with Postman screenshots as examples of how to call the API.

Importing documents into ActiveAnno

There are two ways to get documents into the application for annotation. Either create a one-off project and upload the documents in the Manage UI, or create a project for continuous annotation of a stream of documents. For the latter, the Import API is available.

POST /api/v1/import/document

The endpoint is protected via JWT authentication and authorization. In a production setup, the user needs to have the role activeanno_producer (configurable via env variables). Typical use cases are another service pushing data to this endpoint, or pushing data manually via curl, Postman etc. In both cases, a JWT with this role is required. The endpoint accepts any JSON structure: single JSON objects as well as JSON arrays of JSON objects. This makes it all the more important that only authorized users can push to the endpoint. The application makes no assumptions about how the JSON objects are structured. For example, you could push a document with the following structure:

{
  "comment": "some comment",
  "timestamp": 123456789000,
  "innerObject": {
     "innerKey": "some metadata"
  },
  "uniqueID": 1
}

This document will be assigned a unique ID and stored in a MongoDB instance. If you want to annotate this document, you need to create a project that applies a filter which selects it. An example filter for the document above:

{
  "operator": "eq",
  "key": "uniqueID",
  "value": 1
}

Now, any document with uniqueID = 1 is part of that project and will be shown to the annotators. You don't need to write the filter JSON yourself; it can be configured in the Manage UI. Under the hood, it is just the Mongo query {"uniqueID": {"$eq": 1}}. The reason the filter JSON is not directly the Mongo query is that we gain type safety by modeling all allowed Mongo operations as classes in Kotlin. In addition, the operator field is used for polymorphic deserialization, which makes the transformation from JSON to the data structure inside the backend easy.
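The mapping from the filter JSON to a Mongo query can be illustrated with a small Python sketch. This is not the backend's Kotlin implementation; the function name filter_to_mongo_query and the operator table are hypothetical, and only the "eq" operator shown above is covered.

```python
from typing import Any, Dict

# Only "eq" appears in this README; the table is an assumption made for this
# sketch, and the real backend models its operators as Kotlin classes.
OPERATOR_TO_MONGO: Dict[str, str] = {"eq": "$eq"}


def filter_to_mongo_query(filter_json: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a {"operator", "key", "value"} filter into a Mongo query dict."""
    operator = filter_json["operator"]
    if operator not in OPERATOR_TO_MONGO:
        # Unknown operators are rejected instead of being passed on to MongoDB.
        raise ValueError(f"unsupported operator: {operator!r}")
    return {filter_json["key"]: {OPERATOR_TO_MONGO[operator]: filter_json["value"]}}


example_filter = {"operator": "eq", "key": "uniqueID", "value": 1}
print(filter_to_mongo_query(example_filter))  # {'uniqueID': {'$eq': 1}}
```

Dispatching on the operator field is what lets unknown or disallowed operations be rejected before they reach the database.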

Screenshot Postman with POST to import endpoint - Authorization

<img src="screenshots/postman_import_1.png" alt="Postman import 1" width="600"/>

Screenshot Postman with POST to import endpoint - POST body and response (201 Created)

<img src="screenshots/postman_import_2.png" alt="Postman import 2" width="600"/>
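As an alternative to Postman, the same request can be sketched in Python using only the standard library. The helper name build_import_request is hypothetical, the base URL assumes the backend from the quick start is reachable at localhost:8080, and the token placeholder must be replaced with a real JWT.

```python
import json
import urllib.request

# Assumption: the backend is exposed at this address, as in the local setup.
BASE_URL = "http://localhost:8080/api/v1"


def build_import_request(documents, jwt_token, base_url=BASE_URL):
    """Build (but do not send) a POST request for the import endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/import/document",
        data=json.dumps(documents).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
        method="POST",
    )


document = {
    "comment": "some comment",
    "timestamp": 123456789000,
    "innerObject": {"innerKey": "some metadata"},
    "uniqueID": 1,
}
# A JSON array of objects is passed here; a single object works the same way.
request = build_import_request([document], jwt_token="<your JWT>")
# To actually send it against a running backend:
# urllib.request.urlopen(request)
```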

In Postman, you need to set the Bearer Token to the Base64-encoded JWT. For testing purposes (with JWT verification disabled), go to jwt.io and create a JWT with the following payload:

{
  "sub": "testproducer",
  "roles": [
      "activeanno_producer"
  ]
}

This is enough to push to that endpoint (alternatively, you can disable role protection altogether via environment variables).
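If you prefer not to use jwt.io, a test token with the same payload can be generated locally. The following is a minimal sketch that signs the token with HS256 using only the Python standard library; the function name make_test_jwt and the secret are placeholders, and such a token is only suitable for a setup where JWT verification is disabled.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64URL-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_test_jwt(payload: dict, secret: str = "test-secret") -> str:
    """Create an HS256-signed JWT. The secret is a placeholder for local testing."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


token = make_test_jwt({"sub": "testproducer", "roles": ["activeanno_producer"]})
print(token)  # paste this value into the Bearer Token field
```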

Exporting documents and annotations

The other relevant API exports documents together with their annotations. There are three ways to do this: you can download all documents with their annotations from the Manage UI, use the REST API, or use webhooks. First, let's demonstrate the REST API.

GET /api/v1/export/project/{projectID}            // projectID, for example EXAMPLE_PROJECT_APP_REVIEWS
Optional GET parameters:
includeUnfinished=true|false  // Include documents that have not yet been fully annotated in the export
since=12345                   // UTC timestamp in millis; includes every document with a timestamp greater than or equal to this value (if includeUnfinished is true, the timestamp of any existing annotation is used; if false, the timestamp of the officially chosen (by curator or algorithm) annotation)
documentIDs=ABC,DEF,GHI       // Ask for specific documents by their unique Mongo ID. Comma-separated string.

The authentication for this endpoint is configurable per project. There are three options. None means anybody can call this endpoint if they know the ID of the project. HTTP Basic Auth uses a username and password that are specified in the project config. JWT role works the same way as for the import above, but with the role activeanno_consumer by default.

<img src="screenshots/postman_export_1.png" alt="Postman export 1" width="600"/>

The full export result is stored under example_export.json.
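The export call and its optional parameters can be sketched in Python as follows. The helper name build_export_request and its keyword arguments are hypothetical, the base URL assumes the local setup, and the Basic Auth credentials are placeholders for whatever is configured in the project.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: the backend is exposed at this address, as in the local setup.
BASE_URL = "http://localhost:8080/api/v1"


def build_export_request(project_id, include_unfinished=None, since=None,
                         document_ids=None, basic_auth=None, base_url=BASE_URL):
    """Build (but do not send) a GET request for the export endpoint."""
    params = {}
    if include_unfinished is not None:
        params["includeUnfinished"] = "true" if include_unfinished else "false"
    if since is not None:
        params["since"] = str(since)  # UTC timestamp in milliseconds
    if document_ids:
        params["documentIDs"] = ",".join(document_ids)

    url = f"{base_url}/export/project/{urllib.parse.quote(project_id, safe='')}"
    if params:
        url += "?" + urllib.parse.urlencode(params)

    request = urllib.request.Request(url, method="GET")
    if basic_auth is not None:
        # Only needed when the project is configured for HTTP Basic Auth.
        username, password = basic_auth
        credentials = base64.b64encode(f"{username}:{password}".encode("utf-8"))
        request.add_header("Authorization", "Basic " + credentials.decode("ascii"))
    return request


request = build_export_request(
    "EXAMPLE_PROJECT_APP_REVIEWS",
    include_unfinished=False,
    since=12345,
    basic_auth=("user", "pass"),  # placeholder credentials
)
# To actually send it against a running backend:
# urllib.request.urlopen(request)
```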

For webhook exports, you need to define the webhook configuration in the Manage UI. There, you define a list of URLs to which every newly finalized document annotation result will be posted. For each entry, you define a URL, how a failure is handled (retry on the next finished document or ignore), the export format and the kind of authentication required by the webhook. At the moment, no authentication and HTTP Basic Auth are supported. An OAuth2 approach with clientID, clientSecret and token URL is planned.
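On the receiving side, a webhook target protected by HTTP Basic Auth could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the credentials, the port, the handler names, and the idea that a non-2xx response is treated as a failure. The payload structure depends on the configured export format and is therefore not interpreted.

```python
import base64
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder credentials; they must match the webhook config in the Manage UI.
EXPECTED_USER = "webhook-user"
EXPECTED_PASSWORD = "webhook-password"


def is_authorized(header, username=EXPECTED_USER, password=EXPECTED_PASSWORD):
    """Check an Authorization header value against the expected Basic credentials."""
    if not header or not header.startswith("Basic "):
        return False
    expected = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    received = header[len("Basic "):].encode("utf-8")
    return hmac.compare_digest(received, expected)


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if not is_authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="webhook"')
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except ValueError:
            self.send_response(400)
            self.end_headers()
            return
        # The payload layout depends on the export format, so it is only logged.
        print("received annotation result of type", type(payload).__name__)
        self.send_response(200)
        self.end_headers()


def serve(port=9000):
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver; the URL of that server would then be entered as the webhook URL in the Manage UI.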

Access to newer features

Some newer features are currently only available through the API of ActiveAnno, not yet in the UI. This includes:

To have access to them, please refer to the documentation inside the api folder or use the JavaDoc / Dokka: https://maxmello.github.io/ActiveAnno/activeannoservice/

Production setup

For an actual production setup, some more steps are required. The first and biggest one is that an external Authentication service is necessary, if proper security is required. ActiveAnno uses JWT to authenticate and authorize users. Right now, no built-in user authentication is provided, as ActiveAnno is supposed to be used in a (Micro-)Service context, where authentication needs to work over multiple distributed services.

Authentication service integration

Development setup

When developing inside the project, you probably want to run the MongoDB in a container (use ./deployments/dev.docker-compose.yml), run the frontend via npm run start and the backend via ./gradlew run. Especially for the backend project, using IntelliJ to run the Ktor application (as well as the docker-compose file) makes things a lot easier.

To rebuild the backend Docker image, first execute ./gradlew build and then docker build -t activeannobackend/v1 . (the trailing dot is the build context and part of the command). The image build will then use the newly generated build files.

Re-building the react application

Since a web application running in the browser has no environment variables, the environment variables used in the frontend are set at build time, not at deploy time. Therefore, if you want to change their default values, you need to rebuild the frontend yourself with the desired values. The web app has the following environment variables:

REACT_APP_DATABASE=activeannodb
REACT_APP_GENERATE_SUPERUSER_ON_LOGIN=true
REACT_APP_AUTHENTICATION_SERVICE_URL=
REACT_APP_ACTIVE_ANNO_SERVICE_URL=http://localhost:8080/api/v1

These are the default values. For production purposes, you would want to set REACT_APP_GENERATE_SUPERUSER_ON_LOGIN to false, then provide the URL to the authentication service by setting REACT_APP_AUTHENTICATION_SERVICE_URL. This is the endpoint that should return a JWT token for a username and password. The REACT_APP_ACTIVE_ANNO_SERVICE_URL value only needs to be changed if other parts of the docker-compose file about how the backend is exposed are changed. The REACT_APP_DATABASE name is the name of the web database and probably doesn't need to be changed.

To set the environment variables, edit the ./frontend/Dockerfile and add ENV key=value entries. Then rebuild the image locally. When you rebuild the project locally, you will also need to change the image reference inside the docker-compose file to the local rather than the remote image.

Configuring the backend service

In addition to the authentication-related environment variables mentioned above, there are further ones that can be set inside the docker-compose file. Here is an overview:

PORT: 8080
HOST: '0.0.0.0'
HTTPS_REDIRECT: "false"
MONGO_CONNECTION: localhost:27017
MONGO_USER: user
MONGO_PASSWORD: password
MONGO_DATABASE_NAME: activeanno
LOGGING_LEVEL: DEBUG
GENERATE_EXAMPLE_PROJECT: "true"
CORS_HOSTS_LIST: 'localhost,localhost:3000,0.0.0.0,0.0.0.0:3000'

These are the default values. For production purposes, you would probably set GENERATE_EXAMPLE_PROJECT to false, raise the logging level above DEBUG, enable HTTPS_REDIRECT, and potentially connect the service to an existing MongoDB. The CORS list might not always be necessary, but in the local deployment scenario the URLs of the web page (localhost:3000) and the backend (localhost:8080) differ, so a CORS configuration is required.

Styling

The frontend has additional environment variables for customizing the colors of the web app.

REACT_APP_COLOR_PRIMARY_MAIN=#9A334F
REACT_APP_COLOR_PRIMARY_LIGHT=#C1607A
REACT_APP_COLOR_PRIMARY_MEDIUM_DARK=#74132D
REACT_APP_COLOR_PRIMARY_DARK=#4D0015
REACT_APP_COLOR_PRIMARY_CONTRAST_TEXT=#ffffff

REACT_APP_COLOR_SECONDARY_MAIN=#423075
REACT_APP_COLOR_SECONDARY_LIGHT=#625192
REACT_APP_COLOR_SECONDARY_MEDIUM_DARK=#281758
REACT_APP_COLOR_SECONDARY_DARK=#14063A
REACT_APP_COLOR_SECONDARY_CONTRAST_TEXT=#ffffff

REACT_APP_COLOR_SUCCESS_MAIN=#85BC5E

With these, you can customize the UI to fit your requirements. The primary color is used for the header, some buttons etc. The secondary color contrasts with the primary color, for example for buttons and icons on a primary-colored background. The contrast text color is the color of text that has to remain readable on a primary or secondary background. Lastly, there is the success color, which is normally a shade of green.

Backend

The backend is written in Kotlin using the Ktor framework. The Dokka documentation (the JavaDoc equivalent) can be found on GitHub Pages. To update it, execute ./gradlew dokka. This saves the markdown files directly in the /docs path. Then commit the changes inside the /docs folder and push to GitHub to update the GitHub Pages site.

Project structure

The backend code is structured in the following packages

Frontend

The frontend is written in JavaScript (ES6) with React and Redux.

Project structure

Roadmap

ActiveAnno is under active development in open source.

New pages / views

Big features

Smaller features

Browser compatibility

ActiveAnno was mainly developed with Firefox, but it was also tested in up-to-date versions of Chrome and Safari.

Contributing

Please submit any issues as GitHub issues in this repository. You are invited to submit merge requests to contribute to this project.

Author

License

MIT