DACC

The Data Aggregator Cross Cozy (DACC)

Setup

Install

To set up the environment and install dependencies, simply run:

./mkenv.sh

Config

Copy the template into a new config file:

cp config-template.yml config.yml

Then, edit config.yml to suit your needs.

Run

source venv/bin/activate
flask run

API

Add a measure

You can request the /measure endpoint to add a raw measure:

$ curl -i -X POST -H 'Authorization: Bearer <token>' -H 'Content-Type: application/json' http://localhost:5000/measure -d @measure.json
HTTP/1.0 201 CREATED
Content-Type: application/json
Content-Length: 12
Server: Werkzeug/2.0.1 Python/3.8.10

{
  "ok": "true"
}

ℹ️ If you want to request an actual server, replace http://localhost:5000/measure with your DACC URL. For example, https://dacc.cozycloud.cc/measure, or https://dacc-dev.cozycloud.cc/measure for tests.

ℹ️ The token is automatically injected by the stack when using a remote-doctype. Here is the remote-doctype that must be used to request Cozy's DACC server. You can also use the one for development. To know how to use a remote-doctype from a Cozy app, see the stack documentation.
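The same request can be prepared from Python. This is a minimal sketch using only the standard library; the helper name `build_measure_request` is hypothetical, and the base URL and token are placeholders. It builds the request without sending it.

```python
import json
import urllib.request


def build_measure_request(base_url, token, measure):
    """Prepare (but do not send) a POST request to the /measure endpoint.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = json.dumps(measure).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/measure",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


request = build_measure_request(
    "http://localhost:5000",  # placeholder: replace with your DACC URL
    "<token>",  # placeholder: replace with a real token
    {
        "measureName": "konnector-event-daily",
        "value": 1,
        "startDate": "2022-01-01",
        "createdBy": "ecolyo",
        "group1": {"slug": "enedis"},
        "group2": {"event_type": "connection"},
        "group3": {"status": "success"},
    },
)
print(request.get_method(), request.full_url)
```

To actually send it, pass `request` to `urllib.request.urlopen` while a DACC server is reachable.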

Measure format

Here is an example of a valid measure:

{
  "measureName": "konnector-event-daily",
  "value": 1,
  "startDate": "2022-01-01",
  "createdBy": "ecolyo",
  "group1": {
      "slug": "enedis"
  },
  "group2": {
      "event_type": "connection"
  },
  "group3": {
      "status": "success"
  }
}
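A payload shaped like this example could be assembled programmatically. The sketch below is only illustrative: `make_measure` is a hypothetical helper, it assumes groups are sent in order as `group1`, `group2`, and so on, as in the example, and it performs no validation.

```python
import datetime
import json


def make_measure(name, value, start_date, created_by, groups=()):
    """Build a measure payload shaped like the example above.

    Assumption: groups map to group1, group2, ... in order, as in the
    example. This hypothetical helper does not validate anything.
    """
    measure = {
        "measureName": name,
        "value": value,
        "startDate": start_date.isoformat(),
        "createdBy": created_by,
    }
    for index, group in enumerate(groups, start=1):
        measure["group%d" % index] = group
    return measure


measure = make_measure(
    "konnector-event-daily",
    1,
    datetime.date(2022, 1, 1),
    "ecolyo",
    groups=[{"slug": "enedis"}, {"event_type": "connection"}, {"status": "success"}],
)
print(json.dumps(measure, indent=2))
```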

The expected fields are the following:

Measure definition

A measure is defined by the following fields:

Note there is no public API to insert a new definition. For security purposes, Cozy restricts this possibility and carefully evaluates each new measure definition to accept it or not.

Aggregate functions

The available aggregate functions are the following:

Query an aggregated result

You can query the /aggregate endpoint to get an aggregated result:

$ curl -i -X GET -H 'Authorization: Bearer <token>' -H 'Content-Type: application/json' http://localhost:5000/aggregate -d '{"measureName": "connection-count-daily", "startDate": "2022-01-01", "endDate": "2022-01-02"}'
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 12
Server: Werkzeug/2.0.1 Python/3.8.10

[
  {
    "avg": 54.39,
    "count": 173,
    "countNotZero": 172,
    "createdBy": "ecolyo",
    "group1": {
      "device": "desktop"
    },
    "max": 100.0,
    "measureName": "connection-count-daily",
    "min": 0.0,
    "startDate": "2022-01-01T00:00:00",
    "std": 28.94,
    "sum": 9410.0
  },
  {
    "avg": 53.04,
    "count": 26,
    "countNotZero": 26,
    "createdBy": "ecolyo",
    "group1": {
      "device": "mobile"
    },
    "max": 94.0,
    "measureName": "connection-count-daily",
    "min": 3.0,
    "startDate": "2022-01-02T00:00:00",
    "std": 29.33,
    "sum": 1379.0
  }
]
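Once received, such a response is a plain JSON list and can be processed directly. The following sketch parses a shortened copy of the sample above (only a few fields are kept) and derives a couple of summary values; the variable names are illustrative.

```python
import json

# Shortened copy of the sample response above (some fields omitted).
response_body = """
[
  {"avg": 54.39, "count": 173, "group1": {"device": "desktop"},
   "startDate": "2022-01-01T00:00:00", "sum": 9410.0},
  {"avg": 53.04, "count": 26, "group1": {"device": "mobile"},
   "startDate": "2022-01-02T00:00:00", "sum": 1379.0}
]
"""

aggregates = json.loads(response_body)

# Total number of contributions and total sum across returned aggregates.
total_count = sum(item["count"] for item in aggregates)
total_sum = sum(item["sum"] for item in aggregates)

# Average value per device, taken from the group1 object of each aggregate.
avg_by_device = {item["group1"]["device"]: item["avg"] for item in aggregates}

print(total_count, total_sum, avg_by_device)
```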

The expected query parameters are the following:

ℹ️ startDate is inclusive, while endDate is exclusive. In other words, startDate <= {results} < endDate.
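Read as a half-open interval, the date rule can be illustrated as follows. This is a sketch of the comparison only, not the server's code; `in_range` is a hypothetical name.

```python
import datetime


def in_range(start_date, range_start, range_end):
    """Half-open interval check: range_start <= start_date < range_end."""
    return range_start <= start_date < range_end


range_start = datetime.datetime(2022, 1, 1)
range_end = datetime.datetime(2022, 1, 2)

# The lower bound is included, the upper bound is not.
print(in_range(datetime.datetime(2022, 1, 1), range_start, range_end))  # True
print(in_range(datetime.datetime(2022, 1, 2), range_start, range_end))  # False
```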

⚠️ If an aggregate has fewer contributions than the aggregation_threshold set in the associated measure definition, no result will be returned. This is a safeguard to ensure that no individual contribution can be revealed.
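The threshold safeguard amounts to a simple filter. The sketch below assumes that the `count` field of an aggregate is its number of contributions; that mapping and the name `visible_aggregates` are assumptions made for illustration, not the server's implementation.

```python
def visible_aggregates(aggregates, aggregation_threshold):
    """Drop aggregates with fewer contributions than the threshold.

    Assumption: "count" holds the number of contributions.
    """
    return [a for a in aggregates if a["count"] >= aggregation_threshold]


# Made-up sample values, for illustration only.
sample = [
    {"group1": {"device": "desktop"}, "count": 173},
    {"group1": {"device": "mobile"}, "count": 3},
]

# With a threshold of 5, the "mobile" aggregate is withheld.
print(visible_aggregates(sample, aggregation_threshold=5))
```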

Development

Build and launch docker dev environment

To launch the DACC application in a development environment with a dedicated PostgreSQL database running in Docker:

FLASK_ENV=development
FLASK_RUN_HOST=0.0.0.0
FLASK_RUN_PORT=5000
PORT=5000
cp config-template.yml config.yml
docker-compose up

The application should start in the development environment with auto-reloading. You can also add the -d option if you prefer to launch it in the background.

Execute tests

To execute tests while your docker dev environment is running, simply run:

$ docker exec dacc_web pytest

Docker dev environment stop & cleanup

To stop the dev environment containers, run the following command:

$ docker-compose down -v

To delete the PostgreSQL volumes once the containers are stopped, use:

sudo rm -rf volumes

Managing versions

DACC uses semantic versioning, that is, versions are in the form <major>.<minor>.<patch>.

Releasing a new version

The script scripts/releaseversion.sh will release a new production version:

$ ./scripts/releaseversion.sh

Updating version number

Each time you introduce new features, manually increase <minor> in the version number and reset <patch> to 0. Each time you introduce a backward-incompatible API change, bump <major> and reset <minor> and <patch> to 0.
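The bumping rules can be summarized in a few lines of Python. This sketch only computes the next version string; it is not what the project's scripts do internally, and the `bump` helper is hypothetical.

```python
def bump(version, part):
    """Return the next semantic version for the given part.

    Hypothetical helper. A suffix such as "-dev" is ignored.
    """
    core = version.split("-")[0]
    major, minor, patch = (int(piece) for piece in core.split("."))
    if part == "major":
        return "%d.0.0" % (major + 1)
    if part == "minor":
        return "%d.%d.0" % (major, minor + 1)
    if part == "patch":
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("part must be 'major', 'minor' or 'patch'")


print(bump("1.2.3", "minor"))  # 1.3.0
print(bump("1.2.3", "major"))  # 2.0.0
```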

To manually change the version, use the script scripts/nextversion.sh:

$ ./scripts/nextversion.sh 1.2.3

This script updates the version number and adds a -dev suffix, but it is your responsibility to add, commit, and push the change.

Administration

Healthchecks

You can query the /status route to get health checks result:

$ curl -s -i http://localhost:5000/status
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 63
Server: Werkzeug/2.0.1 Python/3.8.10

{
  "db": {
    "status": "ok"
  },
  "global_status": "ok"
}

Or, if you only need the global status:

$ curl -s http://localhost:5000/status | jq -r '.global_status'
ok
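A monitoring script could interpret the same payload in Python. The sketch below works on the sample body shown above rather than on a live server; the function names are hypothetical, and it assumes every component entry is an object with a `status` field, as `db` is in the sample.

```python
import json


def global_status(status_body):
    """Return the global_status value from a /status response body."""
    return json.loads(status_body)["global_status"]


def failing_checks(status_body):
    """List component names whose status is not "ok".

    Assumption: components are objects with a "status" key, like "db".
    """
    status = json.loads(status_body)
    return [
        name
        for name, check in status.items()
        if isinstance(check, dict) and check.get("status") != "ok"
    ]


sample_body = '{"db": {"status": "ok"}, "global_status": "ok"}'
print(global_status(sample_body), failing_checks(sample_body))
```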

Authentication

Some routes need authentication. Some commands are available to set up token-based authentication:

flask token create # Create a new token linked to an organization
flask token get # Get the list of existing tokens
flask token update # Update a token linked to an organization
flask token delete-org # Remove an organization

Database migration

When the database needs a migration, i.e. when its structure has changed (for instance, a new column), run flask db migrate. This generates a migration script, which must be committed.

It is then possible to run flask db upgrade on the DACC server to automatically handle the database migration.

⚠️ Note that materialized views cannot be manually altered. Thus, a migration on a view triggers a recreation from scratch, which can take some time depending on the volume. Hence, the migration script might include the following steps to prevent any service interruption:

CREATE MATERIALIZED VIEW tmp AS ...;
DROP MATERIALIZED VIEW myView;
ALTER MATERIALIZED VIEW tmp RENAME TO myView;

Insert definitions

To insert measure definitions, simply copy assets/definitions-example.json to a new file, adapt it to suit your needs and run flask insert-definitions-json -f assets/yournewfile.json.

If the -f flag is omitted, the command uses the JSON file stored in dacc/assets/definitions.json by default.

For each definition found, it either inserts it, if it does not exist, or updates it otherwise.
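This insert-or-update behavior can be pictured with a small in-memory sketch. It is not the command's implementation: it assumes definitions are identified by a `name` field, which is a hypothetical key chosen for illustration, and it uses a dictionary in place of the database.

```python
def upsert_definitions(stored, incoming, key="name"):
    """Insert new definitions and update existing ones, keyed by `key`.

    Assumption: `name` is a hypothetical identifying field.
    Definitions absent from `incoming` are left untouched in `stored`.
    """
    inserted, updated = [], []
    for definition in incoming:
        identifier = definition[key]
        if identifier in stored:
            updated.append(identifier)
        else:
            inserted.append(identifier)
        stored[identifier] = definition
    return inserted, updated


# Made-up sample definitions, for illustration only.
stored = {
    "connection-count-daily": {"name": "connection-count-daily", "aggregation_threshold": 5},
}
incoming = [
    {"name": "connection-count-daily", "aggregation_threshold": 10},
    {"name": "konnector-event-daily", "aggregation_threshold": 5},
]
inserted, updated = upsert_definitions(stored, incoming)
print(inserted, updated)
```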

Definition removal

Definitions removed from the file are not removed from the database by insert-definitions-json. Currently, definitions must be removed from the database by hand. A CLI command could be added in the future to remove definitions.

Measures purge

Raw measures can be purged with the commands:

By default, measures older than 90 days will be deleted. A -d days option can be passed to specify the minimal age of a measure to be deleted, expressed in number of days, starting from the current date.
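The cutoff date implied by the number of days is simple date arithmetic, sketched below. `purge_cutoff` is a hypothetical helper, and which measure date the command actually compares against this cutoff is not stated here.

```python
import datetime


def purge_cutoff(days=90, today=None):
    """Return the date before which measures are old enough to be purged.

    Hypothetical helper: `today` can be fixed for reproducible results.
    """
    if today is None:
        today = datetime.date.today()
    return today - datetime.timedelta(days=days)


# With the default of 90 days, evaluated on 2022-04-01.
print(purge_cutoff(today=datetime.date(2022, 4, 1)))  # 2022-01-01
```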

Note that measures involving quartiles cannot be purged as easily as the others, because quartiles cannot be partitioned: all the measures are needed to compute a quartile. Thus, max_days_to_update_quartile is used to determine whether the measures can be removed.

When measures are purged, the impacted aggregates are updated to save the purge date in the last_raw_measures_purged column.

Wildcard aggregates

It is possible to manually generate wildcard aggregates:

flask compute-wildcard-aggregate <measure_name> <group1,group2> --from-date 2021-01-01 --to-date 2022-01-01

A wildcard aggregate will consider all the measures for any group value, if the group is specified.

For instance, let us consider an electric-consumption measure with group1_key: heating_type and group2_key: surface. If you want to aggregate all the measures for any surface, you can run:

flask compute-wildcard-aggregate electric-consumption group2

This will produce aggregates with group2: * for each heating type.
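Conceptually, this collapses the surface dimension and keeps one bucket per heating type. The following sketch illustrates the idea on made-up values; it is not the command's implementation, and the function name and output fields shown are assumptions.

```python
from collections import defaultdict

# Made-up raw measures, for illustration only.
measures = [
    {"group1": {"heating_type": "gas"}, "group2": {"surface": 50}, "value": 100.0},
    {"group1": {"heating_type": "gas"}, "group2": {"surface": 80}, "value": 140.0},
    {"group1": {"heating_type": "electric"}, "group2": {"surface": 50}, "value": 200.0},
]


def wildcard_on_group2(measures):
    """Aggregate per heating type, ignoring the group2 (surface) value."""
    buckets = defaultdict(list)
    for measure in measures:
        buckets[measure["group1"]["heating_type"]].append(measure["value"])
    return [
        {
            "group1": {"heating_type": heating_type},
            "group2": "*",  # wildcard: any surface
            "count": len(values),
            "sum": sum(values),
            "avg": sum(values) / len(values),
        }
        for heating_type, values in sorted(buckets.items())
    ]


for aggregate in wildcard_on_group2(measures):
    print(aggregate)
```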

ℹ️ You can specify a date range with --from-date and --to-date to restrict the measures based on the RawMeasure.last_updated column. The default values are respectively 1970-01-01 and the current date.

Logging

To send logs to syslog, enable the functionality in your config file and define the minimum message severity (in the syslog sense) to be sent:

logging:
  enable: True
  logger_criticity: info

Community

What's Cozy?

<div align="center"> <a href="https://cozy.io"> <img src="https://cdn.rawgit.com/cozy/cozy-site/master/src/images/cozy-logo-name-horizontal-blue.svg" alt="cozy" height="48" /> </a> </div> </br>

Cozy is a platform that brings all your web services in the same private space. With it, your webapps and your devices can share data easily, providing you with a new experience. You can install Cozy on your own hardware where no one's tracking you.

Maintainer

The lead maintainer for the DACC is the Cozy Team.

Get in touch

You can reach the Cozy Community by:

License

DACC is developed by Cozy and distributed under the AGPL v3 license.