ANTgen - the AMBAL-based NILM Trace generator

This tool generates synthetic macroscopic load signatures for use in conjunction with NILM (load disaggregation) tools. By default, it runs in scripted mode (i.e., with no graphical user interface) and processes an input configuration file into a set of CSV output files containing power consumption values and the timestamps of their occurrence, as well as a file summarizing the events that have occurred during the simulation.

If you find this tool useful and use it (or parts of it), we ask you to cite the following work in your publications:

@inproceedings{reinhardt20benchmarking,
  author = {Andreas Reinhardt and Christoph Klemenjak},
  title = {How does Load Disaggregation Performance Depend on Data Characteristics? Insights from a Benchmarking Study},
  booktitle = {Proceedings of the 11th ACM International Conference on Future Energy Systems (e-Energy)},
  year = {2020}
}

Requirements

ANTgen relies on a small number of Python libraries to fully function. Install them by typing:

pip3 install -r requirements.txt

Note: On Windows systems without a C/C++ compiler suite installed, the installation of package requirements may fail, reporting that the "Microsoft Visual C++ Build Tools" are missing. To continue, download and install them, then re-run the above command.

ANTgen can show an overview plot of its generated data if matplotlib is installed. So unless you plan to use ANTgen to create data on a headless server, we recommend the installation of this library as well.

pip3 install matplotlib    # (optional, install only if you want to see your data plotted)

Usage

python3 antgen.py [-o DESTINATION] [-w] [-s SEED] [-a] [-m MAPFILE] 
                  [-d DAYS] [-n NOISECONFIG] [-v] [-p] configfile

Mandatory argument:

Optional arguments:

Configuration

All ANTgen configuration files are expected to be in TOML format. The core configuration file must feature at least the sections [GENERAL] and [users]. Optionally, specific appliance models to be used can be placed in the [devices] section. If [devices] is not part of the configuration file, a fallback mapping between appliance names and the corresponding model dictionary must be provided by means of the -m option, such as -m mapping.conf. This mapping file must contain a section named [devices] to be processed correctly.

An example configuration is shown as follows:

[GENERAL]
name = Sample configuration
days = 4
seed = 12345

[users]
Home = baseload.conf
Jack = STUDENT/student_simple.conf

[devices]
COOKINGSTOVE = COOKINGSTOVE/dev_D33097
TV = TV/dev_B80E51

This configuration file will create synthetic load signature data for four days, seeding the random number generator with the value 12345. There are two users in the generated data, one going by the handle "Jack" and following the daily routines specified in users/STUDENT/student_simple.conf. The second "user" is present to model the household base load, as defined in users/baseload.conf.
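As a minimal sketch of how such a file could be inspected outside of ANTgen, the snippet below reads the example above with Python's standard configparser module. It assumes the INI-like syntax shown in the example and is not ANTgen's own loader; the function name summarize_config is hypothetical.

```python
import configparser

EXAMPLE = """
[GENERAL]
name = Sample configuration
days = 4
seed = 12345

[users]
Home = baseload.conf
Jack = STUDENT/student_simple.conf

[devices]
COOKINGSTOVE = COOKINGSTOVE/dev_D33097
TV = TV/dev_B80E51
"""


def summarize_config(text):
    """Return (days, seed, users, devices) from a core configuration string."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case; the default would lowercase 'Jack' and 'TV'
    parser.read_string(text)

    # The text above names [GENERAL] and [users] as the mandatory sections.
    for section in ("GENERAL", "users"):
        if not parser.has_section(section):
            raise ValueError(f"missing mandatory section [{section}]")

    # Assumption: 'days' and 'seed' may be absent, so fall back to None.
    days = parser.getint("GENERAL", "days", fallback=None)
    seed = parser.getint("GENERAL", "seed", fallback=None)
    users = dict(parser["users"])
    # [devices] is optional; without it, a mapping file (-m) has to supply it.
    devices = dict(parser["devices"]) if parser.has_section("devices") else {}
    return days, seed, users, devices


days, seed, users, devices = summarize_config(EXAMPLE)
print(days, seed, users, devices)
```

A check of this kind can catch a missing mandatory section before a long synthesis run is started.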

Users

To create realistic models, ANTgen relies on user models. All user models are stored in the users/ subdirectory, or subdirectories thereof. User models are stored in TOML format, and must feature the [GENERAL] and [presence] sections, as well as one section for each user activity that should be modeled (the names of these sections must start with the string activity_).

An example user model is shown as follows:

[GENERAL]
name = Lucas Lazybone

[presence]
monday   = 00:00-08:30, 14:00-24:00
sunday   = 00:00-24:00

[activity_breakfast]
model      = KITCHEN/cooking_quick.conf
daily_runs = 1
monday     = 07:30-08:15
sunday     = 08:30-09:15

This configuration models a user who is only at home on Mondays and Sundays, and cooks breakfast once on both days at some (randomly determined) time within the specified time intervals.
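The interval notation used in the [presence] and activity sections can be illustrated with a short sketch. The helper names parse_intervals and is_within are hypothetical, and the containment check is only an assumption about how presence and activity windows relate, not ANTgen's actual scheduling logic.

```python
def parse_intervals(spec):
    """Convert 'HH:MM-HH:MM, ...' into a list of (start, end) minutes of the day."""
    intervals = []
    for part in spec.split(","):
        start, end = part.strip().split("-")
        bounds = []
        for stamp in (start, end):
            hours, minutes = stamp.split(":")
            bounds.append(int(hours) * 60 + int(minutes))
        if bounds[0] >= bounds[1]:
            raise ValueError(f"empty or reversed interval: {part.strip()!r}")
        intervals.append((bounds[0], bounds[1]))
    return intervals


def is_within(window, presence):
    """True if the (start, end) window lies fully inside one presence interval."""
    return any(lo <= window[0] and window[1] <= hi for lo, hi in presence)


monday_presence = parse_intervals("00:00-08:30, 14:00-24:00")
breakfast = parse_intervals("07:30-08:15")[0]
print(monday_presence)                        # [(0, 510), (840, 1440)]
print(is_within(breakfast, monday_presence))  # True
```

In the example user model, the Monday breakfast window (07:30-08:15) falls inside the first presence interval, so the activity can be scheduled while the user is at home.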

Activities

Activities are modeled as state machines, to be executed by the users. All activity models must be stored in the activities/ subdirectory, or subdirectories thereof. Activity models also use the TOML format.

The following notes cover the nomenclature used and some general guidelines for activity definitions:

  1. The only entry the [GENERAL] section of each activity model must contain is the name field. Enter a descriptive name of the activity, which will also constitute the corresponding file name for the power data when written to an output file.

  2. Specify the types of appliances the activity requires in the [devices] section. The tool will try to find matches for all entries listed there, so any unused leftover entries can make the synthesis fail. Use unique numeric keys for listing the devices required. They will be later referred to in the state machine (see below).

  3. Activities are modeled in the form of state machines. Each operational state is specified in the [sequence] table. All states must be assigned a numeric identifier (the state machine starts in state 0) and require the specification of the following fields, which are entered in the form of a comma-separated list:

    • A short name (primarily for debugging purposes)
    • The minimum and maximum duration (use '0' to fall back to the underlying appliance model's default value). If a value greater than 0 is provided, the underlying appliance's activity will be scaled linearly in time, i.e., the durations of all its elements will be stretched/compressed to meet the requested overall duration.
    • A flag whether the user must be present for a particular state to take place. This ensures that an activity is only scheduled when the user is actually at home when needed.
    • A flag whether the state must run to completion before the state machine will progress to the next step. If this value is set to 'false', a delay of just 5-10 seconds is introduced before moving on to the next state, and the appliance continues to operate in the background.
    • The ID of the device to operate in this state (as per the key specified in the [devices] section of the activity configuration file).
    • The next state(s) into which the appliance operation can move, as well as the probability of the transition there. The state machine will advance into state A with the probability specified, and into the state B with the converse probability. Using the same value for states A and B, or a probability of 1.0 effectively makes this a linear flow with no variation possible. Referring to an undefined state in the state machine will terminate the state machine's flow.

The state model representation also allows for unattended operation of devices (simply set the flag whether it involves the user to false). Similarly, states that involve the user but no appliance (e.g., eating) can be modeled by setting the appliance ID to an undefined value (e.g., 0), but specifying a non-zero duration.

There is no need to add a "start" state, yet an initial state with a duration of 0 seconds can be added for the sake of better readability. Likewise, a state relying on an undefined appliance (e.g., '0') with non-zero duration can act as a delay in-between states.
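The linear time scaling described in the list above can be sketched as follows. This is an illustration under stated assumptions: the function name scale_durations is hypothetical, the target duration is drawn uniformly between the minimum and maximum (the text does not specify the distribution), and an appliance model is reduced to a plain list of element durations in seconds.

```python
import random


def scale_durations(element_durations, min_dur, max_dur, rng=random):
    """Stretch or compress element durations linearly to a target total."""
    if max_dur <= 0:
        # '0' means: keep the appliance model's default durations.
        return list(element_durations)
    target = rng.uniform(min_dur, max_dur)  # assumption: uniform draw
    factor = target / sum(element_durations)
    # Every element is scaled by the same factor, so proportions are preserved.
    return [duration * factor for duration in element_durations]


rng = random.Random(12345)
original = [60, 120, 180]  # hypothetical element durations, 360 s in total
scaled = scale_durations(original, 300, 600, rng)
print(round(sum(scaled), 3), [round(d, 1) for d in scaled])
```

Because a single factor is applied to all elements, the relative shape of the appliance's operation is kept while its overall length changes.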

An example activity model for vacuuming the apartment is shown as follows. It assigns ID 1 to the VACUUMCLEANER appliance, and runs this appliance for as long (or short) as stored in the appliance model. The user must be present both during the start of the activity (involves_user) as well as throughout its operation (run_to_completion). After one room has been vacuumed, the user rests for 5-10 minutes, before vacuuming another room (at 20% probability) or stopping the activity (at 80% probability).

[GENERAL]
name = vacuuming

[devices]
1=VACUUMCLEANER

[sequence]
# state ID, min_dur, max_dur, involves_user, run_to_completion, dev, prob_for_a, state_a, state_b
0=vacuum,         0,       0,          true,              true,   1,        1.0,       1,       1
1=rest,         300,     600,          true,              true,   0,        0.2,       0,       2
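To make the transition rules concrete, the following sketch parses the two [sequence] entries above and performs one random walk through the state machine. It is not ANTgen's implementation: the names parse_state and walk are hypothetical, durations and appliance operation are ignored, and only the order of visited states is reported.

```python
import random

# The two states of the vacuuming example, keyed by their numeric state ID.
SEQUENCE = {
    0: "vacuum,   0,   0, true, true, 1, 1.0, 1, 1",
    1: "rest,   300, 600, true, true, 0, 0.2, 0, 2",
}


def parse_state(value):
    """Split one comma-separated state definition into typed fields."""
    (name, min_dur, max_dur, involves_user, run_to_completion,
     dev, prob_a, state_a, state_b) = (field.strip() for field in value.split(","))
    return {
        "name": name,
        "min_dur": int(min_dur),
        "max_dur": int(max_dur),
        "involves_user": involves_user.lower() == "true",
        "run_to_completion": run_to_completion.lower() == "true",
        "dev": int(dev),
        "prob_a": float(prob_a),
        "state_a": int(state_a),
        "state_b": int(state_b),
    }


def walk(sequence, rng, max_steps=1000):
    """Return the names of the states visited, starting in state 0."""
    states = {state_id: parse_state(value) for state_id, value in sequence.items()}
    current, visited = 0, []
    # Reaching an undefined state ID (here: 2) terminates the flow.
    while current in states and len(visited) < max_steps:
        state = states[current]
        visited.append(state["name"])
        go_to_a = rng.random() < state["prob_a"]
        current = state["state_a"] if go_to_a else state["state_b"]
    return visited


print(walk(SEQUENCE, random.Random(12345)))
```

Every walk starts with vacuum and ends with rest; how many vacuum/rest pairs occur in between depends on the 20% chance of returning to state 0.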

Appliance models

ANTgen uses the AMBAL format for its appliance models (i.e., XML files). A sample set of models is provided in the appliances/ subdirectory of this repository. Newly extracted models can simply be copied into this directory.

User interface

When executed with the -p option, a graphical user interface is brought up after the trace generation has completed. The user interface shows traces for total power demand as well as the demand of power per user, power per activity, and power per appliance. To use this feature, the matplotlib library must be installed.

Clicking on the colored lines in the legend boxes (not their textual labels) toggles the visibility of the individual traces.

Getting started with a little example

The distribution of ANTgen ships with a few user, activity, and appliance models. Run the following command to create a synthetic trace for one user and a constantly running refrigerator, for the duration of 10 days.

python3 antgen.py -m mapping.conf default.conf

During its execution, ANTgen will output some logging information, an excerpt of which is shown as follows. Most of it should be self-explanatory.

           root [I] ANTgen started using 'default.conf' on 14-05-2020 at 14:16:25
           root [I] Output files will be stored in ./output
                ...
      UserModel [I] User model successfully created for 'Household base load' (1 activity)
      UserModel [I] User model successfully created for 'Grumpy Grandma' (5 activities)
           root [I] ********************************************************************************
                ...
      UserModel [I] Generating load signature(s) for activity 1/1 (fridge) for 10 days...
  ActivityModel [I] Synthesis of 'fridge operation' done: 241 scheduled, 5 didn't fit
      UserModel [I] Generating load signature(s) for activity 1/5 (vacuum) for 10 days...
  ActivityModel [I] Synthesis of 'vacuuming' done: 8 runs scheduled
      UserModel [I] Generating load signature(s) for activity 2/5 (dishwashing) for 10 days...
  ActivityModel [I] Synthesis of 'dishwasher operation' done: 1 runs scheduled
      UserModel [I] Generating load signature(s) for activity 3/5 (tv) for 10 days...
  ActivityModel [I] Synthesis of 'watching TV' done: 13 runs scheduled
      UserModel [I] Generating load signature(s) for activity 4/5 (ironing) for 10 days...
  ActivityModel [I] Synthesis of 'ironing clothes' done: 4 runs scheduled
      UserModel [I] Generating load signature(s) for activity 5/5 (laundry) for 10 days...
  ActivityModel [I] Synthesis of 'washing a load of laundry' done: 3 runs scheduled
           root [I] Synthesis completed in 30.926 seconds
                ...
           root [I] ********************************************************************************
           root [I] Trace duration (days)  :             10
           root [I] First weekday          :         friday
           root [I] # active devices       :             12
           root [I] # appliance operations :            553
           root [I] ---------------------------------------
           root [I]    VACUUMCLEANER #runs :              8
           root [I]   WASHINGMACHINE #runs :              3
           root [I]             IRON #runs :              4
           root [I]     REFRIGERATOR #runs :            241
           root [I]       DISHWASHER #runs :              1
           root [I]               TV #runs :             26
           root [I] ---------------------------------------
           root [I] Max. appl. concurrency :              2
           root [I] Random seed            :     1234567890
           root [I] Added noise            :           none
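If the summary at the end of the log is needed for further processing, the per-appliance run counts can be extracted with a regular expression. The sketch below assumes the line layout shown in the excerpt above; the function name appliance_runs is hypothetical, and the layout may differ between ANTgen versions.

```python
import re

# Matches summary lines such as "root [I]    VACUUMCLEANER #runs :    8".
RUNS_LINE = re.compile(r"\[I\]\s+(\w+) #runs\s*:\s*(\d+)\s*$")

LOG_EXCERPT = """\
           root [I] # appliance operations :            553
           root [I] ---------------------------------------
           root [I]    VACUUMCLEANER #runs :              8
           root [I]   WASHINGMACHINE #runs :              3
           root [I]             IRON #runs :              4
           root [I]     REFRIGERATOR #runs :            241
           root [I]       DISHWASHER #runs :              1
           root [I]               TV #runs :             26
           root [I] ---------------------------------------
"""


def appliance_runs(log_text):
    """Map each appliance name to the number of runs reported in the log."""
    runs = {}
    for line in log_text.splitlines():
        match = RUNS_LINE.search(line)
        if match:
            runs[match.group(1)] = int(match.group(2))
    return runs


print(appliance_runs(LOG_EXCERPT))
```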

If you have matplotlib installed, ANTgen can also provide a plot of the synthesis results. Simply invoke it with -p on the command line:

python3 antgen.py -m mapping.conf -p default.conf

Visualization when synthesizing default.conf

ANTgen features two ways to make the output data a little harder to disaggregate. First, noise can be added to the aggregate signal by invoking ANTgen with the -n option. For example, -n G200 will add 200 W of Gaussian noise (with a standard deviation of one tenth of the amplitude, i.e., 20 W) to the aggregate signal. Second, you can use the -a switch to alternate the appliance model used for each activity. While all refrigerator cycles followed the exact same power consumption pattern in the diagram above, a random model for the given appliance will be selected in this case (from the directory appliances/REFRIGERATOR/dev20111228/) for each operation of the refrigerator.

python3 antgen.py -m mapping.conf -n G200 -a -d5 -p default.conf
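The effect of a noise specification such as G200 can be approximated with the sketch below. It rests on assumptions: the amplitude (200 W) is interpreted as the mean of the Gaussian and one tenth of it (20 W) as the standard deviation, as the description above suggests. The function names parse_noise and add_gaussian_noise are hypothetical, and only the G prefix mentioned in the text is handled.

```python
import random


def parse_noise(spec):
    """Split a specification such as 'G200' into (kind, amplitude in watts)."""
    kind, amplitude = spec[0], float(spec[1:])
    if kind != "G":
        raise ValueError(f"only Gaussian noise ('G...') is covered here: {spec!r}")
    return kind, amplitude


def add_gaussian_noise(samples, amplitude, rng):
    """Add Gaussian noise to each power sample of the aggregate signal."""
    sigma = amplitude / 10.0  # standard deviation: one tenth of the amplitude
    # Assumption: the amplitude acts as the mean of the noise distribution.
    return [power + rng.gauss(amplitude, sigma) for power in samples]


_, amplitude = parse_noise("G200")
aggregate = [0.0, 0.0, 1500.0, 1500.0, 80.0]  # hypothetical power samples in W
noisy = add_gaussian_noise(aggregate, amplitude, random.Random(12345))
print([round(value, 1) for value in noisy])
```

With these assumptions, a sample of 0 W ends up around 200 W on average, which obscures low-power appliances in the aggregate signal.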

Visualization when synthesizing default.conf with extended options

Other configuration files (including the ones that were used to create the synthetic data for the aforementioned ACM e-Energy 2020 publication) are located in the testcases/ directory.

Copyright notice

Copyright (C) 2019-2020 Andreas Reinhardt <reinhardt@ieee.org>, TU Clausthal

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.