EnData

An open source project from Data to AI Lab at MIT.

A library for generative modeling and evaluation of synthetic household-level electricity load timeseries. This package is still under active development.

Documentation: (tbd)

Overview

EnData is a library built for generating synthetic household-level electric load and generation timeseries. EnData supports a variety of generative time series models that can be used to train a time series data generator from scratch on a user-defined dataset. Additionally, EnData provides functionality for loading pre-trained model checkpoints that can be used to generate data instantly. Trained models can be evaluated using a series of metrics and visualizations also implemented here.

These supported models include:

Feel free to look at our tutorial notebooks to get started.

Install

Requirements

EnData has been developed and tested on Python 3.8, 3.9 and 3.10.

Although it is not strictly required, we highly recommend using a virtualenv to avoid interfering with other software installed on the system where EnData runs.

These are the minimum commands needed to create a virtualenv using python3.8 for EnData:

pip install virtualenv
virtualenv -p $(which python3.8) EnData-venv

Afterwards, you have to execute this command to activate the virtualenv:

source EnData-venv/bin/activate

Remember to execute it every time you start a new console to work on EnData!
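To confirm that the activated virtualenv runs one of the tested interpreters, a small check like the following can help. This is only a sketch: the helper name is_tested_python is hypothetical and not part of EnData, and the version set simply mirrors the versions listed above.

```python
import sys

# Versions the README lists as developed and tested (assumption: minor versions only).
TESTED_VERSIONS = {(3, 8), (3, 9), (3, 10)}


def is_tested_python(version_info=sys.version_info):
    """Return True if the major.minor version is one EnData was tested on.

    Hypothetical helper for illustration; it is not part of the EnData package.
    """
    return (version_info[0], version_info[1]) in TESTED_VERSIONS


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if is_tested_python():
        print(f"Python {current} is one of the tested versions.")
    else:
        print(f"Python {current} is not among the tested versions (3.8, 3.9, 3.10).")
```

Running this script inside the activated virtualenv reports whether the interpreter matches the tested range. An untested version may still work, but it has not been verified.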

Install from PyPI

After creating the virtualenv and activating it, we recommend using pip in order to install EnData:

pip install EnData

This will pull and install the latest stable release from PyPI.

Install from source

With your virtualenv activated, you can clone the repository and install it from source by running make install on the stable branch:

git clone git@github.com:michael-fuest/EnData.git
cd EnData
git checkout stable
make install

Install for Development

If you want to contribute to the project, a few more steps are required to make the project ready for development.

Please head to the Contributing Guide for more details about this process.

Quickstart

This short tutorial walks you through the steps needed to get started with EnData.

Generating Data

To get started, define a DataGenerator and specify the name of the model you would like to use.

generator = DataGenerator(model_name="diffusion_ts")

We provide pre-trained model checkpoints that were trained on the PecanStreet Dataport dataset. You can use these checkpoints to load a trained model. The first step is to assign the DataGenerator a TimeSeriesDataset instance. We are using the PecanStreetDataset class here, which is an extension of TimeSeriesDataset.

dataset = PecanStreetDataset()
generator.set_dataset(dataset)

Once a dataset has been assigned, we can load a pre-trained model for that dataset as follows:

generator.load_model()

These pre-trained models are conditional models, meaning they require a set of conditioning variables to generate synthetic time series data. If you want to generate data for a random set of conditioning variables, you can do so as follows:

conditioning_variables = generator.sample_random_conditioning_variables()
synthetic_data = generator.generate(conditioning_variables)
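The steps above can be collected into a single function. The sketch below uses only the calls shown in this section. Because the import paths for DataGenerator and PecanStreetDataset are not stated here, the two classes are passed in as arguments instead of being imported. The function name generate_random_sample is hypothetical.

```python
def generate_random_sample(generator_cls, dataset_cls, model_name="diffusion_ts"):
    """Run the quickstart steps once and return the synthetic data.

    generator_cls and dataset_cls are expected to be DataGenerator and
    PecanStreetDataset (or another TimeSeriesDataset subclass). They are
    parameters because their import paths are not given in this README.
    """
    # 1. Create the generator for the chosen model.
    generator = generator_cls(model_name=model_name)

    # 2. Assign a dataset, then load the pre-trained checkpoint for it.
    generator.set_dataset(dataset_cls())
    generator.load_model()

    # 3. Sample random conditioning variables and generate data from them.
    conditioning_variables = generator.sample_random_conditioning_variables()
    return generator.generate(conditioning_variables)
```

With the two classes imported from the package, a call such as generate_random_sample(DataGenerator, PecanStreetDataset) would be equivalent to running the snippets above in order.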

For a more in-depth tutorial, please refer to the tutorial notebooks in the tutorials directory.

Datasets

If you want to reproduce our models from scratch, you will need to download the PecanStreet Dataport dataset and place it under the path specified in your data_config.yaml. Specifically, you will need the following files:

15minute_data_austin.csv
15minute_data_california.csv
15minute_data_newyork.csv
metadata.csv

If you want to train models using the Open Power Systems dataset, you will need to download the following file:

household_data_15min_singleindex.csv

and again place it under the path specified in data_config.yaml.
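Before training, it can be useful to verify that the required files are in place. The following is a minimal sketch: the file names are the ones listed above, while the helper missing_files and the example directory data/pecanstreet are hypothetical. The actual directory is whatever path your data_config.yaml specifies.

```python
from pathlib import Path

# File names as listed in this README.
PECAN_STREET_FILES = [
    "15minute_data_austin.csv",
    "15minute_data_california.csv",
    "15minute_data_newyork.csv",
    "metadata.csv",
]
OPEN_POWER_SYSTEMS_FILES = ["household_data_15min_singleindex.csv"]


def missing_files(data_dir, required):
    """Return the names in `required` that are not regular files in `data_dir`.

    Hypothetical helper for illustration; it is not part of EnData.
    """
    root = Path(data_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "data/pecanstreet" is a placeholder; use the path from data_config.yaml.
    missing = missing_files("data/pecanstreet", PECAN_STREET_FILES)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All PecanStreet files found.")
```

The same helper can be called with OPEN_POWER_SYSTEMS_FILES to check the Open Power Systems download.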

What's next?

For more details about EnData and all its possibilities and features, please check the documentation site.

New models, new evaluation functionality and new datasets coming soon!

