
open-dapro is a collection of automated data pipelines for the German energy system built with Dagster 🏡 ☀️


FlorianK13/open-dapro

 
 


energy_dagster

energy_dagster is a collection of automated data pipelines for the German energy system built with Dagster.

Getting started

  1. First, install your Dagster code location as a Python package. With the -e (editable) flag, pip installs the package in "editable mode", so local code changes apply automatically as you develop.
pip install -e ".[dev]"
  2. Next, make sure Docker and Docker Compose are installed. You can check this by running:
docker compose version
  3. Rename the .env.template file to .env and change the credentials if needed. The .env file is excluded from git. Note that these credentials must match the database created by the docker-compose.yml file. The default credentials are:
#
export pwd=postgres
export uid=postgres
export server=localhost
export db=energy_database
export port=5511
export schema=raw
export DAGSTER_HOME=~/.dagster/dagster_home
export DBT_PROFILE_FOLDER=dev
export MASTR_DOWNLOAD_DATE=today
  4. To create the Docker container and initialize the database, run:
python development/initialize.py

Then check that the database is running on the server and port specified in the .env file.

  5. Start the Dagster UI web server:
dagster dev

If the environment variables were loaded successfully, you should see the following line:

dagster - INFO - Loaded environment variables from .env file: pwd,uid,server,db,port,schema, MASTR_DOWNLOAD_DATE
  6. Open http://localhost:3000 in your browser to see the project.
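
To check that the database is reachable after running development/initialize.py, a small script along the following lines may help. It is only a sketch: the function name database_is_reachable is hypothetical and not part of this project, and the script assumes the server and port variables from the .env file are present in the environment (falling back to the defaults listed above). It only tests whether something accepts TCP connections on that port, not whether PostgreSQL itself is healthy or the credentials are valid.

```python
import os
import socket


def database_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Variable names and fallback values follow the default .env entries.
    host = os.environ.get("server", "localhost")
    port = int(os.environ.get("port", "5511"))
    status = "reachable" if database_is_reachable(host, port) else "not reachable"
    print(f"Database at {host}:{port} is {status}")
```

If the script reports that the database is not reachable, it is worth confirming that the container is running and that the port in the .env file matches the one in docker-compose.yml.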

You can start writing your own assets in energy_dagster/assets.py. The assets are automatically loaded into the Dagster code location as you define them.

Development

Adding new Python dependencies

You can specify new Python dependencies in setup.py.

pre-commit hooks

In this project, we use pre-commit hooks to lint the code before committing. The hooks are defined in the .pre-commit-config.yaml file. To install the hooks, run the following command:

pre-commit install

This will install the hooks in your local repository. They will be executed before every commit and check for linting errors using the sqlfluff and black packages.

dbt-osmosis

You can use dbt-osmosis to create, update, and delete dbt property files. The command below uses a Windows-style path; on Linux or macOS, use ./models/marts/ instead:

dbt-osmosis yaml refactor .\models\marts\

Deployment

To run the pipelines in production, run the following command from the root directory of the project (with the Docker Compose v2 plugin, the equivalent is docker compose up --build):

docker-compose up --build

To start the schedulers, enter the docker_webserver container and run the following command:

dagster schedule start --start-all


Languages

  • Python 56.9%
  • SQL 40.9%
  • Dockerfile 1.8%
  • Shell 0.4%