energy_dagster is a collection of automated data pipelines for the German energy system built with Dagster.
- First, install your Dagster code location as a Python package. By using the -e (editable) flag, pip will install your Python package in "editable mode" so that as you develop, local code changes will automatically apply.
pip install -e ".[dev]"
- Next, make sure you have Docker and Docker Compose installed. You can check this by running:
docker compose version
- You now need to rename the .env.template file to .env and change the credentials if needed. The .env file will not be uploaded to git. Note that these credentials have to match the database created with the docker-compose.yml file. The default credentials are:
#
export pwd = postgres
export uid = postgres
export server = localhost
export db = energy_database
export port = 5511
export schema = raw
export DAGSTER_HOME = ~/.dagster/dagster_home
export DBT_PROFILE_FOLDER = dev
export MASTR_DOWNLOAD_DATE = today
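To illustrate how these values fit together, here is a minimal sketch that assembles a database connection URL from them. It assumes the database is PostgreSQL (suggested by the default credentials, but not stated explicitly here); the function name build_postgres_url is hypothetical and not part of the project.

```python
from urllib.parse import quote_plus


def build_postgres_url(env):
    """Assemble a PostgreSQL connection URL from the .env variable names.

    Assumption: the database is PostgreSQL. `env` is any mapping that
    contains the keys uid, pwd, server, port and db.
    """
    user = quote_plus(env["uid"])      # escape special characters in the user name
    password = quote_plus(env["pwd"])  # escape special characters in the password
    host = env["server"]
    port = int(env["port"])            # fails early if the port is not a number
    database = env["db"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# The default credentials listed above
defaults = {
    "uid": "postgres",
    "pwd": "postgres",
    "server": "localhost",
    "port": "5511",
    "db": "energy_database",
}

print(build_postgres_url(defaults))
# prints: postgresql://postgres:postgres@localhost:5511/energy_database
```

Passing os.environ instead of the defaults dictionary would use the values loaded into your shell or by Dagster.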
- To initialize the database and to create the docker container, run:
python development/initialize.py
Check if the database is running on the server and port specified in the .env file.
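One way to perform this check is a plain TCP connection test. The following is a rough sketch using only the Python standard library; the helper name is_port_open is hypothetical, and the host and port in the usage note are the defaults from the .env template. A successful connection only shows that something is listening on that port, not that it is the expected database.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors
        return False
```

With the default credentials you would call is_port_open("localhost", 5511).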
- Start the Dagster UI web server:
dagster dev
If the environment variables were loaded successfully, you should see the following line:
dagster - INFO - Loaded environment variables from .env file: pwd,uid,server,db,port,schema, MASTR_DOWNLOAD_DATE
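If that line is missing or incomplete, it can help to check which variables are actually set. The sketch below is only an illustration: the helper name missing_env_vars is hypothetical, and the list of required names is taken from the log line above.

```python
import os

# Variable names as they appear in the Dagster log line above
REQUIRED_VARS = ["pwd", "uid", "server", "db", "port", "schema", "MASTR_DOWNLOAD_DATE"]


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    if env is None:
        env = os.environ  # default to the current process environment
    return [name for name in required if not env.get(name)]
```

Calling missing_env_vars() in a Python session started from the same shell returns an empty list when every variable is set there.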
- Open http://localhost:3000 with your browser to see the project.
You can start writing your own assets in energy_dagster/assets.py. The assets are automatically loaded into the Dagster code location as you define them.
You can specify new Python dependencies in setup.py.
In this project, we use pre-commit hooks to lint the code before committing. The hooks are defined in the .pre-commit-config.yaml file. To install the hooks, run the following command:
pre-commit install
This will install the hooks in your local repository. They will be executed before every commit and check for linting errors using the sqlfluff and black packages.
You can use dbt-osmosis for creating, updating, and deleting dbt property files.
This can be done using the following command:
dbt-osmosis yaml refactor .\models\marts\
To run the pipelines in production, you can use the following command from the root directory of the project:
docker-compose up --build
To start the schedulers, enter the docker_webserver container and run the following command:
dagster schedule start --start-all