- PEP 8 compliant
- scheduled ETL processes
- integration with FHIR-compatible databases/APIs
- data validation
- logging for better monitoring and observability
- error handling
- parallelization-capable
- incremental-loading (WIP)
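As an illustration of how the validation, logging, and parallelization features above might fit together, here is a minimal sketch. It is not the project's actual code: `REQUIRED_FIELDS`, `validate_resource`, `transform`, and `process_all` are hypothetical names, and the validation rule (every resource needs `resourceType` and `id`) is an assumption.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl")

# Hypothetical minimal rule set: fields every resource is assumed to need.
REQUIRED_FIELDS = ("resourceType", "id")


def validate_resource(resource: dict) -> bool:
    """Return True when the resource has every required field."""
    missing = [field for field in REQUIRED_FIELDS if field not in resource]
    if missing:
        # Log and skip instead of aborting the whole run.
        logger.warning("skipping resource, missing fields: %s", missing)
        return False
    return True


def transform(resource: dict) -> dict:
    """Flatten a resource into a row (illustrative columns only)."""
    return {"id": resource["id"], "type": resource["resourceType"]}


def process_all(resources: list[dict], workers: int = 4) -> list[dict]:
    """Validate sequentially, then transform the valid resources in parallel."""
    valid = [resource for resource in resources if validate_resource(resource)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the valid resources.
        return list(pool.map(transform, valid))
```

Threads are used here only to keep the sketch short; CPU-heavy transformations would likely benefit more from a process pool.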
- make sure you have Docker installed
- make sure you have docker-compose installed
- make sure you have Python 3.9 installed
- make sure you have the dependencies in `requirements.txt` installed
- Create a `.env` file in the root directory with the following content:
  - `POSTGRES_HOST`: the host of the PostgreSQL database
  - `POSTGRES_PORT`: the port of the PostgreSQL database
  - `POSTGRES_DB`: the name of the PostgreSQL database
  - `POSTGRES_USER`: the username of the PostgreSQL database
  - `POSTGRES_PASSWORD`: the password of the PostgreSQL database
example:
POSTGRES_HOST={your-host}
POSTGRES_DB={your-db}
POSTGRES_USER={your-user}
POSTGRES_PASSWORD={your-password}
POSTGRES_PORT={your-port}
- unzip `data/patients_fhir_100.zip`
- run `docker build -t mendel/app .` to build the app image
- run `docker-compose up --build` to build and run the container
- that's it! you're all set.
- you can use `psql -h localhost -d mendel -U mendel -p 5432` in the database container terminal to connect to the database, then run `\dt` or `select * from information_schema.tables;` to list the tables
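For reference, the `POSTGRES_*` variables from the `.env` file could be turned into a connection string roughly like this. This is a sketch under assumptions: `load_db_settings` and `build_dsn` are hypothetical helpers, not functions from this project, and the variables are assumed to already be present in the process environment (for example, passed in by docker-compose).

```python
import os

# The five variables documented for the .env file.
REQUIRED_VARS = (
    "POSTGRES_HOST",
    "POSTGRES_PORT",
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
)


def load_db_settings(env=None) -> dict:
    """Collect the database settings, failing early if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def build_dsn(settings: dict) -> str:
    """Build a libpq-style keyword/value connection string.

    Values containing spaces or quotes would need escaping; this sketch
    assumes simple values.
    """
    return (
        "host={POSTGRES_HOST} port={POSTGRES_PORT} dbname={POSTGRES_DB} "
        "user={POSTGRES_USER} password={POSTGRES_PASSWORD}"
    ).format(**settings)
```

Checking for missing variables up front gives a clear error message at startup instead of a connection failure later in the run.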
- ...
pip install -r requirements.txt
- run `python scheduler.py` to run the scheduler; you can select the interval and time from `main.py`
- run `python main.py` to run the whole project with the scheduled ETL processes
- run `python processor.py` to run the data extraction and transformation logic and update the CSV files in `out/`
- run `python loader.py` to run the data loading logic and upload the CSV files to the database
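To give a rough idea of what the extraction and transformation step does conceptually, here is a minimal sketch that flattens `Patient` resources from a FHIR Bundle into a CSV file. It is not the code in `processor.py`: `extract_patients`, `write_csv`, and the column subset in `PATIENT_COLUMNS` are hypothetical and chosen only for illustration.

```python
import csv
from pathlib import Path

# Illustrative subset of Patient fields; the real project may export more.
PATIENT_COLUMNS = ["id", "gender", "birthDate"]


def extract_patients(bundle: dict) -> list[dict]:
    """Pull Patient resources out of a FHIR Bundle and flatten them into rows."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") == "Patient":
            # Missing fields become empty strings so every row has all columns.
            rows.append({col: resource.get(col, "") for col in PATIENT_COLUMNS})
    return rows


def write_csv(rows: list[dict], path: Path) -> None:
    """Write the rows to a CSV file, creating the output directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=PATIENT_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as `write_csv(extract_patients(bundle), Path("out/patients.csv"))` would then produce a file that a loading step could pick up; the file name here is an assumption.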