This is your new Kedro project, which was generated using kedro 0.19.9.
Take a look at the Kedro documentation to get started.
To create a project based on this starter, ensure you have installed Kedro into a virtual environment. Then use the following command:
pip install kedro
kedro new --starter=databricks-iris
After the project is created, navigate to the newly created project directory:
cd <my-project-name> # change directory
Install the required dependencies:
pip install -r requirements.txt
Now you can run the project:
kedro run
In order to get the best out of the template:
- Don't remove any lines from the .gitignore file we provide
- Make sure your results can be reproduced by following a data engineering convention
- Don't commit data to your repository
- Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in conf/local/
Declare any dependencies in requirements.txt for pip installation.
To install them, run:
pip install -r requirements.txt
You can run your Kedro project with:
kedro run
Have a look at the file src/tests/test_run.py for instructions on how to write your tests. You can run your tests as follows:
pytest
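As a rough illustration of the kind of test pytest collects, the sketch below checks a small pure function of the sort a Kedro node might wrap. The function split_rows and its test_ratio parameter are hypothetical and are not part of this project; substitute your own node functions.

```python
# Hypothetical node function and test; neither ships with the starter.
def split_rows(rows, test_ratio):
    """Split a list of rows into (train, test) by position."""
    if not 0 <= test_ratio <= 1:
        raise ValueError("test_ratio must be between 0 and 1")
    n_test = int(len(rows) * test_ratio)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


def test_split_rows_keeps_every_row():
    # pytest discovers functions whose names start with "test_".
    rows = list(range(10))
    train, test = split_rows(rows, test_ratio=0.2)
    assert len(train) == 8
    assert len(test) == 2
    assert train + test == rows
```

Keeping node logic in plain functions like this makes it testable without starting a Kedro session.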
To configure the coverage threshold, look at the .coveragerc file.
To see and update the dependency requirements for your project, use requirements.txt. You can install the project requirements with pip install -r requirements.txt.
Further information about project dependencies
Note: Using kedro jupyter or kedro ipython to run your notebook provides these variables in scope: catalog, context, pipelines and session.
Jupyter, JupyterLab, and IPython are already included in the project requirements by default, so once you have run pip install -r requirements.txt you will not need to take any extra steps before you use them.
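Inside a session started with kedro jupyter or kedro ipython, these variables can be passed to ordinary helper functions. The sketch below is one possible example, not part of the project: summarise_project is a hypothetical helper, and it assumes that catalog.list() returns dataset names and that pipelines maps pipeline names to objects exposing a nodes list, as in Kedro 0.19.

```python
def summarise_project(catalog, pipelines):
    """Return dataset names and node counts per pipeline.

    Hypothetical helper. Assumes catalog.list() returns dataset names and
    that each pipeline object exposes a .nodes list.
    """
    return {
        "datasets": sorted(catalog.list()),
        "node_counts": {name: len(p.nodes) for name, p in pipelines.items()},
    }


# In a `kedro jupyter` or `kedro ipython` session, where `catalog` and
# `pipelines` are already defined, you could call:
#     summarise_project(catalog, pipelines)
```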
To use Jupyter notebooks in your Kedro project, Jupyter must be installed. If it is not already present in your environment, install it:
pip install jupyter
After installing Jupyter, you can start a local notebook server:
kedro jupyter notebook
To use JupyterLab, install it if it is not already present in your environment:
pip install jupyterlab
You can also start JupyterLab:
kedro jupyter lab
And if you want to run an IPython session:
kedro ipython
To automatically strip out all output cell contents before committing to git, you can run kedro activate-nbstripout. This will add a hook in .git/config which will run nbstripout before anything is committed to git.
Note: Your output cells will be retained locally.
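To make concrete what stripping output cells means, here is a minimal sketch using only the Python standard library. It is not how nbstripout is implemented; the function name strip_outputs is hypothetical, and the layout assumed is the nbformat 4 JSON structure (a top-level cells list whose code cells carry outputs and execution_count).

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# A tiny in-memory notebook stands in for a real .ipynb file.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 3,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                    "execution_count": 3,
                }
            ],
        },
    ],
}

clean = strip_outputs(nb)
print(json.dumps(clean["cells"][1], indent=2))
```

The source of each cell is untouched; only the results of running it are dropped, which is why committed notebooks stay small and diff cleanly.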
Further information about building project documentation and packaging your project