Development Guide

For those of you who are interested in contributing code to the project: many previous contributors have wrestled with a variety of ways of getting set up. While we can't cover every last configuration, we can cover some of the more common cases. Here they are for your benefit!

Development Containers with VSCode

As of 29 May 2020, development containers are supported! This is the preferred way to get up and running, because it gives you a pre-built, clean development environment that makes no assumptions about your own system, and a uniform setup is much easier for the maintainers to debug. You don't have to wrestle with conda wait times if you don't want to!

To get started:

Fork the repository.
Ensure you have Docker running on your local machine.
Ensure you have VSCode running on your local machine.
In VSCode, install the extension called Remote - Containers.
In VSCode, click on the quick actions Status Bar item in the lower left corner.
Then select "Remote Containers: Clone Repository In Container Volume".
Enter the URL of your fork of pyjanitor.

VSCode will pull down the prebuilt Docker container, git clone the repository for you inside an isolated Docker volume, and mount the repository directory inside your Docker container.

Follow best practices by making a feature branch for your changes. Now, hack away, and submit your pull request!

You won't be able to access the cloned repo on your local hard drive, because it lives inside an isolated Docker volume. If you do want local access, clone the repo locally first and then select "Remote Containers: Open Folder In Container".

If you find something is broken because a utility is missing in the container, submit a PR with the appropriate build command inserted in the Dockerfile. Care has been taken to document what each step does, so please read the in-line documentation in the Dockerfile carefully.

Manual Setup

Fork the repository

Begin by forking the pyjanitor repo on GitHub. Then, clone your fork locally:

git clone git@github.com:<your_github_username>/pyjanitor.git

Set up the conda environment

Now, install your cloned repo into a conda environment. Assuming you have conda installed, this is how you set up your fork for local development:

cd pyjanitor/
# Create your conda environment (this must happen before activating it)
conda env create -f environment-dev.yml

# Activate the pyjanitor conda environment
# (on older conda versions: source activate pyjanitor-dev)
conda activate pyjanitor-dev

# Install PyJanitor in development mode
python setup.py develop

# Register current virtual environment as a Jupyter Python kernel
python -m ipykernel install --user --name pyjanitor-dev --display-name "PyJanitor development"

If you plan to write any notebooks, make sure they run correctly inside the environment by selecting the correct kernel from the top right corner of JupyterLab!
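A quick way to confirm that a terminal session or notebook kernel is really using the development environment is to inspect the running interpreter. The sketch below is only an illustration: the helper name `describe_environment` is hypothetical, it assumes the package is importable as `janitor`, and it relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation (a notebook kernel may not inherit it, so treat a missing value as inconclusive).

```python
import importlib.util
import os
import sys


def describe_environment(package="janitor"):
    """Collect a few facts about the interpreter that is currently running."""
    return {
        # Path of the Python executable behind this session or kernel.
        "python": sys.executable,
        # Name of the active conda environment, if conda exported one.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # True when the package can be found on the import path.
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `python` points inside the pyjanitor-dev environment and `package_found` is true, the development install is most likely the one being used.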

!!! note "PyCharm Users"
    For PyCharm users, here are some [instructions](PYCHARM_USERS.html) to get your Conda environment set up.

Install the pre-commit hooks

pre-commit hooks are available to run code formatting checks automagically before git commits happen. If you did not have these installed before, run the following commands:

# Update your environment to install pre-commit
conda env update -f environment-dev.yml
# Install pre-commit hooks
pre-commit install

Build docs locally

You should also be able to preview the docs locally. To do this, run the following from the main pyjanitor directory:

python -m mkdocs serve

The command above allows you to view the documentation locally in your browser.

If you get any errors about importing modules when running mkdocs serve, first activate the development environment:

source activate pyjanitor-dev || conda activate pyjanitor-dev

Plan out the change you'd like to contribute

The old adage rings true:

failing to plan means planning to fail.

We'd encourage you to flesh out the idea you'd like to contribute on the GitHub issue tracker before embarking on a contribution. New code in particular needs more consideration from the maintainers. After all, any new code submitted introduces a new maintenance burden, unless you, the contributor, would like to join the maintainers team!

To kickstart the discussion, submit an issue to the pyjanitor GitHub issue tracker describing your planned changes. The issue tracker also helps us keep track of who is working on what.

Create a branch for local development

New contributions to pyjanitor should be done in a new branch that you have based off the latest version of the dev branch.

To create a new branch:

git checkout -b <name-of-your-bugfix-or-feature> dev

Now you can make your changes locally.

Check your environment

To ensure that your environment is properly set up, run the following command:

python -m pytest -m "not turtle"

If all tests pass, then your environment is set up for development and you are ready to contribute 🥳.

Check your code

When you're done making changes, commit your staged files with a meaningful message. Automated checks run before code is committed (via pre-commit), and GitHub Actions runs the tests before code can be merged. You can still run the following commands manually to check that your changes are properly formatted and that all tests still pass.

To do so:

Run python -m flake8 --exclude nbconvert_config.py janitor to check for code styling problems.
Run python -m black --config pyproject.toml . (the trailing dot is the directory to format) to format your code.
Run python -m interrogate -c pyproject.toml to check your code for missing docstrings.
Run darglint -v 2 to check the quality of your docstrings.
Run python -m pytest to run all unit tests.
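As an illustration of what the two docstring checks in the list above look for: interrogate reports code that has no docstring at all, while darglint compares what a docstring documents against the function's actual signature. The function below is a hypothetical example, not part of pyjanitor, and it assumes Sphinx-style `:param:` fields. The style darglint actually enforces depends on the project's configuration.

```python
from typing import List


def strip_names(names: List[str], chars: str = " ") -> List[str]:
    """Strip leading and trailing characters from each name.

    :param names: Column names to clean.
    :param chars: Characters to remove from both ends of each name.
    :returns: A new list with the cleaned names.
    :raises ValueError: If ``chars`` is empty.
    """
    if not chars:
        raise ValueError("chars must not be empty")
    # str.strip removes any of the given characters from both ends.
    return [name.strip(chars) for name in names]
```

Every parameter, the return value, and the raised exception are documented, which is the kind of agreement between signature and docstring that these tools check.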

!!! tip
    You can run python -m pytest -m "not turtle" to run the fast tests.

!!! note "Running tests locally"
    When you run tests locally, the tests in chemistry.py, biology.py, and spark.py are automatically skipped if you don't have the optional dependencies (e.g. rdkit) installed.

!!! info
    * pre-commit does not run your tests locally; rather, all tests are run in continuous integration (CI).
    * All tests must pass in CI before the pull request is accepted, and the continuous integration system on GitHub Actions runs all of the tests before changes are merged into the repository.
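If you run the manual checks from this section often, they can be chained in a small script. The following is a minimal sketch rather than a project-provided tool: the names `CHECKS` and `run_checks` are hypothetical, the commands are copied from the list in this section, and black is left out because it rewrites files instead of only reporting problems.

```python
import subprocess
import sys

# Read-only checks copied from this section. The last entry runs only the
# fast tests; drop the "-m", "not turtle" arguments to run everything.
CHECKS = [
    [sys.executable, "-m", "flake8",
     "--exclude", "nbconvert_config.py", "janitor"],
    [sys.executable, "-m", "interrogate", "-c", "pyproject.toml"],
    ["darglint", "-v", "2"],
    [sys.executable, "-m", "pytest", "-m", "not turtle"],
]


def run_checks(checks=CHECKS, runner=subprocess.call):
    """Run each command in order and return the first non-zero exit code.

    ``runner`` is injectable so the function can be exercised without the
    tools installed; by default it is ``subprocess.call``.
    """
    for command in checks:
        print("Running:", " ".join(command))
        exit_code = runner(command)
        if exit_code != 0:
            # Stop at the first failing check so its output stays visible.
            return exit_code
    return 0
```

Calling `run_checks()` from the repository root, inside the activated development environment, runs the checks one after another and stops at the first failure.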

Commit your changes

Now you can commit your changes and push your branch to GitHub:

git add .
git commit -m "Your detailed description of your changes."
git push origin <name-of-your-bugfix-or-feature>

Submit a pull request through the GitHub website

Congratulations 🎉🎉🎉, you've made it to the penultimate step; your code is ready to be checked and reviewed by the maintainers! Head over to the GitHub website and create a pull request. When you are picking out which branch to merge into, a.k.a. the target branch, be sure to select dev (not master).

Fix any remaining issues

It's rare, but you might at this point still encounter issues, as the continuous integration (CI) system on GitHub Actions checks your code. Some of these might not be your fault; rather, it might well be the case that your code fell a little bit out of date as others' pull requests are merged into the repository.

In any case, if there are any issues, the pipeline will fail. We check for code style, docstring coverage, test coverage, and doc discovery. If you're comfortable looking at the pipeline logs, feel free to do so; they are open to all to view. Otherwise, one of the dev team members can help you with reviewing the code checks.

Code Compatibility

pyjanitor supports Python 3.6+, so all contributed code must maintain this compatibility.
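To make the compatibility requirement concrete, the sketch below sticks to syntax that already exists in Python 3.6. The functions are hypothetical and only illustrate the pattern: type hints from `typing` and variable annotations are fine, while newer constructs such as the walrus operator (3.8), positional-only parameters (3.8), or built-in generics like `list[str]` (3.9) would fail on older interpreters.

```python
from typing import Dict, List, Optional


def count_prefixes(names: List[str], sep: str = "_") -> Dict[str, int]:
    """Count how many names share each prefix before ``sep``."""
    counts: Dict[str, int] = {}  # variable annotations exist since 3.6
    for name in names:
        prefix = name.split(sep)[0]
        counts[prefix] = counts.get(prefix, 0) + 1
    return counts


def most_common_prefix(names: List[str]) -> Optional[str]:
    """Return the most frequent prefix, or None for an empty list."""
    counts = count_prefixes(names)
    if not counts:
        return None
    # Written as separate statements instead of using the walrus
    # operator, which requires Python 3.8.
    best = max(counts, key=counts.get)
    return best
```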

Tips

To run a subset of tests, pass pytest the path of a test file or directory:

python -m pytest tests/<path-to-test-file-or-directory>
