Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add how-to for Python development (virtualenvs, make test) #392

Merged
merged 9 commits into from
Oct 31, 2023
82 changes: 82 additions & 0 deletions source/developers/how-tos/get-ready-for-python-dev.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
How To Get Ready for Python Development
#######################################

.. How-tos should have a short introduction sentence that captures the user's goal and introduces the steps.

This how-to will help prepare your local system to start developing on a Python-based service that is part of the Open edX Platform.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This how-to will help prepare your local system to start developing on a Python-based service that is part of the Open edX Platform.
This how-to will help prepare your local system to start developing on a Python repository that is part of the Open edX Platform.

This will work on Python libraries too, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, libraries too -- will change.


Assumptions
***********

.. This section should contain a bulleted list of assumptions you have of the
person who is following the How-to. The assumptions may link to other
how-tos if possible.

* You have Git installed
* You have Python 3.8 installed

Steps
*****

.. note::

Some repositories will have a variation of these instructions in their README. Often it will be better to follow the instructions here instead, as the README instructions may be outdated, but keep an eye out for repositories that document special workflows.
Copy link
Member

@kdmccormick kdmccormick Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying here, but from the perspective of a new Python dev who has no idea what special workflows are (which IMO is the audience of this doc), this note amounts to saying:

If these instructions conflict with the repository's instructions, then follow these instructions, except in the cases where you shouldn't, and in those cases you should follow repository's instructions.

which I think is more confusing and disheartening than it is helpful 😛

edit: Since I didn't make a clear suggestion--I suggest either taking a firmer stance or nixing the note.

Copy link
Contributor Author

@timmc-edx timmc-edx Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's tough to figure out how to say "a lot of those READMEs are wrong but some of them are not" in a useful way that, frankly, isn't too embarrassing. Maybe this:

These instructions will not work in all repositories. If they do not, check the repo's README for a Getting Started section for possible alternative instructions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually really like this wording!


First-Time Setup
================

First, clone the repository with ``git clone <git_url>``. It's best to place all of your Open edX repositories in the same parent directory as each other, as a few of them assume that they will be arranged this way. (If you're not using devstack, this may not matter.)

After navigating to the new directory in your shell, you'll need to set up a virtualenv and install requirements.

A virtualenv contains a copy of your repo's Python dependencies. This avoids dependency version conflicts with your system packages and other repos.

The most basic way of creating a virtualenv is to use the builtin ``venv`` module in Python::

python3.8 -m venv .venv-3.8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of understand why the -3.8 suffix is there, but these instructions will only involve one venv, so maybe we can keep it simpler by skipping the suffix.


This example creates a virtualenv at ``./.venv-3.8/``. Each time you will be working with Python in the repo, you'll need to activate the virtualenv for that terminal session::

source .venv-3.8/bin/activate

Once you have done this, any calls to ``python``, ``pip``, or other Python executables will refer to the ones managed by the virtualenv.

.. note::

If you create virtualenvs inside repos, you'll need to tell git to ignore them. The easiest way to do this is to create a file called ``.gitignore-global`` in your home directory and add the line ``.venv-3.*/``. Alternatively, you can create the virtualenvs elsewhere in your filesystem.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some (but of course not all) repos already mention venv in their .gitignore files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python cookiecutter doesn't mention it :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd proposed that change but closed it since it seemed like I was the only one having that problem: openedx/edx-cookiecutters#79

@nedbat if you want to reintroduce that change again I'll give it a 👍🏻


Many developers use wrapper scripts (or write their own using shell aliases). One commonly used tool is ``virtualenvwrapper``, which manages the virtualenv directories outside of repositories; this avoids several issues with git and other tools being able to see the virtualenv, but will require explicitly naming each virtualenv.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make this a .. note:: directive and add a link to the virtualenv wrapper docs: https://virtualenvwrapper.readthedocs.io/en/stable/


Working on a Repo
=================

#. Make sure you are on the default branch and have the latest version of the code::

git checkout main
git pull

Some repos instead have ``master`` as their default branch.

#. Activate your virtualenv, as described above.
#. Install or update the Python requirements for the repo into your virtualenv::

make requirements

In a few repos, this might also install NodeJS requirements, or the Makefile target might have a different name.

#. Optionally, ensure you can run tests::

make test

Depending on the repository this might run not just unit tests, but also linters and other quality checks. The Makefile may also define a more comprehensive ``test-all`` target.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quality is often a separate target and worth mentioning

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a huge amount of variation in what the various repos include. Maybe I should just make a list of targets that may or may not be relevant? I think those would be test, test-all, quality, validate, docs, diff_cover, coverage. Some even have targets for isort, pycodestyle, and pylint. I think in at least one, you're just supposed to run tox directly. (In some repos that only runs pytest, and other places it runs all the checks.)

It does seem like make quality is worth mentioning. Maybe nothing beyond that, and let CI checks take care of the rest?

My longer term goal is to homogenize the Makefile targets we use for Python. Writing this how-to really illustrates some of the current problems.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I wouldn't make the list, that gets confusing. Make test and make quality work pretty often and running both of those will get you a long way to passing your PR checks.


#. Make a branch for your changes::

git checkout -b <your_github_username>/<short_descriptive_label>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we recommend git switch -c <github_username>/<short_label>? switch is a newer command that seems like a better user experience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I have been forcing myself to use switch and restore lately, and can confirm that they just make more sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never used it, and the man page has a big warning:

THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.

...but I'm guessing the basic behavior will remain unchanged? It looks like the main benefit is that you don't have to do git stash; git checkout ...; git stash pop when you're on the wrong branch at first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize there was a stash different with switch. To me, the main advantage is that it's conceptually clearer. I wish they would remove the EXPERIMENTAL warning. There isn't that much behavior to change in the first place.


#. As you change code and add tests, you can use ``make test`` to check if tests are still passing.
#. Run ``make test`` one more time and commit your changes with ``git commit``. Follow the `conventional commits`_ documentation. Make sure your commit message is informative and describes why the change is being made. While the first line of the message should be terse, the body of the message has plenty of room for details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This skips git add.

You could spell out git add as a step. Alternatively, you could be purposefully vague, and let the developer find a separate resource for learning git basics.

Suggested change
#. Run ``make test`` one more time and commit your changes with ``git commit``. Follow the `conventional commits`_ documentation. Make sure your commit message is informative and describes why the change is being made. While the first line of the message should be terse, the body of the message has plenty of room for details.
#. Run ``make test`` one more time and then commit your changes. Follow the `conventional commits`_ documentation. Make sure your commit message is informative and describes why the change is being made. While the first line of the message should be terse, the body of the message has plenty of room for details.

#. Push your changes to GitHub. If you have write access to the repository, you can run ``git push``. If not, you'll need to fork the repository in GitHub. After forking, you can use ``git remote add fork <fork_url>`` to tell git about the fork, and then ``git push fork`` to push your branch to your fork.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you go through the steps of creating a branch then realize you need a fork, is it annoying at all? Or does it Just Work®?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should just work. Unless you check out a remote branch (creating a local one) or use -u/--set-upstream on a local branch, git doesn't have any particular opinion about what remote the branch goes with.

(I usually would use git push -u fork here, so that the branch always pushes to the same remote. Maybe I should include the -u option here?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I wonder if it makes sense to instruct people to make a fork first

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think fork-by-default is a reasonable strategy that aligns with most open source projects on GitHub. I'm an org owner and I use forks for almost everything.

Only maintainers and release managers really need to push new branches upstream, and folks in those roles probably don't need a guide to do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, I'll separate it out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen people using fork as the name of their personal forks. More common is origin for your fork, and upstream for the upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen both patterns. origin/upstream weirds me out because the upstream is the origin -- they more or less mean the same thing! Frankly, I use both, depending on whether I've initially cloned the authoritative repo or my fork. I'm too lazy to rename the remotes. 🙃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the PR to tell people to use a fork. It doesn't tell people how to update their fork, but that might be out of scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is out of scope, but you could consider linking this: https://docs.openedx.org/en/latest/developers/quickstarts/first_openedx_pr.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Tim, looks good.

It doesn't tell people how to update their fork, but that might be out of scope.

I think that's fair. This doc probably isn't the right place for a comprehensive "how to use git with multiple remotes" guide.

#. In GitHub, open a pull request for your changes and ask for a review.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#. In GitHub, open a pull request for your changes and ask for a review.
#. In GitHub, open a pull request (PR). In the PR description, include anything that could help reviewers understand and test your change.
#. If you have a repository maintainer in mind, you can reach out to them and ask for review. Otherwise, an Open edX project manager will triage your PR and find someone to review it. This process can take some time; we appreciate your patience!

"Ask for review" only works for folks who know maintainers/CCs. Without getting too deep into "How to make a PR", here are a couple items that could clue them into the next steps.


.. _conventional commits: https://open-edx-proposals.readthedocs.io/en/latest/best-practices/oep-0051-bp-conventional-commits.html