cachi2 wheels: documentation #517

Merged
merged 2 commits into from
Apr 17, 2024
File renamed without changes.
7 changes: 3 additions & 4 deletions README.md
@@ -26,7 +26,6 @@ The primary intended use of Cachi2's outputs is for network-isolated container b
Please note that Cachi2 is rather picky, aiming to:

* encourage or enforce best practices
* enforce building from source - no pre-built artifacts, such as Python [wheels][wheel-spec]
* never execute arbitrary code - looking at you [setup.py (discouraged)][setuppy-discouraged]
* keep the implementation simple

@@ -210,7 +209,7 @@ make test-unit
```

For finer control over which tests get executed, e.g. to run all tests in a specific file, activate
the [virtualenv](#virtualenv) and run:
the [virtualenv](#virtual-environment) and run:

```shell
tox -e py39 -- tests/unit/test_cli.py
@@ -345,8 +344,7 @@ files present in the source repository and downloading the declared dependencies
The files must be lockfiles, i.e. declare all the transitive dependencies and pin them to specific versions. Generating
such a lockfile is best done using tools like [pip-compile](https://pip-tools.readthedocs.io/en/stable/).

Note that, per the Cachi2 [goals](#goals), we download only source distributions. This means pip will need to rebuild
all the dependencies from source, which makes the build process more complex than you might expect.
We support the source distribution file format ([sdist][sdist-spec]) as well as the binary distribution file format ([wheel][wheel-spec]).

See [docs/pip.md](docs/pip.md) for more details.

@@ -389,6 +387,7 @@ still in early development phase.
[cachi2-container]: https://quay.io/repository/redhat-appstudio/cachi2
[cachi2-container-status]: https://quay.io/repository/redhat-appstudio/cachi2/status
[cachi2-releases]: https://github.com/containerbuildsystem/cachi2/releases
[sdist-spec]: https://packaging.python.org/en/latest/specifications/source-distribution-format/
[wheel-spec]: https://packaging.python.org/en/latest/specifications/binary-distribution-format/
[setuppy-discouraged]: https://setuptools.pypa.io/en/latest/userguide/quickstart.html#setuppy-discouraged
[go117-changelog]: https://tip.golang.org/doc/go1.17#go-command
48 changes: 36 additions & 12 deletions docs/pip.md
@@ -6,7 +6,7 @@
* [Specifying packages to process](#specifying-packages-to-process)
* [requirements.txt](#requirementstxt)
* [Project metadata](#project-metadata)
* [Building from source](#building-from-source)
* [Distribution formats](#distribution-formats)
* [Using fetched dependencies](#using-fetched-dependencies)
* [Troubleshooting](#troubleshooting)

@@ -40,12 +40,16 @@ JSON input:
// specify *build* requirements files
// defaults to ["requirements-build.txt"] or [] if the file does not exist
"requirements_build_files": ["requirements-build.txt"],
// option to allow fetching binary distributions (wheels)
// defaults to "false"
"allow_binary": "false",
}
```

*For more info on build requirements, see [Building from source](#building-from-source).*
*For more information on using build requirements and binary distributions, see the
[Distribution formats](#distribution-formats) section.*
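
As a rough illustration of how the options above fit together, the following Python sketch assembles a JSON input string with `allow_binary` set. The option names `requirements_build_files` and `allow_binary`, and the string values `"true"` and `"false"`, come from the example above. The `type` and `path` keys and the helper name `build_pip_input` are assumptions made for this sketch.

```python
import json


def build_pip_input(allow_binary: bool, path: str = ".") -> str:
    """Return a JSON string describing a pip package to pre-fetch."""
    package = {
        "type": "pip",  # assumption: package type key, not shown in this excerpt
        "path": path,   # assumption: location of the package in the repository
        "requirements_build_files": ["requirements-build.txt"],
        # The example above uses the strings "true"/"false", not JSON booleans.
        "allow_binary": "true" if allow_binary else "false",
    }
    return json.dumps(package)


print(build_pip_input(allow_binary=True))
```

The resulting string is the kind of value that could be supplied as the JSON form of the main argument.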

The main argument accepts alternative forms of input, see [usage: pre-fetch-dependencies][usage-prefetch].
The main argument accepts alternative forms of input, see [usage: Pre-fetch dependencies][usage-prefetch].

## requirements.txt

@@ -345,11 +349,11 @@ if __name__ == "__main__":
setup(name=NAME, version=VERSION, ...)
```

## Building from source
## Distribution formats

Python packages typically distribute both the
[binary format](https://packaging.python.org/en/latest/specifications/binary-distribution-format/) (called wheel)
and the [source format](https://packaging.python.org/en/latest/specifications/source-distribution-format/) (sdist).
and the [source format](https://packaging.python.org/en/latest/specifications/source-distribution-format/) (called sdist).

Wheels are much more convenient: they are the pre-built format, so installing from a wheel amounts to unzipping the wheel
and copying the files to the right place.
@@ -358,11 +362,28 @@ Sdists are more difficult to install. Pip must first build a wheel from the sdis
a [PEP 517](https://peps.python.org/pep-0517/) build system. To do that, pip has to install the build system and
its dependencies (defined via [PEP 518](https://peps.python.org/pep-0518/)).

Building from source gives you an important guarantee which using pre-built artifacts does not: what you installed
matches the source code. This can be especially important for Python packages implemented in C or other compiled
languages.
Cachi2 (unlike the older Cachito) can download both wheels and sdists. The `allow_binary` option controls this behavior.
Contributor:

nit: I'm not sure we need to mention cachito here, but it's just a suggestion. I'm OK with it either way

Member Author:

I'm OK either way, too, but Cachito is mentioned in multiple places in the docs


### requirements-build.txt
* `"allow_binary": "true"` - download both wheels and sdists
* `"allow_binary": "false"` - download only sdists (default)

*Note: Cachi2 currently downloads one sdist and all the available wheels per
dependency (wheels are not filtered by platform or Python version).*
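
The selection behavior described above can be sketched in Python as follows. This is only an illustration of the documented rule (one sdist, plus every wheel when `allow_binary` is enabled), not Cachi2's actual implementation. The function name `select_distributions` and the `demo` filenames are hypothetical.

```python
def select_distributions(filenames: list[str], allow_binary: bool) -> list[str]:
    """Pick the files to download for one dependency (illustrative only)."""
    sdists = [f for f in filenames if f.endswith((".tar.gz", ".zip"))]
    wheels = [f for f in filenames if f.endswith(".whl")]
    selected = sdists[:1]   # at most one sdist per dependency
    if allow_binary:
        selected += wheels  # every wheel, no platform or Python version filtering
    return selected


# Hypothetical files published for a single dependency.
files = [
    "demo-1.0.tar.gz",
    "demo-1.0-py3-none-any.whl",
    "demo-1.0-cp311-cp311-manylinux_2_17_x86_64.whl",
]
print(select_distributions(files, allow_binary=False))
print(select_distributions(files, allow_binary=True))
```

With `allow_binary` disabled only the sdist is selected. With it enabled, the sdist and both wheels are selected, which is why the download grows quickly for dependencies that publish many platform-specific wheels.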

### Building with wheels

Pre-fetching and building with wheels is much easier and faster than pre-fetching and building from source, even though the wheels are not filtered.
However, downloading all the wheels naturally results in a much larger overall download size.
Based on sample testing, wheels plus sdists are approximately 5x to 15x larger than the sdists alone.
When building with wheels, there is no need to handle build dependencies via requirements-build.txt.
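
To get a feel for the size trade-off, here is a small back-of-the-envelope calculation in Python. The 5x and 15x factors are the range quoted above. The 200 MB sdist-only figure and the function name `estimate_size_with_wheels` are made up for illustration.

```python
def estimate_size_with_wheels(sdist_size_mb: float) -> tuple[float, float]:
    """Return a (low, high) estimate in MB for sdists plus all wheels."""
    low_factor, high_factor = 5, 15  # range quoted in the text above
    return sdist_size_mb * low_factor, sdist_size_mb * high_factor


# Hypothetical project whose sdists alone total 200 MB.
low, high = estimate_size_with_wheels(200)
print(f"{low:.0f} MB to {high:.0f} MB")
```

For this hypothetical project, enabling wheels would mean fetching roughly 1 GB to 3 GB instead of 200 MB.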

### Building from source

Building wheels from sdists takes a long time, but building from source gives you an important guarantee
which using pre-built wheels does not: what you installed matches the source code.
This can be especially important for Python packages implemented in C or other compiled languages.

#### requirements-build.txt

To allow building from source in a network-isolated environment, Cachi2 must download all the PEP 517 build dependencies
before the build starts.
@@ -377,7 +398,7 @@ There's no great way to generate such a file. As far as we know, the best soluti
standalone script that lives in the old Cachito repo:
[pip\_find\_builddeps.py](https://github.com/containerbuildsystem/cachito/blob/master/bin/pip_find_builddeps.py).

**Prerequisites:**
#### Prerequisites

Generate a [fully resolved requirements.txt](#requirementstxt)

@@ -565,10 +586,13 @@ installation. If you do manage to make it work, please let us know.

### Dependency does not distribute sources

Some projects do not distribute sdists to PyPI. For example, [tensorflow](https://pypi.org/simple/tensorflow/) (as of
Some projects do not distribute sdists to PyPI. For example, [tensorflow](https://pypi.org/project/tensorflow/2.11.0/#files) (as of
version 2.11.0) distributes only wheels.

Possible workaround: find the git repository for the project, get the source tarball for a release. In requirements.txt,
Possible workarounds:

* Enable pre-fetching wheels using `"allow_binary": "true"` in the JSON input.
* Find the project's git repository and get the source tarball for a release. In requirements.txt,
specify the dependency [via an https url](#https-urls).

```diff