Move CI documentation to inside Breeze docs (apache#37039)
This PR moves the documentation of CI of ours to inside Breeze
doc folder and splits the documentation in separate docs / chapters
following similar changes done for Breeze docs apache#36936 and the
contributing docs apache#36969.
potiuk authored Jan 27, 2024
1 parent 1bb8126 commit 6daceb8
Showing 33 changed files with 1,379 additions and 1,255 deletions.
2 changes: 0 additions & 2 deletions .gitattributes
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,6 @@ Dockerfile.ci export-ignore

ISSUE_TRIAGE_PROCESS.rst export-ignore
CONTRIBUTING.rst export-ignore
CI.rst export-ignore
CI_DIAGRAMS.md export-ignore
contributing_docs/ export-ignore

.devcontainer export-ignore
Expand Down
669 changes: 0 additions & 669 deletions CI.rst

This file was deleted.

2 changes: 1 addition & 1 deletion Dockerfile.ci
Original file line number Diff line number Diff line change
Expand Up @@ -1247,7 +1247,7 @@ LABEL org.apache.airflow.distro="debian" \
org.opencontainers.image.created="${AIRFLOW_IMAGE_DATE_CREATED}" \
org.opencontainers.image.authors="[email protected]" \
org.opencontainers.image.url="https://airflow.apache.org" \
org.opencontainers.image.documentation="https://github.com/apache/airflow/IMAGES.rst" \
org.opencontainers.image.documentation="https://airflow.apache.org/docs/docker-stack/index.html" \
org.opencontainers.image.source="https://github.com/apache/airflow" \
org.opencontainers.image.version="${AIRFLOW_VERSION}" \
org.opencontainers.image.revision="${COMMIT_SHA}" \
Expand Down
561 changes: 0 additions & 561 deletions IMAGES.rst

This file was deleted.

4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -224,7 +224,7 @@ Those are - in the order of most common ways people install Airflow:
`docker` tool, use them in Kubernetes, Helm Charts, `docker-compose`, `docker swarm`, etc. You can
read more about using, customising, and extending the images in the
[Latest docs](https://airflow.apache.org/docs/docker-stack/index.html), and
learn details on the internals in the [IMAGES.rst](https://github.com/apache/airflow/blob/main/IMAGES.rst) document.
learn details on the internals in the [images](https://airflow.apache.org/docs/docker-stack/index.html) document.
- [Tags in GitHub](https://github.com/apache/airflow/tags) to retrieve the git project sources that
were used to generate official source packages via git

Expand Down Expand Up @@ -429,7 +429,7 @@ might decide to add additional limits (and justify them with comment).

Want to help build Apache Airflow? Check out our [contributing documentation](https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).

Official Docker (container) images for Apache Airflow are described in [IMAGES.rst](https://github.com/apache/airflow/blob/main/IMAGES.rst).
Official Docker (container) images for Apache Airflow are described in [images](dev/breeze/doc/ci/02_images.md).

<!-- END Contributing, please keep comment here to allow auto update of PyPI readme.md -->
<!-- START Who uses Apache Airflow, please keep comment here to allow auto update of PyPI readme.md -->
Expand Down
2 changes: 1 addition & 1 deletion contributing-docs/06_development_environments.rst
Original file line number Diff line number Diff line change
Expand Up @@ -86,7 +86,7 @@ Benefits:

- Breeze environment is almost the same as used in the CI automated builds.
So, if the tests run in your Breeze environment, they will work in the CI as well.
See `<../../CI.rst>`_ for details about Airflow CI.
See `<../../dev/breeze/doc/ci/README.md>`_ for details about Airflow CI.

Limitations:

Expand Down
3 changes: 2 additions & 1 deletion contributing-docs/testing/integration_tests.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,8 @@ Enabling Integrations
---------------------

Airflow integration tests cannot be run in the local virtualenv. They can only run in the Breeze
environment with enabled integrations and in the CI. See `CI <CI.rst>`_ for details about Airflow CI.
environment with enabled integrations and in the CI. See `CI <../../dev/breeze/doc/ci/README.md>`_ for
details about Airflow CI.

When you are in the Breeze environment, by default, all integrations are disabled. This enables only true unit tests
to be executed in Breeze. You can enable the integration by passing the ``--integration <INTEGRATION>``
Expand Down
10 changes: 5 additions & 5 deletions dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,21 +59,21 @@ and users:
* `Constraints files` - used by both, CI jobs (to fix the versions of dependencies used by CI jobs in regular
PRs) and used by our users to reproducibly install released airflow versions.

Normally, both are updated and refreshed automatically via [CI system](../CI.rst). However, there are some
cases where we need to update them manually. This document describes how to do it.
Normally, both are updated and refreshed automatically via [CI system](../dev/breeze/doc/ci/README.md).
However, there are some cases where we need to update them manually. This document describes how to do it.

# Automated image cache and constraints refreshing in CI

Our [CI system](../CI.rst) is build in the way that it self-maintains. Regular scheduled builds and
Our [CI](../dev/breeze/doc/ci/README.md) is build in the way that it self-maintains. Regular scheduled builds and
merges to `main` branch builds (also known as `canary` builds) have separate maintenance step that
take care about refreshing the cache that is used to speed up our builds and to speed up
rebuilding of [Breeze](./breeze/doc/README.rst) images for development purpose. This is all happening automatically, usually:

* The latest [constraints](../contributing-docs/12_airflow_dependencies_and_extras.rst#pinned-constraint-files) are pushed to appropriate branch after all tests succeed in the
`canary` build.

* The [images](../IMAGES.rst) in `ghcr.io` registry are refreshed early at the beginning of the `canary` build. This
is done twice during the canary build:
* The [images](breeze/doc/ci/02_images.md) in `ghcr.io` registry are refreshed early at the beginning of the
`canary` build. This is done twice during the canary build:
* By the `Push Early Image Cache` job that is run at the beginning of the `canary` build. This cover the
case when there are new dependencies added or Dockerfile/scripts change. Thanks to that step, subsequent
PRs will be faster when they use the new Dockerfile/script. Those jobs **might fail** occasionally,
Expand Down
3 changes: 0 additions & 3 deletions dev/airflow-github
Original file line number Diff line number Diff line change
Expand Up @@ -142,12 +142,9 @@ def is_core_commit(files: list[str]) -> bool:
# non-released docs
"COMMITTERS.rst",
"contributing_docs/",
"IMAGES.rst",
"INTHEWILD.md",
"INSTALL",
"README.md",
"CI.rst",
"CI_DIAGRAMS.md",
"images/",
"codecov.yml",
"kubernetes_tests/",
Expand Down
2 changes: 2 additions & 0 deletions dev/breeze/doc/01_installation.rst
Original file line number Diff line number Diff line change
Expand Up @@ -462,5 +462,7 @@ This will also remove breeze from the folder: ``${HOME}.local/bin/``
pipx uninstall apache-airflow-breeze
----


Next step: Follow the `Customizing <02_customizing.rst>`_ guide to customize your environment.
1 change: 1 addition & 0 deletions dev/breeze/doc/02_customizing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -125,5 +125,6 @@ For automation scripts, you can export the ``ANSWER`` variable (and set it to
export ANSWER="yes"
------

Next step: Follow the `Developer tasks <03_developer_tasks.rst>`_ guide to learn how to use Breeze for regular development tasks.
2 changes: 2 additions & 0 deletions dev/breeze/doc/03_developer_tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -555,4 +555,6 @@ This is a lightweight solution that has its own limitations.
More details on using the local virtualenv are available in the
`Local Virtualenv <../../../contributing-docs/07_local_virtualenv.rst>`_.

------

Next step: Follow the `Troubleshooting <04_troubleshooting.rst>`_ guide to troubleshoot your Breeze environment.
2 changes: 2 additions & 0 deletions dev/breeze/doc/04_troubleshooting.rst
Original file line number Diff line number Diff line change
Expand Up @@ -155,4 +155,6 @@ issue. You may try running the below commands in the same terminal and then try
set HTTP_PROXY=null
set HTTPS_PROXY=null
----
Next step: Follow the `Test commands <05_test_commands.rst>`_ guide to running tests using Breeze.
2 changes: 2 additions & 0 deletions dev/breeze/doc/05_test_commands.rst
Original file line number Diff line number Diff line change
Expand Up @@ -603,5 +603,7 @@ All parameters of the command are here:
:width: 100%
:alt: Breeze k8s logs

-----

Next step: Follow the `Managing Breeze images <06_managing_docker_images.rst>`_ guide to learn how to manage
CI and PROD images of Breeze.
4 changes: 3 additions & 1 deletion dev/breeze/doc/06_managing_docker_images.rst
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ customized variant of the image that contains everything you need.

You can building the production image manually by using ``prod-image build`` command.
Note, that the images can also be built using ``docker build`` command by passing appropriate
build-args as described in `IMAGES.rst <IMAGES.rst>`_ , but Breeze provides several flags that
build-args as described in `Images documentation <ci/02_images.md>`_ , but Breeze provides several flags that
makes it easier to do it. You can see all the flags by running ``breeze prod-image build --help``,
but here typical examples are presented:

Expand Down Expand Up @@ -180,5 +180,7 @@ These are all available flags of ``verify-prod-image`` command:
:width: 100%
:alt: Breeze prod-image verify

------

Next step: Follow the `Breeze maintenance tasks <07_breeze_maintenance_tasks.rst>`_ to learn about tasks that
are useful when you are modifying Breeze itself.
2 changes: 2 additions & 0 deletions dev/breeze/doc/07_breeze_maintenance_tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -65,4 +65,6 @@ done via ``synchronize-local-mounts`` command.
:width: 100%
:alt: Breeze setup synchronize-local-mounts

-----

Next step: Follow the `CI tasks <08_ci_tasks.rst>`_ guide to learn how to use Breeze for regular development tasks.
3 changes: 3 additions & 0 deletions dev/breeze/doc/08_ci_tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ CI tasks
========

Breeze hase a number of commands that are mostly used in CI environment to perform cleanup.
Detailed description of the CI design can be found in `CI design <ci/README.md>`_.

.. contents:: :local:

Expand Down Expand Up @@ -130,5 +131,7 @@ These are all available flags of ``find-backtracking-candidates`` command:
:width: 100%
:alt: Breeze ci find-backtracking-candidates

-----

Next step: Follow the `Release management tasks <09_release_management_tasks.rst>`_ guide to learn how
release managers are using Breeze to release various Airflow artifacts.
2 changes: 2 additions & 0 deletions dev/breeze/doc/09_release_management_tasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -597,5 +597,7 @@ This command will build one docker image per python version, with all the airflo
:width: 100%
:alt: Breeze build all airflow images

-----

Next step: Follow the `Advanced Breeze topics <10_advanced_breeze_topics.rst>`_ to
learn more about Breeze internals.
2 changes: 1 addition & 1 deletion dev/breeze/doc/10_advanced_breeze_topics.rst
Original file line number Diff line number Diff line change
Expand Up @@ -242,6 +242,6 @@ It's enabled by setting ``RECORD_BREEZE_OUTPUT_FILE`` to a file name where it wi
By default it records the screenshots with default characters width and with "Breeze screenshot" title,
but you can override it with ``RECORD_BREEZE_WIDTH`` and ``RECORD_BREEZE_TITLE`` variables respectively.


------

**Thank you for getting that far** - we hope you will enjoy using Breeze!
2 changes: 1 addition & 1 deletion dev/breeze/doc/README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@
.. raw:: html

<div align="center">
<img src="../../../images/AirflowBreeze_logo.png"
<img src="images/AirflowBreeze_logo.png"
alt="Airflow Breeze - Development and Test Environment for Apache Airflow">
</div>

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -159,5 +159,5 @@ Thanks to combination of features available in GitHub, the builds are secured ag
code by users contributing PRs, that could get uncontrolled write access to Airflow repository.

The negative consequence of this is that the build process becomes much more complex
(see [CI](../../../../CI.rst) for complete description) and that some cases (like modifying build behaviour
(see [CI](../ci/README.md) for complete description) and that some cases (like modifying build behaviour
require additional process of testing by pushing the changes as `main` branch to a fork of Apache Airflow)
129 changes: 129 additions & 0 deletions dev/breeze/doc/ci/01_ci_environment.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [CI Environment](#ci-environment)
- [GitHub Actions workflows](#github-actions-workflows)
- [Container Registry used as cache](#container-registry-used-as-cache)
- [Authentication in GitHub Registry](#authentication-in-github-registry)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

# CI Environment

Continuous Integration is an important component of making Apache Airflow
robust and stable. We run a lot of tests for every pull request,
for main and v2-\*-test branches and regularly as scheduled jobs.

Our execution environment for CI is [GitHub Actions](https://github.com/features/actions).

However, part of our philosophy is that we are not tightly coupled to
any of the CI environments we use. Most of our CI jobs are written as
Python code packaged in the [Breeze](../../README.md) package, which is
executed as steps in the CI jobs via `breeze` CLI commands.
We also have a number of variables that determine build behaviour.

## GitHub Actions workflows

Our CI builds are highly optimized, leveraging the latest features
provided by the GitHub Actions environment to reuse parts of the build
process across different jobs.

A significant portion of our CI runs utilize container images. Given
that Airflow has numerous dependencies, we use Docker containers to
ensure tests run in a well-configured and consistent environment. This
approach is used for most tests, documentation building, and some
advanced static checks. The environment comprises two types of images:
CI images and PROD images. CI images are used for most tests and checks,
while PROD images are used for Kubernetes tests.

To run the tests, we need to ensure that the images are built using the
latest sources and that the build process is efficient. A full rebuild
of such an image from scratch might take approximately 15 minutes.
Therefore, we've implemented optimization techniques that efficiently
use the cache from the GitHub Docker registry. In most cases, this
reduces the time needed to rebuild the image to about 4 minutes.
However, when dependencies change, it can take around 6-7 minutes, and
if the base image of Python releases a new patch-level, it can take
approximately 12 minutes.

## Container Registry used as cache

We are using GitHub Container Registry to store the results of the
`Build Images` workflow which is used in the `Tests` workflow.

Currently, in the main version of Airflow, we run tests on all supported
versions of Python, which means that we have to build multiple images
(one CI and one PROD for each Python version). We also run many jobs
(\>15) for each of the CI images. Building the environment separately
in every job would take a lot of time. Therefore we are utilising the
`pull_request_target` feature of GitHub Actions.

This feature allows us to run a separate, independent workflow when the
main workflow is run. This separate workflow differs from the main one
in two ways: by default it runs using the `main` version of the sources
and, most importantly, it has WRITE access to the GitHub Container
Image registry.

This is especially important in our case where Pull Requests to Airflow
might come from any repository, and it would be a huge security issue if
anyone from outside could utilise the WRITE access to the Container
Image Registry via external Pull Request.

The `pull_request_target` workflow, named `Build Images`, has WRITE
access and by default uses the `main` version of the sources. We can
safely run that code because it has been reviewed and merged.
The workflow checks out the incoming Pull Request, builds
the container image from the sources of the incoming PR (which happens in an
isolated Docker build step for security) and pushes the image to the
GitHub Docker Registry, so that the image is built only once and
used by all the jobs running tests. The image is tagged with the unique
`COMMIT_SHA` of the incoming Pull Request, and the tests run in the `pull` workflow
can simply pull the image rather than build it from scratch.
Pulling the image takes about 1 minute, which saves a lot of
precious time for the jobs.
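
As a rough illustration of why building once and pulling everywhere pays
off, the sketch below compares the two approaches using the approximate
figures quoted in this document (about 4 minutes for a cached rebuild,
about 1 minute for a pull). The job count of 15 is an assumption, since
the text only says "more than 15", and the function names are made up
for this example.

```python
# Approximate figures taken from the text; real timings vary per run.
CACHED_REBUILD_MINUTES = 4  # typical rebuild that uses the registry cache
PULL_MINUTES = 1            # pulling an already built image
JOBS_PER_IMAGE = 15         # assumption: the text only says "more than 15"


def build_in_every_job(jobs: int = JOBS_PER_IMAGE) -> int:
    """Total image-preparation minutes if every job rebuilds the image."""
    return jobs * CACHED_REBUILD_MINUTES


def build_once_then_pull(jobs: int = JOBS_PER_IMAGE) -> int:
    """Total minutes if the image is built once and every job pulls it."""
    return CACHED_REBUILD_MINUTES + jobs * PULL_MINUTES


if __name__ == "__main__":
    print(build_in_every_job())    # 60
    print(build_once_then_pull())  # 19
```

With these assumed numbers, a single image costs roughly 19 minutes of
preparation time across all jobs instead of roughly 60.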

We use [GitHub Container Registry](https://docs.github.com/en/packages/guides/about-github-container-registry).
A `GITHUB_TOKEN` is needed to push to the registry. We configured
scopes of the tokens in our jobs to be able to write to the registry,
but only for the jobs that need it.

The latest cache of our CI images is kept under the `:cache-linux-amd64`
and `:cache-linux-arm64` tags (suitable for the `--cache-from` directive
of buildx). It contains metadata and cache for all segments in the image,
and the cache is kept separately for each platform.
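
The per-platform cache tags can be derived mechanically from the
platform name. The following is a minimal sketch, not the way Breeze
actually assembles its build command: the helper names and the image
name `ghcr.io/example/ci-image` are hypothetical, and only the tag
pattern and the buildx `--cache-from type=registry,ref=...` form come
from the text and from buildx itself.

```python
def cache_tag(platform: str) -> str:
    """Map a platform such as 'linux/amd64' to a tag such as 'cache-linux-amd64'."""
    return "cache-" + platform.replace("/", "-")


def cache_from_arg(image: str, platform: str) -> str:
    """Build a --cache-from argument for `docker buildx build`.

    `image` is a hypothetical image name without a tag.
    """
    return f"--cache-from=type=registry,ref={image}:{cache_tag(platform)}"


if __name__ == "__main__":
    # Hypothetical image name, used only for illustration.
    print(cache_from_arg("ghcr.io/example/ci-image", "linux/arm64"))
```

Keeping the tag derivation in one helper means that a build for each
platform reads only its own cache.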

The `latest` images of CI and PROD are `amd64`-only images for CI,
because there is no easy way to push multi-platform images without
merging the manifests, and they are neither needed nor used for cache.

## Authentication in GitHub Registry

We are using GitHub Container Registry as cache for our images.
Authentication uses the `GITHUB_TOKEN` mechanism. Authentication is
needed for pushing the images (WRITE) only in the `push` and
`pull_request_target` workflows. When you are running the CI jobs in
GitHub Actions, `GITHUB_TOKEN` is set automatically by the actions.
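
The rule above can be expressed as a small check on the event that
triggered the workflow. This is only a sketch under assumptions: the
function name is hypothetical, and it relies on the standard
`GITHUB_EVENT_NAME` environment variable that GitHub Actions sets for
every run. It is not taken from the Airflow workflows themselves.

```python
import os
from typing import Mapping

# Events whose workflows push images and therefore need WRITE access,
# as listed in the text above.
WRITE_EVENTS = frozenset({"push", "pull_request_target"})


def needs_registry_write(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when the triggering event is one that pushes images."""
    return environ.get("GITHUB_EVENT_NAME", "") in WRITE_EVENTS


if __name__ == "__main__":
    print(needs_registry_write({"GITHUB_EVENT_NAME": "pull_request"}))
```

Passing the environment as a parameter keeps the check easy to exercise
outside of GitHub Actions.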

----

Read next about [Images](02_images.md)