chore: Update for 0.3.0 release, prepare for 0.4.0 development (#970)
* Generate changelog for 0.3.0 release

* change maven version from 0.3.0-SNAPSHOT to 0.3.0

* update version in diffs

* update scripts

* update docs

* prepare for 0.4.0

* prepare for 0.4.0

* prepare for 0.4.0

* update CI

* improve release instructions

* more release note edits + formatting

* github repo release

* remove GH_TOKEN references

* fix an error in the release docs

* fix maven url

* fix maven url

* regenerate docs
andygrove authored Oct 1, 2024
1 parent 84cccf7 commit a1599e2
Showing 20 changed files with 145 additions and 75 deletions.
2 changes: 1 addition & 1 deletion .github/actions/setup-spark-builder/action.yaml
@@ -29,7 +29,7 @@ inputs:
comet-version:
description: 'The Comet version to use for Spark'
required: true
default: '0.3.0-SNAPSHOT'
default: '0.4.0-SNAPSHOT'
runs:
using: "composite"
steps:
2 changes: 1 addition & 1 deletion .github/workflows/docker-publish.yml
@@ -44,7 +44,7 @@ jobs:
- name: Extract Comet version
id: extract_version
run: |
# use the tag that triggered this workflow as the Comet version e.g. 0.3.0-rc1
# use the tag that triggered this workflow as the Comet version e.g. 0.4.0-rc1
echo "COMET_VERSION=${GITHUB_REF##*/}" >> $GITHUB_ENV
- name: Echo Comet version
run: echo "The current Comet version is ${{ env.COMET_VERSION }}"
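
The `${GITHUB_REF##*/}` expansion in the step above removes everything up to and including the last `/` in the ref. A minimal sketch of that behavior, assuming a tag ref of the form `refs/tags/<tag>`; the function name `tag_from_ref` is hypothetical and not part of the workflow:

```shell
# Hypothetical helper for illustration; the workflow inlines the expansion instead.
tag_from_ref() {
  ref="$1"
  # "##*/" deletes the longest prefix that ends in "/",
  # leaving only the last path segment of the ref.
  printf '%s\n' "${ref##*/}"
}

tag_from_ref "refs/tags/0.4.0-rc1"
```

Because only the last path segment survives, a ref such as `refs/heads/feature/x` would yield `x`, so this approach is best suited to tag-triggered runs like this one.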
2 changes: 1 addition & 1 deletion .github/workflows/spark_sql_test.yml
@@ -71,7 +71,7 @@ jobs:
with:
spark-version: ${{ matrix.spark-version.full }}
spark-short-version: ${{ matrix.spark-version.short }}
comet-version: '0.3.0-SNAPSHOT' # TODO: get this from pom.xml
comet-version: '0.4.0-SNAPSHOT' # TODO: get this from pom.xml
- name: Run Spark tests
run: |
cd apache-spark
2 changes: 1 addition & 1 deletion .github/workflows/spark_sql_test_ansi.yml
@@ -69,7 +69,7 @@ jobs:
with:
spark-version: ${{ matrix.spark-version.full }}
spark-short-version: ${{ matrix.spark-version.short }}
comet-version: '0.3.0-SNAPSHOT' # TODO: get this from pom.xml
comet-version: '0.4.0-SNAPSHOT' # TODO: get this from pom.xml
- name: Run Spark tests
run: |
cd apache-spark
2 changes: 1 addition & 1 deletion common/pom.xml
@@ -26,7 +26,7 @@ under the License.
<parent>
<groupId>org.apache.datafusion</groupId>
<artifactId>comet-parent-spark${spark.version.short}_${scala.binary.version}</artifactId>
<version>0.3.0-SNAPSHOT</version>
<version>0.4.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion dev/diffs/3.4.3.diff
@@ -7,7 +7,7 @@ index d3544881af1..bf0e2b53c70 100644
<ivy.version>2.5.1</ivy.version>
<oro.version>2.0.8</oro.version>
+ <spark.version.short>3.4</spark.version.short>
+ <comet.version>0.3.0-SNAPSHOT</comet.version>
+ <comet.version>0.4.0-SNAPSHOT</comet.version>
<!--
If you changes codahale.metrics.version, you also need to change
the link to metrics.dropwizard.io in docs/monitoring.md.
2 changes: 1 addition & 1 deletion dev/diffs/3.5.1.diff
@@ -7,7 +7,7 @@ index 0f504dbee85..f6019da888a 100644
<ivy.version>2.5.1</ivy.version>
<oro.version>2.0.8</oro.version>
+ <spark.version.short>3.5</spark.version.short>
+ <comet.version>0.3.0-SNAPSHOT</comet.version>
+ <comet.version>0.4.0-SNAPSHOT</comet.version>
<!--
If you changes codahale.metrics.version, you also need to change
the link to metrics.dropwizard.io in docs/monitoring.md.
2 changes: 1 addition & 1 deletion dev/diffs/4.0.0-preview1.diff
@@ -7,7 +7,7 @@ index a4b1b2c3c9f..db50bdb0d3b 100644
<ivy.version>2.5.2</ivy.version>
<oro.version>2.0.8</oro.version>
+ <spark.version.short>4.0</spark.version.short>
+ <comet.version>0.3.0-SNAPSHOT</comet.version>
+ <comet.version>0.4.0-SNAPSHOT</comet.version>
<!--
If you change codahale.metrics.version, you also need to change
the link to metrics.dropwizard.io in docs/monitoring.md.
124 changes: 81 additions & 43 deletions dev/release/README.md
@@ -17,9 +17,9 @@ specific language governing permissions and limitations
under the License.
-->

# Aapche DataFusion Comet: Source Release Process
# Apache DataFusion Comet: Release Process

This documentation is for creating an official source release of Apache DataFusion Comet.
This documentation explains the release process for Apache DataFusion Comet.

## Creating the Release Candidate

@@ -49,12 +49,18 @@ git checkout -b branch-0.1
git push apache branch-0.1
```

Create and merge a PR against the release branch to update the Maven version from `0.3.0-SNAPSHOT` to `0.1.0`
Update the `pom.xml` files in the release branch to update the Maven version from `0.1.0-SNAPSHOT` to `0.1.0`.

There is no need to update the Rust crate versions because they will already be `0.1.0`.

### Update Version in main

Create a PR against the main branch to update the Rust crate version to `0.2.0` and the Maven version to `0.2.0-SNAPSHOT`.
The Spark diffs also need updating.
Create a PR against the main branch to prepare for developing the next release:

- Update the Rust crate version to `0.2.0`.
- Update the Maven version to `0.2.0-SNAPSHOT` (both in the `pom.xml` files and also in the diff files
under `dev/diffs`).
- Update the CI scripts under the `.github` directory.
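
Since the version string is hard-coded in several places, it may help to search for leftover references after making these updates. A minimal sketch, assuming the old version string is known; the function name `list_version_references` is hypothetical and not part of the release scripts:

```shell
# Hypothetical helper: list files under the given paths that still mention a version string.
list_version_references() {
  version="$1"
  shift
  # -r: recurse into directories, -l: print matching file names only,
  # -F: treat the version as a fixed string rather than a regular expression.
  grep -rlF -- "$version" "$@"
}
```

For example, `list_version_references "0.1.0-SNAPSHOT" .github dev/diffs pom.xml` would print any files that still need updating. `grep` exits with a non-zero status when nothing matches, which here means no stale references were found.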

### Generate the Change Log

@@ -81,26 +87,29 @@ python3 generate-changelog.py 0.0.0 HEAD 0.1.0 > ../changelog/0.1.0.md
Create a PR against the _main_ branch to add this change log and once this is approved and merged, cherry-pick the
commit into the release branch.

### Build the jars
### Build the jars

#### Setup to do the build
The build process requires Docker. Download the latest Docker Desktop from https://www.docker.com/products/docker-desktop/.
If you have multiple docker contexts running switch to the context of the Docker Desktop. For example -

```shell
The build process requires Docker. Download the latest Docker Desktop from https://www.docker.com/products/docker-desktop/.
If you have multiple docker contexts running switch to the context of the Docker Desktop. For example -

```shell
$ docker context ls
NAME DESCRIPTION DOCKER ENDPOINT ERROR
default Current DOCKER_HOST based configuration unix:///var/run/docker.sock
desktop-linux Docker Desktop unix:///Users/parth/.docker/run/docker.sock
my_custom_context * tcp://192.168.64.2:2376

$ docker context use desktop-linux
```
```

#### Run the build script
The `build-release-comet.sh` script will create a docker image for each architecture and use the image

The `build-release-comet.sh` script will create a docker image for each architecture and use the image
to build the platform specific binaries. These builder images are created every time this script is run.
The script optionally allows overriding of the repository and branch to build the binaries from (Note that
the local git repo is not used in the building of the binaries, but it is used to build the final uber jar).
The script optionally allows overriding of the repository and branch to build the binaries from (Note that
the local git repo is not used in the building of the binaries, but it is used to build the final uber jar).

```shell
Usage: build-release-comet.sh [options]
@@ -122,8 +131,10 @@ cd dev/release && ./build-release-comet.sh && cd ../..
```

#### Build output
The build output is installed to a temporary local maven repository. The build script will print the name of the repository
location at the end. This location will be required at the time of deploying the artifacts to a staging repository

The build output is installed to a temporary local maven repository. The build script will print the name of the
repository location at the end. This location will be required at the time of deploying the artifacts to a staging
repository

### Tag the Release Candidate

@@ -137,27 +148,28 @@ git tag 0.1.0-rc1
git push apache 0.1.0-rc1
```

Note that pushing a release candidate tag will trigger a GitHub workflow that will build a Docker image and publish
it to GitHub Container Registry at https://github.com/apache/datafusion-comet/pkgs/container/datafusion-comet

## Publishing the Release Candidate

This part of the process can mostly only be performed by a PMC member.

### Create the Release Candidate Tarball

Run the create-tarball script on the release candidate tag (`0.1.0-rc1`) to create the source tarball and upload it to the dev subversion repository

```shell
GH_TOKEN=<TOKEN> ./dev/release/create-tarball.sh 0.1.0 1
```

### Publish the maven artifacts

#### Setup maven

##### One time project setup

Setting up your project in the ASF Nexus Repository from here: https://infra.apache.org/publishing-maven-artifacts.html

##### Release Manager Setup
Set up your development environment from here: https://infra.apache.org/publishing-maven-artifacts.html

Set up your development environment from here: https://infra.apache.org/publishing-maven-artifacts.html

##### Build and publish a release candidate to nexus.
The script `publish-to-maven.sh` will publish the artifacts created by the `build-release-comet.sh` script.

The script `publish-to-maven.sh` will publish the artifacts created by the `build-release-comet.sh` script.
The artifacts will be signed using the gpg key of the release manager and uploaded to the maven staging repository.

Note: This script needs `xmllint` to be installed. On MacOS xmllint is available by default.
@@ -183,7 +195,8 @@ GPG_KEY - GPG key used to sign release artifacts
GPG_PASSPHRASE - Passphrase for GPG key
```

example
example

```shell
/comet:$./dev/release/publish-to-maven.sh -u release_manager_asf_id -r /tmp/comet-staging-repo-VsYOX
ASF Password :
@@ -193,23 +206,56 @@ Creating Nexus staging repository
...
```

In the Nexus repository UI (https://repository.apache.org/) locate and verify the artifacts in
In the Nexus repository UI (https://repository.apache.org/) locate and verify the artifacts in
staging (https://central.sonatype.org/publish/release/#locate-and-examine-your-staging-repository).

If the artifacts appear to be correct, then close and release the repository so it is made visible.
If the artifacts appear to be correct, then close and release the repository so it is made visible (this should
actually happen automatically when running the script).

### Create the Release Candidate Tarball

Run the create-tarball script on the release candidate tag (`0.1.0-rc1`) to create the source tarball and upload it to
the dev subversion repository

```shell
./dev/release/create-tarball.sh 0.1.0 1
```

This will generate an email template for starting the vote.

### Start an Email Voting Thread

Send the email that is generated in the previous step to `dev@datafusion.apache.org`.

### Publish the Release Tarball
## Publishing Binary Releases

Once the vote passes, we can publish the source and binary releases.

Once the vote passes, run the release-tarball script to move the tarball to the release subversion repository.
### Publishing Source Tarball

Run the release-tarball script to move the tarball to the release subversion repository.

```shell
./dev/release/create-tarball.sh 0.1.0 1
./dev/release/release-tarball.sh 0.1.0 1
```

### Create a release in the GitHub repository

Go to https://github.com/apache/datafusion-comet/releases and create a release for the release tag, and paste the
changelog in the description.

### Publishing Maven Artifacts

Promote the Maven artifacts from staging to production by visiting https://repository.apache.org/#stagingRepositories
and selecting the staging repository and then clicking the "release" button.

### Publishing Crates

Publish the `datafusion-comet-spark-expr` crate to crates.io so that other Rust projects can leverage the
Spark-compatible operators and expressions outside of Spark.

### Push a release tag to the repo

Push a release tag (`0.1.0`) to the `apache` repository.

```shell
@@ -219,6 +265,9 @@ git tag 0.1.0
git push apache 0.1.0
```

Note that pushing a release tag will trigger a GitHub workflow that will build a Docker image and publish
it to GitHub Container Registry at https://github.com/apache/datafusion-comet/pkgs/container/datafusion-comet

Reply to the vote thread to close the vote and announce the release.

## Post Release Admin
@@ -260,20 +309,9 @@ svn ls https://dist.apache.org/repos/dist/release/datafusion | grep comet
Delete a release:

```shell
svn delete -m "delete old DataFusion Comet release" https://dist.apache.org/repos/dist/release/datafusion-comet/datafusion-comet-0.0.0
svn delete -m "delete old DataFusion Comet release" https://dist.apache.org/repos/dist/release/datafusion/datafusion-comet-0.0.0
```

## Publishing Binary Releases

### Publishing JAR Files to Maven

Once the vote has passed, promote the staged release candidate to production in the Nexus repository UI (https://repository.apache.org/).

### Publishing to crates.io

We may choose to publish the `datafusion-comet` to crates.io so that other Rust projects can leverage the
Spark-compatible operators and expressions outside of Spark.

## Post Release Activities

Writing a blog post about the release is a great way to generate more interest in the project. We typically create a
13 changes: 5 additions & 8 deletions dev/release/create-tarball.sh
@@ -53,11 +53,6 @@ if [ "$#" -ne 2 ]; then
exit
fi

if [[ -z "${GH_TOKEN}" ]]; then
echo "Please set personal github token through GH_TOKEN environment variable"
exit
fi

version=$1
rc=$2
tag="${version}-rc${rc}"
@@ -87,7 +82,8 @@ I would like to propose a release of Apache DataFusion Comet version ${version}.
This release candidate is based on commit: ${release_hash} [1]
The proposed release tarball and signatures are hosted at [2].
The changelog is located at [3].
Pre-built jar files are available in a Maven staging repository [3].
The changelog is located at [4].
Please download, verify checksums and signatures, run the unit tests, and vote
on the release. The vote will be open for at least 72 hours.
@@ -107,7 +103,8 @@ Here is my vote:
[1]: https://github.com/apache/datafusion-comet/tree/${release_hash}
[2]: ${url}
[3]: https://github.com/apache/datafusion-comet/blob/${release_hash}/CHANGELOG.md
[3]: https://repository.apache.org/#nexus-search;quick~org.apache.datafusion
[4]: https://github.com/apache/datafusion-comet/blob/${release_hash}/CHANGELOG.md
MAIL
echo "---------------------------------------------------------"

@@ -121,7 +118,7 @@ echo "Running rat license checker on ${tarball}"
${DEV_RELEASE_DIR}/run-rat.sh ${tarball}

echo "Signing tarball and creating checksums"
gpg --armor --output ${tarball}.asc --detach-sig ${tarball}
gpg --pinentry-mode loopback --armor --output ${tarball}.asc --detach-sig ${tarball}
# create signing with relative path of tarball
# so that they can be verified with a command such as
# shasum --check apache-datafusion-comet-0.1.0-rc1.tar.gz.sha512
15 changes: 11 additions & 4 deletions dev/release/rat_exclude_files.txt
@@ -2,18 +2,25 @@
*.dockerignore
.github/pull_request_template.md
.gitmodules
core/Cargo.lock
core/testdata/backtrace.txt
core/testdata/stacktrace.txt
native/Cargo.lock
native/testdata/backtrace.txt
native/testdata/stacktrace.txt
dev/copyright/scala-header.txt
dev/release/requirements.txt
dev/release/rat_exclude_files.txt
docs/spark_builtin_expr_coverage.txt
docs/source/_static/images/*.svg
docs/source/contributor-guide/benchmark-results/**/*.json
docs/logos/*.png
docs/logos/*.svg
rust-toolchain
spark/src/test/resources/tpcds-extended/q*.sql
spark/src/test/resources/tpcds-query-results/*.out
spark/src/test/resources/tpcds-micro-benchmarks/*.sql
spark/src/test/resources/tpcds-plan-stability/approved-plans*/**/explain.txt
spark/src/test/resources/tpcds-plan-stability/approved-plans*/**/simplified.txt
spark/src/test/resources/tpch-query-results/*.out
spark/src/test/resources/tpch-extended/q1.sql
spark/src/test/resources/tpch-extended/q*.sql
spark/src/test/resources/test-data/*.csv
spark/src/test/resources/test-data/*.ndjson
spark/inspections/CometTPC*results.txt
2 changes: 2 additions & 0 deletions dev/release/verifying-release-candidates.md
@@ -37,6 +37,8 @@ make release-nogit
We hope that users will verify the release beyond running this script by testing the release candidate with their
existing Spark jobs and report any functional issues or performance regressions.

The email announcing the vote should contain a link to pre-built jar files in a Maven staging repository.

Another way of verifying the release is to follow the
[Comet Benchmarking Guide](https://datafusion.apache.org/comet/contributor-guide/benchmarking.html) and compare
performance with the previous release.
4 changes: 2 additions & 2 deletions docs/source/user-guide/installation.md
@@ -38,15 +38,15 @@ See the [Comet Kubernetes Guide](kubernetes.md) guide.

## Using a Published JAR File

There are no published JAR files yet.
Pre-built jar files are available in Maven central at https://central.sonatype.com/namespace/org.apache.datafusion

## Using a Published Source Release

Official source releases can be downloaded from https://dist.apache.org/repos/dist/release/datafusion/

```console
# Pick the latest version
export COMET_VERSION=0.2.0
export COMET_VERSION=0.3.0
# Download the tarball
curl -O "https://dist.apache.org/repos/dist/release/datafusion/datafusion-comet-$COMET_VERSION/apache-datafusion-comet-$COMET_VERSION.tar.gz"
# Unpack