All commands assume you are in the root directory of this project. For me, that looks like ~/repos/chronon.
Add the following to your shell run command files, e.g. ~/.bashrc:
export CHRONON_OS=<path/to/chronon/repo>
export CHRONON_API=$CHRONON_OS/api/py
alias materialize="PYTHONPATH=$CHRONON_API:$PYTHONPATH $CHRONON_API/ai/chronon/repo/compile.py"
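A quick sanity check of these variables can catch a mistyped path early. The sketch below is only an illustration: check_chronon_env is a hypothetical helper, not part of the Chronon repo, and it assumes that a valid checkout has build.sbt at its root.

```shell
# Hypothetical helper (not shipped with Chronon): verify that the
# environment variables defined above point at real locations.
check_chronon_env() {
  # CHRONON_OS should be the repo root, which contains build.sbt.
  if [ -z "${CHRONON_OS:-}" ] || [ ! -f "$CHRONON_OS/build.sbt" ]; then
    echo "CHRONON_OS does not point at a checkout containing build.sbt" >&2
    return 1
  fi
  # CHRONON_API should be an existing directory inside the checkout.
  if [ -z "${CHRONON_API:-}" ] || [ ! -d "$CHRONON_API" ]; then
    echo "CHRONON_API is not an existing directory" >&2
    return 1
  fi
  echo "ok"
}
```

Running check_chronon_env in a new shell prints ok when both variables resolve, and a short error otherwise.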
Thrift is a dependency for compile. The latest version, 0.14, is very new (Feb 2021) and incompatible with the Hive metastore, so we force 0.13.
brew tap cartman-kai/thrift
brew install thrift@0.13.0
python3 -m pip install -U tox build
Be sure to open the project from the build.sbt file (at the root level of the git directory).
Mark the following directories as Sources Root by right-clicking on the directory in the tree view and selecting Mark As -> Sources Root:
- aggregator/src/main/scala
- api/src/main/scala
- spark/src/main/scala
Mark the following directories as Test Root in a similar way:
- aggregator/src/test/scala
- api/src/test/scala
- spark/src/test/scala
The project should then automatically start indexing, and when it finishes you should be good to go.
Troubleshooting
Try the following if you are seeing flaky issues in IntelliJ
sbt +clean
sbt +assembly
sbt py_thrift
materialize --input_path=<path/to/conf>
All tests
sbt test
Specific submodule tests
sbt "testOnly *<Module>"
# example to test FetcherTest with 9G memory
sbt -mem 9000 "test:testOnly *FetcherTest"
# example to test specific test method from GroupByTest
sbt "test:testOnly *GroupByTest -- -t *testSnapshotEntities"
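The test selectors above follow a regular pattern, so they can be generated. This is a minimal sketch; test_only_arg is a hypothetical helper name, and it only reproduces the argument shapes shown in the examples above.

```shell
# Hypothetical helper: print the sbt argument for one test class and,
# optionally, one test method, using the same patterns as the examples.
test_only_arg() {
  class="$1"
  method="${2:-}"
  if [ -n "$method" ]; then
    printf 'test:testOnly *%s -- -t *%s\n' "$class" "$method"
  else
    printf 'test:testOnly *%s\n' "$class"
  fi
}

# Usage: sbt "$(test_only_arg GroupByTest testSnapshotEntities)"
```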
# Graph based view of all the dependencies
sbt dependencyBrowseGraph
# Tree based view of all the dependencies
sbt dependencyBrowseTree
- Inside the $CHRONON_OS directory.
sbt package
sbt python_api
Note: This will create the artifacts with the version specific naming specified under version.sbt
Builds on main branch will result in:
<artifact-name>-<version>.jar
[JARs] chronon_2.11-0.7.0-SNAPSHOT.jar
[Python] chronon-ai-0.7.0-SNAPSHOT.tar.gz
Builds on user branches will result in:
<artifact-name>-<branch-name>-<version>.jar
[JARs] chronon_2.11-jdoe--branch-0.7.0-SNAPSHOT.jar
[Python] chronon-ai-jdoe--branch-ai-0.7.0-SNAPSHOT.tar.gz
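The JAR naming rule above can be written out as a small function. This is a sketch of the pattern only, not the build's actual logic: artifact_jar_name is a hypothetical name, and the branch label is assumed to be passed in already in the form it appears in the file name (e.g. jdoe--branch).

```shell
# Hypothetical helper illustrating the JAR naming pattern described above.
# Arguments: artifact name, version, branch label (defaults to main).
artifact_jar_name() {
  name="$1"
  version="$2"
  branch="${3:-main}"
  if [ "$branch" = "main" ]; then
    # Main branch builds: <artifact-name>-<version>.jar
    echo "${name}-${version}.jar"
  else
    # User branch builds: <artifact-name>-<branch-name>-<version>.jar
    echo "${name}-${branch}-${version}.jar"
  fi
}
```

For example, artifact_jar_name chronon_2.11 0.7.0-SNAPSHOT jdoe--branch reproduces the user-branch JAR name listed above.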
sbt assembly
sbt 'spark_uber/assembly'
- Inside the $CHRONON_OS directory.
To publish all the Chronon artifacts of the current git HEAD (builds and publishes all the JARs)
sbt publish
- All the SNAPSHOT ones are published to the maven repository specified by the env variable $CHRONON_SNAPSHOT_REPO.
- All the final artifacts are published to MavenCentral (via Sonatype).
NOTE: The Python API package will also be generated, but it will not be pushed to any PyPi repository. Only release will push the Python artifacts to the public repository.
- Log in to the JFrog Artifactory web console and create an API key under the user profile section.
- In ~/.sbt/1.0/jfrog.sbt add
credentials += Credentials(Path.userHome / ".sbt" / "jfrog_credentials")
- In ~/.sbt/jfrog_credentials add
realm=Artifactory Realm
host=<Artifactory domain of $CHRONON_SNAPSHOT_REPO>
user=<your username>
password=<API Key>
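The credentials file uses four key=value lines, so it can also be generated from the shell. The helper below is a sketch under assumptions: write_sbt_credentials is a hypothetical name, and restricting the file to owner-only permissions is an extra precaution that the steps above do not require.

```shell
# Hypothetical helper: write an sbt credentials file in the format shown above.
# Arguments: target file, realm, host, user, password.
write_sbt_credentials() {
  file="$1"
  realm="$2"
  host="$3"
  user="$4"
  password="$5"
  (
    # Assumption: keep the file private, since it holds a secret.
    umask 077
    printf 'realm=%s\nhost=%s\nuser=%s\npassword=%s\n' \
      "$realm" "$host" "$user" "$password" > "$file"
  )
}

# Usage (placeholders, fill in your own values):
# write_sbt_credentials ~/.sbt/jfrog_credentials "Artifactory Realm" "<host>" "<user>" "<API Key>"
```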
- Create a Sonatype account if you don't have one.
- Sign up here: https://issues.sonatype.org/
- Create an issue to add your username created above to ai.chronon. Here is a sample issue.
- brew install gpg on your mac
- In ~/.sbt/1.0/sonatype.sbt add
credentials += Credentials(Path.userHome / ".sbt" / "sonatype_credentials")
- In ~/.sbt/sonatype_credentials add
realm=Sonatype Nexus Repository Manager
host=s01.oss.sonatype.org
user=<your username>
password=<your password>
- Set up gpg (only the first step in this link)
- Set up your PyPi public account and contact @Nikhil to get added to the PyPi package as a collaborator
- Install tox, build, twine. There are three python requirements for the python build process.
- tox: Module for testing. To run the tests, run tox in the main project directory.
- build: Module for building. To build, run python -m build in the main project directory.
- twine: Module for publishing. To upload a distribution, run twine upload dist/<distribution>.whl
python3 -m pip install -U tox build twine
- Fetch the user token from the PyPi website.
- Make sure you have the credentials configuration for the python repositories you manage. Normally in ~/.pypirc:
[distutils]
index-servers =
local
pypi
chronon-pypi
[local]
repository = # local artifactory
username = # local username
password = # token or password
[pypi]
username = # username or __token__
password = # password or token
# Or if using a project specific token
[chronon-pypi]
repository = https://upload.pypi.org/legacy/
username = __token__
password = # Project specific pypi token.
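Before a release it can help to confirm that the expected section exists in the configuration file. The function below is a minimal sketch: pypirc_has_section is a hypothetical helper, and it only matches a section header that sits alone on its line with no surrounding whitespace.

```shell
# Hypothetical helper: succeed if the given file contains the section header.
# Arguments: path to the pypirc file, section name (without brackets).
pypirc_has_section() {
  file="$1"
  section="$2"
  # -F treats the brackets literally, -x requires a whole-line match.
  grep -qxF "[$section]" "$file"
}

# Usage: pypirc_has_section ~/.pypirc chronon-pypi || echo "section missing"
```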
- Run release command in the right HEAD of chronon repository. Before running this, you may want to activate your Python venv or install the required Python packages on the laptop. Otherwise, the Python release will fail due to missing deps.
GPG_TTY=$(tty) sbt -mem 8192 release
This command takes version.sbt into account and handles a series of events:
- Marks the current SNAPSHOT codebase as final (git commits).
- Creates a new git tag (e.g v0.7.0) pointing to the release commit.
- Builds the artifacts with released versioning suffix and pushes them to Sonatype, and PyPi central.
- Updates version.sbt to point to the next-in-line development version (git commits).
- Log in to the staging repo in Nexus (same password as Sonatype JIRA)
- In the staging repos list, select your publish
- Select "close" and wait for the steps to finish
- Select "refresh" and "release"
- Wait for 30 mins to sync to maven or sonatype UI
- Push the local release commits (DO NOT SQUASH), and the new tag created from step 1 to Github.
- The chronon repo disallows pushing to the main branch directly, so instead push the commits to a branch:
git push origin main:your-name--release-xxx
- Your PR should contain exactly two commits: one setting the release version and one setting the new snapshot version.
- Make sure to use Rebase pull request instead of the regular Merge or Squash options when merging the PR.
- Push release tag to main branch
- Tag the new version to the release commit "Setting version to 0.0.xx". If not already tagged, the tag can be added by
git tag -fa v0.0.xx <commit-sha>
- Push the tag
git push origin <tag-name>
- New tag should be available here - https://github.com/airbnb/chronon/tags
- Verify on the PyPi website that the Python API package points to the latest version.
- Most common reason for Python failure is re-uploading a version that's already uploaded.
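The requirement that the release PR contains exactly two commits can be checked locally before pushing. This is a sketch only: release_commit_count is a hypothetical helper, and the base ref to compare against (for example origin/main) is an assumption about how your remote is named.

```shell
# Hypothetical helper: count the commits on HEAD that are not on the base ref.
# Argument: base ref to compare against (e.g. origin/main).
release_commit_count() {
  base="$1"
  git rev-list --count "${base}..HEAD"
}

# Usage: expect the release commit plus the next snapshot commit.
# [ "$(release_commit_count origin/main)" = "2" ] || echo "unexpected number of commits"
```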
We use gh releases to release the driver that can backfill, upload, stream etc. Currently the repo is not public and the run.py script can't reach it.
Run the sbt sphinx command to generate the sphinx docs locally and open them.
sbt sphinx
bash build.sh
bash gcloud_release.sh
{One-time} First install the ammonite REPL with support for scala 2.12
sudo sh -c '(echo "#!/usr/bin/env sh" && curl -L https://github.com/com-lihaoyi/Ammonite/releases/download/3.0.0-M0/2.12-3.0.0-M0) > /usr/local/bin/amm && chmod +x /usr/local/bin/amm' && amm
Build the chronon jar for scala 2.12
sbt ++2.12.12 spark_uber/assembly
Start the REPL
/usr/local/bin/amm
In the repl prompt load the jar
import $cp.spark.target.`scala-2.12`.`spark_uber-assembly-0.0.63-SNAPSHOT.jar`
Now you can import the chronon classes and use them directly from repl for testing.
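The jar name in the import line embeds the build version, so it changes between builds. The sketch below locates the most recently built jar; find_uber_jar is a hypothetical helper, and the directory layout is inferred from the import path above (spark/target/scala-2.12 under the repo root).

```shell
# Hypothetical helper: print the newest spark_uber assembly jar, if any.
# Argument: repo root (defaults to the current directory).
find_uber_jar() {
  root="${1:-.}"
  # ls -t sorts by modification time, newest first; prints nothing if no jar exists.
  ls -t "$root"/spark/target/scala-2.12/spark_uber-assembly-*.jar 2>/dev/null | head -n 1
}

# Usage: find_uber_jar "$CHRONON_OS"
```

The file name it prints shows which version string to use in the import statement.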