Commit

prep for renaming master branch to main
nikhilsimha committed Feb 21, 2024
1 parent ac5095b commit b724016
Showing 23 changed files with 88 additions and 88 deletions.
36 changes: 18 additions & 18 deletions CONTRIBUTE.md
@@ -183,37 +183,37 @@ Below is a list of resources that can be useful for development and debugging.
## Docs

(Docsite)[https://chronon.ai]
-(doc directory)[https://github.com/airbnb/chronon/tree/master/docs/source]
+(doc directory)[https://github.com/airbnb/chronon/tree/main/docs/source]
(Code of conduct)[TODO]

## Links:

(pip project)[https://pypi.org/project/chronon-ai/]
-(maven central)[https://mvnrepository.com/artifact/ai.chronon/]: (publishing)[https://github.com/airbnb/chronon/blob/master/devnotes.md#publishing-all-the-artifacts-of-chronon]
-(Docsite: publishing)[https://github.com/airbnb/chronon/blob/master/devnotes.md#chronon-artifacts-publish-process]
+(maven central)[https://mvnrepository.com/artifact/ai.chronon/]: (publishing)[https://github.com/airbnb/chronon/blob/main/devnotes.md#publishing-all-the-artifacts-of-chronon]
+(Docsite: publishing)[https://github.com/airbnb/chronon/blob/main/devnotes.md#chronon-artifacts-publish-process]


## Code Pointers

-Api - (Thrift)[https://github.com/airbnb/chronon/blob/master/api/thrift/api.thrift#L180], (Python)[https://github.com/airbnb/chronon/blob/master/api/py/ai/chronon/group_by.py]
-(CLI driver entry point for job launching.)[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/Driver.scala]
+Api - (Thrift)[https://github.com/airbnb/chronon/blob/main/api/thrift/api.thrift#L180], (Python)[https://github.com/airbnb/chronon/blob/main/api/py/ai/chronon/group_by.py]
+(CLI driver entry point for job launching.)[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/Driver.scala]

**Offline flows that produce hive tables or file output**
-(GroupBy)[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/GroupBy.scala]
-(Staging Query)[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/StagingQuery.scala]
-(Join backfills)[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/Join.scala]
-(Metadata Export)[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/MetadataExporter.scala]
+(GroupBy)[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/GroupBy.scala]
+(Staging Query)[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/StagingQuery.scala]
+(Join backfills)[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/Join.scala]
+(Metadata Export)[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/MetadataExporter.scala]
Online flows that update and read data & metadata from the kvStore
-(GroupBy window tail upload )[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala]
-(Streaming window head upload)[https://github.com/airbnb/chronon/blob/master/spark/src/main/scala/ai/chronon/spark/streaming/GroupBy.scala]
-(Fetching)[https://github.com/airbnb/chronon/blob/master/online/src/main/scala/ai/chronon/online/Fetcher.scala]
+(GroupBy window tail upload )[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/GroupByUpload.scala]
+(Streaming window head upload)[https://github.com/airbnb/chronon/blob/main/spark/src/main/scala/ai/chronon/spark/streaming/GroupBy.scala]
+(Fetching)[https://github.com/airbnb/chronon/blob/main/online/src/main/scala/ai/chronon/online/Fetcher.scala]
Aggregations
-(time based aggregations)[https://github.com/airbnb/chronon/blob/master/aggregator/src/main/scala/ai/chronon/aggregator/base/TimedAggregators.scala]
-(time independent aggregations)[https://github.com/airbnb/chronon/blob/master/aggregator/src/main/scala/ai/chronon/aggregator/base/SimpleAggregators.scala]
-(integration point with rest of chronon)[https://github.com/airbnb/chronon/blob/master/aggregator/src/main/scala/ai/chronon/aggregator/row/ColumnAggregator.scala#L223]
-(Windowing)[https://github.com/airbnb/chronon/tree/master/aggregator/src/main/scala/ai/chronon/aggregator/windowing]
+(time based aggregations)[https://github.com/airbnb/chronon/blob/main/aggregator/src/main/scala/ai/chronon/aggregator/base/TimedAggregators.scala]
+(time independent aggregations)[https://github.com/airbnb/chronon/blob/main/aggregator/src/main/scala/ai/chronon/aggregator/base/SimpleAggregators.scala]
+(integration point with rest of chronon)[https://github.com/airbnb/chronon/blob/main/aggregator/src/main/scala/ai/chronon/aggregator/row/ColumnAggregator.scala#L223]
+(Windowing)[https://github.com/airbnb/chronon/tree/main/aggregator/src/main/scala/ai/chronon/aggregator/windowing]

**Testing**
-(Testing - sbt commands)[https://github.com/airbnb/chronon/blob/master/devnotes.md#testing]
+(Testing - sbt commands)[https://github.com/airbnb/chronon/blob/main/devnotes.md#testing]
(Automated testing - circle-ci pipelines)[https://app.circleci.com/pipelines/github/airbnb/chronon]
-(Dev Setup)[https://github.com/airbnb/chronon/blob/master/devnotes.md#prerequisites]
+(Dev Setup)[https://github.com/airbnb/chronon/blob/main/devnotes.md#prerequisites]
14 changes: 7 additions & 7 deletions README.md
@@ -59,7 +59,7 @@ Does not include:

## Setup

-To get started with the Chronon, all you need to do is download the [docker-compose.yml](https://github.com/airbnb/chronon/blob/master/docker-compose.yml) file and run it locally:
+To get started with the Chronon, all you need to do is download the [docker-compose.yml](https://github.com/airbnb/chronon/blob/main/docker-compose.yml) file and run it locally:

```bash
curl -o docker-compose.yml https://chronon.ai/docker-compose.yml
```
@@ -74,7 +74,7 @@ In this example, let's assume that we're a large online retailer, and we've dete

## Raw data sources

-Fabricated raw data is included in the [data](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/data) directory. It includes four tables:
+Fabricated raw data is included in the [data](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/data) directory. It includes four tables:

1. Users - includes basic information about users such as account created date; modeled as a batch data source that updates daily
2. Purchases - a log of all purchases by users; modeled as a log table with a streaming (i.e. Kafka) event-bus counterpart
@@ -141,11 +141,11 @@ v1 = GroupBy(
)
```

-See the whole code file here: [purchases GroupBy](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/purchases.py). This is also in your docker image. We'll be running computation for it and the other GroupBys in [Step 3 - Backfilling Data](#step-3---backfilling-data).
+See the whole code file here: [purchases GroupBy](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/purchases.py). This is also in your docker image. We'll be running computation for it and the other GroupBys in [Step 3 - Backfilling Data](#step-3---backfilling-data).

**Feature set 2: Returns data features**

-We perform a similar set of aggregations on returns data in the [returns GroupBy](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/returns.py). The code is not included here because it looks similar to the above example.
+We perform a similar set of aggregations on returns data in the [returns GroupBy](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/returns.py). The code is not included here because it looks similar to the above example.

**Feature set 3: User data features**

@@ -167,7 +167,7 @@ v1 = GroupBy(
)
```

-Taken from the [users GroupBy](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/users.py).
+Taken from the [users GroupBy](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/users.py).


### Step 2 - Join the features together
@@ -200,7 +200,7 @@ v1 = Join(
)
```

-Taken from the [training_set Join](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/joins/quickstart/training_set.py).
+Taken from the [training_set Join](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/joins/quickstart/training_set.py).

The `left` side of the join is what defines the timestamps and primary keys for the backfill (notice that it is built on top of the `checkout` event, as dictated by our use case).
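To make the role of the `left` side concrete, here is a toy, plain-Python sketch (not the Chronon API; table names and values are invented) of a point-in-time backfill: each left row's key and timestamp select which right-side events count toward the feature value.

```python
# Right-side events: (user, event_ts, purchase_amount) -- invented sample data.
purchases = [("u1", 100, 5.0), ("u1", 200, 7.0), ("u2", 150, 3.0)]
# Left side (e.g. checkout events): each row supplies the primary key and
# the timestamp at which features must be computed.
left = [("u1", 150), ("u1", 250), ("u2", 100)]

def backfill(left_rows, events):
    # For each left row, aggregate only events with the same key that
    # happened at or before the left row's timestamp (point-in-time join).
    out = []
    for user, ts in left_rows:
        total = sum(amt for u, ets, amt in events if u == user and ets <= ts)
        out.append((user, ts, total))
    return out

print(backfill(left, purchases))
# [('u1', 150, 5.0), ('u1', 250, 12.0), ('u2', 100, 0)]
```

This is only meant to show why the left side dictates timestamps and keys; the real computation is distributed and windowed.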

@@ -370,7 +370,7 @@ Using chronon for your feature engineering work simplifies and improves your ML
4. Chronon exposes easy endpoints for feature fetching.
5. Consistency is guaranteed and measurable.

-For a more detailed view into the benefits of using Chronon, see [Benefits of Chronon documentation](https://github.com/airbnb/chronon/tree/master?tab=readme-ov-file#benefits-of-chronon-over-other-approaches).
+For a more detailed view into the benefits of using Chronon, see [Benefits of Chronon documentation](https://github.com/airbnb/chronon/tree/main?tab=readme-ov-file#benefits-of-chronon-over-other-approaches).


# Benefits of Chronon over other approaches
@@ -411,7 +411,7 @@ class FrequentItems[T: FrequentItemsFriendly](val mapSize: Int, val errorType: E
// See: Back to the future: an even more nearly optimal cardinality estimation algorithm, 2017
// https://arxiv.org/abs/1708.06839
// refer to the chart here to tune your sketch size with lgK
-// https://github.com/apache/incubator-datasketches-java/blob/master/src/main/java/org/apache/datasketches/cpc/CpcSketch.java#L180
+// https://github.com/apache/incubator-datasketches-java/blob/main/src/main/java/org/apache/datasketches/cpc/CpcSketch.java#L180
// default is about 1200 bytes
class ApproxDistinctCount[Input: CpcFriendly](lgK: Int = 8) extends SimpleAggregator[Input, CpcSketch, Long] {
override def outputType: DataType = LongType
@@ -22,7 +22,7 @@ import java.util

case class BankersEntry[IR](var value: IR, ts: Long)

-// ported from: https://github.com/IBM/sliding-window-aggregators/blob/master/rust/src/two_stacks_lite/mod.rs with some
+// ported from: https://github.com/IBM/sliding-window-aggregators/blob/main/rust/src/two_stacks_lite/mod.rs with some
// modification to work with simple aggregator
class TwoStackLiteAggregationBuffer[Input, IR >: Null, Output >: Null](aggregator: SimpleAggregator[Input, IR, Output],
maxSize: Int) {
2 changes: 1 addition & 1 deletion airflow/helpers.py
@@ -66,7 +66,7 @@ def safe_part(p):
return re.sub("[^A-Za-z0-9_]", "__", safe_name)


-# https://github.com/airbnb/chronon/blob/master/api/src/main/scala/ai/chronon/api/Extensions.scala
+# https://github.com/airbnb/chronon/blob/main/api/src/main/scala/ai/chronon/api/Extensions.scala
def sanitize(name):
return re.sub("[^a-zA-Z0-9_]", "_", name)
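For reference, the `sanitize` helper touched above is small enough to exercise directly; here is a standalone check (the example inputs are ours):

```python
import re

# Reproduced from airflow/helpers.py for illustration: every character
# outside [a-zA-Z0-9_] is replaced by a single underscore.
def sanitize(name):
    return re.sub("[^a-zA-Z0-9_]", "_", name)

print(sanitize("my-team.group_by/v1"))  # my_team_group_by_v1
```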

2 changes: 1 addition & 1 deletion api/py/ai/chronon/group_by.py
@@ -61,7 +61,7 @@ class Operation:
APPROX_UNIQUE_COUNT = ttypes.Operation.APPROX_UNIQUE_COUNT
# refer to the chart here to tune your sketch size with lgK
# default is 8
-# https://github.com/apache/incubator-datasketches-java/blob/master/src/main/java/org/apache/datasketches/cpc/CpcSketch.java#L180
+# https://github.com/apache/incubator-datasketches-java/blob/main/src/main/java/org/apache/datasketches/cpc/CpcSketch.java#L180
APPROX_UNIQUE_COUNT_LGK = collector(ttypes.Operation.APPROX_UNIQUE_COUNT)
UNIQUE_COUNT = ttypes.Operation.UNIQUE_COUNT
COUNT = ttypes.Operation.COUNT
2 changes: 1 addition & 1 deletion api/py/setup.py
@@ -27,7 +27,7 @@


__version__ = "local"
-__branch__ = "master"
+__branch__ = "main"
def get_version():
version_str = os.environ.get("CHRONON_VERSION_STR", __version__)
branch_str = os.environ.get("CHRONON_BRANCH_STR", __branch__)
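The diff truncates `get_version` after the two lookups. A minimal standalone sketch of the override pattern follows — the return value is our illustrative assumption, not the file's actual behavior:

```python
import os

__version__ = "local"
__branch__ = "main"

def get_version():
    # Environment variables take precedence over the module-level defaults.
    version_str = os.environ.get("CHRONON_VERSION_STR", __version__)
    branch_str = os.environ.get("CHRONON_BRANCH_STR", __branch__)
    # How the two strings are combined is elided in the diff; returning
    # both as a tuple is our illustrative choice.
    return version_str, branch_str

print(get_version())  # ('local', 'main') when neither env var is set
```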
4 changes: 2 additions & 2 deletions build.sbt
@@ -94,8 +94,8 @@ git.gitTagToVersionNumber := { tag: String =>
// Git plugin will automatically add SNAPSHOT for dirty workspaces so remove it to avoid duplication.
val versionStr = if (git.gitUncommittedChanges.value) version.value.replace("-SNAPSHOT", "") else version.value
val branchTag = git.gitCurrentBranch.value.replace("/", "-")
-if (branchTag == "master") {
-// For master branches, we tag the packages as <package-name>-<build-version>
+if (branchTag == "main" || branchTag == "master") {
+// For main branches, we tag the packages as <package-name>-<build-version>
Some(s"${versionStr}")
} else {
// For user branches, we tag the packages as <package-name>-<user-branch>-<build-version>
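The branch-based artifact naming this sbt logic implements can be sketched in Python (an illustration of the convention described in the diff's comments; the function and argument names are ours):

```python
def artifact_tag(branch: str, version: str, package: str) -> str:
    # Release branches (main, and master during the rename transition)
    # publish as <package-name>-<build-version>; user branches include
    # the sanitized branch name so they don't clobber release artifacts.
    branch_tag = branch.replace("/", "-")
    if branch_tag in ("main", "master"):
        return f"{package}-{version}"
    return f"{package}-{branch_tag}-{version}"

print(artifact_tag("main", "0.7.0", "chronon_2.11"))
# chronon_2.11-0.7.0
print(artifact_tag("user/feature-x", "0.7.0", "chronon_2.11"))
# chronon_2.11-user-feature-x-0.7.0
```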
4 changes: 2 additions & 2 deletions build.sh
@@ -7,8 +7,8 @@
set -euxo pipefail

BRANCH="$(git rev-parse --abbrev-ref HEAD)"
-if [[ "$BRANCH" != "master" ]]; then
-echo "$(tput bold) You are not on master!"
+if [[ "$BRANCH" != "main" ]]; then
+echo "$(tput bold) You are not on main branch!"
echo "$(tput sgr0) Are you sure you want to release? (y to continue)"
read response
if [[ "$response" != "y" ]]; then
8 changes: 4 additions & 4 deletions devnotes.md
@@ -104,7 +104,7 @@ sbt python_api

Note: This will create the artifacts with the version specific naming specified under `version.sbt`
```text
-Builds on master will result in:
+Builds on main branch will result in:
<artifact-name>-<version>.jar
[JARs] chronon_2.11-0.7.0-SNAPSHOT.jar
[Python] chronon-ai-0.7.0-SNAPSHOT.tar.gz
@@ -227,15 +227,15 @@ This command will take into the account of `version.sbt` and handles a series of
2. Select "refresh" and "release"
3. Wait for 30 mins to sync to [maven](https://repo1.maven.org/maven2/) or [sonatype UI](https://search.maven.org/search?q=g:ai.chronon)
4. Push the local release commits (DO NOT SQUASH), and the new tag created from step 1 to Github.
-1. chronon repo disallow push to master directly, so instead push commits to a branch `git push origin master:your-name--release-xxx`
+1. chronon repo disallow push to main branch directly, so instead push commits to a branch `git push origin main:your-name--release-xxx`
2. your PR should contain exactly two commits, 1 setting the release version, 1 setting the new snapshot version.
3. make sure to use **Rebase pull request** instead of the regular Merge or Squash options when merging the PR.
-5. Push release tag to master branch
+5. Push release tag to main branch
1. tag new version to release commit `Setting version to 0.0.xx`. If not already tagged, can be added by
```
git tag -fa v0.0.xx <commit-sha>
```
-2. push tag to master
+2. push tag
```
git push origin <tag-name>
```
2 changes: 1 addition & 1 deletion docs/source/Code_Guidelines.md
@@ -69,4 +69,4 @@ in terms of power. Also Spark APIs are mainly in Scala2.
Every new behavior should be unit-tested. We have implemented a fuzzing framework
that can produce data randomly as scala objects or
spark tables - [see](../../spark/src/test/scala/ai/chronon/spark/test/DataFrameGen.scala). Use it for testing.
-Python code is also covered by tests - [see](https://github.com/airbnb/chronon/tree/master/api/py/test).
+Python code is also covered by tests - [see](https://github.com/airbnb/chronon/tree/main/api/py/test).
4 changes: 2 additions & 2 deletions docs/source/authoring_features/ChainingFeatures.md
@@ -79,9 +79,9 @@ enriched_listings = Join(

```
### Configuration Example
[Chaining GroupBy](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/sample_team/sample_chaining_group_by.py)
[Chaining GroupBy](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/sample_team/sample_chaining_group_by.py)

[Chaining Join](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/joins/sample_team/sample_chaining_join.py)
[Chaining Join](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/joins/sample_team/sample_chaining_join.py)

## Clarifications
- The goal of chaining is to use output of a Join as input to downstream computations like GroupBy or a Join. As of today we support the case 1 and case 2 in future plan
12 changes: 6 additions & 6 deletions docs/source/authoring_features/GroupBy.md
@@ -27,7 +27,7 @@ This can be achieved by using the output of one `GroupBy` as the input to the ne

## Supported aggregations

-All supported aggregations are defined [here](https://github.com/airbnb/chronon/blob/master/api/thrift/api.thrift#L51).
+All supported aggregations are defined [here](https://github.com/airbnb/chronon/blob/main/api/thrift/api.thrift#L51).
Chronon supports powerful aggregation patterns and the section below goes into detail of the properties and behaviors
of aggregations.

@@ -181,7 +181,7 @@ If you look at the parameters column in the above table - you will see `k`.

For approx_unique_count and approx_percentile - k stands for the size of the `sketch` - the larger this is, the more
accurate and expensive to compute the results will be. Mapping between k and size for approx_unique_count is
-[here](https://github.com/apache/incubator-datasketches-java/blob/master/src/main/java/org/apache/datasketches/cpc/CpcSketch.java#L180)
+[here](https://github.com/apache/incubator-datasketches-java/blob/main/src/main/java/org/apache/datasketches/cpc/CpcSketch.java#L180)
for approx_percentile is the first table in [here](https://datasketches.apache.org/docs/KLL/KLLAccuracyAndSize.html).
`percentiles` for `approx_percentile` is an array of doubles between 0 and 1, where you want percentiles at. (Ex: "[0.25, 0.5, 0.75]")
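To clarify what the `percentiles` array means, here is a plain-Python illustration of requesting values at [0.25, 0.5, 0.75] — a simple nearest-rank computation over exact data, not Chronon's sketch-based `approx_percentile`:

```python
# Nearest-rank style percentile over sorted values; an illustrative
# choice of method, not Chronon's actual algorithm.
def percentile(sorted_vals, q):
    idx = min(int(q * len(sorted_vals)), len(sorted_vals) - 1)
    return sorted_vals[idx]

vals = sorted([3, 1, 4, 1, 5, 9, 2, 6])
# One output value per requested quantile, mirroring the shape of the
# `percentiles` parameter, e.g. "[0.25, 0.5, 0.75]".
print([percentile(vals, q) for q in [0.25, 0.5, 0.75]])  # [2, 4, 6]
```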

@@ -193,7 +193,7 @@ The following examples are broken down by source type. We strongly suggest makin

## Realtime Event GroupBy examples

-This example is based on the [returns](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/returns.py) GroupBy from the quickstart guide that performs various aggregations over the `refund_amt` column over various windows.
+This example is based on the [returns](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/returns.py) GroupBy from the quickstart guide that performs various aggregations over the `refund_amt` column over various windows.

```python
source = Source(
@@ -236,7 +236,7 @@

## Bucketed GroupBy Example

-In this example we take the [Purchases GroupBy](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/purchases.py) from the Quickstart tutorial and modify it to include buckets based on a hypothetical `"credit_card_type"` column.
+In this example we take the [Purchases GroupBy](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/purchases.py) from the Quickstart tutorial and modify it to include buckets based on a hypothetical `"credit_card_type"` column.

```python
source = Source(
@@ -283,7 +283,7 @@ v1 = GroupBy(

## Simple Batch Event GroupBy examples

-Example GroupBy with windowed aggregations. Taken from [purchases.py](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/purchases.py).
+Example GroupBy with windowed aggregations. Taken from [purchases.py](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/purchases.py).

Important things to note about this case relative to the streaming GroupBy:
* The default accuracy here is `SNAPSHOT` meaning that updates to the online KV store only happen in batch, and also backfills will be midnight accurate rather than intra day accurate.
@@ -329,7 +329,7 @@ v1 = GroupBy(

### Batch Entity GroupBy examples

-This is taken from the [Users GroupBy](https://github.com/airbnb/chronon/blob/master/api/py/test/sample/group_bys/quickstart/users.py) from the quickstart tutorial.
+This is taken from the [Users GroupBy](https://github.com/airbnb/chronon/blob/main/api/py/test/sample/group_bys/quickstart/users.py) from the quickstart tutorial.


```python