Merge branch 'main' into c2r_oom
zhztheplayer authored Jul 18, 2024
2 parents 9aaf96b + 0c81db9 commit 7922ee3
Showing 529 changed files with 23,539 additions and 5,232 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/build_bundle_package.yml
@@ -15,6 +15,9 @@

name: Build bundle package

env:
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true

concurrency:
group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
cancel-in-progress: true
7 changes: 3 additions & 4 deletions .github/workflows/clickhouse_be_trigger.yml
@@ -22,14 +22,13 @@ on:
- '.github/workflows/clickhouse_be_trigger.yml'
- 'pom.xml'
- 'backends-clickhouse/**'
- 'gluten-celeborn/common'
- 'gluten-celeborn/package'
- 'gluten-celeborn/clickhouse'
- 'gluten-celeborn/common/**'
- 'gluten-celeborn/package/**'
- 'gluten-celeborn/clickhouse/**'
- 'gluten-core/**'
- 'gluten-ut/**'
- 'shims/**'
- 'tools/gluten-it/**'
- 'tools/gluten-te/**'
- 'cpp-ch/**'

jobs:
33 changes: 18 additions & 15 deletions .github/workflows/velox_docker.yml
@@ -33,7 +33,6 @@ on:
- 'gluten-ut/**'
- 'shims/**'
- 'tools/gluten-it/**'
- 'tools/gluten-te/**'
- 'ep/build-velox/**'
- 'cpp/*'
- 'cpp/CMake/**'
@@ -42,6 +41,7 @@ on:
- 'dev/**'

env:
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
MVN_CMD: 'mvn -ntp'

concurrency:
@@ -51,7 +51,7 @@ concurrency:
jobs:
build-native-lib-centos-7:
runs-on: ubuntu-20.04
container: apache/gluten:gluten-vcpkg-builder_2024_05_29 # centos7 with dependencies installed
container: apache/gluten:gluten-vcpkg-builder_2024_07_11 # centos7 with dependencies installed
steps:
- uses: actions/checkout@v2
- name: Generate cache key
@@ -68,6 +68,7 @@ jobs:
- name: Build Gluten Velox third party
if: ${{ steps.cache.outputs.cache-hit != 'true' }}
run: |
df -a
source dev/ci-velox-buildstatic.sh
- name: Upload Artifact Native
uses: actions/upload-artifact@v2
@@ -192,10 +193,11 @@ jobs:
name: velox-arrow-jar-centos-7-${{github.sha}}
path: /root/.m2/repository/org/apache/arrow/
- name: Update mirror list
if: matrix.os == 'centos:8'
run: |
sed -i -e "s|mirrorlist=|#mirrorlist=|g" /etc/yum.repos.d/CentOS-* || true
sed -i -e "s|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g" /etc/yum.repos.d/CentOS-* || true
if [ "${{ matrix.os }}" = "centos:7" ] || [ "${{ matrix.os }}" = "centos:8" ]; then
sed -i -e "s|mirrorlist=|#mirrorlist=|g" /etc/yum.repos.d/CentOS-* || true
sed -i -e "s|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g" /etc/yum.repos.d/CentOS-* || true
fi
- name: Setup java and maven
run: |
if [ "${{ matrix.java }}" = "java-17" ]; then
@@ -334,8 +336,7 @@ jobs:
-d=FLUSH_MODE:DISABLED,spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation=false,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=100,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0 \
-d=FLUSH_MODE:ABANDONED,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=0,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0 \
-d=FLUSH_MODE:FLUSHED,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=0.05,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=0.1,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=100,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0
- name: (To be fixed) TPC-DS SF30.0 Parquet local spark3.2 Q23A/Q23B low memory, memory isolation on
if: false # Disabled as error https://gist.github.com/zhztheplayer/abd5e83ccdc48730678ae7ebae479fcc
- name: (To be fixed) TPC-DS SF30.0 Parquet local spark3.2 Q23A/Q23B low memory, memory isolation on # Disabled as error https://gist.github.com/zhztheplayer/abd5e83ccdc48730678ae7ebae479fcc
run: |
cd tools/gluten-it \
&& GLUTEN_IT_JVM_ARGS=-Xmx3G sbin/gluten-it.sh parameterized \
@@ -345,8 +346,8 @@ jobs:
-d=OFFHEAP_SIZE:2g,spark.memory.offHeap.size=2g \
-d=FLUSH_MODE:DISABLED,spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation=false,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=100,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0 \
-d=FLUSH_MODE:ABANDONED,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=1.0,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=0,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0 \
-d=FLUSH_MODE:FLUSHED,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=0.05,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=0.1,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=100,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0
- name: (To be fixed) TPC-DS SF30.0 Parquet local spark3.2 Q97 low memory # The case currently causes crash with "free: invalid size".
-d=FLUSH_MODE:FLUSHED,spark.gluten.sql.columnar.backend.velox.maxPartialAggregationMemoryRatio=0.05,spark.gluten.sql.columnar.backend.velox.maxExtendedPartialAggregationMemoryRatio=0.1,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinPct=100,spark.gluten.sql.columnar.backend.velox.abandonPartialAggregationMinRows=0 || true
- name: TPC-DS SF30.0 Parquet local spark3.2 Q97 low memory
run: |
cd tools/gluten-it \
&& GLUTEN_IT_JVM_ARGS=-Xmx3G sbin/gluten-it.sh parameterized \
@@ -532,7 +533,7 @@ jobs:
fail-fast: false
matrix:
spark: [ "spark-3.2" ]
celeborn: [ "celeborn-0.4.1", "celeborn-0.3.2-incubating" ]
celeborn: [ "celeborn-0.5.0", "celeborn-0.4.1", "celeborn-0.3.2-incubating" ]
runs-on: ubuntu-20.04
container: ubuntu:22.04
steps:
@@ -563,8 +564,10 @@ jobs:
- name: TPC-H SF1.0 && TPC-DS SF1.0 Parquet local spark3.2 with ${{ matrix.celeborn }}
run: |
EXTRA_PROFILE=""
if [ "${{ matrix.celeborn }}" = "celeborn-0.4.0" ]; then
if [ "${{ matrix.celeborn }}" = "celeborn-0.4.1" ]; then
EXTRA_PROFILE="-Pceleborn-0.4"
elif [ "${{ matrix.celeborn }}" = "celeborn-0.5.0" ]; then
EXTRA_PROFILE="-Pceleborn-0.5"
fi
echo "EXTRA_PROFILE: ${EXTRA_PROFILE}"
cd /opt && mkdir -p celeborn && \
@@ -616,6 +619,10 @@ jobs:
install_arrow_deps
./dev/builddeps-veloxbe.sh --run_setup_script=OFF --enable_ep_cache=OFF --build_tests=ON \
--build_examples=ON --build_benchmarks=ON --build_protobuf=ON
- name: Gluten CPP Test
run: |
cd ./cpp/build && \
ctest -V
- uses: actions/upload-artifact@v2
with:
name: velox-native-lib-centos-8-${{github.sha}}
@@ -681,10 +688,6 @@ jobs:
working-directory: ${{ github.workspace }}
run: |
mkdir -p '${{ env.CCACHE_DIR }}'
- name: Gluten CPP Test
run: |
cd $GITHUB_WORKSPACE/cpp/build && \
ctest -V
- name: Prepare spark.test.home for Spark 3.2.2 (other tests)
run: |
cd $GITHUB_WORKSPACE/ && \
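The Celeborn hunk above maps each matrix entry to a Maven profile. In the lines shown, the earlier check compared against `celeborn-0.4.0`, which is not among the matrix values in this diff, so `EXTRA_PROFILE` would have stayed empty for `celeborn-0.4.1`. A minimal sketch of the same mapping as a standalone shell function follows. The function name `select_celeborn_profile` and the pattern match on the minor version are assumptions for illustration and are not part of the workflow.

```shell
# Hypothetical helper: map a Celeborn matrix entry to a Maven profile flag.
# The profile names come from the diff; matching on the minor version
# (0.4.*, 0.5.*) is an assumption made for this sketch.
select_celeborn_profile() {
  case "$1" in
    celeborn-0.4.*) echo "-Pceleborn-0.4" ;;
    celeborn-0.5.*) echo "-Pceleborn-0.5" ;;
    *)              echo "" ;;  # e.g. celeborn-0.3.2-incubating: no extra profile
  esac
}

EXTRA_PROFILE="$(select_celeborn_profile "celeborn-0.5.0")"
echo "EXTRA_PROFILE: ${EXTRA_PROFILE}"
```

Matching on the minor version rather than an exact string would keep the check from going stale when only the patch version in the matrix is bumped.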
7 changes: 5 additions & 2 deletions .github/workflows/velox_docker_cache.yml
@@ -20,14 +20,17 @@ on:
branches:
- 'main'

env:
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true

concurrency:
group: ${{ github.repository }}-${{ github.workflow }}
cancel-in-progress: false

jobs:
cache-native-lib:
runs-on: ubuntu-20.04
container: apache/gluten:gluten-vcpkg-builder_2024_05_29 # centos7 with dependencies installed
container: apache/gluten:gluten-vcpkg-builder_2024_07_11 # centos7 with dependencies installed
steps:
- uses: actions/checkout@v2
- name: Generate cache key
@@ -126,4 +129,4 @@ jobs:
# - uses: actions/cache/save@v3
# with:
# path: '${{ env.CCACHE_DIR }}'
# key: ccache-centos-release-default
# key: ccache-centos-release-default
File renamed without changes.
File renamed without changes.
@@ -13,14 +13,17 @@
# See the License for the specific language governing permissions and
# limitations under the License.

name: Velox backend nightly job
name: Velox backend weekly job

on:
pull_request:
paths:
- '.github/workflows/velox_nightly.yml'
- '.github/workflows/velox_weekly.yml'
schedule:
- cron: '0 20 * * *'
- cron: '0 20 * * 0'

env:
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true

concurrency:
group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
@@ -37,25 +40,28 @@ jobs:
steps:
- uses: actions/checkout@v2
- name: Update mirror list
if: matrix.os == 'centos:8'
run: |
sed -i -e "s|mirrorlist=|#mirrorlist=|g" /etc/yum.repos.d/CentOS-* || true
sed -i -e "s|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g" /etc/yum.repos.d/CentOS-* || true
- name: build
run: |
yum update -y
yum install -y epel-release sudo dnf
if [ "${{ matrix.os }}" = "centos:8" ]; then
if [ "${{ matrix.os }}" = "centos:7" ]; then
yum install -y centos-release-scl
rm /etc/yum.repos.d/CentOS-SCLo-scl.repo -f
sed -i \
-e 's/^mirrorlist/#mirrorlist/' \
-e 's/^#baseurl/baseurl/' \
-e 's/mirror\.centos\.org/vault.centos.org/' \
/etc/yum.repos.d/CentOS-SCLo-scl-rh.repo
else
dnf install -y --setopt=install_weak_deps=False gcc-toolset-9
source /opt/rh/gcc-toolset-9/enable || exit 1
else
yum install -y centos-release-scl
yum install -y devtoolset-9
source /opt/rh/devtoolset-9/enable || exit 1
fi
yum install -y java-1.8.0-openjdk-devel patch wget git && \
yum install -y java-1.8.0-openjdk-devel patch wget git
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk && \
export PATH=$JAVA_HOME/bin:$PATH && \
export PATH=$JAVA_HOME/bin:$PATH
wget --no-check-certificate https://downloads.apache.org/maven/maven-3/3.8.8/binaries/apache-maven-3.8.8-bin.tar.gz && \
tar -xvf apache-maven-3.8.8-bin.tar.gz && \
mv apache-maven-3.8.8 /usr/lib/maven && \
@@ -76,7 +82,8 @@ jobs:
- name: build
run: |
# To avoid the prompt for region selection during installing tzdata.
export DEBIAN_FRONTEND="noninteractive"
apt-get update && apt-get install -y sudo openjdk-8-jdk maven wget git
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get install -y sudo maven wget git
sudo apt-get install -y openjdk-8-jdk
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
cd $GITHUB_WORKSPACE/ && ./dev/package.sh
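The two `sed` expressions in the "Update mirror list" steps comment out `mirrorlist=` entries and point commented-out `baseurl` entries at `vault.centos.org`. A small sketch of their effect on sample repo lines is given here. It assumes a hypothetical filter function `rewrite_repo` that reads stdin instead of editing `/etc/yum.repos.d/CentOS-*` in place, and the sample input lines are illustrative rather than copied from a real repo file.

```shell
# Hypothetical filter: the same two substitutions as the workflow step,
# applied to stdin so the effect can be inspected without touching /etc.
rewrite_repo() {
  sed -e 's|mirrorlist=|#mirrorlist=|g' \
      -e 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g'
}

# Illustrative input lines (assumed, not taken from an actual CentOS repo file).
printf '%s\n' \
  'mirrorlist=http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os' \
  '#baseurl=http://mirror.centos.org/centos/7/os/x86_64/' \
  | rewrite_repo
```

The first line comes out commented, and the second comes out uncommented with the host replaced, which is the state `yum` needs once the regular mirrors stop serving an end-of-life release.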
10 changes: 0 additions & 10 deletions DISCLAIMER

This file was deleted.

18 changes: 18 additions & 0 deletions DISCLAIMER-WIP
@@ -0,0 +1,18 @@
Apache Gluten (Incubating) is an effort undergoing incubation at the Apache
Software Foundation (ASF), sponsored by the Apache Incubator PMC.

Incubation is required of all newly accepted projects until a further review
indicates that the infrastructure, communications, and decision making process
have stabilized in a manner consistent with other successful ASF projects.

While incubation status is not necessarily a reflection of the completeness
or stability of the code, it does indicate that the project has yet to be
fully endorsed by the ASF.

Some of the incubating project’s releases may not be fully compliant with ASF policy.
For example, releases may have incomplete or un-reviewed licensing conditions.
What follows is a list of issues the project is currently aware of (this list is likely to be incomplete):

* Releases may have incomplete licensing conditions.

If you are planning to incorporate this work into your product/project, please be aware that you will need to conduct a thorough licensing review to determine the overall implications of including this work. For the current status of this project through the Apache Incubator, visit: https://incubator.apache.org/projects/gluten.html