Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-44950: [C++] Bump minimum CMake version to 3.25 #44989

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

raulcd
Copy link
Member

@raulcd raulcd commented Dec 10, 2024

This is currently under development. I am only exercising CI to find out what things would require to be updated
This is still being decided and a decision hasn't been finalized yet.

Rationale for this change

We want to upgrade our CMake version to 3.25 as discussed on the ML:
https://lists.apache.org/thread/h8jp16ktrj11fmjmjhlg6xvkvv9wzvjk

What changes are included in this PR?

TBD

Are these changes tested?

Yes, via CI.

Are there any user-facing changes?

Yes, the minimum CMake version to be used to build Arrow is bumped to 3.25.
This PR includes breaking changes to build systems.

Copy link

⚠️ GitHub issue #44950 has been automatically assigned in GitHub to PR creator.

@raulcd
Copy link
Member Author

raulcd commented Dec 10, 2024

I am unsure on how to fix the remaining failures for R.
For the gcc 12 job I am unsure why it is failing and for the Windows C++ RTools 40 ucrt64 it seems we install CMake from MINGW here but I am not sure if this is necessary or can be updated.
https://github.com/apache/arrow/blob/main/ci/scripts/PKGBUILD#L39

Of course this is the initial CI (we also have to update all the extended CI jobs for crossbow).

@h-vetinari
Copy link
Contributor

For the gcc 12 job I am unsure why it is failing

Updating the CMake lower bound will flip the default of several policies from legacy to new; sounds like you might be relying on legacy behaviour there in some way (once you figure out which policy is at fault, there's usually a migration path to keep the old behaviour)

@raulcd
Copy link
Member Author

raulcd commented Dec 11, 2024

ok, it seems R forces the builds to use the CMake provided on the images:

**** Not using cmake found at /bin/cmake
Error in .make_numeric_version(x, strict, .standard_regexps()$valid_numeric_version) : 
  invalid non-character version specification 'x' (type: double)
Calls: build_libarrow ... as.numeric_version -> numeric_version -> .make_numeric_version
Execution halted

@jonkeane @assignUser @amoeba will this be an issue for CRAN? Are we somehow forced to the CMake version on those images?

root@03abb5b759ba:/# /bin/cmake --version
cmake version 3.22.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).

@assignUser
Copy link
Member

assignUser commented Dec 11, 2024

I went through the logs of our recent checks on cran and only one is using a version < 3.25 and that seems more incidental then purposely as it's the r-odrel arm64 but the intel version has 3.26

So I don't think we should be forced to use that cmake version, additionally we have a function that fetches current cmake if an unsuited version is found but apparently there is an issue with it as seen above. IIRC there was a change to numeric version in one of the las R Versions that is causing this? I'll have a look.

@nealrichardson
Copy link
Member

It's possible the version comparison error is from this: https://github.com/apache/arrow/pull/44989/files#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3deL731-L743

cmake_version <- function(cmd = "cmake") {
  tryCatch(
    {
      raw_version <- system(paste(cmd, "--version"), intern = TRUE, ignore.stderr = TRUE)
      pat <- ".* ([0-9\\.]+).*?"
      which_line <- grep(pat, raw_version)
      package_version(sub(pat, "\\1", raw_version[which_line]))
    },
    error = function(e) {
      return(0)
    }
  )
}

The error case should probably return("0")

@nealrichardson
Copy link
Member

Two other places in the R nixlibs.R script worth updating:

@pitrou
Copy link
Member

pitrou commented Dec 11, 2024

The error case should probably return("0")

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version? Can we add cmake --version somewhere in the GH workflow?

@nealrichardson
Copy link
Member

nealrichardson commented Dec 11, 2024

The error case should probably return("0")

It would probably be more forward-looking to avoid the error entirely. Why does the function fail parsing the CMake version? Can we add cmake --version somewhere in the GH workflow?

I could be remembering wrong, but I believe the function is used to check for cmake of a certain version, and this is to be robust to where cmake may not be installed or not found at the path provided. It does not emit an error, it traps it.

This might not be where the error is coming from that was observed in CI, I was just browsing the source to see where you might get a numeric version error. Looking again, and reading the output it produced, I think we're hitting this: https://github.com/apache/arrow/pull/44989/files#diff-935746c34b16289a07b0d9bf7642dbd268b18059b6187f7cdec7c464be47a3deL718

      } else {
        # Keep trying
        lg("Not using cmake found at %s", path, .indent = "****")
        if (found_version > 0) {
          lg("Version >= %s required; found %s", version_required, found_version, .indent = "*****")
        } else {

should be found_version > "0". We must not have been running into this before because the "found_version" was always sufficient if found, and if it wasn't found, we were returning a numeric 0 which works in this comparison.

(To be clear, we need to fix both this and the return("0") above.)

.env Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Dec 12, 2024
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Dec 13, 2024
@jonkeane
Copy link
Member

I've pushed the changes Neal suggested which should fix the ubuntu failure 🤞 (hope you don't mind, @raulcd !)

@jonkeane
Copy link
Member

I did a bit of digging on the windows front, and I suspect what's going on is the the MSYS2/mingw cmake is being preferred (I'm not familiar enough with that ecosystem to know if we can override it with something on the system, but that also seems fragile itself).

https://github.com/raulcd/arrow/blob/e5d521134db4bed8507572fece3b56eb4a1b9158/ci/scripts/r_windows_build.sh#L26-L30 has some info about how to test newer dependencies, I wonder if we (conditionally) use those pacman commands to insall a newer cmake than what's in the CRAN repo (https://cloud.r-project.org/bin/windows/Rtools/4.0/ucrt64/ has the version I'm seeing installed listed, so if we override that one in our builds that might be sufficient.

@nealrichardson
Copy link
Member

I did a bit of digging on the windows front, and I suspect what's going on is the the MSYS2/mingw cmake is being preferred (I'm not familiar enough with that ecosystem to know if we can override it with something on the system, but that also seems fragile itself).

https://github.com/raulcd/arrow/blob/e5d521134db4bed8507572fece3b56eb4a1b9158/ci/scripts/r_windows_build.sh#L26-L30 has some info about how to test newer dependencies, I wonder if we (conditionally) use those pacman commands to insall a newer cmake than what's in the CRAN repo (https://cloud.r-project.org/bin/windows/Rtools/4.0/ucrt64/ has the version I'm seeing installed listed, so if we override that one in our builds that might be sufficient.

Unfortunately, that won't help us. https://github.com/r-windows/rtools-packages/ is archived; CRAN has moved away from the toolchain that Jeroen was maintaining. We should delete that comment.

For newer cmake with older rtools, maybe we can get it from https://github.com/Kitware/CMake/releases/download/v3.31.2/cmake-3.31.2-windows-x86_64.zip or something? Or maybe we can install newer rtools for cmake but use the older rtools for compilers etc.?

@raulcd
Copy link
Member Author

raulcd commented Dec 17, 2024

I don't understand why the job is installing mingw-w64-ucrt-x86_64-cmake-3.21.3-1-any.pkg.tar.x when the remote package seems to be updated to a newer version: https://packages.msys2.org/packages/mingw-w64-ucrt-x86_64-cmake
And we seem to be updating to pull the newest packages:

+ pacman --noconfirm -Syy
:: Synchronizing package databases...
downloading mingw32.db...
downloading mingw64.db...
downloading ucrt64.db...
downloading mirrors.db...

edit:
Ok, I see this is using CRAN mirrors here: https://cloud.r-project.org/bin/windows/Rtools/4.0/ucrt64/

Probably naive question here, I suppose we can't update the mirrors here to point to msys2 and download the newer cmake from there before building Arrow as we would have the same issue when we try to publish to CRAN, right? @jonkeane @nealrichardson

@nealrichardson
Copy link
Member

Probably naive question here, I suppose we can't update the mirrors here to point to msys2 and download the newer cmake from there before building Arrow as we would have the same issue when we try to publish to CRAN, right? @jonkeane @nealrichardson

Honestly I don't know. But it's not relevant for CRAN because (a) CRAN currently only builds on R 4.3 and 4.4, both of which have new enough cmake, and (b) we don't require cmake on CRAN anyway because we build the libarrow C++ library in our CI and download it in the CRAN build--CRAN only compiles the R bindings, which do not require cmake.

Solving the cmake issue for R < 4.3 is only for our own purposes of building C++ libraries that are compatible with older versions of R on Windows. We support more versions of R than CRAN actively checks on, though it appears that in CI, we only check on the Windows current release version, and when we test old R versions, we do it on linux.

IIUC we're building libarrow with the rtools4.0 toolchain for maximum compatibility. If we want to continue doing that, we could try installing newer cmake somewhere before https://github.com/apache/arrow/blob/main/.github/workflows/r.yml#L293 and making sure it's on the PATH. Or, to bump up, we would change the rtools version in that job from 40 to 43 to use the rtools43 toolchain with newer cmake.

@pitrou
Copy link
Member

pitrou commented Dec 18, 2024

We support more versions of R than CRAN actively checks on

Do we still want to do that? How is our R support policy decided?

@nealrichardson
Copy link
Member

We support more versions of R than CRAN actively checks on

Do we still want to do that? How is our R support policy decided?

We generally follow the tidyverse version support policy, which is the most recent 5 versions. Many enterprise users of R don't upgrade versions eagerly so they can be stuck on older versions longer than CRAN's testing window.

If there is an easy way to get new enough cmake into the R windows-cpp job so that we can keep building with the R 4.0 toolchain, that would be ideal. I don't see why we couldn't do that.

Try installing CMake 3.25 manually on Ubuntu

Install cmake in linux-apt-lint.dockerfile and update install_cmake.sh to not require platform argument

Install CMake 3.25.0 on several Ubuntu jobs and Windows jobs that require the existing version to be bumped

Several fixes to install_cmake

Some more fixes and upgrade minimal examples to use Ubuntu 24.04

Some more fixes for R

Update from CMAKE to CMAKE_VERSION

Ensure that versions returned are strings, bump cmake download

Try forcing MINGW package to pull a newer CMake

Revert trying to pull newer CMake

Try to completely remove mingw CMake from CRAN

/c/cmake/bin/cmake.exe should be present and in path, try with .exe

Try with full path for CMake

Try removing manual installation of CMake and installing CMake via pacman before setting up RTools

Remove manual Installation of CMake from pacman not present

More tries

Temporarily try using msys2 GH action to install cmake

Remove CMake from PKGBUILD

CMake should be in the path

Debug new CMake location

Try with C:\Program Files\CMake\bin\cmake.exe

Just testing at this point

Bundle AWSSDK under RTools (#90)

Check for mingw

Check Windows compiler

Try modifying patch due to redefinition

Try fixing patch

AWS is bundled on Windows R now, do not use MSYS libraries

Fix license header on patch + add CMake condition for AWS patch

Update configure.win to remove specific AWS CPP libs

Update r/configure.win

Co-authored-by: Bryce Mecum <[email protected]>

Update r/configure.win

Test turning ccache back on

Update ci/scripts/r_windows_build.sh

Co-authored-by: Neal Richardson <[email protected]>

Update r/configure.win

Co-authored-by: Neal Richardson <[email protected]>

Review comments

Remove MSYS2_ARG_CONV_EXCL

Testing with manually adding missing rtools40

Add correct slash for path

Add comment on why MSYS2_ARG_CONV_EXCL is required

Use MSYS2_ARG_CONV_EXCL on the same CMake command

Update tutorial examples to use a newer Ubuntu

Add CMAKE_VERSION to cuda jobs

Upgrade minimal Ubuntu to use 24.04 instead of 22.04

Fix minimal build for Ubuntu 24.04

ubuntu-cuda-cpp requires cmake argument

Add cmake to docker-compose.yml missing images

Add missing cmake to some more images

Some more fixing mainly for R binary jobs

Set CMAKE to newly installed path

Revert "Try modifying patch due to redefinition"

This reverts commit b772b23.

Reapply "Try modifying patch due to redefinition"

This reverts commit 1d71375.

Hardcode enum value

I tried to find a proper way to include-guard this but
everything I tried didn't work.

Remove outdated refs to aws libs

Since we are bundling these we don't have these to copy

Move _WIN32_WINNT def into {c,cxx}flags

Ignore warnings-as-errors in cmake
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Jan 21, 2025
Rever CMAKE_VERSION back to CMAKE on .env

Fix typo on file name
@raulcd
Copy link
Member Author

raulcd commented Jan 22, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@raulcd
Copy link
Member Author

raulcd commented Jan 22, 2025

@github-actions crossbow submit centos-7-amd64

This comment was marked as outdated.

@raulcd
Copy link
Member Author

raulcd commented Jan 22, 2025

@github-actions crossbow submit centos-7-amd64

Copy link

Revision: 79de380

Submitted crossbow builds: ursacomputing/crossbow @ actions-53b79b55f3

Task Status
centos-7-amd64 GitHub Actions

@raulcd
Copy link
Member Author

raulcd commented Jan 22, 2025

I am trying to install the newer CMake on the linux-packaging tasks but it's slightly complex to understand the context and where the files might be located. My idea is to reuse the install_cmake.sh script, and I was trying to link or copy the file under ci/scripts into the Dockerfile, but I am struggling to understand where the docker build is executed and what the path should be (I was thinking on making a symbolic link on dev/tasks/linux-packages to ci/scripts if necessary). I tried to parse this logs and we seem to jump around on rake quite a little bit but it's confusing:

cd /runner/_work/crossbow/crossbow/arrow/dev/tasks/linux-packages/../../..
docker build --cache-from apache/arrow-dev:aarch64-almalinux-9-package-apache-arrow --tag apache/arrow-dev:aarch64-almalinux-9-package-apache-arrow --build-arg DEBUG=yes --build-arg FROM=arm64v8/almalinux:9 almalinux-9
cd -
cp /runner/_work/crossbow/crossbow/arrow/dev/tasks/linux-packages/yum/build.sh yum/build.sh
rm -rf yum/tmp
mkdir -p yum/tmp
cp apache-arrow-20.0.0.dev8.tar.gz yum/tmp/apache-arrow-20.0.0.dev8.tar.gz
cd yum
mkdir -p /runner/_work/crossbow/crossbow/arrow/dev/tasks/linux-packages/apache-arrow/yum/build/almalinux-9-aarch64

cc @kou

@kou
Copy link
Member

kou commented Jan 22, 2025

https://github.com/apache/arrow/blob/main/dev/tasks/linux-packages/apache-arrow/yum/centos-7/Dockerfile and other dev/tasks/linux-packages/*/{apt,yum}/*/Dockerfiles are self-contained and no dependency Dockerfile. They use dev/tasks/linux-packages/*/{apt,yum}/*/ as docker build context.

If we change docker build context to the top directory, we can use ci/scripts/*. But it'll introduce another complexity. Could you embed a shell script instead of reusing ci/scripts/*?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants