Big endian PowerPC CI #17258

NattyNarwhal · 2024-12-24T20:08:19Z

I'll clean up the history later, but some notes:

This is a nightly run; it takes approximately half an hour to run to completion.
This runs on Gentoo, because it has good support for ppc64 and I know people familiar with it.
- If we want to stick with a binary distribution, Debian unstable, openSUSE, or Adelie would probably be the best bet.
The runner system is an LPAR on a POWER8 system I have racked in a real datacentre.
- The partition has 4 cores (8 threads per core, so 32 threads) assigned to it and 128 GB of RAM (which it rarely uses it seems, tops out at just over 16 GB in my experience if we need to resize). The storage is some spinning rust in RAID 5.
- We could probably do ppc64le CI with this as well. Or developer systems for PHP people who need to test things.
The official runner is incompatible, so I have to use an unofficial implementation.
- This is joined to my php-src fork right now; if we want to merge this, we'd obviously need to assign this to php/php-src. Is there any process for this?
- I have some tweaks to the runner scripts to run it as a systemd unit and pick up the right LLVM version on path.
- This isn't being run containerized. It'd be nice to figure that out.
Sample run here.
- ext/ffi/tests/gh11934.phpt will be fixed by Fix GH-16013 and bug #80857: Big endian issues #17255. I don't know what's up with the POSIX tests; I believe these used to pass until I updated the system. I'll look into this.
- A lot tests are slow; the number may need tuning because POWER8 is a bit old. The gd tests being very slow seem very unusual to me though.

cmb69 · 2024-12-30T16:24:29Z

The gd tests being very slow seem very unusual to me though.

The 5 slowest test use helpers written in PHP which check all pixels; the 6th slowest calls imagecolormatch() (which also checks all pixels) on a 16 mega-pixel image. Given that the tests use -j32 and that the L1 cache appears to be very small, I guess this is partly a RAM performance issue. I wouldn't be too worried about that.

However, it would make sense to rewrite the GD test helpers (particularly, test_image_equals_file() and calc_image_dissimilarity()) in C. See #17305.

NattyNarwhal · 2025-01-01T17:03:24Z

Looks like all ran tests pass now after a rebase! The tests are still slow (in particular gd), but the server is a little older and we are running several tests in parallel with sanitizers.

iluuu1994

The code looks good and simple enough. Is mysql already set up on the machine? I'm guessing this runs without docker? I'm not sure if those tests are actually executed.

iluuu1994 · 2025-01-06T17:52:36Z

.github/actions/emerge/action.yml

+            net-nds/openldap \
+            dev-db/unixODBC \
+            dev-db/postgresql \
+            || true


Should the || true be removed? Was it added for testing purposes?

I would think this is to allow a trailing backslash on the "last" line (like trailing comma for arrays).

The problem is that it also suppresses errors.

&& true instead?

The user that CI is running as doesn't have permission to install anything, it's purely for documentation until it does (though w/o binpkgs on Gentoo, we probably don't want to anyways).

I'm not a big fan of hundreds of lines of untested code :) In that case, we should probably just remove it. We can keep it in the PR.

Not sure what you mean by keep it in the PR; the PR description?

As a comment, in case it needs to be added at some point so it's not fully lost.

I've turned these into asserts that the package is installed.

NattyNarwhal · 2025-01-06T20:10:10Z

Is mysql already set up on the machine? I'm guessing this runs without docker? I'm not sure if those tests are actually executed.

This does not run containerized. I'd like to, just have to figure out how with the runner implementation.

There's also no DB server running, so I suspect those tests are skipped. Would it be important to get that working?

iluuu1994 · 2025-01-06T22:16:17Z

Would it be important to get that working?

That's up to you. I think the mysql extension does involve some bit shifting stuff for the protocol implementation, so it might make sense to test it with big endian.

NattyNarwhal · 2025-01-07T20:43:17Z

Rebasing and adding MySQL tests, run here. Postgres and Firebird in theory shouldn't also be much more work, although ideally I'd like to run all these DBs (and the job itself) in a container.

NattyNarwhal · 2025-01-07T21:52:14Z

Failed tests after enabling that + rebasing:

There's some DOM tests failing. Not sure what's up with that.
ext/mysqli/tests/functions/mysqli_character_set_name.phpt, ext/mysqli/tests/functions/mysqli_character_set_name_oo.phpt, ext/mysqli/tests/mysqli_change_user_set_names.phpt, ext/mysqli/tests/mysqli_get_charset.phpt: Looks like MariaDB has a different default for utf8 (instead of utf8mb3). Considering mb4 should be used nowadays AFAIK...
ext/mysqli/tests/mysqli_thread_id.phpt, ext/mysqli/tests/mysqli_stmt_get_result.phpt: looks like differences in warnings for closed connections
ext/mysqli/tests/mysqli_get_client_stats.phpt, ext/mysqli/tests/mysqli_query.phpt, ext/mysqli/tests/mysqli_query_unicode.phpt: Looks like our unprivileged DB user can't drop tables? Pretty sure GRANT ALL ON test.* [...] should include DROP...
ext/mysqli/tests/bug74021.phpt: Might also be related to the fact we're running unprivileged.

Note that this is MariaDB instead of MySQL. I can try with MySQL instead. Some stuff is also because we're not running it as root; I'd really prefer to avoid that until again, some kind of containerization.

NattyNarwhal · 2025-01-08T18:37:22Z

Having some issues with MySQL (MariaDB works, so whatever, upstream bug filed), so setting up Pg in a similar manner. Firebird looks like a piece of work, so putting that off. I might try to clean up the tests so they work on both MariaDB and MySQL, since MariaDB is pretty common in the ecosystem nowadays.

Run here (oops, forgot pg is in skipSlow)

NattyNarwhal · 2025-01-09T19:53:36Z

Running MySQL 8 instead of MariaDB, tests work as expected except for the ones that assume it's running as a superuser so far: https://github.com/NattyNarwhal/php-src/actions/runs/12696420415/job/35390469488

NattyNarwhal · 2025-01-10T20:39:36Z

Did a run with skipSlow set to false to test more things:

DOM still failing for mysterious reasons; I thought this may have been libxml2 regression since I did an upgrade, but reverting that version didn't change much
Enchant tests fail because no dicts, adding those next run.
ext/pgsql/tests/80_bug14383.phpt: I didn't even know pg could work with gdbm here; not sure what's up here.
SNMP and LDAP are basically skipped even though they're built, SNMP should be trivial to set up (though I don't like rwusers...), LDAP more involved (likely have to adapt setup-slapd).

cmb69 · 2025-01-10T22:25:33Z

DOM still failing for mysterious reasons

Possibly a big-endian issue, or otherwise platform related. cc @nielsdos

ext/pgsql/tests/80_bug14383.phpt: I didn't even know pg could work with gdbm here; not sure what's up here.

Maybe @Girgias has an idea?

nielsdos · 2025-01-10T22:41:06Z

@NattyNarwhal I'll take a look at DOM tomorrow on your big endian machine.

nielsdos · 2025-01-11T09:49:07Z

@NattyNarwhal Please rebase, the DOM issue is fixed on master.

Girgias · 2025-01-11T13:51:55Z

DOM still failing for mysterious reasons

Possibly a big-endian issue, or otherwise platform related. cc @nielsdos

ext/pgsql/tests/80_bug14383.phpt: I didn't even know pg could work with gdbm here; not sure what's up here.

Maybe @Girgias has an idea?

This is annoying as I don't know which tests of the suite fails, there might some weird conflict with the other DBA tests being run in parallel?

NattyNarwhal · 2025-01-11T19:36:00Z

Thanks for fixing DOM!

Run here; MySQL tests skipped because I brought it down to reduce noise and just focus on that failing Pg test.,

NattyNarwhal · 2025-01-11T20:40:50Z

My guess is it picks a different DBA backend that it maybe shouldn't:

ppc system

checking for QDBM support... no
checking for gdbm_open in -lgdbm... yes
checking for GDBM support... yes
checking for NDBM support... no
checking for TCADB support... no
checking for mdb_env_open in -llmdb... yes
checking for LMDB support... yes
checking for Berkeley DB4 support... no
checking for Berkeley DB3 support... no
checking for Berkeley DB2 support... no
checking for DB1 support... no
checking for DBM support... no
checking for CDB support... builtin
checking for INI File support... builtin
checking for FlatFile support... builtin
checking whether to enable DBA interface... yes

amd64 runner

checking for dpopen in -lqdbm... yes
checking for QDBM support... yes
checking for GDBM support... no
checking for NDBM support... no
checking for tcadbopen in -ltokyocabinet... yes
checking for TCADB support... yes
checking for mdb_env_open in -llmdb... yes
checking for LMDB support... yes
checking for Berkeley DB4 support... no
checking for Berkeley DB3 support... no
checking for Berkeley DB2 support... no
checking for DB1 support... no
checking for DBM support... no
checking for CDB support... builtin
checking for INI File support... builtin
checking for FlatFile support... builtin
checking whether to enable DBA interface... yes
checking whether to enable dl-test extension... yes, shared

I had copied the configure invocation from the Alpine section from nightly, which doesn't run pg (on account of not running a pg server, on top of skipSlow).

NattyNarwhal · 2025-01-11T20:55:37Z

Yeah. it's QDBM (works) vs. GDBM (doesn't).

The DBA test setup has this function to pick the first applicable handler:

function get_any_handler(): string {
    foreach (dba_handlers() as $handler) {
        // Those are weird
        if ($handler !== 'cdb' && $handler !== 'cdb_make' && $handler !== 'inifile') {
            echo 'Using handler: "', $handler, '"', \PHP_EOL;
            return $handler;
        }
    }
    return 'should_not_happen';
}

dba_handlers returns:

array(7) {
  [0]=>
  string(3) "cdb"
  [1]=>
  string(8) "cdb_make"
  [2]=>
  string(7) "inifile"
  [3]=>
  string(8) "flatfile"
  [4]=>
  string(4) "qdbm"
  [5]=>
  string(5) "tcadb"
  [6]=>
  string(4) "lmdb"
}

So it's presumably picking qdbm out this after disqualifying cdb and inifile. No idea why it doesn't pick flatfile even though it's not excluded...

After testing outside of CI with that, it passes.

NattyNarwhal · 2025-01-11T22:07:03Z

run here with that fixed and also running MySQL tests

This assumes gentoo (which has best ppc64be support of mainstream distributions). (Rebased onto the new workflow_call approach)

Not using containers (yet), for now just setting up MariaDB with an unprivileged user and seeing how that goes. Postgres and Firebird also seem doable, although SQL Server for ODBC definitely won't be.

PDO_sqlite, the Pg extensions, SNMP, LDAP, Enchant, Tidy, ODBC, and PDO_DBLIB. Note that DBLIB and ODBC will be hard to test since the tests assume SQL Server. LDAP and SNMP may require config on the PPC CI server, but Pg shhould be configured and ready to go.

The checks for things like ext/ldap will assume libraries are in ./lib, and not lib64 like they are on many multilib systems, Gentoo included. Gross that we're hardcoding it, but we should also probably convert ldap to pkg-config sometime. That may require getting rid of Oracle or Solaris LDAP support though.

Gentoo default USE flags seem to use hunspell for enchant.

GDBM is broken with ext/pgsql/tests/80_bug14383.phpt

NattyNarwhal · 2025-01-14T17:49:27Z

run with setting log_bin_trust_function_creators in MySQL.

ext/mysqli/tests/mysqli_get_client_stats.phpt: Needs ext/mysql: Don't mention errors for optional part of the test #17466.
ext/mysqli/tests/bug73462.phpt: Can't seem to reproduce this one out of CI.
ext/standard/tests/general_functions/proc_open_multiplex.phpt: Might be spurious, I can't seem to reproduce outside of CI.
Some tests are still horribly slow (mysqli, gd, bcmath). This might be an artifact of sanitizers being really nasty for performance. It doesn't help run-tests's parallel runners mode seems to allocate jobs in a way where you have might have several tests and several CPUs, but a few tests in, only one is running tests and the rest are zombies. (CONFLICTS doesn't help, but not that many tests are marked that way.)
The runner has a habit of cutting off stdout/err sometimes when steps end. Not too big of a deal, since the most important stuff is up front.

NattyNarwhal · 2025-01-15T16:34:49Z

Some tests are still horribly slow (mysqli, gd, bcmath). This might be an artifact of sanitizers being really nasty for performance. It doesn't help run-tests's parallel runners mode seems to allocate jobs in a way where you have might have several tests and several CPUs, but a few tests in, only one is running tests and the rest are zombies. (CONFLICTS doesn't help, but not that many tests are marked that way.)

Testing this with gd at least (since that's compute heavy); -O0 didn't much make of a difference vs. -O2, but sanitizers on is where the performance impact is. About what me and everyone else presumably expected, but good to confirm. With mysql, most of the time is spent in waiting in poll, so it makes me think MySQL itself is being slow. Not too important to look at now, but something to check later.

iluuu1994 · 2025-01-15T21:08:44Z

The build looks indeed pretty slow and I don't think it would be good to add this to the push job in this condition. I assume that because the runner isn't containerized, it also cannot reliably run jobs in parallel. You could try dropping UBSan, which I think causes a bigger slowdown compared to ASan, but also less value. If that doesn't help, it might be better to just add the job to nightly. This should largely be enough, these breaks are rare and I look at nightly regularly. I am somewhat surprised that you saw no significant improvement for -O2, are you sure this flag was actually propagated correctly? Normally this would be set by the build system with --disable-debug --enable-debug-assertions.

NattyNarwhal · 2025-01-15T21:53:48Z

The build looks indeed pretty slow and I don't think it would be good to add this to the push job in this condition.

Yes, I was thinking nightly is fine for now. I think for now it's probably best to get nightly going (especially if you're looking at it, it's not going to get ignored), and

I assume that because the runner isn't containerized, it also cannot reliably run jobs in parallel.

Indeed, because a lot of tests assume /tmp is unique per run as well as the DB servers. For a push run, the latter could probably be skipped.

You could try dropping UBSan, which I think causes a bigger slowdown compared to ASan, but also less value. If that doesn't help, it might be better to just add the job to nightly. This should largely be enough, these breaks are rare and I look at nightly regularly.

I think it's better to have the sanitizers with their performance impact than not in a nightly build. In a push build, I'd probably pare down the built extensions and maybe the sanitizers.

I am somewhat surprised that you saw no significant improvement for -O2, are you sure this flag was actually propagated correctly? Normally this would be set by the build system with --disable-debug --enable-debug-assertions.

Sorry, should have been more clear: This was comparing -O2 vs. -O0 vs. sanitizers on (0/2 doesn't seem to make a perf difference, I assume sanitizers implicitly kill optimization). The CI build should be doing the latter because --enable-debug IIRC implies -O0.

NattyNarwhal requested a review from iluuu1994 December 24, 2024 20:08

NattyNarwhal mentioned this pull request Dec 24, 2024

Fix FD getting code on big endian #17259

Merged

NattyNarwhal force-pushed the ppc64-ci-test branch from 9fb45b2 to d82d2f2 Compare January 1, 2025 16:33

iluuu1994 reviewed Jan 6, 2025

View reviewed changes

NattyNarwhal force-pushed the ppc64-ci-test branch from d82d2f2 to dc08278 Compare January 7, 2025 20:39

NattyNarwhal mentioned this pull request Jan 8, 2025

Travis CI re-enablement for s390x #15944

Closed

NattyNarwhal force-pushed the ppc64-ci-test branch from 5c4568b to 5c24b5d Compare January 8, 2025 18:31

NattyNarwhal force-pushed the ppc64-ci-test branch from 5c24b5d to 5d54d4a Compare January 10, 2025 19:10

NattyNarwhal force-pushed the ppc64-ci-test branch from 0346ef5 to 5c27d53 Compare January 11, 2025 18:44

NattyNarwhal mentioned this pull request Jan 13, 2025

ext/mysql: Don't mention errors for optional part of the test #17466

Draft

NattyNarwhal added 3 commits January 14, 2025 11:51

Attempt at ppc64 CI

d99d401

This assumes gentoo (which has best ppc64be support of mainstream distributions). (Rebased onto the new workflow_call approach)

oops, make that || true so the nop works

09f898b

Print config.log on config failure

1b52d71

NattyNarwhal added 20 commits January 14, 2025 11:51

helps to put the backslash there

0c11b54

try to fix formatting

4618092

subshells don't work for this

5ccc59d

fix bash syntax. oops

6fb7d26

we can't trust anything, can we

b43d53c

will this just use libc iconv?

61ce361

the power8 could maybe use more optimization

660e913

remove matrix generation remnants

b447789

Commit typo in checkout ref

8817d6f

Add ppc64 to run >=8.4

ab4845d

use clang-17 since gcc doesn't have FSan (and that's the default)

afdf39a

Run mysql tests

8799e5c

Not using containers (yet), for now just setting up MariaDB with an unprivileged user and seeing how that goes. Postgres and Firebird also seem doable, although SQL Server for ODBC definitely won't be.

Just assert that packages are installed instead

3d65818

also test pg

a75c8a8

Enable slow extensions

22c595f

PDO_sqlite, the Pg extensions, SNMP, LDAP, Enchant, Tidy, ODBC, and PDO_DBLIB. Note that DBLIB and ODBC will be hard to test since the tests assume SQL Server. LDAP and SNMP may require config on the PPC CI server, but Pg shhould be configured and ready to go.

Add SASL as a dependency for ext/ldap

c08f457

add dictionaries for hunspell

cf0a0ef

Gentoo default USE flags seem to use hunspell for enchant.

typo in pkg name

7a772aa

Add Tokyo Cabinet, use QDBM instead of GDBM

5c596ad

GDBM is broken with ext/pgsql/tests/80_bug14383.phpt

NattyNarwhal force-pushed the ppc64-ci-test branch from 15d5b52 to 5c596ad Compare January 14, 2025 15:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Big endian PowerPC CI #17258

Big endian PowerPC CI #17258

NattyNarwhal commented Dec 24, 2024 •

edited

Loading

cmb69 commented Dec 30, 2024

NattyNarwhal commented Jan 1, 2025

iluuu1994 left a comment

iluuu1994 Jan 6, 2025

cmb69 Jan 6, 2025

iluuu1994 Jan 6, 2025

cmb69 Jan 6, 2025

NattyNarwhal Jan 6, 2025

iluuu1994 Jan 6, 2025

NattyNarwhal Jan 7, 2025

iluuu1994 Jan 7, 2025

NattyNarwhal Jan 9, 2025

NattyNarwhal commented Jan 6, 2025

iluuu1994 commented Jan 6, 2025 •

edited

Loading

NattyNarwhal commented Jan 7, 2025

NattyNarwhal commented Jan 7, 2025

NattyNarwhal commented Jan 8, 2025 •

edited

Loading

NattyNarwhal commented Jan 9, 2025

NattyNarwhal commented Jan 10, 2025

cmb69 commented Jan 10, 2025

nielsdos commented Jan 10, 2025

nielsdos commented Jan 11, 2025

Girgias commented Jan 11, 2025

NattyNarwhal commented Jan 11, 2025

NattyNarwhal commented Jan 11, 2025 •

edited

Loading

NattyNarwhal commented Jan 11, 2025

NattyNarwhal commented Jan 11, 2025

NattyNarwhal commented Jan 14, 2025

NattyNarwhal commented Jan 15, 2025

iluuu1994 commented Jan 15, 2025 •

edited

Loading

NattyNarwhal commented Jan 15, 2025

Big endian PowerPC CI #17258

Are you sure you want to change the base?

Big endian PowerPC CI #17258

Conversation

NattyNarwhal commented Dec 24, 2024 • edited Loading

cmb69 commented Dec 30, 2024

NattyNarwhal commented Jan 1, 2025

iluuu1994 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

NattyNarwhal commented Jan 6, 2025

iluuu1994 commented Jan 6, 2025 • edited Loading

NattyNarwhal commented Jan 7, 2025

NattyNarwhal commented Jan 7, 2025

NattyNarwhal commented Jan 8, 2025 • edited Loading

NattyNarwhal commented Jan 9, 2025

NattyNarwhal commented Jan 10, 2025

cmb69 commented Jan 10, 2025

nielsdos commented Jan 10, 2025

nielsdos commented Jan 11, 2025

Girgias commented Jan 11, 2025

NattyNarwhal commented Jan 11, 2025

NattyNarwhal commented Jan 11, 2025 • edited Loading

NattyNarwhal commented Jan 11, 2025

NattyNarwhal commented Jan 11, 2025

NattyNarwhal commented Jan 14, 2025

NattyNarwhal commented Jan 15, 2025

iluuu1994 commented Jan 15, 2025 • edited Loading

NattyNarwhal commented Jan 15, 2025

NattyNarwhal commented Dec 24, 2024 •

edited

Loading

iluuu1994 commented Jan 6, 2025 •

edited

Loading

NattyNarwhal commented Jan 8, 2025 •

edited

Loading

NattyNarwhal commented Jan 11, 2025 •

edited

Loading

iluuu1994 commented Jan 15, 2025 •

edited

Loading