test_socket_with_bind_to_interface fails when building for epel8 on arch ppc64le #665

Closed
wombelix opened this issue Aug 2, 2024 · 2 comments
Labels
bug This issue is a bug. p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to 'closing-soon' in 7 days.


wombelix commented Aug 2, 2024

Describe the bug

I'm packaging aws-c-io for Fedora and EPEL (https://src.fedoraproject.org/rpms/aws-c-io).
While working on the update from 0.14.10 to 0.14.16 today, I discovered that test_socket_with_bind_to_interface fails when building for RHEL 8 / EPEL 8 on arch ppc64le. On the same OS, all tests pass without issues on x86_64 and aarch64.

Expected Behavior

test_socket_with_bind_to_interface passes on RHEL 8 / EPEL 8 on arch ppc64le

Current Behavior

"setsockopt() for enabling TCP_KEEPINTVL for TCP failed with errno 22" is logged first, followed by "bind failed with error code 98".
My understanding is that error 22 indicates an invalid argument and error 98 indicates that the address (here, the port) is already in use.
I haven't yet figured out why both appear or how they relate to each other.
My assumption is that 22 is the actual problem and 98 just happens down the line because of it.
But this is the part where I could use some input from experienced C developers :)

 39/166 Test  #45: test_socket_with_bind_to_interface ......................................***Failed    0.01 sec
***FAILURE*** [INFO] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x1519cc860: Initializing edge-triggered epoll
[INFO] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x1519cc860: Using eventfd for cross-thread notifications.
[TRACE] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x1519cc860: eventfd descriptor 5.
[INFO] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x1519cc860: Starting event-loop thread.
[DEBUG] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x7ffffe6ab9f8 fd=6: initializing with domain 0 and type 0
[DEBUG] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x7ffffe6ab9f8 fd=6: setting socket options to: keep-alive 1, keep idle 60000, keep-alive interval 1000, keep-alive probe count 0.
[WARN] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x7ffffe6ab9f8 fd=6: setsockopt() for enabling TCP_KEEPINTVL for TCP failed with errno 22.
[INFO] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x7ffffe6ab9f8 fd=6: binding to 127.0.0.1:8127.
[ERROR] [2024-08-02T11:03:21Z] [00007fffb5827540] [Unknown] - id=0x7ffffe6ab9f8 fd=6: bind failed with error code 98
[INFO] [2024-08-02T11:03:21Z] [00007fffb400f0b0] [Unknown] - id=0x1519cc860: main loop started
[TRACE] [2024-08-02T11:03:21Z] [00007fffb400f0b0] [Unknown] - id=0x1519cc860: subscribing to events on fd 5
[INFO] [2024-08-02T11:03:21Z] [00007fffb400f0b0] [Unknown] - id=0x1519cc860: default timeout 100000, and max events to process per tick 100
[TRACE] [2024-08-02T11:03:21Z] [00007fffb400f0b0] [Unknown] - id=0x1519cc860: waiting for a maximum of 100000 ms
Expected success at aws_socket_bind(&listener, endpoint); got return value -1 with last error 1054
 [s_test_socket_ex(): /builddir/build/BUILD/aws-c-io-0.14.16/tests/socket_test.c:232]
***FAILURE*** s_test_socket() failed [s_test_socket_with_bind_to_interface(): /builddir/build/BUILD/aws-c-io-0.14.16/tests/socket_test.c:457]
***FAILURE*** test_socket_with_bind_to_interface [ FAILED ]

Reproduction Steps

I use the Fedora build infrastructure and don't have direct access to a ppc64le machine, so reproducing this error manually might be tricky. I don't do anything fancy, though: I trigger the build with -DBUILD_SHARED_LIBS=ON and run the tests without any additional args.

Possible Solution

No response

Additional Information/Context

The failing test seems to be fairly new; it was added via PR #647 with commit d04508d.

aws-c-io version used

0.14.16

Compiler and version used

8.5.0-22.el8_10

Operating System and version

RHEL 8

@wombelix wombelix added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 2, 2024
waahm7 (Contributor) commented Aug 2, 2024

Can you please try using https://github.com/awslabs/aws-c-io/releases/tag/v0.14.18, and see if that fixes the issue?

@jmklix jmklix added response-requested Waiting on additional info and feedback. Will move to 'closing-soon' in 7 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Aug 2, 2024
wombelix (Author) commented Aug 3, 2024

> Can you please try using https://github.com/awslabs/aws-c-io/releases/tag/v0.14.18, and see if that fixes the issue?

Awesome, thanks a lot for that quick fix! The issue is gone; I could successfully build on all architectures with the new release :)

@wombelix wombelix closed this as completed Aug 3, 2024