Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EBP-383: Fix runtime cgo panic by changing CCSMP opaque pointer typdefs #77

Merged
merged 4 commits into from
Dec 18, 2024

Conversation

TrentDaniel
Copy link

Added compile directive to solClient.h so that when the Go API links against CCSMP, it treats all opaque pointers as uintptrs (effectively longs) instead of as void pointers, except for solClient_opaquePtr_pt. I couldn't figure out how to get around solClient_opaquePtr_pt because certain CCSMP functions like solClient_msg_getBinaryAttachmentPtr require a pointer to an opaque pointer. This was all done to avoid the cgo runtime from panicking when it does a runtime check of a C/Go boundary and believes that the 'void ' type that the opaque pointers used to be typedef'd to was an unpinned pointer. As mentioned, the change to the typdef of opaque pointers in solClient.h will prevent the cgo runtime from panicking since it will evaluate the opaque pointers as longs/ints instead of unpinned references. It's unclear whether or not the solClient_opaquePtr_pt will become a problem in the future, but it is actually pointing to data whereas the other solClient_opaque_pt types are not, which the cgo runtime should be able to discern. The other changes made in this commit were in response to the typdef change, and isolating more memory allocation of configuration structs like the sessionCreateFuncInfo to C in order to mitigate the risk of bad references being accessed by CCSMP.

links against CCSMP, it treats all opaque pointers as uintptrs
(effectively longs) instead of as void pointers, except for
solClient_opaquePtr_pt. I couldn't figure out how to get around
solClient_opaquePtr_pt because certain CCSMP functions like
solClient_msg_getBinaryAttachmentPtr require a pointer to an opaque
pointer. This was all done to avoid the cgo runtime from panicking when
it does a runtime check of a C/Go boundary and believes that the 'void
*' type that the opaque pointers used to be typedef'd to was an unpinned
pointer. As mentioned, the change to the typdef of opaque pointers in
solClient.h will prevent the cgo runtime from panicking since it will
evaluate the opaque pointers as longs/ints instead of unpinned
references. It's unclear whether or not the solClient_opaquePtr_pt will
become a problem in the future, but it is actually pointing to data
whereas the other solClient_opaque*_pt types are not, which the cgo
runtime should be able to discern. The other changes made in this commit
were in response to the typdef change, and isolating more memory
allocation of configuration structs like the sessionCreateFuncInfo to C
in order to mitigate the risk of bad references being accessed by CCSMP.
@TrentDaniel TrentDaniel self-assigned this Dec 13, 2024
Copy link

gitstream-cm bot commented Dec 13, 2024

Please mark whether you used Copilot to assist coding in this PR

  • Copilot Assisted

Copy link

@oodigie oodigie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment.

@oodigie
Copy link

oodigie commented Dec 13, 2024

Copy link

@RagnarPaulson RagnarPaulson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks great now. Thanks.

@TrentDaniel TrentDaniel requested a review from sdewilde December 16, 2024 16:07
@oodigie oodigie self-requested a review December 16, 2024 17:01
Copy link

@sdewilde sdewilde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment on the change itself but the changes have run well in QA tests. Run our platform test on alpine with no issues and the test failures we saw in the feature test were infra related and passed on rerun with other resources. Looks good!

@oodigie
Copy link

oodigie commented Dec 16, 2024

@TrentDaniel, I still see a lot more failures on Jenkins - https://jenkins2.solacedev.net/job/solacedev.pubsubplus-go-client/job/EBP-383/4/#showFailuresLink . Please can you at least verify that the test failures are not related to the changes in this PR ?

@TrentDaniel
Copy link
Author

@TrentDaniel, I still see a lot more failures on Jenkins - https://jenkins2.solacedev.net/job/solacedev.pubsubplus-go-client/job/EBP-383/4/#showFailuresLink . Please can you at least verify that the test failures are not related to the changes in this PR ?

I've verified that the test failures are not caused by the changes in this PR. Summary:

  • 4 are some kind of SEMP timeout, probably to do with bad broker setup and definitely unrelated to API changes
  • 1 is a race condition, with ticket EBP-401
  • 1 is known test failure that you are currently working on
  • The rest are known test failures from a bottleneck in one of our test nodes

Copy link

@oodigie oodigie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will approve this PR since the Jenkins failures have been investigated and found to be unrelated to the changes introduced in this PR.

@TrentDaniel TrentDaniel merged commit a278b52 into dev Dec 18, 2024
5 checks passed
@TrentDaniel TrentDaniel deleted the EBP-383 branch December 18, 2024 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants