Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1823] replace malloc/calloc/strdup/free with openssl allocator #1926

Merged

Conversation

songlingatpan
Copy link
Contributor

Replaced malloc, calloc, strdup, and free with the OpenSSL memory allocator to enable the caller to customize memory allocator, addressing issue #1823. This PR does not change the existing behavior or algorithms.

@songlingatpan songlingatpan force-pushed the pr_shan_1823_openssl_allocator branch from a93e625 to b61754c Compare September 17, 2024 02:10
@songlingatpan
Copy link
Contributor Author

I will commit another change for copy_from_upstream.

Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@songlingatpan
Copy link
Contributor Author

@baentsch Question: 'bike', 'frodokem', 'ntruprime' is not from upstream.
To replace malloc with OQS_MEM_malloc, where should i modify the code? Thanks
liboqs/src/kem/frodokem/kem_frodokem976shake.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem640aes.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem1344shake.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem976aes.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem1344aes.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/frodokem/kem_frodokem640shake.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/kem.h:264:OQS_API void OQS_KEM_free(OQS_KEM *kem); liboqs/src/kem/ntruprime/kem_ntruprime_sntrup761.c:11: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/kem.c:493:OQS_API void OQS_KEM_free(OQS_KEM *kem) { liboqs/src/kem/bike/kem_bike.c:9: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/bike/kem_bike.c:34: OQS_KEM *kem = malloc(sizeof(OQS_KEM)); liboqs/src/kem/bike/kem_bike.c:59: OQS_KEM *kem = malloc(sizeof(OQS_KEM));

@baentsch
Copy link
Member

Question: 'bike', 'frodokem', 'ntruprime' is not from upstream.

Good point. @dstebila : There's surely upstreams for these algorithms, but they're not captured by OQS automation/copy_from_upstream. Question: Are these projects still maintained (and where) or is it OK to change code straight in liboqs for these algs? If the former, what would it take to bring them in via "copy_from_upstream"? If the latter, should we remove them completely from OQS?

@dstebila
Copy link
Member

Question: 'bike', 'frodokem', 'ntruprime' is not from upstream.

Good point. @dstebila : There's surely upstreams for these algorithms, but they're not captured by OQS automation/copy_from_upstream. Question: Are these projects still maintained (and where) or is it OK to change code straight in liboqs for these algs? If the former, what would it take to bring them in via "copy_from_upstream"? If the latter, should we remove them completely from OQS?

FrodoKEM is maintained at https://github.com/microsoft/PQCrypto-LWEKE/; I exported the code from there any manually added it to liboqs, as we didn't have as robust a copy_from_upstream at the time. FrodoKEM also needs to do an update from upstream, as there have been new variants introduced in the last year, but I don't have a plan for this update. So to avoid blocking on that, I would say it's fine to make the FrodoKEM changes directly here in this repository.

NTRUPrime had been coming from PQClean, but they have stopped supporting it. We only are keeping one variant of NTRUPrime because of its use in OpenSSH. I think we consider a timeline for sunsetting it. But in the interim, I think changes to NTRUPrime can be done directly here in this repository.

BIKE was contributed directly by the team at AWS. Our main contact for that has been @dkostic, but I'm not sure if he's still the right contact. Pinging @brian-jarvis-aws for some input.

@songlingatpan
Copy link
Contributor Author

@dstebila @baentsch @bhess @jschanck
Code change for openssl allocator has been done.
Please review the code.

In addition, there are quite some error handling and potential memory leak.
Would you like to fix them in the current PR? or create a separate PR for review?

Thanks

@songlingatpan
Copy link
Contributor Author

**baentsch ** requested changes

done

@dstebila
Copy link
Member

In addition, there are quite some error handling and potential memory leak. Would you like to fix them in the current PR? or create a separate PR for review?

I'd prefer to see those as a second PR, since those changes may be less mechanical and might require a closer look.

dstebila
dstebila previously approved these changes Sep 17, 2024
@dstebila dstebila added this to the 0.12.0 milestone Sep 17, 2024
@dstebila
Copy link
Member

I suggest we defer merging this until after the 0.11.0 release, otherwise we would need to cut a new release candidate and push the release back a week.

Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@songlingatpan most of this LGTM and improves the software (malloc->OQS_MEM_malloc & additional NULL checks). Some things I don't understand (see single comments). What I'm really unsure about is the apparent removal of (many) user-visible allocation error warnings for silent abort()s. Isn't that a step in the wrong direction? Yes, the abort() is a stupid pattern to begin with, but isn't it reasonable to let users at least know why the lib/app suddenly stopped?

@songlingatpan
Copy link
Contributor Author

@songlingatpan most of this LGTM and improves the software (malloc->OQS_MEM_malloc & additional NULL checks). Some things I don't understand (see single comments). What I'm really unsure about is the apparent removal of (many) user-visible allocation error warnings for silent abort()s. Isn't that a step in the wrong direction? Yes, the abort() is a stupid pattern to begin with, but isn't it reasonable to let users at least know why the lib/app suddenly stopped?

@baentsch
I understand your point.

There was some discussion with @SWilson4 about removing the checked malloc logic. If you review the OpenSSL code, it doesn’t use the checked malloc/abort() pattern. In general, a well-designed library should avoid including abort() or exit() calls, as these decisions should be left to the caller. Ideally, malloc failures should be handled consistently across the codebase by returning error codes, rather than aborting or exiting.

Since there is no fprintf logging for non-checked malloc failures, I didn’t add stderr output for abort(). I recommend fully removing abort() in the 0.12.0 release.

Our in-house version is much more stable than 0.11.0 and target for production readiness compared to the official liboqs repository. However, I’m facing challenges in integrating our in-house code changes back into the official liboqs due to abort(), and so on.

Anyway I can add fprintf back for abort.

@baentsch
Copy link
Member

Anyway I can add fprintf back for abort.

That'd be nice. And for the avoidance of doubt: My ask (to achieve this) is basically to reduce the amount of new code (by retaining "OQS_MEM_checked_malloc" calls), not to add more new code (fprintf statements, I guess)

@songlingatpan
Copy link
Contributor Author

Anyway I can add fprintf back for abort.

That'd be nice. And for the avoidance of doubt: My ask (to achieve this) is basically to reduce the amount of new code (by retaining "OQS_MEM_checked_malloc" calls), not to add more new code (fprintf statements, I guess)

@baentsch We can remove the checked malloc for now.
In the next step, we can remove abort and add return proper error and fix potential memory leaks.
If you target to the production, eventually we need to remove the checked malloc and do proper error handling and release resources, etc.

@songlingatpan
Copy link
Contributor Author

Anyway I can add fprintf back for abort.

That'd be nice. And for the avoidance of doubt: My ask (to achieve this) is basically to reduce the amount of new code (by retaining "OQS_MEM_checked_malloc" calls), not to add more new code (fprintf statements, I guess)

@baentsch We can remove the checked malloc for now. In the next step, we can remove abort and add return proper error and fix potential memory leaks. If you target to the production, eventually we need to remove the checked malloc and do proper error handling and release resources, etc.

Please let me know if you still want to keep the checked malloc. Still, if you check openssl code, there is no checked malloc. We need to remove it eventually.

@songlingatpan
Copy link
Contributor Author

@baentsch Please review the PR.

Thanks

src/common/common.h Outdated Show resolved Hide resolved
@songlingatpan songlingatpan force-pushed the pr_shan_1823_openssl_allocator branch 3 times, most recently from a194fa3 to b0d86c1 Compare October 18, 2024 19:21
@songlingatpan songlingatpan force-pushed the pr_shan_1823_openssl_allocator branch from b0d86c1 to 7afe1a3 Compare October 18, 2024 23:04
@baentsch baentsch merged commit 1d92135 into open-quantum-safe:main Oct 19, 2024
72 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants