Add missing dependencies to prevent add_custom_command race #457
Signed-off-by: g2flyer <[email protected]>
@@ -74,6 +75,7 @@ ENDIF()
################################################################################
IF (BUILD_UNTRUSTED)
ADD_LIBRARY(${U_CRYPTO_LIB_NAME} STATIC ${PROJECT_HEADERS} ${PROJECT_SOURCES} ${IAS_HEADERS} ${IAS_SOURCES})
ADD_DEPENDENCIES(${U_CRYPTO_LIB_NAME} generate-ias-files)
Why do we believe this fixes the problem? It's very timing dependent. Can you verify that only one instance is run?
I could consistently produce concurrent, colliding invocations before the fix, and it consistently works now. More importantly, the conflict is described in the CMake documentation, which also outlines the strategy of defining a separate add_custom_target and using it as the sole (explicit) dependency.
Thanks for the update. Given the timing nature of the problem, "it fixes it on my machine" is weak evidence of a fix rather than of a simple shift in timing. The documentation looks promising. Regardless... this looks like an improvement. Another critical problem that will continue to cause trouble, however, is the way the file is created. Rather than using an atomic swap, we write directly to the file. The result is that one thread has a reasonable chance of corrupting the file. I will be pushing a separate PR that attempts to address this problem and remove the entire templating approach.
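The atomic swap mentioned above can be sketched in shell as follows. This is a minimal illustration, not the project's script: the function name `write_atomically` and its arguments are hypothetical. It assumes the temporary file lives in the same directory as the target, so that the final `mv` is a rename on one filesystem rather than a copy.

```shell
# Hypothetical helper: run a generator command and publish its stdout
# to TARGET only after the command has finished successfully.
# Usage: write_atomically TARGET COMMAND [ARGS...]
write_atomically() {
    target=$1
    shift
    # Create the temp file next to the target so the final mv stays on
    # one filesystem and is performed as a rename.
    tmp=$(mktemp "$(dirname "$target")/.tmp.XXXXXX") || return 1
    if "$@" > "$tmp"; then
        # Readers see either the old file or the complete new one.
        mv -f "$tmp" "$target"
    else
        # Generator failed: drop the partial output, keep the old target.
        rm -f "$tmp"
        return 1
    fi
}
```

A concurrent second invocation would then at worst replace one complete file with another complete file, instead of interleaving writes into the same file.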
I still maintain, from what I learned debugging this problem (the reproducible failure of dual CMake tasks, as "threatened" by the documentation, and the complete disappearance of parallel identical tasks in repeated experiments after following the mitigation strategy), that an atomic swap is not necessary in "normal" builds. That said, there could of course be other cases where the script is somehow invoked twice at the same time, so making it more atomic doesn't hurt. (For SIM mode, which caused my problem, the old script with cp was probably reasonably atomic; the issue there was more that it would have required an -f to deal with another script already having run?)
We've had this problem. Fixed it. Had it come back when some other changes happened. The fact that it was repeatable in ONE situation does not mean that fixing that ONE situation fixed all situations. And... mv is definitely atomic (it's a single syscall); cp is NOT atomic (it's multiple reads/writes), which is why the file corruption happens. Worse, if we're in HW mode we are piping stdout into the file, which is definitely open to corruption.
And to be clear... I'm fine with this change. It will certainly make the situation better and may actually ensure that only one instance is triggered (which, I think, we can only verify by searching the verbose cmake logs). Regardless... the scripts used to generate the file through a template are not the best way to incorporate a cert in the code.
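Searching a verbose build log for repeated invocations could look roughly like this. It is a sketch under assumptions: the build output was captured to a file (for CMake-generated Makefiles, for example via `make VERBOSE=1 > build.log 2>&1`), and each invocation of the generator shows up as one log line containing the script name. The function name `count_invocations` is hypothetical, and the actual log format of this project's build is not confirmed here.

```shell
# Hypothetical helper: count log lines that mention a given script.
# Usage: count_invocations LOGFILE SCRIPT_NAME
count_invocations() {
    log=$1
    script=$2
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so the exit status is ignored here.
    grep -c -F -- "$script" "$log" || true
}

# Example (paths are placeholders):
#   count_invocations build.log build_ias_certificates_cpp.sh
```

A count above one is only a hint that the rule ran more than once, since lines that mention the script for other reasons are counted as well.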
> We've had this problem. Fixed it. Had it come back when some other changes happened. The fact that it was repeatable in ONE situation does not mean that fixing that ONE situation fixed all situations.

If we added IAS_SOURCES to another library or the like while forgetting the explicit dependency on generate-ias-files, the issue would indeed recur.
> And... mv is definitely atomic (it's a single syscall); cp is NOT atomic (it's multiple reads/writes), which is why the file corruption happens. Worse, if we're in HW mode we are piping stdout into the file, which is definitely open to corruption.

I was just referring to the fact that, in bash without -f, both cp and mv complained if the target already existed when I did it manually ...
> And to be clear... I'm fine with this change. It will certainly make the situation better and may actually ensure that only one instance is triggered (which, I think, we can only verify by searching the verbose cmake logs). Regardless... the scripts used to generate the file through a template are not the best way to incorporate a cert in the code.
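On the cp/mv observation above: as far as the standard utilities go, plain `cp` and `mv` overwrite an existing, writable target without `-f`. The small check below illustrates that, using `command` to bypass any aliases or shell functions. One possible explanation for complaints in an interactive session, which nothing in this thread confirms, is an alias such as `cp -i` / `mv -i` or a read-only target. The function name `demo_overwrite` is made up for this sketch.

```shell
# Hypothetical demo: cp and mv onto existing, writable targets.
demo_overwrite() {
    dir=$(mktemp -d) || return 1
    echo new1 > "$dir/src1"; echo old > "$dir/dst1"
    echo new2 > "$dir/src2"; echo old > "$dir/dst2"
    # "command" skips aliases and shell functions named cp or mv.
    command cp "$dir/src1" "$dir/dst1"
    command mv "$dir/src2" "$dir/dst2"
    # Print the resulting contents of both targets.
    cat "$dir/dst1" "$dir/dst2"
    rm -rf "$dir"
}
```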
Our use of add_custom_command led to a race, and the same failure, due to parallel invocation of common/crypto/verify_ias_report/build_ias_certificates_cpp.sh. The issue is that files generated by add_custom_command have to be explicitly serialized in the dependency graph. We did this only partially: we defined a custom target but didn't use it, so the ucrypto and tcrypto definitions still created parallel dependencies via IAS_SOURCES. It is not clear why (a) I was seemingly the first to trigger this and (b) it also seemed to depend on how I invoked the build (e.g., output redirection seemed to play a role, but also which target and whether this triggered an explicit docker build in our makefile or an implicit one via invoking docker-compose ...).