Bug report for UD connection when creq length > max_inline_size #10423
Comments
@LeDong98 maybe it's an issue with retransmissions of the CREQ packet?
@yosefe
I can add "-x UCX_UD_VERBS_TX_MIN_INLINE=128" to the command line to circumvent this problem. However, I think this is a workaround rather than a solution. Could the way a CREQ longer than max_inline_size is handled be considered a bug?
@yosefe I drew a flowchart; if you have time, could you take a look at it?
2. UCX chooses to release the skb early, which ends the ep connect process on this side.
3. By the time the resource is actually released, the RDMA doorbell may already have been rung while the RDMA engine has not yet read the skb, which results in the error message.
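The hazard described in points 2 and 3 can be illustrated with a toy model. This is only a sketch under stated assumptions, not UCX code: every name here (`toy_wqe_t`, `toy_post_send`, `toy_release`, `toy_hw_read`) is hypothetical, the sizes are illustrative rather than real CREQ sizes, and overwriting the buffer stands in for releasing the skb. An inline send copies the payload at post time, so an early release is harmless; a non-inline send only records a pointer, so the engine later reads whatever the buffer holds by then.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PAYLOAD 256

/* Toy work request (hypothetical): carries either a private copy of the
 * payload (inline) or only a pointer to the caller's buffer (non-inline). */
typedef struct {
    bool        is_inline;
    size_t      length;
    char        inline_copy[TOY_MAX_PAYLOAD];
    const char *buffer; /* used only when !is_inline */
} toy_wqe_t;

/* Post a send: copy the payload if it fits the inline limit, otherwise
 * keep the pointer. Models the state right after the doorbell is rung. */
static void toy_post_send(toy_wqe_t *wqe, const char *buf, size_t len,
                          size_t max_inline_size)
{
    assert(len <= TOY_MAX_PAYLOAD);
    wqe->length    = len;
    wqe->is_inline = (len <= max_inline_size);
    wqe->buffer    = wqe->is_inline ? NULL : buf;
    if (wqe->is_inline) {
        memcpy(wqe->inline_copy, buf, len);
    }
}

/* Models an early release: the buffer is recycled and overwritten. */
static void toy_release(char *buf, size_t len)
{
    memset(buf, 0xAA, len);
}

/* Models the engine fetching the payload some time after the doorbell. */
static void toy_hw_read(const toy_wqe_t *wqe, char *out)
{
    memcpy(out, wqe->is_inline ? wqe->inline_copy : wqe->buffer, wqe->length);
}

/* Returns true if the engine still sees the original payload even though
 * the buffer was released before the engine read it. */
static bool toy_payload_survives_early_release(size_t len,
                                               size_t max_inline_size)
{
    char skb[TOY_MAX_PAYLOAD], expected[TOY_MAX_PAYLOAD], seen[TOY_MAX_PAYLOAD];
    toy_wqe_t wqe;

    assert(len <= TOY_MAX_PAYLOAD);
    memset(skb, 0x11, len);
    memcpy(expected, skb, len);

    toy_post_send(&wqe, skb, len, max_inline_size);
    toy_release(skb, len); /* released before the engine read it */
    toy_hw_read(&wqe, seen);

    return memcmp(seen, expected, len) == 0;
}
```

In this model, `toy_payload_survives_early_release(64, 128)` is true because the payload was copied inline, while `toy_payload_survives_early_release(200, 128)` is false because the engine reads the recycled buffer. That matches the observation that raising the inline limit hides the problem without fixing the early release itself.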
Describe the bug
For the UD transport, if the length of the CREQ in the connection packet is greater than max_inline_size, the connection is established in non-inline mode, and in that case a bug occurs. The following is an example:
#6040 Problem treating incoming creq as ack
Steps to Reproduce
Setup and versions
Additional information (depending on the issue)
UCX error log: