Amazon DPDK 19.05 ENA patches #8

Open: 13 commits to merge into base: local-19.05

talawahtech

This PR is to add the set of DPDK patches that Amazon backported to DPDK 19.05 for their ENA driver. https://github.com/amzn/amzn-drivers/tree/ena_linux_2.6.1/userspace/dpdk/19.05

Note that patch 0004-net-ena-fix-L4-checksum-Tx-offload.patch is a duplicate and was therefore removed.

Note also that, going forward, Amazon has switched to a policy of backporting patches only for LTS releases, e.g. 19.11.x and 20.11.x.

semihalf-kozik-rafal and others added 13 commits May 18, 2022 19:32
[ upstream commit 5673e28 ]

Instead of counting the number of used NIC Tx buffers, just count the
number of Tx packets.

Fixes: 45b6d86 ("net/ena: add per-queue software counters stats")
Cc: [email protected]

Change-Id: I8e3260f11ed4b98a5917634d6dfbe92b986c788a
Signed-off-by: Rafal Kozik <[email protected]>
Acked-by: Michal Krawczyk <[email protected]>
[ upstream commit ef74b5f ]

Rx checksum flags and input errors shouldn't be updated on Tx, as that
would work only for packet forwarding.

The ierrors statistic should be updated on Rx, right after checking
Rx checksum flags if the Rx checksum offload is enabled.

Fixes: 1173fca ("ena: add polling-mode driver")
Cc: [email protected]

Change-Id: I4ccf68bcb1ef6b50d01c811fc7f050a3d7ec9966
Signed-off-by: Michal Krawczyk <[email protected]>
[ upstream commit 4217cb0 ]

The previous solution used memzones incorrectly in the hope of assigning
the IO queue to the appropriate NUMA zone.

The right way is to use socket_id from the rx/tx queue setup function
and then pass it to the IO queue.

Fixes: 3d3edc2 ("net/ena: make coherent memory allocation NUMA-aware")
Cc: [email protected]

Change-Id: I252df018ca6aae9d566618b6967bec0c5b3d939a
Signed-off-by: Michal Krawczyk <[email protected]>
Reviewed-by: David Marchand <[email protected]>
[ upstream commit 8190a84 ]

Recent modifications to the admin command queue polling logic
did not support 32-bit applications. Updated the driver to
work for both 32-bit and 64-bit applications.

Fixes: 3adcba9 ("net/ena: update HAL to the newer version")
Cc: [email protected]

Change-Id: I254d8d36af4208c713fbffcfbd0d241a88a972ce
Signed-off-by: David Harton <[email protected]>
Acked-by: Michal Krawczyk <[email protected]>
[ upstream commit 40e7c02 ]

During an if-condition evaluation, a 2-bit flag evaluates to 'true' for
'0x1', '0x2' and '0x3'. Thus, from this perspective these flags are
indistinguishable. To make them distinct, respective bits must be
extracted with a mask and then checked for strict equality.

Specifically here, even if `PKT_TX_UDP_CKSUM` (value '0x3') was set, the
expression `mbuf->ol_flags & PKT_TX_TCP` (the second flag of value
'0x1') is evaluated first and the result is 'true'. In consequence, for
UDP packets the execution flow enters an incorrect branch.
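
A minimal sketch of the difference between the two checks. The `EX_*` constants and helper names are hypothetical stand-ins that reuse the values quoted in this message ('0x1' and '0x3'); they are not the real `rte_mbuf.h` definitions, and this is not the driver code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the 2-bit L4 checksum field. The values
 * follow the commit message, not the real rte_mbuf.h bit positions. */
#define EX_TX_TCP_CKSUM 0x1ULL
#define EX_TX_UDP_CKSUM 0x3ULL
#define EX_TX_L4_MASK   0x3ULL

/* Buggy test: true for any value that shares a bit with the TCP flag,
 * so a UDP request (0x3) is also reported as TCP. */
static bool ex_is_tcp_cksum_buggy(uint64_t ol_flags)
{
	return (ol_flags & EX_TX_TCP_CKSUM) != 0;
}

/* Fixed test: extract the whole field with the mask, then compare
 * for strict equality. */
static bool ex_is_tcp_cksum(uint64_t ol_flags)
{
	return (ol_flags & EX_TX_L4_MASK) == EX_TX_TCP_CKSUM;
}

static bool ex_is_udp_cksum(uint64_t ol_flags)
{
	return (ol_flags & EX_TX_L4_MASK) == EX_TX_UDP_CKSUM;
}
```

With these helpers, a UDP request passes the buggy TCP test but is classified correctly by the masked comparisons.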

Fixes: 56b8b9b ("net/ena: convert to new Tx offloads API")
Cc: [email protected]

Change-Id: I7917b209856f24e5d0b6481a94b322d09266bc5b
Reported-by: Eduard Serra <[email protected]>
Signed-off-by: Maciej Bielski <[email protected]>
Acked-by: Michal Krawczyk <[email protected]>
[ upstream commit 0581705 ]

Add checking of l4_csum_checked and frag flags before checking the
l4_csum_error flag.

In case of IP fragment/unchecked L4 csum - add PKT_RX_L4_CKSUM_UNKNOWN
flag to the indicated mbuf.
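
The ordering described above can be sketched roughly as follows. The `EX_*` flag values and the function signature are hypothetical, chosen only for illustration. Reporting "good" when the checksum was checked and no error was flagged is an assumption, since the message does not spell that branch out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define EX_RX_L4_CKSUM_UNKNOWN 0x0u
#define EX_RX_L4_CKSUM_BAD     0x1u
#define EX_RX_L4_CKSUM_GOOD    0x2u

/* Decide the L4 checksum flag for a received packet.
 * The error bit is consulted only when the device actually checked
 * the L4 checksum and the packet is not an IP fragment. */
static uint32_t ex_rx_l4_cksum_flag(bool l4_csum_checked, bool frag,
				    bool l4_csum_err)
{
	if (!l4_csum_checked || frag)
		return EX_RX_L4_CKSUM_UNKNOWN;

	/* Assumption: a checked packet without an error is reported good. */
	return l4_csum_err ? EX_RX_L4_CKSUM_BAD : EX_RX_L4_CKSUM_GOOD;
}
```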

Fixes: 1173fca ("ena: add polling-mode driver")
Cc: [email protected]

Change-Id: I9f1414741eb44ca289a8bfd11d7e66110c95040e
Signed-off-by: Igor Chauskin <[email protected]>
Reviewed-by: Maciej Bielski <[email protected]>
Reviewed-by: Michal Krawczyk <[email protected]>
[ upstream commit 38364c2 ]

Some ENA devices can't handle buffers smaller than 1400B. Because of
this limitation, the buffer size is checked during the Rx queue setup.

If it is below the allowed value, the PMD won't finish its configuration
successfully.
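
A minimal sketch of such a setup-time check, assuming the 1400B limit stated above. The constant name, the helper name, and the choice of `-EINVAL` as the error code are hypothetical and not taken from the driver.

```c
#include <errno.h>
#include <stdint.h>

/* Smallest Rx buffer size accepted, per the limit stated above. */
#define EX_RX_BUF_MIN_SIZE 1400u

/* Return 0 if the buffer size is acceptable, or a negative errno
 * value (an assumed convention) so that queue setup can fail early. */
static int ex_check_rx_buf_size(uint32_t buf_size)
{
	if (buf_size < EX_RX_BUF_MIN_SIZE)
		return -EINVAL;
	return 0;
}
```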

Change-Id: Ib402d3bfad98a3fc4f91095d5c6f90c6069da021
Signed-off-by: Michal Krawczyk <[email protected]>
Reviewed-by: Igor Chauskin <[email protected]>
Reviewed-by: Guy Tzalik <[email protected]>
[ upstream commit b14fcac ]

Memory allocation region id could possibly be non-unique
due to non-atomic increment, causing allocation failure.
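
The idea can be sketched with C11 atomics, used here in place of whatever atomic helpers the actual patch relies on. The counter and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared counter used to derive region ids. A plain `cnt++` is a
 * separate load and store, so two threads can observe the same value. */
static atomic_uint_fast32_t ex_alloc_cnt;

/* Atomically take the current id and advance the counter, so that
 * concurrent callers never receive the same id. */
static uint32_t ex_next_region_id(void)
{
	return (uint32_t)atomic_fetch_add(&ex_alloc_cnt, 1);
}
```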

Fixes: 9ba7981 ("ena: add communication layer for DPDK")
Cc: [email protected]

Change-Id: Ib9207aaae4e5e7ecdf1a99a0f23a508a53af631d
Signed-off-by: Igor Chauskin <[email protected]>
Reviewed-by: Michal Krawczyk <[email protected]>
Reviewed-by: Guy Tzalik <[email protected]>
[ upstream commit 29dc10d ]

rte_memzone_reserve() reserves the biggest contiguous memzone
available when it receives 0 as the size parameter.

Fixes: 9ba7981 ("ena: add communication layer for DPDK")
Cc: [email protected]

Change-Id: I2f71119e93c5e9addd8c538820093f030066d690
Signed-off-by: Igor Chauskin <[email protected]>
Reviewed-by: Michal Krawczyk <[email protected]>
Reviewed-by: Guy Tzalik <[email protected]>
[ upstream commit badc3a6 ]

IO rings were configured with the maximum allowed size for the Tx/Rx
rings. However, the application could decide to create smaller rings.

This patch uses the value stored in the ring instead of the value from
the adapter, which indicates the maximum allowed value.

Fixes: df238f8 ("net/ena: recreate HW IO rings on start and stop")
Cc: [email protected]

Change-Id: Icf9102e2aa4e7413b6620b36dd232673239b7291
Signed-off-by: Michal Krawczyk <[email protected]>
Reviewed-by: Igor Chauskin <[email protected]>
Reviewed-by: Guy Tzalik <[email protected]>
[ upstream commit 38faa87 ]

The doorbell code is already issuing the doorbell by using rte_write.
Because of that, there is no need to do that before calling the
function.

Change-Id: Ia9c348e485987bc618bc7e89bf7fa057cc240617
Signed-off-by: Michal Krawczyk <[email protected]>
Reviewed-by: Igor Chauskin <[email protected]>
Reviewed-by: Guy Tzalik <[email protected]>
[ upstream commit 7755060 ]

The divider used for both the Tx and Rx cleanup/refill thresholds can
cause too long a delay with really big rings. For example, if an 8k Rx
ring is used, the refill won't trigger until the threshold of 1024 is
reached. It also causes the driver to try to allocate that many
descriptors at once.

Limiting the threshold to a fixed value, 256 in this case, limits the
maximum time spent in the repopulate function.
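
A sketch of the capped threshold. The divider of 8 is inferred from the 8k ring and 1024 threshold in the example above, and the names are hypothetical rather than the driver's own.

```c
#include <stdint.h>

/* Divider inferred from the example above (8192 / 8 = 1024). */
#define EX_THRESH_DIVIDER 8u
/* Fixed upper bound on the cleanup/refill threshold. */
#define EX_THRESH_MAX     256u

/* Threshold is ring_size / divider, but never more than the cap. */
static uint32_t ex_refill_threshold(uint32_t ring_size)
{
	uint32_t thresh = ring_size / EX_THRESH_DIVIDER;

	return thresh < EX_THRESH_MAX ? thresh : EX_THRESH_MAX;
}
```

Small rings keep their proportional threshold, while an 8k ring is refilled after at most 256 free descriptors instead of 1024.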

Change-Id: Ia8659e6ddf179ff612a780adc6fe55d13eeac6e9
Signed-off-by: Michal Krawczyk <[email protected]>
Reviewed-by: Igor Chauskin <[email protected]>
Reviewed-by: Guy Tzalik <[email protected]>
[ upstream commit 5f267cb ]

Can be reproduced with "make EXTRA_CFLAGS='-O1'" command using
gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)

Build error:
.../drivers/net/ena/ena_ethdev.c: In function ‘eth_ena_dev_init’:
.../drivers/net/ena/ena_ethdev.c:1815:20:
    error: ‘wd_state’ may be used uninitialized in this function
           [-Werror=maybe-uninitialized]
 1815 |  adapter->wd_state = wd_state;
      |  ~~~~~~~~~~~~~~~~~~^~~~~~~~~~

This looks like a false positive; fix it by assigning an initial value
to the 'wd_state' variable.

Change-Id: I68bd21d4e2a4b41466e670e282856d4f072dadc0
Signed-off-by: Ferruh Yigit <[email protected]>
Acked-by: Michal Krawczyk <[email protected]>