
RFC: Refactor interrupts #18

Open · wants to merge 1 commit into base: main

Conversation

@lmunch (Contributor) commented Nov 24, 2021

Currently the vector ID and queue ID are the same and the ID starts at 0. User
interrupts are allocated after queue interrupts. This results in the user
interrupt ending up at different locations depending on the number of vectors
available in the system. This is a problem because the user interrupt vector is
hardcoded in the shell and not configurable from SW.

This patch reorders the interrupts so that they match the QDMA reference
driver:

  1. mailbox interrupt (optional - defaults to unused)
  2. user interrupt (optional - defaults to unused)
  3. error interrupt on PF0
  4. queue 0 interrupt
  5. queue 1 interrupt
    ...

Hence, if mailbox interrupts are disabled, the user interrupt ends up at
interrupt vector 0, and if they are enabled, at interrupt vector 1.

Signed-off-by: Lars Munch [email protected]
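
For illustration, a minimal userspace C sketch of the proposed ordering (hypothetical helper names, not code from this patch). It assumes that a disabled mailbox or user interrupt does not reserve a slot, which is what makes the user interrupt land at vector 0 or 1:

/* Hypothetical sketch of the proposed vector ordering, not code from the
 * patch: disabled mailbox/user interrupts do not reserve a slot, so the
 * remaining vectors shift down. */
#include <stdio.h>

static void print_vector_layout(int mailbox_en, int user_en, int num_queues)
{
    int vec = 0;

    if (mailbox_en)
        printf("vector %d: mailbox interrupt\n", vec++);
    if (user_en)
        printf("vector %d: user interrupt\n", vec++);
    printf("vector %d: error interrupt (PF0)\n", vec++);
    for (int q = 0; q < num_queues; q++)
        printf("vector %d: queue %d interrupt\n", vec++, q);
}

int main(void)
{
    /* mailbox disabled, user enabled: the user interrupt lands on vector 0 */
    print_vector_layout(0, 1, 4);
    return 0;
}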

@cneely-amd (Collaborator) commented:

Hi Lars,

I tried testing your changes briefly this afternoon, but I noticed that I was seeing some packet loss (30-50%) when sending pings between my two test machines (with U250 versions of open-nic in each test machine).

Did you see any packet loss in your tests?

I'll be out for the rest of the week, and will reply next week if you respond.

Thanks,
--Chris

@Hyunok-Kim (Contributor) commented Nov 25, 2021

The problem does not seem to be due to a bug in your code.
I have written a brand-new driver based on libqdma.
The original libqdma uses the interrupt order you mentioned.
When I used the original libqdma, I saw abnormal error/user interrupts and packet drops.
So I patched it to change the order. Refer to https://github.com/Hyunok-Kim/onic-driver
I suspect the problem comes from the HW (IP).

@lmunch (Contributor, Author) commented Nov 25, 2021

Interesting. I do not see any packet drops. I do, however, get one Error Interrupt when the driver loads:

[98332.746354] onic 0000:06:00.0 onic6s0f0 (uninitialized): Set MAC address to 00:0a:35:74:34:07
[98332.746360] onic 0000:06:00.0: device is a master PF
[98332.746500] onic 0000:06:00.0: Allocated 5 queue vectors
[98332.746714] onic 0000:06:00.0: Error IRQ (BH) fired on Funtion#00000: vector=84
[98332.746723] onic 0000:06:00.0: Setup IRQ vector 85 with name onic6s0f0-0
[98332.746741] onic 0000:06:00.0: Setup IRQ vector 86 with name onic6s0f0-1
[98332.746756] onic 0000:06:00.0: Setup IRQ vector 87 with name onic6s0f0-2
[98332.746770] onic 0000:06:00.0: Setup IRQ vector 88 with name onic6s0f0-3
[98332.746785] onic 0000:06:00.0: Setup IRQ vector 89 with name onic6s0f0-4

Ideally, the driver should be able to reserve user interrupt vectors (like the XDMA reference driver can) because everything else can be configured, but I figured I would start with these changes.

@Hyunok-Kim the new driver is super interesting, especially since the road to SR-IOV support will be much shorter. To be honest, I was a bit surprised to see that the OpenNIC driver did not use libqdma. What are the plans for your new driver? Replacing the current OpenNIC driver?

Thanks
Lars

@cneely-amd (Collaborator) commented:

@Hyunok-Kim I have a quick question: would the libqdma version require an extra set of data copies for receiving and transmitting? I think the person at Xilinx who wrote the initial open-nic driver started experimenting with libqdma but ended up not using it. Does the performance of your new driver seem better, worse, or about the same? If your new driver turns out to be better overall, could we replace the existing code under open-nic-driver with it?

@Hyunok-Kim (Contributor) commented Nov 30, 2021

While the qep-driver can be configured flexibly using a device tree as firmware,
my driver does not require any extra data to configure the run mode and queue setup; instead, the setup is hard-coded in the driver.

The performance of my driver is worse than open-nic-driver's.
When I test with iperf, its throughput is about half that of open-nic-driver.
While open-nic-driver uses poll mode for TX and interrupt mode for RX,
my driver can only use direct interrupt mode for both RX and TX data transfer.
https://github.com/Xilinx/dma_ip_drivers/blob/master/QDMA/linux-kernel/driver/libqdma/libqdma_export.h#L135-L146
The worse performance might come from this difference.

Anyway, I will test again after I implement poll mode.
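
To make the RX-side difference concrete, here is a generic sketch of the interrupt-plus-NAPI pattern mentioned above (generic Linux driver style, not actual open-nic-driver or libqdma code; the demo_* names are made up): the hard IRQ handler only schedules NAPI, and completed descriptors are then drained in batches from the poll callback, whereas a pure direct-interrupt design does the processing in every interrupt.

/* Generic sketch, not actual open-nic-driver or libqdma code: the IRQ handler
 * defers the work to NAPI, which drains completions in batches. */
#include <linux/interrupt.h>
#include <linux/netdevice.h>

struct demo_rxq {
    struct napi_struct napi;
    /* ring and completion state would live here */
};

static irqreturn_t demo_rx_irq(int irq, void *data)
{
    struct demo_rxq *q = data;

    napi_schedule(&q->napi);            /* defer the real work to softirq context */
    return IRQ_HANDLED;
}

static int demo_rx_poll(struct napi_struct *napi, int budget)
{
    struct demo_rxq *q = container_of(napi, struct demo_rxq, napi);
    int work = 0;

    /* ... process up to 'budget' completed descriptors from q's ring ... */
    (void)q;

    if (work < budget)
        napi_complete_done(napi, work); /* done for now, re-arm the interrupt */
    return work;
}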

@cneely-amd (Collaborator) commented:

@Hyunok-Kim When I try your modifications to the shell and the new driver, I get pretty similar performance to the current driver: current = 13.1 Gb/s and new = 12.4 Gb/s. This is on my local test setup using two machines, each with a U250. I would say it's pretty good, but I am interested in whether polling mode improves performance.
@lmunch What are your thoughts on what to do with this "Refactor interrupts" pull request and the two driver versions at this point?

@lmunch (Contributor, Author) commented Dec 3, 2021

Ideally, the driver should be able to reserve user interrupt vectors (like the XDMA reference driver can) because error, mailbox and queue interrupt vectors can all be configured dynamically while user interrupts cannot. So eventually something like this patch needs to go in. Whether it should be merged in its current form is up to you.

With regard to the two drivers, I am leaning towards the new driver, as the road to SR-IOV is shorter and I believe it could be easier to use the new driver on a PF while running DPDK on a VF. I have not yet had time to try the new driver, though. The benefits I see in the current driver are that it is much smaller and it performs better (at least at the moment).

@lmunch (Contributor, Author) commented Dec 15, 2021

@Hyunok-Kim I noticed that you just updated your libqdma-based driver with some nice JSON configuration. Did you try poll mode for TX to see if you can get performance similar to the current OpenNIC driver? What are your plans going forward with this driver?

@Hyunok-Kim (Contributor) commented Dec 15, 2021

@lmunch Poll mode throughput was not better than interrupt mode throughput in my test setup.

  • interrupt mode (default json file): 8.93 Gbits/sec
  • poll mode ('poll_mode' changed from false to true in the default json file): 7.67 Gbits/sec
  • original driver (open-nic-driver): 15.1 Gbits/sec

But my setup does not seem to be suitable for throughput testing. @cneely-xlnx is running performance tests with my driver on more powerful CPUs. He will report the results and make a decision.
Even if my driver isn't integrated with the current driver, I will keep maintaining it, following new features in open-nic-shell and open-nic-driver.

@lmunch (Contributor, Author) commented Dec 15, 2021

@Hyunok-Kim thanks for the update. Do you have any other features planned for the driver, e.g. SR-IOV? I will see if I can find the time to try it out before Christmas.

@Hyunok-Kim (Contributor) commented Dec 15, 2021

@lmunch Unfortunately, I'm not familiar with SR-IOV and it is not of interest to me. I'm still interested in throughput improvement.
I'm moving on to implementing H2C ST bypass logic for scatter-gather DMA (for jumbo frames) and some offloads (L4 HW checksum, ...) in open-nic-shell and its user plugins. For that, I'm reading the open-nic-shell code carefully. I don't plan to work on that implementation this year.

@cneely-amd (Collaborator) commented:

@lmunch @Hyunok-Kim

In my test setup, I'm running n instances of iperf3 each configured for 5 Gb/s between two local machines. I'm planning to try this on some more powerful server machines, but haven't gotten to that yet. This is a work in progress.

single process     CPU util   transfer rate
open-nic-driver    ~11%       12.9 Gb/s
onic-driver        ~8%        14.9 Gb/s

n=15 processes     CPU util   transfer rate
open-nic-driver    ~86%       65 Gb/s
onic-driver        ~87%       56 Gb/s

@lmunch (Contributor, Author) commented Jan 5, 2022

@Hyunok-Kim, I just had some time to try your new driver. I only had one issue with the code (I will send you a pull request later). The performance on my system is not that great (yet... I know it's a work in progress). I use a 10G connection. With the original OpenNIC driver I get:

$ iperf3 -c 10.10.10.24  -t 20 -R
Connecting to host 10.10.10.24, port 5201
Reverse mode, remote host 10.10.10.24 is sending
[  5] local 10.10.10.44 port 48698 connected to 10.10.10.24 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  1.10 GBytes  9.41 Gbits/sec                  
[  5]   1.00-2.00   sec  1.10 GBytes  9.41 Gbits/sec                  
[  5]   2.00-3.00   sec  1.10 GBytes  9.41 Gbits/sec                  
[  5]   3.00-4.00   sec  1.10 GBytes  9.41 Gbits/sec                  
[  5]   4.00-5.00   sec  1.10 GBytes  9.41 Gbits/sec                  
[  5]   5.00-6.00   sec  1.10 GBytes  9.41 Gbits/sec

And similar in the other direction. When using your driver I get:

$ iperf3 -c 10.10.10.24  -t 20 -R
Connecting to host 10.10.10.24, port 5201
Reverse mode, remote host 10.10.10.24 is sending
[  5] local 10.10.10.44 port 48706 connected to 10.10.10.24 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   816 MBytes  6.84 Gbits/sec                  
[  5]   1.00-2.00   sec   824 MBytes  6.91 Gbits/sec                  
[  5]   2.00-3.00   sec   824 MBytes  6.91 Gbits/sec                  
[  5]   3.00-4.00   sec   809 MBytes  6.79 Gbits/sec                  
[  5]   4.00-5.00   sec   783 MBytes  6.57 Gbits/sec                  
[  5]   5.00-6.00   sec   784 MBytes  6.57 Gbits/sec                  
[  5]   6.00-7.00   sec   782 MBytes  6.56 Gbits/sec                  
[  5]   7.00-8.00   sec   783 MBytes  6.56 Gbits/sec        

and:

$ iperf3 -c 10.10.10.24  -t 20 
Connecting to host 10.10.10.24, port 5201
[  5] local 10.10.10.44 port 48710 connected to 10.10.10.24 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  5.99 MBytes  50.1 Mbits/sec   55   1.41 KBytes       
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    3   1.41 KBytes       
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    6   1.41 KBytes       
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   5.00-6.00   sec   445 KBytes  3.65 Mbits/sec    3   2.83 KBytes       
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    2   1.41 KBytes       
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes       
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes    

@Hyunok-Kim (Contributor) commented:

@lmunch To measure network throughput, I referenced the following articles:
https://support.mellanox.com/s/article/getting-started-with-connectx-5-100gb-s-adapters-for-linux
https://www.kernel.org/doc/ols/2009/ols2009-pages-169-184.pdf
Multi-stream connections are necessary to distribute the overhead of the network stack in the Linux kernel over multiple CPUs.
An RSS (Receive Side Scaling) implementation is also required for this.
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/introduction-to-receive-side-scaling
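
As a rough illustration of the RSS point (a toy hash, not the keyed Toeplitz hash real NICs use; all names and values below are made up for illustration, apart from the two test IP addresses quoted above): each flow's 4-tuple is hashed to select an RX queue, so several parallel iperf3 streams spread their receive work across queues and therefore across CPUs.

/* Toy RSS-style queue selection: hash the flow 4-tuple to an RX queue index.
 * Real NICs use a keyed Toeplitz hash; this only illustrates that distinct
 * streams map to distinct queues (and thus distinct IRQs/CPUs). */
#include <stdio.h>
#include <stdint.h>

static unsigned pick_rx_queue(uint32_t saddr, uint32_t daddr,
                              uint16_t sport, uint16_t dport,
                              unsigned num_queues)
{
    uint32_t h = saddr ^ daddr ^ (((uint32_t)sport << 16) | dport);

    h ^= h >> 16;          /* cheap integer mixing */
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return h % num_queues;
}

int main(void)
{
    /* four parallel iperf3 streams differing only in source port */
    for (unsigned sport = 48700; sport < 48704; sport++)
        printf("sport %u -> rx queue %u\n", sport,
               pick_rx_queue(0x0a0a0a2cu /* 10.10.10.44 */,
                             0x0a0a0a18u /* 10.10.10.24 */,
                             (uint16_t)sport, 5201, 4));
    return 0;
}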

@lmunch (Contributor, Author) commented Jan 6, 2022

@Hyunok-Kim thanks for the links. I know my setup is not good for throughput measurements, but it's good enough for a direct comparison of the two drivers. I run the exact same FPGA code (I modified the RX/TX descriptors in the OpenNIC driver), so I can simply swap drivers and re-test. As you can see from the numbers above, something is not right in onic-driver (especially in the last test).
