RFC: Refactor interrupts #18
base: main
Conversation
Currently the vector ID and queue ID are the same and the ID starts at 0. User interrupts are allocated after the queue interrupts. This results in the user interrupt ending up at different locations depending on the number of vectors available in the system. This is a problem because the user interrupt vector is hardcoded in the shell and not configurable from SW. This patch reorders the interrupts so that they match the QDMA reference driver:

0. mailbox interrupt (optional - defaults to unused)
1. user interrupt (optional - defaults to unused)
2. error interrupt on PF0
3. queue 0 interrupt
4. queue 1 interrupt
...

Hence, if the mailbox interrupt is disabled, the user interrupt ends up at interrupt vector 0, and if it is enabled, at interrupt vector 1.

Signed-off-by: Lars Munch <lars@segv.dk>
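To make the new ordering concrete, here is a minimal sketch of how the vector indices fall out under this scheme. The struct and function names are hypothetical, chosen only for illustration, and are not the driver's actual code:

```c
#include <stdbool.h>

/*
 * Sketch of the vector numbering described above (illustration only).
 * Optional interrupts that are left unused do not consume a vector,
 * which is why the user interrupt lands at vector 0 or 1 depending on
 * whether the mailbox interrupt is enabled.
 */
struct onic_irq_layout {
	int mailbox_vec;   /* -1 when the mailbox interrupt is unused */
	int user_vec;      /* -1 when the user interrupt is unused    */
	int error_vec;     /* error interrupt (PF0 only)              */
	int queue0_vec;    /* queue q then uses queue0_vec + q        */
};

static void onic_compute_irq_layout(bool mailbox_en, bool user_en,
                                    struct onic_irq_layout *l)
{
	int vec = 0;

	l->mailbox_vec = mailbox_en ? vec++ : -1;  /* vector 0 if enabled */
	l->user_vec    = user_en    ? vec++ : -1;  /* vector 0 or 1       */
	l->error_vec   = vec++;                    /* error interrupt     */
	l->queue0_vec  = vec;                      /* queues follow       */
}
```

With the mailbox interrupt disabled and the user interrupt enabled this yields user_vec == 0, matching the vector hardcoded in the shell; enabling the mailbox shifts it to 1, and the queue vectors always follow the error interrupt.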
|
Hi Lars, I tried testing your changes briefly this afternoon, but I noticed some packet loss (30-50%) when sending pings between my two test machines (each with a U250 version of open-nic). Did you see any packet loss in your tests? I'll be out the rest of the week, and will reply next week if you respond. Thanks,
|
The problem does not seem to be due to a bug in your code.
|
Interesting. I do not see any packet drops. I do, though, get one Error Interrupt on driver loading: Ideally, the driver should be able to reserve user interrupt vectors (like the XDMA reference driver can), because everything else can be configured, but I figured I would start with these changes. @Hyunok-Kim that is super interesting about the new driver, especially since the road to SRIOV support will be way shorter. To be honest, I was a bit surprised to see that the opennic driver did not use libqdma. What are the plans for your new driver? Replacing the current opennic driver? Thanks
|
@Hyunok-Kim I have a quick question: would the libqdma version require an extra set of data copies for receiving and transmitting? I think the person at Xilinx who wrote the initial open-nic driver had started experimenting with libqdma, but ended up not using it in the end. Does the performance with your new driver seem better, worse, or about the same? If your new driver turns out to be better overall, could we replace the existing code under open-nic-driver with your new driver?
|
While the qep-driver allows flexible configuration using a device tree as firmware, the performance of my driver is worse than open-nic-driver's. Anyway, I will test it again after I implement poll mode.
|
@Hyunok-Kim When I try your modifications to the shell and the new driver, I get pretty similar performance to the current driver: current = 13.1 Gb/s and new = 12.4 Gb/s. This is on my local test setup using two machines, each with a U250. I would say it's pretty good, but I am interested in whether the polling mode improves performance.
|
Ideally, the driver should be able to reserve user interrupt vectors (like the XDMA reference driver can), because the error, mailbox and queue interrupt vectors can all be configured dynamically while user interrupts cannot. So eventually something like this patch needs to go in. Whether it should be merged in its current form is up to you. With regards to the two drivers, I am leaning towards the new driver, as the road to SRIOV is shorter and I believe it could be easier to use the new driver on a PF while running DPDK on a VF. I have not yet had the time to try the new driver, though. The benefit I see of the current driver is that it is way smaller and it performs better (at least at the moment).
|
@Hyunok-Kim I noticed that you just updated your libqdma-based driver with some nice JSON configuration. Did you try poll mode for TX to see if you can get performance similar to the current OpenNIC driver? What are your plans going forward with this driver?
|
@lmunch Throughput in poll mode wasn't better than in interrupt mode in my test setup. But my setup does not seem suitable for throughput testing. @cneely-xlnx is doing performance tests with my driver and more powerful CPUs. He will report the results and make a decision.
|
@Hyunok-Kim thanks for the update. Do you have any other features planned for the driver, e.g. SRIOV? I will see if I can find the time to try it out before Christmas.
|
@lmunch Unfortunately, I'm not familiar with SRIOV and it is not something I'm interested in. I'm still interested in throughput improvements.
|
In my test setup, I'm running n instances of iperf3, each configured for 5 Gb/s, between two local machines. I'm planning to try this on some more powerful server machines, but haven't gotten to that yet. This is a work in progress.

(Results table: CPU utilization and transfer rate for a single process and for n=15 processes.)
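For reference, a run like this can be driven with something along the following lines. This is only a sketch: the address, port range, and duration are placeholders, and it assumes one iperf3 server is already listening on each port of the remote machine:

```bash
# Launch n=15 iperf3 clients in parallel, each capped at 5 Gb/s.
# 192.0.2.1 is a placeholder address; ports 5201-5215 each need their
# own iperf3 server instance on the remote side.
for i in $(seq 0 14); do
    iperf3 -c 192.0.2.1 -p $((5201 + i)) -b 5G -t 30 &
done
wait
```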
|
@Hyunok-Kim, I just had some time to try your new driver. I only had one issue with the code (I will send you a pull request later). The performance on my system is not that great yet (I know it's a work in progress). I use a 10G connection. With the original OpenNIC driver I get: [results not shown] and similar in the other direction. When using your driver I get: [results not shown] and: [results not shown]
|
@lmunch To measure network throughput, I referenced the following articles:
|
@Hyunok-Kim thanks for the links. I know my setup is not good for throughput measurements, but it's good enough for a direct comparison of the two drivers. I run the exact same FPGA code (I modified the rx/tx descriptors in the OpenNIC driver), so I can simply swap drivers and re-test. As you can see from the numbers above, something is not right in onic-driver (especially in the last test).
|
@lmunch @cneely-amd Is it correct that this PR needs an open-nic-shell patch for the interrupt reordering?
|
I do not recall having to change anything in open-nic-shell to use this patch. If you use user interrupts, then you of course need to set that up correctly. BTW, we never figured out why @cneely-amd was seeing packet loss. I have now been using this patch successfully for several years.
|
@lmunch With this patch I do not see any interrupts; open-nic-shell is not patched. With the main branch, netperf works fine with a lower number of queues (assuming the netperf threads go to different RX queues). On increasing the number of netperf streams, the driver goes into a state with 30% packet loss and does not recover even when there is no traffic. I'm not sure if that is what @cneely-amd saw, but I can reproduce it. If I limit the number of queues to 1 in the driver, then it operates pretty much stably at about 20 Gbit/s, at least for the first 20 minutes. After that an error interrupt happens: onic 0000:01:00.0: Error IRQ (BH) fired on Funtion#00000: vector=181 (/proc/interrupts also shows that single error interrupt), and the HW stops operating. (The HW is an Alveo U250 in 2-port mode, 2 PFs.) Is there any doc where I can understand what the error interrupt means?