Unable to attach netmap-patched veth interface to VALE-backed OVS #306

Closed
ulf-noring opened this issue May 17, 2017 · 2 comments
Comments

@ulf-noring

ulf-noring commented May 17, 2017

I'm trying to add a veth interface (using netmap-patched drivers) to a netmap-patched Open vSwitch, but the add-port operation hangs partway through and never returns. I can abort it with Ctrl+C, but doing so seems to leave things in a broken state. If I reboot the machine and try adding the port while the other end of the veth pair is in another namespace, I get a kernel panic (see the log below). I have tried the same operations in an identical VM with an unpatched OVS and without netmap installed, and they work as expected there (I can attach the interface and later push traffic through it), so I do not think there is anything wrong with my OVS installation or the rest of the OS setup. I looked into this issue, which confirms that this should be possible as long as the drivers are patched.

Since I'm doing this in a VM, I could attempt to attach GDB to the VM, but I have not used GDB before and thought I would ask for advice before going any further. Does anyone know what the issue could be? Should I try a newer kernel version? Open vSwitch 2.6.1 only supports kernels up to 4.7 according to the Open vSwitch documentation. Grateful for any help figuring this out.

Some version information:

OS: Ubuntu 16.04
Kernel: 4.4.0-72-generic
Netmap: GitHub master downloaded on April 2
OVS: 2.6.1, patched with the netmap patch for 2.6.1 (with kernel module also installed)

Steps to repeat the issue (all performed as root):

I first initialize OVS with the following steps:

  1. modprobe openvswitch
  2. ovsdb-server --remote=punix:/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach
  3. ovs-vswitchd --pidfile --detach

The following steps will then reproduce the error:

  1. ovs-vsctl add-br s1
  2. ip link add h1-eth1 type veth peer name s1-eth1
  3. ovs-vsctl add-port s1 s1-eth1

Or, for getting the kernel panic:

  1. ovs-vsctl add-br s1
  2. ip link add h1-eth1 type veth peer name s1-eth1
  3. ip netns add h1
  4. ip link set h1-eth1 netns h1
  5. ovs-vsctl add-port s1 s1-eth1

lsmod | grep openvswitch shows:

Module                  Size  Used by
openvswitch           225280  0
nf_nat_ipv6            16384  1 openvswitch
nf_nat_ipv4            16384  1 openvswitch
nf_defrag_ipv6         36864  2 openvswitch,nf_conntrack_ipv6
nf_nat                 24576  3 openvswitch,nf_nat_ipv4,nf_nat_ipv6
nf_conntrack          106496  6 openvswitch,nf_nat,nf_nat_ipv4,nf_nat_ipv6,nf_conntrack_ipv4,nf_conntrack_ipv6
gre                    16384  1 openvswitch
libcrc32c              16384  2 raid456,openvswitch
netmap                143360  2 e1000,openvswitch

lsmod | grep netmap shows:

netmap 143360 3 veth,e1000,openvswitch

vale-ctl shows:

889.574429 bdg_ctl [149] bridge:0 port:0 valeOVS:s1
889.574545 bdg_ctl [149] bridge:0 port:1 valeOVS:s1^

ovs-vsctl show shows (before adding the port):

93337176-b74d-4e0d-bd94-af24dacb2eb5
    Bridge "s1"
        Port "s1"
            Interface "s1"
                type: internal

Kernel panic log extracted via serial console (pastebin for better formatting: https://pastebin.com/mqfrBYT1 ):
[ 71.196211] 070.260706 [ 133] ovs_vale_ctl netmap_bdg_ctl(valeOVS:s1, NETMAP_BDG_ATTACH, NETMAP_BDG_HOST) --> 0
[ 71.198119] 070.262616 [ 148] ovs_vale_ctl datapath registered to valeOVS:
[ 71.296697] BUG: unable to handle kernel NULL pointer dereference at 000000000000001e
[ 71.300018] IP: [<ffffffffc042d537>] ovs_vport_add+0x117/0x170 [openvswitch]
[ 71.313201] PGD 1b922b067 PUD 1b8195067 PMD 0
[ 71.313201] Oops: 0000 [#1] SMP
[ 71.313201] Modules linked in: veth(OE) openvswitch(OE) nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack gre ppdev input_leds joydev serio_raw i2c_piix4 parport_pc parport mac_hid 8250_fintek ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi pktgen autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt e1000(OE) psmouse fb_sys_fops netmap(OE) floppy drm pata_acpi
[ 71.382476] CPU: 1 PID: 1157 Comm: ovs-vswitchd Tainted: G OE 4.4.0-72-generic #93-Ubuntu
[ 71.389161] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 71.397773] task: ffff8800bb3c5940 ti: ffff8801b8264000 task.ti: ffff8801b8264000
[ 71.402142] RIP: 0010:[<ffffffffc042d537>] [<ffffffffc042d537>] ovs_vport_add+0x117/0x170 [openvswitch]
[ 71.411400] RSP: 0018:ffff8801b8267a68 EFLAGS: 00010203
[ 71.411400] RAX: 0000000000000016 RBX: ffffffffc0445700 RCX: 0000000000000000
[ 71.411400] RDX: 0000000000002a92 RSI: ffff8801c2d1a0c0 RDI: ffff8801bc001900
[ 71.411400] RBP: ffff8801b8267a78 R08: 000000000001a0c0 R09: ffffffffc042cf7e
[ 71.411400] R10: ffffea0006e92980 R11: ffff8801a9c9f000 R12: 0000000000000016
[ 71.411400] R13: ffff8800b96bf100 R14: ffff8801b854d500 R15: 0000000000000002
[ 71.411400] FS: 00007f09e5bab980(0000) GS:ffff8801c2d00000(0000) knlGS:0000000000000000
[ 71.411400] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 71.411400] CR2: 000000000000001e CR3: 00000001b9168000 CR4: 00000000000006e0
[ 71.411400] Stack:
[ 71.411400] ffff8801b8267aa8 ffff8800b8016200 ffff8801b8267a90 ffffffffc0420822
[ 71.411400] ffff8801b8267b50 ffff8801b8267b08 ffffffffc0421cf2 ffff8801b91f6814
[ 71.411400] ffff8801b91f6824 ffffffff00000001 0000000000000000 ffff8800bb885500
[ 71.411400] Call Trace:
[ 71.411400] [<ffffffffc0420822>] new_vport+0x12/0x50 [openvswitch]
[ 71.411400] [<ffffffffc0421cf2>] ovs_vport_cmd_new+0x142/0x280 [openvswitch]
[ 71.411400] [<ffffffff81765e84>] genl_family_rcv_msg+0x1e4/0x3e0
[ 71.411400] [<ffffffff81761cbc>] ? __netlink_sendskb+0x16c/0x250
[ 71.411400] [<ffffffff81766080>] ? genl_family_rcv_msg+0x3e0/0x3e0
[ 71.411400] [<ffffffff817660f6>] genl_rcv_msg+0x76/0xb0
[ 71.411400] [<ffffffff817655f4>] netlink_rcv_skb+0xa4/0xc0
[ 71.411400] [<ffffffff81765c88>] genl_rcv+0x28/0x40
[ 71.411400] [<ffffffff81764daf>] netlink_unicast+0x12f/0x1b0
[ 71.411400] [<ffffffff817652d1>] netlink_sendmsg+0x4a1/0x5f0
[ 71.411400] [<ffffffff8139f201>] ? aa_sock_msg_perm+0x61/0x150
[ 71.411400] [<ffffffff81713ae8>] sock_sendmsg+0x38/0x50
[ 71.411400] [<ffffffff81714591>] ___sys_sendmsg+0x281/0x290
[ 71.411400] [<ffffffff8174029e>] ? rtnl_unlock+0xe/0x10
[ 71.411400] [<ffffffff8174a39e>] ? dev_ioctl+0x1ae/0x580
[ 71.411400] [<ffffffff817114a2>] ? sock_do_ioctl+0x42/0x50
[ 71.411400] [<ffffffff8122ce95>] ? __fget_light+0x25/0x60
[ 71.411400] [<ffffffff81714ee1>] __sys_sendmsg+0x51/0x90
[ 71.411400] [<ffffffff81714f32>] SyS_sendmsg+0x12/0x20
[ 71.411400] [<ffffffff8183c672>] entry_SYSCALL_64_fastpath+0x16/0x71
[ 71.411400] Code: 83 48 8b 7b 38 e8 5a 8f cd c0 84 c0 75 0c 48 c7 c0 9f ff ff ff 5b 41 5c 5d c3 4c 89 e7 ff 53 08 48 3d 00 f0 ff ff 49 89 c4 77 3c <48> 8b 40 08 49 8b 34 24 48 8b 78 60 e8 58 fb ff ff 48 8b 10 49
[ 71.411400] RIP [<ffffffffc042d537>] ovs_vport_add+0x117/0x170 [openvswitch]
[ 71.411400] RSP <ffff8801b8267a68>
[ 71.411400] CR2: 000000000000001e
[ 71.456986] ---[ end trace 6b4a7e00ac57647a ]---
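
The register dump above can be cross-checked with a little arithmetic. The sketch below is not something anyone in this thread ran; it only parses three register lines copied from the oops and compares the faulting address (CR2) with RAX. The helper names `parse_registers` and `is_err` are made up for illustration. `is_err` mirrors the range test behind the kernel's `IS_ERR()` (the top 4095 addresses count as encoded errors). One possible reading, which is an assumption and not confirmed anywhere in this thread: a callback handed back a small positive number (22, which is EINVAL on Linux) where an error pointer was expected, so the error check passed and the next field access faulted.

```python
import re

# Register lines copied verbatim from the oops above.
OOPS = """\
RAX: 0000000000000016 RBX: ffffffffc0445700 RCX: 0000000000000000
R10: ffffea0006e92980 R11: ffff8801a9c9f000 R12: 0000000000000016
CR2: 000000000000001e CR3: 00000001b9168000 CR4: 00000000000006e0
"""

MAX_ERRNO = 4095   # largest errno the kernel encodes in an error pointer
WORD = 1 << 64     # pointers are 64-bit on this machine


def parse_registers(text):
    """Return {name: value} for every 'NAME: 16-hex-digits' pair (hypothetical helper)."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]+): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def is_err(ptr):
    """Same range test as the kernel's IS_ERR(): true for the top 4095 addresses."""
    return ptr >= WORD - MAX_ERRNO


regs = parse_registers(OOPS)
offset = regs["CR2"] - regs["RAX"]

print(f"faulting address = RAX + {offset:#x}")
print(f"RAX = {regs['RAX']} (decimal), IS_ERR(RAX) = {is_err(regs['RAX'])}")
print(f"IS_ERR of -22 encoded as a pointer = {is_err(WORD - 22)}")
```

If that reading is right, the place to look is whatever the patched vport create path returns on failure, which narrows down where to put debug prints. It remains a guess until checked against the patched source.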

@vmaffione
Collaborator

Hi,
The problem is that the OVS-netmap kernel patch is partial and not really supported; it was just an initial experiment. The right way to integrate netmap and OVS would be to provide a user-space implementation, the same way DPDK does (e.g. the dpdk and dpdkvhostuser ports).

Anyway, if you want to debug this, you could add debug print statements in the ovs_vport_add function (before each line) and reproduce the crash, to find out which line causes it.

@ulf-noring
Author

A very late answer, but thank you for the reply. I'll look into how DPDK does it and the possibility of debugging.
