This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

Pull in bpf/for-next (8f4a95080418) #204

Merged
merged 3,896 commits into from
May 17, 2024
This pull request is big! We’re only showing the most recent 250 commits.

Commits on May 7, 2024

  1. libbpf: improve early detection of doomed-to-fail BPF program loading

    Extend libbpf's pre-load checks for BPF programs, detecting more typical
    conditions that are destined to cause BPF program failure. This is an
    opportunity to provide more helpful and actionable error messages to
    users, instead of a potentially very confusing BPF verifier log and/or
    error.
    
    In this case, we detect a struct_ops BPF program that is not referenced
    anywhere but that libbpf would still attempt to load (according to libbpf
    logic). Suggest that the program might need to be used in some struct_ops
    variable. The user will get a message of the following kind:
    
      libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it?
    
    Suggested-by: Tejun Heo <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin KaFai Lau <[email protected]>
    anakryiko authored and Martin KaFai Lau committed May 7, 2024
    Commit c78420b
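The pre-load check described in this commit can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct prog_info`, its fields, and `report_unreferenced_struct_ops()` are hypothetical names invented for illustration and are not libbpf's data structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified view of a BPF program as a loader might track it. */
struct prog_info {
    const char *name;
    bool is_struct_ops; /* declared with SEC("struct_ops") */
    bool autoload;      /* the loader intends to load it */
    int map_refs;       /* struct_ops map slots pointing at this program */
};

/*
 * Count the programs that would be loaded although no struct_ops map
 * references them, and print a hint for each one.
 */
static int report_unreferenced_struct_ops(const struct prog_info *progs, size_t n)
{
    int bad = 0;

    assert(progs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct prog_info *p = &progs[i];

        if (p->is_struct_ops && p->autoload && p->map_refs == 0) {
            fprintf(stderr,
                    "prog '%s': SEC(\"struct_ops\") program isn't referenced anywhere, did you forget to use it?\n",
                    p->name);
            bad++;
        }
    }
    return bad;
}
```

A loader built along these lines could refuse to continue when the function returns a non-zero count, so the user sees the hint rather than a verifier log.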
  2. selftests/bpf: validate struct_ops early failure detection logic

    Add a simple test that validates that libbpf will reject an isolated
    struct_ops program early with a helpful warning message.
    
    Also validate that explicit use of such a BPF program through the BPF
    skeleton after the BPF object is opened won't trigger any warnings.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin KaFai Lau <[email protected]>
    anakryiko authored and Martin KaFai Lau committed May 7, 2024
    Commit 41df073
  3. selftests/bpf: shorten subtest names for struct_ops_module test

    Drive-by cleanup: we shouldn't use the meaningless "test_" prefix for
    subtest names.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin KaFai Lau <[email protected]>
    anakryiko authored and Martin KaFai Lau committed May 7, 2024
    Commit 7b9959b
  4. Merge branch 'libbpf: further struct_ops fixes and improvements'

    Andrii Nakryiko says:
    
    ====================
    Fix yet another case of mishandling SEC("struct_ops") programs that were
    nulled out programmatically through BPF skeleton by the user.
    
    While at it, add some improvements around detecting and reporting errors,
    specifically the common case of declaring a SEC("struct_ops") program but
    forgetting to actually make use of it by setting it as a callback
    implementation in a SEC(".struct_ops") variable (i.e., map) declaration.
    
    A bunch of new selftests are added as well.
    ====================
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Martin KaFai Lau committed May 7, 2024
    Commit 7e2c7a3
  5. selftests: netfilter: conntrack_tcp_unreplied.sh: wait for initial connection attempt
    
    Netdev CI reports occasional failures with this test
    ("ERROR: ns2-dX6bUE did not pick up tcp connection from peer").
    
    Add explicit busywait call until the initial connection attempt shows
    up in conntrack rather than a one-shot 'must exist' check.
    
    Signed-off-by: Florian Westphal <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Florian Westphal authored and kuba-moo committed May 7, 2024
    Commit 7650815
  6. mptcp: fix possible NULL dereferences

    subflow_add_reset_reason(skb, ...) can fail.
    
    We cannot assume mptcp_get_ext(skb) always returns a non-NULL pointer.
    
    syzbot reported:
    
    general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
    KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
    CPU: 0 PID: 5098 Comm: syz-executor132 Not tainted 6.9.0-rc6-syzkaller-01478-gcdc74c9d06e7 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
     RIP: 0010:subflow_v6_route_req+0x2c7/0x490 net/mptcp/subflow.c:388
    Code: 8d 7b 07 48 89 f8 48 c1 e8 03 42 0f b6 04 20 84 c0 0f 85 c0 01 00 00 0f b6 43 07 48 8d 1c c3 48 83 c3 18 48 89 d8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 84 01 00 00 0f b6 5b 01 83 e3 0f 48 89
    RSP: 0018:ffffc9000362eb68 EFLAGS: 00010206
    RAX: 0000000000000003 RBX: 0000000000000018 RCX: ffff888022039e00
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff88807d961140 R08: ffffffff8b6cb76b R09: 1ffff1100fb2c230
    R10: dffffc0000000000 R11: ffffed100fb2c231 R12: dffffc0000000000
    R13: ffff888022bfe273 R14: ffff88802cf9cc80 R15: ffff88802ad5a700
    FS:  0000555587ad2380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f420c3f9720 CR3: 0000000022bfc000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      tcp_conn_request+0xf07/0x32c0 net/ipv4/tcp_input.c:7180
      tcp_rcv_state_process+0x183c/0x4500 net/ipv4/tcp_input.c:6663
      tcp_v6_do_rcv+0x8b2/0x1310 net/ipv6/tcp_ipv6.c:1673
      tcp_v6_rcv+0x22b4/0x30b0 net/ipv6/tcp_ipv6.c:1910
      ip6_protocol_deliver_rcu+0xc76/0x1570 net/ipv6/ip6_input.c:438
      ip6_input_finish+0x186/0x2d0 net/ipv6/ip6_input.c:483
      NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
      NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
      __netif_receive_skb_one_core net/core/dev.c:5625 [inline]
      __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5739
      netif_receive_skb_internal net/core/dev.c:5825 [inline]
      netif_receive_skb+0x1e8/0x890 net/core/dev.c:5885
      tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1549
      tun_get_user+0x2f35/0x4560 drivers/net/tun.c:2002
      tun_chr_write_iter+0x113/0x1f0 drivers/net/tun.c:2048
      call_write_iter include/linux/fs.h:2110 [inline]
      new_sync_write fs/read_write.c:497 [inline]
      vfs_write+0xa84/0xcb0 fs/read_write.c:590
      ksys_write+0x1a0/0x2c0 fs/read_write.c:643
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Fixes: 3e14049 ("mptcp: support rstreason for passive reset")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
    Reviewed-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 7, 2024
    Commit 445c0b6
  7. nfc: nci: Fix kcov check in nci_rx_work()

    Commit 7e8cdc9 ("nfc: Add KCOV annotations") added
    kcov_remote_start_common()/kcov_remote_stop() pair into nci_rx_work(),
    with an assumption that kcov_remote_stop() is called upon continue of
    the for loop. But commit d24b035 ("nfc: nci: Fix uninit-value in
    nci_dev_up and nci_ntf_packet") forgot to call kcov_remote_stop() before
    break of the for loop.
    
    Reported-by: syzbot <[email protected]>
    Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2
    Fixes: d24b035 ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet")
    Suggested-by: Andrey Konovalov <[email protected]>
    Signed-off-by: Tetsuo Handa <[email protected]>
    Reviewed-by: Krzysztof Kozlowski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Tetsuo Handa authored and kuba-moo committed May 7, 2024
    Commit 19e35f2
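The pairing problem fixed in the nci_rx_work() commit can be sketched in plain user-space C. This is a toy model, not the kernel code: `cov_start()`, `cov_stop()`, `coverage_depth`, and `process_queue()` are hypothetical stand-ins, and a negative value plays the role of the packet that makes the loop break.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for remote coverage state: non-zero means a section is open. */
static int coverage_depth;

static void cov_start(void) { coverage_depth++; }
static void cov_stop(void)  { coverage_depth--; }

/* Handle packets until an invalid (negative) one is seen; return the count. */
static int process_queue(const int *pkts, size_t n)
{
    int handled = 0;

    assert(pkts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        cov_start();
        if (pkts[i] < 0) {
            cov_stop(); /* the early exit must close the section too */
            break;
        }
        handled++;
        cov_stop();
    }
    return handled;
}
```

Without the `cov_stop()` before `break`, `coverage_depth` would stay at 1 after the loop, which is the kind of imbalance the commit message describes.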
  8. lib: Allow for the DIM library to be modular

    Allow the Dynamic Interrupt Moderation (DIM) library to be built as a
    module. This is particularly useful in an Android GKI (Google Kernel
    Image) configuration where everything is built as a module, including
    Ethernet controller drivers. Having to build DIMLIB into the kernel
    image with potentially no user is wasteful.
    
    Signed-off-by: Florian Fainelli <[email protected]>
    Reviewed-by: Alexander Lobakin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    ffainelli authored and kuba-moo committed May 7, 2024
    Commit 0d5044b

Commits on May 8, 2024

  1. selftests/net: fix uninitialized variables

    When building with clang, via:
    
        make LLVM=1 -C tools/testing/selftests
    
    ...clang warns about three variables that are not initialized in all
    cases:
    
    1) The opt_ipproto_off variable is used uninitialized if "testname" is
    not "ip". Willem de Bruijn pointed out that this is an actual bug, and
    suggested the fix that I'm using here (thanks!).
    
    2) The addr_len is used uninitialized, but only in the assert case,
       which bails out, so this is harmless.
    
    3) The family variable in add_listener() is only used uninitialized in
       the error case (neither IPv4 nor IPv6 is specified), so it's also
       harmless.
    
    Fix by initializing each variable.
    
    Signed-off-by: John Hubbard <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Acked-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    johnhubbard authored and kuba-moo committed May 8, 2024
    Commit eb709b5
  2. mptcp: only allow set existing scheduler for net.mptcp.scheduler

    The current behavior is to accept any string as input. This leads to an
    inconsistent state where a nonexistent scheduler can be set:
    
      # sysctl -w net.mptcp.scheduler=notdefault
      net.mptcp.scheduler = notdefault
    
    This patch changes this behavior by checking that the scheduler exists
    before accepting the input.
    
    Fixes: e3b2870 ("mptcp: add a new sysctl scheduler")
    Cc: [email protected]
    Signed-off-by: Gregory Detal <[email protected]>
    Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
    Tested-by: Geliang Tang <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://lore.kernel.org/r/20240506-upstream-net-20240506-mptcp-sched-exist-v1-1-2ed1529e521e@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    gdetal authored and kuba-moo committed May 8, 2024
    Commit 6963c50
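The validation described in the net.mptcp.scheduler commit amounts to a lookup before the assignment. The following user-space sketch illustrates the idea only; the `registered` list, `current_sched`, `set_scheduler()`, and the choice of `-ENOENT` and `-EINVAL` as return values are assumptions made for this example, not the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SCHED_NAME_MAX 16

/* Hypothetical registry of known scheduler names. */
static const char *const registered[] = { "default" };
static char current_sched[SCHED_NAME_MAX] = "default";

/* Accept the new name only if a scheduler with that name is registered. */
static int set_scheduler(const char *name)
{
    assert(name != NULL);
    if (strlen(name) >= SCHED_NAME_MAX)
        return -EINVAL;

    for (size_t i = 0; i < sizeof(registered) / sizeof(registered[0]); i++) {
        if (strcmp(registered[i], name) == 0) {
            strcpy(current_sched, name);
            return 0;
        }
    }
    return -ENOENT; /* unknown name: the current value stays untouched */
}
```

With this shape, a write of "notdefault" fails and the previously configured scheduler remains in effect.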
  3. usb: aqc111: stop lying about skb->truesize

    Some usb drivers try to set small skb->truesize and break
    core networking stacks.
    
    I replace one skb_clone() by an allocation of a fresh
    and small skb, to get minimally sized skbs, like we did
    in commit 1e2c611 ("net: cdc_ncm: reduce skb truesize
    in rx path") and 4ce62d5 ("net: usb: ax88179_178a:
    stop lying about skb->truesize")
    
    Fixes: 361459c ("net: usb: aqc111: Implement RX data path")
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 8, 2024
    Commit 9aad6e4
  4. net: usb: smsc75xx: stop lying about skb->truesize

    Some usb drivers try to set small skb->truesize and break
    core networking stacks.
    
    In this patch, I removed one of the skb->truesize overrides.
    
    I also replaced one skb_clone() by an allocation of a fresh
    and small skb, to get minimally sized skbs, like we did
    in commit 1e2c611 ("net: cdc_ncm: reduce skb truesize
    in rx path") and 4ce62d5 ("net: usb: ax88179_178a:
    stop lying about skb->truesize")
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Steve Glendinning <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 8, 2024
    Commit 1b3b2d9
  5. net: usb: sr9700: stop lying about skb->truesize

    Some usb drivers set small skb->truesize and break
    core networking stacks.
    
    In this patch, I removed one of the skb->truesize overrides.
    
    I also replaced one skb_clone() by an allocation of a fresh
    and small skb, to get minimally sized skbs, like we did
    in commit 1e2c611 ("net: cdc_ncm: reduce skb truesize
    in rx path") and 4ce62d5 ("net: usb: ax88179_178a:
    stop lying about skb->truesize")
    
    Fixes: c9b3745 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support")
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 8, 2024
    Commit 05417aa
  6. Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
    
    Tony Nguyen says:
    
    ====================
    Intel Wired LAN Driver Updates 2024-05-06 (ice)
    
    This series contains updates to ice driver only.
    
    Paul adds support for additional E830 devices and adjusts naming for
    existing E830 devices.
    
    Marcin commonizes a couple of TC setup calls to reduce duplicated code.
    
    Mateusz adds ice_vsi_cfg_params into ice_vsi to consolidate info.
    
    * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
      ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsi
      ice: Deduplicate tc action setup
      ice: update E830 device ids and comments
      ice: add additional E830 device ids
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 8, 2024
    Commit 09ca994
  7. virtiofs: include a newline in sysfs tag

    The internal tag string doesn't contain a newline. Append one when
    emitting the tag via sysfs.
    
    [Stefan] Orthogonal to the newline issue, sysfs_emit(buf, "%s", fs->tag) is
    needed to prevent format string injection.
    
    Signed-off-by: Brian Foster <[email protected]>
    Fixes: a8f62f5 ("virtiofs: export filesystem tags through sysfs")
    Signed-off-by: Miklos Szeredi <[email protected]>
    Brian Foster authored and Miklos Szeredi committed May 8, 2024
    Commit 96d88f6
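Both points from the virtiofs commit (the trailing newline and the constant format string) can be shown with standard C. This sketch uses `snprintf()` in user space as a stand-in for the kernel's sysfs_emit(); `emit_tag()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write the tag followed by a newline into buf. The format string is a
 * constant, so any '%' inside the tag is copied literally instead of being
 * interpreted as a conversion.
 */
static int emit_tag(char *buf, size_t size, const char *tag)
{
    assert(buf != NULL && tag != NULL);
    return snprintf(buf, size, "%s\n", tag);
}
```

Passing the tag itself as the format string would let a tag such as "my%sfs" trigger a read of a non-existent argument, which is the injection risk the note from Stefan refers to.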
  8. net: dsa: add support for DCB get/set apptrust configuration

    Add DCB support to get/set trust configuration for different packet
    priority information sources. Some switches allow choosing between
    different sources of packet priority classification. For example, on KSZ
    switches it is possible to configure VLAN PCP and/or DSCP sources.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit 96c6f33
  9. net: dsa: microchip: add IPV information support

    Most Microchip KSZ switches use an Internal Priority Value (IPV) associated
    with every frame. For example, it is possible to map any VLAN PCP or
    DSCP value to an IPV and, at the end, map the IPV to a queue.
    
    Since the number of IPVs is not equal to the number of queues, add this
    information and make use of it in some functions.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit 97278f8
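When the number of IPVs differs from the number of queues, some rule has to fold one range onto the other. The sketch below uses an even split into contiguous bands; this rule is an assumption chosen for illustration and is neither the driver's mapping nor the IEEE 802.1Q table. `ipv_to_queue()` is a hypothetical name.

```c
#include <assert.h>

/*
 * Map an internal priority value onto a queue by dividing the IPV range
 * into num_queues equal, contiguous bands (higher IPV, higher queue).
 */
static unsigned int ipv_to_queue(unsigned int ipv, unsigned int num_ipvs,
                                 unsigned int num_queues)
{
    assert(num_ipvs > 0 && num_queues > 0 && num_queues <= num_ipvs);
    assert(ipv < num_ipvs);
    return ipv * num_queues / num_ipvs;
}
```

With 8 IPVs and 4 queues, each queue receives two adjacent IPVs under this rule; with a single queue, every IPV lands on queue 0.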
  10. net: add IEEE 802.1q specific helpers

    The IEEE 802.1Q specification provides recommendations and examples that
    can be used as good default values for different drivers.
    
    This patch implements the mapping examples documented in IEEE 802.1Q-2022,
    Annex I, "I.3 Traffic type to traffic class mapping", as well as IETF DSCP
    naming and a DSCP-to-traffic-type mapping inspired by RFC8325.
    
    These helpers will be used in follow-up patches for the dsa/microchip DCB
    implementation.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit 768cf84
  11. net: dsa: microchip: add multi queue support for KSZ88X3 variants

    KSZ88X3 switches support up to 4 queues. Rework ksz8795_set_prio_queue()
    to support KSZ8795 and KSZ88X3 families of switches.
    
    By default, configure KSZ88X3 to use one queue, since it needs special
    handling due to priority-related errata. Errata handling is implemented
    in a separate patch.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit 328de46
  12. net: dsa: microchip: add support for different DCB app configurations

    Add DCB support to configure app trust sources and default port priority.
    
    Following commands can be used for testing:
    dcb apptrust set dev lan1 order pcp dscp
    dcb app replace dev lan1 default-prio 3
    
    Since it is not possible to configure DSCP-Prio mapping per port, this
    patch only provides the ability to read the switch-global dscp-prio
    mapping and a way to enable/disable app trust for DSCP.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit a16efc6
  13. net: dsa: microchip: dcb: add special handling for KSZ88X3 family

    KSZ88X3 switches have different behavior on different ports:
    - It seems to be impossible to disable VLAN PCP classification on
      Port 2. This means that as soon as multiqueue support is enabled,
      frames with a VLAN tag will get PCP priorities. This behavior does
      not affect Port 1, where it is possible to disable PCP priorities.
    - DSCP classification is not working on Port 2.
    
    Since there are still usable configuration combinations, I added some
    quirks to make sure the user gets an appropriate error message if an
    unsupported configuration is chosen.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit a1ea577
  14. net: dsa: microchip: enable ETS support for KSZ989X variants

    I tested ETS support on KSZ9893, so it should work on other KSZ989X
    variants too, which were not yet listed as supported.
    
    With this change, the only family of chips we officially do not support
    is ksz8.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit c631250
  15. net: dsa: microchip: init predictable IPV to queue mapping for all non KSZ8xxx variants
    
    Initialize the priority-to-queue mapping as shown in the IEEE 802.1Q
    mapping example.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit 3bcb896
  16. net: dsa: microchip: let DCB code do PCP and DSCP policy configuration

    802.1P (PCP) and DiffServ (DSCP) are handled now by DCB code. Let it do
    all needed initial configuration.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit ea1078d
  17. net: dsa: add support switches global DSCP priority mapping

    Some switches like Microchip KSZ variants do not support per port DSCP
    priority configuration. Instead there is a global DSCP mapping table.
    
    To handle this, we accept a set/del request on any of the user ports,
    apply it as the global configuration, and update the dcb app entries for
    all other ports.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit 5f5109a
  18. net: dsa: microchip: add support DSCP priority mapping

    Microchip KSZ and LAN variants do not have per port DSCP priority
    configuration. Instead there is a global DSCP mapping table.
    
    This patch provides write access to this global DSCP map. When an entry
    is "deleted", we map the corresponding DSCP entry to the best-effort
    priority, which is expected to be the default priority for all untagged
    traffic.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit c2e7226
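The two DSCP commits above describe one switch-wide table whose changes are mirrored to every port, with "delete" meaning "reset to best effort". A toy user-space model of that behavior follows. All names, the port count, and the value 0 for the best-effort priority are assumptions made for this example, not values taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PORTS        3
#define NUM_DSCP         64
#define PRIO_BEST_EFFORT 0 /* assumption for this sketch */

static unsigned char global_dscp_prio[NUM_DSCP];     /* switch-wide table */
static unsigned char port_view[NUM_PORTS][NUM_DSCP]; /* per-port app entries */

/* Mirror one global entry into the view of every port. */
static void sync_ports(unsigned int dscp)
{
    assert(dscp < NUM_DSCP);
    for (size_t p = 0; p < NUM_PORTS; p++)
        port_view[p][dscp] = global_dscp_prio[dscp];
}

/* A request on any valid port updates the global table and all ports. */
static int dscp_prio_set(unsigned int port, unsigned int dscp, unsigned char prio)
{
    if (port >= NUM_PORTS || dscp >= NUM_DSCP || prio > 7)
        return -1;
    global_dscp_prio[dscp] = prio;
    sync_ports(dscp);
    return 0;
}

/* "Deleting" an entry maps it back to the best-effort priority. */
static int dscp_prio_del(unsigned int port, unsigned int dscp)
{
    return dscp_prio_set(port, dscp, PRIO_BEST_EFFORT);
}
```

The port argument only selects where the request arrived; the effect is identical whichever user port is named.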
  19. selftests: microchip: add test for QoS support on KSZ9477 switch family

    Add tests covering the following functionality on the KSZ9477 switch family:
    - default port priority
    - global DSCP to Internal Priority Mapping
    - apptrust configuration
    
    This script was tested on KSZ9893R
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    Commit cbc7aff
  20. Merge branch 'ksz-dcb-dscp'

    Oleksij Rempel says:
    
    ====================
    add DCB and DSCP support for KSZ switches
    
    This patch series is aimed at improving support for DCB (Data Center
    Bridging) and DSCP (Differentiated Services Code Point) on KSZ switches.
    
    The main goal is to introduce global DSCP and PCP (Priority Code Point)
    mapping support, addressing the limitation of KSZ switches not having
    per-port DSCP priority mapping. This involves extending the DSA
    framework with new callbacks for managing trust settings for global DSCP
    and PCP maps. Additionally, we introduce IEEE 802.1q helpers for default
    configurations, benefiting other drivers too.
    
    Change logs are in separate patches.
    
    Compared to v6 this series includes some new patches for DSCP global
    mapping support and QoS selftest script for KSZ9477 switches.
    ====================
    
    Signed-off-by: David S. Miller <[email protected]>
    davem330 committed May 8, 2024
    Commit 9f481ce
  21. net: bridge: fix corrupted ethernet header on multicast-to-unicast

    The change from skb_copy to pskb_copy unfortunately changed the data
    copying to omit the ethernet header, since it was pulled before reaching
    this point. Fix this by calling __skb_push/pull around pskb_copy.
    
    Fixes: 59c878c ("net: bridge: fix multicast-to-unicast with fraglist GSO")
    Signed-off-by: Felix Fietkau <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    nbd168 authored and davem330 committed May 8, 2024
    Commit 86b29d8
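The push/copy/pull pattern from the bridge fix can be modeled with a toy packet buffer. This is a simplified user-space sketch, not kernel skb code: `struct pkt` and the `pkt_*` helpers are hypothetical, and the copy helper is assumed to duplicate only the bytes from the current data offset onward.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_LEN 14 /* Ethernet header length */
#define BUF_LEN 64

/* Toy packet: bytes before 'data' were pulled but still sit in buf. */
struct pkt {
    unsigned char buf[BUF_LEN];
    size_t data; /* offset of the current start of data */
    size_t len;  /* bytes from 'data' to the end of the packet */
};

static void pkt_pull(struct pkt *p, size_t n)
{
    assert(n <= p->len);
    p->data += n;
    p->len -= n;
}

static void pkt_push(struct pkt *p, size_t n)
{
    assert(n <= p->data);
    p->data -= n;
    p->len += n;
}

/* Copy only the bytes from 'data' onward; already pulled bytes are lost. */
static struct pkt pkt_copy_from_data(const struct pkt *p)
{
    struct pkt c;

    memset(&c, 0, sizeof(c));
    c.data = p->data;
    c.len = p->len;
    memcpy(c.buf + c.data, p->buf + p->data, p->len);
    return c;
}

/* Push the header back before copying, then restore both offsets. */
static struct pkt pkt_copy_keep_header(struct pkt *p)
{
    struct pkt c;

    pkt_push(p, HDR_LEN);
    c = pkt_copy_from_data(p);
    pkt_pull(p, HDR_LEN);
    pkt_pull(&c, HDR_LEN);
    return c;
}
```

In this model the plain copy leaves the header area of the clone zeroed, while the wrapped copy preserves it and still returns both packets with the header pulled.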
  22. net/ipv4: add tracepoint for icmp_send

    Introduce a tracepoint for icmp_send, which helps users conveniently get
    more detailed information when abnormal ICMP events happen.
    
    1. A use-case example:
    ======================
    When an application experiences packet loss due to an unreachable UDP
    destination port, the kernel will send an exception message through the
    icmp_send function. By adding a trace point for icmp_send, developers or
    system administrators can obtain detailed information about the UDP
    packet loss, including the type, code, source address, destination address,
    source port, and destination port. This facilitates the trouble-shooting
    of UDP packet loss issues especially for those network-service
    applications.
    
    2. Operation Instructions:
    ==========================
    Switch to the tracing directory.
            cd /sys/kernel/tracing
    Filter for destination port unreachable.
            echo "type==3 && code==3" > events/icmp/icmp_send/filter
    Enable trace event.
            echo 1 > events/icmp/icmp_send/enable
    
    3. Result View:
    ================
     udp_client_erro-11370   [002] ...s.12   124.728002:
     icmp_send: icmp_send: type=3, code=3.
     From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23
     skbaddr=00000000589b167a
    
    Signed-off-by: Peilin He <[email protected]>
    Signed-off-by: xu xin <[email protected]>
    Reviewed-by: Yunkai Zhang <[email protected]>
    Cc: Yang Yang <[email protected]>
    Cc: Liu Chun <[email protected]>
    Cc: Xuexin Jiang <[email protected]>
    Reviewed-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Peilin He authored and davem330 committed May 8, 2024
    Commit db3efdc
  23. appletalk: Improve handling of broadcast packets

    When a broadcast AppleTalk packet is received, prefer queuing it on the
    socket whose address matches the address of the interface that received
    the packet (and is listening on the correct port). Userspace
    applications that handle such packets will usually send a response on
    the same socket that received the packet; this fix allows the response
    to be sent on the correct interface.
    
    If a socket matching the interface's address is not found, an arbitrary
    socket listening on the correct port will be used, if any. This matches
    the implementation's previous behavior.
    
    Fixes atalkd's responses to network information requests when multiple
    network interfaces are configured to use AppleTalk.
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Link: https://gist.github.com/VinDuv/4db433b6dce39d51a5b7847ee749b2a4
    Signed-off-by: Vincent Duvert <[email protected]>
    Signed-off-by: Doug Brown <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    VinDuv authored and davem330 committed May 8, 2024
    Commit 2e82a58
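The socket selection rule in the AppleTalk commit (prefer the socket bound to the receiving interface's address, otherwise fall back to any socket on the port) can be sketched as a single pass over a socket list. `struct at_sock` and `pick_broadcast_socket()` are hypothetical names, and using the first port match as the fallback is an assumption; the commit only says the fallback is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal description of a bound socket. */
struct at_sock {
    unsigned short net;
    unsigned char node;
    unsigned char port;
};

/*
 * Return the socket on 'port' whose address equals the receiving
 * interface's address, else the first socket on 'port', else NULL.
 */
static const struct at_sock *
pick_broadcast_socket(const struct at_sock *socks, size_t n,
                      unsigned short if_net, unsigned char if_node,
                      unsigned char port)
{
    const struct at_sock *fallback = NULL;

    assert(socks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (socks[i].port != port)
            continue;
        if (socks[i].net == if_net && socks[i].node == if_node)
            return &socks[i]; /* exact interface match wins */
        if (fallback == NULL)
            fallback = &socks[i];
    }
    return fallback;
}
```

Because the reply usually leaves on the socket that received the request, picking the interface-matched socket is what routes the response out of the right interface.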
  24. net: phy: marvell-88q2xxx: add support for Rev B1 and B2

    Different revisions of the Marvell 88q2xxx PHY need different init
    sequences.
    
    Add init sequence for Rev B1 and Rev B2. Rev B2 init sequence skips one
    register write.
    
    Tested-by: Dimitri Fedrau <[email protected]>
    Signed-off-by: Gregor Herburger <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Gregor Herburger authored and davem330 committed May 8, 2024
    Commit ab0cde3
  25. net: bridge: switchdev: Improve error message for port_obj_add/del functions
    
    Enhance the error reporting mechanism in the switchdev framework to
    provide more informative and user-friendly error messages.
    
    Following feedback from users struggling to understand the implications
    of error messages like "failed (err=-28) to add object (id=2)", this
    update aims to clarify what operation failed and how this might impact
    the system or network.
    
    With this change, error messages now include a description of the failed
    operation, the specific object involved, and a brief explanation of the
    potential impact on the system. This approach helps administrators and
    developers better understand the context and severity of errors,
    facilitating quicker and more effective troubleshooting.
    
    Example of the improved logging:
    
    [   70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast
                   Database entry (object id=2) with error: -ENOSPC (-28).
    [   70.516446] Failure in updating the port's Multicast Database could
                   lead to multicast forwarding issues.
    [   70.516446] Current HW/SW setup lacks sufficient resources.
    
    This comprehensive update includes handling for a range of switchdev
    object IDs, ensuring that most operations within the switchdev framework
    benefit from clearer error reporting.
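
    As a rough illustration of the idea (not the code from this patch), the
    sketch below turns an object id and an errno value into a message shaped
    like the example log above. The enum obj_id, its values, and
    format_obj_add_error() are hypothetical names invented for this sketch;
    only the message wording follows the log lines quoted in the commit.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical object ids for this sketch; not the kernel's enum. */
enum obj_id { OBJ_PORT_VLAN = 1, OBJ_PORT_MDB = 2 };

static const char *obj_name(enum obj_id id)
{
	switch (id) {
	case OBJ_PORT_VLAN: return "Port VLAN";
	case OBJ_PORT_MDB:  return "Port Multicast Database entry";
	default:            return "Unknown object";
	}
}

static const char *err_name(int err)
{
	switch (err) {
	case -ENOSPC:     return "-ENOSPC";
	case -ENOMEM:     return "-ENOMEM";
	case -EOPNOTSUPP: return "-EOPNOTSUPP";
	default:          return "unknown";
	}
}

/* Formats a descriptive message into buf; returns snprintf()'s result. */
int format_obj_add_error(char *buf, size_t len, enum obj_id id, int err)
{
	return snprintf(buf, len,
			"Failed to add %s (object id=%d) with error: %s (%d).",
			obj_name(id), (int)id, err_name(err), err);
}
```

    Naming both the object and the symbolic errno is what lets a reader act
    on the message without looking up "id=2" or "-28" by hand.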
    
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Oleksij Rempel <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    olerem authored and davem330 committed May 8, 2024
    b7ffab2
  26. net: stmmac: dwmac-ipq806x: account for rgmii-txid/rxid/id phy-mode

    Currently the ipq806x dwmac driver is almost always used attached to the
    CPU port of a switch and phy-mode was always set to "rgmii" or "sgmii".
    
    Some devices came up with a special configuration where the PHY is
    directly attached to the GMAC port, and in those cases phy-mode needs to
    be set to "rgmii-id" for the PHY to work correctly and receive packets.
    
    Since the driver supports only "rgmii" and "sgmii" mode, when "rgmii-id"
    (or variants) mode is set, the mode is rejected and probe fails.
    
    Add support for these phy-modes as well, to correctly set up PHYs that
    require a delay applied to tx/rx.
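
    The before/after behaviour described above can be sketched as follows.
    This is an illustration only, not the driver code: enum phy_mode and both
    helper functions are hypothetical stand-ins for the kernel's PHY interface
    modes and for the driver's mode check.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's PHY interface modes. */
enum phy_mode {
	MODE_RGMII,
	MODE_RGMII_ID,
	MODE_RGMII_RXID,
	MODE_RGMII_TXID,
	MODE_SGMII,
	MODE_OTHER,
};

/* Before: only plain "rgmii" and "sgmii" pass, so "rgmii-id" fails probe. */
bool mode_supported_old(enum phy_mode mode)
{
	return mode == MODE_RGMII || mode == MODE_SGMII;
}

/* After: every RGMII delay variant is accepted like plain RGMII. */
bool mode_supported_new(enum phy_mode mode)
{
	switch (mode) {
	case MODE_RGMII:
	case MODE_RGMII_ID:
	case MODE_RGMII_RXID:
	case MODE_RGMII_TXID:
	case MODE_SGMII:
		return true;
	default:
		return false;
	}
}
```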
    
    Signed-off-by: Christian Marangi <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Ansuel authored and davem330 committed May 8, 2024
    abb45a2
  27. ipv6: Fix potential uninit-value access in __ip6_make_skb()

    As it was done in commit fc1092f ("ipv4: Fix uninit-value access in
    __ip_make_skb()") for IPv4, check FLOWI_FLAG_KNOWN_NH on fl6->flowi6_flags
    instead of testing HDRINCL on the socket to avoid a race condition which
    causes uninit-value access.
    
    Fixes: ea30388 ("ipv6: Fix an uninit variable access bug in __ip6_make_skb()")
    Signed-off-by: Shigeru Yoshida <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Shigeru Yoshida authored and davem330 committed May 8, 2024
    4e13d3a
  28. ipvs: add READ_ONCE barrier for ipvs->sysctl_amemthresh

    Cc: Julian Anastasov <[email protected]>
    Cc: Simon Horman <[email protected]>
    Cc: Pablo Neira Ayuso <[email protected]>
    Cc: Jozsef Kadlecsik <[email protected]>
    Cc: Florian Westphal <[email protected]>
    Suggested-by: Julian Anastasov <[email protected]>
    Signed-off-by: Alexander Mikhalitsyn <[email protected]>
    Acked-by: Julian Anastasov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    mihalicyn authored and davem330 committed May 8, 2024
    643bb5d
  29. ipvs: allow some sysctls in non-init user namespaces

    Let's make all IPVS sysctls writable even when the
    network namespace is owned by a non-initial user namespace.
    
    Let's make a few sysctls read-only for non-privileged users:
    - sync_qlen_max
    - sync_sock_size
    - run_estimation
    - est_cpulist
    - est_nice
    
    I'm trying to be conservative with this to avoid
    introducing any security issues. Maybe
    we can allow more sysctls to be writable, but let's
    do this on demand, when we see a real use case.
    
    This patch is motivated by a user request in the LXC
    project [1]. Having this can help with running some
    Kubernetes [2] or Docker Swarm [3] workloads inside system
    containers.
    
    Link: lxc/lxc#4278 [1]
    Link: https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103 [2]
    Link: https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682 [3]
    
    Cc: Julian Anastasov <[email protected]>
    Cc: Simon Horman <[email protected]>
    Cc: Pablo Neira Ayuso <[email protected]>
    Cc: Jozsef Kadlecsik <[email protected]>
    Cc: Florian Westphal <[email protected]>
    Signed-off-by: Alexander Mikhalitsyn <[email protected]>
    Acked-by: Julian Anastasov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    mihalicyn authored and davem330 committed May 8, 2024
    2b696a2
  30. ax25: Remove superfluous "return" from ax25_ds_set_timer

    Remove the explicit call to "return" in the void ax25_ds_set_timer
    function that was introduced in 78a7b5d ("ax.25: x.25: Remove the
    now superfluous sentinel elements from ctl_table array").
    
    Signed-off-by: Joel Granados <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Joelgranados authored and davem330 committed May 8, 2024
    1d3985e
  31. test: hsr: Call cleanup_all_ns when hsr_redbox.sh script exits

    Without this change the created netns instances are not cleared after
    this script execution. To fix this problem the cleanup_all_ns function
    from ../lib.sh is called.
    
    Signed-off-by: Lukasz Majewski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Lukasz Majewski authored and davem330 committed May 8, 2024
    252aa6d
  32. selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC
    
    When creating the topology for the test, three veth pairs are created in
    the initial network namespace before being moved to one of the network
    namespaces created by the test.
    
    On systems where systemd-udev uses MACAddressPolicy=persistent (default
    since systemd version 242), this will result in some net devices having
    the same MAC address since they were created with the same name in the
    initial network namespace. In turn, this leads to arping / ndisc6
    failing since packets are dropped by the bridge's loopback filter.
    
    Fix by creating each net device in the correct network namespace instead
    of moving it there from the initial network namespace.
    
    Reported-by: Jakub Kicinski <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Fixes: 7648ac7 ("selftests: net: Add bridge neighbor suppression test")
    Signed-off-by: Ido Schimmel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    idosch authored and kuba-moo committed May 8, 2024
    9a169c2
  33. bpf, arm64: Add support for lse atomics in bpf_arena

    When LSE atomics are available, BPF atomic instructions are implemented
    as single ARM64 atomic instructions, therefore it is easy to enable
    these in bpf_arena using the currently available exception handling
    setup.
    
    LL_SC atomics use loops and therefore would need more work to enable in
    bpf_arena.
    
    Enable LSE atomics based instructions in bpf_arena and use the
    bpf_jit_supports_insn() callback to reject atomics in bpf_arena if LSE
    atomics are not available.
    
    All atomics and arena_atomics selftests are passing:
    
      [root@ip-172-31-2-216 bpf]# ./test_progs -a atomics,arena_atomics
      #3/1     arena_atomics/add:OK
      #3/2     arena_atomics/sub:OK
      #3/3     arena_atomics/and:OK
      #3/4     arena_atomics/or:OK
      #3/5     arena_atomics/xor:OK
      #3/6     arena_atomics/cmpxchg:OK
      #3/7     arena_atomics/xchg:OK
      #3       arena_atomics:OK
      #10/1    atomics/add:OK
      #10/2    atomics/sub:OK
      #10/3    atomics/and:OK
      #10/4    atomics/or:OK
      #10/5    atomics/xor:OK
      #10/6    atomics/cmpxchg:OK
      #10/7    atomics/xchg:OK
      #10      atomics:OK
      Summary: 2/14 PASSED, 0 SKIPPED, 0 FAILED
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed May 8, 2024
    e612b5c
  34. rxrpc: Fix congestion control algorithm

    Make the following fixes to the congestion control algorithm:
    
     (1) Don't vary the cwnd starting value by the size of RXRPC_TX_SMSS since
         that's currently held constant - set to the size of a jumbo subpacket
         payload so that we can create jumbo packets on the fly.  The current
         code invariably picks 3 as the starting value.
    
         Further, the starting cwnd needs to be an even number because we ack
         every other packet, so set it to 4.
    
     (2) Don't cut ssthresh when we see an ACK come from the peer with a
         receive window (rwind) less than ssthresh.  ssthresh keeps track of
         characteristics of the connection whereas rwind may be reduced by the
         peer for any reason - and may be reduced to 0.
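
    The two fixes can be sketched in isolation as below. This is not the
    rxrpc code: all three function names are hypothetical, and the first
    helper assumes the previous starting value followed the RFC 5681
    initial-window rule (the commit text only says it always ended up at 3).

```c
/* RFC 5681 initial window in segments, as a function of the sender MSS. */
unsigned int rfc5681_initial_cwnd(unsigned int smss)
{
	if (smss > 2190)
		return 2;
	if (smss > 1095)
		return 3;
	return 4;
}

/* Fixed starting value: even, because every other packet is ACKed. */
unsigned int fixed_initial_cwnd(void)
{
	return 4;
}

/* ssthresh is left alone regardless of the peer's receive window. */
unsigned int ssthresh_after_ack(unsigned int ssthresh, unsigned int rwind)
{
	(void)rwind; /* may legitimately be 0; must not drag ssthresh down */
	return ssthresh;
}
```

    With a constant segment size between 1096 and 2190 bytes, the first
    helper can only ever return 3, which matches the behaviour the commit
    describes as the problem.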
    
    Fixes: 1fc4fa2 ("rxrpc: Fix congestion management")
    Fixes: 0851115 ("rxrpc: Reduce ssthresh to peer's receive window")
    Signed-off-by: David Howells <[email protected]>
    Suggested-by: Simon Wilkinson <[email protected]>
    cc: Marc Dionne <[email protected]>
    cc: [email protected]
    Reviewed-by: Jeffrey Altman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    dhowells authored and kuba-moo committed May 8, 2024
    ba4e103
  35. rxrpc: Only transmit one ACK per jumbo packet received

    Only generate one ACK packet for all the subpackets in a jumbo packet.  If
    we would like to generate more than one ACK, we prioritise them based on
    their reason code, in the order, highest first:
    
       OutOfSeq > NoSpace > ExceedsWin > Duplicate > Requested > Delay > Idle
    
    For the first four, we reference the lowest offending subpacket; for the
    last three, the highest.
    
    This reduces the number of ACKs we end up transmitting to one per UDP
    packet transmitted to reduce network loading and packet parsing.
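
    A minimal sketch of that selection rule, assuming one proposed ACK per
    subpacket is folded into a single result. The enum values, struct
    ack_choice and ack_merge() are hypothetical names for illustration and
    are not taken from the rxrpc sources.

```c
/* Hypothetical reason codes, ordered so that a larger value wins. */
enum ack_reason {
	ACK_NONE = 0,
	ACK_IDLE,
	ACK_DELAY,
	ACK_REQUESTED,
	ACK_DUPLICATE,
	ACK_EXCEEDS_WIN,
	ACK_NO_SPACE,
	ACK_OUT_OF_SEQ,
};

struct ack_choice {
	enum ack_reason reason;
	unsigned int seq;   /* subpacket sequence number the ACK refers to */
};

/* The four highest-priority reasons reference the lowest offender. */
static int wants_lowest(enum ack_reason r)
{
	return r >= ACK_DUPLICATE;
}

/* Fold one subpacket's proposed ACK into the single ACK for the jumbo. */
void ack_merge(struct ack_choice *best, enum ack_reason reason, unsigned int seq)
{
	if (reason == ACK_NONE)
		return;
	if (reason > best->reason) {
		best->reason = reason;
		best->seq = seq;
	} else if (reason == best->reason) {
		if (wants_lowest(reason) ? seq < best->seq : seq > best->seq)
			best->seq = seq;
	}
}
```

    Encoding the priority in the enum order keeps the comparison to a single
    integer test per subpacket.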
    
    Fixes: 5d7edbc ("rxrpc: Get rid of the Rx ring")
    Signed-off-by: David Howells <[email protected]>
    cc: Marc Dionne <[email protected]>
    cc: [email protected]
    Reviewed-by: Jeffrey Altman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    dhowells authored and kuba-moo committed May 8, 2024
    012b720
  36. Merge branch 'rxrpc-miscellaneous-fixes'

    David Howells says:
    
    ====================
    rxrpc: Miscellaneous fixes (part)
    
    Here are some miscellaneous fixes for AF_RXRPC:
    
     (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut
         ssthresh when the peer cuts its rwind size.
    
     (2) Only transmit a single ACK for all the DATA packets glued together
         into a jumbo packet to reduce the number of ACKs being generated.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 8, 2024
    0275410
  37. i40e: flower: validate control flags

    This driver currently doesn't support any control flags.
    
    Use flow_rule_has_control_flags() to check for control flags,
    such as can be set through `tc flower ... ip_flags frag`.
    
    In case any control flags are masked, flow_rule_has_control_flags()
    sets a NL extended error message, and we return -EOPNOTSUPP.
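
    The shape of such a check, reduced to a userspace sketch: struct
    ctrl_match, validate_control_flags() and the message string are
    hypothetical stand-ins, not the kernel's flow offload API, and are only
    meant to show the "reject if any control flag is matched" pattern.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the control-flags part of a flower match. */
struct ctrl_match {
	unsigned int flags_mask;   /* nonzero when any control flag is matched */
};

/* Reject rules that match on control flags the driver cannot offload. */
int validate_control_flags(const struct ctrl_match *m, const char **errmsg)
{
	if (m->flags_mask) {
		*errmsg = "Unsupported match on control flags";
		return -EOPNOTSUPP;
	}
	*errmsg = NULL;
	return 0;
}
```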
    
    Only compile-tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Asbjørn Sloth Tønnesen authored and anguy11 committed May 8, 2024
    174ee5b
  38. iavf: flower: validate control flags

    This driver currently doesn't support any control flags.
    
    Use flow_rule_has_control_flags() to check for control flags,
    such as can be set through `tc flower ... ip_flags frag`.
    
    In case any control flags are masked, flow_rule_has_control_flags()
    sets a NL extended error message, and we return -EOPNOTSUPP.
    
    Only compile-tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Asbjørn Sloth Tønnesen authored and anguy11 committed May 8, 2024
    c7b9c49
  39. ice: flower: validate control flags

    This driver currently doesn't support any control flags.
    
    Use flow_rule_has_control_flags() to check for control flags,
    such as can be set through `tc flower ... ip_flags frag`.
    
    In case any control flags are masked, flow_rule_has_control_flags()
    sets a NL extended error message, and we return -EOPNOTSUPP.
    
    Only compile-tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Michal Swiatkowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Asbjørn Sloth Tønnesen authored and anguy11 committed May 8, 2024
    21e1fe9
  40. igb: flower: validate control flags

    This driver currently doesn't support any control flags.
    
    Use flow_rule_match_has_control_flags() to check for control flags,
    such as can be set through `tc flower ... ip_flags frag`.
    
    In case any control flags are masked, flow_rule_match_has_control_flags()
    sets a NL extended error message, and we return -EOPNOTSUPP.
    
    Only compile-tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Asbjørn Sloth Tønnesen authored and anguy11 committed May 8, 2024
    fb324f2
  41. ice: remove correct filters during eswitch release

    ice_clear_dflt_vsi() removes only the default rule. Both the default RX
    and TX rules should be removed during release.
    
    If they aren't, switching to switchdev a second time results in an error,
    because the TX filter is already there.
    
    Fix it by removing the correct set of rules.
    
    Fixes: 50d6202 ("ice: default Tx rule instead of to queue")
    Reviewed-by: Wojciech Drewek <[email protected]>
    Signed-off-by: Michal Swiatkowski <[email protected]>
    Signed-off-by: Marcin Szycik <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Michal Swiatkowski authored and anguy11 committed May 8, 2024
    8e3a90f
  42. igc: fix a log entry using uninitialized netdev

    During successful probe, igc logs this:
    
    [    5.133667] igc 0000:01:00.0 (unnamed net_device) (uninitialized): PHC added
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    The reason is that igc_ptp_init() is called very early, even before
    register_netdev() has been called. So the netdev_info() call works
    on a partially uninitialized netdev.
    
    Fix this by calling igc_ptp_init() after register_netdev(), right
    after the media autosense check, just as in igb.  Add a comment,
    just as in igb.
    
    Now the log message is fine:
    
    [    5.200987] igc 0000:01:00.0 eth0: PHC added
    
    Signed-off-by: Corinna Vinschen <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Acked-by: Vinicius Costa Gomes <[email protected]>
    Tested-by: Naama Meir <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    github-cygwin authored and anguy11 committed May 8, 2024
    8616718
  43. net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD duplicates

    PCI_HEADER_TYPE_MULTIFUNC is defined by e1000e and ixgbe, and both
    definitions are unused. PCI_HEADER_TYPE_MFD already exists in pci_regs.h
    and should be used instead, so remove the duplicated defines.
    
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Reviewed-by: Jesse Brandeburg <[email protected]>
    Acked-by: Sasha Neftin <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    ij-intel authored and anguy11 committed May 8, 2024
    6918107
  44. Merge tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
    
    Pull pci fixes from Bjorn Helgaas:
    
     - Update kernel-parameters doc to describe "pcie_aspm=off" more
       accurately (Bjorn Helgaas)
    
     - Restore the parent's (not the child's) ASPM state to the parent
       during resume, which fixes a reboot during resume (Kai-Heng Feng)
    
    * tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
      PCI/ASPM: Restore parent state to parent, child state to child
      PCI/ASPM: Clarify that pcie_aspm=off means leave ASPM untouched
    torvalds committed May 8, 2024
    1ab1a19
  45. bpf: avoid uninitialized warnings in verifier_global_subprogs.c

    [Changes from V1:
    - The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized.
    - This warning is only supported in GCC.]
    
    The BPF selftest verifier_global_subprogs.c contains code that
    purposely performs out-of-bounds accesses to memory, to check whether
    the kernel verifier is able to catch them.  For example:
    
      __noinline int global_unsupp(const int *mem)
      {
    	if (!mem)
    		return 0;
    	return mem[100]; /* BOOM */
      }
    
    With -O1 and higher and no inlining, GCC notices this fact and emits a
    "maybe uninitialized" warning.  This is by design.  Note that the
    emission of these warnings is highly dependent on the precise
    optimizations that are performed.
    
    This patch adds a compiler pragma to verifier_global_subprogs.c to
    ignore these warnings.
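
    A pragma of this kind could look like the sketch below. The exact guard
    used in the selftest may differ; the compiler check and the function
    name read_far() are assumptions made for this illustration.

```c
/* -Wmaybe-uninitialized exists only in GCC; Clang also defines __GNUC__. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif

/* Deliberately reads far past any small buffer the caller might pass. */
__attribute__((noinline)) int read_far(const int *mem)
{
	if (!mem)
		return 0;
	return mem[100];
}
```

    Limiting the pragma to GCC avoids an unknown-warning-option diagnostic
    from compilers that do not implement this warning.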
    
    Tested in bpf-next master.
    No regressions.
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Yonghong Song <[email protected]>
    Cc: Eduard Zingerman <[email protected]>
    Acked-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jemarch authored and Alexei Starovoitov committed May 8, 2024
    cd3fc3b
  46. bpf: avoid UB in usages of the __imm_insn macro

    [Changes from V2:
     - no-strict-aliasing is only applied when building with GCC.
     - cpumask_failure.c is excluded, as it doesn't use __imm_insn.]
    
    The __imm_insn macro is defined in bpf_misc.h as:
    
      #define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))
    
    This may lead to type-punning and strict aliasing rules violations in
    its typical usage where the address of a struct bpf_insn is passed as
    expr, like in:
    
      __imm_insn(st_mem,
                 BPF_ST_MEM(BPF_W, BPF_REG_1, offsetof(struct __sk_buff, mark), 42))
    
    Where:
    
      #define BPF_ST_MEM(SIZE, DST, OFF, IMM)				\
    	((struct bpf_insn) {					\
    		.code  = BPF_ST | BPF_SIZE(SIZE) | BPF_MEM,	\
    		.dst_reg = DST,					\
    		.src_reg = 0,					\
    		.off   = OFF,					\
    		.imm   = IMM })
    
    In all the actual instances of this in the BPF selftests the value is
    fed to a volatile asm statement as soon as it gets read from memory,
    and thus it is unlikely that breaking the strict aliasing rules leads
    to misguided optimizations.
    
    However, GCC detects the potential problem (indirectly) by issuing a
    warning stating that a temporary <Uxxxxxx> is used uninitialized,
    where the temporary corresponds to the memory read by *(long *).
    
    This patch adds -fno-strict-aliasing to the compilation flags of the
    particular selftests that do type punning via __imm_insn, only for
    GCC.
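
    For contrast, the sketch below reads the same eight bytes through
    memcpy(), which does not violate the strict aliasing rules. This is not
    what the patch does (it only adds -fno-strict-aliasing), and it is not
    claimed to satisfy the "i" asm constraint that __imm_insn relies on.
    struct insn is a hypothetical 8-byte stand-in for struct bpf_insn.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte stand-in for struct bpf_insn. */
struct insn {
	uint8_t  code;
	uint8_t  regs;   /* dst_reg and src_reg packed into one byte */
	int16_t  off;
	int32_t  imm;
};

_Static_assert(sizeof(struct insn) == sizeof(uint64_t),
	       "struct insn must be exactly 8 bytes");

/* Copy the instruction's bytes into a 64-bit integer without type punning. */
uint64_t insn_as_u64(struct insn in)
{
	uint64_t v;

	memcpy(&v, &in, sizeof(v));
	return v;
}
```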
    
    Tested in master bpf-next.
    No regressions.
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Yonghong Song <[email protected]>
    Cc: Eduard Zingerman <[email protected]>
    Acked-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jemarch authored and Alexei Starovoitov committed May 8, 2024
    1209a52
  47. bpf: guard BPF_NO_PRESERVE_ACCESS_INDEX in skb_pkt_end.c

    This little patch is a follow-up to:
    https://lore.kernel.org/bpf/[email protected]/T/#u
    
    The temporary workaround of passing -DBPF_NO_PRESERVE_ACCESS_INDEX
    when building with GCC triggers a redefinition preprocessor error when
    building progs/skb_pkt_end.c.  This patch adds a guard to avoid
    redefinition.
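
    The usual form of such a guard is sketched below. The helper
    access_index_disabled() exists only to make the sketch observable and is
    hypothetical; the exact lines in progs/skb_pkt_end.c may differ.

```c
/* May already be defined on the command line via -DBPF_NO_PRESERVE_ACCESS_INDEX. */
#ifndef BPF_NO_PRESERVE_ACCESS_INDEX
#define BPF_NO_PRESERVE_ACCESS_INDEX
#endif

/* Demonstration helper: reports whether the macro ended up defined. */
int access_index_disabled(void)
{
#ifdef BPF_NO_PRESERVE_ACCESS_INDEX
	return 1;
#else
	return 0;
#endif
}
```

    With the guard in place the file builds the same way whether or not the
    macro was already supplied by the build flags.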
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Eduard Zingerman <[email protected]>
    Cc: Yonghong Song <[email protected]>
    Cc: Andrii Nakryiko <[email protected]>
    Acked-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jemarch authored and Alexei Starovoitov committed May 8, 2024
    911edc6
  48. Merge tag 'soc-fixes-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
    
    Pull ARM SoC fixes from Arnd Bergmann:
     "These are a couple of last minute fixes that came in over the previous
      week, addressing:
    
       - A pin configuration bug on a qualcomm board that caused issues with
         ethernet and mmc
    
       - Two minor code fixes for misleading console output in the microchip
         firmware driver
    
       - A build warning in the sifive cache driver"
    
    * tag 'soc-fixes-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
      firmware: microchip: clarify that sizes and addresses are in hex
      firmware: microchip: don't unconditionally print validation success
      arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration
      cache: sifive_ccache: Silence unused variable warning
    torvalds committed May 8, 2024
    6d7ddd8
  49. Merge tag 'bcachefs-2024-05-07.2' of https://evilpiepirate.org/git/bcachefs
    
    Pull bcachefs fixes from Kent Overstreet:
    
     - Various syzbot fixes; mainly small gaps in validation
    
     - Fix an integer overflow in fiemap() which was preventing filefrag
       from returning the full list of extents
    
     - Fix a refcounting bug on the device refcount, turned up by new
       assertions in the development branch
    
     - Fix a device removal/readd bug; write_super() was repeatedly dropping
       and retaking bch_dev->io_ref references
    
    * tag 'bcachefs-2024-05-07.2' of https://evilpiepirate.org/git/bcachefs:
      bcachefs: Add missing sched_annotate_sleep() in bch2_journal_flush_seq_async()
      bcachefs: Fix race in bch2_write_super()
      bcachefs: BCH_SB_LAYOUT_SIZE_BITS_MAX
      bcachefs: Add missing skcipher_request_set_callback() call
      bcachefs: Fix snapshot_t() usage in bch2_fs_quota_read_inode()
      bcachefs: Fix shift-by-64 in bformat_needs_redo()
      bcachefs: Guard against unknown k.k->type in __bkey_invalid()
      bcachefs: Add missing validation for superblock section clean
      bcachefs: Fix assert in bch2_alloc_v4_invalid()
      bcachefs: fix overflow in fiemap
      bcachefs: Add a better limit for maximum number of buckets
      bcachefs: Fix lifetime issue in device iterator helpers
      bcachefs: Fix bch2_dev_lookup() refcounting
      bcachefs: Initialize bch_write_op->failed in inline data path
      bcachefs: Fix refcount put in sb_field_resize error path
      bcachefs: Inodes need extra padding for varint_decode_fast()
      bcachefs: Fix early error path in bch2_fs_btree_key_cache_exit()
      bcachefs: bucket_pos_to_bp_noerror()
      bcachefs: don't free error pointers
      bcachefs: Fix a scheduler splat in __bch2_next_write_buffer_flush_journal_buf()
    torvalds committed May 8, 2024
    f5fcbc8
  50. Merge tag 'exfat-for-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
    
    Pull exfat fixes from Namjae Jeon:
    
     - Fix xfstests generic/013 test failure with dirsync mount option
    
     - Initialize the reserved fields of deleted file and stream extension
       dentries to zero
    
    * tag 'exfat-for-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
      exfat: zero the reserved fields of file and stream extension dentries
      exfat: fix timing of synchronizing bitmap and inode
    torvalds committed May 8, 2024
    fe35bf2
  51. Merge tag 'fuse-fixes-6.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
    
    Pull fuse fixes from Miklos Szeredi:
     "Two one-liner fixes for issues introduced in -rc1"
    
    * tag 'fuse-fixes-6.9-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
      virtiofs: include a newline in sysfs tag
      fuse: verify zero padding in fuse_backing_map
    torvalds committed May 8, 2024
    065a057
  52. Merge tag '6.9-rc7-ksmbd-fixes' of git://git.samba.org/ksmbd

    Pull smb server fixes from Steve French:
     "Five ksmbd server fixes, all also for stable
    
       - Three fixes related to SMB3 leases (fixes two xfstests, and a
         locking issue)
    
       - Uninitialized variable fix
    
       - Socket creation fix when bindv6only is set"
    
    * tag '6.9-rc7-ksmbd-fixes' of git://git.samba.org/ksmbd:
      ksmbd: do not grant v2 lease if parent lease key and epoch are not set
      ksmbd: use rwsem instead of rwlock for lease break
      ksmbd: avoid to send duplicate lease break notifications
      ksmbd: off ipv6only for both ipv4/ipv6 binding
      ksmbd: fix uninitialized symbol 'share' in smb2_tree_connect()
    torvalds committed May 8, 2024
    45db3ab
  53. bpf: Avoid uninitialized value in BPF_CORE_READ_BITFIELD

    [Changes from V1:
     - Use a default branch in the switch statement to initialize `val'.]
    
    GCC warns that `val' may be used uninitialized in the
    BPF_CORE_READ_BITFIELD macro, defined in bpf_core_read.h as:
    
    	[...]
    	unsigned long long val;						      \
    	[...]								      \
    	switch (__CORE_RELO(s, field, BYTE_SIZE)) {			      \
    	case 1: val = *(const unsigned char *)p; break;			      \
    	case 2: val = *(const unsigned short *)p; break;		      \
    	case 4: val = *(const unsigned int *)p; break;			      \
    	case 8: val = *(const unsigned long long *)p; break;		      \
            }       							      \
    	[...]
    	val;								      \
    	}								      \
    
    This patch adds a default entry in the switch statement that sets
    `val' to zero. This avoids both the warning and the use of random values
    in case __builtin_preserve_field_info returns unexpected values
    for BPF_FIELD_BYTE_SIZE.
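
    Outside the macro, the same pattern looks like the sketch below.
    read_sized() is a hypothetical plain-function analogue written for this
    illustration; the real change lives inside the macro in bpf_core_read.h.

```c
/* Load `size` bytes from p as an unsigned value; unknown sizes yield 0. */
unsigned long long read_sized(const void *p, int size)
{
	unsigned long long val;

	switch (size) {
	case 1: val = *(const unsigned char *)p; break;
	case 2: val = *(const unsigned short *)p; break;
	case 4: val = *(const unsigned int *)p; break;
	case 8: val = *(const unsigned long long *)p; break;
	default: val = 0; break;   /* keeps val defined on unexpected sizes */
	}
	return val;
}
```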
    
    Tested in bpf-next master.
    No regressions.
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    jemarch authored and anakryiko committed May 8, 2024
    0093670

Commits on May 9, 2024

  1. dt-bindings: net: ipq4019-mdio: add IPQ9574 compatible

    Add a compatible property specific to IPQ9574. This should be used
    along with the IPQ4019 compatible. This second compatible serves the
    same purpose as the ipq{5,6,8} compatibles. This is to indicate that
    the clocks properties are required.
    
    Signed-off-by: Alexandru Gagniuc <[email protected]>
    Acked-by: Conor Dooley <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    mrnuke authored and kuba-moo committed May 9, 2024
    3a2a192
  2. netlink/specs: Add VF attributes to rt_link spec

    Add support for retrieving VFs as part of link info. For example:
    
    ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \
      --do getlink --json '{"ifi-index": 38, "ext-mask": ["vf", "skip-stats"]}'
    {'address': 'b6:75:91:f2:64:65',
     [snip]
     'vfinfo-list': {'info': [{'broadcast': b'\xff\xff\xff\xff\xff\xff\x00\x00'
                                            b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                            b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                            b'\x00\x00\x00\x00\x00\x00\x00\x00',
                               'link-state': {'link-state': 'auto', 'vf': 0},
                               'mac': {'mac': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00',
                                       'vf': 0},
                               'rate': {'max-tx-rate': 0,
                                        'min-tx-rate': 0,
                                        'vf': 0},
                               'rss-query-en': {'setting': 0, 'vf': 0},
                               'spoofchk': {'setting': 0, 'vf': 0},
                               'trust': {'setting': 0, 'vf': 0},
                               'tx-rate': {'rate': 0, 'vf': 0},
                               'vlan': {'qos': 0, 'vf': 0, 'vlan': 0},
                               'vlan-list': {'info': [{'qos': 0,
                                                       'vf': 0,
                                                       'vlan': 0,
                                                       'vlan-proto': 0}]}},
                              {'broadcast': b'\xff\xff\xff\xff\xff\xff\x00\x00'
                                            b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                            b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                            b'\x00\x00\x00\x00\x00\x00\x00\x00',
                               'link-state': {'link-state': 'auto', 'vf': 1},
                               'mac': {'mac': b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00'
                                              b'\x00\x00\x00\x00\x00\x00\x00\x00',
                                       'vf': 1},
                               'rate': {'max-tx-rate': 0,
                                        'min-tx-rate': 0,
                                        'vf': 1},
                               'rss-query-en': {'setting': 0, 'vf': 1},
                               'spoofchk': {'setting': 0, 'vf': 1},
                               'trust': {'setting': 0, 'vf': 1},
                               'tx-rate': {'rate': 0, 'vf': 1},
                               'vlan': {'qos': 0, 'vf': 1, 'vlan': 0},
                               'vlan-list': {'info': [{'qos': 0,
                                                       'vf': 1,
                                                       'vlan': 0,
                                                       'vlan-proto': 0}]}}]},
     'xdp': {'attached': 0}}
    
    Signed-off-by: Donald Hunter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    donaldh authored and kuba-moo committed May 9, 2024
    Commit e497c32
  3. dt-bindings: net: mediatek: remove wrongly added clocks and SerDes

    Several clocks as well as both sgmiisys phandles were added by mistake
    to the Ethernet bindings for MT7988. Also, the total number of clocks
    didn't match with the actual number of items listed.
    
    This happened because the vendor driver which served as a reference uses
    a high number of syscon phandles to access various parts of the SoC
    which wasn't acceptable upstream. Hence several parts which have never
    previously been supported (such as SerDes PHY and USXGMII PCS) are going to
    be implemented by separate drivers. As a result the device tree will
    look much more sane.
    
    Quickly align the bindings with the upcoming reality of the drivers
    actually adding support for the remaining Ethernet-related features of
    the MT7988 SoC.
    
    Fixes: c94a9aa ("dt-bindings: net: mediatek,net: add mt7988-eth binding")
    Signed-off-by: Daniel Golle <[email protected]>
    Acked-by: Krzysztof Kozlowski <[email protected]>
    Link: https://lore.kernel.org/r/1569290b21cc787a424469ed74456a7e976b102d.1715084326.git.daniel@makrotopia.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    dangowrt authored and kuba-moo committed May 9, 2024
    Commit cc349b0
  4. net: dst_cache: annotate data-races around dst_cache->reset_ts

    dst_cache->reset_ts is read or written locklessly,
    add READ_ONCE() and WRITE_ONCE() annotations.
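
A userspace sketch of what such annotations look like follows. The two macros are simplified stand-ins for the kernel ones (they rely on the GCC/Clang `__typeof__` extension), and `struct demo_cache` with its helpers is hypothetical; only the `reset_ts` field name comes from the commit message.

```c
/* Simplified stand-ins for the kernel macros: each forces exactly one
 * volatile access. The kernel versions do more than this. */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical cut-down cache holding only the field discussed above. */
struct demo_cache {
	unsigned long reset_ts;   /* time of the last reset, accessed locklessly */
};

/* Writer side: record the reset time with a single store. */
void demo_cache_reset(struct demo_cache *c, unsigned long now)
{
	WRITE_ONCE(c->reset_ts, now);
}

/* Reader side: an entry is stale if it was refreshed no later than the
 * last reset. The signed difference keeps the test wrap-safe. */
int demo_entry_is_stale(struct demo_cache *c, unsigned long refresh_ts)
{
	unsigned long reset = READ_ONCE(c->reset_ts); /* one load, never re-read */

	return (long)(refresh_ts - reset) <= 0;
}
```

The annotations do not add locking; they document the lockless access and keep the compiler from tearing or repeating the load or store.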
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 9, 2024
    Commit 3b09b2b
  5. net: dst_cache: minor optimization in dst_cache_set_ip6()

    There is no need to use this_cpu_ptr(dst_cache->cache) twice.
    
    Compiler is unable to optimize the second call, because of
    per-cpu constraints.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 9, 2024
    Commit e2d09e5
  6. ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()

    syzbot is able to trigger the following crash [1],
    caused by unsafe ip6_dst_idev() use.
    
    Indeed ip6_dst_idev() can return NULL, and must always be checked.
    
    [1]
    
    Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
    KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
    CPU: 0 PID: 31648 Comm: syz-executor.0 Not tainted 6.9.0-rc4-next-20240417-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
     RIP: 0010:__fib6_rule_action net/ipv6/fib6_rules.c:237 [inline]
     RIP: 0010:fib6_rule_action+0x241/0x7b0 net/ipv6/fib6_rules.c:267
    Code: 02 00 00 49 8d 9f d8 00 00 00 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 f9 32 bf f7 48 8b 1b 48 89 d8 48 c1 e8 03 <42> 80 3c 20 00 74 08 48 89 df e8 e0 32 bf f7 4c 8b 03 48 89 ef 4c
    RSP: 0018:ffffc9000fc1f2f0 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1a772f98c8186700
    RDX: 0000000000000003 RSI: ffffffff8bcac4e0 RDI: ffffffff8c1f9760
    RBP: ffff8880673fb980 R08: ffffffff8fac15ef R09: 1ffffffff1f582bd
    R10: dffffc0000000000 R11: fffffbfff1f582be R12: dffffc0000000000
    R13: 0000000000000080 R14: ffff888076509000 R15: ffff88807a029a00
    FS:  00007f55e82ca6c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000001b31d23000 CR3: 0000000022b66000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      fib_rules_lookup+0x62c/0xdb0 net/core/fib_rules.c:317
      fib6_rule_lookup+0x1fd/0x790 net/ipv6/fib6_rules.c:108
      ip6_route_output_flags_noref net/ipv6/route.c:2637 [inline]
      ip6_route_output_flags+0x38e/0x610 net/ipv6/route.c:2649
      ip6_route_output include/net/ip6_route.h:93 [inline]
      ip6_dst_lookup_tail+0x189/0x11a0 net/ipv6/ip6_output.c:1120
      ip6_dst_lookup_flow+0xb9/0x180 net/ipv6/ip6_output.c:1250
      sctp_v6_get_dst+0x792/0x1e20 net/sctp/ipv6.c:326
      sctp_transport_route+0x12c/0x2e0 net/sctp/transport.c:455
      sctp_assoc_add_peer+0x614/0x15c0 net/sctp/associola.c:662
      sctp_connect_new_asoc+0x31d/0x6c0 net/sctp/socket.c:1099
      __sctp_connect+0x66d/0xe30 net/sctp/socket.c:1197
      sctp_connect net/sctp/socket.c:4819 [inline]
      sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
      __sys_connect_file net/socket.c:2048 [inline]
      __sys_connect+0x2df/0x310 net/socket.c:2065
      __do_sys_connect net/socket.c:2075 [inline]
      __se_sys_connect net/socket.c:2072 [inline]
      __x64_sys_connect+0x7a/0x90 net/socket.c:2072
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Fixes: 5e5f3f0 ("[IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 9, 2024
    Commit d101291
  7. net: annotate data-races around dev->if_port

    Various ndo_set_config() methods can change dev->if_port.
    
    dev->if_port is going to be read locklessly from
    rtnl_fill_link_ifmap().
    
    Add corresponding WRITE_ONCE() on writer sides.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 9, 2024
    Commit 8d8b1a4
  8. phonet: no longer hold RTNL in route_dumpit()

    route_dumpit() already relies on RCU, RTNL is not needed.
    
    Also change return value at the end of a dump.
    This allows NLMSG_DONE to be appended to the current
    skb at the end of a dump, saving a couple of recvmsg()
    system calls.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Remi Denis-Courmont <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 9, 2024
    Commit 58a4ff5
  9. hsr: Simplify code for announcing HSR nodes timer setup

    Up till now the code to start HSR announce timer, which triggers sending
    supervisory frames, was assuming that hsr_netdev_notify() would be called
    at least twice for hsrX interface. This was required to have different
    values for old and current values of network device's operstate.
    
    This is problematic for a case where hsrX interface is already in the
    operational state when hsr_netdev_notify() is called, so timer is not
    configured to trigger and as a result the hsrX is not sending supervisory
    frames to HSR ring.
    
    This error has been discovered when hsr_ping.sh script was run. To be
    more specific - for the hsr1 and hsr2 the hsr_netdev_notify() was
    called at least twice with different IF_OPER_{LOWERDOWN|DOWN|UP} states
    assigned in hsr_check_carrier_and_operstate(hsr). As a result there was
    no issue with sending supervisory frames.
    However, with hsr3, the notify function was called only once with
    operstate set to IF_OPER_UP and timer responsible for triggering
    supervisory frames was not fired.
    
    The solution is to use netif_oper_up() and netif_running() helper
    functions to assess if network hsrX device is up.
    Only then, when the timer is not already pending, it is started.
    Otherwise it is deactivated.
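
The decision described above can be modelled in a few lines of C. This is a sketch under stated assumptions, not the driver code: `struct demo_hsr` and `demo_update_announce_timer()` are hypothetical, and the three booleans stand in for netif_running(), netif_oper_up() and the pending state of the real timer.

```c
#include <stdbool.h>

/* Hypothetical model of the device and its announce timer. */
struct demo_hsr {
	bool running;        /* stands in for netif_running() */
	bool oper_up;        /* stands in for netif_oper_up() */
	bool timer_pending;  /* stands in for a pending announce timer */
};

/* Derive the timer state from the current device state only, so a single
 * notification with the device already up is enough to start the timer. */
void demo_update_announce_timer(struct demo_hsr *hsr)
{
	if (hsr->running && hsr->oper_up) {
		if (!hsr->timer_pending)
			hsr->timer_pending = true;   /* start the timer */
	} else {
		hsr->timer_pending = false;          /* deactivate it */
	}
}
```

Because nothing here compares an old operstate with a new one, the hsr3 case from the description (one notification, already up) starts the timer as well.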
    
    Fixes: f421436 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
    Signed-off-by: Lukasz Majewski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Lukasz Majewski authored and kuba-moo committed May 9, 2024
    Commit 4893b8b
  10. ipv6: prevent NULL dereference in ip6_output()

    According to syzbot, there is a chance that ip6_dst_idev()
    returns NULL in ip6_output(). Most places in IPv6 stack
    deal with a NULL idev just fine, but not here.
    
    syzbot reported:
    
    general protection fault, probably for non-canonical address 0xdffffc00000000bc: 0000 [#1] PREEMPT SMP KASAN PTI
    KASAN: null-ptr-deref in range [0x00000000000005e0-0x00000000000005e7]
    CPU: 0 PID: 9775 Comm: syz-executor.4 Not tainted 6.9.0-rc5-syzkaller-00157-g6a30653b604a #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
     RIP: 0010:ip6_output+0x231/0x3f0 net/ipv6/ip6_output.c:237
    Code: 3c 1e 00 49 89 df 74 08 4c 89 ef e8 19 58 db f7 48 8b 44 24 20 49 89 45 00 49 89 c5 48 8d 9d e0 05 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 4c 8b 74 24 28 0f 85 61 01 00 00 8b 1b 31 ff
    RSP: 0018:ffffc9000927f0d8 EFLAGS: 00010202
    RAX: 00000000000000bc RBX: 00000000000005e0 RCX: 0000000000040000
    RDX: ffffc900131f9000 RSI: 0000000000004f47 RDI: 0000000000004f48
    RBP: 0000000000000000 R08: ffffffff8a1f0b9a R09: 1ffffffff1f51fad
    R10: dffffc0000000000 R11: fffffbfff1f51fae R12: ffff8880293ec8c0
    R13: ffff88805d7fc000 R14: 1ffff1100527d91a R15: dffffc0000000000
    FS:  00007f135c6856c0(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000080 CR3: 0000000064096000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      NF_HOOK include/linux/netfilter.h:314 [inline]
      ip6_xmit+0xefe/0x17f0 net/ipv6/ip6_output.c:358
      sctp_v6_xmit+0x9f2/0x13f0 net/sctp/ipv6.c:248
      sctp_packet_transmit+0x26ad/0x2ca0 net/sctp/output.c:653
      sctp_packet_singleton+0x22c/0x320 net/sctp/outqueue.c:783
      sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline]
      sctp_outq_flush+0x6d5/0x3e20 net/sctp/outqueue.c:1212
      sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
      sctp_do_sm+0x59cc/0x60c0 net/sctp/sm_sideeffect.c:1169
      sctp_primitive_ASSOCIATE+0x95/0xc0 net/sctp/primitive.c:73
      __sctp_connect+0x9cd/0xe30 net/sctp/socket.c:1234
      sctp_connect net/sctp/socket.c:4819 [inline]
      sctp_inet_connect+0x149/0x1f0 net/sctp/socket.c:4834
      __sys_connect_file net/socket.c:2048 [inline]
      __sys_connect+0x2df/0x310 net/socket.c:2065
      __do_sys_connect net/socket.c:2075 [inline]
      __se_sys_connect net/socket.c:2072 [inline]
      __x64_sys_connect+0x7a/0x90 net/socket.c:2072
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Fixes: 778d80b ("ipv6: Add disable_ipv6 sysctl to disable IPv6 operaion on specific interface.")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Larysa Zaremba <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 9, 2024
    Commit 4db783d
  11. selftests: drv-net: add checksum tests

    Run tools/testing/selftests/net/csum.c as part of drv-net.
    This binary covers multiple scenarios, based on arguments given,
    for both IPv4 and IPv6:
    
    - Accept UDP correct checksum
    - Detect UDP invalid checksum
    - Accept TCP correct checksum
    - Detect TCP invalid checksum
    
    - Transmit UDP: basic checksum offload
    - Transmit UDP: zero checksum conversion
    
    The test direction is reversed between receive and transmit tests, so
    that the NIC under test is always the local machine.
    
    In total this adds up to 12 testcases, with more to follow. For
    conciseness, I replaced individual functions with a function factory.
    
    Also detect hardware offload feature availability using Ethtool
    netlink and skip tests when either feature is off. This need may be
    common for offload feature tests and eventually deserving of a thin
    wrapper in lib.py.
    
    Missing are the PF_PACKET based send tests ('-P'). These use
    virtio_net_hdr to program hardware checksum offload, which requires
    looking up the local MAC address and (harder) the MAC of the next hop.
    I'll have to give it some thought how to do that robustly and where
    that code would belong.
    
    Tested:
    
            make -C tools/testing/selftests/ \
                    TARGETS="drivers/net drivers/net/hw" \
                    install INSTALL_PATH=/tmp/ksft
            cd /tmp/ksft
    
    	sudo NETIF=ens4 REMOTE_TYPE=ssh \
    		REMOTE_ARGS="[email protected]" \
    		LOCAL_V4="10.40.0.1" \
    		REMOTE_V4="10.40.0.2" \
    		./run_kselftest.sh -t drivers/net/hw:csum.py
    
    Signed-off-by: Willem de Bruijn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    wdebruij authored and kuba-moo committed May 9, 2024
    Commit 1d0dc85
  12. netdevsim: add NAPI support

    Add NAPI support to netdevsim, similar to veth.
    
    * Add a nsim_rq rx queue structure to hold a NAPI instance and a skb
      queue.
    * During xmit, store the skb in the peer skb queue and schedule NAPI.
    * During napi_poll(), drain the skb queue and pass up the stack.
    * Add assoc between rxq and NAPI instance using netif_queue_set_napi().
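
The xmit and poll steps listed above can be sketched as a small userspace model. Everything here is hypothetical (`struct demo_rq`, `demo_xmit()`, `demo_poll()`); packets are plain integers, and the "scheduled" flag is a simplified stand-in for NAPI scheduling and completion.

```c
#include <stddef.h>

#define DEMO_QLEN 8   /* power of two, so the index arithmetic wraps cleanly */

/* Hypothetical rx queue: a small ring of packet ids plus a scheduled flag. */
struct demo_rq {
	int pkts[DEMO_QLEN];
	size_t head, tail;     /* head: next to poll, tail: next free slot */
	int napi_scheduled;
};

/* Transmit side: store the packet in the peer's queue and schedule polling.
 * Returns 0 on success, -1 if the queue is full and the packet is dropped. */
int demo_xmit(struct demo_rq *peer, int pkt)
{
	if (peer->tail - peer->head == DEMO_QLEN)
		return -1;
	peer->pkts[peer->tail++ % DEMO_QLEN] = pkt;
	peer->napi_scheduled = 1;
	return 0;
}

/* Poll side: hand up to `budget` packets to the caller and stop polling
 * once the queue is drained. Returns the number of packets delivered. */
int demo_poll(struct demo_rq *rq, int budget, int *delivered)
{
	int done = 0;

	while (done < budget && rq->head != rq->tail)
		delivered[done++] = rq->pkts[rq->head++ % DEMO_QLEN];
	if (rq->head == rq->tail)
		rq->napi_scheduled = 0;
	return done;
}
```

A poll that uses up its whole budget leaves the flag set, so the remaining packets are picked up by the next poll.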
    
    Signed-off-by: David Wei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    spikeh authored and kuba-moo committed May 9, 2024
    Commit 3762ec0
  13. net: selftest: add test for netdev netlink queue-get API

    Add a selftest for netdev generic netlink. For now there is only a
    single test that exercises the `queue-get` API.
    
    The test works with netdevsim by default or with a real device by
    setting NETIF.
    
    Add a timeout param to cmd() since ethtool -L can take a long time on
    real devices.
    
    Signed-off-by: David Wei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    spikeh authored and kuba-moo committed May 9, 2024
    Commit 1cf2704
  14. Merge branch 'netdevsim-add-napi-support'

    David Wei says:
    
    ====================
    netdevsim: add NAPI support
    
    Add NAPI support to netdevsim and register its Rx queues with NAPI
    instances. Then add a selftest using the new netdev Python selftest
    infra to exercise the existing Netdev Netlink API, specifically the
    queue-get API.
    
    This expands test coverage and further fleshes out netdevsim as a test
    device. It's still my goal to make it useful for testing things like
    flow steering and ZC Rx.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 9, 2024
    Commit d9308f5
  15. Merge tag 'wireless-next-2024-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
    
    Kalle Valo says:
    
    ====================
    wireless-next patches for v6.10
    
    The third, and most likely the last, "new features" pull request for
    v6.10 with changes both in stack and in drivers. In ath12k and rtw89
    we disabled Wireless Extensions just like with iwlwifi earlier. Wi-Fi
    7 devices will not support Wireless Extensions (WEXT) anymore so if
    someone is still using the legacy WEXT interface it's time to switch
    to nl80211 now!
    
    We merged wireless into wireless-next as we decided not to send a
    wireless pull request to v6.9 this late in the cycle. Also an
    immutable branch with MHI subsystem was merged to get ath11k and
    ath12k hibernation working.
    
    Major changes:
    
    mac80211/cfg80211
     * handle color change per link
    
    mt76
     * mt7921 LED control
     * mt7925 EHT radiotap support
     * mt7920e PCI support
    
    ath12k
     * debugfs support
     * dfs_simulate_radar debugfs file
     * disable Wireless Extensions
     * suspend and hibernation support
     * ACPI support
     * refactoring in preparation of multi-link support
    
    ath11k
     * support hibernation (required changes in qrtr and MHI subsystems)
     * ieee80211-freq-limit Device Tree property support
    
    ath10k
     * firmware-name Device Tree property support
    
    rtw89
     * complete features of new WiFi 7 chip 8922AE including BT-coexistence
       and WoWLAN
     * use BIOS ACPI settings to set TX power and channels
     * disable Wireless Extensions on Wi-Fi 7 devices
    
    iwlwifi
     * block_esr debugfs file
     * support again firmware API 90 (was reverted earlier)
     * provide channel survey information for Automatic Channel Selection (ACS)
    
    * tag 'wireless-next-2024-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (214 commits)
      wifi: mwl8k: initialize cmd->addr[] properly
      wifi: iwlwifi: Ensure prph_mac dump includes all addresses
      wifi: iwlwifi: mvm: don't request statistics in restart
      wifi: iwlwifi: mvm: exit EMLSR if secondary link is not used
      wifi: iwlwifi: mvm: add beacon template version 14
      wifi: iwlwifi: mvm: align UATS naming with firmware
      wifi: iwlwifi: Force SCU_ACTIVE for specific platforms
      wifi: iwlwifi: mvm: record and return channel survey information
      wifi: iwlwifi: mvm: add the firmware API for channel survey
      wifi: iwlwifi: mvm: Fix race in scan completion
      wifi: iwlwifi: mvm: Add a print for invalid link pair due to bandwidth
      wifi: iwlwifi: mvm: add a debugfs for reading EMLSR blocking reasons
      wifi: iwlwifi: mvm: Add active EMLSR blocking reasons prints
      wifi: iwlwifi: bump FW API to 90 for BZ/SC devices
      wifi: iwlwifi: mvm: fix primary link setting
      wifi: iwlwifi: mvm: use already determined cmd_id
      wifi: iwlwifi: mvm: don't reset link selection during restart
      wifi: iwlwifi: Print EMLSR states name
      wifi: iwlwifi: mvm: Block EMLSR when a p2p/softAP vif is active
      wifi: iwlwifi: mvm: fix typo in debug print
      ...
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 9, 2024
    Commit 83127ec
  16. net/smc: fix neighbour and rtable leak in smc_ib_find_route()

    In smc_ib_find_route(), the neighbour found by neigh_lookup() and the
    rtable resolved by ip_route_output_flow() are not released or put before
    return. This may cause a refcount leak, so fix it.
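
The usual shape of such a fix is to drop every reference on every path out of the function. A minimal sketch with a hypothetical counter in place of the real neighbour and route objects (none of these names are from the SMC code):

```c
/* Hypothetical reference-counted object. */
struct demo_ref {
	int refcnt;
};

void demo_get(struct demo_ref *r) { r->refcnt++; }
void demo_put(struct demo_ref *r) { r->refcnt--; }

/* Take a route reference and then a neighbour reference, and drop both
 * before returning. `neigh_ok` simulates the neighbour lookup result.
 * Returns 0 on success, -1 on failure. */
int demo_find_route(struct demo_ref *rt, struct demo_ref *neigh, int neigh_ok)
{
	int rc = -1;

	demo_get(rt);                 /* route lookup succeeded */
	if (!neigh_ok)
		goto out_rt;          /* no neighbour: still release the route */
	demo_get(neigh);              /* neighbour lookup succeeded */

	rc = 0;                       /* copy out what the caller needs here */

	demo_put(neigh);
out_rt:
	demo_put(rt);
	return rc;
}
```

After either outcome both counters are back at their starting value, which is the property the leak violated.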
    
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: e5c4744 ("net/smc: add SMC-Rv2 connection establishment")
    Signed-off-by: Wen Gu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Wen Gu authored and Paolo Abeni committed May 9, 2024
    Commit 2ddc0dd
  17. net: hns3: using user configure after hardware reset

    When a reset occurs, the user's configuration is supposed to be recovered.
    Currently, the port info (speed, duplex and autoneg) is stored in hclge_mac
    and is updated periodically by a scheduled task. Consider the case where
    resets happen consecutively. During the first reset, the port info is
    configured with a temporary value because the PHY is reset and is looking
    for the best link config. The second reset then starts and uses the
    previous configuration, which is not the user's.
    The specific process is as follows:
    
    +------+               +----+                +----+
    | USER |               | PF |                | HW |
    +---+--+               +-+--+                +-+--+
        |  ethtool --reset   |                     |
        +------------------->|    reset command    |
        |  ethtool --reset   +-------------------->|
        +------------------->|                     +---+
        |                    +---+                 |   |
        |                    |   |reset currently  |   | HW RESET
        |                    |   |and wait to do   |   |
        |                    |<--+                 |   |
        |                    | send previous cfg   |<--+
        |                    | (1000M FULL AN_ON)  |
        |                    +-------------------->|
        |                    | read cfg(time task) |
        |                    | (10M HALF AN_OFF)   +---+
        |                    |<--------------------+   | cfg take effect
        |                    |    reset command    |<--+
        |                    +-------------------->|
        |                    |                     +---+
        |                    | send previous cfg   |   | HW RESET
        |                    | (10M HALF AN_OFF)   |<--+
        |                    +-------------------->|
        |                    | read cfg(time task) |
        |                    |  (10M HALF AN_OFF)  +---+
        |                    |<--------------------+   | cfg take effect
        |                    |                     |   |
        |                    | read cfg(time task) |<--+
        |                    |  (10M HALF AN_OFF)  |
        |                    |<--------------------+
        |                    |                     |
        v                    v                     v
    
    To avoid the above situation, this patch introduces req_speed, req_duplex
    and req_autoneg to store the user's configuration. They are only used
    after a hardware reset, to recover the user's configuration.
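
The idea of keeping the requested values apart from the currently reported ones can be sketched as follows. Only the `req_speed`, `req_duplex` and `req_autoneg` names come from the commit message; `struct demo_mac` and the three helpers are hypothetical.

```c
/* Hypothetical, reduced MAC state. */
struct demo_mac {
	int speed, duplex, autoneg;             /* what the hardware reports now */
	int req_speed, req_duplex, req_autoneg; /* what the user asked for */
};

/* User configuration: update both the current and the requested values. */
void demo_user_set(struct demo_mac *m, int speed, int duplex, int autoneg)
{
	m->speed = m->req_speed = speed;
	m->duplex = m->req_duplex = duplex;
	m->autoneg = m->req_autoneg = autoneg;
}

/* Periodic task: refresh only the current values from the hardware. */
void demo_poll_hw(struct demo_mac *m, int speed, int duplex, int autoneg)
{
	m->speed = speed;
	m->duplex = duplex;
	m->autoneg = autoneg;
}

/* After a reset: restore from the requested values, never from the
 * possibly transient current ones. */
void demo_restore_after_reset(struct demo_mac *m)
{
	m->speed = m->req_speed;
	m->duplex = m->req_duplex;
	m->autoneg = m->req_autoneg;
}
```

With this split, a temporary 10M half-duplex reading picked up between two resets no longer overwrites the 1000M full-duplex setting that the user requested.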
    
    Fixes: f5f2b3e ("net: hns3: add support for imp-controlled PHYs")
    Signed-off-by: Peiyang Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Peiyang Wang authored and Paolo Abeni committed May 9, 2024
    Commit 05eb60e
  18. net: hns3: direct return when receive a unknown mailbox message

    Currently, the driver doesn't return when it receives an unknown
    mailbox message, and continues checking whether it needs to
    generate a response. This is unnecessary and may be incorrect.
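
A sketch of the control flow being described, with hypothetical message codes and a hypothetical handler rather than the hns3 mailbox code:

```c
/* Hypothetical message codes. */
enum demo_mbx_code { DEMO_MBX_RESET = 1, DEMO_MBX_SET_MTU = 2 };

/* Hypothetical handler. Returns 1 if a response should be generated,
 * 0 if no response is sent. */
int demo_handle_mbx(int code, int need_resp)
{
	switch (code) {
	case DEMO_MBX_RESET:
	case DEMO_MBX_SET_MTU:
		break;          /* known message: continue to the response logic */
	default:
		return 0;       /* unknown message: return immediately */
	}

	return need_resp ? 1 : 0;
}
```

Returning from the `default` branch means the response logic only ever runs for messages the handler actually understood.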
    
    Fixes: bb5790b ("net: hns3: refactor mailbox response scheme between PF and VF")
    Signed-off-by: Jian Shen <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    IronShen authored and Paolo Abeni committed May 9, 2024
    Configuration menu
    Copy the full SHA
    669554c View commit details
    Browse the repository at this point in the history
  19. net: hns3: change type of numa_node_mask as nodemask_t

    The kernel provides nodemask_t to describe the NUMA node mask. To
    improve portability, change the type of numa_node_mask to nodemask_t.
    
    Fixes: 38caee9 ("net: hns3: Add support of the HNAE3 framework")
    Signed-off-by: Peiyang Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Peiyang Wang authored and Paolo Abeni committed May 9, 2024
    Configuration menu
    Copy the full SHA
    6639a7b View commit details
    Browse the repository at this point in the history
  20. net: hns3: release PTP resources if pf initialization failed

    During the PF initialization process, hclge_update_port_info may return an
    error code for some reason. At this point, the ptp initialization has been
    completed. To avoid memory leaks, the resources allocated by ptp
    should be released. Therefore, when hclge_update_port_info returns an error
    code, hclge_ptp_uninit is called to release the corresponding resources.
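
This is the common goto-based unwind pattern. A minimal sketch with hypothetical names (`struct demo_dev`, `demo_ptp_init()`, `demo_ptp_uninit()`, `demo_init()`), where a flag stands in for the allocated PTP resources:

```c
/* Hypothetical init state: tracks whether the PTP resources are held. */
struct demo_dev {
	int ptp_ready;
};

int demo_ptp_init(struct demo_dev *d)    { d->ptp_ready = 1; return 0; }
void demo_ptp_uninit(struct demo_dev *d) { d->ptp_ready = 0; }

/* `port_info_err` simulates the result of the later port-info step. */
int demo_init(struct demo_dev *d, int port_info_err)
{
	int ret;

	ret = demo_ptp_init(d);
	if (ret)
		return ret;

	ret = port_info_err;            /* later step that may fail */
	if (ret)
		goto err_ptp_uninit;

	return 0;

err_ptp_uninit:
	demo_ptp_uninit(d);             /* undo the earlier step before failing */
	return ret;
}
```

Each step that can fail after PTP setup jumps to the same label, so the release happens in one place.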
    
    Fixes: eaf83ae ("net: hns3: add querying fec ability from firmware")
    Signed-off-by: Peiyang Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Peiyang Wang authored and Paolo Abeni committed May 9, 2024
    Configuration menu
    Copy the full SHA
    950aa42 View commit details
    Browse the repository at this point in the history
  21. net: hns3: use appropriate barrier function after setting a bit value

    A memory barrier is needed in the following case. When the port is set
    down, hclgevf_set_timer_task() sets DOWN in state. Meanwhile, the service
    task behaves differently depending on whether the state is DOWN. Thus, to
    make sure the service task sees DOWN, use smp_mb__after_atomic() after
    calling set_bit().
    
              CPU0                        CPU1
    ========================== ===================================
    hclgevf_set_timer_task()    hclgevf_periodic_service_task()
      set_bit(DOWN,state)         test_bit(DOWN,state)
    
    pf also has this issue.
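
A loose userspace analogy using C11 atomics is shown below. It is an assumption-laden sketch, not kernel code: a relaxed `atomic_fetch_or_explicit()` plays the role of set_bit(), a sequentially consistent fence plays the role of smp_mb__after_atomic(), and `struct demo_state` with its helpers is hypothetical.

```c
#include <stdatomic.h>

#define DEMO_STATE_DOWN (1u << 0)

/* Hypothetical state word shared between two threads. */
struct demo_state {
	atomic_uint bits;
};

/* Writer: set DOWN with a relaxed read-modify-write, then issue a full
 * fence so the bit is ordered before whatever this thread does next. */
void demo_set_down(struct demo_state *s)
{
	atomic_fetch_or_explicit(&s->bits, DEMO_STATE_DOWN, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/* Reader: the periodic task checks the bit before doing any work. */
int demo_is_down(struct demo_state *s)
{
	return (atomic_load_explicit(&s->bits, memory_order_relaxed) &
		DEMO_STATE_DOWN) != 0;
}
```

The fence only matters when the writer goes on to do further memory accesses that the other thread could observe; on its own, the bit update is already atomic.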
    
    Fixes: ff20009 ("net: hns3: remove unnecessary work in hclgevf_main")
    Fixes: 1c6dfe6 ("net: hns3: remove mailbox and reset work in hclge_main")
    Signed-off-by: Peiyang Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Peiyang Wang authored and Paolo Abeni committed May 9, 2024
    Configuration menu
    Copy the full SHA
    094c281 View commit details
    Browse the repository at this point in the history
  22. net: hns3: fix port vlan filter not disabled issue

    Due to a hardware limitation, a device that supports modifying the
    VLAN filter state but does not support bypassing the port VLAN filter
    should always have the port VLAN filter disabled. However, the driver
    enables the port VLAN filter during initialization and, if no VLAN id
    (except VLAN 0) has been added, disables it in the service task. Most
    of the time this works fine. But there is a time window between the
    net device being registered and the service task being scheduled. If
    the user adds a VLAN during this window, the driver will not update
    the VLAN filter state, and the port VLAN filter remains enabled.
    
    To fix the problem, if the device supports modifying the VLAN filter
    state but does not support bypassing the port VLAN filter, set the
    port VLAN filter to "off".
    
    Fixes: 184cd22 ("net: hns3: disable port VLAN filter when support function level VLAN filter control")
    Fixes: 2ba3066 ("net: hns3: add support for modify VLAN filter state")
    Signed-off-by: Yonglong Liu <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    liuyonglong86 authored and Paolo Abeni committed May 9, 2024
    f5db7a3
  23. net: hns3: fix kernel crash when devlink reload during initialization

    The devlink reload process accesses hardware resources, but devlink
    is registered before the hardware is initialized. So processing a
    devlink reload during initialization may lead to a kernel crash.
    
    This patch fixes this by registering the devlink after
    hardware initialization.
    
    Fixes: cd62429 ("net: hns3: add support for registering devlink for VF")
    Fixes: 93305b7 ("net: hns3: fix kernel crash when devlink reload during pf initialization")
    Signed-off-by: Yonglong Liu <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    liuyonglong86 authored and Paolo Abeni committed May 9, 2024
    35d92ab
  24. Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'

    Jijie Shao says:
    
    ====================
    There are some bugfixes for the HNS3 ethernet driver
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Paolo Abeni committed May 9, 2024
    393ceeb
  25. net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family

    As of commit de5c9bf ("net: phylink: require supported_interfaces to
    be filled"), Marvell 88e6320/21 switches fail to be probed:
    
    ...
    mv88e6085 30be0000.ethernet-1:00: phylink: error: empty supported_interfaces
    error creating PHYLINK: -22
    ...
    
    The problem stems from the use of mv88e6185_phylink_get_caps() to get
    the device capabilities.
    Since ports 0/1 on these switches are serdes-only ports, create a new
    dedicated phylink_get_caps for the 6320 and 6321 to properly support
    their set of capabilities.
    
    Fixes: de5c9bf ("net: phylink: require supported_interfaces to be filled")
    Signed-off-by: Steffen Bätz <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Reviewed-by: Fabio Estevam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Steffen Bätz authored and Paolo Abeni committed May 9, 2024
    f39bf3c
  26. net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports

    On the mv88e6320 and 6321 switch family, ports 0/1 are serdes-only ports.
    Modify the mv88e6352_get_port4_serdes_cmode function to take a port
    number, since the register set of the 6352 is the same on the 6320/21.
    
    Signed-off-by: Steffen Bätz <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Reviewed-by: Fabio Estevam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Steffen Bätz authored and Paolo Abeni committed May 9, 2024
    6e7ffa1
  27. l2tp: Support several sockets with same IP/port quadruple

    Some l2tp providers will use 1701 as origin port and open several
    tunnels for the same origin and target. On the Linux side, this
    may mean opening several sockets, but then traffic will go to only
    one of them, losing the traffic for the tunnel of the other socket
    (or leaving it up to userland, consuming a lot of cpu%).
    
    This can also happen when the l2tp provider uses a cluster, and
    load-balancing happens to migrate from one origin IP to another one,
    for which a socket was already established. Managing reassigning
    tunnels from one socket to another would be very hairy for userland.
    
    Lastly, as documented in l2tpconfig(1), as a client it may be necessary
    to use 1701 as origin port for odd firewall reasons, which could
    prevent establishing several tunnels to an l2tp server, for the
    same reason: traffic would arrive on only one of the two sockets.
    
    With the V2 protocol it is however easy to route traffic to the proper
    tunnel, by looking up the tunnel number in the network namespace. This
    fixes the three cases altogether.
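
To illustrate why the V2 protocol makes this easy, here is a minimal userspace sketch (not the kernel implementation) that extracts the tunnel ID from an L2TPv2 header as laid out in RFC 2661: a 16-bit flags/version word, an optional 16-bit length when the L bit is set, then the 16-bit tunnel ID. The function name l2tpv2_tunnel_id() and the macro names are hypothetical; the namespace-wide lookup itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L2TP_FLAG_L     0x4000  /* Length field present (RFC 2661) */
#define L2TP_VER_MASK   0x000F  /* low 4 bits carry the version */

/* Store the tunnel ID in *tid and return 0, or return -1 if the buffer
 * is too short or is not an L2TPv2 header. */
static int l2tpv2_tunnel_id(const uint8_t *buf, size_t len, uint16_t *tid)
{
    size_t off = 2;             /* skip the flags/version word */
    uint16_t flags;

    assert(buf && tid);
    if (len < 2)
        return -1;
    flags = (uint16_t)((buf[0] << 8) | buf[1]);
    if ((flags & L2TP_VER_MASK) != 2)
        return -1;
    if (flags & L2TP_FLAG_L)
        off += 2;               /* skip the optional Length field */
    if (len < off + 2)
        return -1;
    *tid = (uint16_t)((buf[off] << 8) | buf[off + 1]);
    return 0;
}
```

Because the tunnel ID travels in every packet, a receiver can pick the tunnel from the packet itself rather than from the socket the packet happened to arrive on.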
    
    Signed-off-by: Samuel Thibault <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    sthibaul authored and Paolo Abeni committed May 9, 2024
    628bc3e
  28. Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/g…

    …it/viro/vfs
    
    Pull dentry leak fix from Al Viro:
     "Dentry leak fix in the qibfs driver that I forgot to send a pull
      request for ;-/
    
      My apologies - it actually sat in vfs.git#fixes for more than two
      months..."
    
    * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
      qibfs: fix dentry leak
    torvalds committed May 9, 2024
    1bbc991
  29. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/rmk/linux
    
    Pull ARM fix from Russell King:
    
     - clear stale KASan stack poison when a CPU resumes
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
      ARM: 9381/1: kasan: clear stale stack poison
    torvalds committed May 9, 2024
    62788b0
  30. Merge tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/…

    …git/netdev/net
    
    Pull networking fixes from Paolo Abeni:
     "Including fixes from bluetooth and IPsec.
    
      The bridge patch is actually a follow-up to a recent fix in the same
      area. We have a pending v6.8 AF_UNIX regression; it should be solved
      soon, but not in time for this PR.
    
      Current release - regressions:
    
       - eth: ks8851: Queue RX packets in IRQ handler instead of disabling
         BHs
    
       - net: bridge: fix corrupted ethernet header on multicast-to-unicast
    
      Current release - new code bugs:
    
       - xfrm: fix possible bad pointer dereferencing in error path
    
      Previous releases - regressions:
    
       - core: fix out-of-bounds access in ops_init
    
       - ipv6:
          - fix potential uninit-value access in __ip6_make_skb()
          - fib6_rules: avoid possible NULL dereference in fib6_rule_action()
    
       - tcp: use refcount_inc_not_zero() in tcp_twsk_unique().
    
       - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation
    
       - rxrpc: fix congestion control algorithm
    
       - bluetooth:
          - l2cap: fix slab-use-after-free in l2cap_connect()
          - msft: fix slab-use-after-free in msft_do_close()
    
       - eth: hns3: fix kernel crash when devlink reload during
         initialization
    
       - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21
         family
    
      Previous releases - always broken:
    
       - xfrm: preserve vlan tags for transport mode software GRO
    
       - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets
    
       - eth: hns3: keep using user config after hardware reset"
    
    * tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
      net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports
      net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family
      net: hns3: fix kernel crash when devlink reload during initialization
      net: hns3: fix port vlan filter not disabled issue
      net: hns3: use appropriate barrier function after setting a bit value
      net: hns3: release PTP resources if pf initialization failed
      net: hns3: change type of numa_node_mask as nodemask_t
      net: hns3: direct return when receive a unknown mailbox message
      net: hns3: using user configure after hardware reset
      net/smc: fix neighbour and rtable leak in smc_ib_find_route()
      ipv6: prevent NULL dereference in ip6_output()
      hsr: Simplify code for announcing HSR nodes timer setup
      ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
      dt-bindings: net: mediatek: remove wrongly added clocks and SerDes
      rxrpc: Only transmit one ACK per jumbo packet received
      rxrpc: Fix congestion control algorithm
      selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC
      ipv6: Fix potential uninit-value access in __ip6_make_skb()
      net: phy: marvell-88q2xxx: add support for Rev B1 and B2
      appletalk: Improve handling of broadcast packets
      ...
    torvalds committed May 9, 2024
    8c3b756
  31. Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    Cross-merge networking fixes after downstream PR.
    
    No conflicts.
    
    Adjacent changes:
    
    drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
      35d92ab ("net: hns3: fix kernel crash when devlink reload during initialization")
      2a1a1a7 ("net: hns3: add command queue trace for hns3")
    
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 9, 2024
    e707383
  32. selftests/bpf: Remove bpf_tracing_net.h usages from two networking tests

    This patch removes the bpf_tracing_net.h usage from the networking tests
    fib_lookup and test_lwt_redirect. Instead of using the (copied) macros
    TC_ACT_SHOT and ETH_HLEN from bpf_tracing_net.h, they can directly
    use the ones defined in the network header files under linux/.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    c0338e6
  33. selftests/bpf: Add a few tcp helper functions and macros to bpf_traci…

    …ng_net.h
    
    This patch adds a few tcp related helper functions to bpf_tracing_net.h.
    They will be useful for both tcp-cc and network tracing related
    bpf progs. They already exist in bpf_tcp_helpers.h. This change
    is needed to retire the bpf_tcp_helpers.h and consolidate all tests
    to vmlinux.h (i.e. bpf_tracing_net.h).
    
    Some of the helpers (tcp_sk and inet_csk) are also defined in
    bpf_cc_cubic.c and they are removed. While at it, remove
    the vmlinux.h from bpf_cc_cubic.c. bpf_tracing_net.h (which has
    vmlinux.h after this patch) is enough and will be consistent
    with the other tcp-cc tests in the later patches.
    
    The other TCP_* macro additions will be needed for the bpf_dctcp
    changes in the later patch.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    cbaec46
  34. selftests/bpf: Reuse the tcp_sk() from the bpf_tracing_net.h

    This patch removes the individual tcp_sk implementations from the
    tcp-cc tests. The tcp_sk() implementation from the bpf_tracing_net.h
    is reused instead.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    cc5b18c
  35. selftests/bpf: Sanitize the SEC and inline usages in the bpf-tcp-cc t…

    …ests
    
    The BPF_STRUCT_OPS usages need to be removed from the tcp-cc tests
    because the macro is defined in bpf_tcp_helpers.h, which is going to
    be retired.
    While at it, this patch consolidates all tcp-cc struct_ops programs to
    use the SEC("struct_ops") + BPF_PROG().
    
    It also removes the unnecessary __always_inline usages from the
    tcp-cc tests.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    7d3851a
  36. selftests/bpf: Rename tcp-cc private struct in bpf_cubic and bpf_dctcp

    The "struct bictcp" and "struct dctcp" are private to the bpf prog
    and they are stored in the private buffer in inet_csk(sk)->icsk_ca_priv.
    Hence, there is no bpf CO-RE required.
    
    The same struct names exist in vmlinux.h. To reuse vmlinux.h,
    they need to be renamed so that the bpf prog logic is
    immune to the kernel tcp-cc changes.
    
    This patch adds a "bpf_" prefix to them.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    b1d87ae
  37. selftests/bpf: Use bpf_tracing_net.h in bpf_cubic

    This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_cubic.
    This will allow retiring bpf_tcp_helpers.h and consolidating the
    tcp-cc tests to vmlinux.h.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    a824c9a
  38. selftests/bpf: Use bpf_tracing_net.h in bpf_dctcp

    This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_dctcp.
    This will allow retiring bpf_tcp_helpers.h and consolidating the
    tcp-cc tests to vmlinux.h.
    
    This duplicates the min/max macros already present in bpf_cubic. They
    could be refactored further in the future.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    6ad4e6e
  39. selftests/bpf: Remove bpf_tcp_helpers.h usages from other misc bpf tc…

    …p-cc tests
    
    This patch removes the final few bpf_tcp_helpers.h usages
    in some misc bpf tcp-cc tests and replaces them with
    bpf_tracing_net.h (i.e. vmlinux.h).
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    6eee55a
  40. selftests/bpf: Remove the bpf_tcp_helpers.h usages from other non tcp…

    …-cc tests
    
    The patch removes the remaining bpf_tcp_helpers.h usages in the
    non tcp-cc networking tests. It either replaces the header with
    bpf_tracing_net.h or just removes it because the test is not actually
    using any kernel sockets. For the latter, the missing macro (mainly
    SOL_TCP) is defined locally.
    
    An exception is the test_sock_fields which is testing
    the "struct bpf_sock" type instead of the kernel sock type.
    Whenever "vmlinux.h" is used instead, it hits a verifier
    error on doing arithmetic on the sock_common pointer:
    
    ; return !a6[0] && !a6[1] && !a6[2] && a6[3] == bpf_htonl(1); @ test_sock_fields.c:54
    21: (61) r2 = *(u32 *)(r1 +28)        ; R1_w=sock_common() R2_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
    22: (56) if w2 != 0x0 goto pc-6       ; R2_w=0
    23: (b7) r3 = 28                      ; R3_w=28
    24: (bf) r2 = r1                      ; R1_w=sock_common() R2_w=sock_common()
    25: (0f) r2 += r3
    R2 pointer arithmetic on sock_common prohibited
    
    Hence, instead of including bpf_tracing_net.h, the test_sock_fields test
    defines a tcp_sock with one lsndtime field in it.
    
    Another highlight is, in sockopt_qos_to_cc.c, the tcp_cc_eq()
    is replaced by bpf_strncmp(). tcp_cc_eq() was a workaround
    in bpf_tcp_helpers.h before bpf_strncmp had been added.
    
    The SOL_IPV6 addition to bpf_tracing_net.h is needed by the
    test_tcpbpf_kern test.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    c075c9c
  41. selftests/bpf: Retire bpf_tcp_helpers.h

    The previous patches have consolidated the tests to use
    bpf_tracing_net.h (i.e. vmlinux.h) instead of bpf_tcp_helpers.h.
    
    This patch can finally retire the bpf_tcp_helpers.h from
    the repository.
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Martin KaFai Lau authored and Alexei Starovoitov committed May 9, 2024
    6a65081
  42. Merge branch 'selftests-bpf-retire-bpf_tcp_helpers-h'

    Martin KaFai Lau says:
    
    ====================
    selftests/bpf: Retire bpf_tcp_helpers.h
    
    From: Martin KaFai Lau <[email protected]>
    
    The earlier commit 8e6d9ae ("selftests/bpf: Use bpf_tracing.h instead of bpf_tcp_helpers.h")
    removed the bpf_tcp_helpers.h usages from the non networking tests.
    
    This patch set is a continuation of this effort to retire
    the bpf_tcp_helpers.h from the networking tests (mostly tcp-cc related).
    
    The main usage of bpf_tcp_helpers.h is the partial kernel
    socket definitions (e.g. sock, tcp_sock). New fields keep being added
    to those partial socket definitions even though everything is available
    in vmlinux.h. The recent bpf_cc_cubic.c test tried to extend
    bpf_tcp_helpers.h but eventually used vmlinux.h instead. To avoid
    this unnecessary detour for new tests and have one consistent way
    of using the kernel sockets, this patch set retires the bpf_tcp_helpers.h
    usages and consolidates the tests to use vmlinux.h instead.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Alexei Starovoitov committed May 9, 2024
    cbe35ad
  43. selftest: epoll_busy_poll: epoll busy poll tests

    Add a simple test for the epoll busy poll ioctls, using the kernel
    selftest harness.
    
    This test ensures that the ioctls have the expected return codes and
    that the kernel properly gets and sets epoll busy poll parameters.
    
    The test can be expanded in the future to do real busy polling (provided
    another machine to act as the client is available).
    
    Signed-off-by: Joe Damato <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    jdamato-fsly authored and kuba-moo committed May 9, 2024
    60e0f98
  44. selftests/bpf: Add post_socket_cb for network_helper_opts

    __start_server() sets SO_REUSEPORT through setsockopt() when the parameter
    'reuseport' is set. This patch makes it more flexible by adding a function
    pointer post_socket_cb into struct network_helper_opts. The
    'const struct post_socket_opts *cb_opts' argument of post_socket_cb is
    reserved for future extension.
    
    The 'reuseport' parameter can be dropped.
    Now the original start_reuseport_server() can be implemented by setting a
    newly defined reuseport_cb() function pointer to the post_socket_cb field of
    struct network_helper_opts.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/470cb82f209f055fc7fb39c66c6b090b5b7ed2b2.1714907662.git.tanggeliang@kylinos.cn
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Geliang Tang authored and Martin KaFai Lau committed May 9, 2024
    20434d2
  45. selftests/bpf: Use start_server_addr in sockopt_inherit

    Include network_helpers.h in prog_tests/sockopt_inherit.c and use the
    public helper start_server_addr() instead of the locally defined
    function start_server(). This avoids duplicate code.
    
    Add a helper custom_cb() that sets the SOL_CUSTOM sockopts in a loop,
    assign it to the post_socket_cb pointer of struct network_helper_opts,
    and pass it to start_server_addr().
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/687af66f743a0bf15cdba372c5f71fe64863219e.1714907662.git.tanggeliang@kylinos.cn
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Geliang Tang authored and Martin KaFai Lau committed May 9, 2024
    5166b3e
  46. selftests/bpf: Use start_server_addr in test_tcp_check_syncookie

    Include network_helpers.h in test_tcp_check_syncookie_user.c and use
    the public helper start_server_addr() instead of the locally defined
    function start_server(). This avoids duplicate code.
    
    Add two helpers v6only_true() and v6only_false() to set the IPV6_V6ONLY
    sockopt to true or false, assign them to the post_socket_cb pointer of
    struct network_helper_opts, and pass it to start_server_setsockopt().
    
    In order to use functions defined in network_helpers.c, Makefile needs
    to be updated too.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/e0c5324f5da84f453f47543536e70f126eaa8678.1714907662.git.tanggeliang@kylinos.cn
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Geliang Tang authored and Martin KaFai Lau committed May 9, 2024
    49e1fa8
  47. selftests/bpf: Use connect_to_fd in sockopt_inherit

    This patch uses the public helper connect_to_fd() exported in
    network_helpers.h instead of the locally defined function
    connect_to_server() in prog_tests/sockopt_inherit.c. This avoids
    duplicate code.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/71db79127cc160b0643fd9a12c70ae019ae076a1.1714907662.git.tanggeliang@kylinos.cn
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Geliang Tang authored and Martin KaFai Lau committed May 9, 2024
    5059c73
  48. selftests/bpf: Use connect_to_fd in test_tcp_check_syncookie

    This patch uses the public helper connect_to_fd() exported in
    network_helpers.h instead of the locally defined function
    connect_to_server() in test_tcp_check_syncookie_user.c. This avoids
    duplicate code.
    
    The arguments "addr" and "len" of run_test() then become unused, so
    drop them too.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/e0ae6b790ac0abc7193aadfb2660c8c9eb0fe1f0.1714907662.git.tanggeliang@kylinos.cn
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Geliang Tang authored and Martin KaFai Lau committed May 9, 2024
    65a3f0d
  49. selftests/bpf: Drop get_port in test_tcp_check_syncookie

    The arguments "addr" and "len" of run_test() have been dropped. This
    leaves get_port() unused. Drop it from test_tcp_check_syncookie_user.c.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/a9b5c8064ab4cbf0f68886fe0e4706428b8d0d47.1714907662.git.tanggeliang@kylinos.cn
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Geliang Tang authored and Martin KaFai Lau committed May 9, 2024
    7abbf38
  50. Merge branch 'use network helpers, part 4'

    Geliang Tang says:
    
    ====================
    From: Geliang Tang <[email protected]>
    
    This patchset adds a post_socket_cb pointer to
    struct network_helper_opts to make the start_server_addr() helper
    more flexible. With these modifications, much duplicate code
    can be dropped.
    
    Patches 1-3 address Martin's comments in the previous series.
    ====================
    
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Martin KaFai Lau committed May 9, 2024
    0d03a4d
  51. kbuild,bpf: Switch to using --btf_features for pahole v1.26 and later

    The btf_features list can be used for pahole v1.26 and later.
    It is useful because pahole does not exit with a failure message
    when a requested feature is not yet implemented.  This will allow us
    to add feature requests to the pahole options without having to check
    pahole versions in future; if the version of pahole supports the
    feature it will be added.
    
    Signed-off-by: Alan Maguire <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Tested-by: Eduard Zingerman <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    alan-maguire authored and anakryiko committed May 9, 2024
    fcd1ed8

Commits on May 10, 2024

  1. net/sched: adjust device watchdog timer to detect stopped queue at ri…

    …ght time
    
    Applications are sensitive to long network latency, particularly
    heartbeat monitoring ones. The longer the tx timeout recovery takes,
    the higher the risk for such applications on production machines.
    This patch remedies that while still honoring the device-set tx timeout.
    
    Modify the next watchdog timeout to be shorter than the device-specified
    one. Compute the next timeout as the device watchdog timeout less the
    time elapsed since the queue was stopped. At the next watchdog timeout,
    the tx timeout handler is called if the queue is still stopped. Whether
    or not the handler is called, restore the watchdog timeout to the
    device-specified value.
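
The arithmetic can be sketched as a small helper. This is an illustrative model in abstract ticks, not the net/sched code; the function and parameter names are hypothetical.

```c
#include <assert.h>

/* Ticks to wait before the watchdog should fire next.
 *   now            current time in ticks
 *   stopped_at     time the queue was stopped, in ticks
 *   watchdog_timeo device-specified tx timeout, in ticks
 * Returns 0 when the timeout has already elapsed, meaning the tx timeout
 * handler is due now. */
static unsigned long next_watchdog_delay(unsigned long now,
                                         unsigned long stopped_at,
                                         unsigned long watchdog_timeo)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    unsigned long elapsed = now - stopped_at;

    assert(watchdog_timeo > 0);
    if (elapsed >= watchdog_timeo)
        return 0;
    return watchdog_timeo - elapsed;
}
```

For example, with a 500-tick timeout and a queue stopped at tick 1000, a watchdog run at tick 1300 would re-arm for 200 ticks rather than a full 500, so the stall is detected at tick 1500 instead of 1800.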
    
    Signed-off-by: Praveen Kumar Kannoju <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    pkannoju authored and kuba-moo committed May 10, 2024
    33fb988
  2. tcp: get rid of twsk_unique()

    DCCP is going away soon, and had no twsk_unique() method.
    
    We can directly call tcp_twsk_unique() for TCP sockets.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Kuniyuki Iwashima <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 10, 2024
    383eed2
  3. net: ipv6: fix wrong start position when receive hop-by-hop fragment

    In IPv6, ipv6_rcv_core will parse the hop-by-hop extension header and
    increase skb->transport_header by the length of one extension header.
    But if there are further extension headers, such as a fragment header,
    skb->transport_header then points to the second extension header, not
    to the transport layer header or the first extension header.
    
    As a result, the start and nexthdrp variables do not point to the same
    position in ipv6frag_thdr_truncated, and ipv6_skip_exthdr returns an
    incorrect offset and frag_off. Sometimes the length of the last
    fragment is smaller than the incorrectly calculated offset, resulting
    in packet loss. We can solve this by calculating the correct position
    as an offset from the network header.
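
As a hedged userspace illustration of computing positions from the network header (this is not the kernel fix), the sketch below walks the extension header chain starting right after the 40-byte fixed IPv6 header and returns the offset of the fragment header. The function name find_frag_hdr() is hypothetical, and only hop-by-hop, routing and destination options headers are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN        40
#define NEXTHDR_HOP         0
#define NEXTHDR_ROUTING     43
#define NEXTHDR_FRAGMENT    44
#define NEXTHDR_DEST        60

/* pkt points at the start of the IPv6 (network) header.
 * Returns the fragment header offset relative to pkt, or -1 if there is
 * no fragment header or the packet is truncated. */
static int find_frag_hdr(const uint8_t *pkt, size_t len)
{
    size_t off = IPV6_HDR_LEN;
    uint8_t nexthdr;

    assert(pkt);
    if (len < IPV6_HDR_LEN)
        return -1;
    nexthdr = pkt[6];           /* Next Header field of the fixed header */

    while (nexthdr == NEXTHDR_HOP || nexthdr == NEXTHDR_ROUTING ||
           nexthdr == NEXTHDR_DEST) {
        uint8_t hdrlen;

        if (len < off + 2)
            return -1;
        nexthdr = pkt[off];
        hdrlen = pkt[off + 1];
        /* Extension header length is in 8-octet units, minus the first. */
        off += ((size_t)hdrlen + 1) * 8;
    }
    if (nexthdr != NEXTHDR_FRAGMENT || len < off + 8)
        return -1;
    return (int)off;
}
```

Starting from the network header keeps the walk independent of how far an earlier parsing stage has already advanced its own transport offset.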
    
    Fixes: 9d9e937 ("ipv6/netfilter: Discard first fragment not including all headers")
    Signed-off-by: Gao Xingwang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    gaoxingwang authored and davem330 committed May 10, 2024
    1cd354f
  4. netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone

    The helper uses priv->clone unconditionally which will fail once we do
    the clone conditionally on first insert or removal.
    
    'nft get element' from userspace needs to use priv->match since it
    runs inside an RCU read-side critical section.
    
    Prepare for this by passing the match backend data as argument.
    
    Signed-off-by: Florian Westphal <[email protected]>
    Reviewed-by: Stefano Brivio <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Florian Westphal authored and ummakynes committed May 10, 2024
    Commit a238106
  5. netfilter: nft_set_pipapo: move cloning of match info to insert/removal path
    
    This set type keeps two copies of the sets' content,
       priv->match (live version, used to match from packet path)
       priv->clone (work-in-progress version of the 'future' priv->match).
    
    All additions and removals are done on priv->clone.  When transaction
    completes, priv->clone becomes priv->match and a new clone is allocated
    for use by next transaction.
    
    The problem is that the cloning requires GFP_KERNEL allocations but we
    cannot fail at either commit or abort time.
    
    This patch defers the clone until we get an insertion or removal
    request.  This allows us to handle OOM situations correctly.
    
    This also allows ->dirty to be removed in a followup change:
    
    If ->clone exists, ->dirty is always true
    If ->clone is NULL, ->dirty is always false, no elements were added
    or removed (except catchall elements which are external to the specific
    set backend).
    
    Signed-off-by: Florian Westphal <[email protected]>
    Reviewed-by: Stefano Brivio <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Florian Westphal authored and ummakynes committed May 10, 2024
    Commit 3f1d886
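The deferred clone described above can be sketched in user-space C. This is a simplified illustration under stated assumptions, not the nft_set_pipapo code: struct match_data, struct set_priv and the set_*() helpers are hypothetical, and malloc() stands in for a GFP_KERNEL allocation. The point is that allocation failure is reported at insert time, while commit and abort only swap or free pointers and so cannot fail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ELEMS 16

/* Hypothetical stand-in for the set's match data. */
struct match_data {
	int nelems;
	int elems[MAX_ELEMS];
};

struct set_priv {
	struct match_data *match; /* live copy, read by the packet path */
	struct match_data *clone; /* NULL until the first insert or removal */
};

/* Clone lazily; a failed allocation is reported to the caller. */
static struct match_data *maybe_clone(struct set_priv *priv)
{
	assert(priv->match != NULL);
	if (priv->clone)
		return priv->clone;
	priv->clone = malloc(sizeof(*priv->clone));
	if (!priv->clone)
		return NULL;
	memcpy(priv->clone, priv->match, sizeof(*priv->clone));
	return priv->clone;
}

static int set_insert(struct set_priv *priv, int elem)
{
	struct match_data *m = maybe_clone(priv);

	if (!m || m->nelems == MAX_ELEMS)
		return -1; /* out of memory or full: fails at insert time */
	m->elems[m->nelems++] = elem;
	return 0;
}

/* Commit cannot fail: it only swaps pointers and frees the old copy. */
static void set_commit(struct set_priv *priv)
{
	if (!priv->clone)
		return; /* no clone means nothing changed */
	free(priv->match);
	priv->match = priv->clone;
	priv->clone = NULL;
}

/* Abort discards the pending copy, again without allocating. */
static void set_abort(struct set_priv *priv)
{
	free(priv->clone);
	priv->clone = NULL;
}
```

In this sketch a non-NULL clone pointer carries the same information as a separate dirty flag, which is why such a flag becomes redundant.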
  6. netfilter: nft_set_pipapo: remove dirty flag

    After previous change:
     ->clone exists: ->dirty is always true
     ->clone == NULL ->dirty is always false
    
    So remove this flag.
    
    Signed-off-by: Florian Westphal <[email protected]>
    Reviewed-by: Stefano Brivio <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Florian Westphal authored and ummakynes committed May 10, 2024
    Commit 532aec7
  7. selftests: netfilter: add packetdrill based conntrack tests

    Add a new test script that uses the packetdrill tool to exercise the
    conntrack state machine.
    
    Needs ip/ip6tables and the conntrack tool (to check if we have an entry in
    the expected state).
    
    Test cases added here cover the following scenarios:
    1. already-acked (retransmitted) packets are not tagged as INVALID
    2. RST packet coming when conntrack is already closing (FIN/CLOSE_WAIT)
      transitions conntrack to CLOSE even if the RST is not an exact match
    3. RST packets with out-of-window sequence numbers are marked as INVALID
    4. SYN+Challenge ACK: check that challenge ack is allowed to pass
    5. Old SYN/ACK: check conntrack handles the case where SYN is answered
      with SYN/ACK for an old, previous connection attempt
    6. Check SYN reception while in ESTABLISHED state generates a challenge
       ack, RST response clears 'outdated' state + next SYN retransmit gets
       us into 'SYN_RECV' conntrack state.
    
    Tests get run twice, once with ipv4 and once with ipv6.
    
    Signed-off-by: Florian Westphal <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Florian Westphal authored and ummakynes committed May 10, 2024
    Commit a8a388c
  8. netfilter: nf_tables: allow clone callbacks to sleep

    Sven Auhagen reports transaction failures with the following error:
      ./main.nft:13:1-26: Error: Could not process rule: Cannot allocate memory
      percpu: allocation failed, size=16 align=8 atomic=1, atomic alloc failed, no space left
    
    This points to a failing percpu allocation with the GFP_ATOMIC flag.
    However, transactions happen from user context and are allowed to sleep.
    
    One case where we can call into the percpu allocator with GFP_ATOMIC is
    the nft_counter expression.
    
    Normally this happens from the control plane, so this could use GFP_KERNEL
    instead.  But one use case, element insertion from the packet path,
    needs to use GFP_ATOMIC allocations (nft_dynset expression).
    
    At this time, .clone callbacks always use GFP_ATOMIC for this reason.
    
    Add a gfp_t argument to the .clone function and pass the GFP_KERNEL or
    GFP_ATOMIC flag depending on context. This allows all clone memory
    allocations to sleep for the normal (transaction) case.
    
    Cc: Sven Auhagen <[email protected]>
    Signed-off-by: Florian Westphal <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Florian Westphal authored and ummakynes committed May 10, 2024
    Commit fa23e0d
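The idea of letting the caller choose the allocation mode can be sketched in user-space C. Everything here is an illustrative assumption: enum alloc_mode stands in for gfp_t, toy_alloc() is a toy allocator whose atomic pool is exhausted, and counter_clone() is a hypothetical clone callback, not the nf_tables one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_ATOMIC and GFP_KERNEL. */
enum alloc_mode { MODE_ATOMIC, MODE_MAY_SLEEP };

static int atomic_reserve; /* 0: pretend the atomic pool is exhausted */

/* Toy allocator: atomic requests fail once the reserve is empty,
 * while sleepable requests are always allowed to proceed. */
static void *toy_alloc(size_t size, enum alloc_mode mode)
{
	if (mode == MODE_ATOMIC) {
		if (atomic_reserve == 0)
			return NULL;
		atomic_reserve--;
	}
	return malloc(size);
}

struct counter {
	unsigned long packets;
	unsigned long bytes;
};

/* The clone callback takes the allocation mode from its caller
 * instead of hard-coding the atomic one. */
static struct counter *counter_clone(const struct counter *src,
				     enum alloc_mode mode)
{
	struct counter *dst;

	assert(src != NULL);
	dst = toy_alloc(sizeof(*dst), mode);
	if (dst)
		*dst = *src;
	return dst;
}
```

A transaction path would pass MODE_MAY_SLEEP, while a packet-path caller would keep passing MODE_ATOMIC and must still handle failure.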
  9. Merge tag 'gtp-24-05-07' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/gtp

    
    Pablo Neira Ayuso says:
    
    ====================
    gtp pull request 24-05-07
    
    This v3 includes:
    - fix for clang uninitialized variable per Jakub.
    - address Smatch and Coccinelle reports per Simon
    - remove inline in new IPv6 support per Simon
    - fix memleaks in netlink control plane per Simon
    -o-
    
    The following patchset contains IPv6 GTP driver support for net-next,
    this also includes IPv6 over IPv4 and vice-versa:
    
    Patch #1 removes an unnecessary stack variable initialization in the
             socket routine.
    
    Patch #2 deals with GTP extension headers. It parses the variable-length
             extension header to decapsulate packets accordingly. Otherwise, packets
             are dropped when these extension headers are present, which breaks
             interoperation with non-Linux GTP implementations.
    
    Patch #3 prepares for IPv6 support by moving IPv4 specific fields in PDP
             context objects to a union.
    
    Patch #4 adds IPv6 support while retaining backward compatibility.
             Three new attributes allow declaring an IPv6 GTP tunnel:
             GTPA_FAMILY, GTPA_PEER_ADDR6 and GTPA_MS_ADDR6, as well as
             IFLA_GTP_LOCAL6 to declare the IPv6 GTP UDP socket. Up to this
             patch, only IPv6 outer in IPv6 inner is supported.
    
    Patch #5 uses IPv6 address /64 prefix for UE/MS in the inner headers.
             Unlike IPv4, which provides a 1:1 mapping between UE/MS,
             IPv6 tunnel encapsulates traffic for /64 address as specified
             by 3GPP TS. Patch has been split from Patch #4 to highlight
             this behaviour.
    
    Patch #6 passes up IPv6 link-local traffic, such as IPv6 SLAAC, for
             handling to userspace so they are handled as control packets.
    
    Patch #7 prepares to allow for GTP IPv4 over IPv6 and vice-versa by
             moving IP specific debugging out of the function to build
             IPv4 and IPv6 GTP packets.
    
    Patch #8 generalizes TOS/DSCP handling following similar approach as
             in the existing iptunnel infrastructure.
    
    Patch #9 adds a helper function to build an IPv4 GTP packet in the outer
             header.
    
    Patch #10 adds a helper function to build an IPv6 GTP packet in the outer
              header.
    
    Patch #11 adds support for GTP IPv4-over-IPv6 and vice-versa.
    
    Patch #12 allows the same TID/TEID (tunnel identifier) to be used for inner
              IPv4 and IPv6 packets for better UE/MS dual stack integration.
    
    This series integrates with the osmocom.org project CI and TTCN-3 test
    infrastructure (Oliver Smith) as well as the userspace libgtpnl library.
    
    Thanks to Harald Welte, Oliver Smith and Pau Espin for reviewing and
    providing feedback through the osmocom.org redmine platform to make this
    happen.
    ====================
    
    Signed-off-by: David S. Miller <[email protected]>
    davem330 committed May 10, 2024
    Commit f8beae0

Commits on May 11, 2024

  1. bnxt_en: silence clang build warning

    Clang build brings a warning:
    
        ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:133:12: warning:
        comparison of distinct pointer types ('typeof (tmo_us) *' (aka 'unsigned
        int *') and 'typeof (65535) *' (aka 'int *'))
        [-Wcompare-distinct-pointer-types]
          133 |                 tmo_us = min(tmo_us, BNXT_PTP_QTS_MAX_TMO_US);
              |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    Fix it by specifying proper type for BNXT_PTP_QTS_MAX_TMO_US.
    
    Fixes: 7de3c22 ("bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query()")
    Signed-off-by: Vadim Fedorenko <[email protected]>
    Reviewed-by: Michael Chan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Vadim Fedorenko authored and kuba-moo committed May 11, 2024
    Commit 3815553
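The warning quoted above comes from a type-checking min() that compares pointers to its two arguments. A minimal sketch of that mechanism follows; it relies on GNU C extensions (statement expressions and __typeof__) plus C11 _Generic. The names min_checked, MAX_TMO_US and clamp_timeout() are hypothetical, and giving the constant a U suffix is only one way to make the types match, assumed here for illustration.

```c
#include <assert.h>

/* Simplified type-checking min(): comparing the addresses of the two
 * temporaries makes the compiler warn when their types differ. */
#define min_checked(x, y) ({		\
	__typeof__(x) _x = (x);		\
	__typeof__(y) _y = (y);		\
	(void)(&_x == &_y);		\
	_x < _y ? _x : _y; })

/* A plain 65535 has type int; the U suffix makes it unsigned int. */
#define MAX_TMO_US 65535U

static_assert(_Generic((MAX_TMO_US), unsigned int: 1, default: 0),
	      "constant must have the same type as the timeout variable");

static unsigned int clamp_timeout(unsigned int tmo_us)
{
	return min_checked(tmo_us, MAX_TMO_US);
}
```

Dropping the U suffix in this sketch would make the two temporaries unsigned int and int, which is the mismatch that clang reports.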
  2. virtio_net: Fix memory leak in virtnet_rx_mod_work

    The pointer declaration was missing the __free(kfree) annotation.
    
    Fixes: ff7c7d9 ("virtio_net: Remove command data from control_buf")
    Reported-by: Jens Axboe <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Signed-off-by: Daniel Jurgens <[email protected]>
    Tested-by: Jens Axboe <[email protected]>
    Reviewed-by: Xuan Zhuo <[email protected]>
    Acked-by: Michael S. Tsirkin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Daniel Jurgens authored and kuba-moo committed May 11, 2024
    Commit b49bd37
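Scope-based cleanup of this kind can be approximated in user space with the GCC/Clang cleanup attribute. The sketch below is an analogue under stated assumptions, not the driver code: auto_free, tracked_alloc(), tracked_free() and build_request() are hypothetical names, and the live_allocs counter exists only so that the effect can be observed.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs; /* outstanding allocations, tracked for the demo */

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

/* Cleanup handler: called with a pointer to the annotated variable. */
static void tracked_free(void *pp)
{
	void *p;

	assert(pp != NULL);
	p = *(void **)pp;
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Hypothetical user-space analogue of the kernel's __free(kfree). */
#define auto_free __attribute__((cleanup(tracked_free)))

static int build_request(void)
{
	char *buf auto_free = tracked_alloc(32);

	if (!buf)
		return -1;
	buf[0] = 1; /* use the buffer */
	return 0;   /* buf is released here, on every return path */
}
```

Without the annotation on the declaration, nothing would free the buffer when the function returns, which is the kind of leak this commit fixes.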
  3. selftests: net: fix timestamp not arriving in cmsg_time.sh

    On slow machines the SND timestamp sometimes doesn't arrive before
    we quit. The test only waits as long as the packet delay, so it's
    easy for a race condition to happen.
    
    Double the wait but do a bit of polling: once the SND timestamp
    arrives there's no point in waiting any longer.
    
    This fixes the "TXTIME abs" failures on debug kernels, like:
    
       Case ICMPv4  - TXTIME abs returned '', expected 'OK'
    
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 11, 2024
    Commit 2d3b8df
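The "longer wait, but poll and stop early" pattern can be sketched as a small C helper. This is a generic illustration rather than the selftest's code: wait_for() and countdown_ready() are hypothetical, and the readiness check is passed in as a callback.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Poll ready() every step_us microseconds for up to total_us, returning
 * as soon as it reports true instead of sleeping for the whole budget. */
static bool wait_for(bool (*ready)(void *), void *arg,
		     long total_us, long step_us)
{
	struct timespec ts;
	long waited;

	assert(step_us > 0);
	ts.tv_sec = step_us / 1000000;
	ts.tv_nsec = (step_us % 1000000) * 1000;

	for (waited = 0; waited < total_us; waited += step_us) {
		if (ready(arg))
			return true;
		nanosleep(&ts, NULL);
	}
	return ready(arg); /* one last check after the budget is spent */
}

/* Example predicate: becomes true once the counter has reached zero. */
static bool countdown_ready(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

A fast machine pays only for the steps it needs, while a slow one gets the full, larger budget.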
  4. selftests: net: increase the delay for relative cmsg_time.sh test

    Slow machines can delay scheduling of the packets for milliseconds.
    Increase the delay to 8ms if KSFT_MACHINE_SLOW. Try to limit the
    variability by moving setsockopts earlier (before we read time).
    
    This fixes the "TXTIME rel" failures on debug kernels, like:
    
      Case ICMPv4  - TXTIME rel returned '', expected 'OK'
    
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 11, 2024
    Commit b9d5f57
  5. gve: Avoid unnecessary use of comma operator

    Although it does not seem to have any untoward side-effects,
    the use of ';' to separate two assignments seems more appropriate than ','.
    
    Flagged by clang-18 -Wcomma
    
    No functional change intended.
    Compile tested only.
    
    Reviewed-by: Shailend Chand <[email protected]>
    Reviewed-by: Larysa Zaremba <[email protected]>
    Signed-off-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    horms authored and kuba-moo committed May 11, 2024
    Commit ebb8308
  6. gve: Use ethtool_sprintf/puts() to fill stats strings

    Make use of standard helpers to simplify filling in stats strings.
    
    The first two ethtool_puts() changes address the following fortification
    warnings flagged by W=1 builds with clang-18. (The last ethtool_puts
    change does not because the warning relates to writing beyond the first
    element of an array, and gve_gstrings_priv_flags only has one element.)
    
    .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
      562 |                         __read_overflow2_field(q_size_field, size);
          |                         ^
    .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
    
    Likewise, the same changes resolve the same problems flagged by Smatch.
    
    .../gve_ethtool.c:100 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_main_stats' too small (32 vs 576)
    .../gve_ethtool.c:120 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_adminq_stats' too small (32 vs 512)
    
    Compile tested only.
    
    Reviewed-by: Shailend Chand <[email protected]>
    Reviewed-by: Larysa Zaremba <[email protected]>
    Signed-off-by: Simon Horman <[email protected]>
    Acked-by: Justin Stitt <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    horms authored and kuba-moo committed May 11, 2024
    Commit ba8bcb0
  7. Merge branch 'gve-minor-cleanups'

    Simon Horman says:
    
    ====================
    gve: Minor cleanups
    
    This short patchset provides two minor cleanups for the gve driver.
    
    These were found by tooling as mentioned in each patch,
    and otherwise by inspection.
    
    No change in run time behaviour is intended.
    Each patch is compile tested only.
    
    v1: https://lore.kernel.org/r/[email protected]
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 11, 2024
    Commit 9c1bbc7
  8. octeontx2-pf: Reuse Transmit queue/Send queue index of HTB class

    The real number of transmit queues is incremented when the user enables an
    HTB class, and decremented when one is removed. Depending on SKB priority,
    the driver returns a transmit queue (Txq). Transmit queues and send queues
    are mapped one-to-one.
    
    In a few scenarios, the driver returns a transmit queue value that is
    greater than the real number of transmit queues; the stack detects this as
    an error and overwrites the transmit queue value.
    
    For example, the user has added two classes and the real number of queues
    is incremented accordingly:
    - tc class add dev eth1 parent 1: classid 1:1 htb
          rate 100Mbit ceil 100Mbit prio 1 quantum 1024
    - tc class add dev eth1 parent 1: classid 1:2 htb
          rate 100Mbit ceil 200Mbit prio 7 quantum 1024
    
    now if user deletes the class with id 1:1, driver decrements the real
    number of queues
    - tc class del dev eth1 classid 1:1
    
    But for the class with id 1:2, the driver returns a transmit queue
    value that is higher than the real number of transmit queues, leading
    to the error below:
    
    eth1 selects TX queue x, but real number of TX queues is x
    
    This patch solves the problem by assigning the deleted class's transmit
    queue/send queue to an active class.
    
    Signed-off-by: Hariprasad Kelam <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Hariprasad Kelam authored and kuba-moo committed May 11, 2024
    Commit 04fb71c
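One way to picture the index reuse is the toy model below. It is only a sketch of the idea and is not taken from the driver: struct dev, struct cls, class_add() and class_del() are hypothetical, and handing the freed index to whichever active class holds the highest index is an assumption made for illustration.

```c
#include <assert.h>

#define MAX_CLASSES 8

struct cls {
	int in_use;
	int txq; /* transmit queue index assigned to this class */
};

struct dev {
	struct cls classes[MAX_CLASSES];
	int real_txqs; /* real number of transmit queues */
};

/* Adding a class takes the next queue index and grows the queue count. */
static int class_add(struct dev *d, int id)
{
	assert(id >= 0 && id < MAX_CLASSES);
	d->classes[id].in_use = 1;
	d->classes[id].txq = d->real_txqs++;
	return d->classes[id].txq;
}

/* Deleting a class hands its queue index to the active class that holds
 * the highest index, so every index stays below the new queue count. */
static void class_del(struct dev *d, int id)
{
	int freed, last, i;

	assert(id >= 0 && id < MAX_CLASSES && d->classes[id].in_use);
	freed = d->classes[id].txq;
	last = d->real_txqs - 1;
	d->classes[id].in_use = 0;

	for (i = 0; i < MAX_CLASSES; i++)
		if (d->classes[i].in_use && d->classes[i].txq == last)
			d->classes[i].txq = freed;
	d->real_txqs--;
}
```

Replaying the example from the commit message, deleting the first class leaves the second class with the freed index rather than one equal to the reduced queue count.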
  9. net: ethernet: adi: adin1110: Replace linux/gpio.h by proper one

    linux/gpio.h is deprecated and subject to removal.
    The driver doesn't use it directly, so replace it
    with what is really being used.
    
    Signed-off-by: Andy Shevchenko <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    andy-shev authored and kuba-moo committed May 11, 2024
    Commit 84c8b7a
  10. af_unix: Add dead flag to struct scm_fp_list.

    Commit 1af2dfa ("af_unix: Don't access successor in unix_del_edges()
    during GC.") fixed a use-after-free by avoiding access to edge->successor
    while GC is in progress.
    
    However, there could be a small race window where another process could
    call unix_del_edges() while gc_in_progress is true and __skb_queue_purge()
    is on the way.
    
    So, we need another marker for struct scm_fp_list which indicates if the
    skb is garbage-collected.
    
    This patch adds a dead flag to struct scm_fp_list and sets it to true before
    calling __skb_queue_purge().
    
    Fixes: 1af2dfa ("af_unix: Don't access successor in unix_del_edges() during GC.")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    q2ven authored and kuba-moo committed May 11, 2024
    Commit 7172dc9
  11. net: dsa: microchip: Fix spellig mistake "configur" -> "configure"

    There is a spelling mistake in a dev_err message. Fix it.
    
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    ColinIanKing authored and kuba-moo committed May 11, 2024
    Commit 089507a
  12. net: usb: smsc95xx: stop lying about skb->truesize

    Some usb drivers try to set small skb->truesize and break
    core networking stacks.
    
    In this patch, I removed one of the skb->truesize overrides.
    
    I also replaced one skb_clone() by an allocation of a fresh
    and small skb, to get minimally sized skbs, like we did
    in commit 1e2c611 ("net: cdc_ncm: reduce skb truesize
    in rx path") and 4ce62d5 ("net: usb: ax88179_178a:
    stop lying about skb->truesize")
    
    v3: also fix a sparse error ( https://lore.kernel.org/oe-kbuild-all/[email protected]/ )
    v2: leave the skb_trim() game because smsc95xx_rx_csum_offload()
        needs the csum part. (Jakub)
        While we are at it, use get_unaligned() in smsc95xx_rx_csum_offload().
    
    Fixes: 2f7ca80 ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver")
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Steve Glendinning <[email protected]>
    Cc: [email protected]
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Eric Dumazet authored and kuba-moo committed May 11, 2024
    Commit d50729f
  13. net: qede: use extack in qede_flow_parse_ports()

    Convert qede_flow_parse_ports to use extack,
    and drop the edev argument.
    
    Convert DP_NOTICE call to use NL_SET_ERR_MSG_MOD instead.
    
    In calls to qede_flow_parse_ports(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit a7c9540
  14. net: qede: use extack in qede_set_v6_tuple_to_profile()

    Convert qede_set_v6_tuple_to_profile() to take extack,
    and drop the edev argument.
    
    Convert DP_INFO call to use NL_SET_ERR_MSG_MOD instead.
    
    In calls to qede_set_v6_tuple_to_profile(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit 6f88f12
  15. net: qede: use extack in qede_set_v4_tuple_to_profile()

    Convert qede_set_v4_tuple_to_profile() to take extack,
    and drop the edev argument.
    
    Convert DP_INFO call to use NL_SET_ERR_MSG_MOD instead.
    
    In calls to qede_set_v4_tuple_to_profile(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit f63a9dc
  16. net: qede: use extack in qede_flow_parse_v6_common()

    Convert qede_flow_parse_v6_common() to take extack,
    and drop the edev argument.
    
    Convert DP_NOTICE call to use NL_SET_ERR_MSG_MOD instead.
    
    Pass extack in calls to qede_flow_parse_ports() and
    qede_set_v6_tuple_to_profile().
    
    In calls to qede_flow_parse_v6_common(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit a62944d
  17. net: qede: use extack in qede_flow_parse_v4_common()

    Convert qede_flow_parse_v4_common() to take extack,
    and drop the edev argument.
    
    Convert DP_NOTICE call to use NL_SET_ERR_MSG_MOD instead.
    
    Pass extack in calls to qede_flow_parse_ports() and
    qede_set_v4_tuple_to_profile().
    
    In calls to qede_flow_parse_v4_common(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit f2f9938
  18. net: qede: use extack in qede_flow_parse_tcp_v6()

    Convert qede_flow_parse_tcp_v6() to take extack,
    and drop the edev argument.
    
    Pass extack in call to qede_flow_parse_v6_common().
    
    In call to qede_flow_parse_tcp_v6(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit b1a18d5
  19. net: qede: use extack in qede_flow_parse_tcp_v4()

    Convert qede_flow_parse_tcp_v4() to take extack,
    and drop the edev argument.
    
    Pass extack in call to qede_flow_parse_v4_common().
    
    In call to qede_flow_parse_tcp_v4(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit f84d527
  20. net: qede: use extack in qede_flow_parse_udp_v6()

    Convert qede_flow_parse_udp_v6() to take extack,
    and drop the edev argument.
    
    Pass extack in call to qede_flow_parse_v6_common().
    
    In call to qede_flow_parse_udp_v6(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit b73ad5c
  21. net: qede: use extack in qede_flow_parse_udp_v4()

    Convert qede_flow_parse_udp_v4() to take extack,
    and drop the edev argument.
    
    Pass extack in call to qede_flow_parse_v4_common().
    
    In call to qede_flow_parse_udp_v4(), use NULL as extack
    for now, until a subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit 9c8f5ed
  22. net: qede: add extack in qede_add_tc_flower_fltr()

    Define extack locally, to reduce line lengths and aid future users.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit f833a65
  23. net: qede: use extack in qede_parse_flow_attr()

    Convert qede_parse_flow_attr() to take extack,
    and drop the edev argument.
    
    Convert DP_NOTICE calls to use NL_SET_ERR_MSG_* instead.
    
    Pass extack in calls to qede_flow_parse_{tcp,udp}_v{4,6}().
    
    In calls to qede_parse_flow_attr(), if extack is
    unavailable, then use NULL for now, until a
    subsequent patch makes extack available.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit d6883bc
  24. net: qede: use faked extack in qede_flow_spec_to_rule()

    Since qede_parse_flow_attr() now does error reporting
    through extack, then give it a fake extack and extract the
    error message afterwards if one was set.
    
    The extracted error message is then passed on through
    DP_NOTICE(), including messages that were previously issued
    with DP_INFO().
    
    This fake extack approach is already used by
    mlxsw_env_linecard_modules_power_mode_apply() in
    drivers/net/ethernet/mellanox/mlxsw/core_env.c
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit eb705d7
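The fake extack approach can be illustrated with a user-space sketch. It is a simplified model, not the qede or mlxsw code: struct ext_ack is a minimal stand-in for struct netlink_ext_ack, SET_ERR_MSG is a stand-in for the NL_SET_ERR_MSG family, and parse_attr() and legacy_add_rule() are hypothetical. It also shows why passing NULL as extack is harmless in this model: the message is simply dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal stand-in for struct netlink_ext_ack: it only carries a message. */
struct ext_ack {
	const char *msg;
};

/* Stand-in for NL_SET_ERR_MSG(): tolerates a NULL extack. */
#define SET_ERR_MSG(extack, text) do {		\
	if (extack)				\
		(extack)->msg = (text);		\
} while (0)

/* The parser reports errors only through extack (hypothetical check). */
static int parse_attr(int proto, struct ext_ack *extack)
{
	if (proto != 6 && proto != 17) {
		SET_ERR_MSG(extack, "Only TCP and UDP are supported");
		return -1;
	}
	return 0;
}

/* Legacy entry point with no netlink request: give the parser a local
 * extack, then forward any message it set to the log. */
static int legacy_add_rule(int proto, char *log, size_t len)
{
	struct ext_ack extack = { NULL };
	int rc;

	assert(log != NULL && len > 0);
	log[0] = '\0';
	rc = parse_attr(proto, &extack);
	if (extack.msg)
		snprintf(log, len, "notice: %s", extack.msg);
	return rc;
}
```

The parser keeps a single error-reporting path, and only the legacy caller needs to know that its messages end up in a log instead of a netlink reply.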
  25. net: qede: propagate extack through qede_flow_spec_validate()

    Pass extack to qede_flow_spec_validate() when called in
    qede_flow_spec_to_rule().
    
    Pass extack to qede_parse_actions().
    
    Not converting qede_flow_spec_validate() to use extack for
    errors, as it's only called from qede_flow_spec_to_rule(),
    where extack is faked into a DP_NOTICE anyway, so opting to
    keep DP_VERBOSE/DP_NOTICE usage.
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    Commit d2a437e
  26. net: qede: use extack in qede_parse_actions()

    Convert DP_NOTICE/DP_INFO to NL_SET_ERR_MSG_MOD.
    
    Keep edev around for use with QEDE_RSS_COUNT().
    
    Only compile tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 11, 2024
    8415487
  27. Merge branch 'net-qede-convert-filter-code-to-use-extack'

    Asbjørn Sloth Tønnesen says:
    
    ====================
    net: qede: convert filter code to use extack
    
    This series converts the filter code in the qede driver
    to use NL_SET_ERR_MSG_*(extack, ...) for error handling.
    
    Patches 1-12 convert qede_parse_flow_attr() to use extack,
    along with all its static helper functions.
    
    qede_parse_flow_attr() is used in two places:
    - qede_add_tc_flower_fltr()
    - qede_flow_spec_to_rule()
    
    In the latter call site extack is faked in the same way as
    is done in mlxsw (patch 12).
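    
    The faked-extack pattern described above can be sketched in plain C.
    This is a minimal userspace illustration, not the driver code: the
    struct, the macro and both functions are hypothetical stand-ins for
    struct netlink_ext_ack, NL_SET_ERR_MSG_MOD() and the qede helpers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct netlink_ext_ack. */
struct fake_ext_ack {
	const char *msg;
};

/* Hypothetical stand-in for NL_SET_ERR_MSG_MOD(): record a static string. */
#define FAKE_SET_ERR_MSG(extack, text)		\
	do {					\
		if (extack)			\
			(extack)->msg = (text);	\
	} while (0)

/* Shared parser: reports errors only through extack. */
static int parse_proto(int proto, struct fake_ext_ack *extack)
{
	if (proto != 4 && proto != 6) {
		FAKE_SET_ERR_MSG(extack, "Unsupported protocol");
		return -1;
	}
	return 0;
}

/*
 * Caller without a netlink request: pass a local ("fake") extack and
 * turn its message into a log line, as a driver might do with DP_NOTICE.
 */
static int legacy_validate(int proto, char *log, size_t len)
{
	struct fake_ext_ack extack = { NULL };
	int err = parse_proto(proto, &extack);

	if (err && extack.msg)
		snprintf(log, len, "notice: %s", extack.msg);
	return err;
}
```

    The point of the pattern is that the parser needs only one error
    path; callers that have no netlink context decide for themselves how
    to surface the message.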
    
    While the conversion is going on, some error messages are silenced
    between patches 1-12. If wanted, I could squash patches 1-12 in a v3,
    but I felt that it would be easier to review as 12 more trivial patches.
    
    Patches 13 and 14 finish up by converting qede_parse_actions() and
    ensuring that extack is propagated to it in both call contexts.
    
    v1: https://lore.kernel.org/netdev/[email protected]/
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 11, 2024
    24e28b6
  28. Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
    
    Tony Nguyen says:
    
    ====================
    Intel Wired LAN Driver Updates 2024-05-08 (most Intel drivers)
    
    This series contains updates to i40e, iavf, ice, igb, igc, e1000e, and ixgbe
    drivers.
    
    Asbjørn Sloth Tønnesen adds checks against supported flower control flags
    for i40e, iavf, ice, and igb drivers.
    
    Michal corrects filters removed during eswitch release for ice.
    
    Corinna Vinschen defers PTP initialization to later in probe so that
    netdev log entry is initialized on igc.
    
    Ilpo Järvinen removes a couple of unused, duplicate defines on
    e1000e and ixgbe.
    
    * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
      net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD duplicates
      igc: fix a log entry using uninitialized netdev
      ice: remove correct filters during eswitch release
      igb: flower: validate control flags
      ice: flower: validate control flags
      iavf: flower: validate control flags
      i40e: flower: validate control flags
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 11, 2024
    cddd2dc

Commits on May 12, 2024

  1. ARC: Add eBPF JIT support

    This will add eBPF JIT support to the 32-bit ARCv2 processors. The
    implementation is qualified by running the BPF tests on a Synopsys HSDK
    board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU.
    
    The test_bpf.ko reports 2-10 fold improvements in execution time of its
    tests. For instance:
    
    test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS
    test_bpf: #33 tcpdump port 22 jited:1 120  224  260 PASS
    
    test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS
    test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1  23 PASS
    
    test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS
    test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS
    
    Deployment and structure
    ------------------------
    The related code is added to "arch/arc/net":
    
    - bpf_jit.h       -- The interface that a back-end translator must provide
    - bpf_jit_core.c  -- Knows how to handle the input eBPF byte stream
    - bpf_jit_arcv2.c -- The back-end code that knows the translation logic
    
    The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance
    to the whole process. Normally, the translation is done in one pass,
    namely the "normal pass". In case some relocations are not known during
    this pass, some data (arc_jit_data) is allocated for the next pass to
    come. This possible next (and last) pass is called the "extra pass".
    
    1. Normal pass       # The necessary pass
         1a. Dry run       # Get the whole JIT length, epilogue offset, etc.
         1b. Emit phase    # Allocate memory and start emitting instructions
    2. Extra pass        # Only needed if there are relocations to be fixed
         2a. Patch relocations
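    
    The "dry run, then emit" shape of the normal pass can be sketched as
    follows. This is a toy model under stated assumptions, not the ARC
    JIT: toy_insn, toy_translate() and toy_jit() are hypothetical names,
    and the extra pass for relocations is left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy "instruction": emits `len` bytes of value `byte`. Purely illustrative. */
struct toy_insn {
	uint8_t byte;
	uint8_t len;
};

/*
 * Walk the program once. With buf == NULL this is the dry run: nothing is
 * written and only the total length is computed. With a buffer, the same
 * walk emits the bytes.
 */
static size_t toy_translate(const struct toy_insn *prog, size_t n, uint8_t *buf)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		if (buf)
			memset(buf + off, prog[i].byte, prog[i].len);
		off += prog[i].len;
	}
	return off;
}

/* Normal pass: dry run to size the image, then allocate and emit. */
static uint8_t *toy_jit(const struct toy_insn *prog, size_t n, size_t *out_len)
{
	size_t len = toy_translate(prog, n, NULL);	/* 1a. dry run */
	uint8_t *image = malloc(len ? len : 1);

	if (!image)
		return NULL;
	*out_len = toy_translate(prog, n, image);	/* 1b. emit phase */
	return image;
}
```

    Running the same walk twice keeps the length calculation and the
    emitted bytes from drifting apart, which is one plausible reason for
    structuring a JIT this way.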
    
    Support status
    --------------
    The JIT compiler supports BPF instructions up to "cpu=v4". However, it
    does not yet provide support for:
    
    - Tail calls
    - Atomic operations
    - 64-bit division/remainder
    - BPF_PROBE_MEM* (exception table)
    
    The result of "test_bpf" test suite on an HSDK board is:
    
    hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf
    
      test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]
    
    All the failing test cases are due to the ones that were not JIT'ed.
    Categorically, they can be represented as:
    
      .-----------.------------.-------------.
      | test type |   opcodes  | # of cases  |
      |-----------+------------+-------------|
      | atomic    | 0xC3, 0xDB |         149 |
      | div64     | 0x37, 0x3F |          22 |
      | mod64     | 0x97, 0x9F |          15 |
      `-----------^------------+-------------|
                               | (total) 186 |
                               `-------------'
    
    Setup: build config
    -------------------
    The following configs must be set to have a working JIT test:
    
      CONFIG_BPF_JIT=y
      CONFIG_BPF_JIT_ALWAYS_ON=y
      CONFIG_TEST_BPF=m
    
    The following options are not necessary for the tests module,
    but are good to have:
    
      CONFIG_DEBUG_INFO=y             # prerequisite for below
      CONFIG_DEBUG_INFO_BTF=y         # so bpftool can generate vmlinux.h
    
      CONFIG_FTRACE=y                 #
      CONFIG_BPF_SYSCALL=y            # all these options lead to
      CONFIG_KPROBE_EVENTS=y          # having CONFIG_BPF_EVENTS=y
      CONFIG_PERF_EVENTS=y            #
    
    Some BPF programs provide data through /sys/kernel/debug:
      CONFIG_DEBUG_FS=y
    arc# mount -t debugfs debugfs /sys/kernel/debug
    
    Setup: elfutils
    ---------------
    The libdw.{so,a} library that is used by pahole for processing
    the final binary must come from elfutils 0.189 or newer. The
    support for ARCv2 [1] has been added since that version.
    
    [1]
    https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7
    
    Setup: pahole
    -------------
    The line below in linux/scripts/Makefile.btf must be commented out:
    
    pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats
    
    Or else, the build will fail:
    
    $ make V=1
      ...
      BTF     .btf.vmlinux.bin.o
    pahole -J --btf_gen_floats                    \
           -j --lang_exclude=rust                 \
           --skip_encoding_btf_inconsistent_proto \
           --btf_gen_optimized .tmp_vmlinux.btf
    Complex, interval and imaginary float types are not supported
    Encountered error while encoding BTF.
      ...
      BTFIDS  vmlinux
    ./tools/bpf/resolve_btfids/resolve_btfids vmlinux
    libbpf: failed to find '.BTF' ELF section in vmlinux
    FAILED: load BTF from vmlinux: No data available
    
    This is due to the fact that the ARC toolchains generate
    "complex float" DIE entries in libgcc and at the moment, pahole
    can't handle such entries.
    
    Running the tests
    -----------------
    host$ scp /bld/linux/lib/test_bpf.ko arc:
    arc # sysctl net.core.bpf_jit_enable=1
    arc # insmod test_bpf.ko test_suite=test_bpf
          ...
          test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS
          test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]
    
    Acknowledgments
    ---------------
    - Claudiu Zissulescu for his unwavering support
    - Yuriy Kolerov for testing and troubleshooting
    - Vladimir Isaev for the pahole workaround
    - Sergey Matyukevich for paving the road by adding the interpreter support
    
    Signed-off-by: Shahab Vahedi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Shahab Vahedi authored and Alexei Starovoitov committed May 12, 2024
    f122668
  2. riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs

    Support an instruction for resolving absolute addresses of per-CPU
    data from their per-CPU offsets. This instruction is internal-only and
    users are not allowed to use it directly. For now, it will only be
    used for internal inlining optimizations between the BPF verifier and
    BPF JITs.
    
    RISC-V uses generic per-cpu implementation where the offsets for CPUs
    are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
    the address of the task_struct in TP register. The first element in
    task_struct is struct thread_info, and we can get the cpu number by
    reading from the TP register + offsetof(struct thread_info, cpu).
    
    Once we have the cpu number in a register we read the offset for that
    cpu from address: &__per_cpu_offset + (cpu_number << 3). Then we add this
    offset to the destination register.
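    
    The address arithmetic above can be sketched in userspace C. The
    array contents, NR_CPUS and resolve_percpu() are assumptions made for
    illustration; only the "base + (cpu << 3), then add" computation is
    taken from the text.

```c
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the kernel's __per_cpu_offset[] array. */
static uint64_t per_cpu_offset[NR_CPUS] = { 0x0, 0x1000, 0x2000, 0x3000 };

/*
 * Resolve the absolute address of a per-CPU variable: load the 8-byte entry
 * at (base of array) + (cpu << 3), then add it to the per-CPU address.
 */
static uint64_t resolve_percpu(uint64_t percpu_addr, unsigned int cpu)
{
	const uint8_t *base = (const uint8_t *)per_cpu_offset;
	uint64_t off = *(const uint64_t *)(base + ((uint64_t)cpu << 3));

	return percpu_addr + off;
}
```

    The shift by 3 is simply a multiplication by the 8-byte size of each
    array entry.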
    
    To measure the improvement from this change, the benchmark in [1] was
    used on Qemu:
    
    Before:
    glob-arr-inc   :    1.127 ± 0.013M/s
    arr-inc        :    1.121 ± 0.004M/s
    hash-inc       :    0.681 ± 0.052M/s
    
    After:
    glob-arr-inc   :    1.138 ± 0.011M/s
    arr-inc        :    1.366 ± 0.006M/s
    hash-inc       :    0.676 ± 0.001M/s
    
    [1] anakryiko/linux@8dec900975ef
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Acked-by: Björn Töpel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed May 12, 2024
    19c56d4
  3. riscv, bpf: inline bpf_get_smp_processor_id()

    Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit.
    
    RISCV saves the pointer to the CPU's task_struct in the TP (thread
    pointer) register. This makes it trivial to get the CPU's processor id.
    As thread_info is the first member of task_struct, we can read the
    processor id from TP + offsetof(struct thread_info, cpu).
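    
    The "first member" argument can be illustrated with a small C
    sketch. The struct layouts here are simplified and hypothetical (the
    real kernel structs differ), and `tp` merely stands in for the value
    held in the TP register.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical layouts; the real kernel structs differ. */
struct toy_thread_info {
	unsigned long flags;
	int cpu;
};

struct toy_task_struct {
	struct toy_thread_info thread_info;	/* first member, as in the text */
	int pid;
};

/*
 * Read the CPU number the way the inlined sequence does: a single load from
 * (task pointer) + offsetof(struct thread_info, cpu).
 */
static int cpu_from_tp(const void *tp)
{
	int cpu;

	memcpy(&cpu, (const char *)tp + offsetof(struct toy_thread_info, cpu),
	       sizeof(cpu));
	return cpu;
}
```

    Because thread_info sits at offset 0 of the task structure, the
    offset within thread_info is also the offset from the task pointer,
    so no intermediate pointer arithmetic is needed.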
    
              RISCV64 JIT output for `call bpf_get_smp_processor_id`
              ======================================================
    
                    Before                           After
                   --------                         -------
    
             auipc   t1,0x848c                  ld    a5,32(tp)
             jalr    604(t1)
             mv      a5,a0
    
    Benchmark using [1] on Qemu.
    
    ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
    
    +---------------+------------------+------------------+--------------+
    |      Name     |     Before       |       After      |   % change   |
    |---------------+------------------+------------------+--------------|
    | glob-arr-inc  | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s |   + 24.04%   |
    | arr-inc       | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s |   + 23.56%   |
    | hash-inc      | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s |   + 32.18%   |
    +---------------+------------------+------------------+--------------+
    
    NOTE: This benchmark includes changes from this patch and the previous
          patch that implemented the per-cpu insn.
    
    [1] anakryiko/linux@8dec900975ef
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Acked-by: Björn Töpel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed May 12, 2024
    2ddec2c
  4. arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrs

    Support an instruction for resolving absolute addresses of per-CPU
    data from their per-CPU offsets. This instruction is internal-only and
    users are not allowed to use it directly. For now, it will only be
    used for internal inlining optimizations between the BPF verifier and
    BPF JITs.
    
    Since commit 7158627 ("arm64: percpu: implement optimised pcpu
    access using tpidr_el1"), the per-cpu offset for the CPU is stored in
    the tpidr_el1/2 register of that CPU.
    
    To support this BPF instruction in the ARM64 JIT, the following ARM64
    instructions are emitted:
    
    mov dst, src		// Move src to dst, if src != dst
    mrs tmp, tpidr_el1/2	// Move per-cpu offset of the current cpu into tmp.
    add dst, dst, tmp	// Add the per cpu offset to the dst.
    
    To measure the performance improvement provided by this change, the
    benchmark in [1] was used:
    
    Before:
    glob-arr-inc   :   23.597 ± 0.012M/s
    arr-inc        :   23.173 ± 0.019M/s
    hash-inc       :   12.186 ± 0.028M/s
    
    After:
    glob-arr-inc   :   23.819 ± 0.034M/s
    arr-inc        :   23.285 ± 0.017M/s
    hash-inc       :   12.419 ± 0.011M/s
    
    [1] anakryiko/linux@8dec900975ef
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed May 12, 2024
    7a4c322
  5. bpf, arm64: inline bpf_get_smp_processor_id() helper

    Inline calls to bpf_get_smp_processor_id() helper in the JIT by emitting
    a read from struct thread_info. The SP_EL0 system register holds the
    pointer to the task_struct and thread_info is the first member of this
    struct. We can read the cpu number from the thread_info.
    
    Here is how the ARM64 JITed assembly changes after this commit:
    
                                          ARM64 JIT
                                         ===========
    
                  BEFORE                                    AFTER
                 --------                                  -------
    
    int cpu = bpf_get_smp_processor_id();        int cpu = bpf_get_smp_processor_id();
    
    mov     x10, #0xfffffffffffff4d0             mrs     x10, sp_el0
    movk    x10, #0x802b, lsl #16                ldr     w7, [x10, #24]
    movk    x10, #0x8000, lsl #32
    blr     x10
    add     x7, x0, #0x0
    
                   Performance improvement using benchmark[1]
    
    ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
    
    +---------------+-------------------+-------------------+--------------+
    |      Name     |      Before       |        After      |   % change   |
    |---------------+-------------------+-------------------+--------------|
    | glob-arr-inc  | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s |   + 10.74%   |
    | arr-inc       | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s |   + 5.37%    |
    | hash-inc      | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s |   + 2.08%    |
    +---------------+-------------------+-------------------+--------------+
    
    [1] anakryiko/linux@8dec900975ef
    
    Signed-off-by: Puranjay Mohan <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed May 12, 2024
    75fe4c0
  6. Merge branch 'bpf-inline-helpers-in-arm64-and-riscv-jits'

    Puranjay Mohan says:
    
    ====================
    bpf: Inline helpers in arm64 and riscv JITs
    
    Changes in v5 -> v6:
    arm64 v5: https://lore.kernel.org/all/[email protected]/
    riscv v2: https://lore.kernel.org/all/[email protected]/
    - Combine riscv and arm64 changes in single series
    - Some coding style fixes
    
    Changes in v4 -> v5:
    v4: https://lore.kernel.org/all/[email protected]/
    - Implement the inlining of the bpf_get_smp_processor_id() in the JIT.
    
    NOTE: This needs to be based on:
    https://lore.kernel.org/all/[email protected]/
    to be built.
    
    Manual run of bpf-ci with this series rebased on above:
    kernel-patches/bpf#6929
    
    Changes in v3 -> v4:
    v3: https://lore.kernel.org/all/[email protected]/
    - Fix coding style issue related to C89 standards.
    
    Changes in v2 -> v3:
    v2: https://lore.kernel.org/all/[email protected]/
    - Fixed the xlated dump of percpu mov to "r0 = &(void __percpu *)(r0)"
    - Made ARM64 and x86-64 use the same code for inlining. The only difference
      that remains is the per-cpu address of the cpu_number.
    
    Changes in v1 -> v2:
    v1: https://lore.kernel.org/all/[email protected]/
    - Add a patch to inline bpf_get_smp_processor_id()
    - Fix an issue in MRS instruction encoding as pointed out by Will
    - Remove CONFIG_SMP check because arm64 kernel always compiles with CONFIG_SMP
    
    This series adds support for internal-only per-CPU instructions and inlines
    the bpf_get_smp_processor_id() helper call for ARM64 and RISC-V BPF JITs.
    
    Here is an example of calls to bpf_get_smp_processor_id() and
    percpu_array_map_lookup_elem() before and after this series on ARM64.
    
                                             BPF
                                            =====
                  BEFORE                                       AFTER
                 --------                                     -------
    
    int cpu = bpf_get_smp_processor_id();           int cpu = bpf_get_smp_processor_id();
    (85) call bpf_get_smp_processor_id#229032       (85) call bpf_get_smp_processor_id#8
    
    p = bpf_map_lookup_elem(map, &zero);            p = bpf_map_lookup_elem(map, &zero);
    (18) r1 = map[id:78]                            (18) r1 = map[id:153]
    (18) r2 = map[id:82][0]+65536                   (18) r2 = map[id:157][0]+65536
    (85) call percpu_array_map_lookup_elem#313512   (07) r1 += 496
                                                    (61) r0 = *(u32 *)(r2 +0)
                                                    (35) if r0 >= 0x1 goto pc+5
                                                    (67) r0 <<= 3
                                                    (0f) r0 += r1
                                                    (79) r0 = *(u64 *)(r0 +0)
                                                    (bf) r0 = &(void __percpu *)(r0)
                                                    (05) goto pc+1
                                                    (b7) r0 = 0
    
                                          ARM64 JIT
                                         ===========
    
                  BEFORE                                       AFTER
                 --------                                     -------
    
    int cpu = bpf_get_smp_processor_id();           int cpu = bpf_get_smp_processor_id();
    mov     x10, #0xfffffffffffff4d0                mrs     x10, sp_el0
    movk    x10, #0x802b, lsl #16                   ldr     w7, [x10, #24]
    movk    x10, #0x8000, lsl #32
    blr     x10
    add     x7, x0, #0x0
    
    p = bpf_map_lookup_elem(map, &zero);            p = bpf_map_lookup_elem(map, &zero);
    mov     x0, #0xffff0003ffffffff                 mov     x0, #0xffff0003ffffffff
    movk    x0, #0xce5c, lsl #16                    movk    x0, #0xe0f3, lsl #16
    movk    x0, #0xca00                             movk    x0, #0x7c00
    mov     x1, #0xffff8000ffffffff                 mov     x1, #0xffff8000ffffffff
    movk    x1, #0x8bdb, lsl #16                    movk    x1, #0xb0c7, lsl #16
    movk    x1, #0x6000                             movk    x1, #0xe000
    mov     x10, #0xffffffffffff3ed0                add     x0, x0, #0x1f0
    movk    x10, #0x802d, lsl #16                   ldr     w7, [x1]
    movk    x10, #0x8000, lsl #32                   cmp     x7, #0x1
    blr     x10                                     b.cs    0x0000000000000090
    add     x7, x0, #0x0                            lsl     x7, x7, #3
                                                    add     x7, x7, x0
                                                    ldr     x7, [x7]
                                                    mrs     x10, tpidr_el1
                                                    add     x7, x7, x10
                                                    b       0x0000000000000094
                                                    mov     x7, #0x0
    
                  Performance improvement found using benchmark[1]
    
    ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
    
      +---------------+-------------------+-------------------+--------------+
      |      Name     |      Before       |        After      |   % change   |
      |---------------+-------------------+-------------------+--------------|
      | glob-arr-inc  | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s |   + 10.74%   |
      | arr-inc       | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s |   + 5.37%    |
      | hash-inc      | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s |   + 2.08%    |
      +---------------+-------------------+-------------------+--------------+
    
    [1] anakryiko/linux@8dec900975ef
    
                 RISCV64 JIT output for `call bpf_get_smp_processor_id`
                =======================================================
    
                      Before                           After
                     --------                         -------
    
               auipc   t1,0x848c                  ld    a5,32(tp)
               jalr    604(t1)
               mv      a5,a0
    
      Benchmark using [1] on Qemu.
    
      ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
    
      +---------------+------------------+------------------+--------------+
      |      Name     |     Before       |       After      |   % change   |
      |---------------+------------------+------------------+--------------|
      | glob-arr-inc  | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s |   + 24.04%   |
      | arr-inc       | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s |   + 23.56%   |
      | hash-inc      | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s |   + 32.18%   |
      +---------------+------------------+------------------+--------------+
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Alexei Starovoitov committed May 12, 2024
    55302bc
  7. s390/bpf: Emit a barrier for BPF_FETCH instructions

    BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH"
    should be the same as atomic_fetch_add(), which is currently not the
    case on s390x: the serialization instruction "bcr 14,0" is missing.
    This applies to "and", "or" and "xor" variants too.
    
    s390x is allowed to reorder stores with subsequent fetches from
    different addresses, so code relying on BPF_FETCH acting as a barrier,
    for example:
    
      stw [%r0], 1
      afadd [%r1], %r2
      ldxw %r3, [%r4]
    
    may be broken. Fix it by emitting "bcr 14,0".
    
    Note that a separate serialization instruction is not needed for
    BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs
    serialization itself.
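    
    The three-instruction example above has a rough userspace analogue
    in C11 atomics. This is only an analogy: the kernel's
    atomic_fetch_add() is defined by the Linux kernel memory model rather
    than by C11, and the variable and function names below are
    hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t flag;
static _Atomic uint64_t counter;
static _Atomic uint32_t other;

/*
 * Mirrors the example: a plain store, a fetch-and-add, and a load from a
 * different address. memory_order_seq_cst is used on the read-modify-write
 * as the closest portable analogue of a fully ordered operation; the
 * surrounding accesses are deliberately relaxed.
 */
static uint32_t store_fetch_add_load(uint64_t inc, uint64_t *old)
{
	atomic_store_explicit(&flag, 1, memory_order_relaxed);
	*old = atomic_fetch_add_explicit(&counter, inc, memory_order_seq_cst);
	return atomic_load_explicit(&other, memory_order_relaxed);
}
```

    Code written like this relies on the fetch-and-add, and nothing
    else, to order the store against the later load, which is the
    situation the missing "bcr 14,0" would break in JITed BPF.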
    
    Fixes: ba3b86b ("s390/bpf: Implement new atomic ops")
    Reported-by: Puranjay Mohan <[email protected]>
    Closes: https://lore.kernel.org/bpf/[email protected]/
    Signed-off-by: Ilya Leoshkevich <[email protected]>
    Reviewed-by: Puranjay Mohan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    iii-i authored and Alexei Starovoitov committed May 12, 2024
    6837898
  8. riscv, bpf: Fix typo in comment

    We can use either "instruction" or "insn" in the comment.
    
    Signed-off-by: Xiao Wang <[email protected]>
    Reviewed-by: Pu Lehui <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    XiaoWang1772 authored and Alexei Starovoitov committed May 12, 2024
    80c5a07

Commits on May 13, 2024

  1. riscv, bpf: make some atomic operations fully ordered

    The BPF atomic operations with the BPF_FETCH modifier along with
    BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements
    all atomic operations except BPF_CMPXCHG with relaxed ordering.
    
    Section 8.1 of "The RISC-V Instruction Set Manual Volume I:
    Unprivileged ISA" [1], titled "Specifying Ordering of Atomic
    Instructions", says:
    
    | To provide more efficient support for release consistency [5], each
    | atomic instruction has two bits, aq and rl, used to specify additional
    | memory ordering constraints as viewed by other RISC-V harts.
    
    and
    
    | If only the aq bit is set, the atomic memory operation is treated as
    | an acquire access.
    | If only the rl bit is set, the atomic memory operation is treated as a
    | release access.
    |
    | If both the aq and rl bits are set, the atomic memory operation is
    | sequentially consistent.
    
    Fix this by setting both aq and rl bits to 1 for operations with
    BPF_FETCH and BPF_XCHG.
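    
    As an illustration of where these bits live, the sketch below
    encodes an instruction in the RISC-V AMO format with and without aq
    and rl. encode_amo() and the macro names are hypothetical helpers
    written for this example; they are not the JIT's own emit functions.

```c
#include <stdint.h>

/* Bit positions of the ordering bits in the RISC-V AMO instruction format. */
#define AMO_AQ (1u << 26)
#define AMO_RL (1u << 25)

/*
 * Encode an AMO instruction: funct5 | aq | rl | rs2 | rs1 | funct3 | rd | opcode.
 * funct3 selects the width (2 = word, 3 = doubleword); 0x2f is the AMO opcode.
 */
static uint32_t encode_amo(uint8_t funct5, int aq, int rl, uint8_t rs2,
			   uint8_t rs1, uint8_t funct3, uint8_t rd)
{
	return ((uint32_t)(funct5 & 0x1f) << 27) |
	       (aq ? AMO_AQ : 0) | (rl ? AMO_RL : 0) |
	       ((uint32_t)(rs2 & 0x1f) << 20) |
	       ((uint32_t)(rs1 & 0x1f) << 15) |
	       ((uint32_t)(funct3 & 0x7) << 12) |
	       ((uint32_t)(rd & 0x1f) << 7) | 0x2f;
}
```

    Calling encode_amo() twice with the same operands, once with aq and
    rl cleared and once with both set, yields encodings that differ only
    in bits 26 and 25, which is the whole difference between the relaxed
    and the fully ordered form.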
    
    [1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf
    
    Fixes: dd642cc ("riscv, bpf: Implement more atomic operations for RV64")
    Signed-off-by: Puranjay Mohan <[email protected]>
    Reviewed-by: Pu Lehui <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    puranjaymohan authored and Alexei Starovoitov committed May 13, 2024
    20a759d
  2. selftests/bpf: Migrate recvmsg* return code tests to verifier_sock_addr.c
    
    This set of tests checks that the BPF verifier rejects programs with
    invalid return codes (recvmsg4 and recvmsg6 hooks can only return 1).
    This patch replaces the tests in test_sock_addr.c with
    verifier_sock_addr.c, a new verifier prog_tests for sockaddr hooks, in a
    step towards fully retiring test_sock_addr.c.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    73964e9
  3. selftests/bpf: Use program name for skel load/destroy functions

    In preparation to migrate tests from bpf/test_sock_addr.c to
    sock_addr.c, update BPF_SKEL_FUNCS so that it generates functions
    based on prog_name instead of skel_name. This allows us to differentiate
    between programs in the same skeleton.
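    
    The idea of keying generated function names on the program rather
    than on the skeleton can be sketched with token pasting. This is not
    the selftest's BPF_SKEL_FUNCS macro: TOY_SKEL_FUNCS, toy_skel and the
    two program names are hypothetical, and no libbpf calls are made.

```c
#include <stdlib.h>

/* Hypothetical skeleton holding several programs; not a real libbpf skeleton. */
struct toy_skel {
	const char *attached;	/* name of the program that was "attached" */
};

/*
 * Generate <prog_name>_load() and <prog_name>_destroy(). Keying the function
 * names on prog_name rather than skel_name lets several programs that live
 * in the same skeleton each get their own pair.
 */
#define TOY_SKEL_FUNCS(skel_name, prog_name)				\
static void *prog_name##_load(void)					\
{									\
	struct skel_name *skel = calloc(1, sizeof(*skel));		\
									\
	if (skel)							\
		skel->attached = #prog_name;				\
	return skel;							\
}									\
static void prog_name##_destroy(void *skel)				\
{									\
	free(skel);							\
}

TOY_SKEL_FUNCS(toy_skel, sendmsg4_prog)
TOY_SKEL_FUNCS(toy_skel, sendmsg6_prog)
```

    Had the names been derived from skel_name, the second expansion
    would have redefined the same two functions and failed to compile.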
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    86b65c6
  4. selftests/bpf: Handle LOAD_REJECT test cases

    In preparation to move test cases from bpf/test_sock_addr.c that expect
    LOAD_REJECT, this patch adds expected_attach_type and extends load_fn to
    accept an expected attach type and a flag indicating whether or not
    rejection is expected.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    5eff48f
  5. selftests/bpf: Handle ATTACH_REJECT test cases

    In preparation to move test cases from bpf/test_sock_addr.c that expect
    ATTACH_REJECT, this patch adds BPF_SKEL_FUNCS_RAW to generate load and
    destroy functions that use bpf_prog_attach() to control the attach_type.
    
    The normal load functions use bpf_program__attach_cgroup(), which does not
    have the same degree of control over the attach type, as
    bpf_program_attach_fd() calls bpf_link_create() with the attach type
    extracted from prog using bpf_program__expected_attach_type(). It is
    currently not possible to modify the attach type before
    bpf_program__attach_cgroup() is called, since
    bpf_program__set_expected_attach_type() has no effect after the program
    is loaded.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    5a047b2
  6. selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases

    In preparation to move test cases from bpf/test_sock_addr.c that expect
    system calls to return ENOTSUPP or EPERM, this patch propagates errno
    from relevant system calls up to test_sock_addr() where the result can
    be checked.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    a2618c0
  7. selftests/bpf: Migrate WILDCARD_IP test

    Move wildcard IP sendmsg test case out of bpf/test_sock_addr.c into
    prog_tests/sock_addr.c.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    d1b24fc
  8. selftests/bpf: Migrate sendmsg deny test cases

    This set of tests checks that sendmsg calls are rejected (return -EPERM)
    when the sendmsg* hook returns 0. Replace those in bpf/test_sock_addr.c
    with corresponding tests in prog_tests/sock_addr.c.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    f46a104
  9. selftests/bpf: Migrate sendmsg6 v4 mapped address tests

    Migrate the test case ensuring that sendmsg returns -ENOTSUPP when
    sending to an IPv4-mapped IPv6 address from bpf/test_sock_addr.c to
    prog_tests/sock_addr.c.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    54462e8
  10. selftests/bpf: Migrate wildcard destination rewrite test

    Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg
    respects when sendmsg6 hooks rewrite the destination IP with the IPv6
    wildcard IP, [::].
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    8eaf805
  11. selftests/bpf: Migrate expected_attach_type tests

    Migrates tests from progs/test_sock_addr.c ensuring that programs fail
    to load when the expected attach type does not match.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    b0f3af0
  12. selftests/bpf: Migrate ATTACH_REJECT test cases

    Migrate test case from bpf/test_sock_addr.c ensuring that program
    attachment fails when using an inappropriate attach type.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    cded71f
  13. selftests/bpf: Remove redundant sendmsg test cases

    Remove these test cases completely, as the same behavior is already
    covered by other sendmsg* test cases in prog_tests/sock_addr.c. This
    just rewrites the destination address similar to sendmsg_v4_prog and
    sendmsg_v6_prog.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    9c3f178
  14. selftests/bpf: Retire test_sock_addr.(c|sh)

    Fully remove test_sock_addr.c and test_sock_addr.sh, as test coverage
    has been fully moved to prog_tests/sock_addr.c.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    61ecfdf
  15. selftests/bpf: Expand sockaddr program return value tests

    This patch expands verifier coverage for program return values to cover
    bind, connect, sendmsg, getsockname, and getpeername hooks. It also
    rounds out the recvmsg coverage by adding test cases for recvmsg_unix
    hooks.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    1e0a836
  16. sefltests/bpf: Expand sockaddr hook deny tests

    This patch expands test coverage for EPERM tests to include connect and
    bind calls and rounds out the coverage for sendmsg by adding tests for
    sendmsg_unix.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    dfb7539
  17. selftests/bpf: Expand getsockname and getpeername tests

    This expands coverage for getsockname and getpeername hooks to include
    getsockname4, getsockname6, getpeername4, and getpeername6.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    bc467e9
  18. selftests/bpf: Expand ATTACH_REJECT tests

    This expands coverage for ATTACH_REJECT tests to include connect_unix,
    sendmsg_unix, recvmsg*, getsockname*, and getpeername*.
    
    Signed-off-by: Jordan Rife <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jrife authored and Alexei Starovoitov committed May 13, 2024
    a3d3eb9
  19. Merge branch 'retire-progs-test_sock_addr'

    Jordan Rife says:
    
    ====================
    Retire progs/test_sock_addr.c
    
    This patch series migrates remaining tests from bpf/test_sock_addr.c to
    prog_tests/sock_addr.c and progs/verifier_sock_addr.c in order to fully
    retire the old-style test program and expands test coverage to test
    previously untested scenarios related to sockaddr hooks.
    
    This is a continuation of the work started recently during the expansion
    of prog_tests/sock_addr.c.
    
    Link: https://lore.kernel.org/bpf/[email protected]/T/#u
    
    =======
    Patches
    =======
    * Patch 1 moves tests that check valid return values for recvmsg hooks
      into progs/verifier_sock_addr.c, a new addition to the verifier test
      suite.
    * Patches 2-5 lay the groundwork for test migration, enabling
      prog_tests/sock_addr.c to handle more test dimensions.
    * Patches 6-11 move existing tests to prog_tests/sock_addr.c.
    * Patch 12 removes some redundant test cases.
    * Patches 14-17 expand on existing test coverage.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Alexei Starovoitov committed May 13, 2024
    e9dd229
  20. tools: remove redundant ethtool.h from tooling infra

    Remove the redundant ethtool.h header file from tools/include/uapi/linux.
    The file is unnecessary as the system uses the kernel's
    include/uapi/linux/ethtool.h directly.
    
    Signed-off-by: Tushar Vyavahare <[email protected]>
    Acked-by: Jakub Kicinski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    tvyavaha authored and Alexei Starovoitov committed May 13, 2024
    bbe91a9
  21. bpf: avoid gcc overflow warning in test_xdp_vlan.c

    This patch fixes an integer overflow warning raised by GCC in
    xdp_prognum1 of progs/test_xdp_vlan.c:
    
      GCC-BPF  [test_maps] test_xdp_vlan.bpf.o
    progs/test_xdp_vlan.c: In function 'xdp_prognum1':
    progs/test_xdp_vlan.c:163:25: error: integer overflow in expression
     '(short int)(((__builtin_constant_p((int)vlan_hdr->h_vlan_TCI)) != 0
       ? (int)(short unsigned int)((short int)((int)vlan_hdr->h_vlan_TCI
       << 8 >> 8) << 8 | (short int)((int)vlan_hdr->h_vlan_TCI << 0 >> 8
       << 0)) & 61440 : (int)__builtin_bswap16(vlan_hdr->h_vlan_TCI)
       & 61440) << 8 >> 8) << 8' of type 'short int' results in '0' [-Werror=overflow]
      163 |                         bpf_htons((bpf_ntohs(vlan_hdr->h_vlan_TCI) & 0xf000)
          |                         ^~~~~~~~~
    
    The problem lies with the expansion of the bpf_htons macro and the
    expression passed into it.  The bpf_htons macro (and similarly the
    bpf_ntohs macro) expand to a ternary operation using either
    __builtin_bswap16 or ___bpf_swab16 to swap the bytes, depending on
    whether the expression is constant.
    
    For an expression, with 'value' as a u16, like:
    
      bpf_htons (value & 0xf000)
    
    The entire (value & 0xf000) is 'x' in the expansion of ___bpf_swab16
    and we get as one part of the expanded swab16:
    
      ((__u16)(value & 0xf000) << 8 >> 8 << 8)
    
    This will always evaluate to 0, which is intentional since this
    subexpression deals with the byte guaranteed to be 0 by the mask.
    
    However, GCC warns because the precise reason this always evaluates to 0
    is an overflow.  Specifically, the plain 0xf000 in the expression is a
    signed 32-bit integer, which causes 'value' to also be promoted to a
    signed 32-bit integer, and the combination of the 8-bit left shift and
    down-cast back to __u16 results in a signed overflow (really a 'warning:
    overflow in conversion from int to __u16', which is propagated up through
    the rest of the expression, leading to the ultimate overflow warning
    above).  This is a valid warning despite being the intended result of
    this code.
    
    Clang does not warn on this case, likely because it performs constant
    folding later in the compilation process relative to GCC.  It seems that
    by the time clang does constant folding for this expression, the side of
    the ternary with this overflow has already been discarded.
    
    Fortunately, this warning is easily silenced by simply making the 0xf000
    mask explicitly unsigned.  This has no impact on the result.
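
The claim that an unsigned mask does not change the result can be checked with a small userspace sketch. It is an assumption-laden illustration: demo_swab16() is a plain 16-bit byte swap written for this example, not the ___bpf_swab16 macro, and the function names are hypothetical. Only the type of the mask literal differs between the two variants.

```c
#include <assert.h>
#include <stdint.h>

/* Plain 16-bit byte swap used as a stand-in for the bpf_htons expansion. */
static uint16_t demo_swab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Mask written as a plain literal: 0xf000 has type int. */
uint16_t demo_mask_plain(uint16_t value)
{
	return demo_swab16((uint16_t)(value & 0xf000));
}

/* Mask written as an unsigned literal: 0xf000U has type unsigned int. */
uint16_t demo_mask_unsigned(uint16_t value)
{
	return demo_swab16((uint16_t)(value & 0xf000U));
}

/* Returns 1 if both variants agree for every possible 16-bit input. */
int demo_masks_agree(void)
{
	for (uint32_t v = 0; v <= 0xffff; v++) {
		if (demo_mask_plain((uint16_t)v) != demo_mask_unsigned((uint16_t)v))
			return 0;
	}
	return 1;
}
```

The arithmetic is identical in both variants; the suffix only changes the type in which the intermediate expression is evaluated, which is the part the compiler diagnostic is sensitive to.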
    
    Signed-off-by: David Faust <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Eduard Zingerman <[email protected]>
    Cc: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    dafaust authored and Alexei Starovoitov committed May 13, 2024
    792a04b
  22. selftests/bpf: Fix a few tests for GCC related warnings.

    This patch corrects a few warnings to allow selftests to compile for
    GCC.
    
    -- progs/cpumask_failure.c --
    
    progs/bpf_misc.h:136:22: error: ‘cpumask’ is used uninitialized
    [-Werror=uninitialized]
      136 | #define __sink(expr) asm volatile("" : "+g"(expr))
          |                      ^~~
    progs/cpumask_failure.c:68:9: note: in expansion of macro ‘__sink’
       68 |         __sink(cpumask);
    
    The macro __sink(cpumask) with the '+' constraint modifier forces the
    compiler to expect both a read and a write of cpumask. GCC detects
    that cpumask is never initialized and reports an error.
    This patch removes the spurious, unneeded definitions of cpumask.
    
    -- progs/dynptr_fail.c --
    
    progs/dynptr_fail.c:1444:9: error: ‘ptr1’ may be used uninitialized
    [-Werror=maybe-uninitialized]
     1444 |         bpf_dynptr_clone(&ptr1, &ptr2);
    
    Many of the tests in the file are related to the detection of
    uninitialized pointers by the verifier. GCC is able to detect possible
    uninitialized values, and reports this as an error.
    The patch initializes all of the previously uninitialized structs.
    
    -- progs/test_tunnel_kern.c --
    
    progs/test_tunnel_kern.c:590:9: error: array subscript 1 is outside
    array bounds of ‘struct geneve_opt[1]’ [-Werror=array-bounds=]
      590 |         *(int *) &gopt.opt_data = bpf_htonl(0xdeadbeef);
          |         ^~~~~~~~~~~~~~~~~~~~~~~
    progs/test_tunnel_kern.c:575:27: note: at offset 4 into object ‘gopt’ of
    size 4
      575 |         struct geneve_opt gopt;
    
    This test accesses memory beyond the defined data of struct geneve_opt,
    whose last field is "u8 opt_data[0]", which clearly does not get
    reserved space (on the stack) in the function header. This pattern is
    repeated in the ip6geneve_set_tunnel and geneve_set_tunnel functions.
    GCC is able to see this and emits a warning.
    The patch introduces a local struct that allocates enough space to
    safely allow the write to opt_data field.
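
The general technique of reserving explicit storage behind a header that ends in a variable-length payload can be sketched as follows. This is a hedged illustration, not the code from the patch: struct demo_opt, union demo_opt_storage, and the helper functions are hypothetical, and the payload is written with memcpy() into the reserved bytes rather than through a cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAYLOAD_LEN 4

/* Hypothetical option header. The payload is a flexible array member,
 * so sizeof(struct demo_opt) covers only the fixed fields. */
struct demo_opt {
	uint16_t opt_class;
	uint8_t type;
	uint8_t length;
	uint8_t opt_data[];
};

/* Storage that reserves room for the header plus the payload, so a
 * write to the payload stays inside the object. */
union demo_opt_storage {
	struct demo_opt opt;
	uint8_t raw[sizeof(struct demo_opt) + DEMO_PAYLOAD_LEN];
};

/* Fill the fixed fields, then copy the payload into the reserved bytes. */
static void demo_opt_fill(union demo_opt_storage *s, const uint8_t *payload)
{
	memset(s, 0, sizeof(*s));
	s->opt.type = 1;
	s->opt.length = 1; /* assumption: length counted in 4-byte units */
	memcpy(s->raw + offsetof(struct demo_opt, opt_data), payload,
	       DEMO_PAYLOAD_LEN);
}

/* Returns 0 if the payload read back from the storage matches. */
int demo_opt_roundtrip(void)
{
	static const uint8_t payload[DEMO_PAYLOAD_LEN] = { 0xde, 0xad, 0xbe, 0xef };
	union demo_opt_storage s;

	assert(sizeof(s) >= sizeof(struct demo_opt) + DEMO_PAYLOAD_LEN);
	demo_opt_fill(&s, payload);
	return memcmp(s.raw + offsetof(struct demo_opt, opt_data), payload,
		      DEMO_PAYLOAD_LEN);
}
```

Because the object on the stack is now large enough for header and payload together, the compiler has no out-of-bounds access to report.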
    
    -- progs/jeq_infer_not_null_fail.c --
    
    progs/jeq_infer_not_null_fail.c:21:40: error: array subscript ‘struct
    bpf_map[0]’ is partly outside array bounds of ‘struct <anonymous>[1]’
    [-Werror=array-bounds=]
       21 |         struct bpf_map *inner_map = map->inner_map_meta;
          |                                        ^~
    progs/jeq_infer_not_null_fail.c:14:3: note: object ‘m_hash’ of size 32
       14 | } m_hash SEC(".maps");
    
    This example defines m_hash in the context of the compilation unit and
    casts it to struct bpf_map, although m_hash is much smaller than struct
    bpf_map. GCC errors out when the code attempts to access a member of
    struct bpf_map that lies outside the bounds defined for m_hash.
    This patch disables the warning through a GCC pragma.
    
    These changes were tested in bpf-next master selftests without any
    regressions.
    
    Signed-off-by: Cupertino Miranda <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Yonghong Song <[email protected]>
    Cc: Eduard Zingerman <[email protected]>
    Cc: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    cupermir authored and Alexei Starovoitov committed May 13, 2024
    5ddafcc
  23. selftests/bpf: Free strdup memory in xdp_hw_metadata

    The strdup() function returns a pointer to a new string which is a
    duplicate of the string "ifname". Memory for the new string is obtained
    with malloc(), and needs to be freed with free().
    
    This patch adds this missing "free(saved_hwtstamp_ifname)" in cleanup()
    to avoid a potential memory leak in xdp_hw_metadata.c.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Link: https://lore.kernel.org/r/af9bcccb96655e82de5ce2b4510b88c9c8ed5ed0.1715417367.git.tanggeliang@kylinos.cn
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Geliang Tang authored and Alexei Starovoitov committed May 13, 2024
    a3c1c95
  24. bpf: disable strict aliasing in test_global_func9.c

    The BPF selftest test_global_func9.c performs type punning and breaks
    strict-aliasing rules.
    
    In particular, given:
    
      int global_func9(struct __sk_buff *skb)
      {
    	int result = 0;
    
    	[...]
    	{
    		const struct C c = {.x = skb->len, .y = skb->family };
    
    		result |= foo((const struct S *)&c);
    	}
      }
    
    When building with strict-aliasing enabled (the default) the
    initialization of `c' gets optimized away in its entirety:
    
    	[... no initialization of `c' ...]
    	r1 = r10
    	r1 += -40
    	call	foo
    	w0 |= w6
    
    Since GCC knows that `foo' accesses s->x, we get a "maybe
    uninitialized" warning.
    
    On the other hand, when strict-aliasing is disabled GCC only optimizes
    away the store to `.y':
    
    	r1 = *(u32 *) (r6+0)
    	*(u32 *) (r10+-40) = r1  ; This is .x = skb->len in `c'
    	r1 = r10
    	r1 += -40
    	call	foo
    	w0 |= w6
    
    In this case the warning is not emitted, because s->x is initialized.
    
    This patch disables strict aliasing in this test when building with
    GCC.  clang seems to not optimize this particular code even when
    strict aliasing is enabled.
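
For contrast, ordinary C code that wants the same effect without type punning can copy the shared prefix into an object of the type the callee expects. This is a different technique from the one the patch applies (the patch keeps the cast and disables strict aliasing for the test), and every name below is hypothetical.

```c
#include <assert.h>
#include <string.h>

struct demo_s { int x; };
struct demo_c { int x; int y; };

/* Reads s->x, like the callee described in the commit message. */
int demo_foo(const struct demo_s *s)
{
	return s ? s->x : 0;
}

/* Instead of casting &c to (const struct demo_s *), copy the leading
 * bytes into a real struct demo_s and pass that object. */
int demo_call_without_punning(int len, int family)
{
	const struct demo_c c = { .x = len, .y = family };
	struct demo_s s;

	assert(sizeof(s) <= sizeof(c)); /* only the shared prefix is copied */
	memcpy(&s, &c, sizeof(s));
	return demo_foo(&s);
}
```

Since the callee now receives an object whose effective type matches the pointer type, the compiler cannot treat the initialization as dead.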
    
    Tested in bpf-next master.
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Yonghong Song <[email protected]>
    Cc: Eduard Zingerman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jemarch authored and Alexei Starovoitov committed May 13, 2024
    7386898
  25. bpf: ignore expected GCC warning in test_global_func10.c

    The BPF selftest global_func10 in progs/test_global_func10.c contains:
    
      struct Small {
      	long x;
      };
    
      struct Big {
      	long x;
      	long y;
      };
    
      [...]
    
      __noinline int foo(const struct Big *big)
      {
    	if (!big)
    		return 0;
    
    	return bpf_get_prandom_u32() < big->y;
      }
    
      [...]
    
      SEC("cgroup_skb/ingress")
      __failure __msg("invalid indirect access to stack")
      int global_func10(struct __sk_buff *skb)
      {
    	const struct Small small = {.x = skb->len };
    
    	return foo((struct Big *)&small) ? 1 : 0;
      }
    
    GCC emits a "maybe uninitialized" warning for the code above, because
    it knows `foo' accesses `big->y'.
    
    Since the purpose of this selftest is to check that the verifier will
    fail on this sort of invalid memory access, this patch just silences
    the compiler warning.
    
    Tested in bpf-next master.
    No regressions.
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Yonghong Song <[email protected]>
    Cc: Eduard Zingerman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jemarch authored and Alexei Starovoitov committed May 13, 2024
    6a2f786
  26. bpf: make list_for_each_entry portable

    [Changes from V1:
    - The __compat_break has been abandoned in favor of
      a more readable can_loop macro that can be used anywhere, including
      loop conditions.]
    
    The macro list_for_each_entry is defined in bpf_arena_list.h as
    follows:
    
      #define list_for_each_entry(pos, head, member)				\
    	for (void * ___tmp = (pos = list_entry_safe((head)->first,		\
    						    typeof(*(pos)), member),	\
    			      (void *)0);					\
    	     pos && ({ ___tmp = (void *)pos->member.next; 1; });		\
    	     cond_break,							\
    	     pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member))
    
    The macro cond_break, in turn, expands to a statement expression that
    contains a `break' statement.  Compound statement expressions, and the
    subsequent ability of placing statements in the header of a `for'
    loop, are GNU extensions.
    
    Unfortunately, clang implements this GNU extension differently than
    GCC:
    
    - In GCC the `break' statement is bound to the containing "breakable"
      context in which the defining `for' appears.  If there is no such
      context, GCC emits a warning: break statement without enclosing `for'
      or `switch' statement.
    
    - In clang the `break' statement is bound to the defining `for'.  If
      the defining `for' is itself inside some breakable construct, then
      clang emits a -Wgcc-compat warning.
    
    This patch adds a new macro can_loop to bpf_experimental, that
    implements the same logic as cond_break but evaluates to a boolean
    expression.  The patch also changes all the current uses of
    cond_break within the header of a loop accordingly.
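
The portability argument (a boolean expression in the loop condition instead of a `break' hidden in a statement expression) can be illustrated in plain C. This is a sketch under loud assumptions: DEMO_CAN_LOOP is a hypothetical budget counter, not the real can_loop macro, and the list type is a simple singly linked list rather than the arena list.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	int val;
	struct demo_node *next;
};

/* Hypothetical stand-in: evaluates to a boolean and consumes one unit
 * of loop budget. It needs no statement expression and no `break'. */
#define DEMO_CAN_LOOP(budget) ((budget)-- > 0)

/* Sum list values, stopping early once the budget is exhausted. */
int demo_sum_list(const struct demo_node *head, int budget)
{
	const struct demo_node *pos;
	int sum = 0;

	for (pos = head; pos && DEMO_CAN_LOOP(budget); pos = pos->next)
		sum += pos->val;
	return sum;
}

/* Build the list 1 -> 2 -> 3 -> 4 on the stack and sum it. */
int demo_sum_with_budget(int budget)
{
	struct demo_node nodes[4];

	for (int i = 0; i < 4; i++) {
		nodes[i].val = i + 1;
		nodes[i].next = (i + 1 < 4) ? &nodes[i + 1] : NULL;
	}
	return demo_sum_list(&nodes[0], budget);
}
```

Because the termination test is an ordinary expression, it behaves the same under any compiler and can be combined with other conditions using `&&'.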
    
    Tested in bpf-next master.
    No regressions.
    
    Signed-off-by: Jose E. Marchesi <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: Alexei Starovoitov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    jemarch authored and Alexei Starovoitov committed May 13, 2024
    ba39486
  27. Merge tag 'nf-next-24-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

    Pablo Neira Ayuso says:
    
    ====================
    Netfilter updates for net-next
    
    The following patchset contains Netfilter updates for net-next:
    
    Patch #1 skips transaction if object type provides no .update interface.
    
    Patch #2 skips NETDEV_CHANGENAME which is unused.
    
    Patch #3 enables conntrack to handle Multicast Router Advertisements and
    	 Multicast Router Solicitations from the Multicast Router Discovery
    	 protocol (RFC4286) as untracked as opposed to invalid packets.
    	 From Linus Luessing.
    
    Patch #4 updates DCCP conntracker to mark invalid packets as invalid,
    	 instead of dropping them, from Jason Xing.
    
    Patch #5 uses NF_DROP instead of -NF_DROP since NF_DROP is 0,
    	 also from Jason.
    
    Patch #6 removes reference in netfilter's sysctl documentation on pickup
    	 entries which were already removed by Florian Westphal.
    
    Patch #7 removes check for IPS_OFFLOAD flag to disable early drop which
    	 allows entries to be evicted from the conntrack table,
    	 also from Florian.
    
    Patches #8 to #16 update nf_tables pipapo set backend to allocate
    	 the data structure copy on-demand from preparation phase,
    	 to better deal with OOM situations where .commit step is too late
    	 to fail. Series from Florian Westphal.
    
    Patch #17 adds a selftest with packetdrill to cover conntrack TCP state
    	 transitions, also from Florian.
    
    Patch #18 uses GFP_KERNEL to clone elements from control plane to avoid
    	 quick atomic reserves exhaustion with large sets; the reporter refers
    	 to sets on the order of a million entries.
    
    * tag 'nf-next-24-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
      netfilter: nf_tables: allow clone callbacks to sleep
      selftests: netfilter: add packetdrill based conntrack tests
      netfilter: nft_set_pipapo: remove dirty flag
      netfilter: nft_set_pipapo: move cloning of match info to insert/removal path
      netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone
      netfilter: nft_set_pipapo: merge deactivate helper into caller
      netfilter: nft_set_pipapo: prepare walk function for on-demand clone
      netfilter: nft_set_pipapo: prepare destroy function for on-demand clone
      netfilter: nft_set_pipapo: make pipapo_clone helper return NULL
      netfilter: nft_set_pipapo: move prove_locking helper around
      netfilter: conntrack: remove flowtable early-drop test
      netfilter: conntrack: documentation: remove reference to non-existent sysctl
      netfilter: use NF_DROP instead of -NF_DROP
      netfilter: conntrack: dccp: try not to drop skb in conntrack
      netfilter: conntrack: fix ct-state for ICMPv6 Multicast Router Discovery
      netfilter: nf_tables: remove NETDEV_CHANGENAME from netdev chain event handler
      netfilter: nf_tables: skip transaction if update object is not implemented
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    c85e41b
  28. net: phy: air_en8811h: reset netdev rules when LED is set manually

    Setting LED_OFF via brightness_set should deactivate hw control, so make
    sure netdev trigger rules also get cleared in that case.
    This fixes unwanted restoration of the default netdev trigger rules and
    matches the behaviour when using the 'netdev' trigger without any
    hardware offloading.
    
    Fixes: 71e7943 ("net: phy: air_en8811h: Add the Airoha EN8811H PHY driver")
    Signed-off-by: Daniel Golle <[email protected]>
    Link: https://lore.kernel.org/r/5ed8ea615890a91fa4df59a7ae8311bbdf63cdcf.1715248281.git.daniel@makrotopia.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    dangowrt authored and kuba-moo committed May 13, 2024
    87bfdbb
  29. selftest: epoll_busy_poll: Fix spelling mistake "couldnt" -> "couldn't"

    There is a spelling mistake in a TH_LOG message. Fix it.
    
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    ColinIanKing authored and kuba-moo committed May 13, 2024
    f37dc28
  30. selftests: net: use upstream mtools

    Joachim kindly merged the IPv6 support in
    troglobit/mtools#2, so we can just use his
    version now. A few more fixes subsequently came in for IPv6, so even
    better.
    
    Check that the deployed mtools version is 3.0 or above. Note that the
    version check breaks compatibility with my fork where I didn't bump the
    version, but I assume that won't be a problem.
    
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    vladimiroltean authored and kuba-moo committed May 13, 2024
    cfc2eef
  31. selftests: netfilter: nft_flowtable.sh: bump socat timeout to 1m

    Now that this test runs in netdev CI it looks like 10s isn't enough
    for debug kernels:
      selftests: net/netfilter: nft_flowtable.sh
      2024/05/10 20:33:08 socat[12204] E write(7, 0x563feb16a000, 8192): Broken pipe
      FAIL: file mismatch for ns1 -> ns2
      -rw------- 1 root root 37345280 May 10 20:32 /tmp/tmp.Am0yEHhNqI
     ...
    
    Looks like socat gets zapped too quickly, so increase timeout to 1m.
    
    Could also reduce tx file size for KSFT_MACHINE_SLOW, but it's preferable
    to have the same test for both debug and nondebug.
    
    Signed-off-by: Florian Westphal <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Florian Westphal authored and kuba-moo committed May 13, 2024
    5fcc17d
  32. net: ena: Add a counter for driver's reset failures

    This patch adds a counter to the ena_adapter struct in
    order to keep track of reset failures.
    The counter is incremented every time either ena_restore_device()
    or ena_destroy_device() fails.
    
    Signed-off-by: Osama Abboud <[email protected]>
    Signed-off-by: David Arinzon <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    davidarinzon authored and kuba-moo committed May 13, 2024
    62a261f
  33. net: ena: Reduce holes in ena_com structures

    This patch makes two changes in order to fill holes and
    reduce the overall size of the structures ena_com_dev
    and ena_com_rx_ctx.
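
How field order creates or removes padding holes can be shown with two hypothetical structures that hold the same members. These are not the ena_com structures; the field names and types are assumptions chosen only to make the effect visible, and the exact sizes depend on the target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small members placed between larger ones force the compiler to
 * insert padding so that each member is properly aligned. */
struct demo_with_holes {
	uint8_t flag_a;
	uint64_t addr;
	uint8_t flag_b;
	uint32_t len;
};

/* Same members, ordered from largest to smallest alignment, so the
 * small members share what used to be padding. */
struct demo_reordered {
	uint64_t addr;
	uint32_t len;
	uint8_t flag_a;
	uint8_t flag_b;
};

/* Number of bytes saved by the reordering on this ABI. */
size_t demo_bytes_saved(void)
{
	assert(sizeof(struct demo_reordered) <= sizeof(struct demo_with_holes));
	return sizeof(struct demo_with_holes) - sizeof(struct demo_reordered);
}
```

A layout inspection tool such as pahole reports the same information directly from debug info, which is usually more convenient than comparing sizeof() by hand.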
    
    Signed-off-by: Shahar Itzko <[email protected]>
    Signed-off-by: David Arinzon <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    davidarinzon authored and kuba-moo committed May 13, 2024
    48673ef
  34. net: ena: Add validation for completion descriptors consistency

    Validate that `first` flag is set only for the first
    descriptor in multi-buffer packets.
    In case of an invalid descriptor, a reset will occur.
    A new reset reason for RX data corruption has been added.
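
The consistency rule described above can be sketched as a small check over the descriptors of one packet. This is a simplified model, not driver code: struct demo_cdesc, enum demo_reset_reason, and both functions are hypothetical, and the sketch only rejects a `first` flag on a later descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_DESCS 8

/* Hypothetical completion descriptor with only the flag of interest. */
struct demo_cdesc {
	bool first;
};

/* Hypothetical reset reasons; the second models RX data corruption. */
enum demo_reset_reason {
	DEMO_RESET_NONE = 0,
	DEMO_RESET_RX_DATA_CORRUPTION,
};

/* A `first` flag on any descriptor after index 0 is inconsistent. */
enum demo_reset_reason demo_validate_pkt(const struct demo_cdesc *descs,
					  size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (descs[i].first)
			return DEMO_RESET_RX_DATA_CORRUPTION;
	}
	return DEMO_RESET_NONE;
}

/* Convenience wrapper: bit i of first_bits is descriptor i's flag. */
enum demo_reset_reason demo_validate_first_bits(unsigned int first_bits,
						 size_t n)
{
	struct demo_cdesc descs[DEMO_MAX_DESCS];

	assert(n <= DEMO_MAX_DESCS);
	for (size_t i = 0; i < n; i++)
		descs[i].first = (first_bits >> i) & 1u;
	return demo_validate_pkt(descs, n);
}
```

Returning a reason code rather than a boolean lets the caller record why the reset was scheduled.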
    
    Signed-off-by: Shahar Itzko <[email protected]>
    Signed-off-by: David Arinzon <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    davidarinzon authored and kuba-moo committed May 13, 2024
    b37b98a
  35. net: ena: Changes around strscpy calls

    strscpy copies as much of the string as possible,
    meaning that the destination string will be truncated
    if there is not enough space. As this is a non-critical error in
    our case, add a debug-level print to indicate it.
    
    This patch also removes a -1 which was added to ensure
    enough space for NUL, but the strscpy destination string is
    guaranteed to be NUL-terminated, therefore, the -1 is
    not needed.
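
Why the -1 is unnecessary can be seen with a userspace stand-in for strscpy. The kernel function is not available outside the kernel, so demo_strscpy() below is an assumption that mimics its documented behaviour (always NUL-terminate, return the copied length or -E2BIG on truncation); demo_copy_is_truncated() is a hypothetical caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Userspace stand-in modelled on strscpy semantics: copy at most
 * size - 1 characters, always NUL-terminate, and return the number of
 * characters copied or -E2BIG if the source did not fit. */
ssize_t demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;
	len = strlen(src);
	if (len >= size) {
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}
	memcpy(dst, src, len + 1);
	return (ssize_t)len;
}

/* Pass the full buffer size (no -1) and report whether the copy was
 * truncated; a real caller might log this at debug level. */
int demo_copy_is_truncated(const char *name)
{
	char buf[8];
	ssize_t ret = demo_strscpy(buf, name, sizeof(buf));

	assert(buf[strlen(buf)] == '\0' && strlen(buf) < sizeof(buf));
	return ret < 0;
}
```

Passing sizeof(buf) directly keeps the last byte available for the terminator, because the copy routine itself reserves it.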
    
    Signed-off-by: David Arinzon <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    davidarinzon authored and kuba-moo committed May 13, 2024
    97776ca
  36. net: ena: Change initial rx_usec interval

    For the purpose of obtaining better CPU utilization,
    minimum rx moderation interval is set to 20 usec.
    
    Signed-off-by: Osama Abboud <[email protected]>
    Signed-off-by: David Arinzon <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    davidarinzon authored and kuba-moo committed May 13, 2024
    1cc0a47
  37. Merge branch 'ena-driver-changes-may-2024'

    David Arinzon says:
    
    ====================
    ENA driver changes May 2024
    
    This patchset contains several misc and minor
    changes to the ENA driver.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    Commit 9af9b89
  38. net: gro: use cb instead of skb->network_header

    This patch converts references of skb->network_header to napi_gro_cb's
    network_offset and inner_network_offset.
    
    Signed-off-by: Richard Gobert <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Richard Gobert authored and kuba-moo committed May 13, 2024
    Commit 186b1ea
  39. net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment
    
    {inet,ipv6}_gro_receive functions perform flush checks (ttl, flags,
    iph->id, ...) against all packets in a loop. These flush checks are used in
    all merging UDP and TCP flows.
    
    These checks need to be done only once and only against the found p skb,
    since they only affect flush and not same_flow.
    
    This patch leverages correct network header offsets from the cb for both
    outer and inner network headers - allowing these checks to be done only
    once, in tcp_gro_receive and udp_gro_receive_segment. As a result,
    NAPI_GRO_CB(p)->flush is not used at all. In addition, flush_id checks are
    more declarative and contained in inet_gro_flush, thus removing the need
    for flush_id in napi_gro_cb.
    
    This results in less parsing code for non-loop flush tests for TCP and UDP
    flows.
    
    To make sure results are not within noise range - I've made netfilter drop
    all TCP packets, and measured CPU performance in GRO (in this case GRO is
    responsible for about 50% of the CPU utilization).
    
    perf top while replaying 64 parallel IP/TCP streams merging in GRO:
    (gro_receive_network_flush is compiled inline to tcp_gro_receive)
    net-next:
            6.94% [kernel] [k] inet_gro_receive
            3.02% [kernel] [k] tcp_gro_receive
    
    patch applied:
            4.27% [kernel] [k] tcp_gro_receive
            4.22% [kernel] [k] inet_gro_receive
    
    perf top while replaying 64 parallel IP/IP/TCP streams merging in GRO (same
    results for any encapsulation, in this case inet_gro_receive is top
    offender in net-next)
    net-next:
            10.09% [kernel] [k] inet_gro_receive
            2.08% [kernel] [k] tcp_gro_receive
    
    patch applied:
            6.97% [kernel] [k] inet_gro_receive
            3.68% [kernel] [k] tcp_gro_receive
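
Since the flush work moves from inet_gro_receive into tcp_gro_receive, comparing the two functions' combined share is one way to read the figures above. The sketch below only does that arithmetic on the numbers quoted in this message; the dictionary layout and function names are illustrative assumptions.

```python
# perf top shares (percent of CPU) copied from the commit message above.
samples = {
    "IP/TCP": {
        "net-next": {"inet_gro_receive": 6.94, "tcp_gro_receive": 3.02},
        "patched": {"inet_gro_receive": 4.22, "tcp_gro_receive": 4.27},
    },
    "IP/IP/TCP": {
        "net-next": {"inet_gro_receive": 10.09, "tcp_gro_receive": 2.08},
        "patched": {"inet_gro_receive": 6.97, "tcp_gro_receive": 3.68},
    },
}


def combined_share(entry):
    """Sum the per-function shares of one perf run, rounded to 2 decimals."""
    return round(sum(entry.values()), 2)


def summarize(data):
    """Map each stream type to (before, after, saving) in percentage points."""
    result = {}
    for stream, runs in data.items():
        before = combined_share(runs["net-next"])
        after = combined_share(runs["patched"])
        result[stream] = (before, after, round(before - after, 2))
    return result


for stream, (before, after, saving) in summarize(samples).items():
    print(f"{stream}: {before}% -> {after}% (saves {saving} points)")
```

In both the plain and the encapsulated case the combined share drops by roughly 1.5 percentage points.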
    
    Signed-off-by: Richard Gobert <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Richard Gobert authored and kuba-moo committed May 13, 2024
    Commit 4b0ebbc
  40. selftests/net: add flush id selftests

    Added flush id selftests to test different cases where DF flag is set or
    unset and id value changes in the following packets. All cases where the
    packets should coalesce or should not coalesce are tested.
    
    Signed-off-by: Richard Gobert <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Richard Gobert authored and kuba-moo committed May 13, 2024
    Commit bc21fae
  41. Merge branch 'net-gro-remove-network_header-use-move-p-flush-flush_id-calculations-to-l4'
    
    Richard Gobert says:
    
    ====================
    net: gro: remove network_header use, move p->{flush/flush_id} calculations to L4
    
    The cb fields network_offset and inner_network_offset are used instead of
    skb->network_header throughout GRO.
    
    These fields are then leveraged in the next commit to remove flush_id state
    from napi_gro_cb, and stateful code in {ipv6,inet}_gro_receive which may be
    unnecessarily complicated due to encapsulation support in GRO. These fields
    are checked in L4 instead.
    
    3rd patch adds tests for different flush_id flows in GRO.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    Commit e6e4357
  42. tcp: socket option to check for MPTCP fallback to TCP

    A way for an application to know if an MPTCP connection fell back to TCP
    is to use getsockopt(MPTCP_INFO) and look for errors. The issue with
    this technique is that the same errors -- EOPNOTSUPP (IPv4) and
    ENOPROTOOPT (IPv6) -- are returned if there was a fallback, *or* if the
    kernel doesn't support this socket option. The userspace then has to
    look at the kernel version to understand what the errors mean.
    
    It is not clean, and it doesn't take into account older kernels where
    the socket option has been backported. A cleaner way would be to expose
    this info to the TCP socket level. In case of MPTCP socket where no
    fallback happened, the socket options for the TCP level will be handled
    in MPTCP code, in mptcp_getsockopt_sol_tcp(). If not, that will be in
    TCP code, in do_tcp_getsockopt(). So MPTCP simply has to set the value
    1, while TCP has to set 0.
    
    If the socket option is not supported, one of these two errors will be
    reported:
    - EOPNOTSUPP (95 - Operation not supported) for MPTCP sockets
    - ENOPROTOOPT (92 - Protocol not available) for TCP sockets, e.g. on the
      socket received after an 'accept()', when the client didn't request to
      use MPTCP: this socket will be a TCP one, even if the listen socket
      was an MPTCP one.
    
    With this new option, the kernel can return a clear answer to both "Is
    this kernel new enough to tell me the fallback status?" and "If it is
    new enough, is it currently a TCP or MPTCP socket?" questions, while not
    breaking the previous method.
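
A user-space check along these lines might look like the sketch below. The message above does not name the option; `TCP_IS_MPTCP` with value 43 is an assumption taken from recent `linux/tcp.h` headers, and `mptcp_status` is a hypothetical helper.

```python
import errno
import socket

# Assumption: neither the option name nor its value appears in the commit
# message above; 43 is the value recent uapi linux/tcp.h headers use.
TCP_IS_MPTCP = 43


def mptcp_status(sock) -> str:
    """Return "mptcp", "tcp", or "unsupported" for a connected socket.

    "unsupported" means the kernel does not know the option, so the older
    getsockopt(MPTCP_INFO) method is still needed.
    """
    try:
        value = sock.getsockopt(socket.IPPROTO_TCP, TCP_IS_MPTCP)
    except OSError as exc:
        # EOPNOTSUPP is reported for MPTCP sockets, ENOPROTOOPT for TCP ones.
        if exc.errno in (errno.EOPNOTSUPP, errno.ENOPROTOOPT):
            return "unsupported"
        raise
    # MPTCP code answers 1 (no fallback happened), plain TCP code answers 0.
    return "mptcp" if value else "tcp"
```

The helper only needs an object with a `getsockopt()` method, so it can be exercised with a stub as well as with a real connected socket.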
    
    Acked-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://lore.kernel.org/r/20240509-upstream-net-next-20240509-mptcp-tcp_is_mptcp-v1-1-f846df999202@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 13, 2024
    Commit c084ebd
  43. netdev: Add queue stats for TX stop and wake

    TX queue stop and wake are counted by some drivers.
    Support reporting these via netdev-genl queue stats.
    
    Signed-off-by: Daniel Jurgens <[email protected]>
    Reviewed-by: Jiri Pirko <[email protected]>
    Reviewed-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Daniel Jurgens authored and kuba-moo committed May 13, 2024
    Commit b560351
  44. virtio_net: Add TX stopped and wake counters

    Add TX queue stop and wake counters; they are useful for debugging.
    
    $ ./tools/net/ynl/cli.py --spec netlink/specs/netdev.yaml \
    --dump qstats-get --json '{"scope": "queue"}'
    ...
     {'ifindex': 13,
      'queue-id': 0,
      'queue-type': 'tx',
      'tx-bytes': 14756682850,
      'tx-packets': 226465,
      'tx-stop': 113208,
      'tx-wake': 113208},
     {'ifindex': 13,
      'queue-id': 1,
      'queue-type': 'tx',
      'tx-bytes': 18167675008,
      'tx-packets': 278660,
      'tx-stop': 8632,
      'tx-wake': 8632}]
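
As a rough illustration of how these counters might be used when debugging, the sketch below post-processes the two records shown above. The "stops per 1000 packets" metric and the helper name are assumptions for illustration, not something the driver or ynl reports.

```python
# Records copied from the qstats-get dump shown above.
queues = [
    {"ifindex": 13, "queue-id": 0, "queue-type": "tx",
     "tx-bytes": 14756682850, "tx-packets": 226465,
     "tx-stop": 113208, "tx-wake": 113208},
    {"ifindex": 13, "queue-id": 1, "queue-type": "tx",
     "tx-bytes": 18167675008, "tx-packets": 278660,
     "tx-stop": 8632, "tx-wake": 8632},
]


def stop_summary(queue):
    """Summarize how often a TX queue was stopped relative to its traffic."""
    return {
        "queue-id": queue["queue-id"],
        # Equal counters mean every stop has been followed by a wake.
        "balanced": queue["tx-stop"] == queue["tx-wake"],
        "stops-per-1k-packets": 1000 * queue["tx-stop"] / queue["tx-packets"],
    }


for q in queues:
    s = stop_summary(q)
    print(f"queue {s['queue-id']}: balanced={s['balanced']}, "
          f"{s['stops-per-1k-packets']:.1f} stops per 1000 packets")
```

A queue whose stop count stays ahead of its wake count for a long time would be the first place to look for a stalled transmit path.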
    
    Signed-off-by: Daniel Jurgens <[email protected]>
    Reviewed-by: Jiri Pirko <[email protected]>
    Reviewed-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Daniel Jurgens authored and kuba-moo committed May 13, 2024
    Commit c39add9
  45. Merge branch 'add-tx-stop-wake-counters'

    Daniel Jurgens says:
    
    ====================
    Add TX stop/wake counters
    
    Several drivers provide TX stop and wake counters via ethtool stats. Add
    those to the netdev queue stats, and use them in virtio_net.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    Commit e5a2802
  46. ynl: ensure exact-len value is resolved

    For type String and Binary we are currently using the exact-len
    limit value as is, without attempting any name resolution.
    However, the spec may specify the name of a constant rather than an
    actual value, which would result in using the constant name as is
    and thus break the policy.
    
    Ensure the limit value is passed to get_limit(), which will always
    attempt resolving the name before printing the policy rule.
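
The idea can be sketched as follows. This is not the generator's code: `resolve_limit`, `exact_len_rule`, and the `constants` table are hypothetical names used only to show why a limit must be resolved before it is printed into a policy rule.

```python
def resolve_limit(value, constants):
    """Return an integer limit, resolving a named constant when needed.

    `constants` is a hypothetical mapping of spec-defined constant names
    to their numeric values.
    """
    if isinstance(value, int):
        return value
    if value in constants:
        return constants[value]
    raise KeyError(f"unknown constant used as limit: {value!r}")


def exact_len_rule(checks, constants):
    """Build a printable rule from an attribute's checks, after resolution."""
    limit = resolve_limit(checks["exact-len"], constants)
    return f"exact-len={limit}"


consts = {"max-name-len": 16}
print(exact_len_rule({"exact-len": 8}, consts))
print(exact_len_rule({"exact-len": "max-name-len"}, consts))
```

Without the resolution step the second call would print the constant's name from the spec instead of a usable value, which is the breakage the message describes.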
    
    Signed-off-by: Antonio Quartulli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    ordex authored and kuba-moo committed May 13, 2024
    Commit ec8c257
  47. l2tp: Support different protocol versions with same IP/port quadruple

    628bc3e ("l2tp: Support several sockets with same IP/port quadruple")
    added support for several L2TPv2 tunnels using the same IP/port quadruple,
    but if an L2TPv3 socket exists it could eat all the traffic. We thus have to
    first use the version from the packet to get the proper tunnel, and only
    then check that the version matches.
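
The lookup order described above can be sketched as follows. It assumes the L2TP version sits in the low 4 bits of the first 16-bit header word; the per-version tunnel table and the function names are hypothetical and much simpler than the kernel's data structures.

```python
import struct

L2TP_HDR_VER_MASK = 0x000F  # version: low 4 bits of the first 16-bit word


def l2tp_version(payload: bytes) -> int:
    """Extract the protocol version from the start of an L2TP header."""
    (flags_ver,) = struct.unpack_from("!H", payload, 0)
    return flags_ver & L2TP_HDR_VER_MASK


def select_tunnel(payload: bytes, tunnels_by_version: dict):
    """Pick the tunnel for a packet among tunnels sharing one IP/port quadruple.

    The version is read from the packet first and used to choose the
    candidate; only then is the candidate's own version checked.
    """
    if len(payload) < 2:
        return None
    version = l2tp_version(payload)
    tunnel = tunnels_by_version.get(version)
    if tunnel is None or tunnel["version"] != version:
        return None
    return tunnel


tunnels = {
    2: {"name": "v2-tunnel", "version": 2},
    3: {"name": "v3-tunnel", "version": 3},
}
print(select_tunnel(b"\xc8\x02\x00\x0c", tunnels))
```

With this ordering an L2TPv3 socket on the same quadruple no longer receives packets whose header says version 2.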
    
    Signed-off-by: Samuel Thibault <[email protected]>
    Reviewed-by: James Chapman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    sthibaul authored and kuba-moo committed May 13, 2024
    Commit 3647980
  48. net: dsa: microchip: dcb: rename IPV to IPM

    IPV is a term added by 802.1Qci PSFP and later merged into 802.1Q
    (from 802.1Q-2018), where it is used for other functions.
    
    Even though the KSZ feature does a similar job, holding a temporal
    priority value internally (as the name suggests), the KSZ datasheet
    doesn't use the term IPV (Internal Priority Value). To avoid confusion
    later, when PSFP is in the Linux world, it is better to rename IPV to
    IPM (Internal Priority Mapping).
    
    In addition, the LAN937x documentation already uses IPV for 802.1Qci
    PSFP related functionality.
    
    Suggested-by: Woojung Huh <[email protected]>
    Signed-off-by: Oleksij Rempel <[email protected]>
    Reviewed-by: Woojung Huh <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    olerem authored and kuba-moo committed May 13, 2024
    Commit 2ccb1ac
  49. net: dsa: microchip: dcb: add comments for DSCP related functions

    All other functions are commented. Add the missing comments to the
    following functions:
    ksz_set_global_dscp_entry()
    ksz_port_add_dscp_prio()
    ksz_port_del_dscp_prio()
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    olerem authored and kuba-moo committed May 13, 2024
    Commit 593d6ad
  50. net: dsa: microchip: dcb: set default apptrust to PCP only

    Before DCB support, the KSZ driver had only PCP as a source of packet
    priority values. To avoid regressions, make PCP-only the default value.
    Users will need to enable DSCP support manually.
    
    This patch does not affect other KSZ8 related quirks. Users will still
    be warned when setting unsupported configurations for port 2.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Acked-by: Arun Ramadoss <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    olerem authored and kuba-moo committed May 13, 2024
    Commit 01e400f
  51. Merge branch 'net-dsa-microchip-dcb-fixes'

    Oleksij Rempel says:
    
    ====================
    net: dsa: microchip: DCB fixes
    
    This patch series addresses the recommendation to rename IPV to IPM to
    avoid confusion with the IPV name used in 802.1Qci PSFP. It also restores
    the default "PCP only" configuration as the source of priorities to avoid
    possible regressions.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    Commit ef318fc
  52. test: hsr: Extend the hsr_redbox.sh to have more SAN devices connected

    After this change the single SAN device (ns3eth1) is now replaced with
    two SAN devices - respectively ns4eth1 and ns5eth1.
    
    It is possible to extend this script to have more SAN devices connected
    by adding them to ns3br1 bridge.
    
    Signed-off-by: Lukasz Majewski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Lukasz Majewski authored and kuba-moo committed May 13, 2024
    Commit eafbf05
  53. net/mlx5: Enable 8 ports LAG

    This patch adds support for 8-port HCAs to the mlx5 drivers.
    Starting with ConnectX-8, HCAs with 8 ports are possible.
    
    As most driver parts aren't affected by such a configuration, most
    driver code is unchanged.
    
    Specifically, the only affected areas are:
    - Lag
    - Multiport E-Switch
    - Single FDB E-Switch
    
    All of the above are already factored in a generic way, and LAG and VF
    LAG are tested, so all that is left is to change a #define and remove
    checks which are no longer needed.
    However, Multiport E-Switch is not tested yet, so it is left untouched.
    
    This patch allows creating a hardware LAG/VF LAG when all 8 ports are
    added to the same bond device.
    
    For example, to activate the hardware LAG a user can execute the
    following:
    
    ip link add bond0 type bond
    ip link set bond0 type bond miimon 100 mode 2
    ip link set eth2 master bond0
    ip link set eth3 master bond0
    ip link set eth4 master bond0
    ip link set eth5 master bond0
    ip link set eth6 master bond0
    ip link set eth7 master bond0
    ip link set eth8 master bond0
    ip link set eth9 master bond0
    
    Where eth2, eth3, eth4, eth5, eth6, eth7, eth8 and eth9 are the PFs of
    the same HCA.
    
    Signed-off-by: Shay Drory <[email protected]>
    Reviewed-by: Mark Bloch <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    shayshyi authored and kuba-moo committed May 13, 2024
    Commit e0e6adf
  54. net/mlx5e: Modifying channels number and updating TX queues

    It is not appropriate for the mlx5e_num_channels_changed
    function to be called solely for updating the TX queues,
    even if the channels number has not been changed.
    
    Move the code responsible for updating the TC and TX queues
    from mlx5e_num_channels_changed and produce a new function
    called mlx5e_update_tc_and_tx_queues. This new function should
    only be called when the channels number remains unchanged.
    
    Signed-off-by: Carolina Jubran <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    cjubran authored and kuba-moo committed May 13, 2024
    Commit bcee093
  55. net/mlx5: Remove unused msix related exported APIs

    MSIX irq allocation and free APIs are no longer
    in use. Hence, remove the dead code.
    
    Signed-off-by: Parav Pandit <[email protected]>
    Reviewed-by: Dragos Tatulea <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    paravmellanox authored and kuba-moo committed May 13, 2024
    Commit db5944e
  56. Merge branch 'mlx5-misc-patches'

    Tariq Toukan says:
    
    ====================
    mlx5 misc patches
    
    This series includes patches for the mlx5 driver.
    
    Patch 1 by Shay enables LAG with HCAs of 8 ports.
    
    Patch 2 by Carolina optimizes the safe switch channels operation for the
    TX-only changes.
    
    Patch 3 by Parav cleans up some unused code.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    Commit d20e391
  57. net: pcs: lynx: no need to read LPA in lynx_pcs_get_state_2500basex()

    Nothing useful is done with the LPA variable in lynx_pcs_get_state_2500basex(),
    we can just remove the read.
    
    Signed-off-by: Vladimir Oltean <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    vladimiroltean authored and kuba-moo committed May 13, 2024
    Commit afd29f3
  58. Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
    
    Daniel Borkmann says:
    
    ====================
    pull-request: bpf-next 2024-05-13
    
    We've added 119 non-merge commits during the last 14 day(s) which contain
    a total of 134 files changed, 9462 insertions(+), 4742 deletions(-).
    
    The main changes are:
    
    1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi.
    
    2) Add BPF range computation improvements to the verifier in particular
       around XOR and OR operators, refactoring of checks for range computation
       and relaxing MUL range computation so that src_reg can also be an unknown
       scalar, from Cupertino Miranda.
    
    3) Add support to attach kprobe BPF programs through kprobe_multi link in
       a session mode, meaning, a BPF program is attached to both function entry
       and return, the entry program can decide if the return program gets
       executed and the entry program can share u64 cookie value with return
       program. Session mode is a common use-case for tetragon and bpftrace,
       from Jiri Olsa.
    
    4) Fix a potential overflow in libbpf's ring__consume_n() and improve libbpf
       as well as BPF selftest's struct_ops handling, from Andrii Nakryiko.
    
    5) Improvements to BPF selftests in context of BPF gcc backend,
       from Jose E. Marchesi & David Faust.
    
    6) Migrate remaining BPF selftest tests from test_sock_addr.c to
       prog_test-style in order to retire the old test, run it in BPF CI and
       additionally expand test coverage, from Jordan Rife.
    
    7) Big batch for BPF selftest refactoring in order to remove duplicate code
       around common network helpers, from Geliang Tang.
    
    8) Another batch of improvements to BPF selftests to retire obsolete
       bpf_tcp_helpers.h as everything is available in vmlinux.h,
       from Martin KaFai Lau.
    
    9) Fix BPF map tear-down to not walk the map twice on free when both timer
       and wq are used, from Benjamin Tissoires.
    
    10) Fix the BPF verifier's assumption that socket->sk is always non-NULL,
        from Alexei Starovoitov.
    
    11) Change BTF build scripts to use --btf_features for pahole v1.26+,
        from Alan Maguire.
    
    12) Small improvements to BPF reusing struct_size() and krealloc_array(),
        from Andy Shevchenko.
    
    13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions,
        from Ilya Leoshkevich.
    
    14) Extend TCP ->cong_control() callback in order to feed in ack and
        flag parameters and allow write-access to tp->snd_cwnd_stamp
        from BPF program, from Miao Xu.
    
    15) Add support for internal-only per-CPU instructions to inline
        bpf_get_smp_processor_id() helper call for arm64 and riscv64 BPF JITs,
        from Puranjay Mohan.
    
    16) Follow-up to remove the redundant ethtool.h from tooling infrastructure,
        from Tushar Vyavahare.
    
    17) Extend libbpf to support "module:<function>" syntax for tracing
        programs, from Viktor Malik.
    
    * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
      bpf: make list_for_each_entry portable
      bpf: ignore expected GCC warning in test_global_func10.c
      bpf: disable strict aliasing in test_global_func9.c
      selftests/bpf: Free strdup memory in xdp_hw_metadata
      selftests/bpf: Fix a few tests for GCC related warnings.
      bpf: avoid gcc overflow warning in test_xdp_vlan.c
      tools: remove redundant ethtool.h from tooling infra
      selftests/bpf: Expand ATTACH_REJECT tests
      selftests/bpf: Expand getsockname and getpeername tests
      sefltests/bpf: Expand sockaddr hook deny tests
      selftests/bpf: Expand sockaddr program return value tests
      selftests/bpf: Retire test_sock_addr.(c|sh)
      selftests/bpf: Remove redundant sendmsg test cases
      selftests/bpf: Migrate ATTACH_REJECT test cases
      selftests/bpf: Migrate expected_attach_type tests
      selftests/bpf: Migrate wildcard destination rewrite test
      selftests/bpf: Migrate sendmsg6 v4 mapped address tests
      selftests/bpf: Migrate sendmsg deny test cases
      selftests/bpf: Migrate WILDCARD_IP test
      selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
      ...
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 13, 2024
    Commit 6e62702

Commits on May 14, 2024

  1. virtio_ring: enable premapped mode whatever use_dma_api

    Now that we have the virtio DMA APIs, the driver can use premapped
    mode whether or not the virtio core uses the DMA API.
    
    So remove the use_dma_api check from
    virtqueue_set_dma_premapped().
    
    Signed-off-by: Xuan Zhuo <[email protected]>
    Acked-by: Jason Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    fengidri authored and kuba-moo committed May 14, 2024
    Commit f9dac92
  2. virtio_net: big mode skip the unmap check

    The virtio-net big mode did not enable premapped mode,
    so we do not need to check the unmap there. A subsequent
    commit will remove the failover code that handles a failure to
    enable premapped mode for merge and small mode. So we need to
    remove the do_dma check in the big mode path.
    
    Signed-off-by: Xuan Zhuo <[email protected]>
    Acked-by: Jason Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    fengidri authored and kuba-moo committed May 14, 2024
    Commit a377ae5
  3. virtio_net: rx remove premapped failover code

    Now, the premapped mode can be enabled unconditionally.
    
    So we can remove the failover code for merge and small mode.
    
    Signed-off-by: Xuan Zhuo <[email protected]>
    Acked-by: Jason Wang <[email protected]>
    Reviewed-by: Larysa Zaremba <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    fengidri authored and kuba-moo committed May 14, 2024
    Commit defd28a
  4. virtio_net: remove the misleading comment

    We call the build_skb() actually without copying data.
    The comment is misleading. So remove it.
    
    Signed-off-by: Xuan Zhuo <[email protected]>
    Acked-by: Jason Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    fengidri authored and kuba-moo committed May 14, 2024
    Commit 9719f03
  5. Merge branch 'virtio_net-rx-enable-premapped-mode-by-default'

    Xuan Zhuo says:
    
    ====================
    virtio_net: rx enable premapped mode by default
    
    Actually, for the virtio drivers, we can enable premapped mode whatever
    the value of use_dma_api, because we provide the virtio DMA APIs.
    So the driver can enable premapped mode unconditionally.
    
    This patch set makes the big mode of virtio-net support premapped mode
    and enables premapped mode for rx by default.
    
    Based on the following points, we do not use page pool to manage these
        pages:
    
        1. virtio-net uses the DMA APIs wrapped by virtio core. Therefore,
           we can only prevent the page pool from performing DMA operations, and
           let the driver perform DMA operations on the allocated pages.
        2. But when the page pool releases the page, we have no chance to
           execute dma unmap.
        3. A solution to #2 is to execute dma unmap every time before putting
           the page back to the page pool. (This is actually a waste, we don't
           execute unmap so frequently.)
        4. But there is another problem, we still need to use page.dma_addr to
           save the dma address. Using page.dma_addr while using page pool is
           unsafe behavior.
        5. And we need space to chain the pages submitted at once to the virtio core.
    
        More:
            https://lore.kernel.org/all/CACGkMEu=Aok9z2imB_c5qVuujSh=vjj1kx12fy9N7hqyi+M5Ow@mail.gmail.com/
    
    Why do we not use the page space to store the DMA address?
    
        http://lore.kernel.org/all/CACGkMEuyeJ9mMgYnnB42=hw6umNuo=agn7VBqBqYPd7GN=+39Q@mail.gmail.com
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 14, 2024
    Commit f4edb4d
  6. net: qede: flower: validate control flags

    This driver currently doesn't support any control flags.
    
    Use flow_rule_match_has_control_flags() to check for control flags,
    such as can be set through `tc flower ... ip_flags frag`.
    
    In case any control flags are masked, flow_rule_match_has_control_flags()
    sets a NL extended error message, and we return -EOPNOTSUPP.
    
    Only compile-tested.
    
    Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Asbjørn Sloth Tønnesen authored and kuba-moo committed May 14, 2024
    Commit 486ffc3
  7. dt-bindings: net: renesas,rzn1-gmac: Document RZ/N1 GMAC support

    The RZ/N1 series of MPUs feature up to two Gigabit Ethernet controllers.
    These controllers are based on Synopsys IPs. They can be connected to
    RZ/N1 RGMII/RMII converters.
    
    Add a binding that describes these GMAC devices.
    
    Signed-off-by: Clément Léger <[email protected]>
    [rgantois: commit log]
    Reviewed-by: Rob Herring <[email protected]>
    Reviewed-by: Geert Uytterhoeven <[email protected]>
    Signed-off-by: Romain Gantois <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    clementleger authored and kuba-moo committed May 14, 2024
    Commit ab55887
  8. net: stmmac: Add dedicated XPCS cleanup method

    Currently the XPCS handler destruction is performed in the
    stmmac_mdio_unregister() method. It doesn't look good because the handler
    isn't originally created in the corresponding counterpart
    stmmac_mdio_register(), but in the stmmac_xpcs_setup() function. In
    order to have more coherent MDIO and XPCS setup/cleanup procedures,
    let's move the DW XPCS destruction to the dedicated stmmac_pcs_clean()
    method.
    
    This method will also be used to cleanup PCS hardware using the
    pcs_exit() callback that will be introduced to stmmac in a subsequent
    patch.
    
    Signed-off-by: Serge Semin <[email protected]>
    Co-developed-by: Romain Gantois <[email protected]>
    Signed-off-by: Romain Gantois <[email protected]>
    Reviewed-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    fancer authored and kuba-moo committed May 14, 2024
    d5c5093
  9. net: stmmac: Make stmmac_xpcs_setup() generic to all PCS devices

    A pcs_init() callback will be introduced to stmmac in a future patch. This
    new function will be called during the hardware initialization phase.
    Instead of separately initializing XPCS and PCS components, let's group all
    PCS-related hardware initialization logic in the current
    stmmac_xpcs_setup() function.
    
    Rename stmmac_xpcs_setup() to stmmac_pcs_setup() and move the conditional
    call to stmmac_xpcs_setup() inside the function itself.
    
    Signed-off-by: Serge Semin <[email protected]>
    Co-developed-by: Romain Gantois <[email protected]>
    Signed-off-by: Romain Gantois <[email protected]>
    Reviewed-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    fancer authored and kuba-moo committed May 14, 2024
    f9cdff1
  10. net: stmmac: introduce pcs_init/pcs_exit stmmac operations

    Introduce a mechanism whereby platforms can create their PCS instances
    prior to the network device being published to userspace, but after
    some of the core stmmac initialisation has been completed. This means
    that the data structures that platforms need will be available.
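
The ordering described above can be sketched in plain userspace C. This is only an illustration of the idea, not the stmmac code: every name here (demo_priv, demo_plat_ops, demo_probe, demo_remove) is hypothetical, and the only thing taken from the commit message is that the PCS is created after core initialisation and before the device is published.

```c
#include <stddef.h>

struct demo_priv;

/* Hypothetical platform hooks, loosely modelled on the pcs_init()/pcs_exit() idea. */
struct demo_plat_ops {
    int  (*pcs_init)(struct demo_priv *priv);  /* optional: create the PCS instance */
    void (*pcs_exit)(struct demo_priv *priv);  /* optional: destroy the PCS instance */
};

struct demo_priv {
    const struct demo_plat_ops *ops;
    int core_ready;  /* core data structures are initialised */
    int pcs_ready;   /* PCS instance exists */
    int published;   /* device is visible to userspace */
};

/* Core init first, then the platform PCS hook, and only then publish. */
static int demo_probe(struct demo_priv *priv)
{
    int ret;

    priv->core_ready = 1;
    if (priv->ops && priv->ops->pcs_init) {
        ret = priv->ops->pcs_init(priv);
        if (ret) {
            priv->core_ready = 0;
            return ret;  /* never publish a half-initialised device */
        }
    }
    priv->published = 1;
    return 0;
}

/* Teardown mirrors probe: unpublish, then let the platform drop its PCS. */
static void demo_remove(struct demo_priv *priv)
{
    priv->published = 0;
    if (priv->ops && priv->ops->pcs_exit)
        priv->ops->pcs_exit(priv);
    priv->core_ready = 0;
}

/* Example platform callbacks (hypothetical). */
static int demo_pcs_init(struct demo_priv *priv)
{
    if (!priv->core_ready)
        return -1;  /* the hook relies on core state being available */
    priv->pcs_ready = 1;
    return 0;
}

static void demo_pcs_exit(struct demo_priv *priv)
{
    priv->pcs_ready = 0;
}

static const struct demo_plat_ops demo_ops = {
    .pcs_init = demo_pcs_init,
    .pcs_exit = demo_pcs_exit,
};
```

Making both hooks optional keeps platforms without a custom PCS unaffected in this sketch.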
    
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Maxime Chevallier <[email protected]>
    Reviewed-by: Serge Semin <[email protected]>
    Co-developed-by: Romain Gantois <[email protected]>
    Signed-off-by: Romain Gantois <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Russell King (Oracle) authored and kuba-moo committed May 14, 2024
    f0ef433
  11. net: stmmac: dwmac-socfpga: use pcs_init/pcs_exit

    Use the newly introduced pcs_init() and pcs_exit() operations to
    create and destroy the PCS instance at a more appropriate moment during
    the driver lifecycle, thereby avoiding publishing a network device to
    userspace that has not yet finished its PCS initialisation.
    
    There are other similar issues with this driver which remain
    unaddressed, but these are out of scope for this patch.
    
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Maxime Chevallier <[email protected]>
    [rgantois: removed second parameters of new callbacks]
    Signed-off-by: Romain Gantois <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Russell King (Oracle) authored and kuba-moo committed May 14, 2024
    81b418a
  12. net: stmmac: add support for RZ/N1 GMAC

    Add support for the Renesas RZ/N1 GMAC. This support can make use of a
    custom RZ/N1 PCS which is fetched by parsing the pcs-handle device tree
    property.
    
    Signed-off-by: Clément Léger <[email protected]>
    Co-developed-by: Romain Gantois <[email protected]>
    Signed-off-by: Romain Gantois <[email protected]>
    Reviewed-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Hariprasad Kelam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    clementleger authored and kuba-moo committed May 14, 2024
    f360446
  13. Merge branch 'net-stmmac-add-support-for-rzn1-gmac-devices'

    Romain Gantois says:
    
    ====================
    net: stmmac: Add support for RZN1 GMAC devices
    
    This is version seven of my series that adds support for a Gigabit Ethernet
    controller featured in the Renesas r9a06g032 SoC, of the RZ/N1 family. This
    GMAC device is based on a Synopsys IP and is compatible with the stmmac driver.
    
    My former colleague Clément Léger originally sent a series for this driver,
    but an issue in bringing up the PCS clock had blocked the upstreaming
    process. This issue has since been resolved by the following series:
    
    https://lore.kernel.org/all/[email protected]/
    
    This series consists of a devicetree binding describing the RZN1 GMAC
    controller IP, a node for the GMAC1 device in the r9a06g032 SoC device
    tree, and the GMAC driver itself which is a glue layer in stmmac.
    
    There are also two patches by Russell that improve pcs initialization handling
    in stmmac.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 14, 2024
    0621be4
  14. tcp: rstreason: fully support in tcp_rcv_synsent_state_process()

    In this function, updating the map is all that is needed to support the
    socket reset reason, because the corresponding drop reasons already exist.
    
    Signed-off-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    JasonXing authored and kuba-moo committed May 14, 2024
    2b9669d
  15. tcp: rstreason: fully support in tcp_ack()

    Based on the existing skb drop reason, updating the rstreason map can
    help us finish the rstreason job in this function.
    
    Signed-off-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    JasonXing authored and kuba-moo committed May 14, 2024
    459a2b3
  16. tcp: rstreason: fully support in tcp_rcv_state_process()

    As in the previous patch in this series, completing the conversion map is
    enough to make the rstreason mechanism work in this function.
    
    Signed-off-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    JasonXing authored and kuba-moo committed May 14, 2024
    f6d5e2c
  17. tcp: rstreason: handle timewait cases in the receive path

    There are two possible cases where TCP layer can send an RST. Since they
    happen in the same place, I think using one independent reason is enough
    to identify this special situation.
    
    Signed-off-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    JasonXing authored and kuba-moo committed May 14, 2024
    Configuration menu
    Copy the full SHA
    22a3255 View commit details
    Browse the repository at this point in the history
  18. tcp: rstreason: fully support in tcp_check_req()

    We're going to send an RST due to an invalid SYN packet, which has already
    been checked for whether 1) it is in sequence, 2) it is a retransmitted skb.
    
    As RFC 793 says, if the state of the socket is not CLOSED/LISTEN/SYN-SENT,
    then we should send an RST when receiving a bad SYN packet:
    "fourth, check the SYN bit,...If the SYN is in the window it is an
    error, send a reset"
    
    Signed-off-by: Jason Xing <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    JasonXing authored and kuba-moo committed May 14, 2024
    11f46ea
  19. Merge branch 'tcp-support-rstreasons-in-the-passive-logic'

    Jason Xing says:
    
    ====================
    tcp: support rstreasons in the passive logic
    
    In this series, I split all kinds of reasons into five parts which,
    I think, can be easily reviewed. I implement the corresponding
    rstreasons in each of those functions. After this, we can trace the whole
    TCP passive reset with clear reasons.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 14, 2024
    a6fb986
  20. net: prestera: Add flex arrays to some structs

    The "struct prestera_msg_vtcam_rule_add_req" uses a dynamically sized
    set of trailing elements. Specifically, it uses an array of structures
    of type "prestera_msg_acl_action actions_msg".
    
    The "struct prestera_msg_flood_domain_ports_set_req" also uses a
    dynamically sized set of trailing elements. Specifically, it uses an
    array of structures of type "prestera_msg_acl_action actions_msg".
    
    So, use the preferred way in the kernel of declaring flexible arrays [1].
    
    At the same time, prepare for the coming implementation by GCC and Clang
    of the __counted_by attribute. Flexible array members annotated with
    __counted_by can have their accesses bounds-checked at run-time via
    CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for
    strcpy/memcpy-family functions). In this case, it is important to note
    that the attribute used is specifically __counted_by_le since the
    counters are of type __le32.
    
    The logic does not need to change since the counters for the flexible
    arrays are assigned before any access to the arrays.
    
    The order in which the structure prestera_msg_vtcam_rule_add_req and the
    structure prestera_msg_flood_domain_ports_set_req are defined must be
    changed to avoid incomplete type errors.
    
    Also, avoid the open-coded arithmetic in memory allocator functions [2]
    using the "struct_size" macro.
    
    Moreover, the new structure members also allow us to avoid the open-
    coded arithmetic on pointers. So, take advantage of this refactoring
    accordingly.
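
As a rough userspace illustration of the pattern described above, the sketch below declares a C99 flexible array member and computes the allocation size with an overflow check instead of open-coded arithmetic. The names (demo_action, demo_rule_req, demo_rule_req_size, demo_rule_req_alloc) are hypothetical stand-ins, not the prestera structures, and the size helper only approximates what the kernel's struct_size() does. The __counted_by annotation is mentioned in a comment only, since it is a kernel and compiler feature that this sketch does not rely on.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type carried in the trailing array. */
struct demo_action {
    uint32_t id;
    uint32_t arg;
};

/* Hypothetical request: a counter followed by a flexible array member.
 * In the kernel, actions[] could additionally be annotated with a
 * __counted_by-style attribute naming n_actions. */
struct demo_rule_req {
    uint32_t n_actions;            /* number of valid entries in actions[] */
    struct demo_action actions[];  /* C99 flexible array member */
};

/* Userspace stand-in for struct_size(): header plus n elements,
 * saturating to SIZE_MAX if the computation would overflow. */
static size_t demo_rule_req_size(size_t n)
{
    size_t hdr = sizeof(struct demo_rule_req);

    if (n > (SIZE_MAX - hdr) / sizeof(struct demo_action))
        return SIZE_MAX;
    return hdr + n * sizeof(struct demo_action);
}

/* Allocate a zeroed request; the counter is set before any array access. */
static struct demo_rule_req *demo_rule_req_alloc(uint32_t n)
{
    size_t size = demo_rule_req_size(n);
    struct demo_rule_req *req;

    if (size == SIZE_MAX)
        return NULL;
    req = calloc(1, size);
    if (!req)
        return NULL;
    req->n_actions = n;
    return req;
}
```

Elements are then reached as req->actions[i] rather than through pointer arithmetic past the end of the header.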
    
    This code was detected with the help of Coccinelle, and audited and
    modified manually.
    
    Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1]
    Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2]
    Signed-off-by: Erick Archer <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Kees Cook <[email protected]>
    Link: https://lore.kernel.org/r/AS8PR02MB7237E8469568A59795F1F0408BE12@AS8PR02MB7237.eurprd02.prod.outlook.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Erick Archer authored and kuba-moo committed May 14, 2024
    86348d2
  21. net: mana: Enable MANA driver on ARM64 with 4K page size

    Change the Kconfig dependency, so this driver can be built and run on ARM64
    with 4K page size.
    16/64K page sizes are not supported yet.
    
    Signed-off-by: Haiyang Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    haiyangz authored and kuba-moo committed May 14, 2024
    40a1d11
  22. mptcp: SO_KEEPALIVE: fix getsockopt support

    SO_KEEPALIVE support has to be set on each subflow: on each TCP socket,
    where sk_prot->keepalive is defined. Technically, nothing has to be done
    on the MPTCP socket. That's why mptcp_sol_socket_sync_intval() was
    called instead of mptcp_sol_socket_intval().
    
    Except that when nothing is done on the MPTCP socket, the
    getsockopt(SO_KEEPALIVE), handled in net/core/sock.c:sk_getsockopt(),
    will not know if SO_KEEPALIVE has been set on the different subflows or
    not.
    
    The fix is simple: simply call mptcp_sol_socket_intval() which will end
    up calling net/core/sock.c:sk_setsockopt() where the SOCK_KEEPOPEN flag
    will be set, the one used in sk_getsockopt().
    
    So now, getsockopt(SO_KEEPALIVE) on an MPTCP socket will return the same
    value as the one previously set with setsockopt(SO_KEEPALIVE).
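
From userspace, the behaviour described above amounts to a set-then-get round trip. The sketch below is an assumption-laden illustration: the helper names are hypothetical, IPPROTO_MPTCP is defined locally as 262 in case the libc headers lack it, and the socket falls back to plain TCP when MPTCP is unavailable.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262  /* fallback for older libc headers */
#endif

/* Open an MPTCP socket, falling back to plain TCP where MPTCP is unavailable. */
static int demo_open_stream(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

    if (fd < 0)
        fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    return fd;
}

/* Set SO_KEEPALIVE, then read it back.
 * Returns the value read back, or -1 if either call fails. */
static int demo_keepalive_roundtrip(int fd, int enable)
{
    int val = -1;
    socklen_t len = sizeof(val);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &enable, sizeof(enable)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val;
}
```

On an MPTCP socket from a kernel without this fix, the value read back would not be expected to match the value that was set; on a plain TCP socket it does.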
    
    Fixes: 1b3e7ed ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY")
    Acked-by: Paolo Abeni <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    a651981
  23. mptcp: fix full TCP keep-alive support

    SO_KEEPALIVE support was added a while ago, as part of a series
    "adding SOL_SOCKET" support. To have full control of this keep-alive
    feature, it is important to also support TCP_KEEP* socket options at the
    SOL_TCP level.
    
    Supporting them on the setsockopt() part is easy, it is just a matter of
    remembering each value in the MPTCP sock structure, and calling
    tcp_sock_set_keep*() helpers on each subflow. If the value is not
    modified (0), calling these helpers will not do anything. For the
    getsockopt() part, the corresponding value from the MPTCP sock structure
    or the default one is simply returned. All of this is very similar to
    other TCP_* socket options supported by MPTCP.
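
For context, this is roughly how an application uses the options in question. The sketch is a userspace illustration only: the struct and helper names are hypothetical, and it uses the standard Linux TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options at the IPPROTO_TCP level, which is the same level as SOL_TCP.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical bundle of the three keep-alive tunables. */
struct demo_keepalive {
    int idle;   /* seconds of idle time before the first probe */
    int intvl;  /* seconds between probes */
    int cnt;    /* unanswered probes before the connection is dropped */
};

/* Enable keep-alive and apply the three TCP-level tunables.
 * Returns 0 on success, -1 if any setsockopt() call fails. */
static int demo_set_keepalive(int fd, const struct demo_keepalive *ka)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &ka->idle, sizeof(ka->idle)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &ka->intvl, sizeof(ka->intvl)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &ka->cnt, sizeof(ka->cnt)) < 0)
        return -1;
    return 0;
}

/* Read the three tunables back. Returns 0 on success, -1 on failure. */
static int demo_get_keepalive(int fd, struct demo_keepalive *ka)
{
    socklen_t len = sizeof(ka->idle);

    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &ka->idle, &len) < 0)
        return -1;
    len = sizeof(ka->intvl);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &ka->intvl, &len) < 0)
        return -1;
    len = sizeof(ka->cnt);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &ka->cnt, &len) < 0)
        return -1;
    return 0;
}
```

An application that assumes these calls succeed whenever SO_KEEPALIVE does is the case the commit message refers to.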
    
    It looks important for kernels supporting SO_KEEPALIVE to also support
    TCP_KEEP* options: some apps seem to (wrongly) consider that if
    the former is supported, the latter ones will be supported as well. But
    also, not having this simple and isolated change is preventing MPTCP
    support in some apps, and libraries like GoLang [1]. This is why this
    patch is seen as a fix.
    
    Closes: multipath-tcp/mptcp_net-next#383
    Fixes: 1b3e7ed ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY")
    Link: golang/go#56539 [1]
    Acked-by: Paolo Abeni <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    bd11dc4
  24. mptcp: sockopt: info: stop early if no buffer

    Until recently, it has been recommended to use getsockopt(MPTCP_INFO) to
    check if a fallback to TCP happened, or if the client requested to use
    MPTCP.
    
    In this case, the userspace app is only interested in the return value
    of the getsockopt() call, and can then give 0 for the option length, and
    NULL for the buffer address. An easy optimisation is then to stop early,
    and avoid filling a local buffer -- which now requires two different
    locks -- if it is not needed.
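
The call pattern described above might look like the following from userspace. This is a sketch under assumptions: demo_mptcp_in_use is a hypothetical helper, and SOL_MPTCP (284) and MPTCP_INFO (1) are defined locally in case the installed headers do not provide them.

```c
#include <stddef.h>
#include <sys/socket.h>

#ifndef SOL_MPTCP
#define SOL_MPTCP 284  /* fallback for older headers */
#endif
#ifndef MPTCP_INFO
#define MPTCP_INFO 1   /* fallback for older headers */
#endif

/* Probe whether MPTCP is in use on fd by asking for MPTCP_INFO with a
 * zero-length, NULL buffer: only the return value of the call matters.
 * Returns 1 if the call succeeds, 0 otherwise. */
static int demo_mptcp_in_use(int fd)
{
    socklen_t len = 0;

    return getsockopt(fd, SOL_MPTCP, MPTCP_INFO, NULL, &len) == 0;
}
```

Because nothing is copied out, a caller using this pattern benefits directly from the early return the patch adds.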
    
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    ce5f6f7
  25. mptcp: add net.mptcp.available_schedulers

    The sysctl lists the available schedulers that can be set using
    net.mptcp.scheduler similarly to net.ipv4.tcp_available_congestion_control.
    
    Signed-off-by: Gregory Detal <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Tested-by: Geliang Tang <[email protected]>
    Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    gdetal authored and kuba-moo committed May 14, 2024
    73c900a
  26. mptcp: prefer strscpy over strcpy

    strcpy() performs no bounds checking on the destination buffer. This
    could result in linear overflows beyond the end of the buffer, leading
    to all kinds of misbehaviors. The safe replacement is strscpy() [1].
    
    This is in preparation of a possible future step where all strcpy() uses
    will be removed in favour of strscpy() [2].
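
To make the difference concrete, here is a userspace sketch of a bounded copy with strscpy()-like semantics. demo_strscpy is a hypothetical stand-in written for illustration, not the kernel implementation: it always NUL-terminates a non-empty destination and reports truncation with -E2BIG.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Copy src into dst, writing at most size bytes including the NUL.
 * Returns the number of characters copied (excluding the NUL), or
 * -E2BIG if size is 0 or src had to be truncated. */
static ssize_t demo_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -E2BIG;

    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';  /* destination is always terminated */

    return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Unlike strcpy(), the caller passes the destination size, so an oversized source is truncated and reported rather than written past the end of the buffer.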
    
    This fixes CheckPatch warnings:
    
      WARNING: Prefer strscpy over strcpy
    
    Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1]
    Link: KSPP/linux#88 [2]
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    5eae7a8
  27. mptcp: remove unnecessary else statements

    The 'else' statements are not needed here, because their previous 'if'
    block ends with a 'return'.
    
    This fixes CheckPatch warnings:
    
      WARNING: else is not generally useful after a break or return
    
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    00797af
  28. mptcp: move mptcp_pm_gen.h's include

    Nothing from protocol.h depends on mptcp_pm_gen.h, only code from
    pm_netlink.c and pm_userspace.c depends on it.
    
    So this include can be moved where it is needed to avoid an "unused
    includes" warning.
    
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    76a8668
  29. mptcp: include inet_common in mib.h

    So this file is now self-contained: it can be compiled alone with
    analytic tools.
    
    Reviewed-by: Geliang Tang <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Mat Martineau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    matttbe authored and kuba-moo committed May 14, 2024
    7fad5b3
  30. Merge branch 'mptcp-small-improvements-fix-and-clean-ups'

    Mat Martineau says:
    
    ====================
    mptcp: small improvements, fix and clean-ups
    
    This series contains mostly unrelated patches:
    
    - The first two patches can be seen as "fixes". They are part of this
      series for -next because it looks like the last batch of fixes for
      v6.9 has already been sent. These fixes are not urgent, so they can
      wait if an unlikely v6.9-rc8 is published. About the two patches:
        - Patch 1 fixes getsockopt(SO_KEEPALIVE) support on MPTCP sockets
        - Patch 2 makes sure the full TCP keep-alive feature is supported,
          not just SO_KEEPALIVE.
    
    - Patch 3 is a small optimisation when getsockopt(MPTCP_INFO) is used
      without buffer, just to check if MPTCP is still being used: no
      fallback to TCP.
    
    - Patch 4 adds net.mptcp.available_schedulers sysctl knob to list packet
      schedulers, similar to net.ipv4.tcp_available_congestion_control.
    
    - Patch 5 and 6 fix CheckPatch warnings: "prefer strscpy over strcpy"
      and "else is not generally useful after a break or return".
    
    - Patch 7 and 8 remove and add header includes to avoid unused ones, and
      add missing ones to be self-contained.
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 14, 2024
    9512515
  31. net: stmmac: move the EST lock to struct stmmac_priv

    Reinitializing the whole EST structure would also reset the mutex
    lock which is embedded in the EST structure, and then trigger
    the following warning. To address this, move the lock to struct
    stmmac_priv. We also need to reacquire the mutex lock when doing
    this initialization.
    
    DEBUG_LOCKS_WARN_ON(lock->magic != lock)
    WARNING: CPU: 3 PID: 505 at kernel/locking/mutex.c:587 __mutex_lock+0xd84/0x1068
     Modules linked in:
     CPU: 3 PID: 505 Comm: tc Not tainted 6.9.0-rc6-00053-g0106679839f7-dirty #29
     Hardware name: NXP i.MX8MPlus EVK board (DT)
     pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : __mutex_lock+0xd84/0x1068
     lr : __mutex_lock+0xd84/0x1068
     sp : ffffffc0864e3570
     x29: ffffffc0864e3570 x28: ffffffc0817bdc78 x27: 0000000000000003
     x26: ffffff80c54f1808 x25: ffffff80c9164080 x24: ffffffc080d723ac
     x23: 0000000000000000 x22: 0000000000000002 x21: 0000000000000000
     x20: 0000000000000000 x19: ffffffc083bc3000 x18: ffffffffffffffff
     x17: ffffffc08117b080 x16: 0000000000000002 x15: ffffff80d2d40000
     x14: 00000000000002da x13: ffffff80d2d404b8 x12: ffffffc082b5a5c8
     x11: ffffffc082bca680 x10: ffffffc082bb2640 x9 : ffffffc082bb2698
     x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
     x5 : ffffff8178fe0d48 x4 : 0000000000000000 x3 : 0000000000000027
     x2 : ffffff8178fe0d50 x1 : 0000000000000000 x0 : 0000000000000000
     Call trace:
      __mutex_lock+0xd84/0x1068
      mutex_lock_nested+0x28/0x34
      tc_setup_taprio+0x118/0x68c
      stmmac_setup_tc+0x50/0xf0
      taprio_change+0x868/0xc9c
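
The failure mode behind the warning above can be reproduced in miniature in userspace. The sketch below is an illustration with hypothetical names (demo_lock, demo_est_old, demo_est, demo_priv); the only detail borrowed from the trace is the debug check that a lock's magic field points back at the lock itself.

```c
#include <string.h>

/* Toy lock with a debug self-pointer, mirroring the lock->magic != lock check. */
struct demo_lock {
    const struct demo_lock *magic;
};

static void demo_lock_init(struct demo_lock *l)
{
    l->magic = l;
}

static int demo_lock_valid(const struct demo_lock *l)
{
    return l->magic == l;
}

/* Old layout: the lock lives inside the structure that gets reinitialised. */
struct demo_est_old {
    struct demo_lock lock;
    unsigned int gcl_size;
};

/* New layout: the lock lives in the private data, next to the EST state. */
struct demo_est {
    unsigned int gcl_size;
};

struct demo_priv {
    struct demo_lock est_lock;
    struct demo_est est;
};

/* Reinitialising the old structure wipes the embedded lock as well.
 * Returns whether the lock is still valid afterwards. */
static int demo_reset_old(struct demo_est_old *est)
{
    memset(est, 0, sizeof(*est));
    return demo_lock_valid(&est->lock);
}

/* With the lock moved out, only the EST state is cleared.
 * Returns whether the lock is still valid afterwards. */
static int demo_reset_new(struct demo_priv *priv)
{
    memset(&priv->est, 0, sizeof(priv->est));
    return demo_lock_valid(&priv->est_lock);
}
```

In this toy model demo_reset_old() leaves the lock invalid while demo_reset_new() leaves it intact, which is the property the patch is after.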
    
    Fixes: b2aae65 ("net: stmmac: add mutex lock to protect est parameters")
    Signed-off-by: Xiaolei Wang <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Serge Semin <[email protected]>
    Reviewed-by: Andrew Halaney <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    xiaoleiwang123456 authored and kuba-moo committed May 14, 2024
    36ac9e7
  32. net: stmmac: move the EST structure to struct stmmac_priv

    Move the EST structure to struct stmmac_priv, because the
    EST configs don't look like platform config, but EST is
    enabled in runtime with the settings retrieved for the TC
    TAPRIO feature also in runtime. So it's better to have the
    EST-data preserved in the driver private data instead of
    the platform data storage.
    
    Signed-off-by: Xiaolei Wang <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Serge Semin <[email protected]>
    Reviewed-by: Andrew Halaney <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    xiaoleiwang123456 authored and kuba-moo committed May 14, 2024
    bd17382
  33. Merge branch 'move-est-lock-and-est-structure-to-struct-stmmac_priv'

    Xiaolei Wang says:
    
    ====================
    Move EST lock and EST structure to struct stmmac_priv
    
    1. Pulling the mutex protecting the EST structure out to avoid
        clearing it during reinit/memset of the EST structure, and
        reacquiring the mutex lock when doing this initialization.
    
    2. Moving the EST structure to a more logical location
    ====================
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 14, 2024
    b08191d
  34. net: revert partially applied PHY topology series

    The series is causing issues with PHY drivers built as modules.
    Since it was only partially applied and the merge window has
    opened let's revert and try again for v6.11.
    
    Revert 6916e46 ("net: phy: Introduce ethernet link topology representation")
    Revert 0ec5ed6 ("net: sfp: pass the phy_device when disconnecting an sfp module's PHY")
    Revert e75e4e0 ("net: phy: add helpers to handle sfp phy connect/disconnect")
    Revert fdd3539 ("net: sfp: Add helper to return the SFP bus name")
    Revert 841942b ("net: ethtool: Allow passing a phy index for some commands")
    
    Link: https://lore.kernel.org/all/171242462917.4000.9759453824684907063.git-patchwork-notify@kernel.org/
    Link: https://lore.kernel.org/all/[email protected]/
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    kuba-moo committed May 14, 2024
    5c16727

Commits on May 17, 2024

  1. 8f4a950