mlxbf_pmc: bring in latest 6.8 upstream commits #15
base: 24.04_linux-nvidia
Commits on May 30, 2024
UBUNTU: [Packaging] Initialize linux-nvidia-6.5
Signed-off-by: Ian May <[email protected]>
SHA: 1b17ae6
Revert "UBUNTU: SAUCE: modpost: support arbitrary symbol length in modversion"
This reverts commit 47d27f2. We need to revert this to avoid regressing any modules used in Jammy. Signed-off-by: Ian May <[email protected]>
SHA: b16faed
UBUNTU: [Packaging] update variants
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
SHA: 8635d59
UBUNTU: [Packaging] update Ubuntu.md
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
SHA: 2a2f308
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: 8912883
UBUNTU: [Config] nvidia-6.5: update annotations
Signed-off-by: Ian May <[email protected]>
SHA: 855fe66
UBUNTU: Ubuntu-nvidia-6.5-6.5.0-1001.1
Signed-off-by: Ian May <[email protected]>
SHA: 189b65c
UBUNTU: [Packaging] nvidia-6.5: disable rust support
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: 3aa0d62
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: 1d06550
UBUNTU: link-to-tracker: update tracking bug
BugLink: https://bugs.launchpad.net/bugs/2038972 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
SHA: 8b723cd
UBUNTU: [Config] nvidia-6.5: update annotations
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: ce68dae
UBUNTU: Ubuntu-nvidia-6.5-6.5.0-1004.4
Signed-off-by: Ian May <[email protected]>
SHA: f400f4e
Ignore: yes Signed-off-by: Paolo Pisati <[email protected]>
SHA: 508976f
SHA: a10b173
UBUNTU: link-to-tracker: update tracking bug
BugLink: https://bugs.launchpad.net/bugs/2046137 Properties: no-test-build Signed-off-by: Paolo Pisati <[email protected]>
SHA: 21c0669
UBUNTU: [Packaging] update variants
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>
SHA: c6f7452
UBUNTU: [Packaging] update update.conf
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Paolo Pisati <[email protected]>
SHA: 651fa11
UBUNTU: [Packaging] move to gcc-13 by default
Ignore: yes Signed-off-by: Andrea Righi <[email protected]>
SHA: 3e778f9
UBUNTU: rebase on Ubuntu-6.6.0-14.14
Signed-off-by: Paolo Pisati <[email protected]>
SHA: b50a190
UBUNTU: [Config] updateconfigs following Ubuntu-6.6.0-14.14 rebase
Signed-off-by: Paolo Pisati <[email protected]>
SHA: d5fa6f0
UBUNTU: Ubuntu-nvidia-6.6.0-1001.1
Signed-off-by: Paolo Pisati <[email protected]>
SHA: 1c13bcf
UBUNTU: [Packaging] move to linux 6.8
Ignore: yes Signed-off-by: Andrea Righi <[email protected]>
SHA: 27028de
Ignore: yes Signed-off-by: Andrea Righi <[email protected]>
SHA: 3b22c0b
Ignore: yes Signed-off-by: Andrea Righi <[email protected]>
SHA: 6a39482
UBUNTU: link-to-tracker: update tracking bug
BugLink: https://bugs.launchpad.net/bugs/2055128 Properties: no-test-build Signed-off-by: Andrea Righi <[email protected]>
SHA: 685541f
UBUNTU: debian.nvidia/dkms-versions -- update from kernel-versions (main/d2024.02.07)
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Andrea Righi <[email protected]>
SHA: 6f62cdc
UBUNTU: [Packaging] add Rust build dependencies
Signed-off-by: Andrea Righi <[email protected]>
SHA: 2a28d36
UBUNTU: [Config] update annotations after rebase to v6.8
Signed-off-by: Andrea Righi <[email protected]>
SHA: 81783f7
UBUNTU: [Packaging] clean ABI check files
Ignore: yes Signed-off-by: Andrea Righi <[email protected]>
SHA: 4798782
UBUNTU: Ubuntu-nvidia-6.8.0-1001.1
Signed-off-by: Andrea Righi <[email protected]>
SHA: a7dcaa1
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: ed361c3
UBUNTU: link-to-tracker: update tracking bug
BugLink: https://bugs.launchpad.net/bugs/2058266 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
SHA: 6bc833e
UBUNTU: [Config] nvidia: update annotations
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: d6d3ea6
UBUNTU: Ubuntu-nvidia-6.8.0-1002.2
Signed-off-by: Ian May <[email protected]>
SHA: 1883882
UBUNTU: [Packaging] dkms-versions standalone provides support
Add support for exposing rprovides data for standalone modules too. Switch to exposing provides as a shared debian/substvar file and use that in the templates. Ignore: yes Signed-off-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
SHA: 07444b1
UBUNTU: [Packaging] add versioning to dkms standalone rprovides
When nvidia-fs-dkms is available as a dkms package, we want to default to using the signed modules if possible. Adding a version number for the nvidia-fs modules package enables the inbox modules to be selected over an equivalent dkms version. Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: 9fb2f19
NVIDIA: [Config]: Grouping AAEON config options together, under a comment
BugLink: https://bugs.launchpad.net/bugs/2060327 Signed-off-by: Brad Figg <[email protected]> Acked-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
SHA: 4759dbc
NVIDIA: [Config]: Disable the NOUVEAU driver which is not used with -nvidia kernels
BugLink: https://bugs.launchpad.net/bugs/2060327 Signed-off-by: Brad Figg <[email protected]> Acked-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
SHA: f4b6311
NVIDIA: [Config]: Adding CORESIGHT and ARM64_ERRATUM configs to annotations
BugLink: https://bugs.launchpad.net/bugs/2060327 Signed-off-by: Brad Figg <[email protected]> Acked-by: Brad Figg <[email protected]> Signed-off-by: Ian May <[email protected]>
SHA: b9ef9ff
UBUNTU: [Config] update nvidia specific annotations with notes
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: db2f8e9
UBUNTU: [Config] update annotations with updateconfigs
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: d559487
UBUNTU: [Packaging] remove tools host package
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: cce0e65
NVIDIA: SAUCE: Patch NFS driver to support GDS with 6.8 Kernel
BugLink: https://bugs.launchpad.net/bugs/2059814 With this change, the NFS driver would be enabled to support GPUDirectStorage(GDS). The change is around frwr_map and frwr_unmap in the NFS driver, where the IO request is first intercepted to check for GDS pages and if it is a GDS page then the request is served by GDS driver component called nvidia-fs, else the request would be served by the standard NFS driver code. Signed-off-by: Sourab Gupta <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Ian May <[email protected]>
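The interception described in this commit message can be pictured as a small dispatch step. The following is only an illustrative userspace sketch, not code from the patch: `struct io_page`, `request_has_gds_pages()`, and `map_request()` are hypothetical names, and treating a request as GDS when any of its pages is a GDS page is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one page of an I/O request. */
struct io_page {
    bool gds; /* true if the page is GPU memory registered for GDS */
};

/* Which code path ends up serving the request. */
enum io_path { PATH_STANDARD, PATH_GDS };

/* Assumption for this sketch: one GDS page makes the whole request GDS. */
static bool request_has_gds_pages(const struct io_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].gds)
            return true;
    return false;
}

/* Intercept the map step: check for GDS pages first, else fall through. */
static enum io_path map_request(const struct io_page *pages, size_t n)
{
    assert(pages != NULL || n == 0);

    if (request_has_gds_pages(pages, n))
        return PATH_GDS;      /* would be handed to nvidia-fs */
    return PATH_STANDARD;     /* would continue through the regular NFS code */
}
```

The same shape would apply to the unmap side: the check runs first, and the standard driver code is reached only when no GDS pages are involved.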
SHA: e17a18a
NVIDIA: SAUCE: NVMe/MVMEeOF: Patch NVMe/NVMeOF driver to support GDS on Linux 6.8 Kernel
BugLink: https://bugs.launchpad.net/bugs/2059814 With this change, the NVMe and NVMeOF driver would be enabled to support GPUDirectStorage(GDS). The change is around nvme/nvme rdma map_data() and unmap_data(), where the IO request is first intercepted to check for GDS pages and if it is a GDS page then the request is served by GDS driver component called nvidia-fs, else the request would be served by the standard NVMe driver code Signed-off-by: Sourab Gupta <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Ian May <[email protected]>
SHA: b1da2dc
NVIDIA: [Config] Add nvidia-fs build dependencies
BugLink: https://bugs.launchpad.net/bugs/2059814 Signed-off-by: Brad Figg <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Ian May <[email protected]> Signed-off-by: Ian May <[email protected]>
SHA: 99ece0b
UBUNTU: [Packaging] drop getabis data
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
SHA: e36d7b1
UBUNTU: [Packaging] Replace fs/cifs with fs/smb in inclusion list
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
SHA: 97d9c91
UBUNTU: [Packaging] remove bindgen-0.56
This package is not available in noble. Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: 8718e41
Ignore: yes Signed-off-by: Ian May <[email protected]>
SHA: 0a4c8e3
UBUNTU: [Packaging] debian.nvidia/dkms-versions -- update from kernel-versions (main/d2024.04.04)
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Ian May <[email protected]>
SHA: 5ba7fb7
UBUNTU: link-to-tracker: update tracking bug
BugLink: https://bugs.launchpad.net/bugs/2060232 Properties: no-test-build Signed-off-by: Ian May <[email protected]>
SHA: 9c45592
UBUNTU: Ubuntu-nvidia-6.8.0-1006.6
Signed-off-by: Ian May <[email protected]>
SHA: 196705c
gpio: tegra186: Fix tegra186_gpio_is_accessible() check
BugLink: https://bugs.launchpad.net/bugs/2064549

The controller has several register bits describing access control information for a given GPIO pin. When SCR_SEC_[R|W]EN is unset, it means we have full read/write access to all the registers for given GPIO pin. When SCR_SEC_[R|W]EN is set, it means we need to further check the accompanying SCR_SEC_G1[R|W] bit to determine read/write access to all the registers for given GPIO pin.

This check was previously declaring that a GPIO pin was accessible only if either of the following conditions were met:

- SCR_SEC_REN + SCR_SEC_WEN both set
or
- SCR_SEC_REN + SCR_SEC_WEN both set and SCR_SEC_G1R + SCR_SEC_G1W both set

Update the check to properly handle cases where only one of SCR_SEC_REN or SCR_SEC_WEN is set.

Fixes: b2b56a1 ("gpio: tegra186: Check GPIO pin permission before access.") Signed-off-by: Prathamesh Shete <[email protected]> Acked-by: Thierry Reding <[email protected]> (cherry-picked from commit d806f47 linux-next) Signed-off-by: Jamie Nguyen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
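One way to read the rule stated in this message is to evaluate the read side and the write side independently. The sketch below is a userspace illustration of that reading, not the driver's code: the bit positions and the helper names (`scr_read_ok()`, `scr_write_ok()`, `gpio_is_accessible()`) are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SCR_SEC_WEN (1u << 28) /* write access is gated */
#define SCR_SEC_REN (1u << 27) /* read access is gated */
#define SCR_SEC_G1W (1u << 9)  /* write granted when gated */
#define SCR_SEC_G1R (1u << 1)  /* read granted when gated */

/* Reads are allowed if ungated, or gated and granted. */
static bool scr_read_ok(uint32_t scr)
{
    return !(scr & SCR_SEC_REN) || (scr & SCR_SEC_G1R);
}

/* Writes follow the same rule with the write-side bits. */
static bool scr_write_ok(uint32_t scr)
{
    return !(scr & SCR_SEC_WEN) || (scr & SCR_SEC_G1W);
}

/* A pin counts as accessible only with both read and write access. */
static bool gpio_is_accessible(uint32_t scr)
{
    return scr_read_ok(scr) && scr_write_ok(scr);
}

/* Usage: cases where only one of the two enable bits is set. */
static void gpio_access_examples(void)
{
    assert(gpio_is_accessible(SCR_SEC_REN | SCR_SEC_G1R)); /* reads gated and granted */
    assert(!gpio_is_accessible(SCR_SEC_WEN));              /* writes gated, not granted */
}
```

Splitting the check per direction is what makes the "only one of SCR_SEC_REN or SCR_SEC_WEN is set" case fall out naturally: the ungated direction passes without consulting its G1 bit.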
SHA: 6a7cd7f
arm64/mm: make set_ptes() robust when OAs cross 48-bit boundary
BugLink: https://bugs.launchpad.net/bugs/2059316 Patch series "mm/memory: optimize fork() with PTE-mapped THP", v3. Now that the rmap overhaul[1] is upstream that provides a clean interface for rmap batching, let's implement PTE batching during fork when processing PTE-mapped THPs. This series is partially based on Ryan's previous work[2] to implement cont-pte support on arm64, but its a complete rewrite based on [1] to optimize all architectures independent of any such PTE bits, and to use the new rmap batching functions that simplify the code and prepare for further rmap accounting changes. We collect consecutive PTEs that map consecutive pages of the same large folio, making sure that the other PTE bits are compatible, and (a) adjust the refcount only once per batch, (b) call rmap handling functions only once per batch and (c) perform batch PTE setting/updates. While this series should be beneficial for adding cont-pte support on ARM64[2], it's one of the requirements for maintaining a total mapcount[3] for large folios with minimal added overhead and further changes[4] that build up on top of the total mapcount. Independent of all that, this series results in a speedup during fork with PTE-mapped THP, which is the default with THPs that are smaller than a PMD (for example, 16KiB to 1024KiB mTHPs for anonymous memory[5]). 
On an Intel Xeon Silver 4210R CPU, fork'ing with 1GiB of PTE-mapped folios of the same size (stddev < 1%) results in the following runtimes for fork() (shorter is better):

Folio Size | v6.8-rc1 |      New | Change
------------------------------------------
      4KiB | 0.014328 | 0.014035 |   - 2%
     16KiB | 0.014263 | 0.01196  |   -16%
     32KiB | 0.014334 | 0.01094  |   -24%
     64KiB | 0.014046 | 0.010444 |   -26%
    128KiB | 0.014011 | 0.010063 |   -28%
    256KiB | 0.013993 | 0.009938 |   -29%
    512KiB | 0.013983 | 0.00985  |   -30%
   1024KiB | 0.013986 | 0.00982  |   -30%
   2048KiB | 0.014305 | 0.010076 |   -30%

Note that these numbers are even better than the ones from v1 (verified over multiple reboots), even though there were only minimal code changes. Well, I removed a pte_mkclean() call for anon folios, maybe that also plays a role. But my experience is that fork() is extremely sensitive to code size, inlining, ... so I suspect we'll see on other architectures rather a change of -20% instead of -30%, and it will be easy to "lose" some of that speedup in the future by subtle code changes.

Next up is PTE batching when unmapping. Only tested on x86-64. Compile-tested on most other architectures.

[1] https://lkml.kernel.org/r/[email protected]
[2] https://lkml.kernel.org/r/[email protected]
[3] https://lkml.kernel.org/r/[email protected]
[4] https://lkml.kernel.org/r/[email protected]
[5] https://lkml.kernel.org/r/[email protected]

This patch (of 15):

Since the high bits [51:48] of an OA are not stored contiguously in the PTE, there is a theoretical bug in set_ptes(), which just adds PAGE_SIZE to the pte to get the pte with the next pfn. This works until the pfn crosses the 48-bit boundary, at which point we overflow into the upper attributes. Of course one could argue (and Matthew Wilcox has :) that we will never see a folio cross this boundary because we only allow naturally aligned power-of-2 allocation, so this would require a half-petabyte folio. So its only a theoretical bug.
But its better that the code is robust regardless. I've implemented pte_next_pfn() as part of the fix, which is an opt-in core-mm interface. So that is now available to the core-mm, which will be needed shortly to support forthcoming fork()-batching optimizations. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 4a169d6 ("arm64: implement the new page table range API") Closes: https://lore.kernel.org/linux-mm/[email protected]/ Signed-off-by: Ryan Roberts <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Alexandre Ghiti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 6e8f588) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
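The overflow described in this message can be reproduced with a toy model. The sketch below assumes 64KiB pages and a layout in which PTE bits [47:16] hold OA[47:16] and PTE bits [15:12] hold OA[51:48]; the constants and helper names are illustrative assumptions, not the kernel's definitions. It contrasts the naive "add PAGE_SIZE" step with a decode, increment, re-encode step.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model: 64KiB pages, 52-bit output addresses. */
#define PAGE_SHIFT      16
#define PAGE_SIZE       (1ULL << PAGE_SHIFT)
#define ADDR_LOW        (((1ULL << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT) /* PTE[47:16] */
#define ADDR_HIGH       (0xfULL << 12)                                    /* PTE[15:12] */
#define ADDR_HIGH_SHIFT 36
#define ADDR_MASK       (ADDR_LOW | ADDR_HIGH)

/* Encode a page-aligned physical address into the split PTE fields. */
static uint64_t phys_to_pte(uint64_t phys)
{
    assert((phys & (PAGE_SIZE - 1)) == 0);
    return (phys | (phys >> ADDR_HIGH_SHIFT)) & ADDR_MASK;
}

/* Decode the physical address back out of the split fields. */
static uint64_t pte_to_phys(uint64_t pte)
{
    return (pte & ADDR_LOW) | ((pte & ADDR_HIGH) << ADDR_HIGH_SHIFT);
}

/* Naive step: correct only while the address stays below 2^48. */
static uint64_t pte_next_naive(uint64_t pte)
{
    return pte + PAGE_SIZE;
}

/* Robust step: decode, advance one page, re-encode, keep other bits. */
static uint64_t pte_next_pfn(uint64_t pte)
{
    uint64_t attrs = pte & ~ADDR_MASK;
    uint64_t phys = pte_to_phys(pte) + PAGE_SIZE;

    return attrs | phys_to_pte(phys);
}
```

For the last page below the 48-bit boundary, the naive step carries into bit 48 of the PTE, which is outside both address fields in this model, so the decoded address is no longer the next page. The robust step lands on exactly 2^48 and leaves the non-address bits untouched.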
SHA: b739174
arm/pgtable: define PFN_PTE_SHIFT
BugLink: https://bugs.launchpad.net/bugs/2059316 We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 12b884f) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
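To show what a PFN_PTE_SHIFT definition buys, here is a minimal userspace sketch of a generic pte_next_pfn() built on it. It assumes a simple layout in which the PFN occupies every bit at or above the shift and attributes sit below it; the value 12 and the helpers `pfn_to_pte()` and `pte_to_pfn()` are assumptions for the example, not any architecture's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define PFN_PTE_SHIFT 12 /* assumed value for this sketch */
#define PTE_ATTR_MASK ((1ULL << PFN_PTE_SHIFT) - 1)

typedef struct { uint64_t val; } pte_t;

/* Build a PTE from a page frame number and low attribute bits. */
static pte_t pfn_to_pte(uint64_t pfn, uint64_t attrs)
{
    assert((attrs & ~PTE_ATTR_MASK) == 0); /* attributes live below the PFN */
    return (pte_t){ (pfn << PFN_PTE_SHIFT) | attrs };
}

/* Extract the page frame number. */
static uint64_t pte_to_pfn(pte_t pte)
{
    return pte.val >> PFN_PTE_SHIFT;
}

/* Advance to the next PFN by adding one unit at the PFN's bit position. */
static pte_t pte_next_pfn(pte_t pte)
{
    return (pte_t){ pte.val + (1ULL << PFN_PTE_SHIFT) };
}
```

With the shift known, the increment needs no knowledge of the attribute bits, which is why a single constant per architecture is enough for layouts where the PFN field is contiguous.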
SHA: 83683cb
nios2/pgtable: define PFN_PTE_SHIFT
BugLink: https://bugs.launchpad.net/bugs/2059316 We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 3a6a6c3) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
SHA: eca43c2
powerpc/pgtable: define PFN_PTE_SHIFT
BugLink: https://bugs.launchpad.net/bugs/2059316 We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit f7dc4d6) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
SHA: 294c511
riscv/pgtable: define PFN_PTE_SHIFT
BugLink: https://bugs.launchpad.net/bugs/2059316 We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Alexandre Ghiti <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 57c254b) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
SHA: a94b720
s390/pgtable: define PFN_PTE_SHIFT
BugLink: https://bugs.launchpad.net/bugs/2059316 We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 4555ac8) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
SHA: 93b3718
sparc/pgtable: define PFN_PTE_SHIFT
BugLink: https://bugs.launchpad.net/bugs/2059316 We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit ce7a9de) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 412c516
mm/pgtable: make pte_next_pfn() independent of set_ptes()
BugLink: https://bugs.launchpad.net/bugs/2059316 Let's provide pte_next_pfn(), independently of set_ptes(). This allows for using the generic pte_next_pfn() version in some arch-specific set_ptes() implementations, and prepares for reusing pte_next_pfn() in other context. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 6cdfa1d) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 471e039
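As a rough illustration of what a generic pte_next_pfn() built on PFN_PTE_SHIFT does, here is a user-space sketch. The pte_t layout, the shift value of 12, and the demo function are assumptions made for this example only; real kernels define these per architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Toy PTE (assumed layout): PFN at and above PFN_PTE_SHIFT, flag bits below. */
typedef struct { uint64_t val; } pte_t;

#define PFN_PTE_SHIFT 12	/* assumption: real values are per-architecture */

static inline uint64_t pte_pfn(pte_t pte)
{
	return pte.val >> PFN_PTE_SHIFT;
}

/* Step to the next page frame; the low flag bits are left untouched. */
static inline pte_t pte_next_pfn(pte_t pte)
{
	pte.val += (uint64_t)1 << PFN_PTE_SHIFT;
	return pte;
}

/* Usage: advance a PTE for PFN 0x1234 that has two flag bits set. */
void pte_next_pfn_demo(void)
{
	pte_t pte = { (0x1234ULL << PFN_PTE_SHIFT) | 0x3 };
	pte_t next = pte_next_pfn(pte);

	assert(pte_pfn(next) == 0x1235);
	assert((next.val & 0xfff) == 0x3);
}
```

Because the helper only needs the shift, it can live outside set_ptes() once every architecture provides PFN_PTE_SHIFT.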
arm/mm: use pte_next_pfn() in set_ptes()
BugLink: https://bugs.launchpad.net/bugs/2059316 Let's use our handy helper now that it's available on all archs. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit e5ea320) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 0a1dfdb
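The shape of a set_ptes() loop that relies on pte_next_pfn() can be sketched in user space as follows. This is a simplified model, not the arm implementation: the pte_t layout, the shift, and the names set_ptes_model() and set_ptes_demo() are hypothetical, and the real function also takes an mm and an address and may do cache maintenance.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t val; } pte_t;	/* toy PTE, assumed layout */

#define PFN_PTE_SHIFT 12		/* assumption for this sketch */

static inline pte_t pte_next_pfn(pte_t pte)
{
	pte.val += (uint64_t)1 << PFN_PTE_SHIFT;
	return pte;
}

/* Write nr (>= 1) consecutive entries, each mapping the next page frame. */
static void set_ptes_model(pte_t *ptep, pte_t pte, unsigned int nr)
{
	for (;;) {
		*ptep = pte;
		if (--nr == 0)
			break;
		ptep++;
		pte = pte_next_pfn(pte);
	}
}

/* Usage: map four pages starting at PFN 0x100 with one flag bit set. */
void set_ptes_demo(void)
{
	pte_t table[4] = { { 0 } };
	pte_t first = { (0x100ULL << PFN_PTE_SHIFT) | 0x1 };

	set_ptes_model(table, first, 4);
	assert(table[0].val == first.val);
	assert(table[3].val == ((0x103ULL << PFN_PTE_SHIFT) | 0x1));
}
```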
powerpc/mm: use pte_next_pfn() in set_ptes()
BugLink: https://bugs.launchpad.net/bugs/2059316 Let's use our handy new helper. Note that the implementation is slightly different, but shouldn't really make a difference in practice. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Tested-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 802cc2a) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 2b99f03
mm/memory: factor out copying the actual PTE in copy_present_pte()
BugLink: https://bugs.launchpad.net/bugs/2059316 Let's prepare for further changes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 23ed190) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 2b7d40c
mm/memory: pass PTE to copy_present_pte()
BugLink: https://bugs.launchpad.net/bugs/2059316 We already read it, let's just forward it. This patch is based on work by Ryan Roberts. [[email protected]: fix the hmm "exclusive_cow" selftest] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 5372329) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit cd4ff8d
mm/memory: optimize fork() with PTE-mapped THP
BugLink: https://bugs.launchpad.net/bugs/2059316 Let's implement PTE batching when consecutive (present) PTEs map consecutive pages of the same large folio, and all other PTE bits besides the PFNs are equal. We will optimize folio_pte_batch() separately, to ignore selected PTE bits. This patch is based on work by Ryan Roberts. Use __always_inline for __copy_present_ptes() and keep the handling for single PTEs completely separate from the multi-PTE case: we really want the compiler to optimize for the single-PTE case with small folios, to not degrade performance. Note that PTE batching will never exceed a single page table and will always stay within VMA boundaries. Further, processing PTE-mapped THP that may be pinned and have PageAnonExclusive set on at least one subpage should work as expected, but there is room for improvement: We will repeatedly (1) detect a PTE batch, (2) detect that we have to copy a page, and (3) fall back and allocate a single page to copy a single page. For now we won't care as pinned pages are a corner case, and we should rather look into maintaining only a single PageAnonExclusive bit for large folios. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (backported from commit f8d9377) [ dannf: mm_counter_file() in v6.8 took a page instead of a folio ] Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit b66bbc5
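The batching rule described in this commit (consecutive present PTEs, consecutive PFNs inside one folio, all other bits equal) can be modelled in user space roughly like this. It is a sketch under stated assumptions: the pte_t layout, PTE_PRESENT, and the function pte_batch_model() are hypothetical, and the caller is assumed to pass a present first entry and a max_nr that already respects page-table and VMA boundaries.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t val; } pte_t;	/* toy PTE, assumed layout */

#define PFN_PTE_SHIFT 12		/* assumption for this sketch */
#define PTE_PRESENT   (1ULL << 0)	/* hypothetical present bit */

static inline uint64_t pte_pfn(pte_t pte)
{
	return pte.val >> PFN_PTE_SHIFT;
}

static inline pte_t pte_next_pfn(pte_t pte)
{
	pte.val += (uint64_t)1 << PFN_PTE_SHIFT;
	return pte;
}

/*
 * Count how many entries starting at ptep[0] form one batch: each entry
 * must be present, map the PFN following the previous one, carry the
 * same flag bits, and stay below folio_end_pfn (exclusive).
 */
static unsigned int pte_batch_model(const pte_t *ptep, unsigned int max_nr,
				    uint64_t folio_end_pfn)
{
	pte_t expected = pte_next_pfn(ptep[0]);
	unsigned int nr = 1;

	while (nr < max_nr) {
		pte_t pte = ptep[nr];

		if (!(pte.val & PTE_PRESENT) || pte.val != expected.val)
			break;
		if (pte_pfn(pte) >= folio_end_pfn)
			break;
		expected = pte_next_pfn(expected);
		nr++;
	}
	return nr;
}

/* Usage: three consecutive PFNs followed by an unrelated one. */
void pte_batch_demo(void)
{
	pte_t table[4] = {
		{ (0x200ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
		{ (0x201ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
		{ (0x202ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
		{ (0x300ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
	};

	assert(pte_batch_model(table, 4, 0x210) == 3);
	/* A folio that ends at PFN 0x202 cuts the batch short. */
	assert(pte_batch_model(table, 4, 0x202) == 2);
}
```

Keeping the single-entry case trivially cheap (the loop body never runs when max_nr is 1) reflects the commit's concern about not slowing down small folios.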
mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
BugLink: https://bugs.launchpad.net/bugs/2059316 Let's always ignore the accessed/young bit: we'll always mark the PTE as old in our child process during fork, and upcoming users will similarly not care. Ignore the dirty bit only if we don't want to duplicate the dirty bit into the child process during fork. Maybe, we could just set all PTEs in the child dirty if any PTE is dirty. For now, let's keep the behavior unchanged, this can be optimized later if required. Ignore the soft-dirty bit only if the bit doesn't have any meaning in the src vma, and similarly won't have any in the copied dst vma. For now, we won't bother with the uffd-wp bit. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 25365e1) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit f7acae8
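One way to picture "ignoring" bits during batch detection is to clear them from both sides before comparing. The sketch below does that in user space; the bit positions, the BATCH_IGNORE_* flag names, and both functions are hypothetical stand-ins chosen for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t val; } pte_t;	/* toy PTE, assumed layout */

#define PFN_PTE_SHIFT  12		/* assumption for this sketch */
#define PTE_PRESENT    (1ULL << 0)	/* hypothetical bit positions */
#define PTE_YOUNG      (1ULL << 1)
#define PTE_DIRTY      (1ULL << 2)
#define PTE_SOFT_DIRTY (1ULL << 3)

#define BATCH_IGNORE_DIRTY      (1u << 0)	/* hypothetical flags */
#define BATCH_IGNORE_SOFT_DIRTY (1u << 1)

static inline pte_t pte_next_pfn(pte_t pte)
{
	pte.val += (uint64_t)1 << PFN_PTE_SHIFT;
	return pte;
}

/* Always drop the accessed bit; drop dirty/soft-dirty only on request. */
static pte_t clear_ignored(pte_t pte, unsigned int flags)
{
	pte.val &= ~PTE_YOUNG;
	if (flags & BATCH_IGNORE_DIRTY)
		pte.val &= ~PTE_DIRTY;
	if (flags & BATCH_IGNORE_SOFT_DIRTY)
		pte.val &= ~PTE_SOFT_DIRTY;
	return pte;
}

static unsigned int pte_batch_ignoring(const pte_t *ptep, unsigned int max_nr,
				       unsigned int flags)
{
	pte_t expected = pte_next_pfn(clear_ignored(ptep[0], flags));
	unsigned int nr = 1;

	while (nr < max_nr) {
		pte_t pte = clear_ignored(ptep[nr], flags);

		if (!(pte.val & PTE_PRESENT) || pte.val != expected.val)
			break;
		expected = pte_next_pfn(expected);
		nr++;
	}
	return nr;
}

/* Usage: a young first entry never splits a batch; a dirty one may. */
void ignore_bits_demo(void)
{
	pte_t table[3] = {
		{ (0x10ULL << PFN_PTE_SHIFT) | PTE_PRESENT | PTE_YOUNG },
		{ (0x11ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
		{ (0x12ULL << PFN_PTE_SHIFT) | PTE_PRESENT | PTE_DIRTY },
	};

	assert(pte_batch_ignoring(table, 3, 0) == 2);
	assert(pte_batch_ignoring(table, 3, BATCH_IGNORE_DIRTY) == 3);
}
```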
mm/memory: ignore writable bit in folio_pte_batch()
BugLink: https://bugs.launchpad.net/bugs/2059316 ... and conditionally return to the caller if any PTE except the first one is writable. fork() has to make sure to properly write-protect in case any PTE is writable. Other users (e.g., page unmapping) are expected to not care. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit d7c0e5f) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 1ee3a79
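A minimal user-space sketch of the idea: the writable bit is masked out for comparison, and an optional out-parameter reports whether any entry after the first was writable. Bit positions and the function name pte_batch_writable() are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t val; } pte_t;	/* toy PTE, assumed layout */

#define PFN_PTE_SHIFT 12		/* assumption for this sketch */
#define PTE_PRESENT   (1ULL << 0)	/* hypothetical bit positions */
#define PTE_WRITE     (1ULL << 1)

static inline pte_t pte_next_pfn(pte_t pte)
{
	pte.val += (uint64_t)1 << PFN_PTE_SHIFT;
	return pte;
}

static inline pte_t pte_wrprotect(pte_t pte)
{
	pte.val &= ~PTE_WRITE;
	return pte;
}

/*
 * Batch detection that ignores the writable bit. If any_writable is
 * non-NULL, it reports whether an entry other than the first one in
 * the batch was writable.
 */
static unsigned int pte_batch_writable(const pte_t *ptep, unsigned int max_nr,
				       bool *any_writable)
{
	pte_t expected = pte_next_pfn(pte_wrprotect(ptep[0]));
	unsigned int nr = 1;

	if (any_writable)
		*any_writable = false;

	while (nr < max_nr) {
		pte_t pte = ptep[nr];
		bool writable = (pte.val & PTE_WRITE) != 0;

		pte = pte_wrprotect(pte);
		if (!(pte.val & PTE_PRESENT) || pte.val != expected.val)
			break;
		if (any_writable)
			*any_writable |= writable;
		expected = pte_next_pfn(expected);
		nr++;
	}
	return nr;
}

/* Usage: a fork()-like caller asks for the hint, an unmap-like one does not. */
void any_writable_demo(void)
{
	pte_t table[3] = {
		{ (0x20ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
		{ (0x21ULL << PFN_PTE_SHIFT) | PTE_PRESENT | PTE_WRITE },
		{ (0x22ULL << PFN_PTE_SHIFT) | PTE_PRESENT },
	};
	bool any_writable = false;

	assert(pte_batch_writable(table, 3, &any_writable) == 3);
	assert(any_writable);
	assert(pte_batch_writable(table, 3, NULL) == 3);
}
```

A caller that must write-protect the whole batch would check its own view of the first entry in addition to the returned hint, since the hint deliberately excludes it.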
mm: clarify the spec for set_ptes()
BugLink: https://bugs.launchpad.net/bugs/2059316

Patch series "Transparent Contiguous PTEs for User Mappings", v6.

This is a series to opportunistically and transparently use contpte mappings (set the contiguous bit in ptes) for user memory when those mappings meet the requirements. The change benefits arm64, but there is some (very) minor refactoring for x86 to enable its integration with core-mm.

It is part of a wider effort to improve performance by allocating and mapping variable-sized blocks of memory (folios). One aim is for the 4K kernel to approach the performance of the 16K kernel, but without breaking compatibility and without the associated increase in memory. Another aim is to benefit the 16K and 64K kernels by enabling 2M THP, since this is the contpte size for those kernels. We have good performance data that demonstrates both aims are being met (see below).

Of course this is only one half of the change. We require the mapped physical memory to be the correct size and alignment for this to actually be useful (i.e. 64K for 4K pages, or 2M for 16K/64K pages). Fortunately folios are solving this problem for us. Filesystems that support it (XFS, AFS, EROFS, tmpfs, ...) will allocate large folios up to the PMD size today, and more filesystems are coming. And for anonymous memory, "multi-size THP" is now upstream.

Patch Layout
============

In this version, I've split the patches to better show each optimization:

- 1-2: mm prep: misc code and docs cleanups
- 3-6: mm,arm64,x86 prep: Add pte_advance_pfn() and make pte_next_pfn() a generic wrapper around it
- 7-11: arm64 prep: Refactor ptep helpers into new layer
- 12: functional contpte implementation
- 13-18: various optimizations on top of the contpte implementation

Testing
=======

I've tested this series on both Ampere Altra (bare metal) and Apple M2 (VM):

- mm selftests (inc new tests written for multi-size THP); no regressions
- Speedometer JavaScript benchmark in Chromium web browser; no issues
- Kernel compilation; no issues
- Various tests under high memory pressure with swap enabled; no issues

Performance
===========

High Level Use Cases
~~~~~~~~~~~~~~~~~~~~

First some high level use cases (kernel compilation and speedometer JavaScript benchmarks). These are running on Ampere Altra (I've seen similar improvements on Android/Pixel 6).

baseline: mm-unstable (mTHP switched off)
mTHP: + enable 16K, 32K, 64K mTHP sizes "always"
mTHP + contpte: + this series
mTHP + contpte + exefolio: + patch at [6], which series supports

Kernel Compilation with -j8 (negative is faster):

| kernel                    | real-time | kern-time | user-time |
|---------------------------|-----------|-----------|-----------|
| baseline                  |      0.0% |      0.0% |      0.0% |
| mTHP                      |     -5.0% |    -39.1% |     -0.7% |
| mTHP + contpte            |     -6.0% |    -41.4% |     -1.5% |
| mTHP + contpte + exefolio |     -7.8% |    -43.1% |     -3.4% |

Kernel Compilation with -j80 (negative is faster):

| kernel                    | real-time | kern-time | user-time |
|---------------------------|-----------|-----------|-----------|
| baseline                  |      0.0% |      0.0% |      0.0% |
| mTHP                      |     -5.0% |    -36.6% |     -0.6% |
| mTHP + contpte            |     -6.1% |    -38.2% |     -1.6% |
| mTHP + contpte + exefolio |     -7.4% |    -39.2% |     -3.2% |

Speedometer (positive is faster):

| kernel                    | runs_per_min |
|:--------------------------|--------------|
| baseline                  |         0.0% |
| mTHP                      |         1.5% |
| mTHP + contpte            |         3.2% |
| mTHP + contpte + exefolio |         4.5% |

Micro Benchmarks
~~~~~~~~~~~~~~~~

The following microbenchmarks are intended to demonstrate that the performance of fork() and munmap() does not regress. I'm showing results for order-0 (4K) mappings, and for order-9 (2M) PTE-mapped THP. Thanks to David for sharing his benchmarks.

baseline: mm-unstable + batch zap [7] series
contpte-basic: + patches 0-19; functional contpte implementation
contpte-batch: + patches 20-23; implement new batched APIs
contpte-inline: + patch 24; __always_inline to help compiler
contpte-fold: + patch 25; fold contpte mapping when sensible

Primary platform is Ampere Altra bare metal. I'm also showing results for M2 VM (on top of MacOS) for reference, although experience suggests this might not be the most reliable for performance numbers of this sort:

| FORK           |         order-0        |         order-9        |
| Ampere Altra   |------------------------|------------------------|
| (pte-map)      |       mean |     stdev |       mean |     stdev |
|----------------|------------|-----------|------------|-----------|
| baseline       |       0.0% |      2.7% |       0.0% |      0.2% |
| contpte-basic  |       6.3% |      1.4% |    1948.7% |      0.2% |
| contpte-batch  |       7.6% |      2.0% |      -1.9% |      0.4% |
| contpte-inline |       3.6% |      1.5% |      -1.0% |      0.2% |
| contpte-fold   |       4.6% |      2.1% |      -1.8% |      0.2% |

| MUNMAP         |         order-0        |         order-9        |
| Ampere Altra   |------------------------|------------------------|
| (pte-map)      |       mean |     stdev |       mean |     stdev |
|----------------|------------|-----------|------------|-----------|
| baseline       |       0.0% |      0.5% |       0.0% |      0.3% |
| contpte-basic  |       1.8% |      0.3% |    1104.8% |      0.1% |
| contpte-batch  |      -0.3% |      0.4% |       2.7% |      0.1% |
| contpte-inline |      -0.1% |      0.6% |       0.9% |      0.1% |
| contpte-fold   |       0.1% |      0.6% |       0.8% |      0.1% |

| FORK           |         order-0        |         order-9        |
| Apple M2 VM    |------------------------|------------------------|
| (pte-map)      |       mean |     stdev |       mean |     stdev |
|----------------|------------|-----------|------------|-----------|
| baseline       |       0.0% |      1.4% |       0.0% |      0.8% |
| contpte-basic  |       6.8% |      1.2% |     469.4% |      1.4% |
| contpte-batch  |      -7.7% |      2.0% |      -8.9% |      0.7% |
| contpte-inline |      -6.0% |      2.1% |      -6.0% |      2.0% |
| contpte-fold   |       5.9% |      1.4% |      -6.4% |      1.4% |

| MUNMAP         |         order-0        |         order-9        |
| Apple M2 VM    |------------------------|------------------------|
| (pte-map)      |       mean |     stdev |       mean |     stdev |
|----------------|------------|-----------|------------|-----------|
| baseline       |       0.0% |      0.6% |       0.0% |      0.4% |
| contpte-basic  |       1.6% |      0.6% |     233.6% |      0.7% |
| contpte-batch  |       1.9% |      0.3% |      -3.9% |      0.4% |
| contpte-inline |       2.2% |      0.8% |      -1.6% |      0.9% |
| contpte-fold   |       1.5% |      0.7% |      -1.7% |      0.7% |

Misc
~~~~

John Hubbard at Nvidia has indicated dramatic 10x performance improvements for some workloads at [8], when using 64K base page kernel.

[1] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[2] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[3] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[4] https://lore.kernel.org/lkml/[email protected]/
[5] https://lore.kernel.org/linux-mm/[email protected]/
[6] https://lore.kernel.org/lkml/[email protected]/
[7] https://lore.kernel.org/linux-mm/[email protected]/
[8] https://lore.kernel.org/linux-mm/[email protected]/
[9] https://gitlab.arm.com/linux-arm/linux-rr/-/tree/features/granule_perf/contpte-lkml_v6

This patch (of 18):

set_ptes() spec implies that it can only be used to set a present pte because it interprets the PFN field to increment it. However, set_pte_at() has been implemented on top of set_ptes() since set_ptes() was introduced, and set_pte_at() allows setting a pte to a not-present state. So clarify the spec to state that when nr==1, new state of pte may be present or not present. When nr>1, new state of all ptes must be present.

While we are at it, tighten the spec to set requirements around the initial state of ptes; when nr==1 it may be either present or not-present. But when nr>1 all ptes must initially be not-present. All set_ptes() callsites already conform to this requirement. Stating it explicitly is useful because it allows for a simplification to the upcoming arm64 contpte implementation.

Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 6280d73) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit d549f56
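The clarified set_ptes() contract can be expressed as a small predicate. The following user-space sketch checks whether a hypothetical call conforms; pte_t, PTE_PRESENT, and set_ptes_call_conforms() are illustrative names invented here, and the nr >= 1 requirement is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint64_t val; } pte_t;	/* toy PTE, assumed layout */

#define PFN_PTE_SHIFT 12		/* assumption for this sketch */
#define PTE_PRESENT   (1ULL << 0)	/* hypothetical present bit */

static inline bool pte_present(pte_t pte)
{
	return (pte.val & PTE_PRESENT) != 0;
}

/*
 * Check a would-be set_ptes(ptep, pte, nr) call against the spec:
 *   nr == 1: old and new state may each be present or not-present.
 *   nr  > 1: the new state must be present, and every old entry
 *            must be not-present.
 */
static bool set_ptes_call_conforms(const pte_t *ptep, pte_t pte,
				   unsigned int nr)
{
	unsigned int i;

	if (nr == 0)
		return false;	/* assumption: callers pass nr >= 1 */
	if (nr == 1)
		return true;
	if (!pte_present(pte))
		return false;
	for (i = 0; i < nr; i++)
		if (pte_present(ptep[i]))
			return false;
	return true;
}

/* Usage: an empty table accepts a batch; a populated slot rejects it. */
void set_ptes_spec_demo(void)
{
	pte_t table[4] = { { 0 } };
	pte_t present = { (0x100ULL << PFN_PTE_SHIFT) | PTE_PRESENT };
	pte_t none = { 0 };

	assert(set_ptes_call_conforms(table, present, 4));
	assert(!set_ptes_call_conforms(table, none, 4));
	assert(set_ptes_call_conforms(table, none, 1));

	table[2] = present;
	assert(!set_ptes_call_conforms(table, present, 4));
	assert(set_ptes_call_conforms(&table[2], none, 1));
}
```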
mm: thp: batch-collapse PMD with set_ptes()
BugLink: https://bugs.launchpad.net/bugs/2059316 Refactor __split_huge_pmd_locked() so that a present PMD can be collapsed to PTEs in a single batch using set_ptes(). This should improve performance a little bit, but the real motivation is to remove the need for the arm64 backend to have to fold the contpte entries. Instead, since the ptes are set as a batch, the contpte blocks can be initially set up pre-folded (once the arm64 contpte support is added in the next few patches). This leads to noticeable performance improvement during split. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 2bdba98) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 1c507af
mm: introduce pte_advance_pfn() and use for pte_next_pfn()
BugLink: https://bugs.launchpad.net/bugs/2059316 The goal is to be able to advance a PTE by an arbitrary number of PFNs. So introduce a new API that takes a nr param. Define the default implementation here and allow for architectures to override. pte_next_pfn() becomes a wrapper around pte_advance_pfn(). Follow up commits will convert each overriding architecture's pte_next_pfn() to pte_advance_pfn(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 583ceaa) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 93e1fe8
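The relationship between the two helpers can be sketched in user space as below: a default pte_advance_pfn() that takes a count, with pte_next_pfn() reduced to a wrapper. The pte_t layout and shift are assumptions for this example; architectures with unusual PFN encodings would override the default rather than use this arithmetic.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t val; } pte_t;	/* toy PTE, assumed layout */

#define PFN_PTE_SHIFT 12		/* assumption for this sketch */

static inline uint64_t pte_pfn(pte_t pte)
{
	return pte.val >> PFN_PTE_SHIFT;
}

/* Default: advance the PFN field by nr frames, keep the flag bits. */
static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
{
	pte.val += (uint64_t)nr << PFN_PTE_SHIFT;
	return pte;
}

/* The single-step helper becomes a thin wrapper. */
#define pte_next_pfn(pte) pte_advance_pfn(pte, 1)

/* Usage: jump 16 frames at once, or one frame via the wrapper. */
void pte_advance_pfn_demo(void)
{
	pte_t pte = { (0x1000ULL << PFN_PTE_SHIFT) | 0x5 };

	assert(pte_pfn(pte_advance_pfn(pte, 16)) == 0x1010);
	assert((pte_advance_pfn(pte, 16).val & 0xfff) == 0x5);
	assert(pte_pfn(pte_next_pfn(pte)) == 0x1001);
}
```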
arm64/mm: convert pte_next_pfn() to pte_advance_pfn()
BugLink: https://bugs.launchpad.net/bugs/2059316 Core-mm needs to be able to advance the pfn by an arbitrary amount, so override the new pte_advance_pfn() API to do so. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit c1bd2b4) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit e83382a
x86/mm: convert pte_next_pfn() to pte_advance_pfn()
BugLink: https://bugs.launchpad.net/bugs/2059316 Core-mm needs to be able to advance the pfn by an arbitrary amount, so override the new pte_advance_pfn() API to do so. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 506b586) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 7e321a1
mm: tidy up pte_next_pfn() definition
BugLink: https://bugs.launchpad.net/bugs/2059316 Now that all the architecture overrides of pte_next_pfn() have been replaced with pte_advance_pfn(), we can simplify the definition of the generic pte_next_pfn() macro so that it is unconditionally defined. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit fb23bf6) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 40eedae
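The relationship between pte_advance_pfn() and pte_next_pfn() described in the two commits above can be illustrated with a small user-space model. This is only a sketch: `toy_pte_t`, `TOY_PFN_SHIFT` and the `toy_*` helpers are hypothetical stand-ins and not the kernel's definitions; the point is that the "next" helper is simply the "advance" helper with a count of 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy layout: attribute bits below TOY_PFN_SHIFT, pfn above it. */
#define TOY_PFN_SHIFT 12

typedef struct { uint64_t val; } toy_pte_t;

static inline uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte.val >> TOY_PFN_SHIFT;
}

/* Advance the pfn by an arbitrary count, leaving the attribute bits untouched. */
static inline toy_pte_t toy_pte_advance_pfn(toy_pte_t pte, unsigned long nr)
{
	/* Guard against the pfn field wrapping in this toy model. */
	assert(toy_pte_pfn(pte) + nr >= toy_pte_pfn(pte));
	pte.val += (uint64_t)nr << TOY_PFN_SHIFT;
	return pte;
}

/* Unconditional definition: "next" is just "advance by one". */
#define toy_pte_next_pfn(pte) toy_pte_advance_pfn((pte), 1)
```

With a pte whose pfn is 5, `toy_pte_next_pfn()` yields pfn 6 and `toy_pte_advance_pfn(pte, 16)` yields pfn 21, with the low attribute bits preserved in both cases.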
arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep)
BugLink: https://bugs.launchpad.net/bugs/2059316 There are a number of places in the arch code that read a pte by using the READ_ONCE() macro. Refactor these call sites to instead use the ptep_get() helper, which itself is a READ_ONCE(). Generated code should be the same. This will benefit us when we shortly introduce the transparent contpte support. In this case, ptep_get() will become more complex so we now have all the code abstracted through it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 5327365) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit c66a4f2
arm64/mm: convert set_pte_at() to set_ptes(..., 1)
BugLink: https://bugs.launchpad.net/bugs/2059316 Since set_ptes() was introduced, set_pte_at() has been implemented as a generic macro around set_ptes(..., 1). So this change should continue to generate the same code. However, making this change prepares us for the transparent contpte support. It means we can reroute set_ptes() to __set_ptes(). Since set_pte_at() is a generic macro, there will be no equivalent __set_pte_at() to reroute to. Note that a couple of calls to set_pte_at() remain in the arch code. This is intentional, since those call sites are acting on behalf of core-mm and should continue to call into the public set_ptes() rather than the arch-private __set_ptes(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 659e193) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 69ca841
arm64/mm: convert ptep_clear() to ptep_get_and_clear()
BugLink: https://bugs.launchpad.net/bugs/2059316 ptep_clear() is a generic wrapper around the arch-implemented ptep_get_and_clear(). We are about to convert ptep_get_and_clear() into a public version and private version (__ptep_get_and_clear()) to support the transparent contpte work. We won't have a private version of ptep_clear() so let's convert it to directly call ptep_get_and_clear(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit cbb0294) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 0cf2dbe
arm64/mm: new ptep layer to manage contig bit
BugLink: https://bugs.launchpad.net/bugs/2059316 Create a new layer for the in-table PTE manipulation APIs. For now, the existing API is prefixed with double underscore to become the arch-private API and the public API is just a simple wrapper that calls the private API. The public API implementation will subsequently be used to transparently manipulate the contiguous bit where appropriate. But since there are already some contig-aware users (e.g. hugetlb, kernel mapper), we must first ensure those users use the private API directly so that the future contig-bit manipulations in the public API do not interfere with those existing uses. The following APIs are treated this way: - ptep_get - set_pte - set_ptes - pte_clear - ptep_get_and_clear - ptep_test_and_clear_young - ptep_clear_flush_young - ptep_set_wrprotect - ptep_set_access_flags Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 5a00bfd) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 14b21c0
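The public/private layering introduced by the commit above can be sketched in plain C. This is a user-space illustration under stated assumptions, not the kernel code: `toy_pte_t` and the `priv_*`/`pub_*` names are hypothetical (the kernel uses a double-underscore prefix for the private variants), and the public wrappers do nothing extra yet, which matches the "simple wrapper" stage the message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t val; } toy_pte_t;

/* Private layer: acts on exactly the entry it is given. Contig-aware
 * callers (hugetlb, the kernel mapper) would call this layer directly. */
static inline toy_pte_t priv_ptep_get(const toy_pte_t *ptep)
{
	return *ptep;
}

static inline void priv_set_pte(toy_pte_t *ptep, toy_pte_t pte)
{
	*ptep = pte;
}

/* Public layer: thin wrappers for now. This is the place where later
 * changes could add contiguous-bit handling without touching callers. */
static inline toy_pte_t pub_ptep_get(const toy_pte_t *ptep)
{
	assert(ptep != NULL);
	return priv_ptep_get(ptep);
}

static inline void pub_set_pte(toy_pte_t *ptep, toy_pte_t pte)
{
	assert(ptep != NULL);
	priv_set_pte(ptep, pte);
}
```

The design choice this models is that callers of the public names keep compiling unchanged while the behaviour behind them can later diverge from the private layer.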
arm64/mm: split __flush_tlb_range() to elide trailing DSB
BugLink: https://bugs.launchpad.net/bugs/2059316 Split __flush_tlb_range() into __flush_tlb_range_nosync() + __flush_tlb_range(), in the same way as the existing flush_tlb_page() arrangement. This allows calling __flush_tlb_range_nosync() to elide the trailing DSB. Forthcoming "contpte" code will take advantage of this when clearing the young bit from a contiguous range of ptes. Ordering between dsb and mmu_notifier_arch_invalidate_secondary_tlbs() has changed, but now aligns with the ordering of __flush_tlb_page(). It has been discussed that __flush_tlb_page() may be wrong though. Regardless, both will be resolved separately if needed. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit d9d8dc2) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit bb072bc
NVIDIA: [Config] arm64: ARM64_CONTPTE=y
BugLink: https://bugs.launchpad.net/bugs/2059316 Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 21a8200
arm64/mm: wire up PTE_CONT for user mappings
BugLink: https://bugs.launchpad.net/bugs/2059316 With the ptep API sufficiently refactored, we can now introduce a new "contpte" API layer, which transparently manages the PTE_CONT bit for user mappings. In this initial implementation, only suitable batches of PTEs, set via set_ptes(), are mapped with the PTE_CONT bit. Any subsequent modification of individual PTEs will cause an "unfold" operation to repaint the contpte block as individual PTEs before performing the requested operation. While a modification of a single PTE could cause the block of PTEs to which it belongs to become eligible for "folding" into a contpte entry, "folding" is not performed in this initial implementation due to the cost of checking that the requirements are met. Due to this, contpte mappings will degrade back to normal pte mappings over time if/when protections are changed. This will be solved in a future patch. Since a contpte block only has a single access and dirty bit, the semantic here changes slightly; when getting a pte (e.g. ptep_get()) that is part of a contpte mapping, the access and dirty information are pulled from the block (so all ptes in the block return the same access/dirty info). When changing the access/dirty info on a pte (e.g. ptep_set_access_flags()) that is part of a contpte mapping, this change will affect the whole contpte block. This works fine in practice since we guarantee that only a single folio is mapped by a contpte block, and the core-mm tracks access/dirty information per folio. In order for the public functions, which used to be pure inline, to continue to be callable by modules, export all the contpte_* symbols that are now called by those public inline functions. The feature is enabled/disabled with the ARM64_CONTPTE Kconfig parameter at build time. It defaults to enabled as long as its dependency, TRANSPARENT_HUGEPAGE, is also enabled. The core-mm depends upon TRANSPARENT_HUGEPAGE to be able to allocate large folios, so if it is not enabled, then there is no chance of meeting the physical contiguity requirement for contpte mappings. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 4602e57) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit eb0862c
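The changed access/dirty semantics described in the commit above (every pte in a contpte block reports the block's combined access and dirty state) can be modelled in user space. This is a sketch under assumptions: `toy_pte_t`, the bit positions, the block size of 16 and the index-based table are all chosen for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CONT_PTES 16            /* assumed entries per contpte block */
#define TOY_PTE_CONT  (1ULL << 52)  /* assumed contiguous-bit position */
#define TOY_PTE_AF    (1ULL << 10)  /* assumed access-flag position */
#define TOY_PTE_DIRTY (1ULL << 55)  /* assumed dirty-bit position */

typedef struct { uint64_t val; } toy_pte_t;

/* Read entry idx. If it belongs to a contpte block, OR in the access and
 * dirty bits of every entry in that block, so all entries in the block
 * report the same access/dirty information. */
static inline toy_pte_t toy_contpte_ptep_get(const toy_pte_t *table, size_t idx)
{
	toy_pte_t pte;
	size_t start, i;

	assert(table != NULL);
	pte = table[idx];
	if (!(pte.val & TOY_PTE_CONT))
		return pte;

	/* Round down to the first entry of the naturally aligned block. */
	start = idx & ~(size_t)(TOY_CONT_PTES - 1);
	for (i = 0; i < TOY_CONT_PTES; i++)
		pte.val |= table[start + i].val & (TOY_PTE_AF | TOY_PTE_DIRTY);

	return pte;
}
```

In this model, marking one entry of a block accessed and another dirty makes a read of any entry in the block return both bits, while entries outside a contpte block are returned unchanged. The loop also shows why a single read costs a whole block's worth of loads.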
arm64/mm: implement new wrprotect_ptes() batch API
BugLink: https://bugs.launchpad.net/bugs/2059316 Optimize the contpte implementation to fix some of the fork performance regression introduced by the initial contpte commit. Subsequent patches will solve it entirely. During fork(), any private memory in the parent must be write-protected. Previously this was done 1 PTE at a time. But the core-mm supports batched wrprotect via the new wrprotect_ptes() API. So let's implement that API and for fully covered contpte mappings, we no longer need to unfold the contpte. This has 2 benefits: - Reduced unfolding, which reduces the number of tlbis that must be issued. - The memory remains contpte-mapped ("folded") in the parent, so it continues to benefit from the more efficient use of the TLB after the fork. The optimization to wrprotect a whole contpte block without unfolding is possible thanks to the tightening of the Arm ARM in respect to the definition and behaviour when 'Misprogramming the Contiguous bit'. See section D21194 at https://developer.arm.com/documentation/102105/ja-07/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 311a6cf) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 7ccca5c
arm64/mm: implement new [get_and_]clear_full_ptes() batch APIs
BugLink: https://bugs.launchpad.net/bugs/2059316 Optimize the contpte implementation to fix some of the exit/munmap/dontneed performance regression introduced by the initial contpte commit. Subsequent patches will solve it entirely. During exit(), munmap() or madvise(MADV_DONTNEED), mappings must be cleared. Previously this was done 1 PTE at a time. But the core-mm supports batched clear via the new [get_and_]clear_full_ptes() APIs. So let's implement those APIs and for fully covered contpte mappings, we no longer need to unfold the contpte. This significantly reduces unfolding operations, reducing the number of tlbis that must be issued. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 6b1e4ef) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 0ddab60
mm: add pte_batch_hint() to reduce scanning in folio_pte_batch()
BugLink: https://bugs.launchpad.net/bugs/2059316 Some architectures (e.g. arm64) can tell from looking at a pte, if some follow-on ptes also map contiguous physical memory with the same pgprot (for arm64, these are contpte mappings). Take advantage of this knowledge to optimize folio_pte_batch() so that it can skip these ptes when scanning to create a batch. By default, if an arch does not opt-in, pte_batch_hint() returns a compile-time 1, so the changes are optimized out and the behaviour is as before. arm64 will opt-in to providing this hint in the next patch, which will greatly reduce the cost of ptep_get() when scanning a range of contptes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Tested-by: John Hubbard <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit c6ec76a) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit b9beb7a
arm64/mm: implement pte_batch_hint()
BugLink: https://bugs.launchpad.net/bugs/2059316 When core code iterates over a range of ptes and calls ptep_get() for each of them, if the range happens to cover contpte mappings, the number of pte reads becomes amplified by a factor of the number of PTEs in a contpte block. This is because for each call to ptep_get(), the implementation must read all of the ptes in the contpte block to which it belongs to gather the access and dirty bits. This causes a hotspot for fork(), as well as operations that unmap memory such as munmap(), exit and madvise(MADV_DONTNEED). Fortunately we can fix this by implementing pte_batch_hint() which allows their iterators to skip getting the contpte tail ptes when gathering the batch of ptes to operate on. This results in the number of PTE reads returning to 1 per pte. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: Mark Rutland <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit fb5451e) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 2d583c7
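The batch hint described in the two commits above can be sketched as follows. This is a user-space model, not the kernel implementation: the table is indexed rather than addressed by pointer, and `toy_pte_t`, `TOY_CONT_PTES` and the bit position are illustrative assumptions. For an entry inside a contpte block, the hint is the number of entries left up to the end of that block; otherwise it is 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CONT_PTES 16            /* assumed block size (power of two) */
#define TOY_PTE_CONT  (1ULL << 52)  /* assumed contiguous-bit position */

typedef struct { uint64_t val; } toy_pte_t;

/* How many entries, starting at idx, are known to belong to the same batch. */
static inline unsigned int toy_pte_batch_hint(size_t idx, toy_pte_t pte)
{
	if (!(pte.val & TOY_PTE_CONT))
		return 1;
	return TOY_CONT_PTES - (unsigned int)(idx & (TOY_CONT_PTES - 1));
}

/* Count how many entries a scan actually reads when it honours the hint. */
static inline size_t toy_count_reads(const toy_pte_t *table, size_t nr)
{
	size_t reads = 0;
	size_t idx = 0;

	assert(table != NULL);
	while (idx < nr) {
		reads++;
		idx += toy_pte_batch_hint(idx, table[idx]);
	}
	return reads;
}
```

Scanning 32 entries that form two full contpte blocks takes 2 reads in this model, while scanning 32 entries with no contiguous bit set takes 32, which is the unoptimized behaviour.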
arm64/mm: __always_inline to improve fork() perf
BugLink: https://bugs.launchpad.net/bugs/2059316 As set_ptes() and wrprotect_ptes() become a bit more complex, the compiler may choose not to inline them. But this is critical for fork() performance. So mark the functions, along with contpte_try_unfold() which is called by them, as __always_inline. This is worth ~1% on the fork() microbenchmark with order-0 folios (the common case). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit b972fc6) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 266113f
arm64/mm: automatically fold contpte mappings
BugLink: https://bugs.launchpad.net/bugs/2059316 There are situations where a change to a single PTE could cause the contpte block in which it resides to become foldable (i.e. could be repainted with the contiguous bit). Such situations arise, for example, when user space temporarily changes protections, via mprotect, for individual pages, as can be the case for certain garbage collectors. We would like to detect when such a PTE change occurs. However this can be expensive due to the amount of checking required. Therefore only perform the checks when an individual PTE is modified via mprotect (ptep_modify_prot_commit() -> set_pte_at() -> set_ptes(nr=1)) and only when we are setting the final PTE in a contpte-aligned block. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit f0c2264) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit d95b60d
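The cheap pre-check implied by "only when we are setting the final PTE in a contpte-aligned block" can be sketched like this. It is an illustration under assumptions, not the patch itself: the page shift, block size and function name are hypothetical, and the additional requirement that the physical frame number is also the last one of an aligned block is an assumption added here so the example reflects physical contiguity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12  /* assumed 4K pages */
#define TOY_CONT_PTES  16  /* assumed entries per contpte block */

/* Return true only when the virtual address is the last page of a
 * naturally aligned block and (assumption) the pfn is too. Only then
 * would the expensive scan of the whole block be worth attempting. */
static inline bool toy_is_fold_candidate(uint64_t addr, uint64_t pfn)
{
	const uint64_t contmask = TOY_CONT_PTES - 1;

	/* The mask trick requires a power-of-two block size. */
	assert((TOY_CONT_PTES & (TOY_CONT_PTES - 1)) == 0);

	if (((addr >> TOY_PAGE_SHIFT) & contmask) != contmask)
		return false;
	return (pfn & contmask) == contmask;
}
```

With these values, a write to page 15 of a block backed by pfn 0x10f is a candidate, whereas a write to page 14, or to page 15 backed by a pfn that is not the last of an aligned group, is rejected without any further checking.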
arm64/mm: export contpte symbols only to GPL users
BugLink: https://bugs.launchpad.net/bugs/2059316 Patch series "Address some contpte nits". These 2 patches address some nits raised by Catalin late in the review cycle for my contpte series [1]. [1] https://lore.kernel.org/linux-mm/[email protected]/ This patch (of 2): The contpte symbols must be exported since some of the public inline ptep_* APIs are called from modules and these inlines now call the contpte functions. Originally they were exported as EXPORT_SYMBOL() for fear of breaking out-of-tree modules. But we subsequently concluded that EXPORT_SYMBOL_GPL() should be safe since these functions are deeply core mm routines, and any module operating at this level is not going to be able to survive on EXPORT_SYMBOL alone. Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/linux-mm/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: John Hubbard <[email protected]> Cc: Mark Rutland <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 912609e) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit b333d9d
arm64/mm: improve comment in contpte_ptep_get_lockless()
BugLink: https://bugs.launchpad.net/bugs/2059316 Make clear the atomicity/consistency requirements of the API and how we achieve them. Link: https://lore.kernel.org/linux-mm/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: John Hubbard <[email protected]> Cc: Mark Rutland <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit 94c18d5) Signed-off-by: dann frazier <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 846cdeb
NVIDIA: [Packaging] update nvidia-fs driver to latest version
BugLink: https://bugs.launchpad.net/bugs/2066955 Signed-off-by: Brad Figg <[email protected]> Acked-by: Jacob Martin <[email protected]> Acked-by: Noah Wager <[email protected]>
Commit 89dbe2f
UBUNTU: [Packaging] blacklist coresight_etm4x
BugLink: https://bugs.launchpad.net/bugs/2061930 BugLink: https://bugs.launchpad.net/bugs/2067106 There are systems in production that don't have firmware that supports coresight_etm4x. Instead of removing completely, blacklist coresight_etm4x so systems with the correct firmware can use the module. Signed-off-by: Ian May <[email protected]> Signed-off-by: Jamie Nguyen <[email protected]> Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 0eda9f4
tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer buffer
BugLink: https://bugs.launchpad.net/bugs/2067429 The TPM SPI transfer mechanism uses MAX_SPI_FRAMESIZE for computing the maximum transfer length and the size of the transfer buffer. As such, it does not account for the 4 bytes of header that precede the SPI data frame. This can result in out-of-bounds accesses and was confirmed with KASAN. Introduce SPI_HDRSIZE to account for the header and use it to allocate the transfer buffer. Fixes: a86a42a ("tpm_tis_spi: Add hardware wait polling") Signed-off-by: Matthew R. Ochs <[email protected]> Tested-by: Carol Soto <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]> (cherry picked from commit 195aba9) Acked-by: Brad Figg <[email protected]> Acked-by: Noah Wager <[email protected]> Acked-by: Jacob Martin <[email protected]> Signed-off-by: Brad Figg <[email protected]>
Commit 6f18a98
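The sizing issue fixed by the commit above can be shown with a small stand-alone example. It is a sketch only: the `TOY_*` constants mirror the 4-byte header from the message and assume a 64-byte maximum frame, and the header byte layout written here is hypothetical rather than the driver's. A buffer sized for the payload alone cannot hold a maximum-size frame once the header is prepended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_SPI_FRAMESIZE 64  /* assumed maximum payload per frame */
#define TOY_SPI_HDRSIZE       4   /* header bytes sent before the payload */
#define TOY_IOBUF_SIZE        (TOY_SPI_HDRSIZE + TOY_MAX_SPI_FRAMESIZE)

/* Build header + payload into buf. Returns the number of bytes written,
 * or 0 if the frame would not fit (instead of writing out of bounds). */
static inline size_t toy_build_frame(uint8_t *buf, size_t buflen, uint16_t addr,
				     const uint8_t *data, size_t len)
{
	assert(buf != NULL && data != NULL);

	if (len == 0 || len > TOY_MAX_SPI_FRAMESIZE)
		return 0;
	if (buflen < TOY_SPI_HDRSIZE + len)
		return 0;

	/* Hypothetical header layout, for illustration only. */
	buf[0] = (uint8_t)(len - 1);
	buf[1] = 0;
	buf[2] = (uint8_t)(addr >> 8);
	buf[3] = (uint8_t)addr;
	memcpy(buf + TOY_SPI_HDRSIZE, data, len);

	return TOY_SPI_HDRSIZE + len;
}
```

Calling this with a 64-byte buffer and a 64-byte payload returns 0, whereas a buffer of `TOY_IOBUF_SIZE` bytes accepts the same payload and yields a 68-byte frame.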
UBUNTU: [Packaging] update Ubuntu.md
BugLink: https://bugs.launchpad.net/bugs/1786013 Signed-off-by: Jacob Martin <[email protected]>
Commit 5e556d2
Ignore: yes Signed-off-by: Jacob Martin <[email protected]>
Commit 1835bae
UBUNTU: [Packaging] enable perf python module
BugLink: https://bugs.launchpad.net/bugs/2051560

The perf python module is required by some tools (e.g., tuned) and we are not currently providing it. Enable it to be able to support tools that require this module.

Signed-off-by: Andrea Righi <[email protected]>
Signed-off-by: Jacob Martin <[email protected]>
Commit: 5b3a63f
UBUNTU: [Packaging] add Real-time Linux Analysis tool (rtla) to linux-tools
BugLink: https://bugs.launchpad.net/bugs/2059080
Signed-off-by: Andrea Righi <[email protected]>
Signed-off-by: Jacob Martin <[email protected]>
Commit: d9f6493
UBUNTU: [Packaging] update dependencies for rtla
BugLink: https://bugs.launchpad.net/bugs/2059080

List the architectures where the build dependencies for rtla are needed to make sure that we don't introduce potential unresolved dependencies.

Signed-off-by: Andrea Righi <[email protected]>
Signed-off-by: Jacob Martin <[email protected]>
Commit: 8181674
UBUNTU: link-to-tracker: update tracking bug
BugLink: https://bugs.launchpad.net/bugs/2064335
Properties: no-test-build
Signed-off-by: Jacob Martin <[email protected]>
Commit: 9002939
UBUNTU: [Packaging] debian.nvidia/dkms-versions -- update from kernel-versions (main/2024.04.29)
BugLink: https://bugs.launchpad.net/bugs/1786013
Signed-off-by: Jacob Martin <[email protected]>
Commit: 4e6dbea
UBUNTU: Ubuntu-nvidia-6.8.0-1007.7
Signed-off-by: Jacob Martin <[email protected]>
Commit: 3928124
Commits on Jun 18, 2024
platform/mellanox: mlxbf-pmc: Replace uintN_t with kernel-style types
BugLink: https://bugs.launchpad.net/bugs/2069777

Use u8, u32 and u64 instead of respective uintN_t types. Remove unnecessary newlines for function argument lists.

Signed-off-by: Shravan Kumar Ramani <[email protected]>
Link: https://lore.kernel.org/r/39be055af3506ce6f843d11e45d71620f2a96e26.1707808180.git.shravankr@nvidia.com
Reviewed-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit fd23023)
Signed-off-by: David Thompson <[email protected]>
Commit: ad93d33
platform/mellanox: mlxbf-pmc: Cleanup signed/unsigned mix-up
BugLink: https://bugs.launchpad.net/bugs/2069777

Use unsigned integer types for register values and array indices. Use %u instead of %d accordingly.

Signed-off-by: Shravan Kumar Ramani <[email protected]>
Link: https://lore.kernel.org/r/d8548c70339a29258a906b2b518e5c48f669795c.1707808180.git.shravankr@nvidia.com
Reviewed-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit 1ae9ffd)
Signed-off-by: David Thompson <[email protected]>
Commit: 24174e0
platform/mellanox: mlxbf-pmc: mlxbf_pmc_event_list(): make size ptr optional
BugLink: https://bugs.launchpad.net/bugs/2069777

The mlxbf_pmc_event_list() function returns a pointer to an array of supported events and the array size. The array size is returned via a pointer passed as an argument, which is mandatory.

However, we want to be able to use mlxbf_pmc_event_list() just to check if a block name is implemented/supported. For this usage passing the size argument is not necessary so let's make it optional.

Signed-off-by: Luiz Capitulino <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/182de8ec6b9c33152f2ba6b248c35b0311abf5e4.1708635408.git.luizcap@redhat.com
Reviewed-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit 0d46439)
Signed-off-by: David Thompson <[email protected]>
Commit: 48abb25
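The optional out-parameter pattern described in the commit above can be sketched in plain C as follows. The names `event_list`, `struct pmc_event`, the `demo` block prefix and the event table are hypothetical stand-ins for illustration and are not taken from the driver.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical event descriptor, for illustration only. */
struct pmc_event {
	unsigned int num;
	const char *name;
};

/* Hypothetical event table for a made-up "demo" block type. */
static const struct pmc_event demo_events[] = {
	{ 0, "DISABLE" },
	{ 1, "CYCLES" },
};

/* Return the event table for a block name, or NULL if the block is not
 * supported. The size is reported only when the caller passes a non-NULL
 * psize, so a caller that only wants a "supported?" check may pass NULL. */
static const struct pmc_event *event_list(const char *blk, size_t *psize)
{
	const struct pmc_event *events = NULL;
	size_t size = 0;

	if (strncmp(blk, "demo", 4) == 0) {
		events = demo_events;
		size = ARRAY_SIZE(demo_events);
	}

	if (psize)
		*psize = size;

	return events;
}
```

A support check then reduces to `event_list(name, NULL) != NULL`, while callers that iterate over the table still pass a pointer to receive the size.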
platform/mellanox: mlxbf-pmc: Ignore unsupported performance blocks
BugLink: https://bugs.launchpad.net/bugs/2069777

Currently, the driver has two behaviors to deal with new & unsupported performance blocks reported by the firmware:

1. For register and unknown block types, the driver will fail to load with the following error message:

   [ 4510.956369] mlxbf-pmc: probe of MLNXBFD2:00 failed with error -22

2. For counter and crspace blocks, the driver will load and sysfs files will be created but getting the contents of event_list or trying to setup the counter will fail

Instead, let's ignore and log unsupported blocks. This means the driver will always load and unsupported blocks will never show up in sysfs.

Signed-off-by: Luiz Capitulino <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/f8e2e6210b43e825b69824b420c801cd513d401d.1708635408.git.luizcap@redhat.com
Reviewed-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit c0459ee)
Signed-off-by: David Thompson <[email protected]>
Commit: a02c3be
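The "ignore and log" behavior described in the commit above can be sketched as a loop that skips unknown blocks instead of aborting. This is a userspace illustration under stated assumptions: the block names, `known_blocks`, `block_supported` and `register_blocks` are all hypothetical and do not reflect the driver's actual data structures.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical set of block names this sketch knows how to handle. */
static const char *const known_blocks[] = { "tile", "trio" };

static int block_supported(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known_blocks) / sizeof(known_blocks[0]); i++)
		if (strcmp(name, known_blocks[i]) == 0)
			return 1;
	return 0;
}

/* Walk the firmware-reported block names. Unsupported blocks are logged and
 * skipped rather than failing the whole setup, so they are never exposed.
 * The caller provides a registered[] array with room for n entries.
 * Returns the number of blocks that were registered. */
static size_t register_blocks(const char *const *fw_blocks, size_t n,
			      const char **registered)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (!block_supported(fw_blocks[i])) {
			fprintf(stderr, "ignoring unsupported block: %s\n",
				fw_blocks[i]);
			continue;
		}
		registered[count++] = fw_blocks[i];
	}
	return count;
}
```

The design choice is that an unrecognized name costs one log line and nothing else: setup continues for the remaining blocks, and only supported blocks end up in the registered list.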
platform/mellanox: mlxbf-pmc: fix signedness bugs
BugLink: https://bugs.launchpad.net/bugs/2069777

These need to be signed for the error handling to work. The mlxbf_pmc_get_event_num() function returns int so int type is correct.

Fixes: 1ae9ffd ("platform/mellanox: mlxbf-pmc: Cleanup signed/unsigned mix-up")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit 7c8772f)
Signed-off-by: David Thompson <[email protected]>
Commit: 39db741
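The signedness problem fixed by the commit above is a general C pitfall and can be shown outside the kernel. In this sketch `get_event_num`, `lookup_signed` and `stored_unsigned` are hypothetical helpers written for the example; only the idea that the lookup returns an int that may be a negative error code comes from the commit message.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical lookup: returns an event number, or -EINVAL if unknown. */
static int get_event_num(const char *name)
{
	if (strcmp(name, "CYCLES") == 0)
		return 1;
	return -EINVAL;
}

/* Correct handling: keep the result in an int so the sign survives and the
 * error check works, then convert once the value is known to be valid. */
static int lookup_signed(const char *name, unsigned int *evt)
{
	int ret = get_event_num(name);

	if (ret < 0)
		return ret;
	*evt = (unsigned int)ret;
	return 0;
}

/* What goes wrong: the same return value stored in an unsigned variable
 * turns a negative error code into a large positive number, so a later
 * "less than zero" check can never be true. */
static unsigned int stored_unsigned(const char *name)
{
	unsigned int evt = (unsigned int)get_event_num(name);

	return evt;
}
```

Calling `stored_unsigned` with an unknown name yields a value above `INT_MAX` rather than an error, which is why the variables that receive the lookup result need to stay signed.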