From 739e9b39c00c056b9ce251b05a85c662d61deb4f Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:58 +0000 Subject: [PATCH 001/548] arm64/mm: dplit __flush_tlb_range() to elide trailing DSB Split __flush_tlb_range() into __flush_tlb_range_nosync() + __flush_tlb_range(), in the same way as the existing flush_tlb_page() arrangement. This allows calling __flush_tlb_range_nosync() to elide the trailing DSB. Forthcoming "contpte" code will take advantage of this when clearing the young bit from a contiguous range of ptes. Ordering between dsb and mmu_notifier_arch_invalidate_secondary_tlbs() has changed, but now aligns with the ordering of __flush_tlb_page(). It has been discussed that __flush_tlb_page() may be wrong though. Regardless, both will be resolved separately if needed. Link: https://lkml.kernel.org/r/20240215103205.2607016-12-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: David Hildenbrand Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit d9d8dc2bd3fb2689309f704fe85e6dde2b1bd73a) Signed-off-by: Koba Ko --- arch/arm64/include/asm/tlbflush.h | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index 6eeb56b6fac13..4fb38cb04c35b 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -405,7 +405,7 @@ do { \ #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \ __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false) -static inline void __flush_tlb_range(struct vm_area_struct *vma, +static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma, unsigned long start, unsigned long end, unsigned long stride, bool last_level, int tlb_level) @@ -437,10 +437,19 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, else __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true); - dsb(ish); mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end); } +static inline void __flush_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, + unsigned long stride, bool last_level, + int tlb_level) +{ + __flush_tlb_range_nosync(vma, start, end, stride, + last_level, tlb_level); + dsb(ish); +} + static inline void flush_tlb_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { From 962959cbe8ed3205a0c1b79343b7fc33f910a81f Mon Sep 17 00:00:00 2001 From: Vidya Sagar Date: Tue, 16 Jan 2024 20:02:58 +0530 Subject: [PATCH 002/548] PCI: Clear Secondary Status errors after enumeration We enumerate devices by attempting config reads to the Vendor ID of each possible device. On conventional PCI, if no device responds, the read terminates with a Master Abort (PCI r3.0, sec 6.1). On PCIe, the config read is terminated as an Unsupported Request (PCIe r6.0, sec 2.3.2, 7.5.1.3.7). In either case, if the read addressed a device below a bridge, it is logged by setting "Received Master Abort" in the bridge Secondary Status register. Clear any errors logged in the Secondary Status register after enumeration. Link: https://lore.kernel.org/r/20240116143258.483235-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar [bhelgaas: simplify commit log] Signed-off-by: Bjorn Helgaas (cherry picked from commit 7bf9d2af7e89f65a79225e26d261b52ce4ee3e95) Signed-off-by: Koba Ko --- drivers/pci/probe.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index b7cec139d816b..805e8e864c90b 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1477,6 +1477,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev, } out: + /* Clear errors in the Secondary Status Register */ + pci_write_config_word(dev, PCI_SEC_STATUS, 0xffff); + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, bctl); pm_runtime_put(&dev->dev); From 089da61372f03e096aaa92af95054f9b8a9af2db Mon Sep 17 00:00:00 2001 From: Vidya Sagar Date: Tue, 25 Jun 2024 21:01:50 +0530 Subject: [PATCH 003/548] PCI: Extend ACS configurability PCIe ACS settings control the level of isolation and the possible P2P paths between devices. With greater isolation the kernel will create smaller iommu_groups and with less isolation there is more HW that can achieve P2P transfers. From a virtualization perspective all devices in the same iommu_group must be assigned to the same VM as they lack security isolation. There is no way for the kernel to automatically know the correct ACS settings for any given system and workload. Existing command line options (e.g., disable_acs_redir) allow only for large scale change, disabling all isolation, but this is not sufficient for more complex cases. Add a kernel command-line option 'config_acs' to directly control all the ACS bits for specific devices, which allows the operator to setup the right level of isolation to achieve the desired P2P configuration. The definition is future proof; when new ACS bits are added to the spec the open syntax can be extended. ACS needs to be setup early in the kernel boot as the ACS settings affect how iommu_groups are formed. iommu_group formation is a one time event during initial device discovery, so changing ACS bits after kernel boot can result in an inaccurate view of the iommu_groups compared to the current isolation configuration. ACS applies to PCIe Downstream Ports and multi-function devices. The default ACS settings are strict and deny any direct traffic between two functions. This results in the smallest iommu_group the HW can support. Frequently these values result in slow or non-working P2PDMA. ACS offers a range of security choices controlling how traffic is allowed to go directly between two devices. Some popular choices: - Full prevention - Translated requests can be direct, with various options - Asymmetric direct traffic, A can reach B but not the reverse - All traffic can be direct Along with some other less common ones for special topologies. The intention is that this option would be used with expert knowledge of the HW capability and workload to achieve the desired configuration. Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar [bhelgaas: add example, tidy printk formats] Signed-off-by: Bjorn Helgaas (cherry picked from commit 47c8846a49baa8c0b7a6a3e7e7eacd6e8d119d25) Signed-off-by: Koba Ko --- .../admin-guide/kernel-parameters.txt | 32 ++++ drivers/pci/pci.c | 148 +++++++++++------- 2 files changed, 122 insertions(+), 58 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index bcfa49019c3f1..00551b4768f05 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4496,6 +4496,38 @@ bridges without forcing it upstream. Note: this removes isolation between devices and may put more devices in an IOMMU group. + config_acs= + Format: + @[; ...] + Specify one or more PCI devices (in the format + specified above) optionally prepended with flags + and separated by semicolons. The respective + capabilities will be enabled, disabled or + unchanged based on what is specified in + flags. + + ACS Flags is defined as follows: + bit-0 : ACS Source Validation + bit-1 : ACS Translation Blocking + bit-2 : ACS P2P Request Redirect + bit-3 : ACS P2P Completion Redirect + bit-4 : ACS Upstream Forwarding + bit-5 : ACS P2P Egress Control + bit-6 : ACS Direct Translated P2P + Each bit can be marked as: + '0' – force disabled + '1' – force enabled + 'x' – unchanged + For example, + pci=config_acs=10x + would configure all devices that support + ACS to enable P2P Request Redirect, disable + Translation Blocking, and leave Source + Validation unchanged from whatever power-up + or firmware set it to. + + Note: this may remove isolation between devices + and may put more devices in an IOMMU group. force_floating [S390] Force usage of floating interrupts. nomio [S390] Do not use MIO instructions. norid [S390] ignore the RID field and force use of diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 4541dfbf0e1b6..16ac493a439ac 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -887,30 +887,67 @@ void pci_request_acs(void) } static const char *disable_acs_redir_param; +static const char *config_acs_param; -/** - * pci_disable_acs_redir - disable ACS redirect capabilities - * @dev: the PCI device - * - * For only devices specified in the disable_acs_redir parameter. - */ -static void pci_disable_acs_redir(struct pci_dev *dev) +struct pci_acs { + u16 cap; + u16 ctrl; + u16 fw_ctrl; +}; + +static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps, + const char *p, u16 mask, u16 flags) { + char *delimit; int ret = 0; - const char *p; - int pos; - u16 ctrl; - if (!disable_acs_redir_param) + if (!p) return; - p = disable_acs_redir_param; while (*p) { + if (!mask) { + /* Check for ACS flags */ + delimit = strstr(p, "@"); + if (delimit) { + int end; + u32 shift = 0; + + end = delimit - p - 1; + + while (end > -1) { + if (*(p + end) == '0') { + mask |= 1 << shift; + shift++; + end--; + } else if (*(p + end) == '1') { + mask |= 1 << shift; + flags |= 1 << shift; + shift++; + end--; + } else if ((*(p + end) == 'x') || (*(p + end) == 'X')) { + shift++; + end--; + } else { + pci_err(dev, "Invalid ACS flags... Ignoring\n"); + return; + } + } + p = delimit + 1; + } else { + pci_err(dev, "ACS Flags missing\n"); + return; + } + } + + if (mask & ~(PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR | + PCI_ACS_UF | PCI_ACS_EC | PCI_ACS_DT)) { + pci_err(dev, "Invalid ACS flags specified\n"); + return; + } + ret = pci_dev_str_match(dev, p, &p); if (ret < 0) { - pr_info_once("PCI: Can't parse disable_acs_redir parameter: %s\n", - disable_acs_redir_param); - + pr_info_once("PCI: Can't parse ACS command line parameter\n"); break; } else if (ret == 1) { /* Found a match */ @@ -930,56 +967,38 @@ static void pci_disable_acs_redir(struct pci_dev *dev) if (!pci_dev_specific_disable_acs_redir(dev)) return; - pos = dev->acs_cap; - if (!pos) { - pci_warn(dev, "cannot disable ACS redirect for this hardware as it does not have ACS capabilities\n"); - return; - } - - pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl); + pci_dbg(dev, "ACS mask = %#06x\n", mask); + pci_dbg(dev, "ACS flags = %#06x\n", flags); - /* P2P Request & Completion Redirect */ - ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC); + /* If mask is 0 then we copy the bit from the firmware setting. */ + caps->ctrl = (caps->ctrl & ~mask) | (caps->fw_ctrl & mask); + caps->ctrl |= flags; - pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl); - - pci_info(dev, "disabled ACS redirect\n"); + pci_info(dev, "Configured ACS to %#06x\n", caps->ctrl); } /** * pci_std_enable_acs - enable ACS on devices using standard ACS capabilities * @dev: the PCI device + * @caps: default ACS controls */ -static void pci_std_enable_acs(struct pci_dev *dev) +static void pci_std_enable_acs(struct pci_dev *dev, struct pci_acs *caps) { - int pos; - u16 cap; - u16 ctrl; - - pos = dev->acs_cap; - if (!pos) - return; - - pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap); - pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl); - /* Source Validation */ - ctrl |= (cap & PCI_ACS_SV); + caps->ctrl |= (caps->cap & PCI_ACS_SV); /* P2P Request Redirect */ - ctrl |= (cap & PCI_ACS_RR); + caps->ctrl |= (caps->cap & PCI_ACS_RR); /* P2P Completion Redirect */ - ctrl |= (cap & PCI_ACS_CR); + caps->ctrl |= (caps->cap & PCI_ACS_CR); /* Upstream Forwarding */ - ctrl |= (cap & PCI_ACS_UF); + caps->ctrl |= (caps->cap & PCI_ACS_UF); /* Enable Translation Blocking for external devices and noats */ if (pci_ats_disabled() || dev->external_facing || dev->untrusted) - ctrl |= (cap & PCI_ACS_TB); - - pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl); + caps->ctrl |= (caps->cap & PCI_ACS_TB); } /** @@ -988,23 +1007,33 @@ static void pci_std_enable_acs(struct pci_dev *dev) */ static void pci_enable_acs(struct pci_dev *dev) { - if (!pci_acs_enable) - goto disable_acs_redir; + struct pci_acs caps; + int pos; + + pos = dev->acs_cap; + if (!pos) + return; - if (!pci_dev_specific_enable_acs(dev)) - goto disable_acs_redir; + pci_read_config_word(dev, pos + PCI_ACS_CAP, &caps.cap); + pci_read_config_word(dev, pos + PCI_ACS_CTRL, &caps.ctrl); + caps.fw_ctrl = caps.ctrl; - pci_std_enable_acs(dev); + /* If an iommu is present we start with kernel default caps */ + if (pci_acs_enable) { + if (pci_dev_specific_enable_acs(dev)) + pci_std_enable_acs(dev, &caps); + } -disable_acs_redir: /* - * Note: pci_disable_acs_redir() must be called even if ACS was not - * enabled by the kernel because it may have been enabled by - * platform firmware. So if we are told to disable it, we should - * always disable it after setting the kernel's default - * preferences. + * Always apply caps from the command line, even if there is no iommu. + * Trust that the admin has a reason to change the ACS settings. */ - pci_disable_acs_redir(dev); + __pci_config_acs(dev, &caps, disable_acs_redir_param, + PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC, + ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC)); + __pci_config_acs(dev, &caps, config_acs_param, 0, 0); + + pci_write_config_word(dev, pos + PCI_ACS_CTRL, caps.ctrl); } /** @@ -7100,6 +7129,8 @@ static int __init pci_setup(char *str) pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS); } else if (!strncmp(str, "disable_acs_redir=", 18)) { disable_acs_redir_param = str + 18; + } else if (!strncmp(str, "config_acs=", 11)) { + config_acs_param = str + 11; } else { pr_err("PCI: Unknown option `%s'\n", str); } @@ -7124,6 +7155,7 @@ static int __init pci_realloc_setup_params(void) resource_alignment_param = kstrdup(resource_alignment_param, GFP_KERNEL); disable_acs_redir_param = kstrdup(disable_acs_redir_param, GFP_KERNEL); + config_acs_param = kstrdup(config_acs_param, GFP_KERNEL); return 0; } From 6e90c7d0da2740c17ac096ffd4abfa7431c912d3 Mon Sep 17 00:00:00 2001 From: Gavin Shan Date: Fri, 5 Apr 2024 13:58:51 +1000 Subject: [PATCH 004/548] arm64: tlb: Improve __TLBI_VADDR_RANGE() The macro returns the operand of TLBI RANGE instruction. A mask needs to be applied to each individual field upon producing the operand, to avoid the adjacent fields can interfere with each other when invalid arguments have been provided. The code looks more tidy at least with a mask and FIELD_PREP(). Suggested-by: Marc Zyngier Signed-off-by: Gavin Shan Reviewed-by: Ryan Roberts Reviewed-by: Catalin Marinas Reviewed-by: Anshuman Khandual Reviewed-by: Shaoqin Huang Link: https://lore.kernel.org/r/20240405035852.1532010-3-gshan@redhat.com Signed-off-by: Will Deacon (cherry picked from commit e07255d69702bc9131427fda8f9749355b10780f) Signed-off-by: Koba Ko --- arch/arm64/include/asm/tlbflush.h | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index 4fb38cb04c35b..2a280f4cab4e2 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -134,16 +134,24 @@ static inline unsigned long get_trans_granule(void) * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE) * */ -#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \ - ({ \ - unsigned long __ta = (addr) >> PAGE_SHIFT; \ - __ta &= GENMASK_ULL(36, 0); \ - __ta |= (unsigned long)(ttl) << 37; \ - __ta |= (unsigned long)(num) << 39; \ - __ta |= (unsigned long)(scale) << 44; \ - __ta |= get_trans_granule() << 46; \ - __ta |= (unsigned long)(asid) << 48; \ - __ta; \ +#define TLBIR_ASID_MASK GENMASK_ULL(63, 48) +#define TLBIR_TG_MASK GENMASK_ULL(47, 46) +#define TLBIR_SCALE_MASK GENMASK_ULL(45, 44) +#define TLBIR_NUM_MASK GENMASK_ULL(43, 39) +#define TLBIR_TTL_MASK GENMASK_ULL(38, 37) +#define TLBIR_BADDR_MASK GENMASK_ULL(36, 0) + +#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl) \ + ({ \ + unsigned long __ta = 0; \ + unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \ + __ta |= FIELD_PREP(TLBIR_BADDR_MASK, baddr); \ + __ta |= FIELD_PREP(TLBIR_TTL_MASK, __ttl); \ + __ta |= FIELD_PREP(TLBIR_NUM_MASK, num); \ + __ta |= FIELD_PREP(TLBIR_SCALE_MASK, scale); \ + __ta |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule()); \ + __ta |= FIELD_PREP(TLBIR_ASID_MASK, asid); \ + __ta; \ }) /* These macros are used by the TLBI RANGE feature. */ From 04b68d09d0503cf41dc1f8fa57d665cf1d871735 Mon Sep 17 00:00:00 2001 From: Gavin Shan Date: Fri, 5 Apr 2024 13:58:52 +1000 Subject: [PATCH 005/548] arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES MAX_TLBI_RANGE_PAGES pages is covered by SCALE#3 and NUM#31 and it's supported now. Allow TLBI RANGE operation when the number of pages is equal to MAX_TLBI_RANGE_PAGES in __flush_tlb_range_nosync(). Suggested-by: Marc Zyngier Signed-off-by: Gavin Shan Reviewed-by: Anshuman Khandual Reviewed-by: Ryan Roberts Reviewed-by: Catalin Marinas Reviewed-by: Shaoqin Huang Link: https://lore.kernel.org/r/20240405035852.1532010-4-gshan@redhat.com Signed-off-by: Will Deacon (cherry picked from commit 73301e464a72a0d007d0d4e0f4d3dab5c58125bf) Signed-off-by: Koba Ko --- arch/arm64/include/asm/tlbflush.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index 2a280f4cab4e2..c9887f4773cd9 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -428,11 +428,11 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma, * When not uses TLB range ops, we can handle up to * (MAX_TLBI_OPS - 1) pages; * When uses TLB range ops, we can handle up to - * (MAX_TLBI_RANGE_PAGES - 1) pages. + * MAX_TLBI_RANGE_PAGES pages. */ if ((!system_supports_tlb_range() && (end - start) >= (MAX_TLBI_OPS * stride)) || - pages >= MAX_TLBI_RANGE_PAGES) { + pages > MAX_TLBI_RANGE_PAGES) { flush_tlb_mm(vma->vm_mm); return; } From e18e9a5b71c114b36128df123cc617dcbc1e5f1d Mon Sep 17 00:00:00 2001 From: "David E. Box" Date: Fri, 23 Feb 2024 14:58:47 -0600 Subject: [PATCH 006/548] PCI/ASPM: Move pci_configure_ltr() to aspm.c The Latency Tolerance Reporting (LTR) mechanism supports the ASPM L1.2 state and is only configured when CONFIG_PCIEASPM is set. Move pci_configure_ltr() and pci_bridge_reconfigure_ltr() into aspm.c since they only build when CONFIG_PCIEASPM is set. No functional change intended. Suggested-by: Bjorn Helgaas Link: https://lore.kernel.org/r/20240128233212.1139663-2-david.e.box@linux.intel.com [bhelgaas: commit log, split build change from function moves] Link: https://lore.kernel.org/r/20240223205851.114931-2-helgaas@kernel.org Signed-off-by: David E. Box Signed-off-by: Bjorn Helgaas (cherry picked from commit fa84f4435a6202dd90248517f41e54bf3fb85bc5) Signed-off-by: Koba Ko --- drivers/pci/pci.c | 18 ---------- drivers/pci/pci.h | 5 ++- drivers/pci/pcie/aspm.c | 75 +++++++++++++++++++++++++++++++++++++++++ drivers/pci/probe.c | 61 --------------------------------- 4 files changed, 79 insertions(+), 80 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 16ac493a439ac..3238af39cae49 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1623,24 +1623,6 @@ static int pci_save_pcie_state(struct pci_dev *dev) return 0; } -void pci_bridge_reconfigure_ltr(struct pci_dev *dev) -{ -#ifdef CONFIG_PCIEASPM - struct pci_dev *bridge; - u32 ctl; - - bridge = pci_upstream_bridge(dev); - if (bridge && bridge->ltr_path) { - pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2, &ctl); - if (!(ctl & PCI_EXP_DEVCTL2_LTR_EN)) { - pci_dbg(bridge, "re-enabling LTR\n"); - pcie_capability_set_word(bridge, PCI_EXP_DEVCTL2, - PCI_EXP_DEVCTL2_LTR_EN); - } - } -#endif -} - static void pci_restore_pcie_state(struct pci_dev *dev) { int i = 0; diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index d69a17947ffce..980d3980bd3a0 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -98,7 +98,6 @@ void pci_msi_init(struct pci_dev *dev); void pci_msix_init(struct pci_dev *dev); bool pci_bridge_d3_possible(struct pci_dev *dev); void pci_bridge_d3_update(struct pci_dev *dev); -void pci_bridge_reconfigure_ltr(struct pci_dev *dev); int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type); static inline void pci_wakeup_event(struct pci_dev *dev) @@ -567,11 +566,15 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev); void pcie_aspm_exit_link_state(struct pci_dev *pdev); void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked); void pcie_aspm_powersave_config_link(struct pci_dev *pdev); +void pci_configure_ltr(struct pci_dev *pdev); +void pci_bridge_reconfigure_ltr(struct pci_dev *pdev); #else static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { } static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { } static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { } +static inline void pci_configure_ltr(struct pci_dev *pdev) { } +static inline void pci_bridge_reconfigure_ltr(struct pci_dev *pdev) { } #endif #ifdef CONFIG_PCIE_ECRC diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 4e995ca4de01b..3216f58b6285f 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -936,6 +936,81 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev) up_read(&pci_bus_sem); } +void pci_bridge_reconfigure_ltr(struct pci_dev *pdev) +{ + struct pci_dev *bridge; + u32 ctl; + + bridge = pci_upstream_bridge(pdev); + if (bridge && bridge->ltr_path) { + pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2, &ctl); + if (!(ctl & PCI_EXP_DEVCTL2_LTR_EN)) { + pci_dbg(bridge, "re-enabling LTR\n"); + pcie_capability_set_word(bridge, PCI_EXP_DEVCTL2, + PCI_EXP_DEVCTL2_LTR_EN); + } + } +} + +void pci_configure_ltr(struct pci_dev *pdev) +{ + struct pci_host_bridge *host = pci_find_host_bridge(pdev->bus); + struct pci_dev *bridge; + u32 cap, ctl; + + if (!pci_is_pcie(pdev)) + return; + + /* Read L1 PM substate capabilities */ + pdev->l1ss = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_L1SS); + + pcie_capability_read_dword(pdev, PCI_EXP_DEVCAP2, &cap); + if (!(cap & PCI_EXP_DEVCAP2_LTR)) + return; + + pcie_capability_read_dword(pdev, PCI_EXP_DEVCTL2, &ctl); + if (ctl & PCI_EXP_DEVCTL2_LTR_EN) { + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) { + pdev->ltr_path = 1; + return; + } + + bridge = pci_upstream_bridge(pdev); + if (bridge && bridge->ltr_path) + pdev->ltr_path = 1; + + return; + } + + if (!host->native_ltr) + return; + + /* + * Software must not enable LTR in an Endpoint unless the Root + * Complex and all intermediate Switches indicate support for LTR. + * PCIe r4.0, sec 6.18. + */ + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) { + pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2, + PCI_EXP_DEVCTL2_LTR_EN); + pdev->ltr_path = 1; + return; + } + + /* + * If we're configuring a hot-added device, LTR was likely + * disabled in the upstream bridge, so re-enable it before enabling + * it in the new device. + */ + bridge = pci_upstream_bridge(pdev); + if (bridge && bridge->ltr_path) { + pci_bridge_reconfigure_ltr(pdev); + pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2, + PCI_EXP_DEVCTL2_LTR_EN); + pdev->ltr_path = 1; + } +} + /* Recheck latencies and update aspm_capable for links under the root */ static void pcie_update_aspm_capable(struct pcie_link_state *root) { diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 805e8e864c90b..9f1412f7f93dd 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2185,67 +2185,6 @@ static void pci_configure_relaxed_ordering(struct pci_dev *dev) } } -static void pci_configure_ltr(struct pci_dev *dev) -{ -#ifdef CONFIG_PCIEASPM - struct pci_host_bridge *host = pci_find_host_bridge(dev->bus); - struct pci_dev *bridge; - u32 cap, ctl; - - if (!pci_is_pcie(dev)) - return; - - /* Read L1 PM substate capabilities */ - dev->l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS); - - pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap); - if (!(cap & PCI_EXP_DEVCAP2_LTR)) - return; - - pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl); - if (ctl & PCI_EXP_DEVCTL2_LTR_EN) { - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { - dev->ltr_path = 1; - return; - } - - bridge = pci_upstream_bridge(dev); - if (bridge && bridge->ltr_path) - dev->ltr_path = 1; - - return; - } - - if (!host->native_ltr) - return; - - /* - * Software must not enable LTR in an Endpoint unless the Root - * Complex and all intermediate Switches indicate support for LTR. - * PCIe r4.0, sec 6.18. - */ - if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) { - pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, - PCI_EXP_DEVCTL2_LTR_EN); - dev->ltr_path = 1; - return; - } - - /* - * If we're configuring a hot-added device, LTR was likely - * disabled in the upstream bridge, so re-enable it before enabling - * it in the new device. - */ - bridge = pci_upstream_bridge(dev); - if (bridge && bridge->ltr_path) { - pci_bridge_reconfigure_ltr(dev); - pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, - PCI_EXP_DEVCTL2_LTR_EN); - dev->ltr_path = 1; - } -#endif -} - static void pci_configure_eetlp_prefix(struct pci_dev *dev) { #ifdef CONFIG_PCI_PASID From 483a33a869196363008e6685267c056742ea1f63 Mon Sep 17 00:00:00 2001 From: "David E. Box" Date: Fri, 23 Feb 2024 14:58:48 -0600 Subject: [PATCH 007/548] PCI/ASPM: Always build aspm.c Some ASPM-related tasks, such as save and restore of LTR and L1SS capabilities, still need to be performed when CONFIG_PCIEASPM is not enabled. To prepare for these changes, wrap the current code in aspm.c with an #ifdef and always build the file. Link: https://lore.kernel.org/r/20240128233212.1139663-2-david.e.box@linux.intel.com [bhelgaas: split build change from function moves] Link: https://lore.kernel.org/r/20240223205851.114931-3-helgaas@kernel.org Signed-off-by: David E. Box Signed-off-by: Bjorn Helgaas (cherry picked from commit f3994bba8200b49e3eacbe5914cd13f228e0db37) Signed-off-by: Koba Ko --- drivers/pci/pcie/Makefile | 2 +- drivers/pci/pcie/aspm.c | 4 ++++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index 8de4ed5f98f14..6461aa93fe76e 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -6,7 +6,7 @@ pcieportdrv-y := portdrv.o rcec.o obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o -obj-$(CONFIG_PCIEASPM) += aspm.o +obj-y += aspm.o obj-$(CONFIG_PCIEAER) += aer.o err.o obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o obj-$(CONFIG_PCIE_PME) += pme.o diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 3216f58b6285f..63762250a4228 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -21,6 +21,8 @@ #include #include "../pci.h" +#ifdef CONFIG_PCIEASPM + #ifdef MODULE_PARAM_PREFIX #undef MODULE_PARAM_PREFIX #endif @@ -1519,3 +1521,5 @@ bool pcie_aspm_support_enabled(void) { return aspm_support_enabled; } + +#endif /* CONFIG_PCIEASPM */ From b3ef49090336f327b3515c7d3e8300e7560e100f Mon Sep 17 00:00:00 2001 From: "David E. Box" Date: Fri, 23 Feb 2024 14:58:49 -0600 Subject: [PATCH 008/548] PCI/ASPM: Move pci_save_ltr_state() to aspm.c Even when CONFIG_PCIEASPM is not set, we save and restore the LTR Capability so that if ASPM L1.2 and LTR were configured by the platform, ASPM L1.2 will still work after suspend/resume, when that platform configuration may be lost. See dbbfadf23190 ("PCI/ASPM: Save LTR Capability for suspend/resume"). Since ASPM L1.2 depends on the LTR Capability, move the save/restore code to the part of aspm.c that is always compiled regardless of CONFIG_PCIEASPM. No functional change intended. Suggested-by: Bjorn Helgaas Link: https://lore.kernel.org/r/20240128233212.1139663-5-david.e.box@linux.intel.com [bhelgaas: commit log, reorder to make this a pure move] Link: https://lore.kernel.org/r/20240223205851.114931-4-helgaas@kernel.org Signed-off-by: David E. Box Signed-off-by: Bjorn Helgaas (cherry picked from commit 1e11b5494c3dbb1e5fce7e95021c1698799c7288) Signed-off-by: Koba Ko --- drivers/pci/pci.c | 40 ---------------------------------------- drivers/pci/pci.h | 5 +++++ drivers/pci/pcie/aspm.c | 40 ++++++++++++++++++++++++++++++++++++++++ 3 files changed, 45 insertions(+), 40 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 3238af39cae49..151f80b24d2b2 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1686,46 +1686,6 @@ static void pci_restore_pcix_state(struct pci_dev *dev) pci_write_config_word(dev, pos + PCI_X_CMD, cap[i++]); } -static void pci_save_ltr_state(struct pci_dev *dev) -{ - int ltr; - struct pci_cap_saved_state *save_state; - u32 *cap; - - if (!pci_is_pcie(dev)) - return; - - ltr = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_LTR); - if (!ltr) - return; - - save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_LTR); - if (!save_state) { - pci_err(dev, "no suspend buffer for LTR; ASPM issues possible after resume\n"); - return; - } - - /* Some broken devices only support dword access to LTR */ - cap = &save_state->cap.data[0]; - pci_read_config_dword(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, cap); -} - -static void pci_restore_ltr_state(struct pci_dev *dev) -{ - struct pci_cap_saved_state *save_state; - int ltr; - u32 *cap; - - save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_LTR); - ltr = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_LTR); - if (!save_state || !ltr) - return; - - /* Some broken devices only support dword access to LTR */ - cap = &save_state->cap.data[0]; - pci_write_config_dword(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap); -} - /** * pci_save_state - save the PCI configuration space of a device before * suspending diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 980d3980bd3a0..5428c69800d01 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -561,6 +561,11 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev, bool pcie_wait_for_link(struct pci_dev *pdev, bool active); int pcie_retrain_link(struct pci_dev *pdev, bool use_lt); + +/* ASPM-related functionality we need even without CONFIG_PCIEASPM */ +void pci_save_ltr_state(struct pci_dev *dev); +void pci_restore_ltr_state(struct pci_dev *dev); + #ifdef CONFIG_PCIEASPM void pcie_aspm_init_link_state(struct pci_dev *pdev); void pcie_aspm_exit_link_state(struct pci_dev *pdev); diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 63762250a4228..678d1cef09656 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -21,6 +21,46 @@ #include #include "../pci.h" +void pci_save_ltr_state(struct pci_dev *dev) +{ + int ltr; + struct pci_cap_saved_state *save_state; + u32 *cap; + + if (!pci_is_pcie(dev)) + return; + + ltr = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_LTR); + if (!ltr) + return; + + save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_LTR); + if (!save_state) { + pci_err(dev, "no suspend buffer for LTR; ASPM issues possible after resume\n"); + return; + } + + /* Some broken devices only support dword access to LTR */ + cap = &save_state->cap.data[0]; + pci_read_config_dword(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, cap); +} + +void pci_restore_ltr_state(struct pci_dev *dev) +{ + struct pci_cap_saved_state *save_state; + int ltr; + u32 *cap; + + save_state = pci_find_saved_ext_cap(dev, PCI_EXT_CAP_ID_LTR); + ltr = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_LTR); + if (!save_state || !ltr) + return; + + /* Some broken devices only support dword access to LTR */ + cap = &save_state->cap.data[0]; + pci_write_config_dword(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap); +} + #ifdef CONFIG_PCIEASPM #ifdef MODULE_PARAM_PREFIX From fadd473de6444c5b1b7dbc5a88f15c96e3b13988 Mon Sep 17 00:00:00 2001 From: "David E. Box" Date: Fri, 23 Feb 2024 14:58:50 -0600 Subject: [PATCH 009/548] PCI/ASPM: Save L1 PM Substates Capability for suspend/resume 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") restored the L1 PM Substates Capability after resume, which reduced power consumption by making the ASPM L1.x states work after resume. a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"") reverted 4ff116d0d5fd because resume failed on some systems, so power consumption after resume increased again. a7152be79b62 mentioned that we restore L1 PM substate configuration even though ASPM L1 may already be enabled. This is due the fact that the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state(). Save and restore the L1 PM Substates Capability, following PCIe r6.1, sec 5.5.4 more closely by: 1) Do not restore ASPM configuration in pci_restore_pcie_state() but do that after PCIe capability is restored in pci_restore_aspm_state() following PCIe r6.1, sec 5.5.4. 2) If BIOS reenables L1SS, particularly L1.2, we need to clear the enables in the right order, downstream before upstream. Defer restoring the L1SS config until we are at the downstream component. Then update the config for both ends of the link in the prescribed order. 3) Program ASPM L1 PM substate configuration before L1 enables. 4) Program ASPM L1 PM substate enables last, after rest of the fields in the capability are programmed. [bhelgaas: commit log, squash L1SS-related patches, do both LNKCTL restores in pci_restore_pcie_state()] Link: https://lore.kernel.org/r/20240128233212.1139663-3-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240128233212.1139663-4-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-5-helgaas@kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Co-developed-by: Mika Westerberg Co-developed-by: David E. Box Reported-by: Koba Ko Signed-off-by: Mika Westerberg Signed-off-by: David E. Box Signed-off-by: Bjorn Helgaas Tested-by: Tasev Nikola # Asus UX305FA Cc: Mark Enriquez Cc: Thomas Witt Cc: Werner Sembach Cc: Vidya Sagar (cherry picked from commit 17423360a27ae58c1850f588bdd8013bbfcd250b) Signed-off-by: Koba Ko --- drivers/pci/pci.c | 17 ++++++- drivers/pci/pci.h | 3 ++ drivers/pci/pcie/aspm.c | 102 ++++++++++++++++++++++++++++++++++++++-- drivers/pci/probe.c | 1 + include/linux/pci.h | 2 +- 5 files changed, 119 insertions(+), 6 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 151f80b24d2b2..8ef43d274f23d 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1620,6 +1620,8 @@ static int pci_save_pcie_state(struct pci_dev *dev) pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &cap[i++]); pcie_capability_read_word(dev, PCI_EXP_SLTCTL2, &cap[i++]); + pci_save_aspm_l1ss_state(dev); + return 0; } @@ -1627,7 +1629,7 @@ static void pci_restore_pcie_state(struct pci_dev *dev) { int i = 0; struct pci_cap_saved_state *save_state; - u16 *cap; + u16 *cap, lnkctl; save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP); if (!save_state) @@ -1642,12 +1644,23 @@ static void pci_restore_pcie_state(struct pci_dev *dev) cap = (u16 *)&save_state->cap.data[0]; pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]); - pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]); + + /* Restore LNKCTL register with ASPM control field clear */ + lnkctl = cap[i++]; + pcie_capability_write_word(dev, PCI_EXP_LNKCTL, + lnkctl & ~PCI_EXP_LNKCTL_ASPMC); + pcie_capability_write_word(dev, PCI_EXP_SLTCTL, cap[i++]); pcie_capability_write_word(dev, PCI_EXP_RTCTL, cap[i++]); pcie_capability_write_word(dev, PCI_EXP_DEVCTL2, cap[i++]); pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, cap[i++]); pcie_capability_write_word(dev, PCI_EXP_SLTCTL2, cap[i++]); + + pci_restore_aspm_l1ss_state(dev); + + /* Restore ASPM control after restoring L1SS state */ + pcie_capability_set_word(dev, PCI_EXP_LNKCTL, + lnkctl & PCI_EXP_LNKCTL_ASPMC); } static int pci_save_pcix_state(struct pci_dev *dev) diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 5428c69800d01..c07233d995255 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -565,6 +565,9 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt); /* ASPM-related functionality we need even without CONFIG_PCIEASPM */ void pci_save_ltr_state(struct pci_dev *dev); void pci_restore_ltr_state(struct pci_dev *dev); +void pci_configure_aspm_l1ss(struct pci_dev *dev); +void pci_save_aspm_l1ss_state(struct pci_dev *dev); +void pci_restore_aspm_l1ss_state(struct pci_dev *dev); #ifdef CONFIG_PCIEASPM void pcie_aspm_init_link_state(struct pci_dev *pdev); diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 678d1cef09656..25d66c2a43fa5 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -61,6 +61,105 @@ void pci_restore_ltr_state(struct pci_dev *dev) pci_write_config_dword(dev, ltr + PCI_LTR_MAX_SNOOP_LAT, *cap); } +void pci_configure_aspm_l1ss(struct pci_dev *pdev) +{ + int rc; + + pdev->l1ss = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_L1SS); + + rc = pci_add_ext_cap_save_buffer(pdev, PCI_EXT_CAP_ID_L1SS, + 2 * sizeof(u32)); + if (rc) + pci_err(pdev, "unable to allocate ASPM L1SS save buffer (%pe)\n", + ERR_PTR(rc)); +} + +void pci_save_aspm_l1ss_state(struct pci_dev *pdev) +{ + struct pci_cap_saved_state *save_state; + u16 l1ss = pdev->l1ss; + u32 *cap; + + /* + * Save L1 substate configuration. The ASPM L0s/L1 configuration + * in PCI_EXP_LNKCTL_ASPMC is saved by pci_save_pcie_state(). + */ + if (!l1ss) + return; + + save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS); + if (!save_state) + return; + + cap = &save_state->cap.data[0]; + pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++); + pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++); +} + +void pci_restore_aspm_l1ss_state(struct pci_dev *pdev) +{ + struct pci_cap_saved_state *pl_save_state, *cl_save_state; + struct pci_dev *parent = pdev->bus->self; + u32 *cap, pl_ctl1, pl_ctl2, pl_l1_2_enable; + u32 cl_ctl1, cl_ctl2, cl_l1_2_enable; + + /* + * In case BIOS enabled L1.2 when resuming, we need to disable it first + * on the downstream component before the upstream. So, don't attempt to + * restore either until we are at the downstream component. + */ + if (pcie_downstream_port(pdev) || !parent) + return; + + if (!pdev->l1ss || !parent->l1ss) + return; + + cl_save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS); + pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS); + if (!cl_save_state || !pl_save_state) + return; + + cap = &cl_save_state->cap.data[0]; + cl_ctl2 = *cap++; + cl_ctl1 = *cap; + cap = &pl_save_state->cap.data[0]; + pl_ctl2 = *cap++; + pl_ctl1 = *cap; + + /* + * Disable L1.2 on this downstream endpoint device first, followed + * by the upstream + */ + pci_clear_and_set_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + + /* + * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD + * in PCI_L1SS_CTL1 must be programmed *before* setting the L1.2 + * enable bits, even though they're all in PCI_L1SS_CTL1. + */ + pl_l1_2_enable = pl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK; + pl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK; + cl_l1_2_enable = cl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK; + cl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK; + + /* Write back without enables first (above we cleared them in ctl1) */ + pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, pl_ctl2); + pci_write_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cl_ctl2); + pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pl_ctl1); + pci_write_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cl_ctl1); + + /* Then write back the enables */ + if (pl_l1_2_enable || cl_l1_2_enable) { + pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + pl_ctl1 | pl_l1_2_enable); + pci_write_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, + cl_ctl1 | cl_l1_2_enable); + } +} + #ifdef CONFIG_PCIEASPM #ifdef MODULE_PARAM_PREFIX @@ -1003,9 +1102,6 @@ void pci_configure_ltr(struct pci_dev *pdev) if (!pci_is_pcie(pdev)) return; - /* Read L1 PM substate capabilities */ - pdev->l1ss = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_L1SS); - pcie_capability_read_dword(pdev, PCI_EXP_DEVCAP2, &cap); if (!(cap & PCI_EXP_DEVCAP2_LTR)) return; diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 9f1412f7f93dd..ebf39d959edd7 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2235,6 +2235,7 @@ static void pci_configure_device(struct pci_dev *dev) pci_configure_extended_tags(dev, NULL); pci_configure_relaxed_ordering(dev); pci_configure_ltr(dev); + pci_configure_aspm_l1ss(dev); pci_configure_eetlp_prefix(dev); pci_configure_serr(dev); diff --git a/include/linux/pci.h b/include/linux/pci.h index 2d1fb935a8c86..5c2bd8d46183e 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -390,9 +390,9 @@ struct pci_dev { unsigned int d3hot_delay; /* D3hot->D0 transition time in ms */ unsigned int d3cold_delay; /* D3cold->D0 transition time in ms */ + u16 l1ss; /* L1SS Capability pointer */ #ifdef CONFIG_PCIEASPM struct pcie_link_state *link_state; /* ASPM link state */ - u16 l1ss; /* L1SS Capability pointer */ unsigned int ltr_path:1; /* Latency Tolerance Reporting supported from root to here */ #endif From 5f567ce56a9deec3419334fe794f31754d2a90fe Mon Sep 17 00:00:00 2001 From: "David E. Box" Date: Fri, 23 Feb 2024 14:58:51 -0600 Subject: [PATCH 010/548] PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state() ASPM state is saved and restored from pci_save/restore_pcie_state(). Since the LTR Capability is linked with ASPM, move the LTR save and restore calls there as well. No functional change intended. Suggested-by: Bjorn Helgaas Link: https://lore.kernel.org/r/20240128233212.1139663-6-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-6-helgaas@kernel.org Signed-off-by: David E. Box Signed-off-by: Bjorn Helgaas (cherry picked from commit c198fafa0125e97728d16411aa653602900ab0bc) Signed-off-by: Koba Ko --- drivers/pci/pci.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 8ef43d274f23d..3762ccf379e68 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1621,6 +1621,7 @@ static int pci_save_pcie_state(struct pci_dev *dev) pcie_capability_read_word(dev, PCI_EXP_SLTCTL2, &cap[i++]); pci_save_aspm_l1ss_state(dev); + pci_save_ltr_state(dev); return 0; } @@ -1631,6 +1632,12 @@ static void pci_restore_pcie_state(struct pci_dev *dev) struct pci_cap_saved_state *save_state; u16 *cap, lnkctl; + /* + * Restore max latencies (in the LTR capability) before enabling + * LTR itself in PCI_EXP_DEVCTL2. + */ + pci_restore_ltr_state(dev); + save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP); if (!save_state) return; @@ -1723,7 +1730,6 @@ int pci_save_state(struct pci_dev *dev) if (i != 0) return i; - pci_save_ltr_state(dev); pci_save_dpc_state(dev); pci_save_aer_state(dev); pci_save_ptm_state(dev); @@ -1825,12 +1831,6 @@ void pci_restore_state(struct pci_dev *dev) if (!dev->state_saved) return; - /* - * Restore max latencies (in the LTR capability) before enabling - * LTR itself (in the PCIe capability). - */ - pci_restore_ltr_state(dev); - pci_restore_pcie_state(dev); pci_restore_pasid_state(dev); pci_restore_pri_state(dev); From f11d7c4801c421b4ed8f67a4da5abf6eff3c2c13 Mon Sep 17 00:00:00 2001 From: Shuai Xue Date: Fri, 8 Dec 2023 10:56:50 +0800 Subject: [PATCH 011/548] PCI: Move pci_clear_and_set_dword() helper to PCI header The clear and set pattern is commonly used for accessing PCI config, move the helper pci_clear_and_set_dword() from aspm.c into PCI header. In addition, rename to pci_clear_and_set_config_dword() to retain the "config" information and match the other accessors. No functional change intended. Signed-off-by: Shuai Xue Acked-by: Bjorn Helgaas Tested-by: Ilkka Koskinen Link: https://lore.kernel.org/r/20231208025652.87192-4-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon (cherry picked from commit ac16087134b837d42b75bb1c741070b6c142f258) Signed-off-by: Koba Ko --- drivers/pci/access.c | 12 ++++++++ drivers/pci/pcie/aspm.c | 65 +++++++++++++++++++---------------------- include/linux/pci.h | 2 ++ 3 files changed, 44 insertions(+), 35 deletions(-) diff --git a/drivers/pci/access.c b/drivers/pci/access.c index 6554a2e89d361..6449056b57dd3 100644 --- a/drivers/pci/access.c +++ b/drivers/pci/access.c @@ -598,3 +598,15 @@ int pci_write_config_dword(const struct pci_dev *dev, int where, return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val); } EXPORT_SYMBOL(pci_write_config_dword); + +void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos, + u32 clear, u32 set) +{ + u32 val; + + pci_read_config_dword(dev, pos, &val); + val &= ~clear; + val |= set; + pci_write_config_dword(dev, pos, val); +} +EXPORT_SYMBOL(pci_clear_and_set_config_dword); diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 25d66c2a43fa5..5777994bda422 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -564,17 +564,6 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint) } } -static void pci_clear_and_set_dword(struct pci_dev *pdev, int pos, - u32 clear, u32 set) -{ - u32 val; - - pci_read_config_dword(pdev, pos, &val); - val &= ~clear; - val |= set; - pci_write_config_dword(pdev, pos, val); -} - /* Calculate L1.2 PM substate timing parameters */ static void aspm_calc_l12_info(struct pcie_link_state *link, u32 parent_l1ss_cap, u32 child_l1ss_cap) @@ -635,10 +624,12 @@ static void aspm_calc_l12_info(struct pcie_link_state *link, cl1_2_enables = cctl1 & PCI_L1SS_CTL1_L1_2_MASK; if (pl1_2_enables || cl1_2_enables) { - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_L1_2_MASK, 0); - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_L1_2_MASK, 0); + pci_clear_and_set_config_dword(child, + child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); + pci_clear_and_set_config_dword(parent, + parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1_2_MASK, 0); } /* Program T_POWER_ON times in both ports */ @@ -646,22 +637,26 @@ static void aspm_calc_l12_info(struct pcie_link_state *link, pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, ctl2); /* Program Common_Mode_Restore_Time in upstream device */ - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1); + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_CM_RESTORE_TIME, ctl1); /* Program LTR_L1.2_THRESHOLD time in both ports */ - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | - PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_LTR_L12_TH_VALUE | - PCI_L1SS_CTL1_LTR_L12_TH_SCALE, ctl1); + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_LTR_L12_TH_VALUE | + PCI_L1SS_CTL1_LTR_L12_TH_SCALE, + ctl1); + pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_LTR_L12_TH_VALUE | + PCI_L1SS_CTL1_LTR_L12_TH_SCALE, + ctl1); if (pl1_2_enables || cl1_2_enables) { - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, 0, - pl1_2_enables); - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, 0, - cl1_2_enables); + pci_clear_and_set_config_dword(parent, + parent->l1ss + PCI_L1SS_CTL1, 0, + pl1_2_enables); + pci_clear_and_set_config_dword(child, + child->l1ss + PCI_L1SS_CTL1, 0, + cl1_2_enables); } } @@ -821,10 +816,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state) */ /* Disable all L1 substates */ - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_L1SS_MASK, 0); - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_L1SS_MASK, 0); + pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1SS_MASK, 0); + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1SS_MASK, 0); /* * If needed, disable L1, and it gets enabled later * in pcie_config_aspm_link(). @@ -847,10 +842,10 @@ static void pcie_config_aspm_l1ss(struct pcie_link_state *link, u32 state) val |= PCI_L1SS_CTL1_PCIPM_L1_2; /* Enable what we need to enable */ - pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_L1SS_MASK, val); - pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1, - PCI_L1SS_CTL1_L1SS_MASK, val); + pci_clear_and_set_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1SS_MASK, val); + pci_clear_and_set_config_dword(child, child->l1ss + PCI_L1SS_CTL1, + PCI_L1SS_CTL1_L1SS_MASK, val); } static void pcie_config_aspm_dev(struct pci_dev *pdev, u32 val) diff --git a/include/linux/pci.h b/include/linux/pci.h index 5c2bd8d46183e..0814334d01e6d 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1216,6 +1216,8 @@ int pci_read_config_dword(const struct pci_dev *dev, int where, u32 *val); int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val); int pci_write_config_word(const struct pci_dev *dev, int where, u16 val); int pci_write_config_dword(const struct pci_dev *dev, int where, u32 val); +void pci_clear_and_set_config_dword(const struct pci_dev *dev, int pos, + u32 clear, u32 set); int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val); int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val); From ff65cb0121240687280e85318504c1d2c86a2188 Mon Sep 17 00:00:00 2001 From: Vidya Sagar Date: Fri, 23 Feb 2024 13:36:24 -0600 Subject: [PATCH 012/548] PCI/ASPM: Update save_state when configuration changes Many PCIe device drivers save the configuration state of their device during probe and restore it when their .slot_reset() hook is called during PCIe error recovery. If the ASPM configuration is changed after the driver's probe is called and before an error event occurs, .slot_reset() restores the ASPM configuration to what it was at the time of probe, not to what it was just before the occurrence of the error event. This leads to a mismatch in ASPM configuration between the device and its upstream device. Update the saved configuration of the device when the ASPM configuration changes. Link: https://lore.kernel.org/r/20240222174436.3565146-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar [bhelgaas: commit log, rebase to pci/aspm, rename to pci_update_aspm_saved_state() since it updates only LNKCTL, update only ASPMC and CLKREQ_EN in LNKCTL] Signed-off-by: Bjorn Helgaas Reviewed-by: Kuppuswamy Sathyanarayanan Reviewed-by: David E. Box (cherry picked from commit 6d4266675279b38c301243f3a4fac4a511b03246) Signed-off-by: Koba Ko --- drivers/pci/pcie/aspm.c | 34 +++++++++++++++++++++++++++++++++- 1 file changed, 33 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 5777994bda422..cdaa8b1dad322 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -279,16 +279,42 @@ static int policy_to_clkpm_state(struct pcie_link_state *link) return 0; } +static void pci_update_aspm_saved_state(struct pci_dev *dev) +{ + struct pci_cap_saved_state *save_state; + u16 *cap, lnkctl, aspm_ctl; + + save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP); + if (!save_state) + return; + + pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl); + + /* + * Update ASPM and CLKREQ bits of LNKCTL in save_state. We only + * write PCI_EXP_LNKCTL_CCC during enumeration, so it shouldn't + * change after being captured in save_state. + */ + aspm_ctl = lnkctl & (PCI_EXP_LNKCTL_ASPMC | PCI_EXP_LNKCTL_CLKREQ_EN); + lnkctl &= ~(PCI_EXP_LNKCTL_ASPMC | PCI_EXP_LNKCTL_CLKREQ_EN); + + /* Depends on pci_save_pcie_state(): cap[1] is LNKCTL */ + cap = (u16 *)&save_state->cap.data[0]; + cap[1] = lnkctl | aspm_ctl; +} + static void pcie_set_clkpm_nocheck(struct pcie_link_state *link, int enable) { struct pci_dev *child; struct pci_bus *linkbus = link->pdev->subordinate; u32 val = enable ? PCI_EXP_LNKCTL_CLKREQ_EN : 0; - list_for_each_entry(child, &linkbus->devices, bus_list) + list_for_each_entry(child, &linkbus->devices, bus_list) { pcie_capability_clear_and_set_word(child, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_CLKREQ_EN, val); + pci_update_aspm_saved_state(child); + } link->clkpm_enabled = !!enable; } @@ -903,6 +929,12 @@ static void pcie_config_aspm_link(struct pcie_link_state *link, u32 state) pcie_config_aspm_dev(parent, upstream); link->aspm_enabled = state; + + /* Update latest ASPM configuration in saved context */ + pci_save_aspm_l1ss_state(link->downstream); + pci_update_aspm_saved_state(link->downstream); + pci_save_aspm_l1ss_state(parent); + pci_update_aspm_saved_state(parent); } static void pcie_config_aspm_path(struct pcie_link_state *link) From 86059afce127782e5941df5565b3ee01220372a2 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:27 +0000 Subject: [PATCH 013/548] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs Add stub functions which is initially always return false. These provide the hooks that we need to update the range-based TLBI routines, whose operands are encoded differently depending on whether lpa2 is enabled or not. The kernel and kvm will enable the use of lpa2 asynchronously in future, and part of that enablement will involve fleshing out their respective hook to advertise when it is using lpa2. Since the kernel's decision to use lpa2 relies on more than just whether the HW supports the feature, it can't just use the same static key as kvm. This is another reason to use separate functions. lpa2_is_enabled() is already implemented as part of Ard's kernel lpa2 series. Since kvm will make its decision solely based on HW support, kvm_lpa2_is_enabled() will be defined as system_supports_lpa2() once kvm starts using lpa2. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-3-ryan.roberts@arm.com (cherry picked from commit 936a4ec28141fa9369b6af4d6401f0be9f7e304c) Signed-off-by: Koba Ko --- arch/arm64/include/asm/kvm_pgtable.h | 2 ++ arch/arm64/include/asm/pgtable-prot.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index d3e354bb8351d..10068500d6019 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -25,6 +25,8 @@ #define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U #endif +#define kvm_lpa2_is_enabled() false + static inline u64 kvm_get_parange(u64 mmfr0) { u64 parange = cpuid_feature_extract_unsigned_field(mmfr0, diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index eed814b00a389..b4b2b86237692 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -71,6 +71,8 @@ extern bool arm64_use_ng_mappings; #define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0) #define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0) +#define lpa2_is_enabled() false + /* * If we have userspace only BTI we don't want to mark kernel pages * guarded even if the system does support BTI. From 53eff283c42e59031d45d0898d8d77e1932b11d8 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:28 +0000 Subject: [PATCH 014/548] arm64/mm: Update tlb invalidation routines for FEAT_LPA2 FEAT_LPA2 impacts tlb invalidation in 2 ways; Firstly, the TTL field in the non-range tlbi instructions can now validly take a 0 value as a level hint for the 4KB granule (this is due to the extra level of translation) - previously TTL=0b0100 meant no hint and was treated as 0b0000. Secondly, The BADDR field of the range-based tlbi instructions is specified in 64KB units when LPA2 is in use (TCR.DS=1), whereas it is in page units otherwise. Changes are required for tlbi to continue to operate correctly when LPA2 is in use. Solve the first problem by always adding the level hint if the level is between [0, 3] (previously anything other than 0 was hinted, which breaks in the new level -1 case from kvm). When running on non-LPA2 HW, 0 is still safe to hint as the HW will fall back to non-hinted. While we are at it, we replace the notion of 0 being the non-hinted sentinel with a macro, TLBI_TTL_UNKNOWN. This means callers won't need updating if/when translation depth increases in future. The second issue is more complex: When LPA2 is in use, use the non-range tlbi instructions to forward align to a 64KB boundary first, then use range-based tlbi from there on, until we have either invalidated all pages or we have a single page remaining. If the latter, that is done with non-range tlbi. We determine whether LPA2 is in use based on lpa2_is_enabled() (for kernel calls) or kvm_lpa2_is_enabled() (for kvm calls). Reviewed-by: Catalin Marinas Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-4-ryan.roberts@arm.com (backported from commit c910f2b65518538b5072cb51760c8ef749e455d0) Signed-off-by: Koba Ko --- arch/arm64/include/asm/tlb.h | 15 ++++--- arch/arm64/include/asm/tlbflush.h | 67 +++++++++++++++++++++---------- 2 files changed, 56 insertions(+), 26 deletions(-) diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h index 2c29239d05c3e..396ba9b4872cb 100644 --- a/arch/arm64/include/asm/tlb.h +++ b/arch/arm64/include/asm/tlb.h @@ -22,15 +22,15 @@ static void tlb_flush(struct mmu_gather *tlb); #include /* - * get the tlbi levels in arm64. Default value is 0 if more than one - * of cleared_* is set or neither is set. - * Arm64 doesn't support p4ds now. + * get the tlbi levels in arm64. Default value is TLBI_TTL_UNKNOWN if more than + * one of cleared_* is set or neither is set - this elides the level hinting to + * the hardware. */ static inline int tlb_get_level(struct mmu_gather *tlb) { /* The TTL field is only valid for the leaf entry. */ if (tlb->freed_tables) - return 0; + return TLBI_TTL_UNKNOWN; if (tlb->cleared_ptes && !(tlb->cleared_pmds || tlb->cleared_puds || @@ -47,7 +47,12 @@ static inline int tlb_get_level(struct mmu_gather *tlb) tlb->cleared_p4ds)) return 1; - return 0; + if (tlb->cleared_p4ds && !(tlb->cleared_ptes || + tlb->cleared_pmds || + tlb->cleared_puds)) + return 0; + + return TLBI_TTL_UNKNOWN; } static inline void tlb_flush(struct mmu_gather *tlb) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index c9887f4773cd9..afb4095e02aa0 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -94,19 +94,22 @@ static inline unsigned long get_trans_granule(void) * When ARMv8.4-TTL exists, TLBI operations take an additional hint for * the level at which the invalidation must take place. If the level is * wrong, no invalidation may take place. In the case where the level - * cannot be easily determined, a 0 value for the level parameter will - * perform a non-hinted invalidation. + * cannot be easily determined, the value TLBI_TTL_UNKNOWN will perform + * a non-hinted invalidation. Any provided level outside the hint range + * will also cause fall-back to non-hinted invalidation. * * For Stage-2 invalidation, use the level values provided to that effect * in asm/stage2_pgtable.h. */ #define TLBI_TTL_MASK GENMASK_ULL(47, 44) +#define TLBI_TTL_UNKNOWN INT_MAX + #define __tlbi_level(op, addr, level) do { \ u64 arg = addr; \ \ if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) && \ - level) { \ + level >= 0 && level <= 3) { \ u64 ttl = level & 3; \ ttl |= get_trans_granule() << 2; \ arg &= ~TLBI_TTL_MASK; \ @@ -122,16 +125,21 @@ static inline unsigned long get_trans_granule(void) } while (0) /* - * This macro creates a properly formatted VA operand for the TLB RANGE. - * The value bit assignments are: + * This macro creates a properly formatted VA operand for the TLB RANGE. The + * value bit assignments are: * * +----------+------+-------+-------+-------+----------------------+ * | ASID | TG | SCALE | NUM | TTL | BADDR | * +-----------------+-------+-------+-------+----------------------+ * |63 48|47 46|45 44|43 39|38 37|36 0| * - * The address range is determined by below formula: - * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE) + * The address range is determined by below formula: [BADDR, BADDR + (NUM + 1) * + * 2^(5*SCALE + 1) * PAGESIZE) + * + * Note that the first argument, baddr, is pre-shifted; If LPA2 is in use, BADDR + * holds addr[52:16]. Else BADDR holds page number. See for example ARM DDI + * 0487J.a section C5.5.60 "TLBI VAE1IS, TLBI VAE1ISNXS, TLB Invalidate by VA, + * EL1, Inner Shareable". * */ #define TLBIR_ASID_MASK GENMASK_ULL(63, 48) @@ -230,12 +238,16 @@ static inline unsigned long get_trans_granule(void) * CPUs, ensuring that any walk-cache entries associated with the * translation are also invalidated. * - * __flush_tlb_range(vma, start, end, stride, last_level) + * __flush_tlb_range(vma, start, end, stride, last_level, tlb_level) * Invalidate the virtual-address range '[start, end)' on all * CPUs for the user address space corresponding to 'vma->mm'. * The invalidation operations are issued at a granularity * determined by 'stride' and only affect any walk-cache entries - * if 'last_level' is equal to false. + * if 'last_level' is equal to false. tlb_level is the level at + * which the invalidation must take place. If the level is wrong, + * no invalidation may take place. In the case where the level + * cannot be easily determined, the value TLBI_TTL_UNKNOWN will + * perform a non-hinted invalidation. * * * Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented @@ -361,32 +373,42 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) * @tlb_level: Translation Table level hint, if known * @tlbi_user: If 'true', call an additional __tlbi_user() * (typically for user ASIDs). 'flase' for IPA instructions + * @lpa2: If 'true', the lpa2 scheme is used as set out below * * When the CPU does not support TLB range operations, flush the TLB * entries one by one at the granularity of 'stride'. If the TLB * range ops are supported, then: * - * 1. The minimum range granularity is decided by 'scale', so multiple range + * 1. If FEAT_LPA2 is in use, the start address of a range operation must be + * 64KB aligned, so flush pages one by one until the alignment is reached + * using the non-range operations. This step is skipped if LPA2 is not in + * use. + * + * 2. The minimum range granularity is decided by 'scale', so multiple range * TLBI operations may be required. Start from scale = 3, flush the largest * possible number of pages ((num+1)*2^(5*scale+1)) that fit into the * requested range, then decrement scale and continue until one or zero pages - * are left. + * are left. We must start from highest scale to ensure 64KB start alignment + * is maintained in the LPA2 case. * - * 2. If there is 1 page remaining, flush it through non-range operations. Range - * operations can only span an even number of pages. + * 3. If there is 1 page remaining, flush it through non-range operations. Range + * operations can only span an even number of pages. We save this for last to + * ensure 64KB start alignment is maintained for the LPA2 case. */ #define __flush_tlb_range_op(op, start, pages, stride, \ - asid, tlb_level, tlbi_user) \ + asid, tlb_level, tlbi_user, lpa2) \ do { \ typeof(start) __flush_start = start; \ typeof(pages) __flush_pages = pages; \ int num = 0; \ int scale = 3; \ + int shift = lpa2 ? 16 : PAGE_SHIFT; \ unsigned long addr; \ \ while (__flush_pages > 0) { \ if (!system_supports_tlb_range() || \ - __flush_pages == 1) { \ + __flush_pages == 1 || \ + (lpa2 && start != ALIGN(start, SZ_64K))) { \ addr = __TLBI_VADDR(__flush_start, asid); \ __tlbi_level(op, addr, tlb_level); \ if (tlbi_user) \ @@ -398,7 +420,7 @@ do { \ \ num = __TLBI_RANGE_NUM(__flush_pages, scale); \ if (num >= 0) { \ - addr = __TLBI_VADDR_RANGE(__flush_start, asid, \ + addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \ scale, num, tlb_level); \ __tlbi(r##op, addr); \ if (tlbi_user) \ @@ -411,7 +433,7 @@ do { \ } while (0) #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \ - __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false) + __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled()); static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma, unsigned long start, unsigned long end, @@ -441,9 +463,11 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma, asid = ASID(vma->vm_mm); if (last_level) - __flush_tlb_range_op(vale1is, start, pages, stride, asid, tlb_level, true); + __flush_tlb_range_op(vale1is, start, pages, stride, asid, + tlb_level, true, lpa2_is_enabled()); else - __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true); + __flush_tlb_range_op(vae1is, start, pages, stride, asid, + tlb_level, true, lpa2_is_enabled()); mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end); } @@ -464,9 +488,10 @@ static inline void flush_tlb_range(struct vm_area_struct *vma, /* * We cannot use leaf-only invalidation here, since we may be invalidating * table entries as part of collapsing hugepages or moving page tables. - * Set the tlb_level to 0 because we can not get enough information here. + * Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough + * information here. */ - __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0); + __flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN); } static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end) From c2d8cd67cedb493382ba9acf3eb6cdd660376595 Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Mon, 27 Nov 2023 11:17:29 +0000 Subject: [PATCH 015/548] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] PAGE_SIZE support is tested against possible minimum and maximum values for its respective ID_AA64MMFR0.TGRAN field, depending on whether it is signed or unsigned. But then FEAT_LPA2 implementation needs to be validated for 4K and 16K page sizes via feature specific ID_AA64MMFR0.TGRAN values. Hence it adds FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] values per ARM ARM (0487G.A). Acked-by: Catalin Marinas Signed-off-by: Anshuman Khandual Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-5-ryan.roberts@arm.com (cherry picked from commit e477c8c483913de92c9cc00b34459dc4d695529b) Signed-off-by: Koba Ko --- arch/arm64/include/asm/sysreg.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 38296579a4fd5..bc782116315a9 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -826,10 +826,12 @@ /* id_aa64mmfr0 */ #define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN 0x0 +#define ID_AA64MMFR0_EL1_TGRAN4_LPA2 ID_AA64MMFR0_EL1_TGRAN4_52_BIT #define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX 0x7 #define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MIN 0x0 #define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MAX 0x7 #define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN 0x1 +#define ID_AA64MMFR0_EL1_TGRAN16_LPA2 ID_AA64MMFR0_EL1_TGRAN16_52_BIT #define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX 0xf #define ARM64_MIN_PARANGE_BITS 32 @@ -837,6 +839,7 @@ #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_DEFAULT 0x0 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_NONE 0x1 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MIN 0x2 +#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2 0x3 #define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MAX 0x7 #ifdef CONFIG_ARM64_PA_BITS_52 @@ -847,11 +850,13 @@ #if defined(CONFIG_ARM64_4K_PAGES) #define ID_AA64MMFR0_EL1_TGRAN_SHIFT ID_AA64MMFR0_EL1_TGRAN4_SHIFT +#define ID_AA64MMFR0_EL1_TGRAN_LPA2 ID_AA64MMFR0_EL1_TGRAN4_52_BIT #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX #define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT ID_AA64MMFR0_EL1_TGRAN4_2_SHIFT #elif defined(CONFIG_ARM64_16K_PAGES) #define ID_AA64MMFR0_EL1_TGRAN_SHIFT ID_AA64MMFR0_EL1_TGRAN16_SHIFT +#define ID_AA64MMFR0_EL1_TGRAN_LPA2 ID_AA64MMFR0_EL1_TGRAN16_52_BIT #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN #define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX #define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT ID_AA64MMFR0_EL1_TGRAN16_2_SHIFT From 84a4c248b171137a293b187de45b121bebfb3c10 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:30 +0000 Subject: [PATCH 016/548] arm64: Add ARM64_HAS_LPA2 CPU capability Expose FEAT_LPA2 as a capability so that we can take advantage of alternatives patching in the hypervisor. Although FEAT_LPA2 presence is advertised separately for stage1 and stage2, the expectation is that in practice both stages will either support or not support it. Therefore, we combine both into a single capability, allowing us to simplify the implementation. KVM requires support in both stages in order to use LPA2 since the same library is used for hyp stage 1 and guest stage 2 pgtables. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-6-ryan.roberts@arm.com (cherry picked from commit b1366d21daaebb8e474e4169c5e557fbb37bfdc0) Signed-off-by: Koba Ko --- arch/arm64/include/asm/cpufeature.h | 5 ++++ arch/arm64/kernel/cpufeature.c | 39 +++++++++++++++++++++++++++++ arch/arm64/tools/cpucaps | 1 + 3 files changed, 45 insertions(+) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 5bba393760557..b0e4d892ab8a1 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -831,6 +831,11 @@ static inline bool system_supports_tlb_range(void) cpus_have_const_cap(ARM64_HAS_TLB_RANGE); } +static inline bool system_supports_lpa2(void) +{ + return cpus_have_final_cap(ARM64_HAS_LPA2); +} + int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt); bool try_emulate_mrs(struct pt_regs *regs, u32 isn); diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 2ce9ef9d924aa..d41151fb9cce2 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1747,6 +1747,39 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry, return !meltdown_safe; } +#if defined(ID_AA64MMFR0_EL1_TGRAN_LPA2) && defined(ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2) +static bool has_lpa2_at_stage1(u64 mmfr0) +{ + unsigned int tgran; + + tgran = cpuid_feature_extract_unsigned_field(mmfr0, + ID_AA64MMFR0_EL1_TGRAN_SHIFT); + return tgran == ID_AA64MMFR0_EL1_TGRAN_LPA2; +} + +static bool has_lpa2_at_stage2(u64 mmfr0) +{ + unsigned int tgran; + + tgran = cpuid_feature_extract_unsigned_field(mmfr0, + ID_AA64MMFR0_EL1_TGRAN_2_SHIFT); + return tgran == ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2; +} + +static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope) +{ + u64 mmfr0; + + mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); + return has_lpa2_at_stage1(mmfr0) && has_lpa2_at_stage2(mmfr0); +} +#else +static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope) +{ + return false; +} +#endif + #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 #define KPTI_NG_TEMP_VA (-(1UL << PMD_SHIFT)) @@ -2731,6 +2764,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .matches = has_cpuid_feature, ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP) }, + { + .desc = "52-bit Virtual Addressing for KVM (LPA2)", + .capability = ARM64_HAS_LPA2, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_lpa2, + }, {}, }; diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index c251ef3caae56..002da1ce7ac28 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -36,6 +36,7 @@ HAS_GIC_PRIO_MASKING HAS_GIC_PRIO_RELAXED_SYNC HAS_HCX HAS_LDAPR +HAS_LPA2 HAS_LSE_ATOMICS HAS_MOPS HAS_NESTED_VIRT From 9e65da742bb6459178f5c4cf6d378a438a4ac2e7 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:32 +0000 Subject: [PATCH 017/548] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1 Implement a simple policy whereby if the HW supports FEAT_LPA2 for the page size we are using, always use LPA2-style page-tables for stage 2 and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested IPA size or HW-implemented PA size. When in use we can now support up to 52-bit IPA and PA sizes. We use the previously created cpu feature to track whether LPA2 is supported for deciding whether to use the LPA2 or classic pte format. Note that FEAT_LPA2 brings support for bigger block mappings (512GB with 4KB, 64GB with 16KB). We explicitly don't enable these in the library because stage2_apply_range() works on batch sizes of the largest used block mapping, and increasing the size of the batch would lead to soft lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit stage2_apply_range() batch size to largest block"). With the addition of LPA2 support in the hypervisor, the PA size supported by the HW must be capped with a runtime decision, rather than simply using a compile-time decision based on PA_BITS. For example, on a system that advertises 52 bit PA but does not support FEAT_LPA2, A 4KB or 16KB kernel compiled with LPA2 support must still limit the PA size to 48 bits. Therefore, move the insertion of the PS field into TCR_EL2 out of __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode() where the rest of TCR_EL2 is prepared. This allows us to figure out PS with kvm_get_parange(), which has the appropriate logic to ensure the above requirement. (and the PS field of VTCR_EL2 is already populated this way). Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-8-ryan.roberts@arm.com (backported from commit bd412e2a310cbc43b424198b0065086b0f462625) Signed-off-by: Koba Ko --- arch/arm64/include/asm/kvm_arm.h | 2 ++ arch/arm64/include/asm/kvm_pgtable.h | 49 +++++++++++++++++++++------- arch/arm64/kvm/arm.c | 5 +++ arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 --- arch/arm64/kvm/hyp/pgtable.c | 15 +++++++-- 5 files changed, 56 insertions(+), 19 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 1095c6647e966..5e8a0835470d3 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -106,6 +106,7 @@ #define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En) /* TCR_EL2 Registers bits */ +#define TCR_EL2_DS (1UL << 32) #define TCR_EL2_RES1 ((1U << 31) | (1 << 23)) #define TCR_EL2_TBI (1 << 20) #define TCR_EL2_PS_SHIFT 16 @@ -120,6 +121,7 @@ TCR_EL2_ORGN0_MASK | TCR_EL2_IRGN0_MASK | TCR_EL2_T0SZ_MASK) /* VTCR_EL2 Registers bits */ +#define VTCR_EL2_DS TCR_EL2_DS #define VTCR_EL2_RES1 (1U << 31) #define VTCR_EL2_HD (1 << 22) #define VTCR_EL2_HA (1 << 21) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 10068500d6019..69a2a87ecaf6b 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -25,14 +25,24 @@ #define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U #endif -#define kvm_lpa2_is_enabled() false +#define kvm_lpa2_is_enabled() system_supports_lpa2() + +static inline u64 kvm_get_parange_max(void) +{ + if (kvm_lpa2_is_enabled() || + (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SHIFT == 16)) + return ID_AA64MMFR0_EL1_PARANGE_52; + else + return ID_AA64MMFR0_EL1_PARANGE_48; +} static inline u64 kvm_get_parange(u64 mmfr0) { + u64 parange_max = kvm_get_parange_max(); u64 parange = cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_EL1_PARANGE_SHIFT); - if (parange > ID_AA64MMFR0_EL1_PARANGE_MAX) - parange = ID_AA64MMFR0_EL1_PARANGE_MAX; + if (parange > parange_max) + parange = parange_max; return parange; } @@ -43,6 +53,8 @@ typedef u64 kvm_pte_t; #define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT) #define KVM_PTE_ADDR_51_48 GENMASK(15, 12) +#define KVM_PTE_ADDR_MASK_LPA2 GENMASK(49, PAGE_SHIFT) +#define KVM_PTE_ADDR_51_50_LPA2 GENMASK(9, 8) #define KVM_PHYS_INVALID (-1ULL) @@ -53,21 +65,34 @@ static inline bool kvm_pte_valid(kvm_pte_t pte) static inline u64 kvm_pte_to_phys(kvm_pte_t pte) { - u64 pa = pte & KVM_PTE_ADDR_MASK; - - if (PAGE_SHIFT == 16) - pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; + u64 pa; + + if (kvm_lpa2_is_enabled()) { + pa = pte & KVM_PTE_ADDR_MASK_LPA2; + pa |= FIELD_GET(KVM_PTE_ADDR_51_50_LPA2, pte) << 50; + } else { + pa = pte & KVM_PTE_ADDR_MASK; + if (PAGE_SHIFT == 16) + pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; + } return pa; } static inline kvm_pte_t kvm_phys_to_pte(u64 pa) { - kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; - - if (PAGE_SHIFT == 16) { - pa &= GENMASK(51, 48); - pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); + kvm_pte_t pte; + + if (kvm_lpa2_is_enabled()) { + pte = pa & KVM_PTE_ADDR_MASK_LPA2; + pa &= GENMASK(51, 50); + pte |= FIELD_PREP(KVM_PTE_ADDR_51_50_LPA2, pa >> 50); + } else { + pte = pa & KVM_PTE_ADDR_MASK; + if (PAGE_SHIFT == 16) { + pa &= GENMASK(51, 48); + pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); + } } return pte; diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index fe4314af8eecc..369ba2b8dbe54 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1719,6 +1719,7 @@ static int kvm_init_vector_slots(void) static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits) { struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu); + u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); unsigned long tcr; /* @@ -1741,6 +1742,10 @@ static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits) } tcr &= ~TCR_T0SZ_MASK; tcr |= TCR_T0SZ(hyp_va_bits); + tcr &= ~TCR_EL2_PS_MASK; + tcr |= FIELD_PREP(TCR_EL2_PS_MASK, kvm_get_parange(mmfr0)); + if (kvm_lpa2_is_enabled()) + tcr |= TCR_EL2_DS; params->tcr_el2 = tcr; params->pgd_pa = kvm_mmu_get_httbr(); diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S index 1cc06e6797bda..f62a7d3602857 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S @@ -122,11 +122,7 @@ alternative_if ARM64_HAS_CNP alternative_else_nop_endif msr ttbr0_el2, x2 - /* - * Set the PS bits in TCR_EL2. - */ ldr x0, [x0, #NVHE_INIT_TCR_EL2] - tcr_compute_pa_size x0, #TCR_EL2_PS_SHIFT, x1, x2 msr tcr_el2, x0 isb diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index ca0bf0b92ca09..f335ef0e85f0a 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -79,7 +79,10 @@ static bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx) static bool kvm_phys_is_valid(u64 phys) { - return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX)); + u64 parange_max = kvm_get_parange_max(); + u8 shift = id_aa64mmfr0_parange_to_phys_shift(parange_max); + + return phys < BIT(shift); } static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx, u64 phys) @@ -408,7 +411,8 @@ static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep) } attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap); - attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh); + if (!kvm_lpa2_is_enabled()) + attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh); attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF; attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW; *ptep = attr; @@ -654,6 +658,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) vtcr |= VTCR_EL2_HA; #endif /* CONFIG_ARM64_HW_AFDBM */ + if (kvm_lpa2_is_enabled()) + vtcr |= VTCR_EL2_DS; + /* Set the vmid bits */ vtcr |= (get_vmid_bits(mmfr1) == 16) ? VTCR_EL2_VS_16BIT : @@ -711,7 +718,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p if (prot & KVM_PGTABLE_PROT_W) attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W; - attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh); + if (!kvm_lpa2_is_enabled()) + attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh); + attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF; attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW; *ptep = attr; From 4944189717461433d4cf27cab72ae9fe8d50be74 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:33 +0000 Subject: [PATCH 018/548] KVM: arm64: Convert translation level parameter to s8 With the introduction of FEAT_LPA2, the Arm ARM adds a new level of translation, level -1, so levels can now be in the range [-1;3]. 3 is always the last level and the first level is determined based on the number of VA bits in use. Convert level variables to use a signed type in preparation for supporting this new level -1. Since the last level is always anchored at 3, and the first level varies to suit the number of VA/IPA bits, take the opportunity to replace KVM_PGTABLE_MAX_LEVELS with the 2 macros KVM_PGTABLE_FIRST_LEVEL and KVM_PGTABLE_LAST_LEVEL. This removes the assumption from the code that levels run from 0 to KVM_PGTABLE_MAX_LEVELS - 1, which will soon no longer be true. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-9-ryan.roberts@arm.com (cherry picked from commit 419edf48d79f6fb2cc3fa090131864e95b321d41) Signed-off-by: Koba Ko --- arch/arm64/include/asm/kvm_emulate.h | 2 +- arch/arm64/include/asm/kvm_pgtable.h | 31 +++++++------ arch/arm64/include/asm/kvm_pkvm.h | 5 +- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 6 +-- arch/arm64/kvm/hyp/nvhe/mm.c | 4 +- arch/arm64/kvm/hyp/nvhe/setup.c | 2 +- arch/arm64/kvm/hyp/pgtable.c | 66 +++++++++++++++------------ arch/arm64/kvm/mmu.c | 16 ++++--- 8 files changed, 71 insertions(+), 61 deletions(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 3d6725ff0bf6d..bf3ef66eb51fb 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -404,7 +404,7 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE; } -static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) +static __always_inline s8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) { return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL; } diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 69a2a87ecaf6b..3253828e453d5 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -11,7 +11,8 @@ #include #include -#define KVM_PGTABLE_MAX_LEVELS 4U +#define KVM_PGTABLE_FIRST_LEVEL 0 +#define KVM_PGTABLE_LAST_LEVEL 3 /* * The largest supported block sizes for KVM (no 52-bit PA support): @@ -20,9 +21,9 @@ * - 64K (level 2): 512MB */ #ifdef CONFIG_ARM64_4K_PAGES -#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1U +#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1 #else -#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U +#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2 #endif #define kvm_lpa2_is_enabled() system_supports_lpa2() @@ -103,28 +104,28 @@ static inline kvm_pfn_t kvm_pte_to_pfn(kvm_pte_t pte) return __phys_to_pfn(kvm_pte_to_phys(pte)); } -static inline u64 kvm_granule_shift(u32 level) +static inline u64 kvm_granule_shift(s8 level) { - /* Assumes KVM_PGTABLE_MAX_LEVELS is 4 */ + /* Assumes KVM_PGTABLE_LAST_LEVEL is 3 */ return ARM64_HW_PGTABLE_LEVEL_SHIFT(level); } -static inline u64 kvm_granule_size(u32 level) +static inline u64 kvm_granule_size(s8 level) { return BIT(kvm_granule_shift(level)); } -static inline bool kvm_level_supports_block_mapping(u32 level) +static inline bool kvm_level_supports_block_mapping(s8 level) { return level >= KVM_PGTABLE_MIN_BLOCK_LEVEL; } static inline u32 kvm_supported_block_sizes(void) { - u32 level = KVM_PGTABLE_MIN_BLOCK_LEVEL; + s8 level = KVM_PGTABLE_MIN_BLOCK_LEVEL; u32 r = 0; - for (; level < KVM_PGTABLE_MAX_LEVELS; level++) + for (; level <= KVM_PGTABLE_LAST_LEVEL; level++) r |= BIT(kvm_granule_shift(level)); return r; @@ -169,7 +170,7 @@ struct kvm_pgtable_mm_ops { void* (*zalloc_page)(void *arg); void* (*zalloc_pages_exact)(size_t size); void (*free_pages_exact)(void *addr, size_t size); - void (*free_unlinked_table)(void *addr, u32 level); + void (*free_unlinked_table)(void *addr, s8 level); void (*get_page)(void *addr); void (*put_page)(void *addr); int (*page_count)(void *addr); @@ -265,7 +266,7 @@ struct kvm_pgtable_visit_ctx { u64 start; u64 addr; u64 end; - u32 level; + s8 level; enum kvm_pgtable_walk_flags flags; }; @@ -368,7 +369,7 @@ static inline bool kvm_pgtable_walk_lock_held(void) */ struct kvm_pgtable { u32 ia_bits; - u32 start_level; + s8 start_level; kvm_pteref_t pgd; struct kvm_pgtable_mm_ops *mm_ops; @@ -502,7 +503,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt); * The page-table is assumed to be unreachable by any hardware walkers prior to * freeing and therefore no TLB invalidation is performed. */ -void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level); +void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, s8 level); /** * kvm_pgtable_stage2_create_unlinked() - Create an unlinked stage-2 paging structure. @@ -526,7 +527,7 @@ void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *p * an ERR_PTR(error) on failure. */ kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, - u64 phys, u32 level, + u64 phys, s8 level, enum kvm_pgtable_prot prot, void *mc, bool force_pte); @@ -752,7 +753,7 @@ int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size, * Return: 0 on success, negative error code on failure. */ int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr, - kvm_pte_t *ptep, u32 *level); + kvm_pte_t *ptep, s8 *level); /** * kvm_pgtable_stage2_pte_prot() - Retrieve the protection attributes of a diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h index e46250a020172..ad9cfb5c1ff4e 100644 --- a/arch/arm64/include/asm/kvm_pkvm.h +++ b/arch/arm64/include/asm/kvm_pkvm.h @@ -56,10 +56,11 @@ static inline unsigned long hyp_vm_table_pages(void) static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages) { - unsigned long total = 0, i; + unsigned long total = 0; + int i; /* Provision the worst case scenario */ - for (i = 0; i < KVM_PGTABLE_MAX_LEVELS; i++) { + for (i = KVM_PGTABLE_FIRST_LEVEL; i <= KVM_PGTABLE_LAST_LEVEL; i++) { nr_pages = DIV_ROUND_UP(nr_pages, PTRS_PER_PTE); total += nr_pages; } diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 9d703441278bd..2cfb6352a8ea0 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -91,7 +91,7 @@ static void host_s2_put_page(void *addr) hyp_put_page(&host_s2_pool, addr); } -static void host_s2_free_unlinked_table(void *addr, u32 level) +static void host_s2_free_unlinked_table(void *addr, s8 level) { kvm_pgtable_stage2_free_unlinked(&host_mmu.mm_ops, addr, level); } @@ -443,7 +443,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range) { struct kvm_mem_range cur; kvm_pte_t pte; - u32 level; + s8 level; int ret; hyp_assert_lock_held(&host_mmu.lock); @@ -462,7 +462,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range) cur.start = ALIGN_DOWN(addr, granule); cur.end = cur.start + granule; level++; - } while ((level < KVM_PGTABLE_MAX_LEVELS) && + } while ((level <= KVM_PGTABLE_LAST_LEVEL) && !(kvm_level_supports_block_mapping(level) && range_included(&cur, range))); diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c index 65a7a186d7b21..b01a3d1078a88 100644 --- a/arch/arm64/kvm/hyp/nvhe/mm.c +++ b/arch/arm64/kvm/hyp/nvhe/mm.c @@ -260,7 +260,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot) * https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/T/#mf10dfbaf1eaef9274c581b81c53758918c1d0f03 */ dsb(ishst); - __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), (KVM_PGTABLE_MAX_LEVELS - 1)); + __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), KVM_PGTABLE_LAST_LEVEL); dsb(ish); isb(); } @@ -275,7 +275,7 @@ static int __create_fixmap_slot_cb(const struct kvm_pgtable_visit_ctx *ctx, { struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg); - if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1) + if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_LAST_LEVEL) return -EINVAL; slot->addr = ctx->addr; diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 0d5e0a89ddce5..bc58d1b515af1 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -181,7 +181,7 @@ static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx, if (!kvm_pte_valid(ctx->old)) return 0; - if (ctx->level != (KVM_PGTABLE_MAX_LEVELS - 1)) + if (ctx->level != KVM_PGTABLE_LAST_LEVEL) return -EINVAL; phys = kvm_pte_to_phys(ctx->old); diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index f335ef0e85f0a..3b6053e54da0d 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -101,7 +101,7 @@ static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx, return IS_ALIGNED(ctx->addr, granule); } -static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level) +static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, s8 level) { u64 shift = kvm_granule_shift(level); u64 mask = BIT(PAGE_SHIFT - 3) - 1; @@ -117,7 +117,7 @@ static u32 kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr) return (addr & mask) >> shift; } -static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level) +static u32 kvm_pgd_pages(u32 ia_bits, s8 start_level) { struct kvm_pgtable pgt = { .ia_bits = ia_bits, @@ -127,9 +127,9 @@ static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level) return kvm_pgd_page_idx(&pgt, -1ULL) + 1; } -static bool kvm_pte_table(kvm_pte_t pte, u32 level) +static bool kvm_pte_table(kvm_pte_t pte, s8 level) { - if (level == KVM_PGTABLE_MAX_LEVELS - 1) + if (level == KVM_PGTABLE_LAST_LEVEL) return false; if (!kvm_pte_valid(pte)) @@ -157,11 +157,11 @@ static kvm_pte_t kvm_init_table_pte(kvm_pte_t *childp, struct kvm_pgtable_mm_ops return pte; } -static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level) +static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, s8 level) { kvm_pte_t pte = kvm_phys_to_pte(pa); - u64 type = (level == KVM_PGTABLE_MAX_LEVELS - 1) ? KVM_PTE_TYPE_PAGE : - KVM_PTE_TYPE_BLOCK; + u64 type = (level == KVM_PGTABLE_LAST_LEVEL) ? KVM_PTE_TYPE_PAGE : + KVM_PTE_TYPE_BLOCK; pte |= attr & (KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI); pte |= FIELD_PREP(KVM_PTE_TYPE, type); @@ -206,11 +206,11 @@ static bool kvm_pgtable_walk_continue(const struct kvm_pgtable_walker *walker, } static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data, - struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level); + struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level); static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data, struct kvm_pgtable_mm_ops *mm_ops, - kvm_pteref_t pteref, u32 level) + kvm_pteref_t pteref, s8 level) { enum kvm_pgtable_walk_flags flags = data->walker->flags; kvm_pte_t *ptep = kvm_dereference_pteref(data->walker, pteref); @@ -275,12 +275,13 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data, } static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data, - struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level) + struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level) { u32 idx; int ret = 0; - if (WARN_ON_ONCE(level >= KVM_PGTABLE_MAX_LEVELS)) + if (WARN_ON_ONCE(level < KVM_PGTABLE_FIRST_LEVEL || + level > KVM_PGTABLE_LAST_LEVEL)) return -EINVAL; for (idx = kvm_pgtable_idx(data, level); idx < PTRS_PER_PTE; ++idx) { @@ -343,7 +344,7 @@ int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size, struct leaf_walk_data { kvm_pte_t pte; - u32 level; + s8 level; }; static int leaf_walker(const struct kvm_pgtable_visit_ctx *ctx, @@ -358,7 +359,7 @@ static int leaf_walker(const struct kvm_pgtable_visit_ctx *ctx, } int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr, - kvm_pte_t *ptep, u32 *level) + kvm_pte_t *ptep, s8 *level) { struct leaf_walk_data data; struct kvm_pgtable_walker walker = { @@ -471,7 +472,7 @@ static int hyp_map_walker(const struct kvm_pgtable_visit_ctx *ctx, if (hyp_map_walker_try_leaf(ctx, data)) return 0; - if (WARN_ON(ctx->level == KVM_PGTABLE_MAX_LEVELS - 1)) + if (WARN_ON(ctx->level == KVM_PGTABLE_LAST_LEVEL)) return -EINVAL; childp = (kvm_pte_t *)mm_ops->zalloc_page(NULL); @@ -567,14 +568,19 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size) int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits, struct kvm_pgtable_mm_ops *mm_ops) { - u64 levels = ARM64_HW_PGTABLE_LEVELS(va_bits); + s8 start_level = KVM_PGTABLE_LAST_LEVEL + 1 - + ARM64_HW_PGTABLE_LEVELS(va_bits); + + if (start_level < KVM_PGTABLE_FIRST_LEVEL || + start_level > KVM_PGTABLE_LAST_LEVEL) + return -EINVAL; pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_page(NULL); if (!pgt->pgd) return -ENOMEM; pgt->ia_bits = va_bits; - pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels; + pgt->start_level = start_level; pgt->mm_ops = mm_ops; pgt->mmu = NULL; pgt->force_pte_cb = NULL; @@ -628,7 +634,7 @@ struct stage2_map_data { u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) { u64 vtcr = VTCR_EL2_FLAGS; - u8 lvls; + s8 lvls; vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT; vtcr |= VTCR_EL2_T0SZ(phys_shift); @@ -918,7 +924,7 @@ static bool stage2_leaf_mapping_allowed(const struct kvm_pgtable_visit_ctx *ctx, { u64 phys = stage2_map_walker_phys_addr(ctx, data); - if (data->force_pte && (ctx->level < (KVM_PGTABLE_MAX_LEVELS - 1))) + if (data->force_pte && ctx->level < KVM_PGTABLE_LAST_LEVEL) return false; return kvm_block_mapping_supported(ctx, phys); @@ -997,7 +1003,7 @@ static int stage2_map_walk_leaf(const struct kvm_pgtable_visit_ctx *ctx, if (ret != -E2BIG) return ret; - if (WARN_ON(ctx->level == KVM_PGTABLE_MAX_LEVELS - 1)) + if (WARN_ON(ctx->level == KVM_PGTABLE_LAST_LEVEL)) return -EINVAL; if (!data->memcache) @@ -1167,7 +1173,7 @@ struct stage2_attr_data { kvm_pte_t attr_set; kvm_pte_t attr_clr; kvm_pte_t pte; - u32 level; + s8 level; }; static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx, @@ -1210,7 +1216,7 @@ static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx, static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr, u64 size, kvm_pte_t attr_set, kvm_pte_t attr_clr, kvm_pte_t *orig_pte, - u32 *level, enum kvm_pgtable_walk_flags flags) + s8 *level, enum kvm_pgtable_walk_flags flags) { int ret; kvm_pte_t attr_mask = KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI; @@ -1312,7 +1318,7 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, enum kvm_pgtable_prot prot) { int ret; - u32 level; + s8 level; kvm_pte_t set = 0, clr = 0; if (prot & KVM_PTE_LEAF_ATTR_HI_SW) @@ -1365,7 +1371,7 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size) } kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, - u64 phys, u32 level, + u64 phys, s8 level, enum kvm_pgtable_prot prot, void *mc, bool force_pte) { @@ -1423,7 +1429,7 @@ kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, * fully populated tree up to the PTE entries. Note that @level is * interpreted as in "level @level entry". */ -static int stage2_block_get_nr_page_tables(u32 level) +static int stage2_block_get_nr_page_tables(s8 level) { switch (level) { case 1: @@ -1434,7 +1440,7 @@ static int stage2_block_get_nr_page_tables(u32 level) return 0; default: WARN_ON_ONCE(level < KVM_PGTABLE_MIN_BLOCK_LEVEL || - level >= KVM_PGTABLE_MAX_LEVELS); + level > KVM_PGTABLE_LAST_LEVEL); return -EINVAL; }; } @@ -1447,13 +1453,13 @@ static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx, struct kvm_s2_mmu *mmu; kvm_pte_t pte = ctx->old, new, *childp; enum kvm_pgtable_prot prot; - u32 level = ctx->level; + s8 level = ctx->level; bool force_pte; int nr_pages; u64 phys; /* No huge-pages exist at the last level */ - if (level == KVM_PGTABLE_MAX_LEVELS - 1) + if (level == KVM_PGTABLE_LAST_LEVEL) return 0; /* We only split valid block mappings */ @@ -1530,7 +1536,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, u64 vtcr = mmu->arch->vtcr; u32 ia_bits = VTCR_EL2_IPA(vtcr); u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); - u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; + s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE; pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_pages_exact(pgd_sz); @@ -1553,7 +1559,7 @@ size_t kvm_pgtable_stage2_pgd_size(u64 vtcr) { u32 ia_bits = VTCR_EL2_IPA(vtcr); u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); - u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; + s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; return kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE; } @@ -1589,7 +1595,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt) pgt->pgd = NULL; } -void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level) +void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, s8 level) { kvm_pteref_t ptep = (kvm_pteref_t)pgtable; struct kvm_pgtable_walker walker = { diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 482280fe22d7c..9d5ab07c496a9 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -223,12 +223,12 @@ static void stage2_free_unlinked_table_rcu_cb(struct rcu_head *head) { struct page *page = container_of(head, struct page, rcu_head); void *pgtable = page_to_virt(page); - u32 level = page_private(page); + s8 level = page_private(page); kvm_pgtable_stage2_free_unlinked(&kvm_s2_mm_ops, pgtable, level); } -static void stage2_free_unlinked_table(void *addr, u32 level) +static void stage2_free_unlinked_table(void *addr, s8 level) { struct page *page = virt_to_page(addr); @@ -804,13 +804,13 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr) struct kvm_pgtable pgt = { .pgd = (kvm_pteref_t)kvm->mm->pgd, .ia_bits = vabits_actual, - .start_level = (KVM_PGTABLE_MAX_LEVELS - - CONFIG_PGTABLE_LEVELS), + .start_level = (KVM_PGTABLE_LAST_LEVEL - + CONFIG_PGTABLE_LEVELS + 1), .mm_ops = &kvm_user_mm_ops, }; unsigned long flags; kvm_pte_t pte = 0; /* Keep GCC quiet... */ - u32 level = ~0; + s8 level = S8_MAX; int ret; /* @@ -829,7 +829,9 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr) * Not seeing an error, but not updating level? Something went * deeply wrong... */ - if (WARN_ON(level >= KVM_PGTABLE_MAX_LEVELS)) + if (WARN_ON(level > KVM_PGTABLE_LAST_LEVEL)) + return -EFAULT; + if (WARN_ON(level < KVM_PGTABLE_FIRST_LEVEL)) return -EFAULT; /* Oops, the userspace PTs are gone... Replay the fault */ @@ -1407,7 +1409,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, gfn_t gfn; kvm_pfn_t pfn; bool logging_active = memslot_is_logging(memslot); - unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu); + s8 fault_level = kvm_vcpu_trap_get_fault_level(vcpu); long vma_pagesize, fault_granule; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; struct kvm_pgtable *pgt; From 521dceb610f429b692f2a4403aee4df967bce6f6 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:34 +0000 Subject: [PATCH 019/548] KVM: arm64: Support up to 5 levels of translation in kvm_pgtable FEAT_LPA2 increases the maximum levels of translation from 4 to 5 for the 4KB page case, when IA is >48 bits. While we can still use 4 levels for stage2 translation in this case (due to stage2 allowing concatenated page tables for first level lookup), the same kvm_pgtable library is used for the hyp stage1 page tables and stage1 does not support concatenation. Therefore, modify the library to support up to 5 levels. Previous patches already laid the groundwork for this by refactoring code to work in terms of KVM_PGTABLE_FIRST_LEVEL and KVM_PGTABLE_LAST_LEVEL. So we just need to change these macros. The hardware sometimes encodes the new level differently from the others: One such place is when reading the level from the FSC field in the ESR_EL2 register. We never expect to see the lowest level (-1) here since the stage 2 page tables always use concatenated tables for first level lookup and therefore only use 4 levels of lookup. So we get away with just adding a comment to explain why we are not being careful about decoding level -1. For stage2 VTCR_EL2.SL2 is introduced to encode the new start level. However, since we always use concatenated page tables for first level look up at stage2 (and therefore we will never need the new extra level) we never touch this new field. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-10-ryan.roberts@arm.com (cherry picked from commit 0abc1b11a032199bb134fd25cd7ee0cdb26b7b03) Signed-off-by: Koba Ko --- arch/arm64/include/asm/kvm_emulate.h | 10 ++++++++++ arch/arm64/include/asm/kvm_pgtable.h | 2 +- arch/arm64/kvm/hyp/pgtable.c | 9 +++++++++ 3 files changed, 20 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index bf3ef66eb51fb..9afce5e423524 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -406,6 +406,16 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc static __always_inline s8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) { + /* + * Note: With the introduction of FEAT_LPA2 an extra level of + * translation (level -1) is added. This level (obviously) doesn't + * follow the previous convention of encoding the 4 levels in the 2 LSBs + * of the FSC so this function breaks if the fault is for level -1. + * + * However, stage2 tables always use concatenated tables for first level + * lookup and therefore it is guaranteed that the level will be between + * 0 and 3, and this function continues to work. + */ return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL; } diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 3253828e453d5..cfdf40f734b12 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -11,7 +11,7 @@ #include #include -#define KVM_PGTABLE_FIRST_LEVEL 0 +#define KVM_PGTABLE_FIRST_LEVEL -1 #define KVM_PGTABLE_LAST_LEVEL 3 /* diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 3b6053e54da0d..42bf0d324e6a4 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -645,6 +645,15 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift) lvls = stage2_pgtable_levels(phys_shift); if (lvls < 2) lvls = 2; + + /* + * When LPA2 is enabled, the HW supports an extra level of translation + * (for 5 in total) when using 4K pages. It also introduces VTCR_EL2.SL2 + * to as an addition to SL0 to enable encoding this extra start level. + * However, since we always use concatenated pages for the first level + * lookup, we will never need this extra level and therefore do not need + * to touch SL2. + */ vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls); #ifdef CONFIG_ARM64_HW_AFDBM From 34420face7cb65f053931baae7c02ab9a104c5d1 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:35 +0000 Subject: [PATCH 020/548] KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems With all the page-table infrastructure in place, we can finally increase the maximum permisable IPA size to 52-bits on 4KB and 16KB page systems that have FEAT_LPA2. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-11-ryan.roberts@arm.com (cherry picked from commit d782ac5b2ceebee5d374e13e990655d1a140d3a6) Signed-off-by: Koba Ko --- arch/arm64/kvm/reset.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 43a53a403f510..af84bdbc88a04 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -319,12 +319,11 @@ int __init kvm_set_ipa_limit(void) parange = cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_EL1_PARANGE_SHIFT); /* - * IPA size beyond 48 bits could not be supported - * on either 4K or 16K page size. Hence let's cap - * it to 48 bits, in case it's reported as larger - * on the system. + * IPA size beyond 48 bits for 4K and 16K page size is only supported + * when LPA2 is available. So if we have LPA2, enable it, else cap to 48 + * bits, in case it's reported as larger on the system. */ - if (PAGE_SIZE != SZ_64K) + if (!kvm_lpa2_is_enabled() && PAGE_SIZE != SZ_64K) parange = min(parange, (unsigned int)ID_AA64MMFR0_EL1_PARANGE_48); /* From 85fbbbd28447023d4236ba897e33c2900231d51c Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:36 +0000 Subject: [PATCH 021/548] KVM: selftests: arm64: Determine max ipa size per-page size We are about to add 52 bit PA guest modes for 4K and 16K pages when the system supports LPA2. In preparation beef up the logic that parses mmfr0 to also tell us what the maximum supported PA size is for each page size. Max PA size = 0 implies the page size is not supported at all. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-12-ryan.roberts@arm.com (cherry picked from commit 72324ac52ddddb168d8008e36d6a617b9b74f6c1) Signed-off-by: Koba Ko --- .../selftests/kvm/include/aarch64/processor.h | 4 +- .../selftests/kvm/include/guest_modes.h | 4 +- .../selftests/kvm/lib/aarch64/processor.c | 30 ++++++++++-- tools/testing/selftests/kvm/lib/guest_modes.c | 48 ++++++++----------- 4 files changed, 50 insertions(+), 36 deletions(-) diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h index cb537253a6b9c..9e415cf4f8dd2 100644 --- a/tools/testing/selftests/kvm/include/aarch64/processor.h +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h @@ -118,8 +118,8 @@ enum { /* Access flag update enable/disable */ #define TCR_EL1_HA (1ULL << 39) -void aarch64_get_supported_page_sizes(uint32_t ipa, - bool *ps4k, bool *ps16k, bool *ps64k); +void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k, + uint32_t *ipa16k, uint32_t *ipa64k); void vm_init_descriptor_tables(struct kvm_vm *vm); void vcpu_init_descriptor_tables(struct kvm_vcpu *vcpu); diff --git a/tools/testing/selftests/kvm/include/guest_modes.h b/tools/testing/selftests/kvm/include/guest_modes.h index b691df33e64e1..63f5167397ccb 100644 --- a/tools/testing/selftests/kvm/include/guest_modes.h +++ b/tools/testing/selftests/kvm/include/guest_modes.h @@ -11,8 +11,8 @@ struct guest_mode { extern struct guest_mode guest_modes[NUM_VM_MODES]; -#define guest_mode_append(mode, supported, enabled) ({ \ - guest_modes[mode] = (struct guest_mode){ supported, enabled }; \ +#define guest_mode_append(mode, enabled) ({ \ + guest_modes[mode] = (struct guest_mode){ (enabled), (enabled) }; \ }) void guest_modes_append_default(void); diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c index 3a0259e253353..e6ffd9037c373 100644 --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c @@ -492,12 +492,24 @@ uint32_t guest_get_vcpuid(void) return read_sysreg(tpidr_el1); } -void aarch64_get_supported_page_sizes(uint32_t ipa, - bool *ps4k, bool *ps16k, bool *ps64k) +static uint32_t max_ipa_for_page_size(uint32_t vm_ipa, uint32_t gran, + uint32_t not_sup_val, uint32_t ipa52_min_val) +{ + if (gran == not_sup_val) + return 0; + else if (gran >= ipa52_min_val && vm_ipa >= 52) + return 52; + else + return min(vm_ipa, 48U); +} + +void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k, + uint32_t *ipa16k, uint32_t *ipa64k) { struct kvm_vcpu_init preferred_init; int kvm_fd, vm_fd, vcpu_fd, err; uint64_t val; + uint32_t gran; struct kvm_one_reg reg = { .id = KVM_ARM64_SYS_REG(SYS_ID_AA64MMFR0_EL1), .addr = (uint64_t)&val, @@ -518,9 +530,17 @@ void aarch64_get_supported_page_sizes(uint32_t ipa, err = ioctl(vcpu_fd, KVM_GET_ONE_REG, ®); TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd)); - *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_TGRAN4), val) != 0xf; - *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_TGRAN64), val) == 0; - *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_TGRAN16), val) != 0; + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val); + *ipa4k = max_ipa_for_page_size(ipa, gran, ID_AA64MMFR0_EL1_TGRAN4_NI, + ID_AA64MMFR0_EL1_TGRAN4_52_BIT); + + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val); + *ipa64k = max_ipa_for_page_size(ipa, gran, ID_AA64MMFR0_EL1_TGRAN64_NI, + ID_AA64MMFR0_EL1_TGRAN64_IMP); + + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val); + *ipa16k = max_ipa_for_page_size(ipa, gran, ID_AA64MMFR0_EL1_TGRAN16_NI, + ID_AA64MMFR0_EL1_TGRAN16_52_BIT); close(vcpu_fd); close(vm_fd); diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c index 1df3ce4b16fd8..772a7dd15db48 100644 --- a/tools/testing/selftests/kvm/lib/guest_modes.c +++ b/tools/testing/selftests/kvm/lib/guest_modes.c @@ -14,37 +14,31 @@ struct guest_mode guest_modes[NUM_VM_MODES]; void guest_modes_append_default(void) { #ifndef __aarch64__ - guest_mode_append(VM_MODE_DEFAULT, true, true); + guest_mode_append(VM_MODE_DEFAULT, true); #else { unsigned int limit = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE); - bool ps4k, ps16k, ps64k; + uint32_t ipa4k, ipa16k, ipa64k; int i; - aarch64_get_supported_page_sizes(limit, &ps4k, &ps16k, &ps64k); + aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k); - vm_mode_default = NUM_VM_MODES; + guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52); - if (limit >= 52) - guest_mode_append(VM_MODE_P52V48_64K, ps64k, ps64k); - if (limit >= 48) { - guest_mode_append(VM_MODE_P48V48_4K, ps4k, ps4k); - guest_mode_append(VM_MODE_P48V48_16K, ps16k, ps16k); - guest_mode_append(VM_MODE_P48V48_64K, ps64k, ps64k); - } - if (limit >= 40) { - guest_mode_append(VM_MODE_P40V48_4K, ps4k, ps4k); - guest_mode_append(VM_MODE_P40V48_16K, ps16k, ps16k); - guest_mode_append(VM_MODE_P40V48_64K, ps64k, ps64k); - if (ps4k) - vm_mode_default = VM_MODE_P40V48_4K; - } - if (limit >= 36) { - guest_mode_append(VM_MODE_P36V48_4K, ps4k, ps4k); - guest_mode_append(VM_MODE_P36V48_16K, ps16k, ps16k); - guest_mode_append(VM_MODE_P36V48_64K, ps64k, ps64k); - guest_mode_append(VM_MODE_P36V47_16K, ps16k, ps16k); - } + guest_mode_append(VM_MODE_P48V48_4K, ipa4k >= 48); + guest_mode_append(VM_MODE_P48V48_16K, ipa16k >= 48); + guest_mode_append(VM_MODE_P48V48_64K, ipa64k >= 48); + + guest_mode_append(VM_MODE_P40V48_4K, ipa4k >= 40); + guest_mode_append(VM_MODE_P40V48_16K, ipa16k >= 40); + guest_mode_append(VM_MODE_P40V48_64K, ipa64k >= 40); + + guest_mode_append(VM_MODE_P36V48_4K, ipa4k >= 36); + guest_mode_append(VM_MODE_P36V48_16K, ipa16k >= 36); + guest_mode_append(VM_MODE_P36V48_64K, ipa64k >= 36); + guest_mode_append(VM_MODE_P36V47_16K, ipa16k >= 36); + + vm_mode_default = ipa4k >= 40 ? VM_MODE_P40V48_4K : NUM_VM_MODES; /* * Pick the first supported IPA size if the default @@ -72,7 +66,7 @@ void guest_modes_append_default(void) close(kvm_fd); /* Starting with z13 we have 47bits of physical address */ if (info.ibc >= 0x30) - guest_mode_append(VM_MODE_P47V64_4K, true, true); + guest_mode_append(VM_MODE_P47V64_4K, true); } #endif #ifdef __riscv @@ -80,9 +74,9 @@ void guest_modes_append_default(void) unsigned int sz = kvm_check_cap(KVM_CAP_VM_GPA_BITS); if (sz >= 52) - guest_mode_append(VM_MODE_P52V48_4K, true, true); + guest_mode_append(VM_MODE_P52V48_4K, true); if (sz >= 48) - guest_mode_append(VM_MODE_P48V48_4K, true, true); + guest_mode_append(VM_MODE_P48V48_4K, true); } #endif } From f518f81145bb7076a8441c502e2066d4b33a45c2 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 27 Nov 2023 11:17:37 +0000 Subject: [PATCH 022/548] KVM: selftests: arm64: Support P52V48 4K and 16K guest_modes Add support for VM_MODE_P52V48_4K and VM_MODE_P52V48_16K guest modes by using the FEAT_LPA2 pte format for stage1, when FEAT_LPA2 is available. Reviewed-by: Oliver Upton Signed-off-by: Ryan Roberts Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20231127111737.1897081-13-ryan.roberts@arm.com (cherry picked from commit 10a0cc3b688fcf753ff3f6518bb15e7a6809e908) Signed-off-by: Koba Ko --- .../selftests/kvm/include/kvm_util_base.h | 1 + .../selftests/kvm/lib/aarch64/processor.c | 39 ++++++++++++++----- tools/testing/selftests/kvm/lib/guest_modes.c | 2 + tools/testing/selftests/kvm/lib/kvm_util.c | 3 ++ 4 files changed, 36 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index a18db6a7b3cf4..406500fb6e284 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -171,6 +171,7 @@ static inline struct userspace_mem_region *vm_get_mem_region(struct kvm_vm *vm, enum vm_guest_mode { VM_MODE_P52V48_4K, + VM_MODE_P52V48_16K, VM_MODE_P52V48_64K, VM_MODE_P48V48_4K, VM_MODE_P48V48_16K, diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c index e6ffd9037c373..41c776b642c0c 100644 --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c @@ -12,6 +12,7 @@ #include "kvm_util.h" #include "processor.h" #include +#include #define DEFAULT_ARM64_GUEST_STACK_VADDR_MIN 0xac0000 @@ -58,13 +59,25 @@ static uint64_t pte_index(struct kvm_vm *vm, vm_vaddr_t gva) return (gva >> vm->page_shift) & mask; } +static inline bool use_lpa2_pte_format(struct kvm_vm *vm) +{ + return (vm->page_size == SZ_4K || vm->page_size == SZ_16K) && + (vm->pa_bits > 48 || vm->va_bits > 48); +} + static uint64_t addr_pte(struct kvm_vm *vm, uint64_t pa, uint64_t attrs) { uint64_t pte; - pte = pa & GENMASK(47, vm->page_shift); - if (vm->page_shift == 16) - pte |= FIELD_GET(GENMASK(51, 48), pa) << 12; + if (use_lpa2_pte_format(vm)) { + pte = pa & GENMASK(49, vm->page_shift); + pte |= FIELD_GET(GENMASK(51, 50), pa) << 8; + attrs &= ~GENMASK(9, 8); + } else { + pte = pa & GENMASK(47, vm->page_shift); + if (vm->page_shift == 16) + pte |= FIELD_GET(GENMASK(51, 48), pa) << 12; + } pte |= attrs; return pte; @@ -74,9 +87,14 @@ static uint64_t pte_addr(struct kvm_vm *vm, uint64_t pte) { uint64_t pa; - pa = pte & GENMASK(47, vm->page_shift); - if (vm->page_shift == 16) - pa |= FIELD_GET(GENMASK(15, 12), pte) << 48; + if (use_lpa2_pte_format(vm)) { + pa = pte & GENMASK(49, vm->page_shift); + pa |= FIELD_GET(GENMASK(9, 8), pte) << 50; + } else { + pa = pte & GENMASK(47, vm->page_shift); + if (vm->page_shift == 16) + pa |= FIELD_GET(GENMASK(15, 12), pte) << 48; + } return pa; } @@ -266,9 +284,6 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init) /* Configure base granule size */ switch (vm->mode) { - case VM_MODE_P52V48_4K: - TEST_FAIL("AArch64 does not support 4K sized pages " - "with 52-bit physical address ranges"); case VM_MODE_PXXV48_4K: TEST_FAIL("AArch64 does not support 4K sized pages " "with ANY-bit physical address ranges"); @@ -278,12 +293,14 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init) case VM_MODE_P36V48_64K: tcr_el1 |= 1ul << 14; /* TG0 = 64KB */ break; + case VM_MODE_P52V48_16K: case VM_MODE_P48V48_16K: case VM_MODE_P40V48_16K: case VM_MODE_P36V48_16K: case VM_MODE_P36V47_16K: tcr_el1 |= 2ul << 14; /* TG0 = 16KB */ break; + case VM_MODE_P52V48_4K: case VM_MODE_P48V48_4K: case VM_MODE_P40V48_4K: case VM_MODE_P36V48_4K: @@ -297,6 +314,8 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init) /* Configure output size */ switch (vm->mode) { + case VM_MODE_P52V48_4K: + case VM_MODE_P52V48_16K: case VM_MODE_P52V48_64K: tcr_el1 |= 6ul << 32; /* IPS = 52 bits */ ttbr0_el1 |= FIELD_GET(GENMASK(51, 48), vm->pgd) << 2; @@ -325,6 +344,8 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init) /* TCR_EL1 |= IRGN0:WBWA | ORGN0:WBWA | SH0:Inner-Shareable */; tcr_el1 |= (1 << 8) | (1 << 10) | (3 << 12); tcr_el1 |= (64 - vm->va_bits) /* T0SZ */; + if (use_lpa2_pte_format(vm)) + tcr_el1 |= (1ul << 59) /* DS */; vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_SCTLR_EL1), sctlr_el1); vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TCR_EL1), tcr_el1); diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c index 772a7dd15db48..b04901e551387 100644 --- a/tools/testing/selftests/kvm/lib/guest_modes.c +++ b/tools/testing/selftests/kvm/lib/guest_modes.c @@ -23,6 +23,8 @@ void guest_modes_append_default(void) aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k); + guest_mode_append(VM_MODE_P52V48_4K, ipa4k >= 52); + guest_mode_append(VM_MODE_P52V48_16K, ipa16k >= 52); guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52); guest_mode_append(VM_MODE_P48V48_4K, ipa4k >= 48); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 7a8af1821f5da..aeba7a23105c6 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -148,6 +148,7 @@ const char *vm_guest_mode_string(uint32_t i) { static const char * const strings[] = { [VM_MODE_P52V48_4K] = "PA-bits:52, VA-bits:48, 4K pages", + [VM_MODE_P52V48_16K] = "PA-bits:52, VA-bits:48, 16K pages", [VM_MODE_P52V48_64K] = "PA-bits:52, VA-bits:48, 64K pages", [VM_MODE_P48V48_4K] = "PA-bits:48, VA-bits:48, 4K pages", [VM_MODE_P48V48_16K] = "PA-bits:48, VA-bits:48, 16K pages", @@ -173,6 +174,7 @@ const char *vm_guest_mode_string(uint32_t i) const struct vm_guest_mode_params vm_guest_mode_params[] = { [VM_MODE_P52V48_4K] = { 52, 48, 0x1000, 12 }, + [VM_MODE_P52V48_16K] = { 52, 48, 0x4000, 14 }, [VM_MODE_P52V48_64K] = { 52, 48, 0x10000, 16 }, [VM_MODE_P48V48_4K] = { 48, 48, 0x1000, 12 }, [VM_MODE_P48V48_16K] = { 48, 48, 0x4000, 14 }, @@ -251,6 +253,7 @@ struct kvm_vm *____vm_create(enum vm_guest_mode mode) case VM_MODE_P36V48_64K: vm->pgtable_levels = 3; break; + case VM_MODE_P52V48_16K: case VM_MODE_P48V48_16K: case VM_MODE_P40V48_16K: case VM_MODE_P36V48_16K: From e87d3d4eaf3d3ec28f9ed8b02d8a505c96294c7b Mon Sep 17 00:00:00 2001 From: Alexey Kardashevskiy Date: Thu, 7 Mar 2024 13:20:06 +1100 Subject: [PATCH 023/548] PCI/DOE: Support discovery version 2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PCIe r6.1, sec 6.30.1.1 defines a "DOE Discovery Version" field in the DOE Discovery Request Data Object Contents (3rd DW) as: 15:8 DOE Discovery Version – must be 02h if the Capability Version in the Data Object Exchange Extended Capability is 02h or greater. Add support for the version on devices with the DOE v2 capability. Link: https://lore.kernel.org/r/20240307022006.3657433-1-aik@amd.com Signed-off-by: Alexey Kardashevskiy Signed-off-by: Bjorn Helgaas Reviewed-by: Kuppuswamy Sathyanarayanan (cherry picked from commit eebab7e3eb4bb906a8ebc3b70d28059ff1d9271c) Signed-off-by: Koba Ko --- drivers/pci/doe.c | 12 +++++++++--- include/uapi/linux/pci_regs.h | 1 + 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c index e3aab5edaf706..652d63df9d22e 100644 --- a/drivers/pci/doe.c +++ b/drivers/pci/doe.c @@ -383,11 +383,13 @@ static void pci_doe_task_complete(struct pci_doe_task *task) complete(task->private); } -static int pci_doe_discovery(struct pci_doe_mb *doe_mb, u8 *index, u16 *vid, +static int pci_doe_discovery(struct pci_doe_mb *doe_mb, u8 capver, u8 *index, u16 *vid, u8 *protocol) { u32 request_pl = FIELD_PREP(PCI_DOE_DATA_OBJECT_DISC_REQ_3_INDEX, - *index); + *index) | + FIELD_PREP(PCI_DOE_DATA_OBJECT_DISC_REQ_3_VER, + (capver >= 2) ? 2 : 0); __le32 request_pl_le = cpu_to_le32(request_pl); __le32 response_pl_le; u32 response_pl; @@ -421,13 +423,17 @@ static int pci_doe_cache_protocols(struct pci_doe_mb *doe_mb) { u8 index = 0; u8 xa_idx = 0; + u32 hdr = 0; + + pci_read_config_dword(doe_mb->pdev, doe_mb->cap_offset, &hdr); do { int rc; u16 vid; u8 prot; - rc = pci_doe_discovery(doe_mb, &index, &vid, &prot); + rc = pci_doe_discovery(doe_mb, PCI_EXT_CAP_VER(hdr), &index, + &vid, &prot); if (rc) return rc; diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index ade8dabf62108..f8e5543e0b4f6 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -1131,6 +1131,7 @@ #define PCI_DOE_DATA_OBJECT_HEADER_2_LENGTH 0x0003ffff #define PCI_DOE_DATA_OBJECT_DISC_REQ_3_INDEX 0x000000ff +#define PCI_DOE_DATA_OBJECT_DISC_REQ_3_VER 0x0000ff00 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_VID 0x0000ffff #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL 0x00ff0000 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX 0xff000000 From 3fb7987283d9028409bb47699188d660a9cbb02d Mon Sep 17 00:00:00 2001 From: Jie Zhan Date: Sun, 29 Sep 2024 11:32:14 +0800 Subject: [PATCH 024/548] cppc_cpufreq: Remove HiSilicon CPPC workaround Since commit 6c8d750f9784 ("cpufreq / cppc: Work around for Hisilicon CPPC cpufreq"), we introduce a workround for HiSilicon platforms that do not support performance feedback counters, whereas they can get the actual frequency from the desired perf register. Later on, FIE is disabled in that workaround as well. Now the workround can be handled by the common code. Desired perf would be read and converted to frequency if feedback counters don't change. FIE would be disabled if the CPPC regs are in PCC region. Hence, the workaround is no longer needed and can be safely removed, in an effort to consolidate the driver procedure. Signed-off-by: Jie Zhan Reviewed-by: Xiongfeng Wang Reviewed-by: Huisong Li [ Viresh: Move fie_disabled withing CONFIG option to fix warning ] Signed-off-by: Viresh Kumar (cherry picked from commit ea1829d4d413bc38774703acfc266472f7bc0bb5) Signed-off-by: Koba Ko --- drivers/cpufreq/cppc_cpufreq.c | 73 +--------------------------------- 1 file changed, 1 insertion(+), 72 deletions(-) diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index aa34af940cb53..5f0660e9bcca5 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -36,33 +36,15 @@ static LIST_HEAD(cpu_data_list); static bool boost_supported; -struct cppc_workaround_oem_info { - char oem_id[ACPI_OEM_ID_SIZE + 1]; - char oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1]; - u32 oem_revision; -}; - -static struct cppc_workaround_oem_info wa_info[] = { - { - .oem_id = "HISI ", - .oem_table_id = "HIP07 ", - .oem_revision = 0, - }, { - .oem_id = "HISI ", - .oem_table_id = "HIP08 ", - .oem_revision = 0, - } -}; - static struct cpufreq_driver cppc_cpufreq_driver; +#ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE static enum { FIE_UNSET = -1, FIE_ENABLED, FIE_DISABLED } fie_disabled = FIE_UNSET; -#ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE module_param(fie_disabled, int, 0444); MODULE_PARM_DESC(fie_disabled, "Disable Frequency Invariance Engine (FIE)"); @@ -78,7 +60,6 @@ struct cppc_freq_invariance { static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv); static struct kthread_worker *kworker_fie; -static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu); static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data, struct cppc_perf_fb_ctrs *fb_ctrs_t0, struct cppc_perf_fb_ctrs *fb_ctrs_t1); @@ -859,57 +840,6 @@ static struct cpufreq_driver cppc_cpufreq_driver = { .name = "cppc_cpufreq", }; -/* - * HISI platform does not support delivered performance counter and - * reference performance counter. It can calculate the performance using the - * platform specific mechanism. We reuse the desired performance register to - * store the real performance calculated by the platform. - */ -static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu) -{ - struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); - struct cppc_cpudata *cpu_data; - u64 desired_perf; - int ret; - - if (!policy) - return -ENODEV; - - cpu_data = policy->driver_data; - - cpufreq_cpu_put(policy); - - ret = cppc_get_desired_perf(cpu, &desired_perf); - if (ret < 0) - return -EIO; - - return cppc_perf_to_khz(&cpu_data->perf_caps, desired_perf); -} - -static void cppc_check_hisi_workaround(void) -{ - struct acpi_table_header *tbl; - acpi_status status = AE_OK; - int i; - - status = acpi_get_table(ACPI_SIG_PCCT, 0, &tbl); - if (ACPI_FAILURE(status) || !tbl) - return; - - for (i = 0; i < ARRAY_SIZE(wa_info); i++) { - if (!memcmp(wa_info[i].oem_id, tbl->oem_id, ACPI_OEM_ID_SIZE) && - !memcmp(wa_info[i].oem_table_id, tbl->oem_table_id, ACPI_OEM_TABLE_ID_SIZE) && - wa_info[i].oem_revision == tbl->oem_revision) { - /* Overwrite the get() callback */ - cppc_cpufreq_driver.get = hisi_cppc_cpufreq_get_rate; - fie_disabled = FIE_DISABLED; - break; - } - } - - acpi_put_table(tbl); -} - static int __init cppc_cpufreq_init(void) { int ret; @@ -917,7 +847,6 @@ static int __init cppc_cpufreq_init(void) if (!acpi_cpc_valid()) return -ENODEV; - cppc_check_hisi_workaround(); cppc_freq_invariance_init(); populate_efficiency_class(); From 1f369ab7e6d59a850ec4872df82b373ab2d50128 Mon Sep 17 00:00:00 2001 From: Zeng Jingxiang Date: Sat, 26 Oct 2024 19:57:14 +0800 Subject: [PATCH 025/548] mm/vmscan: wake up flushers conditionally to avoid cgroup OOM Commit 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle") removed the opportunity to wake up flushers during the MGLRU page reclamation process can lead to an increased likelihood of triggering OOM when encountering many dirty pages during reclamation on MGLRU. This leads to premature OOM if there are too many dirty pages in cgroup: Killed dd invoked oom-killer: gfp_mask=0x101cca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE), order=0, oom_score_adj=0 Call Trace: dump_stack_lvl+0x5f/0x80 dump_stack+0x14/0x20 dump_header+0x46/0x1b0 oom_kill_process+0x104/0x220 out_of_memory+0x112/0x5a0 mem_cgroup_out_of_memory+0x13b/0x150 try_charge_memcg+0x44f/0x5c0 charge_memcg+0x34/0x50 __mem_cgroup_charge+0x31/0x90 filemap_add_folio+0x4b/0xf0 __filemap_get_folio+0x1a4/0x5b0 ? srso_return_thunk+0x5/0x5f ? __block_commit_write+0x82/0xb0 ext4_da_write_begin+0xe5/0x270 generic_perform_write+0x134/0x2b0 ext4_buffered_write_iter+0x57/0xd0 ext4_file_write_iter+0x76/0x7d0 ? selinux_file_permission+0x119/0x150 ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f vfs_write+0x30c/0x440 ksys_write+0x65/0xe0 __x64_sys_write+0x1e/0x30 x64_sys_call+0x11c2/0x1d50 do_syscall_64+0x47/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e memory: usage 308224kB, limit 308224kB, failcnt 2589 swap: usage 0kB, limit 9007199254740988kB, failcnt 0 ... file_dirty 303247360 file_writeback 0 ... oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=test, mems_allowed=0,oom_memcg=/test,task_memcg=/test,task=dd,pid=4404,uid=0 Memory cgroup out of memory: Killed process 4404 (dd) total-vm:10512kB, anon-rss:1152kB, file-rss:1824kB, shmem-rss:0kB, UID:0 pgtables:76kB oom_score_adj:0 The flusher wake up was removed to decrease SSD wearing, but if we are seeing all dirty folios at the tail of an LRU, not waking up the flusher could lead to thrashing easily. So wake it up when a memcg is about to OOM due to dirty caches. I did run the build kernel test[1] on V6, with -j16 1G memcg on my local branch: Without the patch(10 times): user 1449.394 system 368.78 372.58 363.03 362.31 360.84 372.70 368.72 364.94 373.51 366.58 (avg 367.399) real 164.883 With the V6 patch(10 times): user 1447.525 system 360.87 360.63 372.39 364.09 368.49 365.15 359.93 362.04 359.72 354.60 (avg 362.79) real 164.514 Test results show that this patch has about 1% performance improvement, which should be caused by noise. Link: https://lkml.kernel.org/r/20241026115714.1437435-1-jingxiangzeng.cas@gmail.com Link: https://lore.kernel.org/all/CACePvbV4L-gRN9UKKuUnksfVJjOTq_5Sti2-e=pb_w51kucLKQ@mail.gmail.com/ [1] Fixes: 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle") Suggested-by: Wei Xu Signed-off-by: Zeng Jingxiang Signed-off-by: Kairui Song Reviewed-by: Wei Xu Tested-by: Chris Li Cc: T.J. Mercier Cc: Yu Zhao Signed-off-by: Andrew Morton (cherry picked from commit 1bc542c6a0d1444559ab75823a89a94d244bf933) Signed-off-by: Koba Ko --- mm/vmscan.c | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 258f5472f1e90..ce174efb95039 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4910,6 +4910,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c int tier_idx) { bool success; + bool dirty, writeback; int gen = folio_lru_gen(folio); int type = folio_is_file_lru(folio); int zone = folio_zonenum(folio); @@ -4964,9 +4965,17 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c return true; } + dirty = folio_test_dirty(folio); + writeback = folio_test_writeback(folio); + if (type == LRU_GEN_FILE && dirty) { + sc->nr.file_taken += delta; + if (!writeback) + sc->nr.unqueued_dirty += delta; + } + /* waiting for writeback */ - if (folio_test_locked(folio) || folio_test_writeback(folio) || - (type == LRU_GEN_FILE && folio_test_dirty(folio))) { + if (folio_test_locked(folio) || writeback || + (type == LRU_GEN_FILE && dirty)) { gen = folio_inc_gen(lruvec, folio, true); list_move(&folio->lru, &lrugen->folios[gen][type][zone]); return true; @@ -5078,6 +5087,8 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc, __count_memcg_events(memcg, PGREFILL, sorted); __count_vm_events(PGSCAN_ANON + type, isolated); + if (type == LRU_GEN_FILE) + sc->nr.file_taken += isolated; /* * There might not be eligible folios due to reclaim_idx. Check the * remaining to prevent livelock if it's not making progress. @@ -5206,6 +5217,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap return scanned; retry: reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false); + sc->nr.unqueued_dirty += stat.nr_unqueued_dirty; sc->nr_reclaimed += reclaimed; list_for_each_entry_safe_reverse(folio, next, &list, lru) { @@ -5420,6 +5432,13 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) cond_resched(); } + /* + * If too many file cache in the coldest generation can't be evicted + * due to being dirty, wake up the flusher. + */ + if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty == sc->nr.file_taken) + wakeup_flusher_threads(WB_REASON_VMSCAN); + /* whether this lruvec should be rotated */ return nr_to_scan < 0; } @@ -6541,6 +6560,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc) bool reclaimable = false; if (lru_gen_enabled() && root_reclaim(sc)) { + memset(&sc->nr, 0, sizeof(sc->nr)); lru_gen_shrink_node(pgdat, sc); return; } From ca9f98ca83a0b336fef31f1bc92d473e367ed01d Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Wed, 31 Jan 2024 00:30:21 -0800 Subject: [PATCH 026/548] ACPI/HMAT: Move HMAT messages to pr_debug() The HMAT messages printed at boot, beyond being noisy, can also print details for nodes that are not yet enabled. The primary method to consume HMAT details is via sysfs, and the sysfs interface gates what is emitted by whether the node is online or not. Hide the messages by default by moving them from "info" to "debug" log level. Otherwise, these prints are just a pretty-print way to dump the ACPI HMAT table. It has always been the case that post-analysis was required for these messages to map proximity-domains to Linux NUMA nodes, and as Priya points out that analysis also needs to consider whether the proximity domain is marked "enabled" in the SRAT. Reported-by: Priya Autee Signed-off-by: Dan Williams Acked-by: Rafael J. Wysocki Link: https://patch.msgid.link/170668982094.318782.2963631284830500182.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang (cherry picked from commit c8e88de1b44e58cacdef41ea9aaa78fca35f1357) Signed-off-by: Koba Ko --- drivers/acpi/numa/hmat.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index bba268ecd802f..b8f9ab21d138d 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -318,9 +318,9 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header, return -EINVAL; } - pr_info("Locality: Flags:%02x Type:%s Initiator Domains:%u Target Domains:%u Base:%lld\n", - hmat_loc->flags, hmat_data_type(type), ipds, tpds, - hmat_loc->entry_base_unit); + pr_debug("Locality: Flags:%02x Type:%s Initiator Domains:%u Target Domains:%u Base:%lld\n", + hmat_loc->flags, hmat_data_type(type), ipds, tpds, + hmat_loc->entry_base_unit); inits = (u32 *)(hmat_loc + 1); targs = inits + ipds; @@ -331,9 +331,9 @@ static __init int hmat_parse_locality(union acpi_subtable_headers *header, value = hmat_normalize(entries[init * tpds + targ], hmat_loc->entry_base_unit, type); - pr_info(" Initiator-Target[%u-%u]:%u%s\n", - inits[init], targs[targ], value, - hmat_data_type_suffix(type)); + pr_debug(" Initiator-Target[%u-%u]:%u%s\n", + inits[init], targs[targ], value, + hmat_data_type_suffix(type)); if (mem_hier == ACPI_HMAT_MEMORY) { target = find_mem_target(targs[targ]); @@ -368,9 +368,9 @@ static __init int hmat_parse_cache(union acpi_subtable_headers *header, } attrs = cache->cache_attributes; - pr_info("Cache: Domain:%u Size:%llu Attrs:%08x SMBIOS Handles:%d\n", - cache->memory_PD, cache->cache_size, attrs, - cache->number_of_SMBIOShandles); + pr_debug("Cache: Domain:%u Size:%llu Attrs:%08x SMBIOS Handles:%d\n", + cache->memory_PD, cache->cache_size, attrs, + cache->number_of_SMBIOShandles); target = find_mem_target(cache->memory_PD); if (!target) @@ -429,9 +429,9 @@ static int __init hmat_parse_proximity_domain(union acpi_subtable_headers *heade } if (hmat_revision == 1) - pr_info("Memory (%#llx length %#llx) Flags:%04x Processor Domain:%u Memory Domain:%u\n", - p->reserved3, p->reserved4, p->flags, p->processor_PD, - p->memory_PD); + pr_debug("Memory (%#llx length %#llx) Flags:%04x Processor Domain:%u Memory Domain:%u\n", + p->reserved3, p->reserved4, p->flags, p->processor_PD, + p->memory_PD); else pr_info("Memory Flags:%04x Processor Domain:%u Memory Domain:%u\n", p->flags, p->processor_PD, p->memory_PD); From 26bbb7e6b63dac42cecf58138cba56ceab7eb5d3 Mon Sep 17 00:00:00 2001 From: Tushar Dave Date: Thu, 6 Feb 2025 19:03:38 -0800 Subject: [PATCH 027/548] PCI/ACS: Fix 'pci=config_acs=' parameter Commit 47c8846a49ba ("PCI: Extend ACS configurability") introduced bugs that fail to configure ACS ctrl to the value specified by the kernel parameter. Essentially there are two bugs: 1) When ACS is configured for multiple PCI devices using 'config_acs' kernel parameter, it results into error "PCI: Can't parse ACS command line parameter". This is due to a bug that doesn't preserve the ACS mask, but instead overwrites the mask with value 0. For example, using 'config_acs' to configure ACS ctrl for multiple BDFs fails: Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" PCI: Can't parse ACS command line parameter pci 0020:02:00.0: ACS mask = 0x007f pci 0020:02:00.0: ACS flags = 0x007b pci 0020:02:00.0: Configured ACS to 0x007b After this fix: Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" pci 0020:02:00.0: ACS mask = 0x007f pci 0020:02:00.0: ACS flags = 0x007b pci 0020:02:00.0: ACS control = 0x005f pci 0020:02:00.0: ACS fw_ctrl = 0x0053 pci 0020:02:00.0: Configured ACS to 0x007b pci 0039:00:00.0: ACS mask = 0x0070 pci 0039:00:00.0: ACS flags = 0x0050 pci 0039:00:00.0: ACS control = 0x001d pci 0039:00:00.0: ACS fw_ctrl = 0x0000 pci 0039:00:00.0: Configured ACS to 0x0050 2) In the bit manipulation logic, we copy the bit from the firmware settings when mask bit 0. For example, 'disable_acs_redir' fails to clear all three ACS P2P redir bits due to the wrong bit fiddling: Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" pci 0020:02:00.0: ACS mask = 0x002c pci 0020:02:00.0: ACS flags = 0xffd3 pci 0020:02:00.0: Configured ACS to 0xfffb pci 0030:02:00.0: ACS mask = 0x002c pci 0030:02:00.0: ACS flags = 0xffd3 pci 0030:02:00.0: Configured ACS to 0xffdf pci 0039:00:00.0: ACS mask = 0x002c pci 0039:00:00.0: ACS flags = 0xffd3 pci 0039:00:00.0: Configured ACS to 0xffd3 After this fix: Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p" pci 0020:02:00.0: ACS mask = 0x002c pci 0020:02:00.0: ACS flags = 0xffd3 pci 0020:02:00.0: ACS control = 0x007f pci 0020:02:00.0: ACS fw_ctrl = 0x007b pci 0020:02:00.0: Configured ACS to 0x0053 pci 0030:02:00.0: ACS mask = 0x002c pci 0030:02:00.0: ACS flags = 0xffd3 pci 0030:02:00.0: ACS control = 0x005f pci 0030:02:00.0: ACS fw_ctrl = 0x005f pci 0030:02:00.0: Configured ACS to 0x0053 pci 0039:00:00.0: ACS mask = 0x002c pci 0039:00:00.0: ACS flags = 0xffd3 pci 0039:00:00.0: ACS control = 0x001d pci 0039:00:00.0: ACS fw_ctrl = 0x0000 pci 0039:00:00.0: Configured ACS to 0x0000 Link: https://lore.kernel.org/r/20250207030338.456887-1-tdave@nvidia.com Fixes: 47c8846a49ba ("PCI: Extend ACS configurability") Signed-off-by: Tushar Dave Signed-off-by: Bjorn Helgaas Reviewed-by: Jason Gunthorpe Reviewed-by: Kuppuswamy Sathyanarayanan (cherry picked from commit 9cf8a952d57b422d3ff8a9a0163f8adf694f4b2b) Signed-off-by: Koba Ko --- drivers/pci/pci.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 3762ccf379e68..3ef2189213237 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -896,8 +896,10 @@ struct pci_acs { }; static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps, - const char *p, u16 mask, u16 flags) + const char *p, const u16 acs_mask, const u16 acs_flags) { + u16 flags = acs_flags; + u16 mask = acs_mask; char *delimit; int ret = 0; @@ -905,7 +907,7 @@ static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps, return; while (*p) { - if (!mask) { + if (!acs_mask) { /* Check for ACS flags */ delimit = strstr(p, "@"); if (delimit) { @@ -913,6 +915,8 @@ static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps, u32 shift = 0; end = delimit - p - 1; + mask = 0; + flags = 0; while (end > -1) { if (*(p + end) == '0') { @@ -969,10 +973,14 @@ static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps, pci_dbg(dev, "ACS mask = %#06x\n", mask); pci_dbg(dev, "ACS flags = %#06x\n", flags); + pci_dbg(dev, "ACS control = %#06x\n", caps->ctrl); + pci_dbg(dev, "ACS fw_ctrl = %#06x\n", caps->fw_ctrl); - /* If mask is 0 then we copy the bit from the firmware setting. */ - caps->ctrl = (caps->ctrl & ~mask) | (caps->fw_ctrl & mask); - caps->ctrl |= flags; + /* + * For mask bits that are 0, copy them from the firmware setting + * and apply flags for all the mask bits that are 1. + */ + caps->ctrl = (caps->fw_ctrl & ~mask) | (flags & mask); pci_info(dev, "Configured ACS to %#06x\n", caps->ctrl); } From 963f6f9148f8dce01c8d1ef456f2b2dd99508afc Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Mon, 11 Sep 2023 12:38:48 +0300 Subject: [PATCH 028/548] net/mlx5: Introduce ifc bits for migration in a chunk mode Introduce ifc related stuff to enable migration in a chunk mode. Signed-off-by: Yishai Hadas Link: https://lore.kernel.org/r/20230911093856.81910-2-yishaih@nvidia.com Reviewed-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky (cherry picked from commit 5aa4c9608d2d5fea29e211a80c29696f7d94e9f7) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 9106771bb92f0..714272d84b049 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1949,7 +1949,9 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_c0[0x8]; u8 migration_multi_load[0x1]; u8 migration_tracking_state[0x1]; - u8 reserved_at_ca[0x16]; + u8 reserved_at_ca[0x6]; + u8 migration_in_chunks[0x1]; + u8 reserved_at_d1[0xf]; u8 reserved_at_e0[0xc0]; @@ -12409,7 +12411,8 @@ struct mlx5_ifc_query_vhca_migration_state_in_bits { u8 op_mod[0x10]; u8 incremental[0x1]; - u8 reserved_at_41[0xf]; + u8 chunk[0x1]; + u8 reserved_at_42[0xe]; u8 vhca_id[0x10]; u8 reserved_at_60[0x20]; @@ -12425,7 +12428,11 @@ struct mlx5_ifc_query_vhca_migration_state_out_bits { u8 required_umem_size[0x20]; - u8 reserved_at_a0[0x160]; + u8 reserved_at_a0[0x20]; + + u8 remaining_total_size[0x40]; + + u8 reserved_at_100[0x100]; }; struct mlx5_ifc_save_vhca_state_in_bits { @@ -12457,7 +12464,7 @@ struct mlx5_ifc_save_vhca_state_out_bits { u8 actual_image_size[0x20]; - u8 reserved_at_60[0x20]; + u8 next_required_umem_size[0x20]; }; struct mlx5_ifc_load_vhca_state_in_bits { From d8e7f94b346a10ff04e6f1da9259d53b58c326d1 Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Wed, 20 Sep 2023 13:07:41 +0300 Subject: [PATCH 029/548] IB/mlx5: Expose XDR speed through MAD Under MAD query port, Report NDR speed when NDR is supported in the port capability mask. Signed-off-by: Or Har-Toov Reviewed-by: Mark Zhang Link: https://lore.kernel.org/r/d30bdec2a66a8a2edd1d84ee61453c58cf346b43.1695204156.git.leon@kernel.org Reviewed-by: Jacob Keller Signed-off-by: Leon Romanovsky (cherry picked from commit 561b4a3ac65597c5eca62c91260240a47c88b3da) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/mad.c | 13 +++++++++++++ include/rdma/ib_mad.h | 2 ++ 2 files changed, 15 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c index 64dae68c43e62..3e43687a7f6f7 100644 --- a/drivers/infiniband/hw/mlx5/mad.c +++ b/drivers/infiniband/hw/mlx5/mad.c @@ -620,6 +620,19 @@ int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u32 port, } } + /* Check if extended speeds 2 (XDR/...) are supported */ + if (props->port_cap_flags & IB_PORT_CAP_MASK2_SUP && + props->port_cap_flags2 & IB_PORT_EXTENDED_SPEEDS2_SUP) { + ext_active_speed = (out_mad->data[56] >> 4) & 0x6; + + switch (ext_active_speed) { + case 2: + if (props->port_cap_flags2 & IB_PORT_LINK_SPEED_XDR_SUP) + props->active_speed = IB_SPEED_XDR; + break; + } + } + /* If reported active speed is QDR, check if is FDR-10 */ if (props->active_speed == 4) { if (dev->port_caps[port - 1].ext_port_cap & diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index 2e3843b761e89..3f1b58d8b4bf4 100644 --- a/include/rdma/ib_mad.h +++ b/include/rdma/ib_mad.h @@ -277,6 +277,8 @@ enum ib_port_capability_mask2_bits { IB_PORT_LINK_WIDTH_2X_SUP = 1 << 4, IB_PORT_LINK_SPEED_HDR_SUP = 1 << 5, IB_PORT_LINK_SPEED_NDR_SUP = 1 << 10, + IB_PORT_EXTENDED_SPEEDS2_SUP = 1 << 11, + IB_PORT_LINK_SPEED_XDR_SUP = 1 << 12, }; #define OPA_CLASS_PORT_INFO_PR_SUPPORT BIT(26) From bf060e7ab09e3edc6a403efb4e3fbd07d328e5e9 Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Wed, 20 Sep 2023 13:07:42 +0300 Subject: [PATCH 030/548] IB/mlx5: Add support for 800G_8X lane speed Add a check for 800G_8X speed when querying PTYS and report it back correctly when needed. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Zhang Link: https://lore.kernel.org/r/26fd0b6e1fac071c3eb779657bb3d8ba47f47c4f.1695204156.git.leon@kernel.org Reviewed-by: Jacob Keller Signed-off-by: Leon Romanovsky (cherry picked from commit 948f0bf5ad6ac1e5f19f6aa8e7da4a950d66b661) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 4 ++++ drivers/net/ethernet/mellanox/mlx5/core/port.c | 1 + include/linux/mlx5/port.h | 1 + 3 files changed, 6 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index e922fb8728654..19cc6b67e89c9 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -452,6 +452,10 @@ static int translate_eth_ext_proto_oper(u32 eth_proto_oper, u16 *active_speed, *active_width = IB_WIDTH_4X; *active_speed = IB_SPEED_NDR; break; + case MLX5E_PROT_MASK(MLX5E_800GAUI_8_800GBASE_CR8_KR8): + *active_width = IB_WIDTH_8X; + *active_speed = IB_SPEED_NDR; + break; default: return -EINVAL; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index 749f0fc2c189a..7d8c732818f20 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -1102,6 +1102,7 @@ static const u32 mlx5e_ext_link_speed[MLX5E_EXT_LINK_MODES_NUMBER] = { [MLX5E_100GAUI_1_100GBASE_CR_KR] = 100000, [MLX5E_200GAUI_2_200GBASE_CR2_KR2] = 200000, [MLX5E_400GAUI_4_400GBASE_CR4_KR4] = 400000, + [MLX5E_800GAUI_8_800GBASE_CR8_KR8] = 800000, }; int mlx5_port_query_eth_proto(struct mlx5_core_dev *dev, u8 port, bool ext, diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h index 5cc34216f23c3..26092c78a9850 100644 --- a/include/linux/mlx5/port.h +++ b/include/linux/mlx5/port.h @@ -117,6 +117,7 @@ enum mlx5e_ext_link_mode { MLX5E_200GAUI_2_200GBASE_CR2_KR2 = 13, MLX5E_400GAUI_8_400GBASE_CR8 = 15, MLX5E_400GAUI_4_400GBASE_CR4_KR4 = 16, + MLX5E_800GAUI_8_800GBASE_CR8_KR8 = 19, MLX5E_EXT_LINK_MODES_NUMBER, }; From 77ad7a80ad381eec19d2f664a66e32b531339e5e Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Wed, 20 Sep 2023 13:07:44 +0300 Subject: [PATCH 031/548] IB/mlx5: Adjust mlx5 rate mapping to support 800Gb Adjust mlx5 function which maps the speed rate from IB spec values to internal driver values to be able to handle speeds up to 800Gb. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Zhang Link: https://lore.kernel.org/r/301c803d8486b0df8aefad3fb3cc10dc58671985.1695204156.git.leon@kernel.org Reviewed-by: Jacob Keller Signed-off-by: Leon Romanovsky (cherry picked from commit 4f4db190893fb8dee3761b4c5a704ffa5713b73c) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/qp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index 0a9ae84600b20..de0895335872f 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -3440,7 +3440,7 @@ int mlx5r_ib_rate(struct mlx5_ib_dev *dev, u8 rate) if (rate == IB_RATE_PORT_CURRENT || rate == IB_RATE_800_GBPS) return 0; - if (rate < IB_RATE_2_5_GBPS || rate > IB_RATE_600_GBPS) + if (rate < IB_RATE_2_5_GBPS || rate > IB_RATE_800_GBPS) return -EINVAL; stat_rate_support = MLX5_CAP_GEN(dev->mdev, stat_rate_support); From 3860b67004384ab186e5967ab274ad9499a46bbf Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Wed, 20 Sep 2023 13:07:45 +0300 Subject: [PATCH 032/548] RDMA/ipoib: Add support for XDR speed in ethtool The IBTA specification 1.7 has new speed - XDR, supporting signaling rate of 200Gb. Ethtool support of IPoIB driver translates IB speed to signaling rate. Added translation of XDR IB type to rate of 200Gb Ethernet speed. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Zhang Link: https://lore.kernel.org/r/ca252b79b7114af967de3d65f9a38992d4d87a14.1695204156.git.leon@kernel.org Reviewed-by: Jacob Keller Signed-off-by: Leon Romanovsky (cherry picked from commit 8dc0fd2f5693ab71f84fb16339c483999a909ab0) Signed-off-by: Koba Ko --- drivers/infiniband/ulp/ipoib/ipoib_ethtool.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c index 8af99b18d3618..7da94fb8d7fa2 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c @@ -174,6 +174,8 @@ static inline int ib_speed_enum_to_int(int speed) return SPEED_50000; case IB_SPEED_NDR: return SPEED_100000; + case IB_SPEED_XDR: + return SPEED_200000; } return SPEED_UNKNOWN; From 70cca8f7634703cceb6a26f4dfe3e1771062b31d Mon Sep 17 00:00:00 2001 From: Shay Drory Date: Sun, 12 May 2024 15:43:03 +0300 Subject: [PATCH 033/548] net/mlx5: Enable 8 ports LAG This patch adds to mlx5 drivers support for 8 ports HCAs. Starting with ConnectX-8 HCAs with 8 ports are possible. As most driver parts aren't affected by such configuration most driver code is unchanged. Specially the only affected areas are: - Lag - Multiport E-Switch - Single FDB E-Switch All of the above are already factored in generic way, and LAG and VF LAG are tested, so all that left is to change a #define and remove checks which are no longer needed. However, Multiport E-Switch is not tested yet, so it is left untouched. This patch will allow to create hardware LAG/VF LAG when all 8 ports are added to the same bond device. for example, In order to activate the hardware lag a user can execute the following: ip link add bond0 type bond ip link set bond0 type bond miimon 100 mode 2 ip link set eth2 master bond0 ip link set eth3 master bond0 ip link set eth4 master bond0 ip link set eth5 master bond0 ip link set eth6 master bond0 ip link set eth7 master bond0 ip link set eth8 master bond0 ip link set eth9 master bond0 Where eth2, eth3, eth4, eth5, eth6, eth7, eth8 and eth9 are the PFs of the same HCA. Signed-off-by: Shay Drory Reviewed-by: Mark Bloch Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240512124306.740898-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit e0e6adfe8c20f1b633017e4dafec1b06117da2df) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 3 --- include/linux/mlx5/driver.h | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c index f6b1ac80c0af9..4e4226555ec7d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c @@ -713,7 +713,6 @@ int mlx5_deactivate_lag(struct mlx5_lag *ldev) return 0; } -#define MLX5_LAG_OFFLOADS_SUPPORTED_PORTS 4 bool mlx5_lag_check_prereq(struct mlx5_lag *ldev) { #ifdef CONFIG_MLX5_ESWITCH @@ -740,8 +739,6 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev) if (mlx5_eswitch_mode(ldev->pf[i].dev) != mode) return false; - if (mode == MLX5_ESWITCH_OFFLOADS && ldev->ports > MLX5_LAG_OFFLOADS_SUPPORTED_PORTS) - return false; #else for (i = 0; i < ldev->ports; i++) if (mlx5_sriov_is_enabled(ldev->pf[i].dev)) diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 696a2227869fb..4f042312b23a7 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -85,7 +85,7 @@ enum mlx5_sqp_t { }; enum { - MLX5_MAX_PORTS = 4, + MLX5_MAX_PORTS = 8, }; enum { From 0f144f1c57ba5b818f2f953561759b31e67296c2 Mon Sep 17 00:00:00 2001 From: Carolina Jubran Date: Sun, 12 May 2024 15:43:04 +0300 Subject: [PATCH 034/548] net/mlx5e: Modifying channels number and updating TX queues It is not appropriate for the mlx5e_num_channels_changed function to be called solely for updating the TX queues, even if the channels number has not been changed. Move the code responsible for updating the TC and TX queues from mlx5e_num_channels_changed and produce a new function called mlx5e_update_tc_and_tx_queues. This new function should only be called when the channels number remains unchanged. Signed-off-by: Carolina Jubran Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240512124306.740898-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit bcee093751f87dff6ace6355da06e62fd04189a3) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + .../ethernet/mellanox/mlx5/core/en_ethtool.c | 2 +- .../net/ethernet/mellanox/mlx5/core/en_main.c | 92 +++++++++---------- 3 files changed, 47 insertions(+), 48 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 9cf33ae48c216..87f95528a4e63 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -1064,6 +1064,7 @@ int mlx5e_safe_switch_params(struct mlx5e_priv *priv, void *context, bool reset); int mlx5e_update_tx_netdev_queues(struct mlx5e_priv *priv); int mlx5e_num_channels_changed_ctx(struct mlx5e_priv *priv, void *context); +int mlx5e_update_tc_and_tx_queues_ctx(struct mlx5e_priv *priv, void *context); void mlx5e_activate_priv_channels(struct mlx5e_priv *priv); void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv); int mlx5e_ptp_rx_manage_fs_ctx(struct mlx5e_priv *priv, void *ctx); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 54379297a7489..387988fec6f18 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -2114,7 +2114,7 @@ static int set_pflag_tx_port_ts(struct net_device *netdev, bool enable) */ err = mlx5e_safe_switch_params(priv, &new_params, - mlx5e_num_channels_changed_ctx, NULL, true); + mlx5e_update_tc_and_tx_queues_ctx, NULL, true); if (!err) priv->tx_ptp_opened = true; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 5c6f01abdcb91..b95a6b4fab4d1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -2804,7 +2804,28 @@ int mlx5e_update_tx_netdev_queues(struct mlx5e_priv *priv) return err; } -static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv) +static void mlx5e_set_default_xps_cpumasks(struct mlx5e_priv *priv, + struct mlx5e_params *params) +{ + struct mlx5_core_dev *mdev = priv->mdev; + int num_comp_vectors, ix, irq; + + num_comp_vectors = mlx5_comp_vectors_max(mdev); + + for (ix = 0; ix < params->num_channels; ix++) { + cpumask_clear(priv->scratchpad.cpumask); + + for (irq = ix; irq < num_comp_vectors; irq += params->num_channels) { + int cpu = mlx5_comp_vector_get_cpu(mdev, irq); + + cpumask_set_cpu(cpu, priv->scratchpad.cpumask); + } + + netif_set_xps_queue(priv->netdev, priv->scratchpad.cpumask, ix); + } +} + +static int mlx5e_update_tc_and_tx_queues(struct mlx5e_priv *priv) { struct netdev_tc_txq old_tc_to_txq[TC_MAX_QUEUE], *tc_to_txq; struct net_device *netdev = priv->netdev; @@ -2828,22 +2849,10 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv) err = mlx5e_update_tx_netdev_queues(priv); if (err) goto err_tcs; - err = netif_set_real_num_rx_queues(netdev, nch); - if (err) { - netdev_warn(netdev, "netif_set_real_num_rx_queues failed, %d\n", err); - goto err_txqs; - } + mlx5e_set_default_xps_cpumasks(priv, &priv->channels.params); return 0; -err_txqs: - /* netif_set_real_num_rx_queues could fail only when nch increased. Only - * one of nch and ntc is changed in this function. That means, the call - * to netif_set_real_num_tx_queues below should not fail, because it - * decreases the number of TX queues. - */ - WARN_ON_ONCE(netif_set_real_num_tx_queues(netdev, old_num_txqs)); - err_tcs: WARN_ON_ONCE(mlx5e_netdev_set_tcs(netdev, old_num_txqs / old_ntc, old_ntc, old_tc_to_txq)); @@ -2851,39 +2860,32 @@ static int mlx5e_update_netdev_queues(struct mlx5e_priv *priv) return err; } -static MLX5E_DEFINE_PREACTIVATE_WRAPPER_CTX(mlx5e_update_netdev_queues); - -static void mlx5e_set_default_xps_cpumasks(struct mlx5e_priv *priv, - struct mlx5e_params *params) -{ - struct mlx5_core_dev *mdev = priv->mdev; - int num_comp_vectors, ix, irq; - - num_comp_vectors = mlx5_comp_vectors_max(mdev); - - for (ix = 0; ix < params->num_channels; ix++) { - cpumask_clear(priv->scratchpad.cpumask); - - for (irq = ix; irq < num_comp_vectors; irq += params->num_channels) { - int cpu = mlx5_comp_vector_get_cpu(mdev, irq); - - cpumask_set_cpu(cpu, priv->scratchpad.cpumask); - } - - netif_set_xps_queue(priv->netdev, priv->scratchpad.cpumask, ix); - } -} +MLX5E_DEFINE_PREACTIVATE_WRAPPER_CTX(mlx5e_update_tc_and_tx_queues); static int mlx5e_num_channels_changed(struct mlx5e_priv *priv) { u16 count = priv->channels.params.num_channels; + struct net_device *netdev = priv->netdev; + int old_num_rxqs; int err; - err = mlx5e_update_netdev_queues(priv); - if (err) + old_num_rxqs = netdev->real_num_rx_queues; + err = netif_set_real_num_rx_queues(netdev, count); + if (err) { + netdev_warn(netdev, "%s: netif_set_real_num_rx_queues failed, %d\n", + __func__, err); return err; - - mlx5e_set_default_xps_cpumasks(priv, &priv->channels.params); + } + err = mlx5e_update_tc_and_tx_queues(priv); + if (err) { + /* mlx5e_update_tc_and_tx_queues can fail if channels or TCs number increases. + * Since channel number changed, it increased. That means, the call to + * netif_set_real_num_rx_queues below should not fail, because it + * decreases the number of RX queues. + */ + WARN_ON_ONCE(netif_set_real_num_rx_queues(netdev, old_num_rxqs)); + return err; + } /* This function may be called on attach, before priv->rx_res is created. */ if (!netif_is_rxfh_configured(priv->netdev) && priv->rx_res) @@ -3482,7 +3484,7 @@ static int mlx5e_setup_tc_mqprio_dcb(struct mlx5e_priv *priv, mlx5e_params_mqprio_dcb_set(&new_params, tc ? tc : 1); err = mlx5e_safe_switch_params(priv, &new_params, - mlx5e_num_channels_changed_ctx, NULL, true); + mlx5e_update_tc_and_tx_queues_ctx, NULL, true); if (!err && priv->mqprio_rl) { mlx5e_mqprio_rl_cleanup(priv->mqprio_rl); @@ -3583,10 +3585,8 @@ static struct mlx5e_mqprio_rl *mlx5e_mqprio_rl_create(struct mlx5_core_dev *mdev static int mlx5e_setup_tc_mqprio_channel(struct mlx5e_priv *priv, struct tc_mqprio_qopt_offload *mqprio) { - mlx5e_fp_preactivate preactivate; struct mlx5e_params new_params; struct mlx5e_mqprio_rl *rl; - bool nch_changed; int err; err = mlx5e_mqprio_channel_validate(priv, mqprio); @@ -3600,10 +3600,8 @@ static int mlx5e_setup_tc_mqprio_channel(struct mlx5e_priv *priv, new_params = priv->channels.params; mlx5e_params_mqprio_channel_set(&new_params, mqprio, rl); - nch_changed = mlx5e_get_dcb_num_tc(&priv->channels.params) > 1; - preactivate = nch_changed ? mlx5e_num_channels_changed_ctx : - mlx5e_update_netdev_queues_ctx; - err = mlx5e_safe_switch_params(priv, &new_params, preactivate, NULL, true); + err = mlx5e_safe_switch_params(priv, &new_params, + mlx5e_update_tc_and_tx_queues_ctx, NULL, true); if (err) { if (rl) { mlx5e_mqprio_rl_cleanup(rl); From 25dbe4f4b2ba80506b52ebe87bf6b01bff77222f Mon Sep 17 00:00:00 2001 From: Parav Pandit Date: Sun, 12 May 2024 15:43:05 +0300 Subject: [PATCH 035/548] net/mlx5: Remove unused msix related exported APIs MSIX irq allocation and free APIs are no longer in use. Hence, remove the dead code. Signed-off-by: Parav Pandit Reviewed-by: Dragos Tatulea Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Reviewed-by: Kalesh AP Link: https://lore.kernel.org/r/20240512124306.740898-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit db5944e16cd80efcbfe0bf1d067fdcc2f710d246) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 52 ------------------- include/linux/mlx5/driver.h | 7 --- 2 files changed, 59 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index a8d6fd18c0f55..065196ef8203f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -507,58 +507,6 @@ struct mlx5_irq *mlx5_irq_request(struct mlx5_core_dev *dev, u16 vecidx, return irq; } -/** - * mlx5_msix_alloc - allocate msix interrupt - * @dev: mlx5 device from which to request - * @handler: interrupt handler - * @affdesc: affinity descriptor - * @name: interrupt name - * - * Returns: struct msi_map with result encoded. - * Note: the caller must make sure to release the irq by calling - * mlx5_msix_free() if shutdown was initiated. - */ -struct msi_map mlx5_msix_alloc(struct mlx5_core_dev *dev, - irqreturn_t (*handler)(int, void *), - const struct irq_affinity_desc *affdesc, - const char *name) -{ - struct msi_map map; - int err; - - if (!dev->pdev) { - map.virq = 0; - map.index = -EINVAL; - return map; - } - - map = pci_msix_alloc_irq_at(dev->pdev, MSI_ANY_INDEX, affdesc); - if (!map.virq) - return map; - - err = request_irq(map.virq, handler, 0, name, NULL); - if (err) { - mlx5_core_warn(dev, "err %d\n", err); - pci_msix_free_irq(dev->pdev, map); - map.virq = 0; - map.index = -ENOMEM; - } - return map; -} -EXPORT_SYMBOL(mlx5_msix_alloc); - -/** - * mlx5_msix_free - free a previously allocated msix interrupt - * @dev: mlx5 device associated with interrupt - * @map: map previously returned by mlx5_msix_alloc() - */ -void mlx5_msix_free(struct mlx5_core_dev *dev, struct msi_map map) -{ - free_irq(map.virq, NULL); - pci_msix_free_irq(dev->pdev, map); -} -EXPORT_SYMBOL(mlx5_msix_free); - /** * mlx5_irq_release_vector - release one IRQ back to the system. * @irq: the irq to release. diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 4f042312b23a7..fc6afbec56537 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1385,11 +1385,4 @@ static inline bool mlx5_is_macsec_roce_supported(struct mlx5_core_dev *mdev) enum { MLX5_OCTWORD = 16, }; - -struct msi_map mlx5_msix_alloc(struct mlx5_core_dev *dev, - irqreturn_t (*handler)(int, void *), - const struct irq_affinity_desc *affdesc, - const char *name); -void mlx5_msix_free(struct mlx5_core_dev *dev, struct msi_map map); - #endif /* MLX5_DRIVER_H */ From 0c7279fb62fea53a41c058394f9606eb02170951 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:33 +0300 Subject: [PATCH 036/548] RDMA/core: Create "issm*" device nodes only when SMI is supported For an IB port create it's issm device node only when it has SMI capability. In following patches mlx5 is going to support IB devices without this cap. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/359f73c9a388d5e3ae971e40d8507888b1ba6f93.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 50660c5197f52b8137e223dc3ba8d43661179a1d) Signed-off-by: Koba Ko --- drivers/infiniband/core/user_mad.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c index 2ed749f50a29f..f760dfffa1886 100644 --- a/drivers/infiniband/core/user_mad.c +++ b/drivers/infiniband/core/user_mad.c @@ -1321,15 +1321,17 @@ static int ib_umad_init_port(struct ib_device *device, int port_num, if (ret) goto err_cdev; - ib_umad_init_port_dev(&port->sm_dev, port, device); - port->sm_dev.devt = base_issm; - dev_set_name(&port->sm_dev, "issm%d", port->dev_num); - cdev_init(&port->sm_cdev, &umad_sm_fops); - port->sm_cdev.owner = THIS_MODULE; - - ret = cdev_device_add(&port->sm_cdev, &port->sm_dev); - if (ret) - goto err_dev; + if (rdma_cap_ib_smi(device, port_num)) { + ib_umad_init_port_dev(&port->sm_dev, port, device); + port->sm_dev.devt = base_issm; + dev_set_name(&port->sm_dev, "issm%d", port->dev_num); + cdev_init(&port->sm_cdev, &umad_sm_fops); + port->sm_cdev.owner = THIS_MODULE; + + ret = cdev_device_add(&port->sm_cdev, &port->sm_dev); + if (ret) + goto err_dev; + } return 0; @@ -1345,9 +1347,13 @@ static int ib_umad_init_port(struct ib_device *device, int port_num, static void ib_umad_kill_port(struct ib_umad_port *port) { struct ib_umad_file *file; + bool has_smi = false; int id; - cdev_device_del(&port->sm_cdev, &port->sm_dev); + if (rdma_cap_ib_smi(port->ib_dev, port->port_num)) { + cdev_device_del(&port->sm_cdev, &port->sm_dev); + has_smi = true; + } cdev_device_del(&port->cdev, &port->dev); mutex_lock(&port->file_mutex); @@ -1373,7 +1379,8 @@ static void ib_umad_kill_port(struct ib_umad_port *port) ida_free(&umad_ida, port->dev_num); /* balances device_initialize() */ - put_device(&port->sm_dev); + if (has_smi) + put_device(&port->sm_dev); put_device(&port->dev); } From 1b6cf311831f356cc20ac07a01a866b67579dcaf Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:34 +0300 Subject: [PATCH 037/548] net/mlx5: mlx5_ifc update for multi-plane support Add new fields to support mlx5 multi-plane feature. Actual support will be added in following patches. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/36a74a1b1d2b7b59c99cda4abad1794ddde30230.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 65528cfb21fdb68de8ae6dccae19af180d93e143) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 714272d84b049..cb97f403aed31 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -782,7 +782,7 @@ struct mlx5_ifc_ads_bits { u8 reserved_at_2[0xe]; u8 pkey_index[0x10]; - u8 reserved_at_20[0x8]; + u8 plane_index[0x8]; u8 grh[0x1]; u8 mlid[0x7]; u8 rlid[0x10]; @@ -1949,7 +1949,8 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_c0[0x8]; u8 migration_multi_load[0x1]; u8 migration_tracking_state[0x1]; - u8 reserved_at_ca[0x6]; + u8 multiplane_qp_ud[0x1]; + u8 reserved_at_cb[0x5]; u8 migration_in_chunks[0x1]; u8 reserved_at_d1[0xf]; @@ -4103,7 +4104,8 @@ struct mlx5_ifc_hca_vport_context_bits { u8 has_smi[0x1]; u8 has_raw[0x1]; u8 grh_required[0x1]; - u8 reserved_at_104[0xc]; + u8 reserved_at_104[0x4]; + u8 num_port_plane[0x8]; u8 port_physical_state[0x4]; u8 vport_state_policy[0x4]; u8 port_state[0x4]; @@ -7582,7 +7584,7 @@ struct mlx5_ifc_mad_ifc_in_bits { u8 op_mod[0x10]; u8 remote_lid[0x10]; - u8 reserved_at_50[0x8]; + u8 plane_index[0x8]; u8 port[0x8]; u8 reserved_at_60[0x20]; @@ -9511,7 +9513,9 @@ struct mlx5_ifc_ptys_reg_bits { u8 an_disable_cap[0x1]; u8 reserved_at_3[0x5]; u8 local_port[0x8]; - u8 reserved_at_10[0xd]; + u8 reserved_at_10[0x8]; + u8 plane_ind[0x4]; + u8 reserved_at_1c[0x1]; u8 proto_mask[0x3]; u8 an_status[0x4]; From 865872f4e414eee1d621eaed71791fddd80bd51e Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:35 +0300 Subject: [PATCH 038/548] RDMA/mlx5: Add support to multi-plane device and port When multi-plane is supported, a logical port, which is aggregation of multiple physical plane ports, is exposed for data transmission. Compared with a normal mlx5 IB port, this logical port supports all functionalities except Subnet Management. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/7e37c06c9cb243be9ac79930cd17053903785b95.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 2a5db20fa532198639671713c6213f96ff285b85) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 60 ++++++++++++++++--- drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 + .../net/ethernet/mellanox/mlx5/core/vport.c | 1 + include/linux/mlx5/driver.h | 1 + 4 files changed, 55 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 19cc6b67e89c9..44d7b87d83743 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1334,7 +1334,13 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u32 port, props->sm_sl = rep->sm_sl; props->state = rep->vport_state; props->phys_state = rep->port_physical_state; - props->port_cap_flags = rep->cap_mask1; + + props->port_cap_flags = rep->cap_mask1; + if (dev->num_plane) { + props->port_cap_flags |= IB_PORT_SM_DISABLED; + props->port_cap_flags &= ~IB_PORT_SM; + } + props->gid_tbl_len = mlx5_get_gid_table_len(MLX5_CAP_GEN(mdev, gid_table_size)); props->max_msg_sz = 1 << MLX5_CAP_GEN(mdev, log_max_msg); props->pkey_tbl_len = mlx5_to_sw_pkey_sz(MLX5_CAP_GEN(mdev, pkey_table_size)); @@ -2780,6 +2786,23 @@ static int mlx5_ib_event_slave_port(struct notifier_block *nb, return NOTIFY_OK; } +static int mlx5_ib_get_plane_num(struct mlx5_core_dev *mdev, u8 *num_plane) +{ + struct mlx5_hca_vport_context vport_ctx; + int err; + + *num_plane = 0; + if (!MLX5_CAP_GEN(mdev, ib_virt)) + return 0; + + err = mlx5_query_hca_vport_context(mdev, 0, 1, 0, &vport_ctx); + if (err) + return err; + + *num_plane = vport_ctx.num_plane; + return 0; +} + static int set_has_smi_cap(struct mlx5_ib_dev *dev) { struct mlx5_hca_vport_context vport_ctx; @@ -2790,10 +2813,14 @@ static int set_has_smi_cap(struct mlx5_ib_dev *dev) return 0; for (port = 1; port <= dev->num_ports; port++) { - if (!MLX5_CAP_GEN(dev->mdev, ib_virt)) { + if (dev->num_plane) { + dev->port_caps[port - 1].has_smi = false; + continue; + } else if (!MLX5_CAP_GEN(dev->mdev, ib_virt)) { dev->port_caps[port - 1].has_smi = true; continue; } + err = mlx5_query_hca_vport_context(dev->mdev, 0, port, 0, &vport_ctx); if (err) { @@ -2984,6 +3011,11 @@ static u32 get_core_cap_flags(struct ib_device *ibdev, if (rep->grh_required) ret |= RDMA_CORE_CAP_IB_GRH_REQUIRED; + if (dev->num_plane) + return ret | RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD | + RDMA_CORE_CAP_IB_CM | RDMA_CORE_CAP_IB_SA | + RDMA_CORE_CAP_AF_IB; + if (ll == IB_LINK_LAYER_INFINIBAND) return ret | RDMA_CORE_PORT_IBA_IB; @@ -4498,11 +4530,18 @@ static int mlx5r_probe(struct auxiliary_device *adev, dev = ib_alloc_device(mlx5_ib_dev, ib_dev); if (!dev) return -ENOMEM; + + if (ll == IB_LINK_LAYER_INFINIBAND) { + ret = mlx5_ib_get_plane_num(mdev, &dev->num_plane); + if (ret) + goto fail; + } + dev->port = kcalloc(num_ports, sizeof(*dev->port), GFP_KERNEL); if (!dev->port) { - ib_dealloc_device(&dev->ib_dev); - return -ENOMEM; + ret = -ENOMEM; + goto fail; } dev->mdev = mdev; @@ -4514,14 +4553,17 @@ static int mlx5r_probe(struct auxiliary_device *adev, profile = &pf_profile; ret = __mlx5_ib_add(dev, profile); - if (ret) { - kfree(dev->port); - ib_dealloc_device(&dev->ib_dev); - return ret; - } + if (ret) + goto fail_ib_add; auxiliary_set_drvdata(adev, dev); return 0; + +fail_ib_add: + kfree(dev->port); +fail: + ib_dealloc_device(&dev->ib_dev); + return ret; } static void mlx5r_remove(struct auxiliary_device *adev) diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 94678e5c59dd5..b2bd44d8e9708 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1174,6 +1174,8 @@ struct mlx5_ib_dev { #ifdef CONFIG_MLX5_MACSEC struct mlx5_macsec macsec; #endif + + u8 num_plane; }; static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c index 06b5265b6e6db..db01486f8466d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c @@ -722,6 +722,7 @@ int mlx5_query_hca_vport_context(struct mlx5_core_dev *dev, rep->grh_required = MLX5_GET_PR(hca_vport_context, ctx, grh_required); rep->sys_image_guid = MLX5_GET64_PR(hca_vport_context, ctx, system_image_guid); + rep->num_plane = MLX5_GET_PR(hca_vport_context, ctx, num_port_plane); ex: kvfree(out); diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index fc6afbec56537..3f72069021b92 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -898,6 +898,7 @@ struct mlx5_hca_vport_context { u16 qkey_violation_counter; u16 pkey_violation_counter; bool grh_required; + u8 num_plane; }; #define STRUCT_FIELD(header, field) \ From 0bf5c9c540392d3dd2e086810bb0dd04fccaaff5 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:36 +0300 Subject: [PATCH 039/548] RDMA/core: Support IB sub device with type "SMI" This patch adds 2 APIs, as well as driver operations to support adding and deleting an IB sub device, which provides part of functionalities of it's parent. A sub device has a type; for a sub device with type "SMI", it provides the smi capability through umad for its parent, meaning uverb is not supported. A sub device cannot live without a parent. So when a parent is released, all it's sub devices are released as well. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/44253f7508b21eb2caefea3980c2bc072869116c.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit bca51197620a257e2954be99b16f05115c3b2630) Signed-off-by: Koba Ko --- drivers/infiniband/core/device.c | 69 +++++++++++++++++++++++++++ drivers/infiniband/core/uverbs_main.c | 3 +- include/rdma/ib_verbs.h | 43 +++++++++++++++++ include/uapi/rdma/rdma_netlink.h | 5 ++ 4 files changed, 119 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 6769c42e46d4f..979d114b7641f 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -503,6 +503,7 @@ static void ib_device_release(struct device *device) rcu_head); } + mutex_destroy(&dev->subdev_lock); mutex_destroy(&dev->unregistration_lock); mutex_destroy(&dev->compat_devs_mutex); @@ -650,6 +651,11 @@ struct ib_device *_ib_alloc_device(size_t size) BIT_ULL(IB_USER_VERBS_CMD_REG_MR) | BIT_ULL(IB_USER_VERBS_CMD_REREG_MR) | BIT_ULL(IB_USER_VERBS_CMD_RESIZE_CQ); + + mutex_init(&device->subdev_lock); + INIT_LIST_HEAD(&device->subdev_list_head); + INIT_LIST_HEAD(&device->subdev_list); + return device; } EXPORT_SYMBOL(_ib_alloc_device); @@ -1470,6 +1476,18 @@ EXPORT_SYMBOL(ib_register_device); /* Callers must hold a get on the device. */ static void __ib_unregister_device(struct ib_device *ib_dev) { + struct ib_device *sub, *tmp; + + mutex_lock(&ib_dev->subdev_lock); + list_for_each_entry_safe_reverse(sub, tmp, + &ib_dev->subdev_list_head, + subdev_list) { + list_del(&sub->subdev_list); + ib_dev->ops.del_sub_dev(sub); + ib_device_put(ib_dev); + } + mutex_unlock(&ib_dev->subdev_lock); + /* * We have a registration lock so that all the calls to unregister are * fully fenced, once any unregister returns the device is truely @@ -2602,6 +2620,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops) ops->uverbs_no_driver_id_binding; SET_DEVICE_OP(dev_ops, add_gid); + SET_DEVICE_OP(dev_ops, add_sub_dev); SET_DEVICE_OP(dev_ops, advise_mr); SET_DEVICE_OP(dev_ops, alloc_dm); SET_DEVICE_OP(dev_ops, alloc_hw_device_stats); @@ -2636,6 +2655,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops) SET_DEVICE_OP(dev_ops, dealloc_ucontext); SET_DEVICE_OP(dev_ops, dealloc_xrcd); SET_DEVICE_OP(dev_ops, del_gid); + SET_DEVICE_OP(dev_ops, del_sub_dev); SET_DEVICE_OP(dev_ops, dereg_mr); SET_DEVICE_OP(dev_ops, destroy_ah); SET_DEVICE_OP(dev_ops, destroy_counters); @@ -2730,6 +2750,55 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops) } EXPORT_SYMBOL(ib_set_device_ops); +int ib_add_sub_device(struct ib_device *parent, + enum rdma_nl_dev_type type, + const char *name) +{ + struct ib_device *sub; + int ret = 0; + + if (!parent->ops.add_sub_dev || !parent->ops.del_sub_dev) + return -EOPNOTSUPP; + + if (!ib_device_try_get(parent)) + return -EINVAL; + + sub = parent->ops.add_sub_dev(parent, type, name); + if (IS_ERR(sub)) { + ib_device_put(parent); + return PTR_ERR(sub); + } + + sub->type = type; + sub->parent = parent; + + mutex_lock(&parent->subdev_lock); + list_add_tail(&parent->subdev_list_head, &sub->subdev_list); + mutex_unlock(&parent->subdev_lock); + + return ret; +} +EXPORT_SYMBOL(ib_add_sub_device); + +int ib_del_sub_device_and_put(struct ib_device *sub) +{ + struct ib_device *parent = sub->parent; + + if (!parent) + return -EOPNOTSUPP; + + mutex_lock(&parent->subdev_lock); + list_del(&sub->subdev_list); + mutex_unlock(&parent->subdev_lock); + + ib_device_put(sub); + parent->ops.del_sub_dev(sub); + ib_device_put(parent); + + return 0; +} +EXPORT_SYMBOL(ib_del_sub_device_and_put); + #ifdef CONFIG_INFINIBAND_VIRT_DMA int ib_dma_virt_map_sg(struct ib_device *dev, struct scatterlist *sg, int nents) { diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 495d5a5d0373f..bc099287de9a6 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -1114,7 +1114,8 @@ static int ib_uverbs_add_one(struct ib_device *device) struct ib_uverbs_device *uverbs_dev; int ret; - if (!device->ops.alloc_ucontext) + if (!device->ops.alloc_ucontext || + device->type == RDMA_DEVICE_TYPE_SMI) return -EOPNOTSUPP; uverbs_dev = kzalloc(sizeof(*uverbs_dev), GFP_KERNEL); diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index c7e9ec9e9a802..ede4ac480ebb5 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2676,6 +2676,18 @@ struct ib_device_ops { */ int (*get_numa_node)(struct ib_device *dev); + /** + * add_sub_dev - Add a sub IB device + */ + struct ib_device *(*add_sub_dev)(struct ib_device *parent, + enum rdma_nl_dev_type type, + const char *name); + + /** + * del_sub_dev - Delete a sub IB device + */ + void (*del_sub_dev)(struct ib_device *sub_dev); + DECLARE_RDMA_OBJ_SIZE(ib_ah); DECLARE_RDMA_OBJ_SIZE(ib_counters); DECLARE_RDMA_OBJ_SIZE(ib_cq); @@ -2787,6 +2799,15 @@ struct ib_device { char iw_ifname[IFNAMSIZ]; u32 iw_driver_flags; u32 lag_flags; + + /* A parent device has a list of sub-devices */ + struct mutex subdev_lock; + struct list_head subdev_list_head; + + /* A sub device has a type and a parent */ + enum rdma_nl_dev_type type; + struct ib_device *parent; + struct list_head subdev_list; }; static inline void *rdma_zalloc_obj(struct ib_device *dev, size_t size, @@ -4836,4 +4857,26 @@ static inline u16 rdma_get_udp_sport(u32 fl, u32 lqpn, u32 rqpn) const struct ib_port_immutable* ib_port_immutable_read(struct ib_device *dev, unsigned int port); + +/** ib_add_sub_device - Add a sub IB device on an existing one + * + * @parent: The IB device that needs to add a sub device + * @type: The type of the new sub device + * @name: The name of the new sub device + * + * + * Return 0 on success, an error code otherwise + */ +int ib_add_sub_device(struct ib_device *parent, + enum rdma_nl_dev_type type, + const char *name); + + +/** ib_del_sub_device_and_put - Delect an IB sub device while holding a 'get' + * + * @sub: The sub device that is going to be deleted + * + * Return 0 on success, an error code otherwise + */ +int ib_del_sub_device_and_put(struct ib_device *sub); #endif /* IB_VERBS_H */ diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h index e50c357367db0..47a523ec8dbff 100644 --- a/include/uapi/rdma/rdma_netlink.h +++ b/include/uapi/rdma/rdma_netlink.h @@ -592,4 +592,9 @@ enum rdma_nl_counter_mask { RDMA_COUNTER_MASK_QP_TYPE = 1, RDMA_COUNTER_MASK_PID = 1 << 1, }; + +/* Supported rdma device types. */ +enum rdma_nl_dev_type { + RDMA_DEVICE_TYPE_SMI = 1, +}; #endif /* _UAPI_RDMA_NETLINK_H */ From 39db0d6e16da39a6f75ec46899d47eeae698591a Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:37 +0300 Subject: [PATCH 040/548] RDMA: Set type of rdma_ah to IB for a SMI sub device An address handle created on a SMI port has type IB, as a SMI port it's used for SMI management through umad. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/195be77aae0cce93522269f22f1303d2ccbef605.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 36e97bbc2dca61751c5b4b7f248cd850a7654a19) Signed-off-by: Koba Ko --- include/rdma/ib_verbs.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index ede4ac480ebb5..2fa0fb41fb098 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -4676,6 +4676,8 @@ static inline enum rdma_ah_attr_type rdma_ah_find_type(struct ib_device *dev, return RDMA_AH_ATTR_TYPE_OPA; return RDMA_AH_ATTR_TYPE_IB; } + if (dev->type == RDMA_DEVICE_TYPE_SMI) + return RDMA_AH_ATTR_TYPE_IB; return RDMA_AH_ATTR_TYPE_UNDEFINED; } From 4cd9495f79182c0e9b1c39d4a136f90d5255c6a9 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:38 +0300 Subject: [PATCH 041/548] RDMA/core: Create GSI QP only when CM is supported GSI QP is not needed if the port doesn't support connection management. In following patches mlx5 is going to support IB ports that doesn't support CM. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/c449ebd955923b0e54c58832fd322f9d461b37a0.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit a9e0facacfd16117ab69a73e29b1335b4648773d) Signed-off-by: Koba Ko --- drivers/infiniband/core/agent.c | 32 ++++++++++++++++++++++---------- drivers/infiniband/core/mad.c | 9 ++++++--- 2 files changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c index f82b4260de42c..3bb46696731ee 100644 --- a/drivers/infiniband/core/agent.c +++ b/drivers/infiniband/core/agent.c @@ -59,7 +59,16 @@ __ib_get_agent_port(const struct ib_device *device, int port_num) struct ib_agent_port_private *entry; list_for_each_entry(entry, &ib_agent_port_list, port_list) { - if (entry->agent[1]->device == device && + /* Need to check both agent[0] and agent[1], as an agent port + * may only have one of them + */ + if (entry->agent[0] && + entry->agent[0]->device == device && + entry->agent[0]->port_num == port_num) + return entry; + + if (entry->agent[1] && + entry->agent[1]->device == device && entry->agent[1]->port_num == port_num) return entry; } @@ -172,14 +181,16 @@ int ib_agent_port_open(struct ib_device *device, int port_num) } } - /* Obtain send only MAD agent for GSI QP */ - port_priv->agent[1] = ib_register_mad_agent(device, port_num, - IB_QPT_GSI, NULL, 0, - &agent_send_handler, - NULL, NULL, 0); - if (IS_ERR(port_priv->agent[1])) { - ret = PTR_ERR(port_priv->agent[1]); - goto error3; + if (rdma_cap_ib_cm(device, port_num)) { + /* Obtain send only MAD agent for GSI QP */ + port_priv->agent[1] = ib_register_mad_agent(device, port_num, + IB_QPT_GSI, NULL, 0, + &agent_send_handler, + NULL, NULL, 0); + if (IS_ERR(port_priv->agent[1])) { + ret = PTR_ERR(port_priv->agent[1]); + goto error3; + } } spin_lock_irqsave(&ib_agent_port_list_lock, flags); @@ -212,7 +223,8 @@ int ib_agent_port_close(struct ib_device *device, int port_num) list_del(&port_priv->port_list); spin_unlock_irqrestore(&ib_agent_port_list_lock, flags); - ib_unregister_mad_agent(port_priv->agent[1]); + if (port_priv->agent[1]) + ib_unregister_mad_agent(port_priv->agent[1]); if (port_priv->agent[0]) ib_unregister_mad_agent(port_priv->agent[0]); diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index 242434c09e8d8..a5e0f8d492dff 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -2987,9 +2987,12 @@ static int ib_mad_port_open(struct ib_device *device, if (ret) goto error6; } - ret = create_mad_qp(&port_priv->qp_info[1], IB_QPT_GSI); - if (ret) - goto error7; + + if (rdma_cap_ib_cm(device, port_num)) { + ret = create_mad_qp(&port_priv->qp_info[1], IB_QPT_GSI); + if (ret) + goto error7; + } snprintf(name, sizeof(name), "ib_mad%u", port_num); port_priv->wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM); From f5d18cda93a96f8367be98249747aa0dcfd5257f Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:39 +0300 Subject: [PATCH 042/548] RDMA/mlx5: Support plane device and driver APIs to add and delete it This patch supports driver APIs "add_sub_dev" and "del_sub_dev", to add and delete a plane device respectively. A mlx5 plane device is a rdma SMI device; It provides the SMI capability through user MAD for it's parent, the logical multi-plane aggregated device. For a plane port: - It supports QP0 only; - When adding a plane device, all plane ports are added; - For some commands like mad_ifc, both plane_index and native portnum is needed; - When querying or modifying a plane port context, the native portnum must be used, as the query/modify_hca_vport_context command doesn't support plane port. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/e933cd0562aece181f8657af2ca0f5b387d0f14e.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 026a425990af6969a15b57d6d7fa0138a7e21958) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/cmd.c | 12 ++- drivers/infiniband/hw/mlx5/cmd.h | 2 +- drivers/infiniband/hw/mlx5/mad.c | 2 +- drivers/infiniband/hw/mlx5/main.c | 116 ++++++++++++++++++++++++++- drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 ++ drivers/infiniband/hw/mlx5/qp.c | 7 +- drivers/infiniband/hw/mlx5/qpc.c | 13 ++- 7 files changed, 147 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c index 1d0c8d5e745bf..895b62cc528df 100644 --- a/drivers/infiniband/hw/mlx5/cmd.c +++ b/drivers/infiniband/hw/mlx5/cmd.c @@ -177,7 +177,7 @@ int mlx5_cmd_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn, u16 uid) return mlx5_cmd_exec_in(dev, dealloc_xrcd, in); } -int mlx5_cmd_mad_ifc(struct mlx5_core_dev *dev, const void *inb, void *outb, +int mlx5_cmd_mad_ifc(struct mlx5_ib_dev *dev, const void *inb, void *outb, u16 opmod, u8 port) { int outlen = MLX5_ST_SZ_BYTES(mad_ifc_out); @@ -195,12 +195,18 @@ int mlx5_cmd_mad_ifc(struct mlx5_core_dev *dev, const void *inb, void *outb, MLX5_SET(mad_ifc_in, in, opcode, MLX5_CMD_OP_MAD_IFC); MLX5_SET(mad_ifc_in, in, op_mod, opmod); - MLX5_SET(mad_ifc_in, in, port, port); + if (dev->ib_dev.type == RDMA_DEVICE_TYPE_SMI) { + MLX5_SET(mad_ifc_in, in, plane_index, port); + MLX5_SET(mad_ifc_in, in, port, + smi_to_native_portnum(dev, port)); + } else { + MLX5_SET(mad_ifc_in, in, port, port); + } data = MLX5_ADDR_OF(mad_ifc_in, in, mad); memcpy(data, inb, MLX5_FLD_SZ_BYTES(mad_ifc_in, mad)); - err = mlx5_cmd_exec_inout(dev, mad_ifc, in, out); + err = mlx5_cmd_exec_inout(dev->mdev, mad_ifc, in, out); if (err) goto out; diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h index 93a971a40d119..e5cd31270443d 100644 --- a/drivers/infiniband/hw/mlx5/cmd.h +++ b/drivers/infiniband/hw/mlx5/cmd.h @@ -54,7 +54,7 @@ int mlx5_cmd_detach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn, u16 uid); int mlx5_cmd_xrcd_alloc(struct mlx5_core_dev *dev, u32 *xrcdn, u16 uid); int mlx5_cmd_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn, u16 uid); -int mlx5_cmd_mad_ifc(struct mlx5_core_dev *dev, const void *inb, void *outb, +int mlx5_cmd_mad_ifc(struct mlx5_ib_dev *dev, const void *inb, void *outb, u16 opmod, u8 port); int mlx5_cmd_uar_alloc(struct mlx5_core_dev *dev, u32 *uarn, u16 uid); int mlx5_cmd_uar_dealloc(struct mlx5_core_dev *dev, u32 uarn, u16 uid); diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c index 3e43687a7f6f7..ead836d159d36 100644 --- a/drivers/infiniband/hw/mlx5/mad.c +++ b/drivers/infiniband/hw/mlx5/mad.c @@ -69,7 +69,7 @@ static int mlx5_MAD_IFC(struct mlx5_ib_dev *dev, int ignore_mkey, if (ignore_bkey || !in_wc) op_modifier |= 0x2; - return mlx5_cmd_mad_ifc(dev->mdev, in_mad, response_mad, op_modifier, + return mlx5_cmd_mad_ifc(dev, in_mad, response_mad, op_modifier, port); } diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 44d7b87d83743..0e402059d7a19 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -283,6 +283,14 @@ struct mlx5_core_dev *mlx5_ib_get_native_port_mdev(struct mlx5_ib_dev *ibdev, struct mlx5_ib_multiport_info *mpi; struct mlx5_ib_port *port; + if (ibdev->ib_dev.type == RDMA_DEVICE_TYPE_SMI) { + if (native_port_num) + *native_port_num = smi_to_native_portnum(ibdev, + ib_port_num); + return ibdev->mdev; + + } + if (!mlx5_core_mp_enabled(ibdev->mdev) || ll != IB_LINK_LAYER_ETHERNET) { if (native_port_num) @@ -1324,6 +1332,9 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u32 port, /* props being zeroed by the caller, avoid zeroing it here */ + if (ibdev->type == RDMA_DEVICE_TYPE_SMI) + port = smi_to_native_portnum(dev, port); + err = mlx5_query_hca_vport_context(mdev, 0, port, 0, rep); if (err) goto out; @@ -1339,7 +1350,8 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u32 port, if (dev->num_plane) { props->port_cap_flags |= IB_PORT_SM_DISABLED; props->port_cap_flags &= ~IB_PORT_SM; - } + } else if (ibdev->type == RDMA_DEVICE_TYPE_SMI) + props->port_cap_flags &= ~IB_PORT_CM_SUP; props->gid_tbl_len = mlx5_get_gid_table_len(MLX5_CAP_GEN(mdev, gid_table_size)); props->max_msg_sz = 1 << MLX5_CAP_GEN(mdev, log_max_msg); @@ -2816,7 +2828,8 @@ static int set_has_smi_cap(struct mlx5_ib_dev *dev) if (dev->num_plane) { dev->port_caps[port - 1].has_smi = false; continue; - } else if (!MLX5_CAP_GEN(dev->mdev, ib_virt)) { + } else if (!MLX5_CAP_GEN(dev->mdev, ib_virt) || + dev->ib_dev.type == RDMA_DEVICE_TYPE_SMI) { dev->port_caps[port - 1].has_smi = true; continue; } @@ -3015,6 +3028,8 @@ static u32 get_core_cap_flags(struct ib_device *ibdev, return ret | RDMA_CORE_CAP_PROT_IB | RDMA_CORE_CAP_IB_MAD | RDMA_CORE_CAP_IB_CM | RDMA_CORE_CAP_IB_SA | RDMA_CORE_CAP_AF_IB; + else if (ibdev->type == RDMA_DEVICE_TYPE_SMI) + return ret | RDMA_CORE_CAP_IB_MAD | RDMA_CORE_CAP_IB_SMI; if (ll == IB_LINK_LAYER_INFINIBAND) return ret | RDMA_CORE_PORT_IBA_IB; @@ -3051,6 +3066,9 @@ static int mlx5_port_immutable(struct ib_device *ibdev, u32 port_num, return err; if (ll == IB_LINK_LAYER_INFINIBAND) { + if (ibdev->type == RDMA_DEVICE_TYPE_SMI) + port_num = smi_to_native_portnum(dev, port_num); + err = mlx5_query_hca_vport_context(dev->mdev, 0, port_num, 0, &rep); if (err) @@ -3869,12 +3887,18 @@ static int mlx5_ib_enable_driver(struct ib_device *dev) return ret; } +static struct ib_device *mlx5_ib_add_sub_dev(struct ib_device *parent, + enum rdma_nl_dev_type type, + const char *name); +static void mlx5_ib_del_sub_dev(struct ib_device *sub_dev); + static const struct ib_device_ops mlx5_ib_dev_ops = { .owner = THIS_MODULE, .driver_id = RDMA_DRIVER_MLX5, .uverbs_abi_ver = MLX5_IB_UVERBS_ABI_VERSION, .add_gid = mlx5_ib_add_gid, + .add_sub_dev = mlx5_ib_add_sub_dev, .alloc_mr = mlx5_ib_alloc_mr, .alloc_mr_integrity = mlx5_ib_alloc_mr_integrity, .alloc_pd = mlx5_ib_alloc_pd, @@ -3889,6 +3913,7 @@ static const struct ib_device_ops mlx5_ib_dev_ops = { .dealloc_pd = mlx5_ib_dealloc_pd, .dealloc_ucontext = mlx5_ib_dealloc_ucontext, .del_gid = mlx5_ib_del_gid, + .del_sub_dev = mlx5_ib_del_sub_dev, .dereg_mr = mlx5_ib_dereg_mr, .destroy_ah = mlx5_ib_destroy_ah, .destroy_cq = mlx5_ib_destroy_cq, @@ -4179,7 +4204,9 @@ static int mlx5_ib_stage_ib_reg_init(struct mlx5_ib_dev *dev) { const char *name; - if (!mlx5_lag_is_active(dev->mdev)) + if (dev->sub_dev_name) + name = dev->sub_dev_name; + else if (!mlx5_lag_is_active(dev->mdev)) name = "mlx5_%d"; else name = "mlx5_bond_%d"; @@ -4452,6 +4479,89 @@ const struct mlx5_ib_profile raw_eth_profile = { NULL), }; +static const struct mlx5_ib_profile plane_profile = { + STAGE_CREATE(MLX5_IB_STAGE_INIT, + mlx5_ib_stage_init_init, + mlx5_ib_stage_init_cleanup), + STAGE_CREATE(MLX5_IB_STAGE_CAPS, + mlx5_ib_stage_caps_init, + mlx5_ib_stage_caps_cleanup), + STAGE_CREATE(MLX5_IB_STAGE_NON_DEFAULT_CB, + mlx5_ib_stage_non_default_cb, + NULL), + STAGE_CREATE(MLX5_IB_STAGE_QP, + mlx5_init_qp_table, + mlx5_cleanup_qp_table), + STAGE_CREATE(MLX5_IB_STAGE_SRQ, + mlx5_init_srq_table, + mlx5_cleanup_srq_table), + STAGE_CREATE(MLX5_IB_STAGE_DEVICE_RESOURCES, + mlx5_ib_dev_res_init, + mlx5_ib_dev_res_cleanup), + STAGE_CREATE(MLX5_IB_STAGE_BFREG, + mlx5_ib_stage_bfrag_init, + mlx5_ib_stage_bfrag_cleanup), + STAGE_CREATE(MLX5_IB_STAGE_IB_REG, + mlx5_ib_stage_ib_reg_init, + mlx5_ib_stage_ib_reg_cleanup), +}; + +static struct ib_device *mlx5_ib_add_sub_dev(struct ib_device *parent, + enum rdma_nl_dev_type type, + const char *name) +{ + struct mlx5_ib_dev *mparent = to_mdev(parent), *mplane; + enum rdma_link_layer ll; + int ret; + + if (mparent->smi_dev) + return ERR_PTR(-EEXIST); + + ll = mlx5_port_type_cap_to_rdma_ll(MLX5_CAP_GEN(mparent->mdev, + port_type)); + if (type != RDMA_DEVICE_TYPE_SMI || !mparent->num_plane || + ll != IB_LINK_LAYER_INFINIBAND || + !MLX5_CAP_GEN_2(mparent->mdev, multiplane_qp_ud)) + return ERR_PTR(-EOPNOTSUPP); + + mplane = ib_alloc_device(mlx5_ib_dev, ib_dev); + if (!mplane) + return ERR_PTR(-ENOMEM); + + mplane->port = kcalloc(mparent->num_plane * mparent->num_ports, + sizeof(*mplane->port), GFP_KERNEL); + if (!mplane->port) { + ret = -ENOMEM; + goto fail_kcalloc; + } + + mplane->ib_dev.type = type; + mplane->mdev = mparent->mdev; + mplane->num_ports = mparent->num_plane; + mplane->sub_dev_name = name; + + ret = __mlx5_ib_add(mplane, &plane_profile); + if (ret) + goto fail_ib_add; + + mparent->smi_dev = mplane; + return &mplane->ib_dev; + +fail_ib_add: + kfree(mplane->port); +fail_kcalloc: + ib_dealloc_device(&mplane->ib_dev); + return ERR_PTR(ret); +} + +static void mlx5_ib_del_sub_dev(struct ib_device *sub_dev) +{ + struct mlx5_ib_dev *mdev = to_mdev(sub_dev); + + to_mdev(sub_dev->parent)->smi_dev = NULL; + __mlx5_ib_remove(mdev, mdev->profile, MLX5_IB_STAGE_MAX); +} + static int mlx5r_mp_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id) { diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index b2bd44d8e9708..0f1266502c085 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1176,6 +1176,8 @@ struct mlx5_ib_dev { #endif u8 num_plane; + struct mlx5_ib_dev *smi_dev; + const char *sub_dev_name; }; static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq) @@ -1684,4 +1686,10 @@ static inline bool mlx5_umem_needs_ats(struct mlx5_ib_dev *dev, int set_roce_addr(struct mlx5_ib_dev *dev, u32 port_num, unsigned int index, const union ib_gid *gid, const struct ib_gid_attr *attr); + +static inline u32 smi_to_native_portnum(struct mlx5_ib_dev *dev, u32 port) +{ + return (port - 1) / dev->num_ports + 1; +} + #endif /* MLX5_IB_H */ diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index de0895335872f..f07d2b96d0a3d 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -4230,7 +4230,12 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp, /* todo implement counter_index functionality */ - if (is_sqp(qp->type)) + if (dev->ib_dev.type == RDMA_DEVICE_TYPE_SMI && is_qp0(qp->type)) { + MLX5_SET(ads, pri_path, vhca_port_num, + smi_to_native_portnum(dev, qp->port)); + if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) + MLX5_SET(ads, pri_path, plane_index, qp->port); + } else if (is_sqp(qp->type)) MLX5_SET(ads, pri_path, vhca_port_num, qp->port); if (attr_mask & IB_QP_PORT) diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c index 9de9bea1025e2..146d03ae40bd9 100644 --- a/drivers/infiniband/hw/mlx5/qpc.c +++ b/drivers/infiniband/hw/mlx5/qpc.c @@ -263,7 +263,8 @@ int mlx5_qpc_create_qp(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp, if (err) goto err_cmd; - mlx5_debug_qp_add(dev->mdev, qp); + if (dev->ib_dev.type != RDMA_DEVICE_TYPE_SMI) + mlx5_debug_qp_add(dev->mdev, qp); return 0; @@ -321,7 +322,8 @@ int mlx5_core_destroy_qp(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp) { u32 in[MLX5_ST_SZ_DW(destroy_qp_in)] = {}; - mlx5_debug_qp_remove(dev->mdev, qp); + if (dev->ib_dev.type != RDMA_DEVICE_TYPE_SMI) + mlx5_debug_qp_remove(dev->mdev, qp); destroy_resource_common(dev, qp); @@ -518,7 +520,9 @@ int mlx5_init_qp_table(struct mlx5_ib_dev *dev) spin_lock_init(&table->lock); INIT_RADIX_TREE(&table->tree, GFP_ATOMIC); xa_init(&table->dct_xa); - mlx5_qp_debugfs_init(dev->mdev); + + if (dev->ib_dev.type != RDMA_DEVICE_TYPE_SMI) + mlx5_qp_debugfs_init(dev->mdev); table->nb.notifier_call = rsc_event_notifier; mlx5_notifier_register(dev->mdev, &table->nb); @@ -531,7 +535,8 @@ void mlx5_cleanup_qp_table(struct mlx5_ib_dev *dev) struct mlx5_qp_table *table = &dev->qp_table; mlx5_notifier_unregister(dev->mdev, &table->nb); - mlx5_qp_debugfs_cleanup(dev->mdev); + if (dev->ib_dev.type != RDMA_DEVICE_TYPE_SMI) + mlx5_qp_debugfs_cleanup(dev->mdev); } int mlx5_core_qp_query(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp, From a1e9f6b9358c2fec3dddd6df0c0cebaf45cc9b59 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:40 +0300 Subject: [PATCH 043/548] RDMA/nldev: Add support to add/delete a sub IB device through netlink Add new netlink commands and attributes to support adding and deleting a sub IB device with admin privilege. Examples: $ rdma dev add smi1 type SMI parent ibp8s0f1 $ rdma dev del smi1 Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/77cbf1b36359642be8a8d8c5c2f4e585b544282f.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 060c642b2ab8b40b39f9db99c1d14c7d19ba507f) Signed-off-by: Koba Ko --- drivers/infiniband/core/nldev.c | 59 ++++++++++++++++++++++++++++++++ include/uapi/rdma/rdma_netlink.h | 6 ++++ 2 files changed, 65 insertions(+) diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index 6d1dbc9787590..bc65ecf16857b 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -156,6 +156,7 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = { [RDMA_NLDEV_SYS_ATTR_COPY_ON_FORK] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_STAT_HWCOUNTER_INDEX] = { .type = NLA_U32 }, [RDMA_NLDEV_ATTR_STAT_HWCOUNTER_DYNAMIC] = { .type = NLA_U8 }, + [RDMA_NLDEV_ATTR_DEV_TYPE] = { .type = NLA_U8 }, }; static int put_driver_name_print_type(struct sk_buff *msg, const char *name, @@ -2468,6 +2469,56 @@ static int nldev_stat_get_counter_status_doit(struct sk_buff *skb, return ret; } +static int nldev_newdev(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[RDMA_NLDEV_ATTR_MAX]; + enum rdma_nl_dev_type type; + struct ib_device *parent; + char name[IFNAMSIZ] = {}; + u32 parentid; + int ret; + + ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1, + nldev_policy, extack); + if (ret || !tb[RDMA_NLDEV_ATTR_DEV_INDEX] || + !tb[RDMA_NLDEV_ATTR_DEV_NAME] || !tb[RDMA_NLDEV_ATTR_DEV_TYPE]) + return -EINVAL; + + nla_strscpy(name, tb[RDMA_NLDEV_ATTR_DEV_NAME], sizeof(name)); + type = nla_get_u8(tb[RDMA_NLDEV_ATTR_DEV_TYPE]); + parentid = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]); + parent = ib_device_get_by_index(sock_net(skb->sk), parentid); + if (!parent) + return -EINVAL; + + ret = ib_add_sub_device(parent, type, name); + ib_device_put(parent); + + return ret; +} + +static int nldev_deldev(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[RDMA_NLDEV_ATTR_MAX]; + struct ib_device *device; + u32 devid; + int ret; + + ret = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1, + nldev_policy, extack); + if (ret || !tb[RDMA_NLDEV_ATTR_DEV_INDEX]) + return -EINVAL; + + devid = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]); + device = ib_device_get_by_index(sock_net(skb->sk), devid); + if (!device) + return -EINVAL; + + return ib_del_sub_device_and_put(device); +} + static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = { [RDMA_NLDEV_CMD_GET] = { .doit = nldev_get_doit, @@ -2561,6 +2612,14 @@ static const struct rdma_nl_cbs nldev_cb_table[RDMA_NLDEV_NUM_OPS] = { [RDMA_NLDEV_CMD_STAT_GET_STATUS] = { .doit = nldev_stat_get_counter_status_doit, }, + [RDMA_NLDEV_CMD_NEWDEV] = { + .doit = nldev_newdev, + .flags = RDMA_NL_ADMIN_PERM, + }, + [RDMA_NLDEV_CMD_DELDEV] = { + .doit = nldev_deldev, + .flags = RDMA_NL_ADMIN_PERM, + }, }; void __init nldev_init(void) diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h index 47a523ec8dbff..8186e493a619a 100644 --- a/include/uapi/rdma/rdma_netlink.h +++ b/include/uapi/rdma/rdma_netlink.h @@ -299,6 +299,10 @@ enum rdma_nldev_command { RDMA_NLDEV_CMD_STAT_GET_STATUS, + RDMA_NLDEV_CMD_NEWDEV, + + RDMA_NLDEV_CMD_DELDEV, + RDMA_NLDEV_NUM_OPS }; @@ -554,6 +558,8 @@ enum rdma_nldev_attr { RDMA_NLDEV_ATTR_STAT_HWCOUNTER_INDEX, /* u32 */ RDMA_NLDEV_ATTR_STAT_HWCOUNTER_DYNAMIC, /* u8 */ + RDMA_NLDEV_ATTR_DEV_TYPE, /* u8 */ + /* * Always the end */ From b21b2fe3eeaf2490bc29c87490950d9e1c1b8642 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:41 +0300 Subject: [PATCH 044/548] RDMA/nldev: Add support to dump device type and parent device if exists If a device has a specific type or a parent device, dump them as well. Example: $ rdma dev show smi1 3: smi1: node_type ca fw 20.38.1002 node_guid 9803:9b03:009f:d5ef sys_image_guid 9803:9b03:009f:d5ee type smi parent ibp8s0f1 Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/4c022e3e34b5de1254a3b367d502a362cdd0c53a.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 294424839b5ec2ecd17f4c8409796846b2b8dd31) Signed-off-by: Koba Ko --- drivers/infiniband/core/nldev.c | 10 ++++++++++ include/uapi/rdma/rdma_netlink.h | 2 ++ 2 files changed, 12 insertions(+) diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index bc65ecf16857b..d8985dd726da7 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -157,6 +157,7 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = { [RDMA_NLDEV_ATTR_STAT_HWCOUNTER_INDEX] = { .type = NLA_U32 }, [RDMA_NLDEV_ATTR_STAT_HWCOUNTER_DYNAMIC] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_DEV_TYPE] = { .type = NLA_U8 }, + [RDMA_NLDEV_ATTR_PARENT_NAME] = { .type = NLA_NUL_STRING }, }; static int put_driver_name_print_type(struct sk_buff *msg, const char *name, @@ -285,6 +286,15 @@ static int fill_dev_info(struct sk_buff *msg, struct ib_device *device) if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_DIM, device->use_cq_dim)) return -EMSGSIZE; + if (device->type && + nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_TYPE, device->type)) + return -EMSGSIZE; + + if (device->parent && + nla_put_string(msg, RDMA_NLDEV_ATTR_PARENT_NAME, + dev_name(&device->parent->dev))) + return -EMSGSIZE; + /* * Link type is determined on first port and mlx4 device * which can potentially have two different link type for the same diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h index 8186e493a619a..f40d1152d43eb 100644 --- a/include/uapi/rdma/rdma_netlink.h +++ b/include/uapi/rdma/rdma_netlink.h @@ -560,6 +560,8 @@ enum rdma_nldev_attr { RDMA_NLDEV_ATTR_DEV_TYPE, /* u8 */ + RDMA_NLDEV_ATTR_PARENT_NAME, /* string */ + /* * Always the end */ From 3cf0f48901b83d5f21ddac56a1608900bc8c6211 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:42 +0300 Subject: [PATCH 045/548] RDMA/mlx5: Add plane index support when querying PTYS registers Support the new "plane_ind" field when querying port PTYS registers. This is needed when querying the rate of a plane port. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/1f703c36306aa46917fcd88eadbb23b3e380d526.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 3b43399b297c52cfe4dfe0194b6ad89d52bf8c35) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 12 +++++++----- drivers/net/ethernet/mellanox/mlx5/core/en/port.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 2 +- .../net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/port.c | 10 ++++++---- include/linux/mlx5/port.h | 5 +++-- 6 files changed, 19 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 0e402059d7a19..910bc8bd390dc 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -512,10 +512,10 @@ static int mlx5_query_port_roce(struct ib_device *device, u32 port_num, */ if (dev->is_rep) err = mlx5_query_port_ptys(mdev, out, sizeof(out), MLX5_PTYS_EN, - 1); + 1, 0); else err = mlx5_query_port_ptys(mdev, out, sizeof(out), MLX5_PTYS_EN, - mdev_port_num); + mdev_port_num, 0); if (err) goto out; ext = !!MLX5_GET_ETH_PROTO(ptys_reg, out, true, eth_proto_capability); @@ -1318,11 +1318,11 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u32 port, struct mlx5_ib_dev *dev = to_mdev(ibdev); struct mlx5_core_dev *mdev = dev->mdev; struct mlx5_hca_vport_context *rep; + u8 vl_hw_cap, plane_index = 0; u16 max_mtu; u16 oper_mtu; int err; u16 ib_link_width_oper; - u8 vl_hw_cap; rep = kzalloc(sizeof(*rep), GFP_KERNEL); if (!rep) { @@ -1332,8 +1332,10 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u32 port, /* props being zeroed by the caller, avoid zeroing it here */ - if (ibdev->type == RDMA_DEVICE_TYPE_SMI) + if (ibdev->type == RDMA_DEVICE_TYPE_SMI) { + plane_index = port; port = smi_to_native_portnum(dev, port); + } err = mlx5_query_hca_vport_context(mdev, 0, port, 0, rep); if (err) @@ -1365,7 +1367,7 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u32 port, props->port_cap_flags2 = rep->cap_mask2; err = mlx5_query_ib_port_oper(mdev, &ib_link_width_oper, - &props->active_speed, port); + &props->active_speed, port, plane_index); if (err) goto out; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port.c b/drivers/net/ethernet/mellanox/mlx5/core/en/port.c index dbe2b19a9570e..f6bdf35ec460e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port.c @@ -41,7 +41,7 @@ void mlx5_port_query_eth_autoneg(struct mlx5_core_dev *dev, u8 *an_status, *an_disable_cap = 0; *an_disable_admin = 0; - if (mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, 1)) + if (mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, 1, 0)) return; *an_status = MLX5_GET(ptys_reg, out, an_status); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 387988fec6f18..6f6703e845b92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -1019,7 +1019,7 @@ int mlx5e_ethtool_get_link_ksettings(struct mlx5e_priv *priv, bool ext; int err; - err = mlx5_query_port_ptys(mdev, out, sizeof(out), MLX5_PTYS_EN, 1); + err = mlx5_query_port_ptys(mdev, out, sizeof(out), MLX5_PTYS_EN, 1, 0); if (err) { netdev_err(priv->netdev, "%s: query port ptys failed: %d\n", __func__, err); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c index 779d92b762d36..b8aadbea1312f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c @@ -215,7 +215,7 @@ static int mlx5i_get_link_ksettings(struct net_device *netdev, int speed, ret; ret = mlx5_query_ib_port_oper(mdev, &ib_link_width_oper, &ib_proto_oper, - 1); + 1, 0); if (ret) return ret; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index 7d8c732818f20..bc48673b9a05e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -144,11 +144,13 @@ int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num, u32 caps) EXPORT_SYMBOL_GPL(mlx5_set_port_caps); int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys, - int ptys_size, int proto_mask, u8 local_port) + int ptys_size, int proto_mask, + u8 local_port, u8 plane_index) { u32 in[MLX5_ST_SZ_DW(ptys_reg)] = {0}; MLX5_SET(ptys_reg, in, local_port, local_port); + MLX5_SET(ptys_reg, in, plane_ind, plane_index); MLX5_SET(ptys_reg, in, proto_mask, proto_mask); return mlx5_core_access_reg(dev, in, sizeof(in), ptys, ptys_size, MLX5_REG_PTYS, 0, 0); @@ -167,13 +169,13 @@ int mlx5_set_port_beacon(struct mlx5_core_dev *dev, u16 beacon_duration) } int mlx5_query_ib_port_oper(struct mlx5_core_dev *dev, u16 *link_width_oper, - u16 *proto_oper, u8 local_port) + u16 *proto_oper, u8 local_port, u8 plane_index) { u32 out[MLX5_ST_SZ_DW(ptys_reg)]; int err; err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_IB, - local_port); + local_port, plane_index); if (err) return err; @@ -1114,7 +1116,7 @@ int mlx5_port_query_eth_proto(struct mlx5_core_dev *dev, u8 port, bool ext, if (!eproto) return -EINVAL; - err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, port); + err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, port, 0); if (err) return err; diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h index 26092c78a9850..e68d42b8ce652 100644 --- a/include/linux/mlx5/port.h +++ b/include/linux/mlx5/port.h @@ -155,10 +155,11 @@ struct mlx5_port_eth_proto { int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num, u32 caps); int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys, - int ptys_size, int proto_mask, u8 local_port); + int ptys_size, int proto_mask, + u8 local_port, u8 plane_index); int mlx5_query_ib_port_oper(struct mlx5_core_dev *dev, u16 *link_width_oper, - u16 *proto_oper, u8 local_port); + u16 *proto_oper, u8 local_port, u8 plane_index); void mlx5_toggle_port_link(struct mlx5_core_dev *dev); int mlx5_set_port_admin_status(struct mlx5_core_dev *dev, enum mlx5_port_status status); From 00c814eedeb0fdc24a7c769dc381208bc7a8a159 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:43 +0300 Subject: [PATCH 046/548] net/mlx5: mlx5_ifc update for accessing ppcnt register of plane ports This patch adds new fields to support multi-plane and the extend port counters group. Actual support will be added in the next patch. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/70221cdd79aad0e21cbf385d9567e3ebffbc5137.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit c6b6677d85d47f5562020aa082d9b5f90738fdcd) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 47 +++++++++++++++++++++++++++++++++-- 1 file changed, 45 insertions(+), 2 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index cb97f403aed31..a23c306e36e13 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -2589,6 +2589,46 @@ struct mlx5_ifc_ib_port_cntrs_grp_data_layout_bits { u8 port_xmit_wait[0x20]; }; +struct mlx5_ifc_ib_ext_port_cntrs_grp_data_layout_bits { + u8 reserved_at_0[0x300]; + + u8 port_xmit_data_high[0x20]; + + u8 port_xmit_data_low[0x20]; + + u8 port_rcv_data_high[0x20]; + + u8 port_rcv_data_low[0x20]; + + u8 port_xmit_pkts_high[0x20]; + + u8 port_xmit_pkts_low[0x20]; + + u8 port_rcv_pkts_high[0x20]; + + u8 port_rcv_pkts_low[0x20]; + + u8 reserved_at_400[0x80]; + + u8 port_unicast_xmit_pkts_high[0x20]; + + u8 port_unicast_xmit_pkts_low[0x20]; + + u8 port_multicast_xmit_pkts_high[0x20]; + + u8 port_multicast_xmit_pkts_low[0x20]; + + u8 port_unicast_rcv_pkts_high[0x20]; + + u8 port_unicast_rcv_pkts_low[0x20]; + + u8 port_multicast_rcv_pkts_high[0x20]; + + u8 port_multicast_rcv_pkts_low[0x20]; + + u8 reserved_at_580[0x240]; +}; + struct mlx5_ifc_eth_per_tc_prio_grp_data_layout_bits { u8 transmit_queue_high[0x20]; @@ -4474,6 +4514,7 @@ union mlx5_ifc_eth_cntrs_grp_data_layout_auto_bits { struct mlx5_ifc_eth_per_tc_prio_grp_data_layout_bits eth_per_tc_prio_grp_data_layout; struct mlx5_ifc_eth_per_tc_congest_prio_grp_data_layout_bits eth_per_tc_congest_prio_grp_data_layout; struct mlx5_ifc_ib_port_cntrs_grp_data_layout_bits ib_port_cntrs_grp_data_layout; + struct mlx5_ifc_ib_ext_port_cntrs_grp_data_layout_bits ib_ext_port_cntrs_grp_data_layout; struct mlx5_ifc_phys_layer_cntrs_bits phys_layer_cntrs; struct mlx5_ifc_phys_layer_statistical_cntrs_bits phys_layer_statistical_cntrs; u8 reserved_at_0[0x7c0]; @@ -9727,8 +9768,10 @@ struct mlx5_ifc_ppcnt_reg_bits { u8 grp[0x6]; u8 clr[0x1]; - u8 reserved_at_21[0x1c]; - u8 prio_tc[0x3]; + u8 reserved_at_21[0x13]; + u8 plane_ind[0x4]; + u8 reserved_at_38[0x3]; + u8 prio_tc[0x5]; union mlx5_ifc_eth_cntrs_grp_data_layout_auto_bits counter_set; }; From 01e5eda7a89ffd293494131fd851a541d38e668e Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Sun, 16 Jun 2024 19:08:44 +0300 Subject: [PATCH 047/548] RDMA/mlx5: Support per-plane port IB counters by querying PPCNT register Supports per-plane port counters by querying PPCNT register with the "extended port counters" group, as the query_vport_counter command doesn't support plane ports. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/06ffb582d67159b7def4654c8272d3d6e8bd2f2f.1718553901.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 7a2210a57d423cbdb9c5f2e8b12c65d7472ff147) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/mad.c | 69 +++++++++++++++++++++++++++----- include/linux/mlx5/device.h | 1 + 2 files changed, 59 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c index ead836d159d36..1b6c5e37d1697 100644 --- a/drivers/infiniband/hw/mlx5/mad.c +++ b/drivers/infiniband/hw/mlx5/mad.c @@ -147,8 +147,39 @@ static void pma_cnt_assign(struct ib_pma_portcounters *pma_cnt, vl_15_dropped); } -static int query_ib_ppcnt(struct mlx5_core_dev *dev, u8 port_num, void *out, - size_t sz) +static void pma_cnt_ext_assign_ppcnt(struct ib_pma_portcounters_ext *cnt_ext, + void *out) +{ + void *out_pma = MLX5_ADDR_OF(ppcnt_reg, out, + counter_set); + +#define MLX5_GET_EXT_CNTR(counter_name) \ + MLX5_GET64(ib_ext_port_cntrs_grp_data_layout, \ + out_pma, counter_name##_high) + + cnt_ext->port_xmit_data = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_xmit_data) >> 2); + cnt_ext->port_rcv_data = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_rcv_data) >> 2); + + cnt_ext->port_xmit_packets = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_xmit_pkts)); + cnt_ext->port_rcv_packets = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_rcv_pkts)); + + cnt_ext->port_unicast_xmit_packets = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_unicast_xmit_pkts)); + cnt_ext->port_unicast_rcv_packets = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_unicast_rcv_pkts)); + + cnt_ext->port_multicast_xmit_packets = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_multicast_xmit_pkts)); + cnt_ext->port_multicast_rcv_packets = + cpu_to_be64(MLX5_GET_EXT_CNTR(port_multicast_rcv_pkts)); +} + +static int query_ib_ppcnt(struct mlx5_core_dev *dev, u8 port_num, u8 plane_num, + void *out, size_t sz, bool ext) { u32 *in; int err; @@ -160,8 +191,14 @@ static int query_ib_ppcnt(struct mlx5_core_dev *dev, u8 port_num, void *out, } MLX5_SET(ppcnt_reg, in, local_port, port_num); - - MLX5_SET(ppcnt_reg, in, grp, MLX5_INFINIBAND_PORT_COUNTERS_GROUP); + MLX5_SET(ppcnt_reg, in, plane_ind, plane_num); + + if (ext) + MLX5_SET(ppcnt_reg, in, grp, + MLX5_INFINIBAND_EXTENDED_PORT_COUNTERS_GROUP); + else + MLX5_SET(ppcnt_reg, in, grp, + MLX5_INFINIBAND_PORT_COUNTERS_GROUP); err = mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PPCNT, 0, 0); @@ -189,7 +226,8 @@ static int process_pma_cmd(struct mlx5_ib_dev *dev, u32 port_num, mdev_port_num = 1; } if (MLX5_CAP_GEN(dev->mdev, num_ports) == 1 && - !mlx5_core_mp_enabled(mdev)) { + !mlx5_core_mp_enabled(mdev) && + dev->ib_dev.type != RDMA_DEVICE_TYPE_SMI) { /* set local port to one for Function-Per-Port HCA. */ mdev = dev->mdev; mdev_port_num = 1; @@ -208,7 +246,8 @@ static int process_pma_cmd(struct mlx5_ib_dev *dev, u32 port_num, if (in_mad->mad_hdr.attr_id == IB_PMA_PORT_COUNTERS_EXT) { struct ib_pma_portcounters_ext *pma_cnt_ext = (struct ib_pma_portcounters_ext *)(out_mad->data + 40); - int sz = MLX5_ST_SZ_BYTES(query_vport_counter_out); + int sz = max(MLX5_ST_SZ_BYTES(query_vport_counter_out), + MLX5_ST_SZ_BYTES(ppcnt_reg)); out_cnt = kvzalloc(sz, GFP_KERNEL); if (!out_cnt) { @@ -216,10 +255,18 @@ static int process_pma_cmd(struct mlx5_ib_dev *dev, u32 port_num, goto done; } - err = mlx5_core_query_vport_counter(mdev, 0, 0, mdev_port_num, - out_cnt); - if (!err) - pma_cnt_ext_assign(pma_cnt_ext, out_cnt); + if (dev->ib_dev.type == RDMA_DEVICE_TYPE_SMI) { + err = query_ib_ppcnt(mdev, mdev_port_num, + port_num, out_cnt, sz, 1); + if (!err) + pma_cnt_ext_assign_ppcnt(pma_cnt_ext, out_cnt); + } else { + err = mlx5_core_query_vport_counter(mdev, 0, 0, + mdev_port_num, + out_cnt); + if (!err) + pma_cnt_ext_assign(pma_cnt_ext, out_cnt); + } } else { struct ib_pma_portcounters *pma_cnt = (struct ib_pma_portcounters *)(out_mad->data + 40); @@ -231,7 +278,7 @@ static int process_pma_cmd(struct mlx5_ib_dev *dev, u32 port_num, goto done; } - err = query_ib_ppcnt(mdev, mdev_port_num, out_cnt, sz); + err = query_ib_ppcnt(mdev, mdev_port_num, 0, out_cnt, sz, 0); if (!err) pma_cnt_assign(pma_cnt, out_cnt); } diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 26333d602a505..4e476789244d6 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -1459,6 +1459,7 @@ enum { MLX5_PER_TRAFFIC_CLASS_CONGESTION_GROUP = 0x13, MLX5_PHYSICAL_LAYER_STATISTICAL_GROUP = 0x16, MLX5_INFINIBAND_PORT_COUNTERS_GROUP = 0x20, + MLX5_INFINIBAND_EXTENDED_PORT_COUNTERS_GROUP = 0x21, }; enum { From 26ed4417f3699b2aecbe3ade8e6ff00c3a28a23b Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Mon, 1 Jul 2024 15:40:48 +0300 Subject: [PATCH 048/548] RDMA/core: Introduce "name_assign_type" for an IB device The name_assign_type indicates how the name is provided. Currently these types are supported: - RDMA_NAME_ASSIGN_TYPE_UNKNOWN: Unknown or not set; - RDMA_NAME_ASSIGN_TYPE_USER: Name is provided by the user; The user-created sub device, rxe and siw device has this type. When filling nl device info, it is set in the new attribute RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE. User-space tools like udev "rdma_rename" could check this attribute to determine if this device needs to be renamed or not. Signed-off-by: Mark Zhang Link: https://lore.kernel.org/r/522591bef9a369cc8e5dcb77787e017bffee37fe.1719837610.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit af48f95492dc1af36d9636a750ec492035c0ed7d) Signed-off-by: Koba Ko --- drivers/infiniband/core/nldev.c | 5 +++++ drivers/infiniband/hw/mlx5/main.c | 5 +++-- drivers/infiniband/sw/rxe/rxe_net.c | 1 + drivers/infiniband/sw/siw/siw_main.c | 1 + include/rdma/ib_verbs.h | 8 ++++++++ include/uapi/rdma/rdma_netlink.h | 9 +++++++++ 6 files changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index d8985dd726da7..aba59dbecc5e7 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -158,6 +158,7 @@ static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = { [RDMA_NLDEV_ATTR_STAT_HWCOUNTER_DYNAMIC] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_DEV_TYPE] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_PARENT_NAME] = { .type = NLA_NUL_STRING }, + [RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE] = { .type = NLA_U8 }, }; static int put_driver_name_print_type(struct sk_buff *msg, const char *name, @@ -295,6 +296,10 @@ static int fill_dev_info(struct sk_buff *msg, struct ib_device *device) dev_name(&device->parent->dev))) return -EMSGSIZE; + if (nla_put_u8(msg, RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE, + device->name_assign_type)) + return -EMSGSIZE; + /* * Link type is determined on first port and mlx4 device * which can potentially have two different link type for the same diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 910bc8bd390dc..b5822a394c6a8 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -4206,9 +4206,10 @@ static int mlx5_ib_stage_ib_reg_init(struct mlx5_ib_dev *dev) { const char *name; - if (dev->sub_dev_name) + if (dev->sub_dev_name) { name = dev->sub_dev_name; - else if (!mlx5_lag_is_active(dev->mdev)) + ib_mark_name_assigned_by_user(&dev->ib_dev); + } else if (!mlx5_lag_is_active(dev->mdev)) name = "mlx5_%d"; else name = "mlx5_bond_%d"; diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index e5827064ab1e2..369063eec1e05 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -522,6 +522,7 @@ int rxe_net_add(const char *ibdev_name, struct net_device *ndev) return -ENOMEM; rxe->ndev = ndev; + ib_mark_name_assigned_by_user(&rxe->ib_dev); err = rxe_add(rxe, ndev->mtu, ibdev_name); if (err) { diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c index d4b6e0106851d..a37778273a81b 100644 --- a/drivers/infiniband/sw/siw/siw_main.c +++ b/drivers/infiniband/sw/siw/siw_main.c @@ -487,6 +487,7 @@ static int siw_newlink(const char *basedev_name, struct net_device *netdev) else sdev->state = IB_PORT_DOWN; + ib_mark_name_assigned_by_user(&sdev->base_dev); rv = siw_device_register(sdev, basedev_name); if (rv) ib_dealloc_device(&sdev->base_dev); diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 2fa0fb41fb098..f0e6adb9c4604 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2808,6 +2808,8 @@ struct ib_device { enum rdma_nl_dev_type type; struct ib_device *parent; struct list_head subdev_list; + + enum rdma_nl_name_assign_type name_assign_type; }; static inline void *rdma_zalloc_obj(struct ib_device *dev, size_t size, @@ -4881,4 +4883,10 @@ int ib_add_sub_device(struct ib_device *parent, * Return 0 on success, an error code otherwise */ int ib_del_sub_device_and_put(struct ib_device *sub); + +static inline void ib_mark_name_assigned_by_user(struct ib_device *ibdev) +{ + ibdev->name_assign_type = RDMA_NAME_ASSIGN_TYPE_USER; +} + #endif /* IB_VERBS_H */ diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h index f40d1152d43eb..ed628bcf0dc40 100644 --- a/include/uapi/rdma/rdma_netlink.h +++ b/include/uapi/rdma/rdma_netlink.h @@ -562,6 +562,8 @@ enum rdma_nldev_attr { RDMA_NLDEV_ATTR_PARENT_NAME, /* string */ + RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE, /* u8 */ + /* * Always the end */ @@ -605,4 +607,11 @@ enum rdma_nl_counter_mask { enum rdma_nl_dev_type { RDMA_DEVICE_TYPE_SMI = 1, }; + +/* RDMA device name assignment types */ +enum rdma_nl_name_assign_type { + RDMA_NAME_ASSIGN_TYPE_UNKNOWN = 0, + RDMA_NAME_ASSIGN_TYPE_USER = 1, /* Provided by user-space */ +}; + #endif /* _UAPI_RDMA_NETLINK_H */ From e7f03a282e6fe35a3517d51aca7ae7e2759891d9 Mon Sep 17 00:00:00 2001 From: Akiva Goldberger Date: Thu, 27 Jun 2024 21:23:49 +0300 Subject: [PATCH 049/548] RDMA: Pass entire uverbs attr bundle to create cq function Changes the create_cq verb signature by sending the entire uverbs attr bundle as a parameter. This allows drivers to send driver specific attrs through ioctl for the create_cq verb and access them in their driver specific code. Also adds a new enum value for driver specific ioctl attributes for methods already supporting UHW. Link: https://lore.kernel.org/r/ed147343987c0d43fd391c1b2f85e2f425747387.1719512393.git.leon@kernel.org Signed-off-by: Akiva Goldberger Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe (cherry picked from commit dd6d7f8574d7f8b6a0bf1aeef0b285d2706b8c2a) Signed-off-by: Koba Ko --- drivers/infiniband/core/uverbs_cmd.c | 2 +- drivers/infiniband/core/uverbs_std_types_cq.c | 2 +- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 3 ++- drivers/infiniband/hw/bnxt_re/ib_verbs.h | 2 +- drivers/infiniband/hw/cxgb4/cq.c | 3 ++- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 2 +- drivers/infiniband/hw/efa/efa.h | 2 +- drivers/infiniband/hw/efa/efa_verbs.c | 3 ++- drivers/infiniband/hw/erdma/erdma_verbs.c | 3 ++- drivers/infiniband/hw/erdma/erdma_verbs.h | 2 +- drivers/infiniband/hw/hns/hns_roce_cq.c | 3 ++- drivers/infiniband/hw/hns/hns_roce_device.h | 2 +- drivers/infiniband/hw/irdma/verbs.c | 5 +++-- drivers/infiniband/hw/mana/cq.c | 3 ++- drivers/infiniband/hw/mana/mana_ib.h | 2 +- drivers/infiniband/hw/mlx4/cq.c | 3 ++- drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 +- drivers/infiniband/hw/mlx5/cq.c | 3 ++- drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +- drivers/infiniband/hw/mthca/mthca_provider.c | 3 ++- drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 3 ++- drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 2 +- drivers/infiniband/hw/qedr/verbs.c | 3 ++- drivers/infiniband/hw/qedr/verbs.h | 2 +- drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 +- drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 +- drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 5 +++-- drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h | 2 +- drivers/infiniband/sw/rdmavt/cq.c | 6 ++++-- drivers/infiniband/sw/rdmavt/cq.h | 2 +- drivers/infiniband/sw/rxe/rxe_verbs.c | 3 ++- drivers/infiniband/sw/siw/siw_verbs.c | 5 +++-- drivers/infiniband/sw/siw/siw_verbs.h | 2 +- include/rdma/ib_verbs.h | 2 +- include/uapi/rdma/ib_user_ioctl_cmds.h | 7 +++---- 35 files changed, 58 insertions(+), 42 deletions(-) diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 33e2fe0facd52..d9449d49d338e 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -1051,7 +1051,7 @@ static int create_cq(struct uverbs_attr_bundle *attrs, rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ); rdma_restrack_set_name(&cq->res, NULL); - ret = ib_dev->ops.create_cq(cq, &attr, &attrs->driver_udata); + ret = ib_dev->ops.create_cq(cq, &attr, attrs); if (ret) goto err_free; rdma_restrack_add(&cq->res); diff --git a/drivers/infiniband/core/uverbs_std_types_cq.c b/drivers/infiniband/core/uverbs_std_types_cq.c index 370ad7c83f880..432054f0a8a49 100644 --- a/drivers/infiniband/core/uverbs_std_types_cq.c +++ b/drivers/infiniband/core/uverbs_std_types_cq.c @@ -128,7 +128,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)( rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ); rdma_restrack_set_name(&cq->res, NULL); - ret = ib_dev->ops.create_cq(cq, &attr, &attrs->driver_udata); + ret = ib_dev->ops.create_cq(cq, &attr, attrs); if (ret) goto err_free; diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index f7345e4890a14..2dbd79767341c 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -2949,10 +2949,11 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata) } int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { struct bnxt_re_cq *cq = container_of(ibcq, struct bnxt_re_cq, ib_cq); struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibcq->device, ibdev); + struct ib_udata *udata = &attrs->driver_udata; struct bnxt_re_ucontext *uctx = rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx); struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr; diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h index ef910e6e2ccb7..8efd2fc3827e8 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h @@ -218,7 +218,7 @@ int bnxt_re_post_send(struct ib_qp *qp, const struct ib_send_wr *send_wr, int bnxt_re_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr, const struct ib_recv_wr **bad_recv_wr); int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int bnxt_re_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata); int bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc); diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c index 7e2835dcbc1c6..5111421f94732 100644 --- a/drivers/infiniband/hw/cxgb4/cq.c +++ b/drivers/infiniband/hw/cxgb4/cq.c @@ -995,8 +995,9 @@ int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata) } int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; int entries = attr->cqe; int vector = attr->comp_vector; diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h index 50cb2259bf874..ab82415691d41 100644 --- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h +++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h @@ -980,7 +980,7 @@ int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata); int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata); void c4iw_cq_rem_ref(struct c4iw_cq *chp); int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); int c4iw_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *attr, enum ib_srq_attr_mask srq_attr_mask, diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h index 7352a1f5d8111..1ffbdd20c4de0 100644 --- a/drivers/infiniband/hw/efa/efa.h +++ b/drivers/infiniband/hw/efa/efa.h @@ -150,7 +150,7 @@ int efa_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr, struct ib_udata *udata); int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length, u64 virt_addr, int access_flags, struct ib_udata *udata); diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c index 0f8ca99d08276..e23ee3b0c3644 100644 --- a/drivers/infiniband/hw/efa/efa_verbs.c +++ b/drivers/infiniband/hw/efa/efa_verbs.c @@ -1064,8 +1064,9 @@ static int cq_mmap_entries_setup(struct efa_dev *dev, struct efa_cq *cq, } int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct efa_ucontext *ucontext = rdma_udata_to_drv_context( udata, struct efa_ucontext, ibucontext); struct efa_com_create_cq_params params = {}; diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c index 29ad2f5ffabe2..ef3347a9e5b0b 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.c +++ b/drivers/infiniband/hw/erdma/erdma_verbs.c @@ -1640,8 +1640,9 @@ static int erdma_init_kernel_cq(struct erdma_cq *cq) } int erdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct erdma_cq *cq = to_ecq(ibcq); struct erdma_dev *dev = to_edev(ibcq->device); unsigned int depth = attr->cqe; diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.h b/drivers/infiniband/hw/erdma/erdma_verbs.h index eb9c0f92fb6f0..43e651d9e926c 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.h +++ b/drivers/infiniband/hw/erdma/erdma_verbs.h @@ -325,7 +325,7 @@ int erdma_query_device(struct ib_device *dev, struct ib_device_attr *attr, int erdma_get_port_immutable(struct ib_device *dev, u32 port, struct ib_port_immutable *ib_port_immutable); int erdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *data); + struct uverbs_attr_bundle *attrs); int erdma_query_port(struct ib_device *dev, u32 port, struct ib_port_attr *attr); int erdma_query_gid(struct ib_device *dev, u32 port, int idx, diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c index 5e0d78f4e5454..999bab1794cb7 100644 --- a/drivers/infiniband/hw/hns/hns_roce_cq.c +++ b/drivers/infiniband/hw/hns/hns_roce_cq.c @@ -353,9 +353,10 @@ static int set_cqe_size(struct hns_roce_cq *hr_cq, struct ib_udata *udata, } int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { struct hns_roce_dev *hr_dev = to_hr_dev(ib_cq->device); + struct ib_udata *udata = &attrs->driver_udata; struct hns_roce_ib_create_cq_resp resp = {}; struct hns_roce_cq *hr_cq = to_hr_cq(ib_cq); struct ib_device *ibdev = &hr_dev->ib_dev; diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 03b6546f63cdc..876b0ac6acfa5 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -1225,7 +1225,7 @@ __be32 send_ieth(const struct ib_send_wr *wr); int to_hr_qp_type(int qp_type); int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata); int hns_roce_db_map_user(struct hns_roce_ucontext *context, unsigned long virt, diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 38cecb28d322e..9f3ff5955b12a 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -2035,14 +2035,15 @@ static inline int cq_validate_flags(u32 flags, u8 hw_rev) * irdma_create_cq - create cq * @ibcq: CQ allocated * @attr: attributes for cq - * @udata: user data + * @attrs: uverbs attribute bundle */ static int irdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { #define IRDMA_CREATE_CQ_MIN_REQ_LEN offsetofend(struct irdma_create_cq_req, user_cq_buf) #define IRDMA_CREATE_CQ_MIN_RESP_LEN offsetofend(struct irdma_create_cq_resp, cq_size) + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; struct irdma_device *iwdev = to_iwdev(ibdev); struct irdma_pci_f *rf = iwdev->rf; diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c index d141cab8a1e69..e6412f1f8a543 100644 --- a/drivers/infiniband/hw/mana/cq.c +++ b/drivers/infiniband/hw/mana/cq.c @@ -6,8 +6,9 @@ #include "mana_ib.h" int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq); struct ib_device *ibdev = ibcq->device; struct mana_ib_create_cq ucmd = {}; diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h index 502cc8672eefa..d04aa0dce6060 100644 --- a/drivers/infiniband/hw/mana/mana_ib.h +++ b/drivers/infiniband/hw/mana/mana_ib.h @@ -135,7 +135,7 @@ void mana_ib_uncfg_vport(struct mana_ib_dev *dev, struct mana_ib_pd *pd, u32 port); int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int mana_ib_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c index 4cd738aae53c7..aa9ea6ba26e53 100644 --- a/drivers/infiniband/hw/mlx4/cq.c +++ b/drivers/infiniband/hw/mlx4/cq.c @@ -172,8 +172,9 @@ static int mlx4_ib_get_cq_umem(struct mlx4_ib_dev *dev, #define CQ_CREATE_FLAGS_SUPPORTED IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; int entries = attr->cqe; int vector = attr->comp_vector; diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 41ca1114a9958..b52bceff7d970 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -767,7 +767,7 @@ int mlx4_ib_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents, int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata); int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); int mlx4_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); int mlx4_ib_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags); diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index ee9acd58c5121..37805ca1f33c3 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -942,8 +942,9 @@ static void notify_soft_wc_handler(struct work_struct *work) } int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; int entries = attr->cqe; int vector = attr->comp_vector; diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 0f1266502c085..6c599500a5831 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1317,7 +1317,7 @@ int mlx5_ib_read_wqe_rq(struct mlx5_ib_qp *qp, int wqe_index, void *buffer, int mlx5_ib_read_wqe_srq(struct mlx5_ib_srq *srq, int wqe_index, void *buffer, size_t buflen, size_t *bc); int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); int mlx5_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); int mlx5_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c index e1325f2927d6e..6a1e2e79ddc31 100644 --- a/drivers/infiniband/hw/mthca/mthca_provider.c +++ b/drivers/infiniband/hw/mthca/mthca_provider.c @@ -574,8 +574,9 @@ static int mthca_destroy_qp(struct ib_qp *qp, struct ib_udata *udata) static int mthca_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; int entries = attr->cqe; struct mthca_create_cq ucmd; diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c index c849fdbd4c994..979de8f8df148 100644 --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c @@ -963,8 +963,9 @@ static int ocrdma_copy_cq_uresp(struct ocrdma_dev *dev, struct ocrdma_cq *cq, } int ocrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; int entries = attr->cqe; struct ocrdma_cq *cq = get_ocrdma_cq(ibcq); diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h index f860b7fcef338..0644346d8d988 100644 --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h @@ -70,7 +70,7 @@ int ocrdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata); int ocrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); int ocrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int ocrdma_resize_cq(struct ib_cq *, int cqe, struct ib_udata *); int ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c index f118ce0a9a617..568a5b18803fc 100644 --- a/drivers/infiniband/hw/qedr/verbs.c +++ b/drivers/infiniband/hw/qedr/verbs.c @@ -900,8 +900,9 @@ int qedr_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) } int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; struct qedr_ucontext *ctx = rdma_udata_to_drv_context( udata, struct qedr_ucontext, ibucontext); diff --git a/drivers/infiniband/hw/qedr/verbs.h b/drivers/infiniband/hw/qedr/verbs.h index 081753df79ef0..5731458abb068 100644 --- a/drivers/infiniband/hw/qedr/verbs.h +++ b/drivers/infiniband/hw/qedr/verbs.h @@ -52,7 +52,7 @@ int qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); int qedr_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata); int qedr_dealloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata); int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); int qedr_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); int qedr_create_qp(struct ib_qp *qp, struct ib_qp_init_attr *attrs, diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c index 6289238cc5af8..217af34e82b3c 100644 --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c @@ -577,7 +577,7 @@ int usnic_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, } int usnic_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { if (attr->flags) return -EOPNOTSUPP; diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h index 6ca9ee0dddbe1..53f53f2d53be0 100644 --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h @@ -56,7 +56,7 @@ int usnic_ib_destroy_qp(struct ib_qp *qp, struct ib_udata *udata); int usnic_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, struct ib_udata *udata); int usnic_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); struct ib_mr *usnic_ib_reg_mr(struct ib_pd *pd, u64 start, u64 length, u64 virt_addr, int access_flags, diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c index 6aa40bd2fd52d..b3df6eb9b8eff 100644 --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c @@ -94,13 +94,14 @@ int pvrdma_req_notify_cq(struct ib_cq *ibcq, * pvrdma_create_cq - create completion queue * @ibcq: Allocated CQ * @attr: completion queue attributes - * @udata: user data + * @attrs: bundle * * @return: 0 on success */ int pvrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; int entries = attr->cqe; struct pvrdma_dev *dev = to_vdev(ibdev); diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h index 78807b23d831c..4b9edc03d73df 100644 --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h @@ -375,7 +375,7 @@ struct ib_mr *pvrdma_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type, int pvrdma_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents, unsigned int *sg_offset); int pvrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); int pvrdma_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); int pvrdma_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags); diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c index 9fe4dcaa049aa..7b6cb36a993ad 100644 --- a/drivers/infiniband/sw/rdmavt/cq.c +++ b/drivers/infiniband/sw/rdmavt/cq.c @@ -5,6 +5,7 @@ #include #include +#include #include "cq.h" #include "vt.h" #include "trace.h" @@ -149,15 +150,16 @@ static void send_complete(struct work_struct *work) * rvt_create_cq - create a completion queue * @ibcq: Allocated CQ * @attr: creation attributes - * @udata: user data for libibverbs.so + * @attrs: uverbs bundle * * Called by ib_create_cq() in the generic verbs code. * * Return: 0 on success */ int rvt_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *ibdev = ibcq->device; struct rvt_dev_info *rdi = ib_to_rvt(ibdev); struct rvt_cq *cq = ibcq_to_rvtcq(ibcq); diff --git a/drivers/infiniband/sw/rdmavt/cq.h b/drivers/infiniband/sw/rdmavt/cq.h index b0a948ec760bc..f89c45775821b 100644 --- a/drivers/infiniband/sw/rdmavt/cq.h +++ b/drivers/infiniband/sw/rdmavt/cq.h @@ -10,7 +10,7 @@ #include int rvt_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); int rvt_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags); int rvt_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata); diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index dbb9baa4ffd00..b24b6dfa130ea 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -1057,8 +1057,9 @@ static int rxe_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr, /* cq */ static int rxe_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct ib_device *dev = ibcq->device; struct rxe_dev *rxe = to_rdev(dev); struct rxe_cq *cq = to_rcq(ibcq); diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c index fdbef3254e308..dc7111fdca62b 100644 --- a/drivers/infiniband/sw/siw/siw_verbs.c +++ b/drivers/infiniband/sw/siw/siw_verbs.c @@ -1125,12 +1125,13 @@ int siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata) * * @base_cq: CQ as allocated by RDMA midlayer * @attr: Initial CQ attributes - * @udata: relates to user context + * @attrs: uverbs bundle */ int siw_create_cq(struct ib_cq *base_cq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { + struct ib_udata *udata = &attrs->driver_udata; struct siw_device *sdev = to_siw_dev(base_cq->device); struct siw_cq *cq = to_siw_cq(base_cq); int rv, size = attr->cqe; diff --git a/drivers/infiniband/sw/siw/siw_verbs.h b/drivers/infiniband/sw/siw/siw_verbs.h index 09964234f8d39..8b835e5208de5 100644 --- a/drivers/infiniband/sw/siw/siw_verbs.h +++ b/drivers/infiniband/sw/siw/siw_verbs.h @@ -43,7 +43,7 @@ int siw_get_port_immutable(struct ib_device *base_dev, u32 port, int siw_query_device(struct ib_device *base_dev, struct ib_device_attr *attr, struct ib_udata *udata); int siw_create_cq(struct ib_cq *base_cq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int siw_query_port(struct ib_device *base_dev, u32 port, struct ib_port_attr *attr); int siw_query_gid(struct ib_device *base_dev, u32 port, int idx, diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index f0e6adb9c4604..013aa281d2069 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2480,7 +2480,7 @@ struct ib_device_ops { int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr); int (*destroy_qp)(struct ib_qp *qp, struct ib_udata *udata); int (*create_cq)(struct ib_cq *cq, const struct ib_cq_init_attr *attr, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period); int (*destroy_cq)(struct ib_cq *cq, struct ib_udata *udata); int (*resize_cq)(struct ib_cq *cq, int cqe, struct ib_udata *udata); diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h index dafc7ebe545b8..ec719053aab94 100644 --- a/include/uapi/rdma/ib_user_ioctl_cmds.h +++ b/include/uapi/rdma/ib_user_ioctl_cmds.h @@ -37,9 +37,6 @@ #define UVERBS_ID_NS_MASK 0xF000 #define UVERBS_ID_NS_SHIFT 12 -#define UVERBS_UDATA_DRIVER_DATA_NS 1 -#define UVERBS_UDATA_DRIVER_DATA_FLAG (1UL << UVERBS_ID_NS_SHIFT) - enum uverbs_default_objects { UVERBS_OBJECT_DEVICE, /* No instances of DEVICE are allowed */ UVERBS_OBJECT_PD, @@ -61,8 +58,10 @@ enum uverbs_default_objects { }; enum { - UVERBS_ATTR_UHW_IN = UVERBS_UDATA_DRIVER_DATA_FLAG, + UVERBS_ID_DRIVER_NS = 1UL << UVERBS_ID_NS_SHIFT, + UVERBS_ATTR_UHW_IN = UVERBS_ID_DRIVER_NS, UVERBS_ATTR_UHW_OUT, + UVERBS_ID_DRIVER_NS_WITH_UHW, }; enum uverbs_methods_device { From b049ad87e910e065937574794285c879ee250ac1 Mon Sep 17 00:00:00 2001 From: Akiva Goldberger Date: Thu, 27 Jun 2024 21:23:50 +0300 Subject: [PATCH 050/548] RDMA/mlx5: Send UAR page index as ioctl attribute Add UAR page index as a driver ioctl attribute to increase the number of supported indices, previously limited to 16 bits by mlx5_ib_create_cq struct. Link: https://lore.kernel.org/r/0e18b34d7ec3b1ae02d694b0d545aed7413c0ef7.1719512393.git.leon@kernel.org Signed-off-by: Akiva Goldberger Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe (cherry picked from commit 589b844f1bf04850d9fabcaa2e943325dc6768b4) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/cq.c | 28 +++++++++++++++++++++--- drivers/infiniband/hw/mlx5/main.c | 1 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + include/uapi/rdma/mlx5_user_ioctl_cmds.h | 4 ++++ 4 files changed, 31 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index 37805ca1f33c3..1aa5311b03e9f 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -38,6 +38,9 @@ #include "srq.h" #include "qp.h" +#define UVERBS_MODULE_NAME mlx5_ib +#include + static void mlx5_ib_cq_comp(struct mlx5_core_cq *cq, struct mlx5_eqe *eqe) { struct ib_cq *ibcq = &to_mibcq(cq)->ibcq; @@ -714,7 +717,8 @@ static int mini_cqe_res_format_to_hw(struct mlx5_ib_dev *dev, u8 format) static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata, struct mlx5_ib_cq *cq, int entries, u32 **cqb, - int *cqe_size, int *index, int *inlen) + int *cqe_size, int *index, int *inlen, + struct uverbs_attr_bundle *attrs) { struct mlx5_ib_create_cq ucmd = {}; unsigned long page_size; @@ -788,7 +792,11 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata, order_base_2(page_size) - MLX5_ADAPTER_PAGE_SHIFT); MLX5_SET(cqc, cqc, page_offset, page_offset_quantized); - if (ucmd.flags & MLX5_IB_CREATE_CQ_FLAGS_UAR_PAGE_INDEX) { + if (uverbs_attr_is_valid(attrs, MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX)) { + err = uverbs_copy_from(index, attrs, MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX); + if (err) + goto err_cqb; + } else if (ucmd.flags & MLX5_IB_CREATE_CQ_FLAGS_UAR_PAGE_INDEX) { *index = ucmd.uar_page_index; } else if (context->bfregi.lib_uar_dyn) { err = -EINVAL; @@ -981,7 +989,7 @@ int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, if (udata) { err = create_cq_user(dev, udata, cq, entries, &cqb, &cqe_size, - &index, &inlen); + &index, &inlen, attrs); if (err) return err; } else { @@ -1443,3 +1451,17 @@ int mlx5_ib_generate_wc(struct ib_cq *ibcq, struct ib_wc *wc) return 0; } + +ADD_UVERBS_ATTRIBUTES_SIMPLE( + mlx5_ib_cq_create, + UVERBS_OBJECT_CQ, + UVERBS_METHOD_CQ_CREATE, + UVERBS_ATTR_PTR_IN( + MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX, + UVERBS_ATTR_TYPE(u32), + UA_OPTIONAL)); + +const struct uapi_definition mlx5_ib_create_cq_defs[] = { + UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_CQ, &mlx5_ib_cq_create), + {}, +}; diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index b5822a394c6a8..05b8b96c77ac1 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3798,6 +3798,7 @@ static const struct uapi_definition mlx5_ib_defs[] = { UAPI_DEF_CHAIN(mlx5_ib_qos_defs), UAPI_DEF_CHAIN(mlx5_ib_std_types_defs), UAPI_DEF_CHAIN(mlx5_ib_dm_defs), + UAPI_DEF_CHAIN(mlx5_ib_create_cq_defs), UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DEVICE, &mlx5_ib_query_context), UAPI_DEF_CHAIN_OBJ_TREE_NAMED(MLX5_IB_OBJECT_VAR, diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 6c599500a5831..cd45286a96eac 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1518,6 +1518,7 @@ extern const struct uapi_definition mlx5_ib_devx_defs[]; extern const struct uapi_definition mlx5_ib_flow_defs[]; extern const struct uapi_definition mlx5_ib_qos_defs[]; extern const struct uapi_definition mlx5_ib_std_types_defs[]; +extern const struct uapi_definition mlx5_ib_create_cq_defs[]; static inline int is_qp1(enum ib_qp_type qp_type) { diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h index 595edad03dfe5..5b74d65348995 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h +++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h @@ -270,6 +270,10 @@ enum mlx5_ib_device_query_context_attrs { MLX5_IB_ATTR_QUERY_CONTEXT_RESP_UCTX = (1U << UVERBS_ID_NS_SHIFT), }; +enum mlx5_ib_create_cq_attrs { + MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX = UVERBS_ID_DRIVER_NS_WITH_UHW, +}; + #define MLX5_IB_DW_MATCH_PARAM 0xA0 struct mlx5_ib_match_params { From 8e08591bf2557304501cef22b71c16bbb1226229 Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:10 +0300 Subject: [PATCH 051/548] net/mlx5: Add IFC related stuff for data direct Add IFC related stuff for data direct. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/82da7f578a567909bb5858a64ba844fe4cc298fa.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit c772a2c690182410642ead740f7a84b3a7544b2b) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 51 +++++++++++++++++++++++++++++++---- 1 file changed, 46 insertions(+), 5 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index a23c306e36e13..9f4cc06eb670e 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -312,6 +312,7 @@ enum { MLX5_CMD_OP_QUERY_VHCA_STATE = 0xb0d, MLX5_CMD_OP_MODIFY_VHCA_STATE = 0xb0e, MLX5_CMD_OP_SYNC_CRYPTO = 0xb12, + MLX5_CMD_OPCODE_QUERY_VUID = 0xb22, MLX5_CMD_OP_MAX }; @@ -1851,7 +1852,8 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_5a0[0x10]; u8 enhanced_cqe_compression[0x1]; - u8 reserved_at_5b1[0x2]; + u8 reserved_at_5b1[0x1]; + u8 crossing_vhca_mkey[0x1]; u8 log_max_dek[0x5]; u8 reserved_at_5b8[0x4]; u8 mini_cqe_resp_stride_index[0x1]; @@ -1920,7 +1922,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 dynamic_msix_table_size[0xc]; u8 reserved_at_740[0xc]; u8 min_dynamic_vf_msix_table_size[0x4]; - u8 reserved_at_750[0x4]; + u8 reserved_at_750[0x2]; + u8 data_direct[0x1]; + u8 reserved_at_753[0x1]; u8 max_dynamic_vf_msix_table_size[0xc]; u8 reserved_at_760[0x3]; @@ -1939,7 +1943,9 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_0[0x80]; u8 migratable[0x1]; - u8 reserved_at_81[0x1f]; + u8 reserved_at_81[0x11]; + u8 query_vuid[0x1]; + u8 reserved_at_93[0xd]; u8 max_reformat_insert_size[0x8]; u8 max_reformat_insert_offset[0x8]; @@ -4072,6 +4078,7 @@ enum { MLX5_MKC_ACCESS_MODE_KSM = 0x3, MLX5_MKC_ACCESS_MODE_SW_ICM = 0x4, MLX5_MKC_ACCESS_MODE_MEMIC = 0x5, + MLX5_MKC_ACCESS_MODE_CROSSING = 0x6, }; struct mlx5_ifc_mkc_bits { @@ -4114,7 +4121,10 @@ struct mlx5_ifc_mkc_bits { u8 bsf_octword_size[0x20]; - u8 reserved_at_120[0x80]; + u8 reserved_at_120[0x60]; + + u8 crossing_target_vhca_id[0x10]; + u8 reserved_at_190[0x10]; u8 translations_octword_size[0x20]; @@ -5045,6 +5055,36 @@ struct mlx5_ifc_query_vport_state_out_bits { u8 state[0x4]; }; +struct mlx5_ifc_array1024_auto_bits { + u8 array1024_auto[32][0x20]; +}; + +struct mlx5_ifc_query_vuid_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + + u8 reserved_at_20[0x40]; + + u8 query_vfs_vuid[0x1]; + u8 data_direct[0x1]; + u8 reserved_at_62[0xe]; + u8 vhca_id[0x10]; +}; + +struct mlx5_ifc_query_vuid_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 reserved_at_40[0x1a0]; + + u8 reserved_at_1e0[0x10]; + u8 num_of_entries[0x10]; + + struct mlx5_ifc_array1024_auto_bits vuid[]; +}; + enum { MLX5_VPORT_STATE_OP_MOD_VNIC_VPORT = 0x0, MLX5_VPORT_STATE_OP_MOD_ESW_VPORT = 0x1, @@ -8866,7 +8906,8 @@ struct mlx5_ifc_create_mkey_in_bits { u8 pg_access[0x1]; u8 mkey_umem_valid[0x1]; - u8 reserved_at_62[0x1e]; + u8 data_direct[0x1]; + u8 reserved_at_63[0x1d]; struct mlx5_ifc_mkc_bits memory_key_mkey_entry; From 1ac9c2498cd85c3f0e64a78e06f4fe2c05d2cb78 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Mon, 3 Jun 2024 13:26:37 +0300 Subject: [PATCH 052/548] net/mlx5: Reimplement write combining test The test of write combining was added before in mlx5_ib driver. It opens UD QP and posts NOP WQEs, and uses BlueFlame doorbell. When BlueFlame is used, WQEs get written directly to a PCI BAR of the device (in addition to memory) so that the device handles them without having to access memory. In this test, the WQEs written in memory are different from the ones written to the BlueFlame which request CQE update. By checking the completion reports posted on CQ, we can know if BlueFlame succeeds or not. The write combining must be supported if BlueFlame succeeds as its register is written using write combining. This patch reimplements the test in the same way, but using a pair of SQ and CQ only. It is moved to mlx5_core as a general feature used by both mlx5_core and mlx5_ib. Besides, save write combine test result of the PCI function, so that its thousands of child functions such as SF can query without paying the time and resource penalty by itself. The test function is called only after failing to get the cached result. With this enhancement, all thousands of SFs of the PF attached to same driver no longer need to perform WC check explicitly, which is already done in the system. This saves several commands per SF, thereby speeds up SF creation and also saves completion EQ creation. Signed-off-by: Jianbo Liu Reviewed-by: Tariq Toukan Link: https://lore.kernel.org/r/4ff5a8cc4c5b5b0d98397baa45a5019bcdbf096e.1717409369.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit d98995b4bf981519dde4af0a081c393d62474039) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 19 +- drivers/infiniband/hw/mlx5/mem.c | 198 -------- drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 - drivers/infiniband/hw/mlx5/qp.c | 16 - .../net/ethernet/mellanox/mlx5/core/Makefile | 2 +- .../net/ethernet/mellanox/mlx5/core/main.c | 2 + drivers/net/ethernet/mellanox/mlx5/core/wc.c | 434 ++++++++++++++++++ include/linux/mlx5/driver.h | 11 + 8 files changed, 451 insertions(+), 234 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/wc.c diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 05b8b96c77ac1..e39df64312661 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1834,7 +1834,7 @@ static int set_ucontext_resp(struct ib_ucontext *uctx, } resp->qp_tab_size = 1 << MLX5_CAP_GEN(dev->mdev, log_max_qp); - if (dev->wc_support) + if (mlx5_wc_support_get(dev->mdev)) resp->bf_reg_size = 1 << MLX5_CAP_GEN(dev->mdev, log_bf_reg_size); resp->cache_line_size = cache_line_size(); @@ -2361,7 +2361,7 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vm switch (command) { case MLX5_IB_MMAP_WC_PAGE: case MLX5_IB_MMAP_ALLOC_WC: - if (!dev->wc_support) + if (!mlx5_wc_support_get(dev->mdev)) return -EPERM; fallthrough; case MLX5_IB_MMAP_NC_PAGE: @@ -3723,7 +3723,7 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_UAR_OBJ_ALLOC)( alloc_type != MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC) return -EOPNOTSUPP; - if (!to_mdev(c->ibucontext.device)->wc_support && + if (!mlx5_wc_support_get(to_mdev(c->ibucontext.device)->mdev) && alloc_type == MLX5_IB_UAPI_UAR_ALLOC_TYPE_BF) return -EOPNOTSUPP; @@ -3878,18 +3878,6 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) return err; } -static int mlx5_ib_enable_driver(struct ib_device *dev) -{ - struct mlx5_ib_dev *mdev = to_mdev(dev); - int ret; - - ret = mlx5_ib_test_wc(mdev); - mlx5_ib_dbg(mdev, "Write-Combining %s", - mdev->wc_support ? "supported" : "not supported"); - - return ret; -} - static struct ib_device *mlx5_ib_add_sub_dev(struct ib_device *parent, enum rdma_nl_dev_type type, const char *name); @@ -3927,7 +3915,6 @@ static const struct ib_device_ops mlx5_ib_dev_ops = { .drain_rq = mlx5_ib_drain_rq, .drain_sq = mlx5_ib_drain_sq, .device_group = &mlx5_attr_group, - .enable_driver = mlx5_ib_enable_driver, .get_dev_fw_str = get_dev_fw_str, .get_dma_mr = mlx5_ib_get_dma_mr, .get_link_layer = mlx5_ib_port_link_layer, diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c index 5a22be14d958f..af321f6ef7f54 100644 --- a/drivers/infiniband/hw/mlx5/mem.c +++ b/drivers/infiniband/hw/mlx5/mem.c @@ -30,10 +30,8 @@ * SOFTWARE. */ -#include #include #include "mlx5_ib.h" -#include /* * Fill in a physical address list. ib_umem_num_dma_blocks() entries will be @@ -95,199 +93,3 @@ unsigned long __mlx5_umem_find_best_quantized_pgoff( return 0; return page_size; } - -#define WR_ID_BF 0xBF -#define WR_ID_END 0xBAD -#define TEST_WC_NUM_WQES 255 -#define TEST_WC_POLLING_MAX_TIME_JIFFIES msecs_to_jiffies(100) -static int post_send_nop(struct mlx5_ib_dev *dev, struct ib_qp *ibqp, u64 wr_id, - bool signaled) -{ - struct mlx5_ib_qp *qp = to_mqp(ibqp); - struct mlx5_wqe_ctrl_seg *ctrl; - struct mlx5_bf *bf = &qp->bf; - __be32 mmio_wqe[16] = {}; - unsigned long flags; - unsigned int idx; - - if (unlikely(dev->mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)) - return -EIO; - - spin_lock_irqsave(&qp->sq.lock, flags); - - idx = qp->sq.cur_post & (qp->sq.wqe_cnt - 1); - ctrl = mlx5_frag_buf_get_wqe(&qp->sq.fbc, idx); - - memset(ctrl, 0, sizeof(struct mlx5_wqe_ctrl_seg)); - ctrl->fm_ce_se = signaled ? MLX5_WQE_CTRL_CQ_UPDATE : 0; - ctrl->opmod_idx_opcode = - cpu_to_be32(((u32)(qp->sq.cur_post) << 8) | MLX5_OPCODE_NOP); - ctrl->qpn_ds = cpu_to_be32((sizeof(struct mlx5_wqe_ctrl_seg) / 16) | - (qp->trans_qp.base.mqp.qpn << 8)); - - qp->sq.wrid[idx] = wr_id; - qp->sq.w_list[idx].opcode = MLX5_OPCODE_NOP; - qp->sq.wqe_head[idx] = qp->sq.head + 1; - qp->sq.cur_post += DIV_ROUND_UP(sizeof(struct mlx5_wqe_ctrl_seg), - MLX5_SEND_WQE_BB); - qp->sq.w_list[idx].next = qp->sq.cur_post; - qp->sq.head++; - - memcpy(mmio_wqe, ctrl, sizeof(*ctrl)); - ((struct mlx5_wqe_ctrl_seg *)&mmio_wqe)->fm_ce_se |= - MLX5_WQE_CTRL_CQ_UPDATE; - - /* Make sure that descriptors are written before - * updating doorbell record and ringing the doorbell - */ - wmb(); - - qp->db.db[MLX5_SND_DBR] = cpu_to_be32(qp->sq.cur_post); - - /* Make sure doorbell record is visible to the HCA before - * we hit doorbell - */ - wmb(); - __iowrite64_copy(bf->bfreg->map + bf->offset, mmio_wqe, - sizeof(mmio_wqe) / 8); - - bf->offset ^= bf->buf_size; - - spin_unlock_irqrestore(&qp->sq.lock, flags); - - return 0; -} - -static int test_wc_poll_cq_result(struct mlx5_ib_dev *dev, struct ib_cq *cq) -{ - int ret; - struct ib_wc wc = {}; - unsigned long end = jiffies + TEST_WC_POLLING_MAX_TIME_JIFFIES; - - do { - ret = ib_poll_cq(cq, 1, &wc); - if (ret < 0 || wc.status) - return ret < 0 ? ret : -EINVAL; - if (ret) - break; - } while (!time_after(jiffies, end)); - - if (!ret) - return -ETIMEDOUT; - - if (wc.wr_id != WR_ID_BF) - ret = 0; - - return ret; -} - -static int test_wc_do_send(struct mlx5_ib_dev *dev, struct ib_qp *qp) -{ - int err, i; - - for (i = 0; i < TEST_WC_NUM_WQES; i++) { - err = post_send_nop(dev, qp, WR_ID_BF, false); - if (err) - return err; - } - - return post_send_nop(dev, qp, WR_ID_END, true); -} - -int mlx5_ib_test_wc(struct mlx5_ib_dev *dev) -{ - struct ib_cq_init_attr cq_attr = { .cqe = TEST_WC_NUM_WQES + 1 }; - int port_type_cap = MLX5_CAP_GEN(dev->mdev, port_type); - struct ib_qp_init_attr qp_init_attr = { - .cap = { .max_send_wr = TEST_WC_NUM_WQES }, - .qp_type = IB_QPT_UD, - .sq_sig_type = IB_SIGNAL_REQ_WR, - .create_flags = MLX5_IB_QP_CREATE_WC_TEST, - }; - struct ib_qp_attr qp_attr = { .port_num = 1 }; - struct ib_device *ibdev = &dev->ib_dev; - struct ib_qp *qp; - struct ib_cq *cq; - struct ib_pd *pd; - int ret; - - if (!MLX5_CAP_GEN(dev->mdev, bf)) - return 0; - - if (!dev->mdev->roce.roce_en && - port_type_cap == MLX5_CAP_PORT_TYPE_ETH) { - if (mlx5_core_is_pf(dev->mdev)) - dev->wc_support = arch_can_pci_mmap_wc(); - return 0; - } - - ret = mlx5_alloc_bfreg(dev->mdev, &dev->wc_bfreg, true, false); - if (ret) - goto print_err; - - if (!dev->wc_bfreg.wc) - goto out1; - - pd = ib_alloc_pd(ibdev, 0); - if (IS_ERR(pd)) { - ret = PTR_ERR(pd); - goto out1; - } - - cq = ib_create_cq(ibdev, NULL, NULL, NULL, &cq_attr); - if (IS_ERR(cq)) { - ret = PTR_ERR(cq); - goto out2; - } - - qp_init_attr.recv_cq = cq; - qp_init_attr.send_cq = cq; - qp = ib_create_qp(pd, &qp_init_attr); - if (IS_ERR(qp)) { - ret = PTR_ERR(qp); - goto out3; - } - - qp_attr.qp_state = IB_QPS_INIT; - ret = ib_modify_qp(qp, &qp_attr, - IB_QP_STATE | IB_QP_PORT | IB_QP_PKEY_INDEX | - IB_QP_QKEY); - if (ret) - goto out4; - - qp_attr.qp_state = IB_QPS_RTR; - ret = ib_modify_qp(qp, &qp_attr, IB_QP_STATE); - if (ret) - goto out4; - - qp_attr.qp_state = IB_QPS_RTS; - ret = ib_modify_qp(qp, &qp_attr, IB_QP_STATE | IB_QP_SQ_PSN); - if (ret) - goto out4; - - ret = test_wc_do_send(dev, qp); - if (ret < 0) - goto out4; - - ret = test_wc_poll_cq_result(dev, cq); - if (ret > 0) { - dev->wc_support = true; - ret = 0; - } - -out4: - ib_destroy_qp(qp); -out3: - ib_destroy_cq(cq); -out2: - ib_dealloc_pd(pd); -out1: - mlx5_free_bfreg(dev->mdev, &dev->wc_bfreg); -print_err: - if (ret) - mlx5_ib_err( - dev, - "Error %d while trying to test write-combining support\n", - ret); - return ret; -} diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index cd45286a96eac..29382f9beeb20 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -354,7 +354,6 @@ struct mlx5_ib_flow_db { * rely on the range reserved for that use in the ib_qp_create_flags enum. */ #define MLX5_IB_QP_CREATE_SQPN_QP1 IB_QP_CREATE_RESERVED_START -#define MLX5_IB_QP_CREATE_WC_TEST (IB_QP_CREATE_RESERVED_START << 1) struct wr_list { u16 opcode; @@ -1122,7 +1121,6 @@ struct mlx5_ib_dev { u8 ib_active:1; u8 is_rep:1; u8 lag_active:1; - u8 wc_support:1; u8 fill_delay; struct umr_common umrc; /* sync used page count stats @@ -1148,7 +1146,6 @@ struct mlx5_ib_dev { /* Array with num_ports elements */ struct mlx5_ib_port *port; struct mlx5_sq_bfreg bfreg; - struct mlx5_sq_bfreg wc_bfreg; struct mlx5_sq_bfreg fp_bfreg; struct mlx5_ib_delay_drop delay_drop; const struct mlx5_ib_profile *profile; diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index f07d2b96d0a3d..e389769d17524 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -1107,8 +1107,6 @@ static int _create_kernel_qp(struct mlx5_ib_dev *dev, if (init_attr->qp_type == MLX5_IB_QPT_REG_UMR) qp->bf.bfreg = &dev->fp_bfreg; - else if (qp->flags & MLX5_IB_QP_CREATE_WC_TEST) - qp->bf.bfreg = &dev->wc_bfreg; else qp->bf.bfreg = &dev->bfreg; @@ -2959,14 +2957,6 @@ static void process_create_flag(struct mlx5_ib_dev *dev, int *flags, int flag, return; } - if (flag == MLX5_IB_QP_CREATE_WC_TEST) { - /* - * Special case, if condition didn't meet, it won't be error, - * just different in-kernel flow. - */ - *flags &= ~MLX5_IB_QP_CREATE_WC_TEST; - return; - } mlx5_ib_dbg(dev, "Verbs create QP flag 0x%X is not supported\n", flag); } @@ -3027,8 +3017,6 @@ static int process_create_flags(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp, IB_QP_CREATE_PCI_WRITE_END_PADDING, MLX5_CAP_GEN(mdev, end_pad), qp); - process_create_flag(dev, &create_flags, MLX5_IB_QP_CREATE_WC_TEST, - qp_type != MLX5_IB_QPT_REG_UMR, qp); process_create_flag(dev, &create_flags, MLX5_IB_QP_CREATE_SQPN_QP1, true, qp); @@ -4621,10 +4609,6 @@ static bool mlx5_ib_modify_qp_allowed(struct mlx5_ib_dev *dev, if (qp->type == IB_QPT_RAW_PACKET || qp->type == MLX5_IB_QPT_REG_UMR) return true; - /* Internal QP used for wc testing, with NOPs in wq */ - if (qp->flags & MLX5_IB_QP_CREATE_WC_TEST) - return true; - return false; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 7e94caca48886..a78731065f3b0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -17,7 +17,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \ fs_counters.o fs_ft_pool.o rl.o lag/debugfs.o lag/lag.o dev.o events.o wq.o lib/gid.o \ lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \ diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o diag/reporter_vnic.o \ - fw_reset.o qos.o lib/tout.o lib/aso.o + fw_reset.o qos.o lib/tout.o lib/aso.o wc.o # # Netdev basic diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 62a85f09b52fd..67d82644d5cd4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1806,6 +1806,7 @@ int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) mutex_init(&dev->intf_state_mutex); lockdep_set_class(&dev->intf_state_mutex, &dev->lock_key); mutex_init(&dev->mlx5e_res.uplink_netdev_lock); + mutex_init(&dev->wc_state_lock); mutex_init(&priv->bfregs.reg_head.lock); mutex_init(&priv->bfregs.wc_head.lock); @@ -1903,6 +1904,7 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev) mutex_destroy(&priv->alloc_mutex); mutex_destroy(&priv->bfregs.wc_head.lock); mutex_destroy(&priv->bfregs.reg_head.lock); + mutex_destroy(&dev->wc_state_lock); mutex_destroy(&dev->mlx5e_res.uplink_netdev_lock); mutex_destroy(&dev->intf_state_mutex); lockdep_unregister_key(&dev->lock_key); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wc.c b/drivers/net/ethernet/mellanox/mlx5/core/wc.c new file mode 100644 index 0000000000000..1bed75eca97db --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/wc.c @@ -0,0 +1,434 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +#include +#include +#include "lib/clock.h" +#include "mlx5_core.h" +#include "wq.h" + +#define TEST_WC_NUM_WQES 255 +#define TEST_WC_LOG_CQ_SZ (order_base_2(TEST_WC_NUM_WQES)) +#define TEST_WC_SQ_LOG_WQ_SZ TEST_WC_LOG_CQ_SZ +#define TEST_WC_POLLING_MAX_TIME_JIFFIES msecs_to_jiffies(100) + +struct mlx5_wc_cq { + /* data path - accessed per cqe */ + struct mlx5_cqwq wq; + + /* data path - accessed per napi poll */ + struct mlx5_core_cq mcq; + + /* control */ + struct mlx5_core_dev *mdev; + struct mlx5_wq_ctrl wq_ctrl; +}; + +struct mlx5_wc_sq { + /* data path */ + u16 cc; + u16 pc; + + /* read only */ + struct mlx5_wq_cyc wq; + u32 sqn; + + /* control path */ + struct mlx5_wq_ctrl wq_ctrl; + + struct mlx5_wc_cq cq; + struct mlx5_sq_bfreg bfreg; +}; + +static int mlx5_wc_create_cqwq(struct mlx5_core_dev *mdev, void *cqc, + struct mlx5_wc_cq *cq) +{ + struct mlx5_core_cq *mcq = &cq->mcq; + struct mlx5_wq_param param = {}; + int err; + u32 i; + + err = mlx5_cqwq_create(mdev, ¶m, cqc, &cq->wq, &cq->wq_ctrl); + if (err) + return err; + + mcq->cqe_sz = 64; + mcq->set_ci_db = cq->wq_ctrl.db.db; + mcq->arm_db = cq->wq_ctrl.db.db + 1; + + for (i = 0; i < mlx5_cqwq_get_size(&cq->wq); i++) { + struct mlx5_cqe64 *cqe = mlx5_cqwq_get_wqe(&cq->wq, i); + + cqe->op_own = 0xf1; + } + + cq->mdev = mdev; + + return 0; +} + +static int create_wc_cq(struct mlx5_wc_cq *cq, void *cqc_data) +{ + u32 out[MLX5_ST_SZ_DW(create_cq_out)]; + struct mlx5_core_dev *mdev = cq->mdev; + struct mlx5_core_cq *mcq = &cq->mcq; + int err, inlen, eqn; + void *in, *cqc; + + err = mlx5_comp_eqn_get(mdev, 0, &eqn); + if (err) + return err; + + inlen = MLX5_ST_SZ_BYTES(create_cq_in) + + sizeof(u64) * cq->wq_ctrl.buf.npages; + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context); + + memcpy(cqc, cqc_data, MLX5_ST_SZ_BYTES(cqc)); + + mlx5_fill_page_frag_array(&cq->wq_ctrl.buf, + (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas)); + + MLX5_SET(cqc, cqc, cq_period_mode, MLX5_CQ_PERIOD_MODE_START_FROM_EQE); + MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn); + MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index); + MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift - + MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma); + + err = mlx5_core_create_cq(mdev, mcq, in, inlen, out, sizeof(out)); + + kvfree(in); + + return err; +} + +static int mlx5_wc_create_cq(struct mlx5_core_dev *mdev, struct mlx5_wc_cq *cq) +{ + void *cqc; + int err; + + cqc = kvzalloc(MLX5_ST_SZ_BYTES(cqc), GFP_KERNEL); + if (!cqc) + return -ENOMEM; + + MLX5_SET(cqc, cqc, log_cq_size, TEST_WC_LOG_CQ_SZ); + MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index); + if (MLX5_CAP_GEN(mdev, cqe_128_always) && cache_line_size() >= 128) + MLX5_SET(cqc, cqc, cqe_sz, CQE_STRIDE_128_PAD); + + err = mlx5_wc_create_cqwq(mdev, cqc, cq); + if (err) { + mlx5_core_err(mdev, "Failed to create wc cq wq, err=%d\n", err); + goto err_create_cqwq; + } + + err = create_wc_cq(cq, cqc); + if (err) { + mlx5_core_err(mdev, "Failed to create wc cq, err=%d\n", err); + goto err_create_cq; + } + + kvfree(cqc); + return 0; + +err_create_cq: + mlx5_wq_destroy(&cq->wq_ctrl); +err_create_cqwq: + kvfree(cqc); + return err; +} + +static void mlx5_wc_destroy_cq(struct mlx5_wc_cq *cq) +{ + mlx5_core_destroy_cq(cq->mdev, &cq->mcq); + mlx5_wq_destroy(&cq->wq_ctrl); +} + +static int create_wc_sq(struct mlx5_core_dev *mdev, void *sqc_data, + struct mlx5_wc_sq *sq) +{ + void *in, *sqc, *wq; + int inlen, err; + u8 ts_format; + + inlen = MLX5_ST_SZ_BYTES(create_sq_in) + + sizeof(u64) * sq->wq_ctrl.buf.npages; + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + sqc = MLX5_ADDR_OF(create_sq_in, in, ctx); + wq = MLX5_ADDR_OF(sqc, sqc, wq); + + memcpy(sqc, sqc_data, MLX5_ST_SZ_BYTES(sqc)); + MLX5_SET(sqc, sqc, cqn, sq->cq.mcq.cqn); + + MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RST); + MLX5_SET(sqc, sqc, flush_in_error_en, 1); + + ts_format = mlx5_is_real_time_sq(mdev) ? + MLX5_TIMESTAMP_FORMAT_REAL_TIME : + MLX5_TIMESTAMP_FORMAT_FREE_RUNNING; + MLX5_SET(sqc, sqc, ts_format, ts_format); + + MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC); + MLX5_SET(wq, wq, uar_page, sq->bfreg.index); + MLX5_SET(wq, wq, log_wq_pg_sz, sq->wq_ctrl.buf.page_shift - + MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET64(wq, wq, dbr_addr, sq->wq_ctrl.db.dma); + + mlx5_fill_page_frag_array(&sq->wq_ctrl.buf, + (__be64 *)MLX5_ADDR_OF(wq, wq, pas)); + + err = mlx5_core_create_sq(mdev, in, inlen, &sq->sqn); + if (err) { + mlx5_core_err(mdev, "Failed to create wc sq, err=%d\n", err); + goto err_create_sq; + } + + memset(in, 0, MLX5_ST_SZ_BYTES(modify_sq_in)); + MLX5_SET(modify_sq_in, in, sq_state, MLX5_SQC_STATE_RST); + sqc = MLX5_ADDR_OF(modify_sq_in, in, ctx); + MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RDY); + + err = mlx5_core_modify_sq(mdev, sq->sqn, in); + if (err) { + mlx5_core_err(mdev, "Failed to set wc sq(sqn=0x%x) ready, err=%d\n", + sq->sqn, err); + goto err_modify_sq; + } + + kvfree(in); + return 0; + +err_modify_sq: + mlx5_core_destroy_sq(mdev, sq->sqn); +err_create_sq: + kvfree(in); + return err; +} + +static int mlx5_wc_create_sq(struct mlx5_core_dev *mdev, struct mlx5_wc_sq *sq) +{ + struct mlx5_wq_param param = {}; + void *sqc_data, *wq; + int err; + + sqc_data = kvzalloc(MLX5_ST_SZ_BYTES(sqc), GFP_KERNEL); + if (!sqc_data) + return -ENOMEM; + + wq = MLX5_ADDR_OF(sqc, sqc_data, wq); + MLX5_SET(wq, wq, log_wq_stride, ilog2(MLX5_SEND_WQE_BB)); + MLX5_SET(wq, wq, pd, mdev->mlx5e_res.hw_objs.pdn); + MLX5_SET(wq, wq, log_wq_sz, TEST_WC_SQ_LOG_WQ_SZ); + + err = mlx5_wq_cyc_create(mdev, ¶m, wq, &sq->wq, &sq->wq_ctrl); + if (err) { + mlx5_core_err(mdev, "Failed to create wc sq wq, err=%d\n", err); + goto err_create_wq_cyc; + } + + err = create_wc_sq(mdev, sqc_data, sq); + if (err) + goto err_create_sq; + + mlx5_core_dbg(mdev, "wc sq->sqn = 0x%x created\n", sq->sqn); + + kvfree(sqc_data); + return 0; + +err_create_sq: + mlx5_wq_destroy(&sq->wq_ctrl); +err_create_wq_cyc: + kvfree(sqc_data); + return err; +} + +static void mlx5_wc_destroy_sq(struct mlx5_wc_sq *sq) +{ + mlx5_core_destroy_sq(sq->cq.mdev, sq->sqn); + mlx5_wq_destroy(&sq->wq_ctrl); +} + +static void mlx5_wc_post_nop(struct mlx5_wc_sq *sq, bool signaled) +{ + int buf_size = (1 << MLX5_CAP_GEN(sq->cq.mdev, log_bf_reg_size)) / 2; + struct mlx5_wqe_ctrl_seg *ctrl; + __be32 mmio_wqe[16] = {}; + u16 pi; + + pi = mlx5_wq_cyc_ctr2ix(&sq->wq, sq->pc); + ctrl = mlx5_wq_cyc_get_wqe(&sq->wq, pi); + memset(ctrl, 0, sizeof(*ctrl)); + ctrl->opmod_idx_opcode = + cpu_to_be32((sq->pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) | MLX5_OPCODE_NOP); + ctrl->qpn_ds = + cpu_to_be32((sq->sqn << MLX5_WQE_CTRL_QPN_SHIFT) | + DIV_ROUND_UP(sizeof(struct mlx5_wqe_ctrl_seg), MLX5_SEND_WQE_DS)); + if (signaled) + ctrl->fm_ce_se |= MLX5_WQE_CTRL_CQ_UPDATE; + + memcpy(mmio_wqe, ctrl, sizeof(*ctrl)); + ((struct mlx5_wqe_ctrl_seg *)&mmio_wqe)->fm_ce_se |= + MLX5_WQE_CTRL_CQ_UPDATE; + + /* ensure wqe is visible to device before updating doorbell record */ + dma_wmb(); + + sq->pc++; + sq->wq.db[MLX5_SND_DBR] = cpu_to_be32(sq->pc); + + /* ensure doorbell record is visible to device before ringing the + * doorbell + */ + wmb(); + + __iowrite64_copy(sq->bfreg.map + sq->bfreg.offset, mmio_wqe, + sizeof(mmio_wqe) / 8); + + sq->bfreg.offset ^= buf_size; +} + +static int mlx5_wc_poll_cq(struct mlx5_wc_sq *sq) +{ + struct mlx5_wc_cq *cq = &sq->cq; + struct mlx5_cqe64 *cqe; + + cqe = mlx5_cqwq_get_cqe(&cq->wq); + if (!cqe) + return -ETIMEDOUT; + + /* sq->cc must be updated only after mlx5_cqwq_update_db_record(), + * otherwise a cq overrun may occur + */ + mlx5_cqwq_pop(&cq->wq); + + if (get_cqe_opcode(cqe) == MLX5_CQE_REQ) { + int wqe_counter = be16_to_cpu(cqe->wqe_counter); + struct mlx5_core_dev *mdev = cq->mdev; + + if (wqe_counter == TEST_WC_NUM_WQES - 1) + mdev->wc_state = MLX5_WC_STATE_UNSUPPORTED; + else + mdev->wc_state = MLX5_WC_STATE_SUPPORTED; + + mlx5_core_dbg(mdev, "wc wqe_counter = 0x%x\n", wqe_counter); + } + + mlx5_cqwq_update_db_record(&cq->wq); + + /* ensure cq space is freed before enabling more cqes */ + wmb(); + + sq->cc++; + + return 0; +} + +static void mlx5_core_test_wc(struct mlx5_core_dev *mdev) +{ + unsigned long expires; + struct mlx5_wc_sq *sq; + int i, err; + + if (mdev->wc_state != MLX5_WC_STATE_UNINITIALIZED) + return; + + sq = kzalloc(sizeof(*sq), GFP_KERNEL); + if (!sq) + return; + + err = mlx5_alloc_bfreg(mdev, &sq->bfreg, true, false); + if (err) { + mlx5_core_err(mdev, "Failed to alloc bfreg for wc, err=%d\n", err); + goto err_alloc_bfreg; + } + + err = mlx5_wc_create_cq(mdev, &sq->cq); + if (err) + goto err_create_cq; + + err = mlx5_wc_create_sq(mdev, sq); + if (err) + goto err_create_sq; + + for (i = 0; i < TEST_WC_NUM_WQES - 1; i++) + mlx5_wc_post_nop(sq, false); + + mlx5_wc_post_nop(sq, true); + + expires = jiffies + TEST_WC_POLLING_MAX_TIME_JIFFIES; + do { + err = mlx5_wc_poll_cq(sq); + if (err) + usleep_range(2, 10); + } while (mdev->wc_state == MLX5_WC_STATE_UNINITIALIZED && + time_is_after_jiffies(expires)); + + mlx5_wc_destroy_sq(sq); + +err_create_sq: + mlx5_wc_destroy_cq(&sq->cq); +err_create_cq: + mlx5_free_bfreg(mdev, &sq->bfreg); +err_alloc_bfreg: + kfree(sq); +} + +bool mlx5_wc_support_get(struct mlx5_core_dev *mdev) +{ + struct mlx5_core_dev *parent = NULL; + + if (!MLX5_CAP_GEN(mdev, bf)) { + mlx5_core_dbg(mdev, "BlueFlame not supported\n"); + goto out; + } + + if (!MLX5_CAP_GEN(mdev, log_max_sq)) { + mlx5_core_dbg(mdev, "SQ not supported\n"); + goto out; + } + + if (mdev->wc_state != MLX5_WC_STATE_UNINITIALIZED) + /* No need to lock anything as we perform WC test only + * once for whole device and was already done. + */ + goto out; + + mutex_lock(&mdev->wc_state_lock); + + if (mdev->wc_state != MLX5_WC_STATE_UNINITIALIZED) + goto unlock; + +#ifdef CONFIG_MLX5_SF + if (mlx5_core_is_sf(mdev)) + parent = mdev->priv.parent_mdev; +#endif + + if (parent) { + mutex_lock(&parent->wc_state_lock); + + mlx5_core_test_wc(parent); + + mlx5_core_dbg(mdev, "parent set wc_state=%d\n", + parent->wc_state); + mdev->wc_state = parent->wc_state; + + mutex_unlock(&parent->wc_state_lock); + } + + mlx5_core_test_wc(mdev); + +unlock: + mutex_unlock(&mdev->wc_state_lock); +out: + mlx5_core_dbg(mdev, "wc_state=%d\n", mdev->wc_state); + + return mdev->wc_state == MLX5_WC_STATE_SUPPORTED; +} +EXPORT_SYMBOL(mlx5_wc_support_get); diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 3f72069021b92..1c74bbbe50bb5 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -757,6 +757,12 @@ struct mlx5_hca_cap { u32 max[MLX5_UN_SZ_DW(hca_cap_union)]; }; +enum mlx5_wc_state { + MLX5_WC_STATE_UNINITIALIZED, + MLX5_WC_STATE_UNSUPPORTED, + MLX5_WC_STATE_SUPPORTED, +}; + struct mlx5_core_dev { struct device *device; enum mlx5_coredev_type coredev_type; @@ -814,6 +820,9 @@ struct mlx5_core_dev { struct blocking_notifier_head macsec_nh; #endif u64 num_ipsec_offloads; + enum mlx5_wc_state wc_state; + /* sync write combining state */ + struct mutex wc_state_lock; }; struct mlx5_db { @@ -1386,4 +1395,6 @@ static inline bool mlx5_is_macsec_roce_supported(struct mlx5_core_dev *mdev) enum { MLX5_OCTWORD = 16, }; + +bool mlx5_wc_support_get(struct mlx5_core_dev *mdev); #endif /* MLX5_DRIVER_H */ From ad22f76c6599ef166411e5f639026a620f2a213b Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Mon, 3 Jun 2024 13:26:38 +0300 Subject: [PATCH 053/548] IB/mlx5: Create UMR QP just before first reg_mr occurs UMR QP is not used in some cases, so move QP and its CQ creations from driver load flow to the time first reg_mr occurs, that is when MR interfaces are first called. The initialization of dev->umrc.pd and dev->umrc.lock is still done in driver load because pd is needed for mlx5_mkey_cache_init and the lock is reused to protect against the concurrent creation. When testing 4G bytes memory registration latency with rtool [1] and 8 threads in parallel, there is minor performance degradation (<5% for the max latency) is seen for the first reg_mr with this change. Link: https://github.com/paravmellanox/rtool [1] Signed-off-by: Jianbo Liu Link: https://lore.kernel.org/r/55d3c4f8a542fd974d8a4c5816eccfb318a59b38.1717409369.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 638420115cc4ad6c3a2683bf46a052b505abb202) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 3 +- drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 + drivers/infiniband/hw/mlx5/mr.c | 9 +++++ drivers/infiniband/hw/mlx5/umr.c | 55 +++++++++++++++++++++------- drivers/infiniband/hw/mlx5/umr.h | 3 ++ 5 files changed, 58 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index e39df64312661..e604728b66a1e 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -4208,6 +4208,7 @@ static void mlx5_ib_stage_pre_ib_reg_umr_cleanup(struct mlx5_ib_dev *dev) { mlx5_mkey_cache_cleanup(dev); mlx5r_umr_resource_cleanup(dev); + mlx5r_umr_cleanup(dev); } static void mlx5_ib_stage_ib_reg_cleanup(struct mlx5_ib_dev *dev) @@ -4219,7 +4220,7 @@ static int mlx5_ib_stage_post_ib_reg_umr_init(struct mlx5_ib_dev *dev) { int ret; - ret = mlx5r_umr_resource_init(dev); + ret = mlx5r_umr_init(dev); if (ret) return ret; diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 29382f9beeb20..4e4ab08f70241 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -763,6 +763,8 @@ struct umr_common { */ struct mutex lock; unsigned int state; + /* Protects from repeat UMR QP creation */ + struct mutex init_lock; }; struct mlx5_cache_ent { diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 9e465cf99733e..52d6ce9e4e4ae 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1501,6 +1501,7 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, { struct mlx5_ib_dev *dev = to_mdev(pd->device); struct ib_umem *umem; + int err; if (!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM)) return ERR_PTR(-EOPNOTSUPP); @@ -1508,6 +1509,10 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, mlx5_ib_dbg(dev, "start 0x%llx, iova 0x%llx, length 0x%llx, access_flags 0x%x\n", start, iova, length, access_flags); + err = mlx5r_umr_resource_init(dev); + if (err) + return ERR_PTR(err); + if (access_flags & IB_ACCESS_ON_DEMAND) return create_user_odp_mr(pd, start, length, iova, access_flags, udata); @@ -1554,6 +1559,10 @@ struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, "offset 0x%llx, virt_addr 0x%llx, length 0x%llx, fd %d, access_flags 0x%x\n", offset, virt_addr, length, fd, access_flags); + err = mlx5r_umr_resource_init(dev); + if (err) + return ERR_PTR(err); + /* dmabuf requires xlt update via umr to work. */ if (!mlx5r_umr_can_load_pas(dev, length)) return ERR_PTR(-EINVAL); diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c index 234bf30db7319..e7e049b2fceb8 100644 --- a/drivers/infiniband/hw/mlx5/umr.c +++ b/drivers/infiniband/hw/mlx5/umr.c @@ -135,22 +135,28 @@ static int mlx5r_umr_qp_rst2rts(struct mlx5_ib_dev *dev, struct ib_qp *qp) int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev) { struct ib_qp_init_attr init_attr = {}; - struct ib_pd *pd; struct ib_cq *cq; struct ib_qp *qp; - int ret; + int ret = 0; - pd = ib_alloc_pd(&dev->ib_dev, 0); - if (IS_ERR(pd)) { - mlx5_ib_dbg(dev, "Couldn't create PD for sync UMR QP\n"); - return PTR_ERR(pd); - } + + /* + * UMR qp is set once, never changed until device unload. + * Avoid taking the mutex if initialization is already done. + */ + if (dev->umrc.qp) + return 0; + + mutex_lock(&dev->umrc.init_lock); + /* First user allocates the UMR resources. Skip if already allocated. */ + if (dev->umrc.qp) + goto unlock; cq = ib_alloc_cq(&dev->ib_dev, NULL, 128, 0, IB_POLL_SOFTIRQ); if (IS_ERR(cq)) { mlx5_ib_dbg(dev, "Couldn't create CQ for sync UMR QP\n"); ret = PTR_ERR(cq); - goto destroy_pd; + goto unlock; } init_attr.send_cq = cq; @@ -160,7 +166,7 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev) init_attr.cap.max_send_sge = 1; init_attr.qp_type = MLX5_IB_QPT_REG_UMR; init_attr.port_num = 1; - qp = ib_create_qp(pd, &init_attr); + qp = ib_create_qp(dev->umrc.pd, &init_attr); if (IS_ERR(qp)) { mlx5_ib_dbg(dev, "Couldn't create sync UMR QP\n"); ret = PTR_ERR(qp); @@ -171,22 +177,22 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev) if (ret) goto destroy_qp; - dev->umrc.qp = qp; dev->umrc.cq = cq; - dev->umrc.pd = pd; sema_init(&dev->umrc.sem, MAX_UMR_WR); mutex_init(&dev->umrc.lock); dev->umrc.state = MLX5_UMR_STATE_ACTIVE; + dev->umrc.qp = qp; + mutex_unlock(&dev->umrc.init_lock); return 0; destroy_qp: ib_destroy_qp(qp); destroy_cq: ib_free_cq(cq); -destroy_pd: - ib_dealloc_pd(pd); +unlock: + mutex_unlock(&dev->umrc.init_lock); return ret; } @@ -194,8 +200,31 @@ void mlx5r_umr_resource_cleanup(struct mlx5_ib_dev *dev) { if (dev->umrc.state == MLX5_UMR_STATE_UNINIT) return; + mutex_destroy(&dev->umrc.lock); + /* After device init, UMR cp/qp are not unset during the lifetime. */ ib_destroy_qp(dev->umrc.qp); ib_free_cq(dev->umrc.cq); +} + +int mlx5r_umr_init(struct mlx5_ib_dev *dev) +{ + struct ib_pd *pd; + + pd = ib_alloc_pd(&dev->ib_dev, 0); + if (IS_ERR(pd)) { + mlx5_ib_dbg(dev, "Couldn't create PD for sync UMR QP\n"); + return PTR_ERR(pd); + } + dev->umrc.pd = pd; + + mutex_init(&dev->umrc.init_lock); + + return 0; +} + +void mlx5r_umr_cleanup(struct mlx5_ib_dev *dev) +{ + mutex_destroy(&dev->umrc.init_lock); ib_dealloc_pd(dev->umrc.pd); } diff --git a/drivers/infiniband/hw/mlx5/umr.h b/drivers/infiniband/hw/mlx5/umr.h index 3799bb758e490..5f734dc72befe 100644 --- a/drivers/infiniband/hw/mlx5/umr.h +++ b/drivers/infiniband/hw/mlx5/umr.h @@ -16,6 +16,9 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev); void mlx5r_umr_resource_cleanup(struct mlx5_ib_dev *dev); +int mlx5r_umr_init(struct mlx5_ib_dev *dev); +void mlx5r_umr_cleanup(struct mlx5_ib_dev *dev); + static inline bool mlx5r_umr_can_load_pas(struct mlx5_ib_dev *dev, size_t length) { From 30db9e61cb25d2cc0e17243b09140daba576f789 Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:11 +0300 Subject: [PATCH 054/548] RDMA/mlx5: Introduce the 'data direct' driver Introduce the 'data direct' driver for a ConnectX-8 Data Direct device. The 'data direct' driver functions as the affiliated DMA device for one or more capable mlx5_ib devices. This DMA device, as the name suggests, is used exclusively for DMA operations. It can be considered a DMA engine managed by a PF/VF, lacking network capabilities and having minimal overall capabilities. Consequently, the DMA NIC PF will not be exposed to or directly used by software applications. The driver will not have any direct interface or interaction with the firmware (no command interface, no capabilities, etc.). It will operate solely over PCI to enable its DMA functionality. Registration and un-registration of the driver are handled as part of the mlx5_ib initialization and exit processes, as the mlx5_ib devices will effectively be its clients. The driver will serve as the DMA device for accessing another PCI device to achieve optimal performance (both on the same NUMA node, P2P access, etc.). Upon probing, it will read its VUID over PCI to handle mlx5_ib device registrations with the same VUID. Upon removal, it will notify its clients to allow them to clean up the resources that were mmaped with its DMA device. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/b77edecfd476c3f445da96ab6aef499ae47b2829.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 6910e3660d86c1a5654f742a40181d2c9154f26f) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/Makefile | 1 + drivers/infiniband/hw/mlx5/data_direct.c | 227 +++++++++++++++++++++++ drivers/infiniband/hw/mlx5/data_direct.h | 23 +++ drivers/infiniband/hw/mlx5/main.c | 24 +++ drivers/infiniband/hw/mlx5/mlx5_ib.h | 6 + 5 files changed, 281 insertions(+) create mode 100644 drivers/infiniband/hw/mlx5/data_direct.c create mode 100644 drivers/infiniband/hw/mlx5/data_direct.h diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile index 72a526236c2e0..b38961f5058ef 100644 --- a/drivers/infiniband/hw/mlx5/Makefile +++ b/drivers/infiniband/hw/mlx5/Makefile @@ -6,6 +6,7 @@ mlx5_ib-y := ah.o \ cong.o \ counters.o \ cq.o \ + data_direct.o \ dm.o \ doorbell.o \ gsi.o \ diff --git a/drivers/infiniband/hw/mlx5/data_direct.c b/drivers/infiniband/hw/mlx5/data_direct.c new file mode 100644 index 0000000000000..b9ba84afaae22 --- /dev/null +++ b/drivers/infiniband/hw/mlx5/data_direct.c @@ -0,0 +1,227 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* + * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#include "mlx5_ib.h" +#include "data_direct.h" + +static LIST_HEAD(mlx5_data_direct_dev_list); +static LIST_HEAD(mlx5_data_direct_reg_list); + +/* + * This mutex should be held when accessing either of the above lists + */ +static DEFINE_MUTEX(mlx5_data_direct_mutex); + +struct mlx5_data_direct_registration { + struct mlx5_ib_dev *ibdev; + char vuid[MLX5_ST_SZ_BYTES(array1024_auto) + 1]; + struct list_head list; +}; + +static const struct pci_device_id mlx5_data_direct_pci_table[] = { + { PCI_VDEVICE(MELLANOX, 0x2100) }, /* ConnectX-8 Data Direct */ + { 0, } +}; + +static int mlx5_data_direct_vpd_get_vuid(struct mlx5_data_direct_dev *dev) +{ + struct pci_dev *pdev = dev->pdev; + unsigned int vpd_size, kw_len; + u8 *vpd_data; + int start; + int ret; + + vpd_data = pci_vpd_alloc(pdev, &vpd_size); + if (IS_ERR(vpd_data)) { + pci_err(pdev, "Unable to read VPD, err=%ld\n", PTR_ERR(vpd_data)); + return PTR_ERR(vpd_data); + } + + start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size, "VU", &kw_len); + if (start < 0) { + ret = start; + pci_err(pdev, "VU keyword not found, err=%d\n", ret); + goto end; + } + + dev->vuid = kmemdup_nul(vpd_data + start, kw_len, GFP_KERNEL); + ret = dev->vuid ? 0 : -ENOMEM; + +end: + kfree(vpd_data); + return ret; +} + +static void mlx5_data_direct_shutdown(struct pci_dev *pdev) +{ + pci_disable_device(pdev); +} + +static int mlx5_data_direct_set_dma_caps(struct pci_dev *pdev) +{ + int err; + + err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (err) { + dev_warn(&pdev->dev, + "Warning: couldn't set 64-bit PCI DMA mask, err=%d\n", err); + err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); + if (err) { + dev_err(&pdev->dev, "Can't set PCI DMA mask, err=%d\n", err); + return err; + } + } + + dma_set_max_seg_size(&pdev->dev, SZ_2G); + return 0; +} + +int mlx5_data_direct_ib_reg(struct mlx5_ib_dev *ibdev, char *vuid) +{ + struct mlx5_data_direct_registration *reg; + struct mlx5_data_direct_dev *dev; + + reg = kzalloc(sizeof(*reg), GFP_KERNEL); + if (!reg) + return -ENOMEM; + + reg->ibdev = ibdev; + strcpy(reg->vuid, vuid); + + mutex_lock(&mlx5_data_direct_mutex); + list_for_each_entry(dev, &mlx5_data_direct_dev_list, list) { + if (strcmp(dev->vuid, vuid) == 0) { + mlx5_ib_data_direct_bind(ibdev, dev); + break; + } + } + + /* Add the registration to its global list, to be used upon bind/unbind + * of its affiliated data direct device + */ + list_add_tail(®->list, &mlx5_data_direct_reg_list); + mutex_unlock(&mlx5_data_direct_mutex); + return 0; +} + +void mlx5_data_direct_ib_unreg(struct mlx5_ib_dev *ibdev) +{ + struct mlx5_data_direct_registration *reg; + + mutex_lock(&mlx5_data_direct_mutex); + list_for_each_entry(reg, &mlx5_data_direct_reg_list, list) { + if (reg->ibdev == ibdev) { + list_del(®->list); + kfree(reg); + goto end; + } + } + + WARN_ON(true); +end: + mutex_unlock(&mlx5_data_direct_mutex); +} + +static void mlx5_data_direct_dev_reg(struct mlx5_data_direct_dev *dev) +{ + struct mlx5_data_direct_registration *reg; + + mutex_lock(&mlx5_data_direct_mutex); + list_for_each_entry(reg, &mlx5_data_direct_reg_list, list) { + if (strcmp(dev->vuid, reg->vuid) == 0) + mlx5_ib_data_direct_bind(reg->ibdev, dev); + } + + /* Add the data direct device to the global list, further IB devices may + * use it later as well + */ + list_add_tail(&dev->list, &mlx5_data_direct_dev_list); + mutex_unlock(&mlx5_data_direct_mutex); +} + +static void mlx5_data_direct_dev_unreg(struct mlx5_data_direct_dev *dev) +{ + struct mlx5_data_direct_registration *reg; + + mutex_lock(&mlx5_data_direct_mutex); + /* Prevent any further affiliations */ + list_del(&dev->list); + list_for_each_entry(reg, &mlx5_data_direct_reg_list, list) { + if (strcmp(dev->vuid, reg->vuid) == 0) + mlx5_ib_data_direct_unbind(reg->ibdev); + } + mutex_unlock(&mlx5_data_direct_mutex); +} + +static int mlx5_data_direct_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct mlx5_data_direct_dev *dev; + int err; + + dev = kzalloc(sizeof(*dev), GFP_KERNEL); + if (!dev) + return -ENOMEM; + + dev->device = &pdev->dev; + dev->pdev = pdev; + + pci_set_drvdata(dev->pdev, dev); + err = pci_enable_device(pdev); + if (err) { + dev_err(dev->device, "Cannot enable PCI device, err=%d\n", err); + goto err; + } + + pci_set_master(pdev); + err = mlx5_data_direct_set_dma_caps(pdev); + if (err) + goto err_disable; + + if (pci_enable_atomic_ops_to_root(pdev, PCI_EXP_DEVCAP2_ATOMIC_COMP32) && + pci_enable_atomic_ops_to_root(pdev, PCI_EXP_DEVCAP2_ATOMIC_COMP64) && + pci_enable_atomic_ops_to_root(pdev, PCI_EXP_DEVCAP2_ATOMIC_COMP128)) + dev_dbg(dev->device, "Enabling pci atomics failed\n"); + + err = mlx5_data_direct_vpd_get_vuid(dev); + if (err) + goto err_disable; + + mlx5_data_direct_dev_reg(dev); + return 0; + +err_disable: + pci_disable_device(pdev); +err: + kfree(dev); + return err; +} + +static void mlx5_data_direct_remove(struct pci_dev *pdev) +{ + struct mlx5_data_direct_dev *dev = pci_get_drvdata(pdev); + + mlx5_data_direct_dev_unreg(dev); + pci_disable_device(pdev); + kfree(dev->vuid); + kfree(dev); +} + +static struct pci_driver mlx5_data_direct_driver = { + .name = KBUILD_MODNAME, + .id_table = mlx5_data_direct_pci_table, + .probe = mlx5_data_direct_probe, + .remove = mlx5_data_direct_remove, + .shutdown = mlx5_data_direct_shutdown, +}; + +int mlx5_data_direct_driver_register(void) +{ + return pci_register_driver(&mlx5_data_direct_driver); +} + +void mlx5_data_direct_driver_unregister(void) +{ + pci_unregister_driver(&mlx5_data_direct_driver); +} diff --git a/drivers/infiniband/hw/mlx5/data_direct.h b/drivers/infiniband/hw/mlx5/data_direct.h new file mode 100644 index 0000000000000..2fd2bdbe8f692 --- /dev/null +++ b/drivers/infiniband/hw/mlx5/data_direct.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* + * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#ifndef _MLX5_IB_DATA_DIRECT_H +#define _MLX5_IB_DATA_DIRECT_H + +struct mlx5_ib_dev; + +struct mlx5_data_direct_dev { + struct device *device; + struct pci_dev *pdev; + char *vuid; + struct list_head list; +}; + +int mlx5_data_direct_ib_reg(struct mlx5_ib_dev *ibdev, char *vuid); +void mlx5_data_direct_ib_unreg(struct mlx5_ib_dev *ibdev); +int mlx5_data_direct_driver_register(void); +void mlx5_data_direct_driver_unregister(void); + +#endif diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index e604728b66a1e..05db06331146f 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -48,6 +48,7 @@ #include #include #include "macsec.h" +#include "data_direct.h" #define UVERBS_MODULE_NAME mlx5_ib #include @@ -3862,6 +3863,7 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) dev->ib_dev.num_comp_vectors = mlx5_comp_vectors_max(mdev); mutex_init(&dev->cap_mask_mutex); + mutex_init(&dev->data_direct_lock); INIT_LIST_HEAD(&dev->qp_list); spin_lock_init(&dev->reset_flow_resource_lock); xa_init(&dev->odp_mkeys); @@ -4302,6 +4304,21 @@ static void mlx5_ib_stage_dev_notifier_cleanup(struct mlx5_ib_dev *dev) cancel_work_sync(&devr->ports[port].pkey_change_work); } +void mlx5_ib_data_direct_bind(struct mlx5_ib_dev *ibdev, + struct mlx5_data_direct_dev *dev) +{ + mutex_lock(&ibdev->data_direct_lock); + ibdev->data_direct_dev = dev; + mutex_unlock(&ibdev->data_direct_lock); +} + +void mlx5_ib_data_direct_unbind(struct mlx5_ib_dev *ibdev) +{ + mutex_lock(&ibdev->data_direct_lock); + ibdev->data_direct_dev = NULL; + mutex_unlock(&ibdev->data_direct_lock); +} + void __mlx5_ib_remove(struct mlx5_ib_dev *dev, const struct mlx5_ib_profile *profile, int stage) @@ -4725,17 +4742,23 @@ static int __init mlx5_ib_init(void) ret = mlx5r_rep_init(); if (ret) goto rep_err; + ret = mlx5_data_direct_driver_register(); + if (ret) + goto dd_err; ret = auxiliary_driver_register(&mlx5r_mp_driver); if (ret) goto mp_err; ret = auxiliary_driver_register(&mlx5r_driver); if (ret) goto drv_err; + return 0; drv_err: auxiliary_driver_unregister(&mlx5r_mp_driver); mp_err: + mlx5_data_direct_driver_unregister(); +dd_err: mlx5r_rep_cleanup(); rep_err: mlx5_ib_qp_event_cleanup(); @@ -4747,6 +4770,7 @@ static int __init mlx5_ib_init(void) static void __exit mlx5_ib_cleanup(void) { + mlx5_data_direct_driver_unregister(); auxiliary_driver_unregister(&mlx5r_driver); auxiliary_driver_unregister(&mlx5r_mp_driver); mlx5r_rep_cleanup(); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 4e4ab08f70241..224f398743c6f 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1115,6 +1115,9 @@ struct mlx5_macsec { struct mlx5_ib_dev { struct ib_device ib_dev; struct mlx5_core_dev *mdev; + struct mlx5_data_direct_dev *data_direct_dev; + /* protect accessing data_direct_dev */ + struct mutex data_direct_lock; struct notifier_block mdev_events; int num_ports; /* serialize update of capability mask @@ -1410,6 +1413,9 @@ int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table); struct ib_mr *mlx5_ib_reg_dm_mr(struct ib_pd *pd, struct ib_dm *dm, struct ib_dm_mr_attr *attr, struct uverbs_attr_bundle *attrs); +void mlx5_ib_data_direct_bind(struct mlx5_ib_dev *ibdev, + struct mlx5_data_direct_dev *dev); +void mlx5_ib_data_direct_unbind(struct mlx5_ib_dev *ibdev); #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING int mlx5_ib_odp_init_one(struct mlx5_ib_dev *ibdev); From 97817a1fe299817adf0d977f4a4e4f2e6870affd Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:12 +0300 Subject: [PATCH 055/548] RDMA/mlx5: Add the initialization flow to utilize the 'data direct' device Add the NET device initialization flow to utilize the 'data direct' device. When a NET mlx5_ib device is capable of 'data direct', the following sequence of actions will occur: - Find its affiliated 'data direct' VUID via a firmware command. - Create its own private PD and 'data direct' mkey. - Register to be notified when its 'data direct' driver is probed or removed. The DMA device of the affiliated 'data direct' device, including the private PD and the 'data direct' mkey, will be used later during MR registrations that request the data direct functionality. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/b11fa87b2a65bce4db8d40341bb6cee490fa4d06.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 2e8e631d7a41e3a4edc94f3c9dd5cb32c2aa539e) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/cmd.c | 21 +++++++ drivers/infiniband/hw/mlx5/cmd.h | 2 + drivers/infiniband/hw/mlx5/main.c | 90 ++++++++++++++++++++++++++++ drivers/infiniband/hw/mlx5/mlx5_ib.h | 6 ++ 4 files changed, 119 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c index 895b62cc528df..7c08e30089277 100644 --- a/drivers/infiniband/hw/mlx5/cmd.c +++ b/drivers/infiniband/hw/mlx5/cmd.c @@ -245,3 +245,24 @@ int mlx5_cmd_uar_dealloc(struct mlx5_core_dev *dev, u32 uarn, u16 uid) MLX5_SET(dealloc_uar_in, in, uid, uid); return mlx5_cmd_exec_in(dev, dealloc_uar, in); } + +int mlx5_cmd_query_vuid(struct mlx5_core_dev *dev, bool data_direct, + char *out_vuid) +{ + u8 out[MLX5_ST_SZ_BYTES(query_vuid_out) + + MLX5_ST_SZ_BYTES(array1024_auto)] = {}; + u8 in[MLX5_ST_SZ_BYTES(query_vuid_in)] = {}; + char *vuid; + int err; + + MLX5_SET(query_vuid_in, in, opcode, MLX5_CMD_OPCODE_QUERY_VUID); + MLX5_SET(query_vuid_in, in, vhca_id, MLX5_CAP_GEN(dev, vhca_id)); + MLX5_SET(query_vuid_in, in, data_direct, data_direct); + err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); + if (err) + return err; + + vuid = MLX5_ADDR_OF(query_vuid_out, out, vuid); + memcpy(out_vuid, vuid, MLX5_ST_SZ_BYTES(array1024_auto)); + return 0; +} diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h index e5cd31270443d..e6c88b6ebd0da 100644 --- a/drivers/infiniband/hw/mlx5/cmd.h +++ b/drivers/infiniband/hw/mlx5/cmd.h @@ -58,4 +58,6 @@ int mlx5_cmd_mad_ifc(struct mlx5_ib_dev *dev, const void *inb, void *outb, u16 opmod, u8 port); int mlx5_cmd_uar_alloc(struct mlx5_core_dev *dev, u32 *uarn, u16 uid); int mlx5_cmd_uar_dealloc(struct mlx5_core_dev *dev, u32 uarn, u16 uid); +int mlx5_cmd_query_vuid(struct mlx5_core_dev *dev, bool data_direct, + char *out_vuid); #endif /* MLX5_IB_CMD_H */ diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 05db06331146f..8389628d6a9e8 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3014,6 +3014,59 @@ static void mlx5_ib_dev_res_cleanup(struct mlx5_ib_dev *dev) mutex_destroy(&devr->srq_lock); } +static int +mlx5_ib_create_data_direct_resources(struct mlx5_ib_dev *dev) +{ + int inlen = MLX5_ST_SZ_BYTES(create_mkey_in); + struct mlx5_core_dev *mdev = dev->mdev; + void *mkc; + u32 mkey; + u32 pdn; + u32 *in; + int err; + + err = mlx5_core_alloc_pd(mdev, &pdn); + if (err) + return err; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) { + err = -ENOMEM; + goto err; + } + + MLX5_SET(create_mkey_in, in, data_direct, 1); + mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); + MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_PA); + MLX5_SET(mkc, mkc, lw, 1); + MLX5_SET(mkc, mkc, lr, 1); + MLX5_SET(mkc, mkc, rw, 1); + MLX5_SET(mkc, mkc, rr, 1); + MLX5_SET(mkc, mkc, a, 1); + MLX5_SET(mkc, mkc, pd, pdn); + MLX5_SET(mkc, mkc, length64, 1); + MLX5_SET(mkc, mkc, qpn, 0xffffff); + err = mlx5_core_create_mkey(mdev, &mkey, in, inlen); + kvfree(in); + if (err) + goto err; + + dev->ddr.mkey = mkey; + dev->ddr.pdn = pdn; + return 0; + +err: + mlx5_core_dealloc_pd(mdev, pdn); + return err; +} + +static void +mlx5_ib_free_data_direct_resources(struct mlx5_ib_dev *dev) +{ + mlx5_core_destroy_mkey(dev->mdev, dev->ddr.mkey); + mlx5_core_dealloc_pd(dev->mdev, dev->ddr.pdn); +} + static u32 get_core_cap_flags(struct ib_device *ibdev, struct mlx5_hca_vport_context *rep) { @@ -3416,6 +3469,38 @@ static bool mlx5_ib_bind_slave_port(struct mlx5_ib_dev *ibdev, return false; } +static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev) +{ + char vuid[MLX5_ST_SZ_BYTES(array1024_auto) + 1] = {}; + int ret; + + if (!MLX5_CAP_GEN(dev->mdev, data_direct)) + return 0; + + ret = mlx5_cmd_query_vuid(dev->mdev, true, vuid); + if (ret) + return ret; + + ret = mlx5_ib_create_data_direct_resources(dev); + if (ret) + return ret; + + ret = mlx5_data_direct_ib_reg(dev, vuid); + if (ret) + mlx5_ib_free_data_direct_resources(dev); + + return ret; +} + +static void mlx5_ib_data_direct_cleanup(struct mlx5_ib_dev *dev) +{ + if (!MLX5_CAP_GEN(dev->mdev, data_direct)) + return; + + mlx5_data_direct_ib_unreg(dev); + mlx5_ib_free_data_direct_resources(dev); +} + static int mlx5_ib_init_multiport_master(struct mlx5_ib_dev *dev) { u32 port_num = mlx5_core_native_port_num(dev->mdev) - 1; @@ -3810,6 +3895,7 @@ static const struct uapi_definition mlx5_ib_defs[] = { static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev) { + mlx5_ib_data_direct_cleanup(dev); mlx5_ib_cleanup_multiport_master(dev); WARN_ON(!xa_empty(&dev->odp_mkeys)); mutex_destroy(&dev->cap_mask_mutex); @@ -3872,6 +3958,10 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) spin_lock_init(&dev->dm.lock); dev->dm.dev = mdev; + err = mlx5_ib_data_direct_init(dev); + if (err) + goto err_mp; + return 0; err_mp: mlx5_ib_cleanup_multiport_master(dev); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 224f398743c6f..2786ae344a556 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -819,6 +819,11 @@ struct mlx5_ib_port_resources { struct work_struct pkey_change_work; }; +struct mlx5_data_direct_resources { + u32 pdn; + u32 mkey; +}; + struct mlx5_ib_resources { struct ib_cq *c0; struct mutex cq_lock; @@ -1172,6 +1177,7 @@ struct mlx5_ib_dev { u16 pkey_table_len; u8 lag_ports; struct mlx5_special_mkeys mkeys; + struct mlx5_data_direct_resources ddr; #ifdef CONFIG_MLX5_MACSEC struct mlx5_macsec macsec; From 420eb791f02d531d755d8e1fe7f7c5630b8f6f19 Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:13 +0300 Subject: [PATCH 056/548] RDMA/umem: Add support for creating pinned DMABUF umem with a given dma device Add support for creating pinned DMABUF umem with a specified DMA device instead of the DMA device of the given IB device. This API will be utilized in the upcoming patches of the series when multiple path DMAs are implemented. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/038aad36a43797e5591b20ba81051fc5758124f9.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 682358fd35dece838e6ae2d9d6a69fc0b9a9d411) Signed-off-by: Koba Ko --- drivers/infiniband/core/umem_dmabuf.c | 45 ++++++++++++++++++++------- include/rdma/ib_umem.h | 15 +++++++++ 2 files changed, 49 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c index 39357dc2d229f..726a097865470 100644 --- a/drivers/infiniband/core/umem_dmabuf.c +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -110,10 +110,12 @@ void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf) } EXPORT_SYMBOL(ib_umem_dmabuf_unmap_pages); -struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device, - unsigned long offset, size_t size, - int fd, int access, - const struct dma_buf_attach_ops *ops) +static struct ib_umem_dmabuf * +ib_umem_dmabuf_get_with_dma_device(struct ib_device *device, + struct device *dma_device, + unsigned long offset, size_t size, + int fd, int access, + const struct dma_buf_attach_ops *ops) { struct dma_buf *dmabuf; struct ib_umem_dmabuf *umem_dmabuf; @@ -152,7 +154,7 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device, umem_dmabuf->attach = dma_buf_dynamic_attach( dmabuf, - device->dma_device, + dma_device, ops, umem_dmabuf); if (IS_ERR(umem_dmabuf->attach)) { @@ -168,6 +170,15 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device, dma_buf_put(dmabuf); return ret; } + +struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device, + unsigned long offset, size_t size, + int fd, int access, + const struct dma_buf_attach_ops *ops) +{ + return ib_umem_dmabuf_get_with_dma_device(device, device->dma_device, + offset, size, fd, access, ops); +} EXPORT_SYMBOL(ib_umem_dmabuf_get); static void @@ -184,16 +195,18 @@ static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = { .move_notify = ib_umem_dmabuf_unsupported_move_notify, }; -struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device, - unsigned long offset, - size_t size, int fd, - int access) +struct ib_umem_dmabuf * +ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device, + struct device *dma_device, + unsigned long offset, size_t size, + int fd, int access) { struct ib_umem_dmabuf *umem_dmabuf; int err; - umem_dmabuf = ib_umem_dmabuf_get(device, offset, size, fd, access, - &ib_umem_dmabuf_attach_pinned_ops); + umem_dmabuf = ib_umem_dmabuf_get_with_dma_device(device, dma_device, offset, + size, fd, access, + &ib_umem_dmabuf_attach_pinned_ops); if (IS_ERR(umem_dmabuf)) return umem_dmabuf; @@ -217,6 +230,16 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device, ib_umem_release(&umem_dmabuf->umem); return ERR_PTR(err); } +EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned_with_dma_device); + +struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device, + unsigned long offset, + size_t size, int fd, + int access) +{ + return ib_umem_dmabuf_get_pinned_with_dma_device(device, device->dma_device, + offset, size, fd, access); +} EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned); void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf) diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h index 565a850445414..de05268ed6320 100644 --- a/include/rdma/ib_umem.h +++ b/include/rdma/ib_umem.h @@ -150,6 +150,11 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device, unsigned long offset, size_t size, int fd, int access); +struct ib_umem_dmabuf * +ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device, + struct device *dma_device, + unsigned long offset, size_t size, + int fd, int access); int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf); void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf); void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf); @@ -196,6 +201,16 @@ ib_umem_dmabuf_get_pinned(struct ib_device *device, unsigned long offset, { return ERR_PTR(-EOPNOTSUPP); } + +static inline struct ib_umem_dmabuf * +ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device, + struct device *dma_device, + unsigned long offset, size_t size, + int fd, int access) +{ + return ERR_PTR(-EOPNOTSUPP); +} + static inline int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) { return -EOPNOTSUPP; From c3e5d931c58b7345209572d33a4689ff0903bb7d Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:14 +0300 Subject: [PATCH 057/548] RDMA/umem: Introduce an option to revoke DMABUF umem Introduce an option to revoke DMABUF umem. This option will retain the umem allocation while revoking its DMA mapping. Furthermore, any subsequent attempts to map the pages should fail once the umem has been revoked. This functionality will be utilized in the upcoming patches in the series, where we aim to delay umem deallocation until the mkey deregistration. However, we must unmap its pages immediately. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/a38270f2fe4a194868ca2312f4c1c760e51bcbff.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 253c61dc256b3e6be65657f78b4a8452163ce00f) Signed-off-by: Koba Ko --- drivers/infiniband/core/umem_dmabuf.c | 21 +++++++++++++++++++-- include/rdma/ib_umem.h | 3 +++ 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c index 726a097865470..9fcd37761264a 100644 --- a/drivers/infiniband/core/umem_dmabuf.c +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -23,6 +23,9 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); + if (umem_dmabuf->revoked) + return -EINVAL; + if (umem_dmabuf->sgt) goto wait_fence; @@ -242,15 +245,29 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get_pinned(struct ib_device *device, } EXPORT_SYMBOL(ib_umem_dmabuf_get_pinned); -void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf) +void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf) { struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf; dma_resv_lock(dmabuf->resv, NULL); + if (umem_dmabuf->revoked) + goto end; ib_umem_dmabuf_unmap_pages(umem_dmabuf); - if (umem_dmabuf->pinned) + if (umem_dmabuf->pinned) { dma_buf_unpin(umem_dmabuf->attach); + umem_dmabuf->pinned = 0; + } + umem_dmabuf->revoked = 1; +end: dma_resv_unlock(dmabuf->resv); +} +EXPORT_SYMBOL(ib_umem_dmabuf_revoke); + +void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf) +{ + struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf; + + ib_umem_dmabuf_revoke(umem_dmabuf); dma_buf_detach(dmabuf, umem_dmabuf->attach); dma_buf_put(dmabuf); diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h index de05268ed6320..7dc7b1cc71b5a 100644 --- a/include/rdma/ib_umem.h +++ b/include/rdma/ib_umem.h @@ -38,6 +38,7 @@ struct ib_umem_dmabuf { unsigned long last_sg_trim; void *private; u8 pinned : 1; + u8 revoked : 1; }; static inline struct ib_umem_dmabuf *to_ib_umem_dmabuf(struct ib_umem *umem) @@ -158,6 +159,7 @@ ib_umem_dmabuf_get_pinned_with_dma_device(struct ib_device *device, int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf); void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf); void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf); +void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf); #else /* CONFIG_INFINIBAND_USER_MEM */ @@ -217,6 +219,7 @@ static inline int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) } static inline void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf) { } static inline void ib_umem_dmabuf_release(struct ib_umem_dmabuf *umem_dmabuf) { } +static inline void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf) {} #endif /* CONFIG_INFINIBAND_USER_MEM */ #endif /* IB_UMEM_H */ From a82dda92f5e79de49d8416b58249d4a3c28bbfc7 Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:15 +0300 Subject: [PATCH 058/548] RDMA: Pass uverbs_attr_bundle as part of '.reg_user_mr_dmabuf' API Pass uverbs_attr_bundle as part of '.reg_user_mr_dmabuf' API instead of udata. This enables passing some new ioctl attributes to the drivers, as will be introduced in the next patches for mlx5 driver. Change the involved drivers accordingly. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/9a25b2fc02443f7c36c2d93499ae25252b6afd40.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 3aa73c6b795b9aaaf933f3c95495d85fc0de39e3) Signed-off-by: Koba Ko --- drivers/infiniband/core/uverbs_std_types_mr.c | 2 +- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 3 ++- drivers/infiniband/hw/bnxt_re/ib_verbs.h | 2 +- drivers/infiniband/hw/efa/efa.h | 2 +- drivers/infiniband/hw/efa/efa_verbs.c | 4 ++-- drivers/infiniband/hw/irdma/verbs.c | 2 +- drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +- drivers/infiniband/hw/mlx5/mr.c | 2 +- include/rdma/ib_verbs.h | 2 +- 9 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/infiniband/core/uverbs_std_types_mr.c b/drivers/infiniband/core/uverbs_std_types_mr.c index 03e1db5d1e8c3..7ebc7bd3caaea 100644 --- a/drivers/infiniband/core/uverbs_std_types_mr.c +++ b/drivers/infiniband/core/uverbs_std_types_mr.c @@ -239,7 +239,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_REG_DMABUF_MR)( mr = pd->device->ops.reg_user_mr_dmabuf(pd, offset, length, iova, fd, access_flags, - &attrs->driver_udata); + attrs); if (IS_ERR(mr)) return PTR_ERR(mr); diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 2dbd79767341c..5c25b7a9db813 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -4112,7 +4112,8 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *ib_pd, u64 start, u64 length, struct ib_mr *bnxt_re_reg_user_mr_dmabuf(struct ib_pd *ib_pd, u64 start, u64 length, u64 virt_addr, int fd, - int mr_access_flags, struct ib_udata *udata) + int mr_access_flags, + struct uverbs_attr_bundle *attrs) { struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd); struct bnxt_re_dev *rdev = pd->rdev; diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h index 8efd2fc3827e8..8085bd4262b7a 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h @@ -239,7 +239,7 @@ struct ib_mr *bnxt_re_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, struct ib_mr *bnxt_re_reg_user_mr_dmabuf(struct ib_pd *ib_pd, u64 start, u64 length, u64 virt_addr, int fd, int mr_access_flags, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int bnxt_re_alloc_ucontext(struct ib_ucontext *ctx, struct ib_udata *udata); void bnxt_re_dealloc_ucontext(struct ib_ucontext *context); int bnxt_re_mmap(struct ib_ucontext *context, struct vm_area_struct *vma); diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h index 1ffbdd20c4de0..25f4a77ae3d26 100644 --- a/drivers/infiniband/hw/efa/efa.h +++ b/drivers/infiniband/hw/efa/efa.h @@ -157,7 +157,7 @@ struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length, struct ib_mr *efa_reg_user_mr_dmabuf(struct ib_pd *ibpd, u64 start, u64 length, u64 virt_addr, int fd, int access_flags, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int efa_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata); int efa_get_port_immutable(struct ib_device *ibdev, u32 port_num, struct ib_port_immutable *immutable); diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c index e23ee3b0c3644..d452ef48057da 100644 --- a/drivers/infiniband/hw/efa/efa_verbs.c +++ b/drivers/infiniband/hw/efa/efa_verbs.c @@ -1662,14 +1662,14 @@ static int efa_register_mr(struct ib_pd *ibpd, struct efa_mr *mr, u64 start, struct ib_mr *efa_reg_user_mr_dmabuf(struct ib_pd *ibpd, u64 start, u64 length, u64 virt_addr, int fd, int access_flags, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { struct efa_dev *dev = to_edev(ibpd->device); struct ib_umem_dmabuf *umem_dmabuf; struct efa_mr *mr; int err; - mr = efa_alloc_mr(ibpd, access_flags, udata); + mr = efa_alloc_mr(ibpd, access_flags, &attrs->driver_udata); if (IS_ERR(mr)) { err = PTR_ERR(mr); goto err_out; diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 9f3ff5955b12a..5ebfc7d907c34 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -3074,7 +3074,7 @@ static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len, static struct ib_mr *irdma_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start, u64 len, u64 virt, int fd, int access, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { struct irdma_device *iwdev = to_iwdev(pd->device); struct ib_umem_dmabuf *umem_dmabuf; diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 2786ae344a556..08a2454f0158d 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1338,7 +1338,7 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start, u64 length, u64 virt_addr, int fd, int access_flags, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); int mlx5_ib_advise_mr(struct ib_pd *pd, enum ib_uverbs_advise_mr_advice advice, u32 flags, diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 52d6ce9e4e4ae..81b76695fd346 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1544,7 +1544,7 @@ static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = { struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, u64 length, u64 virt_addr, int fd, int access_flags, - struct ib_udata *udata) + struct uverbs_attr_bundle *attrs) { struct mlx5_ib_dev *dev = to_mdev(pd->device); struct mlx5_ib_mr *mr = NULL; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 013aa281d2069..2c21f975cff0c 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2491,7 +2491,7 @@ struct ib_device_ops { struct ib_mr *(*reg_user_mr_dmabuf)(struct ib_pd *pd, u64 offset, u64 length, u64 virt_addr, int fd, int mr_access_flags, - struct ib_udata *udata); + struct uverbs_attr_bundle *attrs); struct ib_mr *(*rereg_user_mr)(struct ib_mr *mr, int flags, u64 start, u64 length, u64 virt_addr, int mr_access_flags, struct ib_pd *pd, From 68bcd3d452cba635058c0c47b0f233f6f76b6d34 Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:16 +0300 Subject: [PATCH 059/548] RDMA/mlx5: Add support for DMABUF MR registrations with Data-direct Add support for DMABUF MR registrations with Data-direct device. Upon userspace calling to register a DMABUF MR with the data direct bit set, the below algorithm will be followed. 1) Obtain a pinned DMABUF umem from the IB core using the user input parameters (FD, offset, length) and the DMA PF device. The DMA PF device is needed to allow the IOMMU to enable the DMA PF to access the user buffer over PCI. 2) Create a KSM MKEY by setting its entries according to the user buffer VA to IOVA mapping, with the MKEY being the data direct device-crossed MKEY. This KSM MKEY is umrable and will be used as part of the MR cache. The PD for creating it is the internal device 'data direct' kernel one. 3) Create a crossing MKEY that points to the KSM MKEY using the crossing access mode. 4) Manage the KSM MKEY by adding it to a list of 'data direct' MKEYs managed on the mlx5_ib device. 5) Return the crossing MKEY to the user, created with its supplied PD. Upon DMA PF unbind flow, the driver will revoke the KSM entries. The final deregistration will occur under the hood once the application deregisters its MKEY. Notes: - This version supports only the PINNED UMEM mode, so there is no dependency on ODP. - The IOVA supplied by the application must be system page aligned due to HW translations of KSM. - The crossing MKEY will not be umrable or part of the MR cache, as we cannot change its crossed (i.e. KSM) MKEY over UMR. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/1f99d8020ed540d9702b9e2252a145a439609ba6.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit de8f847a5114ff7cfcdfc114af8485c431dec703) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 11 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 + drivers/infiniband/hw/mlx5/mr.c | 304 +++++++++++++++++++--- drivers/infiniband/hw/mlx5/odp.c | 5 +- drivers/infiniband/hw/mlx5/umr.c | 93 ++++--- drivers/infiniband/hw/mlx5/umr.h | 1 + include/uapi/rdma/mlx5_user_ioctl_cmds.h | 4 + include/uapi/rdma/mlx5_user_ioctl_verbs.h | 4 + 8 files changed, 358 insertions(+), 72 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 8389628d6a9e8..2f06f522135a9 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3485,6 +3485,7 @@ static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev) if (ret) return ret; + INIT_LIST_HEAD(&dev->data_direct_mr_list); ret = mlx5_data_direct_ib_reg(dev, vuid); if (ret) mlx5_ib_free_data_direct_resources(dev); @@ -3878,6 +3879,14 @@ ADD_UVERBS_ATTRIBUTES_SIMPLE( dump_fill_mkey), UA_MANDATORY)); +ADD_UVERBS_ATTRIBUTES_SIMPLE( + mlx5_ib_reg_dmabuf_mr, + UVERBS_OBJECT_MR, + UVERBS_METHOD_REG_DMABUF_MR, + UVERBS_ATTR_FLAGS_IN(MLX5_IB_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, + enum mlx5_ib_uapi_reg_dmabuf_flags, + UA_OPTIONAL)); + static const struct uapi_definition mlx5_ib_defs[] = { UAPI_DEF_CHAIN(mlx5_ib_devx_defs), UAPI_DEF_CHAIN(mlx5_ib_flow_defs), @@ -3887,6 +3896,7 @@ static const struct uapi_definition mlx5_ib_defs[] = { UAPI_DEF_CHAIN(mlx5_ib_create_cq_defs), UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_DEVICE, &mlx5_ib_query_context), + UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_MR, &mlx5_ib_reg_dmabuf_mr), UAPI_DEF_CHAIN_OBJ_TREE_NAMED(MLX5_IB_OBJECT_VAR, UAPI_DEF_IS_OBJ_SUPPORTED(var_is_supported)), UAPI_DEF_CHAIN_OBJ_TREE_NAMED(MLX5_IB_OBJECT_UAR), @@ -4405,6 +4415,7 @@ void mlx5_ib_data_direct_bind(struct mlx5_ib_dev *ibdev, void mlx5_ib_data_direct_unbind(struct mlx5_ib_dev *ibdev) { mutex_lock(&ibdev->data_direct_lock); + mlx5_ib_revoke_data_direct_mrs(ibdev); ibdev->data_direct_dev = NULL; mutex_unlock(&ibdev->data_direct_lock); } diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 08a2454f0158d..2c962c3762b4b 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -681,6 +681,8 @@ struct mlx5_ib_mr { struct mlx5_ib_mkey mmkey; struct ib_umem *umem; + /* The mr is data direct related */ + u8 data_direct :1; union { /* Used only by kernel MRs (umem == NULL) */ @@ -718,6 +720,10 @@ struct mlx5_ib_mr { } odp_destroy; struct ib_odp_counters odp_stats; bool is_odp_implicit; + /* The affilated data direct crossed mr */ + struct mlx5_ib_mr *dd_crossed_mr; + struct list_head dd_node; + u8 revoked :1; }; }; }; @@ -1153,6 +1159,7 @@ struct mlx5_ib_dev { /* protect resources needed as part of reset flow */ spinlock_t reset_flow_resource_lock; struct list_head qp_list; + struct list_head data_direct_mr_list; /* Array with num_ports elements */ struct mlx5_ib_port *port; struct mlx5_sq_bfreg bfreg; @@ -1422,6 +1429,7 @@ struct ib_mr *mlx5_ib_reg_dm_mr(struct ib_pd *pd, struct ib_dm *dm, void mlx5_ib_data_direct_bind(struct mlx5_ib_dev *ibdev, struct mlx5_data_direct_dev *dev); void mlx5_ib_data_direct_unbind(struct mlx5_ib_dev *ibdev); +void mlx5_ib_revoke_data_direct_mrs(struct mlx5_ib_dev *dev); #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING int mlx5_ib_odp_init_one(struct mlx5_ib_dev *ibdev); diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 81b76695fd346..6ac87cc4168c9 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -43,6 +43,7 @@ #include "dm.h" #include "mlx5_ib.h" #include "umr.h" +#include "data_direct.h" enum { MAX_PENDING_REG_MR = 8, @@ -55,7 +56,9 @@ static void create_mkey_callback(int status, struct mlx5_async_work *context); static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem, u64 iova, int access_flags, - unsigned int page_size, bool populate); + unsigned int page_size, bool populate, + int access_mode); +static int __mlx5_ib_dereg_mr(struct ib_mr *ibmr); static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr, struct ib_pd *pd) @@ -1159,12 +1162,10 @@ static unsigned int mlx5_umem_dmabuf_default_pgsz(struct ib_umem *umem, static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd, struct ib_umem *umem, u64 iova, - int access_flags) + int access_flags, int access_mode) { - struct mlx5r_cache_rb_key rb_key = { - .access_mode = MLX5_MKC_ACCESS_MODE_MTT, - }; struct mlx5_ib_dev *dev = to_mdev(pd->device); + struct mlx5r_cache_rb_key rb_key = {}; struct mlx5_cache_ent *ent; struct mlx5_ib_mr *mr; unsigned int page_size; @@ -1177,6 +1178,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd, if (WARN_ON(!page_size)) return ERR_PTR(-EINVAL); + rb_key.access_mode = access_mode; rb_key.ndescs = ib_umem_num_dma_blocks(umem, page_size); rb_key.ats = mlx5_umem_needs_ats(dev, umem, access_flags); rb_key.access_flags = get_unchangeable_access_flags(dev, access_flags); @@ -1187,7 +1189,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd, */ if (!ent) { mutex_lock(&dev->slow_path_mutex); - mr = reg_create(pd, umem, iova, access_flags, page_size, false); + mr = reg_create(pd, umem, iova, access_flags, page_size, false, access_mode); mutex_unlock(&dev->slow_path_mutex); if (IS_ERR(mr)) return mr; @@ -1207,13 +1209,71 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd, return mr; } +static struct ib_mr * +reg_create_crossing_vhca_mr(struct ib_pd *pd, u64 iova, u64 length, int access_flags, + u32 crossed_lkey) +{ + struct mlx5_ib_dev *dev = to_mdev(pd->device); + int access_mode = MLX5_MKC_ACCESS_MODE_CROSSING; + struct mlx5_ib_mr *mr; + void *mkc; + int inlen; + u32 *in; + int err; + + if (!MLX5_CAP_GEN(dev->mdev, crossing_vhca_mkey)) + return ERR_PTR(-EOPNOTSUPP); + + mr = kzalloc(sizeof(*mr), GFP_KERNEL); + if (!mr) + return ERR_PTR(-ENOMEM); + + inlen = MLX5_ST_SZ_BYTES(create_mkey_in); + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) { + err = -ENOMEM; + goto err_1; + } + + mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); + MLX5_SET(mkc, mkc, crossing_target_vhca_id, + MLX5_CAP_GEN(dev->mdev, vhca_id)); + MLX5_SET(mkc, mkc, translations_octword_size, crossed_lkey); + MLX5_SET(mkc, mkc, access_mode_1_0, access_mode & 0x3); + MLX5_SET(mkc, mkc, access_mode_4_2, (access_mode >> 2) & 0x7); + + /* for this crossing mkey IOVA should be 0 and len should be IOVA + len */ + set_mkc_access_pd_addr_fields(mkc, access_flags, 0, pd); + MLX5_SET64(mkc, mkc, len, iova + length); + + MLX5_SET(mkc, mkc, free, 0); + MLX5_SET(mkc, mkc, umr_en, 0); + err = mlx5_ib_create_mkey(dev, &mr->mmkey, in, inlen); + if (err) + goto err_2; + + mr->mmkey.type = MLX5_MKEY_MR; + set_mr_fields(dev, mr, length, access_flags, iova); + mr->ibmr.pd = pd; + kvfree(in); + mlx5_ib_dbg(dev, "crossing mkey = 0x%x\n", mr->mmkey.key); + + return &mr->ibmr; +err_2: + kvfree(in); +err_1: + kfree(mr); + return ERR_PTR(err); +} + /* * If ibmr is NULL it will be allocated by reg_create. * Else, the given ibmr will be used. */ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem, u64 iova, int access_flags, - unsigned int page_size, bool populate) + unsigned int page_size, bool populate, + int access_mode) { struct mlx5_ib_dev *dev = to_mdev(pd->device); struct mlx5_ib_mr *mr; @@ -1222,7 +1282,9 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem, int inlen; u32 *in; int err; - bool pg_cap = !!(MLX5_CAP_GEN(dev->mdev, pg)); + bool pg_cap = !!(MLX5_CAP_GEN(dev->mdev, pg)) && + (access_mode == MLX5_MKC_ACCESS_MODE_MTT); + bool ksm_mode = (access_mode == MLX5_MKC_ACCESS_MODE_KSM); if (!page_size) return ERR_PTR(-EINVAL); @@ -1245,7 +1307,7 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem, } pas = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); if (populate) { - if (WARN_ON(access_flags & IB_ACCESS_ON_DEMAND)) { + if (WARN_ON(access_flags & IB_ACCESS_ON_DEMAND || ksm_mode)) { err = -EINVAL; goto err_2; } @@ -1261,14 +1323,22 @@ static struct mlx5_ib_mr *reg_create(struct ib_pd *pd, struct ib_umem *umem, mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); set_mkc_access_pd_addr_fields(mkc, access_flags, iova, populate ? pd : dev->umrc.pd); + /* In case a data direct flow, overwrite the pdn field by its internal kernel PD */ + if (umem->is_dmabuf && ksm_mode) + MLX5_SET(mkc, mkc, pd, dev->ddr.pdn); + MLX5_SET(mkc, mkc, free, !populate); - MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); + MLX5_SET(mkc, mkc, access_mode_1_0, access_mode); MLX5_SET(mkc, mkc, umr_en, 1); MLX5_SET64(mkc, mkc, len, umem->length); MLX5_SET(mkc, mkc, bsf_octword_size, 0); - MLX5_SET(mkc, mkc, translations_octword_size, - get_octo_len(iova, umem->length, mr->page_shift)); + if (ksm_mode) + MLX5_SET(mkc, mkc, translations_octword_size, + get_octo_len(iova, umem->length, mr->page_shift) * 2); + else + MLX5_SET(mkc, mkc, translations_octword_size, + get_octo_len(iova, umem->length, mr->page_shift)); MLX5_SET(mkc, mkc, log_page_size, mr->page_shift); if (mlx5_umem_needs_ats(dev, umem, access_flags)) MLX5_SET(mkc, mkc, ma_translation_mode, 1); @@ -1404,13 +1474,15 @@ static struct ib_mr *create_real_mr(struct ib_pd *pd, struct ib_umem *umem, xlt_with_umr = mlx5r_umr_can_load_pas(dev, umem->length); if (xlt_with_umr) { - mr = alloc_cacheable_mr(pd, umem, iova, access_flags); + mr = alloc_cacheable_mr(pd, umem, iova, access_flags, + MLX5_MKC_ACCESS_MODE_MTT); } else { unsigned int page_size = mlx5_umem_find_best_pgsz( umem, mkc, log_page_size, 0, iova); mutex_lock(&dev->slow_path_mutex); - mr = reg_create(pd, umem, iova, access_flags, page_size, true); + mr = reg_create(pd, umem, iova, access_flags, page_size, + true, MLX5_MKC_ACCESS_MODE_MTT); mutex_unlock(&dev->slow_path_mutex); } if (IS_ERR(mr)) { @@ -1473,7 +1545,8 @@ static struct ib_mr *create_user_odp_mr(struct ib_pd *pd, u64 start, u64 length, if (IS_ERR(odp)) return ERR_CAST(odp); - mr = alloc_cacheable_mr(pd, &odp->umem, iova, access_flags); + mr = alloc_cacheable_mr(pd, &odp->umem, iova, access_flags, + MLX5_MKC_ACCESS_MODE_MTT); if (IS_ERR(mr)) { ib_umem_release(&odp->umem); return ERR_CAST(mr); @@ -1541,35 +1614,31 @@ static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = { .move_notify = mlx5_ib_dmabuf_invalidate_cb, }; -struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, - u64 length, u64 virt_addr, - int fd, int access_flags, - struct uverbs_attr_bundle *attrs) +static struct ib_mr * +reg_user_mr_dmabuf(struct ib_pd *pd, struct device *dma_device, + u64 offset, u64 length, u64 virt_addr, + int fd, int access_flags, int access_mode) { + bool pinned_mode = (access_mode == MLX5_MKC_ACCESS_MODE_KSM); struct mlx5_ib_dev *dev = to_mdev(pd->device); struct mlx5_ib_mr *mr = NULL; struct ib_umem_dmabuf *umem_dmabuf; int err; - if (!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM) || - !IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) - return ERR_PTR(-EOPNOTSUPP); - - mlx5_ib_dbg(dev, - "offset 0x%llx, virt_addr 0x%llx, length 0x%llx, fd %d, access_flags 0x%x\n", - offset, virt_addr, length, fd, access_flags); - err = mlx5r_umr_resource_init(dev); if (err) return ERR_PTR(err); - /* dmabuf requires xlt update via umr to work. */ - if (!mlx5r_umr_can_load_pas(dev, length)) - return ERR_PTR(-EINVAL); + if (!pinned_mode) + umem_dmabuf = ib_umem_dmabuf_get(&dev->ib_dev, + offset, length, fd, + access_flags, + &mlx5_ib_dmabuf_attach_ops); + else + umem_dmabuf = ib_umem_dmabuf_get_pinned_with_dma_device(&dev->ib_dev, + dma_device, offset, length, + fd, access_flags); - umem_dmabuf = ib_umem_dmabuf_get(&dev->ib_dev, offset, length, fd, - access_flags, - &mlx5_ib_dmabuf_attach_ops); if (IS_ERR(umem_dmabuf)) { mlx5_ib_dbg(dev, "umem_dmabuf get failed (%ld)\n", PTR_ERR(umem_dmabuf)); @@ -1577,7 +1646,7 @@ struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, } mr = alloc_cacheable_mr(pd, &umem_dmabuf->umem, virt_addr, - access_flags); + access_flags, access_mode); if (IS_ERR(mr)) { ib_umem_release(&umem_dmabuf->umem); return ERR_CAST(mr); @@ -1587,9 +1656,13 @@ struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, atomic_add(ib_umem_num_pages(mr->umem), &dev->mdev->priv.reg_pages); umem_dmabuf->private = mr; - err = mlx5r_store_odp_mkey(dev, &mr->mmkey); - if (err) - goto err_dereg_mr; + if (!pinned_mode) { + err = mlx5r_store_odp_mkey(dev, &mr->mmkey); + if (err) + goto err_dereg_mr; + } else { + mr->data_direct = true; + } err = mlx5_ib_init_dmabuf_mr(mr); if (err) @@ -1597,10 +1670,101 @@ struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, return &mr->ibmr; err_dereg_mr: - mlx5_ib_dereg_mr(&mr->ibmr, NULL); + __mlx5_ib_dereg_mr(&mr->ibmr); return ERR_PTR(err); } +static struct ib_mr * +reg_user_mr_dmabuf_by_data_direct(struct ib_pd *pd, u64 offset, + u64 length, u64 virt_addr, + int fd, int access_flags) +{ + struct mlx5_ib_dev *dev = to_mdev(pd->device); + struct mlx5_data_direct_dev *data_direct_dev; + struct ib_mr *crossing_mr; + struct ib_mr *crossed_mr; + int ret = 0; + + /* As of HW behaviour the IOVA must be page aligned in KSM mode */ + if (!PAGE_ALIGNED(virt_addr) || (access_flags & IB_ACCESS_ON_DEMAND)) + return ERR_PTR(-EOPNOTSUPP); + + mutex_lock(&dev->data_direct_lock); + data_direct_dev = dev->data_direct_dev; + if (!data_direct_dev) { + ret = -EINVAL; + goto end; + } + + /* The device's 'data direct mkey' was created without RO flags to + * simplify things and allow for a single mkey per device. + * Since RO is not a must, mask it out accordingly. + */ + access_flags &= ~IB_ACCESS_RELAXED_ORDERING; + crossed_mr = reg_user_mr_dmabuf(pd, &data_direct_dev->pdev->dev, + offset, length, virt_addr, fd, + access_flags, MLX5_MKC_ACCESS_MODE_KSM); + if (IS_ERR(crossed_mr)) { + ret = PTR_ERR(crossed_mr); + goto end; + } + + mutex_lock(&dev->slow_path_mutex); + crossing_mr = reg_create_crossing_vhca_mr(pd, virt_addr, length, access_flags, + crossed_mr->lkey); + mutex_unlock(&dev->slow_path_mutex); + if (IS_ERR(crossing_mr)) { + __mlx5_ib_dereg_mr(crossed_mr); + ret = PTR_ERR(crossing_mr); + goto end; + } + + list_add_tail(&to_mmr(crossed_mr)->dd_node, &dev->data_direct_mr_list); + to_mmr(crossing_mr)->dd_crossed_mr = to_mmr(crossed_mr); + to_mmr(crossing_mr)->data_direct = true; +end: + mutex_unlock(&dev->data_direct_lock); + return ret ? ERR_PTR(ret) : crossing_mr; +} + +struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 offset, + u64 length, u64 virt_addr, + int fd, int access_flags, + struct uverbs_attr_bundle *attrs) +{ + struct mlx5_ib_dev *dev = to_mdev(pd->device); + int mlx5_access_flags = 0; + int err; + + if (!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM) || + !IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) + return ERR_PTR(-EOPNOTSUPP); + + if (uverbs_attr_is_valid(attrs, MLX5_IB_ATTR_REG_DMABUF_MR_ACCESS_FLAGS)) { + err = uverbs_get_flags32(&mlx5_access_flags, attrs, + MLX5_IB_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, + MLX5_IB_UAPI_REG_DMABUF_ACCESS_DATA_DIRECT); + if (err) + return ERR_PTR(err); + } + + mlx5_ib_dbg(dev, + "offset 0x%llx, virt_addr 0x%llx, length 0x%llx, fd %d, access_flags 0x%x, mlx5_access_flags 0x%x\n", + offset, virt_addr, length, fd, access_flags, mlx5_access_flags); + + /* dmabuf requires xlt update via umr to work. */ + if (!mlx5r_umr_can_load_pas(dev, length)) + return ERR_PTR(-EINVAL); + + if (mlx5_access_flags & MLX5_IB_UAPI_REG_DMABUF_ACCESS_DATA_DIRECT) + return reg_user_mr_dmabuf_by_data_direct(pd, offset, length, virt_addr, + fd, access_flags); + + return reg_user_mr_dmabuf(pd, pd->device->dma_device, + offset, length, virt_addr, + fd, access_flags, MLX5_MKC_ACCESS_MODE_MTT); +} + /* * True if the change in access flags can be done via UMR, only some access * flags can be updated. @@ -1696,7 +1860,7 @@ struct ib_mr *mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start, struct mlx5_ib_mr *mr = to_mmr(ib_mr); int err; - if (!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM)) + if (!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM) || mr->data_direct) return ERR_PTR(-EOPNOTSUPP); mlx5_ib_dbg( @@ -1824,7 +1988,7 @@ mlx5_alloc_priv_descs(struct ib_device *device, static void mlx5_free_priv_descs(struct mlx5_ib_mr *mr) { - if (!mr->umem && mr->descs) { + if (!mr->umem && !mr->data_direct && mr->descs) { struct ib_device *device = mr->ibmr.device; int size = mr->max_descs * mr->desc_size; struct mlx5_ib_dev *dev = to_mdev(device); @@ -1879,7 +2043,35 @@ static int cache_ent_find_and_store(struct mlx5_ib_dev *dev, return ret; } -int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) +static int mlx5_ib_revoke_data_direct_mr(struct mlx5_ib_mr *mr) +{ + struct mlx5_ib_dev *dev = to_mdev(mr->ibmr.device); + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem); + int err; + + lockdep_assert_held(&dev->data_direct_lock); + mr->revoked = true; + err = mlx5r_umr_revoke_mr(mr); + if (WARN_ON(err)) + return err; + + ib_umem_dmabuf_revoke(umem_dmabuf); + return 0; +} + +void mlx5_ib_revoke_data_direct_mrs(struct mlx5_ib_dev *dev) +{ + struct mlx5_ib_mr *mr, *next; + + lockdep_assert_held(&dev->data_direct_lock); + + list_for_each_entry_safe(mr, next, &dev->data_direct_mr_list, dd_node) { + list_del(&mr->dd_node); + mlx5_ib_revoke_data_direct_mr(mr); + } +} + +static int __mlx5_ib_dereg_mr(struct ib_mr *ibmr) { struct mlx5_ib_mr *mr = to_mmr(ibmr); struct mlx5_ib_dev *dev = to_mdev(ibmr->device); @@ -1953,6 +2145,36 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) return 0; } +static int dereg_crossing_data_direct_mr(struct mlx5_ib_dev *dev, + struct mlx5_ib_mr *mr) +{ + struct mlx5_ib_mr *dd_crossed_mr = mr->dd_crossed_mr; + int ret; + + ret = __mlx5_ib_dereg_mr(&mr->ibmr); + if (ret) + return ret; + + mutex_lock(&dev->data_direct_lock); + if (!dd_crossed_mr->revoked) + list_del(&dd_crossed_mr->dd_node); + + ret = __mlx5_ib_dereg_mr(&dd_crossed_mr->ibmr); + mutex_unlock(&dev->data_direct_lock); + return ret; +} + +int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) +{ + struct mlx5_ib_mr *mr = to_mmr(ibmr); + struct mlx5_ib_dev *dev = to_mdev(ibmr->device); + + if (mr->data_direct) + return dereg_crossing_data_direct_mr(dev, mr); + + return __mlx5_ib_dereg_mr(ibmr); +} + static void mlx5_set_umr_free_mkey(struct ib_pd *pd, u32 *in, int ndescs, int access_mode, int page_shift) { diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 7ad5db46ffce7..c6c5fb263e220 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -712,7 +712,10 @@ static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt, ib_umem_dmabuf_unmap_pages(umem_dmabuf); err = -EINVAL; } else { - err = mlx5r_umr_update_mr_pas(mr, xlt_flags); + if (mr->data_direct) + err = mlx5r_umr_update_data_direct_ksm_pas(mr, xlt_flags); + else + err = mlx5r_umr_update_mr_pas(mr, xlt_flags); } dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c index e7e049b2fceb8..477fa83e1308c 100644 --- a/drivers/infiniband/hw/mlx5/umr.c +++ b/drivers/infiniband/hw/mlx5/umr.c @@ -632,44 +632,47 @@ static void mlx5r_umr_final_update_xlt(struct mlx5_ib_dev *dev, wqe->data_seg.byte_count = cpu_to_be32(sg->length); } -/* - * Send the DMA list to the HW for a normal MR using UMR. - * Dmabuf MR is handled in a similar way, except that the MLX5_IB_UPD_XLT_ZAP - * flag may be used. - */ -int mlx5r_umr_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) +static int +_mlx5r_umr_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags, bool dd) { + size_t ent_size = dd ? sizeof(struct mlx5_ksm) : sizeof(struct mlx5_mtt); struct mlx5_ib_dev *dev = mr_to_mdev(mr); struct device *ddev = &dev->mdev->pdev->dev; struct mlx5r_umr_wqe wqe = {}; struct ib_block_iter biter; + struct mlx5_ksm *cur_ksm; struct mlx5_mtt *cur_mtt; size_t orig_sg_length; - struct mlx5_mtt *mtt; size_t final_size; + void *curr_entry; struct ib_sge sg; + void *entry; u64 offset = 0; int err = 0; - if (WARN_ON(mr->umem->is_odp)) - return -EINVAL; - - mtt = mlx5r_umr_create_xlt( - dev, &sg, ib_umem_num_dma_blocks(mr->umem, 1 << mr->page_shift), - sizeof(*mtt), flags); - if (!mtt) + entry = mlx5r_umr_create_xlt(dev, &sg, + ib_umem_num_dma_blocks(mr->umem, 1 << mr->page_shift), + ent_size, flags); + if (!entry) return -ENOMEM; orig_sg_length = sg.length; - mlx5r_umr_set_update_xlt_ctrl_seg(&wqe.ctrl_seg, flags, &sg); mlx5r_umr_set_update_xlt_mkey_seg(dev, &wqe.mkey_seg, mr, mr->page_shift); + if (dd) { + /* Use the data direct internal kernel PD */ + MLX5_SET(mkc, &wqe.mkey_seg, pd, dev->ddr.pdn); + cur_ksm = entry; + } else { + cur_mtt = entry; + } + mlx5r_umr_set_update_xlt_data_seg(&wqe.data_seg, &sg); - cur_mtt = mtt; + curr_entry = entry; rdma_umem_for_each_dma_block(mr->umem, &biter, BIT(mr->page_shift)) { - if (cur_mtt == (void *)mtt + sg.length) { + if (curr_entry == entry + sg.length) { dma_sync_single_for_device(ddev, sg.addr, sg.length, DMA_TO_DEVICE); @@ -681,23 +684,31 @@ int mlx5r_umr_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) DMA_TO_DEVICE); offset += sg.length; mlx5r_umr_update_offset(&wqe.ctrl_seg, offset); - - cur_mtt = mtt; + if (dd) + cur_ksm = entry; + else + cur_mtt = entry; } - cur_mtt->ptag = - cpu_to_be64(rdma_block_iter_dma_address(&biter) | - MLX5_IB_MTT_PRESENT); - - if (mr->umem->is_dmabuf && (flags & MLX5_IB_UPD_XLT_ZAP)) - cur_mtt->ptag = 0; - - cur_mtt++; + if (dd) { + cur_ksm->va = cpu_to_be64(rdma_block_iter_dma_address(&biter)); + cur_ksm->key = cpu_to_be32(dev->ddr.mkey); + cur_ksm++; + curr_entry = cur_ksm; + } else { + cur_mtt->ptag = + cpu_to_be64(rdma_block_iter_dma_address(&biter) | + MLX5_IB_MTT_PRESENT); + if (mr->umem->is_dmabuf && (flags & MLX5_IB_UPD_XLT_ZAP)) + cur_mtt->ptag = 0; + cur_mtt++; + curr_entry = cur_mtt; + } } - final_size = (void *)cur_mtt - (void *)mtt; + final_size = curr_entry - entry; sg.length = ALIGN(final_size, MLX5_UMR_FLEX_ALIGNMENT); - memset(cur_mtt, 0, sg.length - final_size); + memset(curr_entry, 0, sg.length - final_size); mlx5r_umr_final_update_xlt(dev, &wqe, mr, &sg, flags); dma_sync_single_for_device(ddev, sg.addr, sg.length, DMA_TO_DEVICE); @@ -705,10 +716,32 @@ int mlx5r_umr_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) err: sg.length = orig_sg_length; - mlx5r_umr_unmap_free_xlt(dev, mtt, &sg); + mlx5r_umr_unmap_free_xlt(dev, entry, &sg); return err; } +int mlx5r_umr_update_data_direct_ksm_pas(struct mlx5_ib_mr *mr, unsigned int flags) +{ + /* No invalidation flow is expected */ + if (WARN_ON(!mr->umem->is_dmabuf) || (flags & MLX5_IB_UPD_XLT_ZAP)) + return -EINVAL; + + return _mlx5r_umr_update_mr_pas(mr, flags, true); +} + +/* + * Send the DMA list to the HW for a normal MR using UMR. + * Dmabuf MR is handled in a similar way, except that the MLX5_IB_UPD_XLT_ZAP + * flag may be used. + */ +int mlx5r_umr_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) +{ + if (WARN_ON(mr->umem->is_odp)) + return -EINVAL; + + return _mlx5r_umr_update_mr_pas(mr, flags, false); +} + static bool umr_can_use_indirect_mkey(struct mlx5_ib_dev *dev) { return !MLX5_CAP_GEN(dev->mdev, umr_indirect_mkey_disabled); diff --git a/drivers/infiniband/hw/mlx5/umr.h b/drivers/infiniband/hw/mlx5/umr.h index 5f734dc72befe..4a02c9b5aad86 100644 --- a/drivers/infiniband/hw/mlx5/umr.h +++ b/drivers/infiniband/hw/mlx5/umr.h @@ -95,6 +95,7 @@ int mlx5r_umr_revoke_mr(struct mlx5_ib_mr *mr); int mlx5r_umr_rereg_pd_access(struct mlx5_ib_mr *mr, struct ib_pd *pd, int access_flags); int mlx5r_umr_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags); +int mlx5r_umr_update_data_direct_ksm_pas(struct mlx5_ib_mr *mr, unsigned int flags); int mlx5r_umr_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages, int page_shift, int flags); diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h index 5b74d65348995..106276a4cce7a 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h +++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h @@ -274,6 +274,10 @@ enum mlx5_ib_create_cq_attrs { MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX = UVERBS_ID_DRIVER_NS_WITH_UHW, }; +enum mlx5_ib_reg_dmabuf_mr_attrs { + MLX5_IB_ATTR_REG_DMABUF_MR_ACCESS_FLAGS = (1U << UVERBS_ID_NS_SHIFT), +}; + #define MLX5_IB_DW_MATCH_PARAM 0xA0 struct mlx5_ib_match_params { diff --git a/include/uapi/rdma/mlx5_user_ioctl_verbs.h b/include/uapi/rdma/mlx5_user_ioctl_verbs.h index 7af9e09ea556a..eef75840a0f98 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_verbs.h +++ b/include/uapi/rdma/mlx5_user_ioctl_verbs.h @@ -54,6 +54,10 @@ enum mlx5_ib_uapi_flow_action_packet_reformat_type { MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL = 0x3, }; +enum mlx5_ib_uapi_reg_dmabuf_flags { + MLX5_IB_UAPI_REG_DMABUF_ACCESS_DATA_DIRECT = 1 << 0, +}; + struct mlx5_ib_uapi_devx_async_cmd_hdr { __aligned_u64 wr_id; __u8 out_data[]; From 4e4bc07488e7f5a15438de3681a50eef8ee3c24d Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Thu, 1 Aug 2024 15:05:17 +0300 Subject: [PATCH 060/548] RDMA/mlx5: Introduce GET_DATA_DIRECT_SYSFS_PATH ioctl Introduce the 'GET_DATA_DIRECT_SYSFS_PATH' ioctl to return the sysfs path of the affiliated 'data direct' device for a given device. Signed-off-by: Yishai Hadas Link: https://patch.msgid.link/403745463e0ef52adbef681ff09aa6a29a756352.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit ec7ad6530909983c8736c80af46e3529ce7bab55) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/std_types.c | 55 +++++++++++++++++++++++- include/uapi/rdma/mlx5_user_ioctl_cmds.h | 5 +++ 2 files changed, 59 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx5/std_types.c b/drivers/infiniband/hw/mlx5/std_types.c index bbfcce3bdc84e..ffeb1e1a15389 100644 --- a/drivers/infiniband/hw/mlx5/std_types.c +++ b/drivers/infiniband/hw/mlx5/std_types.c @@ -10,6 +10,7 @@ #include #include #include "mlx5_ib.h" +#include "data_direct.h" #define UVERBS_MODULE_NAME mlx5_ib #include @@ -183,6 +184,50 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_QUERY_PORT)( sizeof(info)); } +static int UVERBS_HANDLER(MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH)( + struct uverbs_attr_bundle *attrs) +{ + struct mlx5_data_direct_dev *data_direct_dev; + struct mlx5_ib_ucontext *c; + struct mlx5_ib_dev *dev; + int out_len = uverbs_attr_get_len(attrs, + MLX5_IB_ATTR_GET_DATA_DIRECT_SYSFS_PATH); + u32 dev_path_len; + char *dev_path; + int ret; + + c = to_mucontext(ib_uverbs_get_ucontext(attrs)); + if (IS_ERR(c)) + return PTR_ERR(c); + dev = to_mdev(c->ibucontext.device); + mutex_lock(&dev->data_direct_lock); + data_direct_dev = dev->data_direct_dev; + if (!data_direct_dev) { + ret = -ENODEV; + goto end; + } + + dev_path = kobject_get_path(&data_direct_dev->device->kobj, GFP_KERNEL); + if (!dev_path) { + ret = -ENOMEM; + goto end; + } + + dev_path_len = strlen(dev_path) + 1; + if (dev_path_len > out_len) { + ret = -ENOSPC; + goto end; + } + + ret = uverbs_copy_to(attrs, MLX5_IB_ATTR_GET_DATA_DIRECT_SYSFS_PATH, dev_path, + dev_path_len); + kfree(dev_path); + +end: + mutex_unlock(&dev->data_direct_lock); + return ret; +} + DECLARE_UVERBS_NAMED_METHOD( MLX5_IB_METHOD_QUERY_PORT, UVERBS_ATTR_PTR_IN(MLX5_IB_ATTR_QUERY_PORT_PORT_NUM, @@ -193,9 +238,17 @@ DECLARE_UVERBS_NAMED_METHOD( reg_c0), UA_MANDATORY)); +DECLARE_UVERBS_NAMED_METHOD( + MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH, + UVERBS_ATTR_PTR_OUT( + MLX5_IB_ATTR_GET_DATA_DIRECT_SYSFS_PATH, + UVERBS_ATTR_MIN_SIZE(0), + UA_MANDATORY)); + ADD_UVERBS_METHODS(mlx5_ib_device, UVERBS_OBJECT_DEVICE, - &UVERBS_METHOD(MLX5_IB_METHOD_QUERY_PORT)); + &UVERBS_METHOD(MLX5_IB_METHOD_QUERY_PORT), + &UVERBS_METHOD(MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH)); DECLARE_UVERBS_NAMED_METHOD( MLX5_IB_METHOD_PD_QUERY, diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h index 106276a4cce7a..fd2e4a3a56b36 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h +++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h @@ -348,6 +348,7 @@ enum mlx5_ib_pd_methods { enum mlx5_ib_device_methods { MLX5_IB_METHOD_QUERY_PORT = (1U << UVERBS_ID_NS_SHIFT), + MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH, }; enum mlx5_ib_query_port_attrs { @@ -355,4 +356,8 @@ enum mlx5_ib_query_port_attrs { MLX5_IB_ATTR_QUERY_PORT, }; +enum mlx5_ib_get_data_direct_sysfs_path_attrs { + MLX5_IB_ATTR_GET_DATA_DIRECT_SYSFS_PATH = (1U << UVERBS_ID_NS_SHIFT), +}; + #endif From d7ff21a86ddf5f17ddb0b1b7349ea1f1dc4b0a29 Mon Sep 17 00:00:00 2001 From: Yishai Hadas Date: Mon, 9 Sep 2024 21:47:33 +0300 Subject: [PATCH 061/548] RDMA/mlx5: Consider the query_vuid cap for data_direct Consider also the query_vuid cap before enabling the data_direct functionality. This may prevent a syndrome from the FW in case the query_vuid command is not supported. (e.g. migratable VF) Signed-off-by: Yishai Hadas Reviewed-by: Gal Shalom Link: https://patch.msgid.link/274c4f6f1ac0b1078243dd296695a49dbe58e7d1.1725907637.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky (cherry picked from commit c77aec65e828bd82726f664585e3bb425d17be7f) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 2f06f522135a9..eba7af9651a09 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -3474,7 +3474,8 @@ static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev) char vuid[MLX5_ST_SZ_BYTES(array1024_auto) + 1] = {}; int ret; - if (!MLX5_CAP_GEN(dev->mdev, data_direct)) + if (!MLX5_CAP_GEN(dev->mdev, data_direct) || + !MLX5_CAP_GEN_2(dev->mdev, query_vuid)) return 0; ret = mlx5_cmd_query_vuid(dev->mdev, true, vuid); @@ -3495,7 +3496,8 @@ static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev) static void mlx5_ib_data_direct_cleanup(struct mlx5_ib_dev *dev) { - if (!MLX5_CAP_GEN(dev->mdev, data_direct)) + if (!MLX5_CAP_GEN(dev->mdev, data_direct) || + !MLX5_CAP_GEN_2(dev->mdev, query_vuid)) return; mlx5_data_direct_ib_unreg(dev); From 3fbd6f2b44190cb68257de1686db6311a98d0c52 Mon Sep 17 00:00:00 2001 From: Edward Srouji Date: Tue, 3 Sep 2024 14:37:51 +0300 Subject: [PATCH 062/548] net/mlx5: Introduce data placement ordering bits Introduce out-of-order (OOO) data placement (DP) IFC related bits to support OOO DP QP. Signed-off-by: Edward Srouji Reviewed-by: Yishai Hadas Link: https://patch.msgid.link/f30e5cbb5459fd02f27f35909bb545cab346b58b.1725362773.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 8ab3138a9b2dcb0ddf281240cf8cba414eb1224a) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 9f4cc06eb670e..b0a81d133e7c1 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1732,7 +1732,12 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_328[0x2]; u8 relaxed_ordering_read[0x1]; u8 log_max_pd[0x5]; - u8 reserved_at_330[0x6]; + u8 reserved_at_330[0x1]; + u8 dp_ordering_ooo_all_ud[0x1]; + u8 dp_ordering_ooo_all_uc[0x1]; + u8 dp_ordering_ooo_all_xrc[0x1]; + u8 dp_ordering_ooo_all_dc[0x1]; + u8 dp_ordering_ooo_all_rc[0x1]; u8 pci_sync_for_fw_update_with_driver_unload[0x1]; u8 vnic_env_cnt_steering_fail[0x1]; u8 vport_counter_local_loopback[0x1]; @@ -1943,7 +1948,9 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_0[0x80]; u8 migratable[0x1]; - u8 reserved_at_81[0x11]; + u8 reserved_at_81[0x7]; + u8 dp_ordering_force[0x1]; + u8 reserved_at_89[0x9]; u8 query_vuid[0x1]; u8 reserved_at_93[0xd]; @@ -3322,7 +3329,8 @@ struct mlx5_ifc_qpc_bits { u8 latency_sensitive[0x1]; u8 reserved_at_24[0x1]; u8 drain_sigerr[0x1]; - u8 reserved_at_26[0x2]; + u8 reserved_at_26[0x1]; + u8 dp_ordering_force[0x1]; u8 pd[0x18]; u8 mtu[0x3]; @@ -3395,7 +3403,8 @@ struct mlx5_ifc_qpc_bits { u8 rae[0x1]; u8 reserved_at_493[0x1]; u8 page_offset[0x6]; - u8 reserved_at_49a[0x3]; + u8 reserved_at_49a[0x2]; + u8 dp_ordering_1[0x1]; u8 cd_slave_receive[0x1]; u8 cd_slave_send[0x1]; u8 cd_master[0x1]; @@ -4295,7 +4304,8 @@ struct mlx5_ifc_dctc_bits { u8 state[0x4]; u8 reserved_at_8[0x18]; - u8 reserved_at_20[0x8]; + u8 reserved_at_20[0x7]; + u8 dp_ordering_force[0x1]; u8 user_index[0x18]; u8 reserved_at_40[0x8]; @@ -4310,7 +4320,9 @@ struct mlx5_ifc_dctc_bits { u8 latency_sensitive[0x1]; u8 rlky[0x1]; u8 free_ar[0x1]; - u8 reserved_at_73[0xd]; + u8 reserved_at_73[0x1]; + u8 dp_ordering_1[0x1]; + u8 reserved_at_75[0xb]; u8 reserved_at_80[0x8]; u8 cs_res[0x8]; From 6a6fd9ad2df20bd695acf0bb13f948fa0731689e Mon Sep 17 00:00:00 2001 From: Edward Srouji Date: Tue, 3 Sep 2024 14:37:52 +0300 Subject: [PATCH 063/548] RDMA/mlx5: Support OOO RX WQE consumption Support QP with out-of-order (OOO) capabilities enabled. This allows WRs on the receiver side of the QP to be consumed OOO, permitting the sender side to transmit messages without guaranteeing arrival order on the receiver side. When enabled, the completion ordering of WRs remains in-order, regardless of the Receive WRs consumption order. RDMA Read and RDMA Atomic operations on the responder side continue to be executed in-order, while the ordering of data placement for RDMA Write and Send operations is not guaranteed. Atomic operations larger than 8 bytes are currently not supported. Therefore, when this feature is enabled, the created QP restricts its atomic support to 8 bytes at most. In addition, when querying the device, a new flag is returned in response to indicate that the Kernel supports OOO QP. Signed-off-by: Edward Srouji Reviewed-by: Yishai Hadas Link: https://patch.msgid.link/06ac609a5f358c8fb0a090d22c61a2f9329d82e6.1725362773.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 8b36f7c3c6618dc97697a6a20a13b29651f68ab6) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/main.c | 8 +++++ drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + drivers/infiniband/hw/mlx5/qp.c | 51 +++++++++++++++++++++++++--- include/uapi/rdma/mlx5-abi.h | 5 +++ 4 files changed, 60 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index eba7af9651a09..8f2cd4286cef7 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1144,6 +1144,14 @@ static int mlx5_ib_query_device(struct ib_device *ibdev, MLX5_IB_QUERY_DEV_RESP_PACKET_BASED_CREDIT_MODE; resp.flags |= MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT; + + if (MLX5_CAP_GEN_2(mdev, dp_ordering_force) && + (MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_xrc) || + MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_dc) || + MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_rc) || + MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_ud) || + MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_uc))) + resp.flags |= MLX5_IB_QUERY_DEV_RESP_FLAGS_OOO_DP; } if (offsetofend(typeof(resp), sw_parsing_caps) <= uhw_outlen) { diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 2c962c3762b4b..47dc4c57ed292 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -532,6 +532,7 @@ struct mlx5_ib_qp { struct mlx5_bf bf; u8 has_rq:1; u8 is_rss:1; + u8 is_ooo_rq:1; /* only for user space QPs. For kernel * we have it from the bf object diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index e389769d17524..75020ad5d85a0 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -1960,7 +1960,7 @@ static int atomic_size_to_mode(int size_mask) } static int get_atomic_mode(struct mlx5_ib_dev *dev, - enum ib_qp_type qp_type) + struct mlx5_ib_qp *qp) { u8 atomic_operations = MLX5_CAP_ATOMIC(dev->mdev, atomic_operations); u8 atomic = MLX5_CAP_GEN(dev->mdev, atomic); @@ -1970,7 +1970,7 @@ static int get_atomic_mode(struct mlx5_ib_dev *dev, if (!atomic) return -EOPNOTSUPP; - if (qp_type == MLX5_IB_QPT_DCT) + if (qp->type == MLX5_IB_QPT_DCT) atomic_size_mask = MLX5_CAP_ATOMIC(dev->mdev, atomic_size_dc); else atomic_size_mask = MLX5_CAP_ATOMIC(dev->mdev, atomic_size_qp); @@ -1984,6 +1984,10 @@ static int get_atomic_mode(struct mlx5_ib_dev *dev, atomic_operations & MLX5_ATOMIC_OPS_FETCH_ADD)) atomic_mode = MLX5_ATOMIC_MODE_IB_COMP; + /* OOO DP QPs do not support larger than 8-Bytes atomic operations */ + if (atomic_mode > MLX5_ATOMIC_MODE_8B && qp->is_ooo_rq) + atomic_mode = MLX5_ATOMIC_MODE_8B; + return atomic_mode; } @@ -2839,6 +2843,29 @@ static int check_valid_flow(struct mlx5_ib_dev *dev, struct ib_pd *pd, return 0; } +static bool get_dp_ooo_cap(struct mlx5_core_dev *mdev, enum ib_qp_type qp_type) +{ + if (!MLX5_CAP_GEN_2(mdev, dp_ordering_force)) + return false; + + switch (qp_type) { + case IB_QPT_RC: + return MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_rc); + case IB_QPT_XRC_INI: + case IB_QPT_XRC_TGT: + return MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_xrc); + case IB_QPT_UC: + return MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_uc); + case IB_QPT_UD: + return MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_ud); + case MLX5_IB_QPT_DCI: + case MLX5_IB_QPT_DCT: + return MLX5_CAP_GEN(mdev, dp_ordering_ooo_all_dc); + default: + return false; + } +} + static void process_vendor_flag(struct mlx5_ib_dev *dev, int *flags, int flag, bool cond, struct mlx5_ib_qp *qp) { @@ -3366,7 +3393,7 @@ static int set_qpc_atomic_flags(struct mlx5_ib_qp *qp, if (access_flags & IB_ACCESS_REMOTE_ATOMIC) { int atomic_mode; - atomic_mode = get_atomic_mode(dev, qp->type); + atomic_mode = get_atomic_mode(dev, qp); if (atomic_mode < 0) return -EOPNOTSUPP; @@ -4317,6 +4344,11 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp, if (qp->flags & MLX5_IB_QP_CREATE_SQPN_QP1) MLX5_SET(qpc, qpc, deth_sqpn, 1); + if (qp->is_ooo_rq && cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) { + MLX5_SET(qpc, qpc, dp_ordering_1, 1); + MLX5_SET(qpc, qpc, dp_ordering_force, 1); + } + mlx5_cur = to_mlx5_state(cur_state); mlx5_new = to_mlx5_state(new_state); @@ -4532,7 +4564,7 @@ static int mlx5_ib_modify_dct(struct ib_qp *ibqp, struct ib_qp_attr *attr, if (attr->qp_access_flags & IB_ACCESS_REMOTE_ATOMIC) { int atomic_mode; - atomic_mode = get_atomic_mode(dev, MLX5_IB_QPT_DCT); + atomic_mode = get_atomic_mode(dev, qp); if (atomic_mode < 0) return -EOPNOTSUPP; @@ -4576,6 +4608,10 @@ static int mlx5_ib_modify_dct(struct ib_qp *ibqp, struct ib_qp_attr *attr, MLX5_SET(dctc, dctc, hop_limit, attr->ah_attr.grh.hop_limit); if (attr->ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE) MLX5_SET(dctc, dctc, eth_prio, attr->ah_attr.sl & 0x7); + if (qp->is_ooo_rq) { + MLX5_SET(dctc, dctc, dp_ordering_1, 1); + MLX5_SET(dctc, dctc, dp_ordering_force, 1); + } err = mlx5_core_create_dct(dev, &qp->dct.mdct, qp->dct.in, MLX5_ST_SZ_BYTES(create_dct_in), out, @@ -4679,11 +4715,16 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, min(udata->inlen, sizeof(ucmd)))) return -EFAULT; - if (ucmd.comp_mask || + if (ucmd.comp_mask & ~MLX5_IB_MODIFY_QP_OOO_DP || memchr_inv(&ucmd.burst_info.reserved, 0, sizeof(ucmd.burst_info.reserved))) return -EOPNOTSUPP; + if (ucmd.comp_mask & MLX5_IB_MODIFY_QP_OOO_DP) { + if (!get_dp_ooo_cap(dev->mdev, qp->type)) + return -EOPNOTSUPP; + qp->is_ooo_rq = 1; + } } if (qp->type == IB_QPT_GSI) diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h index a96b7d2770e15..41271eea7a254 100644 --- a/include/uapi/rdma/mlx5-abi.h +++ b/include/uapi/rdma/mlx5-abi.h @@ -251,6 +251,7 @@ enum mlx5_ib_query_dev_resp_flags { MLX5_IB_QUERY_DEV_RESP_FLAGS_CQE_128B_PAD = 1 << 1, MLX5_IB_QUERY_DEV_RESP_PACKET_BASED_CREDIT_MODE = 1 << 2, MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT = 1 << 3, + MLX5_IB_QUERY_DEV_RESP_FLAGS_OOO_DP = 1 << 4, }; enum mlx5_ib_tunnel_offloads { @@ -437,6 +438,10 @@ struct mlx5_ib_burst_info { __u16 reserved; }; +enum mlx5_ib_modify_qp_mask { + MLX5_IB_MODIFY_QP_OOO_DP = 1 << 0, +}; + struct mlx5_ib_modify_qp { __u32 comp_mask; struct mlx5_ib_burst_info burst_info; From 99467db9fcf712703403dec09dfe0e7cbf7f01a5 Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Thu, 31 Oct 2024 11:48:14 +0200 Subject: [PATCH 064/548] RDMA/mlx5: Support querying per-plane IB PortCounters On a SMI device, set requested plane_num when querying PPCNT register with the PortCounters Attribute group. Signed-off-by: Mark Zhang Reviewed-by: Maher Sanalla Link: https://patch.msgid.link/828d57444a0a41042556bb0a4394ecf2fcaed639.1730368052.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit eb3d354efb39576c86c1e659807c57c557f2f68a) Signed-off-by: Koba Ko --- drivers/infiniband/hw/mlx5/mad.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c index 1b6c5e37d1697..2453ae4384a71 100644 --- a/drivers/infiniband/hw/mlx5/mad.c +++ b/drivers/infiniband/hw/mlx5/mad.c @@ -278,7 +278,13 @@ static int process_pma_cmd(struct mlx5_ib_dev *dev, u32 port_num, goto done; } - err = query_ib_ppcnt(mdev, mdev_port_num, 0, out_cnt, sz, 0); + if (dev->ib_dev.type == RDMA_DEVICE_TYPE_SMI) + err = query_ib_ppcnt(mdev, mdev_port_num, port_num, + out_cnt, sz, 0); + else + err = query_ib_ppcnt(mdev, mdev_port_num, 0, + out_cnt, sz, 0); + if (!err) pma_cnt_assign(pma_cnt, out_cnt); } From 72f6d4abb2e949d6de3436be60e177dd4a6eaf49 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 5 Dec 2024 00:09:22 +0200 Subject: [PATCH 065/548] net/mlx5: Add ConnectX-8 device to ifc In preparation for ConnectX-8 SWS support, add enum for the new device type. Signed-off-by: Yevgeny Kliteynik Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241204220931.254964-3-tariqt@nvidia.com Signed-off-by: Leon Romanovsky (cherry picked from commit e799ac9dd3c485a7cda3586f2a12784b030b9df0) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index b0a81d133e7c1..6b82332aeb01c 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1455,6 +1455,7 @@ enum { MLX5_STEERING_FORMAT_CONNECTX_5 = 0, MLX5_STEERING_FORMAT_CONNECTX_6DX = 1, MLX5_STEERING_FORMAT_CONNECTX_7 = 2, + MLX5_STEERING_FORMAT_CONNECTX_8 = 3, }; struct mlx5_ifc_cmd_hca_cap_bits { From 3a68298fd2e501615109418c4f2e2442d4e6b527 Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:28 +0300 Subject: [PATCH 066/548] net/mlx5: Register mlx5e priv to devcom in MPV mode If the device is in MPV mode, the ethernet driver would now register to events from IB driver about core devices affiliation or de-affiliation. Use the key provided in said event to connect each mlx5e priv instance to it's master counterpart, this way the ethernet driver is now aware of who is his master core device and even more, such as knowing if partner device has IPsec configured or not. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/279adfa0aa3a1957a339086f2c1739a50b8e4b68.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit bf11485f8419f90ffaa3804fd01d8468fcc56e23) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 + .../mellanox/mlx5/core/en_accel/ipsec.h | 2 + .../net/ethernet/mellanox/mlx5/core/en_main.c | 39 +++++++++++++++++++ .../ethernet/mellanox/mlx5/core/lib/devcom.h | 1 + 4 files changed, 43 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 87f95528a4e63..5fa1e5a4e6a56 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -935,6 +935,7 @@ struct mlx5e_priv { struct mlx5e_htb *htb; struct mlx5e_mqprio_rl *mqprio_rl; struct dentry *dfs_root; + struct mlx5_devcom_comp_dev *devcom; }; struct mlx5e_dev { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h index 9e7c42c2f77b2..06743156ffcaa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h @@ -42,6 +42,8 @@ #define MLX5E_IPSEC_SADB_RX_BITS 10 #define MLX5E_IPSEC_ESN_SCOPE_MID 0x80000000L +#define MPV_DEVCOM_MASTER_UP 1 + struct aes_gcm_keymat { u64 seq_iv; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index b95a6b4fab4d1..1df2f42d9d2b4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -69,6 +69,7 @@ #include "en/htb.h" #include "qos.h" #include "en/trap.h" +#include "lib/devcom.h" bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev, u8 page_shift, enum mlx5e_mpwrq_umr_mode umr_mode) @@ -178,6 +179,37 @@ static void mlx5e_disable_async_events(struct mlx5e_priv *priv) mlx5_notifier_unregister(priv->mdev, &priv->events_nb); } +static int mlx5e_devcom_event_mpv(int event, void *my_data, void *event_data) +{ + struct mlx5e_priv *slave_priv = my_data; + + mlx5_devcom_comp_set_ready(slave_priv->devcom, true); + + return 0; +} + +static int mlx5e_devcom_init_mpv(struct mlx5e_priv *priv, u64 *data) +{ + priv->devcom = mlx5_devcom_register_component(priv->mdev->priv.devc, + MLX5_DEVCOM_MPV, + *data, + mlx5e_devcom_event_mpv, + priv); + if (IS_ERR_OR_NULL(priv->devcom)) + return -EOPNOTSUPP; + + if (mlx5_core_is_mp_master(priv->mdev)) + mlx5_devcom_send_event(priv->devcom, MPV_DEVCOM_MASTER_UP, + MPV_DEVCOM_MASTER_UP, priv); + + return 0; +} + +static void mlx5e_devcom_cleanup_mpv(struct mlx5e_priv *priv) +{ + mlx5_devcom_unregister_component(priv->devcom); +} + static int blocking_event(struct notifier_block *nb, unsigned long event, void *data) { struct mlx5e_priv *priv = container_of(nb, struct mlx5e_priv, blocking_events_nb); @@ -192,6 +224,13 @@ static int blocking_event(struct notifier_block *nb, unsigned long event, void * return NOTIFY_BAD; } break; + case MLX5_DRIVER_EVENT_AFFILIATION_DONE: + if (mlx5e_devcom_init_mpv(priv, data)) + return NOTIFY_BAD; + break; + case MLX5_DRIVER_EVENT_AFFILIATION_REMOVED: + mlx5e_devcom_cleanup_mpv(priv); + break; default: return NOTIFY_DONE; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h index 8389ac0af708e..8220d180e33c9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.h @@ -8,6 +8,7 @@ enum mlx5_devcom_component { MLX5_DEVCOM_ESW_OFFLOADS, + MLX5_DEVCOM_MPV, MLX5_DEVCOM_NUM_COMPONENTS, }; From db68146d5cdcccbcae8db7319c048379616fcfdb Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:29 +0300 Subject: [PATCH 067/548] net/mlx5: Store devcom pointer inside IPsec RoCE Store the mlx5e priv devcom component within IPsec RoCE to enable the IPsec RoCE code to access the other device's private information. This includes retrieving the necessary device information and the IPsec database, which helps determine if IPsec is configured or not. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/5bb3160ceeb07523542302886da54c78eef0d2af.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit eff5b663a6c3047ebf75ce520d3b3eaccd9fc957) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h | 3 ++- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c | 5 +++-- drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c | 6 +++++- drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h | 5 ++++- 5 files changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 5161bf51fa110..d03bf27487ec6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -912,7 +912,7 @@ void mlx5e_ipsec_init(struct mlx5e_priv *priv) } ipsec->is_uplink_rep = mlx5e_is_uplink_rep(priv); - ret = mlx5e_accel_ipsec_fs_init(ipsec); + ret = mlx5e_accel_ipsec_fs_init(ipsec, &priv->devcom); if (ret) goto err_fs_init; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h index 06743156ffcaa..dc8e539dc20ea 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h @@ -38,6 +38,7 @@ #include #include #include "lib/aso.h" +#include "lib/devcom.h" #define MLX5E_IPSEC_SADB_RX_BITS 10 #define MLX5E_IPSEC_ESN_SCOPE_MID 0x80000000L @@ -304,7 +305,7 @@ void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv); void mlx5e_ipsec_build_netdev(struct mlx5e_priv *priv); void mlx5e_accel_ipsec_fs_cleanup(struct mlx5e_ipsec *ipsec); -int mlx5e_accel_ipsec_fs_init(struct mlx5e_ipsec *ipsec); +int mlx5e_accel_ipsec_fs_init(struct mlx5e_ipsec *ipsec, struct mlx5_devcom_comp_dev **devcom); int mlx5e_accel_ipsec_fs_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry); void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry); int mlx5e_accel_ipsec_fs_add_pol(struct mlx5e_ipsec_pol_entry *pol_entry); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c index 2382c71289857..f523dad74d73e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c @@ -1993,7 +1993,8 @@ void mlx5e_accel_ipsec_fs_cleanup(struct mlx5e_ipsec *ipsec) } } -int mlx5e_accel_ipsec_fs_init(struct mlx5e_ipsec *ipsec) +int mlx5e_accel_ipsec_fs_init(struct mlx5e_ipsec *ipsec, + struct mlx5_devcom_comp_dev **devcom) { struct mlx5_core_dev *mdev = ipsec->mdev; struct mlx5_flow_namespace *ns, *ns_esw; @@ -2045,7 +2046,7 @@ int mlx5e_accel_ipsec_fs_init(struct mlx5e_ipsec *ipsec) ipsec->tx_esw->ns = ns_esw; xa_init_flags(&ipsec->rx_esw->ipsec_obj_id_map, XA_FLAGS_ALLOC1); } else if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_ROCE) { - ipsec->roce = mlx5_ipsec_fs_roce_init(mdev); + ipsec->roce = mlx5_ipsec_fs_roce_init(mdev, devcom); } return 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c index 6e3f178d6f84b..6176680c470ce 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c @@ -31,6 +31,7 @@ struct mlx5_ipsec_fs { struct mlx5_ipsec_rx_roce ipv4_rx; struct mlx5_ipsec_rx_roce ipv6_rx; struct mlx5_ipsec_tx_roce tx; + struct mlx5_devcom_comp_dev **devcom; }; static void ipsec_fs_roce_setup_udp_dport(struct mlx5_flow_spec *spec, @@ -337,7 +338,8 @@ void mlx5_ipsec_fs_roce_cleanup(struct mlx5_ipsec_fs *ipsec_roce) kfree(ipsec_roce); } -struct mlx5_ipsec_fs *mlx5_ipsec_fs_roce_init(struct mlx5_core_dev *mdev) +struct mlx5_ipsec_fs *mlx5_ipsec_fs_roce_init(struct mlx5_core_dev *mdev, + struct mlx5_devcom_comp_dev **devcom) { struct mlx5_ipsec_fs *roce_ipsec; struct mlx5_flow_namespace *ns; @@ -363,6 +365,8 @@ struct mlx5_ipsec_fs *mlx5_ipsec_fs_roce_init(struct mlx5_core_dev *mdev) roce_ipsec->tx.ns = ns; + roce_ipsec->devcom = devcom; + return roce_ipsec; err_tx: diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h index 9712d705fe481..75475e1261814 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h @@ -4,6 +4,8 @@ #ifndef __MLX5_LIB_IPSEC_H__ #define __MLX5_LIB_IPSEC_H__ +#include "lib/devcom.h" + struct mlx5_ipsec_fs; struct mlx5_flow_table * @@ -20,6 +22,7 @@ int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_flow_table *pol_ft); void mlx5_ipsec_fs_roce_cleanup(struct mlx5_ipsec_fs *ipsec_roce); -struct mlx5_ipsec_fs *mlx5_ipsec_fs_roce_init(struct mlx5_core_dev *mdev); +struct mlx5_ipsec_fs *mlx5_ipsec_fs_roce_init(struct mlx5_core_dev *mdev, + struct mlx5_devcom_comp_dev **devcom); #endif /* __MLX5_LIB_IPSEC_H__ */ From 63680702d2fb4511f6b3b6410c96b8875ae2463c Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:30 +0300 Subject: [PATCH 068/548] net/mlx5: Add alias flow table bits Add all the capabilities needed to check for alias object support. As well as all the fields or commands needed for its creation and the creation of flow table that is able to jump to an alias object. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/544c030f2a78c4adf3fe6b64f97a39cc1bbdabb9.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit ef36ffcb381096ed32c309c342a02bf62939c503) Signed-off-by: Koba Ko --- include/linux/mlx5/mlx5_ifc.h | 57 ++++++++++++++++++++++++++++++++++- 1 file changed, 56 insertions(+), 1 deletion(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 6b82332aeb01c..32490cd916a2a 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -312,6 +312,8 @@ enum { MLX5_CMD_OP_QUERY_VHCA_STATE = 0xb0d, MLX5_CMD_OP_MODIFY_VHCA_STATE = 0xb0e, MLX5_CMD_OP_SYNC_CRYPTO = 0xb12, + MLX5_CMD_OP_ALLOW_OTHER_VHCA_ACCESS = 0xb16, + MLX5_CMD_OP_GENERATE_WQE = 0xb17, MLX5_CMD_OPCODE_QUERY_VUID = 0xb22, MLX5_CMD_OP_MAX }; @@ -1945,6 +1947,14 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 match_definer_format_supported[0x40]; }; +enum { + MLX5_CROSS_VHCA_OBJ_TO_OBJ_SUPPORTED_LOCAL_FLOW_TABLE_TO_REMOTE_FLOW_TABLE_MISS = 0x80000, +}; + +enum { + MLX5_ALLOWED_OBJ_FOR_OTHER_VHCA_ACCESS_FLOW_TABLE = 0x200, +}; + struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_0[0x80]; @@ -1968,7 +1978,11 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 migration_in_chunks[0x1]; u8 reserved_at_d1[0xf]; - u8 reserved_at_e0[0xc0]; + u8 cross_vhca_object_to_object_supported[0x20]; + + u8 allowed_object_for_other_vhca_access[0x40]; + + u8 reserved_at_140[0x60]; u8 flow_table_type_2_type[0x8]; u8 reserved_at_1a8[0x3]; @@ -6475,6 +6489,28 @@ struct mlx5_ifc_general_obj_out_cmd_hdr_bits { u8 reserved_at_60[0x20]; }; +struct mlx5_ifc_allow_other_vhca_access_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + u8 reserved_at_40[0x50]; + u8 object_type_to_be_accessed[0x10]; + u8 object_id_to_be_accessed[0x20]; + u8 reserved_at_c0[0x40]; + union { + u8 access_key_raw[0x100]; + u8 access_key[8][0x20]; + }; +}; + +struct mlx5_ifc_allow_other_vhca_access_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + u8 syndrome[0x20]; + u8 reserved_at_40[0x40]; +}; + struct mlx5_ifc_modify_header_arg_bits { u8 reserved_at_0[0x80]; @@ -6497,6 +6533,24 @@ struct mlx5_ifc_create_match_definer_out_bits { struct mlx5_ifc_general_obj_out_cmd_hdr_bits general_obj_out_cmd_hdr; }; +struct mlx5_ifc_alias_context_bits { + u8 vhca_id_to_be_accessed[0x10]; + u8 reserved_at_10[0xd]; + u8 status[0x3]; + u8 object_id_to_be_accessed[0x20]; + u8 reserved_at_40[0x40]; + union { + u8 access_key_raw[0x100]; + u8 access_key[8][0x20]; + }; + u8 metadata[0x80]; +}; + +struct mlx5_ifc_create_alias_obj_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_alias_context_bits alias_ctx; +}; + enum { MLX5_QUERY_FLOW_GROUP_OUT_MATCH_CRITERIA_ENABLE_OUTER_HEADERS = 0x0, MLX5_QUERY_FLOW_GROUP_OUT_MATCH_CRITERIA_ENABLE_MISC_PARAMETERS = 0x1, @@ -12030,6 +12084,7 @@ enum { MLX5_GENERAL_OBJECT_TYPES_FLOW_METER_ASO = 0x24, MLX5_GENERAL_OBJECT_TYPES_MACSEC = 0x27, MLX5_GENERAL_OBJECT_TYPES_INT_KEK = 0x47, + MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS = 0xff15, }; enum { From 727fb7862045ef4d5a8531eeef90c27489aa7be1 Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:31 +0300 Subject: [PATCH 069/548] net/mlx5: Implement alias object allow and create functions Add functions which allow one vhca to access another vhca object, and functions that creates an alias object or destroys it. Together they can be used to create cross vhca flow table that is able jump from the steering domain that is managed by one vport, to the steering domain on a different vport. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/f45a9c85319fa783186b8988abcd64955b5f2a0c.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 8c894f88c479e4b059464aee45a42ab72023b0c3) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 70 +++++++++++++++++++ .../ethernet/mellanox/mlx5/core/mlx5_core.h | 22 ++++++ 2 files changed, 92 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 5a2126679415c..365a3bd496ba6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -528,6 +528,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op, case MLX5_CMD_OP_SAVE_VHCA_STATE: case MLX5_CMD_OP_LOAD_VHCA_STATE: case MLX5_CMD_OP_SYNC_CRYPTO: + case MLX5_CMD_OP_ALLOW_OTHER_VHCA_ACCESS: *status = MLX5_DRIVER_STATUS_ABORTED; *synd = MLX5_DRIVER_SYND; return -ENOLINK; @@ -731,6 +732,7 @@ const char *mlx5_command_str(int command) MLX5_COMMAND_STR_CASE(SAVE_VHCA_STATE); MLX5_COMMAND_STR_CASE(LOAD_VHCA_STATE); MLX5_COMMAND_STR_CASE(SYNC_CRYPTO); + MLX5_COMMAND_STR_CASE(ALLOW_OTHER_VHCA_ACCESS); default: return "unknown command opcode"; } } @@ -2124,6 +2126,74 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size, } EXPORT_SYMBOL(mlx5_cmd_exec_cb); +int mlx5_cmd_allow_other_vhca_access(struct mlx5_core_dev *dev, + struct mlx5_cmd_allow_other_vhca_access_attr *attr) +{ + u32 out[MLX5_ST_SZ_DW(allow_other_vhca_access_out)] = {}; + u32 in[MLX5_ST_SZ_DW(allow_other_vhca_access_in)] = {}; + void *key; + + MLX5_SET(allow_other_vhca_access_in, + in, opcode, MLX5_CMD_OP_ALLOW_OTHER_VHCA_ACCESS); + MLX5_SET(allow_other_vhca_access_in, + in, object_type_to_be_accessed, attr->obj_type); + MLX5_SET(allow_other_vhca_access_in, + in, object_id_to_be_accessed, attr->obj_id); + + key = MLX5_ADDR_OF(allow_other_vhca_access_in, in, access_key); + memcpy(key, attr->access_key, sizeof(attr->access_key)); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} + +int mlx5_cmd_alias_obj_create(struct mlx5_core_dev *dev, + struct mlx5_cmd_alias_obj_create_attr *alias_attr, + u32 *obj_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; + u32 in[MLX5_ST_SZ_DW(create_alias_obj_in)] = {}; + void *param; + void *attr; + void *key; + int ret; + + attr = MLX5_ADDR_OF(create_alias_obj_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, alias_attr->obj_type); + param = MLX5_ADDR_OF(general_obj_in_cmd_hdr, in, op_param); + MLX5_SET(general_obj_create_param, param, alias_object, 1); + + attr = MLX5_ADDR_OF(create_alias_obj_in, in, alias_ctx); + MLX5_SET(alias_context, attr, vhca_id_to_be_accessed, alias_attr->vhca_id); + MLX5_SET(alias_context, attr, object_id_to_be_accessed, alias_attr->obj_id); + + key = MLX5_ADDR_OF(alias_context, attr, access_key); + memcpy(key, alias_attr->access_key, sizeof(alias_attr->access_key)); + + ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); + if (ret) + return ret; + + *obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); + + return 0; +} + +int mlx5_cmd_alias_obj_destroy(struct mlx5_core_dev *dev, u32 obj_id, + u16 obj_type) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; + u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_DESTROY_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, obj_type); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} + static void destroy_msg_cache(struct mlx5_core_dev *dev) { struct cmd_msg_cache *ch; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 124352459c23d..19ffd1816474a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -97,6 +97,22 @@ do { \ __func__, __LINE__, current->pid, \ ##__VA_ARGS__) +#define ACCESS_KEY_LEN 32 +#define FT_ID_FT_TYPE_OFFSET 24 + +struct mlx5_cmd_allow_other_vhca_access_attr { + u16 obj_type; + u32 obj_id; + u8 access_key[ACCESS_KEY_LEN]; +}; + +struct mlx5_cmd_alias_obj_create_attr { + u32 obj_id; + u16 vhca_id; + u16 obj_type; + u8 access_key[ACCESS_KEY_LEN]; +}; + static inline void mlx5_printk(struct mlx5_core_dev *dev, int level, const char *format, ...) { struct device *device = dev->device; @@ -343,6 +359,12 @@ bool mlx5_eth_supported(struct mlx5_core_dev *dev); bool mlx5_rdma_supported(struct mlx5_core_dev *dev); bool mlx5_vnet_supported(struct mlx5_core_dev *dev); bool mlx5_same_hw_devs(struct mlx5_core_dev *dev, struct mlx5_core_dev *peer_dev); +int mlx5_cmd_allow_other_vhca_access(struct mlx5_core_dev *dev, + struct mlx5_cmd_allow_other_vhca_access_attr *attr); +int mlx5_cmd_alias_obj_create(struct mlx5_core_dev *dev, + struct mlx5_cmd_alias_obj_create_attr *alias_attr, + u32 *obj_id); +int mlx5_cmd_alias_obj_destroy(struct mlx5_core_dev *dev, u32 obj_id, u16 obj_type); static inline u16 mlx5_core_ec_vf_vport_base(const struct mlx5_core_dev *dev) { From 10808d16787b3b5952414ced2111efb09203a818 Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:32 +0300 Subject: [PATCH 070/548] net/mlx5: Add create alias flow table function to ipsec roce Implements functions which creates an alias flow table, and check if alias flow table creation is even supported, and if successful returns the created alias flow table object id. This function would be used in later patches to allow jumping from one vhca to another, in order to add support for MPV mode. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/36e15ef41586f2a9aacc65b935de18391eef5607.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 69c08efcbe7fa87ec89f28e0eed0c8c1deedfd83) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/lib/ipsec_fs_roce.c | 66 +++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c index 6176680c470ce..648f3d7ab68bd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c @@ -4,6 +4,7 @@ #include "fs_core.h" #include "lib/ipsec_fs_roce.h" #include "mlx5_core.h" +#include struct mlx5_ipsec_miss { struct mlx5_flow_group *group; @@ -44,6 +45,71 @@ static void ipsec_fs_roce_setup_udp_dport(struct mlx5_flow_spec *spec, MLX5_SET(fte_match_param, spec->match_value, outer_headers.udp_dport, dport); } +static bool ipsec_fs_create_alias_supported(struct mlx5_core_dev *mdev, + struct mlx5_core_dev *master_mdev) +{ + u64 obj_allowed_m = MLX5_CAP_GEN_2_64(master_mdev, allowed_object_for_other_vhca_access); + u32 obj_supp_m = MLX5_CAP_GEN_2(master_mdev, cross_vhca_object_to_object_supported); + u64 obj_allowed = MLX5_CAP_GEN_2_64(mdev, allowed_object_for_other_vhca_access); + u32 obj_supp = MLX5_CAP_GEN_2(mdev, cross_vhca_object_to_object_supported); + + if (!(obj_supp & + MLX5_CROSS_VHCA_OBJ_TO_OBJ_SUPPORTED_LOCAL_FLOW_TABLE_TO_REMOTE_FLOW_TABLE_MISS) || + !(obj_supp_m & + MLX5_CROSS_VHCA_OBJ_TO_OBJ_SUPPORTED_LOCAL_FLOW_TABLE_TO_REMOTE_FLOW_TABLE_MISS)) + return false; + + if (!(obj_allowed & MLX5_ALLOWED_OBJ_FOR_OTHER_VHCA_ACCESS_FLOW_TABLE) || + !(obj_allowed_m & MLX5_ALLOWED_OBJ_FOR_OTHER_VHCA_ACCESS_FLOW_TABLE)) + return false; + + return true; +} + +static int ipsec_fs_create_aliased_ft(struct mlx5_core_dev *ibv_owner, + struct mlx5_core_dev *ibv_allowed, + struct mlx5_flow_table *ft, + u32 *obj_id) +{ + u32 aliased_object_id = (ft->type << FT_ID_FT_TYPE_OFFSET) | ft->id; + u16 vhca_id_to_be_accessed = MLX5_CAP_GEN(ibv_owner, vhca_id); + struct mlx5_cmd_allow_other_vhca_access_attr allow_attr = {}; + struct mlx5_cmd_alias_obj_create_attr alias_attr = {}; + char key[ACCESS_KEY_LEN]; + int ret; + int i; + + if (!ipsec_fs_create_alias_supported(ibv_owner, ibv_allowed)) + return -EOPNOTSUPP; + + for (i = 0; i < ACCESS_KEY_LEN; i++) + key[i] = get_random_u64() & 0xFF; + + memcpy(allow_attr.access_key, key, ACCESS_KEY_LEN); + allow_attr.obj_type = MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS; + allow_attr.obj_id = aliased_object_id; + + ret = mlx5_cmd_allow_other_vhca_access(ibv_owner, &allow_attr); + if (ret) { + mlx5_core_err(ibv_owner, "Failed to allow other vhca access err=%d\n", + ret); + return ret; + } + + memcpy(alias_attr.access_key, key, ACCESS_KEY_LEN); + alias_attr.obj_id = aliased_object_id; + alias_attr.obj_type = MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS; + alias_attr.vhca_id = vhca_id_to_be_accessed; + ret = mlx5_cmd_alias_obj_create(ibv_allowed, &alias_attr, obj_id); + if (ret) { + mlx5_core_err(ibv_allowed, "Failed to create alias object err=%d\n", + ret); + return ret; + } + + return 0; +} + static int ipsec_fs_roce_rx_rule_setup(struct mlx5_core_dev *mdev, struct mlx5_flow_destination *default_dst, From d24fd969a4853e91f2067d28ed6a61a73b69e841 Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:33 +0300 Subject: [PATCH 071/548] net/mlx5: Configure IPsec steering for egress RoCEv2 MPV traffic Add steering tables/rules in RDMA_TX master domain, to forward all traffic to IPsec crypto table in NIC domain. But in case the traffic is coming from the slave, have to first send the traffic to an alias table in order to switch gvmi and from there we can go to the appropriate gvmi crypto table in NIC domain. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/7ca5cf1ac5c6979359b8726e97510574e2b3d44d.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit dfbd229abeee76a0bcf015e93c85dca8d18568d4) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/en_accel/ipsec_fs.c | 2 +- .../net/ethernet/mellanox/mlx5/core/fs_core.c | 4 +- .../mellanox/mlx5/core/lib/ipsec_fs_roce.c | 223 +++++++++++++++++- .../mellanox/mlx5/core/lib/ipsec_fs_roce.h | 3 +- 4 files changed, 225 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c index f523dad74d73e..8b5935a15cf4c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c @@ -656,7 +656,7 @@ static int ipsec_counter_rule_tx(struct mlx5_core_dev *mdev, struct mlx5e_ipsec_ static void tx_destroy(struct mlx5e_ipsec *ipsec, struct mlx5e_ipsec_tx *tx, struct mlx5_ipsec_fs *roce) { - mlx5_ipsec_fs_roce_tx_destroy(roce); + mlx5_ipsec_fs_roce_tx_destroy(roce, ipsec->mdev); if (tx->chains) { ipsec_chains_destroy(tx->chains); } else { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index 5f35a6fc03054..ca813596a9b8e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -137,7 +137,7 @@ #define LAG_MIN_LEVEL (OFFLOADS_MIN_LEVEL + KERNEL_RX_MACSEC_MIN_LEVEL + 1) #define KERNEL_TX_IPSEC_NUM_PRIOS 1 -#define KERNEL_TX_IPSEC_NUM_LEVELS 3 +#define KERNEL_TX_IPSEC_NUM_LEVELS 4 #define KERNEL_TX_IPSEC_MIN_LEVEL (KERNEL_TX_IPSEC_NUM_LEVELS) #define KERNEL_TX_MACSEC_NUM_PRIOS 1 @@ -288,7 +288,7 @@ enum { #define RDMA_TX_BYPASS_MIN_LEVEL MLX5_BY_PASS_NUM_PRIOS #define RDMA_TX_COUNTERS_MIN_LEVEL (RDMA_TX_BYPASS_MIN_LEVEL + 1) -#define RDMA_TX_IPSEC_NUM_PRIOS 1 +#define RDMA_TX_IPSEC_NUM_PRIOS 2 #define RDMA_TX_IPSEC_PRIO_NUM_LEVELS 1 #define RDMA_TX_IPSEC_MIN_LEVEL (RDMA_TX_COUNTERS_MIN_LEVEL + RDMA_TX_IPSEC_NUM_PRIOS) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c index 648f3d7ab68bd..a82ccae7b6141 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c @@ -2,6 +2,8 @@ /* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ #include "fs_core.h" +#include "fs_cmd.h" +#include "en.h" #include "lib/ipsec_fs_roce.h" #include "mlx5_core.h" #include @@ -25,6 +27,8 @@ struct mlx5_ipsec_tx_roce { struct mlx5_flow_group *g; struct mlx5_flow_table *ft; struct mlx5_flow_handle *rule; + struct mlx5_flow_table *goto_alias_ft; + u32 alias_id; struct mlx5_flow_namespace *ns; }; @@ -187,9 +191,200 @@ static int ipsec_fs_roce_tx_rule_setup(struct mlx5_core_dev *mdev, return err; } -void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce) +static int ipsec_fs_roce_tx_mpv_rule_setup(struct mlx5_core_dev *mdev, + struct mlx5_ipsec_tx_roce *roce, + struct mlx5_flow_table *pol_ft) { + struct mlx5_flow_destination dst = {}; + MLX5_DECLARE_FLOW_ACT(flow_act); + struct mlx5_flow_handle *rule; + struct mlx5_flow_spec *spec; + int err = 0; + + spec = kvzalloc(sizeof(*spec), GFP_KERNEL); + if (!spec) + return -ENOMEM; + + spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS; + MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, misc_parameters.source_vhca_port); + MLX5_SET(fte_match_param, spec->match_value, misc_parameters.source_vhca_port, + MLX5_CAP_GEN(mdev, native_port_num)); + + flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + dst.type = MLX5_FLOW_DESTINATION_TYPE_TABLE_TYPE; + dst.ft = roce->goto_alias_ft; + rule = mlx5_add_flow_rules(roce->ft, spec, &flow_act, &dst, 1); + if (IS_ERR(rule)) { + err = PTR_ERR(rule); + mlx5_core_err(mdev, "Fail to add TX RoCE IPsec rule err=%d\n", + err); + goto out; + } + roce->rule = rule; + + /* No need for miss rule, since on miss we go to next PRIO, in which + * if master is configured, he will catch the traffic to go to his + * encryption table. + */ + +out: + kvfree(spec); + return err; +} + +#define MLX5_TX_ROCE_GROUP_SIZE BIT(0) +#define MLX5_IPSEC_RDMA_TX_FT_LEVEL 0 +#define MLX5_IPSEC_NIC_GOTO_ALIAS_FT_LEVEL 3 /* Since last used level in NIC ipsec is 2 */ + +static int ipsec_fs_roce_tx_mpv_create_ft(struct mlx5_core_dev *mdev, + struct mlx5_ipsec_tx_roce *roce, + struct mlx5_flow_table *pol_ft, + struct mlx5e_priv *peer_priv) +{ + struct mlx5_flow_namespace *roce_ns, *nic_ns; + struct mlx5_flow_table_attr ft_attr = {}; + struct mlx5_flow_table next_ft; + struct mlx5_flow_table *ft; + int err; + + roce_ns = mlx5_get_flow_namespace(peer_priv->mdev, MLX5_FLOW_NAMESPACE_RDMA_TX_IPSEC); + if (!roce_ns) + return -EOPNOTSUPP; + + nic_ns = mlx5_get_flow_namespace(peer_priv->mdev, MLX5_FLOW_NAMESPACE_EGRESS_IPSEC); + if (!nic_ns) + return -EOPNOTSUPP; + + err = ipsec_fs_create_aliased_ft(mdev, peer_priv->mdev, pol_ft, &roce->alias_id); + if (err) + return err; + + next_ft.id = roce->alias_id; + ft_attr.max_fte = 1; + ft_attr.next_ft = &next_ft; + ft_attr.level = MLX5_IPSEC_NIC_GOTO_ALIAS_FT_LEVEL; + ft_attr.flags = MLX5_FLOW_TABLE_UNMANAGED; + ft = mlx5_create_flow_table(nic_ns, &ft_attr); + if (IS_ERR(ft)) { + err = PTR_ERR(ft); + mlx5_core_err(mdev, "Fail to create RoCE IPsec goto alias ft err=%d\n", err); + goto destroy_alias; + } + + roce->goto_alias_ft = ft; + + memset(&ft_attr, 0, sizeof(ft_attr)); + ft_attr.max_fte = 1; + ft_attr.level = MLX5_IPSEC_RDMA_TX_FT_LEVEL; + ft = mlx5_create_flow_table(roce_ns, &ft_attr); + if (IS_ERR(ft)) { + err = PTR_ERR(ft); + mlx5_core_err(mdev, "Fail to create RoCE IPsec tx ft err=%d\n", err); + goto destroy_alias_ft; + } + + roce->ft = ft; + + return 0; + +destroy_alias_ft: + mlx5_destroy_flow_table(roce->goto_alias_ft); +destroy_alias: + mlx5_cmd_alias_obj_destroy(peer_priv->mdev, roce->alias_id, + MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS); + return err; +} + +static int ipsec_fs_roce_tx_mpv_create_group_rules(struct mlx5_core_dev *mdev, + struct mlx5_ipsec_tx_roce *roce, + struct mlx5_flow_table *pol_ft, + u32 *in) +{ + struct mlx5_flow_group *g; + int ix = 0; + int err; + u8 *mc; + + mc = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + MLX5_SET_TO_ONES(fte_match_param, mc, misc_parameters.source_vhca_port); + MLX5_SET_CFG(in, match_criteria_enable, MLX5_MATCH_MISC_PARAMETERS); + + MLX5_SET_CFG(in, start_flow_index, ix); + ix += MLX5_TX_ROCE_GROUP_SIZE; + MLX5_SET_CFG(in, end_flow_index, ix - 1); + g = mlx5_create_flow_group(roce->ft, in); + if (IS_ERR(g)) { + err = PTR_ERR(g); + mlx5_core_err(mdev, "Fail to create RoCE IPsec tx group err=%d\n", err); + return err; + } + roce->g = g; + + err = ipsec_fs_roce_tx_mpv_rule_setup(mdev, roce, pol_ft); + if (err) { + mlx5_core_err(mdev, "Fail to create RoCE IPsec tx rules err=%d\n", err); + goto destroy_group; + } + + return 0; + +destroy_group: + mlx5_destroy_flow_group(roce->g); + return err; +} + +static int ipsec_fs_roce_tx_mpv_create(struct mlx5_core_dev *mdev, + struct mlx5_ipsec_fs *ipsec_roce, + struct mlx5_flow_table *pol_ft, + u32 *in) +{ + struct mlx5_devcom_comp_dev *tmp = NULL; + struct mlx5_ipsec_tx_roce *roce; + struct mlx5e_priv *peer_priv; + int err; + + if (!mlx5_devcom_for_each_peer_begin(*ipsec_roce->devcom)) + return -EOPNOTSUPP; + + peer_priv = mlx5_devcom_get_next_peer_data(*ipsec_roce->devcom, &tmp); + if (!peer_priv) { + err = -EOPNOTSUPP; + goto release_peer; + } + + roce = &ipsec_roce->tx; + + err = ipsec_fs_roce_tx_mpv_create_ft(mdev, roce, pol_ft, peer_priv); + if (err) { + mlx5_core_err(mdev, "Fail to create RoCE IPsec tables err=%d\n", err); + goto release_peer; + } + + err = ipsec_fs_roce_tx_mpv_create_group_rules(mdev, roce, pol_ft, in); + if (err) { + mlx5_core_err(mdev, "Fail to create RoCE IPsec tx group/rule err=%d\n", err); + goto destroy_tables; + } + + mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); + return 0; + +destroy_tables: + mlx5_destroy_flow_table(roce->ft); + mlx5_destroy_flow_table(roce->goto_alias_ft); + mlx5_cmd_alias_obj_destroy(peer_priv->mdev, roce->alias_id, + MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS); +release_peer: + mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); + return err; +} + +void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce, + struct mlx5_core_dev *mdev) +{ + struct mlx5_devcom_comp_dev *tmp = NULL; struct mlx5_ipsec_tx_roce *tx_roce; + struct mlx5e_priv *peer_priv; if (!ipsec_roce) return; @@ -199,9 +394,24 @@ void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce) mlx5_del_flow_rules(tx_roce->rule); mlx5_destroy_flow_group(tx_roce->g); mlx5_destroy_flow_table(tx_roce->ft); -} -#define MLX5_TX_ROCE_GROUP_SIZE BIT(0) + if (!mlx5_core_is_mp_slave(mdev)) + return; + + if (!mlx5_devcom_for_each_peer_begin(*ipsec_roce->devcom)) + return; + + peer_priv = mlx5_devcom_get_next_peer_data(*ipsec_roce->devcom, &tmp); + if (!peer_priv) { + mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); + return; + } + + mlx5_destroy_flow_table(tx_roce->goto_alias_ft); + mlx5_cmd_alias_obj_destroy(peer_priv->mdev, tx_roce->alias_id, + MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS); + mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); +} int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, @@ -224,7 +434,14 @@ int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, if (!in) return -ENOMEM; + if (mlx5_core_is_mp_slave(mdev)) { + err = ipsec_fs_roce_tx_mpv_create(mdev, ipsec_roce, pol_ft, in); + goto free_in; + } + ft_attr.max_fte = 1; + ft_attr.prio = 1; + ft_attr.level = MLX5_IPSEC_RDMA_TX_FT_LEVEL; ft = mlx5_create_flow_table(roce->ns, &ft_attr); if (IS_ERR(ft)) { err = PTR_ERR(ft); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h index 75475e1261814..ad120caf269ea 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h @@ -17,7 +17,8 @@ int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, struct mlx5_flow_namespace *ns, struct mlx5_flow_destination *default_dst, u32 family, u32 level, u32 prio); -void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce); +void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce, + struct mlx5_core_dev *mdev); int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_flow_table *pol_ft); From 6848804e5e1d93518d11893bb2a102c79d69cf6c Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:34 +0300 Subject: [PATCH 072/548] net/mlx5: Configure IPsec steering for ingress RoCEv2 MPV traffic Add empty flow table in RDMA_RX master domain, to forward all received traffic to it, in order to continue through the FW RoCE steering. In order to achieve that however, first we check if the decrypted traffic is RoCEv2, if so then forward it to RDMA_RX domain. But in case the traffic is coming from the slave, have to first send the traffic to an alias table in order to switch gvmi and from there we can go to the appropriate gvmi flow table in RDMA_RX master domain. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/d2200b53158b1e7ef30996812107dd7207485c28.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit f2f0231cfe8905af217e5bf1a08bfb8e4d3b74fb) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/en_accel/ipsec_fs.c | 4 +- .../net/ethernet/mellanox/mlx5/core/fs_core.c | 6 +- .../mellanox/mlx5/core/lib/ipsec_fs_roce.c | 216 ++++++++++++++++-- .../mellanox/mlx5/core/lib/ipsec_fs_roce.h | 2 +- 4 files changed, 205 insertions(+), 23 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c index 8b5935a15cf4c..153e5f6d83715 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c @@ -362,7 +362,7 @@ static void rx_destroy(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec, mlx5_ipsec_rx_status_destroy(ipsec, rx); mlx5_destroy_flow_table(rx->ft.status); - mlx5_ipsec_fs_roce_rx_destroy(ipsec->roce, family); + mlx5_ipsec_fs_roce_rx_destroy(ipsec->roce, family, mdev); } static void ipsec_rx_create_attr_set(struct mlx5e_ipsec *ipsec, @@ -516,7 +516,7 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec, err_add: mlx5_destroy_flow_table(rx->ft.status); err_fs_ft_status: - mlx5_ipsec_fs_roce_rx_destroy(ipsec->roce, family); + mlx5_ipsec_fs_roce_rx_destroy(ipsec->roce, family, mdev); return err; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index ca813596a9b8e..ad76a5b849bfe 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -114,9 +114,9 @@ #define ETHTOOL_NUM_PRIOS 11 #define ETHTOOL_MIN_LEVEL (KERNEL_MIN_LEVEL + ETHTOOL_NUM_PRIOS) /* Promiscuous, Vlan, mac, ttc, inner ttc, {UDP/ANY/aRFS/accel/{esp, esp_err}}, IPsec policy, - * IPsec RoCE policy + * {IPsec RoCE MPV,Alias table},IPsec RoCE policy */ -#define KERNEL_NIC_PRIO_NUM_LEVELS 9 +#define KERNEL_NIC_PRIO_NUM_LEVELS 11 #define KERNEL_NIC_NUM_PRIOS 1 /* One more level for tc */ #define KERNEL_MIN_LEVEL (KERNEL_NIC_PRIO_NUM_LEVELS + 1) @@ -231,7 +231,7 @@ enum { }; #define RDMA_RX_IPSEC_NUM_PRIOS 1 -#define RDMA_RX_IPSEC_NUM_LEVELS 2 +#define RDMA_RX_IPSEC_NUM_LEVELS 4 #define RDMA_RX_IPSEC_MIN_LEVEL (RDMA_RX_IPSEC_NUM_LEVELS) #define RDMA_RX_BYPASS_MIN_LEVEL MLX5_BY_PASS_NUM_REGULAR_PRIOS diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c index a82ccae7b6141..cce2193608cd0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c @@ -18,6 +18,11 @@ struct mlx5_ipsec_rx_roce { struct mlx5_flow_table *ft; struct mlx5_flow_handle *rule; struct mlx5_ipsec_miss roce_miss; + struct mlx5_flow_table *nic_master_ft; + struct mlx5_flow_group *nic_master_group; + struct mlx5_flow_handle *nic_master_rule; + struct mlx5_flow_table *goto_alias_ft; + u32 alias_id; struct mlx5_flow_table *ft_rdma; struct mlx5_flow_namespace *ns_rdma; @@ -119,6 +124,7 @@ ipsec_fs_roce_rx_rule_setup(struct mlx5_core_dev *mdev, struct mlx5_flow_destination *default_dst, struct mlx5_ipsec_rx_roce *roce) { + bool is_mpv_slave = mlx5_core_is_mp_slave(mdev); struct mlx5_flow_destination dst = {}; MLX5_DECLARE_FLOW_ACT(flow_act); struct mlx5_flow_handle *rule; @@ -132,14 +138,19 @@ ipsec_fs_roce_rx_rule_setup(struct mlx5_core_dev *mdev, ipsec_fs_roce_setup_udp_dport(spec, ROCE_V2_UDP_DPORT); flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; - dst.type = MLX5_FLOW_DESTINATION_TYPE_TABLE_TYPE; - dst.ft = roce->ft_rdma; + if (is_mpv_slave) { + dst.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; + dst.ft = roce->goto_alias_ft; + } else { + dst.type = MLX5_FLOW_DESTINATION_TYPE_TABLE_TYPE; + dst.ft = roce->ft_rdma; + } rule = mlx5_add_flow_rules(roce->ft, spec, &flow_act, &dst, 1); if (IS_ERR(rule)) { err = PTR_ERR(rule); mlx5_core_err(mdev, "Fail to add RX RoCE IPsec rule err=%d\n", err); - goto fail_add_rule; + goto out; } roce->rule = rule; @@ -155,12 +166,30 @@ ipsec_fs_roce_rx_rule_setup(struct mlx5_core_dev *mdev, roce->roce_miss.rule = rule; + if (!is_mpv_slave) + goto out; + + flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + dst.type = MLX5_FLOW_DESTINATION_TYPE_TABLE_TYPE; + dst.ft = roce->ft_rdma; + rule = mlx5_add_flow_rules(roce->nic_master_ft, NULL, &flow_act, &dst, + 1); + if (IS_ERR(rule)) { + err = PTR_ERR(rule); + mlx5_core_err(mdev, "Fail to add RX RoCE IPsec rule for alias err=%d\n", + err); + goto fail_add_nic_master_rule; + } + roce->nic_master_rule = rule; + kvfree(spec); return 0; +fail_add_nic_master_rule: + mlx5_del_flow_rules(roce->roce_miss.rule); fail_add_default_rule: mlx5_del_flow_rules(roce->rule); -fail_add_rule: +out: kvfree(spec); return err; } @@ -379,6 +408,141 @@ static int ipsec_fs_roce_tx_mpv_create(struct mlx5_core_dev *mdev, return err; } +static void roce_rx_mpv_destroy_tables(struct mlx5_core_dev *mdev, struct mlx5_ipsec_rx_roce *roce) +{ + mlx5_destroy_flow_table(roce->goto_alias_ft); + mlx5_cmd_alias_obj_destroy(mdev, roce->alias_id, + MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS); + mlx5_destroy_flow_group(roce->nic_master_group); + mlx5_destroy_flow_table(roce->nic_master_ft); +} + +#define MLX5_RX_ROCE_GROUP_SIZE BIT(0) +#define MLX5_IPSEC_RX_IPV4_FT_LEVEL 3 +#define MLX5_IPSEC_RX_IPV6_FT_LEVEL 2 + +static int ipsec_fs_roce_rx_mpv_create(struct mlx5_core_dev *mdev, + struct mlx5_ipsec_fs *ipsec_roce, + struct mlx5_flow_namespace *ns, + u32 family, u32 level, u32 prio) +{ + struct mlx5_flow_namespace *roce_ns, *nic_ns; + struct mlx5_flow_table_attr ft_attr = {}; + struct mlx5_devcom_comp_dev *tmp = NULL; + struct mlx5_ipsec_rx_roce *roce; + struct mlx5_flow_table next_ft; + struct mlx5_flow_table *ft; + struct mlx5_flow_group *g; + struct mlx5e_priv *peer_priv; + int ix = 0; + u32 *in; + int err; + + roce = (family == AF_INET) ? &ipsec_roce->ipv4_rx : + &ipsec_roce->ipv6_rx; + + if (!mlx5_devcom_for_each_peer_begin(*ipsec_roce->devcom)) + return -EOPNOTSUPP; + + peer_priv = mlx5_devcom_get_next_peer_data(*ipsec_roce->devcom, &tmp); + if (!peer_priv) { + err = -EOPNOTSUPP; + goto release_peer; + } + + roce_ns = mlx5_get_flow_namespace(peer_priv->mdev, MLX5_FLOW_NAMESPACE_RDMA_RX_IPSEC); + if (!roce_ns) { + err = -EOPNOTSUPP; + goto release_peer; + } + + nic_ns = mlx5_get_flow_namespace(peer_priv->mdev, MLX5_FLOW_NAMESPACE_KERNEL); + if (!nic_ns) { + err = -EOPNOTSUPP; + goto release_peer; + } + + in = kvzalloc(MLX5_ST_SZ_BYTES(create_flow_group_in), GFP_KERNEL); + if (!in) { + err = -ENOMEM; + goto release_peer; + } + + ft_attr.level = (family == AF_INET) ? MLX5_IPSEC_RX_IPV4_FT_LEVEL : + MLX5_IPSEC_RX_IPV6_FT_LEVEL; + ft_attr.max_fte = 1; + ft = mlx5_create_flow_table(roce_ns, &ft_attr); + if (IS_ERR(ft)) { + err = PTR_ERR(ft); + mlx5_core_err(mdev, "Fail to create RoCE IPsec rx ft at rdma master err=%d\n", err); + goto free_in; + } + + roce->ft_rdma = ft; + + ft_attr.max_fte = 1; + ft_attr.prio = prio; + ft_attr.level = level + 2; + ft = mlx5_create_flow_table(nic_ns, &ft_attr); + if (IS_ERR(ft)) { + err = PTR_ERR(ft); + mlx5_core_err(mdev, "Fail to create RoCE IPsec rx ft at NIC master err=%d\n", err); + goto destroy_ft_rdma; + } + roce->nic_master_ft = ft; + + MLX5_SET_CFG(in, start_flow_index, ix); + ix += 1; + MLX5_SET_CFG(in, end_flow_index, ix - 1); + g = mlx5_create_flow_group(roce->nic_master_ft, in); + if (IS_ERR(g)) { + err = PTR_ERR(g); + mlx5_core_err(mdev, "Fail to create RoCE IPsec rx group aliased err=%d\n", err); + goto destroy_nic_master_ft; + } + roce->nic_master_group = g; + + err = ipsec_fs_create_aliased_ft(peer_priv->mdev, mdev, roce->nic_master_ft, + &roce->alias_id); + if (err) { + mlx5_core_err(mdev, "Fail to create RoCE IPsec rx alias FT err=%d\n", err); + goto destroy_group; + } + + next_ft.id = roce->alias_id; + ft_attr.max_fte = 1; + ft_attr.prio = prio; + ft_attr.level = roce->ft->level + 1; + ft_attr.flags = MLX5_FLOW_TABLE_UNMANAGED; + ft_attr.next_ft = &next_ft; + ft = mlx5_create_flow_table(ns, &ft_attr); + if (IS_ERR(ft)) { + err = PTR_ERR(ft); + mlx5_core_err(mdev, "Fail to create RoCE IPsec rx ft at NIC slave err=%d\n", err); + goto destroy_alias; + } + roce->goto_alias_ft = ft; + + kvfree(in); + mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); + return 0; + +destroy_alias: + mlx5_cmd_alias_obj_destroy(mdev, roce->alias_id, + MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS); +destroy_group: + mlx5_destroy_flow_group(roce->nic_master_group); +destroy_nic_master_ft: + mlx5_destroy_flow_table(roce->nic_master_ft); +destroy_ft_rdma: + mlx5_destroy_flow_table(roce->ft_rdma); +free_in: + kvfree(in); +release_peer: + mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); + return err; +} + void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_core_dev *mdev) { @@ -493,8 +657,10 @@ struct mlx5_flow_table *mlx5_ipsec_fs_roce_ft_get(struct mlx5_ipsec_fs *ipsec_ro return rx_roce->ft; } -void mlx5_ipsec_fs_roce_rx_destroy(struct mlx5_ipsec_fs *ipsec_roce, u32 family) +void mlx5_ipsec_fs_roce_rx_destroy(struct mlx5_ipsec_fs *ipsec_roce, u32 family, + struct mlx5_core_dev *mdev) { + bool is_mpv_slave = mlx5_core_is_mp_slave(mdev); struct mlx5_ipsec_rx_roce *rx_roce; if (!ipsec_roce) @@ -503,22 +669,25 @@ void mlx5_ipsec_fs_roce_rx_destroy(struct mlx5_ipsec_fs *ipsec_roce, u32 family) rx_roce = (family == AF_INET) ? &ipsec_roce->ipv4_rx : &ipsec_roce->ipv6_rx; + if (is_mpv_slave) + mlx5_del_flow_rules(rx_roce->nic_master_rule); mlx5_del_flow_rules(rx_roce->roce_miss.rule); mlx5_del_flow_rules(rx_roce->rule); + if (is_mpv_slave) + roce_rx_mpv_destroy_tables(mdev, rx_roce); mlx5_destroy_flow_table(rx_roce->ft_rdma); mlx5_destroy_flow_group(rx_roce->roce_miss.group); mlx5_destroy_flow_group(rx_roce->g); mlx5_destroy_flow_table(rx_roce->ft); } -#define MLX5_RX_ROCE_GROUP_SIZE BIT(0) - int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_flow_namespace *ns, struct mlx5_flow_destination *default_dst, u32 family, u32 level, u32 prio) { + bool is_mpv_slave = mlx5_core_is_mp_slave(mdev); struct mlx5_flow_table_attr ft_attr = {}; struct mlx5_ipsec_rx_roce *roce; struct mlx5_flow_table *ft; @@ -582,18 +751,28 @@ int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, } roce->roce_miss.group = g; - memset(&ft_attr, 0, sizeof(ft_attr)); - if (family == AF_INET) - ft_attr.level = 1; - ft = mlx5_create_flow_table(roce->ns_rdma, &ft_attr); - if (IS_ERR(ft)) { - err = PTR_ERR(ft); - mlx5_core_err(mdev, "Fail to create RoCE IPsec rx ft at rdma err=%d\n", err); - goto fail_rdma_table; + if (is_mpv_slave) { + err = ipsec_fs_roce_rx_mpv_create(mdev, ipsec_roce, ns, family, level, prio); + if (err) { + mlx5_core_err(mdev, "Fail to create RoCE IPsec rx alias err=%d\n", err); + goto fail_mpv_create; + } + } else { + memset(&ft_attr, 0, sizeof(ft_attr)); + if (family == AF_INET) + ft_attr.level = 1; + ft_attr.max_fte = 1; + ft = mlx5_create_flow_table(roce->ns_rdma, &ft_attr); + if (IS_ERR(ft)) { + err = PTR_ERR(ft); + mlx5_core_err(mdev, + "Fail to create RoCE IPsec rx ft at rdma err=%d\n", err); + goto fail_rdma_table; + } + + roce->ft_rdma = ft; } - roce->ft_rdma = ft; - err = ipsec_fs_roce_rx_rule_setup(mdev, default_dst, roce); if (err) { mlx5_core_err(mdev, "Fail to create RoCE IPsec rx rules err=%d\n", err); @@ -604,7 +783,10 @@ int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, return 0; fail_setup_rule: + if (is_mpv_slave) + roce_rx_mpv_destroy_tables(mdev, roce); mlx5_destroy_flow_table(roce->ft_rdma); +fail_mpv_create: fail_rdma_table: mlx5_destroy_flow_group(roce->roce_miss.group); fail_mgroup: diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h index ad120caf269ea..435a480400e5b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h @@ -11,7 +11,7 @@ struct mlx5_ipsec_fs; struct mlx5_flow_table * mlx5_ipsec_fs_roce_ft_get(struct mlx5_ipsec_fs *ipsec_roce, u32 family); void mlx5_ipsec_fs_roce_rx_destroy(struct mlx5_ipsec_fs *ipsec_roce, - u32 family); + u32 family, struct mlx5_core_dev *mdev); int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_flow_namespace *ns, From 0e687b50bce5864701c340716be03627c3daff0a Mon Sep 17 00:00:00 2001 From: Patrisious Haddad Date: Thu, 21 Sep 2023 15:10:35 +0300 Subject: [PATCH 073/548] net/mlx5: Handle IPsec steering upon master unbind/bind When the master device is unbinded, make sure to clean up all of the steering rules or flow tables that were created over the master, in order to allow proper unbinding of master, and for ethernet traffic to continue to work independently. Upon bringing master device back up and attaching the slave to it, checks if the slave already has IPsec configured and if so reconfigure the rules needed to support RoCE traffic. Note that while master device is unbound, the user is unable to configure IPsec again, since they are in a kind of illegal state in which they are in MPV mode but the slave has no master. However if IPsec was configured before hand, it will continue to work for ethernet traffic while master is unbound, and would continue to work for all traffic when the master is bound back again. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Link: https://lore.kernel.org/r/8434e88912c588affe51b34669900382a132e873.1695296682.git.leon@kernel.org Signed-off-by: Leon Romanovsky (cherry picked from commit 82f9378c443c206d3f9e45844306e5270e7e4109) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 7 ++ .../mellanox/mlx5/core/en_accel/ipsec.c | 1 + .../mellanox/mlx5/core/en_accel/ipsec.h | 24 +++- .../mellanox/mlx5/core/en_accel/ipsec_fs.c | 111 +++++++++++++++++- .../mlx5/core/en_accel/ipsec_offload.c | 3 +- .../net/ethernet/mellanox/mlx5/core/en_main.c | 28 ++++- .../mellanox/mlx5/core/lib/ipsec_fs_roce.c | 79 +++++++++---- .../mellanox/mlx5/core/lib/ipsec_fs_roce.h | 4 +- 8 files changed, 225 insertions(+), 32 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 5fa1e5a4e6a56..79fd14b63a470 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -166,6 +166,13 @@ struct page_pool; #define mlx5e_state_dereference(priv, p) \ rcu_dereference_protected((p), lockdep_is_held(&(priv)->state_lock)) +enum mlx5e_devcom_events { + MPV_DEVCOM_MASTER_UP, + MPV_DEVCOM_MASTER_DOWN, + MPV_DEVCOM_IPSEC_MASTER_UP, + MPV_DEVCOM_IPSEC_MASTER_DOWN, +}; + static inline u8 mlx5e_get_num_lag_ports(struct mlx5_core_dev *mdev) { if (mlx5_lag_is_lacp_owner(mdev)) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index d03bf27487ec6..532913c3eee92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -892,6 +892,7 @@ void mlx5e_ipsec_init(struct mlx5e_priv *priv) xa_init_flags(&ipsec->sadb, XA_FLAGS_ALLOC); ipsec->mdev = priv->mdev; + init_completion(&ipsec->comp); ipsec->wq = alloc_workqueue("mlx5e_ipsec: %s", WQ_UNBOUND, 0, priv->netdev->name); if (!ipsec->wq) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h index dc8e539dc20ea..8f4a37bceaf45 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h @@ -43,8 +43,6 @@ #define MLX5E_IPSEC_SADB_RX_BITS 10 #define MLX5E_IPSEC_ESN_SCOPE_MID 0x80000000L -#define MPV_DEVCOM_MASTER_UP 1 - struct aes_gcm_keymat { u64 seq_iv; @@ -224,12 +222,20 @@ struct mlx5e_ipsec_tx_create_attr { enum mlx5_flow_namespace_type chains_ns; }; +struct mlx5e_ipsec_mpv_work { + int event; + struct work_struct work; + struct mlx5e_priv *slave_priv; + struct mlx5e_priv *master_priv; +}; + struct mlx5e_ipsec { struct mlx5_core_dev *mdev; struct xarray sadb; struct mlx5e_ipsec_sw_stats sw_stats; struct mlx5e_ipsec_hw_stats hw_stats; struct workqueue_struct *wq; + struct completion comp; struct mlx5e_flow_steering *fs; struct mlx5e_ipsec_rx *rx_ipv4; struct mlx5e_ipsec_rx *rx_ipv6; @@ -241,6 +247,7 @@ struct mlx5e_ipsec { struct notifier_block netevent_nb; struct mlx5_ipsec_fs *roce; u8 is_uplink_rep: 1; + struct mlx5e_ipsec_mpv_work mpv_work; }; struct mlx5e_ipsec_esn_state { @@ -331,6 +338,10 @@ void mlx5e_accel_ipsec_fs_read_stats(struct mlx5e_priv *priv, void mlx5e_ipsec_build_accel_xfrm_attrs(struct mlx5e_ipsec_sa_entry *sa_entry, struct mlx5_accel_esp_xfrm_attrs *attrs); +void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + struct mlx5e_priv *master_priv); +void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event); + static inline struct mlx5_core_dev * mlx5e_ipsec_sa2dev(struct mlx5e_ipsec_sa_entry *sa_entry) { @@ -366,6 +377,15 @@ static inline u32 mlx5_ipsec_device_caps(struct mlx5_core_dev *mdev) { return 0; } + +static inline void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + struct mlx5e_priv *master_priv) +{ +} + +static inline void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event) +{ +} #endif #endif /* __MLX5E_IPSEC_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c index 153e5f6d83715..17e4fc5bb9d89 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c @@ -332,6 +332,83 @@ static int ipsec_miss_create(struct mlx5_core_dev *mdev, return err; } +static void handle_ipsec_rx_bringup(struct mlx5e_ipsec *ipsec, u32 family) +{ + struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, family, XFRM_DEV_OFFLOAD_PACKET); + struct mlx5_flow_namespace *ns = mlx5e_fs_get_ns(ipsec->fs, false); + struct mlx5_flow_destination old_dest, new_dest; + + old_dest = mlx5_ttc_get_default_dest(mlx5e_fs_get_ttc(ipsec->fs, false), + family2tt(family)); + + mlx5_ipsec_fs_roce_rx_create(ipsec->mdev, ipsec->roce, ns, &old_dest, family, + MLX5E_ACCEL_FS_ESP_FT_ROCE_LEVEL, MLX5E_NIC_PRIO); + + new_dest.ft = mlx5_ipsec_fs_roce_ft_get(ipsec->roce, family); + new_dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; + mlx5_modify_rule_destination(rx->status.rule, &new_dest, &old_dest); + mlx5_modify_rule_destination(rx->sa.rule, &new_dest, &old_dest); +} + +static void handle_ipsec_rx_cleanup(struct mlx5e_ipsec *ipsec, u32 family) +{ + struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, family, XFRM_DEV_OFFLOAD_PACKET); + struct mlx5_flow_destination old_dest, new_dest; + + old_dest.ft = mlx5_ipsec_fs_roce_ft_get(ipsec->roce, family); + old_dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; + new_dest = mlx5_ttc_get_default_dest(mlx5e_fs_get_ttc(ipsec->fs, false), + family2tt(family)); + mlx5_modify_rule_destination(rx->sa.rule, &new_dest, &old_dest); + mlx5_modify_rule_destination(rx->status.rule, &new_dest, &old_dest); + + mlx5_ipsec_fs_roce_rx_destroy(ipsec->roce, family, ipsec->mdev); +} + +static void ipsec_mpv_work_handler(struct work_struct *_work) +{ + struct mlx5e_ipsec_mpv_work *work = container_of(_work, struct mlx5e_ipsec_mpv_work, work); + struct mlx5e_ipsec *ipsec = work->slave_priv->ipsec; + + switch (work->event) { + case MPV_DEVCOM_IPSEC_MASTER_UP: + mutex_lock(&ipsec->tx->ft.mutex); + if (ipsec->tx->ft.refcnt) + mlx5_ipsec_fs_roce_tx_create(ipsec->mdev, ipsec->roce, ipsec->tx->ft.pol, + true); + mutex_unlock(&ipsec->tx->ft.mutex); + + mutex_lock(&ipsec->rx_ipv4->ft.mutex); + if (ipsec->rx_ipv4->ft.refcnt) + handle_ipsec_rx_bringup(ipsec, AF_INET); + mutex_unlock(&ipsec->rx_ipv4->ft.mutex); + + mutex_lock(&ipsec->rx_ipv6->ft.mutex); + if (ipsec->rx_ipv6->ft.refcnt) + handle_ipsec_rx_bringup(ipsec, AF_INET6); + mutex_unlock(&ipsec->rx_ipv6->ft.mutex); + break; + case MPV_DEVCOM_IPSEC_MASTER_DOWN: + mutex_lock(&ipsec->tx->ft.mutex); + if (ipsec->tx->ft.refcnt) + mlx5_ipsec_fs_roce_tx_destroy(ipsec->roce, ipsec->mdev); + mutex_unlock(&ipsec->tx->ft.mutex); + + mutex_lock(&ipsec->rx_ipv4->ft.mutex); + if (ipsec->rx_ipv4->ft.refcnt) + handle_ipsec_rx_cleanup(ipsec, AF_INET); + mutex_unlock(&ipsec->rx_ipv4->ft.mutex); + + mutex_lock(&ipsec->rx_ipv6->ft.mutex); + if (ipsec->rx_ipv6->ft.refcnt) + handle_ipsec_rx_cleanup(ipsec, AF_INET6); + mutex_unlock(&ipsec->rx_ipv6->ft.mutex); + break; + } + + complete(&work->master_priv->ipsec->comp); +} + static void ipsec_rx_ft_disconnect(struct mlx5e_ipsec *ipsec, u32 family) { struct mlx5_ttc_table *ttc = mlx5e_fs_get_ttc(ipsec->fs, false); @@ -759,7 +836,7 @@ static int tx_create(struct mlx5e_ipsec *ipsec, struct mlx5e_ipsec_tx *tx, } connect_roce: - err = mlx5_ipsec_fs_roce_tx_create(mdev, roce, tx->ft.pol); + err = mlx5_ipsec_fs_roce_tx_create(mdev, roce, tx->ft.pol, false); if (err) goto err_roce; return 0; @@ -2047,6 +2124,8 @@ int mlx5e_accel_ipsec_fs_init(struct mlx5e_ipsec *ipsec, xa_init_flags(&ipsec->rx_esw->ipsec_obj_id_map, XA_FLAGS_ALLOC1); } else if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_ROCE) { ipsec->roce = mlx5_ipsec_fs_roce_init(mdev, devcom); + } else { + mlx5_core_warn(mdev, "IPsec was initialized without RoCE support\n"); } return 0; @@ -2093,3 +2172,33 @@ bool mlx5e_ipsec_fs_tunnel_enabled(struct mlx5e_ipsec_sa_entry *sa_entry) return rx->allow_tunnel_mode; } + +void mlx5e_ipsec_handle_mpv_event(int event, struct mlx5e_priv *slave_priv, + struct mlx5e_priv *master_priv) +{ + struct mlx5e_ipsec_mpv_work *work; + + reinit_completion(&master_priv->ipsec->comp); + + if (!slave_priv->ipsec) { + complete(&master_priv->ipsec->comp); + return; + } + + work = &slave_priv->ipsec->mpv_work; + + INIT_WORK(&work->work, ipsec_mpv_work_handler); + work->event = event; + work->slave_priv = slave_priv; + work->master_priv = master_priv; + queue_work(slave_priv->ipsec->wq, &work->work); +} + +void mlx5e_ipsec_send_event(struct mlx5e_priv *priv, int event) +{ + if (!priv->ipsec) + return; /* IPsec not supported */ + + mlx5_devcom_send_event(priv->devcom, event, event, priv); + wait_for_completion(&priv->ipsec->comp); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c index 940e350058d10..e5fe967a1e219 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c @@ -5,6 +5,7 @@ #include "en.h" #include "ipsec.h" #include "lib/crypto.h" +#include "lib/ipsec_fs_roce.h" #include "fs_core.h" #include "eswitch.h" @@ -69,7 +70,7 @@ u32 mlx5_ipsec_device_caps(struct mlx5_core_dev *mdev) caps |= MLX5_IPSEC_CAP_ESPINUDP; } - if (mlx5_get_roce_state(mdev) && + if (mlx5_get_roce_state(mdev) && mlx5_ipsec_fs_is_mpv_roce_supported(mdev) && MLX5_CAP_GEN_2(mdev, flow_table_type_2_type) & MLX5_FT_NIC_RX_2_NIC_RX_RDMA && MLX5_CAP_GEN_2(mdev, flow_table_type_2_type) & MLX5_FT_NIC_TX_RDMA_2_NIC_TX) caps |= MLX5_IPSEC_CAP_ROCE; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 1df2f42d9d2b4..db2bf46a1530e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -183,7 +183,20 @@ static int mlx5e_devcom_event_mpv(int event, void *my_data, void *event_data) { struct mlx5e_priv *slave_priv = my_data; - mlx5_devcom_comp_set_ready(slave_priv->devcom, true); + switch (event) { + case MPV_DEVCOM_MASTER_UP: + mlx5_devcom_comp_set_ready(slave_priv->devcom, true); + break; + case MPV_DEVCOM_MASTER_DOWN: + /* no need for comp set ready false since we unregister after + * and it hurts cleanup flow. + */ + break; + case MPV_DEVCOM_IPSEC_MASTER_UP: + case MPV_DEVCOM_IPSEC_MASTER_DOWN: + mlx5e_ipsec_handle_mpv_event(event, my_data, event_data); + break; + } return 0; } @@ -198,15 +211,26 @@ static int mlx5e_devcom_init_mpv(struct mlx5e_priv *priv, u64 *data) if (IS_ERR_OR_NULL(priv->devcom)) return -EOPNOTSUPP; - if (mlx5_core_is_mp_master(priv->mdev)) + if (mlx5_core_is_mp_master(priv->mdev)) { mlx5_devcom_send_event(priv->devcom, MPV_DEVCOM_MASTER_UP, MPV_DEVCOM_MASTER_UP, priv); + mlx5e_ipsec_send_event(priv, MPV_DEVCOM_IPSEC_MASTER_UP); + } return 0; } static void mlx5e_devcom_cleanup_mpv(struct mlx5e_priv *priv) { + if (IS_ERR_OR_NULL(priv->devcom)) + return; + + if (mlx5_core_is_mp_master(priv->mdev)) { + mlx5_devcom_send_event(priv->devcom, MPV_DEVCOM_MASTER_DOWN, + MPV_DEVCOM_MASTER_DOWN, priv); + mlx5e_ipsec_send_event(priv, MPV_DEVCOM_IPSEC_MASTER_DOWN); + } + mlx5_devcom_unregister_component(priv->devcom); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c index cce2193608cd0..234cd00f71a1c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.c @@ -23,6 +23,7 @@ struct mlx5_ipsec_rx_roce { struct mlx5_flow_handle *nic_master_rule; struct mlx5_flow_table *goto_alias_ft; u32 alias_id; + char key[ACCESS_KEY_LEN]; struct mlx5_flow_table *ft_rdma; struct mlx5_flow_namespace *ns_rdma; @@ -34,6 +35,7 @@ struct mlx5_ipsec_tx_roce { struct mlx5_flow_handle *rule; struct mlx5_flow_table *goto_alias_ft; u32 alias_id; + char key[ACCESS_KEY_LEN]; struct mlx5_flow_namespace *ns; }; @@ -54,37 +56,40 @@ static void ipsec_fs_roce_setup_udp_dport(struct mlx5_flow_spec *spec, MLX5_SET(fte_match_param, spec->match_value, outer_headers.udp_dport, dport); } -static bool ipsec_fs_create_alias_supported(struct mlx5_core_dev *mdev, - struct mlx5_core_dev *master_mdev) +static bool ipsec_fs_create_alias_supported_one(struct mlx5_core_dev *mdev) { - u64 obj_allowed_m = MLX5_CAP_GEN_2_64(master_mdev, allowed_object_for_other_vhca_access); - u32 obj_supp_m = MLX5_CAP_GEN_2(master_mdev, cross_vhca_object_to_object_supported); u64 obj_allowed = MLX5_CAP_GEN_2_64(mdev, allowed_object_for_other_vhca_access); u32 obj_supp = MLX5_CAP_GEN_2(mdev, cross_vhca_object_to_object_supported); if (!(obj_supp & - MLX5_CROSS_VHCA_OBJ_TO_OBJ_SUPPORTED_LOCAL_FLOW_TABLE_TO_REMOTE_FLOW_TABLE_MISS) || - !(obj_supp_m & MLX5_CROSS_VHCA_OBJ_TO_OBJ_SUPPORTED_LOCAL_FLOW_TABLE_TO_REMOTE_FLOW_TABLE_MISS)) return false; - if (!(obj_allowed & MLX5_ALLOWED_OBJ_FOR_OTHER_VHCA_ACCESS_FLOW_TABLE) || - !(obj_allowed_m & MLX5_ALLOWED_OBJ_FOR_OTHER_VHCA_ACCESS_FLOW_TABLE)) + if (!(obj_allowed & MLX5_ALLOWED_OBJ_FOR_OTHER_VHCA_ACCESS_FLOW_TABLE)) return false; return true; } +static bool ipsec_fs_create_alias_supported(struct mlx5_core_dev *mdev, + struct mlx5_core_dev *master_mdev) +{ + if (ipsec_fs_create_alias_supported_one(mdev) && + ipsec_fs_create_alias_supported_one(master_mdev)) + return true; + + return false; +} + static int ipsec_fs_create_aliased_ft(struct mlx5_core_dev *ibv_owner, struct mlx5_core_dev *ibv_allowed, struct mlx5_flow_table *ft, - u32 *obj_id) + u32 *obj_id, char *alias_key, bool from_event) { u32 aliased_object_id = (ft->type << FT_ID_FT_TYPE_OFFSET) | ft->id; u16 vhca_id_to_be_accessed = MLX5_CAP_GEN(ibv_owner, vhca_id); struct mlx5_cmd_allow_other_vhca_access_attr allow_attr = {}; struct mlx5_cmd_alias_obj_create_attr alias_attr = {}; - char key[ACCESS_KEY_LEN]; int ret; int i; @@ -92,20 +97,23 @@ static int ipsec_fs_create_aliased_ft(struct mlx5_core_dev *ibv_owner, return -EOPNOTSUPP; for (i = 0; i < ACCESS_KEY_LEN; i++) - key[i] = get_random_u64() & 0xFF; + if (!from_event) + alias_key[i] = get_random_u64() & 0xFF; - memcpy(allow_attr.access_key, key, ACCESS_KEY_LEN); + memcpy(allow_attr.access_key, alias_key, ACCESS_KEY_LEN); allow_attr.obj_type = MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS; allow_attr.obj_id = aliased_object_id; - ret = mlx5_cmd_allow_other_vhca_access(ibv_owner, &allow_attr); - if (ret) { - mlx5_core_err(ibv_owner, "Failed to allow other vhca access err=%d\n", - ret); - return ret; + if (!from_event) { + ret = mlx5_cmd_allow_other_vhca_access(ibv_owner, &allow_attr); + if (ret) { + mlx5_core_err(ibv_owner, "Failed to allow other vhca access err=%d\n", + ret); + return ret; + } } - memcpy(alias_attr.access_key, key, ACCESS_KEY_LEN); + memcpy(alias_attr.access_key, alias_key, ACCESS_KEY_LEN); alias_attr.obj_id = aliased_object_id; alias_attr.obj_type = MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS; alias_attr.vhca_id = vhca_id_to_be_accessed; @@ -268,7 +276,8 @@ static int ipsec_fs_roce_tx_mpv_rule_setup(struct mlx5_core_dev *mdev, static int ipsec_fs_roce_tx_mpv_create_ft(struct mlx5_core_dev *mdev, struct mlx5_ipsec_tx_roce *roce, struct mlx5_flow_table *pol_ft, - struct mlx5e_priv *peer_priv) + struct mlx5e_priv *peer_priv, + bool from_event) { struct mlx5_flow_namespace *roce_ns, *nic_ns; struct mlx5_flow_table_attr ft_attr = {}; @@ -284,7 +293,8 @@ static int ipsec_fs_roce_tx_mpv_create_ft(struct mlx5_core_dev *mdev, if (!nic_ns) return -EOPNOTSUPP; - err = ipsec_fs_create_aliased_ft(mdev, peer_priv->mdev, pol_ft, &roce->alias_id); + err = ipsec_fs_create_aliased_ft(mdev, peer_priv->mdev, pol_ft, &roce->alias_id, roce->key, + from_event); if (err) return err; @@ -365,7 +375,7 @@ static int ipsec_fs_roce_tx_mpv_create_group_rules(struct mlx5_core_dev *mdev, static int ipsec_fs_roce_tx_mpv_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_flow_table *pol_ft, - u32 *in) + u32 *in, bool from_event) { struct mlx5_devcom_comp_dev *tmp = NULL; struct mlx5_ipsec_tx_roce *roce; @@ -383,7 +393,7 @@ static int ipsec_fs_roce_tx_mpv_create(struct mlx5_core_dev *mdev, roce = &ipsec_roce->tx; - err = ipsec_fs_roce_tx_mpv_create_ft(mdev, roce, pol_ft, peer_priv); + err = ipsec_fs_roce_tx_mpv_create_ft(mdev, roce, pol_ft, peer_priv, from_event); if (err) { mlx5_core_err(mdev, "Fail to create RoCE IPsec tables err=%d\n", err); goto release_peer; @@ -503,7 +513,7 @@ static int ipsec_fs_roce_rx_mpv_create(struct mlx5_core_dev *mdev, roce->nic_master_group = g; err = ipsec_fs_create_aliased_ft(peer_priv->mdev, mdev, roce->nic_master_ft, - &roce->alias_id); + &roce->alias_id, roce->key, false); if (err) { mlx5_core_err(mdev, "Fail to create RoCE IPsec rx alias FT err=%d\n", err); goto destroy_group; @@ -555,6 +565,9 @@ void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce, tx_roce = &ipsec_roce->tx; + if (!tx_roce->ft) + return; /* Incase RoCE was cleaned from MPV event flow */ + mlx5_del_flow_rules(tx_roce->rule); mlx5_destroy_flow_group(tx_roce->g); mlx5_destroy_flow_table(tx_roce->ft); @@ -575,11 +588,13 @@ void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce, mlx5_cmd_alias_obj_destroy(peer_priv->mdev, tx_roce->alias_id, MLX5_GENERAL_OBJECT_TYPES_FLOW_TABLE_ALIAS); mlx5_devcom_for_each_peer_end(*ipsec_roce->devcom); + tx_roce->ft = NULL; } int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, - struct mlx5_flow_table *pol_ft) + struct mlx5_flow_table *pol_ft, + bool from_event) { struct mlx5_flow_table_attr ft_attr = {}; struct mlx5_ipsec_tx_roce *roce; @@ -599,7 +614,7 @@ int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, return -ENOMEM; if (mlx5_core_is_mp_slave(mdev)) { - err = ipsec_fs_roce_tx_mpv_create(mdev, ipsec_roce, pol_ft, in); + err = ipsec_fs_roce_tx_mpv_create(mdev, ipsec_roce, pol_ft, in, from_event); goto free_in; } @@ -668,6 +683,8 @@ void mlx5_ipsec_fs_roce_rx_destroy(struct mlx5_ipsec_fs *ipsec_roce, u32 family, rx_roce = (family == AF_INET) ? &ipsec_roce->ipv4_rx : &ipsec_roce->ipv6_rx; + if (!rx_roce->ft) + return; /* Incase RoCE was cleaned from MPV event flow */ if (is_mpv_slave) mlx5_del_flow_rules(rx_roce->nic_master_rule); @@ -679,6 +696,7 @@ void mlx5_ipsec_fs_roce_rx_destroy(struct mlx5_ipsec_fs *ipsec_roce, u32 family, mlx5_destroy_flow_group(rx_roce->roce_miss.group); mlx5_destroy_flow_group(rx_roce->g); mlx5_destroy_flow_table(rx_roce->ft); + rx_roce->ft = NULL; } int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, @@ -798,6 +816,17 @@ int mlx5_ipsec_fs_roce_rx_create(struct mlx5_core_dev *mdev, return err; } +bool mlx5_ipsec_fs_is_mpv_roce_supported(struct mlx5_core_dev *mdev) +{ + if (!mlx5_core_mp_enabled(mdev)) + return true; + + if (ipsec_fs_create_alias_supported_one(mdev)) + return true; + + return false; +} + void mlx5_ipsec_fs_roce_cleanup(struct mlx5_ipsec_fs *ipsec_roce) { kfree(ipsec_roce); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h index 435a480400e5b..2a1af78309fe3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/ipsec_fs_roce.h @@ -21,9 +21,11 @@ void mlx5_ipsec_fs_roce_tx_destroy(struct mlx5_ipsec_fs *ipsec_roce, struct mlx5_core_dev *mdev); int mlx5_ipsec_fs_roce_tx_create(struct mlx5_core_dev *mdev, struct mlx5_ipsec_fs *ipsec_roce, - struct mlx5_flow_table *pol_ft); + struct mlx5_flow_table *pol_ft, + bool from_event); void mlx5_ipsec_fs_roce_cleanup(struct mlx5_ipsec_fs *ipsec_roce); struct mlx5_ipsec_fs *mlx5_ipsec_fs_roce_init(struct mlx5_core_dev *mdev, struct mlx5_devcom_comp_dev **devcom); +bool mlx5_ipsec_fs_is_mpv_roce_supported(struct mlx5_core_dev *mdev); #endif /* __MLX5_LIB_IPSEC_H__ */ From fcd6b0116bc9f9c8c3b7221324e7a4bd841eed25 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Mon, 16 Jan 2023 01:16:43 +0200 Subject: [PATCH 074/548] net/mlx5: Added missing mlx5_ifc definition for HW Steering Add mlx5_ifc definitions that are required for HWS support. Note that due to change in the mlx5_ifc_flow_table_context_bits structure that now includes both SWS and HWS bits in a union, this patch also includes small change in one of SWS files that was required for compilation. Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 34c626c3004a73ee7da5072be48ddff9c2902902) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/steering/dr_cmd.c | 12 +- include/linux/mlx5/mlx5_ifc.h | 207 +++++++++++++++--- 2 files changed, 185 insertions(+), 34 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c index 8c2a34a0d6be8..baefb9a3fa05e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c @@ -251,9 +251,9 @@ int mlx5dr_cmd_query_flow_table(struct mlx5_core_dev *dev, output->level = MLX5_GET(query_flow_table_out, out, flow_table_context.level); output->sw_owner_icm_root_1 = MLX5_GET64(query_flow_table_out, out, - flow_table_context.sw_owner_icm_root_1); + flow_table_context.sws.sw_owner_icm_root_1); output->sw_owner_icm_root_0 = MLX5_GET64(query_flow_table_out, out, - flow_table_context.sw_owner_icm_root_0); + flow_table_context.sws.sw_owner_icm_root_0); return 0; } @@ -480,15 +480,15 @@ int mlx5dr_cmd_create_flow_table(struct mlx5_core_dev *mdev, */ if (attr->table_type == MLX5_FLOW_TABLE_TYPE_NIC_RX) { MLX5_SET64(flow_table_context, ft_mdev, - sw_owner_icm_root_0, attr->icm_addr_rx); + sws.sw_owner_icm_root_0, attr->icm_addr_rx); } else if (attr->table_type == MLX5_FLOW_TABLE_TYPE_NIC_TX) { MLX5_SET64(flow_table_context, ft_mdev, - sw_owner_icm_root_0, attr->icm_addr_tx); + sws.sw_owner_icm_root_0, attr->icm_addr_tx); } else if (attr->table_type == MLX5_FLOW_TABLE_TYPE_FDB) { MLX5_SET64(flow_table_context, ft_mdev, - sw_owner_icm_root_0, attr->icm_addr_rx); + sws.sw_owner_icm_root_0, attr->icm_addr_rx); MLX5_SET64(flow_table_context, ft_mdev, - sw_owner_icm_root_1, attr->icm_addr_tx); + sws.sw_owner_icm_root_1, attr->icm_addr_tx); } } diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 32490cd916a2a..93914578e2d6a 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -80,23 +80,15 @@ enum { enum { MLX5_OBJ_TYPE_SW_ICM = 0x0008, - MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT = 0x23, -}; - -enum { - MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM = (1ULL << MLX5_OBJ_TYPE_SW_ICM), - MLX5_GENERAL_OBJ_TYPES_CAP_GENEVE_TLV_OPT = (1ULL << 11), - MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q = (1ULL << 13), - MLX5_GENERAL_OBJ_TYPES_CAP_HEADER_MODIFY_ARGUMENT = - (1ULL << MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT), - MLX5_GENERAL_OBJ_TYPES_CAP_MACSEC_OFFLOAD = (1ULL << 39), -}; - -enum { MLX5_OBJ_TYPE_GENEVE_TLV_OPT = 0x000b, MLX5_OBJ_TYPE_VIRTIO_NET_Q = 0x000d, MLX5_OBJ_TYPE_VIRTIO_Q_COUNTERS = 0x001c, MLX5_OBJ_TYPE_MATCH_DEFINER = 0x0018, + MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT = 0x23, + MLX5_OBJ_TYPE_STC = 0x0040, + MLX5_OBJ_TYPE_RTC = 0x0041, + MLX5_OBJ_TYPE_STE = 0x0042, + MLX5_OBJ_TYPE_MODIFY_HDR_PATTERN = 0x0043, MLX5_OBJ_TYPE_PAGE_TRACK = 0x46, MLX5_OBJ_TYPE_MKEY = 0xff01, MLX5_OBJ_TYPE_QP = 0xff02, @@ -112,6 +104,16 @@ enum { MLX5_OBJ_TYPE_RQT = 0xff0e, MLX5_OBJ_TYPE_FLOW_COUNTER = 0xff0f, MLX5_OBJ_TYPE_CQ = 0xff10, + MLX5_OBJ_TYPE_FT_ALIAS = 0xff15, +}; + +enum { + MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM = (1ULL << MLX5_OBJ_TYPE_SW_ICM), + MLX5_GENERAL_OBJ_TYPES_CAP_GENEVE_TLV_OPT = (1ULL << 11), + MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q = (1ULL << 13), + MLX5_GENERAL_OBJ_TYPES_CAP_HEADER_MODIFY_ARGUMENT = + (1ULL << MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT), + MLX5_GENERAL_OBJ_TYPES_CAP_MACSEC_OFFLOAD = (1ULL << 39), }; enum { @@ -484,7 +486,13 @@ struct mlx5_ifc_flow_table_prop_layout_bits { u8 reserved_at_66[0x2]; u8 reformat_add_macsec[0x1]; u8 reformat_remove_macsec[0x1]; - u8 reserved_at_6a[0xe]; + u8 reparse[0x1]; + u8 reserved_at_6b[0x1]; + u8 cross_vhca_object[0x1]; + u8 reformat_l2_to_l3_audp_tunnel[0x1]; + u8 reformat_l3_audp_tunnel_to_l2[0x1]; + u8 ignore_flow_level_rtc_valid[0x1]; + u8 reserved_at_70[0x8]; u8 log_max_ft_num[0x8]; u8 reserved_at_80[0x10]; @@ -521,7 +529,15 @@ struct mlx5_ifc_ipv6_layout_bits { u8 ipv6[16][0x8]; }; +struct mlx5_ifc_ipv6_simple_layout_bits { + u8 ipv6_127_96[0x20]; + u8 ipv6_95_64[0x20]; + u8 ipv6_63_32[0x20]; + u8 ipv6_31_0[0x20]; +}; + union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits { + struct mlx5_ifc_ipv6_simple_layout_bits ipv6_simple_layout; struct mlx5_ifc_ipv6_layout_bits ipv6_layout; struct mlx5_ifc_ipv4_layout_bits ipv4_layout; u8 reserved_at_0[0x80]; @@ -897,7 +913,9 @@ struct mlx5_ifc_flow_table_eswitch_cap_bits { u8 reserved_at_8[0x5]; u8 fdb_uplink_hairpin[0x1]; u8 fdb_multi_path_any_table_limit_regc[0x1]; - u8 reserved_at_f[0x3]; + u8 reserved_at_f[0x1]; + u8 fdb_dynamic_tunnel[0x1]; + u8 reserved_at_11[0x1]; u8 fdb_multi_path_any_table[0x1]; u8 reserved_at_13[0x2]; u8 fdb_modify_header_fwd_to_table[0x1]; @@ -936,6 +954,73 @@ struct mlx5_ifc_flow_table_eswitch_cap_bits { u8 reserved_at_1900[0x6700]; }; +struct mlx5_ifc_wqe_based_flow_table_cap_bits { + u8 reserved_at_0[0x3]; + u8 log_max_num_ste[0x5]; + u8 reserved_at_8[0x3]; + u8 log_max_num_stc[0x5]; + u8 reserved_at_10[0x3]; + u8 log_max_num_rtc[0x5]; + u8 reserved_at_18[0x3]; + u8 log_max_num_header_modify_pattern[0x5]; + + u8 rtc_hash_split_table[0x1]; + u8 rtc_linear_lookup_table[0x1]; + u8 reserved_at_22[0x1]; + u8 stc_alloc_log_granularity[0x5]; + u8 reserved_at_28[0x3]; + u8 stc_alloc_log_max[0x5]; + u8 reserved_at_30[0x3]; + u8 ste_alloc_log_granularity[0x5]; + u8 reserved_at_38[0x3]; + u8 ste_alloc_log_max[0x5]; + + u8 reserved_at_40[0xb]; + u8 rtc_reparse_mode[0x5]; + u8 reserved_at_50[0x3]; + u8 rtc_index_mode[0x5]; + u8 reserved_at_58[0x3]; + u8 rtc_log_depth_max[0x5]; + + u8 reserved_at_60[0x10]; + u8 ste_format[0x10]; + + u8 stc_action_type[0x80]; + + u8 header_insert_type[0x10]; + u8 header_remove_type[0x10]; + + u8 trivial_match_definer[0x20]; + + u8 reserved_at_140[0x1b]; + u8 rtc_max_num_hash_definer_gen_wqe[0x5]; + + u8 reserved_at_160[0x18]; + u8 access_index_mode[0x8]; + + u8 reserved_at_180[0x10]; + u8 ste_format_gen_wqe[0x10]; + + u8 linear_match_definer_reg_c3[0x20]; + + u8 fdb_jump_to_tir_stc[0x1]; + u8 reserved_at_1c1[0x1f]; +}; + +struct mlx5_ifc_esw_cap_bits { + u8 reserved_at_0[0x1d]; + u8 merged_eswitch[0x1]; + u8 reserved_at_1e[0x2]; + + u8 reserved_at_20[0x40]; + + u8 esw_manager_vport_number_valid[0x1]; + u8 reserved_at_61[0xf]; + u8 esw_manager_vport_number[0x10]; + + u8 reserved_at_80[0x780]; +}; + enum { MLX5_COUNTER_SOURCE_ESWITCH = 0x0, MLX5_COUNTER_FLOW_ESWITCH = 0x1, @@ -1417,9 +1502,13 @@ enum { }; enum { + MLX5_FLEX_IPV4_OVER_VXLAN_ENABLED = 1 << 0, + MLX5_FLEX_IPV6_OVER_VXLAN_ENABLED = 1 << 1, + MLX5_FLEX_IPV6_OVER_IP_ENABLED = 1 << 2, MLX5_FLEX_PARSER_GENEVE_ENABLED = 1 << 3, MLX5_FLEX_PARSER_MPLS_OVER_GRE_ENABLED = 1 << 4, MLX5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED = 1 << 5, + MLX5_FLEX_P_BIT_VXLAN_GPE_ENABLED = 1 << 6, MLX5_FLEX_PARSER_VXLAN_GPE_ENABLED = 1 << 7, MLX5_FLEX_PARSER_ICMP_V4_ENABLED = 1 << 8, MLX5_FLEX_PARSER_ICMP_V6_ENABLED = 1 << 9, @@ -1624,7 +1713,8 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 pci_sync_for_fw_update_event[0x1]; u8 reserved_at_1f2[0x6]; u8 init2_lag_tx_port_affinity[0x1]; - u8 reserved_at_1fa[0x3]; + u8 reserved_at_1fa[0x2]; + u8 wqe_based_flow_table_update_cap[0x1]; u8 cqe_version[0x4]; u8 compact_address_vector[0x1]; @@ -1937,7 +2027,7 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_760[0x3]; u8 log_max_num_header_modify_argument[0x5]; - u8 reserved_at_768[0x4]; + u8 log_header_modify_argument_granularity_offset[0x4]; u8 log_header_modify_argument_granularity[0x4]; u8 reserved_at_770[0x3]; u8 log_header_modify_argument_max_alloc[0x5]; @@ -1985,7 +2075,8 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_140[0x60]; u8 flow_table_type_2_type[0x8]; - u8 reserved_at_1a8[0x3]; + u8 reserved_at_1a8[0x2]; + u8 format_select_dw_8_6_ext[0x1]; u8 log_min_mkey_entity_size[0x5]; u8 reserved_at_1b0[0x10]; @@ -2001,9 +2092,40 @@ struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_250[0x10]; u8 reserved_at_260[0x120]; - u8 reserved_at_380[0x10]; + + u8 format_select_dw_gtpu_dw_0[0x8]; + u8 format_select_dw_gtpu_dw_1[0x8]; + u8 format_select_dw_gtpu_dw_2[0x8]; + u8 format_select_dw_gtpu_first_ext_dw_0[0x8]; + + u8 generate_wqe_type[0x20]; + + u8 reserved_at_2c0[0xc0]; + + u8 reserved_at_380[0xb]; + u8 min_mkey_log_entity_size_fixed_buffer[0x5]; u8 ec_vf_vport_base[0x10]; - u8 reserved_at_3a0[0x460]; + + u8 reserved_at_3a0[0x10]; + u8 max_rqt_vhca_id[0x10]; + + u8 reserved_at_3c0[0x20]; + + u8 reserved_at_3e0[0x10]; + u8 pcc_ifa2[0x1]; + u8 reserved_at_3f1[0xf]; + + u8 reserved_at_400[0x1]; + u8 min_mkey_log_entity_size_fixed_buffer_valid[0x1]; + u8 reserved_at_402[0xe]; + u8 return_reg_id[0x10]; + + u8 reserved_at_420[0x1c]; + u8 flow_table_hash_type[0x4]; + + u8 reserved_at_440[0x8]; + u8 max_num_eqs_24b[0x18]; + u8 reserved_at_460[0x3a0]; }; enum mlx5_ifc_flow_destination_type { @@ -2046,7 +2168,7 @@ struct mlx5_ifc_extended_dest_format_bits { u8 reserved_at_60[0x20]; }; -union mlx5_ifc_dest_format_struct_flow_counter_list_auto_bits { +union mlx5_ifc_dest_format_flow_counter_list_auto_bits { struct mlx5_ifc_extended_dest_format_bits extended_dest_format; struct mlx5_ifc_flow_counter_list_bits flow_counter_list; }; @@ -2138,7 +2260,10 @@ struct mlx5_ifc_wq_bits { u8 reserved_at_139[0x4]; u8 log_wqe_stride_size[0x3]; - u8 reserved_at_140[0x80]; + u8 dbr_umem_id[0x20]; + u8 wq_umem_id[0x20]; + + u8 wq_umem_offset[0x40]; u8 headers_mkey[0x20]; @@ -3517,6 +3642,8 @@ union mlx5_ifc_hca_cap_union_bits { struct mlx5_ifc_per_protocol_networking_offload_caps_bits per_protocol_networking_offload_caps; struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap; struct mlx5_ifc_flow_table_eswitch_cap_bits flow_table_eswitch_cap; + struct mlx5_ifc_wqe_based_flow_table_cap_bits wqe_based_flow_table_cap; + struct mlx5_ifc_esw_cap_bits esw_cap; struct mlx5_ifc_e_switch_cap_bits e_switch_cap; struct mlx5_ifc_port_selection_cap_bits port_selection_cap; struct mlx5_ifc_qos_cap_bits qos_cap; @@ -3633,7 +3760,7 @@ struct mlx5_ifc_flow_context_bits { u8 reserved_at_1300[0x500]; - union mlx5_ifc_dest_format_struct_flow_counter_list_auto_bits destination[]; + union mlx5_ifc_dest_format_flow_counter_list_auto_bits destination[]; }; enum { @@ -3874,7 +4001,8 @@ struct mlx5_ifc_sqc_bits { u8 reg_umr[0x1]; u8 allow_swp[0x1]; u8 hairpin[0x1]; - u8 reserved_at_f[0xb]; + u8 non_wire[0x1]; + u8 reserved_at_10[0xa]; u8 ts_format[0x2]; u8 reserved_at_1c[0x4]; @@ -4919,6 +5047,16 @@ struct mlx5_ifc_set_fte_in_bits { struct mlx5_ifc_flow_context_bits flow_context; }; +struct mlx5_ifc_dest_format_bits { + u8 destination_type[0x8]; + u8 destination_id[0x18]; + + u8 destination_eswitch_owner_vhca_id_valid[0x1]; + u8 packet_reformat[0x1]; + u8 reserved_at_22[0xe]; + u8 destination_eswitch_owner_vhca_id[0x10]; +}; + struct mlx5_ifc_rts2rts_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; @@ -6111,7 +6249,8 @@ struct mlx5_ifc_flow_table_context_bits { u8 termination_table[0x1]; u8 table_miss_action[0x4]; u8 level[0x8]; - u8 reserved_at_10[0x8]; + u8 rtc_valid[0x1]; + u8 reserved_at_11[0x7]; u8 log_size[0x8]; u8 reserved_at_20[0x8]; @@ -6121,11 +6260,21 @@ struct mlx5_ifc_flow_table_context_bits { u8 lag_master_next_table_id[0x18]; u8 reserved_at_60[0x60]; + union { + struct { + u8 sw_owner_icm_root_1[0x40]; + + u8 sw_owner_icm_root_0[0x40]; + } sws; + struct { + u8 rtc_id_0[0x20]; - u8 sw_owner_icm_root_1[0x40]; + u8 rtc_id_1[0x20]; - u8 sw_owner_icm_root_0[0x40]; + u8 reserved_at_100[0x40]; + } hws; + }; }; struct mlx5_ifc_query_flow_table_out_bits { @@ -8907,7 +9056,9 @@ struct mlx5_ifc_create_qp_in_bits { struct mlx5_ifc_qpc_bits qpc; - u8 reserved_at_800[0x60]; + u8 wq_umem_offset[0x40]; + + u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; From 291a706dd24adab0d51c811d353c435bda3c92fb Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Tue, 17 Jan 2023 17:14:51 +0200 Subject: [PATCH 075/548] net/mlx5: Added missing definitions in preparation for HW Steering As part of preparation for HWS, added missing definitions in qp.h and fs_core.h: - FS_FT_FDB_RX/TX table types that are used by HWS in addition to an existing FS_FT_FDB - MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE that is used by HWS to require fence in WQE Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Reviewed-by: Mark Bloch Signed-off-by: Saeed Mahameed (cherry picked from commit 00b9f0daefd7670356e592d418c890acf9179c94) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/fs_core.h | 8 ++++++-- include/linux/mlx5/qp.h | 1 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h index 4aed1768b85fb..1d48471a3e40d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h @@ -110,7 +110,9 @@ enum fs_flow_table_type { FS_FT_RDMA_RX = 0X7, FS_FT_RDMA_TX = 0X8, FS_FT_PORT_SEL = 0X9, - FS_FT_MAX_TYPE = FS_FT_PORT_SEL, + FS_FT_FDB_RX = 0xa, + FS_FT_FDB_TX = 0xb, + FS_FT_MAX_TYPE = FS_FT_FDB_TX, }; enum fs_flow_table_op_mod { @@ -368,7 +370,9 @@ struct mlx5_flow_root_namespace *find_root(struct fs_node *node); (type == FS_FT_RDMA_RX) ? MLX5_CAP_FLOWTABLE_RDMA_RX(mdev, cap) : \ (type == FS_FT_RDMA_TX) ? MLX5_CAP_FLOWTABLE_RDMA_TX(mdev, cap) : \ (type == FS_FT_PORT_SEL) ? MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) : \ - (BUILD_BUG_ON_ZERO(FS_FT_PORT_SEL != FS_FT_MAX_TYPE))\ + (type == FS_FT_FDB_RX) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) : \ + (type == FS_FT_FDB_TX) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) : \ + (BUILD_BUG_ON_ZERO(FS_FT_FDB_TX != FS_FT_MAX_TYPE))\ ) #endif diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h index ad1ce650146cb..fc7eeff99a8a6 100644 --- a/include/linux/mlx5/qp.h +++ b/include/linux/mlx5/qp.h @@ -149,6 +149,7 @@ enum { MLX5_WQE_CTRL_CQ_UPDATE = 2 << 2, MLX5_WQE_CTRL_CQ_UPDATE_AND_EQE = 3 << 2, MLX5_WQE_CTRL_SOLICITED = 1 << 1, + MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE = 1 << 5, }; enum { From 25ba32f3a841c9f1f4325d68b957fda3d7d63d33 Mon Sep 17 00:00:00 2001 From: Or Har-Toov Date: Mon, 16 Oct 2023 18:00:00 +0300 Subject: [PATCH 076/548] net/mlx5e: Add local loopback counter to vport rep stats Add counter for number of unicast, multicast and broadcast packets/ octets that were loopback. Signed-off-by: Or Har-Toov Reviewed-by: Patrisious Haddad Signed-off-by: Saeed Mahameed (cherry picked from commit b2a62e56b173ceb153cc1651189c537ebb5345cc) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/en_rep.c | 26 ++++++++++++++++++- .../ethernet/mellanox/mlx5/core/en_stats.h | 2 ++ 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 851c499faa795..3d3044242a49b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -113,8 +113,18 @@ static const struct counter_desc vport_rep_stats_desc[] = { tx_vport_rdma_multicast_bytes) }, }; +static const struct counter_desc vport_rep_loopback_stats_desc[] = { + { MLX5E_DECLARE_STAT(struct mlx5e_rep_stats, + vport_loopback_packets) }, + { MLX5E_DECLARE_STAT(struct mlx5e_rep_stats, + vport_loopback_bytes) }, +}; + #define NUM_VPORT_REP_SW_COUNTERS ARRAY_SIZE(sw_rep_stats_desc) #define NUM_VPORT_REP_HW_COUNTERS ARRAY_SIZE(vport_rep_stats_desc) +#define NUM_VPORT_REP_LOOPBACK_COUNTERS(dev) \ + (MLX5_CAP_GEN(dev, vport_counter_local_loopback) ? \ + ARRAY_SIZE(vport_rep_loopback_stats_desc) : 0) static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(sw_rep) { @@ -158,7 +168,8 @@ static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw_rep) static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(vport_rep) { - return NUM_VPORT_REP_HW_COUNTERS; + return NUM_VPORT_REP_HW_COUNTERS + + NUM_VPORT_REP_LOOPBACK_COUNTERS(priv->mdev); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vport_rep) @@ -167,6 +178,9 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vport_rep) for (i = 0; i < NUM_VPORT_REP_HW_COUNTERS; i++) strcpy(data + (idx++) * ETH_GSTRING_LEN, vport_rep_stats_desc[i].format); + for (i = 0; i < NUM_VPORT_REP_LOOPBACK_COUNTERS(priv->mdev); i++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, + vport_rep_loopback_stats_desc[i].format); return idx; } @@ -177,6 +191,9 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport_rep) for (i = 0; i < NUM_VPORT_REP_HW_COUNTERS; i++) data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.rep_stats, vport_rep_stats_desc, i); + for (i = 0; i < NUM_VPORT_REP_LOOPBACK_COUNTERS(priv->mdev); i++) + data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.rep_stats, + vport_rep_loopback_stats_desc, i); return idx; } @@ -248,6 +265,13 @@ static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vport_rep) rep_stats->tx_vport_rdma_multicast_bytes = MLX5_GET_CTR(out, received_ib_multicast.octets); + if (MLX5_CAP_GEN(priv->mdev, vport_counter_local_loopback)) { + rep_stats->vport_loopback_packets = + MLX5_GET_CTR(out, local_loopback.packets); + rep_stats->vport_loopback_bytes = + MLX5_GET_CTR(out, local_loopback.octets); + } + out: kvfree(out); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index 477c547dcc04a..12b3607afecdf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -476,6 +476,8 @@ struct mlx5e_rep_stats { u64 tx_vport_rdma_multicast_packets; u64 rx_vport_rdma_multicast_bytes; u64 tx_vport_rdma_multicast_bytes; + u64 vport_loopback_packets; + u64 vport_loopback_bytes; }; struct mlx5e_stats { From 1df6a8dfc012e459c6e0bb97e3d18efa028b38e4 Mon Sep 17 00:00:00 2001 From: "justinstitt@google.com" Date: Wed, 6 Dec 2023 23:16:10 +0000 Subject: [PATCH 077/548] ethtool: Implement ethtool_puts() Use strscpy() to implement ethtool_puts(). Functionally the same as ethtool_sprintf() when it's used with two arguments or with just "%s" format specifier. Signed-off-by: Justin Stitt Reviewed-by: Przemek Kitszel Reviewed-by: Andrew Lunn Reviewed-by: Madhuri Sripada Signed-off-by: David S. Miller (cherry picked from commit 2a48c635fd9a48699805bbfeee1e4b94b8fe819d) Signed-off-by: Koba Ko --- include/linux/ethtool.h | 14 ++++++++++++++ net/ethtool/ioctl.c | 7 +++++++ 2 files changed, 21 insertions(+) diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 1b523fd48586f..58f436c0ade94 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -1052,4 +1052,18 @@ static inline int ethtool_mm_frag_size_min_to_add(u32 val_min, u32 *val_add, * next string. */ extern __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...); + +/** + * ethtool_puts - Write string to ethtool string data + * @data: Pointer to a pointer to the start of string to update + * @str: String to write + * + * Write string to *data without a trailing newline. Update *data + * to point at start of next string. + * + * Prefer this function to ethtool_sprintf() when given only + * two arguments or if @fmt is just "%s". + */ +extern void ethtool_puts(u8 **data, const char *str); + #endif /* _LINUX_ETHTOOL_H */ diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 4486cbe2faf0c..a5ff1ea7d8a0f 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -1994,6 +1994,13 @@ __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...) } EXPORT_SYMBOL(ethtool_sprintf); +void ethtool_puts(u8 **data, const char *str) +{ + strscpy(*data, str, ETH_GSTRING_LEN); + *data += ETH_GSTRING_LEN; +} +EXPORT_SYMBOL(ethtool_puts); + static int ethtool_phys_id(struct net_device *dev, void __user *useraddr) { struct ethtool_value id; From d5f82648842eedc60c64925a14847f52a0766ef8 Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Tue, 2 Apr 2024 16:30:33 +0300 Subject: [PATCH 078/548] net/mlx5e: Use ethtool_sprintf/puts() to fill priv flags strings Use ethtool_sprintf/puts() helper functions which handle the common pattern of printing a string into the ethtool strings interface and incrementing the string pointer by ETH_GSTRING_LEN. Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit e2d515eb8fcdde06da49be981e13f0c7be4f0b16) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 6f6703e845b92..3795ff2ad50c5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -279,8 +279,7 @@ void mlx5e_ethtool_get_strings(struct mlx5e_priv *priv, u32 stringset, u8 *data) switch (stringset) { case ETH_SS_PRIV_FLAGS: for (i = 0; i < MLX5E_NUM_PFLAGS; i++) - strcpy(data + i * ETH_GSTRING_LEN, - mlx5e_priv_flags[i].name); + ethtool_puts(&data, mlx5e_priv_flags[i].name); break; case ETH_SS_TEST: From 41c1fad8dd9c89d0bfe4402280314ed1e49ecbce Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Tue, 2 Apr 2024 16:30:34 +0300 Subject: [PATCH 079/548] net/mlx5e: Use ethtool_sprintf/puts() to fill selftests strings Use ethtool_sprintf/puts() helper functions which handle the common pattern of printing a string into the ethtool strings interface and incrementing the string pointer by ETH_GSTRING_LEN. The int return value in mlx5e_self_test_fill_strings() is not removed as it is still used to return the number of selftests. Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 9ac9299d41f6b67c9b99f1522824572e50a292f8) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c index c170503b3aace..e561c47606b40 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c @@ -362,7 +362,7 @@ int mlx5e_self_test_fill_strings(struct mlx5e_priv *priv, u8 *data) if (st.cond_func && st.cond_func(priv)) continue; if (data) - strcpy(data + count * ETH_GSTRING_LEN, st.name); + ethtool_puts(&data, st.name); count++; } return count; From 000f1f6b47edbcbc02668adc08e08bdff95e1b5a Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Tue, 2 Apr 2024 16:30:35 +0300 Subject: [PATCH 080/548] net/mlx5e: Use ethtool_sprintf/puts() to fill stats strings Use ethtool_sprintf/puts() helper functions which handle the common pattern of printing a string into the ethtool strings interface and incrementing the string pointer by ETH_GSTRING_LEN. Change the fill_strings callback to accept a **data pointer, and remove the index and return value. Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 89b34322d29381c916933f4728be7cfd08271442) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/en_accel/ipsec_stats.c | 11 +- .../mellanox/mlx5/core/en_accel/ktls.h | 7 +- .../mellanox/mlx5/core/en_accel/ktls_stats.c | 11 +- .../mlx5/core/en_accel/macsec_stats.c | 9 +- .../net/ethernet/mellanox/mlx5/core/en_rep.c | 10 +- .../ethernet/mellanox/mlx5/core/en_stats.c | 179 +++++++----------- .../ethernet/mellanox/mlx5/core/en_stats.h | 4 +- 7 files changed, 85 insertions(+), 146 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c index e0e36a09721c5..fe10b06a5e3a0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c @@ -79,13 +79,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(ipsec_hw) unsigned int i; if (!priv->ipsec) - return idx; + return; for (i = 0; i < NUM_IPSEC_HW_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - mlx5e_ipsec_hw_stats_desc[i].format); - - return idx; + ethtool_puts(data, mlx5e_ipsec_hw_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ipsec_hw) @@ -116,9 +113,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(ipsec_sw) if (priv->ipsec) for (i = 0; i < NUM_IPSEC_SW_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - mlx5e_ipsec_sw_stats_desc[i].format); - return idx; + ethtool_puts(data, mlx5e_ipsec_sw_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ipsec_sw) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h index f11075e676586..15732c85ecb22 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h @@ -93,7 +93,7 @@ int mlx5e_ktls_init(struct mlx5e_priv *priv); void mlx5e_ktls_cleanup(struct mlx5e_priv *priv); int mlx5e_ktls_get_count(struct mlx5e_priv *priv); -int mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t *data); +void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data); int mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 *data); #else @@ -142,10 +142,7 @@ static inline bool mlx5e_is_ktls_rx(struct mlx5_core_dev *mdev) static inline int mlx5e_ktls_init(struct mlx5e_priv *priv) { return 0; } static inline void mlx5e_ktls_cleanup(struct mlx5e_priv *priv) { } static inline int mlx5e_ktls_get_count(struct mlx5e_priv *priv) { return 0; } -static inline int mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t *data) -{ - return 0; -} +static inline void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data) { } static inline int mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 *data) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c index 7c1c0eb167879..06363f2653e06 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c @@ -58,20 +58,17 @@ int mlx5e_ktls_get_count(struct mlx5e_priv *priv) return ARRAY_SIZE(mlx5e_ktls_sw_stats_desc); } -int mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t *data) +void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data) { - unsigned int i, n, idx = 0; + unsigned int i, n; if (!priv->tls) - return 0; + return; n = mlx5e_ktls_get_count(priv); for (i = 0; i < n; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - mlx5e_ktls_sw_stats_desc[i].format); - - return n; + ethtool_puts(data, mlx5e_ktls_sw_stats_desc[i].format); } int mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 *data) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c index 4559ee16a11a2..a79e2786be56b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c @@ -38,16 +38,13 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(macsec_hw) unsigned int i; if (!priv->macsec) - return idx; + return; if (!mlx5e_is_macsec_device(priv->mdev)) - return idx; + return; for (i = 0; i < NUM_MACSEC_HW_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - mlx5e_macsec_hw_stats_desc[i].format); - - return idx; + ethtool_puts(data, mlx5e_macsec_hw_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(macsec_hw) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 3d3044242a49b..fa0e3639b4d35 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -136,9 +136,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(sw_rep) int i; for (i = 0; i < NUM_VPORT_REP_SW_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - sw_rep_stats_desc[i].format); - return idx; + ethtool_puts(data, sw_rep_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(sw_rep) @@ -177,11 +175,9 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vport_rep) int i; for (i = 0; i < NUM_VPORT_REP_HW_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, vport_rep_stats_desc[i].format); + ethtool_puts(data, vport_rep_stats_desc[i].format); for (i = 0; i < NUM_VPORT_REP_LOOPBACK_COUNTERS(priv->mdev); i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - vport_rep_loopback_stats_desc[i].format); - return idx; + ethtool_puts(data, vport_rep_loopback_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport_rep) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index 4b96ad657145b..bbcb738ea1c99 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -97,10 +97,10 @@ void mlx5e_stats_fill_strings(struct mlx5e_priv *priv, u8 *data) { mlx5e_stats_grp_t *stats_grps = priv->profile->stats_grps; const unsigned int num_stats_grps = stats_grps_num(priv); - int i, idx = 0; + int i; for (i = 0; i < num_stats_grps; i++) - idx = stats_grps[i]->fill_strings(priv, data, idx); + stats_grps[i]->fill_strings(priv, &data); } /* Concrete NIC Stats */ @@ -257,8 +257,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(sw) int i; for (i = 0; i < NUM_SW_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, sw_stats_desc[i].format); - return idx; + ethtool_puts(data, sw_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(sw) @@ -579,14 +578,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(qcnt) int i; for (i = 0; i < NUM_Q_COUNTERS && priv->q_counter; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - q_stats_desc[i].format); + ethtool_puts(data, q_stats_desc[i].format); for (i = 0; i < NUM_DROP_RQ_COUNTERS && priv->drop_rq_q_counter; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - drop_rq_stats_desc[i].format); - - return idx; + ethtool_puts(data, drop_rq_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(qcnt) @@ -668,18 +663,13 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vnic_env) int i; for (i = 0; i < NUM_VNIC_ENV_STEER_COUNTERS(priv->mdev); i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - vnic_env_stats_steer_desc[i].format); + ethtool_puts(data, vnic_env_stats_steer_desc[i].format); for (i = 0; i < NUM_VNIC_ENV_DEV_OOB_COUNTERS(priv->mdev); i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - vnic_env_stats_dev_oob_desc[i].format); + ethtool_puts(data, vnic_env_stats_dev_oob_desc[i].format); for (i = 0; i < NUM_VNIC_ENV_DROP_COUNTERS(priv->mdev); i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - vnic_env_stats_drop_desc[i].format); - - return idx; + ethtool_puts(data, vnic_env_stats_drop_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vnic_env) @@ -781,13 +771,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vport) int i; for (i = 0; i < NUM_VPORT_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, vport_stats_desc[i].format); + ethtool_puts(data, vport_stats_desc[i].format); for (i = 0; i < NUM_VPORT_LOOPBACK_COUNTERS(priv->mdev); i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - vport_loopback_stats_desc[i].format); - - return idx; + ethtool_puts(data, vport_loopback_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport) @@ -851,8 +838,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(802_3) int i; for (i = 0; i < NUM_PPORT_802_3_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, pport_802_3_stats_desc[i].format); - return idx; + ethtool_puts(data, pport_802_3_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(802_3) @@ -1012,8 +998,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(2863) int i; for (i = 0; i < NUM_PPORT_2863_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, pport_2863_stats_desc[i].format); - return idx; + ethtool_puts(data, pport_2863_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(2863) @@ -1071,8 +1056,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(2819) int i; for (i = 0; i < NUM_PPORT_2819_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, pport_2819_stats_desc[i].format); - return idx; + ethtool_puts(data, pport_2819_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(2819) @@ -1198,21 +1182,18 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(phy) struct mlx5_core_dev *mdev = priv->mdev; int i; - strcpy(data + (idx++) * ETH_GSTRING_LEN, "link_down_events_phy"); + ethtool_puts(data, "link_down_events_phy"); if (!MLX5_CAP_PCAM_FEATURE(mdev, ppcnt_statistical_group)) - return idx; + return; for (i = 0; i < NUM_PPORT_PHY_STATISTICAL_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pport_phy_statistical_stats_desc[i].format); + ethtool_puts(data, pport_phy_statistical_stats_desc[i].format); if (MLX5_CAP_PCAM_FEATURE(mdev, per_lane_error_counters)) for (i = 0; i < NUM_PPORT_PHY_STATISTICAL_PER_LANE_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pport_phy_statistical_err_lanes_stats_desc[i].format); - - return idx; + ethtool_puts(data, + pport_phy_statistical_err_lanes_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(phy) @@ -1419,9 +1400,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(eth_ext) if (MLX5_CAP_PCAM_FEATURE((priv)->mdev, rx_buffer_fullness_counters)) for (i = 0; i < NUM_PPORT_ETH_EXT_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pport_eth_ext_stats_desc[i].format); - return idx; + ethtool_puts(data, pport_eth_ext_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(eth_ext) @@ -1499,19 +1478,16 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(pcie) if (MLX5_CAP_MCAM_FEATURE((priv)->mdev, pcie_performance_group)) for (i = 0; i < NUM_PCIE_PERF_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pcie_perf_stats_desc[i].format); + ethtool_puts(data, pcie_perf_stats_desc[i].format); if (MLX5_CAP_MCAM_FEATURE((priv)->mdev, tx_overflow_buffer_pkt)) for (i = 0; i < NUM_PCIE_PERF_COUNTERS64; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pcie_perf_stats_desc64[i].format); + ethtool_puts(data, pcie_perf_stats_desc64[i].format); if (MLX5_CAP_MCAM_FEATURE((priv)->mdev, pcie_outbound_stalled)) for (i = 0; i < NUM_PCIE_PERF_STALL_COUNTERS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pcie_perf_stall_stats_desc[i].format); - return idx; + ethtool_puts(data, + pcie_perf_stall_stats_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pcie) @@ -1592,18 +1568,18 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(per_port_buff_congest) int i, prio; if (!MLX5_CAP_GEN(mdev, sbcam_reg)) - return idx; + return; for (prio = 0; prio < NUM_PPORT_PRIO; prio++) { for (i = 0; i < NUM_PPORT_PER_TC_PRIO_COUNTERS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - pport_per_tc_prio_stats_desc[i].format, prio); + ethtool_sprintf(data, + pport_per_tc_prio_stats_desc[i].format, + prio); for (i = 0; i < NUM_PPORT_PER_TC_CONGEST_PRIO_COUNTERS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - pport_per_tc_congest_prio_stats_desc[i].format, prio); + ethtool_sprintf(data, + pport_per_tc_congest_prio_stats_desc[i].format, + prio); } - - return idx; } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(per_port_buff_congest) @@ -1711,19 +1687,17 @@ static int mlx5e_grp_per_prio_traffic_get_num_stats(void) return NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS * NUM_PPORT_PRIO; } -static int mlx5e_grp_per_prio_traffic_fill_strings(struct mlx5e_priv *priv, - u8 *data, - int idx) +static void mlx5e_grp_per_prio_traffic_fill_strings(struct mlx5e_priv *priv, + u8 **data) { int i, prio; for (prio = 0; prio < NUM_PPORT_PRIO; prio++) { for (i = 0; i < NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - pport_per_prio_traffic_stats_desc[i].format, prio); + ethtool_sprintf(data, + pport_per_prio_traffic_stats_desc[i].format, + prio); } - - return idx; } static int mlx5e_grp_per_prio_traffic_fill_stats(struct mlx5e_priv *priv, @@ -1799,9 +1773,8 @@ static int mlx5e_grp_per_prio_pfc_get_num_stats(struct mlx5e_priv *priv) NUM_PPORT_PFC_STALL_COUNTERS(priv); } -static int mlx5e_grp_per_prio_pfc_fill_strings(struct mlx5e_priv *priv, - u8 *data, - int idx) +static void mlx5e_grp_per_prio_pfc_fill_strings(struct mlx5e_priv *priv, + u8 **data) { unsigned long pfc_combined; int i, prio; @@ -1812,23 +1785,22 @@ static int mlx5e_grp_per_prio_pfc_fill_strings(struct mlx5e_priv *priv, char pfc_string[ETH_GSTRING_LEN]; snprintf(pfc_string, sizeof(pfc_string), "prio%d", prio); - sprintf(data + (idx++) * ETH_GSTRING_LEN, - pport_per_prio_pfc_stats_desc[i].format, pfc_string); + ethtool_sprintf(data, + pport_per_prio_pfc_stats_desc[i].format, + pfc_string); } } if (mlx5e_query_global_pause_combined(priv)) { for (i = 0; i < NUM_PPORT_PER_PRIO_PFC_COUNTERS; i++) { - sprintf(data + (idx++) * ETH_GSTRING_LEN, - pport_per_prio_pfc_stats_desc[i].format, "global"); + ethtool_sprintf(data, + pport_per_prio_pfc_stats_desc[i].format, + "global"); } } for (i = 0; i < NUM_PPORT_PFC_STALL_COUNTERS(priv); i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, - pport_pfc_stall_stats_desc[i].format); - - return idx; + ethtool_puts(data, pport_pfc_stall_stats_desc[i].format); } static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv, @@ -1870,9 +1842,8 @@ static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(per_prio) static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(per_prio) { - idx = mlx5e_grp_per_prio_traffic_fill_strings(priv, data, idx); - idx = mlx5e_grp_per_prio_pfc_fill_strings(priv, data, idx); - return idx; + mlx5e_grp_per_prio_traffic_fill_strings(priv, data); + mlx5e_grp_per_prio_pfc_fill_strings(priv, data); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(per_prio) @@ -1927,12 +1898,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(pme) int i; for (i = 0; i < NUM_PME_STATUS_STATS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, mlx5e_pme_status_desc[i].format); + ethtool_puts(data, mlx5e_pme_status_desc[i].format); for (i = 0; i < NUM_PME_ERR_STATS; i++) - strcpy(data + (idx++) * ETH_GSTRING_LEN, mlx5e_pme_error_desc[i].format); - - return idx; + ethtool_puts(data, mlx5e_pme_error_desc[i].format); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pme) @@ -1962,7 +1931,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(tls) static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(tls) { - return idx + mlx5e_ktls_get_strings(priv, data + idx * ETH_GSTRING_LEN); + mlx5e_ktls_get_strings(priv, data); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(tls) @@ -2247,10 +2216,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(qos) for (qid = 0; qid < max_qos_sqs; qid++) for (i = 0; i < NUM_QOS_SQ_STATS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - qos_sq_stats_desc[i].format, qid); - - return idx; + ethtool_sprintf(data, qos_sq_stats_desc[i].format, qid); } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(qos) @@ -2295,29 +2261,29 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(ptp) int i, tc; if (!priv->tx_ptp_opened && !priv->rx_ptp_opened) - return idx; + return; for (i = 0; i < NUM_PTP_CH_STATS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - "%s", ptp_ch_stats_desc[i].format); + ethtool_puts(data, ptp_ch_stats_desc[i].format); if (priv->tx_ptp_opened) { for (tc = 0; tc < priv->max_opened_tc; tc++) for (i = 0; i < NUM_PTP_SQ_STATS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - ptp_sq_stats_desc[i].format, tc); + ethtool_sprintf(data, + ptp_sq_stats_desc[i].format, + tc); for (tc = 0; tc < priv->max_opened_tc; tc++) for (i = 0; i < NUM_PTP_CQ_STATS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - ptp_cq_stats_desc[i].format, tc); + ethtool_sprintf(data, + ptp_cq_stats_desc[i].format, + tc); } if (priv->rx_ptp_opened) { for (i = 0; i < NUM_PTP_RQ_STATS; i++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - ptp_rq_stats_desc[i].format, MLX5E_PTP_CHANNEL_IX); + ethtool_sprintf(data, ptp_rq_stats_desc[i].format, + MLX5E_PTP_CHANNEL_IX); } - return idx; } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ptp) @@ -2377,38 +2343,29 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(channels) for (i = 0; i < max_nch; i++) for (j = 0; j < NUM_CH_STATS; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - ch_stats_desc[j].format, i); + ethtool_sprintf(data, ch_stats_desc[j].format, i); for (i = 0; i < max_nch; i++) { for (j = 0; j < NUM_RQ_STATS; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - rq_stats_desc[j].format, i); + ethtool_sprintf(data, rq_stats_desc[j].format, i); for (j = 0; j < NUM_XSKRQ_STATS * is_xsk; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - xskrq_stats_desc[j].format, i); + ethtool_sprintf(data, xskrq_stats_desc[j].format, i); for (j = 0; j < NUM_RQ_XDPSQ_STATS; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - rq_xdpsq_stats_desc[j].format, i); + ethtool_sprintf(data, rq_xdpsq_stats_desc[j].format, i); } for (tc = 0; tc < priv->max_opened_tc; tc++) for (i = 0; i < max_nch; i++) for (j = 0; j < NUM_SQ_STATS; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - sq_stats_desc[j].format, - i + tc * max_nch); + ethtool_sprintf(data, sq_stats_desc[j].format, + i + tc * max_nch); for (i = 0; i < max_nch; i++) { for (j = 0; j < NUM_XSKSQ_STATS * is_xsk; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - xsksq_stats_desc[j].format, i); + ethtool_sprintf(data, xsksq_stats_desc[j].format, i); for (j = 0; j < NUM_XDPSQ_STATS; j++) - sprintf(data + (idx++) * ETH_GSTRING_LEN, - xdpsq_stats_desc[j].format, i); + ethtool_sprintf(data, xdpsq_stats_desc[j].format, i); } - - return idx; } static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(channels) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index 12b3607afecdf..0552b56ae4f44 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -71,7 +71,7 @@ struct mlx5e_priv; struct mlx5e_stats_grp { u16 update_stats_mask; int (*get_num_stats)(struct mlx5e_priv *priv); - int (*fill_strings)(struct mlx5e_priv *priv, u8 *data, int idx); + void (*fill_strings)(struct mlx5e_priv *priv, u8 **data); int (*fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx); void (*update_stats)(struct mlx5e_priv *priv); }; @@ -87,7 +87,7 @@ typedef const struct mlx5e_stats_grp *const mlx5e_stats_grp_t; void MLX5E_STATS_GRP_OP(grp, update_stats)(struct mlx5e_priv *priv) #define MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(grp) \ - int MLX5E_STATS_GRP_OP(grp, fill_strings)(struct mlx5e_priv *priv, u8 *data, int idx) + void MLX5E_STATS_GRP_OP(grp, fill_strings)(struct mlx5e_priv *priv, u8 **data) #define MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(grp) \ int MLX5E_STATS_GRP_OP(grp, fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx) From 82a24d56bde19bde669db5585eb409fbbdf8895e Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Tue, 2 Apr 2024 16:30:36 +0300 Subject: [PATCH 081/548] net/mlx5e: Make stats group fill_stats callbacks consistent with the API The fill_strings() callbacks were changed to accept a **data pointer, and not rely on propagating the index value. Make a similar change to fill_stats() callbacks to keep the API consistent. Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 27ea84ab35f5980d3b1a71381da8d62f7a4b54be) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/en_accel/ipsec_stats.c | 17 +- .../mellanox/mlx5/core/en_accel/ktls.h | 7 +- .../mellanox/mlx5/core/en_accel/ktls_stats.c | 15 +- .../mlx5/core/en_accel/macsec_stats.c | 13 +- .../net/ethernet/mellanox/mlx5/core/en_rep.c | 18 +- .../ethernet/mellanox/mlx5/core/en_stats.c | 312 ++++++++++-------- .../ethernet/mellanox/mlx5/core/en_stats.h | 6 +- 7 files changed, 215 insertions(+), 173 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c index fe10b06a5e3a0..2352e93540735 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.c @@ -90,14 +90,14 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ipsec_hw) int i; if (!priv->ipsec) - return idx; + return; mlx5e_accel_ipsec_fs_read_stats(priv, &priv->ipsec->hw_stats); for (i = 0; i < NUM_IPSEC_HW_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR_ATOMIC64(&priv->ipsec->hw_stats, - mlx5e_ipsec_hw_stats_desc, i); - - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR_ATOMIC64(&priv->ipsec->hw_stats, + mlx5e_ipsec_hw_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(ipsec_sw) @@ -122,9 +122,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ipsec_sw) if (priv->ipsec) for (i = 0; i < NUM_IPSEC_SW_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR_ATOMIC64(&priv->ipsec->sw_stats, - mlx5e_ipsec_sw_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR_ATOMIC64( + &priv->ipsec->sw_stats, + mlx5e_ipsec_sw_stats_desc, i)); } MLX5E_DEFINE_STATS_GRP(ipsec_hw, 0); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h index 15732c85ecb22..0d5566017ea0d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h @@ -94,7 +94,7 @@ void mlx5e_ktls_cleanup(struct mlx5e_priv *priv); int mlx5e_ktls_get_count(struct mlx5e_priv *priv); void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data); -int mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 *data); +void mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 **data); #else static inline void mlx5e_ktls_build_netdev(struct mlx5e_priv *priv) @@ -144,10 +144,7 @@ static inline void mlx5e_ktls_cleanup(struct mlx5e_priv *priv) { } static inline int mlx5e_ktls_get_count(struct mlx5e_priv *priv) { return 0; } static inline void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data) { } -static inline int mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 *data) -{ - return 0; -} +static inline void mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 **data) { } #endif #endif /* __MLX5E_TLS_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c index 06363f2653e06..7bf79973128bd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c @@ -71,19 +71,18 @@ void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data) ethtool_puts(data, mlx5e_ktls_sw_stats_desc[i].format); } -int mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 *data) +void mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 **data) { - unsigned int i, n, idx = 0; + unsigned int i, n; if (!priv->tls) - return 0; + return; n = mlx5e_ktls_get_count(priv); for (i = 0; i < n; i++) - data[idx++] = MLX5E_READ_CTR_ATOMIC64(&priv->tls->sw_stats, - mlx5e_ktls_sw_stats_desc, - i); - - return n; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR_ATOMIC64(&priv->tls->sw_stats, + mlx5e_ktls_sw_stats_desc, i)); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c index a79e2786be56b..4bb47d48061d5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_stats.c @@ -53,19 +53,18 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(macsec_hw) int i; if (!priv->macsec) - return idx; + return; if (!mlx5e_is_macsec_device(priv->mdev)) - return idx; + return; macsec_fs = priv->mdev->macsec_fs; mlx5_macsec_fs_get_stats_fill(macsec_fs, mlx5_macsec_fs_get_stats(macsec_fs)); for (i = 0; i < NUM_MACSEC_HW_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(mlx5_macsec_fs_get_stats(macsec_fs), - mlx5e_macsec_hw_stats_desc, - i); - - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + mlx5_macsec_fs_get_stats(macsec_fs), + mlx5e_macsec_hw_stats_desc, i)); } MLX5E_DEFINE_STATS_GRP(macsec_hw, 0); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index fa0e3639b4d35..5f3c132231cb6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -144,9 +144,9 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(sw_rep) int i; for (i = 0; i < NUM_VPORT_REP_SW_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.sw, - sw_rep_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU(&priv->stats.sw, + sw_rep_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw_rep) @@ -185,12 +185,14 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport_rep) int i; for (i = 0; i < NUM_VPORT_REP_HW_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.rep_stats, - vport_rep_stats_desc, i); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU(&priv->stats.rep_stats, + vport_rep_stats_desc, i)); for (i = 0; i < NUM_VPORT_REP_LOOPBACK_COUNTERS(priv->mdev); i++) - data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.rep_stats, - vport_rep_loopback_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_CPU(&priv->stats.rep_stats, + vport_rep_loopback_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vport_rep) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index bbcb738ea1c99..c3cca294973d2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -41,6 +41,11 @@ #include #endif +void mlx5e_ethtool_put_stat(u64 **data, u64 val) +{ + *(*data)++ = val; +} + static unsigned int stats_grps_num(struct mlx5e_priv *priv) { return !priv->profile->stats_grps_num ? 0 : @@ -90,7 +95,7 @@ void mlx5e_stats_fill(struct mlx5e_priv *priv, u64 *data, int idx) int i; for (i = 0; i < num_stats_grps; i++) - idx = stats_grps[i]->fill_stats(priv, data, idx); + stats_grps[i]->fill_stats(priv, &data); } void mlx5e_stats_fill_strings(struct mlx5e_priv *priv, u8 *data) @@ -265,8 +270,9 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(sw) int i; for (i = 0; i < NUM_SW_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.sw, sw_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat(data, + MLX5E_READ_CTR64_CPU(&priv->stats.sw, + sw_stats_desc, i)); } static void mlx5e_stats_grp_sw_update_stats_xdp_red(struct mlx5e_sw_stats *s, @@ -589,12 +595,13 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(qcnt) int i; for (i = 0; i < NUM_Q_COUNTERS && priv->q_counter; i++) - data[idx++] = MLX5E_READ_CTR32_CPU(&priv->stats.qcnt, - q_stats_desc, i); + mlx5e_ethtool_put_stat(data, + MLX5E_READ_CTR32_CPU(&priv->stats.qcnt, + q_stats_desc, i)); for (i = 0; i < NUM_DROP_RQ_COUNTERS && priv->drop_rq_q_counter; i++) - data[idx++] = MLX5E_READ_CTR32_CPU(&priv->stats.qcnt, - drop_rq_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR32_CPU(&priv->stats.qcnt, + drop_rq_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(qcnt) @@ -677,18 +684,22 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vnic_env) int i; for (i = 0; i < NUM_VNIC_ENV_STEER_COUNTERS(priv->mdev); i++) - data[idx++] = MLX5E_READ_CTR64_BE(priv->stats.vnic.query_vnic_env_out, - vnic_env_stats_steer_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE(priv->stats.vnic.query_vnic_env_out, + vnic_env_stats_steer_desc, i)); for (i = 0; i < NUM_VNIC_ENV_DEV_OOB_COUNTERS(priv->mdev); i++) - data[idx++] = MLX5E_READ_CTR32_BE(priv->stats.vnic.query_vnic_env_out, - vnic_env_stats_dev_oob_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR32_BE(priv->stats.vnic.query_vnic_env_out, + vnic_env_stats_dev_oob_desc, i)); for (i = 0; i < NUM_VNIC_ENV_DROP_COUNTERS(priv->mdev); i++) - data[idx++] = MLX5E_READ_CTR32_BE(priv->stats.vnic.query_vnic_env_out, - vnic_env_stats_drop_desc, i); - - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR32_BE(priv->stats.vnic.query_vnic_env_out, + vnic_env_stats_drop_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vnic_env) @@ -782,14 +793,16 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport) int i; for (i = 0; i < NUM_VPORT_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_BE(priv->stats.vport.query_vport_out, - vport_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE(priv->stats.vport.query_vport_out, + vport_stats_desc, i)); for (i = 0; i < NUM_VPORT_LOOPBACK_COUNTERS(priv->mdev); i++) - data[idx++] = MLX5E_READ_CTR64_BE(priv->stats.vport.query_vport_out, - vport_loopback_stats_desc, i); - - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE(priv->stats.vport.query_vport_out, + vport_loopback_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vport) @@ -846,9 +859,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(802_3) int i; for (i = 0; i < NUM_PPORT_802_3_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_BE(&priv->stats.pport.IEEE_802_3_counters, - pport_802_3_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_BE( + &priv->stats.pport.IEEE_802_3_counters, + pport_802_3_stats_desc, i)); } #define MLX5_BASIC_PPCNT_SUPPORTED(mdev) \ @@ -1006,9 +1020,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(2863) int i; for (i = 0; i < NUM_PPORT_2863_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_BE(&priv->stats.pport.RFC_2863_counters, - pport_2863_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_BE( + &priv->stats.pport.RFC_2863_counters, + pport_2863_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(2863) @@ -1064,9 +1079,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(2819) int i; for (i = 0; i < NUM_PPORT_2819_COUNTERS; i++) - data[idx++] = MLX5E_READ_CTR64_BE(&priv->stats.pport.RFC_2819_counters, - pport_2819_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_BE( + &priv->stats.pport.RFC_2819_counters, + pport_2819_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(2819) @@ -1202,24 +1218,29 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(phy) int i; /* link_down_events_phy has special handling since it is not stored in __be64 format */ - data[idx++] = MLX5_GET(ppcnt_reg, priv->stats.pport.phy_counters, - counter_set.phys_layer_cntrs.link_down_events); + mlx5e_ethtool_put_stat( + data, MLX5_GET(ppcnt_reg, priv->stats.pport.phy_counters, + counter_set.phys_layer_cntrs.link_down_events)); if (!MLX5_CAP_PCAM_FEATURE(mdev, ppcnt_statistical_group)) - return idx; + return; for (i = 0; i < NUM_PPORT_PHY_STATISTICAL_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pport.phy_statistical_counters, - pport_phy_statistical_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pport.phy_statistical_counters, + pport_phy_statistical_stats_desc, i)); if (MLX5_CAP_PCAM_FEATURE(mdev, per_lane_error_counters)) for (i = 0; i < NUM_PPORT_PHY_STATISTICAL_PER_LANE_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pport.phy_statistical_counters, - pport_phy_statistical_err_lanes_stats_desc, - i); - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pport + .phy_statistical_counters, + pport_phy_statistical_err_lanes_stats_desc, + i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(phy) @@ -1409,10 +1430,11 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(eth_ext) if (MLX5_CAP_PCAM_FEATURE((priv)->mdev, rx_buffer_fullness_counters)) for (i = 0; i < NUM_PPORT_ETH_EXT_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pport.eth_ext_counters, - pport_eth_ext_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pport.eth_ext_counters, + pport_eth_ext_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(eth_ext) @@ -1496,22 +1518,27 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pcie) if (MLX5_CAP_MCAM_FEATURE((priv)->mdev, pcie_performance_group)) for (i = 0; i < NUM_PCIE_PERF_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR32_BE(&priv->stats.pcie.pcie_perf_counters, - pcie_perf_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR32_BE( + &priv->stats.pcie.pcie_perf_counters, + pcie_perf_stats_desc, i)); if (MLX5_CAP_MCAM_FEATURE((priv)->mdev, tx_overflow_buffer_pkt)) for (i = 0; i < NUM_PCIE_PERF_COUNTERS64; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pcie.pcie_perf_counters, - pcie_perf_stats_desc64, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pcie.pcie_perf_counters, + pcie_perf_stats_desc64, i)); if (MLX5_CAP_MCAM_FEATURE((priv)->mdev, pcie_outbound_stalled)) for (i = 0; i < NUM_PCIE_PERF_STALL_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR32_BE(&priv->stats.pcie.pcie_perf_counters, - pcie_perf_stall_stats_desc, i); - return idx; + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR32_BE( + &priv->stats.pcie.pcie_perf_counters, + pcie_perf_stall_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(pcie) @@ -1589,20 +1616,24 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(per_port_buff_congest) int i, prio; if (!MLX5_CAP_GEN(mdev, sbcam_reg)) - return idx; + return; for (prio = 0; prio < NUM_PPORT_PRIO; prio++) { for (i = 0; i < NUM_PPORT_PER_TC_PRIO_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&pport->per_tc_prio_counters[prio], - pport_per_tc_prio_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &pport->per_tc_prio_counters[prio], + pport_per_tc_prio_stats_desc, i)); for (i = 0; i < NUM_PPORT_PER_TC_CONGEST_PRIO_COUNTERS ; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&pport->per_tc_congest_prio_counters[prio], - pport_per_tc_congest_prio_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &pport->per_tc_congest_prio_counters + [prio], + pport_per_tc_congest_prio_stats_desc, + i)); } - - return idx; } static void mlx5e_grp_per_tc_prio_update_stats(struct mlx5e_priv *priv) @@ -1700,20 +1731,20 @@ static void mlx5e_grp_per_prio_traffic_fill_strings(struct mlx5e_priv *priv, } } -static int mlx5e_grp_per_prio_traffic_fill_stats(struct mlx5e_priv *priv, - u64 *data, - int idx) +static void mlx5e_grp_per_prio_traffic_fill_stats(struct mlx5e_priv *priv, + u64 **data) { int i, prio; for (prio = 0; prio < NUM_PPORT_PRIO; prio++) { for (i = 0; i < NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS; i++) - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pport.per_prio_counters[prio], - pport_per_prio_traffic_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pport + .per_prio_counters[prio], + pport_per_prio_traffic_stats_desc, i)); } - - return idx; } static const struct counter_desc pport_per_prio_pfc_stats_desc[] = { @@ -1803,9 +1834,8 @@ static void mlx5e_grp_per_prio_pfc_fill_strings(struct mlx5e_priv *priv, ethtool_puts(data, pport_pfc_stall_stats_desc[i].format); } -static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv, - u64 *data, - int idx) +static void mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv, + u64 **data) { unsigned long pfc_combined; int i, prio; @@ -1813,25 +1843,30 @@ static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv, pfc_combined = mlx5e_query_pfc_combined(priv); for_each_set_bit(prio, &pfc_combined, NUM_PPORT_PRIO) { for (i = 0; i < NUM_PPORT_PER_PRIO_PFC_COUNTERS; i++) { - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pport.per_prio_counters[prio], - pport_per_prio_pfc_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pport + .per_prio_counters[prio], + pport_per_prio_pfc_stats_desc, i)); } } if (mlx5e_query_global_pause_combined(priv)) { for (i = 0; i < NUM_PPORT_PER_PRIO_PFC_COUNTERS; i++) { - data[idx++] = - MLX5E_READ_CTR64_BE(&priv->stats.pport.per_prio_counters[0], - pport_per_prio_pfc_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_BE( + &priv->stats.pport.per_prio_counters[0], + pport_per_prio_pfc_stats_desc, i)); } } for (i = 0; i < NUM_PPORT_PFC_STALL_COUNTERS(priv); i++) - data[idx++] = MLX5E_READ_CTR64_BE(&priv->stats.pport.per_prio_counters[0], - pport_pfc_stall_stats_desc, i); - - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_BE( + &priv->stats.pport.per_prio_counters[0], + pport_pfc_stall_stats_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(per_prio) @@ -1848,9 +1883,8 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(per_prio) static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(per_prio) { - idx = mlx5e_grp_per_prio_traffic_fill_stats(priv, data, idx); - idx = mlx5e_grp_per_prio_pfc_fill_stats(priv, data, idx); - return idx; + mlx5e_grp_per_prio_traffic_fill_stats(priv, data); + mlx5e_grp_per_prio_pfc_fill_stats(priv, data); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(per_prio) @@ -1912,14 +1946,14 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pme) mlx5_get_pme_stats(priv->mdev, &pme_stats); for (i = 0; i < NUM_PME_STATUS_STATS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(pme_stats.status_counters, - mlx5e_pme_status_desc, i); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU(pme_stats.status_counters, + mlx5e_pme_status_desc, i)); for (i = 0; i < NUM_PME_ERR_STATS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(pme_stats.error_counters, - mlx5e_pme_error_desc, i); - - return idx; + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU(pme_stats.error_counters, + mlx5e_pme_error_desc, i)); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(pme) { return; } @@ -1936,7 +1970,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(tls) static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(tls) { - return idx + mlx5e_ktls_get_stats(priv, data + idx); + mlx5e_ktls_get_stats(priv, data); } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(tls) { return; } @@ -2233,10 +2267,10 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(qos) struct mlx5e_sq_stats *s = READ_ONCE(stats[qid]); for (i = 0; i < NUM_QOS_SQ_STATS; i++) - data[idx++] = MLX5E_READ_CTR64_CPU(s, qos_sq_stats_desc, i); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_CPU(s, qos_sq_stats_desc, i)); } - - return idx; } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(qos) { return; } @@ -2291,33 +2325,35 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ptp) int i, tc; if (!priv->tx_ptp_opened && !priv->rx_ptp_opened) - return idx; + return; for (i = 0; i < NUM_PTP_CH_STATS; i++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->ptp_stats.ch, - ptp_ch_stats_desc, i); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU(&priv->ptp_stats.ch, + ptp_ch_stats_desc, i)); if (priv->tx_ptp_opened) { for (tc = 0; tc < priv->max_opened_tc; tc++) for (i = 0; i < NUM_PTP_SQ_STATS; i++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->ptp_stats.sq[tc], - ptp_sq_stats_desc, i); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->ptp_stats.sq[tc], + ptp_sq_stats_desc, i)); for (tc = 0; tc < priv->max_opened_tc; tc++) for (i = 0; i < NUM_PTP_CQ_STATS; i++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->ptp_stats.cq[tc], - ptp_cq_stats_desc, i); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->ptp_stats.cq[tc], + ptp_cq_stats_desc, i)); } if (priv->rx_ptp_opened) { for (i = 0; i < NUM_PTP_RQ_STATS; i++) - data[idx++] = + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU(&priv->ptp_stats.rq, - ptp_rq_stats_desc, i); + ptp_rq_stats_desc, i)); } - return idx; } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(ptp) { return; } @@ -2376,44 +2412,50 @@ static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(channels) for (i = 0; i < max_nch; i++) for (j = 0; j < NUM_CH_STATS; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->ch, - ch_stats_desc, j); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->ch, + ch_stats_desc, j)); for (i = 0; i < max_nch; i++) { for (j = 0; j < NUM_RQ_STATS; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->rq, - rq_stats_desc, j); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->rq, + rq_stats_desc, j)); for (j = 0; j < NUM_XSKRQ_STATS * is_xsk; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->xskrq, - xskrq_stats_desc, j); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->xskrq, + xskrq_stats_desc, j)); for (j = 0; j < NUM_RQ_XDPSQ_STATS; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->rq_xdpsq, - rq_xdpsq_stats_desc, j); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->rq_xdpsq, + rq_xdpsq_stats_desc, j)); } for (tc = 0; tc < priv->max_opened_tc; tc++) for (i = 0; i < max_nch; i++) for (j = 0; j < NUM_SQ_STATS; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->sq[tc], - sq_stats_desc, j); + mlx5e_ethtool_put_stat( + data, + MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->sq[tc], + sq_stats_desc, j)); for (i = 0; i < max_nch; i++) { for (j = 0; j < NUM_XSKSQ_STATS * is_xsk; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->xsksq, - xsksq_stats_desc, j); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->xsksq, + xsksq_stats_desc, j)); for (j = 0; j < NUM_XDPSQ_STATS; j++) - data[idx++] = - MLX5E_READ_CTR64_CPU(&priv->channel_stats[i]->xdpsq, - xdpsq_stats_desc, j); + mlx5e_ethtool_put_stat( + data, MLX5E_READ_CTR64_CPU( + &priv->channel_stats[i]->xdpsq, + xdpsq_stats_desc, j)); } - - return idx; } static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(channels) { return; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index 0552b56ae4f44..b71e3fdf92c5f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -72,10 +72,12 @@ struct mlx5e_stats_grp { u16 update_stats_mask; int (*get_num_stats)(struct mlx5e_priv *priv); void (*fill_strings)(struct mlx5e_priv *priv, u8 **data); - int (*fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx); + void (*fill_stats)(struct mlx5e_priv *priv, u64 **data); void (*update_stats)(struct mlx5e_priv *priv); }; +void mlx5e_ethtool_put_stat(u64 **data, u64 val); + typedef const struct mlx5e_stats_grp *const mlx5e_stats_grp_t; #define MLX5E_STATS_GRP_OP(grp, name) mlx5e_stats_grp_ ## grp ## _ ## name @@ -90,7 +92,7 @@ typedef const struct mlx5e_stats_grp *const mlx5e_stats_grp_t; void MLX5E_STATS_GRP_OP(grp, fill_strings)(struct mlx5e_priv *priv, u8 **data) #define MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(grp) \ - int MLX5E_STATS_GRP_OP(grp, fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx) + void MLX5E_STATS_GRP_OP(grp, fill_stats)(struct mlx5e_priv *priv, u64 **data) #define MLX5E_STATS_GRP(grp) mlx5e_stats_grp_ ## grp From 92c18f1dd9879bf486b047901af49a2512a8e3c2 Mon Sep 17 00:00:00 2001 From: Tariq Toukan Date: Tue, 2 Apr 2024 16:30:37 +0300 Subject: [PATCH 082/548] net/mlx5e: debugfs, Add reset option for command interface stats Resetting stats just before some test/debug case allows us to eliminate out the impact of previous commands. Useful in particular for the average latency calculation. The average_write() callback was unreachable, as "average" is a read-only file. Extend, rename, and use it for a newly exposed write-only "reset" file. Signed-off-by: Tariq Toukan Reviewed-by: Saeed Mahameed Reviewed-by: Gal Pressman Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 19b85f1b37ce1d97158ca2b1d7c1a3b3e640c813) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/debugfs.c | 22 ++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c index 09652dc891154..36806e813c33c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/debugfs.c @@ -143,8 +143,8 @@ static ssize_t average_read(struct file *filp, char __user *buf, size_t count, return simple_read_from_buffer(buf, count, pos, tbuf, ret); } -static ssize_t average_write(struct file *filp, const char __user *buf, - size_t count, loff_t *pos) +static ssize_t reset_write(struct file *filp, const char __user *buf, + size_t count, loff_t *pos) { struct mlx5_cmd_stats *stats; @@ -152,6 +152,11 @@ static ssize_t average_write(struct file *filp, const char __user *buf, spin_lock_irq(&stats->lock); stats->sum = 0; stats->n = 0; + stats->failed = 0; + stats->failed_mbox_status = 0; + stats->last_failed_errno = 0; + stats->last_failed_mbox_status = 0; + stats->last_failed_syndrome = 0; spin_unlock_irq(&stats->lock); *pos += count; @@ -159,11 +164,16 @@ static ssize_t average_write(struct file *filp, const char __user *buf, return count; } -static const struct file_operations stats_fops = { +static const struct file_operations reset_fops = { + .owner = THIS_MODULE, + .open = simple_open, + .write = reset_write, +}; + +static const struct file_operations average_fops = { .owner = THIS_MODULE, .open = simple_open, .read = average_read, - .write = average_write, }; static ssize_t slots_read(struct file *filp, char __user *buf, size_t count, @@ -228,8 +238,10 @@ void mlx5_cmdif_debugfs_init(struct mlx5_core_dev *dev) continue; stats->root = debugfs_create_dir(namep, *cmd); + debugfs_create_file("reset", 0200, stats->root, stats, + &reset_fops); debugfs_create_file("average", 0400, stats->root, stats, - &stats_fops); + &average_fops); debugfs_create_u64("n", 0400, stats->root, &stats->n); debugfs_create_u64("failed", 0400, stats->root, &stats->failed); debugfs_create_u64("failed_mbox_status", 0400, stats->root, From adddbcd5e80b07da49c9ae86c0b2303c6971cbeb Mon Sep 17 00:00:00 2001 From: Carolina Jubran Date: Tue, 2 Apr 2024 16:30:38 +0300 Subject: [PATCH 083/548] net/mlx5e: XDP, Fix an inconsistent comment Starting from commit eb9b9fdcafe2 ("net/mlx5e: Introduce extended version for mlx5e_xmit_data") sinfo is no longer passed as an argument to mlx5e_xmit_xdp_frame(), the comment is inconsistent. check_result must be zero when the packet is fragmented. Signed-off-by: Carolina Jubran Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 595f41608dbae55268f63f629acc95a1e0f08a65) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index b723ff5e5249c..1443a5f7e1a6c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -518,7 +518,7 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, linear = !!(dma_len - inline_hdr_sz); ds_cnt = MLX5E_TX_WQE_EMPTY_DS_COUNT + linear + !!inline_hdr_sz; - /* check_result must be 0 if sinfo is passed. */ + /* check_result must be 0 if xdptxd->has_frags is true. */ if (!check_result) { int stop_room = 1; From 68d718c8febe95ded9e9a55598b699650ce01171 Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Tue, 2 Apr 2024 16:30:39 +0300 Subject: [PATCH 084/548] net/mlx5: Convert uintX_t to uX In the kernel, the preferred types are uX. Signed-off-by: Gal Pressman Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240402133043.56322-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 30f8d23814ea4bb0229a78f7f8f36500c2e2d962) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h | 4 ++-- drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/fw.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h | 2 +- drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c | 4 ++-- include/linux/mlx5/device.h | 2 +- 11 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 79fd14b63a470..7c98d9b88be3f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -1148,7 +1148,7 @@ void mlx5e_vxlan_set_netdev_info(struct mlx5e_priv *priv); void mlx5e_ethtool_get_drvinfo(struct mlx5e_priv *priv, struct ethtool_drvinfo *drvinfo); void mlx5e_ethtool_get_strings(struct mlx5e_priv *priv, - uint32_t stringset, uint8_t *data); + u32 stringset, u8 *data); int mlx5e_ethtool_get_sset_count(struct mlx5e_priv *priv, int sset); void mlx5e_ethtool_get_ethtool_stats(struct mlx5e_priv *priv, struct ethtool_stats *stats, u64 *data); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c index c7d191f66ad1b..4f83e3172767a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c @@ -73,7 +73,7 @@ void mlx5e_accel_fs_del_sk(struct mlx5_flow_handle *rule) struct mlx5_flow_handle *mlx5e_accel_fs_add_sk(struct mlx5e_flow_steering *fs, struct sock *sk, u32 tirn, - uint32_t flow_tag) + u32 flow_tag) { struct mlx5e_accel_fs_tcp *fs_tcp = mlx5e_fs_get_accel_tcp(fs); struct mlx5_flow_destination dest = {}; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h index a032bff482a6f..7e899c716267e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h @@ -11,14 +11,14 @@ int mlx5e_accel_fs_tcp_create(struct mlx5e_flow_steering *fs); void mlx5e_accel_fs_tcp_destroy(struct mlx5e_flow_steering *fs); struct mlx5_flow_handle *mlx5e_accel_fs_add_sk(struct mlx5e_flow_steering *fs, struct sock *sk, u32 tirn, - uint32_t flow_tag); + u32 flow_tag); void mlx5e_accel_fs_del_sk(struct mlx5_flow_handle *rule); #else static inline int mlx5e_accel_fs_tcp_create(struct mlx5e_flow_steering *fs) { return 0; } static inline void mlx5e_accel_fs_tcp_destroy(struct mlx5e_flow_steering *fs) {} static inline struct mlx5_flow_handle *mlx5e_accel_fs_add_sk(struct mlx5e_flow_steering *fs, struct sock *sk, u32 tirn, - uint32_t flow_tag) + u32 flow_tag) { return ERR_PTR(-EOPNOTSUPP); } static inline void mlx5e_accel_fs_del_sk(struct mlx5_flow_handle *rule) {} #endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h index 0d5566017ea0d..3f956962e995b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h @@ -93,7 +93,7 @@ int mlx5e_ktls_init(struct mlx5e_priv *priv); void mlx5e_ktls_cleanup(struct mlx5e_priv *priv); int mlx5e_ktls_get_count(struct mlx5e_priv *priv); -void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data); +void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, u8 **data); void mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 **data); #else @@ -142,7 +142,7 @@ static inline bool mlx5e_is_ktls_rx(struct mlx5_core_dev *mdev) static inline int mlx5e_ktls_init(struct mlx5e_priv *priv) { return 0; } static inline void mlx5e_ktls_cleanup(struct mlx5e_priv *priv) { } static inline int mlx5e_ktls_get_count(struct mlx5e_priv *priv) { return 0; } -static inline void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data) { } +static inline void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, u8 **data) { } static inline void mlx5e_ktls_get_stats(struct mlx5e_priv *priv, u64 **data) { } #endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c index 7bf79973128bd..60be2d72eb9e5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_stats.c @@ -58,7 +58,7 @@ int mlx5e_ktls_get_count(struct mlx5e_priv *priv) return ARRAY_SIZE(mlx5e_ktls_sw_stats_desc); } -void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, uint8_t **data) +void mlx5e_ktls_get_strings(struct mlx5e_priv *priv, u8 **data) { unsigned int i, n; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 5f3c132231cb6..f5105aaa7e3f5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -275,7 +275,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vport_rep) } static void mlx5e_rep_get_strings(struct net_device *dev, - u32 stringset, uint8_t *data) + u32 stringset, u8 *data) { struct mlx5e_priv *priv = netdev_priv(dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c index 70898f0a9866c..a27896cdaf869 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c @@ -283,7 +283,7 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev) return 0; } -int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, uint32_t *sw_owner_id) +int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, u32 *sw_owner_id) { u32 in[MLX5_ST_SZ_DW(init_hca_in)] = {}; int i; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 19ffd1816474a..78c852f99a3e0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -202,7 +202,7 @@ int mlx5_cmd_enable(struct mlx5_core_dev *dev); void mlx5_cmd_disable(struct mlx5_core_dev *dev); void mlx5_cmd_set_state(struct mlx5_core_dev *dev, enum mlx5_cmdif_state cmdif_state); -int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, uint32_t *sw_owner_id); +int mlx5_cmd_init_hca(struct mlx5_core_dev *dev, u32 *sw_owner_id); int mlx5_cmd_teardown_hca(struct mlx5_core_dev *dev); int mlx5_cmd_force_teardown_hca(struct mlx5_core_dev *dev); int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c index f708b029425ac..e9f6c7ed7a7be 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c @@ -1883,7 +1883,7 @@ dr_ste_v0_build_tnl_gtpu_flex_parser_1_init(struct mlx5dr_ste_build *sb, static int dr_ste_v0_build_tnl_header_0_1_tag(struct mlx5dr_match_param *value, struct mlx5dr_ste_build *sb, - uint8_t *tag) + u8 *tag) { struct mlx5dr_match_misc5 *misc5 = &value->misc5; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c index dd856cde188de..1d49704b95427 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c @@ -1897,7 +1897,7 @@ void dr_ste_v1_build_flex_parser_tnl_geneve_init(struct mlx5dr_ste_build *sb, static int dr_ste_v1_build_tnl_header_0_1_tag(struct mlx5dr_match_param *value, struct mlx5dr_ste_build *sb, - uint8_t *tag) + u8 *tag) { struct mlx5dr_match_misc5 *misc5 = &value->misc5; @@ -2129,7 +2129,7 @@ dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_init(struct mlx5dr_ste_build *sb, static int dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_tag(struct mlx5dr_match_param *value, struct mlx5dr_ste_build *sb, - uint8_t *tag) + u8 *tag) { u8 parser_id = sb->caps->flex_parser_id_geneve_tlv_option_0; struct mlx5dr_match_misc *misc = &value->misc; diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 4e476789244d6..06f81b5775ba9 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -68,7 +68,7 @@ #define MLX5_UN_SZ_BYTES(typ) (sizeof(union mlx5_ifc_##typ##_bits) / 8) #define MLX5_UN_SZ_DW(typ) (sizeof(union mlx5_ifc_##typ##_bits) / 32) #define MLX5_BYTE_OFF(typ, fld) (__mlx5_bit_off(typ, fld) / 8) -#define MLX5_ADDR_OF(typ, p, fld) ((void *)((uint8_t *)(p) + MLX5_BYTE_OFF(typ, fld))) +#define MLX5_ADDR_OF(typ, p, fld) ((void *)((u8 *)(p) + MLX5_BYTE_OFF(typ, fld))) /* insert a value to a struct */ #define MLX5_SET(typ, p, fld, v) do { \ From 43b5002c202b8a902bf3e24fd18768de0ff881cc Mon Sep 17 00:00:00 2001 From: Gal Pressman Date: Tue, 2 Apr 2024 16:30:40 +0300 Subject: [PATCH 085/548] net/mlx5e: Add support for 800Gbps link modes Add support for 800Gbps speed, link modes of 100Gbps per lane. Signed-off-by: Gal Pressman Reviewed-by: Cosmin Ratiu Signed-off-by: Tariq Toukan Link: https://lore.kernel.org/r/20240402133043.56322-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 8c54c89ad45a36fa5f0a3426b5e2c504e9c338ee) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index 3795ff2ad50c5..bb2863439fc1f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -229,6 +229,13 @@ void mlx5e_build_ptys2ethtool_map(void) ETHTOOL_LINK_MODE_400000baseLR4_ER4_FR4_Full_BIT, ETHTOOL_LINK_MODE_400000baseDR4_Full_BIT, ETHTOOL_LINK_MODE_400000baseCR4_Full_BIT); + MLX5_BUILD_PTYS2ETHTOOL_CONFIG(MLX5E_800GAUI_8_800GBASE_CR8_KR8, ext, + ETHTOOL_LINK_MODE_800000baseCR8_Full_BIT, + ETHTOOL_LINK_MODE_800000baseKR8_Full_BIT, + ETHTOOL_LINK_MODE_800000baseDR8_Full_BIT, + ETHTOOL_LINK_MODE_800000baseDR8_2_Full_BIT, + ETHTOOL_LINK_MODE_800000baseSR8_Full_BIT, + ETHTOOL_LINK_MODE_800000baseVR8_Full_BIT); } static void mlx5e_ethtool_get_speed_arr(struct mlx5_core_dev *mdev, From 9543be0eee671cb3d02c72fc76fbe7e9bc1a3f69 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 2 Apr 2024 16:30:41 +0300 Subject: [PATCH 086/548] net/mlx5: Support matching on l4_type for ttc_table Replace matching on TCP and UDP protocols with new l4_type field which is parsed by steering for ttc_table. It is enabled by the outer_l4_type or inner_l4_type bits in nic_rx or port_sel flow table capabilities and used only if pcc_ifa2 bit in HCA capabilities is set. Signed-off-by: Jianbo Liu Reviewed-by: Mark Bloch Signed-off-by: Tariq Toukan Link: https://lore.kernel.org/r/20240402133043.56322-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 137f3d50ad2a0f2e1ebe5181d6b32a5541786b99) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/en_fs.c | 6 +- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 3 +- .../mellanox/mlx5/core/lag/port_sel.c | 8 +- .../ethernet/mellanox/mlx5/core/lib/fs_ttc.c | 254 ++++++++++++++---- .../ethernet/mellanox/mlx5/core/lib/fs_ttc.h | 2 +- include/linux/mlx5/device.h | 6 + include/linux/mlx5/mlx5_ifc.h | 24 +- 7 files changed, 237 insertions(+), 66 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c index 934b0d5ce1b3f..aa87ac957d557 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c @@ -896,8 +896,7 @@ static void mlx5e_set_inner_ttc_params(struct mlx5e_flow_steering *fs, int tt; memset(ttc_params, 0, sizeof(*ttc_params)); - ttc_params->ns = mlx5_get_flow_namespace(fs->mdev, - MLX5_FLOW_NAMESPACE_KERNEL); + ttc_params->ns_type = MLX5_FLOW_NAMESPACE_KERNEL; ft_attr->level = MLX5E_INNER_TTC_FT_LEVEL; ft_attr->prio = MLX5E_NIC_PRIO; @@ -920,8 +919,7 @@ void mlx5e_set_ttc_params(struct mlx5e_flow_steering *fs, int tt; memset(ttc_params, 0, sizeof(*ttc_params)); - ttc_params->ns = mlx5_get_flow_namespace(fs->mdev, - MLX5_FLOW_NAMESPACE_KERNEL); + ttc_params->ns_type = MLX5_FLOW_NAMESPACE_KERNEL; ft_attr->level = MLX5E_TTC_FT_LEVEL; ft_attr->prio = MLX5E_NIC_PRIO; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 2be9c69daad5f..1625836e11478 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -833,8 +833,7 @@ static void mlx5e_hairpin_set_ttc_params(struct mlx5e_hairpin *hp, memset(ttc_params, 0, sizeof(*ttc_params)); - ttc_params->ns = mlx5_get_flow_namespace(hp->func_mdev, - MLX5_FLOW_NAMESPACE_KERNEL); + ttc_params->ns_type = MLX5_FLOW_NAMESPACE_KERNEL; for (tt = 0; tt < MLX5_NUM_TT; tt++) { ttc_params->dests[tt].type = MLX5_FLOW_DESTINATION_TYPE_TIR; ttc_params->dests[tt].tir_num = diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c index 9faa9ef863a1b..2ffa26845cba1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c @@ -453,13 +453,11 @@ static void set_tt_map(struct mlx5_lag_port_sel *port_sel, static void mlx5_lag_set_inner_ttc_params(struct mlx5_lag *ldev, struct ttc_params *ttc_params) { - struct mlx5_core_dev *dev = ldev->pf[MLX5_LAG_P1].dev; struct mlx5_lag_port_sel *port_sel = &ldev->port_sel; struct mlx5_flow_table_attr *ft_attr; int tt; - ttc_params->ns = mlx5_get_flow_namespace(dev, - MLX5_FLOW_NAMESPACE_PORT_SEL); + ttc_params->ns_type = MLX5_FLOW_NAMESPACE_PORT_SEL; ft_attr = &ttc_params->ft_attr; ft_attr->level = MLX5_LAG_FT_LEVEL_INNER_TTC; @@ -474,13 +472,11 @@ static void mlx5_lag_set_inner_ttc_params(struct mlx5_lag *ldev, static void mlx5_lag_set_outer_ttc_params(struct mlx5_lag *ldev, struct ttc_params *ttc_params) { - struct mlx5_core_dev *dev = ldev->pf[MLX5_LAG_P1].dev; struct mlx5_lag_port_sel *port_sel = &ldev->port_sel; struct mlx5_flow_table_attr *ft_attr; int tt; - ttc_params->ns = mlx5_get_flow_namespace(dev, - MLX5_FLOW_NAMESPACE_PORT_SEL); + ttc_params->ns_type = MLX5_FLOW_NAMESPACE_PORT_SEL; ft_attr = &ttc_params->ft_attr; ft_attr->level = MLX5_LAG_FT_LEVEL_TTC; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.c index b78f2ba25c19b..9f13cea164465 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.c @@ -9,21 +9,24 @@ #include "mlx5_core.h" #include "lib/fs_ttc.h" -#define MLX5_TTC_NUM_GROUPS 3 -#define MLX5_TTC_GROUP1_SIZE (BIT(3) + MLX5_NUM_TUNNEL_TT) -#define MLX5_TTC_GROUP2_SIZE BIT(1) -#define MLX5_TTC_GROUP3_SIZE BIT(0) -#define MLX5_TTC_TABLE_SIZE (MLX5_TTC_GROUP1_SIZE +\ - MLX5_TTC_GROUP2_SIZE +\ - MLX5_TTC_GROUP3_SIZE) - -#define MLX5_INNER_TTC_NUM_GROUPS 3 -#define MLX5_INNER_TTC_GROUP1_SIZE BIT(3) -#define MLX5_INNER_TTC_GROUP2_SIZE BIT(1) -#define MLX5_INNER_TTC_GROUP3_SIZE BIT(0) -#define MLX5_INNER_TTC_TABLE_SIZE (MLX5_INNER_TTC_GROUP1_SIZE +\ - MLX5_INNER_TTC_GROUP2_SIZE +\ - MLX5_INNER_TTC_GROUP3_SIZE) +#define MLX5_TTC_MAX_NUM_GROUPS 4 +#define MLX5_TTC_GROUP_TCPUDP_SIZE (MLX5_TT_IPV6_UDP + 1) + +struct mlx5_fs_ttc_groups { + bool use_l4_type; + int num_groups; + int group_size[MLX5_TTC_MAX_NUM_GROUPS]; +}; + +static int mlx5_fs_ttc_table_size(const struct mlx5_fs_ttc_groups *groups) +{ + int i, sz = 0; + + for (i = 0; i < groups->num_groups; i++) + sz += groups->group_size[i]; + + return sz; +} /* L3/L4 traffic type classifier */ struct mlx5_ttc_table { @@ -138,6 +141,53 @@ static struct mlx5_etype_proto ttc_tunnel_rules[] = { }; +enum TTC_GROUP_TYPE { + TTC_GROUPS_DEFAULT = 0, + TTC_GROUPS_USE_L4_TYPE = 1, +}; + +static const struct mlx5_fs_ttc_groups ttc_groups[] = { + [TTC_GROUPS_DEFAULT] = { + .num_groups = 3, + .group_size = { + BIT(3) + MLX5_NUM_TUNNEL_TT, + BIT(1), + BIT(0), + }, + }, + [TTC_GROUPS_USE_L4_TYPE] = { + .use_l4_type = true, + .num_groups = 4, + .group_size = { + MLX5_TTC_GROUP_TCPUDP_SIZE, + BIT(3) + MLX5_NUM_TUNNEL_TT - MLX5_TTC_GROUP_TCPUDP_SIZE, + BIT(1), + BIT(0), + }, + }, +}; + +static const struct mlx5_fs_ttc_groups inner_ttc_groups[] = { + [TTC_GROUPS_DEFAULT] = { + .num_groups = 3, + .group_size = { + BIT(3), + BIT(1), + BIT(0), + }, + }, + [TTC_GROUPS_USE_L4_TYPE] = { + .use_l4_type = true, + .num_groups = 4, + .group_size = { + MLX5_TTC_GROUP_TCPUDP_SIZE, + BIT(3) - MLX5_TTC_GROUP_TCPUDP_SIZE, + BIT(1), + BIT(0), + }, + }, +}; + u8 mlx5_get_proto_by_tunnel_type(enum mlx5_tunnel_types tt) { return ttc_tunnel_rules[tt].proto; @@ -188,9 +238,29 @@ static u8 mlx5_etype_to_ipv(u16 ethertype) return 0; } +static void mlx5_fs_ttc_set_match_proto(void *headers_c, void *headers_v, + u8 proto, bool use_l4_type) +{ + int l4_type; + + if (use_l4_type && (proto == IPPROTO_TCP || proto == IPPROTO_UDP)) { + if (proto == IPPROTO_TCP) + l4_type = MLX5_PACKET_L4_TYPE_TCP; + else + l4_type = MLX5_PACKET_L4_TYPE_UDP; + + MLX5_SET_TO_ONES(fte_match_set_lyr_2_4, headers_c, l4_type); + MLX5_SET(fte_match_set_lyr_2_4, headers_v, l4_type, l4_type); + } else { + MLX5_SET_TO_ONES(fte_match_set_lyr_2_4, headers_c, ip_protocol); + MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_protocol, proto); + } +} + static struct mlx5_flow_handle * mlx5_generate_ttc_rule(struct mlx5_core_dev *dev, struct mlx5_flow_table *ft, - struct mlx5_flow_destination *dest, u16 etype, u8 proto) + struct mlx5_flow_destination *dest, u16 etype, u8 proto, + bool use_l4_type) { int match_ipv_outer = MLX5_CAP_FLOWTABLE_NIC_RX(dev, @@ -207,8 +277,13 @@ mlx5_generate_ttc_rule(struct mlx5_core_dev *dev, struct mlx5_flow_table *ft, if (proto) { spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS; - MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, outer_headers.ip_protocol); - MLX5_SET(fte_match_param, spec->match_value, outer_headers.ip_protocol, proto); + mlx5_fs_ttc_set_match_proto(MLX5_ADDR_OF(fte_match_param, + spec->match_criteria, + outer_headers), + MLX5_ADDR_OF(fte_match_param, + spec->match_value, + outer_headers), + proto, use_l4_type); } ipv = mlx5_etype_to_ipv(etype); @@ -234,7 +309,8 @@ mlx5_generate_ttc_rule(struct mlx5_core_dev *dev, struct mlx5_flow_table *ft, static int mlx5_generate_ttc_table_rules(struct mlx5_core_dev *dev, struct ttc_params *params, - struct mlx5_ttc_table *ttc) + struct mlx5_ttc_table *ttc, + bool use_l4_type) { struct mlx5_flow_handle **trules; struct mlx5_ttc_rule *rules; @@ -251,7 +327,8 @@ static int mlx5_generate_ttc_table_rules(struct mlx5_core_dev *dev, continue; rule->rule = mlx5_generate_ttc_rule(dev, ft, ¶ms->dests[tt], ttc_rules[tt].etype, - ttc_rules[tt].proto); + ttc_rules[tt].proto, + use_l4_type); if (IS_ERR(rule->rule)) { err = PTR_ERR(rule->rule); rule->rule = NULL; @@ -273,7 +350,8 @@ static int mlx5_generate_ttc_table_rules(struct mlx5_core_dev *dev, trules[tt] = mlx5_generate_ttc_rule(dev, ft, ¶ms->tunnel_dests[tt], ttc_tunnel_rules[tt].etype, - ttc_tunnel_rules[tt].proto); + ttc_tunnel_rules[tt].proto, + use_l4_type); if (IS_ERR(trules[tt])) { err = PTR_ERR(trules[tt]); trules[tt] = NULL; @@ -289,7 +367,8 @@ static int mlx5_generate_ttc_table_rules(struct mlx5_core_dev *dev, } static int mlx5_create_ttc_table_groups(struct mlx5_ttc_table *ttc, - bool use_ipv) + bool use_ipv, + const struct mlx5_fs_ttc_groups *groups) { int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); int ix = 0; @@ -297,7 +376,7 @@ static int mlx5_create_ttc_table_groups(struct mlx5_ttc_table *ttc, int err; u8 *mc; - ttc->g = kcalloc(MLX5_TTC_NUM_GROUPS, sizeof(*ttc->g), GFP_KERNEL); + ttc->g = kcalloc(groups->num_groups, sizeof(*ttc->g), GFP_KERNEL); if (!ttc->g) return -ENOMEM; in = kvzalloc(inlen, GFP_KERNEL); @@ -307,16 +386,31 @@ static int mlx5_create_ttc_table_groups(struct mlx5_ttc_table *ttc, return -ENOMEM; } - /* L4 Group */ mc = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); - MLX5_SET_TO_ONES(fte_match_param, mc, outer_headers.ip_protocol); if (use_ipv) MLX5_SET_TO_ONES(fte_match_param, mc, outer_headers.ip_version); else MLX5_SET_TO_ONES(fte_match_param, mc, outer_headers.ethertype); MLX5_SET_CFG(in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS); + + /* TCP UDP group */ + if (groups->use_l4_type) { + MLX5_SET_TO_ONES(fte_match_param, mc, outer_headers.l4_type); + MLX5_SET_CFG(in, start_flow_index, ix); + ix += groups->group_size[ttc->num_groups]; + MLX5_SET_CFG(in, end_flow_index, ix - 1); + ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); + if (IS_ERR(ttc->g[ttc->num_groups])) + goto err; + ttc->num_groups++; + + MLX5_SET(fte_match_param, mc, outer_headers.l4_type, 0); + } + + /* L4 Group */ + MLX5_SET_TO_ONES(fte_match_param, mc, outer_headers.ip_protocol); MLX5_SET_CFG(in, start_flow_index, ix); - ix += MLX5_TTC_GROUP1_SIZE; + ix += groups->group_size[ttc->num_groups]; MLX5_SET_CFG(in, end_flow_index, ix - 1); ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); if (IS_ERR(ttc->g[ttc->num_groups])) @@ -326,7 +420,7 @@ static int mlx5_create_ttc_table_groups(struct mlx5_ttc_table *ttc, /* L3 Group */ MLX5_SET(fte_match_param, mc, outer_headers.ip_protocol, 0); MLX5_SET_CFG(in, start_flow_index, ix); - ix += MLX5_TTC_GROUP2_SIZE; + ix += groups->group_size[ttc->num_groups]; MLX5_SET_CFG(in, end_flow_index, ix - 1); ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); if (IS_ERR(ttc->g[ttc->num_groups])) @@ -336,7 +430,7 @@ static int mlx5_create_ttc_table_groups(struct mlx5_ttc_table *ttc, /* Any Group */ memset(in, 0, inlen); MLX5_SET_CFG(in, start_flow_index, ix); - ix += MLX5_TTC_GROUP3_SIZE; + ix += groups->group_size[ttc->num_groups]; MLX5_SET_CFG(in, end_flow_index, ix - 1); ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); if (IS_ERR(ttc->g[ttc->num_groups])) @@ -358,7 +452,7 @@ static struct mlx5_flow_handle * mlx5_generate_inner_ttc_rule(struct mlx5_core_dev *dev, struct mlx5_flow_table *ft, struct mlx5_flow_destination *dest, - u16 etype, u8 proto) + u16 etype, u8 proto, bool use_l4_type) { MLX5_DECLARE_FLOW_ACT(flow_act); struct mlx5_flow_handle *rule; @@ -379,8 +473,13 @@ mlx5_generate_inner_ttc_rule(struct mlx5_core_dev *dev, if (proto) { spec->match_criteria_enable = MLX5_MATCH_INNER_HEADERS; - MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, inner_headers.ip_protocol); - MLX5_SET(fte_match_param, spec->match_value, inner_headers.ip_protocol, proto); + mlx5_fs_ttc_set_match_proto(MLX5_ADDR_OF(fte_match_param, + spec->match_criteria, + inner_headers), + MLX5_ADDR_OF(fte_match_param, + spec->match_value, + inner_headers), + proto, use_l4_type); } rule = mlx5_add_flow_rules(ft, spec, &flow_act, dest, 1); @@ -395,7 +494,8 @@ mlx5_generate_inner_ttc_rule(struct mlx5_core_dev *dev, static int mlx5_generate_inner_ttc_table_rules(struct mlx5_core_dev *dev, struct ttc_params *params, - struct mlx5_ttc_table *ttc) + struct mlx5_ttc_table *ttc, + bool use_l4_type) { struct mlx5_ttc_rule *rules; struct mlx5_flow_table *ft; @@ -413,7 +513,8 @@ static int mlx5_generate_inner_ttc_table_rules(struct mlx5_core_dev *dev, rule->rule = mlx5_generate_inner_ttc_rule(dev, ft, ¶ms->dests[tt], ttc_rules[tt].etype, - ttc_rules[tt].proto); + ttc_rules[tt].proto, + use_l4_type); if (IS_ERR(rule->rule)) { err = PTR_ERR(rule->rule); rule->rule = NULL; @@ -430,7 +531,8 @@ static int mlx5_generate_inner_ttc_table_rules(struct mlx5_core_dev *dev, return err; } -static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc) +static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc, + const struct mlx5_fs_ttc_groups *groups) { int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); int ix = 0; @@ -438,8 +540,7 @@ static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc) int err; u8 *mc; - ttc->g = kcalloc(MLX5_INNER_TTC_NUM_GROUPS, sizeof(*ttc->g), - GFP_KERNEL); + ttc->g = kcalloc(groups->num_groups, sizeof(*ttc->g), GFP_KERNEL); if (!ttc->g) return -ENOMEM; in = kvzalloc(inlen, GFP_KERNEL); @@ -449,13 +550,28 @@ static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc) return -ENOMEM; } - /* L4 Group */ mc = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); - MLX5_SET_TO_ONES(fte_match_param, mc, inner_headers.ip_protocol); MLX5_SET_TO_ONES(fte_match_param, mc, inner_headers.ip_version); MLX5_SET_CFG(in, match_criteria_enable, MLX5_MATCH_INNER_HEADERS); + + /* TCP UDP group */ + if (groups->use_l4_type) { + MLX5_SET_TO_ONES(fte_match_param, mc, inner_headers.l4_type); + MLX5_SET_CFG(in, start_flow_index, ix); + ix += groups->group_size[ttc->num_groups]; + MLX5_SET_CFG(in, end_flow_index, ix - 1); + ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); + if (IS_ERR(ttc->g[ttc->num_groups])) + goto err; + ttc->num_groups++; + + MLX5_SET(fte_match_param, mc, inner_headers.l4_type, 0); + } + + /* L4 Group */ + MLX5_SET_TO_ONES(fte_match_param, mc, inner_headers.ip_protocol); MLX5_SET_CFG(in, start_flow_index, ix); - ix += MLX5_INNER_TTC_GROUP1_SIZE; + ix += groups->group_size[ttc->num_groups]; MLX5_SET_CFG(in, end_flow_index, ix - 1); ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); if (IS_ERR(ttc->g[ttc->num_groups])) @@ -465,7 +581,7 @@ static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc) /* L3 Group */ MLX5_SET(fte_match_param, mc, inner_headers.ip_protocol, 0); MLX5_SET_CFG(in, start_flow_index, ix); - ix += MLX5_INNER_TTC_GROUP2_SIZE; + ix += groups->group_size[ttc->num_groups]; MLX5_SET_CFG(in, end_flow_index, ix - 1); ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); if (IS_ERR(ttc->g[ttc->num_groups])) @@ -475,7 +591,7 @@ static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc) /* Any Group */ memset(in, 0, inlen); MLX5_SET_CFG(in, start_flow_index, ix); - ix += MLX5_INNER_TTC_GROUP3_SIZE; + ix += groups->group_size[ttc->num_groups]; MLX5_SET_CFG(in, end_flow_index, ix - 1); ttc->g[ttc->num_groups] = mlx5_create_flow_group(ttc->t, in); if (IS_ERR(ttc->g[ttc->num_groups])) @@ -496,27 +612,47 @@ static int mlx5_create_inner_ttc_table_groups(struct mlx5_ttc_table *ttc) struct mlx5_ttc_table *mlx5_create_inner_ttc_table(struct mlx5_core_dev *dev, struct ttc_params *params) { + const struct mlx5_fs_ttc_groups *groups; + struct mlx5_flow_namespace *ns; struct mlx5_ttc_table *ttc; + bool use_l4_type; int err; ttc = kvzalloc(sizeof(*ttc), GFP_KERNEL); if (!ttc) return ERR_PTR(-ENOMEM); + switch (params->ns_type) { + case MLX5_FLOW_NAMESPACE_PORT_SEL: + use_l4_type = MLX5_CAP_GEN_2(dev, pcc_ifa2) && + MLX5_CAP_PORT_SELECTION_FT_FIELD_SUPPORT_2(dev, inner_l4_type); + break; + case MLX5_FLOW_NAMESPACE_KERNEL: + use_l4_type = MLX5_CAP_GEN_2(dev, pcc_ifa2) && + MLX5_CAP_NIC_RX_FT_FIELD_SUPPORT_2(dev, inner_l4_type); + break; + default: + return ERR_PTR(-EINVAL); + } + + ns = mlx5_get_flow_namespace(dev, params->ns_type); + groups = use_l4_type ? &inner_ttc_groups[TTC_GROUPS_USE_L4_TYPE] : + &inner_ttc_groups[TTC_GROUPS_DEFAULT]; + WARN_ON_ONCE(params->ft_attr.max_fte); - params->ft_attr.max_fte = MLX5_INNER_TTC_TABLE_SIZE; - ttc->t = mlx5_create_flow_table(params->ns, ¶ms->ft_attr); + params->ft_attr.max_fte = mlx5_fs_ttc_table_size(groups); + ttc->t = mlx5_create_flow_table(ns, ¶ms->ft_attr); if (IS_ERR(ttc->t)) { err = PTR_ERR(ttc->t); kvfree(ttc); return ERR_PTR(err); } - err = mlx5_create_inner_ttc_table_groups(ttc); + err = mlx5_create_inner_ttc_table_groups(ttc, groups); if (err) goto destroy_ft; - err = mlx5_generate_inner_ttc_table_rules(dev, params, ttc); + err = mlx5_generate_inner_ttc_table_rules(dev, params, ttc, use_l4_type); if (err) goto destroy_ft; @@ -549,27 +685,47 @@ struct mlx5_ttc_table *mlx5_create_ttc_table(struct mlx5_core_dev *dev, bool match_ipv_outer = MLX5_CAP_FLOWTABLE_NIC_RX(dev, ft_field_support.outer_ip_version); + const struct mlx5_fs_ttc_groups *groups; + struct mlx5_flow_namespace *ns; struct mlx5_ttc_table *ttc; + bool use_l4_type; int err; ttc = kvzalloc(sizeof(*ttc), GFP_KERNEL); if (!ttc) return ERR_PTR(-ENOMEM); + switch (params->ns_type) { + case MLX5_FLOW_NAMESPACE_PORT_SEL: + use_l4_type = MLX5_CAP_GEN_2(dev, pcc_ifa2) && + MLX5_CAP_PORT_SELECTION_FT_FIELD_SUPPORT_2(dev, outer_l4_type); + break; + case MLX5_FLOW_NAMESPACE_KERNEL: + use_l4_type = MLX5_CAP_GEN_2(dev, pcc_ifa2) && + MLX5_CAP_NIC_RX_FT_FIELD_SUPPORT_2(dev, outer_l4_type); + break; + default: + return ERR_PTR(-EINVAL); + } + + ns = mlx5_get_flow_namespace(dev, params->ns_type); + groups = use_l4_type ? &ttc_groups[TTC_GROUPS_USE_L4_TYPE] : + &ttc_groups[TTC_GROUPS_DEFAULT]; + WARN_ON_ONCE(params->ft_attr.max_fte); - params->ft_attr.max_fte = MLX5_TTC_TABLE_SIZE; - ttc->t = mlx5_create_flow_table(params->ns, ¶ms->ft_attr); + params->ft_attr.max_fte = mlx5_fs_ttc_table_size(groups); + ttc->t = mlx5_create_flow_table(ns, ¶ms->ft_attr); if (IS_ERR(ttc->t)) { err = PTR_ERR(ttc->t); kvfree(ttc); return ERR_PTR(err); } - err = mlx5_create_ttc_table_groups(ttc, match_ipv_outer); + err = mlx5_create_ttc_table_groups(ttc, match_ipv_outer, groups); if (err) goto destroy_ft; - err = mlx5_generate_ttc_table_rules(dev, params, ttc); + err = mlx5_generate_ttc_table_rules(dev, params, ttc, use_l4_type); if (err) goto destroy_ft; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.h index 85fef0cd1c072..92eea6bea310b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_ttc.h @@ -40,7 +40,7 @@ struct mlx5_ttc_rule { struct mlx5_ttc_table; struct ttc_params { - struct mlx5_flow_namespace *ns; + enum mlx5_flow_namespace_type ns_type; struct mlx5_flow_table_attr ft_attr; struct mlx5_flow_destination dests[MLX5_NUM_TT]; DECLARE_BITMAP(ignore_dests, MLX5_NUM_TT); diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 06f81b5775ba9..70dcbbf351016 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -1335,6 +1335,9 @@ enum mlx5_qcam_feature_groups { #define MLX5_CAP_ESW_FT_FIELD_SUPPORT_2(mdev, cap) \ MLX5_CAP_ESW_FLOWTABLE(mdev, ft_field_support_2_esw_fdb.cap) +#define MLX5_CAP_NIC_RX_FT_FIELD_SUPPORT_2(mdev, cap) \ + MLX5_CAP_FLOWTABLE(mdev, ft_field_support_2_nic_receive.cap) + #define MLX5_CAP_ESW(mdev, cap) \ MLX5_GET(e_switch_cap, \ mdev->caps.hca[MLX5_CAP_ESWITCH]->cur, cap) @@ -1358,6 +1361,9 @@ enum mlx5_qcam_feature_groups { #define MLX5_CAP_FLOWTABLE_PORT_SELECTION(mdev, cap) \ MLX5_CAP_PORT_SELECTION(mdev, flow_table_properties_port_selection.cap) +#define MLX5_CAP_PORT_SELECTION_FT_FIELD_SUPPORT_2(mdev, cap) \ + MLX5_CAP_PORT_SELECTION(mdev, ft_field_support_2_port_selection.cap) + #define MLX5_CAP_ODP(mdev, cap)\ MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, cap) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 93914578e2d6a..7f924bbe4fa08 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -420,7 +420,10 @@ struct mlx5_ifc_flow_table_fields_supported_bits { /* Table 2170 - Flow Table Fields Supported 2 Format */ struct mlx5_ifc_flow_table_fields_supported_2_bits { - u8 reserved_at_0[0xe]; + u8 reserved_at_0[0x2]; + u8 inner_l4_type[0x1]; + u8 outer_l4_type[0x1]; + u8 reserved_at_4[0xa]; u8 bth_opcode[0x1]; u8 reserved_at_f[0x1]; u8 tunnel_header_0_1[0x1]; @@ -543,6 +546,12 @@ union mlx5_ifc_ipv6_layout_ipv4_layout_auto_bits { u8 reserved_at_0[0x80]; }; +enum { + MLX5_PACKET_L4_TYPE_NONE, + MLX5_PACKET_L4_TYPE_TCP, + MLX5_PACKET_L4_TYPE_UDP, +}; + struct mlx5_ifc_fte_match_set_lyr_2_4_bits { u8 smac_47_16[0x20]; @@ -568,7 +577,8 @@ struct mlx5_ifc_fte_match_set_lyr_2_4_bits { u8 tcp_sport[0x10]; u8 tcp_dport[0x10]; - u8 reserved_at_c0[0x10]; + u8 l4_type[0x2]; + u8 reserved_at_c2[0xe]; u8 ipv4_ihl[0x4]; u8 reserved_at_c4[0x4]; @@ -864,7 +874,11 @@ struct mlx5_ifc_flow_table_nic_cap_bits { struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit_sniffer; - u8 reserved_at_e00[0x700]; + u8 reserved_at_e00[0x600]; + + struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_receive; + + u8 reserved_at_1480[0x80]; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_receive_rdma; @@ -894,7 +908,9 @@ struct mlx5_ifc_port_selection_cap_bits { struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_port_selection; - u8 reserved_at_400[0x7c00]; + struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_port_selection; + + u8 reserved_at_480[0x7b80]; }; enum { From e092285a10e50a8dac464778099f587b83ab87a6 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 2 Apr 2024 16:30:42 +0300 Subject: [PATCH 087/548] net/mlx5: Skip pages EQ creation for non-page supplier function Page events are not issued by device on the function if page_request_disable is set, so no need to create pages EQ. Signed-off-by: Jianbo Liu Reviewed-by: Parav Pandit Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://lore.kernel.org/r/20240402133043.56322-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit c788d79cfa6b68b1068b25e9cdff25d6035ed191) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 9 ++++++++- include/linux/mlx5/mlx5_ifc.h | 4 +++- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index 07a0419549092..e0e6d3f62d1d6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -688,6 +688,12 @@ static int create_async_eqs(struct mlx5_core_dev *dev) if (err) goto err2; + /* Skip page eq creation when the device does not request for page requests */ + if (MLX5_CAP_GEN(dev, page_request_disable)) { + mlx5_core_dbg(dev, "Skip page EQ creation\n"); + return 0; + } + param = (struct mlx5_eq_param) { .irq = table->ctrl_irq, .nent = /* TODO: sriov max_vf + */ 1, @@ -716,7 +722,8 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev) { struct mlx5_eq_table *table = dev->priv.eq_table; - cleanup_async_eq(dev, &table->pages_eq, "pages"); + if (!MLX5_CAP_GEN(dev, page_request_disable)) + cleanup_async_eq(dev, &table->pages_eq, "pages"); cleanup_async_eq(dev, &table->async_eq, "async"); mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ); mlx5_cmd_use_polling(dev); diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 7f924bbe4fa08..7c7d8af1c7e80 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1566,7 +1566,9 @@ enum { }; struct mlx5_ifc_cmd_hca_cap_bits { - u8 reserved_at_0[0x10]; + u8 reserved_at_0[0x6]; + u8 page_request_disable[0x1]; + u8 reserved_at_7[0x9]; u8 shared_object_to_user_object_allowed[0x1]; u8 reserved_at_13[0xe]; u8 vhca_resource_manager[0x1]; From 6226391309eeb9edc6ab309b1d433678c6bf08d4 Mon Sep 17 00:00:00 2001 From: Jianbo Liu Date: Tue, 2 Apr 2024 16:30:43 +0300 Subject: [PATCH 088/548] net/mlx5: Don't call give_pages() if request 0 page Firmware will return 0 on query BOOT/INIT PAGES for non-page supplier functions (external host PF/VF/SF), so no page is needed to be allocated for them. Signed-off-by: Jianbo Liu Reviewed-by: Parav Pandit Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://lore.kernel.org/r/20240402133043.56322-12-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 07e1bc785a91e1cb621ac6a098acb222fb0811cf) Signed-off-by: Koba Ko --- drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c index e0581c6f9cecd..70167998fa4b7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c @@ -660,6 +660,9 @@ int mlx5_satisfy_startup_pages(struct mlx5_core_dev *dev, int boot) mlx5_core_dbg(dev, "requested %d %s pages for func_id 0x%x\n", npages, boot ? "boot" : "init", func_id); + if (!npages) + return 0; + return give_pages(dev, func_id, npages, 0, mlx5_core_is_ecpf(dev)); } From c25846af47a667538503ab6e6c7064eace382a01 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:31:59 +0300 Subject: [PATCH 089/548] net/mlx5: HWS, added actions handling When a packet matches a flow, the actions specified for the flow are applied. The supported actions include (but not limited to) the following: - drop: packet processing is stopped - go to vport: packet is forwarded to a specified vport - go to flow table: packet is forwarded to a specified table and processing continues there - push/pop vlan: add/remove vlan header respectively to/from the packet - insert/remove header: add/remove a user-defined header to/from the packet - counter: count the packet bytes in the specified counter - tag: tag the matching flow with a provided tag value - reformat: change the packet format by adding or removing some of its headers - modify header: modify the value of the packet headers with set/add/copy ops - range: match packet on range of values Reviewed-by: Erez Shitrit Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 504e536d90104c850731840d3fbc95acf251f11b) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_action.c | 2604 +++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_action.h | 307 ++ 2 files changed, 2911 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c new file mode 100644 index 0000000000000..b27bb41065326 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c @@ -0,0 +1,2604 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +#define MLX5HWS_ACTION_METER_INIT_COLOR_OFFSET 1 + +/* Header removal size limited to 128B (64 words) */ +#define MLX5HWS_ACTION_REMOVE_HEADER_MAX_SIZE 128 + +/* This is the longest supported action sequence for FDB table: + * DECAP, POP_VLAN, MODIFY, CTR, ASO, PUSH_VLAN, MODIFY, ENCAP, Term. + */ +static const u32 action_order_arr[MLX5HWS_TABLE_TYPE_MAX][MLX5HWS_ACTION_TYP_MAX] = { + [MLX5HWS_TABLE_TYPE_FDB] = { + BIT(MLX5HWS_ACTION_TYP_REMOVE_HEADER) | + BIT(MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2) | + BIT(MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2), + BIT(MLX5HWS_ACTION_TYP_POP_VLAN), + BIT(MLX5HWS_ACTION_TYP_POP_VLAN), + BIT(MLX5HWS_ACTION_TYP_MODIFY_HDR), + BIT(MLX5HWS_ACTION_TYP_PUSH_VLAN), + BIT(MLX5HWS_ACTION_TYP_PUSH_VLAN), + BIT(MLX5HWS_ACTION_TYP_INSERT_HEADER) | + BIT(MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2) | + BIT(MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3), + BIT(MLX5HWS_ACTION_TYP_CTR), + BIT(MLX5HWS_ACTION_TYP_TAG), + BIT(MLX5HWS_ACTION_TYP_ASO_METER), + BIT(MLX5HWS_ACTION_TYP_MODIFY_HDR), + BIT(MLX5HWS_ACTION_TYP_TBL) | + BIT(MLX5HWS_ACTION_TYP_VPORT) | + BIT(MLX5HWS_ACTION_TYP_DROP) | + BIT(MLX5HWS_ACTION_TYP_SAMPLER) | + BIT(MLX5HWS_ACTION_TYP_RANGE) | + BIT(MLX5HWS_ACTION_TYP_DEST_ARRAY), + BIT(MLX5HWS_ACTION_TYP_LAST), + }, +}; + +static const char * const mlx5hws_action_type_str[] = { + [MLX5HWS_ACTION_TYP_LAST] = "LAST", + [MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2] = "TNL_L2_TO_L2", + [MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2] = "L2_TO_TNL_L2", + [MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2] = "TNL_L3_TO_L2", + [MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3] = "L2_TO_TNL_L3", + [MLX5HWS_ACTION_TYP_DROP] = "DROP", + [MLX5HWS_ACTION_TYP_TBL] = "TBL", + [MLX5HWS_ACTION_TYP_CTR] = "CTR", + [MLX5HWS_ACTION_TYP_TAG] = "TAG", + [MLX5HWS_ACTION_TYP_MODIFY_HDR] = "MODIFY_HDR", + [MLX5HWS_ACTION_TYP_VPORT] = "VPORT", + [MLX5HWS_ACTION_TYP_MISS] = "DEFAULT_MISS", + [MLX5HWS_ACTION_TYP_POP_VLAN] = "POP_VLAN", + [MLX5HWS_ACTION_TYP_PUSH_VLAN] = "PUSH_VLAN", + [MLX5HWS_ACTION_TYP_ASO_METER] = "ASO_METER", + [MLX5HWS_ACTION_TYP_DEST_ARRAY] = "DEST_ARRAY", + [MLX5HWS_ACTION_TYP_INSERT_HEADER] = "INSERT_HEADER", + [MLX5HWS_ACTION_TYP_REMOVE_HEADER] = "REMOVE_HEADER", + [MLX5HWS_ACTION_TYP_SAMPLER] = "SAMPLER", + [MLX5HWS_ACTION_TYP_RANGE] = "RANGE", +}; + +static_assert(ARRAY_SIZE(mlx5hws_action_type_str) == MLX5HWS_ACTION_TYP_MAX, + "Missing mlx5hws_action_type_str"); + +const char *mlx5hws_action_type_to_str(enum mlx5hws_action_type action_type) +{ + return mlx5hws_action_type_str[action_type]; +} + +enum mlx5hws_action_type mlx5hws_action_get_type(struct mlx5hws_action *action) +{ + return action->type; +} + +static int hws_action_get_shared_stc_nic(struct mlx5hws_context *ctx, + enum mlx5hws_context_shared_stc_type stc_type, + u8 tbl_type) +{ + struct mlx5hws_cmd_stc_modify_attr stc_attr = {0}; + struct mlx5hws_action_shared_stc *shared_stc; + int ret; + + mutex_lock(&ctx->ctrl_lock); + if (ctx->common_res[tbl_type].shared_stc[stc_type]) { + ctx->common_res[tbl_type].shared_stc[stc_type]->refcount++; + mutex_unlock(&ctx->ctrl_lock); + return 0; + } + + shared_stc = kzalloc(sizeof(*shared_stc), GFP_KERNEL); + if (!shared_stc) { + ret = -ENOMEM; + goto unlock_and_out; + } + switch (stc_type) { + case MLX5HWS_CONTEXT_SHARED_STC_DECAP_L3: + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_HEADER_REMOVE; + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_DW5; + stc_attr.reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + stc_attr.remove_header.decap = 0; + stc_attr.remove_header.start_anchor = MLX5_HEADER_ANCHOR_PACKET_START; + stc_attr.remove_header.end_anchor = MLX5_HEADER_ANCHOR_IPV6_IPV4; + break; + case MLX5HWS_CONTEXT_SHARED_STC_DOUBLE_POP: + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_REMOVE_WORDS; + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_DW5; + stc_attr.reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + stc_attr.remove_words.start_anchor = MLX5_HEADER_ANCHOR_FIRST_VLAN_START; + stc_attr.remove_words.num_of_words = MLX5HWS_ACTION_HDR_LEN_L2_VLAN; + break; + default: + mlx5hws_err(ctx, "No such stc_type: %d\n", stc_type); + pr_warn("HWS: Invalid stc_type: %d\n", stc_type); + ret = -EINVAL; + goto unlock_and_out; + } + + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl_type, + &shared_stc->stc_chunk); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate shared decap l2 STC\n"); + goto free_shared_stc; + } + + ctx->common_res[tbl_type].shared_stc[stc_type] = shared_stc; + ctx->common_res[tbl_type].shared_stc[stc_type]->refcount = 1; + + mutex_unlock(&ctx->ctrl_lock); + + return 0; + +free_shared_stc: + kfree(shared_stc); +unlock_and_out: + mutex_unlock(&ctx->ctrl_lock); + return ret; +} + +static int hws_action_get_shared_stc(struct mlx5hws_action *action, + enum mlx5hws_context_shared_stc_type stc_type) +{ + struct mlx5hws_context *ctx = action->ctx; + int ret; + + if (stc_type >= MLX5HWS_CONTEXT_SHARED_STC_MAX) { + pr_warn("HWS: Invalid shared stc_type: %d\n", stc_type); + return -EINVAL; + } + + if (unlikely(!(action->flags & MLX5HWS_ACTION_FLAG_HWS_FDB))) { + pr_warn("HWS: Invalid action->flags: %d\n", action->flags); + return -EINVAL; + } + + ret = hws_action_get_shared_stc_nic(ctx, stc_type, MLX5HWS_TABLE_TYPE_FDB); + if (ret) { + mlx5hws_err(ctx, + "Failed to allocate memory for FDB shared STCs (type: %d)\n", + stc_type); + return ret; + } + + return 0; +} + +static void hws_action_put_shared_stc(struct mlx5hws_action *action, + enum mlx5hws_context_shared_stc_type stc_type) +{ + enum mlx5hws_table_type tbl_type = MLX5HWS_TABLE_TYPE_FDB; + struct mlx5hws_action_shared_stc *shared_stc; + struct mlx5hws_context *ctx = action->ctx; + + if (stc_type >= MLX5HWS_CONTEXT_SHARED_STC_MAX) { + pr_warn("HWS: Invalid shared stc_type: %d\n", stc_type); + return; + } + + mutex_lock(&ctx->ctrl_lock); + if (--ctx->common_res[tbl_type].shared_stc[stc_type]->refcount) { + mutex_unlock(&ctx->ctrl_lock); + return; + } + + shared_stc = ctx->common_res[tbl_type].shared_stc[stc_type]; + + mlx5hws_action_free_single_stc(ctx, tbl_type, &shared_stc->stc_chunk); + kfree(shared_stc); + ctx->common_res[tbl_type].shared_stc[stc_type] = NULL; + mutex_unlock(&ctx->ctrl_lock); +} + +static void hws_action_print_combo(struct mlx5hws_context *ctx, + enum mlx5hws_action_type *user_actions) +{ + mlx5hws_err(ctx, "Invalid action_type sequence"); + while (*user_actions != MLX5HWS_ACTION_TYP_LAST) { + mlx5hws_err(ctx, " %s", mlx5hws_action_type_to_str(*user_actions)); + user_actions++; + } + mlx5hws_err(ctx, "\n"); +} + +bool mlx5hws_action_check_combo(struct mlx5hws_context *ctx, + enum mlx5hws_action_type *user_actions, + enum mlx5hws_table_type table_type) +{ + const u32 *order_arr = action_order_arr[table_type]; + u8 order_idx = 0; + u8 user_idx = 0; + bool valid_combo; + + if (table_type >= MLX5HWS_TABLE_TYPE_MAX) { + mlx5hws_err(ctx, "Invalid table_type %d", table_type); + return false; + } + + while (order_arr[order_idx] != BIT(MLX5HWS_ACTION_TYP_LAST)) { + /* User action order validated move to next user action */ + if (BIT(user_actions[user_idx]) & order_arr[order_idx]) + user_idx++; + + /* Iterate to the next supported action in the order */ + order_idx++; + } + + /* Combination is valid if all user action were processed */ + valid_combo = user_actions[user_idx] == MLX5HWS_ACTION_TYP_LAST; + if (!valid_combo) + hws_action_print_combo(ctx, user_actions); + + return valid_combo; +} + +static bool +hws_action_fixup_stc_attr(struct mlx5hws_context *ctx, + struct mlx5hws_cmd_stc_modify_attr *stc_attr, + struct mlx5hws_cmd_stc_modify_attr *fixup_stc_attr, + enum mlx5hws_table_type table_type, + bool is_mirror) +{ + bool use_fixup = false; + u32 fw_tbl_type; + u32 base_id; + + fw_tbl_type = mlx5hws_table_get_res_fw_ft_type(table_type, is_mirror); + + switch (stc_attr->action_type) { + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_STE_TABLE: + if (is_mirror && stc_attr->ste_table.ignore_tx) { + fixup_stc_attr->action_type = MLX5_IFC_STC_ACTION_TYPE_DROP; + fixup_stc_attr->action_offset = MLX5HWS_ACTION_OFFSET_HIT; + fixup_stc_attr->stc_offset = stc_attr->stc_offset; + use_fixup = true; + break; + } + if (!is_mirror) + base_id = mlx5hws_pool_chunk_get_base_id(stc_attr->ste_table.ste_pool, + &stc_attr->ste_table.ste); + else + base_id = + mlx5hws_pool_chunk_get_base_mirror_id(stc_attr->ste_table.ste_pool, + &stc_attr->ste_table.ste); + + *fixup_stc_attr = *stc_attr; + fixup_stc_attr->ste_table.ste_obj_id = base_id; + use_fixup = true; + break; + + case MLX5_IFC_STC_ACTION_TYPE_TAG: + if (fw_tbl_type == FS_FT_FDB_TX) { + fixup_stc_attr->action_type = MLX5_IFC_STC_ACTION_TYPE_NOP; + fixup_stc_attr->action_offset = MLX5HWS_ACTION_OFFSET_DW5; + fixup_stc_attr->stc_offset = stc_attr->stc_offset; + use_fixup = true; + } + break; + + case MLX5_IFC_STC_ACTION_TYPE_ALLOW: + if (fw_tbl_type == FS_FT_FDB_TX || fw_tbl_type == FS_FT_FDB_RX) { + fixup_stc_attr->action_type = MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_VPORT; + fixup_stc_attr->action_offset = stc_attr->action_offset; + fixup_stc_attr->stc_offset = stc_attr->stc_offset; + fixup_stc_attr->vport.esw_owner_vhca_id = ctx->caps->vhca_id; + fixup_stc_attr->vport.vport_num = ctx->caps->eswitch_manager_vport_number; + fixup_stc_attr->vport.eswitch_owner_vhca_id_valid = + ctx->caps->merged_eswitch; + use_fixup = true; + } + break; + + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_VPORT: + if (stc_attr->vport.vport_num != MLX5_VPORT_UPLINK) + break; + + if (fw_tbl_type == FS_FT_FDB_TX || fw_tbl_type == FS_FT_FDB_RX) { + /* The FW doesn't allow to go to wire in the TX/RX by JUMP_TO_VPORT */ + fixup_stc_attr->action_type = MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_UPLINK; + fixup_stc_attr->action_offset = stc_attr->action_offset; + fixup_stc_attr->stc_offset = stc_attr->stc_offset; + fixup_stc_attr->vport.vport_num = 0; + fixup_stc_attr->vport.esw_owner_vhca_id = stc_attr->vport.esw_owner_vhca_id; + fixup_stc_attr->vport.eswitch_owner_vhca_id_valid = + stc_attr->vport.eswitch_owner_vhca_id_valid; + } + use_fixup = true; + break; + + default: + break; + } + + return use_fixup; +} + +int mlx5hws_action_alloc_single_stc(struct mlx5hws_context *ctx, + struct mlx5hws_cmd_stc_modify_attr *stc_attr, + u32 table_type, + struct mlx5hws_pool_chunk *stc) +__must_hold(&ctx->ctrl_lock) +{ + struct mlx5hws_cmd_stc_modify_attr cleanup_stc_attr = {0}; + struct mlx5hws_pool *stc_pool = ctx->stc_pool[table_type]; + struct mlx5hws_cmd_stc_modify_attr fixup_stc_attr = {0}; + bool use_fixup; + u32 obj_0_id; + int ret; + + ret = mlx5hws_pool_chunk_alloc(stc_pool, stc); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate single action STC\n"); + return ret; + } + + stc_attr->stc_offset = stc->offset; + + /* Dynamic reparse not supported, overwrite and use default */ + if (!mlx5hws_context_cap_dynamic_reparse(ctx)) + stc_attr->reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + + obj_0_id = mlx5hws_pool_chunk_get_base_id(stc_pool, stc); + + /* According to table/action limitation change the stc_attr */ + use_fixup = hws_action_fixup_stc_attr(ctx, stc_attr, &fixup_stc_attr, table_type, false); + ret = mlx5hws_cmd_stc_modify(ctx->mdev, obj_0_id, + use_fixup ? &fixup_stc_attr : stc_attr); + if (ret) { + mlx5hws_err(ctx, "Failed to modify STC action_type %d tbl_type %d\n", + stc_attr->action_type, table_type); + goto free_chunk; + } + + /* Modify the FDB peer */ + if (table_type == MLX5HWS_TABLE_TYPE_FDB) { + u32 obj_1_id; + + obj_1_id = mlx5hws_pool_chunk_get_base_mirror_id(stc_pool, stc); + + use_fixup = hws_action_fixup_stc_attr(ctx, stc_attr, + &fixup_stc_attr, + table_type, true); + ret = mlx5hws_cmd_stc_modify(ctx->mdev, obj_1_id, + use_fixup ? &fixup_stc_attr : stc_attr); + if (ret) { + mlx5hws_err(ctx, + "Failed to modify peer STC action_type %d tbl_type %d\n", + stc_attr->action_type, table_type); + goto clean_obj_0; + } + } + + return 0; + +clean_obj_0: + cleanup_stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_DROP; + cleanup_stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_HIT; + cleanup_stc_attr.stc_offset = stc->offset; + mlx5hws_cmd_stc_modify(ctx->mdev, obj_0_id, &cleanup_stc_attr); +free_chunk: + mlx5hws_pool_chunk_free(stc_pool, stc); + return ret; +} + +void mlx5hws_action_free_single_stc(struct mlx5hws_context *ctx, + u32 table_type, + struct mlx5hws_pool_chunk *stc) +__must_hold(&ctx->ctrl_lock) +{ + struct mlx5hws_pool *stc_pool = ctx->stc_pool[table_type]; + struct mlx5hws_cmd_stc_modify_attr stc_attr = {0}; + u32 obj_id; + + /* Modify the STC not to point to an object */ + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_DROP; + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_HIT; + stc_attr.stc_offset = stc->offset; + obj_id = mlx5hws_pool_chunk_get_base_id(stc_pool, stc); + mlx5hws_cmd_stc_modify(ctx->mdev, obj_id, &stc_attr); + + if (table_type == MLX5HWS_TABLE_TYPE_FDB) { + obj_id = mlx5hws_pool_chunk_get_base_mirror_id(stc_pool, stc); + mlx5hws_cmd_stc_modify(ctx->mdev, obj_id, &stc_attr); + } + + mlx5hws_pool_chunk_free(stc_pool, stc); +} + +static u32 hws_action_get_mh_stc_type(struct mlx5hws_context *ctx, + __be64 pattern) +{ + u8 action_type = MLX5_GET(set_action_in, &pattern, action_type); + + switch (action_type) { + case MLX5_MODIFICATION_TYPE_SET: + return MLX5_IFC_STC_ACTION_TYPE_SET; + case MLX5_MODIFICATION_TYPE_ADD: + return MLX5_IFC_STC_ACTION_TYPE_ADD; + case MLX5_MODIFICATION_TYPE_COPY: + return MLX5_IFC_STC_ACTION_TYPE_COPY; + case MLX5_MODIFICATION_TYPE_ADD_FIELD: + return MLX5_IFC_STC_ACTION_TYPE_ADD_FIELD; + default: + mlx5hws_err(ctx, "Unsupported action type: 0x%x\n", action_type); + return MLX5_IFC_STC_ACTION_TYPE_NOP; + } +} + +static void hws_action_fill_stc_attr(struct mlx5hws_action *action, + u32 obj_id, + struct mlx5hws_cmd_stc_modify_attr *attr) +{ + attr->reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + + switch (action->type) { + case MLX5HWS_ACTION_TYP_TAG: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_TAG; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW5; + break; + case MLX5HWS_ACTION_TYP_DROP: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_DROP; + attr->action_offset = MLX5HWS_ACTION_OFFSET_HIT; + break; + case MLX5HWS_ACTION_TYP_MISS: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_ALLOW; + attr->action_offset = MLX5HWS_ACTION_OFFSET_HIT; + break; + case MLX5HWS_ACTION_TYP_CTR: + attr->id = obj_id; + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_COUNTER; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW0; + break; + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2: + case MLX5HWS_ACTION_TYP_MODIFY_HDR: + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW6; + attr->reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + if (action->modify_header.require_reparse) + attr->reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + + if (action->modify_header.num_of_actions == 1) { + attr->modify_action.data = action->modify_header.single_action; + attr->action_type = hws_action_get_mh_stc_type(action->ctx, + attr->modify_action.data); + + if (attr->action_type == MLX5_IFC_STC_ACTION_TYPE_ADD || + attr->action_type == MLX5_IFC_STC_ACTION_TYPE_SET) + MLX5_SET(set_action_in, &attr->modify_action.data, data, 0); + } else { + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_ACC_MODIFY_LIST; + attr->modify_header.arg_id = action->modify_header.arg_id; + attr->modify_header.pattern_id = action->modify_header.pat_id; + } + break; + case MLX5HWS_ACTION_TYP_TBL: + case MLX5HWS_ACTION_TYP_DEST_ARRAY: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_FT; + attr->action_offset = MLX5HWS_ACTION_OFFSET_HIT; + attr->dest_table_id = obj_id; + break; + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_HEADER_REMOVE; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW5; + attr->reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + attr->remove_header.decap = 1; + attr->remove_header.start_anchor = MLX5_HEADER_ANCHOR_PACKET_START; + attr->remove_header.end_anchor = MLX5_HEADER_ANCHOR_INNER_MAC; + break; + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2: + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3: + case MLX5HWS_ACTION_TYP_INSERT_HEADER: + attr->reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + if (!action->reformat.require_reparse) + attr->reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_HEADER_INSERT; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW6; + attr->insert_header.encap = action->reformat.encap; + attr->insert_header.insert_anchor = action->reformat.anchor; + attr->insert_header.arg_id = action->reformat.arg_id; + attr->insert_header.header_size = action->reformat.header_size; + attr->insert_header.insert_offset = action->reformat.offset; + break; + case MLX5HWS_ACTION_TYP_ASO_METER: + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW6; + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_ASO; + attr->aso.aso_type = ASO_OPC_MOD_POLICER; + attr->aso.devx_obj_id = obj_id; + attr->aso.return_reg_id = action->aso.return_reg_id; + break; + case MLX5HWS_ACTION_TYP_VPORT: + attr->action_offset = MLX5HWS_ACTION_OFFSET_HIT; + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_VPORT; + attr->vport.vport_num = action->vport.vport_num; + attr->vport.esw_owner_vhca_id = action->vport.esw_owner_vhca_id; + attr->vport.eswitch_owner_vhca_id_valid = action->vport.esw_owner_vhca_id_valid; + break; + case MLX5HWS_ACTION_TYP_POP_VLAN: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_REMOVE_WORDS; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW5; + attr->reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + attr->remove_words.start_anchor = MLX5_HEADER_ANCHOR_FIRST_VLAN_START; + attr->remove_words.num_of_words = MLX5HWS_ACTION_HDR_LEN_L2_VLAN / 2; + break; + case MLX5HWS_ACTION_TYP_PUSH_VLAN: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_HEADER_INSERT; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW6; + attr->reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + attr->insert_header.encap = 0; + attr->insert_header.is_inline = 1; + attr->insert_header.insert_anchor = MLX5_HEADER_ANCHOR_PACKET_START; + attr->insert_header.insert_offset = MLX5HWS_ACTION_HDR_LEN_L2_MACS; + attr->insert_header.header_size = MLX5HWS_ACTION_HDR_LEN_L2_VLAN; + break; + case MLX5HWS_ACTION_TYP_REMOVE_HEADER: + attr->action_type = MLX5_IFC_STC_ACTION_TYPE_REMOVE_WORDS; + attr->remove_header.decap = 0; /* the mode we support decap is 0 */ + attr->remove_words.start_anchor = action->remove_header.anchor; + /* the size is in already in words */ + attr->remove_words.num_of_words = action->remove_header.size; + attr->action_offset = MLX5HWS_ACTION_OFFSET_DW5; + attr->reparse_mode = MLX5_IFC_STC_REPARSE_ALWAYS; + break; + default: + mlx5hws_err(action->ctx, "Invalid action type %d\n", action->type); + } +} + +static int +hws_action_create_stcs(struct mlx5hws_action *action, u32 obj_id) +{ + struct mlx5hws_cmd_stc_modify_attr stc_attr = {0}; + struct mlx5hws_context *ctx = action->ctx; + int ret; + + hws_action_fill_stc_attr(action, obj_id, &stc_attr); + + /* Block unsupported parallel obj modify over the same base */ + mutex_lock(&ctx->ctrl_lock); + + /* Allocate STC for FDB */ + if (action->flags & MLX5HWS_ACTION_FLAG_HWS_FDB) { + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, + MLX5HWS_TABLE_TYPE_FDB, + &action->stc[MLX5HWS_TABLE_TYPE_FDB]); + if (ret) + goto out_err; + } + + mutex_unlock(&ctx->ctrl_lock); + + return 0; + +out_err: + mutex_unlock(&ctx->ctrl_lock); + return ret; +} + +static void +hws_action_destroy_stcs(struct mlx5hws_action *action) +{ + struct mlx5hws_context *ctx = action->ctx; + + /* Block unsupported parallel obj modify over the same base */ + mutex_lock(&ctx->ctrl_lock); + + if (action->flags & MLX5HWS_ACTION_FLAG_HWS_FDB) + mlx5hws_action_free_single_stc(ctx, MLX5HWS_TABLE_TYPE_FDB, + &action->stc[MLX5HWS_TABLE_TYPE_FDB]); + + mutex_unlock(&ctx->ctrl_lock); +} + +static bool hws_action_is_flag_hws_fdb(u32 flags) +{ + return flags & MLX5HWS_ACTION_FLAG_HWS_FDB; +} + +static bool +hws_action_validate_hws_action(struct mlx5hws_context *ctx, u32 flags) +{ + if (!(ctx->flags & MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT)) { + mlx5hws_err(ctx, "Cannot create HWS action since HWS is not supported\n"); + return false; + } + + if ((flags & MLX5HWS_ACTION_FLAG_HWS_FDB) && !ctx->caps->eswitch_manager) { + mlx5hws_err(ctx, "Cannot create HWS action for FDB for non-eswitch-manager\n"); + return false; + } + + return true; +} + +static struct mlx5hws_action * +hws_action_create_generic_bulk(struct mlx5hws_context *ctx, + u32 flags, + enum mlx5hws_action_type action_type, + u8 bulk_sz) +{ + struct mlx5hws_action *action; + int i; + + if (!hws_action_is_flag_hws_fdb(flags)) { + mlx5hws_err(ctx, + "Action (type: %d) flags must specify only HWS FDB\n", action_type); + return NULL; + } + + if (!hws_action_validate_hws_action(ctx, flags)) + return NULL; + + action = kcalloc(bulk_sz, sizeof(*action), GFP_KERNEL); + if (!action) + return NULL; + + for (i = 0; i < bulk_sz; i++) { + action[i].ctx = ctx; + action[i].flags = flags; + action[i].type = action_type; + } + + return action; +} + +static struct mlx5hws_action * +hws_action_create_generic(struct mlx5hws_context *ctx, + u32 flags, + enum mlx5hws_action_type action_type) +{ + return hws_action_create_generic_bulk(ctx, flags, action_type, 1); +} + +struct mlx5hws_action * +mlx5hws_action_create_dest_table_num(struct mlx5hws_context *ctx, + u32 table_id, + u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_TBL); + if (!action) + return NULL; + + ret = hws_action_create_stcs(action, table_id); + if (ret) + goto free_action; + + action->dest_obj.obj_id = table_id; + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_dest_table(struct mlx5hws_context *ctx, + struct mlx5hws_table *tbl, + u32 flags) +{ + return mlx5hws_action_create_dest_table_num(ctx, tbl->ft_id, flags); +} + +struct mlx5hws_action * +mlx5hws_action_create_dest_drop(struct mlx5hws_context *ctx, u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_DROP); + if (!action) + return NULL; + + ret = hws_action_create_stcs(action, 0); + if (ret) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_default_miss(struct mlx5hws_context *ctx, u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_MISS); + if (!action) + return NULL; + + ret = hws_action_create_stcs(action, 0); + if (ret) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_tag(struct mlx5hws_context *ctx, u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_TAG); + if (!action) + return NULL; + + ret = hws_action_create_stcs(action, 0); + if (ret) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +static struct mlx5hws_action * +hws_action_create_aso(struct mlx5hws_context *ctx, + enum mlx5hws_action_type action_type, + u32 obj_id, + u8 return_reg_id, + u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, action_type); + if (!action) + return NULL; + + action->aso.obj_id = obj_id; + action->aso.return_reg_id = return_reg_id; + + ret = hws_action_create_stcs(action, obj_id); + if (ret) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_aso_meter(struct mlx5hws_context *ctx, + u32 obj_id, + u8 return_reg_id, + u32 flags) +{ + return hws_action_create_aso(ctx, MLX5HWS_ACTION_TYP_ASO_METER, + obj_id, return_reg_id, flags); +} + +struct mlx5hws_action * +mlx5hws_action_create_counter(struct mlx5hws_context *ctx, + u32 obj_id, + u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_CTR); + if (!action) + return NULL; + + ret = hws_action_create_stcs(action, obj_id); + if (ret) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_dest_vport(struct mlx5hws_context *ctx, + u16 vport_num, + bool vhca_id_valid, + u16 vhca_id, + u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + if (!(flags & MLX5HWS_ACTION_FLAG_HWS_FDB)) { + mlx5hws_err(ctx, "Vport action is supported for FDB only\n"); + return NULL; + } + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_VPORT); + if (!action) + return NULL; + + if (!ctx->caps->merged_eswitch && vhca_id_valid && vhca_id != ctx->caps->vhca_id) { + mlx5hws_err(ctx, "Non merged eswitch cannot send to other vhca\n"); + goto free_action; + } + + action->vport.vport_num = vport_num; + action->vport.esw_owner_vhca_id_valid = vhca_id_valid; + + if (vhca_id_valid) + action->vport.esw_owner_vhca_id = vhca_id; + + ret = hws_action_create_stcs(action, 0); + if (ret) { + mlx5hws_err(ctx, "Failed creating stc for vport %d\n", vport_num); + goto free_action; + } + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_push_vlan(struct mlx5hws_context *ctx, u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_PUSH_VLAN); + if (!action) + return NULL; + + ret = hws_action_create_stcs(action, 0); + if (ret) { + mlx5hws_err(ctx, "Failed creating stc for push vlan\n"); + goto free_action; + } + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_pop_vlan(struct mlx5hws_context *ctx, u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_POP_VLAN); + if (!action) + return NULL; + + ret = hws_action_get_shared_stc(action, MLX5HWS_CONTEXT_SHARED_STC_DOUBLE_POP); + if (ret) { + mlx5hws_err(ctx, "Failed to create remove stc for reformat\n"); + goto free_action; + } + + ret = hws_action_create_stcs(action, 0); + if (ret) { + mlx5hws_err(ctx, "Failed creating stc for pop vlan\n"); + goto free_shared; + } + + return action; + +free_shared: + hws_action_put_shared_stc(action, MLX5HWS_CONTEXT_SHARED_STC_DOUBLE_POP); +free_action: + kfree(action); + return NULL; +} + +static int +hws_action_handle_insert_with_ptr(struct mlx5hws_action *action, + u8 num_of_hdrs, + struct mlx5hws_action_reformat_header *hdrs, + u32 log_bulk_sz) +{ + size_t max_sz = 0; + u32 arg_id; + int ret, i; + + for (i = 0; i < num_of_hdrs; i++) { + if (hdrs[i].sz % W_SIZE != 0) { + mlx5hws_err(action->ctx, + "Header data size should be in WORD granularity\n"); + return -EINVAL; + } + max_sz = max(hdrs[i].sz, max_sz); + } + + /* Allocate single shared arg object for all headers */ + ret = mlx5hws_arg_create(action->ctx, + hdrs->data, + max_sz, + log_bulk_sz, + action->flags & MLX5HWS_ACTION_FLAG_SHARED, + &arg_id); + if (ret) + return ret; + + for (i = 0; i < num_of_hdrs; i++) { + action[i].reformat.arg_id = arg_id; + action[i].reformat.header_size = hdrs[i].sz; + action[i].reformat.num_of_hdrs = num_of_hdrs; + action[i].reformat.max_hdr_sz = max_sz; + action[i].reformat.require_reparse = true; + + if (action[i].type == MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2 || + action[i].type == MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3) { + action[i].reformat.anchor = MLX5_HEADER_ANCHOR_PACKET_START; + action[i].reformat.offset = 0; + action[i].reformat.encap = 1; + } + + ret = hws_action_create_stcs(&action[i], 0); + if (ret) { + mlx5hws_err(action->ctx, "Failed to create stc for reformat\n"); + goto free_stc; + } + } + + return 0; + +free_stc: + while (i--) + hws_action_destroy_stcs(&action[i]); + + mlx5hws_arg_destroy(action->ctx, arg_id); + return ret; +} + +static int +hws_action_handle_l2_to_tunnel_l3(struct mlx5hws_action *action, + u8 num_of_hdrs, + struct mlx5hws_action_reformat_header *hdrs, + u32 log_bulk_sz) +{ + int ret; + + /* The action is remove-l2-header + insert-l3-header */ + ret = hws_action_get_shared_stc(action, MLX5HWS_CONTEXT_SHARED_STC_DECAP_L3); + if (ret) { + mlx5hws_err(action->ctx, "Failed to create remove stc for reformat\n"); + return ret; + } + + /* Reuse the insert with pointer for the L2L3 header */ + ret = hws_action_handle_insert_with_ptr(action, + num_of_hdrs, + hdrs, + log_bulk_sz); + if (ret) + goto put_shared_stc; + + return 0; + +put_shared_stc: + hws_action_put_shared_stc(action, MLX5HWS_CONTEXT_SHARED_STC_DECAP_L3); + return ret; +} + +static void hws_action_prepare_decap_l3_actions(size_t data_sz, + u8 *mh_data, + int *num_of_actions) +{ + int actions; + u32 i; + + /* Remove L2L3 outer headers */ + MLX5_SET(stc_ste_param_remove, mh_data, action_type, + MLX5_MODIFICATION_TYPE_REMOVE); + MLX5_SET(stc_ste_param_remove, mh_data, decap, 0x1); + MLX5_SET(stc_ste_param_remove, mh_data, remove_start_anchor, + MLX5_HEADER_ANCHOR_PACKET_START); + MLX5_SET(stc_ste_param_remove, mh_data, remove_end_anchor, + MLX5_HEADER_ANCHOR_INNER_IPV6_IPV4); + mh_data += MLX5HWS_ACTION_DOUBLE_SIZE; /* Assume every action is 2 dw */ + actions = 1; + + /* Add the new header using inline action 4Byte at a time, the header + * is added in reversed order to the beginning of the packet to avoid + * incorrect parsing by the HW. Since header is 14B or 18B an extra + * two bytes are padded and later removed. + */ + for (i = 0; i < data_sz / MLX5HWS_ACTION_INLINE_DATA_SIZE + 1; i++) { + MLX5_SET(stc_ste_param_insert, mh_data, action_type, + MLX5_MODIFICATION_TYPE_INSERT); + MLX5_SET(stc_ste_param_insert, mh_data, inline_data, 0x1); + MLX5_SET(stc_ste_param_insert, mh_data, insert_anchor, + MLX5_HEADER_ANCHOR_PACKET_START); + MLX5_SET(stc_ste_param_insert, mh_data, insert_size, 2); + mh_data += MLX5HWS_ACTION_DOUBLE_SIZE; + actions++; + } + + /* Remove first 2 extra bytes */ + MLX5_SET(stc_ste_param_remove_words, mh_data, action_type, + MLX5_MODIFICATION_TYPE_REMOVE_WORDS); + MLX5_SET(stc_ste_param_remove_words, mh_data, remove_start_anchor, + MLX5_HEADER_ANCHOR_PACKET_START); + /* The hardware expects here size in words (2 bytes) */ + MLX5_SET(stc_ste_param_remove_words, mh_data, remove_size, 1); + actions++; + + *num_of_actions = actions; +} + +static int +hws_action_handle_tunnel_l3_to_l2(struct mlx5hws_action *action, + u8 num_of_hdrs, + struct mlx5hws_action_reformat_header *hdrs, + u32 log_bulk_sz) +{ + u8 mh_data[MLX5HWS_ACTION_REFORMAT_DATA_SIZE] = {0}; + struct mlx5hws_context *ctx = action->ctx; + u32 arg_id, pat_id; + int num_of_actions; + int mh_data_size; + int ret, i; + + for (i = 0; i < num_of_hdrs; i++) { + if (hdrs[i].sz != MLX5HWS_ACTION_HDR_LEN_L2 && + hdrs[i].sz != MLX5HWS_ACTION_HDR_LEN_L2_W_VLAN) { + mlx5hws_err(ctx, "Data size is not supported for decap-l3\n"); + return -EINVAL; + } + } + + /* Create a full modify header action list in case shared */ + hws_action_prepare_decap_l3_actions(hdrs->sz, mh_data, &num_of_actions); + if (action->flags & MLX5HWS_ACTION_FLAG_SHARED) + mlx5hws_action_prepare_decap_l3_data(hdrs->data, mh_data, num_of_actions); + + /* All DecapL3 cases require the same max arg size */ + ret = mlx5hws_arg_create_modify_header_arg(ctx, + (__be64 *)mh_data, + num_of_actions, + log_bulk_sz, + action->flags & MLX5HWS_ACTION_FLAG_SHARED, + &arg_id); + if (ret) + return ret; + + for (i = 0; i < num_of_hdrs; i++) { + memset(mh_data, 0, MLX5HWS_ACTION_REFORMAT_DATA_SIZE); + hws_action_prepare_decap_l3_actions(hdrs[i].sz, mh_data, &num_of_actions); + mh_data_size = num_of_actions * MLX5HWS_MODIFY_ACTION_SIZE; + + ret = mlx5hws_pat_get_pattern(ctx, (__be64 *)mh_data, mh_data_size, &pat_id); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate pattern for DecapL3\n"); + goto free_stc_and_pat; + } + + action[i].modify_header.max_num_of_actions = num_of_actions; + action[i].modify_header.num_of_actions = num_of_actions; + action[i].modify_header.num_of_patterns = num_of_hdrs; + action[i].modify_header.arg_id = arg_id; + action[i].modify_header.pat_id = pat_id; + action[i].modify_header.require_reparse = + mlx5hws_pat_require_reparse((__be64 *)mh_data, num_of_actions); + + ret = hws_action_create_stcs(&action[i], 0); + if (ret) { + mlx5hws_pat_put_pattern(ctx, pat_id); + goto free_stc_and_pat; + } + } + + return 0; + +free_stc_and_pat: + while (i--) { + hws_action_destroy_stcs(&action[i]); + mlx5hws_pat_put_pattern(ctx, action[i].modify_header.pat_id); + } + + mlx5hws_arg_destroy(action->ctx, arg_id); + return ret; +} + +static int +hws_action_create_reformat_hws(struct mlx5hws_action *action, + u8 num_of_hdrs, + struct mlx5hws_action_reformat_header *hdrs, + u32 bulk_size) +{ + int ret; + + switch (action->type) { + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2: + ret = hws_action_create_stcs(action, 0); + break; + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2: + ret = hws_action_handle_insert_with_ptr(action, num_of_hdrs, hdrs, bulk_size); + break; + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3: + ret = hws_action_handle_l2_to_tunnel_l3(action, num_of_hdrs, hdrs, bulk_size); + break; + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2: + ret = hws_action_handle_tunnel_l3_to_l2(action, num_of_hdrs, hdrs, bulk_size); + break; + default: + mlx5hws_err(action->ctx, "Invalid HWS reformat action type\n"); + return -EINVAL; + } + + return ret; +} + +struct mlx5hws_action * +mlx5hws_action_create_reformat(struct mlx5hws_context *ctx, + enum mlx5hws_action_type reformat_type, + u8 num_of_hdrs, + struct mlx5hws_action_reformat_header *hdrs, + u32 log_bulk_size, + u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + if (!num_of_hdrs) { + mlx5hws_err(ctx, "Reformat num_of_hdrs cannot be zero\n"); + return NULL; + } + + action = hws_action_create_generic_bulk(ctx, flags, reformat_type, num_of_hdrs); + if (!action) + return NULL; + + if ((flags & MLX5HWS_ACTION_FLAG_SHARED) && (log_bulk_size || num_of_hdrs > 1)) { + mlx5hws_err(ctx, "Reformat flags don't fit HWS (flags: 0x%x)\n", flags); + goto free_action; + } + + ret = hws_action_create_reformat_hws(action, num_of_hdrs, hdrs, log_bulk_size); + if (ret) { + mlx5hws_err(ctx, "Failed to create HWS reformat action\n"); + goto free_action; + } + + return action; + +free_action: + kfree(action); + return NULL; +} + +static int +hws_action_create_modify_header_hws(struct mlx5hws_action *action, + u8 num_of_patterns, + struct mlx5hws_action_mh_pattern *pattern, + u32 log_bulk_size) +{ + struct mlx5hws_context *ctx = action->ctx; + u16 num_actions, max_mh_actions = 0; + int i, ret, size_in_bytes; + u32 pat_id, arg_id = 0; + __be64 *new_pattern; + size_t pat_max_sz; + + pat_max_sz = MLX5HWS_ARG_CHUNK_SIZE_MAX * MLX5HWS_ARG_DATA_SIZE; + size_in_bytes = pat_max_sz * sizeof(__be64); + new_pattern = kcalloc(num_of_patterns, size_in_bytes, GFP_KERNEL); + if (!new_pattern) + return -ENOMEM; + + /* Calculate maximum number of mh actions for shared arg allocation */ + for (i = 0; i < num_of_patterns; i++) { + size_t new_num_actions; + size_t cur_num_actions; + u32 nope_location; + + cur_num_actions = pattern[i].sz / MLX5HWS_MODIFY_ACTION_SIZE; + + mlx5hws_pat_calc_nope(pattern[i].data, cur_num_actions, + pat_max_sz / MLX5HWS_MODIFY_ACTION_SIZE, + &new_num_actions, &nope_location, + &new_pattern[i * pat_max_sz]); + + action[i].modify_header.nope_locations = nope_location; + action[i].modify_header.num_of_actions = new_num_actions; + + max_mh_actions = max(max_mh_actions, new_num_actions); + } + + if (mlx5hws_arg_get_arg_log_size(max_mh_actions) >= MLX5HWS_ARG_CHUNK_SIZE_MAX) { + mlx5hws_err(ctx, "Num of actions (%d) bigger than allowed\n", + max_mh_actions); + ret = -EINVAL; + goto free_new_pat; + } + + /* Allocate single shared arg for all patterns based on the max size */ + if (max_mh_actions > 1) { + ret = mlx5hws_arg_create_modify_header_arg(ctx, + pattern->data, + max_mh_actions, + log_bulk_size, + action->flags & + MLX5HWS_ACTION_FLAG_SHARED, + &arg_id); + if (ret) + goto free_new_pat; + } + + for (i = 0; i < num_of_patterns; i++) { + if (!mlx5hws_pat_verify_actions(ctx, pattern[i].data, pattern[i].sz)) { + mlx5hws_err(ctx, "Fail to verify pattern modify actions\n"); + ret = -EINVAL; + goto free_stc_and_pat; + } + num_actions = pattern[i].sz / MLX5HWS_MODIFY_ACTION_SIZE; + action[i].modify_header.num_of_patterns = num_of_patterns; + action[i].modify_header.max_num_of_actions = max_mh_actions; + + action[i].modify_header.require_reparse = + mlx5hws_pat_require_reparse(pattern[i].data, num_actions); + + if (num_actions == 1) { + pat_id = 0; + /* Optimize single modify action to be used inline */ + action[i].modify_header.single_action = pattern[i].data[0]; + action[i].modify_header.single_action_type = + MLX5_GET(set_action_in, pattern[i].data, action_type); + } else { + /* Multiple modify actions require a pattern */ + if (unlikely(action[i].modify_header.nope_locations)) { + size_t pattern_sz; + + pattern_sz = action[i].modify_header.num_of_actions * + MLX5HWS_MODIFY_ACTION_SIZE; + ret = + mlx5hws_pat_get_pattern(ctx, + &new_pattern[i * pat_max_sz], + pattern_sz, &pat_id); + } else { + ret = mlx5hws_pat_get_pattern(ctx, + pattern[i].data, + pattern[i].sz, + &pat_id); + } + if (ret) { + mlx5hws_err(ctx, + "Failed to allocate pattern for modify header\n"); + goto free_stc_and_pat; + } + + action[i].modify_header.arg_id = arg_id; + action[i].modify_header.pat_id = pat_id; + } + /* Allocate STC for each action representing a header */ + ret = hws_action_create_stcs(&action[i], 0); + if (ret) { + if (pat_id) + mlx5hws_pat_put_pattern(ctx, pat_id); + goto free_stc_and_pat; + } + } + + kfree(new_pattern); + return 0; + +free_stc_and_pat: + while (i--) { + hws_action_destroy_stcs(&action[i]); + if (action[i].modify_header.pat_id) + mlx5hws_pat_put_pattern(ctx, action[i].modify_header.pat_id); + } + + if (arg_id) + mlx5hws_arg_destroy(ctx, arg_id); +free_new_pat: + kfree(new_pattern); + return ret; +} + +struct mlx5hws_action * +mlx5hws_action_create_modify_header(struct mlx5hws_context *ctx, + u8 num_of_patterns, + struct mlx5hws_action_mh_pattern *patterns, + u32 log_bulk_size, + u32 flags) +{ + struct mlx5hws_action *action; + int ret; + + if (!num_of_patterns) { + mlx5hws_err(ctx, "Invalid number of patterns\n"); + return NULL; + } + action = hws_action_create_generic_bulk(ctx, flags, + MLX5HWS_ACTION_TYP_MODIFY_HDR, + num_of_patterns); + if (!action) + return NULL; + + if ((flags & MLX5HWS_ACTION_FLAG_SHARED) && (log_bulk_size || num_of_patterns > 1)) { + mlx5hws_err(ctx, "Action cannot be shared with requested pattern or size\n"); + goto free_action; + } + + ret = hws_action_create_modify_header_hws(action, + num_of_patterns, + patterns, + log_bulk_size); + if (ret) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_dest_array(struct mlx5hws_context *ctx, + size_t num_dest, + struct mlx5hws_action_dest_attr *dests, + bool ignore_flow_level, + u32 flow_source, + u32 flags) +{ + struct mlx5hws_cmd_set_fte_dest *dest_list = NULL; + struct mlx5hws_cmd_ft_create_attr ft_attr = {0}; + struct mlx5hws_cmd_set_fte_attr fte_attr = {0}; + struct mlx5hws_cmd_forward_tbl *fw_island; + struct mlx5hws_action *action; + u32 i /*, packet_reformat_id*/; + int ret; + + if (num_dest <= 1) { + mlx5hws_err(ctx, "Action must have multiple dests\n"); + return NULL; + } + + if (flags == (MLX5HWS_ACTION_FLAG_HWS_FDB | MLX5HWS_ACTION_FLAG_SHARED)) { + ft_attr.type = FS_FT_FDB; + ft_attr.level = ctx->caps->fdb_ft.max_level - 1; + } else { + mlx5hws_err(ctx, "Action flags not supported\n"); + return NULL; + } + + dest_list = kcalloc(num_dest, sizeof(*dest_list), GFP_KERNEL); + if (!dest_list) + return NULL; + + for (i = 0; i < num_dest; i++) { + enum mlx5hws_action_type action_type = dests[i].dest->type; + struct mlx5hws_action *reformat_action = dests[i].reformat; + + switch (action_type) { + case MLX5HWS_ACTION_TYP_TBL: + dest_list[i].destination_type = + MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; + dest_list[i].destination_id = dests[i].dest->dest_obj.obj_id; + fte_attr.action_flags |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + fte_attr.ignore_flow_level = ignore_flow_level; + /* ToDo: In SW steering we have a handling of 'go to WIRE' + * destination here by upper layer setting 'is_wire_ft' flag + * if the destination is wire. + * This is because uplink should be last dest in the list. + */ + break; + case MLX5HWS_ACTION_TYP_VPORT: + dest_list[i].destination_type = MLX5_FLOW_DESTINATION_TYPE_VPORT; + dest_list[i].destination_id = dests[i].dest->vport.vport_num; + fte_attr.action_flags |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + if (ctx->caps->merged_eswitch) { + dest_list[i].ext_flags |= + MLX5HWS_CMD_EXT_DEST_ESW_OWNER_VHCA_ID; + dest_list[i].esw_owner_vhca_id = + dests[i].dest->vport.esw_owner_vhca_id; + } + break; + default: + mlx5hws_err(ctx, "Unsupported action in dest_array\n"); + goto free_dest_list; + } + + if (reformat_action) { + mlx5hws_err(ctx, "dest_array with reformat action - unsupported\n"); + goto free_dest_list; + } + } + + fte_attr.dests_num = num_dest; + fte_attr.dests = dest_list; + + fw_island = mlx5hws_cmd_forward_tbl_create(ctx->mdev, &ft_attr, &fte_attr); + if (!fw_island) + goto free_dest_list; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_DEST_ARRAY); + if (!action) + goto destroy_fw_island; + + ret = hws_action_create_stcs(action, fw_island->ft_id); + if (ret) + goto free_action; + + action->dest_array.fw_island = fw_island; + action->dest_array.num_dest = num_dest; + action->dest_array.dest_list = dest_list; + + return action; + +free_action: + kfree(action); +destroy_fw_island: + mlx5hws_cmd_forward_tbl_destroy(ctx->mdev, fw_island); +free_dest_list: + for (i = 0; i < num_dest; i++) { + if (dest_list[i].ext_reformat_id) + mlx5hws_cmd_packet_reformat_destroy(ctx->mdev, + dest_list[i].ext_reformat_id); + } + kfree(dest_list); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_insert_header(struct mlx5hws_context *ctx, + u8 num_of_hdrs, + struct mlx5hws_action_insert_header *hdrs, + u32 log_bulk_size, + u32 flags) +{ + struct mlx5hws_action_reformat_header *reformat_hdrs; + struct mlx5hws_action *action; + int ret; + int i; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_INSERT_HEADER); + if (!action) + return NULL; + + reformat_hdrs = kcalloc(num_of_hdrs, sizeof(*reformat_hdrs), GFP_KERNEL); + if (!reformat_hdrs) + goto free_action; + + for (i = 0; i < num_of_hdrs; i++) { + if (hdrs[i].offset % W_SIZE != 0) { + mlx5hws_err(ctx, "Header offset should be in WORD granularity\n"); + goto free_reformat_hdrs; + } + + action[i].reformat.anchor = hdrs[i].anchor; + action[i].reformat.encap = hdrs[i].encap; + action[i].reformat.offset = hdrs[i].offset; + + reformat_hdrs[i].sz = hdrs[i].hdr.sz; + reformat_hdrs[i].data = hdrs[i].hdr.data; + } + + ret = hws_action_handle_insert_with_ptr(action, num_of_hdrs, + reformat_hdrs, log_bulk_size); + if (ret) { + mlx5hws_err(ctx, "Failed to create HWS reformat action\n"); + goto free_reformat_hdrs; + } + + kfree(reformat_hdrs); + + return action; + +free_reformat_hdrs: + kfree(reformat_hdrs); +free_action: + kfree(action); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_remove_header(struct mlx5hws_context *ctx, + struct mlx5hws_action_remove_header_attr *attr, + u32 flags) +{ + struct mlx5hws_action *action; + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_REMOVE_HEADER); + if (!action) + return NULL; + + /* support only remove anchor with size */ + if (attr->size % W_SIZE != 0) { + mlx5hws_err(ctx, + "Invalid size, HW supports header remove in WORD granularity\n"); + goto free_action; + } + + if (attr->size > MLX5HWS_ACTION_REMOVE_HEADER_MAX_SIZE) { + mlx5hws_err(ctx, "Header removal size limited to %u bytes\n", + MLX5HWS_ACTION_REMOVE_HEADER_MAX_SIZE); + goto free_action; + } + + action->remove_header.anchor = attr->anchor; + action->remove_header.size = attr->size / W_SIZE; + + if (hws_action_create_stcs(action, 0)) + goto free_action; + + return action; + +free_action: + kfree(action); + return NULL; +} + +static struct mlx5hws_definer * +hws_action_create_dest_match_range_definer(struct mlx5hws_context *ctx) +{ + struct mlx5hws_definer *definer; + __be32 *tag; + int ret; + + definer = kzalloc(sizeof(*definer), GFP_KERNEL); + if (!definer) + return NULL; + + definer->dw_selector[0] = MLX5_IFC_DEFINER_FORMAT_OFFSET_OUTER_ETH_PKT_LEN / 4; + /* Set DW0 tag mask */ + tag = (__force __be32 *)definer->mask.jumbo; + tag[MLX5HWS_RULE_JUMBO_MATCH_TAG_OFFSET_DW0] = htonl(0xffffUL << 16); + + mutex_lock(&ctx->ctrl_lock); + + ret = mlx5hws_definer_get_obj(ctx, definer); + if (ret < 0) { + mutex_unlock(&ctx->ctrl_lock); + kfree(definer); + return NULL; + } + + mutex_unlock(&ctx->ctrl_lock); + definer->obj_id = ret; + + return definer; +} + +static struct mlx5hws_matcher_action_ste * +hws_action_create_dest_match_range_table(struct mlx5hws_context *ctx, + struct mlx5hws_definer *definer, + u32 miss_ft_id) +{ + struct mlx5hws_cmd_rtc_create_attr rtc_attr = {0}; + struct mlx5hws_action_default_stc *default_stc; + struct mlx5hws_matcher_action_ste *table_ste; + struct mlx5hws_pool_attr pool_attr = {0}; + struct mlx5hws_pool *ste_pool, *stc_pool; + struct mlx5hws_pool_chunk *ste; + u32 *rtc_0_id, *rtc_1_id; + u32 obj_id; + int ret; + + /* Check if STE range is supported */ + if (!IS_BIT_SET(ctx->caps->supp_ste_format_gen_wqe, MLX5_IFC_RTC_STE_FORMAT_RANGE)) { + mlx5hws_err(ctx, "Range STE format not supported\n"); + return NULL; + } + + table_ste = kzalloc(sizeof(*table_ste), GFP_KERNEL); + if (!table_ste) + return NULL; + + mutex_lock(&ctx->ctrl_lock); + + pool_attr.table_type = MLX5HWS_TABLE_TYPE_FDB; + pool_attr.pool_type = MLX5HWS_POOL_TYPE_STE; + pool_attr.flags = MLX5HWS_POOL_FLAGS_FOR_STE_ACTION_POOL; + pool_attr.alloc_log_sz = 1; + table_ste->pool = mlx5hws_pool_create(ctx, &pool_attr); + if (!table_ste->pool) { + mlx5hws_err(ctx, "Failed to allocate memory ste pool\n"); + goto free_ste; + } + + /* Allocate RTC */ + rtc_0_id = &table_ste->rtc_0_id; + rtc_1_id = &table_ste->rtc_1_id; + ste_pool = table_ste->pool; + ste = &table_ste->ste; + ste->order = 1; + + rtc_attr.log_size = 0; + rtc_attr.log_depth = 0; + rtc_attr.miss_ft_id = miss_ft_id; + rtc_attr.num_hash_definer = 1; + rtc_attr.update_index_mode = MLX5_IFC_RTC_STE_UPDATE_MODE_BY_HASH; + rtc_attr.access_index_mode = MLX5_IFC_RTC_STE_ACCESS_MODE_BY_HASH; + rtc_attr.match_definer_0 = ctx->caps->trivial_match_definer; + rtc_attr.fw_gen_wqe = true; + rtc_attr.is_scnd_range = true; + + obj_id = mlx5hws_pool_chunk_get_base_id(ste_pool, ste); + + rtc_attr.pd = ctx->pd_num; + rtc_attr.ste_base = obj_id; + rtc_attr.ste_offset = ste->offset; + rtc_attr.reparse_mode = mlx5hws_context_get_reparse_mode(ctx); + rtc_attr.table_type = mlx5hws_table_get_res_fw_ft_type(MLX5HWS_TABLE_TYPE_FDB, false); + + /* STC is a single resource (obj_id), use any STC for the ID */ + stc_pool = ctx->stc_pool[MLX5HWS_TABLE_TYPE_FDB]; + default_stc = ctx->common_res[MLX5HWS_TABLE_TYPE_FDB].default_stc; + obj_id = mlx5hws_pool_chunk_get_base_id(stc_pool, &default_stc->default_hit); + rtc_attr.stc_base = obj_id; + + ret = mlx5hws_cmd_rtc_create(ctx->mdev, &rtc_attr, rtc_0_id); + if (ret) { + mlx5hws_err(ctx, "Failed to create RTC"); + goto pool_destroy; + } + + /* Create mirror RTC */ + obj_id = mlx5hws_pool_chunk_get_base_mirror_id(ste_pool, ste); + rtc_attr.ste_base = obj_id; + rtc_attr.table_type = mlx5hws_table_get_res_fw_ft_type(MLX5HWS_TABLE_TYPE_FDB, true); + + obj_id = mlx5hws_pool_chunk_get_base_mirror_id(stc_pool, &default_stc->default_hit); + rtc_attr.stc_base = obj_id; + + ret = mlx5hws_cmd_rtc_create(ctx->mdev, &rtc_attr, rtc_1_id); + if (ret) { + mlx5hws_err(ctx, "Failed to create mirror RTC"); + goto destroy_rtc_0; + } + + mutex_unlock(&ctx->ctrl_lock); + + return table_ste; + +destroy_rtc_0: + mlx5hws_cmd_rtc_destroy(ctx->mdev, *rtc_0_id); +pool_destroy: + mlx5hws_pool_destroy(table_ste->pool); +free_ste: + mutex_unlock(&ctx->ctrl_lock); + kfree(table_ste); + return NULL; +} + +static void +hws_action_destroy_dest_match_range_table(struct mlx5hws_context *ctx, + struct mlx5hws_matcher_action_ste *table_ste) +{ + mutex_lock(&ctx->ctrl_lock); + + mlx5hws_cmd_rtc_destroy(ctx->mdev, table_ste->rtc_1_id); + mlx5hws_cmd_rtc_destroy(ctx->mdev, table_ste->rtc_0_id); + mlx5hws_pool_destroy(table_ste->pool); + kfree(table_ste); + + mutex_unlock(&ctx->ctrl_lock); +} + +static int +hws_action_create_dest_match_range_fill_table(struct mlx5hws_context *ctx, + struct mlx5hws_matcher_action_ste *table_ste, + struct mlx5hws_action *hit_ft_action, + struct mlx5hws_definer *range_definer, + u32 min, u32 max) +{ + struct mlx5hws_wqe_gta_data_seg_ste match_wqe_data = {0}; + struct mlx5hws_wqe_gta_data_seg_ste range_wqe_data = {0}; + struct mlx5hws_wqe_gta_ctrl_seg wqe_ctrl = {0}; + u32 no_use, used_rtc_0_id, used_rtc_1_id, ret; + struct mlx5hws_context_common_res *common_res; + struct mlx5hws_send_ste_attr ste_attr = {0}; + struct mlx5hws_send_engine *queue; + __be32 *wqe_data_arr; + + mutex_lock(&ctx->ctrl_lock); + + /* Get the control queue */ + queue = &ctx->send_queue[ctx->queues - 1]; + if (unlikely(mlx5hws_send_engine_err(queue))) { + ret = -EIO; + goto error; + } + + /* Init default send STE attributes */ + ste_attr.gta_opcode = MLX5HWS_WQE_GTA_OP_ACTIVATE; + ste_attr.send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + ste_attr.send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + ste_attr.send_attr.len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + ste_attr.send_attr.user_data = &no_use; + ste_attr.send_attr.rule = NULL; + ste_attr.send_attr.fence = 1; + ste_attr.send_attr.notify_hw = true; + ste_attr.rtc_0 = table_ste->rtc_0_id; + ste_attr.rtc_1 = table_ste->rtc_1_id; + ste_attr.used_id_rtc_0 = &used_rtc_0_id; + ste_attr.used_id_rtc_1 = &used_rtc_1_id; + + common_res = &ctx->common_res[MLX5HWS_TABLE_TYPE_FDB]; + + /* init an empty match STE which will always hit */ + ste_attr.wqe_ctrl = &wqe_ctrl; + ste_attr.wqe_data = &match_wqe_data; + ste_attr.send_attr.match_definer_id = ctx->caps->trivial_match_definer; + + /* Fill WQE control data */ + wqe_ctrl.stc_ix[MLX5HWS_ACTION_STC_IDX_CTRL] = + htonl(common_res->default_stc->nop_ctr.offset); + wqe_ctrl.stc_ix[MLX5HWS_ACTION_STC_IDX_DW5] = + htonl(common_res->default_stc->nop_dw5.offset); + wqe_ctrl.stc_ix[MLX5HWS_ACTION_STC_IDX_DW6] = + htonl(common_res->default_stc->nop_dw6.offset); + wqe_ctrl.stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = + htonl(common_res->default_stc->nop_dw7.offset); + wqe_ctrl.stc_ix[MLX5HWS_ACTION_STC_IDX_CTRL] |= + htonl(MLX5HWS_ACTION_STC_IDX_LAST_COMBO2 << 29); + wqe_ctrl.stc_ix[MLX5HWS_ACTION_STC_IDX_HIT] = + htonl(hit_ft_action->stc[MLX5HWS_TABLE_TYPE_FDB].offset); + + wqe_data_arr = (__force __be32 *)&range_wqe_data; + + ste_attr.range_wqe_data = &range_wqe_data; + ste_attr.send_attr.len += MLX5HWS_WQE_SZ_GTA_DATA; + ste_attr.send_attr.range_definer_id = mlx5hws_definer_get_id(range_definer); + + /* Fill range matching fields, + * min/max_value_2 corresponds to match_dw_0 in its definer, + * min_value_2 sets in DW0 in the STE and max_value_2 sets in DW1 in the STE. + */ + wqe_data_arr[MLX5HWS_MATCHER_OFFSET_TAG_DW0] = htonl(min << 16); + wqe_data_arr[MLX5HWS_MATCHER_OFFSET_TAG_DW1] = htonl(max << 16); + + /* Send WQEs to FW */ + mlx5hws_send_stes_fw(ctx, queue, &ste_attr); + + /* Poll for completion */ + ret = mlx5hws_send_queue_action(ctx, ctx->queues - 1, + MLX5HWS_SEND_QUEUE_ACTION_DRAIN_SYNC); + if (ret) { + mlx5hws_err(ctx, "Failed to drain control queue"); + goto error; + } + + mutex_unlock(&ctx->ctrl_lock); + + return 0; + +error: + mutex_unlock(&ctx->ctrl_lock); + return ret; +} + +struct mlx5hws_action * +mlx5hws_action_create_dest_match_range(struct mlx5hws_context *ctx, + u32 field, + struct mlx5_flow_table *hit_ft, + struct mlx5_flow_table *miss_ft, + u32 min, u32 max, u32 flags) +{ + struct mlx5hws_cmd_stc_modify_attr stc_attr = {0}; + struct mlx5hws_matcher_action_ste *table_ste; + struct mlx5hws_action *hit_ft_action; + struct mlx5hws_definer *definer; + struct mlx5hws_action *action; + u32 miss_ft_id = miss_ft->id; + u32 hit_ft_id = hit_ft->id; + int ret; + + if (field != MLX5_FLOW_DEST_RANGE_FIELD_PKT_LEN || + min > 0xffff || max > 0xffff) { + mlx5hws_err(ctx, "Invalid match range parameters\n"); + return NULL; + } + + action = hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_RANGE); + if (!action) + return NULL; + + definer = hws_action_create_dest_match_range_definer(ctx); + if (!definer) + goto free_action; + + table_ste = hws_action_create_dest_match_range_table(ctx, definer, miss_ft_id); + if (!table_ste) + goto destroy_definer; + + hit_ft_action = mlx5hws_action_create_dest_table_num(ctx, hit_ft_id, flags); + if (!hit_ft_action) + goto destroy_table_ste; + + ret = hws_action_create_dest_match_range_fill_table(ctx, table_ste, + hit_ft_action, + definer, min, max); + if (ret) + goto destroy_hit_ft_action; + + action->range.table_ste = table_ste; + action->range.definer = definer; + action->range.hit_ft_action = hit_ft_action; + + /* Allocate STC for jumps to STE */ + mutex_lock(&ctx->ctrl_lock); + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_HIT; + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_STE_TABLE; + stc_attr.reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + stc_attr.ste_table.ste = table_ste->ste; + stc_attr.ste_table.ste_pool = table_ste->pool; + stc_attr.ste_table.match_definer_id = ctx->caps->trivial_match_definer; + + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, MLX5HWS_TABLE_TYPE_FDB, + &action->stc[MLX5HWS_TABLE_TYPE_FDB]); + if (ret) + goto error_unlock; + + mutex_unlock(&ctx->ctrl_lock); + + return action; + +error_unlock: + mutex_unlock(&ctx->ctrl_lock); +destroy_hit_ft_action: + mlx5hws_action_destroy(hit_ft_action); +destroy_table_ste: + hws_action_destroy_dest_match_range_table(ctx, table_ste); +destroy_definer: + mlx5hws_definer_free(ctx, definer); +free_action: + kfree(action); + mlx5hws_err(ctx, "Failed to create action dest match range"); + return NULL; +} + +struct mlx5hws_action * +mlx5hws_action_create_last(struct mlx5hws_context *ctx, u32 flags) +{ + return hws_action_create_generic(ctx, flags, MLX5HWS_ACTION_TYP_LAST); +} + +struct mlx5hws_action * +mlx5hws_action_create_flow_sampler(struct mlx5hws_context *ctx, + u32 sampler_id, u32 flags) +{ + mlx5hws_err(ctx, "Flow sampler action - unsupported\n"); + return NULL; +} + +static void hws_action_destroy_hws(struct mlx5hws_action *action) +{ + u32 ext_reformat_id; + bool shared_arg; + u32 obj_id; + u32 i; + + switch (action->type) { + case MLX5HWS_ACTION_TYP_MISS: + case MLX5HWS_ACTION_TYP_TAG: + case MLX5HWS_ACTION_TYP_DROP: + case MLX5HWS_ACTION_TYP_CTR: + case MLX5HWS_ACTION_TYP_TBL: + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2: + case MLX5HWS_ACTION_TYP_ASO_METER: + case MLX5HWS_ACTION_TYP_PUSH_VLAN: + case MLX5HWS_ACTION_TYP_REMOVE_HEADER: + case MLX5HWS_ACTION_TYP_VPORT: + hws_action_destroy_stcs(action); + break; + case MLX5HWS_ACTION_TYP_POP_VLAN: + hws_action_destroy_stcs(action); + hws_action_put_shared_stc(action, MLX5HWS_CONTEXT_SHARED_STC_DOUBLE_POP); + break; + case MLX5HWS_ACTION_TYP_DEST_ARRAY: + hws_action_destroy_stcs(action); + mlx5hws_cmd_forward_tbl_destroy(action->ctx->mdev, action->dest_array.fw_island); + for (i = 0; i < action->dest_array.num_dest; i++) { + ext_reformat_id = action->dest_array.dest_list[i].ext_reformat_id; + if (ext_reformat_id) + mlx5hws_cmd_packet_reformat_destroy(action->ctx->mdev, + ext_reformat_id); + } + kfree(action->dest_array.dest_list); + break; + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2: + case MLX5HWS_ACTION_TYP_MODIFY_HDR: + shared_arg = false; + for (i = 0; i < action->modify_header.num_of_patterns; i++) { + hws_action_destroy_stcs(&action[i]); + if (action[i].modify_header.num_of_actions > 1) { + mlx5hws_pat_put_pattern(action[i].ctx, + action[i].modify_header.pat_id); + /* Save shared arg object to be freed after */ + obj_id = action[i].modify_header.arg_id; + shared_arg = true; + } + } + if (shared_arg) + mlx5hws_arg_destroy(action->ctx, obj_id); + break; + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3: + hws_action_put_shared_stc(action, MLX5HWS_CONTEXT_SHARED_STC_DECAP_L3); + for (i = 0; i < action->reformat.num_of_hdrs; i++) + hws_action_destroy_stcs(&action[i]); + mlx5hws_arg_destroy(action->ctx, action->reformat.arg_id); + break; + case MLX5HWS_ACTION_TYP_INSERT_HEADER: + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2: + for (i = 0; i < action->reformat.num_of_hdrs; i++) + hws_action_destroy_stcs(&action[i]); + mlx5hws_arg_destroy(action->ctx, action->reformat.arg_id); + break; + case MLX5HWS_ACTION_TYP_RANGE: + hws_action_destroy_stcs(action); + hws_action_destroy_dest_match_range_table(action->ctx, action->range.table_ste); + mlx5hws_definer_free(action->ctx, action->range.definer); + mlx5hws_action_destroy(action->range.hit_ft_action); + break; + case MLX5HWS_ACTION_TYP_LAST: + break; + default: + pr_warn("HWS: Invalid action type: %d\n", action->type); + } +} + +int mlx5hws_action_destroy(struct mlx5hws_action *action) +{ + hws_action_destroy_hws(action); + + kfree(action); + return 0; +} + +int mlx5hws_action_get_default_stc(struct mlx5hws_context *ctx, u8 tbl_type) +__must_hold(&ctx->ctrl_lock) +{ + struct mlx5hws_cmd_stc_modify_attr stc_attr = {0}; + struct mlx5hws_action_default_stc *default_stc; + int ret; + + if (ctx->common_res[tbl_type].default_stc) { + ctx->common_res[tbl_type].default_stc->refcount++; + return 0; + } + + default_stc = kzalloc(sizeof(*default_stc), GFP_KERNEL); + if (!default_stc) + return -ENOMEM; + + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_NOP; + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_DW0; + stc_attr.reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl_type, + &default_stc->nop_ctr); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate default counter STC\n"); + goto free_default_stc; + } + + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_DW5; + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl_type, + &default_stc->nop_dw5); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate default NOP DW5 STC\n"); + goto free_nop_ctr; + } + + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_DW6; + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl_type, + &default_stc->nop_dw6); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate default NOP DW6 STC\n"); + goto free_nop_dw5; + } + + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_DW7; + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl_type, + &default_stc->nop_dw7); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate default NOP DW7 STC\n"); + goto free_nop_dw6; + } + + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_HIT; + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_ALLOW; + + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl_type, + &default_stc->default_hit); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate default allow STC\n"); + goto free_nop_dw7; + } + + ctx->common_res[tbl_type].default_stc = default_stc; + ctx->common_res[tbl_type].default_stc->refcount++; + + return 0; + +free_nop_dw7: + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_dw7); +free_nop_dw6: + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_dw6); +free_nop_dw5: + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_dw5); +free_nop_ctr: + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_ctr); +free_default_stc: + kfree(default_stc); + return ret; +} + +void mlx5hws_action_put_default_stc(struct mlx5hws_context *ctx, u8 tbl_type) +__must_hold(&ctx->ctrl_lock) +{ + struct mlx5hws_action_default_stc *default_stc; + + default_stc = ctx->common_res[tbl_type].default_stc; + + default_stc = ctx->common_res[tbl_type].default_stc; + if (--default_stc->refcount) + return; + + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->default_hit); + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_dw7); + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_dw6); + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_dw5); + mlx5hws_action_free_single_stc(ctx, tbl_type, &default_stc->nop_ctr); + kfree(default_stc); + ctx->common_res[tbl_type].default_stc = NULL; +} + +static void hws_action_modify_write(struct mlx5hws_send_engine *queue, + u32 arg_idx, + u8 *arg_data, + u16 num_of_actions, + u32 nope_locations) +{ + u8 *new_arg_data = NULL; + int i, j; + + if (unlikely(nope_locations)) { + new_arg_data = kcalloc(num_of_actions, + MLX5HWS_MODIFY_ACTION_SIZE, GFP_KERNEL); + if (unlikely(!new_arg_data)) + return; + + for (i = 0, j = 0; i < num_of_actions; i++, j++) { + memcpy(&new_arg_data[j], arg_data, MLX5HWS_MODIFY_ACTION_SIZE); + if (BIT(i) & nope_locations) + j++; + } + } + + mlx5hws_arg_write(queue, NULL, arg_idx, + new_arg_data ? new_arg_data : arg_data, + num_of_actions * MLX5HWS_MODIFY_ACTION_SIZE); + + kfree(new_arg_data); +} + +void mlx5hws_action_prepare_decap_l3_data(u8 *src, u8 *dst, u16 num_of_actions) +{ + u8 *e_src; + int i; + + /* num_of_actions = remove l3l2 + 4/5 inserts + remove extra 2 bytes + * copy from end of src to the start of dst. + * move to the end, 2 is the leftover from 14B or 18B + */ + if (num_of_actions == DECAP_L3_NUM_ACTIONS_W_NO_VLAN) + e_src = src + MLX5HWS_ACTION_HDR_LEN_L2; + else + e_src = src + MLX5HWS_ACTION_HDR_LEN_L2_W_VLAN; + + /* Move dst over the first remove action + zero data */ + dst += MLX5HWS_ACTION_DOUBLE_SIZE; + /* Move dst over the first insert ctrl action */ + dst += MLX5HWS_ACTION_DOUBLE_SIZE / 2; + /* Actions: + * no vlan: r_h-insert_4b-insert_4b-insert_4b-insert_4b-remove_2b. + * with vlan: r_h-insert_4b-insert_4b-insert_4b-insert_4b-insert_4b-remove_2b. + * the loop is without the last insertion. + */ + for (i = 0; i < num_of_actions - 3; i++) { + e_src -= MLX5HWS_ACTION_INLINE_DATA_SIZE; + memcpy(dst, e_src, MLX5HWS_ACTION_INLINE_DATA_SIZE); /* data */ + dst += MLX5HWS_ACTION_DOUBLE_SIZE; + } + /* Copy the last 2 bytes after a gap of 2 bytes which will be removed */ + e_src -= MLX5HWS_ACTION_INLINE_DATA_SIZE / 2; + dst += MLX5HWS_ACTION_INLINE_DATA_SIZE / 2; + memcpy(dst, e_src, 2); +} + +static int +hws_action_get_shared_stc_offset(struct mlx5hws_context_common_res *common_res, + enum mlx5hws_context_shared_stc_type stc_type) +{ + return common_res->shared_stc[stc_type]->stc_chunk.offset; +} + +static struct mlx5hws_actions_wqe_setter * +hws_action_setter_find_first(struct mlx5hws_actions_wqe_setter *setter, + u8 req_flags) +{ + /* Use a new setter if requested flags are taken */ + while (setter->flags & req_flags) + setter++; + + /* Use current setter in required flags are not used */ + return setter; +} + +static void +hws_action_apply_stc(struct mlx5hws_actions_apply_data *apply, + enum mlx5hws_action_stc_idx stc_idx, + u8 action_idx) +{ + struct mlx5hws_action *action = apply->rule_action[action_idx].action; + + apply->wqe_ctrl->stc_ix[stc_idx] = + htonl(action->stc[apply->tbl_type].offset); +} + +static void +hws_action_setter_push_vlan(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + + rule_action = &apply->rule_action[setter->idx_double]; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = 0; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = rule_action->push_vlan.vlan_hdr; + + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_DW6, setter->idx_double); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = 0; +} + +static void +hws_action_setter_modify_header(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + struct mlx5hws_action *action; + u32 arg_sz, arg_idx; + u8 *single_action; + __be32 stc_idx; + + rule_action = &apply->rule_action[setter->idx_double]; + action = rule_action->action; + + stc_idx = htonl(action->stc[apply->tbl_type].offset); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW6] = stc_idx; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = 0; + + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = 0; + + if (action->modify_header.num_of_actions == 1) { + if (action->modify_header.single_action_type == + MLX5_MODIFICATION_TYPE_COPY || + action->modify_header.single_action_type == + MLX5_MODIFICATION_TYPE_ADD_FIELD) { + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = 0; + return; + } + + if (action->flags & MLX5HWS_ACTION_FLAG_SHARED) + single_action = (u8 *)&action->modify_header.single_action; + else + single_action = rule_action->modify_header.data; + + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = + *(__be32 *)MLX5_ADDR_OF(set_action_in, single_action, data); + } else { + /* Argument offset multiple with number of args per these actions */ + arg_sz = mlx5hws_arg_get_arg_size(action->modify_header.max_num_of_actions); + arg_idx = rule_action->modify_header.offset * arg_sz; + + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = htonl(arg_idx); + + if (!(action->flags & MLX5HWS_ACTION_FLAG_SHARED)) { + apply->require_dep = 1; + hws_action_modify_write(apply->queue, + action->modify_header.arg_id + arg_idx, + rule_action->modify_header.data, + action->modify_header.num_of_actions, + action->modify_header.nope_locations); + } + } +} + +static void +hws_action_setter_insert_ptr(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + struct mlx5hws_action *action; + u32 arg_idx, arg_sz; + __be32 stc_idx; + + rule_action = &apply->rule_action[setter->idx_double]; + action = rule_action->action + rule_action->reformat.hdr_idx; + + /* Argument offset multiple on args required for header size */ + arg_sz = mlx5hws_arg_data_size_to_arg_size(action->reformat.max_hdr_sz); + arg_idx = rule_action->reformat.offset * arg_sz; + + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = 0; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = htonl(arg_idx); + + stc_idx = htonl(action->stc[apply->tbl_type].offset); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW6] = stc_idx; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = 0; + + if (!(action->flags & MLX5HWS_ACTION_FLAG_SHARED)) { + apply->require_dep = 1; + mlx5hws_arg_write(apply->queue, NULL, + action->reformat.arg_id + arg_idx, + rule_action->reformat.data, + action->reformat.header_size); + } +} + +static void +hws_action_setter_tnl_l3_to_l2(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + struct mlx5hws_action *action; + u32 arg_sz, arg_idx; + __be32 stc_idx; + + rule_action = &apply->rule_action[setter->idx_double]; + action = rule_action->action + rule_action->reformat.hdr_idx; + + /* Argument offset multiple on args required for num of actions */ + arg_sz = mlx5hws_arg_get_arg_size(action->modify_header.max_num_of_actions); + arg_idx = rule_action->reformat.offset * arg_sz; + + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = 0; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = htonl(arg_idx); + + stc_idx = htonl(action->stc[apply->tbl_type].offset); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW6] = stc_idx; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = 0; + + if (!(action->flags & MLX5HWS_ACTION_FLAG_SHARED)) { + apply->require_dep = 1; + mlx5hws_arg_decapl3_write(apply->queue, + action->modify_header.arg_id + arg_idx, + rule_action->reformat.data, + action->modify_header.num_of_actions); + } +} + +static void +hws_action_setter_aso(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + u32 exe_aso_ctrl; + u32 offset; + + rule_action = &apply->rule_action[setter->idx_double]; + + switch (rule_action->action->type) { + case MLX5HWS_ACTION_TYP_ASO_METER: + /* exe_aso_ctrl format: + * [STC only and reserved bits 29b][init_color 2b][meter_id 1b] + */ + offset = rule_action->aso_meter.offset / MLX5_ASO_METER_NUM_PER_OBJ; + exe_aso_ctrl = rule_action->aso_meter.offset % MLX5_ASO_METER_NUM_PER_OBJ; + exe_aso_ctrl |= rule_action->aso_meter.init_color << + MLX5HWS_ACTION_METER_INIT_COLOR_OFFSET; + break; + default: + mlx5hws_err(rule_action->action->ctx, + "Unsupported ASO action type: %d\n", rule_action->action->type); + return; + } + + /* aso_object_offset format: [24B] */ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = htonl(offset); + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = htonl(exe_aso_ctrl); + + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_DW6, setter->idx_double); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = 0; +} + +static void +hws_action_setter_tag(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + + rule_action = &apply->rule_action[setter->idx_single]; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW5] = htonl(rule_action->tag.value); + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_DW5, setter->idx_single); +} + +static void +hws_action_setter_ctrl_ctr(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + struct mlx5hws_rule_action *rule_action; + + rule_action = &apply->rule_action[setter->idx_ctr]; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW0] = htonl(rule_action->counter.offset); + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_CTRL, setter->idx_ctr); +} + +static void +hws_action_setter_single(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW5] = 0; + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_DW5, setter->idx_single); +} + +static void +hws_action_setter_single_double_pop(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW5] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW5] = + htonl(hws_action_get_shared_stc_offset(apply->common_res, + MLX5HWS_CONTEXT_SHARED_STC_DOUBLE_POP)); +} + +static void +hws_action_setter_hit(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_HIT_LSB] = 0; + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_HIT, setter->idx_hit); +} + +static void +hws_action_setter_default_hit(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_HIT_LSB] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_HIT] = + htonl(apply->common_res->default_stc->default_hit.offset); +} + +static void +hws_action_setter_hit_next_action(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_HIT_LSB] = htonl(apply->next_direct_idx << 6); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_HIT] = htonl(apply->jump_to_action_stc); +} + +static void +hws_action_setter_common_decap(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW5] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW5] = + htonl(hws_action_get_shared_stc_offset(apply->common_res, + MLX5HWS_CONTEXT_SHARED_STC_DECAP_L3)); +} + +static void +hws_action_setter_range(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + /* Always jump to index zero */ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_HIT_LSB] = 0; + hws_action_apply_stc(apply, MLX5HWS_ACTION_STC_IDX_HIT, setter->idx_hit); +} + +int mlx5hws_action_template_process(struct mlx5hws_action_template *at) +{ + struct mlx5hws_actions_wqe_setter *start_setter = at->setters + 1; + enum mlx5hws_action_type *action_type = at->action_type_arr; + struct mlx5hws_actions_wqe_setter *setter = at->setters; + struct mlx5hws_actions_wqe_setter *pop_setter = NULL; + struct mlx5hws_actions_wqe_setter *last_setter; + int i; + + /* Note: Given action combination must be valid */ + + /* Check if action were already processed */ + if (at->num_of_action_stes) + return 0; + + for (i = 0; i < MLX5HWS_ACTION_MAX_STE; i++) + setter[i].set_hit = &hws_action_setter_hit_next_action; + + /* The same action template setters can be used with jumbo or match + * STE, to support both cases we reserve the first setter for cases + * with jumbo STE to allow jump to the first action STE. + * This extra setter can be reduced in some cases on rule creation. + */ + setter = start_setter; + last_setter = start_setter; + + for (i = 0; i < at->num_actions; i++) { + switch (action_type[i]) { + case MLX5HWS_ACTION_TYP_DROP: + case MLX5HWS_ACTION_TYP_TBL: + case MLX5HWS_ACTION_TYP_DEST_ARRAY: + case MLX5HWS_ACTION_TYP_VPORT: + case MLX5HWS_ACTION_TYP_MISS: + /* Hit action */ + last_setter->flags |= ASF_HIT; + last_setter->set_hit = &hws_action_setter_hit; + last_setter->idx_hit = i; + break; + + case MLX5HWS_ACTION_TYP_RANGE: + last_setter->flags |= ASF_HIT; + last_setter->set_hit = &hws_action_setter_range; + last_setter->idx_hit = i; + break; + + case MLX5HWS_ACTION_TYP_POP_VLAN: + /* Single remove header to header */ + if (pop_setter) { + /* We have 2 pops, use the shared */ + pop_setter->set_single = &hws_action_setter_single_double_pop; + break; + } + setter = hws_action_setter_find_first(last_setter, + ASF_SINGLE1 | ASF_MODIFY | + ASF_INSERT); + setter->flags |= ASF_SINGLE1 | ASF_REMOVE; + setter->set_single = &hws_action_setter_single; + setter->idx_single = i; + pop_setter = setter; + break; + + case MLX5HWS_ACTION_TYP_PUSH_VLAN: + /* Double insert inline */ + setter = hws_action_setter_find_first(last_setter, ASF_DOUBLE | ASF_REMOVE); + setter->flags |= ASF_DOUBLE | ASF_INSERT; + setter->set_double = &hws_action_setter_push_vlan; + setter->idx_double = i; + break; + + case MLX5HWS_ACTION_TYP_MODIFY_HDR: + /* Double modify header list */ + setter = hws_action_setter_find_first(last_setter, ASF_DOUBLE | ASF_REMOVE); + setter->flags |= ASF_DOUBLE | ASF_MODIFY; + setter->set_double = &hws_action_setter_modify_header; + setter->idx_double = i; + break; + + case MLX5HWS_ACTION_TYP_ASO_METER: + /* Double ASO action */ + setter = hws_action_setter_find_first(last_setter, ASF_DOUBLE); + setter->flags |= ASF_DOUBLE; + setter->set_double = &hws_action_setter_aso; + setter->idx_double = i; + break; + + case MLX5HWS_ACTION_TYP_REMOVE_HEADER: + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2: + /* Single remove header to header */ + setter = hws_action_setter_find_first(last_setter, + ASF_SINGLE1 | ASF_MODIFY); + setter->flags |= ASF_SINGLE1 | ASF_REMOVE; + setter->set_single = &hws_action_setter_single; + setter->idx_single = i; + break; + + case MLX5HWS_ACTION_TYP_INSERT_HEADER: + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2: + /* Double insert header with pointer */ + setter = hws_action_setter_find_first(last_setter, ASF_DOUBLE | ASF_REMOVE); + setter->flags |= ASF_DOUBLE | ASF_INSERT; + setter->set_double = &hws_action_setter_insert_ptr; + setter->idx_double = i; + break; + + case MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3: + /* Single remove + Double insert header with pointer */ + setter = hws_action_setter_find_first(last_setter, + ASF_SINGLE1 | ASF_DOUBLE); + setter->flags |= ASF_SINGLE1 | ASF_DOUBLE; + setter->set_double = &hws_action_setter_insert_ptr; + setter->idx_double = i; + setter->set_single = &hws_action_setter_common_decap; + setter->idx_single = i; + break; + + case MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2: + /* Double modify header list with remove and push inline */ + setter = hws_action_setter_find_first(last_setter, ASF_DOUBLE | ASF_REMOVE); + setter->flags |= ASF_DOUBLE | ASF_MODIFY | ASF_INSERT; + setter->set_double = &hws_action_setter_tnl_l3_to_l2; + setter->idx_double = i; + break; + + case MLX5HWS_ACTION_TYP_TAG: + /* Single TAG action, search for any room from the start */ + setter = hws_action_setter_find_first(start_setter, ASF_SINGLE1); + setter->flags |= ASF_SINGLE1; + setter->set_single = &hws_action_setter_tag; + setter->idx_single = i; + break; + + case MLX5HWS_ACTION_TYP_CTR: + /* Control counter action + * TODO: Current counter executed first. Support is needed + * for single ation counter action which is done last. + * Example: Decap + CTR + */ + setter = hws_action_setter_find_first(start_setter, ASF_CTR); + setter->flags |= ASF_CTR; + setter->set_ctr = &hws_action_setter_ctrl_ctr; + setter->idx_ctr = i; + break; + default: + pr_warn("HWS: Invalid action type in processingaction template: action_type[%d]=%d\n", + i, action_type[i]); + return -EOPNOTSUPP; + } + + last_setter = max(setter, last_setter); + } + + /* Set default hit on the last STE if no hit action provided */ + if (!(last_setter->flags & ASF_HIT)) + last_setter->set_hit = &hws_action_setter_default_hit; + + at->num_of_action_stes = last_setter - start_setter + 1; + + /* Check if action template doesn't require any action DWs */ + at->only_term = (at->num_of_action_stes == 1) && + !(last_setter->flags & ~(ASF_CTR | ASF_HIT)); + + return 0; +} + +struct mlx5hws_action_template * +mlx5hws_action_template_create(enum mlx5hws_action_type action_type[]) +{ + struct mlx5hws_action_template *at; + u8 num_actions = 0; + int i; + + at = kzalloc(sizeof(*at), GFP_KERNEL); + if (!at) + return NULL; + + while (action_type[num_actions++] != MLX5HWS_ACTION_TYP_LAST) + ; + + at->num_actions = num_actions - 1; + at->action_type_arr = kcalloc(num_actions, sizeof(*action_type), GFP_KERNEL); + if (!at->action_type_arr) + goto free_at; + + for (i = 0; i < num_actions; i++) + at->action_type_arr[i] = action_type[i]; + + return at; + +free_at: + kfree(at); + return NULL; +} + +int mlx5hws_action_template_destroy(struct mlx5hws_action_template *at) +{ + kfree(at->action_type_arr); + kfree(at); + return 0; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h new file mode 100644 index 0000000000000..bf5c1b2410060 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h @@ -0,0 +1,307 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_ACTION_H_ +#define MLX5HWS_ACTION_H_ + +/* Max number of STEs needed for a rule (including match) */ +#define MLX5HWS_ACTION_MAX_STE 20 + +/* Max number of internal subactions of ipv6_ext */ +#define MLX5HWS_ACTION_IPV6_EXT_MAX_SA 4 + +enum mlx5hws_action_stc_idx { + MLX5HWS_ACTION_STC_IDX_CTRL = 0, + MLX5HWS_ACTION_STC_IDX_HIT = 1, + MLX5HWS_ACTION_STC_IDX_DW5 = 2, + MLX5HWS_ACTION_STC_IDX_DW6 = 3, + MLX5HWS_ACTION_STC_IDX_DW7 = 4, + MLX5HWS_ACTION_STC_IDX_MAX = 5, + /* STC Jumvo STE combo: CTR, Hit */ + MLX5HWS_ACTION_STC_IDX_LAST_JUMBO_STE = 1, + /* STC combo1: CTR, SINGLE, DOUBLE, Hit */ + MLX5HWS_ACTION_STC_IDX_LAST_COMBO1 = 3, + /* STC combo2: CTR, 3 x SINGLE, Hit */ + MLX5HWS_ACTION_STC_IDX_LAST_COMBO2 = 4, + /* STC combo2: CTR, TRIPLE, Hit */ + MLX5HWS_ACTION_STC_IDX_LAST_COMBO3 = 2, +}; + +enum mlx5hws_action_offset { + MLX5HWS_ACTION_OFFSET_DW0 = 0, + MLX5HWS_ACTION_OFFSET_DW5 = 5, + MLX5HWS_ACTION_OFFSET_DW6 = 6, + MLX5HWS_ACTION_OFFSET_DW7 = 7, + MLX5HWS_ACTION_OFFSET_HIT = 3, + MLX5HWS_ACTION_OFFSET_HIT_LSB = 4, +}; + +enum { + MLX5HWS_ACTION_DOUBLE_SIZE = 8, + MLX5HWS_ACTION_INLINE_DATA_SIZE = 4, + MLX5HWS_ACTION_HDR_LEN_L2_MACS = 12, + MLX5HWS_ACTION_HDR_LEN_L2_VLAN = 4, + MLX5HWS_ACTION_HDR_LEN_L2_ETHER = 2, + MLX5HWS_ACTION_HDR_LEN_L2 = (MLX5HWS_ACTION_HDR_LEN_L2_MACS + + MLX5HWS_ACTION_HDR_LEN_L2_ETHER), + MLX5HWS_ACTION_HDR_LEN_L2_W_VLAN = (MLX5HWS_ACTION_HDR_LEN_L2 + + MLX5HWS_ACTION_HDR_LEN_L2_VLAN), + MLX5HWS_ACTION_REFORMAT_DATA_SIZE = 64, + DECAP_L3_NUM_ACTIONS_W_NO_VLAN = 6, + DECAP_L3_NUM_ACTIONS_W_VLAN = 7, +}; + +enum mlx5hws_action_setter_flag { + ASF_SINGLE1 = 1 << 0, + ASF_SINGLE2 = 1 << 1, + ASF_SINGLE3 = 1 << 2, + ASF_DOUBLE = ASF_SINGLE2 | ASF_SINGLE3, + ASF_TRIPLE = ASF_SINGLE1 | ASF_DOUBLE, + ASF_INSERT = 1 << 3, + ASF_REMOVE = 1 << 4, + ASF_MODIFY = 1 << 5, + ASF_CTR = 1 << 6, + ASF_HIT = 1 << 7, +}; + +struct mlx5hws_action_default_stc { + struct mlx5hws_pool_chunk nop_ctr; + struct mlx5hws_pool_chunk nop_dw5; + struct mlx5hws_pool_chunk nop_dw6; + struct mlx5hws_pool_chunk nop_dw7; + struct mlx5hws_pool_chunk default_hit; + u32 refcount; +}; + +struct mlx5hws_action_shared_stc { + struct mlx5hws_pool_chunk stc_chunk; + u32 refcount; +}; + +struct mlx5hws_actions_apply_data { + struct mlx5hws_send_engine *queue; + struct mlx5hws_rule_action *rule_action; + __be32 *wqe_data; + struct mlx5hws_wqe_gta_ctrl_seg *wqe_ctrl; + u32 jump_to_action_stc; + struct mlx5hws_context_common_res *common_res; + enum mlx5hws_table_type tbl_type; + u32 next_direct_idx; + u8 require_dep; +}; + +struct mlx5hws_actions_wqe_setter; + +typedef void (*mlx5hws_action_setter_fp)(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter); + +struct mlx5hws_actions_wqe_setter { + mlx5hws_action_setter_fp set_single; + mlx5hws_action_setter_fp set_double; + mlx5hws_action_setter_fp set_triple; + mlx5hws_action_setter_fp set_hit; + mlx5hws_action_setter_fp set_ctr; + u8 idx_single; + u8 idx_double; + u8 idx_triple; + u8 idx_ctr; + u8 idx_hit; + u8 stage_idx; + u8 flags; +}; + +struct mlx5hws_action_template { + struct mlx5hws_actions_wqe_setter setters[MLX5HWS_ACTION_MAX_STE]; + enum mlx5hws_action_type *action_type_arr; + u8 num_of_action_stes; + u8 num_actions; + u8 only_term; +}; + +struct mlx5hws_action { + u8 type; + u8 flags; + struct mlx5hws_context *ctx; + union { + struct { + struct mlx5hws_pool_chunk stc[MLX5HWS_TABLE_TYPE_MAX]; + union { + struct { + u32 pat_id; + u32 arg_id; + __be64 single_action; + u32 nope_locations; + u8 num_of_patterns; + u8 single_action_type; + u8 num_of_actions; + u8 max_num_of_actions; + u8 require_reparse; + } modify_header; + struct { + u32 arg_id; + u32 header_size; + u16 max_hdr_sz; + u8 num_of_hdrs; + u8 anchor; + u8 e_anchor; + u8 offset; + bool encap; + u8 require_reparse; + } reformat; + struct { + u32 obj_id; + u8 return_reg_id; + } aso; + struct { + u16 vport_num; + u16 esw_owner_vhca_id; + bool esw_owner_vhca_id_valid; + } vport; + struct { + u32 obj_id; + } dest_obj; + struct { + struct mlx5hws_cmd_forward_tbl *fw_island; + size_t num_dest; + struct mlx5hws_cmd_set_fte_dest *dest_list; + } dest_array; + struct { + u8 type; + u8 start_anchor; + u8 end_anchor; + u8 num_of_words; + bool decap; + } insert_hdr; + struct { + /* PRM start anchor from which header will be removed */ + u8 anchor; + /* Header remove offset in bytes, from the start + * anchor to the location where remove header starts. + */ + u8 offset; + /* Indicates the removed header size in bytes */ + size_t size; + } remove_header; + struct { + struct mlx5hws_matcher_action_ste *table_ste; + struct mlx5hws_action *hit_ft_action; + struct mlx5hws_definer *definer; + } range; + }; + }; + + struct ibv_flow_action *flow_action; + u32 obj_id; + struct ibv_qp *qp; + }; +}; + +const char *mlx5hws_action_type_to_str(enum mlx5hws_action_type action_type); + +int mlx5hws_action_get_default_stc(struct mlx5hws_context *ctx, + u8 tbl_type); + +void mlx5hws_action_put_default_stc(struct mlx5hws_context *ctx, + u8 tbl_type); + +void mlx5hws_action_prepare_decap_l3_data(u8 *src, u8 *dst, + u16 num_of_actions); + +int mlx5hws_action_template_process(struct mlx5hws_action_template *at); + +bool mlx5hws_action_check_combo(struct mlx5hws_context *ctx, + enum mlx5hws_action_type *user_actions, + enum mlx5hws_table_type table_type); + +int mlx5hws_action_alloc_single_stc(struct mlx5hws_context *ctx, + struct mlx5hws_cmd_stc_modify_attr *stc_attr, + u32 table_type, + struct mlx5hws_pool_chunk *stc); + +void mlx5hws_action_free_single_stc(struct mlx5hws_context *ctx, + u32 table_type, + struct mlx5hws_pool_chunk *stc); + +static inline void +mlx5hws_action_setter_default_single(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW5] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW5] = + htonl(apply->common_res->default_stc->nop_dw5.offset); +} + +static inline void +mlx5hws_action_setter_default_double(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = 0; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW6] = + htonl(apply->common_res->default_stc->nop_dw6.offset); + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = + htonl(apply->common_res->default_stc->nop_dw7.offset); +} + +static inline void +mlx5hws_action_setter_default_ctr(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter) +{ + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW0] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_CTRL] = + htonl(apply->common_res->default_stc->nop_ctr.offset); +} + +static inline void +mlx5hws_action_apply_setter(struct mlx5hws_actions_apply_data *apply, + struct mlx5hws_actions_wqe_setter *setter, + bool is_jumbo) +{ + u8 num_of_actions; + + /* Set control counter */ + if (setter->set_ctr) + setter->set_ctr(apply, setter); + else + mlx5hws_action_setter_default_ctr(apply, setter); + + if (!is_jumbo) { + if (unlikely(setter->set_triple)) { + /* Set triple on match */ + setter->set_triple(apply, setter); + num_of_actions = MLX5HWS_ACTION_STC_IDX_LAST_COMBO3; + } else { + /* Set single and double on match */ + if (setter->set_single) + setter->set_single(apply, setter); + else + mlx5hws_action_setter_default_single(apply, setter); + + if (setter->set_double) + setter->set_double(apply, setter); + else + mlx5hws_action_setter_default_double(apply, setter); + + num_of_actions = setter->set_double ? + MLX5HWS_ACTION_STC_IDX_LAST_COMBO1 : + MLX5HWS_ACTION_STC_IDX_LAST_COMBO2; + } + } else { + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW5] = 0; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW6] = 0; + apply->wqe_data[MLX5HWS_ACTION_OFFSET_DW7] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW5] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW6] = 0; + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_DW7] = 0; + num_of_actions = MLX5HWS_ACTION_STC_IDX_LAST_JUMBO_STE; + } + + /* Set next/final hit action */ + setter->set_hit(apply, setter); + + /* Set number of actions */ + apply->wqe_ctrl->stc_ix[MLX5HWS_ACTION_STC_IDX_CTRL] |= + htonl(num_of_actions << 29); +} + +#endif /* MLX5HWS_ACTION_H_ */ From 042d7f6958528ed22c844ec1c1f5a7ec38de9faf Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:32:28 +0300 Subject: [PATCH 090/548] net/mlx5: HWS, added tables handling Flow tables are SW objects that are comprised of list of matchers, that in turn define the properties of a flow to match on and set of actions to perform on the flows in case of match hit or miss. Reviewed-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 71a1372b8275dd1c57db045d4a710a5b8f124f52) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_table.c | 493 ++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_table.h | 68 +++ 2 files changed, 561 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c new file mode 100644 index 0000000000000..9dbc3e9da5ea9 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c @@ -0,0 +1,493 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +u32 mlx5hws_table_get_id(struct mlx5hws_table *tbl) +{ + return tbl->ft_id; +} + +static void hws_table_init_next_ft_attr(struct mlx5hws_table *tbl, + struct mlx5hws_cmd_ft_create_attr *ft_attr) +{ + ft_attr->type = tbl->fw_ft_type; + if (tbl->type == MLX5HWS_TABLE_TYPE_FDB) + ft_attr->level = tbl->ctx->caps->fdb_ft.max_level - 1; + else + ft_attr->level = tbl->ctx->caps->nic_ft.max_level - 1; + ft_attr->rtc_valid = true; +} + +static void hws_table_set_cap_attr(struct mlx5hws_table *tbl, + struct mlx5hws_cmd_ft_create_attr *ft_attr) +{ + /* Enabling reformat_en or decap_en for the first flow table + * must be done when all VFs are down. + * However, HWS doesn't know when it is required to create the first FT. + * On the other hand, HWS doesn't use all these FT capabilities at all + * (the API doesn't even provide a way to specify these flags), so we'll + * just set these caps on all the flow tables. + * If HCA_CAP.fdb_dynamic_tunnel is set, this constraint is N/A. + */ + if (!MLX5_CAP_ESW_FLOWTABLE(tbl->ctx->mdev, fdb_dynamic_tunnel)) { + ft_attr->reformat_en = true; + ft_attr->decap_en = true; + } +} + +static int hws_table_up_default_fdb_miss_tbl(struct mlx5hws_table *tbl) +{ + struct mlx5hws_cmd_ft_create_attr ft_attr = {0}; + struct mlx5hws_cmd_set_fte_attr fte_attr = {0}; + struct mlx5hws_cmd_forward_tbl *default_miss; + struct mlx5hws_cmd_set_fte_dest dest = {0}; + struct mlx5hws_context *ctx = tbl->ctx; + u8 tbl_type = tbl->type; + + if (tbl->type != MLX5HWS_TABLE_TYPE_FDB) + return 0; + + if (ctx->common_res[tbl_type].default_miss) { + ctx->common_res[tbl_type].default_miss->refcount++; + return 0; + } + + ft_attr.type = tbl->fw_ft_type; + ft_attr.level = tbl->ctx->caps->fdb_ft.max_level; /* The last level */ + ft_attr.rtc_valid = false; + + dest.destination_type = MLX5_FLOW_DESTINATION_TYPE_VPORT; + dest.destination_id = ctx->caps->eswitch_manager_vport_number; + + fte_attr.action_flags = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + fte_attr.dests_num = 1; + fte_attr.dests = &dest; + + default_miss = mlx5hws_cmd_forward_tbl_create(ctx->mdev, &ft_attr, &fte_attr); + if (!default_miss) { + mlx5hws_err(ctx, "Failed to default miss table type: 0x%x\n", tbl_type); + return -EINVAL; + } + + /* ctx->ctrl_lock must be held here */ + ctx->common_res[tbl_type].default_miss = default_miss; + ctx->common_res[tbl_type].default_miss->refcount++; + + return 0; +} + +/* Called under ctx->ctrl_lock */ +static void hws_table_down_default_fdb_miss_tbl(struct mlx5hws_table *tbl) +{ + struct mlx5hws_cmd_forward_tbl *default_miss; + struct mlx5hws_context *ctx = tbl->ctx; + u8 tbl_type = tbl->type; + + if (tbl->type != MLX5HWS_TABLE_TYPE_FDB) + return; + + default_miss = ctx->common_res[tbl_type].default_miss; + if (--default_miss->refcount) + return; + + mlx5hws_cmd_forward_tbl_destroy(ctx->mdev, default_miss); + ctx->common_res[tbl_type].default_miss = NULL; +} + +static int hws_table_connect_to_default_miss_tbl(struct mlx5hws_table *tbl, u32 ft_id) +{ + struct mlx5hws_cmd_ft_modify_attr ft_attr = {0}; + int ret; + + if (unlikely(tbl->type != MLX5HWS_TABLE_TYPE_FDB)) + pr_warn("HWS: invalid table type %d\n", tbl->type); + + mlx5hws_cmd_set_attr_connect_miss_tbl(tbl->ctx, + tbl->fw_ft_type, + tbl->type, + &ft_attr); + + ret = mlx5hws_cmd_flow_table_modify(tbl->ctx->mdev, &ft_attr, ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to connect FT to default FDB FT\n"); + return ret; + } + + return 0; +} + +int mlx5hws_table_create_default_ft(struct mlx5_core_dev *mdev, + struct mlx5hws_table *tbl, + u32 *ft_id) +{ + struct mlx5hws_cmd_ft_create_attr ft_attr = {0}; + int ret; + + hws_table_init_next_ft_attr(tbl, &ft_attr); + hws_table_set_cap_attr(tbl, &ft_attr); + + ret = mlx5hws_cmd_flow_table_create(mdev, &ft_attr, ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed creating default ft\n"); + return ret; + } + + if (tbl->type == MLX5HWS_TABLE_TYPE_FDB) { + /* Take/create ref over the default miss */ + ret = hws_table_up_default_fdb_miss_tbl(tbl); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to get default fdb miss\n"); + goto free_ft_obj; + } + ret = hws_table_connect_to_default_miss_tbl(tbl, *ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed connecting to default miss tbl\n"); + goto down_miss_tbl; + } + } + + return 0; + +down_miss_tbl: + hws_table_down_default_fdb_miss_tbl(tbl); +free_ft_obj: + mlx5hws_cmd_flow_table_destroy(mdev, ft_attr.type, *ft_id); + return ret; +} + +void mlx5hws_table_destroy_default_ft(struct mlx5hws_table *tbl, + u32 ft_id) +{ + mlx5hws_cmd_flow_table_destroy(tbl->ctx->mdev, tbl->fw_ft_type, ft_id); + hws_table_down_default_fdb_miss_tbl(tbl); +} + +static int hws_table_init_check_hws_support(struct mlx5hws_context *ctx, + struct mlx5hws_table *tbl) +{ + if (!(ctx->flags & MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT)) { + mlx5hws_err(ctx, "HWS not supported, cannot create mlx5hws_table\n"); + return -EOPNOTSUPP; + } + + return 0; +} + +static int hws_table_init(struct mlx5hws_table *tbl) +{ + struct mlx5hws_context *ctx = tbl->ctx; + int ret; + + ret = hws_table_init_check_hws_support(ctx, tbl); + if (ret) + return ret; + + if (mlx5hws_table_get_fw_ft_type(tbl->type, (u8 *)&tbl->fw_ft_type)) { + pr_warn("HWS: invalid table type %d\n", tbl->type); + return -EOPNOTSUPP; + } + + mutex_lock(&ctx->ctrl_lock); + ret = mlx5hws_table_create_default_ft(tbl->ctx->mdev, tbl, &tbl->ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to create flow table object\n"); + mutex_unlock(&ctx->ctrl_lock); + return ret; + } + + ret = mlx5hws_action_get_default_stc(ctx, tbl->type); + if (ret) + goto tbl_destroy; + + INIT_LIST_HEAD(&tbl->matchers_list); + INIT_LIST_HEAD(&tbl->default_miss.head); + + mutex_unlock(&ctx->ctrl_lock); + + return 0; + +tbl_destroy: + mlx5hws_table_destroy_default_ft(tbl, tbl->ft_id); + mutex_unlock(&ctx->ctrl_lock); + return ret; +} + +static void hws_table_uninit(struct mlx5hws_table *tbl) +{ + mutex_lock(&tbl->ctx->ctrl_lock); + mlx5hws_action_put_default_stc(tbl->ctx, tbl->type); + mlx5hws_table_destroy_default_ft(tbl, tbl->ft_id); + mutex_unlock(&tbl->ctx->ctrl_lock); +} + +struct mlx5hws_table *mlx5hws_table_create(struct mlx5hws_context *ctx, + struct mlx5hws_table_attr *attr) +{ + struct mlx5hws_table *tbl; + int ret; + + if (attr->type > MLX5HWS_TABLE_TYPE_FDB) { + mlx5hws_err(ctx, "Invalid table type %d\n", attr->type); + return NULL; + } + + tbl = kzalloc(sizeof(*tbl), GFP_KERNEL); + if (!tbl) + return NULL; + + tbl->ctx = ctx; + tbl->type = attr->type; + tbl->level = attr->level; + + ret = hws_table_init(tbl); + if (ret) { + mlx5hws_err(ctx, "Failed to initialise table\n"); + goto free_tbl; + } + + mutex_lock(&ctx->ctrl_lock); + list_add(&tbl->tbl_list_node, &ctx->tbl_list); + mutex_unlock(&ctx->ctrl_lock); + + return tbl; + +free_tbl: + kfree(tbl); + return NULL; +} + +int mlx5hws_table_destroy(struct mlx5hws_table *tbl) +{ + struct mlx5hws_context *ctx = tbl->ctx; + int ret; + + mutex_lock(&ctx->ctrl_lock); + if (!list_empty(&tbl->matchers_list)) { + mlx5hws_err(tbl->ctx, "Cannot destroy table containing matchers\n"); + ret = -EBUSY; + goto unlock_err; + } + + if (!list_empty(&tbl->default_miss.head)) { + mlx5hws_err(tbl->ctx, "Cannot destroy table pointed by default miss\n"); + ret = -EBUSY; + goto unlock_err; + } + + list_del_init(&tbl->tbl_list_node); + mutex_unlock(&ctx->ctrl_lock); + + hws_table_uninit(tbl); + kfree(tbl); + + return 0; + +unlock_err: + mutex_unlock(&ctx->ctrl_lock); + return ret; +} + +static u32 hws_table_get_last_ft(struct mlx5hws_table *tbl) +{ + struct mlx5hws_matcher *matcher; + + if (list_empty(&tbl->matchers_list)) + return tbl->ft_id; + + matcher = list_last_entry(&tbl->matchers_list, struct mlx5hws_matcher, list_node); + return matcher->end_ft_id; +} + +int mlx5hws_table_ft_set_default_next_ft(struct mlx5hws_table *tbl, u32 ft_id) +{ + struct mlx5hws_cmd_ft_modify_attr ft_attr = {0}; + int ret; + + /* Due to FW limitation, resetting the flow table to default action will + * disconnect RTC when ignore_flow_level_rtc_valid is not supported. + */ + if (!tbl->ctx->caps->nic_ft.ignore_flow_level_rtc_valid) + return 0; + + if (tbl->type == MLX5HWS_TABLE_TYPE_FDB) + return hws_table_connect_to_default_miss_tbl(tbl, ft_id); + + ft_attr.type = tbl->fw_ft_type; + ft_attr.modify_fs = MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION; + ft_attr.table_miss_action = MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION_DEFAULT; + + ret = mlx5hws_cmd_flow_table_modify(tbl->ctx->mdev, &ft_attr, ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to set FT default miss action\n"); + return ret; + } + + return 0; +} + +int mlx5hws_table_ft_set_next_rtc(struct mlx5hws_context *ctx, + u32 ft_id, + u32 fw_ft_type, + u32 rtc_0_id, + u32 rtc_1_id) +{ + struct mlx5hws_cmd_ft_modify_attr ft_attr = {0}; + + ft_attr.modify_fs = MLX5_IFC_MODIFY_FLOW_TABLE_RTC_ID; + ft_attr.type = fw_ft_type; + ft_attr.rtc_id_0 = rtc_0_id; + ft_attr.rtc_id_1 = rtc_1_id; + + return mlx5hws_cmd_flow_table_modify(ctx->mdev, &ft_attr, ft_id); +} + +static int hws_table_ft_set_next_ft(struct mlx5hws_context *ctx, + u32 ft_id, + u32 fw_ft_type, + u32 next_ft_id) +{ + struct mlx5hws_cmd_ft_modify_attr ft_attr = {0}; + + ft_attr.modify_fs = MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION; + ft_attr.table_miss_action = MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION_GOTO_TBL; + ft_attr.type = fw_ft_type; + ft_attr.table_miss_id = next_ft_id; + + return mlx5hws_cmd_flow_table_modify(ctx->mdev, &ft_attr, ft_id); +} + +int mlx5hws_table_update_connected_miss_tables(struct mlx5hws_table *dst_tbl) +{ + struct mlx5hws_table *src_tbl; + int ret; + + if (list_empty(&dst_tbl->default_miss.head)) + return 0; + + list_for_each_entry(src_tbl, &dst_tbl->default_miss.head, default_miss.next) { + ret = mlx5hws_table_connect_to_miss_table(src_tbl, dst_tbl); + if (ret) { + mlx5hws_err(dst_tbl->ctx, + "Failed to update source miss table, unexpected behavior\n"); + return ret; + } + } + + return 0; +} + +int mlx5hws_table_connect_to_miss_table(struct mlx5hws_table *src_tbl, + struct mlx5hws_table *dst_tbl) +{ + struct mlx5hws_matcher *matcher; + u32 last_ft_id; + int ret; + + last_ft_id = hws_table_get_last_ft(src_tbl); + + if (dst_tbl) { + if (list_empty(&dst_tbl->matchers_list)) { + /* Connect src_tbl last_ft to dst_tbl start anchor */ + ret = hws_table_ft_set_next_ft(src_tbl->ctx, + last_ft_id, + src_tbl->fw_ft_type, + dst_tbl->ft_id); + if (ret) + return ret; + + /* Reset last_ft RTC to default RTC */ + ret = mlx5hws_table_ft_set_next_rtc(src_tbl->ctx, + last_ft_id, + src_tbl->fw_ft_type, + 0, 0); + if (ret) + return ret; + } else { + /* Connect src_tbl last_ft to first matcher RTC */ + matcher = list_first_entry(&dst_tbl->matchers_list, + struct mlx5hws_matcher, + list_node); + ret = mlx5hws_table_ft_set_next_rtc(src_tbl->ctx, + last_ft_id, + src_tbl->fw_ft_type, + matcher->match_ste.rtc_0_id, + matcher->match_ste.rtc_1_id); + if (ret) + return ret; + + /* Reset next miss FT to default */ + ret = mlx5hws_table_ft_set_default_next_ft(src_tbl, last_ft_id); + if (ret) + return ret; + } + } else { + /* Reset next miss FT to default */ + ret = mlx5hws_table_ft_set_default_next_ft(src_tbl, last_ft_id); + if (ret) + return ret; + + /* Reset last_ft RTC to default RTC */ + ret = mlx5hws_table_ft_set_next_rtc(src_tbl->ctx, + last_ft_id, + src_tbl->fw_ft_type, + 0, 0); + if (ret) + return ret; + } + + src_tbl->default_miss.miss_tbl = dst_tbl; + + return 0; +} + +static int hws_table_set_default_miss_not_valid(struct mlx5hws_table *tbl, + struct mlx5hws_table *miss_tbl) +{ + if (!tbl->ctx->caps->nic_ft.ignore_flow_level_rtc_valid) { + mlx5hws_err(tbl->ctx, "Default miss table is not supported\n"); + return -EOPNOTSUPP; + } + + if ((miss_tbl && miss_tbl->type != tbl->type)) { + mlx5hws_err(tbl->ctx, "Invalid arguments\n"); + return -EINVAL; + } + + return 0; +} + +int mlx5hws_table_set_default_miss(struct mlx5hws_table *tbl, + struct mlx5hws_table *miss_tbl) +{ + struct mlx5hws_context *ctx = tbl->ctx; + struct mlx5hws_table *old_miss_tbl; + int ret; + + ret = hws_table_set_default_miss_not_valid(tbl, miss_tbl); + if (ret) + return ret; + + mutex_lock(&ctx->ctrl_lock); + + old_miss_tbl = tbl->default_miss.miss_tbl; + ret = mlx5hws_table_connect_to_miss_table(tbl, miss_tbl); + if (ret) + goto out; + + if (old_miss_tbl) + list_del_init(&tbl->default_miss.next); + + old_miss_tbl = tbl->default_miss.miss_tbl; + if (old_miss_tbl) + list_del_init(&old_miss_tbl->default_miss.head); + + if (miss_tbl) + list_add(&tbl->default_miss.next, &miss_tbl->default_miss.head); + + mutex_unlock(&ctx->ctrl_lock); + return 0; +out: + mutex_unlock(&ctx->ctrl_lock); + return -ret; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.h new file mode 100644 index 0000000000000..dd50420eec9ea --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_TABLE_H_ +#define MLX5HWS_TABLE_H_ + +struct mlx5hws_default_miss { + /* My miss table */ + struct mlx5hws_table *miss_tbl; + struct list_head next; + /* Tables missing to my table */ + struct list_head head; +}; + +struct mlx5hws_table { + struct mlx5hws_context *ctx; + u32 ft_id; + enum mlx5hws_table_type type; + u32 fw_ft_type; + u32 level; + struct list_head matchers_list; + struct list_head tbl_list_node; + struct mlx5hws_default_miss default_miss; +}; + +static inline +u32 mlx5hws_table_get_fw_ft_type(enum mlx5hws_table_type type, + u8 *ret_type) +{ + if (type != MLX5HWS_TABLE_TYPE_FDB) + return -EOPNOTSUPP; + + *ret_type = FS_FT_FDB; + + return 0; +} + +static inline +u32 mlx5hws_table_get_res_fw_ft_type(enum mlx5hws_table_type tbl_type, + bool is_mirror) +{ + if (tbl_type == MLX5HWS_TABLE_TYPE_FDB) + return is_mirror ? FS_FT_FDB_TX : FS_FT_FDB_RX; + + return 0; +} + +int mlx5hws_table_create_default_ft(struct mlx5_core_dev *mdev, + struct mlx5hws_table *tbl, + u32 *ft_id); + +void mlx5hws_table_destroy_default_ft(struct mlx5hws_table *tbl, + u32 ft_id); + +int mlx5hws_table_connect_to_miss_table(struct mlx5hws_table *src_tbl, + struct mlx5hws_table *dst_tbl); + +int mlx5hws_table_update_connected_miss_tables(struct mlx5hws_table *dst_tbl); + +int mlx5hws_table_ft_set_default_next_ft(struct mlx5hws_table *tbl, u32 ft_id); + +int mlx5hws_table_ft_set_next_rtc(struct mlx5hws_context *ctx, + u32 ft_id, + u32 fw_ft_type, + u32 rtc_0_id, + u32 rtc_1_id); + +#endif /* MLX5HWS_TABLE_H_ */ From f3ebddac243ad08088e81a0d9ff9767f5d24efed Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:33:11 +0300 Subject: [PATCH 091/548] net/mlx5: HWS, added rules handling Steering rule is a concept that includes match parameters for a flow, and actions to perform on the flows that match these parameters. This patch adds rules handling part of HW Steering. Reviewed-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 49674803542c162a465819c1a5ca66b494ecf367) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_rule.c | 780 ++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_rule.h | 84 ++ 2 files changed, 864 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c new file mode 100644 index 0000000000000..c79ee70edf03e --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c @@ -0,0 +1,780 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +static void hws_rule_skip(struct mlx5hws_matcher *matcher, + struct mlx5hws_match_template *mt, + u32 flow_source, + bool *skip_rx, bool *skip_tx) +{ + /* By default FDB rules are added to both RX and TX */ + *skip_rx = false; + *skip_tx = false; + + if (flow_source == MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT) { + *skip_rx = true; + } else if (flow_source == MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK) { + *skip_tx = true; + } else { + /* If no flow source was set for current rule, + * check for flow source in matcher attributes. + */ + if (matcher->attr.optimize_flow_src) { + *skip_tx = + matcher->attr.optimize_flow_src == MLX5HWS_MATCHER_FLOW_SRC_WIRE; + *skip_rx = + matcher->attr.optimize_flow_src == MLX5HWS_MATCHER_FLOW_SRC_VPORT; + return; + } + } +} + +static void +hws_rule_update_copy_tag(struct mlx5hws_rule *rule, + struct mlx5hws_wqe_gta_data_seg_ste *wqe_data, + bool is_jumbo) +{ + struct mlx5hws_rule_match_tag *tag; + + if (!mlx5hws_matcher_is_resizable(rule->matcher)) { + tag = &rule->tag; + } else { + struct mlx5hws_wqe_gta_data_seg_ste *data_seg = + (struct mlx5hws_wqe_gta_data_seg_ste *)(void *)rule->resize_info->data_seg; + tag = (struct mlx5hws_rule_match_tag *)(void *)data_seg->action; + } + + if (is_jumbo) + memcpy(wqe_data->jumbo, tag->jumbo, MLX5HWS_JUMBO_TAG_SZ); + else + memcpy(wqe_data->tag, tag->match, MLX5HWS_MATCH_TAG_SZ); +} + +static void hws_rule_init_dep_wqe(struct mlx5hws_send_ring_dep_wqe *dep_wqe, + struct mlx5hws_rule *rule, + struct mlx5hws_match_template *mt, + struct mlx5hws_rule_attr *attr) +{ + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_table *tbl = matcher->tbl; + bool skip_rx, skip_tx; + + dep_wqe->rule = rule; + dep_wqe->user_data = attr->user_data; + dep_wqe->direct_index = mlx5hws_matcher_is_insert_by_idx(matcher) ? + attr->rule_idx : 0; + + if (tbl->type == MLX5HWS_TABLE_TYPE_FDB) { + hws_rule_skip(matcher, mt, attr->flow_source, &skip_rx, &skip_tx); + + if (!skip_rx) { + dep_wqe->rtc_0 = matcher->match_ste.rtc_0_id; + dep_wqe->retry_rtc_0 = matcher->col_matcher ? + matcher->col_matcher->match_ste.rtc_0_id : 0; + } else { + dep_wqe->rtc_0 = 0; + dep_wqe->retry_rtc_0 = 0; + } + + if (!skip_tx) { + dep_wqe->rtc_1 = matcher->match_ste.rtc_1_id; + dep_wqe->retry_rtc_1 = matcher->col_matcher ? + matcher->col_matcher->match_ste.rtc_1_id : 0; + } else { + dep_wqe->rtc_1 = 0; + dep_wqe->retry_rtc_1 = 0; + } + } else { + pr_warn("HWS: invalid tbl->type: %d\n", tbl->type); + } +} + +static void hws_rule_move_get_rtc(struct mlx5hws_rule *rule, + struct mlx5hws_send_ste_attr *ste_attr) +{ + struct mlx5hws_matcher *dst_matcher = rule->matcher->resize_dst; + + if (rule->resize_info->rtc_0) { + ste_attr->rtc_0 = dst_matcher->match_ste.rtc_0_id; + ste_attr->retry_rtc_0 = dst_matcher->col_matcher ? + dst_matcher->col_matcher->match_ste.rtc_0_id : 0; + } + if (rule->resize_info->rtc_1) { + ste_attr->rtc_1 = dst_matcher->match_ste.rtc_1_id; + ste_attr->retry_rtc_1 = dst_matcher->col_matcher ? + dst_matcher->col_matcher->match_ste.rtc_1_id : 0; + } +} + +static void hws_rule_gen_comp(struct mlx5hws_send_engine *queue, + struct mlx5hws_rule *rule, + bool err, + void *user_data, + enum mlx5hws_rule_status rule_status_on_succ) +{ + enum mlx5hws_flow_op_status comp_status; + + if (!err) { + comp_status = MLX5HWS_FLOW_OP_SUCCESS; + rule->status = rule_status_on_succ; + } else { + comp_status = MLX5HWS_FLOW_OP_ERROR; + rule->status = MLX5HWS_RULE_STATUS_FAILED; + } + + mlx5hws_send_engine_inc_rule(queue); + mlx5hws_send_engine_gen_comp(queue, user_data, comp_status); +} + +static void +hws_rule_save_resize_info(struct mlx5hws_rule *rule, + struct mlx5hws_send_ste_attr *ste_attr, + bool is_update) +{ + if (!mlx5hws_matcher_is_resizable(rule->matcher)) + return; + + if (likely(!is_update)) { + rule->resize_info = kzalloc(sizeof(*rule->resize_info), GFP_KERNEL); + if (unlikely(!rule->resize_info)) { + pr_warn("HWS: resize info isn't allocated for rule\n"); + return; + } + + rule->resize_info->max_stes = + rule->matcher->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].max_stes; + rule->resize_info->action_ste_pool[0] = rule->matcher->action_ste[0].max_stes ? + rule->matcher->action_ste[0].pool : + NULL; + rule->resize_info->action_ste_pool[1] = rule->matcher->action_ste[1].max_stes ? + rule->matcher->action_ste[1].pool : + NULL; + } + + memcpy(rule->resize_info->ctrl_seg, ste_attr->wqe_ctrl, + sizeof(rule->resize_info->ctrl_seg)); + memcpy(rule->resize_info->data_seg, ste_attr->wqe_data, + sizeof(rule->resize_info->data_seg)); +} + +void mlx5hws_rule_clear_resize_info(struct mlx5hws_rule *rule) +{ + if (mlx5hws_matcher_is_resizable(rule->matcher) && + rule->resize_info) { + kfree(rule->resize_info); + rule->resize_info = NULL; + } +} + +static void +hws_rule_save_delete_info(struct mlx5hws_rule *rule, + struct mlx5hws_send_ste_attr *ste_attr) +{ + struct mlx5hws_match_template *mt = rule->matcher->mt; + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(mt); + + if (mlx5hws_matcher_is_resizable(rule->matcher)) + return; + + if (is_jumbo) + memcpy(&rule->tag.jumbo, ste_attr->wqe_data->jumbo, MLX5HWS_JUMBO_TAG_SZ); + else + memcpy(&rule->tag.match, ste_attr->wqe_data->tag, MLX5HWS_MATCH_TAG_SZ); +} + +static void +hws_rule_clear_delete_info(struct mlx5hws_rule *rule) +{ + /* nothing to do here */ +} + +static void +hws_rule_load_delete_info(struct mlx5hws_rule *rule, + struct mlx5hws_send_ste_attr *ste_attr) +{ + if (unlikely(!mlx5hws_matcher_is_resizable(rule->matcher))) { + ste_attr->wqe_tag = &rule->tag; + } else { + struct mlx5hws_wqe_gta_data_seg_ste *data_seg = + (struct mlx5hws_wqe_gta_data_seg_ste *)(void *)rule->resize_info->data_seg; + struct mlx5hws_rule_match_tag *tag = + (struct mlx5hws_rule_match_tag *)(void *)data_seg->action; + ste_attr->wqe_tag = tag; + } +} + +static int hws_rule_alloc_action_ste_idx(struct mlx5hws_rule *rule, + u8 action_ste_selector) +{ + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_matcher_action_ste *action_ste; + struct mlx5hws_pool_chunk ste = {0}; + int ret; + + action_ste = &matcher->action_ste[action_ste_selector]; + ste.order = ilog2(roundup_pow_of_two(action_ste->max_stes)); + ret = mlx5hws_pool_chunk_alloc(action_ste->pool, &ste); + if (unlikely(ret)) { + mlx5hws_err(matcher->tbl->ctx, + "Failed to allocate STE for rule actions"); + return ret; + } + rule->action_ste_idx = ste.offset; + + return 0; +} + +static void hws_rule_free_action_ste_idx(struct mlx5hws_rule *rule, + u8 action_ste_selector) +{ + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_pool_chunk ste = {0}; + struct mlx5hws_pool *pool; + u8 max_stes; + + if (mlx5hws_matcher_is_resizable(matcher)) { + /* Free the original action pool if rule was resized */ + max_stes = rule->resize_info->max_stes; + pool = rule->resize_info->action_ste_pool[action_ste_selector]; + } else { + max_stes = matcher->action_ste[action_ste_selector].max_stes; + pool = matcher->action_ste[action_ste_selector].pool; + } + + /* This release is safe only when the rule match part was deleted */ + ste.order = ilog2(roundup_pow_of_two(max_stes)); + ste.offset = rule->action_ste_idx; + + mlx5hws_pool_chunk_free(pool, &ste); +} + +static int hws_rule_alloc_action_ste(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + int action_ste_idx; + int ret; + + ret = hws_rule_alloc_action_ste_idx(rule, 0); + if (unlikely(ret)) + return ret; + + action_ste_idx = rule->action_ste_idx; + + ret = hws_rule_alloc_action_ste_idx(rule, 1); + if (unlikely(ret)) { + hws_rule_free_action_ste_idx(rule, 0); + return ret; + } + + /* Both pools have to return the same index */ + if (unlikely(rule->action_ste_idx != action_ste_idx)) { + pr_warn("HWS: allocation of action STE failed - pool indexes mismatch\n"); + return -EINVAL; + } + + return 0; +} + +void mlx5hws_rule_free_action_ste(struct mlx5hws_rule *rule) +{ + if (rule->action_ste_idx > -1) { + hws_rule_free_action_ste_idx(rule, 1); + hws_rule_free_action_ste_idx(rule, 0); + } +} + +static void hws_rule_create_init(struct mlx5hws_rule *rule, + struct mlx5hws_send_ste_attr *ste_attr, + struct mlx5hws_actions_apply_data *apply, + bool is_update) +{ + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_table *tbl = matcher->tbl; + struct mlx5hws_context *ctx = tbl->ctx; + + /* Init rule before reuse */ + if (!is_update) { + /* In update we use these rtc's */ + rule->rtc_0 = 0; + rule->rtc_1 = 0; + rule->action_ste_selector = 0; + } else { + rule->action_ste_selector = !rule->action_ste_selector; + } + + rule->pending_wqes = 0; + rule->action_ste_idx = -1; + rule->status = MLX5HWS_RULE_STATUS_CREATING; + + /* Init default send STE attributes */ + ste_attr->gta_opcode = MLX5HWS_WQE_GTA_OP_ACTIVATE; + ste_attr->send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + ste_attr->send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + ste_attr->send_attr.len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + + /* Init default action apply */ + apply->tbl_type = tbl->type; + apply->common_res = &ctx->common_res[tbl->type]; + apply->jump_to_action_stc = matcher->action_ste[0].stc.offset; + apply->require_dep = 0; +} + +static void hws_rule_move_init(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + /* Save the old RTC IDs to be later used in match STE delete */ + rule->resize_info->rtc_0 = rule->rtc_0; + rule->resize_info->rtc_1 = rule->rtc_1; + rule->resize_info->rule_idx = attr->rule_idx; + + rule->rtc_0 = 0; + rule->rtc_1 = 0; + + rule->pending_wqes = 0; + rule->action_ste_idx = -1; + rule->action_ste_selector = 0; + rule->status = MLX5HWS_RULE_STATUS_CREATING; + rule->resize_info->state = MLX5HWS_RULE_RESIZE_STATE_WRITING; +} + +bool mlx5hws_rule_move_in_progress(struct mlx5hws_rule *rule) +{ + return mlx5hws_matcher_is_in_resize(rule->matcher) && + rule->resize_info && + rule->resize_info->state != MLX5HWS_RULE_RESIZE_STATE_IDLE; +} + +static int hws_rule_create_hws(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr, + u8 mt_idx, + u32 *match_param, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[]) +{ + struct mlx5hws_action_template *at = &rule->matcher->at[at_idx]; + struct mlx5hws_match_template *mt = &rule->matcher->mt[mt_idx]; + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(mt); + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_context *ctx = matcher->tbl->ctx; + struct mlx5hws_send_ste_attr ste_attr = {0}; + struct mlx5hws_send_ring_dep_wqe *dep_wqe; + struct mlx5hws_actions_wqe_setter *setter; + struct mlx5hws_actions_apply_data apply; + struct mlx5hws_send_engine *queue; + u8 total_stes, action_stes; + bool is_update; + int i, ret; + + is_update = !match_param; + + setter = &at->setters[at->num_of_action_stes]; + total_stes = at->num_of_action_stes + (is_jumbo && !at->only_term); + action_stes = total_stes - 1; + + queue = &ctx->send_queue[attr->queue_id]; + if (unlikely(mlx5hws_send_engine_err(queue))) + return -EIO; + + hws_rule_create_init(rule, &ste_attr, &apply, is_update); + + /* Allocate dependent match WQE since rule might have dependent writes. + * The queued dependent WQE can be later aborted or kept as a dependency. + * dep_wqe buffers (ctrl, data) are also reused for all STE writes. + */ + dep_wqe = mlx5hws_send_add_new_dep_wqe(queue); + hws_rule_init_dep_wqe(dep_wqe, rule, mt, attr); + + ste_attr.wqe_ctrl = &dep_wqe->wqe_ctrl; + ste_attr.wqe_data = &dep_wqe->wqe_data; + apply.wqe_ctrl = &dep_wqe->wqe_ctrl; + apply.wqe_data = (__force __be32 *)&dep_wqe->wqe_data; + apply.rule_action = rule_actions; + apply.queue = queue; + + if (action_stes) { + /* Allocate action STEs for rules that need more than match STE */ + if (!is_update) { + ret = hws_rule_alloc_action_ste(rule, attr); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate action memory %d", ret); + mlx5hws_send_abort_new_dep_wqe(queue); + return ret; + } + } + /* Skip RX/TX based on the dep_wqe init */ + ste_attr.rtc_0 = dep_wqe->rtc_0 ? + matcher->action_ste[rule->action_ste_selector].rtc_0_id : 0; + ste_attr.rtc_1 = dep_wqe->rtc_1 ? + matcher->action_ste[rule->action_ste_selector].rtc_1_id : 0; + /* Action STEs are written to a specific index last to first */ + ste_attr.direct_index = rule->action_ste_idx + action_stes; + apply.next_direct_idx = ste_attr.direct_index; + } else { + apply.next_direct_idx = 0; + } + + for (i = total_stes; i-- > 0;) { + mlx5hws_action_apply_setter(&apply, setter--, !i && is_jumbo); + + if (i == 0) { + /* Handle last match STE. + * For hash split / linear lookup RTCs, packets reaching any STE + * will always match and perform the specified actions, which + * makes the tag irrelevant. + */ + if (likely(!mlx5hws_matcher_is_insert_by_idx(matcher) && !is_update)) + mlx5hws_definer_create_tag(match_param, mt->fc, mt->fc_sz, + (u8 *)dep_wqe->wqe_data.action); + else if (is_update) + hws_rule_update_copy_tag(rule, &dep_wqe->wqe_data, is_jumbo); + + /* Rule has dependent WQEs, match dep_wqe is queued */ + if (action_stes || apply.require_dep) + break; + + /* Rule has no dependencies, abort dep_wqe and send WQE now */ + mlx5hws_send_abort_new_dep_wqe(queue); + ste_attr.wqe_tag_is_jumbo = is_jumbo; + ste_attr.send_attr.notify_hw = !attr->burst; + ste_attr.send_attr.user_data = dep_wqe->user_data; + ste_attr.send_attr.rule = dep_wqe->rule; + ste_attr.rtc_0 = dep_wqe->rtc_0; + ste_attr.rtc_1 = dep_wqe->rtc_1; + ste_attr.used_id_rtc_0 = &rule->rtc_0; + ste_attr.used_id_rtc_1 = &rule->rtc_1; + ste_attr.retry_rtc_0 = dep_wqe->retry_rtc_0; + ste_attr.retry_rtc_1 = dep_wqe->retry_rtc_1; + ste_attr.direct_index = dep_wqe->direct_index; + } else { + apply.next_direct_idx = --ste_attr.direct_index; + } + + mlx5hws_send_ste(queue, &ste_attr); + } + + /* Backup TAG on the rule for deletion and resize info for + * moving rules to a new matcher, only after insertion. + */ + if (!is_update) + hws_rule_save_delete_info(rule, &ste_attr); + + hws_rule_save_resize_info(rule, &ste_attr, is_update); + mlx5hws_send_engine_inc_rule(queue); + + if (!attr->burst) + mlx5hws_send_all_dep_wqe(queue); + + return 0; +} + +static void hws_rule_destroy_failed_hws(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + struct mlx5hws_context *ctx = rule->matcher->tbl->ctx; + struct mlx5hws_send_engine *queue; + + queue = &ctx->send_queue[attr->queue_id]; + + hws_rule_gen_comp(queue, rule, false, + attr->user_data, MLX5HWS_RULE_STATUS_DELETED); + + /* Rule failed now we can safely release action STEs */ + mlx5hws_rule_free_action_ste(rule); + + /* Clear complex tag */ + hws_rule_clear_delete_info(rule); + + /* Clear info that was saved for resizing */ + mlx5hws_rule_clear_resize_info(rule); + + /* If a rule that was indicated as burst (need to trigger HW) has failed + * insertion we won't ring the HW as nothing is being written to the WQ. + * In such case update the last WQE and ring the HW with that work + */ + if (attr->burst) + return; + + mlx5hws_send_all_dep_wqe(queue); + mlx5hws_send_engine_flush_queue(queue); +} + +static int hws_rule_destroy_hws(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(rule->matcher->mt); + struct mlx5hws_context *ctx = rule->matcher->tbl->ctx; + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_wqe_gta_ctrl_seg wqe_ctrl = {0}; + struct mlx5hws_send_ste_attr ste_attr = {0}; + struct mlx5hws_send_engine *queue; + + queue = &ctx->send_queue[attr->queue_id]; + + if (unlikely(mlx5hws_send_engine_err(queue))) { + hws_rule_destroy_failed_hws(rule, attr); + return 0; + } + + /* Rule is not completed yet */ + if (rule->status == MLX5HWS_RULE_STATUS_CREATING) + return -EBUSY; + + /* Rule failed and doesn't require cleanup */ + if (rule->status == MLX5HWS_RULE_STATUS_FAILED) { + hws_rule_destroy_failed_hws(rule, attr); + return 0; + } + + if (rule->skip_delete) { + /* Rule shouldn't be deleted in HW. + * Generate completion as if write succeeded, and we can + * safely release action STEs and clear resize info. + */ + hws_rule_gen_comp(queue, rule, false, + attr->user_data, MLX5HWS_RULE_STATUS_DELETED); + + mlx5hws_rule_free_action_ste(rule); + mlx5hws_rule_clear_resize_info(rule); + return 0; + } + + mlx5hws_send_engine_inc_rule(queue); + + /* Send dependent WQE */ + if (!attr->burst) + mlx5hws_send_all_dep_wqe(queue); + + rule->status = MLX5HWS_RULE_STATUS_DELETING; + + ste_attr.send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + ste_attr.send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + ste_attr.send_attr.len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + + ste_attr.send_attr.rule = rule; + ste_attr.send_attr.notify_hw = !attr->burst; + ste_attr.send_attr.user_data = attr->user_data; + + ste_attr.rtc_0 = rule->rtc_0; + ste_attr.rtc_1 = rule->rtc_1; + ste_attr.used_id_rtc_0 = &rule->rtc_0; + ste_attr.used_id_rtc_1 = &rule->rtc_1; + ste_attr.wqe_ctrl = &wqe_ctrl; + ste_attr.wqe_tag_is_jumbo = is_jumbo; + ste_attr.gta_opcode = MLX5HWS_WQE_GTA_OP_DEACTIVATE; + if (unlikely(mlx5hws_matcher_is_insert_by_idx(matcher))) + ste_attr.direct_index = attr->rule_idx; + + hws_rule_load_delete_info(rule, &ste_attr); + mlx5hws_send_ste(queue, &ste_attr); + hws_rule_clear_delete_info(rule); + + return 0; +} + +static int hws_rule_enqueue_precheck(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + struct mlx5hws_context *ctx = rule->matcher->tbl->ctx; + + if (unlikely(!attr->user_data)) + return -EINVAL; + + /* Check if there is room in queue */ + if (unlikely(mlx5hws_send_engine_full(&ctx->send_queue[attr->queue_id]))) + return -EBUSY; + + return 0; +} + +static int hws_rule_enqueue_precheck_move(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + if (unlikely(rule->status != MLX5HWS_RULE_STATUS_CREATED)) + return -EINVAL; + + return hws_rule_enqueue_precheck(rule, attr); +} + +static int hws_rule_enqueue_precheck_create(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + if (unlikely(mlx5hws_matcher_is_in_resize(rule->matcher))) + /* Matcher in resize - new rules are not allowed */ + return -EAGAIN; + + return hws_rule_enqueue_precheck(rule, attr); +} + +static int hws_rule_enqueue_precheck_update(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + struct mlx5hws_matcher *matcher = rule->matcher; + + if (unlikely(!mlx5hws_matcher_is_resizable(rule->matcher) && + !matcher->attr.optimize_using_rule_idx && + !mlx5hws_matcher_is_insert_by_idx(matcher))) { + return -EOPNOTSUPP; + } + + if (unlikely(rule->status != MLX5HWS_RULE_STATUS_CREATED)) + return -EBUSY; + + return hws_rule_enqueue_precheck_create(rule, attr); +} + +int mlx5hws_rule_move_hws_remove(struct mlx5hws_rule *rule, + void *queue_ptr, + void *user_data) +{ + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(rule->matcher->mt); + struct mlx5hws_wqe_gta_ctrl_seg empty_wqe_ctrl = {0}; + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_send_engine *queue = queue_ptr; + struct mlx5hws_send_ste_attr ste_attr = {0}; + + mlx5hws_send_all_dep_wqe(queue); + + rule->resize_info->state = MLX5HWS_RULE_RESIZE_STATE_DELETING; + + ste_attr.send_attr.fence = 0; + ste_attr.send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + ste_attr.send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + ste_attr.send_attr.len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + ste_attr.send_attr.rule = rule; + ste_attr.send_attr.notify_hw = 1; + ste_attr.send_attr.user_data = user_data; + ste_attr.rtc_0 = rule->resize_info->rtc_0; + ste_attr.rtc_1 = rule->resize_info->rtc_1; + ste_attr.used_id_rtc_0 = &rule->resize_info->rtc_0; + ste_attr.used_id_rtc_1 = &rule->resize_info->rtc_1; + ste_attr.wqe_ctrl = &empty_wqe_ctrl; + ste_attr.wqe_tag_is_jumbo = is_jumbo; + ste_attr.gta_opcode = MLX5HWS_WQE_GTA_OP_DEACTIVATE; + + if (unlikely(mlx5hws_matcher_is_insert_by_idx(matcher))) + ste_attr.direct_index = rule->resize_info->rule_idx; + + hws_rule_load_delete_info(rule, &ste_attr); + mlx5hws_send_ste(queue, &ste_attr); + + return 0; +} + +int mlx5hws_rule_move_hws_add(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(rule->matcher->mt); + struct mlx5hws_context *ctx = rule->matcher->tbl->ctx; + struct mlx5hws_matcher *matcher = rule->matcher; + struct mlx5hws_send_ste_attr ste_attr = {0}; + struct mlx5hws_send_engine *queue; + int ret; + + ret = hws_rule_enqueue_precheck_move(rule, attr); + if (unlikely(ret)) + return ret; + + queue = &ctx->send_queue[attr->queue_id]; + + ret = mlx5hws_send_engine_err(queue); + if (ret) + return ret; + + hws_rule_move_init(rule, attr); + hws_rule_move_get_rtc(rule, &ste_attr); + + ste_attr.send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + ste_attr.send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + ste_attr.send_attr.len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + ste_attr.gta_opcode = MLX5HWS_WQE_GTA_OP_ACTIVATE; + ste_attr.wqe_tag_is_jumbo = is_jumbo; + + ste_attr.send_attr.rule = rule; + ste_attr.send_attr.fence = 0; + ste_attr.send_attr.notify_hw = !attr->burst; + ste_attr.send_attr.user_data = attr->user_data; + + ste_attr.used_id_rtc_0 = &rule->rtc_0; + ste_attr.used_id_rtc_1 = &rule->rtc_1; + ste_attr.wqe_ctrl = (struct mlx5hws_wqe_gta_ctrl_seg *)rule->resize_info->ctrl_seg; + ste_attr.wqe_data = (struct mlx5hws_wqe_gta_data_seg_ste *)rule->resize_info->data_seg; + ste_attr.direct_index = mlx5hws_matcher_is_insert_by_idx(matcher) ? + attr->rule_idx : 0; + + mlx5hws_send_ste(queue, &ste_attr); + mlx5hws_send_engine_inc_rule(queue); + + if (!attr->burst) + mlx5hws_send_all_dep_wqe(queue); + + return 0; +} + +int mlx5hws_rule_create(struct mlx5hws_matcher *matcher, + u8 mt_idx, + u32 *match_param, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *attr, + struct mlx5hws_rule *rule_handle) +{ + int ret; + + rule_handle->matcher = matcher; + + ret = hws_rule_enqueue_precheck_create(rule_handle, attr); + if (unlikely(ret)) + return ret; + + if (unlikely(!(matcher->num_of_mt >= mt_idx) || + !(matcher->num_of_at >= at_idx) || + !match_param)) { + pr_warn("HWS: Invalid rule creation parameters (MTs, ATs or match params)\n"); + return -EINVAL; + } + + ret = hws_rule_create_hws(rule_handle, + attr, + mt_idx, + match_param, + at_idx, + rule_actions); + + return ret; +} + +int mlx5hws_rule_destroy(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + int ret; + + ret = hws_rule_enqueue_precheck(rule, attr); + if (unlikely(ret)) + return -ret; + + ret = hws_rule_destroy_hws(rule, attr); + + return -ret; +} + +int mlx5hws_rule_action_update(struct mlx5hws_rule *rule, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *attr) +{ + int ret; + + ret = hws_rule_enqueue_precheck_update(rule, attr); + if (unlikely(ret)) + return -ret; + + ret = hws_rule_create_hws(rule, + attr, + 0, + NULL, + at_idx, + rule_actions); + + return -ret; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.h new file mode 100644 index 0000000000000..495cdd17e9f3c --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_RULE_H_ +#define MLX5HWS_RULE_H_ + +enum { + MLX5HWS_STE_CTRL_SZ = 20, + MLX5HWS_ACTIONS_SZ = 12, + MLX5HWS_MATCH_TAG_SZ = 32, + MLX5HWS_JUMBO_TAG_SZ = 44, +}; + +enum mlx5hws_rule_status { + MLX5HWS_RULE_STATUS_UNKNOWN, + MLX5HWS_RULE_STATUS_CREATING, + MLX5HWS_RULE_STATUS_CREATED, + MLX5HWS_RULE_STATUS_DELETING, + MLX5HWS_RULE_STATUS_DELETED, + MLX5HWS_RULE_STATUS_FAILING, + MLX5HWS_RULE_STATUS_FAILED, +}; + +enum mlx5hws_rule_move_state { + MLX5HWS_RULE_RESIZE_STATE_IDLE, + MLX5HWS_RULE_RESIZE_STATE_WRITING, + MLX5HWS_RULE_RESIZE_STATE_DELETING, +}; + +enum mlx5hws_rule_jumbo_match_tag_offset { + MLX5HWS_RULE_JUMBO_MATCH_TAG_OFFSET_DW0 = 8, +}; + +struct mlx5hws_rule_match_tag { + union { + u8 jumbo[MLX5HWS_JUMBO_TAG_SZ]; + struct { + u8 reserved[MLX5HWS_ACTIONS_SZ]; + u8 match[MLX5HWS_MATCH_TAG_SZ]; + }; + }; +}; + +struct mlx5hws_rule_resize_info { + struct mlx5hws_pool *action_ste_pool[2]; + u32 rtc_0; + u32 rtc_1; + u32 rule_idx; + u8 state; + u8 max_stes; + u8 ctrl_seg[MLX5HWS_WQE_SZ_GTA_CTRL]; /* Ctrl segment of STE: 48 bytes */ + u8 data_seg[MLX5HWS_WQE_SZ_GTA_DATA]; /* Data segment of STE: 64 bytes */ +}; + +struct mlx5hws_rule { + struct mlx5hws_matcher *matcher; + union { + struct mlx5hws_rule_match_tag tag; + struct mlx5hws_rule_resize_info *resize_info; + }; + u32 rtc_0; /* The RTC into which the STE was inserted */ + u32 rtc_1; /* The RTC into which the STE was inserted */ + int action_ste_idx; /* STE array index */ + u8 status; /* enum mlx5hws_rule_status */ + u8 action_ste_selector; /* For rule update - which action STE is in use */ + u8 pending_wqes; + bool skip_delete; /* For complex rules - another rule with same tag + * still exists, so don't actually delete this rule. + */ +}; + +void mlx5hws_rule_free_action_ste(struct mlx5hws_rule *rule); + +int mlx5hws_rule_move_hws_remove(struct mlx5hws_rule *rule, + void *queue, void *user_data); + +int mlx5hws_rule_move_hws_add(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr); + +bool mlx5hws_rule_move_in_progress(struct mlx5hws_rule *rule); + +void mlx5hws_rule_clear_resize_info(struct mlx5hws_rule *rule); + +#endif /* MLX5HWS_RULE_H_ */ From 6468e9f509c45ead3e0cef1692cbfeb94c5275a7 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:36:51 +0300 Subject: [PATCH 092/548] net/mlx5: HWS, added definers handling The Match Definer combines packet fields and a mask, creating a key which can be used for packet matching during steering flow processing. This patch adds handling of definer objects in HWS. Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 74a778b4a63faef9ff02aad0d332b209835f93e1) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_definer.c | 2146 +++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_definer.h | 834 +++++++ 2 files changed, 2980 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c new file mode 100644 index 0000000000000..3bdb5c90efffa --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c @@ -0,0 +1,2146 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +/* Pattern tunnel Layer bits. */ +#define MLX5_FLOW_LAYER_VXLAN BIT(12) +#define MLX5_FLOW_LAYER_VXLAN_GPE BIT(13) +#define MLX5_FLOW_LAYER_GRE BIT(14) +#define MLX5_FLOW_LAYER_MPLS BIT(15) + +/* Pattern tunnel Layer bits (continued). */ +#define MLX5_FLOW_LAYER_IPIP BIT(23) +#define MLX5_FLOW_LAYER_IPV6_ENCAP BIT(24) +#define MLX5_FLOW_LAYER_NVGRE BIT(25) +#define MLX5_FLOW_LAYER_GENEVE BIT(26) + +#define MLX5_FLOW_ITEM_FLEX_TUNNEL BIT_ULL(39) + +/* Tunnel Masks. */ +#define MLX5_FLOW_LAYER_TUNNEL \ + (MLX5_FLOW_LAYER_VXLAN | MLX5_FLOW_LAYER_VXLAN_GPE | \ + MLX5_FLOW_LAYER_GRE | MLX5_FLOW_LAYER_NVGRE | MLX5_FLOW_LAYER_MPLS | \ + MLX5_FLOW_LAYER_IPIP | MLX5_FLOW_LAYER_IPV6_ENCAP | \ + MLX5_FLOW_LAYER_GENEVE | MLX5_FLOW_LAYER_GTP | \ + MLX5_FLOW_ITEM_FLEX_TUNNEL) + +#define GTP_PDU_SC 0x85 +#define BAD_PORT 0xBAD +#define ETH_TYPE_IPV4_VXLAN 0x0800 +#define ETH_TYPE_IPV6_VXLAN 0x86DD +#define UDP_GTPU_PORT 2152 +#define UDP_PORT_MPLS 6635 +#define UDP_GENEVE_PORT 6081 +#define UDP_ROCEV2_PORT 4791 +#define HWS_FLOW_LAYER_TUNNEL_NO_MPLS (MLX5_FLOW_LAYER_TUNNEL & ~MLX5_FLOW_LAYER_MPLS) + +#define STE_NO_VLAN 0x0 +#define STE_SVLAN 0x1 +#define STE_CVLAN 0x2 +#define STE_NO_L3 0x0 +#define STE_IPV4 0x1 +#define STE_IPV6 0x2 +#define STE_NO_L4 0x0 +#define STE_TCP 0x1 +#define STE_UDP 0x2 +#define STE_ICMP 0x3 +#define STE_ESP 0x3 + +#define IPV4 0x4 +#define IPV6 0x6 + +/* Setter function based on bit offset and mask, for 32bit DW */ +#define _HWS_SET32(p, v, byte_off, bit_off, mask) \ + do { \ + u32 _v = v; \ + *((__be32 *)(p) + ((byte_off) / 4)) = \ + cpu_to_be32((be32_to_cpu(*((__be32 *)(p) + \ + ((byte_off) / 4))) & \ + (~((mask) << (bit_off)))) | \ + (((_v) & (mask)) << \ + (bit_off))); \ + } while (0) + +/* Setter function based on bit offset and mask, for unaligned 32bit DW */ +#define HWS_SET32(p, v, byte_off, bit_off, mask) \ + do { \ + if (unlikely((bit_off) < 0)) { \ + u32 _bit_off = -1 * (bit_off); \ + u32 second_dw_mask = (mask) & ((1 << _bit_off) - 1); \ + _HWS_SET32(p, (v) >> _bit_off, byte_off, 0, (mask) >> _bit_off); \ + _HWS_SET32(p, (v) & second_dw_mask, (byte_off) + DW_SIZE, \ + (bit_off) % BITS_IN_DW, second_dw_mask); \ + } else { \ + _HWS_SET32(p, v, byte_off, (bit_off), (mask)); \ + } \ + } while (0) + +/* Getter for up to aligned 32bit DW */ +#define HWS_GET32(p, byte_off, bit_off, mask) \ + ((be32_to_cpu(*((__be32 *)(p) + ((byte_off) / 4))) >> (bit_off)) & (mask)) + +#define HWS_CALC_FNAME(field, inner) \ + ((inner) ? MLX5HWS_DEFINER_FNAME_##field##_I : \ + MLX5HWS_DEFINER_FNAME_##field##_O) + +#define HWS_GET_MATCH_PARAM(match_param, hdr) \ + MLX5_GET(fte_match_param, match_param, hdr) + +#define HWS_IS_FLD_SET(match_param, hdr) \ + (!!(HWS_GET_MATCH_PARAM(match_param, hdr))) + +#define HWS_IS_FLD_SET_DW_ARR(match_param, hdr, sz_in_bits) ({ \ + BUILD_BUG_ON((sz_in_bits) % 32); \ + u32 sz = sz_in_bits; \ + u32 res = 0; \ + u32 dw_off = __mlx5_dw_off(fte_match_param, hdr); \ + while (!res && sz >= 32) { \ + res = *((match_param) + (dw_off++)); \ + sz -= 32; \ + } \ + res; \ + }) + +#define HWS_IS_FLD_SET_SZ(match_param, hdr, sz_in_bits) \ + (((sz_in_bits) > 32) ? HWS_IS_FLD_SET_DW_ARR(match_param, hdr, sz_in_bits) : \ + !!(HWS_GET_MATCH_PARAM(match_param, hdr))) + +#define HWS_GET64_MATCH_PARAM(match_param, hdr) \ + MLX5_GET64(fte_match_param, match_param, hdr) + +#define HWS_IS_FLD64_SET(match_param, hdr) \ + (!!(HWS_GET64_MATCH_PARAM(match_param, hdr))) + +#define HWS_CALC_HDR_SRC(fc, s_hdr) \ + do { \ + (fc)->s_bit_mask = __mlx5_mask(fte_match_param, s_hdr); \ + (fc)->s_bit_off = __mlx5_dw_bit_off(fte_match_param, s_hdr); \ + (fc)->s_byte_off = MLX5_BYTE_OFF(fte_match_param, s_hdr); \ + } while (0) + +#define HWS_CALC_HDR_DST(fc, d_hdr) \ + do { \ + (fc)->bit_mask = __mlx5_mask(definer_hl, d_hdr); \ + (fc)->bit_off = __mlx5_dw_bit_off(definer_hl, d_hdr); \ + (fc)->byte_off = MLX5_BYTE_OFF(definer_hl, d_hdr); \ + } while (0) + +#define HWS_CALC_HDR(fc, s_hdr, d_hdr) \ + do { \ + HWS_CALC_HDR_SRC(fc, s_hdr); \ + HWS_CALC_HDR_DST(fc, d_hdr); \ + (fc)->tag_set = &hws_definer_generic_set; \ + } while (0) + +#define HWS_SET_HDR(fc_arr, match_param, fname, s_hdr, d_hdr) \ + do { \ + if (HWS_IS_FLD_SET(match_param, s_hdr)) \ + HWS_CALC_HDR(&(fc_arr)[MLX5HWS_DEFINER_FNAME_##fname], s_hdr, d_hdr); \ + } while (0) + +struct mlx5hws_definer_sel_ctrl { + u8 allowed_full_dw; /* Full DW selectors cover all offsets */ + u8 allowed_lim_dw; /* Limited DW selectors cover offset < 64 */ + u8 allowed_bytes; /* Bytes selectors, up to offset 255 */ + u8 used_full_dw; + u8 used_lim_dw; + u8 used_bytes; + u8 full_dw_selector[DW_SELECTORS]; + u8 lim_dw_selector[DW_SELECTORS_LIMITED]; + u8 byte_selector[BYTE_SELECTORS]; +}; + +struct mlx5hws_definer_conv_data { + struct mlx5hws_context *ctx; + struct mlx5hws_definer_fc *fc; + /* enum mlx5hws_definer_match_flag */ + u32 match_flags; +}; + +static void +hws_definer_ones_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + HWS_SET32(tag, -1, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_generic_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + /* Can be optimized */ + u32 val = HWS_GET32(match_param, fc->s_byte_off, fc->s_bit_off, fc->s_bit_mask); + + HWS_SET32(tag, val, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_outer_vlan_type_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + if (HWS_GET_MATCH_PARAM(match_param, outer_headers.cvlan_tag)) + HWS_SET32(tag, STE_CVLAN, fc->byte_off, fc->bit_off, fc->bit_mask); + else if (HWS_GET_MATCH_PARAM(match_param, outer_headers.svlan_tag)) + HWS_SET32(tag, STE_SVLAN, fc->byte_off, fc->bit_off, fc->bit_mask); + else + HWS_SET32(tag, STE_NO_VLAN, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_inner_vlan_type_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + if (HWS_GET_MATCH_PARAM(match_param, inner_headers.cvlan_tag)) + HWS_SET32(tag, STE_CVLAN, fc->byte_off, fc->bit_off, fc->bit_mask); + else if (HWS_GET_MATCH_PARAM(match_param, inner_headers.svlan_tag)) + HWS_SET32(tag, STE_SVLAN, fc->byte_off, fc->bit_off, fc->bit_mask); + else + HWS_SET32(tag, STE_NO_VLAN, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_second_vlan_type_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag, + bool inner) +{ + u32 second_cvlan_tag = inner ? + HWS_GET_MATCH_PARAM(match_param, misc_parameters.inner_second_cvlan_tag) : + HWS_GET_MATCH_PARAM(match_param, misc_parameters.outer_second_cvlan_tag); + u32 second_svlan_tag = inner ? + HWS_GET_MATCH_PARAM(match_param, misc_parameters.inner_second_svlan_tag) : + HWS_GET_MATCH_PARAM(match_param, misc_parameters.outer_second_svlan_tag); + + if (second_cvlan_tag) + HWS_SET32(tag, STE_CVLAN, fc->byte_off, fc->bit_off, fc->bit_mask); + else if (second_svlan_tag) + HWS_SET32(tag, STE_SVLAN, fc->byte_off, fc->bit_off, fc->bit_mask); + else + HWS_SET32(tag, STE_NO_VLAN, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_inner_second_vlan_type_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + hws_definer_second_vlan_type_set(fc, match_param, tag, true); +} + +static void +hws_definer_outer_second_vlan_type_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + hws_definer_second_vlan_type_set(fc, match_param, tag, false); +} + +static void hws_definer_icmp_dw1_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + u32 code = HWS_GET_MATCH_PARAM(match_param, misc_parameters_3.icmp_code); + u32 type = HWS_GET_MATCH_PARAM(match_param, misc_parameters_3.icmp_type); + u32 dw = (type << __mlx5_dw_bit_off(header_icmp, type)) | + (code << __mlx5_dw_bit_off(header_icmp, code)); + + HWS_SET32(tag, dw, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_icmpv6_dw1_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + u32 code = HWS_GET_MATCH_PARAM(match_param, misc_parameters_3.icmpv6_code); + u32 type = HWS_GET_MATCH_PARAM(match_param, misc_parameters_3.icmpv6_type); + u32 dw = (type << __mlx5_dw_bit_off(header_icmp, type)) | + (code << __mlx5_dw_bit_off(header_icmp, code)); + + HWS_SET32(tag, dw, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_l3_type_set(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + u32 val = HWS_GET32(match_param, fc->s_byte_off, fc->s_bit_off, fc->s_bit_mask); + + if (val == IPV4) + HWS_SET32(tag, STE_IPV4, fc->byte_off, fc->bit_off, fc->bit_mask); + else if (val == IPV6) + HWS_SET32(tag, STE_IPV6, fc->byte_off, fc->bit_off, fc->bit_mask); + else + HWS_SET32(tag, STE_NO_L3, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_set_source_port_gvmi(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag, + struct mlx5hws_context *peer_ctx) +{ + u16 source_port = HWS_GET_MATCH_PARAM(match_param, misc_parameters.source_port); + u16 vport_gvmi = 0; + int ret; + + ret = mlx5hws_vport_get_gvmi(peer_ctx, source_port, &vport_gvmi); + if (ret) { + HWS_SET32(tag, BAD_PORT, fc->byte_off, fc->bit_off, fc->bit_mask); + mlx5hws_err(fc->ctx, "Vport 0x%x is disabled or invalid\n", source_port); + return; + } + + if (vport_gvmi) + HWS_SET32(tag, vport_gvmi, fc->byte_off, fc->bit_off, fc->bit_mask); +} + +static void +hws_definer_set_source_gvmi_vhca_id(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +__must_hold(&fc->ctx->ctrl_lock) +{ + int id = HWS_GET_MATCH_PARAM(match_param, misc_parameters.source_eswitch_owner_vhca_id); + struct mlx5hws_context *peer_ctx; + + if (id == fc->ctx->caps->vhca_id) + peer_ctx = fc->ctx; + else + peer_ctx = xa_load(&fc->ctx->peer_ctx_xa, id); + + if (!peer_ctx) { + HWS_SET32(tag, BAD_PORT, fc->byte_off, fc->bit_off, fc->bit_mask); + mlx5hws_err(fc->ctx, "Invalid vhca_id provided 0x%x\n", id); + return; + } + + hws_definer_set_source_port_gvmi(fc, match_param, tag, peer_ctx); +} + +static void +hws_definer_set_source_gvmi(struct mlx5hws_definer_fc *fc, + void *match_param, + u8 *tag) +{ + hws_definer_set_source_port_gvmi(fc, match_param, tag, fc->ctx); +} + +static struct mlx5hws_definer_fc * +hws_definer_flex_parser_steering_ok_bits_handler(struct mlx5hws_definer_conv_data *cd, + u8 parser_id) +{ + struct mlx5hws_definer_fc *fc; + + switch (parser_id) { + case 0: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER0_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser0_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 1: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER1_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser1_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 2: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER2_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser2_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 3: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER3_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser3_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 4: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER4_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser4_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 5: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER5_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser5_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 6: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER6_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser6_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + case 7: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER7_OK]; + HWS_CALC_HDR_DST(fc, oks1.flex_parser7_steering_ok); + fc->tag_set = &hws_definer_generic_set; + break; + default: + mlx5hws_err(cd->ctx, "Unsupported flex parser steering ok index %u\n", parser_id); + return NULL; + } + + return fc; +} + +static struct mlx5hws_definer_fc * +hws_definer_flex_parser_handler(struct mlx5hws_definer_conv_data *cd, + u8 parser_id) +{ + struct mlx5hws_definer_fc *fc; + + switch (parser_id) { + case 0: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_0]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_0); + fc->tag_set = &hws_definer_generic_set; + break; + case 1: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_1]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_1); + fc->tag_set = &hws_definer_generic_set; + break; + case 2: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_2]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_2); + fc->tag_set = &hws_definer_generic_set; + break; + case 3: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_3]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_3); + fc->tag_set = &hws_definer_generic_set; + break; + case 4: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_4]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_4); + fc->tag_set = &hws_definer_generic_set; + break; + case 5: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_5]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_5); + fc->tag_set = &hws_definer_generic_set; + break; + case 6: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_6]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_6); + fc->tag_set = &hws_definer_generic_set; + break; + case 7: + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_FLEX_PARSER_7]; + HWS_CALC_HDR_DST(fc, flex_parser.flex_parser_7); + fc->tag_set = &hws_definer_generic_set; + break; + default: + mlx5hws_err(cd->ctx, "Unsupported flex parser %u\n", parser_id); + return NULL; + } + + return fc; +} + +static struct mlx5hws_definer_fc * +hws_definer_misc4_fields_handler(struct mlx5hws_definer_conv_data *cd, + bool *parser_is_used, + u32 id, + u32 value) +{ + if (id || value) { + if (id >= HWS_NUM_OF_FLEX_PARSERS) { + mlx5hws_err(cd->ctx, "Unsupported parser id\n"); + return NULL; + } + + if (parser_is_used[id]) { + mlx5hws_err(cd->ctx, "Parser id have been used\n"); + return NULL; + } + } + + parser_is_used[id] = true; + + return hws_definer_flex_parser_handler(cd, id); +} + +static int +hws_definer_check_match_flags(struct mlx5hws_definer_conv_data *cd) +{ + u32 flags; + + flags = cd->match_flags & (MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN_GPE | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_0_1); + if (flags & (flags - 1)) + goto err_conflict; + + flags = cd->match_flags & (MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE_OPT_KEY | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_2); + + if (flags & (flags - 1)) + goto err_conflict; + + flags = cd->match_flags & (MLX5HWS_DEFINER_MATCH_FLAG_TNL_MPLS_OVER_GRE | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_MPLS_OVER_UDP); + if (flags & (flags - 1)) + goto err_conflict; + + flags = cd->match_flags & (MLX5HWS_DEFINER_MATCH_FLAG_ICMPV4 | + MLX5HWS_DEFINER_MATCH_FLAG_ICMPV6 | + MLX5HWS_DEFINER_MATCH_FLAG_TCP_O | + MLX5HWS_DEFINER_MATCH_FLAG_TCP_I); + if (flags & (flags - 1)) + goto err_conflict; + + return 0; + +err_conflict: + mlx5hws_err(cd->ctx, "Invalid definer fields combination\n"); + return -EINVAL; +} + +static int +hws_definer_conv_outer(struct mlx5hws_definer_conv_data *cd, + u32 *match_param) +{ + bool is_s_ipv6, is_d_ipv6, smac_set, dmac_set; + struct mlx5hws_definer_fc *fc = cd->fc; + struct mlx5hws_definer_fc *curr_fc; + u32 *s_ipv6, *d_ipv6; + + if (HWS_IS_FLD_SET_SZ(match_param, outer_headers.l4_type, 0x2) || + HWS_IS_FLD_SET_SZ(match_param, outer_headers.reserved_at_c2, 0xe) || + HWS_IS_FLD_SET_SZ(match_param, outer_headers.reserved_at_c4, 0x4)) { + mlx5hws_err(cd->ctx, "Unsupported outer parameters set\n"); + return -EINVAL; + } + + /* L2 Check ethertype */ + HWS_SET_HDR(fc, match_param, ETH_TYPE_O, + outer_headers.ethertype, + eth_l2_outer.l3_ethertype); + /* L2 Check SMAC 47_16 */ + HWS_SET_HDR(fc, match_param, ETH_SMAC_47_16_O, + outer_headers.smac_47_16, eth_l2_src_outer.smac_47_16); + /* L2 Check SMAC 15_0 */ + HWS_SET_HDR(fc, match_param, ETH_SMAC_15_0_O, + outer_headers.smac_15_0, eth_l2_src_outer.smac_15_0); + /* L2 Check DMAC 47_16 */ + HWS_SET_HDR(fc, match_param, ETH_DMAC_47_16_O, + outer_headers.dmac_47_16, eth_l2_outer.dmac_47_16); + /* L2 Check DMAC 15_0 */ + HWS_SET_HDR(fc, match_param, ETH_DMAC_15_0_O, + outer_headers.dmac_15_0, eth_l2_outer.dmac_15_0); + + /* L2 VLAN */ + HWS_SET_HDR(fc, match_param, VLAN_FIRST_PRIO_O, + outer_headers.first_prio, eth_l2_outer.first_priority); + HWS_SET_HDR(fc, match_param, VLAN_CFI_O, + outer_headers.first_cfi, eth_l2_outer.first_cfi); + HWS_SET_HDR(fc, match_param, VLAN_ID_O, + outer_headers.first_vid, eth_l2_outer.first_vlan_id); + + /* L2 CVLAN and SVLAN */ + if (HWS_GET_MATCH_PARAM(match_param, outer_headers.cvlan_tag) || + HWS_GET_MATCH_PARAM(match_param, outer_headers.svlan_tag)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VLAN_TYPE_O]; + HWS_CALC_HDR_DST(curr_fc, eth_l2_outer.first_vlan_qualifier); + curr_fc->tag_set = &hws_definer_outer_vlan_type_set; + curr_fc->tag_mask_set = &hws_definer_ones_set; + } + + /* L3 Check IP header */ + HWS_SET_HDR(fc, match_param, IP_PROTOCOL_O, + outer_headers.ip_protocol, + eth_l3_outer.protocol_next_header); + HWS_SET_HDR(fc, match_param, IP_TTL_O, + outer_headers.ttl_hoplimit, + eth_l3_outer.time_to_live_hop_limit); + + /* L3 Check IPv4/IPv6 addresses */ + s_ipv6 = MLX5_ADDR_OF(fte_match_param, match_param, + outer_headers.src_ipv4_src_ipv6.ipv6_layout); + d_ipv6 = MLX5_ADDR_OF(fte_match_param, match_param, + outer_headers.dst_ipv4_dst_ipv6.ipv6_layout); + + /* Assume IPv6 is used if ipv6 bits are set */ + is_s_ipv6 = s_ipv6[0] || s_ipv6[1] || s_ipv6[2]; + is_d_ipv6 = d_ipv6[0] || d_ipv6[1] || d_ipv6[2]; + + if (is_s_ipv6) { + /* Handle IPv6 source address */ + HWS_SET_HDR(fc, match_param, IPV6_SRC_127_96_O, + outer_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_127_96, + ipv6_src_outer.ipv6_address_127_96); + HWS_SET_HDR(fc, match_param, IPV6_SRC_95_64_O, + outer_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_95_64, + ipv6_src_outer.ipv6_address_95_64); + HWS_SET_HDR(fc, match_param, IPV6_SRC_63_32_O, + outer_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_63_32, + ipv6_src_outer.ipv6_address_63_32); + HWS_SET_HDR(fc, match_param, IPV6_SRC_31_0_O, + outer_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv6_src_outer.ipv6_address_31_0); + } else { + /* Handle IPv4 source address */ + HWS_SET_HDR(fc, match_param, IPV4_SRC_O, + outer_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv4_src_dest_outer.source_address); + } + if (is_d_ipv6) { + /* Handle IPv6 destination address */ + HWS_SET_HDR(fc, match_param, IPV6_DST_127_96_O, + outer_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_127_96, + ipv6_dst_outer.ipv6_address_127_96); + HWS_SET_HDR(fc, match_param, IPV6_DST_95_64_O, + outer_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_95_64, + ipv6_dst_outer.ipv6_address_95_64); + HWS_SET_HDR(fc, match_param, IPV6_DST_63_32_O, + outer_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_63_32, + ipv6_dst_outer.ipv6_address_63_32); + HWS_SET_HDR(fc, match_param, IPV6_DST_31_0_O, + outer_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv6_dst_outer.ipv6_address_31_0); + } else { + /* Handle IPv4 destination address */ + HWS_SET_HDR(fc, match_param, IPV4_DST_O, + outer_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv4_src_dest_outer.destination_address); + } + + /* L4 Handle TCP/UDP */ + HWS_SET_HDR(fc, match_param, L4_SPORT_O, + outer_headers.tcp_sport, eth_l4_outer.source_port); + HWS_SET_HDR(fc, match_param, L4_DPORT_O, + outer_headers.tcp_dport, eth_l4_outer.destination_port); + HWS_SET_HDR(fc, match_param, L4_SPORT_O, + outer_headers.udp_sport, eth_l4_outer.source_port); + HWS_SET_HDR(fc, match_param, L4_DPORT_O, + outer_headers.udp_dport, eth_l4_outer.destination_port); + HWS_SET_HDR(fc, match_param, TCP_FLAGS_O, + outer_headers.tcp_flags, eth_l4_outer.tcp_flags); + + /* L3 Handle DSCP, ECN and IHL */ + HWS_SET_HDR(fc, match_param, IP_DSCP_O, + outer_headers.ip_dscp, eth_l3_outer.dscp); + HWS_SET_HDR(fc, match_param, IP_ECN_O, + outer_headers.ip_ecn, eth_l3_outer.ecn); + HWS_SET_HDR(fc, match_param, IPV4_IHL_O, + outer_headers.ipv4_ihl, eth_l3_outer.ihl); + + /* Set IP fragmented bit */ + if (HWS_IS_FLD_SET(match_param, outer_headers.frag)) { + smac_set = HWS_IS_FLD_SET(match_param, outer_headers.smac_15_0) || + HWS_IS_FLD_SET(match_param, outer_headers.smac_47_16); + dmac_set = HWS_IS_FLD_SET(match_param, outer_headers.dmac_15_0) || + HWS_IS_FLD_SET(match_param, outer_headers.dmac_47_16); + if (smac_set == dmac_set) { + HWS_SET_HDR(fc, match_param, IP_FRAG_O, + outer_headers.frag, eth_l4_outer.ip_fragmented); + } else { + HWS_SET_HDR(fc, match_param, IP_FRAG_O, + outer_headers.frag, eth_l2_src_outer.ip_fragmented); + } + } + + /* L3_type set */ + if (HWS_IS_FLD_SET(match_param, outer_headers.ip_version)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_ETH_L3_TYPE_O]; + HWS_CALC_HDR_DST(curr_fc, eth_l2_outer.l3_type); + curr_fc->tag_set = &hws_definer_l3_type_set; + curr_fc->tag_mask_set = &hws_definer_ones_set; + HWS_CALC_HDR_SRC(curr_fc, outer_headers.ip_version); + } + + return 0; +} + +static int +hws_definer_conv_inner(struct mlx5hws_definer_conv_data *cd, + u32 *match_param) +{ + bool is_s_ipv6, is_d_ipv6, smac_set, dmac_set; + struct mlx5hws_definer_fc *fc = cd->fc; + struct mlx5hws_definer_fc *curr_fc; + u32 *s_ipv6, *d_ipv6; + + if (HWS_IS_FLD_SET_SZ(match_param, inner_headers.l4_type, 0x2) || + HWS_IS_FLD_SET_SZ(match_param, inner_headers.reserved_at_c2, 0xe) || + HWS_IS_FLD_SET_SZ(match_param, inner_headers.reserved_at_c4, 0x4)) { + mlx5hws_err(cd->ctx, "Unsupported inner parameters set\n"); + return -EINVAL; + } + + /* L2 Check ethertype */ + HWS_SET_HDR(fc, match_param, ETH_TYPE_I, + inner_headers.ethertype, + eth_l2_inner.l3_ethertype); + /* L2 Check SMAC 47_16 */ + HWS_SET_HDR(fc, match_param, ETH_SMAC_47_16_I, + inner_headers.smac_47_16, eth_l2_src_inner.smac_47_16); + /* L2 Check SMAC 15_0 */ + HWS_SET_HDR(fc, match_param, ETH_SMAC_15_0_I, + inner_headers.smac_15_0, eth_l2_src_inner.smac_15_0); + /* L2 Check DMAC 47_16 */ + HWS_SET_HDR(fc, match_param, ETH_DMAC_47_16_I, + inner_headers.dmac_47_16, eth_l2_inner.dmac_47_16); + /* L2 Check DMAC 15_0 */ + HWS_SET_HDR(fc, match_param, ETH_DMAC_15_0_I, + inner_headers.dmac_15_0, eth_l2_inner.dmac_15_0); + + /* L2 VLAN */ + HWS_SET_HDR(fc, match_param, VLAN_FIRST_PRIO_I, + inner_headers.first_prio, eth_l2_inner.first_priority); + HWS_SET_HDR(fc, match_param, VLAN_CFI_I, + inner_headers.first_cfi, eth_l2_inner.first_cfi); + HWS_SET_HDR(fc, match_param, VLAN_ID_I, + inner_headers.first_vid, eth_l2_inner.first_vlan_id); + + /* L2 CVLAN and SVLAN */ + if (HWS_GET_MATCH_PARAM(match_param, inner_headers.cvlan_tag) || + HWS_GET_MATCH_PARAM(match_param, inner_headers.svlan_tag)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VLAN_TYPE_I]; + HWS_CALC_HDR_DST(curr_fc, eth_l2_inner.first_vlan_qualifier); + curr_fc->tag_set = &hws_definer_inner_vlan_type_set; + curr_fc->tag_mask_set = &hws_definer_ones_set; + } + /* L3 Check IP header */ + HWS_SET_HDR(fc, match_param, IP_PROTOCOL_I, + inner_headers.ip_protocol, + eth_l3_inner.protocol_next_header); + HWS_SET_HDR(fc, match_param, IP_VERSION_I, + inner_headers.ip_version, + eth_l3_inner.ip_version); + HWS_SET_HDR(fc, match_param, IP_TTL_I, + inner_headers.ttl_hoplimit, + eth_l3_inner.time_to_live_hop_limit); + + /* L3 Check IPv4/IPv6 addresses */ + s_ipv6 = MLX5_ADDR_OF(fte_match_param, match_param, + inner_headers.src_ipv4_src_ipv6.ipv6_layout); + d_ipv6 = MLX5_ADDR_OF(fte_match_param, match_param, + inner_headers.dst_ipv4_dst_ipv6.ipv6_layout); + + /* Assume IPv6 is used if ipv6 bits are set */ + is_s_ipv6 = s_ipv6[0] || s_ipv6[1] || s_ipv6[2]; + is_d_ipv6 = d_ipv6[0] || d_ipv6[1] || d_ipv6[2]; + + if (is_s_ipv6) { + /* Handle IPv6 source address */ + HWS_SET_HDR(fc, match_param, IPV6_SRC_127_96_I, + inner_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_127_96, + ipv6_src_inner.ipv6_address_127_96); + HWS_SET_HDR(fc, match_param, IPV6_SRC_95_64_I, + inner_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_95_64, + ipv6_src_inner.ipv6_address_95_64); + HWS_SET_HDR(fc, match_param, IPV6_SRC_63_32_I, + inner_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_63_32, + ipv6_src_inner.ipv6_address_63_32); + HWS_SET_HDR(fc, match_param, IPV6_SRC_31_0_I, + inner_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv6_src_inner.ipv6_address_31_0); + } else { + /* Handle IPv4 source address */ + HWS_SET_HDR(fc, match_param, IPV4_SRC_I, + inner_headers.src_ipv4_src_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv4_src_dest_inner.source_address); + } + if (is_d_ipv6) { + /* Handle IPv6 destination address */ + HWS_SET_HDR(fc, match_param, IPV6_DST_127_96_I, + inner_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_127_96, + ipv6_dst_inner.ipv6_address_127_96); + HWS_SET_HDR(fc, match_param, IPV6_DST_95_64_I, + inner_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_95_64, + ipv6_dst_inner.ipv6_address_95_64); + HWS_SET_HDR(fc, match_param, IPV6_DST_63_32_I, + inner_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_63_32, + ipv6_dst_inner.ipv6_address_63_32); + HWS_SET_HDR(fc, match_param, IPV6_DST_31_0_I, + inner_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv6_dst_inner.ipv6_address_31_0); + } else { + /* Handle IPv4 destination address */ + HWS_SET_HDR(fc, match_param, IPV4_DST_I, + inner_headers.dst_ipv4_dst_ipv6.ipv6_simple_layout.ipv6_31_0, + ipv4_src_dest_inner.destination_address); + } + + /* L4 Handle TCP/UDP */ + HWS_SET_HDR(fc, match_param, L4_SPORT_I, + inner_headers.tcp_sport, eth_l4_inner.source_port); + HWS_SET_HDR(fc, match_param, L4_DPORT_I, + inner_headers.tcp_dport, eth_l4_inner.destination_port); + HWS_SET_HDR(fc, match_param, L4_SPORT_I, + inner_headers.udp_sport, eth_l4_inner.source_port); + HWS_SET_HDR(fc, match_param, L4_DPORT_I, + inner_headers.udp_dport, eth_l4_inner.destination_port); + HWS_SET_HDR(fc, match_param, TCP_FLAGS_I, + inner_headers.tcp_flags, eth_l4_inner.tcp_flags); + + /* L3 Handle DSCP, ECN and IHL */ + HWS_SET_HDR(fc, match_param, IP_DSCP_I, + inner_headers.ip_dscp, eth_l3_inner.dscp); + HWS_SET_HDR(fc, match_param, IP_ECN_I, + inner_headers.ip_ecn, eth_l3_inner.ecn); + HWS_SET_HDR(fc, match_param, IPV4_IHL_I, + inner_headers.ipv4_ihl, eth_l3_inner.ihl); + + /* Set IP fragmented bit */ + if (HWS_IS_FLD_SET(match_param, inner_headers.frag)) { + if (HWS_IS_FLD_SET(match_param, misc_parameters.vxlan_vni)) { + HWS_SET_HDR(fc, match_param, IP_FRAG_I, + inner_headers.frag, eth_l2_inner.ip_fragmented); + } else { + smac_set = HWS_IS_FLD_SET(match_param, inner_headers.smac_15_0) || + HWS_IS_FLD_SET(match_param, inner_headers.smac_47_16); + dmac_set = HWS_IS_FLD_SET(match_param, inner_headers.dmac_15_0) || + HWS_IS_FLD_SET(match_param, inner_headers.dmac_47_16); + if (smac_set == dmac_set) { + HWS_SET_HDR(fc, match_param, IP_FRAG_I, + inner_headers.frag, eth_l4_inner.ip_fragmented); + } else { + HWS_SET_HDR(fc, match_param, IP_FRAG_I, + inner_headers.frag, eth_l2_src_inner.ip_fragmented); + } + } + } + + /* L3_type set */ + if (HWS_IS_FLD_SET(match_param, inner_headers.ip_version)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_ETH_L3_TYPE_I]; + HWS_CALC_HDR_DST(curr_fc, eth_l2_inner.l3_type); + curr_fc->tag_set = &hws_definer_l3_type_set; + curr_fc->tag_mask_set = &hws_definer_ones_set; + HWS_CALC_HDR_SRC(curr_fc, inner_headers.ip_version); + } + + return 0; +} + +static int +hws_definer_conv_misc(struct mlx5hws_definer_conv_data *cd, + u32 *match_param) +{ + struct mlx5hws_cmd_query_caps *caps = cd->ctx->caps; + struct mlx5hws_definer_fc *fc = cd->fc; + struct mlx5hws_definer_fc *curr_fc; + + if (HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_1, 0x1) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_64, 0xc) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_d8, 0x6) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_e0, 0xc) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_100, 0xc) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_120, 0xa) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_140, 0x8) || + HWS_IS_FLD_SET(match_param, misc_parameters.bth_dst_qp) || + HWS_IS_FLD_SET(match_param, misc_parameters.bth_opcode) || + HWS_IS_FLD_SET(match_param, misc_parameters.inner_esp_spi) || + HWS_IS_FLD_SET(match_param, misc_parameters.outer_esp_spi) || + HWS_IS_FLD_SET(match_param, misc_parameters.source_vhca_port) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters.reserved_at_1a0, 0x60)) { + mlx5hws_err(cd->ctx, "Unsupported misc parameters set\n"); + return -EINVAL; + } + + /* Check GRE related fields */ + if (HWS_IS_FLD_SET(match_param, misc_parameters.gre_c_present)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GRE_C]; + HWS_CALC_HDR(curr_fc, + misc_parameters.gre_c_present, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_gre, gre_c_present); + curr_fc->bit_off = __mlx5_dw_bit_off(header_gre, gre_c_present); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.gre_k_present)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GRE_K]; + HWS_CALC_HDR(curr_fc, + misc_parameters.gre_k_present, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_gre, gre_k_present); + curr_fc->bit_off = __mlx5_dw_bit_off(header_gre, gre_k_present); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.gre_s_present)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GRE_S]; + HWS_CALC_HDR(curr_fc, + misc_parameters.gre_s_present, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_gre, gre_s_present); + curr_fc->bit_off = __mlx5_dw_bit_off(header_gre, gre_s_present); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.gre_protocol)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GRE_PROTOCOL]; + HWS_CALC_HDR(curr_fc, + misc_parameters.gre_protocol, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_gre, gre_protocol); + curr_fc->bit_off = __mlx5_dw_bit_off(header_gre, gre_protocol); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.gre_key.key)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE | + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE_OPT_KEY; + HWS_SET_HDR(fc, match_param, GRE_OPT_KEY, + misc_parameters.gre_key.key, tunnel_header.tunnel_header_2); + } + + /* Check GENEVE related fields */ + if (HWS_IS_FLD_SET(match_param, misc_parameters.geneve_vni)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GENEVE_VNI]; + HWS_CALC_HDR(curr_fc, + misc_parameters.geneve_vni, + tunnel_header.tunnel_header_1); + curr_fc->bit_mask = __mlx5_mask(header_geneve, vni); + curr_fc->bit_off = __mlx5_dw_bit_off(header_geneve, vni); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.geneve_opt_len)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GENEVE_OPT_LEN]; + HWS_CALC_HDR(curr_fc, + misc_parameters.geneve_opt_len, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_geneve, opt_len); + curr_fc->bit_off = __mlx5_dw_bit_off(header_geneve, opt_len); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.geneve_protocol_type)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GENEVE_PROTO]; + HWS_CALC_HDR(curr_fc, + misc_parameters.geneve_protocol_type, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_geneve, protocol_type); + curr_fc->bit_off = __mlx5_dw_bit_off(header_geneve, protocol_type); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters.geneve_oam)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GENEVE_OAM]; + HWS_CALC_HDR(curr_fc, + misc_parameters.geneve_oam, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_geneve, o_flag); + curr_fc->bit_off = __mlx5_dw_bit_off(header_geneve, o_flag); + } + + HWS_SET_HDR(fc, match_param, SOURCE_QP, + misc_parameters.source_sqn, source_qp_gvmi.source_qp); + HWS_SET_HDR(fc, match_param, IPV6_FLOW_LABEL_O, + misc_parameters.outer_ipv6_flow_label, eth_l3_outer.flow_label); + HWS_SET_HDR(fc, match_param, IPV6_FLOW_LABEL_I, + misc_parameters.inner_ipv6_flow_label, eth_l3_inner.flow_label); + + /* L2 Second VLAN */ + HWS_SET_HDR(fc, match_param, VLAN_SECOND_PRIO_O, + misc_parameters.outer_second_prio, eth_l2_outer.second_priority); + HWS_SET_HDR(fc, match_param, VLAN_SECOND_PRIO_I, + misc_parameters.inner_second_prio, eth_l2_inner.second_priority); + HWS_SET_HDR(fc, match_param, VLAN_SECOND_CFI_O, + misc_parameters.outer_second_cfi, eth_l2_outer.second_cfi); + HWS_SET_HDR(fc, match_param, VLAN_SECOND_CFI_I, + misc_parameters.inner_second_cfi, eth_l2_inner.second_cfi); + HWS_SET_HDR(fc, match_param, VLAN_SECOND_ID_O, + misc_parameters.outer_second_vid, eth_l2_outer.second_vlan_id); + HWS_SET_HDR(fc, match_param, VLAN_SECOND_ID_I, + misc_parameters.inner_second_vid, eth_l2_inner.second_vlan_id); + + /* L2 Second CVLAN and SVLAN */ + if (HWS_GET_MATCH_PARAM(match_param, misc_parameters.outer_second_cvlan_tag) || + HWS_GET_MATCH_PARAM(match_param, misc_parameters.outer_second_svlan_tag)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VLAN_SECOND_TYPE_O]; + HWS_CALC_HDR_DST(curr_fc, eth_l2_outer.second_vlan_qualifier); + curr_fc->tag_set = &hws_definer_outer_second_vlan_type_set; + curr_fc->tag_mask_set = &hws_definer_ones_set; + } + + if (HWS_GET_MATCH_PARAM(match_param, misc_parameters.inner_second_cvlan_tag) || + HWS_GET_MATCH_PARAM(match_param, misc_parameters.inner_second_svlan_tag)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VLAN_SECOND_TYPE_I]; + HWS_CALC_HDR_DST(curr_fc, eth_l2_inner.second_vlan_qualifier); + curr_fc->tag_set = &hws_definer_inner_second_vlan_type_set; + curr_fc->tag_mask_set = &hws_definer_ones_set; + } + + /* VXLAN VNI */ + if (HWS_GET_MATCH_PARAM(match_param, misc_parameters.vxlan_vni)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN; + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VXLAN_VNI]; + HWS_CALC_HDR(curr_fc, misc_parameters.vxlan_vni, tunnel_header.tunnel_header_1); + curr_fc->bit_mask = __mlx5_mask(header_vxlan, vni); + curr_fc->bit_off = __mlx5_dw_bit_off(header_vxlan, vni); + } + + /* Flex protocol steering ok bits */ + if (HWS_GET_MATCH_PARAM(match_param, misc_parameters.geneve_tlv_option_0_exist)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE; + + if (!caps->flex_parser_ok_bits_supp) { + mlx5hws_err(cd->ctx, "Unsupported flex_parser_ok_bits_supp capability\n"); + return -EOPNOTSUPP; + } + + curr_fc = hws_definer_flex_parser_steering_ok_bits_handler( + cd, caps->flex_parser_id_geneve_tlv_option_0); + if (!curr_fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters.geneve_tlv_option_0_exist); + } + + if (HWS_GET_MATCH_PARAM(match_param, misc_parameters.source_port)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_SOURCE_GVMI]; + HWS_CALC_HDR_DST(curr_fc, source_qp_gvmi.source_gvmi); + curr_fc->tag_mask_set = &hws_definer_ones_set; + curr_fc->tag_set = HWS_IS_FLD_SET(match_param, + misc_parameters.source_eswitch_owner_vhca_id) ? + &hws_definer_set_source_gvmi_vhca_id : + &hws_definer_set_source_gvmi; + } else { + if (HWS_IS_FLD_SET(match_param, misc_parameters.source_eswitch_owner_vhca_id)) { + mlx5hws_err(cd->ctx, + "Unsupported source_eswitch_owner_vhca_id field usage\n"); + return -EOPNOTSUPP; + } + } + + return 0; +} + +static int +hws_definer_conv_misc2(struct mlx5hws_definer_conv_data *cd, + u32 *match_param) +{ + struct mlx5hws_cmd_query_caps *caps = cd->ctx->caps; + struct mlx5hws_definer_fc *fc = cd->fc; + struct mlx5hws_definer_fc *curr_fc; + + if (HWS_IS_FLD_SET_SZ(match_param, misc_parameters_2.reserved_at_1a0, 0x8) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters_2.reserved_at_1b8, 0x8) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters_2.reserved_at_1c0, 0x40) || + HWS_IS_FLD_SET(match_param, misc_parameters_2.macsec_syndrome) || + HWS_IS_FLD_SET(match_param, misc_parameters_2.ipsec_syndrome)) { + mlx5hws_err(cd->ctx, "Unsupported misc2 parameters set\n"); + return -EINVAL; + } + + HWS_SET_HDR(fc, match_param, MPLS0_O, + misc_parameters_2.outer_first_mpls, mpls_outer.mpls0_label); + HWS_SET_HDR(fc, match_param, MPLS0_I, + misc_parameters_2.inner_first_mpls, mpls_inner.mpls0_label); + HWS_SET_HDR(fc, match_param, REG_0, + misc_parameters_2.metadata_reg_c_0, registers.register_c_0); + HWS_SET_HDR(fc, match_param, REG_1, + misc_parameters_2.metadata_reg_c_1, registers.register_c_1); + HWS_SET_HDR(fc, match_param, REG_2, + misc_parameters_2.metadata_reg_c_2, registers.register_c_2); + HWS_SET_HDR(fc, match_param, REG_3, + misc_parameters_2.metadata_reg_c_3, registers.register_c_3); + HWS_SET_HDR(fc, match_param, REG_4, + misc_parameters_2.metadata_reg_c_4, registers.register_c_4); + HWS_SET_HDR(fc, match_param, REG_5, + misc_parameters_2.metadata_reg_c_5, registers.register_c_5); + HWS_SET_HDR(fc, match_param, REG_6, + misc_parameters_2.metadata_reg_c_6, registers.register_c_6); + HWS_SET_HDR(fc, match_param, REG_7, + misc_parameters_2.metadata_reg_c_7, registers.register_c_7); + HWS_SET_HDR(fc, match_param, REG_A, + misc_parameters_2.metadata_reg_a, metadata.general_purpose); + + if (HWS_IS_FLD_SET(match_param, misc_parameters_2.outer_first_mpls_over_gre)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_MPLS_OVER_GRE; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_GRE_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported misc2 first mpls over gre parameters set\n"); + return -EOPNOTSUPP; + } + + curr_fc = hws_definer_flex_parser_handler(cd, caps->flex_parser_id_mpls_over_gre); + if (!curr_fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_2.outer_first_mpls_over_gre); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_2.outer_first_mpls_over_udp)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_MPLS_OVER_UDP; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported misc2 first mpls over udp parameters set\n"); + return -EOPNOTSUPP; + } + + curr_fc = hws_definer_flex_parser_handler(cd, caps->flex_parser_id_mpls_over_udp); + if (!curr_fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_2.outer_first_mpls_over_udp); + } + + return 0; +} + +static int +hws_definer_conv_misc3(struct mlx5hws_definer_conv_data *cd, u32 *match_param) +{ + struct mlx5hws_cmd_query_caps *caps = cd->ctx->caps; + struct mlx5hws_definer_fc *fc = cd->fc; + struct mlx5hws_definer_fc *curr_fc; + bool vxlan_gpe_flex_parser_enabled; + + /* Check reserved and unsupported fields */ + if (HWS_IS_FLD_SET_SZ(match_param, misc_parameters_3.reserved_at_80, 0x8) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters_3.reserved_at_b0, 0x10) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters_3.reserved_at_170, 0x10) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters_3.reserved_at_1e0, 0x20)) { + mlx5hws_err(cd->ctx, "Unsupported misc3 parameters set\n"); + return -EINVAL; + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.inner_tcp_seq_num) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.inner_tcp_ack_num)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TCP_I; + HWS_SET_HDR(fc, match_param, TCP_SEQ_NUM, + misc_parameters_3.inner_tcp_seq_num, tcp_icmp.tcp_seq); + HWS_SET_HDR(fc, match_param, TCP_ACK_NUM, + misc_parameters_3.inner_tcp_ack_num, tcp_icmp.tcp_ack); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.outer_tcp_seq_num) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.outer_tcp_ack_num)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TCP_O; + HWS_SET_HDR(fc, match_param, TCP_SEQ_NUM, + misc_parameters_3.outer_tcp_seq_num, tcp_icmp.tcp_seq); + HWS_SET_HDR(fc, match_param, TCP_ACK_NUM, + misc_parameters_3.outer_tcp_ack_num, tcp_icmp.tcp_ack); + } + + vxlan_gpe_flex_parser_enabled = caps->flex_protocols & MLX5_FLEX_PARSER_VXLAN_GPE_ENABLED; + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.outer_vxlan_gpe_vni)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN_GPE; + + if (!vxlan_gpe_flex_parser_enabled) { + mlx5hws_err(cd->ctx, "Unsupported VXLAN GPE flex parser\n"); + return -EOPNOTSUPP; + } + + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VXLAN_GPE_VNI]; + HWS_CALC_HDR(curr_fc, misc_parameters_3.outer_vxlan_gpe_vni, + tunnel_header.tunnel_header_1); + curr_fc->bit_mask = __mlx5_mask(header_vxlan_gpe, vni); + curr_fc->bit_off = __mlx5_dw_bit_off(header_vxlan_gpe, vni); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.outer_vxlan_gpe_next_protocol)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN_GPE; + + if (!vxlan_gpe_flex_parser_enabled) { + mlx5hws_err(cd->ctx, "Unsupported VXLAN GPE flex parser\n"); + return -EOPNOTSUPP; + } + + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VXLAN_GPE_PROTO]; + HWS_CALC_HDR(curr_fc, misc_parameters_3.outer_vxlan_gpe_next_protocol, + tunnel_header.tunnel_header_0); + curr_fc->byte_off += MLX5_BYTE_OFF(header_vxlan_gpe, protocol); + curr_fc->bit_mask = __mlx5_mask(header_vxlan_gpe, protocol); + curr_fc->bit_off = __mlx5_dw_bit_off(header_vxlan_gpe, protocol); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.outer_vxlan_gpe_flags)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN_GPE; + + if (!vxlan_gpe_flex_parser_enabled) { + mlx5hws_err(cd->ctx, "Unsupported VXLAN GPE flex parser\n"); + return -EOPNOTSUPP; + } + + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_VXLAN_GPE_FLAGS]; + HWS_CALC_HDR(curr_fc, misc_parameters_3.outer_vxlan_gpe_flags, + tunnel_header.tunnel_header_0); + curr_fc->bit_mask = __mlx5_mask(header_vxlan_gpe, flags); + curr_fc->bit_off = __mlx5_dw_bit_off(header_vxlan_gpe, flags); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.icmp_header_data) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.icmp_type) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.icmp_code)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_ICMPV4; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_ICMP_V4_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported ICMPv4 flex parser\n"); + return -EOPNOTSUPP; + } + + HWS_SET_HDR(fc, match_param, ICMP_DW3, + misc_parameters_3.icmp_header_data, tcp_icmp.icmp_dw3); + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.icmp_type) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.icmp_code)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_ICMP_DW1]; + HWS_CALC_HDR_DST(curr_fc, tcp_icmp.icmp_dw1); + curr_fc->tag_set = &hws_definer_icmp_dw1_set; + } + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.icmpv6_header_data) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.icmpv6_type) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.icmpv6_code)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_ICMPV6; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_ICMP_V6_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported ICMPv6 parser\n"); + return -EOPNOTSUPP; + } + + HWS_SET_HDR(fc, match_param, ICMP_DW3, + misc_parameters_3.icmpv6_header_data, tcp_icmp.icmp_dw3); + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.icmpv6_type) || + HWS_IS_FLD_SET(match_param, misc_parameters_3.icmpv6_code)) { + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_ICMP_DW1]; + HWS_CALC_HDR_DST(curr_fc, tcp_icmp.icmp_dw1); + curr_fc->tag_set = &hws_definer_icmpv6_dw1_set; + } + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.geneve_tlv_option_0_data)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE; + + curr_fc = + hws_definer_flex_parser_handler(cd, + caps->flex_parser_id_geneve_tlv_option_0); + if (!curr_fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_3.geneve_tlv_option_0_data); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.gtpu_teid)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_TEID_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported GTPU TEID flex parser\n"); + return -EOPNOTSUPP; + } + + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_GTP_TEID]; + fc->tag_set = &hws_definer_generic_set; + fc->bit_mask = __mlx5_mask(header_gtp, teid); + fc->byte_off = caps->format_select_gtpu_dw_1 * DW_SIZE; + HWS_CALC_HDR_SRC(fc, misc_parameters_3.gtpu_teid); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.gtpu_msg_type)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported GTPU flex parser\n"); + return -EOPNOTSUPP; + } + + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_GTP_MSG_TYPE]; + fc->tag_set = &hws_definer_generic_set; + fc->bit_mask = __mlx5_mask(header_gtp, msg_type); + fc->bit_off = __mlx5_dw_bit_off(header_gtp, msg_type); + fc->byte_off = caps->format_select_gtpu_dw_0 * DW_SIZE; + HWS_CALC_HDR_SRC(fc, misc_parameters_3.gtpu_msg_type); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.gtpu_msg_flags)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported GTPU flex parser\n"); + return -EOPNOTSUPP; + } + + fc = &cd->fc[MLX5HWS_DEFINER_FNAME_GTP_MSG_TYPE]; + fc->tag_set = &hws_definer_generic_set; + fc->bit_mask = __mlx5_mask(header_gtp, msg_flags); + fc->bit_off = __mlx5_dw_bit_off(header_gtp, msg_flags); + fc->byte_off = caps->format_select_gtpu_dw_0 * DW_SIZE; + HWS_CALC_HDR_SRC(fc, misc_parameters_3.gtpu_msg_flags); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.gtpu_dw_2)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_DW_2_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported GTPU DW2 flex parser\n"); + return -EOPNOTSUPP; + } + + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GTPU_DW2]; + curr_fc->tag_set = &hws_definer_generic_set; + curr_fc->bit_mask = -1; + curr_fc->byte_off = caps->format_select_gtpu_dw_2 * DW_SIZE; + HWS_CALC_HDR_SRC(fc, misc_parameters_3.gtpu_dw_2); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.gtpu_first_ext_dw_0)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_FIRST_EXT_DW_0_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported GTPU first EXT DW0 flex parser\n"); + return -EOPNOTSUPP; + } + + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GTPU_FIRST_EXT_DW0]; + curr_fc->tag_set = &hws_definer_generic_set; + curr_fc->bit_mask = -1; + curr_fc->byte_off = caps->format_select_gtpu_ext_dw_0 * DW_SIZE; + HWS_CALC_HDR_SRC(fc, misc_parameters_3.gtpu_first_ext_dw_0); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_3.gtpu_dw_0)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU; + + if (!(caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_DW_0_ENABLED)) { + mlx5hws_err(cd->ctx, "Unsupported GTPU DW0 flex parser\n"); + return -EOPNOTSUPP; + } + + curr_fc = &fc[MLX5HWS_DEFINER_FNAME_GTPU_DW0]; + curr_fc->tag_set = &hws_definer_generic_set; + curr_fc->bit_mask = -1; + curr_fc->byte_off = caps->format_select_gtpu_dw_0 * DW_SIZE; + HWS_CALC_HDR_SRC(fc, misc_parameters_3.gtpu_dw_0); + } + + return 0; +} + +static int +hws_definer_conv_misc4(struct mlx5hws_definer_conv_data *cd, + u32 *match_param) +{ + bool parser_is_used[HWS_NUM_OF_FLEX_PARSERS] = {}; + struct mlx5hws_definer_fc *fc; + u32 id, value; + + if (HWS_IS_FLD_SET_SZ(match_param, misc_parameters_4.reserved_at_100, 0x100)) { + mlx5hws_err(cd->ctx, "Unsupported misc4 parameters set\n"); + return -EINVAL; + } + + id = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_id_0); + value = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_value_0); + fc = hws_definer_misc4_fields_handler(cd, parser_is_used, id, value); + if (!fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_4.prog_sample_field_value_0); + + id = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_id_1); + value = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_value_1); + fc = hws_definer_misc4_fields_handler(cd, parser_is_used, id, value); + if (!fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_4.prog_sample_field_value_1); + + id = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_id_2); + value = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_value_2); + fc = hws_definer_misc4_fields_handler(cd, parser_is_used, id, value); + if (!fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_4.prog_sample_field_value_2); + + id = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_id_3); + value = HWS_GET_MATCH_PARAM(match_param, misc_parameters_4.prog_sample_field_value_3); + fc = hws_definer_misc4_fields_handler(cd, parser_is_used, id, value); + if (!fc) + return -EINVAL; + + HWS_CALC_HDR_SRC(fc, misc_parameters_4.prog_sample_field_value_3); + + return 0; +} + +static int +hws_definer_conv_misc5(struct mlx5hws_definer_conv_data *cd, + u32 *match_param) +{ + struct mlx5hws_definer_fc *fc = cd->fc; + + if (HWS_IS_FLD_SET(match_param, misc_parameters_5.macsec_tag_0) || + HWS_IS_FLD_SET(match_param, misc_parameters_5.macsec_tag_1) || + HWS_IS_FLD_SET(match_param, misc_parameters_5.macsec_tag_2) || + HWS_IS_FLD_SET(match_param, misc_parameters_5.macsec_tag_3) || + HWS_IS_FLD_SET_SZ(match_param, misc_parameters_5.reserved_at_100, 0x100)) { + mlx5hws_err(cd->ctx, "Unsupported misc5 parameters set\n"); + return -EINVAL; + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_5.tunnel_header_0)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_0_1; + HWS_SET_HDR(fc, match_param, TNL_HDR_0, + misc_parameters_5.tunnel_header_0, tunnel_header.tunnel_header_0); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_5.tunnel_header_1)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_0_1; + HWS_SET_HDR(fc, match_param, TNL_HDR_1, + misc_parameters_5.tunnel_header_1, tunnel_header.tunnel_header_1); + } + + if (HWS_IS_FLD_SET(match_param, misc_parameters_5.tunnel_header_2)) { + cd->match_flags |= MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_2; + HWS_SET_HDR(fc, match_param, TNL_HDR_2, + misc_parameters_5.tunnel_header_2, tunnel_header.tunnel_header_2); + } + + HWS_SET_HDR(fc, match_param, TNL_HDR_3, + misc_parameters_5.tunnel_header_3, tunnel_header.tunnel_header_3); + + return 0; +} + +static int hws_definer_get_fc_size(struct mlx5hws_definer_fc *fc) +{ + u32 fc_sz = 0; + int i; + + /* For empty matcher, ZERO_SIZE_PTR is returned */ + if (fc == ZERO_SIZE_PTR) + return 0; + + for (i = 0; i < MLX5HWS_DEFINER_FNAME_MAX; i++) + if (fc[i].tag_set) + fc_sz++; + return fc_sz; +} + +static struct mlx5hws_definer_fc * +hws_definer_alloc_compressed_fc(struct mlx5hws_definer_fc *fc) +{ + struct mlx5hws_definer_fc *compressed_fc = NULL; + u32 definer_size = hws_definer_get_fc_size(fc); + u32 fc_sz = 0; + int i; + + compressed_fc = kcalloc(definer_size, sizeof(*compressed_fc), GFP_KERNEL); + if (!compressed_fc) + return NULL; + + /* For empty matcher, ZERO_SIZE_PTR is returned */ + if (!definer_size) + return compressed_fc; + + for (i = 0, fc_sz = 0; i < MLX5HWS_DEFINER_FNAME_MAX; i++) { + if (!fc[i].tag_set) + continue; + + fc[i].fname = i; + memcpy(&compressed_fc[fc_sz++], &fc[i], sizeof(*compressed_fc)); + } + + return compressed_fc; +} + +static void +hws_definer_set_hl(u8 *hl, struct mlx5hws_definer_fc *fc) +{ + int i; + + /* nothing to do for empty matcher */ + if (fc == ZERO_SIZE_PTR) + return; + + for (i = 0; i < MLX5HWS_DEFINER_FNAME_MAX; i++) { + if (!fc[i].tag_set) + continue; + + HWS_SET32(hl, -1, fc[i].byte_off, fc[i].bit_off, fc[i].bit_mask); + } +} + +static struct mlx5hws_definer_fc * +hws_definer_alloc_fc(struct mlx5hws_context *ctx, + size_t len) +{ + struct mlx5hws_definer_fc *fc; + int i; + + fc = kcalloc(len, sizeof(*fc), GFP_KERNEL); + if (!fc) + return NULL; + + for (i = 0; i < len; i++) + fc[i].ctx = ctx; + + return fc; +} + +static int +hws_definer_conv_match_params_to_hl(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt, + u8 *hl) +{ + struct mlx5hws_definer_conv_data cd = {0}; + struct mlx5hws_definer_fc *fc; + int ret; + + fc = hws_definer_alloc_fc(ctx, MLX5HWS_DEFINER_FNAME_MAX); + if (!fc) + return -ENOMEM; + + cd.fc = fc; + cd.ctx = ctx; + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC6) { + mlx5hws_err(ctx, "Unsupported match_criteria_enable provided\n"); + ret = -EOPNOTSUPP; + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_OUTER) { + ret = hws_definer_conv_outer(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_INNER) { + ret = hws_definer_conv_inner(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC) { + ret = hws_definer_conv_misc(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC2) { + ret = hws_definer_conv_misc2(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC3) { + ret = hws_definer_conv_misc3(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC4) { + ret = hws_definer_conv_misc4(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + if (mt->match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC5) { + ret = hws_definer_conv_misc5(&cd, mt->match_param); + if (ret) + goto err_free_fc; + } + + /* Check there is no conflicted fields set together */ + ret = hws_definer_check_match_flags(&cd); + if (ret) + goto err_free_fc; + + /* Allocate fc array on mt */ + mt->fc = hws_definer_alloc_compressed_fc(fc); + if (!mt->fc) { + mlx5hws_err(ctx, + "Convert match params: failed to set field copy to match template\n"); + ret = -ENOMEM; + goto err_free_fc; + } + mt->fc_sz = hws_definer_get_fc_size(fc); + + /* Fill in headers layout */ + hws_definer_set_hl(hl, fc); + + kfree(fc); + return 0; + +err_free_fc: + kfree(fc); + return ret; +} + +struct mlx5hws_definer_fc * +mlx5hws_definer_conv_match_params_to_compressed_fc(struct mlx5hws_context *ctx, + u8 match_criteria_enable, + u32 *match_param, + int *fc_sz) +{ + struct mlx5hws_definer_fc *compressed_fc = NULL; + struct mlx5hws_definer_conv_data cd = {0}; + struct mlx5hws_definer_fc *fc; + int ret; + + fc = hws_definer_alloc_fc(ctx, MLX5HWS_DEFINER_FNAME_MAX); + if (!fc) + return NULL; + + cd.fc = fc; + cd.ctx = ctx; + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_OUTER) { + ret = hws_definer_conv_outer(&cd, match_param); + if (ret) + goto err_free_fc; + } + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_INNER) { + ret = hws_definer_conv_inner(&cd, match_param); + if (ret) + goto err_free_fc; + } + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC) { + ret = hws_definer_conv_misc(&cd, match_param); + if (ret) + goto err_free_fc; + } + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC2) { + ret = hws_definer_conv_misc2(&cd, match_param); + if (ret) + goto err_free_fc; + } + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC3) { + ret = hws_definer_conv_misc3(&cd, match_param); + if (ret) + goto err_free_fc; + } + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC4) { + ret = hws_definer_conv_misc4(&cd, match_param); + if (ret) + goto err_free_fc; + } + + if (match_criteria_enable & MLX5HWS_DEFINER_MATCH_CRITERIA_MISC5) { + ret = hws_definer_conv_misc5(&cd, match_param); + if (ret) + goto err_free_fc; + } + + /* Allocate fc array on mt */ + compressed_fc = hws_definer_alloc_compressed_fc(fc); + if (!compressed_fc) { + mlx5hws_err(ctx, + "Convert to compressed fc: failed to set field copy to match template\n"); + goto err_free_fc; + } + *fc_sz = hws_definer_get_fc_size(fc); + +err_free_fc: + kfree(fc); + return compressed_fc; +} + +static int +hws_definer_find_byte_in_tag(struct mlx5hws_definer *definer, + u32 hl_byte_off, + u32 *tag_byte_off) +{ + int i, dw_to_scan; + u8 byte_offset; + + /* Avoid accessing unused DW selectors */ + dw_to_scan = mlx5hws_definer_is_jumbo(definer) ? + DW_SELECTORS : DW_SELECTORS_MATCH; + + /* Add offset since each DW covers multiple BYTEs */ + byte_offset = hl_byte_off % DW_SIZE; + for (i = 0; i < dw_to_scan; i++) { + if (definer->dw_selector[i] == hl_byte_off / DW_SIZE) { + *tag_byte_off = byte_offset + DW_SIZE * (DW_SELECTORS - i - 1); + return 0; + } + } + + /* Add offset to skip DWs in definer */ + byte_offset = DW_SIZE * DW_SELECTORS; + /* Iterate in reverse since the code uses bytes from 7 -> 0 */ + for (i = BYTE_SELECTORS; i-- > 0 ;) { + if (definer->byte_selector[i] == hl_byte_off) { + *tag_byte_off = byte_offset + (BYTE_SELECTORS - i - 1); + return 0; + } + } + + return -EINVAL; +} + +static int +hws_definer_fc_bind(struct mlx5hws_definer *definer, + struct mlx5hws_definer_fc *fc, + u32 fc_sz) +{ + u32 tag_offset = 0; + int ret, byte_diff; + u32 i; + + for (i = 0; i < fc_sz; i++) { + /* Map header layout byte offset to byte offset in tag */ + ret = hws_definer_find_byte_in_tag(definer, fc->byte_off, &tag_offset); + if (ret) + return ret; + + /* Move setter based on the location in the definer */ + byte_diff = fc->byte_off % DW_SIZE - tag_offset % DW_SIZE; + fc->bit_off = fc->bit_off + byte_diff * BITS_IN_BYTE; + + /* Update offset in headers layout to offset in tag */ + fc->byte_off = tag_offset; + fc++; + } + + return 0; +} + +static bool +hws_definer_best_hl_fit_recu(struct mlx5hws_definer_sel_ctrl *ctrl, + u32 cur_dw, + u32 *data) +{ + u8 bytes_set; + int byte_idx; + bool ret; + int i; + + /* Reached end, nothing left to do */ + if (cur_dw == MLX5_ST_SZ_DW(definer_hl)) + return true; + + /* No data set, can skip to next DW */ + while (!*data) { + cur_dw++; + data++; + + /* Reached end, nothing left to do */ + if (cur_dw == MLX5_ST_SZ_DW(definer_hl)) + return true; + } + + /* Used all DW selectors and Byte selectors, no possible solution */ + if (ctrl->allowed_full_dw == ctrl->used_full_dw && + ctrl->allowed_lim_dw == ctrl->used_lim_dw && + ctrl->allowed_bytes == ctrl->used_bytes) + return false; + + /* Try to use limited DW selectors */ + if (ctrl->allowed_lim_dw > ctrl->used_lim_dw && cur_dw < 64) { + ctrl->lim_dw_selector[ctrl->used_lim_dw++] = cur_dw; + + ret = hws_definer_best_hl_fit_recu(ctrl, cur_dw + 1, data + 1); + if (ret) + return ret; + + ctrl->lim_dw_selector[--ctrl->used_lim_dw] = 0; + } + + /* Try to use DW selectors */ + if (ctrl->allowed_full_dw > ctrl->used_full_dw) { + ctrl->full_dw_selector[ctrl->used_full_dw++] = cur_dw; + + ret = hws_definer_best_hl_fit_recu(ctrl, cur_dw + 1, data + 1); + if (ret) + return ret; + + ctrl->full_dw_selector[--ctrl->used_full_dw] = 0; + } + + /* No byte selector for offset bigger than 255 */ + if (cur_dw * DW_SIZE > 255) + return false; + + bytes_set = !!(0x000000ff & *data) + + !!(0x0000ff00 & *data) + + !!(0x00ff0000 & *data) + + !!(0xff000000 & *data); + + /* Check if there are enough byte selectors left */ + if (bytes_set + ctrl->used_bytes > ctrl->allowed_bytes) + return false; + + /* Try to use Byte selectors */ + for (i = 0; i < DW_SIZE; i++) + if ((0xff000000 >> (i * BITS_IN_BYTE)) & be32_to_cpu((__force __be32)*data)) { + /* Use byte selectors high to low */ + byte_idx = ctrl->allowed_bytes - ctrl->used_bytes - 1; + ctrl->byte_selector[byte_idx] = cur_dw * DW_SIZE + i; + ctrl->used_bytes++; + } + + ret = hws_definer_best_hl_fit_recu(ctrl, cur_dw + 1, data + 1); + if (ret) + return ret; + + for (i = 0; i < DW_SIZE; i++) + if ((0xff << (i * BITS_IN_BYTE)) & be32_to_cpu((__force __be32)*data)) { + ctrl->used_bytes--; + byte_idx = ctrl->allowed_bytes - ctrl->used_bytes - 1; + ctrl->byte_selector[byte_idx] = 0; + } + + return false; +} + +static void +hws_definer_copy_sel_ctrl(struct mlx5hws_definer_sel_ctrl *ctrl, + struct mlx5hws_definer *definer) +{ + memcpy(definer->byte_selector, ctrl->byte_selector, ctrl->allowed_bytes); + memcpy(definer->dw_selector, ctrl->full_dw_selector, ctrl->allowed_full_dw); + memcpy(definer->dw_selector + ctrl->allowed_full_dw, + ctrl->lim_dw_selector, ctrl->allowed_lim_dw); +} + +static int +hws_definer_find_best_match_fit(struct mlx5hws_context *ctx, + struct mlx5hws_definer *definer, + u8 *hl) +{ + struct mlx5hws_definer_sel_ctrl ctrl = {0}; + bool found; + + /* Try to create a match definer */ + ctrl.allowed_full_dw = DW_SELECTORS_MATCH; + ctrl.allowed_lim_dw = 0; + ctrl.allowed_bytes = BYTE_SELECTORS; + + found = hws_definer_best_hl_fit_recu(&ctrl, 0, (u32 *)hl); + if (found) { + hws_definer_copy_sel_ctrl(&ctrl, definer); + definer->type = MLX5HWS_DEFINER_TYPE_MATCH; + return 0; + } + + /* Try to create a full/limited jumbo definer */ + ctrl.allowed_full_dw = ctx->caps->full_dw_jumbo_support ? DW_SELECTORS : + DW_SELECTORS_MATCH; + ctrl.allowed_lim_dw = ctx->caps->full_dw_jumbo_support ? 0 : + DW_SELECTORS_LIMITED; + ctrl.allowed_bytes = BYTE_SELECTORS; + + found = hws_definer_best_hl_fit_recu(&ctrl, 0, (u32 *)hl); + if (found) { + hws_definer_copy_sel_ctrl(&ctrl, definer); + definer->type = MLX5HWS_DEFINER_TYPE_JUMBO; + return 0; + } + + return E2BIG; +} + +static void +hws_definer_create_tag_mask(u32 *match_param, + struct mlx5hws_definer_fc *fc, + u32 fc_sz, + u8 *tag) +{ + u32 i; + + for (i = 0; i < fc_sz; i++) { + if (fc->tag_mask_set) + fc->tag_mask_set(fc, match_param, tag); + else + fc->tag_set(fc, match_param, tag); + fc++; + } +} + +void mlx5hws_definer_create_tag(u32 *match_param, + struct mlx5hws_definer_fc *fc, + u32 fc_sz, + u8 *tag) +{ + u32 i; + + for (i = 0; i < fc_sz; i++) { + fc->tag_set(fc, match_param, tag); + fc++; + } +} + +int mlx5hws_definer_get_id(struct mlx5hws_definer *definer) +{ + return definer->obj_id; +} + +int mlx5hws_definer_compare(struct mlx5hws_definer *definer_a, + struct mlx5hws_definer *definer_b) +{ + int i; + + /* Future: Optimize by comparing selectors with valid mask only */ + for (i = 0; i < BYTE_SELECTORS; i++) + if (definer_a->byte_selector[i] != definer_b->byte_selector[i]) + return 1; + + for (i = 0; i < DW_SELECTORS; i++) + if (definer_a->dw_selector[i] != definer_b->dw_selector[i]) + return 1; + + for (i = 0; i < MLX5HWS_JUMBO_TAG_SZ; i++) + if (definer_a->mask.jumbo[i] != definer_b->mask.jumbo[i]) + return 1; + + return 0; +} + +int +mlx5hws_definer_calc_layout(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt, + struct mlx5hws_definer *match_definer) +{ + u8 *match_hl; + int ret; + + /* Union header-layout (hl) is used for creating a single definer + * field layout used with different bitmasks for hash and match. + */ + match_hl = kzalloc(MLX5_ST_SZ_BYTES(definer_hl), GFP_KERNEL); + if (!match_hl) + return -ENOMEM; + + /* Convert all mt items to header layout (hl) + * and allocate the match and range field copy array (fc & fcr). + */ + ret = hws_definer_conv_match_params_to_hl(ctx, mt, match_hl); + if (ret) { + mlx5hws_err(ctx, "Failed to convert items to header layout\n"); + goto free_fc; + } + + /* Find the match definer layout for header layout match union */ + ret = hws_definer_find_best_match_fit(ctx, match_definer, match_hl); + if (ret) { + if (ret == E2BIG) + mlx5hws_dbg(ctx, + "Failed to create match definer from header layout - E2BIG\n"); + else + mlx5hws_err(ctx, + "Failed to create match definer from header layout (%d)\n", + ret); + goto free_fc; + } + + kfree(match_hl); + return 0; + +free_fc: + kfree(mt->fc); + + kfree(match_hl); + return ret; +} + +int mlx5hws_definer_init_cache(struct mlx5hws_definer_cache **cache) +{ + struct mlx5hws_definer_cache *new_cache; + + new_cache = kzalloc(sizeof(*new_cache), GFP_KERNEL); + if (!new_cache) + return -ENOMEM; + + INIT_LIST_HEAD(&new_cache->list_head); + *cache = new_cache; + + return 0; +} + +void mlx5hws_definer_uninit_cache(struct mlx5hws_definer_cache *cache) +{ + kfree(cache); +} + +int mlx5hws_definer_get_obj(struct mlx5hws_context *ctx, + struct mlx5hws_definer *definer) +{ + struct mlx5hws_definer_cache *cache = ctx->definer_cache; + struct mlx5hws_cmd_definer_create_attr def_attr = {0}; + struct mlx5hws_definer_cache_item *cached_definer; + u32 obj_id; + int ret; + + /* Search definer cache for requested definer */ + list_for_each_entry(cached_definer, &cache->list_head, list_node) { + if (mlx5hws_definer_compare(&cached_definer->definer, definer)) + continue; + + /* Reuse definer and set LRU (move to be first in the list) */ + list_del_init(&cached_definer->list_node); + list_add(&cached_definer->list_node, &cache->list_head); + cached_definer->refcount++; + return cached_definer->definer.obj_id; + } + + /* Allocate and create definer based on the bitmask tag */ + def_attr.match_mask = definer->mask.jumbo; + def_attr.dw_selector = definer->dw_selector; + def_attr.byte_selector = definer->byte_selector; + + ret = mlx5hws_cmd_definer_create(ctx->mdev, &def_attr, &obj_id); + if (ret) + return -1; + + cached_definer = kzalloc(sizeof(*cached_definer), GFP_KERNEL); + if (!cached_definer) + goto free_definer_obj; + + memcpy(&cached_definer->definer, definer, sizeof(*definer)); + cached_definer->definer.obj_id = obj_id; + cached_definer->refcount = 1; + list_add(&cached_definer->list_node, &cache->list_head); + + return obj_id; + +free_definer_obj: + mlx5hws_cmd_definer_destroy(ctx->mdev, obj_id); + return -1; +} + +static void +hws_definer_put_obj(struct mlx5hws_context *ctx, u32 obj_id) +{ + struct mlx5hws_definer_cache_item *cached_definer; + + list_for_each_entry(cached_definer, &ctx->definer_cache->list_head, list_node) { + if (cached_definer->definer.obj_id != obj_id) + continue; + + /* Object found */ + if (--cached_definer->refcount) + return; + + list_del_init(&cached_definer->list_node); + mlx5hws_cmd_definer_destroy(ctx->mdev, cached_definer->definer.obj_id); + kfree(cached_definer); + return; + } + + /* Programming error, object must be part of cache */ + pr_warn("HWS: failed putting definer object\n"); +} + +static struct mlx5hws_definer * +hws_definer_alloc(struct mlx5hws_context *ctx, + struct mlx5hws_definer_fc *fc, + int fc_sz, + u32 *match_param, + struct mlx5hws_definer *layout, + bool bind_fc) +{ + struct mlx5hws_definer *definer; + int ret; + + definer = kmemdup(layout, sizeof(*definer), GFP_KERNEL); + if (!definer) + return NULL; + + /* Align field copy array based on given layout */ + if (bind_fc) { + ret = hws_definer_fc_bind(definer, fc, fc_sz); + if (ret) { + mlx5hws_err(ctx, "Failed to bind field copy to definer\n"); + goto free_definer; + } + } + + /* Create the tag mask used for definer creation */ + hws_definer_create_tag_mask(match_param, fc, fc_sz, definer->mask.jumbo); + + ret = mlx5hws_definer_get_obj(ctx, definer); + if (ret < 0) + goto free_definer; + + definer->obj_id = ret; + return definer; + +free_definer: + kfree(definer); + return NULL; +} + +void mlx5hws_definer_free(struct mlx5hws_context *ctx, + struct mlx5hws_definer *definer) +{ + hws_definer_put_obj(ctx, definer->obj_id); + kfree(definer); +} + +static int +hws_definer_mt_match_init(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt, + struct mlx5hws_definer *match_layout) +{ + /* Create mandatory match definer */ + mt->definer = hws_definer_alloc(ctx, + mt->fc, + mt->fc_sz, + mt->match_param, + match_layout, + true); + if (!mt->definer) { + mlx5hws_err(ctx, "Failed to create match definer\n"); + return -EINVAL; + } + + return 0; +} + +static void +hws_definer_mt_match_uninit(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt) +{ + mlx5hws_definer_free(ctx, mt->definer); +} + +int mlx5hws_definer_mt_init(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt) +{ + struct mlx5hws_definer match_layout = {0}; + int ret; + + ret = mlx5hws_definer_calc_layout(ctx, mt, &match_layout); + if (ret) { + mlx5hws_err(ctx, "Failed to calculate matcher definer layout\n"); + return ret; + } + + /* Calculate definers needed for exact match */ + ret = hws_definer_mt_match_init(ctx, mt, &match_layout); + if (ret) { + mlx5hws_err(ctx, "Failed to init match definers\n"); + goto free_fc; + } + + return 0; + +free_fc: + kfree(mt->fc); + return ret; +} + +void mlx5hws_definer_mt_uninit(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt) +{ + hws_definer_mt_match_uninit(ctx, mt); + kfree(mt->fc); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h new file mode 100644 index 0000000000000..2f6a7df4021c8 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h @@ -0,0 +1,834 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_DEFINER_H_ +#define MLX5HWS_DEFINER_H_ + +/* Max available selecotrs */ +#define DW_SELECTORS 9 +#define BYTE_SELECTORS 8 + +/* Selectors based on match TAG */ +#define DW_SELECTORS_MATCH 6 +#define DW_SELECTORS_LIMITED 3 + +/* Selectors based on range TAG */ +#define DW_SELECTORS_RANGE 2 +#define BYTE_SELECTORS_RANGE 8 + +#define HWS_NUM_OF_FLEX_PARSERS 8 + +enum mlx5hws_definer_fname { + MLX5HWS_DEFINER_FNAME_ETH_SMAC_47_16_O, + MLX5HWS_DEFINER_FNAME_ETH_SMAC_47_16_I, + MLX5HWS_DEFINER_FNAME_ETH_SMAC_15_0_O, + MLX5HWS_DEFINER_FNAME_ETH_SMAC_15_0_I, + MLX5HWS_DEFINER_FNAME_ETH_DMAC_47_16_O, + MLX5HWS_DEFINER_FNAME_ETH_DMAC_47_16_I, + MLX5HWS_DEFINER_FNAME_ETH_DMAC_15_0_O, + MLX5HWS_DEFINER_FNAME_ETH_DMAC_15_0_I, + MLX5HWS_DEFINER_FNAME_ETH_TYPE_O, + MLX5HWS_DEFINER_FNAME_ETH_TYPE_I, + MLX5HWS_DEFINER_FNAME_ETH_L3_TYPE_O, + MLX5HWS_DEFINER_FNAME_ETH_L3_TYPE_I, + MLX5HWS_DEFINER_FNAME_VLAN_TYPE_O, + MLX5HWS_DEFINER_FNAME_VLAN_TYPE_I, + MLX5HWS_DEFINER_FNAME_VLAN_FIRST_PRIO_O, + MLX5HWS_DEFINER_FNAME_VLAN_FIRST_PRIO_I, + MLX5HWS_DEFINER_FNAME_VLAN_CFI_O, + MLX5HWS_DEFINER_FNAME_VLAN_CFI_I, + MLX5HWS_DEFINER_FNAME_VLAN_ID_O, + MLX5HWS_DEFINER_FNAME_VLAN_ID_I, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_TYPE_O, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_TYPE_I, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_PRIO_O, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_PRIO_I, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_CFI_O, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_CFI_I, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_ID_O, + MLX5HWS_DEFINER_FNAME_VLAN_SECOND_ID_I, + MLX5HWS_DEFINER_FNAME_IPV4_IHL_O, + MLX5HWS_DEFINER_FNAME_IPV4_IHL_I, + MLX5HWS_DEFINER_FNAME_IP_DSCP_O, + MLX5HWS_DEFINER_FNAME_IP_DSCP_I, + MLX5HWS_DEFINER_FNAME_IP_ECN_O, + MLX5HWS_DEFINER_FNAME_IP_ECN_I, + MLX5HWS_DEFINER_FNAME_IP_TTL_O, + MLX5HWS_DEFINER_FNAME_IP_TTL_I, + MLX5HWS_DEFINER_FNAME_IPV4_DST_O, + MLX5HWS_DEFINER_FNAME_IPV4_DST_I, + MLX5HWS_DEFINER_FNAME_IPV4_SRC_O, + MLX5HWS_DEFINER_FNAME_IPV4_SRC_I, + MLX5HWS_DEFINER_FNAME_IP_VERSION_O, + MLX5HWS_DEFINER_FNAME_IP_VERSION_I, + MLX5HWS_DEFINER_FNAME_IP_FRAG_O, + MLX5HWS_DEFINER_FNAME_IP_FRAG_I, + MLX5HWS_DEFINER_FNAME_IP_LEN_O, + MLX5HWS_DEFINER_FNAME_IP_LEN_I, + MLX5HWS_DEFINER_FNAME_IP_TOS_O, + MLX5HWS_DEFINER_FNAME_IP_TOS_I, + MLX5HWS_DEFINER_FNAME_IPV6_FLOW_LABEL_O, + MLX5HWS_DEFINER_FNAME_IPV6_FLOW_LABEL_I, + MLX5HWS_DEFINER_FNAME_IPV6_DST_127_96_O, + MLX5HWS_DEFINER_FNAME_IPV6_DST_95_64_O, + MLX5HWS_DEFINER_FNAME_IPV6_DST_63_32_O, + MLX5HWS_DEFINER_FNAME_IPV6_DST_31_0_O, + MLX5HWS_DEFINER_FNAME_IPV6_DST_127_96_I, + MLX5HWS_DEFINER_FNAME_IPV6_DST_95_64_I, + MLX5HWS_DEFINER_FNAME_IPV6_DST_63_32_I, + MLX5HWS_DEFINER_FNAME_IPV6_DST_31_0_I, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_127_96_O, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_95_64_O, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_63_32_O, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_31_0_O, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_127_96_I, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_95_64_I, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_63_32_I, + MLX5HWS_DEFINER_FNAME_IPV6_SRC_31_0_I, + MLX5HWS_DEFINER_FNAME_IP_PROTOCOL_O, + MLX5HWS_DEFINER_FNAME_IP_PROTOCOL_I, + MLX5HWS_DEFINER_FNAME_L4_SPORT_O, + MLX5HWS_DEFINER_FNAME_L4_SPORT_I, + MLX5HWS_DEFINER_FNAME_L4_DPORT_O, + MLX5HWS_DEFINER_FNAME_L4_DPORT_I, + MLX5HWS_DEFINER_FNAME_TCP_FLAGS_I, + MLX5HWS_DEFINER_FNAME_TCP_FLAGS_O, + MLX5HWS_DEFINER_FNAME_TCP_SEQ_NUM, + MLX5HWS_DEFINER_FNAME_TCP_ACK_NUM, + MLX5HWS_DEFINER_FNAME_GTP_TEID, + MLX5HWS_DEFINER_FNAME_GTP_MSG_TYPE, + MLX5HWS_DEFINER_FNAME_GTP_EXT_FLAG, + MLX5HWS_DEFINER_FNAME_GTP_NEXT_EXT_HDR, + MLX5HWS_DEFINER_FNAME_GTP_EXT_HDR_PDU, + MLX5HWS_DEFINER_FNAME_GTP_EXT_HDR_QFI, + MLX5HWS_DEFINER_FNAME_GTPU_DW0, + MLX5HWS_DEFINER_FNAME_GTPU_FIRST_EXT_DW0, + MLX5HWS_DEFINER_FNAME_GTPU_DW2, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_0, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_1, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_2, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_3, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_4, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_5, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_6, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER_7, + MLX5HWS_DEFINER_FNAME_VPORT_REG_C_0, + MLX5HWS_DEFINER_FNAME_VXLAN_FLAGS, + MLX5HWS_DEFINER_FNAME_VXLAN_VNI, + MLX5HWS_DEFINER_FNAME_VXLAN_GPE_FLAGS, + MLX5HWS_DEFINER_FNAME_VXLAN_GPE_RSVD0, + MLX5HWS_DEFINER_FNAME_VXLAN_GPE_PROTO, + MLX5HWS_DEFINER_FNAME_VXLAN_GPE_VNI, + MLX5HWS_DEFINER_FNAME_VXLAN_GPE_RSVD1, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_LEN, + MLX5HWS_DEFINER_FNAME_GENEVE_OAM, + MLX5HWS_DEFINER_FNAME_GENEVE_PROTO, + MLX5HWS_DEFINER_FNAME_GENEVE_VNI, + MLX5HWS_DEFINER_FNAME_SOURCE_QP, + MLX5HWS_DEFINER_FNAME_SOURCE_GVMI, + MLX5HWS_DEFINER_FNAME_REG_0, + MLX5HWS_DEFINER_FNAME_REG_1, + MLX5HWS_DEFINER_FNAME_REG_2, + MLX5HWS_DEFINER_FNAME_REG_3, + MLX5HWS_DEFINER_FNAME_REG_4, + MLX5HWS_DEFINER_FNAME_REG_5, + MLX5HWS_DEFINER_FNAME_REG_6, + MLX5HWS_DEFINER_FNAME_REG_7, + MLX5HWS_DEFINER_FNAME_REG_8, + MLX5HWS_DEFINER_FNAME_REG_9, + MLX5HWS_DEFINER_FNAME_REG_10, + MLX5HWS_DEFINER_FNAME_REG_11, + MLX5HWS_DEFINER_FNAME_REG_A, + MLX5HWS_DEFINER_FNAME_REG_B, + MLX5HWS_DEFINER_FNAME_GRE_KEY_PRESENT, + MLX5HWS_DEFINER_FNAME_GRE_C, + MLX5HWS_DEFINER_FNAME_GRE_K, + MLX5HWS_DEFINER_FNAME_GRE_S, + MLX5HWS_DEFINER_FNAME_GRE_PROTOCOL, + MLX5HWS_DEFINER_FNAME_GRE_OPT_KEY, + MLX5HWS_DEFINER_FNAME_GRE_OPT_SEQ, + MLX5HWS_DEFINER_FNAME_GRE_OPT_CHECKSUM, + MLX5HWS_DEFINER_FNAME_INTEGRITY_O, + MLX5HWS_DEFINER_FNAME_INTEGRITY_I, + MLX5HWS_DEFINER_FNAME_ICMP_DW1, + MLX5HWS_DEFINER_FNAME_ICMP_DW2, + MLX5HWS_DEFINER_FNAME_ICMP_DW3, + MLX5HWS_DEFINER_FNAME_IPSEC_SPI, + MLX5HWS_DEFINER_FNAME_IPSEC_SEQUENCE_NUMBER, + MLX5HWS_DEFINER_FNAME_IPSEC_SYNDROME, + MLX5HWS_DEFINER_FNAME_MPLS0_O, + MLX5HWS_DEFINER_FNAME_MPLS1_O, + MLX5HWS_DEFINER_FNAME_MPLS2_O, + MLX5HWS_DEFINER_FNAME_MPLS3_O, + MLX5HWS_DEFINER_FNAME_MPLS4_O, + MLX5HWS_DEFINER_FNAME_MPLS0_I, + MLX5HWS_DEFINER_FNAME_MPLS1_I, + MLX5HWS_DEFINER_FNAME_MPLS2_I, + MLX5HWS_DEFINER_FNAME_MPLS3_I, + MLX5HWS_DEFINER_FNAME_MPLS4_I, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER0_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER1_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER2_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER3_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER4_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER5_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER6_OK, + MLX5HWS_DEFINER_FNAME_FLEX_PARSER7_OK, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS0_O, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS1_O, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS2_O, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS3_O, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS4_O, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS0_I, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS1_I, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS2_I, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS3_I, + MLX5HWS_DEFINER_FNAME_OKS2_MPLS4_I, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_0, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_1, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_2, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_3, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_4, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_5, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_6, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_OK_7, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_0, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_1, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_2, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_3, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_4, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_5, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_6, + MLX5HWS_DEFINER_FNAME_GENEVE_OPT_DW_7, + MLX5HWS_DEFINER_FNAME_IB_L4_OPCODE, + MLX5HWS_DEFINER_FNAME_IB_L4_QPN, + MLX5HWS_DEFINER_FNAME_IB_L4_A, + MLX5HWS_DEFINER_FNAME_RANDOM_NUM, + MLX5HWS_DEFINER_FNAME_PTYPE_L2_O, + MLX5HWS_DEFINER_FNAME_PTYPE_L2_I, + MLX5HWS_DEFINER_FNAME_PTYPE_L3_O, + MLX5HWS_DEFINER_FNAME_PTYPE_L3_I, + MLX5HWS_DEFINER_FNAME_PTYPE_L4_O, + MLX5HWS_DEFINER_FNAME_PTYPE_L4_I, + MLX5HWS_DEFINER_FNAME_PTYPE_L4_EXT_O, + MLX5HWS_DEFINER_FNAME_PTYPE_L4_EXT_I, + MLX5HWS_DEFINER_FNAME_PTYPE_FRAG_O, + MLX5HWS_DEFINER_FNAME_PTYPE_FRAG_I, + MLX5HWS_DEFINER_FNAME_TNL_HDR_0, + MLX5HWS_DEFINER_FNAME_TNL_HDR_1, + MLX5HWS_DEFINER_FNAME_TNL_HDR_2, + MLX5HWS_DEFINER_FNAME_TNL_HDR_3, + MLX5HWS_DEFINER_FNAME_MAX, +}; + +enum mlx5hws_definer_match_criteria { + MLX5HWS_DEFINER_MATCH_CRITERIA_EMPTY = 0, + MLX5HWS_DEFINER_MATCH_CRITERIA_OUTER = 1 << 0, + MLX5HWS_DEFINER_MATCH_CRITERIA_MISC = 1 << 1, + MLX5HWS_DEFINER_MATCH_CRITERIA_INNER = 1 << 2, + MLX5HWS_DEFINER_MATCH_CRITERIA_MISC2 = 1 << 3, + MLX5HWS_DEFINER_MATCH_CRITERIA_MISC3 = 1 << 4, + MLX5HWS_DEFINER_MATCH_CRITERIA_MISC4 = 1 << 5, + MLX5HWS_DEFINER_MATCH_CRITERIA_MISC5 = 1 << 6, + MLX5HWS_DEFINER_MATCH_CRITERIA_MISC6 = 1 << 7, +}; + +enum mlx5hws_definer_type { + MLX5HWS_DEFINER_TYPE_MATCH, + MLX5HWS_DEFINER_TYPE_JUMBO, +}; + +enum mlx5hws_definer_match_flag { + MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN_GPE = 1 << 0, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GENEVE = 1 << 1, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GTPU = 1 << 2, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE = 1 << 3, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_VXLAN = 1 << 4, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_0_1 = 1 << 5, + + MLX5HWS_DEFINER_MATCH_FLAG_TNL_GRE_OPT_KEY = 1 << 6, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_HEADER_2 = 1 << 7, + + MLX5HWS_DEFINER_MATCH_FLAG_TNL_MPLS_OVER_GRE = 1 << 8, + MLX5HWS_DEFINER_MATCH_FLAG_TNL_MPLS_OVER_UDP = 1 << 9, + + MLX5HWS_DEFINER_MATCH_FLAG_ICMPV4 = 1 << 10, + MLX5HWS_DEFINER_MATCH_FLAG_ICMPV6 = 1 << 11, + MLX5HWS_DEFINER_MATCH_FLAG_TCP_O = 1 << 12, + MLX5HWS_DEFINER_MATCH_FLAG_TCP_I = 1 << 13, +}; + +struct mlx5hws_definer_fc { + struct mlx5hws_context *ctx; + /* Source */ + u32 s_byte_off; + int s_bit_off; + u32 s_bit_mask; + /* Destination */ + u32 byte_off; + int bit_off; + u32 bit_mask; + enum mlx5hws_definer_fname fname; + void (*tag_set)(struct mlx5hws_definer_fc *fc, + void *mach_param, + u8 *tag); + void (*tag_mask_set)(struct mlx5hws_definer_fc *fc, + void *mach_param, + u8 *tag); +}; + +struct mlx5_ifc_definer_hl_eth_l2_bits { + u8 dmac_47_16[0x20]; + u8 dmac_15_0[0x10]; + u8 l3_ethertype[0x10]; + u8 reserved_at_40[0x1]; + u8 sx_sniffer[0x1]; + u8 functional_lb[0x1]; + u8 ip_fragmented[0x1]; + u8 qp_type[0x2]; + u8 encap_type[0x2]; + u8 port_number[0x2]; + u8 l3_type[0x2]; + u8 l4_type_bwc[0x2]; + u8 first_vlan_qualifier[0x2]; + u8 first_priority[0x3]; + u8 first_cfi[0x1]; + u8 first_vlan_id[0xc]; + u8 l4_type[0x4]; + u8 reserved_at_64[0x2]; + u8 ipsec_layer[0x2]; + u8 l2_type[0x2]; + u8 force_lb[0x1]; + u8 l2_ok[0x1]; + u8 l3_ok[0x1]; + u8 l4_ok[0x1]; + u8 second_vlan_qualifier[0x2]; + u8 second_priority[0x3]; + u8 second_cfi[0x1]; + u8 second_vlan_id[0xc]; +}; + +struct mlx5_ifc_definer_hl_eth_l2_src_bits { + u8 smac_47_16[0x20]; + u8 smac_15_0[0x10]; + u8 loopback_syndrome[0x8]; + u8 l3_type[0x2]; + u8 l4_type_bwc[0x2]; + u8 first_vlan_qualifier[0x2]; + u8 ip_fragmented[0x1]; + u8 functional_lb[0x1]; +}; + +struct mlx5_ifc_definer_hl_ib_l2_bits { + u8 sx_sniffer[0x1]; + u8 force_lb[0x1]; + u8 functional_lb[0x1]; + u8 reserved_at_3[0x3]; + u8 port_number[0x2]; + u8 sl[0x4]; + u8 qp_type[0x2]; + u8 lnh[0x2]; + u8 dlid[0x10]; + u8 vl[0x4]; + u8 lrh_packet_length[0xc]; + u8 slid[0x10]; +}; + +struct mlx5_ifc_definer_hl_eth_l3_bits { + u8 ip_version[0x4]; + u8 ihl[0x4]; + union { + u8 tos[0x8]; + struct { + u8 dscp[0x6]; + u8 ecn[0x2]; + }; + }; + u8 time_to_live_hop_limit[0x8]; + u8 protocol_next_header[0x8]; + u8 identification[0x10]; + union { + u8 ipv4_frag[0x10]; + struct { + u8 flags[0x3]; + u8 fragment_offset[0xd]; + }; + }; + u8 ipv4_total_length[0x10]; + u8 checksum[0x10]; + u8 reserved_at_60[0xc]; + u8 flow_label[0x14]; + u8 packet_length[0x10]; + u8 ipv6_payload_length[0x10]; +}; + +struct mlx5_ifc_definer_hl_eth_l4_bits { + u8 source_port[0x10]; + u8 destination_port[0x10]; + u8 data_offset[0x4]; + u8 l4_ok[0x1]; + u8 l3_ok[0x1]; + u8 ip_fragmented[0x1]; + u8 tcp_ns[0x1]; + union { + u8 tcp_flags[0x8]; + struct { + u8 tcp_cwr[0x1]; + u8 tcp_ece[0x1]; + u8 tcp_urg[0x1]; + u8 tcp_ack[0x1]; + u8 tcp_psh[0x1]; + u8 tcp_rst[0x1]; + u8 tcp_syn[0x1]; + u8 tcp_fin[0x1]; + }; + }; + u8 first_fragment[0x1]; + u8 reserved_at_31[0xf]; +}; + +struct mlx5_ifc_definer_hl_src_qp_gvmi_bits { + u8 loopback_syndrome[0x8]; + u8 l3_type[0x2]; + u8 l4_type_bwc[0x2]; + u8 first_vlan_qualifier[0x2]; + u8 reserved_at_e[0x1]; + u8 functional_lb[0x1]; + u8 source_gvmi[0x10]; + u8 force_lb[0x1]; + u8 ip_fragmented[0x1]; + u8 source_is_requestor[0x1]; + u8 reserved_at_23[0x5]; + u8 source_qp[0x18]; +}; + +struct mlx5_ifc_definer_hl_ib_l4_bits { + u8 opcode[0x8]; + u8 qp[0x18]; + u8 se[0x1]; + u8 migreq[0x1]; + u8 ackreq[0x1]; + u8 fecn[0x1]; + u8 becn[0x1]; + u8 bth[0x1]; + u8 deth[0x1]; + u8 dcceth[0x1]; + u8 reserved_at_28[0x2]; + u8 pad_count[0x2]; + u8 tver[0x4]; + u8 p_key[0x10]; + u8 reserved_at_40[0x8]; + u8 deth_source_qp[0x18]; +}; + +enum mlx5hws_integrity_ok1_bits { + MLX5HWS_DEFINER_OKS1_FIRST_L4_OK = 24, + MLX5HWS_DEFINER_OKS1_FIRST_L3_OK = 25, + MLX5HWS_DEFINER_OKS1_SECOND_L4_OK = 26, + MLX5HWS_DEFINER_OKS1_SECOND_L3_OK = 27, + MLX5HWS_DEFINER_OKS1_FIRST_L4_CSUM_OK = 28, + MLX5HWS_DEFINER_OKS1_FIRST_IPV4_CSUM_OK = 29, + MLX5HWS_DEFINER_OKS1_SECOND_L4_CSUM_OK = 30, + MLX5HWS_DEFINER_OKS1_SECOND_IPV4_CSUM_OK = 31, +}; + +struct mlx5_ifc_definer_hl_oks1_bits { + union { + u8 oks1_bits[0x20]; + struct { + u8 second_ipv4_checksum_ok[0x1]; + u8 second_l4_checksum_ok[0x1]; + u8 first_ipv4_checksum_ok[0x1]; + u8 first_l4_checksum_ok[0x1]; + u8 second_l3_ok[0x1]; + u8 second_l4_ok[0x1]; + u8 first_l3_ok[0x1]; + u8 first_l4_ok[0x1]; + u8 flex_parser7_steering_ok[0x1]; + u8 flex_parser6_steering_ok[0x1]; + u8 flex_parser5_steering_ok[0x1]; + u8 flex_parser4_steering_ok[0x1]; + u8 flex_parser3_steering_ok[0x1]; + u8 flex_parser2_steering_ok[0x1]; + u8 flex_parser1_steering_ok[0x1]; + u8 flex_parser0_steering_ok[0x1]; + u8 second_ipv6_extension_header_vld[0x1]; + u8 first_ipv6_extension_header_vld[0x1]; + u8 l3_tunneling_ok[0x1]; + u8 l2_tunneling_ok[0x1]; + u8 second_tcp_ok[0x1]; + u8 second_udp_ok[0x1]; + u8 second_ipv4_ok[0x1]; + u8 second_ipv6_ok[0x1]; + u8 second_l2_ok[0x1]; + u8 vxlan_ok[0x1]; + u8 gre_ok[0x1]; + u8 first_tcp_ok[0x1]; + u8 first_udp_ok[0x1]; + u8 first_ipv4_ok[0x1]; + u8 first_ipv6_ok[0x1]; + u8 first_l2_ok[0x1]; + }; + }; +}; + +struct mlx5_ifc_definer_hl_oks2_bits { + u8 reserved_at_0[0xa]; + u8 second_mpls_ok[0x1]; + u8 second_mpls4_s_bit[0x1]; + u8 second_mpls4_qualifier[0x1]; + u8 second_mpls3_s_bit[0x1]; + u8 second_mpls3_qualifier[0x1]; + u8 second_mpls2_s_bit[0x1]; + u8 second_mpls2_qualifier[0x1]; + u8 second_mpls1_s_bit[0x1]; + u8 second_mpls1_qualifier[0x1]; + u8 second_mpls0_s_bit[0x1]; + u8 second_mpls0_qualifier[0x1]; + u8 first_mpls_ok[0x1]; + u8 first_mpls4_s_bit[0x1]; + u8 first_mpls4_qualifier[0x1]; + u8 first_mpls3_s_bit[0x1]; + u8 first_mpls3_qualifier[0x1]; + u8 first_mpls2_s_bit[0x1]; + u8 first_mpls2_qualifier[0x1]; + u8 first_mpls1_s_bit[0x1]; + u8 first_mpls1_qualifier[0x1]; + u8 first_mpls0_s_bit[0x1]; + u8 first_mpls0_qualifier[0x1]; +}; + +struct mlx5_ifc_definer_hl_voq_bits { + u8 reserved_at_0[0x18]; + u8 ecn_ok[0x1]; + u8 congestion[0x1]; + u8 profile[0x2]; + u8 internal_prio[0x4]; +}; + +struct mlx5_ifc_definer_hl_ipv4_src_dst_bits { + u8 source_address[0x20]; + u8 destination_address[0x20]; +}; + +struct mlx5_ifc_definer_hl_random_number_bits { + u8 random_number[0x10]; + u8 reserved[0x10]; +}; + +struct mlx5_ifc_definer_hl_ipv6_addr_bits { + u8 ipv6_address_127_96[0x20]; + u8 ipv6_address_95_64[0x20]; + u8 ipv6_address_63_32[0x20]; + u8 ipv6_address_31_0[0x20]; +}; + +struct mlx5_ifc_definer_tcp_icmp_header_bits { + union { + struct { + u8 icmp_dw1[0x20]; + u8 icmp_dw2[0x20]; + u8 icmp_dw3[0x20]; + }; + struct { + u8 tcp_seq[0x20]; + u8 tcp_ack[0x20]; + u8 tcp_win_urg[0x20]; + }; + }; +}; + +struct mlx5_ifc_definer_hl_tunnel_header_bits { + u8 tunnel_header_0[0x20]; + u8 tunnel_header_1[0x20]; + u8 tunnel_header_2[0x20]; + u8 tunnel_header_3[0x20]; +}; + +struct mlx5_ifc_definer_hl_ipsec_bits { + u8 spi[0x20]; + u8 sequence_number[0x20]; + u8 reserved[0x10]; + u8 ipsec_syndrome[0x8]; + u8 next_header[0x8]; +}; + +struct mlx5_ifc_definer_hl_metadata_bits { + u8 metadata_to_cqe[0x20]; + u8 general_purpose[0x20]; + u8 acomulated_hash[0x20]; +}; + +struct mlx5_ifc_definer_hl_flex_parser_bits { + u8 flex_parser_7[0x20]; + u8 flex_parser_6[0x20]; + u8 flex_parser_5[0x20]; + u8 flex_parser_4[0x20]; + u8 flex_parser_3[0x20]; + u8 flex_parser_2[0x20]; + u8 flex_parser_1[0x20]; + u8 flex_parser_0[0x20]; +}; + +struct mlx5_ifc_definer_hl_registers_bits { + u8 register_c_10[0x20]; + u8 register_c_11[0x20]; + u8 register_c_8[0x20]; + u8 register_c_9[0x20]; + u8 register_c_6[0x20]; + u8 register_c_7[0x20]; + u8 register_c_4[0x20]; + u8 register_c_5[0x20]; + u8 register_c_2[0x20]; + u8 register_c_3[0x20]; + u8 register_c_0[0x20]; + u8 register_c_1[0x20]; +}; + +struct mlx5_ifc_definer_hl_mpls_bits { + u8 mpls0_label[0x20]; + u8 mpls1_label[0x20]; + u8 mpls2_label[0x20]; + u8 mpls3_label[0x20]; + u8 mpls4_label[0x20]; +}; + +struct mlx5_ifc_definer_hl_bits { + struct mlx5_ifc_definer_hl_eth_l2_bits eth_l2_outer; + struct mlx5_ifc_definer_hl_eth_l2_bits eth_l2_inner; + struct mlx5_ifc_definer_hl_eth_l2_src_bits eth_l2_src_outer; + struct mlx5_ifc_definer_hl_eth_l2_src_bits eth_l2_src_inner; + struct mlx5_ifc_definer_hl_ib_l2_bits ib_l2; + struct mlx5_ifc_definer_hl_eth_l3_bits eth_l3_outer; + struct mlx5_ifc_definer_hl_eth_l3_bits eth_l3_inner; + struct mlx5_ifc_definer_hl_eth_l4_bits eth_l4_outer; + struct mlx5_ifc_definer_hl_eth_l4_bits eth_l4_inner; + struct mlx5_ifc_definer_hl_src_qp_gvmi_bits source_qp_gvmi; + struct mlx5_ifc_definer_hl_ib_l4_bits ib_l4; + struct mlx5_ifc_definer_hl_oks1_bits oks1; + struct mlx5_ifc_definer_hl_oks2_bits oks2; + struct mlx5_ifc_definer_hl_voq_bits voq; + u8 reserved_at_480[0x380]; + struct mlx5_ifc_definer_hl_ipv4_src_dst_bits ipv4_src_dest_outer; + struct mlx5_ifc_definer_hl_ipv4_src_dst_bits ipv4_src_dest_inner; + struct mlx5_ifc_definer_hl_ipv6_addr_bits ipv6_dst_outer; + struct mlx5_ifc_definer_hl_ipv6_addr_bits ipv6_dst_inner; + struct mlx5_ifc_definer_hl_ipv6_addr_bits ipv6_src_outer; + struct mlx5_ifc_definer_hl_ipv6_addr_bits ipv6_src_inner; + u8 unsupported_dest_ib_l3[0x80]; + u8 unsupported_source_ib_l3[0x80]; + u8 unsupported_udp_misc_outer[0x20]; + u8 unsupported_udp_misc_inner[0x20]; + struct mlx5_ifc_definer_tcp_icmp_header_bits tcp_icmp; + struct mlx5_ifc_definer_hl_tunnel_header_bits tunnel_header; + struct mlx5_ifc_definer_hl_mpls_bits mpls_outer; + struct mlx5_ifc_definer_hl_mpls_bits mpls_inner; + u8 unsupported_config_headers_outer[0x80]; + u8 unsupported_config_headers_inner[0x80]; + struct mlx5_ifc_definer_hl_random_number_bits random_number; + struct mlx5_ifc_definer_hl_ipsec_bits ipsec; + struct mlx5_ifc_definer_hl_metadata_bits metadata; + u8 unsupported_utc_timestamp[0x40]; + u8 unsupported_free_running_timestamp[0x40]; + struct mlx5_ifc_definer_hl_flex_parser_bits flex_parser; + struct mlx5_ifc_definer_hl_registers_bits registers; + /* Reserved in case header layout on future HW */ + u8 unsupported_reserved[0xd40]; +}; + +enum mlx5hws_definer_gtp { + MLX5HWS_DEFINER_GTP_EXT_HDR_BIT = 0x04, +}; + +struct mlx5_ifc_header_gtp_bits { + u8 version[0x3]; + u8 proto_type[0x1]; + u8 reserved1[0x1]; + union { + u8 msg_flags[0x3]; + struct { + u8 ext_hdr_flag[0x1]; + u8 seq_num_flag[0x1]; + u8 pdu_flag[0x1]; + }; + }; + u8 msg_type[0x8]; + u8 msg_len[0x8]; + u8 teid[0x20]; +}; + +struct mlx5_ifc_header_opt_gtp_bits { + u8 seq_num[0x10]; + u8 pdu_num[0x8]; + u8 next_ext_hdr_type[0x8]; +}; + +struct mlx5_ifc_header_gtp_psc_bits { + u8 len[0x8]; + u8 pdu_type[0x4]; + u8 flags[0x4]; + u8 qfi[0x8]; + u8 reserved2[0x8]; +}; + +struct mlx5_ifc_header_ipv6_vtc_bits { + u8 version[0x4]; + union { + u8 tos[0x8]; + struct { + u8 dscp[0x6]; + u8 ecn[0x2]; + }; + }; + u8 flow_label[0x14]; +}; + +struct mlx5_ifc_header_ipv6_routing_ext_bits { + u8 next_hdr[0x8]; + u8 hdr_len[0x8]; + u8 type[0x8]; + u8 segments_left[0x8]; + union { + u8 flags[0x20]; + struct { + u8 last_entry[0x8]; + u8 flag[0x8]; + u8 tag[0x10]; + }; + }; +}; + +struct mlx5_ifc_header_vxlan_bits { + u8 flags[0x8]; + u8 reserved1[0x18]; + u8 vni[0x18]; + u8 reserved2[0x8]; +}; + +struct mlx5_ifc_header_vxlan_gpe_bits { + u8 flags[0x8]; + u8 rsvd0[0x10]; + u8 protocol[0x8]; + u8 vni[0x18]; + u8 rsvd1[0x8]; +}; + +struct mlx5_ifc_header_gre_bits { + union { + u8 c_rsvd0_ver[0x10]; + struct { + u8 gre_c_present[0x1]; + u8 reserved_at_1[0x1]; + u8 gre_k_present[0x1]; + u8 gre_s_present[0x1]; + u8 reserved_at_4[0x9]; + u8 version[0x3]; + }; + }; + u8 gre_protocol[0x10]; + u8 checksum[0x10]; + u8 reserved_at_30[0x10]; +}; + +struct mlx5_ifc_header_geneve_bits { + union { + u8 ver_opt_len_o_c_rsvd[0x10]; + struct { + u8 version[0x2]; + u8 opt_len[0x6]; + u8 o_flag[0x1]; + u8 c_flag[0x1]; + u8 reserved_at_a[0x6]; + }; + }; + u8 protocol_type[0x10]; + u8 vni[0x18]; + u8 reserved_at_38[0x8]; +}; + +struct mlx5_ifc_header_geneve_opt_bits { + u8 class[0x10]; + u8 type[0x8]; + u8 reserved[0x3]; + u8 len[0x5]; +}; + +struct mlx5_ifc_header_icmp_bits { + union { + u8 icmp_dw1[0x20]; + struct { + u8 type[0x8]; + u8 code[0x8]; + u8 cksum[0x10]; + }; + }; + union { + u8 icmp_dw2[0x20]; + struct { + u8 ident[0x10]; + u8 seq_nb[0x10]; + }; + }; +}; + +struct mlx5hws_definer { + enum mlx5hws_definer_type type; + u8 dw_selector[DW_SELECTORS]; + u8 byte_selector[BYTE_SELECTORS]; + struct mlx5hws_rule_match_tag mask; + u32 obj_id; +}; + +struct mlx5hws_definer_cache { + struct list_head list_head; +}; + +struct mlx5hws_definer_cache_item { + struct mlx5hws_definer definer; + u32 refcount; + struct list_head list_node; +}; + +static inline bool +mlx5hws_definer_is_jumbo(struct mlx5hws_definer *definer) +{ + return (definer->type == MLX5HWS_DEFINER_TYPE_JUMBO); +} + +void mlx5hws_definer_create_tag(u32 *match_param, + struct mlx5hws_definer_fc *fc, + u32 fc_sz, + u8 *tag); + +int mlx5hws_definer_get_id(struct mlx5hws_definer *definer); + +int mlx5hws_definer_mt_init(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt); + +void mlx5hws_definer_mt_uninit(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt); + +int mlx5hws_definer_init_cache(struct mlx5hws_definer_cache **cache); + +void mlx5hws_definer_uninit_cache(struct mlx5hws_definer_cache *cache); + +int mlx5hws_definer_compare(struct mlx5hws_definer *definer_a, + struct mlx5hws_definer *definer_b); + +int mlx5hws_definer_get_obj(struct mlx5hws_context *ctx, + struct mlx5hws_definer *definer); + +void mlx5hws_definer_free(struct mlx5hws_context *ctx, + struct mlx5hws_definer *definer); + +int mlx5hws_definer_calc_layout(struct mlx5hws_context *ctx, + struct mlx5hws_match_template *mt, + struct mlx5hws_definer *match_definer); + +struct mlx5hws_definer_fc * +mlx5hws_definer_conv_match_params_to_compressed_fc(struct mlx5hws_context *ctx, + u8 match_criteria_enable, + u32 *match_param, + int *fc_sz); + +#endif /* MLX5HWS_DEFINER_H_ */ From b6932715f30a0f7a309b6f9c52e3ae62d56ba026 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:37:18 +0300 Subject: [PATCH 093/548] net/mlx5: HWS, added matchers functionality Matcher object encompasses all the building blocks that are needed in order to perform flow steering of a given flow: - flow table that serves as entering point of this matcher - Rule Table Context (RTC) objects to hold ll the Steering Table Entries (STEs), both for matching the flow and for performing actions - rules that describe the set of matching parameters for a flow and actions to perform in case of a hit. This patch adds implementation of matchers handling in HWS. Reviewed-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 472dd792348f6601ccaa97d5626ee4faff891901) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_matcher.c | 1216 +++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_matcher.h | 107 ++ 2 files changed, 1323 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c new file mode 100644 index 0000000000000..1964261415aa9 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c @@ -0,0 +1,1216 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +enum mlx5hws_matcher_rtc_type { + HWS_MATCHER_RTC_TYPE_MATCH, + HWS_MATCHER_RTC_TYPE_STE_ARRAY, + HWS_MATCHER_RTC_TYPE_MAX, +}; + +static const char * const mlx5hws_matcher_rtc_type_str[] = { + [HWS_MATCHER_RTC_TYPE_MATCH] = "MATCH", + [HWS_MATCHER_RTC_TYPE_STE_ARRAY] = "STE_ARRAY", + [HWS_MATCHER_RTC_TYPE_MAX] = "UNKNOWN", +}; + +static const char *hws_matcher_rtc_type_to_str(enum mlx5hws_matcher_rtc_type rtc_type) +{ + if (rtc_type > HWS_MATCHER_RTC_TYPE_MAX) + rtc_type = HWS_MATCHER_RTC_TYPE_MAX; + return mlx5hws_matcher_rtc_type_str[rtc_type]; +} + +static bool hws_matcher_requires_col_tbl(u8 log_num_of_rules) +{ + /* Collision table concatenation is done only for large rule tables */ + return log_num_of_rules > MLX5HWS_MATCHER_ASSURED_RULES_TH; +} + +static u8 hws_matcher_rules_to_tbl_depth(u8 log_num_of_rules) +{ + if (hws_matcher_requires_col_tbl(log_num_of_rules)) + return MLX5HWS_MATCHER_ASSURED_MAIN_TBL_DEPTH; + + /* For small rule tables we use a single deep table to assure insertion */ + return min(log_num_of_rules, MLX5HWS_MATCHER_ASSURED_COL_TBL_DEPTH); +} + +static void hws_matcher_destroy_end_ft(struct mlx5hws_matcher *matcher) +{ + mlx5hws_table_destroy_default_ft(matcher->tbl, matcher->end_ft_id); +} + +static int hws_matcher_create_end_ft(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_table *tbl = matcher->tbl; + int ret; + + ret = mlx5hws_table_create_default_ft(tbl->ctx->mdev, tbl, &matcher->end_ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to create matcher end flow table\n"); + return ret; + } + return 0; +} + +static int hws_matcher_connect(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_table *tbl = matcher->tbl; + struct mlx5hws_context *ctx = tbl->ctx; + struct mlx5hws_matcher *prev = NULL; + struct mlx5hws_matcher *next = NULL; + struct mlx5hws_matcher *tmp_matcher; + int ret; + + /* Find location in matcher list */ + if (list_empty(&tbl->matchers_list)) { + list_add(&matcher->list_node, &tbl->matchers_list); + goto connect; + } + + list_for_each_entry(tmp_matcher, &tbl->matchers_list, list_node) { + if (tmp_matcher->attr.priority > matcher->attr.priority) { + next = tmp_matcher; + break; + } + prev = tmp_matcher; + } + + if (next) + /* insert before next */ + list_add_tail(&matcher->list_node, &next->list_node); + else + /* insert after prev */ + list_add(&matcher->list_node, &prev->list_node); + +connect: + if (next) { + /* Connect to next RTC */ + ret = mlx5hws_table_ft_set_next_rtc(ctx, + matcher->end_ft_id, + tbl->fw_ft_type, + next->match_ste.rtc_0_id, + next->match_ste.rtc_1_id); + if (ret) { + mlx5hws_err(ctx, "Failed to connect new matcher to next RTC\n"); + goto remove_from_list; + } + } else { + /* Connect last matcher to next miss_tbl if exists */ + ret = mlx5hws_table_connect_to_miss_table(tbl, tbl->default_miss.miss_tbl); + if (ret) { + mlx5hws_err(ctx, "Failed connect new matcher to miss_tbl\n"); + goto remove_from_list; + } + } + + /* Connect to previous FT */ + ret = mlx5hws_table_ft_set_next_rtc(ctx, + prev ? prev->end_ft_id : tbl->ft_id, + tbl->fw_ft_type, + matcher->match_ste.rtc_0_id, + matcher->match_ste.rtc_1_id); + if (ret) { + mlx5hws_err(ctx, "Failed to connect new matcher to previous FT\n"); + goto remove_from_list; + } + + /* Reset prev matcher FT default miss (drop refcount) */ + ret = mlx5hws_table_ft_set_default_next_ft(tbl, prev ? prev->end_ft_id : tbl->ft_id); + if (ret) { + mlx5hws_err(ctx, "Failed to reset matcher ft default miss\n"); + goto remove_from_list; + } + + if (!prev) { + /* Update tables missing to current matcher in the table */ + ret = mlx5hws_table_update_connected_miss_tables(tbl); + if (ret) { + mlx5hws_err(ctx, "Fatal error, failed to update connected miss table\n"); + goto remove_from_list; + } + } + + return 0; + +remove_from_list: + list_del_init(&matcher->list_node); + return ret; +} + +static int hws_matcher_disconnect(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_matcher *next = NULL, *prev = NULL; + struct mlx5hws_table *tbl = matcher->tbl; + u32 prev_ft_id = tbl->ft_id; + int ret; + + if (!list_is_first(&matcher->list_node, &tbl->matchers_list)) { + prev = list_prev_entry(matcher, list_node); + prev_ft_id = prev->end_ft_id; + } + + if (!list_is_last(&matcher->list_node, &tbl->matchers_list)) + next = list_next_entry(matcher, list_node); + + list_del_init(&matcher->list_node); + + if (next) { + /* Connect previous end FT to next RTC */ + ret = mlx5hws_table_ft_set_next_rtc(tbl->ctx, + prev_ft_id, + tbl->fw_ft_type, + next->match_ste.rtc_0_id, + next->match_ste.rtc_1_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to disconnect matcher\n"); + goto matcher_reconnect; + } + } else { + ret = mlx5hws_table_connect_to_miss_table(tbl, tbl->default_miss.miss_tbl); + if (ret) { + mlx5hws_err(tbl->ctx, "Failed to disconnect last matcher\n"); + goto matcher_reconnect; + } + } + + /* Removing first matcher, update connected miss tables if exists */ + if (prev_ft_id == tbl->ft_id) { + ret = mlx5hws_table_update_connected_miss_tables(tbl); + if (ret) { + mlx5hws_err(tbl->ctx, "Fatal error, failed to update connected miss table\n"); + goto matcher_reconnect; + } + } + + ret = mlx5hws_table_ft_set_default_next_ft(tbl, prev_ft_id); + if (ret) { + mlx5hws_err(tbl->ctx, "Fatal error, failed to restore matcher ft default miss\n"); + goto matcher_reconnect; + } + + return 0; + +matcher_reconnect: + if (list_empty(&tbl->matchers_list) || !prev) + list_add(&matcher->list_node, &tbl->matchers_list); + else + /* insert after prev matcher */ + list_add(&matcher->list_node, &prev->list_node); + + return ret; +} + +static void hws_matcher_set_rtc_attr_sz(struct mlx5hws_matcher *matcher, + struct mlx5hws_cmd_rtc_create_attr *rtc_attr, + enum mlx5hws_matcher_rtc_type rtc_type, + bool is_mirror) +{ + struct mlx5hws_pool_chunk *ste = &matcher->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].ste; + enum mlx5hws_matcher_flow_src flow_src = matcher->attr.optimize_flow_src; + bool is_match_rtc = rtc_type == HWS_MATCHER_RTC_TYPE_MATCH; + + if ((flow_src == MLX5HWS_MATCHER_FLOW_SRC_VPORT && !is_mirror) || + (flow_src == MLX5HWS_MATCHER_FLOW_SRC_WIRE && is_mirror)) { + /* Optimize FDB RTC */ + rtc_attr->log_size = 0; + rtc_attr->log_depth = 0; + } else { + /* Keep original values */ + rtc_attr->log_size = is_match_rtc ? matcher->attr.table.sz_row_log : ste->order; + rtc_attr->log_depth = is_match_rtc ? matcher->attr.table.sz_col_log : 0; + } +} + +static int hws_matcher_create_rtc(struct mlx5hws_matcher *matcher, + enum mlx5hws_matcher_rtc_type rtc_type, + u8 action_ste_selector) +{ + struct mlx5hws_matcher_attr *attr = &matcher->attr; + struct mlx5hws_cmd_rtc_create_attr rtc_attr = {0}; + struct mlx5hws_match_template *mt = matcher->mt; + struct mlx5hws_context *ctx = matcher->tbl->ctx; + struct mlx5hws_action_default_stc *default_stc; + struct mlx5hws_matcher_action_ste *action_ste; + struct mlx5hws_table *tbl = matcher->tbl; + struct mlx5hws_pool *ste_pool, *stc_pool; + struct mlx5hws_pool_chunk *ste; + u32 *rtc_0_id, *rtc_1_id; + u32 obj_id; + int ret; + + switch (rtc_type) { + case HWS_MATCHER_RTC_TYPE_MATCH: + rtc_0_id = &matcher->match_ste.rtc_0_id; + rtc_1_id = &matcher->match_ste.rtc_1_id; + ste_pool = matcher->match_ste.pool; + ste = &matcher->match_ste.ste; + ste->order = attr->table.sz_col_log + attr->table.sz_row_log; + + rtc_attr.log_size = attr->table.sz_row_log; + rtc_attr.log_depth = attr->table.sz_col_log; + rtc_attr.is_frst_jumbo = mlx5hws_matcher_mt_is_jumbo(mt); + rtc_attr.is_scnd_range = 0; + rtc_attr.miss_ft_id = matcher->end_ft_id; + + if (attr->insert_mode == MLX5HWS_MATCHER_INSERT_BY_HASH) { + /* The usual Hash Table */ + rtc_attr.update_index_mode = MLX5_IFC_RTC_STE_UPDATE_MODE_BY_HASH; + + /* The first mt is used since all share the same definer */ + rtc_attr.match_definer_0 = mlx5hws_definer_get_id(mt->definer); + } else if (attr->insert_mode == MLX5HWS_MATCHER_INSERT_BY_INDEX) { + rtc_attr.update_index_mode = MLX5_IFC_RTC_STE_UPDATE_MODE_BY_OFFSET; + rtc_attr.num_hash_definer = 1; + + if (attr->distribute_mode == MLX5HWS_MATCHER_DISTRIBUTE_BY_HASH) { + /* Hash Split Table */ + rtc_attr.access_index_mode = MLX5_IFC_RTC_STE_ACCESS_MODE_BY_HASH; + rtc_attr.match_definer_0 = mlx5hws_definer_get_id(mt->definer); + } else if (attr->distribute_mode == MLX5HWS_MATCHER_DISTRIBUTE_BY_LINEAR) { + /* Linear Lookup Table */ + rtc_attr.access_index_mode = MLX5_IFC_RTC_STE_ACCESS_MODE_LINEAR; + rtc_attr.match_definer_0 = ctx->caps->linear_match_definer; + } + } + + /* Match pool requires implicit allocation */ + ret = mlx5hws_pool_chunk_alloc(ste_pool, ste); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate STE for %s RTC", + hws_matcher_rtc_type_to_str(rtc_type)); + return ret; + } + break; + + case HWS_MATCHER_RTC_TYPE_STE_ARRAY: + action_ste = &matcher->action_ste[action_ste_selector]; + + rtc_0_id = &action_ste->rtc_0_id; + rtc_1_id = &action_ste->rtc_1_id; + ste_pool = action_ste->pool; + ste = &action_ste->ste; + ste->order = ilog2(roundup_pow_of_two(action_ste->max_stes)) + + attr->table.sz_row_log; + rtc_attr.log_size = ste->order; + rtc_attr.log_depth = 0; + rtc_attr.update_index_mode = MLX5_IFC_RTC_STE_UPDATE_MODE_BY_OFFSET; + /* The action STEs use the default always hit definer */ + rtc_attr.match_definer_0 = ctx->caps->trivial_match_definer; + rtc_attr.is_frst_jumbo = false; + rtc_attr.miss_ft_id = 0; + break; + + default: + mlx5hws_err(ctx, "HWS Invalid RTC type\n"); + return -EINVAL; + } + + obj_id = mlx5hws_pool_chunk_get_base_id(ste_pool, ste); + + rtc_attr.pd = ctx->pd_num; + rtc_attr.ste_base = obj_id; + rtc_attr.ste_offset = ste->offset; + rtc_attr.reparse_mode = mlx5hws_context_get_reparse_mode(ctx); + rtc_attr.table_type = mlx5hws_table_get_res_fw_ft_type(tbl->type, false); + hws_matcher_set_rtc_attr_sz(matcher, &rtc_attr, rtc_type, false); + + /* STC is a single resource (obj_id), use any STC for the ID */ + stc_pool = ctx->stc_pool[tbl->type]; + default_stc = ctx->common_res[tbl->type].default_stc; + obj_id = mlx5hws_pool_chunk_get_base_id(stc_pool, &default_stc->default_hit); + rtc_attr.stc_base = obj_id; + + ret = mlx5hws_cmd_rtc_create(ctx->mdev, &rtc_attr, rtc_0_id); + if (ret) { + mlx5hws_err(ctx, "Failed to create matcher RTC of type %s", + hws_matcher_rtc_type_to_str(rtc_type)); + goto free_ste; + } + + if (tbl->type == MLX5HWS_TABLE_TYPE_FDB) { + obj_id = mlx5hws_pool_chunk_get_base_mirror_id(ste_pool, ste); + rtc_attr.ste_base = obj_id; + rtc_attr.table_type = mlx5hws_table_get_res_fw_ft_type(tbl->type, true); + + obj_id = mlx5hws_pool_chunk_get_base_mirror_id(stc_pool, &default_stc->default_hit); + rtc_attr.stc_base = obj_id; + hws_matcher_set_rtc_attr_sz(matcher, &rtc_attr, rtc_type, true); + + ret = mlx5hws_cmd_rtc_create(ctx->mdev, &rtc_attr, rtc_1_id); + if (ret) { + mlx5hws_err(ctx, "Failed to create peer matcher RTC of type %s", + hws_matcher_rtc_type_to_str(rtc_type)); + goto destroy_rtc_0; + } + } + + return 0; + +destroy_rtc_0: + mlx5hws_cmd_rtc_destroy(ctx->mdev, *rtc_0_id); +free_ste: + if (rtc_type == HWS_MATCHER_RTC_TYPE_MATCH) + mlx5hws_pool_chunk_free(ste_pool, ste); + return ret; +} + +static void hws_matcher_destroy_rtc(struct mlx5hws_matcher *matcher, + enum mlx5hws_matcher_rtc_type rtc_type, + u8 action_ste_selector) +{ + struct mlx5hws_matcher_action_ste *action_ste; + struct mlx5hws_table *tbl = matcher->tbl; + struct mlx5hws_pool_chunk *ste; + struct mlx5hws_pool *ste_pool; + u32 rtc_0_id, rtc_1_id; + + switch (rtc_type) { + case HWS_MATCHER_RTC_TYPE_MATCH: + rtc_0_id = matcher->match_ste.rtc_0_id; + rtc_1_id = matcher->match_ste.rtc_1_id; + ste_pool = matcher->match_ste.pool; + ste = &matcher->match_ste.ste; + break; + case HWS_MATCHER_RTC_TYPE_STE_ARRAY: + action_ste = &matcher->action_ste[action_ste_selector]; + rtc_0_id = action_ste->rtc_0_id; + rtc_1_id = action_ste->rtc_1_id; + ste_pool = action_ste->pool; + ste = &action_ste->ste; + break; + default: + return; + } + + if (tbl->type == MLX5HWS_TABLE_TYPE_FDB) + mlx5hws_cmd_rtc_destroy(matcher->tbl->ctx->mdev, rtc_1_id); + + mlx5hws_cmd_rtc_destroy(matcher->tbl->ctx->mdev, rtc_0_id); + if (rtc_type == HWS_MATCHER_RTC_TYPE_MATCH) + mlx5hws_pool_chunk_free(ste_pool, ste); +} + +static int +hws_matcher_check_attr_sz(struct mlx5hws_cmd_query_caps *caps, + struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_matcher_attr *attr = &matcher->attr; + + if (attr->table.sz_col_log > caps->rtc_log_depth_max) { + mlx5hws_err(matcher->tbl->ctx, "Matcher depth exceeds limit %d\n", + caps->rtc_log_depth_max); + return -EOPNOTSUPP; + } + + if (attr->table.sz_col_log + attr->table.sz_row_log > caps->ste_alloc_log_max) { + mlx5hws_err(matcher->tbl->ctx, "Total matcher size exceeds limit %d\n", + caps->ste_alloc_log_max); + return -EOPNOTSUPP; + } + + if (attr->table.sz_col_log + attr->table.sz_row_log < caps->ste_alloc_log_gran) { + mlx5hws_err(matcher->tbl->ctx, "Total matcher size below limit %d\n", + caps->ste_alloc_log_gran); + return -EOPNOTSUPP; + } + + return 0; +} + +static void hws_matcher_set_pool_attr(struct mlx5hws_pool_attr *attr, + struct mlx5hws_matcher *matcher) +{ + switch (matcher->attr.optimize_flow_src) { + case MLX5HWS_MATCHER_FLOW_SRC_VPORT: + attr->opt_type = MLX5HWS_POOL_OPTIMIZE_ORIG; + break; + case MLX5HWS_MATCHER_FLOW_SRC_WIRE: + attr->opt_type = MLX5HWS_POOL_OPTIMIZE_MIRROR; + break; + default: + break; + } +} + +static int hws_matcher_check_and_process_at(struct mlx5hws_matcher *matcher, + struct mlx5hws_action_template *at) +{ + struct mlx5hws_context *ctx = matcher->tbl->ctx; + bool valid; + int ret; + + valid = mlx5hws_action_check_combo(ctx, at->action_type_arr, matcher->tbl->type); + if (!valid) { + mlx5hws_err(ctx, "Invalid combination in action template\n"); + return -EINVAL; + } + + /* Process action template to setters */ + ret = mlx5hws_action_template_process(at); + if (ret) { + mlx5hws_err(ctx, "Failed to process action template\n"); + return ret; + } + + return 0; +} + +static int hws_matcher_resize_init(struct mlx5hws_matcher *src_matcher) +{ + struct mlx5hws_matcher_resize_data *resize_data; + + resize_data = kzalloc(sizeof(*resize_data), GFP_KERNEL); + if (!resize_data) + return -ENOMEM; + + resize_data->max_stes = src_matcher->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].max_stes; + + resize_data->action_ste[0].stc = src_matcher->action_ste[0].stc; + resize_data->action_ste[0].rtc_0_id = src_matcher->action_ste[0].rtc_0_id; + resize_data->action_ste[0].rtc_1_id = src_matcher->action_ste[0].rtc_1_id; + resize_data->action_ste[0].pool = src_matcher->action_ste[0].max_stes ? + src_matcher->action_ste[0].pool : + NULL; + resize_data->action_ste[1].stc = src_matcher->action_ste[1].stc; + resize_data->action_ste[1].rtc_0_id = src_matcher->action_ste[1].rtc_0_id; + resize_data->action_ste[1].rtc_1_id = src_matcher->action_ste[1].rtc_1_id; + resize_data->action_ste[1].pool = src_matcher->action_ste[1].max_stes ? + src_matcher->action_ste[1].pool : + NULL; + + /* Place the new resized matcher on the dst matcher's list */ + list_add(&resize_data->list_node, &src_matcher->resize_dst->resize_data); + + /* Move all the previous resized matchers to the dst matcher's list */ + while (!list_empty(&src_matcher->resize_data)) { + resize_data = list_first_entry(&src_matcher->resize_data, + struct mlx5hws_matcher_resize_data, + list_node); + list_del_init(&resize_data->list_node); + list_add(&resize_data->list_node, &src_matcher->resize_dst->resize_data); + } + + return 0; +} + +static void hws_matcher_resize_uninit(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_matcher_resize_data *resize_data; + + if (!mlx5hws_matcher_is_resizable(matcher)) + return; + + while (!list_empty(&matcher->resize_data)) { + resize_data = list_first_entry(&matcher->resize_data, + struct mlx5hws_matcher_resize_data, + list_node); + list_del_init(&resize_data->list_node); + + if (resize_data->max_stes) { + mlx5hws_action_free_single_stc(matcher->tbl->ctx, + matcher->tbl->type, + &resize_data->action_ste[1].stc); + mlx5hws_action_free_single_stc(matcher->tbl->ctx, + matcher->tbl->type, + &resize_data->action_ste[0].stc); + + if (matcher->tbl->type == MLX5HWS_TABLE_TYPE_FDB) { + mlx5hws_cmd_rtc_destroy(matcher->tbl->ctx->mdev, + resize_data->action_ste[1].rtc_1_id); + mlx5hws_cmd_rtc_destroy(matcher->tbl->ctx->mdev, + resize_data->action_ste[0].rtc_1_id); + } + mlx5hws_cmd_rtc_destroy(matcher->tbl->ctx->mdev, + resize_data->action_ste[1].rtc_0_id); + mlx5hws_cmd_rtc_destroy(matcher->tbl->ctx->mdev, + resize_data->action_ste[0].rtc_0_id); + if (resize_data->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].pool) { + mlx5hws_pool_destroy(resize_data->action_ste[1].pool); + mlx5hws_pool_destroy(resize_data->action_ste[0].pool); + } + } + + kfree(resize_data); + } +} + +static int +hws_matcher_bind_at_idx(struct mlx5hws_matcher *matcher, u8 action_ste_selector) +{ + struct mlx5hws_cmd_stc_modify_attr stc_attr = {0}; + struct mlx5hws_matcher_action_ste *action_ste; + struct mlx5hws_table *tbl = matcher->tbl; + struct mlx5hws_pool_attr pool_attr = {0}; + struct mlx5hws_context *ctx = tbl->ctx; + int ret; + + action_ste = &matcher->action_ste[action_ste_selector]; + + /* Allocate action STE mempool */ + pool_attr.table_type = tbl->type; + pool_attr.pool_type = MLX5HWS_POOL_TYPE_STE; + pool_attr.flags = MLX5HWS_POOL_FLAGS_FOR_STE_ACTION_POOL; + pool_attr.alloc_log_sz = ilog2(roundup_pow_of_two(action_ste->max_stes)) + + matcher->attr.table.sz_row_log; + hws_matcher_set_pool_attr(&pool_attr, matcher); + action_ste->pool = mlx5hws_pool_create(ctx, &pool_attr); + if (!action_ste->pool) { + mlx5hws_err(ctx, "Failed to create action ste pool\n"); + return -EINVAL; + } + + /* Allocate action RTC */ + ret = hws_matcher_create_rtc(matcher, HWS_MATCHER_RTC_TYPE_STE_ARRAY, action_ste_selector); + if (ret) { + mlx5hws_err(ctx, "Failed to create action RTC\n"); + goto free_ste_pool; + } + + /* Allocate STC for jumps to STE */ + stc_attr.action_offset = MLX5HWS_ACTION_OFFSET_HIT; + stc_attr.action_type = MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_STE_TABLE; + stc_attr.reparse_mode = MLX5_IFC_STC_REPARSE_IGNORE; + stc_attr.ste_table.ste = action_ste->ste; + stc_attr.ste_table.ste_pool = action_ste->pool; + stc_attr.ste_table.match_definer_id = ctx->caps->trivial_match_definer; + + ret = mlx5hws_action_alloc_single_stc(ctx, &stc_attr, tbl->type, + &action_ste->stc); + if (ret) { + mlx5hws_err(ctx, "Failed to create action jump to table STC\n"); + goto free_rtc; + } + + return 0; + +free_rtc: + hws_matcher_destroy_rtc(matcher, HWS_MATCHER_RTC_TYPE_STE_ARRAY, action_ste_selector); +free_ste_pool: + mlx5hws_pool_destroy(action_ste->pool); + return ret; +} + +static void hws_matcher_unbind_at_idx(struct mlx5hws_matcher *matcher, u8 action_ste_selector) +{ + struct mlx5hws_matcher_action_ste *action_ste; + struct mlx5hws_table *tbl = matcher->tbl; + + action_ste = &matcher->action_ste[action_ste_selector]; + + if (!action_ste->max_stes || + matcher->flags & MLX5HWS_MATCHER_FLAGS_COLLISION || + mlx5hws_matcher_is_in_resize(matcher)) + return; + + mlx5hws_action_free_single_stc(tbl->ctx, tbl->type, &action_ste->stc); + hws_matcher_destroy_rtc(matcher, HWS_MATCHER_RTC_TYPE_STE_ARRAY, action_ste_selector); + mlx5hws_pool_destroy(action_ste->pool); +} + +static int hws_matcher_bind_at(struct mlx5hws_matcher *matcher) +{ + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(matcher->mt); + struct mlx5hws_table *tbl = matcher->tbl; + struct mlx5hws_context *ctx = tbl->ctx; + u32 required_stes; + u8 max_stes = 0; + int i, ret; + + if (matcher->flags & MLX5HWS_MATCHER_FLAGS_COLLISION) + return 0; + + for (i = 0; i < matcher->num_of_at; i++) { + struct mlx5hws_action_template *at = &matcher->at[i]; + + ret = hws_matcher_check_and_process_at(matcher, at); + if (ret) { + mlx5hws_err(ctx, "Invalid at %d", i); + return ret; + } + + required_stes = at->num_of_action_stes - (!is_jumbo || at->only_term); + max_stes = max(max_stes, required_stes); + + /* Future: Optimize reparse */ + } + + /* There are no additional STEs required for matcher */ + if (!max_stes) + return 0; + + matcher->action_ste[0].max_stes = max_stes; + matcher->action_ste[1].max_stes = max_stes; + + ret = hws_matcher_bind_at_idx(matcher, 0); + if (ret) + return ret; + + ret = hws_matcher_bind_at_idx(matcher, 1); + if (ret) + goto free_at_0; + + return 0; + +free_at_0: + hws_matcher_unbind_at_idx(matcher, 0); + return ret; +} + +static void hws_matcher_unbind_at(struct mlx5hws_matcher *matcher) +{ + hws_matcher_unbind_at_idx(matcher, 1); + hws_matcher_unbind_at_idx(matcher, 0); +} + +static int hws_matcher_bind_mt(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_context *ctx = matcher->tbl->ctx; + struct mlx5hws_pool_attr pool_attr = {0}; + int ret; + + /* Calculate match, range and hash definers */ + if (!(matcher->flags & MLX5HWS_MATCHER_FLAGS_COLLISION)) { + ret = mlx5hws_definer_mt_init(ctx, matcher->mt); + if (ret) { + if (ret == E2BIG) + mlx5hws_err(ctx, "Failed to set matcher templates with match definers\n"); + return ret; + } + } + + /* Create an STE pool per matcher*/ + pool_attr.table_type = matcher->tbl->type; + pool_attr.pool_type = MLX5HWS_POOL_TYPE_STE; + pool_attr.flags = MLX5HWS_POOL_FLAGS_FOR_MATCHER_STE_POOL; + pool_attr.alloc_log_sz = matcher->attr.table.sz_col_log + + matcher->attr.table.sz_row_log; + hws_matcher_set_pool_attr(&pool_attr, matcher); + + matcher->match_ste.pool = mlx5hws_pool_create(ctx, &pool_attr); + if (!matcher->match_ste.pool) { + mlx5hws_err(ctx, "Failed to allocate matcher STE pool\n"); + ret = -EOPNOTSUPP; + goto uninit_match_definer; + } + + return 0; + +uninit_match_definer: + if (!(matcher->flags & MLX5HWS_MATCHER_FLAGS_COLLISION)) + mlx5hws_definer_mt_uninit(ctx, matcher->mt); + return ret; +} + +static void hws_matcher_unbind_mt(struct mlx5hws_matcher *matcher) +{ + mlx5hws_pool_destroy(matcher->match_ste.pool); + if (!(matcher->flags & MLX5HWS_MATCHER_FLAGS_COLLISION)) + mlx5hws_definer_mt_uninit(matcher->tbl->ctx, matcher->mt); +} + +static int +hws_matcher_validate_insert_mode(struct mlx5hws_cmd_query_caps *caps, + struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_matcher_attr *attr = &matcher->attr; + struct mlx5hws_context *ctx = matcher->tbl->ctx; + + switch (attr->insert_mode) { + case MLX5HWS_MATCHER_INSERT_BY_HASH: + if (matcher->attr.distribute_mode != MLX5HWS_MATCHER_DISTRIBUTE_BY_HASH) { + mlx5hws_err(ctx, "Invalid matcher distribute mode\n"); + return -EOPNOTSUPP; + } + break; + + case MLX5HWS_MATCHER_INSERT_BY_INDEX: + if (attr->table.sz_col_log) { + mlx5hws_err(ctx, "Matcher with INSERT_BY_INDEX supports only Nx1 table size\n"); + return -EOPNOTSUPP; + } + + if (attr->distribute_mode == MLX5HWS_MATCHER_DISTRIBUTE_BY_HASH) { + /* Hash Split Table */ + if (!caps->rtc_hash_split_table) { + mlx5hws_err(ctx, "FW doesn't support insert by index and hash distribute\n"); + return -EOPNOTSUPP; + } + } else if (attr->distribute_mode == MLX5HWS_MATCHER_DISTRIBUTE_BY_LINEAR) { + /* Linear Lookup Table */ + if (!caps->rtc_linear_lookup_table || + !IS_BIT_SET(caps->access_index_mode, + MLX5_IFC_RTC_STE_ACCESS_MODE_LINEAR)) { + mlx5hws_err(ctx, "FW doesn't support insert by index and linear distribute\n"); + return -EOPNOTSUPP; + } + + if (attr->table.sz_row_log > MLX5_IFC_RTC_LINEAR_LOOKUP_TBL_LOG_MAX) { + mlx5hws_err(ctx, "Matcher with linear distribute: rows exceed limit %d", + MLX5_IFC_RTC_LINEAR_LOOKUP_TBL_LOG_MAX); + return -EOPNOTSUPP; + } + } else { + mlx5hws_err(ctx, "Matcher has unsupported distribute mode\n"); + return -EOPNOTSUPP; + } + break; + + default: + mlx5hws_err(ctx, "Matcher has unsupported insert mode\n"); + return -EOPNOTSUPP; + } + + return 0; +} + +static int +hws_matcher_process_attr(struct mlx5hws_cmd_query_caps *caps, + struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_matcher_attr *attr = &matcher->attr; + + if (hws_matcher_validate_insert_mode(caps, matcher)) + return -EOPNOTSUPP; + + if (matcher->tbl->type != MLX5HWS_TABLE_TYPE_FDB && attr->optimize_flow_src) { + mlx5hws_err(matcher->tbl->ctx, "NIC domain doesn't support flow_src\n"); + return -EOPNOTSUPP; + } + + /* Convert number of rules to the required depth */ + if (attr->mode == MLX5HWS_MATCHER_RESOURCE_MODE_RULE && + attr->insert_mode == MLX5HWS_MATCHER_INSERT_BY_HASH) + attr->table.sz_col_log = hws_matcher_rules_to_tbl_depth(attr->rule.num_log); + + matcher->flags |= attr->resizable ? MLX5HWS_MATCHER_FLAGS_RESIZABLE : 0; + + return hws_matcher_check_attr_sz(caps, matcher); +} + +static int hws_matcher_create_and_connect(struct mlx5hws_matcher *matcher) +{ + int ret; + + /* Select and create the definers for current matcher */ + ret = hws_matcher_bind_mt(matcher); + if (ret) + return ret; + + /* Calculate and verify action combination */ + ret = hws_matcher_bind_at(matcher); + if (ret) + goto unbind_mt; + + /* Create matcher end flow table anchor */ + ret = hws_matcher_create_end_ft(matcher); + if (ret) + goto unbind_at; + + /* Allocate the RTC for the new matcher */ + ret = hws_matcher_create_rtc(matcher, HWS_MATCHER_RTC_TYPE_MATCH, 0); + if (ret) + goto destroy_end_ft; + + /* Connect the matcher to the matcher list */ + ret = hws_matcher_connect(matcher); + if (ret) + goto destroy_rtc; + + return 0; + +destroy_rtc: + hws_matcher_destroy_rtc(matcher, HWS_MATCHER_RTC_TYPE_MATCH, 0); +destroy_end_ft: + hws_matcher_destroy_end_ft(matcher); +unbind_at: + hws_matcher_unbind_at(matcher); +unbind_mt: + hws_matcher_unbind_mt(matcher); + return ret; +} + +static void hws_matcher_destroy_and_disconnect(struct mlx5hws_matcher *matcher) +{ + hws_matcher_resize_uninit(matcher); + hws_matcher_disconnect(matcher); + hws_matcher_destroy_rtc(matcher, HWS_MATCHER_RTC_TYPE_MATCH, 0); + hws_matcher_destroy_end_ft(matcher); + hws_matcher_unbind_at(matcher); + hws_matcher_unbind_mt(matcher); +} + +static int +hws_matcher_create_col_matcher(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_context *ctx = matcher->tbl->ctx; + struct mlx5hws_matcher *col_matcher; + int ret; + + if (matcher->attr.mode != MLX5HWS_MATCHER_RESOURCE_MODE_RULE || + matcher->attr.insert_mode == MLX5HWS_MATCHER_INSERT_BY_INDEX) + return 0; + + if (!hws_matcher_requires_col_tbl(matcher->attr.rule.num_log)) + return 0; + + col_matcher = kzalloc(sizeof(*matcher), GFP_KERNEL); + if (!col_matcher) + return -ENOMEM; + + INIT_LIST_HEAD(&col_matcher->resize_data); + + col_matcher->tbl = matcher->tbl; + col_matcher->mt = matcher->mt; + col_matcher->at = matcher->at; + col_matcher->num_of_at = matcher->num_of_at; + col_matcher->num_of_mt = matcher->num_of_mt; + col_matcher->attr.priority = matcher->attr.priority; + col_matcher->flags = matcher->flags; + col_matcher->flags |= MLX5HWS_MATCHER_FLAGS_COLLISION; + col_matcher->attr.mode = MLX5HWS_MATCHER_RESOURCE_MODE_HTABLE; + col_matcher->attr.optimize_flow_src = matcher->attr.optimize_flow_src; + col_matcher->attr.table.sz_row_log = matcher->attr.rule.num_log; + col_matcher->attr.table.sz_col_log = MLX5HWS_MATCHER_ASSURED_COL_TBL_DEPTH; + if (col_matcher->attr.table.sz_row_log > MLX5HWS_MATCHER_ASSURED_ROW_RATIO) + col_matcher->attr.table.sz_row_log -= MLX5HWS_MATCHER_ASSURED_ROW_RATIO; + + col_matcher->attr.max_num_of_at_attach = matcher->attr.max_num_of_at_attach; + + ret = hws_matcher_process_attr(ctx->caps, col_matcher); + if (ret) + goto free_col_matcher; + + ret = hws_matcher_create_and_connect(col_matcher); + if (ret) + goto free_col_matcher; + + matcher->col_matcher = col_matcher; + + return 0; + +free_col_matcher: + kfree(col_matcher); + mlx5hws_err(ctx, "Failed to create assured collision matcher\n"); + return ret; +} + +static void +hws_matcher_destroy_col_matcher(struct mlx5hws_matcher *matcher) +{ + if (matcher->attr.mode != MLX5HWS_MATCHER_RESOURCE_MODE_RULE || + matcher->attr.insert_mode == MLX5HWS_MATCHER_INSERT_BY_INDEX) + return; + + if (matcher->col_matcher) { + hws_matcher_destroy_and_disconnect(matcher->col_matcher); + kfree(matcher->col_matcher); + } +} + +static int hws_matcher_init(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_context *ctx = matcher->tbl->ctx; + int ret; + + INIT_LIST_HEAD(&matcher->resize_data); + + mutex_lock(&ctx->ctrl_lock); + + /* Allocate matcher resource and connect to the packet pipe */ + ret = hws_matcher_create_and_connect(matcher); + if (ret) + goto unlock_err; + + /* Create additional matcher for collision handling */ + ret = hws_matcher_create_col_matcher(matcher); + if (ret) + goto destory_and_disconnect; + mutex_unlock(&ctx->ctrl_lock); + + return 0; + +destory_and_disconnect: + hws_matcher_destroy_and_disconnect(matcher); +unlock_err: + mutex_unlock(&ctx->ctrl_lock); + return ret; +} + +static int hws_matcher_uninit(struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_context *ctx = matcher->tbl->ctx; + + mutex_lock(&ctx->ctrl_lock); + hws_matcher_destroy_col_matcher(matcher); + hws_matcher_destroy_and_disconnect(matcher); + mutex_unlock(&ctx->ctrl_lock); + + return 0; +} + +int mlx5hws_matcher_attach_at(struct mlx5hws_matcher *matcher, + struct mlx5hws_action_template *at) +{ + bool is_jumbo = mlx5hws_matcher_mt_is_jumbo(matcher->mt); + struct mlx5hws_context *ctx = matcher->tbl->ctx; + u32 required_stes; + int ret; + + if (!matcher->attr.max_num_of_at_attach) { + mlx5hws_dbg(ctx, "Num of current at (%d) exceed allowed value\n", + matcher->num_of_at); + return -EOPNOTSUPP; + } + + ret = hws_matcher_check_and_process_at(matcher, at); + if (ret) + return -ret; + + required_stes = at->num_of_action_stes - (!is_jumbo || at->only_term); + if (matcher->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].max_stes < required_stes) { + mlx5hws_dbg(ctx, "Required STEs [%d] exceeds initial action template STE [%d]\n", + required_stes, + matcher->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].max_stes); + return -ENOMEM; + } + + matcher->at[matcher->num_of_at] = *at; + matcher->num_of_at += 1; + matcher->attr.max_num_of_at_attach -= 1; + + if (matcher->col_matcher) + matcher->col_matcher->num_of_at = matcher->num_of_at; + + return 0; +} + +static int +hws_matcher_set_templates(struct mlx5hws_matcher *matcher, + struct mlx5hws_match_template *mt[], + u8 num_of_mt, + struct mlx5hws_action_template *at[], + u8 num_of_at) +{ + struct mlx5hws_context *ctx = matcher->tbl->ctx; + int ret = 0; + int i; + + if (!num_of_mt || !num_of_at) { + mlx5hws_err(ctx, "Number of action/match template cannot be zero\n"); + return -EOPNOTSUPP; + } + + matcher->mt = kcalloc(num_of_mt, sizeof(*matcher->mt), GFP_KERNEL); + if (!matcher->mt) + return -ENOMEM; + + matcher->at = kcalloc(num_of_at + matcher->attr.max_num_of_at_attach, + sizeof(*matcher->at), + GFP_KERNEL); + if (!matcher->at) { + mlx5hws_err(ctx, "Failed to allocate action template array\n"); + ret = -ENOMEM; + goto free_mt; + } + + for (i = 0; i < num_of_mt; i++) + matcher->mt[i] = *mt[i]; + + for (i = 0; i < num_of_at; i++) + matcher->at[i] = *at[i]; + + matcher->num_of_mt = num_of_mt; + matcher->num_of_at = num_of_at; + + return 0; + +free_mt: + kfree(matcher->mt); + return ret; +} + +static void +hws_matcher_unset_templates(struct mlx5hws_matcher *matcher) +{ + kfree(matcher->at); + kfree(matcher->mt); +} + +struct mlx5hws_matcher * +mlx5hws_matcher_create(struct mlx5hws_table *tbl, + struct mlx5hws_match_template *mt[], + u8 num_of_mt, + struct mlx5hws_action_template *at[], + u8 num_of_at, + struct mlx5hws_matcher_attr *attr) +{ + struct mlx5hws_context *ctx = tbl->ctx; + struct mlx5hws_matcher *matcher; + int ret; + + matcher = kzalloc(sizeof(*matcher), GFP_KERNEL); + if (!matcher) + return NULL; + + matcher->tbl = tbl; + matcher->attr = *attr; + + ret = hws_matcher_process_attr(tbl->ctx->caps, matcher); + if (ret) + goto free_matcher; + + ret = hws_matcher_set_templates(matcher, mt, num_of_mt, at, num_of_at); + if (ret) + goto free_matcher; + + ret = hws_matcher_init(matcher); + if (ret) { + mlx5hws_err(ctx, "Failed to initialise matcher: %d\n", ret); + goto unset_templates; + } + + return matcher; + +unset_templates: + hws_matcher_unset_templates(matcher); +free_matcher: + kfree(matcher); + return NULL; +} + +int mlx5hws_matcher_destroy(struct mlx5hws_matcher *matcher) +{ + hws_matcher_uninit(matcher); + hws_matcher_unset_templates(matcher); + kfree(matcher); + return 0; +} + +struct mlx5hws_match_template * +mlx5hws_match_template_create(struct mlx5hws_context *ctx, + u32 *match_param, + u32 match_param_sz, + u8 match_criteria_enable) +{ + struct mlx5hws_match_template *mt; + + mt = kzalloc(sizeof(*mt), GFP_KERNEL); + if (!mt) + return NULL; + + mt->match_param = kzalloc(MLX5_ST_SZ_BYTES(fte_match_param), GFP_KERNEL); + if (!mt->match_param) + goto free_template; + + memcpy(mt->match_param, match_param, match_param_sz); + mt->match_criteria_enable = match_criteria_enable; + + return mt; + +free_template: + kfree(mt); + return NULL; +} + +int mlx5hws_match_template_destroy(struct mlx5hws_match_template *mt) +{ + kfree(mt->match_param); + kfree(mt); + return 0; +} + +static int hws_matcher_resize_precheck(struct mlx5hws_matcher *src_matcher, + struct mlx5hws_matcher *dst_matcher) +{ + struct mlx5hws_context *ctx = src_matcher->tbl->ctx; + int i; + + if (src_matcher->tbl->type != dst_matcher->tbl->type) { + mlx5hws_err(ctx, "Table type mismatch for src/dst matchers\n"); + return -EINVAL; + } + + if (!mlx5hws_matcher_is_resizable(src_matcher) || + !mlx5hws_matcher_is_resizable(dst_matcher)) { + mlx5hws_err(ctx, "Src/dst matcher is not resizable\n"); + return -EINVAL; + } + + if (mlx5hws_matcher_is_insert_by_idx(src_matcher) != + mlx5hws_matcher_is_insert_by_idx(dst_matcher)) { + mlx5hws_err(ctx, "Src/dst matchers insert mode mismatch\n"); + return -EINVAL; + } + + if (mlx5hws_matcher_is_in_resize(src_matcher) || + mlx5hws_matcher_is_in_resize(dst_matcher)) { + mlx5hws_err(ctx, "Src/dst matcher is already in resize\n"); + return -EINVAL; + } + + /* Compare match templates - make sure the definers are equivalent */ + if (src_matcher->num_of_mt != dst_matcher->num_of_mt) { + mlx5hws_err(ctx, "Src/dst matcher match templates mismatch\n"); + return -EINVAL; + } + + if (src_matcher->action_ste[MLX5HWS_ACTION_STE_IDX_ANY].max_stes > + dst_matcher->action_ste[0].max_stes) { + mlx5hws_err(ctx, "Src/dst matcher max STEs mismatch\n"); + return -EINVAL; + } + + for (i = 0; i < src_matcher->num_of_mt; i++) { + if (mlx5hws_definer_compare(src_matcher->mt[i].definer, + dst_matcher->mt[i].definer)) { + mlx5hws_err(ctx, "Src/dst matcher definers mismatch\n"); + return -EINVAL; + } + } + + return 0; +} + +int mlx5hws_matcher_resize_set_target(struct mlx5hws_matcher *src_matcher, + struct mlx5hws_matcher *dst_matcher) +{ + int ret = 0; + + mutex_lock(&src_matcher->tbl->ctx->ctrl_lock); + + ret = hws_matcher_resize_precheck(src_matcher, dst_matcher); + if (ret) + goto out; + + src_matcher->resize_dst = dst_matcher; + + ret = hws_matcher_resize_init(src_matcher); + if (ret) + src_matcher->resize_dst = NULL; + +out: + mutex_unlock(&src_matcher->tbl->ctx->ctrl_lock); + return ret; +} + +int mlx5hws_matcher_resize_rule_move(struct mlx5hws_matcher *src_matcher, + struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr) +{ + struct mlx5hws_context *ctx = src_matcher->tbl->ctx; + + if (unlikely(!mlx5hws_matcher_is_in_resize(src_matcher))) { + mlx5hws_err(ctx, "Matcher is not resizable or not in resize\n"); + return -EINVAL; + } + + if (unlikely(src_matcher != rule->matcher)) { + mlx5hws_err(ctx, "Rule doesn't belong to src matcher\n"); + return -EINVAL; + } + + return mlx5hws_rule_move_hws_add(rule, attr); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h new file mode 100644 index 0000000000000..125391d1a1148 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h @@ -0,0 +1,107 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_MATCHER_H_ +#define MLX5HWS_MATCHER_H_ + +/* We calculated that concatenating a collision table to the main table with + * 3% of the main table rows will be enough resources for high insertion + * success probability. + * + * The calculation: log2(2^x * 3 / 100) = log2(2^x) + log2(3/100) = x - 5.05 ~ 5 + */ +#define MLX5HWS_MATCHER_ASSURED_ROW_RATIO 5 +/* Threshold to determine if amount of rules require a collision table */ +#define MLX5HWS_MATCHER_ASSURED_RULES_TH 10 +/* Required depth of an assured collision table */ +#define MLX5HWS_MATCHER_ASSURED_COL_TBL_DEPTH 4 +/* Required depth of the main large table */ +#define MLX5HWS_MATCHER_ASSURED_MAIN_TBL_DEPTH 2 + +enum mlx5hws_matcher_offset { + MLX5HWS_MATCHER_OFFSET_TAG_DW1 = 12, + MLX5HWS_MATCHER_OFFSET_TAG_DW0 = 13, +}; + +enum mlx5hws_matcher_flags { + MLX5HWS_MATCHER_FLAGS_COLLISION = 1 << 2, + MLX5HWS_MATCHER_FLAGS_RESIZABLE = 1 << 3, +}; + +struct mlx5hws_match_template { + struct mlx5hws_definer *definer; + struct mlx5hws_definer_fc *fc; + u32 *match_param; + u8 match_criteria_enable; + u16 fc_sz; +}; + +struct mlx5hws_matcher_match_ste { + struct mlx5hws_pool_chunk ste; + u32 rtc_0_id; + u32 rtc_1_id; + struct mlx5hws_pool *pool; +}; + +struct mlx5hws_matcher_action_ste { + struct mlx5hws_pool_chunk ste; + struct mlx5hws_pool_chunk stc; + u32 rtc_0_id; + u32 rtc_1_id; + struct mlx5hws_pool *pool; + u8 max_stes; +}; + +struct mlx5hws_matcher_resize_data_node { + struct mlx5hws_pool_chunk stc; + u32 rtc_0_id; + u32 rtc_1_id; + struct mlx5hws_pool *pool; +}; + +struct mlx5hws_matcher_resize_data { + struct mlx5hws_matcher_resize_data_node action_ste[2]; + u8 max_stes; + struct list_head list_node; +}; + +struct mlx5hws_matcher { + struct mlx5hws_table *tbl; + struct mlx5hws_matcher_attr attr; + struct mlx5hws_match_template *mt; + struct mlx5hws_action_template *at; + u8 num_of_at; + u8 num_of_mt; + /* enum mlx5hws_matcher_flags */ + u8 flags; + u32 end_ft_id; + struct mlx5hws_matcher *col_matcher; + struct mlx5hws_matcher *resize_dst; + struct mlx5hws_matcher_match_ste match_ste; + struct mlx5hws_matcher_action_ste action_ste[2]; + struct list_head list_node; + struct list_head resize_data; +}; + +static inline bool +mlx5hws_matcher_mt_is_jumbo(struct mlx5hws_match_template *mt) +{ + return mlx5hws_definer_is_jumbo(mt->definer); +} + +static inline bool mlx5hws_matcher_is_resizable(struct mlx5hws_matcher *matcher) +{ + return !!(matcher->flags & MLX5HWS_MATCHER_FLAGS_RESIZABLE); +} + +static inline bool mlx5hws_matcher_is_in_resize(struct mlx5hws_matcher *matcher) +{ + return !!matcher->resize_dst; +} + +static inline bool mlx5hws_matcher_is_insert_by_idx(struct mlx5hws_matcher *matcher) +{ + return matcher->attr.insert_mode == MLX5HWS_MATCHER_INSERT_BY_INDEX; +} + +#endif /* MLX5HWS_MATCHER_H_ */ From e3169319ee76cab80f01a1d1c8af37ec7be0c24d Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:38:22 +0300 Subject: [PATCH 094/548] net/mlx5: HWS, added FW commands handling This patch adds implementation of FW object handling, such as creation/destruction, modification, and querying. Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 0869701cba3d4c9865a019d23b0d159aee90d23f) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_cmd.c | 1300 +++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_cmd.h | 361 +++++ 2 files changed, 1661 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c new file mode 100644 index 0000000000000..2c7b14172049b --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c @@ -0,0 +1,1300 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +static enum mlx5_ifc_flow_destination_type +hws_cmd_dest_type_to_ifc_dest_type(enum mlx5_flow_destination_type type) +{ + switch (type) { + case MLX5_FLOW_DESTINATION_TYPE_VPORT: + return MLX5_IFC_FLOW_DESTINATION_TYPE_VPORT; + case MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE: + return MLX5_IFC_FLOW_DESTINATION_TYPE_FLOW_TABLE; + case MLX5_FLOW_DESTINATION_TYPE_TIR: + return MLX5_IFC_FLOW_DESTINATION_TYPE_TIR; + case MLX5_FLOW_DESTINATION_TYPE_FLOW_SAMPLER: + return MLX5_IFC_FLOW_DESTINATION_TYPE_FLOW_SAMPLER; + case MLX5_FLOW_DESTINATION_TYPE_UPLINK: + return MLX5_IFC_FLOW_DESTINATION_TYPE_UPLINK; + case MLX5_FLOW_DESTINATION_TYPE_TABLE_TYPE: + return MLX5_IFC_FLOW_DESTINATION_TYPE_TABLE_TYPE; + case MLX5_FLOW_DESTINATION_TYPE_NONE: + case MLX5_FLOW_DESTINATION_TYPE_PORT: + case MLX5_FLOW_DESTINATION_TYPE_COUNTER: + case MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE_NUM: + case MLX5_FLOW_DESTINATION_TYPE_RANGE: + default: + pr_warn("HWS: unknown flow dest type %d\n", type); + return 0; + } +}; + +static int hws_cmd_general_obj_destroy(struct mlx5_core_dev *mdev, + u32 object_type, + u32 object_id) +{ + u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)]; + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_DESTROY_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, object_type); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, object_id); + + return mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); +} + +int mlx5hws_cmd_flow_table_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ft_create_attr *ft_attr, + u32 *table_id) +{ + u32 out[MLX5_ST_SZ_DW(create_flow_table_out)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_flow_table_in)] = {0}; + void *ft_ctx; + int ret; + + MLX5_SET(create_flow_table_in, in, opcode, MLX5_CMD_OP_CREATE_FLOW_TABLE); + MLX5_SET(create_flow_table_in, in, table_type, ft_attr->type); + + ft_ctx = MLX5_ADDR_OF(create_flow_table_in, in, flow_table_context); + MLX5_SET(flow_table_context, ft_ctx, level, ft_attr->level); + MLX5_SET(flow_table_context, ft_ctx, rtc_valid, ft_attr->rtc_valid); + MLX5_SET(flow_table_context, ft_ctx, reformat_en, ft_attr->reformat_en); + MLX5_SET(flow_table_context, ft_ctx, decap_en, ft_attr->decap_en); + + ret = mlx5_cmd_exec_inout(mdev, create_flow_table, in, out); + if (ret) + return ret; + + *table_id = MLX5_GET(create_flow_table_out, out, table_id); + + return 0; +} + +int mlx5hws_cmd_flow_table_modify(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ft_modify_attr *ft_attr, + u32 table_id) +{ + u32 in[MLX5_ST_SZ_DW(modify_flow_table_in)] = {0}; + void *ft_ctx; + + MLX5_SET(modify_flow_table_in, in, opcode, MLX5_CMD_OP_MODIFY_FLOW_TABLE); + MLX5_SET(modify_flow_table_in, in, table_type, ft_attr->type); + MLX5_SET(modify_flow_table_in, in, modify_field_select, ft_attr->modify_fs); + MLX5_SET(modify_flow_table_in, in, table_id, table_id); + + ft_ctx = MLX5_ADDR_OF(modify_flow_table_in, in, flow_table_context); + + MLX5_SET(flow_table_context, ft_ctx, table_miss_action, ft_attr->table_miss_action); + MLX5_SET(flow_table_context, ft_ctx, table_miss_id, ft_attr->table_miss_id); + MLX5_SET(flow_table_context, ft_ctx, hws.rtc_id_0, ft_attr->rtc_id_0); + MLX5_SET(flow_table_context, ft_ctx, hws.rtc_id_1, ft_attr->rtc_id_1); + + return mlx5_cmd_exec_in(mdev, modify_flow_table, in); +} + +int mlx5hws_cmd_flow_table_query(struct mlx5_core_dev *mdev, + u32 table_id, + struct mlx5hws_cmd_ft_query_attr *ft_attr, + u64 *icm_addr_0, u64 *icm_addr_1) +{ + u32 out[MLX5_ST_SZ_DW(query_flow_table_out)] = {0}; + u32 in[MLX5_ST_SZ_DW(query_flow_table_in)] = {0}; + void *ft_ctx; + int ret; + + MLX5_SET(query_flow_table_in, in, opcode, MLX5_CMD_OP_QUERY_FLOW_TABLE); + MLX5_SET(query_flow_table_in, in, table_type, ft_attr->type); + MLX5_SET(query_flow_table_in, in, table_id, table_id); + + ret = mlx5_cmd_exec_inout(mdev, query_flow_table, in, out); + if (ret) + return ret; + + ft_ctx = MLX5_ADDR_OF(query_flow_table_out, out, flow_table_context); + *icm_addr_0 = MLX5_GET64(flow_table_context, ft_ctx, sws.sw_owner_icm_root_0); + *icm_addr_1 = MLX5_GET64(flow_table_context, ft_ctx, sws.sw_owner_icm_root_1); + + return ret; +} + +int mlx5hws_cmd_flow_table_destroy(struct mlx5_core_dev *mdev, + u8 fw_ft_type, u32 table_id) +{ + u32 in[MLX5_ST_SZ_DW(destroy_flow_table_in)] = {0}; + + MLX5_SET(destroy_flow_table_in, in, opcode, MLX5_CMD_OP_DESTROY_FLOW_TABLE); + MLX5_SET(destroy_flow_table_in, in, table_type, fw_ft_type); + MLX5_SET(destroy_flow_table_in, in, table_id, table_id); + + return mlx5_cmd_exec_in(mdev, destroy_flow_table, in); +} + +void mlx5hws_cmd_alias_flow_table_destroy(struct mlx5_core_dev *mdev, + u32 table_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_FT_ALIAS, table_id); +} + +static int hws_cmd_flow_group_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_fg_attr *fg_attr, + u32 *group_id) +{ + u32 out[MLX5_ST_SZ_DW(create_flow_group_out)] = {0}; + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + u32 *in; + int ret; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + MLX5_SET(create_flow_group_in, in, opcode, MLX5_CMD_OP_CREATE_FLOW_GROUP); + MLX5_SET(create_flow_group_in, in, table_type, fg_attr->table_type); + MLX5_SET(create_flow_group_in, in, table_id, fg_attr->table_id); + + ret = mlx5_cmd_exec_inout(mdev, create_flow_group, in, out); + if (ret) + goto out; + + *group_id = MLX5_GET(create_flow_group_out, out, group_id); + +out: + kvfree(in); + return ret; +} + +static int hws_cmd_flow_group_destroy(struct mlx5_core_dev *mdev, + u32 ft_id, u32 fg_id, u8 ft_type) +{ + u32 in[MLX5_ST_SZ_DW(destroy_flow_group_in)] = {}; + + MLX5_SET(destroy_flow_group_in, in, opcode, MLX5_CMD_OP_DESTROY_FLOW_GROUP); + MLX5_SET(destroy_flow_group_in, in, table_type, ft_type); + MLX5_SET(destroy_flow_group_in, in, table_id, ft_id); + MLX5_SET(destroy_flow_group_in, in, group_id, fg_id); + + return mlx5_cmd_exec_in(mdev, destroy_flow_group, in); +} + +int mlx5hws_cmd_set_fte(struct mlx5_core_dev *mdev, + u32 table_type, + u32 table_id, + u32 group_id, + struct mlx5hws_cmd_set_fte_attr *fte_attr) +{ + u32 out[MLX5_ST_SZ_DW(set_fte_out)] = {0}; + void *in_flow_context; + u32 dest_entry_sz; + u32 total_dest_sz; + u32 action_flags; + u8 *in_dests; + u32 inlen; + u32 *in; + int ret; + u32 i; + + dest_entry_sz = fte_attr->extended_dest ? + MLX5_ST_SZ_BYTES(extended_dest_format) : + MLX5_ST_SZ_BYTES(dest_format); + total_dest_sz = dest_entry_sz * fte_attr->dests_num; + inlen = align((MLX5_ST_SZ_BYTES(set_fte_in) + total_dest_sz), DW_SIZE); + in = kzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + MLX5_SET(set_fte_in, in, opcode, MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY); + MLX5_SET(set_fte_in, in, table_type, table_type); + MLX5_SET(set_fte_in, in, table_id, table_id); + + in_flow_context = MLX5_ADDR_OF(set_fte_in, in, flow_context); + MLX5_SET(flow_context, in_flow_context, group_id, group_id); + MLX5_SET(flow_context, in_flow_context, flow_source, fte_attr->flow_source); + MLX5_SET(flow_context, in_flow_context, extended_destination, fte_attr->extended_dest); + MLX5_SET(set_fte_in, in, ignore_flow_level, fte_attr->ignore_flow_level); + + action_flags = fte_attr->action_flags; + MLX5_SET(flow_context, in_flow_context, action, action_flags); + + if (action_flags & MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT) { + MLX5_SET(flow_context, in_flow_context, + packet_reformat_id, fte_attr->packet_reformat_id); + } + + if (action_flags & (MLX5_FLOW_CONTEXT_ACTION_DECRYPT | MLX5_FLOW_CONTEXT_ACTION_ENCRYPT)) { + MLX5_SET(flow_context, in_flow_context, + encrypt_decrypt_type, fte_attr->encrypt_decrypt_type); + MLX5_SET(flow_context, in_flow_context, + encrypt_decrypt_obj_id, fte_attr->encrypt_decrypt_obj_id); + } + + if (action_flags & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) { + in_dests = (u8 *)MLX5_ADDR_OF(flow_context, in_flow_context, destination); + + for (i = 0; i < fte_attr->dests_num; i++) { + struct mlx5hws_cmd_set_fte_dest *dest = &fte_attr->dests[i]; + enum mlx5_ifc_flow_destination_type ifc_dest_type = + hws_cmd_dest_type_to_ifc_dest_type(dest->destination_type); + + switch (dest->destination_type) { + case MLX5_FLOW_DESTINATION_TYPE_VPORT: + if (dest->ext_flags & MLX5HWS_CMD_EXT_DEST_ESW_OWNER_VHCA_ID) { + MLX5_SET(dest_format, in_dests, + destination_eswitch_owner_vhca_id_valid, 1); + MLX5_SET(dest_format, in_dests, + destination_eswitch_owner_vhca_id, + dest->esw_owner_vhca_id); + } + fallthrough; + case MLX5_FLOW_DESTINATION_TYPE_TIR: + case MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE: + MLX5_SET(dest_format, in_dests, destination_type, ifc_dest_type); + MLX5_SET(dest_format, in_dests, destination_id, + dest->destination_id); + if (dest->ext_flags & MLX5HWS_CMD_EXT_DEST_REFORMAT) { + MLX5_SET(dest_format, in_dests, packet_reformat, 1); + MLX5_SET(extended_dest_format, in_dests, packet_reformat_id, + dest->ext_reformat_id); + } + break; + default: + ret = -EOPNOTSUPP; + goto out; + } + + in_dests = in_dests + dest_entry_sz; + } + MLX5_SET(flow_context, in_flow_context, destination_list_size, fte_attr->dests_num); + } + + ret = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out)); + if (ret) + mlx5_core_err(mdev, "Failed creating FLOW_TABLE_ENTRY\n"); + +out: + kfree(in); + return ret; +} + +int mlx5hws_cmd_delete_fte(struct mlx5_core_dev *mdev, + u32 table_type, + u32 table_id) +{ + u32 in[MLX5_ST_SZ_DW(delete_fte_in)] = {}; + + MLX5_SET(delete_fte_in, in, opcode, MLX5_CMD_OP_DELETE_FLOW_TABLE_ENTRY); + MLX5_SET(delete_fte_in, in, table_type, table_type); + MLX5_SET(delete_fte_in, in, table_id, table_id); + + return mlx5_cmd_exec_in(mdev, delete_fte, in); +} + +struct mlx5hws_cmd_forward_tbl * +mlx5hws_cmd_forward_tbl_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ft_create_attr *ft_attr, + struct mlx5hws_cmd_set_fte_attr *fte_attr) +{ + struct mlx5hws_cmd_fg_attr fg_attr = {0}; + struct mlx5hws_cmd_forward_tbl *tbl; + int ret; + + tbl = kzalloc(sizeof(*tbl), GFP_KERNEL); + if (!tbl) + return NULL; + + ret = mlx5hws_cmd_flow_table_create(mdev, ft_attr, &tbl->ft_id); + if (ret) { + mlx5_core_err(mdev, "Failed to create FT\n"); + goto free_tbl; + } + + fg_attr.table_id = tbl->ft_id; + fg_attr.table_type = ft_attr->type; + + ret = hws_cmd_flow_group_create(mdev, &fg_attr, &tbl->fg_id); + if (ret) { + mlx5_core_err(mdev, "Failed to create FG\n"); + goto free_ft; + } + + ret = mlx5hws_cmd_set_fte(mdev, ft_attr->type, + tbl->ft_id, tbl->fg_id, fte_attr); + if (ret) { + mlx5_core_err(mdev, "Failed to create FTE\n"); + goto free_fg; + } + + tbl->type = ft_attr->type; + return tbl; + +free_fg: + hws_cmd_flow_group_destroy(mdev, tbl->ft_id, tbl->fg_id, ft_attr->type); +free_ft: + mlx5hws_cmd_flow_table_destroy(mdev, ft_attr->type, tbl->ft_id); +free_tbl: + kfree(tbl); + return NULL; +} + +void mlx5hws_cmd_forward_tbl_destroy(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_forward_tbl *tbl) +{ + mlx5hws_cmd_delete_fte(mdev, tbl->type, tbl->ft_id); + hws_cmd_flow_group_destroy(mdev, tbl->ft_id, tbl->fg_id, tbl->type); + mlx5hws_cmd_flow_table_destroy(mdev, tbl->type, tbl->ft_id); + kfree(tbl); +} + +void mlx5hws_cmd_set_attr_connect_miss_tbl(struct mlx5hws_context *ctx, + u32 fw_ft_type, + enum mlx5hws_table_type type, + struct mlx5hws_cmd_ft_modify_attr *ft_attr) +{ + u32 default_miss_tbl; + + if (type != MLX5HWS_TABLE_TYPE_FDB) + return; + + ft_attr->modify_fs = MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION; + ft_attr->type = fw_ft_type; + ft_attr->table_miss_action = MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION_GOTO_TBL; + + default_miss_tbl = ctx->common_res[type].default_miss->ft_id; + if (!default_miss_tbl) { + pr_warn("HWS: no flow table ID for default miss\n"); + return; + } + + ft_attr->table_miss_id = default_miss_tbl; +} + +int mlx5hws_cmd_rtc_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_rtc_create_attr *rtc_attr, + u32 *rtc_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_rtc_in)] = {0}; + void *attr; + int ret; + + attr = MLX5_ADDR_OF(create_rtc_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, MLX5_OBJ_TYPE_RTC); + + attr = MLX5_ADDR_OF(create_rtc_in, in, rtc); + MLX5_SET(rtc, attr, ste_format_0, rtc_attr->is_frst_jumbo ? + MLX5_IFC_RTC_STE_FORMAT_11DW : + MLX5_IFC_RTC_STE_FORMAT_8DW); + + if (rtc_attr->is_scnd_range) { + MLX5_SET(rtc, attr, ste_format_1, MLX5_IFC_RTC_STE_FORMAT_RANGE); + MLX5_SET(rtc, attr, num_match_ste, 2); + } + + MLX5_SET(rtc, attr, pd, rtc_attr->pd); + MLX5_SET(rtc, attr, update_method, rtc_attr->fw_gen_wqe); + MLX5_SET(rtc, attr, update_index_mode, rtc_attr->update_index_mode); + MLX5_SET(rtc, attr, access_index_mode, rtc_attr->access_index_mode); + MLX5_SET(rtc, attr, num_hash_definer, rtc_attr->num_hash_definer); + MLX5_SET(rtc, attr, log_depth, rtc_attr->log_depth); + MLX5_SET(rtc, attr, log_hash_size, rtc_attr->log_size); + MLX5_SET(rtc, attr, table_type, rtc_attr->table_type); + MLX5_SET(rtc, attr, num_hash_definer, rtc_attr->num_hash_definer); + MLX5_SET(rtc, attr, match_definer_0, rtc_attr->match_definer_0); + MLX5_SET(rtc, attr, match_definer_1, rtc_attr->match_definer_1); + MLX5_SET(rtc, attr, stc_id, rtc_attr->stc_base); + MLX5_SET(rtc, attr, ste_table_base_id, rtc_attr->ste_base); + MLX5_SET(rtc, attr, ste_table_offset, rtc_attr->ste_offset); + MLX5_SET(rtc, attr, miss_flow_table_id, rtc_attr->miss_ft_id); + MLX5_SET(rtc, attr, reparse_mode, rtc_attr->reparse_mode); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create RTC\n"); + goto out; + } + + *rtc_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +void mlx5hws_cmd_rtc_destroy(struct mlx5_core_dev *mdev, u32 rtc_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_RTC, rtc_id); +} + +int mlx5hws_cmd_stc_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_stc_create_attr *stc_attr, + u32 *stc_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_stc_in)] = {0}; + void *attr; + int ret; + + attr = MLX5_ADDR_OF(create_stc_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, MLX5_OBJ_TYPE_STC); + MLX5_SET(general_obj_in_cmd_hdr, + attr, op_param.create.log_obj_range, stc_attr->log_obj_range); + + attr = MLX5_ADDR_OF(create_stc_in, in, stc); + MLX5_SET(stc, attr, table_type, stc_attr->table_type); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create STC\n"); + goto out; + } + + *stc_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +void mlx5hws_cmd_stc_destroy(struct mlx5_core_dev *mdev, u32 stc_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_STC, stc_id); +} + +static int +hws_cmd_stc_modify_set_stc_param(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_stc_modify_attr *stc_attr, + void *stc_param) +{ + switch (stc_attr->action_type) { + case MLX5_IFC_STC_ACTION_TYPE_COUNTER: + MLX5_SET(stc_ste_param_flow_counter, stc_param, flow_counter_id, stc_attr->id); + break; + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_TIR: + MLX5_SET(stc_ste_param_tir, stc_param, tirn, stc_attr->dest_tir_num); + break; + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_FT: + MLX5_SET(stc_ste_param_table, stc_param, table_id, stc_attr->dest_table_id); + break; + case MLX5_IFC_STC_ACTION_TYPE_ACC_MODIFY_LIST: + MLX5_SET(stc_ste_param_header_modify_list, stc_param, + header_modify_pattern_id, stc_attr->modify_header.pattern_id); + MLX5_SET(stc_ste_param_header_modify_list, stc_param, + header_modify_argument_id, stc_attr->modify_header.arg_id); + break; + case MLX5_IFC_STC_ACTION_TYPE_HEADER_REMOVE: + MLX5_SET(stc_ste_param_remove, stc_param, action_type, + MLX5_MODIFICATION_TYPE_REMOVE); + MLX5_SET(stc_ste_param_remove, stc_param, decap, + stc_attr->remove_header.decap); + MLX5_SET(stc_ste_param_remove, stc_param, remove_start_anchor, + stc_attr->remove_header.start_anchor); + MLX5_SET(stc_ste_param_remove, stc_param, remove_end_anchor, + stc_attr->remove_header.end_anchor); + break; + case MLX5_IFC_STC_ACTION_TYPE_HEADER_INSERT: + MLX5_SET(stc_ste_param_insert, stc_param, action_type, + MLX5_MODIFICATION_TYPE_INSERT); + MLX5_SET(stc_ste_param_insert, stc_param, encap, + stc_attr->insert_header.encap); + MLX5_SET(stc_ste_param_insert, stc_param, inline_data, + stc_attr->insert_header.is_inline); + MLX5_SET(stc_ste_param_insert, stc_param, insert_anchor, + stc_attr->insert_header.insert_anchor); + /* HW gets the next 2 sizes in words */ + MLX5_SET(stc_ste_param_insert, stc_param, insert_size, + stc_attr->insert_header.header_size / W_SIZE); + MLX5_SET(stc_ste_param_insert, stc_param, insert_offset, + stc_attr->insert_header.insert_offset / W_SIZE); + MLX5_SET(stc_ste_param_insert, stc_param, insert_argument, + stc_attr->insert_header.arg_id); + break; + case MLX5_IFC_STC_ACTION_TYPE_COPY: + case MLX5_IFC_STC_ACTION_TYPE_SET: + case MLX5_IFC_STC_ACTION_TYPE_ADD: + case MLX5_IFC_STC_ACTION_TYPE_ADD_FIELD: + *(__be64 *)stc_param = stc_attr->modify_action.data; + break; + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_VPORT: + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_UPLINK: + MLX5_SET(stc_ste_param_vport, stc_param, vport_number, + stc_attr->vport.vport_num); + MLX5_SET(stc_ste_param_vport, stc_param, eswitch_owner_vhca_id, + stc_attr->vport.esw_owner_vhca_id); + MLX5_SET(stc_ste_param_vport, stc_param, eswitch_owner_vhca_id_valid, + stc_attr->vport.eswitch_owner_vhca_id_valid); + break; + case MLX5_IFC_STC_ACTION_TYPE_DROP: + case MLX5_IFC_STC_ACTION_TYPE_NOP: + case MLX5_IFC_STC_ACTION_TYPE_TAG: + case MLX5_IFC_STC_ACTION_TYPE_ALLOW: + break; + case MLX5_IFC_STC_ACTION_TYPE_ASO: + MLX5_SET(stc_ste_param_execute_aso, stc_param, aso_object_id, + stc_attr->aso.devx_obj_id); + MLX5_SET(stc_ste_param_execute_aso, stc_param, return_reg_id, + stc_attr->aso.return_reg_id); + MLX5_SET(stc_ste_param_execute_aso, stc_param, aso_type, + stc_attr->aso.aso_type); + break; + case MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_STE_TABLE: + MLX5_SET(stc_ste_param_ste_table, stc_param, ste_obj_id, + stc_attr->ste_table.ste_obj_id); + MLX5_SET(stc_ste_param_ste_table, stc_param, match_definer_id, + stc_attr->ste_table.match_definer_id); + MLX5_SET(stc_ste_param_ste_table, stc_param, log_hash_size, + stc_attr->ste_table.log_hash_size); + break; + case MLX5_IFC_STC_ACTION_TYPE_REMOVE_WORDS: + MLX5_SET(stc_ste_param_remove_words, stc_param, action_type, + MLX5_MODIFICATION_TYPE_REMOVE_WORDS); + MLX5_SET(stc_ste_param_remove_words, stc_param, remove_start_anchor, + stc_attr->remove_words.start_anchor); + MLX5_SET(stc_ste_param_remove_words, stc_param, + remove_size, stc_attr->remove_words.num_of_words); + break; + case MLX5_IFC_STC_ACTION_TYPE_CRYPTO_IPSEC_ENCRYPTION: + MLX5_SET(stc_ste_param_ipsec_encrypt, stc_param, ipsec_object_id, + stc_attr->id); + break; + case MLX5_IFC_STC_ACTION_TYPE_CRYPTO_IPSEC_DECRYPTION: + MLX5_SET(stc_ste_param_ipsec_decrypt, stc_param, ipsec_object_id, + stc_attr->id); + break; + case MLX5_IFC_STC_ACTION_TYPE_TRAILER: + MLX5_SET(stc_ste_param_trailer, stc_param, command, + stc_attr->reformat_trailer.op); + MLX5_SET(stc_ste_param_trailer, stc_param, type, + stc_attr->reformat_trailer.type); + MLX5_SET(stc_ste_param_trailer, stc_param, length, + stc_attr->reformat_trailer.size); + break; + default: + mlx5_core_err(mdev, "Not supported type %d\n", stc_attr->action_type); + return -EINVAL; + } + return 0; +} + +int mlx5hws_cmd_stc_modify(struct mlx5_core_dev *mdev, + u32 stc_id, + struct mlx5hws_cmd_stc_modify_attr *stc_attr) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_stc_in)] = {0}; + void *stc_param; + void *attr; + int ret; + + attr = MLX5_ADDR_OF(create_stc_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, MLX5_OBJ_TYPE_STC); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, stc_id); + MLX5_SET(general_obj_in_cmd_hdr, in, + op_param.query.obj_offset, stc_attr->stc_offset); + + attr = MLX5_ADDR_OF(create_stc_in, in, stc); + MLX5_SET(stc, attr, ste_action_offset, stc_attr->action_offset); + MLX5_SET(stc, attr, action_type, stc_attr->action_type); + MLX5_SET(stc, attr, reparse_mode, stc_attr->reparse_mode); + MLX5_SET64(stc, attr, modify_field_select, + MLX5_IFC_MODIFY_STC_FIELD_SELECT_NEW_STC); + + /* Set destination TIRN, TAG, FT ID, STE ID */ + stc_param = MLX5_ADDR_OF(stc, attr, stc_param); + ret = hws_cmd_stc_modify_set_stc_param(mdev, stc_attr, stc_param); + if (ret) + return ret; + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) + mlx5_core_err(mdev, "Failed to modify STC FW action_type %d\n", + stc_attr->action_type); + + return ret; +} + +int mlx5hws_cmd_arg_create(struct mlx5_core_dev *mdev, + u16 log_obj_range, + u32 pd, + u32 *arg_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_arg_in)] = {0}; + void *attr; + int ret; + + attr = MLX5_ADDR_OF(create_arg_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, op_param.create.log_obj_range, log_obj_range); + + attr = MLX5_ADDR_OF(create_arg_in, in, arg); + MLX5_SET(arg, attr, access_pd, pd); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create ARG\n"); + goto out; + } + + *arg_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +void mlx5hws_cmd_arg_destroy(struct mlx5_core_dev *mdev, + u32 arg_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT, arg_id); +} + +int mlx5hws_cmd_header_modify_pattern_create(struct mlx5_core_dev *mdev, + u32 pattern_length, + u8 *actions, + u32 *ptrn_id) +{ + u32 in[MLX5_ST_SZ_DW(create_header_modify_pattern_in)] = {0}; + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + int num_of_actions; + u64 *pattern_data; + void *pattern; + void *attr; + int ret; + int i; + + if (pattern_length > MLX5_MAX_ACTIONS_DATA_IN_HEADER_MODIFY) { + mlx5_core_err(mdev, "Pattern length %d exceeds limit %d\n", + pattern_length, MLX5_MAX_ACTIONS_DATA_IN_HEADER_MODIFY); + return -EINVAL; + } + + attr = MLX5_ADDR_OF(create_header_modify_pattern_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, MLX5_OBJ_TYPE_MODIFY_HDR_PATTERN); + + pattern = MLX5_ADDR_OF(create_header_modify_pattern_in, in, pattern); + /* Pattern_length is in ddwords */ + MLX5_SET(header_modify_pattern_in, pattern, pattern_length, pattern_length / (2 * DW_SIZE)); + + pattern_data = (u64 *)MLX5_ADDR_OF(header_modify_pattern_in, pattern, pattern_data); + memcpy(pattern_data, actions, pattern_length); + + num_of_actions = pattern_length / MLX5HWS_MODIFY_ACTION_SIZE; + for (i = 0; i < num_of_actions; i++) { + int type; + + type = MLX5_GET(set_action_in, &pattern_data[i], action_type); + if (type != MLX5_MODIFICATION_TYPE_COPY && + type != MLX5_MODIFICATION_TYPE_ADD_FIELD) + /* Action typ-copy use all bytes for control */ + MLX5_SET(set_action_in, &pattern_data[i], data, 0); + } + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create header_modify_pattern\n"); + goto out; + } + + *ptrn_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +void mlx5hws_cmd_header_modify_pattern_destroy(struct mlx5_core_dev *mdev, + u32 ptrn_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_MODIFY_HDR_PATTERN, ptrn_id); +} + +int mlx5hws_cmd_ste_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ste_create_attr *ste_attr, + u32 *ste_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_ste_in)] = {0}; + void *attr; + int ret; + + attr = MLX5_ADDR_OF(create_ste_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, MLX5_OBJ_TYPE_STE); + MLX5_SET(general_obj_in_cmd_hdr, + attr, op_param.create.log_obj_range, ste_attr->log_obj_range); + + attr = MLX5_ADDR_OF(create_ste_in, in, ste); + MLX5_SET(ste, attr, table_type, ste_attr->table_type); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create STE\n"); + goto out; + } + + *ste_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +void mlx5hws_cmd_ste_destroy(struct mlx5_core_dev *mdev, u32 ste_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_STE, ste_id); +} + +int mlx5hws_cmd_definer_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_definer_create_attr *def_attr, + u32 *definer_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_definer_in)] = {0}; + void *ptr; + int ret; + + MLX5_SET(general_obj_in_cmd_hdr, + in, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + in, obj_type, MLX5_OBJ_TYPE_MATCH_DEFINER); + + ptr = MLX5_ADDR_OF(create_definer_in, in, definer); + MLX5_SET(definer, ptr, format_id, MLX5_IFC_DEFINER_FORMAT_ID_SELECT); + + MLX5_SET(definer, ptr, format_select_dw0, def_attr->dw_selector[0]); + MLX5_SET(definer, ptr, format_select_dw1, def_attr->dw_selector[1]); + MLX5_SET(definer, ptr, format_select_dw2, def_attr->dw_selector[2]); + MLX5_SET(definer, ptr, format_select_dw3, def_attr->dw_selector[3]); + MLX5_SET(definer, ptr, format_select_dw4, def_attr->dw_selector[4]); + MLX5_SET(definer, ptr, format_select_dw5, def_attr->dw_selector[5]); + MLX5_SET(definer, ptr, format_select_dw6, def_attr->dw_selector[6]); + MLX5_SET(definer, ptr, format_select_dw7, def_attr->dw_selector[7]); + MLX5_SET(definer, ptr, format_select_dw8, def_attr->dw_selector[8]); + + MLX5_SET(definer, ptr, format_select_byte0, def_attr->byte_selector[0]); + MLX5_SET(definer, ptr, format_select_byte1, def_attr->byte_selector[1]); + MLX5_SET(definer, ptr, format_select_byte2, def_attr->byte_selector[2]); + MLX5_SET(definer, ptr, format_select_byte3, def_attr->byte_selector[3]); + MLX5_SET(definer, ptr, format_select_byte4, def_attr->byte_selector[4]); + MLX5_SET(definer, ptr, format_select_byte5, def_attr->byte_selector[5]); + MLX5_SET(definer, ptr, format_select_byte6, def_attr->byte_selector[6]); + MLX5_SET(definer, ptr, format_select_byte7, def_attr->byte_selector[7]); + + ptr = MLX5_ADDR_OF(definer, ptr, match_mask); + memcpy(ptr, def_attr->match_mask, MLX5_FLD_SZ_BYTES(definer, match_mask)); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create Definer\n"); + goto out; + } + + *definer_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +void mlx5hws_cmd_definer_destroy(struct mlx5_core_dev *mdev, + u32 definer_id) +{ + hws_cmd_general_obj_destroy(mdev, MLX5_OBJ_TYPE_MATCH_DEFINER, definer_id); +} + +int mlx5hws_cmd_packet_reformat_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_packet_reformat_create_attr *attr, + u32 *reformat_id) +{ + u32 out[MLX5_ST_SZ_DW(alloc_packet_reformat_out)] = {0}; + size_t insz, cmd_data_sz, cmd_total_sz; + void *prctx; + void *pdata; + void *in; + int ret; + + cmd_total_sz = MLX5_ST_SZ_BYTES(alloc_packet_reformat_context_in); + cmd_total_sz += MLX5_ST_SZ_BYTES(packet_reformat_context_in); + cmd_data_sz = MLX5_FLD_SZ_BYTES(packet_reformat_context_in, reformat_data); + insz = align(cmd_total_sz + attr->data_sz - cmd_data_sz, DW_SIZE); + in = kzalloc(insz, GFP_KERNEL); + if (!in) + return -ENOMEM; + + MLX5_SET(alloc_packet_reformat_context_in, in, opcode, + MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT); + + prctx = MLX5_ADDR_OF(alloc_packet_reformat_context_in, in, + packet_reformat_context); + pdata = MLX5_ADDR_OF(packet_reformat_context_in, prctx, reformat_data); + + MLX5_SET(packet_reformat_context_in, prctx, reformat_type, attr->type); + MLX5_SET(packet_reformat_context_in, prctx, reformat_param_0, attr->reformat_param_0); + MLX5_SET(packet_reformat_context_in, prctx, reformat_data_size, attr->data_sz); + memcpy(pdata, attr->data, attr->data_sz); + + ret = mlx5_cmd_exec(mdev, in, insz, out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create packet reformat\n"); + goto out; + } + + *reformat_id = MLX5_GET(alloc_packet_reformat_out, out, packet_reformat_id); +out: + kfree(in); + return ret; +} + +int mlx5hws_cmd_packet_reformat_destroy(struct mlx5_core_dev *mdev, + u32 reformat_id) +{ + u32 out[MLX5_ST_SZ_DW(dealloc_packet_reformat_out)] = {0}; + u32 in[MLX5_ST_SZ_DW(dealloc_packet_reformat_in)] = {0}; + int ret; + + MLX5_SET(dealloc_packet_reformat_in, in, opcode, + MLX5_CMD_OP_DEALLOC_PACKET_REFORMAT_CONTEXT); + MLX5_SET(dealloc_packet_reformat_in, in, + packet_reformat_id, reformat_id); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) + mlx5_core_err(mdev, "Failed to destroy packet_reformat\n"); + + return ret; +} + +int mlx5hws_cmd_sq_modify_rdy(struct mlx5_core_dev *mdev, u32 sqn) +{ + u32 out[MLX5_ST_SZ_DW(modify_sq_out)] = {0}; + u32 in[MLX5_ST_SZ_DW(modify_sq_in)] = {0}; + void *sqc = MLX5_ADDR_OF(modify_sq_in, in, ctx); + int ret; + + MLX5_SET(modify_sq_in, in, opcode, MLX5_CMD_OP_MODIFY_SQ); + MLX5_SET(modify_sq_in, in, sqn, sqn); + MLX5_SET(modify_sq_in, in, sq_state, MLX5_SQC_STATE_RST); + MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RDY); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) + mlx5_core_err(mdev, "Failed to modify SQ\n"); + + return ret; +} + +int mlx5hws_cmd_allow_other_vhca_access(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_allow_other_vhca_access_attr *attr) +{ + u32 out[MLX5_ST_SZ_DW(allow_other_vhca_access_out)] = {0}; + u32 in[MLX5_ST_SZ_DW(allow_other_vhca_access_in)] = {0}; + void *key; + int ret; + + MLX5_SET(allow_other_vhca_access_in, + in, opcode, MLX5_CMD_OP_ALLOW_OTHER_VHCA_ACCESS); + MLX5_SET(allow_other_vhca_access_in, + in, object_type_to_be_accessed, attr->obj_type); + MLX5_SET(allow_other_vhca_access_in, + in, object_id_to_be_accessed, attr->obj_id); + + key = MLX5_ADDR_OF(allow_other_vhca_access_in, in, access_key); + memcpy(key, attr->access_key, sizeof(attr->access_key)); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) + mlx5_core_err(mdev, "Failed to execute ALLOW_OTHER_VHCA_ACCESS command\n"); + + return ret; +} + +int mlx5hws_cmd_alias_obj_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_alias_obj_create_attr *alias_attr, + u32 *obj_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + u32 in[MLX5_ST_SZ_DW(create_alias_obj_in)] = {0}; + void *attr; + void *key; + int ret; + + attr = MLX5_ADDR_OF(create_alias_obj_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, + attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, + attr, obj_type, alias_attr->obj_type); + MLX5_SET(general_obj_in_cmd_hdr, attr, op_param.create.alias_object, 1); + + attr = MLX5_ADDR_OF(create_alias_obj_in, in, alias_ctx); + MLX5_SET(alias_context, attr, vhca_id_to_be_accessed, alias_attr->vhca_id); + MLX5_SET(alias_context, attr, object_id_to_be_accessed, alias_attr->obj_id); + + key = MLX5_ADDR_OF(alias_context, attr, access_key); + memcpy(key, alias_attr->access_key, sizeof(alias_attr->access_key)); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to create ALIAS OBJ\n"); + goto out; + } + + *obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); +out: + return ret; +} + +int mlx5hws_cmd_alias_obj_destroy(struct mlx5_core_dev *mdev, + u16 obj_type, + u32 obj_id) +{ + return hws_cmd_general_obj_destroy(mdev, obj_type, obj_id); +} + +int mlx5hws_cmd_generate_wqe(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_generate_wqe_attr *attr, + struct mlx5_cqe64 *ret_cqe) +{ + u32 out[MLX5_ST_SZ_DW(generate_wqe_out)] = {0}; + u32 in[MLX5_ST_SZ_DW(generate_wqe_in)] = {0}; + u8 status; + void *ptr; + int ret; + + MLX5_SET(generate_wqe_in, in, opcode, MLX5_CMD_OP_GENERATE_WQE); + MLX5_SET(generate_wqe_in, in, pdn, attr->pdn); + + ptr = MLX5_ADDR_OF(generate_wqe_in, in, wqe_ctrl); + memcpy(ptr, attr->wqe_ctrl, MLX5_FLD_SZ_BYTES(generate_wqe_in, wqe_ctrl)); + + ptr = MLX5_ADDR_OF(generate_wqe_in, in, wqe_gta_ctrl); + memcpy(ptr, attr->gta_ctrl, MLX5_FLD_SZ_BYTES(generate_wqe_in, wqe_gta_ctrl)); + + ptr = MLX5_ADDR_OF(generate_wqe_in, in, wqe_gta_data_0); + memcpy(ptr, attr->gta_data_0, MLX5_FLD_SZ_BYTES(generate_wqe_in, wqe_gta_data_0)); + + if (attr->gta_data_1) { + ptr = MLX5_ADDR_OF(generate_wqe_in, in, wqe_gta_data_1); + memcpy(ptr, attr->gta_data_1, MLX5_FLD_SZ_BYTES(generate_wqe_in, wqe_gta_data_1)); + } + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)); + if (ret) { + mlx5_core_err(mdev, "Failed to write GTA WQE using FW\n"); + return ret; + } + + status = MLX5_GET(generate_wqe_out, out, status); + if (status) { + mlx5_core_err(mdev, "Invalid FW CQE status %d\n", status); + return -EINVAL; + } + + ptr = MLX5_ADDR_OF(generate_wqe_out, out, cqe_data); + memcpy(ret_cqe, ptr, sizeof(*ret_cqe)); + + return ret; +} + +int mlx5hws_cmd_query_caps(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_query_caps *caps) +{ + u32 in[MLX5_ST_SZ_DW(query_hca_cap_in)] = {0}; + u32 out_size; + u32 *out; + int ret; + + out_size = MLX5_ST_SZ_BYTES(query_hca_cap_out); + out = kzalloc(out_size, GFP_KERNEL); + if (!out) + return -ENOMEM; + + MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE | HCA_CAP_OPMOD_GET_CUR); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query device caps\n"); + goto out; + } + + caps->wqe_based_update = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.wqe_based_flow_table_update_cap); + + caps->eswitch_manager = MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.eswitch_manager); + + caps->flex_protocols = MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.flex_parser_protocols); + + if (caps->flex_protocols & MLX5_FLEX_PARSER_GENEVE_TLV_OPTION_0_ENABLED) + caps->flex_parser_id_geneve_tlv_option_0 = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.flex_parser_id_geneve_tlv_option_0); + + if (caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_GRE_ENABLED) + caps->flex_parser_id_mpls_over_gre = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.flex_parser_id_outer_first_mpls_over_gre); + + if (caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED) + caps->flex_parser_id_mpls_over_udp = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.flex_parser_id_outer_first_mpls_over_udp_label); + + caps->log_header_modify_argument_granularity = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.log_header_modify_argument_granularity); + + caps->log_header_modify_argument_granularity -= + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.log_header_modify_argument_granularity_offset); + + caps->log_header_modify_argument_max_alloc = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.log_header_modify_argument_max_alloc); + + caps->definer_format_sup = + MLX5_GET64(query_hca_cap_out, out, + capability.cmd_hca_cap.match_definer_format_supported); + + caps->vhca_id = MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.vhca_id); + + caps->sq_ts_format = MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.sq_ts_format); + + caps->ipsec_offload = MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap.ipsec_offload); + + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE_2 | HCA_CAP_OPMOD_GET_CUR); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query device caps 2\n"); + goto out; + } + + caps->full_dw_jumbo_support = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.format_select_dw_8_6_ext); + + caps->format_select_gtpu_dw_0 = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.format_select_dw_gtpu_dw_0); + + caps->format_select_gtpu_dw_1 = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.format_select_dw_gtpu_dw_1); + + caps->format_select_gtpu_dw_2 = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.format_select_dw_gtpu_dw_2); + + caps->format_select_gtpu_ext_dw_0 = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.format_select_dw_gtpu_first_ext_dw_0); + + caps->supp_type_gen_wqe = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.generate_wqe_type); + + caps->flow_table_hash_type = + MLX5_GET(query_hca_cap_out, out, + capability.cmd_hca_cap_2.flow_table_hash_type); + + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE | HCA_CAP_OPMOD_GET_CUR); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query flow table caps\n"); + goto out; + } + + caps->nic_ft.max_level = + MLX5_GET(query_hca_cap_out, out, + capability.flow_table_nic_cap.flow_table_properties_nic_receive.max_ft_level); + + caps->nic_ft.reparse = + MLX5_GET(query_hca_cap_out, out, + capability.flow_table_nic_cap.flow_table_properties_nic_receive.reparse); + + caps->nic_ft.ignore_flow_level_rtc_valid = + MLX5_GET(query_hca_cap_out, out, + capability.flow_table_nic_cap.flow_table_properties_nic_receive.ignore_flow_level_rtc_valid); + + caps->flex_parser_ok_bits_supp = + MLX5_GET(query_hca_cap_out, out, + capability.flow_table_nic_cap.flow_table_properties_nic_receive.ft_field_support.geneve_tlv_option_0_exist); + + if (caps->wqe_based_update) { + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_GET_HCA_CAP_OP_MOD_WQE_BASED_FLOW_TABLE | HCA_CAP_OPMOD_GET_CUR); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query WQE based FT caps\n"); + goto out; + } + + caps->rtc_reparse_mode = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.rtc_reparse_mode); + + caps->ste_format = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.ste_format); + + caps->rtc_index_mode = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.rtc_index_mode); + + caps->rtc_log_depth_max = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.rtc_log_depth_max); + + caps->ste_alloc_log_max = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.ste_alloc_log_max); + + caps->ste_alloc_log_gran = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.ste_alloc_log_granularity); + + caps->trivial_match_definer = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.trivial_match_definer); + + caps->stc_alloc_log_max = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.stc_alloc_log_max); + + caps->stc_alloc_log_gran = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.stc_alloc_log_granularity); + + caps->rtc_hash_split_table = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.rtc_hash_split_table); + + caps->rtc_linear_lookup_table = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.rtc_linear_lookup_table); + + caps->access_index_mode = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.access_index_mode); + + caps->linear_match_definer = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.linear_match_definer_reg_c3); + + caps->rtc_max_hash_def_gen_wqe = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.rtc_max_num_hash_definer_gen_wqe); + + caps->supp_ste_format_gen_wqe = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.ste_format_gen_wqe); + + caps->fdb_tir_stc = + MLX5_GET(query_hca_cap_out, out, + capability.wqe_based_flow_table_cap.fdb_jump_to_tir_stc); + } + + if (caps->eswitch_manager) { + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_GET_HCA_CAP_OP_MOD_ESW_FLOW_TABLE | HCA_CAP_OPMOD_GET_CUR); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query flow table esw caps\n"); + goto out; + } + + caps->fdb_ft.max_level = + MLX5_GET(query_hca_cap_out, out, + capability.flow_table_nic_cap.flow_table_properties_nic_receive.max_ft_level); + + caps->fdb_ft.reparse = + MLX5_GET(query_hca_cap_out, out, + capability.flow_table_nic_cap.flow_table_properties_nic_receive.reparse); + + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_SET_HCA_CAP_OP_MOD_ESW | HCA_CAP_OPMOD_GET_CUR); + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query eswitch capabilities\n"); + goto out; + } + + if (MLX5_GET(query_hca_cap_out, out, + capability.esw_cap.esw_manager_vport_number_valid)) + caps->eswitch_manager_vport_number = + MLX5_GET(query_hca_cap_out, out, + capability.esw_cap.esw_manager_vport_number); + + caps->merged_eswitch = MLX5_GET(query_hca_cap_out, out, + capability.esw_cap.merged_eswitch); + } + + ret = mlx5_cmd_exec(mdev, in, sizeof(in), out, out_size); + if (ret) { + mlx5_core_err(mdev, "Failed to query device attributes\n"); + goto out; + } + + snprintf(caps->fw_ver, sizeof(caps->fw_ver), "%d.%d.%d", + fw_rev_maj(mdev), fw_rev_min(mdev), fw_rev_sub(mdev)); + + caps->is_ecpf = mlx5_core_is_ecpf_esw_manager(mdev); + +out: + kfree(out); + return ret; +} + +int mlx5hws_cmd_query_gvmi(struct mlx5_core_dev *mdev, bool other_function, + u16 vport_number, u16 *gvmi) +{ + bool ec_vf_func = other_function ? mlx5_core_is_ec_vf_vport(mdev, vport_number) : false; + u32 in[MLX5_ST_SZ_DW(query_hca_cap_in)] = {}; + int out_size; + void *out; + int err; + + out_size = MLX5_ST_SZ_BYTES(query_hca_cap_out); + out = kzalloc(out_size, GFP_KERNEL); + if (!out) + return -ENOMEM; + + MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); + MLX5_SET(query_hca_cap_in, in, other_function, other_function); + MLX5_SET(query_hca_cap_in, in, function_id, + mlx5_vport_to_func_id(mdev, vport_number, ec_vf_func)); + MLX5_SET(query_hca_cap_in, in, ec_vf_function, ec_vf_func); + MLX5_SET(query_hca_cap_in, in, op_mod, + MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1 | HCA_CAP_OPMOD_GET_CUR); + + err = mlx5_cmd_exec_inout(mdev, query_hca_cap, in, out); + if (err) { + kfree(out); + return err; + } + + *gvmi = MLX5_GET(query_hca_cap_out, out, capability.cmd_hca_cap.vhca_id); + + kfree(out); + + return 0; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h new file mode 100644 index 0000000000000..2fbcf4ff571a0 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h @@ -0,0 +1,361 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_CMD_H_ +#define MLX5HWS_CMD_H_ + +#define WIRE_PORT 0xFFFF + +#define ACCESS_KEY_LEN 32 + +enum mlx5hws_cmd_ext_dest_flags { + MLX5HWS_CMD_EXT_DEST_REFORMAT = 1 << 0, + MLX5HWS_CMD_EXT_DEST_ESW_OWNER_VHCA_ID = 1 << 1, +}; + +struct mlx5hws_cmd_set_fte_dest { + u8 destination_type; + u32 destination_id; + enum mlx5hws_cmd_ext_dest_flags ext_flags; + u32 ext_reformat_id; + u16 esw_owner_vhca_id; +}; + +struct mlx5hws_cmd_set_fte_attr { + u32 action_flags; + bool ignore_flow_level; + u8 flow_source; + u8 extended_dest; + u8 encrypt_decrypt_type; + u32 encrypt_decrypt_obj_id; + u32 packet_reformat_id; + u32 dests_num; + struct mlx5hws_cmd_set_fte_dest *dests; +}; + +struct mlx5hws_cmd_ft_create_attr { + u8 type; + u8 level; + bool rtc_valid; + bool decap_en; + bool reformat_en; +}; + +struct mlx5hws_cmd_ft_modify_attr { + u8 type; + u32 rtc_id_0; + u32 rtc_id_1; + u32 table_miss_id; + u8 table_miss_action; + u64 modify_fs; +}; + +struct mlx5hws_cmd_ft_query_attr { + u8 type; +}; + +struct mlx5hws_cmd_fg_attr { + u32 table_id; + u32 table_type; +}; + +struct mlx5hws_cmd_forward_tbl { + u8 type; + u32 ft_id; + u32 fg_id; + u32 refcount; +}; + +struct mlx5hws_cmd_rtc_create_attr { + u32 pd; + u32 stc_base; + u32 ste_base; + u32 ste_offset; + u32 miss_ft_id; + bool fw_gen_wqe; + u8 update_index_mode; + u8 access_index_mode; + u8 num_hash_definer; + u8 log_depth; + u8 log_size; + u8 table_type; + u8 match_definer_0; + u8 match_definer_1; + u8 reparse_mode; + bool is_frst_jumbo; + bool is_scnd_range; +}; + +struct mlx5hws_cmd_alias_obj_create_attr { + u32 obj_id; + u16 vhca_id; + u16 obj_type; + u8 access_key[ACCESS_KEY_LEN]; +}; + +struct mlx5hws_cmd_stc_create_attr { + u8 log_obj_range; + u8 table_type; +}; + +struct mlx5hws_cmd_stc_modify_attr { + u32 stc_offset; + u8 action_offset; + u8 reparse_mode; + enum mlx5_ifc_stc_action_type action_type; + union { + u32 id; /* TIRN, TAG, FT ID, STE ID, CRYPTO */ + struct { + u8 decap; + u16 start_anchor; + u16 end_anchor; + } remove_header; + struct { + u32 arg_id; + u32 pattern_id; + } modify_header; + struct { + __be64 data; + } modify_action; + struct { + u32 arg_id; + u32 header_size; + u8 is_inline; + u8 encap; + u16 insert_anchor; + u16 insert_offset; + } insert_header; + struct { + u8 aso_type; + u32 devx_obj_id; + u8 return_reg_id; + } aso; + struct { + u16 vport_num; + u16 esw_owner_vhca_id; + u8 eswitch_owner_vhca_id_valid; + } vport; + struct { + struct mlx5hws_pool_chunk ste; + struct mlx5hws_pool *ste_pool; + u32 ste_obj_id; /* Internal */ + u32 match_definer_id; + u8 log_hash_size; + bool ignore_tx; + } ste_table; + struct { + u16 start_anchor; + u16 num_of_words; + } remove_words; + struct { + u8 type; + u8 op; + u8 size; + } reformat_trailer; + + u32 dest_table_id; + u32 dest_tir_num; + }; +}; + +struct mlx5hws_cmd_ste_create_attr { + u8 log_obj_range; + u8 table_type; +}; + +struct mlx5hws_cmd_definer_create_attr { + u8 *dw_selector; + u8 *byte_selector; + u8 *match_mask; +}; + +struct mlx5hws_cmd_allow_other_vhca_access_attr { + u16 obj_type; + u32 obj_id; + u8 access_key[ACCESS_KEY_LEN]; +}; + +struct mlx5hws_cmd_packet_reformat_create_attr { + u8 type; + size_t data_sz; + void *data; + u8 reformat_param_0; +}; + +struct mlx5hws_cmd_query_ft_caps { + u8 max_level; + u8 reparse; + u8 ignore_flow_level_rtc_valid; +}; + +struct mlx5hws_cmd_generate_wqe_attr { + u8 *wqe_ctrl; + u8 *gta_ctrl; + u8 *gta_data_0; + u8 *gta_data_1; + u32 pdn; +}; + +struct mlx5hws_cmd_query_caps { + u32 flex_protocols; + u8 wqe_based_update; + u8 rtc_reparse_mode; + u16 ste_format; + u8 rtc_index_mode; + u8 ste_alloc_log_max; + u8 ste_alloc_log_gran; + u8 stc_alloc_log_max; + u8 stc_alloc_log_gran; + u8 rtc_log_depth_max; + u8 format_select_gtpu_dw_0; + u8 format_select_gtpu_dw_1; + u8 flow_table_hash_type; + u8 format_select_gtpu_dw_2; + u8 format_select_gtpu_ext_dw_0; + u8 access_index_mode; + u32 linear_match_definer; + bool full_dw_jumbo_support; + bool rtc_hash_split_table; + bool rtc_linear_lookup_table; + u32 supp_type_gen_wqe; + u8 rtc_max_hash_def_gen_wqe; + u16 supp_ste_format_gen_wqe; + struct mlx5hws_cmd_query_ft_caps nic_ft; + struct mlx5hws_cmd_query_ft_caps fdb_ft; + bool eswitch_manager; + bool merged_eswitch; + u32 eswitch_manager_vport_number; + u8 log_header_modify_argument_granularity; + u8 log_header_modify_argument_max_alloc; + u8 sq_ts_format; + u8 fdb_tir_stc; + u64 definer_format_sup; + u32 trivial_match_definer; + u32 vhca_id; + u32 shared_vhca_id; + char fw_ver[64]; + bool ipsec_offload; + bool is_ecpf; + u8 flex_parser_ok_bits_supp; + u8 flex_parser_id_geneve_tlv_option_0; + u8 flex_parser_id_mpls_over_gre; + u8 flex_parser_id_mpls_over_udp; +}; + +int mlx5hws_cmd_flow_table_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ft_create_attr *ft_attr, + u32 *table_id); + +int mlx5hws_cmd_flow_table_modify(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ft_modify_attr *ft_attr, + u32 table_id); + +int mlx5hws_cmd_flow_table_query(struct mlx5_core_dev *mdev, + u32 obj_id, + struct mlx5hws_cmd_ft_query_attr *ft_attr, + u64 *icm_addr_0, u64 *icm_addr_1); + +int mlx5hws_cmd_flow_table_destroy(struct mlx5_core_dev *mdev, + u8 fw_ft_type, u32 table_id); + +void mlx5hws_cmd_alias_flow_table_destroy(struct mlx5_core_dev *mdev, + u32 table_id); + +int mlx5hws_cmd_rtc_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_rtc_create_attr *rtc_attr, + u32 *rtc_id); + +void mlx5hws_cmd_rtc_destroy(struct mlx5_core_dev *mdev, u32 rtc_id); + +int mlx5hws_cmd_stc_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_stc_create_attr *stc_attr, + u32 *stc_id); + +int mlx5hws_cmd_stc_modify(struct mlx5_core_dev *mdev, + u32 stc_id, + struct mlx5hws_cmd_stc_modify_attr *stc_attr); + +void mlx5hws_cmd_stc_destroy(struct mlx5_core_dev *mdev, u32 stc_id); + +int mlx5hws_cmd_generate_wqe(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_generate_wqe_attr *attr, + struct mlx5_cqe64 *ret_cqe); + +int mlx5hws_cmd_ste_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ste_create_attr *ste_attr, + u32 *ste_id); + +void mlx5hws_cmd_ste_destroy(struct mlx5_core_dev *mdev, u32 ste_id); + +int mlx5hws_cmd_definer_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_definer_create_attr *def_attr, + u32 *definer_id); + +void mlx5hws_cmd_definer_destroy(struct mlx5_core_dev *mdev, + u32 definer_id); + +int mlx5hws_cmd_arg_create(struct mlx5_core_dev *mdev, + u16 log_obj_range, + u32 pd, + u32 *arg_id); + +void mlx5hws_cmd_arg_destroy(struct mlx5_core_dev *mdev, + u32 arg_id); + +int mlx5hws_cmd_header_modify_pattern_create(struct mlx5_core_dev *mdev, + u32 pattern_length, + u8 *actions, + u32 *ptrn_id); + +void mlx5hws_cmd_header_modify_pattern_destroy(struct mlx5_core_dev *mdev, + u32 ptrn_id); + +int mlx5hws_cmd_packet_reformat_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_packet_reformat_create_attr *attr, + u32 *reformat_id); + +int mlx5hws_cmd_packet_reformat_destroy(struct mlx5_core_dev *mdev, + u32 reformat_id); + +int mlx5hws_cmd_set_fte(struct mlx5_core_dev *mdev, + u32 table_type, + u32 table_id, + u32 group_id, + struct mlx5hws_cmd_set_fte_attr *fte_attr); + +int mlx5hws_cmd_delete_fte(struct mlx5_core_dev *mdev, + u32 table_type, u32 table_id); + +struct mlx5hws_cmd_forward_tbl * +mlx5hws_cmd_forward_tbl_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_ft_create_attr *ft_attr, + struct mlx5hws_cmd_set_fte_attr *fte_attr); + +void mlx5hws_cmd_forward_tbl_destroy(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_forward_tbl *tbl); + +int mlx5hws_cmd_alias_obj_create(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_alias_obj_create_attr *alias_attr, + u32 *obj_id); + +int mlx5hws_cmd_alias_obj_destroy(struct mlx5_core_dev *mdev, + u16 obj_type, + u32 obj_id); + +int mlx5hws_cmd_sq_modify_rdy(struct mlx5_core_dev *mdev, u32 sqn); + +int mlx5hws_cmd_query_caps(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_query_caps *caps); + +void mlx5hws_cmd_set_attr_connect_miss_tbl(struct mlx5hws_context *ctx, + u32 fw_ft_type, + enum mlx5hws_table_type type, + struct mlx5hws_cmd_ft_modify_attr *ft_attr); + +int mlx5hws_cmd_allow_other_vhca_access(struct mlx5_core_dev *mdev, + struct mlx5hws_cmd_allow_other_vhca_access_attr *attr); + +int mlx5hws_cmd_query_gvmi(struct mlx5_core_dev *mdev, bool other_function, + u16 vport_number, u16 *gvmi); + +#endif /* MLX5HWS_CMD_H_ */ From 0f33685b1a6ea30ba4bb8fae026daab21e731c54 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:39:00 +0300 Subject: [PATCH 095/548] net/mlx5: HWS, added modify header pattern and args handling Packet headers/metadta manipulations are split into two parts: - Header Modify Pattern: an object that describes which fields will be modified and in which way - Header Modify Argument: an object that provides the values to be used for header modification Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit aefc15a0fa1cfab233f490a4675167e5ca5c42b2) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_pat_arg.c | 579 ++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_pat_arg.h | 101 +++ 2 files changed, 680 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c new file mode 100644 index 0000000000000..e084a5cbf81fd --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c @@ -0,0 +1,579 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +enum mlx5hws_arg_chunk_size +mlx5hws_arg_data_size_to_arg_log_size(u16 data_size) +{ + /* Return the roundup of log2(data_size) */ + if (data_size <= MLX5HWS_ARG_DATA_SIZE) + return MLX5HWS_ARG_CHUNK_SIZE_1; + if (data_size <= MLX5HWS_ARG_DATA_SIZE * 2) + return MLX5HWS_ARG_CHUNK_SIZE_2; + if (data_size <= MLX5HWS_ARG_DATA_SIZE * 4) + return MLX5HWS_ARG_CHUNK_SIZE_3; + if (data_size <= MLX5HWS_ARG_DATA_SIZE * 8) + return MLX5HWS_ARG_CHUNK_SIZE_4; + + return MLX5HWS_ARG_CHUNK_SIZE_MAX; +} + +u32 mlx5hws_arg_data_size_to_arg_size(u16 data_size) +{ + return BIT(mlx5hws_arg_data_size_to_arg_log_size(data_size)); +} + +enum mlx5hws_arg_chunk_size +mlx5hws_arg_get_arg_log_size(u16 num_of_actions) +{ + return mlx5hws_arg_data_size_to_arg_log_size(num_of_actions * + MLX5HWS_MODIFY_ACTION_SIZE); +} + +u32 mlx5hws_arg_get_arg_size(u16 num_of_actions) +{ + return BIT(mlx5hws_arg_get_arg_log_size(num_of_actions)); +} + +bool mlx5hws_pat_require_reparse(__be64 *actions, u16 num_of_actions) +{ + u16 i, field; + u8 action_id; + + for (i = 0; i < num_of_actions; i++) { + action_id = MLX5_GET(set_action_in, &actions[i], action_type); + + switch (action_id) { + case MLX5_MODIFICATION_TYPE_NOP: + field = MLX5_MODI_OUT_NONE; + break; + + case MLX5_MODIFICATION_TYPE_SET: + case MLX5_MODIFICATION_TYPE_ADD: + field = MLX5_GET(set_action_in, &actions[i], field); + break; + + case MLX5_MODIFICATION_TYPE_COPY: + case MLX5_MODIFICATION_TYPE_ADD_FIELD: + field = MLX5_GET(copy_action_in, &actions[i], dst_field); + break; + + default: + /* Insert/Remove/Unknown actions require reparse */ + return true; + } + + /* Below fields can change packet structure require a reparse */ + if (field == MLX5_MODI_OUT_ETHERTYPE || + field == MLX5_MODI_OUT_IPV6_NEXT_HDR) + return true; + } + + return false; +} + +/* Cache and cache element handling */ +int mlx5hws_pat_init_pattern_cache(struct mlx5hws_pattern_cache **cache) +{ + struct mlx5hws_pattern_cache *new_cache; + + new_cache = kzalloc(sizeof(*new_cache), GFP_KERNEL); + if (!new_cache) + return -ENOMEM; + + INIT_LIST_HEAD(&new_cache->ptrn_list); + mutex_init(&new_cache->lock); + + *cache = new_cache; + + return 0; +} + +void mlx5hws_pat_uninit_pattern_cache(struct mlx5hws_pattern_cache *cache) +{ + mutex_destroy(&cache->lock); + kfree(cache); +} + +static bool mlx5hws_pat_compare_pattern(int cur_num_of_actions, + __be64 cur_actions[], + int num_of_actions, + __be64 actions[]) +{ + int i; + + if (cur_num_of_actions != num_of_actions) + return false; + + for (i = 0; i < num_of_actions; i++) { + u8 action_id = + MLX5_GET(set_action_in, &actions[i], action_type); + + if (action_id == MLX5_MODIFICATION_TYPE_COPY || + action_id == MLX5_MODIFICATION_TYPE_ADD_FIELD) { + if (actions[i] != cur_actions[i]) + return false; + } else { + /* Compare just the control, not the values */ + if ((__force __be32)actions[i] != + (__force __be32)cur_actions[i]) + return false; + } + } + + return true; +} + +static struct mlx5hws_pattern_cache_item * +mlx5hws_pat_find_cached_pattern(struct mlx5hws_pattern_cache *cache, + u16 num_of_actions, + __be64 *actions) +{ + struct mlx5hws_pattern_cache_item *cached_pat = NULL; + + list_for_each_entry(cached_pat, &cache->ptrn_list, ptrn_list_node) { + if (mlx5hws_pat_compare_pattern(cached_pat->mh_data.num_of_actions, + (__be64 *)cached_pat->mh_data.data, + num_of_actions, + actions)) + return cached_pat; + } + + return NULL; +} + +static struct mlx5hws_pattern_cache_item * +mlx5hws_pat_get_existing_cached_pattern(struct mlx5hws_pattern_cache *cache, + u16 num_of_actions, + __be64 *actions) +{ + struct mlx5hws_pattern_cache_item *cached_pattern; + + cached_pattern = mlx5hws_pat_find_cached_pattern(cache, num_of_actions, actions); + if (cached_pattern) { + /* LRU: move it to be first in the list */ + list_del_init(&cached_pattern->ptrn_list_node); + list_add(&cached_pattern->ptrn_list_node, &cache->ptrn_list); + cached_pattern->refcount++; + } + + return cached_pattern; +} + +static struct mlx5hws_pattern_cache_item * +mlx5hws_pat_add_pattern_to_cache(struct mlx5hws_pattern_cache *cache, + u32 pattern_id, + u16 num_of_actions, + __be64 *actions) +{ + struct mlx5hws_pattern_cache_item *cached_pattern; + + cached_pattern = kzalloc(sizeof(*cached_pattern), GFP_KERNEL); + if (!cached_pattern) + return NULL; + + cached_pattern->mh_data.num_of_actions = num_of_actions; + cached_pattern->mh_data.pattern_id = pattern_id; + cached_pattern->mh_data.data = + kmemdup(actions, num_of_actions * MLX5HWS_MODIFY_ACTION_SIZE, GFP_KERNEL); + if (!cached_pattern->mh_data.data) + goto free_cached_obj; + + list_add(&cached_pattern->ptrn_list_node, &cache->ptrn_list); + cached_pattern->refcount = 1; + + return cached_pattern; + +free_cached_obj: + kfree(cached_pattern); + return NULL; +} + +static struct mlx5hws_pattern_cache_item * +mlx5hws_pat_find_cached_pattern_by_id(struct mlx5hws_pattern_cache *cache, + u32 ptrn_id) +{ + struct mlx5hws_pattern_cache_item *cached_pattern = NULL; + + list_for_each_entry(cached_pattern, &cache->ptrn_list, ptrn_list_node) { + if (cached_pattern->mh_data.pattern_id == ptrn_id) + return cached_pattern; + } + + return NULL; +} + +static void +mlx5hws_pat_remove_pattern(struct mlx5hws_pattern_cache_item *cached_pattern) +{ + list_del_init(&cached_pattern->ptrn_list_node); + + kfree(cached_pattern->mh_data.data); + kfree(cached_pattern); +} + +void mlx5hws_pat_put_pattern(struct mlx5hws_context *ctx, u32 ptrn_id) +{ + struct mlx5hws_pattern_cache *cache = ctx->pattern_cache; + struct mlx5hws_pattern_cache_item *cached_pattern; + + mutex_lock(&cache->lock); + cached_pattern = mlx5hws_pat_find_cached_pattern_by_id(cache, ptrn_id); + if (!cached_pattern) { + mlx5hws_err(ctx, "Failed to find cached pattern with provided ID\n"); + pr_warn("HWS: pattern ID %d is not found\n", ptrn_id); + goto out; + } + + if (--cached_pattern->refcount) + goto out; + + mlx5hws_pat_remove_pattern(cached_pattern); + mlx5hws_cmd_header_modify_pattern_destroy(ctx->mdev, ptrn_id); + +out: + mutex_unlock(&cache->lock); +} + +int mlx5hws_pat_get_pattern(struct mlx5hws_context *ctx, + __be64 *pattern, size_t pattern_sz, + u32 *pattern_id) +{ + u16 num_of_actions = pattern_sz / MLX5HWS_MODIFY_ACTION_SIZE; + struct mlx5hws_pattern_cache_item *cached_pattern; + u32 ptrn_id = 0; + int ret = 0; + + mutex_lock(&ctx->pattern_cache->lock); + + cached_pattern = mlx5hws_pat_get_existing_cached_pattern(ctx->pattern_cache, + num_of_actions, + pattern); + if (cached_pattern) { + *pattern_id = cached_pattern->mh_data.pattern_id; + goto out_unlock; + } + + ret = mlx5hws_cmd_header_modify_pattern_create(ctx->mdev, + pattern_sz, + (u8 *)pattern, + &ptrn_id); + if (ret) { + mlx5hws_err(ctx, "Failed to create pattern FW object\n"); + goto out_unlock; + } + + cached_pattern = mlx5hws_pat_add_pattern_to_cache(ctx->pattern_cache, + ptrn_id, + num_of_actions, + pattern); + if (!cached_pattern) { + mlx5hws_err(ctx, "Failed to add pattern to cache\n"); + ret = -EINVAL; + goto clean_pattern; + } + + mutex_unlock(&ctx->pattern_cache->lock); + *pattern_id = ptrn_id; + + return ret; + +clean_pattern: + mlx5hws_cmd_header_modify_pattern_destroy(ctx->mdev, *pattern_id); +out_unlock: + mutex_unlock(&ctx->pattern_cache->lock); + return ret; +} + +static void +mlx5d_arg_init_send_attr(struct mlx5hws_send_engine_post_attr *send_attr, + void *comp_data, + u32 arg_idx) +{ + send_attr->opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + send_attr->opmod = MLX5HWS_WQE_GTA_OPMOD_MOD_ARG; + send_attr->len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + send_attr->id = arg_idx; + send_attr->user_data = comp_data; +} + +void mlx5hws_arg_decapl3_write(struct mlx5hws_send_engine *queue, + u32 arg_idx, + u8 *arg_data, + u16 num_of_actions) +{ + struct mlx5hws_send_engine_post_attr send_attr = {0}; + struct mlx5hws_wqe_gta_data_seg_arg *wqe_arg = NULL; + struct mlx5hws_wqe_gta_ctrl_seg *wqe_ctrl = NULL; + struct mlx5hws_send_engine_post_ctrl ctrl; + size_t wqe_len; + + mlx5d_arg_init_send_attr(&send_attr, NULL, arg_idx); + + ctrl = mlx5hws_send_engine_post_start(queue); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_ctrl, &wqe_len); + memset(wqe_ctrl, 0, wqe_len); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_arg, &wqe_len); + mlx5hws_action_prepare_decap_l3_data(arg_data, (u8 *)wqe_arg, + num_of_actions); + mlx5hws_send_engine_post_end(&ctrl, &send_attr); +} + +void mlx5hws_arg_write(struct mlx5hws_send_engine *queue, + void *comp_data, + u32 arg_idx, + u8 *arg_data, + size_t data_size) +{ + struct mlx5hws_send_engine_post_attr send_attr = {0}; + struct mlx5hws_wqe_gta_data_seg_arg *wqe_arg; + struct mlx5hws_send_engine_post_ctrl ctrl; + struct mlx5hws_wqe_gta_ctrl_seg *wqe_ctrl; + int i, full_iter, leftover; + size_t wqe_len; + + mlx5d_arg_init_send_attr(&send_attr, comp_data, arg_idx); + + /* Each WQE can hold 64B of data, it might require multiple iteration */ + full_iter = data_size / MLX5HWS_ARG_DATA_SIZE; + leftover = data_size & (MLX5HWS_ARG_DATA_SIZE - 1); + + for (i = 0; i < full_iter; i++) { + ctrl = mlx5hws_send_engine_post_start(queue); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_ctrl, &wqe_len); + memset(wqe_ctrl, 0, wqe_len); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_arg, &wqe_len); + memcpy(wqe_arg, arg_data, wqe_len); + send_attr.id = arg_idx++; + mlx5hws_send_engine_post_end(&ctrl, &send_attr); + + /* Move to next argument data */ + arg_data += MLX5HWS_ARG_DATA_SIZE; + } + + if (leftover) { + ctrl = mlx5hws_send_engine_post_start(queue); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_ctrl, &wqe_len); + memset(wqe_ctrl, 0, wqe_len); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_arg, &wqe_len); + memcpy(wqe_arg, arg_data, leftover); + send_attr.id = arg_idx; + mlx5hws_send_engine_post_end(&ctrl, &send_attr); + } +} + +int mlx5hws_arg_write_inline_arg_data(struct mlx5hws_context *ctx, + u32 arg_idx, + u8 *arg_data, + size_t data_size) +{ + struct mlx5hws_send_engine *queue; + int ret; + + mutex_lock(&ctx->ctrl_lock); + + /* Get the control queue */ + queue = &ctx->send_queue[ctx->queues - 1]; + + mlx5hws_arg_write(queue, arg_data, arg_idx, arg_data, data_size); + + mlx5hws_send_engine_flush_queue(queue); + + /* Poll for completion */ + ret = mlx5hws_send_queue_action(ctx, ctx->queues - 1, + MLX5HWS_SEND_QUEUE_ACTION_DRAIN_SYNC); + + if (ret) + mlx5hws_err(ctx, "Failed to drain arg queue\n"); + + mutex_unlock(&ctx->ctrl_lock); + + return ret; +} + +bool mlx5hws_arg_is_valid_arg_request_size(struct mlx5hws_context *ctx, + u32 arg_size) +{ + if (arg_size < ctx->caps->log_header_modify_argument_granularity || + arg_size > ctx->caps->log_header_modify_argument_max_alloc) { + return false; + } + return true; +} + +int mlx5hws_arg_create(struct mlx5hws_context *ctx, + u8 *data, + size_t data_sz, + u32 log_bulk_sz, + bool write_data, + u32 *arg_id) +{ + u16 single_arg_log_sz; + u16 multi_arg_log_sz; + int ret; + u32 id; + + single_arg_log_sz = mlx5hws_arg_data_size_to_arg_log_size(data_sz); + multi_arg_log_sz = single_arg_log_sz + log_bulk_sz; + + if (single_arg_log_sz >= MLX5HWS_ARG_CHUNK_SIZE_MAX) { + mlx5hws_err(ctx, "Requested single arg %u not supported\n", single_arg_log_sz); + return -EOPNOTSUPP; + } + + if (!mlx5hws_arg_is_valid_arg_request_size(ctx, multi_arg_log_sz)) { + mlx5hws_err(ctx, "Argument log size %d not supported by FW\n", multi_arg_log_sz); + return -EOPNOTSUPP; + } + + /* Alloc bulk of args */ + ret = mlx5hws_cmd_arg_create(ctx->mdev, multi_arg_log_sz, ctx->pd_num, &id); + if (ret) { + mlx5hws_err(ctx, "Failed allocating arg in order: %d\n", multi_arg_log_sz); + return ret; + } + + if (write_data) { + ret = mlx5hws_arg_write_inline_arg_data(ctx, id, + data, data_sz); + if (ret) { + mlx5hws_err(ctx, "Failed writing arg data\n"); + mlx5hws_cmd_arg_destroy(ctx->mdev, id); + return ret; + } + } + + *arg_id = id; + return ret; +} + +void mlx5hws_arg_destroy(struct mlx5hws_context *ctx, u32 arg_id) +{ + mlx5hws_cmd_arg_destroy(ctx->mdev, arg_id); +} + +int mlx5hws_arg_create_modify_header_arg(struct mlx5hws_context *ctx, + __be64 *data, + u8 num_of_actions, + u32 log_bulk_sz, + bool write_data, + u32 *arg_id) +{ + size_t data_sz = num_of_actions * MLX5HWS_MODIFY_ACTION_SIZE; + int ret; + + ret = mlx5hws_arg_create(ctx, + (u8 *)data, + data_sz, + log_bulk_sz, + write_data, + arg_id); + if (ret) + mlx5hws_err(ctx, "Failed creating modify header arg\n"); + + return ret; +} + +static int +hws_action_modify_check_field_limitation(u8 action_type, __be64 *pattern) +{ + /* Need to check field limitation here, but for now - return OK */ + return 0; +} + +#define INVALID_FIELD 0xffff + +static void +hws_action_modify_get_target_fields(u8 action_type, __be64 *pattern, + u16 *src_field, u16 *dst_field) +{ + switch (action_type) { + case MLX5_ACTION_TYPE_SET: + case MLX5_ACTION_TYPE_ADD: + *src_field = MLX5_GET(set_action_in, pattern, field); + *dst_field = INVALID_FIELD; + break; + case MLX5_ACTION_TYPE_COPY: + *src_field = MLX5_GET(copy_action_in, pattern, src_field); + *dst_field = MLX5_GET(copy_action_in, pattern, dst_field); + break; + default: + pr_warn("HWS: invalid modify header action type %d\n", action_type); + } +} + +bool mlx5hws_pat_verify_actions(struct mlx5hws_context *ctx, __be64 pattern[], size_t sz) +{ + size_t i; + + for (i = 0; i < sz / MLX5HWS_MODIFY_ACTION_SIZE; i++) { + u8 action_type = + MLX5_GET(set_action_in, &pattern[i], action_type); + if (action_type >= MLX5_MODIFICATION_TYPE_MAX) { + mlx5hws_err(ctx, "Unsupported action id %d\n", action_type); + return false; + } + if (hws_action_modify_check_field_limitation(action_type, &pattern[i])) { + mlx5hws_err(ctx, "Unsupported action number %zu\n", i); + return false; + } + } + + return true; +} + +void mlx5hws_pat_calc_nope(__be64 *pattern, size_t num_actions, + size_t max_actions, size_t *new_size, + u32 *nope_location, __be64 *new_pat) +{ + u16 prev_src_field = 0, prev_dst_field = 0; + u16 src_field, dst_field; + u8 action_type; + size_t i, j; + + *new_size = num_actions; + *nope_location = 0; + + if (num_actions == 1) + return; + + for (i = 0, j = 0; i < num_actions; i++, j++) { + action_type = MLX5_GET(set_action_in, &pattern[i], action_type); + + hws_action_modify_get_target_fields(action_type, &pattern[i], + &src_field, &dst_field); + if (i % 2) { + if (action_type == MLX5_ACTION_TYPE_COPY && + (prev_src_field == src_field || + prev_dst_field == dst_field)) { + /* need Nope */ + *new_size += 1; + *nope_location |= BIT(i); + memset(&new_pat[j], 0, MLX5HWS_MODIFY_ACTION_SIZE); + MLX5_SET(set_action_in, &new_pat[j], + action_type, + MLX5_MODIFICATION_TYPE_NOP); + j++; + } else if (prev_src_field == src_field) { + /* need Nope*/ + *new_size += 1; + *nope_location |= BIT(i); + MLX5_SET(set_action_in, &new_pat[j], + action_type, + MLX5_MODIFICATION_TYPE_NOP); + j++; + } + } + memcpy(&new_pat[j], &pattern[i], MLX5HWS_MODIFY_ACTION_SIZE); + /* check if no more space */ + if (j > max_actions) { + *new_size = num_actions; + *nope_location = 0; + return; + } + + prev_src_field = src_field; + prev_dst_field = dst_field; + } +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.h new file mode 100644 index 0000000000000..27ca93385b084 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.h @@ -0,0 +1,101 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_PAT_ARG_H_ +#define MLX5HWS_PAT_ARG_H_ + +/* Modify-header arg pool */ +enum mlx5hws_arg_chunk_size { + MLX5HWS_ARG_CHUNK_SIZE_1, + /* Keep MIN updated when changing */ + MLX5HWS_ARG_CHUNK_SIZE_MIN = MLX5HWS_ARG_CHUNK_SIZE_1, + MLX5HWS_ARG_CHUNK_SIZE_2, + MLX5HWS_ARG_CHUNK_SIZE_3, + MLX5HWS_ARG_CHUNK_SIZE_4, + MLX5HWS_ARG_CHUNK_SIZE_MAX, +}; + +enum { + MLX5HWS_MODIFY_ACTION_SIZE = 8, + MLX5HWS_ARG_DATA_SIZE = 64, +}; + +struct mlx5hws_pattern_cache { + struct mutex lock; /* Protect pattern list */ + struct list_head ptrn_list; +}; + +struct mlx5hws_pattern_cache_item { + struct { + u32 pattern_id; + u8 *data; + u16 num_of_actions; + } mh_data; + u32 refcount; + struct list_head ptrn_list_node; +}; + +enum mlx5hws_arg_chunk_size +mlx5hws_arg_get_arg_log_size(u16 num_of_actions); + +u32 mlx5hws_arg_get_arg_size(u16 num_of_actions); + +enum mlx5hws_arg_chunk_size +mlx5hws_arg_data_size_to_arg_log_size(u16 data_size); + +u32 mlx5hws_arg_data_size_to_arg_size(u16 data_size); + +int mlx5hws_pat_init_pattern_cache(struct mlx5hws_pattern_cache **cache); + +void mlx5hws_pat_uninit_pattern_cache(struct mlx5hws_pattern_cache *cache); + +bool mlx5hws_pat_verify_actions(struct mlx5hws_context *ctx, __be64 pattern[], size_t sz); + +int mlx5hws_arg_create(struct mlx5hws_context *ctx, + u8 *data, + size_t data_sz, + u32 log_bulk_sz, + bool write_data, + u32 *arg_id); + +void mlx5hws_arg_destroy(struct mlx5hws_context *ctx, u32 arg_id); + +int mlx5hws_arg_create_modify_header_arg(struct mlx5hws_context *ctx, + __be64 *data, + u8 num_of_actions, + u32 log_bulk_sz, + bool write_data, + u32 *modify_hdr_arg_id); + +int mlx5hws_pat_get_pattern(struct mlx5hws_context *ctx, + __be64 *pattern, + size_t pattern_sz, + u32 *ptrn_id); + +void mlx5hws_pat_put_pattern(struct mlx5hws_context *ctx, + u32 ptrn_id); + +bool mlx5hws_arg_is_valid_arg_request_size(struct mlx5hws_context *ctx, + u32 arg_size); + +bool mlx5hws_pat_require_reparse(__be64 *actions, u16 num_of_actions); + +void mlx5hws_arg_write(struct mlx5hws_send_engine *queue, + void *comp_data, + u32 arg_idx, + u8 *arg_data, + size_t data_size); + +void mlx5hws_arg_decapl3_write(struct mlx5hws_send_engine *queue, + u32 arg_idx, + u8 *arg_data, + u16 num_of_actions); + +int mlx5hws_arg_write_inline_arg_data(struct mlx5hws_context *ctx, + u32 arg_idx, + u8 *arg_data, + size_t data_size); + +void mlx5hws_pat_calc_nope(__be64 *pattern, size_t num_actions, size_t max_actions, + size_t *new_size, u32 *nope_location, __be64 *new_pat); +#endif /* MLX5HWS_PAT_ARG_H_ */ From c4d5f1d143ee2e3a095584513bfede8707db4466 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Mon, 12 Aug 2024 02:10:56 +0300 Subject: [PATCH 096/548] net/mlx5: HWS, added vport handling Vport is a virtual eswitch port that is associated with its virtual function (VF), physical function (PF) or sub-function (SF). This patch adds handling of vports in HWS. Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 6c5e68254027e7a82cb4580fd1a28cda3d417a97) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_vport.c | 86 +++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_vport.h | 13 +++ 2 files changed, 99 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c new file mode 100644 index 0000000000000..faf42421c43fd --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c @@ -0,0 +1,86 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +int mlx5hws_vport_init_vports(struct mlx5hws_context *ctx) +{ + int ret; + + if (!ctx->caps->eswitch_manager) + return 0; + + xa_init(&ctx->vports.vport_gvmi_xa); + + /* Set gvmi for eswitch manager and uplink vports only. Rest of the vports + * (vport 0 of other function, VFs and SFs) will be queried dynamically. + */ + + ret = mlx5hws_cmd_query_gvmi(ctx->mdev, false, 0, &ctx->vports.esw_manager_gvmi); + if (ret) + return ret; + + ctx->vports.uplink_gvmi = 0; + return 0; +} + +void mlx5hws_vport_uninit_vports(struct mlx5hws_context *ctx) +{ + if (ctx->caps->eswitch_manager) + xa_destroy(&ctx->vports.vport_gvmi_xa); +} + +static int hws_vport_add_gvmi(struct mlx5hws_context *ctx, u16 vport) +{ + u16 vport_gvmi; + int ret; + + ret = mlx5hws_cmd_query_gvmi(ctx->mdev, true, vport, &vport_gvmi); + if (ret) + return -EINVAL; + + ret = xa_insert(&ctx->vports.vport_gvmi_xa, vport, + xa_mk_value(vport_gvmi), GFP_KERNEL); + if (ret) + mlx5hws_dbg(ctx, "Couldn't insert new vport gvmi into xarray (%d)\n", ret); + + return ret; +} + +static bool hws_vport_is_esw_mgr_vport(struct mlx5hws_context *ctx, u16 vport) +{ + return ctx->caps->is_ecpf ? vport == MLX5_VPORT_ECPF : + vport == MLX5_VPORT_PF; +} + +int mlx5hws_vport_get_gvmi(struct mlx5hws_context *ctx, u16 vport, u16 *vport_gvmi) +{ + void *entry; + int ret; + + if (!ctx->caps->eswitch_manager) + return -EINVAL; + + if (hws_vport_is_esw_mgr_vport(ctx, vport)) { + *vport_gvmi = ctx->vports.esw_manager_gvmi; + return 0; + } + + if (vport == MLX5_VPORT_UPLINK) { + *vport_gvmi = ctx->vports.uplink_gvmi; + return 0; + } + +load_entry: + entry = xa_load(&ctx->vports.vport_gvmi_xa, vport); + + if (!xa_is_value(entry)) { + ret = hws_vport_add_gvmi(ctx, vport); + if (ret && ret != -EBUSY) + return ret; + goto load_entry; + } + + *vport_gvmi = (u16)xa_to_value(entry); + return 0; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.h new file mode 100644 index 0000000000000..0912fc166b3a6 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_VPORT_H_ +#define MLX5HWS_VPORT_H_ + +int mlx5hws_vport_init_vports(struct mlx5hws_context *ctx); + +void mlx5hws_vport_uninit_vports(struct mlx5hws_context *ctx); + +int mlx5hws_vport_get_gvmi(struct mlx5hws_context *ctx, u16 vport, u16 *vport_gvmi); + +#endif /* MLX5HWS_VPORT_H_ */ From 318b8a46b348768f5f0952535ff7e216f8f51bd7 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:34:22 +0300 Subject: [PATCH 097/548] net/mlx5: HWS, added memory management handling Added object pools and buddy allocator functionality. Reviewed-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit c61afff94373641695cc81999e9bb10408ea84d5) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_buddy.c | 149 ++++ .../mlx5/core/steering/hws/mlx5hws_buddy.h | 21 + .../mlx5/core/steering/hws/mlx5hws_pool.c | 640 ++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_pool.h | 151 +++++ 4 files changed, 961 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c new file mode 100644 index 0000000000000..e6ed66202a404 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c @@ -0,0 +1,149 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" +#include "mlx5hws_buddy.h" + +static int hws_buddy_init(struct mlx5hws_buddy_mem *buddy, u32 max_order) +{ + int i, s, ret = 0; + + buddy->max_order = max_order; + + buddy->bitmap = kcalloc(buddy->max_order + 1, + sizeof(*buddy->bitmap), + GFP_KERNEL); + if (!buddy->bitmap) + return -ENOMEM; + + buddy->num_free = kcalloc(buddy->max_order + 1, + sizeof(*buddy->num_free), + GFP_KERNEL); + if (!buddy->num_free) { + ret = -ENOMEM; + goto err_out_free_bits; + } + + for (i = 0; i <= (int)buddy->max_order; ++i) { + s = 1 << (buddy->max_order - i); + + buddy->bitmap[i] = bitmap_zalloc(s, GFP_KERNEL); + if (!buddy->bitmap[i]) { + ret = -ENOMEM; + goto err_out_free_num_free; + } + } + + bitmap_set(buddy->bitmap[buddy->max_order], 0, 1); + buddy->num_free[buddy->max_order] = 1; + + return 0; + +err_out_free_num_free: + for (i = 0; i <= (int)buddy->max_order; ++i) + bitmap_free(buddy->bitmap[i]); + + kfree(buddy->num_free); + +err_out_free_bits: + kfree(buddy->bitmap); + return ret; +} + +struct mlx5hws_buddy_mem *mlx5hws_buddy_create(u32 max_order) +{ + struct mlx5hws_buddy_mem *buddy; + + buddy = kzalloc(sizeof(*buddy), GFP_KERNEL); + if (!buddy) + return NULL; + + if (hws_buddy_init(buddy, max_order)) + goto free_buddy; + + return buddy; + +free_buddy: + kfree(buddy); + return NULL; +} + +void mlx5hws_buddy_cleanup(struct mlx5hws_buddy_mem *buddy) +{ + int i; + + for (i = 0; i <= (int)buddy->max_order; ++i) + bitmap_free(buddy->bitmap[i]); + + kfree(buddy->num_free); + kfree(buddy->bitmap); +} + +static int hws_buddy_find_free_seg(struct mlx5hws_buddy_mem *buddy, + u32 start_order, + u32 *segment, + u32 *order) +{ + unsigned int seg, order_iter, m; + + for (order_iter = start_order; + order_iter <= buddy->max_order; ++order_iter) { + if (!buddy->num_free[order_iter]) + continue; + + m = 1 << (buddy->max_order - order_iter); + seg = find_first_bit(buddy->bitmap[order_iter], m); + + if (WARN(seg >= m, + "ICM Buddy: failed finding free mem for order %d\n", + order_iter)) + return -ENOMEM; + + break; + } + + if (order_iter > buddy->max_order) + return -ENOMEM; + + *segment = seg; + *order = order_iter; + return 0; +} + +int mlx5hws_buddy_alloc_mem(struct mlx5hws_buddy_mem *buddy, u32 order) +{ + u32 seg, order_iter, err; + + err = hws_buddy_find_free_seg(buddy, order, &seg, &order_iter); + if (err) + return err; + + bitmap_clear(buddy->bitmap[order_iter], seg, 1); + --buddy->num_free[order_iter]; + + while (order_iter > order) { + --order_iter; + seg <<= 1; + bitmap_set(buddy->bitmap[order_iter], seg ^ 1, 1); + ++buddy->num_free[order_iter]; + } + + seg <<= order; + + return seg; +} + +void mlx5hws_buddy_free_mem(struct mlx5hws_buddy_mem *buddy, u32 seg, u32 order) +{ + seg >>= order; + + while (test_bit(seg ^ 1, buddy->bitmap[order])) { + bitmap_clear(buddy->bitmap[order], seg ^ 1, 1); + --buddy->num_free[order]; + seg >>= 1; + ++order; + } + + bitmap_set(buddy->bitmap[order], seg, 1); + ++buddy->num_free[order]; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h new file mode 100644 index 0000000000000..338c44bbedaff --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_BUDDY_H_ +#define MLX5HWS_BUDDY_H_ + +struct mlx5hws_buddy_mem { + unsigned long **bitmap; + unsigned int *num_free; + u32 max_order; +}; + +struct mlx5hws_buddy_mem *mlx5hws_buddy_create(u32 max_order); + +void mlx5hws_buddy_cleanup(struct mlx5hws_buddy_mem *buddy); + +int mlx5hws_buddy_alloc_mem(struct mlx5hws_buddy_mem *buddy, u32 order); + +void mlx5hws_buddy_free_mem(struct mlx5hws_buddy_mem *buddy, u32 seg, u32 order); + +#endif /* MLX5HWS_BUDDY_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c new file mode 100644 index 0000000000000..a8a63e3278be3 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c @@ -0,0 +1,640 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" +#include "mlx5hws_buddy.h" + +static void hws_pool_free_one_resource(struct mlx5hws_pool_resource *resource) +{ + switch (resource->pool->type) { + case MLX5HWS_POOL_TYPE_STE: + mlx5hws_cmd_ste_destroy(resource->pool->ctx->mdev, resource->base_id); + break; + case MLX5HWS_POOL_TYPE_STC: + mlx5hws_cmd_stc_destroy(resource->pool->ctx->mdev, resource->base_id); + break; + default: + break; + } + + kfree(resource); +} + +static void hws_pool_resource_free(struct mlx5hws_pool *pool, + int resource_idx) +{ + hws_pool_free_one_resource(pool->resource[resource_idx]); + pool->resource[resource_idx] = NULL; + + if (pool->tbl_type == MLX5HWS_TABLE_TYPE_FDB) { + hws_pool_free_one_resource(pool->mirror_resource[resource_idx]); + pool->mirror_resource[resource_idx] = NULL; + } +} + +static struct mlx5hws_pool_resource * +hws_pool_create_one_resource(struct mlx5hws_pool *pool, u32 log_range, + u32 fw_ft_type) +{ + struct mlx5hws_cmd_ste_create_attr ste_attr; + struct mlx5hws_cmd_stc_create_attr stc_attr; + struct mlx5hws_pool_resource *resource; + u32 obj_id = 0; + int ret; + + resource = kzalloc(sizeof(*resource), GFP_KERNEL); + if (!resource) + return NULL; + + switch (pool->type) { + case MLX5HWS_POOL_TYPE_STE: + ste_attr.log_obj_range = log_range; + ste_attr.table_type = fw_ft_type; + ret = mlx5hws_cmd_ste_create(pool->ctx->mdev, &ste_attr, &obj_id); + break; + case MLX5HWS_POOL_TYPE_STC: + stc_attr.log_obj_range = log_range; + stc_attr.table_type = fw_ft_type; + ret = mlx5hws_cmd_stc_create(pool->ctx->mdev, &stc_attr, &obj_id); + break; + default: + ret = -EINVAL; + } + + if (ret) { + mlx5hws_err(pool->ctx, "Failed to allocate resource objects\n"); + goto free_resource; + } + + resource->pool = pool; + resource->range = 1 << log_range; + resource->base_id = obj_id; + + return resource; + +free_resource: + kfree(resource); + return NULL; +} + +static int +hws_pool_resource_alloc(struct mlx5hws_pool *pool, u32 log_range, int idx) +{ + struct mlx5hws_pool_resource *resource; + u32 fw_ft_type, opt_log_range; + + fw_ft_type = mlx5hws_table_get_res_fw_ft_type(pool->tbl_type, false); + opt_log_range = pool->opt_type == MLX5HWS_POOL_OPTIMIZE_ORIG ? 0 : log_range; + resource = hws_pool_create_one_resource(pool, opt_log_range, fw_ft_type); + if (!resource) { + mlx5hws_err(pool->ctx, "Failed allocating resource\n"); + return -EINVAL; + } + + pool->resource[idx] = resource; + + if (pool->tbl_type == MLX5HWS_TABLE_TYPE_FDB) { + struct mlx5hws_pool_resource *mirror_resource; + + fw_ft_type = mlx5hws_table_get_res_fw_ft_type(pool->tbl_type, true); + opt_log_range = pool->opt_type == MLX5HWS_POOL_OPTIMIZE_MIRROR ? 0 : log_range; + mirror_resource = hws_pool_create_one_resource(pool, opt_log_range, fw_ft_type); + if (!mirror_resource) { + mlx5hws_err(pool->ctx, "Failed allocating mirrored resource\n"); + hws_pool_free_one_resource(resource); + pool->resource[idx] = NULL; + return -EINVAL; + } + pool->mirror_resource[idx] = mirror_resource; + } + + return 0; +} + +static unsigned long *hws_pool_create_and_init_bitmap(u32 log_range) +{ + unsigned long *cur_bmp; + + cur_bmp = bitmap_zalloc(1 << log_range, GFP_KERNEL); + if (!cur_bmp) + return NULL; + + bitmap_fill(cur_bmp, 1 << log_range); + + return cur_bmp; +} + +static void hws_pool_buddy_db_put_chunk(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + struct mlx5hws_buddy_mem *buddy; + + buddy = pool->db.buddy_manager->buddies[chunk->resource_idx]; + if (!buddy) { + mlx5hws_err(pool->ctx, "No such buddy (%d)\n", chunk->resource_idx); + return; + } + + mlx5hws_buddy_free_mem(buddy, chunk->offset, chunk->order); +} + +static struct mlx5hws_buddy_mem * +hws_pool_buddy_get_next_buddy(struct mlx5hws_pool *pool, int idx, + u32 order, bool *is_new_buddy) +{ + static struct mlx5hws_buddy_mem *buddy; + u32 new_buddy_size; + + buddy = pool->db.buddy_manager->buddies[idx]; + if (buddy) + return buddy; + + new_buddy_size = max(pool->alloc_log_sz, order); + *is_new_buddy = true; + buddy = mlx5hws_buddy_create(new_buddy_size); + if (!buddy) { + mlx5hws_err(pool->ctx, "Failed to create buddy order: %d index: %d\n", + new_buddy_size, idx); + return NULL; + } + + if (hws_pool_resource_alloc(pool, new_buddy_size, idx) != 0) { + mlx5hws_err(pool->ctx, "Failed to create resource type: %d: size %d index: %d\n", + pool->type, new_buddy_size, idx); + mlx5hws_buddy_cleanup(buddy); + return NULL; + } + + pool->db.buddy_manager->buddies[idx] = buddy; + + return buddy; +} + +static int hws_pool_buddy_get_mem_chunk(struct mlx5hws_pool *pool, + int order, + u32 *buddy_idx, + int *seg) +{ + struct mlx5hws_buddy_mem *buddy; + bool new_mem = false; + int ret = 0; + int i; + + *seg = -1; + + /* Find the next free place from the buddy array */ + while (*seg == -1) { + for (i = 0; i < MLX5HWS_POOL_RESOURCE_ARR_SZ; i++) { + buddy = hws_pool_buddy_get_next_buddy(pool, i, + order, + &new_mem); + if (!buddy) { + ret = -ENOMEM; + goto out; + } + + *seg = mlx5hws_buddy_alloc_mem(buddy, order); + if (*seg != -1) + goto found; + + if (pool->flags & MLX5HWS_POOL_FLAGS_ONE_RESOURCE) { + mlx5hws_err(pool->ctx, + "Fail to allocate seg for one resource pool\n"); + ret = -ENOMEM; + goto out; + } + + if (new_mem) { + /* We have new memory pool, should be place for us */ + mlx5hws_err(pool->ctx, + "No memory for order: %d with buddy no: %d\n", + order, i); + ret = -ENOMEM; + goto out; + } + } + } + +found: + *buddy_idx = i; +out: + return ret; +} + +static int hws_pool_buddy_db_get_chunk(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + int ret = 0; + + /* Go over the buddies and find next free slot */ + ret = hws_pool_buddy_get_mem_chunk(pool, chunk->order, + &chunk->resource_idx, + &chunk->offset); + if (ret) + mlx5hws_err(pool->ctx, "Failed to get free slot for chunk with order: %d\n", + chunk->order); + + return ret; +} + +static void hws_pool_buddy_db_uninit(struct mlx5hws_pool *pool) +{ + struct mlx5hws_buddy_mem *buddy; + int i; + + for (i = 0; i < MLX5HWS_POOL_RESOURCE_ARR_SZ; i++) { + buddy = pool->db.buddy_manager->buddies[i]; + if (buddy) { + mlx5hws_buddy_cleanup(buddy); + kfree(buddy); + pool->db.buddy_manager->buddies[i] = NULL; + } + } + + kfree(pool->db.buddy_manager); +} + +static int hws_pool_buddy_db_init(struct mlx5hws_pool *pool, u32 log_range) +{ + pool->db.buddy_manager = kzalloc(sizeof(*pool->db.buddy_manager), GFP_KERNEL); + if (!pool->db.buddy_manager) + return -ENOMEM; + + if (pool->flags & MLX5HWS_POOL_FLAGS_ALLOC_MEM_ON_CREATE) { + bool new_buddy; + + if (!hws_pool_buddy_get_next_buddy(pool, 0, log_range, &new_buddy)) { + mlx5hws_err(pool->ctx, + "Failed allocating memory on create log_sz: %d\n", log_range); + kfree(pool->db.buddy_manager); + return -ENOMEM; + } + } + + pool->p_db_uninit = &hws_pool_buddy_db_uninit; + pool->p_get_chunk = &hws_pool_buddy_db_get_chunk; + pool->p_put_chunk = &hws_pool_buddy_db_put_chunk; + + return 0; +} + +static int hws_pool_create_resource_on_index(struct mlx5hws_pool *pool, + u32 alloc_size, int idx) +{ + int ret = hws_pool_resource_alloc(pool, alloc_size, idx); + + if (ret) { + mlx5hws_err(pool->ctx, "Failed to create resource type: %d: size %d index: %d\n", + pool->type, alloc_size, idx); + return ret; + } + + return 0; +} + +static struct mlx5hws_pool_elements * +hws_pool_element_create_new_elem(struct mlx5hws_pool *pool, u32 order, int idx) +{ + struct mlx5hws_pool_elements *elem; + u32 alloc_size; + + alloc_size = pool->alloc_log_sz; + + elem = kzalloc(sizeof(*elem), GFP_KERNEL); + if (!elem) + return NULL; + + /* Sharing the same resource, also means that all the elements are with size 1 */ + if ((pool->flags & MLX5HWS_POOL_FLAGS_FIXED_SIZE_OBJECTS) && + !(pool->flags & MLX5HWS_POOL_FLAGS_RESOURCE_PER_CHUNK)) { + /* Currently all chunks in size 1 */ + elem->bitmap = hws_pool_create_and_init_bitmap(alloc_size - order); + if (!elem->bitmap) { + mlx5hws_err(pool->ctx, + "Failed to create bitmap type: %d: size %d index: %d\n", + pool->type, alloc_size, idx); + goto free_elem; + } + + elem->log_size = alloc_size - order; + } + + if (hws_pool_create_resource_on_index(pool, alloc_size, idx)) { + mlx5hws_err(pool->ctx, "Failed to create resource type: %d: size %d index: %d\n", + pool->type, alloc_size, idx); + goto free_db; + } + + pool->db.element_manager->elements[idx] = elem; + + return elem; + +free_db: + bitmap_free(elem->bitmap); +free_elem: + kfree(elem); + return NULL; +} + +static int hws_pool_element_find_seg(struct mlx5hws_pool_elements *elem, int *seg) +{ + unsigned int segment, size; + + size = 1 << elem->log_size; + + segment = find_first_bit(elem->bitmap, size); + if (segment >= size) { + elem->is_full = true; + return -ENOMEM; + } + + bitmap_clear(elem->bitmap, segment, 1); + *seg = segment; + return 0; +} + +static int +hws_pool_onesize_element_get_mem_chunk(struct mlx5hws_pool *pool, u32 order, + u32 *idx, int *seg) +{ + struct mlx5hws_pool_elements *elem; + + elem = pool->db.element_manager->elements[0]; + if (!elem) + elem = hws_pool_element_create_new_elem(pool, order, 0); + if (!elem) + goto err_no_elem; + + if (hws_pool_element_find_seg(elem, seg) != 0) { + mlx5hws_err(pool->ctx, "No more resources (last request order: %d)\n", order); + return -ENOMEM; + } + + *idx = 0; + elem->num_of_elements++; + return 0; + +err_no_elem: + mlx5hws_err(pool->ctx, "Failed to allocate element for order: %d\n", order); + return -ENOMEM; +} + +static int +hws_pool_general_element_get_mem_chunk(struct mlx5hws_pool *pool, u32 order, + u32 *idx, int *seg) +{ + int ret, i; + + for (i = 0; i < MLX5HWS_POOL_RESOURCE_ARR_SZ; i++) { + if (!pool->resource[i]) { + ret = hws_pool_create_resource_on_index(pool, order, i); + if (ret) + goto err_no_res; + *idx = i; + *seg = 0; /* One memory slot in that element */ + return 0; + } + } + + mlx5hws_err(pool->ctx, "No more resources (last request order: %d)\n", order); + return -ENOMEM; + +err_no_res: + mlx5hws_err(pool->ctx, "Failed to allocate element for order: %d\n", order); + return -ENOMEM; +} + +static int hws_pool_general_element_db_get_chunk(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + int ret; + + /* Go over all memory elements and find/allocate free slot */ + ret = hws_pool_general_element_get_mem_chunk(pool, chunk->order, + &chunk->resource_idx, + &chunk->offset); + if (ret) + mlx5hws_err(pool->ctx, "Failed to get free slot for chunk with order: %d\n", + chunk->order); + + return ret; +} + +static void hws_pool_general_element_db_put_chunk(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + if (unlikely(!pool->resource[chunk->resource_idx])) + pr_warn("HWS: invalid resource with index %d\n", chunk->resource_idx); + + if (pool->flags & MLX5HWS_POOL_FLAGS_RELEASE_FREE_RESOURCE) + hws_pool_resource_free(pool, chunk->resource_idx); +} + +static void hws_pool_general_element_db_uninit(struct mlx5hws_pool *pool) +{ + (void)pool; +} + +/* This memory management works as the following: + * - At start doesn't allocate no mem at all. + * - When new request for chunk arrived: + * allocate resource and give it. + * - When free that chunk: + * the resource is freed. + */ +static int hws_pool_general_element_db_init(struct mlx5hws_pool *pool) +{ + pool->p_db_uninit = &hws_pool_general_element_db_uninit; + pool->p_get_chunk = &hws_pool_general_element_db_get_chunk; + pool->p_put_chunk = &hws_pool_general_element_db_put_chunk; + + return 0; +} + +static void hws_onesize_element_db_destroy_element(struct mlx5hws_pool *pool, + struct mlx5hws_pool_elements *elem, + struct mlx5hws_pool_chunk *chunk) +{ + if (unlikely(!pool->resource[chunk->resource_idx])) + pr_warn("HWS: invalid resource with index %d\n", chunk->resource_idx); + + hws_pool_resource_free(pool, chunk->resource_idx); + kfree(elem); + pool->db.element_manager->elements[chunk->resource_idx] = NULL; +} + +static void hws_onesize_element_db_put_chunk(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + struct mlx5hws_pool_elements *elem; + + if (unlikely(chunk->resource_idx)) + pr_warn("HWS: invalid resource with index %d\n", chunk->resource_idx); + + elem = pool->db.element_manager->elements[chunk->resource_idx]; + if (!elem) { + mlx5hws_err(pool->ctx, "No such element (%d)\n", chunk->resource_idx); + return; + } + + bitmap_set(elem->bitmap, chunk->offset, 1); + elem->is_full = false; + elem->num_of_elements--; + + if (pool->flags & MLX5HWS_POOL_FLAGS_RELEASE_FREE_RESOURCE && + !elem->num_of_elements) + hws_onesize_element_db_destroy_element(pool, elem, chunk); +} + +static int hws_onesize_element_db_get_chunk(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + int ret = 0; + + /* Go over all memory elements and find/allocate free slot */ + ret = hws_pool_onesize_element_get_mem_chunk(pool, chunk->order, + &chunk->resource_idx, + &chunk->offset); + if (ret) + mlx5hws_err(pool->ctx, "Failed to get free slot for chunk with order: %d\n", + chunk->order); + + return ret; +} + +static void hws_onesize_element_db_uninit(struct mlx5hws_pool *pool) +{ + struct mlx5hws_pool_elements *elem; + int i; + + for (i = 0; i < MLX5HWS_POOL_RESOURCE_ARR_SZ; i++) { + elem = pool->db.element_manager->elements[i]; + if (elem) { + bitmap_free(elem->bitmap); + kfree(elem); + pool->db.element_manager->elements[i] = NULL; + } + } + kfree(pool->db.element_manager); +} + +/* This memory management works as the following: + * - At start doesn't allocate no mem at all. + * - When new request for chunk arrived: + * aloocate the first and only slot of memory/resource + * when it ended return error. + */ +static int hws_pool_onesize_element_db_init(struct mlx5hws_pool *pool) +{ + pool->db.element_manager = kzalloc(sizeof(*pool->db.element_manager), GFP_KERNEL); + if (!pool->db.element_manager) + return -ENOMEM; + + pool->p_db_uninit = &hws_onesize_element_db_uninit; + pool->p_get_chunk = &hws_onesize_element_db_get_chunk; + pool->p_put_chunk = &hws_onesize_element_db_put_chunk; + + return 0; +} + +static int hws_pool_db_init(struct mlx5hws_pool *pool, + enum mlx5hws_db_type db_type) +{ + int ret; + + if (db_type == MLX5HWS_POOL_DB_TYPE_GENERAL_SIZE) + ret = hws_pool_general_element_db_init(pool); + else if (db_type == MLX5HWS_POOL_DB_TYPE_ONE_SIZE_RESOURCE) + ret = hws_pool_onesize_element_db_init(pool); + else + ret = hws_pool_buddy_db_init(pool, pool->alloc_log_sz); + + if (ret) { + mlx5hws_err(pool->ctx, "Failed to init general db : %d (ret: %d)\n", db_type, ret); + return ret; + } + + return 0; +} + +static void hws_pool_db_unint(struct mlx5hws_pool *pool) +{ + pool->p_db_uninit(pool); +} + +int mlx5hws_pool_chunk_alloc(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + int ret; + + mutex_lock(&pool->lock); + ret = pool->p_get_chunk(pool, chunk); + mutex_unlock(&pool->lock); + + return ret; +} + +void mlx5hws_pool_chunk_free(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + mutex_lock(&pool->lock); + pool->p_put_chunk(pool, chunk); + mutex_unlock(&pool->lock); +} + +struct mlx5hws_pool * +mlx5hws_pool_create(struct mlx5hws_context *ctx, struct mlx5hws_pool_attr *pool_attr) +{ + enum mlx5hws_db_type res_db_type; + struct mlx5hws_pool *pool; + + pool = kzalloc(sizeof(*pool), GFP_KERNEL); + if (!pool) + return NULL; + + pool->ctx = ctx; + pool->type = pool_attr->pool_type; + pool->alloc_log_sz = pool_attr->alloc_log_sz; + pool->flags = pool_attr->flags; + pool->tbl_type = pool_attr->table_type; + pool->opt_type = pool_attr->opt_type; + + /* Support general db */ + if (pool->flags == (MLX5HWS_POOL_FLAGS_RELEASE_FREE_RESOURCE | + MLX5HWS_POOL_FLAGS_RESOURCE_PER_CHUNK)) + res_db_type = MLX5HWS_POOL_DB_TYPE_GENERAL_SIZE; + else if (pool->flags == (MLX5HWS_POOL_FLAGS_ONE_RESOURCE | + MLX5HWS_POOL_FLAGS_FIXED_SIZE_OBJECTS)) + res_db_type = MLX5HWS_POOL_DB_TYPE_ONE_SIZE_RESOURCE; + else + res_db_type = MLX5HWS_POOL_DB_TYPE_BUDDY; + + pool->alloc_log_sz = pool_attr->alloc_log_sz; + + if (hws_pool_db_init(pool, res_db_type)) + goto free_pool; + + mutex_init(&pool->lock); + + return pool; + +free_pool: + kfree(pool); + return NULL; +} + +int mlx5hws_pool_destroy(struct mlx5hws_pool *pool) +{ + int i; + + mutex_destroy(&pool->lock); + + for (i = 0; i < MLX5HWS_POOL_RESOURCE_ARR_SZ; i++) + if (pool->resource[i]) + hws_pool_resource_free(pool, i); + + hws_pool_db_unint(pool); + + kfree(pool); + return 0; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.h new file mode 100644 index 0000000000000..621298b352b24 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.h @@ -0,0 +1,151 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_POOL_H_ +#define MLX5HWS_POOL_H_ + +#define MLX5HWS_POOL_STC_LOG_SZ 15 + +#define MLX5HWS_POOL_RESOURCE_ARR_SZ 100 + +enum mlx5hws_pool_type { + MLX5HWS_POOL_TYPE_STE, + MLX5HWS_POOL_TYPE_STC, +}; + +struct mlx5hws_pool_chunk { + u32 resource_idx; + /* Internal offset, relative to base index */ + int offset; + int order; +}; + +struct mlx5hws_pool_resource { + struct mlx5hws_pool *pool; + u32 base_id; + u32 range; +}; + +enum mlx5hws_pool_flags { + /* Only a one resource in that pool */ + MLX5HWS_POOL_FLAGS_ONE_RESOURCE = 1 << 0, + MLX5HWS_POOL_FLAGS_RELEASE_FREE_RESOURCE = 1 << 1, + /* No sharing resources between chunks */ + MLX5HWS_POOL_FLAGS_RESOURCE_PER_CHUNK = 1 << 2, + /* All objects are in the same size */ + MLX5HWS_POOL_FLAGS_FIXED_SIZE_OBJECTS = 1 << 3, + /* Managed by buddy allocator */ + MLX5HWS_POOL_FLAGS_BUDDY_MANAGED = 1 << 4, + /* Allocate pool_type memory on pool creation */ + MLX5HWS_POOL_FLAGS_ALLOC_MEM_ON_CREATE = 1 << 5, + + /* These values should be used by the caller */ + MLX5HWS_POOL_FLAGS_FOR_STC_POOL = + MLX5HWS_POOL_FLAGS_ONE_RESOURCE | + MLX5HWS_POOL_FLAGS_FIXED_SIZE_OBJECTS, + MLX5HWS_POOL_FLAGS_FOR_MATCHER_STE_POOL = + MLX5HWS_POOL_FLAGS_RELEASE_FREE_RESOURCE | + MLX5HWS_POOL_FLAGS_RESOURCE_PER_CHUNK, + MLX5HWS_POOL_FLAGS_FOR_STE_ACTION_POOL = + MLX5HWS_POOL_FLAGS_ONE_RESOURCE | + MLX5HWS_POOL_FLAGS_BUDDY_MANAGED | + MLX5HWS_POOL_FLAGS_ALLOC_MEM_ON_CREATE, +}; + +enum mlx5hws_pool_optimize { + MLX5HWS_POOL_OPTIMIZE_NONE = 0x0, + MLX5HWS_POOL_OPTIMIZE_ORIG = 0x1, + MLX5HWS_POOL_OPTIMIZE_MIRROR = 0x2, +}; + +struct mlx5hws_pool_attr { + enum mlx5hws_pool_type pool_type; + enum mlx5hws_table_type table_type; + enum mlx5hws_pool_flags flags; + enum mlx5hws_pool_optimize opt_type; + /* Allocation size once memory is depleted */ + size_t alloc_log_sz; +}; + +enum mlx5hws_db_type { + /* Uses for allocating chunk of big memory, each element has its own resource in the FW*/ + MLX5HWS_POOL_DB_TYPE_GENERAL_SIZE, + /* One resource only, all the elements are with same one size */ + MLX5HWS_POOL_DB_TYPE_ONE_SIZE_RESOURCE, + /* Many resources, the memory allocated with buddy mechanism */ + MLX5HWS_POOL_DB_TYPE_BUDDY, +}; + +struct mlx5hws_buddy_manager { + struct mlx5hws_buddy_mem *buddies[MLX5HWS_POOL_RESOURCE_ARR_SZ]; +}; + +struct mlx5hws_pool_elements { + u32 num_of_elements; + unsigned long *bitmap; + u32 log_size; + bool is_full; +}; + +struct mlx5hws_element_manager { + struct mlx5hws_pool_elements *elements[MLX5HWS_POOL_RESOURCE_ARR_SZ]; +}; + +struct mlx5hws_pool_db { + enum mlx5hws_db_type type; + union { + struct mlx5hws_element_manager *element_manager; + struct mlx5hws_buddy_manager *buddy_manager; + }; +}; + +typedef int (*mlx5hws_pool_db_get_chunk)(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk); +typedef void (*mlx5hws_pool_db_put_chunk)(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk); +typedef void (*mlx5hws_pool_unint_db)(struct mlx5hws_pool *pool); + +struct mlx5hws_pool { + struct mlx5hws_context *ctx; + enum mlx5hws_pool_type type; + enum mlx5hws_pool_flags flags; + struct mutex lock; /* protect the pool */ + size_t alloc_log_sz; + enum mlx5hws_table_type tbl_type; + enum mlx5hws_pool_optimize opt_type; + struct mlx5hws_pool_resource *resource[MLX5HWS_POOL_RESOURCE_ARR_SZ]; + struct mlx5hws_pool_resource *mirror_resource[MLX5HWS_POOL_RESOURCE_ARR_SZ]; + /* DB */ + struct mlx5hws_pool_db db; + /* Functions */ + mlx5hws_pool_unint_db p_db_uninit; + mlx5hws_pool_db_get_chunk p_get_chunk; + mlx5hws_pool_db_put_chunk p_put_chunk; +}; + +struct mlx5hws_pool * +mlx5hws_pool_create(struct mlx5hws_context *ctx, + struct mlx5hws_pool_attr *pool_attr); + +int mlx5hws_pool_destroy(struct mlx5hws_pool *pool); + +int mlx5hws_pool_chunk_alloc(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk); + +void mlx5hws_pool_chunk_free(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk); + +static inline u32 +mlx5hws_pool_chunk_get_base_id(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + return pool->resource[chunk->resource_idx]->base_id; +} + +static inline u32 +mlx5hws_pool_chunk_get_base_mirror_id(struct mlx5hws_pool *pool, + struct mlx5hws_pool_chunk *chunk) +{ + return pool->mirror_resource[chunk->resource_idx]->base_id; +} +#endif /* MLX5HWS_POOL_H_ */ From 76f8d14d207e7a7e4e1d2ef6811a1d2ac5d11ecd Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:43:36 +0300 Subject: [PATCH 098/548] net/mlx5: HWS, added backward-compatible API handling Added implementation of backward-compatible (BWC) steering API. Native HWS API is very different from SWS API: - SWS is synchronous (rule creation/deletion API call returns when the rule is created/deleted), while HWS is asynchronous (it requires polling for completion in order to know when the rule creation/deletion happened) - SWS manages its own memory (it allocates/frees all the needed memory for steering rules, while HWS requires the rules memory to be allocated/freed outside the API In order to make HWS fit the existing fs-core steering API paradigm, this patch adds implementation of backward-compatible (BWC) steering API that has the bahaviour similar to SWS: among others, it encompasses all the rules' memory management and completion polling, presenting the usual synchronous API for the upper layer. A user that wishes to utilize the full speed potential of HWS can call the HWS async API and have rule insertion/deletion batching, lower memory management overhead, and lower CPU utilization. Such approach will be taken by the future Connection Tracking. Note that BWC steering doesn't support yet rules that require more than one match STE - complex rules. This support will be added later on. Reviewed-by: Erez Shitrit Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 2111bb970c787b16b002dc726c1d296ce87a00fb) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_bwc.c | 997 ++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_bwc.h | 73 ++ .../core/steering/hws/mlx5hws_bwc_complex.c | 86 ++ .../core/steering/hws/mlx5hws_bwc_complex.h | 29 + 4 files changed, 1185 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c new file mode 100644 index 0000000000000..bd52b05db3670 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c @@ -0,0 +1,997 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +static u16 hws_bwc_gen_queue_idx(struct mlx5hws_context *ctx) +{ + /* assign random queue */ + return get_random_u8() % mlx5hws_bwc_queues(ctx); +} + +static u16 +hws_bwc_get_burst_th(struct mlx5hws_context *ctx, u16 queue_id) +{ + return min(ctx->send_queue[queue_id].num_entries / 2, + MLX5HWS_BWC_MATCHER_REHASH_BURST_TH); +} + +static struct mutex * +hws_bwc_get_queue_lock(struct mlx5hws_context *ctx, u16 idx) +{ + return &ctx->bwc_send_queue_locks[idx]; +} + +static void hws_bwc_lock_all_queues(struct mlx5hws_context *ctx) +{ + u16 bwc_queues = mlx5hws_bwc_queues(ctx); + struct mutex *queue_lock; /* Protect the queue */ + int i; + + for (i = 0; i < bwc_queues; i++) { + queue_lock = hws_bwc_get_queue_lock(ctx, i); + mutex_lock(queue_lock); + } +} + +static void hws_bwc_unlock_all_queues(struct mlx5hws_context *ctx) +{ + u16 bwc_queues = mlx5hws_bwc_queues(ctx); + struct mutex *queue_lock; /* Protect the queue */ + int i = bwc_queues; + + while (i--) { + queue_lock = hws_bwc_get_queue_lock(ctx, i); + mutex_unlock(queue_lock); + } +} + +static void hws_bwc_matcher_init_attr(struct mlx5hws_matcher_attr *attr, + u32 priority, + u8 size_log) +{ + memset(attr, 0, sizeof(*attr)); + + attr->priority = priority; + attr->optimize_using_rule_idx = 0; + attr->mode = MLX5HWS_MATCHER_RESOURCE_MODE_RULE; + attr->optimize_flow_src = MLX5HWS_MATCHER_FLOW_SRC_ANY; + attr->insert_mode = MLX5HWS_MATCHER_INSERT_BY_HASH; + attr->distribute_mode = MLX5HWS_MATCHER_DISTRIBUTE_BY_HASH; + attr->rule.num_log = size_log; + attr->resizable = true; + attr->max_num_of_at_attach = MLX5HWS_BWC_MATCHER_ATTACH_AT_NUM; +} + +int mlx5hws_bwc_matcher_create_simple(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_table *table, + u32 priority, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask, + enum mlx5hws_action_type action_types[]) +{ + enum mlx5hws_action_type init_action_types[1] = { MLX5HWS_ACTION_TYP_LAST }; + struct mlx5hws_context *ctx = table->ctx; + u16 bwc_queues = mlx5hws_bwc_queues(ctx); + struct mlx5hws_matcher_attr attr = {0}; + int i; + + bwc_matcher->rules = kcalloc(bwc_queues, sizeof(*bwc_matcher->rules), GFP_KERNEL); + if (!bwc_matcher->rules) + goto err; + + for (i = 0; i < bwc_queues; i++) + INIT_LIST_HEAD(&bwc_matcher->rules[i]); + + hws_bwc_matcher_init_attr(&attr, + priority, + MLX5HWS_BWC_MATCHER_INIT_SIZE_LOG); + + bwc_matcher->priority = priority; + bwc_matcher->size_log = MLX5HWS_BWC_MATCHER_INIT_SIZE_LOG; + + /* create dummy action template */ + bwc_matcher->at[0] = + mlx5hws_action_template_create(action_types ? + action_types : init_action_types); + if (!bwc_matcher->at[0]) { + mlx5hws_err(table->ctx, "BWC matcher: failed creating action template\n"); + goto free_bwc_matcher_rules; + } + + bwc_matcher->num_of_at = 1; + + bwc_matcher->mt = mlx5hws_match_template_create(ctx, + mask->match_buf, + mask->match_sz, + match_criteria_enable); + if (!bwc_matcher->mt) { + mlx5hws_err(table->ctx, "BWC matcher: failed creating match template\n"); + goto free_at; + } + + bwc_matcher->matcher = mlx5hws_matcher_create(table, + &bwc_matcher->mt, 1, + &bwc_matcher->at[0], + bwc_matcher->num_of_at, + &attr); + if (!bwc_matcher->matcher) { + mlx5hws_err(table->ctx, "BWC matcher: failed creating HWS matcher\n"); + goto free_mt; + } + + return 0; + +free_mt: + mlx5hws_match_template_destroy(bwc_matcher->mt); +free_at: + mlx5hws_action_template_destroy(bwc_matcher->at[0]); +free_bwc_matcher_rules: + kfree(bwc_matcher->rules); +err: + return -EINVAL; +} + +struct mlx5hws_bwc_matcher * +mlx5hws_bwc_matcher_create(struct mlx5hws_table *table, + u32 priority, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask) +{ + struct mlx5hws_bwc_matcher *bwc_matcher; + bool is_complex; + int ret; + + if (!mlx5hws_context_bwc_supported(table->ctx)) { + mlx5hws_err(table->ctx, + "BWC matcher: context created w/o BWC API compatibility\n"); + return NULL; + } + + bwc_matcher = kzalloc(sizeof(*bwc_matcher), GFP_KERNEL); + if (!bwc_matcher) + return NULL; + + /* Check if the required match params can be all matched + * in single STE, otherwise complex matcher is needed. + */ + + is_complex = mlx5hws_bwc_match_params_is_complex(table->ctx, match_criteria_enable, mask); + if (is_complex) + ret = mlx5hws_bwc_matcher_create_complex(bwc_matcher, + table, + priority, + match_criteria_enable, + mask); + else + ret = mlx5hws_bwc_matcher_create_simple(bwc_matcher, + table, + priority, + match_criteria_enable, + mask, + NULL); + if (ret) + goto free_bwc_matcher; + + return bwc_matcher; + +free_bwc_matcher: + kfree(bwc_matcher); + + return NULL; +} + +int mlx5hws_bwc_matcher_destroy_simple(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + int i; + + mlx5hws_matcher_destroy(bwc_matcher->matcher); + bwc_matcher->matcher = NULL; + + for (i = 0; i < bwc_matcher->num_of_at; i++) + mlx5hws_action_template_destroy(bwc_matcher->at[i]); + + mlx5hws_match_template_destroy(bwc_matcher->mt); + kfree(bwc_matcher->rules); + + return 0; +} + +int mlx5hws_bwc_matcher_destroy(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + if (bwc_matcher->num_of_rules) + mlx5hws_err(bwc_matcher->matcher->tbl->ctx, + "BWC matcher destroy: matcher still has %d rules\n", + bwc_matcher->num_of_rules); + + mlx5hws_bwc_matcher_destroy_simple(bwc_matcher); + + kfree(bwc_matcher); + return 0; +} + +static int hws_bwc_queue_poll(struct mlx5hws_context *ctx, + u16 queue_id, + u32 *pending_rules, + bool drain) +{ + struct mlx5hws_flow_op_result comp[MLX5HWS_BWC_MATCHER_REHASH_BURST_TH]; + u16 burst_th = hws_bwc_get_burst_th(ctx, queue_id); + bool got_comp = *pending_rules >= burst_th; + bool queue_full; + int err = 0; + int ret; + int i; + + /* Check if there are any completions at all */ + if (!got_comp && !drain) + return 0; + + queue_full = mlx5hws_send_engine_full(&ctx->send_queue[queue_id]); + while (queue_full || ((got_comp || drain) && *pending_rules)) { + ret = mlx5hws_send_queue_poll(ctx, queue_id, comp, burst_th); + if (unlikely(ret < 0)) { + mlx5hws_err(ctx, "BWC poll error: polling queue %d returned %d\n", + queue_id, ret); + return -EINVAL; + } + + if (ret) { + (*pending_rules) -= ret; + for (i = 0; i < ret; i++) { + if (unlikely(comp[i].status != MLX5HWS_FLOW_OP_SUCCESS)) { + mlx5hws_err(ctx, + "BWC poll error: polling queue %d returned completion with error\n", + queue_id); + err = -EINVAL; + } + } + queue_full = false; + } + + got_comp = !!ret; + } + + return err; +} + +void +mlx5hws_bwc_rule_fill_attr(struct mlx5hws_bwc_matcher *bwc_matcher, + u16 bwc_queue_idx, + u32 flow_source, + struct mlx5hws_rule_attr *rule_attr) +{ + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + + /* no use of INSERT_BY_INDEX in bwc rule */ + rule_attr->rule_idx = 0; + + /* notify HW at each rule insertion/deletion */ + rule_attr->burst = 0; + + /* We don't need user data, but the API requires it to exist */ + rule_attr->user_data = (void *)0xFACADE; + + rule_attr->queue_id = mlx5hws_bwc_get_queue_id(ctx, bwc_queue_idx); + rule_attr->flow_source = flow_source; +} + +struct mlx5hws_bwc_rule * +mlx5hws_bwc_rule_alloc(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + struct mlx5hws_bwc_rule *bwc_rule; + + bwc_rule = kzalloc(sizeof(*bwc_rule), GFP_KERNEL); + if (unlikely(!bwc_rule)) + goto out_err; + + bwc_rule->rule = kzalloc(sizeof(*bwc_rule->rule), GFP_KERNEL); + if (unlikely(!bwc_rule->rule)) + goto free_rule; + + bwc_rule->bwc_matcher = bwc_matcher; + return bwc_rule; + +free_rule: + kfree(bwc_rule); +out_err: + return NULL; +} + +void mlx5hws_bwc_rule_free(struct mlx5hws_bwc_rule *bwc_rule) +{ + if (likely(bwc_rule->rule)) + kfree(bwc_rule->rule); + kfree(bwc_rule); +} + +static void hws_bwc_rule_list_add(struct mlx5hws_bwc_rule *bwc_rule, u16 idx) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + + bwc_matcher->num_of_rules++; + bwc_rule->bwc_queue_idx = idx; + list_add(&bwc_rule->list_node, &bwc_matcher->rules[idx]); +} + +static void hws_bwc_rule_list_remove(struct mlx5hws_bwc_rule *bwc_rule) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + + bwc_matcher->num_of_rules--; + list_del_init(&bwc_rule->list_node); +} + +static int +hws_bwc_rule_destroy_hws_async(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_rule_attr *attr) +{ + return mlx5hws_rule_destroy(bwc_rule->rule, attr); +} + +static int +hws_bwc_rule_destroy_hws_sync(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_rule_attr *rule_attr) +{ + struct mlx5hws_context *ctx = bwc_rule->bwc_matcher->matcher->tbl->ctx; + struct mlx5hws_flow_op_result completion; + int ret; + + ret = hws_bwc_rule_destroy_hws_async(bwc_rule, rule_attr); + if (unlikely(ret)) + return ret; + + do { + ret = mlx5hws_send_queue_poll(ctx, rule_attr->queue_id, &completion, 1); + } while (ret != 1); + + if (unlikely(completion.status != MLX5HWS_FLOW_OP_SUCCESS || + (bwc_rule->rule->status != MLX5HWS_RULE_STATUS_DELETED && + bwc_rule->rule->status != MLX5HWS_RULE_STATUS_DELETING))) { + mlx5hws_err(ctx, "Failed destroying BWC rule: completion %d, rule status %d\n", + completion.status, bwc_rule->rule->status); + return -EINVAL; + } + + return 0; +} + +int mlx5hws_bwc_rule_destroy_simple(struct mlx5hws_bwc_rule *bwc_rule) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + u16 idx = bwc_rule->bwc_queue_idx; + struct mlx5hws_rule_attr attr; + struct mutex *queue_lock; /* Protect the queue */ + int ret; + + mlx5hws_bwc_rule_fill_attr(bwc_matcher, idx, 0, &attr); + + queue_lock = hws_bwc_get_queue_lock(ctx, idx); + + mutex_lock(queue_lock); + + ret = hws_bwc_rule_destroy_hws_sync(bwc_rule, &attr); + hws_bwc_rule_list_remove(bwc_rule); + + mutex_unlock(queue_lock); + + return ret; +} + +int mlx5hws_bwc_rule_destroy(struct mlx5hws_bwc_rule *bwc_rule) +{ + int ret; + + ret = mlx5hws_bwc_rule_destroy_simple(bwc_rule); + + mlx5hws_bwc_rule_free(bwc_rule); + return ret; +} + +static int +hws_bwc_rule_create_async(struct mlx5hws_bwc_rule *bwc_rule, + u32 *match_param, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *rule_attr) +{ + return mlx5hws_rule_create(bwc_rule->bwc_matcher->matcher, + 0, /* only one match template supported */ + match_param, + at_idx, + rule_actions, + rule_attr, + bwc_rule->rule); +} + +static int +hws_bwc_rule_create_sync(struct mlx5hws_bwc_rule *bwc_rule, + u32 *match_param, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *rule_attr) + +{ + struct mlx5hws_context *ctx = bwc_rule->bwc_matcher->matcher->tbl->ctx; + u32 expected_completions = 1; + int ret; + + ret = hws_bwc_rule_create_async(bwc_rule, match_param, + at_idx, rule_actions, + rule_attr); + if (unlikely(ret)) + return ret; + + ret = hws_bwc_queue_poll(ctx, rule_attr->queue_id, &expected_completions, true); + + return ret; +} + +static int +hws_bwc_rule_update_sync(struct mlx5hws_bwc_rule *bwc_rule, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *rule_attr) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + u32 expected_completions = 1; + int ret; + + ret = mlx5hws_rule_action_update(bwc_rule->rule, + at_idx, + rule_actions, + rule_attr); + if (unlikely(ret)) + return ret; + + ret = hws_bwc_queue_poll(ctx, rule_attr->queue_id, &expected_completions, true); + if (unlikely(ret)) + mlx5hws_err(ctx, "Failed updating BWC rule (%d)\n", ret); + + return ret; +} + +static bool +hws_bwc_matcher_size_maxed_out(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + struct mlx5hws_cmd_query_caps *caps = bwc_matcher->matcher->tbl->ctx->caps; + + return bwc_matcher->size_log + MLX5HWS_MATCHER_ASSURED_MAIN_TBL_DEPTH >= + caps->ste_alloc_log_max - 1; +} + +static bool +hws_bwc_matcher_rehash_size_needed(struct mlx5hws_bwc_matcher *bwc_matcher, + u32 num_of_rules) +{ + if (unlikely(hws_bwc_matcher_size_maxed_out(bwc_matcher))) + return false; + + if (unlikely((num_of_rules * 100 / MLX5HWS_BWC_MATCHER_REHASH_PERCENT_TH) >= + (1UL << bwc_matcher->size_log))) + return true; + + return false; +} + +static void +hws_bwc_rule_actions_to_action_types(struct mlx5hws_rule_action rule_actions[], + enum mlx5hws_action_type action_types[]) +{ + int i = 0; + + for (i = 0; + rule_actions[i].action && (rule_actions[i].action->type != MLX5HWS_ACTION_TYP_LAST); + i++) { + action_types[i] = (enum mlx5hws_action_type)rule_actions[i].action->type; + } + + action_types[i] = MLX5HWS_ACTION_TYP_LAST; +} + +static int +hws_bwc_matcher_extend_at(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_rule_action rule_actions[]) +{ + enum mlx5hws_action_type action_types[MLX5HWS_BWC_MAX_ACTS]; + + hws_bwc_rule_actions_to_action_types(rule_actions, action_types); + + bwc_matcher->at[bwc_matcher->num_of_at] = + mlx5hws_action_template_create(action_types); + + if (unlikely(!bwc_matcher->at[bwc_matcher->num_of_at])) + return -ENOMEM; + + bwc_matcher->num_of_at++; + return 0; +} + +static int +hws_bwc_matcher_extend_size(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + struct mlx5hws_cmd_query_caps *caps = ctx->caps; + + if (unlikely(hws_bwc_matcher_size_maxed_out(bwc_matcher))) { + mlx5hws_err(ctx, "Can't resize matcher: depth exceeds limit %d\n", + caps->rtc_log_depth_max); + return -ENOMEM; + } + + bwc_matcher->size_log = + min(bwc_matcher->size_log + MLX5HWS_BWC_MATCHER_SIZE_LOG_STEP, + caps->ste_alloc_log_max - MLX5HWS_MATCHER_ASSURED_MAIN_TBL_DEPTH); + + return 0; +} + +static int +hws_bwc_matcher_find_at(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_rule_action rule_actions[]) +{ + enum mlx5hws_action_type *action_type_arr; + int i, j; + + /* start from index 1 - first action template is a dummy */ + for (i = 1; i < bwc_matcher->num_of_at; i++) { + j = 0; + action_type_arr = bwc_matcher->at[i]->action_type_arr; + + while (rule_actions[j].action && + rule_actions[j].action->type != MLX5HWS_ACTION_TYP_LAST) { + if (action_type_arr[j] != rule_actions[j].action->type) + break; + j++; + } + + if (action_type_arr[j] == MLX5HWS_ACTION_TYP_LAST && + (!rule_actions[j].action || + rule_actions[j].action->type == MLX5HWS_ACTION_TYP_LAST)) + return i; + } + + return -1; +} + +static int hws_bwc_matcher_move_all_simple(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + u16 bwc_queues = mlx5hws_bwc_queues(ctx); + struct mlx5hws_bwc_rule **bwc_rules; + struct mlx5hws_rule_attr rule_attr; + u32 *pending_rules; + int i, j, ret = 0; + bool all_done; + u16 burst_th; + + mlx5hws_bwc_rule_fill_attr(bwc_matcher, 0, 0, &rule_attr); + + pending_rules = kcalloc(bwc_queues, sizeof(*pending_rules), GFP_KERNEL); + if (!pending_rules) + return -ENOMEM; + + bwc_rules = kcalloc(bwc_queues, sizeof(*bwc_rules), GFP_KERNEL); + if (!bwc_rules) { + ret = -ENOMEM; + goto free_pending_rules; + } + + for (i = 0; i < bwc_queues; i++) { + if (list_empty(&bwc_matcher->rules[i])) + bwc_rules[i] = NULL; + else + bwc_rules[i] = list_first_entry(&bwc_matcher->rules[i], + struct mlx5hws_bwc_rule, + list_node); + } + + do { + all_done = true; + + for (i = 0; i < bwc_queues; i++) { + rule_attr.queue_id = mlx5hws_bwc_get_queue_id(ctx, i); + burst_th = hws_bwc_get_burst_th(ctx, rule_attr.queue_id); + + for (j = 0; j < burst_th && bwc_rules[i]; j++) { + rule_attr.burst = !!((j + 1) % burst_th); + ret = mlx5hws_matcher_resize_rule_move(bwc_matcher->matcher, + bwc_rules[i]->rule, + &rule_attr); + if (unlikely(ret)) { + mlx5hws_err(ctx, + "Moving BWC rule failed during rehash (%d)\n", + ret); + goto free_bwc_rules; + } + + all_done = false; + pending_rules[i]++; + bwc_rules[i] = list_is_last(&bwc_rules[i]->list_node, + &bwc_matcher->rules[i]) ? + NULL : list_next_entry(bwc_rules[i], list_node); + + ret = hws_bwc_queue_poll(ctx, rule_attr.queue_id, + &pending_rules[i], false); + if (unlikely(ret)) + goto free_bwc_rules; + } + } + } while (!all_done); + + /* drain all the bwc queues */ + for (i = 0; i < bwc_queues; i++) { + if (pending_rules[i]) { + u16 queue_id = mlx5hws_bwc_get_queue_id(ctx, i); + + mlx5hws_send_engine_flush_queue(&ctx->send_queue[queue_id]); + ret = hws_bwc_queue_poll(ctx, queue_id, + &pending_rules[i], true); + if (unlikely(ret)) + goto free_bwc_rules; + } + } + +free_bwc_rules: + kfree(bwc_rules); +free_pending_rules: + kfree(pending_rules); + + return ret; +} + +static int hws_bwc_matcher_move_all(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + return hws_bwc_matcher_move_all_simple(bwc_matcher); +} + +static int hws_bwc_matcher_move(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + struct mlx5hws_matcher_attr matcher_attr = {0}; + struct mlx5hws_matcher *old_matcher; + struct mlx5hws_matcher *new_matcher; + int ret; + + hws_bwc_matcher_init_attr(&matcher_attr, + bwc_matcher->priority, + bwc_matcher->size_log); + + old_matcher = bwc_matcher->matcher; + new_matcher = mlx5hws_matcher_create(old_matcher->tbl, + &bwc_matcher->mt, 1, + bwc_matcher->at, + bwc_matcher->num_of_at, + &matcher_attr); + if (!new_matcher) { + mlx5hws_err(ctx, "Rehash error: matcher creation failed\n"); + return -ENOMEM; + } + + ret = mlx5hws_matcher_resize_set_target(old_matcher, new_matcher); + if (ret) { + mlx5hws_err(ctx, "Rehash error: failed setting resize target\n"); + return ret; + } + + ret = hws_bwc_matcher_move_all(bwc_matcher); + if (ret) { + mlx5hws_err(ctx, "Rehash error: moving rules failed\n"); + return -ENOMEM; + } + + bwc_matcher->matcher = new_matcher; + mlx5hws_matcher_destroy(old_matcher); + + return 0; +} + +static int +hws_bwc_matcher_rehash_size(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + u32 num_of_rules; + int ret; + + /* If the current matcher size is already at its max size, we can't + * do the rehash. Skip it and try adding the rule again - perhaps + * there was some change. + */ + if (hws_bwc_matcher_size_maxed_out(bwc_matcher)) + return 0; + + /* It is possible that other rule has already performed rehash. + * Need to check again if we really need rehash. + * If the reason for rehash was size, but not any more - skip rehash. + */ + num_of_rules = __atomic_load_n(&bwc_matcher->num_of_rules, __ATOMIC_RELAXED); + if (!hws_bwc_matcher_rehash_size_needed(bwc_matcher, num_of_rules)) + return 0; + + /* Now we're done all the checking - do the rehash: + * - extend match RTC size + * - create new matcher + * - move all the rules to the new matcher + * - destroy the old matcher + */ + + ret = hws_bwc_matcher_extend_size(bwc_matcher); + if (ret) + return ret; + + return hws_bwc_matcher_move(bwc_matcher); +} + +static int +hws_bwc_matcher_rehash_at(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + /* Rehash by action template doesn't require any additional checking. + * The bwc_matcher already contains the new action template. + * Just do the usual rehash: + * - create new matcher + * - move all the rules to the new matcher + * - destroy the old matcher + */ + return hws_bwc_matcher_move(bwc_matcher); +} + +int mlx5hws_bwc_rule_create_simple(struct mlx5hws_bwc_rule *bwc_rule, + u32 *match_param, + struct mlx5hws_rule_action rule_actions[], + u32 flow_source, + u16 bwc_queue_idx) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + struct mlx5hws_rule_attr rule_attr; + struct mutex *queue_lock; /* Protect the queue */ + u32 num_of_rules; + int ret = 0; + int at_idx; + + mlx5hws_bwc_rule_fill_attr(bwc_matcher, bwc_queue_idx, flow_source, &rule_attr); + + queue_lock = hws_bwc_get_queue_lock(ctx, bwc_queue_idx); + + mutex_lock(queue_lock); + + /* check if rehash needed due to missing action template */ + at_idx = hws_bwc_matcher_find_at(bwc_matcher, rule_actions); + if (unlikely(at_idx < 0)) { + /* we need to extend BWC matcher action templates array */ + mutex_unlock(queue_lock); + hws_bwc_lock_all_queues(ctx); + + ret = hws_bwc_matcher_extend_at(bwc_matcher, rule_actions); + if (unlikely(ret)) { + hws_bwc_unlock_all_queues(ctx); + return ret; + } + + /* action templates array was extended, we need the last idx */ + at_idx = bwc_matcher->num_of_at - 1; + + ret = mlx5hws_matcher_attach_at(bwc_matcher->matcher, + bwc_matcher->at[at_idx]); + if (unlikely(ret)) { + /* Action template attach failed, possibly due to + * requiring more action STEs. + * Need to attempt creating new matcher with all + * the action templates, including the new one. + */ + ret = hws_bwc_matcher_rehash_at(bwc_matcher); + if (unlikely(ret)) { + mlx5hws_action_template_destroy(bwc_matcher->at[at_idx]); + bwc_matcher->at[at_idx] = NULL; + bwc_matcher->num_of_at--; + + hws_bwc_unlock_all_queues(ctx); + + mlx5hws_err(ctx, + "BWC rule insertion: rehash AT failed (%d)\n", ret); + return ret; + } + } + + hws_bwc_unlock_all_queues(ctx); + mutex_lock(queue_lock); + } + + /* check if number of rules require rehash */ + num_of_rules = bwc_matcher->num_of_rules; + + if (unlikely(hws_bwc_matcher_rehash_size_needed(bwc_matcher, num_of_rules))) { + mutex_unlock(queue_lock); + + hws_bwc_lock_all_queues(ctx); + ret = hws_bwc_matcher_rehash_size(bwc_matcher); + hws_bwc_unlock_all_queues(ctx); + + if (ret) { + mlx5hws_err(ctx, "BWC rule insertion: rehash size [%d -> %d] failed (%d)\n", + bwc_matcher->size_log - MLX5HWS_BWC_MATCHER_SIZE_LOG_STEP, + bwc_matcher->size_log, + ret); + return ret; + } + + mutex_lock(queue_lock); + } + + ret = hws_bwc_rule_create_sync(bwc_rule, + match_param, + at_idx, + rule_actions, + &rule_attr); + if (likely(!ret)) { + hws_bwc_rule_list_add(bwc_rule, bwc_queue_idx); + mutex_unlock(queue_lock); + return 0; /* rule inserted successfully */ + } + + /* At this point the rule wasn't added. + * It could be because there was collision, or some other problem. + * If we don't dive deeper than API, the only thing we know is that + * the status of completion is RTE_FLOW_OP_ERROR. + * Try rehash by size and insert rule again - last chance. + */ + + mutex_unlock(queue_lock); + + hws_bwc_lock_all_queues(ctx); + ret = hws_bwc_matcher_rehash_size(bwc_matcher); + hws_bwc_unlock_all_queues(ctx); + + if (ret) { + mlx5hws_err(ctx, "BWC rule insertion: rehash failed (%d)\n", ret); + return ret; + } + + /* Rehash done, but we still have that pesky rule to add */ + mutex_lock(queue_lock); + + ret = hws_bwc_rule_create_sync(bwc_rule, + match_param, + at_idx, + rule_actions, + &rule_attr); + + if (unlikely(ret)) { + mutex_unlock(queue_lock); + mlx5hws_err(ctx, "BWC rule insertion failed (%d)\n", ret); + return ret; + } + + hws_bwc_rule_list_add(bwc_rule, bwc_queue_idx); + mutex_unlock(queue_lock); + + return 0; +} + +struct mlx5hws_bwc_rule * +mlx5hws_bwc_rule_create(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_match_parameters *params, + u32 flow_source, + struct mlx5hws_rule_action rule_actions[]) +{ + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + struct mlx5hws_bwc_rule *bwc_rule; + u16 bwc_queue_idx; + int ret; + + if (unlikely(!mlx5hws_context_bwc_supported(ctx))) { + mlx5hws_err(ctx, "BWC rule: Context created w/o BWC API compatibility\n"); + return NULL; + } + + bwc_rule = mlx5hws_bwc_rule_alloc(bwc_matcher); + if (unlikely(!bwc_rule)) + return NULL; + + bwc_queue_idx = hws_bwc_gen_queue_idx(ctx); + + ret = mlx5hws_bwc_rule_create_simple(bwc_rule, + params->match_buf, + rule_actions, + flow_source, + bwc_queue_idx); + if (unlikely(ret)) { + mlx5hws_bwc_rule_free(bwc_rule); + return NULL; + } + + return bwc_rule; +} + +static int +hws_bwc_rule_action_update(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_rule_action rule_actions[]) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + struct mlx5hws_rule_attr rule_attr; + struct mutex *queue_lock; /* Protect the queue */ + int at_idx, ret; + u16 idx; + + idx = bwc_rule->bwc_queue_idx; + + mlx5hws_bwc_rule_fill_attr(bwc_matcher, idx, 0, &rule_attr); + queue_lock = hws_bwc_get_queue_lock(ctx, idx); + + mutex_lock(queue_lock); + + /* check if rehash needed due to missing action template */ + at_idx = hws_bwc_matcher_find_at(bwc_matcher, rule_actions); + if (unlikely(at_idx < 0)) { + /* we need to extend BWC matcher action templates array */ + mutex_unlock(queue_lock); + hws_bwc_lock_all_queues(ctx); + + /* check again - perhaps other thread already did extend_at */ + at_idx = hws_bwc_matcher_find_at(bwc_matcher, rule_actions); + if (likely(at_idx < 0)) { + ret = hws_bwc_matcher_extend_at(bwc_matcher, rule_actions); + if (unlikely(ret)) { + hws_bwc_unlock_all_queues(ctx); + mlx5hws_err(ctx, "BWC rule update: failed extending AT (%d)", ret); + return -EINVAL; + } + + /* action templates array was extended, we need the last idx */ + at_idx = bwc_matcher->num_of_at - 1; + + ret = mlx5hws_matcher_attach_at(bwc_matcher->matcher, + bwc_matcher->at[at_idx]); + if (unlikely(ret)) { + /* Action template attach failed, possibly due to + * requiring more action STEs. + * Need to attempt creating new matcher with all + * the action templates, including the new one. + */ + ret = hws_bwc_matcher_rehash_at(bwc_matcher); + if (unlikely(ret)) { + mlx5hws_action_template_destroy(bwc_matcher->at[at_idx]); + bwc_matcher->at[at_idx] = NULL; + bwc_matcher->num_of_at--; + + hws_bwc_unlock_all_queues(ctx); + + mlx5hws_err(ctx, + "BWC rule update: rehash AT failed (%d)\n", + ret); + return ret; + } + } + } + + hws_bwc_unlock_all_queues(ctx); + mutex_lock(queue_lock); + } + + ret = hws_bwc_rule_update_sync(bwc_rule, + at_idx, + rule_actions, + &rule_attr); + mutex_unlock(queue_lock); + + if (unlikely(ret)) + mlx5hws_err(ctx, "BWC rule: update failed (%d)\n", ret); + + return ret; +} + +int mlx5hws_bwc_rule_action_update(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_rule_action rule_actions[]) +{ + struct mlx5hws_bwc_matcher *bwc_matcher = bwc_rule->bwc_matcher; + struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx; + + if (unlikely(!mlx5hws_context_bwc_supported(ctx))) { + mlx5hws_err(ctx, "BWC rule: Context created w/o BWC API compatibility\n"); + return -EINVAL; + } + + return hws_bwc_rule_action_update(bwc_rule, rule_actions); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h new file mode 100644 index 0000000000000..4fe8c32d8fbe8 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_BWC_H_ +#define MLX5HWS_BWC_H_ + +#define MLX5HWS_BWC_MATCHER_INIT_SIZE_LOG 1 +#define MLX5HWS_BWC_MATCHER_SIZE_LOG_STEP 1 +#define MLX5HWS_BWC_MATCHER_REHASH_PERCENT_TH 70 +#define MLX5HWS_BWC_MATCHER_REHASH_BURST_TH 32 +#define MLX5HWS_BWC_MATCHER_ATTACH_AT_NUM 255 + +#define MLX5HWS_BWC_MAX_ACTS 16 + +struct mlx5hws_bwc_matcher { + struct mlx5hws_matcher *matcher; + struct mlx5hws_match_template *mt; + struct mlx5hws_action_template *at[MLX5HWS_BWC_MATCHER_ATTACH_AT_NUM]; + u8 num_of_at; + u16 priority; + u8 size_log; + u32 num_of_rules; /* atomically accessed */ + struct list_head *rules; +}; + +struct mlx5hws_bwc_rule { + struct mlx5hws_bwc_matcher *bwc_matcher; + struct mlx5hws_rule *rule; + u16 bwc_queue_idx; + struct list_head list_node; +}; + +int +mlx5hws_bwc_matcher_create_simple(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_table *table, + u32 priority, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask, + enum mlx5hws_action_type action_types[]); + +int mlx5hws_bwc_matcher_destroy_simple(struct mlx5hws_bwc_matcher *bwc_matcher); + +struct mlx5hws_bwc_rule *mlx5hws_bwc_rule_alloc(struct mlx5hws_bwc_matcher *bwc_matcher); + +void mlx5hws_bwc_rule_free(struct mlx5hws_bwc_rule *bwc_rule); + +int mlx5hws_bwc_rule_create_simple(struct mlx5hws_bwc_rule *bwc_rule, + u32 *match_param, + struct mlx5hws_rule_action rule_actions[], + u32 flow_source, + u16 bwc_queue_idx); + +int mlx5hws_bwc_rule_destroy_simple(struct mlx5hws_bwc_rule *bwc_rule); + +void mlx5hws_bwc_rule_fill_attr(struct mlx5hws_bwc_matcher *bwc_matcher, + u16 bwc_queue_idx, + u32 flow_source, + struct mlx5hws_rule_attr *rule_attr); + +static inline u16 mlx5hws_bwc_queues(struct mlx5hws_context *ctx) +{ + /* Besides the control queue, half of the queues are + * reguler HWS queues, and the other half are BWC queues. + */ + return (ctx->queues - 1) / 2; +} + +static inline u16 mlx5hws_bwc_get_queue_id(struct mlx5hws_context *ctx, u16 idx) +{ + return idx + mlx5hws_bwc_queues(ctx); +} + +#endif /* MLX5HWS_BWC_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c new file mode 100644 index 0000000000000..bb563f50ef09a --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c @@ -0,0 +1,86 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" + +bool mlx5hws_bwc_match_params_is_complex(struct mlx5hws_context *ctx, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask) +{ + struct mlx5hws_definer match_layout = {0}; + struct mlx5hws_match_template *mt; + bool is_complex = false; + int ret; + + if (!match_criteria_enable) + return false; /* empty matcher */ + + mt = mlx5hws_match_template_create(ctx, + mask->match_buf, + mask->match_sz, + match_criteria_enable); + if (!mt) { + mlx5hws_err(ctx, "BWC: failed creating match template\n"); + return false; + } + + ret = mlx5hws_definer_calc_layout(ctx, mt, &match_layout); + if (ret) { + /* The only case that we're interested in is E2BIG, + * which means that the match parameters need to be + * split into complex martcher. + * For all other cases (good or bad) - just return true + * and let the usual match creation path handle it, + * both for good and bad flows. + */ + if (ret == E2BIG) { + is_complex = true; + mlx5hws_dbg(ctx, "Matcher definer layout: need complex matcher\n"); + } else { + mlx5hws_err(ctx, "Failed to calculate matcher definer layout\n"); + } + } + + mlx5hws_match_template_destroy(mt); + + return is_complex; +} + +int mlx5hws_bwc_matcher_create_complex(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_table *table, + u32 priority, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask) +{ + mlx5hws_err(table->ctx, "Complex matcher is not supported yet\n"); + return -EOPNOTSUPP; +} + +void +mlx5hws_bwc_matcher_destroy_complex(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + /* nothing to do here */ +} + +int mlx5hws_bwc_rule_create_complex(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_match_parameters *params, + u32 flow_source, + struct mlx5hws_rule_action rule_actions[], + u16 bwc_queue_idx) +{ + mlx5hws_err(bwc_rule->bwc_matcher->matcher->tbl->ctx, + "Complex rule is not supported yet\n"); + return -EOPNOTSUPP; +} + +int mlx5hws_bwc_rule_destroy_complex(struct mlx5hws_bwc_rule *bwc_rule) +{ + return 0; +} + +int mlx5hws_bwc_matcher_move_all_complex(struct mlx5hws_bwc_matcher *bwc_matcher) +{ + mlx5hws_err(bwc_matcher->matcher->tbl->ctx, + "Moving complex rule is not supported yet\n"); + return -EOPNOTSUPP; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h new file mode 100644 index 0000000000000..068ee81186098 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_BWC_COMPLEX_H_ +#define MLX5HWS_BWC_COMPLEX_H_ + +bool mlx5hws_bwc_match_params_is_complex(struct mlx5hws_context *ctx, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask); + +int mlx5hws_bwc_matcher_create_complex(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_table *table, + u32 priority, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask); + +void mlx5hws_bwc_matcher_destroy_complex(struct mlx5hws_bwc_matcher *bwc_matcher); + +int mlx5hws_bwc_matcher_move_all_complex(struct mlx5hws_bwc_matcher *bwc_matcher); + +int mlx5hws_bwc_rule_create_complex(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_match_parameters *params, + u32 flow_source, + struct mlx5hws_rule_action rule_actions[], + u16 bwc_queue_idx); + +int mlx5hws_bwc_rule_destroy_complex(struct mlx5hws_bwc_rule *bwc_rule); + +#endif /* MLX5HWS_BWC_COMPLEX_H_ */ From e485de551cea6fcd240921b234e74a7bf296390b Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:40:54 +0300 Subject: [PATCH 099/548] net/mlx5: HWS, added debug dump and internal headers Added debug dump of the existing HWS state, and all the required internal definitions. To dump the HWS state, cat the following debugfs node: cat /sys/kernel/debug/mlx5//steering/fdb/ctx_ Reviewed-by: Hamdan Agbariya Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit d4a605e968e7f0af58f228ab7bf96d9d7d4f0b69) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_debug.c | 480 ++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_debug.h | 40 ++ .../mlx5/core/steering/hws/mlx5hws_internal.h | 59 ++ .../mlx5/core/steering/hws/mlx5hws_prm.h | 514 ++++++++++++++++++ 4 files changed, 1093 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_prm.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c new file mode 100644 index 0000000000000..2b8c5a4e1c4c0 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c @@ -0,0 +1,480 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include +#include +#include +#include +#include "mlx5hws_internal.h" + +static int +hws_debug_dump_matcher_template_definer(struct seq_file *f, + void *parent_obj, + struct mlx5hws_definer *definer, + enum mlx5hws_debug_res_type type) +{ + int i; + + if (!definer) + return 0; + + seq_printf(f, "%d,0x%llx,0x%llx,%d,%d,", + type, + HWS_PTR_TO_ID(definer), + HWS_PTR_TO_ID(parent_obj), + definer->obj_id, + definer->type); + + for (i = 0; i < DW_SELECTORS; i++) + seq_printf(f, "0x%x%s", definer->dw_selector[i], + (i == DW_SELECTORS - 1) ? "," : "-"); + + for (i = 0; i < BYTE_SELECTORS; i++) + seq_printf(f, "0x%x%s", definer->byte_selector[i], + (i == BYTE_SELECTORS - 1) ? "," : "-"); + + for (i = 0; i < MLX5HWS_JUMBO_TAG_SZ; i++) + seq_printf(f, "%02x", definer->mask.jumbo[i]); + + seq_puts(f, "\n"); + + return 0; +} + +static int +hws_debug_dump_matcher_match_template(struct seq_file *f, struct mlx5hws_matcher *matcher) +{ + enum mlx5hws_debug_res_type type; + int i, ret; + + for (i = 0; i < matcher->num_of_mt; i++) { + struct mlx5hws_match_template *mt = &matcher->mt[i]; + + seq_printf(f, "%d,0x%llx,0x%llx,%d,%d,%d\n", + MLX5HWS_DEBUG_RES_TYPE_MATCHER_MATCH_TEMPLATE, + HWS_PTR_TO_ID(mt), + HWS_PTR_TO_ID(matcher), + mt->fc_sz, + 0, 0); + + type = MLX5HWS_DEBUG_RES_TYPE_MATCHER_TEMPLATE_MATCH_DEFINER; + ret = hws_debug_dump_matcher_template_definer(f, mt, mt->definer, type); + if (ret) + return ret; + } + + return 0; +} + +static int +hws_debug_dump_matcher_action_template(struct seq_file *f, struct mlx5hws_matcher *matcher) +{ + enum mlx5hws_action_type action_type; + int i, j; + + for (i = 0; i < matcher->num_of_at; i++) { + struct mlx5hws_action_template *at = &matcher->at[i]; + + seq_printf(f, "%d,0x%llx,0x%llx,%d,%d,%d", + MLX5HWS_DEBUG_RES_TYPE_MATCHER_ACTION_TEMPLATE, + HWS_PTR_TO_ID(at), + HWS_PTR_TO_ID(matcher), + at->only_term, + at->num_of_action_stes, + at->num_actions); + + for (j = 0; j < at->num_actions; j++) { + action_type = at->action_type_arr[j]; + seq_printf(f, ",%s", mlx5hws_action_type_to_str(action_type)); + } + + seq_puts(f, "\n"); + } + + return 0; +} + +static int +hws_debug_dump_matcher_attr(struct seq_file *f, struct mlx5hws_matcher *matcher) +{ + struct mlx5hws_matcher_attr *attr = &matcher->attr; + + seq_printf(f, "%d,0x%llx,%d,%d,%d,%d,%d,%d,%d,%d\n", + MLX5HWS_DEBUG_RES_TYPE_MATCHER_ATTR, + HWS_PTR_TO_ID(matcher), + attr->priority, + attr->mode, + attr->table.sz_row_log, + attr->table.sz_col_log, + attr->optimize_using_rule_idx, + attr->optimize_flow_src, + attr->insert_mode, + attr->distribute_mode); + + return 0; +} + +static int hws_debug_dump_matcher(struct seq_file *f, struct mlx5hws_matcher *matcher) +{ + enum mlx5hws_table_type tbl_type = matcher->tbl->type; + struct mlx5hws_cmd_ft_query_attr ft_attr = {0}; + struct mlx5hws_pool_chunk *ste; + struct mlx5hws_pool *ste_pool; + u64 icm_addr_0 = 0; + u64 icm_addr_1 = 0; + u32 ste_0_id = -1; + u32 ste_1_id = -1; + int ret; + + seq_printf(f, "%d,0x%llx,0x%llx,%d,%d,0x%llx", + MLX5HWS_DEBUG_RES_TYPE_MATCHER, + HWS_PTR_TO_ID(matcher), + HWS_PTR_TO_ID(matcher->tbl), + matcher->num_of_mt, + matcher->end_ft_id, + matcher->col_matcher ? HWS_PTR_TO_ID(matcher->col_matcher) : 0); + + ste = &matcher->match_ste.ste; + ste_pool = matcher->match_ste.pool; + if (ste_pool) { + ste_0_id = mlx5hws_pool_chunk_get_base_id(ste_pool, ste); + if (tbl_type == MLX5HWS_TABLE_TYPE_FDB) + ste_1_id = mlx5hws_pool_chunk_get_base_mirror_id(ste_pool, ste); + } + + seq_printf(f, ",%d,%d,%d,%d", + matcher->match_ste.rtc_0_id, + (int)ste_0_id, + matcher->match_ste.rtc_1_id, + (int)ste_1_id); + + ste = &matcher->action_ste[0].ste; + ste_pool = matcher->action_ste[0].pool; + if (ste_pool) { + ste_0_id = mlx5hws_pool_chunk_get_base_id(ste_pool, ste); + if (tbl_type == MLX5HWS_TABLE_TYPE_FDB) + ste_1_id = mlx5hws_pool_chunk_get_base_mirror_id(ste_pool, ste); + else + ste_1_id = -1; + } else { + ste_0_id = -1; + ste_1_id = -1; + } + + ft_attr.type = matcher->tbl->fw_ft_type; + ret = mlx5hws_cmd_flow_table_query(matcher->tbl->ctx->mdev, + matcher->end_ft_id, + &ft_attr, + &icm_addr_0, + &icm_addr_1); + if (ret) + return ret; + + seq_printf(f, ",%d,%d,%d,%d,%d,0x%llx,0x%llx\n", + matcher->action_ste[0].rtc_0_id, + (int)ste_0_id, + matcher->action_ste[0].rtc_1_id, + (int)ste_1_id, + 0, + mlx5hws_debug_icm_to_idx(icm_addr_0), + mlx5hws_debug_icm_to_idx(icm_addr_1)); + + ret = hws_debug_dump_matcher_attr(f, matcher); + if (ret) + return ret; + + ret = hws_debug_dump_matcher_match_template(f, matcher); + if (ret) + return ret; + + ret = hws_debug_dump_matcher_action_template(f, matcher); + if (ret) + return ret; + + return 0; +} + +static int hws_debug_dump_table(struct seq_file *f, struct mlx5hws_table *tbl) +{ + struct mlx5hws_cmd_ft_query_attr ft_attr = {0}; + struct mlx5hws_matcher *matcher; + u64 local_icm_addr_0 = 0; + u64 local_icm_addr_1 = 0; + u64 icm_addr_0 = 0; + u64 icm_addr_1 = 0; + int ret; + + seq_printf(f, "%d,0x%llx,0x%llx,%d,%d,%d,%d,%d", + MLX5HWS_DEBUG_RES_TYPE_TABLE, + HWS_PTR_TO_ID(tbl), + HWS_PTR_TO_ID(tbl->ctx), + tbl->ft_id, + MLX5HWS_TABLE_TYPE_BASE + tbl->type, + tbl->fw_ft_type, + tbl->level, + 0); + + ft_attr.type = tbl->fw_ft_type; + ret = mlx5hws_cmd_flow_table_query(tbl->ctx->mdev, + tbl->ft_id, + &ft_attr, + &icm_addr_0, + &icm_addr_1); + if (ret) + return ret; + + seq_printf(f, ",0x%llx,0x%llx,0x%llx,0x%llx,0x%llx\n", + mlx5hws_debug_icm_to_idx(icm_addr_0), + mlx5hws_debug_icm_to_idx(icm_addr_1), + mlx5hws_debug_icm_to_idx(local_icm_addr_0), + mlx5hws_debug_icm_to_idx(local_icm_addr_1), + HWS_PTR_TO_ID(tbl->default_miss.miss_tbl)); + + list_for_each_entry(matcher, &tbl->matchers_list, list_node) { + ret = hws_debug_dump_matcher(f, matcher); + if (ret) + return ret; + } + + return 0; +} + +static int +hws_debug_dump_context_send_engine(struct seq_file *f, struct mlx5hws_context *ctx) +{ + struct mlx5hws_send_engine *send_queue; + struct mlx5hws_send_ring *send_ring; + struct mlx5hws_send_ring_cq *cq; + struct mlx5hws_send_ring_sq *sq; + int i; + + for (i = 0; i < (int)ctx->queues; i++) { + send_queue = &ctx->send_queue[i]; + seq_printf(f, "%d,0x%llx,%d,%d,%d,%d,%d,%d,%d,%d,%d\n", + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_SEND_ENGINE, + HWS_PTR_TO_ID(ctx), + i, + send_queue->used_entries, + send_queue->num_entries, + 1, /* one send ring per queue */ + send_queue->num_entries, + send_queue->err, + send_queue->completed.ci, + send_queue->completed.pi, + send_queue->completed.mask); + + send_ring = &send_queue->send_ring; + cq = &send_ring->send_cq; + sq = &send_ring->send_sq; + + seq_printf(f, "%d,0x%llx,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d\n", + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_SEND_RING, + HWS_PTR_TO_ID(ctx), + 0, /* one send ring per send queue */ + i, + cq->mcq.cqn, + 0, + 0, + 0, + 0, + 0, + 0, + cq->mcq.cqe_sz, + sq->sqn, + 0, + 0, + 0); + } + + return 0; +} + +static int hws_debug_dump_context_caps(struct seq_file *f, struct mlx5hws_context *ctx) +{ + struct mlx5hws_cmd_query_caps *caps = ctx->caps; + + seq_printf(f, "%d,0x%llx,%s,%d,%d,%d,%d,", + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_CAPS, + HWS_PTR_TO_ID(ctx), + caps->fw_ver, + caps->wqe_based_update, + caps->ste_format, + caps->ste_alloc_log_max, + caps->log_header_modify_argument_max_alloc); + + seq_printf(f, "%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%s\n", + caps->flex_protocols, + caps->rtc_reparse_mode, + caps->rtc_index_mode, + caps->ste_alloc_log_gran, + caps->stc_alloc_log_max, + caps->stc_alloc_log_gran, + caps->rtc_log_depth_max, + caps->format_select_gtpu_dw_0, + caps->format_select_gtpu_dw_1, + caps->format_select_gtpu_dw_2, + caps->format_select_gtpu_ext_dw_0, + caps->nic_ft.max_level, + caps->nic_ft.reparse, + caps->fdb_ft.max_level, + caps->fdb_ft.reparse, + caps->log_header_modify_argument_granularity, + caps->linear_match_definer, + "regc_3"); + + return 0; +} + +static int hws_debug_dump_context_attr(struct seq_file *f, struct mlx5hws_context *ctx) +{ + seq_printf(f, "%u,0x%llx,%d,%zu,%d,%s,%d,%d\n", + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_ATTR, + HWS_PTR_TO_ID(ctx), + ctx->pd_num, + ctx->queues, + ctx->send_queue->num_entries, + "None", /* no shared gvmi */ + ctx->caps->vhca_id, + 0xffff); /* no shared gvmi */ + + return 0; +} + +static int hws_debug_dump_context_info(struct seq_file *f, struct mlx5hws_context *ctx) +{ + struct mlx5_core_dev *dev = ctx->mdev; + int ret; + + seq_printf(f, "%d,0x%llx,%d,%s,%s.KERNEL_%u_%u_%u\n", + MLX5HWS_DEBUG_RES_TYPE_CONTEXT, + HWS_PTR_TO_ID(ctx), + ctx->flags & MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT, + pci_name(dev->pdev), + HWS_DEBUG_FORMAT_VERSION, + LINUX_VERSION_MAJOR, + LINUX_VERSION_PATCHLEVEL, + LINUX_VERSION_SUBLEVEL); + + ret = hws_debug_dump_context_attr(f, ctx); + if (ret) + return ret; + + ret = hws_debug_dump_context_caps(f, ctx); + if (ret) + return ret; + + return 0; +} + +static int hws_debug_dump_context_stc_resource(struct seq_file *f, + struct mlx5hws_context *ctx, + u32 tbl_type, + struct mlx5hws_pool_resource *resource) +{ + seq_printf(f, "%d,0x%llx,%u,%u\n", + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_STC, + HWS_PTR_TO_ID(ctx), + tbl_type, + resource->base_id); + + return 0; +} + +static int hws_debug_dump_context_stc(struct seq_file *f, struct mlx5hws_context *ctx) +{ + struct mlx5hws_pool *stc_pool; + u32 table_type; + int ret; + int i; + + for (i = 0; i < MLX5HWS_TABLE_TYPE_MAX; i++) { + stc_pool = ctx->stc_pool[i]; + table_type = MLX5HWS_TABLE_TYPE_BASE + i; + + if (!stc_pool) + continue; + + if (stc_pool->resource[0]) { + ret = hws_debug_dump_context_stc_resource(f, ctx, table_type, + stc_pool->resource[0]); + if (ret) + return ret; + } + + if (i == MLX5HWS_TABLE_TYPE_FDB && stc_pool->mirror_resource[0]) { + ret = hws_debug_dump_context_stc_resource(f, ctx, table_type, + stc_pool->mirror_resource[0]); + if (ret) + return ret; + } + } + + return 0; +} + +static int hws_debug_dump_context(struct seq_file *f, struct mlx5hws_context *ctx) +{ + struct mlx5hws_table *tbl; + int ret; + + ret = hws_debug_dump_context_info(f, ctx); + if (ret) + return ret; + + ret = hws_debug_dump_context_send_engine(f, ctx); + if (ret) + return ret; + + ret = hws_debug_dump_context_stc(f, ctx); + if (ret) + return ret; + + list_for_each_entry(tbl, &ctx->tbl_list, tbl_list_node) { + ret = hws_debug_dump_table(f, tbl); + if (ret) + return ret; + } + + return 0; +} + +static int +hws_debug_dump(struct seq_file *f, struct mlx5hws_context *ctx) +{ + int ret; + + if (!f || !ctx) + return -EINVAL; + + mutex_lock(&ctx->ctrl_lock); + ret = hws_debug_dump_context(f, ctx); + mutex_unlock(&ctx->ctrl_lock); + + return ret; +} + +static int hws_dump_show(struct seq_file *file, void *priv) +{ + return hws_debug_dump(file, file->private); +} +DEFINE_SHOW_ATTRIBUTE(hws_dump); + +void mlx5hws_debug_init_dump(struct mlx5hws_context *ctx) +{ + struct mlx5_core_dev *dev = ctx->mdev; + char file_name[128]; + + ctx->debug_info.steering_debugfs = + debugfs_create_dir("steering", mlx5_debugfs_get_dev_root(dev)); + ctx->debug_info.fdb_debugfs = + debugfs_create_dir("fdb", ctx->debug_info.steering_debugfs); + + sprintf(file_name, "ctx_%p", ctx); + debugfs_create_file(file_name, 0444, ctx->debug_info.fdb_debugfs, + ctx, &hws_dump_fops); +} + +void mlx5hws_debug_uninit_dump(struct mlx5hws_context *ctx) +{ + debugfs_remove_recursive(ctx->debug_info.steering_debugfs); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h new file mode 100644 index 0000000000000..b93a536035d96 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_DEBUG_H_ +#define MLX5HWS_DEBUG_H_ + +#define HWS_DEBUG_FORMAT_VERSION "1.0" + +#define HWS_PTR_TO_ID(p) ((u64)(uintptr_t)(p) & 0xFFFFFFFFULL) + +enum mlx5hws_debug_res_type { + MLX5HWS_DEBUG_RES_TYPE_CONTEXT = 4000, + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_ATTR = 4001, + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_CAPS = 4002, + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_SEND_ENGINE = 4003, + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_SEND_RING = 4004, + MLX5HWS_DEBUG_RES_TYPE_CONTEXT_STC = 4005, + + MLX5HWS_DEBUG_RES_TYPE_TABLE = 4100, + + MLX5HWS_DEBUG_RES_TYPE_MATCHER = 4200, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_ATTR = 4201, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_MATCH_TEMPLATE = 4202, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_TEMPLATE_MATCH_DEFINER = 4203, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_ACTION_TEMPLATE = 4204, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_TEMPLATE_HASH_DEFINER = 4205, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_TEMPLATE_RANGE_DEFINER = 4206, + MLX5HWS_DEBUG_RES_TYPE_MATCHER_TEMPLATE_COMPARE_MATCH_DEFINER = 4207, +}; + +static inline u64 +mlx5hws_debug_icm_to_idx(u64 icm_addr) +{ + return (icm_addr >> 6) & 0xffffffff; +} + +void mlx5hws_debug_init_dump(struct mlx5hws_context *ctx); +void mlx5hws_debug_uninit_dump(struct mlx5hws_context *ctx); + +#endif /* MLX5HWS_DEBUG_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h new file mode 100644 index 0000000000000..458bc31be6ac4 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_INTERNAL_H_ +#define MLX5HWS_INTERNAL_H_ + +#include + +#include "../dr_types.h" + +#include "mlx5hws_prm.h" + +#include "mlx5hws.h" + +#include "mlx5hws_pool.h" +#include "mlx5hws_vport.h" +#include "mlx5hws_context.h" +#include "mlx5hws_table.h" +#include "mlx5hws_send.h" +#include "mlx5hws_rule.h" +#include "mlx5hws_cmd.h" +#include "mlx5hws_action.h" +#include "mlx5hws_definer.h" +#include "mlx5hws_matcher.h" +#include "mlx5hws_debug.h" +#include "mlx5hws_pat_arg.h" +#include "mlx5hws_bwc.h" +#include "mlx5hws_bwc_complex.h" + +#define W_SIZE 2 +#define DW_SIZE 4 +#define BITS_IN_BYTE 8 +#define BITS_IN_DW (BITS_IN_BYTE * DW_SIZE) + +#define IS_BIT_SET(_value, _bit) ((_value) & (1ULL << (_bit))) + +#define mlx5hws_err(ctx, arg...) mlx5_core_err((ctx)->mdev, ##arg) +#define mlx5hws_info(ctx, arg...) mlx5_core_info((ctx)->mdev, ##arg) +#define mlx5hws_dbg(ctx, arg...) mlx5_core_dbg((ctx)->mdev, ##arg) + +#define MLX5HWS_TABLE_TYPE_BASE 2 +#define MLX5HWS_ACTION_STE_IDX_ANY 0 + +static inline bool is_mem_zero(const u8 *mem, size_t size) +{ + if (unlikely(!size)) { + pr_warn("HWS: invalid buffer of size 0 in %s\n", __func__); + return true; + } + + return (*mem == 0) && memcmp(mem, mem + 1, size - 1) == 0; +} + +static inline unsigned long align(unsigned long val, unsigned long align) +{ + return (val + align - 1) & ~(align - 1); +} + +#endif /* MLX5HWS_INTERNAL_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_prm.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_prm.h new file mode 100644 index 0000000000000..de92cecbeb928 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_prm.h @@ -0,0 +1,514 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5_PRM_H_ +#define MLX5_PRM_H_ + +#define MLX5_MAX_ACTIONS_DATA_IN_HEADER_MODIFY 512 + +/* Action type of header modification. */ +enum { + MLX5_MODIFICATION_TYPE_SET = 0x1, + MLX5_MODIFICATION_TYPE_ADD = 0x2, + MLX5_MODIFICATION_TYPE_COPY = 0x3, + MLX5_MODIFICATION_TYPE_INSERT = 0x4, + MLX5_MODIFICATION_TYPE_REMOVE = 0x5, + MLX5_MODIFICATION_TYPE_NOP = 0x6, + MLX5_MODIFICATION_TYPE_REMOVE_WORDS = 0x7, + MLX5_MODIFICATION_TYPE_ADD_FIELD = 0x8, + MLX5_MODIFICATION_TYPE_MAX, +}; + +/* The field of packet to be modified. */ +enum mlx5_modification_field { + MLX5_MODI_OUT_NONE = -1, + MLX5_MODI_OUT_SMAC_47_16 = 1, + MLX5_MODI_OUT_SMAC_15_0, + MLX5_MODI_OUT_ETHERTYPE, + MLX5_MODI_OUT_DMAC_47_16, + MLX5_MODI_OUT_DMAC_15_0, + MLX5_MODI_OUT_IP_DSCP, + MLX5_MODI_OUT_TCP_FLAGS, + MLX5_MODI_OUT_TCP_SPORT, + MLX5_MODI_OUT_TCP_DPORT, + MLX5_MODI_OUT_IPV4_TTL, + MLX5_MODI_OUT_UDP_SPORT, + MLX5_MODI_OUT_UDP_DPORT, + MLX5_MODI_OUT_SIPV6_127_96, + MLX5_MODI_OUT_SIPV6_95_64, + MLX5_MODI_OUT_SIPV6_63_32, + MLX5_MODI_OUT_SIPV6_31_0, + MLX5_MODI_OUT_DIPV6_127_96, + MLX5_MODI_OUT_DIPV6_95_64, + MLX5_MODI_OUT_DIPV6_63_32, + MLX5_MODI_OUT_DIPV6_31_0, + MLX5_MODI_OUT_SIPV4, + MLX5_MODI_OUT_DIPV4, + MLX5_MODI_OUT_FIRST_VID, + MLX5_MODI_IN_SMAC_47_16 = 0x31, + MLX5_MODI_IN_SMAC_15_0, + MLX5_MODI_IN_ETHERTYPE, + MLX5_MODI_IN_DMAC_47_16, + MLX5_MODI_IN_DMAC_15_0, + MLX5_MODI_IN_IP_DSCP, + MLX5_MODI_IN_TCP_FLAGS, + MLX5_MODI_IN_TCP_SPORT, + MLX5_MODI_IN_TCP_DPORT, + MLX5_MODI_IN_IPV4_TTL, + MLX5_MODI_IN_UDP_SPORT, + MLX5_MODI_IN_UDP_DPORT, + MLX5_MODI_IN_SIPV6_127_96, + MLX5_MODI_IN_SIPV6_95_64, + MLX5_MODI_IN_SIPV6_63_32, + MLX5_MODI_IN_SIPV6_31_0, + MLX5_MODI_IN_DIPV6_127_96, + MLX5_MODI_IN_DIPV6_95_64, + MLX5_MODI_IN_DIPV6_63_32, + MLX5_MODI_IN_DIPV6_31_0, + MLX5_MODI_IN_SIPV4, + MLX5_MODI_IN_DIPV4, + MLX5_MODI_OUT_IPV6_HOPLIMIT, + MLX5_MODI_IN_IPV6_HOPLIMIT, + MLX5_MODI_META_DATA_REG_A, + MLX5_MODI_META_DATA_REG_B = 0x50, + MLX5_MODI_META_REG_C_0, + MLX5_MODI_META_REG_C_1, + MLX5_MODI_META_REG_C_2, + MLX5_MODI_META_REG_C_3, + MLX5_MODI_META_REG_C_4, + MLX5_MODI_META_REG_C_5, + MLX5_MODI_META_REG_C_6, + MLX5_MODI_META_REG_C_7, + MLX5_MODI_OUT_TCP_SEQ_NUM, + MLX5_MODI_IN_TCP_SEQ_NUM, + MLX5_MODI_OUT_TCP_ACK_NUM, + MLX5_MODI_IN_TCP_ACK_NUM = 0x5C, + MLX5_MODI_GTP_TEID = 0x6E, + MLX5_MODI_OUT_IP_ECN = 0x73, + MLX5_MODI_TUNNEL_HDR_DW_1 = 0x75, + MLX5_MODI_GTPU_FIRST_EXT_DW_0 = 0x76, + MLX5_MODI_HASH_RESULT = 0x81, + MLX5_MODI_IN_MPLS_LABEL_0 = 0x8a, + MLX5_MODI_IN_MPLS_LABEL_1, + MLX5_MODI_IN_MPLS_LABEL_2, + MLX5_MODI_IN_MPLS_LABEL_3, + MLX5_MODI_IN_MPLS_LABEL_4, + MLX5_MODI_OUT_IP_PROTOCOL = 0x4A, + MLX5_MODI_OUT_IPV6_NEXT_HDR = 0x4A, + MLX5_MODI_META_REG_C_8 = 0x8F, + MLX5_MODI_META_REG_C_9 = 0x90, + MLX5_MODI_META_REG_C_10 = 0x91, + MLX5_MODI_META_REG_C_11 = 0x92, + MLX5_MODI_META_REG_C_12 = 0x93, + MLX5_MODI_META_REG_C_13 = 0x94, + MLX5_MODI_META_REG_C_14 = 0x95, + MLX5_MODI_META_REG_C_15 = 0x96, + MLX5_MODI_OUT_IPV4_TOTAL_LEN = 0x11D, + MLX5_MODI_OUT_IPV6_PAYLOAD_LEN = 0x11E, + MLX5_MODI_OUT_IPV4_IHL = 0x11F, + MLX5_MODI_OUT_TCP_DATA_OFFSET = 0x120, + MLX5_MODI_OUT_ESP_SPI = 0x5E, + MLX5_MODI_OUT_ESP_SEQ_NUM = 0x82, + MLX5_MODI_OUT_IPSEC_NEXT_HDR = 0x126, + MLX5_MODI_INVALID = INT_MAX, +}; + +enum { + MLX5_GET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1, + MLX5_GET_HCA_CAP_OP_MOD_ESW_FLOW_TABLE = 0x8 << 1, + MLX5_SET_HCA_CAP_OP_MOD_ESW = 0x9 << 1, + MLX5_GET_HCA_CAP_OP_MOD_WQE_BASED_FLOW_TABLE = 0x1B << 1, + MLX5_GET_HCA_CAP_OP_MOD_GENERAL_DEVICE_2 = 0x20 << 1, +}; + +enum mlx5_ifc_rtc_update_mode { + MLX5_IFC_RTC_STE_UPDATE_MODE_BY_HASH = 0x0, + MLX5_IFC_RTC_STE_UPDATE_MODE_BY_OFFSET = 0x1, +}; + +enum mlx5_ifc_rtc_access_mode { + MLX5_IFC_RTC_STE_ACCESS_MODE_BY_HASH = 0x0, + MLX5_IFC_RTC_STE_ACCESS_MODE_LINEAR = 0x1, +}; + +enum mlx5_ifc_rtc_ste_format { + MLX5_IFC_RTC_STE_FORMAT_8DW = 0x4, + MLX5_IFC_RTC_STE_FORMAT_11DW = 0x5, + MLX5_IFC_RTC_STE_FORMAT_RANGE = 0x7, +}; + +enum mlx5_ifc_rtc_reparse_mode { + MLX5_IFC_RTC_REPARSE_NEVER = 0x0, + MLX5_IFC_RTC_REPARSE_ALWAYS = 0x1, + MLX5_IFC_RTC_REPARSE_BY_STC = 0x2, +}; + +#define MLX5_IFC_RTC_LINEAR_LOOKUP_TBL_LOG_MAX 16 + +struct mlx5_ifc_rtc_bits { + u8 modify_field_select[0x40]; + u8 reserved_at_40[0x40]; + u8 update_index_mode[0x2]; + u8 reparse_mode[0x2]; + u8 num_match_ste[0x4]; + u8 pd[0x18]; + u8 reserved_at_a0[0x9]; + u8 access_index_mode[0x3]; + u8 num_hash_definer[0x4]; + u8 update_method[0x1]; + u8 reserved_at_b1[0x2]; + u8 log_depth[0x5]; + u8 log_hash_size[0x8]; + u8 ste_format_0[0x8]; + u8 table_type[0x8]; + u8 ste_format_1[0x8]; + u8 reserved_at_d8[0x8]; + u8 match_definer_0[0x20]; + u8 stc_id[0x20]; + u8 ste_table_base_id[0x20]; + u8 ste_table_offset[0x20]; + u8 reserved_at_160[0x8]; + u8 miss_flow_table_id[0x18]; + u8 match_definer_1[0x20]; + u8 reserved_at_1a0[0x260]; +}; + +enum mlx5_ifc_stc_action_type { + MLX5_IFC_STC_ACTION_TYPE_NOP = 0x00, + MLX5_IFC_STC_ACTION_TYPE_COPY = 0x05, + MLX5_IFC_STC_ACTION_TYPE_SET = 0x06, + MLX5_IFC_STC_ACTION_TYPE_ADD = 0x07, + MLX5_IFC_STC_ACTION_TYPE_REMOVE_WORDS = 0x08, + MLX5_IFC_STC_ACTION_TYPE_HEADER_REMOVE = 0x09, + MLX5_IFC_STC_ACTION_TYPE_HEADER_INSERT = 0x0b, + MLX5_IFC_STC_ACTION_TYPE_TAG = 0x0c, + MLX5_IFC_STC_ACTION_TYPE_ACC_MODIFY_LIST = 0x0e, + MLX5_IFC_STC_ACTION_TYPE_CRYPTO_IPSEC_ENCRYPTION = 0x10, + MLX5_IFC_STC_ACTION_TYPE_CRYPTO_IPSEC_DECRYPTION = 0x11, + MLX5_IFC_STC_ACTION_TYPE_ASO = 0x12, + MLX5_IFC_STC_ACTION_TYPE_TRAILER = 0x13, + MLX5_IFC_STC_ACTION_TYPE_COUNTER = 0x14, + MLX5_IFC_STC_ACTION_TYPE_ADD_FIELD = 0x1b, + MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_STE_TABLE = 0x80, + MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_TIR = 0x81, + MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_FT = 0x82, + MLX5_IFC_STC_ACTION_TYPE_DROP = 0x83, + MLX5_IFC_STC_ACTION_TYPE_ALLOW = 0x84, + MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_VPORT = 0x85, + MLX5_IFC_STC_ACTION_TYPE_JUMP_TO_UPLINK = 0x86, +}; + +enum mlx5_ifc_stc_reparse_mode { + MLX5_IFC_STC_REPARSE_IGNORE = 0x0, + MLX5_IFC_STC_REPARSE_NEVER = 0x1, + MLX5_IFC_STC_REPARSE_ALWAYS = 0x2, +}; + +struct mlx5_ifc_stc_ste_param_ste_table_bits { + u8 ste_obj_id[0x20]; + u8 match_definer_id[0x20]; + u8 reserved_at_40[0x3]; + u8 log_hash_size[0x5]; + u8 reserved_at_48[0x38]; +}; + +struct mlx5_ifc_stc_ste_param_tir_bits { + u8 reserved_at_0[0x8]; + u8 tirn[0x18]; + u8 reserved_at_20[0x60]; +}; + +struct mlx5_ifc_stc_ste_param_table_bits { + u8 reserved_at_0[0x8]; + u8 table_id[0x18]; + u8 reserved_at_20[0x60]; +}; + +struct mlx5_ifc_stc_ste_param_flow_counter_bits { + u8 flow_counter_id[0x20]; +}; + +enum { + MLX5_ASO_CT_NUM_PER_OBJ = 1, + MLX5_ASO_METER_NUM_PER_OBJ = 2, + MLX5_ASO_IPSEC_NUM_PER_OBJ = 1, + MLX5_ASO_FIRST_HIT_NUM_PER_OBJ = 512, +}; + +struct mlx5_ifc_stc_ste_param_execute_aso_bits { + u8 aso_object_id[0x20]; + u8 return_reg_id[0x4]; + u8 aso_type[0x4]; + u8 reserved_at_28[0x18]; +}; + +struct mlx5_ifc_stc_ste_param_ipsec_encrypt_bits { + u8 ipsec_object_id[0x20]; +}; + +struct mlx5_ifc_stc_ste_param_ipsec_decrypt_bits { + u8 ipsec_object_id[0x20]; +}; + +struct mlx5_ifc_stc_ste_param_trailer_bits { + u8 reserved_at_0[0x8]; + u8 command[0x4]; + u8 reserved_at_c[0x2]; + u8 type[0x2]; + u8 reserved_at_10[0xa]; + u8 length[0x6]; +}; + +struct mlx5_ifc_stc_ste_param_header_modify_list_bits { + u8 header_modify_pattern_id[0x20]; + u8 header_modify_argument_id[0x20]; +}; + +enum mlx5_ifc_header_anchors { + MLX5_HEADER_ANCHOR_PACKET_START = 0x0, + MLX5_HEADER_ANCHOR_MAC = 0x1, + MLX5_HEADER_ANCHOR_FIRST_VLAN_START = 0x2, + MLX5_HEADER_ANCHOR_IPV6_IPV4 = 0x07, + MLX5_HEADER_ANCHOR_ESP = 0x08, + MLX5_HEADER_ANCHOR_TCP_UDP = 0x09, + MLX5_HEADER_ANCHOR_TUNNEL_HEADER = 0x0a, + MLX5_HEADER_ANCHOR_INNER_MAC = 0x13, + MLX5_HEADER_ANCHOR_INNER_IPV6_IPV4 = 0x19, + MLX5_HEADER_ANCHOR_INNER_TCP_UDP = 0x1a, + MLX5_HEADER_ANCHOR_L4_PAYLOAD = 0x1b, + MLX5_HEADER_ANCHOR_INNER_L4_PAYLOAD = 0x1c +}; + +struct mlx5_ifc_stc_ste_param_remove_bits { + u8 action_type[0x4]; + u8 decap[0x1]; + u8 reserved_at_5[0x5]; + u8 remove_start_anchor[0x6]; + u8 reserved_at_10[0x2]; + u8 remove_end_anchor[0x6]; + u8 reserved_at_18[0x8]; +}; + +struct mlx5_ifc_stc_ste_param_remove_words_bits { + u8 action_type[0x4]; + u8 reserved_at_4[0x6]; + u8 remove_start_anchor[0x6]; + u8 reserved_at_10[0x1]; + u8 remove_offset[0x7]; + u8 reserved_at_18[0x2]; + u8 remove_size[0x6]; +}; + +struct mlx5_ifc_stc_ste_param_insert_bits { + u8 action_type[0x4]; + u8 encap[0x1]; + u8 inline_data[0x1]; + u8 reserved_at_6[0x4]; + u8 insert_anchor[0x6]; + u8 reserved_at_10[0x1]; + u8 insert_offset[0x7]; + u8 reserved_at_18[0x1]; + u8 insert_size[0x7]; + u8 insert_argument[0x20]; +}; + +struct mlx5_ifc_stc_ste_param_vport_bits { + u8 eswitch_owner_vhca_id[0x10]; + u8 vport_number[0x10]; + u8 eswitch_owner_vhca_id_valid[0x1]; + u8 reserved_at_21[0x5f]; +}; + +union mlx5_ifc_stc_param_bits { + struct mlx5_ifc_stc_ste_param_ste_table_bits ste_table; + struct mlx5_ifc_stc_ste_param_tir_bits tir; + struct mlx5_ifc_stc_ste_param_table_bits table; + struct mlx5_ifc_stc_ste_param_flow_counter_bits counter; + struct mlx5_ifc_stc_ste_param_header_modify_list_bits modify_header; + struct mlx5_ifc_stc_ste_param_execute_aso_bits aso; + struct mlx5_ifc_stc_ste_param_remove_bits remove_header; + struct mlx5_ifc_stc_ste_param_insert_bits insert_header; + struct mlx5_ifc_set_action_in_bits add; + struct mlx5_ifc_set_action_in_bits set; + struct mlx5_ifc_copy_action_in_bits copy; + struct mlx5_ifc_stc_ste_param_vport_bits vport; + struct mlx5_ifc_stc_ste_param_ipsec_encrypt_bits ipsec_encrypt; + struct mlx5_ifc_stc_ste_param_ipsec_decrypt_bits ipsec_decrypt; + struct mlx5_ifc_stc_ste_param_trailer_bits trailer; + u8 reserved_at_0[0x80]; +}; + +enum { + MLX5_IFC_MODIFY_STC_FIELD_SELECT_NEW_STC = BIT(0), +}; + +struct mlx5_ifc_stc_bits { + u8 modify_field_select[0x40]; + u8 reserved_at_40[0x46]; + u8 reparse_mode[0x2]; + u8 table_type[0x8]; + u8 ste_action_offset[0x8]; + u8 action_type[0x8]; + u8 reserved_at_a0[0x60]; + union mlx5_ifc_stc_param_bits stc_param; + u8 reserved_at_180[0x280]; +}; + +struct mlx5_ifc_ste_bits { + u8 modify_field_select[0x40]; + u8 reserved_at_40[0x48]; + u8 table_type[0x8]; + u8 reserved_at_90[0x370]; +}; + +struct mlx5_ifc_definer_bits { + u8 modify_field_select[0x40]; + u8 reserved_at_40[0x50]; + u8 format_id[0x10]; + u8 reserved_at_60[0x60]; + u8 format_select_dw3[0x8]; + u8 format_select_dw2[0x8]; + u8 format_select_dw1[0x8]; + u8 format_select_dw0[0x8]; + u8 format_select_dw7[0x8]; + u8 format_select_dw6[0x8]; + u8 format_select_dw5[0x8]; + u8 format_select_dw4[0x8]; + u8 reserved_at_100[0x18]; + u8 format_select_dw8[0x8]; + u8 reserved_at_120[0x20]; + u8 format_select_byte3[0x8]; + u8 format_select_byte2[0x8]; + u8 format_select_byte1[0x8]; + u8 format_select_byte0[0x8]; + u8 format_select_byte7[0x8]; + u8 format_select_byte6[0x8]; + u8 format_select_byte5[0x8]; + u8 format_select_byte4[0x8]; + u8 reserved_at_180[0x40]; + u8 ctrl[0xa0]; + u8 match_mask[0x160]; +}; + +struct mlx5_ifc_arg_bits { + u8 rsvd0[0x88]; + u8 access_pd[0x18]; +}; + +struct mlx5_ifc_header_modify_pattern_in_bits { + u8 modify_field_select[0x40]; + + u8 reserved_at_40[0x40]; + + u8 pattern_length[0x8]; + u8 reserved_at_88[0x18]; + + u8 reserved_at_a0[0x60]; + + u8 pattern_data[MLX5_MAX_ACTIONS_DATA_IN_HEADER_MODIFY * 8]; +}; + +struct mlx5_ifc_create_rtc_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_rtc_bits rtc; +}; + +struct mlx5_ifc_create_stc_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_stc_bits stc; +}; + +struct mlx5_ifc_create_ste_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_ste_bits ste; +}; + +struct mlx5_ifc_create_definer_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_definer_bits definer; +}; + +struct mlx5_ifc_create_arg_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_arg_bits arg; +}; + +struct mlx5_ifc_create_header_modify_pattern_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; + struct mlx5_ifc_header_modify_pattern_in_bits pattern; +}; + +struct mlx5_ifc_generate_wqe_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + u8 reserved_at_20[0x10]; + u8 op_mode[0x10]; + u8 reserved_at_40[0x40]; + u8 reserved_at_80[0x8]; + u8 pdn[0x18]; + u8 reserved_at_a0[0x160]; + u8 wqe_ctrl[0x80]; + u8 wqe_gta_ctrl[0x180]; + u8 wqe_gta_data_0[0x200]; + u8 wqe_gta_data_1[0x200]; +}; + +struct mlx5_ifc_generate_wqe_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + u8 syndrome[0x20]; + u8 reserved_at_40[0x1c0]; + u8 cqe_data[0x200]; +}; + +enum mlx5_access_aso_opc_mod { + ASO_OPC_MOD_IPSEC = 0x0, + ASO_OPC_MOD_CONNECTION_TRACKING = 0x1, + ASO_OPC_MOD_POLICER = 0x2, + ASO_OPC_MOD_RACE_AVOIDANCE = 0x3, + ASO_OPC_MOD_FLOW_HIT = 0x4, +}; + +enum { + MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION = BIT(0), + MLX5_IFC_MODIFY_FLOW_TABLE_RTC_ID = BIT(1), +}; + +enum { + MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION_DEFAULT = 0, + MLX5_IFC_MODIFY_FLOW_TABLE_MISS_ACTION_GOTO_TBL = 1, +}; + +struct mlx5_ifc_alloc_packet_reformat_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 packet_reformat_id[0x20]; + + u8 reserved_at_60[0x20]; +}; + +struct mlx5_ifc_dealloc_packet_reformat_in_bits { + u8 opcode[0x10]; + u8 reserved_at_10[0x10]; + + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + + u8 packet_reformat_id[0x20]; + + u8 reserved_at_60[0x20]; +}; + +struct mlx5_ifc_dealloc_packet_reformat_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 reserved_at_40[0x40]; +}; + +#endif /* MLX5_PRM_H_ */ From caab8e7674865840399f89fb458c904bdf0a80f4 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:35:50 +0300 Subject: [PATCH 100/548] net/mlx5: HWS, added send engine and context handling Added implementation of send engine and handling of HWS context. Reviewed-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 2ca62599aa0bbc0e61595531614e0989ba6b3194) Signed-off-by: Koba Ko --- .../mlx5/core/steering/hws/mlx5hws_context.c | 260 ++++ .../mlx5/core/steering/hws/mlx5hws_context.h | 64 + .../mlx5/core/steering/hws/mlx5hws_send.c | 1209 +++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_send.h | 270 ++++ 4 files changed, 1803 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c new file mode 100644 index 0000000000000..00e4fdf4a558b --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA CORPORATION. All rights reserved. */ + +#include "mlx5hws_internal.h" + +bool mlx5hws_context_cap_dynamic_reparse(struct mlx5hws_context *ctx) +{ + return IS_BIT_SET(ctx->caps->rtc_reparse_mode, MLX5_IFC_RTC_REPARSE_BY_STC); +} + +u8 mlx5hws_context_get_reparse_mode(struct mlx5hws_context *ctx) +{ + /* Prefer to use dynamic reparse, reparse only specific actions */ + if (mlx5hws_context_cap_dynamic_reparse(ctx)) + return MLX5_IFC_RTC_REPARSE_NEVER; + + /* Otherwise use less efficient static */ + return MLX5_IFC_RTC_REPARSE_ALWAYS; +} + +static int hws_context_pools_init(struct mlx5hws_context *ctx) +{ + struct mlx5hws_pool_attr pool_attr = {0}; + u8 max_log_sz; + int ret; + int i; + + ret = mlx5hws_pat_init_pattern_cache(&ctx->pattern_cache); + if (ret) + return ret; + + ret = mlx5hws_definer_init_cache(&ctx->definer_cache); + if (ret) + goto uninit_pat_cache; + + /* Create an STC pool per FT type */ + pool_attr.pool_type = MLX5HWS_POOL_TYPE_STC; + pool_attr.flags = MLX5HWS_POOL_FLAGS_FOR_STC_POOL; + max_log_sz = min(MLX5HWS_POOL_STC_LOG_SZ, ctx->caps->stc_alloc_log_max); + pool_attr.alloc_log_sz = max(max_log_sz, ctx->caps->stc_alloc_log_gran); + + for (i = 0; i < MLX5HWS_TABLE_TYPE_MAX; i++) { + pool_attr.table_type = i; + ctx->stc_pool[i] = mlx5hws_pool_create(ctx, &pool_attr); + if (!ctx->stc_pool[i]) { + mlx5hws_err(ctx, "Failed to allocate STC pool [%d]", i); + ret = -ENOMEM; + goto free_stc_pools; + } + } + + return 0; + +free_stc_pools: + for (i = 0; i < MLX5HWS_TABLE_TYPE_MAX; i++) + if (ctx->stc_pool[i]) + mlx5hws_pool_destroy(ctx->stc_pool[i]); + + mlx5hws_definer_uninit_cache(ctx->definer_cache); +uninit_pat_cache: + mlx5hws_pat_uninit_pattern_cache(ctx->pattern_cache); + return ret; +} + +static void hws_context_pools_uninit(struct mlx5hws_context *ctx) +{ + int i; + + for (i = 0; i < MLX5HWS_TABLE_TYPE_MAX; i++) { + if (ctx->stc_pool[i]) + mlx5hws_pool_destroy(ctx->stc_pool[i]); + } + + mlx5hws_definer_uninit_cache(ctx->definer_cache); + mlx5hws_pat_uninit_pattern_cache(ctx->pattern_cache); +} + +static int hws_context_init_pd(struct mlx5hws_context *ctx) +{ + int ret = 0; + + ret = mlx5_core_alloc_pd(ctx->mdev, &ctx->pd_num); + if (ret) { + mlx5hws_err(ctx, "Failed to allocate PD\n"); + return ret; + } + + ctx->flags |= MLX5HWS_CONTEXT_FLAG_PRIVATE_PD; + + return 0; +} + +static int hws_context_uninit_pd(struct mlx5hws_context *ctx) +{ + if (ctx->flags & MLX5HWS_CONTEXT_FLAG_PRIVATE_PD) + mlx5_core_dealloc_pd(ctx->mdev, ctx->pd_num); + + return 0; +} + +static void hws_context_check_hws_supp(struct mlx5hws_context *ctx) +{ + struct mlx5hws_cmd_query_caps *caps = ctx->caps; + + /* HWS not supported on device / FW */ + if (!caps->wqe_based_update) { + mlx5hws_err(ctx, "Required HWS WQE based insertion cap not supported\n"); + return; + } + + if (!caps->eswitch_manager) { + mlx5hws_err(ctx, "HWS is not supported for non eswitch manager port\n"); + return; + } + + /* Current solution requires all rules to set reparse bit */ + if ((!caps->nic_ft.reparse || + (!caps->fdb_ft.reparse && caps->eswitch_manager)) || + !IS_BIT_SET(caps->rtc_reparse_mode, MLX5_IFC_RTC_REPARSE_ALWAYS)) { + mlx5hws_err(ctx, "Required HWS reparse cap not supported\n"); + return; + } + + /* FW/HW must support 8DW STE */ + if (!IS_BIT_SET(caps->ste_format, MLX5_IFC_RTC_STE_FORMAT_8DW)) { + mlx5hws_err(ctx, "Required HWS STE format not supported\n"); + return; + } + + /* Adding rules by hash and by offset are requirements */ + if (!IS_BIT_SET(caps->rtc_index_mode, MLX5_IFC_RTC_STE_UPDATE_MODE_BY_HASH) || + !IS_BIT_SET(caps->rtc_index_mode, MLX5_IFC_RTC_STE_UPDATE_MODE_BY_OFFSET)) { + mlx5hws_err(ctx, "Required HWS RTC update mode not supported\n"); + return; + } + + /* Support for SELECT definer ID is required */ + if (!IS_BIT_SET(caps->definer_format_sup, MLX5_IFC_DEFINER_FORMAT_ID_SELECT)) { + mlx5hws_err(ctx, "Required HWS Dynamic definer not supported\n"); + return; + } + + ctx->flags |= MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT; +} + +static int hws_context_init_hws(struct mlx5hws_context *ctx, + struct mlx5hws_context_attr *attr) +{ + int ret; + + hws_context_check_hws_supp(ctx); + + if (!(ctx->flags & MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT)) + return 0; + + ret = hws_context_init_pd(ctx); + if (ret) + return ret; + + ret = hws_context_pools_init(ctx); + if (ret) + goto uninit_pd; + + if (attr->bwc) + ctx->flags |= MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT; + + ret = mlx5hws_send_queues_open(ctx, attr->queues, attr->queue_size); + if (ret) + goto pools_uninit; + + INIT_LIST_HEAD(&ctx->tbl_list); + + return 0; + +pools_uninit: + hws_context_pools_uninit(ctx); +uninit_pd: + hws_context_uninit_pd(ctx); + return ret; +} + +static void hws_context_uninit_hws(struct mlx5hws_context *ctx) +{ + if (!(ctx->flags & MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT)) + return; + + mlx5hws_send_queues_close(ctx); + hws_context_pools_uninit(ctx); + hws_context_uninit_pd(ctx); +} + +struct mlx5hws_context *mlx5hws_context_open(struct mlx5_core_dev *mdev, + struct mlx5hws_context_attr *attr) +{ + struct mlx5hws_context *ctx; + int ret; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return NULL; + + ctx->mdev = mdev; + + mutex_init(&ctx->ctrl_lock); + xa_init(&ctx->peer_ctx_xa); + + ctx->caps = kzalloc(sizeof(*ctx->caps), GFP_KERNEL); + if (!ctx->caps) + goto free_ctx; + + ret = mlx5hws_cmd_query_caps(mdev, ctx->caps); + if (ret) + goto free_caps; + + ret = mlx5hws_vport_init_vports(ctx); + if (ret) + goto free_caps; + + ret = hws_context_init_hws(ctx, attr); + if (ret) + goto uninit_vports; + + mlx5hws_debug_init_dump(ctx); + + return ctx; + +uninit_vports: + mlx5hws_vport_uninit_vports(ctx); +free_caps: + kfree(ctx->caps); +free_ctx: + xa_destroy(&ctx->peer_ctx_xa); + mutex_destroy(&ctx->ctrl_lock); + kfree(ctx); + return NULL; +} + +int mlx5hws_context_close(struct mlx5hws_context *ctx) +{ + mlx5hws_debug_uninit_dump(ctx); + hws_context_uninit_hws(ctx); + mlx5hws_vport_uninit_vports(ctx); + kfree(ctx->caps); + xa_destroy(&ctx->peer_ctx_xa); + mutex_destroy(&ctx->ctrl_lock); + kfree(ctx); + return 0; +} + +void mlx5hws_context_set_peer(struct mlx5hws_context *ctx, + struct mlx5hws_context *peer_ctx, + u16 peer_vhca_id) +{ + mutex_lock(&ctx->ctrl_lock); + + if (xa_err(xa_store(&ctx->peer_ctx_xa, peer_vhca_id, peer_ctx, GFP_KERNEL))) + pr_warn("HWS: failed storing peer vhca ID in peer xarray\n"); + + mutex_unlock(&ctx->ctrl_lock); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h new file mode 100644 index 0000000000000..e5a7ce6043340 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_CONTEXT_H_ +#define MLX5HWS_CONTEXT_H_ + +enum mlx5hws_context_flags { + MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT = 1 << 0, + MLX5HWS_CONTEXT_FLAG_PRIVATE_PD = 1 << 1, + MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT = 1 << 2, +}; + +enum mlx5hws_context_shared_stc_type { + MLX5HWS_CONTEXT_SHARED_STC_DECAP_L3 = 0, + MLX5HWS_CONTEXT_SHARED_STC_DOUBLE_POP = 1, + MLX5HWS_CONTEXT_SHARED_STC_MAX = 2, +}; + +struct mlx5hws_context_common_res { + struct mlx5hws_action_default_stc *default_stc; + struct mlx5hws_action_shared_stc *shared_stc[MLX5HWS_CONTEXT_SHARED_STC_MAX]; + struct mlx5hws_cmd_forward_tbl *default_miss; +}; + +struct mlx5hws_context_debug_info { + struct dentry *steering_debugfs; + struct dentry *fdb_debugfs; +}; + +struct mlx5hws_context_vports { + u16 esw_manager_gvmi; + u16 uplink_gvmi; + struct xarray vport_gvmi_xa; +}; + +struct mlx5hws_context { + struct mlx5_core_dev *mdev; + struct mlx5hws_cmd_query_caps *caps; + u32 pd_num; + struct mlx5hws_pool *stc_pool[MLX5HWS_TABLE_TYPE_MAX]; + struct mlx5hws_context_common_res common_res[MLX5HWS_TABLE_TYPE_MAX]; + struct mlx5hws_pattern_cache *pattern_cache; + struct mlx5hws_definer_cache *definer_cache; + struct mutex ctrl_lock; /* control lock to protect the whole context */ + enum mlx5hws_context_flags flags; + struct mlx5hws_send_engine *send_queue; + size_t queues; + struct mutex *bwc_send_queue_locks; /* protect BWC queues */ + struct list_head tbl_list; + struct mlx5hws_context_debug_info debug_info; + struct xarray peer_ctx_xa; + struct mlx5hws_context_vports vports; +}; + +static inline bool mlx5hws_context_bwc_supported(struct mlx5hws_context *ctx) +{ + return ctx->flags & MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT; +} + +bool mlx5hws_context_cap_dynamic_reparse(struct mlx5hws_context *ctx); + +u8 mlx5hws_context_get_reparse_mode(struct mlx5hws_context *ctx); + +#endif /* MLX5HWS_CONTEXT_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c new file mode 100644 index 0000000000000..fb97a15c041a1 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c @@ -0,0 +1,1209 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#include "mlx5hws_internal.h" +#include "lib/clock.h" + +enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 }; + +struct mlx5hws_send_ring_dep_wqe * +mlx5hws_send_add_new_dep_wqe(struct mlx5hws_send_engine *queue) +{ + struct mlx5hws_send_ring_sq *send_sq = &queue->send_ring.send_sq; + unsigned int idx = send_sq->head_dep_idx++ & (queue->num_entries - 1); + + memset(&send_sq->dep_wqe[idx].wqe_data.tag, 0, MLX5HWS_MATCH_TAG_SZ); + + return &send_sq->dep_wqe[idx]; +} + +void mlx5hws_send_abort_new_dep_wqe(struct mlx5hws_send_engine *queue) +{ + queue->send_ring.send_sq.head_dep_idx--; +} + +void mlx5hws_send_all_dep_wqe(struct mlx5hws_send_engine *queue) +{ + struct mlx5hws_send_ring_sq *send_sq = &queue->send_ring.send_sq; + struct mlx5hws_send_ste_attr ste_attr = {0}; + struct mlx5hws_send_ring_dep_wqe *dep_wqe; + + ste_attr.send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + ste_attr.send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + ste_attr.send_attr.len = MLX5HWS_WQE_SZ_GTA_CTRL + MLX5HWS_WQE_SZ_GTA_DATA; + ste_attr.gta_opcode = MLX5HWS_WQE_GTA_OP_ACTIVATE; + + /* Fence first from previous depend WQEs */ + ste_attr.send_attr.fence = 1; + + while (send_sq->head_dep_idx != send_sq->tail_dep_idx) { + dep_wqe = &send_sq->dep_wqe[send_sq->tail_dep_idx++ & (queue->num_entries - 1)]; + + /* Notify HW on the last WQE */ + ste_attr.send_attr.notify_hw = (send_sq->tail_dep_idx == send_sq->head_dep_idx); + ste_attr.send_attr.user_data = dep_wqe->user_data; + ste_attr.send_attr.rule = dep_wqe->rule; + + ste_attr.rtc_0 = dep_wqe->rtc_0; + ste_attr.rtc_1 = dep_wqe->rtc_1; + ste_attr.retry_rtc_0 = dep_wqe->retry_rtc_0; + ste_attr.retry_rtc_1 = dep_wqe->retry_rtc_1; + ste_attr.used_id_rtc_0 = &dep_wqe->rule->rtc_0; + ste_attr.used_id_rtc_1 = &dep_wqe->rule->rtc_1; + ste_attr.wqe_ctrl = &dep_wqe->wqe_ctrl; + ste_attr.wqe_data = &dep_wqe->wqe_data; + ste_attr.direct_index = dep_wqe->direct_index; + + mlx5hws_send_ste(queue, &ste_attr); + + /* Fencing is done only on the first WQE */ + ste_attr.send_attr.fence = 0; + } +} + +struct mlx5hws_send_engine_post_ctrl +mlx5hws_send_engine_post_start(struct mlx5hws_send_engine *queue) +{ + struct mlx5hws_send_engine_post_ctrl ctrl; + + ctrl.queue = queue; + /* Currently only one send ring is supported */ + ctrl.send_ring = &queue->send_ring; + ctrl.num_wqebbs = 0; + + return ctrl; +} + +void mlx5hws_send_engine_post_req_wqe(struct mlx5hws_send_engine_post_ctrl *ctrl, + char **buf, size_t *len) +{ + struct mlx5hws_send_ring_sq *send_sq = &ctrl->send_ring->send_sq; + unsigned int idx; + + idx = (send_sq->cur_post + ctrl->num_wqebbs) & send_sq->buf_mask; + + /* Note that *buf is a single MLX5_SEND_WQE_BB. It cannot be used + * as buffer of more than one WQE_BB, since the two MLX5_SEND_WQE_BB + * can be on 2 different kernel memory pages. + */ + *buf = mlx5_wq_cyc_get_wqe(&send_sq->wq, idx); + *len = MLX5_SEND_WQE_BB; + + if (!ctrl->num_wqebbs) { + *buf += sizeof(struct mlx5hws_wqe_ctrl_seg); + *len -= sizeof(struct mlx5hws_wqe_ctrl_seg); + } + + ctrl->num_wqebbs++; +} + +static void hws_send_engine_post_ring(struct mlx5hws_send_ring_sq *sq, + struct mlx5hws_wqe_ctrl_seg *doorbell_cseg) +{ + /* ensure wqe is visible to device before updating doorbell record */ + dma_wmb(); + + *sq->wq.db = cpu_to_be32(sq->cur_post); + + /* ensure doorbell record is visible to device before ringing the + * doorbell + */ + wmb(); + + mlx5_write64((__be32 *)doorbell_cseg, sq->uar_map); + + /* Ensure doorbell is written on uar_page before poll_cq */ + WRITE_ONCE(doorbell_cseg, NULL); +} + +static void +hws_send_wqe_set_tag(struct mlx5hws_wqe_gta_data_seg_ste *wqe_data, + struct mlx5hws_rule_match_tag *tag, + bool is_jumbo) +{ + if (is_jumbo) { + /* Clear previous possibly dirty control */ + memset(wqe_data, 0, MLX5HWS_STE_CTRL_SZ); + memcpy(wqe_data->jumbo, tag->jumbo, MLX5HWS_JUMBO_TAG_SZ); + } else { + /* Clear previous possibly dirty control and actions */ + memset(wqe_data, 0, MLX5HWS_STE_CTRL_SZ + MLX5HWS_ACTIONS_SZ); + memcpy(wqe_data->tag, tag->match, MLX5HWS_MATCH_TAG_SZ); + } +} + +void mlx5hws_send_engine_post_end(struct mlx5hws_send_engine_post_ctrl *ctrl, + struct mlx5hws_send_engine_post_attr *attr) +{ + struct mlx5hws_wqe_ctrl_seg *wqe_ctrl; + struct mlx5hws_send_ring_sq *sq; + unsigned int idx; + u32 flags = 0; + + sq = &ctrl->send_ring->send_sq; + idx = sq->cur_post & sq->buf_mask; + sq->last_idx = idx; + + wqe_ctrl = mlx5_wq_cyc_get_wqe(&sq->wq, idx); + + wqe_ctrl->opmod_idx_opcode = + cpu_to_be32((attr->opmod << 24) | + ((sq->cur_post & 0xffff) << 8) | + attr->opcode); + wqe_ctrl->qpn_ds = + cpu_to_be32((attr->len + sizeof(struct mlx5hws_wqe_ctrl_seg)) / 16 | + sq->sqn << 8); + wqe_ctrl->imm = cpu_to_be32(attr->id); + + flags |= attr->notify_hw ? MLX5_WQE_CTRL_CQ_UPDATE : 0; + flags |= attr->fence ? MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE : 0; + wqe_ctrl->flags = cpu_to_be32(flags); + + sq->wr_priv[idx].id = attr->id; + sq->wr_priv[idx].retry_id = attr->retry_id; + + sq->wr_priv[idx].rule = attr->rule; + sq->wr_priv[idx].user_data = attr->user_data; + sq->wr_priv[idx].num_wqebbs = ctrl->num_wqebbs; + + if (attr->rule) { + sq->wr_priv[idx].rule->pending_wqes++; + sq->wr_priv[idx].used_id = attr->used_id; + } + + sq->cur_post += ctrl->num_wqebbs; + + if (attr->notify_hw) + hws_send_engine_post_ring(sq, wqe_ctrl); +} + +static void hws_send_wqe(struct mlx5hws_send_engine *queue, + struct mlx5hws_send_engine_post_attr *send_attr, + struct mlx5hws_wqe_gta_ctrl_seg *send_wqe_ctrl, + void *send_wqe_data, + void *send_wqe_tag, + bool is_jumbo, + u8 gta_opcode, + u32 direct_index) +{ + struct mlx5hws_wqe_gta_data_seg_ste *wqe_data; + struct mlx5hws_wqe_gta_ctrl_seg *wqe_ctrl; + struct mlx5hws_send_engine_post_ctrl ctrl; + size_t wqe_len; + + ctrl = mlx5hws_send_engine_post_start(queue); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_ctrl, &wqe_len); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_data, &wqe_len); + + wqe_ctrl->op_dirix = cpu_to_be32(gta_opcode << 28 | direct_index); + memcpy(wqe_ctrl->stc_ix, send_wqe_ctrl->stc_ix, + sizeof(send_wqe_ctrl->stc_ix)); + + if (send_wqe_data) + memcpy(wqe_data, send_wqe_data, sizeof(*wqe_data)); + else + hws_send_wqe_set_tag(wqe_data, send_wqe_tag, is_jumbo); + + mlx5hws_send_engine_post_end(&ctrl, send_attr); +} + +void mlx5hws_send_ste(struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ste_attr *ste_attr) +{ + struct mlx5hws_send_engine_post_attr *send_attr = &ste_attr->send_attr; + u8 notify_hw = send_attr->notify_hw; + u8 fence = send_attr->fence; + + if (ste_attr->rtc_1) { + send_attr->id = ste_attr->rtc_1; + send_attr->used_id = ste_attr->used_id_rtc_1; + send_attr->retry_id = ste_attr->retry_rtc_1; + send_attr->fence = fence; + send_attr->notify_hw = notify_hw && !ste_attr->rtc_0; + hws_send_wqe(queue, send_attr, + ste_attr->wqe_ctrl, + ste_attr->wqe_data, + ste_attr->wqe_tag, + ste_attr->wqe_tag_is_jumbo, + ste_attr->gta_opcode, + ste_attr->direct_index); + } + + if (ste_attr->rtc_0) { + send_attr->id = ste_attr->rtc_0; + send_attr->used_id = ste_attr->used_id_rtc_0; + send_attr->retry_id = ste_attr->retry_rtc_0; + send_attr->fence = fence && !ste_attr->rtc_1; + send_attr->notify_hw = notify_hw; + hws_send_wqe(queue, send_attr, + ste_attr->wqe_ctrl, + ste_attr->wqe_data, + ste_attr->wqe_tag, + ste_attr->wqe_tag_is_jumbo, + ste_attr->gta_opcode, + ste_attr->direct_index); + } + + /* Restore to original requested values */ + send_attr->notify_hw = notify_hw; + send_attr->fence = fence; +} + +static void hws_send_engine_retry_post_send(struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_priv *priv, + u16 wqe_cnt) +{ + struct mlx5hws_send_engine_post_attr send_attr = {0}; + struct mlx5hws_wqe_gta_data_seg_ste *wqe_data; + struct mlx5hws_wqe_gta_ctrl_seg *wqe_ctrl; + struct mlx5hws_send_engine_post_ctrl ctrl; + struct mlx5hws_send_ring_sq *send_sq; + unsigned int idx; + size_t wqe_len; + char *p; + + send_attr.rule = priv->rule; + send_attr.opcode = MLX5HWS_WQE_OPCODE_TBL_ACCESS; + send_attr.opmod = MLX5HWS_WQE_GTA_OPMOD_STE; + send_attr.len = MLX5_SEND_WQE_BB * 2 - sizeof(struct mlx5hws_wqe_ctrl_seg); + send_attr.notify_hw = 1; + send_attr.fence = 0; + send_attr.user_data = priv->user_data; + send_attr.id = priv->retry_id; + send_attr.used_id = priv->used_id; + + ctrl = mlx5hws_send_engine_post_start(queue); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_ctrl, &wqe_len); + mlx5hws_send_engine_post_req_wqe(&ctrl, (void *)&wqe_data, &wqe_len); + + send_sq = &ctrl.send_ring->send_sq; + idx = wqe_cnt & send_sq->buf_mask; + p = mlx5_wq_cyc_get_wqe(&send_sq->wq, idx); + + /* Copy old gta ctrl */ + memcpy(wqe_ctrl, p + sizeof(struct mlx5hws_wqe_ctrl_seg), + MLX5_SEND_WQE_BB - sizeof(struct mlx5hws_wqe_ctrl_seg)); + + idx = (wqe_cnt + 1) & send_sq->buf_mask; + p = mlx5_wq_cyc_get_wqe(&send_sq->wq, idx); + + /* Copy old gta data */ + memcpy(wqe_data, p, MLX5_SEND_WQE_BB); + + mlx5hws_send_engine_post_end(&ctrl, &send_attr); +} + +void mlx5hws_send_engine_flush_queue(struct mlx5hws_send_engine *queue) +{ + struct mlx5hws_send_ring_sq *sq = &queue->send_ring.send_sq; + struct mlx5hws_wqe_ctrl_seg *wqe_ctrl; + + wqe_ctrl = mlx5_wq_cyc_get_wqe(&sq->wq, sq->last_idx); + wqe_ctrl->flags |= cpu_to_be32(MLX5_WQE_CTRL_CQ_UPDATE); + + hws_send_engine_post_ring(sq, wqe_ctrl); +} + +static void +hws_send_engine_update_rule_resize(struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_priv *priv, + enum mlx5hws_flow_op_status *status) +{ + switch (priv->rule->resize_info->state) { + case MLX5HWS_RULE_RESIZE_STATE_WRITING: + if (priv->rule->status == MLX5HWS_RULE_STATUS_FAILING) { + /* Backup original RTCs */ + u32 orig_rtc_0 = priv->rule->resize_info->rtc_0; + u32 orig_rtc_1 = priv->rule->resize_info->rtc_1; + + /* Delete partially failed move rule using resize_info */ + priv->rule->resize_info->rtc_0 = priv->rule->rtc_0; + priv->rule->resize_info->rtc_1 = priv->rule->rtc_1; + + /* Move rule to original RTC for future delete */ + priv->rule->rtc_0 = orig_rtc_0; + priv->rule->rtc_1 = orig_rtc_1; + } + /* Clean leftovers */ + mlx5hws_rule_move_hws_remove(priv->rule, queue, priv->user_data); + break; + + case MLX5HWS_RULE_RESIZE_STATE_DELETING: + if (priv->rule->status == MLX5HWS_RULE_STATUS_FAILING) { + *status = MLX5HWS_FLOW_OP_ERROR; + } else { + *status = MLX5HWS_FLOW_OP_SUCCESS; + priv->rule->matcher = priv->rule->matcher->resize_dst; + } + priv->rule->resize_info->state = MLX5HWS_RULE_RESIZE_STATE_IDLE; + priv->rule->status = MLX5HWS_RULE_STATUS_CREATED; + break; + + default: + break; + } +} + +static void hws_send_engine_update_rule(struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_priv *priv, + u16 wqe_cnt, + enum mlx5hws_flow_op_status *status) +{ + priv->rule->pending_wqes--; + + if (*status == MLX5HWS_FLOW_OP_ERROR) { + if (priv->retry_id) { + hws_send_engine_retry_post_send(queue, priv, wqe_cnt); + return; + } + /* Some part of the rule failed */ + priv->rule->status = MLX5HWS_RULE_STATUS_FAILING; + *priv->used_id = 0; + } else { + *priv->used_id = priv->id; + } + + /* Update rule status for the last completion */ + if (!priv->rule->pending_wqes) { + if (unlikely(mlx5hws_rule_move_in_progress(priv->rule))) { + hws_send_engine_update_rule_resize(queue, priv, status); + return; + } + + if (unlikely(priv->rule->status == MLX5HWS_RULE_STATUS_FAILING)) { + /* Rule completely failed and doesn't require cleanup */ + if (!priv->rule->rtc_0 && !priv->rule->rtc_1) + priv->rule->status = MLX5HWS_RULE_STATUS_FAILED; + + *status = MLX5HWS_FLOW_OP_ERROR; + } else { + /* Increase the status, this only works on good flow as the enum + * is arrange it away creating -> created -> deleting -> deleted + */ + priv->rule->status++; + *status = MLX5HWS_FLOW_OP_SUCCESS; + /* Rule was deleted now we can safely release action STEs + * and clear resize info + */ + if (priv->rule->status == MLX5HWS_RULE_STATUS_DELETED) { + mlx5hws_rule_free_action_ste(priv->rule); + mlx5hws_rule_clear_resize_info(priv->rule); + } + } + } +} + +static void hws_send_engine_update(struct mlx5hws_send_engine *queue, + struct mlx5_cqe64 *cqe, + struct mlx5hws_send_ring_priv *priv, + struct mlx5hws_flow_op_result res[], + s64 *i, + u32 res_nb, + u16 wqe_cnt) +{ + enum mlx5hws_flow_op_status status; + + if (!cqe || (likely(be32_to_cpu(cqe->byte_cnt) >> 31 == 0) && + likely(get_cqe_opcode(cqe) == MLX5_CQE_REQ))) { + status = MLX5HWS_FLOW_OP_SUCCESS; + } else { + status = MLX5HWS_FLOW_OP_ERROR; + } + + if (priv->user_data) { + if (priv->rule) { + hws_send_engine_update_rule(queue, priv, wqe_cnt, &status); + /* Completion is provided on the last rule WQE */ + if (priv->rule->pending_wqes) + return; + } + + if (*i < res_nb) { + res[*i].user_data = priv->user_data; + res[*i].status = status; + (*i)++; + mlx5hws_send_engine_dec_rule(queue); + } else { + mlx5hws_send_engine_gen_comp(queue, priv->user_data, status); + } + } +} + +static int mlx5hws_parse_cqe(struct mlx5hws_send_ring_cq *cq, + struct mlx5_cqe64 *cqe64) +{ + if (unlikely(get_cqe_opcode(cqe64) != MLX5_CQE_REQ)) { + struct mlx5_err_cqe *err_cqe = (struct mlx5_err_cqe *)cqe64; + + mlx5_core_err(cq->mdev, "Bad OP in HWS SQ CQE: 0x%x\n", get_cqe_opcode(cqe64)); + mlx5_core_err(cq->mdev, "vendor_err_synd=%x\n", err_cqe->vendor_err_synd); + mlx5_core_err(cq->mdev, "syndrome=%x\n", err_cqe->syndrome); + print_hex_dump(KERN_WARNING, "", DUMP_PREFIX_OFFSET, + 16, 1, err_cqe, + sizeof(*err_cqe), false); + return CQ_POLL_ERR; + } + + return CQ_OK; +} + +static int mlx5hws_cq_poll_one(struct mlx5hws_send_ring_cq *cq) +{ + struct mlx5_cqe64 *cqe64; + int err; + + cqe64 = mlx5_cqwq_get_cqe(&cq->wq); + if (!cqe64) { + if (unlikely(cq->mdev->state == + MLX5_DEVICE_STATE_INTERNAL_ERROR)) { + mlx5_core_dbg_once(cq->mdev, + "Polling CQ while device is shutting down\n"); + return CQ_POLL_ERR; + } + return CQ_EMPTY; + } + + mlx5_cqwq_pop(&cq->wq); + err = mlx5hws_parse_cqe(cq, cqe64); + mlx5_cqwq_update_db_record(&cq->wq); + + return err; +} + +static void hws_send_engine_poll_cq(struct mlx5hws_send_engine *queue, + struct mlx5hws_flow_op_result res[], + s64 *polled, + u32 res_nb) +{ + struct mlx5hws_send_ring *send_ring = &queue->send_ring; + struct mlx5hws_send_ring_cq *cq = &send_ring->send_cq; + struct mlx5hws_send_ring_sq *sq = &send_ring->send_sq; + struct mlx5hws_send_ring_priv *priv; + struct mlx5_cqe64 *cqe; + u8 cqe_opcode; + u16 wqe_cnt; + + cqe = mlx5_cqwq_get_cqe(&cq->wq); + if (!cqe) + return; + + cqe_opcode = get_cqe_opcode(cqe); + if (cqe_opcode == MLX5_CQE_INVALID) + return; + + if (unlikely(cqe_opcode != MLX5_CQE_REQ)) + queue->err = true; + + wqe_cnt = be16_to_cpu(cqe->wqe_counter) & sq->buf_mask; + + while (cq->poll_wqe != wqe_cnt) { + priv = &sq->wr_priv[cq->poll_wqe]; + hws_send_engine_update(queue, NULL, priv, res, polled, res_nb, 0); + cq->poll_wqe = (cq->poll_wqe + priv->num_wqebbs) & sq->buf_mask; + } + + priv = &sq->wr_priv[wqe_cnt]; + cq->poll_wqe = (wqe_cnt + priv->num_wqebbs) & sq->buf_mask; + hws_send_engine_update(queue, cqe, priv, res, polled, res_nb, wqe_cnt); + mlx5hws_cq_poll_one(cq); +} + +static void hws_send_engine_poll_list(struct mlx5hws_send_engine *queue, + struct mlx5hws_flow_op_result res[], + s64 *polled, + u32 res_nb) +{ + struct mlx5hws_completed_poll *comp = &queue->completed; + + while (comp->ci != comp->pi) { + if (*polled < res_nb) { + res[*polled].status = + comp->entries[comp->ci].status; + res[*polled].user_data = + comp->entries[comp->ci].user_data; + (*polled)++; + comp->ci = (comp->ci + 1) & comp->mask; + mlx5hws_send_engine_dec_rule(queue); + } else { + return; + } + } +} + +static int hws_send_engine_poll(struct mlx5hws_send_engine *queue, + struct mlx5hws_flow_op_result res[], + u32 res_nb) +{ + s64 polled = 0; + + hws_send_engine_poll_list(queue, res, &polled, res_nb); + + if (polled >= res_nb) + return polled; + + hws_send_engine_poll_cq(queue, res, &polled, res_nb); + + return polled; +} + +int mlx5hws_send_queue_poll(struct mlx5hws_context *ctx, + u16 queue_id, + struct mlx5hws_flow_op_result res[], + u32 res_nb) +{ + return hws_send_engine_poll(&ctx->send_queue[queue_id], res, res_nb); +} + +static int hws_send_ring_alloc_sq(struct mlx5_core_dev *mdev, + int numa_node, + struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_sq *sq, + void *sqc_data) +{ + void *sqc_wq = MLX5_ADDR_OF(sqc, sqc_data, wq); + struct mlx5_wq_cyc *wq = &sq->wq; + struct mlx5_wq_param param; + size_t buf_sz; + int err; + + sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map; + sq->mdev = mdev; + + param.db_numa_node = numa_node; + param.buf_numa_node = numa_node; + err = mlx5_wq_cyc_create(mdev, ¶m, sqc_wq, wq, &sq->wq_ctrl); + if (err) + return err; + wq->db = &wq->db[MLX5_SND_DBR]; + + buf_sz = queue->num_entries * MAX_WQES_PER_RULE; + sq->dep_wqe = kcalloc(queue->num_entries, sizeof(*sq->dep_wqe), GFP_KERNEL); + if (!sq->dep_wqe) { + err = -ENOMEM; + goto destroy_wq_cyc; + } + + sq->wr_priv = kzalloc(sizeof(*sq->wr_priv) * buf_sz, GFP_KERNEL); + if (!sq->dep_wqe) { + err = -ENOMEM; + goto free_dep_wqe; + } + + sq->buf_mask = (queue->num_entries * MAX_WQES_PER_RULE) - 1; + + return 0; + +free_dep_wqe: + kfree(sq->dep_wqe); +destroy_wq_cyc: + mlx5_wq_destroy(&sq->wq_ctrl); + return err; +} + +static void hws_send_ring_free_sq(struct mlx5hws_send_ring_sq *sq) +{ + if (!sq) + return; + kfree(sq->wr_priv); + kfree(sq->dep_wqe); + mlx5_wq_destroy(&sq->wq_ctrl); +} + +static int hws_send_ring_create_sq(struct mlx5_core_dev *mdev, u32 pdn, + void *sqc_data, + struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_sq *sq, + struct mlx5hws_send_ring_cq *cq) +{ + void *in, *sqc, *wq; + int inlen, err; + u8 ts_format; + + inlen = MLX5_ST_SZ_BYTES(create_sq_in) + + sizeof(u64) * sq->wq_ctrl.buf.npages; + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + sqc = MLX5_ADDR_OF(create_sq_in, in, ctx); + wq = MLX5_ADDR_OF(sqc, sqc, wq); + + memcpy(sqc, sqc_data, MLX5_ST_SZ_BYTES(sqc)); + MLX5_SET(sqc, sqc, cqn, cq->mcq.cqn); + + MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RST); + MLX5_SET(sqc, sqc, flush_in_error_en, 1); + + ts_format = mlx5_is_real_time_sq(mdev) ? MLX5_TIMESTAMP_FORMAT_REAL_TIME : + MLX5_TIMESTAMP_FORMAT_FREE_RUNNING; + MLX5_SET(sqc, sqc, ts_format, ts_format); + + MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC); + MLX5_SET(wq, wq, uar_page, mdev->mlx5e_res.hw_objs.bfreg.index); + MLX5_SET(wq, wq, log_wq_pg_sz, sq->wq_ctrl.buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET64(wq, wq, dbr_addr, sq->wq_ctrl.db.dma); + + mlx5_fill_page_frag_array(&sq->wq_ctrl.buf, + (__be64 *)MLX5_ADDR_OF(wq, wq, pas)); + + err = mlx5_core_create_sq(mdev, in, inlen, &sq->sqn); + + kvfree(in); + + return err; +} + +static int hws_send_ring_set_sq_rdy(struct mlx5_core_dev *mdev, u32 sqn) +{ + void *in, *sqc; + int inlen, err; + + inlen = MLX5_ST_SZ_BYTES(modify_sq_in); + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + MLX5_SET(modify_sq_in, in, sq_state, MLX5_SQC_STATE_RST); + sqc = MLX5_ADDR_OF(modify_sq_in, in, ctx); + MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RDY); + + err = mlx5_core_modify_sq(mdev, sqn, in); + + kvfree(in); + + return err; +} + +static void hws_send_ring_close_sq(struct mlx5hws_send_ring_sq *sq) +{ + mlx5_core_destroy_sq(sq->mdev, sq->sqn); + mlx5_wq_destroy(&sq->wq_ctrl); + kfree(sq->wr_priv); + kfree(sq->dep_wqe); +} + +static int hws_send_ring_create_sq_rdy(struct mlx5_core_dev *mdev, u32 pdn, + void *sqc_data, + struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_sq *sq, + struct mlx5hws_send_ring_cq *cq) +{ + int err; + + err = hws_send_ring_create_sq(mdev, pdn, sqc_data, queue, sq, cq); + if (err) + return err; + + err = hws_send_ring_set_sq_rdy(mdev, sq->sqn); + if (err) + hws_send_ring_close_sq(sq); + + return err; +} + +static int hws_send_ring_open_sq(struct mlx5hws_context *ctx, + int numa_node, + struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ring_sq *sq, + struct mlx5hws_send_ring_cq *cq) +{ + size_t buf_sz, sq_log_buf_sz; + void *sqc_data, *wq; + int err; + + sqc_data = kvzalloc(MLX5_ST_SZ_BYTES(sqc), GFP_KERNEL); + if (!sqc_data) + return -ENOMEM; + + buf_sz = queue->num_entries * MAX_WQES_PER_RULE; + sq_log_buf_sz = ilog2(roundup_pow_of_two(buf_sz)); + + wq = MLX5_ADDR_OF(sqc, sqc_data, wq); + MLX5_SET(wq, wq, log_wq_stride, ilog2(MLX5_SEND_WQE_BB)); + MLX5_SET(wq, wq, pd, ctx->pd_num); + MLX5_SET(wq, wq, log_wq_sz, sq_log_buf_sz); + + err = hws_send_ring_alloc_sq(ctx->mdev, numa_node, queue, sq, sqc_data); + if (err) + goto err_free_sqc; + + err = hws_send_ring_create_sq_rdy(ctx->mdev, ctx->pd_num, sqc_data, + queue, sq, cq); + if (err) + goto err_free_sq; + + kvfree(sqc_data); + + return 0; +err_free_sq: + hws_send_ring_free_sq(sq); +err_free_sqc: + kvfree(sqc_data); + return err; +} + +static void hws_cq_complete(struct mlx5_core_cq *mcq, + struct mlx5_eqe *eqe) +{ + pr_err("CQ completion CQ: #%u\n", mcq->cqn); +} + +static int hws_send_ring_alloc_cq(struct mlx5_core_dev *mdev, + int numa_node, + struct mlx5hws_send_engine *queue, + void *cqc_data, + struct mlx5hws_send_ring_cq *cq) +{ + struct mlx5_core_cq *mcq = &cq->mcq; + struct mlx5_wq_param param; + struct mlx5_cqe64 *cqe; + int err; + u32 i; + + param.buf_numa_node = numa_node; + param.db_numa_node = numa_node; + + err = mlx5_cqwq_create(mdev, ¶m, cqc_data, &cq->wq, &cq->wq_ctrl); + if (err) + return err; + + mcq->cqe_sz = 64; + mcq->set_ci_db = cq->wq_ctrl.db.db; + mcq->arm_db = cq->wq_ctrl.db.db + 1; + mcq->comp = hws_cq_complete; + + for (i = 0; i < mlx5_cqwq_get_size(&cq->wq); i++) { + cqe = mlx5_cqwq_get_wqe(&cq->wq, i); + cqe->op_own = 0xf1; + } + + cq->mdev = mdev; + + return 0; +} + +static int hws_send_ring_create_cq(struct mlx5_core_dev *mdev, + struct mlx5hws_send_engine *queue, + void *cqc_data, + struct mlx5hws_send_ring_cq *cq) +{ + u32 out[MLX5_ST_SZ_DW(create_cq_out)]; + struct mlx5_core_cq *mcq = &cq->mcq; + void *in, *cqc; + int inlen, eqn; + int err; + + err = mlx5_comp_eqn_get(mdev, 0, &eqn); + if (err) + return err; + + inlen = MLX5_ST_SZ_BYTES(create_cq_in) + + sizeof(u64) * cq->wq_ctrl.buf.npages; + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + cqc = MLX5_ADDR_OF(create_cq_in, in, cq_context); + memcpy(cqc, cqc_data, MLX5_ST_SZ_BYTES(cqc)); + mlx5_fill_page_frag_array(&cq->wq_ctrl.buf, + (__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas)); + + MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn); + MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index); + MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma); + + err = mlx5_core_create_cq(mdev, mcq, in, inlen, out, sizeof(out)); + + kvfree(in); + + return err; +} + +static int hws_send_ring_open_cq(struct mlx5_core_dev *mdev, + struct mlx5hws_send_engine *queue, + int numa_node, + struct mlx5hws_send_ring_cq *cq) +{ + void *cqc_data; + int err; + + cqc_data = kvzalloc(MLX5_ST_SZ_BYTES(cqc), GFP_KERNEL); + if (!cqc_data) + return -ENOMEM; + + MLX5_SET(cqc, cqc_data, uar_page, mdev->priv.uar->index); + MLX5_SET(cqc, cqc_data, cqe_sz, queue->num_entries); + MLX5_SET(cqc, cqc_data, log_cq_size, ilog2(queue->num_entries)); + + err = hws_send_ring_alloc_cq(mdev, numa_node, queue, cqc_data, cq); + if (err) + goto err_out; + + err = hws_send_ring_create_cq(mdev, queue, cqc_data, cq); + if (err) + goto err_free_cq; + + kvfree(cqc_data); + + return 0; + +err_free_cq: + mlx5_wq_destroy(&cq->wq_ctrl); +err_out: + kvfree(cqc_data); + return err; +} + +static void hws_send_ring_close_cq(struct mlx5hws_send_ring_cq *cq) +{ + mlx5_core_destroy_cq(cq->mdev, &cq->mcq); + mlx5_wq_destroy(&cq->wq_ctrl); +} + +static void hws_send_ring_close(struct mlx5hws_send_engine *queue) +{ + hws_send_ring_close_sq(&queue->send_ring.send_sq); + hws_send_ring_close_cq(&queue->send_ring.send_cq); +} + +static int mlx5hws_send_ring_open(struct mlx5hws_context *ctx, + struct mlx5hws_send_engine *queue) +{ + int numa_node = dev_to_node(mlx5_core_dma_dev(ctx->mdev)); + struct mlx5hws_send_ring *ring = &queue->send_ring; + int err; + + err = hws_send_ring_open_cq(ctx->mdev, queue, numa_node, &ring->send_cq); + if (err) + return err; + + err = hws_send_ring_open_sq(ctx, numa_node, queue, &ring->send_sq, + &ring->send_cq); + if (err) + goto close_cq; + + return err; + +close_cq: + hws_send_ring_close_cq(&ring->send_cq); + return err; +} + +void mlx5hws_send_queue_close(struct mlx5hws_send_engine *queue) +{ + hws_send_ring_close(queue); + kfree(queue->completed.entries); +} + +int mlx5hws_send_queue_open(struct mlx5hws_context *ctx, + struct mlx5hws_send_engine *queue, + u16 queue_size) +{ + int err; + + mutex_init(&queue->lock); + + queue->num_entries = roundup_pow_of_two(queue_size); + queue->used_entries = 0; + + queue->completed.entries = kcalloc(queue->num_entries, + sizeof(queue->completed.entries[0]), + GFP_KERNEL); + if (!queue->completed.entries) + return -ENOMEM; + + queue->completed.pi = 0; + queue->completed.ci = 0; + queue->completed.mask = queue->num_entries - 1; + err = mlx5hws_send_ring_open(ctx, queue); + if (err) + goto free_completed_entries; + + return 0; + +free_completed_entries: + kfree(queue->completed.entries); + return err; +} + +static void __hws_send_queues_close(struct mlx5hws_context *ctx, u16 queues) +{ + while (queues--) + mlx5hws_send_queue_close(&ctx->send_queue[queues]); +} + +static void hws_send_queues_bwc_locks_destroy(struct mlx5hws_context *ctx) +{ + int bwc_queues = ctx->queues - 1; + int i; + + if (!mlx5hws_context_bwc_supported(ctx)) + return; + + for (i = 0; i < bwc_queues; i++) + mutex_destroy(&ctx->bwc_send_queue_locks[i]); + kfree(ctx->bwc_send_queue_locks); +} + +void mlx5hws_send_queues_close(struct mlx5hws_context *ctx) +{ + hws_send_queues_bwc_locks_destroy(ctx); + __hws_send_queues_close(ctx, ctx->queues); + kfree(ctx->send_queue); +} + +static int hws_bwc_send_queues_init(struct mlx5hws_context *ctx) +{ + /* Number of BWC queues is equal to number of the usual HWS queues */ + int bwc_queues = ctx->queues - 1; + int i; + + if (!mlx5hws_context_bwc_supported(ctx)) + return 0; + + ctx->queues += bwc_queues; + + ctx->bwc_send_queue_locks = kcalloc(bwc_queues, + sizeof(*ctx->bwc_send_queue_locks), + GFP_KERNEL); + + if (!ctx->bwc_send_queue_locks) + return -ENOMEM; + + for (i = 0; i < bwc_queues; i++) + mutex_init(&ctx->bwc_send_queue_locks[i]); + + return 0; +} + +int mlx5hws_send_queues_open(struct mlx5hws_context *ctx, + u16 queues, + u16 queue_size) +{ + int err = 0; + u32 i; + + /* Open one extra queue for control path */ + ctx->queues = queues + 1; + + /* open a separate set of queues and locks for bwc API */ + err = hws_bwc_send_queues_init(ctx); + if (err) + return err; + + ctx->send_queue = kcalloc(ctx->queues, sizeof(*ctx->send_queue), GFP_KERNEL); + if (!ctx->send_queue) { + err = -ENOMEM; + goto free_bwc_locks; + } + + for (i = 0; i < ctx->queues; i++) { + err = mlx5hws_send_queue_open(ctx, &ctx->send_queue[i], queue_size); + if (err) + goto close_send_queues; + } + + return 0; + +close_send_queues: + __hws_send_queues_close(ctx, i); + + kfree(ctx->send_queue); + +free_bwc_locks: + hws_send_queues_bwc_locks_destroy(ctx); + + return err; +} + +int mlx5hws_send_queue_action(struct mlx5hws_context *ctx, + u16 queue_id, + u32 actions) +{ + struct mlx5hws_send_ring_sq *send_sq; + struct mlx5hws_send_engine *queue; + bool wait_comp = false; + s64 polled = 0; + + queue = &ctx->send_queue[queue_id]; + send_sq = &queue->send_ring.send_sq; + + switch (actions) { + case MLX5HWS_SEND_QUEUE_ACTION_DRAIN_SYNC: + wait_comp = true; + fallthrough; + case MLX5HWS_SEND_QUEUE_ACTION_DRAIN_ASYNC: + if (send_sq->head_dep_idx != send_sq->tail_dep_idx) + /* Send dependent WQEs to drain the queue */ + mlx5hws_send_all_dep_wqe(queue); + else + /* Signal on the last posted WQE */ + mlx5hws_send_engine_flush_queue(queue); + + /* Poll queue until empty */ + while (wait_comp && !mlx5hws_send_engine_empty(queue)) + hws_send_engine_poll_cq(queue, NULL, &polled, 0); + + break; + default: + return -EINVAL; + } + + return 0; +} + +static int +hws_send_wqe_fw(struct mlx5_core_dev *mdev, + u32 pd_num, + struct mlx5hws_send_engine_post_attr *send_attr, + struct mlx5hws_wqe_gta_ctrl_seg *send_wqe_ctrl, + void *send_wqe_match_data, + void *send_wqe_match_tag, + void *send_wqe_range_data, + void *send_wqe_range_tag, + bool is_jumbo, + u8 gta_opcode) +{ + bool has_range = send_wqe_range_data || send_wqe_range_tag; + bool has_match = send_wqe_match_data || send_wqe_match_tag; + struct mlx5hws_wqe_gta_data_seg_ste gta_wqe_data0 = {0}; + struct mlx5hws_wqe_gta_data_seg_ste gta_wqe_data1 = {0}; + struct mlx5hws_wqe_gta_ctrl_seg gta_wqe_ctrl = {0}; + struct mlx5hws_cmd_generate_wqe_attr attr = {0}; + struct mlx5hws_wqe_ctrl_seg wqe_ctrl = {0}; + struct mlx5_cqe64 cqe; + u32 flags = 0; + int ret; + + /* Set WQE control */ + wqe_ctrl.opmod_idx_opcode = cpu_to_be32((send_attr->opmod << 24) | send_attr->opcode); + wqe_ctrl.qpn_ds = cpu_to_be32((send_attr->len + sizeof(struct mlx5hws_wqe_ctrl_seg)) / 16); + flags |= send_attr->notify_hw ? MLX5_WQE_CTRL_CQ_UPDATE : 0; + wqe_ctrl.flags = cpu_to_be32(flags); + wqe_ctrl.imm = cpu_to_be32(send_attr->id); + + /* Set GTA WQE CTRL */ + memcpy(gta_wqe_ctrl.stc_ix, send_wqe_ctrl->stc_ix, sizeof(send_wqe_ctrl->stc_ix)); + gta_wqe_ctrl.op_dirix = cpu_to_be32(gta_opcode << 28); + + /* Set GTA match WQE DATA */ + if (has_match) { + if (send_wqe_match_data) + memcpy(>a_wqe_data0, send_wqe_match_data, sizeof(gta_wqe_data0)); + else + hws_send_wqe_set_tag(>a_wqe_data0, send_wqe_match_tag, is_jumbo); + + gta_wqe_data0.rsvd1_definer = cpu_to_be32(send_attr->match_definer_id << 8); + attr.gta_data_0 = (u8 *)>a_wqe_data0; + } + + /* Set GTA range WQE DATA */ + if (has_range) { + if (send_wqe_range_data) + memcpy(>a_wqe_data1, send_wqe_range_data, sizeof(gta_wqe_data1)); + else + hws_send_wqe_set_tag(>a_wqe_data1, send_wqe_range_tag, false); + + gta_wqe_data1.rsvd1_definer = cpu_to_be32(send_attr->range_definer_id << 8); + attr.gta_data_1 = (u8 *)>a_wqe_data1; + } + + attr.pdn = pd_num; + attr.wqe_ctrl = (u8 *)&wqe_ctrl; + attr.gta_ctrl = (u8 *)>a_wqe_ctrl; + +send_wqe: + ret = mlx5hws_cmd_generate_wqe(mdev, &attr, &cqe); + if (ret) { + mlx5_core_err(mdev, "Failed to write WQE using command"); + return ret; + } + + if ((get_cqe_opcode(&cqe) == MLX5_CQE_REQ) && + (be32_to_cpu(cqe.byte_cnt) >> 31 == 0)) { + *send_attr->used_id = send_attr->id; + return 0; + } + + /* Retry if rule failed */ + if (send_attr->retry_id) { + wqe_ctrl.imm = cpu_to_be32(send_attr->retry_id); + send_attr->id = send_attr->retry_id; + send_attr->retry_id = 0; + goto send_wqe; + } + + return -1; +} + +void mlx5hws_send_stes_fw(struct mlx5hws_context *ctx, + struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ste_attr *ste_attr) +{ + struct mlx5hws_send_engine_post_attr *send_attr = &ste_attr->send_attr; + struct mlx5hws_rule *rule = send_attr->rule; + struct mlx5_core_dev *mdev; + u16 queue_id; + u32 pdn; + int ret; + + queue_id = queue - ctx->send_queue; + mdev = ctx->mdev; + pdn = ctx->pd_num; + + /* Writing through FW can't HW fence, therefore we drain the queue */ + if (send_attr->fence) + mlx5hws_send_queue_action(ctx, + queue_id, + MLX5HWS_SEND_QUEUE_ACTION_DRAIN_SYNC); + + if (ste_attr->rtc_1) { + send_attr->id = ste_attr->rtc_1; + send_attr->used_id = ste_attr->used_id_rtc_1; + send_attr->retry_id = ste_attr->retry_rtc_1; + ret = hws_send_wqe_fw(mdev, pdn, send_attr, + ste_attr->wqe_ctrl, + ste_attr->wqe_data, + ste_attr->wqe_tag, + ste_attr->range_wqe_data, + ste_attr->range_wqe_tag, + ste_attr->wqe_tag_is_jumbo, + ste_attr->gta_opcode); + if (ret) + goto fail_rule; + } + + if (ste_attr->rtc_0) { + send_attr->id = ste_attr->rtc_0; + send_attr->used_id = ste_attr->used_id_rtc_0; + send_attr->retry_id = ste_attr->retry_rtc_0; + ret = hws_send_wqe_fw(mdev, pdn, send_attr, + ste_attr->wqe_ctrl, + ste_attr->wqe_data, + ste_attr->wqe_tag, + ste_attr->range_wqe_data, + ste_attr->range_wqe_tag, + ste_attr->wqe_tag_is_jumbo, + ste_attr->gta_opcode); + if (ret) + goto fail_rule; + } + + /* Increase the status, this only works on good flow as the enum + * is arrange it away creating -> created -> deleting -> deleted + */ + if (likely(rule)) + rule->status++; + + mlx5hws_send_engine_gen_comp(queue, send_attr->user_data, MLX5HWS_FLOW_OP_SUCCESS); + + return; + +fail_rule: + if (likely(rule)) + rule->status = !rule->rtc_0 && !rule->rtc_1 ? + MLX5HWS_RULE_STATUS_FAILED : MLX5HWS_RULE_STATUS_FAILING; + + mlx5hws_send_engine_gen_comp(queue, send_attr->user_data, MLX5HWS_FLOW_OP_ERROR); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.h new file mode 100644 index 0000000000000..b50825d6dc53d --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.h @@ -0,0 +1,270 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_SEND_H_ +#define MLX5HWS_SEND_H_ + +/* As a single operation requires at least two WQEBBS. + * This means a maximum of 16 such operations per rule. + */ +#define MAX_WQES_PER_RULE 32 + +enum mlx5hws_wqe_opcode { + MLX5HWS_WQE_OPCODE_TBL_ACCESS = 0x2c, +}; + +enum mlx5hws_wqe_opmod { + MLX5HWS_WQE_OPMOD_GTA_STE = 0, + MLX5HWS_WQE_OPMOD_GTA_MOD_ARG = 1, +}; + +enum mlx5hws_wqe_gta_opcode { + MLX5HWS_WQE_GTA_OP_ACTIVATE = 0, + MLX5HWS_WQE_GTA_OP_DEACTIVATE = 1, +}; + +enum mlx5hws_wqe_gta_opmod { + MLX5HWS_WQE_GTA_OPMOD_STE = 0, + MLX5HWS_WQE_GTA_OPMOD_MOD_ARG = 1, +}; + +enum mlx5hws_wqe_gta_sz { + MLX5HWS_WQE_SZ_GTA_CTRL = 48, + MLX5HWS_WQE_SZ_GTA_DATA = 64, +}; + +/* WQE Control segment. */ +struct mlx5hws_wqe_ctrl_seg { + __be32 opmod_idx_opcode; + __be32 qpn_ds; + __be32 flags; + __be32 imm; +}; + +struct mlx5hws_wqe_gta_ctrl_seg { + __be32 op_dirix; + __be32 stc_ix[5]; + __be32 rsvd0[6]; +}; + +struct mlx5hws_wqe_gta_data_seg_ste { + __be32 rsvd0_ctr_id; + __be32 rsvd1_definer; + __be32 rsvd2[3]; + union { + struct { + __be32 action[3]; + __be32 tag[8]; + }; + __be32 jumbo[11]; + }; +}; + +struct mlx5hws_wqe_gta_data_seg_arg { + __be32 action_args[8]; +}; + +struct mlx5hws_wqe_gta { + struct mlx5hws_wqe_gta_ctrl_seg gta_ctrl; + union { + struct mlx5hws_wqe_gta_data_seg_ste seg_ste; + struct mlx5hws_wqe_gta_data_seg_arg seg_arg; + }; +}; + +struct mlx5hws_send_ring_cq { + struct mlx5_core_dev *mdev; + struct mlx5_cqwq wq; + struct mlx5_wq_ctrl wq_ctrl; + struct mlx5_core_cq mcq; + u16 poll_wqe; +}; + +struct mlx5hws_send_ring_priv { + struct mlx5hws_rule *rule; + void *user_data; + u32 num_wqebbs; + u32 id; + u32 retry_id; + u32 *used_id; +}; + +struct mlx5hws_send_ring_dep_wqe { + struct mlx5hws_wqe_gta_ctrl_seg wqe_ctrl; + struct mlx5hws_wqe_gta_data_seg_ste wqe_data; + struct mlx5hws_rule *rule; + u32 rtc_0; + u32 rtc_1; + u32 retry_rtc_0; + u32 retry_rtc_1; + u32 direct_index; + void *user_data; +}; + +struct mlx5hws_send_ring_sq { + struct mlx5_core_dev *mdev; + u16 cur_post; + u16 buf_mask; + struct mlx5hws_send_ring_priv *wr_priv; + unsigned int last_idx; + struct mlx5hws_send_ring_dep_wqe *dep_wqe; + unsigned int head_dep_idx; + unsigned int tail_dep_idx; + u32 sqn; + struct mlx5_wq_cyc wq; + struct mlx5_wq_ctrl wq_ctrl; + void __iomem *uar_map; +}; + +struct mlx5hws_send_ring { + struct mlx5hws_send_ring_cq send_cq; + struct mlx5hws_send_ring_sq send_sq; +}; + +struct mlx5hws_completed_poll_entry { + void *user_data; + enum mlx5hws_flow_op_status status; +}; + +struct mlx5hws_completed_poll { + struct mlx5hws_completed_poll_entry *entries; + u16 ci; + u16 pi; + u16 mask; +}; + +struct mlx5hws_send_engine { + struct mlx5hws_send_ring send_ring; + struct mlx5_uars_page *uar; /* Uar is shared between rings of a queue */ + struct mlx5hws_completed_poll completed; + u16 used_entries; + u16 num_entries; + bool err; + struct mutex lock; /* Protects the send engine */ +}; + +struct mlx5hws_send_engine_post_ctrl { + struct mlx5hws_send_engine *queue; + struct mlx5hws_send_ring *send_ring; + size_t num_wqebbs; +}; + +struct mlx5hws_send_engine_post_attr { + u8 opcode; + u8 opmod; + u8 notify_hw; + u8 fence; + u8 match_definer_id; + u8 range_definer_id; + size_t len; + struct mlx5hws_rule *rule; + u32 id; + u32 retry_id; + u32 *used_id; + void *user_data; +}; + +struct mlx5hws_send_ste_attr { + u32 rtc_0; + u32 rtc_1; + u32 retry_rtc_0; + u32 retry_rtc_1; + u32 *used_id_rtc_0; + u32 *used_id_rtc_1; + bool wqe_tag_is_jumbo; + u8 gta_opcode; + u32 direct_index; + struct mlx5hws_send_engine_post_attr send_attr; + struct mlx5hws_rule_match_tag *wqe_tag; + struct mlx5hws_rule_match_tag *range_wqe_tag; + struct mlx5hws_wqe_gta_ctrl_seg *wqe_ctrl; + struct mlx5hws_wqe_gta_data_seg_ste *wqe_data; + struct mlx5hws_wqe_gta_data_seg_ste *range_wqe_data; +}; + +struct mlx5hws_send_ring_dep_wqe * +mlx5hws_send_add_new_dep_wqe(struct mlx5hws_send_engine *queue); + +void mlx5hws_send_abort_new_dep_wqe(struct mlx5hws_send_engine *queue); + +void mlx5hws_send_all_dep_wqe(struct mlx5hws_send_engine *queue); + +void mlx5hws_send_queue_close(struct mlx5hws_send_engine *queue); + +int mlx5hws_send_queue_open(struct mlx5hws_context *ctx, + struct mlx5hws_send_engine *queue, + u16 queue_size); + +void mlx5hws_send_queues_close(struct mlx5hws_context *ctx); + +int mlx5hws_send_queues_open(struct mlx5hws_context *ctx, + u16 queues, + u16 queue_size); + +int mlx5hws_send_queue_action(struct mlx5hws_context *ctx, + u16 queue_id, + u32 actions); + +int mlx5hws_send_test(struct mlx5hws_context *ctx, + u16 queues, + u16 queue_size); + +struct mlx5hws_send_engine_post_ctrl +mlx5hws_send_engine_post_start(struct mlx5hws_send_engine *queue); + +void mlx5hws_send_engine_post_req_wqe(struct mlx5hws_send_engine_post_ctrl *ctrl, + char **buf, size_t *len); + +void mlx5hws_send_engine_post_end(struct mlx5hws_send_engine_post_ctrl *ctrl, + struct mlx5hws_send_engine_post_attr *attr); + +void mlx5hws_send_ste(struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ste_attr *ste_attr); + +void mlx5hws_send_stes_fw(struct mlx5hws_context *ctx, + struct mlx5hws_send_engine *queue, + struct mlx5hws_send_ste_attr *ste_attr); + +void mlx5hws_send_engine_flush_queue(struct mlx5hws_send_engine *queue); + +static inline bool mlx5hws_send_engine_empty(struct mlx5hws_send_engine *queue) +{ + struct mlx5hws_send_ring_sq *send_sq = &queue->send_ring.send_sq; + struct mlx5hws_send_ring_cq *send_cq = &queue->send_ring.send_cq; + + return ((send_sq->cur_post & send_sq->buf_mask) == send_cq->poll_wqe); +} + +static inline bool mlx5hws_send_engine_full(struct mlx5hws_send_engine *queue) +{ + return queue->used_entries >= queue->num_entries; +} + +static inline void mlx5hws_send_engine_inc_rule(struct mlx5hws_send_engine *queue) +{ + queue->used_entries++; +} + +static inline void mlx5hws_send_engine_dec_rule(struct mlx5hws_send_engine *queue) +{ + queue->used_entries--; +} + +static inline void mlx5hws_send_engine_gen_comp(struct mlx5hws_send_engine *queue, + void *user_data, + int comp_status) +{ + struct mlx5hws_completed_poll *comp = &queue->completed; + + comp->entries[comp->pi].status = comp_status; + comp->entries[comp->pi].user_data = user_data; + + comp->pi = (comp->pi + 1) & comp->mask; +} + +static inline bool mlx5hws_send_engine_err(struct mlx5hws_send_engine *queue) +{ + return queue->err; +} + +#endif /* MLX5HWS_SEND_H_ */ From 9c0d7942b52215ee2bef16428983b6fdeaf0a9a2 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 20 Jun 2024 02:46:40 +0300 Subject: [PATCH 101/548] net/mlx5: HWS, added API and enabled HWS support Enabling HWS support in the mlx5 driver: - added HWS API header - added HWS files in the mlx5 driver makefile - added kconfig flag that enables HWS compilation Reviewed-by: Erez Shitrit Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed (cherry picked from commit 510f9f61a1121a296a45962760d5e2824277fa37) Signed-off-by: Koba Ko --- .../ethernet/mellanox/mlx5/kconfig.rst | 3 + .../net/ethernet/mellanox/mlx5/core/Kconfig | 10 + .../net/ethernet/mellanox/mlx5/core/Makefile | 21 + .../mellanox/mlx5/core/steering/hws/Makefile | 2 + .../mellanox/mlx5/core/steering/hws/mlx5hws.h | 954 ++++++++++++++++++ .../mlx5/core/steering/hws/mlx5hws_internal.h | 8 +- 6 files changed, 994 insertions(+), 4 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/Makefile create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst index 0a42c3395ffab..6cd8254337025 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/kconfig.rst @@ -130,6 +130,9 @@ Enabling the driver and kconfig options | Build support for software-managed steering in the NIC. +**CONFIG_MLX5_HW_STEERING=(y/n)** + +| Build support for hardware-managed steering in the NIC. **CONFIG_MLX5_TC_CT=(y/n)** diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig index c4f4de82e29e2..277cf53247f93 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -172,6 +172,16 @@ config MLX5_SW_STEERING help Build support for software-managed steering in the NIC. +config MLX5_HW_STEERING + bool "Mellanox Technologies hardware-managed steering" + depends on MLX5_CORE_EN && MLX5_ESWITCH + default y + help + Build support for Hardware-Managed Flow Steering (HMFS) in the NIC. + HMFS is a new approach to managing steering rules where STEs are + written to ICM by HW (as opposed to SW in software-managed steering), + which allows higher rate of rule insertion. + config MLX5_SF bool "Mellanox Technologies subfunction device support using auxiliary device" depends on MLX5_CORE && MLX5_CORE_EN diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index a78731065f3b0..6bfdcaef877a5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -119,6 +119,27 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o steering/dr_action.o steering/fs_dr.o \ steering/dr_definer.o steering/dr_ptrn.o \ steering/dr_arg.o steering/dr_dbg.o lib/smfs.o + +# +# HW Steering +# +mlx5_core-$(CONFIG_MLX5_HW_STEERING) += steering/hws/mlx5hws_cmd.o \ + steering/hws/mlx5hws_context.o \ + steering/hws/mlx5hws_pat_arg.o \ + steering/hws/mlx5hws_buddy.o \ + steering/hws/mlx5hws_pool.o \ + steering/hws/mlx5hws_table.o \ + steering/hws/mlx5hws_action.o \ + steering/hws/mlx5hws_rule.o \ + steering/hws/mlx5hws_matcher.o \ + steering/hws/mlx5hws_send.o \ + steering/hws/mlx5hws_definer.o \ + steering/hws/mlx5hws_bwc.o \ + steering/hws/mlx5hws_debug.o \ + steering/hws/mlx5hws_vport.o \ + steering/hws/mlx5hws_bwc_complex.o + + # # SF device # diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/Makefile new file mode 100644 index 0000000000000..c78512eed8d73 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/Makefile @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +subdir-ccflags-y += -I$(src)/.. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h new file mode 100644 index 0000000000000..9a85d4e12f772 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h @@ -0,0 +1,954 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ + +#ifndef MLX5HWS_H_ +#define MLX5HWS_H_ + +struct mlx5hws_context; +struct mlx5hws_table; +struct mlx5hws_matcher; +struct mlx5hws_rule; + +enum mlx5hws_table_type { + MLX5HWS_TABLE_TYPE_FDB, + MLX5HWS_TABLE_TYPE_MAX, +}; + +enum mlx5hws_matcher_resource_mode { + /* Allocate resources based on number of rules with minimal failure probability */ + MLX5HWS_MATCHER_RESOURCE_MODE_RULE, + /* Allocate fixed size hash table based on given column and rows */ + MLX5HWS_MATCHER_RESOURCE_MODE_HTABLE, +}; + +enum mlx5hws_action_type { + MLX5HWS_ACTION_TYP_LAST, + MLX5HWS_ACTION_TYP_REFORMAT_TNL_L2_TO_L2, + MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L2, + MLX5HWS_ACTION_TYP_REFORMAT_TNL_L3_TO_L2, + MLX5HWS_ACTION_TYP_REFORMAT_L2_TO_TNL_L3, + MLX5HWS_ACTION_TYP_DROP, + MLX5HWS_ACTION_TYP_MISS, + MLX5HWS_ACTION_TYP_TBL, + MLX5HWS_ACTION_TYP_CTR, + MLX5HWS_ACTION_TYP_TAG, + MLX5HWS_ACTION_TYP_MODIFY_HDR, + MLX5HWS_ACTION_TYP_VPORT, + MLX5HWS_ACTION_TYP_POP_VLAN, + MLX5HWS_ACTION_TYP_PUSH_VLAN, + MLX5HWS_ACTION_TYP_ASO_METER, + MLX5HWS_ACTION_TYP_INSERT_HEADER, + MLX5HWS_ACTION_TYP_REMOVE_HEADER, + MLX5HWS_ACTION_TYP_RANGE, + MLX5HWS_ACTION_TYP_SAMPLER, + MLX5HWS_ACTION_TYP_DEST_ARRAY, + MLX5HWS_ACTION_TYP_MAX, +}; + +enum mlx5hws_action_flags { + MLX5HWS_ACTION_FLAG_HWS_FDB = 1 << 0, + /* Shared action can be used over a few threads, since the + * data is written only once at the creation of the action. + */ + MLX5HWS_ACTION_FLAG_SHARED = 1 << 1, +}; + +enum mlx5hws_action_aso_meter_color { + MLX5HWS_ACTION_ASO_METER_COLOR_RED = 0x0, + MLX5HWS_ACTION_ASO_METER_COLOR_YELLOW = 0x1, + MLX5HWS_ACTION_ASO_METER_COLOR_GREEN = 0x2, + MLX5HWS_ACTION_ASO_METER_COLOR_UNDEFINED = 0x3, +}; + +enum mlx5hws_send_queue_actions { + /* Start executing all pending queued rules */ + MLX5HWS_SEND_QUEUE_ACTION_DRAIN_ASYNC = 1 << 0, + /* Start executing all pending queued rules wait till completion */ + MLX5HWS_SEND_QUEUE_ACTION_DRAIN_SYNC = 1 << 1, +}; + +struct mlx5hws_context_attr { + u16 queues; + u16 queue_size; + bool bwc; /* add support for backward compatible API*/ +}; + +struct mlx5hws_table_attr { + enum mlx5hws_table_type type; + u32 level; +}; + +enum mlx5hws_matcher_flow_src { + MLX5HWS_MATCHER_FLOW_SRC_ANY = 0x0, + MLX5HWS_MATCHER_FLOW_SRC_WIRE = 0x1, + MLX5HWS_MATCHER_FLOW_SRC_VPORT = 0x2, +}; + +enum mlx5hws_matcher_insert_mode { + MLX5HWS_MATCHER_INSERT_BY_HASH = 0x0, + MLX5HWS_MATCHER_INSERT_BY_INDEX = 0x1, +}; + +enum mlx5hws_matcher_distribute_mode { + MLX5HWS_MATCHER_DISTRIBUTE_BY_HASH = 0x0, + MLX5HWS_MATCHER_DISTRIBUTE_BY_LINEAR = 0x1, +}; + +struct mlx5hws_matcher_attr { + /* Processing priority inside table */ + u32 priority; + /* Provide all rules with unique rule_idx in num_log range to reduce locking */ + bool optimize_using_rule_idx; + /* Resource mode and corresponding size */ + enum mlx5hws_matcher_resource_mode mode; + /* Optimize insertion in case packet origin is the same for all rules */ + enum mlx5hws_matcher_flow_src optimize_flow_src; + /* Define the insertion and distribution modes for this matcher */ + enum mlx5hws_matcher_insert_mode insert_mode; + enum mlx5hws_matcher_distribute_mode distribute_mode; + /* Define whether the created matcher supports resizing into a bigger matcher */ + bool resizable; + union { + struct { + u8 sz_row_log; + u8 sz_col_log; + } table; + + struct { + u8 num_log; + } rule; + }; + /* Optional AT attach configuration - Max number of additional AT */ + u8 max_num_of_at_attach; +}; + +struct mlx5hws_rule_attr { + void *user_data; + /* Valid if matcher optimize_using_rule_idx is set or + * if matcher is configured to insert rules by index. + */ + u32 rule_idx; + u32 flow_source; + u16 queue_id; + u32 burst:1; +}; + +/* In actions that take offset, the offset is unique, pointing to a single + * resource and the user should not reuse the same index because data changing + * is not atomic. + */ +struct mlx5hws_rule_action { + struct mlx5hws_action *action; + union { + struct { + u32 value; + } tag; + + struct { + u32 offset; + } counter; + + struct { + u32 offset; + u8 *data; + } modify_header; + + struct { + u32 offset; + u8 hdr_idx; + u8 *data; + } reformat; + + struct { + __be32 vlan_hdr; + } push_vlan; + + struct { + u32 offset; + enum mlx5hws_action_aso_meter_color init_color; + } aso_meter; + }; +}; + +struct mlx5hws_action_reformat_header { + size_t sz; + void *data; +}; + +struct mlx5hws_action_insert_header { + struct mlx5hws_action_reformat_header hdr; + /* PRM start anchor to which header will be inserted */ + u8 anchor; + /* Header insertion offset in bytes, from the start + * anchor to the location where new header will be inserted. + */ + u8 offset; + /* Indicates this header insertion adds encapsulation header to the packet, + * requiring device to update offloaded fields (for example IPv4 total length). + */ + bool encap; +}; + +struct mlx5hws_action_remove_header_attr { + /* PRM start anchor from which header will be removed */ + u8 anchor; + /* Header remove offset in bytes, from the start + * anchor to the location where remove header starts. + */ + u8 offset; + /* Indicates the removed header size in bytes */ + size_t size; +}; + +struct mlx5hws_action_mh_pattern { + /* Byte size of modify actions provided by "data" */ + size_t sz; + /* PRM format modify actions pattern */ + __be64 *data; +}; + +struct mlx5hws_action_dest_attr { + /* Required destination action to forward the packet */ + struct mlx5hws_action *dest; + /* Optional reformat action */ + struct mlx5hws_action *reformat; +}; + +/* Check whether HWS is supported + * + * @param[in] mdev + * The device to check. + * @return true if supported, false otherwise. + */ +static inline bool mlx5hws_is_supported(struct mlx5_core_dev *mdev) +{ + u8 ignore_flow_level_rtc_valid; + u8 wqe_based_flow_table_update; + + wqe_based_flow_table_update = + MLX5_CAP_GEN(mdev, wqe_based_flow_table_update_cap); + ignore_flow_level_rtc_valid = + MLX5_CAP_FLOWTABLE(mdev, + flow_table_properties_nic_receive.ignore_flow_level_rtc_valid); + + return wqe_based_flow_table_update && ignore_flow_level_rtc_valid; +} + +/* Open a context used for direct rule insertion using hardware steering. + * Each context can contain multiple tables of different types. + * + * @param[in] mdev + * The device to be used for HWS. + * @param[in] attr + * Attributes used for context open. + * @return pointer to mlx5hws_context on success NULL otherwise. + */ +struct mlx5hws_context * +mlx5hws_context_open(struct mlx5_core_dev *mdev, + struct mlx5hws_context_attr *attr); + +/* Close a context used for direct hardware steering. + * + * @param[in] ctx + * mlx5hws context to close. + * @return zero on success non zero otherwise. + */ +int mlx5hws_context_close(struct mlx5hws_context *ctx); + +/* Set a peer context, each context can have multiple contexts as peers. + * + * @param[in] ctx + * The context in which the peer_ctx will be peered to it. + * @param[in] peer_ctx + * The peer context. + * @param[in] peer_vhca_id + * The peer context vhca id. + */ +void mlx5hws_context_set_peer(struct mlx5hws_context *ctx, + struct mlx5hws_context *peer_ctx, + u16 peer_vhca_id); + +/* Create a new direct rule table. Each table can contain multiple matchers. + * + * @param[in] ctx + * The context in which the new table will be opened. + * @param[in] attr + * Attributes used for table creation. + * @return pointer to mlx5hws_table on success NULL otherwise. + */ +struct mlx5hws_table * +mlx5hws_table_create(struct mlx5hws_context *ctx, + struct mlx5hws_table_attr *attr); + +/* Destroy direct rule table. + * + * @param[in] tbl + * Table to destroy. + * @return zero on success non zero otherwise. + */ +int mlx5hws_table_destroy(struct mlx5hws_table *tbl); + +/* Get ID of the flow table. + * + * @param[in] tbl + * Table to get ID of. + * @return ID of the table. + */ +u32 mlx5hws_table_get_id(struct mlx5hws_table *tbl); + +/* Set default miss table for mlx5hws_table by using another mlx5hws_table + * Traffic which all table matchers miss will be forwarded to miss table. + * + * @param[in] tbl + * Source table + * @param[in] miss_tbl + * Target (miss) table, or NULL to remove current miss table + * @return zero on success non zero otherwise. + */ +int mlx5hws_table_set_default_miss(struct mlx5hws_table *tbl, + struct mlx5hws_table *miss_tbl); + +/* Create new match template based on items mask, the match template + * will be used for matcher creation. + * + * @param[in] ctx + * The context in which the new template will be created. + * @param[in] match_param + * Describe the mask based on PRM match parameters + * @param[in] match_param_sz + * Size of match param buffer + * @param[in] match_criteria_enable + * Bitmap for each sub-set in match_criteria buffer + * @return pointer to mlx5hws_match_template on success NULL otherwise + */ +struct mlx5hws_match_template * +mlx5hws_match_template_create(struct mlx5hws_context *ctx, + u32 *match_param, + u32 match_param_sz, + u8 match_criteria_enable); + +/* Destroy match template. + * + * @param[in] mt + * Match template to destroy. + * @return zero on success non zero otherwise. + */ +int mlx5hws_match_template_destroy(struct mlx5hws_match_template *mt); + +/* Create new action template based on action_type array, the action template + * will be used for matcher creation. + * + * @param[in] action_type + * An array of actions based on the order of actions which will be provided + * with rule_actions to mlx5hws_rule_create. The last action is marked + * using MLX5HWS_ACTION_TYP_LAST. + * @return pointer to mlx5hws_action_template on success NULL otherwise + */ +struct mlx5hws_action_template * +mlx5hws_action_template_create(enum mlx5hws_action_type action_type[]); + +/* Destroy action template. + * + * @param[in] at + * Action template to destroy. + * @return zero on success non zero otherwise. + */ +int mlx5hws_action_template_destroy(struct mlx5hws_action_template *at); + +/* Create a new direct rule matcher. Each matcher can contain multiple rules. + * Matchers on the table will be processed by priority. Matching fields and + * mask are described by the match template. In some cases multiple match + * templates can be used on the same matcher. + * + * @param[in] table + * The table in which the new matcher will be opened. + * @param[in] mt + * Array of match templates to be used on matcher. + * @param[in] num_of_mt + * Number of match templates in mt array. + * @param[in] at + * Array of action templates to be used on matcher. + * @param[in] num_of_at + * Number of action templates in mt array. + * @param[in] attr + * Attributes used for matcher creation. + * @return pointer to mlx5hws_matcher on success NULL otherwise. + */ +struct mlx5hws_matcher * +mlx5hws_matcher_create(struct mlx5hws_table *table, + struct mlx5hws_match_template *mt[], + u8 num_of_mt, + struct mlx5hws_action_template *at[], + u8 num_of_at, + struct mlx5hws_matcher_attr *attr); + +/* Destroy direct rule matcher. + * + * @param[in] matcher + * Matcher to destroy. + * @return zero on success non zero otherwise. + */ +int mlx5hws_matcher_destroy(struct mlx5hws_matcher *matcher); + +/* Attach new action template to direct rule matcher. + * + * @param[in] matcher + * Matcher to attach at to. + * @param[in] at + * Action template to be attached to the matcher. + * @return zero on success non zero otherwise. + */ +int mlx5hws_matcher_attach_at(struct mlx5hws_matcher *matcher, + struct mlx5hws_action_template *at); + +/* Link two matchers and enable moving rules from src matcher to dst matcher. + * Both matchers must be in the same table type, must be created with 'resizable' + * property, and should have the same characteristics (e.g. same mt, same at). + * + * It is the user's responsibility to make sure that the dst matcher + * was allocated with the appropriate size. + * + * Once the function is completed, the user is: + * - allowed to move rules from src into dst matcher + * - no longer allowed to insert rules to the src matcher + * + * The user is always allowed to insert rules to the dst matcher and + * to delete rules from any matcher. + * + * @param[in] src_matcher + * source matcher for moving rules from + * @param[in] dst_matcher + * destination matcher for moving rules to + * @return zero on successful move, non zero otherwise. + */ +int mlx5hws_matcher_resize_set_target(struct mlx5hws_matcher *src_matcher, + struct mlx5hws_matcher *dst_matcher); + +/* Enqueue moving rule operation: moving rule from src matcher to a dst matcher + * + * @param[in] src_matcher + * matcher that the rule belongs to + * @param[in] rule + * the rule to move + * @param[in] attr + * rule attributes + * @return zero on success, non zero otherwise. + */ +int mlx5hws_matcher_resize_rule_move(struct mlx5hws_matcher *src_matcher, + struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr); + +/* Enqueue create rule operation. + * + * @param[in] matcher + * The matcher in which the new rule will be created. + * @param[in] mt_idx + * Match template index to create the match with. + * @param[in] match_param + * The match parameter PRM buffer used for the value matching. + * @param[in] rule_actions + * Rule action to be executed on match. + * @param[in] at_idx + * Action template index to apply the actions with. + * @param[in] num_of_actions + * Number of rule actions. + * @param[in] attr + * Rule creation attributes. + * @param[in, out] rule_handle + * A valid rule handle. The handle doesn't require any initialization. + * @return zero on successful enqueue non zero otherwise. + */ +int mlx5hws_rule_create(struct mlx5hws_matcher *matcher, + u8 mt_idx, + u32 *match_param, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *attr, + struct mlx5hws_rule *rule_handle); + +/* Enqueue destroy rule operation. + * + * @param[in] rule + * The rule destruction to enqueue. + * @param[in] attr + * Rule destruction attributes. + * @return zero on successful enqueue non zero otherwise. + */ +int mlx5hws_rule_destroy(struct mlx5hws_rule *rule, + struct mlx5hws_rule_attr *attr); + +/* Enqueue update actions on an existing rule. + * + * @param[in, out] rule_handle + * A valid rule handle to update. + * @param[in] at_idx + * Action template index to update the actions with. + * @param[in] rule_actions + * Rule action to be executed on match. + * @param[in] attr + * Rule update attributes. + * @return zero on successful enqueue non zero otherwise. + */ +int mlx5hws_rule_action_update(struct mlx5hws_rule *rule, + u8 at_idx, + struct mlx5hws_rule_action rule_actions[], + struct mlx5hws_rule_attr *attr); + +/* Get action type. + * + * @param[in] action + * The action to get the type of. + * @return action type. + */ +enum mlx5hws_action_type +mlx5hws_action_get_type(struct mlx5hws_action *action); + +/* Create direct rule drop action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_dest_drop(struct mlx5hws_context *ctx, + u32 flags); + +/* Create direct rule default miss action. + * Defaults are RX: Drop TX: Wire. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_default_miss(struct mlx5hws_context *ctx, + u32 flags); + +/* Create direct rule goto table action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] tbl + * Destination table. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_dest_table(struct mlx5hws_context *ctx, + struct mlx5hws_table *tbl, + u32 flags); + +/* Create direct rule goto table number action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] tbl_num + * Destination table number. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_dest_table_num(struct mlx5hws_context *ctx, + u32 table_num, u32 flags); + +/* Create direct rule range match action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] field + * Field to comapare the value. + * @param[in] hit_ft + * Flow table to go to on hit. + * @param[in] miss_ft + * Flow table to go to on miss. + * @param[in] min + * Minimal value of the field to be considered as hit. + * @param[in] max + * Maximal value of the field to be considered as hit. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_dest_match_range(struct mlx5hws_context *ctx, + u32 field, + struct mlx5_flow_table *hit_ft, + struct mlx5_flow_table *miss_ft, + u32 min, u32 max, u32 flags); + +/* Create direct rule flow sampler action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] sampler_id + * Flow sampler object ID. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_flow_sampler(struct mlx5hws_context *ctx, + u32 sampler_id, u32 flags); + +/* Create direct rule goto vport action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] vport_num + * Destination vport number. + * @param[in] vhca_id_valid + * Tells if the vhca_id parameter is valid. + * @param[in] vhca_id + * VHCA ID of the destination vport. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_dest_vport(struct mlx5hws_context *ctx, + u16 vport_num, + bool vhca_id_valid, + u16 vhca_id, + u32 flags); + +/* Create direct rule TAG action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_tag(struct mlx5hws_context *ctx, + u32 flags); + +/* Create direct rule counter action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] obj_id + * Direct rule counter object ID. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_counter(struct mlx5hws_context *ctx, + u32 obj_id, + u32 flags); + +/* Create direct rule reformat action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] reformat_type + * Type of reformat prefixed with MLX5HWS_ACTION_TYP_REFORMAT. + * @param[in] num_of_hdrs + * Number of provided headers in "hdrs" array. + * @param[in] hdrs + * Headers array containing header information. + * @param[in] log_bulk_size + * Number of unique values used with this reformat. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_reformat(struct mlx5hws_context *ctx, + enum mlx5hws_action_type reformat_type, + u8 num_of_hdrs, + struct mlx5hws_action_reformat_header *hdrs, + u32 log_bulk_size, + u32 flags); + +/* Create direct rule modify header action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] num_of_patterns + * Number of provided patterns in "patterns" array. + * @param[in] patterns + * Patterns array containing pattern information. + * @param[in] log_bulk_size + * Number of unique values used with this pattern. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_modify_header(struct mlx5hws_context *ctx, + u8 num_of_patterns, + struct mlx5hws_action_mh_pattern *patterns, + u32 log_bulk_size, + u32 flags); + +/* Create direct rule ASO flow meter action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] obj_id + * ASO object ID. + * @param[in] return_reg_c + * Copy the ASO object value into this reg_c, after a packet hits a rule with this ASO object. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_aso_meter(struct mlx5hws_context *ctx, + u32 obj_id, + u8 return_reg_c, + u32 flags); + +/* Create direct rule pop vlan action. + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_pop_vlan(struct mlx5hws_context *ctx, u32 flags); + +/* Create direct rule push vlan action. + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_push_vlan(struct mlx5hws_context *ctx, u32 flags); + +/* Create a dest array action, this action can duplicate packets and forward to + * multiple destinations in the destination list. + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] num_dest + * The number of dests attributes. + * @param[in] dests + * The destination array. Each contains a destination action and can have + * additional actions. + * @param[in] ignore_flow_level + * Boolean that says whether to turn on 'ignore_flow_level' for this dest. + * @param[in] flow_source + * Source port of the traffic for this actions. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_dest_array(struct mlx5hws_context *ctx, + size_t num_dest, + struct mlx5hws_action_dest_attr *dests, + bool ignore_flow_level, + u32 flow_source, + u32 flags); + +/* Create insert header action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] num_of_hdrs + * Number of provided headers in "hdrs" array. + * @param[in] hdrs + * Headers array containing header information. + * @param[in] log_bulk_size + * Number of unique values used with this insert header. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_insert_header(struct mlx5hws_context *ctx, + u8 num_of_hdrs, + struct mlx5hws_action_insert_header *hdrs, + u32 log_bulk_size, + u32 flags); + +/* Create remove header action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] attr + * attributes: specifies the remove header type, PRM start anchor and + * the PRM end anchor or the PRM start anchor and remove size in bytes. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_remove_header(struct mlx5hws_context *ctx, + struct mlx5hws_action_remove_header_attr *attr, + u32 flags); + +/* Create direct rule LAST action. + * + * @param[in] ctx + * The context in which the new action will be created. + * @param[in] flags + * Action creation flags. (enum mlx5hws_action_flags) + * @return pointer to mlx5hws_action on success NULL otherwise. + */ +struct mlx5hws_action * +mlx5hws_action_create_last(struct mlx5hws_context *ctx, u32 flags); + +/* Destroy direct rule action. + * + * @param[in] action + * The action to destroy. + * @return zero on success non zero otherwise. + */ +int mlx5hws_action_destroy(struct mlx5hws_action *action); + +enum mlx5hws_flow_op_status { + MLX5HWS_FLOW_OP_SUCCESS, + MLX5HWS_FLOW_OP_ERROR, +}; + +struct mlx5hws_flow_op_result { + enum mlx5hws_flow_op_status status; + void *user_data; +}; + +/* Poll queue for rule creation and deletions completions. + * + * @param[in] ctx + * The context to which the queue belong to. + * @param[in] queue_id + * The id of the queue to poll. + * @param[in, out] res + * Completion array. + * @param[in] res_nb + * Maximum number of results to return. + * @return negative number on failure, the number of completions otherwise. + */ +int mlx5hws_send_queue_poll(struct mlx5hws_context *ctx, + u16 queue_id, + struct mlx5hws_flow_op_result res[], + u32 res_nb); + +/* Perform an action on the queue + * + * @param[in] ctx + * The context to which the queue belong to. + * @param[in] queue_id + * The id of the queue to perform the action on. + * @param[in] actions + * Actions to perform on the queue. (enum mlx5hws_send_queue_actions) + * @return zero on success non zero otherwise. + */ +int mlx5hws_send_queue_action(struct mlx5hws_context *ctx, + u16 queue_id, + u32 actions); + +/* Dump HWS info + * + * @param[in] ctx + * The context which to dump the info from. + * @return zero on success non zero otherwise. + */ +int mlx5hws_debug_dump(struct mlx5hws_context *ctx); + +struct mlx5hws_bwc_matcher; +struct mlx5hws_bwc_rule; + +struct mlx5hws_match_parameters { + size_t match_sz; + u32 *match_buf; /* Device spec format */ +}; + +/* Create a new BWC direct rule matcher. + * This function does the following: + * - creates match template based on flow items + * - creates an empty action template + * - creates a usual mlx5hws_matcher with these mt and at, setting + * its size to minimal + * Notes: + * - table->ctx must have BWC support + * - complex rules are not supported + * + * @param[in] table + * The table in which the new matcher will be opened + * @param[in] priority + * Priority for this BWC matcher + * @param[in] match_criteria_enable + * Bitmask that defines matching criteria + * @param[in] mask + * Match parameters + * @return pointer to mlx5hws_bwc_matcher on success or NULL otherwise. + */ +struct mlx5hws_bwc_matcher * +mlx5hws_bwc_matcher_create(struct mlx5hws_table *table, + u32 priority, + u8 match_criteria_enable, + struct mlx5hws_match_parameters *mask); + +/* Destroy BWC direct rule matcher. + * + * @param[in] bwc_matcher + * Matcher to destroy + * @return zero on success, non zero otherwise + */ +int mlx5hws_bwc_matcher_destroy(struct mlx5hws_bwc_matcher *bwc_matcher); + +/* Create a new BWC rule. + * Unlike the usual rule creation function, this one is blocking: when the + * function returns, the rule is written to its place (no need to poll). + * This function does the following: + * - finds matching action template based on the provided rule_actions, or + * creates new action template if matching action template doesn't exist + * - updates corresponding BWC matcher stats + * - if needed, the function performs rehash: + * - creates a new matcher based on mt, at, new_sz + * - moves all the existing matcher rules to the new matcher + * - removes the old matcher + * - inserts new rule + * - polls till completion is received + * Notes: + * - matcher->tbl->ctx must have BWC support + * - separate BWC ctx queues are used + * + * @param[in] bwc_matcher + * The BWC matcher in which the new rule will be created. + * @param[in] params + * Match perameters + * @param[in] flow_source + * Flow source for this rule + * @param[in] rule_actions + * Rule action to be executed on match + * @return valid BWC rule handle on success, NULL otherwise + */ +struct mlx5hws_bwc_rule * +mlx5hws_bwc_rule_create(struct mlx5hws_bwc_matcher *bwc_matcher, + struct mlx5hws_match_parameters *params, + u32 flow_source, + struct mlx5hws_rule_action rule_actions[]); + +/* Destroy BWC direct rule. + * + * @param[in] bwc_rule + * Rule to destroy + * @return zero on success, non zero otherwise + */ +int mlx5hws_bwc_rule_destroy(struct mlx5hws_bwc_rule *bwc_rule); + +/* Update actions on an existing BWC rule. + * + * @param[in] bwc_rule + * Rule to update + * @param[in] rule_actions + * Rule action to update with + * @return zero on successful update, non zero otherwise. + */ +int mlx5hws_bwc_rule_action_update(struct mlx5hws_bwc_rule *bwc_rule, + struct mlx5hws_rule_action rule_actions[]); + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h index 458bc31be6ac4..5643be1cd5bf3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h @@ -5,13 +5,13 @@ #define MLX5HWS_INTERNAL_H_ #include - -#include "../dr_types.h" +#include +#include "fs_core.h" +#include "wq.h" +#include "lib/mlx5.h" #include "mlx5hws_prm.h" - #include "mlx5hws.h" - #include "mlx5hws_pool.h" #include "mlx5hws_vport.h" #include "mlx5hws_context.h" From 9dde3384feabd8bd384f69d89e48b0a8406b7b77 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 31 Oct 2024 14:58:53 +0200 Subject: [PATCH 102/548] net/mlx5: DR, moved all the SWS code into a separate directory After adding HWS support in a separate folder, moving all the SWS code into its own folder as well. Now SWS and HWS implementation are located in their appropriate folders: - steering/sws/ - steering/hws/ Signed-off-by: Yevgeny Kliteynik Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241031125856.530927-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit e03cf321882badadc77dfc428f465173523d5f79) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/Makefile | 33 +++++++++++++------ .../net/ethernet/mellanox/mlx5/core/fs_core.h | 2 +- .../ethernet/mellanox/mlx5/core/lib/smfs.h | 4 +-- .../mlx5/core/steering/{ => sws}/dr_action.c | 0 .../mlx5/core/steering/{ => sws}/dr_arg.c | 0 .../mlx5/core/steering/{ => sws}/dr_buddy.c | 0 .../mlx5/core/steering/{ => sws}/dr_cmd.c | 0 .../mlx5/core/steering/{ => sws}/dr_dbg.c | 0 .../mlx5/core/steering/{ => sws}/dr_dbg.h | 0 .../mlx5/core/steering/{ => sws}/dr_definer.c | 0 .../mlx5/core/steering/{ => sws}/dr_domain.c | 0 .../mlx5/core/steering/{ => sws}/dr_fw.c | 0 .../core/steering/{ => sws}/dr_icm_pool.c | 0 .../mlx5/core/steering/{ => sws}/dr_matcher.c | 0 .../mlx5/core/steering/{ => sws}/dr_ptrn.c | 0 .../mlx5/core/steering/{ => sws}/dr_rule.c | 0 .../mlx5/core/steering/{ => sws}/dr_send.c | 0 .../mlx5/core/steering/{ => sws}/dr_ste.c | 0 .../mlx5/core/steering/{ => sws}/dr_ste.h | 0 .../mlx5/core/steering/{ => sws}/dr_ste_v0.c | 0 .../mlx5/core/steering/{ => sws}/dr_ste_v1.c | 0 .../mlx5/core/steering/{ => sws}/dr_ste_v1.h | 0 .../mlx5/core/steering/{ => sws}/dr_ste_v2.c | 0 .../mlx5/core/steering/{ => sws}/dr_table.c | 0 .../mlx5/core/steering/{ => sws}/dr_types.h | 0 .../mlx5/core/steering/{ => sws}/fs_dr.c | 0 .../mlx5/core/steering/{ => sws}/fs_dr.h | 0 .../core/steering/{ => sws}/mlx5_ifc_dr.h | 0 .../steering/{ => sws}/mlx5_ifc_dr_ste_v1.h | 0 .../mlx5/core/steering/{ => sws}/mlx5dr.h | 0 30 files changed, 26 insertions(+), 13 deletions(-) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_action.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_arg.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_buddy.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_cmd.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_dbg.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_dbg.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_definer.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_domain.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_fw.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_icm_pool.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_matcher.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ptrn.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_rule.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_send.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ste.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ste.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ste_v0.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ste_v1.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ste_v1.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_ste_v2.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_table.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/dr_types.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/fs_dr.c (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/fs_dr.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/mlx5_ifc_dr.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/mlx5_ifc_dr_ste_v1.h (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/{ => sws}/mlx5dr.h (100%) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 6bfdcaef877a5..425acb16e68e1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -109,16 +109,29 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/ktls_stats.o \ en_accel/fs_tcp.o en_accel/ktls.o en_accel/ktls_txrx.o \ en_accel/ktls_tx.o en_accel/ktls_rx.o -mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \ - steering/dr_matcher.o steering/dr_rule.o \ - steering/dr_icm_pool.o steering/dr_buddy.o \ - steering/dr_ste.o steering/dr_send.o \ - steering/dr_ste_v0.o steering/dr_ste_v1.o \ - steering/dr_ste_v2.o \ - steering/dr_cmd.o steering/dr_fw.o \ - steering/dr_action.o steering/fs_dr.o \ - steering/dr_definer.o steering/dr_ptrn.o \ - steering/dr_arg.o steering/dr_dbg.o lib/smfs.o +# +# SW Steering +# +mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/sws/dr_domain.o \ + steering/sws/dr_table.o \ + steering/sws/dr_matcher.o \ + steering/sws/dr_rule.o \ + steering/sws/dr_icm_pool.o \ + steering/sws/dr_buddy.o \ + steering/sws/dr_ste.o \ + steering/sws/dr_send.o \ + steering/sws/dr_ste_v0.o \ + steering/sws/dr_ste_v1.o \ + steering/sws/dr_ste_v2.o \ + steering/sws/dr_cmd.o \ + steering/sws/dr_fw.o \ + steering/sws/dr_action.o \ + steering/sws/dr_definer.o \ + steering/sws/dr_ptrn.o \ + steering/sws/dr_arg.o \ + steering/sws/dr_dbg.o \ + steering/sws/fs_dr.o \ + lib/smfs.o # # HW Steering diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h index 1d48471a3e40d..6761346b176c2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h @@ -37,7 +37,7 @@ #include #include #include -#include +#include #define FDB_TC_MAX_CHAIN 3 #define FDB_FT_CHAIN (FDB_TC_MAX_CHAIN + 1) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/smfs.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/smfs.h index 452d0df339acd..404f3d4b6380c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/smfs.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/smfs.h @@ -4,8 +4,8 @@ #ifndef __MLX5_LIB_SMFS_H__ #define __MLX5_LIB_SMFS_H__ -#include "steering/mlx5dr.h" -#include "steering/dr_types.h" +#include "steering/sws/mlx5dr.h" +#include "steering/sws/dr_types.h" struct mlx5dr_matcher * mlx5_smfs_matcher_create(struct mlx5dr_table *table, u32 priority, struct mlx5_flow_spec *spec); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_action.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_action.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_arg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_arg.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_arg.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_arg.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_buddy.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_buddy.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_buddy.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_buddy.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_cmd.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_cmd.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_dbg.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_dbg.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_dbg.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_dbg.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_definer.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_definer.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_definer.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_definer.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_domain.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_domain.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_fw.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_fw.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_icm_pool.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_icm_pool.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_matcher.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_matcher.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ptrn.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ptrn.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_rule.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_rule.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v0.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v0.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v2.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v2.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_table.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_table.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_types.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_types.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/fs_dr.c similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/fs_dr.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/fs_dr.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/fs_dr.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr_ste_v1.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr_ste_v1.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr_ste_v1.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr_ste_v1.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5dr.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5dr.h From e9542605a887a8ceb0d8f5e53e25a55d017de667 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 31 Oct 2024 14:58:54 +0200 Subject: [PATCH 103/548] net/mlx5: HWS, renamed the files in accordance with naming convention Removed the 'mlx5hws_' file name prefix from the internal HWS files. Signed-off-by: Yevgeny Kliteynik Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241031125856.530927-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit a2740138ec659b53f3a7670b34819737af596c59) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/Makefile | 30 ++++++++-------- .../hws/{mlx5hws_action.c => action.c} | 2 +- .../hws/{mlx5hws_action.h => action.h} | 6 ++-- .../steering/hws/{mlx5hws_buddy.c => buddy.c} | 4 +-- .../steering/hws/{mlx5hws_buddy.h => buddy.h} | 6 ++-- .../steering/hws/{mlx5hws_bwc.c => bwc.c} | 2 +- .../steering/hws/{mlx5hws_bwc.h => bwc.h} | 6 ++-- .../{mlx5hws_bwc_complex.c => bwc_complex.c} | 2 +- .../{mlx5hws_bwc_complex.h => bwc_complex.h} | 6 ++-- .../steering/hws/{mlx5hws_cmd.c => cmd.c} | 2 +- .../steering/hws/{mlx5hws_cmd.h => cmd.h} | 6 ++-- .../hws/{mlx5hws_context.c => context.c} | 2 +- .../hws/{mlx5hws_context.h => context.h} | 6 ++-- .../steering/hws/{mlx5hws_debug.c => debug.c} | 2 +- .../steering/hws/{mlx5hws_debug.h => debug.h} | 6 ++-- .../hws/{mlx5hws_definer.c => definer.c} | 2 +- .../hws/{mlx5hws_definer.h => definer.h} | 6 ++-- .../hws/{mlx5hws_internal.h => internal.h} | 36 +++++++++---------- .../hws/{mlx5hws_matcher.c => matcher.c} | 2 +- .../hws/{mlx5hws_matcher.h => matcher.h} | 6 ++-- .../hws/{mlx5hws_pat_arg.c => pat_arg.c} | 2 +- .../hws/{mlx5hws_pat_arg.h => pat_arg.h} | 0 .../steering/hws/{mlx5hws_pool.c => pool.c} | 4 +-- .../steering/hws/{mlx5hws_pool.h => pool.h} | 0 .../steering/hws/{mlx5hws_prm.h => prm.h} | 0 .../steering/hws/{mlx5hws_rule.c => rule.c} | 2 +- .../steering/hws/{mlx5hws_rule.h => rule.h} | 0 .../steering/hws/{mlx5hws_send.c => send.c} | 2 +- .../steering/hws/{mlx5hws_send.h => send.h} | 0 .../steering/hws/{mlx5hws_table.c => table.c} | 2 +- .../steering/hws/{mlx5hws_table.h => table.h} | 0 .../steering/hws/{mlx5hws_vport.c => vport.c} | 2 +- .../steering/hws/{mlx5hws_vport.h => vport.h} | 0 33 files changed, 77 insertions(+), 77 deletions(-) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_action.c => action.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_action.h => action.h} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_buddy.c => buddy.c} (98%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_buddy.h => buddy.h} (86%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_bwc.c => bwc.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_bwc.h => bwc.h} (96%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_bwc_complex.c => bwc_complex.c} (98%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_bwc_complex.h => bwc_complex.h} (90%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_cmd.c => cmd.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_cmd.h => cmd.h} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_context.c => context.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_context.h => context.h} (95%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_debug.c => debug.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_debug.h => debug.h} (93%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_definer.c => definer.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_definer.h => definer.h} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_internal.h => internal.h} (67%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_matcher.c => matcher.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_matcher.h => matcher.h} (96%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_pat_arg.c => pat_arg.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_pat_arg.h => pat_arg.h} (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_pool.c => pool.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_pool.h => pool.h} (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_prm.h => prm.h} (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_rule.c => rule.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_rule.h => rule.h} (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_send.c => send.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_send.h => send.h} (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_table.c => table.c} (99%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_table.h => table.h} (100%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_vport.c => vport.c} (98%) rename drivers/net/ethernet/mellanox/mlx5/core/steering/hws/{mlx5hws_vport.h => vport.h} (100%) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 425acb16e68e1..cad9a4226c1f3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -136,21 +136,21 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/sws/dr_domain.o \ # # HW Steering # -mlx5_core-$(CONFIG_MLX5_HW_STEERING) += steering/hws/mlx5hws_cmd.o \ - steering/hws/mlx5hws_context.o \ - steering/hws/mlx5hws_pat_arg.o \ - steering/hws/mlx5hws_buddy.o \ - steering/hws/mlx5hws_pool.o \ - steering/hws/mlx5hws_table.o \ - steering/hws/mlx5hws_action.o \ - steering/hws/mlx5hws_rule.o \ - steering/hws/mlx5hws_matcher.o \ - steering/hws/mlx5hws_send.o \ - steering/hws/mlx5hws_definer.o \ - steering/hws/mlx5hws_bwc.o \ - steering/hws/mlx5hws_debug.o \ - steering/hws/mlx5hws_vport.o \ - steering/hws/mlx5hws_bwc_complex.o +mlx5_core-$(CONFIG_MLX5_HW_STEERING) += steering/hws/cmd.o \ + steering/hws/context.o \ + steering/hws/pat_arg.o \ + steering/hws/buddy.o \ + steering/hws/pool.o \ + steering/hws/table.o \ + steering/hws/action.o \ + steering/hws/rule.o \ + steering/hws/matcher.o \ + steering/hws/send.o \ + steering/hws/definer.o \ + steering/hws/bwc.o \ + steering/hws/debug.o \ + steering/hws/vport.o \ + steering/hws/bwc_complex.o # diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/action.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/action.c index b27bb41065326..a897cdc60fdbd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/action.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" #define MLX5HWS_ACTION_METER_INIT_COLOR_OFFSET 1 diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/action.h similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/action.h index bf5c1b2410060..e8f562c318267 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_action.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/action.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_ACTION_H_ -#define MLX5HWS_ACTION_H_ +#ifndef HWS_ACTION_H_ +#define HWS_ACTION_H_ /* Max number of STEs needed for a rule (including match) */ #define MLX5HWS_ACTION_MAX_STE 20 @@ -304,4 +304,4 @@ mlx5hws_action_apply_setter(struct mlx5hws_actions_apply_data *apply, htonl(num_of_actions << 29); } -#endif /* MLX5HWS_ACTION_H_ */ +#endif /* HWS_ACTION_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/buddy.c similarity index 98% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/buddy.c index e6ed66202a404..b9aef80ba094b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/buddy.c @@ -1,8 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" -#include "mlx5hws_buddy.h" +#include "internal.h" +#include "buddy.h" static int hws_buddy_init(struct mlx5hws_buddy_mem *buddy, u32 max_order) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/buddy.h similarity index 86% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/buddy.h index 338c44bbedaff..ef6b223677aa4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_buddy.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/buddy.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_BUDDY_H_ -#define MLX5HWS_BUDDY_H_ +#ifndef HWS_BUDDY_H_ +#define HWS_BUDDY_H_ struct mlx5hws_buddy_mem { unsigned long **bitmap; @@ -18,4 +18,4 @@ int mlx5hws_buddy_alloc_mem(struct mlx5hws_buddy_mem *buddy, u32 order); void mlx5hws_buddy_free_mem(struct mlx5hws_buddy_mem *buddy, u32 seg, u32 order); -#endif /* MLX5HWS_BUDDY_H_ */ +#endif /* HWS_BUDDY_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.c index bd52b05db3670..ffced670699f6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" static u16 hws_bwc_gen_queue_idx(struct mlx5hws_context *ctx) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h similarity index 96% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h index 4fe8c32d8fbe8..0b745968e21e1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_BWC_H_ -#define MLX5HWS_BWC_H_ +#ifndef HWS_BWC_H_ +#define HWS_BWC_H_ #define MLX5HWS_BWC_MATCHER_INIT_SIZE_LOG 1 #define MLX5HWS_BWC_MATCHER_SIZE_LOG_STEP 1 @@ -70,4 +70,4 @@ static inline u16 mlx5hws_bwc_get_queue_id(struct mlx5hws_context *ctx, u16 idx) return idx + mlx5hws_bwc_queues(ctx); } -#endif /* MLX5HWS_BWC_H_ */ +#endif /* HWS_BWC_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc_complex.c similarity index 98% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc_complex.c index bb563f50ef09a..1b6595778eec4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc_complex.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" bool mlx5hws_bwc_match_params_is_complex(struct mlx5hws_context *ctx, u8 match_criteria_enable, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc_complex.h similarity index 90% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc_complex.h index 068ee81186098..340f0688e3947 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_bwc_complex.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc_complex.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_BWC_COMPLEX_H_ -#define MLX5HWS_BWC_COMPLEX_H_ +#ifndef HWS_BWC_COMPLEX_H_ +#define HWS_BWC_COMPLEX_H_ bool mlx5hws_bwc_match_params_is_complex(struct mlx5hws_context *ctx, u8 match_criteria_enable, @@ -26,4 +26,4 @@ int mlx5hws_bwc_rule_create_complex(struct mlx5hws_bwc_rule *bwc_rule, int mlx5hws_bwc_rule_destroy_complex(struct mlx5hws_bwc_rule *bwc_rule); -#endif /* MLX5HWS_BWC_COMPLEX_H_ */ +#endif /* HWS_BWC_COMPLEX_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/cmd.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/cmd.c index 2c7b14172049b..c00c138c3366b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/cmd.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" static enum mlx5_ifc_flow_destination_type hws_cmd_dest_type_to_ifc_dest_type(enum mlx5_flow_destination_type type) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/cmd.h similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/cmd.h index 2fbcf4ff571a0..434f62b0904ec 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_cmd.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/cmd.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_CMD_H_ -#define MLX5HWS_CMD_H_ +#ifndef HWS_CMD_H_ +#define HWS_CMD_H_ #define WIRE_PORT 0xFFFF @@ -358,4 +358,4 @@ int mlx5hws_cmd_allow_other_vhca_access(struct mlx5_core_dev *mdev, int mlx5hws_cmd_query_gvmi(struct mlx5_core_dev *mdev, bool other_function, u16 vport_number, u16 *gvmi); -#endif /* MLX5HWS_CMD_H_ */ +#endif /* HWS_CMD_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c index 00e4fdf4a558b..fd48b05e91e05 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA CORPORATION. All rights reserved. */ -#include "mlx5hws_internal.h" +#include "internal.h" bool mlx5hws_context_cap_dynamic_reparse(struct mlx5hws_context *ctx) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h similarity index 95% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h index e5a7ce6043340..3cc8770b1a70d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_context.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_CONTEXT_H_ -#define MLX5HWS_CONTEXT_H_ +#ifndef HWS_CONTEXT_H_ +#define HWS_CONTEXT_H_ enum mlx5hws_context_flags { MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT = 1 << 0, @@ -61,4 +61,4 @@ bool mlx5hws_context_cap_dynamic_reparse(struct mlx5hws_context *ctx); u8 mlx5hws_context_get_reparse_mode(struct mlx5hws_context *ctx); -#endif /* MLX5HWS_CONTEXT_H_ */ +#endif /* HWS_CONTEXT_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/debug.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/debug.c index 2b8c5a4e1c4c0..5b200b4bc1a87 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/debug.c @@ -5,7 +5,7 @@ #include #include #include -#include "mlx5hws_internal.h" +#include "internal.h" static int hws_debug_dump_matcher_template_definer(struct seq_file *f, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/debug.h similarity index 93% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/debug.h index b93a536035d96..e44e7ae28f93e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_debug.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/debug.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_DEBUG_H_ -#define MLX5HWS_DEBUG_H_ +#ifndef HWS_DEBUG_H_ +#define HWS_DEBUG_H_ #define HWS_DEBUG_FORMAT_VERSION "1.0" @@ -37,4 +37,4 @@ mlx5hws_debug_icm_to_idx(u64 icm_addr) void mlx5hws_debug_init_dump(struct mlx5hws_context *ctx); void mlx5hws_debug_uninit_dump(struct mlx5hws_context *ctx); -#endif /* MLX5HWS_DEBUG_H_ */ +#endif /* HWS_DEBUG_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.c index 3bdb5c90efffa..b919278a76a15 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" /* Pattern tunnel Layer bits. */ #define MLX5_FLOW_LAYER_VXLAN BIT(12) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.h similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.h index 2f6a7df4021c8..9432d5084def3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_definer.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_DEFINER_H_ -#define MLX5HWS_DEFINER_H_ +#ifndef HWS_DEFINER_H_ +#define HWS_DEFINER_H_ /* Max available selecotrs */ #define DW_SELECTORS 9 @@ -831,4 +831,4 @@ mlx5hws_definer_conv_match_params_to_compressed_fc(struct mlx5hws_context *ctx, u32 *match_param, int *fc_sz); -#endif /* MLX5HWS_DEFINER_H_ */ +#endif /* HWS_DEFINER_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/internal.h similarity index 67% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/internal.h index 5643be1cd5bf3..3c8635f286cef 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_internal.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/internal.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_INTERNAL_H_ -#define MLX5HWS_INTERNAL_H_ +#ifndef HWS_INTERNAL_H_ +#define HWS_INTERNAL_H_ #include #include @@ -10,22 +10,22 @@ #include "wq.h" #include "lib/mlx5.h" -#include "mlx5hws_prm.h" +#include "prm.h" #include "mlx5hws.h" -#include "mlx5hws_pool.h" -#include "mlx5hws_vport.h" -#include "mlx5hws_context.h" -#include "mlx5hws_table.h" -#include "mlx5hws_send.h" -#include "mlx5hws_rule.h" -#include "mlx5hws_cmd.h" -#include "mlx5hws_action.h" -#include "mlx5hws_definer.h" -#include "mlx5hws_matcher.h" -#include "mlx5hws_debug.h" -#include "mlx5hws_pat_arg.h" -#include "mlx5hws_bwc.h" -#include "mlx5hws_bwc_complex.h" +#include "pool.h" +#include "vport.h" +#include "context.h" +#include "table.h" +#include "send.h" +#include "rule.h" +#include "cmd.h" +#include "action.h" +#include "definer.h" +#include "matcher.h" +#include "debug.h" +#include "pat_arg.h" +#include "bwc.h" +#include "bwc_complex.h" #define W_SIZE 2 #define DW_SIZE 4 @@ -56,4 +56,4 @@ static inline unsigned long align(unsigned long val, unsigned long align) return (val + align - 1) & ~(align - 1); } -#endif /* MLX5HWS_INTERNAL_H_ */ +#endif /* HWS_INTERNAL_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/matcher.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/matcher.c index 1964261415aa9..25b3badd76d60 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/matcher.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" enum mlx5hws_matcher_rtc_type { HWS_MATCHER_RTC_TYPE_MATCH, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/matcher.h similarity index 96% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/matcher.h index 125391d1a1148..81ff487f57be4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_matcher.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/matcher.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#ifndef MLX5HWS_MATCHER_H_ -#define MLX5HWS_MATCHER_H_ +#ifndef HWS_MATCHER_H_ +#define HWS_MATCHER_H_ /* We calculated that concatenating a collision table to the main table with * 3% of the main table rows will be enough resources for high insertion @@ -104,4 +104,4 @@ static inline bool mlx5hws_matcher_is_insert_by_idx(struct mlx5hws_matcher *matc return matcher->attr.insert_mode == MLX5HWS_MATCHER_INSERT_BY_INDEX; } -#endif /* MLX5HWS_MATCHER_H_ */ +#endif /* HWS_MATCHER_H_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pat_arg.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pat_arg.c index e084a5cbf81fd..06db5e4726aee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pat_arg.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" enum mlx5hws_arg_chunk_size mlx5hws_arg_data_size_to_arg_log_size(u16 data_size) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pat_arg.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pat_arg.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pat_arg.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pool.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pool.c index a8a63e3278be3..fed2d913f3b89 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pool.c @@ -1,8 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" -#include "mlx5hws_buddy.h" +#include "internal.h" +#include "buddy.h" static void hws_pool_free_one_resource(struct mlx5hws_pool_resource *resource) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pool.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_pool.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/pool.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_prm.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/prm.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_prm.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/prm.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/rule.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/rule.c index c79ee70edf03e..a08ce52d2fa19 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/rule.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" static void hws_rule_skip(struct mlx5hws_matcher *matcher, struct mlx5hws_match_template *mt, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/rule.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_rule.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/rule.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c index fb97a15c041a1..40cda54154f92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" #include "lib/clock.h" enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_send.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/table.c similarity index 99% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/table.c index 9dbc3e9da5ea9..8a22e2b8dfdc5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/table.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" u32 mlx5hws_table_get_id(struct mlx5hws_table *tbl) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/table.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_table.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/table.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/vport.c similarity index 98% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/vport.c index faf42421c43fd..d8e382b9fa61e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/vport.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2024 NVIDIA Corporation & Affiliates */ -#include "mlx5hws_internal.h" +#include "internal.h" int mlx5hws_vport_init_vports(struct mlx5hws_context *ctx) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/vport.h similarity index 100% rename from drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws_vport.h rename to drivers/net/ethernet/mellanox/mlx5/core/steering/hws/vport.h From 622d5245b14925d9600729089ac103f39b4f14b3 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 19 Dec 2024 19:58:36 +0200 Subject: [PATCH 104/548] net/mlx5: HWS, no need to expose mlx5hws_send_queues_open/close No need to have mlx5hws_send_queues_open/close in header. Make them static and remove from header. Signed-off-by: Yevgeny Kliteynik Reviewed-by: Itamar Gozlan Reviewed-by: Mark Bloch Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241219175841.1094544-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 9a0155a709fadaab468a24abca7996c5fdf0507b) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/steering/hws/send.c | 8 ++++---- .../net/ethernet/mellanox/mlx5/core/steering/hws/send.h | 6 ------ 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c index 40cda54154f92..a43cb065bb0cd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c @@ -890,15 +890,15 @@ static int mlx5hws_send_ring_open(struct mlx5hws_context *ctx, return err; } -void mlx5hws_send_queue_close(struct mlx5hws_send_engine *queue) +static void mlx5hws_send_queue_close(struct mlx5hws_send_engine *queue) { hws_send_ring_close(queue); kfree(queue->completed.entries); } -int mlx5hws_send_queue_open(struct mlx5hws_context *ctx, - struct mlx5hws_send_engine *queue, - u16 queue_size) +static int mlx5hws_send_queue_open(struct mlx5hws_context *ctx, + struct mlx5hws_send_engine *queue, + u16 queue_size) { int err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.h index b50825d6dc53d..f833092235c19 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.h @@ -189,12 +189,6 @@ void mlx5hws_send_abort_new_dep_wqe(struct mlx5hws_send_engine *queue); void mlx5hws_send_all_dep_wqe(struct mlx5hws_send_engine *queue); -void mlx5hws_send_queue_close(struct mlx5hws_send_engine *queue); - -int mlx5hws_send_queue_open(struct mlx5hws_context *ctx, - struct mlx5hws_send_engine *queue, - u16 queue_size); - void mlx5hws_send_queues_close(struct mlx5hws_context *ctx); int mlx5hws_send_queues_open(struct mlx5hws_context *ctx, From 280db3f0767ef8256f8d7d10da53b4bf006baf02 Mon Sep 17 00:00:00 2001 From: Yevgeny Kliteynik Date: Thu, 19 Dec 2024 19:58:37 +0200 Subject: [PATCH 105/548] net/mlx5: HWS, do not initialize native API queues HWS has two types of APIs: - Native: fastest and slimmest, async API. The user of this API is required to manage rule handles memory, and to poll for completion for each rule. - BWC: backward compatible API, similar semantics to SWS API. This layer is implemented above native API and it does all the work for the user, so that it is easy to switch between SWS and HWS. Right now the existing users of HWS require only BWC API. Therefore, in order to not waste resources, this patch disables send queues allocation for native API. If in the future support for faster HWS rule insertion will be required (such as for Connection Tracking), native queues can be enabled. Signed-off-by: Yevgeny Kliteynik Reviewed-by: Itamar Gozlan Reviewed-by: Mark Bloch Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241219175841.1094544-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 429776b6019bbdcf04dcd49706fe7de6a280078b) Signed-off-by: Koba Ko --- .../ethernet/mellanox/mlx5/core/steering/hws/bwc.h | 6 ++++-- .../mellanox/mlx5/core/steering/hws/context.c | 6 ++++-- .../mellanox/mlx5/core/steering/hws/context.h | 6 ++++++ .../mellanox/mlx5/core/steering/hws/mlx5hws.h | 1 - .../ethernet/mellanox/mlx5/core/steering/hws/send.c | 13 +++++++++++-- 5 files changed, 25 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h index 0b745968e21e1..3d4965213b012 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.h @@ -60,9 +60,11 @@ void mlx5hws_bwc_rule_fill_attr(struct mlx5hws_bwc_matcher *bwc_matcher, static inline u16 mlx5hws_bwc_queues(struct mlx5hws_context *ctx) { /* Besides the control queue, half of the queues are - * reguler HWS queues, and the other half are BWC queues. + * regular HWS queues, and the other half are BWC queues. */ - return (ctx->queues - 1) / 2; + if (mlx5hws_context_bwc_supported(ctx)) + return (ctx->queues - 1) / 2; + return 0; } static inline u16 mlx5hws_bwc_get_queue_id(struct mlx5hws_context *ctx, u16 idx) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c index fd48b05e91e05..4a8928f33bb99 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.c @@ -161,8 +161,10 @@ static int hws_context_init_hws(struct mlx5hws_context *ctx, if (ret) goto uninit_pd; - if (attr->bwc) - ctx->flags |= MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT; + /* Context has support for backward compatible API, + * and does not have support for native HWS API. + */ + ctx->flags |= MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT; ret = mlx5hws_send_queues_open(ctx, attr->queues, attr->queue_size); if (ret) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h index 3cc8770b1a70d..a243e13a2bac6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/context.h @@ -8,6 +8,7 @@ enum mlx5hws_context_flags { MLX5HWS_CONTEXT_FLAG_HWS_SUPPORT = 1 << 0, MLX5HWS_CONTEXT_FLAG_PRIVATE_PD = 1 << 1, MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT = 1 << 2, + MLX5HWS_CONTEXT_FLAG_NATIVE_SUPPORT = 1 << 3, }; enum mlx5hws_context_shared_stc_type { @@ -57,6 +58,11 @@ static inline bool mlx5hws_context_bwc_supported(struct mlx5hws_context *ctx) return ctx->flags & MLX5HWS_CONTEXT_FLAG_BWC_SUPPORT; } +static inline bool mlx5hws_context_native_supported(struct mlx5hws_context *ctx) +{ + return ctx->flags & MLX5HWS_CONTEXT_FLAG_NATIVE_SUPPORT; +} + bool mlx5hws_context_cap_dynamic_reparse(struct mlx5hws_context *ctx); u8 mlx5hws_context_get_reparse_mode(struct mlx5hws_context *ctx); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h index 9a85d4e12f772..ed6011522f130 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/mlx5hws.h @@ -70,7 +70,6 @@ enum mlx5hws_send_queue_actions { struct mlx5hws_context_attr { u16 queues; u16 queue_size; - bool bwc; /* add support for backward compatible API*/ }; struct mlx5hws_table_attr { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c index a43cb065bb0cd..9dacfec340b0a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c @@ -892,6 +892,9 @@ static int mlx5hws_send_ring_open(struct mlx5hws_context *ctx, static void mlx5hws_send_queue_close(struct mlx5hws_send_engine *queue) { + if (!queue->num_entries) + return; /* this queue wasn't initialized */ + hws_send_ring_close(queue); kfree(queue->completed.entries); } @@ -982,7 +985,7 @@ int mlx5hws_send_queues_open(struct mlx5hws_context *ctx, u16 queue_size) { int err = 0; - u32 i; + int i = 0; /* Open one extra queue for control path */ ctx->queues = queues + 1; @@ -998,7 +1001,13 @@ int mlx5hws_send_queues_open(struct mlx5hws_context *ctx, goto free_bwc_locks; } - for (i = 0; i < ctx->queues; i++) { + /* If native API isn't supported, skip the unused native queues: + * initialize BWC queues and control queue only. + */ + if (!mlx5hws_context_native_supported(ctx)) + i = mlx5hws_bwc_get_queue_id(ctx, 0); + + for (; i < ctx->queues; i++) { err = mlx5hws_send_queue_open(ctx, &ctx->send_queue[i], queue_size); if (err) goto close_send_queues; From 1b222654ff0f430d7b2a3fb74342ffc4662336d1 Mon Sep 17 00:00:00 2001 From: Itamar Gozlan Date: Thu, 19 Dec 2024 19:58:38 +0200 Subject: [PATCH 106/548] net/mlx5: DR, expand SWS STE callbacks and consolidate common structs Expand SWS STE callbacks to support ConnectX-8 hardware. Move common enums and structures to a shared header file. Signed-off-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241219175841.1094544-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit aa90a30804a563763eb78f00f56f759b72b91cb0) Signed-off-by: Koba Ko --- .../mellanox/mlx5/core/steering/sws/dr_ste.c | 4 +- .../mellanox/mlx5/core/steering/sws/dr_ste.h | 18 +- .../mlx5/core/steering/sws/dr_ste_v0.c | 6 +- .../mlx5/core/steering/sws/dr_ste_v1.c | 207 ++++-------------- .../mlx5/core/steering/sws/dr_ste_v1.h | 147 ++++++++++++- .../mlx5/core/steering/sws/dr_ste_v2.c | 169 +------------- .../mlx5/core/steering/sws/dr_ste_v2.h | 168 ++++++++++++++ 7 files changed, 377 insertions(+), 342 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c index e94fbb015efad..01ba8eae2983d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c @@ -555,7 +555,7 @@ void mlx5dr_ste_set_actions_tx(struct mlx5dr_ste_ctx *ste_ctx, struct mlx5dr_ste_actions_attr *attr, u32 *added_stes) { - ste_ctx->set_actions_tx(dmn, action_type_set, ste_ctx->actions_caps, + ste_ctx->set_actions_tx(ste_ctx, dmn, action_type_set, ste_ctx->actions_caps, hw_ste_arr, attr, added_stes); } @@ -566,7 +566,7 @@ void mlx5dr_ste_set_actions_rx(struct mlx5dr_ste_ctx *ste_ctx, struct mlx5dr_ste_actions_attr *attr, u32 *added_stes) { - ste_ctx->set_actions_rx(dmn, action_type_set, ste_ctx->actions_caps, + ste_ctx->set_actions_rx(ste_ctx, dmn, action_type_set, ste_ctx->actions_caps, hw_ste_arr, attr, added_stes); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h index 54a6619c3ecbf..b6ec8d30d9903 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h @@ -160,13 +160,15 @@ struct mlx5dr_ste_ctx { /* Actions */ u32 actions_caps; - void (*set_actions_rx)(struct mlx5dr_domain *dmn, + void (*set_actions_rx)(struct mlx5dr_ste_ctx *ste_ctx, + struct mlx5dr_domain *dmn, u8 *action_type_set, u32 actions_caps, u8 *hw_ste_arr, struct mlx5dr_ste_actions_attr *attr, u32 *added_stes); - void (*set_actions_tx)(struct mlx5dr_domain *dmn, + void (*set_actions_tx)(struct mlx5dr_ste_ctx *ste_ctx, + struct mlx5dr_domain *dmn, u8 *action_type_set, u32 actions_caps, u8 *hw_ste_arr, @@ -197,7 +199,17 @@ struct mlx5dr_ste_ctx { u16 *used_hw_action_num); int (*alloc_modify_hdr_chunk)(struct mlx5dr_action *action); void (*dealloc_modify_hdr_chunk)(struct mlx5dr_action *action); - + /* Actions bit set */ + void (*set_encap)(u8 *hw_ste_p, u8 *d_action, + u32 reformat_id, int size); + void (*set_push_vlan)(u8 *ste, u8 *d_action, + u32 vlan_hdr); + void (*set_pop_vlan)(u8 *hw_ste_p, u8 *s_action, + u8 vlans_num); + void (*set_rx_decap)(u8 *hw_ste_p, u8 *s_action); + void (*set_encap_l3)(u8 *hw_ste_p, u8 *frst_s_action, + u8 *scnd_d_action, u32 reformat_id, + int size); /* Send */ void (*prepare_for_postsend)(u8 *hw_ste_p, u32 ste_size); }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v0.c index e9f6c7ed7a7be..42536bee55e28 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v0.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v0.c @@ -406,7 +406,8 @@ static void dr_ste_v0_arr_init_next(u8 **last_ste, } static void -dr_ste_v0_set_actions_tx(struct mlx5dr_domain *dmn, +dr_ste_v0_set_actions_tx(struct mlx5dr_ste_ctx *ste_ctx, + struct mlx5dr_domain *dmn, u8 *action_type_set, u32 actions_caps, u8 *last_ste, @@ -476,7 +477,8 @@ dr_ste_v0_set_actions_tx(struct mlx5dr_domain *dmn, } static void -dr_ste_v0_set_actions_rx(struct mlx5dr_domain *dmn, +dr_ste_v0_set_actions_rx(struct mlx5dr_ste_ctx *ste_ctx, + struct mlx5dr_domain *dmn, u8 *action_type_set, u32 actions_caps, u8 *last_ste, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.c index 1d49704b95427..7f83d77c43ef0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.c @@ -5,136 +5,6 @@ #include "mlx5_ifc_dr_ste_v1.h" #include "dr_ste_v1.h" -#define DR_STE_CALC_DFNR_TYPE(lookup_type, inner) \ - ((inner) ? DR_STE_V1_LU_TYPE_##lookup_type##_I : \ - DR_STE_V1_LU_TYPE_##lookup_type##_O) - -enum dr_ste_v1_entry_format { - DR_STE_V1_TYPE_BWC_BYTE = 0x0, - DR_STE_V1_TYPE_BWC_DW = 0x1, - DR_STE_V1_TYPE_MATCH = 0x2, - DR_STE_V1_TYPE_MATCH_RANGES = 0x7, -}; - -/* Lookup type is built from 2B: [ Definer mode 1B ][ Definer index 1B ] */ -enum { - DR_STE_V1_LU_TYPE_NOP = 0x0000, - DR_STE_V1_LU_TYPE_ETHL2_TNL = 0x0002, - DR_STE_V1_LU_TYPE_IBL3_EXT = 0x0102, - DR_STE_V1_LU_TYPE_ETHL2_O = 0x0003, - DR_STE_V1_LU_TYPE_IBL4 = 0x0103, - DR_STE_V1_LU_TYPE_ETHL2_I = 0x0004, - DR_STE_V1_LU_TYPE_SRC_QP_GVMI = 0x0104, - DR_STE_V1_LU_TYPE_ETHL2_SRC_O = 0x0005, - DR_STE_V1_LU_TYPE_ETHL2_HEADERS_O = 0x0105, - DR_STE_V1_LU_TYPE_ETHL2_SRC_I = 0x0006, - DR_STE_V1_LU_TYPE_ETHL2_HEADERS_I = 0x0106, - DR_STE_V1_LU_TYPE_ETHL3_IPV4_5_TUPLE_O = 0x0007, - DR_STE_V1_LU_TYPE_IPV6_DES_O = 0x0107, - DR_STE_V1_LU_TYPE_ETHL3_IPV4_5_TUPLE_I = 0x0008, - DR_STE_V1_LU_TYPE_IPV6_DES_I = 0x0108, - DR_STE_V1_LU_TYPE_ETHL4_O = 0x0009, - DR_STE_V1_LU_TYPE_IPV6_SRC_O = 0x0109, - DR_STE_V1_LU_TYPE_ETHL4_I = 0x000a, - DR_STE_V1_LU_TYPE_IPV6_SRC_I = 0x010a, - DR_STE_V1_LU_TYPE_ETHL2_SRC_DST_O = 0x000b, - DR_STE_V1_LU_TYPE_MPLS_O = 0x010b, - DR_STE_V1_LU_TYPE_ETHL2_SRC_DST_I = 0x000c, - DR_STE_V1_LU_TYPE_MPLS_I = 0x010c, - DR_STE_V1_LU_TYPE_ETHL3_IPV4_MISC_O = 0x000d, - DR_STE_V1_LU_TYPE_GRE = 0x010d, - DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER = 0x000e, - DR_STE_V1_LU_TYPE_GENERAL_PURPOSE = 0x010e, - DR_STE_V1_LU_TYPE_ETHL3_IPV4_MISC_I = 0x000f, - DR_STE_V1_LU_TYPE_STEERING_REGISTERS_0 = 0x010f, - DR_STE_V1_LU_TYPE_STEERING_REGISTERS_1 = 0x0110, - DR_STE_V1_LU_TYPE_FLEX_PARSER_OK = 0x0011, - DR_STE_V1_LU_TYPE_FLEX_PARSER_0 = 0x0111, - DR_STE_V1_LU_TYPE_FLEX_PARSER_1 = 0x0112, - DR_STE_V1_LU_TYPE_ETHL4_MISC_O = 0x0113, - DR_STE_V1_LU_TYPE_ETHL4_MISC_I = 0x0114, - DR_STE_V1_LU_TYPE_INVALID = 0x00ff, - DR_STE_V1_LU_TYPE_DONT_CARE = MLX5DR_STE_LU_TYPE_DONT_CARE, -}; - -enum dr_ste_v1_header_anchors { - DR_STE_HEADER_ANCHOR_START_OUTER = 0x00, - DR_STE_HEADER_ANCHOR_1ST_VLAN = 0x02, - DR_STE_HEADER_ANCHOR_IPV6_IPV4 = 0x07, - DR_STE_HEADER_ANCHOR_INNER_MAC = 0x13, - DR_STE_HEADER_ANCHOR_INNER_IPV6_IPV4 = 0x19, -}; - -enum dr_ste_v1_action_size { - DR_STE_ACTION_SINGLE_SZ = 4, - DR_STE_ACTION_DOUBLE_SZ = 8, - DR_STE_ACTION_TRIPLE_SZ = 12, -}; - -enum dr_ste_v1_action_insert_ptr_attr { - DR_STE_V1_ACTION_INSERT_PTR_ATTR_NONE = 0, /* Regular push header (e.g. push vlan) */ - DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP = 1, /* Encapsulation / Tunneling */ - DR_STE_V1_ACTION_INSERT_PTR_ATTR_ESP = 2, /* IPsec */ -}; - -enum dr_ste_v1_action_id { - DR_STE_V1_ACTION_ID_NOP = 0x00, - DR_STE_V1_ACTION_ID_COPY = 0x05, - DR_STE_V1_ACTION_ID_SET = 0x06, - DR_STE_V1_ACTION_ID_ADD = 0x07, - DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE = 0x08, - DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER = 0x09, - DR_STE_V1_ACTION_ID_INSERT_INLINE = 0x0a, - DR_STE_V1_ACTION_ID_INSERT_POINTER = 0x0b, - DR_STE_V1_ACTION_ID_FLOW_TAG = 0x0c, - DR_STE_V1_ACTION_ID_QUEUE_ID_SEL = 0x0d, - DR_STE_V1_ACTION_ID_ACCELERATED_LIST = 0x0e, - DR_STE_V1_ACTION_ID_MODIFY_LIST = 0x0f, - DR_STE_V1_ACTION_ID_ASO = 0x12, - DR_STE_V1_ACTION_ID_TRAILER = 0x13, - DR_STE_V1_ACTION_ID_COUNTER_ID = 0x14, - DR_STE_V1_ACTION_ID_MAX = 0x21, - /* use for special cases */ - DR_STE_V1_ACTION_ID_SPECIAL_ENCAP_L3 = 0x22, -}; - -enum { - DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_0 = 0x00, - DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_1 = 0x01, - DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_2 = 0x02, - DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_0 = 0x08, - DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_1 = 0x09, - DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0 = 0x0e, - DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0 = 0x18, - DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_1 = 0x19, - DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_0 = 0x40, - DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_1 = 0x41, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_0 = 0x44, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_1 = 0x45, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_2 = 0x46, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_3 = 0x47, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_0 = 0x4c, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_1 = 0x4d, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_2 = 0x4e, - DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f, - DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e, - DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f, - DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_0 = 0x6f, - DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_1 = 0x70, - DR_STE_V1_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b, - DR_STE_V1_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c, - DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_0 = 0x8c, - DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_1 = 0x8d, - DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_0 = 0x8e, - DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_1 = 0x8f, - DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_0 = 0x90, - DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_1 = 0x91, -}; - -enum dr_ste_v1_aso_ctx_type { - DR_STE_V1_ASO_CTX_TYPE_POLICERS = 0x2, -}; - static const struct mlx5dr_ste_action_modify_field dr_ste_v1_action_modify_field_arr[] = { [MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_0, .start = 0, .end = 31, @@ -379,13 +249,12 @@ static void dr_ste_v1_set_counter_id(u8 *hw_ste_p, u32 ctr_id) MLX5_SET(ste_match_bwc_v1, hw_ste_p, counter_id, ctr_id); } -static void dr_ste_v1_set_reparse(u8 *hw_ste_p) +void dr_ste_v1_set_reparse(u8 *hw_ste_p) { MLX5_SET(ste_match_bwc_v1, hw_ste_p, reparse, 1); } -static void dr_ste_v1_set_encap(u8 *hw_ste_p, u8 *d_action, - u32 reformat_id, int size) +void dr_ste_v1_set_encap(u8 *hw_ste_p, u8 *d_action, u32 reformat_id, int size) { MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER); @@ -432,8 +301,7 @@ static void dr_ste_v1_set_remove_hdr(u8 *hw_ste_p, u8 *s_action, dr_ste_v1_set_reparse(hw_ste_p); } -static void dr_ste_v1_set_push_vlan(u8 *hw_ste_p, u8 *d_action, - u32 vlan_hdr) +void dr_ste_v1_set_push_vlan(u8 *hw_ste_p, u8 *d_action, u32 vlan_hdr) { MLX5_SET(ste_double_action_insert_with_inline_v1, d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_INLINE); @@ -446,7 +314,7 @@ static void dr_ste_v1_set_push_vlan(u8 *hw_ste_p, u8 *d_action, dr_ste_v1_set_reparse(hw_ste_p); } -static void dr_ste_v1_set_pop_vlan(u8 *hw_ste_p, u8 *s_action, u8 vlans_num) +void dr_ste_v1_set_pop_vlan(u8 *hw_ste_p, u8 *s_action, u8 vlans_num) { MLX5_SET(ste_single_action_remove_header_size_v1, s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE); @@ -459,11 +327,8 @@ static void dr_ste_v1_set_pop_vlan(u8 *hw_ste_p, u8 *s_action, u8 vlans_num) dr_ste_v1_set_reparse(hw_ste_p); } -static void dr_ste_v1_set_encap_l3(u8 *hw_ste_p, - u8 *frst_s_action, - u8 *scnd_d_action, - u32 reformat_id, - int size) +void dr_ste_v1_set_encap_l3(u8 *hw_ste_p, u8 *frst_s_action, u8 *scnd_d_action, + u32 reformat_id, int size) { /* Remove L2 headers */ MLX5_SET(ste_single_action_remove_header_v1, frst_s_action, action_id, @@ -483,7 +348,7 @@ static void dr_ste_v1_set_encap_l3(u8 *hw_ste_p, dr_ste_v1_set_reparse(hw_ste_p); } -static void dr_ste_v1_set_rx_decap(u8 *hw_ste_p, u8 *s_action) +void dr_ste_v1_set_rx_decap(u8 *hw_ste_p, u8 *s_action) { MLX5_SET(ste_single_action_remove_header_v1, s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); @@ -620,7 +485,8 @@ static void dr_ste_v1_arr_init_next_match_range(u8 **last_ste, dr_ste_v1_set_entry_type(*last_ste, DR_STE_V1_TYPE_MATCH_RANGES); } -void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, +void dr_ste_v1_set_actions_tx(struct mlx5dr_ste_ctx *ste_ctx, + struct mlx5dr_domain *dmn, u8 *action_type_set, u32 actions_caps, u8 *last_ste, @@ -640,7 +506,7 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; } - dr_ste_v1_set_pop_vlan(last_ste, action, attr->vlans.count); + ste_ctx->set_pop_vlan(last_ste, action, attr->vlans.count); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; @@ -677,8 +543,8 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_encap = true; } - dr_ste_v1_set_push_vlan(last_ste, action, - attr->vlans.headers[i]); + ste_ctx->set_push_vlan(last_ste, action, + attr->vlans.headers[i]); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } @@ -691,9 +557,9 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_encap = true; } - dr_ste_v1_set_encap(last_ste, action, - attr->reformat.id, - attr->reformat.size); + ste_ctx->set_encap(last_ste, action, + attr->reformat.id, + attr->reformat.size); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) { @@ -706,10 +572,10 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, } d_action = action + DR_STE_ACTION_SINGLE_SZ; - dr_ste_v1_set_encap_l3(last_ste, - action, d_action, - attr->reformat.id, - attr->reformat.size); + ste_ctx->set_encap_l3(last_ste, + action, d_action, + attr->reformat.id, + attr->reformat.size); action_sz -= DR_STE_ACTION_TRIPLE_SZ; action += DR_STE_ACTION_TRIPLE_SZ; } else if (action_type_set[DR_ACTION_TYP_INSERT_HDR]) { @@ -776,7 +642,8 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, dr_ste_v1_set_hit_addr(last_ste, attr->final_icm_addr, 1); } -void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, +void dr_ste_v1_set_actions_rx(struct mlx5dr_ste_ctx *ste_ctx, + struct mlx5dr_domain *dmn, u8 *action_type_set, u32 actions_caps, u8 *last_ste, @@ -799,7 +666,7 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, allow_modify_hdr = false; allow_ctr = false; } else if (action_type_set[DR_ACTION_TYP_TNL_L2_TO_L2]) { - dr_ste_v1_set_rx_decap(last_ste, action); + ste_ctx->set_rx_decap(last_ste, action); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; allow_modify_hdr = false; @@ -827,7 +694,7 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, action_sz = DR_STE_ACTION_TRIPLE_SZ; } - dr_ste_v1_set_pop_vlan(last_ste, action, attr->vlans.count); + ste_ctx->set_pop_vlan(last_ste, action, attr->vlans.count); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; allow_ctr = false; @@ -868,8 +735,8 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; } - dr_ste_v1_set_push_vlan(last_ste, action, - attr->vlans.headers[i]); + ste_ctx->set_push_vlan(last_ste, action, + attr->vlans.headers[i]); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } @@ -895,9 +762,9 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; } - dr_ste_v1_set_encap(last_ste, action, - attr->reformat.id, - attr->reformat.size); + ste_ctx->set_encap(last_ste, action, + attr->reformat.id, + attr->reformat.size); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; allow_modify_hdr = false; @@ -912,10 +779,10 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, d_action = action + DR_STE_ACTION_SINGLE_SZ; - dr_ste_v1_set_encap_l3(last_ste, - action, d_action, - attr->reformat.id, - attr->reformat.size); + ste_ctx->set_encap_l3(last_ste, + action, d_action, + attr->reformat.id, + attr->reformat.size); action_sz -= DR_STE_ACTION_TRIPLE_SZ; allow_modify_hdr = false; } else if (action_type_set[DR_ACTION_TYP_INSERT_HDR]) { @@ -1027,9 +894,6 @@ void dr_ste_v1_set_action_copy(u8 *d_action, MLX5_SET(ste_double_action_copy_v1, d_action, source_right_shifter, src_shifter); } -#define DR_STE_DECAP_L3_ACTION_NUM 8 -#define DR_STE_L2_HDR_MAX_SZ 20 - int dr_ste_v1_set_action_decap_l3_list(void *data, u32 data_sz, u8 *hw_action, @@ -2330,7 +2194,12 @@ static struct mlx5dr_ste_ctx ste_ctx_v1 = { .set_action_decap_l3_list = &dr_ste_v1_set_action_decap_l3_list, .alloc_modify_hdr_chunk = &dr_ste_v1_alloc_modify_hdr_ptrn_arg, .dealloc_modify_hdr_chunk = &dr_ste_v1_free_modify_hdr_ptrn_arg, - + /* Actions bit set */ + .set_encap = &dr_ste_v1_set_encap, + .set_push_vlan = &dr_ste_v1_set_push_vlan, + .set_pop_vlan = &dr_ste_v1_set_pop_vlan, + .set_rx_decap = &dr_ste_v1_set_rx_decap, + .set_encap_l3 = &dr_ste_v1_set_encap_l3, /* Send */ .prepare_for_postsend = &dr_ste_v1_prepare_for_postsend, }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.h index e2fc698670880..a8d9e308d3392 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v1.h @@ -7,6 +7,138 @@ #include "dr_types.h" #include "dr_ste.h" +#define DR_STE_DECAP_L3_ACTION_NUM 8 +#define DR_STE_L2_HDR_MAX_SZ 20 +#define DR_STE_CALC_DFNR_TYPE(lookup_type, inner) \ + ((inner) ? DR_STE_V1_LU_TYPE_##lookup_type##_I : \ + DR_STE_V1_LU_TYPE_##lookup_type##_O) + +enum dr_ste_v1_entry_format { + DR_STE_V1_TYPE_BWC_BYTE = 0x0, + DR_STE_V1_TYPE_BWC_DW = 0x1, + DR_STE_V1_TYPE_MATCH = 0x2, + DR_STE_V1_TYPE_MATCH_RANGES = 0x7, +}; + +/* Lookup type is built from 2B: [ Definer mode 1B ][ Definer index 1B ] */ +enum { + DR_STE_V1_LU_TYPE_NOP = 0x0000, + DR_STE_V1_LU_TYPE_ETHL2_TNL = 0x0002, + DR_STE_V1_LU_TYPE_IBL3_EXT = 0x0102, + DR_STE_V1_LU_TYPE_ETHL2_O = 0x0003, + DR_STE_V1_LU_TYPE_IBL4 = 0x0103, + DR_STE_V1_LU_TYPE_ETHL2_I = 0x0004, + DR_STE_V1_LU_TYPE_SRC_QP_GVMI = 0x0104, + DR_STE_V1_LU_TYPE_ETHL2_SRC_O = 0x0005, + DR_STE_V1_LU_TYPE_ETHL2_HEADERS_O = 0x0105, + DR_STE_V1_LU_TYPE_ETHL2_SRC_I = 0x0006, + DR_STE_V1_LU_TYPE_ETHL2_HEADERS_I = 0x0106, + DR_STE_V1_LU_TYPE_ETHL3_IPV4_5_TUPLE_O = 0x0007, + DR_STE_V1_LU_TYPE_IPV6_DES_O = 0x0107, + DR_STE_V1_LU_TYPE_ETHL3_IPV4_5_TUPLE_I = 0x0008, + DR_STE_V1_LU_TYPE_IPV6_DES_I = 0x0108, + DR_STE_V1_LU_TYPE_ETHL4_O = 0x0009, + DR_STE_V1_LU_TYPE_IPV6_SRC_O = 0x0109, + DR_STE_V1_LU_TYPE_ETHL4_I = 0x000a, + DR_STE_V1_LU_TYPE_IPV6_SRC_I = 0x010a, + DR_STE_V1_LU_TYPE_ETHL2_SRC_DST_O = 0x000b, + DR_STE_V1_LU_TYPE_MPLS_O = 0x010b, + DR_STE_V1_LU_TYPE_ETHL2_SRC_DST_I = 0x000c, + DR_STE_V1_LU_TYPE_MPLS_I = 0x010c, + DR_STE_V1_LU_TYPE_ETHL3_IPV4_MISC_O = 0x000d, + DR_STE_V1_LU_TYPE_GRE = 0x010d, + DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER = 0x000e, + DR_STE_V1_LU_TYPE_GENERAL_PURPOSE = 0x010e, + DR_STE_V1_LU_TYPE_ETHL3_IPV4_MISC_I = 0x000f, + DR_STE_V1_LU_TYPE_STEERING_REGISTERS_0 = 0x010f, + DR_STE_V1_LU_TYPE_STEERING_REGISTERS_1 = 0x0110, + DR_STE_V1_LU_TYPE_FLEX_PARSER_OK = 0x0011, + DR_STE_V1_LU_TYPE_FLEX_PARSER_0 = 0x0111, + DR_STE_V1_LU_TYPE_FLEX_PARSER_1 = 0x0112, + DR_STE_V1_LU_TYPE_ETHL4_MISC_O = 0x0113, + DR_STE_V1_LU_TYPE_ETHL4_MISC_I = 0x0114, + DR_STE_V1_LU_TYPE_INVALID = 0x00ff, + DR_STE_V1_LU_TYPE_DONT_CARE = MLX5DR_STE_LU_TYPE_DONT_CARE, +}; + +enum dr_ste_v1_header_anchors { + DR_STE_HEADER_ANCHOR_START_OUTER = 0x00, + DR_STE_HEADER_ANCHOR_1ST_VLAN = 0x02, + DR_STE_HEADER_ANCHOR_IPV6_IPV4 = 0x07, + DR_STE_HEADER_ANCHOR_INNER_MAC = 0x13, + DR_STE_HEADER_ANCHOR_INNER_IPV6_IPV4 = 0x19, +}; + +enum dr_ste_v1_action_size { + DR_STE_ACTION_SINGLE_SZ = 4, + DR_STE_ACTION_DOUBLE_SZ = 8, + DR_STE_ACTION_TRIPLE_SZ = 12, +}; + +enum dr_ste_v1_action_insert_ptr_attr { + DR_STE_V1_ACTION_INSERT_PTR_ATTR_NONE = 0, /* Regular push header (e.g. push vlan) */ + DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP = 1, /* Encapsulation / Tunneling */ + DR_STE_V1_ACTION_INSERT_PTR_ATTR_ESP = 2, /* IPsec */ +}; + +enum dr_ste_v1_action_id { + DR_STE_V1_ACTION_ID_NOP = 0x00, + DR_STE_V1_ACTION_ID_COPY = 0x05, + DR_STE_V1_ACTION_ID_SET = 0x06, + DR_STE_V1_ACTION_ID_ADD = 0x07, + DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE = 0x08, + DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER = 0x09, + DR_STE_V1_ACTION_ID_INSERT_INLINE = 0x0a, + DR_STE_V1_ACTION_ID_INSERT_POINTER = 0x0b, + DR_STE_V1_ACTION_ID_FLOW_TAG = 0x0c, + DR_STE_V1_ACTION_ID_QUEUE_ID_SEL = 0x0d, + DR_STE_V1_ACTION_ID_ACCELERATED_LIST = 0x0e, + DR_STE_V1_ACTION_ID_MODIFY_LIST = 0x0f, + DR_STE_V1_ACTION_ID_ASO = 0x12, + DR_STE_V1_ACTION_ID_TRAILER = 0x13, + DR_STE_V1_ACTION_ID_COUNTER_ID = 0x14, + DR_STE_V1_ACTION_ID_MAX = 0x21, + /* use for special cases */ + DR_STE_V1_ACTION_ID_SPECIAL_ENCAP_L3 = 0x22, +}; + +enum { + DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_0 = 0x00, + DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_1 = 0x01, + DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_2 = 0x02, + DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_0 = 0x08, + DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_1 = 0x09, + DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0 = 0x0e, + DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0 = 0x18, + DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_1 = 0x19, + DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_0 = 0x40, + DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_1 = 0x41, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_0 = 0x44, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_1 = 0x45, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_2 = 0x46, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_3 = 0x47, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_0 = 0x4c, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_1 = 0x4d, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_2 = 0x4e, + DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f, + DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e, + DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f, + DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_0 = 0x6f, + DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_1 = 0x70, + DR_STE_V1_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b, + DR_STE_V1_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c, + DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_0 = 0x8c, + DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_1 = 0x8d, + DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_0 = 0x8e, + DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_1 = 0x8f, + DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_0 = 0x90, + DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_1 = 0x91, +}; + +enum dr_ste_v1_aso_ctx_type { + DR_STE_V1_ASO_CTX_TYPE_POLICERS = 0x2, +}; + bool dr_ste_v1_is_miss_addr_set(u8 *hw_ste_p); void dr_ste_v1_set_miss_addr(u8 *hw_ste_p, u64 miss_addr); u64 dr_ste_v1_get_miss_addr(u8 *hw_ste_p); @@ -17,11 +149,18 @@ u16 dr_ste_v1_get_next_lu_type(u8 *hw_ste_p); void dr_ste_v1_set_hit_addr(u8 *hw_ste_p, u64 icm_addr, u32 ht_size); void dr_ste_v1_init(u8 *hw_ste_p, u16 lu_type, bool is_rx, u16 gvmi); void dr_ste_v1_prepare_for_postsend(u8 *hw_ste_p, u32 ste_size); -void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, u8 *action_type_set, - u32 actions_caps, u8 *last_ste, +void dr_ste_v1_set_reparse(u8 *hw_ste_p); +void dr_ste_v1_set_encap(u8 *hw_ste_p, u8 *d_action, u32 reformat_id, int size); +void dr_ste_v1_set_push_vlan(u8 *hw_ste_p, u8 *d_action, u32 vlan_hdr); +void dr_ste_v1_set_pop_vlan(u8 *hw_ste_p, u8 *s_action, u8 vlans_num); +void dr_ste_v1_set_encap_l3(u8 *hw_ste_p, u8 *frst_s_action, u8 *scnd_d_action, + u32 reformat_id, int size); +void dr_ste_v1_set_rx_decap(u8 *hw_ste_p, u8 *s_action); +void dr_ste_v1_set_actions_tx(struct mlx5dr_ste_ctx *ste_ctx, struct mlx5dr_domain *dmn, + u8 *action_type_set, u32 actions_caps, u8 *last_ste, struct mlx5dr_ste_actions_attr *attr, u32 *added_stes); -void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, u8 *action_type_set, - u32 actions_caps, u8 *last_ste, +void dr_ste_v1_set_actions_rx(struct mlx5dr_ste_ctx *ste_ctx, struct mlx5dr_domain *dmn, + u8 *action_type_set, u32 actions_caps, u8 *last_ste, struct mlx5dr_ste_actions_attr *attr, u32 *added_stes); void dr_ste_v1_set_action_set(u8 *d_action, u8 hw_field, u8 shifter, u8 length, u32 data); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.c index 808b013cf48cb..0882dba0f64b9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.c @@ -2,167 +2,7 @@ /* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ #include "dr_ste_v1.h" - -enum { - DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_0 = 0x00, - DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1 = 0x01, - DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_2 = 0x02, - DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_0 = 0x08, - DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_1 = 0x09, - DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0 = 0x0e, - DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0 = 0x18, - DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_1 = 0x19, - DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_0 = 0x40, - DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_1 = 0x41, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_0 = 0x44, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_1 = 0x45, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_2 = 0x46, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_3 = 0x47, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_0 = 0x4c, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_1 = 0x4d, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_2 = 0x4e, - DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f, - DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e, - DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f, - DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_0 = 0x6f, - DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_1 = 0x70, - DR_STE_V2_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b, - DR_STE_V2_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c, - DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_0 = 0x90, - DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_1 = 0x91, - DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_0 = 0x92, - DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_1 = 0x93, - DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_0 = 0x94, - DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_1 = 0x95, -}; - -static const struct mlx5dr_ste_action_modify_field dr_ste_v2_action_modify_field_arr[] = { - [MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_0, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_SMAC_15_0] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_1, .start = 16, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_ETHERTYPE] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1, .start = 0, .end = 15, - }, - [MLX5_ACTION_IN_FIELD_OUT_DMAC_47_16] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_0, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_DMAC_15_0] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1, .start = 16, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_IP_DSCP] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 18, .end = 23, - }, - [MLX5_ACTION_IN_FIELD_OUT_TCP_FLAGS] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_1, .start = 16, .end = 24, - .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, - }, - [MLX5_ACTION_IN_FIELD_OUT_TCP_SPORT] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, - .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, - }, - [MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, - .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, - }, - [MLX5_ACTION_IN_FIELD_OUT_IP_TTL] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, - }, - [MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_UDP_SPORT] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, - .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, - }, - [MLX5_ACTION_IN_FIELD_OUT_UDP_DPORT] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, - .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, - }, - [MLX5_ACTION_IN_FIELD_OUT_SIPV6_127_96] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_0, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_SIPV6_95_64] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_1, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_SIPV6_63_32] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_2, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_SIPV6_31_0] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_3, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_DIPV6_127_96] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_0, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_DIPV6_95_64] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_1, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_DIPV6_63_32] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_2, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_DIPV6_31_0] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_3, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, - }, - [MLX5_ACTION_IN_FIELD_OUT_SIPV4] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_0, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, - }, - [MLX5_ACTION_IN_FIELD_OUT_DIPV4] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_1, .start = 0, .end = 31, - .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_A] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_GNRL_PURPOSE, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_B] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_METADATA_2_CQE, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_C_0] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_0, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_C_1] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_1, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_C_2] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_0, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_C_3] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_1, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_C_4] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_0, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_METADATA_REG_C_5] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_1, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_0, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_1, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_2, .start = 0, .end = 15, - }, - [MLX5_ACTION_IN_FIELD_OUT_EMD_31_0] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_1, .start = 0, .end = 31, - }, - [MLX5_ACTION_IN_FIELD_OUT_EMD_47_32] = { - .hw_field = DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_0, .start = 0, .end = 15, - }, -}; +#include "dr_ste_v2.h" static struct mlx5dr_ste_ctx ste_ctx_v2 = { /* Builders */ @@ -223,7 +63,12 @@ static struct mlx5dr_ste_ctx ste_ctx_v2 = { .set_action_decap_l3_list = &dr_ste_v1_set_action_decap_l3_list, .alloc_modify_hdr_chunk = &dr_ste_v1_alloc_modify_hdr_ptrn_arg, .dealloc_modify_hdr_chunk = &dr_ste_v1_free_modify_hdr_ptrn_arg, - + /* Actions bit set */ + .set_encap = &dr_ste_v1_set_encap, + .set_push_vlan = &dr_ste_v1_set_push_vlan, + .set_pop_vlan = &dr_ste_v1_set_pop_vlan, + .set_rx_decap = &dr_ste_v1_set_rx_decap, + .set_encap_l3 = &dr_ste_v1_set_encap_l3, /* Send */ .prepare_for_postsend = &dr_ste_v1_prepare_for_postsend, }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.h new file mode 100644 index 0000000000000..d853fde49cfc6 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v2.h @@ -0,0 +1,168 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ + +#ifndef _DR_STE_V2_ +#define _DR_STE_V2_ + +enum { + DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_0 = 0x00, + DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1 = 0x01, + DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_2 = 0x02, + DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_0 = 0x08, + DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_1 = 0x09, + DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0 = 0x0e, + DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0 = 0x18, + DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_1 = 0x19, + DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_0 = 0x40, + DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_1 = 0x41, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_0 = 0x44, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_1 = 0x45, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_2 = 0x46, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_3 = 0x47, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_0 = 0x4c, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_1 = 0x4d, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_2 = 0x4e, + DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f, + DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e, + DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f, + DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_0 = 0x6f, + DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_1 = 0x70, + DR_STE_V2_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b, + DR_STE_V2_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c, + DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_0 = 0x90, + DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_1 = 0x91, + DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_0 = 0x92, + DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_1 = 0x93, + DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_0 = 0x94, + DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_1 = 0x95, +}; + +static const struct mlx5dr_ste_action_modify_field dr_ste_v2_action_modify_field_arr[] = { + [MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_0, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_SMAC_15_0] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_1, .start = 16, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_ETHERTYPE] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1, .start = 0, .end = 15, + }, + [MLX5_ACTION_IN_FIELD_OUT_DMAC_47_16] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_0, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_DMAC_15_0] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1, .start = 16, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_IP_DSCP] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 18, .end = 23, + }, + [MLX5_ACTION_IN_FIELD_OUT_TCP_FLAGS] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_1, .start = 16, .end = 24, + .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, + }, + [MLX5_ACTION_IN_FIELD_OUT_TCP_SPORT] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, + .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, + }, + [MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, + .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, + }, + [MLX5_ACTION_IN_FIELD_OUT_IP_TTL] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, + }, + [MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_UDP_SPORT] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, + .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, + }, + [MLX5_ACTION_IN_FIELD_OUT_UDP_DPORT] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, + .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, + }, + [MLX5_ACTION_IN_FIELD_OUT_SIPV6_127_96] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_0, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_SIPV6_95_64] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_1, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_SIPV6_63_32] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_2, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_SIPV6_31_0] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_3, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_DIPV6_127_96] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_0, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_DIPV6_95_64] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_1, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_DIPV6_63_32] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_2, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_DIPV6_31_0] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_3, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, + }, + [MLX5_ACTION_IN_FIELD_OUT_SIPV4] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_0, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, + }, + [MLX5_ACTION_IN_FIELD_OUT_DIPV4] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_1, .start = 0, .end = 31, + .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_A] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_GNRL_PURPOSE, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_B] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_METADATA_2_CQE, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_C_0] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_0, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_C_1] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_1, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_C_2] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_0, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_C_3] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_1, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_C_4] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_0, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_METADATA_REG_C_5] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_1, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_0, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_1, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_2, .start = 0, .end = 15, + }, + [MLX5_ACTION_IN_FIELD_OUT_EMD_31_0] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_1, .start = 0, .end = 31, + }, + [MLX5_ACTION_IN_FIELD_OUT_EMD_47_32] = { + .hw_field = DR_STE_V2_ACTION_MDFY_FLD_CFG_HDR_0_0, .start = 0, .end = 15, + }, +}; + +#endif /* _DR_STE_V2_ */ From a17edb845101286e5f4b213e6f85658883fa4524 Mon Sep 17 00:00:00 2001 From: Itamar Gozlan Date: Thu, 19 Dec 2024 19:58:39 +0200 Subject: [PATCH 107/548] net/mlx5: DR, add support for ConnectX-8 steering Add support for a new steering format version that is implemented by ConnectX-8. Except for several differences, the STEv3 is identical to STEv2, so for most callbacks STEv3 context struct will call STEv2 functions. Signed-off-by: Itamar Gozlan Signed-off-by: Yevgeny Kliteynik Signed-off-by: Tariq Toukan Link: https://patch.msgid.link/20241219175841.1094544-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski (cherry picked from commit 4d617b57574f8ac04c997bdf9127a4c703a5f1f0) Signed-off-by: Koba Ko --- .../net/ethernet/mellanox/mlx5/core/Makefile | 1 + .../mlx5/core/steering/sws/dr_domain.c | 2 +- .../mellanox/mlx5/core/steering/sws/dr_ste.c | 2 + .../mellanox/mlx5/core/steering/sws/dr_ste.h | 1 + .../mlx5/core/steering/sws/dr_ste_v3.c | 221 ++++++++++++++++++ .../mlx5/core/steering/sws/mlx5_ifc_dr.h | 40 ++++ .../mellanox/mlx5/core/steering/sws/mlx5dr.h | 2 +- 7 files changed, 267 insertions(+), 2 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v3.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index cad9a4226c1f3..6ade13530ffac 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -123,6 +123,7 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/sws/dr_domain.o \ steering/sws/dr_ste_v0.o \ steering/sws/dr_ste_v1.o \ steering/sws/dr_ste_v2.o \ + steering/sws/dr_ste_v3.o \ steering/sws/dr_cmd.o \ steering/sws/dr_fw.o \ steering/sws/dr_action.o \ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_domain.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_domain.c index 49f22cad92bfd..60cb4527588a5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_domain.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_domain.c @@ -8,7 +8,7 @@ #define DR_DOMAIN_SW_STEERING_SUPPORTED(dmn, dmn_type) \ ((dmn)->info.caps.dmn_type##_sw_owner || \ ((dmn)->info.caps.dmn_type##_sw_owner_v2 && \ - (dmn)->info.caps.sw_format_ver <= MLX5_STEERING_FORMAT_CONNECTX_7)) + (dmn)->info.caps.sw_format_ver <= MLX5_STEERING_FORMAT_CONNECTX_8)) bool mlx5dr_domain_is_support_ptrn_arg(struct mlx5dr_domain *dmn) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c index 01ba8eae2983d..c8b8ff80c7c70 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.c @@ -1458,6 +1458,8 @@ struct mlx5dr_ste_ctx *mlx5dr_ste_get_ctx(u8 version) return mlx5dr_ste_get_ctx_v1(); else if (version == MLX5_STEERING_FORMAT_CONNECTX_7) return mlx5dr_ste_get_ctx_v2(); + else if (version == MLX5_STEERING_FORMAT_CONNECTX_8) + return mlx5dr_ste_get_ctx_v3(); return NULL; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h index b6ec8d30d9903..5f409dc30aca8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste.h @@ -217,5 +217,6 @@ struct mlx5dr_ste_ctx { struct mlx5dr_ste_ctx *mlx5dr_ste_get_ctx_v0(void); struct mlx5dr_ste_ctx *mlx5dr_ste_get_ctx_v1(void); struct mlx5dr_ste_ctx *mlx5dr_ste_get_ctx_v2(void); +struct mlx5dr_ste_ctx *mlx5dr_ste_get_ctx_v3(void); #endif /* _DR_STE_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v3.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v3.c new file mode 100644 index 0000000000000..cc60ce1d274ef --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_ste_v3.c @@ -0,0 +1,221 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ + +#include "dr_ste_v1.h" +#include "dr_ste_v2.h" + +static void dr_ste_v3_set_encap(u8 *hw_ste_p, u8 *d_action, + u32 reformat_id, int size) +{ + MLX5_SET(ste_double_action_insert_with_ptr_v3, d_action, action_id, + DR_STE_V1_ACTION_ID_INSERT_POINTER); + /* The hardware expects here size in words (2 byte) */ + MLX5_SET(ste_double_action_insert_with_ptr_v3, d_action, size, size / 2); + MLX5_SET(ste_double_action_insert_with_ptr_v3, d_action, pointer, reformat_id); + MLX5_SET(ste_double_action_insert_with_ptr_v3, d_action, attributes, + DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP); + dr_ste_v1_set_reparse(hw_ste_p); +} + +static void dr_ste_v3_set_push_vlan(u8 *ste, u8 *d_action, + u32 vlan_hdr) +{ + MLX5_SET(ste_double_action_insert_with_inline_v3, d_action, action_id, + DR_STE_V1_ACTION_ID_INSERT_INLINE); + /* The hardware expects here offset to vlan header in words (2 byte) */ + MLX5_SET(ste_double_action_insert_with_inline_v3, d_action, start_offset, + HDR_LEN_L2_MACS >> 1); + MLX5_SET(ste_double_action_insert_with_inline_v3, d_action, inline_data, vlan_hdr); + dr_ste_v1_set_reparse(ste); +} + +static void dr_ste_v3_set_pop_vlan(u8 *hw_ste_p, u8 *s_action, + u8 vlans_num) +{ + MLX5_SET(ste_single_action_remove_header_size_v3, s_action, + action_id, DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE); + MLX5_SET(ste_single_action_remove_header_size_v3, s_action, + start_anchor, DR_STE_HEADER_ANCHOR_1ST_VLAN); + /* The hardware expects here size in words (2 byte) */ + MLX5_SET(ste_single_action_remove_header_size_v3, s_action, + remove_size, (HDR_LEN_L2_VLAN >> 1) * vlans_num); + + dr_ste_v1_set_reparse(hw_ste_p); +} + +static void dr_ste_v3_set_encap_l3(u8 *hw_ste_p, + u8 *frst_s_action, + u8 *scnd_d_action, + u32 reformat_id, + int size) +{ + /* Remove L2 headers */ + MLX5_SET(ste_single_action_remove_header_v3, frst_s_action, action_id, + DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); + MLX5_SET(ste_single_action_remove_header_v3, frst_s_action, end_anchor, + DR_STE_HEADER_ANCHOR_IPV6_IPV4); + + /* Encapsulate with given reformat ID */ + MLX5_SET(ste_double_action_insert_with_ptr_v3, scnd_d_action, action_id, + DR_STE_V1_ACTION_ID_INSERT_POINTER); + /* The hardware expects here size in words (2 byte) */ + MLX5_SET(ste_double_action_insert_with_ptr_v3, scnd_d_action, size, size / 2); + MLX5_SET(ste_double_action_insert_with_ptr_v3, scnd_d_action, pointer, reformat_id); + MLX5_SET(ste_double_action_insert_with_ptr_v3, scnd_d_action, attributes, + DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP); + + dr_ste_v1_set_reparse(hw_ste_p); +} + +static void dr_ste_v3_set_rx_decap(u8 *hw_ste_p, u8 *s_action) +{ + MLX5_SET(ste_single_action_remove_header_v3, s_action, action_id, + DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); + MLX5_SET(ste_single_action_remove_header_v3, s_action, decap, 1); + MLX5_SET(ste_single_action_remove_header_v3, s_action, vni_to_cqe, 1); + MLX5_SET(ste_single_action_remove_header_v3, s_action, end_anchor, + DR_STE_HEADER_ANCHOR_INNER_MAC); + + dr_ste_v1_set_reparse(hw_ste_p); +} + +static int +dr_ste_v3_set_action_decap_l3_list(void *data, u32 data_sz, + u8 *hw_action, u32 hw_action_sz, + uint16_t *used_hw_action_num) +{ + u8 padded_data[DR_STE_L2_HDR_MAX_SZ] = {}; + void *data_ptr = padded_data; + u16 used_actions = 0; + u32 inline_data_sz; + u32 i; + + if (hw_action_sz / DR_STE_ACTION_DOUBLE_SZ < DR_STE_DECAP_L3_ACTION_NUM) + return -EINVAL; + + inline_data_sz = + MLX5_FLD_SZ_BYTES(ste_double_action_insert_with_inline_v3, inline_data); + + /* Add an alignment padding */ + memcpy(padded_data + data_sz % inline_data_sz, data, data_sz); + + /* Remove L2L3 outer headers */ + MLX5_SET(ste_single_action_remove_header_v3, hw_action, action_id, + DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); + MLX5_SET(ste_single_action_remove_header_v3, hw_action, decap, 1); + MLX5_SET(ste_single_action_remove_header_v3, hw_action, vni_to_cqe, 1); + MLX5_SET(ste_single_action_remove_header_v3, hw_action, end_anchor, + DR_STE_HEADER_ANCHOR_INNER_IPV6_IPV4); + hw_action += DR_STE_ACTION_DOUBLE_SZ; + used_actions++; /* Remove and NOP are a single double action */ + + /* Point to the last dword of the header */ + data_ptr += (data_sz / inline_data_sz) * inline_data_sz; + + /* Add the new header using inline action 4Byte at a time, the header + * is added in reversed order to the beginning of the packet to avoid + * incorrect parsing by the HW. Since header is 14B or 18B an extra + * two bytes are padded and later removed. + */ + for (i = 0; i < data_sz / inline_data_sz + 1; i++) { + void *addr_inline; + + MLX5_SET(ste_double_action_insert_with_inline_v3, hw_action, action_id, + DR_STE_V1_ACTION_ID_INSERT_INLINE); + /* The hardware expects here offset to words (2 bytes) */ + MLX5_SET(ste_double_action_insert_with_inline_v3, hw_action, start_offset, 0); + + /* Copy bytes one by one to avoid endianness problem */ + addr_inline = MLX5_ADDR_OF(ste_double_action_insert_with_inline_v3, + hw_action, inline_data); + memcpy(addr_inline, data_ptr - i * inline_data_sz, inline_data_sz); + hw_action += DR_STE_ACTION_DOUBLE_SZ; + used_actions++; + } + + /* Remove first 2 extra bytes */ + MLX5_SET(ste_single_action_remove_header_size_v3, hw_action, action_id, + DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE); + MLX5_SET(ste_single_action_remove_header_size_v3, hw_action, start_offset, 0); + /* The hardware expects here size in words (2 bytes) */ + MLX5_SET(ste_single_action_remove_header_size_v3, hw_action, remove_size, 1); + used_actions++; + + *used_hw_action_num = used_actions; + + return 0; +} + +static struct mlx5dr_ste_ctx ste_ctx_v3 = { + /* Builders */ + .build_eth_l2_src_dst_init = &dr_ste_v1_build_eth_l2_src_dst_init, + .build_eth_l3_ipv6_src_init = &dr_ste_v1_build_eth_l3_ipv6_src_init, + .build_eth_l3_ipv6_dst_init = &dr_ste_v1_build_eth_l3_ipv6_dst_init, + .build_eth_l3_ipv4_5_tuple_init = &dr_ste_v1_build_eth_l3_ipv4_5_tuple_init, + .build_eth_l2_src_init = &dr_ste_v1_build_eth_l2_src_init, + .build_eth_l2_dst_init = &dr_ste_v1_build_eth_l2_dst_init, + .build_eth_l2_tnl_init = &dr_ste_v1_build_eth_l2_tnl_init, + .build_eth_l3_ipv4_misc_init = &dr_ste_v1_build_eth_l3_ipv4_misc_init, + .build_eth_ipv6_l3_l4_init = &dr_ste_v1_build_eth_ipv6_l3_l4_init, + .build_mpls_init = &dr_ste_v1_build_mpls_init, + .build_tnl_gre_init = &dr_ste_v1_build_tnl_gre_init, + .build_tnl_mpls_init = &dr_ste_v1_build_tnl_mpls_init, + .build_tnl_mpls_over_udp_init = &dr_ste_v1_build_tnl_mpls_over_udp_init, + .build_tnl_mpls_over_gre_init = &dr_ste_v1_build_tnl_mpls_over_gre_init, + .build_icmp_init = &dr_ste_v1_build_icmp_init, + .build_general_purpose_init = &dr_ste_v1_build_general_purpose_init, + .build_eth_l4_misc_init = &dr_ste_v1_build_eth_l4_misc_init, + .build_tnl_vxlan_gpe_init = &dr_ste_v1_build_flex_parser_tnl_vxlan_gpe_init, + .build_tnl_geneve_init = &dr_ste_v1_build_flex_parser_tnl_geneve_init, + .build_tnl_geneve_tlv_opt_init = &dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_init, + .build_tnl_geneve_tlv_opt_exist_init = + &dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_init, + .build_register_0_init = &dr_ste_v1_build_register_0_init, + .build_register_1_init = &dr_ste_v1_build_register_1_init, + .build_src_gvmi_qpn_init = &dr_ste_v1_build_src_gvmi_qpn_init, + .build_flex_parser_0_init = &dr_ste_v1_build_flex_parser_0_init, + .build_flex_parser_1_init = &dr_ste_v1_build_flex_parser_1_init, + .build_tnl_gtpu_init = &dr_ste_v1_build_flex_parser_tnl_gtpu_init, + .build_tnl_header_0_1_init = &dr_ste_v1_build_tnl_header_0_1_init, + .build_tnl_gtpu_flex_parser_0_init = &dr_ste_v1_build_tnl_gtpu_flex_parser_0_init, + .build_tnl_gtpu_flex_parser_1_init = &dr_ste_v1_build_tnl_gtpu_flex_parser_1_init, + + /* Getters and Setters */ + .ste_init = &dr_ste_v1_init, + .set_next_lu_type = &dr_ste_v1_set_next_lu_type, + .get_next_lu_type = &dr_ste_v1_get_next_lu_type, + .is_miss_addr_set = &dr_ste_v1_is_miss_addr_set, + .set_miss_addr = &dr_ste_v1_set_miss_addr, + .get_miss_addr = &dr_ste_v1_get_miss_addr, + .set_hit_addr = &dr_ste_v1_set_hit_addr, + .set_byte_mask = &dr_ste_v1_set_byte_mask, + .get_byte_mask = &dr_ste_v1_get_byte_mask, + + /* Actions */ + .actions_caps = DR_STE_CTX_ACTION_CAP_TX_POP | + DR_STE_CTX_ACTION_CAP_RX_PUSH | + DR_STE_CTX_ACTION_CAP_RX_ENCAP, + .set_actions_rx = &dr_ste_v1_set_actions_rx, + .set_actions_tx = &dr_ste_v1_set_actions_tx, + .modify_field_arr_sz = ARRAY_SIZE(dr_ste_v2_action_modify_field_arr), + .modify_field_arr = dr_ste_v2_action_modify_field_arr, + .set_action_set = &dr_ste_v1_set_action_set, + .set_action_add = &dr_ste_v1_set_action_add, + .set_action_copy = &dr_ste_v1_set_action_copy, + .set_action_decap_l3_list = &dr_ste_v3_set_action_decap_l3_list, + .alloc_modify_hdr_chunk = &dr_ste_v1_alloc_modify_hdr_ptrn_arg, + .dealloc_modify_hdr_chunk = &dr_ste_v1_free_modify_hdr_ptrn_arg, + /* Actions bit set */ + .set_encap = &dr_ste_v3_set_encap, + .set_push_vlan = &dr_ste_v3_set_push_vlan, + .set_pop_vlan = &dr_ste_v3_set_pop_vlan, + .set_rx_decap = &dr_ste_v3_set_rx_decap, + .set_encap_l3 = &dr_ste_v3_set_encap_l3, + /* Send */ + .prepare_for_postsend = &dr_ste_v1_prepare_for_postsend, +}; + +struct mlx5dr_ste_ctx *mlx5dr_ste_get_ctx_v3(void) +{ + return &ste_ctx_v3; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr.h index fb078fa0f0cc8..898c3618ff263 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5_ifc_dr.h @@ -600,4 +600,44 @@ struct mlx5_ifc_ste_double_action_aso_v1_bits { }; }; +struct mlx5_ifc_ste_single_action_remove_header_v3_bits { + u8 action_id[0x8]; + u8 start_anchor[0x7]; + u8 end_anchor[0x7]; + u8 reserved_at_16[0x1]; + u8 outer_l4_remove[0x1]; + u8 reserved_at_18[0x4]; + u8 decap[0x1]; + u8 vni_to_cqe[0x1]; + u8 qos_profile[0x2]; +}; + +struct mlx5_ifc_ste_single_action_remove_header_size_v3_bits { + u8 action_id[0x8]; + u8 start_anchor[0x7]; + u8 start_offset[0x8]; + u8 outer_l4_remove[0x1]; + u8 reserved_at_18[0x2]; + u8 remove_size[0x6]; +}; + +struct mlx5_ifc_ste_double_action_insert_with_inline_v3_bits { + u8 action_id[0x8]; + u8 start_anchor[0x7]; + u8 start_offset[0x8]; + u8 reserved_at_17[0x9]; + + u8 inline_data[0x20]; +}; + +struct mlx5_ifc_ste_double_action_insert_with_ptr_v3_bits { + u8 action_id[0x8]; + u8 start_anchor[0x7]; + u8 start_offset[0x8]; + u8 size[0x6]; + u8 attributes[0x3]; + + u8 pointer[0x20]; +}; + #endif /* MLX5_IFC_DR_H */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5dr.h index 89fced86936f6..bc70f083571c9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5dr.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/mlx5dr.h @@ -165,7 +165,7 @@ mlx5dr_is_supported(struct mlx5_core_dev *dev) (MLX5_CAP_ESW_FLOWTABLE_FDB(dev, sw_owner) || (MLX5_CAP_ESW_FLOWTABLE_FDB(dev, sw_owner_v2) && (MLX5_CAP_GEN(dev, steering_format_version) <= - MLX5_STEERING_FORMAT_CONNECTX_7))); + MLX5_STEERING_FORMAT_CONNECTX_8))); } /* buddy functions & structure */ From 003da6cee208173c97e0e5df6435a691c8939714 Mon Sep 17 00:00:00 2001 From: Riwen Lu Date: Wed, 12 Jun 2024 19:46:31 +0800 Subject: [PATCH 108/548] cpufreq/cppc: Don't compare desired_perf in target() There is a corner case where the desired_perf is exactly same as the old perf, but the actual current freq is not. This happens during S3 while the cpufreq governor is set to powersave. During cpufreq resume process, the booting CPU's new_freq obtained via .get() is the highest frequency, while the policy->cur and cpu->perf_ctrls.desired_perf are set to the lowest level (powersave governor). This causes the warning: "CPU frequency out of sync:", and the cpufreq core sets policy->cur to new_freq. Then the governor->limits() calls cppc_cpufreq_set_target() to configures the CPU frequency and returns directly because the desired_perf converted from target_freq is same as the cpu->perf_ctrls.desired_perf and both are the lowest_perf. Since target_freq and policy->cur have been already compared in __cpufreq_driver_target(), there's no need to compare them again here. Drop the comparison. Signed-off-by: Riwen Lu [ Viresh: Updated commit message / subject ] Signed-off-by: Viresh Kumar (cherry picked from commit 90e4ed6bb02ad93663f17411d17e8e714a765a6b) Signed-off-by: Koba Ko --- drivers/cpufreq/cppc_cpufreq.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index 5f0660e9bcca5..6f68b5c5c52d6 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -275,15 +275,10 @@ static int cppc_cpufreq_set_target(struct cpufreq_policy *policy, struct cppc_cpudata *cpu_data = policy->driver_data; unsigned int cpu = policy->cpu; struct cpufreq_freqs freqs; - u32 desired_perf; int ret = 0; - desired_perf = cppc_khz_to_perf(&cpu_data->perf_caps, target_freq); - /* Return if it is exactly the same perf */ - if (desired_perf == cpu_data->perf_ctrls.desired_perf) - return ret; - - cpu_data->perf_ctrls.desired_perf = desired_perf; + cpu_data->perf_ctrls.desired_perf = + cppc_khz_to_perf(&cpu_data->perf_caps, target_freq); freqs.old = policy->cur; freqs.new = target_freq; From 343c7c852c96e1b6b08fdd19418b66331bfbaebc Mon Sep 17 00:00:00 2001 From: Koba Ko Date: Sun, 13 Oct 2024 04:50:10 +0800 Subject: [PATCH 109/548] ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context PRMT needs to find the correct type of block to translate the PA-VA mapping for EFI runtime services. The issue arises because the PRMT is finding a block of type EFI_CONVENTIONAL_MEMORY, which is not appropriate for runtime services as described in Section 2.2.2 (Runtime Services) of the UEFI Specification [1]. Since the PRM handler is a type of runtime service, this causes an exception when the PRM handler is called. [Firmware Bug]: Unable to handle paging request in EFI runtime service WARNING: CPU: 22 PID: 4330 at drivers/firmware/efi/runtime-wrappers.c:341 __efi_queue_work+0x11c/0x170 Call trace: Let PRMT find a block with EFI_MEMORY_RUNTIME for PRM handler and PRM context. If no suitable block is found, a warning message will be printed, but the procedure continues to manage the next PRM handler. However, if the PRM handler is actually called without proper allocation, it would result in a failure during error handling. By using the correct memory types for runtime services, ensure that the PRM handler and the context are properly mapped in the virtual address space during runtime, preventing the paging request error. The issue is really that only memory that has been remapped for runtime by the firmware can be used by the PRM handler, and so the region needs to have the EFI_MEMORY_RUNTIME attribute. Link: https://uefi.org/sites/default/files/resources/UEFI_Spec_2_10_Aug29.pdf # [1] Fixes: cefc7ca46235 ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype") Cc: All applicable Signed-off-by: Koba Ko Reviewed-by: Matthew R. Ochs Reviewed-by: Zhang Rui Reviewed-by: Ard Biesheuvel Link: https://patch.msgid.link/20241012205010.4165798-1-kobak@nvidia.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki (cherry picked from commit 088984c8d54c0053fc4ae606981291d741c5924b) Signed-off-by: Koba Ko --- drivers/acpi/prmt.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/prmt.c b/drivers/acpi/prmt.c index a34f7d37877c9..8b391f12853bb 100644 --- a/drivers/acpi/prmt.c +++ b/drivers/acpi/prmt.c @@ -263,7 +263,9 @@ static acpi_status acpi_platformrt_space_handler(u32 function, if (!handler || !module) goto invalid_guid; - if (!handler->handler_addr) { + if (!handler->handler_addr || + !handler->static_data_buffer_addr || + !handler->acpi_param_buffer_addr) { buffer->prm_status = PRM_HANDLER_ERROR; return AE_OK; } From b339e7125cd2a39109b7e42f9c5cb0a91ea7edb6 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 13 Sep 2023 14:51:08 +0200 Subject: [PATCH 110/548] mm/rmap: drop stale comment in page_add_anon_rmap and hugepage_add_anon_rmap() Patch series "Anon rmap cleanups". Some cleanups around rmap for anon pages. I'm working on more cleanups also around file rmap -- also to handle the "compound" parameter internally only and to let hugetlb use page_add_file_rmap(), but these changes make sense separately. This patch (of 6): That comment was added in commit 5dbe0af47f8a ("mm: fix kernel BUG at mm/rmap.c:1017!") to document why we can see vma->vm_end getting adjusted concurrently due to a VMA split. However, the optimized locking code was changed again in bf181b9f9d8 ("mm anon rmap: replace same_anon_vma linked list with an interval tree."). ... and later, the comment was changed in commit 0503ea8f5ba7 ("mm/mmap: remove __vma_adjust()") to talk about "vma_merge" although the original issue was with VMA splitting. Let's just remove that comment. Nowadays, it's outdated, imprecise and confusing. Link: https://lkml.kernel.org/r/20230913125113.313322-1-david@redhat.com Link: https://lkml.kernel.org/r/20230913125113.313322-2-david@redhat.com Signed-off-by: David Hildenbrand Cc: Mike Kravetz Cc: Muchun Song Cc: Matthew Wilcox Signed-off-by: Andrew Morton (cherry picked from commit fd63908706f79c963946a77b7f352db5431deed5) Signed-off-by: Koba Ko --- mm/rmap.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 968b85a67b1a1..956965964c468 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1245,7 +1245,6 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); if (likely(!folio_test_ksm(folio))) { - /* address might be in next vma when migration races vma_merge */ if (first) __page_set_anon_rmap(folio, page, vma, address, !!(flags & RMAP_EXCLUSIVE)); @@ -2549,7 +2548,6 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma, BUG_ON(!folio_test_locked(folio)); BUG_ON(!anon_vma); - /* address might be in next vma when migration races vma_merge */ first = atomic_inc_and_test(&folio->_entire_mapcount); VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page); VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page); From 41546c37f2e0fa752e225f376ce8ffb955599e83 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 13 Sep 2023 14:51:09 +0200 Subject: [PATCH 111/548] mm/rmap: move SetPageAnonExclusive out of __page_set_anon_rmap() Let's handle it in the caller. No need to pass the page. While at it, rename the function to __folio_set_anon() and pass "bool exclusive" instead of "int exclusive". Link: https://lkml.kernel.org/r/20230913125113.313322-3-david@redhat.com Signed-off-by: David Hildenbrand Cc: Matthew Wilcox Cc: Mike Kravetz Cc: Muchun Song Signed-off-by: Andrew Morton (cherry picked from commit c66db8c0702c0ab741ecfd5e12b323ff49fe9089) Signed-off-by: Koba Ko --- mm/rmap.c | 41 +++++++++++++++++++++-------------------- 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 956965964c468..c9f4dd33b9864 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1122,27 +1122,25 @@ void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) } /** - * __page_set_anon_rmap - set up new anonymous rmap - * @folio: Folio which contains page. - * @page: Page to add to rmap. - * @vma: VM area to add page to. + * __folio_set_anon - set up a new anonymous rmap for a folio + * @folio: The folio to set up the new anonymous rmap for. + * @vma: VM area to add the folio to. * @address: User virtual address of the mapping - * @exclusive: the page is exclusively owned by the current process + * @exclusive: Whether the folio is exclusive to the process. */ -static void __page_set_anon_rmap(struct folio *folio, struct page *page, - struct vm_area_struct *vma, unsigned long address, int exclusive) +static void __folio_set_anon(struct folio *folio, struct vm_area_struct *vma, + unsigned long address, bool exclusive) { struct anon_vma *anon_vma = vma->anon_vma; BUG_ON(!anon_vma); if (folio_test_anon(folio)) - goto out; + return; /* - * If the page isn't exclusively mapped into this vma, - * we must use the _oldest_ possible anon_vma for the - * page mapping! + * If the folio isn't exclusive to this vma, we must use the _oldest_ + * possible anon_vma for the folio mapping! */ if (!exclusive) anon_vma = anon_vma->root; @@ -1156,9 +1154,6 @@ static void __page_set_anon_rmap(struct folio *folio, struct page *page, anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON; WRITE_ONCE(folio->mapping, (struct address_space *) anon_vma); folio->index = linear_page_index(vma, address); -out: - if (exclusive) - SetPageAnonExclusive(page); } /** @@ -1246,11 +1241,13 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, if (likely(!folio_test_ksm(folio))) { if (first) - __page_set_anon_rmap(folio, page, vma, address, - !!(flags & RMAP_EXCLUSIVE)); + __folio_set_anon(folio, vma, address, + !!(flags & RMAP_EXCLUSIVE)); else __page_check_anon_rmap(folio, page, vma, address); } + if (flags & RMAP_EXCLUSIVE) + SetPageAnonExclusive(page); mlock_vma_folio(folio, vma, compound); } @@ -1289,7 +1286,8 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, } __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); - __page_set_anon_rmap(folio, &folio->page, vma, address, 1); + __folio_set_anon(folio, vma, address, true); + SetPageAnonExclusive(&folio->page); } /** @@ -2552,8 +2550,10 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma, VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page); VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page); if (first) - __page_set_anon_rmap(folio, page, vma, address, - !!(flags & RMAP_EXCLUSIVE)); + __folio_set_anon(folio, vma, address, + !!(flags & RMAP_EXCLUSIVE)); + if (flags & RMAP_EXCLUSIVE) + SetPageAnonExclusive(page); } void hugepage_add_new_anon_rmap(struct folio *folio, @@ -2563,6 +2563,7 @@ void hugepage_add_new_anon_rmap(struct folio *folio, /* increment count (starts at -1) */ atomic_set(&folio->_entire_mapcount, 0); folio_clear_hugetlb_restore_reserve(folio); - __page_set_anon_rmap(folio, &folio->page, vma, address, 1); + __folio_set_anon(folio, vma, address, true); + SetPageAnonExclusive(&folio->page); } #endif /* CONFIG_HUGETLB_PAGE */ From 14f4f2cdcffaabf5828bedbf83d54c14353eaf56 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 13 Sep 2023 14:51:10 +0200 Subject: [PATCH 112/548] mm/rmap: move folio_test_anon() check out of __folio_set_anon() Let's handle it in the caller; no need for the "first" check based on the mapcount. We really only end up with !anon pages in page_add_anon_rmap() via do_swap_page(), where we hold the folio lock. So races are not possible. Add a VM_WARN_ON_FOLIO() to make sure that we really hold the folio lock. In the future, we might want to let do_swap_page() use folio_add_new_anon_rmap() on new pages instead: however, we might have to pass then whether the folio is exclusive or not. So keep it in there for now. For hugetlb we never expect to have a non-anon page in hugepage_add_anon_rmap(). Remove that code, along with some other checks that are either not required or were checked in hugepage_add_new_anon_rmap() already. Link: https://lkml.kernel.org/r/20230913125113.313322-4-david@redhat.com Signed-off-by: David Hildenbrand Cc: Matthew Wilcox Cc: Mike Kravetz Cc: Muchun Song Signed-off-by: Andrew Morton (cherry picked from commit c5c540034747dfe450f64d1151081a6080daa8f9) Signed-off-by: Koba Ko --- mm/rmap.c | 23 ++++++++--------------- 1 file changed, 8 insertions(+), 15 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index c9f4dd33b9864..de8f22c25edb5 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1135,9 +1135,6 @@ static void __folio_set_anon(struct folio *folio, struct vm_area_struct *vma, BUG_ON(!anon_vma); - if (folio_test_anon(folio)) - return; - /* * If the folio isn't exclusive to this vma, we must use the _oldest_ * possible anon_vma for the folio mapping! @@ -1239,12 +1236,12 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, if (nr) __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); - if (likely(!folio_test_ksm(folio))) { - if (first) - __folio_set_anon(folio, vma, address, - !!(flags & RMAP_EXCLUSIVE)); - else - __page_check_anon_rmap(folio, page, vma, address); + if (unlikely(!folio_test_anon(folio))) { + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); + __folio_set_anon(folio, vma, address, + !!(flags & RMAP_EXCLUSIVE)); + } else if (likely(!folio_test_ksm(folio))) { + __page_check_anon_rmap(folio, page, vma, address); } if (flags & RMAP_EXCLUSIVE) SetPageAnonExclusive(page); @@ -2541,17 +2538,13 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address, rmap_t flags) { struct folio *folio = page_folio(page); - struct anon_vma *anon_vma = vma->anon_vma; int first; - BUG_ON(!folio_test_locked(folio)); - BUG_ON(!anon_vma); + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); + first = atomic_inc_and_test(&folio->_entire_mapcount); VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page); VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page); - if (first) - __folio_set_anon(folio, vma, address, - !!(flags & RMAP_EXCLUSIVE)); if (flags & RMAP_EXCLUSIVE) SetPageAnonExclusive(page); } From 3f40a6ffbd0122f23ab7e0e1addbf021e43431b8 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 13 Sep 2023 14:51:11 +0200 Subject: [PATCH 113/548] mm/rmap: warn on new PTE-mapped folios in page_add_anon_rmap() If swapin code would ever decide to not use order-0 pages and supply a PTE-mapped large folio, we will have to change how we call __folio_set_anon() -- eventually with exclusive=false and an adjusted address. For now, let's add a VM_WARN_ON_FOLIO() with a comment about the situation. Link: https://lkml.kernel.org/r/20230913125113.313322-5-david@redhat.com Signed-off-by: David Hildenbrand Cc: Matthew Wilcox Cc: Mike Kravetz Cc: Muchun Song Signed-off-by: Andrew Morton (cherry picked from commit a1f34ee1de2c3a55bc2a6b9a38e1ecd2830dcc03) Signed-off-by: Koba Ko --- mm/rmap.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/mm/rmap.c b/mm/rmap.c index de8f22c25edb5..856dbcfb40f1f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1238,6 +1238,13 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, if (unlikely(!folio_test_anon(folio))) { VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); + /* + * For a PTE-mapped large folio, we only know that the single + * PTE is exclusive. Further, __folio_set_anon() might not get + * folio->index right when not given the address of the head + * page. + */ + VM_WARN_ON_FOLIO(folio_test_large(folio) && !compound, folio); __folio_set_anon(folio, vma, address, !!(flags & RMAP_EXCLUSIVE)); } else if (likely(!folio_test_ksm(folio))) { From 0076b4536a296c9bc6741dd6589f56ed0b26991c Mon Sep 17 00:00:00 2001 From: Zach O'Keefe Date: Mon, 25 Sep 2023 13:01:10 -0700 Subject: [PATCH 114/548] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()" The 6.0 commits: commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") merged "can we have THPs in this VMA?" logic that was previously done separately by fault-path, khugepaged, and smaps "THPeligible" checks. During the process, the semantics of the fault path check changed in two ways: 1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path). 2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault handler that could satisfy the fault. Previously, this check had been done in create_huge_pud() and create_huge_pmd() routines, but after the changes, we never reach those routines. During the review of the above commits, it was determined that in-tree users weren't affected by the change; most notably, since the only relevant user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is explicitly approved early in approval logic. However, this was a bad assumption to make as it assumes the only reason to support ->huge_fault was for DAX (which is not true in general). Remove the VM_NO_KHUGEPAGED check when not in collapse path and give any ->huge_fault handler a chance to handle the fault. Note that we don't validate the file mode or mapping alignment, which is consistent with the behavior before the aforementioned commits. Link: https://lkml.kernel.org/r/20230925200110.1979606-1-zokeefe@google.com Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()") Reported-by: Saurabh Singh Sengar Signed-off-by: Zach O'Keefe Cc: Yang Shi Cc: Matthew Wilcox Cc: David Hildenbrand Cc: Ryan Roberts Signed-off-by: Andrew Morton (cherry picked from commit 7a81751fcdeb833acc858e59082688e3020bfe12) Signed-off-by: Koba Ko --- mm/huge_memory.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 78f5df12b8eb3..9eefcbc042810 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -86,11 +86,11 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, return in_pf; /* - * Special VMA and hugetlb VMA. + * khugepaged special VMA and hugetlb VMA. * Must be checked after dax since some dax mappings may have * VM_MIXEDMAP set. */ - if (vm_flags & VM_NO_KHUGEPAGED) + if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED)) return false; /* @@ -118,12 +118,18 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, !hugepage_flags_always()))) return false; - /* Only regular file is valid */ - if (!in_pf && file_thp_enabled(vma)) - return true; - - if (!vma_is_anonymous(vma)) + if (!vma_is_anonymous(vma)) { + /* + * Trust that ->huge_fault() handlers know what they are doing + * in fault path. + */ + if (((in_pf || smaps)) && vma->vm_ops->huge_fault) + return true; + /* Only regular file is valid in collapse path */ + if (((!in_pf || smaps)) && file_thp_enabled(vma)) + return true; return false; + } if (vma_is_temporary_stack(vma)) return false; From 1d3f0aec3d732165a4a6e6486cac2db5760cd877 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:02 +0000 Subject: [PATCH 115/548] mm: allow deferred splitting of arbitrary anon large folios Patch series "Multi-size THP for anonymous memory", v9. A series to implement multi-size THP (mTHP) for anonymous memory (previously called "small-sized THP" and "large anonymous folios"). The objective of this is to improve performance by allocating larger chunks of memory during anonymous page faults: 1) Since SW (the kernel) is dealing with larger chunks of memory than base pages, there are efficiency savings to be had; fewer page faults, batched PTE and RMAP manipulation, reduced lru list, etc. In short, we reduce kernel overhead. This should benefit all architectures. 2) Since we are now mapping physically contiguous chunks of memory, we can take advantage of HW TLB compression techniques. A reduction in TLB pressure speeds up kernel and user space. arm64 systems have 2 mechanisms to coalesce TLB entries; "the contiguous bit" (architectural) and HPA (uarch). This version incorporates David's feedback on the core patches (#3, #4) and adds some RB and TB tags (see change log for details). By default, the existing behaviour (and performance) is maintained. The user must explicitly enable multi-size THP to see the performance benefit. This is done via a new sysfs interface (as recommended by David Hildenbrand - thanks to David for the suggestion)! This interface is inspired by the existing per-hugepage-size sysfs interface used by hugetlb, provides full backwards compatibility with the existing PMD-size THP interface, and provides a base for future extensibility. See [9] for detailed discussion of the interface. This series is based on mm-unstable (715b67adf4c8). Prerequisites ============= I'm removing this section on the basis that I don't believe what we were previously calling prerequisites are really prerequisites anymore. We originally defined them when mTHP was a compile-time feature. There is now a runtime control to opt-in to mTHP; when disabled, correctness and performance are as before. When enabled, the code is still correct/robust, but in the absence of the one remaining item (compaction) there may be a performance impact in some corners. See the old list in the v8 cover letter at [8]. And a longer explanation of my thinking here [10]. SUMMARY: I don't think we should hold this series up, waiting for the items on the prerequisites list. I believe this series should be ready now so hopefully can be added to mm-unstable for some testing, then fingers crossed for v6.8. Testing ======= The series includes patches for mm selftests to enlighten the cow and khugepaged tests to explicitly test with multi-size THP, in the same way that PMD-sized THP is tested. The new tests all pass, and no regressions are observed in the mm selftest suite. I've also run my usual kernel compilation and java script benchmarks without any issues. Refer to my performance numbers posted with v6 [6]. (These are for multi-size THP only - they do not include the arm64 contpte follow-on series). John Hubbard at Nvidia has indicated dramatic 10x performance improvements for some workloads at [11]. (Observed using v6 of this series as well as the arm64 contpte series). Kefeng Wang at Huawei has also indicated he sees improvements at [12] although there are some latency regressions also. I've also checked that there is no regression in the write fault path when mTHP is disabled using a microbenchmark. I ran it for a baseline kernel, as well as v8 and v9. I repeated on Ampere Altra (bare metal) and Apple M2 (VM): | | m2 vm | altra | |--------------|---------------------|---------------------| | kernel | mean | std_rel | mean | std_rel | |--------------|----------|----------|----------|----------| | baseline | 0.000% | 0.341% | 0.000% | 3.581% | | anonfolio-v8 | 0.005% | 0.272% | 5.068% | 1.128% | | anonfolio-v9 | -0.013% | 0.442% | 0.107% | 1.788% | There is no measurable difference on M2, but altra has a slow down in v8 which is fixed in v9 by moving the THP order check to be inline within thp_vma_allowable_orders(), as suggested by David. This patch (of 10): In preparation for the introduction of anonymous multi-size THP, we would like to be able to split them when they have unmapped subpages, in order to free those unused pages under memory pressure. So remove the artificial requirement that the large folio needed to be at least PMD-sized. Link: https://lkml.kernel.org/r/20231207161211.2374093-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20231207161211.2374093-2-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: Yu Zhao Reviewed-by: Yin Fengwei Reviewed-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Reviewed-by: Barry Song Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Alistair Popple Cc: Anshuman Khandual Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Vlastimil Babka Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 7dc7c5ef6463111991002f24c0aea08afe86f2cc) Signed-off-by: Koba Ko --- mm/rmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 856dbcfb40f1f..1c987ec9a0be2 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1447,11 +1447,11 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, __lruvec_stat_mod_folio(folio, idx, -nr); /* - * Queue anon THP for deferred split if at least one + * Queue anon large folio for deferred split if at least one * page of the folio is unmapped and at least one page * is still mapped. */ - if (folio_test_pmd_mappable(folio) && folio_test_anon(folio)) + if (folio_test_large(folio) && folio_test_anon(folio)) if (!compound || nr < nr_pmdmapped) deferred_split_folio(folio); } From 6d143b2dadc1a7edd79020e2d63cc05d309491e0 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:03 +0000 Subject: [PATCH 116/548] mm: non-pmd-mappable, large folios for folio_add_new_anon_rmap() In preparation for supporting anonymous multi-size THP, improve folio_add_new_anon_rmap() to allow a non-pmd-mappable, large folio to be passed to it. In this case, all contained pages are accounted using the order-0 folio (or base page) scheme. Link: https://lkml.kernel.org/r/20231207161211.2374093-3-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: Yu Zhao Reviewed-by: Yin Fengwei Reviewed-by: David Hildenbrand Reviewed-by: Barry Song Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Alistair Popple Cc: Anshuman Khandual Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 372cbd4d5a0665bf7e181c72f5e40e1bf59b0b08) Signed-off-by: Koba Ko --- mm/rmap.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 1c987ec9a0be2..9e7dfeac56f1f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1266,32 +1266,44 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, * This means the inc-and-test can be bypassed. * The folio does not have to be locked. * - * If the folio is large, it is accounted as a THP. As the folio + * If the folio is pmd-mappable, it is accounted as a THP. As the folio * is new, it's assumed to be mapped exclusively by a single process. */ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, unsigned long address) { - int nr; + int nr = folio_nr_pages(folio); - VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma); + VM_BUG_ON_VMA(address < vma->vm_start || + address + (nr << PAGE_SHIFT) > vma->vm_end, vma); __folio_set_swapbacked(folio); + __folio_set_anon(folio, vma, address, true); - if (likely(!folio_test_pmd_mappable(folio))) { + if (likely(!folio_test_large(folio))) { /* increment count (starts at -1) */ atomic_set(&folio->_mapcount, 0); - nr = 1; + SetPageAnonExclusive(&folio->page); + } else if (!folio_test_pmd_mappable(folio)) { + int i; + + for (i = 0; i < nr; i++) { + struct page *page = folio_page(folio, i); + + /* increment count (starts at -1) */ + atomic_set(&page->_mapcount, 0); + SetPageAnonExclusive(page); + } + + atomic_set(&folio->_nr_pages_mapped, nr); } else { /* increment count (starts at -1) */ atomic_set(&folio->_entire_mapcount, 0); atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED); - nr = folio_nr_pages(folio); + SetPageAnonExclusive(&folio->page); __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr); } __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); - __folio_set_anon(folio, vma, address, true); - SetPageAnonExclusive(&folio->page); } /** From 40ff2b3ff10b9680ae14ba619db0e23ccc3d6d41 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:04 +0000 Subject: [PATCH 117/548] mm: thp: introduce multi-size THP sysfs interface In preparation for adding support for anonymous multi-size THP, introduce new sysfs structure that will be used to control the new behaviours. A new directory is added under transparent_hugepage for each supported THP size, and contains an `enabled` file, which can be set to "inherit" (to inherit the global setting), "always", "madvise" or "never". For now, the kernel still only supports PMD-sized anonymous THP, so only 1 directory is populated. The first half of the change converts transhuge_vma_suitable() and hugepage_vma_check() so that they take a bitfield of orders for which the user wants to determine support, and the functions filter out all the orders that can't be supported, given the current sysfs configuration and the VMA dimensions. The resulting functions are renamed to thp_vma_suitable_orders() and thp_vma_allowable_orders() respectively. Convenience functions that take a single, unencoded order and return a boolean are also defined as thp_vma_suitable_order() and thp_vma_allowable_order(). The second half of the change implements the new sysfs interface. It has been done so that each supported THP size has a `struct thpsize`, which describes the relevant metadata and is itself a kobject. This is pretty minimal for now, but should make it easy to add new per-thpsize files to the interface if needed in future (e.g. per-size defrag). Rather than keep the `enabled` state directly in the struct thpsize, I've elected to directly encode it into huge_anon_orders_[always|madvise|inherit] bitfields since this reduces the amount of work required in thp_vma_allowable_orders() which is called for every page fault. See Documentation/admin-guide/mm/transhuge.rst, as modified by this commit, for details of how the new sysfs interface works. [ryan.roberts@arm.com: fix build warning when CONFIG_SYSFS is disabled] Link: https://lkml.kernel.org/r/20231211125320.3997543-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20231207161211.2374093-4-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: Barry Song Tested-by: Kefeng Wang Tested-by: John Hubbard Acked-by: David Hildenbrand Cc: Alistair Popple Cc: Anshuman Khandual Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 3485b88390b0af9e05dc2c3f57e9936f41e159a0) Signed-off-by: Koba Ko --- Documentation/admin-guide/mm/transhuge.rst | 97 +++++++-- Documentation/filesystems/proc.rst | 6 +- fs/proc/task_mmu.c | 3 +- include/linux/huge_mm.h | 181 +++++++++++++--- mm/huge_memory.c | 229 ++++++++++++++++++--- mm/khugepaged.c | 20 +- mm/memory.c | 6 +- mm/page_vma_mapped.c | 3 +- 8 files changed, 458 insertions(+), 87 deletions(-) diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst index b0cc8243e0934..04eb45a2f9406 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -45,10 +45,25 @@ components: the two is using hugepages just because of the fact the TLB miss is going to run faster. +Modern kernels support "multi-size THP" (mTHP), which introduces the +ability to allocate memory in blocks that are bigger than a base page +but smaller than traditional PMD-size (as described above), in +increments of a power-of-2 number of pages. mTHP can back anonymous +memory (for example 16K, 32K, 64K, etc). These THPs continue to be +PTE-mapped, but in many cases can still provide similar benefits to +those outlined above: Page faults are significantly reduced (by a +factor of e.g. 4, 8, 16, etc), but latency spikes are much less +prominent because the size of each page isn't as huge as the PMD-sized +variant and there is less memory to clear in each page fault. Some +architectures also employ TLB compression mechanisms to squeeze more +entries in when a set of PTEs are virtually and physically contiguous +and approporiately aligned. In this case, TLB misses will occur less +often. + THP can be enabled system wide or restricted to certain tasks or even memory ranges inside task's address space. Unless THP is completely disabled, there is ``khugepaged`` daemon that scans memory and -collapses sequences of basic pages into huge pages. +collapses sequences of basic pages into PMD-sized huge pages. The THP behaviour is controlled via :ref:`sysfs ` interface and using madvise(2) and prctl(2) system calls. @@ -95,12 +110,40 @@ Global THP controls Transparent Hugepage Support for anonymous memory can be entirely disabled (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE regions (to avoid the risk of consuming more memory resources) or enabled -system wide. This can be achieved with one of:: +system wide. This can be achieved per-supported-THP-size with one of:: + + echo always >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled + echo madvise >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled + echo never >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled + +where is the hugepage size being addressed, the available sizes +for which vary by system. + +For example:: + + echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled + +Alternatively it is possible to specify that a given hugepage size +will inherit the top-level "enabled" value:: + + echo inherit >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled + +For example:: + + echo inherit >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled + +The top-level setting (for use with "inherit") can be set by issuing +one of the following commands:: echo always >/sys/kernel/mm/transparent_hugepage/enabled echo madvise >/sys/kernel/mm/transparent_hugepage/enabled echo never >/sys/kernel/mm/transparent_hugepage/enabled +By default, PMD-sized hugepages have enabled="inherit" and all other +hugepage sizes have enabled="never". If enabling multiple hugepage +sizes, the kernel will select the most appropriate enabled size for a +given allocation. + It's also possible to limit defrag efforts in the VM to generate anonymous hugepages in case they're not immediately free to madvise regions or to never try to defrag memory and simply fallback to regular @@ -146,25 +189,34 @@ madvise never should be self-explanatory. -By default kernel tries to use huge zero page on read page fault to -anonymous mapping. It's possible to disable huge zero page by writing 0 -or enable it back by writing 1:: +By default kernel tries to use huge, PMD-mappable zero page on read +page fault to anonymous mapping. It's possible to disable huge zero +page by writing 0 or enable it back by writing 1:: echo 0 >/sys/kernel/mm/transparent_hugepage/use_zero_page echo 1 >/sys/kernel/mm/transparent_hugepage/use_zero_page -Some userspace (such as a test program, or an optimized memory allocation -library) may want to know the size (in bytes) of a transparent hugepage:: +Some userspace (such as a test program, or an optimized memory +allocation library) may want to know the size (in bytes) of a +PMD-mappable transparent hugepage:: cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size -khugepaged will be automatically started when -transparent_hugepage/enabled is set to "always" or "madvise, and it'll -be automatically shutdown if it's set to "never". +khugepaged will be automatically started when one or more hugepage +sizes are enabled (either by directly setting "always" or "madvise", +or by setting "inherit" while the top-level enabled is set to "always" +or "madvise"), and it'll be automatically shutdown when the last +hugepage size is disabled (either by directly setting "never", or by +setting "inherit" while the top-level enabled is set to "never"). Khugepaged controls ------------------- +.. note:: + khugepaged currently only searches for opportunities to collapse to + PMD-sized THP and no attempt is made to collapse to other THP + sizes. + khugepaged runs usually at low frequency so while one may not want to invoke defrag algorithms synchronously during the page faults, it should be worth invoking defrag at least in khugepaged. However it's @@ -282,19 +334,26 @@ force Need of application restart =========================== -The transparent_hugepage/enabled values and tmpfs mount option only affect -future behavior. So to make them effective you need to restart any -application that could have been using hugepages. This also applies to the -regions registered in khugepaged. +The transparent_hugepage/enabled and +transparent_hugepage/hugepages-kB/enabled values and tmpfs mount +option only affect future behavior. So to make them effective you need +to restart any application that could have been using hugepages. This +also applies to the regions registered in khugepaged. Monitoring usage ================ -The number of anonymous transparent huge pages currently used by the +.. note:: + Currently the below counters only record events relating to + PMD-sized THP. Events relating to other THP sizes are not included. + +The number of PMD-sized anonymous transparent huge pages currently used by the system is available by reading the AnonHugePages field in ``/proc/meminfo``. -To identify what applications are using anonymous transparent huge pages, -it is necessary to read ``/proc/PID/smaps`` and count the AnonHugePages fields -for each mapping. +To identify what applications are using PMD-sized anonymous transparent huge +pages, it is necessary to read ``/proc/PID/smaps`` and count the AnonHugePages +fields for each mapping. (Note that AnonHugePages only applies to traditional +PMD-sized THP for historical reasons and should have been called +AnonHugePmdMapped). The number of file transparent huge pages mapped to userspace is available by reading ShmemPmdMapped and ShmemHugePages fields in ``/proc/meminfo``. @@ -413,7 +472,7 @@ for huge pages. Optimizing the applications =========================== -To be guaranteed that the kernel will map a 2M page immediately in any +To be guaranteed that the kernel will map a THP immediately in any memory region, the mmap region has to be hugepage naturally aligned. posix_memalign() can provide that guarantee. diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index 2b59cff8be179..6652b658ee775 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -528,9 +528,9 @@ replaced by copy-on-write) part of the underlying shmem object out on swap. does not take into account swapped out page of underlying shmem objects. "Locked" indicates whether the mapping is locked in memory or not. -"THPeligible" indicates whether the mapping is eligible for allocating THP -pages as well as the THP is PMD mappable or not - 1 if true, 0 otherwise. -It just shows the current status. +"THPeligible" indicates whether the mapping is eligible for allocating +naturally aligned THP pages of any currently enabled size. 1 if true, 0 +otherwise. "VmFlags" field deserves a separate description. This member represents the kernel flags associated with the particular virtual memory area in two letter diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index d0896bf8e12f0..3a07b566e5219 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -865,7 +865,8 @@ static int show_smap(struct seq_file *m, void *v) __show_smap(m, &mss, false); seq_printf(m, "THPeligible: %8u\n", - hugepage_vma_check(vma, vma->vm_flags, true, false, true)); + !!thp_vma_allowable_orders(vma, vma->vm_flags, true, false, + true, THP_ORDERS_ALL)); if (arch_pkeys_enabled()) seq_printf(m, "ProtectionKey: %8u\n", vma_pkey(vma)); diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index fc789c0ac85b8..866ea42b40bac 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -67,6 +67,24 @@ extern struct kobj_attribute shmem_enabled_attr; #define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT) #define HPAGE_PMD_NR (1<vm_start >> PAGE_SHIFT) - vma->vm_pgoff, - HPAGE_PMD_NR)) + hpage_size >> PAGE_SHIFT)) return false; } - haddr = addr & HPAGE_PMD_MASK; + haddr = ALIGN_DOWN(addr, hpage_size); - if (haddr < vma->vm_start || haddr + HPAGE_PMD_SIZE > vma->vm_end) + if (haddr < vma->vm_start || haddr + hpage_size > vma->vm_end) return false; return true; } +/* + * Filter the bitfield of input orders to the ones suitable for use in the vma. + * See thp_vma_suitable_order(). + * All orders that pass the checks are returned as a bitfield. + */ +static inline unsigned long thp_vma_suitable_orders(struct vm_area_struct *vma, + unsigned long addr, unsigned long orders) +{ + int order; + + /* + * Iterate over orders, highest to lowest, removing orders that don't + * meet alignment requirements from the set. Exit loop at first order + * that meets requirements, since all lower orders must also meet + * requirements. + */ + + order = highest_order(orders); + + while (orders) { + if (thp_vma_suitable_order(vma, addr, order)) + break; + order = next_order(&orders, order); + } + + return orders; +} + static inline bool file_thp_enabled(struct vm_area_struct *vma) { struct inode *inode; @@ -130,8 +208,52 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma) !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode); } -bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, - bool smaps, bool in_pf, bool enforce_sysfs); +unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, + unsigned long vm_flags, bool smaps, + bool in_pf, bool enforce_sysfs, + unsigned long orders); + +/** + * thp_vma_allowable_orders - determine hugepage orders that are allowed for vma + * @vma: the vm area to check + * @vm_flags: use these vm_flags instead of vma->vm_flags + * @smaps: whether answer will be used for smaps file + * @in_pf: whether answer will be used by page fault handler + * @enforce_sysfs: whether sysfs config should be taken into account + * @orders: bitfield of all orders to consider + * + * Calculates the intersection of the requested hugepage orders and the allowed + * hugepage orders for the provided vma. Permitted orders are encoded as a set + * bit at the corresponding bit position (bit-2 corresponds to order-2, bit-3 + * corresponds to order-3, etc). Order-0 is never considered a hugepage order. + * + * Return: bitfield of orders allowed for hugepage in the vma. 0 if no hugepage + * orders are allowed. + */ +static inline +unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma, + unsigned long vm_flags, bool smaps, + bool in_pf, bool enforce_sysfs, + unsigned long orders) +{ + /* Optimization to check if required orders are enabled early. */ + if (enforce_sysfs && vma_is_anonymous(vma)) { + unsigned long mask = READ_ONCE(huge_anon_orders_always); + + if (vm_flags & VM_HUGEPAGE) + mask |= READ_ONCE(huge_anon_orders_madvise); + if (hugepage_global_always() || + ((vm_flags & VM_HUGEPAGE) && hugepage_global_enabled())) + mask |= READ_ONCE(huge_anon_orders_inherit); + + orders &= mask; + if (!orders) + return 0; + } + + return __thp_vma_allowable_orders(vma, vm_flags, smaps, in_pf, + enforce_sysfs, orders); +} #define transparent_hugepage_use_zero_page() \ (transparent_hugepage_flags & \ @@ -285,17 +407,24 @@ static inline bool folio_test_pmd_mappable(struct folio *folio) return false; } -static inline bool transhuge_vma_suitable(struct vm_area_struct *vma, - unsigned long addr) +static inline bool thp_vma_suitable_order(struct vm_area_struct *vma, + unsigned long addr, int order) { return false; } -static inline bool hugepage_vma_check(struct vm_area_struct *vma, - unsigned long vm_flags, bool smaps, - bool in_pf, bool enforce_sysfs) +static inline unsigned long thp_vma_suitable_orders(struct vm_area_struct *vma, + unsigned long addr, unsigned long orders) { - return false; + return 0; +} + +static inline unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma, + unsigned long vm_flags, bool smaps, + bool in_pf, bool enforce_sysfs, + unsigned long orders) +{ + return 0; } static inline void folio_prep_large_rmappable(struct folio *folio) {} diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9eefcbc042810..2ef540aa98316 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -71,19 +71,30 @@ static struct shrinker deferred_split_shrinker; static atomic_t huge_zero_refcount; struct page *huge_zero_page __read_mostly; unsigned long huge_zero_pfn __read_mostly = ~0UL; +unsigned long huge_anon_orders_always __read_mostly; +unsigned long huge_anon_orders_madvise __read_mostly; +unsigned long huge_anon_orders_inherit __read_mostly; + +unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, + unsigned long vm_flags, bool smaps, + bool in_pf, bool enforce_sysfs, + unsigned long orders) +{ + /* Check the intersection of requested and supported orders. */ + orders &= vma_is_anonymous(vma) ? + THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE; + if (!orders) + return 0; -bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, - bool smaps, bool in_pf, bool enforce_sysfs) -{ if (!vma->vm_mm) /* vdso */ - return false; + return 0; if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags)) - return false; + return 0; /* khugepaged doesn't collapse DAX vma, but page fault is fine. */ if (vma_is_dax(vma)) - return in_pf; + return in_pf ? orders : 0; /* * khugepaged special VMA and hugetlb VMA. @@ -91,17 +102,29 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, * VM_MIXEDMAP set. */ if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED)) - return false; + return 0; /* - * Check alignment for file vma and size for both file and anon vma. + * Check alignment for file vma and size for both file and anon vma by + * filtering out the unsuitable orders. * * Skip the check for page fault. Huge fault does the check in fault - * handlers. And this check is not suitable for huge PUD fault. + * handlers. */ - if (!in_pf && - !transhuge_vma_suitable(vma, (vma->vm_end - HPAGE_PMD_SIZE))) - return false; + if (!in_pf) { + int order = highest_order(orders); + unsigned long addr; + + while (orders) { + addr = vma->vm_end - (PAGE_SIZE << order); + if (thp_vma_suitable_order(vma, addr, order)) + break; + order = next_order(&orders, order); + } + + if (!orders) + return 0; + } /* * Enabled via shmem mount options or sysfs settings. @@ -110,29 +133,33 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, */ if (!in_pf && shmem_file(vma->vm_file)) return shmem_is_huge(file_inode(vma->vm_file), vma->vm_pgoff, - !enforce_sysfs, vma->vm_mm, vm_flags); - - /* Enforce sysfs THP requirements as necessary */ - if (enforce_sysfs && - (!hugepage_flags_enabled() || (!(vm_flags & VM_HUGEPAGE) && - !hugepage_flags_always()))) - return false; + !enforce_sysfs, vma->vm_mm, vm_flags) + ? orders : 0; if (!vma_is_anonymous(vma)) { + /* + * Enforce sysfs THP requirements as necessary. Anonymous vmas + * were already handled in thp_vma_allowable_orders(). + */ + if (enforce_sysfs && + (!hugepage_global_enabled() || (!(vm_flags & VM_HUGEPAGE) && + !hugepage_global_always()))) + return 0; + /* * Trust that ->huge_fault() handlers know what they are doing * in fault path. */ if (((in_pf || smaps)) && vma->vm_ops->huge_fault) - return true; + return orders; /* Only regular file is valid in collapse path */ if (((!in_pf || smaps)) && file_thp_enabled(vma)) - return true; - return false; + return orders; + return 0; } if (vma_is_temporary_stack(vma)) - return false; + return 0; /* * THPeligible bit of smaps should show 1 for proper VMAs even @@ -142,9 +169,9 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags, * the first page fault. */ if (!vma->anon_vma) - return (smaps || in_pf); + return (smaps || in_pf) ? orders : 0; - return true; + return orders; } static bool get_huge_zero_page(void) @@ -402,9 +429,136 @@ static const struct attribute_group hugepage_attr_group = { .attrs = hugepage_attr, }; +static void hugepage_exit_sysfs(struct kobject *hugepage_kobj); +static void thpsize_release(struct kobject *kobj); +static DEFINE_SPINLOCK(huge_anon_orders_lock); +static LIST_HEAD(thpsize_list); + +struct thpsize { + struct kobject kobj; + struct list_head node; + int order; +}; + +#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj) + +static ssize_t thpsize_enabled_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + int order = to_thpsize(kobj)->order; + const char *output; + + if (test_bit(order, &huge_anon_orders_always)) + output = "[always] inherit madvise never"; + else if (test_bit(order, &huge_anon_orders_inherit)) + output = "always [inherit] madvise never"; + else if (test_bit(order, &huge_anon_orders_madvise)) + output = "always inherit [madvise] never"; + else + output = "always inherit madvise [never]"; + + return sysfs_emit(buf, "%s\n", output); +} + +static ssize_t thpsize_enabled_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + int order = to_thpsize(kobj)->order; + ssize_t ret = count; + + if (sysfs_streq(buf, "always")) { + spin_lock(&huge_anon_orders_lock); + clear_bit(order, &huge_anon_orders_inherit); + clear_bit(order, &huge_anon_orders_madvise); + set_bit(order, &huge_anon_orders_always); + spin_unlock(&huge_anon_orders_lock); + } else if (sysfs_streq(buf, "inherit")) { + spin_lock(&huge_anon_orders_lock); + clear_bit(order, &huge_anon_orders_always); + clear_bit(order, &huge_anon_orders_madvise); + set_bit(order, &huge_anon_orders_inherit); + spin_unlock(&huge_anon_orders_lock); + } else if (sysfs_streq(buf, "madvise")) { + spin_lock(&huge_anon_orders_lock); + clear_bit(order, &huge_anon_orders_always); + clear_bit(order, &huge_anon_orders_inherit); + set_bit(order, &huge_anon_orders_madvise); + spin_unlock(&huge_anon_orders_lock); + } else if (sysfs_streq(buf, "never")) { + spin_lock(&huge_anon_orders_lock); + clear_bit(order, &huge_anon_orders_always); + clear_bit(order, &huge_anon_orders_inherit); + clear_bit(order, &huge_anon_orders_madvise); + spin_unlock(&huge_anon_orders_lock); + } else + ret = -EINVAL; + + return ret; +} + +static struct kobj_attribute thpsize_enabled_attr = + __ATTR(enabled, 0644, thpsize_enabled_show, thpsize_enabled_store); + +static struct attribute *thpsize_attrs[] = { + &thpsize_enabled_attr.attr, + NULL, +}; + +static const struct attribute_group thpsize_attr_group = { + .attrs = thpsize_attrs, +}; + +static const struct kobj_type thpsize_ktype = { + .release = &thpsize_release, + .sysfs_ops = &kobj_sysfs_ops, +}; + +static struct thpsize *thpsize_create(int order, struct kobject *parent) +{ + unsigned long size = (PAGE_SIZE << order) / SZ_1K; + struct thpsize *thpsize; + int ret; + + thpsize = kzalloc(sizeof(*thpsize), GFP_KERNEL); + if (!thpsize) + return ERR_PTR(-ENOMEM); + + ret = kobject_init_and_add(&thpsize->kobj, &thpsize_ktype, parent, + "hugepages-%lukB", size); + if (ret) { + kfree(thpsize); + return ERR_PTR(ret); + } + + ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group); + if (ret) { + kobject_put(&thpsize->kobj); + return ERR_PTR(ret); + } + + thpsize->order = order; + return thpsize; +} + +static void thpsize_release(struct kobject *kobj) +{ + kfree(to_thpsize(kobj)); +} + static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) { int err; + struct thpsize *thpsize; + unsigned long orders; + int order; + + /* + * Default to setting PMD-sized THP to inherit the global setting and + * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time + * constant so we have to do this here. + */ + huge_anon_orders_inherit = BIT(PMD_ORDER); *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); if (unlikely(!*hugepage_kobj)) { @@ -424,8 +578,24 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) goto remove_hp_group; } + orders = THP_ORDERS_ALL_ANON; + order = highest_order(orders); + while (orders) { + thpsize = thpsize_create(order, *hugepage_kobj); + if (IS_ERR(thpsize)) { + pr_err("failed to create thpsize for order %d\n", order); + err = PTR_ERR(thpsize); + goto remove_all; + } + list_add(&thpsize->node, &thpsize_list); + order = next_order(&orders, order); + } + return 0; +remove_all: + hugepage_exit_sysfs(*hugepage_kobj); + return err; remove_hp_group: sysfs_remove_group(*hugepage_kobj, &hugepage_attr_group); delete_obj: @@ -435,6 +605,13 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) static void __init hugepage_exit_sysfs(struct kobject *hugepage_kobj) { + struct thpsize *thpsize, *tmp; + + list_for_each_entry_safe(thpsize, tmp, &thpsize_list, node) { + list_del(&thpsize->node); + kobject_put(&thpsize->kobj); + } + sysfs_remove_group(hugepage_kobj, &khugepaged_attr_group); sysfs_remove_group(hugepage_kobj, &hugepage_attr_group); kobject_put(hugepage_kobj); @@ -777,7 +954,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf) struct folio *folio; unsigned long haddr = vmf->address & HPAGE_PMD_MASK; - if (!transhuge_vma_suitable(vma, haddr)) + if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER)) return VM_FAULT_FALLBACK; if (unlikely(anon_vma_prepare(vma))) return VM_FAULT_OOM; diff --git a/mm/khugepaged.c b/mm/khugepaged.c index f227b39ae4cf7..b9f2af46b52ca 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -446,7 +446,8 @@ void khugepaged_enter_vma(struct vm_area_struct *vma, { if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) && hugepage_flags_enabled()) { - if (hugepage_vma_check(vma, vm_flags, false, false, true)) + if (thp_vma_allowable_order(vma, vm_flags, false, false, true, + PMD_ORDER)) __khugepaged_enter(vma->vm_mm); } } @@ -907,16 +908,16 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address, if (!vma) return SCAN_VMA_NULL; - if (!transhuge_vma_suitable(vma, address)) + if (!thp_vma_suitable_order(vma, address, PMD_ORDER)) return SCAN_ADDRESS_RANGE; - if (!hugepage_vma_check(vma, vma->vm_flags, false, false, - cc->is_khugepaged)) + if (!thp_vma_allowable_order(vma, vma->vm_flags, false, false, + cc->is_khugepaged, PMD_ORDER)) return SCAN_VMA_CHECK; /* * Anon VMA expected, the address may be unmapped then * remapped to file after khugepaged reaquired the mmap_lock. * - * hugepage_vma_check may return true for qualified file + * thp_vma_allowable_order may return true for qualified file * vmas. */ if (expect_anon && (!(*vmap)->anon_vma || !vma_is_anonymous(*vmap))) @@ -1492,7 +1493,8 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, * and map it by a PMD, regardless of sysfs THP settings. As such, let's * analogously elide sysfs THP settings here. */ - if (!hugepage_vma_check(vma, vma->vm_flags, false, false, false)) + if (!thp_vma_allowable_order(vma, vma->vm_flags, false, false, false, + PMD_ORDER)) return SCAN_VMA_CHECK; /* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */ @@ -2364,7 +2366,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, progress++; break; } - if (!hugepage_vma_check(vma, vma->vm_flags, false, false, true)) { + if (!thp_vma_allowable_order(vma, vma->vm_flags, false, false, + true, PMD_ORDER)) { skip: progress++; continue; @@ -2701,7 +2704,8 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, *prev = vma; - if (!hugepage_vma_check(vma, vma->vm_flags, false, false, false)) + if (!thp_vma_allowable_order(vma, vma->vm_flags, false, false, false, + PMD_ORDER)) return -EINVAL; cc = kmalloc(sizeof(*cc), GFP_KERNEL); diff --git a/mm/memory.c b/mm/memory.c index 8b80db7115e5f..0de5c14c8d898 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4301,7 +4301,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags)) return ret; - if (!transhuge_vma_suitable(vma, haddr)) + if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER)) return ret; page = compound_head(page); @@ -5099,7 +5099,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, return VM_FAULT_OOM; retry_pud: if (pud_none(*vmf.pud) && - hugepage_vma_check(vma, vm_flags, false, true, true)) { + thp_vma_allowable_order(vma, vm_flags, false, true, true, PUD_ORDER)) { ret = create_huge_pud(&vmf); if (!(ret & VM_FAULT_FALLBACK)) return ret; @@ -5133,7 +5133,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, goto retry_pud; if (pmd_none(*vmf.pmd) && - hugepage_vma_check(vma, vm_flags, false, true, true)) { + thp_vma_allowable_order(vma, vm_flags, false, true, true, PMD_ORDER)) { ret = create_huge_pmd(&vmf); if (!(ret & VM_FAULT_FALLBACK)) return ret; diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index dcc1ee3d059e9..032d71f9876b6 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -273,7 +273,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) * cleared *pmd but not decremented compound_mapcount(). */ if ((pvmw->flags & PVMW_SYNC) && - transhuge_vma_suitable(vma, pvmw->address) && + thp_vma_suitable_order(vma, pvmw->address, + PMD_ORDER) && (pvmw->nr_pages >= HPAGE_PMD_NR)) { spinlock_t *ptl = pmd_lock(mm, pvmw->pmd); From bc55df30c0868266f2e5047cda99e13031f51367 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:05 +0000 Subject: [PATCH 118/548] mm: thp: support allocation of anonymous multi-size THP Introduce the logic to allow THP to be configured (through the new sysfs interface we just added) to allocate large folios to back anonymous memory, which are larger than the base page size but smaller than PMD-size. We call this new THP extension "multi-size THP" (mTHP). mTHP continues to be PTE-mapped, but in many cases can still provide similar benefits to traditional PMD-sized THP: Page faults are significantly reduced (by a factor of e.g. 4, 8, 16, etc. depending on the configured order), but latency spikes are much less prominent because the size of each page isn't as huge as the PMD-sized variant and there is less memory to clear in each page fault. The number of per-page operations (e.g. ref counting, rmap management, lru list management) are also significantly reduced since those ops now become per-folio. Some architectures also employ TLB compression mechanisms to squeeze more entries in when a set of PTEs are virtually and physically contiguous and approporiately aligned. In this case, TLB misses will occur less often. The new behaviour is disabled by default, but can be enabled at runtime by writing to /sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled (see documentation in previous commit). The long term aim is to change the default to include suitable lower orders, but there are some risks around internal fragmentation that need to be better understood first. [ryan.roberts@arm.com: resolve some multi-size THP review nits] Link: https://lkml.kernel.org/r/20231214160251.3574571-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20231207161211.2374093-5-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: Kefeng Wang Tested-by: John Hubbard Acked-by: David Hildenbrand Cc: Alistair Popple Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 19eaf44954df64f9bc8dec398219e15ad0811497) Signed-off-by: Koba Ko --- include/linux/huge_mm.h | 6 ++- mm/memory.c | 109 ++++++++++++++++++++++++++++++++++++---- 2 files changed, 104 insertions(+), 11 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 866ea42b40bac..09179237dac4b 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -68,9 +68,11 @@ extern struct kobj_attribute shmem_enabled_attr; #define HPAGE_PMD_NR (1<vma; + unsigned long orders; + struct folio *folio; + unsigned long addr; + pte_t *pte; + gfp_t gfp; + int order; + + /* + * If uffd is active for the vma we need per-page fault fidelity to + * maintain the uffd semantics. + */ + if (unlikely(userfaultfd_armed(vma))) + goto fallback; + + /* + * Get a list of all the (large) orders below PMD_ORDER that are enabled + * for this vma. Then filter out the orders that can't be allocated over + * the faulting address and still be fully contained in the vma. + */ + orders = thp_vma_allowable_orders(vma, vma->vm_flags, false, true, true, + BIT(PMD_ORDER) - 1); + orders = thp_vma_suitable_orders(vma, vmf->address, orders); + + if (!orders) + goto fallback; + + pte = pte_offset_map(vmf->pmd, vmf->address & PMD_MASK); + if (!pte) + return ERR_PTR(-EAGAIN); + + /* + * Find the highest order where the aligned range is completely + * pte_none(). Note that all remaining orders will be completely + * pte_none(). + */ + order = highest_order(orders); + while (orders) { + addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order); + if (pte_range_none(pte + pte_index(addr), 1 << order)) + break; + order = next_order(&orders, order); + } + + pte_unmap(pte); + + /* Try allocating the highest of the remaining orders. */ + gfp = vma_thp_gfp_mask(vma); + while (orders) { + addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order); + folio = vma_alloc_folio(gfp, order, vma, addr, true); + if (folio) { + clear_huge_page(&folio->page, vmf->address, 1 << order); + return folio; + } + order = next_order(&orders, order); + } + +fallback: +#endif + return vma_alloc_zeroed_movable_folio(vmf->vma, vmf->address); +} + /* * We enter with non-exclusive mmap_lock (to exclude vma changes, * but allow concurrent faults), and pte mapped but not yet locked. @@ -4104,9 +4182,12 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) { bool uffd_wp = vmf_orig_pte_uffd_wp(vmf); struct vm_area_struct *vma = vmf->vma; + unsigned long addr = vmf->address; struct folio *folio; vm_fault_t ret = 0; + int nr_pages = 1; pte_t entry; + int i; /* File mapping without ->vm_ops ? */ if (vma->vm_flags & VM_SHARED) @@ -4146,10 +4227,16 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) /* Allocate our own private page. */ if (unlikely(anon_vma_prepare(vma))) goto oom; - folio = vma_alloc_zeroed_movable_folio(vma, vmf->address); + /* Returns NULL on OOM or ERR_PTR(-EAGAIN) if we must retry the fault */ + folio = alloc_anon_folio(vmf); + if (IS_ERR(folio)) + return 0; if (!folio) goto oom; + nr_pages = folio_nr_pages(folio); + addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE); + if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL)) goto oom_free_page; folio_throttle_swaprate(folio, GFP_KERNEL); @@ -4166,12 +4253,15 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) if (vma->vm_flags & VM_WRITE) entry = pte_mkwrite(pte_mkdirty(entry), vma); - vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, - &vmf->ptl); + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl); if (!vmf->pte) goto release; - if (vmf_pte_changed(vmf)) { - update_mmu_tlb(vma, vmf->address, vmf->pte); + if (nr_pages == 1 && vmf_pte_changed(vmf)) { + update_mmu_tlb(vma, addr, vmf->pte); + goto release; + } else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) { + for (i = 0; i < nr_pages; i++) + update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i); goto release; } @@ -4186,16 +4276,17 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) return handle_userfault(vmf, VM_UFFD_MISSING); } - inc_mm_counter(vma->vm_mm, MM_ANONPAGES); - folio_add_new_anon_rmap(folio, vma, vmf->address); + folio_ref_add(folio, nr_pages - 1); + add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages); + folio_add_new_anon_rmap(folio, vma, addr); folio_add_lru_vma(folio, vma); setpte: if (uffd_wp) entry = pte_mkuffd_wp(entry); - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); + set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages); /* No need to invalidate - it was non-present before */ - update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1); + update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages); unlock: if (vmf->pte) pte_unmap_unlock(vmf->pte, vmf->ptl); From b31f1fd8ce11a72775283454b96dbd33c76c83cb Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:06 +0000 Subject: [PATCH 119/548] selftests/mm/kugepaged: restore thp settings at exit Previously, the saved thp settings would be restored upon a signal or at the natural end of the test suite. But there are some tests that directly call exit() upon failure. In this case, the thp settings were not being restored, which could then influence other tests. Fix this by installing an atexit() handler to do the actual restore. The signal handler can now just call exit() and the atexit handler is invoked. Link: https://lkml.kernel.org/r/20231207161211.2374093-6-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: Alistair Popple Reviewed-by: David Hildenbrand Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit b6aab3384cafba151c53d3b5f7e1f8d073aadf03) Signed-off-by: Koba Ko --- tools/testing/selftests/mm/khugepaged.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c index 030667cb55337..fc47a1c4944c4 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -374,18 +374,22 @@ static void pop_settings(void) write_settings(current_settings()); } -static void restore_settings(int sig) +static void restore_settings_atexit(void) { if (skip_settings_restore) - goto out; + return; printf("Restore THP and khugepaged settings..."); write_settings(&saved_settings); success("OK"); - if (sig) - exit(EXIT_FAILURE); -out: - exit(exit_status); + + skip_settings_restore = true; +} + +static void restore_settings(int sig) +{ + /* exit() will invoke the restore_settings_atexit handler. */ + exit(sig ? EXIT_FAILURE : exit_status); } static void save_settings(void) @@ -415,6 +419,7 @@ static void save_settings(void) success("OK"); + atexit(restore_settings_atexit); signal(SIGTERM, restore_settings); signal(SIGINT, restore_settings); signal(SIGHUP, restore_settings); From a5bce0d8d1d793b07998dcca10d3a07e9812117f Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:07 +0000 Subject: [PATCH 120/548] selftests/mm: factor out thp settings management The khugepaged test has a useful framework for save/restore/pop/push of all thp settings via the sysfs interface. This will be useful to explicitly control multi-size THP settings in other tests, so let's move it out of khugepaged and into its own thp_settings.[c|h] utility. Link: https://lkml.kernel.org/r/20231207161211.2374093-7-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: Alistair Popple Acked-by: David Hildenbrand Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 00679a183ac6d2584723cfc2a2c07c8285f802dc) Signed-off-by: Koba Ko --- tools/testing/selftests/mm/Makefile | 4 +- tools/testing/selftests/mm/khugepaged.c | 346 ++-------------------- tools/testing/selftests/mm/thp_settings.c | 296 ++++++++++++++++++ tools/testing/selftests/mm/thp_settings.h | 71 +++++ 4 files changed, 391 insertions(+), 326 deletions(-) create mode 100644 tools/testing/selftests/mm/thp_settings.c create mode 100644 tools/testing/selftests/mm/thp_settings.h diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index c9fcbc6e5121e..f64ec79d772ee 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -117,8 +117,8 @@ TEST_FILES += va_high_addr_switch.sh include ../lib.mk -$(TEST_GEN_PROGS): vm_util.c -$(TEST_GEN_FILES): vm_util.c +$(TEST_GEN_PROGS): vm_util.c thp_settings.c +$(TEST_GEN_FILES): vm_util.c thp_settings.c $(OUTPUT)/uffd-stress: uffd-common.c $(OUTPUT)/uffd-unit-tests: uffd-common.c diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c index fc47a1c4944c4..b15e7fd701769 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -22,13 +22,13 @@ #include "linux/magic.h" #include "vm_util.h" +#include "thp_settings.h" #define BASE_ADDR ((void *)(1UL << 30)) static unsigned long hpage_pmd_size; static unsigned long page_size; static int hpage_pmd_nr; -#define THP_SYSFS "/sys/kernel/mm/transparent_hugepage/" #define PID_SMAPS "/proc/self/smaps" #define TEST_FILE "collapse_test_file" @@ -71,78 +71,7 @@ struct file_info { }; static struct file_info finfo; - -enum thp_enabled { - THP_ALWAYS, - THP_MADVISE, - THP_NEVER, -}; - -static const char *thp_enabled_strings[] = { - "always", - "madvise", - "never", - NULL -}; - -enum thp_defrag { - THP_DEFRAG_ALWAYS, - THP_DEFRAG_DEFER, - THP_DEFRAG_DEFER_MADVISE, - THP_DEFRAG_MADVISE, - THP_DEFRAG_NEVER, -}; - -static const char *thp_defrag_strings[] = { - "always", - "defer", - "defer+madvise", - "madvise", - "never", - NULL -}; - -enum shmem_enabled { - SHMEM_ALWAYS, - SHMEM_WITHIN_SIZE, - SHMEM_ADVISE, - SHMEM_NEVER, - SHMEM_DENY, - SHMEM_FORCE, -}; - -static const char *shmem_enabled_strings[] = { - "always", - "within_size", - "advise", - "never", - "deny", - "force", - NULL -}; - -struct khugepaged_settings { - bool defrag; - unsigned int alloc_sleep_millisecs; - unsigned int scan_sleep_millisecs; - unsigned int max_ptes_none; - unsigned int max_ptes_swap; - unsigned int max_ptes_shared; - unsigned long pages_to_scan; -}; - -struct settings { - enum thp_enabled thp_enabled; - enum thp_defrag thp_defrag; - enum shmem_enabled shmem_enabled; - bool use_zero_page; - struct khugepaged_settings khugepaged; - unsigned long read_ahead_kb; -}; - -static struct settings saved_settings; static bool skip_settings_restore; - static int exit_status; static void success(const char *msg) @@ -161,226 +90,13 @@ static void skip(const char *msg) printf(" \e[33m%s\e[0m\n", msg); } -static int read_file(const char *path, char *buf, size_t buflen) -{ - int fd; - ssize_t numread; - - fd = open(path, O_RDONLY); - if (fd == -1) - return 0; - - numread = read(fd, buf, buflen - 1); - if (numread < 1) { - close(fd); - return 0; - } - - buf[numread] = '\0'; - close(fd); - - return (unsigned int) numread; -} - -static int write_file(const char *path, const char *buf, size_t buflen) -{ - int fd; - ssize_t numwritten; - - fd = open(path, O_WRONLY); - if (fd == -1) { - printf("open(%s)\n", path); - exit(EXIT_FAILURE); - return 0; - } - - numwritten = write(fd, buf, buflen - 1); - close(fd); - if (numwritten < 1) { - printf("write(%s)\n", buf); - exit(EXIT_FAILURE); - return 0; - } - - return (unsigned int) numwritten; -} - -static int read_string(const char *name, const char *strings[]) -{ - char path[PATH_MAX]; - char buf[256]; - char *c; - int ret; - - ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); - if (ret >= PATH_MAX) { - printf("%s: Pathname is too long\n", __func__); - exit(EXIT_FAILURE); - } - - if (!read_file(path, buf, sizeof(buf))) { - perror(path); - exit(EXIT_FAILURE); - } - - c = strchr(buf, '['); - if (!c) { - printf("%s: Parse failure\n", __func__); - exit(EXIT_FAILURE); - } - - c++; - memmove(buf, c, sizeof(buf) - (c - buf)); - - c = strchr(buf, ']'); - if (!c) { - printf("%s: Parse failure\n", __func__); - exit(EXIT_FAILURE); - } - *c = '\0'; - - ret = 0; - while (strings[ret]) { - if (!strcmp(strings[ret], buf)) - return ret; - ret++; - } - - printf("Failed to parse %s\n", name); - exit(EXIT_FAILURE); -} - -static void write_string(const char *name, const char *val) -{ - char path[PATH_MAX]; - int ret; - - ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); - if (ret >= PATH_MAX) { - printf("%s: Pathname is too long\n", __func__); - exit(EXIT_FAILURE); - } - - if (!write_file(path, val, strlen(val) + 1)) { - perror(path); - exit(EXIT_FAILURE); - } -} - -static const unsigned long _read_num(const char *path) -{ - char buf[21]; - - if (read_file(path, buf, sizeof(buf)) < 0) { - perror("read_file(read_num)"); - exit(EXIT_FAILURE); - } - - return strtoul(buf, NULL, 10); -} - -static const unsigned long read_num(const char *name) -{ - char path[PATH_MAX]; - int ret; - - ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); - if (ret >= PATH_MAX) { - printf("%s: Pathname is too long\n", __func__); - exit(EXIT_FAILURE); - } - return _read_num(path); -} - -static void _write_num(const char *path, unsigned long num) -{ - char buf[21]; - - sprintf(buf, "%ld", num); - if (!write_file(path, buf, strlen(buf) + 1)) { - perror(path); - exit(EXIT_FAILURE); - } -} - -static void write_num(const char *name, unsigned long num) -{ - char path[PATH_MAX]; - int ret; - - ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); - if (ret >= PATH_MAX) { - printf("%s: Pathname is too long\n", __func__); - exit(EXIT_FAILURE); - } - _write_num(path, num); -} - -static void write_settings(struct settings *settings) -{ - struct khugepaged_settings *khugepaged = &settings->khugepaged; - - write_string("enabled", thp_enabled_strings[settings->thp_enabled]); - write_string("defrag", thp_defrag_strings[settings->thp_defrag]); - write_string("shmem_enabled", - shmem_enabled_strings[settings->shmem_enabled]); - write_num("use_zero_page", settings->use_zero_page); - - write_num("khugepaged/defrag", khugepaged->defrag); - write_num("khugepaged/alloc_sleep_millisecs", - khugepaged->alloc_sleep_millisecs); - write_num("khugepaged/scan_sleep_millisecs", - khugepaged->scan_sleep_millisecs); - write_num("khugepaged/max_ptes_none", khugepaged->max_ptes_none); - write_num("khugepaged/max_ptes_swap", khugepaged->max_ptes_swap); - write_num("khugepaged/max_ptes_shared", khugepaged->max_ptes_shared); - write_num("khugepaged/pages_to_scan", khugepaged->pages_to_scan); - - if (file_ops && finfo.type == VMA_FILE) - _write_num(finfo.dev_queue_read_ahead_path, - settings->read_ahead_kb); -} - -#define MAX_SETTINGS_DEPTH 4 -static struct settings settings_stack[MAX_SETTINGS_DEPTH]; -static int settings_index; - -static struct settings *current_settings(void) -{ - if (!settings_index) { - printf("Fail: No settings set"); - exit(EXIT_FAILURE); - } - return settings_stack + settings_index - 1; -} - -static void push_settings(struct settings *settings) -{ - if (settings_index >= MAX_SETTINGS_DEPTH) { - printf("Fail: Settings stack exceeded"); - exit(EXIT_FAILURE); - } - settings_stack[settings_index++] = *settings; - write_settings(current_settings()); -} - -static void pop_settings(void) -{ - if (settings_index <= 0) { - printf("Fail: Settings stack empty"); - exit(EXIT_FAILURE); - } - --settings_index; - write_settings(current_settings()); -} - static void restore_settings_atexit(void) { if (skip_settings_restore) return; printf("Restore THP and khugepaged settings..."); - write_settings(&saved_settings); + thp_restore_settings(); success("OK"); skip_settings_restore = true; @@ -395,27 +111,9 @@ static void restore_settings(int sig) static void save_settings(void) { printf("Save THP and khugepaged settings..."); - saved_settings = (struct settings) { - .thp_enabled = read_string("enabled", thp_enabled_strings), - .thp_defrag = read_string("defrag", thp_defrag_strings), - .shmem_enabled = - read_string("shmem_enabled", shmem_enabled_strings), - .use_zero_page = read_num("use_zero_page"), - }; - saved_settings.khugepaged = (struct khugepaged_settings) { - .defrag = read_num("khugepaged/defrag"), - .alloc_sleep_millisecs = - read_num("khugepaged/alloc_sleep_millisecs"), - .scan_sleep_millisecs = - read_num("khugepaged/scan_sleep_millisecs"), - .max_ptes_none = read_num("khugepaged/max_ptes_none"), - .max_ptes_swap = read_num("khugepaged/max_ptes_swap"), - .max_ptes_shared = read_num("khugepaged/max_ptes_shared"), - .pages_to_scan = read_num("khugepaged/pages_to_scan"), - }; if (file_ops && finfo.type == VMA_FILE) - saved_settings.read_ahead_kb = - _read_num(finfo.dev_queue_read_ahead_path); + thp_set_read_ahead_path(finfo.dev_queue_read_ahead_path); + thp_save_settings(); success("OK"); @@ -798,7 +496,7 @@ static void __madvise_collapse(const char *msg, char *p, int nr_hpages, struct mem_ops *ops, bool expect) { int ret; - struct settings settings = *current_settings(); + struct thp_settings settings = *thp_current_settings(); printf("%s...", msg); @@ -808,7 +506,7 @@ static void __madvise_collapse(const char *msg, char *p, int nr_hpages, */ settings.thp_enabled = THP_NEVER; settings.shmem_enabled = SHMEM_NEVER; - push_settings(&settings); + thp_push_settings(&settings); /* Clear VM_NOHUGEPAGE */ madvise(p, nr_hpages * hpage_pmd_size, MADV_HUGEPAGE); @@ -820,7 +518,7 @@ static void __madvise_collapse(const char *msg, char *p, int nr_hpages, else success("OK"); - pop_settings(); + thp_pop_settings(); } static void madvise_collapse(const char *msg, char *p, int nr_hpages, @@ -850,13 +548,13 @@ static bool wait_for_scan(const char *msg, char *p, int nr_hpages, madvise(p, nr_hpages * hpage_pmd_size, MADV_HUGEPAGE); /* Wait until the second full_scan completed */ - full_scans = read_num("khugepaged/full_scans") + 2; + full_scans = thp_read_num("khugepaged/full_scans") + 2; printf("%s...", msg); while (timeout--) { if (ops->check_huge(p, nr_hpages)) break; - if (read_num("khugepaged/full_scans") >= full_scans) + if (thp_read_num("khugepaged/full_scans") >= full_scans) break; printf("."); usleep(TICK); @@ -911,11 +609,11 @@ static bool is_tmpfs(struct mem_ops *ops) static void alloc_at_fault(void) { - struct settings settings = *current_settings(); + struct thp_settings settings = *thp_current_settings(); char *p; settings.thp_enabled = THP_ALWAYS; - push_settings(&settings); + thp_push_settings(&settings); p = alloc_mapping(1); *p = 1; @@ -925,7 +623,7 @@ static void alloc_at_fault(void) else fail("Fail"); - pop_settings(); + thp_pop_settings(); madvise(p, page_size, MADV_DONTNEED); printf("Split huge PMD on MADV_DONTNEED..."); @@ -973,11 +671,11 @@ static void collapse_single_pte_entry(struct collapse_context *c, struct mem_ops static void collapse_max_ptes_none(struct collapse_context *c, struct mem_ops *ops) { int max_ptes_none = hpage_pmd_nr / 2; - struct settings settings = *current_settings(); + struct thp_settings settings = *thp_current_settings(); void *p; settings.khugepaged.max_ptes_none = max_ptes_none; - push_settings(&settings); + thp_push_settings(&settings); p = ops->setup_area(1); @@ -1002,7 +700,7 @@ static void collapse_max_ptes_none(struct collapse_context *c, struct mem_ops *o } skip: ops->cleanup_area(p, hpage_pmd_size); - pop_settings(); + thp_pop_settings(); } static void collapse_swapin_single_pte(struct collapse_context *c, struct mem_ops *ops) @@ -1033,7 +731,7 @@ static void collapse_swapin_single_pte(struct collapse_context *c, struct mem_op static void collapse_max_ptes_swap(struct collapse_context *c, struct mem_ops *ops) { - int max_ptes_swap = read_num("khugepaged/max_ptes_swap"); + int max_ptes_swap = thp_read_num("khugepaged/max_ptes_swap"); void *p; p = ops->setup_area(1); @@ -1250,11 +948,11 @@ static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *o fail("Fail"); ops->fault(p, 0, page_size); - write_num("khugepaged/max_ptes_shared", hpage_pmd_nr - 1); + thp_write_num("khugepaged/max_ptes_shared", hpage_pmd_nr - 1); c->collapse("Collapse PTE table full of compound pages in child", p, 1, ops, true); - write_num("khugepaged/max_ptes_shared", - current_settings()->khugepaged.max_ptes_shared); + thp_write_num("khugepaged/max_ptes_shared", + thp_current_settings()->khugepaged.max_ptes_shared); validate_memory(p, 0, hpage_pmd_size); ops->cleanup_area(p, hpage_pmd_size); @@ -1275,7 +973,7 @@ static void collapse_fork_compound(struct collapse_context *c, struct mem_ops *o static void collapse_max_ptes_shared(struct collapse_context *c, struct mem_ops *ops) { - int max_ptes_shared = read_num("khugepaged/max_ptes_shared"); + int max_ptes_shared = thp_read_num("khugepaged/max_ptes_shared"); int wstatus; void *p; @@ -1443,7 +1141,7 @@ static void parse_test_type(int argc, const char **argv) int main(int argc, const char **argv) { - struct settings default_settings = { + struct thp_settings default_settings = { .thp_enabled = THP_MADVISE, .thp_defrag = THP_DEFRAG_ALWAYS, .shmem_enabled = SHMEM_ADVISE, @@ -1484,7 +1182,7 @@ int main(int argc, const char **argv) default_settings.khugepaged.pages_to_scan = hpage_pmd_nr * 8; save_settings(); - push_settings(&default_settings); + thp_push_settings(&default_settings); alloc_at_fault(); diff --git a/tools/testing/selftests/mm/thp_settings.c b/tools/testing/selftests/mm/thp_settings.c new file mode 100644 index 0000000000000..5e8ec792cac70 --- /dev/null +++ b/tools/testing/selftests/mm/thp_settings.c @@ -0,0 +1,296 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include + +#include "thp_settings.h" + +#define THP_SYSFS "/sys/kernel/mm/transparent_hugepage/" +#define MAX_SETTINGS_DEPTH 4 +static struct thp_settings settings_stack[MAX_SETTINGS_DEPTH]; +static int settings_index; +static struct thp_settings saved_settings; +static char dev_queue_read_ahead_path[PATH_MAX]; + +static const char * const thp_enabled_strings[] = { + "always", + "madvise", + "never", + NULL +}; + +static const char * const thp_defrag_strings[] = { + "always", + "defer", + "defer+madvise", + "madvise", + "never", + NULL +}; + +static const char * const shmem_enabled_strings[] = { + "always", + "within_size", + "advise", + "never", + "deny", + "force", + NULL +}; + +int read_file(const char *path, char *buf, size_t buflen) +{ + int fd; + ssize_t numread; + + fd = open(path, O_RDONLY); + if (fd == -1) + return 0; + + numread = read(fd, buf, buflen - 1); + if (numread < 1) { + close(fd); + return 0; + } + + buf[numread] = '\0'; + close(fd); + + return (unsigned int) numread; +} + +int write_file(const char *path, const char *buf, size_t buflen) +{ + int fd; + ssize_t numwritten; + + fd = open(path, O_WRONLY); + if (fd == -1) { + printf("open(%s)\n", path); + exit(EXIT_FAILURE); + return 0; + } + + numwritten = write(fd, buf, buflen - 1); + close(fd); + if (numwritten < 1) { + printf("write(%s)\n", buf); + exit(EXIT_FAILURE); + return 0; + } + + return (unsigned int) numwritten; +} + +const unsigned long read_num(const char *path) +{ + char buf[21]; + + if (read_file(path, buf, sizeof(buf)) < 0) { + perror("read_file()"); + exit(EXIT_FAILURE); + } + + return strtoul(buf, NULL, 10); +} + +void write_num(const char *path, unsigned long num) +{ + char buf[21]; + + sprintf(buf, "%ld", num); + if (!write_file(path, buf, strlen(buf) + 1)) { + perror(path); + exit(EXIT_FAILURE); + } +} + +int thp_read_string(const char *name, const char * const strings[]) +{ + char path[PATH_MAX]; + char buf[256]; + char *c; + int ret; + + ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); + if (ret >= PATH_MAX) { + printf("%s: Pathname is too long\n", __func__); + exit(EXIT_FAILURE); + } + + if (!read_file(path, buf, sizeof(buf))) { + perror(path); + exit(EXIT_FAILURE); + } + + c = strchr(buf, '['); + if (!c) { + printf("%s: Parse failure\n", __func__); + exit(EXIT_FAILURE); + } + + c++; + memmove(buf, c, sizeof(buf) - (c - buf)); + + c = strchr(buf, ']'); + if (!c) { + printf("%s: Parse failure\n", __func__); + exit(EXIT_FAILURE); + } + *c = '\0'; + + ret = 0; + while (strings[ret]) { + if (!strcmp(strings[ret], buf)) + return ret; + ret++; + } + + printf("Failed to parse %s\n", name); + exit(EXIT_FAILURE); +} + +void thp_write_string(const char *name, const char *val) +{ + char path[PATH_MAX]; + int ret; + + ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); + if (ret >= PATH_MAX) { + printf("%s: Pathname is too long\n", __func__); + exit(EXIT_FAILURE); + } + + if (!write_file(path, val, strlen(val) + 1)) { + perror(path); + exit(EXIT_FAILURE); + } +} + +const unsigned long thp_read_num(const char *name) +{ + char path[PATH_MAX]; + int ret; + + ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); + if (ret >= PATH_MAX) { + printf("%s: Pathname is too long\n", __func__); + exit(EXIT_FAILURE); + } + return read_num(path); +} + +void thp_write_num(const char *name, unsigned long num) +{ + char path[PATH_MAX]; + int ret; + + ret = snprintf(path, PATH_MAX, THP_SYSFS "%s", name); + if (ret >= PATH_MAX) { + printf("%s: Pathname is too long\n", __func__); + exit(EXIT_FAILURE); + } + write_num(path, num); +} + +void thp_read_settings(struct thp_settings *settings) +{ + *settings = (struct thp_settings) { + .thp_enabled = thp_read_string("enabled", thp_enabled_strings), + .thp_defrag = thp_read_string("defrag", thp_defrag_strings), + .shmem_enabled = + thp_read_string("shmem_enabled", shmem_enabled_strings), + .use_zero_page = thp_read_num("use_zero_page"), + }; + settings->khugepaged = (struct khugepaged_settings) { + .defrag = thp_read_num("khugepaged/defrag"), + .alloc_sleep_millisecs = + thp_read_num("khugepaged/alloc_sleep_millisecs"), + .scan_sleep_millisecs = + thp_read_num("khugepaged/scan_sleep_millisecs"), + .max_ptes_none = thp_read_num("khugepaged/max_ptes_none"), + .max_ptes_swap = thp_read_num("khugepaged/max_ptes_swap"), + .max_ptes_shared = thp_read_num("khugepaged/max_ptes_shared"), + .pages_to_scan = thp_read_num("khugepaged/pages_to_scan"), + }; + if (dev_queue_read_ahead_path[0]) + settings->read_ahead_kb = read_num(dev_queue_read_ahead_path); +} + +void thp_write_settings(struct thp_settings *settings) +{ + struct khugepaged_settings *khugepaged = &settings->khugepaged; + + thp_write_string("enabled", thp_enabled_strings[settings->thp_enabled]); + thp_write_string("defrag", thp_defrag_strings[settings->thp_defrag]); + thp_write_string("shmem_enabled", + shmem_enabled_strings[settings->shmem_enabled]); + thp_write_num("use_zero_page", settings->use_zero_page); + + thp_write_num("khugepaged/defrag", khugepaged->defrag); + thp_write_num("khugepaged/alloc_sleep_millisecs", + khugepaged->alloc_sleep_millisecs); + thp_write_num("khugepaged/scan_sleep_millisecs", + khugepaged->scan_sleep_millisecs); + thp_write_num("khugepaged/max_ptes_none", khugepaged->max_ptes_none); + thp_write_num("khugepaged/max_ptes_swap", khugepaged->max_ptes_swap); + thp_write_num("khugepaged/max_ptes_shared", khugepaged->max_ptes_shared); + thp_write_num("khugepaged/pages_to_scan", khugepaged->pages_to_scan); + + if (dev_queue_read_ahead_path[0]) + write_num(dev_queue_read_ahead_path, settings->read_ahead_kb); +} + +struct thp_settings *thp_current_settings(void) +{ + if (!settings_index) { + printf("Fail: No settings set"); + exit(EXIT_FAILURE); + } + return settings_stack + settings_index - 1; +} + +void thp_push_settings(struct thp_settings *settings) +{ + if (settings_index >= MAX_SETTINGS_DEPTH) { + printf("Fail: Settings stack exceeded"); + exit(EXIT_FAILURE); + } + settings_stack[settings_index++] = *settings; + thp_write_settings(thp_current_settings()); +} + +void thp_pop_settings(void) +{ + if (settings_index <= 0) { + printf("Fail: Settings stack empty"); + exit(EXIT_FAILURE); + } + --settings_index; + thp_write_settings(thp_current_settings()); +} + +void thp_restore_settings(void) +{ + thp_write_settings(&saved_settings); +} + +void thp_save_settings(void) +{ + thp_read_settings(&saved_settings); +} + +void thp_set_read_ahead_path(char *path) +{ + if (!path) { + dev_queue_read_ahead_path[0] = '\0'; + return; + } + + strncpy(dev_queue_read_ahead_path, path, + sizeof(dev_queue_read_ahead_path)); + dev_queue_read_ahead_path[sizeof(dev_queue_read_ahead_path) - 1] = '\0'; +} diff --git a/tools/testing/selftests/mm/thp_settings.h b/tools/testing/selftests/mm/thp_settings.h new file mode 100644 index 0000000000000..ff3d98c30617b --- /dev/null +++ b/tools/testing/selftests/mm/thp_settings.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __THP_SETTINGS_H__ +#define __THP_SETTINGS_H__ + +#include +#include +#include + +enum thp_enabled { + THP_ALWAYS, + THP_MADVISE, + THP_NEVER, +}; + +enum thp_defrag { + THP_DEFRAG_ALWAYS, + THP_DEFRAG_DEFER, + THP_DEFRAG_DEFER_MADVISE, + THP_DEFRAG_MADVISE, + THP_DEFRAG_NEVER, +}; + +enum shmem_enabled { + SHMEM_ALWAYS, + SHMEM_WITHIN_SIZE, + SHMEM_ADVISE, + SHMEM_NEVER, + SHMEM_DENY, + SHMEM_FORCE, +}; + +struct khugepaged_settings { + bool defrag; + unsigned int alloc_sleep_millisecs; + unsigned int scan_sleep_millisecs; + unsigned int max_ptes_none; + unsigned int max_ptes_swap; + unsigned int max_ptes_shared; + unsigned long pages_to_scan; +}; + +struct thp_settings { + enum thp_enabled thp_enabled; + enum thp_defrag thp_defrag; + enum shmem_enabled shmem_enabled; + bool use_zero_page; + struct khugepaged_settings khugepaged; + unsigned long read_ahead_kb; +}; + +int read_file(const char *path, char *buf, size_t buflen); +int write_file(const char *path, const char *buf, size_t buflen); +const unsigned long read_num(const char *path); +void write_num(const char *path, unsigned long num); + +int thp_read_string(const char *name, const char * const strings[]); +void thp_write_string(const char *name, const char *val); +const unsigned long thp_read_num(const char *name); +void thp_write_num(const char *name, unsigned long num); + +void thp_write_settings(struct thp_settings *settings); +void thp_read_settings(struct thp_settings *settings); +struct thp_settings *thp_current_settings(void); +void thp_push_settings(struct thp_settings *settings); +void thp_pop_settings(void); +void thp_restore_settings(void); +void thp_save_settings(void); + +void thp_set_read_ahead_path(char *path); + +#endif /* __THP_SETTINGS_H__ */ From f136b8cf5d5b9f906b2e8ea71559df60b18e0f9f Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:08 +0000 Subject: [PATCH 121/548] selftests/mm: support multi-size THP interface in thp_settings Save and restore the new per-size hugepage enabled setting, if available on the running kernel. Since the number of per-size directories is not fixed, solve this as simply as possible by catering for a maximum number in the thp_settings struct (20). Each array index is the order. The value of THP_NEVER is changed to 0 so that all of these new settings default to THP_NEVER and the user only needs to fill in the ones they want to enable. Link: https://lkml.kernel.org/r/20231207161211.2374093-8-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Alistair Popple Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Hildenbrand Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 4f5070a5e40db2e9dbf5fff4ec678d6fbb338d5c) Signed-off-by: Koba Ko --- tools/testing/selftests/mm/khugepaged.c | 3 ++ tools/testing/selftests/mm/thp_settings.c | 55 ++++++++++++++++++++++- tools/testing/selftests/mm/thp_settings.h | 11 ++++- 3 files changed, 67 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c index b15e7fd701769..7bd3baa9d34b4 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -1141,6 +1141,7 @@ static void parse_test_type(int argc, const char **argv) int main(int argc, const char **argv) { + int hpage_pmd_order; struct thp_settings default_settings = { .thp_enabled = THP_MADVISE, .thp_defrag = THP_DEFRAG_ALWAYS, @@ -1175,11 +1176,13 @@ int main(int argc, const char **argv) exit(EXIT_FAILURE); } hpage_pmd_nr = hpage_pmd_size / page_size; + hpage_pmd_order = __builtin_ctz(hpage_pmd_nr); default_settings.khugepaged.max_ptes_none = hpage_pmd_nr - 1; default_settings.khugepaged.max_ptes_swap = hpage_pmd_nr / 8; default_settings.khugepaged.max_ptes_shared = hpage_pmd_nr / 2; default_settings.khugepaged.pages_to_scan = hpage_pmd_nr * 8; + default_settings.hugepages[hpage_pmd_order].enabled = THP_INHERIT; save_settings(); thp_push_settings(&default_settings); diff --git a/tools/testing/selftests/mm/thp_settings.c b/tools/testing/selftests/mm/thp_settings.c index 5e8ec792cac70..a4163438108ec 100644 --- a/tools/testing/selftests/mm/thp_settings.c +++ b/tools/testing/selftests/mm/thp_settings.c @@ -16,9 +16,10 @@ static struct thp_settings saved_settings; static char dev_queue_read_ahead_path[PATH_MAX]; static const char * const thp_enabled_strings[] = { + "never", "always", + "inherit", "madvise", - "never", NULL }; @@ -198,6 +199,10 @@ void thp_write_num(const char *name, unsigned long num) void thp_read_settings(struct thp_settings *settings) { + unsigned long orders = thp_supported_orders(); + char path[PATH_MAX]; + int i; + *settings = (struct thp_settings) { .thp_enabled = thp_read_string("enabled", thp_enabled_strings), .thp_defrag = thp_read_string("defrag", thp_defrag_strings), @@ -218,11 +223,26 @@ void thp_read_settings(struct thp_settings *settings) }; if (dev_queue_read_ahead_path[0]) settings->read_ahead_kb = read_num(dev_queue_read_ahead_path); + + for (i = 0; i < NR_ORDERS; i++) { + if (!((1 << i) & orders)) { + settings->hugepages[i].enabled = THP_NEVER; + continue; + } + snprintf(path, PATH_MAX, "hugepages-%ukB/enabled", + (getpagesize() >> 10) << i); + settings->hugepages[i].enabled = + thp_read_string(path, thp_enabled_strings); + } } void thp_write_settings(struct thp_settings *settings) { struct khugepaged_settings *khugepaged = &settings->khugepaged; + unsigned long orders = thp_supported_orders(); + char path[PATH_MAX]; + int enabled; + int i; thp_write_string("enabled", thp_enabled_strings[settings->thp_enabled]); thp_write_string("defrag", thp_defrag_strings[settings->thp_defrag]); @@ -242,6 +262,15 @@ void thp_write_settings(struct thp_settings *settings) if (dev_queue_read_ahead_path[0]) write_num(dev_queue_read_ahead_path, settings->read_ahead_kb); + + for (i = 0; i < NR_ORDERS; i++) { + if (!((1 << i) & orders)) + continue; + snprintf(path, PATH_MAX, "hugepages-%ukB/enabled", + (getpagesize() >> 10) << i); + enabled = settings->hugepages[i].enabled; + thp_write_string(path, thp_enabled_strings[enabled]); + } } struct thp_settings *thp_current_settings(void) @@ -294,3 +323,27 @@ void thp_set_read_ahead_path(char *path) sizeof(dev_queue_read_ahead_path)); dev_queue_read_ahead_path[sizeof(dev_queue_read_ahead_path) - 1] = '\0'; } + +unsigned long thp_supported_orders(void) +{ + unsigned long orders = 0; + char path[PATH_MAX]; + char buf[256]; + int ret; + int i; + + for (i = 0; i < NR_ORDERS; i++) { + ret = snprintf(path, PATH_MAX, THP_SYSFS "hugepages-%ukB/enabled", + (getpagesize() >> 10) << i); + if (ret >= PATH_MAX) { + printf("%s: Pathname is too long\n", __func__); + exit(EXIT_FAILURE); + } + + ret = read_file(path, buf, sizeof(buf)); + if (ret) + orders |= 1UL << i; + } + + return orders; +} diff --git a/tools/testing/selftests/mm/thp_settings.h b/tools/testing/selftests/mm/thp_settings.h index ff3d98c30617b..71cbff05f4c7f 100644 --- a/tools/testing/selftests/mm/thp_settings.h +++ b/tools/testing/selftests/mm/thp_settings.h @@ -7,9 +7,10 @@ #include enum thp_enabled { + THP_NEVER, THP_ALWAYS, + THP_INHERIT, THP_MADVISE, - THP_NEVER, }; enum thp_defrag { @@ -29,6 +30,12 @@ enum shmem_enabled { SHMEM_FORCE, }; +#define NR_ORDERS 20 + +struct hugepages_settings { + enum thp_enabled enabled; +}; + struct khugepaged_settings { bool defrag; unsigned int alloc_sleep_millisecs; @@ -46,6 +53,7 @@ struct thp_settings { bool use_zero_page; struct khugepaged_settings khugepaged; unsigned long read_ahead_kb; + struct hugepages_settings hugepages[NR_ORDERS]; }; int read_file(const char *path, char *buf, size_t buflen); @@ -67,5 +75,6 @@ void thp_restore_settings(void); void thp_save_settings(void); void thp_set_read_ahead_path(char *path); +unsigned long thp_supported_orders(void); #endif /* __THP_SETTINGS_H__ */ From d57af6e60c31cfacd307ae21c295267eba5811ad Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:09 +0000 Subject: [PATCH 122/548] selftests/mm/khugepaged: enlighten for multi-size THP The `collapse_max_ptes_none` test was previously failing when a THP size less than PMD-size had enabled="always". The root cause is because the test faults in 1 page less than the threshold it set for collapsing. But when THP is enabled always, we "over allocate" and therefore the threshold is passed, and collapse unexpectedly succeeds. Solve this by enlightening khugepaged selftest. Add a command line option to pass in the desired THP size that should be used for all anonymous allocations. The harness will then explicitly configure a THP size as requested and modify the `collapse_max_ptes_none` test so that it faults in the threshold minus the number of pages in the configured THP size. If no command line option is provided, default to order 0, as per previous behaviour. I chose to use an order in the command line interface, since this makes the interface agnostic of base page size, making it easier to invoke from run_vmtests.sh. Link: https://lkml.kernel.org/r/20231207161211.2374093-9-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Alistair Popple Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Hildenbrand Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 9f0704eae8a4edc8dca9c8a297f798d505a4103a) Signed-off-by: Koba Ko --- tools/testing/selftests/mm/khugepaged.c | 48 +++++++++++++++++------ tools/testing/selftests/mm/run_vmtests.sh | 2 + 2 files changed, 39 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c index 7bd3baa9d34b4..829320a519e72 100644 --- a/tools/testing/selftests/mm/khugepaged.c +++ b/tools/testing/selftests/mm/khugepaged.c @@ -28,6 +28,7 @@ static unsigned long hpage_pmd_size; static unsigned long page_size; static int hpage_pmd_nr; +static int anon_order; #define PID_SMAPS "/proc/self/smaps" #define TEST_FILE "collapse_test_file" @@ -607,6 +608,11 @@ static bool is_tmpfs(struct mem_ops *ops) return ops == &__file_ops && finfo.type == VMA_SHMEM; } +static bool is_anon(struct mem_ops *ops) +{ + return ops == &__anon_ops; +} + static void alloc_at_fault(void) { struct thp_settings settings = *thp_current_settings(); @@ -673,6 +679,7 @@ static void collapse_max_ptes_none(struct collapse_context *c, struct mem_ops *o int max_ptes_none = hpage_pmd_nr / 2; struct thp_settings settings = *thp_current_settings(); void *p; + int fault_nr_pages = is_anon(ops) ? 1 << anon_order : 1; settings.khugepaged.max_ptes_none = max_ptes_none; thp_push_settings(&settings); @@ -686,10 +693,10 @@ static void collapse_max_ptes_none(struct collapse_context *c, struct mem_ops *o goto skip; } - ops->fault(p, 0, (hpage_pmd_nr - max_ptes_none - 1) * page_size); + ops->fault(p, 0, (hpage_pmd_nr - max_ptes_none - fault_nr_pages) * page_size); c->collapse("Maybe collapse with max_ptes_none exceeded", p, 1, ops, !c->enforce_pte_scan_limits); - validate_memory(p, 0, (hpage_pmd_nr - max_ptes_none - 1) * page_size); + validate_memory(p, 0, (hpage_pmd_nr - max_ptes_none - fault_nr_pages) * page_size); if (c->enforce_pte_scan_limits) { ops->fault(p, 0, (hpage_pmd_nr - max_ptes_none) * page_size); @@ -1076,7 +1083,7 @@ static void madvise_retracted_page_tables(struct collapse_context *c, static void usage(void) { - fprintf(stderr, "\nUsage: ./khugepaged [dir]\n\n"); + fprintf(stderr, "\nUsage: ./khugepaged [OPTIONS] [dir]\n\n"); fprintf(stderr, "\t\t: :\n"); fprintf(stderr, "\t\t: [all|khugepaged|madvise]\n"); fprintf(stderr, "\t\t: [all|anon|file|shmem]\n"); @@ -1085,15 +1092,34 @@ static void usage(void) fprintf(stderr, "\tCONFIG_READ_ONLY_THP_FOR_FS=y\n"); fprintf(stderr, "\n\tif [dir] is a (sub)directory of a tmpfs mount, tmpfs must be\n"); fprintf(stderr, "\tmounted with huge=madvise option for khugepaged tests to work\n"); + fprintf(stderr, "\n\tSupported Options:\n"); + fprintf(stderr, "\t\t-h: This help message.\n"); + fprintf(stderr, "\t\t-s: mTHP size, expressed as page order.\n"); + fprintf(stderr, "\t\t Defaults to 0. Use this size for anon allocations.\n"); exit(1); } -static void parse_test_type(int argc, const char **argv) +static void parse_test_type(int argc, char **argv) { + int opt; char *buf; const char *token; - if (argc == 1) { + while ((opt = getopt(argc, argv, "s:h")) != -1) { + switch (opt) { + case 's': + anon_order = atoi(optarg); + break; + case 'h': + default: + usage(); + } + } + + argv += optind; + argc -= optind; + + if (argc == 0) { /* Backwards compatibility */ khugepaged_context = &__khugepaged_context; madvise_context = &__madvise_context; @@ -1101,7 +1127,7 @@ static void parse_test_type(int argc, const char **argv) return; } - buf = strdup(argv[1]); + buf = strdup(argv[0]); token = strsep(&buf, ":"); if (!strcmp(token, "all")) { @@ -1135,11 +1161,13 @@ static void parse_test_type(int argc, const char **argv) if (!file_ops) return; - if (argc != 3) + if (argc != 2) usage(); + + get_finfo(argv[1]); } -int main(int argc, const char **argv) +int main(int argc, char **argv) { int hpage_pmd_order; struct thp_settings default_settings = { @@ -1164,9 +1192,6 @@ int main(int argc, const char **argv) parse_test_type(argc, argv); - if (file_ops) - get_finfo(argv[2]); - setbuf(stdout, NULL); page_size = getpagesize(); @@ -1183,6 +1208,7 @@ int main(int argc, const char **argv) default_settings.khugepaged.max_ptes_shared = hpage_pmd_nr / 2; default_settings.khugepaged.pages_to_scan = hpage_pmd_nr * 8; default_settings.hugepages[hpage_pmd_order].enabled = THP_INHERIT; + default_settings.hugepages[anon_order].enabled = THP_ALWAYS; save_settings(); thp_push_settings(&default_settings); diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index d7b2c9d07eec5..8ec99d704d06e 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -377,6 +377,8 @@ CATEGORY="cow" run_test ./cow CATEGORY="thp" run_test ./khugepaged +CATEGORY="thp" run_test ./khugepaged -s 2 + CATEGORY="thp" run_test ./transhuge-stress -d 20 CATEGORY="thp" run_test ./split_huge_page_test From c61c87484d6e4bafe80358d5e6789c88fcbcc835 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:10 +0000 Subject: [PATCH 123/548] selftests/mm/cow: generalize do_run_with_thp() helper do_run_with_thp() prepares (PMD-sized) THP memory into different states before running tests. With the introduction of multi-size THP, we would like to reuse this logic to also test those smaller THP sizes. So let's add a thpsize parameter which tells the function what size THP it should operate on. A separate commit will utilize this change to add new tests for multi-size THP, where available. Link: https://lkml.kernel.org/r/20231207161211.2374093-10-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: David Hildenbrand Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Alistair Popple Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 12dc16b38463a671bc91dc2df10f3a014a27ff3b) Signed-off-by: Koba Ko --- tools/testing/selftests/mm/cow.c | 121 +++++++++++++++++-------------- 1 file changed, 67 insertions(+), 54 deletions(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index 76d37904172db..a49b0a0ecf58e 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -32,7 +32,7 @@ static size_t pagesize; static int pagemap_fd; -static size_t thpsize; +static size_t pmdsize; static int nr_hugetlbsizes; static size_t hugetlbsizes[10]; static int gup_fd; @@ -734,7 +734,7 @@ enum thp_run { THP_RUN_PARTIAL_SHARED, }; -static void do_run_with_thp(test_fn fn, enum thp_run thp_run) +static void do_run_with_thp(test_fn fn, enum thp_run thp_run, size_t thpsize) { char *mem, *mmap_mem, *tmp, *mremap_mem = MAP_FAILED; size_t size, mmap_size, mremap_size; @@ -759,11 +759,11 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) } /* - * Try to populate a THP. Touch the first sub-page and test if we get - * another sub-page populated automatically. + * Try to populate a THP. Touch the first sub-page and test if + * we get the last sub-page populated automatically. */ mem[0] = 0; - if (!pagemap_is_populated(pagemap_fd, mem + pagesize)) { + if (!pagemap_is_populated(pagemap_fd, mem + thpsize - pagesize)) { ksft_test_result_skip("Did not get a THP populated\n"); goto munmap; } @@ -773,12 +773,14 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) switch (thp_run) { case THP_RUN_PMD: case THP_RUN_PMD_SWAPOUT: + assert(thpsize == pmdsize); break; case THP_RUN_PTE: case THP_RUN_PTE_SWAPOUT: /* * Trigger PTE-mapping the THP by temporarily mapping a single - * subpage R/O. + * subpage R/O. This is a noop if the THP is not pmdsize (and + * therefore already PTE-mapped). */ ret = mprotect(mem + pagesize, pagesize, PROT_READ); if (ret) { @@ -875,52 +877,60 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) munmap(mremap_mem, mremap_size); } -static void run_with_thp(test_fn fn, const char *desc) +static void run_with_thp(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with THP\n", desc); - do_run_with_thp(fn, THP_RUN_PMD); + ksft_print_msg("[RUN] %s ... with THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_PMD, size); } -static void run_with_thp_swap(test_fn fn, const char *desc) +static void run_with_thp_swap(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with swapped-out THP\n", desc); - do_run_with_thp(fn, THP_RUN_PMD_SWAPOUT); + ksft_print_msg("[RUN] %s ... with swapped-out THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_PMD_SWAPOUT, size); } -static void run_with_pte_mapped_thp(test_fn fn, const char *desc) +static void run_with_pte_mapped_thp(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with PTE-mapped THP\n", desc); - do_run_with_thp(fn, THP_RUN_PTE); + ksft_print_msg("[RUN] %s ... with PTE-mapped THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_PTE, size); } -static void run_with_pte_mapped_thp_swap(test_fn fn, const char *desc) +static void run_with_pte_mapped_thp_swap(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with swapped-out, PTE-mapped THP\n", desc); - do_run_with_thp(fn, THP_RUN_PTE_SWAPOUT); + ksft_print_msg("[RUN] %s ... with swapped-out, PTE-mapped THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_PTE_SWAPOUT, size); } -static void run_with_single_pte_of_thp(test_fn fn, const char *desc) +static void run_with_single_pte_of_thp(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with single PTE of THP\n", desc); - do_run_with_thp(fn, THP_RUN_SINGLE_PTE); + ksft_print_msg("[RUN] %s ... with single PTE of THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_SINGLE_PTE, size); } -static void run_with_single_pte_of_thp_swap(test_fn fn, const char *desc) +static void run_with_single_pte_of_thp_swap(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with single PTE of swapped-out THP\n", desc); - do_run_with_thp(fn, THP_RUN_SINGLE_PTE_SWAPOUT); + ksft_print_msg("[RUN] %s ... with single PTE of swapped-out THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_SINGLE_PTE_SWAPOUT, size); } -static void run_with_partial_mremap_thp(test_fn fn, const char *desc) +static void run_with_partial_mremap_thp(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with partially mremap()'ed THP\n", desc); - do_run_with_thp(fn, THP_RUN_PARTIAL_MREMAP); + ksft_print_msg("[RUN] %s ... with partially mremap()'ed THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_PARTIAL_MREMAP, size); } -static void run_with_partial_shared_thp(test_fn fn, const char *desc) +static void run_with_partial_shared_thp(test_fn fn, const char *desc, size_t size) { - ksft_print_msg("[RUN] %s ... with partially shared THP\n", desc); - do_run_with_thp(fn, THP_RUN_PARTIAL_SHARED); + ksft_print_msg("[RUN] %s ... with partially shared THP (%zu kB)\n", + desc, size / 1024); + do_run_with_thp(fn, THP_RUN_PARTIAL_SHARED, size); } static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize) @@ -1091,15 +1101,15 @@ static void run_anon_test_case(struct test_case const *test_case) run_with_base_page(test_case->fn, test_case->desc); run_with_base_page_swap(test_case->fn, test_case->desc); - if (thpsize) { - run_with_thp(test_case->fn, test_case->desc); - run_with_thp_swap(test_case->fn, test_case->desc); - run_with_pte_mapped_thp(test_case->fn, test_case->desc); - run_with_pte_mapped_thp_swap(test_case->fn, test_case->desc); - run_with_single_pte_of_thp(test_case->fn, test_case->desc); - run_with_single_pte_of_thp_swap(test_case->fn, test_case->desc); - run_with_partial_mremap_thp(test_case->fn, test_case->desc); - run_with_partial_shared_thp(test_case->fn, test_case->desc); + if (pmdsize) { + run_with_thp(test_case->fn, test_case->desc, pmdsize); + run_with_thp_swap(test_case->fn, test_case->desc, pmdsize); + run_with_pte_mapped_thp(test_case->fn, test_case->desc, pmdsize); + run_with_pte_mapped_thp_swap(test_case->fn, test_case->desc, pmdsize); + run_with_single_pte_of_thp(test_case->fn, test_case->desc, pmdsize); + run_with_single_pte_of_thp_swap(test_case->fn, test_case->desc, pmdsize); + run_with_partial_mremap_thp(test_case->fn, test_case->desc, pmdsize); + run_with_partial_shared_thp(test_case->fn, test_case->desc, pmdsize); } for (i = 0; i < nr_hugetlbsizes; i++) run_with_hugetlb(test_case->fn, test_case->desc, @@ -1120,7 +1130,7 @@ static int tests_per_anon_test_case(void) { int tests = 2 + nr_hugetlbsizes; - if (thpsize) + if (pmdsize) tests += 8; return tests; } @@ -1329,7 +1339,7 @@ static void run_anon_thp_test_cases(void) { int i; - if (!thpsize) + if (!pmdsize) return; ksft_print_msg("[INFO] Anonymous THP tests\n"); @@ -1338,13 +1348,13 @@ static void run_anon_thp_test_cases(void) struct test_case const *test_case = &anon_thp_test_cases[i]; ksft_print_msg("[RUN] %s\n", test_case->desc); - do_run_with_thp(test_case->fn, THP_RUN_PMD); + do_run_with_thp(test_case->fn, THP_RUN_PMD, pmdsize); } } static int tests_per_anon_thp_test_case(void) { - return thpsize ? 1 : 0; + return pmdsize ? 1 : 0; } typedef void (*non_anon_test_fn)(char *mem, const char *smem, size_t size); @@ -1419,7 +1429,7 @@ static void run_with_huge_zeropage(non_anon_test_fn fn, const char *desc) } /* For alignment purposes, we need twice the thp size. */ - mmap_size = 2 * thpsize; + mmap_size = 2 * pmdsize; mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (mmap_mem == MAP_FAILED) { @@ -1434,11 +1444,11 @@ static void run_with_huge_zeropage(non_anon_test_fn fn, const char *desc) } /* We need a THP-aligned memory area. */ - mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1)); - smem = (char *)(((uintptr_t)mmap_smem + thpsize) & ~(thpsize - 1)); + mem = (char *)(((uintptr_t)mmap_mem + pmdsize) & ~(pmdsize - 1)); + smem = (char *)(((uintptr_t)mmap_smem + pmdsize) & ~(pmdsize - 1)); - ret = madvise(mem, thpsize, MADV_HUGEPAGE); - ret |= madvise(smem, thpsize, MADV_HUGEPAGE); + ret = madvise(mem, pmdsize, MADV_HUGEPAGE); + ret |= madvise(smem, pmdsize, MADV_HUGEPAGE); if (ret) { ksft_test_result_fail("MADV_HUGEPAGE failed\n"); goto munmap; @@ -1457,7 +1467,7 @@ static void run_with_huge_zeropage(non_anon_test_fn fn, const char *desc) goto munmap; } - fn(mem, smem, thpsize); + fn(mem, smem, pmdsize); munmap: munmap(mmap_mem, mmap_size); if (mmap_smem != MAP_FAILED) @@ -1650,7 +1660,7 @@ static void run_non_anon_test_case(struct non_anon_test_case const *test_case) run_with_zeropage(test_case->fn, test_case->desc); run_with_memfd(test_case->fn, test_case->desc); run_with_tmpfile(test_case->fn, test_case->desc); - if (thpsize) + if (pmdsize) run_with_huge_zeropage(test_case->fn, test_case->desc); for (i = 0; i < nr_hugetlbsizes; i++) run_with_memfd_hugetlb(test_case->fn, test_case->desc, @@ -1671,7 +1681,7 @@ static int tests_per_non_anon_test_case(void) { int tests = 3 + nr_hugetlbsizes; - if (thpsize) + if (pmdsize) tests += 1; return tests; } @@ -1683,10 +1693,13 @@ int main(int argc, char **argv) ksft_print_header(); pagesize = getpagesize(); - thpsize = read_pmd_pagesize(); - if (thpsize) + pmdsize = read_pmd_pagesize(); + if (pmdsize) { + ksft_print_msg("[INFO] detected PMD size: %zu KiB\n", + pmdsize / 1024); ksft_print_msg("[INFO] detected THP size: %zu KiB\n", - thpsize / 1024); + pmdsize / 1024); + } nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes, ARRAY_SIZE(hugetlbsizes)); detect_huge_zeropage(); From 523714f269ccab0de218ffa17e67b347d13ede88 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 7 Dec 2023 16:12:11 +0000 Subject: [PATCH 124/548] selftests/mm/cow: add tests for anonymous multi-size THP Add tests similar to the existing PMD-sized THP tests, but which operate on memory backed by (PTE-mapped) multi-size THP. This reuses all the existing infrastructure. If the test suite detects that multi-size THP is not supported by the kernel, the new tests are skipped. Link: https://lkml.kernel.org/r/20231207161211.2374093-11-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: David Hildenbrand Tested-by: Kefeng Wang Tested-by: John Hubbard Cc: Alistair Popple Cc: Anshuman Khandual Cc: Barry Song Cc: Catalin Marinas Cc: David Rientjes Cc: "Huang, Ying" Cc: Hugh Dickins Cc: Itaru Kitayama Cc: Kirill A. Shutemov Cc: Luis Chamberlain Cc: Matthew Wilcox (Oracle) Cc: Vlastimil Babka Cc: Yang Shi Cc: Yin Fengwei Cc: Yu Zhao Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit c0f79103322c322ea9342d52c2d81528b7b56232) Signed-off-by: Koba Ko --- tools/testing/selftests/mm/cow.c | 82 +++++++++++++++++++++++++++----- 1 file changed, 70 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c index a49b0a0ecf58e..3550073c7eaf5 100644 --- a/tools/testing/selftests/mm/cow.c +++ b/tools/testing/selftests/mm/cow.c @@ -29,15 +29,49 @@ #include "../../../../mm/gup_test.h" #include "../kselftest.h" #include "vm_util.h" +#include "thp_settings.h" static size_t pagesize; static int pagemap_fd; static size_t pmdsize; +static int nr_thpsizes; +static size_t thpsizes[20]; static int nr_hugetlbsizes; static size_t hugetlbsizes[10]; static int gup_fd; static bool has_huge_zeropage; +static int sz2ord(size_t size) +{ + return __builtin_ctzll(size / pagesize); +} + +static int detect_thp_sizes(size_t sizes[], int max) +{ + int count = 0; + unsigned long orders; + size_t kb; + int i; + + /* thp not supported at all. */ + if (!pmdsize) + return 0; + + orders = 1UL << sz2ord(pmdsize); + orders |= thp_supported_orders(); + + for (i = 0; orders && count < max; i++) { + if (!(orders & (1UL << i))) + continue; + orders &= ~(1UL << i); + kb = (pagesize >> 10) << i; + sizes[count++] = kb * 1024; + ksft_print_msg("[INFO] detected THP size: %zu KiB\n", kb); + } + + return count; +} + static void detect_huge_zeropage(void) { int fd = open("/sys/kernel/mm/transparent_hugepage/use_zero_page", @@ -1101,15 +1135,27 @@ static void run_anon_test_case(struct test_case const *test_case) run_with_base_page(test_case->fn, test_case->desc); run_with_base_page_swap(test_case->fn, test_case->desc); - if (pmdsize) { - run_with_thp(test_case->fn, test_case->desc, pmdsize); - run_with_thp_swap(test_case->fn, test_case->desc, pmdsize); - run_with_pte_mapped_thp(test_case->fn, test_case->desc, pmdsize); - run_with_pte_mapped_thp_swap(test_case->fn, test_case->desc, pmdsize); - run_with_single_pte_of_thp(test_case->fn, test_case->desc, pmdsize); - run_with_single_pte_of_thp_swap(test_case->fn, test_case->desc, pmdsize); - run_with_partial_mremap_thp(test_case->fn, test_case->desc, pmdsize); - run_with_partial_shared_thp(test_case->fn, test_case->desc, pmdsize); + for (i = 0; i < nr_thpsizes; i++) { + size_t size = thpsizes[i]; + struct thp_settings settings = *thp_current_settings(); + + settings.hugepages[sz2ord(pmdsize)].enabled = THP_NEVER; + settings.hugepages[sz2ord(size)].enabled = THP_ALWAYS; + thp_push_settings(&settings); + + if (size == pmdsize) { + run_with_thp(test_case->fn, test_case->desc, size); + run_with_thp_swap(test_case->fn, test_case->desc, size); + } + + run_with_pte_mapped_thp(test_case->fn, test_case->desc, size); + run_with_pte_mapped_thp_swap(test_case->fn, test_case->desc, size); + run_with_single_pte_of_thp(test_case->fn, test_case->desc, size); + run_with_single_pte_of_thp_swap(test_case->fn, test_case->desc, size); + run_with_partial_mremap_thp(test_case->fn, test_case->desc, size); + run_with_partial_shared_thp(test_case->fn, test_case->desc, size); + + thp_pop_settings(); } for (i = 0; i < nr_hugetlbsizes; i++) run_with_hugetlb(test_case->fn, test_case->desc, @@ -1130,8 +1176,9 @@ static int tests_per_anon_test_case(void) { int tests = 2 + nr_hugetlbsizes; + tests += 6 * nr_thpsizes; if (pmdsize) - tests += 8; + tests += 2; return tests; } @@ -1689,16 +1736,22 @@ static int tests_per_non_anon_test_case(void) int main(int argc, char **argv) { int err; + struct thp_settings default_settings; ksft_print_header(); pagesize = getpagesize(); pmdsize = read_pmd_pagesize(); if (pmdsize) { + /* Only if THP is supported. */ + thp_read_settings(&default_settings); + default_settings.hugepages[sz2ord(pmdsize)].enabled = THP_INHERIT; + thp_save_settings(); + thp_push_settings(&default_settings); + ksft_print_msg("[INFO] detected PMD size: %zu KiB\n", pmdsize / 1024); - ksft_print_msg("[INFO] detected THP size: %zu KiB\n", - pmdsize / 1024); + nr_thpsizes = detect_thp_sizes(thpsizes, ARRAY_SIZE(thpsizes)); } nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes, ARRAY_SIZE(hugetlbsizes)); @@ -1717,6 +1770,11 @@ int main(int argc, char **argv) run_anon_thp_test_cases(); run_non_anon_test_cases(); + if (pmdsize) { + /* Only if THP is supported. */ + thp_restore_settings(); + } + err = ksft_get_fail_cnt(); if (err) ksft_exit_fail_msg("%d out of %d tests failed\n", From dfb83bfbc1597ec053f732de1bb2f22401ff7188 Mon Sep 17 00:00:00 2001 From: Yin Fengwei Date: Mon, 18 Sep 2023 15:33:16 +0800 Subject: [PATCH 125/548] mm: add functions folio_in_range() and folio_within_vma() Patch series "support large folio for mlock", v3. Yu mentioned at [1] about the mlock() can't be applied to large folio. I leant the related code and here is my understanding: - For RLIMIT_MEMLOCK related, there is no problem. Because the RLIMIT_MEMLOCK statistics is not related underneath page. That means underneath page mlock or munlock doesn't impact the RLIMIT_MEMLOCK statistics collection which is always correct. - For keeping the page in RAM, there is no problem either. At least, during try_to_unmap_one(), once detect the VMA has VM_LOCKED bit set in vm_flags, the folio will be kept whatever the folio is mlocked or not. So the function of mlock for large folio works. But it's not optimized because the page reclaim needs scan these large folio and may split them. This series identified the large folio for mlock to four types: - The large folio is in VM_LOCKED range and fully mapped to the range - The large folio is in the VM_LOCKED range but not fully mapped to the range - The large folio cross VM_LOCKED VMA boundary - The large folio cross last level page table boundary For the first type, we mlock large folio so page reclaim will skip it. For the second/third type, we don't mlock large folio. As the pages not mapped to VM_LOACKED range are mapped to none VM_LOCKED range, if system is in memory pressure situation, the large folio can be picked by page reclaim and split. Then the pages not mapped to VM_LOCKED range can be reclaimed. For the fourth type, we don't mlock large folio because locking one page table lock can't prevent the part in another last level page table being unmapped. Thanks to Ryan for pointing this out. To check whether the folio is fully mapped to the range, PTEs needs be checked to see whether the page of folio is associated. Which needs take page table lock and is heavy operation. So far, the only place needs this check is madvise and page reclaim. These functions already have their own PTE iterator. patch1 introduce API to check whether large folio is in VMA range. patch2 make page reclaim/mlock_vma_folio/munlock_vma_folio support large folio mlock/munlock. patch3 make mlock/munlock syscall support large folio. Yu also mentioned a race which can make folio unevictable after munlock during RFC v2 discussion [3]: We decided that race issue didn't block this series based on: - That race issue was not introduced by this series - We had a looks-ok fix for that race issue. Need to wait for mlock_count fixing patch as Yosry Ahmed suggested [4] [1] https://lore.kernel.org/linux-mm/CAOUHufbtNPkdktjt_5qM45GegVO-rCFOMkSh0HQminQ12zsV8Q@mail.gmail.com/ [2] https://lore.kernel.org/linux-mm/20230809061105.3369958-1-fengwei.yin@intel.com/ [3] https://lore.kernel.org/linux-mm/CAOUHufZ6=9P_=CAOQyw0xw-3q707q-1FVV09dBNDC-hpcpj2Pg@mail.gmail.com/ This patch (of 3): folio_in_range() will be used to check whether the folio is mapped to specific VMA and whether the mapping address of folio is in the range. Also a helper function folio_within_vma() to check whether folio is in the range of vma based on folio_in_range(). Link: https://lkml.kernel.org/r/20230918073318.1181104-1-fengwei.yin@intel.com Link: https://lkml.kernel.org/r/20230918073318.1181104-2-fengwei.yin@intel.com Signed-off-by: Yin Fengwei Cc: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Ryan Roberts Cc: Yang Shi Cc: Yosry Ahmed Cc: Yu Zhao Signed-off-by: Andrew Morton (cherry picked from commit 28e566572aacdc551e24649e57cc9f04ba880cd2) Signed-off-by: Koba Ko --- mm/internal.h | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/mm/internal.h b/mm/internal.h index f773db493a99d..b1a853b60a8c7 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -655,6 +655,56 @@ extern long faultin_page_range(struct mm_struct *mm, unsigned long start, unsigned long end, bool write, int *locked); extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags, unsigned long bytes); + +/* + * NOTE: This function can't tell whether the folio is "fully mapped" in the + * range. + * "fully mapped" means all the pages of folio is associated with the page + * table of range while this function just check whether the folio range is + * within the range [start, end). Funcation caller nees to do page table + * check if it cares about the page table association. + * + * Typical usage (like mlock or madvise) is: + * Caller knows at least 1 page of folio is associated with page table of VMA + * and the range [start, end) is intersect with the VMA range. Caller wants + * to know whether the folio is fully associated with the range. It calls + * this function to check whether the folio is in the range first. Then checks + * the page table to know whether the folio is fully mapped to the range. + */ +static inline bool +folio_within_range(struct folio *folio, struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + pgoff_t pgoff, addr; + unsigned long vma_pglen = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; + + VM_WARN_ON_FOLIO(folio_test_ksm(folio), folio); + if (start > end) + return false; + + if (start < vma->vm_start) + start = vma->vm_start; + + if (end > vma->vm_end) + end = vma->vm_end; + + pgoff = folio_pgoff(folio); + + /* if folio start address is not in vma range */ + if (!in_range(pgoff, vma->vm_pgoff, vma_pglen)) + return false; + + addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT); + + return !(addr < start || end - addr < folio_size(folio)); +} + +static inline bool +folio_within_vma(struct folio *folio, struct vm_area_struct *vma) +{ + return folio_within_range(folio, vma, vma->vm_start, vma->vm_end); +} + /* * mlock_vma_folio() and munlock_vma_folio(): * should be called with vma's mmap_lock held for read or write, From a5c1faa4d16c4ea159a36b6edb4ec5ebf36fd209 Mon Sep 17 00:00:00 2001 From: Yin Fengwei Date: Mon, 18 Sep 2023 15:33:17 +0800 Subject: [PATCH 126/548] mm: handle large folio when large folio in VM_LOCKED VMA range If large folio is in the range of VM_LOCKED VMA, it should be mlocked to avoid being picked by page reclaim. Which may split the large folio and then mlock each pages again. Mlock this kind of large folio to prevent them being picked by page reclaim. For the large folio which cross the boundary of VM_LOCKED VMA or not fully mapped to VM_LOCKED VMA, we'd better not to mlock it. So if the system is under memory pressure, this kind of large folio will be split and the pages ouf of VM_LOCKED VMA can be reclaimed. Ideally, for large folio, we should mlock it when the large folio is fully mapped to VMA and munlock it if any page are unmampped from VMA. But it's not easy to detect whether the large folio is fully mapped to VMA in some cases (like add/remove rmap). So we update mlock_vma_folio() and munlock_vma_folio() to mlock/munlock the folio according to vma->vm_flags. Let caller to decide whether they should call these two functions. For add rmap, only mlock normal 4K folio and postpone large folio handling to page reclaim phase. It is possible to reuse page table iterator to detect whether folio is fully mapped or not during page reclaim phase. For remove rmap, invoke munlock_vma_folio() to munlock folio unconditionly because rmap makes folio not fully mapped to VMA. Link: https://lkml.kernel.org/r/20230918073318.1181104-3-fengwei.yin@intel.com Signed-off-by: Yin Fengwei Cc: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Ryan Roberts Cc: Yang Shi Cc: Yosry Ahmed Cc: Yu Zhao Signed-off-by: Andrew Morton (cherry picked from commit 1acbc3f936146d1b34987294803ac131bc298ce8) Signed-off-by: Koba Ko --- mm/internal.h | 23 ++++++++++-------- mm/rmap.c | 66 ++++++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 68 insertions(+), 21 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index b1a853b60a8c7..d3df1271fe0ed 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -713,14 +713,10 @@ folio_within_vma(struct folio *folio, struct vm_area_struct *vma) * mlock is usually called at the end of page_add_*_rmap(), munlock at * the end of page_remove_rmap(); but new anon folios are managed by * folio_add_lru_vma() calling mlock_new_folio(). - * - * @compound is used to include pmd mappings of THPs, but filter out - * pte mappings of THPs, which cannot be consistently counted: a pte - * mapping of the THP head cannot be distinguished by the page alone. */ void mlock_folio(struct folio *folio); static inline void mlock_vma_folio(struct folio *folio, - struct vm_area_struct *vma, bool compound) + struct vm_area_struct *vma) { /* * The VM_SPECIAL check here serves two purposes. @@ -730,17 +726,24 @@ static inline void mlock_vma_folio(struct folio *folio, * file->f_op->mmap() is using vm_insert_page(s), when VM_LOCKED may * still be set while VM_SPECIAL bits are added: so ignore it then. */ - if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) && - (compound || !folio_test_large(folio))) + if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED)) mlock_folio(folio); } void munlock_folio(struct folio *folio); static inline void munlock_vma_folio(struct folio *folio, - struct vm_area_struct *vma, bool compound) + struct vm_area_struct *vma) { - if (unlikely(vma->vm_flags & VM_LOCKED) && - (compound || !folio_test_large(folio))) + /* + * munlock if the function is called. Ideally, we should only + * do munlock if any page of folio is unmapped from VMA and + * cause folio not fully mapped to VMA. + * + * But it's not easy to confirm that's the situation. So we + * always munlock the folio and page reclaim will correct it + * if it's wrong. + */ + if (unlikely(vma->vm_flags & VM_LOCKED)) munlock_folio(folio); } diff --git a/mm/rmap.c b/mm/rmap.c index 9e7dfeac56f1f..e83c18933bb75 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -798,6 +798,7 @@ struct folio_referenced_arg { unsigned long vm_flags; struct mem_cgroup *memcg; }; + /* * arg: folio_referenced_arg will be passed */ @@ -807,17 +808,33 @@ static bool folio_referenced_one(struct folio *folio, struct folio_referenced_arg *pra = arg; DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0); int referenced = 0; + unsigned long start = address, ptes = 0; while (page_vma_mapped_walk(&pvmw)) { address = pvmw.address; - if ((vma->vm_flags & VM_LOCKED) && - (!folio_test_large(folio) || !pvmw.pte)) { - /* Restore the mlock which got missed */ - mlock_vma_folio(folio, vma, !pvmw.pte); - page_vma_mapped_walk_done(&pvmw); - pra->vm_flags |= VM_LOCKED; - return false; /* To break the loop */ + if (vma->vm_flags & VM_LOCKED) { + if (!folio_test_large(folio) || !pvmw.pte) { + /* Restore the mlock which got missed */ + mlock_vma_folio(folio, vma); + page_vma_mapped_walk_done(&pvmw); + pra->vm_flags |= VM_LOCKED; + return false; /* To break the loop */ + } + /* + * For large folio fully mapped to VMA, will + * be handled after the pvmw loop. + * + * For large folio cross VMA boundaries, it's + * expected to be picked by page reclaim. But + * should skip reference of pages which are in + * the range of VM_LOCKED vma. As page reclaim + * should just count the reference of pages out + * the range of VM_LOCKED vma. + */ + ptes++; + pra->mapcount--; + continue; } if (pvmw.pte) { @@ -842,6 +859,23 @@ static bool folio_referenced_one(struct folio *folio, pra->mapcount--; } + if ((vma->vm_flags & VM_LOCKED) && + folio_test_large(folio) && + folio_within_vma(folio, vma)) { + unsigned long s_align, e_align; + + s_align = ALIGN_DOWN(start, PMD_SIZE); + e_align = ALIGN_DOWN(start + folio_size(folio) - 1, PMD_SIZE); + + /* folio doesn't cross page table boundary and fully mapped */ + if ((s_align == e_align) && (ptes == folio_nr_pages(folio))) { + /* Restore the mlock which got missed */ + mlock_vma_folio(folio, vma); + pra->vm_flags |= VM_LOCKED; + return false; /* To break the loop */ + } + } + if (referenced) folio_clear_idle(folio); if (folio_test_clear_young(folio)) @@ -1253,7 +1287,14 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, if (flags & RMAP_EXCLUSIVE) SetPageAnonExclusive(page); - mlock_vma_folio(folio, vma, compound); + /* + * For large folio, only mlock it if it's fully mapped to VMA. It's + * not easy to check whether the large folio is fully mapped to VMA + * here. Only mlock normal 4K folio and leave page reclaim to handle + * large folio. + */ + if (!folio_test_large(folio)) + mlock_vma_folio(folio, vma); } /** @@ -1365,7 +1406,9 @@ void folio_add_file_rmap_range(struct folio *folio, struct page *page, if (nr) __lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr); - mlock_vma_folio(folio, vma, compound); + /* See comments in page_add_anon_rmap() */ + if (!folio_test_large(folio)) + mlock_vma_folio(folio, vma); } /** @@ -1476,7 +1519,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, * it's only reliable while mapped. */ - munlock_vma_folio(folio, vma, compound); + munlock_vma_folio(folio, vma); } /* @@ -1541,7 +1584,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, if (!(flags & TTU_IGNORE_MLOCK) && (vma->vm_flags & VM_LOCKED)) { /* Restore the mlock which got missed */ - mlock_vma_folio(folio, vma, false); + if (!folio_test_large(folio)) + mlock_vma_folio(folio, vma); page_vma_mapped_walk_done(&pvmw); ret = false; break; From efdc0e0800b5b0ea82bf31e29b2553984ea17163 Mon Sep 17 00:00:00 2001 From: Yin Fengwei Date: Mon, 18 Sep 2023 15:33:18 +0800 Subject: [PATCH 127/548] mm: mlock: update mlock_pte_range to handle large folio Current kernel only lock base size folio during mlock syscall. Add large folio support with following rules: - Only mlock large folio when it's in VM_LOCKED VMA range and fully mapped to page table. fully mapped folio is required as if folio is not fully mapped to a VM_LOCKED VMA, if system is in memory pressure, page reclaim is allowed to pick up this folio, split it and reclaim the pages which are not in VM_LOCKED VMA. - munlock will apply to the large folio which is in VMA range or cross the VMA boundary. This is required to handle the case that the large folio is mlocked, later the VMA is split in the middle of large folio. Link: https://lkml.kernel.org/r/20230918073318.1181104-4-fengwei.yin@intel.com Signed-off-by: Yin Fengwei Cc: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Ryan Roberts Cc: Yang Shi Cc: Yosry Ahmed Cc: Yu Zhao Signed-off-by: Andrew Morton (cherry picked from commit dc68badcede4ec3b4e5cdfcb8f678670220ac2ca) Signed-off-by: Koba Ko --- mm/mlock.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 64 insertions(+), 2 deletions(-) diff --git a/mm/mlock.c b/mm/mlock.c index 06bdfab83b58a..42b6865f8f828 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -305,6 +305,58 @@ void munlock_folio(struct folio *folio) local_unlock(&mlock_fbatch.lock); } +static inline unsigned int folio_mlock_step(struct folio *folio, + pte_t *pte, unsigned long addr, unsigned long end) +{ + unsigned int count, i, nr = folio_nr_pages(folio); + unsigned long pfn = folio_pfn(folio); + pte_t ptent = ptep_get(pte); + + if (!folio_test_large(folio)) + return 1; + + count = pfn + nr - pte_pfn(ptent); + count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT); + + for (i = 0; i < count; i++, pte++) { + pte_t entry = ptep_get(pte); + + if (!pte_present(entry)) + break; + if (pte_pfn(entry) - pfn >= nr) + break; + } + + return i; +} + +static inline bool allow_mlock_munlock(struct folio *folio, + struct vm_area_struct *vma, unsigned long start, + unsigned long end, unsigned int step) +{ + /* + * For unlock, allow munlock large folio which is partially + * mapped to VMA. As it's possible that large folio is + * mlocked and VMA is split later. + * + * During memory pressure, such kind of large folio can + * be split. And the pages are not in VM_LOCKed VMA + * can be reclaimed. + */ + if (!(vma->vm_flags & VM_LOCKED)) + return true; + + /* folio not in range [start, end), skip mlock */ + if (!folio_within_range(folio, vma, start, end)) + return false; + + /* folio is not fully mapped, skip mlock */ + if (step != folio_nr_pages(folio)) + return false; + + return true; +} + static int mlock_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, struct mm_walk *walk) @@ -314,6 +366,8 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr, pte_t *start_pte, *pte; pte_t ptent; struct folio *folio; + unsigned int step = 1; + unsigned long start = addr; ptl = pmd_trans_huge_lock(pmd, vma); if (ptl) { @@ -334,6 +388,7 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr, walk->action = ACTION_AGAIN; return 0; } + for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) { ptent = ptep_get(pte); if (!pte_present(ptent)) @@ -341,12 +396,19 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr, folio = vm_normal_folio(vma, addr, ptent); if (!folio || folio_is_zone_device(folio)) continue; - if (folio_test_large(folio)) - continue; + + step = folio_mlock_step(folio, pte, addr, end); + if (!allow_mlock_munlock(folio, vma, start, end, step)) + goto next_entry; + if (vma->vm_flags & VM_LOCKED) mlock_folio(folio); else munlock_folio(folio); + +next_entry: + pte += step - 1; + addr += (step - 1) << PAGE_SHIFT; } pte_unmap(start_pte); out: From ec447b5093395d3ec65a8566f7d530a61243630d Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:25 +0100 Subject: [PATCH 128/548] mm/rmap: rename hugepage_add* to hugetlb_add* Patch series "mm/rmap: interface overhaul", v2. This series overhauls the rmap interface, to get rid of the "bool compound" / RMAP_COMPOUND parameter with the goal of making the interface less error prone, more future proof, and more natural to extend to "batching". Also, this converts the interface to always consume folio+subpage, which speeds up operations on large folios. Further, this series adds PTE-batching variants for 4 rmap functions, whereby only folio_add_anon_rmap_ptes() is used for batching in this series when PTE-remapping a PMD-mapped THP. folio_remove_rmap_ptes(), folio_try_dup_anon_rmap_ptes() and folio_dup_file_rmap_ptes() will soon come in handy[1,2]. This series performs a lot of folio conversion along the way. Most of the added LOC in the diff are only due to documentation. As we're moving to a pte/pmd interface where we clearly express the mapping granularity we are dealing with, we first get the remainder of hugetlb out of the way, as it is special and expected to remain special: it treats everything as a "single logical PTE" and only currently allows entire mappings. Even if we'd ever support partial mappings, I strongly assume the interface and implementation will still differ heavily: hopefull we can avoid working on subpages/subpage mapcounts completely and only add a "count" parameter for them to enable batching. New (extended) hugetlb interface that operates on entire folio: * hugetlb_add_new_anon_rmap() -> Already existed * hugetlb_add_anon_rmap() -> Already existed * hugetlb_try_dup_anon_rmap() * hugetlb_try_share_anon_rmap() * hugetlb_add_file_rmap() * hugetlb_remove_rmap() New "ordinary" interface for small folios / THP:: * folio_add_new_anon_rmap() -> Already existed * folio_add_anon_rmap_[pte|ptes|pmd]() * folio_try_dup_anon_rmap_[pte|ptes|pmd]() * folio_try_share_anon_rmap_[pte|pmd]() * folio_add_file_rmap_[pte|ptes|pmd]() * folio_dup_file_rmap_[pte|ptes|pmd]() * folio_remove_rmap_[pte|ptes|pmd]() folio_add_new_anon_rmap() will always map at the largest granularity possible (currently, a single PMD to cover a PMD-sized THP). Could be extended if ever required. In the future, we might want "_pud" variants and eventually "_pmds" variants for batching. I ran some simple microbenchmarks on an Intel(R) Xeon(R) Silver 4210R: measuring munmap(), fork(), cow, MADV_DONTNEED on each PTE ... and PTE remapping PMD-mapped THPs on 1 GiB of memory. For small folios, there is barely a change (< 1% improvement for me). For PTE-mapped THP: * PTE-remapping a PMD-mapped THP is more than 10% faster. * fork() is more than 4% faster. * MADV_DONTNEED is 2% faster * COW when writing only a single byte on a COW-shared PTE is 1% faster * munmap() barely changes (< 1%). [1] https://lkml.kernel.org/r/20230810103332.3062143-1-ryan.roberts@arm.com [2] https://lkml.kernel.org/r/20231204105440.61448-1-ryan.roberts@arm.com This patch (of 40): Let's just call it "hugetlb_". Yes, it's all already inconsistent and confusing because we have a lot of "hugepage_" functions for legacy reasons. But "hugetlb" cannot possibly be confused with transparent huge pages, and it matches "hugetlb.c" and "folio_test_hugetlb()". So let's minimize confusion in rmap code. Link: https://lkml.kernel.org/r/20231220224504.646757-1-david@redhat.com Link: https://lkml.kernel.org/r/20231220224504.646757-2-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Muchun Song Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 9d5fafd5d882446999366f673ab06edba453f862) Signed-off-by: Koba Ko --- include/linux/rmap.h | 4 ++-- mm/hugetlb.c | 8 ++++---- mm/migrate.c | 2 +- mm/rmap.c | 6 +++--- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index b1fb58b435a98..23e456fb5487e 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -203,9 +203,9 @@ void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr, void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); -void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *, +void hugetlb_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); -void hugepage_add_new_anon_rmap(struct folio *, struct vm_area_struct *, +void hugetlb_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); static inline void __page_dup_rmap(struct page *page, bool compound) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d57d8f1c2dfea..a8ab300b3b7dc 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5033,7 +5033,7 @@ hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long add pte_t newpte = make_huge_pte(vma, &new_folio->page, 1); __folio_mark_uptodate(new_folio); - hugepage_add_new_anon_rmap(new_folio, vma, addr); + hugetlb_add_new_anon_rmap(new_folio, vma, addr); if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old)) newpte = huge_pte_mkuffd_wp(newpte); set_huge_pte_at(vma->vm_mm, addr, ptep, newpte, sz); @@ -5734,7 +5734,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, /* Break COW or unshare */ huge_ptep_clear_flush(vma, haddr, ptep); page_remove_rmap(&old_folio->page, vma, true); - hugepage_add_new_anon_rmap(new_folio, vma, haddr); + hugetlb_add_new_anon_rmap(new_folio, vma, haddr); if (huge_pte_uffd_wp(pte)) newpte = huge_pte_mkuffd_wp(newpte); set_huge_pte_at(mm, haddr, ptep, newpte, huge_page_size(h)); @@ -6022,7 +6022,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, goto backout; if (anon_rmap) - hugepage_add_new_anon_rmap(folio, vma, haddr); + hugetlb_add_new_anon_rmap(folio, vma, haddr); else page_dup_file_rmap(&folio->page, true); new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE) @@ -6450,7 +6450,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte, if (folio_in_pagecache) page_dup_file_rmap(&folio->page, true); else - hugepage_add_new_anon_rmap(folio, dst_vma, dst_addr); + hugetlb_add_new_anon_rmap(folio, dst_vma, dst_addr); /* * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY diff --git a/mm/migrate.c b/mm/migrate.c index 4ed4708852174..959e44c5b0847 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -249,7 +249,7 @@ static bool remove_migration_pte(struct folio *folio, pte = arch_make_huge_pte(pte, shift, vma->vm_flags); if (folio_test_anon(folio)) - hugepage_add_anon_rmap(new, vma, pvmw.address, + hugetlb_add_anon_rmap(new, vma, pvmw.address, rmap_flags); else page_dup_file_rmap(new, true); diff --git a/mm/rmap.c b/mm/rmap.c index e83c18933bb75..9467451d80eb1 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -2597,7 +2597,7 @@ void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc) * * RMAP_COMPOUND is ignored. */ -void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma, +void hugetlb_add_anon_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address, rmap_t flags) { struct folio *folio = page_folio(page); @@ -2612,8 +2612,8 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma, SetPageAnonExclusive(page); } -void hugepage_add_new_anon_rmap(struct folio *folio, - struct vm_area_struct *vma, unsigned long address) +void hugetlb_add_new_anon_rmap(struct folio *folio, + struct vm_area_struct *vma, unsigned long address) { BUG_ON(address < vma->vm_start || address >= vma->vm_end); /* increment count (starts at -1) */ From a54fd8c05f1e3429c010f4abc08b2d06975623c9 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:26 +0100 Subject: [PATCH 129/548] mm/rmap: introduce and use hugetlb_remove_rmap() hugetlb rmap handling differs quite a lot from "ordinary" rmap code. For example, hugetlb currently only supports entire mappings, and treats any mapping as mapped using a single "logical PTE". Let's move it out of the way so we can overhaul our "ordinary" rmap. implementation/interface. Let's introduce and use hugetlb_remove_rmap() and remove the hugetlb code from page_remove_rmap(). This effectively removes one check on the small-folio path as well. Add sanity checks that we end up with the right folios in the right functions. Note: all possible candidates that need care are page_remove_rmap() that pass compound=true. Link: https://lkml.kernel.org/r/20231220224504.646757-3-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Reviewed-by: Matthew Wilcox (Oracle) Reviewed-by: Muchun Song Cc: Hugh Dickins Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit e135826b2da0cf25305086dc9ac1e91718a148e1) Signed-off-by: Koba Ko --- include/linux/rmap.h | 7 +++++++ mm/hugetlb.c | 4 ++-- mm/rmap.c | 18 +++++++++--------- 3 files changed, 18 insertions(+), 11 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 23e456fb5487e..7246bff0edd08 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -208,6 +208,13 @@ void hugetlb_add_anon_rmap(struct page *, struct vm_area_struct *, void hugetlb_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); +static inline void hugetlb_remove_rmap(struct folio *folio) +{ + VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); + + atomic_dec(&folio->_entire_mapcount); +} + static inline void __page_dup_rmap(struct page *page, bool compound) { if (compound) { diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a8ab300b3b7dc..1a65c86b09959 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5424,7 +5424,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, make_pte_marker(PTE_MARKER_UFFD_WP), sz); hugetlb_count_sub(pages_per_huge_page(h), mm); - page_remove_rmap(page, vma, true); + hugetlb_remove_rmap(page_folio(page)); spin_unlock(ptl); tlb_remove_page_size(tlb, page, huge_page_size(h)); @@ -5733,7 +5733,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, /* Break COW or unshare */ huge_ptep_clear_flush(vma, haddr, ptep); - page_remove_rmap(&old_folio->page, vma, true); + hugetlb_remove_rmap(old_folio); hugetlb_add_new_anon_rmap(new_folio, vma, haddr); if (huge_pte_uffd_wp(pte)) newpte = huge_pte_mkuffd_wp(newpte); diff --git a/mm/rmap.c b/mm/rmap.c index 9467451d80eb1..109b0aaed83be 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1452,15 +1452,9 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, bool last; enum node_stat_item idx; + VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); VM_BUG_ON_PAGE(compound && !PageHead(page), page); - /* Hugetlb pages are not counted in NR_*MAPPED */ - if (unlikely(folio_test_hugetlb(folio))) { - /* hugetlb pages are always mapped with pmds */ - atomic_dec(&folio->_entire_mapcount); - return; - } - /* Is page being unmapped by PTE? Is this its last map to be removed? */ if (likely(!compound)) { last = atomic_add_negative(-1, &page->_mapcount); @@ -1818,7 +1812,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, dec_mm_counter(mm, mm_counter_file(&folio->page)); } discard: - page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); + if (unlikely(folio_test_hugetlb(folio))) + hugetlb_remove_rmap(folio); + else + page_remove_rmap(subpage, vma, false); if (vma->vm_flags & VM_LOCKED) mlock_drain_local(); folio_put(folio); @@ -2171,7 +2168,10 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, */ } - page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); + if (unlikely(folio_test_hugetlb(folio))) + hugetlb_remove_rmap(folio); + else + page_remove_rmap(subpage, vma, false); if (vma->vm_flags & VM_LOCKED) mlock_drain_local(); folio_put(folio); From d29731426619d78d08379385642e7ced66fcec48 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:27 +0100 Subject: [PATCH 130/548] mm/rmap: introduce and use hugetlb_add_file_rmap() hugetlb rmap handling differs quite a lot from "ordinary" rmap code. For example, hugetlb currently only supports entire mappings, and treats any mapping as mapped using a single "logical PTE". Let's move it out of the way so we can overhaul our "ordinary" rmap. implementation/interface. Right now we're using page_dup_file_rmap() in some cases where "ordinary" rmap code would have used page_add_file_rmap(). So let's introduce and use hugetlb_add_file_rmap() instead. We won't be adding a "hugetlb_dup_file_rmap()" functon for the fork() case, as it would be doing the same: "dup" is just an optimization for "add". What remains is a single page_dup_file_rmap() call in fork() code. Add sanity checks that we end up with the right folios in the right functions. Link: https://lkml.kernel.org/r/20231220224504.646757-4-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Reviewed-by: Muchun Song Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit 44887f39945519fa8405133b1acd098fda9c9746) Signed-off-by: Koba Ko --- include/linux/rmap.h | 8 ++++++++ mm/hugetlb.c | 6 +++--- mm/migrate.c | 2 +- mm/rmap.c | 1 + 4 files changed, 13 insertions(+), 4 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 7246bff0edd08..c5868db3ece0f 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -208,6 +208,14 @@ void hugetlb_add_anon_rmap(struct page *, struct vm_area_struct *, void hugetlb_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); +static inline void hugetlb_add_file_rmap(struct folio *folio) +{ + VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); + VM_WARN_ON_FOLIO(folio_test_anon(folio), folio); + + atomic_inc(&folio->_entire_mapcount); +} + static inline void hugetlb_remove_rmap(struct folio *folio) { VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1a65c86b09959..32aef4256892d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5156,7 +5156,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, * sleep during the process. */ if (!folio_test_anon(pte_folio)) { - page_dup_file_rmap(&pte_folio->page, true); + hugetlb_add_file_rmap(pte_folio); } else if (page_try_dup_anon_rmap(&pte_folio->page, true, src_vma)) { pte_t src_pte_old = entry; @@ -6024,7 +6024,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, if (anon_rmap) hugetlb_add_new_anon_rmap(folio, vma, haddr); else - page_dup_file_rmap(&folio->page, true); + hugetlb_add_file_rmap(folio); new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE) && (vma->vm_flags & VM_SHARED))); /* @@ -6448,7 +6448,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte, goto out_release_unlock; if (folio_in_pagecache) - page_dup_file_rmap(&folio->page, true); + hugetlb_add_file_rmap(folio); else hugetlb_add_new_anon_rmap(folio, dst_vma, dst_addr); diff --git a/mm/migrate.c b/mm/migrate.c index 959e44c5b0847..1ca33989565a5 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -252,7 +252,7 @@ static bool remove_migration_pte(struct folio *folio, hugetlb_add_anon_rmap(new, vma, pvmw.address, rmap_flags); else - page_dup_file_rmap(new, true); + hugetlb_add_file_rmap(folio); set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte, psize); } else diff --git a/mm/rmap.c b/mm/rmap.c index 109b0aaed83be..98be470d80639 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1367,6 +1367,7 @@ void folio_add_file_rmap_range(struct folio *folio, struct page *page, unsigned int nr_pmdmapped = 0, first; int nr = 0; + VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); VM_WARN_ON_FOLIO(compound && !folio_test_pmd_mappable(folio), folio); /* Is page being mapped by PTE? Is this its first map to be added? */ From 8521e4b7d1eafdfb4bd743f597ccc3434b1eab40 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:28 +0100 Subject: [PATCH 131/548] mm/rmap: introduce and use hugetlb_try_dup_anon_rmap() hugetlb rmap handling differs quite a lot from "ordinary" rmap code. For example, hugetlb currently only supports entire mappings, and treats any mapping as mapped using a single "logical PTE". Let's move it out of the way so we can overhaul our "ordinary" rmap. implementation/interface. So let's introduce and use hugetlb_try_dup_anon_rmap() to make all hugetlb handling use dedicated hugetlb_* rmap functions. Add sanity checks that we end up with the right folios in the right functions. Note that is_device_private_page() does not apply to hugetlb. Link: https://lkml.kernel.org/r/20231220224504.646757-5-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Reviewed-by: Muchun Song Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit ebe2e35ec0f256372c158a18de459fb60070b313) Signed-off-by: Koba Ko --- include/linux/mm.h | 12 +++++++++--- include/linux/rmap.h | 18 ++++++++++++++++++ mm/hugetlb.c | 3 +-- 3 files changed, 28 insertions(+), 5 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ee26e37daa0a8..8cd0a9669ab8c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1950,15 +1950,21 @@ static inline bool page_maybe_dma_pinned(struct page *page) * * The caller has to hold the PT lock and the vma->vm_mm->->write_protect_seq. */ -static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, - struct page *page) +static inline bool folio_needs_cow_for_dma(struct vm_area_struct *vma, + struct folio *folio) { VM_BUG_ON(!(raw_read_seqcount(&vma->vm_mm->write_protect_seq) & 1)); if (!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)) return false; - return page_maybe_dma_pinned(page); + return folio_maybe_dma_pinned(folio); +} + +static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, + struct page *page) +{ + return folio_needs_cow_for_dma(vma, page_folio(page)); } /** diff --git a/include/linux/rmap.h b/include/linux/rmap.h index c5868db3ece0f..46a020bf3f7d2 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -208,6 +208,22 @@ void hugetlb_add_anon_rmap(struct page *, struct vm_area_struct *, void hugetlb_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); +/* See page_try_dup_anon_rmap() */ +static inline int hugetlb_try_dup_anon_rmap(struct folio *folio, + struct vm_area_struct *vma) +{ + VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); + + if (PageAnonExclusive(&folio->page)) { + if (unlikely(folio_needs_cow_for_dma(vma, folio))) + return -EBUSY; + ClearPageAnonExclusive(&folio->page); + } + atomic_inc(&folio->_entire_mapcount); + return 0; +} + static inline void hugetlb_add_file_rmap(struct folio *folio) { VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); @@ -225,6 +241,8 @@ static inline void hugetlb_remove_rmap(struct folio *folio) static inline void __page_dup_rmap(struct page *page, bool compound) { + VM_WARN_ON(folio_test_hugetlb(page_folio(page))); + if (compound) { struct folio *folio = (struct folio *)page; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 32aef4256892d..e5f2c2679c542 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5157,8 +5157,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, */ if (!folio_test_anon(pte_folio)) { hugetlb_add_file_rmap(pte_folio); - } else if (page_try_dup_anon_rmap(&pte_folio->page, - true, src_vma)) { + } else if (hugetlb_try_dup_anon_rmap(pte_folio, src_vma)) { pte_t src_pte_old = entry; struct folio *new_folio; From 54a69149ab1df79609dd77777fb566992dffddf6 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:29 +0100 Subject: [PATCH 132/548] mm/rmap: introduce and use hugetlb_try_share_anon_rmap() hugetlb rmap handling differs quite a lot from "ordinary" rmap code. For example, hugetlb currently only supports entire mappings, and treats any mapping as mapped using a single "logical PTE". Let's move it out of the way so we can overhaul our "ordinary" rmap. implementation/interface. So let's introduce and use hugetlb_try_dup_anon_rmap() to make all hugetlb handling use dedicated hugetlb_* rmap functions. Add sanity checks that we end up with the right folios in the right functions. Note that try_to_unmap_one() does not need care. Easy to spot because among all that nasty hugetlb special-casing in that function, we're not using set_huge_pte_at() on the anon path -- well, and that code assumes that we would want to swapout. Link: https://lkml.kernel.org/r/20231220224504.646757-6-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit 0c2ec32bf0b2f0d7ccb98c53ee5d255d68e73595) Signed-off-by: Koba Ko --- include/linux/rmap.h | 25 +++++++++++++++++++++++++ mm/rmap.c | 15 ++++++++++----- 2 files changed, 35 insertions(+), 5 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 46a020bf3f7d2..342fb85ed324d 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -224,6 +224,30 @@ static inline int hugetlb_try_dup_anon_rmap(struct folio *folio, return 0; } +/* See page_try_share_anon_rmap() */ +static inline int hugetlb_try_share_anon_rmap(struct folio *folio) +{ + VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); + VM_WARN_ON_FOLIO(!PageAnonExclusive(&folio->page), folio); + + /* Paired with the memory barrier in try_grab_folio(). */ + if (IS_ENABLED(CONFIG_HAVE_FAST_GUP)) + smp_mb(); + + if (unlikely(folio_maybe_dma_pinned(folio))) + return -EBUSY; + ClearPageAnonExclusive(&folio->page); + + /* + * This is conceptually a smp_wmb() paired with the smp_rmb() in + * gup_must_unshare(). + */ + if (IS_ENABLED(CONFIG_HAVE_FAST_GUP)) + smp_mb__after_atomic(); + return 0; +} + static inline void hugetlb_add_file_rmap(struct folio *folio) { VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); @@ -328,6 +352,7 @@ static inline int page_try_dup_anon_rmap(struct page *page, bool compound, */ static inline int page_try_share_anon_rmap(struct page *page) { + VM_WARN_ON(folio_test_hugetlb(page_folio(page))); VM_BUG_ON_PAGE(!PageAnon(page) || !PageAnonExclusive(page), page); /* device private pages cannot get pinned via GUP. */ diff --git a/mm/rmap.c b/mm/rmap.c index 98be470d80639..f465b9d506511 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -2121,13 +2121,18 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, !anon_exclusive, subpage); /* See page_try_share_anon_rmap(): clear PTE first. */ - if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { - if (folio_test_hugetlb(folio)) + if (folio_test_hugetlb(folio)) { + if (anon_exclusive && + hugetlb_try_share_anon_rmap(folio)) { set_huge_pte_at(mm, address, pvmw.pte, pteval, hsz); - else - set_pte_at(mm, address, pvmw.pte, pteval); + ret = false; + page_vma_mapped_walk_done(&pvmw); + break; + } + } else if (anon_exclusive && + page_try_share_anon_rmap(subpage)) { + set_pte_at(mm, address, pvmw.pte, pteval); ret = false; page_vma_mapped_walk_done(&pvmw); break; From 47859859499e762c7cecf2927cd49a9b559271d4 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:30 +0100 Subject: [PATCH 133/548] mm/rmap: add hugetlb sanity checks for anon rmap handling Let's make sure we end up with the right folios in the right functions when adding an anon rmap, just like we already do in the other rmap functions. Link: https://lkml.kernel.org/r/20231220224504.646757-7-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit a4ea18641d8330a97d7d66f0ab017b690099ffce) Signed-off-by: Koba Ko --- mm/rmap.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mm/rmap.c b/mm/rmap.c index f465b9d506511..6d0667303b1f8 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1235,6 +1235,8 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, bool compound = flags & RMAP_COMPOUND; bool first = true; + VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); + /* Is page being mapped by PTE? Is this its first map to be added? */ if (likely(!compound)) { first = atomic_inc_and_test(&page->_mapcount); @@ -1315,6 +1317,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, { int nr = folio_nr_pages(folio); + VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); VM_BUG_ON_VMA(address < vma->vm_start || address + (nr << PAGE_SHIFT) > vma->vm_end, vma); __folio_set_swapbacked(folio); @@ -2609,6 +2612,7 @@ void hugetlb_add_anon_rmap(struct page *page, struct vm_area_struct *vma, struct folio *folio = page_folio(page); int first; + VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); first = atomic_inc_and_test(&folio->_entire_mapcount); @@ -2621,6 +2625,8 @@ void hugetlb_add_anon_rmap(struct page *page, struct vm_area_struct *vma, void hugetlb_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, unsigned long address) { + VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); + BUG_ON(address < vma->vm_start || address >= vma->vm_end); /* increment count (starts at -1) */ atomic_set(&folio->_entire_mapcount, 0); From f4ace5652c7bc4897b817cd5ab06c139bb404baf Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:31 +0100 Subject: [PATCH 134/548] mm/rmap: convert folio_add_file_rmap_range() into folio_add_file_rmap_[pte|ptes|pmd]() Let's get rid of the compound parameter and instead define explicitly which mappings we're adding. That is more future proof, easier to read and harder to mess up. Use an enum to express the granularity internally. Make the compiler always special-case on the granularity by using __always_inline. Replace the "compound" check by a switch-case that will be removed by the compiler completely. Add plenty of sanity checks with CONFIG_DEBUG_VM. Replace the folio_test_pmd_mappable() check by a config check in the caller and sanity checks. Convert the single user of folio_add_file_rmap_range(). While at it, consistently use "int" instead of "unisgned int" in rmap code when dealing with mapcounts and the number of pages. This function design can later easily be extended to PUDs and to batch PMDs. Note that for now we don't support anything bigger than PMD-sized folios (as we cleanly separated hugetlb handling). Sanity checks will catch if that ever changes. Next up is removing page_remove_rmap() along with its "compound" parameter and smilarly converting all other rmap functions. Link: https://lkml.kernel.org/r/20231220224504.646757-8-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit 68f0320824fa59c5429cbc811e6c46e7a30ea32c) Signed-off-by: Koba Ko --- include/linux/rmap.h | 46 ++++++++++++++++++++++++-- mm/memory.c | 2 +- mm/rmap.c | 79 ++++++++++++++++++++++++++++---------------- 3 files changed, 95 insertions(+), 32 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 342fb85ed324d..dda94c75d2156 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -186,6 +186,44 @@ typedef int __bitwise rmap_t; */ #define RMAP_COMPOUND ((__force rmap_t)BIT(1)) +/* + * Internally, we're using an enum to specify the granularity. We make the + * compiler emit specialized code for each granularity. + */ +enum rmap_level { + RMAP_LEVEL_PTE = 0, + RMAP_LEVEL_PMD, +}; + +static inline void __folio_rmap_sanity_checks(struct folio *folio, + struct page *page, int nr_pages, enum rmap_level level) +{ + /* hugetlb folios are handled separately. */ + VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); + VM_WARN_ON_FOLIO(folio_test_large(folio) && + !folio_test_large_rmappable(folio), folio); + + VM_WARN_ON_ONCE(nr_pages <= 0); + VM_WARN_ON_FOLIO(page_folio(page) != folio, folio); + VM_WARN_ON_FOLIO(page_folio(page + nr_pages - 1) != folio, folio); + + switch (level) { + case RMAP_LEVEL_PTE: + break; + case RMAP_LEVEL_PMD: + /* + * We don't support folios larger than a single PMD yet. So + * when RMAP_LEVEL_PMD is set, we assume that we are creating + * a single "entire" mapping of the folio. + */ + VM_WARN_ON_FOLIO(folio_nr_pages(folio) != HPAGE_PMD_NR, folio); + VM_WARN_ON_FOLIO(nr_pages != HPAGE_PMD_NR, folio); + break; + default: + VM_WARN_ON_ONCE(true); + } +} + /* * rmap interfaces called when adding or removing pte of page */ @@ -198,8 +236,12 @@ void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); void page_add_file_rmap(struct page *, struct vm_area_struct *, bool compound); -void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr, - struct vm_area_struct *, bool compound); +void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages, + struct vm_area_struct *); +#define folio_add_file_rmap_pte(folio, page, vma) \ + folio_add_file_rmap_ptes(folio, page, 1, vma) +void folio_add_file_rmap_pmd(struct folio *, struct page *, + struct vm_area_struct *); void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); diff --git a/mm/memory.c b/mm/memory.c index 683b29c673228..2c48ee0060d4b 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4492,7 +4492,7 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio, folio_add_lru_vma(folio, vma); } else { add_mm_counter(vma->vm_mm, mm_counter_file(page), nr); - folio_add_file_rmap_range(folio, page, nr, vma, false); + folio_add_file_rmap_ptes(folio, page, nr, vma); } set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr); diff --git a/mm/rmap.c b/mm/rmap.c index 6d0667303b1f8..b9d8006fc93af 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1350,31 +1350,18 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); } -/** - * folio_add_file_rmap_range - add pte mapping to page range of a folio - * @folio: The folio to add the mapping to - * @page: The first page to add - * @nr_pages: The number of pages which will be mapped - * @vma: the vm area in which the mapping is added - * @compound: charge the page as compound or small page - * - * The page range of folio is defined by [first_page, first_page + nr_pages) - * - * The caller needs to hold the pte lock. - */ -void folio_add_file_rmap_range(struct folio *folio, struct page *page, - unsigned int nr_pages, struct vm_area_struct *vma, - bool compound) +static __always_inline void __folio_add_file_rmap(struct folio *folio, + struct page *page, int nr_pages, struct vm_area_struct *vma, + enum rmap_level level) { atomic_t *mapped = &folio->_nr_pages_mapped; - unsigned int nr_pmdmapped = 0, first; - int nr = 0; + int nr = 0, nr_pmdmapped = 0, first; - VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); - VM_WARN_ON_FOLIO(compound && !folio_test_pmd_mappable(folio), folio); + VM_WARN_ON_FOLIO(folio_test_anon(folio), folio); + __folio_rmap_sanity_checks(folio, page, nr_pages, level); - /* Is page being mapped by PTE? Is this its first map to be added? */ - if (likely(!compound)) { + switch (level) { + case RMAP_LEVEL_PTE: do { first = atomic_inc_and_test(&page->_mapcount); if (first && folio_test_large(folio)) { @@ -1385,9 +1372,8 @@ void folio_add_file_rmap_range(struct folio *folio, struct page *page, if (first) nr++; } while (page++, --nr_pages > 0); - } else if (folio_test_pmd_mappable(folio)) { - /* That test is redundant: it's for safety or to optimize out */ - + break; + case RMAP_LEVEL_PMD: first = atomic_inc_and_test(&folio->_entire_mapcount); if (first) { nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); @@ -1402,6 +1388,7 @@ void folio_add_file_rmap_range(struct folio *folio, struct page *page, nr = 0; } } + break; } if (nr_pmdmapped) @@ -1415,6 +1402,43 @@ void folio_add_file_rmap_range(struct folio *folio, struct page *page, mlock_vma_folio(folio, vma); } +/** + * folio_add_file_rmap_ptes - add PTE mappings to a page range of a folio + * @folio: The folio to add the mappings to + * @page: The first page to add + * @nr_pages: The number of pages that will be mapped using PTEs + * @vma: The vm area in which the mappings are added + * + * The page range of the folio is defined by [page, page + nr_pages) + * + * The caller needs to hold the page table lock. + */ +void folio_add_file_rmap_ptes(struct folio *folio, struct page *page, + int nr_pages, struct vm_area_struct *vma) +{ + __folio_add_file_rmap(folio, page, nr_pages, vma, RMAP_LEVEL_PTE); +} + +/** + * folio_add_file_rmap_pmd - add a PMD mapping to a page range of a folio + * @folio: The folio to add the mapping to + * @page: The first page to add + * @vma: The vm area in which the mapping is added + * + * The page range of the folio is defined by [page, page + HPAGE_PMD_NR) + * + * The caller needs to hold the page table lock. + */ +void folio_add_file_rmap_pmd(struct folio *folio, struct page *page, + struct vm_area_struct *vma) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + __folio_add_file_rmap(folio, page, HPAGE_PMD_NR, vma, RMAP_LEVEL_PMD); +#else + WARN_ON_ONCE(true); +#endif +} + /** * page_add_file_rmap - add pte mapping to a file page * @page: the page to add the mapping to @@ -1427,16 +1451,13 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, bool compound) { struct folio *folio = page_folio(page); - unsigned int nr_pages; VM_WARN_ON_ONCE_PAGE(compound && !PageTransHuge(page), page); if (likely(!compound)) - nr_pages = 1; + folio_add_file_rmap_pte(folio, page, vma); else - nr_pages = folio_nr_pages(folio); - - folio_add_file_rmap_range(folio, page, nr_pages, vma, compound); + folio_add_file_rmap_pmd(folio, page, vma); } /** From da018f8440f023884ea78439059e4e1760f9e565 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 6 Oct 2023 20:53:13 +0100 Subject: [PATCH 135/548] mm: make lock_folio_maybe_drop_mmap() VMA lock aware Patch series "Handle more faults under the VMA lock", v2. At this point, we're handling the majority of file-backed page faults under the VMA lock, using the ->map_pages entry point. This patch set attempts to expand that for the following siutations: - We have to do a read. This could be because we've hit the point in the readahead window where we need to kick off the next readahead, or because the page is simply not present in cache. - We're handling a write fault. Most applications don't do I/O by writes to shared mmaps for very good reasons, but some do, and it'd be nice to not make that slow unnecessarily. - We're doing a COW of a private mapping (both PTE already present and PTE not-present). These are two different codepaths and I handle both of them in this patch set. There is no support in this patch set for drivers to mark themselves as being VMA lock friendly; they could implement the ->map_pages vm_operation, but if they do, they would be the first. This is probably something we want to change at some point in the future, and I've marked where to make that change in the code. There is very little performance change in the benchmarks we've run; mostly because the vast majority of page faults are handled through the other paths. I still think this patch series is useful for workloads that may take these paths more often, and just for cleaning up the fault path in general (it's now clearer why we have to retry in these cases). This patch (of 6): Drop the VMA lock instead of the mmap_lock if that's the one which is held. Link: https://lkml.kernel.org/r/20231006195318.4087158-1-willy@infradead.org Link: https://lkml.kernel.org/r/20231006195318.4087158-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Signed-off-by: Andrew Morton (cherry picked from commit 5d74b2ab2c15d596c470bae6626f345d5575a9d0) Signed-off-by: Koba Ko --- mm/filemap.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 05eb77623a106..86fbe650d8833 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3151,7 +3151,7 @@ static int lock_folio_maybe_drop_mmap(struct vm_fault *vmf, struct folio *folio, /* * NOTE! This will make us return with VM_FAULT_RETRY, but with - * the mmap_lock still held. That's how FAULT_FLAG_RETRY_NOWAIT + * the fault lock still held. That's how FAULT_FLAG_RETRY_NOWAIT * is supposed to work. We have way too many special cases.. */ if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT) @@ -3161,13 +3161,14 @@ static int lock_folio_maybe_drop_mmap(struct vm_fault *vmf, struct folio *folio, if (vmf->flags & FAULT_FLAG_KILLABLE) { if (__folio_lock_killable(folio)) { /* - * We didn't have the right flags to drop the mmap_lock, - * but all fault_handlers only check for fatal signals - * if we return VM_FAULT_RETRY, so we need to drop the - * mmap_lock here and return 0 if we don't have a fpin. + * We didn't have the right flags to drop the + * fault lock, but all fault_handlers only check + * for fatal signals if we return VM_FAULT_RETRY, + * so we need to drop the fault lock here and + * return 0 if we don't have a fpin. */ if (*fpin == NULL) - mmap_read_unlock(vmf->vma->vm_mm); + release_fault_lock(vmf); return 0; } } else From ca384324cee58135d9b0f66838a95874092b1b45 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 6 Oct 2023 20:53:14 +0100 Subject: [PATCH 136/548] mm: call wp_page_copy() under the VMA lock It is usually safe to call wp_page_copy() under the VMA lock. The only unsafe situation is when no anon_vma has been allocated for this VMA, and we have to look at adjacent VMAs to determine if their anon_vma can be shared. Since this happens only for the first COW of a page in this VMA, the majority of calls to wp_page_copy() do not need to fall back to the mmap_sem. Add vmf_anon_prepare() as an alternative to anon_vma_prepare() which will return RETRY if we currently hold the VMA lock and need to allocate an anon_vma. This lets us drop the check in do_wp_page(). Link: https://lkml.kernel.org/r/20231006195318.4087158-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Signed-off-by: Andrew Morton (cherry picked from commit 164b06f238b986317131e6b61b2f22aabcbc2cc0) Signed-off-by: Koba Ko --- mm/memory.c | 39 ++++++++++++++++++++++++++------------- 1 file changed, 26 insertions(+), 13 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 2c48ee0060d4b..219258817524d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3049,6 +3049,21 @@ static inline void wp_page_reuse(struct vm_fault *vmf) count_vm_event(PGREUSE); } +static vm_fault_t vmf_anon_prepare(struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + + if (likely(vma->anon_vma)) + return 0; + if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + vma_end_read(vma); + return VM_FAULT_RETRY; + } + if (__anon_vma_prepare(vma)) + return VM_FAULT_OOM; + return 0; +} + /* * Handle the case of a page which we actually need to copy to a new page, * either due to COW or unsharing. @@ -3076,27 +3091,29 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) pte_t entry; int page_copied = 0; struct mmu_notifier_range range; - int ret; + vm_fault_t ret; delayacct_wpcopy_start(); if (vmf->page) old_folio = page_folio(vmf->page); - if (unlikely(anon_vma_prepare(vma))) - goto oom; + ret = vmf_anon_prepare(vmf); + if (unlikely(ret)) + goto out; if (is_zero_pfn(pte_pfn(vmf->orig_pte))) { new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address); if (!new_folio) goto oom; } else { + int err; new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vmf->address, false); if (!new_folio) goto oom; - ret = __wp_page_copy_user(&new_folio->page, vmf->page, vmf); - if (ret) { + err = __wp_page_copy_user(&new_folio->page, vmf->page, vmf); + if (err) { /* * COW failed, if the fault was solved by other, * it's fine. If not, userspace would re-fault on @@ -3109,7 +3126,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) folio_put(old_folio); delayacct_wpcopy_end(); - return ret == -EHWPOISON ? VM_FAULT_HWPOISON : 0; + return err == -EHWPOISON ? VM_FAULT_HWPOISON : 0; } kmsan_copy_page_meta(&new_folio->page, vmf->page); } @@ -3219,11 +3236,13 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) oom_free_new: folio_put(new_folio); oom: + ret = VM_FAULT_OOM; +out: if (old_folio) folio_put(old_folio); delayacct_wpcopy_end(); - return VM_FAULT_OOM; + return ret; } /** @@ -3448,12 +3467,6 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) return 0; } copy: - if ((vmf->flags & FAULT_FLAG_VMA_LOCK) && !vma->anon_vma) { - pte_unmap_unlock(vmf->pte, vmf->ptl); - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } - /* * Ok, we need to copy. Oh, well.. */ From 98af015d4ff6d85d4c729d45f97958a61d825b2c Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 6 Oct 2023 20:53:15 +0100 Subject: [PATCH 137/548] mm: handle shared faults under the VMA lock There are many implementations of ->fault and some of them depend on mmap_lock being held. All vm_ops that implement ->map_pages() end up calling filemap_fault(), which I have audited to be sure it does not rely on mmap_lock. So (for now) key off ->map_pages existing as a flag to indicate that it's safe to call ->fault while only holding the vma lock. Link: https://lkml.kernel.org/r/20231006195318.4087158-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Signed-off-by: Andrew Morton (cherry picked from commit 4ed4379881aa62588aba6442a9f362a8cf7624e6) Signed-off-by: Koba Ko --- mm/memory.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 219258817524d..10a4a6f6363d0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3049,6 +3049,21 @@ static inline void wp_page_reuse(struct vm_fault *vmf) count_vm_event(PGREUSE); } +/* + * We could add a bitflag somewhere, but for now, we know that all + * vm_ops that have a ->map_pages have been audited and don't need + * the mmap_lock to be held. + */ +static inline vm_fault_t vmf_can_call_fault(const struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + + if (vma->vm_ops->map_pages || !(vmf->flags & FAULT_FLAG_VMA_LOCK)) + return 0; + vma_end_read(vma); + return VM_FAULT_RETRY; +} + static vm_fault_t vmf_anon_prepare(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; @@ -4779,10 +4794,9 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf) vm_fault_t ret, tmp; struct folio *folio; - if (vmf->flags & FAULT_FLAG_VMA_LOCK) { - vma_end_read(vma); - return VM_FAULT_RETRY; - } + ret = vmf_can_call_fault(vmf); + if (ret) + return ret; ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) From 8b1995c8a22e6d68e15aae27d7f655bbf15c9b4f Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 6 Oct 2023 20:53:16 +0100 Subject: [PATCH 138/548] mm: handle COW faults under the VMA lock If the page is not currently present in the page tables, we need to call the page fault handler to find out which page we're supposed to COW, so we need to both check that there is already an anon_vma and that the fault handler doesn't need the mmap_lock. Link: https://lkml.kernel.org/r/20231006195318.4087158-5-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Signed-off-by: Andrew Morton (cherry picked from commit 4de8c93a4751e10737b6af65db42c743228c67a6) Signed-off-by: Koba Ko --- mm/memory.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 10a4a6f6363d0..b0f09ef32a771 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4749,13 +4749,11 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) struct vm_area_struct *vma = vmf->vma; vm_fault_t ret; - if (vmf->flags & FAULT_FLAG_VMA_LOCK) { - vma_end_read(vma); - return VM_FAULT_RETRY; - } - - if (unlikely(anon_vma_prepare(vma))) - return VM_FAULT_OOM; + ret = vmf_can_call_fault(vmf); + if (!ret) + ret = vmf_anon_prepare(vmf); + if (ret) + return ret; vmf->cow_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address); if (!vmf->cow_page) From 3cd8a8c46f776326af82fc075c4815f491f6ce06 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 6 Oct 2023 20:53:17 +0100 Subject: [PATCH 139/548] mm: handle read faults under the VMA lock Most file-backed faults are already handled through ->map_pages(), but if we need to do I/O we'll come this way. Since filemap_fault() is now safe to be called under the VMA lock, we can handle these faults under the VMA lock now. Link: https://lkml.kernel.org/r/20231006195318.4087158-6-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Signed-off-by: Andrew Morton (cherry picked from commit 12214eba1992642eee5813a9cc9f626e5b2d1815) Signed-off-by: Koba Ko --- mm/memory.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index b0f09ef32a771..4d231dcb554e6 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4727,10 +4727,9 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) return ret; } - if (vmf->flags & FAULT_FLAG_VMA_LOCK) { - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } + ret = vmf_can_call_fault(vmf); + if (ret) + return ret; ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) From 7adea80bb60d18209e141689f019f74d1c896ca8 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Fri, 6 Oct 2023 20:53:18 +0100 Subject: [PATCH 140/548] mm: handle write faults to RO pages under the VMA lock I think this is a pretty rare occurrence, but for consistency handle faults with the VMA lock held the same way that we handle other faults with the VMA lock held. Link: https://lkml.kernel.org/r/20231006195318.4087158-7-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Suren Baghdasaryan Signed-off-by: Andrew Morton (cherry picked from commit 4a68fef16df9d88d528094116f8bbd2dbfa62089) Signed-off-by: Koba Ko --- mm/memory.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 4d231dcb554e6..63b194c55b5dd 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3308,10 +3308,9 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf) vm_fault_t ret; pte_unmap_unlock(vmf->pte, vmf->ptl); - if (vmf->flags & FAULT_FLAG_VMA_LOCK) { - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; - } + ret = vmf_can_call_fault(vmf); + if (ret) + return ret; vmf->flags |= FAULT_FLAG_MKWRITE; ret = vma->vm_ops->pfn_mkwrite(vmf); @@ -3335,10 +3334,10 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio) vm_fault_t tmp; pte_unmap_unlock(vmf->pte, vmf->ptl); - if (vmf->flags & FAULT_FLAG_VMA_LOCK) { + tmp = vmf_can_call_fault(vmf); + if (tmp) { folio_put(folio); - vma_end_read(vmf->vma); - return VM_FAULT_RETRY; + return tmp; } tmp = do_page_mkwrite(vmf, folio); From 059212ab61a95f485b2ed171b58c0cef8c1d1101 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 2 Oct 2023 16:29:47 +0200 Subject: [PATCH 141/548] mm/rmap: move SetPageAnonExclusive() out of page_move_anon_rmap() Patch series "mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()". Convert page_move_anon_rmap() to folio_move_anon_rmap(), letting the callers handle PageAnonExclusive. I'm including cleanup patch #3 because it fits into the picture and can be done cleaner by the conversion. This patch (of 3): Let's move it into the caller: there is a difference between whether an anon folio can only be mapped by one process (e.g., into one VMA), and whether it is truly exclusive (e.g., no references -- including GUP -- from other processes). Further, for large folios the page might not actually be pointing at the head page of the folio, so it better be handled in the caller. This is a preparation for converting page_move_anon_rmap() to consume a folio. Link: https://lkml.kernel.org/r/20231002142949.235104-1-david@redhat.com Link: https://lkml.kernel.org/r/20231002142949.235104-2-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Suren Baghdasaryan Reviewed-by: Vishal Moola (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Matthew Wilcox Signed-off-by: Andrew Morton (cherry picked from commit 5ca432896a4ce6d69fffc3298b24c0dd9bdb871f) Signed-off-by: Koba Ko --- mm/huge_memory.c | 1 + mm/hugetlb.c | 4 +++- mm/memory.c | 1 + mm/rmap.c | 1 - 4 files changed, 5 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2ef540aa98316..c72a7f034fd5e 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1526,6 +1526,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf) pmd_t entry; page_move_anon_rmap(page, vma); + SetPageAnonExclusive(page); folio_unlock(folio); reuse: if (unlikely(unshare)) { diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e5f2c2679c542..ec1903ec4e3d3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5620,8 +5620,10 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, * owner and can reuse this page. */ if (folio_mapcount(old_folio) == 1 && folio_test_anon(old_folio)) { - if (!PageAnonExclusive(&old_folio->page)) + if (!PageAnonExclusive(&old_folio->page)) { page_move_anon_rmap(&old_folio->page, vma); + SetPageAnonExclusive(&old_folio->page); + } if (likely(!unshare)) set_huge_ptep_writable(vma, haddr, ptep); diff --git a/mm/memory.c b/mm/memory.c index 63b194c55b5dd..037d3997bcc68 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3471,6 +3471,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) * sunglasses. Hit it. */ page_move_anon_rmap(vmf->page, vma); + SetPageAnonExclusive(vmf->page); folio_unlock(folio); reuse: if (unlikely(unshare)) { diff --git a/mm/rmap.c b/mm/rmap.c index b9d8006fc93af..6976bfb91e60f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1152,7 +1152,6 @@ void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) * folio_test_anon()) will not see one without the other. */ WRITE_ONCE(folio->mapping, anon_vma); - SetPageAnonExclusive(page); } /** From 32c0d8ab556742fafd2febc3898960390536f5e3 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 2 Oct 2023 16:29:48 +0200 Subject: [PATCH 142/548] mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap() Let's convert it to consume a folio. [akpm@linux-foundation.org: fix kerneldoc] Link: https://lkml.kernel.org/r/20231002142949.235104-3-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Suren Baghdasaryan Reviewed-by: Vishal Moola (Oracle) Cc: Mike Kravetz Cc: Muchun Song Cc: Matthew Wilcox Signed-off-by: Andrew Morton (cherry picked from commit 069686255c16a75b6a796e42df47f5af27b496a4) Signed-off-by: Koba Ko --- include/linux/rmap.h | 2 +- mm/huge_memory.c | 2 +- mm/hugetlb.c | 2 +- mm/memory.c | 2 +- mm/rmap.c | 16 +++++++--------- 5 files changed, 11 insertions(+), 13 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index dda94c75d2156..a1b81fa88b4c6 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -227,7 +227,7 @@ static inline void __folio_rmap_sanity_checks(struct folio *folio, /* * rmap interfaces called when adding or removing pte of page */ -void page_move_anon_rmap(struct page *, struct vm_area_struct *); +void folio_move_anon_rmap(struct folio *, struct vm_area_struct *); void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); void page_add_new_anon_rmap(struct page *, struct vm_area_struct *, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c72a7f034fd5e..30f25b501d7a1 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1525,7 +1525,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf) if (folio_ref_count(folio) == 1) { pmd_t entry; - page_move_anon_rmap(page, vma); + folio_move_anon_rmap(folio, vma); SetPageAnonExclusive(page); folio_unlock(folio); reuse: diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ec1903ec4e3d3..5a31ddacefe7c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5621,7 +5621,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, */ if (folio_mapcount(old_folio) == 1 && folio_test_anon(old_folio)) { if (!PageAnonExclusive(&old_folio->page)) { - page_move_anon_rmap(&old_folio->page, vma); + folio_move_anon_rmap(old_folio, vma); SetPageAnonExclusive(&old_folio->page); } if (likely(!unshare)) diff --git a/mm/memory.c b/mm/memory.c index 037d3997bcc68..6936123a9ccb9 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3470,7 +3470,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) * and the folio is locked, it's dark out, and we're wearing * sunglasses. Hit it. */ - page_move_anon_rmap(vmf->page, vma); + folio_move_anon_rmap(folio, vma); SetPageAnonExclusive(vmf->page); folio_unlock(folio); reuse: diff --git a/mm/rmap.c b/mm/rmap.c index 6976bfb91e60f..c166b99ce2446 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1128,19 +1128,17 @@ int folio_total_mapcount(struct folio *folio) } /** - * page_move_anon_rmap - move a page to our anon_vma - * @page: the page to move to our anon_vma - * @vma: the vma the page belongs to + * folio_move_anon_rmap - move a folio to our anon_vma + * @folio: The folio to move to our anon_vma + * @vma: The vma the folio belongs to * - * When a page belongs exclusively to one process after a COW event, - * that page can be moved into the anon_vma that belongs to just that - * process, so the rmap code will not search the parent or sibling - * processes. + * When a folio belongs exclusively to one process after a COW event, + * that folio can be moved into the anon_vma that belongs to just that + * process, so the rmap code will not search the parent or sibling processes. */ -void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) +void folio_move_anon_rmap(struct folio *folio, struct vm_area_struct *vma) { void *anon_vma = vma->anon_vma; - struct folio *folio = page_folio(page); VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_VMA(!anon_vma, vma); From 6678c6d0b224cbf69980b2c42ac2e83df2b78918 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 2 Oct 2023 16:29:49 +0200 Subject: [PATCH 143/548] memory: move exclusivity detection in do_wp_page() into wp_can_reuse_anon_folio() Let's clean up do_wp_page() a bit, removing two labels and making it a easier to read. wp_can_reuse_anon_folio() now only operates on the whole folio. Move the SetPageAnonExclusive() out into do_wp_page(). No need to do this under page lock -- the page table lock is sufficient. Link: https://lkml.kernel.org/r/20231002142949.235104-4-david@redhat.com Signed-off-by: David Hildenbrand Cc: Mike Kravetz Cc: Muchun Song Cc: Suren Baghdasaryan Cc: Matthew Wilcox Signed-off-by: Andrew Morton (cherry picked from commit dec078cc2181fccf8b134406b86aaacc19f7163f) Signed-off-by: Koba Ko --- mm/memory.c | 88 +++++++++++++++++++++++++++-------------------------- 1 file changed, 45 insertions(+), 43 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 6936123a9ccb9..1426328f910fd 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3362,6 +3362,44 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf, struct folio *folio) return ret; } +static bool wp_can_reuse_anon_folio(struct folio *folio, + struct vm_area_struct *vma) +{ + /* + * We have to verify under folio lock: these early checks are + * just an optimization to avoid locking the folio and freeing + * the swapcache if there is little hope that we can reuse. + * + * KSM doesn't necessarily raise the folio refcount. + */ + if (folio_test_ksm(folio) || folio_ref_count(folio) > 3) + return false; + if (!folio_test_lru(folio)) + /* + * We cannot easily detect+handle references from + * remote LRU caches or references to LRU folios. + */ + lru_add_drain(); + if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio)) + return false; + if (!folio_trylock(folio)) + return false; + if (folio_test_swapcache(folio)) + folio_free_swap(folio); + if (folio_test_ksm(folio) || folio_ref_count(folio) != 1) { + folio_unlock(folio); + return false; + } + /* + * Ok, we've got the only folio reference from our mapping + * and the folio is locked, it's dark out, and we're wearing + * sunglasses. Hit it. + */ + folio_move_anon_rmap(folio, vma); + folio_unlock(folio); + return true; +} + /* * This routine handles present pages, when * * users try to write to a shared page (FAULT_FLAG_WRITE) @@ -3431,49 +3469,14 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) /* * Private mapping: create an exclusive anonymous page copy if reuse * is impossible. We might miss VM_WRITE for FOLL_FORCE handling. + * + * If we encounter a page that is marked exclusive, we must reuse + * the page without further checks. */ - if (folio && folio_test_anon(folio)) { - /* - * If the page is exclusive to this process we must reuse the - * page without further checks. - */ - if (PageAnonExclusive(vmf->page)) - goto reuse; - - /* - * We have to verify under folio lock: these early checks are - * just an optimization to avoid locking the folio and freeing - * the swapcache if there is little hope that we can reuse. - * - * KSM doesn't necessarily raise the folio refcount. - */ - if (folio_test_ksm(folio) || folio_ref_count(folio) > 3) - goto copy; - if (!folio_test_lru(folio)) - /* - * We cannot easily detect+handle references from - * remote LRU caches or references to LRU folios. - */ - lru_add_drain(); - if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio)) - goto copy; - if (!folio_trylock(folio)) - goto copy; - if (folio_test_swapcache(folio)) - folio_free_swap(folio); - if (folio_test_ksm(folio) || folio_ref_count(folio) != 1) { - folio_unlock(folio); - goto copy; - } - /* - * Ok, we've got the only folio reference from our mapping - * and the folio is locked, it's dark out, and we're wearing - * sunglasses. Hit it. - */ - folio_move_anon_rmap(folio, vma); - SetPageAnonExclusive(vmf->page); - folio_unlock(folio); -reuse: + if (folio && folio_test_anon(folio) && + (PageAnonExclusive(vmf->page) || wp_can_reuse_anon_folio(folio, vma))) { + if (!PageAnonExclusive(vmf->page)) + SetPageAnonExclusive(vmf->page); if (unlikely(unshare)) { pte_unmap_unlock(vmf->pte, vmf->ptl); return 0; @@ -3481,7 +3484,6 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) wp_page_reuse(vmf); return 0; } -copy: /* * Ok, we need to copy. Oh, well.. */ From 53fccf8f7affe9a7891eec37c375d112f19482fe Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Wed, 18 Oct 2023 22:07:57 +0800 Subject: [PATCH 144/548] mm: huge_memory: use a folio in change_huge_pmd() Use a folio in change_huge_pmd(), which helps to remove last xchg_page_access_time() caller. Link: https://lkml.kernel.org/r/20231018140806.2783514-11-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Cc: David Hildenbrand Cc: Huang Ying Cc: Ingo Molnar Cc: Juri Lelli Cc: Matthew Wilcox (Oracle) Cc: Peter Zijlstra Cc: Vincent Guittot Cc: Zi Yan Signed-off-by: Andrew Morton (backported from commit d986ba2b1953f761d3859c22160e82c58ed4287d) Signed-off-by: Koba Ko --- include/linux/mm.h | 5 +++++ mm/huge_memory.c | 13 +++++++------ 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8cd0a9669ab8c..dc27352b84554 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1791,6 +1791,11 @@ static inline void vma_set_access_pid_bit(struct vm_area_struct *vma) } #endif /* CONFIG_NUMA_BALANCING */ +static inline int folio_xchg_access_time(struct folio *folio, int time) +{ + return xchg_page_access_time(&folio->page, time); +} + #if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS) /* diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 30f25b501d7a1..9f77d234b05c0 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1998,7 +1998,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION if (is_swap_pmd(*pmd)) { swp_entry_t entry = pmd_to_swp_entry(*pmd); - struct page *page = pfn_swap_entry_to_page(entry); + struct folio *folio = page_folio(pfn_swap_entry_to_page(entry)); pmd_t newpmd; VM_BUG_ON(!is_pmd_migration_entry(*pmd)); @@ -2007,7 +2007,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, * A protection check is difficult so * just be safe and disable write */ - if (PageAnon(page)) + if (folio_test_anon(folio)) entry = make_readable_exclusive_migration_entry(swp_offset(entry)); else entry = make_readable_migration_entry(swp_offset(entry)); @@ -2029,7 +2029,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, #endif if (prot_numa) { - struct page *page; + struct folio *folio; bool toptier; /* * Avoid trapping faults against the zero page. The read-only @@ -2042,8 +2042,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, if (pmd_protnone(*pmd)) goto unlock; - page = pmd_page(*pmd); - toptier = node_is_toptier(page_to_nid(page)); + folio = page_folio(pmd_page(*pmd)); + toptier = node_is_toptier(folio_nid(folio)); /* * Skip scanning top tier node if normal numa * balancing is disabled @@ -2054,7 +2054,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING && !toptier) - xchg_page_access_time(page, jiffies_to_msecs(jiffies)); + folio_xchg_access_time(folio, + jiffies_to_msecs(jiffies)); } /* * In case prot_numa, we are under mmap_read_lock(mm). It's critical From 3fabdc7df05d67b382925e8baf134a26896e87db Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Sat, 18 Nov 2023 10:32:28 +0800 Subject: [PATCH 145/548] mm: ksm: use more folio api in ksm_might_need_to_copy() Patch series "mm: cleanup and use more folio in page fault", v3. Rename page_copy_prealloc() to folio_prealloc(), which is used by more functions, also do more folio conversion in page fault. This patch (of 5): Since ksm only support normal page, no swapout/in for ksm large folio too, add large folio check in ksm_might_need_to_copy(), also convert page->index to folio->index as page->index is going away. Then convert ksm_might_need_to_copy() to use more folio api to save nine compound_head() calls, short 'address' to reduce max-line-length. Link: https://lkml.kernel.org/r/20231118023232.1409103-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20231118023232.1409103-2-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Cc: David Hildenbrand Cc: Matthew Wilcox (Oracle) Cc: Sidhartha Kumar Cc: Vishal Moola (Oracle) Signed-off-by: Andrew Morton (cherry picked from commit 1486fb50136f4799946f5ecfe050094574647153) Signed-off-by: Koba Ko --- include/linux/ksm.h | 4 ++-- mm/ksm.c | 39 +++++++++++++++++++++------------------ 2 files changed, 23 insertions(+), 20 deletions(-) diff --git a/include/linux/ksm.h b/include/linux/ksm.h index b9cdeba03668a..32ecea266fe62 100644 --- a/include/linux/ksm.h +++ b/include/linux/ksm.h @@ -88,7 +88,7 @@ static inline void ksm_exit(struct mm_struct *mm) * but what if the vma was unmerged while the page was swapped out? */ struct page *ksm_might_need_to_copy(struct page *page, - struct vm_area_struct *vma, unsigned long address); + struct vm_area_struct *vma, unsigned long addr); void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc); void folio_migrate_ksm(struct folio *newfolio, struct folio *folio); @@ -141,7 +141,7 @@ static inline int ksm_madvise(struct vm_area_struct *vma, unsigned long start, } static inline struct page *ksm_might_need_to_copy(struct page *page, - struct vm_area_struct *vma, unsigned long address) + struct vm_area_struct *vma, unsigned long addr) { return page; } diff --git a/mm/ksm.c b/mm/ksm.c index 2e4cd681622de..fe9296bd85cde 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -2788,48 +2788,51 @@ void __ksm_exit(struct mm_struct *mm) } struct page *ksm_might_need_to_copy(struct page *page, - struct vm_area_struct *vma, unsigned long address) + struct vm_area_struct *vma, unsigned long addr) { struct folio *folio = page_folio(page); struct anon_vma *anon_vma = folio_anon_vma(folio); - struct page *new_page; + struct folio *new_folio; - if (PageKsm(page)) { - if (page_stable_node(page) && + if (folio_test_large(folio)) + return page; + + if (folio_test_ksm(folio)) { + if (folio_stable_node(folio) && !(ksm_run & KSM_RUN_UNMERGE)) return page; /* no need to copy it */ } else if (!anon_vma) { return page; /* no need to copy it */ - } else if (page->index == linear_page_index(vma, address) && + } else if (folio->index == linear_page_index(vma, addr) && anon_vma->root == vma->anon_vma->root) { return page; /* still no need to copy it */ } if (PageHWPoison(page)) return ERR_PTR(-EHWPOISON); - if (!PageUptodate(page)) + if (!folio_test_uptodate(folio)) return page; /* let do_swap_page report the error */ - new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address); - if (new_page && - mem_cgroup_charge(page_folio(new_page), vma->vm_mm, GFP_KERNEL)) { - put_page(new_page); - new_page = NULL; + new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, addr, false); + if (new_folio && + mem_cgroup_charge(new_folio, vma->vm_mm, GFP_KERNEL)) { + folio_put(new_folio); + new_folio = NULL; } - if (new_page) { - if (copy_mc_user_highpage(new_page, page, address, vma)) { - put_page(new_page); + if (new_folio) { + if (copy_mc_user_highpage(&new_folio->page, page, addr, vma)) { + folio_put(new_folio); memory_failure_queue(page_to_pfn(page), 0); return ERR_PTR(-EHWPOISON); } - SetPageDirty(new_page); - __SetPageUptodate(new_page); - __SetPageLocked(new_page); + folio_set_dirty(new_folio); + __folio_mark_uptodate(new_folio); + __folio_set_locked(new_folio); #ifdef CONFIG_SWAP count_vm_event(KSM_SWPIN_COPY); #endif } - return new_page; + return new_folio ? &new_folio->page : NULL; } void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc) From 3ebea11492d7c2fa499d294da2264a3c55d0393b Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Sat, 18 Nov 2023 10:32:29 +0800 Subject: [PATCH 146/548] mm: memory: use a folio in validate_page_before_insert() Use a folio in validate_page_before_insert() to save two compound_head() calls. Link: https://lkml.kernel.org/r/20231118023232.1409103-3-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Reviewed-by: Sidhartha Kumar Cc: David Hildenbrand Cc: Matthew Wilcox (Oracle) Cc: Vishal Moola (Oracle) Signed-off-by: Andrew Morton (cherry picked from commit f8b6187d8dd98fd32fe393071f362a7b6beaad0a) Signed-off-by: Koba Ko --- mm/memory.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 1426328f910fd..6dc1fa3d3fdd2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1824,9 +1824,12 @@ pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr, static int validate_page_before_insert(struct page *page) { - if (PageAnon(page) || PageSlab(page) || page_has_type(page)) + struct folio *folio = page_folio(page); + + if (folio_test_anon(folio) || folio_test_slab(folio) || + page_has_type(page)) return -EINVAL; - flush_dcache_page(page); + flush_dcache_folio(folio); return 0; } From e99e98596e2c126001d930d2fdf609fc6790d015 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Sat, 18 Nov 2023 10:32:30 +0800 Subject: [PATCH 147/548] mm: memory: rename page_copy_prealloc() to folio_prealloc() Let's rename page_copy_prealloc() to folio_prealloc(), which could be reused in more functons, as it maybe zero the new page, pass a new need_zero to it, and call the vma_alloc_zeroed_movable_folio() if need_zero is true. Link: https://lkml.kernel.org/r/20231118023232.1409103-4-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Reviewed-by: Sidhartha Kumar Reviewed-by: Vishal Moola (Oracle) Cc: David Hildenbrand Cc: Matthew Wilcox (Oracle) Signed-off-by: Andrew Morton (cherry picked from commit 294de6d8f14a69f1251b94223ba9d90d64b28cec) Signed-off-by: Koba Ko --- mm/memory.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 6dc1fa3d3fdd2..e0ab6ea1c2bd4 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -979,12 +979,17 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, return 0; } -static inline struct folio *page_copy_prealloc(struct mm_struct *src_mm, - struct vm_area_struct *vma, unsigned long addr) +static inline struct folio *folio_prealloc(struct mm_struct *src_mm, + struct vm_area_struct *vma, unsigned long addr, bool need_zero) { struct folio *new_folio; - new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, addr, false); + if (need_zero) + new_folio = vma_alloc_zeroed_movable_folio(vma, addr); + else + new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, + addr, false); + if (!new_folio) return NULL; @@ -1116,7 +1121,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, } else if (ret == -EBUSY) { goto out; } else if (ret == -EAGAIN) { - prealloc = page_copy_prealloc(src_mm, src_vma, addr); + prealloc = folio_prealloc(src_mm, src_vma, addr, false); if (!prealloc) return -ENOMEM; } else if (ret) { From e927ca04693aeaa2210cf277fd49ff387dcb5f5f Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Sat, 18 Nov 2023 10:32:31 +0800 Subject: [PATCH 148/548] mm: memory: use a folio in do_cow_fault() Use folio_prealloc() helper and convert to use a folio in do_cow_fault(), which save five compound_head() calls. Link: https://lkml.kernel.org/r/20231118023232.1409103-5-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Reviewed-by: Vishal Moola (Oracle) Cc: David Hildenbrand Cc: Matthew Wilcox (Oracle) Cc: Sidhartha Kumar Signed-off-by: Andrew Morton (cherry picked from commit e4621e70469c3ac6e1b6914f1c42941a8a6e44d2) Signed-off-by: Koba Ko --- mm/memory.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index e0ab6ea1c2bd4..9fcd1bcc06726 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4756,6 +4756,7 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf) static vm_fault_t do_cow_fault(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; + struct folio *folio; vm_fault_t ret; ret = vmf_can_call_fault(vmf); @@ -4764,16 +4765,11 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) if (ret) return ret; - vmf->cow_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address); - if (!vmf->cow_page) + folio = folio_prealloc(vma->vm_mm, vma, vmf->address, false); + if (!folio) return VM_FAULT_OOM; - if (mem_cgroup_charge(page_folio(vmf->cow_page), vma->vm_mm, - GFP_KERNEL)) { - put_page(vmf->cow_page); - return VM_FAULT_OOM; - } - folio_throttle_swaprate(page_folio(vmf->cow_page), GFP_KERNEL); + vmf->cow_page = &folio->page; ret = __do_fault(vmf); if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY))) @@ -4782,7 +4778,7 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) return ret; copy_user_highpage(vmf->cow_page, vmf->page, vmf->address, vma); - __SetPageUptodate(vmf->cow_page); + __folio_mark_uptodate(folio); ret |= finish_fault(vmf); unlock_page(vmf->page); @@ -4791,7 +4787,7 @@ static vm_fault_t do_cow_fault(struct vm_fault *vmf) goto uncharge_out; return ret; uncharge_out: - put_page(vmf->cow_page); + folio_put(folio); return ret; } From eb21cec379af642c0caca67e93751b26e4810f83 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Sat, 18 Nov 2023 10:32:32 +0800 Subject: [PATCH 149/548] mm: memory: use folio_prealloc() in wp_page_copy() Use folio_prealloc() helper to simplify code a bit. Link: https://lkml.kernel.org/r/20231118023232.1409103-6-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang Cc: David Hildenbrand Cc: Matthew Wilcox (Oracle) Cc: Sidhartha Kumar Cc: Vishal Moola (Oracle) Signed-off-by: Andrew Morton (cherry picked from commit cf503cc665c442ce9893cb12561c57a328465e29) Signed-off-by: Koba Ko --- mm/memory.c | 22 +++++++--------------- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 9fcd1bcc06726..421c69adb04f9 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3115,6 +3115,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) int page_copied = 0; struct mmu_notifier_range range; vm_fault_t ret; + bool pfn_is_zero; delayacct_wpcopy_start(); @@ -3124,16 +3125,13 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) if (unlikely(ret)) goto out; - if (is_zero_pfn(pte_pfn(vmf->orig_pte))) { - new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address); - if (!new_folio) - goto oom; - } else { + pfn_is_zero = is_zero_pfn(pte_pfn(vmf->orig_pte)); + new_folio = folio_prealloc(mm, vma, vmf->address, pfn_is_zero); + if (!new_folio) + goto oom; + + if (!pfn_is_zero) { int err; - new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, - vmf->address, false); - if (!new_folio) - goto oom; err = __wp_page_copy_user(&new_folio->page, vmf->page, vmf); if (err) { @@ -3154,10 +3152,6 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) kmsan_copy_page_meta(&new_folio->page, vmf->page); } - if (mem_cgroup_charge(new_folio, mm, GFP_KERNEL)) - goto oom_free_new; - folio_throttle_swaprate(new_folio, GFP_KERNEL); - __folio_mark_uptodate(new_folio); mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, @@ -3256,8 +3250,6 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) delayacct_wpcopy_end(); return 0; -oom_free_new: - folio_put(new_folio); oom: ret = VM_FAULT_OOM; out: From 1cbae6f2c97e9fb3024a32d7b1fff777faed6dd7 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 5 Oct 2023 15:07:30 +0100 Subject: [PATCH 150/548] arm64/mm: Hoist synchronization out of set_ptes() loop set_ptes() sets a physically contiguous block of memory (which all belongs to the same folio) to a contiguous block of ptes. The arm64 implementation of this previously just looped, operating on each individual pte. But the __sync_icache_dcache() and mte_sync_tags() operations can both be hoisted out of the loop so that they are performed once for the contiguous set of pages (which may be less than the whole folio). This should result in minor performance gains. __sync_icache_dcache() already acts on the whole folio, and sets a flag in the folio so that it skips duplicate calls. But by hoisting the call, all the pte testing is done only once. mte_sync_tags() operates on each individual page with its own loop. But by passing the number of pages explicitly, we can rely solely on its loop and do the checks only once. This approach also makes it robust for the future, rather than assuming if a head page of a compound page is being mapped, then the whole compound page is being mapped, instead we explicitly know how many pages are being mapped. The old assumption may not continue to hold once the "anonymous large folios" feature is merged. Signed-off-by: Ryan Roberts Reviewed-by: Steven Price Link: https://lore.kernel.org/r/20231005140730.2191134-1-ryan.roberts@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit 3425cec42c3ce0f65fe74e412756b567b152e61d) Signed-off-by: Koba Ko --- arch/arm64/include/asm/mte.h | 4 ++-- arch/arm64/include/asm/pgtable.h | 27 +++++++++++++++++---------- arch/arm64/kernel/mte.c | 4 ++-- 3 files changed, 21 insertions(+), 14 deletions(-) diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index 4cedbaa16f419..91fbd5c8a3911 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -90,7 +90,7 @@ static inline bool try_page_mte_tagging(struct page *page) } void mte_zero_clear_page_tags(void *addr); -void mte_sync_tags(pte_t pte); +void mte_sync_tags(pte_t pte, unsigned int nr_pages); void mte_copy_page_tags(void *kto, const void *kfrom); void mte_thread_init_user(void); void mte_thread_switch(struct task_struct *next); @@ -122,7 +122,7 @@ static inline bool try_page_mte_tagging(struct page *page) static inline void mte_zero_clear_page_tags(void *addr) { } -static inline void mte_sync_tags(pte_t pte) +static inline void mte_sync_tags(pte_t pte, unsigned int nr_pages) { } static inline void mte_copy_page_tags(void *kto, const void *kfrom) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 0212129b13d07..4724a8fbe4803 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -325,8 +325,7 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep, __func__, pte_val(old_pte), pte_val(pte)); } -static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pte) +static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages) { if (pte_present(pte) && pte_user_exec(pte) && !pte_special(pte)) __sync_icache_dcache(pte); @@ -339,20 +338,18 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, */ if (system_supports_mte() && pte_access_permitted(pte, false) && !pte_special(pte) && pte_tagged(pte)) - mte_sync_tags(pte); - - __check_safe_pte_update(mm, ptep, pte); - - set_pte(ptep, pte); + mte_sync_tags(pte, nr_pages); } static inline void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr) { page_table_check_ptes_set(mm, ptep, pte, nr); + __sync_cache_and_tags(pte, nr); for (;;) { - __set_pte_at(mm, addr, ptep, pte); + __check_safe_pte_update(mm, ptep, pte); + set_pte(ptep, pte); if (--nr == 0) break; ptep++; @@ -531,18 +528,28 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd) #define pud_pfn(pud) ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT) #define pfn_pud(pfn,prot) __pud(__phys_to_pud_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)) +static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr) +{ + __sync_cache_and_tags(pte, nr); + __check_safe_pte_update(mm, ptep, pte); + set_pte(ptep, pte); +} + static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) { page_table_check_pmd_set(mm, pmdp, pmd); - return __set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd)); + return __set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd), + PMD_SIZE >> PAGE_SHIFT); } static inline void set_pud_at(struct mm_struct *mm, unsigned long addr, pud_t *pudp, pud_t pud) { page_table_check_pud_set(mm, pudp, pud); - return __set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud)); + return __set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud), + PUD_SIZE >> PAGE_SHIFT); } #define __p4d_to_phys(p4d) __pte_to_phys(p4d_pte(p4d)) diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 4edecaac8f919..2fb5e7a7a4d5e 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -35,10 +35,10 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode); EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode); #endif -void mte_sync_tags(pte_t pte) +void mte_sync_tags(pte_t pte, unsigned int nr_pages) { struct page *page = pte_page(pte); - long i, nr_pages = compound_nr(page); + unsigned int i; /* if PG_mte_tagged is set, tags have already been initialised */ for (i = 0; i < nr_pages; i++, page++) { From 6998dff1dae0595bc4465b3f9f7077317fbfa679 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:06 +0000 Subject: [PATCH 151/548] mm: convert ksm_might_need_to_copy() to work on folios Patch series "Finish two folio conversions". Most callers of page_add_new_anon_rmap() and lru_cache_add_inactive_or_unevictable() have been converted to their folio equivalents, but there are still a few stragglers. There's a bit of preparatory work in ksm and unuse_pte(), but after that it's pretty mechanical. This patch (of 9): Accept a folio as an argument and return a folio result. Removes a call to compound_head() in do_swap_page(), and prevents folio & page from getting out of sync in unuse_pte(). Reviewed-by: David Hildenbrand [willy@infradead.org: fix smatch warning] Link: https://lkml.kernel.org/r/ZXnPtblC6A1IkyAB@casper.infradead.org [david@redhat.com: only adjust the page if the folio changed] Link: https://lkml.kernel.org/r/6a8f2110-fa91-4c10-9eae-88315309a6e3@redhat.com Link: https://lkml.kernel.org/r/20231211162214.2146080-1-willy@infradead.org Link: https://lkml.kernel.org/r/20231211162214.2146080-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 96db66d9c8f3c1547325af01b1f328b85d6ee1b9) Signed-off-by: Koba Ko --- include/linux/ksm.h | 6 +++--- mm/ksm.c | 21 +++++++++++---------- mm/memory.c | 11 +++++++---- mm/swapfile.c | 8 +++++--- 4 files changed, 26 insertions(+), 20 deletions(-) diff --git a/include/linux/ksm.h b/include/linux/ksm.h index 32ecea266fe62..f701b57fc64bf 100644 --- a/include/linux/ksm.h +++ b/include/linux/ksm.h @@ -87,7 +87,7 @@ static inline void ksm_exit(struct mm_struct *mm) * We'd like to make this conditional on vma->vm_flags & VM_MERGEABLE, * but what if the vma was unmerged while the page was swapped out? */ -struct page *ksm_might_need_to_copy(struct page *page, +struct folio *ksm_might_need_to_copy(struct folio *folio, struct vm_area_struct *vma, unsigned long addr); void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc); @@ -140,10 +140,10 @@ static inline int ksm_madvise(struct vm_area_struct *vma, unsigned long start, return 0; } -static inline struct page *ksm_might_need_to_copy(struct page *page, +static inline struct folio *ksm_might_need_to_copy(struct folio *folio, struct vm_area_struct *vma, unsigned long addr) { - return page; + return folio; } static inline void rmap_walk_ksm(struct folio *folio, diff --git a/mm/ksm.c b/mm/ksm.c index fe9296bd85cde..5fbe58d863c77 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -2787,30 +2787,30 @@ void __ksm_exit(struct mm_struct *mm) trace_ksm_exit(mm); } -struct page *ksm_might_need_to_copy(struct page *page, +struct folio *ksm_might_need_to_copy(struct folio *folio, struct vm_area_struct *vma, unsigned long addr) { - struct folio *folio = page_folio(page); + struct page *page = folio_page(folio, 0); struct anon_vma *anon_vma = folio_anon_vma(folio); struct folio *new_folio; if (folio_test_large(folio)) - return page; + return folio; if (folio_test_ksm(folio)) { if (folio_stable_node(folio) && !(ksm_run & KSM_RUN_UNMERGE)) - return page; /* no need to copy it */ + return folio; /* no need to copy it */ } else if (!anon_vma) { - return page; /* no need to copy it */ + return folio; /* no need to copy it */ } else if (folio->index == linear_page_index(vma, addr) && anon_vma->root == vma->anon_vma->root) { - return page; /* still no need to copy it */ + return folio; /* still no need to copy it */ } if (PageHWPoison(page)) return ERR_PTR(-EHWPOISON); if (!folio_test_uptodate(folio)) - return page; /* let do_swap_page report the error */ + return folio; /* let do_swap_page report the error */ new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, addr, false); if (new_folio && @@ -2819,9 +2819,10 @@ struct page *ksm_might_need_to_copy(struct page *page, new_folio = NULL; } if (new_folio) { - if (copy_mc_user_highpage(&new_folio->page, page, addr, vma)) { + if (copy_mc_user_highpage(folio_page(new_folio, 0), page, + addr, vma)) { folio_put(new_folio); - memory_failure_queue(page_to_pfn(page), 0); + memory_failure_queue(folio_pfn(folio), 0); return ERR_PTR(-EHWPOISON); } folio_set_dirty(new_folio); @@ -2832,7 +2833,7 @@ struct page *ksm_might_need_to_copy(struct page *page, #endif } - return new_folio ? &new_folio->page : NULL; + return new_folio; } void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc) diff --git a/mm/memory.c b/mm/memory.c index 421c69adb04f9..b54656e4c75c4 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3937,15 +3937,18 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) * page->index of !PageKSM() pages would be nonlinear inside the * anon VMA -- PageKSM() is lost on actual swapout. */ - page = ksm_might_need_to_copy(page, vma, vmf->address); - if (unlikely(!page)) { + folio = ksm_might_need_to_copy(folio, vma, vmf->address); + if (unlikely(!folio)) { ret = VM_FAULT_OOM; + folio = swapcache; goto out_page; - } else if (unlikely(PTR_ERR(page) == -EHWPOISON)) { + } else if (unlikely(folio == ERR_PTR(-EHWPOISON))) { ret = VM_FAULT_HWPOISON; + folio = swapcache; goto out_page; } - folio = page_folio(page); + if (folio != swapcache) + page = folio_page(folio, 0); /* * If we want to map a page that's in the swapcache writable, we diff --git a/mm/swapfile.c b/mm/swapfile.c index c856d6bb2daf3..bd450abe309ed 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1760,11 +1760,13 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, int ret = 1; swapcache = page; - page = ksm_might_need_to_copy(page, vma, addr); - if (unlikely(!page)) + folio = ksm_might_need_to_copy(folio, vma, addr); + if (unlikely(!folio)) return -ENOMEM; - else if (unlikely(PTR_ERR(page) == -EHWPOISON)) + else if (unlikely(folio == ERR_PTR(-EHWPOISON))) hwpoisoned = true; + else + page = folio_file_page(folio, swp_offset(entry)); pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); if (unlikely(!pte || !pte_same_as_swp(ptep_get(pte), From aafd1376b67c0220796590f3e5069f674acf01a6 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Tue, 12 Dec 2023 16:48:13 +0000 Subject: [PATCH 152/548] mm: remove PageAnonExclusive assertions in unuse_pte() The page in question is either freshly allocated or known to be in the swap cache; these assertions are not particularly useful. Link: https://lkml.kernel.org/r/20231212164813.2540119-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 8d294a8c6393afbde59cf14a0e8413df4b206698) Signed-off-by: Koba Ko --- mm/swapfile.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index bd450abe309ed..b7a9512a8eb4b 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1799,10 +1799,6 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, */ arch_swap_restore(entry, page_folio(page)); - /* See do_swap_page() */ - BUG_ON(!PageAnon(page) && PageMappedToDisk(page)); - BUG_ON(PageAnon(page) && PageAnonExclusive(page)); - dec_mm_counter(vma->vm_mm, MM_SWAPENTS); inc_mm_counter(vma->vm_mm, MM_ANONPAGES); get_page(page); From ddca2238c1ced4efea35d917d3373f64674d1347 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:08 +0000 Subject: [PATCH 153/548] mm: convert unuse_pte() to use a folio throughout Saves about eight calls to compound_head(). Link: https://lkml.kernel.org/r/20231211162214.2146080-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: Andrew Morton (cherry picked from commit f00f48436c789af659047d3c5d6f6d17e640634e) Signed-off-by: Koba Ko --- mm/swapfile.c | 47 +++++++++++++++++++++++++---------------------- 1 file changed, 25 insertions(+), 22 deletions(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index b7a9512a8eb4b..d67c20144dc87 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1752,21 +1752,25 @@ static inline int pte_same_as_swp(pte_t pte, pte_t swp_pte) static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, swp_entry_t entry, struct folio *folio) { - struct page *page = folio_file_page(folio, swp_offset(entry)); - struct page *swapcache; + struct page *page; + struct folio *swapcache; spinlock_t *ptl; pte_t *pte, new_pte, old_pte; - bool hwpoisoned = PageHWPoison(page); + bool hwpoisoned = false; int ret = 1; - swapcache = page; + swapcache = folio; folio = ksm_might_need_to_copy(folio, vma, addr); if (unlikely(!folio)) return -ENOMEM; - else if (unlikely(folio == ERR_PTR(-EHWPOISON))) + else if (unlikely(folio == ERR_PTR(-EHWPOISON))) { + hwpoisoned = true; + folio = swapcache; + } + + page = folio_file_page(folio, swp_offset(entry)); + if (PageHWPoison(page)) hwpoisoned = true; - else - page = folio_file_page(folio, swp_offset(entry)); pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); if (unlikely(!pte || !pte_same_as_swp(ptep_get(pte), @@ -1777,13 +1781,12 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, old_pte = ptep_get(pte); - if (unlikely(hwpoisoned || !PageUptodate(page))) { + if (unlikely(hwpoisoned || !folio_test_uptodate(folio))) { swp_entry_t swp_entry; dec_mm_counter(vma->vm_mm, MM_SWAPENTS); if (hwpoisoned) { - swp_entry = make_hwpoison_entry(swapcache); - page = swapcache; + swp_entry = make_hwpoison_entry(page); } else { swp_entry = make_poisoned_swp_entry(); } @@ -1797,27 +1800,27 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, * when reading from swap. This metadata may be indexed by swap entry * so this must be called before swap_free(). */ - arch_swap_restore(entry, page_folio(page)); + arch_swap_restore(entry, folio); dec_mm_counter(vma->vm_mm, MM_SWAPENTS); inc_mm_counter(vma->vm_mm, MM_ANONPAGES); - get_page(page); - if (page == swapcache) { + folio_get(folio); + if (folio == swapcache) { rmap_t rmap_flags = RMAP_NONE; /* - * See do_swap_page(): PageWriteback() would be problematic. - * However, we do a wait_on_page_writeback() just before this - * call and have the page locked. + * See do_swap_page(): writeback would be problematic. + * However, we do a folio_wait_writeback() just before this + * call and have the folio locked. */ - VM_BUG_ON_PAGE(PageWriteback(page), page); + VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); if (pte_swp_exclusive(old_pte)) rmap_flags |= RMAP_EXCLUSIVE; page_add_anon_rmap(page, vma, addr, rmap_flags); } else { /* ksm created a completely new copy */ - page_add_new_anon_rmap(page, vma, addr); - lru_cache_add_inactive_or_unevictable(page, vma); + folio_add_new_anon_rmap(folio, vma, addr); + folio_add_lru_vma(folio, vma); } new_pte = pte_mkold(mk_pte(page, vma->vm_page_prot)); if (pte_swp_soft_dirty(old_pte)) @@ -1830,9 +1833,9 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, out: if (pte) pte_unmap_unlock(pte, ptl); - if (page != swapcache) { - unlock_page(page); - put_page(page); + if (folio != swapcache) { + folio_unlock(folio); + folio_put(folio); } return ret; } From e63a2fc6d1e1580ea0d609fbce67d537e7e007ad Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:09 +0000 Subject: [PATCH 154/548] mm: remove some calls to page_add_new_anon_rmap() We already have the folio in these functions, we just need to use it. folio_add_new_anon_rmap() didn't exist at the time they were converted to folios. Link: https://lkml.kernel.org/r/20231211162214.2146080-5-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 2853b66b601a265306be709b4d86aaff7d92a0fc) Signed-off-by: Koba Ko --- kernel/events/uprobes.c | 2 +- mm/memory.c | 2 +- mm/userfaultfd.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index a554f43d3ceb9..5fa4fc7c84485 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -192,7 +192,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, if (new_page) { folio_get(new_folio); - page_add_new_anon_rmap(new_page, vma, addr); + folio_add_new_anon_rmap(new_folio, vma, addr); folio_add_lru_vma(new_folio, vma); } else /* no new page, just dec_mm_counter for old_page */ diff --git a/mm/memory.c b/mm/memory.c index b54656e4c75c4..20640b2eb3f42 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4066,7 +4066,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) /* ksm created a completely new copy */ if (unlikely(folio != swapcache && swapcache)) { - page_add_new_anon_rmap(page, vma, vmf->address); + folio_add_new_anon_rmap(folio, vma, vmf->address); folio_add_lru_vma(folio, vma); } else { page_add_anon_rmap(page, vma, vmf->address, rmap_flags); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 92fe2a76f4b51..ffef13f97eddf 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -116,7 +116,7 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd, folio_add_lru(folio); page_add_file_rmap(page, dst_vma, false); } else { - page_add_new_anon_rmap(page, dst_vma, dst_addr); + folio_add_new_anon_rmap(folio, dst_vma, dst_addr); folio_add_lru_vma(folio, dst_vma); } From 8f7c5d1e07a46451c78389c8c7c9f7dde393b864 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:10 +0000 Subject: [PATCH 155/548] mm: remove stale example from comment folio_add_new_anon_rmap() no longer works this way, so just remove the entire example. Link: https://lkml.kernel.org/r/20231211162214.2146080-6-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Cc: Ralph Campbell Signed-off-by: Andrew Morton (cherry picked from commit b2926ac8178bf5c88ada4285f413f56c1cafc592) Signed-off-by: Koba Ko --- mm/memremap.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/mm/memremap.c b/mm/memremap.c index bee85560a2434..19ed6855f96ff 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -485,21 +485,11 @@ void free_zone_device_page(struct page *page) __ClearPageAnonExclusive(page); /* - * When a device managed page is freed, the page->mapping field + * When a device managed page is freed, the folio->mapping field * may still contain a (stale) mapping value. For example, the - * lower bits of page->mapping may still identify the page as an - * anonymous page. Ultimately, this entire field is just stale - * and wrong, and it will cause errors if not cleared. One - * example is: - * - * migrate_vma_pages() - * migrate_vma_insert_page() - * page_add_new_anon_rmap() - * __page_set_anon_rmap() - * ...checks page->mapping, via PageAnon(page) call, - * and incorrectly concludes that the page is an - * anonymous page. Therefore, it incorrectly, - * silently fails to set up the new anon rmap. + * lower bits of folio->mapping may still identify the folio as an + * anonymous folio. Ultimately, this entire field is just stale + * and wrong, and it will cause errors if not cleared. * * For other types of ZONE_DEVICE pages, migration is either * handled differently or not done at all, so there is no need From 47f88904400d7853977799b94635ad24e6169816 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:11 +0000 Subject: [PATCH 156/548] mm: remove references to page_add_new_anon_rmap in comments Refer to folio_add_new_anon_rmap() instead. Link: https://lkml.kernel.org/r/20231211162214.2146080-7-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit cb9089babc91f7ffc785d51a0fa567365b0e7751) Signed-off-by: Koba Ko --- mm/rmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index c166b99ce2446..bf4bf8f8e660f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1201,9 +1201,9 @@ static void __page_check_anon_rmap(struct folio *folio, struct page *page, * We have exclusion against page_add_anon_rmap because the caller * always holds the page locked. * - * We have exclusion against page_add_new_anon_rmap because those pages + * We have exclusion against folio_add_new_anon_rmap because those pages * are initially only visible via the pagetables, and the pte is locked - * over the call to page_add_new_anon_rmap. + * over the call to folio_add_new_anon_rmap. */ VM_BUG_ON_FOLIO(folio_anon_vma(folio)->root != vma->anon_vma->root, folio); From df230b358b55ae309b0c7e0fe6f3dc611a0e2bcf Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:12 +0000 Subject: [PATCH 157/548] mm: convert migrate_vma_insert_page() to use a folio Replaces five calls to compound_head() with one. Link: https://lkml.kernel.org/r/20231211162214.2146080-8-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Reviewed-by: Alistair Popple Signed-off-by: Andrew Morton (cherry picked from commit d3b082736518562f4eed185e1a67f28d20635fef) Signed-off-by: Koba Ko --- mm/migrate_device.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 8ac1f79f754a2..81193363f8cd5 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -564,6 +564,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, struct page *page, unsigned long *src) { + struct folio *folio = page_folio(page); struct vm_area_struct *vma = migrate->vma; struct mm_struct *mm = vma->vm_mm; bool flush = false; @@ -596,17 +597,17 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, goto abort; if (unlikely(anon_vma_prepare(vma))) goto abort; - if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL)) + if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL)) goto abort; /* - * The memory barrier inside __SetPageUptodate makes sure that - * preceding stores to the page contents become visible before + * The memory barrier inside __folio_mark_uptodate makes sure that + * preceding stores to the folio contents become visible before * the set_pte_at() write. */ - __SetPageUptodate(page); + __folio_mark_uptodate(folio); - if (is_device_private_page(page)) { + if (folio_is_device_private(folio)) { swp_entry_t swp_entry; if (vma->vm_flags & VM_WRITE) @@ -617,8 +618,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, page_to_pfn(page)); entry = swp_entry_to_pte(swp_entry); } else { - if (is_zone_device_page(page) && - !is_device_coherent_page(page)) { + if (folio_is_zone_device(folio) && + !folio_is_device_coherent(folio)) { pr_warn_once("Unsupported ZONE_DEVICE page type.\n"); goto abort; } @@ -652,10 +653,10 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, goto unlock_abort; inc_mm_counter(mm, MM_ANONPAGES); - page_add_new_anon_rmap(page, vma, addr); - if (!is_zone_device_page(page)) - lru_cache_add_inactive_or_unevictable(page, vma); - get_page(page); + folio_add_new_anon_rmap(folio, vma, addr); + if (!folio_is_zone_device(folio)) + folio_add_lru_vma(folio, vma); + folio_get(folio); if (flush) { flush_cache_page(vma, addr, pte_pfn(orig_pte)); From eb92c66f117042f84b8168a962fe7a54f81f55ed Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:13 +0000 Subject: [PATCH 158/548] mm: convert collapse_huge_page() to use a folio Replace three calls to compound_head() with one. Link: https://lkml.kernel.org/r/20231211162214.2146080-9-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 5432726848bb27a01badcbc93b596f39ee6c5ffb) Signed-off-by: Koba Ko --- mm/khugepaged.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index b9f2af46b52ca..b21b5c15b43d5 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1199,6 +1199,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, if (unlikely(result != SCAN_SUCCEED)) goto out_up_write; + folio = page_folio(hpage); /* * The smp_wmb() inside __folio_mark_uptodate() ensures the * copy_huge_page writes become visible before the set_pmd_at() From 2f4e18d3bc45b034ae22451f117c179614a0e180 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Mon, 11 Dec 2023 16:22:14 +0000 Subject: [PATCH 159/548] mm: remove page_add_new_anon_rmap and lru_cache_add_inactive_or_unevictable All callers have now been converted to folio_add_new_anon_rmap() and folio_add_lru_vma() so we can remove the wrapper. Link: https://lkml.kernel.org/r/20231211162214.2146080-10-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit cafa8e37a2ebd344ae0774324c21f46640bbaab3) Signed-off-by: Koba Ko --- include/linux/rmap.h | 2 -- include/linux/swap.h | 3 --- mm/folio-compat.c | 16 ---------------- 3 files changed, 21 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index a1b81fa88b4c6..dda60b8c19ef2 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -230,8 +230,6 @@ static inline void __folio_rmap_sanity_checks(struct folio *folio, void folio_move_anon_rmap(struct folio *, struct vm_area_struct *); void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); -void page_add_new_anon_rmap(struct page *, struct vm_area_struct *, - unsigned long address); void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); void page_add_file_rmap(struct page *, struct vm_area_struct *, diff --git a/include/linux/swap.h b/include/linux/swap.h index cb25db2a93dd1..0201dd8c49e71 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -396,9 +396,6 @@ void folio_deactivate(struct folio *folio); void folio_mark_lazyfree(struct folio *folio); extern void swap_setup(void); -extern void lru_cache_add_inactive_or_unevictable(struct page *page, - struct vm_area_struct *vma); - /* linux/mm/vmscan.c */ extern unsigned long zone_reclaimable_pages(struct zone *zone); extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order, diff --git a/mm/folio-compat.c b/mm/folio-compat.c index 10c3247542cbe..a546271db69b5 100644 --- a/mm/folio-compat.c +++ b/mm/folio-compat.c @@ -77,12 +77,6 @@ bool redirty_page_for_writepage(struct writeback_control *wbc, } EXPORT_SYMBOL(redirty_page_for_writepage); -void lru_cache_add_inactive_or_unevictable(struct page *page, - struct vm_area_struct *vma) -{ - folio_add_lru_vma(page_folio(page), vma); -} - int add_to_page_cache_lru(struct page *page, struct address_space *mapping, pgoff_t index, gfp_t gfp) { @@ -122,13 +116,3 @@ void putback_lru_page(struct page *page) { folio_putback_lru(page_folio(page)); } - -#ifdef CONFIG_MMU -void page_add_new_anon_rmap(struct page *page, struct vm_area_struct *vma, - unsigned long address) -{ - VM_BUG_ON_PAGE(PageTail(page), page); - - return folio_add_new_anon_rmap((struct folio *)page, vma, address); -} -#endif From 2a1a15bb6e8adde899b5df749813c49ef6cf2b84 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:32 +0100 Subject: [PATCH 160/548] mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]() Let's convert insert_page_into_pte_locked() and do_set_pmd(). While at it, perform some folio conversion. Link: https://lkml.kernel.org/r/20231220224504.646757-9-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit ef37b2ea08ace7b5fbcd569d703be1903afd12f9) Signed-off-by: Koba Ko --- mm/memory.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 20640b2eb3f42..ebcf63dd0599a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1841,12 +1841,14 @@ static int validate_page_before_insert(struct page *page) static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte, unsigned long addr, struct page *page, pgprot_t prot) { + struct folio *folio = page_folio(page); + if (!pte_none(ptep_get(pte))) return -EBUSY; /* Ok, finally just insert the thing.. */ - get_page(page); + folio_get(folio); inc_mm_counter(vma->vm_mm, mm_counter_file(page)); - page_add_file_rmap(page, vma, false); + folio_add_file_rmap_pte(folio, page, vma); set_pte_at(vma->vm_mm, addr, pte, mk_pte(page, prot)); return 0; } @@ -4410,6 +4412,7 @@ static void deposit_prealloc_pte(struct vm_fault *vmf) vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) { + struct folio *folio = page_folio(page); struct vm_area_struct *vma = vmf->vma; bool write = vmf->flags & FAULT_FLAG_WRITE; unsigned long haddr = vmf->address & HPAGE_PMD_MASK; @@ -4428,8 +4431,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER)) return ret; - page = compound_head(page); - if (compound_order(page) != HPAGE_PMD_ORDER) + if (page != &folio->page || folio_order(folio) != HPAGE_PMD_ORDER) return ret; /* @@ -4438,7 +4440,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) * check. This kind of THP just can be PTE mapped. Access to * the corrupted subpage should trigger SIGBUS as expected. */ - if (unlikely(PageHasHWPoisoned(page))) + if (unlikely(folio_test_has_hwpoisoned(folio))) return ret; /* @@ -4462,7 +4464,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); add_mm_counter(vma->vm_mm, mm_counter_file(page), HPAGE_PMD_NR); - page_add_file_rmap(page, vma, true); + folio_add_file_rmap_pmd(folio, page, vma); /* * deposit and withdraw with pmd lock held From af356b5a2d073d89e7290ab392f2ebd146dc66e5 Mon Sep 17 00:00:00 2001 From: "Vishal Moola (Oracle)" Date: Fri, 20 Oct 2023 11:33:31 -0700 Subject: [PATCH 161/548] mm/khugepaged: convert collapse_pte_mapped_thp() to use folios This removes 2 calls to compound_head() and helps convert khugepaged to use folios throughout. Previously, if the address passed to collapse_pte_mapped_thp() corresponded to a tail page, the scan would fail immediately. Using filemap_lock_folio() we get the corresponding folio back and try to operate on the folio instead. Link: https://lkml.kernel.org/r/20231020183331.10770-6-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) Reviewed-by: Rik van Riel Reviewed-by: Yang Shi Cc: Kefeng Wang Cc: Matthew Wilcox (Oracle) Signed-off-by: Andrew Morton (cherry picked from commit 98b32d296d95d7aa0516c36b72406277412268cd) Signed-off-by: Koba Ko --- mm/khugepaged.c | 45 ++++++++++++++++++++------------------------- 1 file changed, 20 insertions(+), 25 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index b21b5c15b43d5..40b84ba39e747 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1468,7 +1468,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, bool notified = false; unsigned long haddr = addr & HPAGE_PMD_MASK; struct vm_area_struct *vma = vma_lookup(mm, haddr); - struct page *hpage; + struct folio *folio; pte_t *start_pte, *pte; pmd_t *pmd, pgt_pmd; spinlock_t *pml = NULL, *ptl; @@ -1502,19 +1502,14 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, if (userfaultfd_wp(vma)) return SCAN_PTE_UFFD_WP; - hpage = find_lock_page(vma->vm_file->f_mapping, + folio = filemap_lock_folio(vma->vm_file->f_mapping, linear_page_index(vma, haddr)); - if (!hpage) + if (IS_ERR(folio)) return SCAN_PAGE_NULL; - if (!PageHead(hpage)) { - result = SCAN_FAIL; - goto drop_hpage; - } - - if (compound_order(hpage) != HPAGE_PMD_ORDER) { + if (folio_order(folio) != HPAGE_PMD_ORDER) { result = SCAN_PAGE_COMPOUND; - goto drop_hpage; + goto drop_folio; } result = find_pmd_or_thp_or_none(mm, haddr, &pmd); @@ -1528,13 +1523,13 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, */ goto maybe_install_pmd; default: - goto drop_hpage; + goto drop_folio; } result = SCAN_FAIL; start_pte = pte_offset_map_lock(mm, pmd, haddr, &ptl); if (!start_pte) /* mmap_lock + page lock should prevent this */ - goto drop_hpage; + goto drop_folio; /* step 1: check all mapped PTEs are to the right huge page */ for (i = 0, addr = haddr, pte = start_pte; @@ -1559,7 +1554,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, * Note that uprobe, debugger, or MAP_PRIVATE may change the * page table, but the new page will not be a subpage of hpage. */ - if (hpage + i != page) + if (folio_page(folio, i) != page) goto abort; } @@ -1574,7 +1569,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, * page_table_lock) ptl nests inside pml. The less time we hold pml, * the better; but userfaultfd's mfill_atomic_pte() on a private VMA * inserts a valid as-if-COWed PTE without even looking up page cache. - * So page lock of hpage does not protect from it, so we must not drop + * So page lock of folio does not protect from it, so we must not drop * ptl before pgt_pmd is removed, so uffd private needs pml taken now. */ if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED)) @@ -1598,7 +1593,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, continue; /* * We dropped ptl after the first scan, to do the mmu_notifier: - * page lock stops more PTEs of the hpage being faulted in, but + * page lock stops more PTEs of the folio being faulted in, but * does not stop write faults COWing anon copies from existing * PTEs; and does not stop those being swapped out or migrated. */ @@ -1607,7 +1602,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, goto abort; } page = vm_normal_page(vma, addr, ptent); - if (hpage + i != page) + if (folio_page(folio, i) != page) goto abort; /* @@ -1626,8 +1621,8 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, /* step 3: set proper refcount and mm_counters. */ if (nr_ptes) { - page_ref_sub(hpage, nr_ptes); - add_mm_counter(mm, mm_counter_file(hpage), -nr_ptes); + folio_ref_sub(folio, nr_ptes); + add_mm_counter(mm, mm_counter_file(&folio->page), -nr_ptes); } /* step 4: remove empty page table */ @@ -1651,14 +1646,14 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, maybe_install_pmd: /* step 5: install pmd entry */ result = install_pmd - ? set_huge_pmd(vma, haddr, pmd, hpage) + ? set_huge_pmd(vma, haddr, pmd, &folio->page) : SCAN_SUCCEED; - goto drop_hpage; + goto drop_folio; abort: if (nr_ptes) { flush_tlb_mm(mm); - page_ref_sub(hpage, nr_ptes); - add_mm_counter(mm, mm_counter_file(hpage), -nr_ptes); + folio_ref_sub(folio, nr_ptes); + add_mm_counter(mm, mm_counter_file(&folio->page), -nr_ptes); } if (start_pte) pte_unmap_unlock(start_pte, ptl); @@ -1666,9 +1661,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, spin_unlock(pml); if (notified) mmu_notifier_invalidate_range_end(&range); -drop_hpage: - unlock_page(hpage); - put_page(hpage); +drop_folio: + folio_unlock(folio); + folio_put(folio); return result; } From 9775f0e44505a8e2c48f0821caac6a1a74fb8896 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:33 +0100 Subject: [PATCH 162/548] mm/huge_memory: page_add_file_rmap() -> folio_add_file_rmap_pmd() Let's convert remove_migration_pmd() and while at it, perform some folio conversion. Link: https://lkml.kernel.org/r/20231220224504.646757-10-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit 14d85a6e88a658e29d9c8d6c521e7f824f2f2c6c) Signed-off-by: Koba Ko --- mm/huge_memory.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9f77d234b05c0..a4119cca09989 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3445,6 +3445,7 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) { + struct folio *folio = page_folio(new); struct vm_area_struct *vma = pvmw->vma; struct mm_struct *mm = vma->vm_mm; unsigned long address = pvmw->address; @@ -3456,7 +3457,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) return; entry = pmd_to_swp_entry(*pvmw->pmd); - get_page(new); + folio_get(folio); pmde = mk_huge_pmd(new, READ_ONCE(vma->vm_page_prot)); if (pmd_swp_soft_dirty(*pvmw->pmd)) pmde = pmd_mksoft_dirty(pmde); @@ -3467,10 +3468,10 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) if (!is_migration_entry_young(entry)) pmde = pmd_mkold(pmde); /* NOTE: this may contain setting soft-dirty on some archs */ - if (PageDirty(new) && is_migration_entry_dirty(entry)) + if (folio_test_dirty(folio) && is_migration_entry_dirty(entry)) pmde = pmd_mkdirty(pmde); - if (PageAnon(new)) { + if (folio_test_anon(folio)) { rmap_t rmap_flags = RMAP_COMPOUND; if (!is_readable_migration_entry(entry)) @@ -3478,9 +3479,9 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) page_add_anon_rmap(new, vma, haddr, rmap_flags); } else { - page_add_file_rmap(new, vma, true); + folio_add_file_rmap_pmd(folio, new, vma); } - VM_BUG_ON(pmd_write(pmde) && PageAnon(new) && !PageAnonExclusive(new)); + VM_BUG_ON(pmd_write(pmde) && folio_test_anon(folio) && !PageAnonExclusive(new)); set_pmd_at(mm, haddr, pvmw->pmd, pmde); /* No need to invalidate - it was non-present before */ From a7d4eb0c75062887f6f280e98b2eb758d89aa0e7 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:34 +0100 Subject: [PATCH 163/548] mm/migrate: page_add_file_rmap() -> folio_add_file_rmap_pte() Let's convert remove_migration_pte(). Link: https://lkml.kernel.org/r/20231220224504.646757-11-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit c4dffb0bc237d5e3b51adf947062e65ed34ac3c3) Signed-off-by: Koba Ko --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/migrate.c b/mm/migrate.c index 1ca33989565a5..fea18c6b6ea9b 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -262,7 +262,7 @@ static bool remove_migration_pte(struct folio *folio, page_add_anon_rmap(new, vma, pvmw.address, rmap_flags); else - page_add_file_rmap(new, vma, false); + folio_add_file_rmap_pte(folio, new, vma); set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); } if (vma->vm_flags & VM_LOCKED) From 8326c8daa46dc0cf5523c0313930daa68dd74f0e Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:35 +0100 Subject: [PATCH 164/548] mm/userfaultfd: page_add_file_rmap() -> folio_add_file_rmap_pte() Let's convert mfill_atomic_install_pte(). Link: https://lkml.kernel.org/r/20231220224504.646757-12-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit 7123e19c3c9d1539c899ac8d919498e3393bb288) Signed-off-by: Koba Ko --- mm/userfaultfd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index ffef13f97eddf..2031e1d5b2d7c 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -114,7 +114,7 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd, /* Usually, cache pages are already added to LRU */ if (newly_allocated) folio_add_lru(folio); - page_add_file_rmap(page, dst_vma, false); + folio_add_file_rmap_pte(folio, page, dst_vma); } else { folio_add_new_anon_rmap(folio, dst_vma, dst_addr); folio_add_lru_vma(folio, dst_vma); From 133312f9a7f9d091196774d7e1eecc6323d6fa84 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:36 +0100 Subject: [PATCH 165/548] mm/rmap: remove page_add_file_rmap() All users are gone, let's remove it. Link: https://lkml.kernel.org/r/20231220224504.646757-13-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Reviewed-by: Ryan Roberts Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Signed-off-by: Andrew Morton (cherry picked from commit be6e57cfabe99a5d3b3869103c4ea0ed4a9692d4) Signed-off-by: Koba Ko --- include/linux/rmap.h | 2 -- mm/rmap.c | 21 --------------------- 2 files changed, 23 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index dda60b8c19ef2..2d774e44b26d0 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -232,8 +232,6 @@ void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); -void page_add_file_rmap(struct page *, struct vm_area_struct *, - bool compound); void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages, struct vm_area_struct *); #define folio_add_file_rmap_pte(folio, page, vma) \ diff --git a/mm/rmap.c b/mm/rmap.c index bf4bf8f8e660f..1cab627dd7aba 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1436,27 +1436,6 @@ void folio_add_file_rmap_pmd(struct folio *folio, struct page *page, #endif } -/** - * page_add_file_rmap - add pte mapping to a file page - * @page: the page to add the mapping to - * @vma: the vm area in which the mapping is added - * @compound: charge the page as compound or small page - * - * The caller needs to hold the pte lock. - */ -void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, - bool compound) -{ - struct folio *folio = page_folio(page); - - VM_WARN_ON_ONCE_PAGE(compound && !PageTransHuge(page), page); - - if (likely(!compound)) - folio_add_file_rmap_pte(folio, page, vma); - else - folio_add_file_rmap_pmd(folio, page, vma); -} - /** * page_remove_rmap - take down pte mapping from a page * @page: page to remove mapping from From 926f4754eee42d2b2ff1905b15aabdf8c352db60 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:37 +0100 Subject: [PATCH 166/548] mm/rmap: factor out adding folio mappings into __folio_add_rmap() Let's factor it out to prepare for reuse as we convert page_add_anon_rmap() to folio_add_anon_rmap_[pte|ptes|pmd](). Make the compiler always special-case on the granularity by using __always_inline. Link: https://lkml.kernel.org/r/20231220224504.646757-14-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Signed-off-by: Andrew Morton (cherry picked from commit 96fd74958c558d6976bbc303dda0efa389182fab) Signed-off-by: Koba Ko --- mm/rmap.c | 78 +++++++++++++++++++++++++++++++------------------------ 1 file changed, 44 insertions(+), 34 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 1cab627dd7aba..e2cad03e48106 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1127,6 +1127,48 @@ int folio_total_mapcount(struct folio *folio) return mapcount; } +static __always_inline unsigned int __folio_add_rmap(struct folio *folio, + struct page *page, int nr_pages, enum rmap_level level, + int *nr_pmdmapped) +{ + atomic_t *mapped = &folio->_nr_pages_mapped; + int first, nr = 0; + + __folio_rmap_sanity_checks(folio, page, nr_pages, level); + + switch (level) { + case RMAP_LEVEL_PTE: + do { + first = atomic_inc_and_test(&page->_mapcount); + if (first && folio_test_large(folio)) { + first = atomic_inc_return_relaxed(mapped); + first = (first < COMPOUND_MAPPED); + } + + if (first) + nr++; + } while (page++, --nr_pages > 0); + break; + case RMAP_LEVEL_PMD: + first = atomic_inc_and_test(&folio->_entire_mapcount); + if (first) { + nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); + if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) { + *nr_pmdmapped = folio_nr_pages(folio); + nr = *nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED); + /* Raced ahead of a remove and another add? */ + if (unlikely(nr < 0)) + nr = 0; + } else { + /* Raced ahead of a remove of COMPOUND_MAPPED */ + nr = 0; + } + } + break; + } + return nr; +} + /** * folio_move_anon_rmap - move a folio to our anon_vma * @folio: The folio to move to our anon_vma @@ -1351,43 +1393,11 @@ static __always_inline void __folio_add_file_rmap(struct folio *folio, struct page *page, int nr_pages, struct vm_area_struct *vma, enum rmap_level level) { - atomic_t *mapped = &folio->_nr_pages_mapped; - int nr = 0, nr_pmdmapped = 0, first; + int nr, nr_pmdmapped = 0; VM_WARN_ON_FOLIO(folio_test_anon(folio), folio); - __folio_rmap_sanity_checks(folio, page, nr_pages, level); - - switch (level) { - case RMAP_LEVEL_PTE: - do { - first = atomic_inc_and_test(&page->_mapcount); - if (first && folio_test_large(folio)) { - first = atomic_inc_return_relaxed(mapped); - first = (first < COMPOUND_MAPPED); - } - - if (first) - nr++; - } while (page++, --nr_pages > 0); - break; - case RMAP_LEVEL_PMD: - first = atomic_inc_and_test(&folio->_entire_mapcount); - if (first) { - nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); - if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) { - nr_pmdmapped = folio_nr_pages(folio); - nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED); - /* Raced ahead of a remove and another add? */ - if (unlikely(nr < 0)) - nr = 0; - } else { - /* Raced ahead of a remove of COMPOUND_MAPPED */ - nr = 0; - } - } - break; - } + nr = __folio_add_rmap(folio, page, nr_pages, level, &nr_pmdmapped); if (nr_pmdmapped) __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ? NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped); From 3ba32485719807d06e557182270a3280f35d7006 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:38 +0100 Subject: [PATCH 167/548] mm/rmap: introduce folio_add_anon_rmap_[pte|ptes|pmd]() Let's mimic what we did with folio_add_file_rmap_*() so we can similarly replace page_add_anon_rmap() next. Make the compiler always special-case on the granularity by using __always_inline. For the PageAnonExclusive sanity checks, when adding a PMD mapping, we're now also checking each individual subpage covered by that PMD, instead of only the head page. Note that the new functions ignore the RMAP_COMPOUND flag, which we will remove as soon as page_add_anon_rmap() is gone. Link: https://lkml.kernel.org/r/20231220224504.646757-15-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Signed-off-by: Andrew Morton (cherry picked from commit 8bd5130070fbf2247a97c5361427a810522ac98a) Signed-off-by: Koba Ko --- include/linux/rmap.h | 6 +++ mm/rmap.c | 119 +++++++++++++++++++++++++++++-------------- 2 files changed, 88 insertions(+), 37 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 2d774e44b26d0..238314a1539c9 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -228,6 +228,12 @@ static inline void __folio_rmap_sanity_checks(struct folio *folio, * rmap interfaces called when adding or removing pte of page */ void folio_move_anon_rmap(struct folio *, struct vm_area_struct *); +void folio_add_anon_rmap_ptes(struct folio *, struct page *, int nr_pages, + struct vm_area_struct *, unsigned long address, rmap_t flags); +#define folio_add_anon_rmap_pte(folio, page, vma, address, flags) \ + folio_add_anon_rmap_ptes(folio, page, 1, vma, address, flags) +void folio_add_anon_rmap_pmd(struct folio *, struct page *, + struct vm_area_struct *, unsigned long address, rmap_t flags); void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *, diff --git a/mm/rmap.c b/mm/rmap.c index e2cad03e48106..12c3c325659f7 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1269,43 +1269,20 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address, rmap_t flags) { struct folio *folio = page_folio(page); - atomic_t *mapped = &folio->_nr_pages_mapped; - int nr = 0, nr_pmdmapped = 0; - bool compound = flags & RMAP_COMPOUND; - bool first = true; - VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); - - /* Is page being mapped by PTE? Is this its first map to be added? */ - if (likely(!compound)) { - first = atomic_inc_and_test(&page->_mapcount); - nr = first; - if (first && folio_test_large(folio)) { - nr = atomic_inc_return_relaxed(mapped); - nr = (nr < COMPOUND_MAPPED); - } - } else if (folio_test_pmd_mappable(folio)) { - /* That test is redundant: it's for safety or to optimize out */ - - first = atomic_inc_and_test(&folio->_entire_mapcount); - if (first) { - nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); - if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) { - nr_pmdmapped = folio_nr_pages(folio); - nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED); - /* Raced ahead of a remove and another add? */ - if (unlikely(nr < 0)) - nr = 0; - } else { - /* Raced ahead of a remove of COMPOUND_MAPPED */ - nr = 0; - } - } - } + if (likely(!(flags & RMAP_COMPOUND))) + folio_add_anon_rmap_pte(folio, page, vma, address, flags); + else + folio_add_anon_rmap_pmd(folio, page, vma, address, flags); +} - VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page); - VM_BUG_ON_PAGE(!first && PageAnonExclusive(page), page); +static __always_inline void __folio_add_anon_rmap(struct folio *folio, + struct page *page, int nr_pages, struct vm_area_struct *vma, + unsigned long address, rmap_t flags, enum rmap_level level) +{ + int i, nr, nr_pmdmapped = 0; + nr = __folio_add_rmap(folio, page, nr_pages, level, &nr_pmdmapped); if (nr_pmdmapped) __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr_pmdmapped); if (nr) @@ -1319,14 +1296,34 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, * folio->index right when not given the address of the head * page. */ - VM_WARN_ON_FOLIO(folio_test_large(folio) && !compound, folio); + VM_WARN_ON_FOLIO(folio_test_large(folio) && + level != RMAP_LEVEL_PMD, folio); __folio_set_anon(folio, vma, address, !!(flags & RMAP_EXCLUSIVE)); } else if (likely(!folio_test_ksm(folio))) { __page_check_anon_rmap(folio, page, vma, address); } - if (flags & RMAP_EXCLUSIVE) - SetPageAnonExclusive(page); + + if (flags & RMAP_EXCLUSIVE) { + switch (level) { + case RMAP_LEVEL_PTE: + for (i = 0; i < nr_pages; i++) + SetPageAnonExclusive(page + i); + break; + case RMAP_LEVEL_PMD: + SetPageAnonExclusive(page); + break; + } + } + for (i = 0; i < nr_pages; i++) { + struct page *cur_page = page + i; + + /* While PTE-mapping a THP we have a PMD and a PTE mapping. */ + VM_WARN_ON_FOLIO((atomic_read(&cur_page->_mapcount) > 0 || + (folio_test_large(folio) && + folio_entire_mapcount(folio) > 1)) && + PageAnonExclusive(cur_page), folio); + } /* * For large folio, only mlock it if it's fully mapped to VMA. It's @@ -1338,6 +1335,54 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, mlock_vma_folio(folio, vma); } +/** + * folio_add_anon_rmap_ptes - add PTE mappings to a page range of an anon folio + * @folio: The folio to add the mappings to + * @page: The first page to add + * @nr_pages: The number of pages which will be mapped + * @vma: The vm area in which the mappings are added + * @address: The user virtual address of the first page to map + * @flags: The rmap flags + * + * The page range of folio is defined by [first_page, first_page + nr_pages) + * + * The caller needs to hold the page table lock, and the page must be locked in + * the anon_vma case: to serialize mapping,index checking after setting, + * and to ensure that an anon folio is not being upgraded racily to a KSM folio + * (but KSM folios are never downgraded). + */ +void folio_add_anon_rmap_ptes(struct folio *folio, struct page *page, + int nr_pages, struct vm_area_struct *vma, unsigned long address, + rmap_t flags) +{ + __folio_add_anon_rmap(folio, page, nr_pages, vma, address, flags, + RMAP_LEVEL_PTE); +} + +/** + * folio_add_anon_rmap_pmd - add a PMD mapping to a page range of an anon folio + * @folio: The folio to add the mapping to + * @page: The first page to add + * @vma: The vm area in which the mapping is added + * @address: The user virtual address of the first page to map + * @flags: The rmap flags + * + * The page range of folio is defined by [first_page, first_page + HPAGE_PMD_NR) + * + * The caller needs to hold the page table lock, and the page must be locked in + * the anon_vma case: to serialize mapping,index checking after setting. + */ +void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page, + struct vm_area_struct *vma, unsigned long address, rmap_t flags) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + __folio_add_anon_rmap(folio, page, HPAGE_PMD_NR, vma, address, flags, + RMAP_LEVEL_PMD); +#else + WARN_ON_ONCE(true); +#endif +} + /** * folio_add_new_anon_rmap - Add mapping to a new anonymous folio. * @folio: The folio to add the mapping to. From 2c99753607258f32beefa997245fcbe2f0f44937 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:39 +0100 Subject: [PATCH 168/548] mm/huge_memory: batch rmap operations in __split_huge_pmd_locked() Let's use folio_add_anon_rmap_ptes(), batching the rmap operations. While at it, use more folio operations (but only in the code branch we're touching), use VM_WARN_ON_FOLIO(), and pass RMAP_EXCLUSIVE instead of manually setting PageAnonExclusive. We should never see non-anon pages on that branch: otherwise, the existing page_add_anon_rmap() call would have been flawed already. Link: https://lkml.kernel.org/r/20231220224504.646757-16-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Yin Fengwei Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Signed-off-by: Andrew Morton (cherry picked from commit 91b2978a348073db0e47b380fa66c865eb25f3d8) Signed-off-by: Koba Ko --- mm/huge_memory.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a4119cca09989..e1d3dcf64a0af 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2241,6 +2241,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, unsigned long haddr, bool freeze) { struct mm_struct *mm = vma->vm_mm; + struct folio *folio; struct page *page; pgtable_t pgtable; pmd_t old_pmd, _pmd; @@ -2339,16 +2340,18 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, */ old_pmd = pmdp_invalidate(vma, haddr, pmd); page = pmd_page(old_pmd); + folio = page_folio(page); if (pmd_dirty(old_pmd)) { dirty = true; - SetPageDirty(page); + folio_set_dirty(folio); } write = pmd_write(old_pmd); young = pmd_young(old_pmd); soft_dirty = pmd_soft_dirty(old_pmd); uffd_wp = pmd_uffd_wp(old_pmd); - VM_BUG_ON_PAGE(!page_count(page), page); + VM_WARN_ON_FOLIO(!folio_ref_count(folio), folio); + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); /* * Without "freeze", we'll simply split the PMD, propagating the @@ -2365,11 +2368,18 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, * * See page_try_share_anon_rmap(): invalidate PMD first. */ - anon_exclusive = PageAnon(page) && PageAnonExclusive(page); + anon_exclusive = PageAnonExclusive(page); if (freeze && anon_exclusive && page_try_share_anon_rmap(page)) freeze = false; - if (!freeze) - page_ref_add(page, HPAGE_PMD_NR - 1); + if (!freeze) { + rmap_t rmap_flags = RMAP_NONE; + + folio_ref_add(folio, HPAGE_PMD_NR - 1); + if (anon_exclusive) + rmap_flags |= RMAP_EXCLUSIVE; + folio_add_anon_rmap_ptes(folio, page, HPAGE_PMD_NR, + vma, haddr, rmap_flags); + } } /* @@ -2412,8 +2422,6 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot)); if (write) entry = pte_mkwrite(entry, vma); - if (anon_exclusive) - SetPageAnonExclusive(page + i); if (!young) entry = pte_mkold(entry); /* NOTE: this may set soft-dirty too on some archs */ @@ -2423,7 +2431,6 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = pte_mksoft_dirty(entry); if (uffd_wp) entry = pte_mkuffd_wp(entry); - page_add_anon_rmap(page + i, vma, addr, RMAP_NONE); } VM_BUG_ON(!pte_none(ptep_get(pte))); set_pte_at(mm, addr, pte, entry); From 99fbf676adc36c95c44414ff7a854d5922eccbb2 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:40 +0100 Subject: [PATCH 169/548] mm/huge_memory: page_add_anon_rmap() -> folio_add_anon_rmap_pmd() Let's convert remove_migration_pmd(). No need to set RMAP_COMPOUND, that we will remove soon. Link: https://lkml.kernel.org/r/20231220224504.646757-17-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 395db7b190892f1ca8d31e1fc83198e2531335f6) Signed-off-by: Koba Ko --- mm/huge_memory.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e1d3dcf64a0af..f44ec1f0dc653 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3479,12 +3479,12 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new) pmde = pmd_mkdirty(pmde); if (folio_test_anon(folio)) { - rmap_t rmap_flags = RMAP_COMPOUND; + rmap_t rmap_flags = RMAP_NONE; if (!is_readable_migration_entry(entry)) rmap_flags |= RMAP_EXCLUSIVE; - page_add_anon_rmap(new, vma, haddr, rmap_flags); + folio_add_anon_rmap_pmd(folio, new, vma, haddr, rmap_flags); } else { folio_add_file_rmap_pmd(folio, new, vma); } From d9a2e5f3a32baaeeb0af456cce5640cca1ae6037 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:41 +0100 Subject: [PATCH 170/548] mm/migrate: page_add_anon_rmap() -> folio_add_anon_rmap_pte() Let's convert remove_migration_pte(). Link: https://lkml.kernel.org/r/20231220224504.646757-18-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit a15dc4785c98f360bdca78483455e0aff30242cb) Signed-off-by: Koba Ko --- mm/migrate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index fea18c6b6ea9b..763bff5ccd41f 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -259,8 +259,8 @@ static bool remove_migration_pte(struct folio *folio, #endif { if (folio_test_anon(folio)) - page_add_anon_rmap(new, vma, pvmw.address, - rmap_flags); + folio_add_anon_rmap_pte(folio, new, vma, + pvmw.address, rmap_flags); else folio_add_file_rmap_pte(folio, new, vma); set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte); From ca9cb2991f508a9ff667610aedf301e114da95de Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:42 +0100 Subject: [PATCH 171/548] mm/ksm: page_add_anon_rmap() -> folio_add_anon_rmap_pte() Let's convert replace_page(). While at it, perform some folio conversion. Link: https://lkml.kernel.org/r/20231220224504.646757-19-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 977295349eb7826c50e2841915de96eab3a502c2) Signed-off-by: Koba Ko --- mm/ksm.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index 5fbe58d863c77..37595360c8c86 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1186,6 +1186,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, static int replace_page(struct vm_area_struct *vma, struct page *page, struct page *kpage, pte_t orig_pte) { + struct folio *kfolio = page_folio(kpage); struct mm_struct *mm = vma->vm_mm; struct folio *folio; pmd_t *pmd; @@ -1225,15 +1226,16 @@ static int replace_page(struct vm_area_struct *vma, struct page *page, goto out_mn; } VM_BUG_ON_PAGE(PageAnonExclusive(page), page); - VM_BUG_ON_PAGE(PageAnon(kpage) && PageAnonExclusive(kpage), kpage); + VM_BUG_ON_FOLIO(folio_test_anon(kfolio) && PageAnonExclusive(kpage), + kfolio); /* * No need to check ksm_use_zero_pages here: we can only have a * zero_page here if ksm_use_zero_pages was enabled already. */ if (!is_zero_pfn(page_to_pfn(kpage))) { - get_page(kpage); - page_add_anon_rmap(kpage, vma, addr, RMAP_NONE); + folio_get(kfolio); + folio_add_anon_rmap_pte(kfolio, kpage, vma, addr, RMAP_NONE); newpte = mk_pte(kpage, vma->vm_page_prot); } else { /* From 122bde9ea4a9bf2b65e47e583fafcd96a7aaf55a Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:43 +0100 Subject: [PATCH 172/548] mm/swapfile: page_add_anon_rmap() -> folio_add_anon_rmap_pte() Let's convert unuse_pte(). Link: https://lkml.kernel.org/r/20231220224504.646757-20-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit da7dc0afe243874b6ad25f5070aa728349e4e0fd) Signed-off-by: Koba Ko --- mm/swapfile.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index d67c20144dc87..d59faced1133d 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1817,7 +1817,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, if (pte_swp_exclusive(old_pte)) rmap_flags |= RMAP_EXCLUSIVE; - page_add_anon_rmap(page, vma, addr, rmap_flags); + folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); } else { /* ksm created a completely new copy */ folio_add_new_anon_rmap(folio, vma, addr); folio_add_lru_vma(folio, vma); From a7cc8d0cc81b7842228fbab53fa750d89faefc3b Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:44 +0100 Subject: [PATCH 173/548] mm/memory: page_add_anon_rmap() -> folio_add_anon_rmap_pte() Let's convert restore_exclusive_pte() and do_swap_page(). While at it, perform some folio conversion in restore_exclusive_pte(). Link: https://lkml.kernel.org/r/20231220224504.646757-21-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit b832a354d787bfbdea5c226f0d77cc1a222d09f8) Signed-off-by: Koba Ko --- mm/memory.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index ebcf63dd0599a..501a3800d1d40 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -697,6 +697,7 @@ static void restore_exclusive_pte(struct vm_area_struct *vma, struct page *page, unsigned long address, pte_t *ptep) { + struct folio *folio = page_folio(page); pte_t orig_pte; pte_t pte; swp_entry_t entry; @@ -712,14 +713,15 @@ static void restore_exclusive_pte(struct vm_area_struct *vma, else if (is_writable_device_exclusive_entry(entry)) pte = maybe_mkwrite(pte_mkdirty(pte), vma); - VM_BUG_ON(pte_write(pte) && !(PageAnon(page) && PageAnonExclusive(page))); + VM_BUG_ON_FOLIO(pte_write(pte) && (!folio_test_anon(folio) && + PageAnonExclusive(page)), folio); /* * No need to take a page reference as one was already * created when the swap entry was made. */ - if (PageAnon(page)) - page_add_anon_rmap(page, vma, address, RMAP_NONE); + if (folio_test_anon(folio)) + folio_add_anon_rmap_pte(folio, page, vma, address, RMAP_NONE); else /* * Currently device exclusive access only supports anonymous @@ -4071,7 +4073,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) folio_add_new_anon_rmap(folio, vma, vmf->address); folio_add_lru_vma(folio, vma); } else { - page_add_anon_rmap(page, vma, vmf->address, rmap_flags); + folio_add_anon_rmap_pte(folio, page, vma, vmf->address, + rmap_flags); } VM_BUG_ON(!folio_test_anon(folio) || From 2ae82ca3645d2954a69731591401418ceb1913e4 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:45 +0100 Subject: [PATCH 174/548] mm/rmap: remove page_add_anon_rmap() All users are gone, remove it and all traces. Link: https://lkml.kernel.org/r/20231220224504.646757-22-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 84f0169e6c8a613012722e0d63302f9da4a72099) Signed-off-by: Koba Ko --- include/linux/rmap.h | 2 -- mm/rmap.c | 31 ++++--------------------------- 2 files changed, 4 insertions(+), 29 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 238314a1539c9..e139693b71045 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -234,8 +234,6 @@ void folio_add_anon_rmap_ptes(struct folio *, struct page *, int nr_pages, folio_add_anon_rmap_ptes(folio, page, 1, vma, address, flags) void folio_add_anon_rmap_pmd(struct folio *, struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); -void page_add_anon_rmap(struct page *, struct vm_area_struct *, - unsigned long address, rmap_t flags); void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages, diff --git a/mm/rmap.c b/mm/rmap.c index 12c3c325659f7..47782d58ed295 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1240,7 +1240,7 @@ static void __page_check_anon_rmap(struct folio *folio, struct page *page, * The page's anon-rmap details (mapping and index) are guaranteed to * be set up correctly at this point. * - * We have exclusion against page_add_anon_rmap because the caller + * We have exclusion against folio_add_anon_rmap_*() because the caller * always holds the page locked. * * We have exclusion against folio_add_new_anon_rmap because those pages @@ -1253,29 +1253,6 @@ static void __page_check_anon_rmap(struct folio *folio, struct page *page, page); } -/** - * page_add_anon_rmap - add pte mapping to an anonymous page - * @page: the page to add the mapping to - * @vma: the vm area in which the mapping is added - * @address: the user virtual address mapped - * @flags: the rmap flags - * - * The caller needs to hold the pte lock, and the page must be locked in - * the anon_vma case: to serialize mapping,index checking after setting, - * and to ensure that PageAnon is not being upgraded racily to PageKsm - * (but PageKsm is never downgraded to PageAnon). - */ -void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, - unsigned long address, rmap_t flags) -{ - struct folio *folio = page_folio(page); - - if (likely(!(flags & RMAP_COMPOUND))) - folio_add_anon_rmap_pte(folio, page, vma, address, flags); - else - folio_add_anon_rmap_pmd(folio, page, vma, address, flags); -} - static __always_inline void __folio_add_anon_rmap(struct folio *folio, struct page *page, int nr_pages, struct vm_area_struct *vma, unsigned long address, rmap_t flags, enum rmap_level level) @@ -1389,7 +1366,7 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page, * @vma: the vm area in which the mapping is added * @address: the user virtual address mapped * - * Like page_add_anon_rmap() but must only be called on *new* folios. + * Like folio_add_anon_rmap_*() but must only be called on *new* folios. * This means the inc-and-test can be bypassed. * The folio does not have to be locked. * @@ -1449,7 +1426,7 @@ static __always_inline void __folio_add_file_rmap(struct folio *folio, if (nr) __lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr); - /* See comments in page_add_anon_rmap() */ + /* See comments in folio_add_anon_rmap_*() */ if (!folio_test_large(folio)) mlock_vma_folio(folio, vma); } @@ -1563,7 +1540,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, /* * It would be tidy to reset folio_test_anon mapping when fully - * unmapped, but that might overwrite a racing page_add_anon_rmap + * unmapped, but that might overwrite a racing folio_add_anon_rmap_*() * which increments mapcount after us but sets mapping before us: * so leave the reset to free_pages_prepare, and remember that * it's only reliable while mapped. From 6a7c6a0ed5079f2a7d0d959ceb6c39d1556ace93 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:46 +0100 Subject: [PATCH 175/548] mm/rmap: remove RMAP_COMPOUND No longer used, let's remove it and clarify RMAP_NONE/RMAP_EXCLUSIVE a bit. Link: https://lkml.kernel.org/r/20231220224504.646757-23-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 0cae959e3abf19ba62805f6e6a8b42b6cd9ed3e3) Signed-off-by: Koba Ko --- include/linux/rmap.h | 12 +++--------- mm/rmap.c | 2 -- 2 files changed, 3 insertions(+), 11 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index e139693b71045..aa5d9e9281853 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -172,20 +172,14 @@ struct anon_vma *folio_get_anon_vma(struct folio *folio); typedef int __bitwise rmap_t; /* - * No special request: if the page is a subpage of a compound page, it is - * mapped via a PTE. The mapped (sub)page is possibly shared between processes. + * No special request: A mapped anonymous (sub)page is possibly shared between + * processes. */ #define RMAP_NONE ((__force rmap_t)0) -/* The (sub)page is exclusive to a single process. */ +/* The anonymous (sub)page is exclusive to a single process. */ #define RMAP_EXCLUSIVE ((__force rmap_t)BIT(0)) -/* - * The compound page is not mapped via PTEs, but instead via a single PMD and - * should be accounted accordingly. - */ -#define RMAP_COMPOUND ((__force rmap_t)BIT(1)) - /* * Internally, we're using an enum to specify the granularity. We make the * compiler emit specialized code for each granularity. diff --git a/mm/rmap.c b/mm/rmap.c index 47782d58ed295..0bdcd8e00d0c1 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -2632,8 +2632,6 @@ void rmap_walk_locked(struct folio *folio, struct rmap_walk_control *rwc) * The following two functions are for anonymous (private mapped) hugepages. * Unlike common anonymous pages, anonymous hugepages have no accounting code * and no lru code, because we handle hugepages differently from common pages. - * - * RMAP_COMPOUND is ignored. */ void hugetlb_add_anon_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address, rmap_t flags) From 9d88bc14e45f5c127ca4f20fbb52e945d3a3f869 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:47 +0100 Subject: [PATCH 176/548] mm/rmap: introduce folio_remove_rmap_[pte|ptes|pmd]() Let's mimic what we did with folio_add_file_rmap_*() and folio_add_anon_rmap_*() so we can similarly replace page_remove_rmap() next. Make the compiler always special-case on the granularity by using __always_inline. We're adding folio_remove_rmap_ptes() handling right away, as we want to use that soon for batching rmap operations when unmapping PTE-mapped large folios. Link: https://lkml.kernel.org/r/20231220224504.646757-24-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit b06dc281aa9901076898d4d0a7bde588f11bc204) Signed-off-by: Koba Ko --- include/linux/rmap.h | 6 ++++ mm/rmap.c | 82 +++++++++++++++++++++++++++++++++++--------- 2 files changed, 72 insertions(+), 16 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index aa5d9e9281853..50ffddcc4eb8e 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -238,6 +238,12 @@ void folio_add_file_rmap_pmd(struct folio *, struct page *, struct vm_area_struct *); void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); +void folio_remove_rmap_ptes(struct folio *, struct page *, int nr_pages, + struct vm_area_struct *); +#define folio_remove_rmap_pte(folio, page, vma) \ + folio_remove_rmap_ptes(folio, page, 1, vma) +void folio_remove_rmap_pmd(struct folio *, struct page *, + struct vm_area_struct *); void hugetlb_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); diff --git a/mm/rmap.c b/mm/rmap.c index 0bdcd8e00d0c1..9a34cd1b1afe0 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1480,25 +1480,37 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, bool compound) { struct folio *folio = page_folio(page); + + if (likely(!compound)) + folio_remove_rmap_pte(folio, page, vma); + else + folio_remove_rmap_pmd(folio, page, vma); +} + +static __always_inline void __folio_remove_rmap(struct folio *folio, + struct page *page, int nr_pages, struct vm_area_struct *vma, + enum rmap_level level) +{ atomic_t *mapped = &folio->_nr_pages_mapped; - int nr = 0, nr_pmdmapped = 0; - bool last; + int last, nr = 0, nr_pmdmapped = 0; enum node_stat_item idx; - VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); - VM_BUG_ON_PAGE(compound && !PageHead(page), page); - - /* Is page being unmapped by PTE? Is this its last map to be removed? */ - if (likely(!compound)) { - last = atomic_add_negative(-1, &page->_mapcount); - nr = last; - if (last && folio_test_large(folio)) { - nr = atomic_dec_return_relaxed(mapped); - nr = (nr < COMPOUND_MAPPED); - } - } else if (folio_test_pmd_mappable(folio)) { - /* That test is redundant: it's for safety or to optimize out */ + __folio_rmap_sanity_checks(folio, page, nr_pages, level); + + switch (level) { + case RMAP_LEVEL_PTE: + do { + last = atomic_add_negative(-1, &page->_mapcount); + if (last && folio_test_large(folio)) { + last = atomic_dec_return_relaxed(mapped); + last = (last < COMPOUND_MAPPED); + } + if (last) + nr++; + } while (page++, --nr_pages > 0); + break; + case RMAP_LEVEL_PMD: last = atomic_add_negative(-1, &folio->_entire_mapcount); if (last) { nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped); @@ -1513,6 +1525,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, nr = 0; } } + break; } if (nr_pmdmapped) { @@ -1534,7 +1547,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, * is still mapped. */ if (folio_test_large(folio) && folio_test_anon(folio)) - if (!compound || nr < nr_pmdmapped) + if (level == RMAP_LEVEL_PTE || nr < nr_pmdmapped) deferred_split_folio(folio); } @@ -1549,6 +1562,43 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, munlock_vma_folio(folio, vma); } +/** + * folio_remove_rmap_ptes - remove PTE mappings from a page range of a folio + * @folio: The folio to remove the mappings from + * @page: The first page to remove + * @nr_pages: The number of pages that will be removed from the mapping + * @vma: The vm area from which the mappings are removed + * + * The page range of the folio is defined by [page, page + nr_pages) + * + * The caller needs to hold the page table lock. + */ +void folio_remove_rmap_ptes(struct folio *folio, struct page *page, + int nr_pages, struct vm_area_struct *vma) +{ + __folio_remove_rmap(folio, page, nr_pages, vma, RMAP_LEVEL_PTE); +} + +/** + * folio_remove_rmap_pmd - remove a PMD mapping from a page range of a folio + * @folio: The folio to remove the mapping from + * @page: The first page to remove + * @vma: The vm area from which the mapping is removed + * + * The page range of the folio is defined by [page, page + HPAGE_PMD_NR) + * + * The caller needs to hold the page table lock. + */ +void folio_remove_rmap_pmd(struct folio *folio, struct page *page, + struct vm_area_struct *vma) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + __folio_remove_rmap(folio, page, HPAGE_PMD_NR, vma, RMAP_LEVEL_PMD); +#else + WARN_ON_ONCE(true); +#endif +} + /* * @arg: enum ttu_flags will be passed to this argument */ From d1bf81f826bfd64bf4de326a196d600798fdf627 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:48 +0100 Subject: [PATCH 177/548] kernel/events/uprobes: page_remove_rmap() -> folio_remove_rmap_pte() Let's convert __replace_page(). Link: https://lkml.kernel.org/r/20231220224504.646757-25-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 5cc9695f06b065168f5c893c8e006b6a8a2c9c91) Signed-off-by: Koba Ko --- kernel/events/uprobes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 5fa4fc7c84485..646dcabc28783 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -209,7 +209,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, set_pte_at_notify(mm, addr, pvmw.pte, mk_pte(new_page, vma->vm_page_prot)); - page_remove_rmap(old_page, vma, false); + folio_remove_rmap_pte(old_folio, old_page, vma); if (!folio_mapped(old_folio)) folio_free_swap(old_folio); page_vma_mapped_walk_done(&pvmw); From b21ae7cc180f2a3f61c33f8827d8fc66ce689515 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:49 +0100 Subject: [PATCH 178/548] mm/huge_memory: page_remove_rmap() -> folio_remove_rmap_pmd() Let's convert zap_huge_pmd() and set_pmd_migration_entry(). While at it, perform some more folio conversion. Link: https://lkml.kernel.org/r/20231220224504.646757-26-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit a8e61d584eda0d5532b0bbfe3c2427d2688d3c83) Signed-off-by: Koba Ko --- mm/huge_memory.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index f44ec1f0dc653..259e0a6b4eb8c 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1863,7 +1863,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, if (pmd_present(orig_pmd)) { page = pmd_page(orig_pmd); - page_remove_rmap(page, vma, true); + folio_remove_rmap_pmd(page_folio(page), page, vma); VM_BUG_ON_PAGE(page_mapcount(page) < 0, page); VM_BUG_ON_PAGE(!PageHead(page), page); } else if (thp_migration_supported()) { @@ -2276,12 +2276,13 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, page = pfn_swap_entry_to_page(entry); } else { page = pmd_page(old_pmd); - if (!PageDirty(page) && pmd_dirty(old_pmd)) - set_page_dirty(page); - if (!PageReferenced(page) && pmd_young(old_pmd)) - SetPageReferenced(page); - page_remove_rmap(page, vma, true); - put_page(page); + folio = page_folio(page); + if (!folio_test_dirty(folio) && pmd_dirty(old_pmd)) + folio_set_dirty(folio); + if (!folio_test_referenced(folio) && pmd_young(old_pmd)) + folio_set_referenced(folio); + folio_remove_rmap_pmd(folio, page, vma); + folio_put(folio); } add_mm_counter(mm, mm_counter_file(page), -HPAGE_PMD_NR); return; @@ -2439,7 +2440,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, pte_unmap(pte - 1); if (!pmd_migration) - page_remove_rmap(page, vma, true); + folio_remove_rmap_pmd(folio, page, vma); if (freeze) put_page(page); @@ -3404,6 +3405,7 @@ late_initcall(split_huge_pages_debugfs); int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, struct page *page) { + struct folio *folio = page_folio(page); struct vm_area_struct *vma = pvmw->vma; struct mm_struct *mm = vma->vm_mm; unsigned long address = pvmw->address; @@ -3419,14 +3421,14 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, pmdval = pmdp_invalidate(vma, address, pvmw->pmd); /* See page_try_share_anon_rmap(): invalidate PMD first. */ - anon_exclusive = PageAnon(page) && PageAnonExclusive(page); + anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page); if (anon_exclusive && page_try_share_anon_rmap(page)) { set_pmd_at(mm, address, pvmw->pmd, pmdval); return -EBUSY; } if (pmd_dirty(pmdval)) - set_page_dirty(page); + folio_set_dirty(folio); if (pmd_write(pmdval)) entry = make_writable_migration_entry(page_to_pfn(page)); else if (anon_exclusive) @@ -3443,8 +3445,8 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, if (pmd_uffd_wp(pmdval)) pmdswp = pmd_swp_mkuffd_wp(pmdswp); set_pmd_at(mm, address, pvmw->pmd, pmdswp); - page_remove_rmap(page, vma, true); - put_page(page); + folio_remove_rmap_pmd(folio, page, vma); + folio_put(folio); trace_set_migration_pmd(address, pmd_val(pmdswp)); return 0; From 578159bfdb9e935b38900f6fe1c3ef775b9289b5 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:50 +0100 Subject: [PATCH 179/548] mm/khugepaged: page_remove_rmap() -> folio_remove_rmap_pte() Let's convert __collapse_huge_page_copy_succeeded() and collapse_pte_mapped_thp(). While at it, perform some more folio conversion in __collapse_huge_page_copy_succeeded(). We can get rid of release_pte_page(). Link: https://lkml.kernel.org/r/20231220224504.646757-27-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 35668a4321461505dcc39b56a0d97b0ba2c99668) Signed-off-by: Koba Ko --- mm/khugepaged.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 40b84ba39e747..127f3f24c7910 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -494,11 +494,6 @@ static void release_pte_folio(struct folio *folio) folio_putback_lru(folio); } -static void release_pte_page(struct page *page) -{ - release_pte_folio(page_folio(page)); -} - static void release_pte_pages(pte_t *pte, pte_t *_pte, struct list_head *compound_pagelist) { @@ -686,6 +681,7 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte, spinlock_t *ptl, struct list_head *compound_pagelist) { + struct folio *src_folio; struct page *src_page; struct page *tmp; pte_t *_pte; @@ -707,16 +703,17 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte, } } else { src_page = pte_page(pteval); - if (!PageCompound(src_page)) - release_pte_page(src_page); + src_folio = page_folio(src_page); + if (!folio_test_large(src_folio)) + release_pte_folio(src_folio); /* * ptl mostly unnecessary, but preempt has to * be disabled to update the per-cpu stats - * inside page_remove_rmap(). + * inside folio_remove_rmap_pte(). */ spin_lock(ptl); ptep_clear(vma->vm_mm, address, _pte); - page_remove_rmap(src_page, vma, false); + folio_remove_rmap_pte(src_folio, src_page, vma); spin_unlock(ptl); free_page_and_swap_cache(src_page); } @@ -1611,7 +1608,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, * PTE dirty? Shmem page is already dirty; file is read-only. */ ptep_clear(mm, addr, pte); - page_remove_rmap(page, vma, false); + folio_remove_rmap_pte(folio, page, vma); nr_ptes++; } From 3af3f390e307f53bd6c9c9acf69c446a93e9687b Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:51 +0100 Subject: [PATCH 180/548] mm/ksm: page_remove_rmap() -> folio_remove_rmap_pte() Let's convert replace_page(). Link: https://lkml.kernel.org/r/20231220224504.646757-28-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 18e8612e56244c6db3254d435a22344856a9c55b) Signed-off-by: Koba Ko --- mm/ksm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/ksm.c b/mm/ksm.c index 37595360c8c86..558eb2153f107 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1265,7 +1265,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page, set_pte_at_notify(mm, addr, ptep, newpte); folio = page_folio(page); - page_remove_rmap(page, vma, false); + folio_remove_rmap_pte(folio, page, vma); if (!folio_mapped(folio)) folio_free_swap(folio); folio_put(folio); From cc12eae19bd32cd4ebcf050d995e2d1109a330a0 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:52 +0100 Subject: [PATCH 181/548] mm/memory: page_remove_rmap() -> folio_remove_rmap_pte() Let's convert zap_pte_range() and closely-related tlb_flush_rmap_batch(). While at it, perform some more folio conversion in zap_pte_range(). Link: https://lkml.kernel.org/r/20231220224504.646757-29-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit c46265030b0f400ef89833bb51da62676d2f855a) Signed-off-by: Koba Ko --- mm/memory.c | 23 +++++++++++++---------- mm/mmu_gather.c | 2 +- 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 501a3800d1d40..89fc28e48c001 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1418,6 +1418,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, arch_enter_lazy_mmu_mode(); do { pte_t ptent = ptep_get(pte); + struct folio *folio; struct page *page; if (pte_none(ptent)) @@ -1443,21 +1444,22 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, continue; } + folio = page_folio(page); delay_rmap = 0; - if (!PageAnon(page)) { + if (!folio_test_anon(folio)) { if (pte_dirty(ptent)) { - set_page_dirty(page); + folio_set_dirty(folio); if (tlb_delay_rmap(tlb)) { delay_rmap = 1; force_flush = 1; } } if (pte_young(ptent) && likely(vma_has_recency(vma))) - mark_page_accessed(page); + folio_mark_accessed(folio); } rss[mm_counter(page)]--; if (!delay_rmap) { - page_remove_rmap(page, vma, false); + folio_remove_rmap_pte(folio, page, vma); if (unlikely(page_mapcount(page) < 0)) print_bad_pte(vma, addr, ptent, page); } @@ -1473,6 +1475,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (is_device_private_entry(entry) || is_device_exclusive_entry(entry)) { page = pfn_swap_entry_to_page(entry); + folio = page_folio(page); if (unlikely(!should_zap_page(details, page))) continue; /* @@ -1484,8 +1487,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, WARN_ON_ONCE(!vma_is_anonymous(vma)); rss[mm_counter(page)]--; if (is_device_private_entry(entry)) - page_remove_rmap(page, vma, false); - put_page(page); + folio_remove_rmap_pte(folio, page, vma); + folio_put(folio); } else if (!non_swap_entry(entry)) { /* Genuine swap entry, hence a private anon page */ if (!should_zap_cows(details)) @@ -3218,10 +3221,10 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * threads. * * The critical issue is to order this - * page_remove_rmap with the ptp_clear_flush above. - * Those stores are ordered by (if nothing else,) + * folio_remove_rmap_pte() with the ptp_clear_flush + * above. Those stores are ordered by (if nothing else,) * the barrier present in the atomic_add_negative - * in page_remove_rmap. + * in folio_remove_rmap_pte(); * * Then the TLB flush in ptep_clear_flush ensures that * no process can access the old page before the @@ -3230,7 +3233,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * mapcount is visible. So transitively, TLBs to * old page will be flushed before it can be reused. */ - page_remove_rmap(vmf->page, vma, false); + folio_remove_rmap_pte(old_folio, vmf->page, vma); } /* Free the old page.. */ diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c index 4f559f4ddd217..604ddf08affed 100644 --- a/mm/mmu_gather.c +++ b/mm/mmu_gather.c @@ -55,7 +55,7 @@ static void tlb_flush_rmap_batch(struct mmu_gather_batch *batch, struct vm_area_ if (encoded_page_flags(enc)) { struct page *page = encoded_page_ptr(enc); - page_remove_rmap(page, vma, false); + folio_remove_rmap_pte(page_folio(page), page, vma); } } } From 437345906b99a54b633ea458d5e2e1fafe9f92f0 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:53 +0100 Subject: [PATCH 182/548] mm/migrate_device: page_remove_rmap() -> folio_remove_rmap_pte() Let's convert migrate_vma_collect_pmd(). While at it, perform more folio conversion. Link: https://lkml.kernel.org/r/20231220224504.646757-30-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 5b205c7f2684764c8a9cc3442986623d4d6e87f1) Signed-off-by: Koba Ko --- mm/migrate_device.c | 39 +++++++++++++++++++++------------------ 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 81193363f8cd5..39b7754480c67 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -107,6 +107,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, for (; addr < end; addr += PAGE_SIZE, ptep++) { unsigned long mpfn = 0, pfn; + struct folio *folio; struct page *page; swp_entry_t entry; pte_t pte; @@ -168,41 +169,43 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, } /* - * By getting a reference on the page we pin it and that blocks + * By getting a reference on the folio we pin it and that blocks * any kind of migration. Side effect is that it "freezes" the * pte. * - * We drop this reference after isolating the page from the lru - * for non device page (device page are not on the lru and thus + * We drop this reference after isolating the folio from the lru + * for non device folio (device folio are not on the lru and thus * can't be dropped from it). */ - get_page(page); + folio = page_folio(page); + folio_get(folio); /* - * We rely on trylock_page() to avoid deadlock between + * We rely on folio_trylock() to avoid deadlock between * concurrent migrations where each is waiting on the others - * page lock. If we can't immediately lock the page we fail this + * folio lock. If we can't immediately lock the folio we fail this * migration as it is only best effort anyway. * - * If we can lock the page it's safe to set up a migration entry - * now. In the common case where the page is mapped once in a + * If we can lock the folio it's safe to set up a migration entry + * now. In the common case where the folio is mapped once in a * single process setting up the migration entry now is an * optimisation to avoid walking the rmap later with * try_to_migrate(). */ - if (trylock_page(page)) { + if (folio_trylock(folio)) { bool anon_exclusive; pte_t swp_pte; flush_cache_page(vma, addr, pte_pfn(pte)); - anon_exclusive = PageAnon(page) && PageAnonExclusive(page); + anon_exclusive = folio_test_anon(folio) && + PageAnonExclusive(page); if (anon_exclusive) { pte = ptep_clear_flush(vma, addr, ptep); if (page_try_share_anon_rmap(page)) { set_pte_at(mm, addr, ptep, pte); - unlock_page(page); - put_page(page); + folio_unlock(folio); + folio_put(folio); mpfn = 0; goto next; } @@ -214,7 +217,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, /* Set the dirty flag on the folio now the pte is gone. */ if (pte_dirty(pte)) - folio_mark_dirty(page_folio(page)); + folio_mark_dirty(folio); /* Setup special migration page table entry */ if (mpfn & MIGRATE_PFN_WRITE) @@ -248,16 +251,16 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, /* * This is like regular unmap: we remove the rmap and - * drop page refcount. Page won't be freed, as we took - * a reference just above. + * drop the folio refcount. The folio won't be freed, as + * we took a reference just above. */ - page_remove_rmap(page, vma, false); - put_page(page); + folio_remove_rmap_pte(folio, page, vma); + folio_put(folio); if (pte_present(pte)) unmapped++; } else { - put_page(page); + folio_put(folio); mpfn = 0; } From aade9c125ec54f0d0b564570634d27f31aa55b15 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:54 +0100 Subject: [PATCH 183/548] mm/rmap: page_remove_rmap() -> folio_remove_rmap_pte() Let's convert try_to_unmap_one() and try_to_migrate_one(). Link: https://lkml.kernel.org/r/20231220224504.646757-31-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit ca1a0746182c3c059573d7e4554d335cae5306dc) Signed-off-by: Koba Ko --- mm/rmap.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 9a34cd1b1afe0..53c3cb349d495 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1617,7 +1617,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, /* * When racing against e.g. zap_pte_range() on another cpu, - * in between its ptep_get_and_clear_full() and page_remove_rmap(), + * in between its ptep_get_and_clear_full() and folio_remove_rmap_*(), * try_to_unmap() may return before page_mapped() has become false, * if page table locking is skipped: use TTU_SYNC to wait for that. */ @@ -1898,7 +1898,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, if (unlikely(folio_test_hugetlb(folio))) hugetlb_remove_rmap(folio); else - page_remove_rmap(subpage, vma, false); + folio_remove_rmap_pte(folio, subpage, vma); if (vma->vm_flags & VM_LOCKED) mlock_drain_local(); folio_put(folio); @@ -1966,7 +1966,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, /* * When racing against e.g. zap_pte_range() on another cpu, - * in between its ptep_get_and_clear_full() and page_remove_rmap(), + * in between its ptep_get_and_clear_full() and folio_remove_rmap_*(), * try_to_migrate() may return before page_mapped() has become false, * if page table locking is skipped: use TTU_SYNC to wait for that. */ @@ -2259,7 +2259,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, if (unlikely(folio_test_hugetlb(folio))) hugetlb_remove_rmap(folio); else - page_remove_rmap(subpage, vma, false); + folio_remove_rmap_pte(folio, subpage, vma); if (vma->vm_flags & VM_LOCKED) mlock_drain_local(); folio_put(folio); @@ -2398,7 +2398,7 @@ static bool page_make_device_exclusive_one(struct folio *folio, * There is a reference on the page for the swap entry which has * been removed, so shouldn't take another. */ - page_remove_rmap(subpage, vma, false); + folio_remove_rmap_pte(folio, subpage, vma); } mmu_notifier_invalidate_range_end(&range); From f028b4a3fa4ea30dfc43549381ed71bc7ae38576 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:55 +0100 Subject: [PATCH 184/548] Documentation: stop referring to page_remove_rmap() Refer to folio_remove_rmap_*() instaed. Link: https://lkml.kernel.org/r/20231220224504.646757-32-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 5a0033f0285e0bb29f6e4d1593d4519c91ed882a) Signed-off-by: Koba Ko --- Documentation/mm/transhuge.rst | 2 +- Documentation/mm/unevictable-lru.rst | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst index 9a607059ea11c..cf81272a6b8b6 100644 --- a/Documentation/mm/transhuge.rst +++ b/Documentation/mm/transhuge.rst @@ -156,7 +156,7 @@ Partial unmap and deferred_split_folio() Unmapping part of THP (with munmap() or other way) is not going to free memory immediately. Instead, we detect that a subpage of THP is not in use -in page_remove_rmap() and queue the THP for splitting if memory pressure +in folio_remove_rmap_*() and queue the THP for splitting if memory pressure comes. Splitting will free up unused subpages. Splitting the page right away is not an option due to locking context in diff --git a/Documentation/mm/unevictable-lru.rst b/Documentation/mm/unevictable-lru.rst index 67f1338440a50..b6a07a26b10d5 100644 --- a/Documentation/mm/unevictable-lru.rst +++ b/Documentation/mm/unevictable-lru.rst @@ -486,7 +486,7 @@ munlock the pages if we're removing the last VM_LOCKED VMA that maps the pages. Before the unevictable/mlock changes, mlocking did not mark the pages in any way, so unmapping them required no processing. -For each PTE (or PMD) being unmapped from a VMA, page_remove_rmap() calls +For each PTE (or PMD) being unmapped from a VMA, folio_remove_rmap_*() calls munlock_vma_folio(), which calls munlock_folio() when the VMA is VM_LOCKED (unless it was a PTE mapping of a part of a transparent huge page). @@ -511,7 +511,7 @@ userspace; truncation even unmaps and deletes any private anonymous pages which had been Copied-On-Write from the file pages now being truncated. Mlocked pages can be munlocked and deleted in this way: like with munmap(), -for each PTE (or PMD) being unmapped from a VMA, page_remove_rmap() calls +for each PTE (or PMD) being unmapped from a VMA, folio_remove_rmap_*() calls munlock_vma_folio(), which calls munlock_folio() when the VMA is VM_LOCKED (unless it was a PTE mapping of a part of a transparent huge page). From c9ed4a32f38e86d2caa9a57bf4af607ff9daa9e9 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:56 +0100 Subject: [PATCH 185/548] mm/rmap: remove page_remove_rmap() All callers are gone, let's remove it and some leftover traces. Link: https://lkml.kernel.org/r/20231220224504.646757-33-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 4d8f7418e8ba36036c8486d92d9591c368ab9b85) Signed-off-by: Koba Ko --- include/linux/rmap.h | 4 +--- mm/filemap.c | 10 +++++----- mm/internal.h | 2 +- mm/memory-failure.c | 4 ++-- mm/rmap.c | 23 ++--------------------- 5 files changed, 11 insertions(+), 32 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 50ffddcc4eb8e..10f59b81beed1 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -236,8 +236,6 @@ void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages, folio_add_file_rmap_ptes(folio, page, 1, vma) void folio_add_file_rmap_pmd(struct folio *, struct page *, struct vm_area_struct *); -void page_remove_rmap(struct page *, struct vm_area_struct *, - bool compound); void folio_remove_rmap_ptes(struct folio *, struct page *, int nr_pages, struct vm_area_struct *); #define folio_remove_rmap_pte(folio, page, vma) \ @@ -384,7 +382,7 @@ static inline int page_try_dup_anon_rmap(struct page *page, bool compound, * * This is similar to page_try_dup_anon_rmap(), however, not used during fork() * to duplicate a mapping, but instead to prepare for KSM or temporarily - * unmapping a page (swap, migration) via page_remove_rmap(). + * unmapping a page (swap, migration) via folio_remove_rmap_*(). * * Marking the page shared can only fail if the page may be pinned; device * private pages cannot get pinned and consequently this function cannot fail. diff --git a/mm/filemap.c b/mm/filemap.c index 86fbe650d8833..b3ef7f248813b 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -113,11 +113,11 @@ * ->i_pages lock (try_to_unmap_one) * ->lruvec->lru_lock (follow_page->mark_page_accessed) * ->lruvec->lru_lock (check_pte_range->isolate_lru_page) - * ->private_lock (page_remove_rmap->set_page_dirty) - * ->i_pages lock (page_remove_rmap->set_page_dirty) - * bdi.wb->list_lock (page_remove_rmap->set_page_dirty) - * ->inode->i_lock (page_remove_rmap->set_page_dirty) - * ->memcg->move_lock (page_remove_rmap->folio_memcg_lock) + * ->private_lock (folio_remove_rmap_pte->set_page_dirty) + * ->i_pages lock (folio_remove_rmap_pte->set_page_dirty) + * bdi.wb->list_lock (folio_remove_rmap_pte->set_page_dirty) + * ->inode->i_lock (folio_remove_rmap_pte->set_page_dirty) + * ->memcg->move_lock (folio_remove_rmap_pte->folio_memcg_lock) * bdi.wb->list_lock (zap_pte_range->set_page_dirty) * ->inode->i_lock (zap_pte_range->set_page_dirty) * ->private_lock (zap_pte_range->block_dirty_folio) diff --git a/mm/internal.h b/mm/internal.h index d3df1271fe0ed..a819129ae29cd 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -711,7 +711,7 @@ folio_within_vma(struct folio *folio, struct vm_area_struct *vma) * under page table lock for the pte/pmd being added or removed. * * mlock is usually called at the end of page_add_*_rmap(), munlock at - * the end of page_remove_rmap(); but new anon folios are managed by + * the end of folio_remove_rmap_*(); but new anon folios are managed by * folio_add_lru_vma() calling mlock_new_folio(). */ void mlock_folio(struct folio *folio); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index a96840c415816..d0178e883308d 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2333,8 +2333,8 @@ int memory_failure(unsigned long pfn, int flags) * We use page flags to determine what action should be taken, but * the flags can be modified by the error containment action. One * example is an mlocked page, where PG_mlocked is cleared by - * page_remove_rmap() in try_to_unmap_one(). So to determine page status - * correctly, we save a copy of the page flags at this time. + * folio_remove_rmap_*() in try_to_unmap_one(). So to determine page + * status correctly, we save a copy of the page flags at this time. */ page_flags = p->flags; diff --git a/mm/rmap.c b/mm/rmap.c index 53c3cb349d495..1c7ee56816549 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -470,7 +470,7 @@ void __init anon_vma_init(void) /* * Getting a lock on a stable anon_vma from a page off the LRU is tricky! * - * Since there is no serialization what so ever against page_remove_rmap() + * Since there is no serialization what so ever against folio_remove_rmap_*() * the best this function can do is return a refcount increased anon_vma * that might have been relevant to this page. * @@ -487,7 +487,7 @@ void __init anon_vma_init(void) * [ something equivalent to page_mapped_in_vma() ]. * * Since anon_vma's slab is SLAB_TYPESAFE_BY_RCU and we know from - * page_remove_rmap() that the anon_vma pointer from page->mapping is valid + * folio_remove_rmap_*() that the anon_vma pointer from page->mapping is valid * if there is a mapcount, we can dereference the anon_vma after observing * those. */ @@ -1468,25 +1468,6 @@ void folio_add_file_rmap_pmd(struct folio *folio, struct page *page, #endif } -/** - * page_remove_rmap - take down pte mapping from a page - * @page: page to remove mapping from - * @vma: the vm area from which the mapping is removed - * @compound: uncharge the page as compound or small page - * - * The caller needs to hold the pte lock. - */ -void page_remove_rmap(struct page *page, struct vm_area_struct *vma, - bool compound) -{ - struct folio *folio = page_folio(page); - - if (likely(!compound)) - folio_remove_rmap_pte(folio, page, vma); - else - folio_remove_rmap_pmd(folio, page, vma); -} - static __always_inline void __folio_remove_rmap(struct folio *folio, struct page *page, int nr_pages, struct vm_area_struct *vma, enum rmap_level level) From 3c0e2cde65685ab331d258e6a1eafce9c43f6f6f Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:57 +0100 Subject: [PATCH 186/548] mm/rmap: convert page_dup_file_rmap() to folio_dup_file_rmap_[pte|ptes|pmd]() Let's convert page_dup_file_rmap() like the other rmap functions. As there is only a single caller, convert that single caller right away and remove page_dup_file_rmap(). Add folio_dup_file_rmap_ptes() right away, we want to perform rmap baching during fork() soon. Link: https://lkml.kernel.org/r/20231220224504.646757-34-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit d8ef5e311d7bfde54b60ab45026f206eff31b2d2) Signed-off-by: Koba Ko --- include/linux/rmap.h | 59 ++++++++++++++++++++++++++++++++++++++++---- mm/memory.c | 2 +- 2 files changed, 55 insertions(+), 6 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 10f59b81beed1..27d7d371a53ab 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -303,6 +303,60 @@ static inline void hugetlb_remove_rmap(struct folio *folio) atomic_dec(&folio->_entire_mapcount); } +static __always_inline void __folio_dup_file_rmap(struct folio *folio, + struct page *page, int nr_pages, enum rmap_level level) +{ + __folio_rmap_sanity_checks(folio, page, nr_pages, level); + + switch (level) { + case RMAP_LEVEL_PTE: + do { + atomic_inc(&page->_mapcount); + } while (page++, --nr_pages > 0); + break; + case RMAP_LEVEL_PMD: + atomic_inc(&folio->_entire_mapcount); + break; + } +} + +/** + * folio_dup_file_rmap_ptes - duplicate PTE mappings of a page range of a folio + * @folio: The folio to duplicate the mappings of + * @page: The first page to duplicate the mappings of + * @nr_pages: The number of pages of which the mapping will be duplicated + * + * The page range of the folio is defined by [page, page + nr_pages) + * + * The caller needs to hold the page table lock. + */ +static inline void folio_dup_file_rmap_ptes(struct folio *folio, + struct page *page, int nr_pages) +{ + __folio_dup_file_rmap(folio, page, nr_pages, RMAP_LEVEL_PTE); +} +#define folio_dup_file_rmap_pte(folio, page) \ + folio_dup_file_rmap_ptes(folio, page, 1) + +/** + * folio_dup_file_rmap_pmd - duplicate a PMD mapping of a page range of a folio + * @folio: The folio to duplicate the mapping of + * @page: The first page to duplicate the mapping of + * + * The page range of the folio is defined by [page, page + HPAGE_PMD_NR) + * + * The caller needs to hold the page table lock. + */ +static inline void folio_dup_file_rmap_pmd(struct folio *folio, + struct page *page) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + __folio_dup_file_rmap(folio, page, HPAGE_PMD_NR, RMAP_LEVEL_PTE); +#else + WARN_ON_ONCE(true); +#endif +} + static inline void __page_dup_rmap(struct page *page, bool compound) { VM_WARN_ON(folio_test_hugetlb(page_folio(page))); @@ -317,11 +371,6 @@ static inline void __page_dup_rmap(struct page *page, bool compound) } } -static inline void page_dup_file_rmap(struct page *page, bool compound) -{ - __page_dup_rmap(page, compound); -} - /** * page_try_dup_anon_rmap - try duplicating a mapping of an already mapped * anonymous page diff --git a/mm/memory.c b/mm/memory.c index 89fc28e48c001..f54983cc8137a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -952,7 +952,7 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, rss[MM_ANONPAGES]++; } else if (page) { folio_get(folio); - page_dup_file_rmap(page, false); + folio_dup_file_rmap_pte(folio, page); rss[mm_counter_file(page)]++; } From ea296e6297a9cc4a06a048580a75aee19edda9da Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:58 +0100 Subject: [PATCH 187/548] mm/rmap: introduce folio_try_dup_anon_rmap_[pte|ptes|pmd]() The last user of page_needs_cow_for_dma() and __page_dup_rmap() are gone, remove them. Add folio_try_dup_anon_rmap_ptes() right away, we want to perform rmap baching during fork() soon. Link: https://lkml.kernel.org/r/20231220224504.646757-35-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 61d90309b7156d54c5d358cb5d8bf55b33d233d2) Signed-off-by: Koba Ko --- include/linux/mm.h | 6 -- include/linux/rmap.h | 150 ++++++++++++++++++++++++++++++------------- 2 files changed, 106 insertions(+), 50 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index dc27352b84554..9099a30dccb89 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1966,12 +1966,6 @@ static inline bool folio_needs_cow_for_dma(struct vm_area_struct *vma, return folio_maybe_dma_pinned(folio); } -static inline bool page_needs_cow_for_dma(struct vm_area_struct *vma, - struct page *page) -{ - return folio_needs_cow_for_dma(vma, page_folio(page)); -} - /** * is_zero_page - Query if a page is a zero page * @page: The page to query diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 27d7d371a53ab..1d3e42eaccd4a 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -357,68 +357,130 @@ static inline void folio_dup_file_rmap_pmd(struct folio *folio, #endif } -static inline void __page_dup_rmap(struct page *page, bool compound) +static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio, + struct page *page, int nr_pages, struct vm_area_struct *src_vma, + enum rmap_level level) { - VM_WARN_ON(folio_test_hugetlb(page_folio(page))); + bool maybe_pinned; + int i; + + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); + __folio_rmap_sanity_checks(folio, page, nr_pages, level); - if (compound) { - struct folio *folio = (struct folio *)page; + /* + * If this folio may have been pinned by the parent process, + * don't allow to duplicate the mappings but instead require to e.g., + * copy the subpage immediately for the child so that we'll always + * guarantee the pinned folio won't be randomly replaced in the + * future on write faults. + */ + maybe_pinned = likely(!folio_is_device_private(folio)) && + unlikely(folio_needs_cow_for_dma(src_vma, folio)); - VM_BUG_ON_PAGE(compound && !PageHead(page), page); + /* + * No need to check+clear for already shared PTEs/PMDs of the + * folio. But if any page is PageAnonExclusive, we must fallback to + * copying if the folio maybe pinned. + */ + switch (level) { + case RMAP_LEVEL_PTE: + if (unlikely(maybe_pinned)) { + for (i = 0; i < nr_pages; i++) + if (PageAnonExclusive(page + i)) + return -EBUSY; + } + do { + if (PageAnonExclusive(page)) + ClearPageAnonExclusive(page); + atomic_inc(&page->_mapcount); + } while (page++, --nr_pages > 0); + break; + case RMAP_LEVEL_PMD: + if (PageAnonExclusive(page)) { + if (unlikely(maybe_pinned)) + return -EBUSY; + ClearPageAnonExclusive(page); + } atomic_inc(&folio->_entire_mapcount); - } else { - atomic_inc(&page->_mapcount); + break; } + return 0; } /** - * page_try_dup_anon_rmap - try duplicating a mapping of an already mapped - * anonymous page - * @page: the page to duplicate the mapping for - * @compound: the page is mapped as compound or as a small page - * @vma: the source vma + * folio_try_dup_anon_rmap_ptes - try duplicating PTE mappings of a page range + * of a folio + * @folio: The folio to duplicate the mappings of + * @page: The first page to duplicate the mappings of + * @nr_pages: The number of pages of which the mapping will be duplicated + * @src_vma: The vm area from which the mappings are duplicated * - * The caller needs to hold the PT lock and the vma->vma_mm->write_protect_seq. + * The page range of the folio is defined by [page, page + nr_pages) * - * Duplicating the mapping can only fail if the page may be pinned; device - * private pages cannot get pinned and consequently this function cannot fail. + * The caller needs to hold the page table lock and the + * vma->vma_mm->write_protect_seq. + * + * Duplicating the mappings can only fail if the folio may be pinned; device + * private folios cannot get pinned and consequently this function cannot fail + * for them. + * + * If duplicating the mappings succeeded, the duplicated PTEs have to be R/O in + * the parent and the child. They must *not* be writable after this call + * succeeded. + * + * Returns 0 if duplicating the mappings succeeded. Returns -EBUSY otherwise. + */ +static inline int folio_try_dup_anon_rmap_ptes(struct folio *folio, + struct page *page, int nr_pages, struct vm_area_struct *src_vma) +{ + return __folio_try_dup_anon_rmap(folio, page, nr_pages, src_vma, + RMAP_LEVEL_PTE); +} +#define folio_try_dup_anon_rmap_pte(folio, page, vma) \ + folio_try_dup_anon_rmap_ptes(folio, page, 1, vma) + +/** + * folio_try_dup_anon_rmap_pmd - try duplicating a PMD mapping of a page range + * of a folio + * @folio: The folio to duplicate the mapping of + * @page: The first page to duplicate the mapping of + * @src_vma: The vm area from which the mapping is duplicated + * + * The page range of the folio is defined by [page, page + HPAGE_PMD_NR) * - * If duplicating the mapping succeeds, the page has to be mapped R/O into - * the parent and the child. It must *not* get mapped writable after this call. + * The caller needs to hold the page table lock and the + * vma->vma_mm->write_protect_seq. + * + * Duplicating the mapping can only fail if the folio may be pinned; device + * private folios cannot get pinned and consequently this function cannot fail + * for them. + * + * If duplicating the mapping succeeds, the duplicated PMD has to be R/O in + * the parent and the child. They must *not* be writable after this call + * succeeded. * * Returns 0 if duplicating the mapping succeeded. Returns -EBUSY otherwise. */ +static inline int folio_try_dup_anon_rmap_pmd(struct folio *folio, + struct page *page, struct vm_area_struct *src_vma) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + return __folio_try_dup_anon_rmap(folio, page, HPAGE_PMD_NR, src_vma, + RMAP_LEVEL_PMD); +#else + WARN_ON_ONCE(true); + return -EBUSY; +#endif +} + static inline int page_try_dup_anon_rmap(struct page *page, bool compound, struct vm_area_struct *vma) { - VM_BUG_ON_PAGE(!PageAnon(page), page); - - /* - * No need to check+clear for already shared pages, including KSM - * pages. - */ - if (!PageAnonExclusive(page)) - goto dup; - - /* - * If this page may have been pinned by the parent process, - * don't allow to duplicate the mapping but instead require to e.g., - * copy the page immediately for the child so that we'll always - * guarantee the pinned page won't be randomly replaced in the - * future on write faults. - */ - if (likely(!is_device_private_page(page)) && - unlikely(page_needs_cow_for_dma(vma, page))) - return -EBUSY; + struct folio *folio = page_folio(page); - ClearPageAnonExclusive(page); - /* - * It's okay to share the anon page between both processes, mapping - * the page R/O into both processes. - */ -dup: - __page_dup_rmap(page, compound); - return 0; + if (likely(!compound)) + return folio_try_dup_anon_rmap_pte(folio, page, vma); + return folio_try_dup_anon_rmap_pmd(folio, page, vma); } /** From b902e27b1cf948af778b237ca4ae467824b7c30a Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:44:59 +0100 Subject: [PATCH 188/548] mm/huge_memory: page_try_dup_anon_rmap() -> folio_try_dup_anon_rmap_pmd() Let's convert copy_huge_pmd() and fixup the comment in copy_huge_pud(). While at it, perform more folio conversion in copy_huge_pmd(). Link: https://lkml.kernel.org/r/20231220224504.646757-36-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 96c772c25c89f35091ce924117602d04de82a0fe) Signed-off-by: Koba Ko --- mm/huge_memory.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 259e0a6b4eb8c..5fb299e8174b2 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1241,6 +1241,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, { spinlock_t *dst_ptl, *src_ptl; struct page *src_page; + struct folio *src_folio; pmd_t pmd; pgtable_t pgtable = NULL; int ret = -ENOMEM; @@ -1307,11 +1308,12 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, src_page = pmd_page(pmd); VM_BUG_ON_PAGE(!PageHead(src_page), src_page); + src_folio = page_folio(src_page); - get_page(src_page); - if (unlikely(page_try_dup_anon_rmap(src_page, true, src_vma))) { + folio_get(src_folio); + if (unlikely(folio_try_dup_anon_rmap_pmd(src_folio, src_page, src_vma))) { /* Page maybe pinned: split and retry the fault on PTEs. */ - put_page(src_page); + folio_put(src_folio); pte_free(dst_mm, pgtable); spin_unlock(src_ptl); spin_unlock(dst_ptl); @@ -1420,8 +1422,8 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm, } /* - * TODO: once we support anonymous pages, use page_try_dup_anon_rmap() - * and split if duplicating fails. + * TODO: once we support anonymous pages, use + * folio_try_dup_anon_rmap_*() and split if duplicating fails. */ pudp_set_wrprotect(src_mm, addr, src_pud); pud = pud_mkold(pud_wrprotect(pud)); From 1b84f1703f5ac0e6852b022a9135aa8e9f57d371 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:45:00 +0100 Subject: [PATCH 189/548] mm/memory: page_try_dup_anon_rmap() -> folio_try_dup_anon_rmap_pte() Let's convert copy_nonpresent_pte(). While at it, perform some more folio conversion. Link: https://lkml.kernel.org/r/20231220224504.646757-37-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 08e7795e2444c3df9292f4ac7092be6168166a46) Signed-off-by: Koba Ko --- mm/memory.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index f54983cc8137a..14711cbf86e73 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -772,6 +772,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, unsigned long vm_flags = dst_vma->vm_flags; pte_t orig_pte = ptep_get(src_pte); pte_t pte = orig_pte; + struct folio *folio; struct page *page; swp_entry_t entry = pte_to_swp_entry(orig_pte); @@ -816,6 +817,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, } } else if (is_device_private_entry(entry)) { page = pfn_swap_entry_to_page(entry); + folio = page_folio(page); /* * Update rss count even for unaddressable pages, as @@ -826,10 +828,10 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, * for unaddressable pages, at some point. But for now * keep things as they are. */ - get_page(page); + folio_get(folio); rss[mm_counter(page)]++; /* Cannot fail as these pages cannot get pinned. */ - BUG_ON(page_try_dup_anon_rmap(page, false, src_vma)); + folio_try_dup_anon_rmap_pte(folio, page, src_vma); /* * We do not preserve soft-dirty information, because so @@ -943,7 +945,7 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, * future. */ folio_get(folio); - if (unlikely(page_try_dup_anon_rmap(page, false, src_vma))) { + if (unlikely(folio_try_dup_anon_rmap_pte(folio, page, src_vma))) { /* Page may be pinned, we have to copy. */ folio_put(folio); return copy_present_page(dst_vma, src_vma, dst_pte, src_pte, From 05605e95a26131fbf57090ef418489ec089a2f7f Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:45:01 +0100 Subject: [PATCH 190/548] mm/rmap: remove page_try_dup_anon_rmap() All users are gone, remove page_try_dup_anon_rmap() and any remaining traces. Link: https://lkml.kernel.org/r/20231220224504.646757-38-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit a13d096471ec0ac5c6fc90fbcd57e8430024046a) Signed-off-by: Koba Ko --- include/linux/rmap.h | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 1d3e42eaccd4a..5dae2e94e46d8 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -248,7 +248,7 @@ void hugetlb_add_anon_rmap(struct page *, struct vm_area_struct *, void hugetlb_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); -/* See page_try_dup_anon_rmap() */ +/* See folio_try_dup_anon_rmap_*() */ static inline int hugetlb_try_dup_anon_rmap(struct folio *folio, struct vm_area_struct *vma) { @@ -473,16 +473,6 @@ static inline int folio_try_dup_anon_rmap_pmd(struct folio *folio, #endif } -static inline int page_try_dup_anon_rmap(struct page *page, bool compound, - struct vm_area_struct *vma) -{ - struct folio *folio = page_folio(page); - - if (likely(!compound)) - return folio_try_dup_anon_rmap_pte(folio, page, vma); - return folio_try_dup_anon_rmap_pmd(folio, page, vma); -} - /** * page_try_share_anon_rmap - try marking an exclusive anonymous page possibly * shared to prepare for KSM or temporary unmapping @@ -491,8 +481,8 @@ static inline int page_try_dup_anon_rmap(struct page *page, bool compound, * The caller needs to hold the PT lock and has to have the page table entry * cleared/invalidated. * - * This is similar to page_try_dup_anon_rmap(), however, not used during fork() - * to duplicate a mapping, but instead to prepare for KSM or temporarily + * This is similar to folio_try_dup_anon_rmap_*(), however, not used during + * fork() to duplicate a mapping, but instead to prepare for KSM or temporarily * unmapping a page (swap, migration) via folio_remove_rmap_*(). * * Marking the page shared can only fail if the page may be pinned; device From 5d471aec10d5683f9e2a4a506c6b8b1666808112 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:45:02 +0100 Subject: [PATCH 191/548] mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]() Let's convert it like we converted all the other rmap functions. Don't introduce folio_try_share_anon_rmap_ptes() for now, as we don't have a user that wants rmap batching in sight. Pretty easy to add later. All users are easy to convert -- only ksm.c doesn't use folios yet but that is left for future work -- so let's just do it in a single shot. While at it, turn the BUG_ON into a WARN_ON_ONCE. Note that page_try_share_anon_rmap() so far didn't care about pte/pmd mappings (no compound parameter). We're changing that so we can perform better sanity checks and make the code actually more readable/consistent. For example, __folio_rmap_sanity_checks() will make sure that a PMD range actually falls completely into the folio. Link: https://lkml.kernel.org/r/20231220224504.646757-39-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit e3b4b1374f87c71e9309efc6149f113cdd17af72) Signed-off-by: Koba Ko --- include/linux/rmap.h | 96 ++++++++++++++++++++++++++++++++------------ mm/huge_memory.c | 9 +++-- mm/internal.h | 4 +- mm/ksm.c | 5 ++- mm/migrate_device.c | 2 +- mm/rmap.c | 11 ++--- 6 files changed, 88 insertions(+), 39 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 5dae2e94e46d8..393c2d90a7321 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -264,7 +264,7 @@ static inline int hugetlb_try_dup_anon_rmap(struct folio *folio, return 0; } -/* See page_try_share_anon_rmap() */ +/* See folio_try_share_anon_rmap_*() */ static inline int hugetlb_try_share_anon_rmap(struct folio *folio) { VM_WARN_ON_FOLIO(!folio_test_hugetlb(folio), folio); @@ -473,31 +473,15 @@ static inline int folio_try_dup_anon_rmap_pmd(struct folio *folio, #endif } -/** - * page_try_share_anon_rmap - try marking an exclusive anonymous page possibly - * shared to prepare for KSM or temporary unmapping - * @page: the exclusive anonymous page to try marking possibly shared - * - * The caller needs to hold the PT lock and has to have the page table entry - * cleared/invalidated. - * - * This is similar to folio_try_dup_anon_rmap_*(), however, not used during - * fork() to duplicate a mapping, but instead to prepare for KSM or temporarily - * unmapping a page (swap, migration) via folio_remove_rmap_*(). - * - * Marking the page shared can only fail if the page may be pinned; device - * private pages cannot get pinned and consequently this function cannot fail. - * - * Returns 0 if marking the page possibly shared succeeded. Returns -EBUSY - * otherwise. - */ -static inline int page_try_share_anon_rmap(struct page *page) +static __always_inline int __folio_try_share_anon_rmap(struct folio *folio, + struct page *page, int nr_pages, enum rmap_level level) { - VM_WARN_ON(folio_test_hugetlb(page_folio(page))); - VM_BUG_ON_PAGE(!PageAnon(page) || !PageAnonExclusive(page), page); + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); + VM_WARN_ON_FOLIO(!PageAnonExclusive(page), folio); + __folio_rmap_sanity_checks(folio, page, nr_pages, level); - /* device private pages cannot get pinned via GUP. */ - if (unlikely(is_device_private_page(page))) { + /* device private folios cannot get pinned via GUP. */ + if (unlikely(folio_is_device_private(folio))) { ClearPageAnonExclusive(page); return 0; } @@ -548,7 +532,7 @@ static inline int page_try_share_anon_rmap(struct page *page) if (IS_ENABLED(CONFIG_HAVE_FAST_GUP)) smp_mb(); - if (unlikely(page_maybe_dma_pinned(page))) + if (unlikely(folio_maybe_dma_pinned(folio))) return -EBUSY; ClearPageAnonExclusive(page); @@ -561,6 +545,68 @@ static inline int page_try_share_anon_rmap(struct page *page) return 0; } +/** + * folio_try_share_anon_rmap_pte - try marking an exclusive anonymous page + * mapped by a PTE possibly shared to prepare + * for KSM or temporary unmapping + * @folio: The folio to share a mapping of + * @page: The mapped exclusive page + * + * The caller needs to hold the page table lock and has to have the page table + * entries cleared/invalidated. + * + * This is similar to folio_try_dup_anon_rmap_pte(), however, not used during + * fork() to duplicate mappings, but instead to prepare for KSM or temporarily + * unmapping parts of a folio (swap, migration) via folio_remove_rmap_pte(). + * + * Marking the mapped page shared can only fail if the folio maybe pinned; + * device private folios cannot get pinned and consequently this function cannot + * fail. + * + * Returns 0 if marking the mapped page possibly shared succeeded. Returns + * -EBUSY otherwise. + */ +static inline int folio_try_share_anon_rmap_pte(struct folio *folio, + struct page *page) +{ + return __folio_try_share_anon_rmap(folio, page, 1, RMAP_LEVEL_PTE); +} + +/** + * folio_try_share_anon_rmap_pmd - try marking an exclusive anonymous page + * range mapped by a PMD possibly shared to + * prepare for temporary unmapping + * @folio: The folio to share the mapping of + * @page: The first page to share the mapping of + * + * The page range of the folio is defined by [page, page + HPAGE_PMD_NR) + * + * The caller needs to hold the page table lock and has to have the page table + * entries cleared/invalidated. + * + * This is similar to folio_try_dup_anon_rmap_pmd(), however, not used during + * fork() to duplicate a mapping, but instead to prepare for temporarily + * unmapping parts of a folio (swap, migration) via folio_remove_rmap_pmd(). + * + * Marking the mapped pages shared can only fail if the folio maybe pinned; + * device private folios cannot get pinned and consequently this function cannot + * fail. + * + * Returns 0 if marking the mapped pages possibly shared succeeded. Returns + * -EBUSY otherwise. + */ +static inline int folio_try_share_anon_rmap_pmd(struct folio *folio, + struct page *page) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + return __folio_try_share_anon_rmap(folio, page, HPAGE_PMD_NR, + RMAP_LEVEL_PMD); +#else + WARN_ON_ONCE(true); + return -EBUSY; +#endif +} + /* * Called from mm/vmscan.c to handle paging out */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 5fb299e8174b2..beef47aecc57f 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2369,10 +2369,11 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, * In case we cannot clear PageAnonExclusive(), split the PMD * only and let try_to_migrate_one() fail later. * - * See page_try_share_anon_rmap(): invalidate PMD first. + * See folio_try_share_anon_rmap_pmd(): invalidate PMD first. */ anon_exclusive = PageAnonExclusive(page); - if (freeze && anon_exclusive && page_try_share_anon_rmap(page)) + if (freeze && anon_exclusive && + folio_try_share_anon_rmap_pmd(folio, page)) freeze = false; if (!freeze) { rmap_t rmap_flags = RMAP_NONE; @@ -3422,9 +3423,9 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, flush_cache_range(vma, address, address + HPAGE_PMD_SIZE); pmdval = pmdp_invalidate(vma, address, pvmw->pmd); - /* See page_try_share_anon_rmap(): invalidate PMD first. */ + /* See folio_try_share_anon_rmap_pmd(): invalidate PMD first. */ anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page); - if (anon_exclusive && page_try_share_anon_rmap(page)) { + if (anon_exclusive && folio_try_share_anon_rmap_pmd(folio, page)) { set_pmd_at(mm, address, pvmw->pmd, pmdval); return -EBUSY; } diff --git a/mm/internal.h b/mm/internal.h index a819129ae29cd..d96ae6f308406 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1103,7 +1103,7 @@ enum { * * Ordinary GUP: Using the PT lock * * GUP-fast and fork(): mm->write_protect_seq * * GUP-fast and KSM or temporary unmapping (swap, migration): see - * page_try_share_anon_rmap() + * folio_try_share_anon_rmap_*() * * Must be called with the (sub)page that's actually referenced via the * page table entry, which might not necessarily be the head page for a @@ -1146,7 +1146,7 @@ static inline bool gup_must_unshare(struct vm_area_struct *vma, return is_cow_mapping(vma->vm_flags); } - /* Paired with a memory barrier in page_try_share_anon_rmap(). */ + /* Paired with a memory barrier in folio_try_share_anon_rmap_*(). */ if (IS_ENABLED(CONFIG_HAVE_FAST_GUP)) smp_rmb(); diff --git a/mm/ksm.c b/mm/ksm.c index 558eb2153f107..9aafdc73efa26 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1148,8 +1148,9 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page, goto out_unlock; } - /* See page_try_share_anon_rmap(): clear PTE first. */ - if (anon_exclusive && page_try_share_anon_rmap(page)) { + /* See folio_try_share_anon_rmap_pte(): clear PTE first. */ + if (anon_exclusive && + folio_try_share_anon_rmap_pte(page_folio(page), page)) { set_pte_at(mm, pvmw.address, pvmw.pte, entry); goto out_unlock; } diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 39b7754480c67..b6c27c76e1a0b 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -202,7 +202,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, if (anon_exclusive) { pte = ptep_clear_flush(vma, addr, ptep); - if (page_try_share_anon_rmap(page)) { + if (folio_try_share_anon_rmap_pte(folio, page)) { set_pte_at(mm, addr, ptep, pte); folio_unlock(folio); folio_put(folio); diff --git a/mm/rmap.c b/mm/rmap.c index 1c7ee56816549..cf1896f4c96c3 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1836,9 +1836,9 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, break; } - /* See page_try_share_anon_rmap(): clear PTE first. */ + /* See folio_try_share_anon_rmap(): clear PTE first. */ if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { + folio_try_share_anon_rmap_pte(folio, subpage)) { swap_free(entry); set_pte_at(mm, address, pvmw.pte, pteval); ret = false; @@ -2112,7 +2112,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, pte_t swp_pte; if (anon_exclusive) - BUG_ON(page_try_share_anon_rmap(subpage)); + WARN_ON_ONCE(folio_try_share_anon_rmap_pte(folio, + subpage)); /* * Store the pfn of the page in a special migration @@ -2183,7 +2184,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, VM_BUG_ON_PAGE(pte_write(pteval) && folio_test_anon(folio) && !anon_exclusive, subpage); - /* See page_try_share_anon_rmap(): clear PTE first. */ + /* See folio_try_share_anon_rmap_pte(): clear PTE first. */ if (folio_test_hugetlb(folio)) { if (anon_exclusive && hugetlb_try_share_anon_rmap(folio)) { @@ -2194,7 +2195,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, break; } } else if (anon_exclusive && - page_try_share_anon_rmap(subpage)) { + folio_try_share_anon_rmap_pte(folio, subpage)) { set_pte_at(mm, address, pvmw.pte, pteval); ret = false; page_vma_mapped_walk_done(&pvmw); From 744f33eb60cc24421bfbb3c5f7df560c33d85ca3 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:45:03 +0100 Subject: [PATCH 192/548] mm/rmap: rename COMPOUND_MAPPED to ENTIRELY_MAPPED We removed all "bool compound" and RMAP_COMPOUND parameters. Let's remove the remaining "compound" terminology by making COMPOUND_MAPPED match the "folio->_entire_mapcount" terminology, renaming it to ENTIRELY_MAPPED. ENTIRELY_MAPPED is only used when the whole folio is mapped using a single page table entry (e.g., a single PMD mapping a PMD-sized THP). For now, we don't support mapping any THP bigger than that, so ENTIRELY_MAPPED only applies to PMD-mapped PMD-sized THP only. Link: https://lkml.kernel.org/r/20231220224504.646757-40-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit e78a13fd16bb9d9712f61be2bd6612a092ce66ea) Signed-off-by: Koba Ko --- Documentation/mm/transhuge.rst | 2 +- mm/internal.h | 6 +++--- mm/rmap.c | 18 +++++++++--------- 3 files changed, 13 insertions(+), 13 deletions(-) diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst index cf81272a6b8b6..93c9239b9ebe2 100644 --- a/Documentation/mm/transhuge.rst +++ b/Documentation/mm/transhuge.rst @@ -117,7 +117,7 @@ pages: - map/unmap of a PMD entry for the whole THP increment/decrement folio->_entire_mapcount and also increment/decrement - folio->_nr_pages_mapped by COMPOUND_MAPPED when _entire_mapcount + folio->_nr_pages_mapped by ENTIRELY_MAPPED when _entire_mapcount goes from -1 to 0 or 0 to -1. - map/unmap of individual pages with PTE entry increment/decrement diff --git a/mm/internal.h b/mm/internal.h index d96ae6f308406..149cdf9973e95 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -54,12 +54,12 @@ void page_writeback_init(void); /* * If a 16GB hugetlb folio were mapped by PTEs of all of its 4kB pages, - * its nr_pages_mapped would be 0x400000: choose the COMPOUND_MAPPED bit + * its nr_pages_mapped would be 0x400000: choose the ENTIRELY_MAPPED bit * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE). Hugetlb currently * leaves nr_pages_mapped at 0, but avoid surprise if it participates later. */ -#define COMPOUND_MAPPED 0x800000 -#define FOLIO_PAGES_MAPPED (COMPOUND_MAPPED - 1) +#define ENTIRELY_MAPPED 0x800000 +#define FOLIO_PAGES_MAPPED (ENTIRELY_MAPPED - 1) /* * Flags passed to __show_mem() and show_free_areas() to suppress output in diff --git a/mm/rmap.c b/mm/rmap.c index cf1896f4c96c3..5e556fc4f5100 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1142,7 +1142,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio, first = atomic_inc_and_test(&page->_mapcount); if (first && folio_test_large(folio)) { first = atomic_inc_return_relaxed(mapped); - first = (first < COMPOUND_MAPPED); + first = (first < ENTIRELY_MAPPED); } if (first) @@ -1152,15 +1152,15 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio, case RMAP_LEVEL_PMD: first = atomic_inc_and_test(&folio->_entire_mapcount); if (first) { - nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); - if (likely(nr < COMPOUND_MAPPED + COMPOUND_MAPPED)) { + nr = atomic_add_return_relaxed(ENTIRELY_MAPPED, mapped); + if (likely(nr < ENTIRELY_MAPPED + ENTIRELY_MAPPED)) { *nr_pmdmapped = folio_nr_pages(folio); nr = *nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED); /* Raced ahead of a remove and another add? */ if (unlikely(nr < 0)) nr = 0; } else { - /* Raced ahead of a remove of COMPOUND_MAPPED */ + /* Raced ahead of a remove of ENTIRELY_MAPPED */ nr = 0; } } @@ -1403,7 +1403,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, } else { /* increment count (starts at -1) */ atomic_set(&folio->_entire_mapcount, 0); - atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED); + atomic_set(&folio->_nr_pages_mapped, ENTIRELY_MAPPED); SetPageAnonExclusive(&folio->page); __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr); } @@ -1484,7 +1484,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio, last = atomic_add_negative(-1, &page->_mapcount); if (last && folio_test_large(folio)) { last = atomic_dec_return_relaxed(mapped); - last = (last < COMPOUND_MAPPED); + last = (last < ENTIRELY_MAPPED); } if (last) @@ -1494,15 +1494,15 @@ static __always_inline void __folio_remove_rmap(struct folio *folio, case RMAP_LEVEL_PMD: last = atomic_add_negative(-1, &folio->_entire_mapcount); if (last) { - nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped); - if (likely(nr < COMPOUND_MAPPED)) { + nr = atomic_sub_return_relaxed(ENTIRELY_MAPPED, mapped); + if (likely(nr < ENTIRELY_MAPPED)) { nr_pmdmapped = folio_nr_pages(folio); nr = nr_pmdmapped - (nr & FOLIO_PAGES_MAPPED); /* Raced ahead of another remove and an add? */ if (unlikely(nr < 0)) nr = 0; } else { - /* An add of COMPOUND_MAPPED raced ahead */ + /* An add of ENTIRELY_MAPPED raced ahead */ nr = 0; } } From 351323ea635f72ba4bee046ffa00038a9d1078a4 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Wed, 20 Dec 2023 23:45:04 +0100 Subject: [PATCH 193/548] mm: remove one last reference to page_add_*_rmap() Let's fixup one remaining comment. Note that the only trace remaining of the old rmap interface is in an example in Documentation/trace/ftrace.rst, that we'll just leave alone. Link: https://lkml.kernel.org/r/20231220224504.646757-41-david@redhat.com Signed-off-by: David Hildenbrand Cc: Hugh Dickins Cc: Matthew Wilcox (Oracle) Cc: Muchun Song Cc: Muchun Song Cc: Peter Xu Cc: Ryan Roberts Cc: Yin Fengwei Signed-off-by: Andrew Morton (cherry picked from commit 4a8ffab02db55c8a70063c57519cadf72d480ed4) Signed-off-by: Koba Ko --- mm/internal.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/internal.h b/mm/internal.h index 149cdf9973e95..d971c8e67738b 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -710,7 +710,7 @@ folio_within_vma(struct folio *folio, struct vm_area_struct *vma) * should be called with vma's mmap_lock held for read or write, * under page table lock for the pte/pmd being added or removed. * - * mlock is usually called at the end of page_add_*_rmap(), munlock at + * mlock is usually called at the end of folio_add_*_rmap_*(), munlock at * the end of folio_remove_rmap_*(); but new anon folios are managed by * folio_add_lru_vma() calling mlock_new_folio(). */ From aaa12c270efb3d6c0938628e485bbe8ba359ba82 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Thu, 11 Jan 2024 15:24:20 +0000 Subject: [PATCH 194/548] mm: add pfn_swap_entry_folio() Patch series "mm: convert mm counter to take a folio", v3. Make sure all mm_counter() and mm_counter_file() callers have a folio, then convert mm counter functions to take a folio, which saves some compound_head() calls. This patch (of 10): Thanks to the compound_head() hidden inside PageLocked(), this saves a call to compound_head() over calling page_folio(pfn_swap_entry_to_page()) Link: https://lkml.kernel.org/r/20240111152429.3374566-1-willy@infradead.org Link: https://lkml.kernel.org/r/20240111152429.3374566-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Cc: Kefeng Wang Signed-off-by: Andrew Morton (cherry picked from commit 5662400a9ac03f38ef3b84e4ff9a640a4604bef9) Signed-off-by: Koba Ko --- include/linux/swapops.h | 13 +++++++++++++ mm/filemap.c | 2 +- mm/huge_memory.c | 2 +- 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 925c84653af5e..a5c560a2f8c25 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -497,6 +497,19 @@ static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry) return p; } +static inline struct folio *pfn_swap_entry_folio(swp_entry_t entry) +{ + struct folio *folio = pfn_folio(swp_offset_pfn(entry)); + + /* + * Any use of migration entries may only occur while the + * corresponding folio is locked + */ + BUG_ON(is_migration_entry(entry) && !folio_test_locked(folio)); + + return folio; +} + /* * A pfn swap entry is a special type of swap entry that always has a pfn stored * in the swap offset. They can either be used to represent unaddressable device diff --git a/mm/filemap.c b/mm/filemap.c index b3ef7f248813b..5cf965a47584a 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1409,7 +1409,7 @@ void migration_entry_wait_on_locked(swp_entry_t entry, spinlock_t *ptl) unsigned long pflags; bool in_thrashing; wait_queue_head_t *q; - struct folio *folio = page_folio(pfn_swap_entry_to_page(entry)); + struct folio *folio = pfn_swap_entry_folio(entry); q = folio_waitqueue(folio); if (!folio_test_uptodate(folio) && folio_test_workingset(folio)) { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index beef47aecc57f..4cc1ba1166e48 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2000,7 +2000,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION if (is_swap_pmd(*pmd)) { swp_entry_t entry = pmd_to_swp_entry(*pmd); - struct folio *folio = page_folio(pfn_swap_entry_to_page(entry)); + struct folio *folio = pfn_swap_entry_folio(entry); pmd_t newpmd; VM_BUG_ON(!is_pmd_migration_entry(*pmd)); From 138617b65396dae04e2316b7f49b4ac16072b705 Mon Sep 17 00:00:00 2001 From: "Matthew Wilcox (Oracle)" Date: Thu, 11 Jan 2024 15:24:22 +0000 Subject: [PATCH 195/548] mprotect: use pfn_swap_entry_folio We only want to know whether the folio is anonymous, so use pfn_swap_entry_folio() and save a call to compound_head(). Link: https://lkml.kernel.org/r/20240111152429.3374566-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Cc: Kefeng Wang Signed-off-by: Andrew Morton (cherry picked from commit f2d571b0b207087442d1c3fca5189ee1cb34648e) Signed-off-by: Koba Ko --- mm/mprotect.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index 7e870a8c9402a..7404d3ba25b70 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -196,13 +196,13 @@ static long change_pte_range(struct mmu_gather *tlb, pte_t newpte; if (is_writable_migration_entry(entry)) { - struct page *page = pfn_swap_entry_to_page(entry); + struct folio *folio = pfn_swap_entry_folio(entry); /* * A protection check is difficult so * just be safe and disable write */ - if (PageAnon(page)) + if (folio_test_anon(folio)) entry = make_readable_exclusive_migration_entry( swp_offset(entry)); else From f1d811da5c0c7b0e1cbfca43ffd05bf49e18b42f Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Thu, 11 Jan 2024 15:24:24 +0000 Subject: [PATCH 196/548] mm: use pfn_swap_entry_folio() in __split_huge_pmd_locked() Call pfn_swap_entry_folio() in __split_huge_pmd_locked() as preparation for converting mm counter functions to take a folio. Link: https://lkml.kernel.org/r/20240111152429.3374566-6-willy@infradead.org Signed-off-by: Kefeng Wang Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 439992ff4637ad5042ca8ee1f659fae24890de3e) Signed-off-by: Koba Ko --- mm/huge_memory.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 4cc1ba1166e48..a9f7d41f1fd14 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2275,7 +2275,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, swp_entry_t entry; entry = pmd_to_swp_entry(old_pmd); - page = pfn_swap_entry_to_page(entry); + folio = pfn_swap_entry_folio(entry); } else { page = pmd_page(old_pmd); folio = page_folio(page); @@ -2286,7 +2286,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, folio_remove_rmap_pmd(folio, page, vma); folio_put(folio); } - add_mm_counter(mm, mm_counter_file(page), -HPAGE_PMD_NR); + add_mm_counter(mm, mm_counter_file(&folio->page), -HPAGE_PMD_NR); return; } From c4652212d15fb3353683265925d6a2fdc60cdeb1 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Thu, 11 Jan 2024 15:24:25 +0000 Subject: [PATCH 197/548] mm: use pfn_swap_entry_to_folio() in zap_huge_pmd() Call pfn_swap_entry_to_folio() in zap_huge_pmd() as preparation for converting mm counter functions to take a folio. Saves a call to compound_head() embedded inside PageAnon(). Link: https://lkml.kernel.org/r/20240111152429.3374566-7-willy@infradead.org Signed-off-by: Kefeng Wang Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 0103b27a6b826729dc1500d013e53ebed48980b3) Signed-off-by: Koba Ko --- mm/huge_memory.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a9f7d41f1fd14..e7be0f3c421c5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1860,12 +1860,14 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, zap_deposited_table(tlb->mm, pmd); spin_unlock(ptl); } else { - struct page *page = NULL; + struct folio *folio = NULL; int flush_needed = 1; if (pmd_present(orig_pmd)) { - page = pmd_page(orig_pmd); - folio_remove_rmap_pmd(page_folio(page), page, vma); + struct page *page = pmd_page(orig_pmd); + + folio = page_folio(page); + folio_remove_rmap_pmd(folio, page, vma); VM_BUG_ON_PAGE(page_mapcount(page) < 0, page); VM_BUG_ON_PAGE(!PageHead(page), page); } else if (thp_migration_supported()) { @@ -1873,23 +1875,24 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, VM_BUG_ON(!is_pmd_migration_entry(orig_pmd)); entry = pmd_to_swp_entry(orig_pmd); - page = pfn_swap_entry_to_page(entry); + folio = pfn_swap_entry_folio(entry); flush_needed = 0; } else WARN_ONCE(1, "Non present huge pmd without pmd migration enabled!"); - if (PageAnon(page)) { + if (folio_test_anon(folio)) { zap_deposited_table(tlb->mm, pmd); add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR); } else { if (arch_needs_pgtable_deposit()) zap_deposited_table(tlb->mm, pmd); - add_mm_counter(tlb->mm, mm_counter_file(page), -HPAGE_PMD_NR); + add_mm_counter(tlb->mm, mm_counter_file(&folio->page), + -HPAGE_PMD_NR); } spin_unlock(ptl); if (flush_needed) - tlb_remove_page_size(tlb, page, HPAGE_PMD_SIZE); + tlb_remove_page_size(tlb, &folio->page, HPAGE_PMD_SIZE); } return 1; } From d0444c760e908e863f2fd0382445bf6c629ed8df Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Thu, 11 Jan 2024 15:24:26 +0000 Subject: [PATCH 198/548] mm: use pfn_swap_entry_folio() in copy_nonpresent_pte() Call pfn_swap_entry_folio() as preparation for converting mm counter functions to take a folio. Link: https://lkml.kernel.org/r/20240111152429.3374566-8-willy@infradead.org Signed-off-by: Kefeng Wang Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 530c2a0da0b440bec4af3dae5bd7110f77962e9b) Signed-off-by: Koba Ko --- mm/memory.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 14711cbf86e73..eff458673e920 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -795,9 +795,9 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, } rss[MM_SWAPENTS]++; } else if (is_migration_entry(entry)) { - page = pfn_swap_entry_to_page(entry); + folio = pfn_swap_entry_folio(entry); - rss[mm_counter(page)]++; + rss[mm_counter(&folio->page)]++; if (!is_readable_migration_entry(entry) && is_cow_mapping(vm_flags)) { From 756ab0c4a0e27d8aa7b7cb1b57215346dea42371 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Thu, 11 Jan 2024 15:24:27 +0000 Subject: [PATCH 199/548] mm: convert to should_zap_page() to should_zap_folio() Make should_zap_page() take a folio and rename it to should_zap_folio() as preparation for converting mm counter functions to take a folio. Saves a call to compound_head() hidden inside PageAnon(). [wangkefeng.wang@huawei.com: fix used-uninitialized warning] Link: https://lkml.kernel.org/r/962a7993-fce9-4de8-85cd-25e290f25736@huawei.com Link: https://lkml.kernel.org/r/20240111152429.3374566-9-willy@infradead.org Signed-off-by: Kefeng Wang Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit eabafaaa957553142cdafc8ae804fb679e5a5f5e) Signed-off-by: Koba Ko --- mm/memory.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index eff458673e920..844844dd77383 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1355,19 +1355,20 @@ static inline bool should_zap_cows(struct zap_details *details) return details->even_cows; } -/* Decides whether we should zap this page with the page pointer specified */ -static inline bool should_zap_page(struct zap_details *details, struct page *page) +/* Decides whether we should zap this folio with the folio pointer specified */ +static inline bool should_zap_folio(struct zap_details *details, + struct folio *folio) { - /* If we can make a decision without *page.. */ + /* If we can make a decision without *folio.. */ if (should_zap_cows(details)) return true; - /* E.g. the caller passes NULL for the case of a zero page */ - if (!page) + /* E.g. the caller passes NULL for the case of a zero folio */ + if (!folio) return true; - /* Otherwise we should only zap non-anon pages */ - return !PageAnon(page); + /* Otherwise we should only zap non-anon folios */ + return !folio_test_anon(folio); } static inline bool zap_drop_file_uffd_wp(struct zap_details *details) @@ -1420,7 +1421,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, arch_enter_lazy_mmu_mode(); do { pte_t ptent = ptep_get(pte); - struct folio *folio; + struct folio *folio = NULL; struct page *page; if (pte_none(ptent)) @@ -1433,7 +1434,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, unsigned int delay_rmap; page = vm_normal_page(vma, addr, ptent); - if (unlikely(!should_zap_page(details, page))) + if (page) + folio = page_folio(page); + + if (unlikely(!should_zap_folio(details, folio))) continue; ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); @@ -1446,7 +1450,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, continue; } - folio = page_folio(page); delay_rmap = 0; if (!folio_test_anon(folio)) { if (pte_dirty(ptent)) { @@ -1478,7 +1481,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, is_device_exclusive_entry(entry)) { page = pfn_swap_entry_to_page(entry); folio = page_folio(page); - if (unlikely(!should_zap_page(details, page))) + if (unlikely(!should_zap_folio(details, folio))) continue; /* * Both device private/exclusive mappings should only @@ -1499,10 +1502,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (unlikely(!free_swap_and_cache(entry))) print_bad_pte(vma, addr, ptent, NULL); } else if (is_migration_entry(entry)) { - page = pfn_swap_entry_to_page(entry); - if (!should_zap_page(details, page)) + folio = pfn_swap_entry_folio(entry); + if (!should_zap_folio(details, folio)) continue; - rss[mm_counter(page)]--; + rss[mm_counter(&folio->page)]--; } else if (pte_marker_entry_uffd_wp(entry)) { /* * For anon: always drop the marker; for file: only From 2219cf2f0d3269b1bc439b274b616c7361469228 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Thu, 11 Jan 2024 15:24:28 +0000 Subject: [PATCH 200/548] mm: convert mm_counter() to take a folio Now all callers of mm_counter() have a folio, convert mm_counter() to take a folio. Saves a call to compound_head() hidden inside PageAnon(). Link: https://lkml.kernel.org/r/20240111152429.3374566-10-willy@infradead.org Signed-off-by: Kefeng Wang Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit a23f517b0e1554467b0eb3bc1ebcb4d626217302) Signed-off-by: Koba Ko --- arch/s390/mm/pgtable.c | 2 +- include/linux/mm.h | 6 +++--- mm/memory.c | 10 +++++----- mm/rmap.c | 8 ++++---- mm/userfaultfd.c | 2 +- 5 files changed, 14 insertions(+), 14 deletions(-) diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index 5e349869590a8..2129fddb3dbc3 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -732,7 +732,7 @@ static void ptep_zap_swap_entry(struct mm_struct *mm, swp_entry_t entry) else if (is_migration_entry(entry)) { struct page *page = pfn_swap_entry_to_page(entry); - dec_mm_counter(mm, mm_counter(page)); + dec_mm_counter(mm, mm_counter(folio)); } free_swap_and_cache(entry); } diff --git a/include/linux/mm.h b/include/linux/mm.h index 9099a30dccb89..f06f421406ff4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2592,11 +2592,11 @@ static inline int mm_counter_file(struct page *page) return MM_FILEPAGES; } -static inline int mm_counter(struct page *page) +static inline int mm_counter(struct folio *folio) { - if (PageAnon(page)) + if (folio_test_anon(folio)) return MM_ANONPAGES; - return mm_counter_file(page); + return mm_counter_file(&folio->page); } static inline unsigned long get_mm_rss(struct mm_struct *mm) diff --git a/mm/memory.c b/mm/memory.c index 844844dd77383..82398f1754c89 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -797,7 +797,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, } else if (is_migration_entry(entry)) { folio = pfn_swap_entry_folio(entry); - rss[mm_counter(&folio->page)]++; + rss[mm_counter(folio)]++; if (!is_readable_migration_entry(entry) && is_cow_mapping(vm_flags)) { @@ -829,7 +829,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, * keep things as they are. */ folio_get(folio); - rss[mm_counter(page)]++; + rss[mm_counter(folio)]++; /* Cannot fail as these pages cannot get pinned. */ folio_try_dup_anon_rmap_pte(folio, page, src_vma); @@ -1462,7 +1462,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (pte_young(ptent) && likely(vma_has_recency(vma))) folio_mark_accessed(folio); } - rss[mm_counter(page)]--; + rss[mm_counter(folio)]--; if (!delay_rmap) { folio_remove_rmap_pte(folio, page, vma); if (unlikely(page_mapcount(page) < 0)) @@ -1490,7 +1490,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, * see zap_install_uffd_wp_if_needed(). */ WARN_ON_ONCE(!vma_is_anonymous(vma)); - rss[mm_counter(page)]--; + rss[mm_counter(folio)]--; if (is_device_private_entry(entry)) folio_remove_rmap_pte(folio, page, vma); folio_put(folio); @@ -1505,7 +1505,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, folio = pfn_swap_entry_folio(entry); if (!should_zap_folio(details, folio)) continue; - rss[mm_counter(&folio->page)]--; + rss[mm_counter(folio)]--; } else if (pte_marker_entry_uffd_wp(entry)) { /* * For anon: always drop the marker; for file: only diff --git a/mm/rmap.c b/mm/rmap.c index 5e556fc4f5100..c44b9cb3f1dad 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1750,7 +1750,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, set_huge_pte_at(mm, address, pvmw.pte, pteval, hsz); } else { - dec_mm_counter(mm, mm_counter(&folio->page)); + dec_mm_counter(mm, mm_counter(folio)); set_pte_at(mm, address, pvmw.pte, pteval); } @@ -1765,7 +1765,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, * migration) will not expect userfaults on already * copied pages. */ - dec_mm_counter(mm, mm_counter(&folio->page)); + dec_mm_counter(mm, mm_counter(folio)); } else if (folio_test_anon(folio)) { swp_entry_t entry = page_swap_entry(subpage); pte_t swp_pte; @@ -2151,7 +2151,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, set_huge_pte_at(mm, address, pvmw.pte, pteval, hsz); } else { - dec_mm_counter(mm, mm_counter(&folio->page)); + dec_mm_counter(mm, mm_counter(folio)); set_pte_at(mm, address, pvmw.pte, pteval); } @@ -2166,7 +2166,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma, * migration) will not expect userfaults on already * copied pages. */ - dec_mm_counter(mm, mm_counter(&folio->page)); + dec_mm_counter(mm, mm_counter(folio)); } else { swp_entry_t entry; pte_t swp_pte; diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 2031e1d5b2d7c..952366a178980 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -124,7 +124,7 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd, * Must happen after rmap, as mm_counter() checks mapping (via * PageAnon()), which is set by __page_set_anon_rmap(). */ - inc_mm_counter(dst_mm, mm_counter(page)); + inc_mm_counter(dst_mm, mm_counter(folio)); set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); From 899a358bac319215bfd9ee68730b0287de7b156d Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Thu, 11 Jan 2024 15:24:29 +0000 Subject: [PATCH 201/548] mm: convert mm_counter_file() to take a folio Now all callers of mm_counter_file() have a folio, convert mm_counter_file() to take a folio. Saves a call to compound_head() hidden inside PageSwapBacked(). Link: https://lkml.kernel.org/r/20240111152429.3374566-11-willy@infradead.org Signed-off-by: Kefeng Wang Signed-off-by: Matthew Wilcox (Oracle) Cc: David Hildenbrand Signed-off-by: Andrew Morton (cherry picked from commit 6b27cc6c66abf0f0b091a95ca1ad4e0fc68c11fd) Signed-off-by: Koba Ko --- include/linux/mm.h | 8 ++++---- kernel/events/uprobes.c | 2 +- mm/huge_memory.c | 4 ++-- mm/khugepaged.c | 4 ++-- mm/memory.c | 10 +++++----- mm/rmap.c | 2 +- 6 files changed, 15 insertions(+), 15 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index f06f421406ff4..b4cb844cc779a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2584,10 +2584,10 @@ static inline void dec_mm_counter(struct mm_struct *mm, int member) mm_trace_rss_stat(mm, member); } -/* Optimized variant when page is already known not to be PageAnon */ -static inline int mm_counter_file(struct page *page) +/* Optimized variant when folio is already known not to be anon */ +static inline int mm_counter_file(struct folio *folio) { - if (PageSwapBacked(page)) + if (folio_test_swapbacked(folio)) return MM_SHMEMPAGES; return MM_FILEPAGES; } @@ -2596,7 +2596,7 @@ static inline int mm_counter(struct folio *folio) { if (folio_test_anon(folio)) return MM_ANONPAGES; - return mm_counter_file(&folio->page); + return mm_counter_file(folio); } static inline unsigned long get_mm_rss(struct mm_struct *mm) diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c index 646dcabc28783..bc7e912735c6b 100644 --- a/kernel/events/uprobes.c +++ b/kernel/events/uprobes.c @@ -199,7 +199,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr, dec_mm_counter(mm, MM_ANONPAGES); if (!folio_test_anon(old_folio)) { - dec_mm_counter(mm, mm_counter_file(old_page)); + dec_mm_counter(mm, mm_counter_file(old_folio)); inc_mm_counter(mm, MM_ANONPAGES); } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e7be0f3c421c5..be5ff6c41e450 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1886,7 +1886,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, } else { if (arch_needs_pgtable_deposit()) zap_deposited_table(tlb->mm, pmd); - add_mm_counter(tlb->mm, mm_counter_file(&folio->page), + add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PMD_NR); } @@ -2289,7 +2289,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, folio_remove_rmap_pmd(folio, page, vma); folio_put(folio); } - add_mm_counter(mm, mm_counter_file(&folio->page), -HPAGE_PMD_NR); + add_mm_counter(mm, mm_counter_file(folio), -HPAGE_PMD_NR); return; } diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 127f3f24c7910..b2c4b34760d3d 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1619,7 +1619,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, /* step 3: set proper refcount and mm_counters. */ if (nr_ptes) { folio_ref_sub(folio, nr_ptes); - add_mm_counter(mm, mm_counter_file(&folio->page), -nr_ptes); + add_mm_counter(mm, mm_counter_file(folio), -nr_ptes); } /* step 4: remove empty page table */ @@ -1650,7 +1650,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, if (nr_ptes) { flush_tlb_mm(mm); folio_ref_sub(folio, nr_ptes); - add_mm_counter(mm, mm_counter_file(&folio->page), -nr_ptes); + add_mm_counter(mm, mm_counter_file(folio), -nr_ptes); } if (start_pte) pte_unmap_unlock(start_pte, ptl); diff --git a/mm/memory.c b/mm/memory.c index 82398f1754c89..5d30b8d8dd6a2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -955,7 +955,7 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, } else if (page) { folio_get(folio); folio_dup_file_rmap_pte(folio, page); - rss[mm_counter_file(page)]++; + rss[mm_counter_file(folio)]++; } /* @@ -1857,7 +1857,7 @@ static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte, return -EBUSY; /* Ok, finally just insert the thing.. */ folio_get(folio); - inc_mm_counter(vma->vm_mm, mm_counter_file(page)); + inc_mm_counter(vma->vm_mm, mm_counter_file(folio)); folio_add_file_rmap_pte(folio, page, vma); set_pte_at(vma->vm_mm, addr, pte, mk_pte(page, prot)); return 0; @@ -3178,7 +3178,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) if (likely(vmf->pte && pte_same(ptep_get(vmf->pte), vmf->orig_pte))) { if (old_folio) { if (!folio_test_anon(old_folio)) { - dec_mm_counter(mm, mm_counter_file(&old_folio->page)); + dec_mm_counter(mm, mm_counter_file(old_folio)); inc_mm_counter(mm, MM_ANONPAGES); } } else { @@ -4474,7 +4474,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) if (write) entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); - add_mm_counter(vma->vm_mm, mm_counter_file(page), HPAGE_PMD_NR); + add_mm_counter(vma->vm_mm, mm_counter_file(folio), HPAGE_PMD_NR); folio_add_file_rmap_pmd(folio, page, vma); /* @@ -4537,7 +4537,7 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio, folio_add_new_anon_rmap(folio, vma, addr); folio_add_lru_vma(folio, vma); } else { - add_mm_counter(vma->vm_mm, mm_counter_file(page), nr); + add_mm_counter(vma->vm_mm, mm_counter_file(folio), nr); folio_add_file_rmap_ptes(folio, page, nr, vma); } set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr); diff --git a/mm/rmap.c b/mm/rmap.c index c44b9cb3f1dad..1c4dbd42948ca 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1873,7 +1873,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, * * See Documentation/mm/mmu_notifier.rst */ - dec_mm_counter(mm, mm_counter_file(&folio->page)); + dec_mm_counter(mm, mm_counter_file(folio)); } discard: if (unlikely(folio_test_hugetlb(folio))) From fbd10bcc5ad414bc5f7ceb386fd1f36063a4f32e Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Mon, 29 Jan 2024 13:46:35 +0100 Subject: [PATCH 202/548] arm64/mm: make set_ptes() robust when OAs cross 48-bit boundary Patch series "mm/memory: optimize fork() with PTE-mapped THP", v3. Now that the rmap overhaul[1] is upstream that provides a clean interface for rmap batching, let's implement PTE batching during fork when processing PTE-mapped THPs. This series is partially based on Ryan's previous work[2] to implement cont-pte support on arm64, but its a complete rewrite based on [1] to optimize all architectures independent of any such PTE bits, and to use the new rmap batching functions that simplify the code and prepare for further rmap accounting changes. We collect consecutive PTEs that map consecutive pages of the same large folio, making sure that the other PTE bits are compatible, and (a) adjust the refcount only once per batch, (b) call rmap handling functions only once per batch and (c) perform batch PTE setting/updates. While this series should be beneficial for adding cont-pte support on ARM64[2], it's one of the requirements for maintaining a total mapcount[3] for large folios with minimal added overhead and further changes[4] that build up on top of the total mapcount. Independent of all that, this series results in a speedup during fork with PTE-mapped THP, which is the default with THPs that are smaller than a PMD (for example, 16KiB to 1024KiB mTHPs for anonymous memory[5]). On an Intel Xeon Silver 4210R CPU, fork'ing with 1GiB of PTE-mapped folios of the same size (stddev < 1%) results in the following runtimes for fork() (shorter is better): Folio Size | v6.8-rc1 | New | Change ------------------------------------------ 4KiB | 0.014328 | 0.014035 | - 2% 16KiB | 0.014263 | 0.01196 | -16% 32KiB | 0.014334 | 0.01094 | -24% 64KiB | 0.014046 | 0.010444 | -26% 128KiB | 0.014011 | 0.010063 | -28% 256KiB | 0.013993 | 0.009938 | -29% 512KiB | 0.013983 | 0.00985 | -30% 1024KiB | 0.013986 | 0.00982 | -30% 2048KiB | 0.014305 | 0.010076 | -30% Note that these numbers are even better than the ones from v1 (verified over multiple reboots), even though there were only minimal code changes. Well, I removed a pte_mkclean() call for anon folios, maybe that also plays a role. But my experience is that fork() is extremely sensitive to code size, inlining, ... so I suspect we'll see on other architectures rather a change of -20% instead of -30%, and it will be easy to "lose" some of that speedup in the future by subtle code changes. Next up is PTE batching when unmapping. Only tested on x86-64. Compile-tested on most other architectures. [1] https://lkml.kernel.org/r/20231220224504.646757-1-david@redhat.com [2] https://lkml.kernel.org/r/20231218105100.172635-1-ryan.roberts@arm.com [3] https://lkml.kernel.org/r/20230809083256.699513-1-david@redhat.com [4] https://lkml.kernel.org/r/20231124132626.235350-1-david@redhat.com [5] https://lkml.kernel.org/r/20231207161211.2374093-1-ryan.roberts@arm.com This patch (of 15): Since the high bits [51:48] of an OA are not stored contiguously in the PTE, there is a theoretical bug in set_ptes(), which just adds PAGE_SIZE to the pte to get the pte with the next pfn. This works until the pfn crosses the 48-bit boundary, at which point we overflow into the upper attributes. Of course one could argue (and Matthew Wilcox has :) that we will never see a folio cross this boundary because we only allow naturally aligned power-of-2 allocation, so this would require a half-petabyte folio. So its only a theoretical bug. But its better that the code is robust regardless. I've implemented pte_next_pfn() as part of the fix, which is an opt-in core-mm interface. So that is now available to the core-mm, which will be needed shortly to support forthcoming fork()-batching optimizations. Link: https://lkml.kernel.org/r/20240129124649.189745-1-david@redhat.com Link: https://lkml.kernel.org/r/20240125173534.1659317-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20240129124649.189745-2-david@redhat.com Fixes: 4a169d61c2ed ("arm64: implement the new page table range API") Closes: https://lore.kernel.org/linux-mm/fdaeb9a5-d890-499a-92c8-d171df43ad01@arm.com/ Signed-off-by: Ryan Roberts Signed-off-by: David Hildenbrand Reviewed-by: Catalin Marinas Reviewed-by: David Hildenbrand Tested-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Aneesh Kumar K.V Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Cc: Alexandre Ghiti Signed-off-by: Andrew Morton (cherry picked from commit 6e8f588708971e0626f5be808e8c4b6cdb86eb0b) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 4724a8fbe4803..225c026f0d74e 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -341,6 +341,22 @@ static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages) mte_sync_tags(pte, nr_pages); } +/* + * Select all bits except the pfn + */ +static inline pgprot_t pte_pgprot(pte_t pte) +{ + unsigned long pfn = pte_pfn(pte); + + return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte)); +} + +#define pte_next_pfn pte_next_pfn +static inline pte_t pte_next_pfn(pte_t pte) +{ + return pfn_pte(pte_pfn(pte) + 1, pte_pgprot(pte)); +} + static inline void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr) { @@ -353,8 +369,7 @@ static inline void set_ptes(struct mm_struct *mm, unsigned long addr, if (--nr == 0) break; ptep++; - addr += PAGE_SIZE; - pte_val(pte) += PAGE_SIZE; + pte = pte_next_pfn(pte); } } #define set_ptes set_ptes @@ -433,16 +448,6 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte) return clear_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE)); } -/* - * Select all bits except the pfn - */ -static inline pgprot_t pte_pgprot(pte_t pte) -{ - unsigned long pfn = pte_pfn(pte); - - return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte)); -} - #ifdef CONFIG_NUMA_BALANCING /* * See the comment in include/linux/pgtable.h From 04e7cf829ab271aaebec41a79f029733d3fd128d Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:36 +0100 Subject: [PATCH 203/548] arm/pgtable: define PFN_PTE_SHIFT We want to make use of pte_next_pfn() outside of set_ptes(). Let's simply define PFN_PTE_SHIFT, required by pte_next_pfn(). Link: https://lkml.kernel.org/r/20240129124649.189745-3-david@redhat.com Signed-off-by: David Hildenbrand Tested-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Signed-off-by: Andrew Morton (cherry picked from commit 12b884f2e09ab42d3879a3e2c703e7157691013c) Signed-off-by: Koba Ko --- arch/arm/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index d657b84b6bf70..be91e376df79e 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -209,6 +209,8 @@ static inline void __sync_icache_dcache(pte_t pteval) extern void __sync_icache_dcache(pte_t pteval); #endif +#define PFN_PTE_SHIFT PAGE_SHIFT + void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pteval, unsigned int nr); #define set_ptes set_ptes From fc728181fc6d10576f10fe7e9e1762ff7c4967e8 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:42 +0100 Subject: [PATCH 204/548] mm/pgtable: make pte_next_pfn() independent of set_ptes() Let's provide pte_next_pfn(), independently of set_ptes(). This allows for using the generic pte_next_pfn() version in some arch-specific set_ptes() implementations, and prepares for reusing pte_next_pfn() in other context. Link: https://lkml.kernel.org/r/20240129124649.189745-9-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Christophe Leroy Tested-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Signed-off-by: Andrew Morton (cherry picked from commit 6cdfa1d5d5d8285108495c33588c48cdda81b647) Signed-off-by: Koba Ko --- include/linux/pgtable.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index e2c9a0c259df3..35374228270b3 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -209,7 +209,6 @@ static inline int pmd_young(pmd_t pmd) #define arch_flush_lazy_mmu_mode() do {} while (0) #endif -#ifndef set_ptes #ifndef pte_next_pfn static inline pte_t pte_next_pfn(pte_t pte) @@ -218,6 +217,7 @@ static inline pte_t pte_next_pfn(pte_t pte) } #endif +#ifndef set_ptes /** * set_ptes - Map consecutive pages to a contiguous range of addresses. * @mm: Address space to map the pages into. From 8f0ab14300a4fefb3ee7725414683109e8bc9edb Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:43 +0100 Subject: [PATCH 205/548] arm/mm: use pte_next_pfn() in set_ptes() Let's use our handy helper now that it's available on all archs. Link: https://lkml.kernel.org/r/20240129124649.189745-10-david@redhat.com Signed-off-by: David Hildenbrand Tested-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Signed-off-by: Andrew Morton (cherry picked from commit e5ea320aec811c0e5cddefda17052579e0306415) Signed-off-by: Koba Ko --- arch/arm/mm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index 073de5b24560d..735cca0ccfe20 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -1822,6 +1822,6 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, if (--nr == 0) break; ptep++; - pte_val(pteval) += PAGE_SIZE; + pteval = pte_next_pfn(pteval); } } From 415004cd1b704f19a7e9c84880f65edacc850b43 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:45 +0100 Subject: [PATCH 206/548] mm/memory: factor out copying the actual PTE in copy_present_pte() Let's prepare for further changes. Link: https://lkml.kernel.org/r/20240129124649.189745-12-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Signed-off-by: Andrew Morton (cherry picked from commit 23ed190868a65525b8941370630fbb215f12ebe8) Signed-off-by: Koba Ko --- mm/memory.c | 63 ++++++++++++++++++++++++++++------------------------- 1 file changed, 33 insertions(+), 30 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 5d30b8d8dd6a2..35281353b89eb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -919,6 +919,29 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma return 0; } +static inline void __copy_present_pte(struct vm_area_struct *dst_vma, + struct vm_area_struct *src_vma, pte_t *dst_pte, pte_t *src_pte, + pte_t pte, unsigned long addr) +{ + struct mm_struct *src_mm = src_vma->vm_mm; + + /* If it's a COW mapping, write protect it both processes. */ + if (is_cow_mapping(src_vma->vm_flags) && pte_write(pte)) { + ptep_set_wrprotect(src_mm, addr, src_pte); + pte = pte_wrprotect(pte); + } + + /* If it's a shared mapping, mark it clean in the child. */ + if (src_vma->vm_flags & VM_SHARED) + pte = pte_mkclean(pte); + pte = pte_mkold(pte); + + if (!userfaultfd_wp(dst_vma)) + pte = pte_clear_uffd_wp(pte); + + set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte); +} + /* * Copy one pte. Returns 0 if succeeded, or -EAGAIN if one preallocated page * is required to copy this pte. @@ -928,23 +951,23 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, pte_t *dst_pte, pte_t *src_pte, unsigned long addr, int *rss, struct folio **prealloc) { - struct mm_struct *src_mm = src_vma->vm_mm; - unsigned long vm_flags = src_vma->vm_flags; pte_t pte = ptep_get(src_pte); struct page *page; struct folio *folio; page = vm_normal_page(src_vma, addr, pte); - if (page) - folio = page_folio(page); - if (page && folio_test_anon(folio)) { + if (unlikely(!page)) + goto copy_pte; + + folio = page_folio(page); + folio_get(folio); + if (folio_test_anon(folio)) { /* * If this page may have been pinned by the parent process, * copy the page immediately for the child so that we'll always * guarantee the pinned page won't be randomly replaced in the * future. */ - folio_get(folio); if (unlikely(folio_try_dup_anon_rmap_pte(folio, page, src_vma))) { /* Page may be pinned, we have to copy. */ folio_put(folio); @@ -952,34 +975,14 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, addr, rss, prealloc, page); } rss[MM_ANONPAGES]++; - } else if (page) { - folio_get(folio); + VM_WARN_ON_FOLIO(PageAnonExclusive(page), folio); + } else { folio_dup_file_rmap_pte(folio, page); rss[mm_counter_file(folio)]++; } - /* - * If it's a COW mapping, write protect it both - * in the parent and the child - */ - if (is_cow_mapping(vm_flags) && pte_write(pte)) { - ptep_set_wrprotect(src_mm, addr, src_pte); - pte = pte_wrprotect(pte); - } - VM_BUG_ON(page && folio_test_anon(folio) && PageAnonExclusive(page)); - - /* - * If it's a shared mapping, mark it clean in - * the child - */ - if (vm_flags & VM_SHARED) - pte = pte_mkclean(pte); - pte = pte_mkold(pte); - - if (!userfaultfd_wp(dst_vma)) - pte = pte_clear_uffd_wp(pte); - - set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte); +copy_pte: + __copy_present_pte(dst_vma, src_vma, dst_pte, src_pte, pte, addr); return 0; } From 1f1380a6929a827754277975a64c2ac0ca4d99b9 Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:46 +0100 Subject: [PATCH 207/548] mm/memory: pass PTE to copy_present_pte() We already read it, let's just forward it. This patch is based on work by Ryan Roberts. [david@redhat.com: fix the hmm "exclusive_cow" selftest] Link: https://lkml.kernel.org/r/13f296b8-e882-47fd-b939-c2141dc28717@redhat.com Link: https://lkml.kernel.org/r/20240129124649.189745-13-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Signed-off-by: Andrew Morton (cherry picked from commit 53723298ba436830fdf0744c19b57b2a18f44041) Signed-off-by: Koba Ko --- mm/memory.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 35281353b89eb..f163c108a2f3c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -948,10 +948,9 @@ static inline void __copy_present_pte(struct vm_area_struct *dst_vma, */ static inline int copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, - pte_t *dst_pte, pte_t *src_pte, unsigned long addr, int *rss, - struct folio **prealloc) + pte_t *dst_pte, pte_t *src_pte, pte_t pte, unsigned long addr, + int *rss, struct folio **prealloc) { - pte_t pte = ptep_get(src_pte); struct page *page; struct folio *folio; @@ -1083,6 +1082,8 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, progress += 8; continue; } + ptent = ptep_get(src_pte); + VM_WARN_ON_ONCE(!pte_present(ptent)); /* * Device exclusive entry restored, continue by copying @@ -1092,7 +1093,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, } /* copy_present_pte() will clear `*prealloc' if consumed */ ret = copy_present_pte(dst_vma, src_vma, dst_pte, src_pte, - addr, rss, &prealloc); + ptent, addr, rss, &prealloc); /* * If we need a pre-allocated page for this pte, drop the * locks, allocate, and try again. From efa7a309428be5161eb6a34aeb6304a5bdc5b49c Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:47 +0100 Subject: [PATCH 208/548] mm/memory: optimize fork() with PTE-mapped THP Let's implement PTE batching when consecutive (present) PTEs map consecutive pages of the same large folio, and all other PTE bits besides the PFNs are equal. We will optimize folio_pte_batch() separately, to ignore selected PTE bits. This patch is based on work by Ryan Roberts. Use __always_inline for __copy_present_ptes() and keep the handling for single PTEs completely separate from the multi-PTE case: we really want the compiler to optimize for the single-PTE case with small folios, to not degrade performance. Note that PTE batching will never exceed a single page table and will always stay within VMA boundaries. Further, processing PTE-mapped THP that maybe pinned and have PageAnonExclusive set on at least one subpage should work as expected, but there is room for improvement: We will repeatedly (1) detect a PTE batch (2) detect that we have to copy a page (3) fall back and allocate a single page to copy a single page. For now we won't care as pinned pages are a corner case, and we should rather look into maintaining only a single PageAnonExclusive bit for large folios. Link: https://lkml.kernel.org/r/20240129124649.189745-14-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts Reviewed-by: Mike Rapoport (IBM) Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Signed-off-by: Andrew Morton (cherry picked from commit f8d937761d65c87e9987b88ea7beb7bddc333a0e) Signed-off-by: Koba Ko --- include/linux/pgtable.h | 31 +++++++++++ mm/memory.c | 112 +++++++++++++++++++++++++++++++++------- 2 files changed, 124 insertions(+), 19 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 35374228270b3..6b85a95ed7307 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -645,6 +645,37 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres } #endif +#ifndef wrprotect_ptes +/** + * wrprotect_ptes - Write-protect PTEs that map consecutive pages of the same + * folio. + * @mm: Address space the pages are mapped into. + * @addr: Address the first page is mapped at. + * @ptep: Page table pointer for the first entry. + * @nr: Number of entries to write-protect. + * + * May be overridden by the architecture; otherwise, implemented as a simple + * loop over ptep_set_wrprotect(). + * + * Note that PTE bits in the PTE range besides the PFN can differ. For example, + * some PTEs might be write-protected. + * + * Context: The caller holds the page table lock. The PTEs map consecutive + * pages that belong to the same folio. The PTEs are all in the same PMD. + */ +static inline void wrprotect_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr) +{ + for (;;) { + ptep_set_wrprotect(mm, addr, ptep); + if (--nr == 0) + break; + ptep++; + addr += PAGE_SIZE; + } +} +#endif + /* * On some architectures hardware does not set page access bit when accessing * memory page, it is responsibility of software setting this bit. It brings diff --git a/mm/memory.c b/mm/memory.c index f163c108a2f3c..915c14546b963 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -919,15 +919,15 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma return 0; } -static inline void __copy_present_pte(struct vm_area_struct *dst_vma, +static __always_inline void __copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, pte_t *dst_pte, pte_t *src_pte, - pte_t pte, unsigned long addr) + pte_t pte, unsigned long addr, int nr) { struct mm_struct *src_mm = src_vma->vm_mm; /* If it's a COW mapping, write protect it both processes. */ if (is_cow_mapping(src_vma->vm_flags) && pte_write(pte)) { - ptep_set_wrprotect(src_mm, addr, src_pte); + wrprotect_ptes(src_mm, addr, src_pte, nr); pte = pte_wrprotect(pte); } @@ -939,26 +939,93 @@ static inline void __copy_present_pte(struct vm_area_struct *dst_vma, if (!userfaultfd_wp(dst_vma)) pte = pte_clear_uffd_wp(pte); - set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte); + set_ptes(dst_vma->vm_mm, addr, dst_pte, pte, nr); +} + +/* + * Detect a PTE batch: consecutive (present) PTEs that map consecutive + * pages of the same folio. + * + * All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN. + */ +static inline int folio_pte_batch(struct folio *folio, unsigned long addr, + pte_t *start_ptep, pte_t pte, int max_nr) +{ + unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio); + const pte_t *end_ptep = start_ptep + max_nr; + pte_t expected_pte = pte_next_pfn(pte); + pte_t *ptep = start_ptep + 1; + + VM_WARN_ON_FOLIO(!pte_present(pte), folio); + + while (ptep != end_ptep) { + pte = ptep_get(ptep); + + if (!pte_same(pte, expected_pte)) + break; + + /* + * Stop immediately once we reached the end of the folio. In + * corner cases the next PFN might fall into a different + * folio. + */ + if (pte_pfn(pte) == folio_end_pfn) + break; + + expected_pte = pte_next_pfn(expected_pte); + ptep++; + } + + return ptep - start_ptep; } /* - * Copy one pte. Returns 0 if succeeded, or -EAGAIN if one preallocated page - * is required to copy this pte. + * Copy one present PTE, trying to batch-process subsequent PTEs that map + * consecutive pages of the same folio by copying them as well. + * + * Returns -EAGAIN if one preallocated page is required to copy the next PTE. + * Otherwise, returns the number of copied PTEs (at least 1). */ static inline int -copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, +copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, pte_t *dst_pte, pte_t *src_pte, pte_t pte, unsigned long addr, - int *rss, struct folio **prealloc) + int max_nr, int *rss, struct folio **prealloc) { struct page *page; struct folio *folio; + int err, nr; page = vm_normal_page(src_vma, addr, pte); if (unlikely(!page)) goto copy_pte; folio = page_folio(page); + + /* + * If we likely have to copy, just don't bother with batching. Make + * sure that the common "small folio" case is as fast as possible + * by keeping the batching logic separate. + */ + if (unlikely(!*prealloc && folio_test_large(folio) && max_nr != 1)) { + nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr); + folio_ref_add(folio, nr); + if (folio_test_anon(folio)) { + if (unlikely(folio_try_dup_anon_rmap_ptes(folio, page, + nr, src_vma))) { + folio_ref_sub(folio, nr); + return -EAGAIN; + } + rss[MM_ANONPAGES] += nr; + VM_WARN_ON_FOLIO(PageAnonExclusive(page), folio); + } else { + folio_dup_file_rmap_ptes(folio, page, nr); + rss[mm_counter_file(folio)] += nr; + } + __copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte, pte, + addr, nr); + return nr; + } + folio_get(folio); if (folio_test_anon(folio)) { /* @@ -970,8 +1037,9 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, if (unlikely(folio_try_dup_anon_rmap_pte(folio, page, src_vma))) { /* Page may be pinned, we have to copy. */ folio_put(folio); - return copy_present_page(dst_vma, src_vma, dst_pte, src_pte, - addr, rss, prealloc, page); + err = copy_present_page(dst_vma, src_vma, dst_pte, src_pte, + addr, rss, prealloc, page); + return err ? err : 1; } rss[MM_ANONPAGES]++; VM_WARN_ON_FOLIO(PageAnonExclusive(page), folio); @@ -981,8 +1049,8 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, } copy_pte: - __copy_present_pte(dst_vma, src_vma, dst_pte, src_pte, pte, addr); - return 0; + __copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte, pte, addr, 1); + return 1; } static inline struct folio *folio_prealloc(struct mm_struct *src_mm, @@ -1019,10 +1087,11 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, pte_t *src_pte, *dst_pte; pte_t ptent; spinlock_t *src_ptl, *dst_ptl; - int progress, ret = 0; + int progress, max_nr, ret = 0; int rss[NR_MM_COUNTERS]; swp_entry_t entry = (swp_entry_t){0}; struct folio *prealloc = NULL; + int nr; again: progress = 0; @@ -1053,6 +1122,8 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, arch_enter_lazy_mmu_mode(); do { + nr = 1; + /* * We are holding two locks at this point - either of them * could generate latencies in another task on another CPU. @@ -1091,9 +1162,10 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, */ WARN_ON_ONCE(ret != -ENOENT); } - /* copy_present_pte() will clear `*prealloc' if consumed */ - ret = copy_present_pte(dst_vma, src_vma, dst_pte, src_pte, - ptent, addr, rss, &prealloc); + /* copy_present_ptes() will clear `*prealloc' if consumed */ + max_nr = (end - addr) / PAGE_SIZE; + ret = copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte, + ptent, addr, max_nr, rss, &prealloc); /* * If we need a pre-allocated page for this pte, drop the * locks, allocate, and try again. @@ -1110,8 +1182,10 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, folio_put(prealloc); prealloc = NULL; } - progress += 8; - } while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end); + nr = ret; + progress += 8 * nr; + } while (dst_pte += nr, src_pte += nr, addr += PAGE_SIZE * nr, + addr != end); arch_leave_lazy_mmu_mode(); pte_unmap_unlock(orig_src_pte, src_ptl); @@ -1132,7 +1206,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, prealloc = folio_prealloc(src_mm, src_vma, addr, false); if (!prealloc) return -ENOMEM; - } else if (ret) { + } else if (ret < 0) { VM_WARN_ON_ONCE(1); } From def8b4848f3046fc26a7b81635a66f582a127b0e Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:48 +0100 Subject: [PATCH 209/548] mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch() Let's always ignore the accessed/young bit: we'll always mark the PTE as old in our child process during fork, and upcoming users will similarly not care. Ignore the dirty bit only if we don't want to duplicate the dirty bit into the child process during fork. Maybe, we could just set all PTEs in the child dirty if any PTE is dirty. For now, let's keep the behavior unchanged, this can be optimized later if required. Ignore the soft-dirty bit only if the bit doesn't have any meaning in the src vma, and similarly won't have any in the copied dst vma. For now, we won't bother with the uffd-wp bit. Link: https://lkml.kernel.org/r/20240129124649.189745-15-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Cc: Mike Rapoport (IBM) Signed-off-by: Andrew Morton (cherry picked from commit 25365e10699aa0e320345d019194fbea9f37a4ae) Signed-off-by: Koba Ko --- mm/memory.c | 36 +++++++++++++++++++++++++++++++----- 1 file changed, 31 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 915c14546b963..add258e038335 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -942,24 +942,44 @@ static __always_inline void __copy_present_ptes(struct vm_area_struct *dst_vma, set_ptes(dst_vma->vm_mm, addr, dst_pte, pte, nr); } +/* Flags for folio_pte_batch(). */ +typedef int __bitwise fpb_t; + +/* Compare PTEs after pte_mkclean(), ignoring the dirty bit. */ +#define FPB_IGNORE_DIRTY ((__force fpb_t)BIT(0)) + +/* Compare PTEs after pte_clear_soft_dirty(), ignoring the soft-dirty bit. */ +#define FPB_IGNORE_SOFT_DIRTY ((__force fpb_t)BIT(1)) + +static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags) +{ + if (flags & FPB_IGNORE_DIRTY) + pte = pte_mkclean(pte); + if (likely(flags & FPB_IGNORE_SOFT_DIRTY)) + pte = pte_clear_soft_dirty(pte); + return pte_mkold(pte); +} + /* * Detect a PTE batch: consecutive (present) PTEs that map consecutive * pages of the same folio. * - * All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN. + * All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN, + * the accessed bit, dirty bit (with FPB_IGNORE_DIRTY) and soft-dirty bit + * (with FPB_IGNORE_SOFT_DIRTY). */ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, - pte_t *start_ptep, pte_t pte, int max_nr) + pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags) { unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio); const pte_t *end_ptep = start_ptep + max_nr; - pte_t expected_pte = pte_next_pfn(pte); + pte_t expected_pte = __pte_batch_clear_ignored(pte_next_pfn(pte), flags); pte_t *ptep = start_ptep + 1; VM_WARN_ON_FOLIO(!pte_present(pte), folio); while (ptep != end_ptep) { - pte = ptep_get(ptep); + pte = __pte_batch_clear_ignored(ptep_get(ptep), flags); if (!pte_same(pte, expected_pte)) break; @@ -993,6 +1013,7 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma { struct page *page; struct folio *folio; + fpb_t flags = 0; int err, nr; page = vm_normal_page(src_vma, addr, pte); @@ -1007,7 +1028,12 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma * by keeping the batching logic separate. */ if (unlikely(!*prealloc && folio_test_large(folio) && max_nr != 1)) { - nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr); + if (src_vma->vm_flags & VM_SHARED) + flags |= FPB_IGNORE_DIRTY; + if (!vma_soft_dirty_enabled(src_vma)) + flags |= FPB_IGNORE_SOFT_DIRTY; + + nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags); folio_ref_add(folio, nr); if (folio_test_anon(folio)) { if (unlikely(folio_try_dup_anon_rmap_ptes(folio, page, From 6782131b94a1c450e984659640ea4efea6c4f63d Mon Sep 17 00:00:00 2001 From: David Hildenbrand Date: Mon, 29 Jan 2024 13:46:49 +0100 Subject: [PATCH 210/548] mm/memory: ignore writable bit in folio_pte_batch() ... and conditionally return to the caller if any PTE except the first one is writable. fork() has to make sure to properly write-protect in case any PTE is writable. Other users (e.g., page unmaping) are expected to not care. Link: https://lkml.kernel.org/r/20240129124649.189745-16-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Ryan Roberts Cc: Albert Ou Cc: Alexander Gordeev Cc: Alexandre Ghiti Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: David S. Miller Cc: Dinh Nguyen Cc: Gerald Schaefer Cc: Heiko Carstens Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Palmer Dabbelt Cc: Paul Walmsley Cc: Russell King (Oracle) Cc: Sven Schnelle Cc: Vasily Gorbik Cc: Will Deacon Cc: Mike Rapoport (IBM) Signed-off-by: Andrew Morton (cherry picked from commit d7c0e5f722ab229153c22efc836bf220479bdce6) Signed-off-by: Koba Ko --- mm/memory.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index add258e038335..f42267dbb2d7c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -957,7 +957,7 @@ static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags) pte = pte_mkclean(pte); if (likely(flags & FPB_IGNORE_SOFT_DIRTY)) pte = pte_clear_soft_dirty(pte); - return pte_mkold(pte); + return pte_wrprotect(pte_mkold(pte)); } /* @@ -965,21 +965,32 @@ static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags) * pages of the same folio. * * All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN, - * the accessed bit, dirty bit (with FPB_IGNORE_DIRTY) and soft-dirty bit - * (with FPB_IGNORE_SOFT_DIRTY). + * the accessed bit, writable bit, dirty bit (with FPB_IGNORE_DIRTY) and + * soft-dirty bit (with FPB_IGNORE_SOFT_DIRTY). + * + * If "any_writable" is set, it will indicate if any other PTE besides the + * first (given) PTE is writable. */ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, - pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags) + pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags, + bool *any_writable) { unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio); const pte_t *end_ptep = start_ptep + max_nr; pte_t expected_pte = __pte_batch_clear_ignored(pte_next_pfn(pte), flags); pte_t *ptep = start_ptep + 1; + bool writable; + + if (any_writable) + *any_writable = false; VM_WARN_ON_FOLIO(!pte_present(pte), folio); while (ptep != end_ptep) { - pte = __pte_batch_clear_ignored(ptep_get(ptep), flags); + pte = ptep_get(ptep); + if (any_writable) + writable = !!pte_write(pte); + pte = __pte_batch_clear_ignored(pte, flags); if (!pte_same(pte, expected_pte)) break; @@ -992,6 +1003,9 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, if (pte_pfn(pte) == folio_end_pfn) break; + if (any_writable) + *any_writable |= writable; + expected_pte = pte_next_pfn(expected_pte); ptep++; } @@ -1013,6 +1027,7 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma { struct page *page; struct folio *folio; + bool any_writable; fpb_t flags = 0; int err, nr; @@ -1033,7 +1048,8 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma if (!vma_soft_dirty_enabled(src_vma)) flags |= FPB_IGNORE_SOFT_DIRTY; - nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags); + nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags, + &any_writable); folio_ref_add(folio, nr); if (folio_test_anon(folio)) { if (unlikely(folio_try_dup_anon_rmap_ptes(folio, page, @@ -1047,6 +1063,8 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma folio_dup_file_rmap_ptes(folio, page, nr); rss[mm_counter_file(folio)] += nr; } + if (any_writable) + pte = pte_mkwrite(pte, src_vma); __copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte, pte, addr, nr); return nr; From 6996f262f1b8e617eb81c52d7470122563f63f2e Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:48 +0000 Subject: [PATCH 211/548] mm: clarify the spec for set_ptes() Patch series "Transparent Contiguous PTEs for User Mappings", v6. This is a series to opportunistically and transparently use contpte mappings (set the contiguous bit in ptes) for user memory when those mappings meet the requirements. The change benefits arm64, but there is some (very) minor refactoring for x86 to enable its integration with core-mm. It is part of a wider effort to improve performance by allocating and mapping variable-sized blocks of memory (folios). One aim is for the 4K kernel to approach the performance of the 16K kernel, but without breaking compatibility and without the associated increase in memory. Another aim is to benefit the 16K and 64K kernels by enabling 2M THP, since this is the contpte size for those kernels. We have good performance data that demonstrates both aims are being met (see below). Of course this is only one half of the change. We require the mapped physical memory to be the correct size and alignment for this to actually be useful (i.e. 64K for 4K pages, or 2M for 16K/64K pages). Fortunately folios are solving this problem for us. Filesystems that support it (XFS, AFS, EROFS, tmpfs, ...) will allocate large folios up to the PMD size today, and more filesystems are coming. And for anonymous memory, "multi-size THP" is now upstream. Patch Layout ============ In this version, I've split the patches to better show each optimization: - 1-2: mm prep: misc code and docs cleanups - 3-6: mm,arm64,x86 prep: Add pte_advance_pfn() and make pte_next_pfn() a generic wrapper around it - 7-11: arm64 prep: Refactor ptep helpers into new layer - 12: functional contpte implementation - 23-18: various optimizations on top of the contpte implementation Testing ======= I've tested this series on both Ampere Altra (bare metal) and Apple M2 (VM): - mm selftests (inc new tests written for multi-size THP); no regressions - Speedometer Java script benchmark in Chromium web browser; no issues - Kernel compilation; no issues - Various tests under high memory pressure with swap enabled; no issues Performance =========== High Level Use Cases ~~~~~~~~~~~~~~~~~~~~ First some high level use cases (kernel compilation and speedometer JavaScript benchmarks). These are running on Ampere Altra (I've seen similar improvements on Android/Pixel 6). baseline: mm-unstable (mTHP switched off) mTHP: + enable 16K, 32K, 64K mTHP sizes "always" mTHP + contpte: + this series mTHP + contpte + exefolio: + patch at [6], which series supports Kernel Compilation with -j8 (negative is faster): | kernel | real-time | kern-time | user-time | |---------------------------|-----------|-----------|-----------| | baseline | 0.0% | 0.0% | 0.0% | | mTHP | -5.0% | -39.1% | -0.7% | | mTHP + contpte | -6.0% | -41.4% | -1.5% | | mTHP + contpte + exefolio | -7.8% | -43.1% | -3.4% | Kernel Compilation with -j80 (negative is faster): | kernel | real-time | kern-time | user-time | |---------------------------|-----------|-----------|-----------| | baseline | 0.0% | 0.0% | 0.0% | | mTHP | -5.0% | -36.6% | -0.6% | | mTHP + contpte | -6.1% | -38.2% | -1.6% | | mTHP + contpte + exefolio | -7.4% | -39.2% | -3.2% | Speedometer (positive is faster): | kernel | runs_per_min | |:--------------------------|--------------| | baseline | 0.0% | | mTHP | 1.5% | | mTHP + contpte | 3.2% | | mTHP + contpte + exefolio | 4.5% | Micro Benchmarks ~~~~~~~~~~~~~~~~ The following microbenchmarks are intended to demonstrate the performance of fork() and munmap() do not regress. I'm showing results for order-0 (4K) mappings, and for order-9 (2M) PTE-mapped THP. Thanks to David for sharing his benchmarks. baseline: mm-unstable + batch zap [7] series contpte-basic: + patches 0-19; functional contpte implementation contpte-batch: + patches 20-23; implement new batched APIs contpte-inline: + patch 24; __always_inline to help compiler contpte-fold: + patch 25; fold contpte mapping when sensible Primary platform is Ampere Altra bare metal. I'm also showing results for M2 VM (on top of MacOS) for reference, although experience suggests this might not be the most reliable for performance numbers of this sort: | FORK | order-0 | order-9 | | Ampere Altra |------------------------|------------------------| | (pte-map) | mean | stdev | mean | stdev | |----------------|------------|-----------|------------|-----------| | baseline | 0.0% | 2.7% | 0.0% | 0.2% | | contpte-basic | 6.3% | 1.4% | 1948.7% | 0.2% | | contpte-batch | 7.6% | 2.0% | -1.9% | 0.4% | | contpte-inline | 3.6% | 1.5% | -1.0% | 0.2% | | contpte-fold | 4.6% | 2.1% | -1.8% | 0.2% | | MUNMAP | order-0 | order-9 | | Ampere Altra |------------------------|------------------------| | (pte-map) | mean | stdev | mean | stdev | |----------------|------------|-----------|------------|-----------| | baseline | 0.0% | 0.5% | 0.0% | 0.3% | | contpte-basic | 1.8% | 0.3% | 1104.8% | 0.1% | | contpte-batch | -0.3% | 0.4% | 2.7% | 0.1% | | contpte-inline | -0.1% | 0.6% | 0.9% | 0.1% | | contpte-fold | 0.1% | 0.6% | 0.8% | 0.1% | | FORK | order-0 | order-9 | | Apple M2 VM |------------------------|------------------------| | (pte-map) | mean | stdev | mean | stdev | |----------------|------------|-----------|------------|-----------| | baseline | 0.0% | 1.4% | 0.0% | 0.8% | | contpte-basic | 6.8% | 1.2% | 469.4% | 1.4% | | contpte-batch | -7.7% | 2.0% | -8.9% | 0.7% | | contpte-inline | -6.0% | 2.1% | -6.0% | 2.0% | | contpte-fold | 5.9% | 1.4% | -6.4% | 1.4% | | MUNMAP | order-0 | order-9 | | Apple M2 VM |------------------------|------------------------| | (pte-map) | mean | stdev | mean | stdev | |----------------|------------|-----------|------------|-----------| | baseline | 0.0% | 0.6% | 0.0% | 0.4% | | contpte-basic | 1.6% | 0.6% | 233.6% | 0.7% | | contpte-batch | 1.9% | 0.3% | -3.9% | 0.4% | | contpte-inline | 2.2% | 0.8% | -1.6% | 0.9% | | contpte-fold | 1.5% | 0.7% | -1.7% | 0.7% | Misc ~~~~ John Hubbard at Nvidia has indicated dramatic 10x performance improvements for some workloads at [8], when using 64K base page kernel. [1] https://lore.kernel.org/linux-arm-kernel/20230622144210.2623299-1-ryan.roberts@arm.com/ [2] https://lore.kernel.org/linux-arm-kernel/20231115163018.1303287-1-ryan.roberts@arm.com/ [3] https://lore.kernel.org/linux-arm-kernel/20231204105440.61448-1-ryan.roberts@arm.com/ [4] https://lore.kernel.org/lkml/20231218105100.172635-1-ryan.roberts@arm.com/ [5] https://lore.kernel.org/linux-mm/633af0a7-0823-424f-b6ef-374d99483f05@arm.com/ [6] https://lore.kernel.org/lkml/08c16f7d-f3b3-4f22-9acc-da943f647dc3@arm.com/ [7] https://lore.kernel.org/linux-mm/20240214204435.167852-1-david@redhat.com/ [8] https://lore.kernel.org/linux-mm/c507308d-bdd4-5f9e-d4ff-e96e4520be85@nvidia.com/ [9] https://gitlab.arm.com/linux-arm/linux-rr/-/tree/features/granule_perf/contpte-lkml_v6 This patch (of 18): set_ptes() spec implies that it can only be used to set a present pte because it interprets the PFN field to increment it. However, set_pte_at() has been implemented on top of set_ptes() since set_ptes() was introduced, and set_pte_at() allows setting a pte to a not-present state. So clarify the spec to state that when nr==1, new state of pte may be present or not present. When nr>1, new state of all ptes must be present. While we are at it, tighten the spec to set requirements around the initial state of ptes; when nr==1 it may be either present or not-present. But when nr>1 all ptes must initially be not-present. All set_ptes() callsites already conform to this requirement. Stating it explicitly is useful because it allows for a simplification to the upcoming arm64 contpte implementation. Link: https://lkml.kernel.org/r/20240215103205.2607016-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20240215103205.2607016-2-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: David Hildenbrand Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Mark Rutland Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 6280d7317ccae19c776a3b6cf9848c964f958091) Signed-off-by: Koba Ko --- include/linux/pgtable.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 6b85a95ed7307..76253ef059a2e 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -226,6 +226,10 @@ static inline pte_t pte_next_pfn(pte_t pte) * @pte: Page table entry for the first page. * @nr: Number of pages to map. * + * When nr==1, initial state of pte may be present or not present, and new state + * may be present or not present. When nr>1, initial state of all ptes must be + * not present, and new state must be present. + * * May be overridden by the architecture, or the architecture can define * set_pte() and PFN_PTE_SHIFT. * From 534c04dde1b4273fcc12efc638ed31dea37c85ed Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:49 +0000 Subject: [PATCH 212/548] mm: thp: batch-collapse PMD with set_ptes() Refactor __split_huge_pmd_locked() so that a present PMD can be collapsed to PTEs in a single batch using set_ptes(). This should improve performance a little bit, but the real motivation is to remove the need for the arm64 backend to have to fold the contpte entries. Instead, since the ptes are set as a batch, the contpte blocks can be initially set up pre-folded (once the arm64 contpte support is added in the next few patches). This leads to noticeable performance improvement during split. Link: https://lkml.kernel.org/r/20240215103205.2607016-3-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: David Hildenbrand Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Mark Rutland Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 2bdba9868a4ffcb1492db7272f34b54387910177) Signed-off-by: Koba Ko --- mm/huge_memory.c | 58 +++++++++++++++++++++++++++--------------------- 1 file changed, 33 insertions(+), 25 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index be5ff6c41e450..43e54cfa26197 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2398,15 +2398,16 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, pte = pte_offset_map(&_pmd, haddr); VM_BUG_ON(!pte); - for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) { - pte_t entry; - /* - * Note that NUMA hinting access restrictions are not - * transferred to avoid any possibility of altering - * permissions across VMAs. - */ - if (freeze || pmd_migration) { + + /* + * Note that NUMA hinting access restrictions are not transferred to + * avoid any possibility of altering permissions across VMAs. + */ + if (freeze || pmd_migration) { + for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) { + pte_t entry; swp_entry_t swp_entry; + if (write) swp_entry = make_writable_migration_entry( page_to_pfn(page + i)); @@ -2425,25 +2426,32 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = pte_swp_mksoft_dirty(entry); if (uffd_wp) entry = pte_swp_mkuffd_wp(entry); - } else { - entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot)); - if (write) - entry = pte_mkwrite(entry, vma); - if (!young) - entry = pte_mkold(entry); - /* NOTE: this may set soft-dirty too on some archs */ - if (dirty) - entry = pte_mkdirty(entry); - if (soft_dirty) - entry = pte_mksoft_dirty(entry); - if (uffd_wp) - entry = pte_mkuffd_wp(entry); + + VM_WARN_ON(!pte_none(ptep_get(pte + i))); + set_pte_at(mm, addr, pte + i, entry); } - VM_BUG_ON(!pte_none(ptep_get(pte))); - set_pte_at(mm, addr, pte, entry); - pte++; + } else { + pte_t entry; + + entry = mk_pte(page, READ_ONCE(vma->vm_page_prot)); + if (write) + entry = pte_mkwrite(entry, vma); + if (!young) + entry = pte_mkold(entry); + /* NOTE: this may set soft-dirty too on some archs */ + if (dirty) + entry = pte_mkdirty(entry); + if (soft_dirty) + entry = pte_mksoft_dirty(entry); + if (uffd_wp) + entry = pte_mkuffd_wp(entry); + + for (i = 0; i < HPAGE_PMD_NR; i++) + VM_WARN_ON(!pte_none(ptep_get(pte + i))); + + set_ptes(mm, haddr, pte, entry, HPAGE_PMD_NR); } - pte_unmap(pte - 1); + pte_unmap(pte); if (!pmd_migration) folio_remove_rmap_pmd(folio, page, vma); From b14fb16624cbe61b4e392fde811031c5260293e8 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:50 +0000 Subject: [PATCH 213/548] mm: introduce pte_advance_pfn() and use for pte_next_pfn() The goal is to be able to advance a PTE by an arbitrary number of PFNs. So introduce a new API that takes a nr param. Define the default implementation here and allow for architectures to override. pte_next_pfn() becomes a wrapper around pte_advance_pfn(). Follow up commits will convert each overriding architecture's pte_next_pfn() to pte_advance_pfn(). Link: https://lkml.kernel.org/r/20240215103205.2607016-4-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: David Hildenbrand Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Mark Rutland Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 583ceaaa339960e673ac0029f323bb1c6ffc96d7) Signed-off-by: Koba Ko --- include/linux/pgtable.h | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 76253ef059a2e..cac6469b791a6 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -209,14 +209,17 @@ static inline int pmd_young(pmd_t pmd) #define arch_flush_lazy_mmu_mode() do {} while (0) #endif - #ifndef pte_next_pfn -static inline pte_t pte_next_pfn(pte_t pte) +#ifndef pte_advance_pfn +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) { - return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT)); + return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT)); } #endif +#define pte_next_pfn(pte) pte_advance_pfn(pte, 1) +#endif + #ifndef set_ptes /** * set_ptes - Map consecutive pages to a contiguous range of addresses. From b3d48dcb1f9ee2da8665f8d154ca34d1432961b0 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:51 +0000 Subject: [PATCH 214/548] arm64/mm: convert pte_next_pfn() to pte_advance_pfn() Core-mm needs to be able to advance the pfn by an arbitrary amount, so override the new pte_advance_pfn() API to do so. Link: https://lkml.kernel.org/r/20240215103205.2607016-5-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: David Hildenbrand Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit c1bd2b4028ae5b4d2ada64b31c40cc44cdf00972) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 225c026f0d74e..8bb79d2461729 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -351,10 +351,10 @@ static inline pgprot_t pte_pgprot(pte_t pte) return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte)); } -#define pte_next_pfn pte_next_pfn -static inline pte_t pte_next_pfn(pte_t pte) +#define pte_advance_pfn pte_advance_pfn +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) { - return pfn_pte(pte_pfn(pte) + 1, pte_pgprot(pte)); + return pfn_pte(pte_pfn(pte) + nr, pte_pgprot(pte)); } static inline void set_ptes(struct mm_struct *mm, unsigned long addr, @@ -369,7 +369,7 @@ static inline void set_ptes(struct mm_struct *mm, unsigned long addr, if (--nr == 0) break; ptep++; - pte = pte_next_pfn(pte); + pte = pte_advance_pfn(pte, 1); } } #define set_ptes set_ptes From 8dd6fb5317836819d4fece00d3c3d43d6f6e1107 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:52 +0000 Subject: [PATCH 215/548] x86/mm: convert pte_next_pfn() to pte_advance_pfn() Core-mm needs to be able to advance the pfn by an arbitrary amount, so override the new pte_advance_pfn() API to do so. Link: https://lkml.kernel.org/r/20240215103205.2607016-6-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Reviewed-by: David Hildenbrand Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Mark Rutland Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 506b586769ecef8c83fff64de227e7fa84b7be42) Signed-off-by: Koba Ko --- arch/x86/include/asm/pgtable.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index d03fe4fb41f43..993d49cd379a2 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -939,13 +939,13 @@ static inline int pte_same(pte_t a, pte_t b) return a.pte == b.pte; } -static inline pte_t pte_next_pfn(pte_t pte) +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) { if (__pte_needs_invert(pte_val(pte))) - return __pte(pte_val(pte) - (1UL << PFN_PTE_SHIFT)); - return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT)); + return __pte(pte_val(pte) - (nr << PFN_PTE_SHIFT)); + return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT)); } -#define pte_next_pfn pte_next_pfn +#define pte_advance_pfn pte_advance_pfn static inline int pte_present(pte_t a) { From 633af117212336d2e346c7eddacc10425de1bb42 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:53 +0000 Subject: [PATCH 216/548] mm: tidy up pte_next_pfn() definition Now that the all architecture overrides of pte_next_pfn() have been replaced with pte_advance_pfn(), we can simplify the definition of the generic pte_next_pfn() macro so that it is unconditionally defined. Link: https://lkml.kernel.org/r/20240215103205.2607016-7-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: David Hildenbrand Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Mark Rutland Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit fb23bf6bd288db3187c27b971e558a3e9f70ae96) Signed-off-by: Koba Ko --- include/linux/pgtable.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index cac6469b791a6..72fbc15f3e991 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -209,7 +209,6 @@ static inline int pmd_young(pmd_t pmd) #define arch_flush_lazy_mmu_mode() do {} while (0) #endif -#ifndef pte_next_pfn #ifndef pte_advance_pfn static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) { @@ -218,7 +217,6 @@ static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) #endif #define pte_next_pfn(pte) pte_advance_pfn(pte, 1) -#endif #ifndef set_ptes /** From fbb5ed6f1064a415af426cf9a271287096af388c Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:54 +0000 Subject: [PATCH 217/548] arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep) There are a number of places in the arch code that read a pte by using the READ_ONCE() macro. Refactor these call sites to instead use the ptep_get() helper, which itself is a READ_ONCE(). Generated code should be the same. This will benefit us when we shortly introduce the transparent contpte support. In this case, ptep_get() will become more complex so we now have all the code abstracted through it. Link: https://lkml.kernel.org/r/20240215103205.2607016-8-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 532736558e8ef2865eae1d84b52dda4422cac810) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 12 +++++++++--- arch/arm64/kernel/efi.c | 2 +- arch/arm64/mm/fault.c | 4 ++-- arch/arm64/mm/hugetlbpage.c | 6 +++--- arch/arm64/mm/kasan_init.c | 2 +- arch/arm64/mm/mmu.c | 12 ++++++------ arch/arm64/mm/pageattr.c | 4 ++-- arch/arm64/mm/trans_pgd.c | 2 +- 8 files changed, 25 insertions(+), 19 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 8bb79d2461729..9513e958cbbc8 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -275,6 +275,12 @@ static inline void set_pte(pte_t *ptep, pte_t pte) } } +#define ptep_get ptep_get +static inline pte_t ptep_get(pte_t *ptep) +{ + return READ_ONCE(*ptep); +} + extern void __sync_icache_dcache(pte_t pteval); bool pgattr_change_is_safe(u64 old, u64 new); @@ -302,7 +308,7 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep, if (!IS_ENABLED(CONFIG_DEBUG_VM)) return; - old_pte = READ_ONCE(*ptep); + old_pte = ptep_get(ptep); if (!pte_valid(old_pte) || !pte_valid(pte)) return; @@ -903,7 +909,7 @@ static inline int __ptep_test_and_clear_young(pte_t *ptep) { pte_t old_pte, pte; - pte = READ_ONCE(*ptep); + pte = ptep_get(ptep); do { old_pte = pte; pte = pte_mkold(pte); @@ -985,7 +991,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres { pte_t old_pte, pte; - pte = READ_ONCE(*ptep); + pte = ptep_get(ptep); do { old_pte = pte; pte = pte_wrprotect(pte); diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 2b478ca356b00..e72d62416b1a0 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -107,7 +107,7 @@ static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data) { struct set_perm_data *spd = data; const efi_memory_desc_t *md = spd->md; - pte_t pte = READ_ONCE(*ptep); + pte_t pte = ptep_get(ptep); if (md->attribute & EFI_MEMORY_RO) pte = set_pte_bit(pte, __pgprot(PTE_RDONLY)); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 2e5d1e238af95..3651fcb41f7e6 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -191,7 +191,7 @@ static void show_pte(unsigned long addr) if (!ptep) break; - pte = READ_ONCE(*ptep); + pte = ptep_get(ptep); pr_cont(", pte=%016llx", pte_val(pte)); pte_unmap(ptep); } while(0); @@ -214,7 +214,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, pte_t entry, int dirty) { pteval_t old_pteval, pteval; - pte_t pte = READ_ONCE(*ptep); + pte_t pte = ptep_get(ptep); if (pte_same(pte, entry)) return 0; diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 06efc3a1652eb..a5a56e30c254b 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -479,7 +479,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm, size_t pgsize; pte_t pte; - if (!pte_cont(READ_ONCE(*ptep))) { + if (!pte_cont(ptep_get(ptep))) { ptep_set_wrprotect(mm, addr, ptep); return; } @@ -504,7 +504,7 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, size_t pgsize; int ncontig; - if (!pte_cont(READ_ONCE(*ptep))) + if (!pte_cont(ptep_get(ptep))) return ptep_clear_flush(vma, addr, ptep); ncontig = find_num_contig(mm, addr, ptep, &pgsize); @@ -552,7 +552,7 @@ pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr * when the permission changes from executable to non-executable * in cases where cpu is affected with errata #2645198. */ - if (pte_user_exec(READ_ONCE(*ptep))) + if (pte_user_exec(ptep_get(ptep))) return huge_ptep_clear_flush(vma, addr, ptep); } return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, psize); diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c index f17d066e85eb8..1b96e0ad66610 100644 --- a/arch/arm64/mm/kasan_init.c +++ b/arch/arm64/mm/kasan_init.c @@ -113,7 +113,7 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr, memset(__va(page_phys), KASAN_SHADOW_INIT, PAGE_SIZE); next = addr + PAGE_SIZE; set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL)); - } while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep))); + } while (ptep++, addr = next, addr != end && pte_none(ptep_get(ptep))); } static void __init kasan_pmd_populate(pud_t *pudp, unsigned long addr, diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index c8e83fe1cd5a7..3f54a9a0010fc 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -176,7 +176,7 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end, ptep = pte_set_fixmap_offset(pmdp, addr); do { - pte_t old_pte = READ_ONCE(*ptep); + pte_t old_pte = ptep_get(ptep); set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot)); @@ -185,7 +185,7 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end, * only allow updates to the permission attributes. */ BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), - READ_ONCE(pte_val(*ptep)))); + pte_val(ptep_get(ptep)))); phys += PAGE_SIZE; } while (ptep++, addr += PAGE_SIZE, addr != end); @@ -854,7 +854,7 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr, do { ptep = pte_offset_kernel(pmdp, addr); - pte = READ_ONCE(*ptep); + pte = ptep_get(ptep); if (pte_none(pte)) continue; @@ -987,7 +987,7 @@ static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr, do { ptep = pte_offset_kernel(pmdp, addr); - pte = READ_ONCE(*ptep); + pte = ptep_get(ptep); /* * This is just a sanity check here which verifies that @@ -1006,7 +1006,7 @@ static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr, */ ptep = pte_offset_kernel(pmdp, 0UL); for (i = 0; i < PTRS_PER_PTE; i++) { - if (!pte_none(READ_ONCE(ptep[i]))) + if (!pte_none(ptep_get(&ptep[i]))) return; } @@ -1481,7 +1481,7 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte * when the permission changes from executable to non-executable * in cases where cpu is affected with errata #2645198. */ - if (pte_user_exec(READ_ONCE(*ptep))) + if (pte_user_exec(ptep_get(ptep))) return ptep_clear_flush(vma, addr, ptep); } return ptep_get_and_clear(vma->vm_mm, addr, ptep); diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c index 0a62f458c5cb0..e0e35bd942222 100644 --- a/arch/arm64/mm/pageattr.c +++ b/arch/arm64/mm/pageattr.c @@ -36,7 +36,7 @@ bool can_set_direct_map(void) static int change_page_range(pte_t *ptep, unsigned long addr, void *data) { struct page_change_data *cdata = data; - pte_t pte = READ_ONCE(*ptep); + pte_t pte = ptep_get(ptep); pte = clear_pte_bit(pte, cdata->clear_mask); pte = set_pte_bit(pte, cdata->set_mask); @@ -242,5 +242,5 @@ bool kernel_page_present(struct page *page) return true; ptep = pte_offset_kernel(pmdp, addr); - return pte_valid(READ_ONCE(*ptep)); + return pte_valid(ptep_get(ptep)); } diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c index 7b14df3c64776..f71ab4704cce7 100644 --- a/arch/arm64/mm/trans_pgd.c +++ b/arch/arm64/mm/trans_pgd.c @@ -33,7 +33,7 @@ static void *trans_alloc(struct trans_pgd_info *info) static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) { - pte_t pte = READ_ONCE(*src_ptep); + pte_t pte = ptep_get(src_ptep); if (pte_valid(pte)) { /* From 59f1e0d4e482f11ffaca12f2791799fec00c22ce Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:55 +0000 Subject: [PATCH 218/548] arm64/mm: convert set_pte_at() to set_ptes(..., 1) Since set_ptes() was introduced, set_pte_at() has been implemented as a generic macro around set_ptes(..., 1). So this change should continue to generate the same code. However, making this change prepares us for the transparent contpte support. It means we can reroute set_ptes() to __set_ptes(). Since set_pte_at() is a generic macro, there will be no equivalent __set_pte_at() to reroute to. Note that a couple of calls to set_pte_at() remain in the arch code. This is intentional, since those call sites are acting on behalf of core-mm and should continue to call into the public set_ptes() rather than the arch-private __set_ptes(). Link: https://lkml.kernel.org/r/20240215103205.2607016-9-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 659e193027910a5d3083e34b488ab459d2ef5082) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 2 +- arch/arm64/kernel/mte.c | 2 +- arch/arm64/kvm/guest.c | 2 +- arch/arm64/mm/fault.c | 2 +- arch/arm64/mm/hugetlbpage.c | 10 +++++----- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 9513e958cbbc8..2200395af83f4 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1083,7 +1083,7 @@ static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) #endif /* CONFIG_ARM64_MTE */ /* - * On AArch64, the cache coherency is handled via the set_pte_at() function. + * On AArch64, the cache coherency is handled via the set_ptes() function. */ static inline void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 2fb5e7a7a4d5e..b99b718164edf 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -67,7 +67,7 @@ int memcmp_pages(struct page *page1, struct page *page2) /* * If the page content is identical but at least one of the pages is * tagged, return non-zero to avoid KSM merging. If only one of the - * pages is tagged, set_pte_at() may zero or change the tags of the + * pages is tagged, set_ptes() may zero or change the tags of the * other page via mte_sync_tags(). */ if (page_mte_tagged(page1) || page_mte_tagged(page2)) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index efe82cc86bd1f..ce238ef9e1133 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -1073,7 +1073,7 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, } else { /* * Only locking to serialise with a concurrent - * set_pte_at() in the VMM but still overriding the + * set_ptes() in the VMM but still overriding the * tags, hence ignoring the return value. */ try_page_mte_tagging(page); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 3651fcb41f7e6..5239430cd5f2c 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -205,7 +205,7 @@ static void show_pte(unsigned long addr) * * It needs to cope with hardware update of the accessed/dirty state by other * agents in the system and can safely skip the __sync_icache_dcache() call as, - * like set_pte_at(), the PTE is never changed from no-exec to exec here. + * like set_ptes(), the PTE is never changed from no-exec to exec here. * * Returns whether or not the PTE actually changed. */ diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index a5a56e30c254b..398310ce91961 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -246,12 +246,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, if (!pte_present(pte)) { for (i = 0; i < ncontig; i++, ptep++, addr += pgsize) - set_pte_at(mm, addr, ptep, pte); + set_ptes(mm, addr, ptep, pte, 1); return; } if (!pte_cont(pte)) { - set_pte_at(mm, addr, ptep, pte); + set_ptes(mm, addr, ptep, pte, 1); return; } @@ -262,7 +262,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, clear_flush(mm, addr, ptep, pgsize, ncontig); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot)); + set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); } pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, @@ -465,7 +465,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, hugeprot = pte_pgprot(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot)); + set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); return 1; } @@ -494,7 +494,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm, pfn = pte_pfn(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot)); + set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); } pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, From e0de58ac1cfda0a7410377cb549eb68749fa4041 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:56 +0000 Subject: [PATCH 219/548] arm64/mm: convert ptep_clear() to ptep_get_and_clear() ptep_clear() is a generic wrapper around the arch-implemented ptep_get_and_clear(). We are about to convert ptep_get_and_clear() into a public version and private version (__ptep_get_and_clear()) to support the transparent contpte work. We won't have a private version of ptep_clear() so let's convert it to directly call ptep_get_and_clear(). Link: https://lkml.kernel.org/r/20240215103205.2607016-10-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit cbb0294fdd72a5f63ec59fad5c0a98d63bd572fc) Signed-off-by: Koba Ko --- arch/arm64/mm/hugetlbpage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 398310ce91961..688d11b1cabdb 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -228,7 +228,7 @@ static void clear_flush(struct mm_struct *mm, unsigned long i, saddr = addr; for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) - ptep_clear(mm, addr, ptep); + ptep_get_and_clear(mm, addr, ptep); flush_tlb_range(&vma, saddr, addr); } From 6602cb32a6dd4469a842decc19d7a90ba7eafc15 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:57 +0000 Subject: [PATCH 220/548] arm64/mm: new ptep layer to manage contig bit Create a new layer for the in-table PTE manipulation APIs. For now, The existing API is prefixed with double underscore to become the arch-private API and the public API is just a simple wrapper that calls the private API. The public API implementation will subsequently be used to transparently manipulate the contiguous bit where appropriate. But since there are already some contig-aware users (e.g. hugetlb, kernel mapper), we must first ensure those users use the private API directly so that the future contig-bit manipulations in the public API do not interfere with those existing uses. The following APIs are treated this way: - ptep_get - set_pte - set_ptes - pte_clear - ptep_get_and_clear - ptep_test_and_clear_young - ptep_clear_flush_young - ptep_set_wrprotect - ptep_set_access_flags Link: https://lkml.kernel.org/r/20240215103205.2607016-11-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (backported from commit 5a00bfd6a52cf31e93d5f1b734087deb32a3cffa) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 80 ++++++++++++++++++-------------- arch/arm64/kernel/efi.c | 4 +- arch/arm64/kernel/mte.c | 2 +- arch/arm64/kvm/guest.c | 2 +- arch/arm64/mm/fault.c | 12 ++--- arch/arm64/mm/fixmap.c | 4 +- arch/arm64/mm/hugetlbpage.c | 32 ++++++------- arch/arm64/mm/kasan_init.c | 6 +-- arch/arm64/mm/mmu.c | 14 +++--- arch/arm64/mm/pageattr.c | 6 +-- arch/arm64/mm/trans_pgd.c | 6 +-- 11 files changed, 88 insertions(+), 80 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 2200395af83f4..fd13489c9515d 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -93,7 +93,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys) __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)) #define pte_none(pte) (!pte_val(pte)) -#define pte_clear(mm,addr,ptep) set_pte(ptep, __pte(0)) +#define __pte_clear(mm, addr, ptep) \ + __set_pte(ptep, __pte(0)) #define pte_page(pte) (pfn_to_page(pte_pfn(pte))) /* @@ -137,7 +138,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys) * so that we don't erroneously return false for pages that have been * remapped as PROT_NONE but are yet to be flushed from the TLB. * Note that we can't make any assumptions based on the state of the access - * flag, since ptep_clear_flush_young() elides a DSB when invalidating the + * flag, since __ptep_clear_flush_young() elides a DSB when invalidating the * TLB. */ #define pte_accessible(mm, pte) \ @@ -261,7 +262,7 @@ static inline pte_t pte_mkdevmap(pte_t pte) return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL)); } -static inline void set_pte(pte_t *ptep, pte_t pte) +static inline void __set_pte(pte_t *ptep, pte_t pte) { WRITE_ONCE(*ptep, pte); @@ -275,8 +276,7 @@ static inline void set_pte(pte_t *ptep, pte_t pte) } } -#define ptep_get ptep_get -static inline pte_t ptep_get(pte_t *ptep) +static inline pte_t __ptep_get(pte_t *ptep) { return READ_ONCE(*ptep); } @@ -308,7 +308,7 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep, if (!IS_ENABLED(CONFIG_DEBUG_VM)) return; - old_pte = ptep_get(ptep); + old_pte = __ptep_get(ptep); if (!pte_valid(old_pte) || !pte_valid(pte)) return; @@ -317,7 +317,7 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep, /* * Check for potential race with hardware updates of the pte - * (ptep_set_access_flags safely changes valid ptes without going + * (__ptep_set_access_flags safely changes valid ptes without going * through an invalid entry). */ VM_WARN_ONCE(!pte_young(pte), @@ -363,7 +363,8 @@ static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) return pfn_pte(pte_pfn(pte) + nr, pte_pgprot(pte)); } -static inline void set_ptes(struct mm_struct *mm, unsigned long addr, +static inline void __set_ptes(struct mm_struct *mm, + unsigned long __always_unused addr, pte_t *ptep, pte_t pte, unsigned int nr) { page_table_check_ptes_set(mm, ptep, pte, nr); @@ -371,14 +372,13 @@ static inline void set_ptes(struct mm_struct *mm, unsigned long addr, for (;;) { __check_safe_pte_update(mm, ptep, pte); - set_pte(ptep, pte); + __set_pte(ptep, pte); if (--nr == 0) break; ptep++; pte = pte_advance_pfn(pte, 1); } } -#define set_ptes set_ptes /* * Huge pte definitions. @@ -544,7 +544,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, { __sync_cache_and_tags(pte, nr); __check_safe_pte_update(mm, ptep, pte); - set_pte(ptep, pte); + __set_pte(ptep, pte); } static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, @@ -859,8 +859,7 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); } -#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS -extern int ptep_set_access_flags(struct vm_area_struct *vma, +extern int __ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, pte_t *ptep, pte_t entry, int dirty); @@ -870,7 +869,8 @@ static inline int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp, pmd_t entry, int dirty) { - return ptep_set_access_flags(vma, address, (pte_t *)pmdp, pmd_pte(entry), dirty); + return __ptep_set_access_flags(vma, address, (pte_t *)pmdp, + pmd_pte(entry), dirty); } static inline int pud_devmap(pud_t pud) @@ -904,12 +904,13 @@ static inline bool pud_user_accessible_page(pud_t pud) /* * Atomic pte/pmd modifications. */ -#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG -static inline int __ptep_test_and_clear_young(pte_t *ptep) +static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, + pte_t *ptep) { pte_t old_pte, pte; - pte = ptep_get(ptep); + pte = __ptep_get(ptep); do { old_pte = pte; pte = pte_mkold(pte); @@ -920,18 +921,10 @@ static inline int __ptep_test_and_clear_young(pte_t *ptep) return pte_young(pte); } -static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, - unsigned long address, - pte_t *ptep) -{ - return __ptep_test_and_clear_young(ptep); -} - -#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH -static inline int ptep_clear_flush_young(struct vm_area_struct *vma, +static inline int __ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) { - int young = ptep_test_and_clear_young(vma, address, ptep); + int young = __ptep_test_and_clear_young(vma, address, ptep); if (young) { /* @@ -954,12 +947,11 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp) { - return ptep_test_and_clear_young(vma, address, (pte_t *)pmdp); + return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -#define __HAVE_ARCH_PTEP_GET_AND_CLEAR -static inline pte_t ptep_get_and_clear(struct mm_struct *mm, +static inline pte_t __ptep_get_and_clear(struct mm_struct *mm, unsigned long address, pte_t *ptep) { pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0)); @@ -983,15 +975,15 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ /* - * ptep_set_wrprotect - mark read-only while trasferring potential hardware + * __ptep_set_wrprotect - mark read-only while trasferring potential hardware * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit. */ -#define __HAVE_ARCH_PTEP_SET_WRPROTECT -static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep) +static inline void __ptep_set_wrprotect(struct mm_struct *mm, + unsigned long address, pte_t *ptep) { pte_t old_pte, pte; - pte = ptep_get(ptep); + pte = __ptep_get(ptep); do { old_pte = pte; pte = pte_wrprotect(pte); @@ -1005,7 +997,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long address, pmd_t *pmdp) { - ptep_set_wrprotect(mm, address, (pte_t *)pmdp); + __ptep_set_wrprotect(mm, address, (pte_t *)pmdp); } #define pmdp_establish pmdp_establish @@ -1083,7 +1075,7 @@ static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) #endif /* CONFIG_ARM64_MTE */ /* - * On AArch64, the cache coherency is handled via the set_ptes() function. + * On AArch64, the cache coherency is handled via the __set_ptes() function. */ static inline void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, @@ -1135,6 +1127,22 @@ extern pte_t ptep_modify_prot_start(struct vm_area_struct *vma, extern void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t old_pte, pte_t new_pte); + +#define ptep_get __ptep_get +#define set_pte __set_pte +#define set_ptes __set_ptes +#define pte_clear __pte_clear +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR +#define ptep_get_and_clear __ptep_get_and_clear +#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG +#define ptep_test_and_clear_young __ptep_test_and_clear_young +#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH +#define ptep_clear_flush_young __ptep_clear_flush_young +#define __HAVE_ARCH_PTEP_SET_WRPROTECT +#define ptep_set_wrprotect __ptep_set_wrprotect +#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS +#define ptep_set_access_flags __ptep_set_access_flags + #endif /* !__ASSEMBLY__ */ #endif /* __ASM_PGTABLE_H */ diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index e72d62416b1a0..89d104c0bce65 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -107,7 +107,7 @@ static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data) { struct set_perm_data *spd = data; const efi_memory_desc_t *md = spd->md; - pte_t pte = ptep_get(ptep); + pte_t pte = __ptep_get(ptep); if (md->attribute & EFI_MEMORY_RO) pte = set_pte_bit(pte, __pgprot(PTE_RDONLY)); @@ -116,7 +116,7 @@ static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data) else if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) && system_supports_bti() && spd->has_bti) pte = set_pte_bit(pte, __pgprot(PTE_GP)); - set_pte(ptep, pte); + __set_pte(ptep, pte); return 0; } diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index b99b718164edf..cea96ee75d22d 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -67,7 +67,7 @@ int memcmp_pages(struct page *page1, struct page *page2) /* * If the page content is identical but at least one of the pages is * tagged, return non-zero to avoid KSM merging. If only one of the - * pages is tagged, set_ptes() may zero or change the tags of the + * pages is tagged, __set_ptes() may zero or change the tags of the * other page via mte_sync_tags(). */ if (page_mte_tagged(page1) || page_mte_tagged(page2)) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index ce238ef9e1133..135fcf3fc4bbe 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -1073,7 +1073,7 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, } else { /* * Only locking to serialise with a concurrent - * set_ptes() in the VMM but still overriding the + * __set_ptes() in the VMM but still overriding the * tags, hence ignoring the return value. */ try_page_mte_tagging(page); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 5239430cd5f2c..fced4d944bbd1 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -191,7 +191,7 @@ static void show_pte(unsigned long addr) if (!ptep) break; - pte = ptep_get(ptep); + pte = __ptep_get(ptep); pr_cont(", pte=%016llx", pte_val(pte)); pte_unmap(ptep); } while(0); @@ -205,16 +205,16 @@ static void show_pte(unsigned long addr) * * It needs to cope with hardware update of the accessed/dirty state by other * agents in the system and can safely skip the __sync_icache_dcache() call as, - * like set_ptes(), the PTE is never changed from no-exec to exec here. + * like __set_ptes(), the PTE is never changed from no-exec to exec here. * * Returns whether or not the PTE actually changed. */ -int ptep_set_access_flags(struct vm_area_struct *vma, - unsigned long address, pte_t *ptep, - pte_t entry, int dirty) +int __ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep, + pte_t entry, int dirty) { pteval_t old_pteval, pteval; - pte_t pte = ptep_get(ptep); + pte_t pte = __ptep_get(ptep); if (pte_same(pte, entry)) return 0; diff --git a/arch/arm64/mm/fixmap.c b/arch/arm64/mm/fixmap.c index c0a3301203bdf..bfc02568805ae 100644 --- a/arch/arm64/mm/fixmap.c +++ b/arch/arm64/mm/fixmap.c @@ -121,9 +121,9 @@ void __set_fixmap(enum fixed_addresses idx, ptep = fixmap_pte(addr); if (pgprot_val(flags)) { - set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, flags)); + __set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, flags)); } else { - pte_clear(&init_mm, addr, ptep); + __pte_clear(&init_mm, addr, ptep); flush_tlb_kernel_range(addr, addr+PAGE_SIZE); } } diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 688d11b1cabdb..3c9e5bac3ae28 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -145,14 +145,14 @@ pte_t huge_ptep_get(pte_t *ptep) { int ncontig, i; size_t pgsize; - pte_t orig_pte = ptep_get(ptep); + pte_t orig_pte = __ptep_get(ptep); if (!pte_present(orig_pte) || !pte_cont(orig_pte)) return orig_pte; ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize); for (i = 0; i < ncontig; i++, ptep++) { - pte_t pte = ptep_get(ptep); + pte_t pte = __ptep_get(ptep); if (pte_dirty(pte)) orig_pte = pte_mkdirty(orig_pte); @@ -228,7 +228,7 @@ static void clear_flush(struct mm_struct *mm, unsigned long i, saddr = addr; for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) - ptep_get_and_clear(mm, addr, ptep); + __ptep_get_and_clear(mm, addr, ptep); flush_tlb_range(&vma, saddr, addr); } @@ -246,12 +246,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, if (!pte_present(pte)) { for (i = 0; i < ncontig; i++, ptep++, addr += pgsize) - set_ptes(mm, addr, ptep, pte, 1); + __set_ptes(mm, addr, ptep, pte, 1); return; } if (!pte_cont(pte)) { - set_ptes(mm, addr, ptep, pte, 1); + __set_ptes(mm, addr, ptep, pte, 1); return; } @@ -262,7 +262,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, clear_flush(mm, addr, ptep, pgsize, ncontig); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); + __set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); } pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, @@ -392,7 +392,7 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long addr, ncontig = num_contig_ptes(sz, &pgsize); for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) - pte_clear(mm, addr, ptep); + __pte_clear(mm, addr, ptep); } pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, @@ -418,11 +418,11 @@ static int __cont_access_flags_changed(pte_t *ptep, pte_t pte, int ncontig) { int i; - if (pte_write(pte) != pte_write(ptep_get(ptep))) + if (pte_write(pte) != pte_write(__ptep_get(ptep))) return 1; for (i = 0; i < ncontig; i++) { - pte_t orig_pte = ptep_get(ptep + i); + pte_t orig_pte = __ptep_get(ptep + i); if (pte_dirty(pte) != pte_dirty(orig_pte)) return 1; @@ -446,7 +446,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, pte_t orig_pte; if (!pte_cont(pte)) - return ptep_set_access_flags(vma, addr, ptep, pte, dirty); + return __ptep_set_access_flags(vma, addr, ptep, pte, dirty); ncontig = find_num_contig(mm, addr, ptep, &pgsize); dpfn = pgsize >> PAGE_SHIFT; @@ -465,7 +465,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, hugeprot = pte_pgprot(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); + __set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); return 1; } @@ -479,8 +479,8 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm, size_t pgsize; pte_t pte; - if (!pte_cont(ptep_get(ptep))) { - ptep_set_wrprotect(mm, addr, ptep); + if (!pte_cont(__ptep_get(ptep))) { + __ptep_set_wrprotect(mm, addr, ptep); return; } @@ -494,7 +494,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm, pfn = pte_pfn(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); + __set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); } pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, @@ -504,7 +504,7 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, size_t pgsize; int ncontig; - if (!pte_cont(ptep_get(ptep))) + if (!pte_cont(__ptep_get(ptep))) return ptep_clear_flush(vma, addr, ptep); ncontig = find_num_contig(mm, addr, ptep, &pgsize); @@ -552,7 +552,7 @@ pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr * when the permission changes from executable to non-executable * in cases where cpu is affected with errata #2645198. */ - if (pte_user_exec(ptep_get(ptep))) + if (pte_user_exec(__ptep_get(ptep))) return huge_ptep_clear_flush(vma, addr, ptep); } return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, psize); diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c index 1b96e0ad66610..28856f511fb63 100644 --- a/arch/arm64/mm/kasan_init.c +++ b/arch/arm64/mm/kasan_init.c @@ -112,8 +112,8 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr, if (!early) memset(__va(page_phys), KASAN_SHADOW_INIT, PAGE_SIZE); next = addr + PAGE_SIZE; - set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL)); - } while (ptep++, addr = next, addr != end && pte_none(ptep_get(ptep))); + __set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL)); + } while (ptep++, addr = next, addr != end && pte_none(__ptep_get(ptep))); } static void __init kasan_pmd_populate(pud_t *pudp, unsigned long addr, @@ -266,7 +266,7 @@ static void __init kasan_init_shadow(void) * so we should make sure that it maps the zero page read-only. */ for (i = 0; i < PTRS_PER_PTE; i++) - set_pte(&kasan_early_shadow_pte[i], + __set_pte(&kasan_early_shadow_pte[i], pfn_pte(sym_to_pfn(kasan_early_shadow_page), PAGE_KERNEL_RO)); diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 3f54a9a0010fc..1839847c5a85c 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -176,16 +176,16 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end, ptep = pte_set_fixmap_offset(pmdp, addr); do { - pte_t old_pte = ptep_get(ptep); + pte_t old_pte = __ptep_get(ptep); - set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot)); + __set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot)); /* * After the PTE entry has been populated once, we * only allow updates to the permission attributes. */ BUG_ON(!pgattr_change_is_safe(pte_val(old_pte), - pte_val(ptep_get(ptep)))); + pte_val(__ptep_get(ptep)))); phys += PAGE_SIZE; } while (ptep++, addr += PAGE_SIZE, addr != end); @@ -854,12 +854,12 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr, do { ptep = pte_offset_kernel(pmdp, addr); - pte = ptep_get(ptep); + pte = __ptep_get(ptep); if (pte_none(pte)) continue; WARN_ON(!pte_present(pte)); - pte_clear(&init_mm, addr, ptep); + __pte_clear(&init_mm, addr, ptep); flush_tlb_kernel_range(addr, addr + PAGE_SIZE); if (free_mapped) free_hotplug_page_range(pte_page(pte), @@ -987,7 +987,7 @@ static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr, do { ptep = pte_offset_kernel(pmdp, addr); - pte = ptep_get(ptep); + pte = __ptep_get(ptep); /* * This is just a sanity check here which verifies that @@ -1006,7 +1006,7 @@ static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr, */ ptep = pte_offset_kernel(pmdp, 0UL); for (i = 0; i < PTRS_PER_PTE; i++) { - if (!pte_none(ptep_get(&ptep[i]))) + if (!pte_none(__ptep_get(&ptep[i]))) return; } diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c index e0e35bd942222..0e270a1c51e64 100644 --- a/arch/arm64/mm/pageattr.c +++ b/arch/arm64/mm/pageattr.c @@ -36,12 +36,12 @@ bool can_set_direct_map(void) static int change_page_range(pte_t *ptep, unsigned long addr, void *data) { struct page_change_data *cdata = data; - pte_t pte = ptep_get(ptep); + pte_t pte = __ptep_get(ptep); pte = clear_pte_bit(pte, cdata->clear_mask); pte = set_pte_bit(pte, cdata->set_mask); - set_pte(ptep, pte); + __set_pte(ptep, pte); return 0; } @@ -242,5 +242,5 @@ bool kernel_page_present(struct page *page) return true; ptep = pte_offset_kernel(pmdp, addr); - return pte_valid(ptep_get(ptep)); + return pte_valid(__ptep_get(ptep)); } diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c index f71ab4704cce7..5139a28130c08 100644 --- a/arch/arm64/mm/trans_pgd.c +++ b/arch/arm64/mm/trans_pgd.c @@ -33,7 +33,7 @@ static void *trans_alloc(struct trans_pgd_info *info) static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) { - pte_t pte = ptep_get(src_ptep); + pte_t pte = __ptep_get(src_ptep); if (pte_valid(pte)) { /* @@ -41,7 +41,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) * read only (code, rodata). Clear the RDONLY bit from * the temporary mappings we use during restore. */ - set_pte(dst_ptep, pte_mkwrite_novma(pte)); + __set_pte(dst_ptep, pte_mkwrite_novma(pte)); } else if ((debug_pagealloc_enabled() || is_kfence_address((void *)addr)) && !pte_none(pte)) { /* @@ -55,7 +55,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr) */ BUG_ON(!pfn_valid(pte_pfn(pte))); - set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_novma(pte))); + __set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_novma(pte))); } } From 72b7e215046fb1498fd36d760f001f397b12015c Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:31:59 +0000 Subject: [PATCH 221/548] arm64/mm: wire up PTE_CONT for user mappings With the ptep API sufficiently refactored, we can now introduce a new "contpte" API layer, which transparently manages the PTE_CONT bit for user mappings. In this initial implementation, only suitable batches of PTEs, set via set_ptes(), are mapped with the PTE_CONT bit. Any subsequent modification of individual PTEs will cause an "unfold" operation to repaint the contpte block as individual PTEs before performing the requested operation. While, a modification of a single PTE could cause the block of PTEs to which it belongs to become eligible for "folding" into a contpte entry, "folding" is not performed in this initial implementation due to the costs of checking the requirements are met. Due to this, contpte mappings will degrade back to normal pte mappings over time if/when protections are changed. This will be solved in a future patch. Since a contpte block only has a single access and dirty bit, the semantic here changes slightly; when getting a pte (e.g. ptep_get()) that is part of a contpte mapping, the access and dirty information are pulled from the block (so all ptes in the block return the same access/dirty info). When changing the access/dirty info on a pte (e.g. ptep_set_access_flags()) that is part of a contpte mapping, this change will affect the whole contpte block. This is works fine in practice since we guarantee that only a single folio is mapped by a contpte block, and the core-mm tracks access/dirty information per folio. In order for the public functions, which used to be pure inline, to continue to be callable by modules, export all the contpte_* symbols that are now called by those public inline functions. The feature is enabled/disabled with the ARM64_CONTPTE Kconfig parameter at build time. It defaults to enabled as long as its dependency, TRANSPARENT_HUGEPAGE is also enabled. The core-mm depends upon TRANSPARENT_HUGEPAGE to be able to allocate large folios, so if its not enabled, then there is no chance of meeting the physical contiguity requirement for contpte mappings. Link: https://lkml.kernel.org/r/20240215103205.2607016-13-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: Ard Biesheuvel Tested-by: John Hubbard Acked-by: Mark Rutland Reviewed-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 4602e5757bcceb231c3a13c36c373ad4a750eddb) Signed-off-by: Koba Ko --- arch/arm64/Kconfig | 9 + arch/arm64/include/asm/pgtable.h | 167 ++++++++++++++++++ arch/arm64/mm/Makefile | 1 + arch/arm64/mm/contpte.c | 285 +++++++++++++++++++++++++++++++ include/linux/efi.h | 5 + 5 files changed, 467 insertions(+) create mode 100644 arch/arm64/mm/contpte.c diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 4ecba0690938c..43c5801cfdece 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2264,6 +2264,15 @@ config UNWIND_PATCH_PAC_INTO_SCS select UNWIND_TABLES select DYNAMIC_SCS +config ARM64_CONTPTE + bool "Contiguous PTE mappings for user memory" if EXPERT + depends on TRANSPARENT_HUGEPAGE + default y + help + When enabled, user mappings are configured using the PTE contiguous + bit, for any mappings that meet the size and alignment requirements. + This reduces TLB pressure and improves performance. + endmenu # "Kernel Features" menu "Boot options" diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index fd13489c9515d..ad2bd36eba13c 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -133,6 +133,10 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys) */ #define pte_valid_not_user(pte) \ ((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_UXN)) == (PTE_VALID | PTE_UXN)) +/* + * Returns true if the pte is valid and has the contiguous bit set. + */ +#define pte_valid_cont(pte) (pte_valid(pte) && pte_cont(pte)) /* * Could the pte be present in the TLB? We must check mm_tlb_flush_pending * so that we don't erroneously return false for pages that have been @@ -1128,6 +1132,167 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t old_pte, pte_t new_pte); +#ifdef CONFIG_ARM64_CONTPTE + +/* + * The contpte APIs are used to transparently manage the contiguous bit in ptes + * where it is possible and makes sense to do so. The PTE_CONT bit is considered + * a private implementation detail of the public ptep API (see below). + */ +extern void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); +extern pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte); +extern pte_t contpte_ptep_get_lockless(pte_t *orig_ptep); +extern void contpte_set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr); +extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep); +extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep); +extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + pte_t entry, int dirty); + +static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + if (unlikely(pte_valid_cont(pte))) + __contpte_try_unfold(mm, addr, ptep, pte); +} + +/* + * The below functions constitute the public API that arm64 presents to the + * core-mm to manipulate PTE entries within their page tables (or at least this + * is the subset of the API that arm64 needs to implement). These public + * versions will automatically and transparently apply the contiguous bit where + * it makes sense to do so. Therefore any users that are contig-aware (e.g. + * hugetlb, kernel mapper) should NOT use these APIs, but instead use the + * private versions, which are prefixed with double underscore. All of these + * APIs except for ptep_get_lockless() are expected to be called with the PTL + * held. Although the contiguous bit is considered private to the + * implementation, it is deliberately allowed to leak through the getters (e.g. + * ptep_get()), back to core code. This is required so that pte_leaf_size() can + * provide an accurate size for perf_get_pgtable_size(). But this leakage means + * its possible a pte will be passed to a setter with the contiguous bit set, so + * we explicitly clear the contiguous bit in those cases to prevent accidentally + * setting it in the pgtable. + */ + +#define ptep_get ptep_get +static inline pte_t ptep_get(pte_t *ptep) +{ + pte_t pte = __ptep_get(ptep); + + if (likely(!pte_valid_cont(pte))) + return pte; + + return contpte_ptep_get(ptep, pte); +} + +#define ptep_get_lockless ptep_get_lockless +static inline pte_t ptep_get_lockless(pte_t *ptep) +{ + pte_t pte = __ptep_get(ptep); + + if (likely(!pte_valid_cont(pte))) + return pte; + + return contpte_ptep_get_lockless(ptep); +} + +static inline void set_pte(pte_t *ptep, pte_t pte) +{ + /* + * We don't have the mm or vaddr so cannot unfold contig entries (since + * it requires tlb maintenance). set_pte() is not used in core code, so + * this should never even be called. Regardless do our best to service + * any call and emit a warning if there is any attempt to set a pte on + * top of an existing contig range. + */ + pte_t orig_pte = __ptep_get(ptep); + + WARN_ON_ONCE(pte_valid_cont(orig_pte)); + __set_pte(ptep, pte_mknoncont(pte)); +} + +#define set_ptes set_ptes +static inline void set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr) +{ + pte = pte_mknoncont(pte); + + if (likely(nr == 1)) { + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __set_ptes(mm, addr, ptep, pte, 1); + } else { + contpte_set_ptes(mm, addr, ptep, pte, nr); + } +} + +static inline void pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __pte_clear(mm, addr, ptep); +} + +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR +static inline pte_t ptep_get_and_clear(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + return __ptep_get_and_clear(mm, addr, ptep); +} + +#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG +static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep) +{ + pte_t orig_pte = __ptep_get(ptep); + + if (likely(!pte_valid_cont(orig_pte))) + return __ptep_test_and_clear_young(vma, addr, ptep); + + return contpte_ptep_test_and_clear_young(vma, addr, ptep); +} + +#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH +static inline int ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep) +{ + pte_t orig_pte = __ptep_get(ptep); + + if (likely(!pte_valid_cont(orig_pte))) + return __ptep_clear_flush_young(vma, addr, ptep); + + return contpte_ptep_clear_flush_young(vma, addr, ptep); +} + +#define __HAVE_ARCH_PTEP_SET_WRPROTECT +static inline void ptep_set_wrprotect(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __ptep_set_wrprotect(mm, addr, ptep); +} + +#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS +static inline int ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + pte_t entry, int dirty) +{ + pte_t orig_pte = __ptep_get(ptep); + + entry = pte_mknoncont(entry); + + if (likely(!pte_valid_cont(orig_pte))) + return __ptep_set_access_flags(vma, addr, ptep, entry, dirty); + + return contpte_ptep_set_access_flags(vma, addr, ptep, entry, dirty); +} + +#else /* CONFIG_ARM64_CONTPTE */ + #define ptep_get __ptep_get #define set_pte __set_pte #define set_ptes __set_ptes @@ -1143,6 +1308,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma, #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS #define ptep_set_access_flags __ptep_set_access_flags +#endif /* CONFIG_ARM64_CONTPTE */ + #endif /* !__ASSEMBLY__ */ #endif /* __ASM_PGTABLE_H */ diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index dbd1bc95967d0..60454256945b8 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -3,6 +3,7 @@ obj-y := dma-mapping.o extable.o fault.o init.o \ cache.o copypage.o flush.o \ ioremap.o mmap.o pgd.o mmu.o \ context.o proc.o pageattr.o fixmap.o +obj-$(CONFIG_ARM64_CONTPTE) += contpte.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o obj-$(CONFIG_PTDUMP_CORE) += ptdump.o obj-$(CONFIG_PTDUMP_DEBUGFS) += ptdump_debugfs.o diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c new file mode 100644 index 0000000000000..6d7f40667fa23 --- /dev/null +++ b/arch/arm64/mm/contpte.c @@ -0,0 +1,285 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2023 ARM Ltd. + */ + +#include +#include +#include +#include + +static inline bool mm_is_user(struct mm_struct *mm) +{ + /* + * Don't attempt to apply the contig bit to kernel mappings, because + * dynamically adding/removing the contig bit can cause page faults. + * These racing faults are ok for user space, since they get serialized + * on the PTL. But kernel mappings can't tolerate faults. + */ + if (unlikely(mm_is_efi(mm))) + return false; + return mm != &init_mm; +} + +static inline pte_t *contpte_align_down(pte_t *ptep) +{ + return PTR_ALIGN_DOWN(ptep, sizeof(*ptep) * CONT_PTES); +} + +static void contpte_convert(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0); + unsigned long start_addr; + pte_t *start_ptep; + int i; + + start_ptep = ptep = contpte_align_down(ptep); + start_addr = addr = ALIGN_DOWN(addr, CONT_PTE_SIZE); + pte = pfn_pte(ALIGN_DOWN(pte_pfn(pte), CONT_PTES), pte_pgprot(pte)); + + for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE) { + pte_t ptent = __ptep_get_and_clear(mm, addr, ptep); + + if (pte_dirty(ptent)) + pte = pte_mkdirty(pte); + + if (pte_young(ptent)) + pte = pte_mkyoung(pte); + } + + __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3); + + __set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES); +} + +void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + /* + * We have already checked that the ptes are contiguous in + * contpte_try_unfold(), so just check that the mm is user space. + */ + if (!mm_is_user(mm)) + return; + + pte = pte_mknoncont(pte); + contpte_convert(mm, addr, ptep, pte); +} +EXPORT_SYMBOL(__contpte_try_unfold); + +pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte) +{ + /* + * Gather access/dirty bits, which may be populated in any of the ptes + * of the contig range. We are guaranteed to be holding the PTL, so any + * contiguous range cannot be unfolded or otherwise modified under our + * feet. + */ + + pte_t pte; + int i; + + ptep = contpte_align_down(ptep); + + for (i = 0; i < CONT_PTES; i++, ptep++) { + pte = __ptep_get(ptep); + + if (pte_dirty(pte)) + orig_pte = pte_mkdirty(orig_pte); + + if (pte_young(pte)) + orig_pte = pte_mkyoung(orig_pte); + } + + return orig_pte; +} +EXPORT_SYMBOL(contpte_ptep_get); + +pte_t contpte_ptep_get_lockless(pte_t *orig_ptep) +{ + /* + * Gather access/dirty bits, which may be populated in any of the ptes + * of the contig range. We may not be holding the PTL, so any contiguous + * range may be unfolded/modified/refolded under our feet. Therefore we + * ensure we read a _consistent_ contpte range by checking that all ptes + * in the range are valid and have CONT_PTE set, that all pfns are + * contiguous and that all pgprots are the same (ignoring access/dirty). + * If we find a pte that is not consistent, then we must be racing with + * an update so start again. If the target pte does not have CONT_PTE + * set then that is considered consistent on its own because it is not + * part of a contpte range. + */ + + pgprot_t orig_prot; + unsigned long pfn; + pte_t orig_pte; + pgprot_t prot; + pte_t *ptep; + pte_t pte; + int i; + +retry: + orig_pte = __ptep_get(orig_ptep); + + if (!pte_valid_cont(orig_pte)) + return orig_pte; + + orig_prot = pte_pgprot(pte_mkold(pte_mkclean(orig_pte))); + ptep = contpte_align_down(orig_ptep); + pfn = pte_pfn(orig_pte) - (orig_ptep - ptep); + + for (i = 0; i < CONT_PTES; i++, ptep++, pfn++) { + pte = __ptep_get(ptep); + prot = pte_pgprot(pte_mkold(pte_mkclean(pte))); + + if (!pte_valid_cont(pte) || + pte_pfn(pte) != pfn || + pgprot_val(prot) != pgprot_val(orig_prot)) + goto retry; + + if (pte_dirty(pte)) + orig_pte = pte_mkdirty(orig_pte); + + if (pte_young(pte)) + orig_pte = pte_mkyoung(orig_pte); + } + + return orig_pte; +} +EXPORT_SYMBOL(contpte_ptep_get_lockless); + +void contpte_set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr) +{ + unsigned long next; + unsigned long end; + unsigned long pfn; + pgprot_t prot; + + /* + * The set_ptes() spec guarantees that when nr > 1, the initial state of + * all ptes is not-present. Therefore we never need to unfold or + * otherwise invalidate a range before we set the new ptes. + * contpte_set_ptes() should never be called for nr < 2. + */ + VM_WARN_ON(nr == 1); + + if (!mm_is_user(mm)) + return __set_ptes(mm, addr, ptep, pte, nr); + + end = addr + (nr << PAGE_SHIFT); + pfn = pte_pfn(pte); + prot = pte_pgprot(pte); + + do { + next = pte_cont_addr_end(addr, end); + nr = (next - addr) >> PAGE_SHIFT; + pte = pfn_pte(pfn, prot); + + if (((addr | next | (pfn << PAGE_SHIFT)) & ~CONT_PTE_MASK) == 0) + pte = pte_mkcont(pte); + else + pte = pte_mknoncont(pte); + + __set_ptes(mm, addr, ptep, pte, nr); + + addr = next; + ptep += nr; + pfn += nr; + + } while (addr != end); +} +EXPORT_SYMBOL(contpte_set_ptes); + +int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep) +{ + /* + * ptep_clear_flush_young() technically requires us to clear the access + * flag for a _single_ pte. However, the core-mm code actually tracks + * access/dirty per folio, not per page. And since we only create a + * contig range when the range is covered by a single folio, we can get + * away with clearing young for the whole contig range here, so we avoid + * having to unfold. + */ + + int young = 0; + int i; + + ptep = contpte_align_down(ptep); + addr = ALIGN_DOWN(addr, CONT_PTE_SIZE); + + for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE) + young |= __ptep_test_and_clear_young(vma, addr, ptep); + + return young; +} +EXPORT_SYMBOL(contpte_ptep_test_and_clear_young); + +int contpte_ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep) +{ + int young; + + young = contpte_ptep_test_and_clear_young(vma, addr, ptep); + + if (young) { + /* + * See comment in __ptep_clear_flush_young(); same rationale for + * eliding the trailing DSB applies here. + */ + addr = ALIGN_DOWN(addr, CONT_PTE_SIZE); + __flush_tlb_range_nosync(vma, addr, addr + CONT_PTE_SIZE, + PAGE_SIZE, true, 3); + } + + return young; +} +EXPORT_SYMBOL(contpte_ptep_clear_flush_young); + +int contpte_ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + pte_t entry, int dirty) +{ + unsigned long start_addr; + pte_t orig_pte; + int i; + + /* + * Gather the access/dirty bits for the contiguous range. If nothing has + * changed, its a noop. + */ + orig_pte = pte_mknoncont(ptep_get(ptep)); + if (pte_val(orig_pte) == pte_val(entry)) + return 0; + + /* + * We can fix up access/dirty bits without having to unfold the contig + * range. But if the write bit is changing, we must unfold. + */ + if (pte_write(orig_pte) == pte_write(entry)) { + /* + * For HW access management, we technically only need to update + * the flag on a single pte in the range. But for SW access + * management, we need to update all the ptes to prevent extra + * faults. Avoid per-page tlb flush in __ptep_set_access_flags() + * and instead flush the whole range at the end. + */ + ptep = contpte_align_down(ptep); + start_addr = addr = ALIGN_DOWN(addr, CONT_PTE_SIZE); + + for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE) + __ptep_set_access_flags(vma, addr, ptep, entry, 0); + + if (dirty) + __flush_tlb_range(vma, start_addr, addr, + PAGE_SIZE, true, 3); + } else { + __contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte); + __ptep_set_access_flags(vma, addr, ptep, entry, dirty); + } + + return 1; +} +EXPORT_SYMBOL(contpte_ptep_set_access_flags); diff --git a/include/linux/efi.h b/include/linux/efi.h index 7db1c0759c096..9099f57537edf 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -695,6 +695,11 @@ extern struct efi { extern struct mm_struct efi_mm; +static inline bool mm_is_efi(struct mm_struct *mm) +{ + return IS_ENABLED(CONFIG_EFI) && mm == &efi_mm; +} + static inline int efi_guidcmp (efi_guid_t left, efi_guid_t right) { From f7126b920f10e90ffcc00f45b904553d772af39f Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:32:00 +0000 Subject: [PATCH 222/548] arm64/mm: implement new wrprotect_ptes() batch API Optimize the contpte implementation to fix some of the fork performance regression introduced by the initial contpte commit. Subsequent patches will solve it entirely. During fork(), any private memory in the parent must be write-protected. Previously this was done 1 PTE at a time. But the core-mm supports batched wrprotect via the new wrprotect_ptes() API. So let's implement that API and for fully covered contpte mappings, we no longer need to unfold the contpte. This has 2 benefits: - reduced unfolding, reduces the number of tlbis that must be issued. - The memory remains contpte-mapped ("folded") in the parent, so it continues to benefit from the more efficient use of the TLB after the fork. The optimization to wrprotect a whole contpte block without unfolding is possible thanks to the tightening of the Arm ARM in respect to the definition and behaviour when 'Misprogramming the Contiguous bit'. See section D21194 at https://developer.arm.com/documentation/102105/ja-07/ Link: https://lkml.kernel.org/r/20240215103205.2607016-14-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 311a6cf29690bb8295327bad0e76e0ad48cadcc4) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 61 ++++++++++++++++++++++++++------ arch/arm64/mm/contpte.c | 38 ++++++++++++++++++++ 2 files changed, 89 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index ad2bd36eba13c..c76fca86f31ec 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -978,16 +978,12 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -/* - * __ptep_set_wrprotect - mark read-only while trasferring potential hardware - * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit. - */ -static inline void __ptep_set_wrprotect(struct mm_struct *mm, - unsigned long address, pte_t *ptep) +static inline void ___ptep_set_wrprotect(struct mm_struct *mm, + unsigned long address, pte_t *ptep, + pte_t pte) { - pte_t old_pte, pte; + pte_t old_pte; - pte = __ptep_get(ptep); do { old_pte = pte; pte = pte_wrprotect(pte); @@ -996,6 +992,25 @@ static inline void __ptep_set_wrprotect(struct mm_struct *mm, } while (pte_val(pte) != pte_val(old_pte)); } +/* + * __ptep_set_wrprotect - mark read-only while trasferring potential hardware + * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit. + */ +static inline void __ptep_set_wrprotect(struct mm_struct *mm, + unsigned long address, pte_t *ptep) +{ + ___ptep_set_wrprotect(mm, address, ptep, __ptep_get(ptep)); +} + +static inline void __wrprotect_ptes(struct mm_struct *mm, unsigned long address, + pte_t *ptep, unsigned int nr) +{ + unsigned int i; + + for (i = 0; i < nr; i++, address += PAGE_SIZE, ptep++) + __ptep_set_wrprotect(mm, address, ptep); +} + #ifdef CONFIG_TRANSPARENT_HUGEPAGE #define __HAVE_ARCH_PMDP_SET_WRPROTECT static inline void pmdp_set_wrprotect(struct mm_struct *mm, @@ -1149,6 +1164,8 @@ extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep); extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep); +extern void contpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr); extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t entry, int dirty); @@ -1268,12 +1285,35 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, return contpte_ptep_clear_flush_young(vma, addr, ptep); } +#define wrprotect_ptes wrprotect_ptes +static inline void wrprotect_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr) +{ + if (likely(nr == 1)) { + /* + * Optimization: wrprotect_ptes() can only be called for present + * ptes so we only need to check contig bit as condition for + * unfold, and we can remove the contig bit from the pte we read + * to avoid re-reading. This speeds up fork() which is sensitive + * for order-0 folios. Equivalent to contpte_try_unfold(). + */ + pte_t orig_pte = __ptep_get(ptep); + + if (unlikely(pte_cont(orig_pte))) { + __contpte_try_unfold(mm, addr, ptep, orig_pte); + orig_pte = pte_mknoncont(orig_pte); + } + ___ptep_set_wrprotect(mm, addr, ptep, orig_pte); + } else { + contpte_wrprotect_ptes(mm, addr, ptep, nr); + } +} + #define __HAVE_ARCH_PTEP_SET_WRPROTECT static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); - __ptep_set_wrprotect(mm, addr, ptep); + wrprotect_ptes(mm, addr, ptep, 1); } #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS @@ -1305,6 +1345,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma, #define ptep_clear_flush_young __ptep_clear_flush_young #define __HAVE_ARCH_PTEP_SET_WRPROTECT #define ptep_set_wrprotect __ptep_set_wrprotect +#define wrprotect_ptes __wrprotect_ptes #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS #define ptep_set_access_flags __ptep_set_access_flags diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c index 6d7f40667fa23..bedb585245356 100644 --- a/arch/arm64/mm/contpte.c +++ b/arch/arm64/mm/contpte.c @@ -26,6 +26,26 @@ static inline pte_t *contpte_align_down(pte_t *ptep) return PTR_ALIGN_DOWN(ptep, sizeof(*ptep) * CONT_PTES); } +static void contpte_try_unfold_partial(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr) +{ + /* + * Unfold any partially covered contpte block at the beginning and end + * of the range. + */ + + if (ptep != contpte_align_down(ptep) || nr < CONT_PTES) + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + + if (ptep + nr != contpte_align_down(ptep + nr)) { + unsigned long last_addr = addr + PAGE_SIZE * (nr - 1); + pte_t *last_ptep = ptep + nr - 1; + + contpte_try_unfold(mm, last_addr, last_ptep, + __ptep_get(last_ptep)); + } +} + static void contpte_convert(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { @@ -238,6 +258,24 @@ int contpte_ptep_clear_flush_young(struct vm_area_struct *vma, } EXPORT_SYMBOL(contpte_ptep_clear_flush_young); +void contpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr) +{ + /* + * If wrprotecting an entire contig range, we can avoid unfolding. Just + * set wrprotect and wait for the later mmu_gather flush to invalidate + * the tlb. Until the flush, the page may or may not be wrprotected. + * After the flush, it is guaranteed wrprotected. If it's a partial + * range though, we must unfold, because we can't have a case where + * CONT_PTE is set but wrprotect applies to a subset of the PTEs; this + * would cause it to continue to be unpredictable after the flush. + */ + + contpte_try_unfold_partial(mm, addr, ptep, nr); + __wrprotect_ptes(mm, addr, ptep, nr); +} +EXPORT_SYMBOL(contpte_wrprotect_ptes); + int contpte_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t entry, int dirty) From 95bc94a0fb797181698541bb57db5320391d1e18 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:32:01 +0000 Subject: [PATCH 223/548] arm64/mm: implement new [get_and_]clear_full_ptes() batch APIs Optimize the contpte implementation to fix some of the exit/munmap/dontneed performance regression introduced by the initial contpte commit. Subsequent patches will solve it entirely. During exit(), munmap() or madvise(MADV_DONTNEED), mappings must be cleared. Previously this was done 1 PTE at a time. But the core-mm supports batched clear via the new [get_and_]clear_full_ptes() APIs. So let's implement those APIs and for fully covered contpte mappings, we no longer need to unfold the contpte. This significantly reduces unfolding operations, reducing the number of tlbis that must be issued. Link: https://lkml.kernel.org/r/20240215103205.2607016-15-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Tested-by: John Hubbard Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit 6b1e4efb6f5499ae8f9f5cdda7502285a0edbf51) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 67 ++++++++++++++++++++++++++++++++ arch/arm64/mm/contpte.c | 17 ++++++++ 2 files changed, 84 insertions(+) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index c76fca86f31ec..dca577de8aaa5 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -965,6 +965,37 @@ static inline pte_t __ptep_get_and_clear(struct mm_struct *mm, return pte; } +static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full) +{ + for (;;) { + __ptep_get_and_clear(mm, addr, ptep); + if (--nr == 0) + break; + ptep++; + addr += PAGE_SIZE; + } +} + +static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full) +{ + pte_t pte, tmp_pte; + + pte = __ptep_get_and_clear(mm, addr, ptep); + while (--nr) { + ptep++; + addr += PAGE_SIZE; + tmp_pte = __ptep_get_and_clear(mm, addr, ptep); + if (pte_dirty(tmp_pte)) + pte = pte_mkdirty(pte); + if (pte_young(tmp_pte)) + pte = pte_mkyoung(pte); + } + return pte; +} + #ifdef CONFIG_TRANSPARENT_HUGEPAGE #define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, @@ -1160,6 +1191,11 @@ extern pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte); extern pte_t contpte_ptep_get_lockless(pte_t *orig_ptep); extern void contpte_set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr); +extern void contpte_clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full); +extern pte_t contpte_get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full); extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep); extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma, @@ -1253,6 +1289,35 @@ static inline void pte_clear(struct mm_struct *mm, __pte_clear(mm, addr, ptep); } +#define clear_full_ptes clear_full_ptes +static inline void clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full) +{ + if (likely(nr == 1)) { + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __clear_full_ptes(mm, addr, ptep, nr, full); + } else { + contpte_clear_full_ptes(mm, addr, ptep, nr, full); + } +} + +#define get_and_clear_full_ptes get_and_clear_full_ptes +static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full) +{ + pte_t pte; + + if (likely(nr == 1)) { + contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + pte = __get_and_clear_full_ptes(mm, addr, ptep, nr, full); + } else { + pte = contpte_get_and_clear_full_ptes(mm, addr, ptep, nr, full); + } + + return pte; +} + #define __HAVE_ARCH_PTEP_GET_AND_CLEAR static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) @@ -1337,6 +1402,8 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma, #define set_pte __set_pte #define set_ptes __set_ptes #define pte_clear __pte_clear +#define clear_full_ptes __clear_full_ptes +#define get_and_clear_full_ptes __get_and_clear_full_ptes #define __HAVE_ARCH_PTEP_GET_AND_CLEAR #define ptep_get_and_clear __ptep_get_and_clear #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c index bedb585245356..50e0173dc5eee 100644 --- a/arch/arm64/mm/contpte.c +++ b/arch/arm64/mm/contpte.c @@ -212,6 +212,23 @@ void contpte_set_ptes(struct mm_struct *mm, unsigned long addr, } EXPORT_SYMBOL(contpte_set_ptes); +void contpte_clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full) +{ + contpte_try_unfold_partial(mm, addr, ptep, nr); + __clear_full_ptes(mm, addr, ptep, nr, full); +} +EXPORT_SYMBOL(contpte_clear_full_ptes); + +pte_t contpte_get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full) +{ + contpte_try_unfold_partial(mm, addr, ptep, nr); + return __get_and_clear_full_ptes(mm, addr, ptep, nr, full); +} +EXPORT_SYMBOL(contpte_get_and_clear_full_ptes); + int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { From f0b20b7e20598eccd7586a49544022536c63c9a6 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:32:02 +0000 Subject: [PATCH 224/548] mm: add pte_batch_hint() to reduce scanning in folio_pte_batch() Some architectures (e.g. arm64) can tell from looking at a pte, if some follow-on ptes also map contiguous physical memory with the same pgprot. (for arm64, these are contpte mappings). Take advantage of this knowledge to optimize folio_pte_batch() so that it can skip these ptes when scanning to create a batch. By default, if an arch does not opt-in, folio_pte_batch() returns a compile-time 1, so the changes are optimized out and the behaviour is as before. arm64 will opt-in to providing this hint in the next patch, which will greatly reduce the cost of ptep_get() when scanning a range of contptes. Link: https://lkml.kernel.org/r/20240215103205.2607016-16-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: David Hildenbrand Tested-by: John Hubbard Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Catalin Marinas Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Mark Rutland Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit c6ec76a2ebc5829e5826b218d2e1475ec11b333e) Signed-off-by: Koba Ko --- include/linux/pgtable.h | 21 +++++++++++++++++++++ mm/memory.c | 19 ++++++++++++------- 2 files changed, 33 insertions(+), 7 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 72fbc15f3e991..b9df33865009a 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -209,6 +209,27 @@ static inline int pmd_young(pmd_t pmd) #define arch_flush_lazy_mmu_mode() do {} while (0) #endif +#ifndef pte_batch_hint +/** + * pte_batch_hint - Number of pages that can be added to batch without scanning. + * @ptep: Page table pointer for the entry. + * @pte: Page table entry. + * + * Some architectures know that a set of contiguous ptes all map the same + * contiguous memory with the same permissions. In this case, it can provide a + * hint to aid pte batching without the core code needing to scan every pte. + * + * An architecture implementation may ignore the PTE accessed state. Further, + * the dirty state must apply atomically to all the PTEs described by the hint. + * + * May be overridden by the architecture, else pte_batch_hint is always 1. + */ +static inline unsigned int pte_batch_hint(pte_t *ptep, pte_t pte) +{ + return 1; +} +#endif + #ifndef pte_advance_pfn static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr) { diff --git a/mm/memory.c b/mm/memory.c index f42267dbb2d7c..f3a6276578565 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -977,16 +977,20 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, { unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio); const pte_t *end_ptep = start_ptep + max_nr; - pte_t expected_pte = __pte_batch_clear_ignored(pte_next_pfn(pte), flags); - pte_t *ptep = start_ptep + 1; + pte_t expected_pte, *ptep; bool writable; + int nr; if (any_writable) *any_writable = false; VM_WARN_ON_FOLIO(!pte_present(pte), folio); - while (ptep != end_ptep) { + nr = pte_batch_hint(start_ptep, pte); + expected_pte = __pte_batch_clear_ignored(pte_advance_pfn(pte, nr), flags); + ptep = start_ptep + nr; + + while (ptep < end_ptep) { pte = ptep_get(ptep); if (any_writable) writable = !!pte_write(pte); @@ -1000,17 +1004,18 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr, * corner cases the next PFN might fall into a different * folio. */ - if (pte_pfn(pte) == folio_end_pfn) + if (pte_pfn(pte) >= folio_end_pfn) break; if (any_writable) *any_writable |= writable; - expected_pte = pte_next_pfn(expected_pte); - ptep++; + nr = pte_batch_hint(ptep, pte); + expected_pte = pte_advance_pfn(expected_pte, nr); + ptep += nr; } - return ptep - start_ptep; + return min(ptep - start_ptep, max_nr); } /* From fedcb9f4827303262eaa7b3b65725de315fa30e0 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:32:03 +0000 Subject: [PATCH 225/548] arm64/mm: implement pte_batch_hint() When core code iterates over a range of ptes and calls ptep_get() for each of them, if the range happens to cover contpte mappings, the number of pte reads becomes amplified by a factor of the number of PTEs in a contpte block. This is because for each call to ptep_get(), the implementation must read all of the ptes in the contpte block to which it belongs to gather the access and dirty bits. This causes a hotspot for fork(), as well as operations that unmap memory such as munmap(), exit and madvise(MADV_DONTNEED). Fortunately we can fix this by implementing pte_batch_hint() which allows their iterators to skip getting the contpte tail ptes when gathering the batch of ptes to operate on. This results in the number of PTE reads returning to 1 per pte. Link: https://lkml.kernel.org/r/20240215103205.2607016-17-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: Mark Rutland Reviewed-by: David Hildenbrand Tested-by: John Hubbard Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit fb5451e5f72b31002760083a99fbb41771c4f1ad) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index dca577de8aaa5..81f92748fd292 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1213,6 +1213,15 @@ static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr, __contpte_try_unfold(mm, addr, ptep, pte); } +#define pte_batch_hint pte_batch_hint +static inline unsigned int pte_batch_hint(pte_t *ptep, pte_t pte) +{ + if (!pte_valid_cont(pte)) + return 1; + + return CONT_PTES - (((unsigned long)ptep >> 3) & (CONT_PTES - 1)); +} + /* * The below functions constitute the public API that arm64 presents to the * core-mm to manipulate PTE entries within their page tables (or at least this From 098883a438b3667096c4dd6525b28e85c415f98f Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:32:04 +0000 Subject: [PATCH 226/548] arm64/mm: __always_inline to improve fork() perf As set_ptes() and wrprotect_ptes() become a bit more complex, the compiler may choose not to inline them. But this is critical for fork() performance. So mark the functions, along with contpte_try_unfold() which is called by them, as __always_inline. This is worth ~1% on the fork() microbenchmark with order-0 folios (the common case). Link: https://lkml.kernel.org/r/20240215103205.2607016-18-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit b972fc6afba002319fe23bc698ce6431ee43868c) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 81f92748fd292..9a7afb75fdfc8 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1206,8 +1206,8 @@ extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t entry, int dirty); -static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pte) +static __always_inline void contpte_try_unfold(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, pte_t pte) { if (unlikely(pte_valid_cont(pte))) __contpte_try_unfold(mm, addr, ptep, pte); @@ -1278,7 +1278,7 @@ static inline void set_pte(pte_t *ptep, pte_t pte) } #define set_ptes set_ptes -static inline void set_ptes(struct mm_struct *mm, unsigned long addr, +static __always_inline void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr) { pte = pte_mknoncont(pte); @@ -1360,8 +1360,8 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, } #define wrprotect_ptes wrprotect_ptes -static inline void wrprotect_ptes(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, unsigned int nr) +static __always_inline void wrprotect_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, unsigned int nr) { if (likely(nr == 1)) { /* From b2a94c403123ebce61ca04dade88b49af2e8bff2 Mon Sep 17 00:00:00 2001 From: Ryan Roberts Date: Thu, 15 Feb 2024 10:32:05 +0000 Subject: [PATCH 227/548] arm64/mm: automatically fold contpte mappings There are situations where a change to a single PTE could cause the contpte block in which it resides to become foldable (i.e. could be repainted with the contiguous bit). Such situations arise, for example, when user space temporarily changes protections, via mprotect, for individual pages, such can be the case for certain garbage collectors. We would like to detect when such a PTE change occurs. However this can be expensive due to the amount of checking required. Therefore only perform the checks when an indiviual PTE is modified via mprotect (ptep_modify_prot_commit() -> set_pte_at() -> set_ptes(nr=1)) and only when we are setting the final PTE in a contpte-aligned block. Link: https://lkml.kernel.org/r/20240215103205.2607016-19-ryan.roberts@arm.com Signed-off-by: Ryan Roberts Acked-by: Mark Rutland Acked-by: Catalin Marinas Cc: Alistair Popple Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Barry Song <21cnbao@gmail.com> Cc: Borislav Petkov (AMD) Cc: Dave Hansen Cc: David Hildenbrand Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: James Morse Cc: John Hubbard Cc: Kefeng Wang Cc: Marc Zyngier Cc: Matthew Wilcox (Oracle) Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Shi Cc: Zi Yan Signed-off-by: Andrew Morton (cherry picked from commit f0c2264958e18bc7bc35b567d51b99461e4de34f) Signed-off-by: Koba Ko --- arch/arm64/include/asm/pgtable.h | 26 +++++++++++++ arch/arm64/mm/contpte.c | 64 ++++++++++++++++++++++++++++++++ 2 files changed, 90 insertions(+) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 9a7afb75fdfc8..7879a5602eaab 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1185,6 +1185,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma, * where it is possible and makes sense to do so. The PTE_CONT bit is considered * a private implementation detail of the public ptep API (see below). */ +extern void __contpte_try_fold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); extern void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte); extern pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte); @@ -1206,6 +1208,29 @@ extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t entry, int dirty); +static __always_inline void contpte_try_fold(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, pte_t pte) +{ + /* + * Only bother trying if both the virtual and physical addresses are + * aligned and correspond to the last entry in a contig range. The core + * code mostly modifies ranges from low to high, so this is the likely + * the last modification in the contig range, so a good time to fold. + * We can't fold special mappings, because there is no associated folio. + */ + + const unsigned long contmask = CONT_PTES - 1; + bool valign = ((addr >> PAGE_SHIFT) & contmask) == contmask; + + if (unlikely(valign)) { + bool palign = (pte_pfn(pte) & contmask) == contmask; + + if (unlikely(palign && + pte_valid(pte) && !pte_cont(pte) && !pte_special(pte))) + __contpte_try_fold(mm, addr, ptep, pte); + } +} + static __always_inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { @@ -1286,6 +1311,7 @@ static __always_inline void set_ptes(struct mm_struct *mm, unsigned long addr, if (likely(nr == 1)) { contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); __set_ptes(mm, addr, ptep, pte, 1); + contpte_try_fold(mm, addr, ptep, pte); } else { contpte_set_ptes(mm, addr, ptep, pte, nr); } diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c index 50e0173dc5eee..16788f07716d5 100644 --- a/arch/arm64/mm/contpte.c +++ b/arch/arm64/mm/contpte.c @@ -73,6 +73,70 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr, __set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES); } +void __contpte_try_fold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + /* + * We have already checked that the virtual and pysical addresses are + * correctly aligned for a contpte mapping in contpte_try_fold() so the + * remaining checks are to ensure that the contpte range is fully + * covered by a single folio, and ensure that all the ptes are valid + * with contiguous PFNs and matching prots. We ignore the state of the + * access and dirty bits for the purpose of deciding if its a contiguous + * range; the folding process will generate a single contpte entry which + * has a single access and dirty bit. Those 2 bits are the logical OR of + * their respective bits in the constituent pte entries. In order to + * ensure the contpte range is covered by a single folio, we must + * recover the folio from the pfn, but special mappings don't have a + * folio backing them. Fortunately contpte_try_fold() already checked + * that the pte is not special - we never try to fold special mappings. + * Note we can't use vm_normal_page() for this since we don't have the + * vma. + */ + + unsigned long folio_start, folio_end; + unsigned long cont_start, cont_end; + pte_t expected_pte, subpte; + struct folio *folio; + struct page *page; + unsigned long pfn; + pte_t *orig_ptep; + pgprot_t prot; + + int i; + + if (!mm_is_user(mm)) + return; + + page = pte_page(pte); + folio = page_folio(page); + folio_start = addr - (page - &folio->page) * PAGE_SIZE; + folio_end = folio_start + folio_nr_pages(folio) * PAGE_SIZE; + cont_start = ALIGN_DOWN(addr, CONT_PTE_SIZE); + cont_end = cont_start + CONT_PTE_SIZE; + + if (folio_start > cont_start || folio_end < cont_end) + return; + + pfn = ALIGN_DOWN(pte_pfn(pte), CONT_PTES); + prot = pte_pgprot(pte_mkold(pte_mkclean(pte))); + expected_pte = pfn_pte(pfn, prot); + orig_ptep = ptep; + ptep = contpte_align_down(ptep); + + for (i = 0; i < CONT_PTES; i++) { + subpte = pte_mkold(pte_mkclean(__ptep_get(ptep))); + if (!pte_same(subpte, expected_pte)) + return; + expected_pte = pte_advance_pfn(expected_pte, 1); + ptep++; + } + + pte = pte_mkcont(pte); + contpte_convert(mm, addr, orig_ptep, pte); +} +EXPORT_SYMBOL(__contpte_try_fold); + void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { From 0c3870067524bfe988791094c0ee70aac0779c0b Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Thu, 11 Apr 2024 13:46:18 -0300 Subject: [PATCH 228/548] net: hns3: Remove io_stop_wc() calls after __iowrite64_copy() Now that the ARM64 arch implementation does the DGH as part of __iowrite64_copy() there is no reason to open code this in drivers. Link: https://lore.kernel.org/r/5-v3-1893cd8b9369+1925-mlx5_arm_wc_jgg@nvidia.com Reviewed-by: Jijie Shao Signed-off-by: Jason Gunthorpe (cherry picked from commit 2b7a5e1fe02231acc5d50339b2f10833565ef559) Signed-off-by: Koba Ko --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index dbf44a17987eb..69d8badc62085 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -2079,8 +2079,6 @@ static void hns3_tx_push_bd(struct hns3_enet_ring *ring, int num) __iowrite64_copy(ring->tqp->mem_base, desc, (sizeof(struct hns3_desc) * HNS3_MAX_PUSH_BD_NUM) / HNS3_BYTES_PER_64BIT); - - io_stop_wc(); } static void hns3_tx_mem_doorbell(struct hns3_enet_ring *ring) @@ -2099,8 +2097,6 @@ static void hns3_tx_mem_doorbell(struct hns3_enet_ring *ring) u64_stats_update_begin(&ring->syncp); ring->stats.tx_mem_doorbell += ring->pending_buf; u64_stats_update_end(&ring->syncp); - - io_stop_wc(); } static void hns3_tx_doorbell(struct hns3_enet_ring *ring, int num, From bd16537dc40c92c1c5a13cd342f4ced6a01335cf Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Tue, 10 Oct 2023 18:48:03 -0500 Subject: [PATCH 229/548] perf cs-etm: Fix incorrect or missing decoder for raw trace The decoder creation for raw trace uses metadata from the first CPU. On per-cpu mode, this metadata is incorrectly used for every decoder. On per-process/per-thread traces, the first CPU is CPU0. If CPU0 trace is not enabled, its metadata will be marked unused and the decoder is not created. Perf report dump skips the decoding part because the decoder is missing. To fix this, use metadata of the CPU associated with sample object. Signed-off-by: Besar Wicaksono Reviewed-by: James Clark Cc: suzuki.poulose@arm.com Cc: mike.leach@linaro.org Cc: jonathanh@nvidia.com Cc: rwiley@nvidia.com Cc: treding@nvidia.com Cc: vsethi@nvidia.com Cc: ywan@nvidia.com Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Cc: linux-tegra@vger.kernel.org Link: https://lore.kernel.org/r/20231010234803.5419-1-bwicaksono@nvidia.com Signed-off-by: Namhyung Kim (cherry picked from commit a16afcc58a8c5ebc65c852faf001f8f61f05e4ef) Signed-off-by: Koba Ko --- tools/perf/util/cs-etm.c | 106 ++++++++++++++++++++++++--------------- 1 file changed, 65 insertions(+), 41 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 799c104901b4f..1fb104357eebb 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -283,22 +283,31 @@ static int cs_etm__metadata_set_trace_id(u8 trace_chan_id, u64 *cpu_metadata) } /* - * Get a metadata for a specific cpu from an array. + * Get a metadata index for a specific cpu from an array. * */ -static u64 *get_cpu_data(struct cs_etm_auxtrace *etm, int cpu) +static int get_cpu_data_idx(struct cs_etm_auxtrace *etm, int cpu) { int i; - u64 *metadata = NULL; for (i = 0; i < etm->num_cpu; i++) { if (etm->metadata[i][CS_ETM_CPU] == (u64)cpu) { - metadata = etm->metadata[i]; - break; + return i; } } - return metadata; + return -1; +} + +/* + * Get a metadata for a specific cpu from an array. + * + */ +static u64 *get_cpu_data(struct cs_etm_auxtrace *etm, int cpu) +{ + int idx = get_cpu_data_idx(etm, cpu); + + return (idx != -1) ? etm->metadata[idx] : NULL; } /* @@ -641,66 +650,80 @@ static void cs_etm__packet_dump(const char *pkt_string) } static void cs_etm__set_trace_param_etmv3(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, int idx, - u32 etmidr) + struct cs_etm_auxtrace *etm, int t_idx, + int m_idx, u32 etmidr) { u64 **metadata = etm->metadata; - t_params[idx].protocol = cs_etm__get_v7_protocol_version(etmidr); - t_params[idx].etmv3.reg_ctrl = metadata[idx][CS_ETM_ETMCR]; - t_params[idx].etmv3.reg_trc_id = metadata[idx][CS_ETM_ETMTRACEIDR]; + t_params[t_idx].protocol = cs_etm__get_v7_protocol_version(etmidr); + t_params[t_idx].etmv3.reg_ctrl = metadata[m_idx][CS_ETM_ETMCR]; + t_params[t_idx].etmv3.reg_trc_id = metadata[m_idx][CS_ETM_ETMTRACEIDR]; } static void cs_etm__set_trace_param_etmv4(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, int idx) + struct cs_etm_auxtrace *etm, int t_idx, + int m_idx) { u64 **metadata = etm->metadata; - t_params[idx].protocol = CS_ETM_PROTO_ETMV4i; - t_params[idx].etmv4.reg_idr0 = metadata[idx][CS_ETMV4_TRCIDR0]; - t_params[idx].etmv4.reg_idr1 = metadata[idx][CS_ETMV4_TRCIDR1]; - t_params[idx].etmv4.reg_idr2 = metadata[idx][CS_ETMV4_TRCIDR2]; - t_params[idx].etmv4.reg_idr8 = metadata[idx][CS_ETMV4_TRCIDR8]; - t_params[idx].etmv4.reg_configr = metadata[idx][CS_ETMV4_TRCCONFIGR]; - t_params[idx].etmv4.reg_traceidr = metadata[idx][CS_ETMV4_TRCTRACEIDR]; + t_params[t_idx].protocol = CS_ETM_PROTO_ETMV4i; + t_params[t_idx].etmv4.reg_idr0 = metadata[m_idx][CS_ETMV4_TRCIDR0]; + t_params[t_idx].etmv4.reg_idr1 = metadata[m_idx][CS_ETMV4_TRCIDR1]; + t_params[t_idx].etmv4.reg_idr2 = metadata[m_idx][CS_ETMV4_TRCIDR2]; + t_params[t_idx].etmv4.reg_idr8 = metadata[m_idx][CS_ETMV4_TRCIDR8]; + t_params[t_idx].etmv4.reg_configr = metadata[m_idx][CS_ETMV4_TRCCONFIGR]; + t_params[t_idx].etmv4.reg_traceidr = metadata[m_idx][CS_ETMV4_TRCTRACEIDR]; } static void cs_etm__set_trace_param_ete(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, int idx) + struct cs_etm_auxtrace *etm, int t_idx, + int m_idx) { u64 **metadata = etm->metadata; - t_params[idx].protocol = CS_ETM_PROTO_ETE; - t_params[idx].ete.reg_idr0 = metadata[idx][CS_ETE_TRCIDR0]; - t_params[idx].ete.reg_idr1 = metadata[idx][CS_ETE_TRCIDR1]; - t_params[idx].ete.reg_idr2 = metadata[idx][CS_ETE_TRCIDR2]; - t_params[idx].ete.reg_idr8 = metadata[idx][CS_ETE_TRCIDR8]; - t_params[idx].ete.reg_configr = metadata[idx][CS_ETE_TRCCONFIGR]; - t_params[idx].ete.reg_traceidr = metadata[idx][CS_ETE_TRCTRACEIDR]; - t_params[idx].ete.reg_devarch = metadata[idx][CS_ETE_TRCDEVARCH]; + t_params[t_idx].protocol = CS_ETM_PROTO_ETE; + t_params[t_idx].ete.reg_idr0 = metadata[m_idx][CS_ETE_TRCIDR0]; + t_params[t_idx].ete.reg_idr1 = metadata[m_idx][CS_ETE_TRCIDR1]; + t_params[t_idx].ete.reg_idr2 = metadata[m_idx][CS_ETE_TRCIDR2]; + t_params[t_idx].ete.reg_idr8 = metadata[m_idx][CS_ETE_TRCIDR8]; + t_params[t_idx].ete.reg_configr = metadata[m_idx][CS_ETE_TRCCONFIGR]; + t_params[t_idx].ete.reg_traceidr = metadata[m_idx][CS_ETE_TRCTRACEIDR]; + t_params[t_idx].ete.reg_devarch = metadata[m_idx][CS_ETE_TRCDEVARCH]; } static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params, struct cs_etm_auxtrace *etm, + bool formatted, + int sample_cpu, int decoders) { - int i; + int t_idx, m_idx; u32 etmidr; u64 architecture; - for (i = 0; i < decoders; i++) { - architecture = etm->metadata[i][CS_ETM_MAGIC]; + for (t_idx = 0; t_idx < decoders; t_idx++) { + if (formatted) + m_idx = t_idx; + else { + m_idx = get_cpu_data_idx(etm, sample_cpu); + if (m_idx == -1) { + pr_warning("CS_ETM: unknown CPU, falling back to first metadata\n"); + m_idx = 0; + } + } + + architecture = etm->metadata[m_idx][CS_ETM_MAGIC]; switch (architecture) { case __perf_cs_etmv3_magic: - etmidr = etm->metadata[i][CS_ETM_ETMIDR]; - cs_etm__set_trace_param_etmv3(t_params, etm, i, etmidr); + etmidr = etm->metadata[m_idx][CS_ETM_ETMIDR]; + cs_etm__set_trace_param_etmv3(t_params, etm, t_idx, m_idx, etmidr); break; case __perf_cs_etmv4_magic: - cs_etm__set_trace_param_etmv4(t_params, etm, i); + cs_etm__set_trace_param_etmv4(t_params, etm, t_idx, m_idx); break; case __perf_cs_ete_magic: - cs_etm__set_trace_param_ete(t_params, etm, i); + cs_etm__set_trace_param_ete(t_params, etm, t_idx, m_idx); break; default: return -EINVAL; @@ -1016,7 +1039,7 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u8 trace_chan_id, } static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, - bool formatted) + bool formatted, int sample_cpu) { struct cs_etm_decoder_params d_params; struct cs_etm_trace_params *t_params = NULL; @@ -1041,7 +1064,7 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, if (!t_params) goto out_free; - if (cs_etm__init_trace_params(t_params, etm, decoders)) + if (cs_etm__init_trace_params(t_params, etm, formatted, sample_cpu, decoders)) goto out_free; /* Set decoder parameters to decode trace packets */ @@ -1081,14 +1104,15 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, unsigned int queue_nr, - bool formatted) + bool formatted, + int sample_cpu) { struct cs_etm_queue *etmq = queue->priv; if (list_empty(&queue->head) || etmq) return 0; - etmq = cs_etm__alloc_queue(etm, formatted); + etmq = cs_etm__alloc_queue(etm, formatted, sample_cpu); if (!etmq) return -ENOMEM; @@ -2827,7 +2851,7 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session, * formatted in piped mode (true). */ err = cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], - idx, true); + idx, true, -1); if (err) return err; @@ -3033,7 +3057,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o idx = auxtrace_event->idx; formatted = !(aux_event->flags & PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW); return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], - idx, formatted); + idx, formatted, sample->cpu); } /* Wasn't inside this buffer, but there were no parse errors. 1 == 'not found' */ From 623444afc0d403ea1501c543e62ab331aa0bddab Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Mon, 21 Aug 2023 18:16:08 -0500 Subject: [PATCH 230/548] perf: arm_cspmu: Separate Arm and vendor module Arm Coresight PMU driver consists of main standard code and vendor backend code. Both are currently built as a single module. This patch adds vendor registration API to separate the two to keep things modular. The main driver requests each known backend module during initialization and defer device binding process. The backend module then registers an init callback to the main driver and continue the device driver binding process. Signed-off-by: Besar Wicaksono Reviewed-by: Suzuki K Poulose Reviewed-and-tested-by: Ilkka Koskinen Link: https://lore.kernel.org/r/20230821231608.50911-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit bfc653aa89cb05796d7b4e046600accb442c9b7a) Signed-off-by: Koba Ko --- drivers/perf/arm_cspmu/Kconfig | 9 +- drivers/perf/arm_cspmu/Makefile | 6 +- drivers/perf/arm_cspmu/arm_cspmu.c | 168 ++++++++++++++++++++------ drivers/perf/arm_cspmu/arm_cspmu.h | 25 +++- drivers/perf/arm_cspmu/nvidia_cspmu.c | 34 +++++- drivers/perf/arm_cspmu/nvidia_cspmu.h | 17 --- 6 files changed, 199 insertions(+), 60 deletions(-) delete mode 100644 drivers/perf/arm_cspmu/nvidia_cspmu.h diff --git a/drivers/perf/arm_cspmu/Kconfig b/drivers/perf/arm_cspmu/Kconfig index 25d25ded09832..d5f787d22234d 100644 --- a/drivers/perf/arm_cspmu/Kconfig +++ b/drivers/perf/arm_cspmu/Kconfig @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 # -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. config ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU tristate "ARM Coresight Architecture PMU" @@ -10,3 +10,10 @@ config ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU based on ARM CoreSight PMU architecture. Note that this PMU architecture does not have relationship with the ARM CoreSight Self-Hosted Tracing. + +config NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU + tristate "NVIDIA Coresight Architecture PMU" + depends on ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU + help + Provides NVIDIA specific attributes for performance monitoring unit + (PMU) devices based on ARM CoreSight PMU architecture. diff --git a/drivers/perf/arm_cspmu/Makefile b/drivers/perf/arm_cspmu/Makefile index fedb17df982d9..0309d2ff264a1 100644 --- a/drivers/perf/arm_cspmu/Makefile +++ b/drivers/perf/arm_cspmu/Makefile @@ -1,6 +1,8 @@ -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. # # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu_module.o -arm_cspmu_module-y := arm_cspmu.o nvidia_cspmu.o +arm_cspmu_module-y := arm_cspmu.o + +obj-$(CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += nvidia_cspmu.o diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c index 9363c31f31b89..2167f9f156ec7 100644 --- a/drivers/perf/arm_cspmu/arm_cspmu.c +++ b/drivers/perf/arm_cspmu/arm_cspmu.c @@ -16,7 +16,7 @@ * The user should refer to the vendor technical documentation to get details * about the supported events. * - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. * */ @@ -26,11 +26,11 @@ #include #include #include +#include #include #include #include "arm_cspmu.h" -#include "nvidia_cspmu.h" #define PMUNAME "arm_cspmu" #define DRVNAME "arm-cs-arch-pmu" @@ -112,11 +112,10 @@ */ #define HILOHI_MAX_POLL 1000 -/* JEDEC-assigned JEP106 identification code */ -#define ARM_CSPMU_IMPL_ID_NVIDIA 0x36B - static unsigned long arm_cspmu_cpuhp_state; +static DEFINE_MUTEX(arm_cspmu_lock); + static struct acpi_apmt_node *arm_cspmu_apmt_node(struct device *dev) { return *(struct acpi_apmt_node **)dev_get_platdata(dev); @@ -373,27 +372,37 @@ static struct attribute_group arm_cspmu_cpumask_attr_group = { .attrs = arm_cspmu_cpumask_attrs, }; -struct impl_match { - u32 pmiidr; - u32 mask; - int (*impl_init_ops)(struct arm_cspmu *cspmu); -}; - -static const struct impl_match impl_match[] = { +static struct arm_cspmu_impl_match impl_match[] = { { - .pmiidr = ARM_CSPMU_IMPL_ID_NVIDIA, - .mask = ARM_CSPMU_PMIIDR_IMPLEMENTER, - .impl_init_ops = nv_cspmu_init_ops + .module_name = "nvidia_cspmu", + .pmiidr_val = ARM_CSPMU_IMPL_ID_NVIDIA, + .pmiidr_mask = ARM_CSPMU_PMIIDR_IMPLEMENTER, + .module = NULL, + .impl_init_ops = NULL, }, - {} + {0} }; +static struct arm_cspmu_impl_match *arm_cspmu_impl_match_get(u32 pmiidr) +{ + struct arm_cspmu_impl_match *match = impl_match; + + for (; match->pmiidr_val; match++) { + u32 mask = match->pmiidr_mask; + + if ((match->pmiidr_val & mask) == (pmiidr & mask)) + return match; + } + + return NULL; +} + static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu) { - int ret; + int ret = 0; struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops; struct acpi_apmt_node *apmt_node = arm_cspmu_apmt_node(cspmu->dev); - const struct impl_match *match = impl_match; + struct arm_cspmu_impl_match *match; /* * Get PMU implementer and product id from APMT node. @@ -405,17 +414,36 @@ static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu) readl(cspmu->base0 + PMIIDR); /* Find implementer specific attribute ops. */ - for (; match->pmiidr; match++) { - const u32 mask = match->mask; + match = arm_cspmu_impl_match_get(cspmu->impl.pmiidr); + + /* Load implementer module and initialize the callbacks. */ + if (match) { + mutex_lock(&arm_cspmu_lock); + + if (match->impl_init_ops) { + /* Prevent unload until PMU registration is done. */ + if (try_module_get(match->module)) { + cspmu->impl.module = match->module; + cspmu->impl.match = match; + ret = match->impl_init_ops(cspmu); + if (ret) + module_put(match->module); + } else { + WARN(1, "arm_cspmu failed to get module: %s\n", + match->module_name); + ret = -EINVAL; + } + } else { + request_module_nowait(match->module_name); + ret = -EPROBE_DEFER; + } - if ((match->pmiidr & mask) == (cspmu->impl.pmiidr & mask)) { - ret = match->impl_init_ops(cspmu); - if (ret) - return ret; + mutex_unlock(&arm_cspmu_lock); - break; - } - } + if (ret) + return ret; + } else + cspmu->impl.module = THIS_MODULE; /* Use default callbacks if implementer doesn't provide one. */ CHECK_DEFAULT_IMPL_OPS(impl_ops, get_event_attrs); @@ -478,11 +506,6 @@ arm_cspmu_alloc_attr_group(struct arm_cspmu *cspmu) struct attribute_group **attr_groups = NULL; struct device *dev = cspmu->dev; const struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops; - int ret; - - ret = arm_cspmu_init_impl_ops(cspmu); - if (ret) - return NULL; cspmu->identifier = impl_ops->get_identifier(cspmu); cspmu->name = impl_ops->get_name(cspmu); @@ -1152,7 +1175,7 @@ static int arm_cspmu_register_pmu(struct arm_cspmu *cspmu) cspmu->pmu = (struct pmu){ .task_ctx_nr = perf_invalid_context, - .module = THIS_MODULE, + .module = cspmu->impl.module, .pmu_enable = arm_cspmu_enable, .pmu_disable = arm_cspmu_disable, .event_init = arm_cspmu_event_init, @@ -1199,11 +1222,17 @@ static int arm_cspmu_device_probe(struct platform_device *pdev) if (ret) return ret; - ret = arm_cspmu_register_pmu(cspmu); + ret = arm_cspmu_init_impl_ops(cspmu); if (ret) return ret; - return 0; + ret = arm_cspmu_register_pmu(cspmu); + + /* Matches arm_cspmu_init_impl_ops() above. */ + if (cspmu->impl.module != THIS_MODULE) + module_put(cspmu->impl.module); + + return ret; } static int arm_cspmu_device_remove(struct platform_device *pdev) @@ -1303,6 +1332,75 @@ static void __exit arm_cspmu_exit(void) cpuhp_remove_multi_state(arm_cspmu_cpuhp_state); } +int arm_cspmu_impl_register(const struct arm_cspmu_impl_match *impl_match) +{ + struct arm_cspmu_impl_match *match; + int ret = 0; + + match = arm_cspmu_impl_match_get(impl_match->pmiidr_val); + + if (match) { + mutex_lock(&arm_cspmu_lock); + + if (!match->impl_init_ops) { + match->module = impl_match->module; + match->impl_init_ops = impl_match->impl_init_ops; + } else { + /* Broken match table may contain non-unique entries */ + WARN(1, "arm_cspmu backend already registered for module: %s, pmiidr: 0x%x, mask: 0x%x\n", + match->module_name, + match->pmiidr_val, + match->pmiidr_mask); + + ret = -EINVAL; + } + + mutex_unlock(&arm_cspmu_lock); + + if (!ret) + ret = driver_attach(&arm_cspmu_driver.driver); + } else { + pr_err("arm_cspmu reg failed, unable to find a match for pmiidr: 0x%x\n", + impl_match->pmiidr_val); + + ret = -EINVAL; + } + + return ret; +} +EXPORT_SYMBOL_GPL(arm_cspmu_impl_register); + +static int arm_cspmu_match_device(struct device *dev, const void *match) +{ + struct arm_cspmu *cspmu = platform_get_drvdata(to_platform_device(dev)); + + return (cspmu && cspmu->impl.match == match) ? 1 : 0; +} + +void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_match *impl_match) +{ + struct device *dev; + struct arm_cspmu_impl_match *match; + + match = arm_cspmu_impl_match_get(impl_match->pmiidr_val); + + if (WARN_ON(!match)) + return; + + /* Unbind the driver from all matching backend devices. */ + while ((dev = driver_find_device(&arm_cspmu_driver.driver, NULL, + match, arm_cspmu_match_device))) + device_release_driver(dev); + + mutex_lock(&arm_cspmu_lock); + + match->module = NULL; + match->impl_init_ops = NULL; + + mutex_unlock(&arm_cspmu_lock); +} +EXPORT_SYMBOL_GPL(arm_cspmu_impl_unregister); + module_init(arm_cspmu_init); module_exit(arm_cspmu_exit); diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h index 83df53d1c132c..7936a90ded7fe 100644 --- a/drivers/perf/arm_cspmu/arm_cspmu.h +++ b/drivers/perf/arm_cspmu/arm_cspmu.h @@ -1,7 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 * * ARM CoreSight Architecture PMU driver. - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. * */ @@ -69,6 +69,9 @@ #define ARM_CSPMU_PMIIDR_IMPLEMENTER GENMASK(11, 0) #define ARM_CSPMU_PMIIDR_PRODUCTID GENMASK(31, 20) +/* JEDEC-assigned JEP106 identification code */ +#define ARM_CSPMU_IMPL_ID_NVIDIA 0x36B + struct arm_cspmu; /* This tracks the events assigned to each counter in the PMU. */ @@ -106,9 +109,23 @@ struct arm_cspmu_impl_ops { struct attribute *attr, int unused); }; +/* Vendor/implementer registration parameter. */ +struct arm_cspmu_impl_match { + /* Backend module. */ + struct module *module; + const char *module_name; + /* PMIIDR value/mask. */ + u32 pmiidr_val; + u32 pmiidr_mask; + /* Callback to vendor backend to init arm_cspmu_impl::ops. */ + int (*impl_init_ops)(struct arm_cspmu *cspmu); +}; + /* Vendor/implementer descriptor. */ struct arm_cspmu_impl { u32 pmiidr; + struct module *module; + struct arm_cspmu_impl_match *match; struct arm_cspmu_impl_ops ops; void *ctx; }; @@ -147,4 +164,10 @@ ssize_t arm_cspmu_sysfs_format_show(struct device *dev, struct device_attribute *attr, char *buf); +/* Register vendor backend. */ +int arm_cspmu_impl_register(const struct arm_cspmu_impl_match *impl_match); + +/* Unregister vendor backend. */ +void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_match *impl_match); + #endif /* __ARM_CSPMU_H__ */ diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c index 72ef80caa3c87..0382b702f0920 100644 --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c @@ -1,14 +1,15 @@ // SPDX-License-Identifier: GPL-2.0 /* - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. * */ /* Support for NVIDIA specific attributes. */ +#include #include -#include "nvidia_cspmu.h" +#include "arm_cspmu.h" #define NV_PCIE_PORT_COUNT 10ULL #define NV_PCIE_FILTER_ID_MASK GENMASK_ULL(NV_PCIE_PORT_COUNT - 1, 0) @@ -351,7 +352,7 @@ static char *nv_cspmu_format_name(const struct arm_cspmu *cspmu, return name; } -int nv_cspmu_init_ops(struct arm_cspmu *cspmu) +static int nv_cspmu_init_ops(struct arm_cspmu *cspmu) { u32 prodid; struct nv_cspmu_ctx *ctx; @@ -395,6 +396,31 @@ int nv_cspmu_init_ops(struct arm_cspmu *cspmu) return 0; } -EXPORT_SYMBOL_GPL(nv_cspmu_init_ops); + +/* Match all NVIDIA Coresight PMU devices */ +static const struct arm_cspmu_impl_match nv_cspmu_param = { + .pmiidr_val = ARM_CSPMU_IMPL_ID_NVIDIA, + .module = THIS_MODULE, + .impl_init_ops = nv_cspmu_init_ops +}; + +static int __init nvidia_cspmu_init(void) +{ + int ret; + + ret = arm_cspmu_impl_register(&nv_cspmu_param); + if (ret) + pr_err("nvidia_cspmu backend registration error: %d\n", ret); + + return ret; +} + +static void __exit nvidia_cspmu_exit(void) +{ + arm_cspmu_impl_unregister(&nv_cspmu_param); +} + +module_init(nvidia_cspmu_init); +module_exit(nvidia_cspmu_exit); MODULE_LICENSE("GPL v2"); diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.h b/drivers/perf/arm_cspmu/nvidia_cspmu.h deleted file mode 100644 index 71e18f0dc50b7..0000000000000 --- a/drivers/perf/arm_cspmu/nvidia_cspmu.h +++ /dev/null @@ -1,17 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 - * - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. - * - */ - -/* Support for NVIDIA specific attributes. */ - -#ifndef __NVIDIA_CSPMU_H__ -#define __NVIDIA_CSPMU_H__ - -#include "arm_cspmu.h" - -/* Allocate NVIDIA descriptor. */ -int nv_cspmu_init_ops(struct arm_cspmu *cspmu); - -#endif /* __NVIDIA_CSPMU_H__ */ From cb5e08fdc7d32f285af9484c9c0dfc5601ae80fc Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Tue, 29 Aug 2023 19:24:05 +0530 Subject: [PATCH 231/548] coresight: trbe: Enable ACPI based TRBE devices This detects and enables ACPI based TRBE devices via the dummy platform device created earlier for this purpose. Cc: Suzuki K Poulose Cc: Mike Leach Cc: Leo Yan Cc: Alexander Shishkin Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20230829135405.1159449-3-anshuman.khandual@arm.com (cherry picked from commit 17f8b216e02654a0b37127736ce78b32ccaa867b) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-trbe.c | 9 +++++++++ drivers/hwtracing/coresight/coresight-trbe.h | 2 ++ 2 files changed, 11 insertions(+) diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index e20c1c6acc731..d5e2f2299edef 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -1536,7 +1536,16 @@ static const struct of_device_id arm_trbe_of_match[] = { }; MODULE_DEVICE_TABLE(of, arm_trbe_of_match); +#ifdef CONFIG_ACPI +static const struct platform_device_id arm_trbe_acpi_match[] = { + { ARMV8_TRBE_PDEV_NAME, 0 }, + { } +}; +MODULE_DEVICE_TABLE(platform, arm_trbe_acpi_match); +#endif + static struct platform_driver arm_trbe_driver = { + .id_table = ACPI_PTR(arm_trbe_acpi_match), .driver = { .name = DRVNAME, .of_match_table = of_match_ptr(arm_trbe_of_match), diff --git a/drivers/hwtracing/coresight/coresight-trbe.h b/drivers/hwtracing/coresight/coresight-trbe.h index e915e749be551..45202c48accec 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.h +++ b/drivers/hwtracing/coresight/coresight-trbe.h @@ -7,11 +7,13 @@ * * Author: Anshuman Khandual */ +#include #include #include #include #include #include +#include #include #include From 718452ab2800e78cc5f8604eeccb9eb0b1ba476a Mon Sep 17 00:00:00 2001 From: Anshuman Khandual Date: Tue, 29 Aug 2023 19:24:04 +0530 Subject: [PATCH 232/548] coresight: trbe: Add a representative coresight_platform_data for TRBE TRBE coresight devices do not need regular connections information, as the paths get built between all percpu source and their respective percpu sink devices. Please refer 'commit 2cd87a7b293d ("coresight: core: Add support for dedicated percpu sinks")' which added support for percpu sink devices. coresight_register() expect device connections via the platform_data. TRBE devices do not have any graph connections and thus is empty. With upcoming ACPI support for TRBE, we do not get a real acpi_device and thus coresight_get_platform_dat() will end up in failures. Hence this allocates a zeroed coresight_platform_data structure and assigns that back into the device. Cc: Suzuki K Poulose Cc: Mike Leach Cc: Leo Yan Cc: Alexander Shishkin Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20230829135405.1159449-2-anshuman.khandual@arm.com (cherry picked from commit 4277f035d227e829133df284be7e35b7236a5b0f) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-trbe.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c index d5e2f2299edef..a3954be7b1f3b 100644 --- a/drivers/hwtracing/coresight/coresight-trbe.c +++ b/drivers/hwtracing/coresight/coresight-trbe.c @@ -1253,8 +1253,18 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp desc.name = devm_kasprintf(dev, GFP_KERNEL, "trbe%d", cpu); if (!desc.name) goto cpu_clear; - - desc.pdata = coresight_get_platform_data(dev); + /* + * TRBE coresight devices do not need regular connections + * information, as the paths get built between all percpu + * source and their respective percpu sink devices. Though + * coresight_register() expect device connections via the + * platform_data, which TRBE devices do not have. As they + * are not real ACPI devices, coresight_get_platform_data() + * ends up failing. Instead let's allocate a dummy zeroed + * coresight_platform_data structure and assign that back + * into the device for that purpose. + */ + desc.pdata = devm_kzalloc(dev, sizeof(*desc.pdata), GFP_KERNEL); if (IS_ERR(desc.pdata)) goto cpu_clear; From 33efbe08a8eee61ef01def4e29a432085e8a0c4d Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 1 Sep 2023 16:39:45 -0700 Subject: [PATCH 233/548] perf parse-events: Remove unnecessary __maybe_unused The parameter head_terms is always used in get_config_terms. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Rob Herring Link: https://lore.kernel.org/r/20230901233949.2930562-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 6fcfe54d2c91925ec3054cf25e68064913ca7948) Signed-off-by: Koba Ko --- tools/perf/util/parse-events.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 65608a3cba819..e9e3623f3fed0 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -34,8 +34,7 @@ #ifdef PARSER_DEBUG extern int parse_events_debug; #endif -static int get_config_terms(struct list_head *head_config, - struct list_head *head_terms __maybe_unused); +static int get_config_terms(struct list_head *head_config, struct list_head *head_terms); struct event_symbol event_symbols_hw[PERF_COUNT_HW_MAX] = { [PERF_COUNT_HW_CPU_CYCLES] = { @@ -1079,8 +1078,7 @@ static int config_attr(struct perf_event_attr *attr, return 0; } -static int get_config_terms(struct list_head *head_config, - struct list_head *head_terms __maybe_unused) +static int get_config_terms(struct list_head *head_config, struct list_head *head_terms) { #define ADD_CONFIG_TERM(__type, __weak) \ struct evsel_config_term *__t; \ From 6992bdcf7f64c894892088274bd16c8e2be2b84b Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 1 Sep 2023 16:39:46 -0700 Subject: [PATCH 234/548] perf parse-events: Tidy up str parameter Add a const and rename str to event_name. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Rob Herring Link: https://lore.kernel.org/r/20230901233949.2930562-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 8f91662ef8be473fb025e170601b0dd75838f7d4) Signed-off-by: Koba Ko --- tools/perf/util/parse-events.c | 13 +++++++------ tools/perf/util/parse-events.h | 2 +- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index e9e3623f3fed0..283c559a35b41 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -1482,7 +1482,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, } int parse_events_multi_pmu_add(struct parse_events_state *parse_state, - char *str, struct list_head *head, + const char *event_name, struct list_head *head, struct list_head **listp, void *loc_) { struct parse_events_term *term; @@ -1502,7 +1502,8 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, INIT_LIST_HEAD(head); } - config = strdup(str); + + config = strdup(event_name); if (!config) goto out_err; @@ -1528,7 +1529,7 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, if (parse_events__filter_pmu(parse_state, pmu)) continue; - if (!perf_pmu__have_event(pmu, str)) + if (!perf_pmu__have_event(pmu, event_name)) continue; auto_merge_stats = perf_pmu__auto_merge_stats(pmu); @@ -1539,7 +1540,7 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, strbuf_init(&sb, /*hint=*/ 0); parse_events_term__to_strbuf(orig_head, &sb); - pr_debug("%s -> %s/%s/\n", str, pmu->name, sb.buf); + pr_debug("%s -> %s/%s/\n", event_name, pmu->name, sb.buf); strbuf_release(&sb); ok++; } @@ -1547,13 +1548,13 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, } if (parse_state->fake_pmu) { - if (!parse_events_add_pmu(parse_state, list, str, head, + if (!parse_events_add_pmu(parse_state, list, event_name, head, /*auto_merge_stats=*/true, loc)) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); parse_events_term__to_strbuf(head, &sb); - pr_debug("%s -> %s/%s/\n", str, "fake_pmu", sb.buf); + pr_debug("%s -> %s/%s/\n", event_name, "fake_pmu", sb.buf); strbuf_release(&sb); ok++; } diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index 594e5d2dc67f3..36a67ef7b35a1 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -217,7 +217,7 @@ struct evsel *parse_events__add_event(int idx, struct perf_event_attr *attr, struct perf_pmu *pmu); int parse_events_multi_pmu_add(struct parse_events_state *parse_state, - char *str, + const char *event_name, struct list_head *head_config, struct list_head **listp, void *loc); From 52df124f31b4ba7f5134a3b6222e8e11379e08cd Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 1 Sep 2023 16:39:47 -0700 Subject: [PATCH 235/548] perf parse-events: Avoid enum casts Add term_type to union of values returned by the lexer to avoid casts to and from an integer. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Rob Herring Link: https://lore.kernel.org/r/20230901233949.2930562-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 4163644818e95ea6b0afb3982b34c4d59ed50bb2) Signed-off-by: Koba Ko --- tools/perf/util/parse-events.l | 2 +- tools/perf/util/parse-events.y | 25 +++++++++++-------------- 2 files changed, 12 insertions(+), 15 deletions(-) diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l index 4ef4b6f171a0d..7bdf0565a92c0 100644 --- a/tools/perf/util/parse-events.l +++ b/tools/perf/util/parse-events.l @@ -120,7 +120,7 @@ static int term(yyscan_t scanner, enum parse_events__term_type type) { YYSTYPE *yylval = parse_events_get_lval(scanner); - yylval->num = type; + yylval->term_type = type; return PE_TERM; } diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y index c3a86ef4b7cf3..720630202d4c4 100644 --- a/tools/perf/util/parse-events.y +++ b/tools/perf/util/parse-events.y @@ -70,7 +70,7 @@ static void free_list_evsel(struct list_head* list_evsel) %type PE_VALUE_SYM_HW %type PE_VALUE_SYM_SW %type PE_VALUE_SYM_TOOL -%type PE_TERM +%type PE_TERM %type value_sym %type PE_RAW %type PE_NAME @@ -112,6 +112,7 @@ static void free_list_evsel(struct list_head* list_evsel) { char *str; u64 num; + enum parse_events__term_type term_type; struct list_head *list_evsel; struct list_head *list_terms; struct parse_events_term *term; @@ -777,8 +778,7 @@ PE_TERM_HW PE_TERM '=' name_or_raw { struct parse_events_term *term; - int err = parse_events_term__str(&term, (enum parse_events__term_type)$1, - /*config=*/NULL, $3, &@1, &@3); + int err = parse_events_term__str(&term, $1, /*config=*/NULL, $3, &@1, &@3); if (err) { free($3); @@ -790,8 +790,7 @@ PE_TERM '=' name_or_raw PE_TERM '=' PE_TERM_HW { struct parse_events_term *term; - int err = parse_events_term__str(&term, (enum parse_events__term_type)$1, - /*config=*/NULL, $3.str, &@1, &@3); + int err = parse_events_term__str(&term, $1, /*config=*/NULL, $3.str, &@1, &@3); if (err) { free($3.str); @@ -803,10 +802,7 @@ PE_TERM '=' PE_TERM_HW PE_TERM '=' PE_TERM { struct parse_events_term *term; - int err = parse_events_term__term(&term, - (enum parse_events__term_type)$1, - (enum parse_events__term_type)$3, - &@1, &@3); + int err = parse_events_term__term(&term, $1, $3, &@1, &@3); if (err) PE_ABORT(err); @@ -817,8 +813,9 @@ PE_TERM '=' PE_TERM PE_TERM '=' PE_VALUE { struct parse_events_term *term; - int err = parse_events_term__num(&term, (enum parse_events__term_type)$1, - /*config=*/NULL, $3, /*novalue=*/false, &@1, &@3); + int err = parse_events_term__num(&term, $1, + /*config=*/NULL, $3, /*novalue=*/false, + &@1, &@3); if (err) PE_ABORT(err); @@ -829,9 +826,9 @@ PE_TERM '=' PE_VALUE PE_TERM { struct parse_events_term *term; - int err = parse_events_term__num(&term, (enum parse_events__term_type)$1, - /*config=*/NULL, /*num=*/1, /*novalue=*/true, - &@1, /*loc_val=*/NULL); + int err = parse_events_term__num(&term, $1, + /*config=*/NULL, /*num=*/1, /*novalue=*/true, + &@1, /*loc_val=*/NULL); if (err) PE_ABORT(err); From ccc6c5e604db0b70426b0cc7fd4ce79d5b457429 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 1 Sep 2023 16:39:48 -0700 Subject: [PATCH 236/548] perf parse-events: Copy fewer term lists When trying to add events to multiple PMUs the term list is copied first as adding the event will rewrite the event's name term into the sysfs and/or json encoding terms (see perf_pmu__check_alias). Change the parse events add API so the passed in term list is const, then copy the list when modification is necessary. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Rob Herring Link: https://lore.kernel.org/r/20230901233949.2930562-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 727adeed06e82915841e121762eb329881ae0107) Signed-off-by: Koba Ko --- tools/perf/util/parse-events.c | 108 ++++++++++++++++++--------------- tools/perf/util/parse-events.h | 7 +-- tools/perf/util/parse-events.y | 17 +----- 3 files changed, 65 insertions(+), 67 deletions(-) diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 283c559a35b41..06a844bcce4af 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -35,6 +35,7 @@ extern int parse_events_debug; #endif static int get_config_terms(struct list_head *head_config, struct list_head *head_terms); +static int parse_events_terms__copy(const struct list_head *src, struct list_head *dest); struct event_symbol event_symbols_hw[PERF_COUNT_HW_MAX] = { [PERF_COUNT_HW_CPU_CYCLES] = { @@ -1367,7 +1368,7 @@ static bool config_term_percore(struct list_head *config_terms) int parse_events_add_pmu(struct parse_events_state *parse_state, struct list_head *list, const char *name, - struct list_head *head_config, + const struct list_head *const_head_terms, bool auto_merge_stats, void *loc_) { struct perf_event_attr attr; @@ -1377,6 +1378,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, struct parse_events_error *err = parse_state->error; YYLTYPE *loc = loc_; LIST_HEAD(config_terms); + LIST_HEAD(head_terms); pmu = parse_state->fake_pmu ?: perf_pmus__find(name); @@ -1390,32 +1392,37 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, return -EINVAL; } + if (const_head_terms) { + int ret = parse_events_terms__copy(const_head_terms, &head_terms); + + if (ret) + return ret; + } + if (verbose > 1) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - if (pmu->selectable && !head_config) { + if (pmu->selectable && list_empty(&head_terms)) { strbuf_addf(&sb, "%s//", name); } else { strbuf_addf(&sb, "%s/", name); - parse_events_term__to_strbuf(head_config, &sb); + parse_events_term__to_strbuf(&head_terms, &sb); strbuf_addch(&sb, '/'); } fprintf(stderr, "Attempt to add: %s\n", sb.buf); strbuf_release(&sb); } - if (head_config) - fix_raw(head_config, pmu); + fix_raw(&head_terms, pmu); if (pmu->default_config) { - memcpy(&attr, pmu->default_config, - sizeof(struct perf_event_attr)); + memcpy(&attr, pmu->default_config, sizeof(struct perf_event_attr)); } else { memset(&attr, 0, sizeof(attr)); } attr.type = pmu->type; - if (!head_config) { + if (list_empty(&head_terms)) { evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, /*name=*/NULL, /*metric_id=*/NULL, pmu, @@ -1424,14 +1431,16 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, return evsel ? 0 : -ENOMEM; } - if (!parse_state->fake_pmu && perf_pmu__check_alias(pmu, head_config, &info, err)) + if (!parse_state->fake_pmu && perf_pmu__check_alias(pmu, &head_terms, &info, err)) { + parse_events_terms__purge(&head_terms); return -EINVAL; + } if (verbose > 1) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - parse_events_term__to_strbuf(head_config, &sb); + parse_events_term__to_strbuf(&head_terms, &sb); fprintf(stderr, "..after resolving event: %s/%s/\n", name, sb.buf); strbuf_release(&sb); } @@ -1440,39 +1449,52 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, * Configure hardcoded terms first, no need to check * return value when called with fail == 0 ;) */ - if (config_attr(&attr, head_config, parse_state->error, config_term_pmu)) + if (config_attr(&attr, &head_terms, parse_state->error, config_term_pmu)) { + parse_events_terms__purge(&head_terms); return -EINVAL; + } - if (get_config_terms(head_config, &config_terms)) + if (get_config_terms(&head_terms, &config_terms)) { + parse_events_terms__purge(&head_terms); return -ENOMEM; + } /* * When using default config, record which bits of attr->config were * changed by the user. */ - if (pmu->default_config && get_config_chgs(pmu, head_config, &config_terms)) + if (pmu->default_config && get_config_chgs(pmu, &head_terms, &config_terms)) { + parse_events_terms__purge(&head_terms); return -ENOMEM; + } - if (!parse_state->fake_pmu && perf_pmu__config(pmu, &attr, head_config, parse_state->error)) { + if (!parse_state->fake_pmu && + perf_pmu__config(pmu, &attr, &head_terms, parse_state->error)) { free_config_terms(&config_terms); + parse_events_terms__purge(&head_terms); return -EINVAL; } evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, - get_config_name(head_config), - get_config_metric_id(head_config), pmu, + get_config_name(&head_terms), + get_config_metric_id(&head_terms), pmu, &config_terms, auto_merge_stats, /*cpu_list=*/NULL); - if (!evsel) + if (!evsel) { + parse_events_terms__purge(&head_terms); return -ENOMEM; + } if (evsel->name) evsel->use_config_name = true; evsel->percore = config_term_percore(&evsel->config_terms); - if (parse_state->fake_pmu) + if (parse_state->fake_pmu) { + parse_events_terms__purge(&head_terms); return 0; + } + parse_events_terms__purge(&head_terms); free((char *)evsel->unit); evsel->unit = strdup(info.unit); evsel->scale = info.scale; @@ -1482,25 +1504,25 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, } int parse_events_multi_pmu_add(struct parse_events_state *parse_state, - const char *event_name, struct list_head *head, + const char *event_name, + const struct list_head *const_head_terms, struct list_head **listp, void *loc_) { struct parse_events_term *term; struct list_head *list = NULL; - struct list_head *orig_head = NULL; struct perf_pmu *pmu = NULL; YYLTYPE *loc = loc_; int ok = 0; const char *config; + LIST_HEAD(head_terms); *listp = NULL; - if (!head) { - head = malloc(sizeof(struct list_head)); - if (!head) - goto out_err; + if (const_head_terms) { + int ret = parse_events_terms__copy(const_head_terms, &head_terms); - INIT_LIST_HEAD(head); + if (ret) + return ret; } config = strdup(event_name); @@ -1514,7 +1536,7 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, zfree(&config); goto out_err; } - list_add_tail(&term->list, head); + list_add_tail(&term->list, &head_terms); /* Add it for all PMUs that support the alias */ list = malloc(sizeof(struct list_head)); @@ -1533,27 +1555,25 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, continue; auto_merge_stats = perf_pmu__auto_merge_stats(pmu); - parse_events_copy_term_list(head, &orig_head); if (!parse_events_add_pmu(parse_state, list, pmu->name, - orig_head, auto_merge_stats, loc)) { + &head_terms, auto_merge_stats, loc)) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - parse_events_term__to_strbuf(orig_head, &sb); + parse_events_term__to_strbuf(&head_terms, &sb); pr_debug("%s -> %s/%s/\n", event_name, pmu->name, sb.buf); strbuf_release(&sb); ok++; } - parse_events_terms__delete(orig_head); } if (parse_state->fake_pmu) { - if (!parse_events_add_pmu(parse_state, list, event_name, head, + if (!parse_events_add_pmu(parse_state, list, event_name, &head_terms, /*auto_merge_stats=*/true, loc)) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - parse_events_term__to_strbuf(head, &sb); + parse_events_term__to_strbuf(&head_terms, &sb); pr_debug("%s -> %s/%s/\n", event_name, "fake_pmu", sb.buf); strbuf_release(&sb); ok++; @@ -1561,12 +1581,12 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, } out_err: + parse_events_terms__purge(&head_terms); if (ok) *listp = list; else free(list); - parse_events_terms__delete(head); return ok ? 0 : -1; } @@ -2543,27 +2563,19 @@ void parse_events_term__delete(struct parse_events_term *term) free(term); } -int parse_events_copy_term_list(struct list_head *old, - struct list_head **new) +static int parse_events_terms__copy(const struct list_head *src, struct list_head *dest) { - struct parse_events_term *term, *n; - int ret; - - if (!old) { - *new = NULL; - return 0; - } + struct parse_events_term *term; - *new = malloc(sizeof(struct list_head)); - if (!*new) - return -ENOMEM; - INIT_LIST_HEAD(*new); + list_for_each_entry (term, src, list) { + struct parse_events_term *n; + int ret; - list_for_each_entry (term, old, list) { ret = parse_events_term__clone(&n, term); if (ret) return ret; - list_add_tail(&n->list, *new); + + list_add_tail(&n->list, dest); } return 0; } diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index 36a67ef7b35a1..e6612856e8817 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -209,7 +209,7 @@ int parse_events_add_breakpoint(struct parse_events_state *parse_state, struct list_head *head_config); int parse_events_add_pmu(struct parse_events_state *parse_state, struct list_head *list, const char *name, - struct list_head *head_config, + const struct list_head *const_head_terms, bool auto_merge_stats, void *loc); struct evsel *parse_events__add_event(int idx, struct perf_event_attr *attr, @@ -218,12 +218,9 @@ struct evsel *parse_events__add_event(int idx, struct perf_event_attr *attr, int parse_events_multi_pmu_add(struct parse_events_state *parse_state, const char *event_name, - struct list_head *head_config, + const struct list_head *head_terms, struct list_head **listp, void *loc); -int parse_events_copy_term_list(struct list_head *old, - struct list_head **new); - void parse_events__set_leader(char *name, struct list_head *list); void parse_events_update_lists(struct list_head *list_event, struct list_head *list_all); diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y index 720630202d4c4..d878a040c240e 100644 --- a/tools/perf/util/parse-events.y +++ b/tools/perf/util/parse-events.y @@ -275,23 +275,18 @@ event_pmu: PE_NAME opt_pmu_config { struct parse_events_state *parse_state = _parse_state; - struct list_head *list = NULL, *orig_terms = NULL, *terms= NULL; + /* List of created evsels. */ + struct list_head *list = NULL; char *pattern = NULL; #define CLEANUP \ do { \ parse_events_terms__delete($2); \ - parse_events_terms__delete(orig_terms); \ free(list); \ free($1); \ free(pattern); \ } while(0) - if (parse_events_copy_term_list($2, &orig_terms)) { - CLEANUP; - YYNOMEM; - } - list = alloc_list(); if (!list) { CLEANUP; @@ -321,16 +316,11 @@ PE_NAME opt_pmu_config !perf_pmu__match(pattern, pmu->alias_name, $1)) { bool auto_merge_stats = perf_pmu__auto_merge_stats(pmu); - if (parse_events_copy_term_list(orig_terms, &terms)) { - CLEANUP; - YYNOMEM; - } - if (!parse_events_add_pmu(parse_state, list, pmu->name, terms, + if (!parse_events_add_pmu(parse_state, list, pmu->name, $2, auto_merge_stats, &@1)) { ok++; parse_state->wild_card_pmus = true; } - parse_events_terms__delete(terms); } } @@ -338,7 +328,6 @@ PE_NAME opt_pmu_config /* Failure to add, assume $1 is an event name. */ zfree(&list); ok = !parse_events_multi_pmu_add(parse_state, $1, $2, &list, &@1); - $2 = NULL; } if (!ok) { struct parse_events_error *error = parse_state->error; From 11ded1e12883bfc2628127a40965e2f8d1ef44a2 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 1 Sep 2023 16:39:49 -0700 Subject: [PATCH 237/548] perf parse-events: Introduce 'struct parse_events_terms' parse_events_terms() existed in function names but was passed a 'struct list_head'. As many parse_events functions take an evsel_config list as well as a parse_event_term list, and the naming head_terms and head_config is inconsistent, there's a potential to switch the lists and get errors. Introduce a 'struct parse_events_terms', that just wraps a list_head, to avoid this. Add the regular init/exit functions and transition the code to use them. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Rob Herring Link: https://lore.kernel.org/r/20230901233949.2930562-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 0d3f0e6f94ef58d5532e23b6d153b0890cf0014c) Signed-off-by: Koba Ko --- tools/perf/arch/x86/util/intel-pt.c | 15 +-- tools/perf/tests/parse-events.c | 12 +-- tools/perf/tests/pmu.c | 23 ++-- tools/perf/util/parse-events.c | 158 +++++++++++++++------------- tools/perf/util/parse-events.h | 29 +++-- tools/perf/util/parse-events.y | 12 +-- tools/perf/util/pmu.c | 49 ++++----- tools/perf/util/pmu.h | 6 +- 8 files changed, 160 insertions(+), 144 deletions(-) diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index aaa2c641e7871..69f02b3c812a3 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -65,28 +65,23 @@ static int intel_pt_parse_terms_with_default(struct perf_pmu *pmu, const char *str, u64 *config) { - struct list_head *terms; + struct parse_events_terms terms; struct perf_event_attr attr = { .size = 0, }; int err; - terms = malloc(sizeof(struct list_head)); - if (!terms) - return -ENOMEM; - - INIT_LIST_HEAD(terms); - - err = parse_events_terms(terms, str, /*input=*/ NULL); + parse_events_terms__init(&terms); + err = parse_events_terms(&terms, str, /*input=*/ NULL); if (err) goto out_free; attr.config = *config; - err = perf_pmu__config_terms(pmu, &attr, terms, /*zero=*/true, /*err=*/NULL); + err = perf_pmu__config_terms(pmu, &attr, &terms, /*zero=*/true, /*err=*/NULL); if (err) goto out_free; *config = attr.config; out_free: - parse_events_terms__delete(terms); + parse_events_terms__exit(&terms); return err; } diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c index d47f1f8711641..93796e885cefd 100644 --- a/tools/perf/tests/parse-events.c +++ b/tools/perf/tests/parse-events.c @@ -771,12 +771,12 @@ static int test__checkevent_pmu_events_mix(struct evlist *evlist) return TEST_OK; } -static int test__checkterms_simple(struct list_head *terms) +static int test__checkterms_simple(struct parse_events_terms *terms) { struct parse_events_term *term; /* config=10 */ - term = list_entry(terms->next, struct parse_events_term, list); + term = list_entry(terms->terms.next, struct parse_events_term, list); TEST_ASSERT_VAL("wrong type term", term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG); TEST_ASSERT_VAL("wrong type val", @@ -2363,7 +2363,7 @@ static const struct evlist_test test__events_pmu[] = { struct terms_test { const char *str; - int (*check)(struct list_head *terms); + int (*check)(struct parse_events_terms *terms); }; static const struct terms_test test__terms[] = { @@ -2467,11 +2467,11 @@ static int test__events2(struct test_suite *test __maybe_unused, int subtest __m static int test_term(const struct terms_test *t) { - struct list_head terms; + struct parse_events_terms terms; int ret; - INIT_LIST_HEAD(&terms); + parse_events_terms__init(&terms); ret = parse_events_terms(&terms, t->str, /*input=*/ NULL); if (ret) { pr_debug("failed to parse terms '%s', err %d\n", @@ -2480,7 +2480,7 @@ static int test_term(const struct terms_test *t) } ret = t->check(&terms); - parse_events_terms__purge(&terms); + parse_events_terms__exit(&terms); return ret; } diff --git a/tools/perf/tests/pmu.c b/tools/perf/tests/pmu.c index eb60e5f66859d..8f18127d876a7 100644 --- a/tools/perf/tests/pmu.c +++ b/tools/perf/tests/pmu.c @@ -128,30 +128,35 @@ static int test_format_dir_put(char *dir) return system(buf); } -static struct list_head *test_terms_list(void) +static void add_test_terms(struct parse_events_terms *terms) { - static LIST_HEAD(terms); unsigned int i; - for (i = 0; i < ARRAY_SIZE(test_terms); i++) - list_add_tail(&test_terms[i].list, &terms); + for (i = 0; i < ARRAY_SIZE(test_terms); i++) { + struct parse_events_term *clone; - return &terms; + parse_events_term__clone(&clone, &test_terms[i]); + list_add_tail(&clone->list, &terms->terms); + } } static int test__pmu(struct test_suite *test __maybe_unused, int subtest __maybe_unused) { char dir[PATH_MAX]; char *format; - struct list_head *terms = test_terms_list(); + struct parse_events_terms terms; struct perf_event_attr attr; struct perf_pmu *pmu; int fd; int ret; + parse_events_terms__init(&terms); + add_test_terms(&terms); pmu = zalloc(sizeof(*pmu)); - if (!pmu) + if (!pmu) { + parse_events_terms__exit(&terms); return -ENOMEM; + } INIT_LIST_HEAD(&pmu->format); INIT_LIST_HEAD(&pmu->aliases); @@ -159,6 +164,7 @@ static int test__pmu(struct test_suite *test __maybe_unused, int subtest __maybe format = test_format_dir_get(dir, sizeof(dir)); if (!format) { free(pmu); + parse_events_terms__exit(&terms); return -EINVAL; } @@ -175,7 +181,7 @@ static int test__pmu(struct test_suite *test __maybe_unused, int subtest __maybe if (ret) goto out; - ret = perf_pmu__config_terms(pmu, &attr, terms, /*zero=*/false, /*err=*/NULL); + ret = perf_pmu__config_terms(pmu, &attr, &terms, /*zero=*/false, /*err=*/NULL); if (ret) goto out; @@ -191,6 +197,7 @@ static int test__pmu(struct test_suite *test __maybe_unused, int subtest __maybe out: test_format_dir_put(format); perf_pmu__delete(pmu); + parse_events_terms__exit(&terms); return ret; } diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 06a844bcce4af..c56e07bd7dd67 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -34,8 +34,9 @@ #ifdef PARSER_DEBUG extern int parse_events_debug; #endif -static int get_config_terms(struct list_head *head_config, struct list_head *head_terms); -static int parse_events_terms__copy(const struct list_head *src, struct list_head *dest); +static int get_config_terms(struct parse_events_terms *head_config, struct list_head *head_terms); +static int parse_events_terms__copy(const struct parse_events_terms *src, + struct parse_events_terms *dest); struct event_symbol event_symbols_hw[PERF_COUNT_HW_MAX] = { [PERF_COUNT_HW_CPU_CYCLES] = { @@ -153,26 +154,27 @@ const char *event_type(int type) return "unknown"; } -static char *get_config_str(struct list_head *head_terms, enum parse_events__term_type type_term) +static char *get_config_str(struct parse_events_terms *head_terms, + enum parse_events__term_type type_term) { struct parse_events_term *term; if (!head_terms) return NULL; - list_for_each_entry(term, head_terms, list) + list_for_each_entry(term, &head_terms->terms, list) if (term->type_term == type_term) return term->val.str; return NULL; } -static char *get_config_metric_id(struct list_head *head_terms) +static char *get_config_metric_id(struct parse_events_terms *head_terms) { return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_METRIC_ID); } -static char *get_config_name(struct list_head *head_terms) +static char *get_config_name(struct parse_events_terms *head_terms) { return get_config_str(head_terms, PARSE_EVENTS__TERM_TYPE_NAME); } @@ -188,11 +190,11 @@ static char *get_config_name(struct list_head *head_terms) * @config_terms: the list of terms that may contain a raw term. * @pmu: the PMU to scan for events from. */ -static void fix_raw(struct list_head *config_terms, struct perf_pmu *pmu) +static void fix_raw(struct parse_events_terms *config_terms, struct perf_pmu *pmu) { struct parse_events_term *term; - list_for_each_entry(term, config_terms, list) { + list_for_each_entry(term, &config_terms->terms, list) { u64 num; if (term->type_term != PARSE_EVENTS__TERM_TYPE_RAW) @@ -356,7 +358,7 @@ static int config_term_common(struct perf_event_attr *attr, struct parse_events_term *term, struct parse_events_error *err); static int config_attr(struct perf_event_attr *attr, - struct list_head *head, + struct parse_events_terms *head, struct parse_events_error *err, config_term_func_t config_term); @@ -442,7 +444,7 @@ bool parse_events__filter_pmu(const struct parse_events_state *parse_state, int parse_events_add_cache(struct list_head *list, int *idx, const char *name, struct parse_events_state *parse_state, - struct list_head *head_config) + struct parse_events_terms *head_config) { struct perf_pmu *pmu = NULL; bool found_supported = false; @@ -520,7 +522,7 @@ static void tracepoint_error(struct parse_events_error *e, int err, static int add_tracepoint(struct list_head *list, int *idx, const char *sys_name, const char *evt_name, struct parse_events_error *err, - struct list_head *head_config, void *loc_) + struct parse_events_terms *head_config, void *loc_) { YYLTYPE *loc = loc_; struct evsel *evsel = evsel__newtp_idx(sys_name, evt_name, (*idx)++); @@ -545,7 +547,7 @@ static int add_tracepoint(struct list_head *list, int *idx, static int add_tracepoint_multi_event(struct list_head *list, int *idx, const char *sys_name, const char *evt_name, struct parse_events_error *err, - struct list_head *head_config, YYLTYPE *loc) + struct parse_events_terms *head_config, YYLTYPE *loc) { char *evt_path; struct dirent *evt_ent; @@ -593,7 +595,7 @@ static int add_tracepoint_multi_event(struct list_head *list, int *idx, static int add_tracepoint_event(struct list_head *list, int *idx, const char *sys_name, const char *evt_name, struct parse_events_error *err, - struct list_head *head_config, YYLTYPE *loc) + struct parse_events_terms *head_config, YYLTYPE *loc) { return strpbrk(evt_name, "*?") ? add_tracepoint_multi_event(list, idx, sys_name, evt_name, @@ -605,7 +607,7 @@ static int add_tracepoint_event(struct list_head *list, int *idx, static int add_tracepoint_multi_sys(struct list_head *list, int *idx, const char *sys_name, const char *evt_name, struct parse_events_error *err, - struct list_head *head_config, YYLTYPE *loc) + struct parse_events_terms *head_config, YYLTYPE *loc) { struct dirent *events_ent; DIR *events_dir; @@ -680,7 +682,7 @@ do { \ int parse_events_add_breakpoint(struct parse_events_state *parse_state, struct list_head *list, u64 addr, char *type, u64 len, - struct list_head *head_config __maybe_unused) + struct parse_events_terms *head_config) { struct perf_event_attr attr; LIST_HEAD(config_terms); @@ -1066,20 +1068,20 @@ static int config_term_tracepoint(struct perf_event_attr *attr, #endif static int config_attr(struct perf_event_attr *attr, - struct list_head *head, + struct parse_events_terms *head, struct parse_events_error *err, config_term_func_t config_term) { struct parse_events_term *term; - list_for_each_entry(term, head, list) + list_for_each_entry(term, &head->terms, list) if (config_term(attr, term, err)) return -EINVAL; return 0; } -static int get_config_terms(struct list_head *head_config, struct list_head *head_terms) +static int get_config_terms(struct parse_events_terms *head_config, struct list_head *head_terms) { #define ADD_CONFIG_TERM(__type, __weak) \ struct evsel_config_term *__t; \ @@ -1112,7 +1114,7 @@ do { \ struct parse_events_term *term; - list_for_each_entry(term, head_config, list) { + list_for_each_entry(term, &head_config->terms, list) { switch (term->type_term) { case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: ADD_CONFIG_TERM_VAL(PERIOD, period, term->val.num, term->weak); @@ -1193,14 +1195,14 @@ do { \ * Add EVSEL__CONFIG_TERM_CFG_CHG where cfg_chg will have a bit set for * each bit of attr->config that the user has changed. */ -static int get_config_chgs(struct perf_pmu *pmu, struct list_head *head_config, +static int get_config_chgs(struct perf_pmu *pmu, struct parse_events_terms *head_config, struct list_head *head_terms) { struct parse_events_term *term; u64 bits = 0; int type; - list_for_each_entry(term, head_config, list) { + list_for_each_entry(term, &head_config->terms, list) { switch (term->type_term) { case PARSE_EVENTS__TERM_TYPE_USER: type = perf_pmu__format_type(pmu, term->config); @@ -1250,7 +1252,7 @@ static int get_config_chgs(struct perf_pmu *pmu, struct list_head *head_config, int parse_events_add_tracepoint(struct list_head *list, int *idx, const char *sys, const char *event, struct parse_events_error *err, - struct list_head *head_config, void *loc_) + struct parse_events_terms *head_config, void *loc_) { YYLTYPE *loc = loc_; #ifdef HAVE_LIBTRACEEVENT @@ -1283,7 +1285,7 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx, static int __parse_events_add_numeric(struct parse_events_state *parse_state, struct list_head *list, struct perf_pmu *pmu, u32 type, u32 extended_type, - u64 config, struct list_head *head_config) + u64 config, struct parse_events_terms *head_config) { struct perf_event_attr attr; LIST_HEAD(config_terms); @@ -1319,7 +1321,7 @@ static int __parse_events_add_numeric(struct parse_events_state *parse_state, int parse_events_add_numeric(struct parse_events_state *parse_state, struct list_head *list, u32 type, u64 config, - struct list_head *head_config, + struct parse_events_terms *head_config, bool wildcard) { struct perf_pmu *pmu = NULL; @@ -1368,7 +1370,7 @@ static bool config_term_percore(struct list_head *config_terms) int parse_events_add_pmu(struct parse_events_state *parse_state, struct list_head *list, const char *name, - const struct list_head *const_head_terms, + const struct parse_events_terms *const_parsed_terms, bool auto_merge_stats, void *loc_) { struct perf_event_attr attr; @@ -1378,7 +1380,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, struct parse_events_error *err = parse_state->error; YYLTYPE *loc = loc_; LIST_HEAD(config_terms); - LIST_HEAD(head_terms); + struct parse_events_terms parsed_terms; pmu = parse_state->fake_pmu ?: perf_pmus__find(name); @@ -1392,8 +1394,9 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, return -EINVAL; } - if (const_head_terms) { - int ret = parse_events_terms__copy(const_head_terms, &head_terms); + parse_events_terms__init(&parsed_terms); + if (const_parsed_terms) { + int ret = parse_events_terms__copy(const_parsed_terms, &parsed_terms); if (ret) return ret; @@ -1403,17 +1406,17 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - if (pmu->selectable && list_empty(&head_terms)) { + if (pmu->selectable && list_empty(&parsed_terms.terms)) { strbuf_addf(&sb, "%s//", name); } else { strbuf_addf(&sb, "%s/", name); - parse_events_term__to_strbuf(&head_terms, &sb); + parse_events_terms__to_strbuf(&parsed_terms, &sb); strbuf_addch(&sb, '/'); } fprintf(stderr, "Attempt to add: %s\n", sb.buf); strbuf_release(&sb); } - fix_raw(&head_terms, pmu); + fix_raw(&parsed_terms, pmu); if (pmu->default_config) { memcpy(&attr, pmu->default_config, sizeof(struct perf_event_attr)); @@ -1422,7 +1425,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, } attr.type = pmu->type; - if (list_empty(&head_terms)) { + if (list_empty(&parsed_terms.terms)) { evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, /*name=*/NULL, /*metric_id=*/NULL, pmu, @@ -1431,8 +1434,8 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, return evsel ? 0 : -ENOMEM; } - if (!parse_state->fake_pmu && perf_pmu__check_alias(pmu, &head_terms, &info, err)) { - parse_events_terms__purge(&head_terms); + if (!parse_state->fake_pmu && perf_pmu__check_alias(pmu, &parsed_terms, &info, err)) { + parse_events_terms__exit(&parsed_terms); return -EINVAL; } @@ -1440,7 +1443,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - parse_events_term__to_strbuf(&head_terms, &sb); + parse_events_terms__to_strbuf(&parsed_terms, &sb); fprintf(stderr, "..after resolving event: %s/%s/\n", name, sb.buf); strbuf_release(&sb); } @@ -1449,13 +1452,13 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, * Configure hardcoded terms first, no need to check * return value when called with fail == 0 ;) */ - if (config_attr(&attr, &head_terms, parse_state->error, config_term_pmu)) { - parse_events_terms__purge(&head_terms); + if (config_attr(&attr, &parsed_terms, parse_state->error, config_term_pmu)) { + parse_events_terms__exit(&parsed_terms); return -EINVAL; } - if (get_config_terms(&head_terms, &config_terms)) { - parse_events_terms__purge(&head_terms); + if (get_config_terms(&parsed_terms, &config_terms)) { + parse_events_terms__exit(&parsed_terms); return -ENOMEM; } @@ -1463,24 +1466,24 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, * When using default config, record which bits of attr->config were * changed by the user. */ - if (pmu->default_config && get_config_chgs(pmu, &head_terms, &config_terms)) { - parse_events_terms__purge(&head_terms); + if (pmu->default_config && get_config_chgs(pmu, &parsed_terms, &config_terms)) { + parse_events_terms__exit(&parsed_terms); return -ENOMEM; } if (!parse_state->fake_pmu && - perf_pmu__config(pmu, &attr, &head_terms, parse_state->error)) { + perf_pmu__config(pmu, &attr, &parsed_terms, parse_state->error)) { free_config_terms(&config_terms); - parse_events_terms__purge(&head_terms); + parse_events_terms__exit(&parsed_terms); return -EINVAL; } evsel = __add_event(list, &parse_state->idx, &attr, /*init_attr=*/true, - get_config_name(&head_terms), - get_config_metric_id(&head_terms), pmu, + get_config_name(&parsed_terms), + get_config_metric_id(&parsed_terms), pmu, &config_terms, auto_merge_stats, /*cpu_list=*/NULL); if (!evsel) { - parse_events_terms__purge(&head_terms); + parse_events_terms__exit(&parsed_terms); return -ENOMEM; } @@ -1490,11 +1493,11 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, evsel->percore = config_term_percore(&evsel->config_terms); if (parse_state->fake_pmu) { - parse_events_terms__purge(&head_terms); + parse_events_terms__exit(&parsed_terms); return 0; } - parse_events_terms__purge(&head_terms); + parse_events_terms__exit(&parsed_terms); free((char *)evsel->unit); evsel->unit = strdup(info.unit); evsel->scale = info.scale; @@ -1505,7 +1508,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, int parse_events_multi_pmu_add(struct parse_events_state *parse_state, const char *event_name, - const struct list_head *const_head_terms, + const struct parse_events_terms *const_parsed_terms, struct list_head **listp, void *loc_) { struct parse_events_term *term; @@ -1514,12 +1517,13 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, YYLTYPE *loc = loc_; int ok = 0; const char *config; - LIST_HEAD(head_terms); + struct parse_events_terms parsed_terms; *listp = NULL; - if (const_head_terms) { - int ret = parse_events_terms__copy(const_head_terms, &head_terms); + parse_events_terms__init(&parsed_terms); + if (const_parsed_terms) { + int ret = parse_events_terms__copy(const_parsed_terms, &parsed_terms); if (ret) return ret; @@ -1536,7 +1540,7 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, zfree(&config); goto out_err; } - list_add_tail(&term->list, &head_terms); + list_add_tail(&term->list, &parsed_terms.terms); /* Add it for all PMUs that support the alias */ list = malloc(sizeof(struct list_head)); @@ -1556,11 +1560,11 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, auto_merge_stats = perf_pmu__auto_merge_stats(pmu); if (!parse_events_add_pmu(parse_state, list, pmu->name, - &head_terms, auto_merge_stats, loc)) { + &parsed_terms, auto_merge_stats, loc)) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - parse_events_term__to_strbuf(&head_terms, &sb); + parse_events_terms__to_strbuf(&parsed_terms, &sb); pr_debug("%s -> %s/%s/\n", event_name, pmu->name, sb.buf); strbuf_release(&sb); ok++; @@ -1568,12 +1572,12 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, } if (parse_state->fake_pmu) { - if (!parse_events_add_pmu(parse_state, list, event_name, &head_terms, + if (!parse_events_add_pmu(parse_state, list, event_name, &parsed_terms, /*auto_merge_stats=*/true, loc)) { struct strbuf sb; strbuf_init(&sb, /*hint=*/ 0); - parse_events_term__to_strbuf(&head_terms, &sb); + parse_events_terms__to_strbuf(&parsed_terms, &sb); pr_debug("%s -> %s/%s/\n", event_name, "fake_pmu", sb.buf); strbuf_release(&sb); ok++; @@ -1581,7 +1585,7 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state, } out_err: - parse_events_terms__purge(&head_terms); + parse_events_terms__exit(&parsed_terms); if (ok) *listp = list; else @@ -1851,7 +1855,7 @@ static int parse_events__scanner(const char *str, /* * parse event config string, return a list of event terms. */ -int parse_events_terms(struct list_head *terms, const char *str, FILE *input) +int parse_events_terms(struct parse_events_terms *terms, const char *str, FILE *input) { struct parse_events_state parse_state = { .terms = NULL, @@ -1860,14 +1864,10 @@ int parse_events_terms(struct list_head *terms, const char *str, FILE *input) int ret; ret = parse_events__scanner(str, input, &parse_state); + if (!ret) + list_splice(&parse_state.terms->terms, &terms->terms); - if (!ret) { - list_splice(parse_state.terms, terms); - zfree(&parse_state.terms); - return 0; - } - - parse_events_terms__delete(parse_state.terms); + zfree(&parse_state.terms); return ret; } @@ -2563,11 +2563,12 @@ void parse_events_term__delete(struct parse_events_term *term) free(term); } -static int parse_events_terms__copy(const struct list_head *src, struct list_head *dest) +static int parse_events_terms__copy(const struct parse_events_terms *src, + struct parse_events_terms *dest) { struct parse_events_term *term; - list_for_each_entry (term, src, list) { + list_for_each_entry (term, &src->terms, list) { struct parse_events_term *n; int ret; @@ -2575,38 +2576,43 @@ static int parse_events_terms__copy(const struct list_head *src, struct list_hea if (ret) return ret; - list_add_tail(&n->list, dest); + list_add_tail(&n->list, &dest->terms); } return 0; } -void parse_events_terms__purge(struct list_head *terms) +void parse_events_terms__init(struct parse_events_terms *terms) +{ + INIT_LIST_HEAD(&terms->terms); +} + +void parse_events_terms__exit(struct parse_events_terms *terms) { struct parse_events_term *term, *h; - list_for_each_entry_safe(term, h, terms, list) { + list_for_each_entry_safe(term, h, &terms->terms, list) { list_del_init(&term->list); parse_events_term__delete(term); } } -void parse_events_terms__delete(struct list_head *terms) +void parse_events_terms__delete(struct parse_events_terms *terms) { if (!terms) return; - parse_events_terms__purge(terms); + parse_events_terms__exit(terms); free(terms); } -int parse_events_term__to_strbuf(struct list_head *term_list, struct strbuf *sb) +int parse_events_terms__to_strbuf(const struct parse_events_terms *terms, struct strbuf *sb) { struct parse_events_term *term; bool first = true; - if (!term_list) + if (!terms) return 0; - list_for_each_entry(term, term_list, list) { + list_for_each_entry(term, &terms->terms, list) { int ret; if (!first) { diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index e6612856e8817..63c0a36a4bf11 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -44,7 +44,6 @@ static inline int parse_events(struct evlist *evlist, const char *str, int parse_event(struct evlist *evlist, const char *str); -int parse_events_terms(struct list_head *terms, const char *str, FILE *input); int parse_filter(const struct option *opt, const char *str, int unset); int exclude_perf(const struct option *opt, const char *arg, int unset); @@ -140,6 +139,11 @@ struct parse_events_error { char *first_help; }; +/* A wrapper around a list of terms for the sake of better type safety. */ +struct parse_events_terms { + struct list_head terms; +}; + struct parse_events_state { /* The list parsed events are placed on. */ struct list_head list; @@ -148,7 +152,7 @@ struct parse_events_state { /* Error information. */ struct parse_events_error *error; /* Holds returned terms for term parsing. */ - struct list_head *terms; + struct parse_events_terms *terms; /* Start token. */ int stoken; /* Special fake PMU marker for testing. */ @@ -181,35 +185,38 @@ int parse_events_term__term(struct parse_events_term **term, int parse_events_term__clone(struct parse_events_term **new, struct parse_events_term *term); void parse_events_term__delete(struct parse_events_term *term); -void parse_events_terms__delete(struct list_head *terms); -void parse_events_terms__purge(struct list_head *terms); -int parse_events_term__to_strbuf(struct list_head *term_list, struct strbuf *sb); + +void parse_events_terms__delete(struct parse_events_terms *terms); +void parse_events_terms__init(struct parse_events_terms *terms); +void parse_events_terms__exit(struct parse_events_terms *terms); +int parse_events_terms(struct parse_events_terms *terms, const char *str, FILE *input); +int parse_events_terms__to_strbuf(const struct parse_events_terms *terms, struct strbuf *sb); int parse_events__modifier_event(struct list_head *list, char *str, bool add); int parse_events__modifier_group(struct list_head *list, char *event_mod); int parse_events_name(struct list_head *list, const char *name); int parse_events_add_tracepoint(struct list_head *list, int *idx, const char *sys, const char *event, struct parse_events_error *error, - struct list_head *head_config, void *loc); + struct parse_events_terms *head_config, void *loc); int parse_events_add_numeric(struct parse_events_state *parse_state, struct list_head *list, u32 type, u64 config, - struct list_head *head_config, + struct parse_events_terms *head_config, bool wildcard); int parse_events_add_tool(struct parse_events_state *parse_state, struct list_head *list, int tool_event); int parse_events_add_cache(struct list_head *list, int *idx, const char *name, struct parse_events_state *parse_state, - struct list_head *head_config); + struct parse_events_terms *head_config); int parse_events__decode_legacy_cache(const char *name, int pmu_type, __u64 *config); int parse_events_add_breakpoint(struct parse_events_state *parse_state, struct list_head *list, u64 addr, char *type, u64 len, - struct list_head *head_config); + struct parse_events_terms *head_config); int parse_events_add_pmu(struct parse_events_state *parse_state, struct list_head *list, const char *name, - const struct list_head *const_head_terms, + const struct parse_events_terms *const_parsed_terms, bool auto_merge_stats, void *loc); struct evsel *parse_events__add_event(int idx, struct perf_event_attr *attr, @@ -218,7 +225,7 @@ struct evsel *parse_events__add_event(int idx, struct perf_event_attr *attr, int parse_events_multi_pmu_add(struct parse_events_state *parse_state, const char *event_name, - const struct list_head *head_terms, + const struct parse_events_terms *const_parsed_terms, struct list_head **listp, void *loc); void parse_events__set_leader(char *name, struct list_head *list); diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y index d878a040c240e..b46b80da5ea75 100644 --- a/tools/perf/util/parse-events.y +++ b/tools/perf/util/parse-events.y @@ -114,7 +114,7 @@ static void free_list_evsel(struct list_head* list_evsel) u64 num; enum parse_events__term_type term_type; struct list_head *list_evsel; - struct list_head *list_terms; + struct parse_events_terms *list_terms; struct parse_events_term *term; struct tracepoint_name { char *sys; @@ -645,26 +645,26 @@ start_terms: event_config event_config: event_config ',' event_term { - struct list_head *head = $1; + struct parse_events_terms *head = $1; struct parse_events_term *term = $3; if (!head) { parse_events_term__delete(term); YYABORT; } - list_add_tail(&term->list, head); + list_add_tail(&term->list, &head->terms); $$ = $1; } | event_term { - struct list_head *head = malloc(sizeof(*head)); + struct parse_events_terms *head = malloc(sizeof(*head)); struct parse_events_term *term = $1; if (!head) YYNOMEM; - INIT_LIST_HEAD(head); - list_add_tail(&term->list, head); + parse_events_terms__init(head); + list_add_tail(&term->list, &head->terms); $$ = head; } diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 2587c4b463fa8..43eac49522aea 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -65,7 +65,7 @@ struct perf_pmu_alias { */ char *topic; /** @terms: Owned list of the original parsed parameters. */ - struct list_head terms; + struct parse_events_terms terms; /** @list: List element of struct perf_pmu aliases. */ struct list_head list; /** @@ -417,7 +417,7 @@ static void perf_pmu_free_alias(struct perf_pmu_alias *newalias) zfree(&newalias->long_desc); zfree(&newalias->topic); zfree(&newalias->pmu_name); - parse_events_terms__purge(&newalias->terms); + parse_events_terms__exit(&newalias->terms); free(newalias); } @@ -518,7 +518,7 @@ static int update_alias(const struct pmu_event *pe, assign_str(pe->name, "topic", &data->alias->topic, pe->topic); data->alias->per_pkg = pe->perpkg; if (pe->event) { - parse_events_terms__purge(&data->alias->terms); + parse_events_terms__exit(&data->alias->terms); ret = parse_events_terms(&data->alias->terms, pe->event, /*input=*/NULL); } if (!ret && pe->unit) { @@ -558,7 +558,7 @@ static int perf_pmu__new_alias(struct perf_pmu *pmu, const char *name, if (!alias) return -ENOMEM; - INIT_LIST_HEAD(&alias->terms); + parse_events_terms__init(&alias->terms); alias->scale = 1.0; alias->unit[0] = '\0'; alias->per_pkg = perpkg; @@ -696,17 +696,17 @@ static int pmu_aliases_parse(struct perf_pmu *pmu) return 0; } -static int pmu_alias_terms(struct perf_pmu_alias *alias, - struct list_head *terms) +static int pmu_alias_terms(struct perf_pmu_alias *alias, struct list_head *terms) { struct parse_events_term *term, *cloned; - LIST_HEAD(list); - int ret; + struct parse_events_terms clone_terms; + + parse_events_terms__init(&clone_terms); + list_for_each_entry(term, &alias->terms.terms, list) { + int ret = parse_events_term__clone(&cloned, term); - list_for_each_entry(term, &alias->terms, list) { - ret = parse_events_term__clone(&cloned, term); if (ret) { - parse_events_terms__purge(&list); + parse_events_terms__exit(&clone_terms); return ret; } /* @@ -714,9 +714,10 @@ static int pmu_alias_terms(struct perf_pmu_alias *alias, * which we don't want for implicit terms in aliases. */ cloned->weak = true; - list_add_tail(&cloned->list, &list); + list_add_tail(&cloned->list, &clone_terms.terms); } - list_splice(&list, terms); + list_splice_init(&clone_terms.terms, terms); + parse_events_terms__exit(&clone_terms); return 0; } @@ -1257,12 +1258,12 @@ static __u64 pmu_format_max_value(const unsigned long *format) * in a config string) later on in the term list. */ static int pmu_resolve_param_term(struct parse_events_term *term, - struct list_head *head_terms, + struct parse_events_terms *head_terms, __u64 *value) { struct parse_events_term *t; - list_for_each_entry(t, head_terms, list) { + list_for_each_entry(t, &head_terms->terms, list) { if (t->type_val == PARSE_EVENTS__TERM_TYPE_NUM && t->config && !strcmp(t->config, term->config)) { t->used = true; @@ -1306,7 +1307,7 @@ static char *pmu_formats_string(struct list_head *formats) static int pmu_config_term(struct perf_pmu *pmu, struct perf_event_attr *attr, struct parse_events_term *term, - struct list_head *head_terms, + struct parse_events_terms *head_terms, bool zero, struct parse_events_error *err) { struct perf_pmu_format *format; @@ -1428,13 +1429,13 @@ static int pmu_config_term(struct perf_pmu *pmu, int perf_pmu__config_terms(struct perf_pmu *pmu, struct perf_event_attr *attr, - struct list_head *head_terms, + struct parse_events_terms *terms, bool zero, struct parse_events_error *err) { struct parse_events_term *term; - list_for_each_entry(term, head_terms, list) { - if (pmu_config_term(pmu, attr, term, head_terms, zero, err)) + list_for_each_entry(term, &terms->terms, list) { + if (pmu_config_term(pmu, attr, term, terms, zero, err)) return -EINVAL; } @@ -1447,7 +1448,7 @@ int perf_pmu__config_terms(struct perf_pmu *pmu, * 2) pmu format definitions - specified by pmu parameter */ int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr, - struct list_head *head_terms, + struct parse_events_terms *head_terms, struct parse_events_error *err) { bool zero = !!pmu->default_config; @@ -1541,7 +1542,7 @@ static int check_info_data(struct perf_pmu *pmu, * Find alias in the terms list and replace it with the terms * defined for the alias */ -int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms, +int perf_pmu__check_alias(struct perf_pmu *pmu, struct parse_events_terms *head_terms, struct perf_pmu_info *info, struct parse_events_error *err) { struct parse_events_term *term, *h; @@ -1558,7 +1559,7 @@ int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms, info->scale = 0.0; info->snapshot = false; - list_for_each_entry_safe(term, h, head_terms, list) { + list_for_each_entry_safe(term, h, &head_terms->terms, list) { alias = pmu_find_alias(pmu, term); if (!alias) continue; @@ -1704,7 +1705,7 @@ static char *format_alias(char *buf, int len, const struct perf_pmu *pmu, : (int)strlen(pmu->name); int used = snprintf(buf, len, "%.*s/%s", pmu_name_len, pmu->name, alias->name); - list_for_each_entry(term, &alias->terms, list) { + list_for_each_entry(term, &alias->terms.terms, list) { if (term->type_val == PARSE_EVENTS__TERM_TYPE_STR) used += snprintf(buf + used, sub_non_neg(len, used), ",%s=%s", term->config, @@ -1764,7 +1765,7 @@ int perf_pmu__for_each_event(struct perf_pmu *pmu, bool skip_duplicate_pmus, info.desc = event->desc; info.long_desc = event->long_desc; info.encoding_desc = buf + buf_used; - parse_events_term__to_strbuf(&event->terms, &sb); + parse_events_terms__to_strbuf(&event->terms, &sb); buf_used += snprintf(buf + buf_used, sizeof(buf) - buf_used, "%s/%s/", info.pmu_name, sb.buf) + 1; info.topic = event->topic; diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h index 5a03c361cb04c..e192010ad3c75 100644 --- a/tools/perf/util/pmu.h +++ b/tools/perf/util/pmu.h @@ -198,15 +198,15 @@ typedef int (*pmu_event_callback)(void *state, struct pmu_event_info *info); void pmu_add_sys_aliases(struct perf_pmu *pmu); int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr, - struct list_head *head_terms, + struct parse_events_terms *head_terms, struct parse_events_error *error); int perf_pmu__config_terms(struct perf_pmu *pmu, struct perf_event_attr *attr, - struct list_head *head_terms, + struct parse_events_terms *terms, bool zero, struct parse_events_error *error); __u64 perf_pmu__format_bits(struct perf_pmu *pmu, const char *name); int perf_pmu__format_type(struct perf_pmu *pmu, const char *name); -int perf_pmu__check_alias(struct perf_pmu *pmu, struct list_head *head_terms, +int perf_pmu__check_alias(struct perf_pmu *pmu, struct parse_events_terms *head_terms, struct perf_pmu_info *info, struct parse_events_error *err); int perf_pmu__find_event(struct perf_pmu *pmu, const char *event, void *state, pmu_event_callback cb); From 7966de6475bd9fe8ed64100f6e4846edf3bb19b8 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Wed, 22 Nov 2023 20:29:22 -0800 Subject: [PATCH 238/548] perf parse-events: Make legacy events lower priority than sysfs/JSON The perf tool has previously made legacy events the priority so with or without a PMU the legacy event would be opened: $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch Attempting to add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors After aliases, add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 833967 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ... Fixes to make hybrid/BIG.little PMUs behave correctly, ie as core PMUs capable of opening legacy events on each, removing hard coded "cpu_core" and "cpu_atom" Intel PMU names, etc. caused a behavioral difference on Apple/ARM due to latent issues in the PMU driver reported in: https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/ As part of that report Mark Rutland requested that legacy events not be higher in priority when a PMU is specified reversing what has until this change been perf's default behavior. With this change the above becomes: $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 Attempt to add: cpu/cpu-cycles=0/ ..after resolving event: cpu/event=0x3c/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 827628 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 4 (PERF_TYPE_RAW) size 136 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ... So the second event has become a raw event as /sys/devices/cpu/events/cpu-cycles exists. A fix was necessary to config_term_pmu in parse-events.c as check_alias expansion needs to happen after config_term_pmu, and config_term_pmu may need calling a second time because of this. config_term_pmu is updated to not use the legacy event when the PMU has such a named event (either from JSON or sysfs). The bulk of this change is updating all of the parse-events test expectations so that if a sysfs/JSON event exists for a PMU the test doesn't fail - a further sign, if it were needed, that the legacy event priority was a known and tested behavior of the perf tool. Reported-by: Hector Martin Signed-off-by: Ian Rogers Tested-by: Hector Martin Tested-by: Marc Zyngier Acked-by: Mark Rutland Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: Kan Liang Cc: Namhyung Kim Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com [ Initialize the 'alias_rewrote_terms' variable to false to address a clang warning ] Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit a24d9d9dc096fc0d0bd85302c9a4fe4fe3b1107b) Signed-off-by: Koba Ko --- tools/perf/tests/parse-events.c | 256 +++++++++++++++++++++++--------- tools/perf/util/parse-events.c | 52 +++++-- tools/perf/util/pmu.c | 8 +- tools/perf/util/pmu.h | 3 +- 4 files changed, 231 insertions(+), 88 deletions(-) diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c index 93796e885cefd..f9706694ec2b9 100644 --- a/tools/perf/tests/parse-events.c +++ b/tools/perf/tests/parse-events.c @@ -162,6 +162,22 @@ static int test__checkevent_numeric(struct evlist *evlist) return TEST_OK; } + +static int assert_hw(struct perf_evsel *evsel, enum perf_hw_id id, const char *name) +{ + struct perf_pmu *pmu; + + if (evsel->attr.type == PERF_TYPE_HARDWARE) { + TEST_ASSERT_VAL("wrong config", test_perf_config(evsel, id)); + return 0; + } + pmu = perf_pmus__find_by_type(evsel->attr.type); + + TEST_ASSERT_VAL("unexpected PMU type", pmu); + TEST_ASSERT_VAL("PMU missing event", perf_pmu__have_event(pmu, name)); + return 0; +} + static int test__checkevent_symbolic_name(struct evlist *evlist) { struct perf_evsel *evsel; @@ -169,10 +185,12 @@ static int test__checkevent_symbolic_name(struct evlist *evlist) TEST_ASSERT_VAL("wrong number of entries", 0 != evlist->core.nr_entries); perf_evlist__for_each_evsel(&evlist->core, evsel) { - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type); - TEST_ASSERT_VAL("wrong config", - test_perf_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + int ret = assert_hw(evsel, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + + if (ret) + return ret; } + return TEST_OK; } @@ -183,8 +201,10 @@ static int test__checkevent_symbolic_name_config(struct evlist *evlist) TEST_ASSERT_VAL("wrong number of entries", 0 != evlist->core.nr_entries); perf_evlist__for_each_evsel(&evlist->core, evsel) { - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type); - TEST_ASSERT_VAL("wrong config", test_perf_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + int ret = assert_hw(evsel, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + + if (ret) + return ret; /* * The period value gets configured within evlist__config, * while this test executes only parse events method. @@ -861,10 +881,14 @@ static int test__group1(struct evlist *evlist) evlist__nr_groups(evlist) == num_core_entries()); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* instructions:k */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -878,8 +902,10 @@ static int test__group1(struct evlist *evlist) /* cycles:upp */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -907,6 +933,8 @@ static int test__group2(struct evlist *evlist) TEST_ASSERT_VAL("wrong number of groups", 1 == evlist__nr_groups(evlist)); evlist__for_each_entry(evlist, evsel) { + int ret; + if (evsel->core.attr.type == PERF_TYPE_SOFTWARE) { /* faults + :ku modifier */ leader = evsel; @@ -939,8 +967,10 @@ static int test__group2(struct evlist *evlist) continue; } /* cycles:k */ - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -957,6 +987,7 @@ static int test__group2(struct evlist *evlist) static int test__group3(struct evlist *evlist __maybe_unused) { struct evsel *evsel, *group1_leader = NULL, *group2_leader = NULL; + int ret; TEST_ASSERT_VAL("wrong number of entries", evlist->core.nr_entries == (3 * perf_pmus__num_core_pmus() + 2)); @@ -1045,8 +1076,10 @@ static int test__group3(struct evlist *evlist __maybe_unused) continue; } /* instructions:u */ - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1070,10 +1103,14 @@ static int test__group4(struct evlist *evlist __maybe_unused) num_core_entries() == evlist__nr_groups(evlist)); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles:u + p */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1089,8 +1126,10 @@ static int test__group4(struct evlist *evlist __maybe_unused) /* instructions:kp + p */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1108,6 +1147,7 @@ static int test__group4(struct evlist *evlist __maybe_unused) static int test__group5(struct evlist *evlist __maybe_unused) { struct evsel *evsel = NULL, *leader; + int ret; TEST_ASSERT_VAL("wrong number of entries", evlist->core.nr_entries == (5 * num_core_entries())); @@ -1117,8 +1157,10 @@ static int test__group5(struct evlist *evlist __maybe_unused) for (int i = 0; i < num_core_entries(); i++) { /* cycles + G */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1133,8 +1175,10 @@ static int test__group5(struct evlist *evlist __maybe_unused) /* instructions + G */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1148,8 +1192,10 @@ static int test__group5(struct evlist *evlist __maybe_unused) for (int i = 0; i < num_core_entries(); i++) { /* cycles:G */ evsel = leader = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1164,8 +1210,10 @@ static int test__group5(struct evlist *evlist __maybe_unused) /* instructions:G */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1178,8 +1226,10 @@ static int test__group5(struct evlist *evlist __maybe_unused) for (int i = 0; i < num_core_entries(); i++) { /* cycles */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1201,10 +1251,14 @@ static int test__group_gh1(struct evlist *evlist) evlist__nr_groups(evlist) == num_core_entries()); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles + :H group modifier */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1218,8 +1272,10 @@ static int test__group_gh1(struct evlist *evlist) /* cache-misses:G + :H group modifier */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1242,10 +1298,14 @@ static int test__group_gh2(struct evlist *evlist) evlist__nr_groups(evlist) == num_core_entries()); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles + :G group modifier */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1259,8 +1319,10 @@ static int test__group_gh2(struct evlist *evlist) /* cache-misses:H + :G group modifier */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1283,10 +1345,14 @@ static int test__group_gh3(struct evlist *evlist) evlist__nr_groups(evlist) == num_core_entries()); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles:G + :u group modifier */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1300,8 +1366,10 @@ static int test__group_gh3(struct evlist *evlist) /* cache-misses:H + :u group modifier */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1324,10 +1392,14 @@ static int test__group_gh4(struct evlist *evlist) evlist__nr_groups(evlist) == num_core_entries()); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles:G + :uG group modifier */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1341,8 +1413,10 @@ static int test__group_gh4(struct evlist *evlist) /* cache-misses:H + :uG group modifier */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1363,10 +1437,14 @@ static int test__leader_sample1(struct evlist *evlist) evlist->core.nr_entries == (3 * num_core_entries())); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles - sampling group leader */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1379,8 +1457,10 @@ static int test__leader_sample1(struct evlist *evlist) /* cache-misses - not sampling */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1392,8 +1472,10 @@ static int test__leader_sample1(struct evlist *evlist) /* branch-misses - not sampling */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_BRANCH_MISSES, "branch-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv); @@ -1415,10 +1497,14 @@ static int test__leader_sample2(struct evlist *evlist __maybe_unused) evlist->core.nr_entries == (2 * num_core_entries())); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* instructions - sampling group leader */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_INSTRUCTIONS, "instructions"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1431,8 +1517,10 @@ static int test__leader_sample2(struct evlist *evlist __maybe_unused) /* branch-misses - not sampling */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_BRANCH_MISSES, "branch-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv); @@ -1472,10 +1560,14 @@ static int test__pinned_group(struct evlist *evlist) evlist->core.nr_entries == (3 * num_core_entries())); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles - group leader */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong group name", !evsel->group_name); TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader)); /* TODO: The group modifier is not copied to the split group leader. */ @@ -1484,13 +1576,18 @@ static int test__pinned_group(struct evlist *evlist) /* cache-misses - can not be pinned, but will go on with the leader */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong pinned", !evsel->core.attr.pinned); /* branch-misses - ditto */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_BRANCH_MISSES, "branch-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong pinned", !evsel->core.attr.pinned); } return TEST_OK; @@ -1517,10 +1614,14 @@ static int test__exclusive_group(struct evlist *evlist) evlist->core.nr_entries == 3 * num_core_entries()); for (int i = 0; i < num_core_entries(); i++) { + int ret; + /* cycles - group leader */ evsel = leader = (i == 0 ? evlist__first(evlist) : evsel__next(evsel)); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong group name", !evsel->group_name); TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader)); /* TODO: The group modifier is not copied to the split group leader. */ @@ -1529,13 +1630,18 @@ static int test__exclusive_group(struct evlist *evlist) /* cache-misses - can not be pinned, but will go on with the leader */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CACHE_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_CACHE_MISSES, "cache-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclusive", !evsel->core.attr.exclusive); /* branch-misses - ditto */ evsel = evsel__next(evsel); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_MISSES)); + ret = assert_hw(&evsel->core, PERF_COUNT_HW_BRANCH_MISSES, "branch-misses"); + if (ret) + return ret; + TEST_ASSERT_VAL("wrong exclusive", !evsel->core.attr.exclusive); } return TEST_OK; @@ -1677,9 +1783,11 @@ static int test__checkevent_raw_pmu(struct evlist *evlist) static int test__sym_event_slash(struct evlist *evlist) { struct evsel *evsel = evlist__first(evlist); + int ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + + if (ret) + return ret; - TEST_ASSERT_VAL("wrong type", evsel->core.attr.type == PERF_TYPE_HARDWARE); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel); return TEST_OK; } @@ -1687,9 +1795,11 @@ static int test__sym_event_slash(struct evlist *evlist) static int test__sym_event_dc(struct evlist *evlist) { struct evsel *evsel = evlist__first(evlist); + int ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + + if (ret) + return ret; - TEST_ASSERT_VAL("wrong type", evsel->core.attr.type == PERF_TYPE_HARDWARE); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user); return TEST_OK; } @@ -1697,9 +1807,11 @@ static int test__sym_event_dc(struct evlist *evlist) static int test__term_equal_term(struct evlist *evlist) { struct evsel *evsel = evlist__first(evlist); + int ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + + if (ret) + return ret; - TEST_ASSERT_VAL("wrong type", evsel->core.attr.type == PERF_TYPE_HARDWARE); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "name") == 0); return TEST_OK; } @@ -1707,9 +1819,11 @@ static int test__term_equal_term(struct evlist *evlist) static int test__term_equal_legacy(struct evlist *evlist) { struct evsel *evsel = evlist__first(evlist); + int ret = assert_hw(&evsel->core, PERF_COUNT_HW_CPU_CYCLES, "cycles"); + + if (ret) + return ret; - TEST_ASSERT_VAL("wrong type", evsel->core.attr.type == PERF_TYPE_HARDWARE); - TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_CPU_CYCLES)); TEST_ASSERT_VAL("wrong name setting", strcmp(evsel->name, "l1d") == 0); return TEST_OK; } diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index c56e07bd7dd67..44b450526ea0a 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -976,7 +976,7 @@ static int config_term_pmu(struct perf_event_attr *attr, struct parse_events_error *err) { if (term->type_term == PARSE_EVENTS__TERM_TYPE_LEGACY_CACHE) { - const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); + struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); if (!pmu) { char *err_str; @@ -986,15 +986,23 @@ static int config_term_pmu(struct perf_event_attr *attr, err_str, /*help=*/NULL); return -EINVAL; } - if (perf_pmu__supports_legacy_cache(pmu)) { + /* + * Rewrite the PMU event to a legacy cache one unless the PMU + * doesn't support legacy cache events or the event is present + * within the PMU. + */ + if (perf_pmu__supports_legacy_cache(pmu) && + !perf_pmu__have_event(pmu, term->config)) { attr->type = PERF_TYPE_HW_CACHE; return parse_events__decode_legacy_cache(term->config, pmu->type, &attr->config); - } else + } else { term->type_term = PARSE_EVENTS__TERM_TYPE_USER; + term->no_value = true; + } } if (term->type_term == PARSE_EVENTS__TERM_TYPE_HARDWARE) { - const struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); + struct perf_pmu *pmu = perf_pmus__find_by_type(attr->type); if (!pmu) { char *err_str; @@ -1004,10 +1012,19 @@ static int config_term_pmu(struct perf_event_attr *attr, err_str, /*help=*/NULL); return -EINVAL; } - attr->type = PERF_TYPE_HARDWARE; - attr->config = term->val.num; - if (perf_pmus__supports_extended_type()) - attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; + /* + * If the PMU has a sysfs or json event prefer it over + * legacy. ARM requires this. + */ + if (perf_pmu__have_event(pmu, term->config)) { + term->type_term = PARSE_EVENTS__TERM_TYPE_USER; + term->no_value = true; + } else { + attr->type = PERF_TYPE_HARDWARE; + attr->config = term->val.num; + if (perf_pmus__supports_extended_type()) + attr->config |= (__u64)pmu->type << PERF_PMU_TYPE_SHIFT; + } return 0; } if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER || @@ -1381,6 +1398,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, YYLTYPE *loc = loc_; LIST_HEAD(config_terms); struct parse_events_terms parsed_terms; + bool alias_rewrote_terms = false; pmu = parse_state->fake_pmu ?: perf_pmus__find(name); @@ -1434,7 +1452,15 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, return evsel ? 0 : -ENOMEM; } - if (!parse_state->fake_pmu && perf_pmu__check_alias(pmu, &parsed_terms, &info, err)) { + /* Configure attr/terms with a known PMU, this will set hardcoded terms. */ + if (config_attr(&attr, &parsed_terms, parse_state->error, config_term_pmu)) { + parse_events_terms__exit(&parsed_terms); + return -EINVAL; + } + + /* Look for event names in the terms and rewrite into format based terms. */ + if (!parse_state->fake_pmu && perf_pmu__check_alias(pmu, &parsed_terms, + &info, &alias_rewrote_terms, err)) { parse_events_terms__exit(&parsed_terms); return -EINVAL; } @@ -1448,11 +1474,9 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, strbuf_release(&sb); } - /* - * Configure hardcoded terms first, no need to check - * return value when called with fail == 0 ;) - */ - if (config_attr(&attr, &parsed_terms, parse_state->error, config_term_pmu)) { + /* Configure attr/terms again if an alias was expanded. */ + if (alias_rewrote_terms && + config_attr(&attr, &parsed_terms, parse_state->error, config_term_pmu)) { parse_events_terms__exit(&parsed_terms); return -EINVAL; } diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 43eac49522aea..c26a03add1638 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -1543,12 +1543,14 @@ static int check_info_data(struct perf_pmu *pmu, * defined for the alias */ int perf_pmu__check_alias(struct perf_pmu *pmu, struct parse_events_terms *head_terms, - struct perf_pmu_info *info, struct parse_events_error *err) + struct perf_pmu_info *info, bool *rewrote_terms, + struct parse_events_error *err) { struct parse_events_term *term, *h; struct perf_pmu_alias *alias; int ret; + *rewrote_terms = false; info->per_pkg = false; /* @@ -1570,7 +1572,7 @@ int perf_pmu__check_alias(struct perf_pmu *pmu, struct parse_events_terms *head_ NULL); return ret; } - + *rewrote_terms = true; ret = check_info_data(pmu, alias, info, err, term->err_term); if (ret) return ret; @@ -1664,6 +1666,8 @@ bool perf_pmu__auto_merge_stats(const struct perf_pmu *pmu) bool perf_pmu__have_event(struct perf_pmu *pmu, const char *name) { + if (!name) + return false; if (perf_pmu__find_alias(pmu, name, /*load=*/ true) != NULL) return true; if (pmu->cpu_aliases_added || !pmu->events_table) diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h index e192010ad3c75..ebbb73286e1f5 100644 --- a/tools/perf/util/pmu.h +++ b/tools/perf/util/pmu.h @@ -207,7 +207,8 @@ int perf_pmu__config_terms(struct perf_pmu *pmu, __u64 perf_pmu__format_bits(struct perf_pmu *pmu, const char *name); int perf_pmu__format_type(struct perf_pmu *pmu, const char *name); int perf_pmu__check_alias(struct perf_pmu *pmu, struct parse_events_terms *head_terms, - struct perf_pmu_info *info, struct parse_events_error *err); + struct perf_pmu_info *info, bool *rewrote_terms, + struct parse_events_error *err); int perf_pmu__find_event(struct perf_pmu *pmu, const char *event, void *state, pmu_event_callback cb); int perf_pmu__format_parse(struct perf_pmu *pmu, int dirfd, bool eager_load); From 8deb19de0130d3fa37975f989ce61b921ca4cff8 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Sat, 14 Oct 2023 15:41:58 +0800 Subject: [PATCH 239/548] perf cs-etm: Validate timestamp tracing in per-thread mode So far, it's impossible to validate timestamp trace in Arm CoreSight when the perf is in the per-thread mode. E.g. for the command: perf record -e cs_etm/timestamp/ --per-thread -- ls The command enables config 'timestamp' for 'cs_etm' event in the per-thread mode. In this case, the function cs_etm_validate_config() directly bails out and skips validation. Given profiled process can be scheduled on any CPUs in the per-thread mode, this patch validates timestamp tracing for all CPUs when detect the CPU map is empty. Signed-off-by: Leo Yan Reviewed-by: James Clark Acked-by: Suzuki K Poulose Cc: Will Deacon Cc: Mike Leach Cc: John Garry Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231014074159.1667880-2-leo.yan@linaro.org Signed-off-by: Namhyung Kim (cherry picked from commit f8ccc2d5cc6516b019bcf8e361ae2a380cb36019) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index b8d6a953fd742..cf9ef9ba800b2 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -205,8 +205,17 @@ static int cs_etm_validate_config(struct auxtrace_record *itr, for (i = 0; i < cpu__max_cpu().cpu; i++) { struct perf_cpu cpu = { .cpu = i, }; - if (!perf_cpu_map__has(event_cpus, cpu) || - !perf_cpu_map__has(online_cpus, cpu)) + /* + * In per-cpu case, do the validation for CPUs to work with. + * In per-thread case, the CPU map is empty. Since the traced + * program can run on any CPUs in this case, thus don't skip + * validation. + */ + if (!perf_cpu_map__empty(event_cpus) && + !perf_cpu_map__has(event_cpus, cpu)) + continue; + + if (!perf_cpu_map__has(online_cpus, cpu)) continue; err = cs_etm_validate_context_id(itr, evsel, i); From 3fd8764608f6e2aa6b0f908674875af7542b2164 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Sat, 14 Oct 2023 15:41:59 +0800 Subject: [PATCH 240/548] perf cs-etm: Respect timestamp option When users pass the option '--timestamp' or '-T' in the record command, all events will set the PERF_SAMPLE_TIME bit in the attribution. In this case, the AUX event will record the kernel timestamp, but it doesn't mean Arm CoreSight enables timestamp packets in its hardware tracing. If the option '--timestamp' or '-T' is set, this patch always enables Arm CoreSight timestamp, as a result, the bit 28 in event's config is to be set. Before: # perf record -e cs_etm// --per-thread --timestamp -- ls # perf script --header-only ... # event : name = cs_etm//, , id = { 69 }, type = 12, size = 136, config = 0, { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1 ... After: # perf record -e cs_etm// --per-thread --timestamp -- ls # perf script --header-only ... # event : name = cs_etm//, , id = { 49 }, type = 12, size = 136, config = 0x10000000, { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1 ... Signed-off-by: Leo Yan Reviewed-by: James Clark Acked-by: Suzuki K Poulose Cc: Will Deacon Cc: Mike Leach Cc: Alexander Shishkin Cc: John Garry Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231014074159.1667880-3-leo.yan@linaro.org Signed-off-by: Namhyung Kim (cherry picked from commit 78efa7b411450a534e9d326cdf5578f681c73988) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index cf9ef9ba800b2..58c506e9788db 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -442,6 +442,15 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, "contextid", 1); } + /* + * When the option '--timestamp' or '-T' is enabled, the PERF_SAMPLE_TIME + * bit is set for all events. In this case, always enable Arm CoreSight + * timestamp tracing. + */ + if (opts->sample_time_set) + evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel, + "timestamp", 1); + /* Add dummy event to keep tracking */ err = parse_event(evlist, "dummy:u"); if (err) From c129968281607077801531f680a450042939bdf0 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:01:58 -0800 Subject: [PATCH 241/548] libperf cpumap: Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() to better indicate this is creating a CPU map for the perf_event_open "any" CPU case. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Jajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231129060211.1890454-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 48219b089d84f109e8a81d8a7fa1bbc2e6e5f97d) Signed-off-by: Koba Ko --- tools/lib/perf/Documentation/libperf.txt | 2 +- tools/lib/perf/cpumap.c | 4 ++-- tools/lib/perf/evsel.c | 2 +- tools/lib/perf/include/perf/cpumap.h | 4 ++-- tools/lib/perf/libperf.map | 2 +- tools/lib/perf/tests/test-cpumap.c | 2 +- tools/lib/perf/tests/test-evlist.c | 2 +- tools/perf/tests/cpumap.c | 2 +- tools/perf/tests/sw-clock.c | 2 +- tools/perf/tests/task-exit.c | 2 +- tools/perf/util/evlist.c | 2 +- tools/perf/util/evsel.c | 2 +- 12 files changed, 14 insertions(+), 14 deletions(-) diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt index a8f1a237931b1..a256a26598b04 100644 --- a/tools/lib/perf/Documentation/libperf.txt +++ b/tools/lib/perf/Documentation/libperf.txt @@ -37,7 +37,7 @@ SYNOPSIS struct perf_cpu_map; - struct perf_cpu_map *perf_cpu_map__dummy_new(void); + struct perf_cpu_map *perf_cpu_map__new_any_cpu(void); struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list); struct perf_cpu_map *perf_cpu_map__read(FILE *file); struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map); diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 2a5a292173740..2bd6aba3d8c9a 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -27,7 +27,7 @@ struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus) return result; } -struct perf_cpu_map *perf_cpu_map__dummy_new(void) +struct perf_cpu_map *perf_cpu_map__new_any_cpu(void) { struct perf_cpu_map *cpus = perf_cpu_map__alloc(1); @@ -271,7 +271,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list) else if (*cpu_list != '\0') cpus = cpu_map__default_new(); else - cpus = perf_cpu_map__dummy_new(); + cpus = perf_cpu_map__new_any_cpu(); invalid: free(tmp_cpus); out: diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c index 8b51b008a81f1..c07160953224a 100644 --- a/tools/lib/perf/evsel.c +++ b/tools/lib/perf/evsel.c @@ -120,7 +120,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct perf_cpu_map *cpus, static struct perf_cpu_map *empty_cpu_map; if (empty_cpu_map == NULL) { - empty_cpu_map = perf_cpu_map__dummy_new(); + empty_cpu_map = perf_cpu_map__new_any_cpu(); if (empty_cpu_map == NULL) return -ENOMEM; } diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index e38d859a384d2..d0bf218ada119 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -19,9 +19,9 @@ struct perf_cache { struct perf_cpu_map; /** - * perf_cpu_map__dummy_new - a map with a singular "any CPU"/dummy -1 value. + * perf_cpu_map__new_any_cpu - a map with a singular "any CPU"/dummy -1 value. */ -LIBPERF_API struct perf_cpu_map *perf_cpu_map__dummy_new(void); +LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_any_cpu(void); LIBPERF_API struct perf_cpu_map *perf_cpu_map__default_new(void); LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list); LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file); diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map index 190b56ae923ad..a8ff64baea3e1 100644 --- a/tools/lib/perf/libperf.map +++ b/tools/lib/perf/libperf.map @@ -1,7 +1,7 @@ LIBPERF_0.0.1 { global: libperf_init; - perf_cpu_map__dummy_new; + perf_cpu_map__new_any_cpu; perf_cpu_map__default_new; perf_cpu_map__get; perf_cpu_map__put; diff --git a/tools/lib/perf/tests/test-cpumap.c b/tools/lib/perf/tests/test-cpumap.c index 87b0510a556ff..2c359bdb951ee 100644 --- a/tools/lib/perf/tests/test-cpumap.c +++ b/tools/lib/perf/tests/test-cpumap.c @@ -21,7 +21,7 @@ int test_cpumap(int argc, char **argv) libperf_init(libperf_print); - cpus = perf_cpu_map__dummy_new(); + cpus = perf_cpu_map__new_any_cpu(); if (!cpus) return -1; diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c index ed616fc19b4f2..ab63878bacb99 100644 --- a/tools/lib/perf/tests/test-evlist.c +++ b/tools/lib/perf/tests/test-evlist.c @@ -261,7 +261,7 @@ static int test_mmap_thread(void) threads = perf_thread_map__new_dummy(); __T("failed to create threads", threads); - cpus = perf_cpu_map__dummy_new(); + cpus = perf_cpu_map__new_any_cpu(); __T("failed to create cpus", cpus); perf_thread_map__set_pid(threads, 0, pid); diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c index 7730fc2ab40b7..bd8e396f3e57b 100644 --- a/tools/perf/tests/cpumap.c +++ b/tools/perf/tests/cpumap.c @@ -213,7 +213,7 @@ static int test__cpu_map_intersect(struct test_suite *test __maybe_unused, static int test__cpu_map_equal(struct test_suite *test __maybe_unused, int subtest __maybe_unused) { - struct perf_cpu_map *any = perf_cpu_map__dummy_new(); + struct perf_cpu_map *any = perf_cpu_map__new_any_cpu(); struct perf_cpu_map *one = perf_cpu_map__new("1"); struct perf_cpu_map *two = perf_cpu_map__new("2"); struct perf_cpu_map *empty = perf_cpu_map__intersect(one, two); diff --git a/tools/perf/tests/sw-clock.c b/tools/perf/tests/sw-clock.c index 4d7493fa01059..290716783ac6a 100644 --- a/tools/perf/tests/sw-clock.c +++ b/tools/perf/tests/sw-clock.c @@ -62,7 +62,7 @@ static int __test__sw_clock_freq(enum perf_sw_ids clock_id) } evlist__add(evlist, evsel); - cpus = perf_cpu_map__dummy_new(); + cpus = perf_cpu_map__new_any_cpu(); threads = thread_map__new_by_tid(getpid()); if (!cpus || !threads) { err = -ENOMEM; diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c index 968dddde6ddaf..d33d0952025cf 100644 --- a/tools/perf/tests/task-exit.c +++ b/tools/perf/tests/task-exit.c @@ -70,7 +70,7 @@ static int test__task_exit(struct test_suite *test __maybe_unused, int subtest _ * evlist__prepare_workload we'll fill in the only thread * we're monitoring, the one forked there. */ - cpus = perf_cpu_map__dummy_new(); + cpus = perf_cpu_map__new_any_cpu(); threads = thread_map__new_by_tid(-1); if (!cpus || !threads) { err = -ENOMEM; diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index f86a1eb4ea36a..8a4d1d5fd5753 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1064,7 +1064,7 @@ int evlist__create_maps(struct evlist *evlist, struct target *target) return -1; if (target__uses_dummy_map(target)) - cpus = perf_cpu_map__dummy_new(); + cpus = perf_cpu_map__new_any_cpu(); else cpus = perf_cpu_map__new(target->cpu_list); diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 2a6295f1ac1bc..05ecdca9e0190 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1811,7 +1811,7 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus, if (cpus == NULL) { if (empty_cpu_map == NULL) { - empty_cpu_map = perf_cpu_map__dummy_new(); + empty_cpu_map = perf_cpu_map__new_any_cpu(); if (empty_cpu_map == NULL) return -ENOMEM; } From e56a040c23b4c7fa250af0c778cfb361615319a0 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:01:59 -0800 Subject: [PATCH 242/548] libperf cpumap: Rename perf_cpu_map__default_new() to perf_cpu_map__new_online_cpus() and prefer sysfs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rename perf_cpu_map__default_new() to perf_cpu_map__new_online_cpus() to better indicate what the implementation does. Read the online CPUs from /sys/devices/system/cpu/online first before using sysconf() as it can't accurately configure holes in the CPU map. If sysconf() is used, warn when the configured and online processors disagree. When reading from a file, if the read doesn't yield a CPU map then return an empty map rather than the default online. This avoids recursion but also better yields being able to detect failures. Add more comments. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Jajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231129060211.1890454-3-irogers@google.com [ s/syfs/sysfs/g typo ] Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 8f60f870a9af53295ab4301da05ca453f115a6b6) Signed-off-by: Koba Ko --- tools/lib/perf/cpumap.c | 59 +++++++++++++++++----------- tools/lib/perf/include/perf/cpumap.h | 15 ++++++- tools/lib/perf/libperf.map | 2 +- tools/lib/perf/tests/test-cpumap.c | 2 +- 4 files changed, 51 insertions(+), 27 deletions(-) diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 2bd6aba3d8c9a..3aa80d0d26e86 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -9,6 +9,7 @@ #include #include #include +#include "internal.h" void perf_cpu_map__set_nr(struct perf_cpu_map *map, int nr_cpus) { @@ -66,15 +67,21 @@ void perf_cpu_map__put(struct perf_cpu_map *map) } } -static struct perf_cpu_map *cpu_map__default_new(void) +static struct perf_cpu_map *cpu_map__new_sysconf(void) { struct perf_cpu_map *cpus; - int nr_cpus; + int nr_cpus, nr_cpus_conf; nr_cpus = sysconf(_SC_NPROCESSORS_ONLN); if (nr_cpus < 0) return NULL; + nr_cpus_conf = sysconf(_SC_NPROCESSORS_CONF); + if (nr_cpus != nr_cpus_conf) { + pr_warning("Number of online CPUs (%d) differs from the number configured (%d) the CPU map will only cover the first %d CPUs.", + nr_cpus, nr_cpus_conf, nr_cpus); + } + cpus = perf_cpu_map__alloc(nr_cpus); if (cpus != NULL) { int i; @@ -86,9 +93,27 @@ static struct perf_cpu_map *cpu_map__default_new(void) return cpus; } -struct perf_cpu_map *perf_cpu_map__default_new(void) +static struct perf_cpu_map *cpu_map__new_sysfs_online(void) { - return cpu_map__default_new(); + struct perf_cpu_map *cpus = NULL; + FILE *onlnf; + + onlnf = fopen("/sys/devices/system/cpu/online", "r"); + if (onlnf) { + cpus = perf_cpu_map__read(onlnf); + fclose(onlnf); + } + return cpus; +} + +struct perf_cpu_map *perf_cpu_map__new_online_cpus(void) +{ + struct perf_cpu_map *cpus = cpu_map__new_sysfs_online(); + + if (cpus) + return cpus; + + return cpu_map__new_sysconf(); } @@ -180,27 +205,11 @@ struct perf_cpu_map *perf_cpu_map__read(FILE *file) if (nr_cpus > 0) cpus = cpu_map__trim_new(nr_cpus, tmp_cpus); - else - cpus = cpu_map__default_new(); out_free_tmp: free(tmp_cpus); return cpus; } -static struct perf_cpu_map *cpu_map__read_all_cpu_map(void) -{ - struct perf_cpu_map *cpus = NULL; - FILE *onlnf; - - onlnf = fopen("/sys/devices/system/cpu/online", "r"); - if (!onlnf) - return cpu_map__default_new(); - - cpus = perf_cpu_map__read(onlnf); - fclose(onlnf); - return cpus; -} - struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list) { struct perf_cpu_map *cpus = NULL; @@ -211,7 +220,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list) int max_entries = 0; if (!cpu_list) - return cpu_map__read_all_cpu_map(); + return perf_cpu_map__new_online_cpus(); /* * must handle the case of empty cpumap to cover @@ -268,9 +277,11 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list) if (nr_cpus > 0) cpus = cpu_map__trim_new(nr_cpus, tmp_cpus); - else if (*cpu_list != '\0') - cpus = cpu_map__default_new(); - else + else if (*cpu_list != '\0') { + pr_warning("Unexpected characters at end of cpu list ('%s'), using online CPUs.", + cpu_list); + cpus = perf_cpu_map__new_online_cpus(); + } else cpus = perf_cpu_map__new_any_cpu(); invalid: free(tmp_cpus); diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index d0bf218ada119..b24bd8b8f34e1 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -22,7 +22,20 @@ struct perf_cpu_map; * perf_cpu_map__new_any_cpu - a map with a singular "any CPU"/dummy -1 value. */ LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_any_cpu(void); -LIBPERF_API struct perf_cpu_map *perf_cpu_map__default_new(void); +/** + * perf_cpu_map__new_online_cpus - a map read from + * /sys/devices/system/cpu/online if + * available. If reading wasn't possible a map + * is created using the online processors + * assuming the first 'n' processors are all + * online. + */ +LIBPERF_API struct perf_cpu_map *perf_cpu_map__new_online_cpus(void); +/** + * perf_cpu_map__new - create a map from the given cpu_list such as "0-7". If no + * cpu_list argument is provided then + * perf_cpu_map__new_online_cpus is returned. + */ LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list); LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file); LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map); diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map index a8ff64baea3e1..8a71f841498e4 100644 --- a/tools/lib/perf/libperf.map +++ b/tools/lib/perf/libperf.map @@ -2,7 +2,7 @@ LIBPERF_0.0.1 { global: libperf_init; perf_cpu_map__new_any_cpu; - perf_cpu_map__default_new; + perf_cpu_map__new_online_cpus; perf_cpu_map__get; perf_cpu_map__put; perf_cpu_map__new; diff --git a/tools/lib/perf/tests/test-cpumap.c b/tools/lib/perf/tests/test-cpumap.c index 2c359bdb951ee..c998b1dae8631 100644 --- a/tools/lib/perf/tests/test-cpumap.c +++ b/tools/lib/perf/tests/test-cpumap.c @@ -29,7 +29,7 @@ int test_cpumap(int argc, char **argv) perf_cpu_map__put(cpus); perf_cpu_map__put(cpus); - cpus = perf_cpu_map__default_new(); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) return -1; From c386dd228219ae0aa7204deb8386a8135a2c6397 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:02:00 -0800 Subject: [PATCH 243/548] libperf cpumap: Rename perf_cpu_map__empty() to perf_cpu_map__has_any_cpu_or_is_empty() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The name perf_cpu_map_empty is misleading as true is also returned when the map contains an "any" CPU (aka dummy) map. Rename to perf_cpu_map__has_any_cpu_or_is_empty(), later changes will (re)introduce perf_cpu_map__empty() and perf_cpu_map__has_any_cpu(). Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Jajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231129060211.1890454-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 923ca62a7b1edceaa61eb6ac8dc56fdac51913b8) Signed-off-by: Koba Ko --- tools/lib/perf/Documentation/libperf.txt | 2 +- tools/lib/perf/cpumap.c | 2 +- tools/lib/perf/evlist.c | 4 ++-- tools/lib/perf/include/perf/cpumap.h | 4 ++-- tools/lib/perf/libperf.map | 2 +- tools/perf/arch/arm/util/cs-etm.c | 10 +++++----- tools/perf/arch/arm64/util/arm-spe.c | 4 ++-- tools/perf/arch/x86/util/intel-bts.c | 4 ++-- tools/perf/arch/x86/util/intel-pt.c | 10 +++++----- tools/perf/builtin-c2c.c | 2 +- tools/perf/builtin-stat.c | 6 +++--- tools/perf/util/auxtrace.c | 4 ++-- tools/perf/util/record.c | 2 +- tools/perf/util/stat.c | 2 +- 14 files changed, 29 insertions(+), 29 deletions(-) diff --git a/tools/lib/perf/Documentation/libperf.txt b/tools/lib/perf/Documentation/libperf.txt index a256a26598b04..fcfb9499ef9cd 100644 --- a/tools/lib/perf/Documentation/libperf.txt +++ b/tools/lib/perf/Documentation/libperf.txt @@ -46,7 +46,7 @@ SYNOPSIS void perf_cpu_map__put(struct perf_cpu_map *map); int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx); int perf_cpu_map__nr(const struct perf_cpu_map *cpus); - bool perf_cpu_map__empty(const struct perf_cpu_map *map); + bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map); int perf_cpu_map__max(struct perf_cpu_map *map); bool perf_cpu_map__has(const struct perf_cpu_map *map, int cpu); diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 3aa80d0d26e86..4adcd7920d033 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -311,7 +311,7 @@ int perf_cpu_map__nr(const struct perf_cpu_map *cpus) return cpus ? __perf_cpu_map__nr(cpus) : 1; } -bool perf_cpu_map__empty(const struct perf_cpu_map *map) +bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map) { return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true; } diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index fad607789d1e5..bc575b9e624c0 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -625,7 +625,7 @@ static int perf_evlist__nr_mmaps(struct perf_evlist *evlist) /* One for each CPU */ nr_mmaps = perf_cpu_map__nr(evlist->all_cpus); - if (perf_cpu_map__empty(evlist->all_cpus)) { + if (perf_cpu_map__has_any_cpu_or_is_empty(evlist->all_cpus)) { /* Plus one for each thread */ nr_mmaps += perf_thread_map__nr(evlist->threads); /* Minus the per-thread CPU (-1) */ @@ -659,7 +659,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist, if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0) return -ENOMEM; - if (perf_cpu_map__empty(cpus)) + if (perf_cpu_map__has_any_cpu_or_is_empty(cpus)) return mmap_per_thread(evlist, ops, mp); return mmap_per_cpu(evlist, ops, mp); diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index b24bd8b8f34e1..9cf361fc5edcb 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -47,9 +47,9 @@ LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map); LIBPERF_API struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx); LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus); /** - * perf_cpu_map__empty - is map either empty or the "any CPU"/dummy value. + * perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value. */ -LIBPERF_API bool perf_cpu_map__empty(const struct perf_cpu_map *map); +LIBPERF_API bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map); LIBPERF_API struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map); LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu); LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs, diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map index 8a71f841498e4..10b3f37226426 100644 --- a/tools/lib/perf/libperf.map +++ b/tools/lib/perf/libperf.map @@ -9,7 +9,7 @@ LIBPERF_0.0.1 { perf_cpu_map__read; perf_cpu_map__nr; perf_cpu_map__cpu; - perf_cpu_map__empty; + perf_cpu_map__has_any_cpu_or_is_empty; perf_cpu_map__max; perf_cpu_map__has; perf_thread_map__new_array; diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 58c506e9788db..12066f101467d 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -211,7 +211,7 @@ static int cs_etm_validate_config(struct auxtrace_record *itr, * program can run on any CPUs in this case, thus don't skip * validation. */ - if (!perf_cpu_map__empty(event_cpus) && + if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus) && !perf_cpu_map__has(event_cpus, cpu)) continue; @@ -435,7 +435,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, * Also the case of per-cpu mmaps, need the contextID in order to be notified * when a context switch happened. */ - if (!perf_cpu_map__empty(cpus)) { + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel, "timestamp", 1); evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel, @@ -461,7 +461,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, evsel->core.attr.sample_period = 1; /* In per-cpu case, always need the time of mmap events etc */ - if (!perf_cpu_map__empty(cpus)) + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(evsel, TIME); err = cs_etm_validate_config(itr, cs_etm_evsel); @@ -539,7 +539,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL); /* cpu map is not empty, we have specific CPUs to work with */ - if (!perf_cpu_map__empty(event_cpus)) { + if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) { for (i = 0; i < cpu__max_cpu().cpu; i++) { struct perf_cpu cpu = { .cpu = i, }; @@ -814,7 +814,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, return -EINVAL; /* If the cpu_map is empty all online CPUs are involved */ - if (perf_cpu_map__empty(event_cpus)) { + if (perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) { cpu_map = online_cpus; } else { /* Make sure all specified CPUs are online */ diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index 9cc3d6dcb849f..0cdedc6085cba 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -213,7 +213,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, * In the case of per-cpu mmaps, sample CPU for AUX event; * also enable the timestamp tracing for samples correlation. */ - if (!perf_cpu_map__empty(cpus)) { + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { evsel__set_sample_bit(arm_spe_evsel, CPU); evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel, "ts_enable", 1); @@ -246,7 +246,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, tracking_evsel->core.attr.sample_period = 1; /* In per-cpu case, always need the time of mmap events etc */ - if (!perf_cpu_map__empty(cpus)) { + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { evsel__set_sample_bit(tracking_evsel, TIME); evsel__set_sample_bit(tracking_evsel, CPU); diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c index d2c8cac114702..af8ae4647585b 100644 --- a/tools/perf/arch/x86/util/intel-bts.c +++ b/tools/perf/arch/x86/util/intel-bts.c @@ -143,7 +143,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr, if (!opts->full_auxtrace) return 0; - if (opts->full_auxtrace && !perf_cpu_map__empty(cpus)) { + if (opts->full_auxtrace && !perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { pr_err(INTEL_BTS_PMU_NAME " does not support per-cpu recording\n"); return -EINVAL; } @@ -224,7 +224,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr, * In the case of per-cpu mmaps, we need the CPU on the * AUX event. */ - if (!perf_cpu_map__empty(cpus)) + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(intel_bts_evsel, CPU); } diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index 69f02b3c812a3..d2afca4aec360 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -373,7 +373,7 @@ static int intel_pt_info_fill(struct auxtrace_record *itr, ui__warning("Intel Processor Trace: TSC not available\n"); } - per_cpu_mmaps = !perf_cpu_map__empty(session->evlist->core.user_requested_cpus); + per_cpu_mmaps = !perf_cpu_map__has_any_cpu_or_is_empty(session->evlist->core.user_requested_cpus); auxtrace_info->type = PERF_AUXTRACE_INTEL_PT; auxtrace_info->priv[INTEL_PT_PMU_TYPE] = intel_pt_pmu->type; @@ -790,7 +790,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, * Per-cpu recording needs sched_switch events to distinguish different * threads. */ - if (have_timing_info && !perf_cpu_map__empty(cpus) && + if (have_timing_info && !perf_cpu_map__has_any_cpu_or_is_empty(cpus) && !record_opts__no_switch_events(opts)) { if (perf_can_record_switch_events()) { bool cpu_wide = !target__none(&opts->target) && @@ -848,7 +848,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, * In the case of per-cpu mmaps, we need the CPU on the * AUX event. */ - if (!perf_cpu_map__empty(cpus)) + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(intel_pt_evsel, CPU); } @@ -874,7 +874,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, tracking_evsel->immediate = true; /* In per-cpu case, always need the time of mmap events etc */ - if (!perf_cpu_map__empty(cpus)) { + if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { evsel__set_sample_bit(tracking_evsel, TIME); /* And the CPU for switch events */ evsel__set_sample_bit(tracking_evsel, CPU); @@ -886,7 +886,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, * Warn the user when we do not have enough information to decode i.e. * per-cpu with no sched_switch (except workload-only). */ - if (!ptr->have_sched_switch && !perf_cpu_map__empty(cpus) && + if (!ptr->have_sched_switch && !perf_cpu_map__has_any_cpu_or_is_empty(cpus) && !target__none(&opts->target) && !intel_pt_evsel->core.attr.exclude_user) ui__warning("Intel Processor Trace decoding will not be possible except for kernel tracing!\n"); diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c index a4cf9de7a7b5a..f78eea9e21539 100644 --- a/tools/perf/builtin-c2c.c +++ b/tools/perf/builtin-c2c.c @@ -2320,7 +2320,7 @@ static int setup_nodes(struct perf_session *session) nodes[node] = set; /* empty node, skip */ - if (perf_cpu_map__empty(map)) + if (perf_cpu_map__has_any_cpu_or_is_empty(map)) continue; perf_cpu_map__for_each_cpu(cpu, idx, map) { diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 9692ebdd7f11e..045198106fc58 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -1338,7 +1338,7 @@ static int cpu__get_cache_id_from_map(struct perf_cpu cpu, char *map) * be the first online CPU in the cache domain else use the * first online CPU of the cache domain as the ID. */ - if (perf_cpu_map__empty(cpu_map)) + if (perf_cpu_map__has_any_cpu_or_is_empty(cpu_map)) id = cpu.cpu; else id = perf_cpu_map__cpu(cpu_map, 0).cpu; @@ -1644,7 +1644,7 @@ static int perf_stat_init_aggr_mode(void) * taking the highest cpu number to be the size of * the aggregation translate cpumap. */ - if (!perf_cpu_map__empty(evsel_list->core.user_requested_cpus)) + if (!perf_cpu_map__has_any_cpu_or_is_empty(evsel_list->core.user_requested_cpus)) nr = perf_cpu_map__max(evsel_list->core.user_requested_cpus).cpu; else nr = 0; @@ -2311,7 +2311,7 @@ int process_stat_config_event(struct perf_session *session, perf_event__read_stat_config(&stat_config, &event->stat_config); - if (perf_cpu_map__empty(st->cpus)) { + if (perf_cpu_map__has_any_cpu_or_is_empty(st->cpus)) { if (st->aggr_mode != AGGR_UNSET) pr_warning("warning: processing task data, aggregation mode not set\n"); } else if (st->aggr_mode != AGGR_UNSET) { diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index c51829fdef23b..f572de2e83467 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -174,7 +174,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp, struct evlist *evlist, struct evsel *evsel, int idx) { - bool per_cpu = !perf_cpu_map__empty(evlist->core.user_requested_cpus); + bool per_cpu = !perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus); mp->mmap_needed = evsel->needs_auxtrace_mmap; @@ -648,7 +648,7 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr, static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx) { - bool per_cpu_mmaps = !perf_cpu_map__empty(evlist->core.user_requested_cpus); + bool per_cpu_mmaps = !perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus); if (per_cpu_mmaps) { struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx); diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c index 9eb5c6a08999e..40290382b2d70 100644 --- a/tools/perf/util/record.c +++ b/tools/perf/util/record.c @@ -237,7 +237,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str) evsel = evlist__last(temp_evlist); - if (!evlist || perf_cpu_map__empty(evlist->core.user_requested_cpus)) { + if (!evlist || perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus)) { struct perf_cpu_map *cpus = perf_cpu_map__new(NULL); if (cpus) diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c index ec35060422173..012c4946b9c49 100644 --- a/tools/perf/util/stat.c +++ b/tools/perf/util/stat.c @@ -315,7 +315,7 @@ static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals, if (!counter->per_pkg) return 0; - if (perf_cpu_map__empty(cpus)) + if (perf_cpu_map__has_any_cpu_or_is_empty(cpus)) return 0; if (!mask) { From be4b6ea9e39972099c82b861afaa5e4b3a90a1e0 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:02:01 -0800 Subject: [PATCH 244/548] libperf cpumap: Replace usage of perf_cpu_map__new(NULL) with perf_cpu_map__new_online_cpus() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Passing NULL to perf_cpu_map__new() performs perf_cpu_map__new_online_cpus(), just directly call perf_cpu_map__new_online_cpus() to be more intention revealing. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Jajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231129060211.1890454-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit effe957c6bb70cac12918c0f5fd4cefb35967618) Signed-off-by: Koba Ko --- tools/lib/perf/Documentation/examples/sampling.c | 2 +- tools/lib/perf/Documentation/libperf-sampling.txt | 2 +- tools/lib/perf/evlist.c | 2 +- tools/lib/perf/tests/test-evlist.c | 4 ++-- tools/lib/perf/tests/test-evsel.c | 2 +- tools/perf/arch/arm/util/cs-etm.c | 6 +++--- tools/perf/arch/arm64/util/header.c | 2 +- tools/perf/bench/epoll-ctl.c | 2 +- tools/perf/bench/epoll-wait.c | 2 +- tools/perf/bench/futex-hash.c | 2 +- tools/perf/bench/futex-lock-pi.c | 2 +- tools/perf/bench/futex-requeue.c | 2 +- tools/perf/bench/futex-wake-parallel.c | 2 +- tools/perf/bench/futex-wake.c | 2 +- tools/perf/builtin-ftrace.c | 2 +- tools/perf/tests/code-reading.c | 2 +- tools/perf/tests/keep-tracking.c | 2 +- tools/perf/tests/mmap-basic.c | 2 +- tools/perf/tests/openat-syscall-all-cpus.c | 2 +- tools/perf/tests/perf-time-to-tsc.c | 2 +- tools/perf/tests/switch-tracking.c | 2 +- tools/perf/tests/topology.c | 2 +- tools/perf/util/bpf_counter.c | 2 +- tools/perf/util/cpumap.c | 2 +- tools/perf/util/cputopo.c | 2 +- tools/perf/util/evlist.c | 2 +- tools/perf/util/perf_api_probe.c | 4 ++-- tools/perf/util/record.c | 2 +- 28 files changed, 32 insertions(+), 32 deletions(-) diff --git a/tools/lib/perf/Documentation/examples/sampling.c b/tools/lib/perf/Documentation/examples/sampling.c index 8e1a926a9cfe6..bc142f0664b5a 100644 --- a/tools/lib/perf/Documentation/examples/sampling.c +++ b/tools/lib/perf/Documentation/examples/sampling.c @@ -39,7 +39,7 @@ int main(int argc, char **argv) libperf_init(libperf_print); - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) { fprintf(stderr, "failed to create cpus\n"); return -1; diff --git a/tools/lib/perf/Documentation/libperf-sampling.txt b/tools/lib/perf/Documentation/libperf-sampling.txt index d6ca24f6ef78f..2378980fab8a6 100644 --- a/tools/lib/perf/Documentation/libperf-sampling.txt +++ b/tools/lib/perf/Documentation/libperf-sampling.txt @@ -97,7 +97,7 @@ In this case we will monitor all the available CPUs: [source,c] -- - 42 cpus = perf_cpu_map__new(NULL); + 42 cpus = perf_cpu_map__new_online_cpus(); 43 if (!cpus) { 44 fprintf(stderr, "failed to create cpus\n"); 45 return -1; diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index bc575b9e624c0..c6d67fc9e57ef 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -39,7 +39,7 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, if (evsel->system_wide) { /* System wide: set the cpu map of the evsel to all online CPUs. */ perf_cpu_map__put(evsel->cpus); - evsel->cpus = perf_cpu_map__new(NULL); + evsel->cpus = perf_cpu_map__new_online_cpus(); } else if (evlist->has_user_cpus && evsel->is_pmu_core) { /* * User requested CPUs on a core PMU, ensure the requested CPUs diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c index ab63878bacb99..10f70cb41ff1d 100644 --- a/tools/lib/perf/tests/test-evlist.c +++ b/tools/lib/perf/tests/test-evlist.c @@ -46,7 +46,7 @@ static int test_stat_cpu(void) }; int err, idx; - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); __T("failed to create cpus", cpus); evlist = perf_evlist__new(); @@ -350,7 +350,7 @@ static int test_mmap_cpus(void) attr.config = id; - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); __T("failed to create cpus", cpus); evlist = perf_evlist__new(); diff --git a/tools/lib/perf/tests/test-evsel.c b/tools/lib/perf/tests/test-evsel.c index a11fc51bfb688..545ec31505466 100644 --- a/tools/lib/perf/tests/test-evsel.c +++ b/tools/lib/perf/tests/test-evsel.c @@ -27,7 +27,7 @@ static int test_stat_cpu(void) }; int err, idx; - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); __T("failed to create cpus", cpus); evsel = perf_evsel__new(&attr); diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 12066f101467d..37738fca939fd 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -199,7 +199,7 @@ static int cs_etm_validate_config(struct auxtrace_record *itr, { int i, err = -EINVAL; struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus; - struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL); + struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); /* Set option of each CPU we have */ for (i = 0; i < cpu__max_cpu().cpu; i++) { @@ -536,7 +536,7 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, int i; int etmv3 = 0, etmv4 = 0, ete = 0; struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus; - struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL); + struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); /* cpu map is not empty, we have specific CPUs to work with */ if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) { @@ -802,7 +802,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, u64 nr_cpu, type; struct perf_cpu_map *cpu_map; struct perf_cpu_map *event_cpus = session->evlist->core.user_requested_cpus; - struct perf_cpu_map *online_cpus = perf_cpu_map__new(NULL); + struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c index a2eef9ec54910..97037499152ef 100644 --- a/tools/perf/arch/arm64/util/header.c +++ b/tools/perf/arch/arm64/util/header.c @@ -57,7 +57,7 @@ static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map *cpus) int get_cpuid(char *buf, size_t sz) { - struct perf_cpu_map *cpus = perf_cpu_map__new(NULL); + struct perf_cpu_map *cpus = perf_cpu_map__new_online_cpus(); int ret; if (!cpus) diff --git a/tools/perf/bench/epoll-ctl.c b/tools/perf/bench/epoll-ctl.c index 6bfffe83dde99..d3db73dac66af 100644 --- a/tools/perf/bench/epoll-ctl.c +++ b/tools/perf/bench/epoll-ctl.c @@ -330,7 +330,7 @@ int bench_epoll_ctl(int argc, const char **argv) act.sa_sigaction = toggle_done; sigaction(SIGINT, &act, NULL); - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) goto errmem; diff --git a/tools/perf/bench/epoll-wait.c b/tools/perf/bench/epoll-wait.c index f370f7872ba4c..298c67cb10c83 100644 --- a/tools/perf/bench/epoll-wait.c +++ b/tools/perf/bench/epoll-wait.c @@ -449,7 +449,7 @@ int bench_epoll_wait(int argc, const char **argv) act.sa_sigaction = toggle_done; sigaction(SIGINT, &act, NULL); - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) goto errmem; diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c index 2005a3fa30267..0c69d20efa329 100644 --- a/tools/perf/bench/futex-hash.c +++ b/tools/perf/bench/futex-hash.c @@ -138,7 +138,7 @@ int bench_futex_hash(int argc, const char **argv) exit(EXIT_FAILURE); } - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) goto errmem; diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c index 092cbd52db82b..7a4973346180f 100644 --- a/tools/perf/bench/futex-lock-pi.c +++ b/tools/perf/bench/futex-lock-pi.c @@ -172,7 +172,7 @@ int bench_futex_lock_pi(int argc, const char **argv) if (argc) goto err; - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) err(EXIT_FAILURE, "calloc"); diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c index c0035990a33ce..d9ad736c1a3e0 100644 --- a/tools/perf/bench/futex-requeue.c +++ b/tools/perf/bench/futex-requeue.c @@ -174,7 +174,7 @@ int bench_futex_requeue(int argc, const char **argv) if (argc) goto err; - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) err(EXIT_FAILURE, "cpu_map__new"); diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c index 5ab0234d74e69..b66df553e5614 100644 --- a/tools/perf/bench/futex-wake-parallel.c +++ b/tools/perf/bench/futex-wake-parallel.c @@ -264,7 +264,7 @@ int bench_futex_wake_parallel(int argc, const char **argv) err(EXIT_FAILURE, "mlockall"); } - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) err(EXIT_FAILURE, "calloc"); diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c index 18a5894af8bb5..690fd6d3da130 100644 --- a/tools/perf/bench/futex-wake.c +++ b/tools/perf/bench/futex-wake.c @@ -149,7 +149,7 @@ int bench_futex_wake(int argc, const char **argv) exit(EXIT_FAILURE); } - cpu = perf_cpu_map__new(NULL); + cpu = perf_cpu_map__new_online_cpus(); if (!cpu) err(EXIT_FAILURE, "calloc"); diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c index a1971703e49cb..8eab3cdb2c6de 100644 --- a/tools/perf/builtin-ftrace.c +++ b/tools/perf/builtin-ftrace.c @@ -333,7 +333,7 @@ static int set_tracing_func_irqinfo(struct perf_ftrace *ftrace) static int reset_tracing_cpu(void) { - struct perf_cpu_map *cpumap = perf_cpu_map__new(NULL); + struct perf_cpu_map *cpumap = perf_cpu_map__new_online_cpus(); int ret; ret = set_tracing_cpumask(cpumap); diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c index ff249555ca57a..813c99006a875 100644 --- a/tools/perf/tests/code-reading.c +++ b/tools/perf/tests/code-reading.c @@ -630,7 +630,7 @@ static int do_test_code_reading(bool try_kcore) goto out_put; } - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) { pr_debug("perf_cpu_map__new failed\n"); goto out_put; diff --git a/tools/perf/tests/keep-tracking.c b/tools/perf/tests/keep-tracking.c index 8f4f9b632e1e5..5a3b2bed07f32 100644 --- a/tools/perf/tests/keep-tracking.c +++ b/tools/perf/tests/keep-tracking.c @@ -81,7 +81,7 @@ static int test__keep_tracking(struct test_suite *test __maybe_unused, int subte threads = thread_map__new(-1, getpid(), UINT_MAX); CHECK_NOT_NULL__(threads); - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); CHECK_NOT_NULL__(cpus); evlist = evlist__new(); diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c index 886a13a77a162..012c8ae439fdc 100644 --- a/tools/perf/tests/mmap-basic.c +++ b/tools/perf/tests/mmap-basic.c @@ -52,7 +52,7 @@ static int test__basic_mmap(struct test_suite *test __maybe_unused, int subtest return -1; } - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (cpus == NULL) { pr_debug("perf_cpu_map__new\n"); goto out_free_threads; diff --git a/tools/perf/tests/openat-syscall-all-cpus.c b/tools/perf/tests/openat-syscall-all-cpus.c index f3275be83a338..fb114118c8764 100644 --- a/tools/perf/tests/openat-syscall-all-cpus.c +++ b/tools/perf/tests/openat-syscall-all-cpus.c @@ -37,7 +37,7 @@ static int test__openat_syscall_event_on_all_cpus(struct test_suite *test __mayb return -1; } - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (cpus == NULL) { pr_debug("perf_cpu_map__new\n"); goto out_thread_map_delete; diff --git a/tools/perf/tests/perf-time-to-tsc.c b/tools/perf/tests/perf-time-to-tsc.c index efcd71c2738af..bbe2ddeb9b745 100644 --- a/tools/perf/tests/perf-time-to-tsc.c +++ b/tools/perf/tests/perf-time-to-tsc.c @@ -93,7 +93,7 @@ static int test__perf_time_to_tsc(struct test_suite *test __maybe_unused, int su threads = thread_map__new(-1, getpid(), UINT_MAX); CHECK_NOT_NULL__(threads); - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); CHECK_NOT_NULL__(cpus); evlist = evlist__new(); diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c index 2c7bdf4fd55ed..ee43d8fa2ed67 100644 --- a/tools/perf/tests/switch-tracking.c +++ b/tools/perf/tests/switch-tracking.c @@ -351,7 +351,7 @@ static int test__switch_tracking(struct test_suite *test __maybe_unused, int sub goto out_err; } - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) { pr_debug("perf_cpu_map__new failed!\n"); goto out_err; diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c index 9dee63734e66a..2a842f53fbb57 100644 --- a/tools/perf/tests/topology.c +++ b/tools/perf/tests/topology.c @@ -215,7 +215,7 @@ static int test__session_topology(struct test_suite *test __maybe_unused, int su if (session_write_header(path)) goto free_path; - map = perf_cpu_map__new(NULL); + map = perf_cpu_map__new_online_cpus(); if (map == NULL) { pr_debug("failed to get system cpumap\n"); goto free_path; diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c index 6732cbbcf9b3c..6b6bcb07f1c8d 100644 --- a/tools/perf/util/bpf_counter.c +++ b/tools/perf/util/bpf_counter.c @@ -452,7 +452,7 @@ static int bperf__load(struct evsel *evsel, struct target *target) return -1; if (!all_cpu_map) { - all_cpu_map = perf_cpu_map__new(NULL); + all_cpu_map = perf_cpu_map__new_online_cpus(); if (!all_cpu_map) return -1; } diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c index 0e090e8bc3349..0581ee0fa5f27 100644 --- a/tools/perf/util/cpumap.c +++ b/tools/perf/util/cpumap.c @@ -672,7 +672,7 @@ struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */ static struct perf_cpu_map *online; if (!online) - online = perf_cpu_map__new(NULL); /* from /sys/devices/system/cpu/online */ + online = perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */ return online; } diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c index 81cfc85f46682..8bbeb2dc76fda 100644 --- a/tools/perf/util/cputopo.c +++ b/tools/perf/util/cputopo.c @@ -267,7 +267,7 @@ struct cpu_topology *cpu_topology__new(void) ncpus = cpu__max_present_cpu().cpu; /* build online CPU map */ - map = perf_cpu_map__new(NULL); + map = perf_cpu_map__new_online_cpus(); if (map == NULL) { pr_debug("failed to get system cpumap\n"); return NULL; diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 8a4d1d5fd5753..72b2f8909065b 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1360,7 +1360,7 @@ static int evlist__create_syswide_maps(struct evlist *evlist) * error, and we may not want to do that fallback to a * default cpu identity map :-\ */ - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) return -ENOMEM; diff --git a/tools/perf/util/perf_api_probe.c b/tools/perf/util/perf_api_probe.c index e1e2d701599c4..1de3b69cdf4aa 100644 --- a/tools/perf/util/perf_api_probe.c +++ b/tools/perf/util/perf_api_probe.c @@ -64,7 +64,7 @@ static bool perf_probe_api(setup_probe_fn_t fn) struct perf_cpu cpu; int ret, i = 0; - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) return false; cpu = perf_cpu_map__cpu(cpus, 0); @@ -140,7 +140,7 @@ bool perf_can_record_cpu_wide(void) struct perf_cpu cpu; int fd; - cpus = perf_cpu_map__new(NULL); + cpus = perf_cpu_map__new_online_cpus(); if (!cpus) return false; diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c index 40290382b2d70..87e817b3cf7e9 100644 --- a/tools/perf/util/record.c +++ b/tools/perf/util/record.c @@ -238,7 +238,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str) evsel = evlist__last(temp_evlist); if (!evlist || perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus)) { - struct perf_cpu_map *cpus = perf_cpu_map__new(NULL); + struct perf_cpu_map *cpus = perf_cpu_map__new_online_cpus(); if (cpus) cpu = perf_cpu_map__cpu(cpus, 0); From d94fb24a6dc31a6ea5dfdc279f3d2891d7906562 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:02:02 -0800 Subject: [PATCH 245/548] libperf cpumap: Add for_each_cpu() that skips the "any CPU" case MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When iterating CPUs in a CPU map it is often desirable to skip the "any CPU" (aka dummy) case. Add a helper for this and use in builtin-record. Reviewed-by: James Clark Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Jajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231129060211.1890454-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 5805c82513c444333efb086017be8d666336858a) Signed-off-by: Koba Ko --- tools/lib/perf/include/perf/cpumap.h | 6 ++++++ tools/perf/builtin-record.c | 4 +--- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index 9cf361fc5edcb..dbe0a7352b648 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -64,6 +64,12 @@ LIBPERF_API bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map); (idx) < perf_cpu_map__nr(cpus); \ (idx)++, (cpu) = perf_cpu_map__cpu(cpus, idx)) +#define perf_cpu_map__for_each_cpu_skip_any(_cpu, idx, cpus) \ + for ((idx) = 0, (_cpu) = perf_cpu_map__cpu(cpus, idx); \ + (idx) < perf_cpu_map__nr(cpus); \ + (idx)++, (_cpu) = perf_cpu_map__cpu(cpus, idx)) \ + if ((_cpu).cpu != -1) + #define perf_cpu_map__for_each_idx(idx, cpus) \ for ((idx) = 0; (idx) < perf_cpu_map__nr(cpus); (idx)++) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 81f77c0505fde..b538c132de3e6 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -3531,9 +3531,7 @@ static int record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cp if (cpu_map__is_dummy(cpus)) return 0; - perf_cpu_map__for_each_cpu(cpu, idx, cpus) { - if (cpu.cpu == -1) - continue; + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, cpus) { /* Return ENODEV is input cpu is greater than max cpu */ if ((unsigned long)cpu.cpu > mask->nbits) return -ENODEV; From 66e023fd90991461567e23146039a1e2f47c845c Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:50 -0800 Subject: [PATCH 246/548] libperf cpumap: Add any, empty and min helpers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Additional helpers to better replace perf_cpu_map__has_any_cpu_or_is_empty(). Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit b6b4a62d8525c3093c3273dc6b2e6831adbfc4b2) Signed-off-by: Koba Ko --- tools/lib/perf/cpumap.c | 27 +++++++++++++++++++++++++++ tools/lib/perf/include/perf/cpumap.h | 16 ++++++++++++++++ tools/lib/perf/libperf.map | 4 ++++ 3 files changed, 47 insertions(+) diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c index 4adcd7920d033..ba49552952c59 100644 --- a/tools/lib/perf/cpumap.c +++ b/tools/lib/perf/cpumap.c @@ -316,6 +316,19 @@ bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map) return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true; } +bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map) +{ + if (!map) + return true; + + return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 0).cpu == -1; +} + +bool perf_cpu_map__is_empty(const struct perf_cpu_map *map) +{ + return map == NULL; +} + int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu) { int low, high; @@ -372,6 +385,20 @@ bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map) return map && __perf_cpu_map__cpu(map, 0).cpu == -1; } +struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map) +{ + struct perf_cpu cpu, result = { + .cpu = -1 + }; + int idx; + + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, map) { + result = cpu; + break; + } + return result; +} + struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map) { struct perf_cpu result = { diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index dbe0a7352b648..523e4348fc96b 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -50,6 +50,22 @@ LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus); * perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value. */ LIBPERF_API bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map); +/** + * perf_cpu_map__is_any_cpu_or_is_empty - is map either empty or the "any CPU"/dummy value. + */ +LIBPERF_API bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map); +/** + * perf_cpu_map__is_empty - does the map contain no values and it doesn't + * contain the special "any CPU"/dummy value. + */ +LIBPERF_API bool perf_cpu_map__is_empty(const struct perf_cpu_map *map); +/** + * perf_cpu_map__min - the minimum CPU value or -1 if empty or just the "any CPU"/dummy value. + */ +LIBPERF_API struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map); +/** + * perf_cpu_map__max - the maximum CPU value or -1 if empty or just the "any CPU"/dummy value. + */ LIBPERF_API struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map); LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu); LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs, diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map index 10b3f37226426..2aa79b6960326 100644 --- a/tools/lib/perf/libperf.map +++ b/tools/lib/perf/libperf.map @@ -10,6 +10,10 @@ LIBPERF_0.0.1 { perf_cpu_map__nr; perf_cpu_map__cpu; perf_cpu_map__has_any_cpu_or_is_empty; + perf_cpu_map__is_any_cpu_or_is_empty; + perf_cpu_map__is_empty; + perf_cpu_map__has_any_cpu; + perf_cpu_map__min; perf_cpu_map__max; perf_cpu_map__has; perf_thread_map__new_array; From 2fecd34083ff74cb36a4f9f02dfb13448dac1163 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:52 -0800 Subject: [PATCH 247/548] perf arm-spe/cs-etm: Directly iterate CPU maps MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rather than iterate all CPUs and see if they are in CPU maps, directly iterate the CPU map. Similarly make use of the intersect function taking care for when "any" CPU is specified. Switch perf_cpu_map__has_any_cpu_or_is_empty() to more appropriate alternatives. Reviewed-by: James Clark Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit e28ee1239daa147a30c287ff0ad34ccadcc01de0) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 114 ++++++++++++--------------- tools/perf/arch/arm64/util/arm-spe.c | 4 +- 2 files changed, 51 insertions(+), 67 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 37738fca939fd..5d5d452230f2a 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -197,38 +197,37 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr, static int cs_etm_validate_config(struct auxtrace_record *itr, struct evsel *evsel) { - int i, err = -EINVAL; + int idx, err = 0; struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus; - struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); - - /* Set option of each CPU we have */ - for (i = 0; i < cpu__max_cpu().cpu; i++) { - struct perf_cpu cpu = { .cpu = i, }; + struct perf_cpu_map *intersect_cpus; + struct perf_cpu cpu; - /* - * In per-cpu case, do the validation for CPUs to work with. - * In per-thread case, the CPU map is empty. Since the traced - * program can run on any CPUs in this case, thus don't skip - * validation. - */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus) && - !perf_cpu_map__has(event_cpus, cpu)) - continue; + /* + * Set option of each CPU we have. In per-cpu case, do the validation + * for CPUs to work with. In per-thread case, the CPU map has the "any" + * CPU value. Since the traced program can run on any CPUs in this case, + * thus don't skip validation. + */ + if (!perf_cpu_map__has_any_cpu(event_cpus)) { + struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); - if (!perf_cpu_map__has(online_cpus, cpu)) - continue; + intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus); + perf_cpu_map__put(online_cpus); + } else { + intersect_cpus = perf_cpu_map__new_online_cpus(); + } - err = cs_etm_validate_context_id(itr, evsel, i); + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { + err = cs_etm_validate_context_id(itr, evsel, cpu.cpu); if (err) - goto out; - err = cs_etm_validate_timestamp(itr, evsel, i); + break; + + err = cs_etm_validate_timestamp(itr, evsel, cpu.cpu); if (err) - goto out; + break; } - err = 0; -out: - perf_cpu_map__put(online_cpus); + perf_cpu_map__put(intersect_cpus); return err; } @@ -435,7 +434,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, * Also the case of per-cpu mmaps, need the contextID in order to be notified * when a context switch happened. */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) { evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel, "timestamp", 1); evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel, @@ -461,7 +460,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, evsel->core.attr.sample_period = 1; /* In per-cpu case, always need the time of mmap events etc */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(evsel, TIME); err = cs_etm_validate_config(itr, cs_etm_evsel); @@ -533,45 +532,31 @@ static size_t cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, struct evlist *evlist __maybe_unused) { - int i; + int idx; int etmv3 = 0, etmv4 = 0, ete = 0; struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus; - struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); - - /* cpu map is not empty, we have specific CPUs to work with */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) { - for (i = 0; i < cpu__max_cpu().cpu; i++) { - struct perf_cpu cpu = { .cpu = i, }; + struct perf_cpu_map *intersect_cpus; + struct perf_cpu cpu; - if (!perf_cpu_map__has(event_cpus, cpu) || - !perf_cpu_map__has(online_cpus, cpu)) - continue; + if (!perf_cpu_map__has_any_cpu(event_cpus)) { + /* cpu map is not "any" CPU , we have specific CPUs to work with */ + struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); - if (cs_etm_is_ete(itr, i)) - ete++; - else if (cs_etm_is_etmv4(itr, i)) - etmv4++; - else - etmv3++; - } + intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus); + perf_cpu_map__put(online_cpus); } else { - /* get configuration for all CPUs in the system */ - for (i = 0; i < cpu__max_cpu().cpu; i++) { - struct perf_cpu cpu = { .cpu = i, }; - - if (!perf_cpu_map__has(online_cpus, cpu)) - continue; - - if (cs_etm_is_ete(itr, i)) - ete++; - else if (cs_etm_is_etmv4(itr, i)) - etmv4++; - else - etmv3++; - } + /* Event can be "any" CPU so count all online CPUs. */ + intersect_cpus = perf_cpu_map__new_online_cpus(); } - - perf_cpu_map__put(online_cpus); + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { + if (cs_etm_is_ete(itr, cpu.cpu)) + ete++; + else if (cs_etm_is_etmv4(itr, cpu.cpu)) + etmv4++; + else + etmv3++; + } + perf_cpu_map__put(intersect_cpus); return (CS_ETM_HEADER_SIZE + (ete * CS_ETE_PRIV_SIZE) + @@ -813,16 +798,15 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, if (!session->evlist->core.nr_mmaps) return -EINVAL; - /* If the cpu_map is empty all online CPUs are involved */ - if (perf_cpu_map__has_any_cpu_or_is_empty(event_cpus)) { + /* If the cpu_map has the "any" CPU all online CPUs are involved */ + if (perf_cpu_map__has_any_cpu(event_cpus)) { cpu_map = online_cpus; } else { /* Make sure all specified CPUs are online */ - for (i = 0; i < perf_cpu_map__nr(event_cpus); i++) { - struct perf_cpu cpu = { .cpu = i, }; + struct perf_cpu cpu; - if (perf_cpu_map__has(event_cpus, cpu) && - !perf_cpu_map__has(online_cpus, cpu)) + perf_cpu_map__for_each_cpu(cpu, i, event_cpus) { + if (!perf_cpu_map__has(online_cpus, cpu)) return -EINVAL; } diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index 0cdedc6085cba..4e947814ea748 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -213,7 +213,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, * In the case of per-cpu mmaps, sample CPU for AUX event; * also enable the timestamp tracing for samples correlation. */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) { evsel__set_sample_bit(arm_spe_evsel, CPU); evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel, "ts_enable", 1); @@ -246,7 +246,7 @@ static int arm_spe_recording_options(struct auxtrace_record *itr, tracking_evsel->core.attr.sample_period = 1; /* In per-cpu case, always need the time of mmap events etc */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) { evsel__set_sample_bit(tracking_evsel, TIME); evsel__set_sample_bit(tracking_evsel, CPU); From ba4e1d1ee5687331e720414b37e84185709a9e81 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:53 -0800 Subject: [PATCH 248/548] perf intel-pt/intel-bts: Switch perf_cpu_map__has_any_cpu_or_is_empty use MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Switch perf_cpu_map__has_any_cpu_or_is_empty() to perf_cpu_map__is_any_cpu_or_is_empty() as a CPU map may contain CPUs as well as the dummy event and perf_cpu_map__is_any_cpu_or_is_empty() is a more correct alternative. Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 291dcd774b644875c9eb2a56b272919fe35c5c6f) Signed-off-by: Koba Ko --- tools/perf/arch/x86/util/intel-bts.c | 4 ++-- tools/perf/arch/x86/util/intel-pt.c | 10 +++++----- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c index af8ae4647585b..34696f3d3d5df 100644 --- a/tools/perf/arch/x86/util/intel-bts.c +++ b/tools/perf/arch/x86/util/intel-bts.c @@ -143,7 +143,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr, if (!opts->full_auxtrace) return 0; - if (opts->full_auxtrace && !perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { + if (opts->full_auxtrace && !perf_cpu_map__is_any_cpu_or_is_empty(cpus)) { pr_err(INTEL_BTS_PMU_NAME " does not support per-cpu recording\n"); return -EINVAL; } @@ -224,7 +224,7 @@ static int intel_bts_recording_options(struct auxtrace_record *itr, * In the case of per-cpu mmaps, we need the CPU on the * AUX event. */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(intel_bts_evsel, CPU); } diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index d2afca4aec360..7b47bbc7afc76 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -373,7 +373,7 @@ static int intel_pt_info_fill(struct auxtrace_record *itr, ui__warning("Intel Processor Trace: TSC not available\n"); } - per_cpu_mmaps = !perf_cpu_map__has_any_cpu_or_is_empty(session->evlist->core.user_requested_cpus); + per_cpu_mmaps = !perf_cpu_map__is_any_cpu_or_is_empty(session->evlist->core.user_requested_cpus); auxtrace_info->type = PERF_AUXTRACE_INTEL_PT; auxtrace_info->priv[INTEL_PT_PMU_TYPE] = intel_pt_pmu->type; @@ -790,7 +790,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, * Per-cpu recording needs sched_switch events to distinguish different * threads. */ - if (have_timing_info && !perf_cpu_map__has_any_cpu_or_is_empty(cpus) && + if (have_timing_info && !perf_cpu_map__is_any_cpu_or_is_empty(cpus) && !record_opts__no_switch_events(opts)) { if (perf_can_record_switch_events()) { bool cpu_wide = !target__none(&opts->target) && @@ -848,7 +848,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, * In the case of per-cpu mmaps, we need the CPU on the * AUX event. */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(intel_pt_evsel, CPU); } @@ -874,7 +874,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, tracking_evsel->immediate = true; /* In per-cpu case, always need the time of mmap events etc */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(cpus)) { + if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) { evsel__set_sample_bit(tracking_evsel, TIME); /* And the CPU for switch events */ evsel__set_sample_bit(tracking_evsel, CPU); @@ -886,7 +886,7 @@ static int intel_pt_recording_options(struct auxtrace_record *itr, * Warn the user when we do not have enough information to decode i.e. * per-cpu with no sched_switch (except workload-only). */ - if (!ptr->have_sched_switch && !perf_cpu_map__has_any_cpu_or_is_empty(cpus) && + if (!ptr->have_sched_switch && !perf_cpu_map__is_any_cpu_or_is_empty(cpus) && !target__none(&opts->target) && !intel_pt_evsel->core.attr.exclude_user) ui__warning("Intel Processor Trace decoding will not be possible except for kernel tracing!\n"); From 704fbf4f4504bfe81694b400cd50c366f2b5bca7 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:54 -0800 Subject: [PATCH 249/548] perf cpumap: Clean up use of perf_cpu_map__has_any_cpu_or_is_empty MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Most uses of what was perf_cpu_map__empty but is now perf_cpu_map__has_any_cpu_or_is_empty want to do something with the CPU map if it contains CPUs. Replace uses of perf_cpu_map__has_any_cpu_or_is_empty with other helpers so that CPUs within the map can be handled. Reviewed-by: James Clark Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 3e5deb708c8f3f7d645f567b2ba38d8045fa11ba) Signed-off-by: Koba Ko --- tools/perf/builtin-c2c.c | 6 +----- tools/perf/builtin-stat.c | 9 ++++----- tools/perf/util/auxtrace.c | 4 ++-- tools/perf/util/record.c | 2 +- tools/perf/util/stat.c | 2 +- 5 files changed, 9 insertions(+), 14 deletions(-) diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c index f78eea9e21539..ef7ed53a4b4e2 100644 --- a/tools/perf/builtin-c2c.c +++ b/tools/perf/builtin-c2c.c @@ -2319,11 +2319,7 @@ static int setup_nodes(struct perf_session *session) nodes[node] = set; - /* empty node, skip */ - if (perf_cpu_map__has_any_cpu_or_is_empty(map)) - continue; - - perf_cpu_map__for_each_cpu(cpu, idx, map) { + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, map) { __set_bit(cpu.cpu, set); if (WARN_ONCE(cpu2node[cpu.cpu] != -1, "node/cpu topology bug")) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 045198106fc58..3bdee1cec3ebb 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -1338,10 +1338,9 @@ static int cpu__get_cache_id_from_map(struct perf_cpu cpu, char *map) * be the first online CPU in the cache domain else use the * first online CPU of the cache domain as the ID. */ - if (perf_cpu_map__has_any_cpu_or_is_empty(cpu_map)) + id = perf_cpu_map__min(cpu_map).cpu; + if (id == -1) id = cpu.cpu; - else - id = perf_cpu_map__cpu(cpu_map, 0).cpu; /* Free the perf_cpu_map used to find the cache ID */ perf_cpu_map__put(cpu_map); @@ -1644,7 +1643,7 @@ static int perf_stat_init_aggr_mode(void) * taking the highest cpu number to be the size of * the aggregation translate cpumap. */ - if (!perf_cpu_map__has_any_cpu_or_is_empty(evsel_list->core.user_requested_cpus)) + if (!perf_cpu_map__is_any_cpu_or_is_empty(evsel_list->core.user_requested_cpus)) nr = perf_cpu_map__max(evsel_list->core.user_requested_cpus).cpu; else nr = 0; @@ -2311,7 +2310,7 @@ int process_stat_config_event(struct perf_session *session, perf_event__read_stat_config(&stat_config, &event->stat_config); - if (perf_cpu_map__has_any_cpu_or_is_empty(st->cpus)) { + if (perf_cpu_map__is_empty(st->cpus)) { if (st->aggr_mode != AGGR_UNSET) pr_warning("warning: processing task data, aggregation mode not set\n"); } else if (st->aggr_mode != AGGR_UNSET) { diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index f572de2e83467..2edae23782d79 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -174,7 +174,7 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp, struct evlist *evlist, struct evsel *evsel, int idx) { - bool per_cpu = !perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus); + bool per_cpu = !perf_cpu_map__has_any_cpu(evlist->core.user_requested_cpus); mp->mmap_needed = evsel->needs_auxtrace_mmap; @@ -648,7 +648,7 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr, static int evlist__enable_event_idx(struct evlist *evlist, struct evsel *evsel, int idx) { - bool per_cpu_mmaps = !perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus); + bool per_cpu_mmaps = !perf_cpu_map__has_any_cpu(evlist->core.user_requested_cpus); if (per_cpu_mmaps) { struct perf_cpu evlist_cpu = perf_cpu_map__cpu(evlist->core.all_cpus, idx); diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c index 87e817b3cf7e9..e867de8ddaaa4 100644 --- a/tools/perf/util/record.c +++ b/tools/perf/util/record.c @@ -237,7 +237,7 @@ bool evlist__can_select_event(struct evlist *evlist, const char *str) evsel = evlist__last(temp_evlist); - if (!evlist || perf_cpu_map__has_any_cpu_or_is_empty(evlist->core.user_requested_cpus)) { + if (!evlist || perf_cpu_map__is_any_cpu_or_is_empty(evlist->core.user_requested_cpus)) { struct perf_cpu_map *cpus = perf_cpu_map__new_online_cpus(); if (cpus) diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c index 012c4946b9c49..915808a6211af 100644 --- a/tools/perf/util/stat.c +++ b/tools/perf/util/stat.c @@ -315,7 +315,7 @@ static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals, if (!counter->per_pkg) return 0; - if (perf_cpu_map__has_any_cpu_or_is_empty(cpus)) + if (perf_cpu_map__is_any_cpu_or_is_empty(cpus)) return 0; if (!mask) { From 85bb87402579efb916f5474efef857b3baac0f0c Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:02:07 -0800 Subject: [PATCH 250/548] perf top: Avoid repeated function calls to perf_cpu_map__nr(). MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a local variable to avoid repeated calls to perf_cpu_map__nr(). Reviewed-by: James Clark Signed-off-by: Adrian Hunter Signed-off-by: Ian Rogers Acked-by: Adrian Hunter Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20231129060211.1890454-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 3e0594f9f0f774918d63701dc7de634bb1bc7b1e) Signed-off-by: Koba Ko --- tools/perf/util/top.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/top.c b/tools/perf/util/top.c index be7157de04518..4db3d1bd686cf 100644 --- a/tools/perf/util/top.c +++ b/tools/perf/util/top.c @@ -28,6 +28,7 @@ size_t perf_top__header_snprintf(struct perf_top *top, char *bf, size_t size) struct record_opts *opts = &top->record_opts; struct target *target = &opts->target; size_t ret = 0; + int nr_cpus; if (top->samples) { samples_per_sec = top->samples / top->delay_secs; @@ -93,19 +94,17 @@ size_t perf_top__header_snprintf(struct perf_top *top, char *bf, size_t size) else ret += SNPRINTF(bf + ret, size - ret, " (all"); + nr_cpus = perf_cpu_map__nr(top->evlist->core.user_requested_cpus); if (target->cpu_list) ret += SNPRINTF(bf + ret, size - ret, ", CPU%s: %s)", - perf_cpu_map__nr(top->evlist->core.user_requested_cpus) > 1 - ? "s" : "", + nr_cpus > 1 ? "s" : "", target->cpu_list); else { if (target->tid) ret += SNPRINTF(bf + ret, size - ret, ")"); else ret += SNPRINTF(bf + ret, size - ret, ", %d CPU%s)", - perf_cpu_map__nr(top->evlist->core.user_requested_cpus), - perf_cpu_map__nr(top->evlist->core.user_requested_cpus) > 1 - ? "s" : ""); + nr_cpus, nr_cpus > 1 ? "s" : ""); } perf_top__reset_sample_counters(top); From 52bce623f38fbfd688e0f6f11f847ddd563dadfe Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:55 -0800 Subject: [PATCH 251/548] perf arm64 header: Remove unnecessary CPU map get and put MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In both cases the CPU map is known owned by either the caller or a PMU. Reviewed-by: James Clark Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 4ddccd0048089eb324330fa3a0036e41c31355d3) Signed-off-by: Koba Ko --- tools/perf/arch/arm64/util/header.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c index 97037499152ef..a9de0b5187dd4 100644 --- a/tools/perf/arch/arm64/util/header.c +++ b/tools/perf/arch/arm64/util/header.c @@ -25,8 +25,6 @@ static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map *cpus) if (!sysfs || sz < MIDR_SIZE) return EINVAL; - cpus = perf_cpu_map__get(cpus); - for (cpu = 0; cpu < perf_cpu_map__nr(cpus); cpu++) { char path[PATH_MAX]; FILE *file; @@ -51,7 +49,6 @@ static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map *cpus) break; } - perf_cpu_map__put(cpus); return ret; } From 1733cf26279175a4aac7179738b403200e2f4403 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:56 -0800 Subject: [PATCH 252/548] perf stat: Remove duplicate cpus_map_matched function MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Use libperf's perf_cpu_map__equal() that performs the same function. Reviewed-by: James Clark Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 954ac1b4a79a06736fd85a183f34c18c152286c6) Signed-off-by: Koba Ko --- tools/perf/builtin-stat.c | 22 +--------------------- 1 file changed, 1 insertion(+), 21 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 3bdee1cec3ebb..e8cca0531a341 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -164,26 +164,6 @@ static struct perf_stat_config stat_config = { .iostat_run = false, }; -static bool cpus_map_matched(struct evsel *a, struct evsel *b) -{ - if (!a->core.cpus && !b->core.cpus) - return true; - - if (!a->core.cpus || !b->core.cpus) - return false; - - if (perf_cpu_map__nr(a->core.cpus) != perf_cpu_map__nr(b->core.cpus)) - return false; - - for (int i = 0; i < perf_cpu_map__nr(a->core.cpus); i++) { - if (perf_cpu_map__cpu(a->core.cpus, i).cpu != - perf_cpu_map__cpu(b->core.cpus, i).cpu) - return false; - } - - return true; -} - static void evlist__check_cpu_maps(struct evlist *evlist) { struct evsel *evsel, *warned_leader = NULL; @@ -194,7 +174,7 @@ static void evlist__check_cpu_maps(struct evlist *evlist) /* Check that leader matches cpus with each member. */ if (leader == evsel) continue; - if (cpus_map_matched(leader, evsel)) + if (perf_cpu_map__equal(leader->core.cpus, evsel->core.cpus)) continue; /* If there's mismatch disable the group and warn user. */ From 49d5886fa200ce1a2d993fbe08d76b2c1575f71a Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Fri, 2 Feb 2024 15:40:57 -0800 Subject: [PATCH 253/548] perf cpumap: Use perf_cpu_map__for_each_cpu when possible MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rather than manually iterating the CPU map, use perf_cpu_map__for_each_cpu(). When possible tidy local variables. Reviewed-by: James Clark Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20240202234057.2085863-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 71bc3ac8e8c93f769d1a2040153fe6d6b8093fa7) Signed-off-by: Koba Ko --- tools/perf/arch/arm64/util/header.c | 10 +- tools/perf/tests/bitmap.c | 13 +- tools/perf/tests/topology.c | 46 +-- tools/perf/util/bpf_kwork.c | 16 +- tools/perf/util/bpf_kwork_top.c | 308 ++++++++++++++++++ tools/perf/util/cpumap.c | 12 +- .../scripting-engines/trace-event-python.c | 12 +- tools/perf/util/session.c | 5 +- tools/perf/util/svghelper.c | 20 +- 9 files changed, 374 insertions(+), 68 deletions(-) create mode 100644 tools/perf/util/bpf_kwork_top.c diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c index a9de0b5187dd4..741df3614a09a 100644 --- a/tools/perf/arch/arm64/util/header.c +++ b/tools/perf/arch/arm64/util/header.c @@ -4,8 +4,6 @@ #include #include #include -#include -#include #include #include #include "debug.h" @@ -19,18 +17,18 @@ static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map *cpus) { const char *sysfs = sysfs__mountpoint(); - int cpu; - int ret = EINVAL; + struct perf_cpu cpu; + int idx, ret = EINVAL; if (!sysfs || sz < MIDR_SIZE) return EINVAL; - for (cpu = 0; cpu < perf_cpu_map__nr(cpus); cpu++) { + perf_cpu_map__for_each_cpu(cpu, idx, cpus) { char path[PATH_MAX]; FILE *file; scnprintf(path, PATH_MAX, "%s/devices/system/cpu/cpu%d" MIDR, - sysfs, RC_CHK_ACCESS(cpus)->map[cpu].cpu); + sysfs, cpu.cpu); file = fopen(path, "r"); if (!file) { diff --git a/tools/perf/tests/bitmap.c b/tools/perf/tests/bitmap.c index 0173f5402a35b..98956e0e07655 100644 --- a/tools/perf/tests/bitmap.c +++ b/tools/perf/tests/bitmap.c @@ -11,18 +11,19 @@ static unsigned long *get_bitmap(const char *str, int nbits) { struct perf_cpu_map *map = perf_cpu_map__new(str); - unsigned long *bm = NULL; - int i; + unsigned long *bm; bm = bitmap_zalloc(nbits); if (map && bm) { - for (i = 0; i < perf_cpu_map__nr(map); i++) - __set_bit(perf_cpu_map__cpu(map, i).cpu, bm); + int i; + struct perf_cpu cpu; + + perf_cpu_map__for_each_cpu(cpu, i, map) + __set_bit(cpu.cpu, bm); } - if (map) - perf_cpu_map__put(map); + perf_cpu_map__put(map); return bm; } diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c index 2a842f53fbb57..a8cb5ba898ab3 100644 --- a/tools/perf/tests/topology.c +++ b/tools/perf/tests/topology.c @@ -68,6 +68,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map) }; int i; struct aggr_cpu_id id; + struct perf_cpu cpu; session = perf_session__new(&data, NULL); TEST_ASSERT_VAL("can't get session", !IS_ERR(session)); @@ -113,8 +114,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map) TEST_ASSERT_VAL("Session header CPU map not set", session->header.env.cpu); for (i = 0; i < session->header.env.nr_cpus_avail; i++) { - struct perf_cpu cpu = { .cpu = i }; - + cpu.cpu = i; if (!perf_cpu_map__has(map, cpu)) continue; pr_debug("CPU %d, core %d, socket %d\n", i, @@ -123,48 +123,48 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map) } // Test that CPU ID contains socket, die, core and CPU - for (i = 0; i < perf_cpu_map__nr(map); i++) { - id = aggr_cpu_id__cpu(perf_cpu_map__cpu(map, i), NULL); + perf_cpu_map__for_each_cpu(cpu, i, map) { + id = aggr_cpu_id__cpu(cpu, NULL); TEST_ASSERT_VAL("Cpu map - CPU ID doesn't match", - perf_cpu_map__cpu(map, i).cpu == id.cpu.cpu); + cpu.cpu == id.cpu.cpu); TEST_ASSERT_VAL("Cpu map - Core ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].core_id == id.core); + session->header.env.cpu[cpu.cpu].core_id == id.core); TEST_ASSERT_VAL("Cpu map - Socket ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id == + session->header.env.cpu[cpu.cpu].socket_id == id.socket); TEST_ASSERT_VAL("Cpu map - Die ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].die_id == id.die); + session->header.env.cpu[cpu.cpu].die_id == id.die); TEST_ASSERT_VAL("Cpu map - Node ID is set", id.node == -1); TEST_ASSERT_VAL("Cpu map - Thread IDX is set", id.thread_idx == -1); } // Test that core ID contains socket, die and core - for (i = 0; i < perf_cpu_map__nr(map); i++) { - id = aggr_cpu_id__core(perf_cpu_map__cpu(map, i), NULL); + perf_cpu_map__for_each_cpu(cpu, i, map) { + id = aggr_cpu_id__core(cpu, NULL); TEST_ASSERT_VAL("Core map - Core ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].core_id == id.core); + session->header.env.cpu[cpu.cpu].core_id == id.core); TEST_ASSERT_VAL("Core map - Socket ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id == + session->header.env.cpu[cpu.cpu].socket_id == id.socket); TEST_ASSERT_VAL("Core map - Die ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].die_id == id.die); + session->header.env.cpu[cpu.cpu].die_id == id.die); TEST_ASSERT_VAL("Core map - Node ID is set", id.node == -1); TEST_ASSERT_VAL("Core map - Thread IDX is set", id.thread_idx == -1); } // Test that die ID contains socket and die - for (i = 0; i < perf_cpu_map__nr(map); i++) { - id = aggr_cpu_id__die(perf_cpu_map__cpu(map, i), NULL); + perf_cpu_map__for_each_cpu(cpu, i, map) { + id = aggr_cpu_id__die(cpu, NULL); TEST_ASSERT_VAL("Die map - Socket ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id == + session->header.env.cpu[cpu.cpu].socket_id == id.socket); TEST_ASSERT_VAL("Die map - Die ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].die_id == id.die); + session->header.env.cpu[cpu.cpu].die_id == id.die); TEST_ASSERT_VAL("Die map - Node ID is set", id.node == -1); TEST_ASSERT_VAL("Die map - Core is set", id.core == -1); @@ -173,10 +173,10 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map) } // Test that socket ID contains only socket - for (i = 0; i < perf_cpu_map__nr(map); i++) { - id = aggr_cpu_id__socket(perf_cpu_map__cpu(map, i), NULL); + perf_cpu_map__for_each_cpu(cpu, i, map) { + id = aggr_cpu_id__socket(cpu, NULL); TEST_ASSERT_VAL("Socket map - Socket ID doesn't match", - session->header.env.cpu[perf_cpu_map__cpu(map, i).cpu].socket_id == + session->header.env.cpu[cpu.cpu].socket_id == id.socket); TEST_ASSERT_VAL("Socket map - Node ID is set", id.node == -1); @@ -187,10 +187,10 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map) } // Test that node ID contains only node - for (i = 0; i < perf_cpu_map__nr(map); i++) { - id = aggr_cpu_id__node(perf_cpu_map__cpu(map, i), NULL); + perf_cpu_map__for_each_cpu(cpu, i, map) { + id = aggr_cpu_id__node(cpu, NULL); TEST_ASSERT_VAL("Node map - Node ID doesn't match", - cpu__get_node(perf_cpu_map__cpu(map, i)) == id.node); + cpu__get_node(cpu) == id.node); TEST_ASSERT_VAL("Node map - Socket is set", id.socket == -1); TEST_ASSERT_VAL("Node map - Die ID is set", id.die == -1); TEST_ASSERT_VAL("Node map - Core is set", id.core == -1); diff --git a/tools/perf/util/bpf_kwork.c b/tools/perf/util/bpf_kwork.c index 6eb2c78fd7f4a..44f0f708a15d6 100644 --- a/tools/perf/util/bpf_kwork.c +++ b/tools/perf/util/bpf_kwork.c @@ -147,12 +147,12 @@ static bool valid_kwork_class_type(enum kwork_class_type type) static int setup_filters(struct perf_kwork *kwork) { - u8 val = 1; - int i, nr_cpus, key, fd; - struct perf_cpu_map *map; - if (kwork->cpu_list != NULL) { - fd = bpf_map__fd(skel->maps.perf_kwork_cpu_filter); + int idx, nr_cpus; + struct perf_cpu_map *map; + struct perf_cpu cpu; + int fd = bpf_map__fd(skel->maps.perf_kwork_cpu_filter); + if (fd < 0) { pr_debug("Invalid cpu filter fd\n"); return -1; @@ -165,8 +165,8 @@ static int setup_filters(struct perf_kwork *kwork) } nr_cpus = libbpf_num_possible_cpus(); - for (i = 0; i < perf_cpu_map__nr(map); i++) { - struct perf_cpu cpu = perf_cpu_map__cpu(map, i); + perf_cpu_map__for_each_cpu(cpu, idx, map) { + u8 val = 1; if (cpu.cpu >= nr_cpus) { perf_cpu_map__put(map); @@ -181,6 +181,8 @@ static int setup_filters(struct perf_kwork *kwork) } if (kwork->profile_name != NULL) { + int key, fd; + if (strlen(kwork->profile_name) >= MAX_KWORKNAME) { pr_err("Requested name filter %s too large, limit to %d\n", kwork->profile_name, MAX_KWORKNAME - 1); diff --git a/tools/perf/util/bpf_kwork_top.c b/tools/perf/util/bpf_kwork_top.c new file mode 100644 index 0000000000000..22a3b00a1e23f --- /dev/null +++ b/tools/perf/util/bpf_kwork_top.c @@ -0,0 +1,308 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * bpf_kwork_top.c + * + * Copyright (c) 2022 Huawei Inc, Yang Jihong + */ + +#include +#include +#include +#include +#include + +#include + +#include "util/debug.h" +#include "util/evsel.h" +#include "util/kwork.h" + +#include +#include + +#include "util/bpf_skel/kwork_top.skel.h" + +/* + * This should be in sync with "util/kwork_top.bpf.c" + */ +#define MAX_COMMAND_LEN 16 + +struct time_data { + __u64 timestamp; +}; + +struct work_data { + __u64 runtime; +}; + +struct task_data { + __u32 tgid; + __u32 is_kthread; + char comm[MAX_COMMAND_LEN]; +}; + +struct work_key { + __u32 type; + __u32 pid; + __u64 task_p; +}; + +struct task_key { + __u32 pid; + __u32 cpu; +}; + +struct kwork_class_bpf { + struct kwork_class *class; + void (*load_prepare)(void); +}; + +static struct kwork_top_bpf *skel; + +void perf_kwork__top_start(void) +{ + struct timespec ts; + + clock_gettime(CLOCK_MONOTONIC, &ts); + skel->bss->from_timestamp = (u64)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec; + skel->bss->enabled = 1; + pr_debug("perf kwork top start at: %lld\n", skel->bss->from_timestamp); +} + +void perf_kwork__top_finish(void) +{ + struct timespec ts; + + skel->bss->enabled = 0; + clock_gettime(CLOCK_MONOTONIC, &ts); + skel->bss->to_timestamp = (u64)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec; + pr_debug("perf kwork top finish at: %lld\n", skel->bss->to_timestamp); +} + +static void irq_load_prepare(void) +{ + bpf_program__set_autoload(skel->progs.on_irq_handler_entry, true); + bpf_program__set_autoload(skel->progs.on_irq_handler_exit, true); +} + +static struct kwork_class_bpf kwork_irq_bpf = { + .load_prepare = irq_load_prepare, +}; + +static void softirq_load_prepare(void) +{ + bpf_program__set_autoload(skel->progs.on_softirq_entry, true); + bpf_program__set_autoload(skel->progs.on_softirq_exit, true); +} + +static struct kwork_class_bpf kwork_softirq_bpf = { + .load_prepare = softirq_load_prepare, +}; + +static void sched_load_prepare(void) +{ + bpf_program__set_autoload(skel->progs.on_switch, true); +} + +static struct kwork_class_bpf kwork_sched_bpf = { + .load_prepare = sched_load_prepare, +}; + +static struct kwork_class_bpf * +kwork_class_bpf_supported_list[KWORK_CLASS_MAX] = { + [KWORK_CLASS_IRQ] = &kwork_irq_bpf, + [KWORK_CLASS_SOFTIRQ] = &kwork_softirq_bpf, + [KWORK_CLASS_SCHED] = &kwork_sched_bpf, +}; + +static bool valid_kwork_class_type(enum kwork_class_type type) +{ + return type >= 0 && type < KWORK_CLASS_MAX; +} + +static int setup_filters(struct perf_kwork *kwork) +{ + if (kwork->cpu_list) { + int idx, nr_cpus, fd; + struct perf_cpu_map *map; + struct perf_cpu cpu; + + fd = bpf_map__fd(skel->maps.kwork_top_cpu_filter); + if (fd < 0) { + pr_debug("Invalid cpu filter fd\n"); + return -1; + } + + map = perf_cpu_map__new(kwork->cpu_list); + if (!map) { + pr_debug("Invalid cpu_list\n"); + return -1; + } + + nr_cpus = libbpf_num_possible_cpus(); + perf_cpu_map__for_each_cpu(cpu, idx, map) { + u8 val = 1; + + if (cpu.cpu >= nr_cpus) { + perf_cpu_map__put(map); + pr_err("Requested cpu %d too large\n", cpu.cpu); + return -1; + } + bpf_map_update_elem(fd, &cpu.cpu, &val, BPF_ANY); + } + perf_cpu_map__put(map); + + skel->bss->has_cpu_filter = 1; + } + + return 0; +} + +int perf_kwork__top_prepare_bpf(struct perf_kwork *kwork __maybe_unused) +{ + struct bpf_program *prog; + struct kwork_class *class; + struct kwork_class_bpf *class_bpf; + enum kwork_class_type type; + + skel = kwork_top_bpf__open(); + if (!skel) { + pr_debug("Failed to open kwork top skeleton\n"); + return -1; + } + + /* + * set all progs to non-autoload, + * then set corresponding progs according to config + */ + bpf_object__for_each_program(prog, skel->obj) + bpf_program__set_autoload(prog, false); + + list_for_each_entry(class, &kwork->class_list, list) { + type = class->type; + if (!valid_kwork_class_type(type) || + !kwork_class_bpf_supported_list[type]) { + pr_err("Unsupported bpf trace class %s\n", class->name); + goto out; + } + + class_bpf = kwork_class_bpf_supported_list[type]; + class_bpf->class = class; + + if (class_bpf->load_prepare) + class_bpf->load_prepare(); + } + + if (kwork_top_bpf__load(skel)) { + pr_debug("Failed to load kwork top skeleton\n"); + goto out; + } + + if (setup_filters(kwork)) + goto out; + + if (kwork_top_bpf__attach(skel)) { + pr_debug("Failed to attach kwork top skeleton\n"); + goto out; + } + + return 0; + +out: + kwork_top_bpf__destroy(skel); + return -1; +} + +static void read_task_info(struct kwork_work *work) +{ + int fd; + struct task_data data; + struct task_key key = { + .pid = work->id, + .cpu = work->cpu, + }; + + fd = bpf_map__fd(skel->maps.kwork_top_tasks); + if (fd < 0) { + pr_debug("Invalid top tasks map fd\n"); + return; + } + + if (!bpf_map_lookup_elem(fd, &key, &data)) { + work->tgid = data.tgid; + work->is_kthread = data.is_kthread; + work->name = strdup(data.comm); + } +} +static int add_work(struct perf_kwork *kwork, struct work_key *key, + struct work_data *data, int cpu) +{ + struct kwork_class_bpf *bpf_trace; + struct kwork_work *work; + struct kwork_work tmp = { + .id = key->pid, + .cpu = cpu, + .name = NULL, + }; + enum kwork_class_type type = key->type; + + if (!valid_kwork_class_type(type)) { + pr_debug("Invalid class type %d to add work\n", type); + return -1; + } + + bpf_trace = kwork_class_bpf_supported_list[type]; + tmp.class = bpf_trace->class; + + work = perf_kwork_add_work(kwork, tmp.class, &tmp); + if (!work) + return -1; + + work->total_runtime = data->runtime; + read_task_info(work); + + return 0; +} + +int perf_kwork__top_read_bpf(struct perf_kwork *kwork) +{ + int i, fd, nr_cpus; + struct work_data *data; + struct work_key key, prev; + + fd = bpf_map__fd(skel->maps.kwork_top_works); + if (fd < 0) { + pr_debug("Invalid top runtime fd\n"); + return -1; + } + + nr_cpus = libbpf_num_possible_cpus(); + data = calloc(nr_cpus, sizeof(struct work_data)); + if (!data) + return -1; + + memset(&prev, 0, sizeof(prev)); + while (!bpf_map_get_next_key(fd, &prev, &key)) { + if ((bpf_map_lookup_elem(fd, &key, data)) != 0) { + pr_debug("Failed to lookup top elem\n"); + return -1; + } + + for (i = 0; i < nr_cpus; i++) { + if (data[i].runtime == 0) + continue; + + if (add_work(kwork, &key, &data[i], i)) + return -1; + } + prev = key; + } + free(data); + + return 0; +} + +void perf_kwork__top_cleanup_bpf(void) +{ + kwork_top_bpf__destroy(skel); +} diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c index 0581ee0fa5f27..e2287187babd9 100644 --- a/tools/perf/util/cpumap.c +++ b/tools/perf/util/cpumap.c @@ -629,10 +629,10 @@ static char hex_char(unsigned char val) size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size) { - int i, cpu; + int idx; char *ptr = buf; unsigned char *bitmap; - struct perf_cpu last_cpu = perf_cpu_map__cpu(map, perf_cpu_map__nr(map) - 1); + struct perf_cpu c, last_cpu = perf_cpu_map__max(map); if (buf == NULL) return 0; @@ -643,12 +643,10 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size) return 0; } - for (i = 0; i < perf_cpu_map__nr(map); i++) { - cpu = perf_cpu_map__cpu(map, i).cpu; - bitmap[cpu / 8] |= 1 << (cpu % 8); - } + perf_cpu_map__for_each_cpu(c, idx, map) + bitmap[c.cpu / 8] |= 1 << (c.cpu % 8); - for (cpu = last_cpu.cpu / 4 * 4; cpu >= 0; cpu -= 4) { + for (int cpu = last_cpu.cpu / 4 * 4; cpu >= 0; cpu -= 4) { unsigned char bits = bitmap[cpu / 8]; if (cpu % 8) diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c index 94312741443ab..1fdab04d5b83c 100644 --- a/tools/perf/util/scripting-engines/trace-event-python.c +++ b/tools/perf/util/scripting-engines/trace-event-python.c @@ -1686,13 +1686,15 @@ static void python_process_stat(struct perf_stat_config *config, { struct perf_thread_map *threads = counter->core.threads; struct perf_cpu_map *cpus = counter->core.cpus; - int cpu, thread; - for (thread = 0; thread < perf_thread_map__nr(threads); thread++) { - for (cpu = 0; cpu < perf_cpu_map__nr(cpus); cpu++) { - process_stat(counter, perf_cpu_map__cpu(cpus, cpu), + for (int thread = 0; thread < perf_thread_map__nr(threads); thread++) { + int idx; + struct perf_cpu cpu; + + perf_cpu_map__for_each_cpu(cpu, idx, cpus) { + process_stat(counter, cpu, perf_thread_map__pid(threads, thread), tstamp, - perf_counts(counter->counts, cpu, thread)); + perf_counts(counter->counts, idx, thread)); } } } diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 277b2cbd51861..e1edcc64f0aea 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -2730,6 +2730,7 @@ int perf_session__cpu_bitmap(struct perf_session *session, int i, err = -1; struct perf_cpu_map *map; int nr_cpus = min(session->header.env.nr_cpus_avail, MAX_NR_CPUS); + struct perf_cpu cpu; for (i = 0; i < PERF_TYPE_MAX; ++i) { struct evsel *evsel; @@ -2751,9 +2752,7 @@ int perf_session__cpu_bitmap(struct perf_session *session, return -1; } - for (i = 0; i < perf_cpu_map__nr(map); i++) { - struct perf_cpu cpu = perf_cpu_map__cpu(map, i); - + perf_cpu_map__for_each_cpu(cpu, i, map) { if (cpu.cpu >= nr_cpus) { pr_err("Requested CPU %d too large. " "Consider raising MAX_NR_CPUS\n", cpu.cpu); diff --git a/tools/perf/util/svghelper.c b/tools/perf/util/svghelper.c index 0e4dc31c6c9c0..cbd9e6cc9c0b5 100644 --- a/tools/perf/util/svghelper.c +++ b/tools/perf/util/svghelper.c @@ -725,26 +725,24 @@ static void scan_core_topology(int *map, struct topology *t, int nr_cpus) static int str_to_bitmap(char *s, cpumask_t *b, int nr_cpus) { - int i; - int ret = 0; - struct perf_cpu_map *m; - struct perf_cpu c; + int idx, ret = 0; + struct perf_cpu_map *map; + struct perf_cpu cpu; - m = perf_cpu_map__new(s); - if (!m) + map = perf_cpu_map__new(s); + if (!map) return -1; - for (i = 0; i < perf_cpu_map__nr(m); i++) { - c = perf_cpu_map__cpu(m, i); - if (c.cpu >= nr_cpus) { + perf_cpu_map__for_each_cpu(cpu, idx, map) { + if (cpu.cpu >= nr_cpus) { ret = -1; break; } - __set_bit(c.cpu, cpumask_bits(b)); + __set_bit(cpu.cpu, cpumask_bits(b)); } - perf_cpu_map__put(m); + perf_cpu_map__put(map); return ret; } From d51291ac43fe82aa5460b9143df5024e944ea4c8 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Tue, 28 Nov 2023 22:02:11 -0800 Subject: [PATCH 254/548] libperf cpumap: Document perf_cpu_map__nr()'s behavior MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit perf_cpu_map__nr()'s behavior around an empty CPU map is strange as it returns that there is 1 CPU. Changing code that may rely on this behavior is hard, we can at least document the behavior. Reviewed-by: James Clark Signed-off-by: Ian Rogers Acked-by: Adrian Hunter Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Alexandre Ghiti Cc: Andrew Jones Cc: André Almeida Cc: Athira Rajeev Cc: Atish Patra Cc: Changbin Du Cc: Darren Hart Cc: Davidlohr Bueso Cc: Huacai Chen Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: K Prateek Nayak Cc: Kajol Jain Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Nick Desaulniers Cc: Paolo Bonzini Cc: Paran Lee Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Sandipan Das Cc: Sean Christopherson Cc: Steinar H. Gunderson Cc: Suzuki Poulouse Cc: Thomas Gleixner Cc: Will Deacon Cc: Yang Jihong Cc: Yang Li Cc: Yanteng Si Link: https://lore.kernel.org/r/20231129060211.1890454-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 67bc993446d340d4f059014f6be6ead35ec5f88b) Signed-off-by: Koba Ko --- tools/lib/perf/include/perf/cpumap.h | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h index 523e4348fc96b..90457d17fb2f5 100644 --- a/tools/lib/perf/include/perf/cpumap.h +++ b/tools/lib/perf/include/perf/cpumap.h @@ -44,7 +44,18 @@ LIBPERF_API struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig, LIBPERF_API struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig, struct perf_cpu_map *other); LIBPERF_API void perf_cpu_map__put(struct perf_cpu_map *map); +/** + * perf_cpu_map__cpu - get the CPU value at the given index. Returns -1 if index + * is invalid. + */ LIBPERF_API struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx); +/** + * perf_cpu_map__nr - for an empty map returns 1, as perf_cpu_map__cpu returns a + * cpu of -1 for an invalid index, this makes an empty map + * look like it contains the "any CPU"/dummy value. Otherwise + * the result is the number CPUs in the map plus one if the + * "any CPU"/dummy value is present. + */ LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus); /** * perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value. From fb28091ac188b35508582f808e746990714ee13d Mon Sep 17 00:00:00 2001 From: James Clark Date: Wed, 1 May 2024 14:57:51 +0100 Subject: [PATCH 255/548] perf cs-etm: Use struct perf_cpu as much as possible The perf_cpu struct makes some iterators simpler and avoids some mistakes with interchanging CPU IDs with indexes etc. At the moment in this file the conversion to an integer is done somewhere in the middle of the call tree. Change it to delay the conversion to an int until the leaf functions. Some of the usage patterns are duplicated, so instead of changing them all, make cs_etm_get_ro() more reusable and use that everywhere. cs_etm_get_ro() didn't return an error before, but return one now so that it can also be used where an error is needed. Continue to ignore the error where it was already ignored. Use cs_etm_pmu_path_exists() instead of cs_etm_get_ro() in cs_etm_is_etmv4() because cs_etm_get_ro() prints a warning, but path exists is sufficient for this use case. Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240501135753.508022-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit cbaf2c4f932e08c9e3df2903cf1559a32defbc41) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 204 +++++++++++++----------------- 1 file changed, 88 insertions(+), 116 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 5d5d452230f2a..4014cba35573c 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -66,18 +66,19 @@ static const char * const metadata_ete_ro[] = { [CS_ETE_TS_SOURCE] = "ts_source", }; -static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu); -static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu); +static bool cs_etm_is_etmv4(struct auxtrace_record *itr, struct perf_cpu cpu); +static bool cs_etm_is_ete(struct auxtrace_record *itr, struct perf_cpu cpu); +static int cs_etm_get_ro(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path, __u64 *val); +static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path); -static int cs_etm_validate_context_id(struct auxtrace_record *itr, - struct evsel *evsel, int cpu) +static int cs_etm_validate_context_id(struct auxtrace_record *itr, struct evsel *evsel, + struct perf_cpu cpu) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; - char path[PATH_MAX]; int err; - u32 val; + __u64 val; u64 contextid = evsel->core.attr.config & (perf_pmu__format_bits(cs_etm_pmu, "contextid") | perf_pmu__format_bits(cs_etm_pmu, "contextid1") | @@ -94,16 +95,9 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr, } /* Get a handle on TRCIDR2 */ - snprintf(path, PATH_MAX, "cpu%d/%s", - cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR2]); - err = perf_pmu__scan_file(cs_etm_pmu, path, "%x", &val); - - /* There was a problem reading the file, bailing out */ - if (err != 1) { - pr_err("%s: can't read file %s\n", CORESIGHT_ETM_PMU_NAME, - path); + err = cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR2], &val); + if (err) return err; - } if (contextid & perf_pmu__format_bits(cs_etm_pmu, "contextid1")) { @@ -140,15 +134,14 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr, return 0; } -static int cs_etm_validate_timestamp(struct auxtrace_record *itr, - struct evsel *evsel, int cpu) +static int cs_etm_validate_timestamp(struct auxtrace_record *itr, struct evsel *evsel, + struct perf_cpu cpu) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; - char path[PATH_MAX]; int err; - u32 val; + __u64 val; if (!(evsel->core.attr.config & perf_pmu__format_bits(cs_etm_pmu, "timestamp"))) @@ -161,16 +154,9 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr, } /* Get a handle on TRCIRD0 */ - snprintf(path, PATH_MAX, "cpu%d/%s", - cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0]); - err = perf_pmu__scan_file(cs_etm_pmu, path, "%x", &val); - - /* There was a problem reading the file, bailing out */ - if (err != 1) { - pr_err("%s: can't read file %s\n", - CORESIGHT_ETM_PMU_NAME, path); + err = cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0], &val); + if (err) return err; - } /* * TRCIDR0.TSSIZE, bit [28-24], indicates whether global timestamping @@ -218,11 +204,11 @@ static int cs_etm_validate_config(struct auxtrace_record *itr, } perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { - err = cs_etm_validate_context_id(itr, evsel, cpu.cpu); + err = cs_etm_validate_context_id(itr, evsel, cpu); if (err) break; - err = cs_etm_validate_timestamp(itr, evsel, cpu.cpu); + err = cs_etm_validate_timestamp(itr, evsel, cpu); if (err) break; } @@ -549,9 +535,9 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, intersect_cpus = perf_cpu_map__new_online_cpus(); } perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { - if (cs_etm_is_ete(itr, cpu.cpu)) + if (cs_etm_is_ete(itr, cpu)) ete++; - else if (cs_etm_is_etmv4(itr, cpu.cpu)) + else if (cs_etm_is_etmv4(itr, cpu)) etmv4++; else etmv3++; @@ -564,66 +550,59 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, (etmv3 * CS_ETMV3_PRIV_SIZE)); } -static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu) +static bool cs_etm_is_etmv4(struct auxtrace_record *itr, struct perf_cpu cpu) { - bool ret = false; - char path[PATH_MAX]; - int scan; - unsigned int val; struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; /* Take any of the RO files for ETMv4 and see if it present */ - snprintf(path, PATH_MAX, "cpu%d/%s", - cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0]); - scan = perf_pmu__scan_file(cs_etm_pmu, path, "%x", &val); - - /* The file was read successfully, we have a winner */ - if (scan == 1) - ret = true; - - return ret; + return cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0]); } -static int cs_etm_get_ro(struct perf_pmu *pmu, int cpu, const char *path) +static int cs_etm_get_ro(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path, __u64 *val) { char pmu_path[PATH_MAX]; int scan; - unsigned int val = 0; /* Get RO metadata from sysfs */ - snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu, path); + snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu.cpu, path); - scan = perf_pmu__scan_file(pmu, pmu_path, "%x", &val); - if (scan != 1) + scan = perf_pmu__scan_file(pmu, pmu_path, "%llx", val); + if (scan != 1) { pr_err("%s: error reading: %s\n", __func__, pmu_path); + return -EINVAL; + } - return val; + return 0; } -static int cs_etm_get_ro_signed(struct perf_pmu *pmu, int cpu, const char *path) +static int cs_etm_get_ro_signed(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path, + __u64 *out_val) { char pmu_path[PATH_MAX]; int scan; int val = 0; /* Get RO metadata from sysfs */ - snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu, path); + snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu.cpu, path); scan = perf_pmu__scan_file(pmu, pmu_path, "%d", &val); - if (scan != 1) + if (scan != 1) { pr_err("%s: error reading: %s\n", __func__, pmu_path); + return -EINVAL; + } - return val; + *out_val = (__u64) val; + return 0; } -static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, int cpu, const char *path) +static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path) { char pmu_path[PATH_MAX]; /* Get RO metadata from sysfs */ - snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu, path); + snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu.cpu, path); return perf_pmu__file_exists(pmu, pmu_path); } @@ -636,16 +615,16 @@ static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, int cpu, const char *pa #define TRCDEVARCH_ARCHVER_MASK GENMASK(15, 12) #define TRCDEVARCH_ARCHVER(x) (((x) & TRCDEVARCH_ARCHVER_MASK) >> TRCDEVARCH_ARCHVER_SHIFT) -static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu) +static bool cs_etm_is_ete(struct auxtrace_record *itr, struct perf_cpu cpu) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; - int trcdevarch; + __u64 trcdevarch; if (!cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH])) return false; - trcdevarch = cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH], &trcdevarch); /* * ETE if ARCHVER is 5 (ARCHVER is 4 for ETM) and ARCHPART is 0xA13. * See ETM_DEVARCH_ETE_ARCH in coresight-etm4x.h @@ -653,7 +632,12 @@ static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu) return TRCDEVARCH_ARCHVER(trcdevarch) == 5 && TRCDEVARCH_ARCHPART(trcdevarch) == 0xA13; } -static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr, int cpu) +static __u64 cs_etm_get_legacy_trace_id(struct perf_cpu cpu) +{ + return CORESIGHT_LEGACY_CPU_TRACE_ID(cpu.cpu); +} + +static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr, struct perf_cpu cpu) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; @@ -661,33 +645,32 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr, /* Get trace configuration register */ data[CS_ETMV4_TRCCONFIGR] = cs_etmv4_get_config(itr); /* traceID set to legacy version, in case new perf running on older system */ - data[CS_ETMV4_TRCTRACEIDR] = - CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG; + data[CS_ETMV4_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | + CORESIGHT_TRACE_ID_UNUSED_FLAG; /* Get read-only information from sysFS */ - data[CS_ETMV4_TRCIDR0] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv4_ro[CS_ETMV4_TRCIDR0]); - data[CS_ETMV4_TRCIDR1] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv4_ro[CS_ETMV4_TRCIDR1]); - data[CS_ETMV4_TRCIDR2] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv4_ro[CS_ETMV4_TRCIDR2]); - data[CS_ETMV4_TRCIDR8] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv4_ro[CS_ETMV4_TRCIDR8]); - data[CS_ETMV4_TRCAUTHSTATUS] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv4_ro[CS_ETMV4_TRCAUTHSTATUS]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0], + &data[CS_ETMV4_TRCIDR0]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR1], + &data[CS_ETMV4_TRCIDR1]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR2], + &data[CS_ETMV4_TRCIDR2]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR8], + &data[CS_ETMV4_TRCIDR8]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCAUTHSTATUS], + &data[CS_ETMV4_TRCAUTHSTATUS]); /* Kernels older than 5.19 may not expose ts_source */ - if (cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TS_SOURCE])) - data[CS_ETMV4_TS_SOURCE] = (__u64) cs_etm_get_ro_signed(cs_etm_pmu, cpu, - metadata_etmv4_ro[CS_ETMV4_TS_SOURCE]); - else { + if (!cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TS_SOURCE]) || + cs_etm_get_ro_signed(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TS_SOURCE], + &data[CS_ETMV4_TS_SOURCE])) { pr_debug3("[%03d] pmu file 'ts_source' not found. Fallback to safe value (-1)\n", - cpu); + cpu.cpu); data[CS_ETMV4_TS_SOURCE] = (__u64) -1; } } -static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, int cpu) +static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, struct perf_cpu cpu) { struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; @@ -695,36 +678,30 @@ static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, in /* Get trace configuration register */ data[CS_ETE_TRCCONFIGR] = cs_etmv4_get_config(itr); /* traceID set to legacy version, in case new perf running on older system */ - data[CS_ETE_TRCTRACEIDR] = - CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG; + data[CS_ETE_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG; /* Get read-only information from sysFS */ - data[CS_ETE_TRCIDR0] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TRCIDR0]); - data[CS_ETE_TRCIDR1] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TRCIDR1]); - data[CS_ETE_TRCIDR2] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TRCIDR2]); - data[CS_ETE_TRCIDR8] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TRCIDR8]); - data[CS_ETE_TRCAUTHSTATUS] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TRCAUTHSTATUS]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCIDR0], &data[CS_ETE_TRCIDR0]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCIDR1], &data[CS_ETE_TRCIDR1]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCIDR2], &data[CS_ETE_TRCIDR2]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCIDR8], &data[CS_ETE_TRCIDR8]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCAUTHSTATUS], + &data[CS_ETE_TRCAUTHSTATUS]); /* ETE uses the same registers as ETMv4 plus TRCDEVARCH */ - data[CS_ETE_TRCDEVARCH] = cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TRCDEVARCH]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH], + &data[CS_ETE_TRCDEVARCH]); /* Kernels older than 5.19 may not expose ts_source */ - if (cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TS_SOURCE])) - data[CS_ETE_TS_SOURCE] = (__u64) cs_etm_get_ro_signed(cs_etm_pmu, cpu, - metadata_ete_ro[CS_ETE_TS_SOURCE]); - else { + if (!cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TS_SOURCE]) || + cs_etm_get_ro_signed(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TS_SOURCE], + &data[CS_ETE_TS_SOURCE])) { pr_debug3("[%03d] pmu file 'ts_source' not found. Fallback to safe value (-1)\n", - cpu); + cpu.cpu); data[CS_ETE_TS_SOURCE] = (__u64) -1; } } -static void cs_etm_get_metadata(int cpu, u32 *offset, +static void cs_etm_get_metadata(struct perf_cpu cpu, u32 *offset, struct auxtrace_record *itr, struct perf_record_auxtrace_info *info) { @@ -754,15 +731,13 @@ static void cs_etm_get_metadata(int cpu, u32 *offset, /* Get configuration register */ info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr); /* traceID set to legacy value in case new perf running on old system */ - info->priv[*offset + CS_ETM_ETMTRACEIDR] = - CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG; + info->priv[*offset + CS_ETM_ETMTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | + CORESIGHT_TRACE_ID_UNUSED_FLAG; /* Get read-only information from sysFS */ - info->priv[*offset + CS_ETM_ETMCCER] = - cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv3_ro[CS_ETM_ETMCCER]); - info->priv[*offset + CS_ETM_ETMIDR] = - cs_etm_get_ro(cs_etm_pmu, cpu, - metadata_etmv3_ro[CS_ETM_ETMIDR]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv3_ro[CS_ETM_ETMCCER], + &info->priv[*offset + CS_ETM_ETMCCER]); + cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv3_ro[CS_ETM_ETMIDR], + &info->priv[*offset + CS_ETM_ETMIDR]); /* How much space was used */ increment = CS_ETM_PRIV_MAX; @@ -771,7 +746,7 @@ static void cs_etm_get_metadata(int cpu, u32 *offset, /* Build generic header portion */ info->priv[*offset + CS_ETM_MAGIC] = magic; - info->priv[*offset + CS_ETM_CPU] = cpu; + info->priv[*offset + CS_ETM_CPU] = cpu.cpu; info->priv[*offset + CS_ETM_NR_TRC_PARAMS] = nr_trc_params; /* Where the next CPU entry should start from */ *offset += increment; @@ -791,6 +766,7 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; + struct perf_cpu cpu; if (priv_size != cs_etm_info_priv_size(itr, session->evlist)) return -EINVAL; @@ -803,8 +779,6 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, cpu_map = online_cpus; } else { /* Make sure all specified CPUs are online */ - struct perf_cpu cpu; - perf_cpu_map__for_each_cpu(cpu, i, event_cpus) { if (!perf_cpu_map__has(online_cpus, cpu)) return -EINVAL; @@ -826,11 +800,9 @@ static int cs_etm_info_fill(struct auxtrace_record *itr, offset = CS_ETM_SNAPSHOT + 1; - for (i = 0; i < cpu__max_cpu().cpu && offset < priv_size; i++) { - struct perf_cpu cpu = { .cpu = i, }; - - if (perf_cpu_map__has(cpu_map, cpu)) - cs_etm_get_metadata(i, &offset, itr, info); + perf_cpu_map__for_each_cpu(cpu, i, cpu_map) { + assert(offset < priv_size); + cs_etm_get_metadata(cpu, &offset, itr, info); } perf_cpu_map__put(online_cpus); From 3626ca9af83cec1ded872a6ec04fb98b66a3a5c5 Mon Sep 17 00:00:00 2001 From: James Clark Date: Wed, 1 May 2024 14:57:52 +0100 Subject: [PATCH 256/548] perf cs-etm: Remove repeated fetches of the ETM PMU Most functions already have cs_etm_pmu, so it's a bit neater to pass it through rather than itr only to convert it again. Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240501135753.508022-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit bc5e0e1b93565e3760ffb407db836c9f50d4ffae) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 60 ++++++++++++++----------------- 1 file changed, 27 insertions(+), 33 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 4014cba35573c..e012c597f3404 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -66,17 +66,14 @@ static const char * const metadata_ete_ro[] = { [CS_ETE_TS_SOURCE] = "ts_source", }; -static bool cs_etm_is_etmv4(struct auxtrace_record *itr, struct perf_cpu cpu); -static bool cs_etm_is_ete(struct auxtrace_record *itr, struct perf_cpu cpu); +static bool cs_etm_is_etmv4(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu); +static bool cs_etm_is_ete(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu); static int cs_etm_get_ro(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path, __u64 *val); static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path); -static int cs_etm_validate_context_id(struct auxtrace_record *itr, struct evsel *evsel, +static int cs_etm_validate_context_id(struct perf_pmu *cs_etm_pmu, struct evsel *evsel, struct perf_cpu cpu) { - struct cs_etm_recording *ptr = - container_of(itr, struct cs_etm_recording, itr); - struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; int err; __u64 val; u64 contextid = evsel->core.attr.config & @@ -88,7 +85,7 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr, struct evsel return 0; /* Not supported in etmv3 */ - if (!cs_etm_is_etmv4(itr, cpu)) { + if (!cs_etm_is_etmv4(cs_etm_pmu, cpu)) { pr_err("%s: contextid not supported in ETMv3, disable with %s/contextid=0/\n", CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME); return -EINVAL; @@ -134,12 +131,9 @@ static int cs_etm_validate_context_id(struct auxtrace_record *itr, struct evsel return 0; } -static int cs_etm_validate_timestamp(struct auxtrace_record *itr, struct evsel *evsel, +static int cs_etm_validate_timestamp(struct perf_pmu *cs_etm_pmu, struct evsel *evsel, struct perf_cpu cpu) { - struct cs_etm_recording *ptr = - container_of(itr, struct cs_etm_recording, itr); - struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; int err; __u64 val; @@ -147,7 +141,7 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr, struct evsel * perf_pmu__format_bits(cs_etm_pmu, "timestamp"))) return 0; - if (!cs_etm_is_etmv4(itr, cpu)) { + if (!cs_etm_is_etmv4(cs_etm_pmu, cpu)) { pr_err("%s: timestamp not supported in ETMv3, disable with %s/timestamp=0/\n", CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME); return -EINVAL; @@ -173,6 +167,13 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr, struct evsel * return 0; } +static struct perf_pmu *cs_etm_get_pmu(struct auxtrace_record *itr) +{ + struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); + + return ptr->cs_etm_pmu; +} + /* * Check whether the requested timestamp and contextid options should be * available on all requested CPUs and if not, tell the user how to override. @@ -180,7 +181,7 @@ static int cs_etm_validate_timestamp(struct auxtrace_record *itr, struct evsel * * first is better. In theory the kernel could still disable the option for * some other reason so this is best effort only. */ -static int cs_etm_validate_config(struct auxtrace_record *itr, +static int cs_etm_validate_config(struct perf_pmu *cs_etm_pmu, struct evsel *evsel) { int idx, err = 0; @@ -204,11 +205,11 @@ static int cs_etm_validate_config(struct auxtrace_record *itr, } perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { - err = cs_etm_validate_context_id(itr, evsel, cpu); + err = cs_etm_validate_context_id(cs_etm_pmu, evsel, cpu); if (err) break; - err = cs_etm_validate_timestamp(itr, evsel, cpu); + err = cs_etm_validate_timestamp(cs_etm_pmu, evsel, cpu); if (err) break; } @@ -449,7 +450,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, if (!perf_cpu_map__is_any_cpu_or_is_empty(cpus)) evsel__set_sample_bit(evsel, TIME); - err = cs_etm_validate_config(itr, cs_etm_evsel); + err = cs_etm_validate_config(cs_etm_pmu, cs_etm_evsel); out: return err; } @@ -515,14 +516,15 @@ static u64 cs_etmv4_get_config(struct auxtrace_record *itr) } static size_t -cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, - struct evlist *evlist __maybe_unused) +cs_etm_info_priv_size(struct auxtrace_record *itr, + struct evlist *evlist) { int idx; int etmv3 = 0, etmv4 = 0, ete = 0; struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus; struct perf_cpu_map *intersect_cpus; struct perf_cpu cpu; + struct perf_pmu *cs_etm_pmu = cs_etm_get_pmu(itr); if (!perf_cpu_map__has_any_cpu(event_cpus)) { /* cpu map is not "any" CPU , we have specific CPUs to work with */ @@ -535,9 +537,9 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, intersect_cpus = perf_cpu_map__new_online_cpus(); } perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { - if (cs_etm_is_ete(itr, cpu)) + if (cs_etm_is_ete(cs_etm_pmu, cpu)) ete++; - else if (cs_etm_is_etmv4(itr, cpu)) + else if (cs_etm_is_etmv4(cs_etm_pmu, cpu)) etmv4++; else etmv3++; @@ -550,12 +552,8 @@ cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused, (etmv3 * CS_ETMV3_PRIV_SIZE)); } -static bool cs_etm_is_etmv4(struct auxtrace_record *itr, struct perf_cpu cpu) +static bool cs_etm_is_etmv4(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu) { - struct cs_etm_recording *ptr = - container_of(itr, struct cs_etm_recording, itr); - struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; - /* Take any of the RO files for ETMv4 and see if it present */ return cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0]); } @@ -615,10 +613,8 @@ static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, struct perf_cpu cpu, co #define TRCDEVARCH_ARCHVER_MASK GENMASK(15, 12) #define TRCDEVARCH_ARCHVER(x) (((x) & TRCDEVARCH_ARCHVER_MASK) >> TRCDEVARCH_ARCHVER_SHIFT) -static bool cs_etm_is_ete(struct auxtrace_record *itr, struct perf_cpu cpu) +static bool cs_etm_is_ete(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu) { - struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr); - struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; __u64 trcdevarch; if (!cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH])) @@ -707,19 +703,17 @@ static void cs_etm_get_metadata(struct perf_cpu cpu, u32 *offset, { u32 increment, nr_trc_params; u64 magic; - struct cs_etm_recording *ptr = - container_of(itr, struct cs_etm_recording, itr); - struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu; + struct perf_pmu *cs_etm_pmu = cs_etm_get_pmu(itr); /* first see what kind of tracer this cpu is affined to */ - if (cs_etm_is_ete(itr, cpu)) { + if (cs_etm_is_ete(cs_etm_pmu, cpu)) { magic = __perf_cs_ete_magic; cs_etm_save_ete_header(&info->priv[*offset], itr, cpu); /* How much space was used */ increment = CS_ETE_PRIV_MAX; nr_trc_params = CS_ETE_PRIV_MAX - CS_ETM_COMMON_BLK_MAX_V1; - } else if (cs_etm_is_etmv4(itr, cpu)) { + } else if (cs_etm_is_etmv4(cs_etm_pmu, cpu)) { magic = __perf_cs_etmv4_magic; cs_etm_save_etmv4_header(&info->priv[*offset], itr, cpu); From 68d04747e904676d2ac29e31e4a6bc7679514d47 Mon Sep 17 00:00:00 2001 From: James Clark Date: Wed, 1 May 2024 14:57:53 +0100 Subject: [PATCH 257/548] perf cs-etm: Improve version detection and error reporting When the config validation functions are warning about ETMv3, they do it based on "not ETMv4". If the drivers aren't all loaded or the hardware doesn't support Coresight it will appear as "not ETMv4" and then Perf will print the error message "... not supported in ETMv3 ..." which is wrong and confusing. cs_etm_is_etmv4() is also misnamed because it also returns true for ETE because ETE has a superset of the ETMv4 metadata files. Although this was always done in the correct order so it wasn't a bug. Improve all this by making a single get version function which also handles not present as a separate case. Change the ETMv3 error message to only print when ETMv3 is detected, and add a new error message for the not present case. Reviewed-by: Ian Rogers Reviewed-by: Leo Yan Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Mark Rutland Cc: Mike Leach Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240501135753.508022-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit e3123079b906dc2edb80466de85b2c333a56fbf2) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 61 ++++++++++++++++++++++--------- 1 file changed, 43 insertions(+), 18 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index e012c597f3404..28d6935f9fc9f 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -66,11 +66,25 @@ static const char * const metadata_ete_ro[] = { [CS_ETE_TS_SOURCE] = "ts_source", }; -static bool cs_etm_is_etmv4(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu); +enum cs_etm_version { CS_NOT_PRESENT, CS_ETMV3, CS_ETMV4, CS_ETE }; + static bool cs_etm_is_ete(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu); static int cs_etm_get_ro(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path, __u64 *val); static bool cs_etm_pmu_path_exists(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path); +static enum cs_etm_version cs_etm_get_version(struct perf_pmu *cs_etm_pmu, + struct perf_cpu cpu) +{ + if (cs_etm_is_ete(cs_etm_pmu, cpu)) + return CS_ETE; + else if (cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0])) + return CS_ETMV4; + else if (cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv3_ro[CS_ETM_ETMCCER])) + return CS_ETMV3; + + return CS_NOT_PRESENT; +} + static int cs_etm_validate_context_id(struct perf_pmu *cs_etm_pmu, struct evsel *evsel, struct perf_cpu cpu) { @@ -85,7 +99,7 @@ static int cs_etm_validate_context_id(struct perf_pmu *cs_etm_pmu, struct evsel return 0; /* Not supported in etmv3 */ - if (!cs_etm_is_etmv4(cs_etm_pmu, cpu)) { + if (cs_etm_get_version(cs_etm_pmu, cpu) == CS_ETMV3) { pr_err("%s: contextid not supported in ETMv3, disable with %s/contextid=0/\n", CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME); return -EINVAL; @@ -141,7 +155,7 @@ static int cs_etm_validate_timestamp(struct perf_pmu *cs_etm_pmu, struct evsel * perf_pmu__format_bits(cs_etm_pmu, "timestamp"))) return 0; - if (!cs_etm_is_etmv4(cs_etm_pmu, cpu)) { + if (cs_etm_get_version(cs_etm_pmu, cpu) == CS_ETMV3) { pr_err("%s: timestamp not supported in ETMv3, disable with %s/timestamp=0/\n", CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME); return -EINVAL; @@ -205,6 +219,11 @@ static int cs_etm_validate_config(struct perf_pmu *cs_etm_pmu, } perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { + if (cs_etm_get_version(cs_etm_pmu, cpu) == CS_NOT_PRESENT) { + pr_err("%s: Not found on CPU %d. Check hardware and firmware support and that all Coresight drivers are loaded\n", + CORESIGHT_ETM_PMU_NAME, cpu.cpu); + return -EINVAL; + } err = cs_etm_validate_context_id(cs_etm_pmu, evsel, cpu); if (err) break; @@ -536,13 +555,13 @@ cs_etm_info_priv_size(struct auxtrace_record *itr, /* Event can be "any" CPU so count all online CPUs. */ intersect_cpus = perf_cpu_map__new_online_cpus(); } + /* Count number of each type of ETM. Don't count if that CPU has CS_NOT_PRESENT. */ perf_cpu_map__for_each_cpu_skip_any(cpu, idx, intersect_cpus) { - if (cs_etm_is_ete(cs_etm_pmu, cpu)) - ete++; - else if (cs_etm_is_etmv4(cs_etm_pmu, cpu)) - etmv4++; - else - etmv3++; + enum cs_etm_version v = cs_etm_get_version(cs_etm_pmu, cpu); + + ete += v == CS_ETE; + etmv4 += v == CS_ETMV4; + etmv3 += v == CS_ETMV3; } perf_cpu_map__put(intersect_cpus); @@ -552,12 +571,6 @@ cs_etm_info_priv_size(struct auxtrace_record *itr, (etmv3 * CS_ETMV3_PRIV_SIZE)); } -static bool cs_etm_is_etmv4(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu) -{ - /* Take any of the RO files for ETMv4 and see if it present */ - return cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0]); -} - static int cs_etm_get_ro(struct perf_pmu *pmu, struct perf_cpu cpu, const char *path, __u64 *val) { char pmu_path[PATH_MAX]; @@ -706,21 +719,26 @@ static void cs_etm_get_metadata(struct perf_cpu cpu, u32 *offset, struct perf_pmu *cs_etm_pmu = cs_etm_get_pmu(itr); /* first see what kind of tracer this cpu is affined to */ - if (cs_etm_is_ete(cs_etm_pmu, cpu)) { + switch (cs_etm_get_version(cs_etm_pmu, cpu)) { + case CS_ETE: magic = __perf_cs_ete_magic; cs_etm_save_ete_header(&info->priv[*offset], itr, cpu); /* How much space was used */ increment = CS_ETE_PRIV_MAX; nr_trc_params = CS_ETE_PRIV_MAX - CS_ETM_COMMON_BLK_MAX_V1; - } else if (cs_etm_is_etmv4(cs_etm_pmu, cpu)) { + break; + + case CS_ETMV4: magic = __perf_cs_etmv4_magic; cs_etm_save_etmv4_header(&info->priv[*offset], itr, cpu); /* How much space was used */ increment = CS_ETMV4_PRIV_MAX; nr_trc_params = CS_ETMV4_PRIV_MAX - CS_ETMV4_TRCCONFIGR; - } else { + break; + + case CS_ETMV3: magic = __perf_cs_etmv3_magic; /* Get configuration register */ info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr); @@ -736,6 +754,13 @@ static void cs_etm_get_metadata(struct perf_cpu cpu, u32 *offset, /* How much space was used */ increment = CS_ETM_PRIV_MAX; nr_trc_params = CS_ETM_PRIV_MAX - CS_ETM_ETMCR; + break; + + default: + case CS_NOT_PRESENT: + /* Unreachable, CPUs already validated in cs_etm_validate_config() */ + assert(true); + return; } /* Build generic header portion */ From ac5fbf67d063d0fc51845425ef0e7a657322efb5 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:43 +0100 Subject: [PATCH 258/548] perf cs-etm: Create decoders after both AUX and HW_ID search passes Both of these passes gather information about how to create the decoders. AUX records determine formatted/unformatted, and the HW_IDs determine the traceID/metadata mappings. Therefore it makes sense to cache the information and wait until both passes are over until creating the decoders, rather than creating them at the first HW_ID found. This will allow a simplification of the creation process where cs_etm_queue->traceid_list will exclusively used to create the decoders, rather than the current two methods depending on whether the trace is formatted or not. Previously the sample CPU from the AUX record was used to initialize the decoder CPU, but actually sample CPU == AUX queue index in per-CPU mode, so saving the sample CPU isn't required. Similarly formatted/unformatted was used upfront to create the decoders, but now it's cached until later. Reviewed-by: Anshuman Khandual Reviewed-by: Mike Leach Signed-off-by: James Clark Signed-off-by: James Clark Tested-by: Ganapatrao Kulkarni Tested-by: Leo Yan Acked-by: Suzuki Poulouse Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-2-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit b6aa0de9a53a231eb068ce1e62b5e7ec9e30e627) Signed-off-by: Koba Ko --- tools/perf/util/cs-etm.c | 182 ++++++++++++++++++++++++--------------- 1 file changed, 113 insertions(+), 69 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 1fb104357eebb..9077072994b23 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -97,12 +97,19 @@ struct cs_etm_traceid_queue { struct cs_etm_packet_queue packet_queue; }; +enum cs_etm_format { + UNSET, + FORMATTED, + UNFORMATTED +}; + struct cs_etm_queue { struct cs_etm_auxtrace *etm; struct cs_etm_decoder *decoder; struct auxtrace_buffer *buffer; unsigned int queue_nr; u8 pending_timestamp_chan_id; + enum cs_etm_format format; u64 offset; const unsigned char *buf; size_t buf_len, buf_used; @@ -693,7 +700,7 @@ static void cs_etm__set_trace_param_ete(struct cs_etm_trace_params *t_params, static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params, struct cs_etm_auxtrace *etm, - bool formatted, + enum cs_etm_format format, int sample_cpu, int decoders) { @@ -702,7 +709,7 @@ static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params, u64 architecture; for (t_idx = 0; t_idx < decoders; t_idx++) { - if (formatted) + if (format == FORMATTED) m_idx = t_idx; else { m_idx = get_cpu_data_idx(etm, sample_cpu); @@ -735,8 +742,7 @@ static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params, static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params, struct cs_etm_queue *etmq, - enum cs_etm_decoder_operation mode, - bool formatted) + enum cs_etm_decoder_operation mode) { int ret = -EINVAL; @@ -746,7 +752,7 @@ static int cs_etm__init_decoder_params(struct cs_etm_decoder_params *d_params, d_params->packet_printer = cs_etm__packet_dump; d_params->operation = mode; d_params->data = etmq; - d_params->formatted = formatted; + d_params->formatted = etmq->format == FORMATTED; d_params->fsyncs = false; d_params->hsyncs = false; d_params->frame_aligned = true; @@ -1038,81 +1044,34 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u8 trace_chan_id, return ret; } -static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm, - bool formatted, int sample_cpu) +static struct cs_etm_queue *cs_etm__alloc_queue(void) { - struct cs_etm_decoder_params d_params; - struct cs_etm_trace_params *t_params = NULL; - struct cs_etm_queue *etmq; - /* - * Each queue can only contain data from one CPU when unformatted, so only one decoder is - * needed. - */ - int decoders = formatted ? etm->num_cpu : 1; - - etmq = zalloc(sizeof(*etmq)); + struct cs_etm_queue *etmq = zalloc(sizeof(*etmq)); if (!etmq) return NULL; etmq->traceid_queues_list = intlist__new(NULL); if (!etmq->traceid_queues_list) - goto out_free; - - /* Use metadata to fill in trace parameters for trace decoder */ - t_params = zalloc(sizeof(*t_params) * decoders); - - if (!t_params) - goto out_free; - - if (cs_etm__init_trace_params(t_params, etm, formatted, sample_cpu, decoders)) - goto out_free; - - /* Set decoder parameters to decode trace packets */ - if (cs_etm__init_decoder_params(&d_params, etmq, - dump_trace ? CS_ETM_OPERATION_PRINT : - CS_ETM_OPERATION_DECODE, - formatted)) - goto out_free; - - etmq->decoder = cs_etm_decoder__new(decoders, &d_params, - t_params); - - if (!etmq->decoder) - goto out_free; - - /* - * Register a function to handle all memory accesses required by - * the trace decoder library. - */ - if (cs_etm_decoder__add_mem_access_cb(etmq->decoder, - 0x0L, ((u64) -1L), - cs_etm__mem_access)) - goto out_free_decoder; + free(etmq); - zfree(&t_params); return etmq; - -out_free_decoder: - cs_etm_decoder__free(etmq->decoder); -out_free: - intlist__delete(etmq->traceid_queues_list); - free(etmq); - - return NULL; } static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, - unsigned int queue_nr, - bool formatted, - int sample_cpu) + unsigned int queue_nr, enum cs_etm_format format) { struct cs_etm_queue *etmq = queue->priv; + if (etmq && format != etmq->format) { + pr_err("CS_ETM: mixed formatted and unformatted trace not supported\n"); + return -EINVAL; + } + if (list_empty(&queue->head) || etmq) return 0; - etmq = cs_etm__alloc_queue(etm, formatted, sample_cpu); + etmq = cs_etm__alloc_queue(); if (!etmq) return -ENOMEM; @@ -1120,7 +1079,9 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, queue->priv = etmq; etmq->etm = etm; etmq->queue_nr = queue_nr; + queue->cpu = queue_nr; /* Placeholder, may be reset to -1 in per-thread mode */ etmq->offset = 0; + etmq->format = format; return 0; } @@ -2851,7 +2812,7 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session, * formatted in piped mode (true). */ err = cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], - idx, true, -1); + idx, FORMATTED); if (err) return err; @@ -2972,7 +2933,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o union perf_event auxtrace_fragment; __u64 aux_offset, aux_size; __u32 idx; - bool formatted; + enum cs_etm_format format; struct cs_etm_auxtrace *etm = container_of(session->auxtrace, struct cs_etm_auxtrace, @@ -3055,9 +3016,10 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o return err; idx = auxtrace_event->idx; - formatted = !(aux_event->flags & PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW); - return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], - idx, formatted, sample->cpu); + format = (aux_event->flags & PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW) ? + UNFORMATTED : FORMATTED; + + return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], idx, format); } /* Wasn't inside this buffer, but there were no parse errors. 1 == 'not found' */ @@ -3241,6 +3203,84 @@ static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64 **metadata) return 0; } +/* + * Use the data gathered by the peeks for HW_ID (trace ID mappings) and AUX + * (formatted or not) packets to create the decoders. + */ +static int cs_etm__create_queue_decoders(struct cs_etm_queue *etmq) +{ + struct cs_etm_decoder_params d_params; + + /* + * Each queue can only contain data from one CPU when unformatted, so only one decoder is + * needed. + */ + int decoders = etmq->format == FORMATTED ? etmq->etm->num_cpu : 1; + + /* Use metadata to fill in trace parameters for trace decoder */ + struct cs_etm_trace_params *t_params = zalloc(sizeof(*t_params) * decoders); + + if (!t_params) + goto out_free; + + if (cs_etm__init_trace_params(t_params, etmq->etm, etmq->format, + etmq->queue_nr, decoders)) + goto out_free; + + /* Set decoder parameters to decode trace packets */ + if (cs_etm__init_decoder_params(&d_params, etmq, + dump_trace ? CS_ETM_OPERATION_PRINT : + CS_ETM_OPERATION_DECODE)) + goto out_free; + + etmq->decoder = cs_etm_decoder__new(decoders, &d_params, + t_params); + + if (!etmq->decoder) + goto out_free; + + /* + * Register a function to handle all memory accesses required by + * the trace decoder library. + */ + if (cs_etm_decoder__add_mem_access_cb(etmq->decoder, + 0x0L, ((u64) -1L), + cs_etm__mem_access)) + goto out_free_decoder; + + zfree(&t_params); + return 0; + +out_free_decoder: + cs_etm_decoder__free(etmq->decoder); +out_free: + zfree(&t_params); + return -EINVAL; +} + +static int cs_etm__create_decoders(struct cs_etm_auxtrace *etm) +{ + struct auxtrace_queues *queues = &etm->queues; + + for (unsigned int i = 0; i < queues->nr_queues; i++) { + bool empty = list_empty(&queues->queue_array[i].head); + struct cs_etm_queue *etmq = queues->queue_array[i].priv; + int ret; + + /* + * Don't create decoders for empty queues, mainly because + * etmq->format is unknown for empty queues. + */ + if (empty) + continue; + + ret = cs_etm__create_queue_decoders(etmq); + if (ret) + return ret; + } + return 0; +} + int cs_etm__process_auxtrace_info_full(union perf_event *event, struct perf_session *session) { @@ -3389,6 +3429,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, if (err) goto err_free_queues; + err = cs_etm__queue_aux_records(session); + if (err) + goto err_free_queues; + /* * Map Trace ID values to CPU metadata. * @@ -3411,7 +3455,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, * flags if present. */ - /* first scan for AUX_OUTPUT_HW_ID records to map trace ID values to CPU metadata */ + /* Scan for AUX_OUTPUT_HW_ID records to map trace ID values to CPU metadata */ aux_hw_id_found = 0; err = perf_session__peek_events(session, session->header.data_offset, session->header.data_size, @@ -3429,7 +3473,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, if (err) goto err_free_queues; - err = cs_etm__queue_aux_records(session); + err = cs_etm__create_decoders(etm); if (err) goto err_free_queues; From edb994feb1788493d6c878c49d5529e70256e374 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:44 +0100 Subject: [PATCH 259/548] perf: cs-etm: Allocate queues for all CPUs Make cs_etm__setup_queue() setup a queue even if it's empty, and pre-allocate queues based on the max CPU that was recorded. In per-CPU mode aux queues are indexed based on CPU ID even if all CPUs aren't recorded, sparse queue arrays aren't used. This will allow HW_IDs to be saved even if no aux data was received in that queue without having to call cs_etm__setup_queue() from two different places. Reviewed-by: Mike Leach Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Anshuman Khandual Cc: Ganapatrao Kulkarni Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-3-james.clark@linaro.org Signed-off-by: James Clark Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 57880a7966be510cdaa08ab76b9d63e3df786bf0) Signed-off-by: Koba Ko --- tools/perf/util/cs-etm.c | 53 +++++++++++++++++++--------------------- 1 file changed, 25 insertions(+), 28 deletions(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 9077072994b23..b13a3ceb6293c 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1059,16 +1059,11 @@ static struct cs_etm_queue *cs_etm__alloc_queue(void) static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, struct auxtrace_queue *queue, - unsigned int queue_nr, enum cs_etm_format format) + unsigned int queue_nr) { struct cs_etm_queue *etmq = queue->priv; - if (etmq && format != etmq->format) { - pr_err("CS_ETM: mixed formatted and unformatted trace not supported\n"); - return -EINVAL; - } - - if (list_empty(&queue->head) || etmq) + if (etmq) return 0; etmq = cs_etm__alloc_queue(); @@ -1081,7 +1076,6 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, etmq->queue_nr = queue_nr; queue->cpu = queue_nr; /* Placeholder, may be reset to -1 in per-thread mode */ etmq->offset = 0; - etmq->format = format; return 0; } @@ -2805,17 +2799,6 @@ static int cs_etm__process_auxtrace_event(struct perf_session *session, if (err) return err; - /* - * Knowing if the trace is formatted or not requires a lookup of - * the aux record so only works in non-piped mode where data is - * queued in cs_etm__queue_aux_records(). Always assume - * formatted in piped mode (true). - */ - err = cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], - idx, FORMATTED); - if (err) - return err; - if (dump_trace) if (auxtrace_buffer__get_data(buffer, fd)) { cs_etm__dump_event(etm->queues.queue_array[idx].priv, buffer); @@ -2932,7 +2915,6 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o struct perf_record_auxtrace *auxtrace_event; union perf_event auxtrace_fragment; __u64 aux_offset, aux_size; - __u32 idx; enum cs_etm_format format; struct cs_etm_auxtrace *etm = container_of(session->auxtrace, @@ -2999,6 +2981,8 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o if (aux_offset >= auxtrace_event->offset && aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) { + struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv; + /* * If this AUX event was inside this buffer somewhere, create a new auxtrace event * based on the sizes of the aux event, and queue that fragment. @@ -3015,11 +2999,14 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o if (err) return err; - idx = auxtrace_event->idx; format = (aux_event->flags & PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW) ? UNFORMATTED : FORMATTED; - - return cs_etm__setup_queue(etm, &etm->queues.queue_array[idx], idx, format); + if (etmq->format != UNSET && format != etmq->format) { + pr_err("CS_ETM: mixed formatted and unformatted trace not supported\n"); + return -EINVAL; + } + etmq->format = format; + return 0; } /* Wasn't inside this buffer, but there were no parse errors. 1 == 'not found' */ @@ -3271,6 +3258,7 @@ static int cs_etm__create_decoders(struct cs_etm_auxtrace *etm) * Don't create decoders for empty queues, mainly because * etmq->format is unknown for empty queues. */ + assert(empty == (etmq->format == UNSET)); if (empty) continue; @@ -3290,10 +3278,10 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, int event_header_size = sizeof(struct perf_event_header); int total_size = auxtrace_info->header.size; int priv_size = 0; - int num_cpu; + int num_cpu, max_cpu = 0; int err = 0; int aux_hw_id_found; - int i, j; + int i; u64 *ptr = NULL; u64 **metadata = NULL; @@ -3324,7 +3312,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, * required by the trace decoder to properly decode the trace due * to its highly compressed nature. */ - for (j = 0; j < num_cpu; j++) { + for (int j = 0; j < num_cpu; j++) { if (ptr[i] == __perf_cs_etmv3_magic) { metadata[j] = cs_etm__create_meta_blk(ptr, &i, @@ -3348,6 +3336,9 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, err = -ENOMEM; goto err_free_metadata; } + + if ((int) metadata[j][CS_ETM_CPU] > max_cpu) + max_cpu = metadata[j][CS_ETM_CPU]; } /* @@ -3377,10 +3368,16 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, */ etm->pid_fmt = cs_etm__init_pid_fmt(metadata[0]); - err = auxtrace_queues__init(&etm->queues); + err = auxtrace_queues__init_nr(&etm->queues, max_cpu + 1); if (err) goto err_free_etm; + for (unsigned int j = 0; j < etm->queues.nr_queues; ++j) { + err = cs_etm__setup_queue(etm, &etm->queues.queue_array[j], j); + if (err) + goto err_free_queues; + } + if (session->itrace_synth_opts->set) { etm->synth_opts = *session->itrace_synth_opts; } else { @@ -3487,7 +3484,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, zfree(&etm); err_free_metadata: /* No need to check @metadata[j], free(NULL) is supported */ - for (j = 0; j < num_cpu; j++) + for (int j = 0; j < num_cpu; j++) zfree(&metadata[j]); zfree(&metadata); err_free_traceid_list: From 2bcc6ba7512da572be25fe6b372df1582e742513 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:45 +0100 Subject: [PATCH 260/548] perf: cs-etm: Move traceid_list to each queue The global list won't work for per-sink trace ID allocations, so put a list in each queue where the IDs will be unique to that queue. To keep the same behavior as before, for version 0 of the HW_ID packets, copy all the HW_ID mappings into all queues. This change doesn't effect the decoders, only trace ID lookups on the Perf side. The decoders are still created with global mappings which will be fixed in a later commit. Reviewed-by: Mike Leach Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Anshuman Khandual Cc: Ganapatrao Kulkarni Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-4-james.clark@linaro.org Signed-off-by: James Clark Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 77c123f53e97ad4bde0271eb671b71774a99ebf6) Signed-off-by: Koba Ko --- .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 28 ++- tools/perf/util/cs-etm.c | 215 +++++++++++------- tools/perf/util/cs-etm.h | 2 +- 3 files changed, 147 insertions(+), 98 deletions(-) diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index e917985bbbe6d..0c9c48cedbf1c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -388,7 +388,8 @@ cs_etm_decoder__reset_timestamp(struct cs_etm_packet_queue *packet_queue) } static ocsd_datapath_resp_t -cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, +cs_etm_decoder__buffer_packet(struct cs_etm_queue *etmq, + struct cs_etm_packet_queue *packet_queue, const u8 trace_chan_id, enum cs_etm_sample_type sample_type) { @@ -398,7 +399,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue, if (packet_queue->packet_count >= CS_ETM_PACKET_MAX_BUFFER - 1) return OCSD_RESP_FATAL_SYS_ERR; - if (cs_etm__get_cpu(trace_chan_id, &cpu) < 0) + if (cs_etm__get_cpu(etmq, trace_chan_id, &cpu) < 0) return OCSD_RESP_FATAL_SYS_ERR; et = packet_queue->tail; @@ -436,7 +437,7 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, int ret = 0; struct cs_etm_packet *packet; - ret = cs_etm_decoder__buffer_packet(packet_queue, trace_chan_id, + ret = cs_etm_decoder__buffer_packet(etmq, packet_queue, trace_chan_id, CS_ETM_RANGE); if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT) return ret; @@ -496,7 +497,8 @@ cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq, } static ocsd_datapath_resp_t -cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, +cs_etm_decoder__buffer_discontinuity(struct cs_etm_queue *etmq, + struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) { /* @@ -504,18 +506,19 @@ cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue, * reset time statistics. */ cs_etm_decoder__reset_timestamp(queue); - return cs_etm_decoder__buffer_packet(queue, trace_chan_id, + return cs_etm_decoder__buffer_packet(etmq, queue, trace_chan_id, CS_ETM_DISCONTINUITY); } static ocsd_datapath_resp_t -cs_etm_decoder__buffer_exception(struct cs_etm_packet_queue *queue, +cs_etm_decoder__buffer_exception(struct cs_etm_queue *etmq, + struct cs_etm_packet_queue *queue, const ocsd_generic_trace_elem *elem, const uint8_t trace_chan_id) { int ret = 0; struct cs_etm_packet *packet; - ret = cs_etm_decoder__buffer_packet(queue, trace_chan_id, + ret = cs_etm_decoder__buffer_packet(etmq, queue, trace_chan_id, CS_ETM_EXCEPTION); if (ret != OCSD_RESP_CONT && ret != OCSD_RESP_WAIT) return ret; @@ -527,10 +530,11 @@ cs_etm_decoder__buffer_exception(struct cs_etm_packet_queue *queue, } static ocsd_datapath_resp_t -cs_etm_decoder__buffer_exception_ret(struct cs_etm_packet_queue *queue, +cs_etm_decoder__buffer_exception_ret(struct cs_etm_queue *etmq, + struct cs_etm_packet_queue *queue, const uint8_t trace_chan_id) { - return cs_etm_decoder__buffer_packet(queue, trace_chan_id, + return cs_etm_decoder__buffer_packet(etmq, queue, trace_chan_id, CS_ETM_EXCEPTION_RET); } @@ -599,7 +603,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( case OCSD_GEN_TRC_ELEM_EO_TRACE: case OCSD_GEN_TRC_ELEM_NO_SYNC: case OCSD_GEN_TRC_ELEM_TRACE_ON: - resp = cs_etm_decoder__buffer_discontinuity(packet_queue, + resp = cs_etm_decoder__buffer_discontinuity(etmq, packet_queue, trace_chan_id); break; case OCSD_GEN_TRC_ELEM_INSTR_RANGE: @@ -607,11 +611,11 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION: - resp = cs_etm_decoder__buffer_exception(packet_queue, elem, + resp = cs_etm_decoder__buffer_exception(etmq, packet_queue, elem, trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET: - resp = cs_etm_decoder__buffer_exception_ret(packet_queue, + resp = cs_etm_decoder__buffer_exception_ret(etmq, packet_queue, trace_chan_id); break; case OCSD_GEN_TRC_ELEM_TIMESTAMP: diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index b13a3ceb6293c..ad95df90fbebc 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -116,16 +116,18 @@ struct cs_etm_queue { /* Conversion between traceID and index in traceid_queues array */ struct intlist *traceid_queues_list; struct cs_etm_traceid_queue **traceid_queues; + /* Conversion between traceID and metadata pointers */ + struct intlist *traceid_list; }; -/* RB tree for quick conversion between traceID and metadata pointers */ -static struct intlist *traceid_list; - static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm); static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm, pid_t tid); static int cs_etm__get_data_block(struct cs_etm_queue *etmq); static int cs_etm__decode_data_block(struct cs_etm_queue *etmq); +static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata); +static u64 *get_cpu_data(struct cs_etm_auxtrace *etm, int cpu); +static int cs_etm__metadata_set_trace_id(u8 trace_chan_id, u64 *cpu_metadata); /* PTMs ETMIDR [11:8] set to b0011 */ #define ETMIDR_PTM_VERSION 0x00000300 @@ -151,12 +153,12 @@ static u32 cs_etm__get_v7_protocol_version(u32 etmidr) return CS_ETM_PROTO_ETMV3; } -static int cs_etm__get_magic(u8 trace_chan_id, u64 *magic) +static int cs_etm__get_magic(struct cs_etm_queue *etmq, u8 trace_chan_id, u64 *magic) { struct int_node *inode; u64 *metadata; - inode = intlist__find(traceid_list, trace_chan_id); + inode = intlist__find(etmq->traceid_list, trace_chan_id); if (!inode) return -EINVAL; @@ -165,12 +167,12 @@ static int cs_etm__get_magic(u8 trace_chan_id, u64 *magic) return 0; } -int cs_etm__get_cpu(u8 trace_chan_id, int *cpu) +int cs_etm__get_cpu(struct cs_etm_queue *etmq, u8 trace_chan_id, int *cpu) { struct int_node *inode; u64 *metadata; - inode = intlist__find(traceid_list, trace_chan_id); + inode = intlist__find(etmq->traceid_list, trace_chan_id); if (!inode) return -EINVAL; @@ -222,30 +224,108 @@ enum cs_etm_pid_fmt cs_etm__get_pid_fmt(struct cs_etm_queue *etmq) return etmq->etm->pid_fmt; } -static int cs_etm__map_trace_id(u8 trace_chan_id, u64 *cpu_metadata) +static int cs_etm__insert_trace_id_node(struct cs_etm_queue *etmq, + u8 trace_chan_id, u64 *cpu_metadata) { - struct int_node *inode; - /* Get an RB node for this CPU */ - inode = intlist__findnew(traceid_list, trace_chan_id); + struct int_node *inode = intlist__findnew(etmq->traceid_list, trace_chan_id); /* Something went wrong, no need to continue */ if (!inode) return -ENOMEM; + /* Disallow re-mapping a different traceID to metadata pair. */ + if (inode->priv) { + u64 *curr_cpu_data = inode->priv; + u8 curr_chan_id; + int err; + + if (curr_cpu_data[CS_ETM_CPU] != cpu_metadata[CS_ETM_CPU]) { + pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n"); + return -EINVAL; + } + + /* check that the mapped ID matches */ + err = cs_etm__metadata_get_trace_id(&curr_chan_id, curr_cpu_data); + if (err) + return err; + + if (curr_chan_id != trace_chan_id) { + pr_err("CS_ETM: mismatch between CPU trace ID and HW_ID packet ID\n"); + return -EINVAL; + } + + /* Skip re-adding the same mappings if everything matched */ + return 0; + } + + /* Not one we've seen before, associate the traceID with the metadata pointer */ + inode->priv = cpu_metadata; + + return 0; +} + +static struct cs_etm_queue *cs_etm__get_queue(struct cs_etm_auxtrace *etm, int cpu) +{ + if (etm->per_thread_decoding) + return etm->queues.queue_array[0].priv; + else + return etm->queues.queue_array[cpu].priv; +} + +static int cs_etm__map_trace_id_v0(struct cs_etm_auxtrace *etm, u8 trace_chan_id, + u64 *cpu_metadata) +{ + struct cs_etm_queue *etmq; + /* - * The node for that CPU should not be taken. - * Back out if that's the case. + * If the queue is unformatted then only save one mapping in the + * queue associated with that CPU so only one decoder is made. */ - if (inode->priv) - return -EINVAL; + etmq = cs_etm__get_queue(etm, cpu_metadata[CS_ETM_CPU]); + if (etmq->format == UNFORMATTED) + return cs_etm__insert_trace_id_node(etmq, trace_chan_id, + cpu_metadata); - /* All good, associate the traceID with the metadata pointer */ - inode->priv = cpu_metadata; + /* + * Otherwise, version 0 trace IDs are global so save them into every + * queue. + */ + for (unsigned int i = 0; i < etm->queues.nr_queues; ++i) { + int ret; + + etmq = etm->queues.queue_array[i].priv; + ret = cs_etm__insert_trace_id_node(etmq, trace_chan_id, + cpu_metadata); + if (ret) + return ret; + } return 0; } +static int cs_etm__process_trace_id_v0(struct cs_etm_auxtrace *etm, int cpu, + u64 hw_id) +{ + int err; + u64 *cpu_data; + u8 trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id); + + cpu_data = get_cpu_data(etm, cpu); + if (cpu_data == NULL) + return -EINVAL; + + err = cs_etm__map_trace_id_v0(etm, trace_chan_id, cpu_data); + if (err) + return err; + + /* + * if we are picking up the association from the packet, need to plug + * the correct trace ID into the metadata for setting up decoders later. + */ + return cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data); +} + static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata) { u64 cs_etm_magic = cpu_metadata[CS_ETM_MAGIC]; @@ -329,17 +409,13 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session, { struct cs_etm_auxtrace *etm; struct perf_sample sample; - struct int_node *inode; struct evsel *evsel; - u64 *cpu_data; u64 hw_id; int cpu, version, err; - u8 trace_chan_id, curr_chan_id; /* extract and parse the HW ID */ hw_id = event->aux_output_hw_id.hw_id; version = FIELD_GET(CS_AUX_HW_ID_VERSION_MASK, hw_id); - trace_chan_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id); /* check that we can handle this version */ if (version > CS_AUX_HW_ID_CURR_VERSION) @@ -364,43 +440,7 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session, return -EINVAL; } - /* See if the ID is mapped to a CPU, and it matches the current CPU */ - inode = intlist__find(traceid_list, trace_chan_id); - if (inode) { - cpu_data = inode->priv; - if ((int)cpu_data[CS_ETM_CPU] != cpu) { - pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n"); - return -EINVAL; - } - - /* check that the mapped ID matches */ - err = cs_etm__metadata_get_trace_id(&curr_chan_id, cpu_data); - if (err) - return err; - if (curr_chan_id != trace_chan_id) { - pr_err("CS_ETM: mismatch between CPU trace ID and HW_ID packet ID\n"); - return -EINVAL; - } - - /* mapped and matched - return OK */ - return 0; - } - - cpu_data = get_cpu_data(etm, cpu); - if (cpu_data == NULL) - return err; - - /* not one we've seen before - lets map it */ - err = cs_etm__map_trace_id(trace_chan_id, cpu_data); - if (err) - return err; - - /* - * if we are picking up the association from the packet, need to plug - * the correct trace ID into the metadata for setting up decoders later. - */ - err = cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data); - return err; + return cs_etm__process_trace_id_v0(etm, cpu, hw_id); } void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq, @@ -853,6 +893,7 @@ static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq) static void cs_etm__free_queue(void *priv) { + struct int_node *inode, *tmp; struct cs_etm_queue *etmq = priv; if (!etmq) @@ -860,6 +901,14 @@ static void cs_etm__free_queue(void *priv) cs_etm_decoder__free(etmq->decoder); cs_etm__free_traceid_queues(etmq); + + /* First remove all traceID/metadata nodes for the RB tree */ + intlist__for_each_entry_safe(inode, tmp, etmq->traceid_list) + intlist__remove(etmq->traceid_list, inode); + + /* Then the RB tree itself */ + intlist__delete(etmq->traceid_list); + free(etmq); } @@ -882,19 +931,12 @@ static void cs_etm__free_events(struct perf_session *session) static void cs_etm__free(struct perf_session *session) { int i; - struct int_node *inode, *tmp; struct cs_etm_auxtrace *aux = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace); cs_etm__free_events(session); session->auxtrace = NULL; - /* First remove all traceID/metadata nodes for the RB tree */ - intlist__for_each_entry_safe(inode, tmp, traceid_list) - intlist__remove(traceid_list, inode); - /* Then the RB tree itself */ - intlist__delete(traceid_list); - for (i = 0; i < aux->num_cpu; i++) zfree(&aux->metadata[i]); @@ -1052,9 +1094,24 @@ static struct cs_etm_queue *cs_etm__alloc_queue(void) etmq->traceid_queues_list = intlist__new(NULL); if (!etmq->traceid_queues_list) - free(etmq); + goto out_free; + + /* + * Create an RB tree for traceID-metadata tuple. Since the conversion + * has to be made for each packet that gets decoded, optimizing access + * in anything other than a sequential array is worth doing. + */ + etmq->traceid_list = intlist__new(NULL); + if (!etmq->traceid_list) + goto out_free; return etmq; + +out_free: + intlist__delete(etmq->traceid_queues_list); + free(etmq); + + return NULL; } static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, @@ -2204,7 +2261,7 @@ static int cs_etm__set_sample_flags(struct cs_etm_queue *etmq, PERF_IP_FLAG_TRACE_END; break; case CS_ETM_EXCEPTION: - ret = cs_etm__get_magic(packet->trace_chan_id, &magic); + ret = cs_etm__get_magic(etmq, packet->trace_chan_id, &magic); if (ret) return ret; @@ -3132,7 +3189,8 @@ static bool cs_etm__has_virtual_ts(u64 **metadata, int num_cpu) } /* map trace ids to correct metadata block, from information in metadata */ -static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata) +static int cs_etm__map_trace_ids_metadata(struct cs_etm_auxtrace *etm, int num_cpu, + u64 **metadata) { u64 cs_etm_magic; u8 trace_chan_id; @@ -3154,7 +3212,7 @@ static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata) /* unknown magic number */ return -EINVAL; } - err = cs_etm__map_trace_id(trace_chan_id, metadata[i]); + err = cs_etm__map_trace_id_v0(etm, trace_chan_id, metadata[i]); if (err) return err; } @@ -3285,23 +3343,12 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, u64 *ptr = NULL; u64 **metadata = NULL; - /* - * Create an RB tree for traceID-metadata tuple. Since the conversion - * has to be made for each packet that gets decoded, optimizing access - * in anything other than a sequential array is worth doing. - */ - traceid_list = intlist__new(NULL); - if (!traceid_list) - return -ENOMEM; - /* First the global part */ ptr = (u64 *) auxtrace_info->priv; num_cpu = ptr[CS_PMU_TYPE_CPUS] & 0xffffffff; metadata = zalloc(sizeof(*metadata) * num_cpu); - if (!metadata) { - err = -ENOMEM; - goto err_free_traceid_list; - } + if (!metadata) + return -ENOMEM; /* Start parsing after the common part of the header */ i = CS_HEADER_VERSION_MAX; @@ -3465,7 +3512,7 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, err = cs_etm__clear_unused_trace_ids_metadata(num_cpu, metadata); /* otherwise, this is a file with metadata values only, map from metadata */ else - err = cs_etm__map_trace_ids_metadata(num_cpu, metadata); + err = cs_etm__map_trace_ids_metadata(etm, num_cpu, metadata); if (err) goto err_free_queues; @@ -3487,7 +3534,5 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, for (int j = 0; j < num_cpu; j++) zfree(&metadata[j]); zfree(&metadata); -err_free_traceid_list: - intlist__delete(traceid_list); return err; } diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index 7cca378879176..ef0349ea2c3db 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -252,7 +252,7 @@ enum cs_etm_pid_fmt { #ifdef HAVE_CSTRACE_SUPPORT #include -int cs_etm__get_cpu(u8 trace_chan_id, int *cpu); +int cs_etm__get_cpu(struct cs_etm_queue *etmq, u8 trace_chan_id, int *cpu); enum cs_etm_pid_fmt cs_etm__get_pid_fmt(struct cs_etm_queue *etmq); int cs_etm__etmq_set_tid_el(struct cs_etm_queue *etmq, pid_t tid, u8 trace_chan_id, ocsd_ex_level el); From 81b9cafa677db5f978bb16e62a062f20e0a79bb3 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:46 +0100 Subject: [PATCH 261/548] perf: cs-etm: Create decoders based on the trace ID mappings Now that each queue has a unique set of trace ID mappings, use this list to create the decoders. In unformatted mode just add a single mapping so only one decoder is made. Previously each queue would have a decoder created for each traced CPU on the system but this won't work anymore because CPUs can have overlapping trace IDs. This also means that the CORESIGHT_TRACE_ID_UNUSED_FLAG isn't needed any more. If mappings aren't added then decoders aren't created, rather than needing a flag to suppress creation. Reviewed-by: Mike Leach Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Anshuman Khandual Cc: Ganapatrao Kulkarni Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-5-james.clark@linaro.org Signed-off-by: James Clark Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 19c3e4db38c5bf30c7e7b53dad5a464d7031dec4) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 8 +- .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 4 - tools/perf/util/cs-etm.c | 155 ++++++------------ tools/perf/util/cs-etm.h | 10 -- 4 files changed, 55 insertions(+), 122 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 28d6935f9fc9f..84999f80ccd7c 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -654,8 +654,7 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr, /* Get trace configuration register */ data[CS_ETMV4_TRCCONFIGR] = cs_etmv4_get_config(itr); /* traceID set to legacy version, in case new perf running on older system */ - data[CS_ETMV4_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | - CORESIGHT_TRACE_ID_UNUSED_FLAG; + data[CS_ETMV4_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu); /* Get read-only information from sysFS */ cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0], @@ -687,7 +686,7 @@ static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, st /* Get trace configuration register */ data[CS_ETE_TRCCONFIGR] = cs_etmv4_get_config(itr); /* traceID set to legacy version, in case new perf running on older system */ - data[CS_ETE_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG; + data[CS_ETE_TRCTRACEIDR] = cs_etm_get_legacy_trace_id(cpu); /* Get read-only information from sysFS */ cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCIDR0], &data[CS_ETE_TRCIDR0]); @@ -743,8 +742,7 @@ static void cs_etm_get_metadata(struct perf_cpu cpu, u32 *offset, /* Get configuration register */ info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr); /* traceID set to legacy value in case new perf running on old system */ - info->priv[*offset + CS_ETM_ETMTRACEIDR] = cs_etm_get_legacy_trace_id(cpu) | - CORESIGHT_TRACE_ID_UNUSED_FLAG; + info->priv[*offset + CS_ETM_ETMTRACEIDR] = cs_etm_get_legacy_trace_id(cpu); /* Get read-only information from sysFS */ cs_etm_get_ro(cs_etm_pmu, cpu, metadata_etmv3_ro[CS_ETM_ETMCCER], &info->priv[*offset + CS_ETM_ETMCCER]); diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 0c9c48cedbf1c..d49c3e9c7c21f 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -684,10 +684,6 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params, return -1; } - /* if the CPU has no trace ID associated, no decoder needed */ - if (csid == CORESIGHT_TRACE_ID_UNUSED_VAL) - return 0; - if (d_params->operation == CS_ETM_OPERATION_DECODE) { if (ocsd_dt_create_decoder(decoder->dcd_tree, decoder->decoder_name, diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index ad95df90fbebc..b58865b72f640 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -348,7 +348,6 @@ static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata) /* * update metadata trace ID from the value found in the AUX_HW_INFO packet. - * This will also clear the CORESIGHT_TRACE_ID_UNUSED_FLAG flag if present. */ static int cs_etm__metadata_set_trace_id(u8 trace_chan_id, u64 *cpu_metadata) { @@ -697,80 +696,58 @@ static void cs_etm__packet_dump(const char *pkt_string) } static void cs_etm__set_trace_param_etmv3(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, int t_idx, - int m_idx, u32 etmidr) + u64 *metadata, u32 etmidr) { - u64 **metadata = etm->metadata; - - t_params[t_idx].protocol = cs_etm__get_v7_protocol_version(etmidr); - t_params[t_idx].etmv3.reg_ctrl = metadata[m_idx][CS_ETM_ETMCR]; - t_params[t_idx].etmv3.reg_trc_id = metadata[m_idx][CS_ETM_ETMTRACEIDR]; + t_params->protocol = cs_etm__get_v7_protocol_version(etmidr); + t_params->etmv3.reg_ctrl = metadata[CS_ETM_ETMCR]; + t_params->etmv3.reg_trc_id = metadata[CS_ETM_ETMTRACEIDR]; } static void cs_etm__set_trace_param_etmv4(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, int t_idx, - int m_idx) + u64 *metadata) { - u64 **metadata = etm->metadata; - - t_params[t_idx].protocol = CS_ETM_PROTO_ETMV4i; - t_params[t_idx].etmv4.reg_idr0 = metadata[m_idx][CS_ETMV4_TRCIDR0]; - t_params[t_idx].etmv4.reg_idr1 = metadata[m_idx][CS_ETMV4_TRCIDR1]; - t_params[t_idx].etmv4.reg_idr2 = metadata[m_idx][CS_ETMV4_TRCIDR2]; - t_params[t_idx].etmv4.reg_idr8 = metadata[m_idx][CS_ETMV4_TRCIDR8]; - t_params[t_idx].etmv4.reg_configr = metadata[m_idx][CS_ETMV4_TRCCONFIGR]; - t_params[t_idx].etmv4.reg_traceidr = metadata[m_idx][CS_ETMV4_TRCTRACEIDR]; + t_params->protocol = CS_ETM_PROTO_ETMV4i; + t_params->etmv4.reg_idr0 = metadata[CS_ETMV4_TRCIDR0]; + t_params->etmv4.reg_idr1 = metadata[CS_ETMV4_TRCIDR1]; + t_params->etmv4.reg_idr2 = metadata[CS_ETMV4_TRCIDR2]; + t_params->etmv4.reg_idr8 = metadata[CS_ETMV4_TRCIDR8]; + t_params->etmv4.reg_configr = metadata[CS_ETMV4_TRCCONFIGR]; + t_params->etmv4.reg_traceidr = metadata[CS_ETMV4_TRCTRACEIDR]; } static void cs_etm__set_trace_param_ete(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, int t_idx, - int m_idx) + u64 *metadata) { - u64 **metadata = etm->metadata; - - t_params[t_idx].protocol = CS_ETM_PROTO_ETE; - t_params[t_idx].ete.reg_idr0 = metadata[m_idx][CS_ETE_TRCIDR0]; - t_params[t_idx].ete.reg_idr1 = metadata[m_idx][CS_ETE_TRCIDR1]; - t_params[t_idx].ete.reg_idr2 = metadata[m_idx][CS_ETE_TRCIDR2]; - t_params[t_idx].ete.reg_idr8 = metadata[m_idx][CS_ETE_TRCIDR8]; - t_params[t_idx].ete.reg_configr = metadata[m_idx][CS_ETE_TRCCONFIGR]; - t_params[t_idx].ete.reg_traceidr = metadata[m_idx][CS_ETE_TRCTRACEIDR]; - t_params[t_idx].ete.reg_devarch = metadata[m_idx][CS_ETE_TRCDEVARCH]; + t_params->protocol = CS_ETM_PROTO_ETE; + t_params->ete.reg_idr0 = metadata[CS_ETE_TRCIDR0]; + t_params->ete.reg_idr1 = metadata[CS_ETE_TRCIDR1]; + t_params->ete.reg_idr2 = metadata[CS_ETE_TRCIDR2]; + t_params->ete.reg_idr8 = metadata[CS_ETE_TRCIDR8]; + t_params->ete.reg_configr = metadata[CS_ETE_TRCCONFIGR]; + t_params->ete.reg_traceidr = metadata[CS_ETE_TRCTRACEIDR]; + t_params->ete.reg_devarch = metadata[CS_ETE_TRCDEVARCH]; } static int cs_etm__init_trace_params(struct cs_etm_trace_params *t_params, - struct cs_etm_auxtrace *etm, - enum cs_etm_format format, - int sample_cpu, - int decoders) -{ - int t_idx, m_idx; - u32 etmidr; - u64 architecture; - - for (t_idx = 0; t_idx < decoders; t_idx++) { - if (format == FORMATTED) - m_idx = t_idx; - else { - m_idx = get_cpu_data_idx(etm, sample_cpu); - if (m_idx == -1) { - pr_warning("CS_ETM: unknown CPU, falling back to first metadata\n"); - m_idx = 0; - } - } + struct cs_etm_queue *etmq) +{ + struct int_node *inode; - architecture = etm->metadata[m_idx][CS_ETM_MAGIC]; + intlist__for_each_entry(inode, etmq->traceid_list) { + u64 *metadata = inode->priv; + u64 architecture = metadata[CS_ETM_MAGIC]; + u32 etmidr; switch (architecture) { case __perf_cs_etmv3_magic: - etmidr = etm->metadata[m_idx][CS_ETM_ETMIDR]; - cs_etm__set_trace_param_etmv3(t_params, etm, t_idx, m_idx, etmidr); + etmidr = metadata[CS_ETM_ETMIDR]; + cs_etm__set_trace_param_etmv3(t_params++, metadata, etmidr); break; case __perf_cs_etmv4_magic: - cs_etm__set_trace_param_etmv4(t_params, etm, t_idx, m_idx); + cs_etm__set_trace_param_etmv4(t_params++, metadata); break; case __perf_cs_ete_magic: - cs_etm__set_trace_param_ete(t_params, etm, t_idx, m_idx); + cs_etm__set_trace_param_ete(t_params++, metadata); break; default: return -EINVAL; @@ -3219,35 +3196,6 @@ static int cs_etm__map_trace_ids_metadata(struct cs_etm_auxtrace *etm, int num_c return 0; } -/* - * If we found AUX_HW_ID packets, then set any metadata marked as unused to the - * unused value to reduce the number of unneeded decoders created. - */ -static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64 **metadata) -{ - u64 cs_etm_magic; - int i; - - for (i = 0; i < num_cpu; i++) { - cs_etm_magic = metadata[i][CS_ETM_MAGIC]; - switch (cs_etm_magic) { - case __perf_cs_etmv3_magic: - if (metadata[i][CS_ETM_ETMTRACEIDR] & CORESIGHT_TRACE_ID_UNUSED_FLAG) - metadata[i][CS_ETM_ETMTRACEIDR] = CORESIGHT_TRACE_ID_UNUSED_VAL; - break; - case __perf_cs_etmv4_magic: - case __perf_cs_ete_magic: - if (metadata[i][CS_ETMV4_TRCTRACEIDR] & CORESIGHT_TRACE_ID_UNUSED_FLAG) - metadata[i][CS_ETMV4_TRCTRACEIDR] = CORESIGHT_TRACE_ID_UNUSED_VAL; - break; - default: - /* unknown magic number */ - return -EINVAL; - } - } - return 0; -} - /* * Use the data gathered by the peeks for HW_ID (trace ID mappings) and AUX * (formatted or not) packets to create the decoders. @@ -3255,21 +3203,26 @@ static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64 **metadata) static int cs_etm__create_queue_decoders(struct cs_etm_queue *etmq) { struct cs_etm_decoder_params d_params; + struct cs_etm_trace_params *t_params; + int decoders = intlist__nr_entries(etmq->traceid_list); + + if (decoders == 0) + return 0; /* * Each queue can only contain data from one CPU when unformatted, so only one decoder is * needed. */ - int decoders = etmq->format == FORMATTED ? etmq->etm->num_cpu : 1; + if (etmq->format == UNFORMATTED) + assert(decoders == 1); /* Use metadata to fill in trace parameters for trace decoder */ - struct cs_etm_trace_params *t_params = zalloc(sizeof(*t_params) * decoders); + t_params = zalloc(sizeof(*t_params) * decoders); if (!t_params) goto out_free; - if (cs_etm__init_trace_params(t_params, etmq->etm, etmq->format, - etmq->queue_nr, decoders)) + if (cs_etm__init_trace_params(t_params, etmq)) goto out_free; /* Set decoder parameters to decode trace packets */ @@ -3480,9 +3433,9 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, /* * Map Trace ID values to CPU metadata. * - * Trace metadata will always contain Trace ID values from the legacy algorithm. If the - * files has been recorded by a "new" perf updated to handle AUX_HW_ID then the metadata - * ID value will also have the CORESIGHT_TRACE_ID_UNUSED_FLAG set. + * Trace metadata will always contain Trace ID values from the legacy algorithm + * in case it's read by a version of Perf that doesn't know about HW_ID packets + * or the kernel doesn't emit them. * * The updated kernel drivers that use AUX_HW_ID to sent Trace IDs will attempt to use * the same IDs as the old algorithm as far as is possible, unless there are clashes @@ -3491,12 +3444,11 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, * * For a perf able to interpret AUX_HW_ID packets we first check for the presence of * those packets. If they are there then the values will be mapped and plugged into - * the metadata. We then set any remaining metadata values with the used flag to a - * value CORESIGHT_TRACE_ID_UNUSED_VAL - which indicates no decoder is required. + * the metadata and decoders are only created for each mapping received. * * If no AUX_HW_ID packets are present - which means a file recorded on an old kernel - * then we map Trace ID values to CPU directly from the metadata - clearing any unused - * flags if present. + * then we map Trace ID values to CPU directly from the metadata and create decoders + * for all mappings. */ /* Scan for AUX_OUTPUT_HW_ID records to map trace ID values to CPU metadata */ @@ -3507,15 +3459,12 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event, if (err) goto err_free_queues; - /* if HW ID found then clear any unused metadata ID values */ - if (aux_hw_id_found) - err = cs_etm__clear_unused_trace_ids_metadata(num_cpu, metadata); - /* otherwise, this is a file with metadata values only, map from metadata */ - else + /* if no HW ID found this is a file with metadata values only, map from metadata */ + if (!aux_hw_id_found) { err = cs_etm__map_trace_ids_metadata(etm, num_cpu, metadata); - - if (err) - goto err_free_queues; + if (err) + goto err_free_queues; + } err = cs_etm__create_decoders(etm); if (err) diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h index ef0349ea2c3db..23e132ff05ea3 100644 --- a/tools/perf/util/cs-etm.h +++ b/tools/perf/util/cs-etm.h @@ -230,16 +230,6 @@ struct cs_etm_packet_queue { /* CoreSight trace ID is currently the bottom 7 bits of the value */ #define CORESIGHT_TRACE_ID_VAL_MASK GENMASK(6, 0) -/* - * perf record will set the legacy meta data values as unused initially. - * This allows perf report to manage the decoders created when dynamic - * allocation in operation. - */ -#define CORESIGHT_TRACE_ID_UNUSED_FLAG BIT(31) - -/* Value to set for unused trace ID values */ -#define CORESIGHT_TRACE_ID_UNUSED_VAL 0x7F - int cs_etm__process_auxtrace_info(union perf_event *event, struct perf_session *session); struct perf_event_attr *cs_etm_get_default_config(struct perf_pmu *pmu); From 1177a07e88b75fbd17e72c6f3b7c52a5b5aa2b89 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:47 +0100 Subject: [PATCH 262/548] perf: cs-etm: Only save valid trace IDs into files This isn't a bug because Perf always masks with CORESIGHT_TRACE_ID_VAL_MASK before using these values, but to avoid it looking like it could be, make an effort to not save bad values. Reviewed-by: Mike Leach Signed-off-by: James Clark Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Anshuman Khandual Cc: Ganapatrao Kulkarni Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-6-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 940007cee539fd07c7c274030f7f7252f8c5a5d7) Signed-off-by: Koba Ko --- tools/perf/arch/arm/util/cs-etm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 84999f80ccd7c..e37b10f3e2802 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -643,7 +643,8 @@ static bool cs_etm_is_ete(struct perf_pmu *cs_etm_pmu, struct perf_cpu cpu) static __u64 cs_etm_get_legacy_trace_id(struct perf_cpu cpu) { - return CORESIGHT_LEGACY_CPU_TRACE_ID(cpu.cpu); + /* Wrap at 48 so that invalid trace IDs aren't saved into files. */ + return CORESIGHT_LEGACY_CPU_TRACE_ID(cpu.cpu % 48); } static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr, struct perf_cpu cpu) From 1b27448a556842f60402614daa648d7d7170c448 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:48 +0100 Subject: [PATCH 263/548] perf: cs-etm: Support version 0.1 of HW_ID packets v0.1 HW_ID packets have a new field that describes which sink each CPU writes to. Use the sink ID to link trace ID maps to each other so that mappings are shared wherever the sink is shared. Also update the error message to show that overlapping IDs aren't an error in per-thread mode, just not supported. In the future we can use the CPU ID from the AUX records, or watch for changing sink IDs on HW_ID packets to use the correct decoders. Reviewed-by: Mike Leach Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Anshuman Khandual Cc: Ganapatrao Kulkarni Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-7-james.clark@linaro.org Signed-off-by: James Clark Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 1506af6db8c4abbe3d5dd573ec72b90f81abfcf7) Signed-off-by: Koba Ko --- tools/include/linux/coresight-pmu.h | 17 +++-- tools/perf/util/cs-etm.c | 103 +++++++++++++++++++++++++--- 2 files changed, 106 insertions(+), 14 deletions(-) diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h index 51ac441a37c3e..89b0ac0014b0d 100644 --- a/tools/include/linux/coresight-pmu.h +++ b/tools/include/linux/coresight-pmu.h @@ -49,12 +49,21 @@ * Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload. * Used to associate a CPU with the CoreSight Trace ID. * [07:00] - Trace ID - uses 8 bits to make value easy to read in file. - * [59:08] - Unused (SBZ) - * [63:60] - Version + * [39:08] - Sink ID - as reported in /sys/bus/event_source/devices/cs_etm/sinks/ + * Added in minor version 1. + * [55:40] - Unused (SBZ) + * [59:56] - Minor Version - previously existing fields are compatible with + * all minor versions. + * [63:60] - Major Version - previously existing fields mean different things + * in new major versions. */ #define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0) -#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60) +#define CS_AUX_HW_ID_SINK_ID_MASK GENMASK_ULL(39, 8) -#define CS_AUX_HW_ID_CURR_VERSION 0 +#define CS_AUX_HW_ID_MINOR_VERSION_MASK GENMASK_ULL(59, 56) +#define CS_AUX_HW_ID_MAJOR_VERSION_MASK GENMASK_ULL(63, 60) + +#define CS_AUX_HW_ID_MAJOR_VERSION 0 +#define CS_AUX_HW_ID_MINOR_VERSION 1 #endif diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index b58865b72f640..eeb2e0cf074e6 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -118,6 +118,12 @@ struct cs_etm_queue { struct cs_etm_traceid_queue **traceid_queues; /* Conversion between traceID and metadata pointers */ struct intlist *traceid_list; + /* + * Same as traceid_list, but traceid_list may be a reference to another + * queue's which has a matching sink ID. + */ + struct intlist *own_traceid_list; + u32 sink_id; }; static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm); @@ -142,6 +148,7 @@ static int cs_etm__metadata_set_trace_id(u8 trace_chan_id, u64 *cpu_metadata); (queue_nr << 16 | trace_chan_id) #define TO_QUEUE_NR(cs_queue_nr) (cs_queue_nr >> 16) #define TO_TRACE_CHAN_ID(cs_queue_nr) (cs_queue_nr & 0x0000ffff) +#define SINK_UNSET ((u32) -1) static u32 cs_etm__get_v7_protocol_version(u32 etmidr) { @@ -241,7 +248,16 @@ static int cs_etm__insert_trace_id_node(struct cs_etm_queue *etmq, int err; if (curr_cpu_data[CS_ETM_CPU] != cpu_metadata[CS_ETM_CPU]) { - pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n"); + /* + * With > CORESIGHT_TRACE_IDS_MAX ETMs, overlapping IDs + * are expected (but not supported) in per-thread mode, + * rather than signifying an error. + */ + if (etmq->etm->per_thread_decoding) + pr_err("CS_ETM: overlapping Trace IDs aren't currently supported in per-thread mode\n"); + else + pr_err("CS_ETM: map mismatch between HW_ID packet CPU and Trace ID\n"); + return -EINVAL; } @@ -326,6 +342,64 @@ static int cs_etm__process_trace_id_v0(struct cs_etm_auxtrace *etm, int cpu, return cs_etm__metadata_set_trace_id(trace_chan_id, cpu_data); } +static int cs_etm__process_trace_id_v0_1(struct cs_etm_auxtrace *etm, int cpu, + u64 hw_id) +{ + struct cs_etm_queue *etmq = cs_etm__get_queue(etm, cpu); + int ret; + u64 *cpu_data; + u32 sink_id = FIELD_GET(CS_AUX_HW_ID_SINK_ID_MASK, hw_id); + u8 trace_id = FIELD_GET(CS_AUX_HW_ID_TRACE_ID_MASK, hw_id); + + /* + * Check sink id hasn't changed in per-cpu mode. In per-thread mode, + * let it pass for now until an actual overlapping trace ID is hit. In + * most cases IDs won't overlap even if the sink changes. + */ + if (!etmq->etm->per_thread_decoding && etmq->sink_id != SINK_UNSET && + etmq->sink_id != sink_id) { + pr_err("CS_ETM: mismatch between sink IDs\n"); + return -EINVAL; + } + + etmq->sink_id = sink_id; + + /* Find which other queues use this sink and link their ID maps */ + for (unsigned int i = 0; i < etm->queues.nr_queues; ++i) { + struct cs_etm_queue *other_etmq = etm->queues.queue_array[i].priv; + + /* Different sinks, skip */ + if (other_etmq->sink_id != etmq->sink_id) + continue; + + /* Already linked, skip */ + if (other_etmq->traceid_list == etmq->traceid_list) + continue; + + /* At the point of first linking, this one should be empty */ + if (!intlist__empty(etmq->traceid_list)) { + pr_err("CS_ETM: Can't link populated trace ID lists\n"); + return -EINVAL; + } + + etmq->own_traceid_list = NULL; + intlist__delete(etmq->traceid_list); + etmq->traceid_list = other_etmq->traceid_list; + break; + } + + cpu_data = get_cpu_data(etm, cpu); + ret = cs_etm__insert_trace_id_node(etmq, trace_id, cpu_data); + if (ret) + return ret; + + ret = cs_etm__metadata_set_trace_id(trace_id, cpu_data); + if (ret) + return ret; + + return 0; +} + static int cs_etm__metadata_get_trace_id(u8 *trace_chan_id, u64 *cpu_metadata) { u64 cs_etm_magic = cpu_metadata[CS_ETM_MAGIC]; @@ -414,11 +488,14 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session, /* extract and parse the HW ID */ hw_id = event->aux_output_hw_id.hw_id; - version = FIELD_GET(CS_AUX_HW_ID_VERSION_MASK, hw_id); + version = FIELD_GET(CS_AUX_HW_ID_MAJOR_VERSION_MASK, hw_id); /* check that we can handle this version */ - if (version > CS_AUX_HW_ID_CURR_VERSION) + if (version > CS_AUX_HW_ID_MAJOR_VERSION) { + pr_err("CS ETM Trace: PERF_RECORD_AUX_OUTPUT_HW_ID version %d not supported. Please update Perf.\n", + version); return -EINVAL; + } /* get access to the etm metadata */ etm = container_of(session->auxtrace, struct cs_etm_auxtrace, auxtrace); @@ -439,7 +516,10 @@ static int cs_etm__process_aux_output_hw_id(struct perf_session *session, return -EINVAL; } - return cs_etm__process_trace_id_v0(etm, cpu, hw_id); + if (FIELD_GET(CS_AUX_HW_ID_MINOR_VERSION_MASK, hw_id) == 0) + return cs_etm__process_trace_id_v0(etm, cpu, hw_id); + + return cs_etm__process_trace_id_v0_1(etm, cpu, hw_id); } void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq, @@ -879,12 +959,14 @@ static void cs_etm__free_queue(void *priv) cs_etm_decoder__free(etmq->decoder); cs_etm__free_traceid_queues(etmq); - /* First remove all traceID/metadata nodes for the RB tree */ - intlist__for_each_entry_safe(inode, tmp, etmq->traceid_list) - intlist__remove(etmq->traceid_list, inode); + if (etmq->own_traceid_list) { + /* First remove all traceID/metadata nodes for the RB tree */ + intlist__for_each_entry_safe(inode, tmp, etmq->own_traceid_list) + intlist__remove(etmq->own_traceid_list, inode); - /* Then the RB tree itself */ - intlist__delete(etmq->traceid_list); + /* Then the RB tree itself */ + intlist__delete(etmq->own_traceid_list); + } free(etmq); } @@ -1078,7 +1160,7 @@ static struct cs_etm_queue *cs_etm__alloc_queue(void) * has to be made for each packet that gets decoded, optimizing access * in anything other than a sequential array is worth doing. */ - etmq->traceid_list = intlist__new(NULL); + etmq->traceid_list = etmq->own_traceid_list = intlist__new(NULL); if (!etmq->traceid_list) goto out_free; @@ -1110,6 +1192,7 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm, etmq->queue_nr = queue_nr; queue->cpu = queue_nr; /* Placeholder, may be reset to -1 in per-thread mode */ etmq->offset = 0; + etmq->sink_id = SINK_UNSET; return 0; } From 316816a2f52dc979eb8faf4d6ebaaa8e4387e1ad Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:49 +0100 Subject: [PATCH 264/548] perf: cs-etm: Print queue number in raw trace dump Now that we have overlapping trace IDs it's also useful to know what the queue number is to be able to distinguish the source of the trace so print it inline. Hide it behind the -v option because it might not be obvious to users what the queue number is. Reviewed-by: Mike Leach Signed-off-by: James Clark Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Alexandre Torgue Cc: Anshuman Khandual Cc: Ganapatrao Kulkarni Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: John Garry Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Maxime Coquelin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Suzuki Poulouse Cc: Will Deacon Link: https://lore.kernel.org/r/20240722101202.26915-8-james.clark@linaro.org Signed-off-by: James Clark Signed-off-by: Arnaldo Carvalho de Melo (cherry picked from commit 022aa67b5ab9077d8f5bf02df5a7f0f2d4dfb909) Signed-off-by: Koba Ko --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 4 ++-- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 2 +- tools/perf/util/cs-etm.c | 13 ++++++++++--- 3 files changed, 13 insertions(+), 6 deletions(-) diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index d49c3e9c7c21f..b78ef0262135c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -41,7 +41,7 @@ const u32 INSTR_PER_NS = 10; struct cs_etm_decoder { void *data; - void (*packet_printer)(const char *msg); + void (*packet_printer)(const char *msg, void *data); bool suppress_printing; dcd_tree_handle_t dcd_tree; cs_etm_mem_cb_type mem_access; @@ -202,7 +202,7 @@ static void cs_etm_decoder__print_str_cb(const void *p_context, const struct cs_etm_decoder *decoder = p_context; if (p_context && str_len && !decoder->suppress_printing) - decoder->packet_printer(msg); + decoder->packet_printer(msg, decoder->data); } static int diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 272c2efe78eef..12c782fa6db28 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -60,7 +60,7 @@ struct cs_etm_trace_params { struct cs_etm_decoder_params { int operation; - void (*packet_printer)(const char *msg); + void (*packet_printer)(const char *msg, void *data); cs_etm_mem_cb_type mem_acc_cb; bool formatted; bool fsyncs; diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index eeb2e0cf074e6..1d657d47baf57 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -762,15 +762,22 @@ static void cs_etm__packet_swap(struct cs_etm_auxtrace *etm, } } -static void cs_etm__packet_dump(const char *pkt_string) +static void cs_etm__packet_dump(const char *pkt_string, void *data) { const char *color = PERF_COLOR_BLUE; int len = strlen(pkt_string); + struct cs_etm_queue *etmq = data; + char queue_nr[64]; + + if (verbose) + snprintf(queue_nr, sizeof(queue_nr), "Qnr:%d; ", etmq->queue_nr); + else + queue_nr[0] = '\0'; if (len && (pkt_string[len-1] == '\n')) - color_fprintf(stdout, color, " %s", pkt_string); + color_fprintf(stdout, color, " %s%s", queue_nr, pkt_string); else - color_fprintf(stdout, color, " %s\n", pkt_string); + color_fprintf(stdout, color, " %s%s\n", queue_nr, pkt_string); fflush(stdout); } From bafa3f0ab0971668ed755786f6c71eecf44dc44a Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:42:58 +0100 Subject: [PATCH 265/548] perf arm-spe: Define metadata header version 2 The first version's metadata header structure doesn't include a field to indicate a header version, which is not friendly for extension. Define the metadata version 2 format with a new header structure and extend per CPU's metadata. In the meantime, the old metadata header will still be supported for backward compatibility. Signed-off-by: Leo Yan Reviewed-by: James Clark Cc: Will Deacon Cc: Mike Leach Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono Cc: John Garry Link: https://lore.kernel.org/r/20241003184302.190806-2-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 0ca2c45404eed3b6bb80d3169bf672b09cf3a70d) Signed-off-by: Koba Ko --- tools/perf/arch/arm64/util/arm-spe.c | 4 +-- tools/perf/util/arm-spe.c | 2 +- tools/perf/util/arm-spe.h | 38 +++++++++++++++++++++++++++- 3 files changed, 40 insertions(+), 4 deletions(-) diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index 4e947814ea748..bd465039ed157 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -40,7 +40,7 @@ static size_t arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused, struct evlist *evlist __maybe_unused) { - return ARM_SPE_AUXTRACE_PRIV_SIZE; + return ARM_SPE_AUXTRACE_V1_PRIV_SIZE; } static int arm_spe_info_fill(struct auxtrace_record *itr, @@ -52,7 +52,7 @@ static int arm_spe_info_fill(struct auxtrace_record *itr, container_of(itr, struct arm_spe_recording, itr); struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu; - if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE) + if (priv_size != ARM_SPE_AUXTRACE_V1_PRIV_SIZE) return -EINVAL; if (!session->evlist->core.nr_mmaps) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 9848310cee5f3..8ee17ce9982ee 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -1297,7 +1297,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event, struct perf_session *session) { struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info; - size_t min_sz = sizeof(u64) * ARM_SPE_AUXTRACE_PRIV_MAX; + size_t min_sz = ARM_SPE_AUXTRACE_V1_PRIV_SIZE; struct perf_record_time_conv *tc = &session->time_conv; const char *cpuid = perf_env__cpuid(session->evlist->env); u64 midr = strtol(cpuid, NULL, 16); diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h index 98d3235781c3c..c68c04cae5b5a 100644 --- a/tools/perf/util/arm-spe.h +++ b/tools/perf/util/arm-spe.h @@ -12,10 +12,46 @@ enum { ARM_SPE_PMU_TYPE, ARM_SPE_PER_CPU_MMAPS, + ARM_SPE_AUXTRACE_V1_PRIV_MAX, +}; + +#define ARM_SPE_AUXTRACE_V1_PRIV_SIZE \ + (ARM_SPE_AUXTRACE_V1_PRIV_MAX * sizeof(u64)) + +enum { + /* + * The old metadata format (defined above) does not include a + * field for version number. Version 1 is reserved and starts + * from version 2. + */ + ARM_SPE_HEADER_VERSION, + /* Number of sizeof(u64) */ + ARM_SPE_HEADER_SIZE, + /* PMU type shared by CPUs */ + ARM_SPE_PMU_TYPE_V2, + /* Number of CPUs */ + ARM_SPE_CPUS_NUM, ARM_SPE_AUXTRACE_PRIV_MAX, }; -#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64)) +enum { + /* Magic number */ + ARM_SPE_MAGIC, + /* CPU logical number in system */ + ARM_SPE_CPU, + /* Number of parameters */ + ARM_SPE_CPU_NR_PARAMS, + /* CPU MIDR */ + ARM_SPE_CPU_MIDR, + /* Associated PMU type */ + ARM_SPE_CPU_PMU_TYPE, + /* Minimal interval */ + ARM_SPE_CAP_MIN_IVAL, + ARM_SPE_CPU_PRIV_MAX, +}; + +#define ARM_SPE_HEADER_CURRENT_VERSION 2 + union perf_event; struct perf_session; From bab8e874640bfc196e4c637ecf4d63a954100b3f Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:42:59 +0100 Subject: [PATCH 266/548] perf arm-spe: Calculate meta data size The metadata is designed to contain a header and per CPU information. The arm_spe_find_cpus() function is introduced to identify how many CPUs support ARM SPE. Based on the CPU number, calculates the metadata size. Signed-off-by: Leo Yan Reviewed-by: James Clark Cc: Will Deacon Cc: Mike Leach Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono Cc: John Garry Link: https://lore.kernel.org/r/20241003184302.190806-3-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 59715b1908b051fa3e4c0efb8a3724786d98bc48) Signed-off-by: Koba Ko --- tools/perf/arch/arm64/util/arm-spe.c | 39 +++++++++++++++++++++++++--- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index bd465039ed157..229206a4f2b17 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -36,11 +36,44 @@ struct arm_spe_recording { bool *wrapped; }; +/* + * arm_spe_find_cpus() returns a new cpu map, and the caller should invoke + * perf_cpu_map__put() to release the map after use. + */ +static struct perf_cpu_map *arm_spe_find_cpus(struct evlist *evlist) +{ + struct perf_cpu_map *event_cpus = evlist->core.user_requested_cpus; + struct perf_cpu_map *online_cpus = perf_cpu_map__new_online_cpus(); + struct perf_cpu_map *intersect_cpus; + + /* cpu map is not "any" CPU , we have specific CPUs to work with */ + if (!perf_cpu_map__has_any_cpu(event_cpus)) { + intersect_cpus = perf_cpu_map__intersect(event_cpus, online_cpus); + perf_cpu_map__put(online_cpus); + /* Event can be "any" CPU so count all CPUs. */ + } else { + intersect_cpus = online_cpus; + } + + return intersect_cpus; +} + static size_t arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused, - struct evlist *evlist __maybe_unused) + struct evlist *evlist) { - return ARM_SPE_AUXTRACE_V1_PRIV_SIZE; + struct perf_cpu_map *cpu_map = arm_spe_find_cpus(evlist); + size_t size; + + if (!cpu_map) + return 0; + + size = ARM_SPE_AUXTRACE_PRIV_MAX + + ARM_SPE_CPU_PRIV_MAX * perf_cpu_map__nr(cpu_map); + size *= sizeof(u64); + + perf_cpu_map__put(cpu_map); + return size; } static int arm_spe_info_fill(struct auxtrace_record *itr, @@ -52,7 +85,7 @@ static int arm_spe_info_fill(struct auxtrace_record *itr, container_of(itr, struct arm_spe_recording, itr); struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu; - if (priv_size != ARM_SPE_AUXTRACE_V1_PRIV_SIZE) + if (priv_size != arm_spe_info_priv_size(itr, session->evlist)) return -EINVAL; if (!session->evlist->core.nr_mmaps) From 7a8c7cff3478d2d60de0bc9cde52de22bf8ceedb Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:43:00 +0100 Subject: [PATCH 267/548] perf arm-spe: Save per CPU information in metadata Save the Arm SPE information on a per-CPU basis. This approach is easier in the decoding phase for retrieving metadata based on the CPU number of every Arm SPE record. Signed-off-by: Leo Yan Reviewed-by: James Clark Cc: Will Deacon Cc: Mike Leach Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono Cc: John Garry Link: https://lore.kernel.org/r/20241003184302.190806-4-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 703f344d0c4a3a006d3e1466d38ee6d8791acd87) Signed-off-by: Koba Ko --- tools/perf/arch/arm64/util/arm-spe.c | 83 +++++++++++++++++++++++++++- 1 file changed, 81 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index 229206a4f2b17..dcec0118bdb87 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -25,6 +25,8 @@ #include "../../../util/arm-spe.h" #include // reallocarray +#define ARM_SPE_CPU_MAGIC 0x1010101010101010ULL + #define KiB(x) ((x) * 1024) #define MiB(x) ((x) * 1024 * 1024) @@ -76,14 +78,70 @@ arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused, return size; } +static int arm_spe_save_cpu_header(struct auxtrace_record *itr, + struct perf_cpu cpu, __u64 data[]) +{ + struct arm_spe_recording *sper = + container_of(itr, struct arm_spe_recording, itr); + struct perf_pmu *pmu = NULL; + struct perf_pmu tmp_pmu; + char cpu_id_str[16]; + char *cpuid = NULL; + u64 val; + + snprintf(cpu_id_str, sizeof(cpu_id_str), "%d", cpu.cpu); + tmp_pmu.cpus = perf_cpu_map__new(cpu_id_str); + if (!tmp_pmu.cpus) + return -ENOMEM; + + /* Read CPU MIDR */ + cpuid = perf_pmu__getcpuid(&tmp_pmu); + + /* The CPU map will not be used anymore, release it */ + perf_cpu_map__put(tmp_pmu.cpus); + + if (!cpuid) + return -ENOMEM; + val = strtol(cpuid, NULL, 16); + + data[ARM_SPE_MAGIC] = ARM_SPE_CPU_MAGIC; + data[ARM_SPE_CPU] = cpu.cpu; + data[ARM_SPE_CPU_NR_PARAMS] = ARM_SPE_CPU_PRIV_MAX - ARM_SPE_CPU_MIDR; + data[ARM_SPE_CPU_MIDR] = val; + + /* Find the associate Arm SPE PMU for the CPU */ + if (perf_cpu_map__has(sper->arm_spe_pmu->cpus, cpu)) + pmu = sper->arm_spe_pmu; + + if (!pmu) { + /* No Arm SPE PMU is found */ + data[ARM_SPE_CPU_PMU_TYPE] = ULLONG_MAX; + data[ARM_SPE_CAP_MIN_IVAL] = 0; + } else { + data[ARM_SPE_CPU_PMU_TYPE] = pmu->type; + + if (perf_pmu__scan_file(pmu, "caps/min_interval", "%lu", &val) != 1) + val = 0; + data[ARM_SPE_CAP_MIN_IVAL] = val; + } + + free(cpuid); + return ARM_SPE_CPU_PRIV_MAX; +} + static int arm_spe_info_fill(struct auxtrace_record *itr, struct perf_session *session, struct perf_record_auxtrace_info *auxtrace_info, size_t priv_size) { + int i, ret; + size_t offset; struct arm_spe_recording *sper = container_of(itr, struct arm_spe_recording, itr); struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu; + struct perf_cpu_map *cpu_map; + struct perf_cpu cpu; + __u64 *data; if (priv_size != arm_spe_info_priv_size(itr, session->evlist)) return -EINVAL; @@ -91,10 +149,31 @@ static int arm_spe_info_fill(struct auxtrace_record *itr, if (!session->evlist->core.nr_mmaps) return -EINVAL; + cpu_map = arm_spe_find_cpus(session->evlist); + if (!cpu_map) + return -EINVAL; + auxtrace_info->type = PERF_AUXTRACE_ARM_SPE; - auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type; + auxtrace_info->priv[ARM_SPE_HEADER_VERSION] = ARM_SPE_HEADER_CURRENT_VERSION; + auxtrace_info->priv[ARM_SPE_HEADER_SIZE] = + ARM_SPE_AUXTRACE_PRIV_MAX - ARM_SPE_HEADER_VERSION; + auxtrace_info->priv[ARM_SPE_PMU_TYPE_V2] = arm_spe_pmu->type; + auxtrace_info->priv[ARM_SPE_CPUS_NUM] = perf_cpu_map__nr(cpu_map); + + offset = ARM_SPE_AUXTRACE_PRIV_MAX; + perf_cpu_map__for_each_cpu(cpu, i, cpu_map) { + assert(offset < priv_size); + data = &auxtrace_info->priv[offset]; + ret = arm_spe_save_cpu_header(itr, cpu, data); + if (ret < 0) + goto out; + offset += ret; + } - return 0; + ret = 0; +out: + perf_cpu_map__put(cpu_map); + return ret; } static void From e03bdad4a4ea80573176716f0d6c0d5adae394ee Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:43:01 +0100 Subject: [PATCH 268/548] perf arm-spe: Support metadata version 2 This commit is to support metadata version 2 and at the meantime it is backward compatible for version 1's format. The metadata version 1 doesn't include the ARM_SPE_HEADER_VERSION field. As version 1 is fixed with two u64 fields, by checking the metadata size, it distinguishes the metadata is version 1 or version 2 (and any new versions if later will have). For version 2, it reads out CPU number and retrieves the metadata info for every CPU. Signed-off-by: Leo Yan Reviewed-by: James Clark Cc: Will Deacon Cc: Mike Leach Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono Cc: John Garry Link: https://lore.kernel.org/r/20241003184302.190806-5-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 7842a4b6ff698768ccdb13324c3902a069b5d5dd) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 99 +++++++++++++++++++++++++++++++++++++-- 1 file changed, 95 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 8ee17ce9982ee..963c80bd1e1da 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -80,6 +80,10 @@ struct arm_spe { unsigned long num_events; u8 use_ctx_pkt_for_pid; + + u64 **metadata; + u64 metadata_ver; + u64 metadata_nr_cpu; }; struct arm_spe_queue { @@ -1022,6 +1026,73 @@ static int arm_spe_flush(struct perf_session *session __maybe_unused, return 0; } +static u64 *arm_spe__alloc_per_cpu_metadata(u64 *buf, int per_cpu_size) +{ + u64 *metadata; + + metadata = zalloc(per_cpu_size); + if (!metadata) + return NULL; + + memcpy(metadata, buf, per_cpu_size); + return metadata; +} + +static void arm_spe__free_metadata(u64 **metadata, int nr_cpu) +{ + int i; + + for (i = 0; i < nr_cpu; i++) + zfree(&metadata[i]); + free(metadata); +} + +static u64 **arm_spe__alloc_metadata(struct perf_record_auxtrace_info *info, + u64 *ver, int *nr_cpu) +{ + u64 *ptr = (u64 *)info->priv; + u64 metadata_size; + u64 **metadata = NULL; + int hdr_sz, per_cpu_sz, i; + + metadata_size = info->header.size - + sizeof(struct perf_record_auxtrace_info); + + /* Metadata version 1 */ + if (metadata_size == ARM_SPE_AUXTRACE_V1_PRIV_SIZE) { + *ver = 1; + *nr_cpu = 0; + /* No per CPU metadata */ + return NULL; + } + + *ver = ptr[ARM_SPE_HEADER_VERSION]; + hdr_sz = ptr[ARM_SPE_HEADER_SIZE]; + *nr_cpu = ptr[ARM_SPE_CPUS_NUM]; + + metadata = calloc(*nr_cpu, sizeof(*metadata)); + if (!metadata) + return NULL; + + /* Locate the start address of per CPU metadata */ + ptr += hdr_sz; + per_cpu_sz = (metadata_size - (hdr_sz * sizeof(u64))) / (*nr_cpu); + + for (i = 0; i < *nr_cpu; i++) { + metadata[i] = arm_spe__alloc_per_cpu_metadata(ptr, per_cpu_sz); + if (!metadata[i]) + goto err_per_cpu_metadata; + + ptr += per_cpu_sz / sizeof(u64); + } + + return metadata; + +err_per_cpu_metadata: + arm_spe__free_metadata(metadata, *nr_cpu); + return NULL; +} + static void arm_spe_free_queue(void *priv) { struct arm_spe_queue *speq = priv; @@ -1056,6 +1127,7 @@ static void arm_spe_free(struct perf_session *session) auxtrace_heap__free(&spe->heap); arm_spe_free_events(session); session->auxtrace = NULL; + arm_spe__free_metadata(spe->metadata, spe->metadata_nr_cpu); free(spe); } @@ -1302,15 +1374,26 @@ int arm_spe_process_auxtrace_info(union perf_event *event, const char *cpuid = perf_env__cpuid(session->evlist->env); u64 midr = strtol(cpuid, NULL, 16); struct arm_spe *spe; - int err; + u64 **metadata = NULL; + u64 metadata_ver; + int nr_cpu, err; if (auxtrace_info->header.size < sizeof(struct perf_record_auxtrace_info) + min_sz) return -EINVAL; + metadata = arm_spe__alloc_metadata(auxtrace_info, &metadata_ver, + &nr_cpu); + if (!metadata && metadata_ver != 1) { + pr_err("Failed to parse Arm SPE metadata.\n"); + return -EINVAL; + } + spe = zalloc(sizeof(struct arm_spe)); - if (!spe) - return -ENOMEM; + if (!spe) { + err = -ENOMEM; + goto err_free_metadata; + } err = auxtrace_queues__init(&spe->queues); if (err) @@ -1319,8 +1402,14 @@ int arm_spe_process_auxtrace_info(union perf_event *event, spe->session = session; spe->machine = &session->machines.host; /* No kvm support */ spe->auxtrace_type = auxtrace_info->type; - spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE]; + if (metadata_ver == 1) + spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE]; + else + spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE_V2]; spe->midr = midr; + spe->metadata = metadata; + spe->metadata_ver = metadata_ver; + spe->metadata_nr_cpu = nr_cpu; spe->timeless_decoding = arm_spe__is_timeless_decoding(spe); @@ -1381,5 +1470,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event, session->auxtrace = NULL; err_free: free(spe); +err_free_metadata: + arm_spe__free_metadata(metadata, nr_cpu); return err; } From aa05085087d912cc7bad9a7aba9f2177f598df99 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:43:02 +0100 Subject: [PATCH 269/548] perf arm-spe: Dump metadata with version 2 This commit dumps metadata with version 2. It dumps metadata for header and per CPU data respectively in the arm_spe_print_info() function to support metadata version 2 format. After: 0 0 0x3c0 [0x1b0]: PERF_RECORD_AUXTRACE_INFO type: 4 Header version :2 Header size :4 PMU type v2 :13 CPU number :8 Magic :0x1010101010101010 CPU # :0 Num of params :3 MIDR :0x410fd801 PMU Type :-1 Min Interval :0 Magic :0x1010101010101010 CPU # :1 Num of params :3 MIDR :0x410fd801 PMU Type :-1 Min Interval :0 Magic :0x1010101010101010 CPU # :2 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :3 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :4 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :5 Num of params :3 MIDR :0x410fd870 PMU Type :13 Min Interval :1024 Magic :0x1010101010101010 CPU # :6 Num of params :3 MIDR :0x410fd850 PMU Type :-1 Min Interval :0 Magic :0x1010101010101010 CPU # :7 Num of params :3 MIDR :0x410fd850 PMU Type :-1 Min Interval :0 Signed-off-by: Leo Yan Reviewed-by: James Clark Cc: Will Deacon Cc: Mike Leach Cc: linux-arm-kernel@lists.infradead.org Cc: Besar Wicaksono Cc: John Garry Link: https://lore.kernel.org/r/20241003184302.190806-6-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit e52abceb4b6c2723c7e49388e67a32ffb47bd90c) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 54 +++++++++++++++++++++++++++++++++++---- 1 file changed, 49 insertions(+), 5 deletions(-) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 963c80bd1e1da..2c4d2b5635858 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -1139,16 +1139,60 @@ static bool arm_spe_evsel_is_auxtrace(struct perf_session *session, return evsel->core.attr.type == spe->pmu_type; } -static const char * const arm_spe_info_fmts[] = { - [ARM_SPE_PMU_TYPE] = " PMU Type %"PRId64"\n", +static const char * const metadata_hdr_v1_fmts[] = { + [ARM_SPE_PMU_TYPE] = " PMU Type :%"PRId64"\n", + [ARM_SPE_PER_CPU_MMAPS] = " Per CPU mmaps :%"PRId64"\n", }; -static void arm_spe_print_info(__u64 *arr) +static const char * const metadata_hdr_fmts[] = { + [ARM_SPE_HEADER_VERSION] = " Header version :%"PRId64"\n", + [ARM_SPE_HEADER_SIZE] = " Header size :%"PRId64"\n", + [ARM_SPE_PMU_TYPE_V2] = " PMU type v2 :%"PRId64"\n", + [ARM_SPE_CPUS_NUM] = " CPU number :%"PRId64"\n", +}; + +static const char * const metadata_per_cpu_fmts[] = { + [ARM_SPE_MAGIC] = " Magic :0x%"PRIx64"\n", + [ARM_SPE_CPU] = " CPU # :%"PRId64"\n", + [ARM_SPE_CPU_NR_PARAMS] = " Num of params :%"PRId64"\n", + [ARM_SPE_CPU_MIDR] = " MIDR :0x%"PRIx64"\n", + [ARM_SPE_CPU_PMU_TYPE] = " PMU Type :%"PRId64"\n", + [ARM_SPE_CAP_MIN_IVAL] = " Min Interval :%"PRId64"\n", +}; + +static void arm_spe_print_info(struct arm_spe *spe, __u64 *arr) { + unsigned int i, cpu, hdr_size, cpu_num, cpu_size; + const char * const *hdr_fmts; + if (!dump_trace) return; - fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]); + if (spe->metadata_ver == 1) { + cpu_num = 0; + hdr_size = ARM_SPE_AUXTRACE_V1_PRIV_MAX; + hdr_fmts = metadata_hdr_v1_fmts; + } else { + cpu_num = arr[ARM_SPE_CPUS_NUM]; + hdr_size = arr[ARM_SPE_HEADER_SIZE]; + hdr_fmts = metadata_hdr_fmts; + } + + for (i = 0; i < hdr_size; i++) + fprintf(stdout, hdr_fmts[i], arr[i]); + + arr += hdr_size; + for (cpu = 0; cpu < cpu_num; cpu++) { + /* + * The parameters from ARM_SPE_MAGIC to ARM_SPE_CPU_NR_PARAMS + * are fixed. The sequential parameter size is decided by the + * field 'ARM_SPE_CPU_NR_PARAMS'. + */ + cpu_size = (ARM_SPE_CPU_NR_PARAMS + 1) + arr[ARM_SPE_CPU_NR_PARAMS]; + for (i = 0; i < cpu_size; i++) + fprintf(stdout, metadata_per_cpu_fmts[i], arr[i]); + arr += cpu_size; + } } struct arm_spe_synth { @@ -1442,7 +1486,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event, spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace; session->auxtrace = &spe->auxtrace; - arm_spe_print_info(&auxtrace_info->priv[0]); + arm_spe_print_info(spe, &auxtrace_info->priv[0]); if (dump_trace) return 0; From 28b31a2e3ca100b119a0ae04eb25085a125b7090 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:53:16 +0100 Subject: [PATCH 270/548] perf arm-spe: Rename arm_spe__synth_data_source_generic() The arm_spe__synth_data_source_generic() function is invoked when the tool detects that CPUs do not support data source packets and falls back to synthesizing only the memory level. Rename it to arm_spe__synth_memory_level() for better reflecting its purpose. Signed-off-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-2-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit fb98fa3bf86893e53fac7bc951f503caf4a6eb23) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 2c4d2b5635858..7c363ea97458b 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -498,8 +498,8 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec } } -static void arm_spe__synth_data_source_generic(const struct arm_spe_record *record, - union perf_mem_data_src *data_src) +static void arm_spe__synth_memory_level(const struct arm_spe_record *record, + union perf_mem_data_src *data_src) { if (record->type & (ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS)) { data_src->mem_lvl = PERF_MEM_LVL_L3; @@ -540,7 +540,7 @@ static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 m if (is_neoverse) arm_spe__synth_data_source_neoverse(record, &data_src); else - arm_spe__synth_data_source_generic(record, &data_src); + arm_spe__synth_memory_level(record, &data_src); if (record->type & (ARM_SPE_TLB_ACCESS | ARM_SPE_TLB_MISS)) { data_src.mem_dtlb = PERF_MEM_TLB_WK; From d2bc6fef989a069ce108438c7306e49ab7462ef2 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:53:17 +0100 Subject: [PATCH 271/548] perf arm-spe: Rename the common data source encoding The Neoverse CPUs follow the common data source encoding, and other CPU variants can share the same format. Rename the CPU list and data source definitions as common data source names. This change prepares for appending more CPU variants. Signed-off-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-3-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 50b8f1d5bf4ad7f09ef8012ccf5f94f741df827b) Signed-off-by: Koba Ko --- .../util/arm-spe-decoder/arm-spe-decoder.h | 18 ++++++------ tools/perf/util/arm-spe.c | 28 +++++++++---------- 2 files changed, 23 insertions(+), 23 deletions(-) diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h index 1443c28545a94..358c611eeddbb 100644 --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h @@ -56,15 +56,15 @@ enum arm_spe_op_type { ARM_SPE_OP_BR_INDIRECT = 1 << 17, }; -enum arm_spe_neoverse_data_source { - ARM_SPE_NV_L1D = 0x0, - ARM_SPE_NV_L2 = 0x8, - ARM_SPE_NV_PEER_CORE = 0x9, - ARM_SPE_NV_LOCAL_CLUSTER = 0xa, - ARM_SPE_NV_SYS_CACHE = 0xb, - ARM_SPE_NV_PEER_CLUSTER = 0xc, - ARM_SPE_NV_REMOTE = 0xd, - ARM_SPE_NV_DRAM = 0xe, +enum arm_spe_common_data_source { + ARM_SPE_COMMON_DS_L1D = 0x0, + ARM_SPE_COMMON_DS_L2 = 0x8, + ARM_SPE_COMMON_DS_PEER_CORE = 0x9, + ARM_SPE_COMMON_DS_LOCAL_CLUSTER = 0xa, + ARM_SPE_COMMON_DS_SYS_CACHE = 0xb, + ARM_SPE_COMMON_DS_PEER_CLUSTER = 0xc, + ARM_SPE_COMMON_DS_REMOTE = 0xd, + ARM_SPE_COMMON_DS_DRAM = 0xe, }; struct arm_spe_record { diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 7c363ea97458b..51d090ea360ab 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -415,15 +415,15 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq, return arm_spe_deliver_synth_event(spe, speq, event, &sample); } -static const struct midr_range neoverse_spe[] = { +static const struct midr_range common_ds_encoding_cpus[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; -static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *record, - union perf_mem_data_src *data_src) +static void arm_spe__synth_data_source_common(const struct arm_spe_record *record, + union perf_mem_data_src *data_src) { /* * Even though four levels of cache hierarchy are possible, no known @@ -445,17 +445,17 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec } switch (record->source) { - case ARM_SPE_NV_L1D: + case ARM_SPE_COMMON_DS_L1D: data_src->mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L1; data_src->mem_snoop = PERF_MEM_SNOOP_NONE; break; - case ARM_SPE_NV_L2: + case ARM_SPE_COMMON_DS_L2: data_src->mem_lvl = PERF_MEM_LVL_L2 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2; data_src->mem_snoop = PERF_MEM_SNOOP_NONE; break; - case ARM_SPE_NV_PEER_CORE: + case ARM_SPE_COMMON_DS_PEER_CORE: data_src->mem_lvl = PERF_MEM_LVL_L2 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; @@ -464,8 +464,8 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec * We don't know if this is L1, L2 but we do know it was a cache-2-cache * transfer, so set SNOOPX_PEER */ - case ARM_SPE_NV_LOCAL_CLUSTER: - case ARM_SPE_NV_PEER_CLUSTER: + case ARM_SPE_COMMON_DS_LOCAL_CLUSTER: + case ARM_SPE_COMMON_DS_PEER_CLUSTER: data_src->mem_lvl = PERF_MEM_LVL_L3 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L3; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; @@ -473,7 +473,7 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec /* * System cache is assumed to be L3 */ - case ARM_SPE_NV_SYS_CACHE: + case ARM_SPE_COMMON_DS_SYS_CACHE: data_src->mem_lvl = PERF_MEM_LVL_L3 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L3; data_src->mem_snoop = PERF_MEM_SNOOP_HIT; @@ -482,13 +482,13 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec * We don't know what level it hit in, except it came from the other * socket */ - case ARM_SPE_NV_REMOTE: + case ARM_SPE_COMMON_DS_REMOTE: data_src->mem_lvl = PERF_MEM_LVL_REM_CCE1; data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE; data_src->mem_remote = PERF_MEM_REMOTE_REMOTE; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; break; - case ARM_SPE_NV_DRAM: + case ARM_SPE_COMMON_DS_DRAM: data_src->mem_lvl = PERF_MEM_LVL_LOC_RAM | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_RAM; data_src->mem_snoop = PERF_MEM_SNOOP_NONE; @@ -524,7 +524,7 @@ static void arm_spe__synth_memory_level(const struct arm_spe_record *record, static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 midr) { union perf_mem_data_src data_src = { .mem_op = PERF_MEM_OP_NA }; - bool is_neoverse = is_midr_in_range_list(midr, neoverse_spe); + bool is_common = is_midr_in_range_list(midr, common_ds_encoding_cpus); /* Only synthesize data source for LDST operations */ if (!is_ldst_op(record->op)) @@ -537,8 +537,8 @@ static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 m else return 0; - if (is_neoverse) - arm_spe__synth_data_source_neoverse(record, &data_src); + if (is_common) + arm_spe__synth_data_source_common(record, &data_src); else arm_spe__synth_memory_level(record, &data_src); From 4cbc896c8936fe5dd701c3d67e067527e6957225 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:53:18 +0100 Subject: [PATCH 272/548] perf arm-spe: Introduce arm_spe__is_homogeneous() Introduce the arm_spe__is_homogeneous() function, it uses to check if Arm SPE is homogeneous cross all CPUs. Signed-off-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-4-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 56ae663e7636f2ce180201f0f18d7736c319a43f) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 51d090ea360ab..75bd7d5d4aae1 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -84,6 +84,7 @@ struct arm_spe { u64 **metadata; u64 metadata_ver; u64 metadata_nr_cpu; + bool is_homogeneous; }; struct arm_spe_queue { @@ -1409,6 +1410,30 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session) return 0; } +static bool arm_spe__is_homogeneous(u64 **metadata, int nr_cpu) +{ + u64 midr; + int i; + + if (!nr_cpu) + return false; + + for (i = 0; i < nr_cpu; i++) { + if (!metadata[i]) + return false; + + if (i == 0) { + midr = metadata[i][ARM_SPE_CPU_MIDR]; + continue; + } + + if (midr != metadata[i][ARM_SPE_CPU_MIDR]) + return false; + } + + return true; +} + int arm_spe_process_auxtrace_info(union perf_event *event, struct perf_session *session) { @@ -1454,6 +1479,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event, spe->metadata = metadata; spe->metadata_ver = metadata_ver; spe->metadata_nr_cpu = nr_cpu; + spe->is_homogeneous = arm_spe__is_homogeneous(metadata, nr_cpu); spe->timeless_decoding = arm_spe__is_timeless_decoding(spe); From 24da6292452e1f5d7ad20279ba0a958dc2c48a61 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:53:19 +0100 Subject: [PATCH 273/548] perf arm-spe: Use metadata to decide the data source feature Use the info in the metadata to decide if the data source feature is supported. The CPU MIDR must be in the CPU list for the common data source encoding. For the metadata version 1, it doesn't include info for MIDR. In this case, due to absent info for making decision, print out warning to remind users to upgrade tool and returns false. Signed-off-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-5-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit ba5e7169e5483a61899497e23fa18f7ef33aa827) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 67 +++++++++++++++++++++++++++++++++++++-- 1 file changed, 64 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 75bd7d5d4aae1..a7b8eb427b7b4 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -280,6 +280,20 @@ static int arm_spe_set_tid(struct arm_spe_queue *speq, pid_t tid) return 0; } +static u64 *arm_spe__get_metadata_by_cpu(struct arm_spe *spe, u64 cpu) +{ + u64 i; + + if (!spe->metadata) + return NULL; + + for (i = 0; i < spe->metadata_nr_cpu; i++) + if (spe->metadata[i][ARM_SPE_CPU] == cpu) + return spe->metadata[i]; + + return NULL; +} + static struct simd_flags arm_spe__synth_simd_flags(const struct arm_spe_record *record) { struct simd_flags simd_flags = {}; @@ -522,10 +536,57 @@ static void arm_spe__synth_memory_level(const struct arm_spe_record *record, data_src->mem_lvl |= PERF_MEM_LVL_REM_CCE1; } -static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 midr) +static bool arm_spe__is_common_ds_encoding(struct arm_spe_queue *speq) +{ + struct arm_spe *spe = speq->spe; + bool is_in_cpu_list; + u64 *metadata = NULL; + u64 midr = 0; + + /* + * Metadata version 1 doesn't contain any info for MIDR. + * Simply return false in this case. + */ + if (spe->metadata_ver == 1) { + pr_warning_once("The data file contains metadata version 1, " + "which is absent the info for data source. " + "Please upgrade the tool to record data.\n"); + return false; + } + + /* CPU ID is -1 for per-thread mode */ + if (speq->cpu < 0) { + /* + * On the heterogeneous system, due to CPU ID is -1, + * cannot confirm the data source packet is supported. + */ + if (!spe->is_homogeneous) + return false; + + /* In homogeneous system, simply use CPU0's metadata */ + if (spe->metadata) + metadata = spe->metadata[0]; + } else { + metadata = arm_spe__get_metadata_by_cpu(spe, speq->cpu); + } + + if (!metadata) + return false; + + midr = metadata[ARM_SPE_CPU_MIDR]; + + is_in_cpu_list = is_midr_in_range_list(midr, common_ds_encoding_cpus); + if (is_in_cpu_list) + return true; + else + return false; +} + +static u64 arm_spe__synth_data_source(struct arm_spe_queue *speq, + const struct arm_spe_record *record) { union perf_mem_data_src data_src = { .mem_op = PERF_MEM_OP_NA }; - bool is_common = is_midr_in_range_list(midr, common_ds_encoding_cpus); + bool is_common = arm_spe__is_common_ds_encoding(speq); /* Only synthesize data source for LDST operations */ if (!is_ldst_op(record->op)) @@ -562,7 +623,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq) u64 data_src; int err; - data_src = arm_spe__synth_data_source(record, spe->midr); + data_src = arm_spe__synth_data_source(speq, record); if (spe->sample_flc) { if (record->type & ARM_SPE_L1D_MISS) { From 754dad00bf33ee644c116e09df671aadb922e9f8 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:53:20 +0100 Subject: [PATCH 274/548] perf arm-spe: Remove the unused 'midr' field The 'midr' field is replaced by the MIDR values stored in metadata (per CPU wise). Remove the 'midr' field as it is no longer used. Signed-off-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-6-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 6bcf54c89b3d8406433839f0e3b72c08b4a1caf3) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index a7b8eb427b7b4..79d15fb44f9c6 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -48,7 +48,6 @@ struct arm_spe { struct perf_session *session; struct machine *machine; u32 pmu_type; - u64 midr; struct perf_tsc_conversion tc; @@ -1501,8 +1500,6 @@ int arm_spe_process_auxtrace_info(union perf_event *event, struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info; size_t min_sz = ARM_SPE_AUXTRACE_V1_PRIV_SIZE; struct perf_record_time_conv *tc = &session->time_conv; - const char *cpuid = perf_env__cpuid(session->evlist->env); - u64 midr = strtol(cpuid, NULL, 16); struct arm_spe *spe; u64 **metadata = NULL; u64 metadata_ver; @@ -1536,7 +1533,6 @@ int arm_spe_process_auxtrace_info(union perf_event *event, spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE]; else spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE_V2]; - spe->midr = midr; spe->metadata = metadata; spe->metadata_ver = metadata_ver; spe->metadata_nr_cpu = nr_cpu; From b130cb8551918ea5e2f7e3f3fedb3ede5899d38e Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Thu, 3 Oct 2024 19:53:21 +0100 Subject: [PATCH 275/548] perf arm-spe: Add Neoverse-V2 to common data source encoding list Add Neoverse-V2 MIDR to the common data source encoding range list. Signed-off-by: Besar Wicaksono Reviewed-by: Leo Yan Reviewed-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-7-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit 041c0e5715a65a1b653283b853b4ca973780607a) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 79d15fb44f9c6..0c743cbc830cf 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -433,6 +433,7 @@ static const struct midr_range common_ds_encoding_cpus[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2), {}, }; From 4c25fe442c1cdd5a836fd4df18b697851e8b6c72 Mon Sep 17 00:00:00 2001 From: Leo Yan Date: Thu, 3 Oct 2024 19:53:22 +0100 Subject: [PATCH 276/548] perf arm-spe: Add Cortex CPUs to common data source encoding list Add Cortex-A720, Cortex-A725, Cortex-X1C, Cortex-X3 and Cortex-X925 into the common data source encoding list. For everyone of these CPUs, it technical reference manual defines the data source packet as the common encoding format. Signed-off-by: Leo Yan Reviewed-by: James Clark Link: https://lore.kernel.org/r/20241003185322.192357-8-leo.yan@arm.com Signed-off-by: Namhyung Kim (cherry picked from commit ea2ead4224fd3899f6dadd4c1fc526f32ec2246c) Signed-off-by: Koba Ko --- tools/perf/util/arm-spe.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 0c743cbc830cf..75b9fbaab2cb4 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -430,6 +430,11 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq, } static const struct midr_range common_ds_encoding_cpus[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A720), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A725), + MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C), + MIDR_ALL_VERSIONS(MIDR_CORTEX_X3), + MIDR_ALL_VERSIONS(MIDR_CORTEX_X925), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), From c28adacca9f5fed369421f4c868210dc5a7b0d02 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Tue, 6 Aug 2024 12:07:50 -0700 Subject: [PATCH 277/548] tools/include: Sync arm64 headers with the kernel sources To pick up changes from: 9ef54a384526 arm64: cputype: Add Cortex-A725 definitions 58d245e03c32 arm64: cputype: Add Cortex-X1C definitions fd2ff5f0b320 arm64: cputype: Add Cortex-X925 definitions add332c40328 arm64: cputype: Add Cortex-A720 definitions be5a6f238700 arm64: cputype: Add Cortex-X3 definitions This should be used to beautify x86 syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim (cherry picked from commit d5b854893d27b4030943a10cf28a07189aab0c36) Signed-off-by: Koba Ko --- tools/arch/arm64/include/asm/cputype.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/tools/arch/arm64/include/asm/cputype.h b/tools/arch/arm64/include/asm/cputype.h index 329d41f8c9237..94480a333cb6c 100644 --- a/tools/arch/arm64/include/asm/cputype.h +++ b/tools/arch/arm64/include/asm/cputype.h @@ -84,9 +84,14 @@ #define ARM_CPU_PART_CORTEX_X2 0xD48 #define ARM_CPU_PART_NEOVERSE_N2 0xD49 #define ARM_CPU_PART_CORTEX_A78C 0xD4B +#define ARM_CPU_PART_CORTEX_X1C 0xD4C +#define ARM_CPU_PART_CORTEX_X3 0xD4E #define ARM_CPU_PART_NEOVERSE_V2 0xD4F +#define ARM_CPU_PART_CORTEX_A720 0xD81 #define ARM_CPU_PART_CORTEX_X4 0xD82 #define ARM_CPU_PART_NEOVERSE_V3 0xD84 +#define ARM_CPU_PART_CORTEX_X925 0xD85 +#define ARM_CPU_PART_CORTEX_A725 0xD87 #define APM_CPU_PART_POTENZA 0x000 @@ -156,9 +161,14 @@ #define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2) #define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2) #define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C) +#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C) +#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3) #define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2) +#define MIDR_CORTEX_A720 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A720) #define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4) #define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3) +#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925) +#define MIDR_CORTEX_A725 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A725) #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX) #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX) From 70d495ffba99da70837fb8a67b711d575a4f63cb Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Thu, 31 Oct 2024 14:21:15 +0000 Subject: [PATCH 278/548] perf: arm_cspmu: nvidia: remove unsupported SCF events Remove unsupported events under SCF PMU. Signed-off-by: Besar Wicaksono Link: https://lore.kernel.org/r/20241031142118.1865965-2-bwicaksono@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit ac4c52956f62700803754ffba906e42126273956) Signed-off-by: Koba Ko --- drivers/perf/arm_cspmu/nvidia_cspmu.c | 64 --------------------------- 1 file changed, 64 deletions(-) diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c index 0382b702f0920..c6cdb7c5a27ae 100644 --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c @@ -54,65 +54,24 @@ static struct attribute *scf_pmu_event_attrs[] = { ARM_CSPMU_EVENT_ATTR(scf_cache_wb, 0xF3), NV_CSPMU_EVENT_ATTR_4(socket, rd_data, 0x101), - NV_CSPMU_EVENT_ATTR_4(socket, dl_rsp, 0x105), NV_CSPMU_EVENT_ATTR_4(socket, wb_data, 0x109), - NV_CSPMU_EVENT_ATTR_4(socket, ev_rsp, 0x10d), - NV_CSPMU_EVENT_ATTR_4(socket, prb_data, 0x111), NV_CSPMU_EVENT_ATTR_4(socket, rd_outstanding, 0x115), - NV_CSPMU_EVENT_ATTR_4(socket, dl_outstanding, 0x119), - NV_CSPMU_EVENT_ATTR_4(socket, wb_outstanding, 0x11d), - NV_CSPMU_EVENT_ATTR_4(socket, wr_outstanding, 0x121), - NV_CSPMU_EVENT_ATTR_4(socket, ev_outstanding, 0x125), - NV_CSPMU_EVENT_ATTR_4(socket, prb_outstanding, 0x129), NV_CSPMU_EVENT_ATTR_4(socket, rd_access, 0x12d), - NV_CSPMU_EVENT_ATTR_4(socket, dl_access, 0x131), NV_CSPMU_EVENT_ATTR_4(socket, wb_access, 0x135), NV_CSPMU_EVENT_ATTR_4(socket, wr_access, 0x139), - NV_CSPMU_EVENT_ATTR_4(socket, ev_access, 0x13d), - NV_CSPMU_EVENT_ATTR_4(socket, prb_access, 0x141), - - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_rd_data, 0x145), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_rd_access, 0x149), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_wb_access, 0x14d), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_rd_outstanding, 0x151), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_wr_outstanding, 0x155), - - NV_CSPMU_EVENT_ATTR_4(ocu, rem_rd_data, 0x159), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_rd_access, 0x15d), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_wb_access, 0x161), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_rd_outstanding, 0x165), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_wr_outstanding, 0x169), ARM_CSPMU_EVENT_ATTR(gmem_rd_data, 0x16d), ARM_CSPMU_EVENT_ATTR(gmem_rd_access, 0x16e), ARM_CSPMU_EVENT_ATTR(gmem_rd_outstanding, 0x16f), - ARM_CSPMU_EVENT_ATTR(gmem_dl_rsp, 0x170), - ARM_CSPMU_EVENT_ATTR(gmem_dl_access, 0x171), - ARM_CSPMU_EVENT_ATTR(gmem_dl_outstanding, 0x172), ARM_CSPMU_EVENT_ATTR(gmem_wb_data, 0x173), ARM_CSPMU_EVENT_ATTR(gmem_wb_access, 0x174), - ARM_CSPMU_EVENT_ATTR(gmem_wb_outstanding, 0x175), - ARM_CSPMU_EVENT_ATTR(gmem_ev_rsp, 0x176), - ARM_CSPMU_EVENT_ATTR(gmem_ev_access, 0x177), - ARM_CSPMU_EVENT_ATTR(gmem_ev_outstanding, 0x178), ARM_CSPMU_EVENT_ATTR(gmem_wr_data, 0x179), - ARM_CSPMU_EVENT_ATTR(gmem_wr_outstanding, 0x17a), ARM_CSPMU_EVENT_ATTR(gmem_wr_access, 0x17b), NV_CSPMU_EVENT_ATTR_4(socket, wr_data, 0x17c), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_wr_data, 0x180), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_wb_data, 0x184), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_wr_access, 0x188), - NV_CSPMU_EVENT_ATTR_4(ocu, gmem_wb_outstanding, 0x18c), - - NV_CSPMU_EVENT_ATTR_4(ocu, rem_wr_data, 0x190), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_wb_data, 0x194), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_wr_access, 0x198), - NV_CSPMU_EVENT_ATTR_4(ocu, rem_wb_outstanding, 0x19c), - ARM_CSPMU_EVENT_ATTR(gmem_wr_total_bytes, 0x1a0), ARM_CSPMU_EVENT_ATTR(remote_socket_wr_total_bytes, 0x1a1), ARM_CSPMU_EVENT_ATTR(remote_socket_rd_data, 0x1a2), @@ -122,35 +81,12 @@ static struct attribute *scf_pmu_event_attrs[] = { ARM_CSPMU_EVENT_ATTR(cmem_rd_data, 0x1a5), ARM_CSPMU_EVENT_ATTR(cmem_rd_access, 0x1a6), ARM_CSPMU_EVENT_ATTR(cmem_rd_outstanding, 0x1a7), - ARM_CSPMU_EVENT_ATTR(cmem_dl_rsp, 0x1a8), - ARM_CSPMU_EVENT_ATTR(cmem_dl_access, 0x1a9), - ARM_CSPMU_EVENT_ATTR(cmem_dl_outstanding, 0x1aa), ARM_CSPMU_EVENT_ATTR(cmem_wb_data, 0x1ab), ARM_CSPMU_EVENT_ATTR(cmem_wb_access, 0x1ac), - ARM_CSPMU_EVENT_ATTR(cmem_wb_outstanding, 0x1ad), - ARM_CSPMU_EVENT_ATTR(cmem_ev_rsp, 0x1ae), - ARM_CSPMU_EVENT_ATTR(cmem_ev_access, 0x1af), - ARM_CSPMU_EVENT_ATTR(cmem_ev_outstanding, 0x1b0), ARM_CSPMU_EVENT_ATTR(cmem_wr_data, 0x1b1), - ARM_CSPMU_EVENT_ATTR(cmem_wr_outstanding, 0x1b2), - - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_rd_data, 0x1b3), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_rd_access, 0x1b7), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_wb_access, 0x1bb), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_rd_outstanding, 0x1bf), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_wr_outstanding, 0x1c3), - - ARM_CSPMU_EVENT_ATTR(ocu_prb_access, 0x1c7), - ARM_CSPMU_EVENT_ATTR(ocu_prb_data, 0x1c8), - ARM_CSPMU_EVENT_ATTR(ocu_prb_outstanding, 0x1c9), ARM_CSPMU_EVENT_ATTR(cmem_wr_access, 0x1ca), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_wr_access, 0x1cb), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_wb_data, 0x1cf), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_wr_data, 0x1d3), - NV_CSPMU_EVENT_ATTR_4(ocu, cmem_wb_outstanding, 0x1d7), - ARM_CSPMU_EVENT_ATTR(cmem_wr_total_bytes, 0x1db), ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT), From 986fd66e0a8fb6f22a8fe046c0421c3bb4c78476 Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Thu, 31 Oct 2024 14:21:16 +0000 Subject: [PATCH 279/548] perf: arm_cspmu: nvidia: fix sysfs path in the kernel doc Fix typos to the sysfs path referenced by NVIDIA uncore pmu kernel doc. Signed-off-by: Besar Wicaksono Link: https://lore.kernel.org/r/20241031142118.1865965-3-bwicaksono@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 5f7cd0dc98a658d6470bc738499e01172bc6007f) Signed-off-by: Koba Ko --- Documentation/admin-guide/perf/nvidia-pmu.rst | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/admin-guide/perf/nvidia-pmu.rst b/Documentation/admin-guide/perf/nvidia-pmu.rst index 2e0d47cfe7ea7..6e8ee0fcf4718 100644 --- a/Documentation/admin-guide/perf/nvidia-pmu.rst +++ b/Documentation/admin-guide/perf/nvidia-pmu.rst @@ -34,7 +34,7 @@ strongly-ordered (SO) PCIE write traffic to local/remote memory. Please see traffic coverage. The events and configuration options of this PMU device are described in sysfs, -see /sys/bus/event_sources/devices/nvidia_scf_pmu_. +see /sys/bus/event_source/devices/nvidia_scf_pmu_. Example usage: @@ -66,7 +66,7 @@ Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more info about the PMU traffic coverage. The events and configuration options of this PMU device are described in sysfs, -see /sys/bus/event_sources/devices/nvidia_nvlink_c2c0_pmu_. +see /sys/bus/event_source/devices/nvidia_nvlink_c2c0_pmu_. Example usage: @@ -96,7 +96,7 @@ Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section` for more info about the PMU traffic coverage. The events and configuration options of this PMU device are described in sysfs, -see /sys/bus/event_sources/devices/nvidia_nvlink_c2c1_pmu_. +see /sys/bus/event_source/devices/nvidia_nvlink_c2c1_pmu_. Example usage: @@ -125,13 +125,13 @@ to local memory. For PCIE traffic, this PMU captures read and relaxed ordered for more info about the PMU traffic coverage. The events and configuration options of this PMU device are described in sysfs, -see /sys/bus/event_sources/devices/nvidia_cnvlink_pmu_. +see /sys/bus/event_source/devices/nvidia_cnvlink_pmu_. Each SoC socket can be connected to one or more sockets via CNVLink. The user can use "rem_socket" bitmap parameter to select the remote socket(s) to monitor. Each bit represents the socket number, e.g. "rem_socket=0xE" corresponds to socket 1 to 3. -/sys/bus/event_sources/devices/nvidia_cnvlink_pmu_/format/rem_socket +/sys/bus/event_source/devices/nvidia_cnvlink_pmu_/format/rem_socket shows the valid bits that can be set in the "rem_socket" parameter. The PMU can not distinguish the remote traffic initiator, therefore it does not @@ -165,12 +165,12 @@ local/remote memory. Please see :ref:`NVIDIA_Uncore_PMU_Traffic_Coverage_Section for more info about the PMU traffic coverage. The events and configuration options of this PMU device are described in sysfs, -see /sys/bus/event_sources/devices/nvidia_pcie_pmu_. +see /sys/bus/event_source/devices/nvidia_pcie_pmu_. Each SoC socket can support multiple root ports. The user can use "root_port" bitmap parameter to select the port(s) to monitor, i.e. "root_port=0xF" corresponds to root port 0 to 3. -/sys/bus/event_sources/devices/nvidia_pcie_pmu_/format/root_port +/sys/bus/event_source/devices/nvidia_pcie_pmu_/format/root_port shows the valid bits that can be set in the "root_port" parameter. Example usage: From a10448b3af0a0de8acb85a7e766fff96f28f9266 Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Thu, 31 Oct 2024 14:21:17 +0000 Subject: [PATCH 280/548] perf: arm_cspmu: nvidia: enable NVLINK-C2C port filtering Enable NVLINK-C2C port filtering to distinguish traffic from different GPUs connected to NVLINK-C2C. Signed-off-by: Besar Wicaksono Link: https://lore.kernel.org/r/20241031142118.1865965-4-bwicaksono@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit ca26df4b1036bcad326170a6ddb5245f6d6e8d82) Signed-off-by: Koba Ko --- Documentation/admin-guide/perf/nvidia-pmu.rst | 30 +++++++++++++++++++ drivers/perf/arm_cspmu/nvidia_cspmu.c | 5 ++-- 2 files changed, 33 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/perf/nvidia-pmu.rst b/Documentation/admin-guide/perf/nvidia-pmu.rst index 6e8ee0fcf4718..4cfc806070d71 100644 --- a/Documentation/admin-guide/perf/nvidia-pmu.rst +++ b/Documentation/admin-guide/perf/nvidia-pmu.rst @@ -86,6 +86,21 @@ Example usage: perf stat -a -e nvidia_nvlink_c2c0_pmu_3/event=0x0/ +The NVLink-C2C has two ports that can be connected to one GPU (occupying both +ports) or to two GPUs (one GPU per port). The user can use "port" bitmap +parameter to select the port(s) to monitor. Each bit represents the port number, +e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. + +Example for port filtering: + +* Count event id 0x0 from the GPU connected with socket 0 on port 0:: + + perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x1/ + +* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and port 1:: + + perf stat -a -e nvidia_nvlink_c2c0_pmu_0/event=0x0,port=0x3/ + NVLink-C2C1 PMU ------------------- @@ -116,6 +131,21 @@ Example usage: perf stat -a -e nvidia_nvlink_c2c1_pmu_3/event=0x0/ +The NVLink-C2C has two ports that can be connected to one GPU (occupying both +ports) or to two GPUs (one GPU per port). The user can use "port" bitmap +parameter to select the port(s) to monitor. Each bit represents the port number, +e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. + +Example for port filtering: + +* Count event id 0x0 from the GPU connected with socket 0 on port 0:: + + perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x1/ + +* Count event id 0x0 from the GPUs connected with socket 0 on port 0 and port 1:: + + perf stat -a -e nvidia_nvlink_c2c1_pmu_0/event=0x0,port=0x3/ + CNVLink PMU --------------- diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c index c6cdb7c5a27ae..e48ace63c739d 100644 --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c @@ -130,6 +130,7 @@ static struct attribute *pcie_pmu_format_attrs[] = { static struct attribute *nvlink_c2c_pmu_format_attrs[] = { ARM_CSPMU_FORMAT_EVENT_ATTR, + ARM_CSPMU_FORMAT_ATTR(port, "config1:0-1"), NULL, }; @@ -210,7 +211,7 @@ static const struct nv_cspmu_match nv_cspmu_match[] = { { .prodid = 0x104, .prodid_mask = NV_PRODID_MASK, - .filter_mask = 0x0, + .filter_mask = NV_NVL_C2C_FILTER_ID_MASK, .filter_default_val = NV_NVL_C2C_FILTER_ID_MASK, .name_pattern = "nvidia_nvlink_c2c1_pmu_%u", .name_fmt = NAME_FMT_SOCKET, @@ -220,7 +221,7 @@ static const struct nv_cspmu_match nv_cspmu_match[] = { { .prodid = 0x105, .prodid_mask = NV_PRODID_MASK, - .filter_mask = 0x0, + .filter_mask = NV_NVL_C2C_FILTER_ID_MASK, .filter_default_val = NV_NVL_C2C_FILTER_ID_MASK, .name_pattern = "nvidia_nvlink_c2c0_pmu_%u", .name_fmt = NAME_FMT_SOCKET, From 2f0cd5e11b0d27a9c345af2af748faf62cc33d6d Mon Sep 17 00:00:00 2001 From: Besar Wicaksono Date: Thu, 31 Oct 2024 14:21:18 +0000 Subject: [PATCH 281/548] perf: arm_cspmu: nvidia: monitor all ports by default Some NVIDIA PMUs like the NVLINK-C2C, CNVLINK, and PCIE PMU provide port filtering. If the port filter is set to zero, the counter of these PMUs will not capture any event. To avoid meaningless experiment, the driver sets the port filter value to a default non-zero value. Signed-off-by: Besar Wicaksono Link: https://lore.kernel.org/r/20241031142118.1865965-5-bwicaksono@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit bce61d5c57647ca4565847217fe811260cc60173) Signed-off-by: Koba Ko --- Documentation/admin-guide/perf/nvidia-pmu.rst | 12 ++++++++---- drivers/perf/arm_cspmu/nvidia_cspmu.c | 6 ++++-- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/Documentation/admin-guide/perf/nvidia-pmu.rst b/Documentation/admin-guide/perf/nvidia-pmu.rst index 4cfc806070d71..f538ef67e0e8f 100644 --- a/Documentation/admin-guide/perf/nvidia-pmu.rst +++ b/Documentation/admin-guide/perf/nvidia-pmu.rst @@ -89,7 +89,8 @@ Example usage: The NVLink-C2C has two ports that can be connected to one GPU (occupying both ports) or to two GPUs (one GPU per port). The user can use "port" bitmap parameter to select the port(s) to monitor. Each bit represents the port number, -e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. +e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. The +PMU will monitor both ports by default if not specified. Example for port filtering: @@ -134,7 +135,8 @@ Example usage: The NVLink-C2C has two ports that can be connected to one GPU (occupying both ports) or to two GPUs (one GPU per port). The user can use "port" bitmap parameter to select the port(s) to monitor. Each bit represents the port number, -e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. +e.g. "port=0x1" corresponds to port 0 and "port=0x3" is for port 0 and 1. The +PMU will monitor both ports by default if not specified. Example for port filtering: @@ -160,7 +162,8 @@ see /sys/bus/event_source/devices/nvidia_cnvlink_pmu_. Each SoC socket can be connected to one or more sockets via CNVLink. The user can use "rem_socket" bitmap parameter to select the remote socket(s) to monitor. Each bit represents the socket number, e.g. "rem_socket=0xE" corresponds to -socket 1 to 3. +socket 1 to 3. The PMU will monitor all remote sockets by default if not +specified. /sys/bus/event_source/devices/nvidia_cnvlink_pmu_/format/rem_socket shows the valid bits that can be set in the "rem_socket" parameter. @@ -199,7 +202,8 @@ see /sys/bus/event_source/devices/nvidia_pcie_pmu_. Each SoC socket can support multiple root ports. The user can use "root_port" bitmap parameter to select the port(s) to monitor, i.e. -"root_port=0xF" corresponds to root port 0 to 3. +"root_port=0xF" corresponds to root port 0 to 3. The PMU will monitor all root +ports by default if not specified. /sys/bus/event_source/devices/nvidia_pcie_pmu_/format/root_port shows the valid bits that can be set in the "root_port" parameter. diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c index e48ace63c739d..4a1ef3ecd6622 100644 --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c @@ -175,10 +175,12 @@ static u32 nv_cspmu_event_filter(const struct perf_event *event) const struct nv_cspmu_ctx *ctx = to_nv_cspmu_ctx(to_arm_cspmu(event->pmu)); - if (ctx->filter_mask == 0) + const u32 filter_val = event->attr.config1 & ctx->filter_mask; + + if (filter_val == 0) return ctx->filter_default_val; - return event->attr.config1 & ctx->filter_mask; + return filter_val; } enum nv_cspmu_name_fmt { From 9588ac02b1b407227023f7b91ac7789d74d13c56 Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Mon, 11 Dec 2023 11:48:49 +0100 Subject: [PATCH 282/548] sched/topology: Add a new arch_scale_freq_ref() method Create a new method to get a unique and fixed max frequency. Currently cpuinfo.max_freq or the highest (or last) state of performance domain are used as the max frequency when computing the frequency for a level of utilization, but: - cpuinfo_max_freq can change at runtime. boost is one example of such change. - cpuinfo.max_freq and last item of the PD can be different leading to different results between cpufreq and energy model. We need to save the reference frequency that has been used when computing the CPUs capacity and use this fixed and coherent value to convert between frequency and CPU's capacity. In fact, we already save the frequency that has been used when computing the capacity of each CPU. We extend the precision to save kHz instead of MHz currently and we modify the type to be aligned with other variables used when converting frequency to capacity and the other way. [ mingo: Minor edits. ] Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Tested-by: Lukasz Luba Reviewed-by: Lukasz Luba Acked-by: Sudeep Holla Link: https://lore.kernel.org/r/20231211104855.558096-2-vincent.guittot@linaro.org (cherry picked from commit 9942cb22ea458c34fa17b73d143ea32d4df1caca) Signed-off-by: Koba Ko --- arch/arm/include/asm/topology.h | 1 + arch/arm64/include/asm/topology.h | 1 + arch/riscv/include/asm/topology.h | 1 + drivers/base/arch_topology.c | 29 ++++++++++++++--------------- include/linux/arch_topology.h | 7 +++++++ include/linux/sched/topology.h | 8 ++++++++ 6 files changed, 32 insertions(+), 15 deletions(-) diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h index c7d2510e5a786..853c4f81ba4a5 100644 --- a/arch/arm/include/asm/topology.h +++ b/arch/arm/include/asm/topology.h @@ -13,6 +13,7 @@ #define arch_set_freq_scale topology_set_freq_scale #define arch_scale_freq_capacity topology_get_freq_scale #define arch_scale_freq_invariant topology_scale_freq_invariant +#define arch_scale_freq_ref topology_get_freq_ref #endif /* Replace task scheduler's default cpu-invariant accounting */ diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h index 9fab663dd2de9..a323b109b9c44 100644 --- a/arch/arm64/include/asm/topology.h +++ b/arch/arm64/include/asm/topology.h @@ -23,6 +23,7 @@ void update_freq_counters_refs(void); #define arch_set_freq_scale topology_set_freq_scale #define arch_scale_freq_capacity topology_get_freq_scale #define arch_scale_freq_invariant topology_scale_freq_invariant +#define arch_scale_freq_ref topology_get_freq_ref #ifdef CONFIG_ACPI_CPPC_LIB #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h index e316ab3b77f34..61183688bdd54 100644 --- a/arch/riscv/include/asm/topology.h +++ b/arch/riscv/include/asm/topology.h @@ -9,6 +9,7 @@ #define arch_set_freq_scale topology_set_freq_scale #define arch_scale_freq_capacity topology_get_freq_scale #define arch_scale_freq_invariant topology_scale_freq_invariant +#define arch_scale_freq_ref topology_get_freq_ref /* Replace task scheduler's default cpu-invariant accounting */ #define arch_scale_cpu_capacity topology_get_cpu_scale diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index b741b5ba82bd6..0c9ae5b157b17 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -19,6 +19,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -26,7 +27,8 @@ static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data); static struct cpumask scale_freq_counters_mask; static bool scale_freq_invariant; -static DEFINE_PER_CPU(u32, freq_factor) = 1; +DEFINE_PER_CPU(unsigned long, capacity_freq_ref) = 1; +EXPORT_PER_CPU_SYMBOL_GPL(capacity_freq_ref); static bool supports_scale_freq_counters(const struct cpumask *cpus) { @@ -170,9 +172,9 @@ DEFINE_PER_CPU(unsigned long, thermal_pressure); * operating on stale data when hot-plug is used for some CPUs. The * @capped_freq reflects the currently allowed max CPUs frequency due to * thermal capping. It might be also a boost frequency value, which is bigger - * than the internal 'freq_factor' max frequency. In such case the pressure - * value should simply be removed, since this is an indication that there is - * no thermal throttling. The @capped_freq must be provided in kHz. + * than the internal 'capacity_freq_ref' max frequency. In such case the + * pressure value should simply be removed, since this is an indication that + * there is no thermal throttling. The @capped_freq must be provided in kHz. */ void topology_update_thermal_pressure(const struct cpumask *cpus, unsigned long capped_freq) @@ -183,10 +185,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus, cpu = cpumask_first(cpus); max_capacity = arch_scale_cpu_capacity(cpu); - max_freq = per_cpu(freq_factor, cpu); - - /* Convert to MHz scale which is used in 'freq_factor' */ - capped_freq /= 1000; + max_freq = arch_scale_freq_ref(cpu); /* * Handle properly the boost frequencies, which should simply clean @@ -279,13 +278,13 @@ void topology_normalize_cpu_scale(void) capacity_scale = 1; for_each_possible_cpu(cpu) { - capacity = raw_capacity[cpu] * per_cpu(freq_factor, cpu); + capacity = raw_capacity[cpu] * per_cpu(capacity_freq_ref, cpu); capacity_scale = max(capacity, capacity_scale); } pr_debug("cpu_capacity: capacity_scale=%llu\n", capacity_scale); for_each_possible_cpu(cpu) { - capacity = raw_capacity[cpu] * per_cpu(freq_factor, cpu); + capacity = raw_capacity[cpu] * per_cpu(capacity_freq_ref, cpu); capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, capacity_scale); topology_set_cpu_scale(cpu, capacity); @@ -321,15 +320,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu) cpu_node, raw_capacity[cpu]); /* - * Update freq_factor for calculating early boot cpu capacities. + * Update capacity_freq_ref for calculating early boot CPU capacities. * For non-clk CPU DVFS mechanism, there's no way to get the * frequency value now, assuming they are running at the same - * frequency (by keeping the initial freq_factor value). + * frequency (by keeping the initial capacity_freq_ref value). */ cpu_clk = of_clk_get(cpu_node, 0); if (!PTR_ERR_OR_ZERO(cpu_clk)) { - per_cpu(freq_factor, cpu) = - clk_get_rate(cpu_clk) / 1000; + per_cpu(capacity_freq_ref, cpu) = + clk_get_rate(cpu_clk) / HZ_PER_KHZ; clk_put(cpu_clk); } } else { @@ -411,7 +410,7 @@ init_cpu_capacity_callback(struct notifier_block *nb, cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus); for_each_cpu(cpu, policy->related_cpus) - per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000; + per_cpu(capacity_freq_ref, cpu) = policy->cpuinfo.max_freq; if (cpumask_empty(cpus_to_visit)) { topology_normalize_cpu_scale(); diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h index a07b510e7dc55..32c24ff4f2a80 100644 --- a/include/linux/arch_topology.h +++ b/include/linux/arch_topology.h @@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu) void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity); +DECLARE_PER_CPU(unsigned long, capacity_freq_ref); + +static inline unsigned long topology_get_freq_ref(int cpu) +{ + return per_cpu(capacity_freq_ref, cpu); +} + DECLARE_PER_CPU(unsigned long, arch_freq_scale); static inline unsigned long topology_get_freq_scale(int cpu) diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h index 67b573d5bf28f..9671b7234684a 100644 --- a/include/linux/sched/topology.h +++ b/include/linux/sched/topology.h @@ -275,6 +275,14 @@ void arch_update_thermal_pressure(const struct cpumask *cpus, { } #endif +#ifndef arch_scale_freq_ref +static __always_inline +unsigned int arch_scale_freq_ref(int cpu) +{ + return 0; +} +#endif + static inline int task_node(const struct task_struct *p) { return cpu_to_node(task_cpu(p)); From 23371504f0eab2087137c5dc4a64a1e8e9ed0b0c Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Mon, 11 Dec 2023 11:48:50 +0100 Subject: [PATCH 283/548] cpufreq: Use the fixed and coherent frequency for scaling capacity cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different from the frequency that has been used to compute the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent frequency that can be used to compute the capacity for a given frequency. [ Also fix a arch_set_freq_scale() newline style wart in . ] Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Tested-by: Lukasz Luba Reviewed-by: Lukasz Luba Acked-by: Viresh Kumar Acked-by: Rafael J. Wysocki Link: https://lore.kernel.org/r/20231211104855.558096-3-vincent.guittot@linaro.org (cherry picked from commit 599457ba15403037b489fe536266a3d5f9efaed7) Signed-off-by: Koba Ko --- drivers/cpufreq/cpufreq.c | 4 ++-- include/linux/cpufreq.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index cc98d8cf54330..6b00507f9f332 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -454,7 +454,7 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy, arch_set_freq_scale(policy->related_cpus, policy->cur, - policy->cpuinfo.max_freq); + arch_scale_freq_ref(policy->cpu)); spin_lock(&policy->transition_lock); policy->transition_ongoing = false; @@ -2205,7 +2205,7 @@ unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy, policy->cur = freq; arch_set_freq_scale(policy->related_cpus, freq, - policy->cpuinfo.max_freq); + arch_scale_freq_ref(policy->cpu)); cpufreq_stats_record_transition(policy, freq); if (trace_cpu_frequency_enabled()) { diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 184a84dd467ec..bfecd9dcb5529 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -1245,6 +1245,7 @@ void arch_set_freq_scale(const struct cpumask *cpus, { } #endif + /* the following are really really optional */ extern struct freq_attr cpufreq_freq_attr_scaling_available_freqs; extern struct freq_attr cpufreq_freq_attr_scaling_boost_freqs; From 8fb34247293f414c76768ae72b25b008d17976c9 Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Mon, 11 Dec 2023 11:48:51 +0100 Subject: [PATCH 284/548] cpufreq/schedutil: Use a fixed reference frequency cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different than the one that has been used when computing the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent reference frequency that can be used when computing a frequency based on utilization. Use this arch_scale_freq_ref() when available and fallback to policy otherwise. Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Tested-by: Lukasz Luba Reviewed-by: Lukasz Luba Reviewed-by: Dietmar Eggemann Acked-by: Rafael J. Wysocki Acked-by: Viresh Kumar Link: https://lore.kernel.org/r/20231211104855.558096-4-vincent.guittot@linaro.org (cherry picked from commit b3edde44e5d4504c23a176819865cd603fd16d6c) Signed-off-by: Koba Ko --- kernel/sched/cpufreq_schedutil.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 776be0549162c..cfe7c625d2ad6 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -137,6 +137,28 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy) } } +/** + * get_capacity_ref_freq - get the reference frequency that has been used to + * correlate frequency and compute capacity for a given cpufreq policy. We use + * the CPU managing it for the arch_scale_freq_ref() call in the function. + * @policy: the cpufreq policy of the CPU in question. + * + * Return: the reference CPU frequency to compute a capacity. + */ +static __always_inline +unsigned long get_capacity_ref_freq(struct cpufreq_policy *policy) +{ + unsigned int freq = arch_scale_freq_ref(policy->cpu); + + if (freq) + return freq; + + if (arch_scale_freq_invariant()) + return policy->cpuinfo.max_freq; + + return policy->cur; +} + /** * get_next_freq - Compute a new frequency for a given cpufreq policy. * @sg_policy: schedutil policy object to compute the new frequency for. @@ -163,9 +185,9 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy, unsigned long util, unsigned long max) { struct cpufreq_policy *policy = sg_policy->policy; - unsigned int freq = arch_scale_freq_invariant() ? - policy->cpuinfo.max_freq : policy->cur; + unsigned int freq; + freq = get_capacity_ref_freq(policy); freq = map_util_freq(util, freq, max); if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update) From 2470d744cbcc3102576402b0567175acb1e431bf Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Mon, 11 Dec 2023 11:48:52 +0100 Subject: [PATCH 285/548] energy_model: Use a fixed reference frequency The last item of a performance domain is not always the performance point that has been used to compute CPU's capacity. This can lead to different target frequency compared with other part of the system like schedutil and would result in wrong energy estimation. A new arch_scale_freq_ref() is available to return a fixed and coherent frequency reference that can be used when computing the CPU's frequency for an level of utilization. Use this function to get this reference frequency. Energy model is never used without defining arch_scale_freq_ref() but can be compiled. Define a default arch_scale_freq_ref() returning 0 in such case. Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Tested-by: Lukasz Luba Reviewed-by: Lukasz Luba Link: https://lore.kernel.org/r/20231211104855.558096-5-vincent.guittot@linaro.org (cherry picked from commit 15cbbd1d317e07b4e5c6aca5d4c5579539a82784) Signed-off-by: Koba Ko --- include/linux/energy_model.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index adec808b371a1..88d91e0874718 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -224,7 +224,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, unsigned long max_util, unsigned long sum_util, unsigned long allowed_cpu_cap) { - unsigned long freq, scale_cpu; + unsigned long freq, ref_freq, scale_cpu; struct em_perf_state *ps; int cpu; @@ -241,10 +241,10 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, */ cpu = cpumask_first(to_cpumask(pd->cpus)); scale_cpu = arch_scale_cpu_capacity(cpu); - ps = &pd->table[pd->nr_perf_states - 1]; + ref_freq = arch_scale_freq_ref(cpu); max_util = min(max_util, allowed_cpu_cap); - freq = map_util_freq(max_util, ps->frequency, scale_cpu); + freq = map_util_freq(max_util, ref_freq, scale_cpu); /* * Find the lowest performance state of the Energy Model above the From 53e463397e110ac5a4e052c296b06755dbda6c4a Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Mon, 11 Dec 2023 11:48:54 +0100 Subject: [PATCH 286/548] cpufreq/cppc: Set the frequency used for computing the capacity Save the frequency associated to the performance that has been used when initializing the capacity of CPUs. Also, cppc cpufreq driver can register an artificial energy model. In such case, it needs the frequency for this compute capacity. Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Tested-by: Pierre Gondois Acked-by: Sudeep Holla Acked-by: Viresh Kumar Link: https://lore.kernel.org/r/20231211104855.558096-7-vincent.guittot@linaro.org (cherry picked from commit 5477fa249b56c59c3baa1b237bf083cffa64c84a) Signed-off-by: Koba Ko --- drivers/base/arch_topology.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 0c9ae5b157b17..1aa76b5c96c24 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -349,6 +349,7 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu) void topology_init_cpu_capacity_cppc(void) { + u64 capacity, capacity_scale = 0; struct cppc_perf_caps perf_caps; int cpu; @@ -365,6 +366,10 @@ void topology_init_cpu_capacity_cppc(void) (perf_caps.highest_perf >= perf_caps.nominal_perf) && (perf_caps.highest_perf >= perf_caps.lowest_perf)) { raw_capacity[cpu] = perf_caps.highest_perf; + capacity_scale = max_t(u64, capacity_scale, raw_capacity[cpu]); + + per_cpu(capacity_freq_ref, cpu) = cppc_perf_to_khz(&perf_caps, raw_capacity[cpu]); + pr_debug("cpu_capacity: CPU%d cpu_capacity=%u (raw).\n", cpu, raw_capacity[cpu]); continue; @@ -375,7 +380,15 @@ void topology_init_cpu_capacity_cppc(void) goto exit; } - topology_normalize_cpu_scale(); + for_each_possible_cpu(cpu) { + capacity = raw_capacity[cpu]; + capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, + capacity_scale); + topology_set_cpu_scale(cpu, capacity); + pr_debug("cpu_capacity: CPU%d cpu_capacity=%lu\n", + cpu, topology_get_cpu_scale(cpu)); + } + schedule_work(&update_topology_flags_work); pr_debug("cpu_capacity: cpu_capacity initialization done\n"); From 6753446e3aa9cb49f45664c4933b649528b71dea Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Mon, 11 Dec 2023 11:48:55 +0100 Subject: [PATCH 287/548] arm64/amu: Use capacity_ref_freq() to set AMU ratio Use the new capacity_ref_freq() method to set the ratio that is used by AMU for computing the arch_scale_freq_capacity(). This helps to keep everything aligned using the same reference for computing CPUs capacity. The default value of the ratio (stored in per_cpu(arch_max_freq_scale)) ensures that arch_scale_freq_capacity() returns max capacity until it is set to its correct value with the cpu capacity and capacity_ref_freq(). Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Acked-by: Sudeep Holla Acked-by: Will Deacon Link: https://lore.kernel.org/r/20231211104855.558096-8-vincent.guittot@linaro.org (cherry picked from commit 1f023007f5e782bda19ad9104830c404fd622c5d) Signed-off-by: Koba Ko --- arch/arm64/kernel/topology.c | 26 +++++++++++++------------- drivers/base/arch_topology.c | 12 +++++++++++- include/linux/arch_topology.h | 1 + 3 files changed, 25 insertions(+), 14 deletions(-) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 817d788cd8666..1a2c72f3e7f80 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -82,7 +82,12 @@ int __init parse_acpi_topology(void) #undef pr_fmt #define pr_fmt(fmt) "AMU: " fmt -static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale); +/* + * Ensure that amu_scale_freq_tick() will return SCHED_CAPACITY_SCALE until + * the CPU capacity and its associated frequency have been correctly + * initialized. + */ +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale) = 1UL << (2 * SCHED_CAPACITY_SHIFT); static DEFINE_PER_CPU(u64, arch_const_cycles_prev); static DEFINE_PER_CPU(u64, arch_core_cycles_prev); static cpumask_var_t amu_fie_cpus; @@ -112,14 +117,14 @@ static inline bool freq_counters_valid(int cpu) return true; } -static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) +void freq_inv_set_max_ratio(int cpu, u64 max_rate) { - u64 ratio; + u64 ratio, ref_rate = arch_timer_get_rate(); if (unlikely(!max_rate || !ref_rate)) { - pr_debug("CPU%d: invalid maximum or reference frequency.\n", + WARN_ONCE(1, "CPU%d: invalid maximum or reference frequency.\n", cpu); - return -EINVAL; + return; } /* @@ -139,12 +144,10 @@ static int freq_inv_set_max_ratio(int cpu, u64 max_rate, u64 ref_rate) ratio = div64_u64(ratio, max_rate); if (!ratio) { WARN_ONCE(1, "Reference frequency too low.\n"); - return -EINVAL; + return; } - per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio; - - return 0; + WRITE_ONCE(per_cpu(arch_max_freq_scale, cpu), (unsigned long)ratio); } static void amu_scale_freq_tick(void) @@ -195,10 +198,7 @@ static void amu_fie_setup(const struct cpumask *cpus) return; for_each_cpu(cpu, cpus) { - if (!freq_counters_valid(cpu) || - freq_inv_set_max_ratio(cpu, - cpufreq_get_hw_max_freq(cpu) * 1000ULL, - arch_timer_get_rate())) + if (!freq_counters_valid(cpu)) return; } diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 1aa76b5c96c24..5aaa0865625d0 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -344,6 +344,10 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu) return !ret; } +void __weak freq_inv_set_max_ratio(int cpu, u64 max_rate) +{ +} + #ifdef CONFIG_ACPI_CPPC_LIB #include @@ -381,6 +385,9 @@ void topology_init_cpu_capacity_cppc(void) } for_each_possible_cpu(cpu) { + freq_inv_set_max_ratio(cpu, + per_cpu(capacity_freq_ref, cpu) * HZ_PER_KHZ); + capacity = raw_capacity[cpu]; capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, capacity_scale); @@ -422,8 +429,11 @@ init_cpu_capacity_callback(struct notifier_block *nb, cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus); - for_each_cpu(cpu, policy->related_cpus) + for_each_cpu(cpu, policy->related_cpus) { per_cpu(capacity_freq_ref, cpu) = policy->cpuinfo.max_freq; + freq_inv_set_max_ratio(cpu, + per_cpu(capacity_freq_ref, cpu) * HZ_PER_KHZ); + } if (cpumask_empty(cpus_to_visit)) { topology_normalize_cpu_scale(); diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h index 32c24ff4f2a80..a63d61ca55afc 100644 --- a/include/linux/arch_topology.h +++ b/include/linux/arch_topology.h @@ -99,6 +99,7 @@ void update_siblings_masks(unsigned int cpu); void remove_cpu_topology(unsigned int cpuid); void reset_cpu_topology(void); int parse_acpi_topology(void); +void freq_inv_set_max_ratio(int cpu, u64 max_rate); #endif #endif /* _LINUX_ARCH_TOPOLOGY_H_ */ From 85c821c6afcdcd154341e4fb6766fa9312b5cc4a Mon Sep 17 00:00:00 2001 From: Beata Michalska Date: Fri, 31 Jan 2025 15:58:42 +0000 Subject: [PATCH 288/548] arm64: amu: Delay allocating cpumask for AMU FIE support For the time being, the amu_fie_cpus cpumask is being exclusively used by the AMU-related internals of FIE support and is guaranteed to be valid on every access currently made. Still the mask is not being invalidated on one of the error handling code paths, which leaves a soft spot with theoretical risk of UAF for CPUMASK_OFFSTACK cases. To make things sound, delay allocating said cpumask (for CPUMASK_OFFSTACK) avoiding otherwise nasty sanitising case failing to register the cpufreq policy notifications. Signed-off-by: Beata Michalska Reviewed-by: Prasanna Kumar T S M Reviewed-by: Sumit Gupta Reviewed-by: Sudeep Holla Link: https://lore.kernel.org/r/20250131155842.3839098-1-beata.michalska@arm.com Signed-off-by: Will Deacon (cherry picked from commit d923782b041218ef3804b2fed87619b5b1a497f3) Signed-off-by: Koba Ko --- arch/arm64/kernel/topology.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 1a2c72f3e7f80..cb180684d10d5 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -194,12 +194,19 @@ static void amu_fie_setup(const struct cpumask *cpus) int cpu; /* We are already set since the last insmod of cpufreq driver */ - if (unlikely(cpumask_subset(cpus, amu_fie_cpus))) + if (cpumask_available(amu_fie_cpus) && + unlikely(cpumask_subset(cpus, amu_fie_cpus))) return; - for_each_cpu(cpu, cpus) { + for_each_cpu(cpu, cpus) if (!freq_counters_valid(cpu)) return; + + if (!cpumask_available(amu_fie_cpus) && + !zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) { + WARN_ONCE(1, "Failed to allocate FIE cpumask for CPUs[%*pbl]\n", + cpumask_pr_args(cpus)); + return; } cpumask_or(amu_fie_cpus, amu_fie_cpus, cpus); @@ -237,17 +244,8 @@ static struct notifier_block init_amu_fie_notifier = { static int __init init_amu_fie(void) { - int ret; - - if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) - return -ENOMEM; - - ret = cpufreq_register_notifier(&init_amu_fie_notifier, + return cpufreq_register_notifier(&init_amu_fie_notifier, CPUFREQ_POLICY_NOTIFIER); - if (ret) - free_cpumask_var(amu_fie_cpus); - - return ret; } core_initcall(init_amu_fie); From 48fa4809b1a5253f06e13d246737c5b56f3ea81b Mon Sep 17 00:00:00 2001 From: Beata Michalska Date: Fri, 31 Jan 2025 16:24:36 +0000 Subject: [PATCH 289/548] cpufreq: Allow arch_freq_get_on_cpu to return an error Allow arch_freq_get_on_cpu to return an error for cases when retrieving current CPU frequency is not possible, whether that being due to lack of required arch support or due to other circumstances when the current frequency cannot be determined at given point of time. Signed-off-by: Beata Michalska Reviewed-by: Prasanna Kumar T S M Acked-by: Viresh Kumar Acked-by: Rafael J. Wysocki Link: https://lore.kernel.org/r/20250131162439.3843071-2-beata.michalska@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit 38e480d4fcac2e191e2f24f381aa8957865c49b2) Signed-off-by: Koba Ko --- arch/x86/kernel/cpu/aperfmperf.c | 2 +- arch/x86/kernel/cpu/proc.c | 7 +++++-- drivers/cpufreq/cpufreq.c | 8 ++++---- include/linux/cpufreq.h | 2 +- 4 files changed, 11 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c index fdbb5f07448fa..ab646aee6387e 100644 --- a/arch/x86/kernel/cpu/aperfmperf.c +++ b/arch/x86/kernel/cpu/aperfmperf.c @@ -411,7 +411,7 @@ void arch_scale_freq_tick(void) */ #define MAX_SAMPLE_AGE ((unsigned long)HZ / 50) -unsigned int arch_freq_get_on_cpu(int cpu) +int arch_freq_get_on_cpu(int cpu) { struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu); unsigned int seq, freq; diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c index 31c0e68f62272..224ee8ba1edf1 100644 --- a/arch/x86/kernel/cpu/proc.c +++ b/arch/x86/kernel/cpu/proc.c @@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v) seq_printf(m, "microcode\t: 0x%x\n", c->microcode); if (cpu_has(c, X86_FEATURE_TSC)) { - unsigned int freq = arch_freq_get_on_cpu(cpu); + int freq = arch_freq_get_on_cpu(cpu); - seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000)); + if (freq < 0) + seq_puts(m, "cpu MHz\t\t: Unknown\n"); + else + seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000)); } /* Cache size */ diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 6b00507f9f332..a9e71a86404ec 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -755,18 +755,18 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency); show_one(scaling_min_freq, min); show_one(scaling_max_freq, max); -__weak unsigned int arch_freq_get_on_cpu(int cpu) +__weak int arch_freq_get_on_cpu(int cpu) { - return 0; + return -EOPNOTSUPP; } static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf) { ssize_t ret; - unsigned int freq; + int freq; freq = arch_freq_get_on_cpu(policy->cpu); - if (freq) + if (freq > 0) ret = sprintf(buf, "%u\n", freq); else if (cpufreq_driver->setpolicy && cpufreq_driver->get) ret = sprintf(buf, "%u\n", cpufreq_driver->get(policy->cpu)); diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index bfecd9dcb5529..cfaa642bdb61a 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -1235,7 +1235,7 @@ static inline void sched_cpufreq_governor_change(struct cpufreq_policy *policy, struct cpufreq_governor *old_gov) { } #endif -extern unsigned int arch_freq_get_on_cpu(int cpu); +extern int arch_freq_get_on_cpu(int cpu); #ifndef arch_set_freq_scale static __always_inline From 7153cc719b2470e9155e22d2cf57fe871841997d Mon Sep 17 00:00:00 2001 From: Beata Michalska Date: Fri, 31 Jan 2025 16:24:37 +0000 Subject: [PATCH 290/548] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry Currently the CPUFreq core exposes two sysfs attributes that can be used to query current frequency of a given CPU(s): namely cpuinfo_cur_freq and scaling_cur_freq. Both provide slightly different view on the subject and they do come with their own drawbacks. cpuinfo_cur_freq provides higher precision though at a cost of being rather expensive. Moreover, the information retrieved via this attribute is somewhat short lived as frequency can change at any point of time making it difficult to reason from. scaling_cur_freq, on the other hand, tends to be less accurate but then the actual level of precision (and source of information) varies between architectures making it a bit ambiguous. The new attribute, cpuinfo_avg_freq, is intended to provide more stable, distinct interface, exposing an average frequency of a given CPU(s), as reported by the hardware, over a time frame spanning no more than a few milliseconds. As it requires appropriate hardware support, this interface is optional. Note that under the hood, the new attribute relies on the information provided by arch_freq_get_on_cpu, which, up to this point, has been feeding data for scaling_cur_freq attribute, being the source of ambiguity when it comes to interpretation. This has been amended by restoring the intended behavior for scaling_cur_freq, with a new dedicated config option to maintain status quo for those, who may need it. CC: Jonathan Corbet CC: Thomas Gleixner CC: Ingo Molnar CC: Borislav Petkov CC: Dave Hansen CC: H. Peter Anvin CC: Phil Auld CC: x86@kernel.org CC: linux-doc@vger.kernel.org Signed-off-by: Beata Michalska Reviewed-by: Prasanna Kumar T S M Reviewed-by: Sumit Gupta Acked-by: Viresh Kumar Acked-by: Rafael J. Wysocki Link: https://lore.kernel.org/r/20250131162439.3843071-3-beata.michalska@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit fbb4a4759b541d09ebb8e391d5fa7f9a5a0cad61) Signed-off-by: Koba Ko --- Documentation/admin-guide/pm/cpufreq.rst | 17 +++++++++++++- drivers/cpufreq/Kconfig.x86 | 12 ++++++++++ drivers/cpufreq/cpufreq.c | 30 +++++++++++++++++++++++- 3 files changed, 57 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst index 6adb7988e0eb6..43e6cba1c84f0 100644 --- a/Documentation/admin-guide/pm/cpufreq.rst +++ b/Documentation/admin-guide/pm/cpufreq.rst @@ -248,6 +248,20 @@ are the following: If that frequency cannot be determined, this attribute should not be present. +``cpuinfo_avg_freq`` + An average frequency (in KHz) of all CPUs belonging to a given policy, + derived from a hardware provided feedback and reported on a time frame + spanning at most few milliseconds. + + This is expected to be based on the frequency the hardware actually runs + at and, as such, might require specialised hardware support (such as AMU + extension on ARM). If one cannot be determined, this attribute should + not be present. + + Note, that failed attempt to retrieve current frequency for a given + CPU(s) will result in an appropriate error, i.e: EAGAIN for CPU that + remains idle (raised on ARM). + ``cpuinfo_max_freq`` Maximum possible operating frequency the CPUs belonging to this policy can run at (in kHz). @@ -289,7 +303,8 @@ are the following: Some architectures (e.g. ``x86``) may attempt to provide information more precisely reflecting the current CPU frequency through this attribute, but that still may not be the exact current CPU frequency as - seen by the hardware at the moment. + seen by the hardware at the moment. This behavior though, is only + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option. ``scaling_driver`` The scaling driver currently in use. diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86 index 438c9e75a04dc..16763364a80a0 100644 --- a/drivers/cpufreq/Kconfig.x86 +++ b/drivers/cpufreq/Kconfig.x86 @@ -339,3 +339,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK option lets the probing code bypass some of those checks if the parameter "relaxed_check=1" is passed to the module. +config CPUFREQ_ARCH_CUR_FREQ + default y + bool "Current frequency derived from HW provided feedback" + help + This determines whether the scaling_cur_freq sysfs attribute returns + the last requested frequency or a more precise value based on hardware + provided feedback (as architected counters). + Given that a more precise frequency can now be provided via the + cpuinfo_avg_freq attribute, by enabling this option, + scaling_cur_freq maintains the provision of a counter based frequency, + for compatibility reasons. + diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index a9e71a86404ec..890f1255ca75b 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -760,12 +760,20 @@ __weak int arch_freq_get_on_cpu(int cpu) return -EOPNOTSUPP; } +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy) +{ + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP; +} + static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf) { ssize_t ret; int freq; - freq = arch_freq_get_on_cpu(policy->cpu); + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ) + ? arch_freq_get_on_cpu(policy->cpu) + : 0; + if (freq > 0) ret = sprintf(buf, "%u\n", freq); else if (cpufreq_driver->setpolicy && cpufreq_driver->get) @@ -810,6 +818,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy, return sprintf(buf, "\n"); } +/* + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware + */ +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy, + char *buf) +{ + int avg_freq = arch_freq_get_on_cpu(policy->cpu); + + if (avg_freq > 0) + return sysfs_emit(buf, "%u\n", avg_freq); + return avg_freq != 0 ? avg_freq : -EINVAL; +} + /* * show_scaling_governor - show the current policy for the specified CPU */ @@ -973,6 +994,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf) } cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400); +cpufreq_freq_attr_ro(cpuinfo_avg_freq); cpufreq_freq_attr_ro(cpuinfo_min_freq); cpufreq_freq_attr_ro(cpuinfo_max_freq); cpufreq_freq_attr_ro(cpuinfo_transition_latency); @@ -1100,6 +1122,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy) return ret; } + if (cpufreq_avg_freq_supported(policy)) { + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr); + if (ret) + return ret; + } + ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr); if (ret) return ret; From e1feb8697fdd20987d4f0153b2e4029bd9f7a6e0 Mon Sep 17 00:00:00 2001 From: Beata Michalska Date: Fri, 31 Jan 2025 16:24:38 +0000 Subject: [PATCH 291/548] arm64: Provide an AMU-based version of arch_freq_get_on_cpu With the Frequency Invariance Engine (FIE) being already wired up with sched tick and making use of relevant (core counter and constant counter) AMU counters, getting the average frequency for a given CPU, can be achieved by utilizing the frequency scale factor which reflects an average CPU frequency for the last tick period length. The solution is partially based on APERF/MPERF implementation of arch_freq_get_on_cpu. Suggested-by: Ionela Voinescu Signed-off-by: Beata Michalska Reviewed-by: Prasanna Kumar T S M Reviewed-by: Sumit Gupta Link: https://lore.kernel.org/r/20250131162439.3843071-4-beata.michalska@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit 16d1e27475f673295f3eaac296dd835db3b8c4d4) Signed-off-by: Koba Ko --- arch/arm64/kernel/topology.c | 111 +++++++++++++++++++++++++++++++---- 1 file changed, 101 insertions(+), 10 deletions(-) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index cb180684d10d5..28fb004d3103c 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -88,18 +89,28 @@ int __init parse_acpi_topology(void) * initialized. */ static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale) = 1UL << (2 * SCHED_CAPACITY_SHIFT); -static DEFINE_PER_CPU(u64, arch_const_cycles_prev); -static DEFINE_PER_CPU(u64, arch_core_cycles_prev); static cpumask_var_t amu_fie_cpus; +struct amu_cntr_sample { + u64 arch_const_cycles_prev; + u64 arch_core_cycles_prev; + unsigned long last_scale_update; +}; + +static DEFINE_PER_CPU_SHARED_ALIGNED(struct amu_cntr_sample, cpu_amu_samples); + void update_freq_counters_refs(void) { - this_cpu_write(arch_core_cycles_prev, read_corecnt()); - this_cpu_write(arch_const_cycles_prev, read_constcnt()); + struct amu_cntr_sample *amu_sample = this_cpu_ptr(&cpu_amu_samples); + + amu_sample->arch_core_cycles_prev = read_corecnt(); + amu_sample->arch_const_cycles_prev = read_constcnt(); } static inline bool freq_counters_valid(int cpu) { + struct amu_cntr_sample *amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu); + if ((cpu >= nr_cpu_ids) || !cpumask_test_cpu(cpu, cpu_present_mask)) return false; @@ -108,8 +119,8 @@ static inline bool freq_counters_valid(int cpu) return false; } - if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) || - !per_cpu(arch_core_cycles_prev, cpu))) { + if (unlikely(!amu_sample->arch_const_cycles_prev || + !amu_sample->arch_core_cycles_prev)) { pr_debug("CPU%d: cycle counters are not enabled.\n", cpu); return false; } @@ -152,17 +163,22 @@ void freq_inv_set_max_ratio(int cpu, u64 max_rate) static void amu_scale_freq_tick(void) { + struct amu_cntr_sample *amu_sample = this_cpu_ptr(&cpu_amu_samples); u64 prev_core_cnt, prev_const_cnt; u64 core_cnt, const_cnt, scale; - prev_const_cnt = this_cpu_read(arch_const_cycles_prev); - prev_core_cnt = this_cpu_read(arch_core_cycles_prev); + prev_const_cnt = amu_sample->arch_const_cycles_prev; + prev_core_cnt = amu_sample->arch_core_cycles_prev; update_freq_counters_refs(); - const_cnt = this_cpu_read(arch_const_cycles_prev); - core_cnt = this_cpu_read(arch_core_cycles_prev); + const_cnt = amu_sample->arch_const_cycles_prev; + core_cnt = amu_sample->arch_core_cycles_prev; + /* + * This should not happen unless the AMUs have been reset and the + * counter values have not been restored - unlikely + */ if (unlikely(core_cnt <= prev_core_cnt || const_cnt <= prev_const_cnt)) return; @@ -182,6 +198,8 @@ static void amu_scale_freq_tick(void) scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE); this_cpu_write(arch_freq_scale, (unsigned long)scale); + + amu_sample->last_scale_update = jiffies; } static struct scale_freq_data amu_sfd = { @@ -189,6 +207,79 @@ static struct scale_freq_data amu_sfd = { .set_freq_scale = amu_scale_freq_tick, }; +static __always_inline bool amu_fie_cpu_supported(unsigned int cpu) +{ + return cpumask_available(amu_fie_cpus) && + cpumask_test_cpu(cpu, amu_fie_cpus); +} + +#define AMU_SAMPLE_EXP_MS 20 + +int arch_freq_get_on_cpu(int cpu) +{ + struct amu_cntr_sample *amu_sample; + unsigned int start_cpu = cpu; + unsigned long last_update; + unsigned int freq = 0; + u64 scale; + + if (!amu_fie_cpu_supported(cpu) || !arch_scale_freq_ref(cpu)) + return -EOPNOTSUPP; + + while (1) { + + amu_sample = per_cpu_ptr(&cpu_amu_samples, cpu); + + last_update = amu_sample->last_scale_update; + + /* + * For those CPUs that are in full dynticks mode, or those that have + * not seen tick for a while, try an alternative source for the counters + * (and thus freq scale), if available, for given policy: this boils + * down to identifying an active cpu within the same freq domain, if any. + */ + if (!housekeeping_cpu(cpu, HK_TYPE_TICK) || + time_is_before_jiffies(last_update + msecs_to_jiffies(AMU_SAMPLE_EXP_MS))) { + struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); + int ref_cpu = cpu; + + if (!policy) + return -EINVAL; + + if (!cpumask_intersects(policy->related_cpus, + housekeeping_cpumask(HK_TYPE_TICK))) { + cpufreq_cpu_put(policy); + return -EOPNOTSUPP; + } + + do { + ref_cpu = cpumask_next_wrap(ref_cpu, policy->cpus, + start_cpu, true); + + } while (ref_cpu < nr_cpu_ids && idle_cpu(ref_cpu)); + + cpufreq_cpu_put(policy); + + if (ref_cpu >= nr_cpu_ids) + /* No alternative to pull info from */ + return -EAGAIN; + + cpu = ref_cpu; + } else { + break; + } + } + /* + * Reversed computation to the one used to determine + * the arch_freq_scale value + * (see amu_scale_freq_tick for details) + */ + scale = arch_scale_freq_capacity(cpu); + freq = scale * arch_scale_freq_ref(cpu); + freq >>= SCHED_CAPACITY_SHIFT; + return freq; +} + static void amu_fie_setup(const struct cpumask *cpus) { int cpu; From cc25bd4be114df8c4bccc56105ab85f82747841c Mon Sep 17 00:00:00 2001 From: Beata Michalska Date: Fri, 31 Jan 2025 16:24:39 +0000 Subject: [PATCH 292/548] arm64: Update AMU-based freq scale factor on entering idle Now that the frequency scale factor has been activated for retrieving current frequency on a given CPU, trigger its update upon entering idle. This will, to an extent, allow querying last known frequency in a non-invasive way. It will also improve the frequency scale factor accuracy when a CPU entering idle did not receive a tick for a while. As a consequence, for idle cores, the reported frequency will be the last one observed before entering the idle state. Suggested-by: Vanshidhar Konda Signed-off-by: Beata Michalska Reviewed-by: Prasanna Kumar T S M Reviewed-by: Sumit Gupta Link: https://lore.kernel.org/r/20250131162439.3843071-5-beata.michalska@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit 39b19974982e03bd7b950f33bc0855385845c9fb) Signed-off-by: Koba Ko --- arch/arm64/kernel/topology.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 28fb004d3103c..a09b0551ec59f 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -213,6 +213,19 @@ static __always_inline bool amu_fie_cpu_supported(unsigned int cpu) cpumask_test_cpu(cpu, amu_fie_cpus); } +void arch_cpu_idle_enter(void) +{ + unsigned int cpu = smp_processor_id(); + + if (!amu_fie_cpu_supported(cpu)) + return; + + /* Kick in AMU update but only if one has not happened already */ + if (housekeeping_cpu(cpu, HK_TYPE_TICK) && + time_is_before_jiffies(per_cpu(cpu_amu_samples.last_scale_update, cpu))) + amu_scale_freq_tick(); +} + #define AMU_SAMPLE_EXP_MS 20 int arch_freq_get_on_cpu(int cpu) From c0f24381b71376fca5b1b2539bb3dcf71e34159c Mon Sep 17 00:00:00 2001 From: Beata Michalska Date: Thu, 20 Feb 2025 09:10:15 +0000 Subject: [PATCH 293/548] arm64: Utilize for_each_cpu_wrap for reference lookup While searching for a reference CPU within a given policy, arch_freq_get_on_cpu relies on cpumask_next_wrap to iterate over all available CPUs and to ensure each is verified only once. Recent changes to cpumask_next_wrap will handle the latter no more, so switching to for_each_cpu_wrap, which preserves expected behavior while ensuring compatibility with the updates. Not to mention that when iterating over each CPU, using a dedicated iterator is preferable to an open-coded loop. Fixes: 16d1e27475f6 ("arm64: Provide an AMU-based version of arch_freq_get_on_cpu") Signed-off-by: Beata Michalska Link: https://lore.kernel.org/r/20250220091015.2319901-1-beata.michalska@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit 20711efa91e8ba44149f5e2ed1cf81e5355650e5) Signed-off-by: Koba Ko --- arch/arm64/kernel/topology.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index a09b0551ec59f..9e3583720668a 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -254,7 +254,7 @@ int arch_freq_get_on_cpu(int cpu) if (!housekeeping_cpu(cpu, HK_TYPE_TICK) || time_is_before_jiffies(last_update + msecs_to_jiffies(AMU_SAMPLE_EXP_MS))) { struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); - int ref_cpu = cpu; + int ref_cpu; if (!policy) return -EINVAL; @@ -265,11 +265,15 @@ int arch_freq_get_on_cpu(int cpu) return -EOPNOTSUPP; } - do { - ref_cpu = cpumask_next_wrap(ref_cpu, policy->cpus, - start_cpu, true); - - } while (ref_cpu < nr_cpu_ids && idle_cpu(ref_cpu)); + for_each_cpu_wrap(ref_cpu, policy->cpus, cpu + 1) { + if (ref_cpu == start_cpu) { + /* Prevent verifying same CPU twice */ + ref_cpu = nr_cpu_ids; + break; + } + if (!idle_cpu(ref_cpu)) + break; + } cpufreq_cpu_put(policy); From 4e4e00606e964451ace1eee9e3ff1a3586378c6e Mon Sep 17 00:00:00 2001 From: Ionela Voinescu Date: Tue, 27 Aug 2024 16:48:18 +0100 Subject: [PATCH 294/548] arch_topology: init capacity_freq_ref to 0 It's useful to have capacity_freq_ref initialized to 0 for users of arch_scale_freq_ref() to detect when capacity_freq_ref was not yet set. The only scenario affected by this change in the init value is when a cpufreq driver is never loaded. As a result, the only setter of a cpu scale factor remains the call of topology_normalize_cpu_scale() from parse_dt_topology(). There we cannot use the value 0 of capacity_freq_ref so we have to compensate for its uninitialized state. Signed-off-by: Ionela Voinescu Signed-off-by: Beata Michalska Reviewed-by: Vincent Guittot Reviewed-by: Sudeep Holla Link: https://lore.kernel.org/r/20240827154818.1195849-1-ionela.voinescu@arm.com Signed-off-by: Catalin Marinas (cherry picked from commit 004b500a9031973ba0d05edca860193f9a5938df) Signed-off-by: Koba Ko --- drivers/base/arch_topology.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c index 5aaa0865625d0..168b937e5097b 100644 --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -27,7 +27,7 @@ static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data); static struct cpumask scale_freq_counters_mask; static bool scale_freq_invariant; -DEFINE_PER_CPU(unsigned long, capacity_freq_ref) = 1; +DEFINE_PER_CPU(unsigned long, capacity_freq_ref) = 0; EXPORT_PER_CPU_SYMBOL_GPL(capacity_freq_ref); static bool supports_scale_freq_counters(const struct cpumask *cpus) @@ -278,13 +278,15 @@ void topology_normalize_cpu_scale(void) capacity_scale = 1; for_each_possible_cpu(cpu) { - capacity = raw_capacity[cpu] * per_cpu(capacity_freq_ref, cpu); + capacity = raw_capacity[cpu] * + (per_cpu(capacity_freq_ref, cpu) ?: 1); capacity_scale = max(capacity, capacity_scale); } pr_debug("cpu_capacity: capacity_scale=%llu\n", capacity_scale); for_each_possible_cpu(cpu) { - capacity = raw_capacity[cpu] * per_cpu(capacity_freq_ref, cpu); + capacity = raw_capacity[cpu] * + (per_cpu(capacity_freq_ref, cpu) ?: 1); capacity = div64_u64(capacity << SCHED_CAPACITY_SHIFT, capacity_scale); topology_set_cpu_scale(cpu, capacity); From 969865a64504fdbf9a3238cf5b0a2b6d74e88024 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:34 +0800 Subject: [PATCH 295/548] coresight-tpdm: Remove the unnecessary lock Remove the unnecessary lock "CS_{UN,}LOCK" in TPDM driver. This lock is only needed while writing the data to Coresight registers. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-2-git-send-email-quic_taozha@quicinc.com (cherry picked from commit f4443ee5a38cb84bdd0515f8832117b2be0684d6) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-tpdm.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index f4854af0431e1..b6456120b76a1 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -114,11 +114,9 @@ static void tpdm_init_default_data(struct tpdm_drvdata *drvdata) { u32 pidr; - CS_UNLOCK(drvdata->base); /* Get the datasets present on the TPDM. */ pidr = readl_relaxed(drvdata->base + CORESIGHT_PERIPHIDR0); drvdata->datasets |= pidr & GENMASK(TPDM_DATASETS - 1, 0); - CS_LOCK(drvdata->base); } /* From 8ec1c171e7a7a5e79320fe69c7b1f10c568f844f Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:35 +0800 Subject: [PATCH 296/548] dt-bindings: arm: Add support for DSB element size Add property "qcom,dsb-elem-size" to support DSB(Discrete Single Bit) element for TPDM. The associated aggregator will read this size before it is enabled. DSB element size currently only supports 32-bit and 64-bit. Signed-off-by: Tao Zhang Acked-by: Rob Herring Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-3-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 2a8d9b371566e798421ef877c5757e2c4a11ad6f) Signed-off-by: Koba Ko --- .../devicetree/bindings/arm/qcom,coresight-tpdm.yaml | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml index 0b4afb4078ef8..ee99768080128 100644 --- a/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml @@ -43,6 +43,14 @@ properties: reg: maxItems: 1 + qcom,dsb-element-size: + description: + Specifies the DSB(Discrete Single Bit) element size supported by + the monitor. The associated aggregator will read this size before it + is enabled. DSB element size currently only supports 32-bit and 64-bit. + $ref: /schemas/types.yaml#/definitions/uint8 + enum: [32, 64] + clocks: maxItems: 1 @@ -76,6 +84,8 @@ examples: compatible = "qcom,coresight-tpdm", "arm,primecell"; reg = <0x0684c000 0x1000>; + qcom,dsb-element-size = /bits/ 8 <32>; + clocks = <&aoss_qmp>; clock-names = "apb_pclk"; From f5054a2f925ed5db3781ae5d954a9abd4383fd95 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:36 +0800 Subject: [PATCH 297/548] coresight-tpdm: Introduce TPDM subtype to TPDM driver Introduce the new subtype of "CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM" for TPDM components in driver. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-4-git-send-email-quic_taozha@quicinc.com (cherry picked from commit f7f965c982f7954b46db910146a7ffe0fe1eb5e1) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 3 +++ drivers/hwtracing/coresight/coresight-tpdm.c | 2 +- include/linux/coresight.h | 1 + 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 3b57851869eaa..37f7a331094f7 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -1110,6 +1110,7 @@ static int coresight_validate_source(struct coresight_device *csdev, if (subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_PROC && subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE && + subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM && subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS) { dev_err(&csdev->dev, "wrong device subtype in %s\n", function); return -EINVAL; @@ -1179,6 +1180,7 @@ int coresight_enable(struct coresight_device *csdev) per_cpu(tracer_path, cpu) = path; break; case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE: + case CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM: case CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS: /* * Use the hash of source's device name as ID @@ -1229,6 +1231,7 @@ void coresight_disable(struct coresight_device *csdev) per_cpu(tracer_path, cpu) = NULL; break; case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE: + case CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM: case CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS: hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev))); /* Find the path by the hash. */ diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index b6456120b76a1..abaff0b934dbd 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -203,7 +203,7 @@ static int tpdm_probe(struct amba_device *adev, const struct amba_id *id) if (!desc.name) return -ENOMEM; desc.type = CORESIGHT_DEV_TYPE_SOURCE; - desc.subtype.source_subtype = CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS; + desc.subtype.source_subtype = CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM; desc.ops = &tpdm_cs_ops; desc.pdata = adev->dev.platform_data; desc.dev = &adev->dev; diff --git a/include/linux/coresight.h b/include/linux/coresight.h index dccfadde84f41..4c9ff95c2c87f 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -64,6 +64,7 @@ enum coresight_dev_subtype_source { CORESIGHT_DEV_SUBTYPE_SOURCE_PROC, CORESIGHT_DEV_SUBTYPE_SOURCE_BUS, CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE, + CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM, CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS, }; From 421c5df72d7e08186083c04cac5cecdd199456bf Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:37 +0800 Subject: [PATCH 298/548] coresight-tpda: Add DSB dataset support Read the DSB element size from the device tree. Set the register bit that controls the DSB element size of the corresponding port. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-5-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 57e7235aa1d11d4ea8a25dfdc009b3ee463763af) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-tpda.c | 126 +++++++++++++++++-- drivers/hwtracing/coresight/coresight-tpda.h | 2 + 2 files changed, 118 insertions(+), 10 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-tpda.c b/drivers/hwtracing/coresight/coresight-tpda.c index 8d2b9d29237d4..5f82737c37bba 100644 --- a/drivers/hwtracing/coresight/coresight-tpda.c +++ b/drivers/hwtracing/coresight/coresight-tpda.c @@ -21,6 +21,80 @@ DEFINE_CORESIGHT_DEVLIST(tpda_devs, "tpda"); +static bool coresight_device_is_tpdm(struct coresight_device *csdev) +{ + return (csdev->type == CORESIGHT_DEV_TYPE_SOURCE) && + (csdev->subtype.source_subtype == + CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM); +} + +/* + * Read the DSB element size from the TPDM device + * Returns + * The dsb element size read from the devicetree if available. + * 0 - Otherwise, with a warning once. + */ +static int tpdm_read_dsb_element_size(struct coresight_device *csdev) +{ + int rc = 0; + u8 size = 0; + + rc = fwnode_property_read_u8(dev_fwnode(csdev->dev.parent), + "qcom,dsb-element-size", &size); + if (rc) + dev_warn_once(&csdev->dev, + "Failed to read TPDM DSB Element size: %d\n", rc); + + return size; +} + +/* + * Search and read element data size from the TPDM node in + * the devicetree. Each input port of TPDA is connected to + * a TPDM. Different TPDM supports different types of dataset, + * and some may support more than one type of dataset. + * Parameter "inport" is used to pass in the input port number + * of TPDA, and it is set to -1 in the recursize call. + */ +static int tpda_get_element_size(struct coresight_device *csdev, + int inport) +{ + int dsb_size = -ENOENT; + int i, size; + struct coresight_device *in; + + for (i = 0; i < csdev->pdata->nr_inconns; i++) { + in = csdev->pdata->in_conns[i]->src_dev; + if (!in) + continue; + + /* Ignore the paths that do not match port */ + if (inport > 0 && + csdev->pdata->in_conns[i]->dest_port != inport) + continue; + + if (coresight_device_is_tpdm(in)) { + size = tpdm_read_dsb_element_size(in); + } else { + /* Recurse down the path */ + size = tpda_get_element_size(in, -1); + } + + if (size < 0) + return size; + + if (dsb_size < 0) { + /* Found a size, save it. */ + dsb_size = size; + } else { + /* Found duplicate TPDMs */ + return -EEXIST; + } + } + + return dsb_size; +} + /* Settings pre enabling port control register */ static void tpda_enable_pre_port(struct tpda_drvdata *drvdata) { @@ -32,26 +106,55 @@ static void tpda_enable_pre_port(struct tpda_drvdata *drvdata) writel_relaxed(val, drvdata->base + TPDA_CR); } -static void tpda_enable_port(struct tpda_drvdata *drvdata, int port) +static int tpda_enable_port(struct tpda_drvdata *drvdata, int port) { u32 val; + int size; val = readl_relaxed(drvdata->base + TPDA_Pn_CR(port)); + /* + * Configure aggregator port n DSB data set element size + * Set the bit to 0 if the size is 32 + * Set the bit to 1 if the size is 64 + */ + size = tpda_get_element_size(drvdata->csdev, port); + switch (size) { + case 32: + val &= ~TPDA_Pn_CR_DSBSIZE; + break; + case 64: + val |= TPDA_Pn_CR_DSBSIZE; + break; + case 0: + return -EEXIST; + case -EEXIST: + dev_warn_once(&drvdata->csdev->dev, + "Detected multiple TPDMs on port %d", -EEXIST); + return -EEXIST; + default: + return -EINVAL; + } + /* Enable the port */ val |= TPDA_Pn_CR_ENA; writel_relaxed(val, drvdata->base + TPDA_Pn_CR(port)); + + return 0; } -static void __tpda_enable(struct tpda_drvdata *drvdata, int port) +static int __tpda_enable(struct tpda_drvdata *drvdata, int port) { + int ret; + CS_UNLOCK(drvdata->base); if (!drvdata->csdev->enable) tpda_enable_pre_port(drvdata); - tpda_enable_port(drvdata, port); - + ret = tpda_enable_port(drvdata, port); CS_LOCK(drvdata->base); + + return ret; } static int tpda_enable(struct coresight_device *csdev, @@ -59,16 +162,19 @@ static int tpda_enable(struct coresight_device *csdev, struct coresight_connection *out) { struct tpda_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + int ret = 0; spin_lock(&drvdata->spinlock); - if (atomic_read(&in->dest_refcnt) == 0) - __tpda_enable(drvdata, in->dest_port); + if (atomic_read(&in->dest_refcnt) == 0) { + ret = __tpda_enable(drvdata, in->dest_port); + if (!ret) { + atomic_inc(&in->dest_refcnt); + dev_dbg(drvdata->dev, "TPDA inport %d enabled.\n", in->dest_port); + } + } - atomic_inc(&in->dest_refcnt); spin_unlock(&drvdata->spinlock); - - dev_dbg(drvdata->dev, "TPDA inport %d enabled.\n", in->dest_port); - return 0; + return ret; } static void __tpda_disable(struct tpda_drvdata *drvdata, int port) diff --git a/drivers/hwtracing/coresight/coresight-tpda.h b/drivers/hwtracing/coresight/coresight-tpda.h index 0399678df312f..b3b38fd41b64b 100644 --- a/drivers/hwtracing/coresight/coresight-tpda.h +++ b/drivers/hwtracing/coresight/coresight-tpda.h @@ -10,6 +10,8 @@ #define TPDA_Pn_CR(n) (0x004 + (n * 4)) /* Aggregator port enable bit */ #define TPDA_Pn_CR_ENA BIT(0) +/* Aggregator port DSB data set element size bit */ +#define TPDA_Pn_CR_DSBSIZE BIT(8) #define TPDA_MAX_INPORTS 32 From a610611f4dbcffd882bfe23137200ace44249958 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:38 +0800 Subject: [PATCH 299/548] coresight-tpdm: Initialize DSB subunit configuration MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit DSB is used for monitoring “events”. Events are something that occurs at some point in time. It could be a state decode, the act of writing/reading a particular address, a FIFO being empty, etc. This decoding of the event desired is done outside TPDM. DSB subunit need to be configured in enablement and disablement. A struct that specifics associated to dsb dataset is needed. It saves the configuration and parameters of the dsb datasets. This change is to add this struct and initialize the configuration of DSB subunit. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-6-git-send-email-quic_taozha@quicinc.com (cherry picked from commit f01e4948b516f073c353041328ae7cf709233303) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-tpdm.c | 64 +++++++++++++++++--- drivers/hwtracing/coresight/coresight-tpdm.h | 18 ++++++ 2 files changed, 74 insertions(+), 8 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index abaff0b934dbd..951ad4d9b76fb 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -20,23 +20,57 @@ DEFINE_CORESIGHT_DEVLIST(tpdm_devs, "tpdm"); +static bool tpdm_has_dsb_dataset(struct tpdm_drvdata *drvdata) +{ + return (drvdata->datasets & TPDM_PIDR0_DS_DSB); +} + +static void tpdm_reset_datasets(struct tpdm_drvdata *drvdata) +{ + if (tpdm_has_dsb_dataset(drvdata)) { + memset(drvdata->dsb, 0, sizeof(struct dsb_dataset)); + + drvdata->dsb->trig_ts = true; + drvdata->dsb->trig_type = false; + } +} + static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) { u32 val; - /* Set the enable bit of DSB control register to 1 */ + val = readl_relaxed(drvdata->base + TPDM_DSB_TIER); + /* Set trigger timestamp */ + if (drvdata->dsb->trig_ts) + val |= TPDM_DSB_TIER_XTRIG_TSENAB; + else + val &= ~TPDM_DSB_TIER_XTRIG_TSENAB; + writel_relaxed(val, drvdata->base + TPDM_DSB_TIER); + val = readl_relaxed(drvdata->base + TPDM_DSB_CR); + /* Set trigger type */ + if (drvdata->dsb->trig_type) + val |= TPDM_DSB_CR_TRIG_TYPE; + else + val &= ~TPDM_DSB_CR_TRIG_TYPE; + /* Set the enable bit of DSB control register to 1 */ val |= TPDM_DSB_CR_ENA; writel_relaxed(val, drvdata->base + TPDM_DSB_CR); } -/* TPDM enable operations */ +/* + * TPDM enable operations + * The TPDM or Monitor serves as data collection component for various + * dataset types. It covers Basic Counts(BC), Tenure Counts(TC), + * Continuous Multi-Bit(CMB), Multi-lane CMB(MCMB) and Discrete Single + * Bit(DSB). This function will initialize the configuration according + * to the dataset type supported by the TPDM. + */ static void __tpdm_enable(struct tpdm_drvdata *drvdata) { CS_UNLOCK(drvdata->base); - /* Check if DSB datasets is present for TPDM. */ - if (drvdata->datasets & TPDM_PIDR0_DS_DSB) + if (tpdm_has_dsb_dataset(drvdata)) tpdm_enable_dsb(drvdata); CS_LOCK(drvdata->base); @@ -76,8 +110,7 @@ static void __tpdm_disable(struct tpdm_drvdata *drvdata) { CS_UNLOCK(drvdata->base); - /* Check if DSB datasets is present for TPDM. */ - if (drvdata->datasets & TPDM_PIDR0_DS_DSB) + if (tpdm_has_dsb_dataset(drvdata)) tpdm_disable_dsb(drvdata); CS_LOCK(drvdata->base); @@ -110,13 +143,23 @@ static const struct coresight_ops tpdm_cs_ops = { .source_ops = &tpdm_source_ops, }; -static void tpdm_init_default_data(struct tpdm_drvdata *drvdata) +static int tpdm_datasets_setup(struct tpdm_drvdata *drvdata) { u32 pidr; /* Get the datasets present on the TPDM. */ pidr = readl_relaxed(drvdata->base + CORESIGHT_PERIPHIDR0); drvdata->datasets |= pidr & GENMASK(TPDM_DATASETS - 1, 0); + + if (tpdm_has_dsb_dataset(drvdata) && (!drvdata->dsb)) { + drvdata->dsb = devm_kzalloc(drvdata->dev, + sizeof(*drvdata->dsb), GFP_KERNEL); + if (!drvdata->dsb) + return -ENOMEM; + } + tpdm_reset_datasets(drvdata); + + return 0; } /* @@ -179,6 +222,7 @@ static int tpdm_probe(struct amba_device *adev, const struct amba_id *id) struct coresight_platform_data *pdata; struct tpdm_drvdata *drvdata; struct coresight_desc desc = { 0 }; + int ret; pdata = coresight_get_platform_data(dev); if (IS_ERR(pdata)) @@ -198,6 +242,10 @@ static int tpdm_probe(struct amba_device *adev, const struct amba_id *id) drvdata->base = base; + ret = tpdm_datasets_setup(drvdata); + if (ret) + return ret; + /* Set up coresight component description */ desc.name = coresight_alloc_device_name(&tpdm_devs, dev); if (!desc.name) @@ -214,7 +262,7 @@ static int tpdm_probe(struct amba_device *adev, const struct amba_id *id) return PTR_ERR(drvdata->csdev); spin_lock_init(&drvdata->spinlock); - tpdm_init_default_data(drvdata); + /* Decrease pm refcount when probe is done.*/ pm_runtime_put(&adev->dev); diff --git a/drivers/hwtracing/coresight/coresight-tpdm.h b/drivers/hwtracing/coresight/coresight-tpdm.h index 543854043a2dd..f59e751d35816 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.h +++ b/drivers/hwtracing/coresight/coresight-tpdm.h @@ -11,8 +11,14 @@ /* DSB Subunit Registers */ #define TPDM_DSB_CR (0x780) +#define TPDM_DSB_TIER (0x784) + /* Enable bit for DSB subunit */ #define TPDM_DSB_CR_ENA BIT(0) +/* Enable bit for DSB subunit trigger type */ +#define TPDM_DSB_CR_TRIG_TYPE BIT(12) +/* Enable bit for DSB subunit trigger timestamp */ +#define TPDM_DSB_TIER_XTRIG_TSENAB BIT(1) /* TPDM integration test registers */ #define TPDM_ITATBCNTRL (0xEF0) @@ -40,6 +46,16 @@ #define TPDM_PIDR0_DS_IMPDEF BIT(0) #define TPDM_PIDR0_DS_DSB BIT(1) +/** + * struct dsb_dataset - specifics associated to dsb dataset + * @trig_ts: Enable/Disable trigger timestamp. + * @trig_type: Enable/Disable trigger type. + */ +struct dsb_dataset { + bool trig_ts; + bool trig_type; +}; + /** * struct tpdm_drvdata - specifics associated to an TPDM component * @base: memory mapped base address for this component. @@ -48,6 +64,7 @@ * @spinlock: lock for the drvdata value. * @enable: enable status of the component. * @datasets: The datasets types present of the TPDM. + * @dsb Specifics associated to TPDM DSB. */ struct tpdm_drvdata { @@ -57,6 +74,7 @@ struct tpdm_drvdata { spinlock_t spinlock; bool enable; unsigned long datasets; + struct dsb_dataset *dsb; }; #endif /* _CORESIGHT_CORESIGHT_TPDM_H */ From 17df201c3674766380126e7ec6dd219bd8aacc80 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:39 +0800 Subject: [PATCH 300/548] coresight-tpdm: Add reset node to TPDM node TPDM device need a node to reset the configurations and status of it. This change provides a node to reset the configurations and disable the TPDM if it has been enabled. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-7-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 8fbbce11a90f345a1ff39e2a08e312ee763a1139) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 10 +++++++++ drivers/hwtracing/coresight/coresight-tpdm.c | 22 +++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index 4a58e649550d5..ef8b5a6bd4ac4 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -11,3 +11,13 @@ Description: Accepts only one of the 2 values - 1 or 2. 1 : Generate 64 bits data 2 : Generate 32 bits data + +What: /sys/bus/coresight/devices//reset_dataset +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (Write) Reset the dataset of the tpdm. + + Accepts only one value - 1. + 1 : Reset the dataset of the tpdm diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index 951ad4d9b76fb..9c65e4c01128c 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -162,6 +162,27 @@ static int tpdm_datasets_setup(struct tpdm_drvdata *drvdata) return 0; } +static ssize_t reset_dataset_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + int ret = 0; + unsigned long val; + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + ret = kstrtoul(buf, 0, &val); + if (ret || val != 1) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + tpdm_reset_datasets(drvdata); + spin_unlock(&drvdata->spinlock); + + return size; +} +static DEVICE_ATTR_WO(reset_dataset); + /* * value 1: 64 bits test data * value 2: 32 bits test data @@ -202,6 +223,7 @@ static ssize_t integration_test_store(struct device *dev, static DEVICE_ATTR_WO(integration_test); static struct attribute *tpdm_attrs[] = { + &dev_attr_reset_dataset.attr, &dev_attr_integration_test.attr, NULL, }; From fbf3e92082f6a64ea0ed5b89668a9c4e6d150d31 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:40 +0800 Subject: [PATCH 301/548] coresight-tpdm: Add nodes to set trigger timestamp and type The nodes are needed to set or show the trigger timestamp and trigger type. This change is to add these nodes to achieve these function. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-8-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 851b3f9c9c0838060158e288c1387d44652c54b5) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 22 +++++ drivers/hwtracing/coresight/coresight-tpdm.c | 95 +++++++++++++++++++ 2 files changed, 117 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index ef8b5a6bd4ac4..b15bf012a050a 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -21,3 +21,25 @@ Description: Accepts only one value - 1. 1 : Reset the dataset of the tpdm + +What: /sys/bus/coresight/devices//dsb_trig_type +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the trigger type of the DSB for tpdm. + + Accepts only one of the 2 values - 0 or 1. + 0 : Set the DSB trigger type to false + 1 : Set the DSB trigger type to true + +What: /sys/bus/coresight/devices//dsb_trig_ts +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the trigger timestamp of the DSB for tpdm. + + Accepts only one of the 2 values - 0 or 1. + 0 : Set the DSB trigger type to false + 1 : Set the DSB trigger type to true diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index 9c65e4c01128c..e9fc3482d480a 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -25,6 +25,18 @@ static bool tpdm_has_dsb_dataset(struct tpdm_drvdata *drvdata) return (drvdata->datasets & TPDM_PIDR0_DS_DSB); } +static umode_t tpdm_dsb_is_visible(struct kobject *kobj, + struct attribute *attr, int n) +{ + struct device *dev = kobj_to_dev(kobj); + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + if (drvdata && tpdm_has_dsb_dataset(drvdata)) + return attr->mode; + + return 0; +} + static void tpdm_reset_datasets(struct tpdm_drvdata *drvdata) { if (tpdm_has_dsb_dataset(drvdata)) { @@ -232,8 +244,91 @@ static struct attribute_group tpdm_attr_grp = { .attrs = tpdm_attrs, }; +static ssize_t dsb_trig_type_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + return sysfs_emit(buf, "%u\n", + (unsigned int)drvdata->dsb->trig_type); +} + +/* + * Trigger type (boolean): + * false - Disable trigger type. + * true - Enable trigger type. + */ +static ssize_t dsb_trig_type_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + + if ((kstrtoul(buf, 0, &val)) || (val & ~1UL)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + if (val) + drvdata->dsb->trig_type = true; + else + drvdata->dsb->trig_type = false; + spin_unlock(&drvdata->spinlock); + return size; +} +static DEVICE_ATTR_RW(dsb_trig_type); + +static ssize_t dsb_trig_ts_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + return sysfs_emit(buf, "%u\n", + (unsigned int)drvdata->dsb->trig_ts); +} + +/* + * Trigger timestamp (boolean): + * false - Disable trigger timestamp. + * true - Enable trigger timestamp. + */ +static ssize_t dsb_trig_ts_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + + if ((kstrtoul(buf, 0, &val)) || (val & ~1UL)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + if (val) + drvdata->dsb->trig_ts = true; + else + drvdata->dsb->trig_ts = false; + spin_unlock(&drvdata->spinlock); + return size; +} +static DEVICE_ATTR_RW(dsb_trig_ts); + +static struct attribute *tpdm_dsb_attrs[] = { + &dev_attr_dsb_trig_ts.attr, + &dev_attr_dsb_trig_type.attr, + NULL, +}; + +static struct attribute_group tpdm_dsb_attr_grp = { + .attrs = tpdm_dsb_attrs, + .is_visible = tpdm_dsb_is_visible, +}; + static const struct attribute_group *tpdm_attr_grps[] = { &tpdm_attr_grp, + &tpdm_dsb_attr_grp, NULL, }; From 18aeefc12850b044e3531c70b37f34ea039a7a01 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:41 +0800 Subject: [PATCH 302/548] coresight-tpdm: Add node to set dsb programming mode Add node to set and show programming mode for TPDM DSB subunit. Once the DSB programming mode is set, it will be written to the register DSB_CR. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-9-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 018e43ad1eeefbb8797e4c933953c50c09e3f4f6) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 14 +++++ drivers/hwtracing/coresight/coresight-tpdm.c | 53 +++++++++++++++++++ drivers/hwtracing/coresight/coresight-tpdm.h | 19 +++++++ 3 files changed, 86 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index b15bf012a050a..8ec7548070b7b 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -43,3 +43,17 @@ Description: Accepts only one of the 2 values - 0 or 1. 0 : Set the DSB trigger type to false 1 : Set the DSB trigger type to true + +What: /sys/bus/coresight/devices//dsb_mode +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the programming mode of the DSB for tpdm. + + Accepts the value needs to be greater than 0. What data + bits do is listed below. + Bit[0:1] : Test mode control bit for choosing the inputs. + Bit[3] : Set to 0 for low performance mode. + Set to 1 for high performance mode. + Bit[4:8] : Select byte lane for high performance mode. diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index e9fc3482d480a..6201f12718ca6 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -4,6 +4,7 @@ */ #include +#include #include #include #include @@ -47,6 +48,27 @@ static void tpdm_reset_datasets(struct tpdm_drvdata *drvdata) } } +static void set_dsb_mode(struct tpdm_drvdata *drvdata, u32 *val) +{ + u32 mode; + + /* Set the test accurate mode */ + mode = TPDM_DSB_MODE_TEST(drvdata->dsb->mode); + *val &= ~TPDM_DSB_CR_TEST_MODE; + *val |= FIELD_PREP(TPDM_DSB_CR_TEST_MODE, mode); + + /* Set the byte lane for high-performance mode */ + mode = TPDM_DSB_MODE_HPBYTESEL(drvdata->dsb->mode); + *val &= ~TPDM_DSB_CR_HPSEL; + *val |= FIELD_PREP(TPDM_DSB_CR_HPSEL, mode); + + /* Set the performance mode */ + if (drvdata->dsb->mode & TPDM_DSB_MODE_PERF) + *val |= TPDM_DSB_CR_MODE; + else + *val &= ~TPDM_DSB_CR_MODE; +} + static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) { u32 val; @@ -60,6 +82,8 @@ static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) writel_relaxed(val, drvdata->base + TPDM_DSB_TIER); val = readl_relaxed(drvdata->base + TPDM_DSB_CR); + /* Set the mode of DSB dataset */ + set_dsb_mode(drvdata, &val); /* Set trigger type */ if (drvdata->dsb->trig_type) val |= TPDM_DSB_CR_TRIG_TYPE; @@ -244,6 +268,34 @@ static struct attribute_group tpdm_attr_grp = { .attrs = tpdm_attrs, }; +static ssize_t dsb_mode_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + return sysfs_emit(buf, "%x\n", drvdata->dsb->mode); +} + +static ssize_t dsb_mode_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + + if ((kstrtoul(buf, 0, &val)) || (val < 0) || + (val & ~TPDM_DSB_MODE_MASK)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + drvdata->dsb->mode = val & TPDM_DSB_MODE_MASK; + spin_unlock(&drvdata->spinlock); + return size; +} +static DEVICE_ATTR_RW(dsb_mode); + static ssize_t dsb_trig_type_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -316,6 +368,7 @@ static ssize_t dsb_trig_ts_store(struct device *dev, static DEVICE_ATTR_RW(dsb_trig_ts); static struct attribute *tpdm_dsb_attrs[] = { + &dev_attr_dsb_mode.attr, &dev_attr_dsb_trig_ts.attr, &dev_attr_dsb_trig_type.attr, NULL, diff --git a/drivers/hwtracing/coresight/coresight-tpdm.h b/drivers/hwtracing/coresight/coresight-tpdm.h index f59e751d35816..b55d6f5ce8520 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.h +++ b/drivers/hwtracing/coresight/coresight-tpdm.h @@ -15,11 +15,28 @@ /* Enable bit for DSB subunit */ #define TPDM_DSB_CR_ENA BIT(0) +/* Enable bit for DSB subunit perfmance mode */ +#define TPDM_DSB_CR_MODE BIT(1) /* Enable bit for DSB subunit trigger type */ #define TPDM_DSB_CR_TRIG_TYPE BIT(12) +/* Data bits for DSB high performace mode */ +#define TPDM_DSB_CR_HPSEL GENMASK(6, 2) +/* Data bits for DSB test mode */ +#define TPDM_DSB_CR_TEST_MODE GENMASK(10, 9) + /* Enable bit for DSB subunit trigger timestamp */ #define TPDM_DSB_TIER_XTRIG_TSENAB BIT(1) +/* DSB programming modes */ +/* DSB mode bits mask */ +#define TPDM_DSB_MODE_MASK GENMASK(8, 0) +/* Test mode control bit*/ +#define TPDM_DSB_MODE_TEST(val) (val & GENMASK(1, 0)) +/* Performance mode */ +#define TPDM_DSB_MODE_PERF BIT(3) +/* High performance mode */ +#define TPDM_DSB_MODE_HPBYTESEL(val) (val & GENMASK(8, 4)) + /* TPDM integration test registers */ #define TPDM_ITATBCNTRL (0xEF0) #define TPDM_ITCNTRL (0xF00) @@ -48,10 +65,12 @@ /** * struct dsb_dataset - specifics associated to dsb dataset + * @mode: DSB programming mode * @trig_ts: Enable/Disable trigger timestamp. * @trig_type: Enable/Disable trigger type. */ struct dsb_dataset { + u32 mode; bool trig_ts; bool trig_type; }; From aafae2fc5f660d417bd7a3f35f50a39045fbb5ef Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:42 +0800 Subject: [PATCH 303/548] coresight-tpdm: Add nodes for dsb edge control Add the nodes to set value for DSB edge control and DSB edge control mask. Each DSB subunit TPDM has maximum of n(n<16) EDCR resgisters to configure edge control. DSB edge detection control 00: Rising edge detection 01: Falling edge detection 10: Rising and falling edge detection (toggle detection) And each DSB subunit TPDM has maximum of m(m<8) ECDMR registers to configure mask. Eight 32 bit registers providing DSB interface edge detection mask control. Add the nodes to configure DSB edge control and DSB edge control mask. Each DSB subunit TPDM maximum of 256 edge detections can be configured. The index and value sysfs files need to be paired and written to order. The index sysfs file is to set the index number of the edge detection which needs to be configured. And the value sysfs file is to set the control or mask for the edge detection. DSB edge detection control should be set as the following values. 00: Rising edge detection 01: Falling edge detection 10: Rising and falling edge detection (toggle detection) And DSB edge mask should be set as 0 or 1. Each DSB subunit TPDM has maximum of n(n<16) EDCR resgisters to configure edge control. And each DSB subunit TPDM has maximum of m(m<8) ECDMR registers to configure mask. Add the nodes to read a set of the edge control value and mask of the DSB in TPDM. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-10-git-send-email-quic_taozha@quicinc.com (cherry picked from commit f376caf25f79965ab140b7a297cb4a5bb0c89523) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 51 +++++ drivers/hwtracing/coresight/coresight-tpdm.c | 174 +++++++++++++++++- drivers/hwtracing/coresight/coresight-tpdm.h | 60 ++++++ 3 files changed, 284 insertions(+), 1 deletion(-) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index 8ec7548070b7b..6853bb1295e30 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -57,3 +57,54 @@ Description: Bit[3] : Set to 0 for low performance mode. Set to 1 for high performance mode. Bit[4:8] : Select byte lane for high performance mode. + +What: /sys/bus/coresight/devices//dsb_edge/ctrl_idx +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the index number of the edge detection for the DSB + subunit TPDM. Since there are at most 256 edge detections, this + value ranges from 0 to 255. + +What: /sys/bus/coresight/devices//dsb_edge/ctrl_val +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + Write a data to control the edge detection corresponding to + the index number. Before writing data to this sysfs file, + "ctrl_idx" should be written first to configure the index + number of the edge detection which needs to be controlled. + + Accepts only one of the following values. + 0 - Rising edge detection + 1 - Falling edge detection + 2 - Rising and falling edge detection (toggle detection) + + +What: /sys/bus/coresight/devices//dsb_edge/ctrl_mask +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + Write a data to mask the edge detection corresponding to the index + number. Before writing data to this sysfs file, "ctrl_idx" should + be written first to configure the index number of the edge detection + which needs to be masked. + + Accepts only one of the 2 values - 0 or 1. + +What: /sys/bus/coresight/devices//dsb_edge/edcr[0:15] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + Read a set of the edge control value of the DSB in TPDM. + +What: /sys/bus/coresight/devices//dsb_edge/edcmr[0:7] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + Read a set of the edge control mask of the DSB in TPDM. \ No newline at end of file diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index 6201f12718ca6..7175e70c2c4e0 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -21,6 +21,30 @@ DEFINE_CORESIGHT_DEVLIST(tpdm_devs, "tpdm"); +/* Read dataset array member with the index number */ +static ssize_t tpdm_simple_dataset_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + struct tpdm_dataset_attribute *tpdm_attr = + container_of(attr, struct tpdm_dataset_attribute, attr); + + switch (tpdm_attr->mem) { + case DSB_EDGE_CTRL: + if (tpdm_attr->idx >= TPDM_DSB_MAX_EDCR) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->edge_ctrl[tpdm_attr->idx]); + case DSB_EDGE_CTRL_MASK: + if (tpdm_attr->idx >= TPDM_DSB_MAX_EDCMR) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->edge_ctrl_mask[tpdm_attr->idx]); + } + return -EINVAL; +} + static bool tpdm_has_dsb_dataset(struct tpdm_drvdata *drvdata) { return (drvdata->datasets & TPDM_PIDR0_DS_DSB); @@ -71,7 +95,14 @@ static void set_dsb_mode(struct tpdm_drvdata *drvdata, u32 *val) static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) { - u32 val; + u32 val, i; + + for (i = 0; i < TPDM_DSB_MAX_EDCR; i++) + writel_relaxed(drvdata->dsb->edge_ctrl[i], + drvdata->base + TPDM_DSB_EDCR(i)); + for (i = 0; i < TPDM_DSB_MAX_EDCMR; i++) + writel_relaxed(drvdata->dsb->edge_ctrl_mask[i], + drvdata->base + TPDM_DSB_EDCMR(i)); val = readl_relaxed(drvdata->base + TPDM_DSB_TIER); /* Set trigger timestamp */ @@ -296,6 +327,109 @@ static ssize_t dsb_mode_store(struct device *dev, } static DEVICE_ATTR_RW(dsb_mode); +static ssize_t ctrl_idx_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + return sysfs_emit(buf, "%u\n", + (unsigned int)drvdata->dsb->edge_ctrl_idx); +} + +/* + * The EDCR registers can include up to 16 32-bit registers, and each + * one can be configured to control up to 16 edge detections(2 bits + * control one edge detection). So a total 256 edge detections can be + * configured. This function provides a way to set the index number of + * the edge detection which needs to be configured. + */ +static ssize_t ctrl_idx_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + + if ((kstrtoul(buf, 0, &val)) || (val >= TPDM_DSB_MAX_LINES)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + drvdata->dsb->edge_ctrl_idx = val; + spin_unlock(&drvdata->spinlock); + + return size; +} +static DEVICE_ATTR_RW(ctrl_idx); + +/* + * This function is used to control the edge detection according + * to the index number that has been set. + * "edge_ctrl" should be one of the following values. + * 0 - Rising edge detection + * 1 - Falling edge detection + * 2 - Rising and falling edge detection (toggle detection) + */ +static ssize_t ctrl_val_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val, edge_ctrl; + int reg; + + if ((kstrtoul(buf, 0, &edge_ctrl)) || (edge_ctrl > 0x2)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + /* + * There are 2 bit per DSB Edge Control line. + * Thus we have 16 lines in a 32bit word. + */ + reg = EDCR_TO_WORD_IDX(drvdata->dsb->edge_ctrl_idx); + val = drvdata->dsb->edge_ctrl[reg]; + val &= ~EDCR_TO_WORD_MASK(drvdata->dsb->edge_ctrl_idx); + val |= EDCR_TO_WORD_VAL(edge_ctrl, drvdata->dsb->edge_ctrl_idx); + drvdata->dsb->edge_ctrl[reg] = val; + spin_unlock(&drvdata->spinlock); + + return size; +} +static DEVICE_ATTR_WO(ctrl_val); + +static ssize_t ctrl_mask_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + u32 set; + int reg; + + if ((kstrtoul(buf, 0, &val)) || (val & ~1UL)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + /* + * There is 1 bit per DSB Edge Control Mark line. + * Thus we have 32 lines in a 32bit word. + */ + reg = EDCMR_TO_WORD_IDX(drvdata->dsb->edge_ctrl_idx); + set = drvdata->dsb->edge_ctrl_mask[reg]; + if (val) + set |= BIT(EDCMR_TO_WORD_SHIFT(drvdata->dsb->edge_ctrl_idx)); + else + set &= ~BIT(EDCMR_TO_WORD_SHIFT(drvdata->dsb->edge_ctrl_idx)); + drvdata->dsb->edge_ctrl_mask[reg] = set; + spin_unlock(&drvdata->spinlock); + + return size; +} +static DEVICE_ATTR_WO(ctrl_mask); + static ssize_t dsb_trig_type_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -367,6 +501,37 @@ static ssize_t dsb_trig_ts_store(struct device *dev, } static DEVICE_ATTR_RW(dsb_trig_ts); +static struct attribute *tpdm_dsb_edge_attrs[] = { + &dev_attr_ctrl_idx.attr, + &dev_attr_ctrl_val.attr, + &dev_attr_ctrl_mask.attr, + DSB_EDGE_CTRL_ATTR(0), + DSB_EDGE_CTRL_ATTR(1), + DSB_EDGE_CTRL_ATTR(2), + DSB_EDGE_CTRL_ATTR(3), + DSB_EDGE_CTRL_ATTR(4), + DSB_EDGE_CTRL_ATTR(5), + DSB_EDGE_CTRL_ATTR(6), + DSB_EDGE_CTRL_ATTR(7), + DSB_EDGE_CTRL_ATTR(8), + DSB_EDGE_CTRL_ATTR(9), + DSB_EDGE_CTRL_ATTR(10), + DSB_EDGE_CTRL_ATTR(11), + DSB_EDGE_CTRL_ATTR(12), + DSB_EDGE_CTRL_ATTR(13), + DSB_EDGE_CTRL_ATTR(14), + DSB_EDGE_CTRL_ATTR(15), + DSB_EDGE_CTRL_MASK_ATTR(0), + DSB_EDGE_CTRL_MASK_ATTR(1), + DSB_EDGE_CTRL_MASK_ATTR(2), + DSB_EDGE_CTRL_MASK_ATTR(3), + DSB_EDGE_CTRL_MASK_ATTR(4), + DSB_EDGE_CTRL_MASK_ATTR(5), + DSB_EDGE_CTRL_MASK_ATTR(6), + DSB_EDGE_CTRL_MASK_ATTR(7), + NULL, +}; + static struct attribute *tpdm_dsb_attrs[] = { &dev_attr_dsb_mode.attr, &dev_attr_dsb_trig_ts.attr, @@ -379,9 +544,16 @@ static struct attribute_group tpdm_dsb_attr_grp = { .is_visible = tpdm_dsb_is_visible, }; +static struct attribute_group tpdm_dsb_edge_grp = { + .attrs = tpdm_dsb_edge_attrs, + .is_visible = tpdm_dsb_is_visible, + .name = "dsb_edge", +}; + static const struct attribute_group *tpdm_attr_grps[] = { &tpdm_attr_grp, &tpdm_dsb_attr_grp, + &tpdm_dsb_edge_grp, NULL, }; diff --git a/drivers/hwtracing/coresight/coresight-tpdm.h b/drivers/hwtracing/coresight/coresight-tpdm.h index b55d6f5ce8520..a9c65d96316a0 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.h +++ b/drivers/hwtracing/coresight/coresight-tpdm.h @@ -12,6 +12,8 @@ /* DSB Subunit Registers */ #define TPDM_DSB_CR (0x780) #define TPDM_DSB_TIER (0x784) +#define TPDM_DSB_EDCR(n) (0x808 + (n * 4)) +#define TPDM_DSB_EDCMR(n) (0x848 + (n * 4)) /* Enable bit for DSB subunit */ #define TPDM_DSB_CR_ENA BIT(0) @@ -37,6 +39,16 @@ /* High performance mode */ #define TPDM_DSB_MODE_HPBYTESEL(val) (val & GENMASK(8, 4)) +#define EDCRS_PER_WORD 16 +#define EDCR_TO_WORD_IDX(r) ((r) / EDCRS_PER_WORD) +#define EDCR_TO_WORD_SHIFT(r) ((r % EDCRS_PER_WORD) * 2) +#define EDCR_TO_WORD_VAL(val, r) (val << EDCR_TO_WORD_SHIFT(r)) +#define EDCR_TO_WORD_MASK(r) EDCR_TO_WORD_VAL(0x3, r) + +#define EDCMRS_PER_WORD 32 +#define EDCMR_TO_WORD_IDX(r) ((r) / EDCMRS_PER_WORD) +#define EDCMR_TO_WORD_SHIFT(r) ((r) % EDCMRS_PER_WORD) + /* TPDM integration test registers */ #define TPDM_ITATBCNTRL (0xEF0) #define TPDM_ITCNTRL (0xF00) @@ -63,14 +75,43 @@ #define TPDM_PIDR0_DS_IMPDEF BIT(0) #define TPDM_PIDR0_DS_DSB BIT(1) +#define TPDM_DSB_MAX_LINES 256 +/* MAX number of EDCR registers */ +#define TPDM_DSB_MAX_EDCR 16 +/* MAX number of EDCMR registers */ +#define TPDM_DSB_MAX_EDCMR 8 + +#define tpdm_simple_dataset_ro(name, mem, idx) \ + (&((struct tpdm_dataset_attribute[]) { \ + { \ + __ATTR(name, 0444, tpdm_simple_dataset_show, NULL), \ + mem, \ + idx, \ + } \ + })[0].attr.attr) + +#define DSB_EDGE_CTRL_ATTR(nr) \ + tpdm_simple_dataset_ro(edcr##nr, \ + DSB_EDGE_CTRL, nr) + +#define DSB_EDGE_CTRL_MASK_ATTR(nr) \ + tpdm_simple_dataset_ro(edcmr##nr, \ + DSB_EDGE_CTRL_MASK, nr) + /** * struct dsb_dataset - specifics associated to dsb dataset * @mode: DSB programming mode + * @edge_ctrl_idx Index number of the edge control + * @edge_ctrl: Save value for edge control + * @edge_ctrl_mask: Save value for edge control mask * @trig_ts: Enable/Disable trigger timestamp. * @trig_type: Enable/Disable trigger type. */ struct dsb_dataset { u32 mode; + u32 edge_ctrl_idx; + u32 edge_ctrl[TPDM_DSB_MAX_EDCR]; + u32 edge_ctrl_mask[TPDM_DSB_MAX_EDCMR]; bool trig_ts; bool trig_type; }; @@ -96,4 +137,23 @@ struct tpdm_drvdata { struct dsb_dataset *dsb; }; +/* Enumerate members of various datasets */ +enum dataset_mem { + DSB_EDGE_CTRL, + DSB_EDGE_CTRL_MASK, +}; + +/** + * struct tpdm_dataset_attribute - Record the member variables and + * index number of datasets that need to be operated by sysfs file + * @attr: The device attribute + * @mem: The member in the dataset data structure + * @idx: The index number of the array data + */ +struct tpdm_dataset_attribute { + struct device_attribute attr; + enum dataset_mem mem; + u32 idx; +}; + #endif /* _CORESIGHT_CORESIGHT_TPDM_H */ From ccc5ce5af1dad41e89149b69dfd52e918723f573 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:43 +0800 Subject: [PATCH 304/548] coresight-tpdm: Add nodes to configure pattern match output Add nodes to configure trigger pattern and trigger pattern mask. Each DSB subunit TPDM has maximum of n(n<7) XPR registers to configure trigger pattern match output. Eight 32 bit registers providing DSB interface trigger output pattern match comparison. And each DSB subunit TPDM has maximum of m(m<7) XPMR registers to configure trigger pattern mask match output. Eight 32 bit registers providing DSB interface trigger output pattern match mask. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-11-git-send-email-quic_taozha@quicinc.com (cherry picked from commit a8138a9445e6d159138b7e574dc5ee7cbcc2f06a) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 18 +++- drivers/hwtracing/coresight/coresight-tpdm.c | 82 ++++++++++++++++++- drivers/hwtracing/coresight/coresight-tpdm.h | 28 +++++++ 3 files changed, 126 insertions(+), 2 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index 6853bb1295e30..2252e4706a90d 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -107,4 +107,20 @@ Date: March 2023 KernelVersion 6.7 Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) Description: - Read a set of the edge control mask of the DSB in TPDM. \ No newline at end of file + Read a set of the edge control mask of the DSB in TPDM. + +What: /sys/bus/coresight/devices//dsb_trig_patt/xpr[0:7] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the value of the trigger pattern for the DSB + subunit TPDM. + +What: /sys/bus/coresight/devices//dsb_trig_patt/xpmr[0:7] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the mask of the trigger pattern for the DSB + subunit TPDM. \ No newline at end of file diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index 7175e70c2c4e0..e04c41f832656 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -41,10 +41,58 @@ static ssize_t tpdm_simple_dataset_show(struct device *dev, return -EINVAL; return sysfs_emit(buf, "0x%x\n", drvdata->dsb->edge_ctrl_mask[tpdm_attr->idx]); + case DSB_TRIG_PATT: + if (tpdm_attr->idx >= TPDM_DSB_MAX_PATT) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->trig_patt[tpdm_attr->idx]); + case DSB_TRIG_PATT_MASK: + if (tpdm_attr->idx >= TPDM_DSB_MAX_PATT) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->trig_patt_mask[tpdm_attr->idx]); } return -EINVAL; } +/* Write dataset array member with the index number */ +static ssize_t tpdm_simple_dataset_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + unsigned long val; + ssize_t ret = size; + + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + struct tpdm_dataset_attribute *tpdm_attr = + container_of(attr, struct tpdm_dataset_attribute, attr); + + if (kstrtoul(buf, 0, &val)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + switch (tpdm_attr->mem) { + case DSB_TRIG_PATT: + if (tpdm_attr->idx < TPDM_DSB_MAX_PATT) + drvdata->dsb->trig_patt[tpdm_attr->idx] = val; + else + ret = -EINVAL; + break; + case DSB_TRIG_PATT_MASK: + if (tpdm_attr->idx < TPDM_DSB_MAX_PATT) + drvdata->dsb->trig_patt_mask[tpdm_attr->idx] = val; + else + ret = -EINVAL; + break; + default: + ret = -EINVAL; + } + spin_unlock(&drvdata->spinlock); + + return ret; +} + static bool tpdm_has_dsb_dataset(struct tpdm_drvdata *drvdata) { return (drvdata->datasets & TPDM_PIDR0_DS_DSB); @@ -103,7 +151,12 @@ static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) for (i = 0; i < TPDM_DSB_MAX_EDCMR; i++) writel_relaxed(drvdata->dsb->edge_ctrl_mask[i], drvdata->base + TPDM_DSB_EDCMR(i)); - + for (i = 0; i < TPDM_DSB_MAX_PATT; i++) { + writel_relaxed(drvdata->dsb->trig_patt[i], + drvdata->base + TPDM_DSB_XPR(i)); + writel_relaxed(drvdata->dsb->trig_patt_mask[i], + drvdata->base + TPDM_DSB_XPMR(i)); + } val = readl_relaxed(drvdata->base + TPDM_DSB_TIER); /* Set trigger timestamp */ if (drvdata->dsb->trig_ts) @@ -532,6 +585,26 @@ static struct attribute *tpdm_dsb_edge_attrs[] = { NULL, }; +static struct attribute *tpdm_dsb_trig_patt_attrs[] = { + DSB_TRIG_PATT_ATTR(0), + DSB_TRIG_PATT_ATTR(1), + DSB_TRIG_PATT_ATTR(2), + DSB_TRIG_PATT_ATTR(3), + DSB_TRIG_PATT_ATTR(4), + DSB_TRIG_PATT_ATTR(5), + DSB_TRIG_PATT_ATTR(6), + DSB_TRIG_PATT_ATTR(7), + DSB_TRIG_PATT_MASK_ATTR(0), + DSB_TRIG_PATT_MASK_ATTR(1), + DSB_TRIG_PATT_MASK_ATTR(2), + DSB_TRIG_PATT_MASK_ATTR(3), + DSB_TRIG_PATT_MASK_ATTR(4), + DSB_TRIG_PATT_MASK_ATTR(5), + DSB_TRIG_PATT_MASK_ATTR(6), + DSB_TRIG_PATT_MASK_ATTR(7), + NULL, +}; + static struct attribute *tpdm_dsb_attrs[] = { &dev_attr_dsb_mode.attr, &dev_attr_dsb_trig_ts.attr, @@ -550,10 +623,17 @@ static struct attribute_group tpdm_dsb_edge_grp = { .name = "dsb_edge", }; +static struct attribute_group tpdm_dsb_trig_patt_grp = { + .attrs = tpdm_dsb_trig_patt_attrs, + .is_visible = tpdm_dsb_is_visible, + .name = "dsb_trig_patt", +}; + static const struct attribute_group *tpdm_attr_grps[] = { &tpdm_attr_grp, &tpdm_dsb_attr_grp, &tpdm_dsb_edge_grp, + &tpdm_dsb_trig_patt_grp, NULL, }; diff --git a/drivers/hwtracing/coresight/coresight-tpdm.h b/drivers/hwtracing/coresight/coresight-tpdm.h index a9c65d96316a0..2cf7bdbdbb15b 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.h +++ b/drivers/hwtracing/coresight/coresight-tpdm.h @@ -12,6 +12,8 @@ /* DSB Subunit Registers */ #define TPDM_DSB_CR (0x780) #define TPDM_DSB_TIER (0x784) +#define TPDM_DSB_XPR(n) (0x7C8 + (n * 4)) +#define TPDM_DSB_XPMR(n) (0x7E8 + (n * 4)) #define TPDM_DSB_EDCR(n) (0x808 + (n * 4)) #define TPDM_DSB_EDCMR(n) (0x848 + (n * 4)) @@ -80,6 +82,8 @@ #define TPDM_DSB_MAX_EDCR 16 /* MAX number of EDCMR registers */ #define TPDM_DSB_MAX_EDCMR 8 +/* MAX number of DSB pattern */ +#define TPDM_DSB_MAX_PATT 8 #define tpdm_simple_dataset_ro(name, mem, idx) \ (&((struct tpdm_dataset_attribute[]) { \ @@ -90,6 +94,16 @@ } \ })[0].attr.attr) +#define tpdm_simple_dataset_rw(name, mem, idx) \ + (&((struct tpdm_dataset_attribute[]) { \ + { \ + __ATTR(name, 0644, tpdm_simple_dataset_show, \ + tpdm_simple_dataset_store), \ + mem, \ + idx, \ + } \ + })[0].attr.attr) + #define DSB_EDGE_CTRL_ATTR(nr) \ tpdm_simple_dataset_ro(edcr##nr, \ DSB_EDGE_CTRL, nr) @@ -98,12 +112,22 @@ tpdm_simple_dataset_ro(edcmr##nr, \ DSB_EDGE_CTRL_MASK, nr) +#define DSB_TRIG_PATT_ATTR(nr) \ + tpdm_simple_dataset_rw(xpr##nr, \ + DSB_TRIG_PATT, nr) + +#define DSB_TRIG_PATT_MASK_ATTR(nr) \ + tpdm_simple_dataset_rw(xpmr##nr, \ + DSB_TRIG_PATT_MASK, nr) + /** * struct dsb_dataset - specifics associated to dsb dataset * @mode: DSB programming mode * @edge_ctrl_idx Index number of the edge control * @edge_ctrl: Save value for edge control * @edge_ctrl_mask: Save value for edge control mask + * @trig_patt: Save value for trigger pattern + * @trig_patt_mask: Save value for trigger pattern mask * @trig_ts: Enable/Disable trigger timestamp. * @trig_type: Enable/Disable trigger type. */ @@ -112,6 +136,8 @@ struct dsb_dataset { u32 edge_ctrl_idx; u32 edge_ctrl[TPDM_DSB_MAX_EDCR]; u32 edge_ctrl_mask[TPDM_DSB_MAX_EDCMR]; + u32 trig_patt[TPDM_DSB_MAX_PATT]; + u32 trig_patt_mask[TPDM_DSB_MAX_PATT]; bool trig_ts; bool trig_type; }; @@ -141,6 +167,8 @@ struct tpdm_drvdata { enum dataset_mem { DSB_EDGE_CTRL, DSB_EDGE_CTRL_MASK, + DSB_TRIG_PATT, + DSB_TRIG_PATT_MASK, }; /** From de82ea570a9971cbd769c4c2746fdf9ab54f55ac Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:44 +0800 Subject: [PATCH 305/548] coresight-tpdm: Add nodes for timestamp request Add nodes to configure the timestamp request based on input pattern match. Each TPDM that support DSB subunit has maximum of n(n<7) TPR registers to configure value for timestamp request based on input pattern match. Eight 32 bit registers providing DSB interface timestamp request pattern match comparison. And each TPDM that support DSB subunit has maximum of m(m<7) TPMR registers to configure pattern mask for timestamp request. Eight 32 bit registers providing DSB interface timestamp request pattern match mask generation. Add nodes to enable/disable pattern timestamp and set pattern timestamp type. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-12-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 4c983382a29eaddd8746af23702f657258bb91cc) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 40 ++++- drivers/hwtracing/coresight/coresight-tpdm.c | 155 +++++++++++++++++- drivers/hwtracing/coresight/coresight-tpdm.h | 24 +++ 3 files changed, 211 insertions(+), 8 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index 2252e4706a90d..1f20a3f7df7df 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -123,4 +123,42 @@ KernelVersion 6.7 Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) Description: (RW) Set/Get the mask of the trigger pattern for the DSB - subunit TPDM. \ No newline at end of file + subunit TPDM. + +What: /sys/bus/coresight/devices//dsb_patt/tpr[0:7] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the value of the pattern for the DSB subunit TPDM. + +What: /sys/bus/coresight/devices//dsb_patt/tpmr[0:7] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the mask of the pattern for the DSB subunit TPDM. + +What: /sys/bus/coresight/devices//dsb_patt/enable_ts +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (Write) Set the pattern timestamp of DSB tpdm. Read + the pattern timestamp of DSB tpdm. + + Accepts only one of the 2 values - 0 or 1. + 0 : Disable DSB pattern timestamp. + 1 : Enable DSB pattern timestamp. + +What: /sys/bus/coresight/devices//dsb_patt/set_type +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (Write) Set the pattern type of DSB tpdm. Read + the pattern type of DSB tpdm. + + Accepts only one of the 2 values - 0 or 1. + 0 : Set the DSB pattern type to value. + 1 : Set the DSB pattern type to toggle. diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index e04c41f832656..693b90c82f3a0 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -51,6 +51,16 @@ static ssize_t tpdm_simple_dataset_show(struct device *dev, return -EINVAL; return sysfs_emit(buf, "0x%x\n", drvdata->dsb->trig_patt_mask[tpdm_attr->idx]); + case DSB_PATT: + if (tpdm_attr->idx >= TPDM_DSB_MAX_PATT) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->patt_val[tpdm_attr->idx]); + case DSB_PATT_MASK: + if (tpdm_attr->idx >= TPDM_DSB_MAX_PATT) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->patt_mask[tpdm_attr->idx]); } return -EINVAL; } @@ -85,6 +95,18 @@ static ssize_t tpdm_simple_dataset_store(struct device *dev, else ret = -EINVAL; break; + case DSB_PATT: + if (tpdm_attr->idx < TPDM_DSB_MAX_PATT) + drvdata->dsb->patt_val[tpdm_attr->idx] = val; + else + ret = -EINVAL; + break; + case DSB_PATT_MASK: + if (tpdm_attr->idx < TPDM_DSB_MAX_PATT) + drvdata->dsb->patt_mask[tpdm_attr->idx] = val; + else + ret = -EINVAL; + break; default: ret = -EINVAL; } @@ -141,6 +163,36 @@ static void set_dsb_mode(struct tpdm_drvdata *drvdata, u32 *val) *val &= ~TPDM_DSB_CR_MODE; } +static void set_dsb_tier(struct tpdm_drvdata *drvdata) +{ + u32 val; + + val = readl_relaxed(drvdata->base + TPDM_DSB_TIER); + + /* Clear all relevant fields */ + val &= ~(TPDM_DSB_TIER_PATT_TSENAB | TPDM_DSB_TIER_PATT_TYPE | + TPDM_DSB_TIER_XTRIG_TSENAB); + + /* Set pattern timestamp type and enablement */ + if (drvdata->dsb->patt_ts) { + val |= TPDM_DSB_TIER_PATT_TSENAB; + if (drvdata->dsb->patt_type) + val |= TPDM_DSB_TIER_PATT_TYPE; + else + val &= ~TPDM_DSB_TIER_PATT_TYPE; + } else { + val &= ~TPDM_DSB_TIER_PATT_TSENAB; + } + + /* Set trigger timestamp */ + if (drvdata->dsb->trig_ts) + val |= TPDM_DSB_TIER_XTRIG_TSENAB; + else + val &= ~TPDM_DSB_TIER_XTRIG_TSENAB; + + writel_relaxed(val, drvdata->base + TPDM_DSB_TIER); +} + static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) { u32 val, i; @@ -152,18 +204,17 @@ static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) writel_relaxed(drvdata->dsb->edge_ctrl_mask[i], drvdata->base + TPDM_DSB_EDCMR(i)); for (i = 0; i < TPDM_DSB_MAX_PATT; i++) { + writel_relaxed(drvdata->dsb->patt_val[i], + drvdata->base + TPDM_DSB_TPR(i)); + writel_relaxed(drvdata->dsb->patt_mask[i], + drvdata->base + TPDM_DSB_TPMR(i)); writel_relaxed(drvdata->dsb->trig_patt[i], drvdata->base + TPDM_DSB_XPR(i)); writel_relaxed(drvdata->dsb->trig_patt_mask[i], drvdata->base + TPDM_DSB_XPMR(i)); } - val = readl_relaxed(drvdata->base + TPDM_DSB_TIER); - /* Set trigger timestamp */ - if (drvdata->dsb->trig_ts) - val |= TPDM_DSB_TIER_XTRIG_TSENAB; - else - val &= ~TPDM_DSB_TIER_XTRIG_TSENAB; - writel_relaxed(val, drvdata->base + TPDM_DSB_TIER); + + set_dsb_tier(drvdata); val = readl_relaxed(drvdata->base + TPDM_DSB_CR); /* Set the mode of DSB dataset */ @@ -483,6 +534,67 @@ static ssize_t ctrl_mask_store(struct device *dev, } static DEVICE_ATTR_WO(ctrl_mask); +static ssize_t enable_ts_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + return sysfs_emit(buf, "%u\n", + (unsigned int)drvdata->dsb->patt_ts); +} + +/* + * value 1: Enable/Disable DSB pattern timestamp + */ +static ssize_t enable_ts_store(struct device *dev, + struct device_attribute *attr, + const char *buf, + size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + + if ((kstrtoul(buf, 0, &val)) || (val & ~1UL)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + drvdata->dsb->patt_ts = !!val; + spin_unlock(&drvdata->spinlock); + return size; +} +static DEVICE_ATTR_RW(enable_ts); + +static ssize_t set_type_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + + return sysfs_emit(buf, "%u\n", + (unsigned int)drvdata->dsb->patt_type); +} + +/* + * value 1: Set DSB pattern type + */ +static ssize_t set_type_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t size) +{ + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + unsigned long val; + + if ((kstrtoul(buf, 0, &val)) || (val & ~1UL)) + return -EINVAL; + + spin_lock(&drvdata->spinlock); + drvdata->dsb->patt_type = val; + spin_unlock(&drvdata->spinlock); + return size; +} +static DEVICE_ATTR_RW(set_type); + static ssize_t dsb_trig_type_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -605,6 +717,28 @@ static struct attribute *tpdm_dsb_trig_patt_attrs[] = { NULL, }; +static struct attribute *tpdm_dsb_patt_attrs[] = { + DSB_PATT_ATTR(0), + DSB_PATT_ATTR(1), + DSB_PATT_ATTR(2), + DSB_PATT_ATTR(3), + DSB_PATT_ATTR(4), + DSB_PATT_ATTR(5), + DSB_PATT_ATTR(6), + DSB_PATT_ATTR(7), + DSB_PATT_MASK_ATTR(0), + DSB_PATT_MASK_ATTR(1), + DSB_PATT_MASK_ATTR(2), + DSB_PATT_MASK_ATTR(3), + DSB_PATT_MASK_ATTR(4), + DSB_PATT_MASK_ATTR(5), + DSB_PATT_MASK_ATTR(6), + DSB_PATT_MASK_ATTR(7), + &dev_attr_enable_ts.attr, + &dev_attr_set_type.attr, + NULL, +}; + static struct attribute *tpdm_dsb_attrs[] = { &dev_attr_dsb_mode.attr, &dev_attr_dsb_trig_ts.attr, @@ -629,11 +763,18 @@ static struct attribute_group tpdm_dsb_trig_patt_grp = { .name = "dsb_trig_patt", }; +static struct attribute_group tpdm_dsb_patt_grp = { + .attrs = tpdm_dsb_patt_attrs, + .is_visible = tpdm_dsb_is_visible, + .name = "dsb_patt", +}; + static const struct attribute_group *tpdm_attr_grps[] = { &tpdm_attr_grp, &tpdm_dsb_attr_grp, &tpdm_dsb_edge_grp, &tpdm_dsb_trig_patt_grp, + &tpdm_dsb_patt_grp, NULL, }; diff --git a/drivers/hwtracing/coresight/coresight-tpdm.h b/drivers/hwtracing/coresight/coresight-tpdm.h index 2cf7bdbdbb15b..891979db111ac 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.h +++ b/drivers/hwtracing/coresight/coresight-tpdm.h @@ -12,6 +12,8 @@ /* DSB Subunit Registers */ #define TPDM_DSB_CR (0x780) #define TPDM_DSB_TIER (0x784) +#define TPDM_DSB_TPR(n) (0x788 + (n * 4)) +#define TPDM_DSB_TPMR(n) (0x7A8 + (n * 4)) #define TPDM_DSB_XPR(n) (0x7C8 + (n * 4)) #define TPDM_DSB_XPMR(n) (0x7E8 + (n * 4)) #define TPDM_DSB_EDCR(n) (0x808 + (n * 4)) @@ -28,8 +30,12 @@ /* Data bits for DSB test mode */ #define TPDM_DSB_CR_TEST_MODE GENMASK(10, 9) +/* Enable bit for DSB subunit pattern timestamp */ +#define TPDM_DSB_TIER_PATT_TSENAB BIT(0) /* Enable bit for DSB subunit trigger timestamp */ #define TPDM_DSB_TIER_XTRIG_TSENAB BIT(1) +/* Bit for DSB subunit pattern type */ +#define TPDM_DSB_TIER_PATT_TYPE BIT(2) /* DSB programming modes */ /* DSB mode bits mask */ @@ -120,14 +126,26 @@ tpdm_simple_dataset_rw(xpmr##nr, \ DSB_TRIG_PATT_MASK, nr) +#define DSB_PATT_ATTR(nr) \ + tpdm_simple_dataset_rw(tpr##nr, \ + DSB_PATT, nr) + +#define DSB_PATT_MASK_ATTR(nr) \ + tpdm_simple_dataset_rw(tpmr##nr, \ + DSB_PATT_MASK, nr) + /** * struct dsb_dataset - specifics associated to dsb dataset * @mode: DSB programming mode * @edge_ctrl_idx Index number of the edge control * @edge_ctrl: Save value for edge control * @edge_ctrl_mask: Save value for edge control mask + * @patt_val: Save value for pattern + * @patt_mask: Save value for pattern mask * @trig_patt: Save value for trigger pattern * @trig_patt_mask: Save value for trigger pattern mask + * @patt_ts: Enable/Disable pattern timestamp + * @patt_type: Set pattern type * @trig_ts: Enable/Disable trigger timestamp. * @trig_type: Enable/Disable trigger type. */ @@ -136,8 +154,12 @@ struct dsb_dataset { u32 edge_ctrl_idx; u32 edge_ctrl[TPDM_DSB_MAX_EDCR]; u32 edge_ctrl_mask[TPDM_DSB_MAX_EDCMR]; + u32 patt_val[TPDM_DSB_MAX_PATT]; + u32 patt_mask[TPDM_DSB_MAX_PATT]; u32 trig_patt[TPDM_DSB_MAX_PATT]; u32 trig_patt_mask[TPDM_DSB_MAX_PATT]; + bool patt_ts; + bool patt_type; bool trig_ts; bool trig_type; }; @@ -169,6 +191,8 @@ enum dataset_mem { DSB_EDGE_CTRL_MASK, DSB_TRIG_PATT, DSB_TRIG_PATT_MASK, + DSB_PATT, + DSB_PATT_MASK, }; /** From 0621941b58a745715e2bc31f564ba055973c3478 Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:45 +0800 Subject: [PATCH 306/548] dt-bindings: arm: Add support for DSB MSR register Add property "qcom,dsb-msrs-num" to support DSB(Discrete Single Bit) MSR(mux select register) for TPDM. It specifies the number of MSR registers supported by the DSB TDPM. Signed-off-by: Tao Zhang Acked-by: Rob Herring Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-13-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 8e05f86f07a0359584ceb2715fedcc4daf29d898) Signed-off-by: Koba Ko --- .../devicetree/bindings/arm/qcom,coresight-tpdm.yaml | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml index ee99768080128..5559df419f8e1 100644 --- a/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml @@ -51,6 +51,15 @@ properties: $ref: /schemas/types.yaml#/definitions/uint8 enum: [32, 64] + qcom,dsb-msrs-num: + description: + Specifies the number of DSB(Discrete Single Bit) MSR(mux select register) + registers supported by the monitor. If this property is not configured + or set to 0, it means this DSB TPDM doesn't support MSR. + $ref: /schemas/types.yaml#/definitions/uint32 + minimum: 0 + maximum: 32 + clocks: maxItems: 1 @@ -85,6 +94,7 @@ examples: reg = <0x0684c000 0x1000>; qcom,dsb-element-size = /bits/ 8 <32>; + qcom,dsb-msrs-num = <16>; clocks = <&aoss_qmp>; clock-names = "apb_pclk"; From 5a0547a827a9b565b7580f38fdd6c3f543d1641b Mon Sep 17 00:00:00 2001 From: Tao Zhang Date: Thu, 28 Sep 2023 14:29:46 +0800 Subject: [PATCH 307/548] coresight-tpdm: Add nodes for dsb msr support MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add the nodes for DSB subunit MSR(mux select register) support. The TPDM MSR (mux select register) interface is an optional interface and associated bank of registers per TPDM subunit. The intent of mux select registers is to control muxing structures driving the TPDM’s’ various subunit interfaces. Signed-off-by: Tao Zhang Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/1695882586-10306-14-git-send-email-quic_taozha@quicinc.com (cherry picked from commit 350ba15ae187c118979566f1288adb5f69f24230) Signed-off-by: Koba Ko --- .../testing/sysfs-bus-coresight-devices-tpdm | 8 ++ drivers/hwtracing/coresight/coresight-tpdm.c | 85 +++++++++++++++++++ drivers/hwtracing/coresight/coresight-tpdm.h | 12 +++ 3 files changed, 105 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm index 1f20a3f7df7df..f07218e788439 100644 --- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm +++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm @@ -162,3 +162,11 @@ Description: Accepts only one of the 2 values - 0 or 1. 0 : Set the DSB pattern type to value. 1 : Set the DSB pattern type to toggle. + +What: /sys/bus/coresight/devices//dsb_msr/msr[0:31] +Date: March 2023 +KernelVersion 6.7 +Contact: Jinlong Mao (QUIC) , Tao Zhang (QUIC) +Description: + (RW) Set/Get the MSR(mux select register) for the DSB subunit + TPDM. diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index 693b90c82f3a0..b25284e06395f 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -61,6 +61,11 @@ static ssize_t tpdm_simple_dataset_show(struct device *dev, return -EINVAL; return sysfs_emit(buf, "0x%x\n", drvdata->dsb->patt_mask[tpdm_attr->idx]); + case DSB_MSR: + if (tpdm_attr->idx >= drvdata->dsb_msr_num) + return -EINVAL; + return sysfs_emit(buf, "0x%x\n", + drvdata->dsb->msr[tpdm_attr->idx]); } return -EINVAL; } @@ -107,6 +112,12 @@ static ssize_t tpdm_simple_dataset_store(struct device *dev, else ret = -EINVAL; break; + case DSB_MSR: + if (tpdm_attr->idx < drvdata->dsb_msr_num) + drvdata->dsb->msr[tpdm_attr->idx] = val; + else + ret = -EINVAL; + break; default: ret = -EINVAL; } @@ -132,6 +143,22 @@ static umode_t tpdm_dsb_is_visible(struct kobject *kobj, return 0; } +static umode_t tpdm_dsb_msr_is_visible(struct kobject *kobj, + struct attribute *attr, int n) +{ + struct device *dev = kobj_to_dev(kobj); + struct tpdm_drvdata *drvdata = dev_get_drvdata(dev->parent); + struct device_attribute *dev_attr = + container_of(attr, struct device_attribute, attr); + struct tpdm_dataset_attribute *tpdm_attr = + container_of(dev_attr, struct tpdm_dataset_attribute, attr); + + if (tpdm_attr->idx < drvdata->dsb_msr_num) + return attr->mode; + + return 0; +} + static void tpdm_reset_datasets(struct tpdm_drvdata *drvdata) { if (tpdm_has_dsb_dataset(drvdata)) { @@ -193,6 +220,15 @@ static void set_dsb_tier(struct tpdm_drvdata *drvdata) writel_relaxed(val, drvdata->base + TPDM_DSB_TIER); } +static void set_dsb_msr(struct tpdm_drvdata *drvdata) +{ + int i; + + for (i = 0; i < drvdata->dsb_msr_num; i++) + writel_relaxed(drvdata->dsb->msr[i], + drvdata->base + TPDM_DSB_MSR(i)); +} + static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) { u32 val, i; @@ -216,6 +252,8 @@ static void tpdm_enable_dsb(struct tpdm_drvdata *drvdata) set_dsb_tier(drvdata); + set_dsb_msr(drvdata); + val = readl_relaxed(drvdata->base + TPDM_DSB_CR); /* Set the mode of DSB dataset */ set_dsb_mode(drvdata, &val); @@ -739,6 +777,42 @@ static struct attribute *tpdm_dsb_patt_attrs[] = { NULL, }; +static struct attribute *tpdm_dsb_msr_attrs[] = { + DSB_MSR_ATTR(0), + DSB_MSR_ATTR(1), + DSB_MSR_ATTR(2), + DSB_MSR_ATTR(3), + DSB_MSR_ATTR(4), + DSB_MSR_ATTR(5), + DSB_MSR_ATTR(6), + DSB_MSR_ATTR(7), + DSB_MSR_ATTR(8), + DSB_MSR_ATTR(9), + DSB_MSR_ATTR(10), + DSB_MSR_ATTR(11), + DSB_MSR_ATTR(12), + DSB_MSR_ATTR(13), + DSB_MSR_ATTR(14), + DSB_MSR_ATTR(15), + DSB_MSR_ATTR(16), + DSB_MSR_ATTR(17), + DSB_MSR_ATTR(18), + DSB_MSR_ATTR(19), + DSB_MSR_ATTR(20), + DSB_MSR_ATTR(21), + DSB_MSR_ATTR(22), + DSB_MSR_ATTR(23), + DSB_MSR_ATTR(24), + DSB_MSR_ATTR(25), + DSB_MSR_ATTR(26), + DSB_MSR_ATTR(27), + DSB_MSR_ATTR(28), + DSB_MSR_ATTR(29), + DSB_MSR_ATTR(30), + DSB_MSR_ATTR(31), + NULL, +}; + static struct attribute *tpdm_dsb_attrs[] = { &dev_attr_dsb_mode.attr, &dev_attr_dsb_trig_ts.attr, @@ -769,12 +843,19 @@ static struct attribute_group tpdm_dsb_patt_grp = { .name = "dsb_patt", }; +static struct attribute_group tpdm_dsb_msr_grp = { + .attrs = tpdm_dsb_msr_attrs, + .is_visible = tpdm_dsb_msr_is_visible, + .name = "dsb_msr", +}; + static const struct attribute_group *tpdm_attr_grps[] = { &tpdm_attr_grp, &tpdm_dsb_attr_grp, &tpdm_dsb_edge_grp, &tpdm_dsb_trig_patt_grp, &tpdm_dsb_patt_grp, + &tpdm_dsb_msr_grp, NULL, }; @@ -809,6 +890,10 @@ static int tpdm_probe(struct amba_device *adev, const struct amba_id *id) if (ret) return ret; + if (drvdata && tpdm_has_dsb_dataset(drvdata)) + of_property_read_u32(drvdata->dev->of_node, + "qcom,dsb_msr_num", &drvdata->dsb_msr_num); + /* Set up coresight component description */ desc.name = coresight_alloc_device_name(&tpdm_devs, dev); if (!desc.name) diff --git a/drivers/hwtracing/coresight/coresight-tpdm.h b/drivers/hwtracing/coresight/coresight-tpdm.h index 891979db111ac..4115b2a17b8d8 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.h +++ b/drivers/hwtracing/coresight/coresight-tpdm.h @@ -18,6 +18,7 @@ #define TPDM_DSB_XPMR(n) (0x7E8 + (n * 4)) #define TPDM_DSB_EDCR(n) (0x808 + (n * 4)) #define TPDM_DSB_EDCMR(n) (0x848 + (n * 4)) +#define TPDM_DSB_MSR(n) (0x980 + (n * 4)) /* Enable bit for DSB subunit */ #define TPDM_DSB_CR_ENA BIT(0) @@ -90,6 +91,8 @@ #define TPDM_DSB_MAX_EDCMR 8 /* MAX number of DSB pattern */ #define TPDM_DSB_MAX_PATT 8 +/* MAX number of DSB MSR */ +#define TPDM_DSB_MAX_MSR 32 #define tpdm_simple_dataset_ro(name, mem, idx) \ (&((struct tpdm_dataset_attribute[]) { \ @@ -134,6 +137,10 @@ tpdm_simple_dataset_rw(tpmr##nr, \ DSB_PATT_MASK, nr) +#define DSB_MSR_ATTR(nr) \ + tpdm_simple_dataset_rw(msr##nr, \ + DSB_MSR, nr) + /** * struct dsb_dataset - specifics associated to dsb dataset * @mode: DSB programming mode @@ -144,6 +151,7 @@ * @patt_mask: Save value for pattern mask * @trig_patt: Save value for trigger pattern * @trig_patt_mask: Save value for trigger pattern mask + * @msr Save value for MSR * @patt_ts: Enable/Disable pattern timestamp * @patt_type: Set pattern type * @trig_ts: Enable/Disable trigger timestamp. @@ -158,6 +166,7 @@ struct dsb_dataset { u32 patt_mask[TPDM_DSB_MAX_PATT]; u32 trig_patt[TPDM_DSB_MAX_PATT]; u32 trig_patt_mask[TPDM_DSB_MAX_PATT]; + u32 msr[TPDM_DSB_MAX_MSR]; bool patt_ts; bool patt_type; bool trig_ts; @@ -173,6 +182,7 @@ struct dsb_dataset { * @enable: enable status of the component. * @datasets: The datasets types present of the TPDM. * @dsb Specifics associated to TPDM DSB. + * @dsb_msr_num Number of MSR supported by DSB TPDM */ struct tpdm_drvdata { @@ -183,6 +193,7 @@ struct tpdm_drvdata { bool enable; unsigned long datasets; struct dsb_dataset *dsb; + u32 dsb_msr_num; }; /* Enumerate members of various datasets */ @@ -193,6 +204,7 @@ enum dataset_mem { DSB_TRIG_PATT_MASK, DSB_PATT, DSB_PATT_MASK, + DSB_MSR, }; /** From 7fb3481c2ac55ced276afbd8b95198b449309e44 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:33 +0000 Subject: [PATCH 308/548] coresight: Make language around "activated" sinks consistent Activated has the specific meaning of a sink that's selected for use by the user via sysfs. But comments in some code that's shared by Perf use the same word, so in those cases change them to just say "selected" instead. With selected implying either via Perf or "activated" via sysfs. coresight_get_enabled_sink() doesn't actually get an enabled sink, it only gets an activated one, so change that too. And change the activated variable name to include "sysfs" so it can't be confused as a general status. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-3-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit a0fef3f05cf36338d471e8f35a9ced88a054d583) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 51 ++++++++------------ drivers/hwtracing/coresight/coresight-priv.h | 2 - include/linux/coresight.h | 14 +++--- 3 files changed, 27 insertions(+), 40 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 37f7a331094f7..9190f567163c4 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -501,7 +501,7 @@ static void coresight_disable_path_from(struct list_head *path, /* * ETF devices are tricky... They can be a link or a sink, * depending on how they are configured. If an ETF has been - * "activated" it will be configured as a sink, otherwise + * selected as a sink it will be configured as a sink, otherwise * go ahead with the link configuration. */ if (type == CORESIGHT_DEV_TYPE_LINKSINK) @@ -579,7 +579,7 @@ int coresight_enable_path(struct list_head *path, enum cs_mode mode, /* * ETF devices are tricky... They can be a link or a sink, * depending on how they are configured. If an ETF has been - * "activated" it will be configured as a sink, otherwise + * selected as a sink it will be configured as a sink, otherwise * go ahead with the link configuration. */ if (type == CORESIGHT_DEV_TYPE_LINKSINK) @@ -636,15 +636,21 @@ struct coresight_device *coresight_get_sink(struct list_head *path) return csdev; } +/** + * coresight_find_activated_sysfs_sink - returns the first sink activated via + * sysfs using connection based search starting from the source reference. + * + * @csdev: Coresight source device reference + */ static struct coresight_device * -coresight_find_enabled_sink(struct coresight_device *csdev) +coresight_find_activated_sysfs_sink(struct coresight_device *csdev) { int i; struct coresight_device *sink = NULL; if ((csdev->type == CORESIGHT_DEV_TYPE_SINK || csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) && - csdev->activated) + csdev->sysfs_sink_activated) return csdev; /* @@ -655,7 +661,7 @@ coresight_find_enabled_sink(struct coresight_device *csdev) child_dev = csdev->pdata->out_conns[i]->dest_dev; if (child_dev) - sink = coresight_find_enabled_sink(child_dev); + sink = coresight_find_activated_sysfs_sink(child_dev); if (sink) return sink; } @@ -663,21 +669,6 @@ coresight_find_enabled_sink(struct coresight_device *csdev) return NULL; } -/** - * coresight_get_enabled_sink - returns the first enabled sink using - * connection based search starting from the source reference - * - * @source: Coresight source device reference - */ -struct coresight_device * -coresight_get_enabled_sink(struct coresight_device *source) -{ - if (!source) - return NULL; - - return coresight_find_enabled_sink(source); -} - static int coresight_sink_by_id(struct device *dev, const void *data) { struct coresight_device *csdev = to_coresight_device(dev); @@ -811,11 +802,10 @@ static void coresight_drop_device(struct coresight_device *csdev) * @sink: The final sink we want in this path. * @path: The list to add devices to. * - * The tree of Coresight device is traversed until an activated sink is - * found. From there the sink is added to the list along with all the - * devices that led to that point - the end result is a list from source - * to sink. In that list the source is the first device and the sink the - * last one. + * The tree of Coresight device is traversed until @sink is found. + * From there the sink is added to the list along with all the devices that led + * to that point - the end result is a list from source to sink. In that list + * the source is the first device and the sink the last one. */ static int _coresight_build_path(struct coresight_device *csdev, struct coresight_device *sink, @@ -825,7 +815,7 @@ static int _coresight_build_path(struct coresight_device *csdev, bool found = false; struct coresight_node *node; - /* An activated sink has been found. Enqueue the element */ + /* The sink has been found. Enqueue the element */ if (csdev == sink) goto out; @@ -1146,7 +1136,7 @@ int coresight_enable(struct coresight_device *csdev) goto out; } - sink = coresight_get_enabled_sink(csdev); + sink = coresight_find_activated_sysfs_sink(csdev); if (!sink) { ret = -EINVAL; goto out; @@ -1260,7 +1250,7 @@ static ssize_t enable_sink_show(struct device *dev, { struct coresight_device *csdev = to_coresight_device(dev); - return scnprintf(buf, PAGE_SIZE, "%u\n", csdev->activated); + return scnprintf(buf, PAGE_SIZE, "%u\n", csdev->sysfs_sink_activated); } static ssize_t enable_sink_store(struct device *dev, @@ -1275,10 +1265,7 @@ static ssize_t enable_sink_store(struct device *dev, if (ret) return ret; - if (val) - csdev->activated = true; - else - csdev->activated = false; + csdev->sysfs_sink_activated = !!val; return size; diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index b758a42ed8c73..8991b4a9e9699 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -131,8 +131,6 @@ void coresight_disable_path(struct list_head *path); int coresight_enable_path(struct list_head *path, enum cs_mode mode, void *sink_data); struct coresight_device *coresight_get_sink(struct list_head *path); -struct coresight_device * -coresight_get_enabled_sink(struct coresight_device *source); struct coresight_device *coresight_get_sink_by_id(u32 id); struct coresight_device * coresight_find_default_sink(struct coresight_device *csdev); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 4c9ff95c2c87f..c1fcab5ade233 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -229,10 +229,12 @@ struct coresight_sysfs_link { * @refcnt: keep track of what is in use. * @orphan: true if the component has connections that haven't been linked. * @enable: 'true' if component is currently part of an active path. - * @activated: 'true' only if a _sink_ has been activated. A sink can be - * activated but not yet enabled. Enabling for a _sink_ - * happens when a source has been selected and a path is enabled - * from source to that sink. + * @sysfs_sink_activated: 'true' when a sink has been selected for use via sysfs + * by writing a 1 to the 'enable_sink' file. A sink can be + * activated but not yet enabled. Enabling for a _sink_ happens + * when a source has been selected and a path is enabled from + * source to that sink. A sink can also become enabled but not + * activated if it's used via Perf. * @ea: Device attribute for sink representation under PMU directory. * @def_sink: cached reference to default sink found for this device. * @nr_links: number of sysfs links created to other components from this @@ -252,9 +254,9 @@ struct coresight_device { struct device dev; atomic_t refcnt; bool orphan; - bool enable; /* true only if configured as part of a path */ + bool enable; /* sink specific fields */ - bool activated; /* true only if a sink is part of a path */ + bool sysfs_sink_activated; struct dev_ext_attribute *ea; struct coresight_device *def_sink; /* sysfs links between components */ From 5eb08d0d826c36130730657cab7538ca0899610e Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:34 +0000 Subject: [PATCH 309/548] coresight: Remove ops callback checks The check for the existence of callbacks before using them implies that this happens and is supported. There are no devices without enable/disable callbacks, and it wouldn't be possible to add a new working device without adding them either, so just remove them. Furthermore, there are more callbacks than just enable and disable that are already used unguarded in other places. The comment about new session compatibility doesn't seem to match up to the line of code that it's on so remove it. I think it's alluding to the fact that sinks will check if they were already enabled via sysfs or Perf and fail the enable. But there are more detailed comments at those places, and this one isn't very useful. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-4-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit a11ebe138b8ddb119b1b75c143429f64d605a54e) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 51 +++++--------------- 1 file changed, 12 insertions(+), 39 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 9190f567163c4..1ab4b2e2b820b 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -280,16 +280,8 @@ EXPORT_SYMBOL_GPL(coresight_add_helper); static int coresight_enable_sink(struct coresight_device *csdev, enum cs_mode mode, void *data) { - int ret; - - /* - * We need to make sure the "new" session is compatible with the - * existing "mode" of operation. - */ - if (!sink_ops(csdev)->enable) - return -EINVAL; + int ret = sink_ops(csdev)->enable(csdev, mode, data); - ret = sink_ops(csdev)->enable(csdev, mode, data); if (ret) return ret; @@ -300,12 +292,7 @@ static int coresight_enable_sink(struct coresight_device *csdev, static void coresight_disable_sink(struct coresight_device *csdev) { - int ret; - - if (!sink_ops(csdev)->disable) - return; - - ret = sink_ops(csdev)->disable(csdev); + int ret = sink_ops(csdev)->disable(csdev); if (ret) return; csdev->enable = false; @@ -331,11 +318,9 @@ static int coresight_enable_link(struct coresight_device *csdev, if (link_subtype == CORESIGHT_DEV_SUBTYPE_LINK_SPLIT && IS_ERR(outconn)) return PTR_ERR(outconn); - if (link_ops(csdev)->enable) { - ret = link_ops(csdev)->enable(csdev, inconn, outconn); - if (!ret) - csdev->enable = true; - } + ret = link_ops(csdev)->enable(csdev, inconn, outconn); + if (!ret) + csdev->enable = true; return ret; } @@ -355,9 +340,7 @@ static void coresight_disable_link(struct coresight_device *csdev, outconn = coresight_find_out_connection(csdev, child); link_subtype = csdev->subtype.link_subtype; - if (link_ops(csdev)->disable) { - link_ops(csdev)->disable(csdev, inconn, outconn); - } + link_ops(csdev)->disable(csdev, inconn, outconn); if (link_subtype == CORESIGHT_DEV_SUBTYPE_LINK_MERG) { for (i = 0; i < csdev->pdata->nr_inconns; i++) @@ -383,11 +366,9 @@ int coresight_enable_source(struct coresight_device *csdev, enum cs_mode mode, int ret; if (!csdev->enable) { - if (source_ops(csdev)->enable) { - ret = source_ops(csdev)->enable(csdev, data, mode); - if (ret) - return ret; - } + ret = source_ops(csdev)->enable(csdev, data, mode); + if (ret) + return ret; csdev->enable = true; } @@ -405,11 +386,8 @@ static bool coresight_is_helper(struct coresight_device *csdev) static int coresight_enable_helper(struct coresight_device *csdev, enum cs_mode mode, void *data) { - int ret; + int ret = helper_ops(csdev)->enable(csdev, mode, data); - if (!helper_ops(csdev)->enable) - return 0; - ret = helper_ops(csdev)->enable(csdev, mode, data); if (ret) return ret; @@ -419,12 +397,8 @@ static int coresight_enable_helper(struct coresight_device *csdev, static void coresight_disable_helper(struct coresight_device *csdev) { - int ret; - - if (!helper_ops(csdev)->disable) - return; + int ret = helper_ops(csdev)->disable(csdev, NULL); - ret = helper_ops(csdev)->disable(csdev, NULL); if (ret) return; csdev->enable = false; @@ -454,8 +428,7 @@ static void coresight_disable_helpers(struct coresight_device *csdev) */ void coresight_disable_source(struct coresight_device *csdev, void *data) { - if (source_ops(csdev)->disable) - source_ops(csdev)->disable(csdev, data); + source_ops(csdev)->disable(csdev, data); coresight_disable_helpers(csdev); } EXPORT_SYMBOL_GPL(coresight_disable_source); From 55603dc71dee318428af26c1a38fb045c708fe53 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:35 +0000 Subject: [PATCH 310/548] coresight: Move mode to struct coresight_device Most devices use mode, so move the mode definition out of the individual devices and up to the Coresight device. This will allow the core code to also know the mode which will be useful in a later commit. This also fixes the inconsistency of the documentation of the mode field on the individual device types. For example ETB10 had "this ETB is being used". Two devices didn't require an atomic mode type, so these usages have been converted to atomic_get() and atomic_set() only to make it compile, but the documentation of the field in struct coresight_device explains this type of usage. In the future, manipulation of the mode could be completely moved out of the individual devices and into the core code because it's almost all duplicate code, and this change is a step towards that. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-5-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit 9cae77cf23e317f31de036ced7ad2c261317dc76) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-etb10.c | 18 ++++++------- drivers/hwtracing/coresight/coresight-etm.h | 2 -- .../coresight/coresight-etm3x-core.c | 13 +++++----- .../coresight/coresight-etm3x-sysfs.c | 4 +-- .../coresight/coresight-etm4x-core.c | 16 +++++------- drivers/hwtracing/coresight/coresight-etm4x.h | 1 - drivers/hwtracing/coresight/coresight-stm.c | 20 +++++++------- .../hwtracing/coresight/coresight-tmc-core.c | 2 +- .../hwtracing/coresight/coresight-tmc-etf.c | 26 +++++++++---------- .../hwtracing/coresight/coresight-tmc-etr.c | 20 +++++++------- drivers/hwtracing/coresight/coresight-tmc.h | 2 -- drivers/hwtracing/coresight/ultrasoc-smb.c | 16 ++++++++---- drivers/hwtracing/coresight/ultrasoc-smb.h | 2 -- include/linux/coresight.h | 6 +++++ 14 files changed, 73 insertions(+), 75 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index fa80039e0821f..08e5bba862db0 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -76,7 +76,6 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "etb"); * @pid: Process ID of the process being monitored by the session * that is using this component. * @buf: area of memory where ETB buffer content gets sent. - * @mode: this ETB is being used. * @buffer_depth: size of @buf. * @trigger_cntr: amount of words to store after a trigger. */ @@ -89,7 +88,6 @@ struct etb_drvdata { local_t reading; pid_t pid; u8 *buf; - u32 mode; u32 buffer_depth; u32 trigger_cntr; }; @@ -150,17 +148,17 @@ static int etb_enable_sysfs(struct coresight_device *csdev) spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't messup with perf sessions. */ - if (drvdata->mode == CS_MODE_PERF) { + if (local_read(&csdev->mode) == CS_MODE_PERF) { ret = -EBUSY; goto out; } - if (drvdata->mode == CS_MODE_DISABLED) { + if (local_read(&csdev->mode) == CS_MODE_DISABLED) { ret = etb_enable_hw(drvdata); if (ret) goto out; - drvdata->mode = CS_MODE_SYSFS; + local_set(&csdev->mode, CS_MODE_SYSFS); } atomic_inc(&csdev->refcnt); @@ -181,7 +179,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data) spin_lock_irqsave(&drvdata->spinlock, flags); /* No need to continue if the component is already in used by sysFS. */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { ret = -EBUSY; goto out; } @@ -216,7 +214,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data) if (!ret) { /* Associate with monitored process. */ drvdata->pid = pid; - drvdata->mode = CS_MODE_PERF; + local_set(&drvdata->csdev->mode, CS_MODE_PERF); atomic_inc(&csdev->refcnt); } @@ -362,11 +360,11 @@ static int etb_disable(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED); + WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); etb_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; - drvdata->mode = CS_MODE_DISABLED; + local_set(&csdev->mode, CS_MODE_DISABLED); spin_unlock_irqrestore(&drvdata->spinlock, flags); dev_dbg(&csdev->dev, "ETB disabled\n"); @@ -589,7 +587,7 @@ static void etb_dump(struct etb_drvdata *drvdata) unsigned long flags; spin_lock_irqsave(&drvdata->spinlock, flags); - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { __etb_disable_hw(drvdata); etb_dump_hw(drvdata); __etb_enable_hw(drvdata); diff --git a/drivers/hwtracing/coresight/coresight-etm.h b/drivers/hwtracing/coresight/coresight-etm.h index 9a0d08b092ae7..e02c3ea972c92 100644 --- a/drivers/hwtracing/coresight/coresight-etm.h +++ b/drivers/hwtracing/coresight/coresight-etm.h @@ -215,7 +215,6 @@ struct etm_config { * @port_size: port size as reported by ETMCR bit 4-6 and 21. * @arch: ETM/PTM version number. * @use_cpu14: true if management registers need to be accessed via CP14. - * @mode: this tracer's mode, i.e sysFS, Perf or disabled. * @sticky_enable: true if ETM base configuration has been done. * @boot_enable:true if we should start tracing at boot time. * @os_unlock: true if access to management registers is allowed. @@ -238,7 +237,6 @@ struct etm_drvdata { int port_size; u8 arch; bool use_cp14; - local_t mode; bool sticky_enable; bool boot_enable; bool os_unlock; diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index 116a91d90ac20..29f62dfac6739 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -559,7 +559,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event, u32 val; struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - val = local_cmpxchg(&drvdata->mode, CS_MODE_DISABLED, mode); + val = local_cmpxchg(&drvdata->csdev->mode, CS_MODE_DISABLED, mode); /* Someone is already using the tracer */ if (val) @@ -578,7 +578,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event, /* The tracer didn't start */ if (ret) - local_set(&drvdata->mode, CS_MODE_DISABLED); + local_set(&drvdata->csdev->mode, CS_MODE_DISABLED); return ret; } @@ -672,14 +672,13 @@ static void etm_disable(struct coresight_device *csdev, struct perf_event *event) { enum cs_mode mode; - struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); /* * For as long as the tracer isn't disabled another entity can't * change its status. As such we can read the status here without * fearing it will change under us. */ - mode = local_read(&drvdata->mode); + mode = local_read(&csdev->mode); switch (mode) { case CS_MODE_DISABLED: @@ -696,7 +695,7 @@ static void etm_disable(struct coresight_device *csdev, } if (mode) - local_set(&drvdata->mode, CS_MODE_DISABLED); + local_set(&csdev->mode, CS_MODE_DISABLED); } static const struct coresight_ops_source etm_source_ops = { @@ -730,7 +729,7 @@ static int etm_starting_cpu(unsigned int cpu) etmdrvdata[cpu]->os_unlock = true; } - if (local_read(&etmdrvdata[cpu]->mode)) + if (local_read(&etmdrvdata[cpu]->csdev->mode)) etm_enable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; @@ -742,7 +741,7 @@ static int etm_dying_cpu(unsigned int cpu) return 0; spin_lock(&etmdrvdata[cpu]->spinlock); - if (local_read(&etmdrvdata[cpu]->mode)) + if (local_read(&etmdrvdata[cpu]->csdev->mode)) etm_disable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; diff --git a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c index 2f271b7fb048c..6c8429c980b14 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c @@ -722,7 +722,7 @@ static ssize_t cntr_val_show(struct device *dev, struct etm_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etm_config *config = &drvdata->config; - if (!local_read(&drvdata->mode)) { + if (!local_read(&drvdata->csdev->mode)) { spin_lock(&drvdata->spinlock); for (i = 0; i < drvdata->nr_cntr; i++) ret += sprintf(buf, "counter %d: %x\n", @@ -941,7 +941,7 @@ static ssize_t seq_curr_state_show(struct device *dev, struct etm_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etm_config *config = &drvdata->config; - if (!local_read(&drvdata->mode)) { + if (!local_read(&drvdata->csdev->mode)) { val = config->seq_curr_state; goto out; } diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 05d9f87e35333..e111ddbc27778 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -859,9 +859,8 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, { int ret; u32 val; - struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - val = local_cmpxchg(&drvdata->mode, CS_MODE_DISABLED, mode); + val = local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, mode); /* Someone is already using the tracer */ if (val) @@ -880,7 +879,7 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, /* The tracer didn't start */ if (ret) - local_set(&drvdata->mode, CS_MODE_DISABLED); + local_set(&csdev->mode, CS_MODE_DISABLED); return ret; } @@ -1037,14 +1036,13 @@ static void etm4_disable(struct coresight_device *csdev, struct perf_event *event) { enum cs_mode mode; - struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); /* * For as long as the tracer isn't disabled another entity can't * change its status. As such we can read the status here without * fearing it will change under us. */ - mode = local_read(&drvdata->mode); + mode = local_read(&csdev->mode); switch (mode) { case CS_MODE_DISABLED: @@ -1058,7 +1056,7 @@ static void etm4_disable(struct coresight_device *csdev, } if (mode) - local_set(&drvdata->mode, CS_MODE_DISABLED); + local_set(&csdev->mode, CS_MODE_DISABLED); } static const struct coresight_ops_source etm4_source_ops = { @@ -1666,7 +1664,7 @@ static int etm4_starting_cpu(unsigned int cpu) if (!etmdrvdata[cpu]->os_unlock) etm4_os_unlock(etmdrvdata[cpu]); - if (local_read(&etmdrvdata[cpu]->mode)) + if (local_read(&etmdrvdata[cpu]->csdev->mode)) etm4_enable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; @@ -1678,7 +1676,7 @@ static int etm4_dying_cpu(unsigned int cpu) return 0; spin_lock(&etmdrvdata[cpu]->spinlock); - if (local_read(&etmdrvdata[cpu]->mode)) + if (local_read(&etmdrvdata[cpu]->csdev->mode)) etm4_disable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; @@ -1835,7 +1833,7 @@ static int etm4_cpu_save(struct etmv4_drvdata *drvdata) * Save and restore the ETM Trace registers only if * the ETM is active. */ - if (local_read(&drvdata->mode) && drvdata->save_state) + if (local_read(&drvdata->csdev->mode) && drvdata->save_state) ret = __etm4_cpu_save(drvdata); return ret; } diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h index 6b6760e49ed35..9e9165f62e81f 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.h +++ b/drivers/hwtracing/coresight/coresight-etm4x.h @@ -990,7 +990,6 @@ struct etmv4_drvdata { void __iomem *base; struct coresight_device *csdev; spinlock_t spinlock; - local_t mode; int cpu; u8 arch; u8 nr_pe; diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index a1c27c901ad17..190ba569b05a7 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -119,7 +119,6 @@ DEFINE_CORESIGHT_DEVLIST(stm_devs, "stm"); * @spinlock: only one at a time pls. * @chs: the channels accociated to this STM. * @stm: structure associated to the generic STM interface. - * @mode: this tracer's mode (enum cs_mode), i.e sysFS, or disabled. * @traceid: value of the current ID for this component. * @write_bytes: Maximus bytes this STM can write at a time. * @stmsper: settings for register STMSPER. @@ -136,7 +135,6 @@ struct stm_drvdata { spinlock_t spinlock; struct channel_space chs; struct stm_data stm; - local_t mode; u8 traceid; u32 write_bytes; u32 stmsper; @@ -201,7 +199,7 @@ static int stm_enable(struct coresight_device *csdev, struct perf_event *event, if (mode != CS_MODE_SYSFS) return -EINVAL; - val = local_cmpxchg(&drvdata->mode, CS_MODE_DISABLED, mode); + val = local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, mode); /* Someone is already using the tracer */ if (val) @@ -266,7 +264,7 @@ static void stm_disable(struct coresight_device *csdev, * change its status. As such we can read the status here without * fearing it will change under us. */ - if (local_read(&drvdata->mode) == CS_MODE_SYSFS) { + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { spin_lock(&drvdata->spinlock); stm_disable_hw(drvdata); spin_unlock(&drvdata->spinlock); @@ -276,7 +274,7 @@ static void stm_disable(struct coresight_device *csdev, pm_runtime_put(csdev->dev.parent); - local_set(&drvdata->mode, CS_MODE_DISABLED); + local_set(&csdev->mode, CS_MODE_DISABLED); dev_dbg(&csdev->dev, "STM tracing disabled\n"); } } @@ -373,7 +371,7 @@ static long stm_generic_set_options(struct stm_data *stm_data, { struct stm_drvdata *drvdata = container_of(stm_data, struct stm_drvdata, stm); - if (!(drvdata && local_read(&drvdata->mode))) + if (!(drvdata && local_read(&drvdata->csdev->mode))) return -EINVAL; if (channel >= drvdata->numsp) @@ -408,7 +406,7 @@ static ssize_t notrace stm_generic_packet(struct stm_data *stm_data, struct stm_drvdata, stm); unsigned int stm_flags; - if (!(drvdata && local_read(&drvdata->mode))) + if (!(drvdata && local_read(&drvdata->csdev->mode))) return -EACCES; if (channel >= drvdata->numsp) @@ -515,7 +513,7 @@ static ssize_t port_select_show(struct device *dev, struct stm_drvdata *drvdata = dev_get_drvdata(dev->parent); unsigned long val; - if (!local_read(&drvdata->mode)) { + if (!local_read(&drvdata->csdev->mode)) { val = drvdata->stmspscr; } else { spin_lock(&drvdata->spinlock); @@ -541,7 +539,7 @@ static ssize_t port_select_store(struct device *dev, spin_lock(&drvdata->spinlock); drvdata->stmspscr = val; - if (local_read(&drvdata->mode)) { + if (local_read(&drvdata->csdev->mode)) { CS_UNLOCK(drvdata->base); /* Process as per ARM's TRM recommendation */ stmsper = readl_relaxed(drvdata->base + STMSPER); @@ -562,7 +560,7 @@ static ssize_t port_enable_show(struct device *dev, struct stm_drvdata *drvdata = dev_get_drvdata(dev->parent); unsigned long val; - if (!local_read(&drvdata->mode)) { + if (!local_read(&drvdata->csdev->mode)) { val = drvdata->stmsper; } else { spin_lock(&drvdata->spinlock); @@ -588,7 +586,7 @@ static ssize_t port_enable_store(struct device *dev, spin_lock(&drvdata->spinlock); drvdata->stmsper = val; - if (local_read(&drvdata->mode)) { + if (local_read(&drvdata->csdev->mode)) { CS_UNLOCK(drvdata->base); writel_relaxed(drvdata->stmsper, drvdata->base + STMSPER); CS_LOCK(drvdata->base); diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c index c106d142e6322..90f3c06964d02 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-core.c +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c @@ -547,7 +547,7 @@ static void tmc_shutdown(struct amba_device *adev) spin_lock_irqsave(&drvdata->spinlock, flags); - if (drvdata->mode == CS_MODE_DISABLED) + if (local_read(&drvdata->csdev->mode) == CS_MODE_DISABLED) goto out; if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 7406b65e2cdda..2a7e516052a29 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -89,7 +89,7 @@ static void __tmc_etb_disable_hw(struct tmc_drvdata *drvdata) * When operating in sysFS mode the content of the buffer needs to be * read before the TMC is disabled. */ - if (drvdata->mode == CS_MODE_SYSFS) + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) tmc_etb_dump_hw(drvdata); tmc_disable_hw(drvdata); @@ -205,7 +205,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) * sink is already enabled no memory is needed and the HW need not be * touched. */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { atomic_inc(&csdev->refcnt); goto out; } @@ -228,7 +228,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) ret = tmc_etb_enable_hw(drvdata); if (!ret) { - drvdata->mode = CS_MODE_SYSFS; + local_set(&csdev->mode, CS_MODE_SYSFS); atomic_inc(&csdev->refcnt); } else { /* Free up the buffer if we failed to enable */ @@ -262,7 +262,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data) * No need to continue if the ETB/ETF is already operated * from sysFS. */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { ret = -EBUSY; break; } @@ -292,7 +292,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data) if (!ret) { /* Associate with monitored process. */ drvdata->pid = pid; - drvdata->mode = CS_MODE_PERF; + local_set(&csdev->mode, CS_MODE_PERF); atomic_inc(&csdev->refcnt); } } while (0); @@ -344,11 +344,11 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED); + WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); tmc_etb_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; - drvdata->mode = CS_MODE_DISABLED; + local_set(&csdev->mode, CS_MODE_DISABLED); spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -374,7 +374,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, if (atomic_read(&csdev->refcnt) == 0) { ret = tmc_etf_enable_hw(drvdata); if (!ret) { - drvdata->mode = CS_MODE_SYSFS; + local_set(&csdev->mode, CS_MODE_SYSFS); first_enable = true; } } @@ -403,7 +403,7 @@ static void tmc_disable_etf_link(struct coresight_device *csdev, if (atomic_dec_return(&csdev->refcnt) == 0) { tmc_etf_disable_hw(drvdata); - drvdata->mode = CS_MODE_DISABLED; + local_set(&csdev->mode, CS_MODE_DISABLED); last_disable = true; } spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -483,7 +483,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, return 0; /* This shouldn't happen */ - if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF)) + if (WARN_ON_ONCE(local_read(&csdev->mode) != CS_MODE_PERF)) return 0; spin_lock_irqsave(&drvdata->spinlock, flags); @@ -629,7 +629,7 @@ int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) } /* Don't interfere if operated from Perf */ - if (drvdata->mode == CS_MODE_PERF) { + if (local_read(&drvdata->csdev->mode) == CS_MODE_PERF) { ret = -EINVAL; goto out; } @@ -641,7 +641,7 @@ int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) } /* Disable the TMC if need be */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { /* There is no point in reading a TMC in HW FIFO mode */ mode = readl_relaxed(drvdata->base + TMC_MODE); if (mode != TMC_MODE_CIRCULAR_BUFFER) { @@ -673,7 +673,7 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata) spin_lock_irqsave(&drvdata->spinlock, flags); /* Re-enable the TMC if need be */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { /* There is no point in reading a TMC in HW FIFO mode */ mode = readl_relaxed(drvdata->base + TMC_MODE); if (mode != TMC_MODE_CIRCULAR_BUFFER) { diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index f3312fbcdc0f8..ab0beaf99739d 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1123,7 +1123,7 @@ static void __tmc_etr_disable_hw(struct tmc_drvdata *drvdata) * When operating in sysFS mode the content of the buffer needs to be * read before the TMC is disabled. */ - if (drvdata->mode == CS_MODE_SYSFS) + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) tmc_etr_sync_sysfs_buf(drvdata); tmc_disable_hw(drvdata); @@ -1169,7 +1169,7 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev) spin_lock_irqsave(&drvdata->spinlock, flags); } - if (drvdata->reading || drvdata->mode == CS_MODE_PERF) { + if (drvdata->reading || local_read(&csdev->mode) == CS_MODE_PERF) { ret = -EBUSY; goto out; } @@ -1210,14 +1210,14 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * sink is already enabled no memory is needed and the HW need not be * touched, even if the buffer size has changed. */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { atomic_inc(&csdev->refcnt); goto out; } ret = tmc_etr_enable_hw(drvdata, sysfs_buf); if (!ret) { - drvdata->mode = CS_MODE_SYSFS; + local_set(&csdev->mode, CS_MODE_SYSFS); atomic_inc(&csdev->refcnt); } @@ -1632,7 +1632,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't use this sink if it is already claimed by sysFS */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { rc = -EBUSY; goto unlock_out; } @@ -1664,7 +1664,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) if (!rc) { /* Associate with monitored process. */ drvdata->pid = pid; - drvdata->mode = CS_MODE_PERF; + local_set(&csdev->mode, CS_MODE_PERF); drvdata->perf_buf = etr_perf->etr_buf; atomic_inc(&csdev->refcnt); } @@ -1705,11 +1705,11 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED); + WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); tmc_etr_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; - drvdata->mode = CS_MODE_DISABLED; + local_set(&csdev->mode, CS_MODE_DISABLED); /* Reset perf specific data */ drvdata->perf_buf = NULL; @@ -1757,7 +1757,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) } /* Disable the TMC if we are trying to read from a running session. */ - if (drvdata->mode == CS_MODE_SYSFS) + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) __tmc_etr_disable_hw(drvdata); drvdata->reading = true; @@ -1779,7 +1779,7 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) spin_lock_irqsave(&drvdata->spinlock, flags); /* RE-enable the TMC if need be */ - if (drvdata->mode == CS_MODE_SYSFS) { + if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace * buffer. Since the tracer is still enabled drvdata::buf can't diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 0ee48c5ba764d..232db5bdc7fcf 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -177,7 +177,6 @@ struct etr_buf { * @size: trace buffer size for this TMC (common for all modes). * @max_burst_size: The maximum burst size that can be initiated by * TMC-ETR on AXI bus. - * @mode: how this TMC is being used. * @config_type: TMC variant, must be of type @tmc_config_type. * @memwidth: width of the memory interface databus, in bytes. * @trigger_cntr: amount of words to store after a trigger. @@ -202,7 +201,6 @@ struct tmc_drvdata { u32 len; u32 size; u32 max_burst_size; - u32 mode; enum tmc_config_type config_type; enum tmc_mem_intf_width memwidth; u32 trigger_cntr; diff --git a/drivers/hwtracing/coresight/ultrasoc-smb.c b/drivers/hwtracing/coresight/ultrasoc-smb.c index 6e32d31a95fe0..1d3b6642de1ea 100644 --- a/drivers/hwtracing/coresight/ultrasoc-smb.c +++ b/drivers/hwtracing/coresight/ultrasoc-smb.c @@ -216,11 +216,11 @@ static void smb_enable_sysfs(struct coresight_device *csdev) { struct smb_drv_data *drvdata = dev_get_drvdata(csdev->dev.parent); - if (drvdata->mode != CS_MODE_DISABLED) + if (local_read(&csdev->mode) != CS_MODE_DISABLED) return; smb_enable_hw(drvdata); - drvdata->mode = CS_MODE_SYSFS; + local_set(&csdev->mode, CS_MODE_SYSFS); } static int smb_enable_perf(struct coresight_device *csdev, void *data) @@ -243,7 +243,7 @@ static int smb_enable_perf(struct coresight_device *csdev, void *data) if (drvdata->pid == -1) { smb_enable_hw(drvdata); drvdata->pid = pid; - drvdata->mode = CS_MODE_PERF; + local_set(&csdev->mode, CS_MODE_PERF); } return 0; @@ -264,7 +264,8 @@ static int smb_enable(struct coresight_device *csdev, enum cs_mode mode, } /* Do nothing, the SMB is already enabled as other mode */ - if (drvdata->mode != CS_MODE_DISABLED && drvdata->mode != mode) { + if (local_read(&csdev->mode) != CS_MODE_DISABLED && + local_read(&csdev->mode) != mode) { ret = -EBUSY; goto out; } @@ -310,13 +311,18 @@ static int smb_disable(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(drvdata->mode == CS_MODE_DISABLED); + WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); smb_disable_hw(drvdata); /* Dissociate from the target process. */ drvdata->pid = -1; +<<<<<<< HEAD drvdata->mode = CS_MODE_DISABLED; +======= + local_set(&csdev->mode, CS_MODE_DISABLED); + dev_dbg(&csdev->dev, "Ultrasoc SMB disabled\n"); +>>>>>>> 9cae77cf23e3 (coresight: Move mode to struct coresight_device) dev_dbg(&csdev->dev, "Ultrasoc SMB disabled\n"); out: diff --git a/drivers/hwtracing/coresight/ultrasoc-smb.h b/drivers/hwtracing/coresight/ultrasoc-smb.h index 82a44c14a8829..a91d39cfccb8f 100644 --- a/drivers/hwtracing/coresight/ultrasoc-smb.h +++ b/drivers/hwtracing/coresight/ultrasoc-smb.h @@ -109,7 +109,6 @@ struct smb_data_buffer { * @reading: Synchronise user space access to SMB buffer. * @pid: Process ID of the process being monitored by the * session that is using this component. - * @mode: How this SMB is being used, perf mode or sysfs mode. */ struct smb_drv_data { void __iomem *base; @@ -119,7 +118,6 @@ struct smb_drv_data { spinlock_t spinlock; bool reading; pid_t pid; - enum cs_mode mode; }; #endif diff --git a/include/linux/coresight.h b/include/linux/coresight.h index c1fcab5ade233..3ef3c9f0de593 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -226,6 +226,11 @@ struct coresight_sysfs_link { * by @coresight_ops. * @access: Device i/o access abstraction for this device. * @dev: The device entity associated to this component. + * @mode: This tracer's mode, i.e sysFS, Perf or disabled. This is + * actually an 'enum cs_mode', but is stored in an atomic type. + * This is always accessed through local_read() and local_set(), + * but wherever it's done from within the Coresight device's lock, + * a non-atomic read would also work. * @refcnt: keep track of what is in use. * @orphan: true if the component has connections that haven't been linked. * @enable: 'true' if component is currently part of an active path. @@ -252,6 +257,7 @@ struct coresight_device { const struct coresight_ops *ops; struct csdev_access access; struct device dev; + local_t mode; atomic_t refcnt; bool orphan; bool enable; From d913f29b5c5ba6e225185f618d689fa20c608410 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:36 +0000 Subject: [PATCH 311/548] coresight: Remove the 'enable' field. 'enable', which probably should have been 'enabled', is only ever read in the core code in relation to controlling sources, and specifically only sources in sysfs mode. Confusingly it's not labelled as such and relying on it can be a source of bugs like the one fixed by commit 078dbba3f0c9 ("coresight: Fix crash when Perf and sysfs modes are used concurrently"). Most importantly, it can only be used when the coresight_mutex is held which is only done when enabling and disabling paths in sysfs mode, and not Perf mode. So to prevent its usage spreading and leaking out to other devices, remove it. It's use is equivalent to checking if the mode is currently sysfs, as due to the coresight_mutex lock, mode == CS_MODE_SYSFS can only become true or untrue when that lock is held, and when mode == CS_MODE_SYSFS the device is both enabled and in sysfs mode. The one place it was used outside of the core code is in TPDA, but that pattern is more appropriately represented using refcounts inside the device's own spinlock. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-6-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit d5e83f97eb5669bfdd894ec980083f65517df2fb) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 86 +++++++------------- drivers/hwtracing/coresight/coresight-tpda.c | 12 ++- include/linux/coresight.h | 2 - 3 files changed, 38 insertions(+), 62 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 1ab4b2e2b820b..2d29c6219bf8b 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -280,29 +280,18 @@ EXPORT_SYMBOL_GPL(coresight_add_helper); static int coresight_enable_sink(struct coresight_device *csdev, enum cs_mode mode, void *data) { - int ret = sink_ops(csdev)->enable(csdev, mode, data); - - if (ret) - return ret; - - csdev->enable = true; - - return 0; + return sink_ops(csdev)->enable(csdev, mode, data); } static void coresight_disable_sink(struct coresight_device *csdev) { - int ret = sink_ops(csdev)->disable(csdev); - if (ret) - return; - csdev->enable = false; + sink_ops(csdev)->disable(csdev); } static int coresight_enable_link(struct coresight_device *csdev, struct coresight_device *parent, struct coresight_device *child) { - int ret = 0; int link_subtype; struct coresight_connection *inconn, *outconn; @@ -318,19 +307,13 @@ static int coresight_enable_link(struct coresight_device *csdev, if (link_subtype == CORESIGHT_DEV_SUBTYPE_LINK_SPLIT && IS_ERR(outconn)) return PTR_ERR(outconn); - ret = link_ops(csdev)->enable(csdev, inconn, outconn); - if (!ret) - csdev->enable = true; - - return ret; + return link_ops(csdev)->enable(csdev, inconn, outconn); } static void coresight_disable_link(struct coresight_device *csdev, struct coresight_device *parent, struct coresight_device *child) { - int i; - int link_subtype; struct coresight_connection *inconn, *outconn; if (!parent || !child) @@ -338,26 +321,8 @@ static void coresight_disable_link(struct coresight_device *csdev, inconn = coresight_find_out_connection(parent, csdev); outconn = coresight_find_out_connection(csdev, child); - link_subtype = csdev->subtype.link_subtype; link_ops(csdev)->disable(csdev, inconn, outconn); - - if (link_subtype == CORESIGHT_DEV_SUBTYPE_LINK_MERG) { - for (i = 0; i < csdev->pdata->nr_inconns; i++) - if (atomic_read(&csdev->pdata->in_conns[i]->dest_refcnt) != - 0) - return; - } else if (link_subtype == CORESIGHT_DEV_SUBTYPE_LINK_SPLIT) { - for (i = 0; i < csdev->pdata->nr_outconns; i++) - if (atomic_read(&csdev->pdata->out_conns[i]->src_refcnt) != - 0) - return; - } else { - if (atomic_read(&csdev->refcnt) != 0) - return; - } - - csdev->enable = false; } int coresight_enable_source(struct coresight_device *csdev, enum cs_mode mode, @@ -365,11 +330,16 @@ int coresight_enable_source(struct coresight_device *csdev, enum cs_mode mode, { int ret; - if (!csdev->enable) { + /* + * Comparison with CS_MODE_SYSFS works without taking any device + * specific spinlock because the truthyness of that comparison can only + * change with coresight_mutex held, which we already have here. + */ + lockdep_assert_held(&coresight_mutex); + if (local_read(&csdev->mode) != CS_MODE_SYSFS) { ret = source_ops(csdev)->enable(csdev, data, mode); if (ret) return ret; - csdev->enable = true; } atomic_inc(&csdev->refcnt); @@ -386,22 +356,12 @@ static bool coresight_is_helper(struct coresight_device *csdev) static int coresight_enable_helper(struct coresight_device *csdev, enum cs_mode mode, void *data) { - int ret = helper_ops(csdev)->enable(csdev, mode, data); - - if (ret) - return ret; - - csdev->enable = true; - return 0; + return helper_ops(csdev)->enable(csdev, mode, data); } static void coresight_disable_helper(struct coresight_device *csdev) { - int ret = helper_ops(csdev)->disable(csdev, NULL); - - if (ret) - return; - csdev->enable = false; + helper_ops(csdev)->disable(csdev, NULL); } static void coresight_disable_helpers(struct coresight_device *csdev) @@ -446,11 +406,15 @@ EXPORT_SYMBOL_GPL(coresight_disable_source); static bool coresight_disable_source_sysfs(struct coresight_device *csdev, void *data) { + lockdep_assert_held(&coresight_mutex); + if (local_read(&csdev->mode) != CS_MODE_SYSFS) + return false; + if (atomic_dec_return(&csdev->refcnt) == 0) { coresight_disable_source(csdev, data); - csdev->enable = false; + return true; } - return !csdev->enable; + return false; } /* @@ -1098,7 +1062,13 @@ int coresight_enable(struct coresight_device *csdev) if (ret) goto out; - if (csdev->enable) { + /* + * mode == SYSFS implies that it's already enabled. Don't look at the + * refcount to determine this because we don't claim the source until + * coresight_enable_source() so can still race with Perf mode which + * doesn't hold coresight_mutex. + */ + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { /* * There could be multiple applications driving the software * source. So keep the refcount for each such user when the @@ -1184,7 +1154,7 @@ void coresight_disable(struct coresight_device *csdev) if (ret) goto out; - if (!csdev->enable || !coresight_disable_source_sysfs(csdev, NULL)) + if (!coresight_disable_source_sysfs(csdev, NULL)) goto out; switch (csdev->subtype.source_subtype) { @@ -1250,7 +1220,9 @@ static ssize_t enable_source_show(struct device *dev, { struct coresight_device *csdev = to_coresight_device(dev); - return scnprintf(buf, PAGE_SIZE, "%u\n", csdev->enable); + guard(mutex)(&coresight_mutex); + return scnprintf(buf, PAGE_SIZE, "%u\n", + local_read(&csdev->mode) == CS_MODE_SYSFS); } static ssize_t enable_source_store(struct device *dev, diff --git a/drivers/hwtracing/coresight/coresight-tpda.c b/drivers/hwtracing/coresight/coresight-tpda.c index 5f82737c37bba..65c70995ab00c 100644 --- a/drivers/hwtracing/coresight/coresight-tpda.c +++ b/drivers/hwtracing/coresight/coresight-tpda.c @@ -148,7 +148,11 @@ static int __tpda_enable(struct tpda_drvdata *drvdata, int port) CS_UNLOCK(drvdata->base); - if (!drvdata->csdev->enable) + /* + * Only do pre-port enable for first port that calls enable when the + * device's main refcount is still 0 + */ + if (!atomic_read(&drvdata->csdev->refcnt)) tpda_enable_pre_port(drvdata); ret = tpda_enable_port(drvdata, port); @@ -169,6 +173,7 @@ static int tpda_enable(struct coresight_device *csdev, ret = __tpda_enable(drvdata, in->dest_port); if (!ret) { atomic_inc(&in->dest_refcnt); + atomic_inc(&csdev->refcnt); dev_dbg(drvdata->dev, "TPDA inport %d enabled.\n", in->dest_port); } } @@ -197,9 +202,10 @@ static void tpda_disable(struct coresight_device *csdev, struct tpda_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); spin_lock(&drvdata->spinlock); - if (atomic_dec_return(&in->dest_refcnt) == 0) + if (atomic_dec_return(&in->dest_refcnt) == 0) { __tpda_disable(drvdata, in->dest_port); - + atomic_dec(&csdev->refcnt); + } spin_unlock(&drvdata->spinlock); dev_dbg(drvdata->dev, "TPDA inport %d disabled\n", in->dest_port); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 3ef3c9f0de593..cb7a8e111d0e0 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -233,7 +233,6 @@ struct coresight_sysfs_link { * a non-atomic read would also work. * @refcnt: keep track of what is in use. * @orphan: true if the component has connections that haven't been linked. - * @enable: 'true' if component is currently part of an active path. * @sysfs_sink_activated: 'true' when a sink has been selected for use via sysfs * by writing a 1 to the 'enable_sink' file. A sink can be * activated but not yet enabled. Enabling for a _sink_ happens @@ -260,7 +259,6 @@ struct coresight_device { local_t mode; atomic_t refcnt; bool orphan; - bool enable; /* sink specific fields */ bool sysfs_sink_activated; struct dev_ext_attribute *ea; From 4d82349cff2a43d0b14ac0d4ab292467a96f014c Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:37 +0000 Subject: [PATCH 312/548] coresight: Move all sysfs code to sysfs file At the moment the core file contains both sysfs functionality and core functionality, while the Perf mode is in a separate file in coresight-etm-perf.c Many of the functions have ambiguous names like coresight_enable_source() which actually only work in relation to the sysfs mode. To avoid further confusion, move everything that isn't core functionality into the sysfs file and append _sysfs to the ambiguous functions. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-7-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit 1f5149c7751c50aba1a871143ffa6cb36af3fb49) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 394 +----------------- .../coresight/coresight-etm3x-core.c | 4 +- .../coresight/coresight-etm4x-core.c | 4 +- drivers/hwtracing/coresight/coresight-priv.h | 5 +- drivers/hwtracing/coresight/coresight-stm.c | 4 +- drivers/hwtracing/coresight/coresight-sysfs.c | 390 +++++++++++++++++ include/linux/coresight.h | 8 +- 7 files changed, 407 insertions(+), 402 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 2d29c6219bf8b..9f49bca5ef355 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -9,7 +9,6 @@ #include #include #include -#include #include #include #include @@ -25,15 +24,12 @@ #include "coresight-priv.h" #include "coresight-syscfg.h" -static DEFINE_MUTEX(coresight_mutex); -static DEFINE_PER_CPU(struct coresight_device *, csdev_sink); - /* - * Use IDR to map the hash of the source's device name - * to the pointer of path for the source. The idr is for - * the sources which aren't associated with CPU. + * Mutex used to lock all sysfs enable and disable actions and loading and + * unloading devices by the Coresight core. */ -static DEFINE_IDR(path_idr); +DEFINE_MUTEX(coresight_mutex); +static DEFINE_PER_CPU(struct coresight_device *, csdev_sink); /** * struct coresight_node - elements of a path, from source to sink @@ -45,12 +41,6 @@ struct coresight_node { struct list_head link; }; -/* - * When operating Coresight drivers from the sysFS interface, only a single - * path can exist from a tracer (associated to a CPU) to a sink. - */ -static DEFINE_PER_CPU(struct list_head *, tracer_path); - /* * When losing synchronisation a new barrier packet needs to be inserted at the * beginning of the data collected in a buffer. That way the decoder knows that @@ -61,34 +51,6 @@ EXPORT_SYMBOL_GPL(coresight_barrier_pkt); static const struct cti_assoc_op *cti_assoc_ops; -ssize_t coresight_simple_show_pair(struct device *_dev, - struct device_attribute *attr, char *buf) -{ - struct coresight_device *csdev = container_of(_dev, struct coresight_device, dev); - struct cs_pair_attribute *cs_attr = container_of(attr, struct cs_pair_attribute, attr); - u64 val; - - pm_runtime_get_sync(_dev->parent); - val = csdev_access_relaxed_read_pair(&csdev->access, cs_attr->lo_off, cs_attr->hi_off); - pm_runtime_put_sync(_dev->parent); - return sysfs_emit(buf, "0x%llx\n", val); -} -EXPORT_SYMBOL_GPL(coresight_simple_show_pair); - -ssize_t coresight_simple_show32(struct device *_dev, - struct device_attribute *attr, char *buf) -{ - struct coresight_device *csdev = container_of(_dev, struct coresight_device, dev); - struct cs_off_attribute *cs_attr = container_of(attr, struct cs_off_attribute, attr); - u64 val; - - pm_runtime_get_sync(_dev->parent); - val = csdev_access_relaxed_read32(&csdev->access, cs_attr->off); - pm_runtime_put_sync(_dev->parent); - return sysfs_emit(buf, "0x%llx\n", val); -} -EXPORT_SYMBOL_GPL(coresight_simple_show32); - void coresight_set_cti_ops(const struct cti_assoc_op *cti_op) { cti_assoc_ops = cti_op; @@ -325,29 +287,6 @@ static void coresight_disable_link(struct coresight_device *csdev, link_ops(csdev)->disable(csdev, inconn, outconn); } -int coresight_enable_source(struct coresight_device *csdev, enum cs_mode mode, - void *data) -{ - int ret; - - /* - * Comparison with CS_MODE_SYSFS works without taking any device - * specific spinlock because the truthyness of that comparison can only - * change with coresight_mutex held, which we already have here. - */ - lockdep_assert_held(&coresight_mutex); - if (local_read(&csdev->mode) != CS_MODE_SYSFS) { - ret = source_ops(csdev)->enable(csdev, data, mode); - if (ret) - return ret; - } - - atomic_inc(&csdev->refcnt); - - return 0; -} -EXPORT_SYMBOL_GPL(coresight_enable_source); - static bool coresight_is_helper(struct coresight_device *csdev) { return csdev->type == CORESIGHT_DEV_TYPE_HELPER; @@ -393,30 +332,6 @@ void coresight_disable_source(struct coresight_device *csdev, void *data) } EXPORT_SYMBOL_GPL(coresight_disable_source); -/** - * coresight_disable_source_sysfs - Drop the reference count by 1 and disable - * the device if there are no users left. - * - * @csdev: The coresight device to disable - * @data: Opaque data to pass on to the disable function of the source device. - * For example in perf mode this is a pointer to the struct perf_event. - * - * Returns true if the device has been disabled. - */ -static bool coresight_disable_source_sysfs(struct coresight_device *csdev, - void *data) -{ - lockdep_assert_held(&coresight_mutex); - if (local_read(&csdev->mode) != CS_MODE_SYSFS) - return false; - - if (atomic_dec_return(&csdev->refcnt) == 0) { - coresight_disable_source(csdev, data); - return true; - } - return false; -} - /* * coresight_disable_path_from : Disable components in the given path beyond * @nd in the list. If @nd is NULL, all the components, except the SOURCE are @@ -573,39 +488,6 @@ struct coresight_device *coresight_get_sink(struct list_head *path) return csdev; } -/** - * coresight_find_activated_sysfs_sink - returns the first sink activated via - * sysfs using connection based search starting from the source reference. - * - * @csdev: Coresight source device reference - */ -static struct coresight_device * -coresight_find_activated_sysfs_sink(struct coresight_device *csdev) -{ - int i; - struct coresight_device *sink = NULL; - - if ((csdev->type == CORESIGHT_DEV_TYPE_SINK || - csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) && - csdev->sysfs_sink_activated) - return csdev; - - /* - * Recursively explore each port found on this element. - */ - for (i = 0; i < csdev->pdata->nr_outconns; i++) { - struct coresight_device *child_dev; - - child_dev = csdev->pdata->out_conns[i]->dest_dev; - if (child_dev) - sink = coresight_find_activated_sysfs_sink(child_dev); - if (sink) - return sink; - } - - return NULL; -} - static int coresight_sink_by_id(struct device *dev, const void *data) { struct coresight_device *csdev = to_coresight_device(dev); @@ -1016,274 +898,6 @@ static void coresight_clear_default_sink(struct coresight_device *csdev) } } -/** coresight_validate_source - make sure a source has the right credentials - * @csdev: the device structure for a source. - * @function: the function this was called from. - * - * Assumes the coresight_mutex is held. - */ -static int coresight_validate_source(struct coresight_device *csdev, - const char *function) -{ - u32 type, subtype; - - type = csdev->type; - subtype = csdev->subtype.source_subtype; - - if (type != CORESIGHT_DEV_TYPE_SOURCE) { - dev_err(&csdev->dev, "wrong device type in %s\n", function); - return -EINVAL; - } - - if (subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_PROC && - subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE && - subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM && - subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS) { - dev_err(&csdev->dev, "wrong device subtype in %s\n", function); - return -EINVAL; - } - - return 0; -} - -int coresight_enable(struct coresight_device *csdev) -{ - int cpu, ret = 0; - struct coresight_device *sink; - struct list_head *path; - enum coresight_dev_subtype_source subtype; - u32 hash; - - subtype = csdev->subtype.source_subtype; - - mutex_lock(&coresight_mutex); - - ret = coresight_validate_source(csdev, __func__); - if (ret) - goto out; - - /* - * mode == SYSFS implies that it's already enabled. Don't look at the - * refcount to determine this because we don't claim the source until - * coresight_enable_source() so can still race with Perf mode which - * doesn't hold coresight_mutex. - */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { - /* - * There could be multiple applications driving the software - * source. So keep the refcount for each such user when the - * source is already enabled. - */ - if (subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE) - atomic_inc(&csdev->refcnt); - goto out; - } - - sink = coresight_find_activated_sysfs_sink(csdev); - if (!sink) { - ret = -EINVAL; - goto out; - } - - path = coresight_build_path(csdev, sink); - if (IS_ERR(path)) { - pr_err("building path(s) failed\n"); - ret = PTR_ERR(path); - goto out; - } - - ret = coresight_enable_path(path, CS_MODE_SYSFS, NULL); - if (ret) - goto err_path; - - ret = coresight_enable_source(csdev, CS_MODE_SYSFS, NULL); - if (ret) - goto err_source; - - switch (subtype) { - case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC: - /* - * When working from sysFS it is important to keep track - * of the paths that were created so that they can be - * undone in 'coresight_disable()'. Since there can only - * be a single session per tracer (when working from sysFS) - * a per-cpu variable will do just fine. - */ - cpu = source_ops(csdev)->cpu_id(csdev); - per_cpu(tracer_path, cpu) = path; - break; - case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE: - case CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM: - case CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS: - /* - * Use the hash of source's device name as ID - * and map the ID to the pointer of the path. - */ - hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev))); - ret = idr_alloc_u32(&path_idr, path, &hash, hash, GFP_KERNEL); - if (ret) - goto err_source; - break; - default: - /* We can't be here */ - break; - } - -out: - mutex_unlock(&coresight_mutex); - return ret; - -err_source: - coresight_disable_path(path); - -err_path: - coresight_release_path(path); - goto out; -} -EXPORT_SYMBOL_GPL(coresight_enable); - -void coresight_disable(struct coresight_device *csdev) -{ - int cpu, ret; - struct list_head *path = NULL; - u32 hash; - - mutex_lock(&coresight_mutex); - - ret = coresight_validate_source(csdev, __func__); - if (ret) - goto out; - - if (!coresight_disable_source_sysfs(csdev, NULL)) - goto out; - - switch (csdev->subtype.source_subtype) { - case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC: - cpu = source_ops(csdev)->cpu_id(csdev); - path = per_cpu(tracer_path, cpu); - per_cpu(tracer_path, cpu) = NULL; - break; - case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE: - case CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM: - case CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS: - hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev))); - /* Find the path by the hash. */ - path = idr_find(&path_idr, hash); - if (path == NULL) { - pr_err("Path is not found for %s\n", dev_name(&csdev->dev)); - goto out; - } - idr_remove(&path_idr, hash); - break; - default: - /* We can't be here */ - break; - } - - coresight_disable_path(path); - coresight_release_path(path); - -out: - mutex_unlock(&coresight_mutex); -} -EXPORT_SYMBOL_GPL(coresight_disable); - -static ssize_t enable_sink_show(struct device *dev, - struct device_attribute *attr, char *buf) -{ - struct coresight_device *csdev = to_coresight_device(dev); - - return scnprintf(buf, PAGE_SIZE, "%u\n", csdev->sysfs_sink_activated); -} - -static ssize_t enable_sink_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t size) -{ - int ret; - unsigned long val; - struct coresight_device *csdev = to_coresight_device(dev); - - ret = kstrtoul(buf, 10, &val); - if (ret) - return ret; - - csdev->sysfs_sink_activated = !!val; - - return size; - -} -static DEVICE_ATTR_RW(enable_sink); - -static ssize_t enable_source_show(struct device *dev, - struct device_attribute *attr, char *buf) -{ - struct coresight_device *csdev = to_coresight_device(dev); - - guard(mutex)(&coresight_mutex); - return scnprintf(buf, PAGE_SIZE, "%u\n", - local_read(&csdev->mode) == CS_MODE_SYSFS); -} - -static ssize_t enable_source_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t size) -{ - int ret = 0; - unsigned long val; - struct coresight_device *csdev = to_coresight_device(dev); - - ret = kstrtoul(buf, 10, &val); - if (ret) - return ret; - - if (val) { - ret = coresight_enable(csdev); - if (ret) - return ret; - } else { - coresight_disable(csdev); - } - - return size; -} -static DEVICE_ATTR_RW(enable_source); - -static struct attribute *coresight_sink_attrs[] = { - &dev_attr_enable_sink.attr, - NULL, -}; -ATTRIBUTE_GROUPS(coresight_sink); - -static struct attribute *coresight_source_attrs[] = { - &dev_attr_enable_source.attr, - NULL, -}; -ATTRIBUTE_GROUPS(coresight_source); - -static struct device_type coresight_dev_type[] = { - { - .name = "sink", - .groups = coresight_sink_groups, - }, - { - .name = "link", - }, - { - .name = "linksink", - .groups = coresight_sink_groups, - }, - { - .name = "source", - .groups = coresight_source_groups, - }, - { - .name = "helper", - } -}; -/* Ensure the enum matches the names and groups */ -static_assert(ARRAY_SIZE(coresight_dev_type) == CORESIGHT_DEV_TYPE_MAX); - static void coresight_device_release(struct device *dev) { struct coresight_device *csdev = to_coresight_device(dev); diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index 29f62dfac6739..a8be115636f01 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -714,7 +714,7 @@ static int etm_online_cpu(unsigned int cpu) return 0; if (etmdrvdata[cpu]->boot_enable && !etmdrvdata[cpu]->sticky_enable) - coresight_enable(etmdrvdata[cpu]->csdev); + coresight_enable_sysfs(etmdrvdata[cpu]->csdev); return 0; } @@ -924,7 +924,7 @@ static int etm_probe(struct amba_device *adev, const struct amba_id *id) dev_info(&drvdata->csdev->dev, "%s initialized\n", (char *)coresight_get_uci_data(id)); if (boot_enable) { - coresight_enable(drvdata->csdev); + coresight_enable_sysfs(drvdata->csdev); drvdata->boot_enable = true; } diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index e111ddbc27778..c18abe6560492 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -1651,7 +1651,7 @@ static int etm4_online_cpu(unsigned int cpu) return etm4_probe_cpu(cpu); if (etmdrvdata[cpu]->boot_enable && !etmdrvdata[cpu]->sticky_enable) - coresight_enable(etmdrvdata[cpu]->csdev); + coresight_enable_sysfs(etmdrvdata[cpu]->csdev); return 0; } @@ -2094,7 +2094,7 @@ static int etm4_add_coresight_dev(struct etm4_init_arg *init_arg) drvdata->cpu, type_name, major, minor); if (boot_enable) { - coresight_enable(drvdata->csdev); + coresight_enable_sysfs(drvdata->csdev); drvdata->boot_enable = true; } diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 8991b4a9e9699..7fe79fc1daadf 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -12,6 +12,9 @@ #include #include +extern struct mutex coresight_mutex; +extern struct device_type coresight_dev_type[]; + /* * Coresight management registers (0xf00-0xfcc) * 0xfa0 - 0xfa4: Management registers in PFTv1.0 @@ -230,8 +233,6 @@ void coresight_add_helper(struct coresight_device *csdev, void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev); struct coresight_device *coresight_get_percpu_sink(int cpu); -int coresight_enable_source(struct coresight_device *csdev, enum cs_mode mode, - void *data); void coresight_disable_source(struct coresight_device *csdev, void *data); #endif diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index 190ba569b05a7..af9b212242468 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -332,7 +332,7 @@ static int stm_generic_link(struct stm_data *stm_data, if (!drvdata || !drvdata->csdev) return -EINVAL; - return coresight_enable(drvdata->csdev); + return coresight_enable_sysfs(drvdata->csdev); } static void stm_generic_unlink(struct stm_data *stm_data, @@ -343,7 +343,7 @@ static void stm_generic_unlink(struct stm_data *stm_data, if (!drvdata || !drvdata->csdev) return; - coresight_disable(drvdata->csdev); + coresight_disable_sysfs(drvdata->csdev); } static phys_addr_t diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c index dd78e9fcfc4dc..92cdf8139f23f 100644 --- a/drivers/hwtracing/coresight/coresight-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-sysfs.c @@ -5,10 +5,400 @@ */ #include +#include #include #include "coresight-priv.h" +/* + * Use IDR to map the hash of the source's device name + * to the pointer of path for the source. The idr is for + * the sources which aren't associated with CPU. + */ +static DEFINE_IDR(path_idr); + +/* + * When operating Coresight drivers from the sysFS interface, only a single + * path can exist from a tracer (associated to a CPU) to a sink. + */ +static DEFINE_PER_CPU(struct list_head *, tracer_path); + +ssize_t coresight_simple_show_pair(struct device *_dev, + struct device_attribute *attr, char *buf) +{ + struct coresight_device *csdev = container_of(_dev, struct coresight_device, dev); + struct cs_pair_attribute *cs_attr = container_of(attr, struct cs_pair_attribute, attr); + u64 val; + + pm_runtime_get_sync(_dev->parent); + val = csdev_access_relaxed_read_pair(&csdev->access, cs_attr->lo_off, cs_attr->hi_off); + pm_runtime_put_sync(_dev->parent); + return sysfs_emit(buf, "0x%llx\n", val); +} +EXPORT_SYMBOL_GPL(coresight_simple_show_pair); + +ssize_t coresight_simple_show32(struct device *_dev, + struct device_attribute *attr, char *buf) +{ + struct coresight_device *csdev = container_of(_dev, struct coresight_device, dev); + struct cs_off_attribute *cs_attr = container_of(attr, struct cs_off_attribute, attr); + u64 val; + + pm_runtime_get_sync(_dev->parent); + val = csdev_access_relaxed_read32(&csdev->access, cs_attr->off); + pm_runtime_put_sync(_dev->parent); + return sysfs_emit(buf, "0x%llx\n", val); +} +EXPORT_SYMBOL_GPL(coresight_simple_show32); + +static int coresight_enable_source_sysfs(struct coresight_device *csdev, + enum cs_mode mode, void *data) +{ + int ret; + + /* + * Comparison with CS_MODE_SYSFS works without taking any device + * specific spinlock because the truthyness of that comparison can only + * change with coresight_mutex held, which we already have here. + */ + lockdep_assert_held(&coresight_mutex); + if (local_read(&csdev->mode) != CS_MODE_SYSFS) { + ret = source_ops(csdev)->enable(csdev, data, mode); + if (ret) + return ret; + } + + atomic_inc(&csdev->refcnt); + + return 0; +} + +/** + * coresight_disable_source_sysfs - Drop the reference count by 1 and disable + * the device if there are no users left. + * + * @csdev: The coresight device to disable + * @data: Opaque data to pass on to the disable function of the source device. + * For example in perf mode this is a pointer to the struct perf_event. + * + * Returns true if the device has been disabled. + */ +static bool coresight_disable_source_sysfs(struct coresight_device *csdev, + void *data) +{ + lockdep_assert_held(&coresight_mutex); + if (local_read(&csdev->mode) != CS_MODE_SYSFS) + return false; + + if (atomic_dec_return(&csdev->refcnt) == 0) { + coresight_disable_source(csdev, data); + return true; + } + return false; +} + +/** + * coresight_find_activated_sysfs_sink - returns the first sink activated via + * sysfs using connection based search starting from the source reference. + * + * @csdev: Coresight source device reference + */ +static struct coresight_device * +coresight_find_activated_sysfs_sink(struct coresight_device *csdev) +{ + int i; + struct coresight_device *sink = NULL; + + if ((csdev->type == CORESIGHT_DEV_TYPE_SINK || + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) && + csdev->sysfs_sink_activated) + return csdev; + + /* + * Recursively explore each port found on this element. + */ + for (i = 0; i < csdev->pdata->nr_outconns; i++) { + struct coresight_device *child_dev; + + child_dev = csdev->pdata->out_conns[i]->dest_dev; + if (child_dev) + sink = coresight_find_activated_sysfs_sink(child_dev); + if (sink) + return sink; + } + + return NULL; +} + +/** coresight_validate_source - make sure a source has the right credentials to + * be used via sysfs. + * @csdev: the device structure for a source. + * @function: the function this was called from. + * + * Assumes the coresight_mutex is held. + */ +static int coresight_validate_source_sysfs(struct coresight_device *csdev, + const char *function) +{ + u32 type, subtype; + + type = csdev->type; + subtype = csdev->subtype.source_subtype; + + if (type != CORESIGHT_DEV_TYPE_SOURCE) { + dev_err(&csdev->dev, "wrong device type in %s\n", function); + return -EINVAL; + } + + if (subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_PROC && + subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE && + subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM && + subtype != CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS) { + dev_err(&csdev->dev, "wrong device subtype in %s\n", function); + return -EINVAL; + } + + return 0; +} + +int coresight_enable_sysfs(struct coresight_device *csdev) +{ + int cpu, ret = 0; + struct coresight_device *sink; + struct list_head *path; + enum coresight_dev_subtype_source subtype; + u32 hash; + + subtype = csdev->subtype.source_subtype; + + mutex_lock(&coresight_mutex); + + ret = coresight_validate_source_sysfs(csdev, __func__); + if (ret) + goto out; + + /* + * mode == SYSFS implies that it's already enabled. Don't look at the + * refcount to determine this because we don't claim the source until + * coresight_enable_source() so can still race with Perf mode which + * doesn't hold coresight_mutex. + */ + if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + /* + * There could be multiple applications driving the software + * source. So keep the refcount for each such user when the + * source is already enabled. + */ + if (subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE) + atomic_inc(&csdev->refcnt); + goto out; + } + + sink = coresight_find_activated_sysfs_sink(csdev); + if (!sink) { + ret = -EINVAL; + goto out; + } + + path = coresight_build_path(csdev, sink); + if (IS_ERR(path)) { + pr_err("building path(s) failed\n"); + ret = PTR_ERR(path); + goto out; + } + + ret = coresight_enable_path(path, CS_MODE_SYSFS, NULL); + if (ret) + goto err_path; + + ret = coresight_enable_source_sysfs(csdev, CS_MODE_SYSFS, NULL); + if (ret) + goto err_source; + + switch (subtype) { + case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC: + /* + * When working from sysFS it is important to keep track + * of the paths that were created so that they can be + * undone in 'coresight_disable()'. Since there can only + * be a single session per tracer (when working from sysFS) + * a per-cpu variable will do just fine. + */ + cpu = source_ops(csdev)->cpu_id(csdev); + per_cpu(tracer_path, cpu) = path; + break; + case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE: + case CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM: + case CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS: + /* + * Use the hash of source's device name as ID + * and map the ID to the pointer of the path. + */ + hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev))); + ret = idr_alloc_u32(&path_idr, path, &hash, hash, GFP_KERNEL); + if (ret) + goto err_source; + break; + default: + /* We can't be here */ + break; + } + +out: + mutex_unlock(&coresight_mutex); + return ret; + +err_source: + coresight_disable_path(path); + +err_path: + coresight_release_path(path); + goto out; +} +EXPORT_SYMBOL_GPL(coresight_enable_sysfs); + +void coresight_disable_sysfs(struct coresight_device *csdev) +{ + int cpu, ret; + struct list_head *path = NULL; + u32 hash; + + mutex_lock(&coresight_mutex); + + ret = coresight_validate_source_sysfs(csdev, __func__); + if (ret) + goto out; + + if (!coresight_disable_source_sysfs(csdev, NULL)) + goto out; + + switch (csdev->subtype.source_subtype) { + case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC: + cpu = source_ops(csdev)->cpu_id(csdev); + path = per_cpu(tracer_path, cpu); + per_cpu(tracer_path, cpu) = NULL; + break; + case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE: + case CORESIGHT_DEV_SUBTYPE_SOURCE_TPDM: + case CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS: + hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev))); + /* Find the path by the hash. */ + path = idr_find(&path_idr, hash); + if (path == NULL) { + pr_err("Path is not found for %s\n", dev_name(&csdev->dev)); + goto out; + } + idr_remove(&path_idr, hash); + break; + default: + /* We can't be here */ + break; + } + + coresight_disable_path(path); + coresight_release_path(path); + +out: + mutex_unlock(&coresight_mutex); +} +EXPORT_SYMBOL_GPL(coresight_disable_sysfs); + +static ssize_t enable_sink_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct coresight_device *csdev = to_coresight_device(dev); + + return scnprintf(buf, PAGE_SIZE, "%u\n", csdev->sysfs_sink_activated); +} + +static ssize_t enable_sink_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t size) +{ + int ret; + unsigned long val; + struct coresight_device *csdev = to_coresight_device(dev); + + ret = kstrtoul(buf, 10, &val); + if (ret) + return ret; + + csdev->sysfs_sink_activated = !!val; + + return size; + +} +static DEVICE_ATTR_RW(enable_sink); + +static ssize_t enable_source_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct coresight_device *csdev = to_coresight_device(dev); + + guard(mutex)(&coresight_mutex); + return scnprintf(buf, PAGE_SIZE, "%u\n", + local_read(&csdev->mode) == CS_MODE_SYSFS); +} + +static ssize_t enable_source_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t size) +{ + int ret = 0; + unsigned long val; + struct coresight_device *csdev = to_coresight_device(dev); + + ret = kstrtoul(buf, 10, &val); + if (ret) + return ret; + + if (val) { + ret = coresight_enable_sysfs(csdev); + if (ret) + return ret; + } else { + coresight_disable_sysfs(csdev); + } + + return size; +} +static DEVICE_ATTR_RW(enable_source); + +static struct attribute *coresight_sink_attrs[] = { + &dev_attr_enable_sink.attr, + NULL, +}; +ATTRIBUTE_GROUPS(coresight_sink); + +static struct attribute *coresight_source_attrs[] = { + &dev_attr_enable_source.attr, + NULL, +}; +ATTRIBUTE_GROUPS(coresight_source); + +struct device_type coresight_dev_type[] = { + { + .name = "sink", + .groups = coresight_sink_groups, + }, + { + .name = "link", + }, + { + .name = "linksink", + .groups = coresight_sink_groups, + }, + { + .name = "source", + .groups = coresight_source_groups, + }, + { + .name = "helper", + } +}; +/* Ensure the enum matches the names and groups */ +static_assert(ARRAY_SIZE(coresight_dev_type) == CORESIGHT_DEV_TYPE_MAX); + /* * Connections group - links attribute. * Count of created links between coresight components in the group. diff --git a/include/linux/coresight.h b/include/linux/coresight.h index cb7a8e111d0e0..90353be8066b0 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -578,8 +578,8 @@ static inline bool coresight_is_percpu_sink(struct coresight_device *csdev) extern struct coresight_device * coresight_register(struct coresight_desc *desc); extern void coresight_unregister(struct coresight_device *csdev); -extern int coresight_enable(struct coresight_device *csdev); -extern void coresight_disable(struct coresight_device *csdev); +extern int coresight_enable_sysfs(struct coresight_device *csdev); +extern void coresight_disable_sysfs(struct coresight_device *csdev); extern int coresight_timeout(struct csdev_access *csa, u32 offset, int position, int value); typedef void (*coresight_timeout_cb_t) (struct csdev_access *, u32, int, int); @@ -613,8 +613,8 @@ static inline struct coresight_device * coresight_register(struct coresight_desc *desc) { return NULL; } static inline void coresight_unregister(struct coresight_device *csdev) {} static inline int -coresight_enable(struct coresight_device *csdev) { return -ENOSYS; } -static inline void coresight_disable(struct coresight_device *csdev) {} +coresight_enable_sysfs(struct coresight_device *csdev) { return -ENOSYS; } +static inline void coresight_disable_sysfs(struct coresight_device *csdev) {} static inline int coresight_timeout(struct csdev_access *csa, u32 offset, int position, int value) From 88d53b8d11714fa40fd5dcc0a81d43422c149534 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:38 +0000 Subject: [PATCH 313/548] coresight: Remove atomic type from refcnt Refcnt is only ever accessed from either inside the coresight_mutex, or the device's spinlock, making the atomic type and atomic_dec_return() calls confusing and unnecessary. The only point of synchronisation outside of these two types of locks is already done with a compare and swap on 'mode', which a comment has been added for. There was one instance of refcnt being used outside of a lock in TPIU, but that can easily be fixed by making it the same as all the other devices and adding a spinlock. Potentially in the future all the refcounting and locking can be moved up into the core code, and all the mostly duplicate code from the individual devices can be removed. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-8-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit 4545b38ef004a586295750ea49a505b6396a7c90) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-etb10.c | 11 +++++----- drivers/hwtracing/coresight/coresight-sysfs.c | 7 ++++--- .../hwtracing/coresight/coresight-tmc-etf.c | 20 ++++++++++--------- .../hwtracing/coresight/coresight-tmc-etr.c | 13 ++++++------ drivers/hwtracing/coresight/coresight-tpda.c | 7 ++++--- drivers/hwtracing/coresight/coresight-tpiu.c | 14 +++++++++++-- drivers/hwtracing/coresight/ultrasoc-smb.c | 15 +++++--------- include/linux/coresight.h | 13 +++++++++--- 8 files changed, 59 insertions(+), 41 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 08e5bba862db0..5f2bb95955b72 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -161,7 +161,7 @@ static int etb_enable_sysfs(struct coresight_device *csdev) local_set(&csdev->mode, CS_MODE_SYSFS); } - atomic_inc(&csdev->refcnt); + csdev->refcnt++; out: spin_unlock_irqrestore(&drvdata->spinlock, flags); return ret; @@ -197,7 +197,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data) * use for this session. */ if (drvdata->pid == pid) { - atomic_inc(&csdev->refcnt); + csdev->refcnt++; goto out; } @@ -215,7 +215,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data) /* Associate with monitored process. */ drvdata->pid = pid; local_set(&drvdata->csdev->mode, CS_MODE_PERF); - atomic_inc(&csdev->refcnt); + csdev->refcnt++; } out: @@ -354,7 +354,8 @@ static int etb_disable(struct coresight_device *csdev) spin_lock_irqsave(&drvdata->spinlock, flags); - if (atomic_dec_return(&csdev->refcnt)) { + csdev->refcnt--; + if (csdev->refcnt) { spin_unlock_irqrestore(&drvdata->spinlock, flags); return -EBUSY; } @@ -445,7 +446,7 @@ static unsigned long etb_update_buffer(struct coresight_device *csdev, spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't do anything if another tracer is using this sink */ - if (atomic_read(&csdev->refcnt) != 1) + if (csdev->refcnt != 1) goto out; __etb_disable_hw(drvdata); diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c index 92cdf8139f23f..5992f2c2200a4 100644 --- a/drivers/hwtracing/coresight/coresight-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-sysfs.c @@ -68,7 +68,7 @@ static int coresight_enable_source_sysfs(struct coresight_device *csdev, return ret; } - atomic_inc(&csdev->refcnt); + csdev->refcnt++; return 0; } @@ -90,7 +90,8 @@ static bool coresight_disable_source_sysfs(struct coresight_device *csdev, if (local_read(&csdev->mode) != CS_MODE_SYSFS) return false; - if (atomic_dec_return(&csdev->refcnt) == 0) { + csdev->refcnt--; + if (csdev->refcnt == 0) { coresight_disable_source(csdev, data); return true; } @@ -190,7 +191,7 @@ int coresight_enable_sysfs(struct coresight_device *csdev) * source is already enabled. */ if (subtype == CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE) - atomic_inc(&csdev->refcnt); + csdev->refcnt++; goto out; } diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 2a7e516052a29..f3281c958a570 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -206,7 +206,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) * touched. */ if (local_read(&csdev->mode) == CS_MODE_SYSFS) { - atomic_inc(&csdev->refcnt); + csdev->refcnt++; goto out; } @@ -229,7 +229,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) ret = tmc_etb_enable_hw(drvdata); if (!ret) { local_set(&csdev->mode, CS_MODE_SYSFS); - atomic_inc(&csdev->refcnt); + csdev->refcnt++; } else { /* Free up the buffer if we failed to enable */ used = false; @@ -284,7 +284,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data) * use for this session. */ if (drvdata->pid == pid) { - atomic_inc(&csdev->refcnt); + csdev->refcnt++; break; } @@ -293,7 +293,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data) /* Associate with monitored process. */ drvdata->pid = pid; local_set(&csdev->mode, CS_MODE_PERF); - atomic_inc(&csdev->refcnt); + csdev->refcnt++; } } while (0); spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -338,7 +338,8 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev) return -EBUSY; } - if (atomic_dec_return(&csdev->refcnt)) { + csdev->refcnt--; + if (csdev->refcnt) { spin_unlock_irqrestore(&drvdata->spinlock, flags); return -EBUSY; } @@ -371,7 +372,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, return -EBUSY; } - if (atomic_read(&csdev->refcnt) == 0) { + if (csdev->refcnt == 0) { ret = tmc_etf_enable_hw(drvdata); if (!ret) { local_set(&csdev->mode, CS_MODE_SYSFS); @@ -379,7 +380,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, } } if (!ret) - atomic_inc(&csdev->refcnt); + csdev->refcnt++; spin_unlock_irqrestore(&drvdata->spinlock, flags); if (first_enable) @@ -401,7 +402,8 @@ static void tmc_disable_etf_link(struct coresight_device *csdev, return; } - if (atomic_dec_return(&csdev->refcnt) == 0) { + csdev->refcnt--; + if (csdev->refcnt == 0) { tmc_etf_disable_hw(drvdata); local_set(&csdev->mode, CS_MODE_DISABLED); last_disable = true; @@ -489,7 +491,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't do anything if another tracer is using this sink */ - if (atomic_read(&csdev->refcnt) != 1) + if (csdev->refcnt != 1) goto out; CS_UNLOCK(drvdata->base); diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index ab0beaf99739d..0720793701b99 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1211,14 +1211,14 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * touched, even if the buffer size has changed. */ if (local_read(&csdev->mode) == CS_MODE_SYSFS) { - atomic_inc(&csdev->refcnt); + csdev->refcnt++; goto out; } ret = tmc_etr_enable_hw(drvdata, sysfs_buf); if (!ret) { local_set(&csdev->mode, CS_MODE_SYSFS); - atomic_inc(&csdev->refcnt); + csdev->refcnt++; } out: @@ -1544,7 +1544,7 @@ tmc_update_etr_buffer(struct coresight_device *csdev, spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't do anything if another tracer is using this sink */ - if (atomic_read(&csdev->refcnt) != 1) { + if (csdev->refcnt != 1) { spin_unlock_irqrestore(&drvdata->spinlock, flags); goto out; } @@ -1656,7 +1656,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) * use for this session. */ if (drvdata->pid == pid) { - atomic_inc(&csdev->refcnt); + csdev->refcnt++; goto unlock_out; } @@ -1666,7 +1666,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) drvdata->pid = pid; local_set(&csdev->mode, CS_MODE_PERF); drvdata->perf_buf = etr_perf->etr_buf; - atomic_inc(&csdev->refcnt); + csdev->refcnt++; } unlock_out: @@ -1699,7 +1699,8 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev) return -EBUSY; } - if (atomic_dec_return(&csdev->refcnt)) { + csdev->refcnt--; + if (csdev->refcnt) { spin_unlock_irqrestore(&drvdata->spinlock, flags); return -EBUSY; } diff --git a/drivers/hwtracing/coresight/coresight-tpda.c b/drivers/hwtracing/coresight/coresight-tpda.c index 65c70995ab00c..135a89654c4cf 100644 --- a/drivers/hwtracing/coresight/coresight-tpda.c +++ b/drivers/hwtracing/coresight/coresight-tpda.c @@ -152,7 +152,8 @@ static int __tpda_enable(struct tpda_drvdata *drvdata, int port) * Only do pre-port enable for first port that calls enable when the * device's main refcount is still 0 */ - if (!atomic_read(&drvdata->csdev->refcnt)) + lockdep_assert_held(&drvdata->spinlock); + if (!drvdata->csdev->refcnt) tpda_enable_pre_port(drvdata); ret = tpda_enable_port(drvdata, port); @@ -173,7 +174,7 @@ static int tpda_enable(struct coresight_device *csdev, ret = __tpda_enable(drvdata, in->dest_port); if (!ret) { atomic_inc(&in->dest_refcnt); - atomic_inc(&csdev->refcnt); + csdev->refcnt++; dev_dbg(drvdata->dev, "TPDA inport %d enabled.\n", in->dest_port); } } @@ -204,7 +205,7 @@ static void tpda_disable(struct coresight_device *csdev, spin_lock(&drvdata->spinlock); if (atomic_dec_return(&in->dest_refcnt) == 0) { __tpda_disable(drvdata, in->dest_port); - atomic_dec(&csdev->refcnt); + csdev->refcnt--; } spin_unlock(&drvdata->spinlock); diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c index 59eac93fd6bb9..c23a6b9b41fe0 100644 --- a/drivers/hwtracing/coresight/coresight-tpiu.c +++ b/drivers/hwtracing/coresight/coresight-tpiu.c @@ -58,6 +58,7 @@ struct tpiu_drvdata { void __iomem *base; struct clk *atclk; struct coresight_device *csdev; + spinlock_t spinlock; }; static void tpiu_enable_hw(struct csdev_access *csa) @@ -72,8 +73,11 @@ static void tpiu_enable_hw(struct csdev_access *csa) static int tpiu_enable(struct coresight_device *csdev, enum cs_mode mode, void *__unused) { + struct tpiu_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + guard(spinlock)(&drvdata->spinlock); tpiu_enable_hw(&csdev->access); - atomic_inc(&csdev->refcnt); + csdev->refcnt++; dev_dbg(&csdev->dev, "TPIU enabled\n"); return 0; } @@ -96,7 +100,11 @@ static void tpiu_disable_hw(struct csdev_access *csa) static int tpiu_disable(struct coresight_device *csdev) { - if (atomic_dec_return(&csdev->refcnt)) + struct tpiu_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + guard(spinlock)(&drvdata->spinlock); + csdev->refcnt--; + if (csdev->refcnt) return -EBUSY; tpiu_disable_hw(&csdev->access); @@ -132,6 +140,8 @@ static int tpiu_probe(struct amba_device *adev, const struct amba_id *id) if (!drvdata) return -ENOMEM; + spin_lock_init(&drvdata->spinlock); + drvdata->atclk = devm_clk_get(&adev->dev, "atclk"); /* optional */ if (!IS_ERR(drvdata->atclk)) { ret = clk_prepare_enable(drvdata->atclk); diff --git a/drivers/hwtracing/coresight/ultrasoc-smb.c b/drivers/hwtracing/coresight/ultrasoc-smb.c index 1d3b6642de1ea..5ff5dad40e08a 100644 --- a/drivers/hwtracing/coresight/ultrasoc-smb.c +++ b/drivers/hwtracing/coresight/ultrasoc-smb.c @@ -106,7 +106,7 @@ static int smb_open(struct inode *inode, struct file *file) goto out; } - if (atomic_read(&drvdata->csdev->refcnt)) { + if (drvdata->csdev->refcnt) { ret = -EBUSY; goto out; } @@ -284,8 +284,7 @@ static int smb_enable(struct coresight_device *csdev, enum cs_mode mode, if (ret) goto out; - atomic_inc(&csdev->refcnt); - + csdev->refcnt++; dev_dbg(&csdev->dev, "Ultrasoc SMB enabled\n"); out: spin_unlock(&drvdata->spinlock); @@ -305,7 +304,8 @@ static int smb_disable(struct coresight_device *csdev) goto out; } - if (atomic_dec_return(&csdev->refcnt)) { + csdev->refcnt--; + if (csdev->refcnt) { ret = -EBUSY; goto out; } @@ -317,12 +317,7 @@ static int smb_disable(struct coresight_device *csdev) /* Dissociate from the target process. */ drvdata->pid = -1; -<<<<<<< HEAD drvdata->mode = CS_MODE_DISABLED; -======= - local_set(&csdev->mode, CS_MODE_DISABLED); - dev_dbg(&csdev->dev, "Ultrasoc SMB disabled\n"); ->>>>>>> 9cae77cf23e3 (coresight: Move mode to struct coresight_device) dev_dbg(&csdev->dev, "Ultrasoc SMB disabled\n"); out: @@ -410,7 +405,7 @@ static unsigned long smb_update_buffer(struct coresight_device *csdev, spin_lock(&drvdata->spinlock); /* Don't do anything if another tracer is using this sink. */ - if (atomic_read(&csdev->refcnt) != 1) + if (csdev->refcnt != 1) goto out; smb_disable_hw(drvdata); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 90353be8066b0..2e92168d8ec5f 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -230,8 +230,15 @@ struct coresight_sysfs_link { * actually an 'enum cs_mode', but is stored in an atomic type. * This is always accessed through local_read() and local_set(), * but wherever it's done from within the Coresight device's lock, - * a non-atomic read would also work. - * @refcnt: keep track of what is in use. + * a non-atomic read would also work. This is the main point of + * synchronisation between code happening inside the sysfs mode's + * coresight_mutex and outside when running in Perf mode. A compare + * and exchange swap is done to atomically claim one mode or the + * other. + * @refcnt: keep track of what is in use. Only access this outside of the + * device's spinlock when the coresight_mutex held and mode == + * CS_MODE_SYSFS. Otherwise it must be accessed from inside the + * spinlock. * @orphan: true if the component has connections that haven't been linked. * @sysfs_sink_activated: 'true' when a sink has been selected for use via sysfs * by writing a 1 to the 'enable_sink' file. A sink can be @@ -257,7 +264,7 @@ struct coresight_device { struct csdev_access access; struct device dev; local_t mode; - atomic_t refcnt; + int refcnt; bool orphan; /* sink specific fields */ bool sysfs_sink_activated; From fc45807211b0ef517fd54a7be812253c527bd1b9 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:39 +0000 Subject: [PATCH 314/548] coresight: Remove unused stubs These are a bit annoying to keep up to date when the function signatures change. But if CONFIG_CORESIGHT isn't enabled, then they're not used anyway so just delete them. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-9-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit 053ad9ad1d13f253605d7644de3aa20d958569ef) Signed-off-by: Koba Ko --- include/linux/coresight.h | 79 --------------------------------------- 1 file changed, 79 deletions(-) diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 2e92168d8ec5f..c8f41d9c17a91 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -391,8 +391,6 @@ struct coresight_ops { const struct coresight_ops_helper *helper_ops; }; -#if IS_ENABLED(CONFIG_CORESIGHT) - static inline u32 csdev_access_relaxed_read32(struct csdev_access *csa, u32 offset) { @@ -615,83 +613,6 @@ void coresight_relaxed_write64(struct coresight_device *csdev, u64 val, u32 offset); void coresight_write64(struct coresight_device *csdev, u64 val, u32 offset); -#else -static inline struct coresight_device * -coresight_register(struct coresight_desc *desc) { return NULL; } -static inline void coresight_unregister(struct coresight_device *csdev) {} -static inline int -coresight_enable_sysfs(struct coresight_device *csdev) { return -ENOSYS; } -static inline void coresight_disable_sysfs(struct coresight_device *csdev) {} - -static inline int coresight_timeout(struct csdev_access *csa, u32 offset, - int position, int value) -{ - return 1; -} - -static inline int coresight_claim_device_unlocked(struct coresight_device *csdev) -{ - return -EINVAL; -} - -static inline int coresight_claim_device(struct coresight_device *csdev) -{ - return -EINVAL; -} - -static inline void coresight_disclaim_device(struct coresight_device *csdev) {} -static inline void coresight_disclaim_device_unlocked(struct coresight_device *csdev) {} - -static inline bool coresight_loses_context_with_cpu(struct device *dev) -{ - return false; -} - -static inline u32 coresight_relaxed_read32(struct coresight_device *csdev, u32 offset) -{ - WARN_ON_ONCE(1); - return 0; -} - -static inline u32 coresight_read32(struct coresight_device *csdev, u32 offset) -{ - WARN_ON_ONCE(1); - return 0; -} - -static inline void coresight_write32(struct coresight_device *csdev, u32 val, u32 offset) -{ -} - -static inline void coresight_relaxed_write32(struct coresight_device *csdev, - u32 val, u32 offset) -{ -} - -static inline u64 coresight_relaxed_read64(struct coresight_device *csdev, - u32 offset) -{ - WARN_ON_ONCE(1); - return 0; -} - -static inline u64 coresight_read64(struct coresight_device *csdev, u32 offset) -{ - WARN_ON_ONCE(1); - return 0; -} - -static inline void coresight_relaxed_write64(struct coresight_device *csdev, - u64 val, u32 offset) -{ -} - -static inline void coresight_write64(struct coresight_device *csdev, u64 val, u32 offset) -{ -} - -#endif /* IS_ENABLED(CONFIG_CORESIGHT) */ - extern int coresight_get_cpu(struct device *dev); struct coresight_platform_data *coresight_get_platform_data(struct device *dev); From f7e819187af16856cbcb5329353f58194ce3af68 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:40 +0000 Subject: [PATCH 315/548] coresight: Add explicit member initializers to coresight_dev_type These could potentially become wrong silently if the enum is changed, so explicitly initialize them. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-10-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit 812265e26ed3ac76d3a131709d8755bafcf31b58) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-sysfs.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c index 5992f2c2200a4..fa52297c73d2f 100644 --- a/drivers/hwtracing/coresight/coresight-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-sysfs.c @@ -378,22 +378,22 @@ static struct attribute *coresight_source_attrs[] = { ATTRIBUTE_GROUPS(coresight_source); struct device_type coresight_dev_type[] = { - { + [CORESIGHT_DEV_TYPE_SINK] = { .name = "sink", .groups = coresight_sink_groups, }, - { + [CORESIGHT_DEV_TYPE_LINK] = { .name = "link", }, - { + [CORESIGHT_DEV_TYPE_LINKSINK] = { .name = "linksink", .groups = coresight_sink_groups, }, - { + [CORESIGHT_DEV_TYPE_SOURCE] = { .name = "source", .groups = coresight_source_groups, }, - { + [CORESIGHT_DEV_TYPE_HELPER] = { .name = "helper", } }; From 371fed8d0b4a11af8bbfc8a448dc0a6d7dcec69c Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:41 +0000 Subject: [PATCH 316/548] coresight: Add helper for atomically taking the device Now that mode is in struct coresight_device, this pattern can be wrapped in a helper. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-11-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit d724f65218b994da234081df5dfe417c23802a65) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-etm3x-core.c | 8 +++----- drivers/hwtracing/coresight/coresight-etm4x-core.c | 8 +++----- drivers/hwtracing/coresight/coresight-stm.c | 8 +++----- include/linux/coresight.h | 11 +++++++++++ 4 files changed, 20 insertions(+), 15 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index a8be115636f01..ce2b3ed90fb98 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -556,14 +556,12 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event, enum cs_mode mode) { int ret; - u32 val; struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - val = local_cmpxchg(&drvdata->csdev->mode, CS_MODE_DISABLED, mode); - - /* Someone is already using the tracer */ - if (val) + if (!coresight_take_mode(csdev, mode)) { + /* Someone is already using the tracer */ return -EBUSY; + } switch (mode) { case CS_MODE_SYSFS: diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index c18abe6560492..a28acc9fcba2f 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -858,13 +858,11 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, enum cs_mode mode) { int ret; - u32 val; - val = local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, mode); - - /* Someone is already using the tracer */ - if (val) + if (!coresight_take_mode(csdev, mode)) { + /* Someone is already using the tracer */ return -EBUSY; + } switch (mode) { case CS_MODE_SYSFS: diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index af9b212242468..80fed4c377f1f 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -193,17 +193,15 @@ static void stm_enable_hw(struct stm_drvdata *drvdata) static int stm_enable(struct coresight_device *csdev, struct perf_event *event, enum cs_mode mode) { - u32 val; struct stm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); if (mode != CS_MODE_SYSFS) return -EINVAL; - val = local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, mode); - - /* Someone is already using the tracer */ - if (val) + if (!coresight_take_mode(csdev, mode)) { + /* Someone is already using the tracer */ return -EBUSY; + } pm_runtime_get_sync(csdev->dev.parent); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index c8f41d9c17a91..9f1a78246a881 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -580,6 +580,17 @@ static inline bool coresight_is_percpu_sink(struct coresight_device *csdev) (csdev->subtype.sink_subtype == CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM); } +/* + * Atomically try to take the device and set a new mode. Returns true on + * success, false if the device is already taken by someone else. + */ +static inline bool coresight_take_mode(struct coresight_device *csdev, + enum cs_mode new_mode) +{ + return local_cmpxchg(&csdev->mode, CS_MODE_DISABLED, new_mode) == + CS_MODE_DISABLED; +} + extern struct coresight_device * coresight_register(struct coresight_desc *desc); extern void coresight_unregister(struct coresight_device *csdev); From 31bcc33c9e3b2cb3beb7e445844831ec9631740f Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:42 +0000 Subject: [PATCH 317/548] coresight: Add a helper for getting csdev->mode Now that mode is in struct coresight_device accesses can be wrapped. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-12-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit c95c2733e5feb1f6848923f166849b2d1c7bf682) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-etb10.c | 10 +++++----- .../hwtracing/coresight/coresight-etm3x-core.c | 6 +++--- .../hwtracing/coresight/coresight-etm3x-sysfs.c | 4 ++-- .../hwtracing/coresight/coresight-etm4x-core.c | 8 ++++---- drivers/hwtracing/coresight/coresight-stm.c | 14 +++++++------- drivers/hwtracing/coresight/coresight-sysfs.c | 8 ++++---- drivers/hwtracing/coresight/coresight-tmc-core.c | 2 +- drivers/hwtracing/coresight/coresight-tmc-etf.c | 16 ++++++++-------- drivers/hwtracing/coresight/coresight-tmc-etr.c | 14 +++++++------- drivers/hwtracing/coresight/ultrasoc-smb.c | 8 ++++---- include/linux/coresight.h | 5 +++++ 11 files changed, 50 insertions(+), 45 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 5f2bb95955b72..4e82d9c20d36c 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -148,12 +148,12 @@ static int etb_enable_sysfs(struct coresight_device *csdev) spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't messup with perf sessions. */ - if (local_read(&csdev->mode) == CS_MODE_PERF) { + if (coresight_get_mode(csdev) == CS_MODE_PERF) { ret = -EBUSY; goto out; } - if (local_read(&csdev->mode) == CS_MODE_DISABLED) { + if (coresight_get_mode(csdev) == CS_MODE_DISABLED) { ret = etb_enable_hw(drvdata); if (ret) goto out; @@ -179,7 +179,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data) spin_lock_irqsave(&drvdata->spinlock, flags); /* No need to continue if the component is already in used by sysFS. */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) { ret = -EBUSY; goto out; } @@ -361,7 +361,7 @@ static int etb_disable(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); + WARN_ON_ONCE(coresight_get_mode(csdev) == CS_MODE_DISABLED); etb_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; @@ -588,7 +588,7 @@ static void etb_dump(struct etb_drvdata *drvdata) unsigned long flags; spin_lock_irqsave(&drvdata->spinlock, flags); - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) { __etb_disable_hw(drvdata); etb_dump_hw(drvdata); __etb_enable_hw(drvdata); diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index ce2b3ed90fb98..63991029cda05 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -676,7 +676,7 @@ static void etm_disable(struct coresight_device *csdev, * change its status. As such we can read the status here without * fearing it will change under us. */ - mode = local_read(&csdev->mode); + mode = coresight_get_mode(csdev); switch (mode) { case CS_MODE_DISABLED: @@ -727,7 +727,7 @@ static int etm_starting_cpu(unsigned int cpu) etmdrvdata[cpu]->os_unlock = true; } - if (local_read(&etmdrvdata[cpu]->csdev->mode)) + if (coresight_get_mode(etmdrvdata[cpu]->csdev)) etm_enable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; @@ -739,7 +739,7 @@ static int etm_dying_cpu(unsigned int cpu) return 0; spin_lock(&etmdrvdata[cpu]->spinlock); - if (local_read(&etmdrvdata[cpu]->csdev->mode)) + if (coresight_get_mode(etmdrvdata[cpu]->csdev)) etm_disable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; diff --git a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c index 6c8429c980b14..68c644be9813b 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c @@ -722,7 +722,7 @@ static ssize_t cntr_val_show(struct device *dev, struct etm_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etm_config *config = &drvdata->config; - if (!local_read(&drvdata->csdev->mode)) { + if (!coresight_get_mode(drvdata->csdev)) { spin_lock(&drvdata->spinlock); for (i = 0; i < drvdata->nr_cntr; i++) ret += sprintf(buf, "counter %d: %x\n", @@ -941,7 +941,7 @@ static ssize_t seq_curr_state_show(struct device *dev, struct etm_drvdata *drvdata = dev_get_drvdata(dev->parent); struct etm_config *config = &drvdata->config; - if (!local_read(&drvdata->csdev->mode)) { + if (!coresight_get_mode(drvdata->csdev)) { val = config->seq_curr_state; goto out; } diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index a28acc9fcba2f..befb32dbbf827 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -1040,7 +1040,7 @@ static void etm4_disable(struct coresight_device *csdev, * change its status. As such we can read the status here without * fearing it will change under us. */ - mode = local_read(&csdev->mode); + mode = coresight_get_mode(csdev); switch (mode) { case CS_MODE_DISABLED: @@ -1662,7 +1662,7 @@ static int etm4_starting_cpu(unsigned int cpu) if (!etmdrvdata[cpu]->os_unlock) etm4_os_unlock(etmdrvdata[cpu]); - if (local_read(&etmdrvdata[cpu]->csdev->mode)) + if (coresight_get_mode(etmdrvdata[cpu]->csdev)) etm4_enable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; @@ -1674,7 +1674,7 @@ static int etm4_dying_cpu(unsigned int cpu) return 0; spin_lock(&etmdrvdata[cpu]->spinlock); - if (local_read(&etmdrvdata[cpu]->csdev->mode)) + if (coresight_get_mode(etmdrvdata[cpu]->csdev)) etm4_disable_hw(etmdrvdata[cpu]); spin_unlock(&etmdrvdata[cpu]->spinlock); return 0; @@ -1831,7 +1831,7 @@ static int etm4_cpu_save(struct etmv4_drvdata *drvdata) * Save and restore the ETM Trace registers only if * the ETM is active. */ - if (local_read(&drvdata->csdev->mode) && drvdata->save_state) + if (coresight_get_mode(drvdata->csdev) && drvdata->save_state) ret = __etm4_cpu_save(drvdata); return ret; } diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index 80fed4c377f1f..53a07a5369682 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -262,7 +262,7 @@ static void stm_disable(struct coresight_device *csdev, * change its status. As such we can read the status here without * fearing it will change under us. */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { spin_lock(&drvdata->spinlock); stm_disable_hw(drvdata); spin_unlock(&drvdata->spinlock); @@ -369,7 +369,7 @@ static long stm_generic_set_options(struct stm_data *stm_data, { struct stm_drvdata *drvdata = container_of(stm_data, struct stm_drvdata, stm); - if (!(drvdata && local_read(&drvdata->csdev->mode))) + if (!(drvdata && coresight_get_mode(drvdata->csdev))) return -EINVAL; if (channel >= drvdata->numsp) @@ -404,7 +404,7 @@ static ssize_t notrace stm_generic_packet(struct stm_data *stm_data, struct stm_drvdata, stm); unsigned int stm_flags; - if (!(drvdata && local_read(&drvdata->csdev->mode))) + if (!(drvdata && coresight_get_mode(drvdata->csdev))) return -EACCES; if (channel >= drvdata->numsp) @@ -511,7 +511,7 @@ static ssize_t port_select_show(struct device *dev, struct stm_drvdata *drvdata = dev_get_drvdata(dev->parent); unsigned long val; - if (!local_read(&drvdata->csdev->mode)) { + if (!coresight_get_mode(drvdata->csdev)) { val = drvdata->stmspscr; } else { spin_lock(&drvdata->spinlock); @@ -537,7 +537,7 @@ static ssize_t port_select_store(struct device *dev, spin_lock(&drvdata->spinlock); drvdata->stmspscr = val; - if (local_read(&drvdata->csdev->mode)) { + if (coresight_get_mode(drvdata->csdev)) { CS_UNLOCK(drvdata->base); /* Process as per ARM's TRM recommendation */ stmsper = readl_relaxed(drvdata->base + STMSPER); @@ -558,7 +558,7 @@ static ssize_t port_enable_show(struct device *dev, struct stm_drvdata *drvdata = dev_get_drvdata(dev->parent); unsigned long val; - if (!local_read(&drvdata->csdev->mode)) { + if (!coresight_get_mode(drvdata->csdev)) { val = drvdata->stmsper; } else { spin_lock(&drvdata->spinlock); @@ -584,7 +584,7 @@ static ssize_t port_enable_store(struct device *dev, spin_lock(&drvdata->spinlock); drvdata->stmsper = val; - if (local_read(&drvdata->csdev->mode)) { + if (coresight_get_mode(drvdata->csdev)) { CS_UNLOCK(drvdata->base); writel_relaxed(drvdata->stmsper, drvdata->base + STMSPER); CS_LOCK(drvdata->base); diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c index fa52297c73d2f..f9444e2cb1d9f 100644 --- a/drivers/hwtracing/coresight/coresight-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-sysfs.c @@ -62,7 +62,7 @@ static int coresight_enable_source_sysfs(struct coresight_device *csdev, * change with coresight_mutex held, which we already have here. */ lockdep_assert_held(&coresight_mutex); - if (local_read(&csdev->mode) != CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) != CS_MODE_SYSFS) { ret = source_ops(csdev)->enable(csdev, data, mode); if (ret) return ret; @@ -87,7 +87,7 @@ static bool coresight_disable_source_sysfs(struct coresight_device *csdev, void *data) { lockdep_assert_held(&coresight_mutex); - if (local_read(&csdev->mode) != CS_MODE_SYSFS) + if (coresight_get_mode(csdev) != CS_MODE_SYSFS) return false; csdev->refcnt--; @@ -184,7 +184,7 @@ int coresight_enable_sysfs(struct coresight_device *csdev) * coresight_enable_source() so can still race with Perf mode which * doesn't hold coresight_mutex. */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { /* * There could be multiple applications driving the software * source. So keep the refcount for each such user when the @@ -338,7 +338,7 @@ static ssize_t enable_source_show(struct device *dev, guard(mutex)(&coresight_mutex); return scnprintf(buf, PAGE_SIZE, "%u\n", - local_read(&csdev->mode) == CS_MODE_SYSFS); + coresight_get_mode(csdev) == CS_MODE_SYSFS); } static ssize_t enable_source_store(struct device *dev, diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c index 90f3c06964d02..58c8c717631c1 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-core.c +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c @@ -547,7 +547,7 @@ static void tmc_shutdown(struct amba_device *adev) spin_lock_irqsave(&drvdata->spinlock, flags); - if (local_read(&drvdata->csdev->mode) == CS_MODE_DISABLED) + if (coresight_get_mode(drvdata->csdev) == CS_MODE_DISABLED) goto out; if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index f3281c958a570..77ef67c976e92 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -89,7 +89,7 @@ static void __tmc_etb_disable_hw(struct tmc_drvdata *drvdata) * When operating in sysFS mode the content of the buffer needs to be * read before the TMC is disabled. */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) tmc_etb_dump_hw(drvdata); tmc_disable_hw(drvdata); @@ -205,7 +205,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) * sink is already enabled no memory is needed and the HW need not be * touched. */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { csdev->refcnt++; goto out; } @@ -262,7 +262,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data) * No need to continue if the ETB/ETF is already operated * from sysFS. */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { ret = -EBUSY; break; } @@ -345,7 +345,7 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); + WARN_ON_ONCE(coresight_get_mode(csdev) == CS_MODE_DISABLED); tmc_etb_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; @@ -485,7 +485,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, return 0; /* This shouldn't happen */ - if (WARN_ON_ONCE(local_read(&csdev->mode) != CS_MODE_PERF)) + if (WARN_ON_ONCE(coresight_get_mode(csdev) != CS_MODE_PERF)) return 0; spin_lock_irqsave(&drvdata->spinlock, flags); @@ -631,7 +631,7 @@ int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) } /* Don't interfere if operated from Perf */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_PERF) { + if (coresight_get_mode(drvdata->csdev) == CS_MODE_PERF) { ret = -EINVAL; goto out; } @@ -643,7 +643,7 @@ int tmc_read_prepare_etb(struct tmc_drvdata *drvdata) } /* Disable the TMC if need be */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) { /* There is no point in reading a TMC in HW FIFO mode */ mode = readl_relaxed(drvdata->base + TMC_MODE); if (mode != TMC_MODE_CIRCULAR_BUFFER) { @@ -675,7 +675,7 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata) spin_lock_irqsave(&drvdata->spinlock, flags); /* Re-enable the TMC if need be */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) { /* There is no point in reading a TMC in HW FIFO mode */ mode = readl_relaxed(drvdata->base + TMC_MODE); if (mode != TMC_MODE_CIRCULAR_BUFFER) { diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 0720793701b99..938bae2191cfa 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1123,7 +1123,7 @@ static void __tmc_etr_disable_hw(struct tmc_drvdata *drvdata) * When operating in sysFS mode the content of the buffer needs to be * read before the TMC is disabled. */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) tmc_etr_sync_sysfs_buf(drvdata); tmc_disable_hw(drvdata); @@ -1169,7 +1169,7 @@ static struct etr_buf *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev) spin_lock_irqsave(&drvdata->spinlock, flags); } - if (drvdata->reading || local_read(&csdev->mode) == CS_MODE_PERF) { + if (drvdata->reading || coresight_get_mode(csdev) == CS_MODE_PERF) { ret = -EBUSY; goto out; } @@ -1210,7 +1210,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * sink is already enabled no memory is needed and the HW need not be * touched, even if the buffer size has changed. */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { csdev->refcnt++; goto out; } @@ -1632,7 +1632,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) spin_lock_irqsave(&drvdata->spinlock, flags); /* Don't use this sink if it is already claimed by sysFS */ - if (local_read(&csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(csdev) == CS_MODE_SYSFS) { rc = -EBUSY; goto unlock_out; } @@ -1706,7 +1706,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); + WARN_ON_ONCE(coresight_get_mode(csdev) == CS_MODE_DISABLED); tmc_etr_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; @@ -1758,7 +1758,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) } /* Disable the TMC if we are trying to read from a running session. */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) __tmc_etr_disable_hw(drvdata); drvdata->reading = true; @@ -1780,7 +1780,7 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) spin_lock_irqsave(&drvdata->spinlock, flags); /* RE-enable the TMC if need be */ - if (local_read(&drvdata->csdev->mode) == CS_MODE_SYSFS) { + if (coresight_get_mode(drvdata->csdev) == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace * buffer. Since the tracer is still enabled drvdata::buf can't diff --git a/drivers/hwtracing/coresight/ultrasoc-smb.c b/drivers/hwtracing/coresight/ultrasoc-smb.c index 5ff5dad40e08a..707dded2ad470 100644 --- a/drivers/hwtracing/coresight/ultrasoc-smb.c +++ b/drivers/hwtracing/coresight/ultrasoc-smb.c @@ -216,7 +216,7 @@ static void smb_enable_sysfs(struct coresight_device *csdev) { struct smb_drv_data *drvdata = dev_get_drvdata(csdev->dev.parent); - if (local_read(&csdev->mode) != CS_MODE_DISABLED) + if (coresight_get_mode(csdev) != CS_MODE_DISABLED) return; smb_enable_hw(drvdata); @@ -264,8 +264,8 @@ static int smb_enable(struct coresight_device *csdev, enum cs_mode mode, } /* Do nothing, the SMB is already enabled as other mode */ - if (local_read(&csdev->mode) != CS_MODE_DISABLED && - local_read(&csdev->mode) != mode) { + if (coresight_get_mode(csdev) != CS_MODE_DISABLED && + coresight_get_mode(csdev) != mode) { ret = -EBUSY; goto out; } @@ -311,7 +311,7 @@ static int smb_disable(struct coresight_device *csdev) } /* Complain if we (somehow) got out of sync */ - WARN_ON_ONCE(local_read(&csdev->mode) == CS_MODE_DISABLED); + WARN_ON_ONCE(coresight_get_mode(csdev) == CS_MODE_DISABLED); smb_disable_hw(drvdata); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 9f1a78246a881..50690d6f9c297 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -591,6 +591,11 @@ static inline bool coresight_take_mode(struct coresight_device *csdev, CS_MODE_DISABLED; } +static inline enum cs_mode coresight_get_mode(struct coresight_device *csdev) +{ + return local_read(&csdev->mode); +} + extern struct coresight_device * coresight_register(struct coresight_desc *desc); extern void coresight_unregister(struct coresight_device *csdev); From 8b4081b279011383e4e1b5e1cda691c6bffcab83 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 29 Jan 2024 15:40:43 +0000 Subject: [PATCH 318/548] coresight: Add helper for setting csdev->mode Now that mode is in struct coresight_device, sets can be wrapped. This also allows us to add a sanity check that there have been no concurrent modifications of mode. Currently all usages of local_set() were inside the device's spin locks so this new warning shouldn't be triggered. coresight_take_mode() could maybe have been used in place of adding the warning, but there may be use cases which set the mode to the same mode which are valid but would fail in coresight_take_mode() because it requires the device to only be in the disabled state. Signed-off-by: James Clark Link: https://lore.kernel.org/r/20240129154050.569566-13-james.clark@arm.com Signed-off-by: Suzuki K Poulose (cherry picked from commit bcaabb95f0c9883fb8e1112bd13eaba9cfd62c15) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-etb10.c | 6 +++--- .../hwtracing/coresight/coresight-etm3x-core.c | 4 ++-- .../hwtracing/coresight/coresight-etm4x-core.c | 4 ++-- drivers/hwtracing/coresight/coresight-stm.c | 2 +- drivers/hwtracing/coresight/coresight-tmc-etf.c | 10 +++++----- drivers/hwtracing/coresight/coresight-tmc-etr.c | 6 +++--- drivers/hwtracing/coresight/ultrasoc-smb.c | 6 +++--- include/linux/coresight.h | 16 ++++++++++++++++ 8 files changed, 35 insertions(+), 19 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 4e82d9c20d36c..aea4ce6fb0b60 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -158,7 +158,7 @@ static int etb_enable_sysfs(struct coresight_device *csdev) if (ret) goto out; - local_set(&csdev->mode, CS_MODE_SYSFS); + coresight_set_mode(csdev, CS_MODE_SYSFS); } csdev->refcnt++; @@ -214,7 +214,7 @@ static int etb_enable_perf(struct coresight_device *csdev, void *data) if (!ret) { /* Associate with monitored process. */ drvdata->pid = pid; - local_set(&drvdata->csdev->mode, CS_MODE_PERF); + coresight_set_mode(drvdata->csdev, CS_MODE_PERF); csdev->refcnt++; } @@ -365,7 +365,7 @@ static int etb_disable(struct coresight_device *csdev) etb_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); spin_unlock_irqrestore(&drvdata->spinlock, flags); dev_dbg(&csdev->dev, "ETB disabled\n"); diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index 63991029cda05..bda9fb61559c8 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -576,7 +576,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event, /* The tracer didn't start */ if (ret) - local_set(&drvdata->csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(drvdata->csdev, CS_MODE_DISABLED); return ret; } @@ -693,7 +693,7 @@ static void etm_disable(struct coresight_device *csdev, } if (mode) - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); } static const struct coresight_ops_source etm_source_ops = { diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index befb32dbbf827..39ad5950ea3c1 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -877,7 +877,7 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, /* The tracer didn't start */ if (ret) - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); return ret; } @@ -1054,7 +1054,7 @@ static void etm4_disable(struct coresight_device *csdev, } if (mode) - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); } static const struct coresight_ops_source etm4_source_ops = { diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index 53a07a5369682..6e801191d1db2 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -272,7 +272,7 @@ static void stm_disable(struct coresight_device *csdev, pm_runtime_put(csdev->dev.parent); - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); dev_dbg(&csdev->dev, "STM tracing disabled\n"); } } diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 77ef67c976e92..d4f641cd9de69 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -228,7 +228,7 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev) ret = tmc_etb_enable_hw(drvdata); if (!ret) { - local_set(&csdev->mode, CS_MODE_SYSFS); + coresight_set_mode(csdev, CS_MODE_SYSFS); csdev->refcnt++; } else { /* Free up the buffer if we failed to enable */ @@ -292,7 +292,7 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev, void *data) if (!ret) { /* Associate with monitored process. */ drvdata->pid = pid; - local_set(&csdev->mode, CS_MODE_PERF); + coresight_set_mode(csdev, CS_MODE_PERF); csdev->refcnt++; } } while (0); @@ -349,7 +349,7 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev) tmc_etb_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); spin_unlock_irqrestore(&drvdata->spinlock, flags); @@ -375,7 +375,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, if (csdev->refcnt == 0) { ret = tmc_etf_enable_hw(drvdata); if (!ret) { - local_set(&csdev->mode, CS_MODE_SYSFS); + coresight_set_mode(csdev, CS_MODE_SYSFS); first_enable = true; } } @@ -405,7 +405,7 @@ static void tmc_disable_etf_link(struct coresight_device *csdev, csdev->refcnt--; if (csdev->refcnt == 0) { tmc_etf_disable_hw(drvdata); - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); last_disable = true; } spin_unlock_irqrestore(&drvdata->spinlock, flags); diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 938bae2191cfa..3114557066ff9 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1217,7 +1217,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) ret = tmc_etr_enable_hw(drvdata, sysfs_buf); if (!ret) { - local_set(&csdev->mode, CS_MODE_SYSFS); + coresight_set_mode(csdev, CS_MODE_SYSFS); csdev->refcnt++; } @@ -1664,7 +1664,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) if (!rc) { /* Associate with monitored process. */ drvdata->pid = pid; - local_set(&csdev->mode, CS_MODE_PERF); + coresight_set_mode(csdev, CS_MODE_PERF); drvdata->perf_buf = etr_perf->etr_buf; csdev->refcnt++; } @@ -1710,7 +1710,7 @@ static int tmc_disable_etr_sink(struct coresight_device *csdev) tmc_etr_disable_hw(drvdata); /* Dissociate from monitored process. */ drvdata->pid = -1; - local_set(&csdev->mode, CS_MODE_DISABLED); + coresight_set_mode(csdev, CS_MODE_DISABLED); /* Reset perf specific data */ drvdata->perf_buf = NULL; diff --git a/drivers/hwtracing/coresight/ultrasoc-smb.c b/drivers/hwtracing/coresight/ultrasoc-smb.c index 707dded2ad470..d3d3036d7350b 100644 --- a/drivers/hwtracing/coresight/ultrasoc-smb.c +++ b/drivers/hwtracing/coresight/ultrasoc-smb.c @@ -220,7 +220,7 @@ static void smb_enable_sysfs(struct coresight_device *csdev) return; smb_enable_hw(drvdata); - local_set(&csdev->mode, CS_MODE_SYSFS); + coresight_set_mode(csdev, CS_MODE_SYSFS); } static int smb_enable_perf(struct coresight_device *csdev, void *data) @@ -243,7 +243,7 @@ static int smb_enable_perf(struct coresight_device *csdev, void *data) if (drvdata->pid == -1) { smb_enable_hw(drvdata); drvdata->pid = pid; - local_set(&csdev->mode, CS_MODE_PERF); + coresight_set_mode(csdev, CS_MODE_PERF); } return 0; @@ -317,7 +317,7 @@ static int smb_disable(struct coresight_device *csdev) /* Dissociate from the target process. */ drvdata->pid = -1; - drvdata->mode = CS_MODE_DISABLED; + coresight_set_mode(csdev, CS_MODE_DISABLED); dev_dbg(&csdev->dev, "Ultrasoc SMB disabled\n"); out: diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 50690d6f9c297..6d007bf25e4c9 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -596,6 +596,22 @@ static inline enum cs_mode coresight_get_mode(struct coresight_device *csdev) return local_read(&csdev->mode); } +static inline void coresight_set_mode(struct coresight_device *csdev, + enum cs_mode new_mode) +{ + enum cs_mode current_mode = coresight_get_mode(csdev); + + /* + * Changing to a new mode must be done from an already disabled state + * unless it's synchronized with coresight_take_mode(). Otherwise the + * device is already in use and signifies a locking issue. + */ + WARN(new_mode != CS_MODE_DISABLED && current_mode != CS_MODE_DISABLED && + current_mode != new_mode, "Device already in use\n"); + + local_set(&csdev->mode, new_mode); +} + extern struct coresight_device * coresight_register(struct coresight_desc *desc); extern void coresight_unregister(struct coresight_device *csdev); From 185773b60e54246fb889aa3d3c2b9fbb171e5aa7 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:51 +0100 Subject: [PATCH 319/548] coresight: Remove unused ETM Perf stubs This file is never included anywhere if CONFIG_CORESIGHT is not set so they are unused and aren't currently compile tested with any config so remove them. Reviewed-by: Anshuman Khandual Reviewed-by: Mike Leach Signed-off-by: James Clark Tested-by: Leo Yan Tested-by: Ganapatrao Kulkarni Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-10-james.clark@linaro.org (cherry picked from commit 34172002bdac28dabbb846f191a86454bb226c7a) Signed-off-by: Koba Ko --- .../hwtracing/coresight/coresight-etm-perf.h | 18 ------------------ 1 file changed, 18 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h index bebbadee2ceb3..744531158d6ba 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.h +++ b/drivers/hwtracing/coresight/coresight-etm-perf.h @@ -62,7 +62,6 @@ struct etm_event_data { struct list_head * __percpu *path; }; -#if IS_ENABLED(CONFIG_CORESIGHT) int etm_perf_symlink(struct coresight_device *csdev, bool link); int etm_perf_add_symlink_sink(struct coresight_device *csdev); void etm_perf_del_symlink_sink(struct coresight_device *csdev); @@ -77,23 +76,6 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle) int etm_perf_add_symlink_cscfg(struct device *dev, struct cscfg_config_desc *config_desc); void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc); -#else -static inline int etm_perf_symlink(struct coresight_device *csdev, bool link) -{ return -EINVAL; } -int etm_perf_add_symlink_sink(struct coresight_device *csdev) -{ return -EINVAL; } -void etm_perf_del_symlink_sink(struct coresight_device *csdev) {} -static inline void *etm_perf_sink_config(struct perf_output_handle *handle) -{ - return NULL; -} -int etm_perf_add_symlink_cscfg(struct device *dev, - struct cscfg_config_desc *config_desc) -{ return -EINVAL; } -void etm_perf_del_symlink_cscfg(struct cscfg_config_desc *config_desc) {} - -#endif /* CONFIG_CORESIGHT */ - int __init etm_perf_init(void); void etm_perf_exit(void); From 1db4d4c1b447a2c03315e97e6872996fbe2c1d1b Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:52 +0100 Subject: [PATCH 320/548] coresight: Clarify comments around the PID of the sink owner "Process being monitored" and "pid of the process to monitor" imply that this would be the same PID if there were two sessions targeting the same process. But this is actually the PID of the process that did the Perf event open call, rather than the target of the session. So update the comments to make this clearer. Reviewed-by: Anshuman Khandual Reviewed-by: Mike Leach Signed-off-by: James Clark Tested-by: Leo Yan Tested-by: Ganapatrao Kulkarni Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-11-james.clark@linaro.org (cherry picked from commit eda1d11979c03f0104ca59ec4a0478cd52fa20de) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 5 +++-- drivers/hwtracing/coresight/coresight-tmc.h | 5 +++-- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 3114557066ff9..6dd3020124174 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -30,7 +30,8 @@ struct etr_flat_buf { * etr_perf_buffer - Perf buffer used for ETR * @drvdata - The ETR drvdaga this buffer has been allocated for. * @etr_buf - Actual buffer used by the ETR - * @pid - The PID this etr_perf_buffer belongs to. + * @pid - The PID of the session owner that etr_perf_buffer + * belongs to. * @snaphost - Perf session mode * @nr_pages - Number of pages in the ring buffer. * @pages - Array of Pages in the ring buffer. @@ -1642,7 +1643,7 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) goto unlock_out; } - /* Get a handle on the pid of the process to monitor */ + /* Get a handle on the pid of the session owner */ pid = etr_perf->pid; /* Do not proceed if this device is associated with another session */ diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 232db5bdc7fcf..9ca17cff1e526 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -169,8 +169,9 @@ struct etr_buf { * @csdev: component vitals needed by the framework. * @miscdev: specifics to handle "/dev/xyz.tmc" entry. * @spinlock: only one at a time pls. - * @pid: Process ID of the process being monitored by the session - * that is using this component. + * @pid: Process ID of the process that owns the session that is using + * this component. For example this would be the pid of the Perf + * process. * @buf: Snapshot of the trace data for ETF/ETB. * @etr_buf: details of buffer used in TMC-ETR * @len: size of the available trace for ETF/ETB. From 4dd96c23875779e0678e56305eec20fd909eb1d7 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:53 +0100 Subject: [PATCH 321/548] coresight: Move struct coresight_trace_id_map to common header The trace ID maps will need to be created and stored by the core and Perf code so move the definition up to the common header. Reviewed-by: Anshuman Khandual Reviewed-by: Mike Leach Signed-off-by: James Clark Tested-by: Leo Yan Tested-by: Ganapatrao Kulkarni Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-12-james.clark@linaro.org (cherry picked from commit acb0184fe9bca3bb7103dadc76ba9d38c969ca86) Signed-off-by: Koba Ko --- .../hwtracing/coresight/coresight-trace-id.c | 1 + .../hwtracing/coresight/coresight-trace-id.h | 19 ------------------- include/linux/coresight.h | 18 ++++++++++++++++++ 3 files changed, 19 insertions(+), 19 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c index af5b4ef59ceab..19005b5b4dc41 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.c +++ b/drivers/hwtracing/coresight/coresight-trace-id.c @@ -3,6 +3,7 @@ * Copyright (c) 2022, Linaro Limited, All rights reserved. * Author: Mike Leach */ +#include #include #include #include diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h index 3797777d367e6..49438a96fcc6e 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.h +++ b/drivers/hwtracing/coresight/coresight-trace-id.h @@ -32,10 +32,6 @@ #include #include - -/* architecturally we have 128 IDs some of which are reserved */ -#define CORESIGHT_TRACE_IDS_MAX 128 - /* ID 0 is reserved */ #define CORESIGHT_TRACE_ID_RES_0 0 @@ -46,21 +42,6 @@ #define IS_VALID_CS_TRACE_ID(id) \ ((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP)) -/** - * Trace ID map. - * - * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs. - * Initialised so that the reserved IDs are permanently marked as - * in use. - * @pend_rel_ids: CPU IDs that have been released by the trace source but not - * yet marked as available, to allow re-allocation to the same - * CPU during a perf session. - */ -struct coresight_trace_id_map { - DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX); - DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX); -}; - /* Allocate and release IDs for a single default trace ID map */ /** diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 6d007bf25e4c9..0ba3fff44139f 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -217,6 +217,24 @@ struct coresight_sysfs_link { const char *target_name; }; +/* architecturally we have 128 IDs some of which are reserved */ +#define CORESIGHT_TRACE_IDS_MAX 128 + +/** + * Trace ID map. + * + * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs. + * Initialised so that the reserved IDs are permanently marked as + * in use. + * @pend_rel_ids: CPU IDs that have been released by the trace source but not + * yet marked as available, to allow re-allocation to the same + * CPU during a perf session. + */ +struct coresight_trace_id_map { + DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX); + DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX); +}; + /** * struct coresight_device - representation of a device as used by the framework * @pdata: Platform data with device connections associated to this device. From 05d7078cf904c1be6c8088065ef1a5ea962c27b9 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:54 +0100 Subject: [PATCH 322/548] coresight: Expose map arguments in trace ID API The trace ID API is currently hard coded to always use the global map. Add public versions that allow the map to be passed in so that Perf mode can use per-sink maps. Keep the non-map versions so that sysfs mode can continue to use the default global map. System ID functions are unchanged because they will always use the default map. Signed-off-by: James Clark Reviewed-by: Mike Leach Tested-by: Leo Yan Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-13-james.clark@linaro.org (cherry picked from commit 7e52877868ae2546ead8ba07cdf1d3e4c9e931f7) Signed-off-by: Koba Ko --- .../hwtracing/coresight/coresight-trace-id.c | 36 ++++++++++++++----- .../hwtracing/coresight/coresight-trace-id.h | 20 +++++++++-- 2 files changed, 45 insertions(+), 11 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c index 19005b5b4dc41..5561989a03fab 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.c +++ b/drivers/hwtracing/coresight/coresight-trace-id.c @@ -12,7 +12,7 @@ #include "coresight-trace-id.h" -/* Default trace ID map. Used on systems that don't require per sink mappings */ +/* Default trace ID map. Used in sysfs mode and for system sources */ static struct coresight_trace_id_map id_map_default; /* maintain a record of the mapping of IDs and pending releases per cpu */ @@ -47,7 +47,7 @@ static void coresight_trace_id_dump_table(struct coresight_trace_id_map *id_map, #endif /* unlocked read of current trace ID value for given CPU */ -static int _coresight_trace_id_read_cpu_id(int cpu) +static int _coresight_trace_id_read_cpu_id(int cpu, struct coresight_trace_id_map *id_map) { return atomic_read(&per_cpu(cpu_id, cpu)); } @@ -152,7 +152,7 @@ static void coresight_trace_id_release_all_pending(void) DUMP_ID_MAP(id_map); } -static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map) +static int _coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map *id_map) { unsigned long flags; int id; @@ -160,7 +160,7 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_ spin_lock_irqsave(&id_map_lock, flags); /* check for existing allocation for this CPU */ - id = _coresight_trace_id_read_cpu_id(cpu); + id = _coresight_trace_id_read_cpu_id(cpu, id_map); if (id) goto get_cpu_id_clr_pend; @@ -196,13 +196,13 @@ static int coresight_trace_id_map_get_cpu_id(int cpu, struct coresight_trace_id_ return id; } -static void coresight_trace_id_map_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map) +static void _coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_map *id_map) { unsigned long flags; int id; /* check for existing allocation for this CPU */ - id = _coresight_trace_id_read_cpu_id(cpu); + id = _coresight_trace_id_read_cpu_id(cpu, id_map); if (!id) return; @@ -254,22 +254,40 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map * int coresight_trace_id_get_cpu_id(int cpu) { - return coresight_trace_id_map_get_cpu_id(cpu, &id_map_default); + return _coresight_trace_id_get_cpu_id(cpu, &id_map_default); } EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id); +int coresight_trace_id_get_cpu_id_map(int cpu, struct coresight_trace_id_map *id_map) +{ + return _coresight_trace_id_get_cpu_id(cpu, id_map); +} +EXPORT_SYMBOL_GPL(coresight_trace_id_get_cpu_id_map); + void coresight_trace_id_put_cpu_id(int cpu) { - coresight_trace_id_map_put_cpu_id(cpu, &id_map_default); + _coresight_trace_id_put_cpu_id(cpu, &id_map_default); } EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id); +void coresight_trace_id_put_cpu_id_map(int cpu, struct coresight_trace_id_map *id_map) +{ + _coresight_trace_id_put_cpu_id(cpu, id_map); +} +EXPORT_SYMBOL_GPL(coresight_trace_id_put_cpu_id_map); + int coresight_trace_id_read_cpu_id(int cpu) { - return _coresight_trace_id_read_cpu_id(cpu); + return _coresight_trace_id_read_cpu_id(cpu, &id_map_default); } EXPORT_SYMBOL_GPL(coresight_trace_id_read_cpu_id); +int coresight_trace_id_read_cpu_id_map(int cpu, struct coresight_trace_id_map *id_map) +{ + return _coresight_trace_id_read_cpu_id(cpu, id_map); +} +EXPORT_SYMBOL_GPL(coresight_trace_id_read_cpu_id_map); + int coresight_trace_id_get_system_id(void) { return coresight_trace_id_map_get_system_id(&id_map_default); diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h index 49438a96fcc6e..840babdd07944 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.h +++ b/drivers/hwtracing/coresight/coresight-trace-id.h @@ -42,8 +42,6 @@ #define IS_VALID_CS_TRACE_ID(id) \ ((id > CORESIGHT_TRACE_ID_RES_0) && (id < CORESIGHT_TRACE_ID_RES_TOP)) -/* Allocate and release IDs for a single default trace ID map */ - /** * Read and optionally allocate a CoreSight trace ID and associate with a CPU. * @@ -59,6 +57,12 @@ */ int coresight_trace_id_get_cpu_id(int cpu); +/** + * Version of coresight_trace_id_get_cpu_id() that allows the ID map to operate + * on to be provided. + */ +int coresight_trace_id_get_cpu_id_map(int cpu, struct coresight_trace_id_map *id_map); + /** * Release an allocated trace ID associated with the CPU. * @@ -72,6 +76,12 @@ int coresight_trace_id_get_cpu_id(int cpu); */ void coresight_trace_id_put_cpu_id(int cpu); +/** + * Version of coresight_trace_id_put_cpu_id() that allows the ID map to operate + * on to be provided. + */ +void coresight_trace_id_put_cpu_id_map(int cpu, struct coresight_trace_id_map *id_map); + /** * Read the current allocated CoreSight Trace ID value for the CPU. * @@ -92,6 +102,12 @@ void coresight_trace_id_put_cpu_id(int cpu); */ int coresight_trace_id_read_cpu_id(int cpu); +/** + * Version of coresight_trace_id_read_cpu_id() that allows the ID map to operate + * on to be provided. + */ +int coresight_trace_id_read_cpu_id_map(int cpu, struct coresight_trace_id_map *id_map); + /** * Allocate a CoreSight trace ID for a system component. * From e18b6787ef329ffe6a24292a8790eed144c51516 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:55 +0100 Subject: [PATCH 323/548] coresight: Make CPU id map a property of a trace ID map The global CPU ID mappings won't work for per-sink ID maps so move it to the ID map struct. coresight_trace_id_release_all_pending() is hard coded to operate on the default map, but once Perf sessions use their own maps the pending release mechanism will be deleted. So it doesn't need to be extended to accept a trace ID map argument at this point. Signed-off-by: James Clark Reviewed-by: Mike Leach Tested-by: Leo Yan Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-14-james.clark@linaro.org (cherry picked from commit d53c8253c7822fbc524adcd950e7c6de6a229c99) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-trace-id.c | 16 +++++++++------- include/linux/coresight.h | 1 + 2 files changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c index 5561989a03fab..8a777c0af6ea2 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.c +++ b/drivers/hwtracing/coresight/coresight-trace-id.c @@ -13,10 +13,12 @@ #include "coresight-trace-id.h" /* Default trace ID map. Used in sysfs mode and for system sources */ -static struct coresight_trace_id_map id_map_default; +static DEFINE_PER_CPU(atomic_t, id_map_default_cpu_ids) = ATOMIC_INIT(0); +static struct coresight_trace_id_map id_map_default = { + .cpu_map = &id_map_default_cpu_ids +}; -/* maintain a record of the mapping of IDs and pending releases per cpu */ -static DEFINE_PER_CPU(atomic_t, cpu_id) = ATOMIC_INIT(0); +/* maintain a record of the pending releases per cpu */ static cpumask_t cpu_id_release_pending; /* perf session active counter */ @@ -49,7 +51,7 @@ static void coresight_trace_id_dump_table(struct coresight_trace_id_map *id_map, /* unlocked read of current trace ID value for given CPU */ static int _coresight_trace_id_read_cpu_id(int cpu, struct coresight_trace_id_map *id_map) { - return atomic_read(&per_cpu(cpu_id, cpu)); + return atomic_read(per_cpu_ptr(id_map->cpu_map, cpu)); } /* look for next available odd ID, return 0 if none found */ @@ -145,7 +147,7 @@ static void coresight_trace_id_release_all_pending(void) clear_bit(bit, id_map->pend_rel_ids); } for_each_cpu(cpu, &cpu_id_release_pending) { - atomic_set(&per_cpu(cpu_id, cpu), 0); + atomic_set(per_cpu_ptr(id_map_default.cpu_map, cpu), 0); cpumask_clear_cpu(cpu, &cpu_id_release_pending); } spin_unlock_irqrestore(&id_map_lock, flags); @@ -181,7 +183,7 @@ static int _coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map goto get_cpu_id_out_unlock; /* allocate the new id to the cpu */ - atomic_set(&per_cpu(cpu_id, cpu), id); + atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), id); get_cpu_id_clr_pend: /* we are (re)using this ID - so ensure it is not marked for release */ @@ -215,7 +217,7 @@ static void _coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_ma } else { /* otherwise clear id */ coresight_trace_id_free(id, id_map); - atomic_set(&per_cpu(cpu_id, cpu), 0); + atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0); } spin_unlock_irqrestore(&id_map_lock, flags); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 0ba3fff44139f..ec85004217052 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -233,6 +233,7 @@ struct coresight_sysfs_link { struct coresight_trace_id_map { DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX); DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX); + atomic_t __percpu *cpu_map; }; /** From 01712606223c286e51a791e9d5028ae279481061 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:56 +0100 Subject: [PATCH 324/548] coresight: Use per-sink trace ID maps for Perf sessions This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs as long as there are fewer than that many ETMs connected to each sink. Each sink owns its own trace ID map, and any Perf session connecting to that sink will allocate from it, even if the sink is currently in use by other users. This is similar to the existing behavior where the dynamic trace IDs are constant as long as there is any concurrent Perf session active. It's not completely optimal because slightly more IDs will be used than necessary, but the optimal solution involves tracking the PIDs of each session and allocating ID maps based on the session owner. This is difficult to do with the combination of per-thread and per-cpu modes and some scheduling issues. The complexity of this isn't likely to worth it because even with multiple users they'd just see a difference in the ordering of ID allocations rather than hitting any limits (unless the hardware does have too many ETMs connected to one sink). Signed-off-by: James Clark Reviewed-by: Mike Leach Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-15-james.clark@linaro.org (cherry picked from commit 5ad628a7617607baac53745f8025638b967470a2) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 10 ++++++++++ drivers/hwtracing/coresight/coresight-dummy.c | 3 ++- drivers/hwtracing/coresight/coresight-etm-perf.c | 15 ++++++++++----- .../hwtracing/coresight/coresight-etm3x-core.c | 9 +++++---- .../hwtracing/coresight/coresight-etm4x-core.c | 9 +++++---- drivers/hwtracing/coresight/coresight-stm.c | 3 ++- drivers/hwtracing/coresight/coresight-sysfs.c | 3 ++- drivers/hwtracing/coresight/coresight-tpdm.c | 3 ++- include/linux/coresight.h | 3 ++- 9 files changed, 40 insertions(+), 18 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 9f49bca5ef355..a84035b184e94 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -903,6 +903,7 @@ static void coresight_device_release(struct device *dev) struct coresight_device *csdev = to_coresight_device(dev); fwnode_handle_put(csdev->dev.fwnode); + free_percpu(csdev->perf_sink_id_map.cpu_map); kfree(csdev); } @@ -1170,6 +1171,15 @@ struct coresight_device *coresight_register(struct coresight_desc *desc) csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev)); dev_set_name(&csdev->dev, "%s", desc->name); + if (csdev->type == CORESIGHT_DEV_TYPE_SINK || + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) { + csdev->perf_sink_id_map.cpu_map = alloc_percpu(atomic_t); + if (!csdev->perf_sink_id_map.cpu_map) { + kfree(csdev); + ret = -ENOMEM; + goto err_out; + } + } /* * Make sure the device registration and the connection fixup * are synchronised, so that we don't see uninitialised devices diff --git a/drivers/hwtracing/coresight/coresight-dummy.c b/drivers/hwtracing/coresight/coresight-dummy.c index e4deafae7bc20..e8faa2062ec36 100644 --- a/drivers/hwtracing/coresight/coresight-dummy.c +++ b/drivers/hwtracing/coresight/coresight-dummy.c @@ -21,7 +21,8 @@ DEFINE_CORESIGHT_DEVLIST(source_devs, "dummy_source"); DEFINE_CORESIGHT_DEVLIST(sink_devs, "dummy_sink"); static int dummy_source_enable(struct coresight_device *csdev, - struct perf_event *event, enum cs_mode mode) + struct perf_event *event, enum cs_mode mode, + __maybe_unused struct coresight_trace_id_map *id_map) { dev_dbg(csdev->dev.parent, "Dummy source enabled\n"); diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 58b32b399fac2..ff96c7765f258 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -227,10 +227,13 @@ static void free_event_data(struct work_struct *work) struct list_head **ppath; ppath = etm_event_cpu_path_ptr(event_data, cpu); - if (!(IS_ERR_OR_NULL(*ppath))) + if (!(IS_ERR_OR_NULL(*ppath))) { + struct coresight_device *sink = coresight_get_sink(*ppath); + + coresight_trace_id_put_cpu_id_map(cpu, &sink->perf_sink_id_map); coresight_release_path(*ppath); + } *ppath = NULL; - coresight_trace_id_put_cpu_id(cpu); } /* mark perf event as done for trace id allocator */ @@ -399,7 +402,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, } /* ensure we can allocate a trace ID for this CPU */ - trace_id = coresight_trace_id_get_cpu_id(cpu); + trace_id = coresight_trace_id_get_cpu_id_map(cpu, &sink->perf_sink_id_map); if (!IS_VALID_CS_TRACE_ID(trace_id)) { cpumask_clear_cpu(cpu, mask); coresight_release_path(path); @@ -493,7 +496,8 @@ static void etm_event_start(struct perf_event *event, int flags) goto fail_end_stop; /* Finally enable the tracer */ - if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF)) + if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF, + &sink->perf_sink_id_map)) goto fail_disable_path; /* @@ -505,7 +509,8 @@ static void etm_event_start(struct perf_event *event, int flags) hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK, CS_AUX_HW_ID_CURR_VERSION); hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK, - coresight_trace_id_read_cpu_id(cpu)); + coresight_trace_id_read_cpu_id_map(cpu, + &sink->perf_sink_id_map)); perf_report_aux_output_id(event, hw_id); } diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c b/drivers/hwtracing/coresight/coresight-etm3x-core.c index bda9fb61559c8..6f760aa71f7fd 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c @@ -481,7 +481,8 @@ void etm_release_trace_id(struct etm_drvdata *drvdata) } static int etm_enable_perf(struct coresight_device *csdev, - struct perf_event *event) + struct perf_event *event, + struct coresight_trace_id_map *id_map) { struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); int trace_id; @@ -500,7 +501,7 @@ static int etm_enable_perf(struct coresight_device *csdev, * with perf locks - we know the ID cannot change until perf shuts down * the session */ - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu); + trace_id = coresight_trace_id_read_cpu_id_map(drvdata->cpu, id_map); if (!IS_VALID_CS_TRACE_ID(trace_id)) { dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n", dev_name(&drvdata->csdev->dev), drvdata->cpu); @@ -553,7 +554,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev) } static int etm_enable(struct coresight_device *csdev, struct perf_event *event, - enum cs_mode mode) + enum cs_mode mode, struct coresight_trace_id_map *id_map) { int ret; struct etm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -568,7 +569,7 @@ static int etm_enable(struct coresight_device *csdev, struct perf_event *event, ret = etm_enable_sysfs(csdev); break; case CS_MODE_PERF: - ret = etm_enable_perf(csdev, event); + ret = etm_enable_perf(csdev, event, id_map); break; default: ret = -EINVAL; diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c index 39ad5950ea3c1..c50aeafddf22e 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c @@ -770,7 +770,8 @@ static int etm4_parse_event_config(struct coresight_device *csdev, } static int etm4_enable_perf(struct coresight_device *csdev, - struct perf_event *event) + struct perf_event *event, + struct coresight_trace_id_map *id_map) { int ret = 0, trace_id; struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -793,7 +794,7 @@ static int etm4_enable_perf(struct coresight_device *csdev, * with perf locks - we know the ID cannot change until perf shuts down * the session */ - trace_id = coresight_trace_id_read_cpu_id(drvdata->cpu); + trace_id = coresight_trace_id_read_cpu_id_map(drvdata->cpu, id_map); if (!IS_VALID_CS_TRACE_ID(trace_id)) { dev_err(&drvdata->csdev->dev, "Failed to set trace ID for %s on CPU%d\n", dev_name(&drvdata->csdev->dev), drvdata->cpu); @@ -855,7 +856,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev) } static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, - enum cs_mode mode) + enum cs_mode mode, struct coresight_trace_id_map *id_map) { int ret; @@ -869,7 +870,7 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event, ret = etm4_enable_sysfs(csdev); break; case CS_MODE_PERF: - ret = etm4_enable_perf(csdev, event); + ret = etm4_enable_perf(csdev, event, id_map); break; default: ret = -EINVAL; diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index 6e801191d1db2..31da4f37ee8ab 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -191,7 +191,8 @@ static void stm_enable_hw(struct stm_drvdata *drvdata) } static int stm_enable(struct coresight_device *csdev, struct perf_event *event, - enum cs_mode mode) + enum cs_mode mode, + __maybe_unused struct coresight_trace_id_map *trace_id) { struct stm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c index f9444e2cb1d9f..289bec2285796 100644 --- a/drivers/hwtracing/coresight/coresight-sysfs.c +++ b/drivers/hwtracing/coresight/coresight-sysfs.c @@ -9,6 +9,7 @@ #include #include "coresight-priv.h" +#include "coresight-trace-id.h" /* * Use IDR to map the hash of the source's device name @@ -63,7 +64,7 @@ static int coresight_enable_source_sysfs(struct coresight_device *csdev, */ lockdep_assert_held(&coresight_mutex); if (coresight_get_mode(csdev) != CS_MODE_SYSFS) { - ret = source_ops(csdev)->enable(csdev, data, mode); + ret = source_ops(csdev)->enable(csdev, data, mode, NULL); if (ret) return ret; } diff --git a/drivers/hwtracing/coresight/coresight-tpdm.c b/drivers/hwtracing/coresight/coresight-tpdm.c index b25284e06395f..f7c5256ade14e 100644 --- a/drivers/hwtracing/coresight/coresight-tpdm.c +++ b/drivers/hwtracing/coresight/coresight-tpdm.c @@ -286,7 +286,8 @@ static void __tpdm_enable(struct tpdm_drvdata *drvdata) } static int tpdm_enable(struct coresight_device *csdev, struct perf_event *event, - enum cs_mode mode) + enum cs_mode mode, + __maybe_unused struct coresight_trace_id_map *id_map) { struct tpdm_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index ec85004217052..226827ff71edc 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -289,6 +289,7 @@ struct coresight_device { bool sysfs_sink_activated; struct dev_ext_attribute *ea; struct coresight_device *def_sink; + struct coresight_trace_id_map perf_sink_id_map; /* sysfs links between components */ int nr_links; bool has_conns_grp; @@ -383,7 +384,7 @@ struct coresight_ops_link { struct coresight_ops_source { int (*cpu_id)(struct coresight_device *csdev); int (*enable)(struct coresight_device *csdev, struct perf_event *event, - enum cs_mode mode); + enum cs_mode mode, struct coresight_trace_id_map *id_map); void (*disable)(struct coresight_device *csdev, struct perf_event *event); }; From 6f8d9f81f642743a785b15af2d4ddbee125a724c Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:57 +0100 Subject: [PATCH 325/548] coresight: Remove pending trace ID release mechanism Pending the release of IDs was a way of managing concurrent sysfs and Perf sessions in a single global ID map. Perf may have finished while sysfs hadn't, and Perf shouldn't release the IDs in use by sysfs and vice versa. Now that Perf uses its own exclusive ID maps, pending release doesn't result in any different behavior than just releasing all IDs when the last Perf session finishes. As part of the per-sink trace ID change, we would have still had to make the pending mechanism work on a per-sink basis, due to the overlapping ID allocations, so instead of making that more complicated, just remove it. Signed-off-by: James Clark Reviewed-by: Mike Leach Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-16-james.clark@linaro.org (cherry picked from commit de0029fdde86092c75472c92e56a962f4edee0f6) Signed-off-by: Koba Ko --- .../hwtracing/coresight/coresight-etm-perf.c | 18 +++-- .../hwtracing/coresight/coresight-trace-id.c | 67 +++++-------------- .../hwtracing/coresight/coresight-trace-id.h | 31 ++++----- include/linux/coresight.h | 6 +- 4 files changed, 43 insertions(+), 79 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index ff96c7765f258..dda6e43bb04d1 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -230,15 +230,21 @@ static void free_event_data(struct work_struct *work) if (!(IS_ERR_OR_NULL(*ppath))) { struct coresight_device *sink = coresight_get_sink(*ppath); - coresight_trace_id_put_cpu_id_map(cpu, &sink->perf_sink_id_map); + /* + * Mark perf event as done for trace id allocator, but don't call + * coresight_trace_id_put_cpu_id_map() on individual IDs. Perf sessions + * never free trace IDs to ensure that the ID associated with a CPU + * cannot change during their and other's concurrent sessions. Instead, + * a refcount is used so that the last event to call + * coresight_trace_id_perf_stop() frees all IDs. + */ + coresight_trace_id_perf_stop(&sink->perf_sink_id_map); + coresight_release_path(*ppath); } *ppath = NULL; } - /* mark perf event as done for trace id allocator */ - coresight_trace_id_perf_stop(); - free_percpu(event_data->path); kfree(event_data); } @@ -326,9 +332,6 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, sink = user_sink = coresight_get_sink_by_id(id); } - /* tell the trace ID allocator that a perf event is starting up */ - coresight_trace_id_perf_start(); - /* check if user wants a coresight configuration selected */ cfg_hash = (u32)((event->attr.config2 & GENMASK_ULL(63, 32)) >> 32); if (cfg_hash) { @@ -409,6 +412,7 @@ static void *etm_setup_aux(struct perf_event *event, void **pages, continue; } + coresight_trace_id_perf_start(&sink->perf_sink_id_map); *etm_event_cpu_path_ptr(event_data, cpu) = path; } diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c index 8a777c0af6ea2..bddaed3e5cf85 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.c +++ b/drivers/hwtracing/coresight/coresight-trace-id.c @@ -18,12 +18,6 @@ static struct coresight_trace_id_map id_map_default = { .cpu_map = &id_map_default_cpu_ids }; -/* maintain a record of the pending releases per cpu */ -static cpumask_t cpu_id_release_pending; - -/* perf session active counter */ -static atomic_t perf_cs_etm_session_active = ATOMIC_INIT(0); - /* lock to protect id_map and cpu data */ static DEFINE_SPINLOCK(id_map_lock); @@ -35,7 +29,6 @@ static void coresight_trace_id_dump_table(struct coresight_trace_id_map *id_map, { pr_debug("%s id_map::\n", func_name); pr_debug("Used = %*pb\n", CORESIGHT_TRACE_IDS_MAX, id_map->used_ids); - pr_debug("Pend = %*pb\n", CORESIGHT_TRACE_IDS_MAX, id_map->pend_rel_ids); } #define DUMP_ID_MAP(map) coresight_trace_id_dump_table(map, __func__) #define DUMP_ID_CPU(cpu, id) pr_debug("%s called; cpu=%d, id=%d\n", __func__, cpu, id) @@ -122,34 +115,18 @@ static void coresight_trace_id_free(int id, struct coresight_trace_id_map *id_ma clear_bit(id, id_map->used_ids); } -static void coresight_trace_id_set_pend_rel(int id, struct coresight_trace_id_map *id_map) -{ - if (WARN(!IS_VALID_CS_TRACE_ID(id), "Invalid Trace ID %d\n", id)) - return; - set_bit(id, id_map->pend_rel_ids); -} - /* - * release all pending IDs for all current maps & clear CPU associations - * - * This currently operates on the default id map, but may be extended to - * operate on all registered id maps if per sink id maps are used. + * Release all IDs and clear CPU associations. */ -static void coresight_trace_id_release_all_pending(void) +static void coresight_trace_id_release_all(struct coresight_trace_id_map *id_map) { - struct coresight_trace_id_map *id_map = &id_map_default; unsigned long flags; - int cpu, bit; + int cpu; spin_lock_irqsave(&id_map_lock, flags); - for_each_set_bit(bit, id_map->pend_rel_ids, CORESIGHT_TRACE_ID_RES_TOP) { - clear_bit(bit, id_map->used_ids); - clear_bit(bit, id_map->pend_rel_ids); - } - for_each_cpu(cpu, &cpu_id_release_pending) { - atomic_set(per_cpu_ptr(id_map_default.cpu_map, cpu), 0); - cpumask_clear_cpu(cpu, &cpu_id_release_pending); - } + bitmap_zero(id_map->used_ids, CORESIGHT_TRACE_IDS_MAX); + for_each_possible_cpu(cpu) + atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0); spin_unlock_irqrestore(&id_map_lock, flags); DUMP_ID_MAP(id_map); } @@ -164,7 +141,7 @@ static int _coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map /* check for existing allocation for this CPU */ id = _coresight_trace_id_read_cpu_id(cpu, id_map); if (id) - goto get_cpu_id_clr_pend; + goto get_cpu_id_out_unlock; /* * Find a new ID. @@ -185,11 +162,6 @@ static int _coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map /* allocate the new id to the cpu */ atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), id); -get_cpu_id_clr_pend: - /* we are (re)using this ID - so ensure it is not marked for release */ - cpumask_clear_cpu(cpu, &cpu_id_release_pending); - clear_bit(id, id_map->pend_rel_ids); - get_cpu_id_out_unlock: spin_unlock_irqrestore(&id_map_lock, flags); @@ -210,15 +182,8 @@ static void _coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_ma spin_lock_irqsave(&id_map_lock, flags); - if (atomic_read(&perf_cs_etm_session_active)) { - /* set release at pending if perf still active */ - coresight_trace_id_set_pend_rel(id, id_map); - cpumask_set_cpu(cpu, &cpu_id_release_pending); - } else { - /* otherwise clear id */ - coresight_trace_id_free(id, id_map); - atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0); - } + coresight_trace_id_free(id, id_map); + atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0); spin_unlock_irqrestore(&id_map_lock, flags); DUMP_ID_CPU(cpu, id); @@ -302,17 +267,17 @@ void coresight_trace_id_put_system_id(int id) } EXPORT_SYMBOL_GPL(coresight_trace_id_put_system_id); -void coresight_trace_id_perf_start(void) +void coresight_trace_id_perf_start(struct coresight_trace_id_map *id_map) { - atomic_inc(&perf_cs_etm_session_active); - PERF_SESSION(atomic_read(&perf_cs_etm_session_active)); + atomic_inc(&id_map->perf_cs_etm_session_active); + PERF_SESSION(atomic_read(&id_map->perf_cs_etm_session_active)); } EXPORT_SYMBOL_GPL(coresight_trace_id_perf_start); -void coresight_trace_id_perf_stop(void) +void coresight_trace_id_perf_stop(struct coresight_trace_id_map *id_map) { - if (!atomic_dec_return(&perf_cs_etm_session_active)) - coresight_trace_id_release_all_pending(); - PERF_SESSION(atomic_read(&perf_cs_etm_session_active)); + if (!atomic_dec_return(&id_map->perf_cs_etm_session_active)) + coresight_trace_id_release_all(id_map); + PERF_SESSION(atomic_read(&id_map->perf_cs_etm_session_active)); } EXPORT_SYMBOL_GPL(coresight_trace_id_perf_stop); diff --git a/drivers/hwtracing/coresight/coresight-trace-id.h b/drivers/hwtracing/coresight/coresight-trace-id.h index 840babdd07944..9aae50a553caa 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.h +++ b/drivers/hwtracing/coresight/coresight-trace-id.h @@ -17,9 +17,10 @@ * released when done. * * In order to ensure that a consistent cpu / ID matching is maintained - * throughout a perf cs_etm event session - a session in progress flag will - * be maintained, and released IDs not cleared until the perf session is - * complete. This allows the same CPU to be re-allocated its prior ID. + * throughout a perf cs_etm event session - a session in progress flag will be + * maintained for each sink, and IDs are cleared when all the perf sessions + * complete. This allows the same CPU to be re-allocated its prior ID when + * events are scheduled in and out. * * * Trace ID maps will be created and initialised to prevent architecturally @@ -66,11 +67,7 @@ int coresight_trace_id_get_cpu_id_map(int cpu, struct coresight_trace_id_map *id /** * Release an allocated trace ID associated with the CPU. * - * This will release the CoreSight trace ID associated with the CPU, - * unless a perf session is in operation. - * - * If a perf session is in operation then the ID will be marked as pending - * release. + * This will release the CoreSight trace ID associated with the CPU. * * @cpu: The CPU index to release the associated trace ID. */ @@ -133,21 +130,21 @@ void coresight_trace_id_put_system_id(int id); /** * Notify the Trace ID allocator that a perf session is starting. * - * Increase the perf session reference count - called by perf when setting up - * a trace event. + * Increase the perf session reference count - called by perf when setting up a + * trace event. * - * This reference count is used by the ID allocator to ensure that trace IDs - * associated with a CPU cannot change or be released during a perf session. + * Perf sessions never free trace IDs to ensure that the ID associated with a + * CPU cannot change during their and other's concurrent sessions. Instead, + * this refcount is used so that the last event to finish always frees all IDs. */ -void coresight_trace_id_perf_start(void); +void coresight_trace_id_perf_start(struct coresight_trace_id_map *id_map); /** * Notify the ID allocator that a perf session is stopping. * - * Decrease the perf session reference count. - * if this causes the count to go to zero, then all Trace IDs marked as pending - * release, will be released. + * Decrease the perf session reference count. If this causes the count to go to + * zero, then all Trace IDs will be released. */ -void coresight_trace_id_perf_stop(void); +void coresight_trace_id_perf_stop(struct coresight_trace_id_map *id_map); #endif /* _CORESIGHT_TRACE_ID_H */ diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 226827ff71edc..c9de29c7d835e 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -226,14 +226,12 @@ struct coresight_sysfs_link { * @used_ids: Bitmap to register available (bit = 0) and in use (bit = 1) IDs. * Initialised so that the reserved IDs are permanently marked as * in use. - * @pend_rel_ids: CPU IDs that have been released by the trace source but not - * yet marked as available, to allow re-allocation to the same - * CPU during a perf session. + * @perf_cs_etm_session_active: Number of Perf sessions using this ID map. */ struct coresight_trace_id_map { DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX); - DECLARE_BITMAP(pend_rel_ids, CORESIGHT_TRACE_IDS_MAX); atomic_t __percpu *cpu_map; + atomic_t perf_cs_etm_session_active; }; /** From 5c7a837694ab8a58567df01b3b68cb195a7c1d67 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:58 +0100 Subject: [PATCH 326/548] coresight: Emit sink ID in the HW_ID packets For Perf to be able to decode when per-sink trace IDs are used, emit the sink that's being written to for each ETM. Perf currently errors out if it sees a newer packet version so instead of bumping it, add a new minor version field. This can be used to signify new versions that have backwards compatible fields. Considering this change is only for high core count machines, it doesn't make sense to make a breaking change for everyone. Signed-off-by: James Clark Tested-by: Leo Yan Reviewed-by: Mike Leach Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-17-james.clark@linaro.org (cherry picked from commit 487eec8da80aef16229d30429f1b26090b1bf0eb) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 26 ++++++++++--------- .../hwtracing/coresight/coresight-etm-perf.c | 16 ++++++++---- drivers/hwtracing/coresight/coresight-priv.h | 1 + include/linux/coresight-pmu.h | 17 +++++++++--- 4 files changed, 39 insertions(+), 21 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index a84035b184e94..c2d34f2889944 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -488,23 +488,25 @@ struct coresight_device *coresight_get_sink(struct list_head *path) return csdev; } +u32 coresight_get_sink_id(struct coresight_device *csdev) +{ + if (!csdev->ea) + return 0; + + /* + * See function etm_perf_add_symlink_sink() to know where + * this comes from. + */ + return (u32) (unsigned long) csdev->ea->var; +} + static int coresight_sink_by_id(struct device *dev, const void *data) { struct coresight_device *csdev = to_coresight_device(dev); - unsigned long hash; if (csdev->type == CORESIGHT_DEV_TYPE_SINK || - csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) { - - if (!csdev->ea) - return 0; - /* - * See function etm_perf_add_symlink_sink() to know where - * this comes from. - */ - hash = (unsigned long)csdev->ea->var; - - if ((u32)hash == *(u32 *)data) + csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) { + if (coresight_get_sink_id(csdev) == *(u32 *)data) return 1; } diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index dda6e43bb04d1..fef0d76a5504e 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -458,6 +458,7 @@ static void etm_event_start(struct perf_event *event, int flags) struct coresight_device *sink, *csdev = per_cpu(csdev_src, cpu); struct list_head *path; u64 hw_id; + u8 trace_id; if (!csdev) goto fail; @@ -510,11 +511,16 @@ static void etm_event_start(struct perf_event *event, int flags) */ if (!cpumask_test_cpu(cpu, &event_data->aux_hwid_done)) { cpumask_set_cpu(cpu, &event_data->aux_hwid_done); - hw_id = FIELD_PREP(CS_AUX_HW_ID_VERSION_MASK, - CS_AUX_HW_ID_CURR_VERSION); - hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK, - coresight_trace_id_read_cpu_id_map(cpu, - &sink->perf_sink_id_map)); + + trace_id = coresight_trace_id_read_cpu_id_map(cpu, &sink->perf_sink_id_map); + + hw_id = FIELD_PREP(CS_AUX_HW_ID_MAJOR_VERSION_MASK, + CS_AUX_HW_ID_MAJOR_VERSION); + hw_id |= FIELD_PREP(CS_AUX_HW_ID_MINOR_VERSION_MASK, + CS_AUX_HW_ID_MINOR_VERSION); + hw_id |= FIELD_PREP(CS_AUX_HW_ID_TRACE_ID_MASK, trace_id); + hw_id |= FIELD_PREP(CS_AUX_HW_ID_SINK_ID_MASK, coresight_get_sink_id(sink)); + perf_report_aux_output_id(event, hw_id); } diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index 7fe79fc1daadf..d49ad895b2531 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -149,6 +149,7 @@ int coresight_make_links(struct coresight_device *orig, struct coresight_device *target); void coresight_remove_links(struct coresight_device *orig, struct coresight_connection *conn); +u32 coresight_get_sink_id(struct coresight_device *csdev); #if IS_ENABLED(CONFIG_CORESIGHT_SOURCE_ETM3X) extern int etm_readl_cp14(u32 off, unsigned int *val); diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h index 51ac441a37c3e..89b0ac0014b0d 100644 --- a/include/linux/coresight-pmu.h +++ b/include/linux/coresight-pmu.h @@ -49,12 +49,21 @@ * Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload. * Used to associate a CPU with the CoreSight Trace ID. * [07:00] - Trace ID - uses 8 bits to make value easy to read in file. - * [59:08] - Unused (SBZ) - * [63:60] - Version + * [39:08] - Sink ID - as reported in /sys/bus/event_source/devices/cs_etm/sinks/ + * Added in minor version 1. + * [55:40] - Unused (SBZ) + * [59:56] - Minor Version - previously existing fields are compatible with + * all minor versions. + * [63:60] - Major Version - previously existing fields mean different things + * in new major versions. */ #define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0) -#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60) +#define CS_AUX_HW_ID_SINK_ID_MASK GENMASK_ULL(39, 8) -#define CS_AUX_HW_ID_CURR_VERSION 0 +#define CS_AUX_HW_ID_MINOR_VERSION_MASK GENMASK_ULL(59, 56) +#define CS_AUX_HW_ID_MAJOR_VERSION_MASK GENMASK_ULL(63, 60) + +#define CS_AUX_HW_ID_MAJOR_VERSION 0 +#define CS_AUX_HW_ID_MINOR_VERSION 1 #endif From 48f0e6d4b73b1b8598490d88c0da59cbc38c4e76 Mon Sep 17 00:00:00 2001 From: James Clark Date: Mon, 22 Jul 2024 11:11:59 +0100 Subject: [PATCH 327/548] coresight: Make trace ID map spinlock local to the map Reduce contention on the lock by replacing the global lock with one for each map. Signed-off-by: James Clark Reviewed-by: Mike Leach Signed-off-by: James Clark Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20240722101202.26915-18-james.clark@linaro.org (cherry picked from commit 988d40a4d4e7d671305bea501562a5d1a1d479fa) Signed-off-by: Koba Ko --- drivers/hwtracing/coresight/coresight-core.c | 1 + .../hwtracing/coresight/coresight-trace-id.c | 26 +++++++++---------- include/linux/coresight.h | 1 + 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index c2d34f2889944..5be28cfde8385 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -1175,6 +1175,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc) if (csdev->type == CORESIGHT_DEV_TYPE_SINK || csdev->type == CORESIGHT_DEV_TYPE_LINKSINK) { + spin_lock_init(&csdev->perf_sink_id_map.lock); csdev->perf_sink_id_map.cpu_map = alloc_percpu(atomic_t); if (!csdev->perf_sink_id_map.cpu_map) { kfree(csdev); diff --git a/drivers/hwtracing/coresight/coresight-trace-id.c b/drivers/hwtracing/coresight/coresight-trace-id.c index bddaed3e5cf85..d98e12cb30eca 100644 --- a/drivers/hwtracing/coresight/coresight-trace-id.c +++ b/drivers/hwtracing/coresight/coresight-trace-id.c @@ -15,12 +15,10 @@ /* Default trace ID map. Used in sysfs mode and for system sources */ static DEFINE_PER_CPU(atomic_t, id_map_default_cpu_ids) = ATOMIC_INIT(0); static struct coresight_trace_id_map id_map_default = { - .cpu_map = &id_map_default_cpu_ids + .cpu_map = &id_map_default_cpu_ids, + .lock = __SPIN_LOCK_UNLOCKED(id_map_default.lock) }; -/* lock to protect id_map and cpu data */ -static DEFINE_SPINLOCK(id_map_lock); - /* #define TRACE_ID_DEBUG 1 */ #if defined(TRACE_ID_DEBUG) || defined(CONFIG_COMPILE_TEST) @@ -123,11 +121,11 @@ static void coresight_trace_id_release_all(struct coresight_trace_id_map *id_map unsigned long flags; int cpu; - spin_lock_irqsave(&id_map_lock, flags); + spin_lock_irqsave(&id_map->lock, flags); bitmap_zero(id_map->used_ids, CORESIGHT_TRACE_IDS_MAX); for_each_possible_cpu(cpu) atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0); - spin_unlock_irqrestore(&id_map_lock, flags); + spin_unlock_irqrestore(&id_map->lock, flags); DUMP_ID_MAP(id_map); } @@ -136,7 +134,7 @@ static int _coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map unsigned long flags; int id; - spin_lock_irqsave(&id_map_lock, flags); + spin_lock_irqsave(&id_map->lock, flags); /* check for existing allocation for this CPU */ id = _coresight_trace_id_read_cpu_id(cpu, id_map); @@ -163,7 +161,7 @@ static int _coresight_trace_id_get_cpu_id(int cpu, struct coresight_trace_id_map atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), id); get_cpu_id_out_unlock: - spin_unlock_irqrestore(&id_map_lock, flags); + spin_unlock_irqrestore(&id_map->lock, flags); DUMP_ID_CPU(cpu, id); DUMP_ID_MAP(id_map); @@ -180,12 +178,12 @@ static void _coresight_trace_id_put_cpu_id(int cpu, struct coresight_trace_id_ma if (!id) return; - spin_lock_irqsave(&id_map_lock, flags); + spin_lock_irqsave(&id_map->lock, flags); coresight_trace_id_free(id, id_map); atomic_set(per_cpu_ptr(id_map->cpu_map, cpu), 0); - spin_unlock_irqrestore(&id_map_lock, flags); + spin_unlock_irqrestore(&id_map->lock, flags); DUMP_ID_CPU(cpu, id); DUMP_ID_MAP(id_map); } @@ -195,10 +193,10 @@ static int coresight_trace_id_map_get_system_id(struct coresight_trace_id_map *i unsigned long flags; int id; - spin_lock_irqsave(&id_map_lock, flags); + spin_lock_irqsave(&id_map->lock, flags); /* prefer odd IDs for system components to avoid legacy CPU IDS */ id = coresight_trace_id_alloc_new_id(id_map, 0, true); - spin_unlock_irqrestore(&id_map_lock, flags); + spin_unlock_irqrestore(&id_map->lock, flags); DUMP_ID(id); DUMP_ID_MAP(id_map); @@ -209,9 +207,9 @@ static void coresight_trace_id_map_put_system_id(struct coresight_trace_id_map * { unsigned long flags; - spin_lock_irqsave(&id_map_lock, flags); + spin_lock_irqsave(&id_map->lock, flags); coresight_trace_id_free(id, id_map); - spin_unlock_irqrestore(&id_map_lock, flags); + spin_unlock_irqrestore(&id_map->lock, flags); DUMP_ID(id); DUMP_ID_MAP(id_map); diff --git a/include/linux/coresight.h b/include/linux/coresight.h index c9de29c7d835e..2f8f2f61e2a3e 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -232,6 +232,7 @@ struct coresight_trace_id_map { DECLARE_BITMAP(used_ids, CORESIGHT_TRACE_IDS_MAX); atomic_t __percpu *cpu_map; atomic_t perf_cs_etm_session_active; + spinlock_t lock; }; /** From 0749520295ece67e92b57bb319a0a07a356a2f27 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:35:40 +0200 Subject: [PATCH 328/548] ACPI: thermal: Simplify initialization of critical and hot trips Use the observation that the critical and hot trip points are never updated by the ACPI thermal driver, because the flags passed from acpi_thermal_notify() to acpi_thermal_trips_update() do not include ACPI_TRIPS_CRITICAL or ACPI_TRIPS_HOT, to move the initialization of those trip points directly into acpi_thermal_get_trip_points() and reduce the size of __acpi_thermal_trips_update(). Also make the critical and hot trip points initialization code more straightforward and drop the flags that are not needed any more. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 4be32333d941a53dcfb48e342c808c63a3b25576) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 130 +++++++++++++++++++++-------------------- 1 file changed, 66 insertions(+), 64 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 8263508415a8d..f3ecc4bbf8612 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -43,17 +43,13 @@ #define ACPI_THERMAL_MAX_ACTIVE 10 #define ACPI_THERMAL_MAX_LIMIT_STR_LEN 65 -#define ACPI_TRIPS_CRITICAL BIT(0) -#define ACPI_TRIPS_HOT BIT(1) -#define ACPI_TRIPS_PASSIVE BIT(2) -#define ACPI_TRIPS_ACTIVE BIT(3) -#define ACPI_TRIPS_DEVICES BIT(4) +#define ACPI_TRIPS_PASSIVE BIT(0) +#define ACPI_TRIPS_ACTIVE BIT(1) +#define ACPI_TRIPS_DEVICES BIT(2) #define ACPI_TRIPS_THRESHOLDS (ACPI_TRIPS_PASSIVE | ACPI_TRIPS_ACTIVE) -#define ACPI_TRIPS_INIT (ACPI_TRIPS_CRITICAL | ACPI_TRIPS_HOT | \ - ACPI_TRIPS_PASSIVE | ACPI_TRIPS_ACTIVE | \ - ACPI_TRIPS_DEVICES) +#define ACPI_TRIPS_INIT (ACPI_TRIPS_THRESHOLDS | ACPI_TRIPS_DEVICES) /* * This exception is thrown out in two cases: @@ -196,62 +192,6 @@ static void __acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) bool valid = false; int i; - /* Critical Shutdown */ - if (flag & ACPI_TRIPS_CRITICAL) { - status = acpi_evaluate_integer(tz->device->handle, "_CRT", NULL, &tmp); - tz->trips.critical.temperature = tmp; - /* - * Treat freezing temperatures as invalid as well; some - * BIOSes return really low values and cause reboots at startup. - * Below zero (Celsius) values clearly aren't right for sure.. - * ... so lets discard those as invalid. - */ - if (ACPI_FAILURE(status)) { - tz->trips.critical.valid = false; - acpi_handle_debug(tz->device->handle, - "No critical threshold\n"); - } else if (tmp <= 2732) { - pr_info(FW_BUG "Invalid critical threshold (%llu)\n", tmp); - tz->trips.critical.valid = false; - } else { - tz->trips.critical.valid = true; - acpi_handle_debug(tz->device->handle, - "Found critical threshold [%lu]\n", - tz->trips.critical.temperature); - } - if (tz->trips.critical.valid) { - if (crt == -1) { - tz->trips.critical.valid = false; - } else if (crt > 0) { - unsigned long crt_k = celsius_to_deci_kelvin(crt); - - /* - * Allow override critical threshold - */ - if (crt_k > tz->trips.critical.temperature) - pr_info("Critical threshold %d C\n", crt); - - tz->trips.critical.temperature = crt_k; - } - } - } - - /* Critical Sleep (optional) */ - if (flag & ACPI_TRIPS_HOT) { - status = acpi_evaluate_integer(tz->device->handle, "_HOT", NULL, &tmp); - if (ACPI_FAILURE(status)) { - tz->trips.hot.valid = false; - acpi_handle_debug(tz->device->handle, - "No hot threshold\n"); - } else { - tz->trips.hot.temperature = tmp; - tz->trips.hot.valid = true; - acpi_handle_debug(tz->device->handle, - "Found hot threshold [%lu]\n", - tz->trips.hot.temperature); - } - } - /* Passive (optional) */ if (((flag & ACPI_TRIPS_PASSIVE) && tz->trips.passive.trip.valid) || flag == ACPI_TRIPS_INIT) { @@ -451,11 +391,73 @@ static void acpi_thermal_trips_update(struct acpi_thermal *tz, u32 event) dev_name(&adev->dev), event, 0); } +static void acpi_thermal_get_critical_trip(struct acpi_thermal *tz) +{ + unsigned long long tmp; + acpi_status status; + + if (crt > 0) { + tmp = celsius_to_deci_kelvin(crt); + goto set; + } + if (crt == -1) { + acpi_handle_debug(tz->device->handle, "Critical threshold disabled\n"); + goto fail; + } + + status = acpi_evaluate_integer(tz->device->handle, "_CRT", NULL, &tmp); + if (ACPI_FAILURE(status)) { + acpi_handle_debug(tz->device->handle, "No critical threshold\n"); + goto fail; + } + if (tmp <= 2732) { + /* + * Below zero (Celsius) values clearly aren't right for sure, + * so discard them as invalid. + */ + pr_info(FW_BUG "Invalid critical threshold (%llu)\n", tmp); + goto fail; + } + +set: + tz->trips.critical.valid = true; + tz->trips.critical.temperature = tmp; + acpi_handle_debug(tz->device->handle, "Critical threshold [%lu]\n", + tz->trips.critical.temperature); + return; + +fail: + tz->trips.critical.valid = false; + tz->trips.critical.temperature = THERMAL_TEMP_INVALID; +} + +static void acpi_thermal_get_hot_trip(struct acpi_thermal *tz) +{ + unsigned long long tmp; + acpi_status status; + + status = acpi_evaluate_integer(tz->device->handle, "_HOT", NULL, &tmp); + if (ACPI_FAILURE(status)) { + tz->trips.hot.valid = false; + tz->trips.hot.temperature = THERMAL_TEMP_INVALID; + acpi_handle_debug(tz->device->handle, "No hot threshold\n"); + return; + } + + tz->trips.hot.valid = true; + tz->trips.hot.temperature = tmp; + acpi_handle_debug(tz->device->handle, "Hot threshold [%lu]\n", + tz->trips.hot.temperature); +} + static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) { bool valid; int i; + acpi_thermal_get_critical_trip(tz); + acpi_thermal_get_hot_trip(tz); + /* Passive and active trip points (optional). */ __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); valid = tz->trips.critical.valid | From 8ee302e043b4a42e377addba9c54bda814025d1c Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:36:21 +0200 Subject: [PATCH 329/548] ACPI: thermal: Fold acpi_thermal_get_info() into its caller There is only one caller of acpi_thermal_get_info() and the code from it can be folded into its caller just fine, so do that. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit b09872a652d386207600fcd001077251c0eae15a) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 52 +++++++++++++++--------------------------- 1 file changed, 19 insertions(+), 33 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index f3ecc4bbf8612..3c7d1d1d61e09 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -846,38 +846,6 @@ static void acpi_thermal_aml_dependency_fix(struct acpi_thermal *tz) acpi_evaluate_integer(handle, "_TMP", NULL, &value); } -static int acpi_thermal_get_info(struct acpi_thermal *tz) -{ - int result; - - if (!tz) - return -EINVAL; - - acpi_thermal_aml_dependency_fix(tz); - - /* Get trip points [_CRT, _PSV, etc.] (required) */ - result = acpi_thermal_get_trip_points(tz); - if (result) - return result; - - /* Get temperature [_TMP] (required) */ - result = acpi_thermal_get_temperature(tz); - if (result) - return result; - - /* Set the cooling mode [_SCP] to active cooling (default) */ - acpi_execute_simple_method(tz->device->handle, "_SCP", - ACPI_THERMAL_MODE_ACTIVE); - - /* Get default polling frequency [_TZP] (optional) */ - if (tzp) - tz->polling_frequency = tzp; - else - acpi_thermal_get_polling_frequency(tz); - - return 0; -} - /* * The exact offset between Kelvin and degree Celsius is 273.15. However ACPI * handles temperature values with a single decimal place. As a consequence, @@ -940,10 +908,28 @@ static int acpi_thermal_add(struct acpi_device *device) strcpy(acpi_device_class(device), ACPI_THERMAL_CLASS); device->driver_data = tz; - result = acpi_thermal_get_info(tz); + acpi_thermal_aml_dependency_fix(tz); + + /* Get trip points [_CRT, _PSV, etc.] (required). */ + result = acpi_thermal_get_trip_points(tz); if (result) goto free_memory; + /* Get temperature [_TMP] (required). */ + result = acpi_thermal_get_temperature(tz); + if (result) + goto free_memory; + + /* Set the cooling mode [_SCP] to active cooling. */ + acpi_execute_simple_method(tz->device->handle, "_SCP", + ACPI_THERMAL_MODE_ACTIVE); + + /* Determine the default polling frequency [_TZP]. */ + if (tzp) + tz->polling_frequency = tzp; + else + acpi_thermal_get_polling_frequency(tz); + acpi_thermal_guess_offset(tz); result = acpi_thermal_register_thermal_zone(tz); From 1c3d725515ba87a665a3faf33bb434a65a99ef68 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:37:59 +0200 Subject: [PATCH 330/548] ACPI: thermal: Determine the number of trip points earlier Compute the number of trip points in acpi_thermal_add() so as to allow the driver's data structures to be simplified going forward. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit f04256a8f7de2c13619b636cec1109e596804229) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 56 ++++++++++++++++++++---------------------- 1 file changed, 27 insertions(+), 29 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 3c7d1d1d61e09..2f16a0aa3a821 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -452,7 +452,7 @@ static void acpi_thermal_get_hot_trip(struct acpi_thermal *tz) static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) { - bool valid; + unsigned int count = 0; int i; acpi_thermal_get_critical_trip(tz); @@ -460,18 +460,24 @@ static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) /* Passive and active trip points (optional). */ __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); - valid = tz->trips.critical.valid | - tz->trips.hot.valid | - tz->trips.passive.trip.valid; + if (tz->trips.critical.valid) + count++; + + if (tz->trips.hot.valid) + count++; - for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) - valid = valid || tz->trips.active[i].trip.valid; + if (tz->trips.passive.trip.valid) + count++; + + for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { + if (tz->trips.active[i].trip.valid) + count++; + else + break; - if (!valid) { - pr_warn(FW_BUG "No valid trip found\n"); - return -ENODEV; } - return 0; + + return count; } /* sys I/F for generic thermal sysfs support */ @@ -681,29 +687,15 @@ static void acpi_thermal_zone_sysfs_remove(struct acpi_thermal *tz) sysfs_remove_link(&tzdev->kobj, "device"); } -static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz) +static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz, + unsigned int trip_count) { struct acpi_thermal_trip *acpi_trip; struct thermal_trip *trip; int passive_delay = 0; - int trip_count = 0; int result; int i; - if (tz->trips.critical.valid) - trip_count++; - - if (tz->trips.hot.valid) - trip_count++; - - if (tz->trips.passive.trip.valid) { - trip_count++; - passive_delay = tz->trips.passive.tsp * 100; - } - - for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE && tz->trips.active[i].trip.valid; i++) - trip_count++; - trip = kcalloc(trip_count, sizeof(*trip), GFP_KERNEL); if (!trip) return -ENOMEM; @@ -724,6 +716,8 @@ static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz) acpi_trip = &tz->trips.passive.trip; if (acpi_trip->valid) { + passive_delay = tz->trips.passive.tsp * 100; + trip->type = THERMAL_TRIP_PASSIVE; trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); trip->priv = acpi_trip; @@ -893,6 +887,7 @@ static void acpi_thermal_check_fn(struct work_struct *work) static int acpi_thermal_add(struct acpi_device *device) { struct acpi_thermal *tz; + unsigned int trip_count; int result; if (!device) @@ -911,9 +906,12 @@ static int acpi_thermal_add(struct acpi_device *device) acpi_thermal_aml_dependency_fix(tz); /* Get trip points [_CRT, _PSV, etc.] (required). */ - result = acpi_thermal_get_trip_points(tz); - if (result) + trip_count = acpi_thermal_get_trip_points(tz); + if (!trip_count) { + pr_warn(FW_BUG "No valid trip points!\n"); + result = -ENODEV; goto free_memory; + } /* Get temperature [_TMP] (required). */ result = acpi_thermal_get_temperature(tz); @@ -932,7 +930,7 @@ static int acpi_thermal_add(struct acpi_device *device) acpi_thermal_guess_offset(tz); - result = acpi_thermal_register_thermal_zone(tz); + result = acpi_thermal_register_thermal_zone(tz, trip_count); if (result) goto free_memory; From 4a1c7514b50e753c579c16faeb63646bd3299d74 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:39:15 +0200 Subject: [PATCH 331/548] ACPI: thermal: Create and populate trip points table earlier Create and populate the driver's trip points table in acpi_thermal_add() so as to allow the its data structures to be simplified going forward. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 06a5f76ee104c17022425ade32b970c1d7793e01) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 105 ++++++++++++++++++++--------------------- 1 file changed, 52 insertions(+), 53 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 2f16a0aa3a821..5743e0a792a3a 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -688,53 +688,10 @@ static void acpi_thermal_zone_sysfs_remove(struct acpi_thermal *tz) } static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz, - unsigned int trip_count) + unsigned int trip_count, + int passive_delay) { - struct acpi_thermal_trip *acpi_trip; - struct thermal_trip *trip; - int passive_delay = 0; int result; - int i; - - trip = kcalloc(trip_count, sizeof(*trip), GFP_KERNEL); - if (!trip) - return -ENOMEM; - - tz->trip_table = trip; - - if (tz->trips.critical.valid) { - trip->type = THERMAL_TRIP_CRITICAL; - trip->temperature = acpi_thermal_temp(tz, tz->trips.critical.temperature); - trip++; - } - - if (tz->trips.hot.valid) { - trip->type = THERMAL_TRIP_HOT; - trip->temperature = acpi_thermal_temp(tz, tz->trips.hot.temperature); - trip++; - } - - acpi_trip = &tz->trips.passive.trip; - if (acpi_trip->valid) { - passive_delay = tz->trips.passive.tsp * 100; - - trip->type = THERMAL_TRIP_PASSIVE; - trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); - trip->priv = acpi_trip; - trip++; - } - - for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - acpi_trip = &tz->trips.active[i].trip; - - if (!acpi_trip->valid) - break; - - trip->type = THERMAL_TRIP_ACTIVE; - trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); - trip->priv = acpi_trip; - trip++; - } tz->thermal_zone = thermal_zone_device_register_with_trips("acpitz", tz->trip_table, @@ -744,10 +701,8 @@ static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz, NULL, passive_delay, tz->polling_frequency * 100); - if (IS_ERR(tz->thermal_zone)) { - result = PTR_ERR(tz->thermal_zone); - goto free_trip_table; - } + if (IS_ERR(tz->thermal_zone)) + return PTR_ERR(tz->thermal_zone); result = acpi_thermal_zone_sysfs_add(tz); if (result) @@ -766,8 +721,6 @@ static int acpi_thermal_register_thermal_zone(struct acpi_thermal *tz, acpi_thermal_zone_sysfs_remove(tz); unregister_tzd: thermal_zone_device_unregister(tz->thermal_zone); -free_trip_table: - kfree(tz->trip_table); return result; } @@ -886,9 +839,13 @@ static void acpi_thermal_check_fn(struct work_struct *work) static int acpi_thermal_add(struct acpi_device *device) { + struct acpi_thermal_trip *acpi_trip; + struct thermal_trip *trip; struct acpi_thermal *tz; unsigned int trip_count; + int passive_delay = 0; int result; + int i; if (!device) return -EINVAL; @@ -930,9 +887,49 @@ static int acpi_thermal_add(struct acpi_device *device) acpi_thermal_guess_offset(tz); - result = acpi_thermal_register_thermal_zone(tz, trip_count); + trip = kcalloc(trip_count, sizeof(*trip), GFP_KERNEL); + if (!trip) + return -ENOMEM; + + tz->trip_table = trip; + + if (tz->trips.critical.valid) { + trip->type = THERMAL_TRIP_CRITICAL; + trip->temperature = acpi_thermal_temp(tz, tz->trips.critical.temperature); + trip++; + } + + if (tz->trips.hot.valid) { + trip->type = THERMAL_TRIP_HOT; + trip->temperature = acpi_thermal_temp(tz, tz->trips.hot.temperature); + trip++; + } + + acpi_trip = &tz->trips.passive.trip; + if (acpi_trip->valid) { + passive_delay = tz->trips.passive.tsp * 100; + + trip->type = THERMAL_TRIP_PASSIVE; + trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); + trip->priv = acpi_trip; + trip++; + } + + for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { + acpi_trip = &tz->trips.active[i].trip; + + if (!acpi_trip->valid) + break; + + trip->type = THERMAL_TRIP_ACTIVE; + trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); + trip->priv = acpi_trip; + trip++; + } + + result = acpi_thermal_register_thermal_zone(tz, trip_count, passive_delay); if (result) - goto free_memory; + goto free_trips; refcount_set(&tz->thermal_check_count, 3); mutex_init(&tz->thermal_check_lock); @@ -951,6 +948,8 @@ static int acpi_thermal_add(struct acpi_device *device) flush_wq: flush_workqueue(acpi_thermal_pm_queue); acpi_thermal_unregister_thermal_zone(tz); +free_trips: + kfree(tz->trip_table); free_memory: kfree(tz); From ffba0d71c888e42b87c272ef502ecdfb1b6ff65e Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:41:06 +0200 Subject: [PATCH 332/548] ACPI: thermal: Simplify critical and hot trips representation Notice that the only piece of information regarding the critical and hot trips that needs to be stored in the driver's local data structures is whether or not these trips are valid, so drop all of the redundant information from there and adjust the code accordingly. Among other things, this requires acpi_thermal_add() to be rearranged so as to obtain the critical trip temperature before populating the trip points table and for symmetry, the hot trip temperature is obtained earlier too. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 30f04c7535e464f1824edb02e736a50be018777b) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 69 ++++++++++++++++++++---------------------- 1 file changed, 33 insertions(+), 36 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 5743e0a792a3a..d59b231a6e034 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -107,10 +107,10 @@ struct acpi_thermal_active { }; struct acpi_thermal_trips { - struct acpi_thermal_trip critical; - struct acpi_thermal_trip hot; struct acpi_thermal_passive passive; struct acpi_thermal_active active[ACPI_THERMAL_MAX_ACTIVE]; + bool critical_valid; + bool hot_valid; }; struct acpi_thermal { @@ -391,7 +391,7 @@ static void acpi_thermal_trips_update(struct acpi_thermal *tz, u32 event) dev_name(&adev->dev), event, 0); } -static void acpi_thermal_get_critical_trip(struct acpi_thermal *tz) +static long acpi_thermal_get_critical_trip(struct acpi_thermal *tz) { unsigned long long tmp; acpi_status status; @@ -420,34 +420,30 @@ static void acpi_thermal_get_critical_trip(struct acpi_thermal *tz) } set: - tz->trips.critical.valid = true; - tz->trips.critical.temperature = tmp; - acpi_handle_debug(tz->device->handle, "Critical threshold [%lu]\n", - tz->trips.critical.temperature); - return; + tz->trips.critical_valid = true; + acpi_handle_debug(tz->device->handle, "Critical threshold [%llu]\n", tmp); + return tmp; fail: - tz->trips.critical.valid = false; - tz->trips.critical.temperature = THERMAL_TEMP_INVALID; + tz->trips.critical_valid = false; + return THERMAL_TEMP_INVALID; } -static void acpi_thermal_get_hot_trip(struct acpi_thermal *tz) +static long acpi_thermal_get_hot_trip(struct acpi_thermal *tz) { unsigned long long tmp; acpi_status status; status = acpi_evaluate_integer(tz->device->handle, "_HOT", NULL, &tmp); if (ACPI_FAILURE(status)) { - tz->trips.hot.valid = false; - tz->trips.hot.temperature = THERMAL_TEMP_INVALID; + tz->trips.hot_valid = false; acpi_handle_debug(tz->device->handle, "No hot threshold\n"); - return; + return THERMAL_TEMP_INVALID; } - tz->trips.hot.valid = true; - tz->trips.hot.temperature = tmp; - acpi_handle_debug(tz->device->handle, "Hot threshold [%lu]\n", - tz->trips.hot.temperature); + tz->trips.hot_valid = true; + acpi_handle_debug(tz->device->handle, "Hot threshold [%llu]\n", tmp); + return tmp; } static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) @@ -455,17 +451,9 @@ static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) unsigned int count = 0; int i; - acpi_thermal_get_critical_trip(tz); - acpi_thermal_get_hot_trip(tz); /* Passive and active trip points (optional). */ __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); - if (tz->trips.critical.valid) - count++; - - if (tz->trips.hot.valid) - count++; - if (tz->trips.passive.trip.valid) count++; @@ -578,10 +566,10 @@ static int acpi_thermal_cooling_device_cb(struct thermal_zone_device *thermal, int trip = -1; int result = 0; - if (tz->trips.critical.valid) + if (tz->trips.critical_valid) trip++; - if (tz->trips.hot.valid) + if (tz->trips.hot_valid) trip++; if (tz->trips.passive.trip.valid) { @@ -803,10 +791,9 @@ static void acpi_thermal_aml_dependency_fix(struct acpi_thermal *tz) * The heuristic below should work for all ACPI thermal zones which have a * critical trip point with a value being a multiple of 0.5 degree Celsius. */ -static void acpi_thermal_guess_offset(struct acpi_thermal *tz) +static void acpi_thermal_guess_offset(struct acpi_thermal *tz, long crit_temp) { - if (tz->trips.critical.valid && - (tz->trips.critical.temperature % 5) == 1) + if (tz->trips.critical_valid && crit_temp % 5 == 1) tz->kelvin_offset = 273100; else tz->kelvin_offset = 273200; @@ -843,6 +830,7 @@ static int acpi_thermal_add(struct acpi_device *device) struct thermal_trip *trip; struct acpi_thermal *tz; unsigned int trip_count; + int crit_temp, hot_temp; int passive_delay = 0; int result; int i; @@ -864,6 +852,15 @@ static int acpi_thermal_add(struct acpi_device *device) /* Get trip points [_CRT, _PSV, etc.] (required). */ trip_count = acpi_thermal_get_trip_points(tz); + + crit_temp = acpi_thermal_get_critical_trip(tz); + if (tz->trips.critical_valid) + trip_count++; + + hot_temp = acpi_thermal_get_hot_trip(tz); + if (tz->trips.hot_valid) + trip_count++; + if (!trip_count) { pr_warn(FW_BUG "No valid trip points!\n"); result = -ENODEV; @@ -885,7 +882,7 @@ static int acpi_thermal_add(struct acpi_device *device) else acpi_thermal_get_polling_frequency(tz); - acpi_thermal_guess_offset(tz); + acpi_thermal_guess_offset(tz, crit_temp); trip = kcalloc(trip_count, sizeof(*trip), GFP_KERNEL); if (!trip) @@ -893,15 +890,15 @@ static int acpi_thermal_add(struct acpi_device *device) tz->trip_table = trip; - if (tz->trips.critical.valid) { + if (tz->trips.critical_valid) { trip->type = THERMAL_TRIP_CRITICAL; - trip->temperature = acpi_thermal_temp(tz, tz->trips.critical.temperature); + trip->temperature = acpi_thermal_temp(tz, crit_temp); trip++; } - if (tz->trips.hot.valid) { + if (tz->trips.hot_valid) { trip->type = THERMAL_TRIP_HOT; - trip->temperature = acpi_thermal_temp(tz, tz->trips.hot.temperature); + trip->temperature = acpi_thermal_temp(tz, hot_temp); trip++; } From c1ddd9ba43700675b396e9c6697cc3ff54222e33 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:43:52 +0200 Subject: [PATCH 333/548] ACPI: thermal: Untangle initialization and updates of the passive trip Separate the code needed to update the passive trip (in a response to a notification from the platform firmware) as well as to initialize it from the code that is only necessary for its initialization and cleanly divide it into functions that each carry out a specific action. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 64c512edf97735ecd882b532a5c5351edde39676) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 198 ++++++++++++++++++++++++++--------------- 1 file changed, 125 insertions(+), 73 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index d59b231a6e034..f2457dd5cdaa7 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -192,73 +192,6 @@ static void __acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) bool valid = false; int i; - /* Passive (optional) */ - if (((flag & ACPI_TRIPS_PASSIVE) && tz->trips.passive.trip.valid) || - flag == ACPI_TRIPS_INIT) { - valid = tz->trips.passive.trip.valid; - if (psv == -1) { - status = AE_SUPPORT; - } else if (psv > 0) { - tmp = celsius_to_deci_kelvin(psv); - status = AE_OK; - } else { - status = acpi_evaluate_integer(tz->device->handle, - "_PSV", NULL, &tmp); - } - - if (ACPI_FAILURE(status)) { - tz->trips.passive.trip.valid = false; - } else { - tz->trips.passive.trip.temperature = tmp; - tz->trips.passive.trip.valid = true; - if (flag == ACPI_TRIPS_INIT) { - status = acpi_evaluate_integer(tz->device->handle, - "_TC1", NULL, &tmp); - if (ACPI_FAILURE(status)) - tz->trips.passive.trip.valid = false; - else - tz->trips.passive.tc1 = tmp; - - status = acpi_evaluate_integer(tz->device->handle, - "_TC2", NULL, &tmp); - if (ACPI_FAILURE(status)) - tz->trips.passive.trip.valid = false; - else - tz->trips.passive.tc2 = tmp; - - status = acpi_evaluate_integer(tz->device->handle, - "_TSP", NULL, &tmp); - if (ACPI_FAILURE(status)) - tz->trips.passive.trip.valid = false; - else - tz->trips.passive.tsp = tmp; - } - } - } - if ((flag & ACPI_TRIPS_DEVICES) && tz->trips.passive.trip.valid) { - memset(&devices, 0, sizeof(struct acpi_handle_list)); - status = acpi_evaluate_reference(tz->device->handle, "_PSL", - NULL, &devices); - if (ACPI_FAILURE(status)) { - acpi_handle_info(tz->device->handle, - "Invalid passive threshold\n"); - tz->trips.passive.trip.valid = false; - } else { - tz->trips.passive.trip.valid = true; - } - - if (memcmp(&tz->trips.passive.devices, &devices, - sizeof(struct acpi_handle_list))) { - memcpy(&tz->trips.passive.devices, &devices, - sizeof(struct acpi_handle_list)); - ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "device"); - } - } - if ((flag & ACPI_TRIPS_PASSIVE) || (flag & ACPI_TRIPS_DEVICES)) { - if (valid != tz->trips.passive.trip.valid) - ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "state"); - } - /* Active (optional) */ for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { char name[5] = { '_', 'A', 'C', ('0' + i), '\0' }; @@ -339,6 +272,72 @@ static void __acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) } } +static void update_acpi_thermal_trip_temp(struct acpi_thermal_trip *acpi_trip, + int temp) +{ + acpi_trip->valid = temp != THERMAL_TEMP_INVALID; + acpi_trip->temperature = temp; +} + +static long get_passive_temp(struct acpi_thermal *tz) +{ + unsigned long long tmp; + acpi_status status; + + status = acpi_evaluate_integer(tz->device->handle, "_PSV", NULL, &tmp); + if (ACPI_FAILURE(status)) + return THERMAL_TEMP_INVALID; + + return tmp; +} + +static void acpi_thermal_update_passive_trip(struct acpi_thermal *tz) +{ + struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; + + if (!acpi_trip->valid || psv > 0) + return; + + update_acpi_thermal_trip_temp(acpi_trip, get_passive_temp(tz)); + if (!acpi_trip->valid) + ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "state"); +} + +static bool update_passive_devices(struct acpi_thermal *tz, bool compare) +{ + struct acpi_handle_list devices; + acpi_status status; + + memset(&devices, 0, sizeof(devices)); + + status = acpi_evaluate_reference(tz->device->handle, "_PSL", NULL, &devices); + if (ACPI_FAILURE(status)) { + acpi_handle_info(tz->device->handle, + "Missing device list for passive threshold\n"); + return false; + } + + if (compare && memcmp(&tz->trips.passive.devices, &devices, sizeof(devices))) + ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "device"); + + memcpy(&tz->trips.passive.devices, &devices, sizeof(devices)); + return true; +} + +static void acpi_thermal_update_passive_devices(struct acpi_thermal *tz) +{ + struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; + + if (!acpi_trip->valid) + return; + + if (update_passive_devices(tz, true)) + return; + + update_acpi_thermal_trip_temp(acpi_trip, THERMAL_TEMP_INVALID); + ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "state"); +} + static int acpi_thermal_adjust_trip(struct thermal_trip *trip, void *data) { struct acpi_thermal_trip *acpi_trip = trip->priv; @@ -359,8 +358,15 @@ static void acpi_thermal_adjust_thermal_zone(struct thermal_zone_device *thermal unsigned long data) { struct acpi_thermal *tz = thermal_zone_device_priv(thermal); - int flag = data == ACPI_THERMAL_NOTIFY_THRESHOLDS ? - ACPI_TRIPS_THRESHOLDS : ACPI_TRIPS_DEVICES; + int flag; + + if (data == ACPI_THERMAL_NOTIFY_THRESHOLDS) { + acpi_thermal_update_passive_trip(tz); + flag = ACPI_TRIPS_THRESHOLDS; + } else { + acpi_thermal_update_passive_devices(tz); + flag = ACPI_TRIPS_DEVICES; + } __acpi_thermal_trips_update(tz, flag); @@ -446,17 +452,63 @@ static long acpi_thermal_get_hot_trip(struct acpi_thermal *tz) return tmp; } +static bool acpi_thermal_init_passive_trip(struct acpi_thermal *tz) +{ + unsigned long long tmp; + acpi_status status; + int temp; + + if (psv == -1) + goto fail; + + if (psv > 0) { + temp = celsius_to_deci_kelvin(psv); + } else { + temp = get_passive_temp(tz); + if (temp == THERMAL_TEMP_INVALID) + goto fail; + } + + status = acpi_evaluate_integer(tz->device->handle, "_TC1", NULL, &tmp); + if (ACPI_FAILURE(status)) + goto fail; + + tz->trips.passive.tc1 = tmp; + + status = acpi_evaluate_integer(tz->device->handle, "_TC2", NULL, &tmp); + if (ACPI_FAILURE(status)) + goto fail; + + tz->trips.passive.tc2 = tmp; + + status = acpi_evaluate_integer(tz->device->handle, "_TSP", NULL, &tmp); + if (ACPI_FAILURE(status)) + goto fail; + + tz->trips.passive.tsp = tmp; + + if (!update_passive_devices(tz, false)) + goto fail; + + update_acpi_thermal_trip_temp(&tz->trips.passive.trip, temp); + return true; + +fail: + update_acpi_thermal_trip_temp(&tz->trips.passive.trip, THERMAL_TEMP_INVALID); + return false; +} + static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) { unsigned int count = 0; int i; - /* Passive and active trip points (optional). */ - __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); - - if (tz->trips.passive.trip.valid) + if (acpi_thermal_init_passive_trip(tz)) count++; + /* Active trip points (optional). */ + __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); + for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { if (tz->trips.active[i].trip.valid) count++; From 978cba881583829f8df33aa2ead2969273434f3a Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 20 Sep 2023 14:35:46 +0200 Subject: [PATCH 334/548] ACPI: thermal: Untangle initialization and updates of active trips Separate the code needed to update active trips (in a response to a notification from the platform firmware) as well as to initialize them from the code that is only necessary for their initialization and cleanly divide it into functions that each carry out a specific action. Signed-off-by: Rafael J. Wysocki Acked-by: Daniel Lezcano (cherry picked from commit cdfe09df04a0706f3926c0643ba2218fae04003b) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 197 +++++++++++++++++++++-------------------- 1 file changed, 100 insertions(+), 97 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index f2457dd5cdaa7..1389d8b353977 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -184,94 +184,6 @@ static int acpi_thermal_temp(struct acpi_thermal *tz, int temp_deci_k) tz->kelvin_offset); } -static void __acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) -{ - acpi_status status; - unsigned long long tmp; - struct acpi_handle_list devices; - bool valid = false; - int i; - - /* Active (optional) */ - for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - char name[5] = { '_', 'A', 'C', ('0' + i), '\0' }; - valid = tz->trips.active[i].trip.valid; - - if (act == -1) - break; /* disable all active trip points */ - - if (flag == ACPI_TRIPS_INIT || ((flag & ACPI_TRIPS_ACTIVE) && - tz->trips.active[i].trip.valid)) { - status = acpi_evaluate_integer(tz->device->handle, - name, NULL, &tmp); - if (ACPI_FAILURE(status)) { - tz->trips.active[i].trip.valid = false; - if (i == 0) - break; - - if (act <= 0) - break; - - if (i == 1) - tz->trips.active[0].trip.temperature = - celsius_to_deci_kelvin(act); - else - /* - * Don't allow override higher than - * the next higher trip point - */ - tz->trips.active[i-1].trip.temperature = - min_t(unsigned long, - tz->trips.active[i-2].trip.temperature, - celsius_to_deci_kelvin(act)); - - break; - } else { - tz->trips.active[i].trip.temperature = tmp; - tz->trips.active[i].trip.valid = true; - } - } - - name[2] = 'L'; - if ((flag & ACPI_TRIPS_DEVICES) && tz->trips.active[i].trip.valid) { - memset(&devices, 0, sizeof(struct acpi_handle_list)); - status = acpi_evaluate_reference(tz->device->handle, - name, NULL, &devices); - if (ACPI_FAILURE(status)) { - acpi_handle_info(tz->device->handle, - "Invalid active%d threshold\n", i); - tz->trips.active[i].trip.valid = false; - } else { - tz->trips.active[i].trip.valid = true; - } - - if (memcmp(&tz->trips.active[i].devices, &devices, - sizeof(struct acpi_handle_list))) { - memcpy(&tz->trips.active[i].devices, &devices, - sizeof(struct acpi_handle_list)); - ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "device"); - } - } - if ((flag & ACPI_TRIPS_ACTIVE) || (flag & ACPI_TRIPS_DEVICES)) - if (valid != tz->trips.active[i].trip.valid) - ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "state"); - - if (!tz->trips.active[i].trip.valid) - break; - } - - if (flag & ACPI_TRIPS_DEVICES) { - memset(&devices, 0, sizeof(devices)); - status = acpi_evaluate_reference(tz->device->handle, "_TZD", - NULL, &devices); - if (ACPI_SUCCESS(status) && - memcmp(&tz->devices, &devices, sizeof(devices))) { - tz->devices = devices; - ACPI_THERMAL_TRIPS_EXCEPTION(flag, tz, "device"); - } - } -} - static void update_acpi_thermal_trip_temp(struct acpi_thermal_trip *acpi_trip, int temp) { @@ -338,6 +250,78 @@ static void acpi_thermal_update_passive_devices(struct acpi_thermal *tz) ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "state"); } +static long get_active_temp(struct acpi_thermal *tz, int index) +{ + char method[] = { '_', 'A', 'C', '0' + index, '\0' }; + unsigned long long tmp; + acpi_status status; + + status = acpi_evaluate_integer(tz->device->handle, method, NULL, &tmp); + if (ACPI_FAILURE(status)) + return THERMAL_TEMP_INVALID; + + /* + * If an override has been provided, apply it so there are no active + * trips with thresholds greater than the override. + */ + if (act > 0) { + unsigned long long override = celsius_to_deci_kelvin(act); + + if (tmp > override) + tmp = override; + } + return tmp; +} + +static void acpi_thermal_update_active_trip(struct acpi_thermal *tz, int index) +{ + struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; + + if (!acpi_trip->valid) + return; + + update_acpi_thermal_trip_temp(acpi_trip, get_active_temp(tz, index)); + if (!acpi_trip->valid) + ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_ACTIVE, tz, "state"); +} + +static bool update_active_devices(struct acpi_thermal *tz, int index, bool compare) +{ + char method[] = { '_', 'A', 'L', '0' + index, '\0' }; + struct acpi_handle_list devices; + acpi_status status; + + memset(&devices, 0, sizeof(devices)); + + status = acpi_evaluate_reference(tz->device->handle, method, NULL, &devices); + if (ACPI_FAILURE(status)) { + acpi_handle_info(tz->device->handle, + "Missing device list for active threshold %d\n", + index); + return false; + } + + if (compare && memcmp(&tz->trips.active[index].devices, &devices, sizeof(devices))) + ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_ACTIVE, tz, "device"); + + memcpy(&tz->trips.active[index].devices, &devices, sizeof(devices)); + return true; +} + +static void acpi_thermal_update_active_devices(struct acpi_thermal *tz, int index) +{ + struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; + + if (!acpi_trip->valid) + return; + + if (update_active_devices(tz, index, true)) + return; + + update_acpi_thermal_trip_temp(acpi_trip, THERMAL_TEMP_INVALID); + ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_ACTIVE, tz, "state"); +} + static int acpi_thermal_adjust_trip(struct thermal_trip *trip, void *data) { struct acpi_thermal_trip *acpi_trip = trip->priv; @@ -358,18 +342,18 @@ static void acpi_thermal_adjust_thermal_zone(struct thermal_zone_device *thermal unsigned long data) { struct acpi_thermal *tz = thermal_zone_device_priv(thermal); - int flag; + int i; if (data == ACPI_THERMAL_NOTIFY_THRESHOLDS) { acpi_thermal_update_passive_trip(tz); - flag = ACPI_TRIPS_THRESHOLDS; + for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) + acpi_thermal_update_active_trip(tz, i); } else { acpi_thermal_update_passive_devices(tz); - flag = ACPI_TRIPS_DEVICES; + for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) + acpi_thermal_update_active_devices(tz, i); } - __acpi_thermal_trips_update(tz, flag); - for_each_thermal_trip(tz->thermal_zone, acpi_thermal_adjust_trip, tz); } @@ -498,6 +482,28 @@ static bool acpi_thermal_init_passive_trip(struct acpi_thermal *tz) return false; } +static bool acpi_thermal_init_active_trip(struct acpi_thermal *tz, int index) +{ + long temp; + + if (act == -1) + goto fail; + + temp = get_active_temp(tz, index); + if (temp == THERMAL_TEMP_INVALID) + goto fail; + + if (!update_active_devices(tz, index, false)) + goto fail; + + update_acpi_thermal_trip_temp(&tz->trips.active[index].trip, temp); + return true; + +fail: + update_acpi_thermal_trip_temp(&tz->trips.active[index].trip, THERMAL_TEMP_INVALID); + return false; +} + static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) { unsigned int count = 0; @@ -506,11 +512,8 @@ static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) if (acpi_thermal_init_passive_trip(tz)) count++; - /* Active trip points (optional). */ - __acpi_thermal_trips_update(tz, ACPI_TRIPS_INIT); - for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - if (tz->trips.active[i].trip.valid) + if (acpi_thermal_init_active_trip(tz, i)) count++; else break; From 60aa974494648ad3c7157a0c2c84633468f5c44b Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Tue, 12 Sep 2023 20:46:02 +0200 Subject: [PATCH 335/548] ACPI: thermal: Drop redundant trip point flags Trip point flags previously used by the driver need not be used any more after the preceding changes, so drop them and adjust the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 4175a24f01eb4bfde7d57a9a3ea254649f440034) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 29 ++++++++++------------------- 1 file changed, 10 insertions(+), 19 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 1389d8b353977..e36577bac54db 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -43,14 +43,6 @@ #define ACPI_THERMAL_MAX_ACTIVE 10 #define ACPI_THERMAL_MAX_LIMIT_STR_LEN 65 -#define ACPI_TRIPS_PASSIVE BIT(0) -#define ACPI_TRIPS_ACTIVE BIT(1) -#define ACPI_TRIPS_DEVICES BIT(2) - -#define ACPI_TRIPS_THRESHOLDS (ACPI_TRIPS_PASSIVE | ACPI_TRIPS_ACTIVE) - -#define ACPI_TRIPS_INIT (ACPI_TRIPS_THRESHOLDS | ACPI_TRIPS_DEVICES) - /* * This exception is thrown out in two cases: * 1.An invalid trip point becomes invalid or a valid trip point becomes invalid @@ -58,12 +50,11 @@ * 2.TODO: Devices listed in _PSL, _ALx, _TZD may change. * We need to re-bind the cooling devices of a thermal zone when this occurs. */ -#define ACPI_THERMAL_TRIPS_EXCEPTION(flags, tz, str) \ +#define ACPI_THERMAL_TRIPS_EXCEPTION(tz, str) \ do { \ - if (flags != ACPI_TRIPS_INIT) \ - acpi_handle_info(tz->device->handle, \ - "ACPI thermal trip point %s changed\n" \ - "Please report to linux-acpi@vger.kernel.org\n", str); \ + acpi_handle_info(tz->device->handle, \ + "ACPI thermal trip point %s changed\n" \ + "Please report to linux-acpi@vger.kernel.org\n", str); \ } while (0) static int act; @@ -212,7 +203,7 @@ static void acpi_thermal_update_passive_trip(struct acpi_thermal *tz) update_acpi_thermal_trip_temp(acpi_trip, get_passive_temp(tz)); if (!acpi_trip->valid) - ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "state"); + ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } static bool update_passive_devices(struct acpi_thermal *tz, bool compare) @@ -230,7 +221,7 @@ static bool update_passive_devices(struct acpi_thermal *tz, bool compare) } if (compare && memcmp(&tz->trips.passive.devices, &devices, sizeof(devices))) - ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "device"); + ACPI_THERMAL_TRIPS_EXCEPTION(tz, "device"); memcpy(&tz->trips.passive.devices, &devices, sizeof(devices)); return true; @@ -247,7 +238,7 @@ static void acpi_thermal_update_passive_devices(struct acpi_thermal *tz) return; update_acpi_thermal_trip_temp(acpi_trip, THERMAL_TEMP_INVALID); - ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_PASSIVE, tz, "state"); + ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } static long get_active_temp(struct acpi_thermal *tz, int index) @@ -282,7 +273,7 @@ static void acpi_thermal_update_active_trip(struct acpi_thermal *tz, int index) update_acpi_thermal_trip_temp(acpi_trip, get_active_temp(tz, index)); if (!acpi_trip->valid) - ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_ACTIVE, tz, "state"); + ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } static bool update_active_devices(struct acpi_thermal *tz, int index, bool compare) @@ -302,7 +293,7 @@ static bool update_active_devices(struct acpi_thermal *tz, int index, bool compa } if (compare && memcmp(&tz->trips.active[index].devices, &devices, sizeof(devices))) - ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_ACTIVE, tz, "device"); + ACPI_THERMAL_TRIPS_EXCEPTION(tz, "device"); memcpy(&tz->trips.active[index].devices, &devices, sizeof(devices)); return true; @@ -319,7 +310,7 @@ static void acpi_thermal_update_active_devices(struct acpi_thermal *tz, int inde return; update_acpi_thermal_trip_temp(acpi_trip, THERMAL_TEMP_INVALID); - ACPI_THERMAL_TRIPS_EXCEPTION(ACPI_TRIPS_ACTIVE, tz, "state"); + ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } static int acpi_thermal_adjust_trip(struct thermal_trip *trip, void *data) From 578002b496aae264ee77276b7423aba7ff68adeb Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 20 Sep 2023 15:03:46 +0200 Subject: [PATCH 336/548] ACPI: thermal: Drop valid flag from struct acpi_thermal_trip Notice that the valid flag in struct acpi_thermal_trip is in fact redundant, because the temperature field of invalid trips is always equal to THERMAL_TEMP_INVALID, so drop it from there and adjust the code accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 058f5e407deb8d21b0a04e50e8efbd25b1fcbd1b) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 49 ++++++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index e36577bac54db..8206b0f1c26ca 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -81,7 +81,6 @@ static struct workqueue_struct *acpi_thermal_pm_queue; struct acpi_thermal_trip { unsigned long temperature; - bool valid; }; struct acpi_thermal_passive { @@ -175,11 +174,9 @@ static int acpi_thermal_temp(struct acpi_thermal *tz, int temp_deci_k) tz->kelvin_offset); } -static void update_acpi_thermal_trip_temp(struct acpi_thermal_trip *acpi_trip, - int temp) +static bool acpi_thermal_trip_valid(struct acpi_thermal_trip *acpi_trip) { - acpi_trip->valid = temp != THERMAL_TEMP_INVALID; - acpi_trip->temperature = temp; + return acpi_trip->temperature != THERMAL_TEMP_INVALID; } static long get_passive_temp(struct acpi_thermal *tz) @@ -198,11 +195,11 @@ static void acpi_thermal_update_passive_trip(struct acpi_thermal *tz) { struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; - if (!acpi_trip->valid || psv > 0) + if (!acpi_thermal_trip_valid(acpi_trip) || psv > 0) return; - update_acpi_thermal_trip_temp(acpi_trip, get_passive_temp(tz)); - if (!acpi_trip->valid) + acpi_trip->temperature = get_passive_temp(tz); + if (!acpi_thermal_trip_valid(acpi_trip)) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } @@ -231,13 +228,13 @@ static void acpi_thermal_update_passive_devices(struct acpi_thermal *tz) { struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; - if (!acpi_trip->valid) + if (!acpi_thermal_trip_valid(acpi_trip)) return; if (update_passive_devices(tz, true)) return; - update_acpi_thermal_trip_temp(acpi_trip, THERMAL_TEMP_INVALID); + acpi_trip->temperature = THERMAL_TEMP_INVALID; ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } @@ -268,11 +265,11 @@ static void acpi_thermal_update_active_trip(struct acpi_thermal *tz, int index) { struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; - if (!acpi_trip->valid) + if (!acpi_thermal_trip_valid(acpi_trip)) return; - update_acpi_thermal_trip_temp(acpi_trip, get_active_temp(tz, index)); - if (!acpi_trip->valid) + acpi_trip->temperature = get_active_temp(tz, index); + if (!acpi_thermal_trip_valid(acpi_trip)) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } @@ -303,13 +300,13 @@ static void acpi_thermal_update_active_devices(struct acpi_thermal *tz, int inde { struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; - if (!acpi_trip->valid) + if (!acpi_thermal_trip_valid(acpi_trip)) return; if (update_active_devices(tz, index, true)) return; - update_acpi_thermal_trip_temp(acpi_trip, THERMAL_TEMP_INVALID); + acpi_trip->temperature = THERMAL_TEMP_INVALID; ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } @@ -321,7 +318,7 @@ static int acpi_thermal_adjust_trip(struct thermal_trip *trip, void *data) if (!acpi_trip) return 0; - if (acpi_trip->valid) + if (acpi_thermal_trip_valid(acpi_trip)) trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); else trip->temperature = THERMAL_TEMP_INVALID; @@ -465,11 +462,11 @@ static bool acpi_thermal_init_passive_trip(struct acpi_thermal *tz) if (!update_passive_devices(tz, false)) goto fail; - update_acpi_thermal_trip_temp(&tz->trips.passive.trip, temp); + tz->trips.passive.trip.temperature = temp; return true; fail: - update_acpi_thermal_trip_temp(&tz->trips.passive.trip, THERMAL_TEMP_INVALID); + tz->trips.passive.trip.temperature = THERMAL_TEMP_INVALID; return false; } @@ -487,11 +484,11 @@ static bool acpi_thermal_init_active_trip(struct acpi_thermal *tz, int index) if (!update_active_devices(tz, index, false)) goto fail; - update_acpi_thermal_trip_temp(&tz->trips.active[index].trip, temp); + tz->trips.active[index].trip.temperature = temp; return true; fail: - update_acpi_thermal_trip_temp(&tz->trips.active[index].trip, THERMAL_TEMP_INVALID); + tz->trips.active[index].trip.temperature = THERMAL_TEMP_INVALID; return false; } @@ -545,7 +542,7 @@ static int thermal_get_trend(struct thermal_zone_device *thermal, return -EINVAL; acpi_trip = trip->priv; - if (!acpi_trip || !acpi_trip->valid) + if (!acpi_trip || !acpi_thermal_trip_valid(acpi_trip)) return -EINVAL; switch (trip->type) { @@ -618,7 +615,7 @@ static int acpi_thermal_cooling_device_cb(struct thermal_zone_device *thermal, if (tz->trips.hot_valid) trip++; - if (tz->trips.passive.trip.valid) { + if (acpi_thermal_trip_valid(&tz->trips.passive.trip)) { trip++; for (i = 0; i < tz->trips.passive.devices.count; i++) { handle = tz->trips.passive.devices.handles[i]; @@ -643,7 +640,7 @@ static int acpi_thermal_cooling_device_cb(struct thermal_zone_device *thermal, } for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - if (!tz->trips.active[i].trip.valid) + if (!acpi_thermal_trip_valid(&tz->trips.active[i].trip)) break; trip++; @@ -949,7 +946,7 @@ static int acpi_thermal_add(struct acpi_device *device) } acpi_trip = &tz->trips.passive.trip; - if (acpi_trip->valid) { + if (acpi_thermal_trip_valid(acpi_trip)) { passive_delay = tz->trips.passive.tsp * 100; trip->type = THERMAL_TRIP_PASSIVE; @@ -961,7 +958,7 @@ static int acpi_thermal_add(struct acpi_device *device) for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { acpi_trip = &tz->trips.active[i].trip; - if (!acpi_trip->valid) + if (!acpi_thermal_trip_valid(acpi_trip)) break; trip->type = THERMAL_TRIP_ACTIVE; @@ -1038,7 +1035,7 @@ static int acpi_thermal_resume(struct device *dev) return -EINVAL; for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - if (!tz->trips.active[i].trip.valid) + if (!acpi_thermal_trip_valid(&tz->trips.active[i].trip)) break; for (j = 0; j < tz->trips.active[i].devices.count; j++) { From bacf16ab7f2a1c77e4c3845ac9c58a7a6fcef64d Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 27 Sep 2023 15:37:26 +0300 Subject: [PATCH 337/548] ACPI: thermal: Fix a small leak in acpi_thermal_add() Free "tz" if the "trip" allocation fails. Fixes: 5fc2189f9335 ("ACPI: thermal: Create and populate trip points table earlier") Signed-off-by: Dan Carpenter Signed-off-by: Rafael J. Wysocki (cherry picked from commit 0d9741abd1c583e7bedb178358a9abd0981f49ba) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 8206b0f1c26ca..d6dc7b7e78203 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -928,8 +928,10 @@ static int acpi_thermal_add(struct acpi_device *device) acpi_thermal_guess_offset(tz, crit_temp); trip = kcalloc(trip_count, sizeof(*trip), GFP_KERNEL); - if (!trip) - return -ENOMEM; + if (!trip) { + result = -ENOMEM; + goto free_memory; + } tz->trip_table = trip; From 934e254cdc1095661135c331060d26116d7ceac2 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Thu, 21 Sep 2023 19:48:33 +0200 Subject: [PATCH 338/548] ACPI: thermal: Add device list to struct acpi_thermal_trip The device lists present in struct acpi_thermal_passive and struct acpi_thermal_active can be located in struct acpi_thermal_trip which then will allow the same code to be used for handling both the passive and active trip points, so make that change. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 0fa1bf34980eba19a5f07f97297bbd60b796e6d1) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 36 +++++++++++++++++++++--------------- 1 file changed, 21 insertions(+), 15 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index d6dc7b7e78203..ecc9bc08b1e9e 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -81,11 +81,11 @@ static struct workqueue_struct *acpi_thermal_pm_queue; struct acpi_thermal_trip { unsigned long temperature; + struct acpi_handle_list devices; }; struct acpi_thermal_passive { struct acpi_thermal_trip trip; - struct acpi_handle_list devices; unsigned long tc1; unsigned long tc2; unsigned long tsp; @@ -93,7 +93,6 @@ struct acpi_thermal_passive { struct acpi_thermal_active { struct acpi_thermal_trip trip; - struct acpi_handle_list devices; }; struct acpi_thermal_trips { @@ -205,6 +204,7 @@ static void acpi_thermal_update_passive_trip(struct acpi_thermal *tz) static bool update_passive_devices(struct acpi_thermal *tz, bool compare) { + struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; struct acpi_handle_list devices; acpi_status status; @@ -217,10 +217,10 @@ static bool update_passive_devices(struct acpi_thermal *tz, bool compare) return false; } - if (compare && memcmp(&tz->trips.passive.devices, &devices, sizeof(devices))) + if (compare && memcmp(&acpi_trip->devices, &devices, sizeof(devices))) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "device"); - memcpy(&tz->trips.passive.devices, &devices, sizeof(devices)); + memcpy(&acpi_trip->devices, &devices, sizeof(devices)); return true; } @@ -276,6 +276,7 @@ static void acpi_thermal_update_active_trip(struct acpi_thermal *tz, int index) static bool update_active_devices(struct acpi_thermal *tz, int index, bool compare) { char method[] = { '_', 'A', 'L', '0' + index, '\0' }; + struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; struct acpi_handle_list devices; acpi_status status; @@ -289,10 +290,10 @@ static bool update_active_devices(struct acpi_thermal *tz, int index, bool compa return false; } - if (compare && memcmp(&tz->trips.active[index].devices, &devices, sizeof(devices))) + if (compare && memcmp(&acpi_trip->devices, &devices, sizeof(devices))) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "device"); - memcpy(&tz->trips.active[index].devices, &devices, sizeof(devices)); + memcpy(&acpi_trip->devices, &devices, sizeof(devices)); return true; } @@ -602,6 +603,7 @@ static int acpi_thermal_cooling_device_cb(struct thermal_zone_device *thermal, { struct acpi_device *device = cdev->devdata; struct acpi_thermal *tz = thermal_zone_device_priv(thermal); + struct acpi_thermal_trip *acpi_trip; struct acpi_device *dev; acpi_handle handle; int i; @@ -615,10 +617,11 @@ static int acpi_thermal_cooling_device_cb(struct thermal_zone_device *thermal, if (tz->trips.hot_valid) trip++; - if (acpi_thermal_trip_valid(&tz->trips.passive.trip)) { + acpi_trip = &tz->trips.passive.trip; + if (acpi_thermal_trip_valid(acpi_trip)) { trip++; - for (i = 0; i < tz->trips.passive.devices.count; i++) { - handle = tz->trips.passive.devices.handles[i]; + for (i = 0; i < acpi_trip->devices.count; i++) { + handle = acpi_trip->devices.handles[i]; dev = acpi_fetch_acpi_dev(handle); if (dev != device) continue; @@ -640,12 +643,13 @@ static int acpi_thermal_cooling_device_cb(struct thermal_zone_device *thermal, } for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - if (!acpi_thermal_trip_valid(&tz->trips.active[i].trip)) + acpi_trip = &tz->trips.active[i].trip; + if (!acpi_thermal_trip_valid(acpi_trip)) break; trip++; - for (j = 0; j < tz->trips.active[i].devices.count; j++) { - handle = tz->trips.active[i].devices.handles[j]; + for (j = 0; j < acpi_trip->devices.count; j++) { + handle = acpi_trip->devices.handles[j]; dev = acpi_fetch_acpi_dev(handle); if (dev != device) continue; @@ -1037,11 +1041,13 @@ static int acpi_thermal_resume(struct device *dev) return -EINVAL; for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - if (!acpi_thermal_trip_valid(&tz->trips.active[i].trip)) + struct acpi_thermal_trip *acpi_trip = &tz->trips.active[i].trip; + + if (!acpi_thermal_trip_valid(acpi_trip)) break; - for (j = 0; j < tz->trips.active[i].devices.count; j++) { - acpi_bus_update_power(tz->trips.active[i].devices.handles[j], + for (j = 0; j < acpi_trip->devices.count; j++) { + acpi_bus_update_power(acpi_trip->devices.handles[j], &power_state); } } From e02f2a76311770360f36140b1f208e9016c6655d Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Thu, 21 Sep 2023 19:49:24 +0200 Subject: [PATCH 339/548] ACPI: thermal: Collapse trip devices update functions In order to reduce code duplication, merge update_passive_devices() and update_active_devices() into one function called update_trip_devices() that will be used for updating both the passive and active trip points. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 317508c65f1f966367553ef84f5d8eba16720f32) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 53 ++++++++++++++++-------------------------- 1 file changed, 20 insertions(+), 33 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index ecc9bc08b1e9e..f5c2056b7c791 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -43,6 +43,8 @@ #define ACPI_THERMAL_MAX_ACTIVE 10 #define ACPI_THERMAL_MAX_LIMIT_STR_LEN 65 +#define ACPI_THERMAL_TRIP_PASSIVE (-1) + /* * This exception is thrown out in two cases: * 1.An invalid trip point becomes invalid or a valid trip point becomes invalid @@ -202,18 +204,25 @@ static void acpi_thermal_update_passive_trip(struct acpi_thermal *tz) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } -static bool update_passive_devices(struct acpi_thermal *tz, bool compare) +static bool update_trip_devices(struct acpi_thermal *tz, + struct acpi_thermal_trip *acpi_trip, + int index, bool compare) { - struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; struct acpi_handle_list devices; + char method[] = "_PSL"; acpi_status status; + if (index != ACPI_THERMAL_TRIP_PASSIVE) { + method[1] = 'A'; + method[2] = 'L'; + method[3] = '0' + index; + } + memset(&devices, 0, sizeof(devices)); - status = acpi_evaluate_reference(tz->device->handle, "_PSL", NULL, &devices); + status = acpi_evaluate_reference(tz->device->handle, method, NULL, &devices); if (ACPI_FAILURE(status)) { - acpi_handle_info(tz->device->handle, - "Missing device list for passive threshold\n"); + acpi_handle_info(tz->device->handle, "%s evaluation failure\n", method); return false; } @@ -231,8 +240,9 @@ static void acpi_thermal_update_passive_devices(struct acpi_thermal *tz) if (!acpi_thermal_trip_valid(acpi_trip)) return; - if (update_passive_devices(tz, true)) + if (update_trip_devices(tz, acpi_trip, ACPI_THERMAL_TRIP_PASSIVE, true)) { return; + } acpi_trip->temperature = THERMAL_TEMP_INVALID; ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); @@ -273,30 +283,6 @@ static void acpi_thermal_update_active_trip(struct acpi_thermal *tz, int index) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } -static bool update_active_devices(struct acpi_thermal *tz, int index, bool compare) -{ - char method[] = { '_', 'A', 'L', '0' + index, '\0' }; - struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; - struct acpi_handle_list devices; - acpi_status status; - - memset(&devices, 0, sizeof(devices)); - - status = acpi_evaluate_reference(tz->device->handle, method, NULL, &devices); - if (ACPI_FAILURE(status)) { - acpi_handle_info(tz->device->handle, - "Missing device list for active threshold %d\n", - index); - return false; - } - - if (compare && memcmp(&acpi_trip->devices, &devices, sizeof(devices))) - ACPI_THERMAL_TRIPS_EXCEPTION(tz, "device"); - - memcpy(&acpi_trip->devices, &devices, sizeof(devices)); - return true; -} - static void acpi_thermal_update_active_devices(struct acpi_thermal *tz, int index) { struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; @@ -304,7 +290,7 @@ static void acpi_thermal_update_active_devices(struct acpi_thermal *tz, int inde if (!acpi_thermal_trip_valid(acpi_trip)) return; - if (update_active_devices(tz, index, true)) + if (update_trip_devices(tz, acpi_trip, index, true)) return; acpi_trip->temperature = THERMAL_TEMP_INVALID; @@ -460,7 +446,8 @@ static bool acpi_thermal_init_passive_trip(struct acpi_thermal *tz) tz->trips.passive.tsp = tmp; - if (!update_passive_devices(tz, false)) + if (!update_trip_devices(tz, &tz->trips.passive.trip, + ACPI_THERMAL_TRIP_PASSIVE, false)) goto fail; tz->trips.passive.trip.temperature = temp; @@ -482,7 +469,7 @@ static bool acpi_thermal_init_active_trip(struct acpi_thermal *tz, int index) if (temp == THERMAL_TEMP_INVALID) goto fail; - if (!update_active_devices(tz, index, false)) + if (!update_trip_devices(tz, &tz->trips.active[index].trip, index, false)) goto fail; tz->trips.active[index].trip.temperature = temp; From 6e543a10580ade3e3f8823f7d59bc80eecc18ea0 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Thu, 21 Sep 2023 19:50:08 +0200 Subject: [PATCH 340/548] ACPI: thermal: Collapse trip devices update function wrappers In order to reduce code duplicationeve further, merge acpi_thermal_update_passive/active_devices() into one function called acpi_thermal_update_trip_devices() that will be used for updating both the passive and active trip points. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 54fc61a106c9b7fb55a3c6a595349826c0548468) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 26 +++++++------------------- 1 file changed, 7 insertions(+), 19 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index f5c2056b7c791..2b815c093ee5c 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -233,14 +233,16 @@ static bool update_trip_devices(struct acpi_thermal *tz, return true; } -static void acpi_thermal_update_passive_devices(struct acpi_thermal *tz) +static void acpi_thermal_update_trip_devices(struct acpi_thermal *tz, int index) { - struct acpi_thermal_trip *acpi_trip = &tz->trips.passive.trip; + struct acpi_thermal_trip *acpi_trip; + acpi_trip = index == ACPI_THERMAL_TRIP_PASSIVE ? + &tz->trips.passive.trip : &tz->trips.active[index].trip; if (!acpi_thermal_trip_valid(acpi_trip)) return; - if (update_trip_devices(tz, acpi_trip, ACPI_THERMAL_TRIP_PASSIVE, true)) { + if (update_trip_devices(tz, acpi_trip, index, true)) { return; } @@ -283,20 +285,6 @@ static void acpi_thermal_update_active_trip(struct acpi_thermal *tz, int index) ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); } -static void acpi_thermal_update_active_devices(struct acpi_thermal *tz, int index) -{ - struct acpi_thermal_trip *acpi_trip = &tz->trips.active[index].trip; - - if (!acpi_thermal_trip_valid(acpi_trip)) - return; - - if (update_trip_devices(tz, acpi_trip, index, true)) - return; - - acpi_trip->temperature = THERMAL_TEMP_INVALID; - ACPI_THERMAL_TRIPS_EXCEPTION(tz, "state"); -} - static int acpi_thermal_adjust_trip(struct thermal_trip *trip, void *data) { struct acpi_thermal_trip *acpi_trip = trip->priv; @@ -324,9 +312,9 @@ static void acpi_thermal_adjust_thermal_zone(struct thermal_zone_device *thermal for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) acpi_thermal_update_active_trip(tz, i); } else { - acpi_thermal_update_passive_devices(tz); + acpi_thermal_update_trip_devices(tz, ACPI_THERMAL_TRIP_PASSIVE); for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) - acpi_thermal_update_active_devices(tz, i); + acpi_thermal_update_trip_devices(tz, i); } for_each_thermal_trip(tz->thermal_zone, acpi_thermal_adjust_trip, tz); From 8634242cb0f54deb845b2822ab999356077a0768 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Thu, 21 Sep 2023 19:51:19 +0200 Subject: [PATCH 341/548] ACPI: thermal: Merge trip initialization functions In order to reduce code duplicationeve further, merge acpi_thermal_init_passive/active_trip() into one function called acpi_thermal_init_trip() that will be used for initializing both the passive and active trip points. No intentional functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Daniel Lezcano (cherry picked from commit 3e7d6f396d74a3e40c390bb53947938957426097) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 64 ++++++++++++++++++++---------------------- 1 file changed, 30 insertions(+), 34 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 2b815c093ee5c..666fe64a84c35 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -399,72 +399,68 @@ static long acpi_thermal_get_hot_trip(struct acpi_thermal *tz) return tmp; } -static bool acpi_thermal_init_passive_trip(struct acpi_thermal *tz) +static bool passive_trip_params_init(struct acpi_thermal *tz) { unsigned long long tmp; acpi_status status; - int temp; - - if (psv == -1) - goto fail; - - if (psv > 0) { - temp = celsius_to_deci_kelvin(psv); - } else { - temp = get_passive_temp(tz); - if (temp == THERMAL_TEMP_INVALID) - goto fail; - } status = acpi_evaluate_integer(tz->device->handle, "_TC1", NULL, &tmp); if (ACPI_FAILURE(status)) - goto fail; + return false; tz->trips.passive.tc1 = tmp; status = acpi_evaluate_integer(tz->device->handle, "_TC2", NULL, &tmp); if (ACPI_FAILURE(status)) - goto fail; + return false; tz->trips.passive.tc2 = tmp; status = acpi_evaluate_integer(tz->device->handle, "_TSP", NULL, &tmp); if (ACPI_FAILURE(status)) - goto fail; + return false; tz->trips.passive.tsp = tmp; - if (!update_trip_devices(tz, &tz->trips.passive.trip, - ACPI_THERMAL_TRIP_PASSIVE, false)) - goto fail; - - tz->trips.passive.trip.temperature = temp; return true; - -fail: - tz->trips.passive.trip.temperature = THERMAL_TEMP_INVALID; - return false; } -static bool acpi_thermal_init_active_trip(struct acpi_thermal *tz, int index) +static bool acpi_thermal_init_trip(struct acpi_thermal *tz, int index) { + struct acpi_thermal_trip *acpi_trip; long temp; - if (act == -1) - goto fail; + if (index == ACPI_THERMAL_TRIP_PASSIVE) { + acpi_trip = &tz->trips.passive.trip; + + if (psv == -1) + goto fail; + + if (!passive_trip_params_init(tz)) + goto fail; + + temp = psv > 0 ? celsius_to_deci_kelvin(psv) : + get_passive_temp(tz); + } else { + acpi_trip = &tz->trips.active[index].trip; + + if (act == -1) + goto fail; + + temp = get_active_temp(tz, index); + } - temp = get_active_temp(tz, index); if (temp == THERMAL_TEMP_INVALID) goto fail; - if (!update_trip_devices(tz, &tz->trips.active[index].trip, index, false)) + if (!update_trip_devices(tz, acpi_trip, index, false)) goto fail; - tz->trips.active[index].trip.temperature = temp; + acpi_trip->temperature = temp; return true; fail: - tz->trips.active[index].trip.temperature = THERMAL_TEMP_INVALID; + acpi_trip->temperature = THERMAL_TEMP_INVALID; return false; } @@ -473,11 +469,11 @@ static int acpi_thermal_get_trip_points(struct acpi_thermal *tz) unsigned int count = 0; int i; - if (acpi_thermal_init_passive_trip(tz)) + if (acpi_thermal_init_trip(tz, ACPI_THERMAL_TRIP_PASSIVE)) count++; for (i = 0; i < ACPI_THERMAL_MAX_ACTIVE; i++) { - if (acpi_thermal_init_active_trip(tz, i)) + if (acpi_thermal_init_trip(tz, i)) count++; else break; From 4173b4d3a36103200692f1f887c66aef69e8830e Mon Sep 17 00:00:00 2001 From: Jeff Brasen Date: Fri, 10 Nov 2023 00:03:21 +0530 Subject: [PATCH 342/548] ACPI: thermal: Add Thermal fast Sampling Period (_TFP) support Add support of "Thermal fast Sampling Period (_TFP)" for passive cooling. As per the ACPI specification (ACPI 6.5, Section 11.4.17 "_TFP (Thermal fast Sampling Period)", _TFP overrides _TSP ("Thermal Sampling Period" if both are present in a Thermal zone. Signed-off-by: Jeff Brasen Co-developed-by: Sumit Gupta Signed-off-by: Sumit Gupta [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki (cherry picked from commit a2ee7581afd59015b8f9ae01fad131aed9f26f01) Signed-off-by: Koba Ko --- drivers/acpi/thermal.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 666fe64a84c35..6eee5bf1676ad 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -90,7 +90,7 @@ struct acpi_thermal_passive { struct acpi_thermal_trip trip; unsigned long tc1; unsigned long tc2; - unsigned long tsp; + unsigned long delay; }; struct acpi_thermal_active { @@ -416,11 +416,17 @@ static bool passive_trip_params_init(struct acpi_thermal *tz) tz->trips.passive.tc2 = tmp; + status = acpi_evaluate_integer(tz->device->handle, "_TFP", NULL, &tmp); + if (ACPI_SUCCESS(status)) { + tz->trips.passive.delay = tmp; + return true; + } + status = acpi_evaluate_integer(tz->device->handle, "_TSP", NULL, &tmp); if (ACPI_FAILURE(status)) return false; - tz->trips.passive.tsp = tmp; + tz->trips.passive.delay = tmp * 100; return true; } @@ -924,7 +930,7 @@ static int acpi_thermal_add(struct acpi_device *device) acpi_trip = &tz->trips.passive.trip; if (acpi_thermal_trip_valid(acpi_trip)) { - passive_delay = tz->trips.passive.tsp * 100; + passive_delay = tz->trips.passive.delay; trip->type = THERMAL_TRIP_PASSIVE; trip->temperature = acpi_thermal_temp(tz, acpi_trip->temperature); From 54666721acebf16a8e3e4d355e5d4d6c189b128f Mon Sep 17 00:00:00 2001 From: Srikar Srimath Tirumala Date: Thu, 23 Nov 2023 17:44:33 +0530 Subject: [PATCH 343/548] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241 Current implementation of processor_thermal performs software throttling in fixed steps of "20%" which can be too coarse for some platforms. We observed some performance gain after reducing the throttle percentage. Change the CPUFREQ thermal reduction percentage and maximum thermal steps to be configurable. Also, update the default values of both for Nvidia Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%" and accordingly the maximum number of thermal steps are increased as they are derived from the reduction percentage. Signed-off-by: Srikar Srimath Tirumala Co-developed-by: Sumit Gupta Signed-off-by: Sumit Gupta Acked-by: Sudeep Holla Acked-by: Hanjun Guo Signed-off-by: Rafael J. Wysocki (cherry picked from commit 310293a2b94197f3d75e65ab22672287a7938a00) Signed-off-by: Koba Ko --- drivers/acpi/arm64/Makefile | 1 + drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++ drivers/acpi/internal.h | 9 +++++ drivers/acpi/processor_thermal.c | 49 +++++++++++++++++++++++----- 4 files changed, 70 insertions(+), 9 deletions(-) create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile index 143debc1ba4a9..726944648c9bc 100644 --- a/drivers/acpi/arm64/Makefile +++ b/drivers/acpi/arm64/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o obj-$(CONFIG_ACPI_APMT) += apmt.o obj-$(CONFIG_ARM_AMBA) += amba.o obj-y += dma.o init.o +obj-y += thermal_cpufreq.o diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c new file mode 100644 index 0000000000000..d524f2cd6044d --- /dev/null +++ b/drivers/acpi/arm64/thermal_cpufreq.c @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include + +#include "../internal.h" + +#define SMCCC_SOC_ID_T241 0x036b0241 + +int acpi_arch_thermal_cpufreq_pctg(void) +{ + s32 soc_id = arm_smccc_get_soc_id_version(); + + /* + * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and + * reduce the CPUFREQ Thermal reduction percentage to 5%. + */ + if (soc_id == SMCCC_SOC_ID_T241) + return 5; + + return 0; +} diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h index 1e8ee97fc85f3..cbac78cb268a2 100644 --- a/drivers/acpi/internal.h +++ b/drivers/acpi/internal.h @@ -85,6 +85,15 @@ bool acpi_scan_is_offline(struct acpi_device *adev, bool uevent); acpi_status acpi_sysfs_table_handler(u32 event, void *table, void *context); void acpi_scan_table_notify(void); +#ifdef CONFIG_ARM64 +int acpi_arch_thermal_cpufreq_pctg(void); +#else +static inline int acpi_arch_thermal_cpufreq_pctg(void) +{ + return 0; +} +#endif + /* -------------------------------------------------------------------------- Device Node Initialization / Removal -------------------------------------------------------------------------- */ diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c index b7c6287eccca2..1219adb11ab92 100644 --- a/drivers/acpi/processor_thermal.c +++ b/drivers/acpi/processor_thermal.c @@ -17,6 +17,8 @@ #include #include +#include "internal.h" + #ifdef CONFIG_CPU_FREQ /* If a passive cooling situation is detected, primarily CPUfreq is used, as it @@ -26,12 +28,21 @@ */ #define CPUFREQ_THERMAL_MIN_STEP 0 -#define CPUFREQ_THERMAL_MAX_STEP 3 -static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg); +static int cpufreq_thermal_max_step __read_mostly = 3; + +/* + * Minimum throttle percentage for processor_thermal cooling device. + * The processor_thermal driver uses it to calculate the percentage amount by + * which cpu frequency must be reduced for each cooling state. This is also used + * to calculate the maximum number of throttling steps or cooling states. + */ +static int cpufreq_thermal_reduction_pctg __read_mostly = 20; -#define reduction_pctg(cpu) \ - per_cpu(cpufreq_thermal_reduction_pctg, phys_package_first_cpu(cpu)) +static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_step); + +#define reduction_step(cpu) \ + per_cpu(cpufreq_thermal_reduction_step, phys_package_first_cpu(cpu)) /* * Emulate "per package data" using per cpu data (which should really be @@ -71,7 +82,7 @@ static int cpufreq_get_max_state(unsigned int cpu) if (!cpu_has_cpufreq(cpu)) return 0; - return CPUFREQ_THERMAL_MAX_STEP; + return cpufreq_thermal_max_step; } static int cpufreq_get_cur_state(unsigned int cpu) @@ -79,7 +90,7 @@ static int cpufreq_get_cur_state(unsigned int cpu) if (!cpu_has_cpufreq(cpu)) return 0; - return reduction_pctg(cpu); + return reduction_step(cpu); } static int cpufreq_set_cur_state(unsigned int cpu, int state) @@ -92,7 +103,7 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state) if (!cpu_has_cpufreq(cpu)) return 0; - reduction_pctg(cpu) = state; + reduction_step(cpu) = state; /* * Update all the CPUs in the same package because they all @@ -113,7 +124,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state) if (!policy) return -EINVAL; - max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100; + max_freq = (policy->cpuinfo.max_freq * + (100 - reduction_step(i) * cpufreq_thermal_reduction_pctg)) / 100; cpufreq_cpu_put(policy); @@ -126,10 +138,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state) return 0; } +static void acpi_thermal_cpufreq_config(void) +{ + int cpufreq_pctg = acpi_arch_thermal_cpufreq_pctg(); + + if (!cpufreq_pctg) + return; + + cpufreq_thermal_reduction_pctg = cpufreq_pctg; + + /* + * Derive the MAX_STEP from minimum throttle percentage so that the reduction + * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that + * the CPU performance doesn't become 0. + */ + cpufreq_thermal_max_step = (100 / cpufreq_pctg) - 2; +} + void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy) { unsigned int cpu; + acpi_thermal_cpufreq_config(); + for_each_cpu(cpu, policy->related_cpus) { struct acpi_processor *pr = per_cpu(processors, cpu); int ret; @@ -190,7 +221,7 @@ static int acpi_processor_max_state(struct acpi_processor *pr) /* * There exists four states according to - * cpufreq_thermal_reduction_pctg. 0, 1, 2, 3 + * cpufreq_thermal_reduction_step. 0, 1, 2, 3 */ max_state += cpufreq_get_max_state(pr->id); if (pr->flags.throttling) From bbc96513a76aed223f149952036910a184a6fca6 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 12 Dec 2023 22:48:38 +0100 Subject: [PATCH 344/548] ACPI: arm64: export acpi_arch_thermal_cpufreq_pctg() The cpufreq code can be in a loadable module, so the architecture support for it has to be exported: ERROR: modpost: "acpi_arch_thermal_cpufreq_pctg" [drivers/acpi/processor.ko] undefined! Fixes: 310293a2b941 ("ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241") Signed-off-by: Arnd Bergmann Signed-off-by: Rafael J. Wysocki (cherry picked from commit ccb45b34d44016b91fa75646741d317d6d6fdeea) Signed-off-by: Koba Ko --- drivers/acpi/arm64/thermal_cpufreq.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c index d524f2cd6044d..582854914c5cd 100644 --- a/drivers/acpi/arm64/thermal_cpufreq.c +++ b/drivers/acpi/arm64/thermal_cpufreq.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only #include +#include #include "../internal.h" @@ -18,3 +19,4 @@ int acpi_arch_thermal_cpufreq_pctg(void) return 0; } +EXPORT_SYMBOL_GPL(acpi_arch_thermal_cpufreq_pctg); From 50610a51b757cc571c197ab697132bcbbf0c7b7a Mon Sep 17 00:00:00 2001 From: Andrea Righi Date: Thu, 7 Aug 2025 06:34:06 +0200 Subject: [PATCH 345/548] NVIDIA: SAUCE: ci: add github action Signed-off-by: Andrea Righi --- .github/workflows/config_4k | 10928 +++++++++++++++++++++++++++ .github/workflows/config_64k | 10925 ++++++++++++++++++++++++++ .github/workflows/kernel-build.yml | 77 + .gitignore | 1 + 4 files changed, 21931 insertions(+) create mode 100644 .github/workflows/config_4k create mode 100644 .github/workflows/config_64k create mode 100644 .github/workflows/kernel-build.yml diff --git a/.github/workflows/config_4k b/.github/workflows/config_4k new file mode 100644 index 0000000000000..333afb8ea71d2 --- /dev/null +++ b/.github/workflows/config_4k @@ -0,0 +1,10928 @@ +# +# Automatically generated file; DO NOT EDIT. +# Linux/arm64 6.6.3 Kernel Configuration +# +CONFIG_CC_VERSION_TEXT="gcc (Debian 8.3.0-6) 8.3.0" +CONFIG_CC_IS_GCC=y +CONFIG_GCC_VERSION=80300 +CONFIG_CLANG_VERSION=0 +CONFIG_AS_IS_GNU=y +CONFIG_AS_VERSION=23101 +CONFIG_LD_IS_BFD=y +CONFIG_LD_VERSION=23101 +CONFIG_LLD_VERSION=0 +CONFIG_CC_CAN_LINK=y +CONFIG_CC_CAN_LINK_STATIC=y +CONFIG_CC_HAS_ASM_INLINE=y +CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y +CONFIG_PAHOLE_VERSION=120 +CONFIG_IRQ_WORK=y +CONFIG_BUILDTIME_TABLE_SORT=y +CONFIG_THREAD_INFO_IN_TASK=y + +# +# General setup +# +CONFIG_INIT_ENV_ARG_LIMIT=32 +# CONFIG_COMPILE_TEST is not set +# CONFIG_WERROR is not set +CONFIG_LOCALVERSION="" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_BUILD_SALT="6.6.y.bsk.z-arm64" +CONFIG_DEFAULT_INIT="" +CONFIG_DEFAULT_HOSTNAME="(none)" +CONFIG_SYSVIPC=y +CONFIG_SYSVIPC_SYSCTL=y +CONFIG_SYSVIPC_COMPAT=y +CONFIG_POSIX_MQUEUE=y +CONFIG_POSIX_MQUEUE_SYSCTL=y +# CONFIG_WATCH_QUEUE is not set +CONFIG_CROSS_MEMORY_ATTACH=y +# CONFIG_USELIB is not set +CONFIG_AUDIT=y +CONFIG_HAVE_ARCH_AUDITSYSCALL=y +CONFIG_AUDITSYSCALL=y + +# +# IRQ subsystem +# +CONFIG_GENERIC_IRQ_PROBE=y +CONFIG_GENERIC_IRQ_SHOW=y +CONFIG_GENERIC_IRQ_SHOW_LEVEL=y +CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y +CONFIG_GENERIC_IRQ_MIGRATION=y +CONFIG_GENERIC_IRQ_INJECTION=y +CONFIG_HARDIRQS_SW_RESEND=y +CONFIG_GENERIC_IRQ_CHIP=y +CONFIG_IRQ_DOMAIN=y +CONFIG_IRQ_DOMAIN_HIERARCHY=y +CONFIG_IRQ_FASTEOI_HIERARCHY_HANDLERS=y +CONFIG_GENERIC_IRQ_IPI=y +CONFIG_GENERIC_MSI_IRQ=y +CONFIG_IRQ_MSI_IOMMU=y +CONFIG_IRQ_FORCED_THREADING=y +CONFIG_SPARSE_IRQ=y +# CONFIG_GENERIC_IRQ_DEBUGFS is not set +# end of IRQ subsystem + +CONFIG_GENERIC_TIME_VSYSCALL=y +CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_ARCH_HAS_TICK_BROADCAST=y +CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y +CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_CONTEXT_TRACKING=y +CONFIG_CONTEXT_TRACKING_IDLE=y + +# +# Timers subsystem +# +CONFIG_TICK_ONESHOT=y +CONFIG_NO_HZ_COMMON=y +# CONFIG_HZ_PERIODIC is not set +CONFIG_NO_HZ_IDLE=y +# CONFIG_NO_HZ_FULL is not set +# CONFIG_NO_HZ is not set +CONFIG_HIGH_RES_TIMERS=y +# end of Timers subsystem + +CONFIG_BPF=y +CONFIG_HAVE_EBPF_JIT=y +CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y + +# +# BPF subsystem +# +CONFIG_BPF_SYSCALL=y +CONFIG_BPF_JIT=y +# CONFIG_BPF_JIT_ALWAYS_ON is not set +CONFIG_BPF_JIT_DEFAULT_ON=y +# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set +# CONFIG_BPF_PRELOAD is not set +CONFIG_BPF_LSM=y +# end of BPF subsystem + +CONFIG_PREEMPT_VOLUNTARY_BUILD=y +# CONFIG_PREEMPT_NONE is not set +CONFIG_PREEMPT_VOLUNTARY=y +# CONFIG_PREEMPT is not set +# CONFIG_PREEMPT_DYNAMIC is not set +# CONFIG_SCHED_CORE is not set + +# +# CPU/Task time and stats accounting +# +CONFIG_TICK_CPU_ACCOUNTING=y +# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set +# CONFIG_IRQ_TIME_ACCOUNTING is not set +CONFIG_SCHED_THERMAL_PRESSURE=y +CONFIG_BSD_PROCESS_ACCT=y +CONFIG_BSD_PROCESS_ACCT_V3=y +CONFIG_TASKSTATS=y +CONFIG_TASK_DELAY_ACCT=y +CONFIG_TASK_XACCT=y +CONFIG_TASK_IO_ACCOUNTING=y +CONFIG_PSI=y +# CONFIG_PSI_DEFAULT_DISABLED is not set +# end of CPU/Task time and stats accounting + +CONFIG_CPU_ISOLATION=y + +# +# RCU Subsystem +# +CONFIG_TREE_RCU=y +# CONFIG_RCU_EXPERT is not set +CONFIG_TREE_SRCU=y +CONFIG_TASKS_RCU_GENERIC=y +CONFIG_TASKS_RUDE_RCU=y +CONFIG_TASKS_TRACE_RCU=y +CONFIG_RCU_STALL_COMMON=y +CONFIG_RCU_NEED_SEGCBLIST=y +# end of RCU Subsystem + +# CONFIG_IKCONFIG is not set +# CONFIG_IKHEADERS is not set +CONFIG_LOG_BUF_SHIFT=17 +CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 +# CONFIG_PRINTK_INDEX is not set +CONFIG_GENERIC_SCHED_CLOCK=y + +# +# Scheduler features +# +# CONFIG_UCLAMP_TASK is not set +# end of Scheduler features + +CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y +CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y +CONFIG_CC_HAS_INT128=y +CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5" +CONFIG_GCC11_NO_ARRAY_BOUNDS=y +CONFIG_ARCH_SUPPORTS_INT128=y +CONFIG_NUMA_BALANCING=y +# CONFIG_NUMA_BALANCING_DEFAULT_ENABLED is not set +CONFIG_CGROUPS=y +CONFIG_PAGE_COUNTER=y +# CONFIG_CGROUP_FAVOR_DYNMODS is not set +CONFIG_MEMCG=y +CONFIG_MEMCG_BGD_RECLAIM=y +CONFIG_MEMCG_KMEM=y +CONFIG_BLK_CGROUP=y +CONFIG_CGROUP_WRITEBACK=y +CONFIG_CGROUP_SCHED=y +CONFIG_FAIR_GROUP_SCHED=y +CONFIG_CFS_BANDWIDTH=y +# CONFIG_RT_GROUP_SCHED is not set +CONFIG_SCHED_MM_CID=y +CONFIG_CGROUP_PIDS=y +CONFIG_CGROUP_RDMA=y +CONFIG_CGROUP_FREEZER=y +CONFIG_CGROUP_HUGETLB=y +CONFIG_CPUSETS=y +CONFIG_PROC_PID_CPUSET=y +CONFIG_CGROUP_DEVICE=y +CONFIG_CGROUP_CPUACCT=y +CONFIG_CGROUP_PERF=y +CONFIG_CGROUP_BPF=y +# CONFIG_CGROUP_MISC is not set +# CONFIG_CGROUP_DEBUG is not set +CONFIG_SOCK_CGROUP_DATA=y +CONFIG_NAMESPACES=y +CONFIG_UTS_NS=y +CONFIG_TIME_NS=y +CONFIG_IPC_NS=y +CONFIG_USER_NS=y +CONFIG_PID_NS=y +CONFIG_NET_NS=y +CONFIG_CHECKPOINT_RESTORE=y +CONFIG_SCHED_AUTOGROUP=y +CONFIG_RELAY=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="" +CONFIG_RD_GZIP=y +CONFIG_RD_BZIP2=y +CONFIG_RD_LZMA=y +CONFIG_RD_XZ=y +CONFIG_RD_LZO=y +CONFIG_RD_LZ4=y +CONFIG_RD_ZSTD=y +# CONFIG_BOOT_CONFIG is not set +CONFIG_INITRAMFS_PRESERVE_MTIME=y +CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_LD_ORPHAN_WARN=y +CONFIG_LD_ORPHAN_WARN_LEVEL="warn" +CONFIG_SYSCTL=y +CONFIG_HAVE_UID16=y +CONFIG_SYSCTL_EXCEPTION_TRACE=y +CONFIG_EXPERT=y +CONFIG_UID16=y +CONFIG_MULTIUSER=y +# CONFIG_SGETMASK_SYSCALL is not set +# CONFIG_SYSFS_SYSCALL is not set +CONFIG_FHANDLE=y +CONFIG_POSIX_TIMERS=y +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_ELF_CORE=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_FUTEX_PI=y +CONFIG_EPOLL=y +CONFIG_SIGNALFD=y +CONFIG_TIMERFD=y +CONFIG_EVENTFD=y +CONFIG_SHMEM=y +CONFIG_AIO=y +CONFIG_IO_URING=y +CONFIG_ADVISE_SYSCALLS=y +CONFIG_MEMBARRIER=y +CONFIG_KALLSYMS=y +# CONFIG_KALLSYMS_SELFTEST is not set +CONFIG_KALLSYMS_ALL=y +CONFIG_KALLSYMS_BASE_RELATIVE=y +CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y +CONFIG_KCMP=y +CONFIG_RSEQ=y +CONFIG_CACHESTAT_SYSCALL=y +# CONFIG_DEBUG_RSEQ is not set +CONFIG_HAVE_PERF_EVENTS=y +CONFIG_GUEST_PERF_EVENTS=y +# CONFIG_PC104 is not set + +# +# Kernel Performance Events And Counters +# +CONFIG_PERF_EVENTS=y +# CONFIG_DEBUG_PERF_USE_VMALLOC is not set +# end of Kernel Performance Events And Counters + +CONFIG_SYSTEM_DATA_VERIFICATION=y +CONFIG_PROFILING=y +CONFIG_TRACEPOINTS=y + +# +# Kexec and crash features +# +CONFIG_CRASH_CORE=y +CONFIG_KEXEC_CORE=y +CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y +CONFIG_CRASH_DUMP=y +# end of Kexec and crash features +# end of General setup + +CONFIG_ARM64=y +CONFIG_GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_64BIT=y +CONFIG_MMU=y +CONFIG_ARM64_PAGE_SHIFT=12 +CONFIG_ARM64_CONT_PTE_SHIFT=4 +CONFIG_ARM64_CONT_PMD_SHIFT=4 +CONFIG_ARCH_MMAP_RND_BITS_MIN=18 +CONFIG_ARCH_MMAP_RND_BITS_MAX=33 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=11 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 +CONFIG_STACKTRACE_SUPPORT=y +CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000 +CONFIG_LOCKDEP_SUPPORT=y +CONFIG_GENERIC_BUG=y +CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y +CONFIG_GENERIC_HWEIGHT=y +CONFIG_GENERIC_CSUM=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_SMP=y +CONFIG_KERNEL_MODE_NEON=y +CONFIG_FIX_EARLYCON_MEM=y +CONFIG_PGTABLE_LEVELS=4 +CONFIG_ARCH_SUPPORTS_UPROBES=y +CONFIG_ARCH_PROC_KCORE_TEXT=y + +# +# Platform selection +# +# CONFIG_ARCH_ACTIONS is not set +CONFIG_ARCH_SUNXI=y +# CONFIG_ARCH_ALPINE is not set +# CONFIG_ARCH_APPLE is not set +# CONFIG_ARCH_BCM is not set +# CONFIG_ARCH_BERLIN is not set +# CONFIG_ARCH_BITMAIN is not set +# CONFIG_ARCH_EXYNOS is not set +# CONFIG_ARCH_SPARX5 is not set +# CONFIG_ARCH_K3 is not set +# CONFIG_ARCH_LG1K is not set +CONFIG_ARCH_HISI=y +# CONFIG_ARCH_KEEMBAY is not set +# CONFIG_ARCH_MEDIATEK is not set +CONFIG_ARCH_MESON=y +CONFIG_ARCH_MVEBU=y +# CONFIG_ARCH_NXP is not set +# CONFIG_ARCH_MA35 is not set +# CONFIG_ARCH_NPCM is not set +CONFIG_ARCH_QCOM=y +# CONFIG_ARCH_REALTEK is not set +# CONFIG_ARCH_RENESAS is not set +CONFIG_ARCH_ROCKCHIP=y +CONFIG_ARCH_SEATTLE=y +# CONFIG_ARCH_INTEL_SOCFPGA is not set +# CONFIG_ARCH_STM32 is not set +CONFIG_ARCH_SYNQUACER=y +CONFIG_ARCH_TEGRA=y +# CONFIG_ARCH_SPRD is not set +CONFIG_ARCH_THUNDER=y +CONFIG_ARCH_THUNDER2=y +# CONFIG_ARCH_UNIPHIER is not set +CONFIG_ARCH_VEXPRESS=y +# CONFIG_ARCH_VISCONTI is not set +CONFIG_ARCH_XGENE=y +CONFIG_ARCH_ZYNQMP=y +# end of Platform selection + +# +# Kernel Features +# + +# +# ARM errata workarounds via the alternatives framework +# +CONFIG_AMPERE_ERRATUM_AC03_CPU_38=y +CONFIG_ARM64_WORKAROUND_CLEAN_CACHE=y +CONFIG_ARM64_ERRATUM_826319=y +CONFIG_ARM64_ERRATUM_827319=y +CONFIG_ARM64_ERRATUM_824069=y +CONFIG_ARM64_ERRATUM_819472=y +CONFIG_ARM64_ERRATUM_832075=y +CONFIG_ARM64_ERRATUM_834220=y +CONFIG_ARM64_ERRATUM_1742098=y +CONFIG_ARM64_ERRATUM_845719=y +CONFIG_ARM64_ERRATUM_843419=y +CONFIG_ARM64_LD_HAS_FIX_ERRATUM_843419=y +CONFIG_ARM64_ERRATUM_1024718=y +CONFIG_ARM64_ERRATUM_1418040=y +CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT=y +CONFIG_ARM64_ERRATUM_1165522=y +CONFIG_ARM64_ERRATUM_1319367=y +CONFIG_ARM64_ERRATUM_1530923=y +CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y +CONFIG_ARM64_ERRATUM_2441007=y +CONFIG_ARM64_ERRATUM_1286807=y +CONFIG_ARM64_ERRATUM_1463225=y +CONFIG_ARM64_ERRATUM_1542419=y +CONFIG_ARM64_ERRATUM_1508412=y +CONFIG_ARM64_ERRATUM_2051678=y +CONFIG_ARM64_ERRATUM_2077057=y +CONFIG_ARM64_ERRATUM_2658417=y +CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y +CONFIG_ARM64_ERRATUM_2054223=y +CONFIG_ARM64_ERRATUM_2067961=y +CONFIG_ARM64_ERRATUM_2441009=y +CONFIG_ARM64_ERRATUM_2457168=y +CONFIG_ARM64_ERRATUM_2645198=y +CONFIG_ARM64_ERRATUM_2966298=y +CONFIG_CAVIUM_ERRATUM_22375=y +CONFIG_CAVIUM_ERRATUM_23144=y +CONFIG_CAVIUM_ERRATUM_23154=y +CONFIG_CAVIUM_ERRATUM_27456=y +CONFIG_CAVIUM_ERRATUM_30115=y +CONFIG_CAVIUM_TX2_ERRATUM_219=y +CONFIG_FUJITSU_ERRATUM_010001=y +CONFIG_HISILICON_ERRATUM_161600802=y +CONFIG_QCOM_FALKOR_ERRATUM_1003=y +CONFIG_QCOM_FALKOR_ERRATUM_1009=y +CONFIG_QCOM_QDF2400_ERRATUM_0065=y +CONFIG_QCOM_FALKOR_ERRATUM_E1041=y +CONFIG_NVIDIA_CARMEL_CNP_ERRATUM=y +CONFIG_ROCKCHIP_ERRATUM_3588001=y +CONFIG_SOCIONEXT_SYNQUACER_PREITS=y +# end of ARM errata workarounds via the alternatives framework + +CONFIG_ARM64_4K_PAGES=y +# CONFIG_ARM64_16K_PAGES is not set +# CONFIG_ARM64_64K_PAGES is not set +# CONFIG_ARM64_VA_BITS_39 is not set +CONFIG_ARM64_VA_BITS_48=y +CONFIG_ARM64_VA_BITS=48 +CONFIG_ARM64_PA_BITS_48=y +CONFIG_ARM64_PA_BITS=48 +# CONFIG_CPU_BIG_ENDIAN is not set +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_SCHED_MC=y +# CONFIG_SCHED_CLUSTER is not set +CONFIG_SCHED_SMT=y +CONFIG_NR_CPUS=256 +CONFIG_HOTPLUG_CPU=y +CONFIG_NUMA=y +CONFIG_NODES_SHIFT=6 +# CONFIG_HZ_100 is not set +CONFIG_HZ_250=y +# CONFIG_HZ_300 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=250 +CONFIG_SCHED_HRTICK=y +CONFIG_ARCH_SPARSEMEM_ENABLE=y +CONFIG_HW_PERF_EVENTS=y +CONFIG_PARAVIRT=y +CONFIG_PARAVIRT_TIME_ACCOUNTING=y +CONFIG_ARCH_SUPPORTS_KEXEC=y +CONFIG_ARCH_SUPPORTS_KEXEC_FILE=y +CONFIG_ARCH_SUPPORTS_KEXEC_SIG=y +CONFIG_ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG=y +CONFIG_ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG=y +CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y +CONFIG_TRANS_TABLE=y +CONFIG_XEN_DOM0=y +CONFIG_XEN=y +CONFIG_ARCH_FORCE_MAX_ORDER=10 +CONFIG_UNMAP_KERNEL_AT_EL0=y +CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY=y +CONFIG_RODATA_FULL_DEFAULT_ENABLED=y +# CONFIG_ARM64_SW_TTBR0_PAN is not set +CONFIG_ARM64_TAGGED_ADDR_ABI=y +CONFIG_COMPAT=y +CONFIG_KUSER_HELPERS=y +# CONFIG_COMPAT_ALIGNMENT_FIXUPS is not set +CONFIG_ARMV8_DEPRECATED=y +CONFIG_SWP_EMULATION=y +CONFIG_CP15_BARRIER_EMULATION=y +CONFIG_SETEND_EMULATION=y + +# +# ARMv8.1 architectural features +# +CONFIG_ARM64_HW_AFDBM=y +CONFIG_ARM64_PAN=y +CONFIG_AS_HAS_LSE_ATOMICS=y +CONFIG_ARM64_LSE_ATOMICS=y +CONFIG_ARM64_USE_LSE_ATOMICS=y +# end of ARMv8.1 architectural features + +# +# ARMv8.2 architectural features +# +CONFIG_AS_HAS_ARMV8_2=y +CONFIG_AS_HAS_SHA3=y +CONFIG_ARM64_PMEM=y +CONFIG_ARM64_RAS_EXTN=y +CONFIG_ARM64_CNP=y +# end of ARMv8.2 architectural features + +# +# ARMv8.3 architectural features +# +CONFIG_ARM64_PTR_AUTH=y +CONFIG_ARM64_PTR_AUTH_KERNEL=y +CONFIG_CC_HAS_SIGN_RETURN_ADDRESS=y +CONFIG_AS_HAS_ARMV8_3=y +CONFIG_AS_HAS_LDAPR=y +# end of ARMv8.3 architectural features + +# +# ARMv8.4 architectural features +# +CONFIG_ARM64_AMU_EXTN=y +CONFIG_AS_HAS_ARMV8_4=y +CONFIG_ARM64_TLB_RANGE=y +# end of ARMv8.4 architectural features + +# +# ARMv8.5 architectural features +# +CONFIG_ARM64_BTI=y +CONFIG_ARM64_E0PD=y +# end of ARMv8.5 architectural features + +# +# ARMv8.7 architectural features +# +CONFIG_ARM64_EPAN=y +# end of ARMv8.7 architectural features + +CONFIG_ARM64_SVE=y +CONFIG_ARM64_SME=y +CONFIG_ARM64_PSEUDO_NMI=y +# CONFIG_ARM64_DEBUG_PRIORITY_MASKING is not set +CONFIG_RELOCATABLE=y +CONFIG_RANDOMIZE_BASE=y +CONFIG_RANDOMIZE_MODULE_REGION_FULL=y +CONFIG_HAVE_LIVEPATCH=y +CONFIG_LIVEPATCH=y +# end of Kernel Features + +# +# Boot options +# +CONFIG_ARM64_ACPI_PARKING_PROTOCOL=y +CONFIG_CMDLINE="" +CONFIG_EFI_STUB=y +CONFIG_EFI=y +CONFIG_DMI=y +# end of Boot options + +# +# Power management options +# +CONFIG_SUSPEND=y +CONFIG_SUSPEND_FREEZER=y +# CONFIG_SUSPEND_SKIP_SYNC is not set +CONFIG_HIBERNATE_CALLBACKS=y +CONFIG_HIBERNATION=y +CONFIG_HIBERNATION_SNAPSHOT_DEV=y +CONFIG_PM_STD_PARTITION="" +CONFIG_PM_SLEEP=y +CONFIG_PM_SLEEP_SMP=y +# CONFIG_PM_AUTOSLEEP is not set +# CONFIG_PM_USERSPACE_AUTOSLEEP is not set +# CONFIG_PM_WAKELOCKS is not set +CONFIG_PM=y +CONFIG_PM_DEBUG=y +CONFIG_PM_ADVANCED_DEBUG=y +# CONFIG_PM_TEST_SUSPEND is not set +CONFIG_PM_SLEEP_DEBUG=y +# CONFIG_DPM_WATCHDOG is not set +CONFIG_PM_CLK=y +CONFIG_PM_GENERIC_DOMAINS=y +# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set +CONFIG_PM_GENERIC_DOMAINS_SLEEP=y +CONFIG_PM_GENERIC_DOMAINS_OF=y +CONFIG_CPU_PM=y +CONFIG_ENERGY_MODEL=y +CONFIG_ARCH_HIBERNATION_POSSIBLE=y +CONFIG_ARCH_HIBERNATION_HEADER=y +CONFIG_ARCH_SUSPEND_POSSIBLE=y +# end of Power management options + +# +# CPU Power Management +# + +# +# CPU Idle +# +CONFIG_CPU_IDLE=y +CONFIG_CPU_IDLE_GOV_LADDER=y +CONFIG_CPU_IDLE_GOV_MENU=y +# CONFIG_CPU_IDLE_GOV_TEO is not set + +# +# ARM CPU Idle Drivers +# +# CONFIG_ARM_PSCI_CPUIDLE is not set +# end of ARM CPU Idle Drivers +# end of CPU Idle + +# +# CPU Frequency scaling +# +CONFIG_CPU_FREQ=y +CONFIG_CPU_FREQ_GOV_ATTR_SET=y +CONFIG_CPU_FREQ_GOV_COMMON=y +CONFIG_CPU_FREQ_STAT=y +# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set +CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL=y +CONFIG_CPU_FREQ_GOV_PERFORMANCE=y +CONFIG_CPU_FREQ_GOV_POWERSAVE=m +CONFIG_CPU_FREQ_GOV_USERSPACE=m +CONFIG_CPU_FREQ_GOV_ONDEMAND=m +CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m +CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y + +# +# CPU frequency scaling drivers +# +CONFIG_CPUFREQ_DT=m +CONFIG_CPUFREQ_DT_PLATDEV=y +CONFIG_ACPI_CPPC_CPUFREQ=m +CONFIG_ACPI_CPPC_CPUFREQ_FIE=y +CONFIG_ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM=m +CONFIG_ARM_ARMADA_37XX_CPUFREQ=m +# CONFIG_ARM_ARMADA_8K_CPUFREQ is not set +# CONFIG_ARM_QCOM_CPUFREQ_HW is not set +CONFIG_ARM_TEGRA20_CPUFREQ=m +CONFIG_ARM_TEGRA124_CPUFREQ=y +# end of CPU Frequency scaling +# end of CPU Power Management + +CONFIG_ARCH_SUPPORTS_ACPI=y +CONFIG_ACPI=y +CONFIG_ACPI_GENERIC_GSI=y +CONFIG_ACPI_CCA_REQUIRED=y +# CONFIG_ACPI_DEBUGGER is not set +CONFIG_ACPI_SPCR_TABLE=y +# CONFIG_ACPI_FPDT is not set +# CONFIG_ACPI_EC_DEBUGFS is not set +CONFIG_ACPI_AC=y +CONFIG_ACPI_BATTERY=y +CONFIG_ACPI_BUTTON=y +CONFIG_ACPI_VIDEO=m +CONFIG_ACPI_FAN=y +CONFIG_ACPI_TAD=m +# CONFIG_ACPI_DOCK is not set +CONFIG_ACPI_PROCESSOR_IDLE=y +CONFIG_ACPI_MCFG=y +CONFIG_ACPI_CPPC_LIB=y +CONFIG_ACPI_PROCESSOR=y +CONFIG_ACPI_IPMI=m +CONFIG_ACPI_HOTPLUG_CPU=y +CONFIG_ACPI_THERMAL=y +CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y +CONFIG_ACPI_TABLE_UPGRADE=y +# CONFIG_ACPI_DEBUG is not set +CONFIG_ACPI_PCI_SLOT=y +CONFIG_ACPI_CONTAINER=y +# CONFIG_ACPI_HOTPLUG_MEMORY is not set +CONFIG_ACPI_HED=y +# CONFIG_ACPI_CUSTOM_METHOD is not set +CONFIG_ACPI_BGRT=y +CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y +CONFIG_ACPI_NFIT=m +# CONFIG_NFIT_SECURITY_DEBUG is not set +CONFIG_ACPI_NUMA=y +# CONFIG_ACPI_HMAT is not set +CONFIG_HAVE_ACPI_APEI=y +CONFIG_ACPI_APEI=y +CONFIG_ACPI_APEI_GHES=y +CONFIG_ACPI_APEI_PCIEAER=y +CONFIG_ACPI_APEI_SEA=y +CONFIG_ACPI_APEI_MEMORY_FAILURE=y +CONFIG_ACPI_APEI_EINJ=m +# CONFIG_ACPI_APEI_ERST_DEBUG is not set +CONFIG_ACPI_WATCHDOG=y +# CONFIG_ACPI_CONFIGFS is not set +# CONFIG_ACPI_PFRUT is not set +CONFIG_ACPI_IORT=y +CONFIG_ACPI_GTDT=y +# CONFIG_ACPI_AGDI is not set +CONFIG_ACPI_APMT=y +CONFIG_ACPI_PPTT=y +CONFIG_ACPI_PCC=y +# CONFIG_ACPI_FFH is not set +# CONFIG_PMIC_OPREGION is not set +CONFIG_ACPI_PRMT=y +CONFIG_IRQ_BYPASS_MANAGER=y +CONFIG_HAVE_KVM=y +CONFIG_HAVE_KVM_IRQCHIP=y +CONFIG_HAVE_KVM_IRQFD=y +CONFIG_HAVE_KVM_IRQ_ROUTING=y +CONFIG_HAVE_KVM_DIRTY_RING=y +CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y +CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP=y +CONFIG_HAVE_KVM_EVENTFD=y +CONFIG_KVM_MMIO=y +CONFIG_HAVE_KVM_MSI=y +CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y +CONFIG_KVM_VFIO=y +CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y +CONFIG_HAVE_KVM_IRQ_BYPASS=y +CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE=y +CONFIG_KVM_XFER_TO_GUEST_WORK=y +CONFIG_KVM_GENERIC_HARDWARE_ENABLING=y +CONFIG_VIRTUALIZATION=y +CONFIG_KVM=y +# CONFIG_NVHE_EL2_DEBUG is not set + +# +# General architecture-dependent options +# +CONFIG_HOTPLUG_CORE_SYNC=y +CONFIG_HOTPLUG_CORE_SYNC_DEAD=y +CONFIG_KPROBES=y +CONFIG_JUMP_LABEL=y +# CONFIG_STATIC_KEYS_SELFTEST is not set +CONFIG_UPROBES=y +CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y +CONFIG_KRETPROBES=y +CONFIG_HAVE_IOREMAP_PROT=y +CONFIG_HAVE_KPROBES=y +CONFIG_HAVE_KRETPROBES=y +CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y +CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y +CONFIG_HAVE_NMI=y +CONFIG_TRACE_IRQFLAGS_SUPPORT=y +CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y +CONFIG_HAVE_ARCH_TRACEHOOK=y +CONFIG_HAVE_DMA_CONTIGUOUS=y +CONFIG_GENERIC_SMP_IDLE_THREAD=y +CONFIG_GENERIC_IDLE_POLL_SETUP=y +CONFIG_ARCH_HAS_FORTIFY_SOURCE=y +CONFIG_ARCH_HAS_KEEPINITRD=y +CONFIG_ARCH_HAS_SET_MEMORY=y +CONFIG_ARCH_HAS_SET_DIRECT_MAP=y +CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y +CONFIG_ARCH_WANTS_NO_INSTR=y +CONFIG_HAVE_ASM_MODVERSIONS=y +CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y +CONFIG_HAVE_RSEQ=y +CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y +CONFIG_HAVE_HW_BREAKPOINT=y +CONFIG_HAVE_PERF_EVENTS_NMI=y +CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y +CONFIG_HAVE_PERF_REGS=y +CONFIG_HAVE_PERF_USER_STACK_DUMP=y +CONFIG_HAVE_ARCH_JUMP_LABEL=y +CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y +CONFIG_MMU_GATHER_TABLE_FREE=y +CONFIG_MMU_GATHER_RCU_TABLE_FREE=y +CONFIG_MMU_LAZY_TLB_REFCOUNT=y +CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y +CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS=y +CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y +CONFIG_HAVE_CMPXCHG_LOCAL=y +CONFIG_HAVE_CMPXCHG_DOUBLE=y +CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y +CONFIG_HAVE_ARCH_SECCOMP=y +CONFIG_HAVE_ARCH_SECCOMP_FILTER=y +CONFIG_SECCOMP=y +CONFIG_SECCOMP_FILTER=y +# CONFIG_SECCOMP_CACHE_DEBUG is not set +CONFIG_HAVE_ARCH_STACKLEAK=y +CONFIG_HAVE_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR_STRONG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y +CONFIG_LTO_NONE=y +CONFIG_ARCH_SUPPORTS_CFI_CLANG=y +CONFIG_HAVE_CONTEXT_TRACKING_USER=y +CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y +CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y +CONFIG_HAVE_MOVE_PUD=y +CONFIG_HAVE_MOVE_PMD=y +CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y +CONFIG_HAVE_ARCH_HUGE_VMAP=y +CONFIG_HAVE_ARCH_HUGE_VMALLOC=y +CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y +CONFIG_ARCH_WANT_PMD_MKWRITE=y +CONFIG_HAVE_MOD_ARCH_SPECIFIC=y +CONFIG_MODULES_USE_ELF_RELA=y +CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y +CONFIG_SOFTIRQ_ON_OWN_STACK=y +CONFIG_ARCH_HAS_ELF_RANDOMIZE=y +CONFIG_HAVE_ARCH_MMAP_RND_BITS=y +CONFIG_ARCH_MMAP_RND_BITS=18 +CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y +CONFIG_ARCH_MMAP_RND_COMPAT_BITS=11 +CONFIG_PAGE_SIZE_LESS_THAN_64KB=y +CONFIG_PAGE_SIZE_LESS_THAN_256KB=y +CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT=y +CONFIG_HAVE_RELIABLE_STACKTRACE=y +CONFIG_CLONE_BACKWARDS=y +CONFIG_OLD_SIGSUSPEND3=y +CONFIG_COMPAT_OLD_SIGACTION=y +CONFIG_COMPAT_32BIT_TIME=y +CONFIG_HAVE_ARCH_VMAP_STACK=y +CONFIG_VMAP_STACK=y +CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y +CONFIG_RANDOMIZE_KSTACK_OFFSET=y +# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set +CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y +CONFIG_STRICT_MODULE_RWX=y +CONFIG_HAVE_ARCH_COMPILER_H=y +CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y +CONFIG_ARCH_USE_MEMREMAP_PROT=y +# CONFIG_LOCK_EVENT_COUNTS is not set +CONFIG_ARCH_HAS_RELR=y +CONFIG_HAVE_PREEMPT_DYNAMIC=y +CONFIG_HAVE_PREEMPT_DYNAMIC_KEY=y +CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y +CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y +CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y +CONFIG_ARCH_HAVE_TRACE_MMIO_ACCESS=y + +# +# GCOV-based kernel profiling +# +# CONFIG_GCOV_KERNEL is not set +CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y +# end of GCOV-based kernel profiling + +CONFIG_HAVE_GCC_PLUGINS=y +CONFIG_FUNCTION_ALIGNMENT_4B=y +CONFIG_FUNCTION_ALIGNMENT_8B=y +CONFIG_FUNCTION_ALIGNMENT=8 +# end of General architecture-dependent options + +CONFIG_RT_MUTEXES=y +CONFIG_BASE_SMALL=0 +CONFIG_MODULE_SIG_FORMAT=y +CONFIG_MODULES=y +# CONFIG_MODULE_DEBUG is not set +CONFIG_MODULE_FORCE_LOAD=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODULE_FORCE_UNLOAD=y +# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set +CONFIG_MODVERSIONS=y +CONFIG_ASM_MODVERSIONS=y +# CONFIG_MODULE_SRCVERSION_ALL is not set +CONFIG_MODULE_SIG=y +# CONFIG_MODULE_SIG_FORCE is not set +# CONFIG_MODULE_SIG_ALL is not set +# CONFIG_MODULE_SIG_SHA1 is not set +# CONFIG_MODULE_SIG_SHA224 is not set +CONFIG_MODULE_SIG_SHA256=y +# CONFIG_MODULE_SIG_SHA384 is not set +# CONFIG_MODULE_SIG_SHA512 is not set +CONFIG_MODULE_SIG_HASH="sha256" +CONFIG_MODULE_COMPRESS_NONE=y +# CONFIG_MODULE_COMPRESS_GZIP is not set +# CONFIG_MODULE_COMPRESS_XZ is not set +# CONFIG_MODULE_COMPRESS_ZSTD is not set +# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set +CONFIG_MODPROBE_PATH="/sbin/modprobe" +# CONFIG_TRIM_UNUSED_KSYMS is not set +CONFIG_MODULES_TREE_LOOKUP=y +CONFIG_BLOCK=y +CONFIG_BLOCK_LEGACY_AUTOLOAD=y +CONFIG_BLK_RQ_ALLOC_TIME=y +CONFIG_BLK_CGROUP_RWSTAT=y +CONFIG_BLK_CGROUP_PUNT_BIO=y +CONFIG_BLK_DEV_BSG_COMMON=y +CONFIG_BLK_ICQ=y +CONFIG_BLK_DEV_BSGLIB=y +CONFIG_BLK_DEV_INTEGRITY=y +CONFIG_BLK_DEV_INTEGRITY_T10=m +CONFIG_BLK_DEV_ZONED=y +CONFIG_BLK_DEV_THROTTLING=y +# CONFIG_BLK_DEV_THROTTLING_LOW is not set +CONFIG_BLK_WBT=y +CONFIG_BLK_WBT_MQ=y +# CONFIG_BLK_CGROUP_IOLATENCY is not set +# CONFIG_BLK_CGROUP_FC_APPID is not set +CONFIG_BLK_CGROUP_IOCOST=y +# CONFIG_BLK_CGROUP_IOPRIO is not set +CONFIG_BLK_DEBUG_FS=y +CONFIG_BLK_DEBUG_FS_ZONED=y +CONFIG_BLK_SED_OPAL=y +# CONFIG_BLK_INLINE_ENCRYPTION is not set + +# +# Partition Types +# +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_AIX_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +# CONFIG_MAC_PARTITION is not set +CONFIG_MSDOS_PARTITION=y +# CONFIG_BSD_DISKLABEL is not set +# CONFIG_MINIX_SUBPARTITION is not set +# CONFIG_SOLARIS_X86_PARTITION is not set +# CONFIG_UNIXWARE_DISKLABEL is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set +CONFIG_KARMA_PARTITION=y +CONFIG_EFI_PARTITION=y +# CONFIG_SYSV68_PARTITION is not set +# CONFIG_CMDLINE_PARTITION is not set +# end of Partition Types + +CONFIG_BLK_MQ_PCI=y +CONFIG_BLK_MQ_VIRTIO=y +CONFIG_BLK_PM=y +CONFIG_BLOCK_HOLDER_DEPRECATED=y +CONFIG_BLK_MQ_STACKING=y + +# +# IO Schedulers +# +CONFIG_MQ_IOSCHED_DEADLINE=y +CONFIG_MQ_IOSCHED_KYBER=m +CONFIG_IOSCHED_BFQ=m +CONFIG_BFQ_GROUP_IOSCHED=y +# CONFIG_BFQ_CGROUP_DEBUG is not set +# end of IO Schedulers + +CONFIG_PREEMPT_NOTIFIERS=y +CONFIG_PADATA=y +CONFIG_ASN1=y +CONFIG_ARCH_INLINE_SPIN_TRYLOCK=y +CONFIG_ARCH_INLINE_SPIN_TRYLOCK_BH=y +CONFIG_ARCH_INLINE_SPIN_LOCK=y +CONFIG_ARCH_INLINE_SPIN_LOCK_BH=y +CONFIG_ARCH_INLINE_SPIN_LOCK_IRQ=y +CONFIG_ARCH_INLINE_SPIN_LOCK_IRQSAVE=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK_BH=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQ=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE=y +CONFIG_ARCH_INLINE_READ_LOCK=y +CONFIG_ARCH_INLINE_READ_LOCK_BH=y +CONFIG_ARCH_INLINE_READ_LOCK_IRQ=y +CONFIG_ARCH_INLINE_READ_LOCK_IRQSAVE=y +CONFIG_ARCH_INLINE_READ_UNLOCK=y +CONFIG_ARCH_INLINE_READ_UNLOCK_BH=y +CONFIG_ARCH_INLINE_READ_UNLOCK_IRQ=y +CONFIG_ARCH_INLINE_READ_UNLOCK_IRQRESTORE=y +CONFIG_ARCH_INLINE_WRITE_LOCK=y +CONFIG_ARCH_INLINE_WRITE_LOCK_BH=y +CONFIG_ARCH_INLINE_WRITE_LOCK_IRQ=y +CONFIG_ARCH_INLINE_WRITE_LOCK_IRQSAVE=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK_BH=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQ=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE=y +CONFIG_INLINE_SPIN_TRYLOCK=y +CONFIG_INLINE_SPIN_TRYLOCK_BH=y +CONFIG_INLINE_SPIN_LOCK=y +CONFIG_INLINE_SPIN_LOCK_BH=y +CONFIG_INLINE_SPIN_LOCK_IRQ=y +CONFIG_INLINE_SPIN_LOCK_IRQSAVE=y +CONFIG_INLINE_SPIN_UNLOCK_BH=y +CONFIG_INLINE_SPIN_UNLOCK_IRQ=y +CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE=y +CONFIG_INLINE_READ_LOCK=y +CONFIG_INLINE_READ_LOCK_BH=y +CONFIG_INLINE_READ_LOCK_IRQ=y +CONFIG_INLINE_READ_LOCK_IRQSAVE=y +CONFIG_INLINE_READ_UNLOCK=y +CONFIG_INLINE_READ_UNLOCK_BH=y +CONFIG_INLINE_READ_UNLOCK_IRQ=y +CONFIG_INLINE_READ_UNLOCK_IRQRESTORE=y +CONFIG_INLINE_WRITE_LOCK=y +CONFIG_INLINE_WRITE_LOCK_BH=y +CONFIG_INLINE_WRITE_LOCK_IRQ=y +CONFIG_INLINE_WRITE_LOCK_IRQSAVE=y +CONFIG_INLINE_WRITE_UNLOCK=y +CONFIG_INLINE_WRITE_UNLOCK_BH=y +CONFIG_INLINE_WRITE_UNLOCK_IRQ=y +CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE=y +CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y +CONFIG_MUTEX_SPIN_ON_OWNER=y +CONFIG_RWSEM_SPIN_ON_OWNER=y +CONFIG_LOCK_SPIN_ON_OWNER=y +CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y +CONFIG_QUEUED_SPINLOCKS=y +CONFIG_ARCH_USE_QUEUED_RWLOCKS=y +CONFIG_QUEUED_RWLOCKS=y +CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y +CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y +CONFIG_FREEZER=y + +# +# Executable file formats +# +CONFIG_BINFMT_ELF=y +CONFIG_COMPAT_BINFMT_ELF=y +CONFIG_ARCH_BINFMT_ELF_STATE=y +CONFIG_ARCH_BINFMT_ELF_EXTRA_PHDRS=y +CONFIG_ARCH_HAVE_ELF_PROT=y +CONFIG_ARCH_USE_GNU_PROPERTY=y +CONFIG_ELFCORE=y +CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y +CONFIG_BINFMT_SCRIPT=y +CONFIG_BINFMT_MISC=y +CONFIG_COREDUMP=y +# end of Executable file formats + +# +# Memory Management options +# +CONFIG_ZPOOL=y +CONFIG_SWAP=y +CONFIG_ZSWAP=y +# CONFIG_ZSWAP_DEFAULT_ON is not set +# CONFIG_ZSWAP_EXCLUSIVE_LOADS_DEFAULT_ON is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_DEFLATE is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZO=y +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_842 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4HC is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT="lzo" +CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y +# CONFIG_ZSWAP_ZPOOL_DEFAULT_Z3FOLD is not set +# CONFIG_ZSWAP_ZPOOL_DEFAULT_ZSMALLOC is not set +CONFIG_ZSWAP_ZPOOL_DEFAULT="zbud" +CONFIG_ZBUD=y +CONFIG_Z3FOLD=m +CONFIG_ZSMALLOC=m +# CONFIG_ZSMALLOC_STAT is not set +CONFIG_ZSMALLOC_CHAIN_SIZE=8 + +# +# SLAB allocator options +# +# CONFIG_SLAB_DEPRECATED is not set +CONFIG_SLUB=y +# CONFIG_SLUB_TINY is not set +CONFIG_SLAB_MERGE_DEFAULT=y +CONFIG_SLAB_FREELIST_RANDOM=y +CONFIG_SLAB_FREELIST_HARDENED=y +# CONFIG_SLUB_STATS is not set +CONFIG_SLUB_CPU_PARTIAL=y +# CONFIG_RANDOM_KMALLOC_CACHES is not set +# end of SLAB allocator options + +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SPARSEMEM=y +CONFIG_SPARSEMEM_EXTREME=y +CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y +CONFIG_SPARSEMEM_VMEMMAP=y +CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y +CONFIG_NUMA_KEEP_MEMINFO=y +CONFIG_MEMORY_ISOLATION=y +CONFIG_EXCLUSIVE_SYSTEM_RAM=y +CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y +CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y +CONFIG_MEMORY_HOTPLUG=y +# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set +# CONFIG_MEMORY_HOTREMOVE is not set +CONFIG_MHP_MEMMAP_ON_MEMORY=y +CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y +CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y +CONFIG_MEMORY_BALLOON=y +CONFIG_BALLOON_COMPACTION=y +CONFIG_COMPACTION=y +CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1 +CONFIG_PAGE_REPORTING=y +CONFIG_MIGRATION=y +CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y +CONFIG_ARCH_ENABLE_THP_MIGRATION=y +CONFIG_CONTIG_ALLOC=y +CONFIG_PHYS_ADDR_T_64BIT=y +CONFIG_MMU_NOTIFIER=y +CONFIG_KSM=y +CONFIG_DEFAULT_MMAP_MIN_ADDR=4096 +CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y +CONFIG_MEMORY_FAILURE=y +CONFIG_HWPOISON_INJECT=m +CONFIG_ARCH_WANTS_THP_SWAP=y +CONFIG_TRANSPARENT_HUGEPAGE=y +# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set +CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y +CONFIG_THP_SWAP=y +# CONFIG_READ_ONLY_THP_FOR_FS is not set +CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y +CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y +CONFIG_USE_PERCPU_NUMA_NODE_ID=y +CONFIG_HAVE_SETUP_PER_CPU_AREA=y +# CONFIG_CMA is not set +CONFIG_GENERIC_EARLY_IOREMAP=y +CONFIG_DEFERRED_STRUCT_PAGE_INIT=y +# CONFIG_IDLE_PAGE_TRACKING is not set +CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y +CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y +CONFIG_ARCH_HAS_PTE_DEVMAP=y +CONFIG_ARCH_HAS_ZONE_DMA_SET=y +CONFIG_ZONE_DMA=y +CONFIG_ZONE_DMA32=y +CONFIG_HMM_MIRROR=y +CONFIG_VM_EVENT_COUNTERS=y +# CONFIG_PERCPU_STATS is not set +# CONFIG_GUP_TEST is not set +# CONFIG_DMAPOOL_TEST is not set +CONFIG_ARCH_HAS_PTE_SPECIAL=y +CONFIG_MEMFD_CREATE=y +CONFIG_SECRETMEM=y +# CONFIG_ANON_VMA_NAME is not set +CONFIG_USERFAULTFD=y +CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y +# CONFIG_LRU_GEN is not set +CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y +CONFIG_PER_VMA_LOCK=y +CONFIG_LOCK_MM_AND_FIND_VMA=y + +# +# Data Access Monitoring +# +# CONFIG_DAMON is not set +# end of Data Access Monitoring +# end of Memory Management options + +CONFIG_NET=y +CONFIG_COMPAT_NETLINK_MESSAGES=y +CONFIG_NET_INGRESS=y +CONFIG_NET_EGRESS=y +CONFIG_NET_XGRESS=y +CONFIG_NET_REDIRECT=y +CONFIG_SKB_EXTENSIONS=y + +# +# Networking options +# +CONFIG_PACKET=y +CONFIG_PACKET_DIAG=m +CONFIG_UNIX=y +CONFIG_UNIX_SCM=y +CONFIG_AF_UNIX_OOB=y +CONFIG_UNIX_DIAG=m +# CONFIG_TLS is not set +CONFIG_XFRM=y +CONFIG_XFRM_OFFLOAD=y +CONFIG_XFRM_ALGO=m +CONFIG_XFRM_USER=m +CONFIG_XFRM_INTERFACE=m +CONFIG_XFRM_SUB_POLICY=y +CONFIG_XFRM_MIGRATE=y +CONFIG_XFRM_STATISTICS=y +CONFIG_XFRM_AH=m +CONFIG_XFRM_ESP=m +CONFIG_XFRM_IPCOMP=m +CONFIG_NET_KEY=m +CONFIG_NET_KEY_MIGRATE=y +CONFIG_SMC=m +CONFIG_SMC_DIAG=m +CONFIG_XDP_SOCKETS=y +# CONFIG_XDP_SOCKETS_DIAG is not set +CONFIG_NET_HANDSHAKE=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_FIB_TRIE_STATS=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_ROUTE_CLASSID=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=m +CONFIG_NET_IPGRE_DEMUX=m +CONFIG_NET_IP_TUNNEL=m +CONFIG_NET_IPGRE=m +CONFIG_NET_IPGRE_BROADCAST=y +CONFIG_IP_MROUTE_COMMON=y +CONFIG_IP_MROUTE=y +CONFIG_IP_MROUTE_MULTIPLE_TABLES=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_NET_IPVTI=m +CONFIG_NET_UDP_TUNNEL=m +CONFIG_NET_FOU=m +CONFIG_NET_FOU_IP_TUNNELS=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_ESP_OFFLOAD=m +# CONFIG_INET_ESPINTCP is not set +CONFIG_INET_IPCOMP=m +CONFIG_INET_TABLE_PERTURB_ORDER=16 +CONFIG_INET_XFRM_TUNNEL=m +CONFIG_INET_TUNNEL=m +CONFIG_INET_DIAG=m +CONFIG_INET_TCP_DIAG=m +CONFIG_INET_UDP_DIAG=m +CONFIG_INET_RAW_DIAG=m +CONFIG_INET_DIAG_DESTROY=y +CONFIG_TCP_CONG_ADVANCED=y +CONFIG_TCP_CONG_BIC=m +CONFIG_TCP_CONG_CUBIC=y +CONFIG_TCP_CONG_WESTWOOD=m +CONFIG_TCP_CONG_HTCP=m +CONFIG_TCP_CONG_HSTCP=m +CONFIG_TCP_CONG_HYBLA=m +CONFIG_TCP_CONG_VEGAS=m +CONFIG_TCP_CONG_NV=m +CONFIG_TCP_CONG_SCALABLE=m +CONFIG_TCP_CONG_LP=m +CONFIG_TCP_CONG_VENO=m +CONFIG_TCP_CONG_YEAH=m +CONFIG_TCP_CONG_ILLINOIS=m +CONFIG_TCP_CONG_DCTCP=m +CONFIG_TCP_CONG_CDG=m +CONFIG_TCP_CONG_BBR=m +CONFIG_DEFAULT_CUBIC=y +# CONFIG_DEFAULT_RENO is not set +CONFIG_DEFAULT_TCP_CONG="cubic" +CONFIG_TCP_MD5SIG=y +CONFIG_IPV6=y +CONFIG_IPV6_ROUTER_PREF=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_OPTIMISTIC_DAD=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_ESP_OFFLOAD=m +# CONFIG_INET6_ESPINTCP is not set +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_MIP6=y +CONFIG_IPV6_ILA=m +CONFIG_INET6_XFRM_TUNNEL=m +CONFIG_INET6_TUNNEL=m +CONFIG_IPV6_VTI=m +CONFIG_IPV6_SIT=m +CONFIG_IPV6_SIT_6RD=y +CONFIG_IPV6_NDISC_NODETYPE=y +CONFIG_IPV6_TUNNEL=m +CONFIG_IPV6_GRE=m +CONFIG_IPV6_FOU=m +CONFIG_IPV6_FOU_TUNNEL=m +CONFIG_IPV6_MULTIPLE_TABLES=y +CONFIG_IPV6_SUBTREES=y +CONFIG_IPV6_MROUTE=y +CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y +CONFIG_IPV6_PIMSM_V2=y +CONFIG_IPV6_SEG6_LWTUNNEL=y +CONFIG_IPV6_SEG6_HMAC=y +CONFIG_IPV6_SEG6_BPF=y +# CONFIG_IPV6_RPL_LWTUNNEL is not set +# CONFIG_IPV6_IOAM6_LWTUNNEL is not set +# CONFIG_NETLABEL is not set +# CONFIG_MPTCP is not set +CONFIG_NETWORK_SECMARK=y +CONFIG_NET_PTP_CLASSIFY=y +# CONFIG_NETWORK_PHY_TIMESTAMPING is not set +CONFIG_NETFILTER=y +CONFIG_NETFILTER_ADVANCED=y +CONFIG_BRIDGE_NETFILTER=m + +# +# Core Netfilter Configuration +# +CONFIG_NETFILTER_INGRESS=y +CONFIG_NETFILTER_EGRESS=y +CONFIG_NETFILTER_SKIP_EGRESS=y +CONFIG_NETFILTER_NETLINK=m +CONFIG_NETFILTER_FAMILY_BRIDGE=y +CONFIG_NETFILTER_FAMILY_ARP=y +CONFIG_NETFILTER_BPF_LINK=y +# CONFIG_NETFILTER_NETLINK_HOOK is not set +CONFIG_NETFILTER_NETLINK_ACCT=m +CONFIG_NETFILTER_NETLINK_QUEUE=m +CONFIG_NETFILTER_NETLINK_LOG=m +CONFIG_NETFILTER_NETLINK_OSF=m +CONFIG_NF_CONNTRACK=m +CONFIG_NF_LOG_SYSLOG=m +CONFIG_NETFILTER_CONNCOUNT=m +CONFIG_NF_CONNTRACK_MARK=y +CONFIG_NF_CONNTRACK_SECMARK=y +CONFIG_NF_CONNTRACK_ZONES=y +CONFIG_NF_CONNTRACK_PROCFS=y +CONFIG_NF_CONNTRACK_EVENTS=y +CONFIG_NF_CONNTRACK_TIMEOUT=y +CONFIG_NF_CONNTRACK_TIMESTAMP=y +CONFIG_NF_CONNTRACK_LABELS=y +CONFIG_NF_CONNTRACK_OVS=y +CONFIG_NF_CT_PROTO_DCCP=y +CONFIG_NF_CT_PROTO_GRE=y +CONFIG_NF_CT_PROTO_SCTP=y +CONFIG_NF_CT_PROTO_UDPLITE=y +CONFIG_NF_CONNTRACK_AMANDA=m +CONFIG_NF_CONNTRACK_FTP=m +CONFIG_NF_CONNTRACK_H323=m +CONFIG_NF_CONNTRACK_IRC=m +CONFIG_NF_CONNTRACK_BROADCAST=m +CONFIG_NF_CONNTRACK_NETBIOS_NS=m +CONFIG_NF_CONNTRACK_SNMP=m +CONFIG_NF_CONNTRACK_PPTP=m +CONFIG_NF_CONNTRACK_SANE=m +CONFIG_NF_CONNTRACK_SIP=m +CONFIG_NF_CONNTRACK_TFTP=m +CONFIG_NF_CT_NETLINK=m +CONFIG_NF_CT_NETLINK_TIMEOUT=m +CONFIG_NF_CT_NETLINK_HELPER=m +CONFIG_NETFILTER_NETLINK_GLUE_CT=y +CONFIG_NF_NAT=m +CONFIG_NF_NAT_AMANDA=m +CONFIG_NF_NAT_FTP=m +CONFIG_NF_NAT_IRC=m +CONFIG_NF_NAT_SIP=m +CONFIG_NF_NAT_TFTP=m +CONFIG_NF_NAT_REDIRECT=y +CONFIG_NF_NAT_MASQUERADE=y +CONFIG_NF_NAT_OVS=y +CONFIG_NETFILTER_SYNPROXY=m +CONFIG_NF_TABLES=m +CONFIG_NF_TABLES_INET=y +CONFIG_NF_TABLES_NETDEV=y +CONFIG_NFT_NUMGEN=m +CONFIG_NFT_CT=m +CONFIG_NFT_FLOW_OFFLOAD=m +CONFIG_NFT_CONNLIMIT=m +CONFIG_NFT_LOG=m +CONFIG_NFT_LIMIT=m +CONFIG_NFT_MASQ=m +CONFIG_NFT_REDIR=m +CONFIG_NFT_NAT=m +CONFIG_NFT_TUNNEL=m +CONFIG_NFT_QUEUE=m +CONFIG_NFT_QUOTA=m +CONFIG_NFT_REJECT=m +CONFIG_NFT_REJECT_INET=m +CONFIG_NFT_COMPAT=m +CONFIG_NFT_HASH=m +CONFIG_NFT_FIB=m +CONFIG_NFT_FIB_INET=m +# CONFIG_NFT_XFRM is not set +CONFIG_NFT_SOCKET=m +CONFIG_NFT_OSF=m +CONFIG_NFT_TPROXY=m +# CONFIG_NFT_SYNPROXY is not set +CONFIG_NF_DUP_NETDEV=m +CONFIG_NFT_DUP_NETDEV=m +CONFIG_NFT_FWD_NETDEV=m +CONFIG_NFT_FIB_NETDEV=m +# CONFIG_NFT_REJECT_NETDEV is not set +CONFIG_NF_FLOW_TABLE_INET=m +CONFIG_NF_FLOW_TABLE=m +# CONFIG_NF_FLOW_TABLE_PROCFS is not set +CONFIG_NETFILTER_XTABLES=m +CONFIG_NETFILTER_XTABLES_COMPAT=y + +# +# Xtables combined modules +# +CONFIG_NETFILTER_XT_MARK=m +CONFIG_NETFILTER_XT_CONNMARK=m +CONFIG_NETFILTER_XT_SET=m + +# +# Xtables targets +# +CONFIG_NETFILTER_XT_TARGET_AUDIT=m +CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m +CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m +CONFIG_NETFILTER_XT_TARGET_CONNMARK=m +CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m +CONFIG_NETFILTER_XT_TARGET_CT=m +CONFIG_NETFILTER_XT_TARGET_DSCP=m +CONFIG_NETFILTER_XT_TARGET_HL=m +CONFIG_NETFILTER_XT_TARGET_HMARK=m +CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m +CONFIG_NETFILTER_XT_TARGET_LED=m +CONFIG_NETFILTER_XT_TARGET_LOG=m +CONFIG_NETFILTER_XT_TARGET_MARK=m +CONFIG_NETFILTER_XT_NAT=m +CONFIG_NETFILTER_XT_TARGET_NETMAP=m +CONFIG_NETFILTER_XT_TARGET_NFLOG=m +CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m +# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set +CONFIG_NETFILTER_XT_TARGET_RATEEST=m +CONFIG_NETFILTER_XT_TARGET_REDIRECT=m +CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m +CONFIG_NETFILTER_XT_TARGET_TEE=m +CONFIG_NETFILTER_XT_TARGET_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_TRACE=m +CONFIG_NETFILTER_XT_TARGET_SECMARK=m +CONFIG_NETFILTER_XT_TARGET_TCPMSS=m +CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m + +# +# Xtables matches +# +CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m +CONFIG_NETFILTER_XT_MATCH_BPF=m +CONFIG_NETFILTER_XT_MATCH_CGROUP=m +CONFIG_NETFILTER_XT_MATCH_CLUSTER=m +CONFIG_NETFILTER_XT_MATCH_COMMENT=m +CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m +CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m +CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m +CONFIG_NETFILTER_XT_MATCH_CONNMARK=m +CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m +CONFIG_NETFILTER_XT_MATCH_CPU=m +CONFIG_NETFILTER_XT_MATCH_DCCP=m +CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m +CONFIG_NETFILTER_XT_MATCH_DSCP=m +CONFIG_NETFILTER_XT_MATCH_ECN=m +CONFIG_NETFILTER_XT_MATCH_ESP=m +CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m +CONFIG_NETFILTER_XT_MATCH_HELPER=m +CONFIG_NETFILTER_XT_MATCH_HL=m +CONFIG_NETFILTER_XT_MATCH_IPCOMP=m +CONFIG_NETFILTER_XT_MATCH_IPRANGE=m +CONFIG_NETFILTER_XT_MATCH_IPVS=m +CONFIG_NETFILTER_XT_MATCH_L2TP=m +CONFIG_NETFILTER_XT_MATCH_LENGTH=m +CONFIG_NETFILTER_XT_MATCH_LIMIT=m +CONFIG_NETFILTER_XT_MATCH_MAC=m +CONFIG_NETFILTER_XT_MATCH_MARK=m +CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m +CONFIG_NETFILTER_XT_MATCH_NFACCT=m +CONFIG_NETFILTER_XT_MATCH_OSF=m +CONFIG_NETFILTER_XT_MATCH_OWNER=m +CONFIG_NETFILTER_XT_MATCH_POLICY=m +CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m +CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m +CONFIG_NETFILTER_XT_MATCH_QUOTA=m +CONFIG_NETFILTER_XT_MATCH_RATEEST=m +CONFIG_NETFILTER_XT_MATCH_REALM=m +CONFIG_NETFILTER_XT_MATCH_RECENT=m +CONFIG_NETFILTER_XT_MATCH_SCTP=m +CONFIG_NETFILTER_XT_MATCH_SOCKET=m +CONFIG_NETFILTER_XT_MATCH_STATE=m +CONFIG_NETFILTER_XT_MATCH_STATISTIC=m +CONFIG_NETFILTER_XT_MATCH_STRING=m +CONFIG_NETFILTER_XT_MATCH_TCPMSS=m +CONFIG_NETFILTER_XT_MATCH_TIME=m +CONFIG_NETFILTER_XT_MATCH_U32=m +# end of Core Netfilter Configuration + +CONFIG_IP_SET=m +CONFIG_IP_SET_MAX=256 +CONFIG_IP_SET_BITMAP_IP=m +CONFIG_IP_SET_BITMAP_IPMAC=m +CONFIG_IP_SET_BITMAP_PORT=m +CONFIG_IP_SET_HASH_IP=m +CONFIG_IP_SET_HASH_IPMARK=m +CONFIG_IP_SET_HASH_IPPORT=m +CONFIG_IP_SET_HASH_IPPORTIP=m +CONFIG_IP_SET_HASH_IPPORTNET=m +CONFIG_IP_SET_HASH_IPMAC=m +CONFIG_IP_SET_HASH_MAC=m +CONFIG_IP_SET_HASH_NETPORTNET=m +CONFIG_IP_SET_HASH_NET=m +CONFIG_IP_SET_HASH_NETNET=m +CONFIG_IP_SET_HASH_NETPORT=m +CONFIG_IP_SET_HASH_NETIFACE=m +CONFIG_IP_SET_LIST_SET=m +CONFIG_IP_VS=m +CONFIG_IP_VS_IPV6=y +# CONFIG_IP_VS_DEBUG is not set +CONFIG_IP_VS_TAB_BITS=12 + +# +# IPVS transport protocol load balancing support +# +CONFIG_IP_VS_PROTO_TCP=y +CONFIG_IP_VS_PROTO_UDP=y +CONFIG_IP_VS_PROTO_AH_ESP=y +CONFIG_IP_VS_PROTO_ESP=y +CONFIG_IP_VS_PROTO_AH=y +CONFIG_IP_VS_PROTO_SCTP=y + +# +# IPVS scheduler +# +CONFIG_IP_VS_RR=m +CONFIG_IP_VS_WRR=m +CONFIG_IP_VS_LC=m +CONFIG_IP_VS_WLC=m +CONFIG_IP_VS_FO=m +CONFIG_IP_VS_OVF=m +CONFIG_IP_VS_LBLC=m +CONFIG_IP_VS_LBLCR=m +CONFIG_IP_VS_DH=m +CONFIG_IP_VS_SH=m +CONFIG_IP_VS_MH=m +CONFIG_IP_VS_SED=m +CONFIG_IP_VS_NQ=m +# CONFIG_IP_VS_TWOS is not set + +# +# IPVS SH scheduler +# +CONFIG_IP_VS_SH_TAB_BITS=8 + +# +# IPVS MH scheduler +# +CONFIG_IP_VS_MH_TAB_INDEX=12 + +# +# IPVS application helper +# +CONFIG_IP_VS_FTP=m +CONFIG_IP_VS_NFCT=y +CONFIG_IP_VS_PE_SIP=m + +# +# IP: Netfilter Configuration +# +CONFIG_NF_DEFRAG_IPV4=m +CONFIG_NF_SOCKET_IPV4=m +CONFIG_NF_TPROXY_IPV4=m +CONFIG_NF_TABLES_IPV4=y +CONFIG_NFT_REJECT_IPV4=m +CONFIG_NFT_DUP_IPV4=m +CONFIG_NFT_FIB_IPV4=m +CONFIG_NF_TABLES_ARP=y +CONFIG_NF_DUP_IPV4=m +CONFIG_NF_LOG_ARP=m +CONFIG_NF_LOG_IPV4=m +CONFIG_NF_REJECT_IPV4=m +CONFIG_NF_NAT_SNMP_BASIC=m +CONFIG_NF_NAT_PPTP=m +CONFIG_NF_NAT_H323=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_AH=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_RPFILTER=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_SYNPROXY=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_TTL=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_SECURITY=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m +# end of IP: Netfilter Configuration + +# +# IPv6: Netfilter Configuration +# +CONFIG_NF_SOCKET_IPV6=m +CONFIG_NF_TPROXY_IPV6=m +CONFIG_NF_TABLES_IPV6=y +CONFIG_NFT_REJECT_IPV6=m +CONFIG_NFT_DUP_IPV6=m +CONFIG_NFT_FIB_IPV6=m +CONFIG_NF_DUP_IPV6=m +CONFIG_NF_REJECT_IPV6=m +CONFIG_NF_LOG_IPV6=m +CONFIG_IP6_NF_IPTABLES=m +CONFIG_IP6_NF_MATCH_AH=m +CONFIG_IP6_NF_MATCH_EUI64=m +CONFIG_IP6_NF_MATCH_FRAG=m +CONFIG_IP6_NF_MATCH_OPTS=m +CONFIG_IP6_NF_MATCH_HL=m +CONFIG_IP6_NF_MATCH_IPV6HEADER=m +CONFIG_IP6_NF_MATCH_MH=m +CONFIG_IP6_NF_MATCH_RPFILTER=m +CONFIG_IP6_NF_MATCH_RT=m +# CONFIG_IP6_NF_MATCH_SRH is not set +CONFIG_IP6_NF_TARGET_HL=m +CONFIG_IP6_NF_FILTER=m +CONFIG_IP6_NF_TARGET_REJECT=m +CONFIG_IP6_NF_TARGET_SYNPROXY=m +CONFIG_IP6_NF_MANGLE=m +CONFIG_IP6_NF_RAW=m +CONFIG_IP6_NF_SECURITY=m +CONFIG_IP6_NF_NAT=m +CONFIG_IP6_NF_TARGET_MASQUERADE=m +CONFIG_IP6_NF_TARGET_NPT=m +# end of IPv6: Netfilter Configuration + +CONFIG_NF_DEFRAG_IPV6=m +CONFIG_NF_TABLES_BRIDGE=m +# CONFIG_NFT_BRIDGE_META is not set +CONFIG_NFT_BRIDGE_REJECT=m +# CONFIG_NF_CONNTRACK_BRIDGE is not set +CONFIG_BRIDGE_NF_EBTABLES=m +CONFIG_BRIDGE_EBT_BROUTE=m +CONFIG_BRIDGE_EBT_T_FILTER=m +CONFIG_BRIDGE_EBT_T_NAT=m +CONFIG_BRIDGE_EBT_802_3=m +CONFIG_BRIDGE_EBT_AMONG=m +CONFIG_BRIDGE_EBT_ARP=m +CONFIG_BRIDGE_EBT_IP=m +CONFIG_BRIDGE_EBT_IP6=m +CONFIG_BRIDGE_EBT_LIMIT=m +CONFIG_BRIDGE_EBT_MARK=m +CONFIG_BRIDGE_EBT_PKTTYPE=m +CONFIG_BRIDGE_EBT_STP=m +CONFIG_BRIDGE_EBT_VLAN=m +CONFIG_BRIDGE_EBT_ARPREPLY=m +CONFIG_BRIDGE_EBT_DNAT=m +CONFIG_BRIDGE_EBT_MARK_T=m +CONFIG_BRIDGE_EBT_REDIRECT=m +CONFIG_BRIDGE_EBT_SNAT=m +CONFIG_BRIDGE_EBT_LOG=m +CONFIG_BRIDGE_EBT_NFLOG=m +# CONFIG_BPFILTER is not set +CONFIG_IP_DCCP=m +CONFIG_INET_DCCP_DIAG=m + +# +# DCCP CCIDs Configuration +# +# CONFIG_IP_DCCP_CCID2_DEBUG is not set +CONFIG_IP_DCCP_CCID3=y +# CONFIG_IP_DCCP_CCID3_DEBUG is not set +CONFIG_IP_DCCP_TFRC_LIB=y +# end of DCCP CCIDs Configuration + +# +# DCCP Kernel Hacking +# +# CONFIG_IP_DCCP_DEBUG is not set +# end of DCCP Kernel Hacking + +CONFIG_IP_SCTP=m +# CONFIG_SCTP_DBG_OBJCNT is not set +CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5=y +# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set +# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set +CONFIG_SCTP_COOKIE_HMAC_MD5=y +CONFIG_SCTP_COOKIE_HMAC_SHA1=y +CONFIG_INET_SCTP_DIAG=m +CONFIG_RDS=m +CONFIG_RDS_RDMA=m +CONFIG_RDS_TCP=m +# CONFIG_RDS_DEBUG is not set +CONFIG_TIPC=m +CONFIG_TIPC_MEDIA_IB=y +CONFIG_TIPC_MEDIA_UDP=y +CONFIG_TIPC_CRYPTO=y +CONFIG_TIPC_DIAG=m +CONFIG_ATM=m +CONFIG_ATM_CLIP=m +# CONFIG_ATM_CLIP_NO_ICMP is not set +CONFIG_ATM_LANE=m +CONFIG_ATM_MPOA=m +CONFIG_ATM_BR2684=m +# CONFIG_ATM_BR2684_IPFILTER is not set +CONFIG_L2TP=m +CONFIG_L2TP_DEBUGFS=m +CONFIG_L2TP_V3=y +CONFIG_L2TP_IP=m +CONFIG_L2TP_ETH=m +CONFIG_STP=m +CONFIG_GARP=m +CONFIG_MRP=m +CONFIG_BRIDGE=m +CONFIG_BRIDGE_IGMP_SNOOPING=y +CONFIG_BRIDGE_VLAN_FILTERING=y +# CONFIG_BRIDGE_MRP is not set +# CONFIG_BRIDGE_CFM is not set +CONFIG_NET_DSA=m +# CONFIG_NET_DSA_TAG_NONE is not set +# CONFIG_NET_DSA_TAG_AR9331 is not set +# CONFIG_NET_DSA_TAG_BRCM is not set +# CONFIG_NET_DSA_TAG_BRCM_LEGACY is not set +# CONFIG_NET_DSA_TAG_BRCM_PREPEND is not set +# CONFIG_NET_DSA_TAG_HELLCREEK is not set +# CONFIG_NET_DSA_TAG_GSWIP is not set +CONFIG_NET_DSA_TAG_DSA_COMMON=m +CONFIG_NET_DSA_TAG_DSA=m +CONFIG_NET_DSA_TAG_EDSA=m +# CONFIG_NET_DSA_TAG_MTK is not set +# CONFIG_NET_DSA_TAG_KSZ is not set +# CONFIG_NET_DSA_TAG_OCELOT is not set +# CONFIG_NET_DSA_TAG_OCELOT_8021Q is not set +# CONFIG_NET_DSA_TAG_QCA is not set +# CONFIG_NET_DSA_TAG_RTL4_A is not set +# CONFIG_NET_DSA_TAG_RTL8_4 is not set +# CONFIG_NET_DSA_TAG_RZN1_A5PSW is not set +# CONFIG_NET_DSA_TAG_LAN9303 is not set +# CONFIG_NET_DSA_TAG_SJA1105 is not set +CONFIG_NET_DSA_TAG_TRAILER=m +# CONFIG_NET_DSA_TAG_XRS700X is not set +CONFIG_VLAN_8021Q=m +CONFIG_VLAN_8021Q_GVRP=y +CONFIG_VLAN_8021Q_MVRP=y +CONFIG_LLC=m +CONFIG_LLC2=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +CONFIG_PHONET=m +CONFIG_6LOWPAN=m +# CONFIG_6LOWPAN_DEBUGFS is not set +CONFIG_6LOWPAN_NHC=m +CONFIG_6LOWPAN_NHC_DEST=m +CONFIG_6LOWPAN_NHC_FRAGMENT=m +CONFIG_6LOWPAN_NHC_HOP=m +CONFIG_6LOWPAN_NHC_IPV6=m +CONFIG_6LOWPAN_NHC_MOBILITY=m +CONFIG_6LOWPAN_NHC_ROUTING=m +CONFIG_6LOWPAN_NHC_UDP=m +CONFIG_6LOWPAN_GHC_EXT_HDR_HOP=m +CONFIG_6LOWPAN_GHC_UDP=m +CONFIG_6LOWPAN_GHC_ICMPV6=m +CONFIG_6LOWPAN_GHC_EXT_HDR_DEST=m +CONFIG_6LOWPAN_GHC_EXT_HDR_FRAG=m +CONFIG_6LOWPAN_GHC_EXT_HDR_ROUTE=m +CONFIG_IEEE802154=m +# CONFIG_IEEE802154_NL802154_EXPERIMENTAL is not set +CONFIG_IEEE802154_SOCKET=m +CONFIG_IEEE802154_6LOWPAN=m +CONFIG_MAC802154=m +CONFIG_NET_SCHED=y + +# +# Queueing/Scheduling +# +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_MULTIQ=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFB=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_CBS=m +CONFIG_NET_SCH_ETF=m +CONFIG_NET_SCH_MQPRIO_LIB=m +# CONFIG_NET_SCH_TAPRIO is not set +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_DRR=m +CONFIG_NET_SCH_MQPRIO=m +CONFIG_NET_SCH_SKBPRIO=m +CONFIG_NET_SCH_CHOKE=m +CONFIG_NET_SCH_QFQ=m +CONFIG_NET_SCH_CODEL=m +CONFIG_NET_SCH_FQ_CODEL=m +CONFIG_NET_SCH_CAKE=m +CONFIG_NET_SCH_FQ=m +CONFIG_NET_SCH_HHF=m +CONFIG_NET_SCH_PIE=m +# CONFIG_NET_SCH_FQ_PIE is not set +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_SCH_PLUG=m +# CONFIG_NET_SCH_ETS is not set +# CONFIG_NET_SCH_DEFAULT is not set + +# +# Classification +# +CONFIG_NET_CLS=y +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_CLS_U32_PERF=y +CONFIG_CLS_U32_MARK=y +CONFIG_NET_CLS_FLOW=m +CONFIG_NET_CLS_CGROUP=m +CONFIG_NET_CLS_BPF=m +CONFIG_NET_CLS_FLOWER=m +CONFIG_NET_CLS_MATCHALL=m +CONFIG_NET_EMATCH=y +CONFIG_NET_EMATCH_STACK=32 +CONFIG_NET_EMATCH_CMP=m +CONFIG_NET_EMATCH_NBYTE=m +CONFIG_NET_EMATCH_U32=m +CONFIG_NET_EMATCH_META=m +CONFIG_NET_EMATCH_TEXT=m +CONFIG_NET_EMATCH_CANID=m +CONFIG_NET_EMATCH_IPSET=m +CONFIG_NET_EMATCH_IPT=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=m +CONFIG_NET_ACT_GACT=m +CONFIG_GACT_PROB=y +CONFIG_NET_ACT_MIRRED=m +CONFIG_NET_ACT_SAMPLE=m +CONFIG_NET_ACT_IPT=m +CONFIG_NET_ACT_NAT=m +CONFIG_NET_ACT_PEDIT=m +CONFIG_NET_ACT_SIMP=m +CONFIG_NET_ACT_SKBEDIT=m +CONFIG_NET_ACT_CSUM=m +# CONFIG_NET_ACT_MPLS is not set +CONFIG_NET_ACT_VLAN=m +CONFIG_NET_ACT_BPF=m +CONFIG_NET_ACT_CONNMARK=m +# CONFIG_NET_ACT_CTINFO is not set +CONFIG_NET_ACT_SKBMOD=m +CONFIG_NET_ACT_IFE=m +CONFIG_NET_ACT_TUNNEL_KEY=m +CONFIG_NET_ACT_CT=m +# CONFIG_NET_ACT_GATE is not set +CONFIG_NET_IFE_SKBMARK=m +CONFIG_NET_IFE_SKBPRIO=m +CONFIG_NET_IFE_SKBTCINDEX=m +# CONFIG_NET_TC_SKB_EXT is not set +CONFIG_NET_SCH_FIFO=y +CONFIG_DCB=y +CONFIG_DNS_RESOLVER=m +CONFIG_BATMAN_ADV=m +# CONFIG_BATMAN_ADV_BATMAN_V is not set +CONFIG_BATMAN_ADV_BLA=y +CONFIG_BATMAN_ADV_DAT=y +CONFIG_BATMAN_ADV_NC=y +CONFIG_BATMAN_ADV_MCAST=y +# CONFIG_BATMAN_ADV_DEBUG is not set +# CONFIG_BATMAN_ADV_TRACING is not set +CONFIG_OPENVSWITCH=m +CONFIG_OPENVSWITCH_GRE=m +CONFIG_OPENVSWITCH_VXLAN=m +CONFIG_OPENVSWITCH_GENEVE=m +CONFIG_VSOCKETS=m +CONFIG_VSOCKETS_DIAG=m +CONFIG_VSOCKETS_LOOPBACK=m +CONFIG_VIRTIO_VSOCKETS=m +CONFIG_VIRTIO_VSOCKETS_COMMON=m +CONFIG_HYPERV_VSOCKETS=m +CONFIG_NETLINK_DIAG=m +CONFIG_MPLS=y +CONFIG_NET_MPLS_GSO=y +CONFIG_MPLS_ROUTING=m +CONFIG_MPLS_IPTUNNEL=m +CONFIG_NET_NSH=m +# CONFIG_HSR is not set +CONFIG_NET_SWITCHDEV=y +CONFIG_NET_L3_MASTER_DEV=y +# CONFIG_QRTR is not set +# CONFIG_NET_NCSI is not set +CONFIG_PCPU_DEV_REFCNT=y +CONFIG_MAX_SKB_FRAGS=17 +CONFIG_RPS=y +CONFIG_RFS_ACCEL=y +CONFIG_SOCK_RX_QUEUE_MAPPING=y +CONFIG_XPS=y +CONFIG_CGROUP_NET_PRIO=y +CONFIG_CGROUP_NET_CLASSID=y +CONFIG_NET_RX_BUSY_POLL=y +CONFIG_BQL=y +CONFIG_BPF_STREAM_PARSER=y +CONFIG_NET_FLOW_LIMIT=y + +# +# Network testing +# +CONFIG_NET_PKTGEN=m +CONFIG_NET_DROP_MONITOR=y +# end of Network testing +# end of Networking options + +CONFIG_HAMRADIO=y + +# +# Packet Radio protocols +# +CONFIG_AX25=m +CONFIG_AX25_DAMA_SLAVE=y +CONFIG_NETROM=m +CONFIG_ROSE=m + +# +# AX.25 network device drivers +# +CONFIG_MKISS=m +CONFIG_6PACK=m +CONFIG_BPQETHER=m +CONFIG_BAYCOM_SER_FDX=m +CONFIG_BAYCOM_SER_HDX=m +CONFIG_BAYCOM_PAR=m +CONFIG_YAM=m +# end of AX.25 network device drivers + +CONFIG_CAN=m +CONFIG_CAN_RAW=m +CONFIG_CAN_BCM=m +CONFIG_CAN_GW=m +# CONFIG_CAN_J1939 is not set +# CONFIG_CAN_ISOTP is not set +CONFIG_BT=m +CONFIG_BT_BREDR=y +CONFIG_BT_RFCOMM=m +CONFIG_BT_RFCOMM_TTY=y +CONFIG_BT_BNEP=m +CONFIG_BT_BNEP_MC_FILTER=y +CONFIG_BT_BNEP_PROTO_FILTER=y +CONFIG_BT_HIDP=m +CONFIG_BT_HS=y +CONFIG_BT_LE=y +CONFIG_BT_LE_L2CAP_ECRED=y +CONFIG_BT_6LOWPAN=m +CONFIG_BT_LEDS=y +# CONFIG_BT_MSFTEXT is not set +# CONFIG_BT_AOSPEXT is not set +CONFIG_BT_DEBUGFS=y +# CONFIG_BT_SELFTEST is not set + +# +# Bluetooth device drivers +# +CONFIG_BT_INTEL=m +CONFIG_BT_BCM=m +CONFIG_BT_RTL=m +CONFIG_BT_QCA=m +CONFIG_BT_MTK=m +CONFIG_BT_HCIBTUSB=m +CONFIG_BT_HCIBTUSB_AUTOSUSPEND=y +CONFIG_BT_HCIBTUSB_POLL_SYNC=y +CONFIG_BT_HCIBTUSB_BCM=y +# CONFIG_BT_HCIBTUSB_MTK is not set +CONFIG_BT_HCIBTUSB_RTL=y +CONFIG_BT_HCIBTSDIO=m +CONFIG_BT_HCIUART=m +CONFIG_BT_HCIUART_SERDEV=y +CONFIG_BT_HCIUART_H4=y +CONFIG_BT_HCIUART_NOKIA=m +# CONFIG_BT_HCIUART_BCSP is not set +CONFIG_BT_HCIUART_ATH3K=y +CONFIG_BT_HCIUART_LL=y +CONFIG_BT_HCIUART_3WIRE=y +CONFIG_BT_HCIUART_INTEL=y +CONFIG_BT_HCIUART_RTL=y +CONFIG_BT_HCIUART_QCA=y +CONFIG_BT_HCIUART_AG6XX=y +CONFIG_BT_HCIUART_MRVL=y +# CONFIG_BT_HCIBCM203X is not set +# CONFIG_BT_HCIBCM4377 is not set +# CONFIG_BT_HCIBPA10X is not set +# CONFIG_BT_HCIBFUSB is not set +# CONFIG_BT_HCIVHCI is not set +CONFIG_BT_MRVL=m +CONFIG_BT_MRVL_SDIO=m +CONFIG_BT_ATH3K=m +# CONFIG_BT_MTKSDIO is not set +CONFIG_BT_MTKUART=m +CONFIG_BT_QCOMSMD=m +CONFIG_BT_HCIRSI=m +# CONFIG_BT_VIRTIO is not set +# CONFIG_BT_NXPUART is not set +# end of Bluetooth device drivers + +CONFIG_AF_RXRPC=m +CONFIG_AF_RXRPC_IPV6=y +# CONFIG_AF_RXRPC_INJECT_LOSS is not set +# CONFIG_AF_RXRPC_INJECT_RX_DELAY is not set +# CONFIG_AF_RXRPC_DEBUG is not set +CONFIG_RXKAD=y +# CONFIG_RXPERF is not set +# CONFIG_AF_KCM is not set +CONFIG_STREAM_PARSER=y +# CONFIG_MCTP is not set +CONFIG_FIB_RULES=y +CONFIG_WIRELESS=y +CONFIG_WIRELESS_EXT=y +CONFIG_WEXT_CORE=y +CONFIG_WEXT_PROC=y +CONFIG_WEXT_SPY=y +CONFIG_WEXT_PRIV=y +CONFIG_CFG80211=m +# CONFIG_NL80211_TESTMODE is not set +# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set +# CONFIG_CFG80211_CERTIFICATION_ONUS is not set +CONFIG_CFG80211_REQUIRE_SIGNED_REGDB=y +CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS=y +CONFIG_CFG80211_DEFAULT_PS=y +# CONFIG_CFG80211_DEBUGFS is not set +CONFIG_CFG80211_CRDA_SUPPORT=y +CONFIG_CFG80211_WEXT=y +CONFIG_CFG80211_WEXT_EXPORT=y +CONFIG_LIB80211=m +CONFIG_LIB80211_CRYPT_WEP=m +CONFIG_LIB80211_CRYPT_CCMP=m +CONFIG_LIB80211_CRYPT_TKIP=m +# CONFIG_LIB80211_DEBUG is not set +CONFIG_MAC80211=m +CONFIG_MAC80211_HAS_RC=y +CONFIG_MAC80211_RC_MINSTREL=y +CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y +CONFIG_MAC80211_RC_DEFAULT="minstrel_ht" +CONFIG_MAC80211_MESH=y +CONFIG_MAC80211_LEDS=y +# CONFIG_MAC80211_DEBUGFS is not set +# CONFIG_MAC80211_MESSAGE_TRACING is not set +# CONFIG_MAC80211_DEBUG_MENU is not set +CONFIG_MAC80211_STA_HASH_MAX_SIZE=0 +CONFIG_RFKILL=m +CONFIG_RFKILL_LEDS=y +CONFIG_RFKILL_INPUT=y +# CONFIG_RFKILL_GPIO is not set +CONFIG_NET_9P=m +CONFIG_NET_9P_FD=m +CONFIG_NET_9P_VIRTIO=m +CONFIG_NET_9P_XEN=m +CONFIG_NET_9P_RDMA=m +# CONFIG_NET_9P_DEBUG is not set +# CONFIG_CAIF is not set +CONFIG_CEPH_LIB=m +# CONFIG_CEPH_LIB_PRETTYDEBUG is not set +# CONFIG_CEPH_LIB_USE_DNS_RESOLVER is not set +CONFIG_NFC=m +CONFIG_NFC_DIGITAL=m +# CONFIG_NFC_NCI is not set +# CONFIG_NFC_HCI is not set + +# +# Near Field Communication (NFC) devices +# +# CONFIG_NFC_TRF7970A is not set +CONFIG_NFC_SIM=m +CONFIG_NFC_PORT100=m +CONFIG_NFC_PN533=m +CONFIG_NFC_PN533_USB=m +# CONFIG_NFC_PN533_I2C is not set +# CONFIG_NFC_PN532_UART is not set +# CONFIG_NFC_ST95HF is not set +# end of Near Field Communication (NFC) devices + +CONFIG_PSAMPLE=m +CONFIG_NET_IFE=m +CONFIG_LWTUNNEL=y +CONFIG_LWTUNNEL_BPF=y +CONFIG_DST_CACHE=y +CONFIG_GRO_CELLS=y +CONFIG_NET_SELFTESTS=m +CONFIG_NET_SOCK_MSG=y +CONFIG_NET_DEVLINK=y +CONFIG_PAGE_POOL=y +CONFIG_PAGE_POOL_STATS=y +CONFIG_FAILOVER=m +CONFIG_ETHTOOL_NETLINK=y + +# +# Device Drivers +# +CONFIG_ARM_AMBA=y +CONFIG_TEGRA_AHB=y +CONFIG_HAVE_PCI=y +CONFIG_PCI=y +CONFIG_PCI_DOMAINS=y +CONFIG_PCI_DOMAINS_GENERIC=y +CONFIG_PCI_SYSCALL=y +CONFIG_PCIEPORTBUS=y +CONFIG_HOTPLUG_PCI_PCIE=y +CONFIG_PCIEAER=y +CONFIG_PCIEAER_INJECT=m +# CONFIG_PCIE_ECRC is not set +CONFIG_PCIEASPM=y +CONFIG_PCIEASPM_DEFAULT=y +# CONFIG_PCIEASPM_POWERSAVE is not set +# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set +# CONFIG_PCIEASPM_PERFORMANCE is not set +CONFIG_PCIE_PME=y +CONFIG_PCIE_DPC=y +CONFIG_PCIE_PTM=y +CONFIG_PCIE_EDR=y +CONFIG_PCI_MSI=y +CONFIG_PCI_QUIRKS=y +# CONFIG_PCI_DEBUG is not set +CONFIG_PCI_REALLOC_ENABLE_AUTO=y +CONFIG_PCI_STUB=m +CONFIG_PCI_PF_STUB=m +CONFIG_PCI_ATS=y +CONFIG_PCI_ECAM=y +CONFIG_PCI_BRIDGE_EMUL=y +CONFIG_PCI_IOV=y +CONFIG_PCI_PRI=y +CONFIG_PCI_PASID=y +CONFIG_PCI_LABEL=y +CONFIG_PCI_HYPERV=m +# CONFIG_PCI_DYNAMIC_OF_NODES is not set +# CONFIG_PCIE_BUS_TUNE_OFF is not set +CONFIG_PCIE_BUS_DEFAULT=y +# CONFIG_PCIE_BUS_SAFE is not set +# CONFIG_PCIE_BUS_PERFORMANCE is not set +# CONFIG_PCIE_BUS_PEER2PEER is not set +CONFIG_VGA_ARB=y +CONFIG_VGA_ARB_MAX_GPUS=16 +CONFIG_HOTPLUG_PCI=y +CONFIG_HOTPLUG_PCI_ACPI=y +CONFIG_HOTPLUG_PCI_ACPI_IBM=m +CONFIG_HOTPLUG_PCI_CPCI=y +CONFIG_HOTPLUG_PCI_SHPC=y + +# +# PCI controller drivers +# +CONFIG_PCI_AARDVARK=y +# CONFIG_PCIE_ALTERA is not set +CONFIG_PCI_HOST_THUNDER_PEM=y +CONFIG_PCI_HOST_THUNDER_ECAM=y +# CONFIG_PCI_FTPCI100 is not set +CONFIG_PCI_HOST_COMMON=y +CONFIG_PCI_HOST_GENERIC=y +# CONFIG_PCIE_HISI_ERR is not set +# CONFIG_PCIE_MICROCHIP_HOST is not set +CONFIG_PCI_HYPERV_INTERFACE=m +CONFIG_PCI_TEGRA=y +CONFIG_PCIE_ROCKCHIP=y +CONFIG_PCIE_ROCKCHIP_HOST=y +CONFIG_PCI_XGENE=y +CONFIG_PCI_XGENE_MSI=y +# CONFIG_PCIE_XILINX is not set +CONFIG_PCIE_XILINX_NWL=y +# CONFIG_PCIE_XILINX_CPM is not set + +# +# Cadence-based PCIe controllers +# +# CONFIG_PCIE_CADENCE_PLAT_HOST is not set +# CONFIG_PCI_J721E_HOST is not set +# end of Cadence-based PCIe controllers + +# +# DesignWare-based PCIe controllers +# +CONFIG_PCIE_DW=y +CONFIG_PCIE_DW_HOST=y +# CONFIG_PCIE_AL is not set +# CONFIG_PCI_MESON is not set +CONFIG_PCI_HISI=y +CONFIG_PCIE_KIRIN=y +# CONFIG_PCIE_HISI_STB is not set +CONFIG_PCIE_ARMADA_8K=y +# CONFIG_PCIE_DW_PLAT_HOST is not set +CONFIG_PCIE_QCOM=y +# CONFIG_PCIE_ROCKCHIP_DW_HOST is not set +# end of DesignWare-based PCIe controllers + +# +# Mobiveil-based PCIe controllers +# +# CONFIG_PCIE_MOBIVEIL_PLAT is not set +# end of Mobiveil-based PCIe controllers +# end of PCI controller drivers + +# +# PCI Endpoint +# +# CONFIG_PCI_ENDPOINT is not set +# end of PCI Endpoint + +# +# PCI switch controller drivers +# +# CONFIG_PCI_SW_SWITCHTEC is not set +# end of PCI switch controller drivers + +# CONFIG_CXL_BUS is not set +# CONFIG_PCCARD is not set +# CONFIG_RAPIDIO is not set + +# +# Generic Driver Options +# +CONFIG_AUXILIARY_BUS=y +# CONFIG_UEVENT_HELPER is not set +CONFIG_DEVTMPFS=y +# CONFIG_DEVTMPFS_MOUNT is not set +# CONFIG_DEVTMPFS_SAFE is not set +CONFIG_STANDALONE=y +CONFIG_PREVENT_FIRMWARE_BUILD=y + +# +# Firmware loader +# +CONFIG_FW_LOADER=y +CONFIG_FW_LOADER_DEBUG=y +CONFIG_EXTRA_FIRMWARE="" +# CONFIG_FW_LOADER_USER_HELPER is not set +# CONFIG_FW_LOADER_COMPRESS is not set +CONFIG_FW_CACHE=y +# CONFIG_FW_UPLOAD is not set +# end of Firmware loader + +CONFIG_WANT_DEV_COREDUMP=y +CONFIG_ALLOW_DEV_COREDUMP=y +CONFIG_DEV_COREDUMP=y +# CONFIG_DEBUG_DRIVER is not set +# CONFIG_DEBUG_DEVRES is not set +# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set +# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set +CONFIG_SYS_HYPERVISOR=y +CONFIG_GENERIC_CPU_AUTOPROBE=y +CONFIG_GENERIC_CPU_VULNERABILITIES=y +CONFIG_SOC_BUS=y +CONFIG_REGMAP=y +CONFIG_REGMAP_I2C=y +CONFIG_REGMAP_SPI=m +CONFIG_REGMAP_SPMI=y +CONFIG_REGMAP_MMIO=y +CONFIG_REGMAP_IRQ=y +CONFIG_DMA_SHARED_BUFFER=y +# CONFIG_DMA_FENCE_TRACE is not set +CONFIG_GENERIC_ARCH_TOPOLOGY=y +CONFIG_GENERIC_ARCH_NUMA=y +# CONFIG_FW_DEVLINK_SYNC_STATE_TIMEOUT is not set +# end of Generic Driver Options + +# +# Bus devices +# +CONFIG_ARM_CCI=y +CONFIG_ARM_CCI400_COMMON=y +# CONFIG_BRCMSTB_GISB_ARB is not set +# CONFIG_MOXTET is not set +CONFIG_HISILICON_LPC=y +CONFIG_QCOM_EBI2=y +# CONFIG_QCOM_SSC_BLOCK_BUS is not set +CONFIG_SUN50I_DE2_BUS=y +CONFIG_SUNXI_RSB=y +CONFIG_TEGRA_ACONNECT=y +# CONFIG_TEGRA_GMI is not set +CONFIG_VEXPRESS_CONFIG=y +# CONFIG_MHI_BUS is not set +# CONFIG_MHI_BUS_EP is not set +# end of Bus devices + +# +# Cache Drivers +# +# end of Cache Drivers + +CONFIG_CONNECTOR=y +CONFIG_PROC_EVENTS=y + +# +# Firmware Drivers +# + +# +# ARM System Control and Management Interface Protocol +# +# CONFIG_ARM_SCMI_PROTOCOL is not set +# end of ARM System Control and Management Interface Protocol + +# CONFIG_ARM_SCPI_PROTOCOL is not set +CONFIG_ARM_SDE_INTERFACE=y +# CONFIG_FIRMWARE_MEMMAP is not set +CONFIG_DMIID=y +CONFIG_DMI_SYSFS=y +# CONFIG_ISCSI_IBFT is not set +CONFIG_FW_CFG_SYSFS=m +# CONFIG_FW_CFG_SYSFS_CMDLINE is not set +CONFIG_QCOM_SCM=y +# CONFIG_QCOM_SCM_DOWNLOAD_MODE_DEFAULT is not set +CONFIG_SYSFB=y +# CONFIG_SYSFB_SIMPLEFB is not set +CONFIG_TURRIS_MOX_RWTM=m +CONFIG_ARM_FFA_TRANSPORT=m +CONFIG_ARM_FFA_SMCCC=y +CONFIG_GOOGLE_FIRMWARE=y +# CONFIG_GOOGLE_CBMEM is not set +CONFIG_GOOGLE_COREBOOT_TABLE=m +CONFIG_GOOGLE_MEMCONSOLE=m +# CONFIG_GOOGLE_FRAMEBUFFER_COREBOOT is not set +CONFIG_GOOGLE_MEMCONSOLE_COREBOOT=m +# CONFIG_GOOGLE_VPD is not set + +# +# EFI (Extensible Firmware Interface) Support +# +CONFIG_EFI_ESRT=y +CONFIG_EFI_VARS_PSTORE=m +# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set +CONFIG_EFI_PARAMS_FROM_FDT=y +CONFIG_EFI_RUNTIME_WRAPPERS=y +CONFIG_EFI_GENERIC_STUB=y +# CONFIG_EFI_ZBOOT is not set +CONFIG_EFI_ARMSTUB_DTB_LOADER=y +CONFIG_EFI_BOOTLOADER_CONTROL=m +CONFIG_EFI_CAPSULE_LOADER=m +# CONFIG_EFI_TEST is not set +# CONFIG_RESET_ATTACK_MITIGATION is not set +# CONFIG_EFI_DISABLE_PCI_DMA is not set +CONFIG_EFI_EARLYCON=y +CONFIG_EFI_CUSTOM_SSDT_OVERLAYS=y +# CONFIG_EFI_DISABLE_RUNTIME is not set +# CONFIG_EFI_COCO_SECRET is not set +# end of EFI (Extensible Firmware Interface) Support + +CONFIG_UEFI_CPER=y +CONFIG_UEFI_CPER_ARM=y +CONFIG_MESON_SM=y +CONFIG_ARM_PSCI_FW=y +# CONFIG_ARM_PSCI_CHECKER is not set +CONFIG_HAVE_ARM_SMCCC=y +CONFIG_HAVE_ARM_SMCCC_DISCOVERY=y +CONFIG_ARM_SMCCC_SOC_ID=y + +# +# Tegra firmware driver +# +# CONFIG_TEGRA_IVC is not set +# end of Tegra firmware driver + +# +# Zynq MPSoC Firmware Drivers +# +CONFIG_ZYNQMP_FIRMWARE=y +# CONFIG_ZYNQMP_FIRMWARE_DEBUG is not set +# end of Zynq MPSoC Firmware Drivers +# end of Firmware Drivers + +CONFIG_GNSS=m +CONFIG_GNSS_SERIAL=m +# CONFIG_GNSS_MTK_SERIAL is not set +CONFIG_GNSS_SIRF_SERIAL=m +CONFIG_GNSS_UBX_SERIAL=m +# CONFIG_GNSS_USB is not set +CONFIG_MTD=m +# CONFIG_MTD_TESTS is not set + +# +# Partition parsers +# +CONFIG_MTD_AR7_PARTS=m +# CONFIG_MTD_CMDLINE_PARTS is not set +CONFIG_MTD_OF_PARTS=m +# CONFIG_MTD_AFS_PARTS is not set +# CONFIG_MTD_REDBOOT_PARTS is not set +# end of Partition parsers + +# +# User Modules And Translation Layers +# +CONFIG_MTD_BLKDEVS=m +CONFIG_MTD_BLOCK=m +CONFIG_MTD_BLOCK_RO=m + +# +# Note that in some cases UBI block is preferred. See MTD_UBI_BLOCK. +# +# CONFIG_FTL is not set +# CONFIG_NFTL is not set +# CONFIG_INFTL is not set +CONFIG_RFD_FTL=m +CONFIG_SSFDC=m +# CONFIG_SM_FTL is not set +CONFIG_MTD_OOPS=m +CONFIG_MTD_SWAP=m +# CONFIG_MTD_PARTITIONED_MASTER is not set + +# +# RAM/ROM/Flash chip drivers +# +# CONFIG_MTD_CFI is not set +# CONFIG_MTD_JEDECPROBE is not set +CONFIG_MTD_MAP_BANK_WIDTH_1=y +CONFIG_MTD_MAP_BANK_WIDTH_2=y +CONFIG_MTD_MAP_BANK_WIDTH_4=y +CONFIG_MTD_CFI_I1=y +CONFIG_MTD_CFI_I2=y +CONFIG_MTD_RAM=m +# CONFIG_MTD_ROM is not set +# CONFIG_MTD_ABSENT is not set +# end of RAM/ROM/Flash chip drivers + +# +# Mapping drivers for chip access +# +# CONFIG_MTD_COMPLEX_MAPPINGS is not set +# CONFIG_MTD_PHYSMAP is not set +CONFIG_MTD_INTEL_VR_NOR=m +CONFIG_MTD_PLATRAM=m +# end of Mapping drivers for chip access + +# +# Self-contained MTD device drivers +# +# CONFIG_MTD_PMC551 is not set +CONFIG_MTD_DATAFLASH=m +# CONFIG_MTD_DATAFLASH_WRITE_VERIFY is not set +# CONFIG_MTD_DATAFLASH_OTP is not set +# CONFIG_MTD_MCHP23K256 is not set +# CONFIG_MTD_MCHP48L640 is not set +CONFIG_MTD_SST25L=m +# CONFIG_MTD_SLRAM is not set +# CONFIG_MTD_PHRAM is not set +# CONFIG_MTD_MTDRAM is not set +# CONFIG_MTD_BLOCK2MTD is not set + +# +# Disk-On-Chip Device Drivers +# +# CONFIG_MTD_DOCG3 is not set +# end of Self-contained MTD device drivers + +# +# NAND +# +CONFIG_MTD_ONENAND=m +CONFIG_MTD_ONENAND_VERIFY_WRITE=y +# CONFIG_MTD_ONENAND_GENERIC is not set +# CONFIG_MTD_ONENAND_OTP is not set +CONFIG_MTD_ONENAND_2X_PROGRAM=y +# CONFIG_MTD_RAW_NAND is not set +# CONFIG_MTD_SPI_NAND is not set + +# +# ECC engine support +# +# CONFIG_MTD_NAND_ECC_SW_HAMMING is not set +# CONFIG_MTD_NAND_ECC_SW_BCH is not set +# CONFIG_MTD_NAND_ECC_MXIC is not set +# end of ECC engine support +# end of NAND + +# +# LPDDR & LPDDR2 PCM memory drivers +# +CONFIG_MTD_LPDDR=m +CONFIG_MTD_QINFO_PROBE=m +# end of LPDDR & LPDDR2 PCM memory drivers + +CONFIG_MTD_SPI_NOR=m +CONFIG_MTD_SPI_NOR_USE_4K_SECTORS=y +# CONFIG_MTD_SPI_NOR_SWP_DISABLE is not set +CONFIG_MTD_SPI_NOR_SWP_DISABLE_ON_VOLATILE=y +# CONFIG_MTD_SPI_NOR_SWP_KEEP is not set +CONFIG_SPI_HISI_SFC=m +CONFIG_MTD_UBI=m +CONFIG_MTD_UBI_WL_THRESHOLD=4096 +CONFIG_MTD_UBI_BEB_LIMIT=20 +# CONFIG_MTD_UBI_FASTMAP is not set +# CONFIG_MTD_UBI_GLUEBI is not set +CONFIG_MTD_UBI_BLOCK=y +# CONFIG_MTD_HYPERBUS is not set +CONFIG_DTC=y +CONFIG_OF=y +# CONFIG_OF_UNITTEST is not set +CONFIG_OF_FLATTREE=y +CONFIG_OF_EARLY_FLATTREE=y +CONFIG_OF_KOBJ=y +CONFIG_OF_ADDRESS=y +CONFIG_OF_IRQ=y +CONFIG_OF_RESERVED_MEM=y +# CONFIG_OF_OVERLAY is not set +CONFIG_OF_NUMA=y +CONFIG_PARPORT=m +# CONFIG_PARPORT_PC is not set +CONFIG_PARPORT_1284=y +CONFIG_PARPORT_NOT_PC=y +CONFIG_PNP=y +# CONFIG_PNP_DEBUG_MESSAGES is not set + +# +# Protocols +# +CONFIG_PNPACPI=y +CONFIG_BLK_DEV=y +CONFIG_BLK_DEV_NULL_BLK=m +CONFIG_CDROM=m +CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m +CONFIG_ZRAM=m +CONFIG_ZRAM_DEF_COMP_LZORLE=y +# CONFIG_ZRAM_DEF_COMP_ZSTD is not set +# CONFIG_ZRAM_DEF_COMP_LZ4 is not set +# CONFIG_ZRAM_DEF_COMP_LZO is not set +# CONFIG_ZRAM_DEF_COMP_LZ4HC is not set +CONFIG_ZRAM_DEF_COMP="lzo-rle" +CONFIG_ZRAM_WRITEBACK=y +CONFIG_ZRAM_MEMORY_TRACKING=y +# CONFIG_ZRAM_MULTI_COMP is not set +CONFIG_BLK_DEV_LOOP=m +CONFIG_BLK_DEV_LOOP_MIN_COUNT=8 +CONFIG_BLK_DEV_DRBD=m +# CONFIG_DRBD_FAULT_INJECTION is not set +CONFIG_BLK_DEV_NBD=m +CONFIG_BLK_DEV_RAM=m +CONFIG_BLK_DEV_RAM_COUNT=16 +CONFIG_BLK_DEV_RAM_SIZE=16384 +# CONFIG_CDROM_PKTCDVD is not set +CONFIG_ATA_OVER_ETH=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_VIRTIO_BLK=m +CONFIG_BLK_DEV_RBD=m +# CONFIG_BLK_DEV_UBLK is not set + +# +# NVME Support +# +CONFIG_NVME_CORE=m +CONFIG_BLK_DEV_NVME=m +CONFIG_NVME_MULTIPATH=y +# CONFIG_NVME_VERBOSE_ERRORS is not set +# CONFIG_NVME_HWMON is not set +CONFIG_NVME_FABRICS=m +CONFIG_NVME_RDMA=m +CONFIG_NVME_FC=m +CONFIG_NVME_TCP=m +# CONFIG_NVME_AUTH is not set +CONFIG_NVME_TARGET=m +# CONFIG_NVME_TARGET_PASSTHRU is not set +# CONFIG_NVME_TARGET_LOOP is not set +CONFIG_NVME_TARGET_RDMA=m +CONFIG_NVME_TARGET_FC=m +# CONFIG_NVME_TARGET_FCLOOP is not set +CONFIG_NVME_TARGET_TCP=m +# CONFIG_NVME_TARGET_AUTH is not set +# end of NVME Support + +# +# Misc devices +# +CONFIG_SENSORS_LIS3LV02D=m +CONFIG_AD525X_DPOT=m +CONFIG_AD525X_DPOT_I2C=m +CONFIG_AD525X_DPOT_SPI=m +# CONFIG_DUMMY_IRQ is not set +# CONFIG_PHANTOM is not set +CONFIG_TIFM_CORE=m +CONFIG_TIFM_7XX1=m +CONFIG_ICS932S401=m +CONFIG_ENCLOSURE_SERVICES=m +# CONFIG_HI6421V600_IRQ is not set +# CONFIG_HP_ILO is not set +CONFIG_QCOM_COINCELL=m +# CONFIG_QCOM_FASTRPC is not set +CONFIG_APDS9802ALS=m +CONFIG_ISL29003=m +CONFIG_ISL29020=m +CONFIG_SENSORS_TSL2550=m +CONFIG_SENSORS_BH1770=m +CONFIG_SENSORS_APDS990X=m +CONFIG_HMC6352=m +CONFIG_DS1682=m +# CONFIG_LATTICE_ECP3_CONFIG is not set +CONFIG_SRAM=y +# CONFIG_DW_XDATA_PCIE is not set +# CONFIG_PCI_ENDPOINT_TEST is not set +# CONFIG_XILINX_SDFEC is not set +CONFIG_MISC_RTSX=m +# CONFIG_HISI_HIKEY_USB is not set +# CONFIG_OPEN_DICE is not set +# CONFIG_VCPU_STALL_DETECTOR is not set +CONFIG_C2PORT=m + +# +# EEPROM support +# +CONFIG_EEPROM_AT24=m +CONFIG_EEPROM_AT25=m +CONFIG_EEPROM_LEGACY=m +CONFIG_EEPROM_MAX6875=m +CONFIG_EEPROM_93CX6=m +# CONFIG_EEPROM_93XX46 is not set +# CONFIG_EEPROM_IDT_89HPESX is not set +# CONFIG_EEPROM_EE1004 is not set +# end of EEPROM support + +CONFIG_CB710_CORE=m +# CONFIG_CB710_DEBUG is not set +CONFIG_CB710_DEBUG_ASSUMPTIONS=y + +# +# Texas Instruments shared transport line discipline +# +CONFIG_TI_ST=m +# end of Texas Instruments shared transport line discipline + +CONFIG_SENSORS_LIS3_I2C=m +CONFIG_ALTERA_STAPL=m +# CONFIG_VMWARE_VMCI is not set +# CONFIG_GENWQE is not set +# CONFIG_ECHO is not set +# CONFIG_BCM_VK is not set +# CONFIG_MISC_ALCOR_PCI is not set +CONFIG_MISC_RTSX_PCI=m +CONFIG_MISC_RTSX_USB=m +CONFIG_PVPANIC=y +CONFIG_PVPANIC_MMIO=m +CONFIG_PVPANIC_PCI=m +# CONFIG_UACCE is not set +# CONFIG_GP_PCI1XXXX is not set +# end of Misc devices + +# +# SCSI device support +# +CONFIG_SCSI_MOD=m +CONFIG_RAID_ATTRS=m +CONFIG_SCSI_COMMON=m +CONFIG_SCSI=m +CONFIG_SCSI_DMA=y +CONFIG_SCSI_NETLINK=y +# CONFIG_SCSI_PROC_FS is not set + +# +# SCSI support type (disk, tape, CD-ROM) +# +CONFIG_BLK_DEV_SD=m +CONFIG_CHR_DEV_ST=m +CONFIG_BLK_DEV_SR=m +CONFIG_CHR_DEV_SG=m +CONFIG_BLK_DEV_BSG=y +CONFIG_CHR_DEV_SCH=m +CONFIG_SCSI_ENCLOSURE=m +CONFIG_SCSI_CONSTANTS=y +CONFIG_SCSI_LOGGING=y +CONFIG_SCSI_SCAN_ASYNC=y + +# +# SCSI Transports +# +CONFIG_SCSI_SPI_ATTRS=m +CONFIG_SCSI_FC_ATTRS=m +CONFIG_SCSI_ISCSI_ATTRS=m +CONFIG_SCSI_SAS_ATTRS=m +CONFIG_SCSI_SAS_LIBSAS=m +CONFIG_SCSI_SAS_ATA=y +CONFIG_SCSI_SAS_HOST_SMP=y +CONFIG_SCSI_SRP_ATTRS=m +# end of SCSI Transports + +CONFIG_SCSI_LOWLEVEL=y +CONFIG_ISCSI_TCP=m +CONFIG_ISCSI_BOOT_SYSFS=m +CONFIG_SCSI_CXGB3_ISCSI=m +CONFIG_SCSI_CXGB4_ISCSI=m +CONFIG_SCSI_BNX2_ISCSI=m +CONFIG_SCSI_BNX2X_FCOE=m +CONFIG_BE2ISCSI=m +CONFIG_BLK_DEV_3W_XXXX_RAID=m +CONFIG_SCSI_HPSA=m +CONFIG_SCSI_3W_9XXX=m +CONFIG_SCSI_3W_SAS=m +CONFIG_SCSI_ACARD=m +CONFIG_SCSI_AACRAID=m +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_CMDS_PER_DEVICE=32 +CONFIG_AIC7XXX_RESET_DELAY_MS=15000 +CONFIG_AIC7XXX_DEBUG_ENABLE=y +CONFIG_AIC7XXX_DEBUG_MASK=0 +CONFIG_AIC7XXX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC79XX=m +CONFIG_AIC79XX_CMDS_PER_DEVICE=32 +CONFIG_AIC79XX_RESET_DELAY_MS=15000 +CONFIG_AIC79XX_DEBUG_ENABLE=y +CONFIG_AIC79XX_DEBUG_MASK=0 +CONFIG_AIC79XX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC94XX=m +# CONFIG_AIC94XX_DEBUG is not set +CONFIG_SCSI_HISI_SAS=m +CONFIG_SCSI_HISI_SAS_PCI=m +# CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE is not set +CONFIG_SCSI_MVSAS=m +# CONFIG_SCSI_MVSAS_DEBUG is not set +# CONFIG_SCSI_MVSAS_TASKLET is not set +CONFIG_SCSI_MVUMI=m +CONFIG_SCSI_ADVANSYS=m +# CONFIG_SCSI_ARCMSR is not set +CONFIG_SCSI_ESAS2R=m +# CONFIG_MEGARAID_NEWGEN is not set +# CONFIG_MEGARAID_LEGACY is not set +CONFIG_MEGARAID_SAS=m +CONFIG_SCSI_MPT3SAS=m +CONFIG_SCSI_MPT2SAS_MAX_SGE=128 +CONFIG_SCSI_MPT3SAS_MAX_SGE=128 +CONFIG_SCSI_MPT2SAS=m +# CONFIG_SCSI_MPI3MR is not set +CONFIG_SCSI_SMARTPQI=m +CONFIG_SCSI_HPTIOP=m +# CONFIG_SCSI_BUSLOGIC is not set +# CONFIG_SCSI_MYRB is not set +# CONFIG_SCSI_MYRS is not set +CONFIG_XEN_SCSI_FRONTEND=m +CONFIG_HYPERV_STORAGE=m +CONFIG_LIBFC=m +CONFIG_LIBFCOE=m +CONFIG_FCOE=m +CONFIG_SCSI_SNIC=m +# CONFIG_SCSI_SNIC_DEBUG_FS is not set +CONFIG_SCSI_DMX3191D=m +# CONFIG_SCSI_FDOMAIN_PCI is not set +# CONFIG_SCSI_IPS is not set +# CONFIG_SCSI_INITIO is not set +# CONFIG_SCSI_INIA100 is not set +CONFIG_SCSI_STEX=m +CONFIG_SCSI_SYM53C8XX_2=m +CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 +CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 +CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 +CONFIG_SCSI_SYM53C8XX_MMIO=y +# CONFIG_SCSI_IPR is not set +# CONFIG_SCSI_QLOGIC_1280 is not set +CONFIG_SCSI_QLA_FC=m +CONFIG_TCM_QLA2XXX=m +# CONFIG_TCM_QLA2XXX_DEBUG is not set +CONFIG_SCSI_QLA_ISCSI=m +CONFIG_QEDI=m +CONFIG_QEDF=m +CONFIG_SCSI_LPFC=m +# CONFIG_SCSI_LPFC_DEBUG_FS is not set +# CONFIG_SCSI_EFCT is not set +# CONFIG_SCSI_DC395x is not set +# CONFIG_SCSI_AM53C974 is not set +CONFIG_SCSI_WD719X=m +# CONFIG_SCSI_DEBUG is not set +CONFIG_SCSI_PMCRAID=m +CONFIG_SCSI_PM8001=m +CONFIG_SCSI_BFA_FC=m +CONFIG_SCSI_VIRTIO=m +CONFIG_SCSI_CHELSIO_FCOE=m +CONFIG_SCSI_DH=y +CONFIG_SCSI_DH_RDAC=m +CONFIG_SCSI_DH_HP_SW=m +CONFIG_SCSI_DH_EMC=m +CONFIG_SCSI_DH_ALUA=m +# end of SCSI device support + +CONFIG_ATA=m +CONFIG_SATA_HOST=y +CONFIG_PATA_TIMINGS=y +CONFIG_ATA_VERBOSE_ERROR=y +CONFIG_ATA_FORCE=y +CONFIG_ATA_ACPI=y +CONFIG_SATA_ZPODD=y +CONFIG_SATA_PMP=y + +# +# Controllers with non-SFF native interface +# +CONFIG_SATA_AHCI=m +CONFIG_SATA_MOBILE_LPM_POLICY=3 +CONFIG_SATA_AHCI_PLATFORM=m +# CONFIG_AHCI_DWC is not set +CONFIG_AHCI_CEVA=m +CONFIG_AHCI_MVEBU=m +# CONFIG_AHCI_SUNXI is not set +CONFIG_AHCI_TEGRA=m +CONFIG_AHCI_XGENE=m +CONFIG_SATA_AHCI_SEATTLE=m +# CONFIG_SATA_INIC162X is not set +CONFIG_SATA_ACARD_AHCI=m +CONFIG_SATA_SIL24=m +CONFIG_ATA_SFF=y + +# +# SFF controllers with custom DMA interface +# +CONFIG_PDC_ADMA=m +CONFIG_SATA_QSTOR=m +CONFIG_SATA_SX4=m +CONFIG_ATA_BMDMA=y + +# +# SATA SFF controllers with BMDMA +# +CONFIG_ATA_PIIX=m +# CONFIG_SATA_DWC is not set +CONFIG_SATA_MV=m +CONFIG_SATA_NV=m +CONFIG_SATA_PROMISE=m +CONFIG_SATA_SIL=m +CONFIG_SATA_SIS=m +CONFIG_SATA_SVW=m +CONFIG_SATA_ULI=m +CONFIG_SATA_VIA=m +CONFIG_SATA_VITESSE=m + +# +# PATA SFF controllers with BMDMA +# +# CONFIG_PATA_ALI is not set +# CONFIG_PATA_AMD is not set +CONFIG_PATA_ARTOP=m +# CONFIG_PATA_ATIIXP is not set +CONFIG_PATA_ATP867X=m +CONFIG_PATA_CMD64X=m +# CONFIG_PATA_CYPRESS is not set +# CONFIG_PATA_EFAR is not set +# CONFIG_PATA_HPT366 is not set +# CONFIG_PATA_HPT37X is not set +# CONFIG_PATA_HPT3X2N is not set +# CONFIG_PATA_HPT3X3 is not set +CONFIG_PATA_IT8213=m +CONFIG_PATA_IT821X=m +CONFIG_PATA_JMICRON=m +CONFIG_PATA_MARVELL=m +# CONFIG_PATA_NETCELL is not set +CONFIG_PATA_NINJA32=m +# CONFIG_PATA_NS87415 is not set +# CONFIG_PATA_OLDPIIX is not set +# CONFIG_PATA_OPTIDMA is not set +# CONFIG_PATA_PDC2027X is not set +# CONFIG_PATA_PDC_OLD is not set +# CONFIG_PATA_RADISYS is not set +CONFIG_PATA_RDC=m +CONFIG_PATA_SCH=m +# CONFIG_PATA_SERVERWORKS is not set +# CONFIG_PATA_SIL680 is not set +CONFIG_PATA_SIS=m +CONFIG_PATA_TOSHIBA=m +# CONFIG_PATA_TRIFLEX is not set +# CONFIG_PATA_VIA is not set +# CONFIG_PATA_WINBOND is not set + +# +# PIO-only SFF controllers +# +# CONFIG_PATA_CMD640_PCI is not set +# CONFIG_PATA_MPIIX is not set +# CONFIG_PATA_NS87410 is not set +# CONFIG_PATA_OPTI is not set +# CONFIG_PATA_OF_PLATFORM is not set +# CONFIG_PATA_RZ1000 is not set + +# +# Generic fallback / legacy drivers +# +# CONFIG_PATA_ACPI is not set +CONFIG_ATA_GENERIC=m +# CONFIG_PATA_LEGACY is not set +CONFIG_MD=y +CONFIG_BLK_DEV_MD=m +CONFIG_MD_BITMAP_FILE=y +CONFIG_MD_LINEAR=m +CONFIG_MD_RAID0=m +CONFIG_MD_RAID1=m +CONFIG_MD_RAID10=m +CONFIG_MD_RAID456=m +CONFIG_MD_MULTIPATH=m +CONFIG_MD_FAULTY=m +# CONFIG_MD_CLUSTER is not set +CONFIG_BCACHE=m +# CONFIG_BCACHE_DEBUG is not set +# CONFIG_BCACHE_CLOSURES_DEBUG is not set +# CONFIG_BCACHE_ASYNC_REGISTRATION is not set +CONFIG_BLK_DEV_DM_BUILTIN=y +CONFIG_BLK_DEV_DM=m +# CONFIG_DM_DEBUG is not set +CONFIG_DM_BUFIO=m +# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set +CONFIG_DM_BIO_PRISON=m +CONFIG_DM_PERSISTENT_DATA=m +CONFIG_DM_UNSTRIPED=m +CONFIG_DM_CRYPT=m +CONFIG_DM_SNAPSHOT=m +CONFIG_DM_THIN_PROVISIONING=m +CONFIG_DM_CACHE=m +CONFIG_DM_CACHE_SMQ=m +CONFIG_DM_WRITECACHE=m +# CONFIG_DM_EBS is not set +CONFIG_DM_ERA=m +# CONFIG_DM_CLONE is not set +CONFIG_DM_MIRROR=m +CONFIG_DM_LOG_USERSPACE=m +CONFIG_DM_RAID=m +CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_DM_MULTIPATH_QL=m +CONFIG_DM_MULTIPATH_ST=m +# CONFIG_DM_MULTIPATH_HST is not set +# CONFIG_DM_MULTIPATH_IOA is not set +CONFIG_DM_DELAY=m +# CONFIG_DM_DUST is not set +CONFIG_DM_UEVENT=y +CONFIG_DM_FLAKEY=m +CONFIG_DM_VERITY=m +# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set +# CONFIG_DM_VERITY_FEC is not set +CONFIG_DM_SWITCH=m +CONFIG_DM_LOG_WRITES=m +CONFIG_DM_INTEGRITY=m +CONFIG_DM_ZONED=m +CONFIG_DM_AUDIT=y +CONFIG_TARGET_CORE=m +CONFIG_TCM_IBLOCK=m +CONFIG_TCM_FILEIO=m +CONFIG_TCM_PSCSI=m +CONFIG_TCM_USER2=m +CONFIG_LOOPBACK_TARGET=m +CONFIG_TCM_FC=m +CONFIG_ISCSI_TARGET=m +CONFIG_ISCSI_TARGET_CXGB4=m +CONFIG_SBP_TARGET=m +# CONFIG_REMOTE_TARGET is not set +CONFIG_FUSION=y +CONFIG_FUSION_SPI=m +CONFIG_FUSION_FC=m +CONFIG_FUSION_SAS=m +CONFIG_FUSION_MAX_SGE=128 +CONFIG_FUSION_CTL=m +# CONFIG_FUSION_LOGGING is not set + +# +# IEEE 1394 (FireWire) support +# +CONFIG_FIREWIRE=m +CONFIG_FIREWIRE_OHCI=m +CONFIG_FIREWIRE_SBP2=m +CONFIG_FIREWIRE_NET=m +CONFIG_FIREWIRE_NOSY=m +# end of IEEE 1394 (FireWire) support + +CONFIG_NETDEVICES=y +CONFIG_MII=m +CONFIG_NET_CORE=y +CONFIG_BONDING=m +CONFIG_DUMMY=m +# CONFIG_WIREGUARD is not set +CONFIG_EQUALIZER=m +# CONFIG_NET_FC is not set +CONFIG_IFB=m +CONFIG_NET_TEAM=m +CONFIG_NET_TEAM_MODE_BROADCAST=m +CONFIG_NET_TEAM_MODE_ROUNDROBIN=m +CONFIG_NET_TEAM_MODE_RANDOM=m +CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=m +CONFIG_NET_TEAM_MODE_LOADBALANCE=m +CONFIG_MACVLAN=m +CONFIG_MACVTAP=m +CONFIG_IPVLAN_L3S=y +CONFIG_IPVLAN=m +CONFIG_IPVTAP=m +CONFIG_VXLAN=m +CONFIG_GENEVE=m +# CONFIG_BAREUDP is not set +CONFIG_GTP=m +# CONFIG_AMT is not set +CONFIG_MACSEC=m +CONFIG_NETCONSOLE=m +CONFIG_NETCONSOLE_DYNAMIC=y +# CONFIG_NETCONSOLE_EXTENDED_LOG is not set +CONFIG_NETPOLL=y +CONFIG_NET_POLL_CONTROLLER=y +CONFIG_TUN=m +CONFIG_TAP=m +# CONFIG_TUN_VNET_CROSS_LE is not set +CONFIG_VETH=m +CONFIG_VIRTIO_NET=m +CONFIG_NLMON=m +CONFIG_NET_VRF=m +CONFIG_NETKIT=y +CONFIG_VSOCKMON=m +# CONFIG_ARCNET is not set +CONFIG_ATM_DRIVERS=y +CONFIG_ATM_DUMMY=m +# CONFIG_ATM_TCP is not set +# CONFIG_ATM_LANAI is not set +# CONFIG_ATM_ENI is not set +CONFIG_ATM_NICSTAR=m +CONFIG_ATM_NICSTAR_USE_SUNI=y +CONFIG_ATM_NICSTAR_USE_IDT77105=y +# CONFIG_ATM_IDT77252 is not set +CONFIG_ATM_IA=m +# CONFIG_ATM_IA_DEBUG is not set +CONFIG_ATM_FORE200E=m +# CONFIG_ATM_FORE200E_USE_TASKLET is not set +CONFIG_ATM_FORE200E_TX_RETRY=16 +CONFIG_ATM_FORE200E_DEBUG=0 +# CONFIG_ATM_HE is not set +CONFIG_ATM_SOLOS=m + +# +# Distributed Switch Architecture drivers +# +# CONFIG_B53 is not set +# CONFIG_NET_DSA_BCM_SF2 is not set +# CONFIG_NET_DSA_LOOP is not set +# CONFIG_NET_DSA_LANTIQ_GSWIP is not set +# CONFIG_NET_DSA_MT7530 is not set +CONFIG_NET_DSA_MV88E6060=m +# CONFIG_NET_DSA_MICROCHIP_KSZ_COMMON is not set +CONFIG_NET_DSA_MV88E6XXX=m +# CONFIG_NET_DSA_MV88E6XXX_PTP is not set +# CONFIG_NET_DSA_MSCC_OCELOT_EXT is not set +# CONFIG_NET_DSA_MSCC_SEVILLE is not set +# CONFIG_NET_DSA_AR9331 is not set +# CONFIG_NET_DSA_QCA8K is not set +# CONFIG_NET_DSA_SJA1105 is not set +# CONFIG_NET_DSA_XRS700X_I2C is not set +# CONFIG_NET_DSA_XRS700X_MDIO is not set +# CONFIG_NET_DSA_REALTEK is not set +# CONFIG_NET_DSA_SMSC_LAN9303_I2C is not set +# CONFIG_NET_DSA_SMSC_LAN9303_MDIO is not set +# CONFIG_NET_DSA_VITESSE_VSC73XX_SPI is not set +# CONFIG_NET_DSA_VITESSE_VSC73XX_PLATFORM is not set +# end of Distributed Switch Architecture drivers + +CONFIG_ETHERNET=y +CONFIG_MDIO=m +CONFIG_NET_VENDOR_3COM=y +CONFIG_VORTEX=m +CONFIG_TYPHOON=m +CONFIG_NET_VENDOR_ADAPTEC=y +CONFIG_ADAPTEC_STARFIRE=m +CONFIG_NET_VENDOR_AGERE=y +CONFIG_ET131X=m +CONFIG_NET_VENDOR_ALACRITECH=y +# CONFIG_SLICOSS is not set +CONFIG_NET_VENDOR_ALLWINNER=y +# CONFIG_SUN4I_EMAC is not set +CONFIG_NET_VENDOR_ALTEON=y +CONFIG_ACENIC=m +# CONFIG_ACENIC_OMIT_TIGON_I is not set +# CONFIG_ALTERA_TSE is not set +CONFIG_NET_VENDOR_AMAZON=y +CONFIG_ENA_ETHERNET=m +CONFIG_NET_VENDOR_AMD=y +# CONFIG_AMD8111_ETH is not set +CONFIG_PCNET32=m +CONFIG_AMD_XGBE=m +CONFIG_AMD_XGBE_DCB=y +# CONFIG_PDS_CORE is not set +CONFIG_NET_XGENE=m +CONFIG_NET_XGENE_V2=m +CONFIG_NET_VENDOR_AQUANTIA=y +CONFIG_AQTION=m +# CONFIG_NET_VENDOR_ARC is not set +CONFIG_NET_VENDOR_ASIX=y +# CONFIG_SPI_AX88796C is not set +CONFIG_NET_VENDOR_ATHEROS=y +CONFIG_ATL2=m +CONFIG_ATL1=m +CONFIG_ATL1E=m +CONFIG_ATL1C=m +CONFIG_ALX=m +CONFIG_NET_VENDOR_BROADCOM=y +# CONFIG_B44 is not set +# CONFIG_BCMGENET is not set +CONFIG_BNX2=m +CONFIG_CNIC=m +CONFIG_TIGON3=m +CONFIG_TIGON3_HWMON=y +CONFIG_BNX2X=m +CONFIG_BNX2X_SRIOV=y +# CONFIG_SYSTEMPORT is not set +CONFIG_BNXT=m +CONFIG_BNXT_SRIOV=y +CONFIG_BNXT_FLOWER_OFFLOAD=y +CONFIG_BNXT_DCB=y +CONFIG_BNXT_HWMON=y +CONFIG_NET_VENDOR_CADENCE=y +CONFIG_MACB=m +CONFIG_MACB_USE_HWSTAMP=y +# CONFIG_MACB_PCI is not set +CONFIG_NET_VENDOR_CAVIUM=y +CONFIG_THUNDER_NIC_PF=m +CONFIG_THUNDER_NIC_VF=m +CONFIG_THUNDER_NIC_BGX=m +CONFIG_THUNDER_NIC_RGX=m +CONFIG_CAVIUM_PTP=m +CONFIG_LIQUIDIO_CORE=m +CONFIG_LIQUIDIO=m +CONFIG_LIQUIDIO_VF=m +CONFIG_NET_VENDOR_CHELSIO=y +CONFIG_CHELSIO_T1=m +CONFIG_CHELSIO_T1_1G=y +CONFIG_CHELSIO_T3=m +CONFIG_CHELSIO_T4=m +CONFIG_CHELSIO_T4_DCB=y +CONFIG_CHELSIO_T4_FCOE=y +CONFIG_CHELSIO_T4VF=m +CONFIG_CHELSIO_LIB=m +CONFIG_CHELSIO_INLINE_CRYPTO=y +# CONFIG_CHELSIO_IPSEC_INLINE is not set +CONFIG_NET_VENDOR_CISCO=y +CONFIG_ENIC=m +CONFIG_NET_VENDOR_CORTINA=y +# CONFIG_GEMINI_ETHERNET is not set +CONFIG_NET_VENDOR_DAVICOM=y +# CONFIG_DM9051 is not set +# CONFIG_DNET is not set +CONFIG_NET_VENDOR_DEC=y +CONFIG_NET_TULIP=y +CONFIG_DE2104X=m +CONFIG_DE2104X_DSL=0 +CONFIG_TULIP=m +# CONFIG_TULIP_MWI is not set +# CONFIG_TULIP_MMIO is not set +CONFIG_TULIP_NAPI=y +CONFIG_TULIP_NAPI_HW_MITIGATION=y +CONFIG_WINBOND_840=m +CONFIG_DM9102=m +CONFIG_ULI526X=m +CONFIG_NET_VENDOR_DLINK=y +CONFIG_DL2K=m +CONFIG_SUNDANCE=m +# CONFIG_SUNDANCE_MMIO is not set +CONFIG_NET_VENDOR_EMULEX=y +CONFIG_BE2NET=m +CONFIG_BE2NET_HWMON=y +CONFIG_BE2NET_BE2=y +CONFIG_BE2NET_BE3=y +CONFIG_BE2NET_LANCER=y +CONFIG_BE2NET_SKYHAWK=y +CONFIG_NET_VENDOR_ENGLEDER=y +# CONFIG_TSNEP is not set +CONFIG_NET_VENDOR_EZCHIP=y +# CONFIG_EZCHIP_NPS_MANAGEMENT_ENET is not set +CONFIG_NET_VENDOR_FUNGIBLE=y +# CONFIG_FUN_ETH is not set +CONFIG_NET_VENDOR_GOOGLE=y +# CONFIG_GVE is not set +CONFIG_NET_VENDOR_HISILICON=y +CONFIG_HIX5HD2_GMAC=m +CONFIG_HISI_FEMAC=m +CONFIG_HIP04_ETH=m +# CONFIG_HI13X1_GMAC is not set +CONFIG_HNS_MDIO=m +# CONFIG_HNS_DSAF is not set +# CONFIG_HNS_ENET is not set +# CONFIG_HNS3 is not set +CONFIG_NET_VENDOR_HUAWEI=y +CONFIG_HINIC=m +CONFIG_NET_VENDOR_I825XX=y +CONFIG_NET_VENDOR_INTEL=y +CONFIG_E100=m +CONFIG_E1000=m +CONFIG_E1000E=m +CONFIG_IGB=m +CONFIG_IGB_HWMON=y +CONFIG_IGBVF=m +CONFIG_IXGBE=m +CONFIG_IXGBE_HWMON=y +CONFIG_IXGBE_DCB=y +CONFIG_IXGBE_IPSEC=y +CONFIG_IXGBEVF=m +CONFIG_IXGBEVF_IPSEC=y +CONFIG_I40E=m +CONFIG_I40E_DCB=y +CONFIG_IAVF=m +CONFIG_I40EVF=m +CONFIG_ICE=m +CONFIG_ICE_SWITCHDEV=y +# CONFIG_FM10K is not set +# CONFIG_IGC is not set +CONFIG_JME=m +CONFIG_NET_VENDOR_ADI=y +# CONFIG_ADIN1110 is not set +CONFIG_NET_VENDOR_LITEX=y +# CONFIG_LITEX_LITEETH is not set +CONFIG_NET_VENDOR_MARVELL=y +CONFIG_MVMDIO=m +CONFIG_MVNETA=m +CONFIG_MVPP2=m +# CONFIG_MVPP2_PTP is not set +CONFIG_SKGE=m +# CONFIG_SKGE_DEBUG is not set +CONFIG_SKGE_GENESIS=y +CONFIG_SKY2=m +# CONFIG_SKY2_DEBUG is not set +# CONFIG_OCTEONTX2_AF is not set +# CONFIG_OCTEONTX2_PF is not set +# CONFIG_OCTEON_EP is not set +# CONFIG_PRESTERA is not set +CONFIG_NET_VENDOR_MELLANOX=y +CONFIG_MLX4_EN=m +CONFIG_MLX4_EN_DCB=y +CONFIG_MLX4_CORE=m +CONFIG_MLX4_DEBUG=y +CONFIG_MLX4_CORE_GEN2=y +CONFIG_MLX5_CORE=m +CONFIG_MLX5_FPGA=y +CONFIG_MLX5_CORE_EN=y +CONFIG_MLX5_EN_ARFS=y +CONFIG_MLX5_EN_RXNFC=y +CONFIG_MLX5_MPFS=y +CONFIG_MLX5_ESWITCH=y +CONFIG_MLX5_BRIDGE=y +CONFIG_MLX5_CORE_EN_DCB=y +CONFIG_MLX5_CORE_IPOIB=y +# CONFIG_MLX5_MACSEC is not set +# CONFIG_MLX5_EN_IPSEC is not set +CONFIG_MLX5_SW_STEERING=y +# CONFIG_MLX5_SF is not set +# CONFIG_MLXSW_CORE is not set +CONFIG_MLXFW=m +# CONFIG_MLXBF_GIGE is not set +CONFIG_NET_VENDOR_MICREL=y +# CONFIG_KS8842 is not set +# CONFIG_KS8851 is not set +# CONFIG_KS8851_MLL is not set +CONFIG_KSZ884X_PCI=m +CONFIG_NET_VENDOR_MICROCHIP=y +# CONFIG_ENC28J60 is not set +# CONFIG_ENCX24J600 is not set +CONFIG_LAN743X=m +# CONFIG_LAN966X_SWITCH is not set +# CONFIG_VCAP is not set +CONFIG_NET_VENDOR_MICROSEMI=y +# CONFIG_MSCC_OCELOT_SWITCH is not set +CONFIG_NET_VENDOR_MICROSOFT=y +CONFIG_NET_VENDOR_MYRI=y +CONFIG_MYRI10GE=m +CONFIG_FEALNX=m +CONFIG_NET_VENDOR_NI=y +# CONFIG_NI_XGE_MANAGEMENT_ENET is not set +CONFIG_NET_VENDOR_NATSEMI=y +CONFIG_NATSEMI=m +CONFIG_NS83820=m +CONFIG_NET_VENDOR_NETERION=y +CONFIG_S2IO=m +CONFIG_NET_VENDOR_NETRONOME=y +CONFIG_NFP=m +CONFIG_NFP_APP_FLOWER=y +CONFIG_NFP_APP_ABM_NIC=y +CONFIG_NFP_NET_IPSEC=y +# CONFIG_NFP_DEBUG is not set +CONFIG_NET_VENDOR_8390=y +CONFIG_NE2K_PCI=m +CONFIG_NET_VENDOR_NVIDIA=y +# CONFIG_FORCEDETH is not set +CONFIG_NET_VENDOR_OKI=y +# CONFIG_ETHOC is not set +CONFIG_NET_VENDOR_PACKET_ENGINES=y +CONFIG_HAMACHI=m +CONFIG_YELLOWFIN=m +CONFIG_NET_VENDOR_PENSANDO=y +# CONFIG_IONIC is not set +CONFIG_NET_VENDOR_QLOGIC=y +CONFIG_QLA3XXX=m +CONFIG_QLCNIC=m +CONFIG_QLCNIC_SRIOV=y +CONFIG_QLCNIC_DCB=y +CONFIG_QLCNIC_HWMON=y +CONFIG_NETXEN_NIC=m +CONFIG_QED=m +CONFIG_QED_LL2=y +CONFIG_QED_SRIOV=y +CONFIG_QEDE=m +CONFIG_QED_RDMA=y +CONFIG_QED_ISCSI=y +CONFIG_QED_FCOE=y +CONFIG_QED_OOO=y +CONFIG_NET_VENDOR_BROCADE=y +CONFIG_BNA=m +CONFIG_NET_VENDOR_QUALCOMM=y +# CONFIG_QCA7000_SPI is not set +# CONFIG_QCA7000_UART is not set +CONFIG_QCOM_EMAC=m +# CONFIG_RMNET is not set +CONFIG_NET_VENDOR_RDC=y +CONFIG_R6040=m +CONFIG_NET_VENDOR_REALTEK=y +CONFIG_8139CP=m +CONFIG_8139TOO=m +# CONFIG_8139TOO_PIO is not set +CONFIG_8139TOO_TUNE_TWISTER=y +CONFIG_8139TOO_8129=y +# CONFIG_8139_OLD_RX_RESET is not set +CONFIG_R8169=m +CONFIG_NET_VENDOR_RENESAS=y +CONFIG_NET_VENDOR_ROCKER=y +# CONFIG_ROCKER is not set +CONFIG_NET_VENDOR_SAMSUNG=y +# CONFIG_SXGBE_ETH is not set +# CONFIG_NET_VENDOR_SEEQ is not set +CONFIG_NET_VENDOR_SILAN=y +CONFIG_SC92031=m +CONFIG_NET_VENDOR_SIS=y +# CONFIG_SIS900 is not set +CONFIG_SIS190=m +CONFIG_NET_VENDOR_SOLARFLARE=y +CONFIG_SFC=m +CONFIG_SFC_MTD=y +CONFIG_SFC_MCDI_MON=y +CONFIG_SFC_SRIOV=y +CONFIG_SFC_MCDI_LOGGING=y +CONFIG_SFC_FALCON=m +CONFIG_SFC_FALCON_MTD=y +# CONFIG_SFC_SIENA is not set +CONFIG_NET_VENDOR_SMSC=y +CONFIG_SMC91X=m +CONFIG_EPIC100=m +CONFIG_SMSC911X=m +CONFIG_SMSC9420=m +CONFIG_NET_VENDOR_SOCIONEXT=y +CONFIG_SNI_NETSEC=m +CONFIG_NET_VENDOR_STMICRO=y +CONFIG_STMMAC_ETH=m +# CONFIG_STMMAC_SELFTESTS is not set +CONFIG_STMMAC_PLATFORM=m +# CONFIG_DWMAC_DWC_QOS_ETH is not set +CONFIG_DWMAC_GENERIC=m +CONFIG_DWMAC_IPQ806X=m +CONFIG_DWMAC_MESON=m +CONFIG_DWMAC_QCOM_ETHQOS=m +CONFIG_DWMAC_ROCKCHIP=m +CONFIG_DWMAC_SUNXI=m +CONFIG_DWMAC_SUN8I=m +# CONFIG_DWMAC_INTEL_PLAT is not set +# CONFIG_DWMAC_TEGRA is not set +# CONFIG_DWMAC_LOONGSON is not set +# CONFIG_STMMAC_PCI is not set +CONFIG_NET_VENDOR_SUN=y +# CONFIG_HAPPYMEAL is not set +# CONFIG_SUNGEM is not set +CONFIG_CASSINI=m +CONFIG_NIU=m +CONFIG_NET_VENDOR_SYNOPSYS=y +# CONFIG_DWC_XLGMAC is not set +CONFIG_NET_VENDOR_TEHUTI=y +CONFIG_TEHUTI=m +CONFIG_NET_VENDOR_TI=y +# CONFIG_TI_CPSW_PHY_SEL is not set +CONFIG_TLAN=m +CONFIG_NET_VENDOR_VERTEXCOM=y +# CONFIG_MSE102X is not set +CONFIG_NET_VENDOR_VIA=y +# CONFIG_VIA_RHINE is not set +CONFIG_VIA_VELOCITY=m +CONFIG_NET_VENDOR_WANGXUN=y +# CONFIG_NGBE is not set +# CONFIG_TXGBE is not set +CONFIG_NET_VENDOR_WIZNET=y +# CONFIG_WIZNET_W5100 is not set +# CONFIG_WIZNET_W5300 is not set +CONFIG_NET_VENDOR_XILINX=y +# CONFIG_XILINX_EMACLITE is not set +# CONFIG_XILINX_AXI_EMAC is not set +# CONFIG_XILINX_LL_TEMAC is not set +CONFIG_FDDI=y +CONFIG_DEFXX=m +CONFIG_SKFP=m +# CONFIG_HIPPI is not set +# CONFIG_NET_SB1000 is not set +CONFIG_PHYLINK=m +CONFIG_PHYLIB=m +CONFIG_SWPHY=y +CONFIG_LED_TRIGGER_PHY=y +CONFIG_PHYLIB_LEDS=y +CONFIG_FIXED_PHY=m +CONFIG_SFP=m + +# +# MII PHY device drivers +# +CONFIG_AMD_PHY=m +CONFIG_MESON_GXL_PHY=m +CONFIG_ADIN_PHY=m +# CONFIG_ADIN1100_PHY is not set +CONFIG_AQUANTIA_PHY=m +CONFIG_AX88796B_PHY=m +CONFIG_BROADCOM_PHY=m +# CONFIG_BCM54140_PHY is not set +# CONFIG_BCM7XXX_PHY is not set +# CONFIG_BCM84881_PHY is not set +CONFIG_BCM87XX_PHY=m +CONFIG_BCM_NET_PHYLIB=m +CONFIG_CICADA_PHY=m +CONFIG_CORTINA_PHY=m +CONFIG_DAVICOM_PHY=m +CONFIG_ICPLUS_PHY=m +CONFIG_LXT_PHY=m +# CONFIG_INTEL_XWAY_PHY is not set +CONFIG_LSI_ET1011C_PHY=m +CONFIG_MARVELL_PHY=m +CONFIG_MARVELL_10G_PHY=m +# CONFIG_MARVELL_88Q2XXX_PHY is not set +# CONFIG_MARVELL_88X2222_PHY is not set +# CONFIG_MAXLINEAR_GPHY is not set +# CONFIG_MEDIATEK_GE_PHY is not set +CONFIG_MICREL_PHY=m +# CONFIG_MICROCHIP_T1S_PHY is not set +CONFIG_MICROCHIP_PHY=m +CONFIG_MICROCHIP_T1_PHY=m +CONFIG_MICROSEMI_PHY=m +# CONFIG_MOTORCOMM_PHY is not set +CONFIG_NATIONAL_PHY=m +# CONFIG_NXP_CBTX_PHY is not set +# CONFIG_NXP_C45_TJA11XX_PHY is not set +# CONFIG_NXP_TJA11XX_PHY is not set +# CONFIG_NCN26000_PHY is not set +CONFIG_AT803X_PHY=m +CONFIG_QSEMI_PHY=m +CONFIG_REALTEK_PHY=m +CONFIG_RENESAS_PHY=m +CONFIG_ROCKCHIP_PHY=m +CONFIG_SMSC_PHY=m +CONFIG_STE10XP=m +CONFIG_TERANETICS_PHY=m +CONFIG_DP83822_PHY=m +CONFIG_DP83TC811_PHY=m +CONFIG_DP83848_PHY=m +CONFIG_DP83867_PHY=m +# CONFIG_DP83869_PHY is not set +# CONFIG_DP83TD510_PHY is not set +CONFIG_VITESSE_PHY=m +# CONFIG_XILINX_GMII2RGMII is not set +# CONFIG_MICREL_KS8995MA is not set +# CONFIG_PSE_CONTROLLER is not set +CONFIG_CAN_DEV=m +CONFIG_CAN_VCAN=m +CONFIG_CAN_VXCAN=m +CONFIG_CAN_NETLINK=y +CONFIG_CAN_CALC_BITTIMING=y +CONFIG_CAN_RX_OFFLOAD=y +# CONFIG_CAN_CAN327 is not set +# CONFIG_CAN_FLEXCAN is not set +# CONFIG_CAN_GRCAN is not set +# CONFIG_CAN_KVASER_PCIEFD is not set +CONFIG_CAN_SLCAN=m +# CONFIG_CAN_XILINXCAN is not set +# CONFIG_CAN_C_CAN is not set +# CONFIG_CAN_CC770 is not set +# CONFIG_CAN_CTUCANFD_PCI is not set +# CONFIG_CAN_CTUCANFD_PLATFORM is not set +# CONFIG_CAN_IFI_CANFD is not set +# CONFIG_CAN_M_CAN is not set +CONFIG_CAN_PEAK_PCIEFD=m +CONFIG_CAN_SJA1000=m +CONFIG_CAN_EMS_PCI=m +# CONFIG_CAN_F81601 is not set +CONFIG_CAN_KVASER_PCI=m +CONFIG_CAN_PEAK_PCI=m +CONFIG_CAN_PEAK_PCIEC=y +CONFIG_CAN_PLX_PCI=m +CONFIG_CAN_SJA1000_ISA=m +# CONFIG_CAN_SJA1000_PLATFORM is not set +CONFIG_CAN_SOFTING=m + +# +# CAN SPI interfaces +# +# CONFIG_CAN_HI311X is not set +# CONFIG_CAN_MCP251X is not set +# CONFIG_CAN_MCP251XFD is not set +# end of CAN SPI interfaces + +# +# CAN USB interfaces +# +CONFIG_CAN_8DEV_USB=m +CONFIG_CAN_EMS_USB=m +# CONFIG_CAN_ESD_USB is not set +# CONFIG_CAN_ETAS_ES58X is not set +# CONFIG_CAN_F81604 is not set +CONFIG_CAN_GS_USB=m +CONFIG_CAN_KVASER_USB=m +CONFIG_CAN_MCBA_USB=m +CONFIG_CAN_PEAK_USB=m +CONFIG_CAN_UCAN=m +# end of CAN USB interfaces + +# CONFIG_CAN_DEBUG_DEVICES is not set +CONFIG_MDIO_DEVICE=m +CONFIG_MDIO_BUS=m +CONFIG_FWNODE_MDIO=m +CONFIG_OF_MDIO=m +CONFIG_ACPI_MDIO=m +CONFIG_MDIO_DEVRES=m +# CONFIG_MDIO_SUN4I is not set +CONFIG_MDIO_XGENE=m +# CONFIG_MDIO_BITBANG is not set +# CONFIG_MDIO_BCM_UNIMAC is not set +CONFIG_MDIO_CAVIUM=m +CONFIG_MDIO_HISI_FEMAC=m +CONFIG_MDIO_I2C=m +# CONFIG_MDIO_MVUSB is not set +# CONFIG_MDIO_MSCC_MIIM is not set +# CONFIG_MDIO_OCTEON is not set +# CONFIG_MDIO_IPQ4019 is not set +# CONFIG_MDIO_IPQ8064 is not set +CONFIG_MDIO_THUNDER=m + +# +# MDIO Multiplexers +# +CONFIG_MDIO_BUS_MUX=m +CONFIG_MDIO_BUS_MUX_MESON_G12A=m +CONFIG_MDIO_BUS_MUX_MESON_GXL=m +# CONFIG_MDIO_BUS_MUX_GPIO is not set +# CONFIG_MDIO_BUS_MUX_MULTIPLEXER is not set +CONFIG_MDIO_BUS_MUX_MMIOREG=m + +# +# PCS device drivers +# +CONFIG_PCS_XPCS=m +# end of PCS device drivers + +# CONFIG_PLIP is not set +CONFIG_PPP=m +CONFIG_PPP_BSDCOMP=m +CONFIG_PPP_DEFLATE=m +CONFIG_PPP_FILTER=y +CONFIG_PPP_MPPE=m +CONFIG_PPP_MULTILINK=y +CONFIG_PPPOATM=m +CONFIG_PPPOE=m +# CONFIG_PPPOE_HASH_BITS_1 is not set +# CONFIG_PPPOE_HASH_BITS_2 is not set +CONFIG_PPPOE_HASH_BITS_4=y +# CONFIG_PPPOE_HASH_BITS_8 is not set +CONFIG_PPPOE_HASH_BITS=4 +CONFIG_PPTP=m +CONFIG_PPPOL2TP=m +CONFIG_PPP_ASYNC=m +CONFIG_PPP_SYNC_TTY=m +CONFIG_SLIP=m +CONFIG_SLHC=m +CONFIG_SLIP_COMPRESSED=y +CONFIG_SLIP_SMART=y +CONFIG_SLIP_MODE_SLIP6=y + +# +# Host-side USB support is needed for USB Network Adapter support +# +CONFIG_USB_NET_DRIVERS=m +CONFIG_USB_CATC=m +CONFIG_USB_KAWETH=m +CONFIG_USB_PEGASUS=m +CONFIG_USB_RTL8150=m +CONFIG_USB_RTL8152=m +CONFIG_USB_LAN78XX=m +CONFIG_USB_USBNET=m +CONFIG_USB_NET_AX8817X=m +CONFIG_USB_NET_AX88179_178A=m +CONFIG_USB_NET_CDCETHER=m +CONFIG_USB_NET_CDC_EEM=m +CONFIG_USB_NET_CDC_NCM=m +CONFIG_USB_NET_HUAWEI_CDC_NCM=m +CONFIG_USB_NET_CDC_MBIM=m +CONFIG_USB_NET_DM9601=m +CONFIG_USB_NET_SR9700=m +CONFIG_USB_NET_SR9800=m +CONFIG_USB_NET_SMSC75XX=m +CONFIG_USB_NET_SMSC95XX=m +CONFIG_USB_NET_GL620A=m +CONFIG_USB_NET_NET1080=m +CONFIG_USB_NET_PLUSB=m +CONFIG_USB_NET_MCS7830=m +CONFIG_USB_NET_RNDIS_HOST=m +CONFIG_USB_NET_CDC_SUBSET_ENABLE=m +CONFIG_USB_NET_CDC_SUBSET=m +CONFIG_USB_ALI_M5632=y +CONFIG_USB_AN2720=y +CONFIG_USB_BELKIN=y +CONFIG_USB_ARMLINUX=y +CONFIG_USB_EPSON2888=y +CONFIG_USB_KC2190=y +CONFIG_USB_NET_ZAURUS=m +CONFIG_USB_NET_CX82310_ETH=m +CONFIG_USB_NET_KALMIA=m +CONFIG_USB_NET_QMI_WWAN=m +CONFIG_USB_HSO=m +CONFIG_USB_NET_INT51X1=m +CONFIG_USB_CDC_PHONET=m +CONFIG_USB_IPHETH=m +CONFIG_USB_SIERRA_NET=m +CONFIG_USB_VL600=m +CONFIG_USB_NET_CH9200=m +# CONFIG_USB_NET_AQC111 is not set +CONFIG_USB_RTL8153_ECM=m +CONFIG_WLAN=y +CONFIG_WLAN_VENDOR_ADMTEK=y +CONFIG_ADM8211=m +CONFIG_ATH_COMMON=m +CONFIG_WLAN_VENDOR_ATH=y +# CONFIG_ATH_DEBUG is not set +CONFIG_ATH5K=m +# CONFIG_ATH5K_DEBUG is not set +# CONFIG_ATH5K_TRACER is not set +CONFIG_ATH5K_PCI=y +CONFIG_ATH9K_HW=m +CONFIG_ATH9K_COMMON=m +CONFIG_ATH9K_BTCOEX_SUPPORT=y +CONFIG_ATH9K=m +CONFIG_ATH9K_PCI=y +# CONFIG_ATH9K_AHB is not set +# CONFIG_ATH9K_DEBUGFS is not set +# CONFIG_ATH9K_DYNACK is not set +# CONFIG_ATH9K_WOW is not set +CONFIG_ATH9K_RFKILL=y +CONFIG_ATH9K_CHANNEL_CONTEXT=y +CONFIG_ATH9K_PCOEM=y +CONFIG_ATH9K_PCI_NO_EEPROM=m +CONFIG_ATH9K_HTC=m +# CONFIG_ATH9K_HTC_DEBUGFS is not set +# CONFIG_ATH9K_HWRNG is not set +CONFIG_CARL9170=m +CONFIG_CARL9170_LEDS=y +CONFIG_CARL9170_WPC=y +# CONFIG_CARL9170_HWRNG is not set +CONFIG_ATH6KL=m +CONFIG_ATH6KL_SDIO=m +CONFIG_ATH6KL_USB=m +# CONFIG_ATH6KL_DEBUG is not set +# CONFIG_ATH6KL_TRACING is not set +CONFIG_AR5523=m +CONFIG_WIL6210=m +CONFIG_WIL6210_ISR_COR=y +CONFIG_WIL6210_TRACING=y +CONFIG_WIL6210_DEBUGFS=y +CONFIG_ATH10K=m +CONFIG_ATH10K_CE=y +CONFIG_ATH10K_PCI=m +# CONFIG_ATH10K_AHB is not set +# CONFIG_ATH10K_SDIO is not set +CONFIG_ATH10K_USB=m +# CONFIG_ATH10K_DEBUG is not set +# CONFIG_ATH10K_DEBUGFS is not set +# CONFIG_ATH10K_TRACING is not set +CONFIG_WCN36XX=m +# CONFIG_WCN36XX_DEBUGFS is not set +# CONFIG_ATH11K is not set +# CONFIG_ATH12K is not set +CONFIG_WLAN_VENDOR_ATMEL=y +# CONFIG_ATMEL is not set +CONFIG_AT76C50X_USB=m +CONFIG_WLAN_VENDOR_BROADCOM=y +CONFIG_B43=m +CONFIG_B43_BCMA=y +CONFIG_B43_SSB=y +CONFIG_B43_BUSES_BCMA_AND_SSB=y +# CONFIG_B43_BUSES_BCMA is not set +# CONFIG_B43_BUSES_SSB is not set +CONFIG_B43_PCI_AUTOSELECT=y +CONFIG_B43_PCICORE_AUTOSELECT=y +CONFIG_B43_SDIO=y +CONFIG_B43_BCMA_PIO=y +CONFIG_B43_PIO=y +CONFIG_B43_PHY_G=y +CONFIG_B43_PHY_N=y +CONFIG_B43_PHY_LP=y +CONFIG_B43_PHY_HT=y +CONFIG_B43_LEDS=y +CONFIG_B43_HWRNG=y +# CONFIG_B43_DEBUG is not set +CONFIG_B43LEGACY=m +CONFIG_B43LEGACY_PCI_AUTOSELECT=y +CONFIG_B43LEGACY_PCICORE_AUTOSELECT=y +CONFIG_B43LEGACY_LEDS=y +CONFIG_B43LEGACY_HWRNG=y +CONFIG_B43LEGACY_DEBUG=y +CONFIG_B43LEGACY_DMA=y +CONFIG_B43LEGACY_PIO=y +CONFIG_B43LEGACY_DMA_AND_PIO_MODE=y +# CONFIG_B43LEGACY_DMA_MODE is not set +# CONFIG_B43LEGACY_PIO_MODE is not set +CONFIG_BRCMUTIL=m +CONFIG_BRCMSMAC=m +CONFIG_BRCMFMAC=m +CONFIG_BRCMFMAC_PROTO_BCDC=y +CONFIG_BRCMFMAC_PROTO_MSGBUF=y +CONFIG_BRCMFMAC_SDIO=y +CONFIG_BRCMFMAC_USB=y +CONFIG_BRCMFMAC_PCIE=y +# CONFIG_BRCM_TRACING is not set +# CONFIG_BRCMDBG is not set +CONFIG_WLAN_VENDOR_CISCO=y +# CONFIG_AIRO is not set +CONFIG_WLAN_VENDOR_INTEL=y +# CONFIG_IPW2100 is not set +CONFIG_IPW2200=m +CONFIG_IPW2200_MONITOR=y +CONFIG_IPW2200_RADIOTAP=y +CONFIG_IPW2200_PROMISCUOUS=y +CONFIG_IPW2200_QOS=y +# CONFIG_IPW2200_DEBUG is not set +CONFIG_LIBIPW=m +# CONFIG_LIBIPW_DEBUG is not set +CONFIG_IWLEGACY=m +CONFIG_IWL4965=m +CONFIG_IWL3945=m + +# +# iwl3945 / iwl4965 Debugging Options +# +# CONFIG_IWLEGACY_DEBUG is not set +# end of iwl3945 / iwl4965 Debugging Options + +CONFIG_IWLWIFI=m +CONFIG_IWLWIFI_LEDS=y +CONFIG_IWLDVM=m +CONFIG_IWLMVM=m +CONFIG_IWLWIFI_OPMODE_MODULAR=y + +# +# Debugging Options +# +# CONFIG_IWLWIFI_DEBUG is not set +# CONFIG_IWLWIFI_DEVICE_TRACING is not set +# end of Debugging Options + +CONFIG_WLAN_VENDOR_INTERSIL=y +CONFIG_HOSTAP=m +CONFIG_HOSTAP_FIRMWARE=y +# CONFIG_HOSTAP_FIRMWARE_NVRAM is not set +CONFIG_HOSTAP_PLX=m +CONFIG_HOSTAP_PCI=m +# CONFIG_HERMES is not set +CONFIG_P54_COMMON=m +CONFIG_P54_USB=m +CONFIG_P54_PCI=m +# CONFIG_P54_SPI is not set +CONFIG_P54_LEDS=y +CONFIG_WLAN_VENDOR_MARVELL=y +CONFIG_LIBERTAS=m +CONFIG_LIBERTAS_USB=m +CONFIG_LIBERTAS_SDIO=m +# CONFIG_LIBERTAS_SPI is not set +# CONFIG_LIBERTAS_DEBUG is not set +CONFIG_LIBERTAS_MESH=y +CONFIG_LIBERTAS_THINFIRM=m +# CONFIG_LIBERTAS_THINFIRM_DEBUG is not set +CONFIG_LIBERTAS_THINFIRM_USB=m +CONFIG_MWIFIEX=m +# CONFIG_MWIFIEX_SDIO is not set +CONFIG_MWIFIEX_PCIE=m +# CONFIG_MWIFIEX_USB is not set +CONFIG_MWL8K=m +CONFIG_WLAN_VENDOR_MEDIATEK=y +CONFIG_MT7601U=m +CONFIG_MT76_CORE=m +CONFIG_MT76_LEDS=y +CONFIG_MT76_USB=m +CONFIG_MT76x02_LIB=m +CONFIG_MT76x02_USB=m +CONFIG_MT76x0_COMMON=m +CONFIG_MT76x0U=m +# CONFIG_MT76x0E is not set +CONFIG_MT76x2_COMMON=m +CONFIG_MT76x2E=m +CONFIG_MT76x2U=m +# CONFIG_MT7603E is not set +# CONFIG_MT7615E is not set +# CONFIG_MT7663U is not set +# CONFIG_MT7663S is not set +# CONFIG_MT7915E is not set +# CONFIG_MT7921E is not set +# CONFIG_MT7921S is not set +# CONFIG_MT7921U is not set +# CONFIG_MT7996E is not set +CONFIG_WLAN_VENDOR_MICROCHIP=y +# CONFIG_WILC1000_SDIO is not set +# CONFIG_WILC1000_SPI is not set +CONFIG_WLAN_VENDOR_PURELIFI=y +# CONFIG_PLFXLC is not set +CONFIG_WLAN_VENDOR_RALINK=y +CONFIG_RT2X00=m +CONFIG_RT2400PCI=m +CONFIG_RT2500PCI=m +CONFIG_RT61PCI=m +CONFIG_RT2800PCI=m +CONFIG_RT2800PCI_RT33XX=y +CONFIG_RT2800PCI_RT35XX=y +CONFIG_RT2800PCI_RT53XX=y +CONFIG_RT2800PCI_RT3290=y +CONFIG_RT2500USB=m +CONFIG_RT73USB=m +CONFIG_RT2800USB=m +CONFIG_RT2800USB_RT33XX=y +CONFIG_RT2800USB_RT35XX=y +CONFIG_RT2800USB_RT3573=y +CONFIG_RT2800USB_RT53XX=y +CONFIG_RT2800USB_RT55XX=y +# CONFIG_RT2800USB_UNKNOWN is not set +CONFIG_RT2800_LIB=m +CONFIG_RT2800_LIB_MMIO=m +CONFIG_RT2X00_LIB_MMIO=m +CONFIG_RT2X00_LIB_PCI=m +CONFIG_RT2X00_LIB_USB=m +CONFIG_RT2X00_LIB=m +CONFIG_RT2X00_LIB_FIRMWARE=y +CONFIG_RT2X00_LIB_CRYPTO=y +CONFIG_RT2X00_LIB_LEDS=y +# CONFIG_RT2X00_DEBUG is not set +CONFIG_WLAN_VENDOR_REALTEK=y +CONFIG_RTL8180=m +CONFIG_RTL8187=m +CONFIG_RTL8187_LEDS=y +CONFIG_RTL_CARDS=m +CONFIG_RTL8192CE=m +CONFIG_RTL8192SE=m +CONFIG_RTL8192DE=m +CONFIG_RTL8723AE=m +CONFIG_RTL8723BE=m +CONFIG_RTL8188EE=m +CONFIG_RTL8192EE=m +CONFIG_RTL8821AE=m +CONFIG_RTL8192CU=m +CONFIG_RTLWIFI=m +CONFIG_RTLWIFI_PCI=m +CONFIG_RTLWIFI_USB=m +# CONFIG_RTLWIFI_DEBUG is not set +CONFIG_RTL8192C_COMMON=m +CONFIG_RTL8723_COMMON=m +CONFIG_RTLBTCOEXIST=m +CONFIG_RTL8XXXU=m +# CONFIG_RTL8XXXU_UNTESTED is not set +CONFIG_RTW88=m +CONFIG_RTW88_CORE=m +CONFIG_RTW88_PCI=m +CONFIG_RTW88_8822B=m +CONFIG_RTW88_8822C=m +CONFIG_RTW88_8822BE=m +# CONFIG_RTW88_8822BS is not set +# CONFIG_RTW88_8822BU is not set +CONFIG_RTW88_8822CE=m +# CONFIG_RTW88_8822CS is not set +# CONFIG_RTW88_8822CU is not set +# CONFIG_RTW88_8723DE is not set +# CONFIG_RTW88_8723DS is not set +# CONFIG_RTW88_8723DU is not set +# CONFIG_RTW88_8821CE is not set +# CONFIG_RTW88_8821CS is not set +# CONFIG_RTW88_8821CU is not set +# CONFIG_RTW88_DEBUG is not set +# CONFIG_RTW88_DEBUGFS is not set +# CONFIG_RTW89 is not set +CONFIG_WLAN_VENDOR_RSI=y +CONFIG_RSI_91X=m +CONFIG_RSI_DEBUGFS=y +# CONFIG_RSI_SDIO is not set +CONFIG_RSI_USB=m +CONFIG_RSI_COEX=y +CONFIG_WLAN_VENDOR_SILABS=y +# CONFIG_WFX is not set +CONFIG_WLAN_VENDOR_ST=y +# CONFIG_CW1200 is not set +CONFIG_WLAN_VENDOR_TI=y +CONFIG_WL1251=m +CONFIG_WL1251_SPI=m +CONFIG_WL1251_SDIO=m +CONFIG_WL12XX=m +CONFIG_WL18XX=m +CONFIG_WLCORE=m +CONFIG_WLCORE_SPI=m +CONFIG_WLCORE_SDIO=m +CONFIG_WLAN_VENDOR_ZYDAS=y +# CONFIG_USB_ZD1201 is not set +CONFIG_ZD1211RW=m +# CONFIG_ZD1211RW_DEBUG is not set +CONFIG_WLAN_VENDOR_QUANTENNA=y +# CONFIG_QTNFMAC_PCIE is not set +CONFIG_USB_NET_RNDIS_WLAN=m +CONFIG_MAC80211_HWSIM=m +# CONFIG_VIRT_WIFI is not set +# CONFIG_WAN is not set +CONFIG_IEEE802154_DRIVERS=m +CONFIG_IEEE802154_FAKELB=m +CONFIG_IEEE802154_AT86RF230=m +CONFIG_IEEE802154_MRF24J40=m +CONFIG_IEEE802154_CC2520=m +CONFIG_IEEE802154_ATUSB=m +CONFIG_IEEE802154_ADF7242=m +# CONFIG_IEEE802154_CA8210 is not set +# CONFIG_IEEE802154_MCR20A is not set +CONFIG_IEEE802154_HWSIM=m + +# +# Wireless WAN +# +# CONFIG_WWAN is not set +# end of Wireless WAN + +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +# CONFIG_VMXNET3 is not set +# CONFIG_FUJITSU_ES is not set +CONFIG_HYPERV_NET=m +# CONFIG_NETDEVSIM is not set +CONFIG_NET_FAILOVER=m +# CONFIG_ISDN is not set + +# +# Input device support +# +CONFIG_INPUT=y +CONFIG_INPUT_LEDS=y +CONFIG_INPUT_FF_MEMLESS=m +CONFIG_INPUT_SPARSEKMAP=m +CONFIG_INPUT_MATRIXKMAP=m +CONFIG_INPUT_VIVALDIFMAP=y + +# +# Userland interfaces +# +CONFIG_INPUT_MOUSEDEV=y +CONFIG_INPUT_MOUSEDEV_PSAUX=y +CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 +CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 +CONFIG_INPUT_JOYDEV=m +CONFIG_INPUT_EVDEV=m +# CONFIG_INPUT_EVBUG is not set + +# +# Input Device Drivers +# +CONFIG_INPUT_KEYBOARD=y +# CONFIG_KEYBOARD_ADC is not set +CONFIG_KEYBOARD_ADP5588=m +# CONFIG_KEYBOARD_ADP5589 is not set +CONFIG_KEYBOARD_ATKBD=y +# CONFIG_KEYBOARD_QT1050 is not set +# CONFIG_KEYBOARD_QT1070 is not set +CONFIG_KEYBOARD_QT2160=m +# CONFIG_KEYBOARD_DLINK_DIR685 is not set +# CONFIG_KEYBOARD_LKKBD is not set +CONFIG_KEYBOARD_GPIO=m +# CONFIG_KEYBOARD_GPIO_POLLED is not set +# CONFIG_KEYBOARD_TCA6416 is not set +# CONFIG_KEYBOARD_TCA8418 is not set +# CONFIG_KEYBOARD_MATRIX is not set +CONFIG_KEYBOARD_LM8323=m +# CONFIG_KEYBOARD_LM8333 is not set +CONFIG_KEYBOARD_MAX7359=m +# CONFIG_KEYBOARD_MCS is not set +# CONFIG_KEYBOARD_MPR121 is not set +# CONFIG_KEYBOARD_NEWTON is not set +CONFIG_KEYBOARD_TEGRA=m +CONFIG_KEYBOARD_OPENCORES=m +# CONFIG_KEYBOARD_PINEPHONE is not set +# CONFIG_KEYBOARD_SAMSUNG is not set +CONFIG_KEYBOARD_STOWAWAY=m +# CONFIG_KEYBOARD_SUNKBD is not set +# CONFIG_KEYBOARD_SUN4I_LRADC is not set +# CONFIG_KEYBOARD_OMAP4 is not set +# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set +# CONFIG_KEYBOARD_XTKBD is not set +CONFIG_KEYBOARD_CROS_EC=m +# CONFIG_KEYBOARD_CAP11XX is not set +# CONFIG_KEYBOARD_BCM is not set +# CONFIG_KEYBOARD_CYPRESS_SF is not set +CONFIG_INPUT_MOUSE=y +CONFIG_MOUSE_PS2=y +CONFIG_MOUSE_PS2_ALPS=y +CONFIG_MOUSE_PS2_BYD=y +CONFIG_MOUSE_PS2_LOGIPS2PP=y +CONFIG_MOUSE_PS2_SYNAPTICS=y +CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y +CONFIG_MOUSE_PS2_CYPRESS=y +CONFIG_MOUSE_PS2_TRACKPOINT=y +CONFIG_MOUSE_PS2_ELANTECH=y +CONFIG_MOUSE_PS2_ELANTECH_SMBUS=y +CONFIG_MOUSE_PS2_SENTELIC=y +# CONFIG_MOUSE_PS2_TOUCHKIT is not set +CONFIG_MOUSE_PS2_FOCALTECH=y +CONFIG_MOUSE_PS2_SMBUS=y +# CONFIG_MOUSE_SERIAL is not set +# CONFIG_MOUSE_APPLETOUCH is not set +# CONFIG_MOUSE_BCM5974 is not set +# CONFIG_MOUSE_CYAPA is not set +CONFIG_MOUSE_ELAN_I2C=m +CONFIG_MOUSE_ELAN_I2C_I2C=y +# CONFIG_MOUSE_ELAN_I2C_SMBUS is not set +# CONFIG_MOUSE_VSXXXAA is not set +# CONFIG_MOUSE_GPIO is not set +CONFIG_MOUSE_SYNAPTICS_I2C=m +CONFIG_MOUSE_SYNAPTICS_USB=m +# CONFIG_INPUT_JOYSTICK is not set +CONFIG_INPUT_TABLET=y +CONFIG_TABLET_USB_ACECAD=m +CONFIG_TABLET_USB_AIPTEK=m +CONFIG_TABLET_USB_HANWANG=m +CONFIG_TABLET_USB_KBTAB=m +CONFIG_TABLET_USB_PEGASUS=m +CONFIG_TABLET_SERIAL_WACOM4=m +CONFIG_INPUT_TOUCHSCREEN=y +CONFIG_TOUCHSCREEN_ADS7846=m +CONFIG_TOUCHSCREEN_AD7877=m +CONFIG_TOUCHSCREEN_AD7879=m +CONFIG_TOUCHSCREEN_AD7879_I2C=m +# CONFIG_TOUCHSCREEN_AD7879_SPI is not set +# CONFIG_TOUCHSCREEN_ADC is not set +# CONFIG_TOUCHSCREEN_AR1021_I2C is not set +CONFIG_TOUCHSCREEN_ATMEL_MXT=m +# CONFIG_TOUCHSCREEN_ATMEL_MXT_T37 is not set +# CONFIG_TOUCHSCREEN_AUO_PIXCIR is not set +# CONFIG_TOUCHSCREEN_BU21013 is not set +# CONFIG_TOUCHSCREEN_BU21029 is not set +# CONFIG_TOUCHSCREEN_CHIPONE_ICN8318 is not set +# CONFIG_TOUCHSCREEN_CHIPONE_ICN8505 is not set +# CONFIG_TOUCHSCREEN_CY8CTMA140 is not set +# CONFIG_TOUCHSCREEN_CY8CTMG110 is not set +# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set +# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set +# CONFIG_TOUCHSCREEN_CYTTSP5 is not set +CONFIG_TOUCHSCREEN_DYNAPRO=m +CONFIG_TOUCHSCREEN_HAMPSHIRE=m +CONFIG_TOUCHSCREEN_EETI=m +# CONFIG_TOUCHSCREEN_EGALAX is not set +# CONFIG_TOUCHSCREEN_EGALAX_SERIAL is not set +# CONFIG_TOUCHSCREEN_EXC3000 is not set +CONFIG_TOUCHSCREEN_FUJITSU=m +CONFIG_TOUCHSCREEN_GOODIX=m +# CONFIG_TOUCHSCREEN_HIDEEP is not set +# CONFIG_TOUCHSCREEN_HYCON_HY46XX is not set +# CONFIG_TOUCHSCREEN_HYNITRON_CSTXXX is not set +# CONFIG_TOUCHSCREEN_ILI210X is not set +# CONFIG_TOUCHSCREEN_ILITEK is not set +# CONFIG_TOUCHSCREEN_S6SY761 is not set +CONFIG_TOUCHSCREEN_GUNZE=m +# CONFIG_TOUCHSCREEN_EKTF2127 is not set +CONFIG_TOUCHSCREEN_ELAN=m +CONFIG_TOUCHSCREEN_ELO=m +CONFIG_TOUCHSCREEN_WACOM_W8001=m +# CONFIG_TOUCHSCREEN_WACOM_I2C is not set +# CONFIG_TOUCHSCREEN_MAX11801 is not set +CONFIG_TOUCHSCREEN_MCS5000=m +# CONFIG_TOUCHSCREEN_MMS114 is not set +# CONFIG_TOUCHSCREEN_MELFAS_MIP4 is not set +# CONFIG_TOUCHSCREEN_MSG2638 is not set +CONFIG_TOUCHSCREEN_MTOUCH=m +# CONFIG_TOUCHSCREEN_NOVATEK_NVT_TS is not set +# CONFIG_TOUCHSCREEN_IMAGIS is not set +# CONFIG_TOUCHSCREEN_IMX6UL_TSC is not set +CONFIG_TOUCHSCREEN_INEXIO=m +CONFIG_TOUCHSCREEN_PENMOUNT=m +# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set +CONFIG_TOUCHSCREEN_TOUCHRIGHT=m +CONFIG_TOUCHSCREEN_TOUCHWIN=m +# CONFIG_TOUCHSCREEN_PIXCIR is not set +# CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set +CONFIG_TOUCHSCREEN_WM97XX=m +CONFIG_TOUCHSCREEN_WM9705=y +CONFIG_TOUCHSCREEN_WM9712=y +CONFIG_TOUCHSCREEN_WM9713=y +CONFIG_TOUCHSCREEN_USB_COMPOSITE=m +CONFIG_TOUCHSCREEN_USB_EGALAX=y +CONFIG_TOUCHSCREEN_USB_PANJIT=y +CONFIG_TOUCHSCREEN_USB_3M=y +CONFIG_TOUCHSCREEN_USB_ITM=y +CONFIG_TOUCHSCREEN_USB_ETURBO=y +CONFIG_TOUCHSCREEN_USB_GUNZE=y +CONFIG_TOUCHSCREEN_USB_DMC_TSC10=y +CONFIG_TOUCHSCREEN_USB_IRTOUCH=y +CONFIG_TOUCHSCREEN_USB_IDEALTEK=y +CONFIG_TOUCHSCREEN_USB_GENERAL_TOUCH=y +CONFIG_TOUCHSCREEN_USB_GOTOP=y +CONFIG_TOUCHSCREEN_USB_JASTEC=y +CONFIG_TOUCHSCREEN_USB_ELO=y +CONFIG_TOUCHSCREEN_USB_E2I=y +CONFIG_TOUCHSCREEN_USB_ZYTRONIC=y +CONFIG_TOUCHSCREEN_USB_ETT_TC45USB=y +CONFIG_TOUCHSCREEN_USB_NEXIO=y +CONFIG_TOUCHSCREEN_USB_EASYTOUCH=y +CONFIG_TOUCHSCREEN_TOUCHIT213=m +# CONFIG_TOUCHSCREEN_TSC_SERIO is not set +# CONFIG_TOUCHSCREEN_TSC2004 is not set +# CONFIG_TOUCHSCREEN_TSC2005 is not set +CONFIG_TOUCHSCREEN_TSC2007=m +# CONFIG_TOUCHSCREEN_TSC2007_IIO is not set +# CONFIG_TOUCHSCREEN_RM_TS is not set +# CONFIG_TOUCHSCREEN_SILEAD is not set +# CONFIG_TOUCHSCREEN_SIS_I2C is not set +# CONFIG_TOUCHSCREEN_ST1232 is not set +# CONFIG_TOUCHSCREEN_STMFTS is not set +# CONFIG_TOUCHSCREEN_SUN4I is not set +CONFIG_TOUCHSCREEN_SUR40=m +# CONFIG_TOUCHSCREEN_SURFACE3_SPI is not set +# CONFIG_TOUCHSCREEN_SX8654 is not set +CONFIG_TOUCHSCREEN_TPS6507X=m +# CONFIG_TOUCHSCREEN_ZET6223 is not set +# CONFIG_TOUCHSCREEN_ZFORCE is not set +# CONFIG_TOUCHSCREEN_COLIBRI_VF50 is not set +# CONFIG_TOUCHSCREEN_ROHM_BU21023 is not set +# CONFIG_TOUCHSCREEN_IQS5XX is not set +# CONFIG_TOUCHSCREEN_IQS7211 is not set +# CONFIG_TOUCHSCREEN_ZINITIX is not set +# CONFIG_TOUCHSCREEN_HIMAX_HX83112B is not set +CONFIG_INPUT_MISC=y +# CONFIG_INPUT_AD714X is not set +# CONFIG_INPUT_ATMEL_CAPTOUCH is not set +# CONFIG_INPUT_BMA150 is not set +# CONFIG_INPUT_E3X0_BUTTON is not set +CONFIG_INPUT_PM8941_PWRKEY=m +# CONFIG_INPUT_PM8XXX_VIBRATOR is not set +# CONFIG_INPUT_MMA8450 is not set +# CONFIG_INPUT_GPIO_BEEPER is not set +# CONFIG_INPUT_GPIO_DECODER is not set +# CONFIG_INPUT_GPIO_VIBRA is not set +CONFIG_INPUT_ATI_REMOTE2=m +CONFIG_INPUT_KEYSPAN_REMOTE=m +# CONFIG_INPUT_KXTJ9 is not set +CONFIG_INPUT_POWERMATE=m +CONFIG_INPUT_YEALINK=m +CONFIG_INPUT_CM109=m +# CONFIG_INPUT_REGULATOR_HAPTIC is not set +CONFIG_INPUT_AXP20X_PEK=m +CONFIG_INPUT_UINPUT=m +# CONFIG_INPUT_PCF8574 is not set +# CONFIG_INPUT_PWM_BEEPER is not set +# CONFIG_INPUT_PWM_VIBRA is not set +# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set +# CONFIG_INPUT_DA7280_HAPTICS is not set +# CONFIG_INPUT_ADXL34X is not set +# CONFIG_INPUT_IBM_PANEL is not set +# CONFIG_INPUT_IMS_PCU is not set +# CONFIG_INPUT_IQS269A is not set +# CONFIG_INPUT_IQS626A is not set +# CONFIG_INPUT_IQS7222 is not set +# CONFIG_INPUT_CMA3000 is not set +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +# CONFIG_INPUT_SOC_BUTTON_ARRAY is not set +# CONFIG_INPUT_DRV260X_HAPTICS is not set +# CONFIG_INPUT_DRV2665_HAPTICS is not set +# CONFIG_INPUT_DRV2667_HAPTICS is not set +CONFIG_INPUT_HISI_POWERKEY=m +CONFIG_RMI4_CORE=m +# CONFIG_RMI4_I2C is not set +# CONFIG_RMI4_SPI is not set +# CONFIG_RMI4_SMB is not set +CONFIG_RMI4_F03=y +CONFIG_RMI4_F03_SERIO=m +CONFIG_RMI4_2D_SENSOR=y +CONFIG_RMI4_F11=y +CONFIG_RMI4_F12=y +CONFIG_RMI4_F30=y +CONFIG_RMI4_F34=y +# CONFIG_RMI4_F3A is not set +# CONFIG_RMI4_F54 is not set +CONFIG_RMI4_F55=y + +# +# Hardware I/O ports +# +CONFIG_SERIO=y +CONFIG_SERIO_SERPORT=y +# CONFIG_SERIO_PARKBD is not set +# CONFIG_SERIO_AMBAKMI is not set +# CONFIG_SERIO_PCIPS2 is not set +CONFIG_SERIO_LIBPS2=y +# CONFIG_SERIO_RAW is not set +CONFIG_SERIO_ALTERA_PS2=m +# CONFIG_SERIO_PS2MULT is not set +# CONFIG_SERIO_ARC_PS2 is not set +# CONFIG_SERIO_APBPS2 is not set +CONFIG_HYPERV_KEYBOARD=m +# CONFIG_SERIO_SUN4I_PS2 is not set +# CONFIG_SERIO_GPIO_PS2 is not set +# CONFIG_USERIO is not set +# CONFIG_GAMEPORT is not set +# end of Hardware I/O ports +# end of Input device support + +# +# Character devices +# +CONFIG_TTY=y +CONFIG_VT=y +CONFIG_CONSOLE_TRANSLATIONS=y +CONFIG_VT_CONSOLE=y +CONFIG_VT_CONSOLE_SLEEP=y +CONFIG_HW_CONSOLE=y +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_UNIX98_PTYS=y +# CONFIG_LEGACY_PTYS is not set +CONFIG_LEGACY_TIOCSTI=y +CONFIG_LDISC_AUTOLOAD=y + +# +# Serial drivers +# +CONFIG_SERIAL_EARLYCON=y +CONFIG_SERIAL_8250=y +# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set +CONFIG_SERIAL_8250_PNP=y +CONFIG_SERIAL_8250_16550A_VARIANTS=y +# CONFIG_SERIAL_8250_FINTEK is not set +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_DMA=y +CONFIG_SERIAL_8250_PCILIB=y +CONFIG_SERIAL_8250_PCI=y +CONFIG_SERIAL_8250_EXAR=m +CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 +CONFIG_SERIAL_8250_EXTENDED=y +# CONFIG_SERIAL_8250_MANY_PORTS is not set +# CONFIG_SERIAL_8250_PCI1XXXX is not set +CONFIG_SERIAL_8250_SHARE_IRQ=y +# CONFIG_SERIAL_8250_DETECT_IRQ is not set +# CONFIG_SERIAL_8250_RSA is not set +CONFIG_SERIAL_8250_DWLIB=y +CONFIG_SERIAL_8250_FSL=y +CONFIG_SERIAL_8250_DW=y +# CONFIG_SERIAL_8250_RT288X is not set +CONFIG_SERIAL_8250_PERICOM=y +CONFIG_SERIAL_8250_TEGRA=y +CONFIG_SERIAL_OF_PLATFORM=y + +# +# Non-8250 serial port support +# +CONFIG_SERIAL_AMBA_PL010=y +CONFIG_SERIAL_AMBA_PL010_CONSOLE=y +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +# CONFIG_SERIAL_EARLYCON_SEMIHOST is not set +CONFIG_SERIAL_MESON=y +CONFIG_SERIAL_MESON_CONSOLE=y +CONFIG_SERIAL_TEGRA=y +# CONFIG_SERIAL_MAX3100 is not set +# CONFIG_SERIAL_MAX310X is not set +# CONFIG_SERIAL_UARTLITE is not set +CONFIG_SERIAL_CORE=y +CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_JSM is not set +CONFIG_SERIAL_MSM=y +CONFIG_SERIAL_MSM_CONSOLE=y +# CONFIG_SERIAL_SIFIVE is not set +# CONFIG_SERIAL_SCCNXP is not set +# CONFIG_SERIAL_SC16IS7XX is not set +# CONFIG_SERIAL_ALTERA_JTAGUART is not set +# CONFIG_SERIAL_ALTERA_UART is not set +CONFIG_SERIAL_XILINX_PS_UART=y +CONFIG_SERIAL_XILINX_PS_UART_CONSOLE=y +# CONFIG_SERIAL_ARC is not set +CONFIG_SERIAL_RP2=m +CONFIG_SERIAL_RP2_NR_UARTS=32 +# CONFIG_SERIAL_FSL_LPUART is not set +# CONFIG_SERIAL_FSL_LINFLEXUART is not set +# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set +# CONFIG_SERIAL_SPRD is not set +CONFIG_SERIAL_MVEBU_UART=y +CONFIG_SERIAL_MVEBU_CONSOLE=y +# end of Serial drivers + +CONFIG_SERIAL_MCTRL_GPIO=y +# CONFIG_SERIAL_NONSTANDARD is not set +CONFIG_N_GSM=m +CONFIG_NOZOMI=m +# CONFIG_NULL_TTY is not set +CONFIG_HVC_DRIVER=y +CONFIG_HVC_IRQ=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +# CONFIG_HVC_DCC is not set +# CONFIG_RPMSG_TTY is not set +CONFIG_SERIAL_DEV_BUS=m +CONFIG_TTY_PRINTK=m +CONFIG_TTY_PRINTK_LEVEL=6 +# CONFIG_PRINTER is not set +# CONFIG_PPDEV is not set +CONFIG_VIRTIO_CONSOLE=m +CONFIG_IPMI_HANDLER=m +CONFIG_IPMI_DMI_DECODE=y +CONFIG_IPMI_PLAT_DATA=y +# CONFIG_IPMI_PANIC_EVENT is not set +CONFIG_IPMI_DEVICE_INTERFACE=m +CONFIG_IPMI_SI=m +CONFIG_IPMI_SSIF=m +# CONFIG_IPMI_IPMB is not set +CONFIG_IPMI_WATCHDOG=m +CONFIG_IPMI_POWEROFF=m +# CONFIG_SSIF_IPMI_BMC is not set +# CONFIG_IPMB_DEVICE_INTERFACE is not set +CONFIG_HW_RANDOM=m +# CONFIG_HW_RANDOM_TIMERIOMEM is not set +# CONFIG_HW_RANDOM_BA431 is not set +CONFIG_HW_RANDOM_OMAP=m +CONFIG_HW_RANDOM_VIRTIO=m +CONFIG_HW_RANDOM_HISI=m +CONFIG_HW_RANDOM_HISTB=m +CONFIG_HW_RANDOM_XGENE=m +CONFIG_HW_RANDOM_MESON=m +CONFIG_HW_RANDOM_CAVIUM=m +CONFIG_HW_RANDOM_OPTEE=m +# CONFIG_HW_RANDOM_CCTRNG is not set +# CONFIG_HW_RANDOM_XIPHERA is not set +CONFIG_HW_RANDOM_ARM_SMCCC_TRNG=m +CONFIG_HW_RANDOM_CN10K=m +# CONFIG_APPLICOM is not set +CONFIG_DEVMEM=y +CONFIG_DEVPORT=y +CONFIG_TCG_TPM=m +CONFIG_HW_RANDOM_TPM=y +CONFIG_TCG_TIS_CORE=m +CONFIG_TCG_TIS=m +CONFIG_TCG_TIS_SPI=m +# CONFIG_TCG_TIS_SPI_CR50 is not set +# CONFIG_TCG_TIS_I2C is not set +# CONFIG_TCG_TIS_SYNQUACER is not set +# CONFIG_TCG_TIS_I2C_CR50 is not set +# CONFIG_TCG_TIS_I2C_ATMEL is not set +CONFIG_TCG_TIS_I2C_INFINEON=m +# CONFIG_TCG_TIS_I2C_NUVOTON is not set +CONFIG_TCG_ATMEL=m +# CONFIG_TCG_INFINEON is not set +# CONFIG_TCG_XEN is not set +CONFIG_TCG_CRB=m +CONFIG_TCG_VTPM_PROXY=m +# CONFIG_TCG_FTPM_TEE is not set +# CONFIG_TCG_TIS_ST33ZP24_I2C is not set +# CONFIG_TCG_TIS_ST33ZP24_SPI is not set +# CONFIG_XILLYBUS is not set +# CONFIG_XILLYUSB is not set +# end of Character devices + +# +# I2C support +# +CONFIG_I2C=y +CONFIG_ACPI_I2C_OPREGION=y +CONFIG_I2C_BOARDINFO=y +CONFIG_I2C_COMPAT=y +CONFIG_I2C_CHARDEV=m +CONFIG_I2C_MUX=m + +# +# Multiplexer I2C Chip support +# +# CONFIG_I2C_ARB_GPIO_CHALLENGE is not set +# CONFIG_I2C_MUX_GPIO is not set +# CONFIG_I2C_MUX_GPMUX is not set +# CONFIG_I2C_MUX_LTC4306 is not set +# CONFIG_I2C_MUX_PCA9541 is not set +# CONFIG_I2C_MUX_PCA954x is not set +# CONFIG_I2C_MUX_PINCTRL is not set +# CONFIG_I2C_MUX_REG is not set +# CONFIG_I2C_DEMUX_PINCTRL is not set +# CONFIG_I2C_MUX_MLXCPLD is not set +# end of Multiplexer I2C Chip support + +CONFIG_I2C_HELPER_AUTO=y +CONFIG_I2C_SMBUS=m +CONFIG_I2C_ALGOBIT=m +CONFIG_I2C_ALGOPCA=m + +# +# I2C Hardware Bus support +# + +# +# PC SMBus host controller drivers +# +# CONFIG_I2C_ALI1535 is not set +# CONFIG_I2C_ALI1563 is not set +# CONFIG_I2C_ALI15X3 is not set +# CONFIG_I2C_AMD756 is not set +# CONFIG_I2C_AMD8111 is not set +# CONFIG_I2C_AMD_MP2 is not set +# CONFIG_I2C_HIX5HD2 is not set +# CONFIG_I2C_I801 is not set +CONFIG_I2C_ISCH=m +# CONFIG_I2C_PIIX4 is not set +# CONFIG_I2C_NFORCE2 is not set +# CONFIG_I2C_NVIDIA_GPU is not set +# CONFIG_I2C_SIS5595 is not set +# CONFIG_I2C_SIS630 is not set +# CONFIG_I2C_SIS96X is not set +# CONFIG_I2C_VIA is not set +# CONFIG_I2C_VIAPRO is not set + +# +# ACPI drivers +# +# CONFIG_I2C_SCMI is not set + +# +# I2C system bus drivers (mostly embedded / system-on-chip) +# +# CONFIG_I2C_CADENCE is not set +# CONFIG_I2C_CBUS_GPIO is not set +CONFIG_I2C_DESIGNWARE_CORE=m +# CONFIG_I2C_DESIGNWARE_SLAVE is not set +CONFIG_I2C_DESIGNWARE_PLATFORM=m +# CONFIG_I2C_DESIGNWARE_PCI is not set +# CONFIG_I2C_EMEV2 is not set +CONFIG_I2C_GPIO=m +# CONFIG_I2C_GPIO_FAULT_INJECTOR is not set +# CONFIG_I2C_HISI is not set +CONFIG_I2C_MESON=m +CONFIG_I2C_MV64XXX=m +# CONFIG_I2C_NOMADIK is not set +CONFIG_I2C_OCORES=m +CONFIG_I2C_PCA_PLATFORM=m +CONFIG_I2C_PXA=m +# CONFIG_I2C_PXA_SLAVE is not set +# CONFIG_I2C_QCOM_CCI is not set +CONFIG_I2C_QUP=m +CONFIG_I2C_RK3X=m +CONFIG_I2C_SIMTEC=m +# CONFIG_I2C_SYNQUACER is not set +CONFIG_I2C_TEGRA=m +# CONFIG_I2C_VERSATILE is not set +CONFIG_I2C_THUNDERX=m +# CONFIG_I2C_XILINX is not set +CONFIG_I2C_XLP9XX=m + +# +# External I2C/SMBus adapter drivers +# +CONFIG_I2C_DIOLAN_U2C=m +# CONFIG_I2C_CP2615 is not set +# CONFIG_I2C_PARPORT is not set +# CONFIG_I2C_PCI1XXXX is not set +CONFIG_I2C_ROBOTFUZZ_OSIF=m +CONFIG_I2C_TAOS_EVM=m +CONFIG_I2C_TINY_USB=m +CONFIG_I2C_VIPERBOARD=m + +# +# Other I2C/SMBus bus drivers +# +# CONFIG_I2C_MLXCPLD is not set +CONFIG_I2C_CROS_EC_TUNNEL=m +CONFIG_I2C_XGENE_SLIMPRO=m +# CONFIG_I2C_VIRTIO is not set +# end of I2C Hardware Bus support + +# CONFIG_I2C_STUB is not set +CONFIG_I2C_SLAVE=y +# CONFIG_I2C_SLAVE_EEPROM is not set +# CONFIG_I2C_SLAVE_TESTUNIT is not set +# CONFIG_I2C_DEBUG_CORE is not set +# CONFIG_I2C_DEBUG_ALGO is not set +# CONFIG_I2C_DEBUG_BUS is not set +# end of I2C support + +# CONFIG_I3C is not set +CONFIG_SPI=y +# CONFIG_SPI_DEBUG is not set +CONFIG_SPI_MASTER=y +CONFIG_SPI_MEM=y + +# +# SPI Master Controller Drivers +# +# CONFIG_SPI_ALTERA is not set +# CONFIG_SPI_AMLOGIC_SPIFC_A1 is not set +CONFIG_SPI_ARMADA_3700=m +# CONFIG_SPI_AXI_SPI_ENGINE is not set +CONFIG_SPI_BITBANG=m +CONFIG_SPI_BUTTERFLY=m +# CONFIG_SPI_CADENCE is not set +# CONFIG_SPI_CADENCE_QUADSPI is not set +# CONFIG_SPI_CADENCE_XSPI is not set +# CONFIG_SPI_DESIGNWARE is not set +# CONFIG_SPI_HISI_KUNPENG is not set +# CONFIG_SPI_HISI_SFC_V3XX is not set +# CONFIG_SPI_GPIO is not set +CONFIG_SPI_LM70_LLP=m +# CONFIG_SPI_FSL_SPI is not set +# CONFIG_SPI_MESON_SPICC is not set +CONFIG_SPI_MESON_SPIFC=m +# CONFIG_SPI_MICROCHIP_CORE is not set +# CONFIG_SPI_MICROCHIP_CORE_QSPI is not set +# CONFIG_SPI_OC_TINY is not set +# CONFIG_SPI_ORION is not set +# CONFIG_SPI_PCI1XXXX is not set +# CONFIG_SPI_PL022 is not set +# CONFIG_SPI_PXA2XX is not set +CONFIG_SPI_ROCKCHIP=m +# CONFIG_SPI_ROCKCHIP_SFC is not set +# CONFIG_SPI_QCOM_QSPI is not set +CONFIG_SPI_QUP=m +# CONFIG_SPI_SC18IS602 is not set +# CONFIG_SPI_SIFIVE is not set +# CONFIG_SPI_SN_F_OSPI is not set +# CONFIG_SPI_SUN4I is not set +# CONFIG_SPI_SUN6I is not set +# CONFIG_SPI_SYNQUACER is not set +# CONFIG_SPI_MXIC is not set +CONFIG_SPI_TEGRA210_QUAD=m +CONFIG_SPI_TEGRA114=m +CONFIG_SPI_TEGRA20_SFLASH=m +CONFIG_SPI_TEGRA20_SLINK=m +CONFIG_SPI_THUNDERX=m +# CONFIG_SPI_XCOMM is not set +# CONFIG_SPI_XILINX is not set +CONFIG_SPI_XLP=m +# CONFIG_SPI_ZYNQMP_GQSPI is not set +# CONFIG_SPI_AMD is not set + +# +# SPI Multiplexer support +# +# CONFIG_SPI_MUX is not set + +# +# SPI Protocol Masters +# +CONFIG_SPI_SPIDEV=y +# CONFIG_SPI_LOOPBACK_TEST is not set +# CONFIG_SPI_TLE62X0 is not set +# CONFIG_SPI_SLAVE is not set +CONFIG_SPI_DYNAMIC=y +CONFIG_SPMI=y +# CONFIG_SPMI_HISI3670 is not set +CONFIG_SPMI_MSM_PMIC_ARB=y +# CONFIG_HSI is not set +CONFIG_PPS=m +# CONFIG_PPS_DEBUG is not set + +# +# PPS clients support +# +# CONFIG_PPS_CLIENT_KTIMER is not set +CONFIG_PPS_CLIENT_LDISC=m +CONFIG_PPS_CLIENT_PARPORT=m +# CONFIG_PPS_CLIENT_GPIO is not set + +# +# PPS generators support +# + +# +# PTP clock support +# +CONFIG_PTP_1588_CLOCK=m +CONFIG_PTP_1588_CLOCK_OPTIONAL=m + +# +# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks. +# +CONFIG_PTP_1588_CLOCK_KVM=m +# CONFIG_PTP_1588_CLOCK_IDT82P33 is not set +# CONFIG_PTP_1588_CLOCK_IDTCM is not set +# CONFIG_PTP_1588_CLOCK_MOCK is not set +# CONFIG_PTP_1588_CLOCK_OCP is not set +# end of PTP clock support + +CONFIG_PINCTRL=y +CONFIG_GENERIC_PINCTRL_GROUPS=y +CONFIG_PINMUX=y +CONFIG_GENERIC_PINMUX_FUNCTIONS=y +CONFIG_PINCONF=y +CONFIG_GENERIC_PINCONF=y +# CONFIG_DEBUG_PINCTRL is not set +CONFIG_PINCTRL_AMD=y +CONFIG_PINCTRL_AXP209=m +# CONFIG_PINCTRL_CY8C95X0 is not set +CONFIG_PINCTRL_MAX77620=y +# CONFIG_PINCTRL_MCP23S08 is not set +# CONFIG_PINCTRL_MICROCHIP_SGPIO is not set +# CONFIG_PINCTRL_OCELOT is not set +CONFIG_PINCTRL_ROCKCHIP=y +CONFIG_PINCTRL_SINGLE=y +# CONFIG_PINCTRL_STMFX is not set +# CONFIG_PINCTRL_SX150X is not set +CONFIG_PINCTRL_ZYNQMP=y +CONFIG_PINCTRL_MESON=y +CONFIG_PINCTRL_MESON_GXBB=y +CONFIG_PINCTRL_MESON_GXL=y +CONFIG_PINCTRL_MESON8_PMX=y +CONFIG_PINCTRL_MESON_AXG=y +CONFIG_PINCTRL_MESON_AXG_PMX=y +CONFIG_PINCTRL_MESON_G12A=y +CONFIG_PINCTRL_MESON_A1=y +CONFIG_PINCTRL_MESON_S4=y +CONFIG_PINCTRL_AMLOGIC_C3=y +CONFIG_PINCTRL_MVEBU=y +CONFIG_PINCTRL_ARMADA_AP806=y +CONFIG_PINCTRL_ARMADA_CP110=y +CONFIG_PINCTRL_AC5=y +CONFIG_PINCTRL_ARMADA_37XX=y +CONFIG_PINCTRL_MSM=y +# CONFIG_PINCTRL_IPQ5018 is not set +# CONFIG_PINCTRL_IPQ5332 is not set +# CONFIG_PINCTRL_IPQ8074 is not set +# CONFIG_PINCTRL_IPQ6018 is not set +# CONFIG_PINCTRL_IPQ9574 is not set +# CONFIG_PINCTRL_MDM9607 is not set +CONFIG_PINCTRL_MSM8916=y +# CONFIG_PINCTRL_MSM8953 is not set +# CONFIG_PINCTRL_MSM8976 is not set +# CONFIG_PINCTRL_MSM8994 is not set +CONFIG_PINCTRL_MSM8996=y +# CONFIG_PINCTRL_MSM8998 is not set +# CONFIG_PINCTRL_QCM2290 is not set +# CONFIG_PINCTRL_QCS404 is not set +# CONFIG_PINCTRL_QDF2XXX is not set +# CONFIG_PINCTRL_QDU1000 is not set +# CONFIG_PINCTRL_SA8775P is not set +# CONFIG_PINCTRL_SC7180 is not set +# CONFIG_PINCTRL_SC7280 is not set +# CONFIG_PINCTRL_SC8180X is not set +# CONFIG_PINCTRL_SC8280XP is not set +# CONFIG_PINCTRL_SDM660 is not set +# CONFIG_PINCTRL_SDM670 is not set +# CONFIG_PINCTRL_SDM845 is not set +# CONFIG_PINCTRL_SDX75 is not set +# CONFIG_PINCTRL_SM6115 is not set +# CONFIG_PINCTRL_SM6125 is not set +# CONFIG_PINCTRL_SM6350 is not set +# CONFIG_PINCTRL_SM6375 is not set +# CONFIG_PINCTRL_SM7150 is not set +# CONFIG_PINCTRL_SM8150 is not set +# CONFIG_PINCTRL_SM8250 is not set +# CONFIG_PINCTRL_SM8350 is not set +# CONFIG_PINCTRL_SM8450 is not set +# CONFIG_PINCTRL_SM8550 is not set +CONFIG_PINCTRL_QCOM_SPMI_PMIC=y +CONFIG_PINCTRL_QCOM_SSBI_PMIC=y +# CONFIG_PINCTRL_LPASS_LPI is not set + +# +# Renesas pinctrl drivers +# +# end of Renesas pinctrl drivers + +CONFIG_PINCTRL_SUNXI=y +# CONFIG_PINCTRL_SUN4I_A10 is not set +# CONFIG_PINCTRL_SUN5I is not set +# CONFIG_PINCTRL_SUN6I_A31 is not set +# CONFIG_PINCTRL_SUN6I_A31_R is not set +# CONFIG_PINCTRL_SUN8I_A23 is not set +# CONFIG_PINCTRL_SUN8I_A33 is not set +# CONFIG_PINCTRL_SUN8I_A83T is not set +# CONFIG_PINCTRL_SUN8I_A83T_R is not set +# CONFIG_PINCTRL_SUN8I_A23_R is not set +# CONFIG_PINCTRL_SUN8I_H3 is not set +CONFIG_PINCTRL_SUN8I_H3_R=y +# CONFIG_PINCTRL_SUN8I_V3S is not set +# CONFIG_PINCTRL_SUN9I_A80 is not set +# CONFIG_PINCTRL_SUN9I_A80_R is not set +# CONFIG_PINCTRL_SUN20I_D1 is not set +CONFIG_PINCTRL_SUN50I_A64=y +CONFIG_PINCTRL_SUN50I_A64_R=y +CONFIG_PINCTRL_SUN50I_A100=y +CONFIG_PINCTRL_SUN50I_A100_R=y +CONFIG_PINCTRL_SUN50I_H5=y +CONFIG_PINCTRL_SUN50I_H6=y +CONFIG_PINCTRL_SUN50I_H6_R=y +CONFIG_PINCTRL_SUN50I_H616=y +CONFIG_PINCTRL_SUN50I_H616_R=y +CONFIG_PINCTRL_TEGRA=y +CONFIG_PINCTRL_TEGRA124=y +CONFIG_PINCTRL_TEGRA210=y +CONFIG_PINCTRL_TEGRA_XUSB=y +CONFIG_GPIOLIB=y +CONFIG_GPIOLIB_FASTPATH_LIMIT=512 +CONFIG_OF_GPIO=y +CONFIG_GPIO_ACPI=y +CONFIG_GPIOLIB_IRQCHIP=y +# CONFIG_DEBUG_GPIO is not set +CONFIG_GPIO_SYSFS=y +CONFIG_GPIO_CDEV=y +CONFIG_GPIO_CDEV_V1=y +CONFIG_GPIO_GENERIC=y +CONFIG_GPIO_REGMAP=m +CONFIG_GPIO_IDIO_16=m + +# +# Memory mapped GPIO drivers +# +# CONFIG_GPIO_74XX_MMIO is not set +# CONFIG_GPIO_ALTERA is not set +# CONFIG_GPIO_AMDPT is not set +# CONFIG_GPIO_CADENCE is not set +# CONFIG_GPIO_DWAPB is not set +CONFIG_GPIO_EXAR=m +# CONFIG_GPIO_FTGPIO010 is not set +CONFIG_GPIO_GENERIC_PLATFORM=y +# CONFIG_GPIO_GRGPIO is not set +# CONFIG_GPIO_HISI is not set +# CONFIG_GPIO_HLWD is not set +# CONFIG_GPIO_LOGICVC is not set +CONFIG_GPIO_MB86S7X=m +CONFIG_GPIO_MVEBU=y +CONFIG_GPIO_PL061=y +CONFIG_GPIO_ROCKCHIP=y +# CONFIG_GPIO_SIFIVE is not set +# CONFIG_GPIO_SYSCON is not set +CONFIG_GPIO_TEGRA=y +CONFIG_GPIO_TEGRA186=m +# CONFIG_GPIO_THUNDERX is not set +CONFIG_GPIO_XGENE=y +CONFIG_GPIO_XGENE_SB=m +# CONFIG_GPIO_XILINX is not set +CONFIG_GPIO_XLP=y +CONFIG_GPIO_ZYNQ=m +CONFIG_GPIO_ZYNQMP_MODEPIN=y +# CONFIG_GPIO_AMD_FCH is not set +# end of Memory mapped GPIO drivers + +# +# I2C GPIO expanders +# +# CONFIG_GPIO_ADNP is not set +# CONFIG_GPIO_FXL6408 is not set +# CONFIG_GPIO_DS4520 is not set +# CONFIG_GPIO_GW_PLD is not set +# CONFIG_GPIO_MAX7300 is not set +# CONFIG_GPIO_MAX732X is not set +CONFIG_GPIO_PCA953X=y +CONFIG_GPIO_PCA953X_IRQ=y +# CONFIG_GPIO_PCA9570 is not set +# CONFIG_GPIO_PCF857X is not set +# CONFIG_GPIO_TPIC2810 is not set +# end of I2C GPIO expanders + +# +# MFD GPIO expanders +# +CONFIG_GPIO_MAX77620=y +# end of MFD GPIO expanders + +# +# PCI GPIO expanders +# +CONFIG_GPIO_PCI_IDIO_16=m +CONFIG_GPIO_PCIE_IDIO_24=m +# CONFIG_GPIO_RDC321X is not set +# end of PCI GPIO expanders + +# +# SPI GPIO expanders +# +# CONFIG_GPIO_74X164 is not set +# CONFIG_GPIO_MAX3191X is not set +# CONFIG_GPIO_MAX7301 is not set +# CONFIG_GPIO_MC33880 is not set +# CONFIG_GPIO_PISOSR is not set +# CONFIG_GPIO_XRA1403 is not set +# end of SPI GPIO expanders + +# +# USB GPIO expanders +# +CONFIG_GPIO_VIPERBOARD=m +# end of USB GPIO expanders + +# +# Virtual GPIO drivers +# +# CONFIG_GPIO_AGGREGATOR is not set +# CONFIG_GPIO_LATCH is not set +# CONFIG_GPIO_MOCKUP is not set +# CONFIG_GPIO_VIRTIO is not set +# CONFIG_GPIO_SIM is not set +# end of Virtual GPIO drivers + +CONFIG_W1=m +CONFIG_W1_CON=y + +# +# 1-wire Bus Masters +# +# CONFIG_W1_MASTER_MATROX is not set +CONFIG_W1_MASTER_DS2490=m +CONFIG_W1_MASTER_DS2482=m +CONFIG_W1_MASTER_GPIO=m +# CONFIG_W1_MASTER_SGI is not set +# end of 1-wire Bus Masters + +# +# 1-wire Slaves +# +CONFIG_W1_SLAVE_THERM=m +CONFIG_W1_SLAVE_SMEM=m +CONFIG_W1_SLAVE_DS2405=m +CONFIG_W1_SLAVE_DS2408=m +CONFIG_W1_SLAVE_DS2408_READBACK=y +CONFIG_W1_SLAVE_DS2413=m +CONFIG_W1_SLAVE_DS2406=m +CONFIG_W1_SLAVE_DS2423=m +CONFIG_W1_SLAVE_DS2805=m +# CONFIG_W1_SLAVE_DS2430 is not set +CONFIG_W1_SLAVE_DS2431=m +CONFIG_W1_SLAVE_DS2433=m +# CONFIG_W1_SLAVE_DS2433_CRC is not set +CONFIG_W1_SLAVE_DS2438=m +# CONFIG_W1_SLAVE_DS250X is not set +CONFIG_W1_SLAVE_DS2780=m +CONFIG_W1_SLAVE_DS2781=m +CONFIG_W1_SLAVE_DS28E04=m +CONFIG_W1_SLAVE_DS28E17=m +# end of 1-wire Slaves + +CONFIG_POWER_RESET=y +# CONFIG_POWER_RESET_BRCMSTB is not set +# CONFIG_POWER_RESET_GPIO is not set +# CONFIG_POWER_RESET_GPIO_RESTART is not set +CONFIG_POWER_RESET_HISI=y +# CONFIG_POWER_RESET_LINKSTATION is not set +CONFIG_POWER_RESET_MSM=y +# CONFIG_POWER_RESET_QCOM_PON is not set +# CONFIG_POWER_RESET_ODROID_GO_ULTRA_POWEROFF is not set +# CONFIG_POWER_RESET_LTC2952 is not set +# CONFIG_POWER_RESET_REGULATOR is not set +# CONFIG_POWER_RESET_RESTART is not set +CONFIG_POWER_RESET_VEXPRESS=y +CONFIG_POWER_RESET_XGENE=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_POWER_RESET_SYSCON_POWEROFF=y +# CONFIG_SYSCON_REBOOT_MODE is not set +# CONFIG_NVMEM_REBOOT_MODE is not set +CONFIG_POWER_SUPPLY=y +# CONFIG_POWER_SUPPLY_DEBUG is not set +CONFIG_POWER_SUPPLY_HWMON=y +# CONFIG_GENERIC_ADC_BATTERY is not set +# CONFIG_IP5XXX_POWER is not set +# CONFIG_TEST_POWER is not set +# CONFIG_CHARGER_ADP5061 is not set +# CONFIG_BATTERY_CW2015 is not set +CONFIG_BATTERY_DS2760=m +# CONFIG_BATTERY_DS2780 is not set +# CONFIG_BATTERY_DS2781 is not set +# CONFIG_BATTERY_DS2782 is not set +# CONFIG_BATTERY_SAMSUNG_SDI is not set +CONFIG_BATTERY_SBS=m +# CONFIG_CHARGER_SBS is not set +# CONFIG_MANAGER_SBS is not set +CONFIG_BATTERY_BQ27XXX=m +# CONFIG_BATTERY_BQ27XXX_I2C is not set +CONFIG_BATTERY_BQ27XXX_HDQ=m +CONFIG_CHARGER_AXP20X=m +CONFIG_BATTERY_AXP20X=m +CONFIG_AXP20X_POWER=m +# CONFIG_BATTERY_MAX17040 is not set +# CONFIG_BATTERY_MAX17042 is not set +# CONFIG_BATTERY_MAX1721X is not set +# CONFIG_CHARGER_ISP1704 is not set +# CONFIG_CHARGER_MAX8903 is not set +# CONFIG_CHARGER_LP8727 is not set +# CONFIG_CHARGER_GPIO is not set +# CONFIG_CHARGER_MANAGER is not set +# CONFIG_CHARGER_LT3651 is not set +# CONFIG_CHARGER_LTC4162L is not set +# CONFIG_CHARGER_DETECTOR_MAX14656 is not set +# CONFIG_CHARGER_MAX77976 is not set +CONFIG_CHARGER_QCOM_SMBB=m +# CONFIG_CHARGER_BQ2415X is not set +# CONFIG_CHARGER_BQ24190 is not set +# CONFIG_CHARGER_BQ24257 is not set +# CONFIG_CHARGER_BQ24735 is not set +# CONFIG_CHARGER_BQ2515X is not set +# CONFIG_CHARGER_BQ25890 is not set +# CONFIG_CHARGER_BQ25980 is not set +# CONFIG_CHARGER_BQ256XX is not set +# CONFIG_CHARGER_SMB347 is not set +# CONFIG_BATTERY_GAUGE_LTC2941 is not set +# CONFIG_BATTERY_GOLDFISH is not set +# CONFIG_BATTERY_RT5033 is not set +# CONFIG_CHARGER_RT9455 is not set +# CONFIG_CHARGER_RT9467 is not set +# CONFIG_CHARGER_RT9471 is not set +CONFIG_CHARGER_CROS_USBPD=m +CONFIG_CHARGER_CROS_PCHG=y +# CONFIG_CHARGER_UCS1002 is not set +# CONFIG_CHARGER_BD99954 is not set +# CONFIG_BATTERY_UG3105 is not set +# CONFIG_CHARGER_QCOM_SMB2 is not set +CONFIG_HWMON=y +CONFIG_HWMON_VID=m +# CONFIG_HWMON_DEBUG_CHIP is not set + +# +# Native drivers +# +# CONFIG_SENSORS_AD7314 is not set +CONFIG_SENSORS_AD7414=m +CONFIG_SENSORS_AD7418=m +# CONFIG_SENSORS_ADM1021 is not set +# CONFIG_SENSORS_ADM1025 is not set +# CONFIG_SENSORS_ADM1026 is not set +CONFIG_SENSORS_ADM1029=m +# CONFIG_SENSORS_ADM1031 is not set +# CONFIG_SENSORS_ADM1177 is not set +CONFIG_SENSORS_ADM9240=m +# CONFIG_SENSORS_ADT7310 is not set +# CONFIG_SENSORS_ADT7410 is not set +CONFIG_SENSORS_ADT7411=m +CONFIG_SENSORS_ADT7462=m +CONFIG_SENSORS_ADT7470=m +CONFIG_SENSORS_ADT7475=m +# CONFIG_SENSORS_AHT10 is not set +# CONFIG_SENSORS_AQUACOMPUTER_D5NEXT is not set +# CONFIG_SENSORS_AS370 is not set +CONFIG_SENSORS_ASC7621=m +# CONFIG_SENSORS_AXI_FAN_CONTROL is not set +CONFIG_SENSORS_ATXP1=m +# CONFIG_SENSORS_CORSAIR_CPRO is not set +# CONFIG_SENSORS_CORSAIR_PSU is not set +# CONFIG_SENSORS_DRIVETEMP is not set +CONFIG_SENSORS_DS620=m +# CONFIG_SENSORS_DS1621 is not set +CONFIG_SENSORS_I5K_AMB=m +# CONFIG_SENSORS_F71805F is not set +CONFIG_SENSORS_F71882FG=m +CONFIG_SENSORS_F75375S=m +CONFIG_SENSORS_FTSTEUTATES=m +# CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set +CONFIG_SENSORS_G760A=m +# CONFIG_SENSORS_G762 is not set +# CONFIG_SENSORS_GPIO_FAN is not set +# CONFIG_SENSORS_HIH6130 is not set +# CONFIG_SENSORS_HS3001 is not set +CONFIG_SENSORS_IBMAEM=m +CONFIG_SENSORS_IBMPEX=m +# CONFIG_SENSORS_IIO_HWMON is not set +# CONFIG_SENSORS_IT87 is not set +CONFIG_SENSORS_JC42=m +# CONFIG_SENSORS_POWR1220 is not set +CONFIG_SENSORS_LINEAGE=m +# CONFIG_SENSORS_LTC2945 is not set +# CONFIG_SENSORS_LTC2947_I2C is not set +# CONFIG_SENSORS_LTC2947_SPI is not set +# CONFIG_SENSORS_LTC2990 is not set +# CONFIG_SENSORS_LTC2992 is not set +CONFIG_SENSORS_LTC4151=m +CONFIG_SENSORS_LTC4215=m +# CONFIG_SENSORS_LTC4222 is not set +CONFIG_SENSORS_LTC4245=m +# CONFIG_SENSORS_LTC4260 is not set +CONFIG_SENSORS_LTC4261=m +CONFIG_SENSORS_MAX1111=m +# CONFIG_SENSORS_MAX127 is not set +CONFIG_SENSORS_MAX16065=m +# CONFIG_SENSORS_MAX1619 is not set +CONFIG_SENSORS_MAX1668=m +# CONFIG_SENSORS_MAX197 is not set +# CONFIG_SENSORS_MAX31722 is not set +# CONFIG_SENSORS_MAX31730 is not set +# CONFIG_SENSORS_MAX31760 is not set +# CONFIG_MAX31827 is not set +# CONFIG_SENSORS_MAX6620 is not set +# CONFIG_SENSORS_MAX6621 is not set +CONFIG_SENSORS_MAX6639=m +CONFIG_SENSORS_MAX6642=m +CONFIG_SENSORS_MAX6650=m +# CONFIG_SENSORS_MAX6697 is not set +# CONFIG_SENSORS_MAX31790 is not set +# CONFIG_SENSORS_MC34VR500 is not set +# CONFIG_SENSORS_MCP3021 is not set +# CONFIG_SENSORS_TC654 is not set +# CONFIG_SENSORS_TPS23861 is not set +# CONFIG_SENSORS_MR75203 is not set +CONFIG_SENSORS_ADCXX=m +# CONFIG_SENSORS_LM63 is not set +CONFIG_SENSORS_LM70=m +CONFIG_SENSORS_LM73=m +# CONFIG_SENSORS_LM75 is not set +# CONFIG_SENSORS_LM77 is not set +# CONFIG_SENSORS_LM78 is not set +# CONFIG_SENSORS_LM80 is not set +# CONFIG_SENSORS_LM83 is not set +# CONFIG_SENSORS_LM85 is not set +# CONFIG_SENSORS_LM87 is not set +# CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set +CONFIG_SENSORS_LM93=m +# CONFIG_SENSORS_LM95234 is not set +CONFIG_SENSORS_LM95241=m +CONFIG_SENSORS_LM95245=m +# CONFIG_SENSORS_PC87360 is not set +CONFIG_SENSORS_PC87427=m +CONFIG_SENSORS_NTC_THERMISTOR=m +CONFIG_SENSORS_NCT6683=m +CONFIG_SENSORS_NCT6775_CORE=m +CONFIG_SENSORS_NCT6775=m +# CONFIG_SENSORS_NCT6775_I2C is not set +CONFIG_SENSORS_NCT7802=m +CONFIG_SENSORS_NCT7904=m +CONFIG_SENSORS_NPCM7XX=m +# CONFIG_SENSORS_NZXT_KRAKEN2 is not set +# CONFIG_SENSORS_NZXT_SMART2 is not set +# CONFIG_SENSORS_OCC_P8_I2C is not set +# CONFIG_SENSORS_PCF8591 is not set +# CONFIG_PMBUS is not set +# CONFIG_SENSORS_PWM_FAN is not set +# CONFIG_SENSORS_SBTSI is not set +# CONFIG_SENSORS_SBRMI is not set +# CONFIG_SENSORS_SHT15 is not set +CONFIG_SENSORS_SHT21=m +# CONFIG_SENSORS_SHT3x is not set +# CONFIG_SENSORS_SHT4x is not set +# CONFIG_SENSORS_SHTC1 is not set +# CONFIG_SENSORS_SIS5595 is not set +CONFIG_SENSORS_DME1737=m +CONFIG_SENSORS_EMC1403=m +CONFIG_SENSORS_EMC2103=m +# CONFIG_SENSORS_EMC2305 is not set +CONFIG_SENSORS_EMC6W201=m +# CONFIG_SENSORS_SMSC47M1 is not set +CONFIG_SENSORS_SMSC47M192=m +# CONFIG_SENSORS_SMSC47B397 is not set +CONFIG_SENSORS_SCH56XX_COMMON=m +CONFIG_SENSORS_SCH5627=m +# CONFIG_SENSORS_SCH5636 is not set +# CONFIG_SENSORS_STTS751 is not set +# CONFIG_SENSORS_ADC128D818 is not set +CONFIG_SENSORS_ADS7828=m +CONFIG_SENSORS_ADS7871=m +CONFIG_SENSORS_AMC6821=m +# CONFIG_SENSORS_INA209 is not set +# CONFIG_SENSORS_INA2XX is not set +# CONFIG_SENSORS_INA238 is not set +# CONFIG_SENSORS_INA3221 is not set +# CONFIG_SENSORS_TC74 is not set +CONFIG_SENSORS_THMC50=m +CONFIG_SENSORS_TMP102=m +# CONFIG_SENSORS_TMP103 is not set +# CONFIG_SENSORS_TMP108 is not set +CONFIG_SENSORS_TMP401=m +CONFIG_SENSORS_TMP421=m +# CONFIG_SENSORS_TMP464 is not set +# CONFIG_SENSORS_TMP513 is not set +# CONFIG_SENSORS_VEXPRESS is not set +# CONFIG_SENSORS_VIA686A is not set +CONFIG_SENSORS_VT1211=m +CONFIG_SENSORS_VT8231=m +CONFIG_SENSORS_W83773G=m +# CONFIG_SENSORS_W83781D is not set +CONFIG_SENSORS_W83791D=m +CONFIG_SENSORS_W83792D=m +CONFIG_SENSORS_W83793=m +CONFIG_SENSORS_W83795=m +# CONFIG_SENSORS_W83795_FANCTRL is not set +# CONFIG_SENSORS_W83L785TS is not set +CONFIG_SENSORS_W83L786NG=m +# CONFIG_SENSORS_W83627HF is not set +CONFIG_SENSORS_W83627EHF=m +CONFIG_SENSORS_XGENE=m + +# +# ACPI drivers +# +CONFIG_SENSORS_ACPI_POWER=m +CONFIG_THERMAL=y +# CONFIG_THERMAL_NETLINK is not set +CONFIG_THERMAL_STATISTICS=y +CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0 +CONFIG_THERMAL_HWMON=y +CONFIG_THERMAL_OF=y +# CONFIG_THERMAL_WRITABLE_TRIPS is not set +CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y +# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set +# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set +CONFIG_THERMAL_GOV_FAIR_SHARE=y +CONFIG_THERMAL_GOV_STEP_WISE=y +# CONFIG_THERMAL_GOV_BANG_BANG is not set +# CONFIG_THERMAL_GOV_USER_SPACE is not set +# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set +CONFIG_CPU_THERMAL=y +CONFIG_CPU_FREQ_THERMAL=y +CONFIG_DEVFREQ_THERMAL=y +# CONFIG_THERMAL_EMULATION is not set +# CONFIG_THERMAL_MMIO is not set +CONFIG_HISI_THERMAL=m +# CONFIG_MAX77620_THERMAL is not set +# CONFIG_SUN8I_THERMAL is not set +CONFIG_ROCKCHIP_THERMAL=m +CONFIG_ARMADA_THERMAL=m +CONFIG_AMLOGIC_THERMAL=y + +# +# NVIDIA Tegra thermal drivers +# +CONFIG_TEGRA_SOCTHERM=y +# end of NVIDIA Tegra thermal drivers + +# CONFIG_GENERIC_ADC_THERMAL is not set + +# +# Qualcomm thermal drivers +# +# CONFIG_QCOM_SPMI_ADC_TM5 is not set +CONFIG_QCOM_SPMI_TEMP_ALARM=m +# CONFIG_QCOM_LMH is not set +# end of Qualcomm thermal drivers + +CONFIG_WATCHDOG=y +CONFIG_WATCHDOG_CORE=y +# CONFIG_WATCHDOG_NOWAYOUT is not set +CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y +CONFIG_WATCHDOG_OPEN_TIMEOUT=0 +CONFIG_WATCHDOG_SYSFS=y +# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set + +# +# Watchdog Pretimeout Governors +# +CONFIG_WATCHDOG_PRETIMEOUT_GOV=y +CONFIG_WATCHDOG_PRETIMEOUT_GOV_SEL=m +CONFIG_WATCHDOG_PRETIMEOUT_GOV_NOOP=m +CONFIG_WATCHDOG_PRETIMEOUT_GOV_PANIC=m +CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_NOOP=y +# CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_PANIC is not set + +# +# Watchdog Device Drivers +# +CONFIG_SOFT_WATCHDOG=m +# CONFIG_SOFT_WATCHDOG_PRETIMEOUT is not set +CONFIG_GPIO_WATCHDOG=m +CONFIG_WDAT_WDT=m +# CONFIG_XILINX_WATCHDOG is not set +# CONFIG_XILINX_WINDOW_WATCHDOG is not set +# CONFIG_ZIIRAVE_WATCHDOG is not set +CONFIG_ARM_SP805_WATCHDOG=m +CONFIG_ARM_SBSA_WATCHDOG=m +# CONFIG_ARMADA_37XX_WATCHDOG is not set +# CONFIG_CADENCE_WATCHDOG is not set +CONFIG_DW_WATCHDOG=m +CONFIG_SUNXI_WATCHDOG=m +# CONFIG_MAX63XX_WATCHDOG is not set +# CONFIG_MAX77620_WATCHDOG is not set +CONFIG_TEGRA_WATCHDOG=m +CONFIG_QCOM_WDT=m +CONFIG_MESON_GXBB_WATCHDOG=m +CONFIG_MESON_WATCHDOG=m +# CONFIG_ARM_SMC_WATCHDOG is not set +# CONFIG_PM8916_WATCHDOG is not set +# CONFIG_ALIM7101_WDT is not set +# CONFIG_I6300ESB_WDT is not set +# CONFIG_HP_WATCHDOG is not set +CONFIG_MARVELL_GTI_WDT=y +# CONFIG_MEN_A21_WDT is not set +CONFIG_XEN_WDT=m + +# +# PCI-based Watchdog Cards +# +# CONFIG_PCIPCWATCHDOG is not set +# CONFIG_WDTPCI is not set + +# +# USB-based Watchdog Cards +# +# CONFIG_USBPCWATCHDOG is not set +CONFIG_SSB_POSSIBLE=y +CONFIG_SSB=m +CONFIG_SSB_SPROM=y +CONFIG_SSB_BLOCKIO=y +CONFIG_SSB_PCIHOST_POSSIBLE=y +CONFIG_SSB_PCIHOST=y +CONFIG_SSB_B43_PCI_BRIDGE=y +CONFIG_SSB_SDIOHOST_POSSIBLE=y +CONFIG_SSB_SDIOHOST=y +CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y +CONFIG_SSB_DRIVER_PCICORE=y +# CONFIG_SSB_DRIVER_GPIO is not set +CONFIG_BCMA_POSSIBLE=y +CONFIG_BCMA=m +CONFIG_BCMA_BLOCKIO=y +CONFIG_BCMA_HOST_PCI_POSSIBLE=y +CONFIG_BCMA_HOST_PCI=y +# CONFIG_BCMA_HOST_SOC is not set +CONFIG_BCMA_DRIVER_PCI=y +# CONFIG_BCMA_DRIVER_GMAC_CMN is not set +# CONFIG_BCMA_DRIVER_GPIO is not set +# CONFIG_BCMA_DEBUG is not set + +# +# Multifunction device drivers +# +CONFIG_MFD_CORE=y +# CONFIG_MFD_ACT8945A is not set +# CONFIG_MFD_SUN4I_GPADC is not set +# CONFIG_MFD_AS3711 is not set +# CONFIG_MFD_SMPRO is not set +# CONFIG_MFD_AS3722 is not set +# CONFIG_PMIC_ADP5520 is not set +# CONFIG_MFD_AAT2870_CORE is not set +# CONFIG_MFD_ATMEL_FLEXCOM is not set +# CONFIG_MFD_ATMEL_HLCDC is not set +# CONFIG_MFD_BCM590XX is not set +# CONFIG_MFD_BD9571MWV is not set +# CONFIG_MFD_AC100 is not set +CONFIG_MFD_AXP20X=m +# CONFIG_MFD_AXP20X_I2C is not set +CONFIG_MFD_AXP20X_RSB=m +CONFIG_MFD_CROS_EC_DEV=y +# CONFIG_MFD_CS42L43_I2C is not set +# CONFIG_MFD_MADERA is not set +# CONFIG_MFD_MAX5970 is not set +# CONFIG_PMIC_DA903X is not set +# CONFIG_MFD_DA9052_SPI is not set +# CONFIG_MFD_DA9052_I2C is not set +# CONFIG_MFD_DA9055 is not set +# CONFIG_MFD_DA9062 is not set +# CONFIG_MFD_DA9063 is not set +# CONFIG_MFD_DA9150 is not set +# CONFIG_MFD_DLN2 is not set +# CONFIG_MFD_GATEWORKS_GSC is not set +# CONFIG_MFD_MC13XXX_SPI is not set +# CONFIG_MFD_MC13XXX_I2C is not set +# CONFIG_MFD_MP2629 is not set +# CONFIG_MFD_HI6421_PMIC is not set +# CONFIG_MFD_HI6421_SPMI is not set +CONFIG_MFD_HI655X_PMIC=m +# CONFIG_LPC_ICH is not set +CONFIG_LPC_SCH=m +# CONFIG_MFD_IQS62X is not set +# CONFIG_MFD_JANZ_CMODIO is not set +# CONFIG_MFD_KEMPLD is not set +# CONFIG_MFD_88PM800 is not set +# CONFIG_MFD_88PM805 is not set +# CONFIG_MFD_88PM860X is not set +# CONFIG_MFD_MAX14577 is not set +# CONFIG_MFD_MAX77541 is not set +CONFIG_MFD_MAX77620=y +# CONFIG_MFD_MAX77650 is not set +# CONFIG_MFD_MAX77686 is not set +# CONFIG_MFD_MAX77693 is not set +# CONFIG_MFD_MAX77714 is not set +# CONFIG_MFD_MAX77843 is not set +# CONFIG_MFD_MAX8907 is not set +# CONFIG_MFD_MAX8925 is not set +# CONFIG_MFD_MAX8997 is not set +# CONFIG_MFD_MAX8998 is not set +# CONFIG_MFD_MT6360 is not set +# CONFIG_MFD_MT6370 is not set +# CONFIG_MFD_MT6397 is not set +# CONFIG_MFD_MENF21BMC is not set +# CONFIG_MFD_OCELOT is not set +# CONFIG_EZX_PCAP is not set +# CONFIG_MFD_CPCAP is not set +CONFIG_MFD_VIPERBOARD=m +# CONFIG_MFD_NTXEC is not set +# CONFIG_MFD_RETU is not set +# CONFIG_MFD_PCF50633 is not set +CONFIG_MFD_QCOM_RPM=m +CONFIG_MFD_SPMI_PMIC=m +# CONFIG_MFD_SY7636A is not set +# CONFIG_MFD_RDC321X is not set +# CONFIG_MFD_RT4831 is not set +# CONFIG_MFD_RT5033 is not set +# CONFIG_MFD_RT5120 is not set +# CONFIG_MFD_RC5T583 is not set +# CONFIG_MFD_RK8XX_I2C is not set +# CONFIG_MFD_RK8XX_SPI is not set +# CONFIG_MFD_RN5T618 is not set +# CONFIG_MFD_SEC_CORE is not set +# CONFIG_MFD_SI476X_CORE is not set +# CONFIG_MFD_SM501 is not set +# CONFIG_MFD_SKY81452 is not set +# CONFIG_MFD_STMPE is not set +CONFIG_MFD_SUN6I_PRCM=y +CONFIG_MFD_SYSCON=y +# CONFIG_MFD_TI_AM335X_TSCADC is not set +# CONFIG_MFD_LP3943 is not set +# CONFIG_MFD_LP8788 is not set +# CONFIG_MFD_TI_LMU is not set +# CONFIG_MFD_PALMAS is not set +# CONFIG_TPS6105X is not set +# CONFIG_TPS65010 is not set +# CONFIG_TPS6507X is not set +# CONFIG_MFD_TPS65086 is not set +# CONFIG_MFD_TPS65090 is not set +# CONFIG_MFD_TPS65217 is not set +# CONFIG_MFD_TI_LP873X is not set +# CONFIG_MFD_TI_LP87565 is not set +# CONFIG_MFD_TPS65218 is not set +# CONFIG_MFD_TPS65219 is not set +# CONFIG_MFD_TPS6586X is not set +# CONFIG_MFD_TPS65910 is not set +# CONFIG_MFD_TPS65912_I2C is not set +# CONFIG_MFD_TPS65912_SPI is not set +# CONFIG_MFD_TPS6594_I2C is not set +# CONFIG_MFD_TPS6594_SPI is not set +# CONFIG_TWL4030_CORE is not set +# CONFIG_TWL6040_CORE is not set +# CONFIG_MFD_WL1273_CORE is not set +# CONFIG_MFD_LM3533 is not set +# CONFIG_MFD_TC3589X is not set +# CONFIG_MFD_TQMX86 is not set +# CONFIG_MFD_VX855 is not set +# CONFIG_MFD_LOCHNAGAR is not set +# CONFIG_MFD_ARIZONA_I2C is not set +# CONFIG_MFD_ARIZONA_SPI is not set +# CONFIG_MFD_WM8400 is not set +# CONFIG_MFD_WM831X_I2C is not set +# CONFIG_MFD_WM831X_SPI is not set +# CONFIG_MFD_WM8350_I2C is not set +# CONFIG_MFD_WM8994 is not set +# CONFIG_MFD_ROHM_BD718XX is not set +# CONFIG_MFD_ROHM_BD71828 is not set +# CONFIG_MFD_ROHM_BD957XMUF is not set +# CONFIG_MFD_STPMIC1 is not set +# CONFIG_MFD_STMFX is not set +# CONFIG_MFD_ATC260X_I2C is not set +# CONFIG_MFD_KHADAS_MCU is not set +# CONFIG_MFD_QCOM_PM8008 is not set +CONFIG_MFD_VEXPRESS_SYSREG=y +# CONFIG_RAVE_SP_CORE is not set +# CONFIG_MFD_INTEL_M10_BMC_SPI is not set +# CONFIG_MFD_RSMU_I2C is not set +# CONFIG_MFD_RSMU_SPI is not set +# end of Multifunction device drivers + +CONFIG_REGULATOR=y +# CONFIG_REGULATOR_DEBUG is not set +CONFIG_REGULATOR_FIXED_VOLTAGE=m +# CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set +# CONFIG_REGULATOR_USERSPACE_CONSUMER is not set +# CONFIG_REGULATOR_88PG86X is not set +# CONFIG_REGULATOR_ACT8865 is not set +# CONFIG_REGULATOR_AD5398 is not set +# CONFIG_REGULATOR_AW37503 is not set +CONFIG_REGULATOR_AXP20X=m +# CONFIG_REGULATOR_CROS_EC is not set +# CONFIG_REGULATOR_DA9121 is not set +# CONFIG_REGULATOR_DA9210 is not set +# CONFIG_REGULATOR_DA9211 is not set +CONFIG_REGULATOR_FAN53555=m +# CONFIG_REGULATOR_FAN53880 is not set +CONFIG_REGULATOR_GPIO=m +CONFIG_REGULATOR_HI655X=m +# CONFIG_REGULATOR_ISL9305 is not set +# CONFIG_REGULATOR_ISL6271A is not set +# CONFIG_REGULATOR_LP3971 is not set +# CONFIG_REGULATOR_LP3972 is not set +# CONFIG_REGULATOR_LP872X is not set +# CONFIG_REGULATOR_LP8755 is not set +# CONFIG_REGULATOR_LTC3589 is not set +# CONFIG_REGULATOR_LTC3676 is not set +# CONFIG_REGULATOR_MAX1586 is not set +CONFIG_REGULATOR_MAX77620=m +# CONFIG_REGULATOR_MAX77857 is not set +# CONFIG_REGULATOR_MAX8649 is not set +# CONFIG_REGULATOR_MAX8660 is not set +# CONFIG_REGULATOR_MAX8893 is not set +# CONFIG_REGULATOR_MAX8952 is not set +# CONFIG_REGULATOR_MAX8973 is not set +# CONFIG_REGULATOR_MAX20086 is not set +# CONFIG_REGULATOR_MAX20411 is not set +# CONFIG_REGULATOR_MAX77826 is not set +# CONFIG_REGULATOR_MCP16502 is not set +# CONFIG_REGULATOR_MP5416 is not set +# CONFIG_REGULATOR_MP8859 is not set +# CONFIG_REGULATOR_MP886X is not set +# CONFIG_REGULATOR_MPQ7920 is not set +# CONFIG_REGULATOR_MT6311 is not set +# CONFIG_REGULATOR_MT6315 is not set +# CONFIG_REGULATOR_PCA9450 is not set +# CONFIG_REGULATOR_PF8X00 is not set +# CONFIG_REGULATOR_PFUZE100 is not set +# CONFIG_REGULATOR_PV88060 is not set +# CONFIG_REGULATOR_PV88080 is not set +# CONFIG_REGULATOR_PV88090 is not set +CONFIG_REGULATOR_PWM=m +# CONFIG_REGULATOR_QCOM_REFGEN is not set +CONFIG_REGULATOR_QCOM_RPM=m +CONFIG_REGULATOR_QCOM_SMD_RPM=m +CONFIG_REGULATOR_QCOM_SPMI=m +# CONFIG_REGULATOR_QCOM_USB_VBUS is not set +# CONFIG_REGULATOR_RAA215300 is not set +# CONFIG_REGULATOR_RASPBERRYPI_TOUCHSCREEN_ATTINY is not set +# CONFIG_REGULATOR_RT4801 is not set +# CONFIG_REGULATOR_RT4803 is not set +# CONFIG_REGULATOR_RT5190A is not set +# CONFIG_REGULATOR_RT5739 is not set +# CONFIG_REGULATOR_RT5759 is not set +# CONFIG_REGULATOR_RT6160 is not set +# CONFIG_REGULATOR_RT6190 is not set +# CONFIG_REGULATOR_RT6245 is not set +# CONFIG_REGULATOR_RTQ2134 is not set +# CONFIG_REGULATOR_RTMV20 is not set +# CONFIG_REGULATOR_RTQ6752 is not set +# CONFIG_REGULATOR_RTQ2208 is not set +# CONFIG_REGULATOR_SLG51000 is not set +# CONFIG_REGULATOR_SY8106A is not set +# CONFIG_REGULATOR_SY8824X is not set +# CONFIG_REGULATOR_SY8827N is not set +# CONFIG_REGULATOR_TPS51632 is not set +# CONFIG_REGULATOR_TPS62360 is not set +# CONFIG_REGULATOR_TPS6286X is not set +# CONFIG_REGULATOR_TPS6287X is not set +# CONFIG_REGULATOR_TPS65023 is not set +# CONFIG_REGULATOR_TPS6507X is not set +# CONFIG_REGULATOR_TPS65132 is not set +# CONFIG_REGULATOR_TPS6524X is not set +CONFIG_REGULATOR_VCTRL=m +# CONFIG_REGULATOR_VEXPRESS is not set +# CONFIG_REGULATOR_VQMMC_IPQ4019 is not set +# CONFIG_REGULATOR_QCOM_LABIBB is not set +CONFIG_RC_CORE=m +CONFIG_LIRC=y +CONFIG_RC_MAP=m +CONFIG_RC_DECODERS=y +CONFIG_IR_IMON_DECODER=m +CONFIG_IR_JVC_DECODER=m +CONFIG_IR_MCE_KBD_DECODER=m +CONFIG_IR_NEC_DECODER=m +CONFIG_IR_RC5_DECODER=m +CONFIG_IR_RC6_DECODER=m +# CONFIG_IR_RCMM_DECODER is not set +CONFIG_IR_SANYO_DECODER=m +CONFIG_IR_SHARP_DECODER=m +CONFIG_IR_SONY_DECODER=m +CONFIG_IR_XMP_DECODER=m +CONFIG_RC_DEVICES=y +CONFIG_IR_ENE=m +# CONFIG_IR_FINTEK is not set +# CONFIG_IR_GPIO_CIR is not set +# CONFIG_IR_GPIO_TX is not set +# CONFIG_IR_HIX5HD2 is not set +CONFIG_IR_IGORPLUGUSB=m +CONFIG_IR_IGUANA=m +CONFIG_IR_IMON=m +CONFIG_IR_IMON_RAW=m +# CONFIG_IR_ITE_CIR is not set +CONFIG_IR_MCEUSB=m +# CONFIG_IR_MESON is not set +# CONFIG_IR_MESON_TX is not set +# CONFIG_IR_NUVOTON is not set +# CONFIG_IR_PWM_TX is not set +CONFIG_IR_REDRAT3=m +# CONFIG_IR_SERIAL is not set +# CONFIG_IR_SPI is not set +CONFIG_IR_STREAMZAP=m +# CONFIG_IR_SUNXI is not set +# CONFIG_IR_TOY is not set +CONFIG_IR_TTUSBIR=m +CONFIG_RC_ATI_REMOTE=m +CONFIG_RC_LOOPBACK=m +# CONFIG_RC_XBOX_DVD is not set +CONFIG_CEC_CORE=m + +# +# CEC support +# +# CONFIG_MEDIA_CEC_RC is not set +CONFIG_MEDIA_CEC_SUPPORT=y +# CONFIG_CEC_CH7322 is not set +# CONFIG_CEC_CROS_EC is not set +# CONFIG_CEC_MESON_AO is not set +# CONFIG_CEC_MESON_G12A_AO is not set +# CONFIG_CEC_TEGRA is not set +CONFIG_USB_PULSE8_CEC=m +CONFIG_USB_RAINSHADOW_CEC=m +# end of CEC support + +CONFIG_MEDIA_SUPPORT=m +# CONFIG_MEDIA_SUPPORT_FILTER is not set +CONFIG_MEDIA_SUBDRV_AUTOSELECT=y + +# +# Media device types +# +CONFIG_MEDIA_CAMERA_SUPPORT=y +CONFIG_MEDIA_ANALOG_TV_SUPPORT=y +CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y +CONFIG_MEDIA_RADIO_SUPPORT=y +CONFIG_MEDIA_SDR_SUPPORT=y +CONFIG_MEDIA_PLATFORM_SUPPORT=y +CONFIG_MEDIA_TEST_SUPPORT=y +# end of Media device types + +# +# Media core support +# +CONFIG_VIDEO_DEV=m +CONFIG_MEDIA_CONTROLLER=y +CONFIG_DVB_CORE=m +# end of Media core support + +# +# Video4Linux options +# +CONFIG_VIDEO_V4L2_I2C=y +CONFIG_VIDEO_V4L2_SUBDEV_API=y +# CONFIG_VIDEO_ADV_DEBUG is not set +# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set +CONFIG_VIDEO_TUNER=m +CONFIG_V4L2_FWNODE=m +CONFIG_V4L2_ASYNC=m +# end of Video4Linux options + +# +# Media controller options +# +CONFIG_MEDIA_CONTROLLER_DVB=y +CONFIG_MEDIA_CONTROLLER_REQUEST_API=y +# end of Media controller options + +# +# Digital TV options +# +# CONFIG_DVB_MMAP is not set +CONFIG_DVB_NET=y +CONFIG_DVB_MAX_ADAPTERS=16 +CONFIG_DVB_DYNAMIC_MINORS=y +# CONFIG_DVB_DEMUX_SECTION_LOSS_LOG is not set +# CONFIG_DVB_ULE_DEBUG is not set +# end of Digital TV options + +# +# Media drivers +# + +# +# Media drivers +# +CONFIG_MEDIA_USB_SUPPORT=y + +# +# Webcam devices +# +CONFIG_USB_GSPCA=m +CONFIG_USB_GSPCA_BENQ=m +CONFIG_USB_GSPCA_CONEX=m +CONFIG_USB_GSPCA_CPIA1=m +CONFIG_USB_GSPCA_DTCS033=m +CONFIG_USB_GSPCA_ETOMS=m +CONFIG_USB_GSPCA_FINEPIX=m +CONFIG_USB_GSPCA_JEILINJ=m +CONFIG_USB_GSPCA_JL2005BCD=m +CONFIG_USB_GSPCA_KINECT=m +CONFIG_USB_GSPCA_KONICA=m +CONFIG_USB_GSPCA_MARS=m +CONFIG_USB_GSPCA_MR97310A=m +CONFIG_USB_GSPCA_NW80X=m +CONFIG_USB_GSPCA_OV519=m +CONFIG_USB_GSPCA_OV534=m +CONFIG_USB_GSPCA_OV534_9=m +CONFIG_USB_GSPCA_PAC207=m +CONFIG_USB_GSPCA_PAC7302=m +CONFIG_USB_GSPCA_PAC7311=m +CONFIG_USB_GSPCA_SE401=m +CONFIG_USB_GSPCA_SN9C2028=m +CONFIG_USB_GSPCA_SN9C20X=m +CONFIG_USB_GSPCA_SONIXB=m +CONFIG_USB_GSPCA_SONIXJ=m +CONFIG_USB_GSPCA_SPCA1528=m +CONFIG_USB_GSPCA_SPCA500=m +CONFIG_USB_GSPCA_SPCA501=m +CONFIG_USB_GSPCA_SPCA505=m +CONFIG_USB_GSPCA_SPCA506=m +CONFIG_USB_GSPCA_SPCA508=m +CONFIG_USB_GSPCA_SPCA561=m +CONFIG_USB_GSPCA_SQ905=m +CONFIG_USB_GSPCA_SQ905C=m +CONFIG_USB_GSPCA_SQ930X=m +CONFIG_USB_GSPCA_STK014=m +CONFIG_USB_GSPCA_STK1135=m +CONFIG_USB_GSPCA_STV0680=m +CONFIG_USB_GSPCA_SUNPLUS=m +CONFIG_USB_GSPCA_T613=m +CONFIG_USB_GSPCA_TOPRO=m +CONFIG_USB_GSPCA_TOUPTEK=m +CONFIG_USB_GSPCA_TV8532=m +CONFIG_USB_GSPCA_VC032X=m +CONFIG_USB_GSPCA_VICAM=m +CONFIG_USB_GSPCA_XIRLINK_CIT=m +CONFIG_USB_GSPCA_ZC3XX=m +CONFIG_USB_GL860=m +CONFIG_USB_M5602=m +CONFIG_USB_STV06XX=m +CONFIG_USB_PWC=m +# CONFIG_USB_PWC_DEBUG is not set +CONFIG_USB_PWC_INPUT_EVDEV=y +CONFIG_USB_S2255=m +CONFIG_VIDEO_USBTV=m +CONFIG_USB_VIDEO_CLASS=m +CONFIG_USB_VIDEO_CLASS_INPUT_EVDEV=y + +# +# Analog TV USB devices +# +CONFIG_VIDEO_GO7007=m +CONFIG_VIDEO_GO7007_USB=m +CONFIG_VIDEO_GO7007_LOADER=m +CONFIG_VIDEO_GO7007_USB_S2250_BOARD=m +CONFIG_VIDEO_HDPVR=m +CONFIG_VIDEO_PVRUSB2=m +CONFIG_VIDEO_PVRUSB2_SYSFS=y +CONFIG_VIDEO_PVRUSB2_DVB=y +# CONFIG_VIDEO_PVRUSB2_DEBUGIFC is not set +CONFIG_VIDEO_STK1160=m + +# +# Analog/digital TV USB devices +# +CONFIG_VIDEO_AU0828=m +CONFIG_VIDEO_AU0828_V4L2=y +CONFIG_VIDEO_AU0828_RC=y +CONFIG_VIDEO_CX231XX=m +CONFIG_VIDEO_CX231XX_RC=y +CONFIG_VIDEO_CX231XX_ALSA=m +CONFIG_VIDEO_CX231XX_DVB=m + +# +# Digital TV USB devices +# +CONFIG_DVB_AS102=m +CONFIG_DVB_B2C2_FLEXCOP_USB=m +# CONFIG_DVB_B2C2_FLEXCOP_USB_DEBUG is not set +CONFIG_DVB_USB_V2=m +CONFIG_DVB_USB_AF9015=m +CONFIG_DVB_USB_AF9035=m +CONFIG_DVB_USB_ANYSEE=m +CONFIG_DVB_USB_AU6610=m +CONFIG_DVB_USB_AZ6007=m +CONFIG_DVB_USB_CE6230=m +CONFIG_DVB_USB_DVBSKY=m +CONFIG_DVB_USB_EC168=m +CONFIG_DVB_USB_GL861=m +CONFIG_DVB_USB_LME2510=m +CONFIG_DVB_USB_MXL111SF=m +CONFIG_DVB_USB_RTL28XXU=m +CONFIG_DVB_USB_ZD1301=m +CONFIG_DVB_USB=m +# CONFIG_DVB_USB_DEBUG is not set +CONFIG_DVB_USB_A800=m +CONFIG_DVB_USB_AF9005=m +CONFIG_DVB_USB_AF9005_REMOTE=m +CONFIG_DVB_USB_AZ6027=m +CONFIG_DVB_USB_CINERGY_T2=m +CONFIG_DVB_USB_CXUSB=m +# CONFIG_DVB_USB_CXUSB_ANALOG is not set +CONFIG_DVB_USB_DIB0700=m +CONFIG_DVB_USB_DIB3000MC=m +CONFIG_DVB_USB_DIBUSB_MB=m +CONFIG_DVB_USB_DIBUSB_MB_FAULTY=y +CONFIG_DVB_USB_DIBUSB_MC=m +CONFIG_DVB_USB_DIGITV=m +CONFIG_DVB_USB_DTT200U=m +CONFIG_DVB_USB_DTV5100=m +CONFIG_DVB_USB_DW2102=m +CONFIG_DVB_USB_GP8PSK=m +CONFIG_DVB_USB_M920X=m +CONFIG_DVB_USB_NOVA_T_USB2=m +CONFIG_DVB_USB_OPERA1=m +CONFIG_DVB_USB_PCTV452E=m +CONFIG_DVB_USB_TECHNISAT_USB2=m +CONFIG_DVB_USB_TTUSB2=m +CONFIG_DVB_USB_UMT_010=m +CONFIG_DVB_USB_VP702X=m +CONFIG_DVB_USB_VP7045=m +CONFIG_SMS_USB_DRV=m +CONFIG_DVB_TTUSB_BUDGET=m +CONFIG_DVB_TTUSB_DEC=m + +# +# Webcam, TV (analog/digital) USB devices +# +CONFIG_VIDEO_EM28XX=m +CONFIG_VIDEO_EM28XX_V4L2=m +CONFIG_VIDEO_EM28XX_ALSA=m +CONFIG_VIDEO_EM28XX_DVB=m +CONFIG_VIDEO_EM28XX_RC=m + +# +# Software defined radio USB devices +# +CONFIG_USB_AIRSPY=m +CONFIG_USB_HACKRF=m +CONFIG_USB_MSI2500=m +CONFIG_MEDIA_PCI_SUPPORT=y + +# +# Media capture support +# +CONFIG_VIDEO_SOLO6X10=m +CONFIG_VIDEO_TW5864=m +CONFIG_VIDEO_TW68=m +CONFIG_VIDEO_TW686X=m +# CONFIG_VIDEO_ZORAN is not set + +# +# Media capture/analog TV support +# +CONFIG_VIDEO_DT3155=m +CONFIG_VIDEO_IVTV=m +CONFIG_VIDEO_IVTV_ALSA=m +CONFIG_VIDEO_FB_IVTV=m +CONFIG_VIDEO_HEXIUM_GEMINI=m +CONFIG_VIDEO_HEXIUM_ORION=m +CONFIG_VIDEO_MXB=m + +# +# Media capture/analog/hybrid TV support +# +CONFIG_VIDEO_BT848=m +CONFIG_DVB_BT8XX=m +CONFIG_VIDEO_CX18=m +CONFIG_VIDEO_CX18_ALSA=m +CONFIG_VIDEO_CX23885=m +CONFIG_MEDIA_ALTERA_CI=m +# CONFIG_VIDEO_CX25821 is not set +CONFIG_VIDEO_CX88=m +CONFIG_VIDEO_CX88_ALSA=m +CONFIG_VIDEO_CX88_BLACKBIRD=m +CONFIG_VIDEO_CX88_DVB=m +CONFIG_VIDEO_CX88_ENABLE_VP3054=y +CONFIG_VIDEO_CX88_VP3054=m +CONFIG_VIDEO_CX88_MPEG=m +CONFIG_VIDEO_SAA7134=m +CONFIG_VIDEO_SAA7134_ALSA=m +CONFIG_VIDEO_SAA7134_RC=y +CONFIG_VIDEO_SAA7134_DVB=m +# CONFIG_VIDEO_SAA7134_GO7007 is not set +CONFIG_VIDEO_SAA7164=m + +# +# Media digital TV PCI Adapters +# +CONFIG_DVB_B2C2_FLEXCOP_PCI=m +# CONFIG_DVB_B2C2_FLEXCOP_PCI_DEBUG is not set +CONFIG_DVB_DDBRIDGE=m +# CONFIG_DVB_DDBRIDGE_MSIENABLE is not set +CONFIG_DVB_DM1105=m +CONFIG_MANTIS_CORE=m +CONFIG_DVB_MANTIS=m +CONFIG_DVB_HOPPER=m +CONFIG_DVB_NETUP_UNIDVB=m +CONFIG_DVB_NGENE=m +CONFIG_DVB_PLUTO2=m +CONFIG_DVB_PT1=m +CONFIG_DVB_PT3=m +CONFIG_DVB_SMIPCIE=m +CONFIG_DVB_BUDGET_CORE=m +CONFIG_DVB_BUDGET=m +CONFIG_DVB_BUDGET_CI=m +CONFIG_DVB_BUDGET_AV=m +# CONFIG_IPU_BRIDGE is not set +CONFIG_RADIO_ADAPTERS=m +# CONFIG_RADIO_MAXIRADIO is not set +# CONFIG_RADIO_SAA7706H is not set +CONFIG_RADIO_SHARK=m +CONFIG_RADIO_SHARK2=m +# CONFIG_RADIO_SI4713 is not set +CONFIG_RADIO_TEA575X=m +# CONFIG_RADIO_TEA5764 is not set +# CONFIG_RADIO_TEF6862 is not set +# CONFIG_RADIO_WL1273 is not set +# CONFIG_USB_DSBR is not set +CONFIG_USB_KEENE=m +CONFIG_USB_MA901=m +CONFIG_USB_MR800=m +CONFIG_USB_RAREMONO=m +CONFIG_RADIO_SI470X=m +CONFIG_USB_SI470X=m +# CONFIG_I2C_SI470X is not set +# CONFIG_RADIO_WL128X is not set +CONFIG_MEDIA_PLATFORM_DRIVERS=y +CONFIG_V4L_PLATFORM_DRIVERS=y +# CONFIG_SDR_PLATFORM_DRIVERS is not set +# CONFIG_DVB_PLATFORM_DRIVERS is not set +CONFIG_V4L_MEM2MEM_DRIVERS=y +# CONFIG_VIDEO_MEM2MEM_DEINTERLACE is not set +# CONFIG_VIDEO_MUX is not set + +# +# Allegro DVT media platform drivers +# +# CONFIG_VIDEO_ALLEGRO_DVT is not set + +# +# Amlogic media platform drivers +# +# CONFIG_VIDEO_MESON_GE2D is not set + +# +# Amphion drivers +# + +# +# Aspeed media platform drivers +# + +# +# Atmel media platform drivers +# + +# +# Cadence media platform drivers +# +# CONFIG_VIDEO_CADENCE_CSI2RX is not set +# CONFIG_VIDEO_CADENCE_CSI2TX is not set + +# +# Chips&Media media platform drivers +# + +# +# Intel media platform drivers +# + +# +# Marvell media platform drivers +# +CONFIG_VIDEO_CAFE_CCIC=m + +# +# Mediatek media platform drivers +# + +# +# Microchip Technology, Inc. media platform drivers +# + +# +# NVidia media platform drivers +# +# CONFIG_VIDEO_TEGRA_VDE is not set + +# +# NXP media platform drivers +# + +# +# Qualcomm media platform drivers +# +# CONFIG_VIDEO_QCOM_CAMSS is not set + +# +# Renesas media platform drivers +# + +# +# Rockchip media platform drivers +# +# CONFIG_VIDEO_ROCKCHIP_RGA is not set +# CONFIG_VIDEO_ROCKCHIP_ISP1 is not set + +# +# Samsung media platform drivers +# + +# +# STMicroelectronics media platform drivers +# + +# +# Sunxi media platform drivers +# +# CONFIG_VIDEO_SUN4I_CSI is not set +# CONFIG_VIDEO_SUN6I_CSI is not set +# CONFIG_VIDEO_SUN8I_A83T_MIPI_CSI2 is not set +# CONFIG_VIDEO_SUN8I_DEINTERLACE is not set +# CONFIG_VIDEO_SUN8I_ROTATE is not set + +# +# Texas Instruments drivers +# + +# +# Verisilicon media platform drivers +# +# CONFIG_VIDEO_HANTRO is not set + +# +# VIA media platform drivers +# + +# +# Xilinx media platform drivers +# +# CONFIG_VIDEO_XILINX is not set + +# +# MMC/SDIO DVB adapters +# +CONFIG_SMS_SDIO_DRV=m +CONFIG_V4L_TEST_DRIVERS=y +# CONFIG_VIDEO_VIM2M is not set +# CONFIG_VIDEO_VICODEC is not set +# CONFIG_VIDEO_VIMC is not set +CONFIG_VIDEO_VIVID=m +CONFIG_VIDEO_VIVID_CEC=y +CONFIG_VIDEO_VIVID_MAX_DEVS=64 +# CONFIG_VIDEO_VISL is not set +# CONFIG_DVB_TEST_DRIVERS is not set + +# +# FireWire (IEEE 1394) Adapters +# +CONFIG_DVB_FIREDTV=m +CONFIG_DVB_FIREDTV_INPUT=y +CONFIG_MEDIA_COMMON_OPTIONS=y + +# +# common driver options +# +CONFIG_CYPRESS_FIRMWARE=m +CONFIG_TTPCI_EEPROM=m +CONFIG_UVC_COMMON=m +CONFIG_VIDEO_CX2341X=m +CONFIG_VIDEO_TVEEPROM=m +CONFIG_DVB_B2C2_FLEXCOP=m +CONFIG_VIDEO_SAA7146=m +CONFIG_VIDEO_SAA7146_VV=m +CONFIG_SMS_SIANO_MDTV=m +CONFIG_SMS_SIANO_RC=y +# CONFIG_SMS_SIANO_DEBUGFS is not set +CONFIG_VIDEO_V4L2_TPG=m +CONFIG_VIDEOBUF2_CORE=m +CONFIG_VIDEOBUF2_V4L2=m +CONFIG_VIDEOBUF2_MEMOPS=m +CONFIG_VIDEOBUF2_DMA_CONTIG=m +CONFIG_VIDEOBUF2_VMALLOC=m +CONFIG_VIDEOBUF2_DMA_SG=m +CONFIG_VIDEOBUF2_DVB=m +# end of Media drivers + +# +# Media ancillary drivers +# +CONFIG_MEDIA_ATTACH=y + +# +# IR I2C driver auto-selected by 'Autoselect ancillary drivers' +# +CONFIG_VIDEO_IR_I2C=m +CONFIG_VIDEO_CAMERA_SENSOR=y +# CONFIG_VIDEO_AR0521 is not set +# CONFIG_VIDEO_HI556 is not set +# CONFIG_VIDEO_HI846 is not set +# CONFIG_VIDEO_HI847 is not set +# CONFIG_VIDEO_IMX208 is not set +# CONFIG_VIDEO_IMX214 is not set +# CONFIG_VIDEO_IMX219 is not set +# CONFIG_VIDEO_IMX258 is not set +# CONFIG_VIDEO_IMX274 is not set +# CONFIG_VIDEO_IMX290 is not set +# CONFIG_VIDEO_IMX296 is not set +# CONFIG_VIDEO_IMX319 is not set +# CONFIG_VIDEO_IMX334 is not set +# CONFIG_VIDEO_IMX335 is not set +# CONFIG_VIDEO_IMX355 is not set +# CONFIG_VIDEO_IMX412 is not set +# CONFIG_VIDEO_IMX415 is not set +# CONFIG_VIDEO_MT9M001 is not set +# CONFIG_VIDEO_MT9M111 is not set +# CONFIG_VIDEO_MT9P031 is not set +# CONFIG_VIDEO_MT9T112 is not set +CONFIG_VIDEO_MT9V011=m +# CONFIG_VIDEO_MT9V032 is not set +# CONFIG_VIDEO_MT9V111 is not set +# CONFIG_VIDEO_OG01A1B is not set +# CONFIG_VIDEO_OV01A10 is not set +# CONFIG_VIDEO_OV02A10 is not set +# CONFIG_VIDEO_OV08D10 is not set +# CONFIG_VIDEO_OV08X40 is not set +# CONFIG_VIDEO_OV13858 is not set +# CONFIG_VIDEO_OV13B10 is not set +CONFIG_VIDEO_OV2640=m +# CONFIG_VIDEO_OV2659 is not set +# CONFIG_VIDEO_OV2680 is not set +# CONFIG_VIDEO_OV2685 is not set +# CONFIG_VIDEO_OV2740 is not set +# CONFIG_VIDEO_OV4689 is not set +# CONFIG_VIDEO_OV5640 is not set +# CONFIG_VIDEO_OV5645 is not set +# CONFIG_VIDEO_OV5647 is not set +# CONFIG_VIDEO_OV5648 is not set +# CONFIG_VIDEO_OV5670 is not set +# CONFIG_VIDEO_OV5675 is not set +# CONFIG_VIDEO_OV5693 is not set +# CONFIG_VIDEO_OV5695 is not set +# CONFIG_VIDEO_OV6650 is not set +# CONFIG_VIDEO_OV7251 is not set +CONFIG_VIDEO_OV7640=m +CONFIG_VIDEO_OV7670=m +# CONFIG_VIDEO_OV772X is not set +# CONFIG_VIDEO_OV7740 is not set +# CONFIG_VIDEO_OV8856 is not set +# CONFIG_VIDEO_OV8858 is not set +# CONFIG_VIDEO_OV8865 is not set +# CONFIG_VIDEO_OV9282 is not set +# CONFIG_VIDEO_OV9640 is not set +# CONFIG_VIDEO_OV9650 is not set +# CONFIG_VIDEO_OV9734 is not set +# CONFIG_VIDEO_RDACM20 is not set +# CONFIG_VIDEO_RDACM21 is not set +# CONFIG_VIDEO_RJ54N1 is not set +# CONFIG_VIDEO_S5C73M3 is not set +# CONFIG_VIDEO_S5K5BAF is not set +# CONFIG_VIDEO_S5K6A3 is not set +# CONFIG_VIDEO_ST_VGXY61 is not set +# CONFIG_VIDEO_CCS is not set +# CONFIG_VIDEO_ET8EK8 is not set + +# +# Lens drivers +# +# CONFIG_VIDEO_AD5820 is not set +# CONFIG_VIDEO_AK7375 is not set +# CONFIG_VIDEO_DW9714 is not set +# CONFIG_VIDEO_DW9719 is not set +# CONFIG_VIDEO_DW9768 is not set +# CONFIG_VIDEO_DW9807_VCM is not set +# end of Lens drivers + +# +# Flash devices +# +# CONFIG_VIDEO_ADP1653 is not set +# CONFIG_VIDEO_LM3560 is not set +# CONFIG_VIDEO_LM3646 is not set +# end of Flash devices + +# +# Audio decoders, processors and mixers +# +CONFIG_VIDEO_CS3308=m +CONFIG_VIDEO_CS5345=m +CONFIG_VIDEO_CS53L32A=m +CONFIG_VIDEO_MSP3400=m +CONFIG_VIDEO_SONY_BTF_MPX=m +# CONFIG_VIDEO_TDA1997X is not set +CONFIG_VIDEO_TDA7432=m +CONFIG_VIDEO_TDA9840=m +CONFIG_VIDEO_TEA6415C=m +CONFIG_VIDEO_TEA6420=m +CONFIG_VIDEO_TLV320AIC23B=m +CONFIG_VIDEO_TVAUDIO=m +CONFIG_VIDEO_UDA1342=m +CONFIG_VIDEO_VP27SMPX=m +CONFIG_VIDEO_WM8739=m +CONFIG_VIDEO_WM8775=m +# end of Audio decoders, processors and mixers + +# +# RDS decoders +# +CONFIG_VIDEO_SAA6588=m +# end of RDS decoders + +# +# Video decoders +# +# CONFIG_VIDEO_ADV7180 is not set +# CONFIG_VIDEO_ADV7183 is not set +# CONFIG_VIDEO_ADV748X is not set +# CONFIG_VIDEO_ADV7604 is not set +# CONFIG_VIDEO_ADV7842 is not set +CONFIG_VIDEO_BT819=m +CONFIG_VIDEO_BT856=m +# CONFIG_VIDEO_BT866 is not set +# CONFIG_VIDEO_ISL7998X is not set +CONFIG_VIDEO_KS0127=m +# CONFIG_VIDEO_MAX9286 is not set +# CONFIG_VIDEO_ML86V7667 is not set +CONFIG_VIDEO_SAA7110=m +CONFIG_VIDEO_SAA711X=m +# CONFIG_VIDEO_TC358743 is not set +# CONFIG_VIDEO_TC358746 is not set +# CONFIG_VIDEO_TVP514X is not set +CONFIG_VIDEO_TVP5150=m +# CONFIG_VIDEO_TVP7002 is not set +CONFIG_VIDEO_TW2804=m +CONFIG_VIDEO_TW9903=m +CONFIG_VIDEO_TW9906=m +# CONFIG_VIDEO_TW9910 is not set +CONFIG_VIDEO_VPX3220=m + +# +# Video and audio decoders +# +CONFIG_VIDEO_SAA717X=m +CONFIG_VIDEO_CX25840=m +# end of Video decoders + +# +# Video encoders +# +CONFIG_VIDEO_ADV7170=m +CONFIG_VIDEO_ADV7175=m +# CONFIG_VIDEO_ADV7343 is not set +# CONFIG_VIDEO_ADV7393 is not set +# CONFIG_VIDEO_AK881X is not set +CONFIG_VIDEO_SAA7127=m +CONFIG_VIDEO_SAA7185=m +# CONFIG_VIDEO_THS8200 is not set +# end of Video encoders + +# +# Video improvement chips +# +CONFIG_VIDEO_UPD64031A=m +CONFIG_VIDEO_UPD64083=m +# end of Video improvement chips + +# +# Audio/Video compression chips +# +CONFIG_VIDEO_SAA6752HS=m +# end of Audio/Video compression chips + +# +# SDR tuner chips +# +# CONFIG_SDR_MAX2175 is not set +# end of SDR tuner chips + +# +# Miscellaneous helper chips +# +# CONFIG_VIDEO_I2C is not set +CONFIG_VIDEO_M52790=m +# CONFIG_VIDEO_ST_MIPID02 is not set +# CONFIG_VIDEO_THS7303 is not set +# end of Miscellaneous helper chips + +# +# Video serializers and deserializers +# +# CONFIG_VIDEO_DS90UB913 is not set +# CONFIG_VIDEO_DS90UB953 is not set +# CONFIG_VIDEO_DS90UB960 is not set +# end of Video serializers and deserializers + +# +# Media SPI Adapters +# +# CONFIG_CXD2880_SPI_DRV is not set +# CONFIG_VIDEO_GS1662 is not set +# end of Media SPI Adapters + +CONFIG_MEDIA_TUNER=m + +# +# Customize TV tuners +# +CONFIG_MEDIA_TUNER_E4000=m +CONFIG_MEDIA_TUNER_FC0011=m +CONFIG_MEDIA_TUNER_FC0012=m +CONFIG_MEDIA_TUNER_FC0013=m +CONFIG_MEDIA_TUNER_FC2580=m +CONFIG_MEDIA_TUNER_IT913X=m +CONFIG_MEDIA_TUNER_M88RS6000T=m +CONFIG_MEDIA_TUNER_MAX2165=m +CONFIG_MEDIA_TUNER_MC44S803=m +CONFIG_MEDIA_TUNER_MSI001=m +CONFIG_MEDIA_TUNER_MT2060=m +CONFIG_MEDIA_TUNER_MT2063=m +CONFIG_MEDIA_TUNER_MT20XX=m +CONFIG_MEDIA_TUNER_MT2131=m +CONFIG_MEDIA_TUNER_MT2266=m +CONFIG_MEDIA_TUNER_MXL301RF=m +CONFIG_MEDIA_TUNER_MXL5005S=m +CONFIG_MEDIA_TUNER_MXL5007T=m +CONFIG_MEDIA_TUNER_QM1D1B0004=m +CONFIG_MEDIA_TUNER_QM1D1C0042=m +CONFIG_MEDIA_TUNER_QT1010=m +CONFIG_MEDIA_TUNER_R820T=m +CONFIG_MEDIA_TUNER_SI2157=m +CONFIG_MEDIA_TUNER_SIMPLE=m +CONFIG_MEDIA_TUNER_TDA18212=m +CONFIG_MEDIA_TUNER_TDA18218=m +CONFIG_MEDIA_TUNER_TDA18250=m +CONFIG_MEDIA_TUNER_TDA18271=m +CONFIG_MEDIA_TUNER_TDA827X=m +CONFIG_MEDIA_TUNER_TDA8290=m +CONFIG_MEDIA_TUNER_TDA9887=m +CONFIG_MEDIA_TUNER_TEA5761=m +CONFIG_MEDIA_TUNER_TEA5767=m +CONFIG_MEDIA_TUNER_TUA9001=m +CONFIG_MEDIA_TUNER_XC2028=m +CONFIG_MEDIA_TUNER_XC4000=m +CONFIG_MEDIA_TUNER_XC5000=m +# end of Customize TV tuners + +# +# Customise DVB Frontends +# + +# +# Multistandard (satellite) frontends +# +CONFIG_DVB_M88DS3103=m +CONFIG_DVB_MXL5XX=m +CONFIG_DVB_STB0899=m +CONFIG_DVB_STB6100=m +CONFIG_DVB_STV090x=m +CONFIG_DVB_STV0910=m +CONFIG_DVB_STV6110x=m +CONFIG_DVB_STV6111=m + +# +# Multistandard (cable + terrestrial) frontends +# +CONFIG_DVB_DRXK=m +CONFIG_DVB_MN88472=m +CONFIG_DVB_MN88473=m +CONFIG_DVB_SI2165=m +CONFIG_DVB_TDA18271C2DD=m + +# +# DVB-S (satellite) frontends +# +CONFIG_DVB_CX24110=m +CONFIG_DVB_CX24116=m +CONFIG_DVB_CX24117=m +CONFIG_DVB_CX24120=m +CONFIG_DVB_CX24123=m +CONFIG_DVB_DS3000=m +CONFIG_DVB_MB86A16=m +CONFIG_DVB_MT312=m +CONFIG_DVB_S5H1420=m +CONFIG_DVB_SI21XX=m +CONFIG_DVB_STB6000=m +CONFIG_DVB_STV0288=m +CONFIG_DVB_STV0299=m +CONFIG_DVB_STV0900=m +CONFIG_DVB_STV6110=m +CONFIG_DVB_TDA10071=m +CONFIG_DVB_TDA10086=m +CONFIG_DVB_TDA8083=m +CONFIG_DVB_TDA8261=m +CONFIG_DVB_TDA826X=m +CONFIG_DVB_TS2020=m +CONFIG_DVB_TUA6100=m +CONFIG_DVB_TUNER_CX24113=m +CONFIG_DVB_TUNER_ITD1000=m +CONFIG_DVB_VES1X93=m +CONFIG_DVB_ZL10036=m +CONFIG_DVB_ZL10039=m + +# +# DVB-T (terrestrial) frontends +# +CONFIG_DVB_AF9013=m +CONFIG_DVB_AS102_FE=m +CONFIG_DVB_CX22700=m +CONFIG_DVB_CX22702=m +CONFIG_DVB_CXD2820R=m +CONFIG_DVB_CXD2841ER=m +CONFIG_DVB_DIB3000MB=m +CONFIG_DVB_DIB3000MC=m +CONFIG_DVB_DIB7000M=m +CONFIG_DVB_DIB7000P=m +# CONFIG_DVB_DIB9000 is not set +CONFIG_DVB_DRXD=m +CONFIG_DVB_EC100=m +CONFIG_DVB_GP8PSK_FE=m +CONFIG_DVB_L64781=m +CONFIG_DVB_MT352=m +CONFIG_DVB_NXT6000=m +CONFIG_DVB_RTL2830=m +CONFIG_DVB_RTL2832=m +CONFIG_DVB_RTL2832_SDR=m +# CONFIG_DVB_S5H1432 is not set +CONFIG_DVB_SI2168=m +CONFIG_DVB_SP887X=m +CONFIG_DVB_STV0367=m +CONFIG_DVB_TDA10048=m +CONFIG_DVB_TDA1004X=m +CONFIG_DVB_ZD1301_DEMOD=m +CONFIG_DVB_ZL10353=m +# CONFIG_DVB_CXD2880 is not set + +# +# DVB-C (cable) frontends +# +CONFIG_DVB_STV0297=m +CONFIG_DVB_TDA10021=m +CONFIG_DVB_TDA10023=m +CONFIG_DVB_VES1820=m + +# +# ATSC (North American/Korean Terrestrial/Cable DTV) frontends +# +CONFIG_DVB_AU8522=m +CONFIG_DVB_AU8522_DTV=m +CONFIG_DVB_AU8522_V4L=m +CONFIG_DVB_BCM3510=m +CONFIG_DVB_LG2160=m +CONFIG_DVB_LGDT3305=m +CONFIG_DVB_LGDT3306A=m +CONFIG_DVB_LGDT330X=m +CONFIG_DVB_MXL692=m +CONFIG_DVB_NXT200X=m +CONFIG_DVB_OR51132=m +CONFIG_DVB_OR51211=m +CONFIG_DVB_S5H1409=m +CONFIG_DVB_S5H1411=m + +# +# ISDB-T (terrestrial) frontends +# +CONFIG_DVB_DIB8000=m +CONFIG_DVB_MB86A20S=m +CONFIG_DVB_S921=m + +# +# ISDB-S (satellite) & ISDB-T (terrestrial) frontends +# +# CONFIG_DVB_MN88443X is not set +CONFIG_DVB_TC90522=m + +# +# Digital terrestrial only tuners/PLL +# +CONFIG_DVB_PLL=m +CONFIG_DVB_TUNER_DIB0070=m +CONFIG_DVB_TUNER_DIB0090=m + +# +# SEC control devices for DVB-S +# +CONFIG_DVB_A8293=m +CONFIG_DVB_AF9033=m +CONFIG_DVB_ASCOT2E=m +CONFIG_DVB_ATBM8830=m +CONFIG_DVB_HELENE=m +CONFIG_DVB_HORUS3A=m +CONFIG_DVB_ISL6405=m +CONFIG_DVB_ISL6421=m +CONFIG_DVB_ISL6423=m +CONFIG_DVB_IX2505V=m +# CONFIG_DVB_LGS8GL5 is not set +CONFIG_DVB_LGS8GXX=m +CONFIG_DVB_LNBH25=m +# CONFIG_DVB_LNBH29 is not set +CONFIG_DVB_LNBP21=m +CONFIG_DVB_LNBP22=m +CONFIG_DVB_M88RS2000=m +CONFIG_DVB_TDA665x=m +CONFIG_DVB_DRX39XYJ=m + +# +# Common Interface (EN50221) controller drivers +# +CONFIG_DVB_CXD2099=m +CONFIG_DVB_SP2=m +# end of Customise DVB Frontends + +# +# Tools to develop new frontends +# +CONFIG_DVB_DUMMY_FE=m +# end of Media ancillary drivers + +# +# Graphics support +# +CONFIG_APERTURE_HELPERS=y +CONFIG_VIDEO_CMDLINE=y +CONFIG_VIDEO_NOMODESET=y +# CONFIG_AUXDISPLAY is not set +# CONFIG_PANEL is not set +CONFIG_TEGRA_HOST1X_CONTEXT_BUS=y +CONFIG_TEGRA_HOST1X=m +CONFIG_TEGRA_HOST1X_FIREWALL=y +CONFIG_DRM=m +CONFIG_DRM_MIPI_DSI=y +CONFIG_DRM_KMS_HELPER=m +# CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS is not set +# CONFIG_DRM_DEBUG_MODESET_LOCK is not set +CONFIG_DRM_FBDEV_EMULATION=y +CONFIG_DRM_FBDEV_OVERALLOC=100 +# CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM is not set +CONFIG_DRM_LOAD_EDID_FIRMWARE=y +CONFIG_DRM_DP_AUX_BUS=m +CONFIG_DRM_DISPLAY_HELPER=m +CONFIG_DRM_DISPLAY_DP_HELPER=y +CONFIG_DRM_DISPLAY_HDCP_HELPER=y +CONFIG_DRM_DISPLAY_HDMI_HELPER=y +CONFIG_DRM_DP_AUX_CHARDEV=y +# CONFIG_DRM_DP_CEC is not set +CONFIG_DRM_TTM=m +CONFIG_DRM_EXEC=m +CONFIG_DRM_BUDDY=m +CONFIG_DRM_VRAM_HELPER=m +CONFIG_DRM_TTM_HELPER=m +CONFIG_DRM_GEM_DMA_HELPER=m +CONFIG_DRM_GEM_SHMEM_HELPER=m +CONFIG_DRM_SUBALLOC_HELPER=m +CONFIG_DRM_SCHED=m + +# +# I2C encoder or helper chips +# +# CONFIG_DRM_I2C_CH7006 is not set +# CONFIG_DRM_I2C_SIL164 is not set +# CONFIG_DRM_I2C_NXP_TDA998X is not set +# CONFIG_DRM_I2C_NXP_TDA9950 is not set +# end of I2C encoder or helper chips + +# +# ARM devices +# +CONFIG_DRM_HDLCD=m +# CONFIG_DRM_HDLCD_SHOW_UNDERRUN is not set +CONFIG_DRM_MALI_DISPLAY=m +# CONFIG_DRM_KOMEDA is not set +# end of ARM devices + +CONFIG_DRM_RADEON=m +# CONFIG_DRM_RADEON_USERPTR is not set +CONFIG_DRM_AMDGPU=m +CONFIG_DRM_AMDGPU_SI=y +CONFIG_DRM_AMDGPU_CIK=y +# CONFIG_DRM_AMDGPU_USERPTR is not set +# CONFIG_DRM_AMDGPU_WERROR is not set + +# +# ACP (Audio CoProcessor) Configuration +# +# CONFIG_DRM_AMD_ACP is not set +# end of ACP (Audio CoProcessor) Configuration + +# +# Display Engine Configuration +# +CONFIG_DRM_AMD_DC=y +CONFIG_DRM_AMD_DC_FP=y +# CONFIG_DRM_AMD_DC_SI is not set +# CONFIG_DRM_AMD_SECURE_DISPLAY is not set +# end of Display Engine Configuration + +# CONFIG_HSA_AMD is not set +CONFIG_DRM_NOUVEAU=m +CONFIG_NOUVEAU_PLATFORM_DRIVER=y +CONFIG_NOUVEAU_DEBUG=5 +CONFIG_NOUVEAU_DEBUG_DEFAULT=3 +# CONFIG_NOUVEAU_DEBUG_MMU is not set +# CONFIG_NOUVEAU_DEBUG_PUSH is not set +CONFIG_DRM_NOUVEAU_BACKLIGHT=y +CONFIG_DRM_VGEM=m +# CONFIG_DRM_VKMS is not set +CONFIG_DRM_ROCKCHIP=m +CONFIG_ROCKCHIP_VOP=y +# CONFIG_ROCKCHIP_VOP2 is not set +CONFIG_ROCKCHIP_ANALOGIX_DP=y +CONFIG_ROCKCHIP_CDN_DP=y +CONFIG_ROCKCHIP_DW_HDMI=y +CONFIG_ROCKCHIP_DW_MIPI_DSI=y +# CONFIG_ROCKCHIP_INNO_HDMI is not set +# CONFIG_ROCKCHIP_LVDS is not set +# CONFIG_ROCKCHIP_RGB is not set +# CONFIG_ROCKCHIP_RK3066_HDMI is not set +# CONFIG_DRM_VMWGFX is not set +CONFIG_DRM_UDL=m +CONFIG_DRM_AST=m +# CONFIG_DRM_MGAG200 is not set +CONFIG_DRM_SUN4I=m +# CONFIG_DRM_SUN6I_DSI is not set +CONFIG_DRM_SUN8I_DW_HDMI=m +# CONFIG_DRM_SUN8I_MIXER is not set +CONFIG_DRM_QXL=m +CONFIG_DRM_VIRTIO_GPU=m +CONFIG_DRM_VIRTIO_GPU_KMS=y +CONFIG_DRM_MSM=m +CONFIG_DRM_MSM_GPU_STATE=y +# CONFIG_DRM_MSM_GPU_SUDO is not set +CONFIG_DRM_MSM_MDSS=y +CONFIG_DRM_MSM_MDP4=y +CONFIG_DRM_MSM_MDP5=y +CONFIG_DRM_MSM_DPU=y +CONFIG_DRM_MSM_DP=y +CONFIG_DRM_MSM_DSI=y +CONFIG_DRM_MSM_DSI_28NM_PHY=y +CONFIG_DRM_MSM_DSI_20NM_PHY=y +CONFIG_DRM_MSM_DSI_28NM_8960_PHY=y +CONFIG_DRM_MSM_DSI_14NM_PHY=y +CONFIG_DRM_MSM_DSI_10NM_PHY=y +CONFIG_DRM_MSM_DSI_7NM_PHY=y +CONFIG_DRM_MSM_HDMI=y +CONFIG_DRM_MSM_HDMI_HDCP=y +CONFIG_DRM_TEGRA=m +# CONFIG_DRM_TEGRA_DEBUG is not set +CONFIG_DRM_TEGRA_STAGING=y +CONFIG_DRM_PANEL=y + +# +# Display Panels +# +# CONFIG_DRM_PANEL_ABT_Y030XX067A is not set +# CONFIG_DRM_PANEL_ARM_VERSATILE is not set +# CONFIG_DRM_PANEL_ASUS_Z00T_TM5P5_NT35596 is not set +# CONFIG_DRM_PANEL_AUO_A030JTN01 is not set +# CONFIG_DRM_PANEL_BOE_BF060Y8M_AJ0 is not set +# CONFIG_DRM_PANEL_BOE_HIMAX8279D is not set +# CONFIG_DRM_PANEL_BOE_TV101WUM_NL6 is not set +# CONFIG_DRM_PANEL_DSI_CM is not set +# CONFIG_DRM_PANEL_LVDS is not set +CONFIG_DRM_PANEL_SIMPLE=m +# CONFIG_DRM_PANEL_EDP is not set +# CONFIG_DRM_PANEL_EBBG_FT8719 is not set +# CONFIG_DRM_PANEL_ELIDA_KD35T133 is not set +# CONFIG_DRM_PANEL_FEIXIN_K101_IM2BA02 is not set +# CONFIG_DRM_PANEL_FEIYANG_FY07024DI26A30D is not set +# CONFIG_DRM_PANEL_HIMAX_HX8394 is not set +# CONFIG_DRM_PANEL_ILITEK_IL9322 is not set +# CONFIG_DRM_PANEL_ILITEK_ILI9341 is not set +# CONFIG_DRM_PANEL_ILITEK_ILI9881C is not set +# CONFIG_DRM_PANEL_INNOLUX_EJ030NA is not set +# CONFIG_DRM_PANEL_INNOLUX_P079ZCA is not set +# CONFIG_DRM_PANEL_JADARD_JD9365DA_H3 is not set +# CONFIG_DRM_PANEL_JDI_LT070ME05000 is not set +# CONFIG_DRM_PANEL_JDI_R63452 is not set +# CONFIG_DRM_PANEL_KHADAS_TS050 is not set +# CONFIG_DRM_PANEL_KINGDISPLAY_KD097D04 is not set +# CONFIG_DRM_PANEL_LEADTEK_LTK050H3146W is not set +# CONFIG_DRM_PANEL_LEADTEK_LTK500HD1829 is not set +# CONFIG_DRM_PANEL_SAMSUNG_LD9040 is not set +# CONFIG_DRM_PANEL_LG_LB035Q02 is not set +# CONFIG_DRM_PANEL_LG_LG4573 is not set +# CONFIG_DRM_PANEL_MAGNACHIP_D53E6EA8966 is not set +# CONFIG_DRM_PANEL_NEC_NL8048HL11 is not set +# CONFIG_DRM_PANEL_NEWVISION_NV3051D is not set +# CONFIG_DRM_PANEL_NEWVISION_NV3052C is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35510 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35560 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35950 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT36523 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT36672A is not set +# CONFIG_DRM_PANEL_NOVATEK_NT39016 is not set +# CONFIG_DRM_PANEL_MANTIX_MLAF057WE51 is not set +# CONFIG_DRM_PANEL_OLIMEX_LCD_OLINUXINO is not set +# CONFIG_DRM_PANEL_ORISETECH_OTA5601A is not set +# CONFIG_DRM_PANEL_ORISETECH_OTM8009A is not set +# CONFIG_DRM_PANEL_OSD_OSD101T2587_53TS is not set +# CONFIG_DRM_PANEL_PANASONIC_VVX10F034N00 is not set +CONFIG_DRM_PANEL_RASPBERRYPI_TOUCHSCREEN=m +# CONFIG_DRM_PANEL_RAYDIUM_RM67191 is not set +# CONFIG_DRM_PANEL_RAYDIUM_RM68200 is not set +# CONFIG_DRM_PANEL_RONBO_RB070D30 is not set +# CONFIG_DRM_PANEL_SAMSUNG_ATNA33XC20 is not set +# CONFIG_DRM_PANEL_SAMSUNG_DB7430 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D16D0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D27A1 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D7AA0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E3HA2 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E63J0X03 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E63M0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E88A0_AMS452EF01 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E8AA0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_SOFEF00 is not set +# CONFIG_DRM_PANEL_SEIKO_43WVF1G is not set +# CONFIG_DRM_PANEL_SHARP_LQ101R1SX01 is not set +# CONFIG_DRM_PANEL_SHARP_LS037V7DW01 is not set +# CONFIG_DRM_PANEL_SHARP_LS043T1LE01 is not set +# CONFIG_DRM_PANEL_SHARP_LS060T1SX01 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7701 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7703 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7789V is not set +# CONFIG_DRM_PANEL_SONY_ACX565AKM is not set +# CONFIG_DRM_PANEL_SONY_TD4353_JDI is not set +# CONFIG_DRM_PANEL_SONY_TULIP_TRULY_NT35521 is not set +# CONFIG_DRM_PANEL_STARTEK_KD070FHFID015 is not set +# CONFIG_DRM_PANEL_TDO_TL070WSH30 is not set +# CONFIG_DRM_PANEL_TPO_TD028TTEC1 is not set +# CONFIG_DRM_PANEL_TPO_TD043MTEA1 is not set +# CONFIG_DRM_PANEL_TPO_TPG110 is not set +# CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA is not set +# CONFIG_DRM_PANEL_VISIONOX_RM69299 is not set +# CONFIG_DRM_PANEL_VISIONOX_VTDR6130 is not set +# CONFIG_DRM_PANEL_VISIONOX_R66451 is not set +# CONFIG_DRM_PANEL_WIDECHIPS_WS2401 is not set +# CONFIG_DRM_PANEL_XINPENG_XPP055C272 is not set +# end of Display Panels + +CONFIG_DRM_BRIDGE=y +CONFIG_DRM_PANEL_BRIDGE=y + +# +# Display Interface Bridges +# +# CONFIG_DRM_CHIPONE_ICN6211 is not set +# CONFIG_DRM_CHRONTEL_CH7033 is not set +# CONFIG_DRM_CROS_EC_ANX7688 is not set +CONFIG_DRM_DISPLAY_CONNECTOR=m +# CONFIG_DRM_ITE_IT6505 is not set +# CONFIG_DRM_LONTIUM_LT8912B is not set +# CONFIG_DRM_LONTIUM_LT9211 is not set +# CONFIG_DRM_LONTIUM_LT9611 is not set +# CONFIG_DRM_LONTIUM_LT9611UXC is not set +# CONFIG_DRM_ITE_IT66121 is not set +# CONFIG_DRM_LVDS_CODEC is not set +# CONFIG_DRM_MEGACHIPS_STDPXXXX_GE_B850V3_FW is not set +# CONFIG_DRM_NWL_MIPI_DSI is not set +# CONFIG_DRM_NXP_PTN3460 is not set +# CONFIG_DRM_PARADE_PS8622 is not set +# CONFIG_DRM_PARADE_PS8640 is not set +# CONFIG_DRM_SAMSUNG_DSIM is not set +# CONFIG_DRM_SIL_SII8620 is not set +# CONFIG_DRM_SII902X is not set +# CONFIG_DRM_SII9234 is not set +# CONFIG_DRM_SIMPLE_BRIDGE is not set +# CONFIG_DRM_THINE_THC63LVD1024 is not set +# CONFIG_DRM_TOSHIBA_TC358762 is not set +# CONFIG_DRM_TOSHIBA_TC358764 is not set +# CONFIG_DRM_TOSHIBA_TC358767 is not set +# CONFIG_DRM_TOSHIBA_TC358768 is not set +# CONFIG_DRM_TOSHIBA_TC358775 is not set +# CONFIG_DRM_TI_DLPC3433 is not set +# CONFIG_DRM_TI_TFP410 is not set +# CONFIG_DRM_TI_SN65DSI83 is not set +# CONFIG_DRM_TI_SN65DSI86 is not set +# CONFIG_DRM_TI_TPD12S015 is not set +# CONFIG_DRM_ANALOGIX_ANX6345 is not set +# CONFIG_DRM_ANALOGIX_ANX78XX is not set +CONFIG_DRM_ANALOGIX_DP=m +# CONFIG_DRM_ANALOGIX_ANX7625 is not set +CONFIG_DRM_I2C_ADV7511=m +CONFIG_DRM_I2C_ADV7511_AUDIO=y +CONFIG_DRM_I2C_ADV7511_CEC=y +# CONFIG_DRM_CDNS_DSI is not set +# CONFIG_DRM_CDNS_MHDP8546 is not set +CONFIG_DRM_DW_HDMI=m +# CONFIG_DRM_DW_HDMI_AHB_AUDIO is not set +CONFIG_DRM_DW_HDMI_I2S_AUDIO=m +# CONFIG_DRM_DW_HDMI_GP_AUDIO is not set +# CONFIG_DRM_DW_HDMI_CEC is not set +CONFIG_DRM_DW_MIPI_DSI=m +# end of Display Interface Bridges + +# CONFIG_DRM_LOONGSON is not set +# CONFIG_DRM_ETNAVIV is not set +CONFIG_DRM_HISI_HIBMC=m +CONFIG_DRM_HISI_KIRIN=m +# CONFIG_DRM_LOGICVC is not set +CONFIG_DRM_MESON=m +CONFIG_DRM_MESON_DW_HDMI=m +CONFIG_DRM_MESON_DW_MIPI_DSI=m +# CONFIG_DRM_ARCPGU is not set +CONFIG_DRM_BOCHS=m +CONFIG_DRM_CIRRUS_QEMU=m +# CONFIG_DRM_GM12U320 is not set +# CONFIG_DRM_PANEL_MIPI_DBI is not set +# CONFIG_DRM_SIMPLEDRM is not set +# CONFIG_TINYDRM_HX8357D is not set +# CONFIG_TINYDRM_ILI9163 is not set +# CONFIG_TINYDRM_ILI9225 is not set +# CONFIG_TINYDRM_ILI9341 is not set +# CONFIG_TINYDRM_ILI9486 is not set +# CONFIG_TINYDRM_MI0283QT is not set +# CONFIG_TINYDRM_REPAPER is not set +# CONFIG_TINYDRM_ST7586 is not set +# CONFIG_TINYDRM_ST7735R is not set +# CONFIG_DRM_PL111 is not set +# CONFIG_DRM_XEN_FRONTEND is not set +CONFIG_DRM_LIMA=m +CONFIG_DRM_PANFROST=m +# CONFIG_DRM_TIDSS is not set +# CONFIG_DRM_GUD is not set +# CONFIG_DRM_SSD130X is not set +# CONFIG_DRM_HYPERV is not set +CONFIG_DRM_LEGACY=y +CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y + +# +# Frame buffer Devices +# +CONFIG_FB=y +CONFIG_FB_SVGALIB=m +# CONFIG_FB_CIRRUS is not set +# CONFIG_FB_PM2 is not set +CONFIG_FB_ARMCLCD=y +# CONFIG_FB_CYBER2000 is not set +# CONFIG_FB_ASILIANT is not set +# CONFIG_FB_IMSTT is not set +# CONFIG_FB_UVESA is not set +CONFIG_FB_EFI=y +# CONFIG_FB_OPENCORES is not set +# CONFIG_FB_S1D13XXX is not set +# CONFIG_FB_NVIDIA is not set +# CONFIG_FB_RIVA is not set +# CONFIG_FB_I740 is not set +# CONFIG_FB_MATROX is not set +# CONFIG_FB_RADEON is not set +# CONFIG_FB_ATY128 is not set +# CONFIG_FB_ATY is not set +CONFIG_FB_S3=m +CONFIG_FB_S3_DDC=y +# CONFIG_FB_SAVAGE is not set +# CONFIG_FB_SIS is not set +# CONFIG_FB_NEOMAGIC is not set +# CONFIG_FB_KYRO is not set +CONFIG_FB_3DFX=m +# CONFIG_FB_3DFX_ACCEL is not set +CONFIG_FB_3DFX_I2C=y +# CONFIG_FB_VOODOO1 is not set +CONFIG_FB_VT8623=m +# CONFIG_FB_TRIDENT is not set +CONFIG_FB_ARK=m +CONFIG_FB_PM3=m +# CONFIG_FB_CARMINE is not set +CONFIG_FB_SMSCUFX=m +CONFIG_FB_UDL=m +# CONFIG_FB_IBM_GXT4500 is not set +# CONFIG_FB_XILINX is not set +# CONFIG_FB_VIRTUAL is not set +CONFIG_XEN_FBDEV_FRONTEND=y +# CONFIG_FB_METRONOME is not set +CONFIG_FB_MB862XX=m +CONFIG_FB_MB862XX_PCI_GDC=y +CONFIG_FB_MB862XX_I2C=y +CONFIG_FB_HYPERV=m +CONFIG_FB_SIMPLE=y +# CONFIG_FB_SSD1307 is not set +# CONFIG_FB_SM712 is not set +CONFIG_FB_CORE=y +CONFIG_FB_NOTIFY=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_DEVICE=y +CONFIG_FB_DDC=m +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_SYS_FILLRECT=y +CONFIG_FB_SYS_COPYAREA=y +CONFIG_FB_SYS_IMAGEBLIT=y +# CONFIG_FB_FOREIGN_ENDIAN is not set +CONFIG_FB_SYS_FOPS=y +CONFIG_FB_DEFERRED_IO=y +CONFIG_FB_DMAMEM_HELPERS=y +CONFIG_FB_IOMEM_HELPERS=y +CONFIG_FB_SYSMEM_HELPERS=y +CONFIG_FB_SYSMEM_HELPERS_DEFERRED=y +CONFIG_FB_MODE_HELPERS=y +CONFIG_FB_TILEBLITTING=y +# end of Frame buffer Devices + +# +# Backlight & LCD device support +# +# CONFIG_LCD_CLASS_DEVICE is not set +CONFIG_BACKLIGHT_CLASS_DEVICE=y +# CONFIG_BACKLIGHT_KTD253 is not set +# CONFIG_BACKLIGHT_KTZ8866 is not set +CONFIG_BACKLIGHT_PWM=m +# CONFIG_BACKLIGHT_QCOM_WLED is not set +# CONFIG_BACKLIGHT_ADP8860 is not set +# CONFIG_BACKLIGHT_ADP8870 is not set +# CONFIG_BACKLIGHT_LM3630A is not set +# CONFIG_BACKLIGHT_LM3639 is not set +CONFIG_BACKLIGHT_LP855X=m +# CONFIG_BACKLIGHT_GPIO is not set +# CONFIG_BACKLIGHT_LV5207LP is not set +# CONFIG_BACKLIGHT_BD6107 is not set +# CONFIG_BACKLIGHT_ARCXCNN is not set +# CONFIG_BACKLIGHT_LED is not set +# end of Backlight & LCD device support + +CONFIG_VGASTATE=m +CONFIG_VIDEOMODE_HELPERS=y +CONFIG_HDMI=y + +# +# Console display driver support +# +CONFIG_DUMMY_CONSOLE=y +CONFIG_DUMMY_CONSOLE_COLUMNS=80 +CONFIG_DUMMY_CONSOLE_ROWS=25 +CONFIG_FRAMEBUFFER_CONSOLE=y +# CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION is not set +CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y +CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y +# CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER is not set +# end of Console display driver support + +# CONFIG_LOGO is not set +# end of Graphics support + +# CONFIG_DRM_ACCEL is not set +CONFIG_SOUND=m +CONFIG_SOUND_OSS_CORE=y +# CONFIG_SOUND_OSS_CORE_PRECLAIM is not set +CONFIG_SND=m +CONFIG_SND_TIMER=m +CONFIG_SND_PCM=m +CONFIG_SND_PCM_ELD=y +CONFIG_SND_PCM_IEC958=y +CONFIG_SND_DMAENGINE_PCM=m +CONFIG_SND_HWDEP=m +CONFIG_SND_SEQ_DEVICE=m +CONFIG_SND_RAWMIDI=m +CONFIG_SND_JACK=y +CONFIG_SND_JACK_INPUT_DEV=y +CONFIG_SND_OSSEMUL=y +CONFIG_SND_MIXER_OSS=m +CONFIG_SND_PCM_OSS=m +CONFIG_SND_PCM_OSS_PLUGINS=y +CONFIG_SND_PCM_TIMER=y +CONFIG_SND_HRTIMER=m +CONFIG_SND_DYNAMIC_MINORS=y +CONFIG_SND_MAX_CARDS=32 +CONFIG_SND_SUPPORT_OLD_API=y +CONFIG_SND_PROC_FS=y +CONFIG_SND_VERBOSE_PROCFS=y +# CONFIG_SND_VERBOSE_PRINTK is not set +CONFIG_SND_CTL_FAST_LOOKUP=y +# CONFIG_SND_DEBUG is not set +# CONFIG_SND_CTL_INPUT_VALIDATION is not set +CONFIG_SND_VMASTER=y +CONFIG_SND_CTL_LED=m +CONFIG_SND_SEQUENCER=m +CONFIG_SND_SEQ_DUMMY=m +# CONFIG_SND_SEQUENCER_OSS is not set +CONFIG_SND_SEQ_HRTIMER_DEFAULT=y +CONFIG_SND_SEQ_MIDI_EVENT=m +CONFIG_SND_SEQ_MIDI=m +CONFIG_SND_SEQ_MIDI_EMUL=m +# CONFIG_SND_SEQ_UMP is not set +CONFIG_SND_MPU401_UART=m +CONFIG_SND_OPL3_LIB=m +CONFIG_SND_OPL3_LIB_SEQ=m +CONFIG_SND_AC97_CODEC=m +CONFIG_SND_DRIVERS=y +# CONFIG_SND_DUMMY is not set +CONFIG_SND_ALOOP=m +# CONFIG_SND_PCMTEST is not set +# CONFIG_SND_VIRMIDI is not set +# CONFIG_SND_MTPAV is not set +CONFIG_SND_MTS64=m +# CONFIG_SND_SERIAL_U16550 is not set +# CONFIG_SND_SERIAL_GENERIC is not set +# CONFIG_SND_MPU401 is not set +CONFIG_SND_PORTMAN2X4=m +CONFIG_SND_AC97_POWER_SAVE=y +CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0 +CONFIG_SND_PCI=y +CONFIG_SND_AD1889=m +# CONFIG_SND_ALS300 is not set +# CONFIG_SND_ALI5451 is not set +# CONFIG_SND_ATIIXP is not set +# CONFIG_SND_ATIIXP_MODEM is not set +# CONFIG_SND_AU8810 is not set +# CONFIG_SND_AU8820 is not set +# CONFIG_SND_AU8830 is not set +# CONFIG_SND_AW2 is not set +# CONFIG_SND_AZT3328 is not set +# CONFIG_SND_BT87X is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set +CONFIG_SND_OXYGEN_LIB=m +CONFIG_SND_OXYGEN=m +# CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set +CONFIG_SND_CTXFI=m +CONFIG_SND_DARLA20=m +CONFIG_SND_GINA20=m +CONFIG_SND_LAYLA20=m +CONFIG_SND_DARLA24=m +CONFIG_SND_GINA24=m +CONFIG_SND_LAYLA24=m +CONFIG_SND_MONA=m +CONFIG_SND_MIA=m +CONFIG_SND_ECHO3G=m +CONFIG_SND_INDIGO=m +CONFIG_SND_INDIGOIO=m +CONFIG_SND_INDIGODJ=m +CONFIG_SND_INDIGOIOX=m +CONFIG_SND_INDIGODJX=m +# CONFIG_SND_EMU10K1 is not set +# CONFIG_SND_EMU10K1X is not set +# CONFIG_SND_ENS1370 is not set +# CONFIG_SND_ENS1371 is not set +# CONFIG_SND_ES1938 is not set +# CONFIG_SND_ES1968 is not set +# CONFIG_SND_FM801 is not set +# CONFIG_SND_HDSP is not set +CONFIG_SND_HDSPM=m +# CONFIG_SND_ICE1712 is not set +# CONFIG_SND_ICE1724 is not set +# CONFIG_SND_INTEL8X0 is not set +# CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +CONFIG_SND_LOLA=m +CONFIG_SND_LX6464ES=m +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +CONFIG_SND_PCXHR=m +CONFIG_SND_RIPTIDE=m +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set +# CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is not set +# CONFIG_SND_VIA82XX is not set +# CONFIG_SND_VIA82XX_MODEM is not set +CONFIG_SND_VIRTUOSO=m +# CONFIG_SND_VX222 is not set +# CONFIG_SND_YMFPCI is not set + +# +# HD-Audio +# +CONFIG_SND_HDA=m +CONFIG_SND_HDA_GENERIC_LEDS=y +CONFIG_SND_HDA_INTEL=m +CONFIG_SND_HDA_TEGRA=m +CONFIG_SND_HDA_HWDEP=y +CONFIG_SND_HDA_RECONFIG=y +CONFIG_SND_HDA_INPUT_BEEP=y +CONFIG_SND_HDA_INPUT_BEEP_MODE=1 +CONFIG_SND_HDA_PATCH_LOADER=y +# CONFIG_SND_HDA_SCODEC_CS35L41_I2C is not set +# CONFIG_SND_HDA_SCODEC_CS35L41_SPI is not set +# CONFIG_SND_HDA_SCODEC_CS35L56_I2C is not set +# CONFIG_SND_HDA_SCODEC_CS35L56_SPI is not set +# CONFIG_SND_HDA_SCODEC_TAS2781_I2C is not set +CONFIG_SND_HDA_CODEC_REALTEK=m +CONFIG_SND_HDA_CODEC_ANALOG=m +CONFIG_SND_HDA_CODEC_SIGMATEL=m +CONFIG_SND_HDA_CODEC_VIA=m +CONFIG_SND_HDA_CODEC_HDMI=m +CONFIG_SND_HDA_CODEC_CIRRUS=m +# CONFIG_SND_HDA_CODEC_CS8409 is not set +CONFIG_SND_HDA_CODEC_CONEXANT=m +CONFIG_SND_HDA_CODEC_CA0110=m +CONFIG_SND_HDA_CODEC_CA0132=m +CONFIG_SND_HDA_CODEC_CA0132_DSP=y +CONFIG_SND_HDA_CODEC_CMEDIA=m +CONFIG_SND_HDA_CODEC_SI3054=m +CONFIG_SND_HDA_GENERIC=m +CONFIG_SND_HDA_POWER_SAVE_DEFAULT=1 +# CONFIG_SND_HDA_INTEL_HDMI_SILENT_STREAM is not set +# CONFIG_SND_HDA_CTL_DEV_ID is not set +# end of HD-Audio + +CONFIG_SND_HDA_CORE=m +CONFIG_SND_HDA_DSP_LOADER=y +CONFIG_SND_HDA_ALIGNED_MMIO=y +CONFIG_SND_HDA_COMPONENT=y +CONFIG_SND_HDA_PREALLOC_SIZE=2048 +CONFIG_SND_INTEL_NHLT=y +CONFIG_SND_INTEL_DSP_CONFIG=m +CONFIG_SND_INTEL_SOUNDWIRE_ACPI=m +CONFIG_SND_SPI=y +CONFIG_SND_USB=y +CONFIG_SND_USB_AUDIO=m +# CONFIG_SND_USB_AUDIO_MIDI_V2 is not set +CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER=y +CONFIG_SND_USB_UA101=m +CONFIG_SND_USB_CAIAQ=m +CONFIG_SND_USB_CAIAQ_INPUT=y +CONFIG_SND_USB_6FIRE=m +CONFIG_SND_USB_HIFACE=m +CONFIG_SND_BCD2000=m +CONFIG_SND_USB_LINE6=m +CONFIG_SND_USB_POD=m +CONFIG_SND_USB_PODHD=m +CONFIG_SND_USB_TONEPORT=m +CONFIG_SND_USB_VARIAX=m +CONFIG_SND_FIREWIRE=y +CONFIG_SND_FIREWIRE_LIB=m +CONFIG_SND_DICE=m +CONFIG_SND_OXFW=m +CONFIG_SND_ISIGHT=m +CONFIG_SND_FIREWORKS=m +CONFIG_SND_BEBOB=m +CONFIG_SND_FIREWIRE_DIGI00X=m +CONFIG_SND_FIREWIRE_TASCAM=m +CONFIG_SND_FIREWIRE_MOTU=m +CONFIG_SND_FIREFACE=m +CONFIG_SND_SOC=m +CONFIG_SND_SOC_GENERIC_DMAENGINE_PCM=y +# CONFIG_SND_SOC_ADI is not set +# CONFIG_SND_SOC_AMD_ACP is not set +# CONFIG_SND_AMD_ACP_CONFIG is not set +# CONFIG_SND_ATMEL_SOC is not set +# CONFIG_SND_BCM63XX_I2S_WHISTLER is not set +# CONFIG_SND_DESIGNWARE_I2S is not set + +# +# SoC Audio for Freescale CPUs +# + +# +# Common SoC Audio options for Freescale CPUs: +# +# CONFIG_SND_SOC_FSL_ASRC is not set +# CONFIG_SND_SOC_FSL_SAI is not set +# CONFIG_SND_SOC_FSL_AUDMIX is not set +# CONFIG_SND_SOC_FSL_SSI is not set +# CONFIG_SND_SOC_FSL_SPDIF is not set +# CONFIG_SND_SOC_FSL_ESAI is not set +# CONFIG_SND_SOC_FSL_MICFIL is not set +# CONFIG_SND_SOC_FSL_XCVR is not set +# CONFIG_SND_SOC_FSL_RPMSG is not set +# CONFIG_SND_SOC_IMX_AUDMUX is not set +# end of SoC Audio for Freescale CPUs + +# CONFIG_SND_SOC_CHV3_I2S is not set +CONFIG_SND_I2S_HI6210_I2S=m +# CONFIG_SND_KIRKWOOD_SOC is not set +# CONFIG_SND_SOC_IMG is not set +# CONFIG_SND_SOC_MTK_BTCVSD is not set + +# +# ASoC support for Amlogic platforms +# +# CONFIG_SND_MESON_AIU is not set +# CONFIG_SND_MESON_AXG_FRDDR is not set +# CONFIG_SND_MESON_AXG_TODDR is not set +# CONFIG_SND_MESON_AXG_TDMIN is not set +# CONFIG_SND_MESON_AXG_TDMOUT is not set +# CONFIG_SND_MESON_AXG_SOUND_CARD is not set +# CONFIG_SND_MESON_AXG_SPDIFOUT is not set +# CONFIG_SND_MESON_AXG_SPDIFIN is not set +# CONFIG_SND_MESON_AXG_PDM is not set +# CONFIG_SND_MESON_GX_SOUND_CARD is not set +# CONFIG_SND_MESON_G12A_TOACODEC is not set +# CONFIG_SND_MESON_G12A_TOHDMITX is not set +# CONFIG_SND_SOC_MESON_T9015 is not set +# end of ASoC support for Amlogic platforms + +CONFIG_SND_SOC_QCOM=m +CONFIG_SND_SOC_LPASS_CPU=m +CONFIG_SND_SOC_LPASS_PLATFORM=m +CONFIG_SND_SOC_LPASS_APQ8016=m +# CONFIG_SND_SOC_STORM is not set +CONFIG_SND_SOC_APQ8016_SBC=m +CONFIG_SND_SOC_QCOM_COMMON=m +# CONFIG_SND_SOC_SC7180 is not set +CONFIG_SND_SOC_ROCKCHIP=m +CONFIG_SND_SOC_ROCKCHIP_I2S=m +# CONFIG_SND_SOC_ROCKCHIP_I2S_TDM is not set +# CONFIG_SND_SOC_ROCKCHIP_PDM is not set +CONFIG_SND_SOC_ROCKCHIP_SPDIF=m +# CONFIG_SND_SOC_ROCKCHIP_MAX98090 is not set +CONFIG_SND_SOC_ROCKCHIP_RT5645=m +# CONFIG_SND_SOC_RK3288_HDMI_ANALOG is not set +CONFIG_SND_SOC_RK3399_GRU_SOUND=m +# CONFIG_SND_SOC_SOF_TOPLEVEL is not set + +# +# STMicroelectronics STM32 SOC audio support +# +# end of STMicroelectronics STM32 SOC audio support + +# +# Allwinner SoC Audio support +# +# CONFIG_SND_SUN4I_CODEC is not set +CONFIG_SND_SUN8I_CODEC=m +# CONFIG_SND_SUN8I_CODEC_ANALOG is not set +CONFIG_SND_SUN50I_CODEC_ANALOG=m +CONFIG_SND_SUN4I_I2S=m +# CONFIG_SND_SUN4I_SPDIF is not set +# CONFIG_SND_SUN50I_DMIC is not set +CONFIG_SND_SUN8I_ADDA_PR_REGMAP=m +# end of Allwinner SoC Audio support + +CONFIG_SND_SOC_TEGRA=m +# CONFIG_SND_SOC_TEGRA20_AC97 is not set +# CONFIG_SND_SOC_TEGRA20_DAS is not set +# CONFIG_SND_SOC_TEGRA20_I2S is not set +CONFIG_SND_SOC_TEGRA20_SPDIF=m +# CONFIG_SND_SOC_TEGRA30_AHUB is not set +# CONFIG_SND_SOC_TEGRA30_I2S is not set +# CONFIG_SND_SOC_TEGRA210_AHUB is not set +# CONFIG_SND_SOC_TEGRA210_DMIC is not set +# CONFIG_SND_SOC_TEGRA210_I2S is not set +# CONFIG_SND_SOC_TEGRA210_OPE is not set +# CONFIG_SND_SOC_TEGRA186_ASRC is not set +# CONFIG_SND_SOC_TEGRA186_DSPK is not set +# CONFIG_SND_SOC_TEGRA210_ADMAIF is not set +# CONFIG_SND_SOC_TEGRA210_MVC is not set +# CONFIG_SND_SOC_TEGRA210_SFC is not set +# CONFIG_SND_SOC_TEGRA210_AMX is not set +# CONFIG_SND_SOC_TEGRA210_ADX is not set +# CONFIG_SND_SOC_TEGRA210_MIXER is not set +# CONFIG_SND_SOC_TEGRA_AUDIO_GRAPH_CARD is not set +CONFIG_SND_SOC_TEGRA_MACHINE_DRV=m +# CONFIG_SND_SOC_TEGRA_RT5631 is not set +CONFIG_SND_SOC_TEGRA_RT5640=m +CONFIG_SND_SOC_TEGRA_WM8753=m +CONFIG_SND_SOC_TEGRA_WM8903=m +# CONFIG_SND_SOC_TEGRA_WM9712 is not set +CONFIG_SND_SOC_TEGRA_TRIMSLICE=m +CONFIG_SND_SOC_TEGRA_ALC5632=m +CONFIG_SND_SOC_TEGRA_MAX98090=m +# CONFIG_SND_SOC_TEGRA_MAX98088 is not set +CONFIG_SND_SOC_TEGRA_RT5677=m +# CONFIG_SND_SOC_TEGRA_SGTL5000 is not set +# CONFIG_SND_SOC_XILINX_I2S is not set +# CONFIG_SND_SOC_XILINX_AUDIO_FORMATTER is not set +# CONFIG_SND_SOC_XILINX_SPDIF is not set +# CONFIG_SND_SOC_XTFPGA_I2S is not set +CONFIG_SND_SOC_I2C_AND_SPI=m + +# +# CODEC drivers +# +# CONFIG_SND_SOC_AC97_CODEC is not set +# CONFIG_SND_SOC_ADAU1372_I2C is not set +# CONFIG_SND_SOC_ADAU1372_SPI is not set +# CONFIG_SND_SOC_ADAU1701 is not set +# CONFIG_SND_SOC_ADAU1761_I2C is not set +# CONFIG_SND_SOC_ADAU1761_SPI is not set +# CONFIG_SND_SOC_ADAU7002 is not set +# CONFIG_SND_SOC_ADAU7118_HW is not set +# CONFIG_SND_SOC_ADAU7118_I2C is not set +# CONFIG_SND_SOC_AK4104 is not set +# CONFIG_SND_SOC_AK4118 is not set +# CONFIG_SND_SOC_AK4375 is not set +# CONFIG_SND_SOC_AK4458 is not set +# CONFIG_SND_SOC_AK4554 is not set +# CONFIG_SND_SOC_AK4613 is not set +# CONFIG_SND_SOC_AK4642 is not set +# CONFIG_SND_SOC_AK5386 is not set +# CONFIG_SND_SOC_AK5558 is not set +# CONFIG_SND_SOC_ALC5623 is not set +CONFIG_SND_SOC_ALC5632=m +# CONFIG_SND_SOC_AUDIO_IIO_AUX is not set +# CONFIG_SND_SOC_AW8738 is not set +# CONFIG_SND_SOC_AW88395 is not set +# CONFIG_SND_SOC_AW88261 is not set +# CONFIG_SND_SOC_BD28623 is not set +# CONFIG_SND_SOC_BT_SCO is not set +# CONFIG_SND_SOC_CHV3_CODEC is not set +# CONFIG_SND_SOC_CROS_EC_CODEC is not set +# CONFIG_SND_SOC_CS35L32 is not set +# CONFIG_SND_SOC_CS35L33 is not set +# CONFIG_SND_SOC_CS35L34 is not set +# CONFIG_SND_SOC_CS35L35 is not set +# CONFIG_SND_SOC_CS35L36 is not set +# CONFIG_SND_SOC_CS35L41_SPI is not set +# CONFIG_SND_SOC_CS35L41_I2C is not set +# CONFIG_SND_SOC_CS35L45_SPI is not set +# CONFIG_SND_SOC_CS35L45_I2C is not set +# CONFIG_SND_SOC_CS35L56_I2C is not set +# CONFIG_SND_SOC_CS35L56_SPI is not set +# CONFIG_SND_SOC_CS42L42 is not set +# CONFIG_SND_SOC_CS42L51_I2C is not set +# CONFIG_SND_SOC_CS42L52 is not set +# CONFIG_SND_SOC_CS42L56 is not set +# CONFIG_SND_SOC_CS42L73 is not set +# CONFIG_SND_SOC_CS42L83 is not set +# CONFIG_SND_SOC_CS4234 is not set +# CONFIG_SND_SOC_CS4265 is not set +# CONFIG_SND_SOC_CS4270 is not set +# CONFIG_SND_SOC_CS4271_I2C is not set +# CONFIG_SND_SOC_CS4271_SPI is not set +# CONFIG_SND_SOC_CS42XX8_I2C is not set +# CONFIG_SND_SOC_CS43130 is not set +# CONFIG_SND_SOC_CS4341 is not set +# CONFIG_SND_SOC_CS4349 is not set +# CONFIG_SND_SOC_CS53L30 is not set +# CONFIG_SND_SOC_CX2072X is not set +# CONFIG_SND_SOC_DA7213 is not set +CONFIG_SND_SOC_DA7219=m +CONFIG_SND_SOC_DMIC=m +CONFIG_SND_SOC_HDMI_CODEC=m +# CONFIG_SND_SOC_ES7134 is not set +# CONFIG_SND_SOC_ES7241 is not set +# CONFIG_SND_SOC_ES8316 is not set +# CONFIG_SND_SOC_ES8326 is not set +# CONFIG_SND_SOC_ES8328_I2C is not set +# CONFIG_SND_SOC_ES8328_SPI is not set +# CONFIG_SND_SOC_GTM601 is not set +# CONFIG_SND_SOC_HDA is not set +# CONFIG_SND_SOC_ICS43432 is not set +# CONFIG_SND_SOC_IDT821034 is not set +# CONFIG_SND_SOC_INNO_RK3036 is not set +# CONFIG_SND_SOC_MAX98088 is not set +CONFIG_SND_SOC_MAX98090=m +CONFIG_SND_SOC_MAX98357A=m +# CONFIG_SND_SOC_MAX98504 is not set +# CONFIG_SND_SOC_MAX9867 is not set +# CONFIG_SND_SOC_MAX98927 is not set +# CONFIG_SND_SOC_MAX98520 is not set +# CONFIG_SND_SOC_MAX98373_I2C is not set +# CONFIG_SND_SOC_MAX98388 is not set +# CONFIG_SND_SOC_MAX98390 is not set +# CONFIG_SND_SOC_MAX98396 is not set +# CONFIG_SND_SOC_MAX9860 is not set +# CONFIG_SND_SOC_MSM8916_WCD_ANALOG is not set +# CONFIG_SND_SOC_MSM8916_WCD_DIGITAL is not set +# CONFIG_SND_SOC_PCM1681 is not set +# CONFIG_SND_SOC_PCM1789_I2C is not set +# CONFIG_SND_SOC_PCM179X_I2C is not set +# CONFIG_SND_SOC_PCM179X_SPI is not set +# CONFIG_SND_SOC_PCM186X_I2C is not set +# CONFIG_SND_SOC_PCM186X_SPI is not set +# CONFIG_SND_SOC_PCM3060_I2C is not set +# CONFIG_SND_SOC_PCM3060_SPI is not set +# CONFIG_SND_SOC_PCM3168A_I2C is not set +# CONFIG_SND_SOC_PCM3168A_SPI is not set +# CONFIG_SND_SOC_PCM5102A is not set +# CONFIG_SND_SOC_PCM512x_I2C is not set +# CONFIG_SND_SOC_PCM512x_SPI is not set +# CONFIG_SND_SOC_PEB2466 is not set +# CONFIG_SND_SOC_RK3328 is not set +CONFIG_SND_SOC_RL6231=m +CONFIG_SND_SOC_RT5514=m +CONFIG_SND_SOC_RT5514_SPI=m +# CONFIG_SND_SOC_RT5616 is not set +# CONFIG_SND_SOC_RT5631 is not set +CONFIG_SND_SOC_RT5640=m +CONFIG_SND_SOC_RT5645=m +# CONFIG_SND_SOC_RT5659 is not set +CONFIG_SND_SOC_RT5677=m +CONFIG_SND_SOC_RT5677_SPI=m +# CONFIG_SND_SOC_RT9120 is not set +# CONFIG_SND_SOC_SGTL5000 is not set +CONFIG_SND_SOC_SIMPLE_AMPLIFIER=m +# CONFIG_SND_SOC_SIMPLE_MUX is not set +# CONFIG_SND_SOC_SMA1303 is not set +# CONFIG_SND_SOC_SPDIF is not set +# CONFIG_SND_SOC_SRC4XXX_I2C is not set +# CONFIG_SND_SOC_SSM2305 is not set +# CONFIG_SND_SOC_SSM2518 is not set +# CONFIG_SND_SOC_SSM2602_SPI is not set +# CONFIG_SND_SOC_SSM2602_I2C is not set +# CONFIG_SND_SOC_SSM3515 is not set +# CONFIG_SND_SOC_SSM4567 is not set +# CONFIG_SND_SOC_STA32X is not set +# CONFIG_SND_SOC_STA350 is not set +# CONFIG_SND_SOC_STI_SAS is not set +# CONFIG_SND_SOC_TAS2552 is not set +# CONFIG_SND_SOC_TAS2562 is not set +# CONFIG_SND_SOC_TAS2764 is not set +# CONFIG_SND_SOC_TAS2770 is not set +# CONFIG_SND_SOC_TAS2780 is not set +# CONFIG_SND_SOC_TAS2781_I2C is not set +# CONFIG_SND_SOC_TAS5086 is not set +# CONFIG_SND_SOC_TAS571X is not set +# CONFIG_SND_SOC_TAS5720 is not set +# CONFIG_SND_SOC_TAS5805M is not set +# CONFIG_SND_SOC_TAS6424 is not set +# CONFIG_SND_SOC_TDA7419 is not set +# CONFIG_SND_SOC_TFA9879 is not set +# CONFIG_SND_SOC_TFA989X is not set +# CONFIG_SND_SOC_TLV320ADC3XXX is not set +CONFIG_SND_SOC_TLV320AIC23=m +CONFIG_SND_SOC_TLV320AIC23_I2C=m +# CONFIG_SND_SOC_TLV320AIC23_SPI is not set +# CONFIG_SND_SOC_TLV320AIC31XX is not set +# CONFIG_SND_SOC_TLV320AIC32X4_I2C is not set +# CONFIG_SND_SOC_TLV320AIC32X4_SPI is not set +# CONFIG_SND_SOC_TLV320AIC3X_I2C is not set +# CONFIG_SND_SOC_TLV320AIC3X_SPI is not set +# CONFIG_SND_SOC_TLV320ADCX140 is not set +# CONFIG_SND_SOC_TS3A227E is not set +# CONFIG_SND_SOC_TSCS42XX is not set +# CONFIG_SND_SOC_TSCS454 is not set +# CONFIG_SND_SOC_UDA1334 is not set +# CONFIG_SND_SOC_WM8510 is not set +# CONFIG_SND_SOC_WM8523 is not set +# CONFIG_SND_SOC_WM8524 is not set +# CONFIG_SND_SOC_WM8580 is not set +# CONFIG_SND_SOC_WM8711 is not set +# CONFIG_SND_SOC_WM8728 is not set +# CONFIG_SND_SOC_WM8731_I2C is not set +# CONFIG_SND_SOC_WM8731_SPI is not set +# CONFIG_SND_SOC_WM8737 is not set +# CONFIG_SND_SOC_WM8741 is not set +# CONFIG_SND_SOC_WM8750 is not set +CONFIG_SND_SOC_WM8753=m +# CONFIG_SND_SOC_WM8770 is not set +# CONFIG_SND_SOC_WM8776 is not set +# CONFIG_SND_SOC_WM8782 is not set +# CONFIG_SND_SOC_WM8804_I2C is not set +# CONFIG_SND_SOC_WM8804_SPI is not set +CONFIG_SND_SOC_WM8903=m +# CONFIG_SND_SOC_WM8904 is not set +# CONFIG_SND_SOC_WM8940 is not set +# CONFIG_SND_SOC_WM8960 is not set +# CONFIG_SND_SOC_WM8961 is not set +# CONFIG_SND_SOC_WM8962 is not set +# CONFIG_SND_SOC_WM8974 is not set +# CONFIG_SND_SOC_WM8978 is not set +# CONFIG_SND_SOC_WM8985 is not set +# CONFIG_SND_SOC_ZL38060 is not set +# CONFIG_SND_SOC_MAX9759 is not set +# CONFIG_SND_SOC_MT6351 is not set +# CONFIG_SND_SOC_MT6358 is not set +# CONFIG_SND_SOC_MT6660 is not set +# CONFIG_SND_SOC_NAU8315 is not set +# CONFIG_SND_SOC_NAU8540 is not set +# CONFIG_SND_SOC_NAU8810 is not set +# CONFIG_SND_SOC_NAU8821 is not set +# CONFIG_SND_SOC_NAU8822 is not set +# CONFIG_SND_SOC_NAU8824 is not set +# CONFIG_SND_SOC_TPA6130A2 is not set +# CONFIG_SND_SOC_LPASS_WSA_MACRO is not set +# CONFIG_SND_SOC_LPASS_VA_MACRO is not set +# CONFIG_SND_SOC_LPASS_RX_MACRO is not set +# CONFIG_SND_SOC_LPASS_TX_MACRO is not set +# end of CODEC drivers + +CONFIG_SND_SIMPLE_CARD_UTILS=m +CONFIG_SND_SIMPLE_CARD=m +CONFIG_SND_AUDIO_GRAPH_CARD=m +# CONFIG_SND_AUDIO_GRAPH_CARD2 is not set +# CONFIG_SND_TEST_COMPONENT is not set +CONFIG_SND_XEN_FRONTEND=m +# CONFIG_SND_VIRTIO is not set +CONFIG_AC97_BUS=m +CONFIG_HID_SUPPORT=y +CONFIG_HID=m +CONFIG_HID_BATTERY_STRENGTH=y +CONFIG_HIDRAW=y +CONFIG_UHID=m +CONFIG_HID_GENERIC=m + +# +# Special HID drivers +# +CONFIG_HID_A4TECH=m +CONFIG_HID_ACCUTOUCH=m +CONFIG_HID_ACRUX=m +CONFIG_HID_ACRUX_FF=y +CONFIG_HID_APPLE=m +# CONFIG_HID_APPLEIR is not set +CONFIG_HID_ASUS=m +CONFIG_HID_AUREAL=m +CONFIG_HID_BELKIN=m +CONFIG_HID_BETOP_FF=m +CONFIG_HID_BIGBEN_FF=m +CONFIG_HID_CHERRY=m +CONFIG_HID_CHICONY=m +CONFIG_HID_CORSAIR=m +CONFIG_HID_COUGAR=m +CONFIG_HID_MACALLY=m +CONFIG_HID_PRODIKEYS=m +CONFIG_HID_CMEDIA=m +CONFIG_HID_CP2112=m +# CONFIG_HID_CREATIVE_SB0540 is not set +CONFIG_HID_CYPRESS=m +CONFIG_HID_DRAGONRISE=m +CONFIG_DRAGONRISE_FF=y +CONFIG_HID_EMS_FF=m +CONFIG_HID_ELAN=m +CONFIG_HID_ELECOM=m +CONFIG_HID_ELO=m +# CONFIG_HID_EVISION is not set +CONFIG_HID_EZKEY=m +# CONFIG_HID_FT260 is not set +CONFIG_HID_GEMBIRD=m +CONFIG_HID_GFRM=m +# CONFIG_HID_GLORIOUS is not set +CONFIG_HID_HOLTEK=m +CONFIG_HOLTEK_FF=y +# CONFIG_HID_GOOGLE_HAMMER is not set +# CONFIG_HID_GOOGLE_STADIA_FF is not set +# CONFIG_HID_VIVALDI is not set +CONFIG_HID_GT683R=m +CONFIG_HID_KEYTOUCH=m +CONFIG_HID_KYE=m +CONFIG_HID_UCLOGIC=m +CONFIG_HID_WALTOP=m +CONFIG_HID_VIEWSONIC=m +# CONFIG_HID_VRC2 is not set +# CONFIG_HID_XIAOMI is not set +CONFIG_HID_GYRATION=m +CONFIG_HID_ICADE=m +CONFIG_HID_ITE=m +CONFIG_HID_JABRA=m +CONFIG_HID_TWINHAN=m +CONFIG_HID_KENSINGTON=m +CONFIG_HID_LCPOWER=m +CONFIG_HID_LED=m +CONFIG_HID_LENOVO=m +# CONFIG_HID_LETSKETCH is not set +CONFIG_HID_LOGITECH=m +CONFIG_HID_LOGITECH_DJ=m +CONFIG_HID_LOGITECH_HIDPP=m +CONFIG_LOGITECH_FF=y +CONFIG_LOGIRUMBLEPAD2_FF=y +CONFIG_LOGIG940_FF=y +CONFIG_LOGIWHEELS_FF=y +CONFIG_HID_MAGICMOUSE=m +CONFIG_HID_MALTRON=m +CONFIG_HID_MAYFLASH=m +# CONFIG_HID_MEGAWORLD_FF is not set +CONFIG_HID_REDRAGON=m +CONFIG_HID_MICROSOFT=m +CONFIG_HID_MONTEREY=m +CONFIG_HID_MULTITOUCH=m +# CONFIG_HID_NINTENDO is not set +CONFIG_HID_NTI=m +CONFIG_HID_NTRIG=m +# CONFIG_HID_NVIDIA_SHIELD is not set +CONFIG_HID_ORTEK=m +CONFIG_HID_PANTHERLORD=m +CONFIG_PANTHERLORD_FF=y +CONFIG_HID_PENMOUNT=m +CONFIG_HID_PETALYNX=m +CONFIG_HID_PICOLCD=m +CONFIG_HID_PICOLCD_FB=y +CONFIG_HID_PICOLCD_BACKLIGHT=y +CONFIG_HID_PICOLCD_LEDS=y +CONFIG_HID_PICOLCD_CIR=y +CONFIG_HID_PLANTRONICS=m +# CONFIG_HID_PXRC is not set +# CONFIG_HID_RAZER is not set +CONFIG_HID_PRIMAX=m +CONFIG_HID_RETRODE=m +CONFIG_HID_ROCCAT=m +CONFIG_HID_SAITEK=m +CONFIG_HID_SAMSUNG=m +# CONFIG_HID_SEMITEK is not set +# CONFIG_HID_SIGMAMICRO is not set +CONFIG_HID_SONY=m +CONFIG_SONY_FF=y +CONFIG_HID_SPEEDLINK=m +CONFIG_HID_STEAM=m +# CONFIG_STEAM_FF is not set +CONFIG_HID_STEELSERIES=m +CONFIG_HID_SUNPLUS=m +CONFIG_HID_RMI=m +CONFIG_HID_GREENASIA=m +CONFIG_GREENASIA_FF=y +# CONFIG_HID_HYPERV_MOUSE is not set +CONFIG_HID_SMARTJOYPLUS=m +CONFIG_SMARTJOYPLUS_FF=y +CONFIG_HID_TIVO=m +CONFIG_HID_TOPSEED=m +# CONFIG_HID_TOPRE is not set +CONFIG_HID_THINGM=m +CONFIG_HID_THRUSTMASTER=m +CONFIG_THRUSTMASTER_FF=y +CONFIG_HID_UDRAW_PS3=m +CONFIG_HID_U2FZERO=m +CONFIG_HID_WACOM=m +CONFIG_HID_WIIMOTE=m +CONFIG_HID_XINMO=m +CONFIG_HID_ZEROPLUS=m +CONFIG_ZEROPLUS_FF=y +CONFIG_HID_ZYDACRON=m +CONFIG_HID_SENSOR_HUB=m +CONFIG_HID_SENSOR_CUSTOM_SENSOR=m +CONFIG_HID_ALPS=m +# CONFIG_HID_MCP2221 is not set +# end of Special HID drivers + +# +# HID-BPF support +# +# CONFIG_HID_BPF is not set +# end of HID-BPF support + +# +# USB HID support +# +CONFIG_USB_HID=m +CONFIG_HID_PID=y +CONFIG_USB_HIDDEV=y + +# +# USB HID Boot Protocol drivers +# +# CONFIG_USB_KBD is not set +# CONFIG_USB_MOUSE is not set +# end of USB HID Boot Protocol drivers +# end of USB HID support + +CONFIG_I2C_HID=m +# CONFIG_I2C_HID_ACPI is not set +# CONFIG_I2C_HID_OF is not set +# CONFIG_I2C_HID_OF_ELAN is not set +# CONFIG_I2C_HID_OF_GOODIX is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y +CONFIG_USB_SUPPORT=y +CONFIG_USB_COMMON=m +CONFIG_USB_LED_TRIG=y +CONFIG_USB_ULPI_BUS=m +CONFIG_USB_CONN_GPIO=m +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB=m +CONFIG_USB_PCI=y +CONFIG_USB_ANNOUNCE_NEW_DEVICES=y + +# +# Miscellaneous USB options +# +CONFIG_USB_DEFAULT_PERSIST=y +# CONFIG_USB_FEW_INIT_RETRIES is not set +CONFIG_USB_DYNAMIC_MINORS=y +# CONFIG_USB_OTG is not set +# CONFIG_USB_OTG_PRODUCTLIST is not set +# CONFIG_USB_OTG_DISABLE_EXTERNAL_HUB is not set +CONFIG_USB_LEDS_TRIGGER_USBPORT=m +CONFIG_USB_AUTOSUSPEND_DELAY=2 +CONFIG_USB_MON=m + +# +# USB Host Controller Drivers +# +# CONFIG_USB_C67X00_HCD is not set +CONFIG_USB_XHCI_HCD=m +# CONFIG_USB_XHCI_DBGCAP is not set +CONFIG_USB_XHCI_PCI=m +# CONFIG_USB_XHCI_PCI_RENESAS is not set +CONFIG_USB_XHCI_PLATFORM=m +# CONFIG_USB_XHCI_HISTB is not set +# CONFIG_USB_XHCI_MVEBU is not set +CONFIG_USB_XHCI_TEGRA=m +CONFIG_USB_EHCI_HCD=m +CONFIG_USB_EHCI_ROOT_HUB_TT=y +CONFIG_USB_EHCI_TT_NEWSCHED=y +CONFIG_USB_EHCI_PCI=m +# CONFIG_USB_EHCI_FSL is not set +CONFIG_USB_EHCI_HCD_ORION=m +CONFIG_USB_EHCI_TEGRA=m +CONFIG_USB_EHCI_HCD_PLATFORM=m +# CONFIG_USB_OXU210HP_HCD is not set +# CONFIG_USB_ISP116X_HCD is not set +# CONFIG_USB_MAX3421_HCD is not set +CONFIG_USB_OHCI_HCD=m +CONFIG_USB_OHCI_HCD_PCI=m +# CONFIG_USB_OHCI_HCD_SSB is not set +CONFIG_USB_OHCI_HCD_PLATFORM=m +# CONFIG_USB_UHCI_HCD is not set +# CONFIG_USB_SL811_HCD is not set +# CONFIG_USB_R8A66597_HCD is not set +# CONFIG_USB_HCD_BCMA is not set +# CONFIG_USB_HCD_SSB is not set +# CONFIG_USB_HCD_TEST_MODE is not set +# CONFIG_USB_XEN_HCD is not set + +# +# USB Device Class drivers +# +CONFIG_USB_ACM=m +CONFIG_USB_PRINTER=m +CONFIG_USB_WDM=m +CONFIG_USB_TMC=m + +# +# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may +# + +# +# also be needed; see USB_STORAGE Help for more info +# +CONFIG_USB_STORAGE=m +# CONFIG_USB_STORAGE_DEBUG is not set +CONFIG_USB_STORAGE_REALTEK=m +CONFIG_REALTEK_AUTOPM=y +CONFIG_USB_STORAGE_DATAFAB=m +CONFIG_USB_STORAGE_FREECOM=m +CONFIG_USB_STORAGE_ISD200=m +CONFIG_USB_STORAGE_USBAT=m +CONFIG_USB_STORAGE_SDDR09=m +CONFIG_USB_STORAGE_SDDR55=m +CONFIG_USB_STORAGE_JUMPSHOT=m +CONFIG_USB_STORAGE_ALAUDA=m +CONFIG_USB_STORAGE_ONETOUCH=m +CONFIG_USB_STORAGE_KARMA=m +CONFIG_USB_STORAGE_CYPRESS_ATACB=m +CONFIG_USB_STORAGE_ENE_UB6250=m +CONFIG_USB_UAS=m + +# +# USB Imaging devices +# +CONFIG_USB_MDC800=m +CONFIG_USB_MICROTEK=m +CONFIG_USBIP_CORE=m +CONFIG_USBIP_VHCI_HCD=m +CONFIG_USBIP_VHCI_HC_PORTS=15 +CONFIG_USBIP_VHCI_NR_HCS=8 +CONFIG_USBIP_HOST=m +CONFIG_USBIP_VUDC=m +# CONFIG_USBIP_DEBUG is not set + +# +# USB dual-mode controller drivers +# +# CONFIG_USB_CDNS_SUPPORT is not set +CONFIG_USB_MUSB_HDRC=m +# CONFIG_USB_MUSB_HOST is not set +# CONFIG_USB_MUSB_GADGET is not set +CONFIG_USB_MUSB_DUAL_ROLE=y + +# +# Platform Glue Layer +# +CONFIG_USB_MUSB_SUNXI=m + +# +# MUSB DMA mode +# +# CONFIG_MUSB_PIO_ONLY is not set +CONFIG_USB_DWC3=m +CONFIG_USB_DWC3_ULPI=y +# CONFIG_USB_DWC3_HOST is not set +# CONFIG_USB_DWC3_GADGET is not set +CONFIG_USB_DWC3_DUAL_ROLE=y + +# +# Platform Glue Driver Support +# +CONFIG_USB_DWC3_PCI=m +CONFIG_USB_DWC3_HAPS=m +CONFIG_USB_DWC3_MESON_G12A=m +CONFIG_USB_DWC3_OF_SIMPLE=m +CONFIG_USB_DWC3_QCOM=m +CONFIG_USB_DWC3_XILINX=m +CONFIG_USB_DWC2=m +# CONFIG_USB_DWC2_HOST is not set + +# +# Gadget/Dual-role mode requires USB Gadget support to be enabled +# +# CONFIG_USB_DWC2_PERIPHERAL is not set +CONFIG_USB_DWC2_DUAL_ROLE=y +# CONFIG_USB_DWC2_PCI is not set +# CONFIG_USB_DWC2_DEBUG is not set +# CONFIG_USB_DWC2_TRACK_MISSED_SOFS is not set +CONFIG_USB_CHIPIDEA=m +CONFIG_USB_CHIPIDEA_UDC=y +CONFIG_USB_CHIPIDEA_HOST=y +CONFIG_USB_CHIPIDEA_PCI=m +CONFIG_USB_CHIPIDEA_MSM=m +CONFIG_USB_CHIPIDEA_IMX=m +CONFIG_USB_CHIPIDEA_GENERIC=m +CONFIG_USB_CHIPIDEA_TEGRA=m +CONFIG_USB_ISP1760=m +CONFIG_USB_ISP1760_HCD=y +CONFIG_USB_ISP1761_UDC=y +# CONFIG_USB_ISP1760_HOST_ROLE is not set +# CONFIG_USB_ISP1760_GADGET_ROLE is not set +CONFIG_USB_ISP1760_DUAL_ROLE=y + +# +# USB port drivers +# +CONFIG_USB_SERIAL=m +CONFIG_USB_SERIAL_GENERIC=y +CONFIG_USB_SERIAL_SIMPLE=m +CONFIG_USB_SERIAL_AIRCABLE=m +CONFIG_USB_SERIAL_ARK3116=m +CONFIG_USB_SERIAL_BELKIN=m +CONFIG_USB_SERIAL_CH341=m +CONFIG_USB_SERIAL_WHITEHEAT=m +CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m +CONFIG_USB_SERIAL_CP210X=m +CONFIG_USB_SERIAL_CYPRESS_M8=m +CONFIG_USB_SERIAL_EMPEG=m +CONFIG_USB_SERIAL_FTDI_SIO=m +CONFIG_USB_SERIAL_VISOR=m +CONFIG_USB_SERIAL_IPAQ=m +CONFIG_USB_SERIAL_IR=m +CONFIG_USB_SERIAL_EDGEPORT=m +CONFIG_USB_SERIAL_EDGEPORT_TI=m +CONFIG_USB_SERIAL_F81232=m +CONFIG_USB_SERIAL_F8153X=m +CONFIG_USB_SERIAL_GARMIN=m +CONFIG_USB_SERIAL_IPW=m +CONFIG_USB_SERIAL_IUU=m +CONFIG_USB_SERIAL_KEYSPAN_PDA=m +CONFIG_USB_SERIAL_KEYSPAN=m +CONFIG_USB_SERIAL_KLSI=m +CONFIG_USB_SERIAL_KOBIL_SCT=m +CONFIG_USB_SERIAL_MCT_U232=m +CONFIG_USB_SERIAL_METRO=m +CONFIG_USB_SERIAL_MOS7720=m +CONFIG_USB_SERIAL_MOS7715_PARPORT=y +CONFIG_USB_SERIAL_MOS7840=m +CONFIG_USB_SERIAL_MXUPORT=m +CONFIG_USB_SERIAL_NAVMAN=m +CONFIG_USB_SERIAL_PL2303=m +CONFIG_USB_SERIAL_OTI6858=m +CONFIG_USB_SERIAL_QCAUX=m +CONFIG_USB_SERIAL_QUALCOMM=m +CONFIG_USB_SERIAL_SPCP8X5=m +CONFIG_USB_SERIAL_SAFE=m +# CONFIG_USB_SERIAL_SAFE_PADDED is not set +CONFIG_USB_SERIAL_SIERRAWIRELESS=m +CONFIG_USB_SERIAL_SYMBOL=m +CONFIG_USB_SERIAL_TI=m +CONFIG_USB_SERIAL_CYBERJACK=m +CONFIG_USB_SERIAL_WWAN=m +CONFIG_USB_SERIAL_OPTION=m +CONFIG_USB_SERIAL_OMNINET=m +CONFIG_USB_SERIAL_OPTICON=m +CONFIG_USB_SERIAL_XSENS_MT=m +CONFIG_USB_SERIAL_WISHBONE=m +CONFIG_USB_SERIAL_SSU100=m +CONFIG_USB_SERIAL_QT2=m +CONFIG_USB_SERIAL_UPD78F0730=m +# CONFIG_USB_SERIAL_XR is not set +CONFIG_USB_SERIAL_DEBUG=m + +# +# USB Miscellaneous drivers +# +# CONFIG_USB_USS720 is not set +CONFIG_USB_EMI62=m +CONFIG_USB_EMI26=m +CONFIG_USB_ADUTUX=m +CONFIG_USB_SEVSEG=m +CONFIG_USB_LEGOTOWER=m +CONFIG_USB_LCD=m +CONFIG_USB_CYPRESS_CY7C63=m +CONFIG_USB_CYTHERM=m +CONFIG_USB_IDMOUSE=m +CONFIG_USB_APPLEDISPLAY=m +# CONFIG_USB_QCOM_EUD is not set +# CONFIG_APPLE_MFI_FASTCHARGE is not set +CONFIG_USB_SISUSBVGA=m +CONFIG_USB_LD=m +CONFIG_USB_TRANCEVIBRATOR=m +CONFIG_USB_IOWARRIOR=m +CONFIG_USB_TEST=m +CONFIG_USB_EHSET_TEST_FIXTURE=m +CONFIG_USB_ISIGHTFW=m +CONFIG_USB_YUREX=m +CONFIG_USB_EZUSB_FX2=m +# CONFIG_USB_HUB_USB251XB is not set +CONFIG_USB_HSIC_USB3503=m +# CONFIG_USB_HSIC_USB4604 is not set +# CONFIG_USB_LINK_LAYER_TEST is not set +CONFIG_USB_CHAOSKEY=m +# CONFIG_USB_ONBOARD_HUB is not set +# CONFIG_USB_ATM is not set + +# +# USB Physical Layer drivers +# +CONFIG_USB_PHY=y +CONFIG_NOP_USB_XCEIV=m +# CONFIG_USB_GPIO_VBUS is not set +# CONFIG_USB_ISP1301 is not set +CONFIG_USB_TEGRA_PHY=m +CONFIG_USB_ULPI=y +CONFIG_USB_ULPI_VIEWPORT=y +# end of USB Physical Layer drivers + +CONFIG_USB_GADGET=m +# CONFIG_USB_GADGET_DEBUG is not set +# CONFIG_USB_GADGET_DEBUG_FILES is not set +# CONFIG_USB_GADGET_DEBUG_FS is not set +CONFIG_USB_GADGET_VBUS_DRAW=2 +CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2 +# CONFIG_U_SERIAL_CONSOLE is not set + +# +# USB Peripheral Controller +# +# CONFIG_USB_GR_UDC is not set +# CONFIG_USB_R8A66597 is not set +# CONFIG_USB_PXA27X is not set +# CONFIG_USB_MV_UDC is not set +# CONFIG_USB_MV_U3D is not set +# CONFIG_USB_SNP_UDC_PLAT is not set +# CONFIG_USB_M66592 is not set +# CONFIG_USB_BDC_UDC is not set +# CONFIG_USB_AMD5536UDC is not set +# CONFIG_USB_NET2272 is not set +CONFIG_USB_NET2280=m +# CONFIG_USB_GOKU is not set +# CONFIG_USB_EG20T is not set +# CONFIG_USB_GADGET_XILINX is not set +# CONFIG_USB_MAX3420_UDC is not set +# CONFIG_USB_TEGRA_XUDC is not set +# CONFIG_USB_CDNS2_UDC is not set +# CONFIG_USB_DUMMY_HCD is not set +# end of USB Peripheral Controller + +CONFIG_USB_LIBCOMPOSITE=m +CONFIG_USB_F_ACM=m +CONFIG_USB_F_SS_LB=m +CONFIG_USB_U_SERIAL=m +CONFIG_USB_U_ETHER=m +CONFIG_USB_U_AUDIO=m +CONFIG_USB_F_SERIAL=m +CONFIG_USB_F_OBEX=m +CONFIG_USB_F_NCM=m +CONFIG_USB_F_ECM=m +CONFIG_USB_F_PHONET=m +CONFIG_USB_F_EEM=m +CONFIG_USB_F_SUBSET=m +CONFIG_USB_F_RNDIS=m +CONFIG_USB_F_MASS_STORAGE=m +CONFIG_USB_F_FS=m +CONFIG_USB_F_UAC1=m +CONFIG_USB_F_UAC2=m +CONFIG_USB_F_UVC=m +CONFIG_USB_F_MIDI=m +CONFIG_USB_F_HID=m +CONFIG_USB_F_PRINTER=m +CONFIG_USB_CONFIGFS=m +CONFIG_USB_CONFIGFS_SERIAL=y +CONFIG_USB_CONFIGFS_ACM=y +CONFIG_USB_CONFIGFS_OBEX=y +CONFIG_USB_CONFIGFS_NCM=y +CONFIG_USB_CONFIGFS_ECM=y +CONFIG_USB_CONFIGFS_ECM_SUBSET=y +CONFIG_USB_CONFIGFS_RNDIS=y +CONFIG_USB_CONFIGFS_EEM=y +CONFIG_USB_CONFIGFS_PHONET=y +CONFIG_USB_CONFIGFS_MASS_STORAGE=y +CONFIG_USB_CONFIGFS_F_LB_SS=y +CONFIG_USB_CONFIGFS_F_FS=y +CONFIG_USB_CONFIGFS_F_UAC1=y +# CONFIG_USB_CONFIGFS_F_UAC1_LEGACY is not set +CONFIG_USB_CONFIGFS_F_UAC2=y +CONFIG_USB_CONFIGFS_F_MIDI=y +# CONFIG_USB_CONFIGFS_F_MIDI2 is not set +CONFIG_USB_CONFIGFS_F_HID=y +CONFIG_USB_CONFIGFS_F_UVC=y +CONFIG_USB_CONFIGFS_F_PRINTER=y +# CONFIG_USB_CONFIGFS_F_TCM is not set + +# +# USB Gadget precomposed configurations +# +# CONFIG_USB_ZERO is not set +# CONFIG_USB_AUDIO is not set +CONFIG_USB_ETH=m +CONFIG_USB_ETH_RNDIS=y +# CONFIG_USB_ETH_EEM is not set +# CONFIG_USB_G_NCM is not set +CONFIG_USB_GADGETFS=m +CONFIG_USB_FUNCTIONFS=m +CONFIG_USB_FUNCTIONFS_ETH=y +CONFIG_USB_FUNCTIONFS_RNDIS=y +CONFIG_USB_FUNCTIONFS_GENERIC=y +# CONFIG_USB_MASS_STORAGE is not set +# CONFIG_USB_GADGET_TARGET is not set +CONFIG_USB_G_SERIAL=m +# CONFIG_USB_MIDI_GADGET is not set +# CONFIG_USB_G_PRINTER is not set +# CONFIG_USB_CDC_COMPOSITE is not set +# CONFIG_USB_G_NOKIA is not set +# CONFIG_USB_G_ACM_MS is not set +# CONFIG_USB_G_MULTI is not set +# CONFIG_USB_G_HID is not set +# CONFIG_USB_G_DBGP is not set +# CONFIG_USB_G_WEBCAM is not set +# CONFIG_USB_RAW_GADGET is not set +# end of USB Gadget precomposed configurations + +# CONFIG_TYPEC is not set +CONFIG_USB_ROLE_SWITCH=m +CONFIG_MMC=y +CONFIG_PWRSEQ_EMMC=y +# CONFIG_PWRSEQ_SD8787 is not set +CONFIG_PWRSEQ_SIMPLE=y +CONFIG_MMC_BLOCK=y +CONFIG_MMC_BLOCK_MINORS=256 +CONFIG_SDIO_UART=m +# CONFIG_MMC_TEST is not set + +# +# MMC/SD/SDIO Host Controller Drivers +# +# CONFIG_MMC_DEBUG is not set +CONFIG_MMC_ARMMMCI=m +CONFIG_MMC_QCOM_DML=y +CONFIG_MMC_STM32_SDMMC=y +CONFIG_MMC_SDHCI=m +CONFIG_MMC_SDHCI_IO_ACCESSORS=y +CONFIG_MMC_SDHCI_PCI=m +CONFIG_MMC_RICOH_MMC=y +CONFIG_MMC_SDHCI_ACPI=m +CONFIG_MMC_SDHCI_PLTFM=m +CONFIG_MMC_SDHCI_OF_ARASAN=m +# CONFIG_MMC_SDHCI_OF_AT91 is not set +# CONFIG_MMC_SDHCI_OF_DWCMSHC is not set +# CONFIG_MMC_SDHCI_CADENCE is not set +CONFIG_MMC_SDHCI_TEGRA=m +# CONFIG_MMC_SDHCI_PXAV3 is not set +CONFIG_MMC_SDHCI_F_SDH30=m +# CONFIG_MMC_SDHCI_MILBEAUT is not set +CONFIG_MMC_MESON_GX=m +# CONFIG_MMC_MESON_MX_SDIO is not set +CONFIG_MMC_SDHCI_MSM=m +CONFIG_MMC_TIFM_SD=m +CONFIG_MMC_SPI=m +CONFIG_MMC_CB710=m +CONFIG_MMC_VIA_SDMMC=m +CONFIG_MMC_DW=m +CONFIG_MMC_DW_PLTFM=m +# CONFIG_MMC_DW_BLUEFIELD is not set +# CONFIG_MMC_DW_EXYNOS is not set +# CONFIG_MMC_DW_HI3798CV200 is not set +CONFIG_MMC_DW_K3=m +# CONFIG_MMC_DW_PCI is not set +CONFIG_MMC_DW_ROCKCHIP=m +CONFIG_MMC_VUB300=m +CONFIG_MMC_USHC=m +# CONFIG_MMC_USDHI6ROL0 is not set +CONFIG_MMC_REALTEK_PCI=m +CONFIG_MMC_REALTEK_USB=m +CONFIG_MMC_SUNXI=m +CONFIG_MMC_CQHCI=m +# CONFIG_MMC_HSQ is not set +CONFIG_MMC_TOSHIBA_PCI=m +# CONFIG_MMC_MTK is not set +CONFIG_MMC_SDHCI_XENON=m +# CONFIG_MMC_SDHCI_OMAP is not set +# CONFIG_MMC_SDHCI_AM654 is not set +CONFIG_SCSI_UFSHCD=m +# CONFIG_SCSI_UFS_BSG is not set +# CONFIG_SCSI_UFS_FAULT_INJECTION is not set +# CONFIG_SCSI_UFS_HWMON is not set +CONFIG_SCSI_UFSHCD_PCI=m +# CONFIG_SCSI_UFS_DWC_TC_PCI is not set +# CONFIG_SCSI_UFSHCD_PLATFORM is not set +CONFIG_MEMSTICK=m +# CONFIG_MEMSTICK_DEBUG is not set + +# +# MemoryStick drivers +# +# CONFIG_MEMSTICK_UNSAFE_RESUME is not set +CONFIG_MSPRO_BLOCK=m +# CONFIG_MS_BLOCK is not set + +# +# MemoryStick Host Controller Drivers +# +CONFIG_MEMSTICK_TIFM_MS=m +CONFIG_MEMSTICK_JMICRON_38X=m +CONFIG_MEMSTICK_R592=m +CONFIG_MEMSTICK_REALTEK_PCI=m +CONFIG_MEMSTICK_REALTEK_USB=m +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +# CONFIG_LEDS_CLASS_FLASH is not set +# CONFIG_LEDS_CLASS_MULTICOLOR is not set +CONFIG_LEDS_BRIGHTNESS_HW_CHANGED=y + +# +# LED drivers +# +# CONFIG_LEDS_AN30259A is not set +# CONFIG_LEDS_AW200XX is not set +# CONFIG_LEDS_AW2013 is not set +# CONFIG_LEDS_BCM6328 is not set +# CONFIG_LEDS_BCM6358 is not set +# CONFIG_LEDS_CR0014114 is not set +# CONFIG_LEDS_EL15203000 is not set +# CONFIG_LEDS_LM3530 is not set +# CONFIG_LEDS_LM3532 is not set +# CONFIG_LEDS_LM3642 is not set +# CONFIG_LEDS_LM3692X is not set +# CONFIG_LEDS_PCA9532 is not set +CONFIG_LEDS_GPIO=m +CONFIG_LEDS_LP3944=m +# CONFIG_LEDS_LP3952 is not set +# CONFIG_LEDS_LP50XX is not set +# CONFIG_LEDS_LP55XX_COMMON is not set +# CONFIG_LEDS_LP8860 is not set +CONFIG_LEDS_PCA955X=m +# CONFIG_LEDS_PCA955X_GPIO is not set +# CONFIG_LEDS_PCA963X is not set +# CONFIG_LEDS_PCA995X is not set +CONFIG_LEDS_DAC124S085=m +CONFIG_LEDS_PWM=m +CONFIG_LEDS_REGULATOR=m +# CONFIG_LEDS_BD2606MVV is not set +CONFIG_LEDS_BD2802=m +CONFIG_LEDS_LT3593=m +# CONFIG_LEDS_TCA6507 is not set +# CONFIG_LEDS_TLC591XX is not set +# CONFIG_LEDS_LM355x is not set +# CONFIG_LEDS_IS31FL319X is not set +# CONFIG_LEDS_IS31FL32XX is not set + +# +# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM) +# +# CONFIG_LEDS_BLINKM is not set +# CONFIG_LEDS_SYSCON is not set +# CONFIG_LEDS_MLXREG is not set +# CONFIG_LEDS_USER is not set +# CONFIG_LEDS_SPI_BYTE is not set +# CONFIG_LEDS_LM3697 is not set + +# +# Flash and Torch LED drivers +# + +# +# RGB LED drivers +# + +# +# LED Triggers +# +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=m +CONFIG_LEDS_TRIGGER_ONESHOT=m +CONFIG_LEDS_TRIGGER_DISK=y +CONFIG_LEDS_TRIGGER_MTD=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=m +CONFIG_LEDS_TRIGGER_BACKLIGHT=m +CONFIG_LEDS_TRIGGER_CPU=y +# CONFIG_LEDS_TRIGGER_ACTIVITY is not set +CONFIG_LEDS_TRIGGER_DEFAULT_ON=m + +# +# iptables trigger is under Netfilter config (LED target) +# +CONFIG_LEDS_TRIGGER_TRANSIENT=m +CONFIG_LEDS_TRIGGER_CAMERA=m +CONFIG_LEDS_TRIGGER_PANIC=y +# CONFIG_LEDS_TRIGGER_NETDEV is not set +# CONFIG_LEDS_TRIGGER_PATTERN is not set +CONFIG_LEDS_TRIGGER_AUDIO=m +# CONFIG_LEDS_TRIGGER_TTY is not set + +# +# Simple LED drivers +# +CONFIG_ACCESSIBILITY=y +CONFIG_A11Y_BRAILLE_CONSOLE=y + +# +# Speakup console speech +# +CONFIG_SPEAKUP=m +CONFIG_SPEAKUP_SYNTH_ACNTSA=m +CONFIG_SPEAKUP_SYNTH_APOLLO=m +CONFIG_SPEAKUP_SYNTH_AUDPTR=m +CONFIG_SPEAKUP_SYNTH_BNS=m +CONFIG_SPEAKUP_SYNTH_DECTLK=m +CONFIG_SPEAKUP_SYNTH_DECEXT=m +CONFIG_SPEAKUP_SYNTH_LTLK=m +CONFIG_SPEAKUP_SYNTH_SOFT=m +CONFIG_SPEAKUP_SYNTH_SPKOUT=m +CONFIG_SPEAKUP_SYNTH_TXPRT=m +CONFIG_SPEAKUP_SYNTH_DUMMY=m +# end of Speakup console speech + +CONFIG_INFINIBAND=m +CONFIG_INFINIBAND_USER_MAD=m +CONFIG_INFINIBAND_USER_ACCESS=m +CONFIG_INFINIBAND_USER_MEM=y +CONFIG_INFINIBAND_ON_DEMAND_PAGING=y +CONFIG_INFINIBAND_ADDR_TRANS=y +CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS=y +CONFIG_INFINIBAND_VIRT_DMA=y +# CONFIG_INFINIBAND_BNXT_RE is not set +CONFIG_INFINIBAND_CXGB4=m +# CONFIG_INFINIBAND_EFA is not set +# CONFIG_INFINIBAND_ERDMA is not set +# CONFIG_INFINIBAND_IRDMA is not set +CONFIG_MLX4_INFINIBAND=m +CONFIG_MLX5_INFINIBAND=m +CONFIG_INFINIBAND_MTHCA=m +CONFIG_INFINIBAND_MTHCA_DEBUG=y +CONFIG_INFINIBAND_OCRDMA=m +CONFIG_INFINIBAND_QEDR=m +CONFIG_RDMA_RXE=m +# CONFIG_RDMA_SIW is not set +CONFIG_INFINIBAND_IPOIB=m +CONFIG_INFINIBAND_IPOIB_CM=y +CONFIG_INFINIBAND_IPOIB_DEBUG=y +# CONFIG_INFINIBAND_IPOIB_DEBUG_DATA is not set +CONFIG_INFINIBAND_SRP=m +CONFIG_INFINIBAND_SRPT=m +CONFIG_INFINIBAND_ISER=m +CONFIG_INFINIBAND_ISERT=m +# CONFIG_INFINIBAND_RTRS_CLIENT is not set +# CONFIG_INFINIBAND_RTRS_SERVER is not set +CONFIG_EDAC_SUPPORT=y +CONFIG_EDAC=y +CONFIG_EDAC_LEGACY_SYSFS=y +# CONFIG_EDAC_DEBUG is not set +CONFIG_EDAC_GHES=y +CONFIG_EDAC_THUNDERX=m +# CONFIG_EDAC_SYNOPSYS is not set +CONFIG_EDAC_XGENE=m +# CONFIG_EDAC_DMC520 is not set +# CONFIG_EDAC_ZYNQMP is not set +CONFIG_RTC_LIB=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_HCTOSYS=y +CONFIG_RTC_HCTOSYS_DEVICE="rtc0" +CONFIG_RTC_SYSTOHC=y +CONFIG_RTC_SYSTOHC_DEVICE="rtc0" +# CONFIG_RTC_DEBUG is not set +CONFIG_RTC_NVMEM=y + +# +# RTC interfaces +# +CONFIG_RTC_INTF_SYSFS=y +CONFIG_RTC_INTF_PROC=y +CONFIG_RTC_INTF_DEV=y +# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set +# CONFIG_RTC_DRV_TEST is not set + +# +# I2C RTC drivers +# +# CONFIG_RTC_DRV_ABB5ZES3 is not set +# CONFIG_RTC_DRV_ABEOZ9 is not set +# CONFIG_RTC_DRV_ABX80X is not set +CONFIG_RTC_DRV_DS1307=y +# CONFIG_RTC_DRV_DS1307_CENTURY is not set +# CONFIG_RTC_DRV_DS1374 is not set +# CONFIG_RTC_DRV_DS1672 is not set +# CONFIG_RTC_DRV_HYM8563 is not set +# CONFIG_RTC_DRV_MAX6900 is not set +CONFIG_RTC_DRV_MAX77686=y +# CONFIG_RTC_DRV_NCT3018Y is not set +# CONFIG_RTC_DRV_RS5C372 is not set +# CONFIG_RTC_DRV_ISL1208 is not set +# CONFIG_RTC_DRV_ISL12022 is not set +# CONFIG_RTC_DRV_ISL12026 is not set +# CONFIG_RTC_DRV_X1205 is not set +# CONFIG_RTC_DRV_PCF8523 is not set +# CONFIG_RTC_DRV_PCF85063 is not set +# CONFIG_RTC_DRV_PCF85363 is not set +CONFIG_RTC_DRV_PCF8563=y +# CONFIG_RTC_DRV_PCF8583 is not set +# CONFIG_RTC_DRV_M41T80 is not set +# CONFIG_RTC_DRV_BQ32K is not set +# CONFIG_RTC_DRV_S35390A is not set +# CONFIG_RTC_DRV_FM3130 is not set +# CONFIG_RTC_DRV_RX8010 is not set +# CONFIG_RTC_DRV_RX8581 is not set +# CONFIG_RTC_DRV_RX8025 is not set +# CONFIG_RTC_DRV_EM3027 is not set +# CONFIG_RTC_DRV_RV3028 is not set +# CONFIG_RTC_DRV_RV3032 is not set +# CONFIG_RTC_DRV_RV8803 is not set +# CONFIG_RTC_DRV_SD3078 is not set + +# +# SPI RTC drivers +# +# CONFIG_RTC_DRV_M41T93 is not set +# CONFIG_RTC_DRV_M41T94 is not set +# CONFIG_RTC_DRV_DS1302 is not set +# CONFIG_RTC_DRV_DS1305 is not set +# CONFIG_RTC_DRV_DS1343 is not set +# CONFIG_RTC_DRV_DS1347 is not set +# CONFIG_RTC_DRV_DS1390 is not set +# CONFIG_RTC_DRV_MAX6916 is not set +# CONFIG_RTC_DRV_R9701 is not set +# CONFIG_RTC_DRV_RX4581 is not set +# CONFIG_RTC_DRV_RS5C348 is not set +# CONFIG_RTC_DRV_MAX6902 is not set +# CONFIG_RTC_DRV_PCF2123 is not set +# CONFIG_RTC_DRV_MCP795 is not set +CONFIG_RTC_I2C_AND_SPI=y + +# +# SPI and I2C RTC drivers +# +# CONFIG_RTC_DRV_DS3232 is not set +# CONFIG_RTC_DRV_PCF2127 is not set +# CONFIG_RTC_DRV_RV3029C2 is not set +# CONFIG_RTC_DRV_RX6110 is not set + +# +# Platform RTC drivers +# +# CONFIG_RTC_DRV_DS1286 is not set +# CONFIG_RTC_DRV_DS1511 is not set +# CONFIG_RTC_DRV_DS1553 is not set +# CONFIG_RTC_DRV_DS1685_FAMILY is not set +# CONFIG_RTC_DRV_DS1742 is not set +# CONFIG_RTC_DRV_DS2404 is not set +CONFIG_RTC_DRV_EFI=y +# CONFIG_RTC_DRV_STK17TA8 is not set +# CONFIG_RTC_DRV_M48T86 is not set +# CONFIG_RTC_DRV_M48T35 is not set +# CONFIG_RTC_DRV_M48T59 is not set +# CONFIG_RTC_DRV_MSM6242 is not set +# CONFIG_RTC_DRV_RP5C01 is not set +# CONFIG_RTC_DRV_OPTEE is not set +# CONFIG_RTC_DRV_ZYNQMP is not set +CONFIG_RTC_DRV_CROS_EC=m + +# +# on-CPU RTC drivers +# +CONFIG_RTC_DRV_MESON_VRTC=m +# CONFIG_RTC_DRV_PL030 is not set +CONFIG_RTC_DRV_PL031=y +CONFIG_RTC_DRV_SUN6I=y +CONFIG_RTC_DRV_MV=m +CONFIG_RTC_DRV_ARMADA38X=m +# CONFIG_RTC_DRV_CADENCE is not set +# CONFIG_RTC_DRV_FTRTC010 is not set +CONFIG_RTC_DRV_PM8XXX=m +CONFIG_RTC_DRV_TEGRA=y +CONFIG_RTC_DRV_XGENE=y +# CONFIG_RTC_DRV_R7301 is not set + +# +# HID Sensor RTC drivers +# +# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set +# CONFIG_RTC_DRV_GOLDFISH is not set +CONFIG_DMADEVICES=y +# CONFIG_DMADEVICES_DEBUG is not set + +# +# DMA Devices +# +CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH=y +CONFIG_DMA_ENGINE=y +CONFIG_DMA_VIRTUAL_CHANNELS=y +CONFIG_DMA_ACPI=y +CONFIG_DMA_OF=y +# CONFIG_ALTERA_MSGDMA is not set +# CONFIG_AMBA_PL08X is not set +# CONFIG_AXI_DMAC is not set +# CONFIG_BCM_SBA_RAID is not set +CONFIG_DMA_SUN6I=m +# CONFIG_DW_AXI_DMAC is not set +# CONFIG_FSL_EDMA is not set +# CONFIG_FSL_QDMA is not set +# CONFIG_HISI_DMA is not set +# CONFIG_INTEL_IDMA64 is not set +CONFIG_K3_DMA=m +CONFIG_MV_XOR=y +CONFIG_MV_XOR_V2=y +CONFIG_PL330_DMA=m +# CONFIG_PLX_DMA is not set +# CONFIG_TEGRA186_GPC_DMA is not set +CONFIG_TEGRA20_APB_DMA=y +CONFIG_TEGRA210_ADMA=y +CONFIG_XGENE_DMA=m +# CONFIG_XILINX_DMA is not set +# CONFIG_XILINX_XDMA is not set +# CONFIG_XILINX_ZYNQMP_DMA is not set +# CONFIG_XILINX_ZYNQMP_DPDMA is not set +CONFIG_QCOM_BAM_DMA=m +# CONFIG_QCOM_GPI_DMA is not set +CONFIG_QCOM_HIDMA_MGMT=m +CONFIG_QCOM_HIDMA=m +# CONFIG_DW_DMAC is not set +# CONFIG_DW_DMAC_PCI is not set +# CONFIG_DW_EDMA is not set +# CONFIG_SF_PDMA is not set + +# +# DMA Clients +# +CONFIG_ASYNC_TX_DMA=y +# CONFIG_DMATEST is not set +CONFIG_DMA_ENGINE_RAID=y + +# +# DMABUF options +# +CONFIG_SYNC_FILE=y +# CONFIG_SW_SYNC is not set +# CONFIG_UDMABUF is not set +# CONFIG_DMABUF_MOVE_NOTIFY is not set +# CONFIG_DMABUF_DEBUG is not set +# CONFIG_DMABUF_SELFTESTS is not set +# CONFIG_DMABUF_HEAPS is not set +# CONFIG_DMABUF_SYSFS_STATS is not set +# end of DMABUF options + +CONFIG_UIO=m +CONFIG_UIO_CIF=m +# CONFIG_UIO_PDRV_GENIRQ is not set +# CONFIG_UIO_DMEM_GENIRQ is not set +CONFIG_UIO_AEC=m +CONFIG_UIO_SERCOS3=m +CONFIG_UIO_PCI_GENERIC=m +CONFIG_UIO_NETX=m +# CONFIG_UIO_PRUSS is not set +CONFIG_UIO_MF624=m +# CONFIG_UIO_HV_GENERIC is not set +CONFIG_VFIO=m +CONFIG_VFIO_GROUP=y +CONFIG_VFIO_CONTAINER=y +CONFIG_VFIO_IOMMU_TYPE1=m +CONFIG_VFIO_NOIOMMU=y +CONFIG_VFIO_VIRQFD=y + +# +# VFIO support for PCI devices +# +CONFIG_VFIO_PCI_CORE=m +CONFIG_VFIO_PCI_MMAP=y +CONFIG_VFIO_PCI_INTX=y +CONFIG_VFIO_PCI=m +# CONFIG_MLX5_VFIO_PCI is not set +# end of VFIO support for PCI devices + +# +# VFIO support for platform devices +# +CONFIG_VFIO_PLATFORM_BASE=m +CONFIG_VFIO_PLATFORM=m +# CONFIG_VFIO_AMBA is not set + +# +# VFIO platform reset drivers +# +# CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET is not set +# CONFIG_VFIO_PLATFORM_AMDXGBE_RESET is not set +# end of VFIO platform reset drivers +# end of VFIO support for platform devices + +CONFIG_VIRT_DRIVERS=y +CONFIG_VMGENID=y +# CONFIG_NITRO_ENCLAVES is not set +CONFIG_VIRTIO_ANCHOR=y +CONFIG_VIRTIO=m +CONFIG_VIRTIO_PCI_LIB=m +CONFIG_VIRTIO_PCI_LIB_LEGACY=m +CONFIG_VIRTIO_MENU=y +CONFIG_VIRTIO_PCI=m +CONFIG_VIRTIO_PCI_LEGACY=y +CONFIG_VIRTIO_VDPA=m +# CONFIG_VIRTIO_PMEM is not set +CONFIG_VIRTIO_BALLOON=m +CONFIG_VIRTIO_INPUT=m +CONFIG_VIRTIO_MMIO=m +# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set +CONFIG_VIRTIO_DMA_SHARED_BUFFER=m +CONFIG_VDPA=m +CONFIG_VDPA_USER=m +CONFIG_VHOST_IOTLB=m +CONFIG_VHOST_TASK=y +CONFIG_VHOST_RING=m +CONFIG_VHOST=m +CONFIG_VHOST_MENU=y +CONFIG_VHOST_NET=m +CONFIG_VHOST_SCSI=m +CONFIG_VHOST_VSOCK=m +CONFIG_VHOST_VDPA=m +# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set + +# +# Microsoft Hyper-V guest support +# +CONFIG_HYPERV=m +CONFIG_HYPERV_UTILS=m +CONFIG_HYPERV_BALLOON=m +# end of Microsoft Hyper-V guest support + +# +# Xen driver support +# +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES_DEFAULT=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +# CONFIG_XEN_GRANT_DMA_ALLOC is not set +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCI_STUB=y +CONFIG_XEN_PCIDEV_STUB=m +# CONFIG_XEN_PVCALLS_FRONTEND is not set +# CONFIG_XEN_PVCALLS_BACKEND is not set +CONFIG_XEN_SCSI_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_EFI=y +CONFIG_XEN_AUTO_XLATE=y +CONFIG_XEN_FRONT_PGDIR_SHBUF=m +# CONFIG_XEN_VIRTIO is not set +# end of Xen driver support + +# CONFIG_GREYBUS is not set +# CONFIG_COMEDI is not set +CONFIG_STAGING=y +# CONFIG_PRISM2_USB is not set +# CONFIG_RTL8192U is not set +# CONFIG_RTLLIB is not set +CONFIG_RTL8723BS=m +CONFIG_R8712U=m +# CONFIG_RTS5208 is not set +# CONFIG_VT6655 is not set +# CONFIG_VT6656 is not set + +# +# IIO staging drivers +# + +# +# Accelerometers +# +# CONFIG_ADIS16203 is not set +# CONFIG_ADIS16240 is not set +# end of Accelerometers + +# +# Analog to digital converters +# +# CONFIG_AD7816 is not set +# end of Analog to digital converters + +# +# Analog digital bi-direction converters +# +# CONFIG_ADT7316 is not set +# end of Analog digital bi-direction converters + +# +# Direct Digital Synthesis +# +# CONFIG_AD9832 is not set +# CONFIG_AD9834 is not set +# end of Direct Digital Synthesis + +# +# Network Analyzer, Impedance Converters +# +# CONFIG_AD5933 is not set +# end of Network Analyzer, Impedance Converters + +# +# Resolver to digital converters +# +# CONFIG_AD2S1210 is not set +# end of Resolver to digital converters +# end of IIO staging drivers + +# CONFIG_FB_SM750 is not set +# CONFIG_MFD_NVEC is not set +# CONFIG_STAGING_MEDIA is not set +# CONFIG_STAGING_BOARD is not set +# CONFIG_LTE_GDM724X is not set +# CONFIG_FB_TFT is not set +# CONFIG_KS7010 is not set +# CONFIG_PI433 is not set +# CONFIG_XIL_AXIS_FIFO is not set +# CONFIG_FIELDBUS_DEV is not set +CONFIG_QLGE=m +# CONFIG_VME_BUS is not set +# CONFIG_GOLDFISH is not set +CONFIG_CHROME_PLATFORMS=y +# CONFIG_CHROMEOS_ACPI is not set +# CONFIG_CHROMEOS_TBMC is not set +CONFIG_CROS_EC=y +CONFIG_CROS_EC_I2C=m +# CONFIG_CROS_EC_RPMSG is not set +CONFIG_CROS_EC_SPI=m +# CONFIG_CROS_EC_UART is not set +CONFIG_CROS_EC_PROTO=y +CONFIG_CROS_KBD_LED_BACKLIGHT=m +CONFIG_CROS_EC_CHARDEV=y +CONFIG_CROS_EC_LIGHTBAR=m +CONFIG_CROS_EC_VBC=m +CONFIG_CROS_EC_DEBUGFS=m +CONFIG_CROS_EC_SENSORHUB=y +CONFIG_CROS_EC_SYSFS=m +# CONFIG_CROS_HPS_I2C is not set +CONFIG_CROS_USBPD_LOGGER=m +CONFIG_CROS_USBPD_NOTIFY=y +# CONFIG_CHROMEOS_PRIVACY_SCREEN is not set +# CONFIG_MELLANOX_PLATFORM is not set +CONFIG_SURFACE_PLATFORMS=y +# CONFIG_SURFACE_3_POWER_OPREGION is not set +# CONFIG_SURFACE_GPE is not set +# CONFIG_SURFACE_HOTPLUG is not set +# CONFIG_SURFACE_PRO3_BUTTON is not set +# CONFIG_SURFACE_AGGREGATOR is not set +CONFIG_HAVE_CLK=y +CONFIG_HAVE_CLK_PREPARE=y +CONFIG_COMMON_CLK=y + +# +# Clock driver for ARM Reference designs +# +# CONFIG_CLK_ICST is not set +CONFIG_CLK_SP810=y +CONFIG_CLK_VEXPRESS_OSC=y +# end of Clock driver for ARM Reference designs + +# CONFIG_LMK04832 is not set +# CONFIG_COMMON_CLK_MAX77686 is not set +# CONFIG_COMMON_CLK_MAX9485 is not set +CONFIG_COMMON_CLK_HI655X=m +# CONFIG_COMMON_CLK_SI5341 is not set +# CONFIG_COMMON_CLK_SI5351 is not set +# CONFIG_COMMON_CLK_SI514 is not set +# CONFIG_COMMON_CLK_SI544 is not set +# CONFIG_COMMON_CLK_SI570 is not set +# CONFIG_COMMON_CLK_CDCE706 is not set +# CONFIG_COMMON_CLK_CDCE925 is not set +# CONFIG_COMMON_CLK_CS2000_CP is not set +# CONFIG_COMMON_CLK_AXI_CLKGEN is not set +CONFIG_COMMON_CLK_XGENE=y +# CONFIG_COMMON_CLK_PWM is not set +# CONFIG_COMMON_CLK_RS9_PCIE is not set +# CONFIG_COMMON_CLK_SI521XX is not set +# CONFIG_COMMON_CLK_VC3 is not set +# CONFIG_COMMON_CLK_VC5 is not set +# CONFIG_COMMON_CLK_VC7 is not set +# CONFIG_COMMON_CLK_FIXED_MMIO is not set +CONFIG_COMMON_CLK_HI3516CV300=y +CONFIG_COMMON_CLK_HI3519=y +CONFIG_COMMON_CLK_HI3559A=y +CONFIG_COMMON_CLK_HI3660=y +CONFIG_COMMON_CLK_HI3670=y +CONFIG_COMMON_CLK_HI3798CV200=y +CONFIG_COMMON_CLK_HI6220=y +CONFIG_RESET_HISI=y +CONFIG_STUB_CLK_HI6220=y +CONFIG_STUB_CLK_HI3660=y + +# +# Clock support for Amlogic platforms +# +CONFIG_COMMON_CLK_MESON_REGMAP=y +CONFIG_COMMON_CLK_MESON_DUALDIV=y +CONFIG_COMMON_CLK_MESON_MPLL=y +CONFIG_COMMON_CLK_MESON_PLL=y +CONFIG_COMMON_CLK_MESON_VID_PLL_DIV=y +CONFIG_COMMON_CLK_MESON_CLKC_UTILS=y +CONFIG_COMMON_CLK_MESON_AO_CLKC=y +CONFIG_COMMON_CLK_MESON_EE_CLKC=y +CONFIG_COMMON_CLK_MESON_CPU_DYNDIV=y +CONFIG_COMMON_CLK_GXBB=y +CONFIG_COMMON_CLK_AXG=y +# CONFIG_COMMON_CLK_AXG_AUDIO is not set +# CONFIG_COMMON_CLK_A1_PLL is not set +# CONFIG_COMMON_CLK_A1_PERIPHERALS is not set +CONFIG_COMMON_CLK_G12A=y +# end of Clock support for Amlogic platforms + +CONFIG_ARMADA_AP_CP_HELPER=y +CONFIG_ARMADA_37XX_CLK=y +CONFIG_ARMADA_AP806_SYSCON=y +CONFIG_ARMADA_CP110_SYSCON=y +CONFIG_QCOM_GDSC=y +CONFIG_QCOM_RPMCC=y +CONFIG_COMMON_CLK_QCOM=y +CONFIG_QCOM_A53PLL=y +# CONFIG_QCOM_A7PLL is not set +CONFIG_QCOM_CLK_APCS_MSM8916=m +# CONFIG_QCOM_CLK_APCC_MSM8996 is not set +CONFIG_QCOM_CLK_RPM=m +CONFIG_QCOM_CLK_SMD_RPM=m +# CONFIG_IPQ_APSS_PLL is not set +# CONFIG_IPQ_GCC_4019 is not set +# CONFIG_IPQ_GCC_5018 is not set +# CONFIG_IPQ_GCC_5332 is not set +# CONFIG_IPQ_GCC_6018 is not set +# CONFIG_IPQ_GCC_8074 is not set +# CONFIG_IPQ_GCC_9574 is not set +CONFIG_MSM_GCC_8916=y +# CONFIG_MSM_GCC_8917 is not set +# CONFIG_MSM_GCC_8939 is not set +# CONFIG_MSM_GCC_8953 is not set +# CONFIG_MSM_GCC_8976 is not set +# CONFIG_MSM_MMCC_8994 is not set +# CONFIG_MSM_GCC_8994 is not set +CONFIG_MSM_GCC_8996=y +CONFIG_MSM_MMCC_8996=y +# CONFIG_MSM_GCC_8998 is not set +# CONFIG_MSM_GPUCC_8998 is not set +# CONFIG_MSM_MMCC_8998 is not set +# CONFIG_QCM_GCC_2290 is not set +# CONFIG_QCM_DISPCC_2290 is not set +# CONFIG_QCS_GCC_404 is not set +# CONFIG_SC_CAMCC_7180 is not set +# CONFIG_SC_CAMCC_7280 is not set +# CONFIG_SC_DISPCC_7180 is not set +# CONFIG_SC_DISPCC_7280 is not set +# CONFIG_SC_DISPCC_8280XP is not set +# CONFIG_SA_GCC_8775P is not set +# CONFIG_SA_GPUCC_8775P is not set +# CONFIG_SC_GCC_7180 is not set +# CONFIG_SC_GCC_7280 is not set +# CONFIG_SC_GCC_8180X is not set +# CONFIG_SC_GCC_8280XP is not set +# CONFIG_SC_GPUCC_7180 is not set +# CONFIG_SC_GPUCC_7280 is not set +# CONFIG_SC_GPUCC_8280XP is not set +# CONFIG_SC_LPASSCC_7280 is not set +# CONFIG_SC_LPASSCC_8280XP is not set +# CONFIG_SC_LPASS_CORECC_7180 is not set +# CONFIG_SC_LPASS_CORECC_7280 is not set +# CONFIG_SC_MSS_7180 is not set +# CONFIG_SC_VIDEOCC_7180 is not set +# CONFIG_SC_VIDEOCC_7280 is not set +# CONFIG_SDM_CAMCC_845 is not set +# CONFIG_SDM_GCC_660 is not set +# CONFIG_SDM_MMCC_660 is not set +# CONFIG_SDM_GPUCC_660 is not set +# CONFIG_QCS_TURING_404 is not set +# CONFIG_QCS_Q6SSTOP_404 is not set +# CONFIG_QDU_GCC_1000 is not set +# CONFIG_SDM_GCC_845 is not set +# CONFIG_SDM_GPUCC_845 is not set +# CONFIG_SDM_VIDEOCC_845 is not set +# CONFIG_SDM_DISPCC_845 is not set +# CONFIG_SDM_LPASSCC_845 is not set +# CONFIG_SDX_GCC_75 is not set +# CONFIG_SM_CAMCC_6350 is not set +# CONFIG_SM_CAMCC_8250 is not set +# CONFIG_SM_CAMCC_8450 is not set +# CONFIG_SM_GCC_6115 is not set +# CONFIG_SM_GCC_6125 is not set +# CONFIG_SM_GCC_6350 is not set +# CONFIG_SM_GCC_6375 is not set +# CONFIG_SM_GCC_7150 is not set +# CONFIG_SM_GCC_8150 is not set +# CONFIG_SM_GCC_8250 is not set +# CONFIG_SM_GCC_8350 is not set +# CONFIG_SM_GCC_8450 is not set +# CONFIG_SM_GCC_8550 is not set +# CONFIG_SM_GPUCC_6115 is not set +# CONFIG_SM_GPUCC_6125 is not set +# CONFIG_SM_GPUCC_6375 is not set +# CONFIG_SM_GPUCC_6350 is not set +# CONFIG_SM_GPUCC_8150 is not set +# CONFIG_SM_GPUCC_8250 is not set +# CONFIG_SM_GPUCC_8350 is not set +# CONFIG_SM_GPUCC_8450 is not set +# CONFIG_SM_GPUCC_8550 is not set +# CONFIG_SM_TCSRCC_8550 is not set +# CONFIG_SM_VIDEOCC_8150 is not set +# CONFIG_SM_VIDEOCC_8250 is not set +# CONFIG_SM_VIDEOCC_8350 is not set +# CONFIG_SM_VIDEOCC_8550 is not set +# CONFIG_SPMI_PMIC_CLKDIV is not set +# CONFIG_QCOM_HFPLL is not set +# CONFIG_KPSS_XCC is not set +# CONFIG_CLK_GFM_LPASS_SM8250 is not set +# CONFIG_SM_VIDEOCC_8450 is not set +CONFIG_COMMON_CLK_ROCKCHIP=y +CONFIG_CLK_PX30=y +CONFIG_CLK_RK3308=y +CONFIG_CLK_RK3328=y +CONFIG_CLK_RK3368=y +CONFIG_CLK_RK3399=y +CONFIG_CLK_RK3568=y +CONFIG_CLK_RK3588=y +CONFIG_SUNXI_CCU=y +CONFIG_SUN50I_A64_CCU=y +CONFIG_SUN50I_A100_CCU=y +CONFIG_SUN50I_A100_R_CCU=y +CONFIG_SUN50I_H6_CCU=y +CONFIG_SUN50I_H616_CCU=y +CONFIG_SUN50I_H6_R_CCU=y +CONFIG_SUN6I_RTC_CCU=y +CONFIG_SUN8I_H3_CCU=y +CONFIG_SUN8I_DE2_CCU=y +CONFIG_SUN8I_R_CCU=y +CONFIG_TEGRA_CLK_DFLL=y +# CONFIG_XILINX_VCU is not set +# CONFIG_COMMON_CLK_XLNX_CLKWZRD is not set +# CONFIG_COMMON_CLK_ZYNQMP is not set +# CONFIG_HWSPINLOCK is not set + +# +# Clock Source drivers +# +CONFIG_TIMER_OF=y +CONFIG_TIMER_ACPI=y +CONFIG_TIMER_PROBE=y +CONFIG_CLKSRC_MMIO=y +CONFIG_ROCKCHIP_TIMER=y +CONFIG_SUN4I_TIMER=y +CONFIG_TEGRA_TIMER=y +# CONFIG_TEGRA186_TIMER is not set +CONFIG_ARM_ARCH_TIMER=y +CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y +CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND=y +CONFIG_FSL_ERRATUM_A008585=y +CONFIG_HISILICON_ERRATUM_161010101=y +CONFIG_ARM64_ERRATUM_858921=y +CONFIG_SUN50I_ERRATUM_UNKNOWN1=y +CONFIG_ARM_TIMER_SP804=y +# end of Clock Source drivers + +CONFIG_MAILBOX=y +# CONFIG_ARM_MHU is not set +# CONFIG_ARM_MHU_V2 is not set +# CONFIG_PLATFORM_MHU is not set +# CONFIG_PL320_MBOX is not set +CONFIG_ARMADA_37XX_RWTM_MBOX=m +# CONFIG_ROCKCHIP_MBOX is not set +CONFIG_PCC=y +# CONFIG_ALTERA_MBOX is not set +CONFIG_HI3660_MBOX=y +CONFIG_HI6220_MBOX=y +# CONFIG_MAILBOX_TEST is not set +CONFIG_QCOM_APCS_IPC=m +# CONFIG_TEGRA_HSP_MBOX is not set +CONFIG_XGENE_SLIMPRO_MBOX=m +CONFIG_ZYNQMP_IPI_MBOX=y +CONFIG_SUN6I_MSGBOX=y +# CONFIG_QCOM_IPCC is not set +CONFIG_IOMMU_IOVA=y +CONFIG_IOMMU_API=y +CONFIG_IOMMU_SUPPORT=y + +# +# Generic IOMMU Pagetable Support +# +CONFIG_IOMMU_IO_PGTABLE=y +CONFIG_IOMMU_IO_PGTABLE_LPAE=y +# CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST is not set +# CONFIG_IOMMU_IO_PGTABLE_ARMV7S is not set +# CONFIG_IOMMU_IO_PGTABLE_DART is not set +# end of Generic IOMMU Pagetable Support + +# CONFIG_IOMMU_DEBUGFS is not set +# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set +# CONFIG_IOMMU_DEFAULT_DMA_LAZY is not set +CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y +CONFIG_OF_IOMMU=y +CONFIG_IOMMU_DMA=y +CONFIG_IOMMU_SVA=y +# CONFIG_IOMMUFD is not set +CONFIG_ROCKCHIP_IOMMU=y +# CONFIG_SUN50I_IOMMU is not set +CONFIG_TEGRA_IOMMU_SMMU=y +CONFIG_ARM_SMMU=y +# CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS is not set +CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y +CONFIG_ARM_SMMU_QCOM=y +# CONFIG_ARM_SMMU_QCOM_DEBUG is not set +CONFIG_ARM_SMMU_V3=y +CONFIG_ARM_SMMU_V3_SVA=y +CONFIG_QCOM_IOMMU=y +# CONFIG_VIRTIO_IOMMU is not set + +# +# Remoteproc drivers +# +# CONFIG_REMOTEPROC is not set +# end of Remoteproc drivers + +# +# Rpmsg drivers +# +CONFIG_RPMSG=m +# CONFIG_RPMSG_CHAR is not set +# CONFIG_RPMSG_CTRL is not set +# CONFIG_RPMSG_NS is not set +CONFIG_RPMSG_QCOM_GLINK=m +CONFIG_RPMSG_QCOM_GLINK_RPM=m +# CONFIG_RPMSG_VIRTIO is not set +# end of Rpmsg drivers + +# CONFIG_SOUNDWIRE is not set + +# +# SOC (System On Chip) specific Drivers +# + +# +# Amlogic SoC drivers +# +CONFIG_MESON_CANVAS=m +CONFIG_MESON_CLK_MEASURE=y +CONFIG_MESON_GX_SOCINFO=y +CONFIG_MESON_GX_PM_DOMAINS=y +CONFIG_MESON_EE_PM_DOMAINS=y +CONFIG_MESON_SECURE_PM_DOMAINS=y +# end of Amlogic SoC drivers + +# +# Broadcom SoC drivers +# +# CONFIG_SOC_BRCMSTB is not set +# end of Broadcom SoC drivers + +# +# NXP/Freescale QorIQ SoC drivers +# +# CONFIG_QUICC_ENGINE is not set +# CONFIG_FSL_RCPM is not set +# end of NXP/Freescale QorIQ SoC drivers + +# +# fujitsu SoC drivers +# +# CONFIG_A64FX_DIAG is not set +# end of fujitsu SoC drivers + +# +# Hisilicon SoC drivers +# +# CONFIG_KUNPENG_HCCS is not set +# end of Hisilicon SoC drivers + +# +# i.MX SoC drivers +# +# end of i.MX SoC drivers + +# +# Enable LiteX SoC Builder specific drivers +# +# CONFIG_LITEX_SOC_CONTROLLER is not set +# end of Enable LiteX SoC Builder specific drivers + +# CONFIG_WPCM450_SOC is not set + +# +# Qualcomm SoC drivers +# +# CONFIG_QCOM_AOSS_QMP is not set +CONFIG_QCOM_COMMAND_DB=y +# CONFIG_QCOM_CPR is not set +# CONFIG_QCOM_GENI_SE is not set +CONFIG_QCOM_GSBI=m +# CONFIG_QCOM_LLCC is not set +CONFIG_QCOM_KRYO_L2_ACCESSORS=y +CONFIG_QCOM_MDT_LOADER=m +# CONFIG_QCOM_OCMEM is not set +# CONFIG_QCOM_RAMP_CTRL is not set +# CONFIG_QCOM_RMTFS_MEM is not set +# CONFIG_QCOM_RPM_MASTER_STATS is not set +# CONFIG_QCOM_RPMH is not set +# CONFIG_QCOM_RPMPD is not set +CONFIG_QCOM_SMD_RPM=m +# CONFIG_QCOM_SPM is not set +CONFIG_QCOM_WCNSS_CTRL=m +# CONFIG_QCOM_APR is not set +# CONFIG_QCOM_ICC_BWMON is not set +# end of Qualcomm SoC drivers + +CONFIG_ROCKCHIP_GRF=y +CONFIG_ROCKCHIP_IODOMAIN=m +CONFIG_ROCKCHIP_PM_DOMAINS=y +CONFIG_SUNXI_MBUS=y +CONFIG_SUNXI_SRAM=y +# CONFIG_SUN20I_PPU is not set +CONFIG_ARCH_TEGRA_132_SOC=y +CONFIG_ARCH_TEGRA_210_SOC=y +CONFIG_ARCH_TEGRA_186_SOC=y +CONFIG_ARCH_TEGRA_194_SOC=y +# CONFIG_ARCH_TEGRA_234_SOC is not set +CONFIG_SOC_TEGRA_FUSE=y +CONFIG_SOC_TEGRA_FLOWCTRL=y +CONFIG_SOC_TEGRA_PMC=y +# CONFIG_SOC_TI is not set + +# +# Xilinx SoC drivers +# +CONFIG_ZYNQMP_POWER=y +CONFIG_ZYNQMP_PM_DOMAINS=y +CONFIG_XLNX_EVENT_MANAGER=y +# end of Xilinx SoC drivers +# end of SOC (System On Chip) specific Drivers + +CONFIG_PM_DEVFREQ=y + +# +# DEVFREQ Governors +# +CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=m +# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set +# CONFIG_DEVFREQ_GOV_POWERSAVE is not set +# CONFIG_DEVFREQ_GOV_USERSPACE is not set +# CONFIG_DEVFREQ_GOV_PASSIVE is not set + +# +# DEVFREQ Drivers +# +# CONFIG_ARM_TEGRA_DEVFREQ is not set +CONFIG_ARM_RK3399_DMC_DEVFREQ=m +# CONFIG_ARM_SUN8I_A33_MBUS_DEVFREQ is not set +CONFIG_PM_DEVFREQ_EVENT=y +CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI=m +CONFIG_EXTCON=y + +# +# Extcon Device Drivers +# +# CONFIG_EXTCON_ADC_JACK is not set +# CONFIG_EXTCON_FSA9480 is not set +# CONFIG_EXTCON_GPIO is not set +# CONFIG_EXTCON_MAX3355 is not set +# CONFIG_EXTCON_PTN5150 is not set +CONFIG_EXTCON_QCOM_SPMI_MISC=m +# CONFIG_EXTCON_RT8973A is not set +# CONFIG_EXTCON_SM5502 is not set +CONFIG_EXTCON_USB_GPIO=m +CONFIG_EXTCON_USBC_CROS_EC=m +CONFIG_MEMORY=y +# CONFIG_ARM_PL172_MPMC is not set +CONFIG_TEGRA_MC=y +# CONFIG_TEGRA210_EMC is not set +CONFIG_IIO=m +CONFIG_IIO_BUFFER=y +# CONFIG_IIO_BUFFER_CB is not set +# CONFIG_IIO_BUFFER_DMA is not set +# CONFIG_IIO_BUFFER_DMAENGINE is not set +# CONFIG_IIO_BUFFER_HW_CONSUMER is not set +CONFIG_IIO_KFIFO_BUF=m +CONFIG_IIO_TRIGGERED_BUFFER=m +# CONFIG_IIO_CONFIGFS is not set +CONFIG_IIO_TRIGGER=y +CONFIG_IIO_CONSUMERS_PER_TRIGGER=2 +# CONFIG_IIO_SW_DEVICE is not set +# CONFIG_IIO_SW_TRIGGER is not set +# CONFIG_IIO_TRIGGERED_EVENT is not set + +# +# Accelerometers +# +# CONFIG_ADIS16201 is not set +# CONFIG_ADIS16209 is not set +# CONFIG_ADXL313_I2C is not set +# CONFIG_ADXL313_SPI is not set +# CONFIG_ADXL345_I2C is not set +# CONFIG_ADXL345_SPI is not set +# CONFIG_ADXL355_I2C is not set +# CONFIG_ADXL355_SPI is not set +# CONFIG_ADXL367_SPI is not set +# CONFIG_ADXL367_I2C is not set +# CONFIG_ADXL372_SPI is not set +# CONFIG_ADXL372_I2C is not set +# CONFIG_BMA180 is not set +# CONFIG_BMA220 is not set +# CONFIG_BMA400 is not set +# CONFIG_BMC150_ACCEL is not set +# CONFIG_BMI088_ACCEL is not set +# CONFIG_DA280 is not set +# CONFIG_DA311 is not set +# CONFIG_DMARD06 is not set +# CONFIG_DMARD09 is not set +# CONFIG_DMARD10 is not set +# CONFIG_FXLS8962AF_I2C is not set +# CONFIG_FXLS8962AF_SPI is not set +CONFIG_HID_SENSOR_ACCEL_3D=m +CONFIG_IIO_CROS_EC_ACCEL_LEGACY=m +# CONFIG_IIO_ST_ACCEL_3AXIS is not set +# CONFIG_IIO_KX022A_SPI is not set +# CONFIG_IIO_KX022A_I2C is not set +# CONFIG_KXSD9 is not set +# CONFIG_KXCJK1013 is not set +# CONFIG_MC3230 is not set +# CONFIG_MMA7455_I2C is not set +# CONFIG_MMA7455_SPI is not set +# CONFIG_MMA7660 is not set +# CONFIG_MMA8452 is not set +# CONFIG_MMA9551 is not set +# CONFIG_MMA9553 is not set +# CONFIG_MSA311 is not set +# CONFIG_MXC4005 is not set +# CONFIG_MXC6255 is not set +# CONFIG_SCA3000 is not set +# CONFIG_SCA3300 is not set +# CONFIG_STK8312 is not set +# CONFIG_STK8BA50 is not set +# end of Accelerometers + +# +# Analog to digital converters +# +# CONFIG_AD4130 is not set +# CONFIG_AD7091R5 is not set +# CONFIG_AD7124 is not set +# CONFIG_AD7192 is not set +# CONFIG_AD7266 is not set +# CONFIG_AD7280 is not set +# CONFIG_AD7291 is not set +# CONFIG_AD7292 is not set +# CONFIG_AD7298 is not set +# CONFIG_AD7476 is not set +# CONFIG_AD7606_IFACE_PARALLEL is not set +# CONFIG_AD7606_IFACE_SPI is not set +# CONFIG_AD7766 is not set +# CONFIG_AD7768_1 is not set +# CONFIG_AD7780 is not set +# CONFIG_AD7791 is not set +# CONFIG_AD7793 is not set +# CONFIG_AD7887 is not set +# CONFIG_AD7923 is not set +# CONFIG_AD7949 is not set +# CONFIG_AD799X is not set +# CONFIG_ADI_AXI_ADC is not set +CONFIG_AXP20X_ADC=m +CONFIG_AXP288_ADC=m +# CONFIG_CC10001_ADC is not set +# CONFIG_ENVELOPE_DETECTOR is not set +# CONFIG_HI8435 is not set +# CONFIG_HX711 is not set +# CONFIG_INA2XX_ADC is not set +# CONFIG_LTC2471 is not set +# CONFIG_LTC2485 is not set +# CONFIG_LTC2496 is not set +# CONFIG_LTC2497 is not set +# CONFIG_MAX1027 is not set +# CONFIG_MAX11100 is not set +# CONFIG_MAX1118 is not set +# CONFIG_MAX11205 is not set +# CONFIG_MAX11410 is not set +# CONFIG_MAX1241 is not set +# CONFIG_MAX1363 is not set +# CONFIG_MAX9611 is not set +# CONFIG_MCP320X is not set +# CONFIG_MCP3422 is not set +# CONFIG_MCP3911 is not set +CONFIG_MESON_SARADC=m +# CONFIG_NAU7802 is not set +CONFIG_QCOM_VADC_COMMON=m +# CONFIG_QCOM_SPMI_RRADC is not set +CONFIG_QCOM_SPMI_IADC=m +CONFIG_QCOM_SPMI_VADC=m +# CONFIG_QCOM_SPMI_ADC5 is not set +CONFIG_ROCKCHIP_SARADC=m +# CONFIG_RICHTEK_RTQ6056 is not set +# CONFIG_SD_ADC_MODULATOR is not set +# CONFIG_SUN20I_GPADC is not set +# CONFIG_TI_ADC081C is not set +# CONFIG_TI_ADC0832 is not set +# CONFIG_TI_ADC084S021 is not set +# CONFIG_TI_ADC12138 is not set +# CONFIG_TI_ADC108S102 is not set +# CONFIG_TI_ADC128S052 is not set +# CONFIG_TI_ADC161S626 is not set +# CONFIG_TI_ADS1015 is not set +# CONFIG_TI_ADS7924 is not set +# CONFIG_TI_ADS1100 is not set +# CONFIG_TI_ADS7950 is not set +# CONFIG_TI_ADS8344 is not set +# CONFIG_TI_ADS8688 is not set +# CONFIG_TI_ADS124S08 is not set +# CONFIG_TI_ADS131E08 is not set +# CONFIG_TI_LMP92064 is not set +# CONFIG_TI_TLC4541 is not set +# CONFIG_TI_TSC2046 is not set +# CONFIG_VF610_ADC is not set +CONFIG_VIPERBOARD_ADC=m +# CONFIG_XILINX_XADC is not set +# CONFIG_XILINX_AMS is not set +# end of Analog to digital converters + +# +# Analog to digital and digital to analog converters +# +# CONFIG_AD74115 is not set +# CONFIG_AD74413R is not set +# end of Analog to digital and digital to analog converters + +# +# Analog Front Ends +# +# CONFIG_IIO_RESCALE is not set +# end of Analog Front Ends + +# +# Amplifiers +# +# CONFIG_AD8366 is not set +# CONFIG_ADA4250 is not set +# CONFIG_HMC425 is not set +# end of Amplifiers + +# +# Capacitance to digital converters +# +# CONFIG_AD7150 is not set +# CONFIG_AD7746 is not set +# end of Capacitance to digital converters + +# +# Chemical Sensors +# +# CONFIG_ATLAS_PH_SENSOR is not set +# CONFIG_ATLAS_EZO_SENSOR is not set +# CONFIG_BME680 is not set +# CONFIG_CCS811 is not set +# CONFIG_IAQCORE is not set +# CONFIG_PMS7003 is not set +# CONFIG_SCD30_CORE is not set +# CONFIG_SCD4X is not set +# CONFIG_SENSIRION_SGP30 is not set +# CONFIG_SENSIRION_SGP40 is not set +# CONFIG_SPS30_I2C is not set +# CONFIG_SPS30_SERIAL is not set +# CONFIG_SENSEAIR_SUNRISE_CO2 is not set +# CONFIG_VZ89X is not set +# end of Chemical Sensors + +CONFIG_IIO_CROS_EC_SENSORS_CORE=m +CONFIG_IIO_CROS_EC_SENSORS=m +# CONFIG_IIO_CROS_EC_SENSORS_LID_ANGLE is not set + +# +# Hid Sensor IIO Common +# +CONFIG_HID_SENSOR_IIO_COMMON=m +CONFIG_HID_SENSOR_IIO_TRIGGER=m +# end of Hid Sensor IIO Common + +# +# IIO SCMI Sensors +# +# end of IIO SCMI Sensors + +# +# SSP Sensor Common +# +# CONFIG_IIO_SSP_SENSORHUB is not set +# end of SSP Sensor Common + +# +# Digital to analog converters +# +# CONFIG_AD3552R is not set +# CONFIG_AD5064 is not set +# CONFIG_AD5360 is not set +# CONFIG_AD5380 is not set +# CONFIG_AD5421 is not set +CONFIG_AD5446=m +# CONFIG_AD5449 is not set +# CONFIG_AD5592R is not set +# CONFIG_AD5593R is not set +# CONFIG_AD5504 is not set +# CONFIG_AD5624R_SPI is not set +# CONFIG_LTC2688 is not set +# CONFIG_AD5686_SPI is not set +# CONFIG_AD5696_I2C is not set +# CONFIG_AD5755 is not set +# CONFIG_AD5758 is not set +# CONFIG_AD5761 is not set +# CONFIG_AD5764 is not set +# CONFIG_AD5766 is not set +# CONFIG_AD5770R is not set +# CONFIG_AD5791 is not set +# CONFIG_AD7293 is not set +# CONFIG_AD7303 is not set +# CONFIG_AD8801 is not set +# CONFIG_DPOT_DAC is not set +# CONFIG_DS4424 is not set +# CONFIG_LTC1660 is not set +# CONFIG_LTC2632 is not set +# CONFIG_M62332 is not set +# CONFIG_MAX517 is not set +# CONFIG_MAX5522 is not set +# CONFIG_MAX5821 is not set +# CONFIG_MCP4725 is not set +# CONFIG_MCP4728 is not set +# CONFIG_MCP4922 is not set +# CONFIG_TI_DAC082S085 is not set +# CONFIG_TI_DAC5571 is not set +# CONFIG_TI_DAC7311 is not set +# CONFIG_TI_DAC7612 is not set +# CONFIG_VF610_DAC is not set +# end of Digital to analog converters + +# +# IIO dummy driver +# +# end of IIO dummy driver + +# +# Filters +# +# CONFIG_ADMV8818 is not set +# end of Filters + +# +# Frequency Synthesizers DDS/PLL +# + +# +# Clock Generator/Distribution +# +# CONFIG_AD9523 is not set +# end of Clock Generator/Distribution + +# +# Phase-Locked Loop (PLL) frequency synthesizers +# +# CONFIG_ADF4350 is not set +# CONFIG_ADF4371 is not set +# CONFIG_ADF4377 is not set +# CONFIG_ADMV1013 is not set +# CONFIG_ADMV1014 is not set +# CONFIG_ADMV4420 is not set +# CONFIG_ADRF6780 is not set +# end of Phase-Locked Loop (PLL) frequency synthesizers +# end of Frequency Synthesizers DDS/PLL + +# +# Digital gyroscope sensors +# +# CONFIG_ADIS16080 is not set +# CONFIG_ADIS16130 is not set +# CONFIG_ADIS16136 is not set +# CONFIG_ADIS16260 is not set +# CONFIG_ADXRS290 is not set +# CONFIG_ADXRS450 is not set +# CONFIG_BMG160 is not set +# CONFIG_FXAS21002C is not set +CONFIG_HID_SENSOR_GYRO_3D=m +# CONFIG_MPU3050_I2C is not set +# CONFIG_IIO_ST_GYRO_3AXIS is not set +# CONFIG_ITG3200 is not set +# end of Digital gyroscope sensors + +# +# Health Sensors +# + +# +# Heart Rate Monitors +# +# CONFIG_AFE4403 is not set +# CONFIG_AFE4404 is not set +# CONFIG_MAX30100 is not set +# CONFIG_MAX30102 is not set +# end of Heart Rate Monitors +# end of Health Sensors + +# +# Humidity sensors +# +# CONFIG_AM2315 is not set +CONFIG_DHT11=m +# CONFIG_HDC100X is not set +# CONFIG_HDC2010 is not set +# CONFIG_HID_SENSOR_HUMIDITY is not set +# CONFIG_HTS221 is not set +# CONFIG_HTU21 is not set +# CONFIG_SI7005 is not set +# CONFIG_SI7020 is not set +# end of Humidity sensors + +# +# Inertial measurement units +# +# CONFIG_ADIS16400 is not set +# CONFIG_ADIS16460 is not set +# CONFIG_ADIS16475 is not set +# CONFIG_ADIS16480 is not set +# CONFIG_BMI160_I2C is not set +# CONFIG_BMI160_SPI is not set +# CONFIG_BOSCH_BNO055_SERIAL is not set +# CONFIG_BOSCH_BNO055_I2C is not set +# CONFIG_FXOS8700_I2C is not set +# CONFIG_FXOS8700_SPI is not set +# CONFIG_KMX61 is not set +# CONFIG_INV_ICM42600_I2C is not set +# CONFIG_INV_ICM42600_SPI is not set +# CONFIG_INV_MPU6050_I2C is not set +# CONFIG_INV_MPU6050_SPI is not set +# CONFIG_IIO_ST_LSM6DSX is not set +# CONFIG_IIO_ST_LSM9DS0 is not set +# end of Inertial measurement units + +# +# Light sensors +# +CONFIG_ACPI_ALS=m +# CONFIG_ADJD_S311 is not set +# CONFIG_ADUX1020 is not set +# CONFIG_AL3010 is not set +# CONFIG_AL3320A is not set +# CONFIG_APDS9300 is not set +# CONFIG_APDS9960 is not set +# CONFIG_AS73211 is not set +# CONFIG_BH1750 is not set +CONFIG_BH1780=m +# CONFIG_CM32181 is not set +# CONFIG_CM3232 is not set +# CONFIG_CM3323 is not set +# CONFIG_CM3605 is not set +# CONFIG_CM36651 is not set +CONFIG_IIO_CROS_EC_LIGHT_PROX=m +# CONFIG_GP2AP002 is not set +# CONFIG_GP2AP020A00F is not set +# CONFIG_SENSORS_ISL29018 is not set +# CONFIG_SENSORS_ISL29028 is not set +# CONFIG_ISL29125 is not set +CONFIG_HID_SENSOR_ALS=m +CONFIG_HID_SENSOR_PROX=m +# CONFIG_JSA1212 is not set +# CONFIG_ROHM_BU27008 is not set +# CONFIG_ROHM_BU27034 is not set +# CONFIG_RPR0521 is not set +# CONFIG_LTR501 is not set +# CONFIG_LTRF216A is not set +# CONFIG_LV0104CS is not set +# CONFIG_MAX44000 is not set +# CONFIG_MAX44009 is not set +# CONFIG_NOA1305 is not set +# CONFIG_OPT3001 is not set +# CONFIG_OPT4001 is not set +# CONFIG_PA12203001 is not set +# CONFIG_SI1133 is not set +# CONFIG_SI1145 is not set +# CONFIG_STK3310 is not set +# CONFIG_ST_UVIS25 is not set +# CONFIG_TCS3414 is not set +# CONFIG_TCS3472 is not set +# CONFIG_SENSORS_TSL2563 is not set +# CONFIG_TSL2583 is not set +# CONFIG_TSL2591 is not set +# CONFIG_TSL2772 is not set +# CONFIG_TSL4531 is not set +# CONFIG_US5182D is not set +# CONFIG_VCNL4000 is not set +# CONFIG_VCNL4035 is not set +# CONFIG_VEML6030 is not set +# CONFIG_VEML6070 is not set +# CONFIG_VL6180 is not set +# CONFIG_ZOPT2201 is not set +# end of Light sensors + +# +# Magnetometer sensors +# +# CONFIG_AK8974 is not set +# CONFIG_AK8975 is not set +# CONFIG_AK09911 is not set +# CONFIG_BMC150_MAGN_I2C is not set +# CONFIG_BMC150_MAGN_SPI is not set +# CONFIG_MAG3110 is not set +CONFIG_HID_SENSOR_MAGNETOMETER_3D=m +# CONFIG_MMC35240 is not set +# CONFIG_IIO_ST_MAGN_3AXIS is not set +# CONFIG_SENSORS_HMC5843_I2C is not set +# CONFIG_SENSORS_HMC5843_SPI is not set +# CONFIG_SENSORS_RM3100_I2C is not set +# CONFIG_SENSORS_RM3100_SPI is not set +# CONFIG_TI_TMAG5273 is not set +# CONFIG_YAMAHA_YAS530 is not set +# end of Magnetometer sensors + +# +# Multiplexers +# +# CONFIG_IIO_MUX is not set +# end of Multiplexers + +# +# Inclinometer sensors +# +CONFIG_HID_SENSOR_INCLINOMETER_3D=m +CONFIG_HID_SENSOR_DEVICE_ROTATION=m +# end of Inclinometer sensors + +# +# Triggers - standalone +# +# CONFIG_IIO_INTERRUPT_TRIGGER is not set +# CONFIG_IIO_SYSFS_TRIGGER is not set +# end of Triggers - standalone + +# +# Linear and angular position sensors +# +# CONFIG_HID_SENSOR_CUSTOM_INTEL_HINGE is not set +# end of Linear and angular position sensors + +# +# Digital potentiometers +# +# CONFIG_AD5110 is not set +# CONFIG_AD5272 is not set +# CONFIG_DS1803 is not set +# CONFIG_MAX5432 is not set +# CONFIG_MAX5481 is not set +# CONFIG_MAX5487 is not set +# CONFIG_MCP4018 is not set +# CONFIG_MCP4131 is not set +# CONFIG_MCP4531 is not set +# CONFIG_MCP41010 is not set +# CONFIG_TPL0102 is not set +# CONFIG_X9250 is not set +# end of Digital potentiometers + +# +# Digital potentiostats +# +# CONFIG_LMP91000 is not set +# end of Digital potentiostats + +# +# Pressure sensors +# +# CONFIG_ABP060MG is not set +# CONFIG_BMP280 is not set +CONFIG_IIO_CROS_EC_BARO=m +# CONFIG_DLHL60D is not set +# CONFIG_DPS310 is not set +CONFIG_HID_SENSOR_PRESS=m +# CONFIG_HP03 is not set +# CONFIG_ICP10100 is not set +# CONFIG_MPL115_I2C is not set +# CONFIG_MPL115_SPI is not set +# CONFIG_MPL3115 is not set +# CONFIG_MPRLS0025PA is not set +# CONFIG_MS5611 is not set +# CONFIG_MS5637 is not set +# CONFIG_IIO_ST_PRESS is not set +# CONFIG_T5403 is not set +# CONFIG_HP206C is not set +# CONFIG_ZPA2326 is not set +# end of Pressure sensors + +# +# Lightning sensors +# +# CONFIG_AS3935 is not set +# end of Lightning sensors + +# +# Proximity and distance sensors +# +# CONFIG_CROS_EC_MKBP_PROXIMITY is not set +# CONFIG_IRSD200 is not set +# CONFIG_ISL29501 is not set +# CONFIG_LIDAR_LITE_V2 is not set +# CONFIG_MB1232 is not set +# CONFIG_PING is not set +# CONFIG_RFD77402 is not set +# CONFIG_SRF04 is not set +# CONFIG_SX9310 is not set +# CONFIG_SX9324 is not set +# CONFIG_SX9360 is not set +# CONFIG_SX9500 is not set +# CONFIG_SRF08 is not set +# CONFIG_VCNL3020 is not set +# CONFIG_VL53L0X_I2C is not set +# end of Proximity and distance sensors + +# +# Resolver to digital converters +# +# CONFIG_AD2S90 is not set +# CONFIG_AD2S1200 is not set +# end of Resolver to digital converters + +# +# Temperature sensors +# +# CONFIG_LTC2983 is not set +# CONFIG_MAXIM_THERMOCOUPLE is not set +# CONFIG_HID_SENSOR_TEMP is not set +# CONFIG_MLX90614 is not set +# CONFIG_MLX90632 is not set +# CONFIG_TMP006 is not set +# CONFIG_TMP007 is not set +# CONFIG_TMP117 is not set +# CONFIG_TSYS01 is not set +# CONFIG_TSYS02D is not set +# CONFIG_MAX30208 is not set +# CONFIG_MAX31856 is not set +# CONFIG_MAX31865 is not set +# end of Temperature sensors + +# CONFIG_NTB is not set +CONFIG_PWM=y +CONFIG_PWM_SYSFS=y +# CONFIG_PWM_DEBUG is not set +# CONFIG_PWM_ATMEL_TCB is not set +# CONFIG_PWM_CLK is not set +CONFIG_PWM_CROS_EC=m +# CONFIG_PWM_DWC is not set +# CONFIG_PWM_FSL_FTM is not set +# CONFIG_PWM_HIBVT is not set +CONFIG_PWM_MESON=m +# CONFIG_PWM_PCA9685 is not set +CONFIG_PWM_ROCKCHIP=m +CONFIG_PWM_SUN4I=m +CONFIG_PWM_TEGRA=m +# CONFIG_PWM_XILINX is not set + +# +# IRQ chip support +# +CONFIG_IRQCHIP=y +CONFIG_ARM_GIC=y +CONFIG_ARM_GIC_PM=y +CONFIG_ARM_GIC_MAX_NR=1 +CONFIG_ARM_GIC_V2M=y +CONFIG_ARM_GIC_V3=y +CONFIG_ARM_GIC_V3_ITS=y +CONFIG_ARM_GIC_V3_ITS_PCI=y +# CONFIG_AL_FIC is not set +CONFIG_HISILICON_IRQ_MBIGEN=y +CONFIG_SUN6I_R_INTC=y +CONFIG_SUNXI_NMI_INTC=y +# CONFIG_XILINX_INTC is not set +CONFIG_MVEBU_GICP=y +CONFIG_MVEBU_ICU=y +CONFIG_MVEBU_ODMI=y +CONFIG_MVEBU_PIC=y +CONFIG_MVEBU_SEI=y +CONFIG_PARTITION_PERCPU=y +CONFIG_QCOM_IRQ_COMBINER=y +CONFIG_MESON_IRQ_GPIO=y +# CONFIG_QCOM_PDC is not set +# CONFIG_QCOM_MPM is not set +# end of IRQ chip support + +# CONFIG_IPACK_BUS is not set +CONFIG_ARCH_HAS_RESET_CONTROLLER=y +CONFIG_RESET_CONTROLLER=y +CONFIG_RESET_MESON=y +# CONFIG_RESET_MESON_AUDIO_ARB is not set +# CONFIG_RESET_QCOM_AOSS is not set +# CONFIG_RESET_QCOM_PDC is not set +CONFIG_RESET_SIMPLE=y +CONFIG_RESET_SUNXI=y +# CONFIG_RESET_TI_SYSCON is not set +# CONFIG_RESET_TI_TPS380X is not set +CONFIG_COMMON_RESET_HI3660=y +CONFIG_COMMON_RESET_HI6220=y + +# +# PHY Subsystem +# +CONFIG_GENERIC_PHY=y +CONFIG_GENERIC_PHY_MIPI_DPHY=y +CONFIG_PHY_XGENE=m +# CONFIG_PHY_CAN_TRANSCEIVER is not set +CONFIG_PHY_SUN4I_USB=m +# CONFIG_PHY_SUN6I_MIPI_DPHY is not set +# CONFIG_PHY_SUN9I_USB is not set +# CONFIG_PHY_SUN50I_USB3 is not set +CONFIG_PHY_MESON8B_USB2=m +CONFIG_PHY_MESON_GXL_USB2=y +CONFIG_PHY_MESON_G12A_MIPI_DPHY_ANALOG=y +CONFIG_PHY_MESON_G12A_USB2=y +CONFIG_PHY_MESON_G12A_USB3_PCIE=y +CONFIG_PHY_MESON_AXG_PCIE=y +CONFIG_PHY_MESON_AXG_MIPI_PCIE_ANALOG=y +CONFIG_PHY_MESON_AXG_MIPI_DPHY=y + +# +# PHY drivers for Broadcom platforms +# +# CONFIG_BCM_KONA_USB2_PHY is not set +# end of PHY drivers for Broadcom platforms + +# CONFIG_PHY_CADENCE_TORRENT is not set +# CONFIG_PHY_CADENCE_DPHY is not set +# CONFIG_PHY_CADENCE_DPHY_RX is not set +# CONFIG_PHY_CADENCE_SIERRA is not set +# CONFIG_PHY_CADENCE_SALVO is not set +CONFIG_PHY_HI6220_USB=m +# CONFIG_PHY_HI3660_USB is not set +# CONFIG_PHY_HI3670_USB is not set +# CONFIG_PHY_HI3670_PCIE is not set +# CONFIG_PHY_HISTB_COMBPHY is not set +# CONFIG_PHY_HISI_INNO_USB2 is not set +CONFIG_PHY_MVEBU_A3700_COMPHY=y +CONFIG_PHY_MVEBU_A3700_UTMI=y +# CONFIG_PHY_MVEBU_A38X_COMPHY is not set +CONFIG_PHY_MVEBU_CP110_COMPHY=m +# CONFIG_PHY_MVEBU_CP110_UTMI is not set +# CONFIG_PHY_PXA_28NM_HSIC is not set +# CONFIG_PHY_PXA_28NM_USB2 is not set +# CONFIG_PHY_LAN966X_SERDES is not set +# CONFIG_PHY_CPCAP_USB is not set +# CONFIG_PHY_MAPPHONE_MDM6600 is not set +# CONFIG_PHY_OCELOT_SERDES is not set +CONFIG_PHY_QCOM_APQ8064_SATA=m +# CONFIG_PHY_QCOM_EDP is not set +# CONFIG_PHY_QCOM_IPQ4019_USB is not set +CONFIG_PHY_QCOM_IPQ806X_SATA=m +# CONFIG_PHY_QCOM_PCIE2 is not set +CONFIG_PHY_QCOM_QMP=m +CONFIG_PHY_QCOM_QMP_COMBO=m +CONFIG_PHY_QCOM_QMP_PCIE=m +CONFIG_PHY_QCOM_QMP_PCIE_8996=m +CONFIG_PHY_QCOM_QMP_UFS=m +CONFIG_PHY_QCOM_QMP_USB=m +# CONFIG_PHY_QCOM_QMP_USB_LEGACY is not set +CONFIG_PHY_QCOM_QUSB2=m +# CONFIG_PHY_QCOM_SNPS_EUSB2 is not set +# CONFIG_PHY_QCOM_EUSB2_REPEATER is not set +# CONFIG_PHY_QCOM_M31_USB is not set +CONFIG_PHY_QCOM_USB_HS=m +# CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2 is not set +CONFIG_PHY_QCOM_USB_HSIC=m +# CONFIG_PHY_QCOM_USB_HS_28NM is not set +# CONFIG_PHY_QCOM_USB_SS is not set +# CONFIG_PHY_QCOM_IPQ806X_USB is not set +# CONFIG_PHY_QCOM_SGMII_ETH is not set +CONFIG_PHY_ROCKCHIP_DP=m +# CONFIG_PHY_ROCKCHIP_DPHY_RX0 is not set +CONFIG_PHY_ROCKCHIP_EMMC=m +CONFIG_PHY_ROCKCHIP_INNO_HDMI=m +CONFIG_PHY_ROCKCHIP_INNO_USB2=m +# CONFIG_PHY_ROCKCHIP_INNO_CSIDPHY is not set +# CONFIG_PHY_ROCKCHIP_INNO_DSIDPHY is not set +# CONFIG_PHY_ROCKCHIP_NANENG_COMBO_PHY is not set +CONFIG_PHY_ROCKCHIP_PCIE=m +# CONFIG_PHY_ROCKCHIP_SNPS_PCIE3 is not set +CONFIG_PHY_ROCKCHIP_TYPEC=m +CONFIG_PHY_ROCKCHIP_USB=m +# CONFIG_PHY_SAMSUNG_USB2 is not set +CONFIG_PHY_TEGRA_XUSB=m +# CONFIG_PHY_TUSB1210 is not set +# CONFIG_PHY_XILINX_ZYNQMP is not set +# end of PHY Subsystem + +# CONFIG_POWERCAP is not set +# CONFIG_MCB is not set + +# +# Performance monitor support +# +CONFIG_ARM_CCI_PMU=y +CONFIG_ARM_CCI400_PMU=y +CONFIG_ARM_CCI5xx_PMU=y +CONFIG_ARM_CCN=y +CONFIG_ARM_CMN=m +CONFIG_ARM_PMU=y +CONFIG_ARM_PMU_ACPI=y +CONFIG_ARM_SMMU_V3_PMU=y +CONFIG_ARM_PMUV3=y +CONFIG_ARM_DSU_PMU=m +CONFIG_QCOM_L2_PMU=y +CONFIG_QCOM_L3_PMU=y +CONFIG_THUNDERX2_PMU=m +CONFIG_XGENE_PMU=y +CONFIG_ARM_SPE_PMU=m +CONFIG_ARM_DMC620_PMU=y +# CONFIG_MARVELL_CN10K_TAD_PMU is not set +# CONFIG_ALIBABA_UNCORE_DRW_PMU is not set +CONFIG_HISI_PMU=y +# CONFIG_HISI_PCIE_PMU is not set +# CONFIG_HNS3_PMU is not set +# CONFIG_MARVELL_CN10K_DDR_PMU is not set +CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m +CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m +# CONFIG_MESON_DDR_PMU is not set +# end of Performance monitor support + +CONFIG_RAS=y +# CONFIG_USB4 is not set + +# +# Android +# +# CONFIG_ANDROID_BINDER_IPC is not set +# end of Android + +CONFIG_LIBNVDIMM=m +CONFIG_BLK_DEV_PMEM=m +CONFIG_ND_CLAIM=y +CONFIG_ND_BTT=m +CONFIG_BTT=y +CONFIG_OF_PMEM=m +CONFIG_DAX=y +CONFIG_DEV_DAX=m +CONFIG_DEV_DAX_KMEM=m +CONFIG_NVMEM=y +CONFIG_NVMEM_SYSFS=y + +# +# Layout Types +# +# CONFIG_NVMEM_LAYOUT_SL28_VPD is not set +# CONFIG_NVMEM_LAYOUT_ONIE_TLV is not set +# end of Layout Types + +# CONFIG_NVMEM_MESON_EFUSE is not set +# CONFIG_NVMEM_MESON_MX_EFUSE is not set +# CONFIG_NVMEM_QCOM_QFPROM is not set +# CONFIG_NVMEM_QCOM_SEC_QFPROM is not set +# CONFIG_NVMEM_RMEM is not set +# CONFIG_NVMEM_ROCKCHIP_EFUSE is not set +# CONFIG_NVMEM_ROCKCHIP_OTP is not set +# CONFIG_NVMEM_SPMI_SDAM is not set +CONFIG_NVMEM_SUNXI_SID=m +# CONFIG_NVMEM_U_BOOT_ENV is not set +# CONFIG_NVMEM_ZYNQMP is not set + +# +# HW tracing support +# +# CONFIG_STM is not set +# CONFIG_INTEL_TH is not set +# CONFIG_HISI_PTT is not set +# end of HW tracing support + +# CONFIG_FPGA is not set +# CONFIG_FSI is not set +CONFIG_TEE=m +CONFIG_OPTEE=m +# CONFIG_OPTEE_INSECURE_LOAD_IMAGE is not set +CONFIG_PM_OPP=y +# CONFIG_SIOX is not set +# CONFIG_SLIMBUS is not set +CONFIG_INTERCONNECT=y +# CONFIG_INTERCONNECT_QCOM is not set +# CONFIG_COUNTER is not set +# CONFIG_MOST is not set +# CONFIG_PECI is not set +# CONFIG_HTE is not set +# CONFIG_CDX_BUS is not set +# end of Device Drivers + +# +# File systems +# +CONFIG_DCACHE_WORD_ACCESS=y +# CONFIG_VALIDATE_FS_PARSER is not set +CONFIG_FS_IOMAP=y +CONFIG_BUFFER_HEAD=y +CONFIG_LEGACY_DIRECT_IO=y +# CONFIG_EXT2_FS is not set +# CONFIG_EXT3_FS is not set +CONFIG_EXT4_FS=m +CONFIG_EXT4_USE_FOR_EXT2=y +CONFIG_EXT4_FS_POSIX_ACL=y +CONFIG_EXT4_FS_SECURITY=y +# CONFIG_EXT4_DEBUG is not set +CONFIG_JBD2=m +# CONFIG_JBD2_DEBUG is not set +CONFIG_FS_MBCACHE=m +CONFIG_REISERFS_FS=m +# CONFIG_REISERFS_CHECK is not set +# CONFIG_REISERFS_PROC_INFO is not set +CONFIG_REISERFS_FS_XATTR=y +CONFIG_REISERFS_FS_POSIX_ACL=y +CONFIG_REISERFS_FS_SECURITY=y +CONFIG_JFS_FS=m +CONFIG_JFS_POSIX_ACL=y +CONFIG_JFS_SECURITY=y +# CONFIG_JFS_DEBUG is not set +# CONFIG_JFS_STATISTICS is not set +CONFIG_XFS_FS=m +CONFIG_XFS_SUPPORT_V4=y +CONFIG_XFS_SUPPORT_ASCII_CI=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_XFS_RT=y +# CONFIG_XFS_ONLINE_SCRUB is not set +# CONFIG_XFS_WARN is not set +# CONFIG_XFS_DEBUG is not set +CONFIG_GFS2_FS=m +CONFIG_GFS2_FS_LOCKING_DLM=y +CONFIG_OCFS2_FS=m +CONFIG_OCFS2_FS_O2CB=m +CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m +CONFIG_OCFS2_FS_STATS=y +CONFIG_OCFS2_DEBUG_MASKLOG=y +# CONFIG_OCFS2_DEBUG_FS is not set +CONFIG_BTRFS_FS=m +CONFIG_BTRFS_FS_POSIX_ACL=y +# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set +# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set +# CONFIG_BTRFS_DEBUG is not set +# CONFIG_BTRFS_ASSERT is not set +# CONFIG_BTRFS_FS_REF_VERIFY is not set +CONFIG_NILFS2_FS=m +CONFIG_F2FS_FS=m +CONFIG_F2FS_STAT_FS=y +CONFIG_F2FS_FS_XATTR=y +CONFIG_F2FS_FS_POSIX_ACL=y +CONFIG_F2FS_FS_SECURITY=y +# CONFIG_F2FS_CHECK_FS is not set +# CONFIG_F2FS_FAULT_INJECTION is not set +# CONFIG_F2FS_FS_COMPRESSION is not set +CONFIG_F2FS_IOSTAT=y +# CONFIG_F2FS_UNFAIR_RWSEM is not set +# CONFIG_ZONEFS_FS is not set +CONFIG_FS_POSIX_ACL=y +CONFIG_EXPORTFS=y +CONFIG_EXPORTFS_BLOCK_OPS=y +CONFIG_FILE_LOCKING=y +# CONFIG_FS_ENCRYPTION is not set +# CONFIG_FS_VERITY is not set +CONFIG_FSNOTIFY=y +CONFIG_DNOTIFY=y +CONFIG_INOTIFY_USER=y +CONFIG_FANOTIFY=y +CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y +CONFIG_QUOTA=y +CONFIG_QUOTA_NETLINK_INTERFACE=y +# CONFIG_QUOTA_DEBUG is not set +CONFIG_QUOTA_TREE=m +CONFIG_QFMT_V1=m +CONFIG_QFMT_V2=m +CONFIG_QUOTACTL=y +CONFIG_AUTOFS_FS=m +CONFIG_FUSE_FS=m +CONFIG_CUSE=m +# CONFIG_VIRTIO_FS is not set +CONFIG_OVERLAY_FS=m +# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set +CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y +# CONFIG_OVERLAY_FS_INDEX is not set +# CONFIG_OVERLAY_FS_XINO_AUTO is not set +# CONFIG_OVERLAY_FS_METACOPY is not set +# CONFIG_OVERLAY_FS_DEBUG is not set + +# +# Caches +# +CONFIG_NETFS_SUPPORT=m +CONFIG_NETFS_STATS=y +CONFIG_FSCACHE=m +CONFIG_FSCACHE_STATS=y +# CONFIG_FSCACHE_DEBUG is not set +CONFIG_CACHEFILES=m +# CONFIG_CACHEFILES_DEBUG is not set +# CONFIG_CACHEFILES_ERROR_INJECTION is not set +# CONFIG_CACHEFILES_ONDEMAND is not set +# end of Caches + +# +# CD-ROM/DVD Filesystems +# +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +CONFIG_ZISOFS=y +CONFIG_UDF_FS=m +# end of CD-ROM/DVD Filesystems + +# +# DOS/FAT/EXFAT/NT Filesystems +# +CONFIG_FAT_FS=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=y +CONFIG_FAT_DEFAULT_CODEPAGE=437 +CONFIG_FAT_DEFAULT_IOCHARSET="ascii" +CONFIG_FAT_DEFAULT_UTF8=y +# CONFIG_EXFAT_FS is not set +CONFIG_NTFS_FS=m +# CONFIG_NTFS_DEBUG is not set +# CONFIG_NTFS_RW is not set +# CONFIG_NTFS3_FS is not set +# end of DOS/FAT/EXFAT/NT Filesystems + +# +# Pseudo filesystems +# +CONFIG_PROC_FS=y +CONFIG_PROC_KCORE=y +CONFIG_PROC_VMCORE=y +# CONFIG_PROC_VMCORE_DEVICE_DUMP is not set +CONFIG_PROC_SYSCTL=y +CONFIG_PROC_PAGE_MONITOR=y +CONFIG_PROC_CHILDREN=y +CONFIG_KERNFS=y +CONFIG_SYSFS=y +CONFIG_TMPFS=y +CONFIG_TMPFS_POSIX_ACL=y +CONFIG_TMPFS_XATTR=y +# CONFIG_TMPFS_INODE64 is not set +# CONFIG_TMPFS_QUOTA is not set +CONFIG_ARCH_SUPPORTS_HUGETLBFS=y +CONFIG_HUGETLBFS=y +CONFIG_HUGETLB_PAGE=y +CONFIG_ARCH_HAS_GIGANTIC_PAGE=y +CONFIG_CONFIGFS_FS=m +CONFIG_EFIVAR_FS=m +# end of Pseudo filesystems + +CONFIG_MISC_FILESYSTEMS=y +CONFIG_ORANGEFS_FS=m +CONFIG_ADFS_FS=m +# CONFIG_ADFS_FS_RW is not set +CONFIG_AFFS_FS=m +CONFIG_ECRYPT_FS=m +CONFIG_ECRYPT_FS_MESSAGING=y +CONFIG_HFS_FS=m +CONFIG_HFSPLUS_FS=m +CONFIG_BEFS_FS=m +# CONFIG_BEFS_DEBUG is not set +CONFIG_BFS_FS=m +CONFIG_EFS_FS=m +CONFIG_JFFS2_FS=m +CONFIG_JFFS2_FS_DEBUG=0 +CONFIG_JFFS2_FS_WRITEBUFFER=y +# CONFIG_JFFS2_FS_WBUF_VERIFY is not set +CONFIG_JFFS2_SUMMARY=y +CONFIG_JFFS2_FS_XATTR=y +CONFIG_JFFS2_FS_POSIX_ACL=y +CONFIG_JFFS2_FS_SECURITY=y +CONFIG_JFFS2_COMPRESSION_OPTIONS=y +CONFIG_JFFS2_ZLIB=y +CONFIG_JFFS2_LZO=y +CONFIG_JFFS2_RTIME=y +# CONFIG_JFFS2_RUBIN is not set +# CONFIG_JFFS2_CMODE_NONE is not set +CONFIG_JFFS2_CMODE_PRIORITY=y +# CONFIG_JFFS2_CMODE_SIZE is not set +# CONFIG_JFFS2_CMODE_FAVOURLZO is not set +CONFIG_UBIFS_FS=m +CONFIG_UBIFS_FS_ADVANCED_COMPR=y +CONFIG_UBIFS_FS_LZO=y +CONFIG_UBIFS_FS_ZLIB=y +CONFIG_UBIFS_FS_ZSTD=y +# CONFIG_UBIFS_ATIME_SUPPORT is not set +CONFIG_UBIFS_FS_XATTR=y +CONFIG_UBIFS_FS_SECURITY=y +# CONFIG_UBIFS_FS_AUTHENTICATION is not set +# CONFIG_CRAMFS is not set +CONFIG_SQUASHFS=m +CONFIG_SQUASHFS_FILE_CACHE=y +# CONFIG_SQUASHFS_FILE_DIRECT is not set +CONFIG_SQUASHFS_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_CHOICE_DECOMP_BY_MOUNT is not set +CONFIG_SQUASHFS_COMPILE_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI is not set +# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI_PERCPU is not set +CONFIG_SQUASHFS_XATTR=y +CONFIG_SQUASHFS_ZLIB=y +CONFIG_SQUASHFS_LZ4=y +CONFIG_SQUASHFS_LZO=y +CONFIG_SQUASHFS_XZ=y +CONFIG_SQUASHFS_ZSTD=y +# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set +# CONFIG_SQUASHFS_EMBEDDED is not set +CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3 +CONFIG_VXFS_FS=m +CONFIG_MINIX_FS=m +CONFIG_OMFS_FS=m +CONFIG_HPFS_FS=m +CONFIG_QNX4FS_FS=m +CONFIG_QNX6FS_FS=m +# CONFIG_QNX6FS_DEBUG is not set +CONFIG_ROMFS_FS=m +# CONFIG_ROMFS_BACKED_BY_BLOCK is not set +# CONFIG_ROMFS_BACKED_BY_MTD is not set +CONFIG_ROMFS_BACKED_BY_BOTH=y +CONFIG_ROMFS_ON_BLOCK=y +CONFIG_ROMFS_ON_MTD=y +CONFIG_PSTORE=y +CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240 +CONFIG_PSTORE_COMPRESS=y +# CONFIG_PSTORE_CONSOLE is not set +# CONFIG_PSTORE_PMSG is not set +# CONFIG_PSTORE_FTRACE is not set +CONFIG_PSTORE_RAM=m +# CONFIG_PSTORE_BLK is not set +CONFIG_SYSV_FS=m +CONFIG_UFS_FS=m +# CONFIG_UFS_FS_WRITE is not set +# CONFIG_UFS_DEBUG is not set +# CONFIG_EROFS_FS is not set +CONFIG_NETWORK_FILESYSTEMS=y +CONFIG_NFS_FS=m +CONFIG_NFS_V2=m +CONFIG_NFS_V3=m +CONFIG_NFS_V3_ACL=y +CONFIG_NFS_V4=m +CONFIG_NFS_SWAP=y +CONFIG_NFS_V4_1=y +CONFIG_NFS_V4_2=y +CONFIG_PNFS_FILE_LAYOUT=m +CONFIG_PNFS_BLOCK=m +CONFIG_PNFS_FLEXFILE_LAYOUT=m +CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org" +# CONFIG_NFS_V4_1_MIGRATION is not set +CONFIG_NFS_V4_SECURITY_LABEL=y +CONFIG_NFS_FSCACHE=y +# CONFIG_NFS_USE_LEGACY_DNS is not set +CONFIG_NFS_USE_KERNEL_DNS=y +CONFIG_NFS_DEBUG=y +CONFIG_NFS_DISABLE_UDP_SUPPORT=y +# CONFIG_NFS_V4_2_READ_PLUS is not set +CONFIG_NFSD=m +# CONFIG_NFSD_V2 is not set +CONFIG_NFSD_V3_ACL=y +CONFIG_NFSD_V4=y +CONFIG_NFSD_PNFS=y +CONFIG_NFSD_BLOCKLAYOUT=y +# CONFIG_NFSD_SCSILAYOUT is not set +# CONFIG_NFSD_FLEXFILELAYOUT is not set +# CONFIG_NFSD_V4_2_INTER_SSC is not set +CONFIG_NFSD_V4_SECURITY_LABEL=y +CONFIG_GRACE_PERIOD=m +CONFIG_LOCKD=m +CONFIG_LOCKD_V4=y +CONFIG_NFS_ACL_SUPPORT=m +CONFIG_NFS_COMMON=y +CONFIG_NFS_V4_2_SSC_HELPER=y +CONFIG_SUNRPC=m +CONFIG_SUNRPC_GSS=m +CONFIG_SUNRPC_BACKCHANNEL=y +CONFIG_SUNRPC_SWAP=y +CONFIG_RPCSEC_GSS_KRB5=m +CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y +# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_CAMELLIA is not set +# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2 is not set +CONFIG_SUNRPC_DEBUG=y +CONFIG_SUNRPC_XPRT_RDMA=m +CONFIG_CEPH_FS=m +CONFIG_CEPH_FSCACHE=y +CONFIG_CEPH_FS_POSIX_ACL=y +# CONFIG_CEPH_FS_SECURITY_LABEL is not set +CONFIG_CIFS=m +# CONFIG_CIFS_STATS2 is not set +CONFIG_CIFS_ALLOW_INSECURE_LEGACY=y +CONFIG_CIFS_UPCALL=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_CIFS_DEBUG=y +# CONFIG_CIFS_DEBUG2 is not set +# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set +CONFIG_CIFS_DFS_UPCALL=y +# CONFIG_CIFS_SWN_UPCALL is not set +# CONFIG_CIFS_SMB_DIRECT is not set +CONFIG_CIFS_FSCACHE=y +# CONFIG_SMB_SERVER is not set +CONFIG_SMBFS=m +CONFIG_CODA_FS=m +CONFIG_AFS_FS=m +# CONFIG_AFS_DEBUG is not set +CONFIG_AFS_FSCACHE=y +# CONFIG_AFS_DEBUG_CURSOR is not set +CONFIG_9P_FS=m +CONFIG_9P_FSCACHE=y +CONFIG_9P_FS_POSIX_ACL=y +CONFIG_9P_FS_SECURITY=y +CONFIG_NLS=y +CONFIG_NLS_DEFAULT="utf8" +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_CODEPAGE_737=m +CONFIG_NLS_CODEPAGE_775=m +CONFIG_NLS_CODEPAGE_850=m +CONFIG_NLS_CODEPAGE_852=m +CONFIG_NLS_CODEPAGE_855=m +CONFIG_NLS_CODEPAGE_857=m +CONFIG_NLS_CODEPAGE_860=m +CONFIG_NLS_CODEPAGE_861=m +CONFIG_NLS_CODEPAGE_862=m +CONFIG_NLS_CODEPAGE_863=m +CONFIG_NLS_CODEPAGE_864=m +CONFIG_NLS_CODEPAGE_865=m +CONFIG_NLS_CODEPAGE_866=m +CONFIG_NLS_CODEPAGE_869=m +CONFIG_NLS_CODEPAGE_936=m +CONFIG_NLS_CODEPAGE_950=m +CONFIG_NLS_CODEPAGE_932=m +CONFIG_NLS_CODEPAGE_949=m +CONFIG_NLS_CODEPAGE_874=m +CONFIG_NLS_ISO8859_8=m +CONFIG_NLS_CODEPAGE_1250=m +CONFIG_NLS_CODEPAGE_1251=m +CONFIG_NLS_ASCII=m +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +CONFIG_NLS_KOI8_R=m +CONFIG_NLS_KOI8_U=m +CONFIG_NLS_MAC_ROMAN=m +CONFIG_NLS_MAC_CELTIC=m +CONFIG_NLS_MAC_CENTEURO=m +CONFIG_NLS_MAC_CROATIAN=m +CONFIG_NLS_MAC_CYRILLIC=m +CONFIG_NLS_MAC_GAELIC=m +CONFIG_NLS_MAC_GREEK=m +CONFIG_NLS_MAC_ICELAND=m +CONFIG_NLS_MAC_INUIT=m +CONFIG_NLS_MAC_ROMANIAN=m +CONFIG_NLS_MAC_TURKISH=m +CONFIG_NLS_UTF8=m +CONFIG_NLS_UCS2_UTILS=m +CONFIG_DLM=m +CONFIG_DLM_DEBUG=y +# CONFIG_UNICODE is not set +CONFIG_IO_WQ=y +# end of File systems + +# +# Security options +# +CONFIG_KEYS=y +# CONFIG_KEYS_REQUEST_CACHE is not set +# CONFIG_PERSISTENT_KEYRINGS is not set +# CONFIG_TRUSTED_KEYS is not set +# CONFIG_ENCRYPTED_KEYS is not set +CONFIG_KEY_DH_OPERATIONS=y +CONFIG_SECURITY_DMESG_RESTRICT=y +CONFIG_SECURITY=y +CONFIG_SECURITYFS=y +CONFIG_SECURITY_NETWORK=y +# CONFIG_SECURITY_INFINIBAND is not set +CONFIG_SECURITY_NETWORK_XFRM=y +CONFIG_SECURITY_PATH=y +CONFIG_LSM_MMAP_MIN_ADDR=32768 +CONFIG_HARDENED_USERCOPY=y +CONFIG_FORTIFY_SOURCE=y +# CONFIG_STATIC_USERMODEHELPER is not set +CONFIG_SECURITY_SELINUX=y +CONFIG_SECURITY_SELINUX_BOOTPARAM=y +CONFIG_SECURITY_SELINUX_DEVELOP=y +CONFIG_SECURITY_SELINUX_AVC_STATS=y +CONFIG_SECURITY_SELINUX_SIDTAB_HASH_BITS=9 +CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE=256 +# CONFIG_SECURITY_SELINUX_DEBUG is not set +# CONFIG_SECURITY_SMACK is not set +CONFIG_SECURITY_TOMOYO=y +CONFIG_SECURITY_TOMOYO_MAX_ACCEPT_ENTRY=2048 +CONFIG_SECURITY_TOMOYO_MAX_AUDIT_LOG=1024 +# CONFIG_SECURITY_TOMOYO_OMIT_USERSPACE_LOADER is not set +CONFIG_SECURITY_TOMOYO_POLICY_LOADER="/sbin/tomoyo-init" +CONFIG_SECURITY_TOMOYO_ACTIVATION_TRIGGER="/sbin/init" +# CONFIG_SECURITY_TOMOYO_INSECURE_BUILTIN_SETTING is not set +CONFIG_SECURITY_APPARMOR=y +# CONFIG_SECURITY_APPARMOR_DEBUG is not set +CONFIG_SECURITY_APPARMOR_INTROSPECT_POLICY=y +CONFIG_SECURITY_APPARMOR_HASH=y +CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y +CONFIG_SECURITY_APPARMOR_EXPORT_BINARY=y +CONFIG_SECURITY_APPARMOR_PARANOID_LOAD=y +# CONFIG_SECURITY_LOADPIN is not set +CONFIG_SECURITY_YAMA=y +# CONFIG_SECURITY_SAFESETID is not set +# CONFIG_SECURITY_LOCKDOWN_LSM is not set +# CONFIG_SECURITY_LANDLOCK is not set +CONFIG_INTEGRITY=y +CONFIG_INTEGRITY_SIGNATURE=y +CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y +# CONFIG_INTEGRITY_TRUSTED_KEYRING is not set +CONFIG_INTEGRITY_PLATFORM_KEYRING=y +# CONFIG_INTEGRITY_MACHINE_KEYRING is not set +CONFIG_LOAD_UEFI_KEYS=y +CONFIG_INTEGRITY_AUDIT=y +# CONFIG_IMA is not set +# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set +# CONFIG_EVM is not set +# CONFIG_DEFAULT_SECURITY_SELINUX is not set +# CONFIG_DEFAULT_SECURITY_TOMOYO is not set +CONFIG_DEFAULT_SECURITY_APPARMOR=y +# CONFIG_DEFAULT_SECURITY_DAC is not set +CONFIG_LSM="yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" + +# +# Kernel hardening options +# + +# +# Memory initialization +# +CONFIG_INIT_STACK_NONE=y +# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set +# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set +# end of Memory initialization + +# +# Hardening of kernel data structures +# +CONFIG_LIST_HARDENED=y +CONFIG_BUG_ON_DATA_CORRUPTION=y +# end of Hardening of kernel data structures + +CONFIG_RANDSTRUCT_NONE=y +# end of Kernel hardening options +# end of Security options + +CONFIG_XOR_BLOCKS=m +CONFIG_ASYNC_CORE=m +CONFIG_ASYNC_MEMCPY=m +CONFIG_ASYNC_XOR=m +CONFIG_ASYNC_PQ=m +CONFIG_ASYNC_RAID6_RECOV=m +CONFIG_CRYPTO=y + +# +# Crypto core or helper +# +CONFIG_CRYPTO_FIPS=y +CONFIG_CRYPTO_FIPS_NAME="Linux Kernel Cryptographic API" +# CONFIG_CRYPTO_FIPS_CUSTOM_VERSION is not set +CONFIG_CRYPTO_ALGAPI=y +CONFIG_CRYPTO_ALGAPI2=y +CONFIG_CRYPTO_AEAD=m +CONFIG_CRYPTO_AEAD2=y +CONFIG_CRYPTO_SIG2=y +CONFIG_CRYPTO_SKCIPHER=y +CONFIG_CRYPTO_SKCIPHER2=y +CONFIG_CRYPTO_HASH=y +CONFIG_CRYPTO_HASH2=y +CONFIG_CRYPTO_RNG=m +CONFIG_CRYPTO_RNG2=y +CONFIG_CRYPTO_RNG_DEFAULT=m +CONFIG_CRYPTO_AKCIPHER2=y +CONFIG_CRYPTO_AKCIPHER=y +CONFIG_CRYPTO_KPP2=y +CONFIG_CRYPTO_KPP=y +CONFIG_CRYPTO_ACOMP2=y +CONFIG_CRYPTO_MANAGER=y +CONFIG_CRYPTO_MANAGER2=y +CONFIG_CRYPTO_USER=m +# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set +# CONFIG_CRYPTO_MANAGER_EXTRA_TESTS is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_NULL2=m +CONFIG_CRYPTO_PCRYPT=m +CONFIG_CRYPTO_CRYPTD=m +CONFIG_CRYPTO_AUTHENC=m +CONFIG_CRYPTO_TEST=m +CONFIG_CRYPTO_ENGINE=m +# end of Crypto core or helper + +# +# Public-key cryptography +# +CONFIG_CRYPTO_RSA=y +CONFIG_CRYPTO_DH=y +# CONFIG_CRYPTO_DH_RFC7919_GROUPS is not set +CONFIG_CRYPTO_ECC=m +CONFIG_CRYPTO_ECDH=m +# CONFIG_CRYPTO_ECDSA is not set +# CONFIG_CRYPTO_ECRDSA is not set +# CONFIG_CRYPTO_SM2 is not set +# CONFIG_CRYPTO_CURVE25519 is not set +# end of Public-key cryptography + +# +# Block ciphers +# +CONFIG_CRYPTO_AES=y +# CONFIG_CRYPTO_AES_TI is not set +CONFIG_CRYPTO_ANUBIS=m +# CONFIG_CRYPTO_ARIA is not set +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_BLOWFISH_COMMON=m +CONFIG_CRYPTO_CAMELLIA=m +CONFIG_CRYPTO_CAST_COMMON=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_DES=m +CONFIG_CRYPTO_FCRYPT=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SEED=m +CONFIG_CRYPTO_SERPENT=m +# CONFIG_CRYPTO_SM4_GENERIC is not set +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +CONFIG_CRYPTO_TWOFISH_COMMON=m +# end of Block ciphers + +# +# Length-preserving ciphers and modes +# +# CONFIG_CRYPTO_ADIANTUM is not set +CONFIG_CRYPTO_ARC4=m +CONFIG_CRYPTO_CHACHA20=m +CONFIG_CRYPTO_CBC=y +# CONFIG_CRYPTO_CFB is not set +CONFIG_CRYPTO_CTR=m +CONFIG_CRYPTO_CTS=y +CONFIG_CRYPTO_ECB=y +# CONFIG_CRYPTO_HCTR2 is not set +# CONFIG_CRYPTO_KEYWRAP is not set +CONFIG_CRYPTO_LRW=m +# CONFIG_CRYPTO_OFB is not set +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_XTS=y +# end of Length-preserving ciphers and modes + +# +# AEAD (authenticated encryption with associated data) ciphers +# +CONFIG_CRYPTO_AEGIS128=m +CONFIG_CRYPTO_AEGIS128_SIMD=y +CONFIG_CRYPTO_CHACHA20POLY1305=m +CONFIG_CRYPTO_CCM=m +CONFIG_CRYPTO_GCM=m +CONFIG_CRYPTO_GENIV=m +CONFIG_CRYPTO_SEQIV=m +CONFIG_CRYPTO_ECHAINIV=m +CONFIG_CRYPTO_ESSIV=m +# end of AEAD (authenticated encryption with associated data) ciphers + +# +# Hashes, digests, and MACs +# +CONFIG_CRYPTO_BLAKE2B=m +CONFIG_CRYPTO_CMAC=m +CONFIG_CRYPTO_GHASH=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MD4=m +CONFIG_CRYPTO_MD5=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_POLY1305=m +CONFIG_CRYPTO_RMD160=m +CONFIG_CRYPTO_SHA1=y +CONFIG_CRYPTO_SHA256=y +CONFIG_CRYPTO_SHA512=y +CONFIG_CRYPTO_SHA3=m +# CONFIG_CRYPTO_SM3_GENERIC is not set +# CONFIG_CRYPTO_STREEBOG is not set +CONFIG_CRYPTO_VMAC=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_XCBC=m +CONFIG_CRYPTO_XXHASH=m +# end of Hashes, digests, and MACs + +# +# CRCs (cyclic redundancy checks) +# +CONFIG_CRYPTO_CRC32C=m +CONFIG_CRYPTO_CRC32=m +CONFIG_CRYPTO_CRCT10DIF=y +CONFIG_CRYPTO_CRC64_ROCKSOFT=m +# end of CRCs (cyclic redundancy checks) + +# +# Compression +# +CONFIG_CRYPTO_DEFLATE=y +CONFIG_CRYPTO_LZO=y +# CONFIG_CRYPTO_842 is not set +CONFIG_CRYPTO_LZ4=m +CONFIG_CRYPTO_LZ4HC=m +CONFIG_CRYPTO_ZSTD=m +# end of Compression + +# +# Random number generation +# +CONFIG_CRYPTO_ANSI_CPRNG=m +CONFIG_CRYPTO_DRBG_MENU=m +CONFIG_CRYPTO_DRBG_HMAC=y +# CONFIG_CRYPTO_DRBG_HASH is not set +# CONFIG_CRYPTO_DRBG_CTR is not set +CONFIG_CRYPTO_DRBG=m +CONFIG_CRYPTO_JITTERENTROPY=m +# CONFIG_CRYPTO_JITTERENTROPY_TESTINTERFACE is not set +CONFIG_CRYPTO_KDF800108_CTR=y +# end of Random number generation + +# +# Userspace interface +# +CONFIG_CRYPTO_USER_API=m +CONFIG_CRYPTO_USER_API_HASH=m +CONFIG_CRYPTO_USER_API_SKCIPHER=m +CONFIG_CRYPTO_USER_API_RNG=m +# CONFIG_CRYPTO_USER_API_RNG_CAVP is not set +CONFIG_CRYPTO_USER_API_AEAD=m +CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y +# CONFIG_CRYPTO_STATS is not set +# end of Userspace interface + +CONFIG_CRYPTO_HASH_INFO=y +# CONFIG_CRYPTO_NHPOLY1305_NEON is not set +# CONFIG_CRYPTO_CHACHA20_NEON is not set + +# +# Accelerated Cryptographic Algorithms for CPU (arm64) +# +CONFIG_CRYPTO_GHASH_ARM64_CE=m +# CONFIG_CRYPTO_POLY1305_NEON is not set +CONFIG_CRYPTO_SHA1_ARM64_CE=m +CONFIG_CRYPTO_SHA256_ARM64=m +CONFIG_CRYPTO_SHA2_ARM64_CE=m +# CONFIG_CRYPTO_SHA512_ARM64 is not set +# CONFIG_CRYPTO_SHA512_ARM64_CE is not set +# CONFIG_CRYPTO_SHA3_ARM64 is not set +# CONFIG_CRYPTO_SM3_NEON is not set +# CONFIG_CRYPTO_SM3_ARM64_CE is not set +# CONFIG_CRYPTO_POLYVAL_ARM64_CE is not set +CONFIG_CRYPTO_AES_ARM64=m +CONFIG_CRYPTO_AES_ARM64_CE=m +CONFIG_CRYPTO_AES_ARM64_CE_BLK=m +# CONFIG_CRYPTO_AES_ARM64_NEON_BLK is not set +# CONFIG_CRYPTO_AES_ARM64_BS is not set +# CONFIG_CRYPTO_SM4_ARM64_CE is not set +# CONFIG_CRYPTO_SM4_ARM64_CE_BLK is not set +# CONFIG_CRYPTO_SM4_ARM64_NEON_BLK is not set +CONFIG_CRYPTO_AES_ARM64_CE_CCM=m +# CONFIG_CRYPTO_SM4_ARM64_CE_CCM is not set +# CONFIG_CRYPTO_SM4_ARM64_CE_GCM is not set +# CONFIG_CRYPTO_CRCT10DIF_ARM64_CE is not set +# end of Accelerated Cryptographic Algorithms for CPU (arm64) + +CONFIG_CRYPTO_HW=y +CONFIG_CRYPTO_DEV_ALLWINNER=y +# CONFIG_CRYPTO_DEV_SUN4I_SS is not set +# CONFIG_CRYPTO_DEV_SUN8I_CE is not set +# CONFIG_CRYPTO_DEV_SUN8I_SS is not set +# CONFIG_CRYPTO_DEV_ATMEL_ECC is not set +# CONFIG_CRYPTO_DEV_ATMEL_SHA204A is not set +# CONFIG_CRYPTO_DEV_CCP is not set +CONFIG_CRYPTO_DEV_CPT=m +CONFIG_CAVIUM_CPT=m +CONFIG_CRYPTO_DEV_NITROX=m +CONFIG_CRYPTO_DEV_NITROX_CNN55XX=m +CONFIG_CRYPTO_DEV_MARVELL=m +CONFIG_CRYPTO_DEV_MARVELL_CESA=m +# CONFIG_CRYPTO_DEV_OCTEONTX_CPT is not set +# CONFIG_CRYPTO_DEV_OCTEONTX2_CPT is not set +# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set +# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set +# CONFIG_CRYPTO_DEV_QAT_C62X is not set +# CONFIG_CRYPTO_DEV_QAT_4XXX is not set +# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set +# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set +# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set +# CONFIG_CRYPTO_DEV_CAVIUM_ZIP is not set +CONFIG_CRYPTO_DEV_QCE=m +CONFIG_CRYPTO_DEV_QCE_SKCIPHER=y +CONFIG_CRYPTO_DEV_QCE_SHA=y +CONFIG_CRYPTO_DEV_QCE_AEAD=y +CONFIG_CRYPTO_DEV_QCE_ENABLE_ALL=y +# CONFIG_CRYPTO_DEV_QCE_ENABLE_SKCIPHER is not set +# CONFIG_CRYPTO_DEV_QCE_ENABLE_SHA is not set +# CONFIG_CRYPTO_DEV_QCE_ENABLE_AEAD is not set +CONFIG_CRYPTO_DEV_QCE_SW_MAX_LEN=512 +CONFIG_CRYPTO_DEV_QCOM_RNG=m +# CONFIG_CRYPTO_DEV_ROCKCHIP is not set +# CONFIG_CRYPTO_DEV_ZYNQMP_AES is not set +# CONFIG_CRYPTO_DEV_ZYNQMP_SHA3 is not set +CONFIG_CRYPTO_DEV_CHELSIO=m +CONFIG_CRYPTO_DEV_VIRTIO=m +CONFIG_CRYPTO_DEV_SAFEXCEL=m +# CONFIG_CRYPTO_DEV_CCREE is not set +# CONFIG_CRYPTO_DEV_HISI_SEC is not set +# CONFIG_CRYPTO_DEV_HISI_SEC2 is not set +# CONFIG_CRYPTO_DEV_HISI_ZIP is not set +# CONFIG_CRYPTO_DEV_HISI_HPRE is not set +# CONFIG_CRYPTO_DEV_HISI_TRNG is not set +CONFIG_CRYPTO_DEV_AMLOGIC_GXL=m +# CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG is not set +CONFIG_ASYMMETRIC_KEY_TYPE=y +CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y +CONFIG_X509_CERTIFICATE_PARSER=y +CONFIG_PKCS8_PRIVATE_KEY_PARSER=m +CONFIG_PKCS7_MESSAGE_PARSER=y +# CONFIG_PKCS7_TEST_KEY is not set +# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set +# CONFIG_FIPS_SIGNATURE_SELFTEST is not set + +# +# Certificates for signature checking +# +CONFIG_MODULE_SIG_KEY="" +CONFIG_MODULE_SIG_KEY_TYPE_RSA=y +# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set +CONFIG_SYSTEM_TRUSTED_KEYRING=y +CONFIG_SYSTEM_TRUSTED_KEYS="" +# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set +CONFIG_SECONDARY_TRUSTED_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_HASH_LIST="" +# CONFIG_SYSTEM_REVOCATION_LIST is not set +# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set +# end of Certificates for signature checking + +CONFIG_BINARY_PRINTF=y + +# +# Library routines +# +CONFIG_RAID6_PQ=m +CONFIG_RAID6_PQ_BENCHMARK=y +CONFIG_LINEAR_RANGES=y +# CONFIG_PACKING is not set +CONFIG_BITREVERSE=y +CONFIG_HAVE_ARCH_BITREVERSE=y +CONFIG_GENERIC_STRNCPY_FROM_USER=y +CONFIG_GENERIC_STRNLEN_USER=y +CONFIG_GENERIC_NET_UTILS=y +CONFIG_CORDIC=m +# CONFIG_PRIME_NUMBERS is not set +CONFIG_RATIONAL=y +CONFIG_GENERIC_PCI_IOMAP=y +CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y +CONFIG_ARCH_HAS_FAST_MULTIPLIER=y +CONFIG_ARCH_USE_SYM_ANNOTATIONS=y +CONFIG_INDIRECT_PIO=y +# CONFIG_TRACE_MMIO_ACCESS is not set + +# +# Crypto library routines +# +CONFIG_CRYPTO_LIB_UTILS=y +CONFIG_CRYPTO_LIB_AES=y +CONFIG_CRYPTO_LIB_ARC4=m +CONFIG_CRYPTO_LIB_GF128MUL=m +CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y +CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m +# CONFIG_CRYPTO_LIB_CHACHA is not set +# CONFIG_CRYPTO_LIB_CURVE25519 is not set +CONFIG_CRYPTO_LIB_DES=m +CONFIG_CRYPTO_LIB_POLY1305_RSIZE=9 +CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m +# CONFIG_CRYPTO_LIB_POLY1305 is not set +# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set +CONFIG_CRYPTO_LIB_SHA1=y +CONFIG_CRYPTO_LIB_SHA256=y +# end of Crypto library routines + +CONFIG_CRC_CCITT=m +CONFIG_CRC16=m +CONFIG_CRC_T10DIF=y +CONFIG_CRC64_ROCKSOFT=m +CONFIG_CRC_ITU_T=m +CONFIG_CRC32=y +# CONFIG_CRC32_SELFTEST is not set +CONFIG_CRC32_SLICEBY8=y +# CONFIG_CRC32_SLICEBY4 is not set +# CONFIG_CRC32_SARWATE is not set +# CONFIG_CRC32_BIT is not set +CONFIG_CRC64=m +# CONFIG_CRC4 is not set +CONFIG_CRC7=m +CONFIG_LIBCRC32C=m +CONFIG_CRC8=y +CONFIG_XXHASH=y +CONFIG_AUDIT_GENERIC=y +CONFIG_AUDIT_ARCH_COMPAT_GENERIC=y +CONFIG_AUDIT_COMPAT_GENERIC=y +# CONFIG_RANDOM32_SELFTEST is not set +CONFIG_ZLIB_INFLATE=y +CONFIG_ZLIB_DEFLATE=y +CONFIG_LZO_COMPRESS=y +CONFIG_LZO_DECOMPRESS=y +CONFIG_LZ4_COMPRESS=m +CONFIG_LZ4HC_COMPRESS=m +CONFIG_LZ4_DECOMPRESS=y +CONFIG_ZSTD_COMMON=y +CONFIG_ZSTD_COMPRESS=y +CONFIG_ZSTD_DECOMPRESS=y +CONFIG_XZ_DEC=y +# CONFIG_XZ_DEC_X86 is not set +# CONFIG_XZ_DEC_POWERPC is not set +# CONFIG_XZ_DEC_IA64 is not set +# CONFIG_XZ_DEC_ARM is not set +# CONFIG_XZ_DEC_ARMTHUMB is not set +# CONFIG_XZ_DEC_SPARC is not set +# CONFIG_XZ_DEC_MICROLZMA is not set +# CONFIG_XZ_DEC_TEST is not set +CONFIG_DECOMPRESS_GZIP=y +CONFIG_DECOMPRESS_BZIP2=y +CONFIG_DECOMPRESS_LZMA=y +CONFIG_DECOMPRESS_XZ=y +CONFIG_DECOMPRESS_LZO=y +CONFIG_DECOMPRESS_LZ4=y +CONFIG_DECOMPRESS_ZSTD=y +CONFIG_GENERIC_ALLOCATOR=y +CONFIG_REED_SOLOMON=m +CONFIG_REED_SOLOMON_ENC8=y +CONFIG_REED_SOLOMON_DEC8=y +CONFIG_TEXTSEARCH=y +CONFIG_TEXTSEARCH_KMP=m +CONFIG_TEXTSEARCH_BM=m +CONFIG_TEXTSEARCH_FSM=m +CONFIG_BTREE=y +CONFIG_INTERVAL_TREE=y +CONFIG_XARRAY_MULTI=y +CONFIG_ASSOCIATIVE_ARRAY=y +CONFIG_HAS_IOMEM=y +CONFIG_HAS_IOPORT=y +CONFIG_HAS_IOPORT_MAP=y +CONFIG_HAS_DMA=y +CONFIG_DMA_OPS=y +CONFIG_NEED_SG_DMA_FLAGS=y +CONFIG_NEED_SG_DMA_LENGTH=y +CONFIG_NEED_DMA_MAP_STATE=y +CONFIG_ARCH_DMA_ADDR_T_64BIT=y +CONFIG_DMA_DECLARE_COHERENT=y +CONFIG_ARCH_HAS_SETUP_DMA_OPS=y +CONFIG_ARCH_HAS_TEARDOWN_DMA_OPS=y +CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y +CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU=y +CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y +CONFIG_SWIOTLB=y +# CONFIG_SWIOTLB_DYNAMIC is not set +CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC=y +# CONFIG_DMA_RESTRICTED_POOL is not set +CONFIG_DMA_NONCOHERENT_MMAP=y +CONFIG_DMA_COHERENT_POOL=y +CONFIG_DMA_DIRECT_REMAP=y +# CONFIG_DMA_API_DEBUG is not set +# CONFIG_DMA_MAP_BENCHMARK is not set +CONFIG_SGL_ALLOC=y +CONFIG_CHECK_SIGNATURE=y +# CONFIG_FORCE_NR_CPUS is not set +CONFIG_CPU_RMAP=y +CONFIG_DQL=y +CONFIG_GLOB=y +# CONFIG_GLOB_SELFTEST is not set +CONFIG_NLATTR=y +CONFIG_LRU_CACHE=m +CONFIG_CLZ_TAB=y +CONFIG_IRQ_POLL=y +CONFIG_MPILIB=y +CONFIG_SIGNATURE=y +CONFIG_DIMLIB=y +CONFIG_LIBFDT=y +CONFIG_OID_REGISTRY=y +CONFIG_UCS2_STRING=y +CONFIG_HAVE_GENERIC_VDSO=y +CONFIG_GENERIC_GETTIMEOFDAY=y +CONFIG_GENERIC_VDSO_TIME_NS=y +CONFIG_FONT_SUPPORT=y +# CONFIG_FONTS is not set +CONFIG_FONT_8x8=y +CONFIG_FONT_8x16=y +CONFIG_SG_POOL=y +CONFIG_ARCH_HAS_PMEM_API=y +CONFIG_MEMREGION=y +CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y +CONFIG_ARCH_STACKWALK=y +CONFIG_STACKDEPOT=y +CONFIG_SBITMAP=y +# end of Library routines + +CONFIG_GENERIC_IOREMAP=y +CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED=y +CONFIG_PLDMFW=y + +# +# Kernel hacking +# + +# +# printk and dmesg options +# +CONFIG_PRINTK_TIME=y +CONFIG_PRINTK_CALLER=y +# CONFIG_STACKTRACE_BUILD_ID is not set +CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7 +CONFIG_CONSOLE_LOGLEVEL_QUIET=4 +CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4 +CONFIG_BOOT_PRINTK_DELAY=y +CONFIG_DYNAMIC_DEBUG=y +CONFIG_DYNAMIC_DEBUG_CORE=y +CONFIG_SYMBOLIC_ERRNAME=y +CONFIG_DEBUG_BUGVERBOSE=y +# end of printk and dmesg options + +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_MISC=y + +# +# Compile-time checks and compiler options +# +CONFIG_DEBUG_INFO=y +CONFIG_AS_HAS_NON_CONST_LEB128=y +# CONFIG_DEBUG_INFO_NONE is not set +CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y +# CONFIG_DEBUG_INFO_DWARF4 is not set +# CONFIG_DEBUG_INFO_DWARF5 is not set +# CONFIG_DEBUG_INFO_REDUCED is not set +CONFIG_DEBUG_INFO_COMPRESSED_NONE=y +# CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set +# CONFIG_DEBUG_INFO_SPLIT is not set +CONFIG_DEBUG_INFO_BTF=y +CONFIG_PAHOLE_HAS_SPLIT_BTF=y +CONFIG_DEBUG_INFO_BTF_MODULES=y +# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set +# CONFIG_GDB_SCRIPTS is not set +CONFIG_FRAME_WARN=2048 +CONFIG_STRIP_ASM_SYMS=y +# CONFIG_READABLE_ASM is not set +# CONFIG_HEADERS_INSTALL is not set +# CONFIG_DEBUG_SECTION_MISMATCH is not set +CONFIG_SECTION_MISMATCH_WARN_ONLY=y +# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set +CONFIG_ARCH_WANT_FRAME_POINTERS=y +CONFIG_FRAME_POINTER=y +# CONFIG_VMLINUX_MAP is not set +# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set +# end of Compile-time checks and compiler options + +# +# Generic Kernel Debugging Instruments +# +CONFIG_MAGIC_SYSRQ=y +CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x01b6 +CONFIG_MAGIC_SYSRQ_SERIAL=y +CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE="" +CONFIG_DEBUG_FS=y +CONFIG_DEBUG_FS_ALLOW_ALL=y +# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set +# CONFIG_DEBUG_FS_ALLOW_NONE is not set +CONFIG_HAVE_ARCH_KGDB=y +# CONFIG_KGDB is not set +CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y +# CONFIG_UBSAN is not set +CONFIG_HAVE_ARCH_KCSAN=y +# end of Generic Kernel Debugging Instruments + +# +# Networking Debugging +# +# CONFIG_NET_DEV_REFCNT_TRACKER is not set +# CONFIG_NET_NS_REFCNT_TRACKER is not set +# CONFIG_DEBUG_NET is not set +# end of Networking Debugging + +# +# Memory Debugging +# +CONFIG_PAGE_EXTENSION=y +# CONFIG_DEBUG_PAGEALLOC is not set +CONFIG_SLUB_DEBUG=y +# CONFIG_SLUB_DEBUG_ON is not set +CONFIG_PAGE_OWNER=y +# CONFIG_PAGE_TABLE_CHECK is not set +CONFIG_PAGE_POISONING=y +# CONFIG_DEBUG_PAGE_REF is not set +# CONFIG_DEBUG_RODATA_TEST is not set +CONFIG_ARCH_HAS_DEBUG_WX=y +# CONFIG_DEBUG_WX is not set +CONFIG_GENERIC_PTDUMP=y +# CONFIG_PTDUMP_DEBUGFS is not set +CONFIG_HAVE_DEBUG_KMEMLEAK=y +# CONFIG_DEBUG_KMEMLEAK is not set +# CONFIG_PER_VMA_LOCK_STATS is not set +# CONFIG_DEBUG_OBJECTS is not set +# CONFIG_SHRINKER_DEBUG is not set +# CONFIG_DEBUG_STACK_USAGE is not set +CONFIG_SCHED_STACK_END_CHECK=y +CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y +# CONFIG_DEBUG_VM is not set +# CONFIG_DEBUG_VM_PGTABLE is not set +CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y +# CONFIG_DEBUG_VIRTUAL is not set +CONFIG_DEBUG_MEMORY_INIT=y +CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m +# CONFIG_DEBUG_PER_CPU_MAPS is not set +CONFIG_HAVE_ARCH_KASAN=y +CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y +CONFIG_HAVE_ARCH_KASAN_VMALLOC=y +CONFIG_CC_HAS_KASAN_GENERIC=y +CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y +# CONFIG_KASAN is not set +CONFIG_HAVE_ARCH_KFENCE=y +CONFIG_KFENCE=y +CONFIG_KFENCE_SAMPLE_INTERVAL=0 +CONFIG_KFENCE_NUM_OBJECTS=511 +# CONFIG_KFENCE_DEFERRABLE is not set +# CONFIG_KFENCE_STATIC_KEYS is not set +CONFIG_KFENCE_STRESS_TEST_FAULTS=0 +# end of Memory Debugging + +# CONFIG_DEBUG_SHIRQ is not set + +# +# Debug Oops, Lockups and Hangs +# +# CONFIG_PANIC_ON_OOPS is not set +CONFIG_PANIC_ON_OOPS_VALUE=0 +CONFIG_PANIC_TIMEOUT=0 +CONFIG_LOCKUP_DETECTOR=y +CONFIG_SOFTLOCKUP_DETECTOR=y +# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set +CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y +# CONFIG_HARDLOCKUP_DETECTOR is not set +CONFIG_DETECT_HUNG_TASK=y +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120 +# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set +# CONFIG_WQ_WATCHDOG is not set +# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set +# CONFIG_TEST_LOCKUP is not set +# end of Debug Oops, Lockups and Hangs + +# +# Scheduler Debugging +# +CONFIG_SCHED_DEBUG=y +CONFIG_SCHED_INFO=y +CONFIG_SCHEDSTATS=y +# end of Scheduler Debugging + +# CONFIG_DEBUG_TIMEKEEPING is not set + +# +# Lock Debugging (spinlocks, mutexes, etc...) +# +CONFIG_LOCK_DEBUGGING_SUPPORT=y +# CONFIG_PROVE_LOCKING is not set +# CONFIG_LOCK_STAT is not set +# CONFIG_DEBUG_RT_MUTEXES is not set +# CONFIG_DEBUG_SPINLOCK is not set +# CONFIG_DEBUG_MUTEXES is not set +# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set +# CONFIG_DEBUG_RWSEMS is not set +# CONFIG_DEBUG_LOCK_ALLOC is not set +# CONFIG_DEBUG_ATOMIC_SLEEP is not set +# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set +# CONFIG_LOCK_TORTURE_TEST is not set +# CONFIG_WW_MUTEX_SELFTEST is not set +# CONFIG_SCF_TORTURE_TEST is not set +# CONFIG_CSD_LOCK_WAIT_DEBUG is not set +# end of Lock Debugging (spinlocks, mutexes, etc...) + +# CONFIG_DEBUG_IRQFLAGS is not set +CONFIG_STACKTRACE=y +# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set +# CONFIG_DEBUG_KOBJECT is not set + +# +# Debug kernel data structures +# +CONFIG_DEBUG_LIST=y +# CONFIG_DEBUG_PLIST is not set +# CONFIG_DEBUG_SG is not set +# CONFIG_DEBUG_NOTIFIERS is not set +# CONFIG_DEBUG_MAPLE_TREE is not set +# end of Debug kernel data structures + +# CONFIG_DEBUG_CREDENTIALS is not set + +# +# RCU Debugging +# +# CONFIG_RCU_SCALE_TEST is not set +# CONFIG_RCU_TORTURE_TEST is not set +# CONFIG_RCU_REF_SCALE_TEST is not set +CONFIG_RCU_CPU_STALL_TIMEOUT=21 +CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0 +# CONFIG_RCU_CPU_STALL_CPUTIME is not set +# CONFIG_RCU_TRACE is not set +# CONFIG_RCU_EQS_DEBUG is not set +# end of RCU Debugging + +# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set +# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set +# CONFIG_LATENCYTOP is not set +# CONFIG_DEBUG_CGROUP_REF is not set +CONFIG_NOP_TRACER=y +CONFIG_HAVE_FUNCTION_TRACER=y +CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y +CONFIG_HAVE_FUNCTION_GRAPH_RETVAL=y +CONFIG_HAVE_DYNAMIC_FTRACE=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y +CONFIG_HAVE_SYSCALL_TRACEPOINTS=y +CONFIG_HAVE_C_RECORDMCOUNT=y +CONFIG_TRACER_MAX_TRACE=y +CONFIG_TRACE_CLOCK=y +CONFIG_RING_BUFFER=y +CONFIG_EVENT_TRACING=y +CONFIG_CONTEXT_SWITCH_TRACER=y +CONFIG_TRACING=y +CONFIG_GENERIC_TRACER=y +CONFIG_TRACING_SUPPORT=y +CONFIG_FTRACE=y +# CONFIG_BOOTTIME_TRACING is not set +CONFIG_FUNCTION_TRACER=y +CONFIG_FUNCTION_GRAPH_TRACER=y +# CONFIG_FUNCTION_GRAPH_RETVAL is not set +CONFIG_DYNAMIC_FTRACE=y +CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y +CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y +# CONFIG_FUNCTION_PROFILER is not set +CONFIG_STACK_TRACER=y +# CONFIG_IRQSOFF_TRACER is not set +# CONFIG_SCHED_TRACER is not set +# CONFIG_HWLAT_TRACER is not set +# CONFIG_OSNOISE_TRACER is not set +# CONFIG_TIMERLAT_TRACER is not set +CONFIG_FTRACE_SYSCALLS=y +CONFIG_TRACER_SNAPSHOT=y +# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set +CONFIG_BRANCH_PROFILE_NONE=y +# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set +CONFIG_BLK_DEV_IO_TRACE=y +CONFIG_PROBE_EVENTS_BTF_ARGS=y +CONFIG_KPROBE_EVENTS=y +# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set +CONFIG_UPROBE_EVENTS=y +CONFIG_BPF_EVENTS=y +CONFIG_DYNAMIC_EVENTS=y +CONFIG_PROBE_EVENTS=y +# CONFIG_BPF_KPROBE_OVERRIDE is not set +CONFIG_FTRACE_MCOUNT_RECORD=y +CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y +# CONFIG_SYNTH_EVENTS is not set +# CONFIG_USER_EVENTS is not set +# CONFIG_HIST_TRIGGERS is not set +# CONFIG_TRACE_EVENT_INJECT is not set +# CONFIG_TRACEPOINT_BENCHMARK is not set +# CONFIG_RING_BUFFER_BENCHMARK is not set +# CONFIG_TRACE_EVAL_MAP_FILE is not set +# CONFIG_FTRACE_RECORD_RECURSION is not set +# CONFIG_FTRACE_STARTUP_TEST is not set +# CONFIG_RING_BUFFER_STARTUP_TEST is not set +# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set +# CONFIG_PREEMPTIRQ_DELAY_TEST is not set +# CONFIG_KPROBE_EVENT_GEN_TEST is not set +# CONFIG_RV is not set +# CONFIG_SAMPLES is not set +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y +CONFIG_STRICT_DEVMEM=y +# CONFIG_IO_STRICT_DEVMEM is not set + +# +# arm64 Debugging +# +# CONFIG_PID_IN_CONTEXTIDR is not set +# CONFIG_DEBUG_EFI is not set +# CONFIG_ARM64_RELOC_TEST is not set +# CONFIG_CORESIGHT is not set +# end of arm64 Debugging + +# +# Kernel Testing and Coverage +# +# CONFIG_KUNIT is not set +CONFIG_NOTIFIER_ERROR_INJECTION=m +CONFIG_PM_NOTIFIER_ERROR_INJECT=m +# CONFIG_NETDEV_NOTIFIER_ERROR_INJECT is not set +CONFIG_FUNCTION_ERROR_INJECTION=y +CONFIG_FAULT_INJECTION=y +# CONFIG_FAILSLAB is not set +# CONFIG_FAIL_PAGE_ALLOC is not set +# CONFIG_FAULT_INJECTION_USERCOPY is not set +# CONFIG_FAIL_MAKE_REQUEST is not set +# CONFIG_FAIL_IO_TIMEOUT is not set +# CONFIG_FAIL_FUTEX is not set +# CONFIG_FAULT_INJECTION_DEBUG_FS is not set +# CONFIG_FAULT_INJECTION_CONFIGFS is not set +CONFIG_ARCH_HAS_KCOV=y +CONFIG_CC_HAS_SANCOV_TRACE_PC=y +CONFIG_RUNTIME_TESTING_MENU=y +# CONFIG_TEST_DHRY is not set +# CONFIG_LKDTM is not set +# CONFIG_TEST_MIN_HEAP is not set +# CONFIG_TEST_DIV64 is not set +# CONFIG_BACKTRACE_SELF_TEST is not set +# CONFIG_TEST_REF_TRACKER is not set +# CONFIG_RBTREE_TEST is not set +# CONFIG_REED_SOLOMON_TEST is not set +# CONFIG_INTERVAL_TREE_TEST is not set +# CONFIG_PERCPU_TEST is not set +# CONFIG_ATOMIC64_SELFTEST is not set +# CONFIG_ASYNC_RAID6_TEST is not set +# CONFIG_TEST_HEXDUMP is not set +# CONFIG_STRING_SELFTEST is not set +# CONFIG_TEST_STRING_HELPERS is not set +# CONFIG_TEST_KSTRTOX is not set +# CONFIG_TEST_PRINTF is not set +# CONFIG_TEST_SCANF is not set +# CONFIG_TEST_BITMAP is not set +# CONFIG_TEST_UUID is not set +# CONFIG_TEST_XARRAY is not set +# CONFIG_TEST_MAPLE_TREE is not set +# CONFIG_TEST_RHASHTABLE is not set +# CONFIG_TEST_IDA is not set +# CONFIG_TEST_LKM is not set +# CONFIG_TEST_BITOPS is not set +# CONFIG_TEST_VMALLOC is not set +CONFIG_TEST_USER_COPY=m +CONFIG_TEST_BPF=m +# CONFIG_TEST_BLACKHOLE_DEV is not set +# CONFIG_FIND_BIT_BENCHMARK is not set +CONFIG_TEST_FIRMWARE=m +# CONFIG_TEST_SYSCTL is not set +# CONFIG_TEST_UDELAY is not set +CONFIG_TEST_STATIC_KEYS=m +# CONFIG_TEST_DYNAMIC_DEBUG is not set +# CONFIG_TEST_KMOD is not set +# CONFIG_TEST_MEMCAT_P is not set +CONFIG_TEST_LIVEPATCH=m +# CONFIG_TEST_MEMINIT is not set +# CONFIG_TEST_FREE_PAGES is not set +CONFIG_ARCH_USE_MEMTEST=y +# CONFIG_MEMTEST is not set +# CONFIG_HYPERV_TESTING is not set +# end of Kernel Testing and Coverage + +# +# Rust hacking +# +# end of Rust hacking +# end of Kernel hacking diff --git a/.github/workflows/config_64k b/.github/workflows/config_64k new file mode 100644 index 0000000000000..d59d754c4e068 --- /dev/null +++ b/.github/workflows/config_64k @@ -0,0 +1,10925 @@ +# +# Automatically generated file; DO NOT EDIT. +# Linux/arm64 6.6.3 Kernel Configuration +# +CONFIG_CC_VERSION_TEXT="gcc (Debian 8.3.0-6) 8.3.0" +CONFIG_CC_IS_GCC=y +CONFIG_GCC_VERSION=80300 +CONFIG_CLANG_VERSION=0 +CONFIG_AS_IS_GNU=y +CONFIG_AS_VERSION=23101 +CONFIG_LD_IS_BFD=y +CONFIG_LD_VERSION=23101 +CONFIG_LLD_VERSION=0 +CONFIG_CC_CAN_LINK=y +CONFIG_CC_CAN_LINK_STATIC=y +CONFIG_CC_HAS_ASM_INLINE=y +CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y +CONFIG_PAHOLE_VERSION=120 +CONFIG_IRQ_WORK=y +CONFIG_BUILDTIME_TABLE_SORT=y +CONFIG_THREAD_INFO_IN_TASK=y + +# +# General setup +# +CONFIG_INIT_ENV_ARG_LIMIT=32 +# CONFIG_COMPILE_TEST is not set +# CONFIG_WERROR is not set +CONFIG_LOCALVERSION="" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_BUILD_SALT="6.6.y.bsk.z-64k-arm64" +CONFIG_DEFAULT_INIT="" +CONFIG_DEFAULT_HOSTNAME="(none)" +CONFIG_SYSVIPC=y +CONFIG_SYSVIPC_SYSCTL=y +CONFIG_SYSVIPC_COMPAT=y +CONFIG_POSIX_MQUEUE=y +CONFIG_POSIX_MQUEUE_SYSCTL=y +# CONFIG_WATCH_QUEUE is not set +CONFIG_CROSS_MEMORY_ATTACH=y +# CONFIG_USELIB is not set +CONFIG_AUDIT=y +CONFIG_HAVE_ARCH_AUDITSYSCALL=y +CONFIG_AUDITSYSCALL=y + +# +# IRQ subsystem +# +CONFIG_GENERIC_IRQ_PROBE=y +CONFIG_GENERIC_IRQ_SHOW=y +CONFIG_GENERIC_IRQ_SHOW_LEVEL=y +CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y +CONFIG_GENERIC_IRQ_MIGRATION=y +CONFIG_GENERIC_IRQ_INJECTION=y +CONFIG_HARDIRQS_SW_RESEND=y +CONFIG_GENERIC_IRQ_CHIP=y +CONFIG_IRQ_DOMAIN=y +CONFIG_IRQ_DOMAIN_HIERARCHY=y +CONFIG_IRQ_FASTEOI_HIERARCHY_HANDLERS=y +CONFIG_GENERIC_IRQ_IPI=y +CONFIG_GENERIC_MSI_IRQ=y +CONFIG_IRQ_MSI_IOMMU=y +CONFIG_IRQ_FORCED_THREADING=y +CONFIG_SPARSE_IRQ=y +# CONFIG_GENERIC_IRQ_DEBUGFS is not set +# end of IRQ subsystem + +CONFIG_GENERIC_TIME_VSYSCALL=y +CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_ARCH_HAS_TICK_BROADCAST=y +CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y +CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_CONTEXT_TRACKING=y +CONFIG_CONTEXT_TRACKING_IDLE=y + +# +# Timers subsystem +# +CONFIG_TICK_ONESHOT=y +CONFIG_NO_HZ_COMMON=y +# CONFIG_HZ_PERIODIC is not set +CONFIG_NO_HZ_IDLE=y +# CONFIG_NO_HZ_FULL is not set +# CONFIG_NO_HZ is not set +CONFIG_HIGH_RES_TIMERS=y +# end of Timers subsystem + +CONFIG_BPF=y +CONFIG_HAVE_EBPF_JIT=y +CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y + +# +# BPF subsystem +# +CONFIG_BPF_SYSCALL=y +CONFIG_BPF_JIT=y +# CONFIG_BPF_JIT_ALWAYS_ON is not set +CONFIG_BPF_JIT_DEFAULT_ON=y +# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set +# CONFIG_BPF_PRELOAD is not set +CONFIG_BPF_LSM=y +# end of BPF subsystem + +CONFIG_PREEMPT_VOLUNTARY_BUILD=y +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set +# CONFIG_PREEMPT is not set +CONFIG_PREEMPT_DYNAMIC=y +# CONFIG_SCHED_CORE is not set + +# +# CPU/Task time and stats accounting +# +CONFIG_TICK_CPU_ACCOUNTING=y +# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set +# CONFIG_IRQ_TIME_ACCOUNTING is not set +CONFIG_SCHED_THERMAL_PRESSURE=y +CONFIG_BSD_PROCESS_ACCT=y +CONFIG_BSD_PROCESS_ACCT_V3=y +CONFIG_TASKSTATS=y +CONFIG_TASK_DELAY_ACCT=y +CONFIG_TASK_XACCT=y +CONFIG_TASK_IO_ACCOUNTING=y +CONFIG_PSI=y +# CONFIG_PSI_DEFAULT_DISABLED is not set +# end of CPU/Task time and stats accounting + +CONFIG_CPU_ISOLATION=y + +# +# RCU Subsystem +# +CONFIG_TREE_RCU=y +# CONFIG_RCU_EXPERT is not set +CONFIG_TREE_SRCU=y +CONFIG_TASKS_RCU_GENERIC=y +CONFIG_TASKS_RUDE_RCU=y +CONFIG_TASKS_TRACE_RCU=y +CONFIG_RCU_STALL_COMMON=y +CONFIG_RCU_NEED_SEGCBLIST=y +# end of RCU Subsystem + +# CONFIG_IKCONFIG is not set +# CONFIG_IKHEADERS is not set +CONFIG_LOG_BUF_SHIFT=17 +CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 +# CONFIG_PRINTK_INDEX is not set +CONFIG_GENERIC_SCHED_CLOCK=y + +# +# Scheduler features +# +# CONFIG_UCLAMP_TASK is not set +# end of Scheduler features + +CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y +CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y +CONFIG_CC_HAS_INT128=y +CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5" +CONFIG_GCC11_NO_ARRAY_BOUNDS=y +CONFIG_ARCH_SUPPORTS_INT128=y +CONFIG_NUMA_BALANCING=y +# CONFIG_NUMA_BALANCING_DEFAULT_ENABLED is not set +CONFIG_CGROUPS=y +CONFIG_PAGE_COUNTER=y +# CONFIG_CGROUP_FAVOR_DYNMODS is not set +CONFIG_MEMCG=y +CONFIG_MEMCG_BGD_RECLAIM=y +CONFIG_MEMCG_KMEM=y +CONFIG_BLK_CGROUP=y +CONFIG_CGROUP_WRITEBACK=y +CONFIG_CGROUP_SCHED=y +CONFIG_FAIR_GROUP_SCHED=y +CONFIG_CFS_BANDWIDTH=y +# CONFIG_RT_GROUP_SCHED is not set +CONFIG_SCHED_MM_CID=y +CONFIG_CGROUP_PIDS=y +CONFIG_CGROUP_RDMA=y +CONFIG_CGROUP_FREEZER=y +CONFIG_CGROUP_HUGETLB=y +CONFIG_CPUSETS=y +CONFIG_PROC_PID_CPUSET=y +CONFIG_CGROUP_DEVICE=y +CONFIG_CGROUP_CPUACCT=y +CONFIG_CGROUP_PERF=y +CONFIG_CGROUP_BPF=y +# CONFIG_CGROUP_MISC is not set +# CONFIG_CGROUP_DEBUG is not set +CONFIG_SOCK_CGROUP_DATA=y +CONFIG_NAMESPACES=y +CONFIG_UTS_NS=y +CONFIG_TIME_NS=y +CONFIG_IPC_NS=y +CONFIG_USER_NS=y +CONFIG_PID_NS=y +CONFIG_NET_NS=y +CONFIG_CHECKPOINT_RESTORE=y +CONFIG_SCHED_AUTOGROUP=y +CONFIG_RELAY=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="" +CONFIG_RD_GZIP=y +CONFIG_RD_BZIP2=y +CONFIG_RD_LZMA=y +CONFIG_RD_XZ=y +CONFIG_RD_LZO=y +CONFIG_RD_LZ4=y +CONFIG_RD_ZSTD=y +# CONFIG_BOOT_CONFIG is not set +CONFIG_INITRAMFS_PRESERVE_MTIME=y +CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_LD_ORPHAN_WARN=y +CONFIG_LD_ORPHAN_WARN_LEVEL="warn" +CONFIG_SYSCTL=y +CONFIG_HAVE_UID16=y +CONFIG_SYSCTL_EXCEPTION_TRACE=y +CONFIG_EXPERT=y +CONFIG_UID16=y +CONFIG_MULTIUSER=y +# CONFIG_SGETMASK_SYSCALL is not set +# CONFIG_SYSFS_SYSCALL is not set +CONFIG_FHANDLE=y +CONFIG_POSIX_TIMERS=y +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_ELF_CORE=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_FUTEX_PI=y +CONFIG_EPOLL=y +CONFIG_SIGNALFD=y +CONFIG_TIMERFD=y +CONFIG_EVENTFD=y +CONFIG_SHMEM=y +CONFIG_AIO=y +CONFIG_IO_URING=y +CONFIG_ADVISE_SYSCALLS=y +CONFIG_MEMBARRIER=y +CONFIG_KALLSYMS=y +# CONFIG_KALLSYMS_SELFTEST is not set +CONFIG_KALLSYMS_ALL=y +CONFIG_KALLSYMS_BASE_RELATIVE=y +CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y +CONFIG_KCMP=y +CONFIG_RSEQ=y +CONFIG_CACHESTAT_SYSCALL=y +# CONFIG_DEBUG_RSEQ is not set +CONFIG_HAVE_PERF_EVENTS=y +CONFIG_GUEST_PERF_EVENTS=y +# CONFIG_PC104 is not set + +# +# Kernel Performance Events And Counters +# +CONFIG_PERF_EVENTS=y +# CONFIG_DEBUG_PERF_USE_VMALLOC is not set +# end of Kernel Performance Events And Counters + +CONFIG_SYSTEM_DATA_VERIFICATION=y +CONFIG_PROFILING=y +CONFIG_TRACEPOINTS=y + +# +# Kexec and crash features +# +CONFIG_CRASH_CORE=y +CONFIG_KEXEC_CORE=y +CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y +CONFIG_CRASH_DUMP=y +# end of Kexec and crash features +# end of General setup + +CONFIG_ARM64=y +CONFIG_GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_64BIT=y +CONFIG_MMU=y +CONFIG_ARM64_PAGE_SHIFT=16 +CONFIG_ARM64_CONT_PTE_SHIFT=5 +CONFIG_ARM64_CONT_PMD_SHIFT=5 +CONFIG_ARCH_MMAP_RND_BITS_MIN=14 +CONFIG_ARCH_MMAP_RND_BITS_MAX=29 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 +CONFIG_STACKTRACE_SUPPORT=y +CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000 +CONFIG_LOCKDEP_SUPPORT=y +CONFIG_GENERIC_BUG=y +CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y +CONFIG_GENERIC_HWEIGHT=y +CONFIG_GENERIC_CSUM=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_SMP=y +CONFIG_KERNEL_MODE_NEON=y +CONFIG_FIX_EARLYCON_MEM=y +CONFIG_PGTABLE_LEVELS=3 +CONFIG_ARCH_SUPPORTS_UPROBES=y +CONFIG_ARCH_PROC_KCORE_TEXT=y + +# +# Platform selection +# +# CONFIG_ARCH_ACTIONS is not set +CONFIG_ARCH_SUNXI=y +# CONFIG_ARCH_ALPINE is not set +# CONFIG_ARCH_APPLE is not set +# CONFIG_ARCH_BCM is not set +# CONFIG_ARCH_BERLIN is not set +# CONFIG_ARCH_BITMAIN is not set +# CONFIG_ARCH_EXYNOS is not set +# CONFIG_ARCH_SPARX5 is not set +# CONFIG_ARCH_K3 is not set +# CONFIG_ARCH_LG1K is not set +CONFIG_ARCH_HISI=y +# CONFIG_ARCH_KEEMBAY is not set +# CONFIG_ARCH_MEDIATEK is not set +CONFIG_ARCH_MESON=y +CONFIG_ARCH_MVEBU=y +# CONFIG_ARCH_NXP is not set +# CONFIG_ARCH_MA35 is not set +# CONFIG_ARCH_NPCM is not set +CONFIG_ARCH_QCOM=y +# CONFIG_ARCH_REALTEK is not set +# CONFIG_ARCH_RENESAS is not set +CONFIG_ARCH_ROCKCHIP=y +CONFIG_ARCH_SEATTLE=y +# CONFIG_ARCH_INTEL_SOCFPGA is not set +# CONFIG_ARCH_STM32 is not set +CONFIG_ARCH_SYNQUACER=y +CONFIG_ARCH_TEGRA=y +# CONFIG_ARCH_SPRD is not set +CONFIG_ARCH_THUNDER=y +CONFIG_ARCH_THUNDER2=y +# CONFIG_ARCH_UNIPHIER is not set +CONFIG_ARCH_VEXPRESS=y +# CONFIG_ARCH_VISCONTI is not set +CONFIG_ARCH_XGENE=y +CONFIG_ARCH_ZYNQMP=y +# end of Platform selection + +# +# Kernel Features +# + +# +# ARM errata workarounds via the alternatives framework +# +CONFIG_AMPERE_ERRATUM_AC03_CPU_38=y +CONFIG_ARM64_WORKAROUND_CLEAN_CACHE=y +CONFIG_ARM64_ERRATUM_826319=y +CONFIG_ARM64_ERRATUM_827319=y +CONFIG_ARM64_ERRATUM_824069=y +CONFIG_ARM64_ERRATUM_819472=y +CONFIG_ARM64_ERRATUM_832075=y +CONFIG_ARM64_ERRATUM_834220=y +CONFIG_ARM64_ERRATUM_1742098=y +CONFIG_ARM64_ERRATUM_845719=y +CONFIG_ARM64_ERRATUM_843419=y +CONFIG_ARM64_LD_HAS_FIX_ERRATUM_843419=y +CONFIG_ARM64_ERRATUM_1024718=y +CONFIG_ARM64_ERRATUM_1418040=y +CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT=y +CONFIG_ARM64_ERRATUM_1165522=y +CONFIG_ARM64_ERRATUM_1319367=y +CONFIG_ARM64_ERRATUM_1530923=y +CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y +CONFIG_ARM64_ERRATUM_2441007=y +CONFIG_ARM64_ERRATUM_1286807=y +CONFIG_ARM64_ERRATUM_1463225=y +CONFIG_ARM64_ERRATUM_1542419=y +CONFIG_ARM64_ERRATUM_1508412=y +CONFIG_ARM64_ERRATUM_2051678=y +CONFIG_ARM64_ERRATUM_2077057=y +CONFIG_ARM64_ERRATUM_2658417=y +CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y +CONFIG_ARM64_ERRATUM_2054223=y +CONFIG_ARM64_ERRATUM_2067961=y +CONFIG_ARM64_ERRATUM_2441009=y +CONFIG_ARM64_ERRATUM_2457168=y +CONFIG_ARM64_ERRATUM_2645198=y +CONFIG_ARM64_ERRATUM_2966298=y +CONFIG_CAVIUM_ERRATUM_22375=y +CONFIG_CAVIUM_ERRATUM_23144=y +CONFIG_CAVIUM_ERRATUM_23154=y +CONFIG_CAVIUM_ERRATUM_27456=y +CONFIG_CAVIUM_ERRATUM_30115=y +CONFIG_CAVIUM_TX2_ERRATUM_219=y +CONFIG_FUJITSU_ERRATUM_010001=y +CONFIG_HISILICON_ERRATUM_161600802=y +CONFIG_QCOM_FALKOR_ERRATUM_1003=y +CONFIG_QCOM_FALKOR_ERRATUM_1009=y +CONFIG_QCOM_QDF2400_ERRATUM_0065=y +CONFIG_QCOM_FALKOR_ERRATUM_E1041=y +CONFIG_NVIDIA_CARMEL_CNP_ERRATUM=y +CONFIG_ROCKCHIP_ERRATUM_3588001=y +CONFIG_SOCIONEXT_SYNQUACER_PREITS=y +# end of ARM errata workarounds via the alternatives framework + +# CONFIG_ARM64_4K_PAGES is not set +# CONFIG_ARM64_16K_PAGES is not set +CONFIG_ARM64_64K_PAGES=y +# CONFIG_ARM64_VA_BITS_42 is not set +CONFIG_ARM64_VA_BITS_48=y +# CONFIG_ARM64_VA_BITS_52 is not set +CONFIG_ARM64_VA_BITS=48 +CONFIG_ARM64_PA_BITS_48=y +# CONFIG_ARM64_PA_BITS_52 is not set +CONFIG_ARM64_PA_BITS=48 +# CONFIG_CPU_BIG_ENDIAN is not set +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_SCHED_MC=y +# CONFIG_SCHED_CLUSTER is not set +CONFIG_SCHED_SMT=y +CONFIG_NR_CPUS=512 +CONFIG_HOTPLUG_CPU=y +CONFIG_NUMA=y +CONFIG_NODES_SHIFT=6 +# CONFIG_HZ_100 is not set +CONFIG_HZ_250=y +# CONFIG_HZ_300 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=250 +CONFIG_SCHED_HRTICK=y +CONFIG_ARCH_SPARSEMEM_ENABLE=y +CONFIG_HW_PERF_EVENTS=y +CONFIG_PARAVIRT=y +CONFIG_PARAVIRT_TIME_ACCOUNTING=y +CONFIG_ARCH_SUPPORTS_KEXEC=y +CONFIG_ARCH_SUPPORTS_KEXEC_FILE=y +CONFIG_ARCH_SUPPORTS_KEXEC_SIG=y +CONFIG_ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG=y +CONFIG_ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG=y +CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y +CONFIG_TRANS_TABLE=y +CONFIG_XEN_DOM0=y +CONFIG_XEN=y +CONFIG_ARCH_FORCE_MAX_ORDER=13 +CONFIG_UNMAP_KERNEL_AT_EL0=y +CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY=y +CONFIG_RODATA_FULL_DEFAULT_ENABLED=y +# CONFIG_ARM64_SW_TTBR0_PAN is not set +CONFIG_ARM64_TAGGED_ADDR_ABI=y +CONFIG_COMPAT=y +CONFIG_KUSER_HELPERS=y +# CONFIG_COMPAT_ALIGNMENT_FIXUPS is not set +CONFIG_ARMV8_DEPRECATED=y +CONFIG_SWP_EMULATION=y +CONFIG_CP15_BARRIER_EMULATION=y +CONFIG_SETEND_EMULATION=y + +# +# ARMv8.1 architectural features +# +CONFIG_ARM64_HW_AFDBM=y +CONFIG_ARM64_PAN=y +CONFIG_AS_HAS_LSE_ATOMICS=y +CONFIG_ARM64_LSE_ATOMICS=y +CONFIG_ARM64_USE_LSE_ATOMICS=y +# end of ARMv8.1 architectural features + +# +# ARMv8.2 architectural features +# +CONFIG_AS_HAS_ARMV8_2=y +CONFIG_AS_HAS_SHA3=y +CONFIG_ARM64_PMEM=y +CONFIG_ARM64_RAS_EXTN=y +CONFIG_ARM64_CNP=y +# end of ARMv8.2 architectural features + +# +# ARMv8.3 architectural features +# +CONFIG_ARM64_PTR_AUTH=y +CONFIG_ARM64_PTR_AUTH_KERNEL=y +CONFIG_CC_HAS_SIGN_RETURN_ADDRESS=y +CONFIG_AS_HAS_ARMV8_3=y +CONFIG_AS_HAS_LDAPR=y +# end of ARMv8.3 architectural features + +# +# ARMv8.4 architectural features +# +CONFIG_ARM64_AMU_EXTN=y +CONFIG_AS_HAS_ARMV8_4=y +CONFIG_ARM64_TLB_RANGE=y +# end of ARMv8.4 architectural features + +# +# ARMv8.5 architectural features +# +CONFIG_ARM64_BTI=y +CONFIG_ARM64_E0PD=y +# end of ARMv8.5 architectural features + +# +# ARMv8.7 architectural features +# +CONFIG_ARM64_EPAN=y +# end of ARMv8.7 architectural features + +CONFIG_ARM64_SVE=y +CONFIG_ARM64_SME=y +CONFIG_ARM64_PSEUDO_NMI=y +# CONFIG_ARM64_DEBUG_PRIORITY_MASKING is not set +CONFIG_RELOCATABLE=y +CONFIG_RANDOMIZE_BASE=y +CONFIG_RANDOMIZE_MODULE_REGION_FULL=y +CONFIG_ARM64_CONTPTE=y +CONFIG_HAVE_LIVEPATCH=y +CONFIG_LIVEPATCH=y +# end of Kernel Features + +# +# Boot options +# +CONFIG_ARM64_ACPI_PARKING_PROTOCOL=y +CONFIG_CMDLINE="" +CONFIG_EFI_STUB=y +CONFIG_EFI=y +CONFIG_DMI=y +# end of Boot options + +# +# Power management options +# +CONFIG_SUSPEND=y +CONFIG_SUSPEND_FREEZER=y +# CONFIG_SUSPEND_SKIP_SYNC is not set +CONFIG_HIBERNATE_CALLBACKS=y +CONFIG_HIBERNATION=y +CONFIG_HIBERNATION_SNAPSHOT_DEV=y +CONFIG_PM_STD_PARTITION="" +CONFIG_PM_SLEEP=y +CONFIG_PM_SLEEP_SMP=y +# CONFIG_PM_AUTOSLEEP is not set +# CONFIG_PM_USERSPACE_AUTOSLEEP is not set +# CONFIG_PM_WAKELOCKS is not set +CONFIG_PM=y +CONFIG_PM_DEBUG=y +CONFIG_PM_ADVANCED_DEBUG=y +# CONFIG_PM_TEST_SUSPEND is not set +CONFIG_PM_SLEEP_DEBUG=y +# CONFIG_DPM_WATCHDOG is not set +CONFIG_PM_CLK=y +CONFIG_PM_GENERIC_DOMAINS=y +# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set +CONFIG_PM_GENERIC_DOMAINS_SLEEP=y +CONFIG_PM_GENERIC_DOMAINS_OF=y +CONFIG_CPU_PM=y +CONFIG_ENERGY_MODEL=y +CONFIG_ARCH_HIBERNATION_POSSIBLE=y +CONFIG_ARCH_HIBERNATION_HEADER=y +CONFIG_ARCH_SUSPEND_POSSIBLE=y +# end of Power management options + +# +# CPU Power Management +# + +# +# CPU Idle +# +CONFIG_CPU_IDLE=y +CONFIG_CPU_IDLE_GOV_LADDER=y +CONFIG_CPU_IDLE_GOV_MENU=y +# CONFIG_CPU_IDLE_GOV_TEO is not set + +# +# ARM CPU Idle Drivers +# +# CONFIG_ARM_PSCI_CPUIDLE is not set +# end of ARM CPU Idle Drivers +# end of CPU Idle + +# +# CPU Frequency scaling +# +CONFIG_CPU_FREQ=y +CONFIG_CPU_FREQ_GOV_ATTR_SET=y +CONFIG_CPU_FREQ_GOV_COMMON=y +CONFIG_CPU_FREQ_STAT=y +CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y +# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set +CONFIG_CPU_FREQ_GOV_PERFORMANCE=y +CONFIG_CPU_FREQ_GOV_POWERSAVE=m +CONFIG_CPU_FREQ_GOV_USERSPACE=m +CONFIG_CPU_FREQ_GOV_ONDEMAND=m +CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m +CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y + +# +# CPU frequency scaling drivers +# +CONFIG_CPUFREQ_DT=m +CONFIG_CPUFREQ_DT_PLATDEV=y +CONFIG_ACPI_CPPC_CPUFREQ=m +CONFIG_ACPI_CPPC_CPUFREQ_FIE=y +CONFIG_ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM=m +CONFIG_ARM_ARMADA_37XX_CPUFREQ=m +# CONFIG_ARM_ARMADA_8K_CPUFREQ is not set +# CONFIG_ARM_QCOM_CPUFREQ_HW is not set +CONFIG_ARM_TEGRA20_CPUFREQ=m +CONFIG_ARM_TEGRA124_CPUFREQ=y +# end of CPU Frequency scaling +# end of CPU Power Management + +CONFIG_ARCH_SUPPORTS_ACPI=y +CONFIG_ACPI=y +CONFIG_ACPI_GENERIC_GSI=y +CONFIG_ACPI_CCA_REQUIRED=y +# CONFIG_ACPI_DEBUGGER is not set +CONFIG_ACPI_SPCR_TABLE=y +# CONFIG_ACPI_FPDT is not set +# CONFIG_ACPI_EC_DEBUGFS is not set +CONFIG_ACPI_AC=y +CONFIG_ACPI_BATTERY=y +CONFIG_ACPI_BUTTON=y +CONFIG_ACPI_VIDEO=m +CONFIG_ACPI_FAN=y +CONFIG_ACPI_TAD=m +# CONFIG_ACPI_DOCK is not set +CONFIG_ACPI_PROCESSOR_IDLE=y +CONFIG_ACPI_MCFG=y +CONFIG_ACPI_CPPC_LIB=y +CONFIG_ACPI_PROCESSOR=y +CONFIG_ACPI_IPMI=m +CONFIG_ACPI_HOTPLUG_CPU=y +CONFIG_ACPI_THERMAL=y +CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y +CONFIG_ACPI_TABLE_UPGRADE=y +# CONFIG_ACPI_DEBUG is not set +CONFIG_ACPI_PCI_SLOT=y +CONFIG_ACPI_CONTAINER=y +# CONFIG_ACPI_HOTPLUG_MEMORY is not set +CONFIG_ACPI_HED=y +# CONFIG_ACPI_CUSTOM_METHOD is not set +CONFIG_ACPI_BGRT=y +CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y +CONFIG_ACPI_NFIT=m +# CONFIG_NFIT_SECURITY_DEBUG is not set +CONFIG_ACPI_NUMA=y +# CONFIG_ACPI_HMAT is not set +CONFIG_HAVE_ACPI_APEI=y +CONFIG_ACPI_APEI=y +CONFIG_ACPI_APEI_GHES=y +CONFIG_ACPI_APEI_PCIEAER=y +CONFIG_ACPI_APEI_SEA=y +CONFIG_ACPI_APEI_MEMORY_FAILURE=y +CONFIG_ACPI_APEI_EINJ=m +# CONFIG_ACPI_APEI_ERST_DEBUG is not set +CONFIG_ACPI_WATCHDOG=y +# CONFIG_ACPI_CONFIGFS is not set +# CONFIG_ACPI_PFRUT is not set +CONFIG_ACPI_IORT=y +CONFIG_ACPI_GTDT=y +# CONFIG_ACPI_AGDI is not set +CONFIG_ACPI_APMT=y +CONFIG_ACPI_PPTT=y +CONFIG_ACPI_PCC=y +# CONFIG_ACPI_FFH is not set +# CONFIG_PMIC_OPREGION is not set +CONFIG_ACPI_PRMT=y +CONFIG_IRQ_BYPASS_MANAGER=y +CONFIG_HAVE_KVM=y +CONFIG_HAVE_KVM_IRQCHIP=y +CONFIG_HAVE_KVM_IRQFD=y +CONFIG_HAVE_KVM_IRQ_ROUTING=y +CONFIG_HAVE_KVM_DIRTY_RING=y +CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y +CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP=y +CONFIG_HAVE_KVM_EVENTFD=y +CONFIG_KVM_MMIO=y +CONFIG_HAVE_KVM_MSI=y +CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y +CONFIG_KVM_VFIO=y +CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y +CONFIG_HAVE_KVM_IRQ_BYPASS=y +CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE=y +CONFIG_KVM_XFER_TO_GUEST_WORK=y +CONFIG_KVM_GENERIC_HARDWARE_ENABLING=y +CONFIG_VIRTUALIZATION=y +CONFIG_KVM=y +# CONFIG_NVHE_EL2_DEBUG is not set + +# +# General architecture-dependent options +# +CONFIG_HOTPLUG_CORE_SYNC=y +CONFIG_HOTPLUG_CORE_SYNC_DEAD=y +CONFIG_KPROBES=y +CONFIG_JUMP_LABEL=y +# CONFIG_STATIC_KEYS_SELFTEST is not set +CONFIG_UPROBES=y +CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y +CONFIG_KRETPROBES=y +CONFIG_HAVE_IOREMAP_PROT=y +CONFIG_HAVE_KPROBES=y +CONFIG_HAVE_KRETPROBES=y +CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y +CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y +CONFIG_HAVE_NMI=y +CONFIG_TRACE_IRQFLAGS_SUPPORT=y +CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y +CONFIG_HAVE_ARCH_TRACEHOOK=y +CONFIG_HAVE_DMA_CONTIGUOUS=y +CONFIG_GENERIC_SMP_IDLE_THREAD=y +CONFIG_GENERIC_IDLE_POLL_SETUP=y +CONFIG_ARCH_HAS_FORTIFY_SOURCE=y +CONFIG_ARCH_HAS_KEEPINITRD=y +CONFIG_ARCH_HAS_SET_MEMORY=y +CONFIG_ARCH_HAS_SET_DIRECT_MAP=y +CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y +CONFIG_ARCH_WANTS_NO_INSTR=y +CONFIG_HAVE_ASM_MODVERSIONS=y +CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y +CONFIG_HAVE_RSEQ=y +CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y +CONFIG_HAVE_HW_BREAKPOINT=y +CONFIG_HAVE_PERF_EVENTS_NMI=y +CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y +CONFIG_HAVE_PERF_REGS=y +CONFIG_HAVE_PERF_USER_STACK_DUMP=y +CONFIG_HAVE_ARCH_JUMP_LABEL=y +CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y +CONFIG_MMU_GATHER_TABLE_FREE=y +CONFIG_MMU_GATHER_RCU_TABLE_FREE=y +CONFIG_MMU_LAZY_TLB_REFCOUNT=y +CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y +CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS=y +CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y +CONFIG_HAVE_CMPXCHG_LOCAL=y +CONFIG_HAVE_CMPXCHG_DOUBLE=y +CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y +CONFIG_HAVE_ARCH_SECCOMP=y +CONFIG_HAVE_ARCH_SECCOMP_FILTER=y +CONFIG_SECCOMP=y +CONFIG_SECCOMP_FILTER=y +# CONFIG_SECCOMP_CACHE_DEBUG is not set +CONFIG_HAVE_ARCH_STACKLEAK=y +CONFIG_HAVE_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR_STRONG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y +CONFIG_LTO_NONE=y +CONFIG_ARCH_SUPPORTS_CFI_CLANG=y +CONFIG_HAVE_CONTEXT_TRACKING_USER=y +CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y +CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y +CONFIG_HAVE_MOVE_PUD=y +CONFIG_HAVE_MOVE_PMD=y +CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y +CONFIG_HAVE_ARCH_HUGE_VMAP=y +CONFIG_HAVE_ARCH_HUGE_VMALLOC=y +CONFIG_ARCH_WANT_PMD_MKWRITE=y +CONFIG_HAVE_MOD_ARCH_SPECIFIC=y +CONFIG_MODULES_USE_ELF_RELA=y +CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y +CONFIG_SOFTIRQ_ON_OWN_STACK=y +CONFIG_ARCH_HAS_ELF_RANDOMIZE=y +CONFIG_HAVE_ARCH_MMAP_RND_BITS=y +CONFIG_ARCH_MMAP_RND_BITS=18 +CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y +CONFIG_ARCH_MMAP_RND_COMPAT_BITS=11 +CONFIG_PAGE_SIZE_LESS_THAN_256KB=y +CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT=y +CONFIG_HAVE_RELIABLE_STACKTRACE=y +CONFIG_CLONE_BACKWARDS=y +CONFIG_OLD_SIGSUSPEND3=y +CONFIG_COMPAT_OLD_SIGACTION=y +CONFIG_COMPAT_32BIT_TIME=y +CONFIG_HAVE_ARCH_VMAP_STACK=y +CONFIG_VMAP_STACK=y +CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y +CONFIG_RANDOMIZE_KSTACK_OFFSET=y +# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set +CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y +CONFIG_STRICT_MODULE_RWX=y +CONFIG_HAVE_ARCH_COMPILER_H=y +CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y +CONFIG_ARCH_USE_MEMREMAP_PROT=y +# CONFIG_LOCK_EVENT_COUNTS is not set +CONFIG_ARCH_HAS_RELR=y +CONFIG_HAVE_PREEMPT_DYNAMIC=y +CONFIG_HAVE_PREEMPT_DYNAMIC_KEY=y +CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y +CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y +CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y +CONFIG_ARCH_HAVE_TRACE_MMIO_ACCESS=y + +# +# GCOV-based kernel profiling +# +# CONFIG_GCOV_KERNEL is not set +CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y +# end of GCOV-based kernel profiling + +CONFIG_HAVE_GCC_PLUGINS=y +CONFIG_FUNCTION_ALIGNMENT_4B=y +CONFIG_FUNCTION_ALIGNMENT_8B=y +CONFIG_FUNCTION_ALIGNMENT=8 +# end of General architecture-dependent options + +CONFIG_RT_MUTEXES=y +CONFIG_BASE_SMALL=0 +CONFIG_MODULE_SIG_FORMAT=y +CONFIG_MODULES=y +# CONFIG_MODULE_DEBUG is not set +CONFIG_MODULE_FORCE_LOAD=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODULE_FORCE_UNLOAD=y +# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set +CONFIG_MODVERSIONS=y +CONFIG_ASM_MODVERSIONS=y +# CONFIG_MODULE_SRCVERSION_ALL is not set +CONFIG_MODULE_SIG=y +# CONFIG_MODULE_SIG_FORCE is not set +# CONFIG_MODULE_SIG_ALL is not set +# CONFIG_MODULE_SIG_SHA1 is not set +# CONFIG_MODULE_SIG_SHA224 is not set +CONFIG_MODULE_SIG_SHA256=y +# CONFIG_MODULE_SIG_SHA384 is not set +# CONFIG_MODULE_SIG_SHA512 is not set +CONFIG_MODULE_SIG_HASH="sha256" +CONFIG_MODULE_COMPRESS_NONE=y +# CONFIG_MODULE_COMPRESS_GZIP is not set +# CONFIG_MODULE_COMPRESS_XZ is not set +# CONFIG_MODULE_COMPRESS_ZSTD is not set +# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set +CONFIG_MODPROBE_PATH="/sbin/modprobe" +# CONFIG_TRIM_UNUSED_KSYMS is not set +CONFIG_MODULES_TREE_LOOKUP=y +CONFIG_BLOCK=y +CONFIG_BLOCK_LEGACY_AUTOLOAD=y +CONFIG_BLK_RQ_ALLOC_TIME=y +CONFIG_BLK_CGROUP_RWSTAT=y +CONFIG_BLK_CGROUP_PUNT_BIO=y +CONFIG_BLK_DEV_BSG_COMMON=y +CONFIG_BLK_ICQ=y +CONFIG_BLK_DEV_BSGLIB=y +CONFIG_BLK_DEV_INTEGRITY=y +CONFIG_BLK_DEV_INTEGRITY_T10=m +CONFIG_BLK_DEV_ZONED=y +CONFIG_BLK_DEV_THROTTLING=y +# CONFIG_BLK_DEV_THROTTLING_LOW is not set +CONFIG_BLK_WBT=y +CONFIG_BLK_WBT_MQ=y +# CONFIG_BLK_CGROUP_IOLATENCY is not set +# CONFIG_BLK_CGROUP_FC_APPID is not set +CONFIG_BLK_CGROUP_IOCOST=y +# CONFIG_BLK_CGROUP_IOPRIO is not set +CONFIG_BLK_DEBUG_FS=y +CONFIG_BLK_DEBUG_FS_ZONED=y +CONFIG_BLK_SED_OPAL=y +# CONFIG_BLK_INLINE_ENCRYPTION is not set + +# +# Partition Types +# +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_AIX_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +# CONFIG_MAC_PARTITION is not set +CONFIG_MSDOS_PARTITION=y +# CONFIG_BSD_DISKLABEL is not set +# CONFIG_MINIX_SUBPARTITION is not set +# CONFIG_SOLARIS_X86_PARTITION is not set +# CONFIG_UNIXWARE_DISKLABEL is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set +CONFIG_KARMA_PARTITION=y +CONFIG_EFI_PARTITION=y +# CONFIG_SYSV68_PARTITION is not set +# CONFIG_CMDLINE_PARTITION is not set +# end of Partition Types + +CONFIG_BLK_MQ_PCI=y +CONFIG_BLK_MQ_VIRTIO=y +CONFIG_BLK_PM=y +CONFIG_BLOCK_HOLDER_DEPRECATED=y +CONFIG_BLK_MQ_STACKING=y + +# +# IO Schedulers +# +CONFIG_MQ_IOSCHED_DEADLINE=y +CONFIG_MQ_IOSCHED_KYBER=m +CONFIG_IOSCHED_BFQ=m +CONFIG_BFQ_GROUP_IOSCHED=y +# CONFIG_BFQ_CGROUP_DEBUG is not set +# end of IO Schedulers + +CONFIG_PREEMPT_NOTIFIERS=y +CONFIG_PADATA=y +CONFIG_ASN1=y +CONFIG_ARCH_INLINE_SPIN_TRYLOCK=y +CONFIG_ARCH_INLINE_SPIN_TRYLOCK_BH=y +CONFIG_ARCH_INLINE_SPIN_LOCK=y +CONFIG_ARCH_INLINE_SPIN_LOCK_BH=y +CONFIG_ARCH_INLINE_SPIN_LOCK_IRQ=y +CONFIG_ARCH_INLINE_SPIN_LOCK_IRQSAVE=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK_BH=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQ=y +CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE=y +CONFIG_ARCH_INLINE_READ_LOCK=y +CONFIG_ARCH_INLINE_READ_LOCK_BH=y +CONFIG_ARCH_INLINE_READ_LOCK_IRQ=y +CONFIG_ARCH_INLINE_READ_LOCK_IRQSAVE=y +CONFIG_ARCH_INLINE_READ_UNLOCK=y +CONFIG_ARCH_INLINE_READ_UNLOCK_BH=y +CONFIG_ARCH_INLINE_READ_UNLOCK_IRQ=y +CONFIG_ARCH_INLINE_READ_UNLOCK_IRQRESTORE=y +CONFIG_ARCH_INLINE_WRITE_LOCK=y +CONFIG_ARCH_INLINE_WRITE_LOCK_BH=y +CONFIG_ARCH_INLINE_WRITE_LOCK_IRQ=y +CONFIG_ARCH_INLINE_WRITE_LOCK_IRQSAVE=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK_BH=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQ=y +CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE=y +CONFIG_INLINE_SPIN_TRYLOCK=y +CONFIG_INLINE_SPIN_TRYLOCK_BH=y +CONFIG_INLINE_SPIN_LOCK=y +CONFIG_INLINE_SPIN_LOCK_BH=y +CONFIG_INLINE_SPIN_LOCK_IRQ=y +CONFIG_INLINE_SPIN_LOCK_IRQSAVE=y +CONFIG_INLINE_SPIN_UNLOCK_BH=y +CONFIG_INLINE_SPIN_UNLOCK_IRQ=y +CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE=y +CONFIG_INLINE_READ_LOCK=y +CONFIG_INLINE_READ_LOCK_BH=y +CONFIG_INLINE_READ_LOCK_IRQ=y +CONFIG_INLINE_READ_LOCK_IRQSAVE=y +CONFIG_INLINE_READ_UNLOCK=y +CONFIG_INLINE_READ_UNLOCK_BH=y +CONFIG_INLINE_READ_UNLOCK_IRQ=y +CONFIG_INLINE_READ_UNLOCK_IRQRESTORE=y +CONFIG_INLINE_WRITE_LOCK=y +CONFIG_INLINE_WRITE_LOCK_BH=y +CONFIG_INLINE_WRITE_LOCK_IRQ=y +CONFIG_INLINE_WRITE_LOCK_IRQSAVE=y +CONFIG_INLINE_WRITE_UNLOCK=y +CONFIG_INLINE_WRITE_UNLOCK_BH=y +CONFIG_INLINE_WRITE_UNLOCK_IRQ=y +CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE=y +CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y +CONFIG_MUTEX_SPIN_ON_OWNER=y +CONFIG_RWSEM_SPIN_ON_OWNER=y +CONFIG_LOCK_SPIN_ON_OWNER=y +CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y +CONFIG_QUEUED_SPINLOCKS=y +CONFIG_ARCH_USE_QUEUED_RWLOCKS=y +CONFIG_QUEUED_RWLOCKS=y +CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y +CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y +CONFIG_FREEZER=y + +# +# Executable file formats +# +CONFIG_BINFMT_ELF=y +CONFIG_COMPAT_BINFMT_ELF=y +CONFIG_ARCH_BINFMT_ELF_STATE=y +CONFIG_ARCH_BINFMT_ELF_EXTRA_PHDRS=y +CONFIG_ARCH_HAVE_ELF_PROT=y +CONFIG_ARCH_USE_GNU_PROPERTY=y +CONFIG_ELFCORE=y +CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y +CONFIG_BINFMT_SCRIPT=y +CONFIG_BINFMT_MISC=y +CONFIG_COREDUMP=y +# end of Executable file formats + +# +# Memory Management options +# +CONFIG_ZPOOL=y +CONFIG_SWAP=y +CONFIG_ZSWAP=y +# CONFIG_ZSWAP_DEFAULT_ON is not set +# CONFIG_ZSWAP_EXCLUSIVE_LOADS_DEFAULT_ON is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_DEFLATE is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZO=y +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_842 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4HC is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT="lzo" +CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y +# CONFIG_ZSWAP_ZPOOL_DEFAULT_Z3FOLD is not set +# CONFIG_ZSWAP_ZPOOL_DEFAULT_ZSMALLOC is not set +CONFIG_ZSWAP_ZPOOL_DEFAULT="zbud" +CONFIG_ZBUD=y +CONFIG_Z3FOLD=m +CONFIG_ZSMALLOC=m +# CONFIG_ZSMALLOC_STAT is not set +CONFIG_ZSMALLOC_CHAIN_SIZE=8 + +# +# SLAB allocator options +# +# CONFIG_SLAB_DEPRECATED is not set +CONFIG_SLUB=y +# CONFIG_SLUB_TINY is not set +CONFIG_SLAB_MERGE_DEFAULT=y +CONFIG_SLAB_FREELIST_RANDOM=y +CONFIG_SLAB_FREELIST_HARDENED=y +# CONFIG_SLUB_STATS is not set +CONFIG_SLUB_CPU_PARTIAL=y +# CONFIG_RANDOM_KMALLOC_CACHES is not set +# end of SLAB allocator options + +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SPARSEMEM=y +CONFIG_SPARSEMEM_EXTREME=y +CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y +CONFIG_SPARSEMEM_VMEMMAP=y +CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y +CONFIG_NUMA_KEEP_MEMINFO=y +CONFIG_MEMORY_ISOLATION=y +CONFIG_EXCLUSIVE_SYSTEM_RAM=y +CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y +CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y +CONFIG_MEMORY_HOTPLUG=y +# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set +CONFIG_MEMORY_HOTREMOVE=y +CONFIG_MHP_MEMMAP_ON_MEMORY=y +CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y +CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y +CONFIG_MEMORY_BALLOON=y +CONFIG_BALLOON_COMPACTION=y +CONFIG_COMPACTION=y +CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1 +CONFIG_PAGE_REPORTING=y +CONFIG_MIGRATION=y +CONFIG_DEVICE_MIGRATION=y +CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y +CONFIG_ARCH_ENABLE_THP_MIGRATION=y +CONFIG_CONTIG_ALLOC=y +CONFIG_PHYS_ADDR_T_64BIT=y +CONFIG_MMU_NOTIFIER=y +CONFIG_KSM=y +CONFIG_DEFAULT_MMAP_MIN_ADDR=4096 +CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y +CONFIG_MEMORY_FAILURE=y +CONFIG_HWPOISON_INJECT=m +CONFIG_TRANSPARENT_HUGEPAGE=y +# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set +CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y +# CONFIG_READ_ONLY_THP_FOR_FS is not set +CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y +CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y +CONFIG_USE_PERCPU_NUMA_NODE_ID=y +CONFIG_HAVE_SETUP_PER_CPU_AREA=y +# CONFIG_CMA is not set +CONFIG_GENERIC_EARLY_IOREMAP=y +CONFIG_DEFERRED_STRUCT_PAGE_INIT=y +# CONFIG_IDLE_PAGE_TRACKING is not set +CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y +CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y +CONFIG_ARCH_HAS_PTE_DEVMAP=y +CONFIG_ARCH_HAS_ZONE_DMA_SET=y +CONFIG_ZONE_DMA=y +CONFIG_ZONE_DMA32=y +CONFIG_ZONE_DEVICE=y +CONFIG_HMM_MIRROR=y +CONFIG_DEVICE_PRIVATE=y +CONFIG_VM_EVENT_COUNTERS=y +# CONFIG_PERCPU_STATS is not set +# CONFIG_GUP_TEST is not set +# CONFIG_DMAPOOL_TEST is not set +CONFIG_ARCH_HAS_PTE_SPECIAL=y +CONFIG_MEMFD_CREATE=y +CONFIG_SECRETMEM=y +# CONFIG_ANON_VMA_NAME is not set +CONFIG_USERFAULTFD=y +CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y +# CONFIG_LRU_GEN is not set +CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y +CONFIG_PER_VMA_LOCK=y +CONFIG_LOCK_MM_AND_FIND_VMA=y + +# +# Data Access Monitoring +# +# CONFIG_DAMON is not set +# end of Data Access Monitoring +# end of Memory Management options + +CONFIG_NET=y +CONFIG_COMPAT_NETLINK_MESSAGES=y +CONFIG_NET_INGRESS=y +CONFIG_NET_EGRESS=y +CONFIG_NET_XGRESS=y +CONFIG_NET_REDIRECT=y +CONFIG_SKB_EXTENSIONS=y + +# +# Networking options +# +CONFIG_PACKET=y +CONFIG_PACKET_DIAG=m +CONFIG_UNIX=y +CONFIG_UNIX_SCM=y +CONFIG_AF_UNIX_OOB=y +CONFIG_UNIX_DIAG=m +# CONFIG_TLS is not set +CONFIG_XFRM=y +CONFIG_XFRM_OFFLOAD=y +CONFIG_XFRM_ALGO=m +CONFIG_XFRM_USER=m +CONFIG_XFRM_INTERFACE=m +CONFIG_XFRM_SUB_POLICY=y +CONFIG_XFRM_MIGRATE=y +CONFIG_XFRM_STATISTICS=y +CONFIG_XFRM_AH=m +CONFIG_XFRM_ESP=m +CONFIG_XFRM_IPCOMP=m +CONFIG_NET_KEY=m +CONFIG_NET_KEY_MIGRATE=y +CONFIG_SMC=m +CONFIG_SMC_DIAG=m +CONFIG_XDP_SOCKETS=y +# CONFIG_XDP_SOCKETS_DIAG is not set +CONFIG_NET_HANDSHAKE=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_FIB_TRIE_STATS=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_ROUTE_CLASSID=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=m +CONFIG_NET_IPGRE_DEMUX=m +CONFIG_NET_IP_TUNNEL=m +CONFIG_NET_IPGRE=m +CONFIG_NET_IPGRE_BROADCAST=y +CONFIG_IP_MROUTE_COMMON=y +CONFIG_IP_MROUTE=y +CONFIG_IP_MROUTE_MULTIPLE_TABLES=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_NET_IPVTI=m +CONFIG_NET_UDP_TUNNEL=m +CONFIG_NET_FOU=m +CONFIG_NET_FOU_IP_TUNNELS=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_ESP_OFFLOAD=m +# CONFIG_INET_ESPINTCP is not set +CONFIG_INET_IPCOMP=m +CONFIG_INET_TABLE_PERTURB_ORDER=16 +CONFIG_INET_XFRM_TUNNEL=m +CONFIG_INET_TUNNEL=m +CONFIG_INET_DIAG=m +CONFIG_INET_TCP_DIAG=m +CONFIG_INET_UDP_DIAG=m +CONFIG_INET_RAW_DIAG=m +CONFIG_INET_DIAG_DESTROY=y +CONFIG_TCP_CONG_ADVANCED=y +CONFIG_TCP_CONG_BIC=m +CONFIG_TCP_CONG_CUBIC=y +CONFIG_TCP_CONG_WESTWOOD=m +CONFIG_TCP_CONG_HTCP=m +CONFIG_TCP_CONG_HSTCP=m +CONFIG_TCP_CONG_HYBLA=m +CONFIG_TCP_CONG_VEGAS=m +CONFIG_TCP_CONG_NV=m +CONFIG_TCP_CONG_SCALABLE=m +CONFIG_TCP_CONG_LP=m +CONFIG_TCP_CONG_VENO=m +CONFIG_TCP_CONG_YEAH=m +CONFIG_TCP_CONG_ILLINOIS=m +CONFIG_TCP_CONG_DCTCP=m +CONFIG_TCP_CONG_CDG=m +CONFIG_TCP_CONG_BBR=m +CONFIG_DEFAULT_CUBIC=y +# CONFIG_DEFAULT_RENO is not set +CONFIG_DEFAULT_TCP_CONG="cubic" +CONFIG_TCP_MD5SIG=y +CONFIG_IPV6=y +CONFIG_IPV6_ROUTER_PREF=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_OPTIMISTIC_DAD=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_ESP_OFFLOAD=m +# CONFIG_INET6_ESPINTCP is not set +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_MIP6=y +CONFIG_IPV6_ILA=m +CONFIG_INET6_XFRM_TUNNEL=m +CONFIG_INET6_TUNNEL=m +CONFIG_IPV6_VTI=m +CONFIG_IPV6_SIT=m +CONFIG_IPV6_SIT_6RD=y +CONFIG_IPV6_NDISC_NODETYPE=y +CONFIG_IPV6_TUNNEL=m +CONFIG_IPV6_GRE=m +CONFIG_IPV6_FOU=m +CONFIG_IPV6_FOU_TUNNEL=m +CONFIG_IPV6_MULTIPLE_TABLES=y +CONFIG_IPV6_SUBTREES=y +CONFIG_IPV6_MROUTE=y +CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y +CONFIG_IPV6_PIMSM_V2=y +CONFIG_IPV6_SEG6_LWTUNNEL=y +CONFIG_IPV6_SEG6_HMAC=y +CONFIG_IPV6_SEG6_BPF=y +# CONFIG_IPV6_RPL_LWTUNNEL is not set +# CONFIG_IPV6_IOAM6_LWTUNNEL is not set +# CONFIG_NETLABEL is not set +# CONFIG_MPTCP is not set +CONFIG_NETWORK_SECMARK=y +CONFIG_NET_PTP_CLASSIFY=y +# CONFIG_NETWORK_PHY_TIMESTAMPING is not set +CONFIG_NETFILTER=y +CONFIG_NETFILTER_ADVANCED=y +CONFIG_BRIDGE_NETFILTER=m + +# +# Core Netfilter Configuration +# +CONFIG_NETFILTER_INGRESS=y +CONFIG_NETFILTER_EGRESS=y +CONFIG_NETFILTER_SKIP_EGRESS=y +CONFIG_NETFILTER_NETLINK=m +CONFIG_NETFILTER_FAMILY_BRIDGE=y +CONFIG_NETFILTER_FAMILY_ARP=y +CONFIG_NETFILTER_BPF_LINK=y +# CONFIG_NETFILTER_NETLINK_HOOK is not set +CONFIG_NETFILTER_NETLINK_ACCT=m +CONFIG_NETFILTER_NETLINK_QUEUE=m +CONFIG_NETFILTER_NETLINK_LOG=m +CONFIG_NETFILTER_NETLINK_OSF=m +CONFIG_NF_CONNTRACK=m +CONFIG_NF_LOG_SYSLOG=m +CONFIG_NETFILTER_CONNCOUNT=m +CONFIG_NF_CONNTRACK_MARK=y +CONFIG_NF_CONNTRACK_SECMARK=y +CONFIG_NF_CONNTRACK_ZONES=y +CONFIG_NF_CONNTRACK_PROCFS=y +CONFIG_NF_CONNTRACK_EVENTS=y +CONFIG_NF_CONNTRACK_TIMEOUT=y +CONFIG_NF_CONNTRACK_TIMESTAMP=y +CONFIG_NF_CONNTRACK_LABELS=y +CONFIG_NF_CONNTRACK_OVS=y +CONFIG_NF_CT_PROTO_DCCP=y +CONFIG_NF_CT_PROTO_GRE=y +CONFIG_NF_CT_PROTO_SCTP=y +CONFIG_NF_CT_PROTO_UDPLITE=y +CONFIG_NF_CONNTRACK_AMANDA=m +CONFIG_NF_CONNTRACK_FTP=m +CONFIG_NF_CONNTRACK_H323=m +CONFIG_NF_CONNTRACK_IRC=m +CONFIG_NF_CONNTRACK_BROADCAST=m +CONFIG_NF_CONNTRACK_NETBIOS_NS=m +CONFIG_NF_CONNTRACK_SNMP=m +CONFIG_NF_CONNTRACK_PPTP=m +CONFIG_NF_CONNTRACK_SANE=m +CONFIG_NF_CONNTRACK_SIP=m +CONFIG_NF_CONNTRACK_TFTP=m +CONFIG_NF_CT_NETLINK=m +CONFIG_NF_CT_NETLINK_TIMEOUT=m +CONFIG_NF_CT_NETLINK_HELPER=m +CONFIG_NETFILTER_NETLINK_GLUE_CT=y +CONFIG_NF_NAT=m +CONFIG_NF_NAT_AMANDA=m +CONFIG_NF_NAT_FTP=m +CONFIG_NF_NAT_IRC=m +CONFIG_NF_NAT_SIP=m +CONFIG_NF_NAT_TFTP=m +CONFIG_NF_NAT_REDIRECT=y +CONFIG_NF_NAT_MASQUERADE=y +CONFIG_NF_NAT_OVS=y +CONFIG_NETFILTER_SYNPROXY=m +CONFIG_NF_TABLES=m +CONFIG_NF_TABLES_INET=y +CONFIG_NF_TABLES_NETDEV=y +CONFIG_NFT_NUMGEN=m +CONFIG_NFT_CT=m +CONFIG_NFT_FLOW_OFFLOAD=m +CONFIG_NFT_CONNLIMIT=m +CONFIG_NFT_LOG=m +CONFIG_NFT_LIMIT=m +CONFIG_NFT_MASQ=m +CONFIG_NFT_REDIR=m +CONFIG_NFT_NAT=m +CONFIG_NFT_TUNNEL=m +CONFIG_NFT_QUEUE=m +CONFIG_NFT_QUOTA=m +CONFIG_NFT_REJECT=m +CONFIG_NFT_REJECT_INET=m +CONFIG_NFT_COMPAT=m +CONFIG_NFT_HASH=m +CONFIG_NFT_FIB=m +CONFIG_NFT_FIB_INET=m +# CONFIG_NFT_XFRM is not set +CONFIG_NFT_SOCKET=m +CONFIG_NFT_OSF=m +CONFIG_NFT_TPROXY=m +# CONFIG_NFT_SYNPROXY is not set +CONFIG_NF_DUP_NETDEV=m +CONFIG_NFT_DUP_NETDEV=m +CONFIG_NFT_FWD_NETDEV=m +CONFIG_NFT_FIB_NETDEV=m +# CONFIG_NFT_REJECT_NETDEV is not set +CONFIG_NF_FLOW_TABLE_INET=m +CONFIG_NF_FLOW_TABLE=m +# CONFIG_NF_FLOW_TABLE_PROCFS is not set +CONFIG_NETFILTER_XTABLES=m +CONFIG_NETFILTER_XTABLES_COMPAT=y + +# +# Xtables combined modules +# +CONFIG_NETFILTER_XT_MARK=m +CONFIG_NETFILTER_XT_CONNMARK=m +CONFIG_NETFILTER_XT_SET=m + +# +# Xtables targets +# +CONFIG_NETFILTER_XT_TARGET_AUDIT=m +CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m +CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m +CONFIG_NETFILTER_XT_TARGET_CONNMARK=m +CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m +CONFIG_NETFILTER_XT_TARGET_CT=m +CONFIG_NETFILTER_XT_TARGET_DSCP=m +CONFIG_NETFILTER_XT_TARGET_HL=m +CONFIG_NETFILTER_XT_TARGET_HMARK=m +CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m +CONFIG_NETFILTER_XT_TARGET_LED=m +CONFIG_NETFILTER_XT_TARGET_LOG=m +CONFIG_NETFILTER_XT_TARGET_MARK=m +CONFIG_NETFILTER_XT_NAT=m +CONFIG_NETFILTER_XT_TARGET_NETMAP=m +CONFIG_NETFILTER_XT_TARGET_NFLOG=m +CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m +# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set +CONFIG_NETFILTER_XT_TARGET_RATEEST=m +CONFIG_NETFILTER_XT_TARGET_REDIRECT=m +CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m +CONFIG_NETFILTER_XT_TARGET_TEE=m +CONFIG_NETFILTER_XT_TARGET_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_TRACE=m +CONFIG_NETFILTER_XT_TARGET_SECMARK=m +CONFIG_NETFILTER_XT_TARGET_TCPMSS=m +CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m + +# +# Xtables matches +# +CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m +CONFIG_NETFILTER_XT_MATCH_BPF=m +CONFIG_NETFILTER_XT_MATCH_CGROUP=m +CONFIG_NETFILTER_XT_MATCH_CLUSTER=m +CONFIG_NETFILTER_XT_MATCH_COMMENT=m +CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m +CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m +CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m +CONFIG_NETFILTER_XT_MATCH_CONNMARK=m +CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m +CONFIG_NETFILTER_XT_MATCH_CPU=m +CONFIG_NETFILTER_XT_MATCH_DCCP=m +CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m +CONFIG_NETFILTER_XT_MATCH_DSCP=m +CONFIG_NETFILTER_XT_MATCH_ECN=m +CONFIG_NETFILTER_XT_MATCH_ESP=m +CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m +CONFIG_NETFILTER_XT_MATCH_HELPER=m +CONFIG_NETFILTER_XT_MATCH_HL=m +CONFIG_NETFILTER_XT_MATCH_IPCOMP=m +CONFIG_NETFILTER_XT_MATCH_IPRANGE=m +CONFIG_NETFILTER_XT_MATCH_IPVS=m +CONFIG_NETFILTER_XT_MATCH_L2TP=m +CONFIG_NETFILTER_XT_MATCH_LENGTH=m +CONFIG_NETFILTER_XT_MATCH_LIMIT=m +CONFIG_NETFILTER_XT_MATCH_MAC=m +CONFIG_NETFILTER_XT_MATCH_MARK=m +CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m +CONFIG_NETFILTER_XT_MATCH_NFACCT=m +CONFIG_NETFILTER_XT_MATCH_OSF=m +CONFIG_NETFILTER_XT_MATCH_OWNER=m +CONFIG_NETFILTER_XT_MATCH_POLICY=m +CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m +CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m +CONFIG_NETFILTER_XT_MATCH_QUOTA=m +CONFIG_NETFILTER_XT_MATCH_RATEEST=m +CONFIG_NETFILTER_XT_MATCH_REALM=m +CONFIG_NETFILTER_XT_MATCH_RECENT=m +CONFIG_NETFILTER_XT_MATCH_SCTP=m +CONFIG_NETFILTER_XT_MATCH_SOCKET=m +CONFIG_NETFILTER_XT_MATCH_STATE=m +CONFIG_NETFILTER_XT_MATCH_STATISTIC=m +CONFIG_NETFILTER_XT_MATCH_STRING=m +CONFIG_NETFILTER_XT_MATCH_TCPMSS=m +CONFIG_NETFILTER_XT_MATCH_TIME=m +CONFIG_NETFILTER_XT_MATCH_U32=m +# end of Core Netfilter Configuration + +CONFIG_IP_SET=m +CONFIG_IP_SET_MAX=256 +CONFIG_IP_SET_BITMAP_IP=m +CONFIG_IP_SET_BITMAP_IPMAC=m +CONFIG_IP_SET_BITMAP_PORT=m +CONFIG_IP_SET_HASH_IP=m +CONFIG_IP_SET_HASH_IPMARK=m +CONFIG_IP_SET_HASH_IPPORT=m +CONFIG_IP_SET_HASH_IPPORTIP=m +CONFIG_IP_SET_HASH_IPPORTNET=m +CONFIG_IP_SET_HASH_IPMAC=m +CONFIG_IP_SET_HASH_MAC=m +CONFIG_IP_SET_HASH_NETPORTNET=m +CONFIG_IP_SET_HASH_NET=m +CONFIG_IP_SET_HASH_NETNET=m +CONFIG_IP_SET_HASH_NETPORT=m +CONFIG_IP_SET_HASH_NETIFACE=m +CONFIG_IP_SET_LIST_SET=m +CONFIG_IP_VS=m +CONFIG_IP_VS_IPV6=y +# CONFIG_IP_VS_DEBUG is not set +CONFIG_IP_VS_TAB_BITS=12 + +# +# IPVS transport protocol load balancing support +# +CONFIG_IP_VS_PROTO_TCP=y +CONFIG_IP_VS_PROTO_UDP=y +CONFIG_IP_VS_PROTO_AH_ESP=y +CONFIG_IP_VS_PROTO_ESP=y +CONFIG_IP_VS_PROTO_AH=y +CONFIG_IP_VS_PROTO_SCTP=y + +# +# IPVS scheduler +# +CONFIG_IP_VS_RR=m +CONFIG_IP_VS_WRR=m +CONFIG_IP_VS_LC=m +CONFIG_IP_VS_WLC=m +CONFIG_IP_VS_FO=m +CONFIG_IP_VS_OVF=m +CONFIG_IP_VS_LBLC=m +CONFIG_IP_VS_LBLCR=m +CONFIG_IP_VS_DH=m +CONFIG_IP_VS_SH=m +CONFIG_IP_VS_MH=m +CONFIG_IP_VS_SED=m +CONFIG_IP_VS_NQ=m +# CONFIG_IP_VS_TWOS is not set + +# +# IPVS SH scheduler +# +CONFIG_IP_VS_SH_TAB_BITS=8 + +# +# IPVS MH scheduler +# +CONFIG_IP_VS_MH_TAB_INDEX=12 + +# +# IPVS application helper +# +CONFIG_IP_VS_FTP=m +CONFIG_IP_VS_NFCT=y +CONFIG_IP_VS_PE_SIP=m + +# +# IP: Netfilter Configuration +# +CONFIG_NF_DEFRAG_IPV4=m +CONFIG_NF_SOCKET_IPV4=m +CONFIG_NF_TPROXY_IPV4=m +CONFIG_NF_TABLES_IPV4=y +CONFIG_NFT_REJECT_IPV4=m +CONFIG_NFT_DUP_IPV4=m +CONFIG_NFT_FIB_IPV4=m +CONFIG_NF_TABLES_ARP=y +CONFIG_NF_DUP_IPV4=m +CONFIG_NF_LOG_ARP=m +CONFIG_NF_LOG_IPV4=m +CONFIG_NF_REJECT_IPV4=m +CONFIG_NF_NAT_SNMP_BASIC=m +CONFIG_NF_NAT_PPTP=m +CONFIG_NF_NAT_H323=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_AH=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_RPFILTER=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_SYNPROXY=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_TTL=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_SECURITY=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m +# end of IP: Netfilter Configuration + +# +# IPv6: Netfilter Configuration +# +CONFIG_NF_SOCKET_IPV6=m +CONFIG_NF_TPROXY_IPV6=m +CONFIG_NF_TABLES_IPV6=y +CONFIG_NFT_REJECT_IPV6=m +CONFIG_NFT_DUP_IPV6=m +CONFIG_NFT_FIB_IPV6=m +CONFIG_NF_DUP_IPV6=m +CONFIG_NF_REJECT_IPV6=m +CONFIG_NF_LOG_IPV6=m +CONFIG_IP6_NF_IPTABLES=m +CONFIG_IP6_NF_MATCH_AH=m +CONFIG_IP6_NF_MATCH_EUI64=m +CONFIG_IP6_NF_MATCH_FRAG=m +CONFIG_IP6_NF_MATCH_OPTS=m +CONFIG_IP6_NF_MATCH_HL=m +CONFIG_IP6_NF_MATCH_IPV6HEADER=m +CONFIG_IP6_NF_MATCH_MH=m +CONFIG_IP6_NF_MATCH_RPFILTER=m +CONFIG_IP6_NF_MATCH_RT=m +# CONFIG_IP6_NF_MATCH_SRH is not set +CONFIG_IP6_NF_TARGET_HL=m +CONFIG_IP6_NF_FILTER=m +CONFIG_IP6_NF_TARGET_REJECT=m +CONFIG_IP6_NF_TARGET_SYNPROXY=m +CONFIG_IP6_NF_MANGLE=m +CONFIG_IP6_NF_RAW=m +CONFIG_IP6_NF_SECURITY=m +CONFIG_IP6_NF_NAT=m +CONFIG_IP6_NF_TARGET_MASQUERADE=m +CONFIG_IP6_NF_TARGET_NPT=m +# end of IPv6: Netfilter Configuration + +CONFIG_NF_DEFRAG_IPV6=m +CONFIG_NF_TABLES_BRIDGE=m +# CONFIG_NFT_BRIDGE_META is not set +CONFIG_NFT_BRIDGE_REJECT=m +# CONFIG_NF_CONNTRACK_BRIDGE is not set +CONFIG_BRIDGE_NF_EBTABLES=m +CONFIG_BRIDGE_EBT_BROUTE=m +CONFIG_BRIDGE_EBT_T_FILTER=m +CONFIG_BRIDGE_EBT_T_NAT=m +CONFIG_BRIDGE_EBT_802_3=m +CONFIG_BRIDGE_EBT_AMONG=m +CONFIG_BRIDGE_EBT_ARP=m +CONFIG_BRIDGE_EBT_IP=m +CONFIG_BRIDGE_EBT_IP6=m +CONFIG_BRIDGE_EBT_LIMIT=m +CONFIG_BRIDGE_EBT_MARK=m +CONFIG_BRIDGE_EBT_PKTTYPE=m +CONFIG_BRIDGE_EBT_STP=m +CONFIG_BRIDGE_EBT_VLAN=m +CONFIG_BRIDGE_EBT_ARPREPLY=m +CONFIG_BRIDGE_EBT_DNAT=m +CONFIG_BRIDGE_EBT_MARK_T=m +CONFIG_BRIDGE_EBT_REDIRECT=m +CONFIG_BRIDGE_EBT_SNAT=m +CONFIG_BRIDGE_EBT_LOG=m +CONFIG_BRIDGE_EBT_NFLOG=m +# CONFIG_BPFILTER is not set +CONFIG_IP_DCCP=m +CONFIG_INET_DCCP_DIAG=m + +# +# DCCP CCIDs Configuration +# +# CONFIG_IP_DCCP_CCID2_DEBUG is not set +CONFIG_IP_DCCP_CCID3=y +# CONFIG_IP_DCCP_CCID3_DEBUG is not set +CONFIG_IP_DCCP_TFRC_LIB=y +# end of DCCP CCIDs Configuration + +# +# DCCP Kernel Hacking +# +# CONFIG_IP_DCCP_DEBUG is not set +# end of DCCP Kernel Hacking + +CONFIG_IP_SCTP=m +# CONFIG_SCTP_DBG_OBJCNT is not set +CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5=y +# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set +# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set +CONFIG_SCTP_COOKIE_HMAC_MD5=y +CONFIG_SCTP_COOKIE_HMAC_SHA1=y +CONFIG_INET_SCTP_DIAG=m +CONFIG_RDS=m +CONFIG_RDS_RDMA=m +CONFIG_RDS_TCP=m +# CONFIG_RDS_DEBUG is not set +CONFIG_TIPC=m +CONFIG_TIPC_MEDIA_IB=y +CONFIG_TIPC_MEDIA_UDP=y +CONFIG_TIPC_CRYPTO=y +CONFIG_TIPC_DIAG=m +CONFIG_ATM=m +CONFIG_ATM_CLIP=m +# CONFIG_ATM_CLIP_NO_ICMP is not set +CONFIG_ATM_LANE=m +CONFIG_ATM_MPOA=m +CONFIG_ATM_BR2684=m +# CONFIG_ATM_BR2684_IPFILTER is not set +CONFIG_L2TP=m +CONFIG_L2TP_DEBUGFS=m +CONFIG_L2TP_V3=y +CONFIG_L2TP_IP=m +CONFIG_L2TP_ETH=m +CONFIG_STP=m +CONFIG_GARP=m +CONFIG_MRP=m +CONFIG_BRIDGE=m +CONFIG_BRIDGE_IGMP_SNOOPING=y +CONFIG_BRIDGE_VLAN_FILTERING=y +# CONFIG_BRIDGE_MRP is not set +# CONFIG_BRIDGE_CFM is not set +CONFIG_NET_DSA=m +# CONFIG_NET_DSA_TAG_NONE is not set +# CONFIG_NET_DSA_TAG_AR9331 is not set +# CONFIG_NET_DSA_TAG_BRCM is not set +# CONFIG_NET_DSA_TAG_BRCM_LEGACY is not set +# CONFIG_NET_DSA_TAG_BRCM_PREPEND is not set +# CONFIG_NET_DSA_TAG_HELLCREEK is not set +# CONFIG_NET_DSA_TAG_GSWIP is not set +CONFIG_NET_DSA_TAG_DSA_COMMON=m +CONFIG_NET_DSA_TAG_DSA=m +CONFIG_NET_DSA_TAG_EDSA=m +# CONFIG_NET_DSA_TAG_MTK is not set +# CONFIG_NET_DSA_TAG_KSZ is not set +# CONFIG_NET_DSA_TAG_OCELOT is not set +# CONFIG_NET_DSA_TAG_OCELOT_8021Q is not set +# CONFIG_NET_DSA_TAG_QCA is not set +# CONFIG_NET_DSA_TAG_RTL4_A is not set +# CONFIG_NET_DSA_TAG_RTL8_4 is not set +# CONFIG_NET_DSA_TAG_RZN1_A5PSW is not set +# CONFIG_NET_DSA_TAG_LAN9303 is not set +# CONFIG_NET_DSA_TAG_SJA1105 is not set +CONFIG_NET_DSA_TAG_TRAILER=m +# CONFIG_NET_DSA_TAG_XRS700X is not set +CONFIG_VLAN_8021Q=m +CONFIG_VLAN_8021Q_GVRP=y +CONFIG_VLAN_8021Q_MVRP=y +CONFIG_LLC=m +CONFIG_LLC2=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +CONFIG_PHONET=m +CONFIG_6LOWPAN=m +# CONFIG_6LOWPAN_DEBUGFS is not set +CONFIG_6LOWPAN_NHC=m +CONFIG_6LOWPAN_NHC_DEST=m +CONFIG_6LOWPAN_NHC_FRAGMENT=m +CONFIG_6LOWPAN_NHC_HOP=m +CONFIG_6LOWPAN_NHC_IPV6=m +CONFIG_6LOWPAN_NHC_MOBILITY=m +CONFIG_6LOWPAN_NHC_ROUTING=m +CONFIG_6LOWPAN_NHC_UDP=m +CONFIG_6LOWPAN_GHC_EXT_HDR_HOP=m +CONFIG_6LOWPAN_GHC_UDP=m +CONFIG_6LOWPAN_GHC_ICMPV6=m +CONFIG_6LOWPAN_GHC_EXT_HDR_DEST=m +CONFIG_6LOWPAN_GHC_EXT_HDR_FRAG=m +CONFIG_6LOWPAN_GHC_EXT_HDR_ROUTE=m +CONFIG_IEEE802154=m +# CONFIG_IEEE802154_NL802154_EXPERIMENTAL is not set +CONFIG_IEEE802154_SOCKET=m +CONFIG_IEEE802154_6LOWPAN=m +CONFIG_MAC802154=m +CONFIG_NET_SCHED=y + +# +# Queueing/Scheduling +# +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_MULTIQ=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFB=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_CBS=m +CONFIG_NET_SCH_ETF=m +CONFIG_NET_SCH_MQPRIO_LIB=m +# CONFIG_NET_SCH_TAPRIO is not set +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_DRR=m +CONFIG_NET_SCH_MQPRIO=m +CONFIG_NET_SCH_SKBPRIO=m +CONFIG_NET_SCH_CHOKE=m +CONFIG_NET_SCH_QFQ=m +CONFIG_NET_SCH_CODEL=m +CONFIG_NET_SCH_FQ_CODEL=m +CONFIG_NET_SCH_CAKE=m +CONFIG_NET_SCH_FQ=m +CONFIG_NET_SCH_HHF=m +CONFIG_NET_SCH_PIE=m +# CONFIG_NET_SCH_FQ_PIE is not set +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_SCH_PLUG=m +# CONFIG_NET_SCH_ETS is not set +# CONFIG_NET_SCH_DEFAULT is not set + +# +# Classification +# +CONFIG_NET_CLS=y +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_CLS_U32_PERF=y +CONFIG_CLS_U32_MARK=y +CONFIG_NET_CLS_FLOW=m +CONFIG_NET_CLS_CGROUP=m +CONFIG_NET_CLS_BPF=m +CONFIG_NET_CLS_FLOWER=m +CONFIG_NET_CLS_MATCHALL=m +CONFIG_NET_EMATCH=y +CONFIG_NET_EMATCH_STACK=32 +CONFIG_NET_EMATCH_CMP=m +CONFIG_NET_EMATCH_NBYTE=m +CONFIG_NET_EMATCH_U32=m +CONFIG_NET_EMATCH_META=m +CONFIG_NET_EMATCH_TEXT=m +CONFIG_NET_EMATCH_CANID=m +CONFIG_NET_EMATCH_IPSET=m +CONFIG_NET_EMATCH_IPT=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=m +CONFIG_NET_ACT_GACT=m +CONFIG_GACT_PROB=y +CONFIG_NET_ACT_MIRRED=m +CONFIG_NET_ACT_SAMPLE=m +CONFIG_NET_ACT_IPT=m +CONFIG_NET_ACT_NAT=m +CONFIG_NET_ACT_PEDIT=m +CONFIG_NET_ACT_SIMP=m +CONFIG_NET_ACT_SKBEDIT=m +CONFIG_NET_ACT_CSUM=m +# CONFIG_NET_ACT_MPLS is not set +CONFIG_NET_ACT_VLAN=m +CONFIG_NET_ACT_BPF=m +CONFIG_NET_ACT_CONNMARK=m +# CONFIG_NET_ACT_CTINFO is not set +CONFIG_NET_ACT_SKBMOD=m +CONFIG_NET_ACT_IFE=m +CONFIG_NET_ACT_TUNNEL_KEY=m +CONFIG_NET_ACT_CT=m +# CONFIG_NET_ACT_GATE is not set +CONFIG_NET_IFE_SKBMARK=m +CONFIG_NET_IFE_SKBPRIO=m +CONFIG_NET_IFE_SKBTCINDEX=m +# CONFIG_NET_TC_SKB_EXT is not set +CONFIG_NET_SCH_FIFO=y +CONFIG_DCB=y +CONFIG_DNS_RESOLVER=m +CONFIG_BATMAN_ADV=m +# CONFIG_BATMAN_ADV_BATMAN_V is not set +CONFIG_BATMAN_ADV_BLA=y +CONFIG_BATMAN_ADV_DAT=y +CONFIG_BATMAN_ADV_NC=y +CONFIG_BATMAN_ADV_MCAST=y +# CONFIG_BATMAN_ADV_DEBUG is not set +# CONFIG_BATMAN_ADV_TRACING is not set +CONFIG_OPENVSWITCH=m +CONFIG_OPENVSWITCH_GRE=m +CONFIG_OPENVSWITCH_VXLAN=m +CONFIG_OPENVSWITCH_GENEVE=m +CONFIG_VSOCKETS=m +CONFIG_VSOCKETS_DIAG=m +CONFIG_VSOCKETS_LOOPBACK=m +CONFIG_VIRTIO_VSOCKETS=m +CONFIG_VIRTIO_VSOCKETS_COMMON=m +CONFIG_HYPERV_VSOCKETS=m +CONFIG_NETLINK_DIAG=m +CONFIG_MPLS=y +CONFIG_NET_MPLS_GSO=y +CONFIG_MPLS_ROUTING=m +CONFIG_MPLS_IPTUNNEL=m +CONFIG_NET_NSH=m +# CONFIG_HSR is not set +CONFIG_NET_SWITCHDEV=y +CONFIG_NET_L3_MASTER_DEV=y +# CONFIG_QRTR is not set +# CONFIG_NET_NCSI is not set +CONFIG_PCPU_DEV_REFCNT=y +CONFIG_MAX_SKB_FRAGS=17 +CONFIG_RPS=y +CONFIG_RFS_ACCEL=y +CONFIG_SOCK_RX_QUEUE_MAPPING=y +CONFIG_XPS=y +CONFIG_CGROUP_NET_PRIO=y +CONFIG_CGROUP_NET_CLASSID=y +CONFIG_NET_RX_BUSY_POLL=y +CONFIG_BQL=y +CONFIG_BPF_STREAM_PARSER=y +CONFIG_NET_FLOW_LIMIT=y + +# +# Network testing +# +CONFIG_NET_PKTGEN=m +CONFIG_NET_DROP_MONITOR=y +# end of Network testing +# end of Networking options + +CONFIG_HAMRADIO=y + +# +# Packet Radio protocols +# +CONFIG_AX25=m +CONFIG_AX25_DAMA_SLAVE=y +CONFIG_NETROM=m +CONFIG_ROSE=m + +# +# AX.25 network device drivers +# +CONFIG_MKISS=m +CONFIG_6PACK=m +CONFIG_BPQETHER=m +CONFIG_BAYCOM_SER_FDX=m +CONFIG_BAYCOM_SER_HDX=m +CONFIG_BAYCOM_PAR=m +CONFIG_YAM=m +# end of AX.25 network device drivers + +CONFIG_CAN=m +CONFIG_CAN_RAW=m +CONFIG_CAN_BCM=m +CONFIG_CAN_GW=m +# CONFIG_CAN_J1939 is not set +# CONFIG_CAN_ISOTP is not set +CONFIG_BT=m +CONFIG_BT_BREDR=y +CONFIG_BT_RFCOMM=m +CONFIG_BT_RFCOMM_TTY=y +CONFIG_BT_BNEP=m +CONFIG_BT_BNEP_MC_FILTER=y +CONFIG_BT_BNEP_PROTO_FILTER=y +CONFIG_BT_HIDP=m +CONFIG_BT_HS=y +CONFIG_BT_LE=y +CONFIG_BT_LE_L2CAP_ECRED=y +CONFIG_BT_6LOWPAN=m +CONFIG_BT_LEDS=y +# CONFIG_BT_MSFTEXT is not set +# CONFIG_BT_AOSPEXT is not set +CONFIG_BT_DEBUGFS=y +# CONFIG_BT_SELFTEST is not set + +# +# Bluetooth device drivers +# +CONFIG_BT_INTEL=m +CONFIG_BT_BCM=m +CONFIG_BT_RTL=m +CONFIG_BT_QCA=m +CONFIG_BT_MTK=m +CONFIG_BT_HCIBTUSB=m +CONFIG_BT_HCIBTUSB_AUTOSUSPEND=y +CONFIG_BT_HCIBTUSB_POLL_SYNC=y +CONFIG_BT_HCIBTUSB_BCM=y +# CONFIG_BT_HCIBTUSB_MTK is not set +CONFIG_BT_HCIBTUSB_RTL=y +CONFIG_BT_HCIBTSDIO=m +CONFIG_BT_HCIUART=m +CONFIG_BT_HCIUART_SERDEV=y +CONFIG_BT_HCIUART_H4=y +CONFIG_BT_HCIUART_NOKIA=m +# CONFIG_BT_HCIUART_BCSP is not set +CONFIG_BT_HCIUART_ATH3K=y +CONFIG_BT_HCIUART_LL=y +CONFIG_BT_HCIUART_3WIRE=y +CONFIG_BT_HCIUART_INTEL=y +CONFIG_BT_HCIUART_RTL=y +CONFIG_BT_HCIUART_QCA=y +CONFIG_BT_HCIUART_AG6XX=y +CONFIG_BT_HCIUART_MRVL=y +# CONFIG_BT_HCIBCM203X is not set +# CONFIG_BT_HCIBCM4377 is not set +# CONFIG_BT_HCIBPA10X is not set +# CONFIG_BT_HCIBFUSB is not set +# CONFIG_BT_HCIVHCI is not set +CONFIG_BT_MRVL=m +CONFIG_BT_MRVL_SDIO=m +CONFIG_BT_ATH3K=m +# CONFIG_BT_MTKSDIO is not set +CONFIG_BT_MTKUART=m +CONFIG_BT_QCOMSMD=m +CONFIG_BT_HCIRSI=m +# CONFIG_BT_VIRTIO is not set +# CONFIG_BT_NXPUART is not set +# end of Bluetooth device drivers + +CONFIG_AF_RXRPC=m +CONFIG_AF_RXRPC_IPV6=y +# CONFIG_AF_RXRPC_INJECT_LOSS is not set +# CONFIG_AF_RXRPC_INJECT_RX_DELAY is not set +# CONFIG_AF_RXRPC_DEBUG is not set +CONFIG_RXKAD=y +# CONFIG_RXPERF is not set +# CONFIG_AF_KCM is not set +CONFIG_STREAM_PARSER=y +# CONFIG_MCTP is not set +CONFIG_FIB_RULES=y +CONFIG_WIRELESS=y +CONFIG_WIRELESS_EXT=y +CONFIG_WEXT_CORE=y +CONFIG_WEXT_PROC=y +CONFIG_WEXT_SPY=y +CONFIG_WEXT_PRIV=y +CONFIG_CFG80211=m +# CONFIG_NL80211_TESTMODE is not set +# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set +# CONFIG_CFG80211_CERTIFICATION_ONUS is not set +CONFIG_CFG80211_REQUIRE_SIGNED_REGDB=y +CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS=y +CONFIG_CFG80211_DEFAULT_PS=y +# CONFIG_CFG80211_DEBUGFS is not set +CONFIG_CFG80211_CRDA_SUPPORT=y +CONFIG_CFG80211_WEXT=y +CONFIG_CFG80211_WEXT_EXPORT=y +CONFIG_LIB80211=m +CONFIG_LIB80211_CRYPT_WEP=m +CONFIG_LIB80211_CRYPT_CCMP=m +CONFIG_LIB80211_CRYPT_TKIP=m +# CONFIG_LIB80211_DEBUG is not set +CONFIG_MAC80211=m +CONFIG_MAC80211_HAS_RC=y +CONFIG_MAC80211_RC_MINSTREL=y +CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y +CONFIG_MAC80211_RC_DEFAULT="minstrel_ht" +CONFIG_MAC80211_MESH=y +CONFIG_MAC80211_LEDS=y +# CONFIG_MAC80211_DEBUGFS is not set +# CONFIG_MAC80211_MESSAGE_TRACING is not set +# CONFIG_MAC80211_DEBUG_MENU is not set +CONFIG_MAC80211_STA_HASH_MAX_SIZE=0 +CONFIG_RFKILL=m +CONFIG_RFKILL_LEDS=y +CONFIG_RFKILL_INPUT=y +# CONFIG_RFKILL_GPIO is not set +CONFIG_NET_9P=m +CONFIG_NET_9P_FD=m +CONFIG_NET_9P_VIRTIO=m +CONFIG_NET_9P_XEN=m +CONFIG_NET_9P_RDMA=m +# CONFIG_NET_9P_DEBUG is not set +# CONFIG_CAIF is not set +CONFIG_CEPH_LIB=m +# CONFIG_CEPH_LIB_PRETTYDEBUG is not set +# CONFIG_CEPH_LIB_USE_DNS_RESOLVER is not set +CONFIG_NFC=m +CONFIG_NFC_DIGITAL=m +# CONFIG_NFC_NCI is not set +# CONFIG_NFC_HCI is not set + +# +# Near Field Communication (NFC) devices +# +# CONFIG_NFC_TRF7970A is not set +CONFIG_NFC_SIM=m +CONFIG_NFC_PORT100=m +CONFIG_NFC_PN533=m +CONFIG_NFC_PN533_USB=m +# CONFIG_NFC_PN533_I2C is not set +# CONFIG_NFC_PN532_UART is not set +# CONFIG_NFC_ST95HF is not set +# end of Near Field Communication (NFC) devices + +CONFIG_PSAMPLE=m +CONFIG_NET_IFE=m +CONFIG_LWTUNNEL=y +CONFIG_LWTUNNEL_BPF=y +CONFIG_DST_CACHE=y +CONFIG_GRO_CELLS=y +CONFIG_NET_SELFTESTS=m +CONFIG_NET_SOCK_MSG=y +CONFIG_NET_DEVLINK=y +CONFIG_PAGE_POOL=y +CONFIG_PAGE_POOL_STATS=y +CONFIG_FAILOVER=m +CONFIG_ETHTOOL_NETLINK=y + +# +# Device Drivers +# +CONFIG_ARM_AMBA=y +CONFIG_TEGRA_AHB=y +CONFIG_HAVE_PCI=y +CONFIG_PCI=y +CONFIG_PCI_DOMAINS=y +CONFIG_PCI_DOMAINS_GENERIC=y +CONFIG_PCI_SYSCALL=y +CONFIG_PCIEPORTBUS=y +CONFIG_HOTPLUG_PCI_PCIE=y +CONFIG_PCIEAER=y +CONFIG_PCIEAER_INJECT=m +# CONFIG_PCIE_ECRC is not set +CONFIG_PCIEASPM=y +CONFIG_PCIEASPM_DEFAULT=y +# CONFIG_PCIEASPM_POWERSAVE is not set +# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set +# CONFIG_PCIEASPM_PERFORMANCE is not set +CONFIG_PCIE_PME=y +CONFIG_PCIE_DPC=y +CONFIG_PCIE_PTM=y +CONFIG_PCIE_EDR=y +CONFIG_PCI_MSI=y +CONFIG_PCI_QUIRKS=y +# CONFIG_PCI_DEBUG is not set +CONFIG_PCI_REALLOC_ENABLE_AUTO=y +CONFIG_PCI_STUB=m +CONFIG_PCI_PF_STUB=m +CONFIG_PCI_ATS=y +CONFIG_PCI_ECAM=y +CONFIG_PCI_BRIDGE_EMUL=y +CONFIG_PCI_IOV=y +CONFIG_PCI_PRI=y +CONFIG_PCI_PASID=y +CONFIG_PCI_P2PDMA=y +CONFIG_PCI_LABEL=y +CONFIG_PCI_HYPERV=m +# CONFIG_PCI_DYNAMIC_OF_NODES is not set +# CONFIG_PCIE_BUS_TUNE_OFF is not set +CONFIG_PCIE_BUS_DEFAULT=y +# CONFIG_PCIE_BUS_SAFE is not set +# CONFIG_PCIE_BUS_PERFORMANCE is not set +# CONFIG_PCIE_BUS_PEER2PEER is not set +CONFIG_VGA_ARB=y +CONFIG_VGA_ARB_MAX_GPUS=16 +CONFIG_HOTPLUG_PCI=y +CONFIG_HOTPLUG_PCI_ACPI=y +CONFIG_HOTPLUG_PCI_ACPI_IBM=m +CONFIG_HOTPLUG_PCI_CPCI=y +CONFIG_HOTPLUG_PCI_SHPC=y + +# +# PCI controller drivers +# +CONFIG_PCI_AARDVARK=y +# CONFIG_PCIE_ALTERA is not set +CONFIG_PCI_HOST_THUNDER_PEM=y +CONFIG_PCI_HOST_THUNDER_ECAM=y +# CONFIG_PCI_FTPCI100 is not set +CONFIG_PCI_HOST_COMMON=y +CONFIG_PCI_HOST_GENERIC=y +# CONFIG_PCIE_HISI_ERR is not set +# CONFIG_PCIE_MICROCHIP_HOST is not set +CONFIG_PCI_HYPERV_INTERFACE=m +CONFIG_PCI_TEGRA=y +CONFIG_PCIE_ROCKCHIP=y +CONFIG_PCIE_ROCKCHIP_HOST=y +CONFIG_PCI_XGENE=y +CONFIG_PCI_XGENE_MSI=y +# CONFIG_PCIE_XILINX is not set +CONFIG_PCIE_XILINX_NWL=y +# CONFIG_PCIE_XILINX_CPM is not set + +# +# Cadence-based PCIe controllers +# +# CONFIG_PCIE_CADENCE_PLAT_HOST is not set +# CONFIG_PCI_J721E_HOST is not set +# end of Cadence-based PCIe controllers + +# +# DesignWare-based PCIe controllers +# +CONFIG_PCIE_DW=y +CONFIG_PCIE_DW_HOST=y +# CONFIG_PCIE_AL is not set +# CONFIG_PCI_MESON is not set +CONFIG_PCI_HISI=y +CONFIG_PCIE_KIRIN=y +# CONFIG_PCIE_HISI_STB is not set +CONFIG_PCIE_ARMADA_8K=y +# CONFIG_PCIE_DW_PLAT_HOST is not set +CONFIG_PCIE_QCOM=y +# CONFIG_PCIE_ROCKCHIP_DW_HOST is not set +# end of DesignWare-based PCIe controllers + +# +# Mobiveil-based PCIe controllers +# +# CONFIG_PCIE_MOBIVEIL_PLAT is not set +# end of Mobiveil-based PCIe controllers +# end of PCI controller drivers + +# +# PCI Endpoint +# +# CONFIG_PCI_ENDPOINT is not set +# end of PCI Endpoint + +# +# PCI switch controller drivers +# +# CONFIG_PCI_SW_SWITCHTEC is not set +# end of PCI switch controller drivers + +# CONFIG_CXL_BUS is not set +# CONFIG_PCCARD is not set +# CONFIG_RAPIDIO is not set + +# +# Generic Driver Options +# +CONFIG_AUXILIARY_BUS=y +# CONFIG_UEVENT_HELPER is not set +CONFIG_DEVTMPFS=y +# CONFIG_DEVTMPFS_MOUNT is not set +# CONFIG_DEVTMPFS_SAFE is not set +CONFIG_STANDALONE=y +CONFIG_PREVENT_FIRMWARE_BUILD=y + +# +# Firmware loader +# +CONFIG_FW_LOADER=y +CONFIG_FW_LOADER_DEBUG=y +CONFIG_EXTRA_FIRMWARE="" +# CONFIG_FW_LOADER_USER_HELPER is not set +# CONFIG_FW_LOADER_COMPRESS is not set +CONFIG_FW_CACHE=y +# CONFIG_FW_UPLOAD is not set +# end of Firmware loader + +CONFIG_WANT_DEV_COREDUMP=y +CONFIG_ALLOW_DEV_COREDUMP=y +CONFIG_DEV_COREDUMP=y +# CONFIG_DEBUG_DRIVER is not set +# CONFIG_DEBUG_DEVRES is not set +# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set +# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set +CONFIG_SYS_HYPERVISOR=y +CONFIG_GENERIC_CPU_AUTOPROBE=y +CONFIG_GENERIC_CPU_VULNERABILITIES=y +CONFIG_SOC_BUS=y +CONFIG_REGMAP=y +CONFIG_REGMAP_I2C=y +CONFIG_REGMAP_SPI=m +CONFIG_REGMAP_SPMI=y +CONFIG_REGMAP_MMIO=y +CONFIG_REGMAP_IRQ=y +CONFIG_DMA_SHARED_BUFFER=y +# CONFIG_DMA_FENCE_TRACE is not set +CONFIG_GENERIC_ARCH_TOPOLOGY=y +CONFIG_GENERIC_ARCH_NUMA=y +# CONFIG_FW_DEVLINK_SYNC_STATE_TIMEOUT is not set +# end of Generic Driver Options + +# +# Bus devices +# +CONFIG_ARM_CCI=y +CONFIG_ARM_CCI400_COMMON=y +# CONFIG_BRCMSTB_GISB_ARB is not set +# CONFIG_MOXTET is not set +CONFIG_HISILICON_LPC=y +CONFIG_QCOM_EBI2=y +# CONFIG_QCOM_SSC_BLOCK_BUS is not set +CONFIG_SUN50I_DE2_BUS=y +CONFIG_SUNXI_RSB=y +CONFIG_TEGRA_ACONNECT=y +# CONFIG_TEGRA_GMI is not set +CONFIG_VEXPRESS_CONFIG=y +# CONFIG_MHI_BUS is not set +# CONFIG_MHI_BUS_EP is not set +# end of Bus devices + +# +# Cache Drivers +# +# end of Cache Drivers + +CONFIG_CONNECTOR=y +CONFIG_PROC_EVENTS=y + +# +# Firmware Drivers +# + +# +# ARM System Control and Management Interface Protocol +# +# CONFIG_ARM_SCMI_PROTOCOL is not set +# end of ARM System Control and Management Interface Protocol + +# CONFIG_ARM_SCPI_PROTOCOL is not set +CONFIG_ARM_SDE_INTERFACE=y +# CONFIG_FIRMWARE_MEMMAP is not set +CONFIG_DMIID=y +CONFIG_DMI_SYSFS=y +# CONFIG_ISCSI_IBFT is not set +CONFIG_FW_CFG_SYSFS=m +# CONFIG_FW_CFG_SYSFS_CMDLINE is not set +CONFIG_QCOM_SCM=y +# CONFIG_QCOM_SCM_DOWNLOAD_MODE_DEFAULT is not set +CONFIG_SYSFB=y +# CONFIG_SYSFB_SIMPLEFB is not set +CONFIG_TURRIS_MOX_RWTM=m +CONFIG_ARM_FFA_TRANSPORT=y +CONFIG_ARM_FFA_SMCCC=y +CONFIG_GOOGLE_FIRMWARE=y +# CONFIG_GOOGLE_CBMEM is not set +CONFIG_GOOGLE_COREBOOT_TABLE=m +CONFIG_GOOGLE_MEMCONSOLE=m +# CONFIG_GOOGLE_FRAMEBUFFER_COREBOOT is not set +CONFIG_GOOGLE_MEMCONSOLE_COREBOOT=m +# CONFIG_GOOGLE_VPD is not set + +# +# EFI (Extensible Firmware Interface) Support +# +CONFIG_EFI_ESRT=y +CONFIG_EFI_VARS_PSTORE=m +# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set +CONFIG_EFI_PARAMS_FROM_FDT=y +CONFIG_EFI_RUNTIME_WRAPPERS=y +CONFIG_EFI_GENERIC_STUB=y +# CONFIG_EFI_ZBOOT is not set +CONFIG_EFI_ARMSTUB_DTB_LOADER=y +CONFIG_EFI_BOOTLOADER_CONTROL=m +CONFIG_EFI_CAPSULE_LOADER=m +# CONFIG_EFI_TEST is not set +# CONFIG_RESET_ATTACK_MITIGATION is not set +# CONFIG_EFI_DISABLE_PCI_DMA is not set +CONFIG_EFI_EARLYCON=y +CONFIG_EFI_CUSTOM_SSDT_OVERLAYS=y +# CONFIG_EFI_DISABLE_RUNTIME is not set +# CONFIG_EFI_COCO_SECRET is not set +# end of EFI (Extensible Firmware Interface) Support + +CONFIG_UEFI_CPER=y +CONFIG_UEFI_CPER_ARM=y +CONFIG_ARM_PSCI_FW=y +# CONFIG_ARM_PSCI_CHECKER is not set +CONFIG_HAVE_ARM_SMCCC=y +CONFIG_HAVE_ARM_SMCCC_DISCOVERY=y +CONFIG_ARM_SMCCC_SOC_ID=y + +# +# Tegra firmware driver +# +CONFIG_TEGRA_BPMP=y +CONFIG_TEGRA_IVC=y +# end of Tegra firmware driver + +# +# Zynq MPSoC Firmware Drivers +# +CONFIG_ZYNQMP_FIRMWARE=y +# CONFIG_ZYNQMP_FIRMWARE_DEBUG is not set +# end of Zynq MPSoC Firmware Drivers +# end of Firmware Drivers + +CONFIG_GNSS=m +CONFIG_GNSS_SERIAL=m +# CONFIG_GNSS_MTK_SERIAL is not set +CONFIG_GNSS_SIRF_SERIAL=m +CONFIG_GNSS_UBX_SERIAL=m +# CONFIG_GNSS_USB is not set +CONFIG_MTD=m +# CONFIG_MTD_TESTS is not set + +# +# Partition parsers +# +CONFIG_MTD_AR7_PARTS=m +# CONFIG_MTD_CMDLINE_PARTS is not set +CONFIG_MTD_OF_PARTS=m +# CONFIG_MTD_AFS_PARTS is not set +# CONFIG_MTD_REDBOOT_PARTS is not set +# end of Partition parsers + +# +# User Modules And Translation Layers +# +CONFIG_MTD_BLKDEVS=m +CONFIG_MTD_BLOCK=m +CONFIG_MTD_BLOCK_RO=m + +# +# Note that in some cases UBI block is preferred. See MTD_UBI_BLOCK. +# +# CONFIG_FTL is not set +# CONFIG_NFTL is not set +# CONFIG_INFTL is not set +CONFIG_RFD_FTL=m +CONFIG_SSFDC=m +# CONFIG_SM_FTL is not set +CONFIG_MTD_OOPS=m +CONFIG_MTD_SWAP=m +# CONFIG_MTD_PARTITIONED_MASTER is not set + +# +# RAM/ROM/Flash chip drivers +# +# CONFIG_MTD_CFI is not set +# CONFIG_MTD_JEDECPROBE is not set +CONFIG_MTD_MAP_BANK_WIDTH_1=y +CONFIG_MTD_MAP_BANK_WIDTH_2=y +CONFIG_MTD_MAP_BANK_WIDTH_4=y +CONFIG_MTD_CFI_I1=y +CONFIG_MTD_CFI_I2=y +CONFIG_MTD_RAM=m +# CONFIG_MTD_ROM is not set +# CONFIG_MTD_ABSENT is not set +# end of RAM/ROM/Flash chip drivers + +# +# Mapping drivers for chip access +# +# CONFIG_MTD_COMPLEX_MAPPINGS is not set +# CONFIG_MTD_PHYSMAP is not set +CONFIG_MTD_INTEL_VR_NOR=m +CONFIG_MTD_PLATRAM=m +# end of Mapping drivers for chip access + +# +# Self-contained MTD device drivers +# +# CONFIG_MTD_PMC551 is not set +CONFIG_MTD_DATAFLASH=m +# CONFIG_MTD_DATAFLASH_WRITE_VERIFY is not set +# CONFIG_MTD_DATAFLASH_OTP is not set +# CONFIG_MTD_MCHP23K256 is not set +# CONFIG_MTD_MCHP48L640 is not set +CONFIG_MTD_SST25L=m +# CONFIG_MTD_SLRAM is not set +# CONFIG_MTD_PHRAM is not set +# CONFIG_MTD_MTDRAM is not set +# CONFIG_MTD_BLOCK2MTD is not set + +# +# Disk-On-Chip Device Drivers +# +# CONFIG_MTD_DOCG3 is not set +# end of Self-contained MTD device drivers + +# +# NAND +# +CONFIG_MTD_ONENAND=m +CONFIG_MTD_ONENAND_VERIFY_WRITE=y +# CONFIG_MTD_ONENAND_GENERIC is not set +# CONFIG_MTD_ONENAND_OTP is not set +CONFIG_MTD_ONENAND_2X_PROGRAM=y +# CONFIG_MTD_RAW_NAND is not set +# CONFIG_MTD_SPI_NAND is not set + +# +# ECC engine support +# +# CONFIG_MTD_NAND_ECC_SW_HAMMING is not set +# CONFIG_MTD_NAND_ECC_SW_BCH is not set +# CONFIG_MTD_NAND_ECC_MXIC is not set +# end of ECC engine support +# end of NAND + +# +# LPDDR & LPDDR2 PCM memory drivers +# +CONFIG_MTD_LPDDR=m +CONFIG_MTD_QINFO_PROBE=m +# end of LPDDR & LPDDR2 PCM memory drivers + +CONFIG_MTD_SPI_NOR=y +CONFIG_MTD_SPI_NOR_USE_4K_SECTORS=y +# CONFIG_MTD_SPI_NOR_SWP_DISABLE is not set +CONFIG_MTD_SPI_NOR_SWP_DISABLE_ON_VOLATILE=y +# CONFIG_MTD_SPI_NOR_SWP_KEEP is not set +CONFIG_SPI_HISI_SFC=m +CONFIG_MTD_UBI=m +CONFIG_MTD_UBI_WL_THRESHOLD=4096 +CONFIG_MTD_UBI_BEB_LIMIT=20 +# CONFIG_MTD_UBI_FASTMAP is not set +# CONFIG_MTD_UBI_GLUEBI is not set +CONFIG_MTD_UBI_BLOCK=y +# CONFIG_MTD_HYPERBUS is not set +CONFIG_DTC=y +CONFIG_OF=y +# CONFIG_OF_UNITTEST is not set +CONFIG_OF_FLATTREE=y +CONFIG_OF_EARLY_FLATTREE=y +CONFIG_OF_KOBJ=y +CONFIG_OF_ADDRESS=y +CONFIG_OF_IRQ=y +CONFIG_OF_RESERVED_MEM=y +# CONFIG_OF_OVERLAY is not set +CONFIG_OF_NUMA=y +CONFIG_PARPORT=m +# CONFIG_PARPORT_PC is not set +CONFIG_PARPORT_1284=y +CONFIG_PARPORT_NOT_PC=y +CONFIG_PNP=y +# CONFIG_PNP_DEBUG_MESSAGES is not set + +# +# Protocols +# +CONFIG_PNPACPI=y +CONFIG_BLK_DEV=y +CONFIG_BLK_DEV_NULL_BLK=m +CONFIG_CDROM=m +CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m +CONFIG_ZRAM=m +CONFIG_ZRAM_DEF_COMP_LZORLE=y +# CONFIG_ZRAM_DEF_COMP_ZSTD is not set +# CONFIG_ZRAM_DEF_COMP_LZ4 is not set +# CONFIG_ZRAM_DEF_COMP_LZO is not set +# CONFIG_ZRAM_DEF_COMP_LZ4HC is not set +CONFIG_ZRAM_DEF_COMP="lzo-rle" +CONFIG_ZRAM_WRITEBACK=y +CONFIG_ZRAM_MEMORY_TRACKING=y +# CONFIG_ZRAM_MULTI_COMP is not set +CONFIG_BLK_DEV_LOOP=m +CONFIG_BLK_DEV_LOOP_MIN_COUNT=8 +CONFIG_BLK_DEV_DRBD=m +# CONFIG_DRBD_FAULT_INJECTION is not set +CONFIG_BLK_DEV_NBD=m +CONFIG_BLK_DEV_RAM=m +CONFIG_BLK_DEV_RAM_COUNT=16 +CONFIG_BLK_DEV_RAM_SIZE=16384 +# CONFIG_CDROM_PKTCDVD is not set +CONFIG_ATA_OVER_ETH=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_VIRTIO_BLK=m +CONFIG_BLK_DEV_RBD=m +# CONFIG_BLK_DEV_UBLK is not set + +# +# NVME Support +# +CONFIG_NVME_CORE=m +CONFIG_BLK_DEV_NVME=m +CONFIG_NVME_MULTIPATH=y +# CONFIG_NVME_VERBOSE_ERRORS is not set +# CONFIG_NVME_HWMON is not set +CONFIG_NVME_FABRICS=m +CONFIG_NVME_RDMA=m +CONFIG_NVME_FC=m +CONFIG_NVME_TCP=m +# CONFIG_NVME_AUTH is not set +CONFIG_NVME_TARGET=m +# CONFIG_NVME_TARGET_PASSTHRU is not set +# CONFIG_NVME_TARGET_LOOP is not set +CONFIG_NVME_TARGET_RDMA=m +CONFIG_NVME_TARGET_FC=m +# CONFIG_NVME_TARGET_FCLOOP is not set +CONFIG_NVME_TARGET_TCP=m +# CONFIG_NVME_TARGET_AUTH is not set +# end of NVME Support + +# +# Misc devices +# +CONFIG_SENSORS_LIS3LV02D=m +CONFIG_AD525X_DPOT=m +CONFIG_AD525X_DPOT_I2C=m +CONFIG_AD525X_DPOT_SPI=m +# CONFIG_DUMMY_IRQ is not set +# CONFIG_PHANTOM is not set +CONFIG_TIFM_CORE=m +CONFIG_TIFM_7XX1=m +CONFIG_ICS932S401=m +CONFIG_ENCLOSURE_SERVICES=m +# CONFIG_HI6421V600_IRQ is not set +# CONFIG_HP_ILO is not set +CONFIG_QCOM_COINCELL=m +# CONFIG_QCOM_FASTRPC is not set +CONFIG_APDS9802ALS=m +CONFIG_ISL29003=m +CONFIG_ISL29020=m +CONFIG_SENSORS_TSL2550=m +CONFIG_SENSORS_BH1770=m +CONFIG_SENSORS_APDS990X=m +CONFIG_HMC6352=m +CONFIG_DS1682=m +# CONFIG_LATTICE_ECP3_CONFIG is not set +CONFIG_SRAM=y +# CONFIG_DW_XDATA_PCIE is not set +# CONFIG_PCI_ENDPOINT_TEST is not set +# CONFIG_XILINX_SDFEC is not set +CONFIG_MISC_RTSX=m +# CONFIG_HISI_HIKEY_USB is not set +# CONFIG_OPEN_DICE is not set +# CONFIG_VCPU_STALL_DETECTOR is not set +CONFIG_C2PORT=m + +# +# EEPROM support +# +CONFIG_EEPROM_AT24=m +CONFIG_EEPROM_AT25=m +CONFIG_EEPROM_LEGACY=m +CONFIG_EEPROM_MAX6875=m +CONFIG_EEPROM_93CX6=m +# CONFIG_EEPROM_93XX46 is not set +# CONFIG_EEPROM_IDT_89HPESX is not set +# CONFIG_EEPROM_EE1004 is not set +# end of EEPROM support + +CONFIG_CB710_CORE=m +# CONFIG_CB710_DEBUG is not set +CONFIG_CB710_DEBUG_ASSUMPTIONS=y + +# +# Texas Instruments shared transport line discipline +# +CONFIG_TI_ST=m +# end of Texas Instruments shared transport line discipline + +CONFIG_SENSORS_LIS3_I2C=m +CONFIG_ALTERA_STAPL=m +# CONFIG_VMWARE_VMCI is not set +# CONFIG_GENWQE is not set +# CONFIG_ECHO is not set +# CONFIG_BCM_VK is not set +# CONFIG_MISC_ALCOR_PCI is not set +CONFIG_MISC_RTSX_PCI=m +CONFIG_MISC_RTSX_USB=m +CONFIG_PVPANIC=y +CONFIG_PVPANIC_MMIO=m +CONFIG_PVPANIC_PCI=m +# CONFIG_UACCE is not set +# CONFIG_GP_PCI1XXXX is not set +# end of Misc devices + +# +# SCSI device support +# +CONFIG_SCSI_MOD=m +CONFIG_RAID_ATTRS=m +CONFIG_SCSI_COMMON=m +CONFIG_SCSI=m +CONFIG_SCSI_DMA=y +CONFIG_SCSI_NETLINK=y +# CONFIG_SCSI_PROC_FS is not set + +# +# SCSI support type (disk, tape, CD-ROM) +# +CONFIG_BLK_DEV_SD=m +CONFIG_CHR_DEV_ST=m +CONFIG_BLK_DEV_SR=m +CONFIG_CHR_DEV_SG=m +CONFIG_BLK_DEV_BSG=y +CONFIG_CHR_DEV_SCH=m +CONFIG_SCSI_ENCLOSURE=m +CONFIG_SCSI_CONSTANTS=y +CONFIG_SCSI_LOGGING=y +CONFIG_SCSI_SCAN_ASYNC=y + +# +# SCSI Transports +# +CONFIG_SCSI_SPI_ATTRS=m +CONFIG_SCSI_FC_ATTRS=m +CONFIG_SCSI_ISCSI_ATTRS=m +CONFIG_SCSI_SAS_ATTRS=m +CONFIG_SCSI_SAS_LIBSAS=m +CONFIG_SCSI_SAS_ATA=y +CONFIG_SCSI_SAS_HOST_SMP=y +CONFIG_SCSI_SRP_ATTRS=m +# end of SCSI Transports + +CONFIG_SCSI_LOWLEVEL=y +CONFIG_ISCSI_TCP=m +CONFIG_ISCSI_BOOT_SYSFS=m +CONFIG_SCSI_CXGB3_ISCSI=m +CONFIG_SCSI_CXGB4_ISCSI=m +CONFIG_SCSI_BNX2_ISCSI=m +CONFIG_SCSI_BNX2X_FCOE=m +CONFIG_BE2ISCSI=m +CONFIG_BLK_DEV_3W_XXXX_RAID=m +CONFIG_SCSI_HPSA=m +CONFIG_SCSI_3W_9XXX=m +CONFIG_SCSI_3W_SAS=m +CONFIG_SCSI_ACARD=m +CONFIG_SCSI_AACRAID=m +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_CMDS_PER_DEVICE=32 +CONFIG_AIC7XXX_RESET_DELAY_MS=15000 +CONFIG_AIC7XXX_DEBUG_ENABLE=y +CONFIG_AIC7XXX_DEBUG_MASK=0 +CONFIG_AIC7XXX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC79XX=m +CONFIG_AIC79XX_CMDS_PER_DEVICE=32 +CONFIG_AIC79XX_RESET_DELAY_MS=15000 +CONFIG_AIC79XX_DEBUG_ENABLE=y +CONFIG_AIC79XX_DEBUG_MASK=0 +CONFIG_AIC79XX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC94XX=m +# CONFIG_AIC94XX_DEBUG is not set +CONFIG_SCSI_HISI_SAS=m +CONFIG_SCSI_HISI_SAS_PCI=m +# CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE is not set +CONFIG_SCSI_MVSAS=m +# CONFIG_SCSI_MVSAS_DEBUG is not set +# CONFIG_SCSI_MVSAS_TASKLET is not set +CONFIG_SCSI_MVUMI=m +CONFIG_SCSI_ADVANSYS=m +# CONFIG_SCSI_ARCMSR is not set +CONFIG_SCSI_ESAS2R=m +# CONFIG_MEGARAID_NEWGEN is not set +# CONFIG_MEGARAID_LEGACY is not set +CONFIG_MEGARAID_SAS=m +CONFIG_SCSI_MPT3SAS=m +CONFIG_SCSI_MPT2SAS_MAX_SGE=128 +CONFIG_SCSI_MPT3SAS_MAX_SGE=128 +CONFIG_SCSI_MPT2SAS=m +# CONFIG_SCSI_MPI3MR is not set +CONFIG_SCSI_SMARTPQI=m +CONFIG_SCSI_HPTIOP=m +# CONFIG_SCSI_BUSLOGIC is not set +# CONFIG_SCSI_MYRB is not set +# CONFIG_SCSI_MYRS is not set +CONFIG_XEN_SCSI_FRONTEND=m +CONFIG_HYPERV_STORAGE=m +CONFIG_LIBFC=m +CONFIG_LIBFCOE=m +CONFIG_FCOE=m +CONFIG_SCSI_SNIC=m +# CONFIG_SCSI_SNIC_DEBUG_FS is not set +CONFIG_SCSI_DMX3191D=m +# CONFIG_SCSI_FDOMAIN_PCI is not set +# CONFIG_SCSI_IPS is not set +# CONFIG_SCSI_INITIO is not set +# CONFIG_SCSI_INIA100 is not set +CONFIG_SCSI_STEX=m +CONFIG_SCSI_SYM53C8XX_2=m +CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 +CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 +CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 +CONFIG_SCSI_SYM53C8XX_MMIO=y +# CONFIG_SCSI_IPR is not set +# CONFIG_SCSI_QLOGIC_1280 is not set +CONFIG_SCSI_QLA_FC=m +CONFIG_TCM_QLA2XXX=m +# CONFIG_TCM_QLA2XXX_DEBUG is not set +CONFIG_SCSI_QLA_ISCSI=m +CONFIG_QEDI=m +CONFIG_QEDF=m +CONFIG_SCSI_LPFC=m +# CONFIG_SCSI_LPFC_DEBUG_FS is not set +# CONFIG_SCSI_EFCT is not set +# CONFIG_SCSI_DC395x is not set +# CONFIG_SCSI_AM53C974 is not set +CONFIG_SCSI_WD719X=m +# CONFIG_SCSI_DEBUG is not set +CONFIG_SCSI_PMCRAID=m +CONFIG_SCSI_PM8001=m +CONFIG_SCSI_BFA_FC=m +CONFIG_SCSI_VIRTIO=m +CONFIG_SCSI_CHELSIO_FCOE=m +CONFIG_SCSI_DH=y +CONFIG_SCSI_DH_RDAC=m +CONFIG_SCSI_DH_HP_SW=m +CONFIG_SCSI_DH_EMC=m +CONFIG_SCSI_DH_ALUA=m +# end of SCSI device support + +CONFIG_ATA=m +CONFIG_SATA_HOST=y +CONFIG_PATA_TIMINGS=y +CONFIG_ATA_VERBOSE_ERROR=y +CONFIG_ATA_FORCE=y +CONFIG_ATA_ACPI=y +CONFIG_SATA_ZPODD=y +CONFIG_SATA_PMP=y + +# +# Controllers with non-SFF native interface +# +CONFIG_SATA_AHCI=m +CONFIG_SATA_MOBILE_LPM_POLICY=3 +CONFIG_SATA_AHCI_PLATFORM=m +# CONFIG_AHCI_DWC is not set +CONFIG_AHCI_CEVA=m +CONFIG_AHCI_MVEBU=m +# CONFIG_AHCI_SUNXI is not set +CONFIG_AHCI_TEGRA=m +CONFIG_AHCI_XGENE=m +CONFIG_SATA_AHCI_SEATTLE=m +# CONFIG_SATA_INIC162X is not set +CONFIG_SATA_ACARD_AHCI=m +CONFIG_SATA_SIL24=m +CONFIG_ATA_SFF=y + +# +# SFF controllers with custom DMA interface +# +CONFIG_PDC_ADMA=m +CONFIG_SATA_QSTOR=m +CONFIG_SATA_SX4=m +CONFIG_ATA_BMDMA=y + +# +# SATA SFF controllers with BMDMA +# +CONFIG_ATA_PIIX=m +# CONFIG_SATA_DWC is not set +CONFIG_SATA_MV=m +CONFIG_SATA_NV=m +CONFIG_SATA_PROMISE=m +CONFIG_SATA_SIL=m +CONFIG_SATA_SIS=m +CONFIG_SATA_SVW=m +CONFIG_SATA_ULI=m +CONFIG_SATA_VIA=m +CONFIG_SATA_VITESSE=m + +# +# PATA SFF controllers with BMDMA +# +# CONFIG_PATA_ALI is not set +# CONFIG_PATA_AMD is not set +CONFIG_PATA_ARTOP=m +# CONFIG_PATA_ATIIXP is not set +CONFIG_PATA_ATP867X=m +CONFIG_PATA_CMD64X=m +# CONFIG_PATA_CYPRESS is not set +# CONFIG_PATA_EFAR is not set +# CONFIG_PATA_HPT366 is not set +# CONFIG_PATA_HPT37X is not set +# CONFIG_PATA_HPT3X2N is not set +# CONFIG_PATA_HPT3X3 is not set +CONFIG_PATA_IT8213=m +CONFIG_PATA_IT821X=m +CONFIG_PATA_JMICRON=m +CONFIG_PATA_MARVELL=m +# CONFIG_PATA_NETCELL is not set +CONFIG_PATA_NINJA32=m +# CONFIG_PATA_NS87415 is not set +# CONFIG_PATA_OLDPIIX is not set +# CONFIG_PATA_OPTIDMA is not set +# CONFIG_PATA_PDC2027X is not set +# CONFIG_PATA_PDC_OLD is not set +# CONFIG_PATA_RADISYS is not set +CONFIG_PATA_RDC=m +CONFIG_PATA_SCH=m +# CONFIG_PATA_SERVERWORKS is not set +# CONFIG_PATA_SIL680 is not set +CONFIG_PATA_SIS=m +CONFIG_PATA_TOSHIBA=m +# CONFIG_PATA_TRIFLEX is not set +# CONFIG_PATA_VIA is not set +# CONFIG_PATA_WINBOND is not set + +# +# PIO-only SFF controllers +# +# CONFIG_PATA_CMD640_PCI is not set +# CONFIG_PATA_MPIIX is not set +# CONFIG_PATA_NS87410 is not set +# CONFIG_PATA_OPTI is not set +# CONFIG_PATA_OF_PLATFORM is not set +# CONFIG_PATA_RZ1000 is not set + +# +# Generic fallback / legacy drivers +# +# CONFIG_PATA_ACPI is not set +CONFIG_ATA_GENERIC=m +# CONFIG_PATA_LEGACY is not set +CONFIG_MD=y +CONFIG_BLK_DEV_MD=m +CONFIG_MD_BITMAP_FILE=y +CONFIG_MD_LINEAR=m +CONFIG_MD_RAID0=m +CONFIG_MD_RAID1=m +CONFIG_MD_RAID10=m +CONFIG_MD_RAID456=m +CONFIG_MD_MULTIPATH=m +CONFIG_MD_FAULTY=m +# CONFIG_MD_CLUSTER is not set +CONFIG_BCACHE=m +# CONFIG_BCACHE_DEBUG is not set +# CONFIG_BCACHE_CLOSURES_DEBUG is not set +# CONFIG_BCACHE_ASYNC_REGISTRATION is not set +CONFIG_BLK_DEV_DM_BUILTIN=y +CONFIG_BLK_DEV_DM=m +# CONFIG_DM_DEBUG is not set +CONFIG_DM_BUFIO=m +# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set +CONFIG_DM_BIO_PRISON=m +CONFIG_DM_PERSISTENT_DATA=m +CONFIG_DM_UNSTRIPED=m +CONFIG_DM_CRYPT=m +CONFIG_DM_SNAPSHOT=m +CONFIG_DM_THIN_PROVISIONING=m +CONFIG_DM_CACHE=m +CONFIG_DM_CACHE_SMQ=m +CONFIG_DM_WRITECACHE=m +# CONFIG_DM_EBS is not set +CONFIG_DM_ERA=m +# CONFIG_DM_CLONE is not set +CONFIG_DM_MIRROR=m +CONFIG_DM_LOG_USERSPACE=m +CONFIG_DM_RAID=m +CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_DM_MULTIPATH_QL=m +CONFIG_DM_MULTIPATH_ST=m +# CONFIG_DM_MULTIPATH_HST is not set +# CONFIG_DM_MULTIPATH_IOA is not set +CONFIG_DM_DELAY=m +# CONFIG_DM_DUST is not set +CONFIG_DM_UEVENT=y +CONFIG_DM_FLAKEY=m +CONFIG_DM_VERITY=m +# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set +# CONFIG_DM_VERITY_FEC is not set +CONFIG_DM_SWITCH=m +CONFIG_DM_LOG_WRITES=m +CONFIG_DM_INTEGRITY=m +CONFIG_DM_ZONED=m +CONFIG_DM_AUDIT=y +CONFIG_TARGET_CORE=m +CONFIG_TCM_IBLOCK=m +CONFIG_TCM_FILEIO=m +CONFIG_TCM_PSCSI=m +CONFIG_TCM_USER2=m +CONFIG_LOOPBACK_TARGET=m +CONFIG_TCM_FC=m +CONFIG_ISCSI_TARGET=m +CONFIG_ISCSI_TARGET_CXGB4=m +CONFIG_SBP_TARGET=m +# CONFIG_REMOTE_TARGET is not set +CONFIG_FUSION=y +CONFIG_FUSION_SPI=m +CONFIG_FUSION_FC=m +CONFIG_FUSION_SAS=m +CONFIG_FUSION_MAX_SGE=128 +CONFIG_FUSION_CTL=m +# CONFIG_FUSION_LOGGING is not set + +# +# IEEE 1394 (FireWire) support +# +CONFIG_FIREWIRE=m +CONFIG_FIREWIRE_OHCI=m +CONFIG_FIREWIRE_SBP2=m +CONFIG_FIREWIRE_NET=m +CONFIG_FIREWIRE_NOSY=m +# end of IEEE 1394 (FireWire) support + +CONFIG_NETDEVICES=y +CONFIG_MII=m +CONFIG_NET_CORE=y +CONFIG_BONDING=m +CONFIG_DUMMY=m +# CONFIG_WIREGUARD is not set +CONFIG_EQUALIZER=m +# CONFIG_NET_FC is not set +CONFIG_IFB=m +CONFIG_NET_TEAM=m +CONFIG_NET_TEAM_MODE_BROADCAST=m +CONFIG_NET_TEAM_MODE_ROUNDROBIN=m +CONFIG_NET_TEAM_MODE_RANDOM=m +CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=m +CONFIG_NET_TEAM_MODE_LOADBALANCE=m +CONFIG_MACVLAN=m +CONFIG_MACVTAP=m +CONFIG_IPVLAN_L3S=y +CONFIG_IPVLAN=m +CONFIG_IPVTAP=m +CONFIG_VXLAN=m +CONFIG_GENEVE=m +# CONFIG_BAREUDP is not set +CONFIG_GTP=m +# CONFIG_AMT is not set +CONFIG_MACSEC=m +CONFIG_NETCONSOLE=m +CONFIG_NETCONSOLE_DYNAMIC=y +# CONFIG_NETCONSOLE_EXTENDED_LOG is not set +CONFIG_NETPOLL=y +CONFIG_NET_POLL_CONTROLLER=y +CONFIG_TUN=m +CONFIG_TAP=m +# CONFIG_TUN_VNET_CROSS_LE is not set +CONFIG_VETH=m +CONFIG_VIRTIO_NET=m +CONFIG_NLMON=m +CONFIG_NET_VRF=m +CONFIG_NETKIT=y +CONFIG_VSOCKMON=m +# CONFIG_ARCNET is not set +CONFIG_ATM_DRIVERS=y +CONFIG_ATM_DUMMY=m +# CONFIG_ATM_TCP is not set +# CONFIG_ATM_LANAI is not set +# CONFIG_ATM_ENI is not set +CONFIG_ATM_NICSTAR=m +CONFIG_ATM_NICSTAR_USE_SUNI=y +CONFIG_ATM_NICSTAR_USE_IDT77105=y +# CONFIG_ATM_IDT77252 is not set +CONFIG_ATM_IA=m +# CONFIG_ATM_IA_DEBUG is not set +CONFIG_ATM_FORE200E=m +# CONFIG_ATM_FORE200E_USE_TASKLET is not set +CONFIG_ATM_FORE200E_TX_RETRY=16 +CONFIG_ATM_FORE200E_DEBUG=0 +# CONFIG_ATM_HE is not set +CONFIG_ATM_SOLOS=m + +# +# Distributed Switch Architecture drivers +# +# CONFIG_B53 is not set +# CONFIG_NET_DSA_BCM_SF2 is not set +# CONFIG_NET_DSA_LOOP is not set +# CONFIG_NET_DSA_LANTIQ_GSWIP is not set +# CONFIG_NET_DSA_MT7530 is not set +CONFIG_NET_DSA_MV88E6060=m +# CONFIG_NET_DSA_MICROCHIP_KSZ_COMMON is not set +CONFIG_NET_DSA_MV88E6XXX=m +# CONFIG_NET_DSA_MV88E6XXX_PTP is not set +# CONFIG_NET_DSA_MSCC_OCELOT_EXT is not set +# CONFIG_NET_DSA_MSCC_SEVILLE is not set +# CONFIG_NET_DSA_AR9331 is not set +# CONFIG_NET_DSA_QCA8K is not set +# CONFIG_NET_DSA_SJA1105 is not set +# CONFIG_NET_DSA_XRS700X_I2C is not set +# CONFIG_NET_DSA_XRS700X_MDIO is not set +# CONFIG_NET_DSA_REALTEK is not set +# CONFIG_NET_DSA_SMSC_LAN9303_I2C is not set +# CONFIG_NET_DSA_SMSC_LAN9303_MDIO is not set +# CONFIG_NET_DSA_VITESSE_VSC73XX_SPI is not set +# CONFIG_NET_DSA_VITESSE_VSC73XX_PLATFORM is not set +# end of Distributed Switch Architecture drivers + +CONFIG_ETHERNET=y +CONFIG_MDIO=m +CONFIG_NET_VENDOR_3COM=y +CONFIG_VORTEX=m +CONFIG_TYPHOON=m +CONFIG_NET_VENDOR_ADAPTEC=y +CONFIG_ADAPTEC_STARFIRE=m +CONFIG_NET_VENDOR_AGERE=y +CONFIG_ET131X=m +CONFIG_NET_VENDOR_ALACRITECH=y +# CONFIG_SLICOSS is not set +CONFIG_NET_VENDOR_ALLWINNER=y +# CONFIG_SUN4I_EMAC is not set +CONFIG_NET_VENDOR_ALTEON=y +CONFIG_ACENIC=m +# CONFIG_ACENIC_OMIT_TIGON_I is not set +# CONFIG_ALTERA_TSE is not set +CONFIG_NET_VENDOR_AMAZON=y +CONFIG_ENA_ETHERNET=m +CONFIG_NET_VENDOR_AMD=y +# CONFIG_AMD8111_ETH is not set +CONFIG_PCNET32=m +CONFIG_AMD_XGBE=m +CONFIG_AMD_XGBE_DCB=y +# CONFIG_PDS_CORE is not set +CONFIG_NET_XGENE=m +CONFIG_NET_XGENE_V2=m +CONFIG_NET_VENDOR_AQUANTIA=y +CONFIG_AQTION=m +# CONFIG_NET_VENDOR_ARC is not set +CONFIG_NET_VENDOR_ASIX=y +# CONFIG_SPI_AX88796C is not set +CONFIG_NET_VENDOR_ATHEROS=y +CONFIG_ATL2=m +CONFIG_ATL1=m +CONFIG_ATL1E=m +CONFIG_ATL1C=m +CONFIG_ALX=m +CONFIG_NET_VENDOR_BROADCOM=y +# CONFIG_B44 is not set +# CONFIG_BCMGENET is not set +CONFIG_BNX2=m +CONFIG_CNIC=m +CONFIG_TIGON3=m +CONFIG_TIGON3_HWMON=y +CONFIG_BNX2X=m +CONFIG_BNX2X_SRIOV=y +# CONFIG_SYSTEMPORT is not set +CONFIG_BNXT=m +CONFIG_BNXT_SRIOV=y +CONFIG_BNXT_FLOWER_OFFLOAD=y +CONFIG_BNXT_DCB=y +CONFIG_BNXT_HWMON=y +CONFIG_NET_VENDOR_CADENCE=y +CONFIG_MACB=m +CONFIG_MACB_USE_HWSTAMP=y +# CONFIG_MACB_PCI is not set +CONFIG_NET_VENDOR_CAVIUM=y +CONFIG_THUNDER_NIC_PF=m +CONFIG_THUNDER_NIC_VF=m +CONFIG_THUNDER_NIC_BGX=m +CONFIG_THUNDER_NIC_RGX=m +CONFIG_CAVIUM_PTP=m +CONFIG_LIQUIDIO_CORE=m +CONFIG_LIQUIDIO=m +CONFIG_LIQUIDIO_VF=m +CONFIG_NET_VENDOR_CHELSIO=y +CONFIG_CHELSIO_T1=m +CONFIG_CHELSIO_T1_1G=y +CONFIG_CHELSIO_T3=m +CONFIG_CHELSIO_T4=m +CONFIG_CHELSIO_T4_DCB=y +CONFIG_CHELSIO_T4_FCOE=y +CONFIG_CHELSIO_T4VF=m +CONFIG_CHELSIO_LIB=m +CONFIG_CHELSIO_INLINE_CRYPTO=y +# CONFIG_CHELSIO_IPSEC_INLINE is not set +CONFIG_NET_VENDOR_CISCO=y +CONFIG_ENIC=m +CONFIG_NET_VENDOR_CORTINA=y +# CONFIG_GEMINI_ETHERNET is not set +CONFIG_NET_VENDOR_DAVICOM=y +# CONFIG_DM9051 is not set +# CONFIG_DNET is not set +CONFIG_NET_VENDOR_DEC=y +CONFIG_NET_TULIP=y +CONFIG_DE2104X=m +CONFIG_DE2104X_DSL=0 +CONFIG_TULIP=m +# CONFIG_TULIP_MWI is not set +# CONFIG_TULIP_MMIO is not set +CONFIG_TULIP_NAPI=y +CONFIG_TULIP_NAPI_HW_MITIGATION=y +CONFIG_WINBOND_840=m +CONFIG_DM9102=m +CONFIG_ULI526X=m +CONFIG_NET_VENDOR_DLINK=y +CONFIG_DL2K=m +CONFIG_SUNDANCE=m +# CONFIG_SUNDANCE_MMIO is not set +CONFIG_NET_VENDOR_EMULEX=y +CONFIG_BE2NET=m +CONFIG_BE2NET_HWMON=y +CONFIG_BE2NET_BE2=y +CONFIG_BE2NET_BE3=y +CONFIG_BE2NET_LANCER=y +CONFIG_BE2NET_SKYHAWK=y +CONFIG_NET_VENDOR_ENGLEDER=y +# CONFIG_TSNEP is not set +CONFIG_NET_VENDOR_EZCHIP=y +# CONFIG_EZCHIP_NPS_MANAGEMENT_ENET is not set +CONFIG_NET_VENDOR_FUNGIBLE=y +# CONFIG_FUN_ETH is not set +CONFIG_NET_VENDOR_GOOGLE=y +# CONFIG_GVE is not set +CONFIG_NET_VENDOR_HISILICON=y +CONFIG_HIX5HD2_GMAC=m +CONFIG_HISI_FEMAC=m +CONFIG_HIP04_ETH=m +# CONFIG_HI13X1_GMAC is not set +CONFIG_HNS_MDIO=m +# CONFIG_HNS_DSAF is not set +# CONFIG_HNS_ENET is not set +# CONFIG_HNS3 is not set +CONFIG_NET_VENDOR_HUAWEI=y +CONFIG_HINIC=m +CONFIG_NET_VENDOR_I825XX=y +CONFIG_NET_VENDOR_INTEL=y +CONFIG_E100=m +CONFIG_E1000=m +CONFIG_E1000E=m +CONFIG_IGB=m +CONFIG_IGB_HWMON=y +CONFIG_IGBVF=m +CONFIG_IXGBE=m +CONFIG_IXGBE_HWMON=y +CONFIG_IXGBE_DCB=y +CONFIG_IXGBE_IPSEC=y +CONFIG_IXGBEVF=m +CONFIG_IXGBEVF_IPSEC=y +CONFIG_I40E=m +CONFIG_I40E_DCB=y +CONFIG_IAVF=m +CONFIG_I40EVF=m +CONFIG_ICE=m +CONFIG_ICE_SWITCHDEV=y +# CONFIG_FM10K is not set +# CONFIG_IGC is not set +CONFIG_JME=m +CONFIG_NET_VENDOR_ADI=y +# CONFIG_ADIN1110 is not set +CONFIG_NET_VENDOR_LITEX=y +# CONFIG_LITEX_LITEETH is not set +CONFIG_NET_VENDOR_MARVELL=y +CONFIG_MVMDIO=m +CONFIG_MVNETA=m +CONFIG_MVPP2=m +# CONFIG_MVPP2_PTP is not set +CONFIG_SKGE=m +# CONFIG_SKGE_DEBUG is not set +CONFIG_SKGE_GENESIS=y +CONFIG_SKY2=m +# CONFIG_SKY2_DEBUG is not set +# CONFIG_OCTEONTX2_AF is not set +# CONFIG_OCTEONTX2_PF is not set +# CONFIG_OCTEON_EP is not set +# CONFIG_PRESTERA is not set +CONFIG_NET_VENDOR_MELLANOX=y +CONFIG_MLX4_EN=m +CONFIG_MLX4_EN_DCB=y +CONFIG_MLX4_CORE=m +CONFIG_MLX4_DEBUG=y +CONFIG_MLX4_CORE_GEN2=y +CONFIG_MLX5_CORE=m +CONFIG_MLX5_FPGA=y +CONFIG_MLX5_CORE_EN=y +CONFIG_MLX5_EN_ARFS=y +CONFIG_MLX5_EN_RXNFC=y +CONFIG_MLX5_MPFS=y +CONFIG_MLX5_ESWITCH=y +CONFIG_MLX5_BRIDGE=y +CONFIG_MLX5_CORE_EN_DCB=y +CONFIG_MLX5_CORE_IPOIB=y +# CONFIG_MLX5_MACSEC is not set +# CONFIG_MLX5_EN_IPSEC is not set +CONFIG_MLX5_SW_STEERING=y +# CONFIG_MLX5_SF is not set +# CONFIG_MLXSW_CORE is not set +CONFIG_MLXFW=m +# CONFIG_MLXBF_GIGE is not set +CONFIG_NET_VENDOR_MICREL=y +# CONFIG_KS8842 is not set +# CONFIG_KS8851 is not set +# CONFIG_KS8851_MLL is not set +CONFIG_KSZ884X_PCI=m +CONFIG_NET_VENDOR_MICROCHIP=y +# CONFIG_ENC28J60 is not set +# CONFIG_ENCX24J600 is not set +CONFIG_LAN743X=m +# CONFIG_LAN966X_SWITCH is not set +# CONFIG_VCAP is not set +CONFIG_NET_VENDOR_MICROSEMI=y +# CONFIG_MSCC_OCELOT_SWITCH is not set +CONFIG_NET_VENDOR_MICROSOFT=y +CONFIG_NET_VENDOR_MYRI=y +CONFIG_MYRI10GE=m +CONFIG_FEALNX=m +CONFIG_NET_VENDOR_NI=y +# CONFIG_NI_XGE_MANAGEMENT_ENET is not set +CONFIG_NET_VENDOR_NATSEMI=y +CONFIG_NATSEMI=m +CONFIG_NS83820=m +CONFIG_NET_VENDOR_NETERION=y +CONFIG_S2IO=m +CONFIG_NET_VENDOR_NETRONOME=y +CONFIG_NFP=m +CONFIG_NFP_APP_FLOWER=y +CONFIG_NFP_APP_ABM_NIC=y +CONFIG_NFP_NET_IPSEC=y +# CONFIG_NFP_DEBUG is not set +CONFIG_NET_VENDOR_8390=y +CONFIG_NE2K_PCI=m +CONFIG_NET_VENDOR_NVIDIA=y +# CONFIG_FORCEDETH is not set +CONFIG_NET_VENDOR_OKI=y +# CONFIG_ETHOC is not set +CONFIG_NET_VENDOR_PACKET_ENGINES=y +CONFIG_HAMACHI=m +CONFIG_YELLOWFIN=m +CONFIG_NET_VENDOR_PENSANDO=y +# CONFIG_IONIC is not set +CONFIG_NET_VENDOR_QLOGIC=y +CONFIG_QLA3XXX=m +CONFIG_QLCNIC=m +CONFIG_QLCNIC_SRIOV=y +CONFIG_QLCNIC_DCB=y +CONFIG_QLCNIC_HWMON=y +CONFIG_NETXEN_NIC=m +CONFIG_QED=m +CONFIG_QED_LL2=y +CONFIG_QED_SRIOV=y +CONFIG_QEDE=m +CONFIG_QED_RDMA=y +CONFIG_QED_ISCSI=y +CONFIG_QED_FCOE=y +CONFIG_QED_OOO=y +CONFIG_NET_VENDOR_BROCADE=y +CONFIG_BNA=m +CONFIG_NET_VENDOR_QUALCOMM=y +# CONFIG_QCA7000_SPI is not set +# CONFIG_QCA7000_UART is not set +CONFIG_QCOM_EMAC=m +# CONFIG_RMNET is not set +CONFIG_NET_VENDOR_RDC=y +CONFIG_R6040=m +CONFIG_NET_VENDOR_REALTEK=y +CONFIG_8139CP=m +CONFIG_8139TOO=m +# CONFIG_8139TOO_PIO is not set +CONFIG_8139TOO_TUNE_TWISTER=y +CONFIG_8139TOO_8129=y +# CONFIG_8139_OLD_RX_RESET is not set +CONFIG_R8169=m +CONFIG_NET_VENDOR_RENESAS=y +CONFIG_NET_VENDOR_ROCKER=y +# CONFIG_ROCKER is not set +CONFIG_NET_VENDOR_SAMSUNG=y +# CONFIG_SXGBE_ETH is not set +# CONFIG_NET_VENDOR_SEEQ is not set +CONFIG_NET_VENDOR_SILAN=y +CONFIG_SC92031=m +CONFIG_NET_VENDOR_SIS=y +# CONFIG_SIS900 is not set +CONFIG_SIS190=m +CONFIG_NET_VENDOR_SOLARFLARE=y +CONFIG_SFC=m +CONFIG_SFC_MTD=y +CONFIG_SFC_MCDI_MON=y +CONFIG_SFC_SRIOV=y +CONFIG_SFC_MCDI_LOGGING=y +CONFIG_SFC_FALCON=m +CONFIG_SFC_FALCON_MTD=y +# CONFIG_SFC_SIENA is not set +CONFIG_NET_VENDOR_SMSC=y +CONFIG_SMC91X=m +CONFIG_EPIC100=m +CONFIG_SMSC911X=m +CONFIG_SMSC9420=m +CONFIG_NET_VENDOR_SOCIONEXT=y +CONFIG_SNI_NETSEC=m +CONFIG_NET_VENDOR_STMICRO=y +CONFIG_STMMAC_ETH=m +# CONFIG_STMMAC_SELFTESTS is not set +CONFIG_STMMAC_PLATFORM=m +# CONFIG_DWMAC_DWC_QOS_ETH is not set +CONFIG_DWMAC_GENERIC=m +CONFIG_DWMAC_IPQ806X=m +CONFIG_DWMAC_MESON=m +CONFIG_DWMAC_QCOM_ETHQOS=m +CONFIG_DWMAC_ROCKCHIP=m +CONFIG_DWMAC_SUNXI=m +CONFIG_DWMAC_SUN8I=m +# CONFIG_DWMAC_INTEL_PLAT is not set +# CONFIG_DWMAC_TEGRA is not set +# CONFIG_DWMAC_LOONGSON is not set +# CONFIG_STMMAC_PCI is not set +CONFIG_NET_VENDOR_SUN=y +# CONFIG_HAPPYMEAL is not set +# CONFIG_SUNGEM is not set +CONFIG_CASSINI=m +CONFIG_NIU=m +CONFIG_NET_VENDOR_SYNOPSYS=y +# CONFIG_DWC_XLGMAC is not set +CONFIG_NET_VENDOR_TEHUTI=y +CONFIG_TEHUTI=m +CONFIG_NET_VENDOR_TI=y +# CONFIG_TI_CPSW_PHY_SEL is not set +CONFIG_TLAN=m +CONFIG_NET_VENDOR_VERTEXCOM=y +# CONFIG_MSE102X is not set +CONFIG_NET_VENDOR_VIA=y +# CONFIG_VIA_RHINE is not set +CONFIG_VIA_VELOCITY=m +CONFIG_NET_VENDOR_WANGXUN=y +# CONFIG_NGBE is not set +# CONFIG_TXGBE is not set +CONFIG_NET_VENDOR_WIZNET=y +# CONFIG_WIZNET_W5100 is not set +# CONFIG_WIZNET_W5300 is not set +CONFIG_NET_VENDOR_XILINX=y +# CONFIG_XILINX_EMACLITE is not set +# CONFIG_XILINX_AXI_EMAC is not set +# CONFIG_XILINX_LL_TEMAC is not set +CONFIG_FDDI=y +CONFIG_DEFXX=m +CONFIG_SKFP=m +# CONFIG_HIPPI is not set +# CONFIG_NET_SB1000 is not set +CONFIG_PHYLINK=m +CONFIG_PHYLIB=m +CONFIG_SWPHY=y +CONFIG_LED_TRIGGER_PHY=y +CONFIG_PHYLIB_LEDS=y +CONFIG_FIXED_PHY=m +CONFIG_SFP=m + +# +# MII PHY device drivers +# +CONFIG_AMD_PHY=m +CONFIG_MESON_GXL_PHY=m +CONFIG_ADIN_PHY=m +# CONFIG_ADIN1100_PHY is not set +CONFIG_AQUANTIA_PHY=m +CONFIG_AX88796B_PHY=m +CONFIG_BROADCOM_PHY=m +# CONFIG_BCM54140_PHY is not set +# CONFIG_BCM7XXX_PHY is not set +# CONFIG_BCM84881_PHY is not set +CONFIG_BCM87XX_PHY=m +CONFIG_BCM_NET_PHYLIB=m +CONFIG_CICADA_PHY=m +CONFIG_CORTINA_PHY=m +CONFIG_DAVICOM_PHY=m +CONFIG_ICPLUS_PHY=m +CONFIG_LXT_PHY=m +# CONFIG_INTEL_XWAY_PHY is not set +CONFIG_LSI_ET1011C_PHY=m +CONFIG_MARVELL_PHY=m +CONFIG_MARVELL_10G_PHY=m +# CONFIG_MARVELL_88Q2XXX_PHY is not set +# CONFIG_MARVELL_88X2222_PHY is not set +# CONFIG_MAXLINEAR_GPHY is not set +# CONFIG_MEDIATEK_GE_PHY is not set +CONFIG_MICREL_PHY=m +# CONFIG_MICROCHIP_T1S_PHY is not set +CONFIG_MICROCHIP_PHY=m +CONFIG_MICROCHIP_T1_PHY=m +CONFIG_MICROSEMI_PHY=m +# CONFIG_MOTORCOMM_PHY is not set +CONFIG_NATIONAL_PHY=m +# CONFIG_NXP_CBTX_PHY is not set +# CONFIG_NXP_C45_TJA11XX_PHY is not set +# CONFIG_NXP_TJA11XX_PHY is not set +# CONFIG_NCN26000_PHY is not set +CONFIG_AT803X_PHY=m +CONFIG_QSEMI_PHY=m +CONFIG_REALTEK_PHY=m +CONFIG_RENESAS_PHY=m +CONFIG_ROCKCHIP_PHY=m +CONFIG_SMSC_PHY=m +CONFIG_STE10XP=m +CONFIG_TERANETICS_PHY=m +CONFIG_DP83822_PHY=m +CONFIG_DP83TC811_PHY=m +CONFIG_DP83848_PHY=m +CONFIG_DP83867_PHY=m +# CONFIG_DP83869_PHY is not set +# CONFIG_DP83TD510_PHY is not set +CONFIG_VITESSE_PHY=m +# CONFIG_XILINX_GMII2RGMII is not set +# CONFIG_MICREL_KS8995MA is not set +# CONFIG_PSE_CONTROLLER is not set +CONFIG_CAN_DEV=m +CONFIG_CAN_VCAN=m +CONFIG_CAN_VXCAN=m +CONFIG_CAN_NETLINK=y +CONFIG_CAN_CALC_BITTIMING=y +CONFIG_CAN_RX_OFFLOAD=y +# CONFIG_CAN_CAN327 is not set +# CONFIG_CAN_FLEXCAN is not set +# CONFIG_CAN_GRCAN is not set +# CONFIG_CAN_KVASER_PCIEFD is not set +CONFIG_CAN_SLCAN=m +# CONFIG_CAN_XILINXCAN is not set +# CONFIG_CAN_C_CAN is not set +# CONFIG_CAN_CC770 is not set +# CONFIG_CAN_CTUCANFD_PCI is not set +# CONFIG_CAN_CTUCANFD_PLATFORM is not set +# CONFIG_CAN_IFI_CANFD is not set +# CONFIG_CAN_M_CAN is not set +CONFIG_CAN_PEAK_PCIEFD=m +CONFIG_CAN_SJA1000=m +CONFIG_CAN_EMS_PCI=m +# CONFIG_CAN_F81601 is not set +CONFIG_CAN_KVASER_PCI=m +CONFIG_CAN_PEAK_PCI=m +CONFIG_CAN_PEAK_PCIEC=y +CONFIG_CAN_PLX_PCI=m +CONFIG_CAN_SJA1000_ISA=m +# CONFIG_CAN_SJA1000_PLATFORM is not set +CONFIG_CAN_SOFTING=m + +# +# CAN SPI interfaces +# +# CONFIG_CAN_HI311X is not set +# CONFIG_CAN_MCP251X is not set +# CONFIG_CAN_MCP251XFD is not set +# end of CAN SPI interfaces + +# +# CAN USB interfaces +# +CONFIG_CAN_8DEV_USB=m +CONFIG_CAN_EMS_USB=m +# CONFIG_CAN_ESD_USB is not set +# CONFIG_CAN_ETAS_ES58X is not set +# CONFIG_CAN_F81604 is not set +CONFIG_CAN_GS_USB=m +CONFIG_CAN_KVASER_USB=m +CONFIG_CAN_MCBA_USB=m +CONFIG_CAN_PEAK_USB=m +CONFIG_CAN_UCAN=m +# end of CAN USB interfaces + +# CONFIG_CAN_DEBUG_DEVICES is not set +CONFIG_MDIO_DEVICE=m +CONFIG_MDIO_BUS=m +CONFIG_FWNODE_MDIO=m +CONFIG_OF_MDIO=m +CONFIG_ACPI_MDIO=m +CONFIG_MDIO_DEVRES=m +# CONFIG_MDIO_SUN4I is not set +CONFIG_MDIO_XGENE=m +# CONFIG_MDIO_BITBANG is not set +# CONFIG_MDIO_BCM_UNIMAC is not set +CONFIG_MDIO_CAVIUM=m +CONFIG_MDIO_HISI_FEMAC=m +CONFIG_MDIO_I2C=m +# CONFIG_MDIO_MVUSB is not set +# CONFIG_MDIO_MSCC_MIIM is not set +# CONFIG_MDIO_OCTEON is not set +# CONFIG_MDIO_IPQ4019 is not set +# CONFIG_MDIO_IPQ8064 is not set +CONFIG_MDIO_THUNDER=m + +# +# MDIO Multiplexers +# +CONFIG_MDIO_BUS_MUX=m +CONFIG_MDIO_BUS_MUX_MESON_G12A=m +CONFIG_MDIO_BUS_MUX_MESON_GXL=m +# CONFIG_MDIO_BUS_MUX_GPIO is not set +# CONFIG_MDIO_BUS_MUX_MULTIPLEXER is not set +CONFIG_MDIO_BUS_MUX_MMIOREG=m + +# +# PCS device drivers +# +CONFIG_PCS_XPCS=m +# end of PCS device drivers + +# CONFIG_PLIP is not set +CONFIG_PPP=m +CONFIG_PPP_BSDCOMP=m +CONFIG_PPP_DEFLATE=m +CONFIG_PPP_FILTER=y +CONFIG_PPP_MPPE=m +CONFIG_PPP_MULTILINK=y +CONFIG_PPPOATM=m +CONFIG_PPPOE=m +# CONFIG_PPPOE_HASH_BITS_1 is not set +# CONFIG_PPPOE_HASH_BITS_2 is not set +CONFIG_PPPOE_HASH_BITS_4=y +# CONFIG_PPPOE_HASH_BITS_8 is not set +CONFIG_PPPOE_HASH_BITS=4 +CONFIG_PPTP=m +CONFIG_PPPOL2TP=m +CONFIG_PPP_ASYNC=m +CONFIG_PPP_SYNC_TTY=m +CONFIG_SLIP=m +CONFIG_SLHC=m +CONFIG_SLIP_COMPRESSED=y +CONFIG_SLIP_SMART=y +CONFIG_SLIP_MODE_SLIP6=y + +# +# Host-side USB support is needed for USB Network Adapter support +# +CONFIG_USB_NET_DRIVERS=m +CONFIG_USB_CATC=m +CONFIG_USB_KAWETH=m +CONFIG_USB_PEGASUS=m +CONFIG_USB_RTL8150=m +CONFIG_USB_RTL8152=m +CONFIG_USB_LAN78XX=m +CONFIG_USB_USBNET=m +CONFIG_USB_NET_AX8817X=m +CONFIG_USB_NET_AX88179_178A=m +CONFIG_USB_NET_CDCETHER=m +CONFIG_USB_NET_CDC_EEM=m +CONFIG_USB_NET_CDC_NCM=m +CONFIG_USB_NET_HUAWEI_CDC_NCM=m +CONFIG_USB_NET_CDC_MBIM=m +CONFIG_USB_NET_DM9601=m +CONFIG_USB_NET_SR9700=m +CONFIG_USB_NET_SR9800=m +CONFIG_USB_NET_SMSC75XX=m +CONFIG_USB_NET_SMSC95XX=m +CONFIG_USB_NET_GL620A=m +CONFIG_USB_NET_NET1080=m +CONFIG_USB_NET_PLUSB=m +CONFIG_USB_NET_MCS7830=m +CONFIG_USB_NET_RNDIS_HOST=m +CONFIG_USB_NET_CDC_SUBSET_ENABLE=m +CONFIG_USB_NET_CDC_SUBSET=m +CONFIG_USB_ALI_M5632=y +CONFIG_USB_AN2720=y +CONFIG_USB_BELKIN=y +CONFIG_USB_ARMLINUX=y +CONFIG_USB_EPSON2888=y +CONFIG_USB_KC2190=y +CONFIG_USB_NET_ZAURUS=m +CONFIG_USB_NET_CX82310_ETH=m +CONFIG_USB_NET_KALMIA=m +CONFIG_USB_NET_QMI_WWAN=m +CONFIG_USB_HSO=m +CONFIG_USB_NET_INT51X1=m +CONFIG_USB_CDC_PHONET=m +CONFIG_USB_IPHETH=m +CONFIG_USB_SIERRA_NET=m +CONFIG_USB_VL600=m +CONFIG_USB_NET_CH9200=m +# CONFIG_USB_NET_AQC111 is not set +CONFIG_USB_RTL8153_ECM=m +CONFIG_WLAN=y +CONFIG_WLAN_VENDOR_ADMTEK=y +CONFIG_ADM8211=m +CONFIG_ATH_COMMON=m +CONFIG_WLAN_VENDOR_ATH=y +# CONFIG_ATH_DEBUG is not set +CONFIG_ATH5K=m +# CONFIG_ATH5K_DEBUG is not set +# CONFIG_ATH5K_TRACER is not set +CONFIG_ATH5K_PCI=y +CONFIG_ATH9K_HW=m +CONFIG_ATH9K_COMMON=m +CONFIG_ATH9K_BTCOEX_SUPPORT=y +CONFIG_ATH9K=m +CONFIG_ATH9K_PCI=y +# CONFIG_ATH9K_AHB is not set +# CONFIG_ATH9K_DEBUGFS is not set +# CONFIG_ATH9K_DYNACK is not set +# CONFIG_ATH9K_WOW is not set +CONFIG_ATH9K_RFKILL=y +CONFIG_ATH9K_CHANNEL_CONTEXT=y +CONFIG_ATH9K_PCOEM=y +CONFIG_ATH9K_PCI_NO_EEPROM=m +CONFIG_ATH9K_HTC=m +# CONFIG_ATH9K_HTC_DEBUGFS is not set +# CONFIG_ATH9K_HWRNG is not set +CONFIG_CARL9170=m +CONFIG_CARL9170_LEDS=y +CONFIG_CARL9170_WPC=y +# CONFIG_CARL9170_HWRNG is not set +CONFIG_ATH6KL=m +CONFIG_ATH6KL_SDIO=m +CONFIG_ATH6KL_USB=m +# CONFIG_ATH6KL_DEBUG is not set +# CONFIG_ATH6KL_TRACING is not set +CONFIG_AR5523=m +CONFIG_WIL6210=m +CONFIG_WIL6210_ISR_COR=y +CONFIG_WIL6210_TRACING=y +CONFIG_WIL6210_DEBUGFS=y +CONFIG_ATH10K=m +CONFIG_ATH10K_CE=y +CONFIG_ATH10K_PCI=m +# CONFIG_ATH10K_AHB is not set +# CONFIG_ATH10K_SDIO is not set +CONFIG_ATH10K_USB=m +# CONFIG_ATH10K_DEBUG is not set +# CONFIG_ATH10K_DEBUGFS is not set +# CONFIG_ATH10K_TRACING is not set +CONFIG_WCN36XX=m +# CONFIG_WCN36XX_DEBUGFS is not set +# CONFIG_ATH11K is not set +# CONFIG_ATH12K is not set +CONFIG_WLAN_VENDOR_ATMEL=y +# CONFIG_ATMEL is not set +CONFIG_AT76C50X_USB=m +CONFIG_WLAN_VENDOR_BROADCOM=y +CONFIG_B43=m +CONFIG_B43_BCMA=y +CONFIG_B43_SSB=y +CONFIG_B43_BUSES_BCMA_AND_SSB=y +# CONFIG_B43_BUSES_BCMA is not set +# CONFIG_B43_BUSES_SSB is not set +CONFIG_B43_PCI_AUTOSELECT=y +CONFIG_B43_PCICORE_AUTOSELECT=y +CONFIG_B43_SDIO=y +CONFIG_B43_BCMA_PIO=y +CONFIG_B43_PIO=y +CONFIG_B43_PHY_G=y +CONFIG_B43_PHY_N=y +CONFIG_B43_PHY_LP=y +CONFIG_B43_PHY_HT=y +CONFIG_B43_LEDS=y +CONFIG_B43_HWRNG=y +# CONFIG_B43_DEBUG is not set +CONFIG_B43LEGACY=m +CONFIG_B43LEGACY_PCI_AUTOSELECT=y +CONFIG_B43LEGACY_PCICORE_AUTOSELECT=y +CONFIG_B43LEGACY_LEDS=y +CONFIG_B43LEGACY_HWRNG=y +CONFIG_B43LEGACY_DEBUG=y +CONFIG_B43LEGACY_DMA=y +CONFIG_B43LEGACY_PIO=y +CONFIG_B43LEGACY_DMA_AND_PIO_MODE=y +# CONFIG_B43LEGACY_DMA_MODE is not set +# CONFIG_B43LEGACY_PIO_MODE is not set +CONFIG_BRCMUTIL=m +CONFIG_BRCMSMAC=m +CONFIG_BRCMFMAC=m +CONFIG_BRCMFMAC_PROTO_BCDC=y +CONFIG_BRCMFMAC_PROTO_MSGBUF=y +CONFIG_BRCMFMAC_SDIO=y +CONFIG_BRCMFMAC_USB=y +CONFIG_BRCMFMAC_PCIE=y +# CONFIG_BRCM_TRACING is not set +# CONFIG_BRCMDBG is not set +CONFIG_WLAN_VENDOR_CISCO=y +# CONFIG_AIRO is not set +CONFIG_WLAN_VENDOR_INTEL=y +# CONFIG_IPW2100 is not set +CONFIG_IPW2200=m +CONFIG_IPW2200_MONITOR=y +CONFIG_IPW2200_RADIOTAP=y +CONFIG_IPW2200_PROMISCUOUS=y +CONFIG_IPW2200_QOS=y +# CONFIG_IPW2200_DEBUG is not set +CONFIG_LIBIPW=m +# CONFIG_LIBIPW_DEBUG is not set +CONFIG_IWLEGACY=m +CONFIG_IWL4965=m +CONFIG_IWL3945=m + +# +# iwl3945 / iwl4965 Debugging Options +# +# CONFIG_IWLEGACY_DEBUG is not set +# end of iwl3945 / iwl4965 Debugging Options + +CONFIG_IWLWIFI=m +CONFIG_IWLWIFI_LEDS=y +CONFIG_IWLDVM=m +CONFIG_IWLMVM=m +CONFIG_IWLWIFI_OPMODE_MODULAR=y + +# +# Debugging Options +# +# CONFIG_IWLWIFI_DEBUG is not set +# CONFIG_IWLWIFI_DEVICE_TRACING is not set +# end of Debugging Options + +CONFIG_WLAN_VENDOR_INTERSIL=y +CONFIG_HOSTAP=m +CONFIG_HOSTAP_FIRMWARE=y +# CONFIG_HOSTAP_FIRMWARE_NVRAM is not set +CONFIG_HOSTAP_PLX=m +CONFIG_HOSTAP_PCI=m +# CONFIG_HERMES is not set +CONFIG_P54_COMMON=m +CONFIG_P54_USB=m +CONFIG_P54_PCI=m +# CONFIG_P54_SPI is not set +CONFIG_P54_LEDS=y +CONFIG_WLAN_VENDOR_MARVELL=y +CONFIG_LIBERTAS=m +CONFIG_LIBERTAS_USB=m +CONFIG_LIBERTAS_SDIO=m +# CONFIG_LIBERTAS_SPI is not set +# CONFIG_LIBERTAS_DEBUG is not set +CONFIG_LIBERTAS_MESH=y +CONFIG_LIBERTAS_THINFIRM=m +# CONFIG_LIBERTAS_THINFIRM_DEBUG is not set +CONFIG_LIBERTAS_THINFIRM_USB=m +CONFIG_MWIFIEX=m +# CONFIG_MWIFIEX_SDIO is not set +CONFIG_MWIFIEX_PCIE=m +# CONFIG_MWIFIEX_USB is not set +CONFIG_MWL8K=m +CONFIG_WLAN_VENDOR_MEDIATEK=y +CONFIG_MT7601U=m +CONFIG_MT76_CORE=m +CONFIG_MT76_LEDS=y +CONFIG_MT76_USB=m +CONFIG_MT76x02_LIB=m +CONFIG_MT76x02_USB=m +CONFIG_MT76x0_COMMON=m +CONFIG_MT76x0U=m +# CONFIG_MT76x0E is not set +CONFIG_MT76x2_COMMON=m +CONFIG_MT76x2E=m +CONFIG_MT76x2U=m +# CONFIG_MT7603E is not set +# CONFIG_MT7615E is not set +# CONFIG_MT7663U is not set +# CONFIG_MT7663S is not set +# CONFIG_MT7915E is not set +# CONFIG_MT7921E is not set +# CONFIG_MT7921S is not set +# CONFIG_MT7921U is not set +# CONFIG_MT7996E is not set +CONFIG_WLAN_VENDOR_MICROCHIP=y +# CONFIG_WILC1000_SDIO is not set +# CONFIG_WILC1000_SPI is not set +CONFIG_WLAN_VENDOR_PURELIFI=y +# CONFIG_PLFXLC is not set +CONFIG_WLAN_VENDOR_RALINK=y +CONFIG_RT2X00=m +CONFIG_RT2400PCI=m +CONFIG_RT2500PCI=m +CONFIG_RT61PCI=m +CONFIG_RT2800PCI=m +CONFIG_RT2800PCI_RT33XX=y +CONFIG_RT2800PCI_RT35XX=y +CONFIG_RT2800PCI_RT53XX=y +CONFIG_RT2800PCI_RT3290=y +CONFIG_RT2500USB=m +CONFIG_RT73USB=m +CONFIG_RT2800USB=m +CONFIG_RT2800USB_RT33XX=y +CONFIG_RT2800USB_RT35XX=y +CONFIG_RT2800USB_RT3573=y +CONFIG_RT2800USB_RT53XX=y +CONFIG_RT2800USB_RT55XX=y +# CONFIG_RT2800USB_UNKNOWN is not set +CONFIG_RT2800_LIB=m +CONFIG_RT2800_LIB_MMIO=m +CONFIG_RT2X00_LIB_MMIO=m +CONFIG_RT2X00_LIB_PCI=m +CONFIG_RT2X00_LIB_USB=m +CONFIG_RT2X00_LIB=m +CONFIG_RT2X00_LIB_FIRMWARE=y +CONFIG_RT2X00_LIB_CRYPTO=y +CONFIG_RT2X00_LIB_LEDS=y +# CONFIG_RT2X00_DEBUG is not set +CONFIG_WLAN_VENDOR_REALTEK=y +CONFIG_RTL8180=m +CONFIG_RTL8187=m +CONFIG_RTL8187_LEDS=y +CONFIG_RTL_CARDS=m +CONFIG_RTL8192CE=m +CONFIG_RTL8192SE=m +CONFIG_RTL8192DE=m +CONFIG_RTL8723AE=m +CONFIG_RTL8723BE=m +CONFIG_RTL8188EE=m +CONFIG_RTL8192EE=m +CONFIG_RTL8821AE=m +CONFIG_RTL8192CU=m +CONFIG_RTLWIFI=m +CONFIG_RTLWIFI_PCI=m +CONFIG_RTLWIFI_USB=m +# CONFIG_RTLWIFI_DEBUG is not set +CONFIG_RTL8192C_COMMON=m +CONFIG_RTL8723_COMMON=m +CONFIG_RTLBTCOEXIST=m +CONFIG_RTL8XXXU=m +# CONFIG_RTL8XXXU_UNTESTED is not set +CONFIG_RTW88=m +CONFIG_RTW88_CORE=m +CONFIG_RTW88_PCI=m +CONFIG_RTW88_8822B=m +CONFIG_RTW88_8822C=m +CONFIG_RTW88_8822BE=m +# CONFIG_RTW88_8822BS is not set +# CONFIG_RTW88_8822BU is not set +CONFIG_RTW88_8822CE=m +# CONFIG_RTW88_8822CS is not set +# CONFIG_RTW88_8822CU is not set +# CONFIG_RTW88_8723DE is not set +# CONFIG_RTW88_8723DS is not set +# CONFIG_RTW88_8723DU is not set +# CONFIG_RTW88_8821CE is not set +# CONFIG_RTW88_8821CS is not set +# CONFIG_RTW88_8821CU is not set +# CONFIG_RTW88_DEBUG is not set +# CONFIG_RTW88_DEBUGFS is not set +# CONFIG_RTW89 is not set +CONFIG_WLAN_VENDOR_RSI=y +CONFIG_RSI_91X=m +CONFIG_RSI_DEBUGFS=y +# CONFIG_RSI_SDIO is not set +CONFIG_RSI_USB=m +CONFIG_RSI_COEX=y +CONFIG_WLAN_VENDOR_SILABS=y +# CONFIG_WFX is not set +CONFIG_WLAN_VENDOR_ST=y +# CONFIG_CW1200 is not set +CONFIG_WLAN_VENDOR_TI=y +CONFIG_WL1251=m +CONFIG_WL1251_SPI=m +CONFIG_WL1251_SDIO=m +CONFIG_WL12XX=m +CONFIG_WL18XX=m +CONFIG_WLCORE=m +CONFIG_WLCORE_SPI=m +CONFIG_WLCORE_SDIO=m +CONFIG_WLAN_VENDOR_ZYDAS=y +# CONFIG_USB_ZD1201 is not set +CONFIG_ZD1211RW=m +# CONFIG_ZD1211RW_DEBUG is not set +CONFIG_WLAN_VENDOR_QUANTENNA=y +# CONFIG_QTNFMAC_PCIE is not set +CONFIG_USB_NET_RNDIS_WLAN=m +CONFIG_MAC80211_HWSIM=m +# CONFIG_VIRT_WIFI is not set +# CONFIG_WAN is not set +CONFIG_IEEE802154_DRIVERS=m +CONFIG_IEEE802154_FAKELB=m +CONFIG_IEEE802154_AT86RF230=m +CONFIG_IEEE802154_MRF24J40=m +CONFIG_IEEE802154_CC2520=m +CONFIG_IEEE802154_ATUSB=m +CONFIG_IEEE802154_ADF7242=m +# CONFIG_IEEE802154_CA8210 is not set +# CONFIG_IEEE802154_MCR20A is not set +CONFIG_IEEE802154_HWSIM=m + +# +# Wireless WAN +# +# CONFIG_WWAN is not set +# end of Wireless WAN + +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +# CONFIG_FUJITSU_ES is not set +CONFIG_HYPERV_NET=m +# CONFIG_NETDEVSIM is not set +CONFIG_NET_FAILOVER=m +# CONFIG_ISDN is not set + +# +# Input device support +# +CONFIG_INPUT=y +CONFIG_INPUT_LEDS=y +CONFIG_INPUT_FF_MEMLESS=m +CONFIG_INPUT_SPARSEKMAP=m +CONFIG_INPUT_MATRIXKMAP=m +CONFIG_INPUT_VIVALDIFMAP=y + +# +# Userland interfaces +# +CONFIG_INPUT_MOUSEDEV=y +CONFIG_INPUT_MOUSEDEV_PSAUX=y +CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 +CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 +CONFIG_INPUT_JOYDEV=m +CONFIG_INPUT_EVDEV=m +# CONFIG_INPUT_EVBUG is not set + +# +# Input Device Drivers +# +CONFIG_INPUT_KEYBOARD=y +# CONFIG_KEYBOARD_ADC is not set +CONFIG_KEYBOARD_ADP5588=m +# CONFIG_KEYBOARD_ADP5589 is not set +CONFIG_KEYBOARD_ATKBD=y +# CONFIG_KEYBOARD_QT1050 is not set +# CONFIG_KEYBOARD_QT1070 is not set +CONFIG_KEYBOARD_QT2160=m +# CONFIG_KEYBOARD_DLINK_DIR685 is not set +# CONFIG_KEYBOARD_LKKBD is not set +CONFIG_KEYBOARD_GPIO=m +# CONFIG_KEYBOARD_GPIO_POLLED is not set +# CONFIG_KEYBOARD_TCA6416 is not set +# CONFIG_KEYBOARD_TCA8418 is not set +# CONFIG_KEYBOARD_MATRIX is not set +CONFIG_KEYBOARD_LM8323=m +# CONFIG_KEYBOARD_LM8333 is not set +CONFIG_KEYBOARD_MAX7359=m +# CONFIG_KEYBOARD_MCS is not set +# CONFIG_KEYBOARD_MPR121 is not set +# CONFIG_KEYBOARD_NEWTON is not set +CONFIG_KEYBOARD_TEGRA=m +CONFIG_KEYBOARD_OPENCORES=m +# CONFIG_KEYBOARD_PINEPHONE is not set +# CONFIG_KEYBOARD_SAMSUNG is not set +CONFIG_KEYBOARD_STOWAWAY=m +# CONFIG_KEYBOARD_SUNKBD is not set +# CONFIG_KEYBOARD_SUN4I_LRADC is not set +# CONFIG_KEYBOARD_OMAP4 is not set +# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set +# CONFIG_KEYBOARD_XTKBD is not set +CONFIG_KEYBOARD_CROS_EC=m +# CONFIG_KEYBOARD_CAP11XX is not set +# CONFIG_KEYBOARD_BCM is not set +# CONFIG_KEYBOARD_CYPRESS_SF is not set +CONFIG_INPUT_MOUSE=y +CONFIG_MOUSE_PS2=y +CONFIG_MOUSE_PS2_ALPS=y +CONFIG_MOUSE_PS2_BYD=y +CONFIG_MOUSE_PS2_LOGIPS2PP=y +CONFIG_MOUSE_PS2_SYNAPTICS=y +CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y +CONFIG_MOUSE_PS2_CYPRESS=y +CONFIG_MOUSE_PS2_TRACKPOINT=y +CONFIG_MOUSE_PS2_ELANTECH=y +CONFIG_MOUSE_PS2_ELANTECH_SMBUS=y +CONFIG_MOUSE_PS2_SENTELIC=y +# CONFIG_MOUSE_PS2_TOUCHKIT is not set +CONFIG_MOUSE_PS2_FOCALTECH=y +CONFIG_MOUSE_PS2_SMBUS=y +# CONFIG_MOUSE_SERIAL is not set +# CONFIG_MOUSE_APPLETOUCH is not set +# CONFIG_MOUSE_BCM5974 is not set +# CONFIG_MOUSE_CYAPA is not set +CONFIG_MOUSE_ELAN_I2C=m +CONFIG_MOUSE_ELAN_I2C_I2C=y +# CONFIG_MOUSE_ELAN_I2C_SMBUS is not set +# CONFIG_MOUSE_VSXXXAA is not set +# CONFIG_MOUSE_GPIO is not set +CONFIG_MOUSE_SYNAPTICS_I2C=m +CONFIG_MOUSE_SYNAPTICS_USB=m +# CONFIG_INPUT_JOYSTICK is not set +CONFIG_INPUT_TABLET=y +CONFIG_TABLET_USB_ACECAD=m +CONFIG_TABLET_USB_AIPTEK=m +CONFIG_TABLET_USB_HANWANG=m +CONFIG_TABLET_USB_KBTAB=m +CONFIG_TABLET_USB_PEGASUS=m +CONFIG_TABLET_SERIAL_WACOM4=m +CONFIG_INPUT_TOUCHSCREEN=y +CONFIG_TOUCHSCREEN_ADS7846=m +CONFIG_TOUCHSCREEN_AD7877=m +CONFIG_TOUCHSCREEN_AD7879=m +CONFIG_TOUCHSCREEN_AD7879_I2C=m +# CONFIG_TOUCHSCREEN_AD7879_SPI is not set +# CONFIG_TOUCHSCREEN_ADC is not set +# CONFIG_TOUCHSCREEN_AR1021_I2C is not set +CONFIG_TOUCHSCREEN_ATMEL_MXT=m +# CONFIG_TOUCHSCREEN_ATMEL_MXT_T37 is not set +# CONFIG_TOUCHSCREEN_AUO_PIXCIR is not set +# CONFIG_TOUCHSCREEN_BU21013 is not set +# CONFIG_TOUCHSCREEN_BU21029 is not set +# CONFIG_TOUCHSCREEN_CHIPONE_ICN8318 is not set +# CONFIG_TOUCHSCREEN_CHIPONE_ICN8505 is not set +# CONFIG_TOUCHSCREEN_CY8CTMA140 is not set +# CONFIG_TOUCHSCREEN_CY8CTMG110 is not set +# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set +# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set +# CONFIG_TOUCHSCREEN_CYTTSP5 is not set +CONFIG_TOUCHSCREEN_DYNAPRO=m +CONFIG_TOUCHSCREEN_HAMPSHIRE=m +CONFIG_TOUCHSCREEN_EETI=m +# CONFIG_TOUCHSCREEN_EGALAX is not set +# CONFIG_TOUCHSCREEN_EGALAX_SERIAL is not set +# CONFIG_TOUCHSCREEN_EXC3000 is not set +CONFIG_TOUCHSCREEN_FUJITSU=m +CONFIG_TOUCHSCREEN_GOODIX=m +# CONFIG_TOUCHSCREEN_HIDEEP is not set +# CONFIG_TOUCHSCREEN_HYCON_HY46XX is not set +# CONFIG_TOUCHSCREEN_HYNITRON_CSTXXX is not set +# CONFIG_TOUCHSCREEN_ILI210X is not set +# CONFIG_TOUCHSCREEN_ILITEK is not set +# CONFIG_TOUCHSCREEN_S6SY761 is not set +CONFIG_TOUCHSCREEN_GUNZE=m +# CONFIG_TOUCHSCREEN_EKTF2127 is not set +CONFIG_TOUCHSCREEN_ELAN=m +CONFIG_TOUCHSCREEN_ELO=m +CONFIG_TOUCHSCREEN_WACOM_W8001=m +# CONFIG_TOUCHSCREEN_WACOM_I2C is not set +# CONFIG_TOUCHSCREEN_MAX11801 is not set +CONFIG_TOUCHSCREEN_MCS5000=m +# CONFIG_TOUCHSCREEN_MMS114 is not set +# CONFIG_TOUCHSCREEN_MELFAS_MIP4 is not set +# CONFIG_TOUCHSCREEN_MSG2638 is not set +CONFIG_TOUCHSCREEN_MTOUCH=m +# CONFIG_TOUCHSCREEN_NOVATEK_NVT_TS is not set +# CONFIG_TOUCHSCREEN_IMAGIS is not set +# CONFIG_TOUCHSCREEN_IMX6UL_TSC is not set +CONFIG_TOUCHSCREEN_INEXIO=m +CONFIG_TOUCHSCREEN_PENMOUNT=m +# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set +CONFIG_TOUCHSCREEN_TOUCHRIGHT=m +CONFIG_TOUCHSCREEN_TOUCHWIN=m +# CONFIG_TOUCHSCREEN_PIXCIR is not set +# CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set +CONFIG_TOUCHSCREEN_WM97XX=m +CONFIG_TOUCHSCREEN_WM9705=y +CONFIG_TOUCHSCREEN_WM9712=y +CONFIG_TOUCHSCREEN_WM9713=y +CONFIG_TOUCHSCREEN_USB_COMPOSITE=m +CONFIG_TOUCHSCREEN_USB_EGALAX=y +CONFIG_TOUCHSCREEN_USB_PANJIT=y +CONFIG_TOUCHSCREEN_USB_3M=y +CONFIG_TOUCHSCREEN_USB_ITM=y +CONFIG_TOUCHSCREEN_USB_ETURBO=y +CONFIG_TOUCHSCREEN_USB_GUNZE=y +CONFIG_TOUCHSCREEN_USB_DMC_TSC10=y +CONFIG_TOUCHSCREEN_USB_IRTOUCH=y +CONFIG_TOUCHSCREEN_USB_IDEALTEK=y +CONFIG_TOUCHSCREEN_USB_GENERAL_TOUCH=y +CONFIG_TOUCHSCREEN_USB_GOTOP=y +CONFIG_TOUCHSCREEN_USB_JASTEC=y +CONFIG_TOUCHSCREEN_USB_ELO=y +CONFIG_TOUCHSCREEN_USB_E2I=y +CONFIG_TOUCHSCREEN_USB_ZYTRONIC=y +CONFIG_TOUCHSCREEN_USB_ETT_TC45USB=y +CONFIG_TOUCHSCREEN_USB_NEXIO=y +CONFIG_TOUCHSCREEN_USB_EASYTOUCH=y +CONFIG_TOUCHSCREEN_TOUCHIT213=m +# CONFIG_TOUCHSCREEN_TSC_SERIO is not set +# CONFIG_TOUCHSCREEN_TSC2004 is not set +# CONFIG_TOUCHSCREEN_TSC2005 is not set +CONFIG_TOUCHSCREEN_TSC2007=m +# CONFIG_TOUCHSCREEN_TSC2007_IIO is not set +# CONFIG_TOUCHSCREEN_RM_TS is not set +# CONFIG_TOUCHSCREEN_SILEAD is not set +# CONFIG_TOUCHSCREEN_SIS_I2C is not set +# CONFIG_TOUCHSCREEN_ST1232 is not set +# CONFIG_TOUCHSCREEN_STMFTS is not set +# CONFIG_TOUCHSCREEN_SUN4I is not set +CONFIG_TOUCHSCREEN_SUR40=m +# CONFIG_TOUCHSCREEN_SURFACE3_SPI is not set +# CONFIG_TOUCHSCREEN_SX8654 is not set +CONFIG_TOUCHSCREEN_TPS6507X=m +# CONFIG_TOUCHSCREEN_ZET6223 is not set +# CONFIG_TOUCHSCREEN_ZFORCE is not set +# CONFIG_TOUCHSCREEN_COLIBRI_VF50 is not set +# CONFIG_TOUCHSCREEN_ROHM_BU21023 is not set +# CONFIG_TOUCHSCREEN_IQS5XX is not set +# CONFIG_TOUCHSCREEN_IQS7211 is not set +# CONFIG_TOUCHSCREEN_ZINITIX is not set +# CONFIG_TOUCHSCREEN_HIMAX_HX83112B is not set +CONFIG_INPUT_MISC=y +# CONFIG_INPUT_AD714X is not set +# CONFIG_INPUT_ATMEL_CAPTOUCH is not set +# CONFIG_INPUT_BMA150 is not set +# CONFIG_INPUT_E3X0_BUTTON is not set +CONFIG_INPUT_PM8941_PWRKEY=m +# CONFIG_INPUT_PM8XXX_VIBRATOR is not set +# CONFIG_INPUT_MMA8450 is not set +# CONFIG_INPUT_GPIO_BEEPER is not set +# CONFIG_INPUT_GPIO_DECODER is not set +# CONFIG_INPUT_GPIO_VIBRA is not set +CONFIG_INPUT_ATI_REMOTE2=m +CONFIG_INPUT_KEYSPAN_REMOTE=m +# CONFIG_INPUT_KXTJ9 is not set +CONFIG_INPUT_POWERMATE=m +CONFIG_INPUT_YEALINK=m +CONFIG_INPUT_CM109=m +# CONFIG_INPUT_REGULATOR_HAPTIC is not set +CONFIG_INPUT_AXP20X_PEK=m +CONFIG_INPUT_UINPUT=m +# CONFIG_INPUT_PCF8574 is not set +# CONFIG_INPUT_PWM_BEEPER is not set +# CONFIG_INPUT_PWM_VIBRA is not set +# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set +# CONFIG_INPUT_DA7280_HAPTICS is not set +# CONFIG_INPUT_ADXL34X is not set +# CONFIG_INPUT_IBM_PANEL is not set +# CONFIG_INPUT_IMS_PCU is not set +# CONFIG_INPUT_IQS269A is not set +# CONFIG_INPUT_IQS626A is not set +# CONFIG_INPUT_IQS7222 is not set +# CONFIG_INPUT_CMA3000 is not set +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +# CONFIG_INPUT_SOC_BUTTON_ARRAY is not set +# CONFIG_INPUT_DRV260X_HAPTICS is not set +# CONFIG_INPUT_DRV2665_HAPTICS is not set +# CONFIG_INPUT_DRV2667_HAPTICS is not set +CONFIG_INPUT_HISI_POWERKEY=m +CONFIG_RMI4_CORE=m +# CONFIG_RMI4_I2C is not set +# CONFIG_RMI4_SPI is not set +# CONFIG_RMI4_SMB is not set +CONFIG_RMI4_F03=y +CONFIG_RMI4_F03_SERIO=m +CONFIG_RMI4_2D_SENSOR=y +CONFIG_RMI4_F11=y +CONFIG_RMI4_F12=y +CONFIG_RMI4_F30=y +CONFIG_RMI4_F34=y +# CONFIG_RMI4_F3A is not set +# CONFIG_RMI4_F54 is not set +CONFIG_RMI4_F55=y + +# +# Hardware I/O ports +# +CONFIG_SERIO=y +CONFIG_SERIO_SERPORT=y +# CONFIG_SERIO_PARKBD is not set +# CONFIG_SERIO_AMBAKMI is not set +# CONFIG_SERIO_PCIPS2 is not set +CONFIG_SERIO_LIBPS2=y +# CONFIG_SERIO_RAW is not set +CONFIG_SERIO_ALTERA_PS2=m +# CONFIG_SERIO_PS2MULT is not set +# CONFIG_SERIO_ARC_PS2 is not set +# CONFIG_SERIO_APBPS2 is not set +CONFIG_HYPERV_KEYBOARD=m +# CONFIG_SERIO_SUN4I_PS2 is not set +# CONFIG_SERIO_GPIO_PS2 is not set +# CONFIG_USERIO is not set +# CONFIG_GAMEPORT is not set +# end of Hardware I/O ports +# end of Input device support + +# +# Character devices +# +CONFIG_TTY=y +CONFIG_VT=y +CONFIG_CONSOLE_TRANSLATIONS=y +CONFIG_VT_CONSOLE=y +CONFIG_VT_CONSOLE_SLEEP=y +CONFIG_HW_CONSOLE=y +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_UNIX98_PTYS=y +# CONFIG_LEGACY_PTYS is not set +CONFIG_LEGACY_TIOCSTI=y +CONFIG_LDISC_AUTOLOAD=y + +# +# Serial drivers +# +CONFIG_SERIAL_EARLYCON=y +CONFIG_SERIAL_8250=y +# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set +CONFIG_SERIAL_8250_PNP=y +CONFIG_SERIAL_8250_16550A_VARIANTS=y +# CONFIG_SERIAL_8250_FINTEK is not set +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_DMA=y +CONFIG_SERIAL_8250_PCILIB=y +CONFIG_SERIAL_8250_PCI=y +CONFIG_SERIAL_8250_EXAR=m +CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 +CONFIG_SERIAL_8250_EXTENDED=y +# CONFIG_SERIAL_8250_MANY_PORTS is not set +# CONFIG_SERIAL_8250_PCI1XXXX is not set +CONFIG_SERIAL_8250_SHARE_IRQ=y +# CONFIG_SERIAL_8250_DETECT_IRQ is not set +# CONFIG_SERIAL_8250_RSA is not set +CONFIG_SERIAL_8250_DWLIB=y +CONFIG_SERIAL_8250_FSL=y +CONFIG_SERIAL_8250_DW=y +# CONFIG_SERIAL_8250_RT288X is not set +CONFIG_SERIAL_8250_PERICOM=y +CONFIG_SERIAL_8250_TEGRA=y +CONFIG_SERIAL_OF_PLATFORM=y + +# +# Non-8250 serial port support +# +CONFIG_SERIAL_AMBA_PL010=y +CONFIG_SERIAL_AMBA_PL010_CONSOLE=y +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +# CONFIG_SERIAL_EARLYCON_SEMIHOST is not set +CONFIG_SERIAL_MESON=y +CONFIG_SERIAL_MESON_CONSOLE=y +CONFIG_SERIAL_TEGRA=y +# CONFIG_SERIAL_MAX3100 is not set +# CONFIG_SERIAL_MAX310X is not set +# CONFIG_SERIAL_UARTLITE is not set +CONFIG_SERIAL_CORE=y +CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_JSM is not set +CONFIG_SERIAL_MSM=y +CONFIG_SERIAL_MSM_CONSOLE=y +# CONFIG_SERIAL_SIFIVE is not set +# CONFIG_SERIAL_SCCNXP is not set +# CONFIG_SERIAL_SC16IS7XX is not set +# CONFIG_SERIAL_ALTERA_JTAGUART is not set +# CONFIG_SERIAL_ALTERA_UART is not set +CONFIG_SERIAL_XILINX_PS_UART=y +CONFIG_SERIAL_XILINX_PS_UART_CONSOLE=y +# CONFIG_SERIAL_ARC is not set +CONFIG_SERIAL_RP2=m +CONFIG_SERIAL_RP2_NR_UARTS=32 +# CONFIG_SERIAL_FSL_LPUART is not set +# CONFIG_SERIAL_FSL_LINFLEXUART is not set +# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set +# CONFIG_SERIAL_SPRD is not set +CONFIG_SERIAL_MVEBU_UART=y +CONFIG_SERIAL_MVEBU_CONSOLE=y +# end of Serial drivers + +CONFIG_SERIAL_MCTRL_GPIO=y +# CONFIG_SERIAL_NONSTANDARD is not set +CONFIG_N_GSM=m +CONFIG_NOZOMI=m +# CONFIG_NULL_TTY is not set +CONFIG_HVC_DRIVER=y +CONFIG_HVC_IRQ=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +# CONFIG_HVC_DCC is not set +# CONFIG_RPMSG_TTY is not set +CONFIG_SERIAL_DEV_BUS=m +CONFIG_TTY_PRINTK=m +CONFIG_TTY_PRINTK_LEVEL=6 +# CONFIG_PRINTER is not set +# CONFIG_PPDEV is not set +CONFIG_VIRTIO_CONSOLE=m +CONFIG_IPMI_HANDLER=m +CONFIG_IPMI_DMI_DECODE=y +CONFIG_IPMI_PLAT_DATA=y +# CONFIG_IPMI_PANIC_EVENT is not set +CONFIG_IPMI_DEVICE_INTERFACE=m +CONFIG_IPMI_SI=m +CONFIG_IPMI_SSIF=m +# CONFIG_IPMI_IPMB is not set +CONFIG_IPMI_WATCHDOG=m +CONFIG_IPMI_POWEROFF=m +# CONFIG_SSIF_IPMI_BMC is not set +# CONFIG_IPMB_DEVICE_INTERFACE is not set +CONFIG_HW_RANDOM=m +# CONFIG_HW_RANDOM_TIMERIOMEM is not set +# CONFIG_HW_RANDOM_BA431 is not set +CONFIG_HW_RANDOM_OMAP=m +CONFIG_HW_RANDOM_VIRTIO=m +CONFIG_HW_RANDOM_HISI=m +CONFIG_HW_RANDOM_HISTB=m +CONFIG_HW_RANDOM_XGENE=m +CONFIG_HW_RANDOM_MESON=m +CONFIG_HW_RANDOM_CAVIUM=m +CONFIG_HW_RANDOM_OPTEE=m +# CONFIG_HW_RANDOM_CCTRNG is not set +# CONFIG_HW_RANDOM_XIPHERA is not set +CONFIG_HW_RANDOM_ARM_SMCCC_TRNG=m +CONFIG_HW_RANDOM_CN10K=m +# CONFIG_APPLICOM is not set +CONFIG_DEVMEM=y +CONFIG_DEVPORT=y +CONFIG_TCG_TPM=m +CONFIG_HW_RANDOM_TPM=y +CONFIG_TCG_TIS_CORE=m +CONFIG_TCG_TIS=m +CONFIG_TCG_TIS_SPI=y +# CONFIG_TCG_TIS_SPI_CR50 is not set +# CONFIG_TCG_TIS_I2C is not set +# CONFIG_TCG_TIS_SYNQUACER is not set +# CONFIG_TCG_TIS_I2C_CR50 is not set +# CONFIG_TCG_TIS_I2C_ATMEL is not set +CONFIG_TCG_TIS_I2C_INFINEON=m +# CONFIG_TCG_TIS_I2C_NUVOTON is not set +CONFIG_TCG_ATMEL=m +# CONFIG_TCG_INFINEON is not set +# CONFIG_TCG_XEN is not set +CONFIG_TCG_CRB=m +CONFIG_TCG_VTPM_PROXY=m +# CONFIG_TCG_FTPM_TEE is not set +# CONFIG_TCG_TIS_ST33ZP24_I2C is not set +# CONFIG_TCG_TIS_ST33ZP24_SPI is not set +# CONFIG_XILLYBUS is not set +# CONFIG_XILLYUSB is not set +# end of Character devices + +# +# I2C support +# +CONFIG_I2C=y +CONFIG_ACPI_I2C_OPREGION=y +CONFIG_I2C_BOARDINFO=y +CONFIG_I2C_COMPAT=y +CONFIG_I2C_CHARDEV=m +CONFIG_I2C_MUX=m + +# +# Multiplexer I2C Chip support +# +# CONFIG_I2C_ARB_GPIO_CHALLENGE is not set +# CONFIG_I2C_MUX_GPIO is not set +# CONFIG_I2C_MUX_GPMUX is not set +# CONFIG_I2C_MUX_LTC4306 is not set +# CONFIG_I2C_MUX_PCA9541 is not set +# CONFIG_I2C_MUX_PCA954x is not set +# CONFIG_I2C_MUX_PINCTRL is not set +# CONFIG_I2C_MUX_REG is not set +# CONFIG_I2C_DEMUX_PINCTRL is not set +# CONFIG_I2C_MUX_MLXCPLD is not set +# end of Multiplexer I2C Chip support + +CONFIG_I2C_HELPER_AUTO=y +CONFIG_I2C_SMBUS=m +CONFIG_I2C_ALGOBIT=m +CONFIG_I2C_ALGOPCA=m + +# +# I2C Hardware Bus support +# + +# +# PC SMBus host controller drivers +# +# CONFIG_I2C_ALI1535 is not set +# CONFIG_I2C_ALI1563 is not set +# CONFIG_I2C_ALI15X3 is not set +# CONFIG_I2C_AMD756 is not set +# CONFIG_I2C_AMD8111 is not set +# CONFIG_I2C_AMD_MP2 is not set +# CONFIG_I2C_HIX5HD2 is not set +# CONFIG_I2C_I801 is not set +CONFIG_I2C_ISCH=m +# CONFIG_I2C_PIIX4 is not set +# CONFIG_I2C_NFORCE2 is not set +# CONFIG_I2C_NVIDIA_GPU is not set +# CONFIG_I2C_SIS5595 is not set +# CONFIG_I2C_SIS630 is not set +# CONFIG_I2C_SIS96X is not set +# CONFIG_I2C_VIA is not set +# CONFIG_I2C_VIAPRO is not set + +# +# ACPI drivers +# +# CONFIG_I2C_SCMI is not set + +# +# I2C system bus drivers (mostly embedded / system-on-chip) +# +# CONFIG_I2C_CADENCE is not set +# CONFIG_I2C_CBUS_GPIO is not set +CONFIG_I2C_DESIGNWARE_CORE=m +# CONFIG_I2C_DESIGNWARE_SLAVE is not set +CONFIG_I2C_DESIGNWARE_PLATFORM=m +# CONFIG_I2C_DESIGNWARE_PCI is not set +# CONFIG_I2C_EMEV2 is not set +CONFIG_I2C_GPIO=m +# CONFIG_I2C_GPIO_FAULT_INJECTOR is not set +# CONFIG_I2C_HISI is not set +CONFIG_I2C_MESON=m +CONFIG_I2C_MV64XXX=m +# CONFIG_I2C_NOMADIK is not set +CONFIG_I2C_OCORES=m +CONFIG_I2C_PCA_PLATFORM=m +CONFIG_I2C_PXA=m +# CONFIG_I2C_PXA_SLAVE is not set +# CONFIG_I2C_QCOM_CCI is not set +CONFIG_I2C_QUP=m +CONFIG_I2C_RK3X=m +CONFIG_I2C_SIMTEC=m +# CONFIG_I2C_SYNQUACER is not set +CONFIG_I2C_TEGRA=m +# CONFIG_I2C_VERSATILE is not set +CONFIG_I2C_THUNDERX=m +# CONFIG_I2C_XILINX is not set +CONFIG_I2C_XLP9XX=m + +# +# External I2C/SMBus adapter drivers +# +CONFIG_I2C_DIOLAN_U2C=m +# CONFIG_I2C_CP2615 is not set +# CONFIG_I2C_PARPORT is not set +# CONFIG_I2C_PCI1XXXX is not set +CONFIG_I2C_ROBOTFUZZ_OSIF=m +CONFIG_I2C_TAOS_EVM=m +CONFIG_I2C_TINY_USB=m +CONFIG_I2C_VIPERBOARD=m + +# +# Other I2C/SMBus bus drivers +# +# CONFIG_I2C_MLXCPLD is not set +CONFIG_I2C_CROS_EC_TUNNEL=m +CONFIG_I2C_XGENE_SLIMPRO=m +# CONFIG_I2C_VIRTIO is not set +# end of I2C Hardware Bus support + +# CONFIG_I2C_STUB is not set +CONFIG_I2C_SLAVE=y +# CONFIG_I2C_SLAVE_EEPROM is not set +# CONFIG_I2C_SLAVE_TESTUNIT is not set +# CONFIG_I2C_DEBUG_CORE is not set +# CONFIG_I2C_DEBUG_ALGO is not set +# CONFIG_I2C_DEBUG_BUS is not set +# end of I2C support + +# CONFIG_I3C is not set +CONFIG_SPI=y +# CONFIG_SPI_DEBUG is not set +CONFIG_SPI_MASTER=y +CONFIG_SPI_MEM=y + +# +# SPI Master Controller Drivers +# +# CONFIG_SPI_ALTERA is not set +# CONFIG_SPI_AMLOGIC_SPIFC_A1 is not set +CONFIG_SPI_ARMADA_3700=m +# CONFIG_SPI_AXI_SPI_ENGINE is not set +CONFIG_SPI_BITBANG=m +CONFIG_SPI_BUTTERFLY=m +# CONFIG_SPI_CADENCE is not set +# CONFIG_SPI_CADENCE_QUADSPI is not set +# CONFIG_SPI_CADENCE_XSPI is not set +# CONFIG_SPI_DESIGNWARE is not set +# CONFIG_SPI_HISI_KUNPENG is not set +# CONFIG_SPI_HISI_SFC_V3XX is not set +# CONFIG_SPI_GPIO is not set +CONFIG_SPI_LM70_LLP=m +# CONFIG_SPI_FSL_SPI is not set +# CONFIG_SPI_MESON_SPICC is not set +CONFIG_SPI_MESON_SPIFC=m +# CONFIG_SPI_MICROCHIP_CORE is not set +# CONFIG_SPI_MICROCHIP_CORE_QSPI is not set +# CONFIG_SPI_OC_TINY is not set +# CONFIG_SPI_ORION is not set +# CONFIG_SPI_PCI1XXXX is not set +# CONFIG_SPI_PL022 is not set +# CONFIG_SPI_PXA2XX is not set +CONFIG_SPI_ROCKCHIP=m +# CONFIG_SPI_ROCKCHIP_SFC is not set +# CONFIG_SPI_QCOM_QSPI is not set +CONFIG_SPI_QUP=m +# CONFIG_SPI_SC18IS602 is not set +# CONFIG_SPI_SIFIVE is not set +# CONFIG_SPI_SN_F_OSPI is not set +# CONFIG_SPI_SUN4I is not set +# CONFIG_SPI_SUN6I is not set +# CONFIG_SPI_SYNQUACER is not set +# CONFIG_SPI_MXIC is not set +CONFIG_SPI_TEGRA210_QUAD=y +CONFIG_SPI_TEGRA114=m +CONFIG_SPI_TEGRA20_SFLASH=m +CONFIG_SPI_TEGRA20_SLINK=m +CONFIG_SPI_THUNDERX=m +# CONFIG_SPI_XCOMM is not set +# CONFIG_SPI_XILINX is not set +CONFIG_SPI_XLP=m +# CONFIG_SPI_ZYNQMP_GQSPI is not set +# CONFIG_SPI_AMD is not set + +# +# SPI Multiplexer support +# +# CONFIG_SPI_MUX is not set + +# +# SPI Protocol Masters +# +CONFIG_SPI_SPIDEV=y +# CONFIG_SPI_LOOPBACK_TEST is not set +# CONFIG_SPI_TLE62X0 is not set +# CONFIG_SPI_SLAVE is not set +CONFIG_SPI_DYNAMIC=y +CONFIG_SPMI=y +# CONFIG_SPMI_HISI3670 is not set +CONFIG_SPMI_MSM_PMIC_ARB=y +# CONFIG_HSI is not set +CONFIG_PPS=m +# CONFIG_PPS_DEBUG is not set + +# +# PPS clients support +# +# CONFIG_PPS_CLIENT_KTIMER is not set +CONFIG_PPS_CLIENT_LDISC=m +CONFIG_PPS_CLIENT_PARPORT=m +# CONFIG_PPS_CLIENT_GPIO is not set + +# +# PPS generators support +# + +# +# PTP clock support +# +CONFIG_PTP_1588_CLOCK=m +CONFIG_PTP_1588_CLOCK_OPTIONAL=m + +# +# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks. +# +CONFIG_PTP_1588_CLOCK_KVM=m +# CONFIG_PTP_1588_CLOCK_IDT82P33 is not set +# CONFIG_PTP_1588_CLOCK_IDTCM is not set +# CONFIG_PTP_1588_CLOCK_MOCK is not set +# CONFIG_PTP_1588_CLOCK_OCP is not set +# end of PTP clock support + +CONFIG_PINCTRL=y +CONFIG_GENERIC_PINCTRL_GROUPS=y +CONFIG_PINMUX=y +CONFIG_GENERIC_PINMUX_FUNCTIONS=y +CONFIG_PINCONF=y +CONFIG_GENERIC_PINCONF=y +# CONFIG_DEBUG_PINCTRL is not set +CONFIG_PINCTRL_AMD=y +CONFIG_PINCTRL_AXP209=m +# CONFIG_PINCTRL_CY8C95X0 is not set +CONFIG_PINCTRL_MAX77620=y +# CONFIG_PINCTRL_MCP23S08 is not set +# CONFIG_PINCTRL_MICROCHIP_SGPIO is not set +# CONFIG_PINCTRL_OCELOT is not set +CONFIG_PINCTRL_ROCKCHIP=y +CONFIG_PINCTRL_SINGLE=y +# CONFIG_PINCTRL_STMFX is not set +# CONFIG_PINCTRL_SX150X is not set +CONFIG_PINCTRL_ZYNQMP=y +CONFIG_PINCTRL_MESON=y +CONFIG_PINCTRL_MESON_GXBB=y +CONFIG_PINCTRL_MESON_GXL=y +CONFIG_PINCTRL_MESON8_PMX=y +CONFIG_PINCTRL_MESON_AXG=y +CONFIG_PINCTRL_MESON_AXG_PMX=y +CONFIG_PINCTRL_MESON_G12A=y +CONFIG_PINCTRL_MESON_A1=y +CONFIG_PINCTRL_MESON_S4=y +CONFIG_PINCTRL_AMLOGIC_C3=y +CONFIG_PINCTRL_MVEBU=y +CONFIG_PINCTRL_ARMADA_AP806=y +CONFIG_PINCTRL_ARMADA_CP110=y +CONFIG_PINCTRL_AC5=y +CONFIG_PINCTRL_ARMADA_37XX=y +CONFIG_PINCTRL_MSM=y +# CONFIG_PINCTRL_IPQ5018 is not set +# CONFIG_PINCTRL_IPQ5332 is not set +# CONFIG_PINCTRL_IPQ8074 is not set +# CONFIG_PINCTRL_IPQ6018 is not set +# CONFIG_PINCTRL_IPQ9574 is not set +# CONFIG_PINCTRL_MDM9607 is not set +CONFIG_PINCTRL_MSM8916=y +# CONFIG_PINCTRL_MSM8953 is not set +# CONFIG_PINCTRL_MSM8976 is not set +# CONFIG_PINCTRL_MSM8994 is not set +CONFIG_PINCTRL_MSM8996=y +# CONFIG_PINCTRL_MSM8998 is not set +# CONFIG_PINCTRL_QCM2290 is not set +# CONFIG_PINCTRL_QCS404 is not set +# CONFIG_PINCTRL_QDF2XXX is not set +# CONFIG_PINCTRL_QDU1000 is not set +# CONFIG_PINCTRL_SA8775P is not set +# CONFIG_PINCTRL_SC7180 is not set +# CONFIG_PINCTRL_SC7280 is not set +# CONFIG_PINCTRL_SC8180X is not set +# CONFIG_PINCTRL_SC8280XP is not set +# CONFIG_PINCTRL_SDM660 is not set +# CONFIG_PINCTRL_SDM670 is not set +# CONFIG_PINCTRL_SDM845 is not set +# CONFIG_PINCTRL_SDX75 is not set +# CONFIG_PINCTRL_SM6115 is not set +# CONFIG_PINCTRL_SM6125 is not set +# CONFIG_PINCTRL_SM6350 is not set +# CONFIG_PINCTRL_SM6375 is not set +# CONFIG_PINCTRL_SM7150 is not set +# CONFIG_PINCTRL_SM8150 is not set +# CONFIG_PINCTRL_SM8250 is not set +# CONFIG_PINCTRL_SM8350 is not set +# CONFIG_PINCTRL_SM8450 is not set +# CONFIG_PINCTRL_SM8550 is not set +CONFIG_PINCTRL_QCOM_SPMI_PMIC=y +CONFIG_PINCTRL_QCOM_SSBI_PMIC=y +# CONFIG_PINCTRL_LPASS_LPI is not set + +# +# Renesas pinctrl drivers +# +# end of Renesas pinctrl drivers + +CONFIG_PINCTRL_SUNXI=y +# CONFIG_PINCTRL_SUN4I_A10 is not set +# CONFIG_PINCTRL_SUN5I is not set +# CONFIG_PINCTRL_SUN6I_A31 is not set +# CONFIG_PINCTRL_SUN6I_A31_R is not set +# CONFIG_PINCTRL_SUN8I_A23 is not set +# CONFIG_PINCTRL_SUN8I_A33 is not set +# CONFIG_PINCTRL_SUN8I_A83T is not set +# CONFIG_PINCTRL_SUN8I_A83T_R is not set +# CONFIG_PINCTRL_SUN8I_A23_R is not set +# CONFIG_PINCTRL_SUN8I_H3 is not set +CONFIG_PINCTRL_SUN8I_H3_R=y +# CONFIG_PINCTRL_SUN8I_V3S is not set +# CONFIG_PINCTRL_SUN9I_A80 is not set +# CONFIG_PINCTRL_SUN9I_A80_R is not set +# CONFIG_PINCTRL_SUN20I_D1 is not set +CONFIG_PINCTRL_SUN50I_A64=y +CONFIG_PINCTRL_SUN50I_A64_R=y +CONFIG_PINCTRL_SUN50I_A100=y +CONFIG_PINCTRL_SUN50I_A100_R=y +CONFIG_PINCTRL_SUN50I_H5=y +CONFIG_PINCTRL_SUN50I_H6=y +CONFIG_PINCTRL_SUN50I_H6_R=y +CONFIG_PINCTRL_SUN50I_H616=y +CONFIG_PINCTRL_SUN50I_H616_R=y +CONFIG_PINCTRL_TEGRA=y +CONFIG_PINCTRL_TEGRA124=y +CONFIG_PINCTRL_TEGRA210=y +CONFIG_PINCTRL_TEGRA_XUSB=y +CONFIG_GPIOLIB=y +CONFIG_GPIOLIB_FASTPATH_LIMIT=512 +CONFIG_OF_GPIO=y +CONFIG_GPIO_ACPI=y +CONFIG_GPIOLIB_IRQCHIP=y +# CONFIG_DEBUG_GPIO is not set +CONFIG_GPIO_SYSFS=y +CONFIG_GPIO_CDEV=y +CONFIG_GPIO_CDEV_V1=y +CONFIG_GPIO_GENERIC=y +CONFIG_GPIO_REGMAP=m +CONFIG_GPIO_IDIO_16=m + +# +# Memory mapped GPIO drivers +# +# CONFIG_GPIO_74XX_MMIO is not set +# CONFIG_GPIO_ALTERA is not set +# CONFIG_GPIO_AMDPT is not set +# CONFIG_GPIO_CADENCE is not set +# CONFIG_GPIO_DWAPB is not set +CONFIG_GPIO_EXAR=m +# CONFIG_GPIO_FTGPIO010 is not set +CONFIG_GPIO_GENERIC_PLATFORM=y +# CONFIG_GPIO_GRGPIO is not set +# CONFIG_GPIO_HISI is not set +# CONFIG_GPIO_HLWD is not set +# CONFIG_GPIO_LOGICVC is not set +CONFIG_GPIO_MB86S7X=m +CONFIG_GPIO_MVEBU=y +CONFIG_GPIO_PL061=y +CONFIG_GPIO_ROCKCHIP=y +# CONFIG_GPIO_SIFIVE is not set +# CONFIG_GPIO_SYSCON is not set +CONFIG_GPIO_TEGRA=y +CONFIG_GPIO_TEGRA186=y +# CONFIG_GPIO_THUNDERX is not set +CONFIG_GPIO_XGENE=y +CONFIG_GPIO_XGENE_SB=m +# CONFIG_GPIO_XILINX is not set +CONFIG_GPIO_XLP=y +CONFIG_GPIO_ZYNQ=m +CONFIG_GPIO_ZYNQMP_MODEPIN=y +# CONFIG_GPIO_AMD_FCH is not set +# end of Memory mapped GPIO drivers + +# +# I2C GPIO expanders +# +# CONFIG_GPIO_ADNP is not set +# CONFIG_GPIO_FXL6408 is not set +# CONFIG_GPIO_DS4520 is not set +# CONFIG_GPIO_GW_PLD is not set +# CONFIG_GPIO_MAX7300 is not set +# CONFIG_GPIO_MAX732X is not set +CONFIG_GPIO_PCA953X=y +CONFIG_GPIO_PCA953X_IRQ=y +# CONFIG_GPIO_PCA9570 is not set +# CONFIG_GPIO_PCF857X is not set +# CONFIG_GPIO_TPIC2810 is not set +# end of I2C GPIO expanders + +# +# MFD GPIO expanders +# +CONFIG_GPIO_MAX77620=y +# end of MFD GPIO expanders + +# +# PCI GPIO expanders +# +CONFIG_GPIO_PCI_IDIO_16=m +CONFIG_GPIO_PCIE_IDIO_24=m +# CONFIG_GPIO_RDC321X is not set +# end of PCI GPIO expanders + +# +# SPI GPIO expanders +# +# CONFIG_GPIO_74X164 is not set +# CONFIG_GPIO_MAX3191X is not set +# CONFIG_GPIO_MAX7301 is not set +# CONFIG_GPIO_MC33880 is not set +# CONFIG_GPIO_PISOSR is not set +# CONFIG_GPIO_XRA1403 is not set +# end of SPI GPIO expanders + +# +# USB GPIO expanders +# +CONFIG_GPIO_VIPERBOARD=m +# end of USB GPIO expanders + +# +# Virtual GPIO drivers +# +# CONFIG_GPIO_AGGREGATOR is not set +# CONFIG_GPIO_LATCH is not set +# CONFIG_GPIO_MOCKUP is not set +# CONFIG_GPIO_VIRTIO is not set +# CONFIG_GPIO_SIM is not set +# end of Virtual GPIO drivers + +CONFIG_W1=m +CONFIG_W1_CON=y + +# +# 1-wire Bus Masters +# +# CONFIG_W1_MASTER_MATROX is not set +CONFIG_W1_MASTER_DS2490=m +CONFIG_W1_MASTER_DS2482=m +CONFIG_W1_MASTER_GPIO=m +# CONFIG_W1_MASTER_SGI is not set +# end of 1-wire Bus Masters + +# +# 1-wire Slaves +# +CONFIG_W1_SLAVE_THERM=m +CONFIG_W1_SLAVE_SMEM=m +CONFIG_W1_SLAVE_DS2405=m +CONFIG_W1_SLAVE_DS2408=m +CONFIG_W1_SLAVE_DS2408_READBACK=y +CONFIG_W1_SLAVE_DS2413=m +CONFIG_W1_SLAVE_DS2406=m +CONFIG_W1_SLAVE_DS2423=m +CONFIG_W1_SLAVE_DS2805=m +# CONFIG_W1_SLAVE_DS2430 is not set +CONFIG_W1_SLAVE_DS2431=m +CONFIG_W1_SLAVE_DS2433=m +# CONFIG_W1_SLAVE_DS2433_CRC is not set +CONFIG_W1_SLAVE_DS2438=m +# CONFIG_W1_SLAVE_DS250X is not set +CONFIG_W1_SLAVE_DS2780=m +CONFIG_W1_SLAVE_DS2781=m +CONFIG_W1_SLAVE_DS28E04=m +CONFIG_W1_SLAVE_DS28E17=m +# end of 1-wire Slaves + +CONFIG_POWER_RESET=y +# CONFIG_POWER_RESET_BRCMSTB is not set +# CONFIG_POWER_RESET_GPIO is not set +# CONFIG_POWER_RESET_GPIO_RESTART is not set +CONFIG_POWER_RESET_HISI=y +# CONFIG_POWER_RESET_LINKSTATION is not set +CONFIG_POWER_RESET_MSM=y +# CONFIG_POWER_RESET_QCOM_PON is not set +# CONFIG_POWER_RESET_ODROID_GO_ULTRA_POWEROFF is not set +# CONFIG_POWER_RESET_LTC2952 is not set +# CONFIG_POWER_RESET_REGULATOR is not set +# CONFIG_POWER_RESET_RESTART is not set +CONFIG_POWER_RESET_VEXPRESS=y +CONFIG_POWER_RESET_XGENE=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_POWER_RESET_SYSCON_POWEROFF=y +# CONFIG_SYSCON_REBOOT_MODE is not set +# CONFIG_NVMEM_REBOOT_MODE is not set +CONFIG_POWER_SUPPLY=y +# CONFIG_POWER_SUPPLY_DEBUG is not set +CONFIG_POWER_SUPPLY_HWMON=y +# CONFIG_GENERIC_ADC_BATTERY is not set +# CONFIG_IP5XXX_POWER is not set +# CONFIG_TEST_POWER is not set +# CONFIG_CHARGER_ADP5061 is not set +# CONFIG_BATTERY_CW2015 is not set +CONFIG_BATTERY_DS2760=m +# CONFIG_BATTERY_DS2780 is not set +# CONFIG_BATTERY_DS2781 is not set +# CONFIG_BATTERY_DS2782 is not set +# CONFIG_BATTERY_SAMSUNG_SDI is not set +CONFIG_BATTERY_SBS=m +# CONFIG_CHARGER_SBS is not set +# CONFIG_MANAGER_SBS is not set +CONFIG_BATTERY_BQ27XXX=m +# CONFIG_BATTERY_BQ27XXX_I2C is not set +CONFIG_BATTERY_BQ27XXX_HDQ=m +CONFIG_CHARGER_AXP20X=m +CONFIG_BATTERY_AXP20X=m +CONFIG_AXP20X_POWER=m +# CONFIG_BATTERY_MAX17040 is not set +# CONFIG_BATTERY_MAX17042 is not set +# CONFIG_BATTERY_MAX1721X is not set +# CONFIG_CHARGER_ISP1704 is not set +# CONFIG_CHARGER_MAX8903 is not set +# CONFIG_CHARGER_LP8727 is not set +# CONFIG_CHARGER_GPIO is not set +# CONFIG_CHARGER_MANAGER is not set +# CONFIG_CHARGER_LT3651 is not set +# CONFIG_CHARGER_LTC4162L is not set +# CONFIG_CHARGER_DETECTOR_MAX14656 is not set +# CONFIG_CHARGER_MAX77976 is not set +CONFIG_CHARGER_QCOM_SMBB=m +# CONFIG_CHARGER_BQ2415X is not set +# CONFIG_CHARGER_BQ24190 is not set +# CONFIG_CHARGER_BQ24257 is not set +# CONFIG_CHARGER_BQ24735 is not set +# CONFIG_CHARGER_BQ2515X is not set +# CONFIG_CHARGER_BQ25890 is not set +# CONFIG_CHARGER_BQ25980 is not set +# CONFIG_CHARGER_BQ256XX is not set +# CONFIG_CHARGER_SMB347 is not set +# CONFIG_BATTERY_GAUGE_LTC2941 is not set +# CONFIG_BATTERY_GOLDFISH is not set +# CONFIG_BATTERY_RT5033 is not set +# CONFIG_CHARGER_RT9455 is not set +# CONFIG_CHARGER_RT9467 is not set +# CONFIG_CHARGER_RT9471 is not set +CONFIG_CHARGER_CROS_USBPD=m +CONFIG_CHARGER_CROS_PCHG=y +# CONFIG_CHARGER_UCS1002 is not set +# CONFIG_CHARGER_BD99954 is not set +# CONFIG_BATTERY_UG3105 is not set +# CONFIG_CHARGER_QCOM_SMB2 is not set +CONFIG_HWMON=y +CONFIG_HWMON_VID=m +# CONFIG_HWMON_DEBUG_CHIP is not set + +# +# Native drivers +# +# CONFIG_SENSORS_AD7314 is not set +CONFIG_SENSORS_AD7414=m +CONFIG_SENSORS_AD7418=m +# CONFIG_SENSORS_ADM1021 is not set +# CONFIG_SENSORS_ADM1025 is not set +# CONFIG_SENSORS_ADM1026 is not set +CONFIG_SENSORS_ADM1029=m +# CONFIG_SENSORS_ADM1031 is not set +# CONFIG_SENSORS_ADM1177 is not set +CONFIG_SENSORS_ADM9240=m +# CONFIG_SENSORS_ADT7310 is not set +# CONFIG_SENSORS_ADT7410 is not set +CONFIG_SENSORS_ADT7411=m +CONFIG_SENSORS_ADT7462=m +CONFIG_SENSORS_ADT7470=m +CONFIG_SENSORS_ADT7475=m +# CONFIG_SENSORS_AHT10 is not set +# CONFIG_SENSORS_AQUACOMPUTER_D5NEXT is not set +# CONFIG_SENSORS_AS370 is not set +CONFIG_SENSORS_ASC7621=m +# CONFIG_SENSORS_AXI_FAN_CONTROL is not set +CONFIG_SENSORS_ATXP1=m +# CONFIG_SENSORS_CORSAIR_CPRO is not set +# CONFIG_SENSORS_CORSAIR_PSU is not set +# CONFIG_SENSORS_DRIVETEMP is not set +CONFIG_SENSORS_DS620=m +# CONFIG_SENSORS_DS1621 is not set +CONFIG_SENSORS_I5K_AMB=m +# CONFIG_SENSORS_F71805F is not set +CONFIG_SENSORS_F71882FG=m +CONFIG_SENSORS_F75375S=m +CONFIG_SENSORS_FTSTEUTATES=m +# CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set +CONFIG_SENSORS_G760A=m +# CONFIG_SENSORS_G762 is not set +# CONFIG_SENSORS_GPIO_FAN is not set +# CONFIG_SENSORS_HIH6130 is not set +# CONFIG_SENSORS_HS3001 is not set +CONFIG_SENSORS_IBMAEM=m +CONFIG_SENSORS_IBMPEX=m +# CONFIG_SENSORS_IIO_HWMON is not set +# CONFIG_SENSORS_IT87 is not set +CONFIG_SENSORS_JC42=m +# CONFIG_SENSORS_POWR1220 is not set +CONFIG_SENSORS_LINEAGE=m +# CONFIG_SENSORS_LTC2945 is not set +# CONFIG_SENSORS_LTC2947_I2C is not set +# CONFIG_SENSORS_LTC2947_SPI is not set +# CONFIG_SENSORS_LTC2990 is not set +# CONFIG_SENSORS_LTC2992 is not set +CONFIG_SENSORS_LTC4151=m +CONFIG_SENSORS_LTC4215=m +# CONFIG_SENSORS_LTC4222 is not set +CONFIG_SENSORS_LTC4245=m +# CONFIG_SENSORS_LTC4260 is not set +CONFIG_SENSORS_LTC4261=m +CONFIG_SENSORS_MAX1111=m +# CONFIG_SENSORS_MAX127 is not set +CONFIG_SENSORS_MAX16065=m +# CONFIG_SENSORS_MAX1619 is not set +CONFIG_SENSORS_MAX1668=m +# CONFIG_SENSORS_MAX197 is not set +# CONFIG_SENSORS_MAX31722 is not set +# CONFIG_SENSORS_MAX31730 is not set +# CONFIG_SENSORS_MAX31760 is not set +# CONFIG_MAX31827 is not set +# CONFIG_SENSORS_MAX6620 is not set +# CONFIG_SENSORS_MAX6621 is not set +CONFIG_SENSORS_MAX6639=m +CONFIG_SENSORS_MAX6642=m +CONFIG_SENSORS_MAX6650=m +# CONFIG_SENSORS_MAX6697 is not set +# CONFIG_SENSORS_MAX31790 is not set +# CONFIG_SENSORS_MC34VR500 is not set +# CONFIG_SENSORS_MCP3021 is not set +# CONFIG_SENSORS_TC654 is not set +# CONFIG_SENSORS_TPS23861 is not set +# CONFIG_SENSORS_MR75203 is not set +CONFIG_SENSORS_ADCXX=m +# CONFIG_SENSORS_LM63 is not set +CONFIG_SENSORS_LM70=m +CONFIG_SENSORS_LM73=m +# CONFIG_SENSORS_LM75 is not set +# CONFIG_SENSORS_LM77 is not set +# CONFIG_SENSORS_LM78 is not set +# CONFIG_SENSORS_LM80 is not set +# CONFIG_SENSORS_LM83 is not set +# CONFIG_SENSORS_LM85 is not set +# CONFIG_SENSORS_LM87 is not set +# CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set +CONFIG_SENSORS_LM93=m +# CONFIG_SENSORS_LM95234 is not set +CONFIG_SENSORS_LM95241=m +CONFIG_SENSORS_LM95245=m +# CONFIG_SENSORS_PC87360 is not set +CONFIG_SENSORS_PC87427=m +CONFIG_SENSORS_NTC_THERMISTOR=m +CONFIG_SENSORS_NCT6683=m +CONFIG_SENSORS_NCT6775_CORE=m +CONFIG_SENSORS_NCT6775=m +# CONFIG_SENSORS_NCT6775_I2C is not set +CONFIG_SENSORS_NCT7802=m +CONFIG_SENSORS_NCT7904=m +CONFIG_SENSORS_NPCM7XX=m +# CONFIG_SENSORS_NZXT_KRAKEN2 is not set +# CONFIG_SENSORS_NZXT_SMART2 is not set +# CONFIG_SENSORS_OCC_P8_I2C is not set +# CONFIG_SENSORS_PCF8591 is not set +# CONFIG_PMBUS is not set +# CONFIG_SENSORS_PWM_FAN is not set +# CONFIG_SENSORS_SBTSI is not set +# CONFIG_SENSORS_SBRMI is not set +# CONFIG_SENSORS_SHT15 is not set +CONFIG_SENSORS_SHT21=m +# CONFIG_SENSORS_SHT3x is not set +# CONFIG_SENSORS_SHT4x is not set +# CONFIG_SENSORS_SHTC1 is not set +# CONFIG_SENSORS_SIS5595 is not set +CONFIG_SENSORS_DME1737=m +CONFIG_SENSORS_EMC1403=m +CONFIG_SENSORS_EMC2103=m +# CONFIG_SENSORS_EMC2305 is not set +CONFIG_SENSORS_EMC6W201=m +# CONFIG_SENSORS_SMSC47M1 is not set +CONFIG_SENSORS_SMSC47M192=m +# CONFIG_SENSORS_SMSC47B397 is not set +CONFIG_SENSORS_SCH56XX_COMMON=m +CONFIG_SENSORS_SCH5627=m +# CONFIG_SENSORS_SCH5636 is not set +# CONFIG_SENSORS_STTS751 is not set +# CONFIG_SENSORS_ADC128D818 is not set +CONFIG_SENSORS_ADS7828=m +CONFIG_SENSORS_ADS7871=m +CONFIG_SENSORS_AMC6821=m +# CONFIG_SENSORS_INA209 is not set +# CONFIG_SENSORS_INA2XX is not set +# CONFIG_SENSORS_INA238 is not set +# CONFIG_SENSORS_INA3221 is not set +# CONFIG_SENSORS_TC74 is not set +CONFIG_SENSORS_THMC50=m +CONFIG_SENSORS_TMP102=m +# CONFIG_SENSORS_TMP103 is not set +# CONFIG_SENSORS_TMP108 is not set +CONFIG_SENSORS_TMP401=m +CONFIG_SENSORS_TMP421=m +# CONFIG_SENSORS_TMP464 is not set +# CONFIG_SENSORS_TMP513 is not set +# CONFIG_SENSORS_VEXPRESS is not set +# CONFIG_SENSORS_VIA686A is not set +CONFIG_SENSORS_VT1211=m +CONFIG_SENSORS_VT8231=m +CONFIG_SENSORS_W83773G=m +# CONFIG_SENSORS_W83781D is not set +CONFIG_SENSORS_W83791D=m +CONFIG_SENSORS_W83792D=m +CONFIG_SENSORS_W83793=m +CONFIG_SENSORS_W83795=m +# CONFIG_SENSORS_W83795_FANCTRL is not set +# CONFIG_SENSORS_W83L785TS is not set +CONFIG_SENSORS_W83L786NG=m +# CONFIG_SENSORS_W83627HF is not set +CONFIG_SENSORS_W83627EHF=m +CONFIG_SENSORS_XGENE=m + +# +# ACPI drivers +# +CONFIG_SENSORS_ACPI_POWER=m +CONFIG_THERMAL=y +# CONFIG_THERMAL_NETLINK is not set +CONFIG_THERMAL_STATISTICS=y +CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0 +CONFIG_THERMAL_HWMON=y +CONFIG_THERMAL_OF=y +# CONFIG_THERMAL_WRITABLE_TRIPS is not set +CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y +# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set +# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set +CONFIG_THERMAL_GOV_FAIR_SHARE=y +CONFIG_THERMAL_GOV_STEP_WISE=y +# CONFIG_THERMAL_GOV_BANG_BANG is not set +# CONFIG_THERMAL_GOV_USER_SPACE is not set +# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set +CONFIG_CPU_THERMAL=y +CONFIG_CPU_FREQ_THERMAL=y +CONFIG_DEVFREQ_THERMAL=y +# CONFIG_THERMAL_EMULATION is not set +# CONFIG_THERMAL_MMIO is not set +CONFIG_HISI_THERMAL=m +# CONFIG_MAX77620_THERMAL is not set +# CONFIG_SUN8I_THERMAL is not set +CONFIG_ROCKCHIP_THERMAL=m +CONFIG_ARMADA_THERMAL=m +CONFIG_AMLOGIC_THERMAL=y + +# +# NVIDIA Tegra thermal drivers +# +CONFIG_TEGRA_SOCTHERM=y +# end of NVIDIA Tegra thermal drivers + +# CONFIG_GENERIC_ADC_THERMAL is not set + +# +# Qualcomm thermal drivers +# +# CONFIG_QCOM_SPMI_ADC_TM5 is not set +CONFIG_QCOM_SPMI_TEMP_ALARM=m +# CONFIG_QCOM_LMH is not set +# end of Qualcomm thermal drivers + +CONFIG_WATCHDOG=y +CONFIG_WATCHDOG_CORE=y +# CONFIG_WATCHDOG_NOWAYOUT is not set +CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y +CONFIG_WATCHDOG_OPEN_TIMEOUT=0 +CONFIG_WATCHDOG_SYSFS=y +# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set + +# +# Watchdog Pretimeout Governors +# +CONFIG_WATCHDOG_PRETIMEOUT_GOV=y +CONFIG_WATCHDOG_PRETIMEOUT_GOV_SEL=m +CONFIG_WATCHDOG_PRETIMEOUT_GOV_NOOP=m +CONFIG_WATCHDOG_PRETIMEOUT_GOV_PANIC=m +CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_NOOP=y +# CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_PANIC is not set + +# +# Watchdog Device Drivers +# +CONFIG_SOFT_WATCHDOG=m +# CONFIG_SOFT_WATCHDOG_PRETIMEOUT is not set +CONFIG_GPIO_WATCHDOG=m +CONFIG_WDAT_WDT=m +# CONFIG_XILINX_WATCHDOG is not set +# CONFIG_XILINX_WINDOW_WATCHDOG is not set +# CONFIG_ZIIRAVE_WATCHDOG is not set +CONFIG_ARM_SP805_WATCHDOG=m +CONFIG_ARM_SBSA_WATCHDOG=m +# CONFIG_ARMADA_37XX_WATCHDOG is not set +# CONFIG_CADENCE_WATCHDOG is not set +CONFIG_DW_WATCHDOG=m +CONFIG_SUNXI_WATCHDOG=m +# CONFIG_MAX63XX_WATCHDOG is not set +# CONFIG_MAX77620_WATCHDOG is not set +CONFIG_TEGRA_WATCHDOG=m +CONFIG_QCOM_WDT=m +CONFIG_MESON_GXBB_WATCHDOG=m +CONFIG_MESON_WATCHDOG=m +# CONFIG_ARM_SMC_WATCHDOG is not set +# CONFIG_PM8916_WATCHDOG is not set +# CONFIG_ALIM7101_WDT is not set +# CONFIG_I6300ESB_WDT is not set +# CONFIG_HP_WATCHDOG is not set +CONFIG_MARVELL_GTI_WDT=y +# CONFIG_MEN_A21_WDT is not set +CONFIG_XEN_WDT=m + +# +# PCI-based Watchdog Cards +# +# CONFIG_PCIPCWATCHDOG is not set +# CONFIG_WDTPCI is not set + +# +# USB-based Watchdog Cards +# +# CONFIG_USBPCWATCHDOG is not set +CONFIG_SSB_POSSIBLE=y +CONFIG_SSB=m +CONFIG_SSB_SPROM=y +CONFIG_SSB_BLOCKIO=y +CONFIG_SSB_PCIHOST_POSSIBLE=y +CONFIG_SSB_PCIHOST=y +CONFIG_SSB_B43_PCI_BRIDGE=y +CONFIG_SSB_SDIOHOST_POSSIBLE=y +CONFIG_SSB_SDIOHOST=y +CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y +CONFIG_SSB_DRIVER_PCICORE=y +# CONFIG_SSB_DRIVER_GPIO is not set +CONFIG_BCMA_POSSIBLE=y +CONFIG_BCMA=m +CONFIG_BCMA_BLOCKIO=y +CONFIG_BCMA_HOST_PCI_POSSIBLE=y +CONFIG_BCMA_HOST_PCI=y +# CONFIG_BCMA_HOST_SOC is not set +CONFIG_BCMA_DRIVER_PCI=y +# CONFIG_BCMA_DRIVER_GMAC_CMN is not set +# CONFIG_BCMA_DRIVER_GPIO is not set +# CONFIG_BCMA_DEBUG is not set + +# +# Multifunction device drivers +# +CONFIG_MFD_CORE=y +# CONFIG_MFD_ACT8945A is not set +# CONFIG_MFD_SUN4I_GPADC is not set +# CONFIG_MFD_AS3711 is not set +# CONFIG_MFD_SMPRO is not set +# CONFIG_MFD_AS3722 is not set +# CONFIG_PMIC_ADP5520 is not set +# CONFIG_MFD_AAT2870_CORE is not set +# CONFIG_MFD_ATMEL_FLEXCOM is not set +# CONFIG_MFD_ATMEL_HLCDC is not set +# CONFIG_MFD_BCM590XX is not set +# CONFIG_MFD_BD9571MWV is not set +# CONFIG_MFD_AC100 is not set +CONFIG_MFD_AXP20X=m +# CONFIG_MFD_AXP20X_I2C is not set +CONFIG_MFD_AXP20X_RSB=m +CONFIG_MFD_CROS_EC_DEV=y +# CONFIG_MFD_CS42L43_I2C is not set +# CONFIG_MFD_MADERA is not set +# CONFIG_MFD_MAX5970 is not set +# CONFIG_PMIC_DA903X is not set +# CONFIG_MFD_DA9052_SPI is not set +# CONFIG_MFD_DA9052_I2C is not set +# CONFIG_MFD_DA9055 is not set +# CONFIG_MFD_DA9062 is not set +# CONFIG_MFD_DA9063 is not set +# CONFIG_MFD_DA9150 is not set +# CONFIG_MFD_DLN2 is not set +# CONFIG_MFD_GATEWORKS_GSC is not set +# CONFIG_MFD_MC13XXX_SPI is not set +# CONFIG_MFD_MC13XXX_I2C is not set +# CONFIG_MFD_MP2629 is not set +# CONFIG_MFD_HI6421_PMIC is not set +# CONFIG_MFD_HI6421_SPMI is not set +CONFIG_MFD_HI655X_PMIC=m +# CONFIG_LPC_ICH is not set +CONFIG_LPC_SCH=m +# CONFIG_MFD_IQS62X is not set +# CONFIG_MFD_JANZ_CMODIO is not set +# CONFIG_MFD_KEMPLD is not set +# CONFIG_MFD_88PM800 is not set +# CONFIG_MFD_88PM805 is not set +# CONFIG_MFD_88PM860X is not set +# CONFIG_MFD_MAX14577 is not set +# CONFIG_MFD_MAX77541 is not set +CONFIG_MFD_MAX77620=y +# CONFIG_MFD_MAX77650 is not set +# CONFIG_MFD_MAX77686 is not set +# CONFIG_MFD_MAX77693 is not set +# CONFIG_MFD_MAX77714 is not set +# CONFIG_MFD_MAX77843 is not set +# CONFIG_MFD_MAX8907 is not set +# CONFIG_MFD_MAX8925 is not set +# CONFIG_MFD_MAX8997 is not set +# CONFIG_MFD_MAX8998 is not set +# CONFIG_MFD_MT6360 is not set +# CONFIG_MFD_MT6370 is not set +# CONFIG_MFD_MT6397 is not set +# CONFIG_MFD_MENF21BMC is not set +# CONFIG_MFD_OCELOT is not set +# CONFIG_EZX_PCAP is not set +# CONFIG_MFD_CPCAP is not set +CONFIG_MFD_VIPERBOARD=m +# CONFIG_MFD_NTXEC is not set +# CONFIG_MFD_RETU is not set +# CONFIG_MFD_PCF50633 is not set +CONFIG_MFD_QCOM_RPM=m +CONFIG_MFD_SPMI_PMIC=m +# CONFIG_MFD_SY7636A is not set +# CONFIG_MFD_RDC321X is not set +# CONFIG_MFD_RT4831 is not set +# CONFIG_MFD_RT5033 is not set +# CONFIG_MFD_RT5120 is not set +# CONFIG_MFD_RC5T583 is not set +# CONFIG_MFD_RK8XX_I2C is not set +# CONFIG_MFD_RK8XX_SPI is not set +# CONFIG_MFD_RN5T618 is not set +# CONFIG_MFD_SEC_CORE is not set +# CONFIG_MFD_SI476X_CORE is not set +# CONFIG_MFD_SM501 is not set +# CONFIG_MFD_SKY81452 is not set +# CONFIG_MFD_STMPE is not set +CONFIG_MFD_SUN6I_PRCM=y +CONFIG_MFD_SYSCON=y +# CONFIG_MFD_TI_AM335X_TSCADC is not set +# CONFIG_MFD_LP3943 is not set +# CONFIG_MFD_LP8788 is not set +# CONFIG_MFD_TI_LMU is not set +# CONFIG_MFD_PALMAS is not set +# CONFIG_TPS6105X is not set +# CONFIG_TPS65010 is not set +# CONFIG_TPS6507X is not set +# CONFIG_MFD_TPS65086 is not set +# CONFIG_MFD_TPS65090 is not set +# CONFIG_MFD_TPS65217 is not set +# CONFIG_MFD_TI_LP873X is not set +# CONFIG_MFD_TI_LP87565 is not set +# CONFIG_MFD_TPS65218 is not set +# CONFIG_MFD_TPS65219 is not set +# CONFIG_MFD_TPS6586X is not set +# CONFIG_MFD_TPS65910 is not set +# CONFIG_MFD_TPS65912_I2C is not set +# CONFIG_MFD_TPS65912_SPI is not set +# CONFIG_MFD_TPS6594_I2C is not set +# CONFIG_MFD_TPS6594_SPI is not set +# CONFIG_TWL4030_CORE is not set +# CONFIG_TWL6040_CORE is not set +# CONFIG_MFD_WL1273_CORE is not set +# CONFIG_MFD_LM3533 is not set +# CONFIG_MFD_TC3589X is not set +# CONFIG_MFD_TQMX86 is not set +# CONFIG_MFD_VX855 is not set +# CONFIG_MFD_LOCHNAGAR is not set +# CONFIG_MFD_ARIZONA_I2C is not set +# CONFIG_MFD_ARIZONA_SPI is not set +# CONFIG_MFD_WM8400 is not set +# CONFIG_MFD_WM831X_I2C is not set +# CONFIG_MFD_WM831X_SPI is not set +# CONFIG_MFD_WM8350_I2C is not set +# CONFIG_MFD_WM8994 is not set +# CONFIG_MFD_ROHM_BD718XX is not set +# CONFIG_MFD_ROHM_BD71828 is not set +# CONFIG_MFD_ROHM_BD957XMUF is not set +# CONFIG_MFD_STPMIC1 is not set +# CONFIG_MFD_STMFX is not set +# CONFIG_MFD_ATC260X_I2C is not set +# CONFIG_MFD_KHADAS_MCU is not set +# CONFIG_MFD_QCOM_PM8008 is not set +CONFIG_MFD_VEXPRESS_SYSREG=y +# CONFIG_RAVE_SP_CORE is not set +# CONFIG_MFD_INTEL_M10_BMC_SPI is not set +# CONFIG_MFD_RSMU_I2C is not set +# CONFIG_MFD_RSMU_SPI is not set +# end of Multifunction device drivers + +CONFIG_REGULATOR=y +# CONFIG_REGULATOR_DEBUG is not set +CONFIG_REGULATOR_FIXED_VOLTAGE=m +# CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set +# CONFIG_REGULATOR_USERSPACE_CONSUMER is not set +# CONFIG_REGULATOR_88PG86X is not set +# CONFIG_REGULATOR_ACT8865 is not set +# CONFIG_REGULATOR_AD5398 is not set +# CONFIG_REGULATOR_AW37503 is not set +CONFIG_REGULATOR_AXP20X=m +# CONFIG_REGULATOR_CROS_EC is not set +# CONFIG_REGULATOR_DA9121 is not set +# CONFIG_REGULATOR_DA9210 is not set +# CONFIG_REGULATOR_DA9211 is not set +CONFIG_REGULATOR_FAN53555=m +# CONFIG_REGULATOR_FAN53880 is not set +CONFIG_REGULATOR_GPIO=m +CONFIG_REGULATOR_HI655X=m +# CONFIG_REGULATOR_ISL9305 is not set +# CONFIG_REGULATOR_ISL6271A is not set +# CONFIG_REGULATOR_LP3971 is not set +# CONFIG_REGULATOR_LP3972 is not set +# CONFIG_REGULATOR_LP872X is not set +# CONFIG_REGULATOR_LP8755 is not set +# CONFIG_REGULATOR_LTC3589 is not set +# CONFIG_REGULATOR_LTC3676 is not set +# CONFIG_REGULATOR_MAX1586 is not set +CONFIG_REGULATOR_MAX77620=m +# CONFIG_REGULATOR_MAX77857 is not set +# CONFIG_REGULATOR_MAX8649 is not set +# CONFIG_REGULATOR_MAX8660 is not set +# CONFIG_REGULATOR_MAX8893 is not set +# CONFIG_REGULATOR_MAX8952 is not set +# CONFIG_REGULATOR_MAX8973 is not set +# CONFIG_REGULATOR_MAX20086 is not set +# CONFIG_REGULATOR_MAX20411 is not set +# CONFIG_REGULATOR_MAX77826 is not set +# CONFIG_REGULATOR_MCP16502 is not set +# CONFIG_REGULATOR_MP5416 is not set +# CONFIG_REGULATOR_MP8859 is not set +# CONFIG_REGULATOR_MP886X is not set +# CONFIG_REGULATOR_MPQ7920 is not set +# CONFIG_REGULATOR_MT6311 is not set +# CONFIG_REGULATOR_MT6315 is not set +# CONFIG_REGULATOR_PCA9450 is not set +# CONFIG_REGULATOR_PF8X00 is not set +# CONFIG_REGULATOR_PFUZE100 is not set +# CONFIG_REGULATOR_PV88060 is not set +# CONFIG_REGULATOR_PV88080 is not set +# CONFIG_REGULATOR_PV88090 is not set +CONFIG_REGULATOR_PWM=m +# CONFIG_REGULATOR_QCOM_REFGEN is not set +CONFIG_REGULATOR_QCOM_RPM=m +CONFIG_REGULATOR_QCOM_SMD_RPM=m +CONFIG_REGULATOR_QCOM_SPMI=m +# CONFIG_REGULATOR_QCOM_USB_VBUS is not set +# CONFIG_REGULATOR_RAA215300 is not set +# CONFIG_REGULATOR_RASPBERRYPI_TOUCHSCREEN_ATTINY is not set +# CONFIG_REGULATOR_RT4801 is not set +# CONFIG_REGULATOR_RT4803 is not set +# CONFIG_REGULATOR_RT5190A is not set +# CONFIG_REGULATOR_RT5739 is not set +# CONFIG_REGULATOR_RT5759 is not set +# CONFIG_REGULATOR_RT6160 is not set +# CONFIG_REGULATOR_RT6190 is not set +# CONFIG_REGULATOR_RT6245 is not set +# CONFIG_REGULATOR_RTQ2134 is not set +# CONFIG_REGULATOR_RTMV20 is not set +# CONFIG_REGULATOR_RTQ6752 is not set +# CONFIG_REGULATOR_RTQ2208 is not set +# CONFIG_REGULATOR_SLG51000 is not set +# CONFIG_REGULATOR_SY8106A is not set +# CONFIG_REGULATOR_SY8824X is not set +# CONFIG_REGULATOR_SY8827N is not set +# CONFIG_REGULATOR_TPS51632 is not set +# CONFIG_REGULATOR_TPS62360 is not set +# CONFIG_REGULATOR_TPS6286X is not set +# CONFIG_REGULATOR_TPS6287X is not set +# CONFIG_REGULATOR_TPS65023 is not set +# CONFIG_REGULATOR_TPS6507X is not set +# CONFIG_REGULATOR_TPS65132 is not set +# CONFIG_REGULATOR_TPS6524X is not set +CONFIG_REGULATOR_VCTRL=m +# CONFIG_REGULATOR_VEXPRESS is not set +# CONFIG_REGULATOR_VQMMC_IPQ4019 is not set +# CONFIG_REGULATOR_QCOM_LABIBB is not set +CONFIG_RC_CORE=m +CONFIG_LIRC=y +CONFIG_RC_MAP=m +CONFIG_RC_DECODERS=y +CONFIG_IR_IMON_DECODER=m +CONFIG_IR_JVC_DECODER=m +CONFIG_IR_MCE_KBD_DECODER=m +CONFIG_IR_NEC_DECODER=m +CONFIG_IR_RC5_DECODER=m +CONFIG_IR_RC6_DECODER=m +# CONFIG_IR_RCMM_DECODER is not set +CONFIG_IR_SANYO_DECODER=m +CONFIG_IR_SHARP_DECODER=m +CONFIG_IR_SONY_DECODER=m +CONFIG_IR_XMP_DECODER=m +CONFIG_RC_DEVICES=y +CONFIG_IR_ENE=m +# CONFIG_IR_FINTEK is not set +# CONFIG_IR_GPIO_CIR is not set +# CONFIG_IR_GPIO_TX is not set +# CONFIG_IR_HIX5HD2 is not set +CONFIG_IR_IGORPLUGUSB=m +CONFIG_IR_IGUANA=m +CONFIG_IR_IMON=m +CONFIG_IR_IMON_RAW=m +# CONFIG_IR_ITE_CIR is not set +CONFIG_IR_MCEUSB=m +# CONFIG_IR_MESON is not set +# CONFIG_IR_MESON_TX is not set +# CONFIG_IR_NUVOTON is not set +# CONFIG_IR_PWM_TX is not set +CONFIG_IR_REDRAT3=m +# CONFIG_IR_SERIAL is not set +# CONFIG_IR_SPI is not set +CONFIG_IR_STREAMZAP=m +# CONFIG_IR_SUNXI is not set +# CONFIG_IR_TOY is not set +CONFIG_IR_TTUSBIR=m +CONFIG_RC_ATI_REMOTE=m +CONFIG_RC_LOOPBACK=m +# CONFIG_RC_XBOX_DVD is not set +CONFIG_CEC_CORE=m + +# +# CEC support +# +# CONFIG_MEDIA_CEC_RC is not set +CONFIG_MEDIA_CEC_SUPPORT=y +# CONFIG_CEC_CH7322 is not set +# CONFIG_CEC_CROS_EC is not set +# CONFIG_CEC_MESON_AO is not set +# CONFIG_CEC_MESON_G12A_AO is not set +# CONFIG_CEC_TEGRA is not set +CONFIG_USB_PULSE8_CEC=m +CONFIG_USB_RAINSHADOW_CEC=m +# end of CEC support + +CONFIG_MEDIA_SUPPORT=m +# CONFIG_MEDIA_SUPPORT_FILTER is not set +CONFIG_MEDIA_SUBDRV_AUTOSELECT=y + +# +# Media device types +# +CONFIG_MEDIA_CAMERA_SUPPORT=y +CONFIG_MEDIA_ANALOG_TV_SUPPORT=y +CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y +CONFIG_MEDIA_RADIO_SUPPORT=y +CONFIG_MEDIA_SDR_SUPPORT=y +CONFIG_MEDIA_PLATFORM_SUPPORT=y +CONFIG_MEDIA_TEST_SUPPORT=y +# end of Media device types + +# +# Media core support +# +CONFIG_VIDEO_DEV=m +CONFIG_MEDIA_CONTROLLER=y +CONFIG_DVB_CORE=m +# end of Media core support + +# +# Video4Linux options +# +CONFIG_VIDEO_V4L2_I2C=y +CONFIG_VIDEO_V4L2_SUBDEV_API=y +# CONFIG_VIDEO_ADV_DEBUG is not set +# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set +CONFIG_VIDEO_TUNER=m +CONFIG_V4L2_FWNODE=m +CONFIG_V4L2_ASYNC=m +# end of Video4Linux options + +# +# Media controller options +# +CONFIG_MEDIA_CONTROLLER_DVB=y +CONFIG_MEDIA_CONTROLLER_REQUEST_API=y +# end of Media controller options + +# +# Digital TV options +# +# CONFIG_DVB_MMAP is not set +CONFIG_DVB_NET=y +CONFIG_DVB_MAX_ADAPTERS=16 +CONFIG_DVB_DYNAMIC_MINORS=y +# CONFIG_DVB_DEMUX_SECTION_LOSS_LOG is not set +# CONFIG_DVB_ULE_DEBUG is not set +# end of Digital TV options + +# +# Media drivers +# + +# +# Media drivers +# +CONFIG_MEDIA_USB_SUPPORT=y + +# +# Webcam devices +# +CONFIG_USB_GSPCA=m +CONFIG_USB_GSPCA_BENQ=m +CONFIG_USB_GSPCA_CONEX=m +CONFIG_USB_GSPCA_CPIA1=m +CONFIG_USB_GSPCA_DTCS033=m +CONFIG_USB_GSPCA_ETOMS=m +CONFIG_USB_GSPCA_FINEPIX=m +CONFIG_USB_GSPCA_JEILINJ=m +CONFIG_USB_GSPCA_JL2005BCD=m +CONFIG_USB_GSPCA_KINECT=m +CONFIG_USB_GSPCA_KONICA=m +CONFIG_USB_GSPCA_MARS=m +CONFIG_USB_GSPCA_MR97310A=m +CONFIG_USB_GSPCA_NW80X=m +CONFIG_USB_GSPCA_OV519=m +CONFIG_USB_GSPCA_OV534=m +CONFIG_USB_GSPCA_OV534_9=m +CONFIG_USB_GSPCA_PAC207=m +CONFIG_USB_GSPCA_PAC7302=m +CONFIG_USB_GSPCA_PAC7311=m +CONFIG_USB_GSPCA_SE401=m +CONFIG_USB_GSPCA_SN9C2028=m +CONFIG_USB_GSPCA_SN9C20X=m +CONFIG_USB_GSPCA_SONIXB=m +CONFIG_USB_GSPCA_SONIXJ=m +CONFIG_USB_GSPCA_SPCA1528=m +CONFIG_USB_GSPCA_SPCA500=m +CONFIG_USB_GSPCA_SPCA501=m +CONFIG_USB_GSPCA_SPCA505=m +CONFIG_USB_GSPCA_SPCA506=m +CONFIG_USB_GSPCA_SPCA508=m +CONFIG_USB_GSPCA_SPCA561=m +CONFIG_USB_GSPCA_SQ905=m +CONFIG_USB_GSPCA_SQ905C=m +CONFIG_USB_GSPCA_SQ930X=m +CONFIG_USB_GSPCA_STK014=m +CONFIG_USB_GSPCA_STK1135=m +CONFIG_USB_GSPCA_STV0680=m +CONFIG_USB_GSPCA_SUNPLUS=m +CONFIG_USB_GSPCA_T613=m +CONFIG_USB_GSPCA_TOPRO=m +CONFIG_USB_GSPCA_TOUPTEK=m +CONFIG_USB_GSPCA_TV8532=m +CONFIG_USB_GSPCA_VC032X=m +CONFIG_USB_GSPCA_VICAM=m +CONFIG_USB_GSPCA_XIRLINK_CIT=m +CONFIG_USB_GSPCA_ZC3XX=m +CONFIG_USB_GL860=m +CONFIG_USB_M5602=m +CONFIG_USB_STV06XX=m +CONFIG_USB_PWC=m +# CONFIG_USB_PWC_DEBUG is not set +CONFIG_USB_PWC_INPUT_EVDEV=y +CONFIG_USB_S2255=m +CONFIG_VIDEO_USBTV=m +CONFIG_USB_VIDEO_CLASS=m +CONFIG_USB_VIDEO_CLASS_INPUT_EVDEV=y + +# +# Analog TV USB devices +# +CONFIG_VIDEO_GO7007=m +CONFIG_VIDEO_GO7007_USB=m +CONFIG_VIDEO_GO7007_LOADER=m +CONFIG_VIDEO_GO7007_USB_S2250_BOARD=m +CONFIG_VIDEO_HDPVR=m +CONFIG_VIDEO_PVRUSB2=m +CONFIG_VIDEO_PVRUSB2_SYSFS=y +CONFIG_VIDEO_PVRUSB2_DVB=y +# CONFIG_VIDEO_PVRUSB2_DEBUGIFC is not set +CONFIG_VIDEO_STK1160=m + +# +# Analog/digital TV USB devices +# +CONFIG_VIDEO_AU0828=m +CONFIG_VIDEO_AU0828_V4L2=y +CONFIG_VIDEO_AU0828_RC=y +CONFIG_VIDEO_CX231XX=m +CONFIG_VIDEO_CX231XX_RC=y +CONFIG_VIDEO_CX231XX_ALSA=m +CONFIG_VIDEO_CX231XX_DVB=m + +# +# Digital TV USB devices +# +CONFIG_DVB_AS102=m +CONFIG_DVB_B2C2_FLEXCOP_USB=m +# CONFIG_DVB_B2C2_FLEXCOP_USB_DEBUG is not set +CONFIG_DVB_USB_V2=m +CONFIG_DVB_USB_AF9015=m +CONFIG_DVB_USB_AF9035=m +CONFIG_DVB_USB_ANYSEE=m +CONFIG_DVB_USB_AU6610=m +CONFIG_DVB_USB_AZ6007=m +CONFIG_DVB_USB_CE6230=m +CONFIG_DVB_USB_DVBSKY=m +CONFIG_DVB_USB_EC168=m +CONFIG_DVB_USB_GL861=m +CONFIG_DVB_USB_LME2510=m +CONFIG_DVB_USB_MXL111SF=m +CONFIG_DVB_USB_RTL28XXU=m +CONFIG_DVB_USB_ZD1301=m +CONFIG_DVB_USB=m +# CONFIG_DVB_USB_DEBUG is not set +CONFIG_DVB_USB_A800=m +CONFIG_DVB_USB_AF9005=m +CONFIG_DVB_USB_AF9005_REMOTE=m +CONFIG_DVB_USB_AZ6027=m +CONFIG_DVB_USB_CINERGY_T2=m +CONFIG_DVB_USB_CXUSB=m +# CONFIG_DVB_USB_CXUSB_ANALOG is not set +CONFIG_DVB_USB_DIB0700=m +CONFIG_DVB_USB_DIB3000MC=m +CONFIG_DVB_USB_DIBUSB_MB=m +CONFIG_DVB_USB_DIBUSB_MB_FAULTY=y +CONFIG_DVB_USB_DIBUSB_MC=m +CONFIG_DVB_USB_DIGITV=m +CONFIG_DVB_USB_DTT200U=m +CONFIG_DVB_USB_DTV5100=m +CONFIG_DVB_USB_DW2102=m +CONFIG_DVB_USB_GP8PSK=m +CONFIG_DVB_USB_M920X=m +CONFIG_DVB_USB_NOVA_T_USB2=m +CONFIG_DVB_USB_OPERA1=m +CONFIG_DVB_USB_PCTV452E=m +CONFIG_DVB_USB_TECHNISAT_USB2=m +CONFIG_DVB_USB_TTUSB2=m +CONFIG_DVB_USB_UMT_010=m +CONFIG_DVB_USB_VP702X=m +CONFIG_DVB_USB_VP7045=m +CONFIG_SMS_USB_DRV=m +CONFIG_DVB_TTUSB_BUDGET=m +CONFIG_DVB_TTUSB_DEC=m + +# +# Webcam, TV (analog/digital) USB devices +# +CONFIG_VIDEO_EM28XX=m +CONFIG_VIDEO_EM28XX_V4L2=m +CONFIG_VIDEO_EM28XX_ALSA=m +CONFIG_VIDEO_EM28XX_DVB=m +CONFIG_VIDEO_EM28XX_RC=m + +# +# Software defined radio USB devices +# +CONFIG_USB_AIRSPY=m +CONFIG_USB_HACKRF=m +CONFIG_USB_MSI2500=m +CONFIG_MEDIA_PCI_SUPPORT=y + +# +# Media capture support +# +CONFIG_VIDEO_SOLO6X10=m +CONFIG_VIDEO_TW5864=m +CONFIG_VIDEO_TW68=m +CONFIG_VIDEO_TW686X=m +# CONFIG_VIDEO_ZORAN is not set + +# +# Media capture/analog TV support +# +CONFIG_VIDEO_DT3155=m +CONFIG_VIDEO_IVTV=m +CONFIG_VIDEO_IVTV_ALSA=m +CONFIG_VIDEO_FB_IVTV=m +CONFIG_VIDEO_HEXIUM_GEMINI=m +CONFIG_VIDEO_HEXIUM_ORION=m +CONFIG_VIDEO_MXB=m + +# +# Media capture/analog/hybrid TV support +# +CONFIG_VIDEO_BT848=m +CONFIG_DVB_BT8XX=m +CONFIG_VIDEO_CX18=m +CONFIG_VIDEO_CX18_ALSA=m +CONFIG_VIDEO_CX23885=m +CONFIG_MEDIA_ALTERA_CI=m +# CONFIG_VIDEO_CX25821 is not set +CONFIG_VIDEO_CX88=m +CONFIG_VIDEO_CX88_ALSA=m +CONFIG_VIDEO_CX88_BLACKBIRD=m +CONFIG_VIDEO_CX88_DVB=m +CONFIG_VIDEO_CX88_ENABLE_VP3054=y +CONFIG_VIDEO_CX88_VP3054=m +CONFIG_VIDEO_CX88_MPEG=m +CONFIG_VIDEO_SAA7134=m +CONFIG_VIDEO_SAA7134_ALSA=m +CONFIG_VIDEO_SAA7134_RC=y +CONFIG_VIDEO_SAA7134_DVB=m +# CONFIG_VIDEO_SAA7134_GO7007 is not set +CONFIG_VIDEO_SAA7164=m + +# +# Media digital TV PCI Adapters +# +CONFIG_DVB_B2C2_FLEXCOP_PCI=m +# CONFIG_DVB_B2C2_FLEXCOP_PCI_DEBUG is not set +CONFIG_DVB_DDBRIDGE=m +# CONFIG_DVB_DDBRIDGE_MSIENABLE is not set +CONFIG_DVB_DM1105=m +CONFIG_MANTIS_CORE=m +CONFIG_DVB_MANTIS=m +CONFIG_DVB_HOPPER=m +CONFIG_DVB_NETUP_UNIDVB=m +CONFIG_DVB_NGENE=m +CONFIG_DVB_PLUTO2=m +CONFIG_DVB_PT1=m +CONFIG_DVB_PT3=m +CONFIG_DVB_SMIPCIE=m +CONFIG_DVB_BUDGET_CORE=m +CONFIG_DVB_BUDGET=m +CONFIG_DVB_BUDGET_CI=m +CONFIG_DVB_BUDGET_AV=m +# CONFIG_IPU_BRIDGE is not set +CONFIG_RADIO_ADAPTERS=m +# CONFIG_RADIO_MAXIRADIO is not set +# CONFIG_RADIO_SAA7706H is not set +CONFIG_RADIO_SHARK=m +CONFIG_RADIO_SHARK2=m +# CONFIG_RADIO_SI4713 is not set +CONFIG_RADIO_TEA575X=m +# CONFIG_RADIO_TEA5764 is not set +# CONFIG_RADIO_TEF6862 is not set +# CONFIG_RADIO_WL1273 is not set +# CONFIG_USB_DSBR is not set +CONFIG_USB_KEENE=m +CONFIG_USB_MA901=m +CONFIG_USB_MR800=m +CONFIG_USB_RAREMONO=m +CONFIG_RADIO_SI470X=m +CONFIG_USB_SI470X=m +# CONFIG_I2C_SI470X is not set +# CONFIG_RADIO_WL128X is not set +CONFIG_MEDIA_PLATFORM_DRIVERS=y +CONFIG_V4L_PLATFORM_DRIVERS=y +# CONFIG_SDR_PLATFORM_DRIVERS is not set +# CONFIG_DVB_PLATFORM_DRIVERS is not set +CONFIG_V4L_MEM2MEM_DRIVERS=y +# CONFIG_VIDEO_MEM2MEM_DEINTERLACE is not set +# CONFIG_VIDEO_MUX is not set + +# +# Allegro DVT media platform drivers +# +# CONFIG_VIDEO_ALLEGRO_DVT is not set + +# +# Amlogic media platform drivers +# +# CONFIG_VIDEO_MESON_GE2D is not set + +# +# Amphion drivers +# + +# +# Aspeed media platform drivers +# + +# +# Atmel media platform drivers +# + +# +# Cadence media platform drivers +# +# CONFIG_VIDEO_CADENCE_CSI2RX is not set +# CONFIG_VIDEO_CADENCE_CSI2TX is not set + +# +# Chips&Media media platform drivers +# + +# +# Intel media platform drivers +# + +# +# Marvell media platform drivers +# +CONFIG_VIDEO_CAFE_CCIC=m + +# +# Mediatek media platform drivers +# + +# +# Microchip Technology, Inc. media platform drivers +# + +# +# NVidia media platform drivers +# +# CONFIG_VIDEO_TEGRA_VDE is not set + +# +# NXP media platform drivers +# + +# +# Qualcomm media platform drivers +# +# CONFIG_VIDEO_QCOM_CAMSS is not set + +# +# Renesas media platform drivers +# + +# +# Rockchip media platform drivers +# +# CONFIG_VIDEO_ROCKCHIP_RGA is not set +# CONFIG_VIDEO_ROCKCHIP_ISP1 is not set + +# +# Samsung media platform drivers +# + +# +# STMicroelectronics media platform drivers +# + +# +# Sunxi media platform drivers +# +# CONFIG_VIDEO_SUN4I_CSI is not set +# CONFIG_VIDEO_SUN6I_CSI is not set +# CONFIG_VIDEO_SUN8I_A83T_MIPI_CSI2 is not set +# CONFIG_VIDEO_SUN8I_DEINTERLACE is not set +# CONFIG_VIDEO_SUN8I_ROTATE is not set + +# +# Texas Instruments drivers +# + +# +# Verisilicon media platform drivers +# +# CONFIG_VIDEO_HANTRO is not set + +# +# VIA media platform drivers +# + +# +# Xilinx media platform drivers +# +# CONFIG_VIDEO_XILINX is not set + +# +# MMC/SDIO DVB adapters +# +CONFIG_SMS_SDIO_DRV=m +CONFIG_V4L_TEST_DRIVERS=y +# CONFIG_VIDEO_VIM2M is not set +# CONFIG_VIDEO_VICODEC is not set +# CONFIG_VIDEO_VIMC is not set +CONFIG_VIDEO_VIVID=m +CONFIG_VIDEO_VIVID_CEC=y +CONFIG_VIDEO_VIVID_MAX_DEVS=64 +# CONFIG_VIDEO_VISL is not set +# CONFIG_DVB_TEST_DRIVERS is not set + +# +# FireWire (IEEE 1394) Adapters +# +CONFIG_DVB_FIREDTV=m +CONFIG_DVB_FIREDTV_INPUT=y +CONFIG_MEDIA_COMMON_OPTIONS=y + +# +# common driver options +# +CONFIG_CYPRESS_FIRMWARE=m +CONFIG_TTPCI_EEPROM=m +CONFIG_UVC_COMMON=m +CONFIG_VIDEO_CX2341X=m +CONFIG_VIDEO_TVEEPROM=m +CONFIG_DVB_B2C2_FLEXCOP=m +CONFIG_VIDEO_SAA7146=m +CONFIG_VIDEO_SAA7146_VV=m +CONFIG_SMS_SIANO_MDTV=m +CONFIG_SMS_SIANO_RC=y +# CONFIG_SMS_SIANO_DEBUGFS is not set +CONFIG_VIDEO_V4L2_TPG=m +CONFIG_VIDEOBUF2_CORE=m +CONFIG_VIDEOBUF2_V4L2=m +CONFIG_VIDEOBUF2_MEMOPS=m +CONFIG_VIDEOBUF2_DMA_CONTIG=m +CONFIG_VIDEOBUF2_VMALLOC=m +CONFIG_VIDEOBUF2_DMA_SG=m +CONFIG_VIDEOBUF2_DVB=m +# end of Media drivers + +# +# Media ancillary drivers +# +CONFIG_MEDIA_ATTACH=y + +# +# IR I2C driver auto-selected by 'Autoselect ancillary drivers' +# +CONFIG_VIDEO_IR_I2C=m +CONFIG_VIDEO_CAMERA_SENSOR=y +# CONFIG_VIDEO_AR0521 is not set +# CONFIG_VIDEO_HI556 is not set +# CONFIG_VIDEO_HI846 is not set +# CONFIG_VIDEO_HI847 is not set +# CONFIG_VIDEO_IMX208 is not set +# CONFIG_VIDEO_IMX214 is not set +# CONFIG_VIDEO_IMX219 is not set +# CONFIG_VIDEO_IMX258 is not set +# CONFIG_VIDEO_IMX274 is not set +# CONFIG_VIDEO_IMX290 is not set +# CONFIG_VIDEO_IMX296 is not set +# CONFIG_VIDEO_IMX319 is not set +# CONFIG_VIDEO_IMX334 is not set +# CONFIG_VIDEO_IMX335 is not set +# CONFIG_VIDEO_IMX355 is not set +# CONFIG_VIDEO_IMX412 is not set +# CONFIG_VIDEO_IMX415 is not set +# CONFIG_VIDEO_MT9M001 is not set +# CONFIG_VIDEO_MT9M111 is not set +# CONFIG_VIDEO_MT9P031 is not set +# CONFIG_VIDEO_MT9T112 is not set +CONFIG_VIDEO_MT9V011=m +# CONFIG_VIDEO_MT9V032 is not set +# CONFIG_VIDEO_MT9V111 is not set +# CONFIG_VIDEO_OG01A1B is not set +# CONFIG_VIDEO_OV01A10 is not set +# CONFIG_VIDEO_OV02A10 is not set +# CONFIG_VIDEO_OV08D10 is not set +# CONFIG_VIDEO_OV08X40 is not set +# CONFIG_VIDEO_OV13858 is not set +# CONFIG_VIDEO_OV13B10 is not set +CONFIG_VIDEO_OV2640=m +# CONFIG_VIDEO_OV2659 is not set +# CONFIG_VIDEO_OV2680 is not set +# CONFIG_VIDEO_OV2685 is not set +# CONFIG_VIDEO_OV2740 is not set +# CONFIG_VIDEO_OV4689 is not set +# CONFIG_VIDEO_OV5640 is not set +# CONFIG_VIDEO_OV5645 is not set +# CONFIG_VIDEO_OV5647 is not set +# CONFIG_VIDEO_OV5648 is not set +# CONFIG_VIDEO_OV5670 is not set +# CONFIG_VIDEO_OV5675 is not set +# CONFIG_VIDEO_OV5693 is not set +# CONFIG_VIDEO_OV5695 is not set +# CONFIG_VIDEO_OV6650 is not set +# CONFIG_VIDEO_OV7251 is not set +CONFIG_VIDEO_OV7640=m +CONFIG_VIDEO_OV7670=m +# CONFIG_VIDEO_OV772X is not set +# CONFIG_VIDEO_OV7740 is not set +# CONFIG_VIDEO_OV8856 is not set +# CONFIG_VIDEO_OV8858 is not set +# CONFIG_VIDEO_OV8865 is not set +# CONFIG_VIDEO_OV9282 is not set +# CONFIG_VIDEO_OV9640 is not set +# CONFIG_VIDEO_OV9650 is not set +# CONFIG_VIDEO_OV9734 is not set +# CONFIG_VIDEO_RDACM20 is not set +# CONFIG_VIDEO_RDACM21 is not set +# CONFIG_VIDEO_RJ54N1 is not set +# CONFIG_VIDEO_S5C73M3 is not set +# CONFIG_VIDEO_S5K5BAF is not set +# CONFIG_VIDEO_S5K6A3 is not set +# CONFIG_VIDEO_ST_VGXY61 is not set +# CONFIG_VIDEO_CCS is not set +# CONFIG_VIDEO_ET8EK8 is not set + +# +# Lens drivers +# +# CONFIG_VIDEO_AD5820 is not set +# CONFIG_VIDEO_AK7375 is not set +# CONFIG_VIDEO_DW9714 is not set +# CONFIG_VIDEO_DW9719 is not set +# CONFIG_VIDEO_DW9768 is not set +# CONFIG_VIDEO_DW9807_VCM is not set +# end of Lens drivers + +# +# Flash devices +# +# CONFIG_VIDEO_ADP1653 is not set +# CONFIG_VIDEO_LM3560 is not set +# CONFIG_VIDEO_LM3646 is not set +# end of Flash devices + +# +# Audio decoders, processors and mixers +# +CONFIG_VIDEO_CS3308=m +CONFIG_VIDEO_CS5345=m +CONFIG_VIDEO_CS53L32A=m +CONFIG_VIDEO_MSP3400=m +CONFIG_VIDEO_SONY_BTF_MPX=m +# CONFIG_VIDEO_TDA1997X is not set +CONFIG_VIDEO_TDA7432=m +CONFIG_VIDEO_TDA9840=m +CONFIG_VIDEO_TEA6415C=m +CONFIG_VIDEO_TEA6420=m +CONFIG_VIDEO_TLV320AIC23B=m +CONFIG_VIDEO_TVAUDIO=m +CONFIG_VIDEO_UDA1342=m +CONFIG_VIDEO_VP27SMPX=m +CONFIG_VIDEO_WM8739=m +CONFIG_VIDEO_WM8775=m +# end of Audio decoders, processors and mixers + +# +# RDS decoders +# +CONFIG_VIDEO_SAA6588=m +# end of RDS decoders + +# +# Video decoders +# +# CONFIG_VIDEO_ADV7180 is not set +# CONFIG_VIDEO_ADV7183 is not set +# CONFIG_VIDEO_ADV748X is not set +# CONFIG_VIDEO_ADV7604 is not set +# CONFIG_VIDEO_ADV7842 is not set +CONFIG_VIDEO_BT819=m +CONFIG_VIDEO_BT856=m +# CONFIG_VIDEO_BT866 is not set +# CONFIG_VIDEO_ISL7998X is not set +CONFIG_VIDEO_KS0127=m +# CONFIG_VIDEO_MAX9286 is not set +# CONFIG_VIDEO_ML86V7667 is not set +CONFIG_VIDEO_SAA7110=m +CONFIG_VIDEO_SAA711X=m +# CONFIG_VIDEO_TC358743 is not set +# CONFIG_VIDEO_TC358746 is not set +# CONFIG_VIDEO_TVP514X is not set +CONFIG_VIDEO_TVP5150=m +# CONFIG_VIDEO_TVP7002 is not set +CONFIG_VIDEO_TW2804=m +CONFIG_VIDEO_TW9903=m +CONFIG_VIDEO_TW9906=m +# CONFIG_VIDEO_TW9910 is not set +CONFIG_VIDEO_VPX3220=m + +# +# Video and audio decoders +# +CONFIG_VIDEO_SAA717X=m +CONFIG_VIDEO_CX25840=m +# end of Video decoders + +# +# Video encoders +# +CONFIG_VIDEO_ADV7170=m +CONFIG_VIDEO_ADV7175=m +# CONFIG_VIDEO_ADV7343 is not set +# CONFIG_VIDEO_ADV7393 is not set +# CONFIG_VIDEO_AK881X is not set +CONFIG_VIDEO_SAA7127=m +CONFIG_VIDEO_SAA7185=m +# CONFIG_VIDEO_THS8200 is not set +# end of Video encoders + +# +# Video improvement chips +# +CONFIG_VIDEO_UPD64031A=m +CONFIG_VIDEO_UPD64083=m +# end of Video improvement chips + +# +# Audio/Video compression chips +# +CONFIG_VIDEO_SAA6752HS=m +# end of Audio/Video compression chips + +# +# SDR tuner chips +# +# CONFIG_SDR_MAX2175 is not set +# end of SDR tuner chips + +# +# Miscellaneous helper chips +# +# CONFIG_VIDEO_I2C is not set +CONFIG_VIDEO_M52790=m +# CONFIG_VIDEO_ST_MIPID02 is not set +# CONFIG_VIDEO_THS7303 is not set +# end of Miscellaneous helper chips + +# +# Video serializers and deserializers +# +# CONFIG_VIDEO_DS90UB913 is not set +# CONFIG_VIDEO_DS90UB953 is not set +# CONFIG_VIDEO_DS90UB960 is not set +# end of Video serializers and deserializers + +# +# Media SPI Adapters +# +# CONFIG_CXD2880_SPI_DRV is not set +# CONFIG_VIDEO_GS1662 is not set +# end of Media SPI Adapters + +CONFIG_MEDIA_TUNER=m + +# +# Customize TV tuners +# +CONFIG_MEDIA_TUNER_E4000=m +CONFIG_MEDIA_TUNER_FC0011=m +CONFIG_MEDIA_TUNER_FC0012=m +CONFIG_MEDIA_TUNER_FC0013=m +CONFIG_MEDIA_TUNER_FC2580=m +CONFIG_MEDIA_TUNER_IT913X=m +CONFIG_MEDIA_TUNER_M88RS6000T=m +CONFIG_MEDIA_TUNER_MAX2165=m +CONFIG_MEDIA_TUNER_MC44S803=m +CONFIG_MEDIA_TUNER_MSI001=m +CONFIG_MEDIA_TUNER_MT2060=m +CONFIG_MEDIA_TUNER_MT2063=m +CONFIG_MEDIA_TUNER_MT20XX=m +CONFIG_MEDIA_TUNER_MT2131=m +CONFIG_MEDIA_TUNER_MT2266=m +CONFIG_MEDIA_TUNER_MXL301RF=m +CONFIG_MEDIA_TUNER_MXL5005S=m +CONFIG_MEDIA_TUNER_MXL5007T=m +CONFIG_MEDIA_TUNER_QM1D1B0004=m +CONFIG_MEDIA_TUNER_QM1D1C0042=m +CONFIG_MEDIA_TUNER_QT1010=m +CONFIG_MEDIA_TUNER_R820T=m +CONFIG_MEDIA_TUNER_SI2157=m +CONFIG_MEDIA_TUNER_SIMPLE=m +CONFIG_MEDIA_TUNER_TDA18212=m +CONFIG_MEDIA_TUNER_TDA18218=m +CONFIG_MEDIA_TUNER_TDA18250=m +CONFIG_MEDIA_TUNER_TDA18271=m +CONFIG_MEDIA_TUNER_TDA827X=m +CONFIG_MEDIA_TUNER_TDA8290=m +CONFIG_MEDIA_TUNER_TDA9887=m +CONFIG_MEDIA_TUNER_TEA5761=m +CONFIG_MEDIA_TUNER_TEA5767=m +CONFIG_MEDIA_TUNER_TUA9001=m +CONFIG_MEDIA_TUNER_XC2028=m +CONFIG_MEDIA_TUNER_XC4000=m +CONFIG_MEDIA_TUNER_XC5000=m +# end of Customize TV tuners + +# +# Customise DVB Frontends +# + +# +# Multistandard (satellite) frontends +# +CONFIG_DVB_M88DS3103=m +CONFIG_DVB_MXL5XX=m +CONFIG_DVB_STB0899=m +CONFIG_DVB_STB6100=m +CONFIG_DVB_STV090x=m +CONFIG_DVB_STV0910=m +CONFIG_DVB_STV6110x=m +CONFIG_DVB_STV6111=m + +# +# Multistandard (cable + terrestrial) frontends +# +CONFIG_DVB_DRXK=m +CONFIG_DVB_MN88472=m +CONFIG_DVB_MN88473=m +CONFIG_DVB_SI2165=m +CONFIG_DVB_TDA18271C2DD=m + +# +# DVB-S (satellite) frontends +# +CONFIG_DVB_CX24110=m +CONFIG_DVB_CX24116=m +CONFIG_DVB_CX24117=m +CONFIG_DVB_CX24120=m +CONFIG_DVB_CX24123=m +CONFIG_DVB_DS3000=m +CONFIG_DVB_MB86A16=m +CONFIG_DVB_MT312=m +CONFIG_DVB_S5H1420=m +CONFIG_DVB_SI21XX=m +CONFIG_DVB_STB6000=m +CONFIG_DVB_STV0288=m +CONFIG_DVB_STV0299=m +CONFIG_DVB_STV0900=m +CONFIG_DVB_STV6110=m +CONFIG_DVB_TDA10071=m +CONFIG_DVB_TDA10086=m +CONFIG_DVB_TDA8083=m +CONFIG_DVB_TDA8261=m +CONFIG_DVB_TDA826X=m +CONFIG_DVB_TS2020=m +CONFIG_DVB_TUA6100=m +CONFIG_DVB_TUNER_CX24113=m +CONFIG_DVB_TUNER_ITD1000=m +CONFIG_DVB_VES1X93=m +CONFIG_DVB_ZL10036=m +CONFIG_DVB_ZL10039=m + +# +# DVB-T (terrestrial) frontends +# +CONFIG_DVB_AF9013=m +CONFIG_DVB_AS102_FE=m +CONFIG_DVB_CX22700=m +CONFIG_DVB_CX22702=m +CONFIG_DVB_CXD2820R=m +CONFIG_DVB_CXD2841ER=m +CONFIG_DVB_DIB3000MB=m +CONFIG_DVB_DIB3000MC=m +CONFIG_DVB_DIB7000M=m +CONFIG_DVB_DIB7000P=m +# CONFIG_DVB_DIB9000 is not set +CONFIG_DVB_DRXD=m +CONFIG_DVB_EC100=m +CONFIG_DVB_GP8PSK_FE=m +CONFIG_DVB_L64781=m +CONFIG_DVB_MT352=m +CONFIG_DVB_NXT6000=m +CONFIG_DVB_RTL2830=m +CONFIG_DVB_RTL2832=m +CONFIG_DVB_RTL2832_SDR=m +# CONFIG_DVB_S5H1432 is not set +CONFIG_DVB_SI2168=m +CONFIG_DVB_SP887X=m +CONFIG_DVB_STV0367=m +CONFIG_DVB_TDA10048=m +CONFIG_DVB_TDA1004X=m +CONFIG_DVB_ZD1301_DEMOD=m +CONFIG_DVB_ZL10353=m +# CONFIG_DVB_CXD2880 is not set + +# +# DVB-C (cable) frontends +# +CONFIG_DVB_STV0297=m +CONFIG_DVB_TDA10021=m +CONFIG_DVB_TDA10023=m +CONFIG_DVB_VES1820=m + +# +# ATSC (North American/Korean Terrestrial/Cable DTV) frontends +# +CONFIG_DVB_AU8522=m +CONFIG_DVB_AU8522_DTV=m +CONFIG_DVB_AU8522_V4L=m +CONFIG_DVB_BCM3510=m +CONFIG_DVB_LG2160=m +CONFIG_DVB_LGDT3305=m +CONFIG_DVB_LGDT3306A=m +CONFIG_DVB_LGDT330X=m +CONFIG_DVB_MXL692=m +CONFIG_DVB_NXT200X=m +CONFIG_DVB_OR51132=m +CONFIG_DVB_OR51211=m +CONFIG_DVB_S5H1409=m +CONFIG_DVB_S5H1411=m + +# +# ISDB-T (terrestrial) frontends +# +CONFIG_DVB_DIB8000=m +CONFIG_DVB_MB86A20S=m +CONFIG_DVB_S921=m + +# +# ISDB-S (satellite) & ISDB-T (terrestrial) frontends +# +# CONFIG_DVB_MN88443X is not set +CONFIG_DVB_TC90522=m + +# +# Digital terrestrial only tuners/PLL +# +CONFIG_DVB_PLL=m +CONFIG_DVB_TUNER_DIB0070=m +CONFIG_DVB_TUNER_DIB0090=m + +# +# SEC control devices for DVB-S +# +CONFIG_DVB_A8293=m +CONFIG_DVB_AF9033=m +CONFIG_DVB_ASCOT2E=m +CONFIG_DVB_ATBM8830=m +CONFIG_DVB_HELENE=m +CONFIG_DVB_HORUS3A=m +CONFIG_DVB_ISL6405=m +CONFIG_DVB_ISL6421=m +CONFIG_DVB_ISL6423=m +CONFIG_DVB_IX2505V=m +# CONFIG_DVB_LGS8GL5 is not set +CONFIG_DVB_LGS8GXX=m +CONFIG_DVB_LNBH25=m +# CONFIG_DVB_LNBH29 is not set +CONFIG_DVB_LNBP21=m +CONFIG_DVB_LNBP22=m +CONFIG_DVB_M88RS2000=m +CONFIG_DVB_TDA665x=m +CONFIG_DVB_DRX39XYJ=m + +# +# Common Interface (EN50221) controller drivers +# +CONFIG_DVB_CXD2099=m +CONFIG_DVB_SP2=m +# end of Customise DVB Frontends + +# +# Tools to develop new frontends +# +CONFIG_DVB_DUMMY_FE=m +# end of Media ancillary drivers + +# +# Graphics support +# +CONFIG_APERTURE_HELPERS=y +CONFIG_VIDEO_CMDLINE=y +CONFIG_VIDEO_NOMODESET=y +# CONFIG_AUXDISPLAY is not set +# CONFIG_PANEL is not set +CONFIG_TEGRA_HOST1X_CONTEXT_BUS=y +CONFIG_TEGRA_HOST1X=m +CONFIG_TEGRA_HOST1X_FIREWALL=y +CONFIG_DRM=m +CONFIG_DRM_MIPI_DSI=y +CONFIG_DRM_KMS_HELPER=m +# CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS is not set +# CONFIG_DRM_DEBUG_MODESET_LOCK is not set +CONFIG_DRM_FBDEV_EMULATION=y +CONFIG_DRM_FBDEV_OVERALLOC=100 +# CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM is not set +CONFIG_DRM_LOAD_EDID_FIRMWARE=y +CONFIG_DRM_DP_AUX_BUS=m +CONFIG_DRM_DISPLAY_HELPER=m +CONFIG_DRM_DISPLAY_DP_HELPER=y +CONFIG_DRM_DISPLAY_HDCP_HELPER=y +CONFIG_DRM_DISPLAY_HDMI_HELPER=y +CONFIG_DRM_DP_AUX_CHARDEV=y +# CONFIG_DRM_DP_CEC is not set +CONFIG_DRM_TTM=m +CONFIG_DRM_EXEC=m +CONFIG_DRM_BUDDY=m +CONFIG_DRM_VRAM_HELPER=m +CONFIG_DRM_TTM_HELPER=m +CONFIG_DRM_GEM_DMA_HELPER=m +CONFIG_DRM_GEM_SHMEM_HELPER=m +CONFIG_DRM_SUBALLOC_HELPER=m +CONFIG_DRM_SCHED=m + +# +# I2C encoder or helper chips +# +# CONFIG_DRM_I2C_CH7006 is not set +# CONFIG_DRM_I2C_SIL164 is not set +# CONFIG_DRM_I2C_NXP_TDA998X is not set +# CONFIG_DRM_I2C_NXP_TDA9950 is not set +# end of I2C encoder or helper chips + +# +# ARM devices +# +CONFIG_DRM_HDLCD=m +# CONFIG_DRM_HDLCD_SHOW_UNDERRUN is not set +CONFIG_DRM_MALI_DISPLAY=m +# CONFIG_DRM_KOMEDA is not set +# end of ARM devices + +CONFIG_DRM_RADEON=m +# CONFIG_DRM_RADEON_USERPTR is not set +CONFIG_DRM_AMDGPU=m +CONFIG_DRM_AMDGPU_SI=y +CONFIG_DRM_AMDGPU_CIK=y +# CONFIG_DRM_AMDGPU_USERPTR is not set +# CONFIG_DRM_AMDGPU_WERROR is not set + +# +# ACP (Audio CoProcessor) Configuration +# +# CONFIG_DRM_AMD_ACP is not set +# end of ACP (Audio CoProcessor) Configuration + +# +# Display Engine Configuration +# +CONFIG_DRM_AMD_DC=y +CONFIG_DRM_AMD_DC_FP=y +# CONFIG_DRM_AMD_DC_SI is not set +# CONFIG_DRM_AMD_SECURE_DISPLAY is not set +# end of Display Engine Configuration + +# CONFIG_HSA_AMD is not set +CONFIG_DRM_NOUVEAU=m +CONFIG_NOUVEAU_PLATFORM_DRIVER=y +CONFIG_NOUVEAU_DEBUG=5 +CONFIG_NOUVEAU_DEBUG_DEFAULT=3 +# CONFIG_NOUVEAU_DEBUG_MMU is not set +# CONFIG_NOUVEAU_DEBUG_PUSH is not set +CONFIG_DRM_NOUVEAU_BACKLIGHT=y +CONFIG_DRM_VGEM=m +# CONFIG_DRM_VKMS is not set +CONFIG_DRM_ROCKCHIP=m +CONFIG_ROCKCHIP_VOP=y +# CONFIG_ROCKCHIP_VOP2 is not set +CONFIG_ROCKCHIP_ANALOGIX_DP=y +CONFIG_ROCKCHIP_CDN_DP=y +CONFIG_ROCKCHIP_DW_HDMI=y +CONFIG_ROCKCHIP_DW_MIPI_DSI=y +# CONFIG_ROCKCHIP_INNO_HDMI is not set +# CONFIG_ROCKCHIP_LVDS is not set +# CONFIG_ROCKCHIP_RGB is not set +# CONFIG_ROCKCHIP_RK3066_HDMI is not set +# CONFIG_DRM_VMWGFX is not set +CONFIG_DRM_UDL=m +CONFIG_DRM_AST=m +# CONFIG_DRM_MGAG200 is not set +CONFIG_DRM_SUN4I=m +# CONFIG_DRM_SUN6I_DSI is not set +CONFIG_DRM_SUN8I_DW_HDMI=m +# CONFIG_DRM_SUN8I_MIXER is not set +CONFIG_DRM_QXL=m +CONFIG_DRM_VIRTIO_GPU=m +CONFIG_DRM_VIRTIO_GPU_KMS=y +CONFIG_DRM_MSM=m +CONFIG_DRM_MSM_GPU_STATE=y +# CONFIG_DRM_MSM_GPU_SUDO is not set +CONFIG_DRM_MSM_MDSS=y +CONFIG_DRM_MSM_MDP4=y +CONFIG_DRM_MSM_MDP5=y +CONFIG_DRM_MSM_DPU=y +CONFIG_DRM_MSM_DP=y +CONFIG_DRM_MSM_DSI=y +CONFIG_DRM_MSM_DSI_28NM_PHY=y +CONFIG_DRM_MSM_DSI_20NM_PHY=y +CONFIG_DRM_MSM_DSI_28NM_8960_PHY=y +CONFIG_DRM_MSM_DSI_14NM_PHY=y +CONFIG_DRM_MSM_DSI_10NM_PHY=y +CONFIG_DRM_MSM_DSI_7NM_PHY=y +CONFIG_DRM_MSM_HDMI=y +CONFIG_DRM_MSM_HDMI_HDCP=y +CONFIG_DRM_TEGRA=m +# CONFIG_DRM_TEGRA_DEBUG is not set +CONFIG_DRM_TEGRA_STAGING=y +CONFIG_DRM_PANEL=y + +# +# Display Panels +# +# CONFIG_DRM_PANEL_ABT_Y030XX067A is not set +# CONFIG_DRM_PANEL_ARM_VERSATILE is not set +# CONFIG_DRM_PANEL_ASUS_Z00T_TM5P5_NT35596 is not set +# CONFIG_DRM_PANEL_AUO_A030JTN01 is not set +# CONFIG_DRM_PANEL_BOE_BF060Y8M_AJ0 is not set +# CONFIG_DRM_PANEL_BOE_HIMAX8279D is not set +# CONFIG_DRM_PANEL_BOE_TV101WUM_NL6 is not set +# CONFIG_DRM_PANEL_DSI_CM is not set +# CONFIG_DRM_PANEL_LVDS is not set +CONFIG_DRM_PANEL_SIMPLE=m +# CONFIG_DRM_PANEL_EDP is not set +# CONFIG_DRM_PANEL_EBBG_FT8719 is not set +# CONFIG_DRM_PANEL_ELIDA_KD35T133 is not set +# CONFIG_DRM_PANEL_FEIXIN_K101_IM2BA02 is not set +# CONFIG_DRM_PANEL_FEIYANG_FY07024DI26A30D is not set +# CONFIG_DRM_PANEL_HIMAX_HX8394 is not set +# CONFIG_DRM_PANEL_ILITEK_IL9322 is not set +# CONFIG_DRM_PANEL_ILITEK_ILI9341 is not set +# CONFIG_DRM_PANEL_ILITEK_ILI9881C is not set +# CONFIG_DRM_PANEL_INNOLUX_EJ030NA is not set +# CONFIG_DRM_PANEL_INNOLUX_P079ZCA is not set +# CONFIG_DRM_PANEL_JADARD_JD9365DA_H3 is not set +# CONFIG_DRM_PANEL_JDI_LT070ME05000 is not set +# CONFIG_DRM_PANEL_JDI_R63452 is not set +# CONFIG_DRM_PANEL_KHADAS_TS050 is not set +# CONFIG_DRM_PANEL_KINGDISPLAY_KD097D04 is not set +# CONFIG_DRM_PANEL_LEADTEK_LTK050H3146W is not set +# CONFIG_DRM_PANEL_LEADTEK_LTK500HD1829 is not set +# CONFIG_DRM_PANEL_SAMSUNG_LD9040 is not set +# CONFIG_DRM_PANEL_LG_LB035Q02 is not set +# CONFIG_DRM_PANEL_LG_LG4573 is not set +# CONFIG_DRM_PANEL_MAGNACHIP_D53E6EA8966 is not set +# CONFIG_DRM_PANEL_NEC_NL8048HL11 is not set +# CONFIG_DRM_PANEL_NEWVISION_NV3051D is not set +# CONFIG_DRM_PANEL_NEWVISION_NV3052C is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35510 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35560 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35950 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT36523 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT36672A is not set +# CONFIG_DRM_PANEL_NOVATEK_NT39016 is not set +# CONFIG_DRM_PANEL_MANTIX_MLAF057WE51 is not set +# CONFIG_DRM_PANEL_OLIMEX_LCD_OLINUXINO is not set +# CONFIG_DRM_PANEL_ORISETECH_OTA5601A is not set +# CONFIG_DRM_PANEL_ORISETECH_OTM8009A is not set +# CONFIG_DRM_PANEL_OSD_OSD101T2587_53TS is not set +# CONFIG_DRM_PANEL_PANASONIC_VVX10F034N00 is not set +CONFIG_DRM_PANEL_RASPBERRYPI_TOUCHSCREEN=m +# CONFIG_DRM_PANEL_RAYDIUM_RM67191 is not set +# CONFIG_DRM_PANEL_RAYDIUM_RM68200 is not set +# CONFIG_DRM_PANEL_RONBO_RB070D30 is not set +# CONFIG_DRM_PANEL_SAMSUNG_ATNA33XC20 is not set +# CONFIG_DRM_PANEL_SAMSUNG_DB7430 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D16D0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D27A1 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D7AA0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E3HA2 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E63J0X03 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E63M0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E88A0_AMS452EF01 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E8AA0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_SOFEF00 is not set +# CONFIG_DRM_PANEL_SEIKO_43WVF1G is not set +# CONFIG_DRM_PANEL_SHARP_LQ101R1SX01 is not set +# CONFIG_DRM_PANEL_SHARP_LS037V7DW01 is not set +# CONFIG_DRM_PANEL_SHARP_LS043T1LE01 is not set +# CONFIG_DRM_PANEL_SHARP_LS060T1SX01 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7701 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7703 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7789V is not set +# CONFIG_DRM_PANEL_SONY_ACX565AKM is not set +# CONFIG_DRM_PANEL_SONY_TD4353_JDI is not set +# CONFIG_DRM_PANEL_SONY_TULIP_TRULY_NT35521 is not set +# CONFIG_DRM_PANEL_STARTEK_KD070FHFID015 is not set +# CONFIG_DRM_PANEL_TDO_TL070WSH30 is not set +# CONFIG_DRM_PANEL_TPO_TD028TTEC1 is not set +# CONFIG_DRM_PANEL_TPO_TD043MTEA1 is not set +# CONFIG_DRM_PANEL_TPO_TPG110 is not set +# CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA is not set +# CONFIG_DRM_PANEL_VISIONOX_RM69299 is not set +# CONFIG_DRM_PANEL_VISIONOX_VTDR6130 is not set +# CONFIG_DRM_PANEL_VISIONOX_R66451 is not set +# CONFIG_DRM_PANEL_WIDECHIPS_WS2401 is not set +# CONFIG_DRM_PANEL_XINPENG_XPP055C272 is not set +# end of Display Panels + +CONFIG_DRM_BRIDGE=y +CONFIG_DRM_PANEL_BRIDGE=y + +# +# Display Interface Bridges +# +# CONFIG_DRM_CHIPONE_ICN6211 is not set +# CONFIG_DRM_CHRONTEL_CH7033 is not set +# CONFIG_DRM_CROS_EC_ANX7688 is not set +CONFIG_DRM_DISPLAY_CONNECTOR=m +# CONFIG_DRM_ITE_IT6505 is not set +# CONFIG_DRM_LONTIUM_LT8912B is not set +# CONFIG_DRM_LONTIUM_LT9211 is not set +# CONFIG_DRM_LONTIUM_LT9611 is not set +# CONFIG_DRM_LONTIUM_LT9611UXC is not set +# CONFIG_DRM_ITE_IT66121 is not set +# CONFIG_DRM_LVDS_CODEC is not set +# CONFIG_DRM_MEGACHIPS_STDPXXXX_GE_B850V3_FW is not set +# CONFIG_DRM_NWL_MIPI_DSI is not set +# CONFIG_DRM_NXP_PTN3460 is not set +# CONFIG_DRM_PARADE_PS8622 is not set +# CONFIG_DRM_PARADE_PS8640 is not set +# CONFIG_DRM_SAMSUNG_DSIM is not set +# CONFIG_DRM_SIL_SII8620 is not set +# CONFIG_DRM_SII902X is not set +# CONFIG_DRM_SII9234 is not set +# CONFIG_DRM_SIMPLE_BRIDGE is not set +# CONFIG_DRM_THINE_THC63LVD1024 is not set +# CONFIG_DRM_TOSHIBA_TC358762 is not set +# CONFIG_DRM_TOSHIBA_TC358764 is not set +# CONFIG_DRM_TOSHIBA_TC358767 is not set +# CONFIG_DRM_TOSHIBA_TC358768 is not set +# CONFIG_DRM_TOSHIBA_TC358775 is not set +# CONFIG_DRM_TI_DLPC3433 is not set +# CONFIG_DRM_TI_TFP410 is not set +# CONFIG_DRM_TI_SN65DSI83 is not set +# CONFIG_DRM_TI_SN65DSI86 is not set +# CONFIG_DRM_TI_TPD12S015 is not set +# CONFIG_DRM_ANALOGIX_ANX6345 is not set +# CONFIG_DRM_ANALOGIX_ANX78XX is not set +CONFIG_DRM_ANALOGIX_DP=m +# CONFIG_DRM_ANALOGIX_ANX7625 is not set +CONFIG_DRM_I2C_ADV7511=m +CONFIG_DRM_I2C_ADV7511_AUDIO=y +CONFIG_DRM_I2C_ADV7511_CEC=y +# CONFIG_DRM_CDNS_DSI is not set +# CONFIG_DRM_CDNS_MHDP8546 is not set +CONFIG_DRM_DW_HDMI=m +# CONFIG_DRM_DW_HDMI_AHB_AUDIO is not set +CONFIG_DRM_DW_HDMI_I2S_AUDIO=m +# CONFIG_DRM_DW_HDMI_GP_AUDIO is not set +# CONFIG_DRM_DW_HDMI_CEC is not set +CONFIG_DRM_DW_MIPI_DSI=m +# end of Display Interface Bridges + +# CONFIG_DRM_LOONGSON is not set +# CONFIG_DRM_ETNAVIV is not set +CONFIG_DRM_HISI_HIBMC=m +CONFIG_DRM_HISI_KIRIN=m +# CONFIG_DRM_LOGICVC is not set +CONFIG_DRM_MESON=m +CONFIG_DRM_MESON_DW_HDMI=m +CONFIG_DRM_MESON_DW_MIPI_DSI=m +# CONFIG_DRM_ARCPGU is not set +CONFIG_DRM_BOCHS=m +CONFIG_DRM_CIRRUS_QEMU=m +# CONFIG_DRM_GM12U320 is not set +# CONFIG_DRM_PANEL_MIPI_DBI is not set +# CONFIG_DRM_SIMPLEDRM is not set +# CONFIG_TINYDRM_HX8357D is not set +# CONFIG_TINYDRM_ILI9163 is not set +# CONFIG_TINYDRM_ILI9225 is not set +# CONFIG_TINYDRM_ILI9341 is not set +# CONFIG_TINYDRM_ILI9486 is not set +# CONFIG_TINYDRM_MI0283QT is not set +# CONFIG_TINYDRM_REPAPER is not set +# CONFIG_TINYDRM_ST7586 is not set +# CONFIG_TINYDRM_ST7735R is not set +# CONFIG_DRM_PL111 is not set +# CONFIG_DRM_XEN_FRONTEND is not set +CONFIG_DRM_LIMA=m +CONFIG_DRM_PANFROST=m +# CONFIG_DRM_TIDSS is not set +# CONFIG_DRM_GUD is not set +# CONFIG_DRM_SSD130X is not set +# CONFIG_DRM_HYPERV is not set +CONFIG_DRM_LEGACY=y +CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y + +# +# Frame buffer Devices +# +CONFIG_FB=y +CONFIG_FB_SVGALIB=m +# CONFIG_FB_CIRRUS is not set +# CONFIG_FB_PM2 is not set +CONFIG_FB_ARMCLCD=y +# CONFIG_FB_CYBER2000 is not set +# CONFIG_FB_ASILIANT is not set +# CONFIG_FB_IMSTT is not set +# CONFIG_FB_UVESA is not set +CONFIG_FB_EFI=y +# CONFIG_FB_OPENCORES is not set +# CONFIG_FB_S1D13XXX is not set +# CONFIG_FB_NVIDIA is not set +# CONFIG_FB_RIVA is not set +# CONFIG_FB_I740 is not set +# CONFIG_FB_MATROX is not set +# CONFIG_FB_RADEON is not set +# CONFIG_FB_ATY128 is not set +# CONFIG_FB_ATY is not set +CONFIG_FB_S3=m +CONFIG_FB_S3_DDC=y +# CONFIG_FB_SAVAGE is not set +# CONFIG_FB_SIS is not set +# CONFIG_FB_NEOMAGIC is not set +# CONFIG_FB_KYRO is not set +CONFIG_FB_3DFX=m +# CONFIG_FB_3DFX_ACCEL is not set +CONFIG_FB_3DFX_I2C=y +# CONFIG_FB_VOODOO1 is not set +CONFIG_FB_VT8623=m +# CONFIG_FB_TRIDENT is not set +CONFIG_FB_ARK=m +CONFIG_FB_PM3=m +# CONFIG_FB_CARMINE is not set +CONFIG_FB_SMSCUFX=m +CONFIG_FB_UDL=m +# CONFIG_FB_IBM_GXT4500 is not set +# CONFIG_FB_XILINX is not set +# CONFIG_FB_VIRTUAL is not set +CONFIG_XEN_FBDEV_FRONTEND=y +# CONFIG_FB_METRONOME is not set +CONFIG_FB_MB862XX=m +CONFIG_FB_MB862XX_PCI_GDC=y +CONFIG_FB_MB862XX_I2C=y +CONFIG_FB_HYPERV=m +CONFIG_FB_SIMPLE=y +# CONFIG_FB_SSD1307 is not set +# CONFIG_FB_SM712 is not set +CONFIG_FB_CORE=y +CONFIG_FB_NOTIFY=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_DEVICE=y +CONFIG_FB_DDC=m +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_SYS_FILLRECT=y +CONFIG_FB_SYS_COPYAREA=y +CONFIG_FB_SYS_IMAGEBLIT=y +# CONFIG_FB_FOREIGN_ENDIAN is not set +CONFIG_FB_SYS_FOPS=y +CONFIG_FB_DEFERRED_IO=y +CONFIG_FB_DMAMEM_HELPERS=y +CONFIG_FB_IOMEM_HELPERS=y +CONFIG_FB_SYSMEM_HELPERS=y +CONFIG_FB_SYSMEM_HELPERS_DEFERRED=y +CONFIG_FB_MODE_HELPERS=y +CONFIG_FB_TILEBLITTING=y +# end of Frame buffer Devices + +# +# Backlight & LCD device support +# +# CONFIG_LCD_CLASS_DEVICE is not set +CONFIG_BACKLIGHT_CLASS_DEVICE=y +# CONFIG_BACKLIGHT_KTD253 is not set +# CONFIG_BACKLIGHT_KTZ8866 is not set +CONFIG_BACKLIGHT_PWM=m +# CONFIG_BACKLIGHT_QCOM_WLED is not set +# CONFIG_BACKLIGHT_ADP8860 is not set +# CONFIG_BACKLIGHT_ADP8870 is not set +# CONFIG_BACKLIGHT_LM3630A is not set +# CONFIG_BACKLIGHT_LM3639 is not set +CONFIG_BACKLIGHT_LP855X=m +# CONFIG_BACKLIGHT_GPIO is not set +# CONFIG_BACKLIGHT_LV5207LP is not set +# CONFIG_BACKLIGHT_BD6107 is not set +# CONFIG_BACKLIGHT_ARCXCNN is not set +# CONFIG_BACKLIGHT_LED is not set +# end of Backlight & LCD device support + +CONFIG_VGASTATE=m +CONFIG_VIDEOMODE_HELPERS=y +CONFIG_HDMI=y + +# +# Console display driver support +# +CONFIG_DUMMY_CONSOLE=y +CONFIG_DUMMY_CONSOLE_COLUMNS=80 +CONFIG_DUMMY_CONSOLE_ROWS=25 +CONFIG_FRAMEBUFFER_CONSOLE=y +# CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION is not set +CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y +CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y +# CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER is not set +# end of Console display driver support + +# CONFIG_LOGO is not set +# end of Graphics support + +# CONFIG_DRM_ACCEL is not set +CONFIG_SOUND=m +CONFIG_SOUND_OSS_CORE=y +# CONFIG_SOUND_OSS_CORE_PRECLAIM is not set +CONFIG_SND=m +CONFIG_SND_TIMER=m +CONFIG_SND_PCM=m +CONFIG_SND_PCM_ELD=y +CONFIG_SND_PCM_IEC958=y +CONFIG_SND_DMAENGINE_PCM=m +CONFIG_SND_HWDEP=m +CONFIG_SND_SEQ_DEVICE=m +CONFIG_SND_RAWMIDI=m +CONFIG_SND_JACK=y +CONFIG_SND_JACK_INPUT_DEV=y +CONFIG_SND_OSSEMUL=y +CONFIG_SND_MIXER_OSS=m +CONFIG_SND_PCM_OSS=m +CONFIG_SND_PCM_OSS_PLUGINS=y +CONFIG_SND_PCM_TIMER=y +CONFIG_SND_HRTIMER=m +CONFIG_SND_DYNAMIC_MINORS=y +CONFIG_SND_MAX_CARDS=32 +CONFIG_SND_SUPPORT_OLD_API=y +CONFIG_SND_PROC_FS=y +CONFIG_SND_VERBOSE_PROCFS=y +# CONFIG_SND_VERBOSE_PRINTK is not set +CONFIG_SND_CTL_FAST_LOOKUP=y +# CONFIG_SND_DEBUG is not set +# CONFIG_SND_CTL_INPUT_VALIDATION is not set +CONFIG_SND_VMASTER=y +CONFIG_SND_CTL_LED=m +CONFIG_SND_SEQUENCER=m +CONFIG_SND_SEQ_DUMMY=m +# CONFIG_SND_SEQUENCER_OSS is not set +CONFIG_SND_SEQ_HRTIMER_DEFAULT=y +CONFIG_SND_SEQ_MIDI_EVENT=m +CONFIG_SND_SEQ_MIDI=m +CONFIG_SND_SEQ_MIDI_EMUL=m +# CONFIG_SND_SEQ_UMP is not set +CONFIG_SND_MPU401_UART=m +CONFIG_SND_OPL3_LIB=m +CONFIG_SND_OPL3_LIB_SEQ=m +CONFIG_SND_AC97_CODEC=m +CONFIG_SND_DRIVERS=y +# CONFIG_SND_DUMMY is not set +CONFIG_SND_ALOOP=m +# CONFIG_SND_PCMTEST is not set +# CONFIG_SND_VIRMIDI is not set +# CONFIG_SND_MTPAV is not set +CONFIG_SND_MTS64=m +# CONFIG_SND_SERIAL_U16550 is not set +# CONFIG_SND_SERIAL_GENERIC is not set +# CONFIG_SND_MPU401 is not set +CONFIG_SND_PORTMAN2X4=m +CONFIG_SND_AC97_POWER_SAVE=y +CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0 +CONFIG_SND_PCI=y +CONFIG_SND_AD1889=m +# CONFIG_SND_ALS300 is not set +# CONFIG_SND_ALI5451 is not set +# CONFIG_SND_ATIIXP is not set +# CONFIG_SND_ATIIXP_MODEM is not set +# CONFIG_SND_AU8810 is not set +# CONFIG_SND_AU8820 is not set +# CONFIG_SND_AU8830 is not set +# CONFIG_SND_AW2 is not set +# CONFIG_SND_AZT3328 is not set +# CONFIG_SND_BT87X is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set +CONFIG_SND_OXYGEN_LIB=m +CONFIG_SND_OXYGEN=m +# CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set +CONFIG_SND_CTXFI=m +CONFIG_SND_DARLA20=m +CONFIG_SND_GINA20=m +CONFIG_SND_LAYLA20=m +CONFIG_SND_DARLA24=m +CONFIG_SND_GINA24=m +CONFIG_SND_LAYLA24=m +CONFIG_SND_MONA=m +CONFIG_SND_MIA=m +CONFIG_SND_ECHO3G=m +CONFIG_SND_INDIGO=m +CONFIG_SND_INDIGOIO=m +CONFIG_SND_INDIGODJ=m +CONFIG_SND_INDIGOIOX=m +CONFIG_SND_INDIGODJX=m +# CONFIG_SND_EMU10K1 is not set +# CONFIG_SND_EMU10K1X is not set +# CONFIG_SND_ENS1370 is not set +# CONFIG_SND_ENS1371 is not set +# CONFIG_SND_ES1938 is not set +# CONFIG_SND_ES1968 is not set +# CONFIG_SND_FM801 is not set +# CONFIG_SND_HDSP is not set +CONFIG_SND_HDSPM=m +# CONFIG_SND_ICE1712 is not set +# CONFIG_SND_ICE1724 is not set +# CONFIG_SND_INTEL8X0 is not set +# CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +CONFIG_SND_LOLA=m +CONFIG_SND_LX6464ES=m +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +CONFIG_SND_PCXHR=m +CONFIG_SND_RIPTIDE=m +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set +# CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is not set +# CONFIG_SND_VIA82XX is not set +# CONFIG_SND_VIA82XX_MODEM is not set +CONFIG_SND_VIRTUOSO=m +# CONFIG_SND_VX222 is not set +# CONFIG_SND_YMFPCI is not set + +# +# HD-Audio +# +CONFIG_SND_HDA=m +CONFIG_SND_HDA_GENERIC_LEDS=y +CONFIG_SND_HDA_INTEL=m +CONFIG_SND_HDA_TEGRA=m +CONFIG_SND_HDA_HWDEP=y +CONFIG_SND_HDA_RECONFIG=y +CONFIG_SND_HDA_INPUT_BEEP=y +CONFIG_SND_HDA_INPUT_BEEP_MODE=1 +CONFIG_SND_HDA_PATCH_LOADER=y +# CONFIG_SND_HDA_SCODEC_CS35L41_I2C is not set +# CONFIG_SND_HDA_SCODEC_CS35L41_SPI is not set +# CONFIG_SND_HDA_SCODEC_CS35L56_I2C is not set +# CONFIG_SND_HDA_SCODEC_CS35L56_SPI is not set +# CONFIG_SND_HDA_SCODEC_TAS2781_I2C is not set +CONFIG_SND_HDA_CODEC_REALTEK=m +CONFIG_SND_HDA_CODEC_ANALOG=m +CONFIG_SND_HDA_CODEC_SIGMATEL=m +CONFIG_SND_HDA_CODEC_VIA=m +CONFIG_SND_HDA_CODEC_HDMI=m +CONFIG_SND_HDA_CODEC_CIRRUS=m +# CONFIG_SND_HDA_CODEC_CS8409 is not set +CONFIG_SND_HDA_CODEC_CONEXANT=m +CONFIG_SND_HDA_CODEC_CA0110=m +CONFIG_SND_HDA_CODEC_CA0132=m +CONFIG_SND_HDA_CODEC_CA0132_DSP=y +CONFIG_SND_HDA_CODEC_CMEDIA=m +CONFIG_SND_HDA_CODEC_SI3054=m +CONFIG_SND_HDA_GENERIC=m +CONFIG_SND_HDA_POWER_SAVE_DEFAULT=1 +# CONFIG_SND_HDA_INTEL_HDMI_SILENT_STREAM is not set +# CONFIG_SND_HDA_CTL_DEV_ID is not set +# end of HD-Audio + +CONFIG_SND_HDA_CORE=m +CONFIG_SND_HDA_DSP_LOADER=y +CONFIG_SND_HDA_ALIGNED_MMIO=y +CONFIG_SND_HDA_COMPONENT=y +CONFIG_SND_HDA_PREALLOC_SIZE=2048 +CONFIG_SND_INTEL_NHLT=y +CONFIG_SND_INTEL_DSP_CONFIG=m +CONFIG_SND_INTEL_SOUNDWIRE_ACPI=m +CONFIG_SND_SPI=y +CONFIG_SND_USB=y +CONFIG_SND_USB_AUDIO=m +# CONFIG_SND_USB_AUDIO_MIDI_V2 is not set +CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER=y +CONFIG_SND_USB_UA101=m +CONFIG_SND_USB_CAIAQ=m +CONFIG_SND_USB_CAIAQ_INPUT=y +CONFIG_SND_USB_6FIRE=m +CONFIG_SND_USB_HIFACE=m +CONFIG_SND_BCD2000=m +CONFIG_SND_USB_LINE6=m +CONFIG_SND_USB_POD=m +CONFIG_SND_USB_PODHD=m +CONFIG_SND_USB_TONEPORT=m +CONFIG_SND_USB_VARIAX=m +CONFIG_SND_FIREWIRE=y +CONFIG_SND_FIREWIRE_LIB=m +CONFIG_SND_DICE=m +CONFIG_SND_OXFW=m +CONFIG_SND_ISIGHT=m +CONFIG_SND_FIREWORKS=m +CONFIG_SND_BEBOB=m +CONFIG_SND_FIREWIRE_DIGI00X=m +CONFIG_SND_FIREWIRE_TASCAM=m +CONFIG_SND_FIREWIRE_MOTU=m +CONFIG_SND_FIREFACE=m +CONFIG_SND_SOC=m +CONFIG_SND_SOC_GENERIC_DMAENGINE_PCM=y +# CONFIG_SND_SOC_ADI is not set +# CONFIG_SND_SOC_AMD_ACP is not set +# CONFIG_SND_AMD_ACP_CONFIG is not set +# CONFIG_SND_ATMEL_SOC is not set +# CONFIG_SND_BCM63XX_I2S_WHISTLER is not set +# CONFIG_SND_DESIGNWARE_I2S is not set + +# +# SoC Audio for Freescale CPUs +# + +# +# Common SoC Audio options for Freescale CPUs: +# +# CONFIG_SND_SOC_FSL_ASRC is not set +# CONFIG_SND_SOC_FSL_SAI is not set +# CONFIG_SND_SOC_FSL_AUDMIX is not set +# CONFIG_SND_SOC_FSL_SSI is not set +# CONFIG_SND_SOC_FSL_SPDIF is not set +# CONFIG_SND_SOC_FSL_ESAI is not set +# CONFIG_SND_SOC_FSL_MICFIL is not set +# CONFIG_SND_SOC_FSL_XCVR is not set +# CONFIG_SND_SOC_FSL_RPMSG is not set +# CONFIG_SND_SOC_IMX_AUDMUX is not set +# end of SoC Audio for Freescale CPUs + +# CONFIG_SND_SOC_CHV3_I2S is not set +CONFIG_SND_I2S_HI6210_I2S=m +# CONFIG_SND_KIRKWOOD_SOC is not set +# CONFIG_SND_SOC_IMG is not set +# CONFIG_SND_SOC_MTK_BTCVSD is not set + +# +# ASoC support for Amlogic platforms +# +# CONFIG_SND_MESON_AIU is not set +# CONFIG_SND_MESON_AXG_FRDDR is not set +# CONFIG_SND_MESON_AXG_TODDR is not set +# CONFIG_SND_MESON_AXG_TDMIN is not set +# CONFIG_SND_MESON_AXG_TDMOUT is not set +# CONFIG_SND_MESON_AXG_SOUND_CARD is not set +# CONFIG_SND_MESON_AXG_SPDIFOUT is not set +# CONFIG_SND_MESON_AXG_SPDIFIN is not set +# CONFIG_SND_MESON_AXG_PDM is not set +# CONFIG_SND_MESON_GX_SOUND_CARD is not set +# CONFIG_SND_MESON_G12A_TOACODEC is not set +# CONFIG_SND_MESON_G12A_TOHDMITX is not set +# CONFIG_SND_SOC_MESON_T9015 is not set +# end of ASoC support for Amlogic platforms + +CONFIG_SND_SOC_QCOM=m +CONFIG_SND_SOC_LPASS_CPU=m +CONFIG_SND_SOC_LPASS_PLATFORM=m +CONFIG_SND_SOC_LPASS_APQ8016=m +# CONFIG_SND_SOC_STORM is not set +CONFIG_SND_SOC_APQ8016_SBC=m +CONFIG_SND_SOC_QCOM_COMMON=m +# CONFIG_SND_SOC_SC7180 is not set +CONFIG_SND_SOC_ROCKCHIP=m +CONFIG_SND_SOC_ROCKCHIP_I2S=m +# CONFIG_SND_SOC_ROCKCHIP_I2S_TDM is not set +# CONFIG_SND_SOC_ROCKCHIP_PDM is not set +CONFIG_SND_SOC_ROCKCHIP_SPDIF=m +# CONFIG_SND_SOC_ROCKCHIP_MAX98090 is not set +CONFIG_SND_SOC_ROCKCHIP_RT5645=m +# CONFIG_SND_SOC_RK3288_HDMI_ANALOG is not set +CONFIG_SND_SOC_RK3399_GRU_SOUND=m +# CONFIG_SND_SOC_SOF_TOPLEVEL is not set + +# +# STMicroelectronics STM32 SOC audio support +# +# end of STMicroelectronics STM32 SOC audio support + +# +# Allwinner SoC Audio support +# +# CONFIG_SND_SUN4I_CODEC is not set +CONFIG_SND_SUN8I_CODEC=m +# CONFIG_SND_SUN8I_CODEC_ANALOG is not set +CONFIG_SND_SUN50I_CODEC_ANALOG=m +CONFIG_SND_SUN4I_I2S=m +# CONFIG_SND_SUN4I_SPDIF is not set +# CONFIG_SND_SUN50I_DMIC is not set +CONFIG_SND_SUN8I_ADDA_PR_REGMAP=m +# end of Allwinner SoC Audio support + +CONFIG_SND_SOC_TEGRA=m +# CONFIG_SND_SOC_TEGRA20_AC97 is not set +# CONFIG_SND_SOC_TEGRA20_DAS is not set +# CONFIG_SND_SOC_TEGRA20_I2S is not set +CONFIG_SND_SOC_TEGRA20_SPDIF=m +# CONFIG_SND_SOC_TEGRA30_AHUB is not set +# CONFIG_SND_SOC_TEGRA30_I2S is not set +# CONFIG_SND_SOC_TEGRA210_AHUB is not set +# CONFIG_SND_SOC_TEGRA210_DMIC is not set +# CONFIG_SND_SOC_TEGRA210_I2S is not set +# CONFIG_SND_SOC_TEGRA210_OPE is not set +# CONFIG_SND_SOC_TEGRA186_ASRC is not set +# CONFIG_SND_SOC_TEGRA186_DSPK is not set +# CONFIG_SND_SOC_TEGRA210_ADMAIF is not set +# CONFIG_SND_SOC_TEGRA210_MVC is not set +# CONFIG_SND_SOC_TEGRA210_SFC is not set +# CONFIG_SND_SOC_TEGRA210_AMX is not set +# CONFIG_SND_SOC_TEGRA210_ADX is not set +# CONFIG_SND_SOC_TEGRA210_MIXER is not set +# CONFIG_SND_SOC_TEGRA_AUDIO_GRAPH_CARD is not set +CONFIG_SND_SOC_TEGRA_MACHINE_DRV=m +# CONFIG_SND_SOC_TEGRA_RT5631 is not set +CONFIG_SND_SOC_TEGRA_RT5640=m +CONFIG_SND_SOC_TEGRA_WM8753=m +CONFIG_SND_SOC_TEGRA_WM8903=m +# CONFIG_SND_SOC_TEGRA_WM9712 is not set +CONFIG_SND_SOC_TEGRA_TRIMSLICE=m +CONFIG_SND_SOC_TEGRA_ALC5632=m +CONFIG_SND_SOC_TEGRA_MAX98090=m +# CONFIG_SND_SOC_TEGRA_MAX98088 is not set +CONFIG_SND_SOC_TEGRA_RT5677=m +# CONFIG_SND_SOC_TEGRA_SGTL5000 is not set +# CONFIG_SND_SOC_XILINX_I2S is not set +# CONFIG_SND_SOC_XILINX_AUDIO_FORMATTER is not set +# CONFIG_SND_SOC_XILINX_SPDIF is not set +# CONFIG_SND_SOC_XTFPGA_I2S is not set +CONFIG_SND_SOC_I2C_AND_SPI=m + +# +# CODEC drivers +# +# CONFIG_SND_SOC_AC97_CODEC is not set +# CONFIG_SND_SOC_ADAU1372_I2C is not set +# CONFIG_SND_SOC_ADAU1372_SPI is not set +# CONFIG_SND_SOC_ADAU1701 is not set +# CONFIG_SND_SOC_ADAU1761_I2C is not set +# CONFIG_SND_SOC_ADAU1761_SPI is not set +# CONFIG_SND_SOC_ADAU7002 is not set +# CONFIG_SND_SOC_ADAU7118_HW is not set +# CONFIG_SND_SOC_ADAU7118_I2C is not set +# CONFIG_SND_SOC_AK4104 is not set +# CONFIG_SND_SOC_AK4118 is not set +# CONFIG_SND_SOC_AK4375 is not set +# CONFIG_SND_SOC_AK4458 is not set +# CONFIG_SND_SOC_AK4554 is not set +# CONFIG_SND_SOC_AK4613 is not set +# CONFIG_SND_SOC_AK4642 is not set +# CONFIG_SND_SOC_AK5386 is not set +# CONFIG_SND_SOC_AK5558 is not set +# CONFIG_SND_SOC_ALC5623 is not set +CONFIG_SND_SOC_ALC5632=m +# CONFIG_SND_SOC_AUDIO_IIO_AUX is not set +# CONFIG_SND_SOC_AW8738 is not set +# CONFIG_SND_SOC_AW88395 is not set +# CONFIG_SND_SOC_AW88261 is not set +# CONFIG_SND_SOC_BD28623 is not set +# CONFIG_SND_SOC_BT_SCO is not set +# CONFIG_SND_SOC_CHV3_CODEC is not set +# CONFIG_SND_SOC_CROS_EC_CODEC is not set +# CONFIG_SND_SOC_CS35L32 is not set +# CONFIG_SND_SOC_CS35L33 is not set +# CONFIG_SND_SOC_CS35L34 is not set +# CONFIG_SND_SOC_CS35L35 is not set +# CONFIG_SND_SOC_CS35L36 is not set +# CONFIG_SND_SOC_CS35L41_SPI is not set +# CONFIG_SND_SOC_CS35L41_I2C is not set +# CONFIG_SND_SOC_CS35L45_SPI is not set +# CONFIG_SND_SOC_CS35L45_I2C is not set +# CONFIG_SND_SOC_CS35L56_I2C is not set +# CONFIG_SND_SOC_CS35L56_SPI is not set +# CONFIG_SND_SOC_CS42L42 is not set +# CONFIG_SND_SOC_CS42L51_I2C is not set +# CONFIG_SND_SOC_CS42L52 is not set +# CONFIG_SND_SOC_CS42L56 is not set +# CONFIG_SND_SOC_CS42L73 is not set +# CONFIG_SND_SOC_CS42L83 is not set +# CONFIG_SND_SOC_CS4234 is not set +# CONFIG_SND_SOC_CS4265 is not set +# CONFIG_SND_SOC_CS4270 is not set +# CONFIG_SND_SOC_CS4271_I2C is not set +# CONFIG_SND_SOC_CS4271_SPI is not set +# CONFIG_SND_SOC_CS42XX8_I2C is not set +# CONFIG_SND_SOC_CS43130 is not set +# CONFIG_SND_SOC_CS4341 is not set +# CONFIG_SND_SOC_CS4349 is not set +# CONFIG_SND_SOC_CS53L30 is not set +# CONFIG_SND_SOC_CX2072X is not set +# CONFIG_SND_SOC_DA7213 is not set +CONFIG_SND_SOC_DA7219=m +CONFIG_SND_SOC_DMIC=m +CONFIG_SND_SOC_HDMI_CODEC=m +# CONFIG_SND_SOC_ES7134 is not set +# CONFIG_SND_SOC_ES7241 is not set +# CONFIG_SND_SOC_ES8316 is not set +# CONFIG_SND_SOC_ES8326 is not set +# CONFIG_SND_SOC_ES8328_I2C is not set +# CONFIG_SND_SOC_ES8328_SPI is not set +# CONFIG_SND_SOC_GTM601 is not set +# CONFIG_SND_SOC_HDA is not set +# CONFIG_SND_SOC_ICS43432 is not set +# CONFIG_SND_SOC_IDT821034 is not set +# CONFIG_SND_SOC_INNO_RK3036 is not set +# CONFIG_SND_SOC_MAX98088 is not set +CONFIG_SND_SOC_MAX98090=m +CONFIG_SND_SOC_MAX98357A=m +# CONFIG_SND_SOC_MAX98504 is not set +# CONFIG_SND_SOC_MAX9867 is not set +# CONFIG_SND_SOC_MAX98927 is not set +# CONFIG_SND_SOC_MAX98520 is not set +# CONFIG_SND_SOC_MAX98373_I2C is not set +# CONFIG_SND_SOC_MAX98388 is not set +# CONFIG_SND_SOC_MAX98390 is not set +# CONFIG_SND_SOC_MAX98396 is not set +# CONFIG_SND_SOC_MAX9860 is not set +# CONFIG_SND_SOC_MSM8916_WCD_ANALOG is not set +# CONFIG_SND_SOC_MSM8916_WCD_DIGITAL is not set +# CONFIG_SND_SOC_PCM1681 is not set +# CONFIG_SND_SOC_PCM1789_I2C is not set +# CONFIG_SND_SOC_PCM179X_I2C is not set +# CONFIG_SND_SOC_PCM179X_SPI is not set +# CONFIG_SND_SOC_PCM186X_I2C is not set +# CONFIG_SND_SOC_PCM186X_SPI is not set +# CONFIG_SND_SOC_PCM3060_I2C is not set +# CONFIG_SND_SOC_PCM3060_SPI is not set +# CONFIG_SND_SOC_PCM3168A_I2C is not set +# CONFIG_SND_SOC_PCM3168A_SPI is not set +# CONFIG_SND_SOC_PCM5102A is not set +# CONFIG_SND_SOC_PCM512x_I2C is not set +# CONFIG_SND_SOC_PCM512x_SPI is not set +# CONFIG_SND_SOC_PEB2466 is not set +# CONFIG_SND_SOC_RK3328 is not set +CONFIG_SND_SOC_RL6231=m +CONFIG_SND_SOC_RT5514=m +CONFIG_SND_SOC_RT5514_SPI=m +# CONFIG_SND_SOC_RT5616 is not set +# CONFIG_SND_SOC_RT5631 is not set +CONFIG_SND_SOC_RT5640=m +CONFIG_SND_SOC_RT5645=m +# CONFIG_SND_SOC_RT5659 is not set +CONFIG_SND_SOC_RT5677=m +CONFIG_SND_SOC_RT5677_SPI=m +# CONFIG_SND_SOC_RT9120 is not set +# CONFIG_SND_SOC_SGTL5000 is not set +CONFIG_SND_SOC_SIMPLE_AMPLIFIER=m +# CONFIG_SND_SOC_SIMPLE_MUX is not set +# CONFIG_SND_SOC_SMA1303 is not set +# CONFIG_SND_SOC_SPDIF is not set +# CONFIG_SND_SOC_SRC4XXX_I2C is not set +# CONFIG_SND_SOC_SSM2305 is not set +# CONFIG_SND_SOC_SSM2518 is not set +# CONFIG_SND_SOC_SSM2602_SPI is not set +# CONFIG_SND_SOC_SSM2602_I2C is not set +# CONFIG_SND_SOC_SSM3515 is not set +# CONFIG_SND_SOC_SSM4567 is not set +# CONFIG_SND_SOC_STA32X is not set +# CONFIG_SND_SOC_STA350 is not set +# CONFIG_SND_SOC_STI_SAS is not set +# CONFIG_SND_SOC_TAS2552 is not set +# CONFIG_SND_SOC_TAS2562 is not set +# CONFIG_SND_SOC_TAS2764 is not set +# CONFIG_SND_SOC_TAS2770 is not set +# CONFIG_SND_SOC_TAS2780 is not set +# CONFIG_SND_SOC_TAS2781_I2C is not set +# CONFIG_SND_SOC_TAS5086 is not set +# CONFIG_SND_SOC_TAS571X is not set +# CONFIG_SND_SOC_TAS5720 is not set +# CONFIG_SND_SOC_TAS5805M is not set +# CONFIG_SND_SOC_TAS6424 is not set +# CONFIG_SND_SOC_TDA7419 is not set +# CONFIG_SND_SOC_TFA9879 is not set +# CONFIG_SND_SOC_TFA989X is not set +# CONFIG_SND_SOC_TLV320ADC3XXX is not set +CONFIG_SND_SOC_TLV320AIC23=m +CONFIG_SND_SOC_TLV320AIC23_I2C=m +# CONFIG_SND_SOC_TLV320AIC23_SPI is not set +# CONFIG_SND_SOC_TLV320AIC31XX is not set +# CONFIG_SND_SOC_TLV320AIC32X4_I2C is not set +# CONFIG_SND_SOC_TLV320AIC32X4_SPI is not set +# CONFIG_SND_SOC_TLV320AIC3X_I2C is not set +# CONFIG_SND_SOC_TLV320AIC3X_SPI is not set +# CONFIG_SND_SOC_TLV320ADCX140 is not set +# CONFIG_SND_SOC_TS3A227E is not set +# CONFIG_SND_SOC_TSCS42XX is not set +# CONFIG_SND_SOC_TSCS454 is not set +# CONFIG_SND_SOC_UDA1334 is not set +# CONFIG_SND_SOC_WM8510 is not set +# CONFIG_SND_SOC_WM8523 is not set +# CONFIG_SND_SOC_WM8524 is not set +# CONFIG_SND_SOC_WM8580 is not set +# CONFIG_SND_SOC_WM8711 is not set +# CONFIG_SND_SOC_WM8728 is not set +# CONFIG_SND_SOC_WM8731_I2C is not set +# CONFIG_SND_SOC_WM8731_SPI is not set +# CONFIG_SND_SOC_WM8737 is not set +# CONFIG_SND_SOC_WM8741 is not set +# CONFIG_SND_SOC_WM8750 is not set +CONFIG_SND_SOC_WM8753=m +# CONFIG_SND_SOC_WM8770 is not set +# CONFIG_SND_SOC_WM8776 is not set +# CONFIG_SND_SOC_WM8782 is not set +# CONFIG_SND_SOC_WM8804_I2C is not set +# CONFIG_SND_SOC_WM8804_SPI is not set +CONFIG_SND_SOC_WM8903=m +# CONFIG_SND_SOC_WM8904 is not set +# CONFIG_SND_SOC_WM8940 is not set +# CONFIG_SND_SOC_WM8960 is not set +# CONFIG_SND_SOC_WM8961 is not set +# CONFIG_SND_SOC_WM8962 is not set +# CONFIG_SND_SOC_WM8974 is not set +# CONFIG_SND_SOC_WM8978 is not set +# CONFIG_SND_SOC_WM8985 is not set +# CONFIG_SND_SOC_ZL38060 is not set +# CONFIG_SND_SOC_MAX9759 is not set +# CONFIG_SND_SOC_MT6351 is not set +# CONFIG_SND_SOC_MT6358 is not set +# CONFIG_SND_SOC_MT6660 is not set +# CONFIG_SND_SOC_NAU8315 is not set +# CONFIG_SND_SOC_NAU8540 is not set +# CONFIG_SND_SOC_NAU8810 is not set +# CONFIG_SND_SOC_NAU8821 is not set +# CONFIG_SND_SOC_NAU8822 is not set +# CONFIG_SND_SOC_NAU8824 is not set +# CONFIG_SND_SOC_TPA6130A2 is not set +# CONFIG_SND_SOC_LPASS_WSA_MACRO is not set +# CONFIG_SND_SOC_LPASS_VA_MACRO is not set +# CONFIG_SND_SOC_LPASS_RX_MACRO is not set +# CONFIG_SND_SOC_LPASS_TX_MACRO is not set +# end of CODEC drivers + +CONFIG_SND_SIMPLE_CARD_UTILS=m +CONFIG_SND_SIMPLE_CARD=m +CONFIG_SND_AUDIO_GRAPH_CARD=m +# CONFIG_SND_AUDIO_GRAPH_CARD2 is not set +# CONFIG_SND_TEST_COMPONENT is not set +CONFIG_SND_XEN_FRONTEND=m +# CONFIG_SND_VIRTIO is not set +CONFIG_AC97_BUS=m +CONFIG_HID_SUPPORT=y +CONFIG_HID=m +CONFIG_HID_BATTERY_STRENGTH=y +CONFIG_HIDRAW=y +CONFIG_UHID=m +CONFIG_HID_GENERIC=m + +# +# Special HID drivers +# +CONFIG_HID_A4TECH=m +CONFIG_HID_ACCUTOUCH=m +CONFIG_HID_ACRUX=m +CONFIG_HID_ACRUX_FF=y +CONFIG_HID_APPLE=m +# CONFIG_HID_APPLEIR is not set +CONFIG_HID_ASUS=m +CONFIG_HID_AUREAL=m +CONFIG_HID_BELKIN=m +CONFIG_HID_BETOP_FF=m +CONFIG_HID_BIGBEN_FF=m +CONFIG_HID_CHERRY=m +CONFIG_HID_CHICONY=m +CONFIG_HID_CORSAIR=m +CONFIG_HID_COUGAR=m +CONFIG_HID_MACALLY=m +CONFIG_HID_PRODIKEYS=m +CONFIG_HID_CMEDIA=m +CONFIG_HID_CP2112=m +# CONFIG_HID_CREATIVE_SB0540 is not set +CONFIG_HID_CYPRESS=m +CONFIG_HID_DRAGONRISE=m +CONFIG_DRAGONRISE_FF=y +CONFIG_HID_EMS_FF=m +CONFIG_HID_ELAN=m +CONFIG_HID_ELECOM=m +CONFIG_HID_ELO=m +# CONFIG_HID_EVISION is not set +CONFIG_HID_EZKEY=m +# CONFIG_HID_FT260 is not set +CONFIG_HID_GEMBIRD=m +CONFIG_HID_GFRM=m +# CONFIG_HID_GLORIOUS is not set +CONFIG_HID_HOLTEK=m +CONFIG_HOLTEK_FF=y +# CONFIG_HID_GOOGLE_HAMMER is not set +# CONFIG_HID_GOOGLE_STADIA_FF is not set +# CONFIG_HID_VIVALDI is not set +CONFIG_HID_GT683R=m +CONFIG_HID_KEYTOUCH=m +CONFIG_HID_KYE=m +CONFIG_HID_UCLOGIC=m +CONFIG_HID_WALTOP=m +CONFIG_HID_VIEWSONIC=m +# CONFIG_HID_VRC2 is not set +# CONFIG_HID_XIAOMI is not set +CONFIG_HID_GYRATION=m +CONFIG_HID_ICADE=m +CONFIG_HID_ITE=m +CONFIG_HID_JABRA=m +CONFIG_HID_TWINHAN=m +CONFIG_HID_KENSINGTON=m +CONFIG_HID_LCPOWER=m +CONFIG_HID_LED=m +CONFIG_HID_LENOVO=m +# CONFIG_HID_LETSKETCH is not set +CONFIG_HID_LOGITECH=m +CONFIG_HID_LOGITECH_DJ=m +CONFIG_HID_LOGITECH_HIDPP=m +CONFIG_LOGITECH_FF=y +CONFIG_LOGIRUMBLEPAD2_FF=y +CONFIG_LOGIG940_FF=y +CONFIG_LOGIWHEELS_FF=y +CONFIG_HID_MAGICMOUSE=m +CONFIG_HID_MALTRON=m +CONFIG_HID_MAYFLASH=m +# CONFIG_HID_MEGAWORLD_FF is not set +CONFIG_HID_REDRAGON=m +CONFIG_HID_MICROSOFT=m +CONFIG_HID_MONTEREY=m +CONFIG_HID_MULTITOUCH=m +# CONFIG_HID_NINTENDO is not set +CONFIG_HID_NTI=m +CONFIG_HID_NTRIG=m +# CONFIG_HID_NVIDIA_SHIELD is not set +CONFIG_HID_ORTEK=m +CONFIG_HID_PANTHERLORD=m +CONFIG_PANTHERLORD_FF=y +CONFIG_HID_PENMOUNT=m +CONFIG_HID_PETALYNX=m +CONFIG_HID_PICOLCD=m +CONFIG_HID_PICOLCD_FB=y +CONFIG_HID_PICOLCD_BACKLIGHT=y +CONFIG_HID_PICOLCD_LEDS=y +CONFIG_HID_PICOLCD_CIR=y +CONFIG_HID_PLANTRONICS=m +# CONFIG_HID_PXRC is not set +# CONFIG_HID_RAZER is not set +CONFIG_HID_PRIMAX=m +CONFIG_HID_RETRODE=m +CONFIG_HID_ROCCAT=m +CONFIG_HID_SAITEK=m +CONFIG_HID_SAMSUNG=m +# CONFIG_HID_SEMITEK is not set +# CONFIG_HID_SIGMAMICRO is not set +CONFIG_HID_SONY=m +CONFIG_SONY_FF=y +CONFIG_HID_SPEEDLINK=m +CONFIG_HID_STEAM=m +# CONFIG_STEAM_FF is not set +CONFIG_HID_STEELSERIES=m +CONFIG_HID_SUNPLUS=m +CONFIG_HID_RMI=m +CONFIG_HID_GREENASIA=m +CONFIG_GREENASIA_FF=y +# CONFIG_HID_HYPERV_MOUSE is not set +CONFIG_HID_SMARTJOYPLUS=m +CONFIG_SMARTJOYPLUS_FF=y +CONFIG_HID_TIVO=m +CONFIG_HID_TOPSEED=m +# CONFIG_HID_TOPRE is not set +CONFIG_HID_THINGM=m +CONFIG_HID_THRUSTMASTER=m +CONFIG_THRUSTMASTER_FF=y +CONFIG_HID_UDRAW_PS3=m +CONFIG_HID_U2FZERO=m +CONFIG_HID_WACOM=m +CONFIG_HID_WIIMOTE=m +CONFIG_HID_XINMO=m +CONFIG_HID_ZEROPLUS=m +CONFIG_ZEROPLUS_FF=y +CONFIG_HID_ZYDACRON=m +CONFIG_HID_SENSOR_HUB=m +CONFIG_HID_SENSOR_CUSTOM_SENSOR=m +CONFIG_HID_ALPS=m +# CONFIG_HID_MCP2221 is not set +# end of Special HID drivers + +# +# HID-BPF support +# +# CONFIG_HID_BPF is not set +# end of HID-BPF support + +# +# USB HID support +# +CONFIG_USB_HID=m +CONFIG_HID_PID=y +CONFIG_USB_HIDDEV=y + +# +# USB HID Boot Protocol drivers +# +# CONFIG_USB_KBD is not set +# CONFIG_USB_MOUSE is not set +# end of USB HID Boot Protocol drivers +# end of USB HID support + +CONFIG_I2C_HID=m +# CONFIG_I2C_HID_ACPI is not set +# CONFIG_I2C_HID_OF is not set +# CONFIG_I2C_HID_OF_ELAN is not set +# CONFIG_I2C_HID_OF_GOODIX is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y +CONFIG_USB_SUPPORT=y +CONFIG_USB_COMMON=m +CONFIG_USB_LED_TRIG=y +CONFIG_USB_ULPI_BUS=m +CONFIG_USB_CONN_GPIO=m +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB=m +CONFIG_USB_PCI=y +CONFIG_USB_ANNOUNCE_NEW_DEVICES=y + +# +# Miscellaneous USB options +# +CONFIG_USB_DEFAULT_PERSIST=y +# CONFIG_USB_FEW_INIT_RETRIES is not set +CONFIG_USB_DYNAMIC_MINORS=y +# CONFIG_USB_OTG is not set +# CONFIG_USB_OTG_PRODUCTLIST is not set +# CONFIG_USB_OTG_DISABLE_EXTERNAL_HUB is not set +CONFIG_USB_LEDS_TRIGGER_USBPORT=m +CONFIG_USB_AUTOSUSPEND_DELAY=2 +CONFIG_USB_MON=m + +# +# USB Host Controller Drivers +# +# CONFIG_USB_C67X00_HCD is not set +CONFIG_USB_XHCI_HCD=m +# CONFIG_USB_XHCI_DBGCAP is not set +CONFIG_USB_XHCI_PCI=m +CONFIG_USB_XHCI_PCI_RENESAS=m +CONFIG_USB_XHCI_PLATFORM=m +# CONFIG_USB_XHCI_HISTB is not set +# CONFIG_USB_XHCI_MVEBU is not set +CONFIG_USB_XHCI_TEGRA=m +CONFIG_USB_EHCI_HCD=m +CONFIG_USB_EHCI_ROOT_HUB_TT=y +CONFIG_USB_EHCI_TT_NEWSCHED=y +CONFIG_USB_EHCI_PCI=m +# CONFIG_USB_EHCI_FSL is not set +CONFIG_USB_EHCI_HCD_ORION=m +CONFIG_USB_EHCI_TEGRA=m +CONFIG_USB_EHCI_HCD_PLATFORM=m +# CONFIG_USB_OXU210HP_HCD is not set +# CONFIG_USB_ISP116X_HCD is not set +# CONFIG_USB_MAX3421_HCD is not set +CONFIG_USB_OHCI_HCD=m +CONFIG_USB_OHCI_HCD_PCI=m +# CONFIG_USB_OHCI_HCD_SSB is not set +CONFIG_USB_OHCI_HCD_PLATFORM=m +# CONFIG_USB_UHCI_HCD is not set +# CONFIG_USB_SL811_HCD is not set +# CONFIG_USB_R8A66597_HCD is not set +# CONFIG_USB_HCD_BCMA is not set +# CONFIG_USB_HCD_SSB is not set +# CONFIG_USB_HCD_TEST_MODE is not set +# CONFIG_USB_XEN_HCD is not set + +# +# USB Device Class drivers +# +CONFIG_USB_ACM=m +CONFIG_USB_PRINTER=m +CONFIG_USB_WDM=m +CONFIG_USB_TMC=m + +# +# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may +# + +# +# also be needed; see USB_STORAGE Help for more info +# +CONFIG_USB_STORAGE=m +# CONFIG_USB_STORAGE_DEBUG is not set +CONFIG_USB_STORAGE_REALTEK=m +CONFIG_REALTEK_AUTOPM=y +CONFIG_USB_STORAGE_DATAFAB=m +CONFIG_USB_STORAGE_FREECOM=m +CONFIG_USB_STORAGE_ISD200=m +CONFIG_USB_STORAGE_USBAT=m +CONFIG_USB_STORAGE_SDDR09=m +CONFIG_USB_STORAGE_SDDR55=m +CONFIG_USB_STORAGE_JUMPSHOT=m +CONFIG_USB_STORAGE_ALAUDA=m +CONFIG_USB_STORAGE_ONETOUCH=m +CONFIG_USB_STORAGE_KARMA=m +CONFIG_USB_STORAGE_CYPRESS_ATACB=m +CONFIG_USB_STORAGE_ENE_UB6250=m +CONFIG_USB_UAS=m + +# +# USB Imaging devices +# +CONFIG_USB_MDC800=m +CONFIG_USB_MICROTEK=m +CONFIG_USBIP_CORE=m +CONFIG_USBIP_VHCI_HCD=m +CONFIG_USBIP_VHCI_HC_PORTS=15 +CONFIG_USBIP_VHCI_NR_HCS=8 +CONFIG_USBIP_HOST=m +CONFIG_USBIP_VUDC=m +# CONFIG_USBIP_DEBUG is not set + +# +# USB dual-mode controller drivers +# +# CONFIG_USB_CDNS_SUPPORT is not set +CONFIG_USB_MUSB_HDRC=m +# CONFIG_USB_MUSB_HOST is not set +# CONFIG_USB_MUSB_GADGET is not set +CONFIG_USB_MUSB_DUAL_ROLE=y + +# +# Platform Glue Layer +# +CONFIG_USB_MUSB_SUNXI=m + +# +# MUSB DMA mode +# +# CONFIG_MUSB_PIO_ONLY is not set +CONFIG_USB_DWC3=m +CONFIG_USB_DWC3_ULPI=y +# CONFIG_USB_DWC3_HOST is not set +# CONFIG_USB_DWC3_GADGET is not set +CONFIG_USB_DWC3_DUAL_ROLE=y + +# +# Platform Glue Driver Support +# +CONFIG_USB_DWC3_PCI=m +CONFIG_USB_DWC3_HAPS=m +CONFIG_USB_DWC3_MESON_G12A=m +CONFIG_USB_DWC3_OF_SIMPLE=m +CONFIG_USB_DWC3_QCOM=m +CONFIG_USB_DWC3_XILINX=m +CONFIG_USB_DWC2=m +# CONFIG_USB_DWC2_HOST is not set + +# +# Gadget/Dual-role mode requires USB Gadget support to be enabled +# +# CONFIG_USB_DWC2_PERIPHERAL is not set +CONFIG_USB_DWC2_DUAL_ROLE=y +# CONFIG_USB_DWC2_PCI is not set +# CONFIG_USB_DWC2_DEBUG is not set +# CONFIG_USB_DWC2_TRACK_MISSED_SOFS is not set +CONFIG_USB_CHIPIDEA=m +CONFIG_USB_CHIPIDEA_UDC=y +CONFIG_USB_CHIPIDEA_HOST=y +CONFIG_USB_CHIPIDEA_PCI=m +CONFIG_USB_CHIPIDEA_MSM=m +CONFIG_USB_CHIPIDEA_IMX=m +CONFIG_USB_CHIPIDEA_GENERIC=m +CONFIG_USB_CHIPIDEA_TEGRA=m +CONFIG_USB_ISP1760=m +CONFIG_USB_ISP1760_HCD=y +CONFIG_USB_ISP1761_UDC=y +# CONFIG_USB_ISP1760_HOST_ROLE is not set +# CONFIG_USB_ISP1760_GADGET_ROLE is not set +CONFIG_USB_ISP1760_DUAL_ROLE=y + +# +# USB port drivers +# +CONFIG_USB_SERIAL=m +CONFIG_USB_SERIAL_GENERIC=y +CONFIG_USB_SERIAL_SIMPLE=m +CONFIG_USB_SERIAL_AIRCABLE=m +CONFIG_USB_SERIAL_ARK3116=m +CONFIG_USB_SERIAL_BELKIN=m +CONFIG_USB_SERIAL_CH341=m +CONFIG_USB_SERIAL_WHITEHEAT=m +CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m +CONFIG_USB_SERIAL_CP210X=m +CONFIG_USB_SERIAL_CYPRESS_M8=m +CONFIG_USB_SERIAL_EMPEG=m +CONFIG_USB_SERIAL_FTDI_SIO=m +CONFIG_USB_SERIAL_VISOR=m +CONFIG_USB_SERIAL_IPAQ=m +CONFIG_USB_SERIAL_IR=m +CONFIG_USB_SERIAL_EDGEPORT=m +CONFIG_USB_SERIAL_EDGEPORT_TI=m +CONFIG_USB_SERIAL_F81232=m +CONFIG_USB_SERIAL_F8153X=m +CONFIG_USB_SERIAL_GARMIN=m +CONFIG_USB_SERIAL_IPW=m +CONFIG_USB_SERIAL_IUU=m +CONFIG_USB_SERIAL_KEYSPAN_PDA=m +CONFIG_USB_SERIAL_KEYSPAN=m +CONFIG_USB_SERIAL_KLSI=m +CONFIG_USB_SERIAL_KOBIL_SCT=m +CONFIG_USB_SERIAL_MCT_U232=m +CONFIG_USB_SERIAL_METRO=m +CONFIG_USB_SERIAL_MOS7720=m +CONFIG_USB_SERIAL_MOS7715_PARPORT=y +CONFIG_USB_SERIAL_MOS7840=m +CONFIG_USB_SERIAL_MXUPORT=m +CONFIG_USB_SERIAL_NAVMAN=m +CONFIG_USB_SERIAL_PL2303=m +CONFIG_USB_SERIAL_OTI6858=m +CONFIG_USB_SERIAL_QCAUX=m +CONFIG_USB_SERIAL_QUALCOMM=m +CONFIG_USB_SERIAL_SPCP8X5=m +CONFIG_USB_SERIAL_SAFE=m +# CONFIG_USB_SERIAL_SAFE_PADDED is not set +CONFIG_USB_SERIAL_SIERRAWIRELESS=m +CONFIG_USB_SERIAL_SYMBOL=m +CONFIG_USB_SERIAL_TI=m +CONFIG_USB_SERIAL_CYBERJACK=m +CONFIG_USB_SERIAL_WWAN=m +CONFIG_USB_SERIAL_OPTION=m +CONFIG_USB_SERIAL_OMNINET=m +CONFIG_USB_SERIAL_OPTICON=m +CONFIG_USB_SERIAL_XSENS_MT=m +CONFIG_USB_SERIAL_WISHBONE=m +CONFIG_USB_SERIAL_SSU100=m +CONFIG_USB_SERIAL_QT2=m +CONFIG_USB_SERIAL_UPD78F0730=m +# CONFIG_USB_SERIAL_XR is not set +CONFIG_USB_SERIAL_DEBUG=m + +# +# USB Miscellaneous drivers +# +# CONFIG_USB_USS720 is not set +CONFIG_USB_EMI62=m +CONFIG_USB_EMI26=m +CONFIG_USB_ADUTUX=m +CONFIG_USB_SEVSEG=m +CONFIG_USB_LEGOTOWER=m +CONFIG_USB_LCD=m +CONFIG_USB_CYPRESS_CY7C63=m +CONFIG_USB_CYTHERM=m +CONFIG_USB_IDMOUSE=m +CONFIG_USB_APPLEDISPLAY=m +# CONFIG_USB_QCOM_EUD is not set +# CONFIG_APPLE_MFI_FASTCHARGE is not set +CONFIG_USB_SISUSBVGA=m +CONFIG_USB_LD=m +CONFIG_USB_TRANCEVIBRATOR=m +CONFIG_USB_IOWARRIOR=m +CONFIG_USB_TEST=m +CONFIG_USB_EHSET_TEST_FIXTURE=m +CONFIG_USB_ISIGHTFW=m +CONFIG_USB_YUREX=m +CONFIG_USB_EZUSB_FX2=m +# CONFIG_USB_HUB_USB251XB is not set +CONFIG_USB_HSIC_USB3503=m +# CONFIG_USB_HSIC_USB4604 is not set +# CONFIG_USB_LINK_LAYER_TEST is not set +CONFIG_USB_CHAOSKEY=m +# CONFIG_USB_ONBOARD_HUB is not set +# CONFIG_USB_ATM is not set + +# +# USB Physical Layer drivers +# +CONFIG_USB_PHY=y +CONFIG_NOP_USB_XCEIV=m +# CONFIG_USB_GPIO_VBUS is not set +# CONFIG_USB_ISP1301 is not set +CONFIG_USB_TEGRA_PHY=m +CONFIG_USB_ULPI=y +CONFIG_USB_ULPI_VIEWPORT=y +# end of USB Physical Layer drivers + +CONFIG_USB_GADGET=m +# CONFIG_USB_GADGET_DEBUG is not set +# CONFIG_USB_GADGET_DEBUG_FILES is not set +# CONFIG_USB_GADGET_DEBUG_FS is not set +CONFIG_USB_GADGET_VBUS_DRAW=2 +CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2 +# CONFIG_U_SERIAL_CONSOLE is not set + +# +# USB Peripheral Controller +# +# CONFIG_USB_GR_UDC is not set +# CONFIG_USB_R8A66597 is not set +# CONFIG_USB_PXA27X is not set +# CONFIG_USB_MV_UDC is not set +# CONFIG_USB_MV_U3D is not set +# CONFIG_USB_SNP_UDC_PLAT is not set +# CONFIG_USB_M66592 is not set +# CONFIG_USB_BDC_UDC is not set +# CONFIG_USB_AMD5536UDC is not set +# CONFIG_USB_NET2272 is not set +CONFIG_USB_NET2280=m +# CONFIG_USB_GOKU is not set +# CONFIG_USB_EG20T is not set +# CONFIG_USB_GADGET_XILINX is not set +# CONFIG_USB_MAX3420_UDC is not set +# CONFIG_USB_TEGRA_XUDC is not set +# CONFIG_USB_CDNS2_UDC is not set +# CONFIG_USB_DUMMY_HCD is not set +# end of USB Peripheral Controller + +CONFIG_USB_LIBCOMPOSITE=m +CONFIG_USB_F_ACM=m +CONFIG_USB_F_SS_LB=m +CONFIG_USB_U_SERIAL=m +CONFIG_USB_U_ETHER=m +CONFIG_USB_U_AUDIO=m +CONFIG_USB_F_SERIAL=m +CONFIG_USB_F_OBEX=m +CONFIG_USB_F_NCM=m +CONFIG_USB_F_ECM=m +CONFIG_USB_F_PHONET=m +CONFIG_USB_F_EEM=m +CONFIG_USB_F_SUBSET=m +CONFIG_USB_F_RNDIS=m +CONFIG_USB_F_MASS_STORAGE=m +CONFIG_USB_F_FS=m +CONFIG_USB_F_UAC1=m +CONFIG_USB_F_UAC2=m +CONFIG_USB_F_UVC=m +CONFIG_USB_F_MIDI=m +CONFIG_USB_F_HID=m +CONFIG_USB_F_PRINTER=m +CONFIG_USB_CONFIGFS=m +CONFIG_USB_CONFIGFS_SERIAL=y +CONFIG_USB_CONFIGFS_ACM=y +CONFIG_USB_CONFIGFS_OBEX=y +CONFIG_USB_CONFIGFS_NCM=y +CONFIG_USB_CONFIGFS_ECM=y +CONFIG_USB_CONFIGFS_ECM_SUBSET=y +CONFIG_USB_CONFIGFS_RNDIS=y +CONFIG_USB_CONFIGFS_EEM=y +CONFIG_USB_CONFIGFS_PHONET=y +CONFIG_USB_CONFIGFS_MASS_STORAGE=y +CONFIG_USB_CONFIGFS_F_LB_SS=y +CONFIG_USB_CONFIGFS_F_FS=y +CONFIG_USB_CONFIGFS_F_UAC1=y +# CONFIG_USB_CONFIGFS_F_UAC1_LEGACY is not set +CONFIG_USB_CONFIGFS_F_UAC2=y +CONFIG_USB_CONFIGFS_F_MIDI=y +# CONFIG_USB_CONFIGFS_F_MIDI2 is not set +CONFIG_USB_CONFIGFS_F_HID=y +CONFIG_USB_CONFIGFS_F_UVC=y +CONFIG_USB_CONFIGFS_F_PRINTER=y +# CONFIG_USB_CONFIGFS_F_TCM is not set + +# +# USB Gadget precomposed configurations +# +# CONFIG_USB_ZERO is not set +# CONFIG_USB_AUDIO is not set +CONFIG_USB_ETH=m +CONFIG_USB_ETH_RNDIS=y +# CONFIG_USB_ETH_EEM is not set +# CONFIG_USB_G_NCM is not set +CONFIG_USB_GADGETFS=m +CONFIG_USB_FUNCTIONFS=m +CONFIG_USB_FUNCTIONFS_ETH=y +CONFIG_USB_FUNCTIONFS_RNDIS=y +CONFIG_USB_FUNCTIONFS_GENERIC=y +# CONFIG_USB_MASS_STORAGE is not set +# CONFIG_USB_GADGET_TARGET is not set +CONFIG_USB_G_SERIAL=m +# CONFIG_USB_MIDI_GADGET is not set +# CONFIG_USB_G_PRINTER is not set +# CONFIG_USB_CDC_COMPOSITE is not set +# CONFIG_USB_G_NOKIA is not set +# CONFIG_USB_G_ACM_MS is not set +# CONFIG_USB_G_MULTI is not set +# CONFIG_USB_G_HID is not set +# CONFIG_USB_G_DBGP is not set +# CONFIG_USB_G_WEBCAM is not set +# CONFIG_USB_RAW_GADGET is not set +# end of USB Gadget precomposed configurations + +# CONFIG_TYPEC is not set +CONFIG_USB_ROLE_SWITCH=m +CONFIG_MMC=y +CONFIG_PWRSEQ_EMMC=y +# CONFIG_PWRSEQ_SD8787 is not set +CONFIG_PWRSEQ_SIMPLE=y +CONFIG_MMC_BLOCK=y +CONFIG_MMC_BLOCK_MINORS=256 +CONFIG_SDIO_UART=m +# CONFIG_MMC_TEST is not set + +# +# MMC/SD/SDIO Host Controller Drivers +# +# CONFIG_MMC_DEBUG is not set +CONFIG_MMC_ARMMMCI=m +CONFIG_MMC_QCOM_DML=y +CONFIG_MMC_STM32_SDMMC=y +CONFIG_MMC_SDHCI=m +CONFIG_MMC_SDHCI_IO_ACCESSORS=y +CONFIG_MMC_SDHCI_PCI=m +CONFIG_MMC_RICOH_MMC=y +CONFIG_MMC_SDHCI_ACPI=m +CONFIG_MMC_SDHCI_PLTFM=m +CONFIG_MMC_SDHCI_OF_ARASAN=m +# CONFIG_MMC_SDHCI_OF_AT91 is not set +# CONFIG_MMC_SDHCI_OF_DWCMSHC is not set +# CONFIG_MMC_SDHCI_CADENCE is not set +CONFIG_MMC_SDHCI_TEGRA=m +# CONFIG_MMC_SDHCI_PXAV3 is not set +CONFIG_MMC_SDHCI_F_SDH30=m +# CONFIG_MMC_SDHCI_MILBEAUT is not set +CONFIG_MMC_MESON_GX=m +# CONFIG_MMC_MESON_MX_SDIO is not set +CONFIG_MMC_SDHCI_MSM=m +CONFIG_MMC_TIFM_SD=m +CONFIG_MMC_SPI=m +CONFIG_MMC_CB710=m +CONFIG_MMC_VIA_SDMMC=m +CONFIG_MMC_DW=m +CONFIG_MMC_DW_PLTFM=m +# CONFIG_MMC_DW_BLUEFIELD is not set +# CONFIG_MMC_DW_EXYNOS is not set +# CONFIG_MMC_DW_HI3798CV200 is not set +CONFIG_MMC_DW_K3=m +# CONFIG_MMC_DW_PCI is not set +CONFIG_MMC_DW_ROCKCHIP=m +CONFIG_MMC_VUB300=m +CONFIG_MMC_USHC=m +# CONFIG_MMC_USDHI6ROL0 is not set +CONFIG_MMC_REALTEK_PCI=m +CONFIG_MMC_REALTEK_USB=m +CONFIG_MMC_SUNXI=m +CONFIG_MMC_CQHCI=m +# CONFIG_MMC_HSQ is not set +CONFIG_MMC_TOSHIBA_PCI=m +# CONFIG_MMC_MTK is not set +CONFIG_MMC_SDHCI_XENON=m +# CONFIG_MMC_SDHCI_OMAP is not set +# CONFIG_MMC_SDHCI_AM654 is not set +CONFIG_SCSI_UFSHCD=m +# CONFIG_SCSI_UFS_BSG is not set +# CONFIG_SCSI_UFS_FAULT_INJECTION is not set +# CONFIG_SCSI_UFS_HWMON is not set +CONFIG_SCSI_UFSHCD_PCI=m +# CONFIG_SCSI_UFS_DWC_TC_PCI is not set +# CONFIG_SCSI_UFSHCD_PLATFORM is not set +CONFIG_MEMSTICK=m +# CONFIG_MEMSTICK_DEBUG is not set + +# +# MemoryStick drivers +# +# CONFIG_MEMSTICK_UNSAFE_RESUME is not set +CONFIG_MSPRO_BLOCK=m +# CONFIG_MS_BLOCK is not set + +# +# MemoryStick Host Controller Drivers +# +CONFIG_MEMSTICK_TIFM_MS=m +CONFIG_MEMSTICK_JMICRON_38X=m +CONFIG_MEMSTICK_R592=m +CONFIG_MEMSTICK_REALTEK_PCI=m +CONFIG_MEMSTICK_REALTEK_USB=m +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +# CONFIG_LEDS_CLASS_FLASH is not set +# CONFIG_LEDS_CLASS_MULTICOLOR is not set +CONFIG_LEDS_BRIGHTNESS_HW_CHANGED=y + +# +# LED drivers +# +# CONFIG_LEDS_AN30259A is not set +# CONFIG_LEDS_AW200XX is not set +# CONFIG_LEDS_AW2013 is not set +# CONFIG_LEDS_BCM6328 is not set +# CONFIG_LEDS_BCM6358 is not set +# CONFIG_LEDS_CR0014114 is not set +# CONFIG_LEDS_EL15203000 is not set +# CONFIG_LEDS_LM3530 is not set +# CONFIG_LEDS_LM3532 is not set +# CONFIG_LEDS_LM3642 is not set +# CONFIG_LEDS_LM3692X is not set +# CONFIG_LEDS_PCA9532 is not set +CONFIG_LEDS_GPIO=m +CONFIG_LEDS_LP3944=m +# CONFIG_LEDS_LP3952 is not set +# CONFIG_LEDS_LP50XX is not set +# CONFIG_LEDS_LP55XX_COMMON is not set +# CONFIG_LEDS_LP8860 is not set +CONFIG_LEDS_PCA955X=m +# CONFIG_LEDS_PCA955X_GPIO is not set +# CONFIG_LEDS_PCA963X is not set +# CONFIG_LEDS_PCA995X is not set +CONFIG_LEDS_DAC124S085=m +CONFIG_LEDS_PWM=m +CONFIG_LEDS_REGULATOR=m +# CONFIG_LEDS_BD2606MVV is not set +CONFIG_LEDS_BD2802=m +CONFIG_LEDS_LT3593=m +# CONFIG_LEDS_TCA6507 is not set +# CONFIG_LEDS_TLC591XX is not set +# CONFIG_LEDS_LM355x is not set +# CONFIG_LEDS_IS31FL319X is not set +# CONFIG_LEDS_IS31FL32XX is not set + +# +# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM) +# +# CONFIG_LEDS_BLINKM is not set +# CONFIG_LEDS_SYSCON is not set +# CONFIG_LEDS_MLXREG is not set +# CONFIG_LEDS_USER is not set +# CONFIG_LEDS_SPI_BYTE is not set +# CONFIG_LEDS_LM3697 is not set + +# +# Flash and Torch LED drivers +# + +# +# RGB LED drivers +# + +# +# LED Triggers +# +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=m +CONFIG_LEDS_TRIGGER_ONESHOT=m +CONFIG_LEDS_TRIGGER_DISK=y +CONFIG_LEDS_TRIGGER_MTD=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=m +CONFIG_LEDS_TRIGGER_BACKLIGHT=m +CONFIG_LEDS_TRIGGER_CPU=y +# CONFIG_LEDS_TRIGGER_ACTIVITY is not set +CONFIG_LEDS_TRIGGER_DEFAULT_ON=m + +# +# iptables trigger is under Netfilter config (LED target) +# +CONFIG_LEDS_TRIGGER_TRANSIENT=m +CONFIG_LEDS_TRIGGER_CAMERA=m +CONFIG_LEDS_TRIGGER_PANIC=y +# CONFIG_LEDS_TRIGGER_NETDEV is not set +# CONFIG_LEDS_TRIGGER_PATTERN is not set +CONFIG_LEDS_TRIGGER_AUDIO=m +# CONFIG_LEDS_TRIGGER_TTY is not set + +# +# Simple LED drivers +# +CONFIG_ACCESSIBILITY=y +CONFIG_A11Y_BRAILLE_CONSOLE=y + +# +# Speakup console speech +# +CONFIG_SPEAKUP=m +CONFIG_SPEAKUP_SYNTH_ACNTSA=m +CONFIG_SPEAKUP_SYNTH_APOLLO=m +CONFIG_SPEAKUP_SYNTH_AUDPTR=m +CONFIG_SPEAKUP_SYNTH_BNS=m +CONFIG_SPEAKUP_SYNTH_DECTLK=m +CONFIG_SPEAKUP_SYNTH_DECEXT=m +CONFIG_SPEAKUP_SYNTH_LTLK=m +CONFIG_SPEAKUP_SYNTH_SOFT=m +CONFIG_SPEAKUP_SYNTH_SPKOUT=m +CONFIG_SPEAKUP_SYNTH_TXPRT=m +CONFIG_SPEAKUP_SYNTH_DUMMY=m +# end of Speakup console speech + +CONFIG_INFINIBAND=m +CONFIG_INFINIBAND_USER_MAD=m +CONFIG_INFINIBAND_USER_ACCESS=m +CONFIG_INFINIBAND_USER_MEM=y +CONFIG_INFINIBAND_ON_DEMAND_PAGING=y +CONFIG_INFINIBAND_ADDR_TRANS=y +CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS=y +CONFIG_INFINIBAND_VIRT_DMA=y +# CONFIG_INFINIBAND_BNXT_RE is not set +CONFIG_INFINIBAND_CXGB4=m +# CONFIG_INFINIBAND_EFA is not set +# CONFIG_INFINIBAND_ERDMA is not set +# CONFIG_INFINIBAND_IRDMA is not set +CONFIG_MLX4_INFINIBAND=m +CONFIG_MLX5_INFINIBAND=m +CONFIG_INFINIBAND_MTHCA=m +CONFIG_INFINIBAND_MTHCA_DEBUG=y +CONFIG_INFINIBAND_OCRDMA=m +CONFIG_INFINIBAND_QEDR=m +CONFIG_RDMA_RXE=m +# CONFIG_RDMA_SIW is not set +CONFIG_INFINIBAND_IPOIB=m +CONFIG_INFINIBAND_IPOIB_CM=y +CONFIG_INFINIBAND_IPOIB_DEBUG=y +# CONFIG_INFINIBAND_IPOIB_DEBUG_DATA is not set +CONFIG_INFINIBAND_SRP=m +CONFIG_INFINIBAND_SRPT=m +CONFIG_INFINIBAND_ISER=m +CONFIG_INFINIBAND_ISERT=m +# CONFIG_INFINIBAND_RTRS_CLIENT is not set +# CONFIG_INFINIBAND_RTRS_SERVER is not set +CONFIG_EDAC_SUPPORT=y +CONFIG_EDAC=y +CONFIG_EDAC_LEGACY_SYSFS=y +CONFIG_EDAC_DEBUG=y +CONFIG_EDAC_GHES=y +CONFIG_EDAC_THUNDERX=m +# CONFIG_EDAC_SYNOPSYS is not set +CONFIG_EDAC_XGENE=m +# CONFIG_EDAC_DMC520 is not set +# CONFIG_EDAC_ZYNQMP is not set +CONFIG_RTC_LIB=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_HCTOSYS=y +CONFIG_RTC_HCTOSYS_DEVICE="rtc0" +CONFIG_RTC_SYSTOHC=y +CONFIG_RTC_SYSTOHC_DEVICE="rtc0" +# CONFIG_RTC_DEBUG is not set +CONFIG_RTC_NVMEM=y + +# +# RTC interfaces +# +CONFIG_RTC_INTF_SYSFS=y +CONFIG_RTC_INTF_PROC=y +CONFIG_RTC_INTF_DEV=y +# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set +# CONFIG_RTC_DRV_TEST is not set + +# +# I2C RTC drivers +# +# CONFIG_RTC_DRV_ABB5ZES3 is not set +# CONFIG_RTC_DRV_ABEOZ9 is not set +# CONFIG_RTC_DRV_ABX80X is not set +CONFIG_RTC_DRV_DS1307=y +# CONFIG_RTC_DRV_DS1307_CENTURY is not set +# CONFIG_RTC_DRV_DS1374 is not set +# CONFIG_RTC_DRV_DS1672 is not set +# CONFIG_RTC_DRV_HYM8563 is not set +# CONFIG_RTC_DRV_MAX6900 is not set +CONFIG_RTC_DRV_MAX77686=y +# CONFIG_RTC_DRV_NCT3018Y is not set +# CONFIG_RTC_DRV_RS5C372 is not set +# CONFIG_RTC_DRV_ISL1208 is not set +# CONFIG_RTC_DRV_ISL12022 is not set +# CONFIG_RTC_DRV_ISL12026 is not set +# CONFIG_RTC_DRV_X1205 is not set +# CONFIG_RTC_DRV_PCF8523 is not set +# CONFIG_RTC_DRV_PCF85063 is not set +# CONFIG_RTC_DRV_PCF85363 is not set +CONFIG_RTC_DRV_PCF8563=y +# CONFIG_RTC_DRV_PCF8583 is not set +# CONFIG_RTC_DRV_M41T80 is not set +# CONFIG_RTC_DRV_BQ32K is not set +# CONFIG_RTC_DRV_S35390A is not set +# CONFIG_RTC_DRV_FM3130 is not set +# CONFIG_RTC_DRV_RX8010 is not set +# CONFIG_RTC_DRV_RX8581 is not set +# CONFIG_RTC_DRV_RX8025 is not set +# CONFIG_RTC_DRV_EM3027 is not set +# CONFIG_RTC_DRV_RV3028 is not set +# CONFIG_RTC_DRV_RV3032 is not set +# CONFIG_RTC_DRV_RV8803 is not set +# CONFIG_RTC_DRV_SD3078 is not set + +# +# SPI RTC drivers +# +# CONFIG_RTC_DRV_M41T93 is not set +# CONFIG_RTC_DRV_M41T94 is not set +# CONFIG_RTC_DRV_DS1302 is not set +# CONFIG_RTC_DRV_DS1305 is not set +# CONFIG_RTC_DRV_DS1343 is not set +# CONFIG_RTC_DRV_DS1347 is not set +# CONFIG_RTC_DRV_DS1390 is not set +# CONFIG_RTC_DRV_MAX6916 is not set +# CONFIG_RTC_DRV_R9701 is not set +# CONFIG_RTC_DRV_RX4581 is not set +# CONFIG_RTC_DRV_RS5C348 is not set +# CONFIG_RTC_DRV_MAX6902 is not set +# CONFIG_RTC_DRV_PCF2123 is not set +# CONFIG_RTC_DRV_MCP795 is not set +CONFIG_RTC_I2C_AND_SPI=y + +# +# SPI and I2C RTC drivers +# +# CONFIG_RTC_DRV_DS3232 is not set +# CONFIG_RTC_DRV_PCF2127 is not set +# CONFIG_RTC_DRV_RV3029C2 is not set +# CONFIG_RTC_DRV_RX6110 is not set + +# +# Platform RTC drivers +# +# CONFIG_RTC_DRV_DS1286 is not set +# CONFIG_RTC_DRV_DS1511 is not set +# CONFIG_RTC_DRV_DS1553 is not set +# CONFIG_RTC_DRV_DS1685_FAMILY is not set +# CONFIG_RTC_DRV_DS1742 is not set +# CONFIG_RTC_DRV_DS2404 is not set +CONFIG_RTC_DRV_EFI=y +# CONFIG_RTC_DRV_STK17TA8 is not set +# CONFIG_RTC_DRV_M48T86 is not set +# CONFIG_RTC_DRV_M48T35 is not set +# CONFIG_RTC_DRV_M48T59 is not set +# CONFIG_RTC_DRV_MSM6242 is not set +# CONFIG_RTC_DRV_RP5C01 is not set +# CONFIG_RTC_DRV_OPTEE is not set +# CONFIG_RTC_DRV_ZYNQMP is not set +CONFIG_RTC_DRV_CROS_EC=m + +# +# on-CPU RTC drivers +# +CONFIG_RTC_DRV_MESON_VRTC=m +# CONFIG_RTC_DRV_PL030 is not set +CONFIG_RTC_DRV_PL031=y +CONFIG_RTC_DRV_SUN6I=y +CONFIG_RTC_DRV_MV=m +CONFIG_RTC_DRV_ARMADA38X=m +# CONFIG_RTC_DRV_CADENCE is not set +# CONFIG_RTC_DRV_FTRTC010 is not set +CONFIG_RTC_DRV_PM8XXX=m +CONFIG_RTC_DRV_TEGRA=y +CONFIG_RTC_DRV_XGENE=y +# CONFIG_RTC_DRV_R7301 is not set + +# +# HID Sensor RTC drivers +# +# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set +# CONFIG_RTC_DRV_GOLDFISH is not set +CONFIG_DMADEVICES=y +# CONFIG_DMADEVICES_DEBUG is not set + +# +# DMA Devices +# +CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH=y +CONFIG_DMA_ENGINE=y +CONFIG_DMA_VIRTUAL_CHANNELS=y +CONFIG_DMA_ACPI=y +CONFIG_DMA_OF=y +# CONFIG_ALTERA_MSGDMA is not set +# CONFIG_AMBA_PL08X is not set +# CONFIG_AXI_DMAC is not set +# CONFIG_BCM_SBA_RAID is not set +CONFIG_DMA_SUN6I=m +# CONFIG_DW_AXI_DMAC is not set +# CONFIG_FSL_EDMA is not set +# CONFIG_FSL_QDMA is not set +# CONFIG_HISI_DMA is not set +# CONFIG_INTEL_IDMA64 is not set +CONFIG_K3_DMA=m +CONFIG_MV_XOR=y +CONFIG_MV_XOR_V2=y +CONFIG_PL330_DMA=m +# CONFIG_PLX_DMA is not set +# CONFIG_TEGRA186_GPC_DMA is not set +CONFIG_TEGRA20_APB_DMA=y +CONFIG_TEGRA210_ADMA=y +CONFIG_XGENE_DMA=m +# CONFIG_XILINX_DMA is not set +# CONFIG_XILINX_XDMA is not set +# CONFIG_XILINX_ZYNQMP_DMA is not set +# CONFIG_XILINX_ZYNQMP_DPDMA is not set +CONFIG_QCOM_BAM_DMA=m +# CONFIG_QCOM_GPI_DMA is not set +CONFIG_QCOM_HIDMA_MGMT=m +CONFIG_QCOM_HIDMA=m +# CONFIG_DW_DMAC is not set +# CONFIG_DW_DMAC_PCI is not set +# CONFIG_DW_EDMA is not set +# CONFIG_SF_PDMA is not set + +# +# DMA Clients +# +CONFIG_ASYNC_TX_DMA=y +# CONFIG_DMATEST is not set +CONFIG_DMA_ENGINE_RAID=y + +# +# DMABUF options +# +CONFIG_SYNC_FILE=y +# CONFIG_SW_SYNC is not set +# CONFIG_UDMABUF is not set +# CONFIG_DMABUF_MOVE_NOTIFY is not set +# CONFIG_DMABUF_DEBUG is not set +# CONFIG_DMABUF_SELFTESTS is not set +CONFIG_DMABUF_HEAPS=y +CONFIG_DMABUF_HEAPS_SYSTEM=y +# CONFIG_DMABUF_SYSFS_STATS is not set +# end of DMABUF options + +CONFIG_UIO=m +CONFIG_UIO_CIF=m +# CONFIG_UIO_PDRV_GENIRQ is not set +# CONFIG_UIO_DMEM_GENIRQ is not set +CONFIG_UIO_AEC=m +CONFIG_UIO_SERCOS3=m +CONFIG_UIO_PCI_GENERIC=m +CONFIG_UIO_NETX=m +# CONFIG_UIO_PRUSS is not set +CONFIG_UIO_MF624=m +# CONFIG_UIO_HV_GENERIC is not set +CONFIG_VFIO=m +CONFIG_VFIO_GROUP=y +CONFIG_VFIO_CONTAINER=y +CONFIG_VFIO_IOMMU_TYPE1=m +CONFIG_VFIO_NOIOMMU=y +CONFIG_VFIO_VIRQFD=y + +# +# VFIO support for PCI devices +# +CONFIG_VFIO_PCI_CORE=m +CONFIG_VFIO_PCI_MMAP=y +CONFIG_VFIO_PCI_INTX=y +CONFIG_VFIO_PCI=m +# CONFIG_MLX5_VFIO_PCI is not set +# end of VFIO support for PCI devices + +# +# VFIO support for platform devices +# +CONFIG_VFIO_PLATFORM_BASE=m +CONFIG_VFIO_PLATFORM=m +# CONFIG_VFIO_AMBA is not set + +# +# VFIO platform reset drivers +# +# CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET is not set +# CONFIG_VFIO_PLATFORM_AMDXGBE_RESET is not set +# end of VFIO platform reset drivers +# end of VFIO support for platform devices + +CONFIG_VIRT_DRIVERS=y +CONFIG_VMGENID=y +# CONFIG_NITRO_ENCLAVES is not set +CONFIG_VIRTIO_ANCHOR=y +CONFIG_VIRTIO=m +CONFIG_VIRTIO_PCI_LIB=m +CONFIG_VIRTIO_PCI_LIB_LEGACY=m +CONFIG_VIRTIO_MENU=y +CONFIG_VIRTIO_PCI=m +CONFIG_VIRTIO_PCI_LEGACY=y +# CONFIG_VIRTIO_PMEM is not set +CONFIG_VIRTIO_BALLOON=m +CONFIG_VIRTIO_INPUT=m +CONFIG_VIRTIO_MMIO=m +# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set +CONFIG_VIRTIO_DMA_SHARED_BUFFER=m +# CONFIG_VDPA is not set +CONFIG_VHOST_IOTLB=m +CONFIG_VHOST_TASK=y +CONFIG_VHOST=m +CONFIG_VHOST_MENU=y +CONFIG_VHOST_NET=m +CONFIG_VHOST_SCSI=m +CONFIG_VHOST_VSOCK=m +# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set + +# +# Microsoft Hyper-V guest support +# +CONFIG_HYPERV=m +CONFIG_HYPERV_UTILS=m +CONFIG_HYPERV_BALLOON=m +# end of Microsoft Hyper-V guest support + +# +# Xen driver support +# +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES_DEFAULT=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +# CONFIG_XEN_GRANT_DMA_ALLOC is not set +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCI_STUB=y +CONFIG_XEN_PCIDEV_STUB=m +# CONFIG_XEN_PVCALLS_FRONTEND is not set +# CONFIG_XEN_PVCALLS_BACKEND is not set +CONFIG_XEN_SCSI_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_EFI=y +CONFIG_XEN_AUTO_XLATE=y +CONFIG_XEN_FRONT_PGDIR_SHBUF=m +# CONFIG_XEN_VIRTIO is not set +# end of Xen driver support + +# CONFIG_GREYBUS is not set +# CONFIG_COMEDI is not set +CONFIG_STAGING=y +# CONFIG_PRISM2_USB is not set +# CONFIG_RTL8192U is not set +# CONFIG_RTLLIB is not set +CONFIG_RTL8723BS=m +CONFIG_R8712U=m +# CONFIG_RTS5208 is not set +# CONFIG_VT6655 is not set +# CONFIG_VT6656 is not set + +# +# IIO staging drivers +# + +# +# Accelerometers +# +# CONFIG_ADIS16203 is not set +# CONFIG_ADIS16240 is not set +# end of Accelerometers + +# +# Analog to digital converters +# +# CONFIG_AD7816 is not set +# end of Analog to digital converters + +# +# Analog digital bi-direction converters +# +# CONFIG_ADT7316 is not set +# end of Analog digital bi-direction converters + +# +# Direct Digital Synthesis +# +# CONFIG_AD9832 is not set +# CONFIG_AD9834 is not set +# end of Direct Digital Synthesis + +# +# Network Analyzer, Impedance Converters +# +# CONFIG_AD5933 is not set +# end of Network Analyzer, Impedance Converters + +# +# Resolver to digital converters +# +# CONFIG_AD2S1210 is not set +# end of Resolver to digital converters +# end of IIO staging drivers + +# CONFIG_FB_SM750 is not set +# CONFIG_MFD_NVEC is not set +# CONFIG_STAGING_MEDIA is not set +# CONFIG_STAGING_BOARD is not set +# CONFIG_LTE_GDM724X is not set +# CONFIG_FB_TFT is not set +# CONFIG_KS7010 is not set +# CONFIG_PI433 is not set +# CONFIG_XIL_AXIS_FIFO is not set +# CONFIG_FIELDBUS_DEV is not set +CONFIG_QLGE=m +# CONFIG_VME_BUS is not set +# CONFIG_GOLDFISH is not set +CONFIG_CHROME_PLATFORMS=y +# CONFIG_CHROMEOS_ACPI is not set +# CONFIG_CHROMEOS_TBMC is not set +CONFIG_CROS_EC=y +CONFIG_CROS_EC_I2C=m +# CONFIG_CROS_EC_RPMSG is not set +CONFIG_CROS_EC_SPI=m +# CONFIG_CROS_EC_UART is not set +CONFIG_CROS_EC_PROTO=y +CONFIG_CROS_KBD_LED_BACKLIGHT=m +CONFIG_CROS_EC_CHARDEV=y +CONFIG_CROS_EC_LIGHTBAR=m +CONFIG_CROS_EC_VBC=m +CONFIG_CROS_EC_DEBUGFS=m +CONFIG_CROS_EC_SENSORHUB=y +CONFIG_CROS_EC_SYSFS=m +# CONFIG_CROS_HPS_I2C is not set +CONFIG_CROS_USBPD_LOGGER=m +CONFIG_CROS_USBPD_NOTIFY=y +# CONFIG_CHROMEOS_PRIVACY_SCREEN is not set +# CONFIG_MELLANOX_PLATFORM is not set +CONFIG_SURFACE_PLATFORMS=y +# CONFIG_SURFACE_3_POWER_OPREGION is not set +# CONFIG_SURFACE_GPE is not set +# CONFIG_SURFACE_HOTPLUG is not set +# CONFIG_SURFACE_PRO3_BUTTON is not set +# CONFIG_SURFACE_AGGREGATOR is not set +CONFIG_HAVE_CLK=y +CONFIG_HAVE_CLK_PREPARE=y +CONFIG_COMMON_CLK=y + +# +# Clock driver for ARM Reference designs +# +# CONFIG_CLK_ICST is not set +CONFIG_CLK_SP810=y +CONFIG_CLK_VEXPRESS_OSC=y +# end of Clock driver for ARM Reference designs + +# CONFIG_LMK04832 is not set +# CONFIG_COMMON_CLK_MAX77686 is not set +# CONFIG_COMMON_CLK_MAX9485 is not set +CONFIG_COMMON_CLK_HI655X=m +# CONFIG_COMMON_CLK_SI5341 is not set +# CONFIG_COMMON_CLK_SI5351 is not set +# CONFIG_COMMON_CLK_SI514 is not set +# CONFIG_COMMON_CLK_SI544 is not set +# CONFIG_COMMON_CLK_SI570 is not set +# CONFIG_COMMON_CLK_CDCE706 is not set +# CONFIG_COMMON_CLK_CDCE925 is not set +# CONFIG_COMMON_CLK_CS2000_CP is not set +# CONFIG_COMMON_CLK_AXI_CLKGEN is not set +CONFIG_COMMON_CLK_XGENE=y +# CONFIG_COMMON_CLK_PWM is not set +# CONFIG_COMMON_CLK_RS9_PCIE is not set +# CONFIG_COMMON_CLK_SI521XX is not set +# CONFIG_COMMON_CLK_VC3 is not set +# CONFIG_COMMON_CLK_VC5 is not set +# CONFIG_COMMON_CLK_VC7 is not set +# CONFIG_COMMON_CLK_FIXED_MMIO is not set +CONFIG_COMMON_CLK_HI3516CV300=y +CONFIG_COMMON_CLK_HI3519=y +CONFIG_COMMON_CLK_HI3559A=y +CONFIG_COMMON_CLK_HI3660=y +CONFIG_COMMON_CLK_HI3670=y +CONFIG_COMMON_CLK_HI3798CV200=y +CONFIG_COMMON_CLK_HI6220=y +CONFIG_RESET_HISI=y +CONFIG_STUB_CLK_HI6220=y +CONFIG_STUB_CLK_HI3660=y + +# +# Clock support for Amlogic platforms +# +CONFIG_COMMON_CLK_MESON_REGMAP=y +CONFIG_COMMON_CLK_MESON_DUALDIV=y +CONFIG_COMMON_CLK_MESON_MPLL=y +CONFIG_COMMON_CLK_MESON_PLL=y +CONFIG_COMMON_CLK_MESON_VID_PLL_DIV=y +CONFIG_COMMON_CLK_MESON_CLKC_UTILS=y +CONFIG_COMMON_CLK_MESON_AO_CLKC=y +CONFIG_COMMON_CLK_MESON_EE_CLKC=y +CONFIG_COMMON_CLK_MESON_CPU_DYNDIV=y +CONFIG_COMMON_CLK_GXBB=y +CONFIG_COMMON_CLK_AXG=y +# CONFIG_COMMON_CLK_AXG_AUDIO is not set +# CONFIG_COMMON_CLK_A1_PLL is not set +# CONFIG_COMMON_CLK_A1_PERIPHERALS is not set +CONFIG_COMMON_CLK_G12A=y +# end of Clock support for Amlogic platforms + +CONFIG_ARMADA_AP_CP_HELPER=y +CONFIG_ARMADA_37XX_CLK=y +CONFIG_ARMADA_AP806_SYSCON=y +CONFIG_ARMADA_CP110_SYSCON=y +CONFIG_QCOM_GDSC=y +CONFIG_QCOM_RPMCC=y +CONFIG_COMMON_CLK_QCOM=y +CONFIG_QCOM_A53PLL=y +# CONFIG_QCOM_A7PLL is not set +CONFIG_QCOM_CLK_APCS_MSM8916=m +# CONFIG_QCOM_CLK_APCC_MSM8996 is not set +CONFIG_QCOM_CLK_RPM=m +CONFIG_QCOM_CLK_SMD_RPM=m +# CONFIG_IPQ_APSS_PLL is not set +# CONFIG_IPQ_GCC_4019 is not set +# CONFIG_IPQ_GCC_5018 is not set +# CONFIG_IPQ_GCC_5332 is not set +# CONFIG_IPQ_GCC_6018 is not set +# CONFIG_IPQ_GCC_8074 is not set +# CONFIG_IPQ_GCC_9574 is not set +CONFIG_MSM_GCC_8916=y +# CONFIG_MSM_GCC_8917 is not set +# CONFIG_MSM_GCC_8939 is not set +# CONFIG_MSM_GCC_8953 is not set +# CONFIG_MSM_GCC_8976 is not set +# CONFIG_MSM_MMCC_8994 is not set +# CONFIG_MSM_GCC_8994 is not set +CONFIG_MSM_GCC_8996=y +CONFIG_MSM_MMCC_8996=y +# CONFIG_MSM_GCC_8998 is not set +# CONFIG_MSM_GPUCC_8998 is not set +# CONFIG_MSM_MMCC_8998 is not set +# CONFIG_QCM_GCC_2290 is not set +# CONFIG_QCM_DISPCC_2290 is not set +# CONFIG_QCS_GCC_404 is not set +# CONFIG_SC_CAMCC_7180 is not set +# CONFIG_SC_CAMCC_7280 is not set +# CONFIG_SC_DISPCC_7180 is not set +# CONFIG_SC_DISPCC_7280 is not set +# CONFIG_SC_DISPCC_8280XP is not set +# CONFIG_SA_GCC_8775P is not set +# CONFIG_SA_GPUCC_8775P is not set +# CONFIG_SC_GCC_7180 is not set +# CONFIG_SC_GCC_7280 is not set +# CONFIG_SC_GCC_8180X is not set +# CONFIG_SC_GCC_8280XP is not set +# CONFIG_SC_GPUCC_7180 is not set +# CONFIG_SC_GPUCC_7280 is not set +# CONFIG_SC_GPUCC_8280XP is not set +# CONFIG_SC_LPASSCC_7280 is not set +# CONFIG_SC_LPASSCC_8280XP is not set +# CONFIG_SC_LPASS_CORECC_7180 is not set +# CONFIG_SC_LPASS_CORECC_7280 is not set +# CONFIG_SC_MSS_7180 is not set +# CONFIG_SC_VIDEOCC_7180 is not set +# CONFIG_SC_VIDEOCC_7280 is not set +# CONFIG_SDM_CAMCC_845 is not set +# CONFIG_SDM_GCC_660 is not set +# CONFIG_SDM_MMCC_660 is not set +# CONFIG_SDM_GPUCC_660 is not set +# CONFIG_QCS_TURING_404 is not set +# CONFIG_QCS_Q6SSTOP_404 is not set +# CONFIG_QDU_GCC_1000 is not set +# CONFIG_SDM_GCC_845 is not set +# CONFIG_SDM_GPUCC_845 is not set +# CONFIG_SDM_VIDEOCC_845 is not set +# CONFIG_SDM_DISPCC_845 is not set +# CONFIG_SDM_LPASSCC_845 is not set +# CONFIG_SDX_GCC_75 is not set +# CONFIG_SM_CAMCC_6350 is not set +# CONFIG_SM_CAMCC_8250 is not set +# CONFIG_SM_CAMCC_8450 is not set +# CONFIG_SM_GCC_6115 is not set +# CONFIG_SM_GCC_6125 is not set +# CONFIG_SM_GCC_6350 is not set +# CONFIG_SM_GCC_6375 is not set +# CONFIG_SM_GCC_7150 is not set +# CONFIG_SM_GCC_8150 is not set +# CONFIG_SM_GCC_8250 is not set +# CONFIG_SM_GCC_8350 is not set +# CONFIG_SM_GCC_8450 is not set +# CONFIG_SM_GCC_8550 is not set +# CONFIG_SM_GPUCC_6115 is not set +# CONFIG_SM_GPUCC_6125 is not set +# CONFIG_SM_GPUCC_6375 is not set +# CONFIG_SM_GPUCC_6350 is not set +# CONFIG_SM_GPUCC_8150 is not set +# CONFIG_SM_GPUCC_8250 is not set +# CONFIG_SM_GPUCC_8350 is not set +# CONFIG_SM_GPUCC_8450 is not set +# CONFIG_SM_GPUCC_8550 is not set +# CONFIG_SM_TCSRCC_8550 is not set +# CONFIG_SM_VIDEOCC_8150 is not set +# CONFIG_SM_VIDEOCC_8250 is not set +# CONFIG_SM_VIDEOCC_8350 is not set +# CONFIG_SM_VIDEOCC_8550 is not set +# CONFIG_SPMI_PMIC_CLKDIV is not set +# CONFIG_QCOM_HFPLL is not set +# CONFIG_KPSS_XCC is not set +# CONFIG_CLK_GFM_LPASS_SM8250 is not set +# CONFIG_SM_VIDEOCC_8450 is not set +CONFIG_COMMON_CLK_ROCKCHIP=y +CONFIG_CLK_PX30=y +CONFIG_CLK_RK3308=y +CONFIG_CLK_RK3328=y +CONFIG_CLK_RK3368=y +CONFIG_CLK_RK3399=y +CONFIG_CLK_RK3568=y +CONFIG_CLK_RK3588=y +CONFIG_SUNXI_CCU=y +CONFIG_SUN50I_A64_CCU=y +CONFIG_SUN50I_A100_CCU=y +CONFIG_SUN50I_A100_R_CCU=y +CONFIG_SUN50I_H6_CCU=y +CONFIG_SUN50I_H616_CCU=y +CONFIG_SUN50I_H6_R_CCU=y +CONFIG_SUN6I_RTC_CCU=y +CONFIG_SUN8I_H3_CCU=y +CONFIG_SUN8I_DE2_CCU=y +CONFIG_SUN8I_R_CCU=y +CONFIG_TEGRA_CLK_DFLL=y +# CONFIG_XILINX_VCU is not set +# CONFIG_COMMON_CLK_XLNX_CLKWZRD is not set +# CONFIG_COMMON_CLK_ZYNQMP is not set +# CONFIG_HWSPINLOCK is not set + +# +# Clock Source drivers +# +CONFIG_TIMER_OF=y +CONFIG_TIMER_ACPI=y +CONFIG_TIMER_PROBE=y +CONFIG_CLKSRC_MMIO=y +CONFIG_ROCKCHIP_TIMER=y +CONFIG_SUN4I_TIMER=y +CONFIG_TEGRA_TIMER=y +# CONFIG_TEGRA186_TIMER is not set +CONFIG_ARM_ARCH_TIMER=y +CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y +CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND=y +CONFIG_FSL_ERRATUM_A008585=y +CONFIG_HISILICON_ERRATUM_161010101=y +CONFIG_ARM64_ERRATUM_858921=y +CONFIG_SUN50I_ERRATUM_UNKNOWN1=y +CONFIG_ARM_TIMER_SP804=y +# end of Clock Source drivers + +CONFIG_MAILBOX=y +# CONFIG_ARM_MHU is not set +# CONFIG_ARM_MHU_V2 is not set +# CONFIG_PLATFORM_MHU is not set +# CONFIG_PL320_MBOX is not set +CONFIG_ARMADA_37XX_RWTM_MBOX=m +# CONFIG_ROCKCHIP_MBOX is not set +CONFIG_PCC=y +# CONFIG_ALTERA_MBOX is not set +CONFIG_HI3660_MBOX=y +CONFIG_HI6220_MBOX=y +# CONFIG_MAILBOX_TEST is not set +CONFIG_QCOM_APCS_IPC=m +CONFIG_TEGRA_HSP_MBOX=y +CONFIG_XGENE_SLIMPRO_MBOX=m +CONFIG_ZYNQMP_IPI_MBOX=y +CONFIG_SUN6I_MSGBOX=y +# CONFIG_QCOM_IPCC is not set +CONFIG_IOMMU_IOVA=y +CONFIG_IOMMU_API=y +CONFIG_IOMMU_SUPPORT=y + +# +# Generic IOMMU Pagetable Support +# +CONFIG_IOMMU_IO_PGTABLE=y +CONFIG_IOMMU_IO_PGTABLE_LPAE=y +# CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST is not set +# CONFIG_IOMMU_IO_PGTABLE_ARMV7S is not set +# CONFIG_IOMMU_IO_PGTABLE_DART is not set +# end of Generic IOMMU Pagetable Support + +# CONFIG_IOMMU_DEBUGFS is not set +# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set +# CONFIG_IOMMU_DEFAULT_DMA_LAZY is not set +# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set +CONFIG_OF_IOMMU=y +CONFIG_IOMMU_DMA=y +CONFIG_IOMMU_SVA=y +# CONFIG_IOMMUFD is not set +CONFIG_ROCKCHIP_IOMMU=y +# CONFIG_SUN50I_IOMMU is not set +CONFIG_TEGRA_IOMMU_SMMU=y +CONFIG_ARM_SMMU=y +# CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS is not set +CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y +CONFIG_ARM_SMMU_QCOM=y +# CONFIG_ARM_SMMU_QCOM_DEBUG is not set +CONFIG_ARM_SMMU_V3=y +CONFIG_ARM_SMMU_V3_SVA=y +CONFIG_QCOM_IOMMU=y +# CONFIG_VIRTIO_IOMMU is not set + +# +# Remoteproc drivers +# +# CONFIG_REMOTEPROC is not set +# end of Remoteproc drivers + +# +# Rpmsg drivers +# +CONFIG_RPMSG=m +# CONFIG_RPMSG_CHAR is not set +# CONFIG_RPMSG_CTRL is not set +# CONFIG_RPMSG_NS is not set +CONFIG_RPMSG_QCOM_GLINK=m +CONFIG_RPMSG_QCOM_GLINK_RPM=m +# CONFIG_RPMSG_VIRTIO is not set +# end of Rpmsg drivers + +# CONFIG_SOUNDWIRE is not set + +# +# SOC (System On Chip) specific Drivers +# + +# +# Amlogic SoC drivers +# +CONFIG_MESON_CANVAS=m +CONFIG_MESON_CLK_MEASURE=y +CONFIG_MESON_GX_SOCINFO=y +CONFIG_MESON_GX_PM_DOMAINS=y +CONFIG_MESON_EE_PM_DOMAINS=y +# end of Amlogic SoC drivers + +# +# Broadcom SoC drivers +# +# CONFIG_SOC_BRCMSTB is not set +# end of Broadcom SoC drivers + +# +# NXP/Freescale QorIQ SoC drivers +# +# CONFIG_QUICC_ENGINE is not set +# CONFIG_FSL_RCPM is not set +# end of NXP/Freescale QorIQ SoC drivers + +# +# fujitsu SoC drivers +# +# CONFIG_A64FX_DIAG is not set +# end of fujitsu SoC drivers + +# +# Hisilicon SoC drivers +# +# CONFIG_KUNPENG_HCCS is not set +# end of Hisilicon SoC drivers + +# +# i.MX SoC drivers +# +# end of i.MX SoC drivers + +# +# Enable LiteX SoC Builder specific drivers +# +# CONFIG_LITEX_SOC_CONTROLLER is not set +# end of Enable LiteX SoC Builder specific drivers + +# CONFIG_WPCM450_SOC is not set + +# +# Qualcomm SoC drivers +# +# CONFIG_QCOM_AOSS_QMP is not set +CONFIG_QCOM_COMMAND_DB=y +# CONFIG_QCOM_CPR is not set +# CONFIG_QCOM_GENI_SE is not set +CONFIG_QCOM_GSBI=m +# CONFIG_QCOM_LLCC is not set +CONFIG_QCOM_KRYO_L2_ACCESSORS=y +CONFIG_QCOM_MDT_LOADER=m +# CONFIG_QCOM_OCMEM is not set +# CONFIG_QCOM_RAMP_CTRL is not set +# CONFIG_QCOM_RMTFS_MEM is not set +# CONFIG_QCOM_RPM_MASTER_STATS is not set +# CONFIG_QCOM_RPMH is not set +# CONFIG_QCOM_RPMPD is not set +CONFIG_QCOM_SMD_RPM=m +# CONFIG_QCOM_SPM is not set +CONFIG_QCOM_WCNSS_CTRL=m +# CONFIG_QCOM_APR is not set +# CONFIG_QCOM_ICC_BWMON is not set +# end of Qualcomm SoC drivers + +CONFIG_ROCKCHIP_GRF=y +CONFIG_ROCKCHIP_IODOMAIN=m +CONFIG_ROCKCHIP_PM_DOMAINS=y +CONFIG_SUNXI_MBUS=y +CONFIG_SUNXI_SRAM=y +# CONFIG_SUN20I_PPU is not set +CONFIG_ARCH_TEGRA_132_SOC=y +CONFIG_ARCH_TEGRA_210_SOC=y +CONFIG_ARCH_TEGRA_186_SOC=y +CONFIG_ARCH_TEGRA_194_SOC=y +# CONFIG_ARCH_TEGRA_234_SOC is not set +CONFIG_ARCH_TEGRA_241_SOC=y +CONFIG_SOC_TEGRA_FUSE=y +CONFIG_SOC_TEGRA_FLOWCTRL=y +CONFIG_SOC_TEGRA_PMC=y +# CONFIG_SOC_TI is not set + +# +# Xilinx SoC drivers +# +CONFIG_ZYNQMP_POWER=y +CONFIG_ZYNQMP_PM_DOMAINS=y +CONFIG_XLNX_EVENT_MANAGER=y +# end of Xilinx SoC drivers +# end of SOC (System On Chip) specific Drivers + +CONFIG_PM_DEVFREQ=y + +# +# DEVFREQ Governors +# +CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=m +# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set +# CONFIG_DEVFREQ_GOV_POWERSAVE is not set +# CONFIG_DEVFREQ_GOV_USERSPACE is not set +# CONFIG_DEVFREQ_GOV_PASSIVE is not set + +# +# DEVFREQ Drivers +# +# CONFIG_ARM_TEGRA_DEVFREQ is not set +CONFIG_ARM_RK3399_DMC_DEVFREQ=m +# CONFIG_ARM_SUN8I_A33_MBUS_DEVFREQ is not set +CONFIG_PM_DEVFREQ_EVENT=y +CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI=m +CONFIG_EXTCON=y + +# +# Extcon Device Drivers +# +# CONFIG_EXTCON_ADC_JACK is not set +# CONFIG_EXTCON_FSA9480 is not set +# CONFIG_EXTCON_GPIO is not set +# CONFIG_EXTCON_MAX3355 is not set +# CONFIG_EXTCON_PTN5150 is not set +CONFIG_EXTCON_QCOM_SPMI_MISC=m +# CONFIG_EXTCON_RT8973A is not set +# CONFIG_EXTCON_SM5502 is not set +CONFIG_EXTCON_USB_GPIO=m +CONFIG_EXTCON_USBC_CROS_EC=m +CONFIG_MEMORY=y +# CONFIG_ARM_PL172_MPMC is not set +CONFIG_TEGRA_MC=y +# CONFIG_TEGRA210_EMC is not set +CONFIG_IIO=m +CONFIG_IIO_BUFFER=y +# CONFIG_IIO_BUFFER_CB is not set +# CONFIG_IIO_BUFFER_DMA is not set +# CONFIG_IIO_BUFFER_DMAENGINE is not set +# CONFIG_IIO_BUFFER_HW_CONSUMER is not set +CONFIG_IIO_KFIFO_BUF=m +CONFIG_IIO_TRIGGERED_BUFFER=m +# CONFIG_IIO_CONFIGFS is not set +CONFIG_IIO_TRIGGER=y +CONFIG_IIO_CONSUMERS_PER_TRIGGER=2 +# CONFIG_IIO_SW_DEVICE is not set +# CONFIG_IIO_SW_TRIGGER is not set +# CONFIG_IIO_TRIGGERED_EVENT is not set + +# +# Accelerometers +# +# CONFIG_ADIS16201 is not set +# CONFIG_ADIS16209 is not set +# CONFIG_ADXL313_I2C is not set +# CONFIG_ADXL313_SPI is not set +# CONFIG_ADXL345_I2C is not set +# CONFIG_ADXL345_SPI is not set +# CONFIG_ADXL355_I2C is not set +# CONFIG_ADXL355_SPI is not set +# CONFIG_ADXL367_SPI is not set +# CONFIG_ADXL367_I2C is not set +# CONFIG_ADXL372_SPI is not set +# CONFIG_ADXL372_I2C is not set +# CONFIG_BMA180 is not set +# CONFIG_BMA220 is not set +# CONFIG_BMA400 is not set +# CONFIG_BMC150_ACCEL is not set +# CONFIG_BMI088_ACCEL is not set +# CONFIG_DA280 is not set +# CONFIG_DA311 is not set +# CONFIG_DMARD06 is not set +# CONFIG_DMARD09 is not set +# CONFIG_DMARD10 is not set +# CONFIG_FXLS8962AF_I2C is not set +# CONFIG_FXLS8962AF_SPI is not set +CONFIG_HID_SENSOR_ACCEL_3D=m +CONFIG_IIO_CROS_EC_ACCEL_LEGACY=m +# CONFIG_IIO_ST_ACCEL_3AXIS is not set +# CONFIG_IIO_KX022A_SPI is not set +# CONFIG_IIO_KX022A_I2C is not set +# CONFIG_KXSD9 is not set +# CONFIG_KXCJK1013 is not set +# CONFIG_MC3230 is not set +# CONFIG_MMA7455_I2C is not set +# CONFIG_MMA7455_SPI is not set +# CONFIG_MMA7660 is not set +# CONFIG_MMA8452 is not set +# CONFIG_MMA9551 is not set +# CONFIG_MMA9553 is not set +# CONFIG_MSA311 is not set +# CONFIG_MXC4005 is not set +# CONFIG_MXC6255 is not set +# CONFIG_SCA3000 is not set +# CONFIG_SCA3300 is not set +# CONFIG_STK8312 is not set +# CONFIG_STK8BA50 is not set +# end of Accelerometers + +# +# Analog to digital converters +# +# CONFIG_AD4130 is not set +# CONFIG_AD7091R5 is not set +# CONFIG_AD7124 is not set +# CONFIG_AD7192 is not set +# CONFIG_AD7266 is not set +# CONFIG_AD7280 is not set +# CONFIG_AD7291 is not set +# CONFIG_AD7292 is not set +# CONFIG_AD7298 is not set +# CONFIG_AD7476 is not set +# CONFIG_AD7606_IFACE_PARALLEL is not set +# CONFIG_AD7606_IFACE_SPI is not set +# CONFIG_AD7766 is not set +# CONFIG_AD7768_1 is not set +# CONFIG_AD7780 is not set +# CONFIG_AD7791 is not set +# CONFIG_AD7793 is not set +# CONFIG_AD7887 is not set +# CONFIG_AD7923 is not set +# CONFIG_AD7949 is not set +# CONFIG_AD799X is not set +# CONFIG_ADI_AXI_ADC is not set +CONFIG_AXP20X_ADC=m +CONFIG_AXP288_ADC=m +# CONFIG_CC10001_ADC is not set +# CONFIG_ENVELOPE_DETECTOR is not set +# CONFIG_HI8435 is not set +# CONFIG_HX711 is not set +# CONFIG_INA2XX_ADC is not set +# CONFIG_LTC2471 is not set +# CONFIG_LTC2485 is not set +# CONFIG_LTC2496 is not set +# CONFIG_LTC2497 is not set +# CONFIG_MAX1027 is not set +# CONFIG_MAX11100 is not set +# CONFIG_MAX1118 is not set +# CONFIG_MAX11205 is not set +# CONFIG_MAX11410 is not set +# CONFIG_MAX1241 is not set +# CONFIG_MAX1363 is not set +# CONFIG_MAX9611 is not set +# CONFIG_MCP320X is not set +# CONFIG_MCP3422 is not set +# CONFIG_MCP3911 is not set +CONFIG_MESON_SARADC=m +# CONFIG_NAU7802 is not set +CONFIG_QCOM_VADC_COMMON=m +# CONFIG_QCOM_SPMI_RRADC is not set +CONFIG_QCOM_SPMI_IADC=m +CONFIG_QCOM_SPMI_VADC=m +# CONFIG_QCOM_SPMI_ADC5 is not set +CONFIG_ROCKCHIP_SARADC=m +# CONFIG_RICHTEK_RTQ6056 is not set +# CONFIG_SD_ADC_MODULATOR is not set +# CONFIG_SUN20I_GPADC is not set +# CONFIG_TI_ADC081C is not set +# CONFIG_TI_ADC0832 is not set +# CONFIG_TI_ADC084S021 is not set +# CONFIG_TI_ADC12138 is not set +# CONFIG_TI_ADC108S102 is not set +# CONFIG_TI_ADC128S052 is not set +# CONFIG_TI_ADC161S626 is not set +# CONFIG_TI_ADS1015 is not set +# CONFIG_TI_ADS7924 is not set +# CONFIG_TI_ADS1100 is not set +# CONFIG_TI_ADS7950 is not set +# CONFIG_TI_ADS8344 is not set +# CONFIG_TI_ADS8688 is not set +# CONFIG_TI_ADS124S08 is not set +# CONFIG_TI_ADS131E08 is not set +# CONFIG_TI_LMP92064 is not set +# CONFIG_TI_TLC4541 is not set +# CONFIG_TI_TSC2046 is not set +# CONFIG_VF610_ADC is not set +CONFIG_VIPERBOARD_ADC=m +# CONFIG_XILINX_XADC is not set +# CONFIG_XILINX_AMS is not set +# end of Analog to digital converters + +# +# Analog to digital and digital to analog converters +# +# CONFIG_AD74115 is not set +# CONFIG_AD74413R is not set +# end of Analog to digital and digital to analog converters + +# +# Analog Front Ends +# +# CONFIG_IIO_RESCALE is not set +# end of Analog Front Ends + +# +# Amplifiers +# +# CONFIG_AD8366 is not set +# CONFIG_ADA4250 is not set +# CONFIG_HMC425 is not set +# end of Amplifiers + +# +# Capacitance to digital converters +# +# CONFIG_AD7150 is not set +# CONFIG_AD7746 is not set +# end of Capacitance to digital converters + +# +# Chemical Sensors +# +# CONFIG_ATLAS_PH_SENSOR is not set +# CONFIG_ATLAS_EZO_SENSOR is not set +# CONFIG_BME680 is not set +# CONFIG_CCS811 is not set +# CONFIG_IAQCORE is not set +# CONFIG_PMS7003 is not set +# CONFIG_SCD30_CORE is not set +# CONFIG_SCD4X is not set +# CONFIG_SENSIRION_SGP30 is not set +# CONFIG_SENSIRION_SGP40 is not set +# CONFIG_SPS30_I2C is not set +# CONFIG_SPS30_SERIAL is not set +# CONFIG_SENSEAIR_SUNRISE_CO2 is not set +# CONFIG_VZ89X is not set +# end of Chemical Sensors + +CONFIG_IIO_CROS_EC_SENSORS_CORE=m +CONFIG_IIO_CROS_EC_SENSORS=m +# CONFIG_IIO_CROS_EC_SENSORS_LID_ANGLE is not set + +# +# Hid Sensor IIO Common +# +CONFIG_HID_SENSOR_IIO_COMMON=m +CONFIG_HID_SENSOR_IIO_TRIGGER=m +# end of Hid Sensor IIO Common + +# +# IIO SCMI Sensors +# +# end of IIO SCMI Sensors + +# +# SSP Sensor Common +# +# CONFIG_IIO_SSP_SENSORHUB is not set +# end of SSP Sensor Common + +# +# Digital to analog converters +# +# CONFIG_AD3552R is not set +# CONFIG_AD5064 is not set +# CONFIG_AD5360 is not set +# CONFIG_AD5380 is not set +# CONFIG_AD5421 is not set +CONFIG_AD5446=m +# CONFIG_AD5449 is not set +# CONFIG_AD5592R is not set +# CONFIG_AD5593R is not set +# CONFIG_AD5504 is not set +# CONFIG_AD5624R_SPI is not set +# CONFIG_LTC2688 is not set +# CONFIG_AD5686_SPI is not set +# CONFIG_AD5696_I2C is not set +# CONFIG_AD5755 is not set +# CONFIG_AD5758 is not set +# CONFIG_AD5761 is not set +# CONFIG_AD5764 is not set +# CONFIG_AD5766 is not set +# CONFIG_AD5770R is not set +# CONFIG_AD5791 is not set +# CONFIG_AD7293 is not set +# CONFIG_AD7303 is not set +# CONFIG_AD8801 is not set +# CONFIG_DPOT_DAC is not set +# CONFIG_DS4424 is not set +# CONFIG_LTC1660 is not set +# CONFIG_LTC2632 is not set +# CONFIG_M62332 is not set +# CONFIG_MAX517 is not set +# CONFIG_MAX5522 is not set +# CONFIG_MAX5821 is not set +# CONFIG_MCP4725 is not set +# CONFIG_MCP4728 is not set +# CONFIG_MCP4922 is not set +# CONFIG_TI_DAC082S085 is not set +# CONFIG_TI_DAC5571 is not set +# CONFIG_TI_DAC7311 is not set +# CONFIG_TI_DAC7612 is not set +# CONFIG_VF610_DAC is not set +# end of Digital to analog converters + +# +# IIO dummy driver +# +# end of IIO dummy driver + +# +# Filters +# +# CONFIG_ADMV8818 is not set +# end of Filters + +# +# Frequency Synthesizers DDS/PLL +# + +# +# Clock Generator/Distribution +# +# CONFIG_AD9523 is not set +# end of Clock Generator/Distribution + +# +# Phase-Locked Loop (PLL) frequency synthesizers +# +# CONFIG_ADF4350 is not set +# CONFIG_ADF4371 is not set +# CONFIG_ADF4377 is not set +# CONFIG_ADMV1013 is not set +# CONFIG_ADMV1014 is not set +# CONFIG_ADMV4420 is not set +# CONFIG_ADRF6780 is not set +# end of Phase-Locked Loop (PLL) frequency synthesizers +# end of Frequency Synthesizers DDS/PLL + +# +# Digital gyroscope sensors +# +# CONFIG_ADIS16080 is not set +# CONFIG_ADIS16130 is not set +# CONFIG_ADIS16136 is not set +# CONFIG_ADIS16260 is not set +# CONFIG_ADXRS290 is not set +# CONFIG_ADXRS450 is not set +# CONFIG_BMG160 is not set +# CONFIG_FXAS21002C is not set +CONFIG_HID_SENSOR_GYRO_3D=m +# CONFIG_MPU3050_I2C is not set +# CONFIG_IIO_ST_GYRO_3AXIS is not set +# CONFIG_ITG3200 is not set +# end of Digital gyroscope sensors + +# +# Health Sensors +# + +# +# Heart Rate Monitors +# +# CONFIG_AFE4403 is not set +# CONFIG_AFE4404 is not set +# CONFIG_MAX30100 is not set +# CONFIG_MAX30102 is not set +# end of Heart Rate Monitors +# end of Health Sensors + +# +# Humidity sensors +# +# CONFIG_AM2315 is not set +CONFIG_DHT11=m +# CONFIG_HDC100X is not set +# CONFIG_HDC2010 is not set +# CONFIG_HID_SENSOR_HUMIDITY is not set +# CONFIG_HTS221 is not set +# CONFIG_HTU21 is not set +# CONFIG_SI7005 is not set +# CONFIG_SI7020 is not set +# end of Humidity sensors + +# +# Inertial measurement units +# +# CONFIG_ADIS16400 is not set +# CONFIG_ADIS16460 is not set +# CONFIG_ADIS16475 is not set +# CONFIG_ADIS16480 is not set +# CONFIG_BMI160_I2C is not set +# CONFIG_BMI160_SPI is not set +# CONFIG_BOSCH_BNO055_SERIAL is not set +# CONFIG_BOSCH_BNO055_I2C is not set +# CONFIG_FXOS8700_I2C is not set +# CONFIG_FXOS8700_SPI is not set +# CONFIG_KMX61 is not set +# CONFIG_INV_ICM42600_I2C is not set +# CONFIG_INV_ICM42600_SPI is not set +# CONFIG_INV_MPU6050_I2C is not set +# CONFIG_INV_MPU6050_SPI is not set +# CONFIG_IIO_ST_LSM6DSX is not set +# CONFIG_IIO_ST_LSM9DS0 is not set +# end of Inertial measurement units + +# +# Light sensors +# +CONFIG_ACPI_ALS=m +# CONFIG_ADJD_S311 is not set +# CONFIG_ADUX1020 is not set +# CONFIG_AL3010 is not set +# CONFIG_AL3320A is not set +# CONFIG_APDS9300 is not set +# CONFIG_APDS9960 is not set +# CONFIG_AS73211 is not set +# CONFIG_BH1750 is not set +CONFIG_BH1780=m +# CONFIG_CM32181 is not set +# CONFIG_CM3232 is not set +# CONFIG_CM3323 is not set +# CONFIG_CM3605 is not set +# CONFIG_CM36651 is not set +CONFIG_IIO_CROS_EC_LIGHT_PROX=m +# CONFIG_GP2AP002 is not set +# CONFIG_GP2AP020A00F is not set +# CONFIG_SENSORS_ISL29018 is not set +# CONFIG_SENSORS_ISL29028 is not set +# CONFIG_ISL29125 is not set +CONFIG_HID_SENSOR_ALS=m +CONFIG_HID_SENSOR_PROX=m +# CONFIG_JSA1212 is not set +# CONFIG_ROHM_BU27008 is not set +# CONFIG_ROHM_BU27034 is not set +# CONFIG_RPR0521 is not set +# CONFIG_LTR501 is not set +# CONFIG_LTRF216A is not set +# CONFIG_LV0104CS is not set +# CONFIG_MAX44000 is not set +# CONFIG_MAX44009 is not set +# CONFIG_NOA1305 is not set +# CONFIG_OPT3001 is not set +# CONFIG_OPT4001 is not set +# CONFIG_PA12203001 is not set +# CONFIG_SI1133 is not set +# CONFIG_SI1145 is not set +# CONFIG_STK3310 is not set +# CONFIG_ST_UVIS25 is not set +# CONFIG_TCS3414 is not set +# CONFIG_TCS3472 is not set +# CONFIG_SENSORS_TSL2563 is not set +# CONFIG_TSL2583 is not set +# CONFIG_TSL2591 is not set +# CONFIG_TSL2772 is not set +# CONFIG_TSL4531 is not set +# CONFIG_US5182D is not set +# CONFIG_VCNL4000 is not set +# CONFIG_VCNL4035 is not set +# CONFIG_VEML6030 is not set +# CONFIG_VEML6070 is not set +# CONFIG_VL6180 is not set +# CONFIG_ZOPT2201 is not set +# end of Light sensors + +# +# Magnetometer sensors +# +# CONFIG_AK8974 is not set +# CONFIG_AK8975 is not set +# CONFIG_AK09911 is not set +# CONFIG_BMC150_MAGN_I2C is not set +# CONFIG_BMC150_MAGN_SPI is not set +# CONFIG_MAG3110 is not set +CONFIG_HID_SENSOR_MAGNETOMETER_3D=m +# CONFIG_MMC35240 is not set +# CONFIG_IIO_ST_MAGN_3AXIS is not set +# CONFIG_SENSORS_HMC5843_I2C is not set +# CONFIG_SENSORS_HMC5843_SPI is not set +# CONFIG_SENSORS_RM3100_I2C is not set +# CONFIG_SENSORS_RM3100_SPI is not set +# CONFIG_TI_TMAG5273 is not set +# CONFIG_YAMAHA_YAS530 is not set +# end of Magnetometer sensors + +# +# Multiplexers +# +# CONFIG_IIO_MUX is not set +# end of Multiplexers + +# +# Inclinometer sensors +# +CONFIG_HID_SENSOR_INCLINOMETER_3D=m +CONFIG_HID_SENSOR_DEVICE_ROTATION=m +# end of Inclinometer sensors + +# +# Triggers - standalone +# +# CONFIG_IIO_INTERRUPT_TRIGGER is not set +# CONFIG_IIO_SYSFS_TRIGGER is not set +# end of Triggers - standalone + +# +# Linear and angular position sensors +# +# CONFIG_HID_SENSOR_CUSTOM_INTEL_HINGE is not set +# end of Linear and angular position sensors + +# +# Digital potentiometers +# +# CONFIG_AD5110 is not set +# CONFIG_AD5272 is not set +# CONFIG_DS1803 is not set +# CONFIG_MAX5432 is not set +# CONFIG_MAX5481 is not set +# CONFIG_MAX5487 is not set +# CONFIG_MCP4018 is not set +# CONFIG_MCP4131 is not set +# CONFIG_MCP4531 is not set +# CONFIG_MCP41010 is not set +# CONFIG_TPL0102 is not set +# CONFIG_X9250 is not set +# end of Digital potentiometers + +# +# Digital potentiostats +# +# CONFIG_LMP91000 is not set +# end of Digital potentiostats + +# +# Pressure sensors +# +# CONFIG_ABP060MG is not set +# CONFIG_BMP280 is not set +CONFIG_IIO_CROS_EC_BARO=m +# CONFIG_DLHL60D is not set +# CONFIG_DPS310 is not set +CONFIG_HID_SENSOR_PRESS=m +# CONFIG_HP03 is not set +# CONFIG_ICP10100 is not set +# CONFIG_MPL115_I2C is not set +# CONFIG_MPL115_SPI is not set +# CONFIG_MPL3115 is not set +# CONFIG_MPRLS0025PA is not set +# CONFIG_MS5611 is not set +# CONFIG_MS5637 is not set +# CONFIG_IIO_ST_PRESS is not set +# CONFIG_T5403 is not set +# CONFIG_HP206C is not set +# CONFIG_ZPA2326 is not set +# end of Pressure sensors + +# +# Lightning sensors +# +# CONFIG_AS3935 is not set +# end of Lightning sensors + +# +# Proximity and distance sensors +# +# CONFIG_CROS_EC_MKBP_PROXIMITY is not set +# CONFIG_IRSD200 is not set +# CONFIG_ISL29501 is not set +# CONFIG_LIDAR_LITE_V2 is not set +# CONFIG_MB1232 is not set +# CONFIG_PING is not set +# CONFIG_RFD77402 is not set +# CONFIG_SRF04 is not set +# CONFIG_SX9310 is not set +# CONFIG_SX9324 is not set +# CONFIG_SX9360 is not set +# CONFIG_SX9500 is not set +# CONFIG_SRF08 is not set +# CONFIG_VCNL3020 is not set +# CONFIG_VL53L0X_I2C is not set +# end of Proximity and distance sensors + +# +# Resolver to digital converters +# +# CONFIG_AD2S90 is not set +# CONFIG_AD2S1200 is not set +# end of Resolver to digital converters + +# +# Temperature sensors +# +# CONFIG_LTC2983 is not set +# CONFIG_MAXIM_THERMOCOUPLE is not set +# CONFIG_HID_SENSOR_TEMP is not set +# CONFIG_MLX90614 is not set +# CONFIG_MLX90632 is not set +# CONFIG_TMP006 is not set +# CONFIG_TMP007 is not set +# CONFIG_TMP117 is not set +# CONFIG_TSYS01 is not set +# CONFIG_TSYS02D is not set +# CONFIG_MAX30208 is not set +# CONFIG_MAX31856 is not set +# CONFIG_MAX31865 is not set +# end of Temperature sensors + +# CONFIG_NTB is not set +CONFIG_PWM=y +CONFIG_PWM_SYSFS=y +# CONFIG_PWM_DEBUG is not set +# CONFIG_PWM_ATMEL_TCB is not set +# CONFIG_PWM_CLK is not set +CONFIG_PWM_CROS_EC=m +# CONFIG_PWM_DWC is not set +# CONFIG_PWM_FSL_FTM is not set +# CONFIG_PWM_HIBVT is not set +CONFIG_PWM_MESON=m +# CONFIG_PWM_PCA9685 is not set +CONFIG_PWM_ROCKCHIP=m +CONFIG_PWM_SUN4I=m +CONFIG_PWM_TEGRA=m +# CONFIG_PWM_XILINX is not set + +# +# IRQ chip support +# +CONFIG_IRQCHIP=y +CONFIG_ARM_GIC=y +CONFIG_ARM_GIC_PM=y +CONFIG_ARM_GIC_MAX_NR=1 +CONFIG_ARM_GIC_V2M=y +CONFIG_ARM_GIC_V3=y +CONFIG_ARM_GIC_V3_ITS=y +CONFIG_ARM_GIC_V3_ITS_PCI=y +# CONFIG_AL_FIC is not set +CONFIG_HISILICON_IRQ_MBIGEN=y +CONFIG_SUN6I_R_INTC=y +CONFIG_SUNXI_NMI_INTC=y +# CONFIG_XILINX_INTC is not set +CONFIG_MVEBU_GICP=y +CONFIG_MVEBU_ICU=y +CONFIG_MVEBU_ODMI=y +CONFIG_MVEBU_PIC=y +CONFIG_MVEBU_SEI=y +CONFIG_PARTITION_PERCPU=y +CONFIG_QCOM_IRQ_COMBINER=y +CONFIG_MESON_IRQ_GPIO=y +# CONFIG_QCOM_PDC is not set +# CONFIG_QCOM_MPM is not set +# end of IRQ chip support + +# CONFIG_IPACK_BUS is not set +CONFIG_ARCH_HAS_RESET_CONTROLLER=y +CONFIG_RESET_CONTROLLER=y +CONFIG_RESET_MESON=y +# CONFIG_RESET_MESON_AUDIO_ARB is not set +# CONFIG_RESET_QCOM_AOSS is not set +# CONFIG_RESET_QCOM_PDC is not set +CONFIG_RESET_SIMPLE=y +CONFIG_RESET_SUNXI=y +# CONFIG_RESET_TI_SYSCON is not set +# CONFIG_RESET_TI_TPS380X is not set +CONFIG_COMMON_RESET_HI3660=y +CONFIG_COMMON_RESET_HI6220=y + +# +# PHY Subsystem +# +CONFIG_GENERIC_PHY=y +CONFIG_GENERIC_PHY_MIPI_DPHY=y +CONFIG_PHY_XGENE=m +# CONFIG_PHY_CAN_TRANSCEIVER is not set +CONFIG_PHY_SUN4I_USB=m +# CONFIG_PHY_SUN6I_MIPI_DPHY is not set +# CONFIG_PHY_SUN9I_USB is not set +# CONFIG_PHY_SUN50I_USB3 is not set +CONFIG_PHY_MESON8B_USB2=m +CONFIG_PHY_MESON_GXL_USB2=y +CONFIG_PHY_MESON_G12A_MIPI_DPHY_ANALOG=y +CONFIG_PHY_MESON_G12A_USB2=y +CONFIG_PHY_MESON_G12A_USB3_PCIE=y +CONFIG_PHY_MESON_AXG_PCIE=y +CONFIG_PHY_MESON_AXG_MIPI_PCIE_ANALOG=y +CONFIG_PHY_MESON_AXG_MIPI_DPHY=y + +# +# PHY drivers for Broadcom platforms +# +# CONFIG_BCM_KONA_USB2_PHY is not set +# end of PHY drivers for Broadcom platforms + +# CONFIG_PHY_CADENCE_TORRENT is not set +# CONFIG_PHY_CADENCE_DPHY is not set +# CONFIG_PHY_CADENCE_DPHY_RX is not set +# CONFIG_PHY_CADENCE_SIERRA is not set +# CONFIG_PHY_CADENCE_SALVO is not set +CONFIG_PHY_HI6220_USB=m +# CONFIG_PHY_HI3660_USB is not set +# CONFIG_PHY_HI3670_USB is not set +# CONFIG_PHY_HI3670_PCIE is not set +# CONFIG_PHY_HISTB_COMBPHY is not set +# CONFIG_PHY_HISI_INNO_USB2 is not set +CONFIG_PHY_MVEBU_A3700_COMPHY=y +CONFIG_PHY_MVEBU_A3700_UTMI=y +# CONFIG_PHY_MVEBU_A38X_COMPHY is not set +CONFIG_PHY_MVEBU_CP110_COMPHY=m +# CONFIG_PHY_MVEBU_CP110_UTMI is not set +# CONFIG_PHY_PXA_28NM_HSIC is not set +# CONFIG_PHY_PXA_28NM_USB2 is not set +# CONFIG_PHY_LAN966X_SERDES is not set +# CONFIG_PHY_CPCAP_USB is not set +# CONFIG_PHY_MAPPHONE_MDM6600 is not set +# CONFIG_PHY_OCELOT_SERDES is not set +CONFIG_PHY_QCOM_APQ8064_SATA=m +# CONFIG_PHY_QCOM_EDP is not set +# CONFIG_PHY_QCOM_IPQ4019_USB is not set +CONFIG_PHY_QCOM_IPQ806X_SATA=m +# CONFIG_PHY_QCOM_PCIE2 is not set +CONFIG_PHY_QCOM_QMP=m +CONFIG_PHY_QCOM_QMP_COMBO=m +CONFIG_PHY_QCOM_QMP_PCIE=m +CONFIG_PHY_QCOM_QMP_PCIE_8996=m +CONFIG_PHY_QCOM_QMP_UFS=m +CONFIG_PHY_QCOM_QMP_USB=m +# CONFIG_PHY_QCOM_QMP_USB_LEGACY is not set +CONFIG_PHY_QCOM_QUSB2=m +# CONFIG_PHY_QCOM_SNPS_EUSB2 is not set +# CONFIG_PHY_QCOM_EUSB2_REPEATER is not set +# CONFIG_PHY_QCOM_M31_USB is not set +CONFIG_PHY_QCOM_USB_HS=m +# CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2 is not set +CONFIG_PHY_QCOM_USB_HSIC=m +# CONFIG_PHY_QCOM_USB_HS_28NM is not set +# CONFIG_PHY_QCOM_USB_SS is not set +# CONFIG_PHY_QCOM_IPQ806X_USB is not set +# CONFIG_PHY_QCOM_SGMII_ETH is not set +CONFIG_PHY_ROCKCHIP_DP=m +# CONFIG_PHY_ROCKCHIP_DPHY_RX0 is not set +CONFIG_PHY_ROCKCHIP_EMMC=m +CONFIG_PHY_ROCKCHIP_INNO_HDMI=m +CONFIG_PHY_ROCKCHIP_INNO_USB2=m +# CONFIG_PHY_ROCKCHIP_INNO_CSIDPHY is not set +# CONFIG_PHY_ROCKCHIP_INNO_DSIDPHY is not set +# CONFIG_PHY_ROCKCHIP_NANENG_COMBO_PHY is not set +CONFIG_PHY_ROCKCHIP_PCIE=m +# CONFIG_PHY_ROCKCHIP_SNPS_PCIE3 is not set +CONFIG_PHY_ROCKCHIP_TYPEC=m +CONFIG_PHY_ROCKCHIP_USB=m +# CONFIG_PHY_SAMSUNG_USB2 is not set +CONFIG_PHY_TEGRA_XUSB=m +# CONFIG_PHY_TUSB1210 is not set +# CONFIG_PHY_XILINX_ZYNQMP is not set +# end of PHY Subsystem + +# CONFIG_POWERCAP is not set +# CONFIG_MCB is not set + +# +# Performance monitor support +# +CONFIG_ARM_CCI_PMU=y +CONFIG_ARM_CCI400_PMU=y +CONFIG_ARM_CCI5xx_PMU=y +CONFIG_ARM_CCN=y +CONFIG_ARM_CMN=m +CONFIG_ARM_PMU=y +CONFIG_ARM_PMU_ACPI=y +CONFIG_ARM_SMMU_V3_PMU=y +CONFIG_ARM_PMUV3=y +CONFIG_ARM_DSU_PMU=m +CONFIG_QCOM_L2_PMU=y +CONFIG_QCOM_L3_PMU=y +CONFIG_THUNDERX2_PMU=m +CONFIG_XGENE_PMU=y +CONFIG_ARM_SPE_PMU=m +CONFIG_ARM_DMC620_PMU=y +# CONFIG_MARVELL_CN10K_TAD_PMU is not set +# CONFIG_ALIBABA_UNCORE_DRW_PMU is not set +CONFIG_HISI_PMU=y +# CONFIG_HISI_PCIE_PMU is not set +# CONFIG_HNS3_PMU is not set +# CONFIG_MARVELL_CN10K_DDR_PMU is not set +CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m +CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m +# CONFIG_MESON_DDR_PMU is not set +# end of Performance monitor support + +CONFIG_RAS=y +# CONFIG_USB4 is not set + +# +# Android +# +# CONFIG_ANDROID_BINDER_IPC is not set +# end of Android + +CONFIG_LIBNVDIMM=m +CONFIG_BLK_DEV_PMEM=m +CONFIG_ND_CLAIM=y +CONFIG_ND_BTT=m +CONFIG_BTT=y +CONFIG_OF_PMEM=m +CONFIG_DAX=y +CONFIG_DEV_DAX=m +CONFIG_DEV_DAX_KMEM=m +CONFIG_NVMEM=y +CONFIG_NVMEM_SYSFS=y + +# +# Layout Types +# +# CONFIG_NVMEM_LAYOUT_SL28_VPD is not set +# CONFIG_NVMEM_LAYOUT_ONIE_TLV is not set +# end of Layout Types + +# CONFIG_NVMEM_MESON_MX_EFUSE is not set +# CONFIG_NVMEM_QCOM_QFPROM is not set +# CONFIG_NVMEM_QCOM_SEC_QFPROM is not set +# CONFIG_NVMEM_RMEM is not set +# CONFIG_NVMEM_ROCKCHIP_EFUSE is not set +# CONFIG_NVMEM_ROCKCHIP_OTP is not set +# CONFIG_NVMEM_SPMI_SDAM is not set +CONFIG_NVMEM_SUNXI_SID=m +# CONFIG_NVMEM_U_BOOT_ENV is not set +# CONFIG_NVMEM_ZYNQMP is not set + +# +# HW tracing support +# +# CONFIG_STM is not set +# CONFIG_INTEL_TH is not set +# CONFIG_HISI_PTT is not set +# end of HW tracing support + +# CONFIG_FPGA is not set +# CONFIG_FSI is not set +CONFIG_TEE=m +CONFIG_OPTEE=m +# CONFIG_OPTEE_INSECURE_LOAD_IMAGE is not set +CONFIG_PM_OPP=y +# CONFIG_SIOX is not set +# CONFIG_SLIMBUS is not set +CONFIG_INTERCONNECT=y +# CONFIG_INTERCONNECT_QCOM is not set +# CONFIG_COUNTER is not set +# CONFIG_MOST is not set +# CONFIG_PECI is not set +# CONFIG_HTE is not set +# CONFIG_CDX_BUS is not set +# end of Device Drivers + +# +# File systems +# +CONFIG_DCACHE_WORD_ACCESS=y +# CONFIG_VALIDATE_FS_PARSER is not set +CONFIG_FS_IOMAP=y +CONFIG_BUFFER_HEAD=y +CONFIG_LEGACY_DIRECT_IO=y +# CONFIG_EXT2_FS is not set +# CONFIG_EXT3_FS is not set +CONFIG_EXT4_FS=m +CONFIG_EXT4_USE_FOR_EXT2=y +CONFIG_EXT4_FS_POSIX_ACL=y +CONFIG_EXT4_FS_SECURITY=y +# CONFIG_EXT4_DEBUG is not set +CONFIG_JBD2=m +# CONFIG_JBD2_DEBUG is not set +CONFIG_FS_MBCACHE=m +CONFIG_REISERFS_FS=m +# CONFIG_REISERFS_CHECK is not set +# CONFIG_REISERFS_PROC_INFO is not set +CONFIG_REISERFS_FS_XATTR=y +CONFIG_REISERFS_FS_POSIX_ACL=y +CONFIG_REISERFS_FS_SECURITY=y +CONFIG_JFS_FS=m +CONFIG_JFS_POSIX_ACL=y +CONFIG_JFS_SECURITY=y +# CONFIG_JFS_DEBUG is not set +# CONFIG_JFS_STATISTICS is not set +CONFIG_XFS_FS=m +CONFIG_XFS_SUPPORT_V4=y +CONFIG_XFS_SUPPORT_ASCII_CI=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_XFS_RT=y +# CONFIG_XFS_ONLINE_SCRUB is not set +# CONFIG_XFS_WARN is not set +# CONFIG_XFS_DEBUG is not set +CONFIG_GFS2_FS=m +CONFIG_GFS2_FS_LOCKING_DLM=y +CONFIG_OCFS2_FS=m +CONFIG_OCFS2_FS_O2CB=m +CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m +CONFIG_OCFS2_FS_STATS=y +CONFIG_OCFS2_DEBUG_MASKLOG=y +# CONFIG_OCFS2_DEBUG_FS is not set +CONFIG_BTRFS_FS=m +CONFIG_BTRFS_FS_POSIX_ACL=y +# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set +# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set +# CONFIG_BTRFS_DEBUG is not set +# CONFIG_BTRFS_ASSERT is not set +# CONFIG_BTRFS_FS_REF_VERIFY is not set +CONFIG_NILFS2_FS=m +CONFIG_F2FS_FS=m +CONFIG_F2FS_STAT_FS=y +CONFIG_F2FS_FS_XATTR=y +CONFIG_F2FS_FS_POSIX_ACL=y +CONFIG_F2FS_FS_SECURITY=y +# CONFIG_F2FS_CHECK_FS is not set +# CONFIG_F2FS_FAULT_INJECTION is not set +# CONFIG_F2FS_FS_COMPRESSION is not set +CONFIG_F2FS_IOSTAT=y +# CONFIG_F2FS_UNFAIR_RWSEM is not set +# CONFIG_ZONEFS_FS is not set +CONFIG_FS_POSIX_ACL=y +CONFIG_EXPORTFS=y +CONFIG_EXPORTFS_BLOCK_OPS=y +CONFIG_FILE_LOCKING=y +# CONFIG_FS_ENCRYPTION is not set +# CONFIG_FS_VERITY is not set +CONFIG_FSNOTIFY=y +CONFIG_DNOTIFY=y +CONFIG_INOTIFY_USER=y +CONFIG_FANOTIFY=y +CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y +CONFIG_QUOTA=y +CONFIG_QUOTA_NETLINK_INTERFACE=y +# CONFIG_QUOTA_DEBUG is not set +CONFIG_QUOTA_TREE=m +CONFIG_QFMT_V1=m +CONFIG_QFMT_V2=m +CONFIG_QUOTACTL=y +CONFIG_AUTOFS_FS=m +CONFIG_FUSE_FS=m +CONFIG_CUSE=m +# CONFIG_VIRTIO_FS is not set +CONFIG_OVERLAY_FS=m +# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set +CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y +# CONFIG_OVERLAY_FS_INDEX is not set +# CONFIG_OVERLAY_FS_XINO_AUTO is not set +# CONFIG_OVERLAY_FS_METACOPY is not set +# CONFIG_OVERLAY_FS_DEBUG is not set + +# +# Caches +# +CONFIG_NETFS_SUPPORT=m +CONFIG_NETFS_STATS=y +CONFIG_FSCACHE=m +CONFIG_FSCACHE_STATS=y +# CONFIG_FSCACHE_DEBUG is not set +CONFIG_CACHEFILES=m +# CONFIG_CACHEFILES_DEBUG is not set +# CONFIG_CACHEFILES_ERROR_INJECTION is not set +# CONFIG_CACHEFILES_ONDEMAND is not set +# end of Caches + +# +# CD-ROM/DVD Filesystems +# +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +CONFIG_ZISOFS=y +CONFIG_UDF_FS=m +# end of CD-ROM/DVD Filesystems + +# +# DOS/FAT/EXFAT/NT Filesystems +# +CONFIG_FAT_FS=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=y +CONFIG_FAT_DEFAULT_CODEPAGE=437 +CONFIG_FAT_DEFAULT_IOCHARSET="ascii" +CONFIG_FAT_DEFAULT_UTF8=y +# CONFIG_EXFAT_FS is not set +CONFIG_NTFS_FS=m +# CONFIG_NTFS_DEBUG is not set +# CONFIG_NTFS3_FS is not set +# end of DOS/FAT/EXFAT/NT Filesystems + +# +# Pseudo filesystems +# +CONFIG_PROC_FS=y +CONFIG_PROC_KCORE=y +CONFIG_PROC_VMCORE=y +# CONFIG_PROC_VMCORE_DEVICE_DUMP is not set +CONFIG_PROC_SYSCTL=y +CONFIG_PROC_PAGE_MONITOR=y +CONFIG_PROC_CHILDREN=y +CONFIG_KERNFS=y +CONFIG_SYSFS=y +CONFIG_TMPFS=y +CONFIG_TMPFS_POSIX_ACL=y +CONFIG_TMPFS_XATTR=y +# CONFIG_TMPFS_INODE64 is not set +# CONFIG_TMPFS_QUOTA is not set +CONFIG_ARCH_SUPPORTS_HUGETLBFS=y +CONFIG_HUGETLBFS=y +CONFIG_HUGETLB_PAGE=y +CONFIG_ARCH_HAS_GIGANTIC_PAGE=y +CONFIG_CONFIGFS_FS=m +CONFIG_EFIVAR_FS=m +# end of Pseudo filesystems + +CONFIG_MISC_FILESYSTEMS=y +CONFIG_ORANGEFS_FS=m +CONFIG_ADFS_FS=m +# CONFIG_ADFS_FS_RW is not set +CONFIG_AFFS_FS=m +CONFIG_ECRYPT_FS=m +CONFIG_ECRYPT_FS_MESSAGING=y +CONFIG_HFS_FS=m +CONFIG_HFSPLUS_FS=m +CONFIG_BEFS_FS=m +# CONFIG_BEFS_DEBUG is not set +CONFIG_BFS_FS=m +CONFIG_EFS_FS=m +CONFIG_JFFS2_FS=m +CONFIG_JFFS2_FS_DEBUG=0 +CONFIG_JFFS2_FS_WRITEBUFFER=y +# CONFIG_JFFS2_FS_WBUF_VERIFY is not set +CONFIG_JFFS2_SUMMARY=y +CONFIG_JFFS2_FS_XATTR=y +CONFIG_JFFS2_FS_POSIX_ACL=y +CONFIG_JFFS2_FS_SECURITY=y +CONFIG_JFFS2_COMPRESSION_OPTIONS=y +CONFIG_JFFS2_ZLIB=y +CONFIG_JFFS2_LZO=y +CONFIG_JFFS2_RTIME=y +# CONFIG_JFFS2_RUBIN is not set +# CONFIG_JFFS2_CMODE_NONE is not set +CONFIG_JFFS2_CMODE_PRIORITY=y +# CONFIG_JFFS2_CMODE_SIZE is not set +# CONFIG_JFFS2_CMODE_FAVOURLZO is not set +CONFIG_UBIFS_FS=m +CONFIG_UBIFS_FS_ADVANCED_COMPR=y +CONFIG_UBIFS_FS_LZO=y +CONFIG_UBIFS_FS_ZLIB=y +CONFIG_UBIFS_FS_ZSTD=y +# CONFIG_UBIFS_ATIME_SUPPORT is not set +CONFIG_UBIFS_FS_XATTR=y +CONFIG_UBIFS_FS_SECURITY=y +# CONFIG_UBIFS_FS_AUTHENTICATION is not set +# CONFIG_CRAMFS is not set +CONFIG_SQUASHFS=m +CONFIG_SQUASHFS_FILE_CACHE=y +# CONFIG_SQUASHFS_FILE_DIRECT is not set +CONFIG_SQUASHFS_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_CHOICE_DECOMP_BY_MOUNT is not set +CONFIG_SQUASHFS_COMPILE_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI is not set +# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI_PERCPU is not set +CONFIG_SQUASHFS_XATTR=y +CONFIG_SQUASHFS_ZLIB=y +CONFIG_SQUASHFS_LZ4=y +CONFIG_SQUASHFS_LZO=y +CONFIG_SQUASHFS_XZ=y +CONFIG_SQUASHFS_ZSTD=y +# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set +# CONFIG_SQUASHFS_EMBEDDED is not set +CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3 +CONFIG_VXFS_FS=m +CONFIG_MINIX_FS=m +CONFIG_OMFS_FS=m +CONFIG_HPFS_FS=m +CONFIG_QNX4FS_FS=m +CONFIG_QNX6FS_FS=m +# CONFIG_QNX6FS_DEBUG is not set +CONFIG_ROMFS_FS=m +# CONFIG_ROMFS_BACKED_BY_BLOCK is not set +# CONFIG_ROMFS_BACKED_BY_MTD is not set +CONFIG_ROMFS_BACKED_BY_BOTH=y +CONFIG_ROMFS_ON_BLOCK=y +CONFIG_ROMFS_ON_MTD=y +CONFIG_PSTORE=y +CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240 +CONFIG_PSTORE_COMPRESS=y +# CONFIG_PSTORE_CONSOLE is not set +# CONFIG_PSTORE_PMSG is not set +# CONFIG_PSTORE_FTRACE is not set +CONFIG_PSTORE_RAM=m +# CONFIG_PSTORE_BLK is not set +CONFIG_SYSV_FS=m +CONFIG_UFS_FS=m +# CONFIG_UFS_FS_WRITE is not set +# CONFIG_UFS_DEBUG is not set +# CONFIG_EROFS_FS is not set +CONFIG_NETWORK_FILESYSTEMS=y +CONFIG_NFS_FS=m +CONFIG_NFS_V2=m +CONFIG_NFS_V3=m +CONFIG_NFS_V3_ACL=y +CONFIG_NFS_V4=m +CONFIG_NFS_SWAP=y +CONFIG_NFS_V4_1=y +CONFIG_NFS_V4_2=y +CONFIG_PNFS_FILE_LAYOUT=m +CONFIG_PNFS_BLOCK=m +CONFIG_PNFS_FLEXFILE_LAYOUT=m +CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org" +# CONFIG_NFS_V4_1_MIGRATION is not set +CONFIG_NFS_V4_SECURITY_LABEL=y +CONFIG_NFS_FSCACHE=y +# CONFIG_NFS_USE_LEGACY_DNS is not set +CONFIG_NFS_USE_KERNEL_DNS=y +CONFIG_NFS_DEBUG=y +CONFIG_NFS_DISABLE_UDP_SUPPORT=y +# CONFIG_NFS_V4_2_READ_PLUS is not set +CONFIG_NFSD=m +# CONFIG_NFSD_V2 is not set +CONFIG_NFSD_V3_ACL=y +CONFIG_NFSD_V4=y +CONFIG_NFSD_PNFS=y +CONFIG_NFSD_BLOCKLAYOUT=y +# CONFIG_NFSD_SCSILAYOUT is not set +# CONFIG_NFSD_FLEXFILELAYOUT is not set +# CONFIG_NFSD_V4_2_INTER_SSC is not set +CONFIG_NFSD_V4_SECURITY_LABEL=y +CONFIG_GRACE_PERIOD=m +CONFIG_LOCKD=m +CONFIG_LOCKD_V4=y +CONFIG_NFS_ACL_SUPPORT=m +CONFIG_NFS_COMMON=y +CONFIG_NFS_V4_2_SSC_HELPER=y +CONFIG_SUNRPC=m +CONFIG_SUNRPC_GSS=m +CONFIG_SUNRPC_BACKCHANNEL=y +CONFIG_SUNRPC_SWAP=y +CONFIG_RPCSEC_GSS_KRB5=m +CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y +# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_CAMELLIA is not set +# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2 is not set +CONFIG_SUNRPC_DEBUG=y +CONFIG_SUNRPC_XPRT_RDMA=m +CONFIG_CEPH_FS=m +CONFIG_CEPH_FSCACHE=y +CONFIG_CEPH_FS_POSIX_ACL=y +# CONFIG_CEPH_FS_SECURITY_LABEL is not set +CONFIG_CIFS=m +# CONFIG_CIFS_STATS2 is not set +CONFIG_CIFS_ALLOW_INSECURE_LEGACY=y +CONFIG_CIFS_UPCALL=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_CIFS_DEBUG=y +# CONFIG_CIFS_DEBUG2 is not set +# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set +CONFIG_CIFS_DFS_UPCALL=y +# CONFIG_CIFS_SWN_UPCALL is not set +# CONFIG_CIFS_SMB_DIRECT is not set +CONFIG_CIFS_FSCACHE=y +# CONFIG_SMB_SERVER is not set +CONFIG_SMBFS=m +CONFIG_CODA_FS=m +CONFIG_AFS_FS=m +# CONFIG_AFS_DEBUG is not set +CONFIG_AFS_FSCACHE=y +# CONFIG_AFS_DEBUG_CURSOR is not set +CONFIG_9P_FS=m +CONFIG_9P_FSCACHE=y +CONFIG_9P_FS_POSIX_ACL=y +CONFIG_9P_FS_SECURITY=y +CONFIG_NLS=y +CONFIG_NLS_DEFAULT="utf8" +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_CODEPAGE_737=m +CONFIG_NLS_CODEPAGE_775=m +CONFIG_NLS_CODEPAGE_850=m +CONFIG_NLS_CODEPAGE_852=m +CONFIG_NLS_CODEPAGE_855=m +CONFIG_NLS_CODEPAGE_857=m +CONFIG_NLS_CODEPAGE_860=m +CONFIG_NLS_CODEPAGE_861=m +CONFIG_NLS_CODEPAGE_862=m +CONFIG_NLS_CODEPAGE_863=m +CONFIG_NLS_CODEPAGE_864=m +CONFIG_NLS_CODEPAGE_865=m +CONFIG_NLS_CODEPAGE_866=m +CONFIG_NLS_CODEPAGE_869=m +CONFIG_NLS_CODEPAGE_936=m +CONFIG_NLS_CODEPAGE_950=m +CONFIG_NLS_CODEPAGE_932=m +CONFIG_NLS_CODEPAGE_949=m +CONFIG_NLS_CODEPAGE_874=m +CONFIG_NLS_ISO8859_8=m +CONFIG_NLS_CODEPAGE_1250=m +CONFIG_NLS_CODEPAGE_1251=m +CONFIG_NLS_ASCII=m +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +CONFIG_NLS_KOI8_R=m +CONFIG_NLS_KOI8_U=m +CONFIG_NLS_MAC_ROMAN=m +CONFIG_NLS_MAC_CELTIC=m +CONFIG_NLS_MAC_CENTEURO=m +CONFIG_NLS_MAC_CROATIAN=m +CONFIG_NLS_MAC_CYRILLIC=m +CONFIG_NLS_MAC_GAELIC=m +CONFIG_NLS_MAC_GREEK=m +CONFIG_NLS_MAC_ICELAND=m +CONFIG_NLS_MAC_INUIT=m +CONFIG_NLS_MAC_ROMANIAN=m +CONFIG_NLS_MAC_TURKISH=m +CONFIG_NLS_UTF8=m +CONFIG_NLS_UCS2_UTILS=m +CONFIG_DLM=m +CONFIG_DLM_DEBUG=y +# CONFIG_UNICODE is not set +CONFIG_IO_WQ=y +# end of File systems + +# +# Security options +# +CONFIG_KEYS=y +# CONFIG_KEYS_REQUEST_CACHE is not set +# CONFIG_PERSISTENT_KEYRINGS is not set +# CONFIG_TRUSTED_KEYS is not set +# CONFIG_ENCRYPTED_KEYS is not set +CONFIG_KEY_DH_OPERATIONS=y +CONFIG_SECURITY_DMESG_RESTRICT=y +CONFIG_SECURITY=y +CONFIG_SECURITYFS=y +CONFIG_SECURITY_NETWORK=y +# CONFIG_SECURITY_INFINIBAND is not set +CONFIG_SECURITY_NETWORK_XFRM=y +CONFIG_SECURITY_PATH=y +CONFIG_LSM_MMAP_MIN_ADDR=32768 +CONFIG_HARDENED_USERCOPY=y +CONFIG_FORTIFY_SOURCE=y +# CONFIG_STATIC_USERMODEHELPER is not set +CONFIG_SECURITY_SELINUX=y +CONFIG_SECURITY_SELINUX_BOOTPARAM=y +CONFIG_SECURITY_SELINUX_DEVELOP=y +CONFIG_SECURITY_SELINUX_AVC_STATS=y +CONFIG_SECURITY_SELINUX_SIDTAB_HASH_BITS=9 +CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE=256 +# CONFIG_SECURITY_SELINUX_DEBUG is not set +# CONFIG_SECURITY_SMACK is not set +CONFIG_SECURITY_TOMOYO=y +CONFIG_SECURITY_TOMOYO_MAX_ACCEPT_ENTRY=2048 +CONFIG_SECURITY_TOMOYO_MAX_AUDIT_LOG=1024 +# CONFIG_SECURITY_TOMOYO_OMIT_USERSPACE_LOADER is not set +CONFIG_SECURITY_TOMOYO_POLICY_LOADER="/sbin/tomoyo-init" +CONFIG_SECURITY_TOMOYO_ACTIVATION_TRIGGER="/sbin/init" +# CONFIG_SECURITY_TOMOYO_INSECURE_BUILTIN_SETTING is not set +CONFIG_SECURITY_APPARMOR=y +# CONFIG_SECURITY_APPARMOR_DEBUG is not set +CONFIG_SECURITY_APPARMOR_INTROSPECT_POLICY=y +CONFIG_SECURITY_APPARMOR_HASH=y +CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y +CONFIG_SECURITY_APPARMOR_EXPORT_BINARY=y +CONFIG_SECURITY_APPARMOR_PARANOID_LOAD=y +# CONFIG_SECURITY_LOADPIN is not set +CONFIG_SECURITY_YAMA=y +# CONFIG_SECURITY_SAFESETID is not set +# CONFIG_SECURITY_LOCKDOWN_LSM is not set +# CONFIG_SECURITY_LANDLOCK is not set +CONFIG_INTEGRITY=y +CONFIG_INTEGRITY_SIGNATURE=y +CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y +# CONFIG_INTEGRITY_TRUSTED_KEYRING is not set +CONFIG_INTEGRITY_PLATFORM_KEYRING=y +# CONFIG_INTEGRITY_MACHINE_KEYRING is not set +CONFIG_LOAD_UEFI_KEYS=y +CONFIG_INTEGRITY_AUDIT=y +# CONFIG_IMA is not set +# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set +# CONFIG_EVM is not set +# CONFIG_DEFAULT_SECURITY_SELINUX is not set +# CONFIG_DEFAULT_SECURITY_TOMOYO is not set +CONFIG_DEFAULT_SECURITY_APPARMOR=y +# CONFIG_DEFAULT_SECURITY_DAC is not set +CONFIG_LSM="yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" + +# +# Kernel hardening options +# + +# +# Memory initialization +# +CONFIG_INIT_STACK_NONE=y +# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set +# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set +# end of Memory initialization + +# +# Hardening of kernel data structures +# +CONFIG_LIST_HARDENED=y +CONFIG_BUG_ON_DATA_CORRUPTION=y +# end of Hardening of kernel data structures + +CONFIG_RANDSTRUCT_NONE=y +# end of Kernel hardening options +# end of Security options + +CONFIG_XOR_BLOCKS=m +CONFIG_ASYNC_CORE=m +CONFIG_ASYNC_MEMCPY=m +CONFIG_ASYNC_XOR=m +CONFIG_ASYNC_PQ=m +CONFIG_ASYNC_RAID6_RECOV=m +CONFIG_CRYPTO=y + +# +# Crypto core or helper +# +CONFIG_CRYPTO_FIPS=y +CONFIG_CRYPTO_FIPS_NAME="Linux Kernel Cryptographic API" +# CONFIG_CRYPTO_FIPS_CUSTOM_VERSION is not set +CONFIG_CRYPTO_ALGAPI=y +CONFIG_CRYPTO_ALGAPI2=y +CONFIG_CRYPTO_AEAD=m +CONFIG_CRYPTO_AEAD2=y +CONFIG_CRYPTO_SIG2=y +CONFIG_CRYPTO_SKCIPHER=y +CONFIG_CRYPTO_SKCIPHER2=y +CONFIG_CRYPTO_HASH=y +CONFIG_CRYPTO_HASH2=y +CONFIG_CRYPTO_RNG=m +CONFIG_CRYPTO_RNG2=y +CONFIG_CRYPTO_RNG_DEFAULT=m +CONFIG_CRYPTO_AKCIPHER2=y +CONFIG_CRYPTO_AKCIPHER=y +CONFIG_CRYPTO_KPP2=y +CONFIG_CRYPTO_KPP=y +CONFIG_CRYPTO_ACOMP2=y +CONFIG_CRYPTO_MANAGER=y +CONFIG_CRYPTO_MANAGER2=y +CONFIG_CRYPTO_USER=m +# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set +# CONFIG_CRYPTO_MANAGER_EXTRA_TESTS is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_NULL2=m +CONFIG_CRYPTO_PCRYPT=m +CONFIG_CRYPTO_CRYPTD=m +CONFIG_CRYPTO_AUTHENC=m +CONFIG_CRYPTO_TEST=m +CONFIG_CRYPTO_ENGINE=m +# end of Crypto core or helper + +# +# Public-key cryptography +# +CONFIG_CRYPTO_RSA=y +CONFIG_CRYPTO_DH=y +# CONFIG_CRYPTO_DH_RFC7919_GROUPS is not set +CONFIG_CRYPTO_ECC=m +CONFIG_CRYPTO_ECDH=m +# CONFIG_CRYPTO_ECDSA is not set +# CONFIG_CRYPTO_ECRDSA is not set +# CONFIG_CRYPTO_SM2 is not set +# CONFIG_CRYPTO_CURVE25519 is not set +# end of Public-key cryptography + +# +# Block ciphers +# +CONFIG_CRYPTO_AES=y +# CONFIG_CRYPTO_AES_TI is not set +CONFIG_CRYPTO_ANUBIS=m +# CONFIG_CRYPTO_ARIA is not set +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_BLOWFISH_COMMON=m +CONFIG_CRYPTO_CAMELLIA=m +CONFIG_CRYPTO_CAST_COMMON=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_DES=m +CONFIG_CRYPTO_FCRYPT=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SEED=m +CONFIG_CRYPTO_SERPENT=m +# CONFIG_CRYPTO_SM4_GENERIC is not set +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +CONFIG_CRYPTO_TWOFISH_COMMON=m +# end of Block ciphers + +# +# Length-preserving ciphers and modes +# +# CONFIG_CRYPTO_ADIANTUM is not set +CONFIG_CRYPTO_ARC4=m +CONFIG_CRYPTO_CHACHA20=m +CONFIG_CRYPTO_CBC=y +# CONFIG_CRYPTO_CFB is not set +CONFIG_CRYPTO_CTR=m +CONFIG_CRYPTO_CTS=y +CONFIG_CRYPTO_ECB=y +# CONFIG_CRYPTO_HCTR2 is not set +# CONFIG_CRYPTO_KEYWRAP is not set +CONFIG_CRYPTO_LRW=m +# CONFIG_CRYPTO_OFB is not set +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_XTS=y +# end of Length-preserving ciphers and modes + +# +# AEAD (authenticated encryption with associated data) ciphers +# +CONFIG_CRYPTO_AEGIS128=m +CONFIG_CRYPTO_AEGIS128_SIMD=y +CONFIG_CRYPTO_CHACHA20POLY1305=m +CONFIG_CRYPTO_CCM=m +CONFIG_CRYPTO_GCM=m +CONFIG_CRYPTO_GENIV=m +CONFIG_CRYPTO_SEQIV=m +CONFIG_CRYPTO_ECHAINIV=m +CONFIG_CRYPTO_ESSIV=m +# end of AEAD (authenticated encryption with associated data) ciphers + +# +# Hashes, digests, and MACs +# +CONFIG_CRYPTO_BLAKE2B=m +CONFIG_CRYPTO_CMAC=m +CONFIG_CRYPTO_GHASH=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MD4=m +CONFIG_CRYPTO_MD5=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_POLY1305=m +CONFIG_CRYPTO_RMD160=m +CONFIG_CRYPTO_SHA1=y +CONFIG_CRYPTO_SHA256=y +CONFIG_CRYPTO_SHA512=y +CONFIG_CRYPTO_SHA3=m +# CONFIG_CRYPTO_SM3_GENERIC is not set +# CONFIG_CRYPTO_STREEBOG is not set +CONFIG_CRYPTO_VMAC=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_XCBC=m +CONFIG_CRYPTO_XXHASH=m +# end of Hashes, digests, and MACs + +# +# CRCs (cyclic redundancy checks) +# +CONFIG_CRYPTO_CRC32C=m +CONFIG_CRYPTO_CRC32=m +CONFIG_CRYPTO_CRCT10DIF=y +CONFIG_CRYPTO_CRC64_ROCKSOFT=m +# end of CRCs (cyclic redundancy checks) + +# +# Compression +# +CONFIG_CRYPTO_DEFLATE=y +CONFIG_CRYPTO_LZO=y +# CONFIG_CRYPTO_842 is not set +CONFIG_CRYPTO_LZ4=m +CONFIG_CRYPTO_LZ4HC=m +CONFIG_CRYPTO_ZSTD=m +# end of Compression + +# +# Random number generation +# +CONFIG_CRYPTO_ANSI_CPRNG=m +CONFIG_CRYPTO_DRBG_MENU=m +CONFIG_CRYPTO_DRBG_HMAC=y +# CONFIG_CRYPTO_DRBG_HASH is not set +# CONFIG_CRYPTO_DRBG_CTR is not set +CONFIG_CRYPTO_DRBG=m +CONFIG_CRYPTO_JITTERENTROPY=m +# CONFIG_CRYPTO_JITTERENTROPY_TESTINTERFACE is not set +CONFIG_CRYPTO_KDF800108_CTR=y +# end of Random number generation + +# +# Userspace interface +# +CONFIG_CRYPTO_USER_API=m +CONFIG_CRYPTO_USER_API_HASH=m +CONFIG_CRYPTO_USER_API_SKCIPHER=m +CONFIG_CRYPTO_USER_API_RNG=m +# CONFIG_CRYPTO_USER_API_RNG_CAVP is not set +CONFIG_CRYPTO_USER_API_AEAD=m +CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y +# CONFIG_CRYPTO_STATS is not set +# end of Userspace interface + +CONFIG_CRYPTO_HASH_INFO=y +# CONFIG_CRYPTO_NHPOLY1305_NEON is not set +# CONFIG_CRYPTO_CHACHA20_NEON is not set + +# +# Accelerated Cryptographic Algorithms for CPU (arm64) +# +CONFIG_CRYPTO_GHASH_ARM64_CE=m +# CONFIG_CRYPTO_POLY1305_NEON is not set +CONFIG_CRYPTO_SHA1_ARM64_CE=m +CONFIG_CRYPTO_SHA256_ARM64=m +CONFIG_CRYPTO_SHA2_ARM64_CE=m +CONFIG_CRYPTO_SHA512_ARM64=m +CONFIG_CRYPTO_SHA512_ARM64_CE=m +CONFIG_CRYPTO_SHA3_ARM64=m +CONFIG_CRYPTO_SM3_NEON=m +CONFIG_CRYPTO_SM3_ARM64_CE=m +CONFIG_CRYPTO_POLYVAL_ARM64_CE=m +CONFIG_CRYPTO_AES_ARM64=m +CONFIG_CRYPTO_AES_ARM64_CE=m +CONFIG_CRYPTO_AES_ARM64_CE_BLK=m +CONFIG_CRYPTO_AES_ARM64_NEON_BLK=m +# CONFIG_CRYPTO_AES_ARM64_BS is not set +CONFIG_CRYPTO_SM4_ARM64_CE=m +CONFIG_CRYPTO_SM4_ARM64_CE_BLK=m +CONFIG_CRYPTO_SM4_ARM64_NEON_BLK=m +CONFIG_CRYPTO_AES_ARM64_CE_CCM=m +CONFIG_CRYPTO_SM4_ARM64_CE_CCM=m +CONFIG_CRYPTO_SM4_ARM64_CE_GCM=m +# CONFIG_CRYPTO_CRCT10DIF_ARM64_CE is not set +# end of Accelerated Cryptographic Algorithms for CPU (arm64) + +CONFIG_CRYPTO_HW=y +CONFIG_CRYPTO_DEV_ALLWINNER=y +# CONFIG_CRYPTO_DEV_SUN4I_SS is not set +# CONFIG_CRYPTO_DEV_SUN8I_CE is not set +# CONFIG_CRYPTO_DEV_SUN8I_SS is not set +# CONFIG_CRYPTO_DEV_ATMEL_ECC is not set +# CONFIG_CRYPTO_DEV_ATMEL_SHA204A is not set +# CONFIG_CRYPTO_DEV_CCP is not set +CONFIG_CRYPTO_DEV_CPT=m +CONFIG_CAVIUM_CPT=m +CONFIG_CRYPTO_DEV_NITROX=m +CONFIG_CRYPTO_DEV_NITROX_CNN55XX=m +CONFIG_CRYPTO_DEV_MARVELL=m +CONFIG_CRYPTO_DEV_MARVELL_CESA=m +# CONFIG_CRYPTO_DEV_OCTEONTX_CPT is not set +# CONFIG_CRYPTO_DEV_OCTEONTX2_CPT is not set +# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set +# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set +# CONFIG_CRYPTO_DEV_QAT_C62X is not set +# CONFIG_CRYPTO_DEV_QAT_4XXX is not set +# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set +# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set +# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set +# CONFIG_CRYPTO_DEV_CAVIUM_ZIP is not set +CONFIG_CRYPTO_DEV_QCE=m +CONFIG_CRYPTO_DEV_QCE_SKCIPHER=y +CONFIG_CRYPTO_DEV_QCE_SHA=y +CONFIG_CRYPTO_DEV_QCE_AEAD=y +CONFIG_CRYPTO_DEV_QCE_ENABLE_ALL=y +# CONFIG_CRYPTO_DEV_QCE_ENABLE_SKCIPHER is not set +# CONFIG_CRYPTO_DEV_QCE_ENABLE_SHA is not set +# CONFIG_CRYPTO_DEV_QCE_ENABLE_AEAD is not set +CONFIG_CRYPTO_DEV_QCE_SW_MAX_LEN=512 +CONFIG_CRYPTO_DEV_QCOM_RNG=m +# CONFIG_CRYPTO_DEV_ROCKCHIP is not set +# CONFIG_CRYPTO_DEV_ZYNQMP_AES is not set +# CONFIG_CRYPTO_DEV_ZYNQMP_SHA3 is not set +CONFIG_CRYPTO_DEV_CHELSIO=m +CONFIG_CRYPTO_DEV_VIRTIO=m +CONFIG_CRYPTO_DEV_SAFEXCEL=m +# CONFIG_CRYPTO_DEV_CCREE is not set +# CONFIG_CRYPTO_DEV_HISI_SEC is not set +# CONFIG_CRYPTO_DEV_HISI_SEC2 is not set +# CONFIG_CRYPTO_DEV_HISI_ZIP is not set +# CONFIG_CRYPTO_DEV_HISI_HPRE is not set +# CONFIG_CRYPTO_DEV_HISI_TRNG is not set +CONFIG_CRYPTO_DEV_AMLOGIC_GXL=m +# CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG is not set +CONFIG_ASYMMETRIC_KEY_TYPE=y +CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y +CONFIG_X509_CERTIFICATE_PARSER=y +CONFIG_PKCS8_PRIVATE_KEY_PARSER=m +CONFIG_PKCS7_MESSAGE_PARSER=y +# CONFIG_PKCS7_TEST_KEY is not set +# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set +# CONFIG_FIPS_SIGNATURE_SELFTEST is not set + +# +# Certificates for signature checking +# +CONFIG_MODULE_SIG_KEY="" +CONFIG_MODULE_SIG_KEY_TYPE_RSA=y +# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set +CONFIG_SYSTEM_TRUSTED_KEYRING=y +CONFIG_SYSTEM_TRUSTED_KEYS="" +# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set +CONFIG_SECONDARY_TRUSTED_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_HASH_LIST="" +# CONFIG_SYSTEM_REVOCATION_LIST is not set +# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set +# end of Certificates for signature checking + +CONFIG_BINARY_PRINTF=y + +# +# Library routines +# +CONFIG_RAID6_PQ=m +CONFIG_RAID6_PQ_BENCHMARK=y +CONFIG_LINEAR_RANGES=y +# CONFIG_PACKING is not set +CONFIG_BITREVERSE=y +CONFIG_HAVE_ARCH_BITREVERSE=y +CONFIG_GENERIC_STRNCPY_FROM_USER=y +CONFIG_GENERIC_STRNLEN_USER=y +CONFIG_GENERIC_NET_UTILS=y +CONFIG_CORDIC=m +# CONFIG_PRIME_NUMBERS is not set +CONFIG_RATIONAL=y +CONFIG_GENERIC_PCI_IOMAP=y +CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y +CONFIG_ARCH_HAS_FAST_MULTIPLIER=y +CONFIG_ARCH_USE_SYM_ANNOTATIONS=y +CONFIG_INDIRECT_PIO=y +# CONFIG_TRACE_MMIO_ACCESS is not set + +# +# Crypto library routines +# +CONFIG_CRYPTO_LIB_UTILS=y +CONFIG_CRYPTO_LIB_AES=y +CONFIG_CRYPTO_LIB_ARC4=m +CONFIG_CRYPTO_LIB_GF128MUL=m +CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y +CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m +# CONFIG_CRYPTO_LIB_CHACHA is not set +# CONFIG_CRYPTO_LIB_CURVE25519 is not set +CONFIG_CRYPTO_LIB_DES=m +CONFIG_CRYPTO_LIB_POLY1305_RSIZE=9 +CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m +# CONFIG_CRYPTO_LIB_POLY1305 is not set +# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set +CONFIG_CRYPTO_LIB_SHA1=y +CONFIG_CRYPTO_LIB_SHA256=y +# end of Crypto library routines + +CONFIG_CRC_CCITT=m +CONFIG_CRC16=m +CONFIG_CRC_T10DIF=y +CONFIG_CRC64_ROCKSOFT=m +CONFIG_CRC_ITU_T=m +CONFIG_CRC32=y +# CONFIG_CRC32_SELFTEST is not set +CONFIG_CRC32_SLICEBY8=y +# CONFIG_CRC32_SLICEBY4 is not set +# CONFIG_CRC32_SARWATE is not set +# CONFIG_CRC32_BIT is not set +CONFIG_CRC64=m +# CONFIG_CRC4 is not set +CONFIG_CRC7=m +CONFIG_LIBCRC32C=m +CONFIG_CRC8=y +CONFIG_XXHASH=y +CONFIG_AUDIT_GENERIC=y +CONFIG_AUDIT_ARCH_COMPAT_GENERIC=y +CONFIG_AUDIT_COMPAT_GENERIC=y +# CONFIG_RANDOM32_SELFTEST is not set +CONFIG_ZLIB_INFLATE=y +CONFIG_ZLIB_DEFLATE=y +CONFIG_LZO_COMPRESS=y +CONFIG_LZO_DECOMPRESS=y +CONFIG_LZ4_COMPRESS=m +CONFIG_LZ4HC_COMPRESS=m +CONFIG_LZ4_DECOMPRESS=y +CONFIG_ZSTD_COMMON=y +CONFIG_ZSTD_COMPRESS=y +CONFIG_ZSTD_DECOMPRESS=y +CONFIG_XZ_DEC=y +# CONFIG_XZ_DEC_X86 is not set +# CONFIG_XZ_DEC_POWERPC is not set +# CONFIG_XZ_DEC_IA64 is not set +# CONFIG_XZ_DEC_ARM is not set +# CONFIG_XZ_DEC_ARMTHUMB is not set +# CONFIG_XZ_DEC_SPARC is not set +# CONFIG_XZ_DEC_MICROLZMA is not set +# CONFIG_XZ_DEC_TEST is not set +CONFIG_DECOMPRESS_GZIP=y +CONFIG_DECOMPRESS_BZIP2=y +CONFIG_DECOMPRESS_LZMA=y +CONFIG_DECOMPRESS_XZ=y +CONFIG_DECOMPRESS_LZO=y +CONFIG_DECOMPRESS_LZ4=y +CONFIG_DECOMPRESS_ZSTD=y +CONFIG_GENERIC_ALLOCATOR=y +CONFIG_REED_SOLOMON=m +CONFIG_REED_SOLOMON_ENC8=y +CONFIG_REED_SOLOMON_DEC8=y +CONFIG_TEXTSEARCH=y +CONFIG_TEXTSEARCH_KMP=m +CONFIG_TEXTSEARCH_BM=m +CONFIG_TEXTSEARCH_FSM=m +CONFIG_BTREE=y +CONFIG_INTERVAL_TREE=y +CONFIG_XARRAY_MULTI=y +CONFIG_ASSOCIATIVE_ARRAY=y +CONFIG_HAS_IOMEM=y +CONFIG_HAS_IOPORT=y +CONFIG_HAS_IOPORT_MAP=y +CONFIG_HAS_DMA=y +CONFIG_DMA_OPS=y +CONFIG_NEED_SG_DMA_FLAGS=y +CONFIG_NEED_SG_DMA_LENGTH=y +CONFIG_NEED_DMA_MAP_STATE=y +CONFIG_ARCH_DMA_ADDR_T_64BIT=y +CONFIG_DMA_DECLARE_COHERENT=y +CONFIG_ARCH_HAS_SETUP_DMA_OPS=y +CONFIG_ARCH_HAS_TEARDOWN_DMA_OPS=y +CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y +CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU=y +CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y +CONFIG_SWIOTLB=y +# CONFIG_SWIOTLB_DYNAMIC is not set +CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC=y +# CONFIG_DMA_RESTRICTED_POOL is not set +CONFIG_DMA_NONCOHERENT_MMAP=y +CONFIG_DMA_COHERENT_POOL=y +CONFIG_DMA_DIRECT_REMAP=y +# CONFIG_DMA_API_DEBUG is not set +# CONFIG_DMA_MAP_BENCHMARK is not set +CONFIG_SGL_ALLOC=y +CONFIG_CHECK_SIGNATURE=y +# CONFIG_FORCE_NR_CPUS is not set +CONFIG_CPU_RMAP=y +CONFIG_DQL=y +CONFIG_GLOB=y +# CONFIG_GLOB_SELFTEST is not set +CONFIG_NLATTR=y +CONFIG_LRU_CACHE=m +CONFIG_CLZ_TAB=y +CONFIG_IRQ_POLL=y +CONFIG_MPILIB=y +CONFIG_SIGNATURE=y +CONFIG_DIMLIB=y +CONFIG_LIBFDT=y +CONFIG_OID_REGISTRY=y +CONFIG_UCS2_STRING=y +CONFIG_HAVE_GENERIC_VDSO=y +CONFIG_GENERIC_GETTIMEOFDAY=y +CONFIG_GENERIC_VDSO_TIME_NS=y +CONFIG_FONT_SUPPORT=y +# CONFIG_FONTS is not set +CONFIG_FONT_8x8=y +CONFIG_FONT_8x16=y +CONFIG_SG_POOL=y +CONFIG_ARCH_HAS_PMEM_API=y +CONFIG_MEMREGION=y +CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y +CONFIG_ARCH_STACKWALK=y +CONFIG_STACKDEPOT=y +CONFIG_SBITMAP=y +# end of Library routines + +CONFIG_GENERIC_IOREMAP=y +CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED=y +CONFIG_PLDMFW=y + +# +# Kernel hacking +# + +# +# printk and dmesg options +# +CONFIG_PRINTK_TIME=y +CONFIG_PRINTK_CALLER=y +# CONFIG_STACKTRACE_BUILD_ID is not set +CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7 +CONFIG_CONSOLE_LOGLEVEL_QUIET=4 +CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4 +CONFIG_BOOT_PRINTK_DELAY=y +CONFIG_DYNAMIC_DEBUG=y +CONFIG_DYNAMIC_DEBUG_CORE=y +CONFIG_SYMBOLIC_ERRNAME=y +CONFIG_DEBUG_BUGVERBOSE=y +# end of printk and dmesg options + +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_MISC=y + +# +# Compile-time checks and compiler options +# +CONFIG_DEBUG_INFO=y +CONFIG_AS_HAS_NON_CONST_LEB128=y +# CONFIG_DEBUG_INFO_NONE is not set +CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y +# CONFIG_DEBUG_INFO_DWARF4 is not set +# CONFIG_DEBUG_INFO_DWARF5 is not set +# CONFIG_DEBUG_INFO_REDUCED is not set +CONFIG_DEBUG_INFO_COMPRESSED_NONE=y +# CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set +# CONFIG_DEBUG_INFO_SPLIT is not set +CONFIG_DEBUG_INFO_BTF=y +CONFIG_PAHOLE_HAS_SPLIT_BTF=y +CONFIG_DEBUG_INFO_BTF_MODULES=y +# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set +# CONFIG_GDB_SCRIPTS is not set +CONFIG_FRAME_WARN=2048 +CONFIG_STRIP_ASM_SYMS=y +# CONFIG_READABLE_ASM is not set +# CONFIG_HEADERS_INSTALL is not set +# CONFIG_DEBUG_SECTION_MISMATCH is not set +CONFIG_SECTION_MISMATCH_WARN_ONLY=y +# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set +CONFIG_ARCH_WANT_FRAME_POINTERS=y +CONFIG_FRAME_POINTER=y +# CONFIG_VMLINUX_MAP is not set +# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set +# end of Compile-time checks and compiler options + +# +# Generic Kernel Debugging Instruments +# +CONFIG_MAGIC_SYSRQ=y +CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x01b6 +CONFIG_MAGIC_SYSRQ_SERIAL=y +CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE="" +CONFIG_DEBUG_FS=y +CONFIG_DEBUG_FS_ALLOW_ALL=y +# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set +# CONFIG_DEBUG_FS_ALLOW_NONE is not set +CONFIG_HAVE_ARCH_KGDB=y +# CONFIG_KGDB is not set +CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y +# CONFIG_UBSAN is not set +CONFIG_HAVE_ARCH_KCSAN=y +# end of Generic Kernel Debugging Instruments + +# +# Networking Debugging +# +# CONFIG_NET_DEV_REFCNT_TRACKER is not set +# CONFIG_NET_NS_REFCNT_TRACKER is not set +# CONFIG_DEBUG_NET is not set +# end of Networking Debugging + +# +# Memory Debugging +# +CONFIG_PAGE_EXTENSION=y +# CONFIG_DEBUG_PAGEALLOC is not set +CONFIG_SLUB_DEBUG=y +# CONFIG_SLUB_DEBUG_ON is not set +CONFIG_PAGE_OWNER=y +# CONFIG_PAGE_TABLE_CHECK is not set +CONFIG_PAGE_POISONING=y +# CONFIG_DEBUG_PAGE_REF is not set +# CONFIG_DEBUG_RODATA_TEST is not set +CONFIG_ARCH_HAS_DEBUG_WX=y +# CONFIG_DEBUG_WX is not set +CONFIG_GENERIC_PTDUMP=y +# CONFIG_PTDUMP_DEBUGFS is not set +CONFIG_HAVE_DEBUG_KMEMLEAK=y +# CONFIG_DEBUG_KMEMLEAK is not set +# CONFIG_PER_VMA_LOCK_STATS is not set +# CONFIG_DEBUG_OBJECTS is not set +# CONFIG_SHRINKER_DEBUG is not set +# CONFIG_DEBUG_STACK_USAGE is not set +CONFIG_SCHED_STACK_END_CHECK=y +CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y +# CONFIG_DEBUG_VM is not set +# CONFIG_DEBUG_VM_PGTABLE is not set +CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y +# CONFIG_DEBUG_VIRTUAL is not set +CONFIG_DEBUG_MEMORY_INIT=y +CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m +# CONFIG_DEBUG_PER_CPU_MAPS is not set +CONFIG_HAVE_ARCH_KASAN=y +CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y +CONFIG_HAVE_ARCH_KASAN_VMALLOC=y +CONFIG_CC_HAS_KASAN_GENERIC=y +CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y +# CONFIG_KASAN is not set +CONFIG_HAVE_ARCH_KFENCE=y +CONFIG_KFENCE=y +CONFIG_KFENCE_SAMPLE_INTERVAL=0 +CONFIG_KFENCE_NUM_OBJECTS=511 +# CONFIG_KFENCE_DEFERRABLE is not set +# CONFIG_KFENCE_STATIC_KEYS is not set +CONFIG_KFENCE_STRESS_TEST_FAULTS=0 +# end of Memory Debugging + +# CONFIG_DEBUG_SHIRQ is not set + +# +# Debug Oops, Lockups and Hangs +# +# CONFIG_PANIC_ON_OOPS is not set +CONFIG_PANIC_ON_OOPS_VALUE=0 +CONFIG_PANIC_TIMEOUT=0 +CONFIG_LOCKUP_DETECTOR=y +CONFIG_SOFTLOCKUP_DETECTOR=y +# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set +CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y +# CONFIG_HARDLOCKUP_DETECTOR is not set +CONFIG_DETECT_HUNG_TASK=y +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120 +# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set +# CONFIG_WQ_WATCHDOG is not set +# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set +# CONFIG_TEST_LOCKUP is not set +# end of Debug Oops, Lockups and Hangs + +# +# Scheduler Debugging +# +CONFIG_SCHED_DEBUG=y +CONFIG_SCHED_INFO=y +CONFIG_SCHEDSTATS=y +# end of Scheduler Debugging + +# CONFIG_DEBUG_TIMEKEEPING is not set + +# +# Lock Debugging (spinlocks, mutexes, etc...) +# +CONFIG_LOCK_DEBUGGING_SUPPORT=y +# CONFIG_PROVE_LOCKING is not set +# CONFIG_LOCK_STAT is not set +# CONFIG_DEBUG_RT_MUTEXES is not set +# CONFIG_DEBUG_SPINLOCK is not set +# CONFIG_DEBUG_MUTEXES is not set +# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set +# CONFIG_DEBUG_RWSEMS is not set +# CONFIG_DEBUG_LOCK_ALLOC is not set +# CONFIG_DEBUG_ATOMIC_SLEEP is not set +# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set +# CONFIG_LOCK_TORTURE_TEST is not set +# CONFIG_WW_MUTEX_SELFTEST is not set +# CONFIG_SCF_TORTURE_TEST is not set +# CONFIG_CSD_LOCK_WAIT_DEBUG is not set +# end of Lock Debugging (spinlocks, mutexes, etc...) + +# CONFIG_DEBUG_IRQFLAGS is not set +CONFIG_STACKTRACE=y +# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set +# CONFIG_DEBUG_KOBJECT is not set + +# +# Debug kernel data structures +# +CONFIG_DEBUG_LIST=y +# CONFIG_DEBUG_PLIST is not set +# CONFIG_DEBUG_SG is not set +# CONFIG_DEBUG_NOTIFIERS is not set +# CONFIG_DEBUG_MAPLE_TREE is not set +# end of Debug kernel data structures + +# CONFIG_DEBUG_CREDENTIALS is not set + +# +# RCU Debugging +# +# CONFIG_RCU_SCALE_TEST is not set +# CONFIG_RCU_TORTURE_TEST is not set +# CONFIG_RCU_REF_SCALE_TEST is not set +CONFIG_RCU_CPU_STALL_TIMEOUT=21 +CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0 +# CONFIG_RCU_CPU_STALL_CPUTIME is not set +# CONFIG_RCU_TRACE is not set +# CONFIG_RCU_EQS_DEBUG is not set +# end of RCU Debugging + +# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set +# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set +# CONFIG_LATENCYTOP is not set +# CONFIG_DEBUG_CGROUP_REF is not set +CONFIG_NOP_TRACER=y +CONFIG_HAVE_FUNCTION_TRACER=y +CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y +CONFIG_HAVE_FUNCTION_GRAPH_RETVAL=y +CONFIG_HAVE_DYNAMIC_FTRACE=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y +CONFIG_HAVE_SYSCALL_TRACEPOINTS=y +CONFIG_HAVE_C_RECORDMCOUNT=y +CONFIG_TRACER_MAX_TRACE=y +CONFIG_TRACE_CLOCK=y +CONFIG_RING_BUFFER=y +CONFIG_EVENT_TRACING=y +CONFIG_CONTEXT_SWITCH_TRACER=y +CONFIG_TRACING=y +CONFIG_GENERIC_TRACER=y +CONFIG_TRACING_SUPPORT=y +CONFIG_FTRACE=y +# CONFIG_BOOTTIME_TRACING is not set +CONFIG_FUNCTION_TRACER=y +CONFIG_FUNCTION_GRAPH_TRACER=y +# CONFIG_FUNCTION_GRAPH_RETVAL is not set +CONFIG_DYNAMIC_FTRACE=y +CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y +CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y +# CONFIG_FUNCTION_PROFILER is not set +CONFIG_STACK_TRACER=y +# CONFIG_IRQSOFF_TRACER is not set +# CONFIG_SCHED_TRACER is not set +# CONFIG_HWLAT_TRACER is not set +# CONFIG_OSNOISE_TRACER is not set +# CONFIG_TIMERLAT_TRACER is not set +CONFIG_FTRACE_SYSCALLS=y +CONFIG_TRACER_SNAPSHOT=y +# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set +CONFIG_BRANCH_PROFILE_NONE=y +# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set +CONFIG_BLK_DEV_IO_TRACE=y +CONFIG_PROBE_EVENTS_BTF_ARGS=y +CONFIG_KPROBE_EVENTS=y +# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set +CONFIG_UPROBE_EVENTS=y +CONFIG_BPF_EVENTS=y +CONFIG_DYNAMIC_EVENTS=y +CONFIG_PROBE_EVENTS=y +# CONFIG_BPF_KPROBE_OVERRIDE is not set +CONFIG_FTRACE_MCOUNT_RECORD=y +CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y +# CONFIG_SYNTH_EVENTS is not set +# CONFIG_USER_EVENTS is not set +# CONFIG_HIST_TRIGGERS is not set +# CONFIG_TRACE_EVENT_INJECT is not set +# CONFIG_TRACEPOINT_BENCHMARK is not set +# CONFIG_RING_BUFFER_BENCHMARK is not set +# CONFIG_TRACE_EVAL_MAP_FILE is not set +# CONFIG_FTRACE_RECORD_RECURSION is not set +# CONFIG_FTRACE_STARTUP_TEST is not set +# CONFIG_RING_BUFFER_STARTUP_TEST is not set +# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set +# CONFIG_PREEMPTIRQ_DELAY_TEST is not set +# CONFIG_KPROBE_EVENT_GEN_TEST is not set +# CONFIG_RV is not set +# CONFIG_SAMPLES is not set +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y +CONFIG_STRICT_DEVMEM=y +# CONFIG_IO_STRICT_DEVMEM is not set + +# +# arm64 Debugging +# +# CONFIG_PID_IN_CONTEXTIDR is not set +# CONFIG_DEBUG_EFI is not set +# CONFIG_ARM64_RELOC_TEST is not set +# CONFIG_CORESIGHT is not set +# end of arm64 Debugging + +# +# Kernel Testing and Coverage +# +# CONFIG_KUNIT is not set +CONFIG_NOTIFIER_ERROR_INJECTION=m +CONFIG_PM_NOTIFIER_ERROR_INJECT=m +# CONFIG_NETDEV_NOTIFIER_ERROR_INJECT is not set +CONFIG_FUNCTION_ERROR_INJECTION=y +CONFIG_FAULT_INJECTION=y +# CONFIG_FAILSLAB is not set +# CONFIG_FAIL_PAGE_ALLOC is not set +# CONFIG_FAULT_INJECTION_USERCOPY is not set +# CONFIG_FAIL_MAKE_REQUEST is not set +# CONFIG_FAIL_IO_TIMEOUT is not set +# CONFIG_FAIL_FUTEX is not set +# CONFIG_FAULT_INJECTION_DEBUG_FS is not set +# CONFIG_FAULT_INJECTION_CONFIGFS is not set +CONFIG_ARCH_HAS_KCOV=y +CONFIG_CC_HAS_SANCOV_TRACE_PC=y +CONFIG_RUNTIME_TESTING_MENU=y +# CONFIG_TEST_DHRY is not set +# CONFIG_LKDTM is not set +# CONFIG_TEST_MIN_HEAP is not set +# CONFIG_TEST_DIV64 is not set +# CONFIG_BACKTRACE_SELF_TEST is not set +# CONFIG_TEST_REF_TRACKER is not set +# CONFIG_RBTREE_TEST is not set +# CONFIG_REED_SOLOMON_TEST is not set +# CONFIG_INTERVAL_TREE_TEST is not set +# CONFIG_PERCPU_TEST is not set +# CONFIG_ATOMIC64_SELFTEST is not set +# CONFIG_ASYNC_RAID6_TEST is not set +# CONFIG_TEST_HEXDUMP is not set +# CONFIG_STRING_SELFTEST is not set +# CONFIG_TEST_STRING_HELPERS is not set +# CONFIG_TEST_KSTRTOX is not set +# CONFIG_TEST_PRINTF is not set +# CONFIG_TEST_SCANF is not set +# CONFIG_TEST_BITMAP is not set +# CONFIG_TEST_UUID is not set +# CONFIG_TEST_XARRAY is not set +# CONFIG_TEST_MAPLE_TREE is not set +# CONFIG_TEST_RHASHTABLE is not set +# CONFIG_TEST_IDA is not set +# CONFIG_TEST_LKM is not set +# CONFIG_TEST_BITOPS is not set +# CONFIG_TEST_VMALLOC is not set +CONFIG_TEST_USER_COPY=m +CONFIG_TEST_BPF=m +# CONFIG_TEST_BLACKHOLE_DEV is not set +# CONFIG_FIND_BIT_BENCHMARK is not set +CONFIG_TEST_FIRMWARE=m +# CONFIG_TEST_SYSCTL is not set +# CONFIG_TEST_UDELAY is not set +CONFIG_TEST_STATIC_KEYS=m +# CONFIG_TEST_DYNAMIC_DEBUG is not set +# CONFIG_TEST_KMOD is not set +# CONFIG_TEST_MEMCAT_P is not set +CONFIG_TEST_LIVEPATCH=m +# CONFIG_TEST_MEMINIT is not set +# CONFIG_TEST_FREE_PAGES is not set +CONFIG_ARCH_USE_MEMTEST=y +# CONFIG_MEMTEST is not set +# CONFIG_HYPERV_TESTING is not set +# end of Kernel Testing and Coverage + +# +# Rust hacking +# +# end of Rust hacking +# end of Kernel hacking diff --git a/.github/workflows/kernel-build.yml b/.github/workflows/kernel-build.yml new file mode 100644 index 0000000000000..51a7148e1726e --- /dev/null +++ b/.github/workflows/kernel-build.yml @@ -0,0 +1,77 @@ +name: Kernel Test + +on: + push: + pull_request: + +jobs: + build: + name: ${{ matrix.runs-on == 'ubuntu-latest' && 'x86_64' || 'ARM64' }} - ${{ matrix.config }} + runs-on: ${{ matrix.runs-on }} + strategy: + matrix: + include: + - runs-on: ubuntu-latest + config: defconfig + - runs-on: ubuntu-24.04-arm + config: config_4k + - runs-on: ubuntu-24.04-arm + config: config_64k + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Install Dependencies + run: | + sudo apt-get update -y + sudo apt-get install -y build-essential flex bison libssl-dev bc \ + busybox-static cmake coreutils virtme-ng cpio elfutils file gcc \ + git iproute2 jq kbd kmod libcap-dev libelf-dev libunwind-dev \ + libvirt-clients libzstd-dev libasound2-dev libdrm-dev \ + linux-headers-generic linux-tools-common linux-tools-generic \ + make ninja-build pahole pkg-config python3-dev python3-pip \ + python3-requests qemu-kvm qemu-system-arm rsync stress-ng udev zstd \ + libseccomp-dev libcap-ng-dev llvm-19 clang-19 python3-full curl \ + bpftrace dwarves gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu \ + bash-completion ccache debhelper devscripts docbook-utils \ + fakeroot gawk kernel-wedge libncurses5-dev makedumpfile python3-git \ + python3-requests python3-ruamel.yaml sharutils transfig ubuntu-dev-tools \ + libnuma-dev xmlto + + - name: Install virtme-ng + run: | + git clone https://github.com/arighi/virtme-ng.git + cd virtme-ng + VNG_PACKAGE=1 BUILD_VIRTME_NG_INIT=1 pip install --break-system-packages . + cd - + + - name: Configure Kernel + run: | + if [ "${{ matrix.config }}" = "defconfig" ]; then + vng -vb --kconfig + else + vng -vb --kconfig --config .github/workflows/${{ matrix.config }} + fi + # Ensure WERROR is disabled + ./scripts/config --disable CONFIG_WERROR + # Disable certificate verification + ./scripts/config --disable SYSTEM_TRUSTED_KEYS + ./scripts/config --disable SYSTEM_REVOCATION_KEYS "" + + - name: Build Kernel + run: | + vng -vb + + - name: Boot test + run: | + vng -v -m 4G -- uname -a + + - name: Run kselftests + run: | + make headers_install + make -j$(nproc) -C tools/testing/selftests \ + TARGETS="mm dma iommu locking" \ + INSTALL_PATH=$(pwd)/tests \ + all install + vng -v -m 4G -- ./tests/run_kselftest.sh --summary diff --git a/.gitignore b/.gitignore index d1a8ab3f98aaf..0265d678bac31 100644 --- a/.gitignore +++ b/.gitignore @@ -11,6 +11,7 @@ # Normal rules (sorted alphabetically) # .* +!.github/ *.a *.asn1.[ch] *.bin From e55f58e7a9ce7b95641d461a554c4c90b8821f12 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 5 Dec 2023 15:14:33 -0400 Subject: [PATCH 346/548] wqiommu/arm-smmu-v3: Add a type for the STE Instead of passing a naked __le16 * around to represent a STE wrap it in a "struct arm_smmu_ste" with an array of the correct size. This makes it much clearer which functions will comprise the "STE API". Reviewed-by: Moritz Fischer Reviewed-by: Michael Shavit Reviewed-by: Eric Auger Reviewed-by: Nicolin Chen Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Signed-off-by: Will Deacon (cherry picked from commit 57b89048874c9edd4a17f34c126b4a4188b7b444) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 59 ++++++++++----------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++- 2 files changed, 35 insertions(+), 31 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index f2260f45728e7..7f528169248e7 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1255,7 +1255,7 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid) } static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, - __le64 *dst) + struct arm_smmu_ste *dst) { /* * This is hideously complicated, but we only really care about @@ -1273,7 +1273,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, * 2. Write everything apart from dword 0, sync, write dword 0, sync * 3. Update Config, sync */ - u64 val = le64_to_cpu(dst[0]); + u64 val = le64_to_cpu(dst->data[0]); bool ste_live = false; struct arm_smmu_device *smmu = NULL; struct arm_smmu_s1_cfg *s1_cfg = NULL; @@ -1331,10 +1331,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, else val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); - dst[0] = cpu_to_le64(val); - dst[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, + dst->data[0] = cpu_to_le64(val); + dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); - dst[2] = 0; /* Nuke the VMID */ + dst->data[2] = 0; /* Nuke the VMID */ /* * The SMMU can perform negative caching, so we must sync * the STE regardless of whether the old value was live. @@ -1349,7 +1349,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1; BUG_ON(ste_live); - dst[1] = cpu_to_le64( + dst->data[1] = cpu_to_le64( FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | @@ -1358,7 +1358,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled) - dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); + dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) | @@ -1368,7 +1368,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, if (s2_cfg) { BUG_ON(ste_live); - dst[2] = cpu_to_le64( + dst->data[2] = cpu_to_le64( FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) | FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) | #ifdef __BIG_ENDIAN @@ -1377,18 +1377,18 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2R); - dst[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); + dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS); } if (master->ats_enabled) - dst[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS, + dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS)); arm_smmu_sync_ste_for_sid(smmu, sid); /* See comment in arm_smmu_write_ctx_desc() */ - WRITE_ONCE(dst[0], cpu_to_le64(val)); + WRITE_ONCE(dst->data[0], cpu_to_le64(val)); arm_smmu_sync_ste_for_sid(smmu, sid); /* It's likely that we'll want to use the new STE soon */ @@ -1396,7 +1396,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd); } -static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force) +static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab, + unsigned int nent, bool force) { unsigned int i; u64 val = STRTAB_STE_0_V; @@ -1407,11 +1408,11 @@ static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool fo val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); for (i = 0; i < nent; ++i) { - strtab[0] = cpu_to_le64(val); - strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, - STRTAB_STE_1_SHCFG_INCOMING)); - strtab[2] = 0; - strtab += STRTAB_STE_DWORDS; + strtab->data[0] = cpu_to_le64(val); + strtab->data[1] = cpu_to_le64(FIELD_PREP( + STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); + strtab->data[2] = 0; + strtab++; } } @@ -2255,26 +2256,23 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, return 0; } -static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid) +static struct arm_smmu_ste * +arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid) { - __le64 *step; struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg; if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) { - struct arm_smmu_strtab_l1_desc *l1_desc; - int idx; + unsigned int idx1, idx2; /* Two-level walk */ - idx = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS; - l1_desc = &cfg->l1_desc[idx]; - idx = (sid & ((1 << STRTAB_SPLIT) - 1)) * STRTAB_STE_DWORDS; - step = &l1_desc->l2ptr[idx]; + idx1 = (sid >> STRTAB_SPLIT) * STRTAB_L1_DESC_DWORDS; + idx2 = sid & ((1 << STRTAB_SPLIT) - 1); + return &cfg->l1_desc[idx1].l2ptr[idx2]; } else { /* Simple linear lookup */ - step = &cfg->strtab[sid * STRTAB_STE_DWORDS]; + return (struct arm_smmu_ste *)&cfg + ->strtab[sid * STRTAB_STE_DWORDS]; } - - return step; } static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) @@ -2284,7 +2282,8 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) for (i = 0; i < master->num_streams; ++i) { u32 sid = master->streams[i].id; - __le64 *step = arm_smmu_get_step_for_sid(smmu, sid); + struct arm_smmu_ste *step = + arm_smmu_get_step_for_sid(smmu, sid); /* Bridged PCI devices may end up with duplicated IDs */ for (j = 0; j < i; j++) @@ -3781,7 +3780,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu) iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list); list_for_each_entry(e, &rmr_list, list) { - __le64 *step; + struct arm_smmu_ste *step; struct iommu_iort_rmr_data *rmr; int ret, i; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 9915850dd4dbf..aba17f0b06e3a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -206,6 +206,11 @@ #define STRTAB_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 6) #define STRTAB_STE_DWORDS 8 + +struct arm_smmu_ste { + __le64 data[STRTAB_STE_DWORDS]; +}; + #define STRTAB_STE_0_V (1UL << 0) #define STRTAB_STE_0_CFG GENMASK_ULL(3, 1) #define STRTAB_STE_0_CFG_ABORT 0 @@ -571,7 +576,7 @@ struct arm_smmu_priq { struct arm_smmu_strtab_l1_desc { u8 span; - __le64 *l2ptr; + struct arm_smmu_ste *l2ptr; dma_addr_t l2ptr_dma; }; From e8244ab54cf79092df484feff2c50b175aa000b5 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:32 +0800 Subject: [PATCH 347/548] iommu/arm-smmu-v3: Move ctx_desc out of s1_cfg arm_smmu_s1_cfg (and by extension arm_smmu_domain) owns both a CD table and the CD inserted into that table's non-pasid CD entry. This limits arm_smmu_domain's ability to represent non-pasid domains, where multiple domains need to be inserted into a common CD table. Rather than describing an STE entry (which may have multiple domains installed into it with PASID), a domain should describe a single CD entry instead. This is precisely the role of arm_smmu_ctx_desc. A subsequent commit will also move the CD table outside of arm_smmu_domain. Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.1.I67ab103c18d882aedc8a08985af1fba70bca084e@changeid Signed-off-by: Will Deacon (cherry picked from commit 987a878e09c6c22b9c1a517900bd94d5ed0cc459) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 23 ++++++++++--------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +++-- 3 files changed, 17 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 8a16cd3ef487c..9b27fffe2ab95 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -62,7 +62,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid) return cd; } - smmu_domain = container_of(cd, struct arm_smmu_domain, s1_cfg.cd); + smmu_domain = container_of(cd, struct arm_smmu_domain, cd); smmu = smmu_domain->smmu; ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7f528169248e7..56a2c1905231d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1881,7 +1881,7 @@ static void arm_smmu_tlb_inv_context(void *cookie) * careful, 007. */ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid); + arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid); } else { cmd.opcode = CMDQ_OP_TLBI_S12_VMALL; cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; @@ -1974,7 +1974,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size, if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { cmd.opcode = smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ? CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA; - cmd.tlbi.asid = smmu_domain->s1_cfg.cd.asid; + cmd.tlbi.asid = smmu_domain->cd.asid; } else { cmd.opcode = CMDQ_OP_TLBI_S2_IPA; cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; @@ -2087,7 +2087,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) mutex_lock(&arm_smmu_asid_lock); if (cfg->cdcfg.cdtab) arm_smmu_free_cd_tables(smmu_domain); - arm_smmu_free_asid(&cfg->cd); + arm_smmu_free_asid(&smmu_domain->cd); mutex_unlock(&arm_smmu_asid_lock); } else { struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; @@ -2106,13 +2106,14 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, u32 asid; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; + struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr; - refcount_set(&cfg->cd.refs, 1); + refcount_set(&cd->refs, 1); /* Prevent SVA from modifying the ASID until it is written to the CD */ mutex_lock(&arm_smmu_asid_lock); - ret = xa_alloc(&arm_smmu_asid_xa, &asid, &cfg->cd, + ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd, XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); if (ret) goto out_unlock; @@ -2125,23 +2126,23 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (ret) goto out_free_asid; - cfg->cd.asid = (u16)asid; - cfg->cd.ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; - cfg->cd.tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) | + cd->asid = (u16)asid; + cd->ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; + cd->tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) | FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) | FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) | FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) | FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) | FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) | CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; - cfg->cd.mair = pgtbl_cfg->arm_lpae_s1_cfg.mair; + cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair; /* * Note that this will end up calling arm_smmu_sync_cd() before * the master has been added to the devices list for this domain. * This isn't an issue because the STE hasn't been installed yet. */ - ret = arm_smmu_write_ctx_desc(smmu_domain, IOMMU_NO_PASID, &cfg->cd); + ret = arm_smmu_write_ctx_desc(smmu_domain, IOMMU_NO_PASID, cd); if (ret) goto out_free_cd_tables; @@ -2151,7 +2152,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, out_free_cd_tables: arm_smmu_free_cd_tables(smmu_domain); out_free_asid: - arm_smmu_free_asid(&cfg->cd); + arm_smmu_free_asid(cd); out_unlock: mutex_unlock(&arm_smmu_asid_lock); return ret; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index aba17f0b06e3a..ba3d3e5f77cd4 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -604,7 +604,6 @@ struct arm_smmu_ctx_desc_cfg { struct arm_smmu_s1_cfg { struct arm_smmu_ctx_desc_cfg cdcfg; - struct arm_smmu_ctx_desc cd; u8 s1fmt; u8 s1cdmax; }; @@ -729,7 +728,10 @@ struct arm_smmu_domain { enum arm_smmu_domain_stage stage; union { - struct arm_smmu_s1_cfg s1_cfg; + struct { + struct arm_smmu_ctx_desc cd; + struct arm_smmu_s1_cfg s1_cfg; + }; struct arm_smmu_s2_cfg s2_cfg; }; From 3751e71828077034f1dc6ebdfd8e6bf65b1f2884 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:33 +0800 Subject: [PATCH 348/548] iommu/arm-smmu-v3: Replace s1_cfg with cdtab_cfg Remove struct arm_smmu_s1_cfg. This is really just a CD table with a bit of extra information. Move other attributes of the CD table that were held there into the existing CD table structure, struct arm_smmu_ctx_desc_cfg, and replace all usages of arm_smmu_s1_cfg with arm_smmu_ctx_desc_cfg. For clarity, use the name "cd_table" for the variables pointing to arm_smmu_ctx_desc_cfg in the new code instead of cdcfg. A later patch will make this fully consistent. Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.2.I1ef1ed19d7786c8176a0d05820c869e650c8d68f@changeid Signed-off-by: Will Deacon (cherry picked from commit 1f8588834016ad9d07880b0055d6583dc4014099) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++++----------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +--- 2 files changed, 22 insertions(+), 26 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 56a2c1905231d..8ba7abbf9987b 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1033,9 +1033,9 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, unsigned int idx; struct arm_smmu_l1_ctx_desc *l1_desc; struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg; + struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; - if (smmu_domain->s1_cfg.s1fmt == STRTAB_STE_0_S1FMT_LINEAR) + if (cdcfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS; idx = ssid >> CTXDESC_SPLIT; @@ -1071,7 +1071,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, bool cd_live; __le64 *cdptr; - if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax))) + if (WARN_ON(ssid >= (1 << smmu_domain->cd_table.s1cdmax))) return -E2BIG; cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid); @@ -1138,19 +1138,18 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) size_t l1size; size_t max_contexts; struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; - struct arm_smmu_ctx_desc_cfg *cdcfg = &cfg->cdcfg; + struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; - max_contexts = 1 << cfg->s1cdmax; + max_contexts = 1 << cdcfg->s1cdmax; if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) || max_contexts <= CTXDESC_L2_ENTRIES) { - cfg->s1fmt = STRTAB_STE_0_S1FMT_LINEAR; + cdcfg->s1fmt = STRTAB_STE_0_S1FMT_LINEAR; cdcfg->num_l1_ents = max_contexts; l1size = max_contexts * (CTXDESC_CD_DWORDS << 3); } else { - cfg->s1fmt = STRTAB_STE_0_S1FMT_64K_L2; + cdcfg->s1fmt = STRTAB_STE_0_S1FMT_64K_L2; cdcfg->num_l1_ents = DIV_ROUND_UP(max_contexts, CTXDESC_L2_ENTRIES); @@ -1186,7 +1185,7 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) int i; size_t size, l1size; struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg; + struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; if (cdcfg->l1_desc) { size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3); @@ -1276,7 +1275,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, u64 val = le64_to_cpu(dst->data[0]); bool ste_live = false; struct arm_smmu_device *smmu = NULL; - struct arm_smmu_s1_cfg *s1_cfg = NULL; + struct arm_smmu_ctx_desc_cfg *cd_table = NULL; struct arm_smmu_s2_cfg *s2_cfg = NULL; struct arm_smmu_domain *smmu_domain = NULL; struct arm_smmu_cmdq_ent prefetch_cmd = { @@ -1294,7 +1293,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, if (smmu_domain) { switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: - s1_cfg = &smmu_domain->s1_cfg; + cd_table = &smmu_domain->cd_table; break; case ARM_SMMU_DOMAIN_S2: case ARM_SMMU_DOMAIN_NESTED: @@ -1325,7 +1324,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, val = STRTAB_STE_0_V; /* Bypass/fault */ - if (!smmu_domain || !(s1_cfg || s2_cfg)) { + if (!smmu_domain || !(cd_table || s2_cfg)) { if (!smmu_domain && disable_bypass) val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT); else @@ -1344,7 +1343,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, return; } - if (s1_cfg) { + if (cd_table) { u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ? STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1; @@ -1360,10 +1359,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, !master->stall_enabled) dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); - val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | + val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) | - FIELD_PREP(STRTAB_STE_0_S1CDMAX, s1_cfg->s1cdmax) | - FIELD_PREP(STRTAB_STE_0_S1FMT, s1_cfg->s1fmt); + FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) | + FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt); } if (s2_cfg) { @@ -2081,11 +2080,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) /* Free the CD and ASID, if we allocated them */ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; + struct arm_smmu_ctx_desc_cfg *cd_table = &smmu_domain->cd_table; /* Prevent SVA from touching the CD while we're freeing it */ mutex_lock(&arm_smmu_asid_lock); - if (cfg->cdcfg.cdtab) + if (cd_table->cdtab) arm_smmu_free_cd_tables(smmu_domain); arm_smmu_free_asid(&smmu_domain->cd); mutex_unlock(&arm_smmu_asid_lock); @@ -2105,7 +2104,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, int ret; u32 asid; struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; + struct arm_smmu_ctx_desc_cfg *cd_table = &smmu_domain->cd_table; struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr; @@ -2118,7 +2117,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (ret) goto out_unlock; - cfg->s1cdmax = master->ssid_bits; + cd_table->s1cdmax = master->ssid_bits; smmu_domain->stall_enabled = master->stall_enabled; @@ -2456,7 +2455,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) ret = -EINVAL; goto out_unlock; } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && - master->ssid_bits != smmu_domain->s1_cfg.s1cdmax) { + master->ssid_bits != smmu_domain->cd_table.s1cdmax) { ret = -EINVAL; goto out_unlock; } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index ba3d3e5f77cd4..ecbae54685714 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -600,11 +600,8 @@ struct arm_smmu_ctx_desc_cfg { dma_addr_t cdtab_dma; struct arm_smmu_l1_ctx_desc *l1_desc; unsigned int num_l1_ents; -}; - -struct arm_smmu_s1_cfg { - struct arm_smmu_ctx_desc_cfg cdcfg; u8 s1fmt; + /* log2 of the maximum number of CDs supported by this table */ u8 s1cdmax; }; @@ -730,7 +727,7 @@ struct arm_smmu_domain { union { struct { struct arm_smmu_ctx_desc cd; - struct arm_smmu_s1_cfg s1_cfg; + struct arm_smmu_ctx_desc_cfg cd_table; }; struct arm_smmu_s2_cfg s2_cfg; }; From a3244c705b4ac6b36d2f118d66acd405e0181202 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:34 +0800 Subject: [PATCH 349/548] iommu/arm-smmu-v3: Encapsulate ctx_desc_cfg init in alloc_cd_tables This is slighlty cleaner: arm_smmu_ctx_desc_cfg is initialized in a single function instead of having pieces set ahead-of time by its caller. Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.3.I875254464d044a8ce8b3a2ad6beb655a4a006456@changeid Signed-off-by: Will Deacon (cherry picked from commit e3aad74c51a7012233ebe249c8ed86bb01c4a3f0) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8ba7abbf9987b..358db405912b2 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1132,7 +1132,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, return 0; } -static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) +static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_master *master) { int ret; size_t l1size; @@ -1140,6 +1141,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; + cdcfg->s1cdmax = master->ssid_bits; max_contexts = 1 << cdcfg->s1cdmax; if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) || @@ -2104,7 +2106,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, int ret; u32 asid; struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_ctx_desc_cfg *cd_table = &smmu_domain->cd_table; struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr; @@ -2117,11 +2118,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (ret) goto out_unlock; - cd_table->s1cdmax = master->ssid_bits; - smmu_domain->stall_enabled = master->stall_enabled; - ret = arm_smmu_alloc_cd_tables(smmu_domain); + ret = arm_smmu_alloc_cd_tables(smmu_domain, master); if (ret) goto out_free_asid; From 9c3af3a9eeed36c23cdda36a47931002dbdbb5c3 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:35 +0800 Subject: [PATCH 350/548] iommu/arm-smmu-v3: move stall_enabled to the cd table A domain can be attached to multiple masters with different master->stall_enabled values. The stall bit of a CD entry should follow master->stall_enabled and has an inverse relationship with the STE.S1STALLD bit. The stall_enabled bit does not depend on any property of the domain, so move it out of the arm_smmu_domain struct. Move it to the CD table struct so that it can fully describe how CD entries should be written to it. Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.4.I5aa89c849228794a64146cfe86df21fb71629384@changeid Signed-off-by: Will Deacon (cherry picked from commit 1228cc509fc6bf7b10f0c5e202086979e4bbd43a) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++++---- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 ++- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 358db405912b2..89164e159db27 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1114,7 +1114,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | CTXDESC_CD_0_V; - if (smmu_domain->stall_enabled) + if (smmu_domain->cd_table.stall_enabled) val |= CTXDESC_CD_0_S; } @@ -1141,6 +1141,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain, struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; + cdcfg->stall_enabled = master->stall_enabled; cdcfg->s1cdmax = master->ssid_bits; max_contexts = 1 << cdcfg->s1cdmax; @@ -2118,8 +2119,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (ret) goto out_unlock; - smmu_domain->stall_enabled = master->stall_enabled; - ret = arm_smmu_alloc_cd_tables(smmu_domain, master); if (ret) goto out_free_asid; @@ -2458,7 +2457,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) ret = -EINVAL; goto out_unlock; } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && - smmu_domain->stall_enabled != master->stall_enabled) { + smmu_domain->cd_table.stall_enabled != + master->stall_enabled) { ret = -EINVAL; goto out_unlock; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index ecbae54685714..bd569f2df5695 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -603,6 +603,8 @@ struct arm_smmu_ctx_desc_cfg { u8 s1fmt; /* log2 of the maximum number of CDs supported by this table */ u8 s1cdmax; + /* Whether CD entries in this table have the stall bit set. */ + u8 stall_enabled:1; }; struct arm_smmu_s2_cfg { @@ -720,7 +722,6 @@ struct arm_smmu_domain { struct mutex init_mutex; /* Protects smmu pointer */ struct io_pgtable_ops *pgtbl_ops; - bool stall_enabled; atomic_t nr_ats_masters; enum arm_smmu_domain_stage stage; From 46d0e633b3f29d45338704f6ece661f7c35261bc Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:36 +0800 Subject: [PATCH 351/548] iommu/arm-smmu-v3: Refactor write_ctx_desc Update arm_smmu_write_ctx_desc and downstream functions to operate on a master instead of an smmu domain. We expect arm_smmu_write_ctx_desc() to only be called to write a CD entry into a CD table owned by the master. Under the hood, arm_smmu_write_ctx_desc still fetches the CD table from the domain that is attached to the master, but a subsequent commit will move that table's ownership to the master. Note that this change isn't a nop refactor since SVA will call arm_smmu_write_ctx_desc in a loop for every master the domain is attached to despite the fact that they all share the same CD table. This loop may look weird but becomes necessary when the CD table becomes per-master in a subsequent commit. Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.5.I219054a6cf538df5bb22f4ada2d9933155d6058c@changeid Signed-off-by: Will Deacon (cherry picked from commit 24503148c545f08c10de73cc3a3779f1ee7a8fa0) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 39 +++++++++++++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 48 +++++++------------ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +- 3 files changed, 54 insertions(+), 35 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 9b27fffe2ab95..df66fe43a9853 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -37,6 +37,25 @@ struct arm_smmu_bond { static DEFINE_MUTEX(sva_lock); +/* + * Write the CD to the CD tables for all masters that this domain is attached + * to. Note that this is only used to update existing CD entries in the target + * CD table, for which it's assumed that arm_smmu_write_ctx_desc can't fail. + */ +static void arm_smmu_update_ctx_desc_devices(struct arm_smmu_domain *smmu_domain, + int ssid, + struct arm_smmu_ctx_desc *cd) +{ + struct arm_smmu_master *master; + unsigned long flags; + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) { + arm_smmu_write_ctx_desc(master, ssid, cd); + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); +} + /* * Check if the CPU ASID is available on the SMMU side. If a private context * descriptor is using it, try to replace it. @@ -80,7 +99,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid) * be some overlap between use of both ASIDs, until we invalidate the * TLB. */ - arm_smmu_write_ctx_desc(smmu_domain, IOMMU_NO_PASID, cd); + arm_smmu_update_ctx_desc_devices(smmu_domain, IOMMU_NO_PASID, cd); /* Invalidate TLB entries previously associated with that context */ arm_smmu_tlb_inv_asid(smmu, asid); @@ -247,7 +266,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events, * but disable translation. */ - arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, &quiet_cd); + arm_smmu_update_ctx_desc_devices(smmu_domain, mm->pasid, &quiet_cd); arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid); arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0); @@ -273,8 +292,10 @@ arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain, struct mm_struct *mm) { int ret; + unsigned long flags; struct arm_smmu_ctx_desc *cd; struct arm_smmu_mmu_notifier *smmu_mn; + struct arm_smmu_master *master; list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) { if (smmu_mn->mn.mm == mm) { @@ -304,7 +325,16 @@ arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain, goto err_free_cd; } - ret = arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, cd); + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) { + ret = arm_smmu_write_ctx_desc(master, mm->pasid, cd); + if (ret) { + list_for_each_entry_from_reverse(master, &smmu_domain->devices, domain_head) + arm_smmu_write_ctx_desc(master, mm->pasid, NULL); + break; + } + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); if (ret) goto err_put_notifier; @@ -329,7 +359,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) return; list_del(&smmu_mn->list); - arm_smmu_write_ctx_desc(smmu_domain, mm->pasid, NULL); + + arm_smmu_update_ctx_desc_devices(smmu_domain, mm->pasid, NULL); /* * If we went through clear(), we've already invalidated, and no diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 89164e159db27..95acb71451017 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -971,14 +971,12 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd); } -static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, +static void arm_smmu_sync_cd(struct arm_smmu_master *master, int ssid, bool leaf) { size_t i; - unsigned long flags; - struct arm_smmu_master *master; struct arm_smmu_cmdq_batch cmds; - struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_cmdq_ent cmd = { .opcode = CMDQ_OP_CFGI_CD, .cfgi = { @@ -988,15 +986,10 @@ static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, }; cmds.num = 0; - - spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_for_each_entry(master, &smmu_domain->devices, domain_head) { - for (i = 0; i < master->num_streams; i++) { - cmd.cfgi.sid = master->streams[i].id; - arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd); - } + for (i = 0; i < master->num_streams; i++) { + cmd.cfgi.sid = master->streams[i].id; + arm_smmu_cmdq_batch_add(smmu, &cmds, &cmd); } - spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); arm_smmu_cmdq_batch_submit(smmu, &cmds); } @@ -1026,14 +1019,13 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst, WRITE_ONCE(*dst, cpu_to_le64(val)); } -static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, - u32 ssid) +static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) { __le64 *l1ptr; unsigned int idx; struct arm_smmu_l1_ctx_desc *l1_desc; - struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; + struct arm_smmu_device *smmu = master->smmu; + struct arm_smmu_ctx_desc_cfg *cdcfg = &master->domain->cd_table; if (cdcfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS; @@ -1047,13 +1039,13 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS; arm_smmu_write_cd_l1_desc(l1ptr, l1_desc); /* An invalid L1CD can be cached */ - arm_smmu_sync_cd(smmu_domain, ssid, false); + arm_smmu_sync_cd(master, ssid, false); } idx = ssid & (CTXDESC_L2_ENTRIES - 1); return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS; } -int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, +int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, struct arm_smmu_ctx_desc *cd) { /* @@ -1070,11 +1062,12 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, u64 val; bool cd_live; __le64 *cdptr; + struct arm_smmu_ctx_desc_cfg *cd_table = &master->domain->cd_table; - if (WARN_ON(ssid >= (1 << smmu_domain->cd_table.s1cdmax))) + if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) return -E2BIG; - cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid); + cdptr = arm_smmu_get_cd_ptr(master, ssid); if (!cdptr) return -ENOMEM; @@ -1102,7 +1095,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, * order. Ensure that it observes valid values before reading * V=1. */ - arm_smmu_sync_cd(smmu_domain, ssid, true); + arm_smmu_sync_cd(master, ssid, true); val = cd->tcr | #ifdef __BIG_ENDIAN @@ -1114,7 +1107,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | CTXDESC_CD_0_V; - if (smmu_domain->cd_table.stall_enabled) + if (cd_table->stall_enabled) val |= CTXDESC_CD_0_S; } @@ -1128,7 +1121,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, * without first making the structure invalid. */ WRITE_ONCE(cdptr[0], cpu_to_le64(val)); - arm_smmu_sync_cd(smmu_domain, ssid, true); + arm_smmu_sync_cd(master, ssid, true); return 0; } @@ -1138,7 +1131,7 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain, int ret; size_t l1size; size_t max_contexts; - struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; cdcfg->stall_enabled = master->stall_enabled; @@ -2134,12 +2127,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair; - /* - * Note that this will end up calling arm_smmu_sync_cd() before - * the master has been added to the devices list for this domain. - * This isn't an issue because the STE hasn't been installed yet. - */ - ret = arm_smmu_write_ctx_desc(smmu_domain, IOMMU_NO_PASID, cd); + ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, cd); if (ret) goto out_free_cd_tables; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index bd569f2df5695..4f1bbff869ab1 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -750,7 +750,7 @@ extern struct xarray arm_smmu_asid_xa; extern struct mutex arm_smmu_asid_lock; extern struct arm_smmu_ctx_desc quiet_cd; -int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, +int arm_smmu_write_ctx_desc(struct arm_smmu_master *smmu_master, int ssid, struct arm_smmu_ctx_desc *cd); void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, From a4a69c371d1c8683ad066c2497bad87aacb06932 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:37 +0800 Subject: [PATCH 352/548] iommu/arm-smmu-v3: Move CD table to arm_smmu_master With this change, each master will now own its own CD table instead of sharing one with other masters attached to the same domain. Attaching a stage 1 domain installs CD entries into the master's CD table. SVA writes its CD entries into each master's CD table if the domain is shared across masters. Also add the device to the devices list before writing the CD to the table so that SVA will know that the CD needs to be re-written to this device's CD table as well if it decides to update the CD's ASID concurrently with this function. Tested-by: Nicolin Chen Reviewed-by: Jason Gunthorpe Signed-off-by: Michael Shavit Link: https://lore.kernel.org/r/20230915211705.v8.6.Ice063dcf87d1b777a72e008d9e3406d2bcf6d876@changeid Signed-off-by: Will Deacon (cherry picked from commit 10e4968cd511e3125cf62574aee950c4012c161a) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 100 +++++++++++--------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +- 2 files changed, 58 insertions(+), 49 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 95acb71451017..999aafa3206d2 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1025,7 +1025,7 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) unsigned int idx; struct arm_smmu_l1_ctx_desc *l1_desc; struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &master->domain->cd_table; + struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table; if (cdcfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS; @@ -1062,7 +1062,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, u64 val; bool cd_live; __le64 *cdptr; - struct arm_smmu_ctx_desc_cfg *cd_table = &master->domain->cd_table; + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) return -E2BIG; @@ -1125,14 +1125,13 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, return 0; } -static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain, - struct arm_smmu_master *master) +static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master) { int ret; size_t l1size; size_t max_contexts; struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; + struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table; cdcfg->stall_enabled = master->stall_enabled; cdcfg->s1cdmax = master->ssid_bits; @@ -1176,12 +1175,12 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain, return ret; } -static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) +static void arm_smmu_free_cd_tables(struct arm_smmu_master *master) { int i; size_t size, l1size; - struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->cd_table; + struct arm_smmu_device *smmu = master->smmu; + struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table; if (cdcfg->l1_desc) { size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3); @@ -1289,7 +1288,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, if (smmu_domain) { switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: - cd_table = &smmu_domain->cd_table; + cd_table = &master->cd_table; break; case ARM_SMMU_DOMAIN_S2: case ARM_SMMU_DOMAIN_NESTED: @@ -2074,14 +2073,10 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) free_io_pgtable_ops(smmu_domain->pgtbl_ops); - /* Free the CD and ASID, if we allocated them */ + /* Free the ASID or VMID */ if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { - struct arm_smmu_ctx_desc_cfg *cd_table = &smmu_domain->cd_table; - /* Prevent SVA from touching the CD while we're freeing it */ mutex_lock(&arm_smmu_asid_lock); - if (cd_table->cdtab) - arm_smmu_free_cd_tables(smmu_domain); arm_smmu_free_asid(&smmu_domain->cd); mutex_unlock(&arm_smmu_asid_lock); } else { @@ -2112,10 +2107,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (ret) goto out_unlock; - ret = arm_smmu_alloc_cd_tables(smmu_domain, master); - if (ret) - goto out_free_asid; - cd->asid = (u16)asid; cd->ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; cd->tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) | @@ -2127,17 +2118,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair; - ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, cd); - if (ret) - goto out_free_cd_tables; - mutex_unlock(&arm_smmu_asid_lock); return 0; -out_free_cd_tables: - arm_smmu_free_cd_tables(smmu_domain); -out_free_asid: - arm_smmu_free_asid(cd); out_unlock: mutex_unlock(&arm_smmu_asid_lock); return ret; @@ -2399,6 +2382,14 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) master->domain = NULL; master->ats_enabled = false; arm_smmu_install_ste_for_dev(master); + /* + * Clearing the CD entry isn't strictly required to detach the domain + * since the table is uninstalled anyway, but it helps avoid confusion + * in the call to arm_smmu_write_ctx_desc on the next attach (which + * expects the entry to be empty). + */ + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab) + arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); } static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) @@ -2433,23 +2424,14 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) if (!smmu_domain->smmu) { smmu_domain->smmu = smmu; ret = arm_smmu_domain_finalise(domain, master); - if (ret) { + if (ret) smmu_domain->smmu = NULL; - goto out_unlock; - } - } else if (smmu_domain->smmu != smmu) { - ret = -EINVAL; - goto out_unlock; - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && - master->ssid_bits != smmu_domain->cd_table.s1cdmax) { + } else if (smmu_domain->smmu != smmu) ret = -EINVAL; - goto out_unlock; - } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && - smmu_domain->cd_table.stall_enabled != - master->stall_enabled) { - ret = -EINVAL; - goto out_unlock; - } + + mutex_unlock(&smmu_domain->init_mutex); + if (ret) + return ret; master->domain = smmu_domain; @@ -2463,16 +2445,42 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS) master->ats_enabled = arm_smmu_ats_supported(master); - arm_smmu_install_ste_for_dev(master); - spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_add(&master->domain_head, &smmu_domain->devices); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { + if (!master->cd_table.cdtab) { + ret = arm_smmu_alloc_cd_tables(master); + if (ret) { + master->domain = NULL; + goto out_list_del; + } + } + + /* + * Prevent SVA from concurrently modifying the CD or writing to + * the CD entry + */ + mutex_lock(&arm_smmu_asid_lock); + ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd); + mutex_unlock(&arm_smmu_asid_lock); + if (ret) { + master->domain = NULL; + goto out_list_del; + } + } + + arm_smmu_install_ste_for_dev(master); + arm_smmu_enable_ats(master); + return 0; + +out_list_del: + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_del(&master->domain_head); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); -out_unlock: - mutex_unlock(&smmu_domain->init_mutex); return ret; } @@ -2711,6 +2719,8 @@ static void arm_smmu_release_device(struct device *dev) arm_smmu_detach_dev(master); arm_smmu_disable_pasid(master); arm_smmu_remove_master(master); + if (master->cd_table.cdtab) + arm_smmu_free_cd_tables(master); kfree(master); } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 4f1bbff869ab1..03f9e526cbd92 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -700,6 +700,8 @@ struct arm_smmu_master { struct arm_smmu_domain *domain; struct list_head domain_head; struct arm_smmu_stream *streams; + /* Locked by the iommu core using the group mutex */ + struct arm_smmu_ctx_desc_cfg cd_table; unsigned int num_streams; bool ats_enabled; bool stall_enabled; @@ -726,11 +728,8 @@ struct arm_smmu_domain { enum arm_smmu_domain_stage stage; union { - struct { struct arm_smmu_ctx_desc cd; - struct arm_smmu_ctx_desc_cfg cd_table; - }; - struct arm_smmu_s2_cfg s2_cfg; + struct arm_smmu_s2_cfg s2_cfg; }; struct iommu_domain domain; From 8e508382dc5075c54161162f1aec726f97087923 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:38 +0800 Subject: [PATCH 353/548] iommu/arm-smmu-v3: Cleanup arm_smmu_domain_finalise Remove unused master parameter now that the CD table is allocated elsewhere. Reviewed-by: Nicolin Chen Reviewed-by: Jason Gunthorpe Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.7.Iff18df41564b9df82bf40b3ec7af26b87f08ef6e@changeid Signed-off-by: Will Deacon (cherry picked from commit 5e14313df2c8f81e22e62b4cd068ef3ee30914c8) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 999aafa3206d2..50005ee447cf5 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2089,7 +2089,6 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) } static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, - struct arm_smmu_master *master, struct io_pgtable_cfg *pgtbl_cfg) { int ret; @@ -2127,7 +2126,6 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, } static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, - struct arm_smmu_master *master, struct io_pgtable_cfg *pgtbl_cfg) { int vmid; @@ -2154,8 +2152,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, return 0; } -static int arm_smmu_domain_finalise(struct iommu_domain *domain, - struct arm_smmu_master *master) +static int arm_smmu_domain_finalise(struct iommu_domain *domain) { int ret; unsigned long ias, oas; @@ -2163,7 +2160,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, struct io_pgtable_cfg pgtbl_cfg; struct io_pgtable_ops *pgtbl_ops; int (*finalise_stage_fn)(struct arm_smmu_domain *, - struct arm_smmu_master *, struct io_pgtable_cfg *); struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_device *smmu = smmu_domain->smmu; @@ -2215,7 +2211,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, domain->geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1; domain->geometry.force_aperture = true; - ret = finalise_stage_fn(smmu_domain, master, &pgtbl_cfg); + ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg); if (ret < 0) { free_io_pgtable_ops(pgtbl_ops); return ret; @@ -2423,7 +2419,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) if (!smmu_domain->smmu) { smmu_domain->smmu = smmu; - ret = arm_smmu_domain_finalise(domain, master); + ret = arm_smmu_domain_finalise(domain); if (ret) smmu_domain->smmu = NULL; } else if (smmu_domain->smmu != smmu) From 0c1de2dcf2747d8bb7420045d611ef515a639e37 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:39 +0800 Subject: [PATCH 354/548] iommu/arm-smmu-v3: Update comment about STE liveness Update the comment to reflect the fact that the STE is not always installed. arm_smmu_domain_finalise_s1 intentionnaly calls arm_smmu_write_ctx_desc while the STE is not installed. Reviewed-by: Nicolin Chen Reviewed-by: Jason Gunthorpe Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/20230915211705.v8.8.I7a8beb615e2520ad395d96df94b9ab9708ee0d9c@changeid Signed-off-by: Will Deacon (cherry picked from commit 6032f58498b70f80103f467de4213f2ffbc2b062) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 50005ee447cf5..42aed0e317813 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1091,7 +1091,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, cdptr[3] = cpu_to_le64(cd->mair); /* - * STE is live, and the SMMU might read dwords of this CD in any + * STE may be live, and the SMMU might read dwords of this CD in any * order. Ensure that it observes valid values before reading * V=1. */ From 3e0f42578a038fce3d9b4cc3fc599db2204dcb5e Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Fri, 15 Sep 2023 21:17:40 +0800 Subject: [PATCH 355/548] iommu/arm-smmu-v3: Rename cdcfg to cd_table cdcfg is a confusing name, especially given other variables with the cfg suffix in this driver. cd_table more clearly describes what is being operated on. Tested-by: Nicolin Chen Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Signed-off-by: Michael Shavit Link: https://lore.kernel.org/r/20230915211705.v8.9.I5ee79793b444ddb933e8bc1eb7b77e728d7f8350@changeid Signed-off-by: Will Deacon (cherry picked from commit 475918e9c4ebda495f507d4b9612999d31512748) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 70 ++++++++++----------- 1 file changed, 35 insertions(+), 35 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 42aed0e317813..82384228aa130 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1025,18 +1025,18 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) unsigned int idx; struct arm_smmu_l1_ctx_desc *l1_desc; struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table; + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; - if (cdcfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) - return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS; + if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) + return cd_table->cdtab + ssid * CTXDESC_CD_DWORDS; idx = ssid >> CTXDESC_SPLIT; - l1_desc = &cdcfg->l1_desc[idx]; + l1_desc = &cd_table->l1_desc[idx]; if (!l1_desc->l2ptr) { if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc)) return NULL; - l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS; + l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS; arm_smmu_write_cd_l1_desc(l1ptr, l1_desc); /* An invalid L1CD can be cached */ arm_smmu_sync_cd(master, ssid, false); @@ -1131,35 +1131,35 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master) size_t l1size; size_t max_contexts; struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table; + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; - cdcfg->stall_enabled = master->stall_enabled; - cdcfg->s1cdmax = master->ssid_bits; - max_contexts = 1 << cdcfg->s1cdmax; + cd_table->stall_enabled = master->stall_enabled; + cd_table->s1cdmax = master->ssid_bits; + max_contexts = 1 << cd_table->s1cdmax; if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) || max_contexts <= CTXDESC_L2_ENTRIES) { - cdcfg->s1fmt = STRTAB_STE_0_S1FMT_LINEAR; - cdcfg->num_l1_ents = max_contexts; + cd_table->s1fmt = STRTAB_STE_0_S1FMT_LINEAR; + cd_table->num_l1_ents = max_contexts; l1size = max_contexts * (CTXDESC_CD_DWORDS << 3); } else { - cdcfg->s1fmt = STRTAB_STE_0_S1FMT_64K_L2; - cdcfg->num_l1_ents = DIV_ROUND_UP(max_contexts, + cd_table->s1fmt = STRTAB_STE_0_S1FMT_64K_L2; + cd_table->num_l1_ents = DIV_ROUND_UP(max_contexts, CTXDESC_L2_ENTRIES); - cdcfg->l1_desc = devm_kcalloc(smmu->dev, cdcfg->num_l1_ents, - sizeof(*cdcfg->l1_desc), + cd_table->l1_desc = devm_kcalloc(smmu->dev, cd_table->num_l1_ents, + sizeof(*cd_table->l1_desc), GFP_KERNEL); - if (!cdcfg->l1_desc) + if (!cd_table->l1_desc) return -ENOMEM; - l1size = cdcfg->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3); + l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3); } - cdcfg->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cdcfg->cdtab_dma, + cd_table->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cd_table->cdtab_dma, GFP_KERNEL); - if (!cdcfg->cdtab) { + if (!cd_table->cdtab) { dev_warn(smmu->dev, "failed to allocate context descriptor\n"); ret = -ENOMEM; goto err_free_l1; @@ -1168,9 +1168,9 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master) return 0; err_free_l1: - if (cdcfg->l1_desc) { - devm_kfree(smmu->dev, cdcfg->l1_desc); - cdcfg->l1_desc = NULL; + if (cd_table->l1_desc) { + devm_kfree(smmu->dev, cd_table->l1_desc); + cd_table->l1_desc = NULL; } return ret; } @@ -1180,30 +1180,30 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master) int i; size_t size, l1size; struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &master->cd_table; + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; - if (cdcfg->l1_desc) { + if (cd_table->l1_desc) { size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3); - for (i = 0; i < cdcfg->num_l1_ents; i++) { - if (!cdcfg->l1_desc[i].l2ptr) + for (i = 0; i < cd_table->num_l1_ents; i++) { + if (!cd_table->l1_desc[i].l2ptr) continue; dmam_free_coherent(smmu->dev, size, - cdcfg->l1_desc[i].l2ptr, - cdcfg->l1_desc[i].l2ptr_dma); + cd_table->l1_desc[i].l2ptr, + cd_table->l1_desc[i].l2ptr_dma); } - devm_kfree(smmu->dev, cdcfg->l1_desc); - cdcfg->l1_desc = NULL; + devm_kfree(smmu->dev, cd_table->l1_desc); + cd_table->l1_desc = NULL; - l1size = cdcfg->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3); + l1size = cd_table->num_l1_ents * (CTXDESC_L1_DESC_DWORDS << 3); } else { - l1size = cdcfg->num_l1_ents * (CTXDESC_CD_DWORDS << 3); + l1size = cd_table->num_l1_ents * (CTXDESC_CD_DWORDS << 3); } - dmam_free_coherent(smmu->dev, l1size, cdcfg->cdtab, cdcfg->cdtab_dma); - cdcfg->cdtab_dma = 0; - cdcfg->cdtab = NULL; + dmam_free_coherent(smmu->dev, l1size, cd_table->cdtab, cd_table->cdtab_dma); + cd_table->cdtab_dma = 0; + cd_table->cdtab = NULL; } bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd) From ed4a4578d68b6ad42897ba94e3e854c8e5760d95 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 5 Dec 2023 15:14:34 -0400 Subject: [PATCH 356/548] iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() The only caller is arm_smmu_install_ste_for_dev() which never has a NULL master. Remove the confusing if. Reviewed-by: Moritz Fischer Reviewed-by: Michael Shavit Reviewed-by: Eric Auger Reviewed-by: Nicolin Chen Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Signed-off-by: Will Deacon (cherry picked from commit 12a48fe90d0919e4e594815122403f793a090803) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 82384228aa130..1264f68a3986f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1269,10 +1269,10 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, */ u64 val = le64_to_cpu(dst->data[0]); bool ste_live = false; - struct arm_smmu_device *smmu = NULL; + struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_ctx_desc_cfg *cd_table = NULL; struct arm_smmu_s2_cfg *s2_cfg = NULL; - struct arm_smmu_domain *smmu_domain = NULL; + struct arm_smmu_domain *smmu_domain = master->domain; struct arm_smmu_cmdq_ent prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG, .prefetch = { @@ -1280,11 +1280,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, }, }; - if (master) { - smmu_domain = master->domain; - smmu = master->smmu; - } - if (smmu_domain) { switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: From 772ab80b34b00832a7c6b1dbe0c1d0d4a6f41479 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:34 -0300 Subject: [PATCH 357/548] iommu: Add iommu_ops->identity_domain This allows a driver to set a global static to an IDENTITY domain and the core code will automatically use it whenever an IDENTITY domain is requested. By making it always available it means the IDENTITY can be used in error handling paths to force the iommu driver into a known state. Devices implementing global static identity domains should avoid failing their attach_dev ops. To make global static domains simpler allow drivers to omit their free function and update the iommufd selftest. Convert rockchip to use the new mechanism. Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/1-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit df31b298477e65a01deff0af352be3a61524d930) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 6 +++++- drivers/iommu/iommufd/selftest.c | 5 ----- drivers/iommu/rockchip-iommu.c | 9 +-------- include/linux/iommu.h | 3 +++ 4 files changed, 9 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3fa5699b9ff19..bd2db56ee0d0d 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1991,6 +1991,9 @@ static struct iommu_domain *__iommu_domain_alloc(const struct bus_type *bus, if (bus == NULL || bus->iommu_ops == NULL) return NULL; + if (alloc_type == IOMMU_DOMAIN_IDENTITY && bus->iommu_ops->identity_domain) + return bus->iommu_ops->identity_domain; + domain = bus->iommu_ops->domain_alloc(alloc_type); if (!domain) return NULL; @@ -2024,7 +2027,8 @@ void iommu_domain_free(struct iommu_domain *domain) if (domain->type == IOMMU_DOMAIN_SVA) mmdrop(domain->mm); iommu_put_dma_cookie(domain); - domain->ops->free(domain); + if (domain->ops->free) + domain->ops->free(domain); } EXPORT_SYMBOL_GPL(iommu_domain_free); diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 00b794d74e03b..d29e438377405 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -126,10 +126,6 @@ struct selftest_obj { }; }; -static void mock_domain_blocking_free(struct iommu_domain *domain) -{ -} - static int mock_domain_nop_attach(struct iommu_domain *domain, struct device *dev) { @@ -137,7 +133,6 @@ static int mock_domain_nop_attach(struct iommu_domain *domain, } static const struct iommu_domain_ops mock_blocking_ops = { - .free = mock_domain_blocking_free, .attach_dev = mock_domain_nop_attach, }; diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 36fec26d2a04a..e16761dadf698 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -989,13 +989,8 @@ static int rk_iommu_identity_attach(struct iommu_domain *identity_domain, return 0; } -static void rk_iommu_identity_free(struct iommu_domain *domain) -{ -} - static struct iommu_domain_ops rk_identity_ops = { .attach_dev = rk_iommu_identity_attach, - .free = rk_iommu_identity_free, }; static struct iommu_domain rk_identity_domain = { @@ -1059,9 +1054,6 @@ static struct iommu_domain *rk_iommu_domain_alloc(unsigned type) { struct rk_iommu_domain *rk_domain; - if (type == IOMMU_DOMAIN_IDENTITY) - return &rk_identity_domain; - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; @@ -1185,6 +1177,7 @@ static int rk_iommu_of_xlate(struct device *dev, } static const struct iommu_ops rk_iommu_ops = { + .identity_domain = &rk_identity_domain, .domain_alloc = rk_iommu_domain_alloc, .probe_device = rk_iommu_probe_device, .release_device = rk_iommu_release_device, diff --git a/include/linux/iommu.h b/include/linux/iommu.h index b6ef263e85c06..0274d12a48e1b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -260,6 +260,8 @@ struct iommu_iotlb_gather { * will be blocked by the hardware. * @pgsize_bitmap: bitmap of all possible supported page sizes * @owner: Driver module providing these ops + * @identity_domain: An always available, always attachable identity + * translation. */ struct iommu_ops { bool (*capable)(struct device *dev, enum iommu_cap); @@ -294,6 +296,7 @@ struct iommu_ops { const struct iommu_domain_ops *default_domain_ops; unsigned long pgsize_bitmap; struct module *owner; + struct iommu_domain *identity_domain; }; /** From 4999ec942c7f0bc5277392a798b394b4fbbaccfe Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:35 -0300 Subject: [PATCH 358/548] iommu: Add IOMMU_DOMAIN_PLATFORM This is used when the iommu driver is taking control of the dma_ops, currently only on S390 and power spapr. It is designed to preserve the original ops->detach_dev() semantic that these S390 was built around. Provide an opaque domain type and a 'default_domain' ops value that allows the driver to trivially force any single domain as the default domain. Update iommufd selftest to use this instead of set_platform_dma_ops Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/2-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 1c68cbc64fe6ac01dc242ba562344303031a76fb) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 13 +++++++++++++ drivers/iommu/iommufd/selftest.c | 14 +++++--------- include/linux/iommu.h | 8 ++++++++ 3 files changed, 26 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index bd2db56ee0d0d..04a7d10436f4f 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -184,6 +184,8 @@ static const char *iommu_domain_type_str(unsigned int t) case IOMMU_DOMAIN_DMA: case IOMMU_DOMAIN_DMA_FQ: return "Translated"; + case IOMMU_DOMAIN_PLATFORM: + return "Platform"; default: return "Unknown"; } @@ -1763,6 +1765,17 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) lockdep_assert_held(&group->mutex); + /* + * Allow legacy drivers to specify the domain that will be the default + * domain. This should always be either an IDENTITY/BLOCKED/PLATFORM + * domain. Do not use in new drivers. + */ + if (bus->iommu_ops->default_domain) { + if (req_type) + return ERR_PTR(-EINVAL); + return bus->iommu_ops->default_domain; + } + if (req_type) return __iommu_group_alloc_default_domain(bus, group, req_type); diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index d29e438377405..aa56b67c44d48 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -296,14 +296,6 @@ static bool mock_domain_capable(struct device *dev, enum iommu_cap cap) return cap == IOMMU_CAP_CACHE_COHERENCY; } -static void mock_domain_set_plaform_dma_ops(struct device *dev) -{ - /* - * mock doesn't setup default domains because we can't hook into the - * normal probe path - */ -} - static struct iommu_device mock_iommu_device = { }; @@ -313,12 +305,16 @@ static struct iommu_device *mock_probe_device(struct device *dev) } static const struct iommu_ops mock_ops = { + /* + * IOMMU_DOMAIN_BLOCKED cannot be returned from def_domain_type() + * because it is zero. + */ + .default_domain = &mock_blocking_domain, .owner = THIS_MODULE, .pgsize_bitmap = MOCK_IO_PAGE_SIZE, .hw_info = mock_domain_hw_info, .domain_alloc = mock_domain_alloc, .capable = mock_domain_capable, - .set_platform_dma_ops = mock_domain_set_plaform_dma_ops, .device_group = generic_device_group, .probe_device = mock_probe_device, .default_domain_ops = diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 0274d12a48e1b..b4b27624462d6 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -64,6 +64,7 @@ struct iommu_domain_geometry { #define __IOMMU_DOMAIN_DMA_FQ (1U << 3) /* DMA-API uses flush queue */ #define __IOMMU_DOMAIN_SVA (1U << 4) /* Shared process address space */ +#define __IOMMU_DOMAIN_PLATFORM (1U << 5) #define IOMMU_DOMAIN_ALLOC_FLAGS ~__IOMMU_DOMAIN_DMA_FQ /* @@ -81,6 +82,8 @@ struct iommu_domain_geometry { * invalidation. * IOMMU_DOMAIN_SVA - DMA addresses are shared process addresses * represented by mm_struct's. + * IOMMU_DOMAIN_PLATFORM - Legacy domain for drivers that do their own + * dma_api stuff. Do not use in new drivers. */ #define IOMMU_DOMAIN_BLOCKED (0U) #define IOMMU_DOMAIN_IDENTITY (__IOMMU_DOMAIN_PT) @@ -91,6 +94,7 @@ struct iommu_domain_geometry { __IOMMU_DOMAIN_DMA_API | \ __IOMMU_DOMAIN_DMA_FQ) #define IOMMU_DOMAIN_SVA (__IOMMU_DOMAIN_SVA) +#define IOMMU_DOMAIN_PLATFORM (__IOMMU_DOMAIN_PLATFORM) struct iommu_domain { unsigned type; @@ -262,6 +266,9 @@ struct iommu_iotlb_gather { * @owner: Driver module providing these ops * @identity_domain: An always available, always attachable identity * translation. + * @default_domain: If not NULL this will always be set as the default domain. + * This should be an IDENTITY/BLOCKED/PLATFORM domain. + * Do not use in new drivers. */ struct iommu_ops { bool (*capable)(struct device *dev, enum iommu_cap); @@ -297,6 +304,7 @@ struct iommu_ops { unsigned long pgsize_bitmap; struct module *owner; struct iommu_domain *identity_domain; + struct iommu_domain *default_domain; }; /** From d278f9a4a6509aa5116d0a2d3c222374de28f2b0 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Sep 2023 00:15:23 -0700 Subject: [PATCH 359/548] iommu: Add new iommu op to create domains owned by userspace Introduce a new iommu_domain op to create domains owned by userspace, e.g. through IOMMUFD. These domains have a few different properties compares to kernel owned domains: - They may be PAGING domains, but created with special parameters. For instance aperture size changes/number of levels, different IOPTE formats, or other things necessary to make a vIOMMU work - We have to track all the memory allocations with GFP_KERNEL_ACCOUNT to make the cgroup sandbox stronger - Device-specialty domains, such as NESTED domains can be created by IOMMUFD. The new op clearly says the domain is being created by IOMMUFD, that the domain is intended for userspace use, and it provides a way to pass user flags or a driver specific uAPI structure to customize the created domain to exactly what the vIOMMU userspace driver requires. iommu drivers that cannot support VFIO/IOMMUFD should not support this op. This includes any driver that cannot provide a fully functional PAGING domain. This new op for now is only supposed to be used by IOMMUFD, hence no wrapper for it. IOMMUFD would call the callback directly. As for domain free, IOMMUFD would use iommu_domain_free(). Link: https://lore.kernel.org/r/20230928071528.26258-2-yi.l.liu@intel.com Suggested-by: Jason Gunthorpe Signed-off-by: Lu Baolu Co-developed-by: Nicolin Chen Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 909f4abd1097769d024c3a9c2e59c2fbe5d2d0c0) Signed-off-by: Koba Ko --- include/linux/iommu.h | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index b4b27624462d6..ad5cf1420edcf 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -238,7 +238,15 @@ struct iommu_iotlb_gather { * op is allocated in the iommu driver and freed by the caller after * use. The information type is one of enum iommu_hw_info_type defined * in include/uapi/linux/iommufd.h. - * @domain_alloc: allocate iommu domain + * @domain_alloc: allocate and return an iommu domain if success. Otherwise + * NULL is returned. The domain is not fully initialized until + * the caller iommu_domain_alloc() returns. + * @domain_alloc_user: Allocate an iommu domain corresponding to the input + * parameters as defined in include/uapi/linux/iommufd.h. + * Unlike @domain_alloc, it is called only by IOMMUFD and + * must fully initialize the new domain before return. + * Upon success, a domain is returned. Upon failure, + * ERR_PTR must be returned. * @probe_device: Add device to iommu driver handling * @release_device: Remove device from iommu driver handling * @probe_finalize: Do final setup work after the device is added to an IOMMU @@ -276,6 +284,7 @@ struct iommu_ops { /* Domain allocation and freeing by the iommu driver */ struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type); + struct iommu_domain *(*domain_alloc_user)(struct device *dev, u32 flags); struct iommu_device *(*probe_device)(struct device *dev); void (*release_device)(struct device *dev); From f860737adf6e021abb3d8aff709f98b5b17b2020 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Sep 2023 00:15:24 -0700 Subject: [PATCH 360/548] iommufd: Use the domain_alloc_user() op for domain allocation Make IOMMUFD use iommu_domain_alloc_user() by default for iommu_domain creation. IOMMUFD needs to support iommu_domain allocation with parameters from userspace in nested support, and a driver is expected to implement everything under this op. If the iommu driver doesn't provide domain_alloc_user callback then IOMMUFD falls back to use iommu_domain_alloc() with an UNMANAGED type if possible. Link: https://lore.kernel.org/r/20230928071528.26258-3-yi.l.liu@intel.com Suggested-by: Jason Gunthorpe Reviewed-by: Lu Baolu Reviewed-by: Kevin Tian Co-developed-by: Nicolin Chen Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Signed-off-by: Jason Gunthorpe (cherry picked from commit 7975b722087fa23ff3ad1ff4998b8572a7e17e84) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index cf2c1504e20d8..48874f8965219 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -5,6 +5,7 @@ #include #include +#include "../iommu-priv.h" #include "iommufd_private.h" void iommufd_hw_pagetable_destroy(struct iommufd_object *obj) @@ -74,6 +75,7 @@ struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, bool immediate_attach) { + const struct iommu_ops *ops = dev_iommu_ops(idev->dev); struct iommufd_hw_pagetable *hwpt; int rc; @@ -88,10 +90,19 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, refcount_inc(&ioas->obj.users); hwpt->ioas = ioas; - hwpt->domain = iommu_domain_alloc(idev->dev->bus); - if (!hwpt->domain) { - rc = -ENOMEM; - goto out_abort; + if (ops->domain_alloc_user) { + hwpt->domain = ops->domain_alloc_user(idev->dev, 0); + if (IS_ERR(hwpt->domain)) { + rc = PTR_ERR(hwpt->domain); + hwpt->domain = NULL; + goto out_abort; + } + } else { + hwpt->domain = iommu_domain_alloc(idev->dev->bus); + if (!hwpt->domain) { + rc = -ENOMEM; + goto out_abort; + } } /* From 22c14d7ecfedfd89c4f82afe087f3b36dd859839 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Sep 2023 00:15:25 -0700 Subject: [PATCH 361/548] iommufd: Flow user flags for domain allocation to domain_alloc_user() Extends iommufd_hw_pagetable_alloc() to accept user flags, the uAPI will provide the flags. Link: https://lore.kernel.org/r/20230928071528.26258-4-yi.l.liu@intel.com Reviewed-by: Kevin Tian Signed-off-by: Yi Liu Reviewed-by: Lu Baolu Signed-off-by: Jason Gunthorpe (cherry picked from commit 89d63875d80ea127280c60dd4cd101af1d9b6557) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 2 +- drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++--- drivers/iommu/iommufd/iommufd_private.h | 3 ++- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index ff6386fd1e8fe..3f6d4a2d6b44a 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -556,7 +556,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, } hwpt = iommufd_hw_pagetable_alloc(idev->ictx, ioas, idev, - immediate_attach); + 0, immediate_attach); if (IS_ERR(hwpt)) { destroy_hwpt = ERR_CAST(hwpt); goto out_unlock; diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 48874f8965219..5be7a31cbd9cf 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -61,6 +61,7 @@ int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) * @ictx: iommufd context * @ioas: IOAS to associate the domain with * @idev: Device to get an iommu_domain for + * @flags: Flags from userspace * @immediate_attach: True if idev should be attached to the hwpt * * Allocate a new iommu_domain and return it as a hw_pagetable. The HWPT @@ -73,7 +74,8 @@ int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) */ struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, - struct iommufd_device *idev, bool immediate_attach) + struct iommufd_device *idev, u32 flags, + bool immediate_attach) { const struct iommu_ops *ops = dev_iommu_ops(idev->dev); struct iommufd_hw_pagetable *hwpt; @@ -91,7 +93,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, hwpt->ioas = ioas; if (ops->domain_alloc_user) { - hwpt->domain = ops->domain_alloc_user(idev->dev, 0); + hwpt->domain = ops->domain_alloc_user(idev->dev, flags); if (IS_ERR(hwpt->domain)) { rc = PTR_ERR(hwpt->domain); hwpt->domain = NULL; @@ -166,7 +168,8 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) } mutex_lock(&ioas->mutex); - hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, idev, false); + hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, + idev, cmd->flags, false); if (IS_ERR(hwpt)) { rc = PTR_ERR(hwpt); goto out_unlock; diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 2c58670011fe9..3064997a0181e 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -242,7 +242,8 @@ struct iommufd_hw_pagetable { struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, - struct iommufd_device *idev, bool immediate_attach); + struct iommufd_device *idev, u32 flags, + bool immediate_attach); int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt); int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev); From 436a94eaec6e7b443e3147998f4e0afea37c8e50 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Sep 2023 00:15:26 -0700 Subject: [PATCH 362/548] iommufd: Support allocating nested parent domain Extend IOMMU_HWPT_ALLOC to allocate domains to be used as parent (stage-2) in nested translation. Add IOMMU_HWPT_ALLOC_NEST_PARENT to the uAPI. Link: https://lore.kernel.org/r/20230928071528.26258-5-yi.l.liu@intel.com Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Lu Baolu Signed-off-by: Jason Gunthorpe (cherry picked from commit 4ff542163397073f86eda484318d61980ff1031d) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 5 ++++- include/uapi/linux/iommufd.h | 12 +++++++++++- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 5be7a31cbd9cf..8b3d2875d642d 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, lockdep_assert_held(&ioas->mutex); + if (flags && !ops->domain_alloc_user) + return ERR_PTR(-EOPNOTSUPP); + hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HW_PAGETABLE); if (IS_ERR(hwpt)) return hwpt; @@ -154,7 +157,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) struct iommufd_ioas *ioas; int rc; - if (cmd->flags || cmd->__reserved) + if ((cmd->flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) || cmd->__reserved) return -EOPNOTSUPP; idev = iommufd_get_device(ucmd, cmd->dev_id); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b4ba0c0cbab6b..4a7c5c8fdbb44 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -347,10 +347,20 @@ struct iommu_vfio_ioas { }; #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS) +/** + * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve + * as the parent domain in the nesting + * configuration. + */ +enum iommufd_hwpt_alloc_flags { + IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0, +}; + /** * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC) * @size: sizeof(struct iommu_hwpt_alloc) - * @flags: Must be 0 + * @flags: Combination of enum iommufd_hwpt_alloc_flags * @dev_id: The device to allocate this HWPT for * @pt_id: The IOAS to connect this HWPT to * @out_hwpt_id: The ID of the new HWPT From 9d2b86cf095fc965d670b1c86d7b721e894bf8f2 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Sep 2023 00:15:27 -0700 Subject: [PATCH 363/548] iommufd/selftest: Add domain_alloc_user() support in iommu mock Add mock_domain_alloc_user() and a new test case for IOMMU_HWPT_ALLOC_NEST_PARENT. Link: https://lore.kernel.org/r/20230928071528.26258-6-yi.l.liu@intel.com Co-developed-by: Nicolin Chen Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 408663619fcfc89c087df65b362c91bf0a0be617) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 19 +++++++++++++++ tools/testing/selftests/iommu/iommufd.c | 24 +++++++++++++++---- .../selftests/iommu/iommufd_fail_nth.c | 2 +- tools/testing/selftests/iommu/iommufd_utils.h | 11 ++++++--- 4 files changed, 48 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index aa56b67c44d48..f68e4481ae95f 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -156,6 +156,8 @@ static void *mock_domain_hw_info(struct device *dev, u32 *length, u32 *type) return info; } +static const struct iommu_ops mock_ops; + static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) { struct mock_iommu_domain *mock; @@ -172,10 +174,26 @@ static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) mock->domain.geometry.aperture_start = MOCK_APERTURE_START; mock->domain.geometry.aperture_end = MOCK_APERTURE_LAST; mock->domain.pgsize_bitmap = MOCK_IO_PAGE_SIZE; + mock->domain.ops = mock_ops.default_domain_ops; + mock->domain.type = iommu_domain_type; xa_init(&mock->pfns); return &mock->domain; } +static struct iommu_domain * +mock_domain_alloc_user(struct device *dev, u32 flags) +{ + struct iommu_domain *domain; + + if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) + return ERR_PTR(-EOPNOTSUPP); + + domain = mock_domain_alloc(IOMMU_DOMAIN_UNMANAGED); + if (!domain) + domain = ERR_PTR(-ENOMEM); + return domain; +} + static void mock_domain_free(struct iommu_domain *domain) { struct mock_iommu_domain *mock = @@ -314,6 +332,7 @@ static const struct iommu_ops mock_ops = { .pgsize_bitmap = MOCK_IO_PAGE_SIZE, .hw_info = mock_domain_hw_info, .domain_alloc = mock_domain_alloc, + .domain_alloc_user = mock_domain_alloc_user, .capable = mock_domain_capable, .device_group = generic_device_group, .probe_device = mock_probe_device, diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 890a81f4ff618..bdc56e32a4b24 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -114,6 +114,7 @@ TEST_F(iommufd, cmd_length) TEST_LENGTH(iommu_destroy, IOMMU_DESTROY); TEST_LENGTH(iommu_hw_info, IOMMU_GET_HW_INFO); + TEST_LENGTH(iommu_hwpt_alloc, IOMMU_HWPT_ALLOC); TEST_LENGTH(iommu_ioas_alloc, IOMMU_IOAS_ALLOC); TEST_LENGTH(iommu_ioas_iova_ranges, IOMMU_IOAS_IOVA_RANGES); TEST_LENGTH(iommu_ioas_allow_iovas, IOMMU_IOAS_ALLOW_IOVAS); @@ -1404,13 +1405,28 @@ TEST_F(iommufd_mock_domain, alloc_hwpt) int i; for (i = 0; i != variant->mock_domains; i++) { + uint32_t hwpt_id[2]; uint32_t stddev_id; - uint32_t hwpt_id; - test_cmd_hwpt_alloc(self->idev_ids[0], self->ioas_id, &hwpt_id); - test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + test_err_hwpt_alloc(EOPNOTSUPP, + self->idev_ids[i], self->ioas_id, + ~IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id[0]); + test_cmd_hwpt_alloc(self->idev_ids[i], self->ioas_id, + 0, &hwpt_id[0]); + test_cmd_hwpt_alloc(self->idev_ids[i], self->ioas_id, + IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id[1]); + + /* Do a hw_pagetable rotation test */ + test_cmd_mock_domain_replace(self->stdev_ids[i], hwpt_id[0]); + EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, hwpt_id[0])); + test_cmd_mock_domain_replace(self->stdev_ids[i], hwpt_id[1]); + EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, hwpt_id[1])); + test_cmd_mock_domain_replace(self->stdev_ids[i], self->ioas_id); + test_ioctl_destroy(hwpt_id[1]); + + test_cmd_mock_domain(hwpt_id[0], &stddev_id, NULL, NULL); test_ioctl_destroy(stddev_id); - test_ioctl_destroy(hwpt_id); + test_ioctl_destroy(hwpt_id[0]); } } diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index a220ca2a689d1..3d7838506bfe4 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -615,7 +615,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info))) return -1; - if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, &hwpt_id)) + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id)) return -1; if (_test_cmd_mock_domain_replace(self->fd, stdev_id, ioas_id2, NULL)) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index e0753d03ecaa8..be4970a849776 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -103,10 +103,11 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id, pt_id, NULL)) static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, - __u32 *hwpt_id) + __u32 flags, __u32 *hwpt_id) { struct iommu_hwpt_alloc cmd = { .size = sizeof(cmd), + .flags = flags, .dev_id = device_id, .pt_id = pt_id, }; @@ -120,8 +121,12 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, return 0; } -#define test_cmd_hwpt_alloc(device_id, pt_id, hwpt_id) \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, hwpt_id)) +#define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, \ + pt_id, flags, hwpt_id)) +#define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ + EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc(self->fd, device_id, \ + pt_id, flags, hwpt_id)) static int _test_cmd_access_replace_ioas(int fd, __u32 access_id, unsigned int ioas_id) From 6d48e21e87934ceef780588f005c0b3a36888933 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Sep 2023 00:15:28 -0700 Subject: [PATCH 364/548] iommu/vt-d: Add domain_alloc_user op Add the domain_alloc_user() op implementation. It supports allocating domains to be used as parent under nested translation. Unlike other drivers VT-D uses only a single page table format so it only needs to check if the HW can support nesting. Link: https://lore.kernel.org/r/20230928071528.26258-7-yi.l.liu@intel.com Signed-off-by: Yi Liu Reviewed-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit c97d1b20d3835178bcd0e3a86c20ce4e36b6d80c) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 6a745616d85a4..4ff09263b3230 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4075,6 +4075,33 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) return NULL; } +static struct iommu_domain * +intel_iommu_domain_alloc_user(struct device *dev, u32 flags) +{ + struct iommu_domain *domain; + struct intel_iommu *iommu; + + if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) + return ERR_PTR(-EOPNOTSUPP); + + iommu = device_to_iommu(dev, NULL, NULL); + if (!iommu) + return ERR_PTR(-ENODEV); + + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap)) + return ERR_PTR(-EOPNOTSUPP); + + /* + * domain_alloc_user op needs to fully initialize a domain + * before return, so uses iommu_domain_alloc() here for + * simple. + */ + domain = iommu_domain_alloc(dev->bus); + if (!domain) + domain = ERR_PTR(-ENOMEM); + return domain; +} + static void intel_iommu_domain_free(struct iommu_domain *domain) { if (domain != &si_domain->domain && domain != &blocking_domain) @@ -4809,6 +4836,7 @@ const struct iommu_ops intel_iommu_ops = { .capable = intel_iommu_capable, .hw_info = intel_iommu_hw_info, .domain_alloc = intel_iommu_domain_alloc, + .domain_alloc_user = intel_iommu_domain_alloc_user, .probe_device = intel_iommu_probe_device, .probe_finalize = intel_iommu_probe_finalize, .release_device = intel_iommu_release_device, From ce439d04a447307f998a855ea789fce626ca33a2 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Tue, 24 Oct 2023 08:00:11 -0700 Subject: [PATCH 365/548] iommu/vt-d: Enhance capability check for nested parent domain allocation This adds the scalable mode check before allocating the nested parent domain as checking nested capability is not enough. User may turn off scalable mode which also means no nested support even if the hardware supports it. Fixes: c97d1b20d383 ("iommu/vt-d: Add domain_alloc_user op") Link: https://lore.kernel.org/r/20231024150011.44642-1-yi.l.liu@intel.com Signed-off-by: Yi Liu Reviewed-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit a2cdecdf9d234455fdfc8f539bbf5818711bc29d) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 2 +- drivers/iommu/intel/iommu.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 4ff09263b3230..d9c6ac2398eba 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4088,7 +4088,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags) if (!iommu) return ERR_PTR(-ENODEV); - if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap)) + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !nested_supported(iommu)) return ERR_PTR(-EOPNOTSUPP); /* diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index e6a3e70656166..a5d27d61e1567 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -539,6 +539,8 @@ enum { #define sm_supported(iommu) (intel_iommu_sm && ecap_smts((iommu)->ecap)) #define pasid_supported(iommu) (sm_supported(iommu) && \ ecap_pasid((iommu)->ecap)) +#define nested_supported(iommu) (sm_supported(iommu) && \ + ecap_nest((iommu)->ecap)) struct pasid_entry; struct pasid_state_entry; From 67ffdfe8711a72ea3175e81b081b3f2748aebc3c Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Tue, 17 Oct 2023 11:15:52 -0700 Subject: [PATCH 366/548] iommufd: Correct IOMMU_HWPT_ALLOC_NEST_PARENT description The IOMMU_HWPT_ALLOC_NEST_PARENT flag is used to allocate a HWPT. Though a HWPT holds a domain in the core structure, it is still quite confusing to describe it using "domain" in the uAPI kdoc. Correct it to "HWPT". Fixes: 4ff542163397 ("iommufd: Support allocating nested parent domain") Link: https://lore.kernel.org/r/20231017181552.12667-1-nicolinc@nvidia.com Signed-off-by: Nicolin Chen Signed-off-by: Jason Gunthorpe (cherry picked from commit b5f9e63278d6f32789478acf1ed41d21d92b36cf) Signed-off-by: Koba Ko --- include/uapi/linux/iommufd.h | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 4a7c5c8fdbb44..be7a95042677b 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -349,9 +349,8 @@ struct iommu_vfio_ioas { /** * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation - * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve - * as the parent domain in the nesting - * configuration. + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT that can serve as + * the parent HWPT in a nesting configuration. */ enum iommufd_hwpt_alloc_flags { IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0, From 545a9b2b8109329b2754c76ebfceb5d09a023320 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:52 +0100 Subject: [PATCH 367/548] vfio/iova_bitmap: Export more API symbols In preparation to move iova_bitmap into iommufd, export the rest of API symbols that will be used in what could be used by modules, namely: iova_bitmap_alloc iova_bitmap_free iova_bitmap_for_each Link: https://lore.kernel.org/r/20231024135109.73787-2-joao.m.martins@oracle.com Suggested-by: Alex Williamson Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Alex Williamson Signed-off-by: Jason Gunthorpe (cherry picked from commit 53f0b020218fcc0a56a11df39630dbd379e4d9a6) Signed-off-by: Koba Ko --- drivers/vfio/iova_bitmap.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/vfio/iova_bitmap.c b/drivers/vfio/iova_bitmap.c index 38b51613ecca9..a1e383d9a323a 100644 --- a/drivers/vfio/iova_bitmap.c +++ b/drivers/vfio/iova_bitmap.c @@ -269,6 +269,7 @@ struct iova_bitmap *iova_bitmap_alloc(unsigned long iova, size_t length, iova_bitmap_free(bitmap); return ERR_PTR(rc); } +EXPORT_SYMBOL_GPL(iova_bitmap_alloc); /** * iova_bitmap_free() - Frees an IOVA bitmap object @@ -290,6 +291,7 @@ void iova_bitmap_free(struct iova_bitmap *bitmap) kfree(bitmap); } +EXPORT_SYMBOL_GPL(iova_bitmap_free); /* * Returns the remaining bitmap indexes from mapped_total_index to process for @@ -388,6 +390,7 @@ int iova_bitmap_for_each(struct iova_bitmap *bitmap, void *opaque, return ret; } +EXPORT_SYMBOL_GPL(iova_bitmap_for_each); /** * iova_bitmap_set() - Records an IOVA range in bitmap From d1fdc5cd11a1ef40146470aa40bc59ebc7f64a17 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:53 +0100 Subject: [PATCH 368/548] vfio: Move iova_bitmap into iommufd Both VFIO and IOMMUFD will need iova bitmap for storing dirties and walking the user bitmaps, so move to the common dependency into IOMMUFD. In doing so, create the symbol IOMMUFD_DRIVER which designates the builtin code that will be used by drivers when selected. Today this means MLX5_VFIO_PCI and PDS_VFIO_PCI. IOMMU drivers will do the same (in future patches) when supporting dirty tracking and select IOMMUFD_DRIVER accordingly. Given that the symbol maybe be disabled, add header definitions in iova_bitmap.h for when IOMMUFD_DRIVER=n Link: https://lore.kernel.org/r/20231024135109.73787-3-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Brett Creeley Reviewed-by: Kevin Tian Reviewed-by: Alex Williamson Signed-off-by: Jason Gunthorpe (cherry picked from commit 8c9c727b6142325ed5697240fceb99cbeb4ac2ec) Signed-off-by: Koba Ko --- drivers/iommu/Kconfig | 4 +++ drivers/iommu/iommufd/Makefile | 1 + drivers/{vfio => iommu/iommufd}/iova_bitmap.c | 0 drivers/vfio/Makefile | 3 +-- drivers/vfio/pci/mlx5/Kconfig | 1 + drivers/vfio/pci/pds/Kconfig | 1 + include/linux/iova_bitmap.h | 26 +++++++++++++++++++ 7 files changed, 34 insertions(+), 2 deletions(-) rename drivers/{vfio => iommu/iommufd}/iova_bitmap.c (100%) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 20aa5ed80aa38..ea49bce5cd71b 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -7,6 +7,10 @@ config IOMMU_IOVA config IOMMU_API bool +config IOMMUFD_DRIVER + bool + default n + menuconfig IOMMU_SUPPORT bool "IOMMU Hardware Support" depends on MMU diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile index 8aeba81800c51..34b446146961c 100644 --- a/drivers/iommu/iommufd/Makefile +++ b/drivers/iommu/iommufd/Makefile @@ -11,3 +11,4 @@ iommufd-y := \ iommufd-$(CONFIG_IOMMUFD_TEST) += selftest.o obj-$(CONFIG_IOMMUFD) += iommufd.o +obj-$(CONFIG_IOMMUFD_DRIVER) += iova_bitmap.o diff --git a/drivers/vfio/iova_bitmap.c b/drivers/iommu/iommufd/iova_bitmap.c similarity index 100% rename from drivers/vfio/iova_bitmap.c rename to drivers/iommu/iommufd/iova_bitmap.c diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index c82ea032d3521..68c05705200fc 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -1,8 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_VFIO) += vfio.o -vfio-y += vfio_main.o \ - iova_bitmap.o +vfio-y += vfio_main.o vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o vfio-$(CONFIG_VFIO_GROUP) += group.o vfio-$(CONFIG_IOMMUFD) += iommufd.o diff --git a/drivers/vfio/pci/mlx5/Kconfig b/drivers/vfio/pci/mlx5/Kconfig index 7088edc4fb28d..c3ced56b77876 100644 --- a/drivers/vfio/pci/mlx5/Kconfig +++ b/drivers/vfio/pci/mlx5/Kconfig @@ -3,6 +3,7 @@ config MLX5_VFIO_PCI tristate "VFIO support for MLX5 PCI devices" depends on MLX5_CORE select VFIO_PCI_CORE + select IOMMUFD_DRIVER help This provides migration support for MLX5 devices using the VFIO framework. diff --git a/drivers/vfio/pci/pds/Kconfig b/drivers/vfio/pci/pds/Kconfig index 6eceef7b028aa..fec9b167c7b9a 100644 --- a/drivers/vfio/pci/pds/Kconfig +++ b/drivers/vfio/pci/pds/Kconfig @@ -5,6 +5,7 @@ config PDS_VFIO_PCI tristate "VFIO support for PDS PCI devices" depends on PDS_CORE && PCI_IOV select VFIO_PCI_CORE + select IOMMUFD_DRIVER help This provides generic PCI support for PDS devices using the VFIO framework. diff --git a/include/linux/iova_bitmap.h b/include/linux/iova_bitmap.h index c006cf0a25f3d..1c338f5e5b7a6 100644 --- a/include/linux/iova_bitmap.h +++ b/include/linux/iova_bitmap.h @@ -7,6 +7,7 @@ #define _IOVA_BITMAP_H_ #include +#include struct iova_bitmap; @@ -14,6 +15,7 @@ typedef int (*iova_bitmap_fn_t)(struct iova_bitmap *bitmap, unsigned long iova, size_t length, void *opaque); +#if IS_ENABLED(CONFIG_IOMMUFD_DRIVER) struct iova_bitmap *iova_bitmap_alloc(unsigned long iova, size_t length, unsigned long page_size, u64 __user *data); @@ -22,5 +24,29 @@ int iova_bitmap_for_each(struct iova_bitmap *bitmap, void *opaque, iova_bitmap_fn_t fn); void iova_bitmap_set(struct iova_bitmap *bitmap, unsigned long iova, size_t length); +#else +static inline struct iova_bitmap *iova_bitmap_alloc(unsigned long iova, + size_t length, + unsigned long page_size, + u64 __user *data) +{ + return NULL; +} + +static inline void iova_bitmap_free(struct iova_bitmap *bitmap) +{ +} + +static inline int iova_bitmap_for_each(struct iova_bitmap *bitmap, void *opaque, + iova_bitmap_fn_t fn) +{ + return -EOPNOTSUPP; +} + +static inline void iova_bitmap_set(struct iova_bitmap *bitmap, + unsigned long iova, size_t length) +{ +} +#endif #endif From 72d2ea2efcb909c85586ffd2c66d23e8bbd239af Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:54 +0100 Subject: [PATCH 369/548] iommufd/iova_bitmap: Move symbols to IOMMUFD namespace Have the IOVA bitmap exported symbols adhere to the IOMMUFD symbol export convention i.e. using the IOMMUFD namespace. In doing so, import the namespace in the current users. This means VFIO and the vfio-pci drivers that use iova_bitmap_set(). Link: https://lore.kernel.org/r/20231024135109.73787-4-joao.m.martins@oracle.com Suggested-by: Jason Gunthorpe Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Brett Creeley Reviewed-by: Kevin Tian Reviewed-by: Alex Williamson Signed-off-by: Jason Gunthorpe (cherry picked from commit 13578d4ebe8be1c16146f37c0c91f2579611cff2) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iova_bitmap.c | 8 ++++---- drivers/vfio/pci/mlx5/main.c | 1 + drivers/vfio/pci/pds/pci_drv.c | 1 + drivers/vfio/vfio_main.c | 1 + 4 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/iommufd/iova_bitmap.c b/drivers/iommu/iommufd/iova_bitmap.c index a1e383d9a323a..e9b08e16711f8 100644 --- a/drivers/iommu/iommufd/iova_bitmap.c +++ b/drivers/iommu/iommufd/iova_bitmap.c @@ -269,7 +269,7 @@ struct iova_bitmap *iova_bitmap_alloc(unsigned long iova, size_t length, iova_bitmap_free(bitmap); return ERR_PTR(rc); } -EXPORT_SYMBOL_GPL(iova_bitmap_alloc); +EXPORT_SYMBOL_NS_GPL(iova_bitmap_alloc, IOMMUFD); /** * iova_bitmap_free() - Frees an IOVA bitmap object @@ -291,7 +291,7 @@ void iova_bitmap_free(struct iova_bitmap *bitmap) kfree(bitmap); } -EXPORT_SYMBOL_GPL(iova_bitmap_free); +EXPORT_SYMBOL_NS_GPL(iova_bitmap_free, IOMMUFD); /* * Returns the remaining bitmap indexes from mapped_total_index to process for @@ -390,7 +390,7 @@ int iova_bitmap_for_each(struct iova_bitmap *bitmap, void *opaque, return ret; } -EXPORT_SYMBOL_GPL(iova_bitmap_for_each); +EXPORT_SYMBOL_NS_GPL(iova_bitmap_for_each, IOMMUFD); /** * iova_bitmap_set() - Records an IOVA range in bitmap @@ -428,4 +428,4 @@ void iova_bitmap_set(struct iova_bitmap *bitmap, cur_bit += nbits; } while (cur_bit <= last_bit); } -EXPORT_SYMBOL_GPL(iova_bitmap_set); +EXPORT_SYMBOL_NS_GPL(iova_bitmap_set, IOMMUFD); diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index 42ec574a86221..5cf2b491d15a0 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -1376,6 +1376,7 @@ static struct pci_driver mlx5vf_pci_driver = { module_pci_driver(mlx5vf_pci_driver); +MODULE_IMPORT_NS(IOMMUFD); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Max Gurtovoy "); MODULE_AUTHOR("Yishai Hadas "); diff --git a/drivers/vfio/pci/pds/pci_drv.c b/drivers/vfio/pci/pds/pci_drv.c index caffa1a2cf591..a34dda5166293 100644 --- a/drivers/vfio/pci/pds/pci_drv.c +++ b/drivers/vfio/pci/pds/pci_drv.c @@ -204,6 +204,7 @@ static struct pci_driver pds_vfio_pci_driver = { module_pci_driver(pds_vfio_pci_driver); +MODULE_IMPORT_NS(IOMMUFD); MODULE_DESCRIPTION(PDS_VFIO_DRV_DESCRIPTION); MODULE_AUTHOR("Brett Creeley "); MODULE_LICENSE("GPL"); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index edb631e5e7ec9..b78dde0805b4e 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1694,6 +1694,7 @@ static void __exit vfio_cleanup(void) module_init(vfio_init); module_exit(vfio_cleanup); +MODULE_IMPORT_NS(IOMMUFD); MODULE_VERSION(DRIVER_VERSION); MODULE_LICENSE("GPL v2"); MODULE_AUTHOR(DRIVER_AUTHOR); From 629340a97b197b27482447a8404c09586b79b604 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:55 +0100 Subject: [PATCH 370/548] iommu: Add iommu_domain ops for dirty tracking Add to iommu domain operations a set of callbacks to perform dirty tracking, particulary to start and stop tracking and to read and clear the dirty data. Drivers are generally expected to dynamically change its translation structures to toggle the tracking and flush some form of control state structure that stands in the IOVA translation path. Though it's not mandatory, as drivers can also enable dirty tracking at boot, and just clear the dirty bits before setting dirty tracking. For each of the newly added IOMMU core APIs: iommu_cap::IOMMU_CAP_DIRTY_TRACKING: new device iommu_capable value when probing for capabilities of the device. .set_dirty_tracking(): an iommu driver is expected to change its translation structures and enable dirty tracking for the devices in the iommu_domain. For drivers making dirty tracking always-enabled, it should just return 0. .read_and_clear_dirty(): an iommu driver is expected to walk the pagetables for the iova range passed in and use iommu_dirty_bitmap_record() to record dirty info per IOVA. When detecting that a given IOVA is dirty it should also clear its dirty state from the PTE, *unless* the flag IOMMU_DIRTY_NO_CLEAR is passed in -- flushing is steered from the caller of the domain_op via iotlb_gather. The iommu core APIs use the same data structure in use for dirty tracking for VFIO device dirty (struct iova_bitmap) abstracted by iommu_dirty_bitmap_record() helper function. domain::dirty_ops: IOMMU domains will store the dirty ops depending on whether the iommu device supports dirty tracking or not. iommu drivers can then use this field to figure if the dirty tracking is supported+enforced on attach. The enforcement is enable via domain_alloc_user() which is done via IOMMUFD hwpt flag introduced later. Link: https://lore.kernel.org/r/20231024135109.73787-5-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 750e2e902b7180cb82d2f9b1e372e32087bb8b1b) Signed-off-by: Koba Ko --- include/linux/io-pgtable.h | 4 +++ include/linux/iommu.h | 70 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h index 1b7a44b35616c..25142a0e2fc2c 100644 --- a/include/linux/io-pgtable.h +++ b/include/linux/io-pgtable.h @@ -166,6 +166,10 @@ struct io_pgtable_ops { struct iommu_iotlb_gather *gather); phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops, unsigned long iova); + int (*read_and_clear_dirty)(struct io_pgtable_ops *ops, + unsigned long iova, size_t size, + unsigned long flags, + struct iommu_dirty_bitmap *dirty); }; /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ad5cf1420edcf..57aa6c9f0da8b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #define IOMMU_READ (1 << 0) @@ -37,6 +38,7 @@ struct bus_type; struct device; struct iommu_domain; struct iommu_domain_ops; +struct iommu_dirty_ops; struct notifier_block; struct iommu_sva; struct iommu_fault_event; @@ -99,6 +101,8 @@ struct iommu_domain_geometry { struct iommu_domain { unsigned type; const struct iommu_domain_ops *ops; + const struct iommu_dirty_ops *dirty_ops; + unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */ struct iommu_domain_geometry geometry; struct iommu_dma_cookie *iova_cookie; @@ -137,6 +141,7 @@ enum iommu_cap { * usefully support the non-strict DMA flush queue. */ IOMMU_CAP_DEFERRED_FLUSH, + IOMMU_CAP_DIRTY_TRACKING, /* IOMMU supports dirty tracking */ }; /* These are the possible reserved region types */ @@ -231,6 +236,35 @@ struct iommu_iotlb_gather { bool queued; }; +/** + * struct iommu_dirty_bitmap - Dirty IOVA bitmap state + * @bitmap: IOVA bitmap + * @gather: Range information for a pending IOTLB flush + */ +struct iommu_dirty_bitmap { + struct iova_bitmap *bitmap; + struct iommu_iotlb_gather *gather; +}; + +/* Read but do not clear any dirty bits */ +#define IOMMU_DIRTY_NO_CLEAR (1 << 0) + +/** + * struct iommu_dirty_ops - domain specific dirty tracking operations + * @set_dirty_tracking: Enable or Disable dirty tracking on the iommu domain + * @read_and_clear_dirty: Walk IOMMU page tables for dirtied PTEs marshalled + * into a bitmap, with a bit represented as a page. + * Reads the dirty PTE bits and clears it from IO + * pagetables. + */ +struct iommu_dirty_ops { + int (*set_dirty_tracking)(struct iommu_domain *domain, bool enabled); + int (*read_and_clear_dirty)(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long flags, + struct iommu_dirty_bitmap *dirty); +}; + /** * struct iommu_ops - iommu ops and capabilities * @capable: check capability @@ -652,6 +686,28 @@ static inline bool iommu_iotlb_gather_queued(struct iommu_iotlb_gather *gather) return gather && gather->queued; } +static inline void iommu_dirty_bitmap_init(struct iommu_dirty_bitmap *dirty, + struct iova_bitmap *bitmap, + struct iommu_iotlb_gather *gather) +{ + if (gather) + iommu_iotlb_gather_init(gather); + + dirty->bitmap = bitmap; + dirty->gather = gather; +} + +static inline void iommu_dirty_bitmap_record(struct iommu_dirty_bitmap *dirty, + unsigned long iova, + unsigned long length) +{ + if (dirty->bitmap) + iova_bitmap_set(dirty->bitmap, iova, length); + + if (dirty->gather) + iommu_iotlb_gather_add_range(dirty->gather, iova, length); +} + /* PCI device grouping function */ extern struct iommu_group *pci_device_group(struct device *dev); /* Generic device grouping function */ @@ -758,6 +814,8 @@ struct iommu_fwspec {}; struct iommu_device {}; struct iommu_fault_param {}; struct iommu_iotlb_gather {}; +struct iommu_dirty_bitmap {}; +struct iommu_dirty_ops {}; static inline bool iommu_present(const struct bus_type *bus) { @@ -990,6 +1048,18 @@ static inline bool iommu_iotlb_gather_queued(struct iommu_iotlb_gather *gather) return false; } +static inline void iommu_dirty_bitmap_init(struct iommu_dirty_bitmap *dirty, + struct iova_bitmap *bitmap, + struct iommu_iotlb_gather *gather) +{ +} + +static inline void iommu_dirty_bitmap_record(struct iommu_dirty_bitmap *dirty, + unsigned long iova, + unsigned long length) +{ +} + static inline void iommu_device_unregister(struct iommu_device *iommu) { } From 4297538c63360eec87d7d31ea04dc1c6f37e4c1b Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:56 +0100 Subject: [PATCH 371/548] iommufd: Add a flag to enforce dirty tracking on attach Throughout IOMMU domain lifetime that wants to use dirty tracking, some guarantees are needed such that any device attached to the iommu_domain supports dirty tracking. The idea is to handle a case where IOMMU in the system are assymetric feature-wise and thus the capability may not be supported for all devices. The enforcement is done by adding a flag into HWPT_ALLOC namely: IOMMU_HWPT_ALLOC_DIRTY_TRACKING .. Passed in HWPT_ALLOC ioctl() flags. The enforcement is done by creating a iommu_domain via domain_alloc_user() and validating the requested flags with what the device IOMMU supports (and failing accordingly) advertised). Advertising the new IOMMU domain feature flag requires that the individual iommu driver capability is supported when a future device attachment happens. Link: https://lore.kernel.org/r/20231024135109.73787-6-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 5f9bdbf4c65860cc8b9c544d92bfd76fbea8d9c5) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 4 +++- include/uapi/linux/iommufd.h | 3 +++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 8b3d2875d642d..dd50ca9e2c096 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -157,7 +157,9 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) struct iommufd_ioas *ioas; int rc; - if ((cmd->flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) || cmd->__reserved) + if ((cmd->flags & ~(IOMMU_HWPT_ALLOC_NEST_PARENT | + IOMMU_HWPT_ALLOC_DIRTY_TRACKING)) || + cmd->__reserved) return -EOPNOTSUPP; idev = iommufd_get_device(ucmd, cmd->dev_id); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index be7a95042677b..c76248410120a 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -351,9 +351,12 @@ struct iommu_vfio_ioas { * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT that can serve as * the parent HWPT in a nesting configuration. + * @IOMMU_HWPT_ALLOC_DIRTY_TRACKING: Dirty tracking support for device IOMMU is + * enforced on device attachment */ enum iommufd_hwpt_alloc_flags { IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0, + IOMMU_HWPT_ALLOC_DIRTY_TRACKING = 1 << 1, }; /** From 70a911e1715d0974976906febebac6dd8c591b8c Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:57 +0100 Subject: [PATCH 372/548] iommufd: Add IOMMU_HWPT_SET_DIRTY_TRACKING Every IOMMU driver should be able to implement the needed iommu domain ops to control dirty tracking. Connect a hw_pagetable to the IOMMU core dirty tracking ops, specifically the ability to enable/disable dirty tracking on an IOMMU domain (hw_pagetable id). To that end add an io_pagetable kernel API to toggle dirty tracking: * iopt_set_dirty_tracking(iopt, [domain], state) The intended caller of this is via the hw_pagetable object that is created. Internally it will ensure the leftover dirty state is cleared /right before/ dirty tracking starts. This is also useful for iommu drivers which may decide that dirty tracking is always-enabled at boot without wanting to toggle dynamically via corresponding iommu domain op. Link: https://lore.kernel.org/r/20231024135109.73787-7-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit e2a4b294784957fc28ecb1fed8a7e69da18eb18d) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 24 +++++++++++ drivers/iommu/iommufd/io_pagetable.c | 54 +++++++++++++++++++++++++ drivers/iommu/iommufd/iommufd_private.h | 12 ++++++ drivers/iommu/iommufd/main.c | 3 ++ include/uapi/linux/iommufd.h | 28 +++++++++++++ 5 files changed, 121 insertions(+) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index dd50ca9e2c096..c3b7bd9bfcbb9 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -196,3 +196,27 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) iommufd_put_object(&idev->obj); return rc; } + +int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd) +{ + struct iommu_hwpt_set_dirty_tracking *cmd = ucmd->cmd; + struct iommufd_hw_pagetable *hwpt; + struct iommufd_ioas *ioas; + int rc = -EOPNOTSUPP; + bool enable; + + if (cmd->flags & ~IOMMU_HWPT_DIRTY_TRACKING_ENABLE) + return rc; + + hwpt = iommufd_get_hwpt(ucmd, cmd->hwpt_id); + if (IS_ERR(hwpt)) + return PTR_ERR(hwpt); + + ioas = hwpt->ioas; + enable = cmd->flags & IOMMU_HWPT_DIRTY_TRACKING_ENABLE; + + rc = iopt_set_dirty_tracking(&ioas->iopt, hwpt->domain, enable); + + iommufd_put_object(&hwpt->obj); + return rc; +} diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c index e76b229399948..8e9c51c5d30a2 100644 --- a/drivers/iommu/iommufd/io_pagetable.c +++ b/drivers/iommu/iommufd/io_pagetable.c @@ -432,6 +432,60 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt, return 0; } +static int iopt_clear_dirty_data(struct io_pagetable *iopt, + struct iommu_domain *domain) +{ + const struct iommu_dirty_ops *ops = domain->dirty_ops; + struct iommu_iotlb_gather gather; + struct iommu_dirty_bitmap dirty; + struct iopt_area *area; + int ret = 0; + + lockdep_assert_held_read(&iopt->iova_rwsem); + + iommu_dirty_bitmap_init(&dirty, NULL, &gather); + + for (area = iopt_area_iter_first(iopt, 0, ULONG_MAX); area; + area = iopt_area_iter_next(area, 0, ULONG_MAX)) { + if (!area->pages) + continue; + + ret = ops->read_and_clear_dirty(domain, iopt_area_iova(area), + iopt_area_length(area), 0, + &dirty); + if (ret) + break; + } + + iommu_iotlb_sync(domain, &gather); + return ret; +} + +int iopt_set_dirty_tracking(struct io_pagetable *iopt, + struct iommu_domain *domain, bool enable) +{ + const struct iommu_dirty_ops *ops = domain->dirty_ops; + int ret = 0; + + if (!ops) + return -EOPNOTSUPP; + + down_read(&iopt->iova_rwsem); + + /* Clear dirty bits from PTEs to ensure a clean snapshot */ + if (enable) { + ret = iopt_clear_dirty_data(iopt, domain); + if (ret) + goto out_unlock; + } + + ret = ops->set_dirty_tracking(domain, enable); + +out_unlock: + up_read(&iopt->iova_rwsem); + return ret; +} + int iopt_get_pages(struct io_pagetable *iopt, unsigned long iova, unsigned long length, struct list_head *pages_list) { diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 3064997a0181e..b09750848da66 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -8,6 +8,7 @@ #include #include #include +#include struct iommu_domain; struct iommu_group; @@ -70,6 +71,9 @@ int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova, unsigned long length, unsigned long *unmapped); int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped); +int iopt_set_dirty_tracking(struct io_pagetable *iopt, + struct iommu_domain *domain, bool enable); + void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova, unsigned long length); int iopt_table_add_domain(struct io_pagetable *iopt, @@ -240,6 +244,14 @@ struct iommufd_hw_pagetable { struct list_head hwpt_item; }; +static inline struct iommufd_hw_pagetable * +iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id) +{ + return container_of(iommufd_get_object(ucmd->ictx, id, + IOMMUFD_OBJ_HW_PAGETABLE), + struct iommufd_hw_pagetable, obj); +} +int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd); struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, u32 flags, diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index e71523cbd0de4..46fedd779714f 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -307,6 +307,7 @@ union ucmd_buffer { struct iommu_destroy destroy; struct iommu_hw_info info; struct iommu_hwpt_alloc hwpt; + struct iommu_hwpt_set_dirty_tracking set_dirty_tracking; struct iommu_ioas_alloc alloc; struct iommu_ioas_allow_iovas allow_iovas; struct iommu_ioas_copy ioas_copy; @@ -342,6 +343,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = { __reserved), IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc, __reserved), + IOCTL_OP(IOMMU_HWPT_SET_DIRTY_TRACKING, iommufd_hwpt_set_dirty_tracking, + struct iommu_hwpt_set_dirty_tracking, __reserved), IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl, struct iommu_ioas_alloc, out_ioas_id), IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas, diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index c76248410120a..5c82b68c88f30 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -47,6 +47,7 @@ enum { IOMMUFD_CMD_VFIO_IOAS, IOMMUFD_CMD_HWPT_ALLOC, IOMMUFD_CMD_GET_HW_INFO, + IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING, }; /** @@ -453,4 +454,31 @@ struct iommu_hw_info { __u32 __reserved; }; #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO) + +/* + * enum iommufd_hwpt_set_dirty_tracking_flags - Flags for steering dirty + * tracking + * @IOMMU_HWPT_DIRTY_TRACKING_ENABLE: Enable dirty tracking + */ +enum iommufd_hwpt_set_dirty_tracking_flags { + IOMMU_HWPT_DIRTY_TRACKING_ENABLE = 1, +}; + +/** + * struct iommu_hwpt_set_dirty_tracking - ioctl(IOMMU_HWPT_SET_DIRTY_TRACKING) + * @size: sizeof(struct iommu_hwpt_set_dirty_tracking) + * @flags: Combination of enum iommufd_hwpt_set_dirty_tracking_flags + * @hwpt_id: HW pagetable ID that represents the IOMMU domain + * @__reserved: Must be 0 + * + * Toggle dirty tracking on an HW pagetable. + */ +struct iommu_hwpt_set_dirty_tracking { + __u32 size; + __u32 flags; + __u32 hwpt_id; + __u32 __reserved; +}; +#define IOMMU_HWPT_SET_DIRTY_TRACKING _IO(IOMMUFD_TYPE, \ + IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING) #endif From 3cbadb2bdb8dea165fed93cfd90823f02279c44c Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:58 +0100 Subject: [PATCH 373/548] iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP Connect a hw_pagetable to the IOMMU core dirty tracking read_and_clear_dirty iommu domain op. It exposes all of the functionality for the UAPI that read the dirtied IOVAs while clearing the Dirty bits from the PTEs. In doing so, add an IO pagetable API iopt_read_and_clear_dirty_data() that performs the reading of dirty IOPTEs for a given IOVA range and then copying back to userspace bitmap. Underneath it uses the IOMMU domain kernel API which will read the dirty bits, as well as atomically clearing the IOPTE dirty bit and flushing the IOTLB at the end. The IOVA bitmaps usage takes care of the iteration of the bitmaps user pages efficiently and without copies. Within the iterator function we iterate over io-pagetable contigous areas that have been mapped. Contrary to past incantation of a similar interface in VFIO the IOVA range to be scanned is tied in to the bitmap size, thus the application needs to pass a appropriately sized bitmap address taking into account the iova range being passed *and* page size ... as opposed to allowing bitmap-iova != iova. Link: https://lore.kernel.org/r/20231024135109.73787-8-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit b9a60d6f850e4470017b60f731220a58cda199aa) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 22 +++++ drivers/iommu/iommufd/io_pagetable.c | 113 ++++++++++++++++++++++++ drivers/iommu/iommufd/iommufd_private.h | 10 +++ drivers/iommu/iommufd/main.c | 4 + include/uapi/linux/iommufd.h | 35 ++++++++ 5 files changed, 184 insertions(+) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index c3b7bd9bfcbb9..7316f69110ef8 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -220,3 +220,25 @@ int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd) iommufd_put_object(&hwpt->obj); return rc; } + +int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd) +{ + struct iommu_hwpt_get_dirty_bitmap *cmd = ucmd->cmd; + struct iommufd_hw_pagetable *hwpt; + struct iommufd_ioas *ioas; + int rc = -EOPNOTSUPP; + + if ((cmd->flags || cmd->__reserved)) + return -EOPNOTSUPP; + + hwpt = iommufd_get_hwpt(ucmd, cmd->hwpt_id); + if (IS_ERR(hwpt)) + return PTR_ERR(hwpt); + + ioas = hwpt->ioas; + rc = iopt_read_and_clear_dirty_data(&ioas->iopt, hwpt->domain, + cmd->flags, cmd); + + iommufd_put_object(&hwpt->obj); + return rc; +} diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c index 8e9c51c5d30a2..ea6fd83202a21 100644 --- a/drivers/iommu/iommufd/io_pagetable.c +++ b/drivers/iommu/iommufd/io_pagetable.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "io_pagetable.h" #include "double_span.h" @@ -432,6 +433,118 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt, return 0; } +struct iova_bitmap_fn_arg { + struct io_pagetable *iopt; + struct iommu_domain *domain; + struct iommu_dirty_bitmap *dirty; +}; + +static int __iommu_read_and_clear_dirty(struct iova_bitmap *bitmap, + unsigned long iova, size_t length, + void *opaque) +{ + struct iopt_area *area; + struct iopt_area_contig_iter iter; + struct iova_bitmap_fn_arg *arg = opaque; + struct iommu_domain *domain = arg->domain; + struct iommu_dirty_bitmap *dirty = arg->dirty; + const struct iommu_dirty_ops *ops = domain->dirty_ops; + unsigned long last_iova = iova + length - 1; + int ret; + + iopt_for_each_contig_area(&iter, area, arg->iopt, iova, last_iova) { + unsigned long last = min(last_iova, iopt_area_last_iova(area)); + + ret = ops->read_and_clear_dirty(domain, iter.cur_iova, + last - iter.cur_iova + 1, 0, + dirty); + if (ret) + return ret; + } + + if (!iopt_area_contig_done(&iter)) + return -EINVAL; + return 0; +} + +static int +iommu_read_and_clear_dirty(struct iommu_domain *domain, + struct io_pagetable *iopt, unsigned long flags, + struct iommu_hwpt_get_dirty_bitmap *bitmap) +{ + const struct iommu_dirty_ops *ops = domain->dirty_ops; + struct iommu_iotlb_gather gather; + struct iommu_dirty_bitmap dirty; + struct iova_bitmap_fn_arg arg; + struct iova_bitmap *iter; + int ret = 0; + + if (!ops || !ops->read_and_clear_dirty) + return -EOPNOTSUPP; + + iter = iova_bitmap_alloc(bitmap->iova, bitmap->length, + bitmap->page_size, + u64_to_user_ptr(bitmap->data)); + if (IS_ERR(iter)) + return -ENOMEM; + + iommu_dirty_bitmap_init(&dirty, iter, &gather); + + arg.iopt = iopt; + arg.domain = domain; + arg.dirty = &dirty; + iova_bitmap_for_each(iter, &arg, __iommu_read_and_clear_dirty); + + iommu_iotlb_sync(domain, &gather); + iova_bitmap_free(iter); + + return ret; +} + +int iommufd_check_iova_range(struct io_pagetable *iopt, + struct iommu_hwpt_get_dirty_bitmap *bitmap) +{ + size_t iommu_pgsize = iopt->iova_alignment; + u64 last_iova; + + if (check_add_overflow(bitmap->iova, bitmap->length - 1, &last_iova)) + return -EOVERFLOW; + + if (bitmap->iova > ULONG_MAX || last_iova > ULONG_MAX) + return -EOVERFLOW; + + if ((bitmap->iova & (iommu_pgsize - 1)) || + ((last_iova + 1) & (iommu_pgsize - 1))) + return -EINVAL; + + if (!bitmap->page_size) + return -EINVAL; + + if ((bitmap->iova & (bitmap->page_size - 1)) || + ((last_iova + 1) & (bitmap->page_size - 1))) + return -EINVAL; + + return 0; +} + +int iopt_read_and_clear_dirty_data(struct io_pagetable *iopt, + struct iommu_domain *domain, + unsigned long flags, + struct iommu_hwpt_get_dirty_bitmap *bitmap) +{ + int ret; + + ret = iommufd_check_iova_range(iopt, bitmap); + if (ret) + return ret; + + down_read(&iopt->iova_rwsem); + ret = iommu_read_and_clear_dirty(domain, iopt, flags, bitmap); + up_read(&iopt->iova_rwsem); + + return ret; +} + static int iopt_clear_dirty_data(struct io_pagetable *iopt, struct iommu_domain *domain) { diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index b09750848da66..034129130db37 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -8,6 +8,8 @@ #include #include #include +#include +#include #include struct iommu_domain; @@ -71,6 +73,10 @@ int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova, unsigned long length, unsigned long *unmapped); int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped); +int iopt_read_and_clear_dirty_data(struct io_pagetable *iopt, + struct iommu_domain *domain, + unsigned long flags, + struct iommu_hwpt_get_dirty_bitmap *bitmap); int iopt_set_dirty_tracking(struct io_pagetable *iopt, struct iommu_domain *domain, bool enable); @@ -226,6 +232,8 @@ int iommufd_option_rlimit_mode(struct iommu_option *cmd, struct iommufd_ctx *ictx); int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd); +int iommufd_check_iova_range(struct io_pagetable *iopt, + struct iommu_hwpt_get_dirty_bitmap *bitmap); /* * A HW pagetable is called an iommu_domain inside the kernel. This user object @@ -252,6 +260,8 @@ iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id) struct iommufd_hw_pagetable, obj); } int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd); +int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd); + struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, u32 flags, diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 46fedd779714f..d50f42a730aa3 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -307,6 +307,7 @@ union ucmd_buffer { struct iommu_destroy destroy; struct iommu_hw_info info; struct iommu_hwpt_alloc hwpt; + struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap; struct iommu_hwpt_set_dirty_tracking set_dirty_tracking; struct iommu_ioas_alloc alloc; struct iommu_ioas_allow_iovas allow_iovas; @@ -343,6 +344,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = { __reserved), IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc, __reserved), + IOCTL_OP(IOMMU_HWPT_GET_DIRTY_BITMAP, iommufd_hwpt_get_dirty_bitmap, + struct iommu_hwpt_get_dirty_bitmap, data), IOCTL_OP(IOMMU_HWPT_SET_DIRTY_TRACKING, iommufd_hwpt_set_dirty_tracking, struct iommu_hwpt_set_dirty_tracking, __reserved), IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl, @@ -555,5 +558,6 @@ MODULE_ALIAS_MISCDEV(VFIO_MINOR); MODULE_ALIAS("devname:vfio/vfio"); #endif MODULE_IMPORT_NS(IOMMUFD_INTERNAL); +MODULE_IMPORT_NS(IOMMUFD); MODULE_DESCRIPTION("I/O Address Space Management for passthrough devices"); MODULE_LICENSE("GPL"); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 5c82b68c88f30..dce38e32ca841 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -48,6 +48,7 @@ enum { IOMMUFD_CMD_HWPT_ALLOC, IOMMUFD_CMD_GET_HW_INFO, IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING, + IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP, }; /** @@ -481,4 +482,38 @@ struct iommu_hwpt_set_dirty_tracking { }; #define IOMMU_HWPT_SET_DIRTY_TRACKING _IO(IOMMUFD_TYPE, \ IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING) + +/** + * struct iommu_hwpt_get_dirty_bitmap - ioctl(IOMMU_HWPT_GET_DIRTY_BITMAP) + * @size: sizeof(struct iommu_hwpt_get_dirty_bitmap) + * @hwpt_id: HW pagetable ID that represents the IOMMU domain + * @flags: Must be zero + * @__reserved: Must be 0 + * @iova: base IOVA of the bitmap first bit + * @length: IOVA range size + * @page_size: page size granularity of each bit in the bitmap + * @data: bitmap where to set the dirty bits. The bitmap bits each + * represent a page_size which you deviate from an arbitrary iova. + * + * Checking a given IOVA is dirty: + * + * data[(iova / page_size) / 64] & (1ULL << ((iova / page_size) % 64)) + * + * Walk the IOMMU pagetables for a given IOVA range to return a bitmap + * with the dirty IOVAs. In doing so it will also by default clear any + * dirty bit metadata set in the IOPTE. + */ +struct iommu_hwpt_get_dirty_bitmap { + __u32 size; + __u32 hwpt_id; + __u32 flags; + __u32 __reserved; + __aligned_u64 iova; + __aligned_u64 length; + __aligned_u64 page_size; + __aligned_u64 data; +}; +#define IOMMU_HWPT_GET_DIRTY_BITMAP _IO(IOMMUFD_TYPE, \ + IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP) + #endif From af749fe5e7f74df03129ed132f0d9f9d40598fb0 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:50:59 +0100 Subject: [PATCH 374/548] iommufd: Add capabilities to IOMMU_GET_HW_INFO Extend IOMMUFD_CMD_GET_HW_INFO op to query generic iommu capabilities for a given device. Capabilities are IOMMU agnostic and use device_iommu_capable() API passing one of the IOMMU_CAP_*. Enumerate IOMMU_CAP_DIRTY_TRACKING for now in the out_capabilities field returned back to userspace. Link: https://lore.kernel.org/r/20231024135109.73787-9-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 7623683857e52b75184d37862c70f1230aef2edd) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 4 ++++ include/uapi/linux/iommufd.h | 17 +++++++++++++++++ 2 files changed, 21 insertions(+) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 3f6d4a2d6b44a..cd0c96bf86e1a 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -1201,6 +1201,10 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd) */ cmd->data_len = data_len; + cmd->out_capabilities = 0; + if (device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING)) + cmd->out_capabilities |= IOMMU_HW_CAP_DIRTY_TRACKING; + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_free: kfree(data); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index dce38e32ca841..036ebc6c19cf9 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -418,6 +418,20 @@ enum iommu_hw_info_type { IOMMU_HW_INFO_TYPE_INTEL_VTD, }; +/** + * enum iommufd_hw_capabilities + * @IOMMU_HW_CAP_DIRTY_TRACKING: IOMMU hardware support for dirty tracking + * If available, it means the following APIs + * are supported: + * + * IOMMU_HWPT_GET_DIRTY_BITMAP + * IOMMU_HWPT_SET_DIRTY_TRACKING + * + */ +enum iommufd_hw_capabilities { + IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0, +}; + /** * struct iommu_hw_info - ioctl(IOMMU_GET_HW_INFO) * @size: sizeof(struct iommu_hw_info) @@ -429,6 +443,8 @@ enum iommu_hw_info_type { * the iommu type specific hardware information data * @out_data_type: Output the iommu hardware info type as defined in the enum * iommu_hw_info_type. + * @out_capabilities: Output the generic iommu capability info type as defined + * in the enum iommu_hw_capabilities. * @__reserved: Must be 0 * * Query an iommu type specific hardware information data from an iommu behind @@ -453,6 +469,7 @@ struct iommu_hw_info { __aligned_u64 data_uptr; __u32 out_data_type; __u32 __reserved; + __aligned_u64 out_capabilities; }; #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO) From 01f7aacb7eff3bc56118ac5fcbc0dea472824440 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:00 +0100 Subject: [PATCH 375/548] iommufd: Add a flag to skip clearing of IOPTE dirty VFIO has an operation where it unmaps an IOVA while returning a bitmap with the dirty data. In reality the operation doesn't quite query the IO pagetables that the PTE was dirty or not. Instead it marks as dirty on anything that was mapped, and doing so in one syscall. In IOMMUFD the equivalent is done in two operations by querying with GET_DIRTY_IOVA followed by UNMAP_IOVA. However, this would incur two TLB flushes given that after clearing dirty bits IOMMU implementations require invalidating their IOTLB, plus another invalidation needed for the UNMAP. To allow dirty bits to be queried faster, add a flag (IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR) that requests to not clear the dirty bits from the PTE (but just reading them), under the expectation that the next operation is the unmap. An alternative is to unmap and just perpectually mark as dirty as that's the same behaviour as today. So here equivalent functionally can be provided with unmap alone, and if real dirty info is required it will amortize the cost while querying. There's still a race against DMA where in theory the unmap of the IOVA (when the guest invalidates the IOTLB via emulated iommu) would race against the VF performing DMA on the same IOVA. As discussed in [0], we are accepting to resolve this race as throwing away the DMA and it doesn't matter if it hit physical DRAM or not, the VM can't tell if we threw it away because the DMA was blocked or because we failed to copy the DRAM. [0] https://lore.kernel.org/linux-iommu/20220502185239.GR8364@nvidia.com/ Link: https://lore.kernel.org/r/20231024135109.73787-10-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 609848132c71316df3260d1ec066539c21bba585) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 3 ++- drivers/iommu/iommufd/io_pagetable.c | 9 +++++++-- include/uapi/linux/iommufd.h | 15 ++++++++++++++- 3 files changed, 23 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 7316f69110ef8..72a5269984b01 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -228,7 +228,8 @@ int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd) struct iommufd_ioas *ioas; int rc = -EOPNOTSUPP; - if ((cmd->flags || cmd->__reserved)) + if ((cmd->flags & ~(IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR)) || + cmd->__reserved) return -EOPNOTSUPP; hwpt = iommufd_get_hwpt(ucmd, cmd->hwpt_id); diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c index ea6fd83202a21..9f193c933de6e 100644 --- a/drivers/iommu/iommufd/io_pagetable.c +++ b/drivers/iommu/iommufd/io_pagetable.c @@ -434,6 +434,7 @@ int iopt_map_user_pages(struct iommufd_ctx *ictx, struct io_pagetable *iopt, } struct iova_bitmap_fn_arg { + unsigned long flags; struct io_pagetable *iopt; struct iommu_domain *domain; struct iommu_dirty_bitmap *dirty; @@ -450,13 +451,14 @@ static int __iommu_read_and_clear_dirty(struct iova_bitmap *bitmap, struct iommu_dirty_bitmap *dirty = arg->dirty; const struct iommu_dirty_ops *ops = domain->dirty_ops; unsigned long last_iova = iova + length - 1; + unsigned long flags = arg->flags; int ret; iopt_for_each_contig_area(&iter, area, arg->iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); ret = ops->read_and_clear_dirty(domain, iter.cur_iova, - last - iter.cur_iova + 1, 0, + last - iter.cur_iova + 1, flags, dirty); if (ret) return ret; @@ -490,12 +492,15 @@ iommu_read_and_clear_dirty(struct iommu_domain *domain, iommu_dirty_bitmap_init(&dirty, iter, &gather); + arg.flags = flags; arg.iopt = iopt; arg.domain = domain; arg.dirty = &dirty; iova_bitmap_for_each(iter, &arg, __iommu_read_and_clear_dirty); - iommu_iotlb_sync(domain, &gather); + if (!(flags & IOMMU_DIRTY_NO_CLEAR)) + iommu_iotlb_sync(domain, &gather); + iova_bitmap_free(iter); return ret; diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 036ebc6c19cf9..c44eecf5d318e 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -500,11 +500,24 @@ struct iommu_hwpt_set_dirty_tracking { #define IOMMU_HWPT_SET_DIRTY_TRACKING _IO(IOMMUFD_TYPE, \ IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING) +/** + * enum iommufd_hwpt_get_dirty_bitmap_flags - Flags for getting dirty bits + * @IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR: Just read the PTEs without clearing + * any dirty bits metadata. This flag + * can be passed in the expectation + * where the next operation is an unmap + * of the same IOVA range. + * + */ +enum iommufd_hwpt_get_dirty_bitmap_flags { + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR = 1, +}; + /** * struct iommu_hwpt_get_dirty_bitmap - ioctl(IOMMU_HWPT_GET_DIRTY_BITMAP) * @size: sizeof(struct iommu_hwpt_get_dirty_bitmap) * @hwpt_id: HW pagetable ID that represents the IOMMU domain - * @flags: Must be zero + * @flags: Combination of enum iommufd_hwpt_get_dirty_bitmap_flags * @__reserved: Must be 0 * @iova: base IOVA of the bitmap first bit * @length: IOVA range size From 52076e9ba6ef80608919f2b4a9f826c092c007ca Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:03 +0100 Subject: [PATCH 376/548] iommu/vt-d: Access/Dirty bit support for SS domains IOMMU advertises Access/Dirty bits for second-stage page table if the extended capability DMAR register reports it (ECAP, mnemonic ECAP.SSADS). The first stage table is compatible with CPU page table thus A/D bits are implicitly supported. Relevant Intel IOMMU SDM ref for first stage table "3.6.2 Accessed, Extended Accessed, and Dirty Flags" and second stage table "3.7.2 Accessed and Dirty Flags". First stage page table is enabled by default so it's allowed to set dirty tracking and no control bits needed, it just returns 0. To use SSADS, set bit 9 (SSADE) in the scalable-mode PASID table entry and flush the IOTLB via pasid_flush_caches() following the manual. Relevant SDM refs: "3.7.2 Accessed and Dirty Flags" "6.5.3.3 Guidance to Software for Invalidations, Table 23. Guidance to Software for Invalidations" PTE dirty bit is located in bit 9 and it's cached in the IOTLB so flush IOTLB to make sure IOMMU attempts to set the dirty bit again. Note that iommu_dirty_bitmap_record() will add the IOVA to iotlb_gather and thus the caller of the iommu op will flush the IOTLB. Relevant manuals over the hardware translation is chapter 6 with some special mention to: "6.2.3.1 Scalable-Mode PASID-Table Entry Programming Considerations" "6.2.4 IOTLB" Select IOMMUFD_DRIVER only if IOMMUFD is enabled, given that IOMMU dirty tracking requires IOMMUFD. Link: https://lore.kernel.org/r/20231024135109.73787-13-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit f35f22cc760eb2c7034bf53251399685d611e03f) Signed-off-by: Koba Ko --- drivers/iommu/intel/Kconfig | 1 + drivers/iommu/intel/iommu.c | 103 +++++++++++++++++++++++++++++++++- drivers/iommu/intel/iommu.h | 16 ++++++ drivers/iommu/intel/pasid.c | 109 ++++++++++++++++++++++++++++++++++++ drivers/iommu/intel/pasid.h | 4 ++ 5 files changed, 232 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig index 2e56bd79f589d..f5348b80652b6 100644 --- a/drivers/iommu/intel/Kconfig +++ b/drivers/iommu/intel/Kconfig @@ -15,6 +15,7 @@ config INTEL_IOMMU select DMA_OPS select IOMMU_API select IOMMU_IOVA + select IOMMUFD_DRIVER if IOMMUFD select NEED_DMA_MAP_STATE select DMAR_TABLE select SWIOTLB diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d9c6ac2398eba..19709122947e6 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -300,6 +300,7 @@ static int iommu_skip_te_disable; #define IDENTMAP_AZALIA 4 const struct iommu_ops intel_iommu_ops; +const struct iommu_dirty_ops intel_dirty_ops; static bool translation_pre_enabled(struct intel_iommu *iommu) { @@ -4080,8 +4081,10 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags) { struct iommu_domain *domain; struct intel_iommu *iommu; + bool dirty_tracking; - if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) + if (flags & + (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) return ERR_PTR(-EOPNOTSUPP); iommu = device_to_iommu(dev, NULL, NULL); @@ -4091,6 +4094,10 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags) if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !nested_supported(iommu)) return ERR_PTR(-EOPNOTSUPP); + dirty_tracking = (flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING); + if (dirty_tracking && !ssads_supported(iommu)) + return ERR_PTR(-EOPNOTSUPP); + /* * domain_alloc_user op needs to fully initialize a domain * before return, so uses iommu_domain_alloc() here for @@ -4099,6 +4106,15 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags) domain = iommu_domain_alloc(dev->bus); if (!domain) domain = ERR_PTR(-ENOMEM); + + if (!IS_ERR(domain) && dirty_tracking) { + if (to_dmar_domain(domain)->use_first_level) { + iommu_domain_free(domain); + return ERR_PTR(-EOPNOTSUPP); + } + domain->dirty_ops = &intel_dirty_ops; + } + return domain; } @@ -4122,6 +4138,9 @@ static int prepare_domain_attach_device(struct iommu_domain *domain, if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap)) return -EINVAL; + if (domain->dirty_ops && !ssads_supported(iommu)) + return -EINVAL; + /* check if this iommu agaw is sufficient for max mapped address */ addr_width = agaw_to_width(iommu->agaw); if (addr_width > cap_mgaw(iommu->cap)) @@ -4377,6 +4396,8 @@ static bool intel_iommu_capable(struct device *dev, enum iommu_cap cap) return dmar_platform_optin(); case IOMMU_CAP_ENFORCE_CACHE_COHERENCY: return ecap_sc_support(info->iommu->ecap); + case IOMMU_CAP_DIRTY_TRACKING: + return ssads_supported(info->iommu); default: return false; } @@ -4774,6 +4795,9 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain, if (!pasid_supported(iommu) || dev_is_real_dma_subdevice(dev)) return -EOPNOTSUPP; + if (domain->dirty_ops) + return -EINVAL; + if (context_copied(iommu, info->bus, info->devfn)) return -EBUSY; @@ -4832,6 +4856,83 @@ static void *intel_iommu_hw_info(struct device *dev, u32 *length, u32 *type) return vtd; } +static int intel_iommu_set_dirty_tracking(struct iommu_domain *domain, + bool enable) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + struct device_domain_info *info; + int ret; + + spin_lock(&dmar_domain->lock); + if (dmar_domain->dirty_tracking == enable) + goto out_unlock; + + list_for_each_entry(info, &dmar_domain->devices, link) { + ret = intel_pasid_setup_dirty_tracking(info->iommu, + info->domain, info->dev, + IOMMU_NO_PASID, enable); + if (ret) + goto err_unwind; + } + + dmar_domain->dirty_tracking = enable; +out_unlock: + spin_unlock(&dmar_domain->lock); + + return 0; + +err_unwind: + list_for_each_entry(info, &dmar_domain->devices, link) + intel_pasid_setup_dirty_tracking(info->iommu, dmar_domain, + info->dev, IOMMU_NO_PASID, + dmar_domain->dirty_tracking); + spin_unlock(&dmar_domain->lock); + return ret; +} + +static int intel_iommu_read_and_clear_dirty(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long flags, + struct iommu_dirty_bitmap *dirty) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + unsigned long end = iova + size - 1; + unsigned long pgsize; + + /* + * IOMMUFD core calls into a dirty tracking disabled domain without an + * IOVA bitmap set in order to clean dirty bits in all PTEs that might + * have occurred when we stopped dirty tracking. This ensures that we + * never inherit dirtied bits from a previous cycle. + */ + if (!dmar_domain->dirty_tracking && dirty->bitmap) + return -EINVAL; + + do { + struct dma_pte *pte; + int lvl = 0; + + pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &lvl, + GFP_ATOMIC); + pgsize = level_size(lvl) << VTD_PAGE_SHIFT; + if (!pte || !dma_pte_present(pte)) { + iova += pgsize; + continue; + } + + if (dma_sl_pte_test_and_clear_dirty(pte, flags)) + iommu_dirty_bitmap_record(dirty, iova, pgsize); + iova += pgsize; + } while (iova < end); + + return 0; +} + +const struct iommu_dirty_ops intel_dirty_ops = { + .set_dirty_tracking = intel_iommu_set_dirty_tracking, + .read_and_clear_dirty = intel_iommu_read_and_clear_dirty, +}; + const struct iommu_ops intel_iommu_ops = { .capable = intel_iommu_capable, .hw_info = intel_iommu_hw_info, diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index a5d27d61e1567..49ea164cb006f 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -48,6 +48,9 @@ #define DMA_FL_PTE_DIRTY BIT_ULL(6) #define DMA_FL_PTE_XD BIT_ULL(63) +#define DMA_SL_PTE_DIRTY_BIT 9 +#define DMA_SL_PTE_DIRTY BIT_ULL(DMA_SL_PTE_DIRTY_BIT) + #define ADDR_WIDTH_5LEVEL (57) #define ADDR_WIDTH_4LEVEL (48) @@ -539,6 +542,8 @@ enum { #define sm_supported(iommu) (intel_iommu_sm && ecap_smts((iommu)->ecap)) #define pasid_supported(iommu) (sm_supported(iommu) && \ ecap_pasid((iommu)->ecap)) +#define ssads_supported(iommu) (sm_supported(iommu) && \ + ecap_slads((iommu)->ecap)) #define nested_supported(iommu) (sm_supported(iommu) && \ ecap_nest((iommu)->ecap)) @@ -597,6 +602,7 @@ struct dmar_domain { u8 has_mappings:1; /* Has mappings configured through * iommu_map() interface. */ + u8 dirty_tracking:1; /* Dirty tracking is enabled */ spinlock_t lock; /* Protect device tracking lists */ struct list_head devices; /* all devices' list */ @@ -786,6 +792,16 @@ static inline bool dma_pte_present(struct dma_pte *pte) return (pte->val & 3) != 0; } +static inline bool dma_sl_pte_test_and_clear_dirty(struct dma_pte *pte, + unsigned long flags) +{ + if (flags & IOMMU_DIRTY_NO_CLEAR) + return (pte->val & DMA_SL_PTE_DIRTY) != 0; + + return test_and_clear_bit(DMA_SL_PTE_DIRTY_BIT, + (unsigned long *)&pte->val); +} + static inline bool dma_pte_superpage(struct dma_pte *pte) { return (pte->val & DMA_PTE_LARGE_PAGE); diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index 8faa93cffac45..06ea2dd535421 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -277,6 +277,11 @@ static inline void pasid_set_bits(u64 *ptr, u64 mask, u64 bits) WRITE_ONCE(*ptr, (old & ~mask) | bits); } +static inline u64 pasid_get_bits(u64 *ptr) +{ + return READ_ONCE(*ptr); +} + /* * Setup the DID(Domain Identifier) field (Bit 64~79) of scalable mode * PASID entry. @@ -335,6 +340,36 @@ static inline void pasid_set_fault_enable(struct pasid_entry *pe) pasid_set_bits(&pe->val[0], 1 << 1, 0); } +/* + * Enable second level A/D bits by setting the SLADE (Second Level + * Access Dirty Enable) field (Bit 9) of a scalable mode PASID + * entry. + */ +static inline void pasid_set_ssade(struct pasid_entry *pe) +{ + pasid_set_bits(&pe->val[0], 1 << 9, 1 << 9); +} + +/* + * Disable second level A/D bits by clearing the SLADE (Second Level + * Access Dirty Enable) field (Bit 9) of a scalable mode PASID + * entry. + */ +static inline void pasid_clear_ssade(struct pasid_entry *pe) +{ + pasid_set_bits(&pe->val[0], 1 << 9, 0); +} + +/* + * Checks if second level A/D bits specifically the SLADE (Second Level + * Access Dirty Enable) field (Bit 9) of a scalable mode PASID + * entry is set. + */ +static inline bool pasid_get_ssade(struct pasid_entry *pe) +{ + return pasid_get_bits(&pe->val[0]) & (1 << 9); +} + /* * Setup the WPE(Write Protect Enable) field (Bit 132) of a * scalable mode PASID entry. @@ -630,6 +665,8 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu, pasid_set_translation_type(pte, PASID_ENTRY_PGTT_SL_ONLY); pasid_set_fault_enable(pte); pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap)); + if (domain->dirty_tracking) + pasid_set_ssade(pte); pasid_set_present(pte); spin_unlock(&iommu->lock); @@ -639,6 +676,78 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu, return 0; } +/* + * Set up dirty tracking on a second only or nested translation type. + */ +int intel_pasid_setup_dirty_tracking(struct intel_iommu *iommu, + struct dmar_domain *domain, + struct device *dev, u32 pasid, + bool enabled) +{ + struct pasid_entry *pte; + u16 did, pgtt; + + spin_lock(&iommu->lock); + + pte = intel_pasid_get_entry(dev, pasid); + if (!pte) { + spin_unlock(&iommu->lock); + dev_err_ratelimited( + dev, "Failed to get pasid entry of PASID %d\n", pasid); + return -ENODEV; + } + + did = domain_id_iommu(domain, iommu); + pgtt = pasid_pte_get_pgtt(pte); + if (pgtt != PASID_ENTRY_PGTT_SL_ONLY && + pgtt != PASID_ENTRY_PGTT_NESTED) { + spin_unlock(&iommu->lock); + dev_err_ratelimited( + dev, + "Dirty tracking not supported on translation type %d\n", + pgtt); + return -EOPNOTSUPP; + } + + if (pasid_get_ssade(pte) == enabled) { + spin_unlock(&iommu->lock); + return 0; + } + + if (enabled) + pasid_set_ssade(pte); + else + pasid_clear_ssade(pte); + spin_unlock(&iommu->lock); + + if (!ecap_coherent(iommu->ecap)) + clflush_cache_range(pte, sizeof(*pte)); + + /* + * From VT-d spec table 25 "Guidance to Software for Invalidations": + * + * - PASID-selective-within-Domain PASID-cache invalidation + * If (PGTT=SS or Nested) + * - Domain-selective IOTLB invalidation + * Else + * - PASID-selective PASID-based IOTLB invalidation + * - If (pasid is RID_PASID) + * - Global Device-TLB invalidation to affected functions + * Else + * - PASID-based Device-TLB invalidation (with S=1 and + * Addr[63:12]=0x7FFFFFFF_FFFFF) to affected functions + */ + pasid_cache_invalidation_with_pasid(iommu, did, pasid); + + iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH); + + /* Device IOTLB doesn't need to be flushed in caching mode. */ + if (!cap_caching_mode(iommu->cap)) + devtlb_invalidation_with_pasid(iommu, dev, pasid); + + return 0; +} + /* * Set up the scalable mode pasid entry for passthrough translation type. */ diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h index 4e9e68c3c3888..958050b093aa2 100644 --- a/drivers/iommu/intel/pasid.h +++ b/drivers/iommu/intel/pasid.h @@ -106,6 +106,10 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu, int intel_pasid_setup_second_level(struct intel_iommu *iommu, struct dmar_domain *domain, struct device *dev, u32 pasid); +int intel_pasid_setup_dirty_tracking(struct intel_iommu *iommu, + struct dmar_domain *domain, + struct device *dev, u32 pasid, + bool enabled); int intel_pasid_setup_pass_through(struct intel_iommu *iommu, struct dmar_domain *domain, struct device *dev, u32 pasid); From 2f751fa28d801ab78158c85733da62a5ed9df4aa Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:04 +0100 Subject: [PATCH 377/548] iommufd/selftest: Expand mock_domain with dev_flags Expand mock_domain test to be able to manipulate the device capabilities. This allows testing with mockdev without dirty tracking support advertised and thus make sure enforce_dirty test does the expected. To avoid breaking IOMMUFD_TEST UABI replicate the mock_domain struct and thus add an input dev_flags at the end. Link: https://lore.kernel.org/r/20231024135109.73787-14-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit e04b23c8d4ed977dbab4a4159f9e4d9a878b5c65) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iommufd_test.h | 12 ++++++++ drivers/iommu/iommufd/selftest.c | 11 +++++-- tools/testing/selftests/iommu/iommufd_utils.h | 29 +++++++++++++++++++ 3 files changed, 50 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index 3f3644375bf13..9817edcd89683 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -19,6 +19,7 @@ enum { IOMMU_TEST_OP_SET_TEMP_MEMORY_LIMIT, IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE, IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, + IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS, }; enum { @@ -40,6 +41,10 @@ enum { MOCK_FLAGS_ACCESS_CREATE_NEEDS_PIN_PAGES = 1 << 0, }; +enum { + MOCK_FLAGS_DEVICE_NO_DIRTY = 1 << 0, +}; + struct iommu_test_cmd { __u32 size; __u32 op; @@ -56,6 +61,13 @@ struct iommu_test_cmd { /* out_idev_id is the standard iommufd_bind object */ __u32 out_idev_id; } mock_domain; + struct { + __u32 out_stdev_id; + __u32 out_hwpt_id; + __u32 out_idev_id; + /* Expand mock_domain to set mock device flags */ + __u32 dev_flags; + } mock_domain_flags; struct { __u32 pt_id; } mock_domain_replace; diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index f68e4481ae95f..da7414be92e3d 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -111,6 +111,7 @@ enum selftest_obj_type { struct mock_dev { struct device dev; + unsigned long flags; }; struct selftest_obj { @@ -387,7 +388,7 @@ static void mock_dev_release(struct device *dev) kfree(mdev); } -static struct mock_dev *mock_dev_create(void) +static struct mock_dev *mock_dev_create(unsigned long dev_flags) { struct mock_dev *mdev; int rc; @@ -397,6 +398,7 @@ static struct mock_dev *mock_dev_create(void) return ERR_PTR(-ENOMEM); device_initialize(&mdev->dev); + mdev->flags = dev_flags; mdev->dev.release = mock_dev_release; mdev->dev.bus = &iommufd_mock_bus_type.bus; @@ -432,6 +434,7 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd, struct iommufd_device *idev; struct selftest_obj *sobj; u32 pt_id = cmd->id; + u32 dev_flags = 0; u32 idev_id; int rc; @@ -442,7 +445,10 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd, sobj->idev.ictx = ucmd->ictx; sobj->type = TYPE_IDEV; - sobj->idev.mock_dev = mock_dev_create(); + if (cmd->op == IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS) + dev_flags = cmd->mock_domain_flags.dev_flags; + + sobj->idev.mock_dev = mock_dev_create(dev_flags); if (IS_ERR(sobj->idev.mock_dev)) { rc = PTR_ERR(sobj->idev.mock_dev); goto out_sobj; @@ -1025,6 +1031,7 @@ int iommufd_test(struct iommufd_ucmd *ucmd) cmd->add_reserved.start, cmd->add_reserved.length); case IOMMU_TEST_OP_MOCK_DOMAIN: + case IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS: return iommufd_test_mock_domain(ucmd, cmd); case IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE: return iommufd_test_mock_domain_replace( diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index be4970a849776..1e0736adc9912 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -74,6 +74,35 @@ static int _test_cmd_mock_domain(int fd, unsigned int ioas_id, __u32 *stdev_id, EXPECT_ERRNO(_errno, _test_cmd_mock_domain(self->fd, ioas_id, \ stdev_id, hwpt_id, NULL)) +static int _test_cmd_mock_domain_flags(int fd, unsigned int ioas_id, + __u32 stdev_flags, __u32 *stdev_id, + __u32 *hwpt_id, __u32 *idev_id) +{ + struct iommu_test_cmd cmd = { + .size = sizeof(cmd), + .op = IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS, + .id = ioas_id, + .mock_domain_flags = { .dev_flags = stdev_flags }, + }; + int ret; + + ret = ioctl(fd, IOMMU_TEST_CMD, &cmd); + if (ret) + return ret; + if (stdev_id) + *stdev_id = cmd.mock_domain_flags.out_stdev_id; + assert(cmd.id != 0); + if (hwpt_id) + *hwpt_id = cmd.mock_domain_flags.out_hwpt_id; + if (idev_id) + *idev_id = cmd.mock_domain_flags.out_idev_id; + return 0; +} +#define test_err_mock_domain_flags(_errno, ioas_id, flags, stdev_id, hwpt_id) \ + EXPECT_ERRNO(_errno, \ + _test_cmd_mock_domain_flags(self->fd, ioas_id, flags, \ + stdev_id, hwpt_id, NULL)) + static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id, __u32 *hwpt_id) { From 92b158bb6edcff3945fd9ba5f17c5ed50ef7f842 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:05 +0100 Subject: [PATCH 378/548] iommufd/selftest: Test IOMMU_HWPT_ALLOC_DIRTY_TRACKING In order to selftest the iommu domain dirty enforcing implement the mock_domain necessary support and add a new dev_flags to test that the hwpt_alloc/attach_device fails as expected. Expand the existing mock_domain fixture with a enforce_dirty test that exercises the hwpt_alloc and device attachment. Link: https://lore.kernel.org/r/20231024135109.73787-15-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 266ce58989ba05e2a24460fdbf402d766c2e3870) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 37 +++++++++++++- tools/testing/selftests/iommu/iommufd.c | 49 +++++++++++++++++++ tools/testing/selftests/iommu/iommufd_utils.h | 3 ++ 3 files changed, 88 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index da7414be92e3d..2acf1244e86f2 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -130,6 +130,11 @@ struct selftest_obj { static int mock_domain_nop_attach(struct iommu_domain *domain, struct device *dev) { + struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); + + if (domain->dirty_ops && (mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY)) + return -EINVAL; + return 0; } @@ -157,6 +162,25 @@ static void *mock_domain_hw_info(struct device *dev, u32 *length, u32 *type) return info; } +static int mock_domain_set_dirty_tracking(struct iommu_domain *domain, + bool enable) +{ + return 0; +} + +static int mock_domain_read_and_clear_dirty(struct iommu_domain *domain, + unsigned long iova, size_t size, + unsigned long flags, + struct iommu_dirty_bitmap *dirty) +{ + return 0; +} + +const struct iommu_dirty_ops dirty_ops = { + .set_dirty_tracking = mock_domain_set_dirty_tracking, + .read_and_clear_dirty = mock_domain_read_and_clear_dirty, +}; + static const struct iommu_ops mock_ops; static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) @@ -184,12 +208,20 @@ static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) static struct iommu_domain * mock_domain_alloc_user(struct device *dev, u32 flags) { + struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); struct iommu_domain *domain; - if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) + if (flags & + (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) + return ERR_PTR(-EOPNOTSUPP); + + if ((flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) && + (mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY)) return ERR_PTR(-EOPNOTSUPP); domain = mock_domain_alloc(IOMMU_DOMAIN_UNMANAGED); + if (domain && !(mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY)) + domain->dirty_ops = &dirty_ops; if (!domain) domain = ERR_PTR(-ENOMEM); return domain; @@ -393,6 +425,9 @@ static struct mock_dev *mock_dev_create(unsigned long dev_flags) struct mock_dev *mdev; int rc; + if (dev_flags & ~(MOCK_FLAGS_DEVICE_NO_DIRTY)) + return ERR_PTR(-EINVAL); + mdev = kzalloc(sizeof(*mdev), GFP_KERNEL); if (!mdev) return ERR_PTR(-ENOMEM); diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index bdc56e32a4b24..120e34a8bb144 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1430,6 +1430,55 @@ TEST_F(iommufd_mock_domain, alloc_hwpt) } } +FIXTURE(iommufd_dirty_tracking) +{ + int fd; + uint32_t ioas_id; + uint32_t hwpt_id; + uint32_t stdev_id; + uint32_t idev_id; +}; + +FIXTURE_SETUP(iommufd_dirty_tracking) +{ + self->fd = open("/dev/iommu", O_RDWR); + ASSERT_NE(-1, self->fd); + + test_ioctl_ioas_alloc(&self->ioas_id); + test_cmd_mock_domain(self->ioas_id, &self->stdev_id, &self->hwpt_id, + &self->idev_id); +} + +FIXTURE_TEARDOWN(iommufd_dirty_tracking) +{ + teardown_iommufd(self->fd, _metadata); +} + +TEST_F(iommufd_dirty_tracking, enforce_dirty) +{ + uint32_t ioas_id, stddev_id, idev_id; + uint32_t hwpt_id, _hwpt_id; + uint32_t dev_flags; + + /* Regular case */ + dev_flags = MOCK_FLAGS_DEVICE_NO_DIRTY; + test_cmd_hwpt_alloc(self->idev_id, self->ioas_id, + IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); + test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + test_err_mock_domain_flags(EINVAL, hwpt_id, dev_flags, &stddev_id, + NULL); + test_ioctl_destroy(stddev_id); + test_ioctl_destroy(hwpt_id); + + /* IOMMU device does not support dirty tracking */ + test_ioctl_ioas_alloc(&ioas_id); + test_cmd_mock_domain_flags(ioas_id, dev_flags, &stddev_id, &_hwpt_id, + &idev_id); + test_err_hwpt_alloc(EOPNOTSUPP, idev_id, ioas_id, + IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); + test_ioctl_destroy(stddev_id); +} + /* VFIO compatibility IOCTLs */ TEST_F(iommufd, simple_ioctls) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 1e0736adc9912..4ddafa29e638b 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -98,6 +98,9 @@ static int _test_cmd_mock_domain_flags(int fd, unsigned int ioas_id, *idev_id = cmd.mock_domain_flags.out_idev_id; return 0; } +#define test_cmd_mock_domain_flags(ioas_id, flags, stdev_id, hwpt_id, idev_id) \ + ASSERT_EQ(0, _test_cmd_mock_domain_flags(self->fd, ioas_id, flags, \ + stdev_id, hwpt_id, idev_id)) #define test_err_mock_domain_flags(_errno, ioas_id, flags, stdev_id, hwpt_id) \ EXPECT_ERRNO(_errno, \ _test_cmd_mock_domain_flags(self->fd, ioas_id, flags, \ From c62dfd22ba79c527f67b8d72cd7cb0b9bf26e1fe Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:06 +0100 Subject: [PATCH 379/548] iommufd/selftest: Test IOMMU_HWPT_SET_DIRTY_TRACKING Change mock_domain to supporting dirty tracking and add tests to exercise the new SET_DIRTY_TRACKING API in the iommufd_dirty_tracking selftest fixture. Link: https://lore.kernel.org/r/20231024135109.73787-16-joao.m.martins@oracle.com Signed-off-by: Joao Martins Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 7adf267d66d1d737ea8318976fd1ce93733fd3a4) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 16 ++++++++++++++++ tools/testing/selftests/iommu/iommufd.c | 15 +++++++++++++++ tools/testing/selftests/iommu/iommufd_utils.h | 17 +++++++++++++++++ 3 files changed, 48 insertions(+) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 2acf1244e86f2..2ef536c4a854c 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -24,6 +24,7 @@ static struct platform_device *selftest_iommu_dev; size_t iommufd_test_memory_limit = 65536; enum { + MOCK_DIRTY_TRACK = 1, MOCK_IO_PAGE_SIZE = PAGE_SIZE / 2, /* @@ -101,6 +102,7 @@ void iommufd_test_syz_conv_iova_id(struct iommufd_ucmd *ucmd, } struct mock_iommu_domain { + unsigned long flags; struct iommu_domain domain; struct xarray pfns; }; @@ -165,6 +167,20 @@ static void *mock_domain_hw_info(struct device *dev, u32 *length, u32 *type) static int mock_domain_set_dirty_tracking(struct iommu_domain *domain, bool enable) { + struct mock_iommu_domain *mock = + container_of(domain, struct mock_iommu_domain, domain); + unsigned long flags = mock->flags; + + if (enable && !domain->dirty_ops) + return -EINVAL; + + /* No change? */ + if (!(enable ^ !!(flags & MOCK_DIRTY_TRACK))) + return 0; + + flags = (enable ? flags | MOCK_DIRTY_TRACK : flags & ~MOCK_DIRTY_TRACK); + + mock->flags = flags; return 0; } diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 120e34a8bb144..03bfce08ce437 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1479,6 +1479,21 @@ TEST_F(iommufd_dirty_tracking, enforce_dirty) test_ioctl_destroy(stddev_id); } +TEST_F(iommufd_dirty_tracking, set_dirty_tracking) +{ + uint32_t stddev_id; + uint32_t hwpt_id; + + test_cmd_hwpt_alloc(self->idev_id, self->ioas_id, + IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); + test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + test_cmd_set_dirty_tracking(hwpt_id, true); + test_cmd_set_dirty_tracking(hwpt_id, false); + + test_ioctl_destroy(stddev_id); + test_ioctl_destroy(hwpt_id); +} + /* VFIO compatibility IOCTLs */ TEST_F(iommufd, simple_ioctls) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 4ddafa29e638b..e37af6291b226 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -179,6 +179,23 @@ static int _test_cmd_access_replace_ioas(int fd, __u32 access_id, #define test_cmd_access_replace_ioas(access_id, ioas_id) \ ASSERT_EQ(0, _test_cmd_access_replace_ioas(self->fd, access_id, ioas_id)) +static int _test_cmd_set_dirty_tracking(int fd, __u32 hwpt_id, bool enabled) +{ + struct iommu_hwpt_set_dirty_tracking cmd = { + .size = sizeof(cmd), + .flags = enabled ? IOMMU_HWPT_DIRTY_TRACKING_ENABLE : 0, + .hwpt_id = hwpt_id, + }; + int ret; + + ret = ioctl(fd, IOMMU_HWPT_SET_DIRTY_TRACKING, &cmd); + if (ret) + return -errno; + return 0; +} +#define test_cmd_set_dirty_tracking(hwpt_id, enabled) \ + ASSERT_EQ(0, _test_cmd_set_dirty_tracking(self->fd, hwpt_id, enabled)) + static int _test_cmd_create_access(int fd, unsigned int ioas_id, __u32 *access_id, unsigned int flags) { From cdf5d5d4cbce25b15a4dca20334c3ef1e00de02b Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:07 +0100 Subject: [PATCH 380/548] iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP Add a new test ioctl for simulating the dirty IOVAs in the mock domain, and implement the mock iommu domain ops that get the dirty tracking supported. The selftest exercises the usual main workflow of: 1) Setting dirty tracking from the iommu domain 2) Read and clear dirty IOPTEs Different fixtures will test different IOVA range sizes, that exercise corner cases of the bitmaps. Link: https://lore.kernel.org/r/20231024135109.73787-17-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit a9af47e382a4d517685cb13c780272e7f300ebc5) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iommufd_test.h | 9 ++ drivers/iommu/iommufd/selftest.c | 107 ++++++++++++++- tools/testing/selftests/iommu/iommufd.c | 96 +++++++++++++ tools/testing/selftests/iommu/iommufd_utils.h | 127 ++++++++++++++++++ 4 files changed, 334 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index 9817edcd89683..1f2e93d3d4e87 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -20,6 +20,7 @@ enum { IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE, IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS, + IOMMU_TEST_OP_DIRTY, }; enum { @@ -107,6 +108,14 @@ struct iommu_test_cmd { struct { __u32 ioas_id; } access_replace_ioas; + struct { + __u32 flags; + __aligned_u64 iova; + __aligned_u64 length; + __aligned_u64 page_size; + __aligned_u64 uptr; + __aligned_u64 out_nr_dirty; + } dirty; }; __u32 last; }; diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 2ef536c4a854c..6ea1fa01c510c 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -37,6 +37,7 @@ enum { _MOCK_PFN_START = MOCK_PFN_MASK + 1, MOCK_PFN_START_IOVA = _MOCK_PFN_START, MOCK_PFN_LAST_IOVA = _MOCK_PFN_START, + MOCK_PFN_DIRTY_IOVA = _MOCK_PFN_START << 1, }; /* @@ -189,6 +190,31 @@ static int mock_domain_read_and_clear_dirty(struct iommu_domain *domain, unsigned long flags, struct iommu_dirty_bitmap *dirty) { + struct mock_iommu_domain *mock = + container_of(domain, struct mock_iommu_domain, domain); + unsigned long i, max = size / MOCK_IO_PAGE_SIZE; + void *ent, *old; + + if (!(mock->flags & MOCK_DIRTY_TRACK) && dirty->bitmap) + return -EINVAL; + + for (i = 0; i < max; i++) { + unsigned long cur = iova + i * MOCK_IO_PAGE_SIZE; + + ent = xa_load(&mock->pfns, cur / MOCK_IO_PAGE_SIZE); + if (ent && (xa_to_value(ent) & MOCK_PFN_DIRTY_IOVA)) { + unsigned long val; + + /* Clear dirty */ + val = xa_to_value(ent) & ~MOCK_PFN_DIRTY_IOVA; + old = xa_store(&mock->pfns, cur / MOCK_IO_PAGE_SIZE, + xa_mk_value(val), GFP_KERNEL); + WARN_ON_ONCE(ent != old); + iommu_dirty_bitmap_record(dirty, cur, + MOCK_IO_PAGE_SIZE); + } + } + return 0; } @@ -320,7 +346,7 @@ static size_t mock_domain_unmap_pages(struct iommu_domain *domain, for (cur = 0; cur != pgsize; cur += MOCK_IO_PAGE_SIZE) { ent = xa_erase(&mock->pfns, iova / MOCK_IO_PAGE_SIZE); - WARN_ON(!ent); + /* * iommufd generates unmaps that must be a strict * superset of the map's performend So every starting @@ -330,13 +356,13 @@ static size_t mock_domain_unmap_pages(struct iommu_domain *domain, * passed to map_pages */ if (first) { - WARN_ON(!(xa_to_value(ent) & - MOCK_PFN_START_IOVA)); + WARN_ON(ent && !(xa_to_value(ent) & + MOCK_PFN_START_IOVA)); first = false; } if (pgcount == 1 && cur + MOCK_IO_PAGE_SIZE == pgsize) - WARN_ON(!(xa_to_value(ent) & - MOCK_PFN_LAST_IOVA)); + WARN_ON(ent && !(xa_to_value(ent) & + MOCK_PFN_LAST_IOVA)); iova += MOCK_IO_PAGE_SIZE; ret += MOCK_IO_PAGE_SIZE; @@ -1059,6 +1085,71 @@ static_assert((unsigned int)MOCK_ACCESS_RW_WRITE == IOMMUFD_ACCESS_RW_WRITE); static_assert((unsigned int)MOCK_ACCESS_RW_SLOW_PATH == __IOMMUFD_ACCESS_RW_SLOW_PATH); +static int iommufd_test_dirty(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, + unsigned long iova, size_t length, + unsigned long page_size, void __user *uptr, + u32 flags) +{ + unsigned long bitmap_size, i, max = length / page_size; + struct iommu_test_cmd *cmd = ucmd->cmd; + struct iommufd_hw_pagetable *hwpt; + struct mock_iommu_domain *mock; + int rc, count = 0; + void *tmp; + + if (iova % page_size || length % page_size || !uptr) + return -EINVAL; + + hwpt = get_md_pagetable(ucmd, mockpt_id, &mock); + if (IS_ERR(hwpt)) + return PTR_ERR(hwpt); + + if (!(mock->flags & MOCK_DIRTY_TRACK)) { + rc = -EINVAL; + goto out_put; + } + + bitmap_size = max / BITS_PER_BYTE; + + tmp = kvzalloc(bitmap_size, GFP_KERNEL_ACCOUNT); + if (!tmp) { + rc = -ENOMEM; + goto out_put; + } + + if (copy_from_user(tmp, uptr, bitmap_size)) { + rc = -EFAULT; + goto out_free; + } + + for (i = 0; i < max; i++) { + unsigned long cur = iova + i * page_size; + void *ent, *old; + + if (!test_bit(i, (unsigned long *)tmp)) + continue; + + ent = xa_load(&mock->pfns, cur / page_size); + if (ent) { + unsigned long val; + + val = xa_to_value(ent) | MOCK_PFN_DIRTY_IOVA; + old = xa_store(&mock->pfns, cur / page_size, + xa_mk_value(val), GFP_KERNEL); + WARN_ON_ONCE(ent != old); + count++; + } + } + + cmd->dirty.out_nr_dirty = count; + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); +out_free: + kvfree(tmp); +out_put: + iommufd_put_object(&hwpt->obj); + return rc; +} + void iommufd_selftest_destroy(struct iommufd_object *obj) { struct selftest_obj *sobj = container_of(obj, struct selftest_obj, obj); @@ -1124,6 +1215,12 @@ int iommufd_test(struct iommufd_ucmd *ucmd) return -EINVAL; iommufd_test_memory_limit = cmd->memory_limit.limit; return 0; + case IOMMU_TEST_OP_DIRTY: + return iommufd_test_dirty(ucmd, cmd->id, cmd->dirty.iova, + cmd->dirty.length, + cmd->dirty.page_size, + u64_to_user_ptr(cmd->dirty.uptr), + cmd->dirty.flags); default: return -EOPNOTSUPP; } diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 03bfce08ce437..830b038200459 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1437,13 +1437,47 @@ FIXTURE(iommufd_dirty_tracking) uint32_t hwpt_id; uint32_t stdev_id; uint32_t idev_id; + unsigned long page_size; + unsigned long bitmap_size; + void *bitmap; + void *buffer; +}; + +FIXTURE_VARIANT(iommufd_dirty_tracking) +{ + unsigned long buffer_size; }; FIXTURE_SETUP(iommufd_dirty_tracking) { + void *vrc; + int rc; + self->fd = open("/dev/iommu", O_RDWR); ASSERT_NE(-1, self->fd); + rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size); + if (rc || !self->buffer) { + SKIP(return, "Skipping buffer_size=%lu due to errno=%d", + variant->buffer_size, rc); + } + + assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0); + vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0); + assert(vrc == self->buffer); + + self->page_size = MOCK_PAGE_SIZE; + self->bitmap_size = + variant->buffer_size / self->page_size / BITS_PER_BYTE; + + /* Provision with an extra (MOCK_PAGE_SIZE) for the unaligned case */ + rc = posix_memalign(&self->bitmap, PAGE_SIZE, + self->bitmap_size + MOCK_PAGE_SIZE); + assert(!rc); + assert(self->bitmap); + assert((uintptr_t)self->bitmap % PAGE_SIZE == 0); + test_ioctl_ioas_alloc(&self->ioas_id); test_cmd_mock_domain(self->ioas_id, &self->stdev_id, &self->hwpt_id, &self->idev_id); @@ -1451,9 +1485,41 @@ FIXTURE_SETUP(iommufd_dirty_tracking) FIXTURE_TEARDOWN(iommufd_dirty_tracking) { + munmap(self->buffer, variant->buffer_size); + munmap(self->bitmap, self->bitmap_size); teardown_iommufd(self->fd, _metadata); } +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty128k) +{ + /* one u32 index bitmap */ + .buffer_size = 128UL * 1024UL, +}; + +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty256k) +{ + /* one u64 index bitmap */ + .buffer_size = 256UL * 1024UL, +}; + +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty640k) +{ + /* two u64 index and trailing end bitmap */ + .buffer_size = 640UL * 1024UL, +}; + +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty128M) +{ + /* 4K bitmap (128M IOVA range) */ + .buffer_size = 128UL * 1024UL * 1024UL, +}; + +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty256M) +{ + /* 8K bitmap (256M IOVA range) */ + .buffer_size = 256UL * 1024UL * 1024UL, +}; + TEST_F(iommufd_dirty_tracking, enforce_dirty) { uint32_t ioas_id, stddev_id, idev_id; @@ -1494,6 +1560,36 @@ TEST_F(iommufd_dirty_tracking, set_dirty_tracking) test_ioctl_destroy(hwpt_id); } +TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) +{ + uint32_t stddev_id; + uint32_t hwpt_id; + uint32_t ioas_id; + + test_ioctl_ioas_alloc(&ioas_id); + test_ioctl_ioas_map_fixed_id(ioas_id, self->buffer, + variant->buffer_size, MOCK_APERTURE_START); + + test_cmd_hwpt_alloc(self->idev_id, ioas_id, + IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); + test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + + test_cmd_set_dirty_tracking(hwpt_id, true); + + test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, + MOCK_APERTURE_START, self->page_size, + self->bitmap, self->bitmap_size, _metadata); + + /* PAGE_SIZE unaligned bitmap */ + test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, + MOCK_APERTURE_START, self->page_size, + self->bitmap + MOCK_PAGE_SIZE, + self->bitmap_size, _metadata); + + test_ioctl_destroy(stddev_id); + test_ioctl_destroy(hwpt_id); +} + /* VFIO compatibility IOCTLs */ TEST_F(iommufd, simple_ioctls) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index e37af6291b226..b129cf23b8245 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -16,6 +16,25 @@ /* Hack to make assertions more readable */ #define _IOMMU_TEST_CMD(x) IOMMU_TEST_CMD +/* Imported from include/asm-generic/bitops/generic-non-atomic.h */ +#define BITS_PER_BYTE 8 +#define BITS_PER_LONG __BITS_PER_LONG +#define BIT_MASK(nr) (1UL << ((nr) % __BITS_PER_LONG)) +#define BIT_WORD(nr) ((nr) / __BITS_PER_LONG) + +static inline void set_bit(unsigned int nr, unsigned long *addr) +{ + unsigned long mask = BIT_MASK(nr); + unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr); + + *p |= mask; +} + +static inline bool test_bit(unsigned int nr, unsigned long *addr) +{ + return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG - 1))); +} + static void *buffer; static unsigned long BUFFER_SIZE; @@ -196,6 +215,103 @@ static int _test_cmd_set_dirty_tracking(int fd, __u32 hwpt_id, bool enabled) #define test_cmd_set_dirty_tracking(hwpt_id, enabled) \ ASSERT_EQ(0, _test_cmd_set_dirty_tracking(self->fd, hwpt_id, enabled)) +static int _test_cmd_get_dirty_bitmap(int fd, __u32 hwpt_id, size_t length, + __u64 iova, size_t page_size, + __u64 *bitmap) +{ + struct iommu_hwpt_get_dirty_bitmap cmd = { + .size = sizeof(cmd), + .hwpt_id = hwpt_id, + .iova = iova, + .length = length, + .page_size = page_size, + .data = (uintptr_t)bitmap, + }; + int ret; + + ret = ioctl(fd, IOMMU_HWPT_GET_DIRTY_BITMAP, &cmd); + if (ret) + return ret; + return 0; +} + +#define test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, \ + bitmap) \ + ASSERT_EQ(0, _test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, \ + page_size, bitmap)) + +static int _test_cmd_mock_domain_set_dirty(int fd, __u32 hwpt_id, size_t length, + __u64 iova, size_t page_size, + __u64 *bitmap, __u64 *dirty) +{ + struct iommu_test_cmd cmd = { + .size = sizeof(cmd), + .op = IOMMU_TEST_OP_DIRTY, + .id = hwpt_id, + .dirty = { + .iova = iova, + .length = length, + .page_size = page_size, + .uptr = (uintptr_t)bitmap, + } + }; + int ret; + + ret = ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_DIRTY), &cmd); + if (ret) + return -ret; + if (dirty) + *dirty = cmd.dirty.out_nr_dirty; + return 0; +} + +#define test_cmd_mock_domain_set_dirty(fd, hwpt_id, length, iova, page_size, \ + bitmap, nr) \ + ASSERT_EQ(0, \ + _test_cmd_mock_domain_set_dirty(fd, hwpt_id, length, iova, \ + page_size, bitmap, nr)) + +static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, + __u64 iova, size_t page_size, __u64 *bitmap, + __u64 bitmap_size, + struct __test_metadata *_metadata) +{ + unsigned long i, count, nbits = bitmap_size * BITS_PER_BYTE; + unsigned long nr = nbits / 2; + __u64 out_dirty = 0; + + /* Mark all even bits as dirty in the mock domain */ + for (count = 0, i = 0; i < nbits; count += !(i % 2), i++) + if (!(i % 2)) + set_bit(i, (unsigned long *)bitmap); + ASSERT_EQ(nr, count); + + test_cmd_mock_domain_set_dirty(fd, hwpt_id, length, iova, page_size, + bitmap, &out_dirty); + ASSERT_EQ(nr, out_dirty); + + /* Expect all even bits as dirty in the user bitmap */ + memset(bitmap, 0, bitmap_size); + test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap); + for (count = 0, i = 0; i < nbits; count += !(i % 2), i++) + ASSERT_EQ(!(i % 2), test_bit(i, (unsigned long *)bitmap)); + ASSERT_EQ(count, out_dirty); + + memset(bitmap, 0, bitmap_size); + test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap); + + /* It as read already -- expect all zeroes */ + for (i = 0; i < nbits; i++) + ASSERT_EQ(0, test_bit(i, (unsigned long *)bitmap)); + + return 0; +} +#define test_mock_dirty_bitmaps(hwpt_id, length, iova, page_size, bitmap, \ + bitmap_size, _metadata) \ + ASSERT_EQ(0, _test_mock_dirty_bitmaps(self->fd, hwpt_id, length, iova, \ + page_size, bitmap, bitmap_size, \ + _metadata)) + static int _test_cmd_create_access(int fd, unsigned int ioas_id, __u32 *access_id, unsigned int flags) { @@ -320,6 +436,17 @@ static int _test_ioctl_ioas_map(int fd, unsigned int ioas_id, void *buffer, IOMMU_IOAS_MAP_READABLE)); \ }) +#define test_ioctl_ioas_map_fixed_id(ioas_id, buffer, length, iova) \ + ({ \ + __u64 __iova = iova; \ + ASSERT_EQ(0, \ + _test_ioctl_ioas_map( \ + self->fd, ioas_id, buffer, length, &__iova, \ + IOMMU_IOAS_MAP_FIXED_IOVA | \ + IOMMU_IOAS_MAP_WRITEABLE | \ + IOMMU_IOAS_MAP_READABLE)); \ + }) + #define test_err_ioctl_ioas_map_fixed(_errno, buffer, length, iova) \ ({ \ __u64 __iova = iova; \ From 5129ad366047cea40c551a79fc91407c32253197 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:08 +0100 Subject: [PATCH 381/548] iommufd/selftest: Test out_capabilities in IOMMU_GET_HW_INFO Enumerate the capabilities from the mock device and test whether it advertises as expected. Include it as part of the iommufd_dirty_tracking fixture. Link: https://lore.kernel.org/r/20231024135109.73787-18-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit ae36fe70cea4d7c177452ab41e6734fa3cbd4ad8) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 13 +++++++++- tools/testing/selftests/iommu/iommufd.c | 17 +++++++++++++ .../selftests/iommu/iommufd_fail_nth.c | 2 +- tools/testing/selftests/iommu/iommufd_utils.h | 24 ++++++++++++------- 4 files changed, 45 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 6ea1fa01c510c..580a70cd32eaf 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -386,7 +386,18 @@ static phys_addr_t mock_domain_iova_to_phys(struct iommu_domain *domain, static bool mock_domain_capable(struct device *dev, enum iommu_cap cap) { - return cap == IOMMU_CAP_CACHE_COHERENCY; + struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); + + switch (cap) { + case IOMMU_CAP_CACHE_COHERENCY: + return true; + case IOMMU_CAP_DIRTY_TRACKING: + return !(mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY); + default: + break; + } + + return false; } static struct iommu_device mock_iommu_device = { diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 830b038200459..b16c304dc9955 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1560,6 +1560,23 @@ TEST_F(iommufd_dirty_tracking, set_dirty_tracking) test_ioctl_destroy(hwpt_id); } +TEST_F(iommufd_dirty_tracking, device_dirty_capability) +{ + uint32_t caps = 0; + uint32_t stddev_id; + uint32_t hwpt_id; + + test_cmd_hwpt_alloc(self->idev_id, self->ioas_id, 0, &hwpt_id); + test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + test_cmd_get_hw_capabilities(self->idev_id, caps, + IOMMU_HW_CAP_DIRTY_TRACKING); + ASSERT_EQ(IOMMU_HW_CAP_DIRTY_TRACKING, + caps & IOMMU_HW_CAP_DIRTY_TRACKING); + + test_ioctl_destroy(stddev_id); + test_ioctl_destroy(hwpt_id); +} + TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) { uint32_t stddev_id; diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index 3d7838506bfe4..1fcd69cb0e416 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -612,7 +612,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) &idev_id)) return -1; - if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info))) + if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL)) return -1; if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id)) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index b129cf23b8245..2410d06f5a340 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -535,8 +535,8 @@ static void teardown_iommufd(int fd, struct __test_metadata *_metadata) #endif /* @data can be NULL */ -static int _test_cmd_get_hw_info(int fd, __u32 device_id, - void *data, size_t data_len) +static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data, + size_t data_len, uint32_t *capabilities) { struct iommu_test_hw_info *info = (struct iommu_test_hw_info *)data; struct iommu_hw_info cmd = { @@ -544,6 +544,7 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, .dev_id = device_id, .data_len = data_len, .data_uptr = (uint64_t)data, + .out_capabilities = 0, }; int ret; @@ -580,14 +581,19 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, assert(!info->flags); } + if (capabilities) + *capabilities = cmd.out_capabilities; + return 0; } -#define test_cmd_get_hw_info(device_id, data, data_len) \ - ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, \ - data, data_len)) +#define test_cmd_get_hw_info(device_id, data, data_len) \ + ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, data, \ + data_len, NULL)) + +#define test_err_get_hw_info(_errno, device_id, data, data_len) \ + EXPECT_ERRNO(_errno, _test_cmd_get_hw_info(self->fd, device_id, data, \ + data_len, NULL)) -#define test_err_get_hw_info(_errno, device_id, data, data_len) \ - EXPECT_ERRNO(_errno, \ - _test_cmd_get_hw_info(self->fd, device_id, \ - data, data_len)) +#define test_cmd_get_hw_capabilities(device_id, caps, mask) \ + ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps)) From 4cd18fff5b5ef1e151a5e00547d244d4d39fd195 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Tue, 24 Oct 2023 14:51:09 +0100 Subject: [PATCH 382/548] iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag Change test_mock_dirty_bitmaps() to pass a flag where it specifies the flag under test. The test does the same thing as the GET_DIRTY_BITMAP regular test. Except that it tests whether the dirtied bits are fetched all the same a second time, as opposed to observing them cleared. Link: https://lore.kernel.org/r/20231024135109.73787-19-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit 0795b305da8902e7d092f90bf9a1a2c98f34b1db) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 15 +++++--- tools/testing/selftests/iommu/iommufd.c | 38 ++++++++++++++++++- tools/testing/selftests/iommu/iommufd_utils.h | 26 ++++++++----- 3 files changed, 61 insertions(+), 18 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 580a70cd32eaf..87026377d5885 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -203,13 +203,16 @@ static int mock_domain_read_and_clear_dirty(struct iommu_domain *domain, ent = xa_load(&mock->pfns, cur / MOCK_IO_PAGE_SIZE); if (ent && (xa_to_value(ent) & MOCK_PFN_DIRTY_IOVA)) { - unsigned long val; - /* Clear dirty */ - val = xa_to_value(ent) & ~MOCK_PFN_DIRTY_IOVA; - old = xa_store(&mock->pfns, cur / MOCK_IO_PAGE_SIZE, - xa_mk_value(val), GFP_KERNEL); - WARN_ON_ONCE(ent != old); + if (!(flags & IOMMU_DIRTY_NO_CLEAR)) { + unsigned long val; + + val = xa_to_value(ent) & ~MOCK_PFN_DIRTY_IOVA; + old = xa_store(&mock->pfns, + cur / MOCK_IO_PAGE_SIZE, + xa_mk_value(val), GFP_KERNEL); + WARN_ON_ONCE(ent != old); + } iommu_dirty_bitmap_record(dirty, cur, MOCK_IO_PAGE_SIZE); } diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index b16c304dc9955..d5e64f36ad704 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1595,13 +1595,47 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, MOCK_APERTURE_START, self->page_size, - self->bitmap, self->bitmap_size, _metadata); + self->bitmap, self->bitmap_size, 0, _metadata); /* PAGE_SIZE unaligned bitmap */ test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, MOCK_APERTURE_START, self->page_size, self->bitmap + MOCK_PAGE_SIZE, - self->bitmap_size, _metadata); + self->bitmap_size, 0, _metadata); + + test_ioctl_destroy(stddev_id); + test_ioctl_destroy(hwpt_id); +} + +TEST_F(iommufd_dirty_tracking, get_dirty_bitmap_no_clear) +{ + uint32_t stddev_id; + uint32_t hwpt_id; + uint32_t ioas_id; + + test_ioctl_ioas_alloc(&ioas_id); + test_ioctl_ioas_map_fixed_id(ioas_id, self->buffer, + variant->buffer_size, MOCK_APERTURE_START); + + test_cmd_hwpt_alloc(self->idev_id, ioas_id, + IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); + test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + + test_cmd_set_dirty_tracking(hwpt_id, true); + + test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, + MOCK_APERTURE_START, self->page_size, + self->bitmap, self->bitmap_size, + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, + _metadata); + + /* Unaligned bitmap */ + test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, + MOCK_APERTURE_START, self->page_size, + self->bitmap + MOCK_PAGE_SIZE, + self->bitmap_size, + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, + _metadata); test_ioctl_destroy(stddev_id); test_ioctl_destroy(hwpt_id); diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 2410d06f5a340..e263bf80a977d 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -217,11 +217,12 @@ static int _test_cmd_set_dirty_tracking(int fd, __u32 hwpt_id, bool enabled) static int _test_cmd_get_dirty_bitmap(int fd, __u32 hwpt_id, size_t length, __u64 iova, size_t page_size, - __u64 *bitmap) + __u64 *bitmap, __u32 flags) { struct iommu_hwpt_get_dirty_bitmap cmd = { .size = sizeof(cmd), .hwpt_id = hwpt_id, + .flags = flags, .iova = iova, .length = length, .page_size = page_size, @@ -236,9 +237,9 @@ static int _test_cmd_get_dirty_bitmap(int fd, __u32 hwpt_id, size_t length, } #define test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, \ - bitmap) \ + bitmap, flags) \ ASSERT_EQ(0, _test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, \ - page_size, bitmap)) + page_size, bitmap, flags)) static int _test_cmd_mock_domain_set_dirty(int fd, __u32 hwpt_id, size_t length, __u64 iova, size_t page_size, @@ -273,7 +274,7 @@ static int _test_cmd_mock_domain_set_dirty(int fd, __u32 hwpt_id, size_t length, static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, __u64 iova, size_t page_size, __u64 *bitmap, - __u64 bitmap_size, + __u64 bitmap_size, __u32 flags, struct __test_metadata *_metadata) { unsigned long i, count, nbits = bitmap_size * BITS_PER_BYTE; @@ -292,25 +293,30 @@ static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, /* Expect all even bits as dirty in the user bitmap */ memset(bitmap, 0, bitmap_size); - test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap); + test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap, + flags); for (count = 0, i = 0; i < nbits; count += !(i % 2), i++) ASSERT_EQ(!(i % 2), test_bit(i, (unsigned long *)bitmap)); ASSERT_EQ(count, out_dirty); memset(bitmap, 0, bitmap_size); - test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap); + test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap, + flags); /* It as read already -- expect all zeroes */ - for (i = 0; i < nbits; i++) - ASSERT_EQ(0, test_bit(i, (unsigned long *)bitmap)); + for (i = 0; i < nbits; i++) { + ASSERT_EQ(!(i % 2) && (flags & + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR), + test_bit(i, (unsigned long *)bitmap)); + } return 0; } #define test_mock_dirty_bitmaps(hwpt_id, length, iova, page_size, bitmap, \ - bitmap_size, _metadata) \ + bitmap_size, flags, _metadata) \ ASSERT_EQ(0, _test_mock_dirty_bitmaps(self->fd, hwpt_id, length, iova, \ page_size, bitmap, bitmap_size, \ - _metadata)) + flags, _metadata)) static int _test_cmd_create_access(int fd, unsigned int ioas_id, __u32 *access_id, unsigned int flags) From 24bce8ad918f69201f3710838c6fd0cb5426dafe Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Mon, 30 Oct 2023 11:34:46 +0000 Subject: [PATCH 383/548] iommufd/selftest: Fix page-size check in iommufd_test_dirty() iommufd_test_dirty()/IOMMU_TEST_OP_DIRTY sets the dirty bits in the mock domain implementation that the userspace side validates against what it obtains via the UAPI. However in introducing iommufd_test_dirty() it forgot to validate page_size being 0 leading to two possible divide-by-zero problems: one at the beginning when calculating @max and while calculating the IOVA in the XArray PFN tracking list. While at it, validate the length to require non-zero value as well, as we can't be allocating a 0-sized bitmap. Link: https://lore.kernel.org/r/20231030113446.7056-1-joao.m.martins@oracle.com Reported-by: syzbot+25dc7383c30ecdc83c38@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-iommu/00000000000005f6aa0608b9220f@google.com/ Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit 2e22aac3ea9cfc0ec3209c96644f60c1806a8117) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 87026377d5885..4eed87f936b43 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -1104,14 +1104,15 @@ static int iommufd_test_dirty(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, unsigned long page_size, void __user *uptr, u32 flags) { - unsigned long bitmap_size, i, max = length / page_size; + unsigned long bitmap_size, i, max; struct iommu_test_cmd *cmd = ucmd->cmd; struct iommufd_hw_pagetable *hwpt; struct mock_iommu_domain *mock; int rc, count = 0; void *tmp; - if (iova % page_size || length % page_size || !uptr) + if (!page_size || !length || iova % page_size || length % page_size || + !uptr) return -EINVAL; hwpt = get_md_pagetable(ucmd, mockpt_id, &mock); @@ -1123,6 +1124,7 @@ static int iommufd_test_dirty(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, goto out_put; } + max = length / page_size; bitmap_size = max / BITS_PER_BYTE; tmp = kvzalloc(bitmap_size, GFP_KERNEL_ACCOUNT); From dedc7fcab05717f41d4c277d3bb097a5ac2f4a74 Mon Sep 17 00:00:00 2001 From: Kunwu Chan Date: Wed, 22 Nov 2023 11:26:08 +0800 Subject: [PATCH 384/548] iommu/vt-d: Set variable intel_dirty_ops to static Fix the following warning: drivers/iommu/intel/iommu.c:302:30: warning: symbol 'intel_dirty_ops' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains") Signed-off-by: Kunwu Chan Reviewed-by: Jason Gunthorpe Reviewed-by: Joao Martins Link: https://lore.kernel.org/r/20231120101025.1103404-1-chentao@kylinos.cn Signed-off-by: Lu Baolu Signed-off-by: Joerg Roedel (cherry picked from commit e378c7de74620051c3be899a8c2506c25d23049d) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 19709122947e6..b3ee204eeef55 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -300,7 +300,7 @@ static int iommu_skip_te_disable; #define IDENTMAP_AZALIA 4 const struct iommu_ops intel_iommu_ops; -const struct iommu_dirty_ops intel_dirty_ops; +static const struct iommu_dirty_ops intel_dirty_ops; static bool translation_pre_enabled(struct intel_iommu *iommu) { @@ -4928,7 +4928,7 @@ static int intel_iommu_read_and_clear_dirty(struct iommu_domain *domain, return 0; } -const struct iommu_dirty_ops intel_dirty_ops = { +static const struct iommu_dirty_ops intel_dirty_ops = { .set_dirty_tracking = intel_iommu_set_dirty_tracking, .read_and_clear_dirty = intel_iommu_read_and_clear_dirty, }; From 426e4c829d761ea56bf4d11c69e0ad6fb19da2ff Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Thu, 16 Nov 2023 16:52:15 +0000 Subject: [PATCH 385/548] iommufd/selftest: Fix _test_mock_dirty_bitmaps() The ASSERT_EQ() macro sneakily expands to two statements, so the loop here needs braces to ensure it captures both and actually terminates the test upon failure. Where these tests are currently failing on my arm64 machine, this reduces the number of logged lines from a rather unreasonable ~197,000 down to 10. While we're at it, we can also clean up the tautologous "count" calculations whose assertions can never fail unless mathematics and/or the C language become fundamentally broken. Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Link: https://lore.kernel.org/r/90e083045243ef407dd592bb1deec89cd1f4ddf2.1700153535.git.robin.murphy@arm.com Signed-off-by: Robin Murphy Reviewed-by: Kevin Tian Reviewed-by: Joao Martins Tested-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit 98594181944daa201481ad63242806beb7c89ff4) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd_utils.h | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index e263bf80a977d..70d558e0f0c74 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -277,15 +277,13 @@ static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, __u64 bitmap_size, __u32 flags, struct __test_metadata *_metadata) { - unsigned long i, count, nbits = bitmap_size * BITS_PER_BYTE; + unsigned long i, nbits = bitmap_size * BITS_PER_BYTE; unsigned long nr = nbits / 2; __u64 out_dirty = 0; /* Mark all even bits as dirty in the mock domain */ - for (count = 0, i = 0; i < nbits; count += !(i % 2), i++) - if (!(i % 2)) - set_bit(i, (unsigned long *)bitmap); - ASSERT_EQ(nr, count); + for (i = 0; i < nbits; i += 2) + set_bit(i, (unsigned long *)bitmap); test_cmd_mock_domain_set_dirty(fd, hwpt_id, length, iova, page_size, bitmap, &out_dirty); @@ -295,9 +293,10 @@ static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, memset(bitmap, 0, bitmap_size); test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap, flags); - for (count = 0, i = 0; i < nbits; count += !(i % 2), i++) + /* Beware ASSERT_EQ() is two statements -- braces are not redundant! */ + for (i = 0; i < nbits; i++) { ASSERT_EQ(!(i % 2), test_bit(i, (unsigned long *)bitmap)); - ASSERT_EQ(count, out_dirty); + } memset(bitmap, 0, bitmap_size); test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap, From 6aa6816b28aa96736e0deceba4cc75ad27eaedae Mon Sep 17 00:00:00 2001 From: Jinjie Ruan Date: Mon, 19 Aug 2024 20:00:07 +0800 Subject: [PATCH 386/548] iommufd/selftest: Make dirty_ops static The sparse tool complains as follows: drivers/iommu/iommufd/selftest.c:277:30: warning: symbol 'dirty_ops' was not declared. Should it be static? This symbol is not used outside of selftest.c, so marks it static. Fixes: 266ce58989ba ("iommufd/selftest: Test IOMMU_HWPT_ALLOC_DIRTY_TRACKING") Link: https://patch.msgid.link/r/20240819120007.3884868-1-ruanjinjie@huawei.com Signed-off-by: Jinjie Ruan Reviewed-by: Yi Liu Signed-off-by: Jason Gunthorpe (cherry picked from commit cf1e515c9a40caa8bddb920970d3257bb01c1421) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 4eed87f936b43..a88e37f8ced17 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -221,7 +221,7 @@ static int mock_domain_read_and_clear_dirty(struct iommu_domain *domain, return 0; } -const struct iommu_dirty_ops dirty_ops = { +static const struct iommu_dirty_ops dirty_ops = { .set_dirty_tracking = mock_domain_set_dirty_tracking, .read_and_clear_dirty = mock_domain_read_and_clear_dirty, }; From e1e5f34afb596815a36c895da147ebcbf6c8815c Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Fri, 2 Feb 2024 13:34:10 +0000 Subject: [PATCH 387/548] iommufd/iova_bitmap: Handle recording beyond the mapped pages IOVA bitmap is a zero-copy scheme of recording dirty bits that iterate the different bitmap user pages at chunks of a maximum of PAGE_SIZE/sizeof(struct page*) pages. When the iterations are split up into 64G, the end of the range may be broken up in a way that's aligned with a non base page PTE size. This leads to only part of the huge page being recorded in the bitmap. Note that in pratice this is only a problem for IOMMU dirty tracking i.e. when the backing PTEs are in IOMMU hugepages and the bitmap is in base page granularity. So far this not something that affects VF dirty trackers (which reports and records at the same granularity). To fix that, if there is a remainder of bits left to set in which the current IOVA bitmap doesn't cover, make a copy of the bitmap structure and iterate-and-set the rest of the bits remaining. Finally, when advancing the iterator, skip all the bits that were set ahead. Link: https://lore.kernel.org/r/20240202133415.23819-5-joao.m.martins@oracle.com Reported-by: Avihai Horon Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains") Fixes: 421a511a293f ("iommu/amd: Access/Dirty bit support in IOPTEs") Signed-off-by: Joao Martins Tested-by: Avihai Horon Signed-off-by: Jason Gunthorpe (cherry picked from commit 2780025e01e2e1c92f83ee7da91d9727c2e58a3e) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iova_bitmap.c | 43 +++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/drivers/iommu/iommufd/iova_bitmap.c b/drivers/iommu/iommufd/iova_bitmap.c index e9b08e16711f8..89e6e11157f0c 100644 --- a/drivers/iommu/iommufd/iova_bitmap.c +++ b/drivers/iommu/iommufd/iova_bitmap.c @@ -113,6 +113,9 @@ struct iova_bitmap { /* length of the IOVA range for the whole bitmap */ size_t length; + + /* length of the IOVA range set ahead the pinned pages */ + unsigned long set_ahead_length; }; /* @@ -342,6 +345,32 @@ static bool iova_bitmap_done(struct iova_bitmap *bitmap) return bitmap->mapped_base_index >= bitmap->mapped_total_index; } +static int iova_bitmap_set_ahead(struct iova_bitmap *bitmap, + size_t set_ahead_length) +{ + int ret = 0; + + while (set_ahead_length > 0 && !iova_bitmap_done(bitmap)) { + unsigned long length = iova_bitmap_mapped_length(bitmap); + unsigned long iova = iova_bitmap_mapped_iova(bitmap); + + ret = iova_bitmap_get(bitmap); + if (ret) + break; + + length = min(length, set_ahead_length); + iova_bitmap_set(bitmap, iova, length); + + set_ahead_length -= length; + bitmap->mapped_base_index += + iova_bitmap_offset_to_index(bitmap, length - 1) + 1; + iova_bitmap_put(bitmap); + } + + bitmap->set_ahead_length = 0; + return ret; +} + /* * Advances to the next range, releases the current pinned * pages and pins the next set of bitmap pages. @@ -358,6 +387,15 @@ static int iova_bitmap_advance(struct iova_bitmap *bitmap) if (iova_bitmap_done(bitmap)) return 0; + /* Iterate, set and skip any bits requested for next iteration */ + if (bitmap->set_ahead_length) { + int ret; + + ret = iova_bitmap_set_ahead(bitmap, bitmap->set_ahead_length); + if (ret) + return ret; + } + /* When advancing the index we pin the next set of bitmap pages */ return iova_bitmap_get(bitmap); } @@ -427,5 +465,10 @@ void iova_bitmap_set(struct iova_bitmap *bitmap, kunmap_local(kaddr); cur_bit += nbits; } while (cur_bit <= last_bit); + + if (unlikely(cur_bit <= last_bit)) { + bitmap->set_ahead_length = + ((last_bit - cur_bit + 1) << bitmap->mapped.pgshift); + } } EXPORT_SYMBOL_NS_GPL(iova_bitmap_set, IOMMUFD); From e812fd25f663bde0ceba35f353e39b1658e9664e Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Fri, 2 Feb 2024 13:34:14 +0000 Subject: [PATCH 388/548] iommufd/selftest: Add mock IO hugepages tests Leverage previously added MOCK_FLAGS_DEVICE_HUGE_IOVA flag to create an IOMMU domain with more than MOCK_IO_PAGE_SIZE supported. Plumb the hugetlb backing memory for buffer allocation and change the expected page size to MOCK_HUGE_PAGE_SIZE (1M) when hugepage variant test cases are used. These so far are limited to 128M and 256M IOVA range tests cases which is when 1M hugepages can be used. Link: https://lore.kernel.org/r/20240202133415.23819-9-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit fe13166f0562117c42fc60a109e664a914523e63) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd.c | 45 +++++++++++++++++++++++-- 1 file changed, 42 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index d5e64f36ad704..57037bec9c4e0 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -12,6 +12,7 @@ static unsigned long HUGEPAGE_SIZE; #define MOCK_PAGE_SIZE (PAGE_SIZE / 2) +#define MOCK_HUGE_PAGE_SIZE (512 * MOCK_PAGE_SIZE) static unsigned long get_huge_page_size(void) { @@ -1446,10 +1447,12 @@ FIXTURE(iommufd_dirty_tracking) FIXTURE_VARIANT(iommufd_dirty_tracking) { unsigned long buffer_size; + bool hugepages; }; FIXTURE_SETUP(iommufd_dirty_tracking) { + int mmap_flags; void *vrc; int rc; @@ -1462,9 +1465,17 @@ FIXTURE_SETUP(iommufd_dirty_tracking) variant->buffer_size, rc); } + mmap_flags = MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED; + if (variant->hugepages) { + /* + * MAP_POPULATE will cause the kernel to fail mmap if THPs are + * not available. + */ + mmap_flags |= MAP_HUGETLB | MAP_POPULATE; + } assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0); vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE, - MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0); + mmap_flags, -1, 0); assert(vrc == self->buffer); self->page_size = MOCK_PAGE_SIZE; @@ -1479,8 +1490,16 @@ FIXTURE_SETUP(iommufd_dirty_tracking) assert((uintptr_t)self->bitmap % PAGE_SIZE == 0); test_ioctl_ioas_alloc(&self->ioas_id); - test_cmd_mock_domain(self->ioas_id, &self->stdev_id, &self->hwpt_id, - &self->idev_id); + /* Enable 1M mock IOMMU hugepages */ + if (variant->hugepages) { + test_cmd_mock_domain_flags(self->ioas_id, + MOCK_FLAGS_DEVICE_HUGE_IOVA, + &self->stdev_id, &self->hwpt_id, + &self->idev_id); + } else { + test_cmd_mock_domain(self->ioas_id, &self->stdev_id, + &self->hwpt_id, &self->idev_id); + } } FIXTURE_TEARDOWN(iommufd_dirty_tracking) @@ -1514,12 +1533,26 @@ FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty128M) .buffer_size = 128UL * 1024UL * 1024UL, }; +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty128M_huge) +{ + /* 4K bitmap (128M IOVA range) */ + .buffer_size = 128UL * 1024UL * 1024UL, + .hugepages = true, +}; + FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty256M) { /* 8K bitmap (256M IOVA range) */ .buffer_size = 256UL * 1024UL * 1024UL, }; +FIXTURE_VARIANT_ADD(iommufd_dirty_tracking, domain_dirty256M_huge) +{ + /* 8K bitmap (256M IOVA range) */ + .buffer_size = 256UL * 1024UL * 1024UL, + .hugepages = true, +}; + TEST_F(iommufd_dirty_tracking, enforce_dirty) { uint32_t ioas_id, stddev_id, idev_id; @@ -1583,6 +1616,9 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) uint32_t hwpt_id; uint32_t ioas_id; + if (variant->hugepages) + page_size = MOCK_HUGE_PAGE_SIZE; + test_ioctl_ioas_alloc(&ioas_id); test_ioctl_ioas_map_fixed_id(ioas_id, self->buffer, variant->buffer_size, MOCK_APERTURE_START); @@ -1613,6 +1649,9 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap_no_clear) uint32_t hwpt_id; uint32_t ioas_id; + if (variant->hugepages) + page_size = MOCK_HUGE_PAGE_SIZE; + test_ioctl_ioas_alloc(&ioas_id); test_ioctl_ioas_map_fixed_id(ioas_id, self->buffer, variant->buffer_size, MOCK_APERTURE_START); From 2f98f9715b33417ed6309add4eeb2ad0f7be0be5 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Fri, 2 Feb 2024 13:34:12 +0000 Subject: [PATCH 389/548] iommufd/selftest: Refactor mock_domain_read_and_clear_dirty() Move the clearing of the dirty bit of the mock domain into mock_domain_test_and_clear_dirty() helper, simplifying the caller function. Additionally, rework the mock_domain_read_and_clear_dirty() loop to iterate over a potentially variable IO page size. No functional change intended with the loop refactor. This is in preparation for dirty tracking support for IOMMU hugepage mock domains. Link: https://lore.kernel.org/r/20240202133415.23819-7-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit 02a8c61a8b06a4a082b58c3e643b28036c6be60f) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 64 ++++++++++++++++++++++---------- 1 file changed, 45 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index a88e37f8ced17..32e5117a6f531 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -185,6 +185,34 @@ static int mock_domain_set_dirty_tracking(struct iommu_domain *domain, return 0; } +static bool mock_test_and_clear_dirty(struct mock_iommu_domain *mock, + unsigned long iova, size_t page_size, + unsigned long flags) +{ + unsigned long cur, end = iova + page_size - 1; + bool dirty = false; + void *ent, *old; + + for (cur = iova; cur < end; cur += MOCK_IO_PAGE_SIZE) { + ent = xa_load(&mock->pfns, cur / MOCK_IO_PAGE_SIZE); + if (!ent || !(xa_to_value(ent) & MOCK_PFN_DIRTY_IOVA)) + continue; + + dirty = true; + /* Clear dirty */ + if (!(flags & IOMMU_DIRTY_NO_CLEAR)) { + unsigned long val; + + val = xa_to_value(ent) & ~MOCK_PFN_DIRTY_IOVA; + old = xa_store(&mock->pfns, cur / MOCK_IO_PAGE_SIZE, + xa_mk_value(val), GFP_KERNEL); + WARN_ON_ONCE(ent != old); + } + } + + return dirty; +} + static int mock_domain_read_and_clear_dirty(struct iommu_domain *domain, unsigned long iova, size_t size, unsigned long flags, @@ -192,31 +220,29 @@ static int mock_domain_read_and_clear_dirty(struct iommu_domain *domain, { struct mock_iommu_domain *mock = container_of(domain, struct mock_iommu_domain, domain); - unsigned long i, max = size / MOCK_IO_PAGE_SIZE; - void *ent, *old; + unsigned long end = iova + size; + void *ent; if (!(mock->flags & MOCK_DIRTY_TRACK) && dirty->bitmap) return -EINVAL; - for (i = 0; i < max; i++) { - unsigned long cur = iova + i * MOCK_IO_PAGE_SIZE; + do { + unsigned long pgsize = MOCK_IO_PAGE_SIZE; + unsigned long head; - ent = xa_load(&mock->pfns, cur / MOCK_IO_PAGE_SIZE); - if (ent && (xa_to_value(ent) & MOCK_PFN_DIRTY_IOVA)) { - /* Clear dirty */ - if (!(flags & IOMMU_DIRTY_NO_CLEAR)) { - unsigned long val; - - val = xa_to_value(ent) & ~MOCK_PFN_DIRTY_IOVA; - old = xa_store(&mock->pfns, - cur / MOCK_IO_PAGE_SIZE, - xa_mk_value(val), GFP_KERNEL); - WARN_ON_ONCE(ent != old); - } - iommu_dirty_bitmap_record(dirty, cur, - MOCK_IO_PAGE_SIZE); + ent = xa_load(&mock->pfns, iova / MOCK_IO_PAGE_SIZE); + if (!ent) { + iova += pgsize; + continue; } - } + + head = iova & ~(pgsize - 1); + + /* Clear dirty */ + if (mock_test_and_clear_dirty(mock, head, pgsize, flags)) + iommu_dirty_bitmap_record(dirty, head, pgsize); + iova = head + pgsize; + } while (iova < end); return 0; } From 5241efc45293c0bec95e18730c3f26c3bebc1dc5 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Fri, 2 Feb 2024 13:34:09 +0000 Subject: [PATCH 390/548] iommufd/selftest: Test u64 unaligned bitmaps Exercise the dirty tracking bitmaps with byte unaligned addresses in addition to the PAGE_SIZE unaligned bitmaps, using a address towards the end of the page boundary. In doing so, increase the tailroom we allocate for the bitmap from MOCK_PAGE_SIZE(2K) into PAGE_SIZE(4K), such that we can test end of bitmap boundary. Link: https://lore.kernel.org/r/20240202133415.23819-4-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit 42af95114535dd94c39714b97ad720602d406b9a) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 57037bec9c4e0..0f0dafde58523 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1482,9 +1482,9 @@ FIXTURE_SETUP(iommufd_dirty_tracking) self->bitmap_size = variant->buffer_size / self->page_size / BITS_PER_BYTE; - /* Provision with an extra (MOCK_PAGE_SIZE) for the unaligned case */ + /* Provision with an extra (PAGE_SIZE) for the unaligned case */ rc = posix_memalign(&self->bitmap, PAGE_SIZE, - self->bitmap_size + MOCK_PAGE_SIZE); + self->bitmap_size + PAGE_SIZE); assert(!rc); assert(self->bitmap); assert((uintptr_t)self->bitmap % PAGE_SIZE == 0); @@ -1639,6 +1639,13 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) self->bitmap + MOCK_PAGE_SIZE, self->bitmap_size, 0, _metadata); + /* u64 unaligned bitmap */ + test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, + MOCK_APERTURE_START, self->page_size, + self->bitmap + 0xff1, + self->bitmap_size, 0, _metadata); + + test_ioctl_destroy(stddev_id); test_ioctl_destroy(hwpt_id); } @@ -1676,6 +1683,14 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap_no_clear) IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, _metadata); + /* u64 unaligned bitmap */ + test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, + MOCK_APERTURE_START, self->page_size, + self->bitmap + 0xff1, + self->bitmap_size, + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, + _metadata); + test_ioctl_destroy(stddev_id); test_ioctl_destroy(hwpt_id); } From e848195c3cd10af7e920381c4eb685ae867beb71 Mon Sep 17 00:00:00 2001 From: Joao Martins Date: Fri, 2 Feb 2024 13:34:11 +0000 Subject: [PATCH 391/548] iommufd/selftest: Refactor dirty bitmap tests Rework the functions that test and set the bitmaps to receive a new parameter (the pte_page_size) that reflects the expected PTE size in the page tables. The same scheme is still used i.e. even bits are dirty and odd page indexes aren't dirty. Here it just refactors to consider the size of the PTE rather than hardcoded to IOMMU mock base page assumptions. While at it, refactor dirty bitmap tests to use the idev_id created by the fixture instead of creating a new one. This is in preparation for doing tests with IOMMU hugepages where multiple bits set as part of recording a whole hugepage as dirty and thus the pte_page_size will vary depending on io hugepages or io base pages. Link: https://lore.kernel.org/r/20240202133415.23819-6-joao.m.martins@oracle.com Signed-off-by: Joao Martins Signed-off-by: Jason Gunthorpe (cherry picked from commit 407fc184f0e0bfde61026d6ce3fb1e70a15159a3) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd.c | 28 ++++++------- tools/testing/selftests/iommu/iommufd_utils.h | 39 ++++++++++++------- 2 files changed, 36 insertions(+), 31 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 0f0dafde58523..680ddaada8849 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1612,7 +1612,7 @@ TEST_F(iommufd_dirty_tracking, device_dirty_capability) TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) { - uint32_t stddev_id; + uint32_t page_size = MOCK_PAGE_SIZE; uint32_t hwpt_id; uint32_t ioas_id; @@ -1625,34 +1625,31 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap) test_cmd_hwpt_alloc(self->idev_id, ioas_id, IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); - test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); test_cmd_set_dirty_tracking(hwpt_id, true); test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, - MOCK_APERTURE_START, self->page_size, + MOCK_APERTURE_START, self->page_size, page_size, self->bitmap, self->bitmap_size, 0, _metadata); /* PAGE_SIZE unaligned bitmap */ test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, - MOCK_APERTURE_START, self->page_size, + MOCK_APERTURE_START, self->page_size, page_size, self->bitmap + MOCK_PAGE_SIZE, self->bitmap_size, 0, _metadata); /* u64 unaligned bitmap */ test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, - MOCK_APERTURE_START, self->page_size, - self->bitmap + 0xff1, - self->bitmap_size, 0, _metadata); - + MOCK_APERTURE_START, self->page_size, page_size, + self->bitmap + 0xff1, self->bitmap_size, 0, + _metadata); - test_ioctl_destroy(stddev_id); test_ioctl_destroy(hwpt_id); } TEST_F(iommufd_dirty_tracking, get_dirty_bitmap_no_clear) { - uint32_t stddev_id; + uint32_t page_size = MOCK_PAGE_SIZE; uint32_t hwpt_id; uint32_t ioas_id; @@ -1665,19 +1662,18 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap_no_clear) test_cmd_hwpt_alloc(self->idev_id, ioas_id, IOMMU_HWPT_ALLOC_DIRTY_TRACKING, &hwpt_id); - test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); test_cmd_set_dirty_tracking(hwpt_id, true); test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, - MOCK_APERTURE_START, self->page_size, + MOCK_APERTURE_START, self->page_size, page_size, self->bitmap, self->bitmap_size, IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, _metadata); /* Unaligned bitmap */ test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, - MOCK_APERTURE_START, self->page_size, + MOCK_APERTURE_START, self->page_size, page_size, self->bitmap + MOCK_PAGE_SIZE, self->bitmap_size, IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, @@ -1685,13 +1681,11 @@ TEST_F(iommufd_dirty_tracking, get_dirty_bitmap_no_clear) /* u64 unaligned bitmap */ test_mock_dirty_bitmaps(hwpt_id, variant->buffer_size, - MOCK_APERTURE_START, self->page_size, - self->bitmap + 0xff1, - self->bitmap_size, + MOCK_APERTURE_START, self->page_size, page_size, + self->bitmap + 0xff1, self->bitmap_size, IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR, _metadata); - test_ioctl_destroy(stddev_id); test_ioctl_destroy(hwpt_id); } diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 70d558e0f0c74..35b3f8949c414 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -273,16 +273,19 @@ static int _test_cmd_mock_domain_set_dirty(int fd, __u32 hwpt_id, size_t length, page_size, bitmap, nr)) static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, - __u64 iova, size_t page_size, __u64 *bitmap, + __u64 iova, size_t page_size, + size_t pte_page_size, __u64 *bitmap, __u64 bitmap_size, __u32 flags, struct __test_metadata *_metadata) { - unsigned long i, nbits = bitmap_size * BITS_PER_BYTE; - unsigned long nr = nbits / 2; + unsigned long npte = pte_page_size / page_size, pteset = 2 * npte; + unsigned long nbits = bitmap_size * BITS_PER_BYTE; + unsigned long j, i, nr = nbits / pteset ?: 1; __u64 out_dirty = 0; /* Mark all even bits as dirty in the mock domain */ - for (i = 0; i < nbits; i += 2) + memset(bitmap, 0, bitmap_size); + for (i = 0; i < nbits; i += pteset) set_bit(i, (unsigned long *)bitmap); test_cmd_mock_domain_set_dirty(fd, hwpt_id, length, iova, page_size, @@ -294,8 +297,12 @@ static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap, flags); /* Beware ASSERT_EQ() is two statements -- braces are not redundant! */ - for (i = 0; i < nbits; i++) { - ASSERT_EQ(!(i % 2), test_bit(i, (unsigned long *)bitmap)); + for (i = 0; i < nbits; i += pteset) { + for (j = 0; j < pteset; j++) { + ASSERT_EQ(j < npte, + test_bit(i + j, (unsigned long *)bitmap)); + } + ASSERT_EQ(!(i % pteset), test_bit(i, (unsigned long *)bitmap)); } memset(bitmap, 0, bitmap_size); @@ -303,19 +310,23 @@ static int _test_mock_dirty_bitmaps(int fd, __u32 hwpt_id, size_t length, flags); /* It as read already -- expect all zeroes */ - for (i = 0; i < nbits; i++) { - ASSERT_EQ(!(i % 2) && (flags & - IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR), - test_bit(i, (unsigned long *)bitmap)); + for (i = 0; i < nbits; i += pteset) { + for (j = 0; j < pteset; j++) { + ASSERT_EQ( + (j < npte) && + (flags & + IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR), + test_bit(i + j, (unsigned long *)bitmap)); + } } return 0; } -#define test_mock_dirty_bitmaps(hwpt_id, length, iova, page_size, bitmap, \ - bitmap_size, flags, _metadata) \ +#define test_mock_dirty_bitmaps(hwpt_id, length, iova, page_size, pte_size,\ + bitmap, bitmap_size, flags, _metadata) \ ASSERT_EQ(0, _test_mock_dirty_bitmaps(self->fd, hwpt_id, length, iova, \ - page_size, bitmap, bitmap_size, \ - flags, _metadata)) + page_size, pte_size, bitmap, \ + bitmap_size, flags, _metadata)) static int _test_cmd_create_access(int fd, unsigned int ioas_id, __u32 *access_id, unsigned int flags) From 41b07687ad46f98b5568f53bf820f8842fb4bdb4 Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Tue, 24 Jun 2025 11:00:45 -0700 Subject: [PATCH 392/548] iommufd/selftest: Fix iommufd_dirty_tracking with large hugepage sizes The hugepage test cases of iommufd_dirty_tracking have the 64MB and 128MB coverages. Both of them are smaller than the default hugepage size 512MB, when CONFIG_PAGE_SIZE_64KB=y. However, these test cases have a variant of using huge pages, which would mmap(MAP_HUGETLB) using these smaller sizes than the system hugepag size. This results in the kernel aligning up the smaller size to 512MB. If a memory was located between the upper 64/128MB size boundary and the hugepage 512MB boundary, it would get wiped out: https://lore.kernel.org/all/aEoUhPYIAizTLADq@nvidia.com/ Given that this aligning up behavior is well documented, we have no choice but to allocate a hugepage aligned size to avoid this unintended wipe out. Instead of relying on the kernel's internal force alignment, pass the same size to posix_memalign() and map(). Also, fix the FIXTURE_TEARDOWN() misusing munmap() to free the memory from posix_memalign(), as munmap() doesn't destroy the allocator meta data. So, call free() instead. Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Link: https://patch.msgid.link/r/1ea8609ae6d523fdd4d8efb179ddee79c8582cb6.1750787928.git.nicolinc@nvidia.com Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe Signed-off-by: Nicolin Chen Signed-off-by: Jason Gunthorpe (cherry picked from commit 818625570558cd91082c9bafd6f2b59b73241a69) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd.c | 30 +++++++++++++++++-------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 680ddaada8849..83b573c40db6c 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -1452,6 +1452,7 @@ FIXTURE_VARIANT(iommufd_dirty_tracking) FIXTURE_SETUP(iommufd_dirty_tracking) { + size_t mmap_buffer_size; int mmap_flags; void *vrc; int rc; @@ -1459,22 +1460,33 @@ FIXTURE_SETUP(iommufd_dirty_tracking) self->fd = open("/dev/iommu", O_RDWR); ASSERT_NE(-1, self->fd); - rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size); - if (rc || !self->buffer) { - SKIP(return, "Skipping buffer_size=%lu due to errno=%d", - variant->buffer_size, rc); - } - mmap_flags = MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED; + mmap_buffer_size = variant->buffer_size; if (variant->hugepages) { /* * MAP_POPULATE will cause the kernel to fail mmap if THPs are * not available. */ mmap_flags |= MAP_HUGETLB | MAP_POPULATE; + + /* + * Allocation must be aligned to the HUGEPAGE_SIZE, because the + * following mmap() will automatically align the length to be a + * multiple of the underlying huge page size. Failing to do the + * same at this allocation will result in a memory overwrite by + * the mmap(). + */ + if (mmap_buffer_size < HUGEPAGE_SIZE) + mmap_buffer_size = HUGEPAGE_SIZE; + } + + rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, mmap_buffer_size); + if (rc || !self->buffer) { + SKIP(return, "Skipping buffer_size=%lu due to errno=%d", + mmap_buffer_size, rc); } assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0); - vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE, + vrc = mmap(self->buffer, mmap_buffer_size, PROT_READ | PROT_WRITE, mmap_flags, -1, 0); assert(vrc == self->buffer); @@ -1504,8 +1516,8 @@ FIXTURE_SETUP(iommufd_dirty_tracking) FIXTURE_TEARDOWN(iommufd_dirty_tracking) { - munmap(self->buffer, variant->buffer_size); - munmap(self->bitmap, self->bitmap_size); + free(self->buffer); + free(self->bitmap); teardown_iommufd(self->fd, _metadata); } From 5900d4b112d75fee0abd5c36241cbba5c3c861c0 Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Mon, 23 Oct 2023 18:29:58 -0700 Subject: [PATCH 393/548] iommufd: Only enforce cache coherency in iommufd_hw_pagetable_alloc According to the conversation in the following link: https://lore.kernel.org/linux-iommu/20231020135501.GG3952@nvidia.com/ The enforce_cache_coherency should be set/enforced in the hwpt allocation routine. The iommu driver in its attach_dev() op should decide whether to reject or not a device that doesn't match with the configuration of cache coherency. Drop the enforce_cache_coherency piece in the attach/replace() and move the remaining "num_devices" piece closer to the refcount that is using it. Accordingly drop its function prototype in the header and mark it static. Also add some extra comments to clarify the expected behaviors. Link: https://lore.kernel.org/r/20231024012958.30842-1-nicolinc@nvidia.com Suggested-by: Kevin Tian Reviewed-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Nicolin Chen Signed-off-by: Jason Gunthorpe (cherry picked from commit 2ccabf81ddff81355bf044bdad3c44e9e6ac32d9) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 20 ++------------------ drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++++- drivers/iommu/iommufd/iommufd_private.h | 1 - 3 files changed, 10 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index cd0c96bf86e1a..0f264242ab09a 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -337,13 +337,6 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, goto err_unlock; } - /* Try to upgrade the domain we have */ - if (idev->enforce_cache_coherency) { - rc = iommufd_hw_pagetable_enforce_cc(hwpt); - if (rc) - goto err_unlock; - } - rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev->dev, &idev->igroup->sw_msi_start); if (rc) @@ -424,8 +417,8 @@ iommufd_device_do_replace(struct iommufd_device *idev, { struct iommufd_group *igroup = idev->igroup; struct iommufd_hw_pagetable *old_hwpt; - unsigned int num_devices = 0; struct iommufd_device *cur; + unsigned int num_devices; int rc; mutex_lock(&idev->igroup->lock); @@ -445,16 +438,6 @@ iommufd_device_do_replace(struct iommufd_device *idev, return NULL; } - /* Try to upgrade the domain we have */ - list_for_each_entry(cur, &igroup->device_list, group_item) { - num_devices++; - if (cur->enforce_cache_coherency) { - rc = iommufd_hw_pagetable_enforce_cc(hwpt); - if (rc) - goto err_unlock; - } - } - old_hwpt = igroup->hwpt; if (hwpt->ioas != old_hwpt->ioas) { list_for_each_entry(cur, &igroup->device_list, group_item) { @@ -481,6 +464,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, igroup->hwpt = hwpt; + num_devices = list_count_nodes(&igroup->device_list); /* * Move the refcounts held by the device_list to the new hwpt. Retain a * refcount for this thread as the caller will free it. diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 72a5269984b01..c2b742ffeb0ca 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -42,7 +42,7 @@ void iommufd_hw_pagetable_abort(struct iommufd_object *obj) iommufd_hw_pagetable_destroy(obj); } -int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) +static int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) { if (hwpt->enforce_cache_coherency) return 0; @@ -116,6 +116,13 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, * doing any maps. It is an iommu driver bug to report * IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail enforce_cache_coherency on * a new domain. + * + * The cache coherency mode must be configured here and unchanged later. + * Note that a HWPT (non-CC) created for a device (non-CC) can be later + * reused by another device (either non-CC or CC). However, A HWPT (CC) + * created for a device (CC) cannot be reused by another device (non-CC) + * but only devices (CC). Instead user space in this case would need to + * allocate a separate HWPT (non-CC). */ if (idev->enforce_cache_coherency) { rc = iommufd_hw_pagetable_enforce_cc(hwpt); diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 034129130db37..43410fd531573 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -266,7 +266,6 @@ struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, u32 flags, bool immediate_attach); -int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt); int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev); struct iommufd_hw_pagetable * From 711c7ca62bad24b123912cba3b9bc66d06ae4e11 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Wed, 25 Oct 2023 21:39:29 -0700 Subject: [PATCH 394/548] iommu: Add IOMMU_DOMAIN_NESTED Introduce a new domain type for a user I/O page table, which is nested on top of another user space address represented by a PAGING domain. This new domain can be allocated by the domain_alloc_user op, and attached to a device through the existing iommu_attach_device/group() interfaces. The mappings of a nested domain are managed by user space software, so it is not necessary to have map/unmap callbacks. Link: https://lore.kernel.org/r/20231026043938.63898-2-yi.l.liu@intel.com Signed-off-by: Lu Baolu Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit 54d606816b32401de5431f6776a78b1de135bfa2) Signed-off-by: Koba Ko --- include/linux/iommu.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 57aa6c9f0da8b..ebb220afd0594 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -68,6 +68,9 @@ struct iommu_domain_geometry { #define __IOMMU_DOMAIN_SVA (1U << 4) /* Shared process address space */ #define __IOMMU_DOMAIN_PLATFORM (1U << 5) +#define __IOMMU_DOMAIN_NESTED (1U << 6) /* User-managed address space nested + on a stage-2 translation */ + #define IOMMU_DOMAIN_ALLOC_FLAGS ~__IOMMU_DOMAIN_DMA_FQ /* * This are the possible domain-types @@ -97,6 +100,7 @@ struct iommu_domain_geometry { __IOMMU_DOMAIN_DMA_FQ) #define IOMMU_DOMAIN_SVA (__IOMMU_DOMAIN_SVA) #define IOMMU_DOMAIN_PLATFORM (__IOMMU_DOMAIN_PLATFORM) +#define IOMMU_DOMAIN_NESTED (__IOMMU_DOMAIN_NESTED) struct iommu_domain { unsigned type; From bc2d30fe0df14775e3560d16c404b8e776004543 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 25 Oct 2023 21:39:30 -0700 Subject: [PATCH 395/548] iommufd: Rename IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING To add a new IOMMUFD_OBJ_HWPT_NESTED, rename the HWPT object to confine it to PAGING hwpts/domains. The following patch will separate the hwpt structure as well. Link: https://lore.kernel.org/r/20231026043938.63898-3-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit 9744a7ab62cc7354096aaff788c08b947f86ba60) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 10 +++++----- drivers/iommu/iommufd/hw_pagetable.c | 2 +- drivers/iommu/iommufd/iommufd_private.h | 4 ++-- drivers/iommu/iommufd/main.c | 2 +- drivers/iommu/iommufd/selftest.c | 2 +- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 0f264242ab09a..b315a8ccbeb34 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -579,7 +579,7 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, return PTR_ERR(pt_obj); switch (pt_obj->type) { - case IOMMUFD_OBJ_HW_PAGETABLE: { + case IOMMUFD_OBJ_HWPT_PAGING: { struct iommufd_hw_pagetable *hwpt = container_of(pt_obj, struct iommufd_hw_pagetable, obj); @@ -617,8 +617,8 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, /** * iommufd_device_attach - Connect a device to an iommu_domain * @idev: device to attach - * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE - * Output the IOMMUFD_OBJ_HW_PAGETABLE ID + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING + * Output the IOMMUFD_OBJ_HWPT_PAGING ID * * This connects the device to an iommu_domain, either automatically or manually * selected. Once this completes the device could do DMA. @@ -646,8 +646,8 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD); /** * iommufd_device_replace - Change the device's iommu_domain * @idev: device to change - * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE - * Output the IOMMUFD_OBJ_HW_PAGETABLE ID + * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HWPT_PAGING + * Output the IOMMUFD_OBJ_HWPT_PAGING ID * * This is the same as:: * diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index c2b742ffeb0ca..8dc2b39f8cb0c 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -86,7 +86,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, if (flags && !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); - hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HW_PAGETABLE); + hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HWPT_PAGING); if (IS_ERR(hwpt)) return hwpt; diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 43410fd531573..70bebad63a745 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -123,7 +123,7 @@ enum iommufd_object_type { IOMMUFD_OBJ_NONE, IOMMUFD_OBJ_ANY = IOMMUFD_OBJ_NONE, IOMMUFD_OBJ_DEVICE, - IOMMUFD_OBJ_HW_PAGETABLE, + IOMMUFD_OBJ_HWPT_PAGING, IOMMUFD_OBJ_IOAS, IOMMUFD_OBJ_ACCESS, #ifdef CONFIG_IOMMUFD_TEST @@ -256,7 +256,7 @@ static inline struct iommufd_hw_pagetable * iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id) { return container_of(iommufd_get_object(ucmd->ictx, id, - IOMMUFD_OBJ_HW_PAGETABLE), + IOMMUFD_OBJ_HWPT_PAGING), struct iommufd_hw_pagetable, obj); } int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd); diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index d50f42a730aa3..46198e8948d68 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -488,7 +488,7 @@ static const struct iommufd_object_ops iommufd_object_ops[] = { [IOMMUFD_OBJ_IOAS] = { .destroy = iommufd_ioas_destroy, }, - [IOMMUFD_OBJ_HW_PAGETABLE] = { + [IOMMUFD_OBJ_HWPT_PAGING] = { .destroy = iommufd_hw_pagetable_destroy, .abort = iommufd_hw_pagetable_abort, }, diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 32e5117a6f531..6f00b4f9f1465 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -469,7 +469,7 @@ get_md_pagetable(struct iommufd_ucmd *ucmd, u32 mockpt_id, struct iommufd_object *obj; obj = iommufd_get_object(ucmd->ictx, mockpt_id, - IOMMUFD_OBJ_HW_PAGETABLE); + IOMMUFD_OBJ_HWPT_PAGING); if (IS_ERR(obj)) return ERR_CAST(obj); hwpt = container_of(obj, struct iommufd_hw_pagetable, obj); From 06e00f367cf5f6ec084605953b3b9f7b9c96c9a3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 25 Oct 2023 21:39:31 -0700 Subject: [PATCH 396/548] iommufd/device: Wrap IOMMUFD_OBJ_HWPT_PAGING-only configurations Some of the configurations during the attach/replace() should only apply to IOMMUFD_OBJ_HWPT_PAGING. Once IOMMUFD_OBJ_HWPT_NESTED gets introduced in a following patch, keeping them unconditionally in the common routine will not work. Wrap all of those PAGING-only configurations together into helpers. Do a hwpt_is_paging check whenever calling them or their fallback routines. Link: https://lore.kernel.org/r/20231026043938.63898-4-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit 58d84f430dc7f737d21c60906de0f39104c89e9d) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 111 +++++++++++++++++------- drivers/iommu/iommufd/iommufd_private.h | 5 ++ 2 files changed, 86 insertions(+), 30 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index b315a8ccbeb34..b24cba9e3d92d 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -325,6 +325,28 @@ static int iommufd_group_setup_msi(struct iommufd_group *igroup, return 0; } +static int iommufd_hwpt_paging_attach(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + int rc; + + lockdep_assert_held(&idev->igroup->lock); + + rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev->dev, + &idev->igroup->sw_msi_start); + if (rc) + return rc; + + if (list_empty(&idev->igroup->device_list)) { + rc = iommufd_group_setup_msi(idev->igroup, hwpt); + if (rc) { + iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); + return rc; + } + } + return 0; +} + int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev) { @@ -337,10 +359,11 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, goto err_unlock; } - rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev->dev, - &idev->igroup->sw_msi_start); - if (rc) - goto err_unlock; + if (hwpt_is_paging(hwpt)) { + rc = iommufd_hwpt_paging_attach(hwpt, idev); + if (rc) + goto err_unlock; + } /* * Only attach to the group once for the first device that is in the @@ -350,10 +373,6 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, * attachment. */ if (list_empty(&idev->igroup->device_list)) { - rc = iommufd_group_setup_msi(idev->igroup, hwpt); - if (rc) - goto err_unresv; - rc = iommu_attach_group(hwpt->domain, idev->igroup->group); if (rc) goto err_unresv; @@ -364,7 +383,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, mutex_unlock(&idev->igroup->lock); return 0; err_unresv: - iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); + if (hwpt_is_paging(hwpt)) + iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); err_unlock: mutex_unlock(&idev->igroup->lock); return rc; @@ -381,7 +401,8 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev) iommu_detach_group(hwpt->domain, idev->igroup->group); idev->igroup->hwpt = NULL; } - iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); + if (hwpt_is_paging(hwpt)) + iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); mutex_unlock(&idev->igroup->lock); /* Caller must destroy hwpt */ @@ -411,13 +432,52 @@ static bool iommufd_device_is_attached(struct iommufd_device *idev) return false; } +static void +iommufd_group_remove_reserved_iova(struct iommufd_group *igroup, + struct iommufd_hw_pagetable *hwpt) +{ + struct iommufd_device *cur; + + lockdep_assert_held(&igroup->lock); + + list_for_each_entry(cur, &igroup->device_list, group_item) + iopt_remove_reserved_iova(&hwpt->ioas->iopt, cur->dev); +} + +static int iommufd_group_do_replace_paging(struct iommufd_group *igroup, + struct iommufd_hw_pagetable *hwpt) +{ + struct iommufd_hw_pagetable *old_hwpt = igroup->hwpt; + struct iommufd_device *cur; + int rc; + + lockdep_assert_held(&igroup->lock); + + if (!hwpt_is_paging(old_hwpt) || hwpt->ioas != old_hwpt->ioas) { + list_for_each_entry(cur, &igroup->device_list, group_item) { + rc = iopt_table_enforce_dev_resv_regions( + &hwpt->ioas->iopt, cur->dev, NULL); + if (rc) + goto err_unresv; + } + } + + rc = iommufd_group_setup_msi(igroup, hwpt); + if (rc) + goto err_unresv; + return 0; + +err_unresv: + iommufd_group_remove_reserved_iova(igroup, hwpt); + return rc; +} + static struct iommufd_hw_pagetable * iommufd_device_do_replace(struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt) { struct iommufd_group *igroup = idev->igroup; struct iommufd_hw_pagetable *old_hwpt; - struct iommufd_device *cur; unsigned int num_devices; int rc; @@ -438,29 +498,20 @@ iommufd_device_do_replace(struct iommufd_device *idev, return NULL; } - old_hwpt = igroup->hwpt; - if (hwpt->ioas != old_hwpt->ioas) { - list_for_each_entry(cur, &igroup->device_list, group_item) { - rc = iopt_table_enforce_dev_resv_regions( - &hwpt->ioas->iopt, cur->dev, NULL); - if (rc) - goto err_unresv; - } + if (hwpt_is_paging(hwpt)) { + rc = iommufd_group_do_replace_paging(igroup, hwpt); + if (rc) + goto err_unlock; } - rc = iommufd_group_setup_msi(idev->igroup, hwpt); - if (rc) - goto err_unresv; - rc = iommu_group_replace_domain(igroup->group, hwpt->domain); if (rc) goto err_unresv; - if (hwpt->ioas != old_hwpt->ioas) { - list_for_each_entry(cur, &igroup->device_list, group_item) - iopt_remove_reserved_iova(&old_hwpt->ioas->iopt, - cur->dev); - } + old_hwpt = igroup->hwpt; + if (hwpt_is_paging(old_hwpt) && + (!hwpt_is_paging(hwpt) || hwpt->ioas != old_hwpt->ioas)) + iommufd_group_remove_reserved_iova(igroup, old_hwpt); igroup->hwpt = hwpt; @@ -478,8 +529,8 @@ iommufd_device_do_replace(struct iommufd_device *idev, /* Caller must destroy old_hwpt */ return old_hwpt; err_unresv: - list_for_each_entry(cur, &igroup->device_list, group_item) - iopt_remove_reserved_iova(&hwpt->ioas->iopt, cur->dev); + if (hwpt_is_paging(hwpt)) + iommufd_group_remove_reserved_iova(igroup, hwpt); err_unlock: mutex_unlock(&idev->igroup->lock); return ERR_PTR(rc); diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 70bebad63a745..776dd41c077f0 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -252,6 +252,11 @@ struct iommufd_hw_pagetable { struct list_head hwpt_item; }; +static inline bool hwpt_is_paging(struct iommufd_hw_pagetable *hwpt) +{ + return hwpt->obj.type == IOMMUFD_OBJ_HWPT_PAGING; +} + static inline struct iommufd_hw_pagetable * iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id) { From 009f7ced6a3b583619292a11c91015c5ba9d9f39 Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 25 Oct 2023 21:39:32 -0700 Subject: [PATCH 397/548] iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable To prepare for IOMMUFD_OBJ_HWPT_NESTED, derive struct iommufd_hwpt_paging from struct iommufd_hw_pagetable, by leaving the common members in struct iommufd_hw_pagetable. Add a __iommufd_object_alloc and to_hwpt_paging() helpers for the new structure. Then, update "hwpt" to "hwpt_paging" throughout the files, accordingly. Link: https://lore.kernel.org/r/20231026043938.63898-5-yi.l.liu@intel.com Suggested-by: Jason Gunthorpe Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit 89db31635c87a7856e205c7ebf9f562e4bb206fe) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 76 ++++++++------ drivers/iommu/iommufd/hw_pagetable.c | 129 +++++++++++++----------- drivers/iommu/iommufd/iommufd_private.h | 42 +++++--- drivers/iommu/iommufd/main.c | 4 +- drivers/iommu/iommufd/vfio_compat.c | 6 +- 5 files changed, 149 insertions(+), 108 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index b24cba9e3d92d..24396a53ce06f 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -293,7 +293,7 @@ u32 iommufd_device_to_id(struct iommufd_device *idev) EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD); static int iommufd_group_setup_msi(struct iommufd_group *igroup, - struct iommufd_hw_pagetable *hwpt) + struct iommufd_hwpt_paging *hwpt_paging) { phys_addr_t sw_msi_start = igroup->sw_msi_start; int rc; @@ -311,8 +311,9 @@ static int iommufd_group_setup_msi(struct iommufd_group *igroup, * matches what the IRQ layer actually expects in a newly created * domain. */ - if (sw_msi_start != PHYS_ADDR_MAX && !hwpt->msi_cookie) { - rc = iommu_get_msi_cookie(hwpt->domain, sw_msi_start); + if (sw_msi_start != PHYS_ADDR_MAX && !hwpt_paging->msi_cookie) { + rc = iommu_get_msi_cookie(hwpt_paging->common.domain, + sw_msi_start); if (rc) return rc; @@ -320,27 +321,29 @@ static int iommufd_group_setup_msi(struct iommufd_group *igroup, * iommu_get_msi_cookie() can only be called once per domain, * it returns -EBUSY on later calls. */ - hwpt->msi_cookie = true; + hwpt_paging->msi_cookie = true; } return 0; } -static int iommufd_hwpt_paging_attach(struct iommufd_hw_pagetable *hwpt, +static int iommufd_hwpt_paging_attach(struct iommufd_hwpt_paging *hwpt_paging, struct iommufd_device *idev) { int rc; lockdep_assert_held(&idev->igroup->lock); - rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev->dev, + rc = iopt_table_enforce_dev_resv_regions(&hwpt_paging->ioas->iopt, + idev->dev, &idev->igroup->sw_msi_start); if (rc) return rc; if (list_empty(&idev->igroup->device_list)) { - rc = iommufd_group_setup_msi(idev->igroup, hwpt); + rc = iommufd_group_setup_msi(idev->igroup, hwpt_paging); if (rc) { - iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); + iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, + idev->dev); return rc; } } @@ -360,7 +363,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, } if (hwpt_is_paging(hwpt)) { - rc = iommufd_hwpt_paging_attach(hwpt, idev); + rc = iommufd_hwpt_paging_attach(to_hwpt_paging(hwpt), idev); if (rc) goto err_unlock; } @@ -384,7 +387,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, return 0; err_unresv: if (hwpt_is_paging(hwpt)) - iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); + iopt_remove_reserved_iova(&to_hwpt_paging(hwpt)->ioas->iopt, + idev->dev); err_unlock: mutex_unlock(&idev->igroup->lock); return rc; @@ -402,7 +406,8 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev) idev->igroup->hwpt = NULL; } if (hwpt_is_paging(hwpt)) - iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); + iopt_remove_reserved_iova(&to_hwpt_paging(hwpt)->ioas->iopt, + idev->dev); mutex_unlock(&idev->igroup->lock); /* Caller must destroy hwpt */ @@ -434,18 +439,19 @@ static bool iommufd_device_is_attached(struct iommufd_device *idev) static void iommufd_group_remove_reserved_iova(struct iommufd_group *igroup, - struct iommufd_hw_pagetable *hwpt) + struct iommufd_hwpt_paging *hwpt_paging) { struct iommufd_device *cur; lockdep_assert_held(&igroup->lock); list_for_each_entry(cur, &igroup->device_list, group_item) - iopt_remove_reserved_iova(&hwpt->ioas->iopt, cur->dev); + iopt_remove_reserved_iova(&hwpt_paging->ioas->iopt, cur->dev); } -static int iommufd_group_do_replace_paging(struct iommufd_group *igroup, - struct iommufd_hw_pagetable *hwpt) +static int +iommufd_group_do_replace_paging(struct iommufd_group *igroup, + struct iommufd_hwpt_paging *hwpt_paging) { struct iommufd_hw_pagetable *old_hwpt = igroup->hwpt; struct iommufd_device *cur; @@ -453,22 +459,23 @@ static int iommufd_group_do_replace_paging(struct iommufd_group *igroup, lockdep_assert_held(&igroup->lock); - if (!hwpt_is_paging(old_hwpt) || hwpt->ioas != old_hwpt->ioas) { + if (!hwpt_is_paging(old_hwpt) || + hwpt_paging->ioas != to_hwpt_paging(old_hwpt)->ioas) { list_for_each_entry(cur, &igroup->device_list, group_item) { rc = iopt_table_enforce_dev_resv_regions( - &hwpt->ioas->iopt, cur->dev, NULL); + &hwpt_paging->ioas->iopt, cur->dev, NULL); if (rc) goto err_unresv; } } - rc = iommufd_group_setup_msi(igroup, hwpt); + rc = iommufd_group_setup_msi(igroup, hwpt_paging); if (rc) goto err_unresv; return 0; err_unresv: - iommufd_group_remove_reserved_iova(igroup, hwpt); + iommufd_group_remove_reserved_iova(igroup, hwpt_paging); return rc; } @@ -498,8 +505,10 @@ iommufd_device_do_replace(struct iommufd_device *idev, return NULL; } + old_hwpt = igroup->hwpt; if (hwpt_is_paging(hwpt)) { - rc = iommufd_group_do_replace_paging(igroup, hwpt); + rc = iommufd_group_do_replace_paging(igroup, + to_hwpt_paging(hwpt)); if (rc) goto err_unlock; } @@ -508,10 +517,11 @@ iommufd_device_do_replace(struct iommufd_device *idev, if (rc) goto err_unresv; - old_hwpt = igroup->hwpt; if (hwpt_is_paging(old_hwpt) && - (!hwpt_is_paging(hwpt) || hwpt->ioas != old_hwpt->ioas)) - iommufd_group_remove_reserved_iova(igroup, old_hwpt); + (!hwpt_is_paging(hwpt) || + to_hwpt_paging(hwpt)->ioas != to_hwpt_paging(old_hwpt)->ioas)) + iommufd_group_remove_reserved_iova(igroup, + to_hwpt_paging(old_hwpt)); igroup->hwpt = hwpt; @@ -530,7 +540,8 @@ iommufd_device_do_replace(struct iommufd_device *idev, return old_hwpt; err_unresv: if (hwpt_is_paging(hwpt)) - iommufd_group_remove_reserved_iova(igroup, hwpt); + iommufd_group_remove_reserved_iova(igroup, + to_hwpt_paging(old_hwpt)); err_unlock: mutex_unlock(&idev->igroup->lock); return ERR_PTR(rc); @@ -558,6 +569,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, */ bool immediate_attach = do_attach == iommufd_device_do_attach; struct iommufd_hw_pagetable *destroy_hwpt; + struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_hw_pagetable *hwpt; /* @@ -566,10 +578,11 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, * other. */ mutex_lock(&ioas->mutex); - list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) { - if (!hwpt->auto_domain) + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { + if (!hwpt_paging->auto_domain) continue; + hwpt = &hwpt_paging->common; if (!iommufd_lock_obj(&hwpt->obj)) continue; destroy_hwpt = (*do_attach)(idev, hwpt); @@ -590,12 +603,13 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, goto out_unlock; } - hwpt = iommufd_hw_pagetable_alloc(idev->ictx, ioas, idev, - 0, immediate_attach); - if (IS_ERR(hwpt)) { - destroy_hwpt = ERR_CAST(hwpt); + hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, 0, + immediate_attach); + if (IS_ERR(hwpt_paging)) { + destroy_hwpt = ERR_CAST(hwpt_paging); goto out_unlock; } + hwpt = &hwpt_paging->common; if (!immediate_attach) { destroy_hwpt = (*do_attach)(idev, hwpt); @@ -605,7 +619,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, destroy_hwpt = NULL; } - hwpt->auto_domain = true; + hwpt_paging->auto_domain = true; *pt_id = hwpt->obj.id; iommufd_object_finalize(idev->ictx, &hwpt->obj); diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 8dc2b39f8cb0c..39b8b625b48d4 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -8,56 +8,61 @@ #include "../iommu-priv.h" #include "iommufd_private.h" -void iommufd_hw_pagetable_destroy(struct iommufd_object *obj) +void iommufd_hwpt_paging_destroy(struct iommufd_object *obj) { - struct iommufd_hw_pagetable *hwpt = - container_of(obj, struct iommufd_hw_pagetable, obj); + struct iommufd_hwpt_paging *hwpt_paging = + container_of(obj, struct iommufd_hwpt_paging, common.obj); - if (!list_empty(&hwpt->hwpt_item)) { - mutex_lock(&hwpt->ioas->mutex); - list_del(&hwpt->hwpt_item); - mutex_unlock(&hwpt->ioas->mutex); + if (!list_empty(&hwpt_paging->hwpt_item)) { + mutex_lock(&hwpt_paging->ioas->mutex); + list_del(&hwpt_paging->hwpt_item); + mutex_unlock(&hwpt_paging->ioas->mutex); - iopt_table_remove_domain(&hwpt->ioas->iopt, hwpt->domain); + iopt_table_remove_domain(&hwpt_paging->ioas->iopt, + hwpt_paging->common.domain); } - if (hwpt->domain) - iommu_domain_free(hwpt->domain); + if (hwpt_paging->common.domain) + iommu_domain_free(hwpt_paging->common.domain); - refcount_dec(&hwpt->ioas->obj.users); + refcount_dec(&hwpt_paging->ioas->obj.users); } -void iommufd_hw_pagetable_abort(struct iommufd_object *obj) +void iommufd_hwpt_paging_abort(struct iommufd_object *obj) { - struct iommufd_hw_pagetable *hwpt = - container_of(obj, struct iommufd_hw_pagetable, obj); + struct iommufd_hwpt_paging *hwpt_paging = + container_of(obj, struct iommufd_hwpt_paging, common.obj); /* The ioas->mutex must be held until finalize is called. */ - lockdep_assert_held(&hwpt->ioas->mutex); + lockdep_assert_held(&hwpt_paging->ioas->mutex); - if (!list_empty(&hwpt->hwpt_item)) { - list_del_init(&hwpt->hwpt_item); - iopt_table_remove_domain(&hwpt->ioas->iopt, hwpt->domain); + if (!list_empty(&hwpt_paging->hwpt_item)) { + list_del_init(&hwpt_paging->hwpt_item); + iopt_table_remove_domain(&hwpt_paging->ioas->iopt, + hwpt_paging->common.domain); } - iommufd_hw_pagetable_destroy(obj); + iommufd_hwpt_paging_destroy(obj); } -static int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) +static int +iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging) { - if (hwpt->enforce_cache_coherency) + struct iommu_domain *paging_domain = hwpt_paging->common.domain; + + if (hwpt_paging->enforce_cache_coherency) return 0; - if (hwpt->domain->ops->enforce_cache_coherency) - hwpt->enforce_cache_coherency = - hwpt->domain->ops->enforce_cache_coherency( - hwpt->domain); - if (!hwpt->enforce_cache_coherency) + if (paging_domain->ops->enforce_cache_coherency) + hwpt_paging->enforce_cache_coherency = + paging_domain->ops->enforce_cache_coherency( + paging_domain); + if (!hwpt_paging->enforce_cache_coherency) return -EINVAL; return 0; } /** - * iommufd_hw_pagetable_alloc() - Get an iommu_domain for a device + * iommufd_hwpt_paging_alloc() - Get a PAGING iommu_domain for a device * @ictx: iommufd context * @ioas: IOAS to associate the domain with * @idev: Device to get an iommu_domain for @@ -72,12 +77,13 @@ static int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) * iommufd_object_abort_and_destroy() or iommufd_object_finalize() is called on * the returned hwpt. */ -struct iommufd_hw_pagetable * -iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, - struct iommufd_device *idev, u32 flags, - bool immediate_attach) +struct iommufd_hwpt_paging * +iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, + struct iommufd_device *idev, u32 flags, + bool immediate_attach) { const struct iommu_ops *ops = dev_iommu_ops(idev->dev); + struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_hw_pagetable *hwpt; int rc; @@ -86,14 +92,16 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, if (flags && !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); - hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HWPT_PAGING); - if (IS_ERR(hwpt)) - return hwpt; + hwpt_paging = __iommufd_object_alloc( + ictx, hwpt_paging, IOMMUFD_OBJ_HWPT_PAGING, common.obj); + if (IS_ERR(hwpt_paging)) + return ERR_CAST(hwpt_paging); + hwpt = &hwpt_paging->common; - INIT_LIST_HEAD(&hwpt->hwpt_item); + INIT_LIST_HEAD(&hwpt_paging->hwpt_item); /* Pairs with iommufd_hw_pagetable_destroy() */ refcount_inc(&ioas->obj.users); - hwpt->ioas = ioas; + hwpt_paging->ioas = ioas; if (ops->domain_alloc_user) { hwpt->domain = ops->domain_alloc_user(idev->dev, flags); @@ -125,7 +133,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, * allocate a separate HWPT (non-CC). */ if (idev->enforce_cache_coherency) { - rc = iommufd_hw_pagetable_enforce_cc(hwpt); + rc = iommufd_hwpt_paging_enforce_cc(hwpt_paging); if (WARN_ON(rc)) goto out_abort; } @@ -142,11 +150,11 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, goto out_abort; } - rc = iopt_table_add_domain(&hwpt->ioas->iopt, hwpt->domain); + rc = iopt_table_add_domain(&ioas->iopt, hwpt->domain); if (rc) goto out_detach; - list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list); - return hwpt; + list_add_tail(&hwpt_paging->hwpt_item, &ioas->hwpt_list); + return hwpt_paging; out_detach: if (immediate_attach) @@ -159,6 +167,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) { struct iommu_hwpt_alloc *cmd = ucmd->cmd; + struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_hw_pagetable *hwpt; struct iommufd_device *idev; struct iommufd_ioas *ioas; @@ -180,12 +189,13 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) } mutex_lock(&ioas->mutex); - hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, - idev, cmd->flags, false); - if (IS_ERR(hwpt)) { - rc = PTR_ERR(hwpt); + hwpt_paging = iommufd_hwpt_paging_alloc(ucmd->ictx, ioas, idev, + cmd->flags, false); + if (IS_ERR(hwpt_paging)) { + rc = PTR_ERR(hwpt_paging); goto out_unlock; } + hwpt = &hwpt_paging->common; cmd->out_hwpt_id = hwpt->obj.id; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); @@ -207,7 +217,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd) { struct iommu_hwpt_set_dirty_tracking *cmd = ucmd->cmd; - struct iommufd_hw_pagetable *hwpt; + struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_ioas *ioas; int rc = -EOPNOTSUPP; bool enable; @@ -215,23 +225,24 @@ int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd) if (cmd->flags & ~IOMMU_HWPT_DIRTY_TRACKING_ENABLE) return rc; - hwpt = iommufd_get_hwpt(ucmd, cmd->hwpt_id); - if (IS_ERR(hwpt)) - return PTR_ERR(hwpt); + hwpt_paging = iommufd_get_hwpt_paging(ucmd, cmd->hwpt_id); + if (IS_ERR(hwpt_paging)) + return PTR_ERR(hwpt_paging); - ioas = hwpt->ioas; + ioas = hwpt_paging->ioas; enable = cmd->flags & IOMMU_HWPT_DIRTY_TRACKING_ENABLE; - rc = iopt_set_dirty_tracking(&ioas->iopt, hwpt->domain, enable); + rc = iopt_set_dirty_tracking(&ioas->iopt, hwpt_paging->common.domain, + enable); - iommufd_put_object(&hwpt->obj); + iommufd_put_object(&hwpt_paging->common.obj); return rc; } int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd) { struct iommu_hwpt_get_dirty_bitmap *cmd = ucmd->cmd; - struct iommufd_hw_pagetable *hwpt; + struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_ioas *ioas; int rc = -EOPNOTSUPP; @@ -239,14 +250,14 @@ int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd) cmd->__reserved) return -EOPNOTSUPP; - hwpt = iommufd_get_hwpt(ucmd, cmd->hwpt_id); - if (IS_ERR(hwpt)) - return PTR_ERR(hwpt); + hwpt_paging = iommufd_get_hwpt_paging(ucmd, cmd->hwpt_id); + if (IS_ERR(hwpt_paging)) + return PTR_ERR(hwpt_paging); - ioas = hwpt->ioas; - rc = iopt_read_and_clear_dirty_data(&ioas->iopt, hwpt->domain, - cmd->flags, cmd); + ioas = hwpt_paging->ioas; + rc = iopt_read_and_clear_dirty_data( + &ioas->iopt, hwpt_paging->common.domain, cmd->flags, cmd); - iommufd_put_object(&hwpt->obj); + iommufd_put_object(&hwpt_paging->common.obj); return rc; } diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 776dd41c077f0..477f80888a0d9 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -181,7 +181,7 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, enum iommufd_object_type type); -#define iommufd_object_alloc(ictx, ptr, type) \ +#define __iommufd_object_alloc(ictx, ptr, type, obj) \ container_of(_iommufd_object_alloc( \ ictx, \ sizeof(*(ptr)) + BUILD_BUG_ON_ZERO( \ @@ -190,6 +190,9 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, type), \ typeof(*(ptr)), obj) +#define iommufd_object_alloc(ictx, ptr, type) \ + __iommufd_object_alloc(ictx, ptr, type, obj) + /* * The IO Address Space (IOAS) pagetable is a virtual page table backed by the * io_pagetable object. It is a user controlled mapping of IOVA -> PFNs. The @@ -243,8 +246,12 @@ int iommufd_check_iova_range(struct io_pagetable *iopt, */ struct iommufd_hw_pagetable { struct iommufd_object obj; - struct iommufd_ioas *ioas; struct iommu_domain *domain; +}; + +struct iommufd_hwpt_paging { + struct iommufd_hw_pagetable common; + struct iommufd_ioas *ioas; bool auto_domain : 1; bool enforce_cache_coherency : 1; bool msi_cookie : 1; @@ -257,33 +264,42 @@ static inline bool hwpt_is_paging(struct iommufd_hw_pagetable *hwpt) return hwpt->obj.type == IOMMUFD_OBJ_HWPT_PAGING; } -static inline struct iommufd_hw_pagetable * -iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id) +static inline struct iommufd_hwpt_paging * +to_hwpt_paging(struct iommufd_hw_pagetable *hwpt) +{ + return container_of(hwpt, struct iommufd_hwpt_paging, common); +} + +static inline struct iommufd_hwpt_paging * +iommufd_get_hwpt_paging(struct iommufd_ucmd *ucmd, u32 id) { return container_of(iommufd_get_object(ucmd->ictx, id, IOMMUFD_OBJ_HWPT_PAGING), - struct iommufd_hw_pagetable, obj); + struct iommufd_hwpt_paging, common.obj); } int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd); int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd); -struct iommufd_hw_pagetable * -iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, - struct iommufd_device *idev, u32 flags, - bool immediate_attach); +struct iommufd_hwpt_paging * +iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, + struct iommufd_device *idev, u32 flags, + bool immediate_attach); +int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt); int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev); struct iommufd_hw_pagetable * iommufd_hw_pagetable_detach(struct iommufd_device *idev); -void iommufd_hw_pagetable_destroy(struct iommufd_object *obj); -void iommufd_hw_pagetable_abort(struct iommufd_object *obj); +void iommufd_hwpt_paging_destroy(struct iommufd_object *obj); +void iommufd_hwpt_paging_abort(struct iommufd_object *obj); int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd); static inline void iommufd_hw_pagetable_put(struct iommufd_ctx *ictx, struct iommufd_hw_pagetable *hwpt) { - lockdep_assert_not_held(&hwpt->ioas->mutex); - if (hwpt->auto_domain) + struct iommufd_hwpt_paging *hwpt_paging = to_hwpt_paging(hwpt); + + lockdep_assert_not_held(&hwpt_paging->ioas->mutex); + if (hwpt_paging->auto_domain) iommufd_object_deref_user(ictx, &hwpt->obj); else refcount_dec(&hwpt->obj.users); diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 46198e8948d68..ab6675a7f6b4c 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -489,8 +489,8 @@ static const struct iommufd_object_ops iommufd_object_ops[] = { .destroy = iommufd_ioas_destroy, }, [IOMMUFD_OBJ_HWPT_PAGING] = { - .destroy = iommufd_hw_pagetable_destroy, - .abort = iommufd_hw_pagetable_abort, + .destroy = iommufd_hwpt_paging_destroy, + .abort = iommufd_hwpt_paging_abort, }, #ifdef CONFIG_IOMMUFD_TEST [IOMMUFD_OBJ_SELFTEST] = { diff --git a/drivers/iommu/iommufd/vfio_compat.c b/drivers/iommu/iommufd/vfio_compat.c index 6c810bf80f99a..538fbf76354d1 100644 --- a/drivers/iommu/iommufd/vfio_compat.c +++ b/drivers/iommu/iommufd/vfio_compat.c @@ -255,7 +255,7 @@ static int iommufd_vfio_unmap_dma(struct iommufd_ctx *ictx, unsigned int cmd, static int iommufd_vfio_cc_iommu(struct iommufd_ctx *ictx) { - struct iommufd_hw_pagetable *hwpt; + struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_ioas *ioas; int rc = 1; @@ -264,8 +264,8 @@ static int iommufd_vfio_cc_iommu(struct iommufd_ctx *ictx) return PTR_ERR(ioas); mutex_lock(&ioas->mutex); - list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) { - if (!hwpt->enforce_cache_coherency) { + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { + if (!hwpt_paging->enforce_cache_coherency) { rc = 0; break; } From 093ebdc1a7974e240d9c580b8cd5ae8cf140339b Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 25 Oct 2023 21:39:33 -0700 Subject: [PATCH 398/548] iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED Allow iommufd_hwpt_alloc() to have a common routine but jump to different allocators corresponding to different user input pt_obj types, either an IOMMUFD_OBJ_IOAS for a PAGING hwpt or an IOMMUFD_OBJ_HWPT_PAGING as the parent for a NESTED hwpt. Also, move the "flags" validation to the hwpt allocator (paging), so that later the hwpt_nested allocator can do its own separate flags validation. Link: https://lore.kernel.org/r/20231026043938.63898-6-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit b5021cb264e67baf051569a41debe277c279952b) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 46 ++++++++++++++++++---------- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 39b8b625b48d4..6bce9af0cb8d6 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -82,6 +82,8 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, u32 flags, bool immediate_attach) { + const u32 valid_flags = IOMMU_HWPT_ALLOC_NEST_PARENT | + IOMMU_HWPT_ALLOC_DIRTY_TRACKING; const struct iommu_ops *ops = dev_iommu_ops(idev->dev); struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_hw_pagetable *hwpt; @@ -91,6 +93,8 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, if (flags && !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); + if (flags & ~valid_flags) + return ERR_PTR(-EOPNOTSUPP); hwpt_paging = __iommufd_object_alloc( ictx, hwpt_paging, IOMMUFD_OBJ_HWPT_PAGING, common.obj); @@ -167,35 +171,41 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) { struct iommu_hwpt_alloc *cmd = ucmd->cmd; - struct iommufd_hwpt_paging *hwpt_paging; struct iommufd_hw_pagetable *hwpt; + struct iommufd_ioas *ioas = NULL; + struct iommufd_object *pt_obj; struct iommufd_device *idev; - struct iommufd_ioas *ioas; int rc; - if ((cmd->flags & ~(IOMMU_HWPT_ALLOC_NEST_PARENT | - IOMMU_HWPT_ALLOC_DIRTY_TRACKING)) || - cmd->__reserved) + if (cmd->__reserved) return -EOPNOTSUPP; idev = iommufd_get_device(ucmd, cmd->dev_id); if (IS_ERR(idev)) return PTR_ERR(idev); - ioas = iommufd_get_ioas(ucmd->ictx, cmd->pt_id); - if (IS_ERR(ioas)) { - rc = PTR_ERR(ioas); + pt_obj = iommufd_get_object(ucmd->ictx, cmd->pt_id, IOMMUFD_OBJ_ANY); + if (IS_ERR(pt_obj)) { + rc = -EINVAL; goto out_put_idev; } - mutex_lock(&ioas->mutex); - hwpt_paging = iommufd_hwpt_paging_alloc(ucmd->ictx, ioas, idev, - cmd->flags, false); - if (IS_ERR(hwpt_paging)) { - rc = PTR_ERR(hwpt_paging); - goto out_unlock; + if (pt_obj->type == IOMMUFD_OBJ_IOAS) { + struct iommufd_hwpt_paging *hwpt_paging; + + ioas = container_of(pt_obj, struct iommufd_ioas, obj); + mutex_lock(&ioas->mutex); + hwpt_paging = iommufd_hwpt_paging_alloc(ucmd->ictx, ioas, idev, + cmd->flags, false); + if (IS_ERR(hwpt_paging)) { + rc = PTR_ERR(hwpt_paging); + goto out_unlock; + } + hwpt = &hwpt_paging->common; + } else { + rc = -EINVAL; + goto out_put_pt; } - hwpt = &hwpt_paging->common; cmd->out_hwpt_id = hwpt->obj.id; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); @@ -207,8 +217,10 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) out_hwpt: iommufd_object_abort_and_destroy(ucmd->ictx, &hwpt->obj); out_unlock: - mutex_unlock(&ioas->mutex); - iommufd_put_object(&ioas->obj); + if (ioas) + mutex_unlock(&ioas->mutex); +out_put_pt: + iommufd_put_object(pt_obj); out_put_idev: iommufd_put_object(&idev->obj); return rc; From d6dd1fc4881a687b45b36794a5e9e16e048dd9ba Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Wed, 25 Oct 2023 21:39:34 -0700 Subject: [PATCH 399/548] iommu: Pass in parent domain with user_data to domain_alloc_user op domain_alloc_user op already accepts user flags for domain allocation, add a parent domain pointer and a driver specific user data support as well. The user data would be tagged with a type for iommu drivers to add their own driver specific user data per hw_pagetable. Add a struct iommu_user_data as a bundle of data_ptr/data_len/type from an iommufd core uAPI structure. Make the user data opaque to the core, since a userspace driver must match the kernel driver. In the future, if drivers share some common parameter, there would be a generic parameter as well. Link: https://lore.kernel.org/r/20231026043938.63898-7-yi.l.liu@intel.com Signed-off-by: Lu Baolu Co-developed-by: Nicolin Chen Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit 2bdabb8e82f564d19eeeb7c83e6b2467af0707cb) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 7 ++++++- drivers/iommu/iommufd/hw_pagetable.c | 3 ++- drivers/iommu/iommufd/selftest.c | 7 ++++++- include/linux/iommu.h | 27 ++++++++++++++++++++++++--- 4 files changed, 38 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index b3ee204eeef55..679d687e62c90 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4077,7 +4077,9 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) } static struct iommu_domain * -intel_iommu_domain_alloc_user(struct device *dev, u32 flags) +intel_iommu_domain_alloc_user(struct device *dev, u32 flags, + struct iommu_domain *parent, + const struct iommu_user_data *user_data) { struct iommu_domain *domain; struct intel_iommu *iommu; @@ -4087,6 +4089,9 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags) (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) return ERR_PTR(-EOPNOTSUPP); + if (parent || user_data) + return ERR_PTR(-EOPNOTSUPP); + iommu = device_to_iommu(dev, NULL, NULL); if (!iommu) return ERR_PTR(-ENODEV); diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 6bce9af0cb8d6..198ecbd536f7b 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -108,7 +108,8 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, hwpt_paging->ioas = ioas; if (ops->domain_alloc_user) { - hwpt->domain = ops->domain_alloc_user(idev->dev, flags); + hwpt->domain = + ops->domain_alloc_user(idev->dev, flags, NULL, NULL); if (IS_ERR(hwpt->domain)) { rc = PTR_ERR(hwpt->domain); hwpt->domain = NULL; diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 6f00b4f9f1465..0a903270fb7b3 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -277,7 +277,9 @@ static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) } static struct iommu_domain * -mock_domain_alloc_user(struct device *dev, u32 flags) +mock_domain_alloc_user(struct device *dev, u32 flags, + struct iommu_domain *parent, + const struct iommu_user_data *user_data) { struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); struct iommu_domain *domain; @@ -286,6 +288,9 @@ mock_domain_alloc_user(struct device *dev, u32 flags) (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) return ERR_PTR(-EOPNOTSUPP); + if (parent || user_data) + return ERR_PTR(-EOPNOTSUPP); + if ((flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) && (mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY)) return ERR_PTR(-EOPNOTSUPP); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ebb220afd0594..9471dec680d35 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -269,6 +269,21 @@ struct iommu_dirty_ops { struct iommu_dirty_bitmap *dirty); }; +/** + * struct iommu_user_data - iommu driver specific user space data info + * @type: The data type of the user buffer + * @uptr: Pointer to the user buffer for copy_from_user() + * @len: The length of the user buffer in bytes + * + * A user space data is an uAPI that is defined in include/uapi/linux/iommufd.h + * @type, @uptr and @len should be just copied from an iommufd core uAPI struct. + */ +struct iommu_user_data { + unsigned int type; + void __user *uptr; + size_t len; +}; + /** * struct iommu_ops - iommu ops and capabilities * @capable: check capability @@ -283,8 +298,12 @@ struct iommu_dirty_ops { * parameters as defined in include/uapi/linux/iommufd.h. * Unlike @domain_alloc, it is called only by IOMMUFD and * must fully initialize the new domain before return. - * Upon success, a domain is returned. Upon failure, - * ERR_PTR must be returned. + * Upon success, if the @user_data is valid and the @parent + * points to a kernel-managed domain, the new domain must be + * IOMMU_DOMAIN_NESTED type; otherwise, the @parent must be + * NULL while the @user_data can be optionally provided, the + * new domain must support __IOMMU_DOMAIN_PAGING. + * Upon failure, ERR_PTR must be returned. * @probe_device: Add device to iommu driver handling * @release_device: Remove device from iommu driver handling * @probe_finalize: Do final setup work after the device is added to an IOMMU @@ -322,7 +341,9 @@ struct iommu_ops { /* Domain allocation and freeing by the iommu driver */ struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type); - struct iommu_domain *(*domain_alloc_user)(struct device *dev, u32 flags); + struct iommu_domain *(*domain_alloc_user)( + struct device *dev, u32 flags, struct iommu_domain *parent, + const struct iommu_user_data *user_data); struct iommu_device *(*probe_device)(struct device *dev); void (*release_device)(struct device *dev); From 7562edef7c7b83be88fa5616028f15b5c9c3ffd4 Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 25 Oct 2023 21:39:35 -0700 Subject: [PATCH 400/548] iommufd: Add a nested HW pagetable object IOMMU_HWPT_ALLOC already supports iommu_domain allocation for usersapce. But it can only allocate a hw_pagetable that associates to a given IOAS, i.e. only a kernel-managed hw_pagetable of IOMMUFD_OBJ_HWPT_PAGING type. IOMMU drivers can now support user-managed hw_pagetables, for two-stage translation use cases that require user data input from the user space. Add a new IOMMUFD_OBJ_HWPT_NESTED type with its abort/destroy(). Pair it with a new iommufd_hwpt_nested structure and its to_hwpt_nested() helper. Update the to_hwpt_paging() helper, so a NESTED-type hw_pagetable can be handled in the callers, for example iommufd_hw_pagetable_enforce_rr(). Screen the inputs including the parent PAGING-type hw_pagetable that has a need of a new nest_parent flag in the iommufd_hwpt_paging structure. Extend the IOMMU_HWPT_ALLOC ioctl to accept an IOMMU driver specific data input which is tagged by the enum iommu_hwpt_data_type. Also, update the @pt_id to accept hwpt_id too besides an ioas_id. Then, use them to allocate a hw_pagetable of IOMMUFD_OBJ_HWPT_NESTED type using the iommufd_hw_pagetable_alloc_nested() allocator. Link: https://lore.kernel.org/r/20231026043938.63898-8-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Co-developed-by: Yi Liu Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit bd529dbb661d62bd9f03e44c9fc837d98a190499) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 3 +- drivers/iommu/iommufd/hw_pagetable.c | 109 ++++++++++++++++++++++-- drivers/iommu/iommufd/iommufd_private.h | 28 ++++-- drivers/iommu/iommufd/main.c | 4 + include/uapi/linux/iommufd.h | 31 ++++++- 5 files changed, 159 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 24396a53ce06f..dbcb027857b6b 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -604,7 +604,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, } hwpt_paging = iommufd_hwpt_paging_alloc(idev->ictx, ioas, idev, 0, - immediate_attach); + immediate_attach, NULL); if (IS_ERR(hwpt_paging)) { destroy_hwpt = ERR_CAST(hwpt_paging); goto out_unlock; @@ -644,6 +644,7 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, return PTR_ERR(pt_obj); switch (pt_obj->type) { + case IOMMUFD_OBJ_HWPT_NESTED: case IOMMUFD_OBJ_HWPT_PAGING: { struct iommufd_hw_pagetable *hwpt = container_of(pt_obj, struct iommufd_hw_pagetable, obj); diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 198ecbd536f7b..2abbeafdbd22d 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -44,6 +44,22 @@ void iommufd_hwpt_paging_abort(struct iommufd_object *obj) iommufd_hwpt_paging_destroy(obj); } +void iommufd_hwpt_nested_destroy(struct iommufd_object *obj) +{ + struct iommufd_hwpt_nested *hwpt_nested = + container_of(obj, struct iommufd_hwpt_nested, common.obj); + + if (hwpt_nested->common.domain) + iommu_domain_free(hwpt_nested->common.domain); + + refcount_dec(&hwpt_nested->parent->common.obj.users); +} + +void iommufd_hwpt_nested_abort(struct iommufd_object *obj) +{ + iommufd_hwpt_nested_destroy(obj); +} + static int iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging) { @@ -68,6 +84,8 @@ iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging) * @idev: Device to get an iommu_domain for * @flags: Flags from userspace * @immediate_attach: True if idev should be attached to the hwpt + * @user_data: The user provided driver specific data describing the domain to + * create * * Allocate a new iommu_domain and return it as a hw_pagetable. The HWPT * will be linked to the given ioas and upon return the underlying iommu_domain @@ -80,7 +98,8 @@ iommufd_hwpt_paging_enforce_cc(struct iommufd_hwpt_paging *hwpt_paging) struct iommufd_hwpt_paging * iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, u32 flags, - bool immediate_attach) + bool immediate_attach, + const struct iommu_user_data *user_data) { const u32 valid_flags = IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING; @@ -91,7 +110,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, lockdep_assert_held(&ioas->mutex); - if (flags && !ops->domain_alloc_user) + if ((flags || user_data) && !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); if (flags & ~valid_flags) return ERR_PTR(-EOPNOTSUPP); @@ -106,10 +125,11 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, /* Pairs with iommufd_hw_pagetable_destroy() */ refcount_inc(&ioas->obj.users); hwpt_paging->ioas = ioas; + hwpt_paging->nest_parent = flags & IOMMU_HWPT_ALLOC_NEST_PARENT; if (ops->domain_alloc_user) { - hwpt->domain = - ops->domain_alloc_user(idev->dev, flags, NULL, NULL); + hwpt->domain = ops->domain_alloc_user(idev->dev, flags, NULL, + user_data); if (IS_ERR(hwpt->domain)) { rc = PTR_ERR(hwpt->domain); hwpt->domain = NULL; @@ -169,9 +189,70 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, return ERR_PTR(rc); } +/** + * iommufd_hwpt_nested_alloc() - Get a NESTED iommu_domain for a device + * @ictx: iommufd context + * @parent: Parent PAGING-type hwpt to associate the domain with + * @idev: Device to get an iommu_domain for + * @flags: Flags from userspace + * @user_data: user_data pointer. Must be valid + * + * Allocate a new iommu_domain (must be IOMMU_DOMAIN_NESTED) and return it as + * a NESTED hw_pagetable. The given parent PAGING-type hwpt must be capable of + * being a parent. + */ +static struct iommufd_hwpt_nested * +iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, + struct iommufd_hwpt_paging *parent, + struct iommufd_device *idev, u32 flags, + const struct iommu_user_data *user_data) +{ + const struct iommu_ops *ops = dev_iommu_ops(idev->dev); + struct iommufd_hwpt_nested *hwpt_nested; + struct iommufd_hw_pagetable *hwpt; + int rc; + + if (flags || !user_data->len || !ops->domain_alloc_user) + return ERR_PTR(-EOPNOTSUPP); + if (parent->auto_domain || !parent->nest_parent) + return ERR_PTR(-EINVAL); + + hwpt_nested = __iommufd_object_alloc( + ictx, hwpt_nested, IOMMUFD_OBJ_HWPT_NESTED, common.obj); + if (IS_ERR(hwpt_nested)) + return ERR_CAST(hwpt_nested); + hwpt = &hwpt_nested->common; + + refcount_inc(&parent->common.obj.users); + hwpt_nested->parent = parent; + + hwpt->domain = ops->domain_alloc_user(idev->dev, flags, + parent->common.domain, user_data); + if (IS_ERR(hwpt->domain)) { + rc = PTR_ERR(hwpt->domain); + hwpt->domain = NULL; + goto out_abort; + } + + if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) { + rc = -EINVAL; + goto out_abort; + } + return hwpt_nested; + +out_abort: + iommufd_object_abort_and_destroy(ictx, &hwpt->obj); + return ERR_PTR(rc); +} + int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) { struct iommu_hwpt_alloc *cmd = ucmd->cmd; + const struct iommu_user_data user_data = { + .type = cmd->data_type, + .uptr = u64_to_user_ptr(cmd->data_uptr), + .len = cmd->data_len, + }; struct iommufd_hw_pagetable *hwpt; struct iommufd_ioas *ioas = NULL; struct iommufd_object *pt_obj; @@ -180,6 +261,8 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) if (cmd->__reserved) return -EOPNOTSUPP; + if (cmd->data_type == IOMMU_HWPT_DATA_NONE && cmd->data_len) + return -EINVAL; idev = iommufd_get_device(ucmd, cmd->dev_id); if (IS_ERR(idev)) @@ -196,13 +279,27 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) ioas = container_of(pt_obj, struct iommufd_ioas, obj); mutex_lock(&ioas->mutex); - hwpt_paging = iommufd_hwpt_paging_alloc(ucmd->ictx, ioas, idev, - cmd->flags, false); + hwpt_paging = iommufd_hwpt_paging_alloc( + ucmd->ictx, ioas, idev, cmd->flags, false, + user_data.len ? &user_data : NULL); if (IS_ERR(hwpt_paging)) { rc = PTR_ERR(hwpt_paging); goto out_unlock; } hwpt = &hwpt_paging->common; + } else if (pt_obj->type == IOMMUFD_OBJ_HWPT_PAGING) { + struct iommufd_hwpt_nested *hwpt_nested; + + hwpt_nested = iommufd_hwpt_nested_alloc( + ucmd->ictx, + container_of(pt_obj, struct iommufd_hwpt_paging, + common.obj), + idev, cmd->flags, &user_data); + if (IS_ERR(hwpt_nested)) { + rc = PTR_ERR(hwpt_nested); + goto out_unlock; + } + hwpt = &hwpt_nested->common; } else { rc = -EINVAL; goto out_put_pt; diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 477f80888a0d9..12622a42f19d0 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -124,6 +124,7 @@ enum iommufd_object_type { IOMMUFD_OBJ_ANY = IOMMUFD_OBJ_NONE, IOMMUFD_OBJ_DEVICE, IOMMUFD_OBJ_HWPT_PAGING, + IOMMUFD_OBJ_HWPT_NESTED, IOMMUFD_OBJ_IOAS, IOMMUFD_OBJ_ACCESS, #ifdef CONFIG_IOMMUFD_TEST @@ -255,10 +256,16 @@ struct iommufd_hwpt_paging { bool auto_domain : 1; bool enforce_cache_coherency : 1; bool msi_cookie : 1; + bool nest_parent : 1; /* Head at iommufd_ioas::hwpt_list */ struct list_head hwpt_item; }; +struct iommufd_hwpt_nested { + struct iommufd_hw_pagetable common; + struct iommufd_hwpt_paging *parent; +}; + static inline bool hwpt_is_paging(struct iommufd_hw_pagetable *hwpt) { return hwpt->obj.type == IOMMUFD_OBJ_HWPT_PAGING; @@ -283,7 +290,8 @@ int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd); struct iommufd_hwpt_paging * iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, u32 flags, - bool immediate_attach); + bool immediate_attach, + const struct iommu_user_data *user_data); int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt); int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev); @@ -291,18 +299,24 @@ struct iommufd_hw_pagetable * iommufd_hw_pagetable_detach(struct iommufd_device *idev); void iommufd_hwpt_paging_destroy(struct iommufd_object *obj); void iommufd_hwpt_paging_abort(struct iommufd_object *obj); +void iommufd_hwpt_nested_destroy(struct iommufd_object *obj); +void iommufd_hwpt_nested_abort(struct iommufd_object *obj); int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd); static inline void iommufd_hw_pagetable_put(struct iommufd_ctx *ictx, struct iommufd_hw_pagetable *hwpt) { - struct iommufd_hwpt_paging *hwpt_paging = to_hwpt_paging(hwpt); + if (hwpt->obj.type == IOMMUFD_OBJ_HWPT_PAGING) { + struct iommufd_hwpt_paging *hwpt_paging = to_hwpt_paging(hwpt); - lockdep_assert_not_held(&hwpt_paging->ioas->mutex); - if (hwpt_paging->auto_domain) - iommufd_object_deref_user(ictx, &hwpt->obj); - else - refcount_dec(&hwpt->obj.users); + lockdep_assert_not_held(&hwpt_paging->ioas->mutex); + + if (hwpt_paging->auto_domain) { + iommufd_object_deref_user(ictx, &hwpt->obj); + return; + } + } + refcount_dec(&hwpt->obj.users); } struct iommufd_group { diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index ab6675a7f6b4c..45b9d40773b13 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -492,6 +492,10 @@ static const struct iommufd_object_ops iommufd_object_ops[] = { .destroy = iommufd_hwpt_paging_destroy, .abort = iommufd_hwpt_paging_abort, }, + [IOMMUFD_OBJ_HWPT_NESTED] = { + .destroy = iommufd_hwpt_nested_destroy, + .abort = iommufd_hwpt_nested_abort, + }, #ifdef CONFIG_IOMMUFD_TEST [IOMMUFD_OBJ_SELFTEST] = { .destroy = iommufd_selftest_destroy, diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index c44eecf5d318e..d816deac906fd 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -361,20 +361,44 @@ enum iommufd_hwpt_alloc_flags { IOMMU_HWPT_ALLOC_DIRTY_TRACKING = 1 << 1, }; +/** + * enum iommu_hwpt_data_type - IOMMU HWPT Data Type + * @IOMMU_HWPT_DATA_NONE: no data + */ +enum iommu_hwpt_data_type { + IOMMU_HWPT_DATA_NONE, +}; + /** * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC) * @size: sizeof(struct iommu_hwpt_alloc) * @flags: Combination of enum iommufd_hwpt_alloc_flags * @dev_id: The device to allocate this HWPT for - * @pt_id: The IOAS to connect this HWPT to + * @pt_id: The IOAS or HWPT to connect this HWPT to * @out_hwpt_id: The ID of the new HWPT * @__reserved: Must be 0 + * @data_type: One of enum iommu_hwpt_data_type + * @data_len: Length of the type specific data + * @data_uptr: User pointer to the type specific data * * Explicitly allocate a hardware page table object. This is the same object * type that is returned by iommufd_device_attach() and represents the * underlying iommu driver's iommu_domain kernel object. * - * A HWPT will be created with the IOVA mappings from the given IOAS. + * A kernel-managed HWPT will be created with the mappings from the given + * IOAS via the @pt_id. The @data_type for this allocation must be set to + * IOMMU_HWPT_DATA_NONE. The HWPT can be allocated as a parent HWPT for a + * nesting configuration by passing IOMMU_HWPT_ALLOC_NEST_PARENT via @flags. + * + * A user-managed nested HWPT will be created from a given parent HWPT via + * @pt_id, in which the parent HWPT must be allocated previously via the + * same ioctl from a given IOAS (@pt_id). In this case, the @data_type + * must be set to a pre-defined type corresponding to an I/O page table + * type supported by the underlying IOMMU hardware. + * + * If the @data_type is set to IOMMU_HWPT_DATA_NONE, @data_len and + * @data_uptr should be zero. Otherwise, both @data_len and @data_uptr + * must be given. */ struct iommu_hwpt_alloc { __u32 size; @@ -383,6 +407,9 @@ struct iommu_hwpt_alloc { __u32 pt_id; __u32 out_hwpt_id; __u32 __reserved; + __u32 data_type; + __u32 data_len; + __aligned_u64 data_uptr; }; #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC) From a0e6ab895dff681624e112e65e32d333b0524349 Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 25 Oct 2023 21:39:36 -0700 Subject: [PATCH 401/548] iommu: Add iommu_copy_struct_from_user helper Wrap up the data type/pointer/len sanity and a copy_struct_from_user call for iommu drivers to copy driver specific data via struct iommu_user_data. And expect it to be used in the domain_alloc_user op for example. Link: https://lore.kernel.org/r/20231026043938.63898-9-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Co-developed-by: Yi Liu Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe (cherry picked from commit e9d36c07bb787840e4813fb09a929a17d522a69f) Signed-off-by: Koba Ko --- include/linux/iommu.h | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 9471dec680d35..62f002d11e8e4 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -284,6 +284,46 @@ struct iommu_user_data { size_t len; }; +/** + * __iommu_copy_struct_from_user - Copy iommu driver specific user space data + * @dst_data: Pointer to an iommu driver specific user data that is defined in + * include/uapi/linux/iommufd.h + * @src_data: Pointer to a struct iommu_user_data for user space data info + * @data_type: The data type of the @dst_data. Must match with @src_data.type + * @data_len: Length of current user data structure, i.e. sizeof(struct _dst) + * @min_len: Initial length of user data structure for backward compatibility. + * This should be offsetofend using the last member in the user data + * struct that was initially added to include/uapi/linux/iommufd.h + */ +static inline int __iommu_copy_struct_from_user( + void *dst_data, const struct iommu_user_data *src_data, + unsigned int data_type, size_t data_len, size_t min_len) +{ + if (src_data->type != data_type) + return -EINVAL; + if (WARN_ON(!dst_data || !src_data)) + return -EINVAL; + if (src_data->len < min_len || data_len < src_data->len) + return -EINVAL; + return copy_struct_from_user(dst_data, data_len, src_data->uptr, + src_data->len); +} + +/** + * iommu_copy_struct_from_user - Copy iommu driver specific user space data + * @kdst: Pointer to an iommu driver specific user data that is defined in + * include/uapi/linux/iommufd.h + * @user_data: Pointer to a struct iommu_user_data for user space data info + * @data_type: The data type of the @kdst. Must match with @user_data->type + * @min_last: The last memember of the data structure @kdst points in the + * initial version. + * Return 0 for success, otherwise -error. + */ +#define iommu_copy_struct_from_user(kdst, user_data, data_type, min_last) \ + __iommu_copy_struct_from_user(kdst, user_data, data_type, \ + sizeof(*kdst), \ + offsetofend(typeof(*kdst), min_last)) + /** * struct iommu_ops - iommu ops and capabilities * @capable: check capability From 95fee8bf1065fd0496ffc37a38bbe66a65e05c24 Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 25 Oct 2023 21:39:37 -0700 Subject: [PATCH 402/548] iommufd/selftest: Add nested domain allocation for mock domain Add nested domain support in the ->domain_alloc_user op with some proper sanity checks. Then, add a domain_nested_ops for all nested domains and split the get_md_pagetable helper into paging and nested helpers. Also, add an iotlb as a testing property of a nested domain. Link: https://lore.kernel.org/r/20231026043938.63898-10-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 65fe32f7a4472e19331a524b9c980b3444dd20a2) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iommufd_test.h | 18 ++++ drivers/iommu/iommufd/selftest.c | 152 +++++++++++++++++++++------ 2 files changed, 140 insertions(+), 30 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index 1f2e93d3d4e87..7910fbe1962d7 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -46,6 +46,11 @@ enum { MOCK_FLAGS_DEVICE_NO_DIRTY = 1 << 0, }; +enum { + MOCK_NESTED_DOMAIN_IOTLB_ID_MAX = 3, + MOCK_NESTED_DOMAIN_IOTLB_NUM = 4, +}; + struct iommu_test_cmd { __u32 size; __u32 op; @@ -130,4 +135,17 @@ struct iommu_test_hw_info { __u32 test_reg; }; +/* Should not be equal to any defined value in enum iommu_hwpt_data_type */ +#define IOMMU_HWPT_DATA_SELFTEST 0xdead +#define IOMMU_TEST_IOTLB_DEFAULT 0xbadbeef + +/** + * struct iommu_hwpt_selftest + * + * @iotlb: default mock iotlb value, IOMMU_TEST_IOTLB_DEFAULT + */ +struct iommu_hwpt_selftest { + __u32 iotlb; +}; + #endif diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 0a903270fb7b3..7f588a358306a 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -108,6 +108,12 @@ struct mock_iommu_domain { struct xarray pfns; }; +struct mock_iommu_domain_nested { + struct iommu_domain domain; + struct mock_iommu_domain *parent; + u32 iotlb[MOCK_NESTED_DOMAIN_IOTLB_NUM]; +}; + enum selftest_obj_type { TYPE_IDEV, }; @@ -253,54 +259,99 @@ static const struct iommu_dirty_ops dirty_ops = { }; static const struct iommu_ops mock_ops; +static struct iommu_domain_ops domain_nested_ops; -static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) +static struct iommu_domain * +__mock_domain_alloc_paging(unsigned int iommu_domain_type, bool needs_dirty_ops) { struct mock_iommu_domain *mock; - if (iommu_domain_type == IOMMU_DOMAIN_BLOCKED) - return &mock_blocking_domain; - - if (iommu_domain_type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - mock = kzalloc(sizeof(*mock), GFP_KERNEL); if (!mock) - return NULL; + return ERR_PTR(-ENOMEM); mock->domain.geometry.aperture_start = MOCK_APERTURE_START; mock->domain.geometry.aperture_end = MOCK_APERTURE_LAST; mock->domain.pgsize_bitmap = MOCK_IO_PAGE_SIZE; mock->domain.ops = mock_ops.default_domain_ops; + if (needs_dirty_ops) + mock->domain.dirty_ops = &dirty_ops; mock->domain.type = iommu_domain_type; xa_init(&mock->pfns); return &mock->domain; } +static struct iommu_domain * +__mock_domain_alloc_nested(struct mock_iommu_domain *mock_parent, + const struct iommu_hwpt_selftest *user_cfg) +{ + struct mock_iommu_domain_nested *mock_nested; + int i; + + mock_nested = kzalloc(sizeof(*mock_nested), GFP_KERNEL); + if (!mock_nested) + return ERR_PTR(-ENOMEM); + mock_nested->parent = mock_parent; + mock_nested->domain.ops = &domain_nested_ops; + mock_nested->domain.type = IOMMU_DOMAIN_NESTED; + for (i = 0; i < MOCK_NESTED_DOMAIN_IOTLB_NUM; i++) + mock_nested->iotlb[i] = user_cfg->iotlb; + return &mock_nested->domain; +} + +static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) +{ + struct iommu_domain *domain; + + if (iommu_domain_type == IOMMU_DOMAIN_BLOCKED) + return &mock_blocking_domain; + if (iommu_domain_type != IOMMU_DOMAIN_UNMANAGED) + return NULL; + domain = __mock_domain_alloc_paging(iommu_domain_type, false); + if (IS_ERR(domain)) + domain = NULL; + return domain; +} + static struct iommu_domain * mock_domain_alloc_user(struct device *dev, u32 flags, struct iommu_domain *parent, const struct iommu_user_data *user_data) { - struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); - struct iommu_domain *domain; + struct mock_iommu_domain *mock_parent; + struct iommu_hwpt_selftest user_cfg; + int rc; - if (flags & - (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) - return ERR_PTR(-EOPNOTSUPP); + /* must be mock_domain */ + if (!parent) { + struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); + bool has_dirty_flag = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING; + bool no_dirty_ops = mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY; + + if (flags & (~(IOMMU_HWPT_ALLOC_NEST_PARENT | + IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) + return ERR_PTR(-EOPNOTSUPP); + if (user_data || (has_dirty_flag && no_dirty_ops)) + return ERR_PTR(-EOPNOTSUPP); + return __mock_domain_alloc_paging(IOMMU_DOMAIN_UNMANAGED, + has_dirty_flag); + } - if (parent || user_data) + /* must be mock_domain_nested */ + if (user_data->type != IOMMU_HWPT_DATA_SELFTEST || flags) return ERR_PTR(-EOPNOTSUPP); + if (!parent || parent->ops != mock_ops.default_domain_ops) + return ERR_PTR(-EINVAL); - if ((flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) && - (mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY)) - return ERR_PTR(-EOPNOTSUPP); + mock_parent = container_of(parent, struct mock_iommu_domain, domain); + if (!mock_parent) + return ERR_PTR(-EINVAL); - domain = mock_domain_alloc(IOMMU_DOMAIN_UNMANAGED); - if (domain && !(mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY)) - domain->dirty_ops = &dirty_ops; - if (!domain) - domain = ERR_PTR(-ENOMEM); - return domain; + rc = iommu_copy_struct_from_user(&user_cfg, user_data, + IOMMU_HWPT_DATA_SELFTEST, iotlb); + if (rc) + return ERR_PTR(rc); + + return __mock_domain_alloc_nested(mock_parent, &user_cfg); } static void mock_domain_free(struct iommu_domain *domain) @@ -466,19 +517,41 @@ static const struct iommu_ops mock_ops = { }, }; +static void mock_domain_free_nested(struct iommu_domain *domain) +{ + struct mock_iommu_domain_nested *mock_nested = + container_of(domain, struct mock_iommu_domain_nested, domain); + + kfree(mock_nested); +} + +static struct iommu_domain_ops domain_nested_ops = { + .free = mock_domain_free_nested, + .attach_dev = mock_domain_nop_attach, +}; + static inline struct iommufd_hw_pagetable * -get_md_pagetable(struct iommufd_ucmd *ucmd, u32 mockpt_id, - struct mock_iommu_domain **mock) +__get_md_pagetable(struct iommufd_ucmd *ucmd, u32 mockpt_id, u32 hwpt_type) { - struct iommufd_hw_pagetable *hwpt; struct iommufd_object *obj; - obj = iommufd_get_object(ucmd->ictx, mockpt_id, - IOMMUFD_OBJ_HWPT_PAGING); + obj = iommufd_get_object(ucmd->ictx, mockpt_id, hwpt_type); if (IS_ERR(obj)) return ERR_CAST(obj); - hwpt = container_of(obj, struct iommufd_hw_pagetable, obj); - if (hwpt->domain->ops != mock_ops.default_domain_ops) { + return container_of(obj, struct iommufd_hw_pagetable, obj); +} + +static inline struct iommufd_hw_pagetable * +get_md_pagetable(struct iommufd_ucmd *ucmd, u32 mockpt_id, + struct mock_iommu_domain **mock) +{ + struct iommufd_hw_pagetable *hwpt; + + hwpt = __get_md_pagetable(ucmd, mockpt_id, IOMMUFD_OBJ_HWPT_PAGING); + if (IS_ERR(hwpt)) + return hwpt; + if (hwpt->domain->type != IOMMU_DOMAIN_UNMANAGED || + hwpt->domain->ops != mock_ops.default_domain_ops) { iommufd_put_object(&hwpt->obj); return ERR_PTR(-EINVAL); } @@ -486,6 +559,25 @@ get_md_pagetable(struct iommufd_ucmd *ucmd, u32 mockpt_id, return hwpt; } +static inline struct iommufd_hw_pagetable * +get_md_pagetable_nested(struct iommufd_ucmd *ucmd, u32 mockpt_id, + struct mock_iommu_domain_nested **mock_nested) +{ + struct iommufd_hw_pagetable *hwpt; + + hwpt = __get_md_pagetable(ucmd, mockpt_id, IOMMUFD_OBJ_HWPT_NESTED); + if (IS_ERR(hwpt)) + return hwpt; + if (hwpt->domain->type != IOMMU_DOMAIN_NESTED || + hwpt->domain->ops != &domain_nested_ops) { + iommufd_put_object(&hwpt->obj); + return ERR_PTR(-EINVAL); + } + *mock_nested = container_of(hwpt->domain, + struct mock_iommu_domain_nested, domain); + return hwpt; +} + struct mock_bus_type { struct bus_type bus; struct notifier_block nb; From ea0347c7d3a8035dcf1eb789f06afe1dbef8df7f Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 25 Oct 2023 21:39:38 -0700 Subject: [PATCH 403/548] iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs The IOMMU_HWPT_ALLOC ioctl now supports passing user_data to allocate a user-managed domain for nested HWPTs. Add its coverage for that. Also, update _test_cmd_hwpt_alloc() and add test_cmd/err_hwpt_alloc_nested(). Link: https://lore.kernel.org/r/20231026043938.63898-11-yi.l.liu@intel.com Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 55a01657cbee07d772b1d3cb144f867a326e4673) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd.c | 115 ++++++++++++++++++ .../selftests/iommu/iommufd_fail_nth.c | 3 +- tools/testing/selftests/iommu/iommufd_utils.h | 30 +++-- 3 files changed, 140 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 83b573c40db6c..0b24976409a17 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -262,6 +262,121 @@ TEST_F(iommufd_ioas, ioas_destroy) } } +TEST_F(iommufd_ioas, alloc_hwpt_nested) +{ + const uint32_t min_data_len = + offsetofend(struct iommu_hwpt_selftest, iotlb); + struct iommu_hwpt_selftest data = { + .iotlb = IOMMU_TEST_IOTLB_DEFAULT, + }; + uint32_t nested_hwpt_id[2] = {}; + uint32_t parent_hwpt_id = 0; + uint32_t parent_hwpt_id_not_work = 0; + uint32_t test_hwpt_id = 0; + + if (self->device_id) { + /* Negative tests */ + test_err_hwpt_alloc(ENOENT, self->ioas_id, self->device_id, 0, + &test_hwpt_id); + test_err_hwpt_alloc(EINVAL, self->device_id, self->device_id, 0, + &test_hwpt_id); + + test_cmd_hwpt_alloc(self->device_id, self->ioas_id, + IOMMU_HWPT_ALLOC_NEST_PARENT, + &parent_hwpt_id); + + test_cmd_hwpt_alloc(self->device_id, self->ioas_id, 0, + &parent_hwpt_id_not_work); + + /* Negative nested tests */ + test_err_hwpt_alloc_nested(EINVAL, self->device_id, + parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_NONE, &data, + sizeof(data)); + test_err_hwpt_alloc_nested(EOPNOTSUPP, self->device_id, + parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST + 1, &data, + sizeof(data)); + test_err_hwpt_alloc_nested(EINVAL, self->device_id, + parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, &data, + min_data_len - 1); + test_err_hwpt_alloc_nested(EFAULT, self->device_id, + parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, NULL, + sizeof(data)); + test_err_hwpt_alloc_nested( + EOPNOTSUPP, self->device_id, parent_hwpt_id, + IOMMU_HWPT_ALLOC_NEST_PARENT, &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data)); + test_err_hwpt_alloc_nested(EINVAL, self->device_id, + parent_hwpt_id_not_work, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); + + /* Allocate two nested hwpts sharing one common parent hwpt */ + test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); + test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, + &nested_hwpt_id[1], + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); + + /* Negative test: a nested hwpt on top of a nested hwpt */ + test_err_hwpt_alloc_nested(EINVAL, self->device_id, + nested_hwpt_id[0], 0, &test_hwpt_id, + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); + /* Negative test: parent hwpt now cannot be freed */ + EXPECT_ERRNO(EBUSY, + _test_ioctl_destroy(self->fd, parent_hwpt_id)); + + /* Attach device to nested_hwpt_id[0] that then will be busy */ + test_cmd_mock_domain_replace(self->stdev_id, nested_hwpt_id[0]); + EXPECT_ERRNO(EBUSY, + _test_ioctl_destroy(self->fd, nested_hwpt_id[0])); + + /* Switch from nested_hwpt_id[0] to nested_hwpt_id[1] */ + test_cmd_mock_domain_replace(self->stdev_id, nested_hwpt_id[1]); + EXPECT_ERRNO(EBUSY, + _test_ioctl_destroy(self->fd, nested_hwpt_id[1])); + test_ioctl_destroy(nested_hwpt_id[0]); + + /* Detach from nested_hwpt_id[1] and destroy it */ + test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id); + test_ioctl_destroy(nested_hwpt_id[1]); + + /* Detach from the parent hw_pagetable and destroy it */ + test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); + test_ioctl_destroy(parent_hwpt_id); + test_ioctl_destroy(parent_hwpt_id_not_work); + } else { + test_err_hwpt_alloc(ENOENT, self->device_id, self->ioas_id, 0, + &parent_hwpt_id); + test_err_hwpt_alloc_nested(ENOENT, self->device_id, + parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); + test_err_hwpt_alloc_nested(ENOENT, self->device_id, + parent_hwpt_id, 0, + &nested_hwpt_id[1], + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); + test_err_mock_domain_replace(ENOENT, self->stdev_id, + nested_hwpt_id[0]); + test_err_mock_domain_replace(ENOENT, self->stdev_id, + nested_hwpt_id[1]); + } +} + TEST_F(iommufd_ioas, hwpt_attach) { /* Create a device attached directly to a hwpt */ diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index 1fcd69cb0e416..17ff2d605355a 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -615,7 +615,8 @@ TEST_FAIL_NTH(basic_fail_nth, device) if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL)) return -1; - if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id)) + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id, + IOMMU_HWPT_DATA_NONE, 0, 0)) return -1; if (_test_cmd_mock_domain_replace(self->fd, stdev_id, ioas_id2, NULL)) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 35b3f8949c414..c24d00fd8542f 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -154,13 +154,17 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id, pt_id, NULL)) static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, - __u32 flags, __u32 *hwpt_id) + __u32 flags, __u32 *hwpt_id, __u32 data_type, + void *data, size_t data_len) { struct iommu_hwpt_alloc cmd = { .size = sizeof(cmd), .flags = flags, .dev_id = device_id, .pt_id = pt_id, + .data_type = data_type, + .data_len = data_len, + .data_uptr = (uint64_t)data, }; int ret; @@ -172,12 +176,24 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, return 0; } -#define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, \ - pt_id, flags, hwpt_id)) -#define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ - EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc(self->fd, device_id, \ - pt_id, flags, hwpt_id)) +#define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + hwpt_id, IOMMU_HWPT_DATA_NONE, NULL, \ + 0)) +#define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ + EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc( \ + self->fd, device_id, pt_id, flags, \ + hwpt_id, IOMMU_HWPT_DATA_NONE, NULL, 0)) + +#define test_cmd_hwpt_alloc_nested(device_id, pt_id, flags, hwpt_id, \ + data_type, data, data_len) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + hwpt_id, data_type, data, data_len)) +#define test_err_hwpt_alloc_nested(_errno, device_id, pt_id, flags, hwpt_id, \ + data_type, data, data_len) \ + EXPECT_ERRNO(_errno, \ + _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + hwpt_id, data_type, data, data_len)) static int _test_cmd_access_replace_ioas(int fd, __u32 access_id, unsigned int ioas_id) From 86e96d39800ec30ae5a3a6e6b043898b61634d72 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 30 Oct 2023 15:11:20 -0300 Subject: [PATCH 404/548] iommufd: Organize the mock domain alloc functions closer to Joerg's tree Patches in Joerg's iommu tree to convert the mock driver to use domain_alloc_paging() that clash badly with the way the selftest changes for nesting were structured. Massage the selftest so that it looks closer the code after the domain_alloc_paging() conversion to ease the merge. Change __mock_domain_alloc_paging() into mock_domain_alloc_paging() in the same way as the iommu tree. The merge resolution then trivially takes both and deletes mock_domain_alloc(). Link: https://lore.kernel.org/r/0-v1-90a855762c96+19de-mock_merge_jgg@nvidia.com Reviewed-by: Nicolin Chen Reviewed-by: Kevin Tian Reviewed-by: Yi Liu Signed-off-by: Jason Gunthorpe (cherry picked from commit b2b67c997bf74453f3469d8b54e4859f190943bd) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 35 +++++++++++++++----------------- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 7f588a358306a..89a90ead795c6 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -20,6 +20,8 @@ static DECLARE_FAULT_ATTR(fail_iommufd); static struct dentry *dbgfs_root; static struct platform_device *selftest_iommu_dev; +static const struct iommu_ops mock_ops; +static struct iommu_domain_ops domain_nested_ops; size_t iommufd_test_memory_limit = 65536; @@ -258,24 +260,18 @@ static const struct iommu_dirty_ops dirty_ops = { .read_and_clear_dirty = mock_domain_read_and_clear_dirty, }; -static const struct iommu_ops mock_ops; -static struct iommu_domain_ops domain_nested_ops; - -static struct iommu_domain * -__mock_domain_alloc_paging(unsigned int iommu_domain_type, bool needs_dirty_ops) +static struct iommu_domain *mock_domain_alloc_paging(struct device *dev) { struct mock_iommu_domain *mock; mock = kzalloc(sizeof(*mock), GFP_KERNEL); if (!mock) - return ERR_PTR(-ENOMEM); + return NULL; mock->domain.geometry.aperture_start = MOCK_APERTURE_START; mock->domain.geometry.aperture_end = MOCK_APERTURE_LAST; mock->domain.pgsize_bitmap = MOCK_IO_PAGE_SIZE; mock->domain.ops = mock_ops.default_domain_ops; - if (needs_dirty_ops) - mock->domain.dirty_ops = &dirty_ops; - mock->domain.type = iommu_domain_type; + mock->domain.type = IOMMU_DOMAIN_UNMANAGED; xa_init(&mock->pfns); return &mock->domain; } @@ -300,16 +296,11 @@ __mock_domain_alloc_nested(struct mock_iommu_domain *mock_parent, static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) { - struct iommu_domain *domain; - if (iommu_domain_type == IOMMU_DOMAIN_BLOCKED) return &mock_blocking_domain; - if (iommu_domain_type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - domain = __mock_domain_alloc_paging(iommu_domain_type, false); - if (IS_ERR(domain)) - domain = NULL; - return domain; + if (iommu_domain_type == IOMMU_DOMAIN_UNMANAGED) + return mock_domain_alloc_paging(NULL); + return NULL; } static struct iommu_domain * @@ -326,14 +317,20 @@ mock_domain_alloc_user(struct device *dev, u32 flags, struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); bool has_dirty_flag = flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING; bool no_dirty_ops = mdev->flags & MOCK_FLAGS_DEVICE_NO_DIRTY; + struct iommu_domain *domain; if (flags & (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING))) return ERR_PTR(-EOPNOTSUPP); if (user_data || (has_dirty_flag && no_dirty_ops)) return ERR_PTR(-EOPNOTSUPP); - return __mock_domain_alloc_paging(IOMMU_DOMAIN_UNMANAGED, - has_dirty_flag); + domain = mock_domain_alloc_paging(NULL); + if (!domain) + return ERR_PTR(-ENOMEM); + if (has_dirty_flag) + container_of(domain, struct mock_iommu_domain, domain) + ->domain.dirty_ops = &dirty_ops; + return domain; } /* must be mock_domain_nested */ From 6b90a8a481d7120bfe072d055024caad8d29b417 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:36 -0300 Subject: [PATCH 405/548] powerpc/iommu: Setup a default domain and remove set_platform_dma_ops POWER is using the set_platform_dma_ops() callback to hook up its private dma_ops, but this is buired under some indirection and is weirdly happening for a BLOCKED domain as well. For better documentation create a PLATFORM domain to manage the dma_ops, since that is what it is for, and make the BLOCKED domain an alias for it. BLOCKED is required for VFIO. Also removes the leaky allocation of the BLOCKED domain by using a global static. Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/3-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 2ad56efa80dba89162106c06ebc00b611325e584) Signed-off-by: Koba Ko --- arch/powerpc/kernel/iommu.c | 38 +++++++++++++++++-------------------- 1 file changed, 17 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index efaca0c6eff9d..b1e38b78ed52c 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -1280,7 +1280,7 @@ struct iommu_table_group_ops spapr_tce_table_group_ops = { /* * A simple iommu_ops to allow less cruft in generic VFIO code. */ -static int spapr_tce_blocking_iommu_attach_dev(struct iommu_domain *dom, +static int spapr_tce_platform_iommu_attach_dev(struct iommu_domain *dom, struct device *dev) { struct iommu_group *grp = iommu_group_get(dev); @@ -1297,17 +1297,22 @@ static int spapr_tce_blocking_iommu_attach_dev(struct iommu_domain *dom, return ret; } -static void spapr_tce_blocking_iommu_set_platform_dma(struct device *dev) -{ - struct iommu_group *grp = iommu_group_get(dev); - struct iommu_table_group *table_group; +static const struct iommu_domain_ops spapr_tce_platform_domain_ops = { + .attach_dev = spapr_tce_platform_iommu_attach_dev, +}; - table_group = iommu_group_get_iommudata(grp); - table_group->ops->release_ownership(table_group); -} +static struct iommu_domain spapr_tce_platform_domain = { + .type = IOMMU_DOMAIN_PLATFORM, + .ops = &spapr_tce_platform_domain_ops, +}; -static const struct iommu_domain_ops spapr_tce_blocking_domain_ops = { - .attach_dev = spapr_tce_blocking_iommu_attach_dev, +static struct iommu_domain spapr_tce_blocked_domain = { + .type = IOMMU_DOMAIN_BLOCKED, + /* + * FIXME: SPAPR mixes blocked and platform behaviors, the blocked domain + * also sets the dma_api ops + */ + .ops = &spapr_tce_platform_domain_ops, }; static bool spapr_tce_iommu_capable(struct device *dev, enum iommu_cap cap) @@ -1324,18 +1329,9 @@ static bool spapr_tce_iommu_capable(struct device *dev, enum iommu_cap cap) static struct iommu_domain *spapr_tce_iommu_domain_alloc(unsigned int type) { - struct iommu_domain *dom; - if (type != IOMMU_DOMAIN_BLOCKED) return NULL; - - dom = kzalloc(sizeof(*dom), GFP_KERNEL); - if (!dom) - return NULL; - - dom->ops = &spapr_tce_blocking_domain_ops; - - return dom; + return &spapr_tce_blocked_domain; } static struct iommu_device *spapr_tce_iommu_probe_device(struct device *dev) @@ -1371,12 +1367,12 @@ static struct iommu_group *spapr_tce_iommu_device_group(struct device *dev) } static const struct iommu_ops spapr_tce_iommu_ops = { + .default_domain = &spapr_tce_platform_domain, .capable = spapr_tce_iommu_capable, .domain_alloc = spapr_tce_iommu_domain_alloc, .probe_device = spapr_tce_iommu_probe_device, .release_device = spapr_tce_iommu_release_device, .device_group = spapr_tce_iommu_device_group, - .set_platform_dma_ops = spapr_tce_blocking_iommu_set_platform_dma, }; static struct attribute *spapr_tce_iommu_attrs[] = { From c5c64fa57062550645ac371efbdeb409720341f4 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:37 -0300 Subject: [PATCH 406/548] iommu: Add IOMMU_DOMAIN_PLATFORM for S390 The PLATFORM domain will be set as the default domain and attached as normal during probe. The driver will ignore the initial attach from a NULL domain to the PLATFORM domain. After this, the PLATFORM domain's attach_dev will be called whenever we detach from an UNMANAGED domain (eg for VFIO). This is the same time the original design would have called op->detach_dev(). This is temporary until the S390 dma-iommu.c conversion is merged. Tested-by: Heiko Stuebner Tested-by: Niklas Schnelle Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/4-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit e04c7487a6655722172e93e8f36e51d6ab279f86) Signed-off-by: Koba Ko --- drivers/iommu/s390-iommu.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index fbf59a8db29b1..f0c867c57a5b9 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -142,14 +142,31 @@ static int s390_iommu_attach_device(struct iommu_domain *domain, return 0; } -static void s390_iommu_set_platform_dma(struct device *dev) +/* + * Switch control over the IOMMU to S390's internal dma_api ops + */ +static int s390_iommu_platform_attach(struct iommu_domain *platform_domain, + struct device *dev) { struct zpci_dev *zdev = to_zpci_dev(dev); + if (!zdev->s390_domain) + return 0; + __s390_iommu_detach_device(zdev); zpci_dma_init_device(zdev); + return 0; } +static struct iommu_domain_ops s390_iommu_platform_ops = { + .attach_dev = s390_iommu_platform_attach, +}; + +static struct iommu_domain s390_iommu_platform_domain = { + .type = IOMMU_DOMAIN_PLATFORM, + .ops = &s390_iommu_platform_ops, +}; + static void s390_iommu_get_resv_regions(struct device *dev, struct list_head *list) { @@ -428,12 +445,12 @@ void zpci_destroy_iommu(struct zpci_dev *zdev) } static const struct iommu_ops s390_iommu_ops = { + .default_domain = &s390_iommu_platform_domain, .capable = s390_iommu_capable, .domain_alloc = s390_domain_alloc, .probe_device = s390_iommu_probe_device, .release_device = s390_iommu_release_device, .device_group = generic_device_group, - .set_platform_dma_ops = s390_iommu_set_platform_dma, .pgsize_bitmap = SZ_4K, .get_resv_regions = s390_iommu_get_resv_regions, .default_domain_ops = &(const struct iommu_domain_ops) { From db92f1578715cd030d77f41669c44922fa46961f Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:38 -0300 Subject: [PATCH 407/548] iommu/fsl_pamu: Implement a PLATFORM domain This driver is nonsensical. To not block migrating the core API away from NULL default_domains give it a hacky of a PLATFORM domain that keeps it working exactly as it always did. Leave some comments around to warn away any future people looking at this. Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/5-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 8565915e75581ba540527d1d9a2b89620002a0d0) Signed-off-by: Koba Ko --- drivers/iommu/fsl_pamu_domain.c | 41 ++++++++++++++++++++++++++++++--- 1 file changed, 38 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c index 4ac0e247ec2b5..e9d2bff4659b7 100644 --- a/drivers/iommu/fsl_pamu_domain.c +++ b/drivers/iommu/fsl_pamu_domain.c @@ -196,6 +196,13 @@ static struct iommu_domain *fsl_pamu_domain_alloc(unsigned type) { struct fsl_dma_domain *dma_domain; + /* + * FIXME: This isn't creating an unmanaged domain since the + * default_domain_ops do not have any map/unmap function it doesn't meet + * the requirements for __IOMMU_DOMAIN_PAGING. The only purpose seems to + * allow drivers/soc/fsl/qbman/qman_portal.c to do + * fsl_pamu_configure_l1_stash() + */ if (type != IOMMU_DOMAIN_UNMANAGED) return NULL; @@ -283,15 +290,33 @@ static int fsl_pamu_attach_device(struct iommu_domain *domain, return ret; } -static void fsl_pamu_set_platform_dma(struct device *dev) +/* + * FIXME: fsl/pamu is completely broken in terms of how it works with the iommu + * API. Immediately after probe the HW is left in an IDENTITY translation and + * the driver provides a non-working UNMANAGED domain that it can switch over + * to. However it cannot switch back to an IDENTITY translation, instead it + * switches to what looks like BLOCKING. + */ +static int fsl_pamu_platform_attach(struct iommu_domain *platform_domain, + struct device *dev) { struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain); + struct fsl_dma_domain *dma_domain; const u32 *prop; int len; struct pci_dev *pdev = NULL; struct pci_controller *pci_ctl; + /* + * Hack to keep things working as they always have, only leaving an + * UNMANAGED domain makes it BLOCKING. + */ + if (domain == platform_domain || !domain || + domain->type != IOMMU_DOMAIN_UNMANAGED) + return 0; + + dma_domain = to_fsl_dma_domain(domain); + /* * Use LIODN of the PCI controller while detaching a * PCI device. @@ -312,8 +337,18 @@ static void fsl_pamu_set_platform_dma(struct device *dev) detach_device(dev, dma_domain); else pr_debug("missing fsl,liodn property at %pOF\n", dev->of_node); + return 0; } +static struct iommu_domain_ops fsl_pamu_platform_ops = { + .attach_dev = fsl_pamu_platform_attach, +}; + +static struct iommu_domain fsl_pamu_platform_domain = { + .type = IOMMU_DOMAIN_PLATFORM, + .ops = &fsl_pamu_platform_ops, +}; + /* Set the domain stash attribute */ int fsl_pamu_configure_l1_stash(struct iommu_domain *domain, u32 cpu) { @@ -395,11 +430,11 @@ static struct iommu_device *fsl_pamu_probe_device(struct device *dev) } static const struct iommu_ops fsl_pamu_ops = { + .default_domain = &fsl_pamu_platform_domain, .capable = fsl_pamu_capable, .domain_alloc = fsl_pamu_domain_alloc, .probe_device = fsl_pamu_probe_device, .device_group = fsl_pamu_device_group, - .set_platform_dma_ops = fsl_pamu_set_platform_dma, .default_domain_ops = &(const struct iommu_domain_ops) { .attach_dev = fsl_pamu_attach_device, .iova_to_phys = fsl_pamu_iova_to_phys, From bcee4f4cb6c5b0fe14730754bb61fbfac626e765 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:39 -0300 Subject: [PATCH 408/548] iommu/tegra-gart: Remove tegra-gart Thierry says this is not used anymore, and doesn't think it makes sense as an iommu driver. The HW it supports is about 10 years old now and newer HW uses different IOMMU drivers. As this is the only driver with a GART approach, and it doesn't really meet the driver expectations from the IOMMU core, let's just remove it so we don't have to think about how to make it fit in. It has a number of identified problems: - The assignment of iommu_groups doesn't match the HW behavior - It claims to have an UNMANAGED domain but it is really an IDENTITY domain with a translation aperture. This is inconsistent with the core expectation for security sensitive operations - It doesn't implement a SW page table under struct iommu_domain so * It can't accept a map until the domain is attached * It forgets about all maps after the domain is detached * It doesn't clear the HW of maps once the domain is detached (made worse by having the wrong groups) Cc: Thierry Reding Cc: Dmitry Osipenko Acked-by: Thierry Reding Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/6-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit c462944901319cb52ec0d0382dcea64f4f6f70e8) Signed-off-by: Koba Ko --- arch/arm/configs/multi_v7_defconfig | 1 - arch/arm/configs/tegra_defconfig | 1 - drivers/iommu/Kconfig | 11 - drivers/iommu/Makefile | 1 - drivers/iommu/tegra-gart.c | 371 ---------------------------- drivers/memory/tegra/mc.c | 34 --- drivers/memory/tegra/tegra20.c | 28 --- include/soc/tegra/mc.h | 26 -- 8 files changed, 473 deletions(-) delete mode 100644 drivers/iommu/tegra-gart.c diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig index 23fc49f23d255..5dc4416b75d36 100644 --- a/arch/arm/configs/multi_v7_defconfig +++ b/arch/arm/configs/multi_v7_defconfig @@ -1073,7 +1073,6 @@ CONFIG_QCOM_IPCC=y CONFIG_OMAP_IOMMU=y CONFIG_OMAP_IOMMU_DEBUG=y CONFIG_ROCKCHIP_IOMMU=y -CONFIG_TEGRA_IOMMU_GART=y CONFIG_TEGRA_IOMMU_SMMU=y CONFIG_EXYNOS_IOMMU=y CONFIG_QCOM_IOMMU=y diff --git a/arch/arm/configs/tegra_defconfig b/arch/arm/configs/tegra_defconfig index 613f07b8ce150..8635b7216bfc5 100644 --- a/arch/arm/configs/tegra_defconfig +++ b/arch/arm/configs/tegra_defconfig @@ -292,7 +292,6 @@ CONFIG_CHROME_PLATFORMS=y CONFIG_CROS_EC=y CONFIG_CROS_EC_I2C=m CONFIG_CROS_EC_SPI=m -CONFIG_TEGRA_IOMMU_GART=y CONFIG_TEGRA_IOMMU_SMMU=y CONFIG_ARCH_TEGRA_2x_SOC=y CONFIG_ARCH_TEGRA_3x_SOC=y diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index ea49bce5cd71b..ef1007dd494f7 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -239,17 +239,6 @@ config SUN50I_IOMMU help Support for the IOMMU introduced in the Allwinner H6 SoCs. -config TEGRA_IOMMU_GART - bool "Tegra GART IOMMU Support" - depends on ARCH_TEGRA_2x_SOC - depends on TEGRA_MC - select IOMMU_API - help - Enables support for remapping discontiguous physical memory - shared with the operating system into contiguous I/O virtual - space through the GART (Graphics Address Relocation Table) - hardware included on Tegra SoCs. - config TEGRA_IOMMU_SMMU bool "NVIDIA Tegra SMMU Support" depends on ARCH_TEGRA diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 769e43d780ce8..95ad9dbfbda02 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -20,7 +20,6 @@ obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o obj-$(CONFIG_ROCKCHIP_IOMMU) += rockchip-iommu.o obj-$(CONFIG_SUN50I_IOMMU) += sun50i-iommu.o -obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c deleted file mode 100644 index a482ff838b533..0000000000000 --- a/drivers/iommu/tegra-gart.c +++ /dev/null @@ -1,371 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * IOMMU API for Graphics Address Relocation Table on Tegra20 - * - * Copyright (c) 2010-2012, NVIDIA CORPORATION. All rights reserved. - * - * Author: Hiroshi DOYU - */ - -#define dev_fmt(fmt) "gart: " fmt - -#include -#include -#include -#include -#include -#include -#include - -#include - -#define GART_REG_BASE 0x24 -#define GART_CONFIG (0x24 - GART_REG_BASE) -#define GART_ENTRY_ADDR (0x28 - GART_REG_BASE) -#define GART_ENTRY_DATA (0x2c - GART_REG_BASE) - -#define GART_ENTRY_PHYS_ADDR_VALID BIT(31) - -#define GART_PAGE_SHIFT 12 -#define GART_PAGE_SIZE (1 << GART_PAGE_SHIFT) -#define GART_PAGE_MASK GENMASK(30, GART_PAGE_SHIFT) - -/* bitmap of the page sizes currently supported */ -#define GART_IOMMU_PGSIZES (GART_PAGE_SIZE) - -struct gart_device { - void __iomem *regs; - u32 *savedata; - unsigned long iovmm_base; /* offset to vmm_area start */ - unsigned long iovmm_end; /* offset to vmm_area end */ - spinlock_t pte_lock; /* for pagetable */ - spinlock_t dom_lock; /* for active domain */ - unsigned int active_devices; /* number of active devices */ - struct iommu_domain *active_domain; /* current active domain */ - struct iommu_device iommu; /* IOMMU Core handle */ - struct device *dev; -}; - -static struct gart_device *gart_handle; /* unique for a system */ - -static bool gart_debug; - -/* - * Any interaction between any block on PPSB and a block on APB or AHB - * must have these read-back to ensure the APB/AHB bus transaction is - * complete before initiating activity on the PPSB block. - */ -#define FLUSH_GART_REGS(gart) readl_relaxed((gart)->regs + GART_CONFIG) - -#define for_each_gart_pte(gart, iova) \ - for (iova = gart->iovmm_base; \ - iova < gart->iovmm_end; \ - iova += GART_PAGE_SIZE) - -static inline void gart_set_pte(struct gart_device *gart, - unsigned long iova, unsigned long pte) -{ - writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR); - writel_relaxed(pte, gart->regs + GART_ENTRY_DATA); -} - -static inline unsigned long gart_read_pte(struct gart_device *gart, - unsigned long iova) -{ - unsigned long pte; - - writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR); - pte = readl_relaxed(gart->regs + GART_ENTRY_DATA); - - return pte; -} - -static void do_gart_setup(struct gart_device *gart, const u32 *data) -{ - unsigned long iova; - - for_each_gart_pte(gart, iova) - gart_set_pte(gart, iova, data ? *(data++) : 0); - - writel_relaxed(1, gart->regs + GART_CONFIG); - FLUSH_GART_REGS(gart); -} - -static inline bool gart_iova_range_invalid(struct gart_device *gart, - unsigned long iova, size_t bytes) -{ - return unlikely(iova < gart->iovmm_base || bytes != GART_PAGE_SIZE || - iova + bytes > gart->iovmm_end); -} - -static inline bool gart_pte_valid(struct gart_device *gart, unsigned long iova) -{ - return !!(gart_read_pte(gart, iova) & GART_ENTRY_PHYS_ADDR_VALID); -} - -static int gart_iommu_attach_dev(struct iommu_domain *domain, - struct device *dev) -{ - struct gart_device *gart = gart_handle; - int ret = 0; - - spin_lock(&gart->dom_lock); - - if (gart->active_domain && gart->active_domain != domain) { - ret = -EINVAL; - } else if (dev_iommu_priv_get(dev) != domain) { - dev_iommu_priv_set(dev, domain); - gart->active_domain = domain; - gart->active_devices++; - } - - spin_unlock(&gart->dom_lock); - - return ret; -} - -static void gart_iommu_set_platform_dma(struct device *dev) -{ - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct gart_device *gart = gart_handle; - - spin_lock(&gart->dom_lock); - - if (dev_iommu_priv_get(dev) == domain) { - dev_iommu_priv_set(dev, NULL); - - if (--gart->active_devices == 0) - gart->active_domain = NULL; - } - - spin_unlock(&gart->dom_lock); -} - -static struct iommu_domain *gart_iommu_domain_alloc(unsigned type) -{ - struct iommu_domain *domain; - - if (type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - - domain = kzalloc(sizeof(*domain), GFP_KERNEL); - if (domain) { - domain->geometry.aperture_start = gart_handle->iovmm_base; - domain->geometry.aperture_end = gart_handle->iovmm_end - 1; - domain->geometry.force_aperture = true; - } - - return domain; -} - -static void gart_iommu_domain_free(struct iommu_domain *domain) -{ - WARN_ON(gart_handle->active_domain == domain); - kfree(domain); -} - -static inline int __gart_iommu_map(struct gart_device *gart, unsigned long iova, - unsigned long pa) -{ - if (unlikely(gart_debug && gart_pte_valid(gart, iova))) { - dev_err(gart->dev, "Page entry is in-use\n"); - return -EINVAL; - } - - gart_set_pte(gart, iova, GART_ENTRY_PHYS_ADDR_VALID | pa); - - return 0; -} - -static int gart_iommu_map(struct iommu_domain *domain, unsigned long iova, - phys_addr_t pa, size_t bytes, int prot, gfp_t gfp) -{ - struct gart_device *gart = gart_handle; - int ret; - - if (gart_iova_range_invalid(gart, iova, bytes)) - return -EINVAL; - - spin_lock(&gart->pte_lock); - ret = __gart_iommu_map(gart, iova, (unsigned long)pa); - spin_unlock(&gart->pte_lock); - - return ret; -} - -static inline int __gart_iommu_unmap(struct gart_device *gart, - unsigned long iova) -{ - if (unlikely(gart_debug && !gart_pte_valid(gart, iova))) { - dev_err(gart->dev, "Page entry is invalid\n"); - return -EINVAL; - } - - gart_set_pte(gart, iova, 0); - - return 0; -} - -static size_t gart_iommu_unmap(struct iommu_domain *domain, unsigned long iova, - size_t bytes, struct iommu_iotlb_gather *gather) -{ - struct gart_device *gart = gart_handle; - int err; - - if (gart_iova_range_invalid(gart, iova, bytes)) - return 0; - - spin_lock(&gart->pte_lock); - err = __gart_iommu_unmap(gart, iova); - spin_unlock(&gart->pte_lock); - - return err ? 0 : bytes; -} - -static phys_addr_t gart_iommu_iova_to_phys(struct iommu_domain *domain, - dma_addr_t iova) -{ - struct gart_device *gart = gart_handle; - unsigned long pte; - - if (gart_iova_range_invalid(gart, iova, GART_PAGE_SIZE)) - return -EINVAL; - - spin_lock(&gart->pte_lock); - pte = gart_read_pte(gart, iova); - spin_unlock(&gart->pte_lock); - - return pte & GART_PAGE_MASK; -} - -static struct iommu_device *gart_iommu_probe_device(struct device *dev) -{ - if (!dev_iommu_fwspec_get(dev)) - return ERR_PTR(-ENODEV); - - return &gart_handle->iommu; -} - -static int gart_iommu_of_xlate(struct device *dev, - struct of_phandle_args *args) -{ - return 0; -} - -static void gart_iommu_sync_map(struct iommu_domain *domain, unsigned long iova, - size_t size) -{ - FLUSH_GART_REGS(gart_handle); -} - -static void gart_iommu_sync(struct iommu_domain *domain, - struct iommu_iotlb_gather *gather) -{ - size_t length = gather->end - gather->start + 1; - - gart_iommu_sync_map(domain, gather->start, length); -} - -static const struct iommu_ops gart_iommu_ops = { - .domain_alloc = gart_iommu_domain_alloc, - .probe_device = gart_iommu_probe_device, - .device_group = generic_device_group, - .set_platform_dma_ops = gart_iommu_set_platform_dma, - .pgsize_bitmap = GART_IOMMU_PGSIZES, - .of_xlate = gart_iommu_of_xlate, - .default_domain_ops = &(const struct iommu_domain_ops) { - .attach_dev = gart_iommu_attach_dev, - .map = gart_iommu_map, - .unmap = gart_iommu_unmap, - .iova_to_phys = gart_iommu_iova_to_phys, - .iotlb_sync_map = gart_iommu_sync_map, - .iotlb_sync = gart_iommu_sync, - .free = gart_iommu_domain_free, - } -}; - -int tegra_gart_suspend(struct gart_device *gart) -{ - u32 *data = gart->savedata; - unsigned long iova; - - /* - * All GART users shall be suspended at this point. Disable - * address translation to trap all GART accesses as invalid - * memory accesses. - */ - writel_relaxed(0, gart->regs + GART_CONFIG); - FLUSH_GART_REGS(gart); - - for_each_gart_pte(gart, iova) - *(data++) = gart_read_pte(gart, iova); - - return 0; -} - -int tegra_gart_resume(struct gart_device *gart) -{ - do_gart_setup(gart, gart->savedata); - - return 0; -} - -struct gart_device *tegra_gart_probe(struct device *dev, struct tegra_mc *mc) -{ - struct gart_device *gart; - struct resource *res; - int err; - - BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT); - - /* the GART memory aperture is required */ - res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 1); - if (!res) { - dev_err(dev, "Memory aperture resource unavailable\n"); - return ERR_PTR(-ENXIO); - } - - gart = kzalloc(sizeof(*gart), GFP_KERNEL); - if (!gart) - return ERR_PTR(-ENOMEM); - - gart_handle = gart; - - gart->dev = dev; - gart->regs = mc->regs + GART_REG_BASE; - gart->iovmm_base = res->start; - gart->iovmm_end = res->end + 1; - spin_lock_init(&gart->pte_lock); - spin_lock_init(&gart->dom_lock); - - do_gart_setup(gart, NULL); - - err = iommu_device_sysfs_add(&gart->iommu, dev, NULL, "gart"); - if (err) - goto free_gart; - - err = iommu_device_register(&gart->iommu, &gart_iommu_ops, dev); - if (err) - goto remove_sysfs; - - gart->savedata = vmalloc(resource_size(res) / GART_PAGE_SIZE * - sizeof(u32)); - if (!gart->savedata) { - err = -ENOMEM; - goto unregister_iommu; - } - - return gart; - -unregister_iommu: - iommu_device_unregister(&gart->iommu); -remove_sysfs: - iommu_device_sysfs_remove(&gart->iommu); -free_gart: - kfree(gart); - - return ERR_PTR(err); -} - -module_param(gart_debug, bool, 0644); -MODULE_PARM_DESC(gart_debug, "Enable GART debugging"); diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c index 67d6e70b4eab1..a083921a8968b 100644 --- a/drivers/memory/tegra/mc.c +++ b/drivers/memory/tegra/mc.c @@ -979,35 +979,6 @@ static int tegra_mc_probe(struct platform_device *pdev) } } - if (IS_ENABLED(CONFIG_TEGRA_IOMMU_GART) && !mc->soc->smmu) { - mc->gart = tegra_gart_probe(&pdev->dev, mc); - if (IS_ERR(mc->gart)) { - dev_err(&pdev->dev, "failed to probe GART: %ld\n", - PTR_ERR(mc->gart)); - mc->gart = NULL; - } - } - - return 0; -} - -static int __maybe_unused tegra_mc_suspend(struct device *dev) -{ - struct tegra_mc *mc = dev_get_drvdata(dev); - - if (mc->soc->ops && mc->soc->ops->suspend) - return mc->soc->ops->suspend(mc); - - return 0; -} - -static int __maybe_unused tegra_mc_resume(struct device *dev) -{ - struct tegra_mc *mc = dev_get_drvdata(dev); - - if (mc->soc->ops && mc->soc->ops->resume) - return mc->soc->ops->resume(mc); - return 0; } @@ -1020,15 +991,10 @@ static void tegra_mc_sync_state(struct device *dev) icc_sync_state(dev); } -static const struct dev_pm_ops tegra_mc_pm_ops = { - SET_SYSTEM_SLEEP_PM_OPS(tegra_mc_suspend, tegra_mc_resume) -}; - static struct platform_driver tegra_mc_driver = { .driver = { .name = "tegra-mc", .of_match_table = tegra_mc_of_match, - .pm = &tegra_mc_pm_ops, .suppress_bind_attrs = true, .sync_state = tegra_mc_sync_state, }, diff --git a/drivers/memory/tegra/tegra20.c b/drivers/memory/tegra/tegra20.c index 544bfd216a220..aa4b97d5e7323 100644 --- a/drivers/memory/tegra/tegra20.c +++ b/drivers/memory/tegra/tegra20.c @@ -688,32 +688,6 @@ static int tegra20_mc_probe(struct tegra_mc *mc) return 0; } -static int tegra20_mc_suspend(struct tegra_mc *mc) -{ - int err; - - if (IS_ENABLED(CONFIG_TEGRA_IOMMU_GART) && mc->gart) { - err = tegra_gart_suspend(mc->gart); - if (err < 0) - return err; - } - - return 0; -} - -static int tegra20_mc_resume(struct tegra_mc *mc) -{ - int err; - - if (IS_ENABLED(CONFIG_TEGRA_IOMMU_GART) && mc->gart) { - err = tegra_gart_resume(mc->gart); - if (err < 0) - return err; - } - - return 0; -} - static irqreturn_t tegra20_mc_handle_irq(int irq, void *data) { struct tegra_mc *mc = data; @@ -789,8 +763,6 @@ static irqreturn_t tegra20_mc_handle_irq(int irq, void *data) static const struct tegra_mc_ops tegra20_mc_ops = { .probe = tegra20_mc_probe, - .suspend = tegra20_mc_suspend, - .resume = tegra20_mc_resume, .handle_irq = tegra20_mc_handle_irq, }; diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h index a5ef84944a068..71ae37d3bedd7 100644 --- a/include/soc/tegra/mc.h +++ b/include/soc/tegra/mc.h @@ -96,7 +96,6 @@ struct tegra_smmu_soc { struct tegra_mc; struct tegra_smmu; -struct gart_device; #ifdef CONFIG_TEGRA_IOMMU_SMMU struct tegra_smmu *tegra_smmu_probe(struct device *dev, @@ -116,28 +115,6 @@ static inline void tegra_smmu_remove(struct tegra_smmu *smmu) } #endif -#ifdef CONFIG_TEGRA_IOMMU_GART -struct gart_device *tegra_gart_probe(struct device *dev, struct tegra_mc *mc); -int tegra_gart_suspend(struct gart_device *gart); -int tegra_gart_resume(struct gart_device *gart); -#else -static inline struct gart_device * -tegra_gart_probe(struct device *dev, struct tegra_mc *mc) -{ - return ERR_PTR(-ENODEV); -} - -static inline int tegra_gart_suspend(struct gart_device *gart) -{ - return -ENODEV; -} - -static inline int tegra_gart_resume(struct gart_device *gart) -{ - return -ENODEV; -} -#endif - struct tegra_mc_reset { const char *name; unsigned long id; @@ -185,8 +162,6 @@ struct tegra_mc_ops { */ int (*probe)(struct tegra_mc *mc); void (*remove)(struct tegra_mc *mc); - int (*suspend)(struct tegra_mc *mc); - int (*resume)(struct tegra_mc *mc); irqreturn_t (*handle_irq)(int irq, void *data); int (*probe_device)(struct tegra_mc *mc, struct device *dev); }; @@ -225,7 +200,6 @@ struct tegra_mc { struct tegra_bpmp *bpmp; struct device *dev; struct tegra_smmu *smmu; - struct gart_device *gart; void __iomem *regs; void __iomem *bcast_ch_regs; void __iomem **ch_regs; From 70885daba277f1d60e71454f165a6f3b5df76f3e Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:40 -0300 Subject: [PATCH 409/548] iommu/mtk_iommu_v1: Implement an IDENTITY domain What mtk does during mtk_iommu_v1_set_platform_dma() is actually putting the iommu into identity mode. Make this available as a proper IDENTITY domain. The mtk_iommu_v1_def_domain_type() from commit 8bbe13f52cb7 ("iommu/mediatek-v1: Add def_domain_type") explains this was needed to allow probe_finalize() to be called, but now the IDENTITY domain will do the same job so change the returned def_domain_type. mkt_v1 is the only driver that returns IOMMU_DOMAIN_UNMANAGED from def_domain_type(). This allows the next patch to enforce an IDENTITY domain policy for this driver. Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/7-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 90057dc095dca023ad3e8cd5a129f392b543fb9e) Signed-off-by: Koba Ko --- drivers/iommu/mtk_iommu_v1.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index f1754efcfe74e..d0c7e1727ed8a 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -319,11 +319,27 @@ static int mtk_iommu_v1_attach_device(struct iommu_domain *domain, struct device return 0; } -static void mtk_iommu_v1_set_platform_dma(struct device *dev) +static int mtk_iommu_v1_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) { struct mtk_iommu_v1_data *data = dev_iommu_priv_get(dev); mtk_iommu_v1_config(data, dev, false); + return 0; +} + +static struct iommu_domain_ops mtk_iommu_v1_identity_ops = { + .attach_dev = mtk_iommu_v1_identity_attach, +}; + +static struct iommu_domain mtk_iommu_v1_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &mtk_iommu_v1_identity_ops, +}; + +static void mtk_iommu_v1_set_platform_dma(struct device *dev) +{ + mtk_iommu_v1_identity_attach(&mtk_iommu_v1_identity_domain, dev); } static int mtk_iommu_v1_map(struct iommu_domain *domain, unsigned long iova, @@ -443,7 +459,7 @@ static int mtk_iommu_v1_create_mapping(struct device *dev, struct of_phandle_arg static int mtk_iommu_v1_def_domain_type(struct device *dev) { - return IOMMU_DOMAIN_UNMANAGED; + return IOMMU_DOMAIN_IDENTITY; } static struct iommu_device *mtk_iommu_v1_probe_device(struct device *dev) @@ -578,6 +594,7 @@ static int mtk_iommu_v1_hw_init(const struct mtk_iommu_v1_data *data) } static const struct iommu_ops mtk_iommu_v1_ops = { + .identity_domain = &mtk_iommu_v1_identity_domain, .domain_alloc = mtk_iommu_v1_domain_alloc, .probe_device = mtk_iommu_v1_probe_device, .probe_finalize = mtk_iommu_v1_probe_finalize, From b3c7228dec1b704570326229daed532156864853 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:41 -0300 Subject: [PATCH 410/548] iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type() Except for dart (which forces IOMMU_DOMAIN_DMA) every driver returns 0 or IDENTITY from ops->def_domain_type(). The drivers that return IDENTITY have some kind of good reason, typically that quirky hardware really can't support anything other than IDENTITY. Arrange things so that if the driver says it needs IDENTITY then iommu_get_default_domain_type() either fails or returns IDENTITY. It will not ignore the driver's override to IDENTITY. Split the function into two steps, reducing the group device list to the driver's def_domain_type() and the untrusted flag. Then compute the result based on those two reduced variables. Fully reject combining untrusted with IDENTITY. Remove the debugging print on the iommu_group_store_type() failure path, userspace should not be able to trigger kernel prints. This makes the next patch cleaner that wants to force IDENTITY always for ARM_IOMMU because there is no support for DMA. Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/8-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 59ddce4418da483c932bc7a08b88d6ba14020e83) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 115 ++++++++++++++++++++++++++++-------------- 1 file changed, 78 insertions(+), 37 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 04a7d10436f4f..5cd467843e668 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1729,19 +1729,6 @@ struct iommu_group *fsl_mc_device_group(struct device *dev) } EXPORT_SYMBOL_GPL(fsl_mc_device_group); -static int iommu_get_def_domain_type(struct device *dev) -{ - const struct iommu_ops *ops = dev_iommu_ops(dev); - - if (dev_is_pci(dev) && to_pci_dev(dev)->untrusted) - return IOMMU_DOMAIN_DMA; - - if (ops->def_domain_type) - return ops->def_domain_type(dev); - - return 0; -} - static struct iommu_domain * __iommu_group_alloc_default_domain(const struct bus_type *bus, struct iommu_group *group, int req_type) @@ -1751,6 +1738,23 @@ __iommu_group_alloc_default_domain(const struct bus_type *bus, return __iommu_domain_alloc(bus, req_type); } +/* + * Returns the iommu_ops for the devices in an iommu group. + * + * It is assumed that all devices in an iommu group are managed by a single + * IOMMU unit. Therefore, this returns the dev_iommu_ops of the first device + * in the group. + */ +static const struct iommu_ops *group_iommu_ops(struct iommu_group *group) +{ + struct group_device *device = + list_first_entry(&group->devices, struct group_device, list); + + lockdep_assert_held(&group->mutex); + + return dev_iommu_ops(device->dev); +} + /* * req_type of 0 means "auto" which means to select a domain based on * iommu_def_domain_type or what the driver actually supports. @@ -1833,40 +1837,77 @@ static int iommu_bus_notifier(struct notifier_block *nb, return 0; } -/* A target_type of 0 will select the best domain type and cannot fail */ +/* + * Combine the driver's chosen def_domain_type across all the devices in a + * group. Drivers must give a consistent result. + */ +static int iommu_get_def_domain_type(struct iommu_group *group, + struct device *dev, int cur_type) +{ + const struct iommu_ops *ops = group_iommu_ops(group); + int type; + + if (!ops->def_domain_type) + return cur_type; + + type = ops->def_domain_type(dev); + if (!type || cur_type == type) + return cur_type; + if (!cur_type) + return type; + + dev_err_ratelimited( + dev, + "IOMMU driver error, requesting conflicting def_domain_type, %s and %s, for devices in group %u.\n", + iommu_domain_type_str(cur_type), iommu_domain_type_str(type), + group->id); + + /* + * Try to recover, drivers are allowed to force IDENITY or DMA, IDENTITY + * takes precedence. + */ + if (type == IOMMU_DOMAIN_IDENTITY) + return type; + return cur_type; +} + +/* + * A target_type of 0 will select the best domain type. 0 can be returned in + * this case meaning the global default should be used. + */ static int iommu_get_default_domain_type(struct iommu_group *group, int target_type) { - int best_type = target_type; + struct device *untrusted = NULL; struct group_device *gdev; - struct device *last_dev; + int driver_type = 0; lockdep_assert_held(&group->mutex); - for_each_group_device(group, gdev) { - unsigned int type = iommu_get_def_domain_type(gdev->dev); - - if (best_type && type && best_type != type) { - if (target_type) { - dev_err_ratelimited( - gdev->dev, - "Device cannot be in %s domain\n", - iommu_domain_type_str(target_type)); - return -1; - } + driver_type = iommu_get_def_domain_type(group, gdev->dev, + driver_type); - dev_warn( - gdev->dev, - "Device needs domain type %s, but device %s in the same iommu group requires type %s - using default\n", - iommu_domain_type_str(type), dev_name(last_dev), - iommu_domain_type_str(best_type)); - return 0; + if (dev_is_pci(gdev->dev) && to_pci_dev(gdev->dev)->untrusted) + untrusted = gdev->dev; + } + + if (untrusted) { + if (driver_type && driver_type != IOMMU_DOMAIN_DMA) { + dev_err_ratelimited( + untrusted, + "Device is not trusted, but driver is overriding group %u to %s, refusing to probe.\n", + group->id, iommu_domain_type_str(driver_type)); + return -1; } - if (!best_type) - best_type = type; - last_dev = gdev->dev; + driver_type = IOMMU_DOMAIN_DMA; + } + + if (target_type) { + if (driver_type && target_type != driver_type) + return -1; + return target_type; } - return best_type; + return driver_type; } static void iommu_group_do_probe_finalize(struct device *dev) From 4fc517c42b2875c4c734d1bf7c135ff615af2dd3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:42 -0300 Subject: [PATCH 411/548] iommu: Allow an IDENTITY domain as the default_domain in ARM32 Even though dma-iommu.c and CONFIG_ARM_DMA_USE_IOMMU do approximately the same stuff, the way they relate to the IOMMU core is quiet different. dma-iommu.c expects the core code to setup an UNMANAGED domain (of type IOMMU_DOMAIN_DMA) and then configures itself to use that domain. This becomes the default_domain for the group. ARM_DMA_USE_IOMMU does not use the default_domain, instead it directly allocates an UNMANAGED domain and operates it just like an external driver. In this case group->default_domain is NULL. If the driver provides a global static identity_domain then automatically use it as the default_domain when in ARM_DMA_USE_IOMMU mode. This allows drivers that implemented default_domain == NULL as an IDENTITY translation to trivially get a properly labeled non-NULL default_domain on ARM32 configs. With this arrangment when ARM_DMA_USE_IOMMU wants to disconnect from the device the normal detach_domain flow will restore the IDENTITY domain as the default domain. Overall this makes attach_dev() of the IDENTITY domain called in the same places as detach_dev(). This effectively migrates these drivers to default_domain mode. For drivers that support ARM64 they will gain support for the IDENTITY translation mode for the dma_api and behave in a uniform way. Drivers use this by setting ops->identity_domain to a static singleton iommu_domain that implements the identity attach. If the core detects ARM_DMA_USE_IOMMU mode then it automatically attaches the IDENTITY domain during probe. Drivers can continue to prevent the use of DMA translation by returning IOMMU_DOMAIN_IDENTITY from def_domain_type, this will completely prevent IOMMU_DMA from running but will not impact ARM_DMA_USE_IOMMU. This allows removing the set_platform_dma_ops() from every remaining driver. Remove the set_platform_dma_ops from rockchip and mkt_v1 as all it does is set an existing global static identity domain. mkt_v1 does not support IOMMU_DOMAIN_DMA and it does not compile on ARM64 so this transformation is safe. Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/9-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit e98befd010bd56b8b3f2afea2d600e30df023e6b) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 21 ++++++++++++++++++++- drivers/iommu/mtk_iommu_v1.c | 12 ------------ drivers/iommu/rockchip-iommu.c | 10 ---------- 3 files changed, 20 insertions(+), 23 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 5cd467843e668..e9ba6a4d95d08 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1878,17 +1878,36 @@ static int iommu_get_def_domain_type(struct iommu_group *group, static int iommu_get_default_domain_type(struct iommu_group *group, int target_type) { + const struct iommu_ops *ops = group_iommu_ops(group); struct device *untrusted = NULL; struct group_device *gdev; int driver_type = 0; lockdep_assert_held(&group->mutex); + + /* + * ARM32 drivers supporting CONFIG_ARM_DMA_USE_IOMMU can declare an + * identity_domain and it will automatically become their default + * domain. Later on ARM_DMA_USE_IOMMU will install its UNMANAGED domain. + * Override the selection to IDENTITY if we are sure the driver supports + * it. + */ + if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) && ops->identity_domain) + driver_type = IOMMU_DOMAIN_IDENTITY; + for_each_group_device(group, gdev) { driver_type = iommu_get_def_domain_type(group, gdev->dev, driver_type); - if (dev_is_pci(gdev->dev) && to_pci_dev(gdev->dev)->untrusted) + if (dev_is_pci(gdev->dev) && to_pci_dev(gdev->dev)->untrusted) { + /* + * No ARM32 using systems will set untrusted, it cannot + * work. + */ + if (WARN_ON(IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))) + return -1; untrusted = gdev->dev; + } } if (untrusted) { diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index d0c7e1727ed8a..0e6b2bb4955de 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -337,11 +337,6 @@ static struct iommu_domain mtk_iommu_v1_identity_domain = { .ops = &mtk_iommu_v1_identity_ops, }; -static void mtk_iommu_v1_set_platform_dma(struct device *dev) -{ - mtk_iommu_v1_identity_attach(&mtk_iommu_v1_identity_domain, dev); -} - static int mtk_iommu_v1_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -457,11 +452,6 @@ static int mtk_iommu_v1_create_mapping(struct device *dev, struct of_phandle_arg return 0; } -static int mtk_iommu_v1_def_domain_type(struct device *dev) -{ - return IOMMU_DOMAIN_IDENTITY; -} - static struct iommu_device *mtk_iommu_v1_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); @@ -599,10 +589,8 @@ static const struct iommu_ops mtk_iommu_v1_ops = { .probe_device = mtk_iommu_v1_probe_device, .probe_finalize = mtk_iommu_v1_probe_finalize, .release_device = mtk_iommu_v1_release_device, - .def_domain_type = mtk_iommu_v1_def_domain_type, .device_group = generic_device_group, .pgsize_bitmap = MT2701_IOMMU_PAGE_SIZE, - .set_platform_dma_ops = mtk_iommu_v1_set_platform_dma, .owner = THIS_MODULE, .default_domain_ops = &(const struct iommu_domain_ops) { .attach_dev = mtk_iommu_v1_attach_device, diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index e16761dadf698..b6f1ba01a341d 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -998,13 +998,6 @@ static struct iommu_domain rk_identity_domain = { .ops = &rk_identity_ops, }; -#ifdef CONFIG_ARM -static void rk_iommu_set_platform_dma(struct device *dev) -{ - WARN_ON(rk_iommu_identity_attach(&rk_identity_domain, dev)); -} -#endif - static int rk_iommu_attach_device(struct iommu_domain *domain, struct device *dev) { @@ -1182,9 +1175,6 @@ static const struct iommu_ops rk_iommu_ops = { .probe_device = rk_iommu_probe_device, .release_device = rk_iommu_release_device, .device_group = rk_iommu_device_group, -#ifdef CONFIG_ARM - .set_platform_dma_ops = rk_iommu_set_platform_dma, -#endif .pgsize_bitmap = RK_IOMMU_PGSIZE_BITMAP, .of_xlate = rk_iommu_of_xlate, .default_domain_ops = &(const struct iommu_domain_ops) { From b49da43d9a87706d8e8e3ef79cb4e2faffc3ca89 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:43 -0300 Subject: [PATCH 412/548] iommu/exynos: Implement an IDENTITY domain What exynos calls exynos_iommu_detach_device is actually putting the iommu into identity mode. Move to the new core support for ARM_DMA_USE_IOMMU by defining ops->identity_domain. Tested-by: Marek Szyprowski Acked-by: Marek Szyprowski Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/10-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit b3d14960e629f9ee8c82f8feb211ae43e1cb3246) Signed-off-by: Koba Ko --- drivers/iommu/exynos-iommu.c | 66 +++++++++++++++++------------------- 1 file changed, 32 insertions(+), 34 deletions(-) diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c index c275fe71c4db3..5e12b85dfe870 100644 --- a/drivers/iommu/exynos-iommu.c +++ b/drivers/iommu/exynos-iommu.c @@ -24,6 +24,7 @@ typedef u32 sysmmu_iova_t; typedef u32 sysmmu_pte_t; +static struct iommu_domain exynos_identity_domain; /* We do not consider super section mapping (16MB) */ #define SECT_ORDER 20 @@ -829,7 +830,7 @@ static int __maybe_unused exynos_sysmmu_suspend(struct device *dev) struct exynos_iommu_owner *owner = dev_iommu_priv_get(master); mutex_lock(&owner->rpm_lock); - if (data->domain) { + if (&data->domain->domain != &exynos_identity_domain) { dev_dbg(data->sysmmu, "saving state\n"); __sysmmu_disable(data); } @@ -847,7 +848,7 @@ static int __maybe_unused exynos_sysmmu_resume(struct device *dev) struct exynos_iommu_owner *owner = dev_iommu_priv_get(master); mutex_lock(&owner->rpm_lock); - if (data->domain) { + if (&data->domain->domain != &exynos_identity_domain) { dev_dbg(data->sysmmu, "restoring state\n"); __sysmmu_enable(data); } @@ -980,17 +981,20 @@ static void exynos_iommu_domain_free(struct iommu_domain *iommu_domain) kfree(domain); } -static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain, - struct device *dev) +static int exynos_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) { - struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain); struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev); - phys_addr_t pagetable = virt_to_phys(domain->pgtable); + struct exynos_iommu_domain *domain; + phys_addr_t pagetable; struct sysmmu_drvdata *data, *next; unsigned long flags; - if (!has_sysmmu(dev) || owner->domain != iommu_domain) - return; + if (owner->domain == identity_domain) + return 0; + + domain = to_exynos_domain(owner->domain); + pagetable = virt_to_phys(domain->pgtable); mutex_lock(&owner->rpm_lock); @@ -1009,15 +1013,25 @@ static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain, list_del_init(&data->domain_node); spin_unlock(&data->lock); } - owner->domain = NULL; + owner->domain = identity_domain; spin_unlock_irqrestore(&domain->lock, flags); mutex_unlock(&owner->rpm_lock); - dev_dbg(dev, "%s: Detached IOMMU with pgtable %pa\n", __func__, - &pagetable); + dev_dbg(dev, "%s: Restored IOMMU to IDENTITY from pgtable %pa\n", + __func__, &pagetable); + return 0; } +static struct iommu_domain_ops exynos_identity_ops = { + .attach_dev = exynos_iommu_identity_attach, +}; + +static struct iommu_domain exynos_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &exynos_identity_ops, +}; + static int exynos_iommu_attach_device(struct iommu_domain *iommu_domain, struct device *dev) { @@ -1026,12 +1040,11 @@ static int exynos_iommu_attach_device(struct iommu_domain *iommu_domain, struct sysmmu_drvdata *data; phys_addr_t pagetable = virt_to_phys(domain->pgtable); unsigned long flags; + int err; - if (!has_sysmmu(dev)) - return -ENODEV; - - if (owner->domain) - exynos_iommu_detach_device(owner->domain, dev); + err = exynos_iommu_identity_attach(&exynos_identity_domain, dev); + if (err) + return err; mutex_lock(&owner->rpm_lock); @@ -1407,26 +1420,12 @@ static struct iommu_device *exynos_iommu_probe_device(struct device *dev) return &data->iommu; } -static void exynos_iommu_set_platform_dma(struct device *dev) -{ - struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev); - - if (owner->domain) { - struct iommu_group *group = iommu_group_get(dev); - - if (group) { - exynos_iommu_detach_device(owner->domain, dev); - iommu_group_put(group); - } - } -} - static void exynos_iommu_release_device(struct device *dev) { struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev); struct sysmmu_drvdata *data; - exynos_iommu_set_platform_dma(dev); + WARN_ON(exynos_iommu_identity_attach(&exynos_identity_domain, dev)); list_for_each_entry(data, &owner->controllers, owner_node) device_link_del(data->link); @@ -1457,6 +1456,7 @@ static int exynos_iommu_of_xlate(struct device *dev, INIT_LIST_HEAD(&owner->controllers); mutex_init(&owner->rpm_lock); + owner->domain = &exynos_identity_domain; dev_iommu_priv_set(dev, owner); } @@ -1471,11 +1471,9 @@ static int exynos_iommu_of_xlate(struct device *dev, } static const struct iommu_ops exynos_iommu_ops = { + .identity_domain = &exynos_identity_domain, .domain_alloc = exynos_iommu_domain_alloc, .device_group = generic_device_group, -#ifdef CONFIG_ARM - .set_platform_dma_ops = exynos_iommu_set_platform_dma, -#endif .probe_device = exynos_iommu_probe_device, .release_device = exynos_iommu_release_device, .pgsize_bitmap = SECT_SIZE | LPAGE_SIZE | SPAGE_SIZE, From ee43699eb823212c6f77be43e5eb612ded9f8cb3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:44 -0300 Subject: [PATCH 413/548] iommu/tegra-smmu: Implement an IDENTITY domain What tegra-smmu does during tegra_smmu_set_platform_dma() is actually putting the iommu into identity mode. Move to the new core support for ARM_DMA_USE_IOMMU by defining ops->identity_domain. Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/11-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit c8cc2655cc6c7ff832827ad5bc1a8f3df165706d) Signed-off-by: Koba Ko --- drivers/iommu/tegra-smmu.c | 37 ++++++++++++++++++++++++++++++++----- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index e445f80d02263..80481e1ba561b 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -511,23 +511,39 @@ static int tegra_smmu_attach_dev(struct iommu_domain *domain, return err; } -static void tegra_smmu_set_platform_dma(struct device *dev) +static int tegra_smmu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) { struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - struct tegra_smmu_as *as = to_smmu_as(domain); - struct tegra_smmu *smmu = as->smmu; + struct tegra_smmu_as *as; + struct tegra_smmu *smmu; unsigned int index; if (!fwspec) - return; + return -ENODEV; + + if (domain == identity_domain || !domain) + return 0; + as = to_smmu_as(domain); + smmu = as->smmu; for (index = 0; index < fwspec->num_ids; index++) { tegra_smmu_disable(smmu, fwspec->ids[index], as->id); tegra_smmu_as_unprepare(smmu, as); } + return 0; } +static struct iommu_domain_ops tegra_smmu_identity_ops = { + .attach_dev = tegra_smmu_identity_attach, +}; + +static struct iommu_domain tegra_smmu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &tegra_smmu_identity_ops, +}; + static void tegra_smmu_set_pde(struct tegra_smmu_as *as, unsigned long iova, u32 value) { @@ -962,11 +978,22 @@ static int tegra_smmu_of_xlate(struct device *dev, return iommu_fwspec_add_ids(dev, &id, 1); } +static int tegra_smmu_def_domain_type(struct device *dev) +{ + /* + * FIXME: For now we want to run all translation in IDENTITY mode, due + * to some device quirks. Better would be to just quirk the troubled + * devices. + */ + return IOMMU_DOMAIN_IDENTITY; +} + static const struct iommu_ops tegra_smmu_ops = { + .identity_domain = &tegra_smmu_identity_domain, + .def_domain_type = &tegra_smmu_def_domain_type, .domain_alloc = tegra_smmu_domain_alloc, .probe_device = tegra_smmu_probe_device, .device_group = tegra_smmu_device_group, - .set_platform_dma_ops = tegra_smmu_set_platform_dma, .of_xlate = tegra_smmu_of_xlate, .pgsize_bitmap = SZ_4K, .default_domain_ops = &(const struct iommu_domain_ops) { From ba23bb0a76fd1060dc7765419edf6e3b1294600a Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:45 -0300 Subject: [PATCH 414/548] iommu/tegra-smmu: Support DMA domains in tegra All ARM64 iommu drivers should support IOMMU_DOMAIN_DMA to enable dma-iommu.c. tegra is blocking dma-iommu usage, and also default_domain's, because it wants an identity translation. This is needed for some device quirk. The correct way to do this is to support IDENTITY domains and use ops->def_domain_type() to return IOMMU_DOMAIN_IDENTITY for only the quirky devices. Add support for IOMMU_DOMAIN_DMA and force IOMMU_DOMAIN_IDENTITY mode for everything so no behavior changes. Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/12-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit f128094f347f578279e551488796d0a0067cbdff) Signed-off-by: Koba Ko --- drivers/iommu/tegra-smmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index 80481e1ba561b..b91ad1b5a20d3 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -276,7 +276,7 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type) { struct tegra_smmu_as *as; - if (type != IOMMU_DOMAIN_UNMANAGED) + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; as = kzalloc(sizeof(*as), GFP_KERNEL); From 4107d93714390a4979075148fcccfbbbf47ce64e Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:46 -0300 Subject: [PATCH 415/548] iommu/omap: Implement an IDENTITY domain What omap does during omap_iommu_set_platform_dma() is actually putting the iommu into identity mode. Move to the new core support for ARM_DMA_USE_IOMMU by defining ops->identity_domain. This driver does not support IOMMU_DOMAIN_DMA, however it cannot be compiled on ARM64 either. Most likely it is fine to support dma-iommu.c Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/13-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit bbe39b76330ecf64f8b3f45cdf8bacbbb7f390c9) Signed-off-by: Koba Ko --- drivers/iommu/omap-iommu.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index 537e402f9bba9..34340ef15241b 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -1555,16 +1555,31 @@ static void _omap_iommu_detach_dev(struct omap_iommu_domain *omap_domain, omap_domain->dev = NULL; } -static void omap_iommu_set_platform_dma(struct device *dev) +static int omap_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) { struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct omap_iommu_domain *omap_domain = to_omap_domain(domain); + struct omap_iommu_domain *omap_domain; + if (domain == identity_domain || !domain) + return 0; + + omap_domain = to_omap_domain(domain); spin_lock(&omap_domain->lock); _omap_iommu_detach_dev(omap_domain, dev); spin_unlock(&omap_domain->lock); + return 0; } +static struct iommu_domain_ops omap_iommu_identity_ops = { + .attach_dev = omap_iommu_identity_attach, +}; + +static struct iommu_domain omap_iommu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &omap_iommu_identity_ops, +}; + static struct iommu_domain *omap_iommu_domain_alloc(unsigned type) { struct omap_iommu_domain *omap_domain; @@ -1732,11 +1747,11 @@ static struct iommu_group *omap_iommu_device_group(struct device *dev) } static const struct iommu_ops omap_iommu_ops = { + .identity_domain = &omap_iommu_identity_domain, .domain_alloc = omap_iommu_domain_alloc, .probe_device = omap_iommu_probe_device, .release_device = omap_iommu_release_device, .device_group = omap_iommu_device_group, - .set_platform_dma_ops = omap_iommu_set_platform_dma, .pgsize_bitmap = OMAP_IOMMU_PGSIZES, .default_domain_ops = &(const struct iommu_domain_ops) { .attach_dev = omap_iommu_attach_dev, From 7a41db14efc7dd2a9647d269158bf0e04bc4510d Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:47 -0300 Subject: [PATCH 416/548] iommu/msm: Implement an IDENTITY domain What msm does during msm_iommu_set_platform_dma() is actually putting the iommu into identity mode. Move to the new core support for ARM_DMA_USE_IOMMU by defining ops->identity_domain. This driver does not support IOMMU_DOMAIN_DMA, however it cannot be compiled on ARM64 either. Most likely it is fine to support dma-iommu.c Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/14-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 78fc30b4bb3540c88edd19c36d0a72bb0dd02d0b) Signed-off-by: Koba Ko --- drivers/iommu/msm_iommu.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c index 79d89bad5132b..26ed81cfeee89 100644 --- a/drivers/iommu/msm_iommu.c +++ b/drivers/iommu/msm_iommu.c @@ -443,15 +443,20 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev) return ret; } -static void msm_iommu_set_platform_dma(struct device *dev) +static int msm_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) { struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct msm_priv *priv = to_msm_priv(domain); + struct msm_priv *priv; unsigned long flags; struct msm_iommu_dev *iommu; struct msm_iommu_ctx_dev *master; - int ret; + int ret = 0; + + if (domain == identity_domain || !domain) + return 0; + priv = to_msm_priv(domain); free_io_pgtable_ops(priv->iop); spin_lock_irqsave(&msm_iommu_lock, flags); @@ -468,8 +473,18 @@ static void msm_iommu_set_platform_dma(struct device *dev) } fail: spin_unlock_irqrestore(&msm_iommu_lock, flags); + return ret; } +static struct iommu_domain_ops msm_iommu_identity_ops = { + .attach_dev = msm_iommu_identity_attach, +}; + +static struct iommu_domain msm_iommu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &msm_iommu_identity_ops, +}; + static int msm_iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t pa, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -675,10 +690,10 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id) } static struct iommu_ops msm_iommu_ops = { + .identity_domain = &msm_iommu_identity_domain, .domain_alloc = msm_iommu_domain_alloc, .probe_device = msm_iommu_probe_device, .device_group = generic_device_group, - .set_platform_dma_ops = msm_iommu_set_platform_dma, .pgsize_bitmap = MSM_IOMMU_PGSIZES, .of_xlate = qcom_iommu_of_xlate, .default_domain_ops = &(const struct iommu_domain_ops) { From 8571d706c218cebcec87d6a2d387d1426a283375 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:49 -0300 Subject: [PATCH 417/548] iommu/qcom_iommu: Add an IOMMU_IDENTITIY_DOMAIN This brings back the ops->detach_dev() code that commit 1b932ceddd19 ("iommu: Remove detach_dev callbacks") deleted and turns it into an IDENTITY domain. Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/16-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 786478a90294ea5b149ed1156a43e82d63ea61ff) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu/qcom_iommu.c | 39 +++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c index 775a3cbaff4ed..bc45d18f350cb 100644 --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c @@ -400,6 +400,44 @@ static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev return 0; } +static int qcom_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) +{ + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + struct qcom_iommu_domain *qcom_domain; + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); + struct qcom_iommu_dev *qcom_iommu = to_iommu(dev); + unsigned int i; + + if (domain == identity_domain || !domain) + return 0; + + qcom_domain = to_qcom_iommu_domain(domain); + if (WARN_ON(!qcom_domain->iommu)) + return -EINVAL; + + pm_runtime_get_sync(qcom_iommu->dev); + for (i = 0; i < fwspec->num_ids; i++) { + struct qcom_iommu_ctx *ctx = to_ctx(qcom_domain, fwspec->ids[i]); + + /* Disable the context bank: */ + iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0); + + ctx->domain = NULL; + } + pm_runtime_put_sync(qcom_iommu->dev); + return 0; +} + +static struct iommu_domain_ops qcom_iommu_identity_ops = { + .attach_dev = qcom_iommu_identity_attach, +}; + +static struct iommu_domain qcom_iommu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &qcom_iommu_identity_ops, +}; + static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -565,6 +603,7 @@ static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args) } static const struct iommu_ops qcom_iommu_ops = { + .identity_domain = &qcom_iommu_identity_domain, .capable = qcom_iommu_capable, .domain_alloc = qcom_iommu_domain_alloc, .probe_device = qcom_iommu_probe_device, From 7886178c2025f9eeea4d2f0bf73b280d5d9d7370 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:50 -0300 Subject: [PATCH 418/548] iommu/ipmmu: Add an IOMMU_IDENTITIY_DOMAIN This brings back the ops->detach_dev() code that commit 1b932ceddd19 ("iommu: Remove detach_dev callbacks") deleted and turns it into an IDENTITY domain. Also reverts commit 584d334b1393 ("iommu/ipmmu-vmsa: Remove ipmmu_utlb_disable()") Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/17-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 666c9f1ef7fa9330ea041aba36658aeb48145846) Signed-off-by: Koba Ko --- drivers/iommu/ipmmu-vmsa.c | 43 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c index 65ff69477c43e..04830d3931d23 100644 --- a/drivers/iommu/ipmmu-vmsa.c +++ b/drivers/iommu/ipmmu-vmsa.c @@ -295,6 +295,18 @@ static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain, mmu->utlb_ctx[utlb] = domain->context_id; } +/* + * Disable MMU translation for the microTLB. + */ +static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain *domain, + unsigned int utlb) +{ + struct ipmmu_vmsa_device *mmu = domain->mmu; + + ipmmu_imuctr_write(mmu, utlb, 0); + mmu->utlb_ctx[utlb] = IPMMU_CTX_INVALID; +} + static void ipmmu_tlb_flush_all(void *cookie) { struct ipmmu_vmsa_domain *domain = cookie; @@ -627,6 +639,36 @@ static int ipmmu_attach_device(struct iommu_domain *io_domain, return 0; } +static int ipmmu_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) +{ + struct iommu_domain *io_domain = iommu_get_domain_for_dev(dev); + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); + struct ipmmu_vmsa_domain *domain; + unsigned int i; + + if (io_domain == identity_domain || !io_domain) + return 0; + + domain = to_vmsa_domain(io_domain); + for (i = 0; i < fwspec->num_ids; ++i) + ipmmu_utlb_disable(domain, fwspec->ids[i]); + + /* + * TODO: Optimize by disabling the context when no device is attached. + */ + return 0; +} + +static struct iommu_domain_ops ipmmu_iommu_identity_ops = { + .attach_dev = ipmmu_iommu_identity_attach, +}; + +static struct iommu_domain ipmmu_iommu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &ipmmu_iommu_identity_ops, +}; + static int ipmmu_map(struct iommu_domain *io_domain, unsigned long iova, phys_addr_t paddr, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -849,6 +891,7 @@ static struct iommu_group *ipmmu_find_group(struct device *dev) } static const struct iommu_ops ipmmu_ops = { + .identity_domain = &ipmmu_iommu_identity_domain, .domain_alloc = ipmmu_domain_alloc, .probe_device = ipmmu_probe_device, .release_device = ipmmu_release_device, From 90168b214d5e272076cde68da88163a3750d91d1 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:51 -0300 Subject: [PATCH 419/548] iommu/mtk_iommu: Add an IOMMU_IDENTITIY_DOMAIN This brings back the ops->detach_dev() code that commit 1b932ceddd19 ("iommu: Remove detach_dev callbacks") deleted and turns it into an IDENTITY domain. Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/18-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit b01b12573837f237cec13b9ddf8e5e623ee6f981) Signed-off-by: Koba Ko --- drivers/iommu/mtk_iommu.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 06c0770ff894e..06192aff5eb8e 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -776,6 +776,28 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain, return ret; } +static int mtk_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) +{ + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + struct mtk_iommu_data *data = dev_iommu_priv_get(dev); + + if (domain == identity_domain || !domain) + return 0; + + mtk_iommu_config(data, dev, false, 0); + return 0; +} + +static struct iommu_domain_ops mtk_iommu_identity_ops = { + .attach_dev = mtk_iommu_identity_attach, +}; + +static struct iommu_domain mtk_iommu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &mtk_iommu_identity_ops, +}; + static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -995,6 +1017,7 @@ static void mtk_iommu_get_resv_regions(struct device *dev, } static const struct iommu_ops mtk_iommu_ops = { + .identity_domain = &mtk_iommu_identity_domain, .domain_alloc = mtk_iommu_domain_alloc, .probe_device = mtk_iommu_probe_device, .release_device = mtk_iommu_release_device, From 1ddc1e0bd449f7c70c353b2d12292bda707ccac4 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:52 -0300 Subject: [PATCH 420/548] iommu/sun50i: Add an IOMMU_IDENTITIY_DOMAIN Prior to commit 1b932ceddd19 ("iommu: Remove detach_dev callbacks") the sun50i_iommu_detach_device() function was being called by ops->detach_dev(). This is an IDENTITY domain so convert sun50i_iommu_detach_device() into sun50i_iommu_identity_attach() and a full IDENTITY domain and thus hook it back up the same was as the old ops->detach_dev(). Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/19-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 5b167dea64a3fb72907e33f30211d4a5711076f1) Signed-off-by: Koba Ko --- drivers/iommu/sun50i-iommu.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c index 94bd7f25f6f26..2a61b50c97f83 100644 --- a/drivers/iommu/sun50i-iommu.c +++ b/drivers/iommu/sun50i-iommu.c @@ -758,21 +758,32 @@ static void sun50i_iommu_detach_domain(struct sun50i_iommu *iommu, iommu->domain = NULL; } -static void sun50i_iommu_detach_device(struct iommu_domain *domain, - struct device *dev) +static int sun50i_iommu_identity_attach(struct iommu_domain *identity_domain, + struct device *dev) { - struct sun50i_iommu_domain *sun50i_domain = to_sun50i_domain(domain); struct sun50i_iommu *iommu = dev_iommu_priv_get(dev); + struct sun50i_iommu_domain *sun50i_domain; dev_dbg(dev, "Detaching from IOMMU domain\n"); - if (iommu->domain != domain) - return; + if (iommu->domain == identity_domain) + return 0; + sun50i_domain = to_sun50i_domain(iommu->domain); if (refcount_dec_and_test(&sun50i_domain->refcnt)) sun50i_iommu_detach_domain(iommu, sun50i_domain); + return 0; } +static struct iommu_domain_ops sun50i_iommu_identity_ops = { + .attach_dev = sun50i_iommu_identity_attach, +}; + +static struct iommu_domain sun50i_iommu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &sun50i_iommu_identity_ops, +}; + static int sun50i_iommu_attach_device(struct iommu_domain *domain, struct device *dev) { @@ -790,8 +801,7 @@ static int sun50i_iommu_attach_device(struct iommu_domain *domain, if (iommu->domain == domain) return 0; - if (iommu->domain) - sun50i_iommu_detach_device(iommu->domain, dev); + sun50i_iommu_identity_attach(&sun50i_iommu_identity_domain, dev); sun50i_iommu_attach_domain(iommu, sun50i_domain); @@ -828,6 +838,7 @@ static int sun50i_iommu_of_xlate(struct device *dev, } static const struct iommu_ops sun50i_iommu_ops = { + .identity_domain = &sun50i_iommu_identity_domain, .pgsize_bitmap = SZ_4K, .device_group = sun50i_iommu_device_group, .domain_alloc = sun50i_iommu_domain_alloc, @@ -986,6 +997,7 @@ static int sun50i_iommu_probe(struct platform_device *pdev) if (!iommu) return -ENOMEM; spin_lock_init(&iommu->iommu_lock); + iommu->domain = &sun50i_iommu_identity_domain; platform_set_drvdata(pdev, iommu); iommu->dev = &pdev->dev; From 66784efbed2f4e85891a2475ff34f37569937894 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:53 -0300 Subject: [PATCH 421/548] iommu: Require a default_domain for all iommu drivers At this point every iommu driver will cause a default_domain to be selected, so we can finally remove this gap from the core code. The following table explains what each driver supports and what the resulting default_domain will be: ops->defaut_domain IDENTITY DMA PLATFORM v ARM32 dma-iommu ARCH amd/iommu.c Y Y N/A either apple-dart.c Y Y N/A either arm-smmu.c Y Y IDENTITY either qcom_iommu.c G Y IDENTITY either arm-smmu-v3.c Y Y N/A either exynos-iommu.c G Y IDENTITY either fsl_pamu_domain.c Y Y N/A N/A PLATFORM intel/iommu.c Y Y N/A either ipmmu-vmsa.c G Y IDENTITY either msm_iommu.c G IDENTITY N/A mtk_iommu.c G Y IDENTITY either mtk_iommu_v1.c G IDENTITY N/A omap-iommu.c G IDENTITY N/A rockchip-iommu.c G Y IDENTITY either s390-iommu.c Y Y N/A N/A PLATFORM sprd-iommu.c Y N/A DMA sun50i-iommu.c G Y IDENTITY either tegra-smmu.c G Y IDENTITY IDENTITY virtio-iommu.c Y Y N/A either spapr Y Y N/A N/A PLATFORM * G means ops->identity_domain is used * N/A means the driver will not compile in this configuration ARM32 drivers select an IDENTITY default domain through either the ops->identity_domain or directly requesting an IDENTIY domain through alloc_domain(). In ARM64 mode tegra-smmu will still block the use of dma-iommu.c and forces an IDENTITY domain. S390 uses a PLATFORM domain to represent when the dma_ops are set to the s390 iommu code. fsl_pamu uses an PLATFORM domain. POWER SPAPR uses PLATFORM and blocking to enable its weird VFIO mode. The x86 drivers continue unchanged. After this patch group->default_domain is only NULL for a short period during bus iommu probing while all the groups are constituted. Otherwise it is always !NULL. This completes changing the iommu subsystem driver contract to a system where the current iommu_domain always represents some form of translation and the driver is continuously asserting a definable translation mode. It resolves the confusion that the original ops->detach_dev() caused around what translation, exactly, is the IOMMU performing after detach. There were at least three different answers to that question in the tree, they are all now clearly named with domain types. Tested-by: Heiko Stuebner Tested-by: Niklas Schnelle Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 98ac73f99bc44fba8a14252ffb0bad02459f7008) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 25 +++++++------------------ 1 file changed, 7 insertions(+), 18 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index e9ba6a4d95d08..75bc6aebebed0 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1878,7 +1878,6 @@ static int iommu_get_def_domain_type(struct iommu_group *group, static int iommu_get_default_domain_type(struct iommu_group *group, int target_type) { - const struct iommu_ops *ops = group_iommu_ops(group); struct device *untrusted = NULL; struct group_device *gdev; int driver_type = 0; @@ -1889,11 +1888,13 @@ static int iommu_get_default_domain_type(struct iommu_group *group, * ARM32 drivers supporting CONFIG_ARM_DMA_USE_IOMMU can declare an * identity_domain and it will automatically become their default * domain. Later on ARM_DMA_USE_IOMMU will install its UNMANAGED domain. - * Override the selection to IDENTITY if we are sure the driver supports - * it. + * Override the selection to IDENTITY. */ - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) && ops->identity_domain) + if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) { + static_assert(!(IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) && + IS_ENABLED(CONFIG_IOMMU_DMA))); driver_type = IOMMU_DOMAIN_IDENTITY; + } for_each_group_device(group, gdev) { driver_type = iommu_get_def_domain_type(group, gdev->dev, @@ -3048,21 +3049,9 @@ static int iommu_setup_default_domain(struct iommu_group *group, if (req_type < 0) return -EINVAL; - /* - * There are still some drivers which don't support default domains, so - * we ignore the failure and leave group->default_domain NULL. - * - * We assume that the iommu driver starts up the device in - * 'set_platform_dma_ops' mode if it does not support default domains. - */ dom = iommu_group_alloc_default_domain(group, req_type); - if (!dom) { - /* Once in default_domain mode we never leave */ - if (group->default_domain) - return -ENODEV; - group->default_domain = NULL; - return 0; - } + if (!dom) + return -ENODEV; if (group->default_domain == dom) return 0; From 7522b6d676386431b663c110bbbc9376a452099b Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:54 -0300 Subject: [PATCH 422/548] iommu: Add __iommu_group_domain_alloc() Allocate a domain from a group. Automatically obtains the iommu_ops to use from the device list of the group. Convert the internal callers to use it. Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/21-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 8359cf39acba72606736f453f54432ac9dc28523) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 59 +++++++++++++++++++++---------------------- 1 file changed, 29 insertions(+), 30 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 75bc6aebebed0..b7649d00ce0f6 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -96,8 +96,8 @@ static const char * const iommu_group_resv_type_string[] = { static int iommu_bus_notifier(struct notifier_block *nb, unsigned long action, void *data); static void iommu_release_device(struct device *dev); -static struct iommu_domain *__iommu_domain_alloc(const struct bus_type *bus, - unsigned type); +static struct iommu_domain * +__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type); static int __iommu_attach_device(struct iommu_domain *domain, struct device *dev); static int __iommu_attach_group(struct iommu_domain *domain, @@ -1730,12 +1730,11 @@ struct iommu_group *fsl_mc_device_group(struct device *dev) EXPORT_SYMBOL_GPL(fsl_mc_device_group); static struct iommu_domain * -__iommu_group_alloc_default_domain(const struct bus_type *bus, - struct iommu_group *group, int req_type) +__iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) { if (group->default_domain && group->default_domain->type == req_type) return group->default_domain; - return __iommu_domain_alloc(bus, req_type); + return __iommu_group_domain_alloc(group, req_type); } /* @@ -1762,9 +1761,7 @@ static const struct iommu_ops *group_iommu_ops(struct iommu_group *group) static struct iommu_domain * iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) { - const struct bus_type *bus = - list_first_entry(&group->devices, struct group_device, list) - ->dev->bus; + const struct iommu_ops *ops = group_iommu_ops(group); struct iommu_domain *dom; lockdep_assert_held(&group->mutex); @@ -1774,24 +1771,24 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) * domain. This should always be either an IDENTITY/BLOCKED/PLATFORM * domain. Do not use in new drivers. */ - if (bus->iommu_ops->default_domain) { + if (ops->default_domain) { if (req_type) return ERR_PTR(-EINVAL); - return bus->iommu_ops->default_domain; + return ops->default_domain; } if (req_type) - return __iommu_group_alloc_default_domain(bus, group, req_type); + return __iommu_group_alloc_default_domain(group, req_type); /* The driver gave no guidance on what type to use, try the default */ - dom = __iommu_group_alloc_default_domain(bus, group, iommu_def_domain_type); + dom = __iommu_group_alloc_default_domain(group, iommu_def_domain_type); if (dom) return dom; /* Otherwise IDENTITY and DMA_FQ defaults will try DMA */ if (iommu_def_domain_type == IOMMU_DOMAIN_DMA) return NULL; - dom = __iommu_group_alloc_default_domain(bus, group, IOMMU_DOMAIN_DMA); + dom = __iommu_group_alloc_default_domain(group, IOMMU_DOMAIN_DMA); if (!dom) return NULL; @@ -2056,19 +2053,16 @@ void iommu_set_fault_handler(struct iommu_domain *domain, } EXPORT_SYMBOL_GPL(iommu_set_fault_handler); -static struct iommu_domain *__iommu_domain_alloc(const struct bus_type *bus, - unsigned type) +static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, + unsigned int type) { struct iommu_domain *domain; unsigned int alloc_type = type & IOMMU_DOMAIN_ALLOC_FLAGS; - if (bus == NULL || bus->iommu_ops == NULL) - return NULL; + if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain) + return ops->identity_domain; - if (alloc_type == IOMMU_DOMAIN_IDENTITY && bus->iommu_ops->identity_domain) - return bus->iommu_ops->identity_domain; - - domain = bus->iommu_ops->domain_alloc(alloc_type); + domain = ops->domain_alloc(alloc_type); if (!domain) return NULL; @@ -2078,10 +2072,10 @@ static struct iommu_domain *__iommu_domain_alloc(const struct bus_type *bus, * may override this later */ if (!domain->pgsize_bitmap) - domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; + domain->pgsize_bitmap = ops->pgsize_bitmap; if (!domain->ops) - domain->ops = bus->iommu_ops->default_domain_ops; + domain->ops = ops->default_domain_ops; if (iommu_is_dma_domain(domain) && iommu_get_dma_cookie(domain)) { iommu_domain_free(domain); @@ -2090,9 +2084,17 @@ static struct iommu_domain *__iommu_domain_alloc(const struct bus_type *bus, return domain; } +static struct iommu_domain * +__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type) +{ + return __iommu_domain_alloc(group_iommu_ops(group), type); +} + struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus) { - return __iommu_domain_alloc(bus, IOMMU_DOMAIN_UNMANAGED); + if (bus == NULL || bus->iommu_ops == NULL) + return NULL; + return __iommu_domain_alloc(bus->iommu_ops, IOMMU_DOMAIN_UNMANAGED); } EXPORT_SYMBOL_GPL(iommu_domain_alloc); @@ -3277,21 +3279,18 @@ void iommu_device_unuse_default_domain(struct device *dev) static int __iommu_group_alloc_blocking_domain(struct iommu_group *group) { - struct group_device *dev = - list_first_entry(&group->devices, struct group_device, list); - if (group->blocking_domain) return 0; group->blocking_domain = - __iommu_domain_alloc(dev->dev->bus, IOMMU_DOMAIN_BLOCKED); + __iommu_group_domain_alloc(group, IOMMU_DOMAIN_BLOCKED); if (!group->blocking_domain) { /* * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED * create an empty domain instead. */ - group->blocking_domain = __iommu_domain_alloc( - dev->dev->bus, IOMMU_DOMAIN_UNMANAGED); + group->blocking_domain = __iommu_group_domain_alloc( + group, IOMMU_DOMAIN_UNMANAGED); if (!group->blocking_domain) return -EINVAL; } From ca8cf91604eccb32ba5832ff21d556f327e32fcb Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:55 -0300 Subject: [PATCH 423/548] iommu: Add ops->domain_alloc_paging() This callback requests the driver to create only a __IOMMU_DOMAIN_PAGING domain, so it saves a few lines in a lot of drivers needlessly checking the type. More critically, this allows us to sweep out all the IOMMU_DOMAIN_UNMANAGED and IOMMU_DOMAIN_DMA checks from a lot of the drivers, simplifying what is going on in the code and ultimately removing the now-unused special cases in drivers where they did not support IOMMU_DOMAIN_DMA. domain_alloc_paging() should return a struct iommu_domain that is functionally compatible with ARM_DMA_USE_IOMMU, dma-iommu.c and iommufd. Be forwards looking and pass in a 'struct device *' argument. We can provide this when allocating the default_domain. No drivers will look at this. Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/22-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 4601cd2d7c4c82c4bafc822e1ff630a709eff206) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 17 ++++++++++++++--- include/linux/iommu.h | 9 ++++++--- 2 files changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index b7649d00ce0f6..d76e4419279cf 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2054,6 +2054,7 @@ void iommu_set_fault_handler(struct iommu_domain *domain, EXPORT_SYMBOL_GPL(iommu_set_fault_handler); static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, + struct device *dev, unsigned int type) { struct iommu_domain *domain; @@ -2061,8 +2062,13 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain) return ops->identity_domain; + else if (type & __IOMMU_DOMAIN_PAGING && ops->domain_alloc_paging) + domain = ops->domain_alloc_paging(dev); + else if (ops->domain_alloc) + domain = ops->domain_alloc(alloc_type); + else + return NULL; - domain = ops->domain_alloc(alloc_type); if (!domain) return NULL; @@ -2087,14 +2093,19 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, static struct iommu_domain * __iommu_group_domain_alloc(struct iommu_group *group, unsigned int type) { - return __iommu_domain_alloc(group_iommu_ops(group), type); + struct device *dev = + list_first_entry(&group->devices, struct group_device, list) + ->dev; + + return __iommu_domain_alloc(group_iommu_ops(group), dev, type); } struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus) { if (bus == NULL || bus->iommu_ops == NULL) return NULL; - return __iommu_domain_alloc(bus->iommu_ops, IOMMU_DOMAIN_UNMANAGED); + return __iommu_domain_alloc(bus->iommu_ops, NULL, + IOMMU_DOMAIN_UNMANAGED); } EXPORT_SYMBOL_GPL(iommu_domain_alloc); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 62f002d11e8e4..daa40443f8b82 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -334,6 +334,8 @@ static inline int __iommu_copy_struct_from_user( * @domain_alloc: allocate and return an iommu domain if success. Otherwise * NULL is returned. The domain is not fully initialized until * the caller iommu_domain_alloc() returns. + * @domain_alloc_paging: Allocate an iommu_domain that can be used for + * UNMANAGED, DMA, and DMA_FQ domain types. * @domain_alloc_user: Allocate an iommu domain corresponding to the input * parameters as defined in include/uapi/linux/iommufd.h. * Unlike @domain_alloc, it is called only by IOMMUFD and @@ -381,9 +383,10 @@ struct iommu_ops { /* Domain allocation and freeing by the iommu driver */ struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type); - struct iommu_domain *(*domain_alloc_user)( - struct device *dev, u32 flags, struct iommu_domain *parent, - const struct iommu_user_data *user_data); + struct iommu_domain *(*domain_alloc_user)( + struct device *dev, u32 flags, struct iommu_domain *parent, + const struct iommu_user_data *user_data); + struct iommu_domain *(*domain_alloc_paging)(struct device *dev); struct iommu_device *(*probe_device)(struct device *dev); void (*release_device)(struct device *dev); From d4f56901a528346be31fe89e00d19acb9d338256 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Mar 2024 15:07:50 -0300 Subject: [PATCH 424/548] iommu/arm-smmu-v3: Do not ATC invalidate the entire domain At this point we know which master we are going to change the PCI config on, this is the only device we need to invalidate. Switch arm_smmu_atc_inv_domain() for arm_smmu_atc_inv_master(). Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Moritz Fischer Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/4-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 86e5ca098dd9f8c5b80a6205395aea0535018837) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 1264f68a3986f..486927683be32 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2287,7 +2287,10 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master) pdev = to_pci_dev(master->dev); atomic_inc(&smmu_domain->nr_ats_masters); - arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0); + /* + * ATC invalidation of PASID 0 causes the entire ATC to be flushed. + */ + arm_smmu_atc_inv_master(master); if (pci_enable_ats(pdev, stu)) dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu); } From a03aee524ba9002711c7a28bc5ec73fe86a0df0e Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:56 -0300 Subject: [PATCH 425/548] iommu: Convert simple drivers with DOMAIN_DMA to domain_alloc_paging() These drivers are all trivially converted since the function is only called if the domain type is going to be IOMMU_DOMAIN_UNMANAGED/DMA. Tested-by: Heiko Stuebner Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Tested-by: Yong Wu #For mtk_iommu.c Link: https://lore.kernel.org/r/23-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 3529375e7777b0f9b77c53df9a7da122bd165ae7) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu/qcom_iommu.c | 6 ++---- drivers/iommu/exynos-iommu.c | 7 ++----- drivers/iommu/ipmmu-vmsa.c | 7 ++----- drivers/iommu/mtk_iommu.c | 7 ++----- drivers/iommu/rockchip-iommu.c | 7 ++----- drivers/iommu/sprd-iommu.c | 7 ++----- drivers/iommu/sun50i-iommu.c | 9 +++------ drivers/iommu/tegra-smmu.c | 7 ++----- 8 files changed, 17 insertions(+), 40 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c index bc45d18f350cb..97b2122032b23 100644 --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c @@ -332,12 +332,10 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain, return ret; } -static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type) +static struct iommu_domain *qcom_iommu_domain_alloc_paging(struct device *dev) { struct qcom_iommu_domain *qcom_domain; - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) - return NULL; /* * Allocate the domain and initialise some of its data structures. * We can't really do anything meaningful until we've added a @@ -605,7 +603,7 @@ static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args) static const struct iommu_ops qcom_iommu_ops = { .identity_domain = &qcom_iommu_identity_domain, .capable = qcom_iommu_capable, - .domain_alloc = qcom_iommu_domain_alloc, + .domain_alloc_paging = qcom_iommu_domain_alloc_paging, .probe_device = qcom_iommu_probe_device, .device_group = generic_device_group, .of_xlate = qcom_iommu_of_xlate, diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c index 5e12b85dfe870..d6dead2ed10c1 100644 --- a/drivers/iommu/exynos-iommu.c +++ b/drivers/iommu/exynos-iommu.c @@ -887,7 +887,7 @@ static inline void exynos_iommu_set_pte(sysmmu_pte_t *ent, sysmmu_pte_t val) DMA_TO_DEVICE); } -static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type) +static struct iommu_domain *exynos_iommu_domain_alloc_paging(struct device *dev) { struct exynos_iommu_domain *domain; dma_addr_t handle; @@ -896,9 +896,6 @@ static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type) /* Check if correct PTE offsets are initialized */ BUG_ON(PG_ENT_SHIFT < 0 || !dma_dev); - if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - domain = kzalloc(sizeof(*domain), GFP_KERNEL); if (!domain) return NULL; @@ -1472,7 +1469,7 @@ static int exynos_iommu_of_xlate(struct device *dev, static const struct iommu_ops exynos_iommu_ops = { .identity_domain = &exynos_identity_domain, - .domain_alloc = exynos_iommu_domain_alloc, + .domain_alloc_paging = exynos_iommu_domain_alloc_paging, .device_group = generic_device_group, .probe_device = exynos_iommu_probe_device, .release_device = exynos_iommu_release_device, diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c index 04830d3931d23..eaabae7615776 100644 --- a/drivers/iommu/ipmmu-vmsa.c +++ b/drivers/iommu/ipmmu-vmsa.c @@ -563,13 +563,10 @@ static irqreturn_t ipmmu_irq(int irq, void *dev) * IOMMU Operations */ -static struct iommu_domain *ipmmu_domain_alloc(unsigned type) +static struct iommu_domain *ipmmu_domain_alloc_paging(struct device *dev) { struct ipmmu_vmsa_domain *domain; - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) - return NULL; - domain = kzalloc(sizeof(*domain), GFP_KERNEL); if (!domain) return NULL; @@ -892,7 +889,7 @@ static struct iommu_group *ipmmu_find_group(struct device *dev) static const struct iommu_ops ipmmu_ops = { .identity_domain = &ipmmu_iommu_identity_domain, - .domain_alloc = ipmmu_domain_alloc, + .domain_alloc_paging = ipmmu_domain_alloc_paging, .probe_device = ipmmu_probe_device, .release_device = ipmmu_release_device, .probe_finalize = ipmmu_probe_finalize, diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 06192aff5eb8e..f7e2b9cd39a2a 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -688,13 +688,10 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom, return 0; } -static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type) +static struct iommu_domain *mtk_iommu_domain_alloc_paging(struct device *dev) { struct mtk_iommu_domain *dom; - if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - dom = kzalloc(sizeof(*dom), GFP_KERNEL); if (!dom) return NULL; @@ -1018,7 +1015,7 @@ static void mtk_iommu_get_resv_regions(struct device *dev, static const struct iommu_ops mtk_iommu_ops = { .identity_domain = &mtk_iommu_identity_domain, - .domain_alloc = mtk_iommu_domain_alloc, + .domain_alloc_paging = mtk_iommu_domain_alloc_paging, .probe_device = mtk_iommu_probe_device, .release_device = mtk_iommu_release_device, .device_group = mtk_iommu_device_group, diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index b6f1ba01a341d..527a4395a1b40 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -1043,13 +1043,10 @@ static int rk_iommu_attach_device(struct iommu_domain *domain, return ret; } -static struct iommu_domain *rk_iommu_domain_alloc(unsigned type) +static struct iommu_domain *rk_iommu_domain_alloc_paging(struct device *dev) { struct rk_iommu_domain *rk_domain; - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) - return NULL; - if (!dma_dev) return NULL; @@ -1171,7 +1168,7 @@ static int rk_iommu_of_xlate(struct device *dev, static const struct iommu_ops rk_iommu_ops = { .identity_domain = &rk_identity_domain, - .domain_alloc = rk_iommu_domain_alloc, + .domain_alloc_paging = rk_iommu_domain_alloc_paging, .probe_device = rk_iommu_probe_device, .release_device = rk_iommu_release_device, .device_group = rk_iommu_device_group, diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c index c8e79a2d8b4c6..1a4bc7287de22 100644 --- a/drivers/iommu/sprd-iommu.c +++ b/drivers/iommu/sprd-iommu.c @@ -134,13 +134,10 @@ sprd_iommu_pgt_size(struct iommu_domain *domain) SPRD_IOMMU_PAGE_SHIFT) * sizeof(u32); } -static struct iommu_domain *sprd_iommu_domain_alloc(unsigned int domain_type) +static struct iommu_domain *sprd_iommu_domain_alloc_paging(struct device *dev) { struct sprd_iommu_domain *dom; - if (domain_type != IOMMU_DOMAIN_DMA && domain_type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - dom = kzalloc(sizeof(*dom), GFP_KERNEL); if (!dom) return NULL; @@ -421,7 +418,7 @@ static int sprd_iommu_of_xlate(struct device *dev, struct of_phandle_args *args) static const struct iommu_ops sprd_iommu_ops = { - .domain_alloc = sprd_iommu_domain_alloc, + .domain_alloc_paging = sprd_iommu_domain_alloc_paging, .probe_device = sprd_iommu_probe_device, .device_group = sprd_iommu_device_group, .of_xlate = sprd_iommu_of_xlate, diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c index 2a61b50c97f83..85ce3dc0bdd59 100644 --- a/drivers/iommu/sun50i-iommu.c +++ b/drivers/iommu/sun50i-iommu.c @@ -668,14 +668,11 @@ static phys_addr_t sun50i_iommu_iova_to_phys(struct iommu_domain *domain, sun50i_iova_get_page_offset(iova); } -static struct iommu_domain *sun50i_iommu_domain_alloc(unsigned type) +static struct iommu_domain * +sun50i_iommu_domain_alloc_paging(struct device *dev) { struct sun50i_iommu_domain *sun50i_domain; - if (type != IOMMU_DOMAIN_DMA && - type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - sun50i_domain = kzalloc(sizeof(*sun50i_domain), GFP_KERNEL); if (!sun50i_domain) return NULL; @@ -841,7 +838,7 @@ static const struct iommu_ops sun50i_iommu_ops = { .identity_domain = &sun50i_iommu_identity_domain, .pgsize_bitmap = SZ_4K, .device_group = sun50i_iommu_device_group, - .domain_alloc = sun50i_iommu_domain_alloc, + .domain_alloc_paging = sun50i_iommu_domain_alloc_paging, .of_xlate = sun50i_iommu_of_xlate, .probe_device = sun50i_iommu_probe_device, .default_domain_ops = &(const struct iommu_domain_ops) { diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c index b91ad1b5a20d3..1764a63347b04 100644 --- a/drivers/iommu/tegra-smmu.c +++ b/drivers/iommu/tegra-smmu.c @@ -272,13 +272,10 @@ static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id) clear_bit(id, smmu->asids); } -static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type) +static struct iommu_domain *tegra_smmu_domain_alloc_paging(struct device *dev) { struct tegra_smmu_as *as; - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) - return NULL; - as = kzalloc(sizeof(*as), GFP_KERNEL); if (!as) return NULL; @@ -991,7 +988,7 @@ static int tegra_smmu_def_domain_type(struct device *dev) static const struct iommu_ops tegra_smmu_ops = { .identity_domain = &tegra_smmu_identity_domain, .def_domain_type = &tegra_smmu_def_domain_type, - .domain_alloc = tegra_smmu_domain_alloc, + .domain_alloc_paging = tegra_smmu_domain_alloc_paging, .probe_device = tegra_smmu_probe_device, .device_group = tegra_smmu_device_group, .of_xlate = tegra_smmu_of_xlate, From ba2566ac1ff0378e8e9406dc79a45714a3c153c7 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 13 Sep 2023 10:43:57 -0300 Subject: [PATCH 426/548] iommu: Convert remaining simple drivers to domain_alloc_paging() These drivers don't support IOMMU_DOMAIN_DMA, so this commit effectively allows them to support that mode. The prior work to require default_domains makes this safe because every one of these drivers is either compilation incompatible with dma-iommu.c, or already establishing a default_domain. In both cases alloc_domain() will never be called with IOMMU_DOMAIN_DMA for these drivers so it is safe to drop the test. Removing these tests clarifies that the domain allocation path is only about the functionality of a paging domain and has nothing to do with policy of how the paging domain is used for UNMANAGED/DMA/DMA_FQ. Tested-by: Niklas Schnelle Tested-by: Steven Price Tested-by: Marek Szyprowski Tested-by: Nicolin Chen Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/24-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 4efd98d41ea71835f4d291176ff1fd0a1803dfd5) Signed-off-by: Koba Ko --- drivers/iommu/msm_iommu.c | 7 ++----- drivers/iommu/mtk_iommu_v1.c | 7 ++----- drivers/iommu/omap-iommu.c | 7 ++----- drivers/iommu/s390-iommu.c | 7 ++----- 4 files changed, 8 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c index 26ed81cfeee89..a163cee0b7242 100644 --- a/drivers/iommu/msm_iommu.c +++ b/drivers/iommu/msm_iommu.c @@ -302,13 +302,10 @@ static void __program_context(void __iomem *base, int ctx, SET_M(base, ctx, 1); } -static struct iommu_domain *msm_iommu_domain_alloc(unsigned type) +static struct iommu_domain *msm_iommu_domain_alloc_paging(struct device *dev) { struct msm_priv *priv; - if (type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - priv = kzalloc(sizeof(*priv), GFP_KERNEL); if (!priv) goto fail_nomem; @@ -691,7 +688,7 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id) static struct iommu_ops msm_iommu_ops = { .identity_domain = &msm_iommu_identity_domain, - .domain_alloc = msm_iommu_domain_alloc, + .domain_alloc_paging = msm_iommu_domain_alloc_paging, .probe_device = msm_iommu_probe_device, .device_group = generic_device_group, .pgsize_bitmap = MSM_IOMMU_PGSIZES, diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index 0e6b2bb4955de..574c6e38cfb3d 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -270,13 +270,10 @@ static int mtk_iommu_v1_domain_finalise(struct mtk_iommu_v1_data *data) return 0; } -static struct iommu_domain *mtk_iommu_v1_domain_alloc(unsigned type) +static struct iommu_domain *mtk_iommu_v1_domain_alloc_paging(struct device *dev) { struct mtk_iommu_v1_domain *dom; - if (type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - dom = kzalloc(sizeof(*dom), GFP_KERNEL); if (!dom) return NULL; @@ -585,7 +582,7 @@ static int mtk_iommu_v1_hw_init(const struct mtk_iommu_v1_data *data) static const struct iommu_ops mtk_iommu_v1_ops = { .identity_domain = &mtk_iommu_v1_identity_domain, - .domain_alloc = mtk_iommu_v1_domain_alloc, + .domain_alloc_paging = mtk_iommu_v1_domain_alloc_paging, .probe_device = mtk_iommu_v1_probe_device, .probe_finalize = mtk_iommu_v1_probe_finalize, .release_device = mtk_iommu_v1_release_device, diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index 34340ef15241b..fcf99bd195b32 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -1580,13 +1580,10 @@ static struct iommu_domain omap_iommu_identity_domain = { .ops = &omap_iommu_identity_ops, }; -static struct iommu_domain *omap_iommu_domain_alloc(unsigned type) +static struct iommu_domain *omap_iommu_domain_alloc_paging(struct device *dev) { struct omap_iommu_domain *omap_domain; - if (type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - omap_domain = kzalloc(sizeof(*omap_domain), GFP_KERNEL); if (!omap_domain) return NULL; @@ -1748,7 +1745,7 @@ static struct iommu_group *omap_iommu_device_group(struct device *dev) static const struct iommu_ops omap_iommu_ops = { .identity_domain = &omap_iommu_identity_domain, - .domain_alloc = omap_iommu_domain_alloc, + .domain_alloc_paging = omap_iommu_domain_alloc_paging, .probe_device = omap_iommu_probe_device, .release_device = omap_iommu_release_device, .device_group = omap_iommu_device_group, diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index f0c867c57a5b9..5695ad71d60e2 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -39,13 +39,10 @@ static bool s390_iommu_capable(struct device *dev, enum iommu_cap cap) } } -static struct iommu_domain *s390_domain_alloc(unsigned domain_type) +static struct iommu_domain *s390_domain_alloc_paging(struct device *dev) { struct s390_domain *s390_domain; - if (domain_type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - s390_domain = kzalloc(sizeof(*s390_domain), GFP_KERNEL); if (!s390_domain) return NULL; @@ -447,7 +444,7 @@ void zpci_destroy_iommu(struct zpci_dev *zdev) static const struct iommu_ops s390_iommu_ops = { .default_domain = &s390_iommu_platform_domain, .capable = s390_iommu_capable, - .domain_alloc = s390_domain_alloc, + .domain_alloc_paging = s390_domain_alloc_paging, .probe_device = s390_iommu_probe_device, .release_device = s390_iommu_release_device, .device_group = generic_device_group, From 3615c79ceb647695ce3e8451a6bfbcb384972de3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Thu, 5 Oct 2023 10:35:11 -0300 Subject: [PATCH 427/548] powerpc/iommu: Do not do platform domain attach atctions after probe POWER throws a splat at boot, it looks like the DMA ops were probably changed while a driver was attached. Something is still weird about how power sequences its bootup. Previously this was hidden since the core iommu code did nothing during probe, now it calls spapr_tce_platform_iommu_attach_dev(). Make spapr_tce_platform_iommu_attach_dev() do nothing on the probe time call like it did before. WARNING: CPU: 0 PID: 8 at arch/powerpc/kernel/iommu.c:407 __iommu_free+0x1e4/0x1f0 Modules linked in: sd_mod t10_pi crc64_rocksoft crc64 sg ibmvfc mlx5_core(+) scsi_transport_fc ibmveth mlxfw psample dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse CPU: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.6.0-rc3-next-20230929-auto #1 Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.30 (NH1030_062) hv:phyp pSeries Workqueue: events work_for_cpu_fn NIP: c00000000005f6d4 LR: c00000000005f6d0 CTR: 00000000005ca81c REGS: c000000003a27890 TRAP: 0700 Not tainted (6.6.0-rc3-next-20230929-auto) MSR: 800000000282b033 CR: 48000824 XER: 00000008 CFAR: c00000000020f738 IRQMASK: 0 GPR00: c00000000005f6d0 c000000003a27b30 c000000001481800 000000000000017 GPR04: 00000000ffff7fff c000000003a27950 c000000003a27948 0000000000000027 GPR08: c000000c18c07c10 0000000000000001 0000000000000027 c000000002ac8a08 GPR12: 0000000000000000 c000000002ff0000 c00000000019cc88 c000000003042300 GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000003071ab0 GPR20: c00000000349f80d c000000003215440 c000000003215480 61c8864680b583eb GPR24: 0000000000000000 000000007fffffff 0800000020000000 0000000000000010 GPR28: 0000000000020000 0000800000020000 c00000000c5dc800 c00000000c5dc880 NIP [c00000000005f6d4] __iommu_free+0x1e4/0x1f0 LR [c00000000005f6d0] __iommu_free+0x1e0/0x1f0 Call Trace: [c000000003a27b30] [c00000000005f6d0] __iommu_free+0x1e0/0x1f0 (unreliable) [c000000003a27bc0] [c00000000005f848] iommu_free+0x28/0x70 [c000000003a27bf0] [c000000000061518] iommu_free_coherent+0x68/0xa0 [c000000003a27c20] [c00000000005e8d4] dma_iommu_free_coherent+0x24/0x40 [c000000003a27c40] [c00000000024698c] dma_free_attrs+0x10c/0x140 [c000000003a27c90] [c008000000dcb8d4] mlx5_cmd_cleanup+0x5c/0x90 [mlx5_core] [c000000003a27cc0] [c008000000dc45a0] mlx5_mdev_uninit+0xc8/0x100 [mlx5_core] [c000000003a27d00] [c008000000dc4ac4] probe_one+0x3ec/0x530 [mlx5_core] [c000000003a27d90] [c0000000008c5edc] local_pci_probe+0x6c/0x110 [c000000003a27e10] [c000000000189c98] work_for_cpu_fn+0x38/0x60 [c000000003a27e40] [c00000000018d1d0] process_scheduled_works+0x230/0x4f0 [c000000003a27f10] [c00000000018ff14] worker_thread+0x1e4/0x500 [c000000003a27f90] [c00000000019cdb8] kthread+0x138/0x140 [c000000003a27fe0] [c00000000000df98] start_kernel_thread+0x14/0x18 Code: 481b004d 60000000 e89e0028 3c62ffe0 3863dd20 481b0039 60000000 e89e0038 3c62ffe0 3863dd38 481b0025 60000000 <0fe00000> 4bffff20 60000000 3c4c0142 ---[ end trace 0000000000000000 ]--- iommu_free: invalid entry entry = 0x8000000203d0 dma_addr = 0x8000000203d0000 Table = 0xc00000000c5dc800 bus# = 0x1 size = 0x20000 startOff = 0x800000000000 index = 0x70200016 Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") Reported-by: Tasmiya Nalatwad Link: https://lore.kernel.org/r/d06cee81-c47f-9d62-dfc6-4c77b60058db@linux.vnet.ibm.com Tested-by: Tasmiya Nalatwad Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/0-v1-2b52423411b9+164fc-iommu_ppc_defdomain_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit a8ca9fc9134c1a43e6d4db7ff59496bbd7075def) Signed-off-by: Koba Ko --- arch/powerpc/kernel/iommu.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index b1e38b78ed52c..47b2d86e998a1 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -1280,13 +1280,19 @@ struct iommu_table_group_ops spapr_tce_table_group_ops = { /* * A simple iommu_ops to allow less cruft in generic VFIO code. */ -static int spapr_tce_platform_iommu_attach_dev(struct iommu_domain *dom, - struct device *dev) +static int +spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain, + struct device *dev) { + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct iommu_group *grp = iommu_group_get(dev); struct iommu_table_group *table_group; int ret = -EINVAL; + /* At first attach the ownership is already set */ + if (!domain) + return 0; + if (!grp) return -ENODEV; From 6f7c851248ffe89a5aae9bfc5a7c1c752d519f7f Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Jan 2024 12:14:54 -0400 Subject: [PATCH 428/548] drm/tegra: Do not assume that a NULL domain means no DMA IOMMU Previously with tegra-smmu, even with CONFIG_IOMMU_DMA, the default domain could have been left as NULL. The NULL domain is specially recognized by host1x_client_iommu_attach() as meaning it is not the DMA domain and should be replaced with the special shared domain. This happened prior to the below commit because tegra-smmu was using the NULL domain to mean IDENTITY. Now that the domain is properly labled the test in DRM doesn't see NULL. Check for IDENTITY as well to enable the special domains. Fixes: c8cc2655cc6c ("iommu/tegra-smmu: Implement an IDENTITY domain") Reported-by: diogo.ivo@tecnico.ulisboa.pt Closes: https://lore.kernel.org/all/bbmhcoghrprmbdibnjum6lefix2eoquxrde7wyqeulm4xabmlm@b6jy32saugqh/ Tested-by: diogo.ivo@tecnico.ulisboa.pt Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/0-v1-3049f92c4812+16691-host1x_def_dom_fix_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit fae6e669cdc52fdbb843e7fb1b8419642b6b8cba) Signed-off-by: Koba Ko --- drivers/gpu/drm/tegra/drm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 373bcd79257e0..03d1c76aec2d3 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -960,7 +960,8 @@ int host1x_client_iommu_attach(struct host1x_client *client) * not the shared IOMMU domain, don't try to attach it to a different * domain. This allows using the IOMMU-backed DMA API. */ - if (domain && domain != tegra->domain) + if (domain && domain->type != IOMMU_DOMAIN_IDENTITY && + domain != tegra->domain) return 0; if (tegra->domain) { From 025c8109aedbeb2bea05d1c9073c0b59a58f1a31 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 3 Oct 2023 13:52:36 -0300 Subject: [PATCH 429/548] iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled msm_iommu platforms do not select either CONFIG_IOMMU_DMA or CONFIG_ARM_DMA_USE_IOMMU so they create a IOMMU_DOMAIN_DMA domain by default and never populate it. This acts like a BLOCKED domain and breaks the GPU driver on the platform. Detect this and force use of IDENTITY instead. Fixes: 98ac73f99bc4 ("iommu: Require a default_domain for all iommu drivers") Reported-by: Dmitry Baryshkov Link: https://lore.kernel.org/linux-iommu/CAA8EJprz7VVmBG68U9zLuqPd0UdSRHYoLDJSP6tCj6H6qanuTQ@mail.gmail.com/ Signed-off-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Tested-by: Dmitry Baryshkov Link: https://lore.kernel.org/r/0-v1-20700abdf239+19c-iommu_no_dma_iommu_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 0f6a90436a5771fc9f6ca0d1e64f7549219e6c3c) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index d76e4419279cf..3bb911a704e6a 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1908,6 +1908,18 @@ static int iommu_get_default_domain_type(struct iommu_group *group, } } + /* + * If the common dma ops are not selected in kconfig then we cannot use + * IOMMU_DOMAIN_DMA at all. Force IDENTITY if nothing else has been + * selected. + */ + if (!IS_ENABLED(CONFIG_IOMMU_DMA)) { + if (WARN_ON(driver_type == IOMMU_DOMAIN_DMA)) + return -1; + if (!driver_type) + driver_type = IOMMU_DOMAIN_IDENTITY; + } + if (untrusted) { if (driver_type && driver_type != IOMMU_DOMAIN_DMA) { dev_err_ratelimited( From 3a4e9019528a8b43e3bf4a73aeb5ee634c906ac3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Jan 2024 12:12:53 -0400 Subject: [PATCH 430/548] iommu: Allow ops->default_domain to work when !CONFIG_IOMMU_DMA The ops->default_domain flow used a 0 req_type to select the default domain and this was enforced by iommu_group_alloc_default_domain(). When !CONFIG_IOMMU_DMA started forcing the old ARM32 drivers into IDENTITY it also overroad the 0 req_type of the ops->default_domain drivers to IDENTITY which ends up causing failures during device probe. Make iommu_group_alloc_default_domain() accept a req_type that matches the ops->default_domain and have iommu_group_alloc_default_domain() generate a req_type that matches the default_domain. This way the req_type always describes what kind of domain should be attached and ops->default_domain overrides all other mechanisms to choose the default domain. Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") Fixes: 0f6a90436a57 ("iommu: Do not use IOMMU_DOMAIN_DMA if CONFIG_IOMMU_DMA is not enabled") Reported-by: Ovidiu Panait Closes: https://lore.kernel.org/linux-iommu/20240123165829.630276-1-ovidiu.panait@windriver.com/ Reported-by: Shivaprasad G Bhat Closes: https://lore.kernel.org/linux-iommu/170618452753.3805.4425669653666211728.stgit@ltcd48-lp2.aus.stglab.ibm.com/ Tested-by: Ovidiu Panait Tested-by: Shivaprasad G Bhat Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/0-v1-755bd21c4a64+525b8-iommu_def_dom_fix_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 83b3836bf83f09beea5f592b126cfdd1bc921e48) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3bb911a704e6a..c13eb588bcc9e 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1772,7 +1772,7 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) * domain. Do not use in new drivers. */ if (ops->default_domain) { - if (req_type) + if (req_type != ops->default_domain->type) return ERR_PTR(-EINVAL); return ops->default_domain; } @@ -1844,10 +1844,18 @@ static int iommu_get_def_domain_type(struct iommu_group *group, const struct iommu_ops *ops = group_iommu_ops(group); int type; - if (!ops->def_domain_type) - return cur_type; - - type = ops->def_domain_type(dev); + if (ops->default_domain) { + /* + * Drivers that declare a global static default_domain will + * always choose that. + */ + type = ops->default_domain->type; + } else { + if (ops->def_domain_type) + type = ops->def_domain_type(dev); + else + return cur_type; + } if (!type || cur_type == type) return cur_type; if (!cur_type) From 87465faf22fb2cdfbb69cf786b3fdde16ab02c97 Mon Sep 17 00:00:00 2001 From: Vasant Hegde Date: Tue, 23 Apr 2024 11:17:25 +0000 Subject: [PATCH 431/548] iommu/amd: Enhance def_domain_type to handle untrusted device Previously, IOMMU core layer was forcing IOMMU_DOMAIN_DMA domain for untrusted device. This always took precedence over driver's def_domain_type(). Commit 59ddce4418da ("iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type()") changed the behaviour. Current code calls def_domain_type() but if it doesn't return IOMMU_DOMAIN_DMA for untrusted device it throws error. This results in IOMMU group (and potentially IOMMU itself) in undetermined state. This patch adds untrusted check in AMD IOMMU driver code. So that it allows eGPUs behind Thunderbolt work again. Fine tuning amd_iommu_def_domain_type() will be done later. Reported-by: Eric Wagner Link: https://lore.kernel.org/linux-iommu/CAHudX3zLH6CsRmLE-yb+gRjhh-v4bU5_1jW_xCcxOo_oUUZKYg@mail.gmail.com Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3182 Fixes: 59ddce4418da ("iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type()") Cc: Robin Murphy Cc: Jason Gunthorpe Cc: stable@kernel.org # v6.7+ Signed-off-by: Vasant Hegde Link: https://lore.kernel.org/r/20240423111725.5813-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel (cherry picked from commit 0f91d0795741c12cee200667648669a91b568735) Signed-off-by: Koba Ko --- drivers/iommu/amd/iommu.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 23cfb98fe90a7..97aec5c21203a 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2460,6 +2460,10 @@ static int amd_iommu_def_domain_type(struct device *dev) if (!dev_data) return 0; + /* Always use DMA domain for untrusted device */ + if (dev_is_pci(dev) && to_pci_dev(dev)->untrusted) + return IOMMU_DOMAIN_DMA; + /* * Do not identity map IOMMUv2 capable devices when: * - memory encryption is active, because some of those devices From 8f761b44f2f2023733918eafeb8ca3734a3322a1 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:12 -0400 Subject: [PATCH 432/548] iommu/arm-smmu-v3: Make STE programming independent of the callers As the comment in arm_smmu_write_strtab_ent() explains, this routine has been limited to only work correctly in certain scenarios that the caller must ensure. Generally the caller must put the STE into ABORT or BYPASS before attempting to program it to something else. The iommu core APIs would ideally expect the driver to do a hitless change of iommu_domain in a number of cases: - RESV_DIRECT support wants IDENTITY -> DMA -> IDENTITY to be hitless for the RESV ranges - PASID upgrade has IDENTIY on the RID with no PASID then a PASID paging domain installed. The RID should not be impacted - PASID downgrade has IDENTIY on the RID and all PASID's removed. The RID should not be impacted - RID does PAGING -> BLOCKING with active PASID, PASID's should not be impacted - NESTING -> NESTING for carrying all the above hitless cases in a VM into the hypervisor. To comprehensively emulate the HW in a VM we should assume the VM OS is running logic like this and expecting hitless updates to be relayed to real HW. For CD updates arm_smmu_write_ctx_desc() has a similar comment explaining how limited it is, and the driver does have a need for hitless CD updates: - SMMUv3 BTM S1 ASID re-label - SVA mm release should change the CD to answert not-present to all requests without allowing logging (EPD0) The next patches/series are going to start removing some of this logic from the callers, and add more complex state combinations than currently. At the end everything that can be hitless will be hitless, including all of the above. Introduce arm_smmu_write_ste() which will run through the multi-qword programming sequence to avoid creating an incoherent 'torn' STE in the HW caches. It automatically detects which of two algorithms to use: 1) The disruptive V=0 update described in the spec which disrupts the entry and does three syncs to make the change: - Write V=0 to QWORD 0 - Write the entire STE except QWORD 0 - Write QWORD 0 2) A hitless update algorithm that follows the same rational that the driver already uses. It is safe to change IGNORED bits that HW doesn't use: - Write the target value into all currently unused bits - Write a single QWORD, this makes the new STE live atomically - Ensure now unused bits are 0 The detection of which path to use and the implementation of the hitless update rely on a "used bitmask" describing what bits the HW is actually using based on the V/CFG/etc bits. This flows from the spec language, typically indicated as IGNORED. Knowing which bits the HW is using we can update the bits it does not use and then compute how many QWORDS need to be changed. If only one qword needs to be updated the hitless algorithm is possible. Later patches will include CD updates in this mechanism so make the implementation generic using a struct arm_smmu_entry_writer and struct arm_smmu_entry_writer_ops to abstract the differences between STE and CD to be plugged in. At this point it generates the same sequence of updates as the current code, except that zeroing the VMID on entry to BYPASS/ABORT will do an extra sync (this seems to be an existing bug). Going forward this will use a V=0 transition instead of cycling through ABORT if a hitfull change is required. This seems more appropriate as ABORT will fail DMAs without any logging, but dropping a DMA due to transient V=0 is probably signaling a bug, so the C_BAD_STE is valuable. Add STRTAB_STE_1_SHCFG_INCOMING to s2_cfg, this was editing the STE in place and subtly inherited the value of data[1] from abort/bypass. Signed-off-by: Michael Shavit Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/1-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 7da51af9125c624318c8099de13c5ddefd47e9e8) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 275 +++++++++++++++----- 1 file changed, 211 insertions(+), 64 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 486927683be32..bb04ed79819fd 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -48,6 +48,9 @@ enum arm_smmu_msi_index { ARM_SMMU_MAX_MSIS, }; +static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, + ioasid_t sid); + static phys_addr_t arm_smmu_msi_cfg[ARM_SMMU_MAX_MSIS][3] = { [EVTQ_MSI_INDEX] = { ARM_SMMU_EVTQ_IRQ_CFG0, @@ -971,6 +974,199 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd); } +/* + * Based on the value of ent report which bits of the STE the HW will access. It + * would be nice if this was complete according to the spec, but minimally it + * has to capture the bits this driver uses. + */ +static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent, + struct arm_smmu_ste *used_bits) +{ + unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])); + + used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V); + if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V))) + return; + + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG); + + /* S1 translates */ + if (cfg & BIT(0)) { + used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT | + STRTAB_STE_0_S1CTXPTR_MASK | + STRTAB_STE_0_S1CDMAX); + used_bits->data[1] |= + cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR | + STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH | + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW | + STRTAB_STE_1_EATS); + used_bits->data[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID); + } + + /* S2 translates */ + if (cfg & BIT(1)) { + used_bits->data[1] |= + cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG); + used_bits->data[2] |= + cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR | + STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI | + STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R); + used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK); + } + + if (cfg == STRTAB_STE_0_CFG_BYPASS) + used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG); +} + +/* + * Figure out if we can do a hitless update of entry to become target. Returns a + * bit mask where 1 indicates that qword needs to be set disruptively. + * unused_update is an intermediate value of entry that has unused bits set to + * their new values. + */ +static u8 arm_smmu_entry_qword_diff(const struct arm_smmu_ste *entry, + const struct arm_smmu_ste *target, + struct arm_smmu_ste *unused_update) +{ + struct arm_smmu_ste target_used = {}; + struct arm_smmu_ste cur_used = {}; + u8 used_qword_diff = 0; + unsigned int i; + + arm_smmu_get_ste_used(entry, &cur_used); + arm_smmu_get_ste_used(target, &target_used); + + for (i = 0; i != ARRAY_SIZE(target_used.data); i++) { + /* + * Check that masks are up to date, the make functions are not + * allowed to set a bit to 1 if the used function doesn't say it + * is used. + */ + WARN_ON_ONCE(target->data[i] & ~target_used.data[i]); + + /* Bits can change because they are not currently being used */ + unused_update->data[i] = (entry->data[i] & cur_used.data[i]) | + (target->data[i] & ~cur_used.data[i]); + /* + * Each bit indicates that a used bit in a qword needs to be + * changed after unused_update is applied. + */ + if ((unused_update->data[i] & target_used.data[i]) != + target->data[i]) + used_qword_diff |= 1 << i; + } + return used_qword_diff; +} + +static bool entry_set(struct arm_smmu_device *smmu, ioasid_t sid, + struct arm_smmu_ste *entry, + const struct arm_smmu_ste *target, unsigned int start, + unsigned int len) +{ + bool changed = false; + unsigned int i; + + for (i = start; len != 0; len--, i++) { + if (entry->data[i] != target->data[i]) { + WRITE_ONCE(entry->data[i], target->data[i]); + changed = true; + } + } + + if (changed) + arm_smmu_sync_ste_for_sid(smmu, sid); + return changed; +} + +/* + * Update the STE/CD to the target configuration. The transition from the + * current entry to the target entry takes place over multiple steps that + * attempts to make the transition hitless if possible. This function takes care + * not to create a situation where the HW can perceive a corrupted entry. HW is + * only required to have a 64 bit atomicity with stores from the CPU, while + * entries are many 64 bit values big. + * + * The difference between the current value and the target value is analyzed to + * determine which of three updates are required - disruptive, hitless or no + * change. + * + * In the most general disruptive case we can make any update in three steps: + * - Disrupting the entry (V=0) + * - Fill now unused qwords, execpt qword 0 which contains V + * - Make qword 0 have the final value and valid (V=1) with a single 64 + * bit store + * + * However this disrupts the HW while it is happening. There are several + * interesting cases where a STE/CD can be updated without disturbing the HW + * because only a small number of bits are changing (S1DSS, CONFIG, etc) or + * because the used bits don't intersect. We can detect this by calculating how + * many 64 bit values need update after adjusting the unused bits and skip the + * V=0 process. This relies on the IGNORED behavior described in the + * specification. + */ +static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, + struct arm_smmu_ste *entry, + const struct arm_smmu_ste *target) +{ + unsigned int num_entry_qwords = ARRAY_SIZE(target->data); + struct arm_smmu_device *smmu = master->smmu; + struct arm_smmu_ste unused_update; + u8 used_qword_diff; + + used_qword_diff = + arm_smmu_entry_qword_diff(entry, target, &unused_update); + if (hweight8(used_qword_diff) == 1) { + /* + * Only one qword needs its used bits to be changed. This is a + * hitless update, update all bits the current STE is ignoring + * to their new values, then update a single "critical qword" to + * change the STE and finally 0 out any bits that are now unused + * in the target configuration. + */ + unsigned int critical_qword_index = ffs(used_qword_diff) - 1; + + /* + * Skip writing unused bits in the critical qword since we'll be + * writing it in the next step anyways. This can save a sync + * when the only change is in that qword. + */ + unused_update.data[critical_qword_index] = + entry->data[critical_qword_index]; + entry_set(smmu, sid, entry, &unused_update, 0, num_entry_qwords); + entry_set(smmu, sid, entry, target, critical_qword_index, 1); + entry_set(smmu, sid, entry, target, 0, num_entry_qwords); + } else if (used_qword_diff) { + /* + * At least two qwords need their inuse bits to be changed. This + * requires a breaking update, zero the V bit, write all qwords + * but 0, then set qword 0 + */ + unused_update.data[0] = entry->data[0] & (~STRTAB_STE_0_V); + entry_set(smmu, sid, entry, &unused_update, 0, 1); + entry_set(smmu, sid, entry, target, 1, num_entry_qwords - 1); + entry_set(smmu, sid, entry, target, 0, 1); + } else { + /* + * No inuse bit changed. Sanity check that all unused bits are 0 + * in the entry. The target was already sanity checked by + * compute_qword_diff(). + */ + WARN_ON_ONCE( + entry_set(smmu, sid, entry, target, 0, num_entry_qwords)); + } + + /* It's likely that we'll want to use the new STE soon */ + if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) { + struct arm_smmu_cmdq_ent + prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG, + .prefetch = { + .sid = sid, + } }; + + arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd); + } +} + static void arm_smmu_sync_cd(struct arm_smmu_master *master, int ssid, bool leaf) { @@ -1251,34 +1447,12 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid) static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, struct arm_smmu_ste *dst) { - /* - * This is hideously complicated, but we only really care about - * three cases at the moment: - * - * 1. Invalid (all zero) -> bypass/fault (init) - * 2. Bypass/fault -> translation/bypass (attach) - * 3. Translation/bypass -> bypass/fault (detach) - * - * Given that we can't update the STE atomically and the SMMU - * doesn't read the thing in a defined order, that leaves us - * with the following maintenance requirements: - * - * 1. Update Config, return (init time STEs aren't live) - * 2. Write everything apart from dword 0, sync, write dword 0, sync - * 3. Update Config, sync - */ - u64 val = le64_to_cpu(dst->data[0]); - bool ste_live = false; + u64 val; struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_ctx_desc_cfg *cd_table = NULL; struct arm_smmu_s2_cfg *s2_cfg = NULL; struct arm_smmu_domain *smmu_domain = master->domain; - struct arm_smmu_cmdq_ent prefetch_cmd = { - .opcode = CMDQ_OP_PREFETCH_CFG, - .prefetch = { - .sid = sid, - }, - }; + struct arm_smmu_ste target = {}; if (smmu_domain) { switch (smmu_domain->stage) { @@ -1294,22 +1468,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, } } - if (val & STRTAB_STE_0_V) { - switch (FIELD_GET(STRTAB_STE_0_CFG, val)) { - case STRTAB_STE_0_CFG_BYPASS: - break; - case STRTAB_STE_0_CFG_S1_TRANS: - case STRTAB_STE_0_CFG_S2_TRANS: - ste_live = true; - break; - case STRTAB_STE_0_CFG_ABORT: - BUG_ON(!disable_bypass); - break; - default: - BUG(); /* STE corruption */ - } - } - /* Nuke the existing STE_0 value, as we're going to rewrite it */ val = STRTAB_STE_0_V; @@ -1320,16 +1478,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, else val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); - dst->data[0] = cpu_to_le64(val); - dst->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, + target.data[0] = cpu_to_le64(val); + target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); - dst->data[2] = 0; /* Nuke the VMID */ - /* - * The SMMU can perform negative caching, so we must sync - * the STE regardless of whether the old value was live. - */ - if (smmu) - arm_smmu_sync_ste_for_sid(smmu, sid); + target.data[2] = 0; /* Nuke the VMID */ + arm_smmu_write_ste(master, sid, dst, &target); return; } @@ -1337,8 +1490,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ? STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1; - BUG_ON(ste_live); - dst->data[1] = cpu_to_le64( + target.data[1] = cpu_to_le64( FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | @@ -1347,7 +1499,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, if (smmu->features & ARM_SMMU_FEAT_STALLS && !master->stall_enabled) - dst->data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); + target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) | @@ -1356,8 +1508,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, } if (s2_cfg) { - BUG_ON(ste_live); - dst->data[2] = cpu_to_le64( + target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, + STRTAB_STE_1_SHCFG_INCOMING)); + target.data[2] = cpu_to_le64( FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) | FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) | #ifdef __BIG_ENDIAN @@ -1366,23 +1519,17 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2R); - dst->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); + target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS); } if (master->ats_enabled) - dst->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS, + target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS, STRTAB_STE_1_EATS_TRANS)); - arm_smmu_sync_ste_for_sid(smmu, sid); - /* See comment in arm_smmu_write_ctx_desc() */ - WRITE_ONCE(dst->data[0], cpu_to_le64(val)); - arm_smmu_sync_ste_for_sid(smmu, sid); - - /* It's likely that we'll want to use the new STE soon */ - if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) - arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd); + target.data[0] = cpu_to_le64(val); + arm_smmu_write_ste(master, sid, dst, &target); } static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab, From c0f840f40208ac59ae73a9172dd8c2f5f1aabe29 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:13 -0400 Subject: [PATCH 433/548] iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass This allows writing the flow of arm_smmu_write_strtab_ent() around abort and bypass domains more naturally. Note that the core code no longer supplies NULL domains, though there is still a flow in the driver that end up in arm_smmu_write_strtab_ent() with NULL. A later patch will remove it. Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes() and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now simply invoke arm_smmu_make_bypass_ste() directly. Rename arm_smmu_init_bypass_stes() to arm_smmu_init_initial_stes() to better reflect its purpose. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/2-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 7686aa5f8d61388eeaa64730363ffc8df20a481d) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 98 ++++++++++++--------- 1 file changed, 55 insertions(+), 43 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index bb04ed79819fd..ad5b09101b967 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1444,6 +1444,24 @@ static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid) arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd); } +static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) +{ + memset(target, 0, sizeof(*target)); + target->data[0] = cpu_to_le64( + STRTAB_STE_0_V | + FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT)); +} + +static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target) +{ + memset(target, 0, sizeof(*target)); + target->data[0] = cpu_to_le64( + STRTAB_STE_0_V | + FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS)); + target->data[1] = cpu_to_le64( + FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); +} + static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, struct arm_smmu_ste *dst) { @@ -1454,38 +1472,31 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, struct arm_smmu_domain *smmu_domain = master->domain; struct arm_smmu_ste target = {}; - if (smmu_domain) { - switch (smmu_domain->stage) { - case ARM_SMMU_DOMAIN_S1: - cd_table = &master->cd_table; - break; - case ARM_SMMU_DOMAIN_S2: - case ARM_SMMU_DOMAIN_NESTED: - s2_cfg = &smmu_domain->s2_cfg; - break; - default: - break; - } - } - - /* Nuke the existing STE_0 value, as we're going to rewrite it */ - val = STRTAB_STE_0_V; - - /* Bypass/fault */ - if (!smmu_domain || !(cd_table || s2_cfg)) { - if (!smmu_domain && disable_bypass) - val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT); + if (!smmu_domain) { + if (disable_bypass) + arm_smmu_make_abort_ste(&target); else - val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); + arm_smmu_make_bypass_ste(&target); + arm_smmu_write_ste(master, sid, dst, &target); + return; + } - target.data[0] = cpu_to_le64(val); - target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, - STRTAB_STE_1_SHCFG_INCOMING)); - target.data[2] = 0; /* Nuke the VMID */ + switch (smmu_domain->stage) { + case ARM_SMMU_DOMAIN_S1: + cd_table = &master->cd_table; + break; + case ARM_SMMU_DOMAIN_S2: + s2_cfg = &smmu_domain->s2_cfg; + break; + case ARM_SMMU_DOMAIN_BYPASS: + arm_smmu_make_bypass_ste(&target); arm_smmu_write_ste(master, sid, dst, &target); return; } + /* Nuke the existing STE_0 value, as we're going to rewrite it */ + val = STRTAB_STE_0_V; + if (cd_table) { u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ? STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1; @@ -1532,22 +1543,20 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, arm_smmu_write_ste(master, sid, dst, &target); } -static void arm_smmu_init_bypass_stes(struct arm_smmu_ste *strtab, - unsigned int nent, bool force) +/* + * This can safely directly manipulate the STE memory without a sync sequence + * because the STE table has not been installed in the SMMU yet. + */ +static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab, + unsigned int nent) { unsigned int i; - u64 val = STRTAB_STE_0_V; - - if (disable_bypass && !force) - val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT); - else - val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); for (i = 0; i < nent; ++i) { - strtab->data[0] = cpu_to_le64(val); - strtab->data[1] = cpu_to_le64(FIELD_PREP( - STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); - strtab->data[2] = 0; + if (disable_bypass) + arm_smmu_make_abort_ste(strtab); + else + arm_smmu_make_bypass_ste(strtab); strtab++; } } @@ -1575,7 +1584,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return -ENOMEM; } - arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false); + arm_smmu_init_initial_stes(desc->l2ptr, 1 << STRTAB_SPLIT); arm_smmu_write_strtab_l1_desc(strtab, desc); return 0; } @@ -3207,7 +3216,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu) reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits); cfg->strtab_base_cfg = reg; - arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false); + arm_smmu_init_initial_stes(strtab, cfg->num_l1_ents); return 0; } @@ -3918,7 +3927,6 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu) iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list); list_for_each_entry(e, &rmr_list, list) { - struct arm_smmu_ste *step; struct iommu_iort_rmr_data *rmr; int ret, i; @@ -3931,8 +3939,12 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu) continue; } - step = arm_smmu_get_step_for_sid(smmu, rmr->sids[i]); - arm_smmu_init_bypass_stes(step, 1, true); + /* + * STE table is not programmed to HW, see + * arm_smmu_initial_bypass_stes() + */ + arm_smmu_make_bypass_ste( + arm_smmu_get_step_for_sid(smmu, rmr->sids[i])); } } From 715947b064cdab2eb0562b4ec7841a985bf490c3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:14 -0400 Subject: [PATCH 434/548] iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions This is preparation to move the STE calculation higher up in to the call chain and remove arm_smmu_write_strtab_ent(). These new functions will be called directly from attach_dev. Reviewed-by: Moritz Fischer Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/3-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit efe15df08727d483bd247ff905a828f0de955de6) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 138 ++++++++++++-------- 1 file changed, 83 insertions(+), 55 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index ad5b09101b967..2aa0d022db73f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1462,13 +1462,89 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target) FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); } +static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, + struct arm_smmu_master *master) +{ + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; + struct arm_smmu_device *smmu = master->smmu; + + memset(target, 0, sizeof(*target)); + target->data[0] = cpu_to_le64( + STRTAB_STE_0_V | + FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) | + FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt) | + (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | + FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax)); + + target->data[1] = cpu_to_le64( + FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | + FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | + FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | + FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) | + ((smmu->features & ARM_SMMU_FEAT_STALLS && + !master->stall_enabled) ? + STRTAB_STE_1_S1STALLD : + 0) | + FIELD_PREP(STRTAB_STE_1_EATS, + master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); + + if (smmu->features & ARM_SMMU_FEAT_E2H) { + /* + * To support BTM the streamworld needs to match the + * configuration of the CPU so that the ASID broadcasts are + * properly matched. This means either S/NS-EL2-E2H (hypervisor) + * or NS-EL1 (guest). Since an SVA domain can be installed in a + * PASID this should always use a BTM compatible configuration + * if the HW supports it. + */ + target->data[1] |= cpu_to_le64( + FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_EL2)); + } else { + target->data[1] |= cpu_to_le64( + FIELD_PREP(STRTAB_STE_1_STRW, STRTAB_STE_1_STRW_NSEL1)); + + /* + * VMID 0 is reserved for stage-2 bypass EL1 STEs, see + * arm_smmu_domain_alloc_id() + */ + target->data[2] = + cpu_to_le64(FIELD_PREP(STRTAB_STE_2_S2VMID, 0)); + } +} + +static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, + struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain) +{ + struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg; + + memset(target, 0, sizeof(*target)); + target->data[0] = cpu_to_le64( + STRTAB_STE_0_V | + FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS)); + + target->data[1] = cpu_to_le64( + FIELD_PREP(STRTAB_STE_1_EATS, + master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) | + FIELD_PREP(STRTAB_STE_1_SHCFG, + STRTAB_STE_1_SHCFG_INCOMING)); + + target->data[2] = cpu_to_le64( + FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) | + FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) | + STRTAB_STE_2_S2AA64 | +#ifdef __BIG_ENDIAN + STRTAB_STE_2_S2ENDI | +#endif + STRTAB_STE_2_S2PTW | + STRTAB_STE_2_S2R); + + target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); +} + static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, struct arm_smmu_ste *dst) { - u64 val; - struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ctx_desc_cfg *cd_table = NULL; - struct arm_smmu_s2_cfg *s2_cfg = NULL; struct arm_smmu_domain *smmu_domain = master->domain; struct arm_smmu_ste target = {}; @@ -1483,63 +1559,15 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: - cd_table = &master->cd_table; + arm_smmu_make_cdtable_ste(&target, master); break; case ARM_SMMU_DOMAIN_S2: - s2_cfg = &smmu_domain->s2_cfg; + arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); break; case ARM_SMMU_DOMAIN_BYPASS: arm_smmu_make_bypass_ste(&target); - arm_smmu_write_ste(master, sid, dst, &target); - return; - } - - /* Nuke the existing STE_0 value, as we're going to rewrite it */ - val = STRTAB_STE_0_V; - - if (cd_table) { - u64 strw = smmu->features & ARM_SMMU_FEAT_E2H ? - STRTAB_STE_1_STRW_EL2 : STRTAB_STE_1_STRW_NSEL1; - - target.data[1] = cpu_to_le64( - FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | - FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | - FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | - FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) | - FIELD_PREP(STRTAB_STE_1_STRW, strw)); - - if (smmu->features & ARM_SMMU_FEAT_STALLS && - !master->stall_enabled) - target.data[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); - - val |= (cd_table->cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | - FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) | - FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax) | - FIELD_PREP(STRTAB_STE_0_S1FMT, cd_table->s1fmt); - } - - if (s2_cfg) { - target.data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, - STRTAB_STE_1_SHCFG_INCOMING)); - target.data[2] = cpu_to_le64( - FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) | - FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) | -#ifdef __BIG_ENDIAN - STRTAB_STE_2_S2ENDI | -#endif - STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2AA64 | - STRTAB_STE_2_S2R); - - target.data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); - - val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S2_TRANS); + break; } - - if (master->ats_enabled) - target.data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_EATS, - STRTAB_STE_1_EATS_TRANS)); - - target.data[0] = cpu_to_le64(val); arm_smmu_write_ste(master, sid, dst, &target); } From bcec42c4ec34040e1d913dd6a20bd671a865b70a Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:15 -0400 Subject: [PATCH 435/548] iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste() Half the code was living in arm_smmu_domain_finalise_s2(), just move it here and take the values directly from the pgtbl_ops instead of storing copies. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/4-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 71b0aa10b18dd589a899449457e5ab7c1da00d01) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 ++++++++++++--------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 -- 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 2aa0d022db73f..c08fe314a33b9 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1517,6 +1517,11 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, struct arm_smmu_domain *smmu_domain) { struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg; + const struct io_pgtable_cfg *pgtbl_cfg = + &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg; + typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr = + &pgtbl_cfg->arm_lpae_s2_cfg.vtcr; + u64 vtcr_val; memset(target, 0, sizeof(*target)); target->data[0] = cpu_to_le64( @@ -1529,9 +1534,16 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); + vtcr_val = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) | + FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps); target->data[2] = cpu_to_le64( FIELD_PREP(STRTAB_STE_2_S2VMID, s2_cfg->vmid) | - FIELD_PREP(STRTAB_STE_2_VTCR, s2_cfg->vtcr) | + FIELD_PREP(STRTAB_STE_2_VTCR, vtcr_val) | STRTAB_STE_2_S2AA64 | #ifdef __BIG_ENDIAN STRTAB_STE_2_S2ENDI | @@ -1539,7 +1551,8 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R); - target->data[3] = cpu_to_le64(s2_cfg->vttbr & STRTAB_STE_3_S2TTB_MASK); + target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s2_cfg.vttbr & + STRTAB_STE_3_S2TTB_MASK); } static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, @@ -2310,7 +2323,6 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, int vmid; struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; - typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr; /* Reserve VMID 0 for stage-2 bypass STEs */ vmid = ida_alloc_range(&smmu->vmid_map, 1, (1 << smmu->vmid_bits) - 1, @@ -2318,16 +2330,7 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, if (vmid < 0) return vmid; - vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr; cfg->vmid = (u16)vmid; - cfg->vttbr = pgtbl_cfg->arm_lpae_s2_cfg.vttbr; - cfg->vtcr = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) | - FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) | - FIELD_PREP(STRTAB_STE_2_VTCR_S2IR0, vtcr->irgn) | - FIELD_PREP(STRTAB_STE_2_VTCR_S2OR0, vtcr->orgn) | - FIELD_PREP(STRTAB_STE_2_VTCR_S2SH0, vtcr->sh) | - FIELD_PREP(STRTAB_STE_2_VTCR_S2TG, vtcr->tg) | - FIELD_PREP(STRTAB_STE_2_VTCR_S2PS, vtcr->ps); return 0; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 03f9e526cbd92..2a7bd3b2e23d6 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -609,8 +609,6 @@ struct arm_smmu_ctx_desc_cfg { struct arm_smmu_s2_cfg { u16 vmid; - u64 vttbr; - u64 vtcr; }; struct arm_smmu_strtab_cfg { From d10a20b5bf0f971e83909eae645e89de520d4597 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:16 -0400 Subject: [PATCH 436/548] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev The BTM support wants to be able to change the ASID of any smmu_domain. When it goes to do this it holds the arm_smmu_asid_lock and iterates over the target domain's devices list. During attach of a S1 domain we must ensure that the devices list and CD are in sync, otherwise we could miss CD updates or a parallel CD update could push an out of date CD. This is pretty complicated, and almost works today because arm_smmu_detach_dev() removes the master from the linked list before working on the CD entries, preventing parallel update of the CD. However, it does have an issue where the CD can remain programed while the domain appears to be unattached. arm_smmu_share_asid() will then not clear any CD entriess and install its own CD entry with the same ASID concurrently. This creates a small race window where the IOMMU can see two ASIDs pointing to different translations. CPU0 CPU1 arm_smmu_attach_dev() arm_smmu_detach_dev() spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_del(&master->domain_head); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); arm_smmu_mmu_notifier_get() arm_smmu_alloc_shared_cd() arm_smmu_share_asid(): // Does nothing due to list_del above arm_smmu_update_ctx_desc_devices() arm_smmu_tlb_inv_asid() arm_smmu_write_ctx_desc() ** Now the ASID is in two CDs with different translation arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); Solve this by wrapping most of the attach flow in the arm_smmu_asid_lock. This locks more than strictly needed to prepare for the next patch which will reorganize the order of the linked list, STE and CD changes. Move arm_smmu_detach_dev() till after we have initialized the domain so the lock can be held for less time. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/5-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 9f7c68911579bc15c57d227d021ccd253da2b635) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index c08fe314a33b9..feac9af51528c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2598,8 +2598,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return -EBUSY; } - arm_smmu_detach_dev(master); - mutex_lock(&smmu_domain->init_mutex); if (!smmu_domain->smmu) { @@ -2614,6 +2612,16 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) if (ret) return ret; + /* + * Prevent arm_smmu_share_asid() from trying to change the ASID + * of either the old or new domain while we are working on it. + * This allows the STE and the smmu_domain->devices list to + * be inconsistent during this routine. + */ + mutex_lock(&arm_smmu_asid_lock); + + arm_smmu_detach_dev(master); + master->domain = smmu_domain; /* @@ -2639,13 +2647,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) } } - /* - * Prevent SVA from concurrently modifying the CD or writing to - * the CD entry - */ - mutex_lock(&arm_smmu_asid_lock); ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd); - mutex_unlock(&arm_smmu_asid_lock); if (ret) { master->domain = NULL; goto out_list_del; @@ -2655,13 +2657,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) arm_smmu_install_ste_for_dev(master); arm_smmu_enable_ats(master); - return 0; + goto out_unlock; out_list_del: spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_del(&master->domain_head); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); +out_unlock: + mutex_unlock(&arm_smmu_asid_lock); return ret; } From 03ec58a303c55fa84f4d786f250cdaa0d4ffed87 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:17 -0400 Subject: [PATCH 437/548] iommu/arm-smmu-v3: Compute the STE only once for each master Currently arm_smmu_install_ste_for_dev() iterates over every SID and computes from scratch an identical STE. Every SID should have the same STE contents. Turn this inside out so that the STE is supplied by the caller and arm_smmu_install_ste_for_dev() simply installs it to every SID. This is possible now that the STE generation does not inform what sequence should be used to program it. This allows splitting the STE calculation up according to the call site, which following patches will make use of, and removes the confusing NULL domain special case that only supported arm_smmu_detach_dev(). Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/6-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 65547275d76965c3106fbcd4a9244242eb88224c) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 57 ++++++++------------- 1 file changed, 22 insertions(+), 35 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index feac9af51528c..9bd48cd66433c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1555,35 +1555,6 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, STRTAB_STE_3_S2TTB_MASK); } -static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, - struct arm_smmu_ste *dst) -{ - struct arm_smmu_domain *smmu_domain = master->domain; - struct arm_smmu_ste target = {}; - - if (!smmu_domain) { - if (disable_bypass) - arm_smmu_make_abort_ste(&target); - else - arm_smmu_make_bypass_ste(&target); - arm_smmu_write_ste(master, sid, dst, &target); - return; - } - - switch (smmu_domain->stage) { - case ARM_SMMU_DOMAIN_S1: - arm_smmu_make_cdtable_ste(&target, master); - break; - case ARM_SMMU_DOMAIN_S2: - arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); - break; - case ARM_SMMU_DOMAIN_BYPASS: - arm_smmu_make_bypass_ste(&target); - break; - } - arm_smmu_write_ste(master, sid, dst, &target); -} - /* * This can safely directly manipulate the STE memory without a sync sequence * because the STE table has not been installed in the SMMU yet. @@ -2422,7 +2393,8 @@ arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid) } } -static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) +static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master, + const struct arm_smmu_ste *target) { int i, j; struct arm_smmu_device *smmu = master->smmu; @@ -2439,7 +2411,7 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master) if (j < i) continue; - arm_smmu_write_strtab_ent(master, sid, step); + arm_smmu_write_ste(master, sid, step, target); } } @@ -2549,6 +2521,7 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master) static void arm_smmu_detach_dev(struct arm_smmu_master *master) { unsigned long flags; + struct arm_smmu_ste target; struct arm_smmu_domain *smmu_domain = master->domain; if (!smmu_domain) @@ -2562,7 +2535,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) master->domain = NULL; master->ats_enabled = false; - arm_smmu_install_ste_for_dev(master); + if (disable_bypass) + arm_smmu_make_abort_ste(&target); + else + arm_smmu_make_bypass_ste(&target); + arm_smmu_install_ste_for_dev(master, &target); /* * Clearing the CD entry isn't strictly required to detach the domain * since the table is uninstalled anyway, but it helps avoid confusion @@ -2577,6 +2554,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) { int ret = 0; unsigned long flags; + struct arm_smmu_ste target; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct arm_smmu_device *smmu; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); @@ -2638,7 +2616,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) list_add(&master->domain_head, &smmu_domain->devices); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { + switch (smmu_domain->stage) { + case ARM_SMMU_DOMAIN_S1: if (!master->cd_table.cdtab) { ret = arm_smmu_alloc_cd_tables(master); if (ret) { @@ -2652,9 +2631,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) master->domain = NULL; goto out_list_del; } - } - arm_smmu_install_ste_for_dev(master); + arm_smmu_make_cdtable_ste(&target, master); + break; + case ARM_SMMU_DOMAIN_S2: + arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); + break; + case ARM_SMMU_DOMAIN_BYPASS: + arm_smmu_make_bypass_ste(&target); + break; + } + arm_smmu_install_ste_for_dev(master, &target); arm_smmu_enable_ats(master); goto out_unlock; From 9df6f8d5bbf99d599bb2ffb077b3d8182cb0ad53 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:18 -0400 Subject: [PATCH 438/548] iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev() This was needed because the STE code required the STE to be in ABORT/BYPASS inorder to program a cdtable or S2 STE. Now that the STE code can automatically handle all transitions we can remove this step from the attach_dev flow. A few small bugs exist because of this: 1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false then there will be a moment where the STE points at BYPASS. Since this can be done by VFIO/IOMMUFD it is a small security race. 2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT regions will temporarily become BLOCKED. We'd like drivers to work in a way that allows IOMMU_RESV_DIRECT to be continuously functional during these transitions. Make arm_smmu_release_device() put the STE back to the correct ABORT/BYPASS setting. Fix a bug where a IOMMU_RESV_DIRECT was ignored on this path. As noted before the reordering of the linked list/STE/CD changes is OK against concurrent arm_smmu_share_asid() because of the arm_smmu_asid_lock. Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/7-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 8c73c32c83ce7f3c31864cb044abd3daefacd996) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9bd48cd66433c..df4a23c0f4cfa 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2521,7 +2521,6 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master) static void arm_smmu_detach_dev(struct arm_smmu_master *master) { unsigned long flags; - struct arm_smmu_ste target; struct arm_smmu_domain *smmu_domain = master->domain; if (!smmu_domain) @@ -2535,11 +2534,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) master->domain = NULL; master->ats_enabled = false; - if (disable_bypass) - arm_smmu_make_abort_ste(&target); - else - arm_smmu_make_bypass_ste(&target); - arm_smmu_install_ste_for_dev(master, &target); /* * Clearing the CD entry isn't strictly required to detach the domain * since the table is uninstalled anyway, but it helps avoid confusion @@ -2885,9 +2879,18 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) static void arm_smmu_release_device(struct device *dev) { struct arm_smmu_master *master = dev_iommu_priv_get(dev); + struct arm_smmu_ste target; if (WARN_ON(arm_smmu_master_sva_enabled(master))) iopf_queue_remove_device(master->smmu->evtq.iopf, dev); + + /* Put the STE back to what arm_smmu_init_strtab() sets */ + if (disable_bypass && !dev->iommu->require_direct) + arm_smmu_make_abort_ste(&target); + else + arm_smmu_make_bypass_ste(&target); + arm_smmu_install_ste_for_dev(master, &target); + arm_smmu_detach_dev(master); arm_smmu_disable_pasid(master); arm_smmu_remove_master(master); From 10e9b8ff7653b8272c952b2c500cef3cf54b5e32 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:19 -0400 Subject: [PATCH 439/548] iommu/arm-smmu-v3: Put writing the context descriptor in the right order Get closer to the IOMMU API ideal that changes between domains can be hitless. The ordering for the CD table entry is not entirely clean from this perspective. When switching away from a STE with a CD table programmed in it we should write the new STE first, then clear any old data in the CD entry. If we are programming a CD table for the first time to a STE then the CD entry should be programmed before the STE is loaded. If we are replacing a CD table entry when the STE already points at the CD entry then we just need to do the make/break sequence. Lift this code out of arm_smmu_detach_dev() so it can all be sequenced properly. The only other caller is arm_smmu_release_device() and it is going to free the cdtable anyhow, so it doesn't matter what is in it. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/8-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit d2e053d73247b68144c7f44d002ebf56acaf2d48) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 ++++++++++++++------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index df4a23c0f4cfa..40734a5820b0d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2534,14 +2534,6 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) master->domain = NULL; master->ats_enabled = false; - /* - * Clearing the CD entry isn't strictly required to detach the domain - * since the table is uninstalled anyway, but it helps avoid confusion - * in the call to arm_smmu_write_ctx_desc on the next attach (which - * expects the entry to be empty). - */ - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab) - arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); } static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) @@ -2618,6 +2610,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) master->domain = NULL; goto out_list_del; } + } else { + /* + * arm_smmu_write_ctx_desc() relies on the entry being + * invalid to work, clear any existing entry. + */ + ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, + NULL); + if (ret) { + master->domain = NULL; + goto out_list_del; + } } ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd); @@ -2627,15 +2630,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) } arm_smmu_make_cdtable_ste(&target, master); + arm_smmu_install_ste_for_dev(master, &target); break; case ARM_SMMU_DOMAIN_S2: arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); + arm_smmu_install_ste_for_dev(master, &target); + if (master->cd_table.cdtab) + arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, + NULL); break; case ARM_SMMU_DOMAIN_BYPASS: arm_smmu_make_bypass_ste(&target); + arm_smmu_install_ste_for_dev(master, &target); + if (master->cd_table.cdtab) + arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, + NULL); break; } - arm_smmu_install_ste_for_dev(master, &target); arm_smmu_enable_ats(master); goto out_unlock; From 1e7491fc356e29982b9f2e8c7b9874af28988a76 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:20 -0400 Subject: [PATCH 440/548] iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats() The caller already has the domain, just pass it in. A following patch will remove master->domain. Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/9-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit d550ddc5b789f258cb5ce3bfe74af6d5383589b5) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 40734a5820b0d..6e6210c5b091a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2430,12 +2430,12 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master) return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev)); } -static void arm_smmu_enable_ats(struct arm_smmu_master *master) +static void arm_smmu_enable_ats(struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain) { size_t stu; struct pci_dev *pdev; struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_domain *smmu_domain = master->domain; /* Don't enable ATS at the endpoint if it's not enabled in the STE */ if (!master->ats_enabled) @@ -2454,10 +2454,9 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master) dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu); } -static void arm_smmu_disable_ats(struct arm_smmu_master *master) +static void arm_smmu_disable_ats(struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain) { - struct arm_smmu_domain *smmu_domain = master->domain; - if (!master->ats_enabled) return; @@ -2526,7 +2525,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) if (!smmu_domain) return; - arm_smmu_disable_ats(master); + arm_smmu_disable_ats(master, smmu_domain); spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_del(&master->domain_head); @@ -2648,7 +2647,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) break; } - arm_smmu_enable_ats(master); + arm_smmu_enable_ats(master, smmu_domain); goto out_unlock; out_list_del: From 0b04a1aaffdd6ed0d21b8946808e3ca84553a53f Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:21 -0400 Subject: [PATCH 441/548] iommu/arm-smmu-v3: Remove arm_smmu_master->domain Introducing global statics which are of type struct iommu_domain, not struct arm_smmu_domain makes it difficult to retain arm_smmu_master->domain, as it can no longer point to an IDENTITY or BLOCKED domain. The only place that uses the value is arm_smmu_detach_dev(). Change things to work like other drivers and call iommu_get_domain_for_dev() to obtain the current domain. The master->domain is subtly protecting the master->domain_head against being unused as only PAGING domains will set master->domain and only paging domains use the master->domain_head. To make it simple keep the master->domain_head initialized so that the list_del() logic just does nothing for attached non-PAGING domains. Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/10-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 1b50017d39f650d78a0066734d6fe05920a8c9e8) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++------------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 - 2 files changed, 10 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6e6210c5b091a..ad20568b17658 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2519,19 +2519,20 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master) static void arm_smmu_detach_dev(struct arm_smmu_master *master) { + struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev); + struct arm_smmu_domain *smmu_domain; unsigned long flags; - struct arm_smmu_domain *smmu_domain = master->domain; - if (!smmu_domain) + if (!domain) return; + smmu_domain = to_smmu_domain(domain); arm_smmu_disable_ats(master, smmu_domain); spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_del(&master->domain_head); + list_del_init(&master->domain_head); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); - master->domain = NULL; master->ats_enabled = false; } @@ -2585,8 +2586,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) arm_smmu_detach_dev(master); - master->domain = smmu_domain; - /* * The SMMU does not support enabling ATS with bypass. When the STE is * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and @@ -2605,10 +2604,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) case ARM_SMMU_DOMAIN_S1: if (!master->cd_table.cdtab) { ret = arm_smmu_alloc_cd_tables(master); - if (ret) { - master->domain = NULL; + if (ret) goto out_list_del; - } } else { /* * arm_smmu_write_ctx_desc() relies on the entry being @@ -2616,17 +2613,13 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) */ ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); - if (ret) { - master->domain = NULL; + if (ret) goto out_list_del; - } } ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd); - if (ret) { - master->domain = NULL; + if (ret) goto out_list_del; - } arm_smmu_make_cdtable_ste(&target, master); arm_smmu_install_ste_for_dev(master, &target); @@ -2652,7 +2645,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) out_list_del: spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_del(&master->domain_head); + list_del_init(&master->domain_head); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); out_unlock: @@ -2850,6 +2843,7 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) master->dev = dev; master->smmu = smmu; INIT_LIST_HEAD(&master->bonds); + INIT_LIST_HEAD(&master->domain_head); dev_iommu_priv_set(dev, master); ret = arm_smmu_insert_master(smmu, master); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 2a7bd3b2e23d6..62efc6e39f29a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -695,7 +695,6 @@ struct arm_smmu_stream { struct arm_smmu_master { struct arm_smmu_device *smmu; struct device *dev; - struct arm_smmu_domain *domain; struct list_head domain_head; struct arm_smmu_stream *streams; /* Locked by the iommu core using the group mutex */ From ec6498c912c3d0d382660e1a9efa2efaf3a72d77 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:22 -0400 Subject: [PATCH 442/548] iommu/arm-smmu-v3: Check that the RID domain is S1 in SVA The SVA code only works if the RID domain is a S1 domain and has already installed the cdtable. Originally the check for this was in arm_smmu_sva_bind() but when the op was removed the test didn't get copied over to the new arm_smmu_sva_set_dev_pasid(). Without the test wrong usage usually will hit a WARN_ON() in arm_smmu_write_ctx_desc() due to a missing ctx table. However, the next patches wil change things so that an IDENTITY domain is not a struct arm_smmu_domain and this will get into memory corruption if the struct is wrongly casted. Fail in arm_smmu_sva_set_dev_pasid() if the STE does not have a S1, which is a proxy for the STE having a pointer to the CD table. Write it in a way that will be compatible with the next patches. Fixes: 386fa64fd52b ("arm-smmu-v3/sva: Add SVA domain support") Reported-by: Shameerali Kolothum Thodi Closes: https://lore.kernel.org/linux-iommu/2a828e481416405fb3a4cceb9e075a59@huawei.com/ Tested-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/11-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit ae91f6552c301e5e8569667e9d5440d5f75a90c4) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index df66fe43a9853..15f2e53f21165 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -383,7 +383,13 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) struct arm_smmu_bond *bond; struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_domain *smmu_domain; + + if (!(domain->type & __IOMMU_DOMAIN_PAGING)) + return -ENODEV; + smmu_domain = to_smmu_domain(domain); + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) + return -ENODEV; if (!master || !master->sva_enabled) return ERR_PTR(-ENODEV); From 4472d15d7683dc33a44bc33b3e75fd2e115f7b14 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:23 -0400 Subject: [PATCH 443/548] iommu/arm-smmu-v3: Add a global static IDENTITY domain Move to the new static global for identity domains. Move all the logic out of arm_smmu_attach_dev into an identity only function. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/12-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 12dacfb5b938cdd90ced0109165eee9cb27061d9) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 82 +++++++++++++++------ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 - 2 files changed, 58 insertions(+), 25 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index ad20568b17658..c5f9fd493d333 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2208,8 +2208,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) return arm_smmu_sva_domain_alloc(); if (type != IOMMU_DOMAIN_UNMANAGED && - type != IOMMU_DOMAIN_DMA && - type != IOMMU_DOMAIN_IDENTITY) + type != IOMMU_DOMAIN_DMA) return NULL; /* @@ -2317,11 +2316,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain) struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_device *smmu = smmu_domain->smmu; - if (domain->type == IOMMU_DOMAIN_IDENTITY) { - smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS; - return 0; - } - /* Restrict the stage to what we can actually support */ if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) smmu_domain->stage = ARM_SMMU_DOMAIN_S2; @@ -2523,7 +2517,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) struct arm_smmu_domain *smmu_domain; unsigned long flags; - if (!domain) + if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING)) return; smmu_domain = to_smmu_domain(domain); @@ -2586,15 +2580,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) arm_smmu_detach_dev(master); - /* - * The SMMU does not support enabling ATS with bypass. When the STE is - * in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests and - * Translated transactions are denied as though ATS is disabled for the - * stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and - * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry). - */ - if (smmu_domain->stage != ARM_SMMU_DOMAIN_BYPASS) - master->ats_enabled = arm_smmu_ats_supported(master); + master->ats_enabled = arm_smmu_ats_supported(master); spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_add(&master->domain_head, &smmu_domain->devices); @@ -2631,13 +2617,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); break; - case ARM_SMMU_DOMAIN_BYPASS: - arm_smmu_make_bypass_ste(&target); - arm_smmu_install_ste_for_dev(master, &target); - if (master->cd_table.cdtab) - arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, - NULL); - break; } arm_smmu_enable_ats(master, smmu_domain); @@ -2653,6 +2632,60 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return ret; } +static int arm_smmu_attach_dev_ste(struct device *dev, + struct arm_smmu_ste *ste) +{ + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + + if (arm_smmu_master_sva_enabled(master)) + return -EBUSY; + + /* + * Do not allow any ASID to be changed while are working on the STE, + * otherwise we could miss invalidations. + */ + mutex_lock(&arm_smmu_asid_lock); + + /* + * The SMMU does not support enabling ATS with bypass/abort. When the + * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests + * and Translated transactions are denied as though ATS is disabled for + * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and + * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry). + */ + arm_smmu_detach_dev(master); + + arm_smmu_install_ste_for_dev(master, ste); + mutex_unlock(&arm_smmu_asid_lock); + + /* + * This has to be done after removing the master from the + * arm_smmu_domain->devices to avoid races updating the same context + * descriptor from arm_smmu_share_asid(). + */ + if (master->cd_table.cdtab) + arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); + return 0; +} + +static int arm_smmu_attach_dev_identity(struct iommu_domain *domain, + struct device *dev) +{ + struct arm_smmu_ste ste; + + arm_smmu_make_bypass_ste(&ste); + return arm_smmu_attach_dev_ste(dev, &ste); +} + +static const struct iommu_domain_ops arm_smmu_identity_ops = { + .attach_dev = arm_smmu_attach_dev_identity, +}; + +static struct iommu_domain arm_smmu_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &arm_smmu_identity_ops, +}; + static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -3040,6 +3073,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid) } static struct iommu_ops arm_smmu_ops = { + .identity_domain = &arm_smmu_identity_domain, .capable = arm_smmu_capable, .domain_alloc = arm_smmu_domain_alloc, .probe_device = arm_smmu_probe_device, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 62efc6e39f29a..0b1f5820b8384 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -713,7 +713,6 @@ enum arm_smmu_domain_stage { ARM_SMMU_DOMAIN_S1 = 0, ARM_SMMU_DOMAIN_S2, ARM_SMMU_DOMAIN_NESTED, - ARM_SMMU_DOMAIN_BYPASS, }; struct arm_smmu_domain { From 113bab4e3fb55d8b37efe1701eebf16ebe76a151 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:24 -0400 Subject: [PATCH 444/548] iommu/arm-smmu-v3: Add a global static BLOCKED domain Using the same design as the IDENTITY domain install an STRTAB_STE_0_CFG_ABORT STE. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/13-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 352bd64cd8288c5c6808735d52a75809dfef8635) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index c5f9fd493d333..1fc4e507317d7 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2686,6 +2686,24 @@ static struct iommu_domain arm_smmu_identity_domain = { .ops = &arm_smmu_identity_ops, }; +static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain, + struct device *dev) +{ + struct arm_smmu_ste ste; + + arm_smmu_make_abort_ste(&ste); + return arm_smmu_attach_dev_ste(dev, &ste); +} + +static const struct iommu_domain_ops arm_smmu_blocked_ops = { + .attach_dev = arm_smmu_attach_dev_blocked, +}; + +static struct iommu_domain arm_smmu_blocked_domain = { + .type = IOMMU_DOMAIN_BLOCKED, + .ops = &arm_smmu_blocked_ops, +}; + static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t pgsize, size_t pgcount, int prot, gfp_t gfp, size_t *mapped) @@ -3074,6 +3092,7 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid) static struct iommu_ops arm_smmu_ops = { .identity_domain = &arm_smmu_identity_domain, + .blocked_domain = &arm_smmu_blocked_domain, .capable = arm_smmu_capable, .domain_alloc = arm_smmu_domain_alloc, .probe_device = arm_smmu_probe_device, From 1d35088870992efac64d769f0e7bd94ca4a0ed16 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:25 -0400 Subject: [PATCH 445/548] iommu/arm-smmu-v3: Use the identity/blocked domain during release Consolidate some more code by having release call arm_smmu_attach_dev_identity/blocked() instead of open coding this. Reviewed-by: Nicolin Chen Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/14-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit d36464f40f29c984984168ea89e08629bdba41df) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 1fc4e507317d7..2d13dceb47f6c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2934,19 +2934,16 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) static void arm_smmu_release_device(struct device *dev) { struct arm_smmu_master *master = dev_iommu_priv_get(dev); - struct arm_smmu_ste target; if (WARN_ON(arm_smmu_master_sva_enabled(master))) iopf_queue_remove_device(master->smmu->evtq.iopf, dev); /* Put the STE back to what arm_smmu_init_strtab() sets */ if (disable_bypass && !dev->iommu->require_direct) - arm_smmu_make_abort_ste(&target); + arm_smmu_attach_dev_blocked(&arm_smmu_blocked_domain, dev); else - arm_smmu_make_bypass_ste(&target); - arm_smmu_install_ste_for_dev(master, &target); + arm_smmu_attach_dev_identity(&arm_smmu_identity_domain, dev); - arm_smmu_detach_dev(master); arm_smmu_disable_pasid(master); arm_smmu_remove_master(master); if (master->cd_table.cdtab) From b105eadd8e447163bf820bd7cac2ae8b0ae79e2b Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:26 -0400 Subject: [PATCH 446/548] iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize Instead of putting container_of() casts in the internals, use the proper type in this call chain. This makes it easier to check that the two global static domains are not leaking into call chains they should not. Passing the smmu avoids the only caller from having to set it and unset it in the error path. Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/15-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit d8cd200609cf6a404cda73794f0c8c4fd74c568c) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 +++++++++++---------- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 2d13dceb47f6c..94f4998f5cb36 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -89,6 +89,9 @@ static struct arm_smmu_option_prop arm_smmu_options[] = { { 0, NULL}, }; +static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_device *smmu); + static void parse_driver_options(struct arm_smmu_device *smmu) { int i = 0; @@ -2250,12 +2253,12 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) kfree(smmu_domain); } -static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, +static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu, + struct arm_smmu_domain *smmu_domain, struct io_pgtable_cfg *pgtbl_cfg) { int ret; u32 asid; - struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr; @@ -2287,11 +2290,11 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, return ret; } -static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, +static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu, + struct arm_smmu_domain *smmu_domain, struct io_pgtable_cfg *pgtbl_cfg) { int vmid; - struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; /* Reserve VMID 0 for stage-2 bypass STEs */ @@ -2304,17 +2307,17 @@ static int arm_smmu_domain_finalise_s2(struct arm_smmu_domain *smmu_domain, return 0; } -static int arm_smmu_domain_finalise(struct iommu_domain *domain) +static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_device *smmu) { int ret; unsigned long ias, oas; enum io_pgtable_fmt fmt; struct io_pgtable_cfg pgtbl_cfg; struct io_pgtable_ops *pgtbl_ops; - int (*finalise_stage_fn)(struct arm_smmu_domain *, - struct io_pgtable_cfg *); - struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); - struct arm_smmu_device *smmu = smmu_domain->smmu; + int (*finalise_stage_fn)(struct arm_smmu_device *smmu, + struct arm_smmu_domain *smmu_domain, + struct io_pgtable_cfg *pgtbl_cfg); /* Restrict the stage to what we can actually support */ if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) @@ -2354,17 +2357,18 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain) if (!pgtbl_ops) return -ENOMEM; - domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; - domain->geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1; - domain->geometry.force_aperture = true; + smmu_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; + smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1; + smmu_domain->domain.geometry.force_aperture = true; - ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg); + ret = finalise_stage_fn(smmu, smmu_domain, &pgtbl_cfg); if (ret < 0) { free_io_pgtable_ops(pgtbl_ops); return ret; } smmu_domain->pgtbl_ops = pgtbl_ops; + smmu_domain->smmu = smmu; return 0; } @@ -2559,10 +2563,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) mutex_lock(&smmu_domain->init_mutex); if (!smmu_domain->smmu) { - smmu_domain->smmu = smmu; - ret = arm_smmu_domain_finalise(domain); - if (ret) - smmu_domain->smmu = NULL; + ret = arm_smmu_domain_finalise(smmu_domain, smmu); } else if (smmu_domain->smmu != smmu) ret = -EINVAL; From cc592ace6a63e5b4e8313df8e19341d59622d7c5 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 26 Feb 2024 13:07:27 -0400 Subject: [PATCH 447/548] iommu/arm-smmu-v3: Convert to domain_alloc_paging() Now that the BLOCKED and IDENTITY behaviors are managed with their own domains change to the domain_alloc_paging() op. For now SVA remains using the old interface, eventually it will get its own op that can pass in the device and mm_struct which will let us have a sane lifetime for the mmu_notifier. Call arm_smmu_domain_finalise() early if dev is available. Tested-by: Shameer Kolothum Tested-by: Nicolin Chen Tested-by: Moritz Fischer Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/16-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 327e10b47ae99f76ac53f0b8b73a0539f390d2d2) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 94f4998f5cb36..1a2b21d355738 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2205,14 +2205,15 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap) static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) { - struct arm_smmu_domain *smmu_domain; if (type == IOMMU_DOMAIN_SVA) return arm_smmu_sva_domain_alloc(); + return ERR_PTR(-EOPNOTSUPP); +} - if (type != IOMMU_DOMAIN_UNMANAGED && - type != IOMMU_DOMAIN_DMA) - return NULL; +static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) +{ + struct arm_smmu_domain *smmu_domain; /* * Allocate the domain and initialise some of its data structures. @@ -2221,13 +2222,23 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) */ smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL); if (!smmu_domain) - return NULL; + return ERR_PTR(-ENOMEM); mutex_init(&smmu_domain->init_mutex); INIT_LIST_HEAD(&smmu_domain->devices); spin_lock_init(&smmu_domain->devices_lock); INIT_LIST_HEAD(&smmu_domain->mmu_notifiers); + if (dev) { + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + int ret; + + ret = arm_smmu_domain_finalise(smmu_domain, master->smmu); + if (ret) { + kfree(smmu_domain); + return ERR_PTR(ret); + } + } return &smmu_domain->domain; } @@ -3093,6 +3104,7 @@ static struct iommu_ops arm_smmu_ops = { .blocked_domain = &arm_smmu_blocked_domain, .capable = arm_smmu_capable, .domain_alloc = arm_smmu_domain_alloc, + .domain_alloc_paging = arm_smmu_domain_alloc_paging, .probe_device = arm_smmu_probe_device, .release_device = arm_smmu_release_device, .device_group = arm_smmu_device_group, From b2b86ea410a10f38bd479c39675cb0b868d08e10 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Tue, 5 Sep 2023 19:49:12 +0800 Subject: [PATCH 448/548] iommu/arm-smmu-v3-sva: Remove unused iommu_sva handle The __arm_smmu_sva_bind function returned an unused iommu_sva handle that can be removed. Signed-off-by: Michael Shavit Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20230905194849.v1.1.Ib483f67c9e2ad90ea2254b4b5ac696e4b68aa638@changeid Signed-off-by: Will Deacon (cherry picked from commit d912aed14fe412038379b1a90d1ae5a02307e830) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 15f2e53f21165..b77c59da71ca3 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -25,7 +25,6 @@ struct arm_smmu_mmu_notifier { #define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn) struct arm_smmu_bond { - struct iommu_sva sva; struct mm_struct *mm; struct arm_smmu_mmu_notifier *smmu_mn; struct list_head list; @@ -376,8 +375,7 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) arm_smmu_free_shared_cd(cd); } -static struct iommu_sva * -__arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) +static int __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) { int ret; struct arm_smmu_bond *bond; @@ -392,7 +390,7 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) return -ENODEV; if (!master || !master->sva_enabled) - return ERR_PTR(-ENODEV); + return -ENODEV; /* If bind() was already called for this {dev, mm} pair, reuse it. */ list_for_each_entry(bond, &master->bonds, list) { @@ -404,10 +402,9 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) bond = kzalloc(sizeof(*bond), GFP_KERNEL); if (!bond) - return ERR_PTR(-ENOMEM); + return -ENOMEM; bond->mm = mm; - bond->sva.dev = dev; refcount_set(&bond->refs, 1); bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm); @@ -417,11 +414,11 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) } list_add(&bond->list, &master->bonds); - return &bond->sva; + return 0; err_free_bond: kfree(bond); - return ERR_PTR(ret); + return ret; } bool arm_smmu_sva_supported(struct arm_smmu_device *smmu) @@ -599,13 +596,10 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t id) { int ret = 0; - struct iommu_sva *handle; struct mm_struct *mm = domain->mm; mutex_lock(&sva_lock); - handle = __arm_smmu_sva_bind(dev, mm); - if (IS_ERR(handle)) - ret = PTR_ERR(handle); + ret = __arm_smmu_sva_bind(dev, mm); mutex_unlock(&sva_lock); return ret; From a5eedcef58f41b98d9737b04c2fc8ccbab07f8a9 Mon Sep 17 00:00:00 2001 From: Michael Shavit Date: Tue, 5 Sep 2023 19:49:13 +0800 Subject: [PATCH 449/548] iommu/arm-smmu-v3-sva: Remove bond refcount Always allocate a new arm_smmu_bond in __arm_smmu_sva_bind and remove the bond refcount since arm_smmu_bond can never be shared across calls to __arm_smmu_sva_bind. The iommu framework will not allocate multiple SVA domains for the same (device/mm) pair, nor will it call set_dev_pasid for a device if a domain is already attached on the given pasid. There's also a one-to-one mapping between MM and PASID. __arm_smmu_sva_bind is therefore never called with the same (device/mm) pair, and so there's no reason to try and normalize allocations of the arm_smmu_bond struct for a (device/mm) pair across set_dev_pasid. Signed-off-by: Michael Shavit Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20230905194849.v1.2.Id3ab7cf665bcead097654937233a645722a4cce3@changeid Signed-off-by: Will Deacon (cherry picked from commit 37ed36448fcd73be197c87ce291d96eeb4d55c00) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index b77c59da71ca3..c61b23b03b793 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -28,7 +28,6 @@ struct arm_smmu_bond { struct mm_struct *mm; struct arm_smmu_mmu_notifier *smmu_mn; struct list_head list; - refcount_t refs; }; #define sva_to_bond(handle) \ @@ -392,20 +391,11 @@ static int __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) if (!master || !master->sva_enabled) return -ENODEV; - /* If bind() was already called for this {dev, mm} pair, reuse it. */ - list_for_each_entry(bond, &master->bonds, list) { - if (bond->mm == mm) { - refcount_inc(&bond->refs); - return &bond->sva; - } - } - bond = kzalloc(sizeof(*bond), GFP_KERNEL); if (!bond) return -ENOMEM; bond->mm = mm; - refcount_set(&bond->refs, 1); bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm); if (IS_ERR(bond->smmu_mn)) { @@ -584,7 +574,7 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, } } - if (!WARN_ON(!bond) && refcount_dec_and_test(&bond->refs)) { + if (!WARN_ON(!bond)) { list_del(&bond->list); arm_smmu_mmu_notifier_put(bond->smmu_mn); kfree(bond); From c5c0b93deaaecfe274bdf85eed6d68e30ce0ff61 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:31 -0300 Subject: [PATCH 450/548] iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domain Following the pattern of identity domains, just assign the BLOCKED domain global statics to a value in ops. Update the core code to use the global static directly. Update powerpc to use the new scheme and remove its empty domain_alloc callback. Reviewed-by: Lu Baolu Signed-off-by: Jason Gunthorpe Reviewed-by: Kevin Tian Acked-by: Sven Peter Link: https://lore.kernel.org/r/1-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit e5d8be7406ca7b6b251c09b786f08176337c0999) Signed-off-by: Koba Ko --- arch/powerpc/kernel/iommu.c | 9 +-------- drivers/iommu/iommu.c | 2 ++ include/linux/iommu.h | 3 +++ 3 files changed, 6 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index 47b2d86e998a1..88debde1d8d02 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -1333,13 +1333,6 @@ static bool spapr_tce_iommu_capable(struct device *dev, enum iommu_cap cap) return false; } -static struct iommu_domain *spapr_tce_iommu_domain_alloc(unsigned int type) -{ - if (type != IOMMU_DOMAIN_BLOCKED) - return NULL; - return &spapr_tce_blocked_domain; -} - static struct iommu_device *spapr_tce_iommu_probe_device(struct device *dev) { struct pci_dev *pdev; @@ -1374,8 +1367,8 @@ static struct iommu_group *spapr_tce_iommu_device_group(struct device *dev) static const struct iommu_ops spapr_tce_iommu_ops = { .default_domain = &spapr_tce_platform_domain, + .blocked_domain = &spapr_tce_blocked_domain, .capable = spapr_tce_iommu_capable, - .domain_alloc = spapr_tce_iommu_domain_alloc, .probe_device = spapr_tce_iommu_probe_device, .release_device = spapr_tce_iommu_release_device, .device_group = spapr_tce_iommu_device_group, diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index c13eb588bcc9e..38208edf38104 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2082,6 +2082,8 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain) return ops->identity_domain; + else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain) + return ops->blocked_domain; else if (type & __IOMMU_DOMAIN_PAGING && ops->domain_alloc_paging) domain = ops->domain_alloc_paging(dev); else if (ops->domain_alloc) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index daa40443f8b82..34d1943fb58d4 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -373,6 +373,8 @@ static inline int __iommu_copy_struct_from_user( * @owner: Driver module providing these ops * @identity_domain: An always available, always attachable identity * translation. + * @blocked_domain: An always available, always attachable blocking + * translation. * @default_domain: If not NULL this will always be set as the default domain. * This should be an IDENTITY/BLOCKED/PLATFORM domain. * Do not use in new drivers. @@ -415,6 +417,7 @@ struct iommu_ops { unsigned long pgsize_bitmap; struct module *owner; struct iommu_domain *identity_domain; + struct iommu_domain *blocked_domain; struct iommu_domain *default_domain; }; From 6ade8bc5bcc9a2b4b48f6d2e46e8b93a83c67b4c Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Mon, 4 Mar 2024 15:50:08 -0400 Subject: [PATCH 451/548] iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V STRTAB_STE_0_V is a CPU value, it needs conversion for sparse to be clean. The missing annotation was a mistake introduced by splitting the ops out from the STE writer. Fixes: 7da51af9125c ("iommu/arm-smmu-v3: Make STE programming independent of the callers") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202403011441.5WqGrYjp-lkp@intel.com/ Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/0-v1-98b23ebb0c84+9f-smmu_cputole_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 0493e739ccc60a3e0870847f1a12d6d79b86a1fc) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 1a2b21d355738..bf381abdfdf1e 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1144,7 +1144,8 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, * requires a breaking update, zero the V bit, write all qwords * but 0, then set qword 0 */ - unused_update.data[0] = entry->data[0] & (~STRTAB_STE_0_V); + unused_update.data[0] = entry->data[0] & + cpu_to_le64(~STRTAB_STE_0_V); entry_set(smmu, sid, entry, &unused_update, 0, 1); entry_set(smmu, sid, entry, target, 1, num_entry_qwords - 1); entry_set(smmu, sid, entry, target, 0, 1); From 7e1861c51fd70ce856e49e7c232f72efd4a0106f Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:32 -0300 Subject: [PATCH 452/548] iommu/vt-d: Update the definition of the blocking domain The global static should pre-define the type and the NOP free function can be now left as NULL. Reviewed-by: Lu Baolu Signed-off-by: Jason Gunthorpe Reviewed-by: Kevin Tian Acked-by: Sven Peter Link: https://lore.kernel.org/r/2-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 7b6dd84e703147f5c71f163b84b0cb729a320541) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 679d687e62c90..eeffc13ff7885 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4031,9 +4031,9 @@ static int blocking_domain_attach_dev(struct iommu_domain *domain, } static struct iommu_domain blocking_domain = { + .type = IOMMU_DOMAIN_BLOCKED, .ops = &(const struct iommu_domain_ops) { .attach_dev = blocking_domain_attach_dev, - .free = intel_iommu_domain_free } }; @@ -4125,7 +4125,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags, static void intel_iommu_domain_free(struct iommu_domain *domain) { - if (domain != &si_domain->domain && domain != &blocking_domain) + if (domain != &si_domain->domain) domain_exit(to_dmar_domain(domain)); } From c472110fb3ca09b8f39173565785354993a339db Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:33 -0300 Subject: [PATCH 453/548] iommu/vt-d: Use ops->blocked_domain Trivially migrate to the ops->blocked_domain for the existing global static. Reviewed-by: Lu Baolu Signed-off-by: Jason Gunthorpe Reviewed-by: Kevin Tian Acked-by: Sven Peter Link: https://lore.kernel.org/r/3-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 7d12eb2d2f59e348b74fd36add42a3208a1eaeaa) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index eeffc13ff7885..d96f270fa9e7d 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4043,8 +4043,6 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) struct iommu_domain *domain; switch (type) { - case IOMMU_DOMAIN_BLOCKED: - return &blocking_domain; case IOMMU_DOMAIN_DMA: case IOMMU_DOMAIN_UNMANAGED: dmar_domain = alloc_domain(type); @@ -4939,6 +4937,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = { }; const struct iommu_ops intel_iommu_ops = { + .blocked_domain = &blocking_domain, .capable = intel_iommu_capable, .hw_info = intel_iommu_hw_info, .domain_alloc = intel_iommu_domain_alloc, From bcbc693ec51f73d7f9e72b8a7da1ba2caf7be469 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:34 -0300 Subject: [PATCH 454/548] iommufd: Convert to alloc_domain_paging() Move the global static blocked domain to the ops and convert the unmanaged domain to domain_alloc_paging. Signed-off-by: Jason Gunthorpe Reviewed-by: Kevin Tian Acked-by: Sven Peter Link: https://lore.kernel.org/r/4-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 13fbceb1b8e9d1101d9dcd5ceec483cb6d203a3c) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 89a90ead795c6..9b911b1313459 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -496,10 +496,11 @@ static const struct iommu_ops mock_ops = { * because it is zero. */ .default_domain = &mock_blocking_domain, + .blocked_domain = &mock_blocking_domain, .owner = THIS_MODULE, .pgsize_bitmap = MOCK_IO_PAGE_SIZE, .hw_info = mock_domain_hw_info, - .domain_alloc = mock_domain_alloc, + .domain_alloc_paging = mock_domain_alloc_paging, .domain_alloc_user = mock_domain_alloc_user, .capable = mock_domain_capable, .device_group = generic_device_group, From 625cdd14c9ae7b7b61de358f7374a2ce244adc2d Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:35 -0300 Subject: [PATCH 455/548] iommu/dart: Use static global identity domains Move to the new static global for identity domains. Move the identity specific code to apple_dart_attach_dev_identity(). Reviewed-by: Janne Grunau Signed-off-by: Jason Gunthorpe Acked-by: Sven Peter Link: https://lore.kernel.org/r/5-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 7993085d8d5d271a83045c5c7d01982a7e89a52e) Signed-off-by: Koba Ko --- drivers/iommu/apple-dart.c | 39 +++++++++++++++++++++++++++----------- 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index 0b89275084274..2c121c525749c 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -659,11 +659,7 @@ static int apple_dart_attach_dev(struct iommu_domain *domain, struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); struct apple_dart_domain *dart_domain = to_dart_domain(domain); - if (cfg->stream_maps[0].dart->force_bypass && - domain->type != IOMMU_DOMAIN_IDENTITY) - return -EINVAL; - if (!cfg->stream_maps[0].dart->supports_bypass && - domain->type == IOMMU_DOMAIN_IDENTITY) + if (cfg->stream_maps[0].dart->force_bypass) return -EINVAL; ret = apple_dart_finalize_domain(domain, cfg); @@ -683,15 +679,35 @@ static int apple_dart_attach_dev(struct iommu_domain *domain, for_each_stream_map(i, cfg, stream_map) apple_dart_hw_disable_dma(stream_map); break; - case IOMMU_DOMAIN_IDENTITY: - for_each_stream_map(i, cfg, stream_map) - apple_dart_hw_enable_bypass(stream_map); - break; } return ret; } +static int apple_dart_attach_dev_identity(struct iommu_domain *domain, + struct device *dev) +{ + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); + struct apple_dart_stream_map *stream_map; + int i; + + if (!cfg->stream_maps[0].dart->supports_bypass) + return -EINVAL; + + for_each_stream_map(i, cfg, stream_map) + apple_dart_hw_enable_bypass(stream_map); + return 0; +} + +static const struct iommu_domain_ops apple_dart_identity_ops = { + .attach_dev = apple_dart_attach_dev_identity, +}; + +static struct iommu_domain apple_dart_identity_domain = { + .type = IOMMU_DOMAIN_IDENTITY, + .ops = &apple_dart_identity_ops, +}; + static struct iommu_device *apple_dart_probe_device(struct device *dev) { struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); @@ -722,7 +738,7 @@ static struct iommu_domain *apple_dart_domain_alloc(unsigned int type) struct apple_dart_domain *dart_domain; if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED && - type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED) + type != IOMMU_DOMAIN_BLOCKED) return NULL; dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL); @@ -732,7 +748,7 @@ static struct iommu_domain *apple_dart_domain_alloc(unsigned int type) mutex_init(&dart_domain->init_lock); /* no need to allocate pgtbl_ops or do any other finalization steps */ - if (type == IOMMU_DOMAIN_IDENTITY || type == IOMMU_DOMAIN_BLOCKED) + if (type == IOMMU_DOMAIN_BLOCKED) dart_domain->finalized = true; return &dart_domain->domain; @@ -947,6 +963,7 @@ static void apple_dart_get_resv_regions(struct device *dev, } static const struct iommu_ops apple_dart_iommu_ops = { + .identity_domain = &apple_dart_identity_domain, .domain_alloc = apple_dart_domain_alloc, .probe_device = apple_dart_probe_device, .release_device = apple_dart_release_device, From dd681e73fa80fb3d5030f6548c37097ad4ea1449 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:36 -0300 Subject: [PATCH 456/548] iommu/dart: Move the blocked domain support to a global static Move to the new static global for blocked domains. Move the blocked specific code to apple_dart_attach_dev_blocked(). Reviewed-by: Janne Grunau Signed-off-by: Jason Gunthorpe Acked-by: Sven Peter Link: https://lore.kernel.org/r/6-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 17ef8d6876e01165836790d83d908eb21da3f08e) Signed-off-by: Koba Ko --- drivers/iommu/apple-dart.c | 53 +++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 21 deletions(-) diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index 2c121c525749c..a34812e8c9ea5 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -666,22 +666,13 @@ static int apple_dart_attach_dev(struct iommu_domain *domain, if (ret) return ret; - switch (domain->type) { - default: - ret = apple_dart_domain_add_streams(dart_domain, cfg); - if (ret) - return ret; - - for_each_stream_map(i, cfg, stream_map) - apple_dart_setup_translation(dart_domain, stream_map); - break; - case IOMMU_DOMAIN_BLOCKED: - for_each_stream_map(i, cfg, stream_map) - apple_dart_hw_disable_dma(stream_map); - break; - } + ret = apple_dart_domain_add_streams(dart_domain, cfg); + if (ret) + return ret; - return ret; + for_each_stream_map(i, cfg, stream_map) + apple_dart_setup_translation(dart_domain, stream_map); + return 0; } static int apple_dart_attach_dev_identity(struct iommu_domain *domain, @@ -708,6 +699,30 @@ static struct iommu_domain apple_dart_identity_domain = { .ops = &apple_dart_identity_ops, }; +static int apple_dart_attach_dev_blocked(struct iommu_domain *domain, + struct device *dev) +{ + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); + struct apple_dart_stream_map *stream_map; + int i; + + if (cfg->stream_maps[0].dart->force_bypass) + return -EINVAL; + + for_each_stream_map(i, cfg, stream_map) + apple_dart_hw_disable_dma(stream_map); + return 0; +} + +static const struct iommu_domain_ops apple_dart_blocked_ops = { + .attach_dev = apple_dart_attach_dev_blocked, +}; + +static struct iommu_domain apple_dart_blocked_domain = { + .type = IOMMU_DOMAIN_BLOCKED, + .ops = &apple_dart_blocked_ops, +}; + static struct iommu_device *apple_dart_probe_device(struct device *dev) { struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); @@ -737,8 +752,7 @@ static struct iommu_domain *apple_dart_domain_alloc(unsigned int type) { struct apple_dart_domain *dart_domain; - if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED && - type != IOMMU_DOMAIN_BLOCKED) + if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED) return NULL; dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL); @@ -747,10 +761,6 @@ static struct iommu_domain *apple_dart_domain_alloc(unsigned int type) mutex_init(&dart_domain->init_lock); - /* no need to allocate pgtbl_ops or do any other finalization steps */ - if (type == IOMMU_DOMAIN_BLOCKED) - dart_domain->finalized = true; - return &dart_domain->domain; } @@ -964,6 +974,7 @@ static void apple_dart_get_resv_regions(struct device *dev, static const struct iommu_ops apple_dart_iommu_ops = { .identity_domain = &apple_dart_identity_domain, + .blocked_domain = &apple_dart_blocked_domain, .domain_alloc = apple_dart_domain_alloc, .probe_device = apple_dart_probe_device, .release_device = apple_dart_release_device, From 27b8e671132016617c4f090bef1186c3ef8c554d Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:37 -0300 Subject: [PATCH 457/548] iommu/dart: Convert to domain_alloc_paging() Since the IDENTITY and BLOCKED behaviors were moved to global statics all that remains is the paging domain. Rename to apple_dart_attach_dev_paging() and remove the left over type check. Reviewed-by: Janne Grunau Signed-off-by: Jason Gunthorpe Acked-by: Sven Peter Link: https://lore.kernel.org/r/7-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 9c3ef90c4ccbee0a000b0f6b1b28c38773de87c7) Signed-off-by: Koba Ko --- drivers/iommu/apple-dart.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index a34812e8c9ea5..2566cf3111c7c 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -651,8 +651,8 @@ static int apple_dart_domain_add_streams(struct apple_dart_domain *domain, true); } -static int apple_dart_attach_dev(struct iommu_domain *domain, - struct device *dev) +static int apple_dart_attach_dev_paging(struct iommu_domain *domain, + struct device *dev) { int ret, i; struct apple_dart_stream_map *stream_map; @@ -748,13 +748,10 @@ static void apple_dart_release_device(struct device *dev) kfree(cfg); } -static struct iommu_domain *apple_dart_domain_alloc(unsigned int type) +static struct iommu_domain *apple_dart_domain_alloc_paging(struct device *dev) { struct apple_dart_domain *dart_domain; - if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED) - return NULL; - dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL); if (!dart_domain) return NULL; @@ -975,7 +972,7 @@ static void apple_dart_get_resv_regions(struct device *dev, static const struct iommu_ops apple_dart_iommu_ops = { .identity_domain = &apple_dart_identity_domain, .blocked_domain = &apple_dart_blocked_domain, - .domain_alloc = apple_dart_domain_alloc, + .domain_alloc_paging = apple_dart_domain_alloc_paging, .probe_device = apple_dart_probe_device, .release_device = apple_dart_release_device, .device_group = apple_dart_device_group, @@ -985,7 +982,7 @@ static const struct iommu_ops apple_dart_iommu_ops = { .pgsize_bitmap = -1UL, /* Restricted during dart probe */ .owner = THIS_MODULE, .default_domain_ops = &(const struct iommu_domain_ops) { - .attach_dev = apple_dart_attach_dev, + .attach_dev = apple_dart_attach_dev_paging, .map_pages = apple_dart_map_pages, .unmap_pages = apple_dart_unmap_pages, .flush_iotlb_all = apple_dart_flush_iotlb_all, From c513cf49400e97257c75c723e39ab4ce9886a5b3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:38 -0300 Subject: [PATCH 458/548] iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging() In many cases the dev argument will now be !NULL so we should use it to finalize the domain at allocation. Make apple_dart_finalize_domain() accept the correct type. Reviewed-by: Janne Grunau Signed-off-by: Jason Gunthorpe Acked-by: Sven Peter Link: https://lore.kernel.org/r/8-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 482feb5c649261cd2a7ad02e4ca63c159d6ec795) Signed-off-by: Koba Ko --- drivers/iommu/apple-dart.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index 2566cf3111c7c..126da0d89f0dd 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -568,10 +568,9 @@ apple_dart_setup_translation(struct apple_dart_domain *domain, stream_map->dart->hw->invalidate_tlb(stream_map); } -static int apple_dart_finalize_domain(struct iommu_domain *domain, +static int apple_dart_finalize_domain(struct apple_dart_domain *dart_domain, struct apple_dart_master_cfg *cfg) { - struct apple_dart_domain *dart_domain = to_dart_domain(domain); struct apple_dart *dart = cfg->stream_maps[0].dart; struct io_pgtable_cfg pgtbl_cfg; int ret = 0; @@ -597,17 +596,18 @@ static int apple_dart_finalize_domain(struct iommu_domain *domain, .iommu_dev = dart->dev, }; - dart_domain->pgtbl_ops = - alloc_io_pgtable_ops(dart->hw->fmt, &pgtbl_cfg, domain); + dart_domain->pgtbl_ops = alloc_io_pgtable_ops(dart->hw->fmt, &pgtbl_cfg, + &dart_domain->domain); if (!dart_domain->pgtbl_ops) { ret = -ENOMEM; goto done; } - domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; - domain->geometry.aperture_start = 0; - domain->geometry.aperture_end = (dma_addr_t)DMA_BIT_MASK(dart->ias); - domain->geometry.force_aperture = true; + dart_domain->domain.pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; + dart_domain->domain.geometry.aperture_start = 0; + dart_domain->domain.geometry.aperture_end = + (dma_addr_t)DMA_BIT_MASK(dart->ias); + dart_domain->domain.geometry.force_aperture = true; dart_domain->finalized = true; @@ -662,7 +662,7 @@ static int apple_dart_attach_dev_paging(struct iommu_domain *domain, if (cfg->stream_maps[0].dart->force_bypass) return -EINVAL; - ret = apple_dart_finalize_domain(domain, cfg); + ret = apple_dart_finalize_domain(dart_domain, cfg); if (ret) return ret; @@ -758,6 +758,16 @@ static struct iommu_domain *apple_dart_domain_alloc_paging(struct device *dev) mutex_init(&dart_domain->init_lock); + if (dev) { + struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); + int ret; + + ret = apple_dart_finalize_domain(dart_domain, cfg); + if (ret) { + kfree(dart_domain); + return ERR_PTR(ret); + } + } return &dart_domain->domain; } From fd05d60593dbc2e4787491f92d7f1324b5324bfa Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Sep 2023 20:47:39 -0300 Subject: [PATCH 459/548] iommu/dart: Remove the force_bypass variable This flag just caches if the IO page size is larger than the CPU PAGE_SIZE. This only needs to be checked in two places so remove the confusingly named cache. dart would like to not support paging domains at all if the IO page size is larger than the CPU page size. In this case we should ideally fail domain_alloc_paging(), as there is no point in creating a domain that can never be attached. Move the test into apple_dart_finalize_domain(). The check in apple_dart_mod_streams() will prevent the domain from being attached to the wrong dart There is no HW limitation that prevents BLOCKED domains from working, remove that test. The check in apple_dart_of_xlate() is redundant since immediately after the pgsize is checked. Remove it. Remove the variable. Suggested-by: Janne Grunau Signed-off-by: Jason Gunthorpe Reviewed-by: Janne Grunau Acked-by: Sven Peter Link: https://lore.kernel.org/r/9-v2-bff223cf6409+282-dart_paging_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit bbc70e0aec287e164344b1a071bd46466a4f29b3) Signed-off-by: Koba Ko --- drivers/iommu/apple-dart.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c index 126da0d89f0dd..821b4a3465dfb 100644 --- a/drivers/iommu/apple-dart.c +++ b/drivers/iommu/apple-dart.c @@ -196,7 +196,6 @@ struct apple_dart_hw { * @lock: lock for hardware operations involving this dart * @pgsize: pagesize supported by this DART * @supports_bypass: indicates if this DART supports bypass mode - * @force_bypass: force bypass mode due to pagesize mismatch? * @sid2group: maps stream ids to iommu_groups * @iommu: iommu core device */ @@ -217,7 +216,6 @@ struct apple_dart { u32 pgsize; u32 num_streams; u32 supports_bypass : 1; - u32 force_bypass : 1; struct iommu_group *sid2group[DART_MAX_STREAMS]; struct iommu_device iommu; @@ -576,6 +574,9 @@ static int apple_dart_finalize_domain(struct apple_dart_domain *dart_domain, int ret = 0; int i, j; + if (dart->pgsize > PAGE_SIZE) + return -EINVAL; + mutex_lock(&dart_domain->init_lock); if (dart_domain->finalized) @@ -659,9 +660,6 @@ static int apple_dart_attach_dev_paging(struct iommu_domain *domain, struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); struct apple_dart_domain *dart_domain = to_dart_domain(domain); - if (cfg->stream_maps[0].dart->force_bypass) - return -EINVAL; - ret = apple_dart_finalize_domain(dart_domain, cfg); if (ret) return ret; @@ -706,9 +704,6 @@ static int apple_dart_attach_dev_blocked(struct iommu_domain *domain, struct apple_dart_stream_map *stream_map; int i; - if (cfg->stream_maps[0].dart->force_bypass) - return -EINVAL; - for_each_stream_map(i, cfg, stream_map) apple_dart_hw_disable_dma(stream_map); return 0; @@ -803,8 +798,6 @@ static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args) if (cfg_dart) { if (cfg_dart->supports_bypass != dart->supports_bypass) return -EINVAL; - if (cfg_dart->force_bypass != dart->force_bypass) - return -EINVAL; if (cfg_dart->pgsize != dart->pgsize) return -EINVAL; } @@ -946,7 +939,7 @@ static int apple_dart_def_domain_type(struct device *dev) { struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev); - if (cfg->stream_maps[0].dart->force_bypass) + if (cfg->stream_maps[0].dart->pgsize > PAGE_SIZE) return IOMMU_DOMAIN_IDENTITY; if (!cfg->stream_maps[0].dart->supports_bypass) return IOMMU_DOMAIN_DMA; @@ -1146,8 +1139,6 @@ static int apple_dart_probe(struct platform_device *pdev) goto err_clk_disable; } - dart->force_bypass = dart->pgsize > PAGE_SIZE; - ret = apple_dart_hw_reset(dart); if (ret) goto err_clk_disable; @@ -1171,7 +1162,8 @@ static int apple_dart_probe(struct platform_device *pdev) dev_info( &pdev->dev, "DART [pagesize %x, %d streams, bypass support: %d, bypass forced: %d] initialized\n", - dart->pgsize, dart->num_streams, dart->supports_bypass, dart->force_bypass); + dart->pgsize, dart->num_streams, dart->supports_bypass, + dart->pgsize > PAGE_SIZE); return 0; err_sysfs_remove: From bf7239df9635ad6748edc151da714227b87c8ca4 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 1 Nov 2023 20:28:11 -0300 Subject: [PATCH 460/548] iommu: Flow ERR_PTR out from __iommu_domain_alloc() Most of the calling code now has error handling that can carry an error code further up the call chain. Keep the exported interface iommu_domain_alloc() returning NULL and reflow the internal code to use ERR_PTR not NULL for domain allocation failure. Optionally allow drivers to return ERR_PTR from any of the alloc ops. Many of the new ops (user, sva, etc) already return ERR_PTR, so having two rules is confusing and hard on drivers. This fixes a bug in DART that was returning ERR_PTR. Fixes: 482feb5c6492 ("iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging()") Reported-by: Dan Carpenter Link: https://lore.kernel.org/linux-iommu/b85e0715-3224-4f45-ad6b-ebb9f08c015d@moroto.mountain/ Signed-off-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Link: https://lore.kernel.org/r/0-v2-55ae413017b8+97-domain_alloc_err_ptr_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 34e2dccbb30baf7e5502bae382722aacbbfddc5b) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 57 ++++++++++++++++++++++++++++--------------- 1 file changed, 38 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 38208edf38104..ec96a00a67ef7 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1782,15 +1782,15 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) /* The driver gave no guidance on what type to use, try the default */ dom = __iommu_group_alloc_default_domain(group, iommu_def_domain_type); - if (dom) + if (!IS_ERR(dom)) return dom; /* Otherwise IDENTITY and DMA_FQ defaults will try DMA */ if (iommu_def_domain_type == IOMMU_DOMAIN_DMA) - return NULL; + return ERR_PTR(-EINVAL); dom = __iommu_group_alloc_default_domain(group, IOMMU_DOMAIN_DMA); - if (!dom) - return NULL; + if (IS_ERR(dom)) + return dom; pr_warn("Failed to allocate default IOMMU domain of type %u for group %s - Falling back to IOMMU_DOMAIN_DMA", iommu_def_domain_type, group->name); @@ -2089,10 +2089,17 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, else if (ops->domain_alloc) domain = ops->domain_alloc(alloc_type); else - return NULL; + return ERR_PTR(-EOPNOTSUPP); + /* + * Many domain_alloc ops now return ERR_PTR, make things easier for the + * driver by accepting ERR_PTR from all domain_alloc ops instead of + * having two rules. + */ + if (IS_ERR(domain)) + return domain; if (!domain) - return NULL; + return ERR_PTR(-ENOMEM); domain->type = type; /* @@ -2105,9 +2112,14 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, if (!domain->ops) domain->ops = ops->default_domain_ops; - if (iommu_is_dma_domain(domain) && iommu_get_dma_cookie(domain)) { - iommu_domain_free(domain); - domain = NULL; + if (iommu_is_dma_domain(domain)) { + int rc; + + rc = iommu_get_dma_cookie(domain); + if (rc) { + iommu_domain_free(domain); + return ERR_PTR(rc); + } } return domain; } @@ -2124,10 +2136,15 @@ __iommu_group_domain_alloc(struct iommu_group *group, unsigned int type) struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus) { + struct iommu_domain *domain; + if (bus == NULL || bus->iommu_ops == NULL) return NULL; - return __iommu_domain_alloc(bus->iommu_ops, NULL, + domain = __iommu_domain_alloc(bus->iommu_ops, NULL, IOMMU_DOMAIN_UNMANAGED); + if (IS_ERR(domain)) + return NULL; + return domain; } EXPORT_SYMBOL_GPL(iommu_domain_alloc); @@ -3085,8 +3102,8 @@ static int iommu_setup_default_domain(struct iommu_group *group, return -EINVAL; dom = iommu_group_alloc_default_domain(group, req_type); - if (!dom) - return -ENODEV; + if (IS_ERR(dom)) + return PTR_ERR(dom); if (group->default_domain == dom) return 0; @@ -3312,21 +3329,23 @@ void iommu_device_unuse_default_domain(struct device *dev) static int __iommu_group_alloc_blocking_domain(struct iommu_group *group) { + struct iommu_domain *domain; + if (group->blocking_domain) return 0; - group->blocking_domain = - __iommu_group_domain_alloc(group, IOMMU_DOMAIN_BLOCKED); - if (!group->blocking_domain) { + domain = __iommu_group_domain_alloc(group, IOMMU_DOMAIN_BLOCKED); + if (IS_ERR(domain)) { /* * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED * create an empty domain instead. */ - group->blocking_domain = __iommu_group_domain_alloc( - group, IOMMU_DOMAIN_UNMANAGED); - if (!group->blocking_domain) - return -EINVAL; + domain = __iommu_group_domain_alloc(group, + IOMMU_DOMAIN_UNMANAGED); + if (IS_ERR(domain)) + return PTR_ERR(domain); } + group->blocking_domain = domain; return 0; } From 2411c282d2e51b4681ad1ef158d0459fc2348a72 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Sun, 12 Nov 2023 14:50:13 -0400 Subject: [PATCH 461/548] iommufd: Add iommufd_ctx to iommufd_put_object() Will be used in the next patch. Link: https://lore.kernel.org/r/1-v2-ca9e00171c5b+123-iommufd_syz4_jgg@nvidia.com/ Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit bd7a282650b8beb57bc9d19bfcb714b1ccae843a) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 14 +++++++------- drivers/iommu/iommufd/hw_pagetable.c | 8 ++++---- drivers/iommu/iommufd/ioas.c | 14 +++++++------- drivers/iommu/iommufd/iommufd_private.h | 3 ++- drivers/iommu/iommufd/selftest.c | 14 +++++++------- drivers/iommu/iommufd/vfio_compat.c | 18 +++++++++--------- 6 files changed, 36 insertions(+), 35 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index dbcb027857b6b..68ac138d925ca 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -587,7 +587,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, continue; destroy_hwpt = (*do_attach)(idev, hwpt); if (IS_ERR(destroy_hwpt)) { - iommufd_put_object(&hwpt->obj); + iommufd_put_object(idev->ictx, &hwpt->obj); /* * -EINVAL means the domain is incompatible with the * device. Other error codes should propagate to @@ -599,7 +599,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, goto out_unlock; } *pt_id = hwpt->obj.id; - iommufd_put_object(&hwpt->obj); + iommufd_put_object(idev->ictx, &hwpt->obj); goto out_unlock; } @@ -668,7 +668,7 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, destroy_hwpt = ERR_PTR(-EINVAL); goto out_put_pt_obj; } - iommufd_put_object(pt_obj); + iommufd_put_object(idev->ictx, pt_obj); /* This destruction has to be after we unlock everything */ if (destroy_hwpt) @@ -676,7 +676,7 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id, return 0; out_put_pt_obj: - iommufd_put_object(pt_obj); + iommufd_put_object(idev->ictx, pt_obj); return PTR_ERR(destroy_hwpt); } @@ -808,7 +808,7 @@ static int iommufd_access_change_ioas_id(struct iommufd_access *access, u32 id) if (IS_ERR(ioas)) return PTR_ERR(ioas); rc = iommufd_access_change_ioas(access, ioas); - iommufd_put_object(&ioas->obj); + iommufd_put_object(access->ictx, &ioas->obj); return rc; } @@ -957,7 +957,7 @@ void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova, access->ops->unmap(access->data, iova, length); - iommufd_put_object(&access->obj); + iommufd_put_object(access->ictx, &access->obj); xa_lock(&ioas->iopt.access_list); } xa_unlock(&ioas->iopt.access_list); @@ -1259,6 +1259,6 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd) out_free: kfree(data); out_put: - iommufd_put_object(&idev->obj); + iommufd_put_object(ucmd->ictx, &idev->obj); return rc; } diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 2abbeafdbd22d..cbb5df0a6c32f 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -318,9 +318,9 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) if (ioas) mutex_unlock(&ioas->mutex); out_put_pt: - iommufd_put_object(pt_obj); + iommufd_put_object(ucmd->ictx, pt_obj); out_put_idev: - iommufd_put_object(&idev->obj); + iommufd_put_object(ucmd->ictx, &idev->obj); return rc; } @@ -345,7 +345,7 @@ int iommufd_hwpt_set_dirty_tracking(struct iommufd_ucmd *ucmd) rc = iopt_set_dirty_tracking(&ioas->iopt, hwpt_paging->common.domain, enable); - iommufd_put_object(&hwpt_paging->common.obj); + iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj); return rc; } @@ -368,6 +368,6 @@ int iommufd_hwpt_get_dirty_bitmap(struct iommufd_ucmd *ucmd) rc = iopt_read_and_clear_dirty_data( &ioas->iopt, hwpt_paging->common.domain, cmd->flags, cmd); - iommufd_put_object(&hwpt_paging->common.obj); + iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj); return rc; } diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c index 0407e2b758ef4..157a89b993e43 100644 --- a/drivers/iommu/iommufd/ioas.c +++ b/drivers/iommu/iommufd/ioas.c @@ -105,7 +105,7 @@ int iommufd_ioas_iova_ranges(struct iommufd_ucmd *ucmd) rc = -EMSGSIZE; out_put: up_read(&ioas->iopt.iova_rwsem); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; } @@ -175,7 +175,7 @@ int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd) interval_tree_remove(node, &allowed_iova); kfree(container_of(node, struct iopt_allowed, node)); } - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; } @@ -232,7 +232,7 @@ int iommufd_ioas_map(struct iommufd_ucmd *ucmd) cmd->iova = iova; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_put: - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; } @@ -266,7 +266,7 @@ int iommufd_ioas_copy(struct iommufd_ucmd *ucmd) return PTR_ERR(src_ioas); rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length, &pages_list); - iommufd_put_object(&src_ioas->obj); + iommufd_put_object(ucmd->ictx, &src_ioas->obj); if (rc) return rc; @@ -287,7 +287,7 @@ int iommufd_ioas_copy(struct iommufd_ucmd *ucmd) cmd->dst_iova = iova; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_put_dst: - iommufd_put_object(&dst_ioas->obj); + iommufd_put_object(ucmd->ictx, &dst_ioas->obj); out_pages: iopt_free_pages_list(&pages_list); return rc; @@ -323,7 +323,7 @@ int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd) rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_put: - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; } @@ -401,6 +401,6 @@ int iommufd_ioas_option(struct iommufd_ucmd *ucmd) rc = -EOPNOTSUPP; } - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; } diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 12622a42f19d0..b709ef98cdbb9 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -154,7 +154,8 @@ static inline bool iommufd_lock_obj(struct iommufd_object *obj) struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id, enum iommufd_object_type type); -static inline void iommufd_put_object(struct iommufd_object *obj) +static inline void iommufd_put_object(struct iommufd_ctx *ictx, + struct iommufd_object *obj) { refcount_dec(&obj->users); up_read(&obj->destroy_rwsem); diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 9b911b1313459..e8f3df75bfbaa 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -101,7 +101,7 @@ void iommufd_test_syz_conv_iova_id(struct iommufd_ucmd *ucmd, if (IS_ERR(ioas)) return; *iova = __iommufd_test_syz_conv_iova(&ioas->iopt, iova); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); } struct mock_iommu_domain { @@ -550,7 +550,7 @@ get_md_pagetable(struct iommufd_ucmd *ucmd, u32 mockpt_id, return hwpt; if (hwpt->domain->type != IOMMU_DOMAIN_UNMANAGED || hwpt->domain->ops != mock_ops.default_domain_ops) { - iommufd_put_object(&hwpt->obj); + iommufd_put_object(ucmd->ictx, &hwpt->obj); return ERR_PTR(-EINVAL); } *mock = container_of(hwpt->domain, struct mock_iommu_domain, domain); @@ -568,7 +568,7 @@ get_md_pagetable_nested(struct iommufd_ucmd *ucmd, u32 mockpt_id, return hwpt; if (hwpt->domain->type != IOMMU_DOMAIN_NESTED || hwpt->domain->ops != &domain_nested_ops) { - iommufd_put_object(&hwpt->obj); + iommufd_put_object(ucmd->ictx, &hwpt->obj); return ERR_PTR(-EINVAL); } *mock_nested = container_of(hwpt->domain, @@ -731,7 +731,7 @@ static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd, rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); out_dev_obj: - iommufd_put_object(dev_obj); + iommufd_put_object(ucmd->ictx, dev_obj); return rc; } @@ -749,7 +749,7 @@ static int iommufd_test_add_reserved(struct iommufd_ucmd *ucmd, down_write(&ioas->iopt.iova_rwsem); rc = iopt_reserve_iova(&ioas->iopt, start, start + length - 1, NULL); up_write(&ioas->iopt.iova_rwsem); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return rc; } @@ -804,7 +804,7 @@ static int iommufd_test_md_check_pa(struct iommufd_ucmd *ucmd, rc = 0; out_put: - iommufd_put_object(&hwpt->obj); + iommufd_put_object(ucmd->ictx, &hwpt->obj); return rc; } @@ -1283,7 +1283,7 @@ static int iommufd_test_dirty(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, out_free: kvfree(tmp); out_put: - iommufd_put_object(&hwpt->obj); + iommufd_put_object(ucmd->ictx, &hwpt->obj); return rc; } diff --git a/drivers/iommu/iommufd/vfio_compat.c b/drivers/iommu/iommufd/vfio_compat.c index 538fbf76354d1..a3ad5f0b6c59d 100644 --- a/drivers/iommu/iommufd/vfio_compat.c +++ b/drivers/iommu/iommufd/vfio_compat.c @@ -41,7 +41,7 @@ int iommufd_vfio_compat_ioas_get_id(struct iommufd_ctx *ictx, u32 *out_ioas_id) if (IS_ERR(ioas)) return PTR_ERR(ioas); *out_ioas_id = ioas->obj.id; - iommufd_put_object(&ioas->obj); + iommufd_put_object(ictx, &ioas->obj); return 0; } EXPORT_SYMBOL_NS_GPL(iommufd_vfio_compat_ioas_get_id, IOMMUFD_VFIO); @@ -98,7 +98,7 @@ int iommufd_vfio_compat_ioas_create(struct iommufd_ctx *ictx) if (ictx->vfio_ioas && iommufd_lock_obj(&ictx->vfio_ioas->obj)) { ret = 0; - iommufd_put_object(&ictx->vfio_ioas->obj); + iommufd_put_object(ictx, &ictx->vfio_ioas->obj); goto out_abort; } ictx->vfio_ioas = ioas; @@ -133,7 +133,7 @@ int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd) if (IS_ERR(ioas)) return PTR_ERR(ioas); cmd->ioas_id = ioas->obj.id; - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return iommufd_ucmd_respond(ucmd, sizeof(*cmd)); case IOMMU_VFIO_IOAS_SET: @@ -143,7 +143,7 @@ int iommufd_vfio_ioas(struct iommufd_ucmd *ucmd) xa_lock(&ucmd->ictx->objects); ucmd->ictx->vfio_ioas = ioas; xa_unlock(&ucmd->ictx->objects); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ucmd->ictx, &ioas->obj); return 0; case IOMMU_VFIO_IOAS_CLEAR: @@ -190,7 +190,7 @@ static int iommufd_vfio_map_dma(struct iommufd_ctx *ictx, unsigned int cmd, iova = map.iova; rc = iopt_map_user_pages(ictx, &ioas->iopt, &iova, u64_to_user_ptr(map.vaddr), map.size, iommu_prot, 0); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ictx, &ioas->obj); return rc; } @@ -249,7 +249,7 @@ static int iommufd_vfio_unmap_dma(struct iommufd_ctx *ictx, unsigned int cmd, rc = -EFAULT; err_put: - iommufd_put_object(&ioas->obj); + iommufd_put_object(ictx, &ioas->obj); return rc; } @@ -272,7 +272,7 @@ static int iommufd_vfio_cc_iommu(struct iommufd_ctx *ictx) } mutex_unlock(&ioas->mutex); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ictx, &ioas->obj); return rc; } @@ -349,7 +349,7 @@ static int iommufd_vfio_set_iommu(struct iommufd_ctx *ictx, unsigned long type) */ if (type == VFIO_TYPE1_IOMMU) rc = iopt_disable_large_pages(&ioas->iopt); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ictx, &ioas->obj); return rc; } @@ -511,7 +511,7 @@ static int iommufd_vfio_iommu_get_info(struct iommufd_ctx *ictx, out_put: up_read(&ioas->iopt.iova_rwsem); - iommufd_put_object(&ioas->obj); + iommufd_put_object(ictx, &ioas->obj); return rc; } From ec503f8059a2c64680eb8a496c20432b7b8f8919 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Sun, 12 Nov 2023 15:44:08 -0400 Subject: [PATCH 462/548] iommufd: Do not UAF during iommufd_put_object() The mixture of kernel and user space lifecycle objects continues to be complicated inside iommufd. The obj->destroy_rwsem is used to bring order to the kernel driver destruction sequence but it cannot be sequenced right with the other refcounts so we end up possibly UAF'ing: BUG: KASAN: slab-use-after-free in __up_read+0x627/0x750 kernel/locking/rwsem.c:1342 Read of size 8 at addr ffff888073cde868 by task syz-executor934/6535 CPU: 1 PID: 6535 Comm: syz-executor934 Not tainted 6.6.0-rc7-syzkaller-00195-g2af9b20dbb39 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc4/0x620 mm/kasan/report.c:475 kasan_report+0xda/0x110 mm/kasan/report.c:588 __up_read+0x627/0x750 kernel/locking/rwsem.c:1342 iommufd_put_object drivers/iommu/iommufd/iommufd_private.h:149 [inline] iommufd_vfio_ioas+0x46c/0x580 drivers/iommu/iommufd/vfio_compat.c:146 iommufd_fops_ioctl+0x347/0x4d0 drivers/iommu/iommufd/main.c:398 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl fs/ioctl.c:857 [inline] __x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd There are two races here, the more obvious one: CPU 0 CPU 1 iommufd_put_object() iommufd_destroy() refcount_dec(&obj->users) iommufd_object_remove() kfree() up_read(&obj->destroy_rwsem) // Boom And there is also perhaps some possibility that the rwsem could hit an issue: CPU 0 CPU 1 iommufd_put_object() iommufd_object_destroy_user() refcount_dec(&obj->users); down_write(&obj->destroy_rwsem) up_read(&obj->destroy_rwsem); atomic_long_or(RWSEM_FLAG_WAITERS, &sem->count); tmp = atomic_long_add_return_release() rwsem_try_write_lock() iommufd_object_remove() up_write(&obj->destroy_rwsem) kfree() clear_nonspinnable() // Boom Fix this by reorganizing this again so that two refcounts are used to keep track of things with a rule that users == 0 && shortterm_users == 0 means no other threads have that memory. Put a wait_queue in the iommufd_ctx object that is triggered when any sub object reaches a 0 shortterm_users. This allows the same wait for userspace ioctls to finish behavior that the rwsem was providing. This is weaker still than the prior versions: - There is no bias on shortterm_users so if some thread is waiting to destroy other threads can continue to get new read sides - If destruction fails, eg because of an active in-kernel user, then shortterm_users will have cycled to zero momentarily blocking new users - If userspace races destroy with other userspace operations they continue to get an EBUSY since we still can't intermix looking up an ID and sleeping for its unref In all cases these are things that userspace brings on itself, correct programs will not hit them. Fixes: 99f98a7c0d69 ("iommufd: IOMMUFD_DESTROY should not increase the refcount") Link: https://lore.kernel.org/all/2-v2-ca9e00171c5b+123-iommufd_syz4_jgg@nvidia.com/ Reported-by: syzbot+d31adfb277377ef8fcba@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/00000000000055ef9a0609336580@google.com Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 6f9c4d8c468c189d6dc470324bd52955f8aa0a10) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iommufd_private.h | 67 +++++++++-- drivers/iommu/iommufd/main.c | 146 +++++++++++++----------- 2 files changed, 135 insertions(+), 78 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index b709ef98cdbb9..8913d4359cd30 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -21,6 +21,7 @@ struct iommufd_ctx { struct file *file; struct xarray objects; struct xarray groups; + wait_queue_head_t destroy_wait; u8 account_mode; /* Compatibility with VFIO no iommu */ @@ -135,7 +136,7 @@ enum iommufd_object_type { /* Base struct for all objects with a userspace ID handle. */ struct iommufd_object { - struct rw_semaphore destroy_rwsem; + refcount_t shortterm_users; refcount_t users; enum iommufd_object_type type; unsigned int id; @@ -143,10 +144,15 @@ struct iommufd_object { static inline bool iommufd_lock_obj(struct iommufd_object *obj) { - if (!down_read_trylock(&obj->destroy_rwsem)) + if (!refcount_inc_not_zero(&obj->users)) return false; - if (!refcount_inc_not_zero(&obj->users)) { - up_read(&obj->destroy_rwsem); + if (!refcount_inc_not_zero(&obj->shortterm_users)) { + /* + * If the caller doesn't already have a ref on obj this must be + * called under the xa_lock. Otherwise the caller is holding a + * ref on users. Thus it cannot be one before this decrement. + */ + refcount_dec(&obj->users); return false; } return true; @@ -157,8 +163,13 @@ struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id, static inline void iommufd_put_object(struct iommufd_ctx *ictx, struct iommufd_object *obj) { + /* + * Users first, then shortterm so that REMOVE_WAIT_SHORTTERM never sees + * a spurious !0 users with a 0 shortterm_users. + */ refcount_dec(&obj->users); - up_read(&obj->destroy_rwsem); + if (refcount_dec_and_test(&obj->shortterm_users)) + wake_up_interruptible_all(&ictx->destroy_wait); } void iommufd_object_abort(struct iommufd_ctx *ictx, struct iommufd_object *obj); @@ -166,17 +177,49 @@ void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx, struct iommufd_object *obj); void iommufd_object_finalize(struct iommufd_ctx *ictx, struct iommufd_object *obj); -void __iommufd_object_destroy_user(struct iommufd_ctx *ictx, - struct iommufd_object *obj, bool allow_fail); + +enum { + REMOVE_WAIT_SHORTTERM = 1, +}; +int iommufd_object_remove(struct iommufd_ctx *ictx, + struct iommufd_object *to_destroy, u32 id, + unsigned int flags); + +/* + * The caller holds a users refcount and wants to destroy the object. At this + * point the caller has no shortterm_users reference and at least the xarray + * will be holding one. + */ static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx, struct iommufd_object *obj) { - __iommufd_object_destroy_user(ictx, obj, false); + int ret; + + ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT_SHORTTERM); + + /* + * If there is a bug and we couldn't destroy the object then we did put + * back the caller's users refcount and will eventually try to free it + * again during close. + */ + WARN_ON(ret); } -static inline void iommufd_object_deref_user(struct iommufd_ctx *ictx, - struct iommufd_object *obj) + +/* + * The HWPT allocated by autodomains is used in possibly many devices and + * is automatically destroyed when its refcount reaches zero. + * + * If userspace uses the HWPT manually, even for a short term, then it will + * disrupt this refcounting and the auto-free in the kernel will not work. + * Userspace that tries to use the automatically allocated HWPT must be careful + * to ensure that it is consistently destroyed, eg by not racing accesses + * and by not attaching an automatic HWPT to a device manually. + */ +static inline void +iommufd_object_put_and_try_destroy(struct iommufd_ctx *ictx, + struct iommufd_object *obj) { - __iommufd_object_destroy_user(ictx, obj, true); + iommufd_object_remove(ictx, obj, obj->id, 0); } struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, @@ -313,7 +356,7 @@ static inline void iommufd_hw_pagetable_put(struct iommufd_ctx *ictx, lockdep_assert_not_held(&hwpt_paging->ioas->mutex); if (hwpt_paging->auto_domain) { - iommufd_object_deref_user(ictx, &hwpt->obj); + iommufd_object_put_and_try_destroy(ictx, &hwpt->obj); return; } } diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 45b9d40773b13..c9091e46d208a 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -33,7 +33,6 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, enum iommufd_object_type type) { - static struct lock_class_key obj_keys[IOMMUFD_OBJ_MAX]; struct iommufd_object *obj; int rc; @@ -41,15 +40,8 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, if (!obj) return ERR_PTR(-ENOMEM); obj->type = type; - /* - * In most cases the destroy_rwsem is obtained with try so it doesn't - * interact with lockdep, however on destroy we have to sleep. This - * means if we have to destroy an object while holding a get on another - * object it triggers lockdep. Using one locking class per object type - * is a simple and reasonable way to avoid this. - */ - __init_rwsem(&obj->destroy_rwsem, "iommufd_object::destroy_rwsem", - &obj_keys[type]); + /* Starts out bias'd by 1 until it is removed from the xarray */ + refcount_set(&obj->shortterm_users, 1); refcount_set(&obj->users, 1); /* @@ -129,92 +121,113 @@ struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id, return obj; } +static int iommufd_object_dec_wait_shortterm(struct iommufd_ctx *ictx, + struct iommufd_object *to_destroy) +{ + if (refcount_dec_and_test(&to_destroy->shortterm_users)) + return 0; + + if (wait_event_timeout(ictx->destroy_wait, + refcount_read(&to_destroy->shortterm_users) == + 0, + msecs_to_jiffies(10000))) + return 0; + + pr_crit("Time out waiting for iommufd object to become free\n"); + refcount_inc(&to_destroy->shortterm_users); + return -EBUSY; +} + /* * Remove the given object id from the xarray if the only reference to the - * object is held by the xarray. The caller must call ops destroy(). + * object is held by the xarray. */ -static struct iommufd_object *iommufd_object_remove(struct iommufd_ctx *ictx, - u32 id, bool extra_put) +int iommufd_object_remove(struct iommufd_ctx *ictx, + struct iommufd_object *to_destroy, u32 id, + unsigned int flags) { struct iommufd_object *obj; XA_STATE(xas, &ictx->objects, id); - - xa_lock(&ictx->objects); - obj = xas_load(&xas); - if (xa_is_zero(obj) || !obj) { - obj = ERR_PTR(-ENOENT); - goto out_xa; - } + bool zerod_shortterm = false; + int ret; /* - * If the caller is holding a ref on obj we put it here under the - * spinlock. + * The purpose of the shortterm_users is to ensure deterministic + * destruction of objects used by external drivers and destroyed by this + * function. Any temporary increment of the refcount must increment + * shortterm_users, such as during ioctl execution. */ - if (extra_put) + if (flags & REMOVE_WAIT_SHORTTERM) { + ret = iommufd_object_dec_wait_shortterm(ictx, to_destroy); + if (ret) { + /* + * We have a bug. Put back the callers reference and + * defer cleaning this object until close. + */ + refcount_dec(&to_destroy->users); + return ret; + } + zerod_shortterm = true; + } + + xa_lock(&ictx->objects); + obj = xas_load(&xas); + if (to_destroy) { + /* + * If the caller is holding a ref on obj we put it here under + * the spinlock. + */ refcount_dec(&obj->users); + if (WARN_ON(obj != to_destroy)) { + ret = -ENOENT; + goto err_xa; + } + } else if (xa_is_zero(obj) || !obj) { + ret = -ENOENT; + goto err_xa; + } + if (!refcount_dec_if_one(&obj->users)) { - obj = ERR_PTR(-EBUSY); - goto out_xa; + ret = -EBUSY; + goto err_xa; } xas_store(&xas, NULL); if (ictx->vfio_ioas == container_of(obj, struct iommufd_ioas, obj)) ictx->vfio_ioas = NULL; - -out_xa: xa_unlock(&ictx->objects); - /* The returned object reference count is zero */ - return obj; -} - -/* - * The caller holds a users refcount and wants to destroy the object. Returns - * true if the object was destroyed. In all cases the caller no longer has a - * reference on obj. - */ -void __iommufd_object_destroy_user(struct iommufd_ctx *ictx, - struct iommufd_object *obj, bool allow_fail) -{ - struct iommufd_object *ret; - /* - * The purpose of the destroy_rwsem is to ensure deterministic - * destruction of objects used by external drivers and destroyed by this - * function. Any temporary increment of the refcount must hold the read - * side of this, such as during ioctl execution. - */ - down_write(&obj->destroy_rwsem); - ret = iommufd_object_remove(ictx, obj->id, true); - up_write(&obj->destroy_rwsem); - - if (allow_fail && IS_ERR(ret)) - return; - - /* - * If there is a bug and we couldn't destroy the object then we did put - * back the caller's refcount and will eventually try to free it again - * during close. + * Since users is zero any positive users_shortterm must be racing + * iommufd_put_object(), or we have a bug. */ - if (WARN_ON(IS_ERR(ret))) - return; + if (!zerod_shortterm) { + ret = iommufd_object_dec_wait_shortterm(ictx, obj); + if (WARN_ON(ret)) + return ret; + } iommufd_object_ops[obj->type].destroy(obj); kfree(obj); + return 0; + +err_xa: + if (zerod_shortterm) { + /* Restore the xarray owned reference */ + refcount_set(&obj->shortterm_users, 1); + } + xa_unlock(&ictx->objects); + + /* The returned object reference count is zero */ + return ret; } static int iommufd_destroy(struct iommufd_ucmd *ucmd) { struct iommu_destroy *cmd = ucmd->cmd; - struct iommufd_object *obj; - obj = iommufd_object_remove(ucmd->ictx, cmd->id, false); - if (IS_ERR(obj)) - return PTR_ERR(obj); - iommufd_object_ops[obj->type].destroy(obj); - kfree(obj); - return 0; + return iommufd_object_remove(ucmd->ictx, NULL, cmd->id, 0); } static int iommufd_fops_open(struct inode *inode, struct file *filp) @@ -238,6 +251,7 @@ static int iommufd_fops_open(struct inode *inode, struct file *filp) xa_init_flags(&ictx->objects, XA_FLAGS_ALLOC1 | XA_FLAGS_ACCOUNT); xa_init(&ictx->groups); ictx->file = filp; + init_waitqueue_head(&ictx->destroy_wait); filp->private_data = ictx; return 0; } From 3e615d722510a3c16c9dd46607ba3f8ff7a00082 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 27 Mar 2024 15:07:51 -0300 Subject: [PATCH 463/548] iommu/arm-smmu-v3: Add a type for the CD entry Instead of passing a naked __le16 * around to represent a CD table entry wrap it in a "struct arm_smmu_cd" with an array of the correct size. This makes it much clearer which functions will comprise the "CD API". Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Moritz Fischer Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/5-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit e8e4398d53f98be7ac48e0bda9ea6e26df24136d) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++--------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++- 2 files changed, 17 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index bf381abdfdf1e..9050323bf1694 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1219,7 +1219,8 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst, WRITE_ONCE(*dst, cpu_to_le64(val)); } -static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) +static struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, + u32 ssid) { __le64 *l1ptr; unsigned int idx; @@ -1228,7 +1229,8 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) - return cd_table->cdtab + ssid * CTXDESC_CD_DWORDS; + return (struct arm_smmu_cd *)(cd_table->cdtab + + ssid * CTXDESC_CD_DWORDS); idx = ssid >> CTXDESC_SPLIT; l1_desc = &cd_table->l1_desc[idx]; @@ -1242,7 +1244,7 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) arm_smmu_sync_cd(master, ssid, false); } idx = ssid & (CTXDESC_L2_ENTRIES - 1); - return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS; + return &l1_desc->l2ptr[idx]; } int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, @@ -1261,7 +1263,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, */ u64 val; bool cd_live; - __le64 *cdptr; + struct arm_smmu_cd *cdptr; struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) @@ -1271,7 +1273,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, if (!cdptr) return -ENOMEM; - val = le64_to_cpu(cdptr[0]); + val = le64_to_cpu(cdptr->data[0]); cd_live = !!(val & CTXDESC_CD_0_V); if (!cd) { /* (5) */ @@ -1286,9 +1288,9 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, * this substream's traffic */ } else { /* (1) and (2) */ - cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); - cdptr[2] = 0; - cdptr[3] = cpu_to_le64(cd->mair); + cdptr->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); + cdptr->data[2] = 0; + cdptr->data[3] = cpu_to_le64(cd->mair); /* * STE may be live, and the SMMU might read dwords of this CD in any @@ -1320,7 +1322,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, * field within an aligned 64-bit span of a structure can be altered * without first making the structure invalid. */ - WRITE_ONCE(cdptr[0], cpu_to_le64(val)); + WRITE_ONCE(cdptr->data[0], cpu_to_le64(val)); arm_smmu_sync_cd(master, ssid, true); return 0; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 0b1f5820b8384..a28101e3b9751 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -282,6 +282,11 @@ struct arm_smmu_ste { #define CTXDESC_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 12) #define CTXDESC_CD_DWORDS 8 + +struct arm_smmu_cd { + __le64 data[CTXDESC_CD_DWORDS]; +}; + #define CTXDESC_CD_0_TCR_T0SZ GENMASK_ULL(5, 0) #define CTXDESC_CD_0_TCR_TG0 GENMASK_ULL(7, 6) #define CTXDESC_CD_0_TCR_IRGN0 GENMASK_ULL(9, 8) @@ -591,7 +596,7 @@ struct arm_smmu_ctx_desc { }; struct arm_smmu_l1_ctx_desc { - __le64 *l2ptr; + struct arm_smmu_cd *l2ptr; dma_addr_t l2ptr_dma; }; From f56c18d1baf339c4f981b17d0f4ba190cbe046d5 Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Fri, 27 Oct 2023 08:05:22 +0800 Subject: [PATCH 464/548] iommu: Add mm_get_enqcmd_pasid() helper function mm_get_enqcmd_pasid() should be used by architecture code and closely related to learn the PASID value that the x86 ENQCMD operation should use for the mm. For the moment SMMUv3 uses this without any connection to ENQCMD, it will be cleaned up similar to how the prior patch made VT-d use the PASID argument of set_dev_pasid(). The motivation is to replace mm->pasid with an iommu private data structure that is introduced in a later patch. Reviewed-by: Lu Baolu Reviewed-by: Jason Gunthorpe Tested-by: Nicolin Chen Signed-off-by: Tina Zhang Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231027000525.1278806-4-tina.zhang@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 2396046d75d3c0b2cfead852a77efd023f8539dc) Signed-off-by: Koba Ko --- arch/x86/kernel/traps.c | 2 +- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 23 ++++++++++++------- drivers/iommu/iommu-sva.c | 2 +- include/linux/iommu.h | 12 ++++++++++ 4 files changed, 29 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index a52db362a65d1..9e76ddc06201e 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -694,7 +694,7 @@ static bool try_fixup_enqcmd_gp(void) if (!mm_valid_pasid(current->mm)) return false; - pasid = current->mm->pasid; + pasid = mm_get_enqcmd_pasid(current->mm); /* * Did this thread already have its PASID activated? diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index c61b23b03b793..540f524ecf018 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -246,7 +246,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn, smmu_domain); } - arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, start, size); + arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), start, + size); } static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) @@ -264,10 +265,11 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events, * but disable translation. */ - arm_smmu_update_ctx_desc_devices(smmu_domain, mm->pasid, &quiet_cd); + arm_smmu_update_ctx_desc_devices(smmu_domain, mm_get_enqcmd_pasid(mm), + &quiet_cd); arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid); - arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0); + arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0); smmu_mn->cleared = true; mutex_unlock(&sva_lock); @@ -325,10 +327,13 @@ arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain, spin_lock_irqsave(&smmu_domain->devices_lock, flags); list_for_each_entry(master, &smmu_domain->devices, domain_head) { - ret = arm_smmu_write_ctx_desc(master, mm->pasid, cd); + ret = arm_smmu_write_ctx_desc(master, mm_get_enqcmd_pasid(mm), + cd); if (ret) { - list_for_each_entry_from_reverse(master, &smmu_domain->devices, domain_head) - arm_smmu_write_ctx_desc(master, mm->pasid, NULL); + list_for_each_entry_from_reverse( + master, &smmu_domain->devices, domain_head) + arm_smmu_write_ctx_desc( + master, mm_get_enqcmd_pasid(mm), NULL); break; } } @@ -358,7 +363,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) list_del(&smmu_mn->list); - arm_smmu_update_ctx_desc_devices(smmu_domain, mm->pasid, NULL); + arm_smmu_update_ctx_desc_devices(smmu_domain, mm_get_enqcmd_pasid(mm), + NULL); /* * If we went through clear(), we've already invalidated, and no @@ -366,7 +372,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) */ if (!smmu_mn->cleared) { arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid); - arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, 0, 0); + arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, + 0); } /* Frees smmu_mn */ diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index b78671a8a9143..4a2f5699747f1 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -141,7 +141,7 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle) { struct iommu_domain *domain = handle->domain; - return domain->mm->pasid; + return mm_get_enqcmd_pasid(domain->mm); } EXPORT_SYMBOL_GPL(iommu_sva_get_pasid); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 34d1943fb58d4..ef18ae54b94cd 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1351,6 +1351,12 @@ static inline bool mm_valid_pasid(struct mm_struct *mm) { return mm->pasid != IOMMU_PASID_INVALID; } + +static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm) +{ + return mm->pasid; +} + void mm_pasid_drop(struct mm_struct *mm); struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm); @@ -1373,6 +1379,12 @@ static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle) } static inline void mm_pasid_init(struct mm_struct *mm) {} static inline bool mm_valid_pasid(struct mm_struct *mm) { return false; } + +static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm) +{ + return IOMMU_PASID_INVALID; +} + static inline void mm_pasid_drop(struct mm_struct *mm) {} #endif /* CONFIG_IOMMU_SVA */ From 56e30637b6129117c831f1cc61f02afbbcafc8b4 Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Fri, 27 Oct 2023 08:05:23 +0800 Subject: [PATCH 465/548] mm: Add structure to keep sva information Introduce iommu_mm_data structure to keep sva information (pasid and the related sva domains). Add iommu_mm pointer, pointing to an instance of iommu_mm_data structure, to mm. Reviewed-by: Vasant Hegde Reviewed-by: Lu Baolu Reviewed-by: Jason Gunthorpe Tested-by: Nicolin Chen Signed-off-by: Tina Zhang Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231027000525.1278806-5-tina.zhang@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 541a3e257d48c16b77d19f39ed939ef5832046df) Signed-off-by: Koba Ko --- include/linux/iommu.h | 5 +++++ include/linux/mm_types.h | 2 ++ 2 files changed, 7 insertions(+) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ef18ae54b94cd..3283f39ff2c72 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -817,6 +817,11 @@ struct iommu_sva { struct iommu_domain *domain; }; +struct iommu_mm_data { + u32 pasid; + struct list_head sva_domains; +}; + int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, const struct iommu_ops *ops); void iommu_fwspec_free(struct device *dev); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index e77d4a5c0bace..a65ead76dfa68 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -730,6 +730,7 @@ struct mm_cid { #endif struct kioctx_table; +struct iommu_mm_data; struct mm_struct { struct { /* @@ -943,6 +944,7 @@ struct mm_struct { #ifdef CONFIG_IOMMU_SVA u32 pasid; + struct iommu_mm_data *iommu_mm; #endif #ifdef CONFIG_KSM /* From affd121f2aab3a6e94bfccb4018dc4b83c1fad82 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 27 Oct 2023 08:05:20 +0800 Subject: [PATCH 466/548] iommu: Change kconfig around IOMMU_SVA Linus suggested that the kconfig here is confusing: https://lore.kernel.org/all/CAHk-=wgUiAtiszwseM1p2fCJ+sC4XWQ+YN4TanFhUgvUqjr9Xw@mail.gmail.com/ Let's break it into three kconfigs controlling distinct things: - CONFIG_IOMMU_MM_DATA controls if the mm_struct has the additional fields for the IOMMU. Currently only PASID, but later patches store a struct iommu_mm_data * - CONFIG_ARCH_HAS_CPU_PASID controls if the arch needs the scheduling bit for keeping track of the ENQCMD instruction. x86 will select this if IOMMU_SVA is enabled - IOMMU_SVA controls if the IOMMU core compiles in the SVA support code for iommu driver use and the IOMMU exported API This way ARM will not enable CONFIG_ARCH_HAS_CPU_PASID Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231027000525.1278806-2-tina.zhang@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit d268e612f44bd503a687d1edadb62c4aefea97f5) Signed-off-by: Koba Ko --- arch/Kconfig | 5 +++++ arch/x86/Kconfig | 1 + arch/x86/kernel/traps.c | 2 +- drivers/iommu/Kconfig | 1 + include/linux/iommu.h | 2 +- include/linux/mm_types.h | 2 +- include/linux/sched.h | 2 +- kernel/fork.c | 2 +- mm/Kconfig | 3 +++ mm/init-mm.c | 2 +- 10 files changed, 16 insertions(+), 6 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index 09603e0bc2cc1..da16dd35fd1a9 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -309,6 +309,11 @@ config ARCH_HAS_DMA_CLEAR_UNCACHED config ARCH_HAS_CPU_FINALIZE_INIT bool +# The architecture has a per-task state that includes the mm's PASID +config ARCH_HAS_CPU_PASID + bool + select IOMMU_MM_DATA + # Select if arch init_task must go in the __init_task_data section config ARCH_TASK_STRUCT_ON_STACK bool diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2b5b7d9a24e98..3aeb7e76b6489 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -73,6 +73,7 @@ config X86 select ARCH_HAS_CACHE_LINE_SIZE select ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION select ARCH_HAS_CPU_FINALIZE_INIT + select ARCH_HAS_CPU_PASID if IOMMU_SVA select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL select ARCH_HAS_DEBUG_VM_PGTABLE if !X86_PAE diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 9e76ddc06201e..555a7b640437c 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -668,7 +668,7 @@ static bool fixup_iopl_exception(struct pt_regs *regs) */ static bool try_fixup_enqcmd_gp(void) { -#ifdef CONFIG_IOMMU_SVA +#ifdef CONFIG_ARCH_HAS_CPU_PASID u32 pasid; /* diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index ef1007dd494f7..d648fe0627eba 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -160,6 +160,7 @@ config IOMMU_DMA # Shared Virtual Addressing config IOMMU_SVA + select IOMMU_MM_DATA bool config FSL_PAMU diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 3283f39ff2c72..a829cf41614fe 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1347,7 +1347,7 @@ static inline bool tegra_dev_iommu_get_stream_id(struct device *dev, u32 *stream return false; } -#ifdef CONFIG_IOMMU_SVA +#ifdef CONFIG_IOMMU_MM_DATA static inline void mm_pasid_init(struct mm_struct *mm) { mm->pasid = IOMMU_PASID_INVALID; diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index a65ead76dfa68..92352bddf8bd6 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -942,7 +942,7 @@ struct mm_struct { #endif struct work_struct async_put_work; -#ifdef CONFIG_IOMMU_SVA +#ifdef CONFIG_IOMMU_MM_DATA u32 pasid; struct iommu_mm_data *iommu_mm; #endif diff --git a/include/linux/sched.h b/include/linux/sched.h index cb38eee732fd0..a084bc96e8d07 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -953,7 +953,7 @@ struct task_struct { /* Recursion prevention for eventfd_signal() */ unsigned in_eventfd:1; #endif -#ifdef CONFIG_IOMMU_SVA +#ifdef CONFIG_ARCH_HAS_CPU_PASID unsigned pasid_activated:1; #endif #ifdef CONFIG_CPU_SUP_INTEL diff --git a/kernel/fork.c b/kernel/fork.c index 7966c9a1c163d..1b78ddd2cd70f 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1185,7 +1185,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) tsk->use_memdelay = 0; #endif -#ifdef CONFIG_IOMMU_SVA +#ifdef CONFIG_ARCH_HAS_CPU_PASID tsk->pasid_activated = 0; #endif diff --git a/mm/Kconfig b/mm/Kconfig index c11cd01169e8d..22f79ce92a2a6 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1282,6 +1282,9 @@ config LOCK_MM_AND_FIND_VMA bool depends on !STACK_GROWSUP +config IOMMU_MM_DATA + bool + source "mm/damon/Kconfig" endmenu diff --git a/mm/init-mm.c b/mm/init-mm.c index cfd367822cdd2..c52dc2740a3de 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -44,7 +44,7 @@ struct mm_struct init_mm = { #endif .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, -#ifdef CONFIG_IOMMU_SVA +#ifdef CONFIG_IOMMU_MM_DATA .pasid = IOMMU_PASID_INVALID, #endif INIT_MM_CONTEXT(init_mm) From 4fb75212d998e0692003b5c06aa3ae08a8d452bb Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Fri, 27 Oct 2023 08:05:21 +0800 Subject: [PATCH 467/548] iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm() The pasid is passed in as a parameter through .set_dev_pasid() callback. Thus, intel_sva_bind_mm() can directly use it instead of retrieving the pasid value from mm->pasid. Suggested-by: Lu Baolu Reviewed-by: Lu Baolu Reviewed-by: Jason Gunthorpe Signed-off-by: Tina Zhang Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231027000525.1278806-3-tina.zhang@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 752c54987f773a41265b26b4aa6dd610f6c81cd7) Signed-off-by: Koba Ko --- drivers/iommu/intel/svm.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 6010b93c514c5..fe7ecf36eb783 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -316,21 +316,22 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid, } static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev, - struct mm_struct *mm) + struct iommu_domain *domain, ioasid_t pasid) { struct device_domain_info *info = dev_iommu_priv_get(dev); + struct mm_struct *mm = domain->mm; struct intel_svm_dev *sdev; struct intel_svm *svm; unsigned long sflags; int ret = 0; - svm = pasid_private_find(mm->pasid); + svm = pasid_private_find(pasid); if (!svm) { svm = kzalloc(sizeof(*svm), GFP_KERNEL); if (!svm) return -ENOMEM; - svm->pasid = mm->pasid; + svm->pasid = pasid; svm->mm = mm; INIT_LIST_HEAD_RCU(&svm->devs); @@ -368,7 +369,7 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev, /* Setup the pasid table: */ sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0; - ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid, + ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid, FLPT_DEFAULT_DID, sflags); if (ret) goto free_sdev; @@ -382,7 +383,7 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev, free_svm: if (list_empty(&svm->devs)) { mmu_notifier_unregister(&svm->notifier, mm); - pasid_private_remove(mm->pasid); + pasid_private_remove(pasid); kfree(svm); } @@ -822,9 +823,8 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain, { struct device_domain_info *info = dev_iommu_priv_get(dev); struct intel_iommu *iommu = info->iommu; - struct mm_struct *mm = domain->mm; - return intel_svm_bind_mm(iommu, dev, mm); + return intel_svm_bind_mm(iommu, dev, domain, pasid); } static void intel_svm_domain_free(struct iommu_domain *domain) From ab228a69fc7d1b03bcc3f9519f142eb2dc3d1713 Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Fri, 27 Oct 2023 08:05:24 +0800 Subject: [PATCH 468/548] iommu: Support mm PASID 1:n with sva domains Each mm bound to devices gets a PASID and corresponding sva domains allocated in iommu_sva_bind_device(), which are referenced by iommu_mm field of the mm. The PASID is released in __mmdrop(), while a sva domain is released when no one is using it (the reference count is decremented in iommu_sva_unbind_device()). However, although sva domains and their PASID are separate objects such that their own life cycles could be handled independently, an enqcmd use case may require releasing the PASID in releasing the mm (i.e., once a PASID is allocated for a mm, it will be permanently used by the mm and won't be released until the end of mm) and only allows to drop the PASID after the sva domains are released. To this end, mmgrab() is called in iommu_sva_domain_alloc() to increment the mm reference count and mmdrop() is invoked in iommu_domain_free() to decrement the mm reference count. Since the required info of PASID and sva domains is kept in struct iommu_mm_data of a mm, use mm->iommu_mm field instead of the old pasid field in mm struct. The sva domain list is protected by iommu_sva_lock. Besides, this patch removes mm_pasid_init(), as with the introduced iommu_mm structure, initializing mm pasid in mm_init() is unnecessary. Reviewed-by: Lu Baolu Reviewed-by: Vasant Hegde Reviewed-by: Jason Gunthorpe Tested-by: Nicolin Chen Signed-off-by: Tina Zhang Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231027000525.1278806-6-tina.zhang@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 27defa62483515bef0cbaa45569903fbf0ecc5c6) Signed-off-by: Koba Ko --- drivers/iommu/iommu-sva.c | 92 +++++++++++++++++++++++---------------- include/linux/iommu.h | 23 ++++++++-- 2 files changed, 74 insertions(+), 41 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 4a2f5699747f1..5175e8d85247b 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -12,32 +12,42 @@ static DEFINE_MUTEX(iommu_sva_lock); /* Allocate a PASID for the mm within range (inclusive) */ -static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev) +static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct device *dev) { + struct iommu_mm_data *iommu_mm; ioasid_t pasid; - int ret = 0; + + lockdep_assert_held(&iommu_sva_lock); if (!arch_pgtable_dma_compat(mm)) - return -EBUSY; + return ERR_PTR(-EBUSY); - mutex_lock(&iommu_sva_lock); + iommu_mm = mm->iommu_mm; /* Is a PASID already associated with this mm? */ - if (mm_valid_pasid(mm)) { - if (mm->pasid >= dev->iommu->max_pasids) - ret = -EOVERFLOW; - goto out; + if (iommu_mm) { + if (iommu_mm->pasid >= dev->iommu->max_pasids) + return ERR_PTR(-EOVERFLOW); + return iommu_mm; } + iommu_mm = kzalloc(sizeof(struct iommu_mm_data), GFP_KERNEL); + if (!iommu_mm) + return ERR_PTR(-ENOMEM); + pasid = iommu_alloc_global_pasid(dev); if (pasid == IOMMU_PASID_INVALID) { - ret = -ENOSPC; - goto out; + kfree(iommu_mm); + return ERR_PTR(-ENOSPC); } - mm->pasid = pasid; - ret = 0; -out: - mutex_unlock(&iommu_sva_lock); - return ret; + iommu_mm->pasid = pasid; + INIT_LIST_HEAD(&iommu_mm->sva_domains); + /* + * Make sure the write to mm->iommu_mm is not reordered in front of + * initialization to iommu_mm fields. If it does, readers may see a + * valid iommu_mm with uninitialized values. + */ + smp_store_release(&mm->iommu_mm, iommu_mm); + return iommu_mm; } /** @@ -58,31 +68,33 @@ static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev) */ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm) { + struct iommu_mm_data *iommu_mm; struct iommu_domain *domain; struct iommu_sva *handle; int ret; + mutex_lock(&iommu_sva_lock); + /* Allocate mm->pasid if necessary. */ - ret = iommu_sva_alloc_pasid(mm, dev); - if (ret) - return ERR_PTR(ret); + iommu_mm = iommu_alloc_mm_data(mm, dev); + if (IS_ERR(iommu_mm)) { + ret = PTR_ERR(iommu_mm); + goto out_unlock; + } handle = kzalloc(sizeof(*handle), GFP_KERNEL); - if (!handle) - return ERR_PTR(-ENOMEM); - - mutex_lock(&iommu_sva_lock); - /* Search for an existing domain. */ - domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid, - IOMMU_DOMAIN_SVA); - if (IS_ERR(domain)) { - ret = PTR_ERR(domain); + if (!handle) { + ret = -ENOMEM; goto out_unlock; } - if (domain) { - domain->users++; - goto out; + /* Search for an existing domain. */ + list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) { + ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid); + if (!ret) { + domain->users++; + goto out; + } } /* Allocate a new domain and set it on device pasid. */ @@ -92,23 +104,23 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm goto out_unlock; } - ret = iommu_attach_device_pasid(domain, dev, mm->pasid); + ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid); if (ret) goto out_free_domain; domain->users = 1; + list_add(&domain->next, &mm->iommu_mm->sva_domains); + out: mutex_unlock(&iommu_sva_lock); handle->dev = dev; handle->domain = domain; - return handle; out_free_domain: iommu_domain_free(domain); + kfree(handle); out_unlock: mutex_unlock(&iommu_sva_lock); - kfree(handle); - return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(iommu_sva_bind_device); @@ -124,12 +136,13 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device); void iommu_sva_unbind_device(struct iommu_sva *handle) { struct iommu_domain *domain = handle->domain; - ioasid_t pasid = domain->mm->pasid; + struct iommu_mm_data *iommu_mm = domain->mm->iommu_mm; struct device *dev = handle->dev; mutex_lock(&iommu_sva_lock); + iommu_detach_device_pasid(domain, dev, iommu_mm->pasid); if (--domain->users == 0) { - iommu_detach_device_pasid(domain, dev, pasid); + list_del(&domain->next); iommu_domain_free(domain); } mutex_unlock(&iommu_sva_lock); @@ -205,8 +218,11 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data) void mm_pasid_drop(struct mm_struct *mm) { - if (likely(!mm_valid_pasid(mm))) + struct iommu_mm_data *iommu_mm = mm->iommu_mm; + + if (!iommu_mm) return; - iommu_free_global_pasid(mm->pasid); + iommu_free_global_pasid(iommu_mm->pasid); + kfree(iommu_mm); } diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a829cf41614fe..539d9c7ba1e6f 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -121,6 +121,11 @@ struct iommu_domain { struct { /* IOMMU_DOMAIN_SVA */ struct mm_struct *mm; int users; + /* + * Next iommu_domain in mm->iommu_mm->sva-domains list + * protected by iommu_sva_lock. + */ + struct list_head next; }; }; }; @@ -1350,16 +1355,28 @@ static inline bool tegra_dev_iommu_get_stream_id(struct device *dev, u32 *stream #ifdef CONFIG_IOMMU_MM_DATA static inline void mm_pasid_init(struct mm_struct *mm) { - mm->pasid = IOMMU_PASID_INVALID; + /* + * During dup_mm(), a new mm will be memcpy'd from an old one and that makes + * the new mm and the old one point to a same iommu_mm instance. When either + * one of the two mms gets released, the iommu_mm instance is freed, leaving + * the other mm running into a use-after-free/double-free problem. To avoid + * the problem, zeroing the iommu_mm pointer of a new mm is needed here. + */ + mm->iommu_mm = NULL; } + static inline bool mm_valid_pasid(struct mm_struct *mm) { - return mm->pasid != IOMMU_PASID_INVALID; + return READ_ONCE(mm->iommu_mm); } static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm) { - return mm->pasid; + struct iommu_mm_data *iommu_mm = READ_ONCE(mm->iommu_mm); + + if (!iommu_mm) + return IOMMU_PASID_INVALID; + return iommu_mm->pasid; } void mm_pasid_drop(struct mm_struct *mm); From 549f0120baa70168a293255f523107662ba527a9 Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Fri, 27 Oct 2023 08:05:25 +0800 Subject: [PATCH 469/548] mm: Deprecate pasid field Drop the pasid field, as all the information needed for sva domain management has been moved to the newly added iommu_mm field. Reviewed-by: Lu Baolu Reviewed-by: Vasant Hegde Reviewed-by: Jason Gunthorpe Signed-off-by: Tina Zhang Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231027000525.1278806-7-tina.zhang@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 4d72f208d6ab0fa2f4144c5ac5d405c89c587955) Signed-off-by: Koba Ko --- include/linux/mm_types.h | 1 - mm/init-mm.c | 3 --- 2 files changed, 4 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 92352bddf8bd6..7c15db9241543 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -943,7 +943,6 @@ struct mm_struct { struct work_struct async_put_work; #ifdef CONFIG_IOMMU_MM_DATA - u32 pasid; struct iommu_mm_data *iommu_mm; #endif #ifdef CONFIG_KSM diff --git a/mm/init-mm.c b/mm/init-mm.c index c52dc2740a3de..24c8093792745 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -44,9 +44,6 @@ struct mm_struct init_mm = { #endif .user_ns = &init_user_ns, .cpu_bitmap = CPU_BITS_NONE, -#ifdef CONFIG_IOMMU_MM_DATA - .pasid = IOMMU_PASID_INVALID, -#endif INIT_MM_CONTEXT(init_mm) }; From 8bd2c61f85477dec025f761fcbf0714749084bfb Mon Sep 17 00:00:00 2001 From: Vasant Hegde Date: Mon, 5 Feb 2024 11:56:07 +0000 Subject: [PATCH 470/548] iommu: Introduce iommu_group_mutex_assert() Add function to check iommu group mutex lock. So that device drivers can rely on group mutex lock instead of adding another driver level lock before modifying driver specific device data structure. Suggested-by: Jason Gunthorpe Signed-off-by: Vasant Hegde Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20240205115615.6053-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel (cherry picked from commit bf8aff2945ba4091f503df673b9df33002546e6a) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 19 +++++++++++++++++++ include/linux/iommu.h | 8 ++++++++ 2 files changed, 27 insertions(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index ec96a00a67ef7..22b3a39ddbb57 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1231,6 +1231,25 @@ void iommu_group_remove_device(struct device *dev) } EXPORT_SYMBOL_GPL(iommu_group_remove_device); +#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU_API) +/** + * iommu_group_mutex_assert - Check device group mutex lock + * @dev: the device that has group param set + * + * This function is called by an iommu driver to check whether it holds + * group mutex lock for the given device or not. + * + * Note that this function must be called after device group param is set. + */ +void iommu_group_mutex_assert(struct device *dev) +{ + struct iommu_group *group = dev->iommu_group; + + lockdep_assert_held(&group->mutex); +} +EXPORT_SYMBOL_GPL(iommu_group_mutex_assert); +#endif + /** * iommu_group_for_each_dev - iterate over each device in the group * @group: the group diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 539d9c7ba1e6f..3ff5fd2f76abc 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1271,6 +1271,14 @@ static inline ioasid_t iommu_alloc_global_pasid(struct device *dev) static inline void iommu_free_global_pasid(ioasid_t pasid) {} #endif /* CONFIG_IOMMU_API */ +#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_IOMMU_API) +void iommu_group_mutex_assert(struct device *dev); +#else +static inline void iommu_group_mutex_assert(struct device *dev) +{ +} +#endif + /** * iommu_map_sgtable - Map the given buffer to the IOMMU domain * @domain: The IOMMU domain to perform the mapping From f49f13ef0d1922f12159f00f394262f3d4c3a27a Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 21 Feb 2024 20:27:02 -0400 Subject: [PATCH 471/548] iommu/arm-smmu-v3: Do not use GFP_KERNEL under as spinlock If the SMMU is configured to use a two level CD table then arm_smmu_write_ctx_desc() allocates a CD table leaf internally using GFP_KERNEL. Due to recent changes this is being done under a spinlock to iterate over the device list - thus it will trigger a sleeping while atomic warning: arm_smmu_sva_set_dev_pasid() mutex_lock(&sva_lock); __arm_smmu_sva_bind() arm_smmu_mmu_notifier_get() spin_lock_irqsave() arm_smmu_write_ctx_desc() arm_smmu_get_cd_ptr() arm_smmu_alloc_cd_leaf_table() dmam_alloc_coherent(GFP_KERNEL) This is a 64K high order allocation and really should not be done atomically. At the moment the rework of the SVA to follow the new API is half finished. Recently the CD table memory was moved from the domain to the master, however we have the confusing situation where the SVA code is wrongly using the RID domains device's list to track which CD tables the SVA is installed in. Remove the logic to replicate the CD across all the domain's masters during attach. We know which master and which CD table the PASID should be installed in. Right now SVA only works when dma-iommu.c is in control of the RID translation, which means we have a single iommu_domain shared across the entire group and that iommu_domain is not shared outside the group. Critically this means that the iommu_group->devices list and RID's smmu_domain->devices list describe the same set of masters. For PCI cases the core code also insists on singleton groups so there is only one entry in the smmu_domain->devices list that is equal to the master being passed in to arm_smmu_sva_set_dev_pasid(). Only non-PCI cases may have multi-device groups. However, the core code will repeat the calls to arm_smmu_sva_set_dev_pasid() across the entire iommu_group->devices list. Instead of having arm_smmu_mmu_notifier_get() indirectly loop over all the devices in the group via the RID's smmu_domain, rely on __arm_smmu_sva_bind() to be called for each device in the group and install the repeated CD entry that way. This avoids taking the spinlock to access the devices list and permits the arm_smmu_write_ctx_desc() to use a sleeping allocation. Leave the arm_smmu_mm_release() as a confusing situation, this requires tracking attached masters inside the SVA domain. Removing the loop allows arm_smmu_write_ctx_desc() to be called outside the spinlock and thus is safe to use GFP_KERNEL. Move the clearing of the CD into arm_smmu_sva_remove_dev_pasid() so that arm_smmu_mmu_notifier_get/put() remain paired functions. Fixes: 24503148c545 ("iommu/arm-smmu-v3: Refactor write_ctx_desc") Reported-by: Dan Carpenter Closes: https://lore.kernel.org/all/4e25d161-0cf8-4050-9aa3-dfa21cd63e56@moroto.mountain/ Signed-off-by: Jason Gunthorpe Reviewed-by: Michael Shavit Link: https://lore.kernel.org/r/0-v3-11978fc67151+112-smmu_cd_atomic_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit b5bf7778b722105d7a04b1d51e884497b542638b) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 38 ++++++------------- 1 file changed, 12 insertions(+), 26 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 540f524ecf018..2610e82c0ecd0 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -292,10 +292,8 @@ arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain, struct mm_struct *mm) { int ret; - unsigned long flags; struct arm_smmu_ctx_desc *cd; struct arm_smmu_mmu_notifier *smmu_mn; - struct arm_smmu_master *master; list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) { if (smmu_mn->mn.mm == mm) { @@ -325,28 +323,9 @@ arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain, goto err_free_cd; } - spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_for_each_entry(master, &smmu_domain->devices, domain_head) { - ret = arm_smmu_write_ctx_desc(master, mm_get_enqcmd_pasid(mm), - cd); - if (ret) { - list_for_each_entry_from_reverse( - master, &smmu_domain->devices, domain_head) - arm_smmu_write_ctx_desc( - master, mm_get_enqcmd_pasid(mm), NULL); - break; - } - } - spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); - if (ret) - goto err_put_notifier; - list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers); return smmu_mn; -err_put_notifier: - /* Frees smmu_mn */ - mmu_notifier_put(&smmu_mn->mn); err_free_cd: arm_smmu_free_shared_cd(cd); return ERR_PTR(ret); @@ -363,9 +342,6 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) list_del(&smmu_mn->list); - arm_smmu_update_ctx_desc_devices(smmu_domain, mm_get_enqcmd_pasid(mm), - NULL); - /* * If we went through clear(), we've already invalidated, and no * new TLB entry can have been formed. @@ -381,7 +357,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) arm_smmu_free_shared_cd(cd); } -static int __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) +static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid, + struct mm_struct *mm) { int ret; struct arm_smmu_bond *bond; @@ -410,9 +387,15 @@ static int __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm) goto err_free_bond; } + ret = arm_smmu_write_ctx_desc(master, pasid, bond->smmu_mn->cd); + if (ret) + goto err_put_notifier; + list_add(&bond->list, &master->bonds); return 0; +err_put_notifier: + arm_smmu_mmu_notifier_put(bond->smmu_mn); err_free_bond: kfree(bond); return ret; @@ -574,6 +557,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, struct arm_smmu_master *master = dev_iommu_priv_get(dev); mutex_lock(&sva_lock); + + arm_smmu_write_ctx_desc(master, id, NULL); + list_for_each_entry(t, &master->bonds, list) { if (t->mm == mm) { bond = t; @@ -596,7 +582,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain, struct mm_struct *mm = domain->mm; mutex_lock(&sva_lock); - ret = __arm_smmu_sva_bind(dev, mm); + ret = __arm_smmu_sva_bind(dev, id, mm); mutex_unlock(&sva_lock); return ret; From f778f815491827cbf7491cdf615271b0b712e65f Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:03:57 +0000 Subject: [PATCH 472/548] iommu: Factor out some helpers The pattern for picking the first device out of the group list is repeated a few times now, so it's clearly worth factoring out, which also helps hide the iommu_group_dev detail from places that don't need to know. Similarly, the safety check for dev_iommu_ops() at certain public interfaces starts looking a bit repetitive, and might not be completely obvious at first glance, so let's factor that out for clarity as well, in preparation for more uses of both. Reviewed-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/566cbd161546caa6aed49662c9b3e8f09dc9c3cf.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit 48ed12788ed8aa099e463c2febcd3f06fb71c68c) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 56 ++++++++++++++++++++----------------------- 1 file changed, 26 insertions(+), 30 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 22b3a39ddbb57..46666573984cf 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -365,6 +365,15 @@ static void dev_iommu_free(struct device *dev) kfree(param); } +/* + * Internal equivalent of device_iommu_mapped() for when we care that a device + * actually has API ops, and don't want false positives from VFIO-only groups. + */ +static bool dev_has_iommu(struct device *dev) +{ + return dev->iommu && dev->iommu->iommu_dev; +} + static u32 dev_iommu_get_max_pasids(struct device *dev) { u32 max_pasids = 0, bits = 0; @@ -626,7 +635,7 @@ static void __iommu_group_remove_device(struct device *dev) list_del(&device->list); __iommu_group_free_device(group, device); - if (dev->iommu && dev->iommu->iommu_dev) + if (dev_has_iommu(dev)) iommu_deinit_device(dev); else dev->iommu_group = NULL; @@ -825,7 +834,7 @@ int iommu_get_group_resv_regions(struct iommu_group *group, * Non-API groups still expose reserved_regions in sysfs, * so filter out calls that get here that way. */ - if (!device->dev->iommu) + if (!dev_has_iommu(device->dev)) break; INIT_LIST_HEAD(&dev_resv_regions); @@ -1250,6 +1259,12 @@ void iommu_group_mutex_assert(struct device *dev) EXPORT_SYMBOL_GPL(iommu_group_mutex_assert); #endif +static struct device *iommu_group_first_dev(struct iommu_group *group) +{ + lockdep_assert_held(&group->mutex); + return list_first_entry(&group->devices, struct group_device, list)->dev; +} + /** * iommu_group_for_each_dev - iterate over each device in the group * @group: the group @@ -1756,23 +1771,6 @@ __iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) return __iommu_group_domain_alloc(group, req_type); } -/* - * Returns the iommu_ops for the devices in an iommu group. - * - * It is assumed that all devices in an iommu group are managed by a single - * IOMMU unit. Therefore, this returns the dev_iommu_ops of the first device - * in the group. - */ -static const struct iommu_ops *group_iommu_ops(struct iommu_group *group) -{ - struct group_device *device = - list_first_entry(&group->devices, struct group_device, list); - - lockdep_assert_held(&group->mutex); - - return dev_iommu_ops(device->dev); -} - /* * req_type of 0 means "auto" which means to select a domain based on * iommu_def_domain_type or what the driver actually supports. @@ -1780,7 +1778,7 @@ static const struct iommu_ops *group_iommu_ops(struct iommu_group *group) static struct iommu_domain * iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) { - const struct iommu_ops *ops = group_iommu_ops(group); + const struct iommu_ops *ops = dev_iommu_ops(iommu_group_first_dev(group)); struct iommu_domain *dom; lockdep_assert_held(&group->mutex); @@ -1860,7 +1858,7 @@ static int iommu_bus_notifier(struct notifier_block *nb, static int iommu_get_def_domain_type(struct iommu_group *group, struct device *dev, int cur_type) { - const struct iommu_ops *ops = group_iommu_ops(group); + const struct iommu_ops *ops = dev_iommu_ops(dev); int type; if (ops->default_domain) { @@ -2035,7 +2033,7 @@ bool device_iommu_capable(struct device *dev, enum iommu_cap cap) { const struct iommu_ops *ops; - if (!dev->iommu || !dev->iommu->iommu_dev) + if (!dev_has_iommu(dev)) return false; ops = dev_iommu_ops(dev); @@ -2146,11 +2144,9 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, static struct iommu_domain * __iommu_group_domain_alloc(struct iommu_group *group, unsigned int type) { - struct device *dev = - list_first_entry(&group->devices, struct group_device, list) - ->dev; + struct device *dev = iommu_group_first_dev(group); - return __iommu_domain_alloc(group_iommu_ops(group), dev, type); + return __iommu_domain_alloc(dev_iommu_ops(dev), dev, type); } struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus) @@ -3067,8 +3063,8 @@ EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids); */ int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat) { - if (dev->iommu && dev->iommu->iommu_dev) { - const struct iommu_ops *ops = dev->iommu->iommu_dev->ops; + if (dev_has_iommu(dev)) { + const struct iommu_ops *ops = dev_iommu_ops(dev); if (ops->dev_enable_feat) return ops->dev_enable_feat(dev, feat); @@ -3083,8 +3079,8 @@ EXPORT_SYMBOL_GPL(iommu_dev_enable_feature); */ int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat) { - if (dev->iommu && dev->iommu->iommu_dev) { - const struct iommu_ops *ops = dev->iommu->iommu_dev->ops; + if (dev_has_iommu(dev)) { + const struct iommu_ops *ops = dev_iommu_ops(dev); if (ops->dev_disable_feat) return ops->dev_disable_feat(dev, feat); From 17b7b83c89ef2453569032c22207af4e9f95360b Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:03:58 +0000 Subject: [PATCH 473/548] iommu: Decouple iommu_present() from bus ops Much as I'd like to remove iommu_present(), the final remaining users are proving stubbornly difficult to clean up, so kick that can down the road and just rework it to preserve the current behaviour without depending on bus ops. Since commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration"), any registered IOMMU instance is already considered "present" for every entry in iommu_buses, so it's simply a case of validating the bus and checking we have at least once IOMMU. Reviewed-by: Jason Gunthorpe Reviewed-by: Lu Baolu Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/caa93680bb9d35a8facbcd8ff46267ca67335229.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit 1d8d43bb984b9cfa7ff63dbd0e2f6e5775ff1fdb) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 46666573984cf..f37a0eabd2534 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2015,9 +2015,28 @@ int bus_iommu_probe(const struct bus_type *bus) return 0; } +/** + * iommu_present() - make platform-specific assumptions about an IOMMU + * @bus: bus to check + * + * Do not use this function. You want device_iommu_mapped() instead. + * + * Return: true if some IOMMU is present and aware of devices on the given bus; + * in general it may not be the only IOMMU, and it may not have anything to do + * with whatever device you are ultimately interested in. + */ bool iommu_present(const struct bus_type *bus) { - return bus->iommu_ops != NULL; + bool ret = false; + + for (int i = 0; i < ARRAY_SIZE(iommu_buses); i++) { + if (iommu_buses[i] == bus) { + spin_lock(&iommu_device_lock); + ret = !list_empty(&iommu_device_list); + spin_unlock(&iommu_device_lock); + } + } + return ret; } EXPORT_SYMBOL_GPL(iommu_present); From 9e73f5948fccf94f57412ac3822f5daae25c162a Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:03:59 +0000 Subject: [PATCH 474/548] iommu: Validate that devices match domains Before we can allow drivers to coexist, we need to make sure that one driver's domain ops can't misinterpret another driver's dev_iommu_priv data. To that end, add a token to the domain so we can remember how it was allocated - for now this may as well be the device ops, since they still correlate 1:1 with drivers. We can trust ourselves for internal default domain attachment, so add checks to cover all the public attach interfaces. Reviewed-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/097c6f30480e4efe12195d00ba0e84ea4837fb4c.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit a9c362db39207c4934c9125e56ed730c5297c37c) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 10 ++++++++++ drivers/iommu/iommufd/hw_pagetable.c | 2 ++ include/linux/iommu.h | 2 +- 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index f37a0eabd2534..bdc93beee03ff 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2138,6 +2138,7 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops, return ERR_PTR(-ENOMEM); domain->type = type; + domain->owner = ops; /* * If not already set, assume all sizes by default; the driver * may override this later @@ -2321,10 +2322,16 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev) static int __iommu_attach_group(struct iommu_domain *domain, struct iommu_group *group) { + struct device *dev; + if (group->domain && group->domain != group->default_domain && group->domain != group->blocking_domain) return -EBUSY; + dev = iommu_group_first_dev(group); + if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner) + return -EINVAL; + return __iommu_group_set_domain(group, domain); } @@ -3597,6 +3604,9 @@ int iommu_attach_device_pasid(struct iommu_domain *domain, if (!group) return -ENODEV; + if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner) + return -EINVAL; + mutex_lock(&group->mutex); curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL); if (curr) { diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index cbb5df0a6c32f..29612106a962a 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -135,6 +135,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, hwpt->domain = NULL; goto out_abort; } + hwpt->domain->owner = ops; } else { hwpt->domain = iommu_domain_alloc(idev->dev->bus); if (!hwpt->domain) { @@ -233,6 +234,7 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, hwpt->domain = NULL; goto out_abort; } + hwpt->domain->owner = ops; if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) { rc = -EINVAL; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 3ff5fd2f76abc..539eb91b4e38b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -106,7 +106,7 @@ struct iommu_domain { unsigned type; const struct iommu_domain_ops *ops; const struct iommu_dirty_ops *dirty_ops; - + const struct iommu_ops *owner; /* Whose domain_alloc we came from */ unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */ struct iommu_domain_geometry geometry; struct iommu_dma_cookie *iova_cookie; From f50840e61212604bbd24cdec2c513490973acbf6 Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:04:00 +0000 Subject: [PATCH 475/548] iommu: Decouple iommu_domain_alloc() from bus ops As the final remaining piece of bus-dependent API, iommu_domain_alloc() can now take responsibility for the "one iommu_ops per bus" rule for itself. It turns out we can't safely make the internal allocation call any more group-based or device-based yet - that will have to wait until the external callers can pass the right thing - but we can at least get as far as deriving "bus ops" based on which driver is actually managing devices on the given bus, rather than whichever driver won the race to register first. This will then leave us able to convert the last of the core internals over to the IOMMU-instance model, allow multiple drivers to register and actually coexist (modulo the above limitation for unmanaged domain users in the short term), and start trying to solve the long-standing iommu_probe_device() mess. Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Reviewed-by: Lu Baolu Link: https://lore.kernel.org/r/6c7313009aae0e39ae2855920990ebf85af4662f.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit b4c0497169d54e08346678dbc79ded7809dae5b7) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 28 +++++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index bdc93beee03ff..1bd47b31517c1 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2169,17 +2169,31 @@ __iommu_group_domain_alloc(struct iommu_group *group, unsigned int type) return __iommu_domain_alloc(dev_iommu_ops(dev), dev, type); } +static int __iommu_domain_alloc_dev(struct device *dev, void *data) +{ + const struct iommu_ops **ops = data; + + if (!dev_has_iommu(dev)) + return 0; + + if (WARN_ONCE(*ops && *ops != dev_iommu_ops(dev), + "Multiple IOMMU drivers present for bus %s, which the public IOMMU API can't fully support yet. You will still need to disable one or more for this to work, sorry!\n", + dev_bus_name(dev))) + return -EBUSY; + + *ops = dev_iommu_ops(dev); + return 0; +} + struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus) { - struct iommu_domain *domain; + const struct iommu_ops *ops = NULL; + int err = bus_for_each_dev(bus, NULL, &ops, __iommu_domain_alloc_dev); - if (bus == NULL || bus->iommu_ops == NULL) + if (err || !ops) return NULL; - domain = __iommu_domain_alloc(bus->iommu_ops, NULL, - IOMMU_DOMAIN_UNMANAGED); - if (IS_ERR(domain)) - return NULL; - return domain; + + return __iommu_domain_alloc(ops, NULL, IOMMU_DOMAIN_UNMANAGED); } EXPORT_SYMBOL_GPL(iommu_domain_alloc); From 5704024294ce86c0e040391b15e7824c20b0e424 Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:04:01 +0000 Subject: [PATCH 476/548] iommu/arm-smmu: Don't register fwnode for legacy binding When using the legacy binding we bypass the of_xlate mechanism, so avoid registering the instance fwnodes which act as keys for that. This will help __iommu_probe_device() to retrieve the registered ops the same way as for x86 etc. when no fwspec has previously been set up by of_xlate. Acked-by: Will Deacon Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/18b0f812a42a74dd6924aea24e68ab409d6e1b52.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit 01bf81af85458c0427a70ae3bf90b375d038c95b) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 42c5012ba8aac..d35eeebedc73a 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -2172,7 +2172,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev) return err; } - err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, dev); + err = iommu_device_register(&smmu->iommu, &arm_smmu_ops, + using_legacy_binding ? NULL : dev); if (err) { dev_err(dev, "Failed to register iommu\n"); iommu_device_sysfs_remove(&smmu->iommu); From c1b88563b729b0809b80a14e37a76eda9dafe85b Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:04:02 +0000 Subject: [PATCH 477/548] iommu: Retire bus ops With the rest of the API internals converted, it's time to finally tackle probe_device and how we bootstrap the per-device ops association to begin with. This ends up being disappointingly straightforward, since fwspec users are already doing it in order to find their of_xlate callback, and it works out that we can easily do the equivalent for other drivers too. Then shuffle the remaining awareness of iommu_ops into the couple of core headers that still need it, and breathe a sigh of relief. Ding dong the bus ops are gone! CC: Rafael J. Wysocki Acked-by: Christoph Hellwig Acked-by: Greg Kroah-Hartman Reviewed-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/a59011ef65b4b6657cb0b7a388d786b779b61305.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit 17de3f5fdd35676b0e3d41c7c9bf4e3032eb3673) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 31 ++++++++++++++++++------------- include/acpi/acpi_bus.h | 2 ++ include/linux/device.h | 1 - include/linux/device/bus.h | 5 ----- include/linux/dma-map-ops.h | 1 + 5 files changed, 21 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 1bd47b31517c1..b5aea21fce471 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -149,7 +149,7 @@ struct iommu_group_attribute iommu_group_attr_##_name = \ static LIST_HEAD(iommu_device_list); static DEFINE_SPINLOCK(iommu_device_lock); -static struct bus_type * const iommu_buses[] = { +static const struct bus_type * const iommu_buses[] = { &platform_bus_type, #ifdef CONFIG_PCI &pci_bus_type, @@ -258,13 +258,6 @@ int iommu_device_register(struct iommu_device *iommu, /* We need to be able to take module references appropriately */ if (WARN_ON(is_module_address((unsigned long)ops) && !ops->owner)) return -EINVAL; - /* - * Temporarily enforce global restriction to a single driver. This was - * already the de-facto behaviour, since any possible combination of - * existing drivers would compete for at least the PCI or platform bus. - */ - if (iommu_buses[0]->iommu_ops && iommu_buses[0]->iommu_ops != ops) - return -EBUSY; iommu->ops = ops; if (hwdev) @@ -274,10 +267,8 @@ int iommu_device_register(struct iommu_device *iommu, list_add_tail(&iommu->list, &iommu_device_list); spin_unlock(&iommu_device_lock); - for (int i = 0; i < ARRAY_SIZE(iommu_buses) && !err; i++) { - iommu_buses[i]->iommu_ops = ops; + for (int i = 0; i < ARRAY_SIZE(iommu_buses) && !err; i++) err = bus_iommu_probe(iommu_buses[i]); - } if (err) iommu_device_unregister(iommu); return err; @@ -326,7 +317,6 @@ int iommu_device_register_bus(struct iommu_device *iommu, list_add_tail(&iommu->list, &iommu_device_list); spin_unlock(&iommu_device_lock); - bus->iommu_ops = ops; err = bus_iommu_probe(bus); if (err) { iommu_device_unregister_bus(iommu, bus, nb); @@ -494,11 +484,26 @@ DEFINE_MUTEX(iommu_probe_device_lock); static int __iommu_probe_device(struct device *dev, struct list_head *group_list) { - const struct iommu_ops *ops = dev->bus->iommu_ops; + const struct iommu_ops *ops; + struct iommu_fwspec *fwspec; struct iommu_group *group; struct group_device *gdev; int ret; + /* + * For FDT-based systems and ACPI IORT/VIOT, drivers register IOMMU + * instances with non-NULL fwnodes, and client devices should have been + * identified with a fwspec by this point. Otherwise, we can currently + * assume that only one of Intel, AMD, s390, PAMU or legacy SMMUv2 can + * be present, and that any of their registered instances has suitable + * ops for probing, and thus cheekily co-opt the same mechanism. + */ + fwspec = dev_iommu_fwspec_get(dev); + if (fwspec && fwspec->ops) + ops = fwspec->ops; + else + ops = iommu_ops_from_fwnode(NULL); + if (!ops) return -ENODEV; /* diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index d9c20ae23b632..a117d8274667a 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -624,6 +624,8 @@ struct acpi_pci_root { /* helper */ +struct iommu_ops; + bool acpi_dma_supported(const struct acpi_device *adev); enum dev_dma_attr acpi_get_dma_attr(struct acpi_device *adev); int acpi_iommu_fwspec_init(struct device *dev, u32 id, diff --git a/include/linux/device.h b/include/linux/device.h index 3627b26b243e6..72bae73b1eddb 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -42,7 +42,6 @@ struct class; struct subsys_private; struct device_node; struct fwnode_handle; -struct iommu_ops; struct iommu_group; struct dev_pin_info; struct dev_iommu; diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h index ae10c43227543..e25aab08f873d 100644 --- a/include/linux/device/bus.h +++ b/include/linux/device/bus.h @@ -62,9 +62,6 @@ struct fwnode_handle; * this bus. * @pm: Power management operations of this bus, callback the specific * device driver's pm-ops. - * @iommu_ops: IOMMU specific operations for this bus, used to attach IOMMU - * driver implementations to a bus and allow the driver to do - * bus-specific setup * @need_parent_lock: When probing or removing a device on this bus, the * device core should lock the device's parent. * @@ -104,8 +101,6 @@ struct bus_type { const struct dev_pm_ops *pm; - const struct iommu_ops *iommu_ops; - bool need_parent_lock; }; diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h index f2fc203fb8a1a..a52e508d1869f 100644 --- a/include/linux/dma-map-ops.h +++ b/include/linux/dma-map-ops.h @@ -11,6 +11,7 @@ #include struct cma; +struct iommu_ops; /* * Values for struct dma_map_ops.flags: From 4d9ad85c5a9feaa8113fdb80915f8371e90cfc09 Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Tue, 21 Nov 2023 18:04:03 +0000 Subject: [PATCH 478/548] iommu: Clean up open-coded ownership checks Some drivers already implement their own defence against the possibility of being given someone else's device. Since this is now taken care of by the core code (and via a slightly different path from the original fwspec-based idea), let's clean them up. Acked-by: Will Deacon Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Signed-off-by: Robin Murphy Link: https://lore.kernel.org/r/58a9879ce3f03562bb061e6714fe6efb554c3907.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (cherry picked from commit e7080665c977ea1aafb8547a9c7bd08b199311d6) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 --- drivers/iommu/arm/arm-smmu/arm-smmu.c | 20 +------------------- drivers/iommu/arm/arm-smmu/qcom_iommu.c | 16 +++------------- drivers/iommu/mtk_iommu.c | 7 +------ drivers/iommu/mtk_iommu_v1.c | 3 --- drivers/iommu/sprd-iommu.c | 8 +------- drivers/iommu/virtio-iommu.c | 3 --- 7 files changed, 6 insertions(+), 54 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9050323bf1694..8f6952fad3efc 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2892,9 +2892,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) struct arm_smmu_master *master; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - if (!fwspec || fwspec->ops != &arm_smmu_ops) - return ERR_PTR(-ENODEV); - if (WARN_ON_ONCE(dev_iommu_priv_get(dev))) return ERR_PTR(-EBUSY); diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index d35eeebedc73a..4d09c00478927 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1116,11 +1116,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) struct arm_smmu_device *smmu; int ret; - if (!fwspec || fwspec->ops != &arm_smmu_ops) { - dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n"); - return -ENXIO; - } - /* * FIXME: The arch/arm DMA API code tries to attach devices to its own * domains between of_xlate() and probe_device() - we have no way to cope @@ -1357,21 +1352,8 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) fwspec = dev_iommu_fwspec_get(dev); if (ret) goto out_free; - } else if (fwspec && fwspec->ops == &arm_smmu_ops) { - smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); - - /* - * Defer probe if the relevant SMMU instance hasn't finished - * probing yet. This is a fragile hack and we'd ideally - * avoid this race in the core code. Until that's ironed - * out, however, this is the most pragmatic option on the - * table. - */ - if (!smmu) - return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER, - "smmu dev has not bound yet\n")); } else { - return ERR_PTR(-ENODEV); + smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); } ret = -EINVAL; diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c index 97b2122032b23..33f3c870086ce 100644 --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c @@ -79,16 +79,6 @@ static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom) static const struct iommu_ops qcom_iommu_ops; -static struct qcom_iommu_dev * to_iommu(struct device *dev) -{ - struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - - if (!fwspec || fwspec->ops != &qcom_iommu_ops) - return NULL; - - return dev_iommu_priv_get(dev); -} - static struct qcom_iommu_ctx * to_ctx(struct qcom_iommu_domain *d, unsigned asid) { struct qcom_iommu_dev *qcom_iommu = d->iommu; @@ -372,7 +362,7 @@ static void qcom_iommu_domain_free(struct iommu_domain *domain) static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev) { - struct qcom_iommu_dev *qcom_iommu = to_iommu(dev); + struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev); struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain); int ret; @@ -404,7 +394,7 @@ static int qcom_iommu_identity_attach(struct iommu_domain *identity_domain, struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct qcom_iommu_domain *qcom_domain; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - struct qcom_iommu_dev *qcom_iommu = to_iommu(dev); + struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev); unsigned int i; if (domain == identity_domain || !domain) @@ -535,7 +525,7 @@ static bool qcom_iommu_capable(struct device *dev, enum iommu_cap cap) static struct iommu_device *qcom_iommu_probe_device(struct device *dev) { - struct qcom_iommu_dev *qcom_iommu = to_iommu(dev); + struct qcom_iommu_dev *qcom_iommu = dev_iommu_priv_get(dev); struct device_link *link; if (!qcom_iommu) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index f7e2b9cd39a2a..d773bd26097d8 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -862,16 +862,11 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain, static struct iommu_device *mtk_iommu_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - struct mtk_iommu_data *data; + struct mtk_iommu_data *data = dev_iommu_priv_get(dev); struct device_link *link; struct device *larbdev; unsigned int larbid, larbidx, i; - if (!fwspec || fwspec->ops != &mtk_iommu_ops) - return ERR_PTR(-ENODEV); /* Not a iommu client device */ - - data = dev_iommu_priv_get(dev); - if (!MTK_IOMMU_IS_TYPE(data->plat_data, MTK_IOMMU_TYPE_MM)) return &data->iommu; diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index 574c6e38cfb3d..32cc8341d3726 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -481,9 +481,6 @@ static struct iommu_device *mtk_iommu_v1_probe_device(struct device *dev) idx++; } - if (!fwspec || fwspec->ops != &mtk_iommu_v1_ops) - return ERR_PTR(-ENODEV); /* Not a iommu client device */ - data = dev_iommu_priv_get(dev); /* Link the consumer device with the smi-larb device(supplier) */ diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c index 1a4bc7287de22..525ff3610d390 100644 --- a/drivers/iommu/sprd-iommu.c +++ b/drivers/iommu/sprd-iommu.c @@ -385,13 +385,7 @@ static phys_addr_t sprd_iommu_iova_to_phys(struct iommu_domain *domain, static struct iommu_device *sprd_iommu_probe_device(struct device *dev) { - struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - struct sprd_iommu_device *sdev; - - if (!fwspec || fwspec->ops != &sprd_iommu_ops) - return ERR_PTR(-ENODEV); - - sdev = dev_iommu_priv_get(dev); + struct sprd_iommu_device *sdev = dev_iommu_priv_get(dev); return &sdev->iommu; } diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c index 17dcd826f5c20..bb2e795a80d0f 100644 --- a/drivers/iommu/virtio-iommu.c +++ b/drivers/iommu/virtio-iommu.c @@ -969,9 +969,6 @@ static struct iommu_device *viommu_probe_device(struct device *dev) struct viommu_dev *viommu = NULL; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); - if (!fwspec || fwspec->ops != &viommu_ops) - return ERR_PTR(-ENODEV); - viommu = viommu_get_by_fwnode(fwspec->iommu_fwnode); if (!viommu) return ERR_PTR(-ENODEV); From d34bb04897fabe855aec3e0b010a9fbdea47ca6e Mon Sep 17 00:00:00 2001 From: Guixin Liu Date: Thu, 25 Sep 2025 13:47:30 +0800 Subject: [PATCH 479/548] iommufd: Register iommufd mock devices with fwspec Since the bus ops were retired the iommu subsystem changed to using fwspec to match the iommu driver to the iommu device. If a device has a NULL fwspec then it is matched to the first iommu driver with a NULL fwspec, effectively disabling support for systems with more than one non-fwspec iommu driver. Thus, if the iommufd selfest are run in an x86 system that registers a non-fwspec iommu driver they fail to bind their mock devices to the mock iommu driver. Fix this by allocating a software fwnode for mock iommu driver's iommu_device, and set it to the device which mock iommu driver created. This is done by adding a new helper iommu_mock_device_add() which abuses the internals of the fwspec system to establish a fwspec before the device is added and is careful not to leak it. A matching dummy fwspec is automatically added to the mock iommu driver. Test by "make -C toosl/testing/selftests TARGETS=iommu run_tests": PASSED: 229 / 229 tests passed. In addition, this issue is also can be found on amd platform, and also tested on a amd machine. Link: https://patch.msgid.link/r/20250925054730.3877-1-kanie@linux.alibaba.com Fixes: 17de3f5fdd35 ("iommu: Retire bus ops") Signed-off-by: Guixin Liu Reviewed-by: Lu Baolu Tested-by: Qinyun Tan Signed-off-by: Jason Gunthorpe (cherry picked from commit 2a918911ed3d0841923525ed0fe707762ee78844) Signed-off-by: Koba Ko --- drivers/iommu/iommu-priv.h | 2 ++ drivers/iommu/iommu.c | 26 ++++++++++++++++++++++++++ drivers/iommu/iommufd/selftest.c | 2 +- 3 files changed, 29 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h index 2024a23133486..e6ef72415505c 100644 --- a/drivers/iommu/iommu-priv.h +++ b/drivers/iommu/iommu-priv.h @@ -27,4 +27,6 @@ void iommu_device_unregister_bus(struct iommu_device *iommu, struct bus_type *bus, struct notifier_block *nb); +int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu); + #endif /* __LINUX_IOMMU_PRIV_H */ diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index b5aea21fce471..f7547ea8df6e5 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -292,6 +292,7 @@ void iommu_device_unregister_bus(struct iommu_device *iommu, struct notifier_block *nb) { bus_unregister_notifier(bus, nb); + fwnode_remove_software_node(iommu->fwnode); iommu_device_unregister(iommu); } EXPORT_SYMBOL_GPL(iommu_device_unregister_bus); @@ -313,6 +314,12 @@ int iommu_device_register_bus(struct iommu_device *iommu, if (err) return err; + iommu->fwnode = fwnode_create_software_node(NULL, NULL); + if (IS_ERR(iommu->fwnode)) { + bus_unregister_notifier(bus, nb); + return PTR_ERR(iommu->fwnode); + } + spin_lock(&iommu_device_lock); list_add_tail(&iommu->list, &iommu_device_list); spin_unlock(&iommu_device_lock); @@ -322,9 +329,28 @@ int iommu_device_register_bus(struct iommu_device *iommu, iommu_device_unregister_bus(iommu, bus, nb); return err; } + WRITE_ONCE(iommu->ready, true); return 0; } EXPORT_SYMBOL_GPL(iommu_device_register_bus); + +int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu) +{ + int rc; + + mutex_lock(&iommu_probe_device_lock); + rc = iommu_fwspec_init(dev, iommu->fwnode); + mutex_unlock(&iommu_probe_device_lock); + + if (rc) + return rc; + + rc = device_add(dev); + if (rc) + iommu_fwspec_free(dev); + return rc; +} +EXPORT_SYMBOL_GPL(iommu_mock_device_add); #endif static struct dev_iommu *dev_iommu_get(struct device *dev) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index e8f3df75bfbaa..0e06dd3d65892 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -619,7 +619,7 @@ static struct mock_dev *mock_dev_create(unsigned long dev_flags) if (rc) goto err_put; - rc = device_add(&mdev->dev); + rc = iommu_mock_device_add(&mdev->dev, &mock_iommu.iommu_dev); if (rc) goto err_put; return mdev; From ab804df074d5a4da6875b323f91324be2322790a Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 3 Jan 2024 11:27:22 -0400 Subject: [PATCH 480/548] iommufd/selftest: Check the bus type during probe This relied on the probe function only being invoked by the bus type mock was registered on. The removal of the bus ops broke this assumption and the probe could be called on non-mock bus types like PCI. Check the bus type directly in probe. Fixes: 17de3f5fdd35 ("iommu: Retire bus ops") Link: https://lore.kernel.org/r/0-v1-82d59f7eab8c+40c-iommufd_mock_bus_jgg@nvidia.com Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 47f2bd2ff382e5fe766b1322e354558a8da4a470) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/selftest.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 0e06dd3d65892..62aab1178eff3 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -25,6 +25,19 @@ static struct iommu_domain_ops domain_nested_ops; size_t iommufd_test_memory_limit = 65536; +struct mock_bus_type { + struct bus_type bus; + struct notifier_block nb; +}; + +static struct mock_bus_type iommufd_mock_bus_type = { + .bus = { + .name = "iommufd_mock", + }, +}; + +static atomic_t mock_dev_num; + enum { MOCK_DIRTY_TRACK = 1, MOCK_IO_PAGE_SIZE = PAGE_SIZE / 2, @@ -487,6 +500,8 @@ static struct iommu_device mock_iommu_device = { static struct iommu_device *mock_probe_device(struct device *dev) { + if (dev->bus != &iommufd_mock_bus_type.bus) + return ERR_PTR(-ENODEV); return &mock_iommu_device; } @@ -576,19 +591,6 @@ get_md_pagetable_nested(struct iommufd_ucmd *ucmd, u32 mockpt_id, return hwpt; } -struct mock_bus_type { - struct bus_type bus; - struct notifier_block nb; -}; - -static struct mock_bus_type iommufd_mock_bus_type = { - .bus = { - .name = "iommufd_mock", - }, -}; - -static atomic_t mock_dev_num; - static void mock_dev_release(struct device *dev) { struct mock_dev *mdev = container_of(dev, struct mock_dev, dev); From 172225b92e75d2de9b075992d497ed56863a5968 Mon Sep 17 00:00:00 2001 From: Robin Murphy Date: Mon, 28 Oct 2024 17:58:37 +0000 Subject: [PATCH 481/548] iommu/omap: Add minimal fwnode support The OMAP driver uses the generic "iommus" DT binding but is the final holdout not implementing a corresponding .of_xlate method. Unfortunately this now results in __iommu_probe_device() failing to find ops due to client devices missing the expected IOMMU fwnode association. The legacy DT parsing in omap_iommu_probe_device() could probably all be delegated to generic code now, but for the sake of an immediate fix, just add a minimal .of_xlate implementation to allow client fwspecs to be created appropriately, and so the ops lookup to work again. This means we also need to register the additional instances on DRA7 so that of_iommu_xlate() doesn't defer indefinitely waiting for their ops either, but we'll continue to hide them from sysfs just in case. This also renders the bus_iommu_probe() call entirely redundant. Reported-by: Beleswar Padhi Link: https://lore.kernel.org/linux-iommu/0dbde87b-593f-4b14-8929-b78e189549ad@ti.com/ Reported-by: H. Nikolaus Schaller Link: https://lore.kernel.org/linux-media/A7C284A9-33A5-4E21-9B57-9C4C213CC13F@goldelico.com/ Fixes: 17de3f5fdd35 ("iommu: Retire bus ops") Signed-off-by: Robin Murphy Tested-by: H. Nikolaus Schaller Reviewed-by: Kevin Hilman Tested-by: Kevin Hilman Tested-by: Beleswar Padhi Link: https://lore.kernel.org/r/cfd766f96bc799e32b97f4664707adbcf99097b0.1730136799.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel (backported from commit c9b4a3185fcb2bca0ab8cd098a4df85b2951c44b) Signed-off-by: Koba Ko --- drivers/iommu/omap-iommu.c | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index fcf99bd195b32..4efa5a3488629 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -1234,24 +1234,25 @@ static int omap_iommu_probe(struct platform_device *pdev) if (err) goto out_group; - err = iommu_device_register(&obj->iommu, &omap_iommu_ops, &pdev->dev); - if (err) - goto out_sysfs; + obj->has_iommu_driver = true; } + err = iommu_device_register(&obj->iommu, &omap_iommu_ops, &pdev->dev); + if (err) + goto out_sysfs; + pm_runtime_enable(obj->dev); omap_iommu_debugfs_add(obj); dev_info(&pdev->dev, "%s registered\n", obj->name); - /* Re-probe bus to probe device attached to this IOMMU */ - bus_iommu_probe(&platform_bus_type); - return 0; out_sysfs: iommu_device_sysfs_remove(&obj->iommu); + if (obj->has_iommu_driver) + iommu_device_sysfs_remove(&obj->iommu); out_group: iommu_group_put(obj->group); return err; @@ -1261,13 +1262,10 @@ static void omap_iommu_remove(struct platform_device *pdev) { struct omap_iommu *obj = platform_get_drvdata(pdev); - if (obj->group) { - iommu_group_put(obj->group); - obj->group = NULL; - + if (obj->has_iommu_driver) iommu_device_sysfs_remove(&obj->iommu); - iommu_device_unregister(&obj->iommu); - } + + iommu_device_unregister(&obj->iommu); omap_iommu_debugfs_remove(obj); @@ -1743,12 +1741,19 @@ static struct iommu_group *omap_iommu_device_group(struct device *dev) return group; } +static int omap_iommu_of_xlate(struct device *dev, const struct of_phandle_args *args) +{ + /* TODO: collect args->np to save re-parsing in probe above */ + return 0; +} + static const struct iommu_ops omap_iommu_ops = { .identity_domain = &omap_iommu_identity_domain, .domain_alloc_paging = omap_iommu_domain_alloc_paging, .probe_device = omap_iommu_probe_device, .release_device = omap_iommu_release_device, .device_group = omap_iommu_device_group, + .of_xlate = omap_iommu_of_xlate, .pgsize_bitmap = OMAP_IOMMU_PGSIZES, .default_domain_ops = &(const struct iommu_domain_ops) { .attach_dev = omap_iommu_attach_dev, From fe03d9daa9ac5ca2175e923102f66cd609d23dc9 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Thu, 29 Aug 2024 10:19:59 -0300 Subject: [PATCH 482/548] iommufd: Check the domain owner of the parent before creating a nesting domain This check was missed, before we can pass a struct iommu_domain to a driver callback we need to validate that the domain was created by that driver. Fixes: bd529dbb661d ("iommufd: Add a nested HW pagetable object") Link: https://patch.msgid.link/r/0-v1-c8770519edde+1a-iommufd_nesting_ops_jgg@nvidia.com Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe (cherry picked from commit 73183ad6ea51029d04b098286dcee98d715015f1) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/hw_pagetable.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 29612106a962a..a2604955a7345 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -215,7 +215,8 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, if (flags || !user_data->len || !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); - if (parent->auto_domain || !parent->nest_parent) + if (parent->auto_domain || !parent->nest_parent || + parent->common.domain->owner != ops) return ERR_PTR(-EINVAL); hwpt_nested = __iommufd_object_alloc( From 9ee9ac8475acea48714d4f90ae457089993a9bbf Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Fri, 8 Dec 2023 09:53:14 +0800 Subject: [PATCH 483/548] iommu: Set owner token to SVA domain Commit a9c362db3920 ("iommu: Validate that devices match domains") added an owner token to the iommu_domain. This token is checked during domain attachment to RID or PASID through the generic iommu interfaces. The SVA domains are attached to PASIDs through those iommu interfaces. Therefore, they require the owner token to be set during allocation. Otherwise, they fail to attach. Set the owner token for SVA domains. Fixes: a9c362db3920 ("iommu: Validate that devices match domains") Cc: Robin Murphy Signed-off-by: Lu Baolu Reviewed-by: Robin Murphy Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20231208015314.320663-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 7be423336eccc872249d37900c19c1d24f171353) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index f7547ea8df6e5..3c9858b36fd78 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3742,6 +3742,7 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, domain->type = IOMMU_DOMAIN_SVA; mmgrab(mm); domain->mm = mm; + domain->owner = ops; domain->iopf_handler = iommu_sva_handle_iopf; domain->fault_data = mm; From bbe8658ef41c4895a51a0eecba65509449997ec5 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:12 +0800 Subject: [PATCH 484/548] iommu: Move iommu fault data to linux/iommu.h The iommu fault data is currently defined in uapi/linux/iommu.h, but is only used inside the iommu subsystem. Move it to linux/iommu.h, where it will be more accessible to kernel drivers. With this done, uapi/linux/iommu.h becomes empty and can be removed from the tree. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Yi Liu Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 00a9bc6070434814d39118a0de70c1645f64bf60) Signed-off-by: Koba Ko --- MAINTAINERS | 1 - include/linux/iommu.h | 152 +++++++++++++++++++++++++++++++++- include/uapi/linux/iommu.h | 161 ------------------------------------- 3 files changed, 151 insertions(+), 163 deletions(-) delete mode 100644 include/uapi/linux/iommu.h diff --git a/MAINTAINERS b/MAINTAINERS index 294d2ce29b735..940194dea7466 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10992,7 +10992,6 @@ F: drivers/iommu/ F: include/linux/iommu.h F: include/linux/iova.h F: include/linux/of_iommu.h -F: include/uapi/linux/iommu.h IOMMUFD M: Jason Gunthorpe diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 539eb91b4e38b..1ef2e13cff26d 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -14,7 +14,6 @@ #include #include #include -#include #define IOMMU_READ (1 << 0) #define IOMMU_WRITE (1 << 1) @@ -44,6 +43,157 @@ struct iommu_sva; struct iommu_fault_event; struct iommu_dma_cookie; +#define IOMMU_FAULT_PERM_READ (1 << 0) /* read */ +#define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */ +#define IOMMU_FAULT_PERM_EXEC (1 << 2) /* exec */ +#define IOMMU_FAULT_PERM_PRIV (1 << 3) /* privileged */ + +/* Generic fault types, can be expanded IRQ remapping fault */ +enum iommu_fault_type { + IOMMU_FAULT_DMA_UNRECOV = 1, /* unrecoverable fault */ + IOMMU_FAULT_PAGE_REQ, /* page request fault */ +}; + +enum iommu_fault_reason { + IOMMU_FAULT_REASON_UNKNOWN = 0, + + /* Could not access the PASID table (fetch caused external abort) */ + IOMMU_FAULT_REASON_PASID_FETCH, + + /* PASID entry is invalid or has configuration errors */ + IOMMU_FAULT_REASON_BAD_PASID_ENTRY, + + /* + * PASID is out of range (e.g. exceeds the maximum PASID + * supported by the IOMMU) or disabled. + */ + IOMMU_FAULT_REASON_PASID_INVALID, + + /* + * An external abort occurred fetching (or updating) a translation + * table descriptor + */ + IOMMU_FAULT_REASON_WALK_EABT, + + /* + * Could not access the page table entry (Bad address), + * actual translation fault + */ + IOMMU_FAULT_REASON_PTE_FETCH, + + /* Protection flag check failed */ + IOMMU_FAULT_REASON_PERMISSION, + + /* access flag check failed */ + IOMMU_FAULT_REASON_ACCESS, + + /* Output address of a translation stage caused Address Size fault */ + IOMMU_FAULT_REASON_OOR_ADDRESS, +}; + +/** + * struct iommu_fault_unrecoverable - Unrecoverable fault data + * @reason: reason of the fault, from &enum iommu_fault_reason + * @flags: parameters of this fault (IOMMU_FAULT_UNRECOV_* values) + * @pasid: Process Address Space ID + * @perm: requested permission access using by the incoming transaction + * (IOMMU_FAULT_PERM_* values) + * @addr: offending page address + * @fetch_addr: address that caused a fetch abort, if any + */ +struct iommu_fault_unrecoverable { + __u32 reason; +#define IOMMU_FAULT_UNRECOV_PASID_VALID (1 << 0) +#define IOMMU_FAULT_UNRECOV_ADDR_VALID (1 << 1) +#define IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID (1 << 2) + __u32 flags; + __u32 pasid; + __u32 perm; + __u64 addr; + __u64 fetch_addr; +}; + +/** + * struct iommu_fault_page_request - Page Request data + * @flags: encodes whether the corresponding fields are valid and whether this + * is the last page in group (IOMMU_FAULT_PAGE_REQUEST_* values). + * When IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID is set, the page response + * must have the same PASID value as the page request. When it is clear, + * the page response should not have a PASID. + * @pasid: Process Address Space ID + * @grpid: Page Request Group Index + * @perm: requested page permissions (IOMMU_FAULT_PERM_* values) + * @addr: page address + * @private_data: device-specific private information + */ +struct iommu_fault_page_request { +#define IOMMU_FAULT_PAGE_REQUEST_PASID_VALID (1 << 0) +#define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE (1 << 1) +#define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA (1 << 2) +#define IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID (1 << 3) + __u32 flags; + __u32 pasid; + __u32 grpid; + __u32 perm; + __u64 addr; + __u64 private_data[2]; +}; + +/** + * struct iommu_fault - Generic fault data + * @type: fault type from &enum iommu_fault_type + * @padding: reserved for future use (should be zero) + * @event: fault event, when @type is %IOMMU_FAULT_DMA_UNRECOV + * @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ + * @padding2: sets the fault size to allow for future extensions + */ +struct iommu_fault { + __u32 type; + __u32 padding; + union { + struct iommu_fault_unrecoverable event; + struct iommu_fault_page_request prm; + __u8 padding2[56]; + }; +}; + +/** + * enum iommu_page_response_code - Return status of fault handlers + * @IOMMU_PAGE_RESP_SUCCESS: Fault has been handled and the page tables + * populated, retry the access. This is "Success" in PCI PRI. + * @IOMMU_PAGE_RESP_FAILURE: General error. Drop all subsequent faults from + * this device if possible. This is "Response Failure" in PCI PRI. + * @IOMMU_PAGE_RESP_INVALID: Could not handle this fault, don't retry the + * access. This is "Invalid Request" in PCI PRI. + */ +enum iommu_page_response_code { + IOMMU_PAGE_RESP_SUCCESS = 0, + IOMMU_PAGE_RESP_INVALID, + IOMMU_PAGE_RESP_FAILURE, +}; + +/** + * struct iommu_page_response - Generic page response information + * @argsz: User filled size of this data + * @version: API version of this structure + * @flags: encodes whether the corresponding fields are valid + * (IOMMU_FAULT_PAGE_RESPONSE_* values) + * @pasid: Process Address Space ID + * @grpid: Page Request Group Index + * @code: response code from &enum iommu_page_response_code + */ +struct iommu_page_response { + __u32 argsz; +#define IOMMU_PAGE_RESP_VERSION_1 1 + __u32 version; +#define IOMMU_PAGE_RESP_PASID_VALID (1 << 0) + __u32 flags; + __u32 pasid; + __u32 grpid; + __u32 code; +}; + + /* iommu fault flags */ #define IOMMU_FAULT_READ 0x0 #define IOMMU_FAULT_WRITE 0x1 diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h deleted file mode 100644 index 65d8b0234f690..0000000000000 --- a/include/uapi/linux/iommu.h +++ /dev/null @@ -1,161 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -/* - * IOMMU user API definitions - */ - -#ifndef _UAPI_IOMMU_H -#define _UAPI_IOMMU_H - -#include - -#define IOMMU_FAULT_PERM_READ (1 << 0) /* read */ -#define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */ -#define IOMMU_FAULT_PERM_EXEC (1 << 2) /* exec */ -#define IOMMU_FAULT_PERM_PRIV (1 << 3) /* privileged */ - -/* Generic fault types, can be expanded IRQ remapping fault */ -enum iommu_fault_type { - IOMMU_FAULT_DMA_UNRECOV = 1, /* unrecoverable fault */ - IOMMU_FAULT_PAGE_REQ, /* page request fault */ -}; - -enum iommu_fault_reason { - IOMMU_FAULT_REASON_UNKNOWN = 0, - - /* Could not access the PASID table (fetch caused external abort) */ - IOMMU_FAULT_REASON_PASID_FETCH, - - /* PASID entry is invalid or has configuration errors */ - IOMMU_FAULT_REASON_BAD_PASID_ENTRY, - - /* - * PASID is out of range (e.g. exceeds the maximum PASID - * supported by the IOMMU) or disabled. - */ - IOMMU_FAULT_REASON_PASID_INVALID, - - /* - * An external abort occurred fetching (or updating) a translation - * table descriptor - */ - IOMMU_FAULT_REASON_WALK_EABT, - - /* - * Could not access the page table entry (Bad address), - * actual translation fault - */ - IOMMU_FAULT_REASON_PTE_FETCH, - - /* Protection flag check failed */ - IOMMU_FAULT_REASON_PERMISSION, - - /* access flag check failed */ - IOMMU_FAULT_REASON_ACCESS, - - /* Output address of a translation stage caused Address Size fault */ - IOMMU_FAULT_REASON_OOR_ADDRESS, -}; - -/** - * struct iommu_fault_unrecoverable - Unrecoverable fault data - * @reason: reason of the fault, from &enum iommu_fault_reason - * @flags: parameters of this fault (IOMMU_FAULT_UNRECOV_* values) - * @pasid: Process Address Space ID - * @perm: requested permission access using by the incoming transaction - * (IOMMU_FAULT_PERM_* values) - * @addr: offending page address - * @fetch_addr: address that caused a fetch abort, if any - */ -struct iommu_fault_unrecoverable { - __u32 reason; -#define IOMMU_FAULT_UNRECOV_PASID_VALID (1 << 0) -#define IOMMU_FAULT_UNRECOV_ADDR_VALID (1 << 1) -#define IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID (1 << 2) - __u32 flags; - __u32 pasid; - __u32 perm; - __u64 addr; - __u64 fetch_addr; -}; - -/** - * struct iommu_fault_page_request - Page Request data - * @flags: encodes whether the corresponding fields are valid and whether this - * is the last page in group (IOMMU_FAULT_PAGE_REQUEST_* values). - * When IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID is set, the page response - * must have the same PASID value as the page request. When it is clear, - * the page response should not have a PASID. - * @pasid: Process Address Space ID - * @grpid: Page Request Group Index - * @perm: requested page permissions (IOMMU_FAULT_PERM_* values) - * @addr: page address - * @private_data: device-specific private information - */ -struct iommu_fault_page_request { -#define IOMMU_FAULT_PAGE_REQUEST_PASID_VALID (1 << 0) -#define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE (1 << 1) -#define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA (1 << 2) -#define IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID (1 << 3) - __u32 flags; - __u32 pasid; - __u32 grpid; - __u32 perm; - __u64 addr; - __u64 private_data[2]; -}; - -/** - * struct iommu_fault - Generic fault data - * @type: fault type from &enum iommu_fault_type - * @padding: reserved for future use (should be zero) - * @event: fault event, when @type is %IOMMU_FAULT_DMA_UNRECOV - * @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ - * @padding2: sets the fault size to allow for future extensions - */ -struct iommu_fault { - __u32 type; - __u32 padding; - union { - struct iommu_fault_unrecoverable event; - struct iommu_fault_page_request prm; - __u8 padding2[56]; - }; -}; - -/** - * enum iommu_page_response_code - Return status of fault handlers - * @IOMMU_PAGE_RESP_SUCCESS: Fault has been handled and the page tables - * populated, retry the access. This is "Success" in PCI PRI. - * @IOMMU_PAGE_RESP_FAILURE: General error. Drop all subsequent faults from - * this device if possible. This is "Response Failure" in PCI PRI. - * @IOMMU_PAGE_RESP_INVALID: Could not handle this fault, don't retry the - * access. This is "Invalid Request" in PCI PRI. - */ -enum iommu_page_response_code { - IOMMU_PAGE_RESP_SUCCESS = 0, - IOMMU_PAGE_RESP_INVALID, - IOMMU_PAGE_RESP_FAILURE, -}; - -/** - * struct iommu_page_response - Generic page response information - * @argsz: User filled size of this data - * @version: API version of this structure - * @flags: encodes whether the corresponding fields are valid - * (IOMMU_FAULT_PAGE_RESPONSE_* values) - * @pasid: Process Address Space ID - * @grpid: Page Request Group Index - * @code: response code from &enum iommu_page_response_code - */ -struct iommu_page_response { - __u32 argsz; -#define IOMMU_PAGE_RESP_VERSION_1 1 - __u32 version; -#define IOMMU_PAGE_RESP_PASID_VALID (1 << 0) - __u32 flags; - __u32 pasid; - __u32 grpid; - __u32 code; -}; - -#endif /* _UAPI_IOMMU_H */ From 57f9879084724b55f036bb01745c055338a09a9d Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:13 +0800 Subject: [PATCH 485/548] iommu/arm-smmu-v3: Remove unrecoverable faults reporting No device driver registers fault handler to handle the reported unrecoveraable faults. Remove it to avoid dead code. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 66014df73b302d326b995178141a150b9a3a52b7) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 46 ++++++--------------- 1 file changed, 13 insertions(+), 33 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8f6952fad3efc..62a11afc92e58 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1644,7 +1644,6 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) { int ret; - u32 reason; u32 perm = 0; struct arm_smmu_master *master; bool ssid_valid = evt[0] & EVTQ_0_SSV; @@ -1654,16 +1653,9 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) switch (FIELD_GET(EVTQ_0_ID, evt[0])) { case EVT_ID_TRANSLATION_FAULT: - reason = IOMMU_FAULT_REASON_PTE_FETCH; - break; case EVT_ID_ADDR_SIZE_FAULT: - reason = IOMMU_FAULT_REASON_OOR_ADDRESS; - break; case EVT_ID_ACCESS_FAULT: - reason = IOMMU_FAULT_REASON_ACCESS; - break; case EVT_ID_PERMISSION_FAULT: - reason = IOMMU_FAULT_REASON_PERMISSION; break; default: return -EOPNOTSUPP; @@ -1673,6 +1665,9 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) if (evt[1] & EVTQ_1_S2) return -EFAULT; + if (!(evt[1] & EVTQ_1_STALL)) + return -EOPNOTSUPP; + if (evt[1] & EVTQ_1_RnW) perm |= IOMMU_FAULT_PERM_READ; else @@ -1684,32 +1679,17 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) if (evt[1] & EVTQ_1_PnU) perm |= IOMMU_FAULT_PERM_PRIV; - if (evt[1] & EVTQ_1_STALL) { - flt->type = IOMMU_FAULT_PAGE_REQ; - flt->prm = (struct iommu_fault_page_request) { - .flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE, - .grpid = FIELD_GET(EVTQ_1_STAG, evt[1]), - .perm = perm, - .addr = FIELD_GET(EVTQ_2_ADDR, evt[2]), - }; - - if (ssid_valid) { - flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; - flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]); - } - } else { - flt->type = IOMMU_FAULT_DMA_UNRECOV; - flt->event = (struct iommu_fault_unrecoverable) { - .reason = reason, - .flags = IOMMU_FAULT_UNRECOV_ADDR_VALID, - .perm = perm, - .addr = FIELD_GET(EVTQ_2_ADDR, evt[2]), - }; + flt->type = IOMMU_FAULT_PAGE_REQ; + flt->prm = (struct iommu_fault_page_request) { + .flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE, + .grpid = FIELD_GET(EVTQ_1_STAG, evt[1]), + .perm = perm, + .addr = FIELD_GET(EVTQ_2_ADDR, evt[2]), + }; - if (ssid_valid) { - flt->event.flags |= IOMMU_FAULT_UNRECOV_PASID_VALID; - flt->event.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]); - } + if (ssid_valid) { + flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; + flt->prm.pasid = FIELD_GET(EVTQ_0_SSID, evt[0]); } mutex_lock(&smmu->streams_mutex); From 073ebc93b82d22031a25350ea4927451809d9e59 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:14 +0800 Subject: [PATCH 486/548] iommu: Remove unrecoverable fault data The unrecoverable fault data is not used anywhere. Remove it to avoid dead code. Suggested-by: Kevin Tian Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 0edeab66eba88947dabe8634a3efd136cc771750) Signed-off-by: Koba Ko --- include/linux/iommu.h | 72 ++----------------------------------------- 1 file changed, 2 insertions(+), 70 deletions(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 1ef2e13cff26d..1d9454c1b3a11 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -50,67 +50,7 @@ struct iommu_dma_cookie; /* Generic fault types, can be expanded IRQ remapping fault */ enum iommu_fault_type { - IOMMU_FAULT_DMA_UNRECOV = 1, /* unrecoverable fault */ - IOMMU_FAULT_PAGE_REQ, /* page request fault */ -}; - -enum iommu_fault_reason { - IOMMU_FAULT_REASON_UNKNOWN = 0, - - /* Could not access the PASID table (fetch caused external abort) */ - IOMMU_FAULT_REASON_PASID_FETCH, - - /* PASID entry is invalid or has configuration errors */ - IOMMU_FAULT_REASON_BAD_PASID_ENTRY, - - /* - * PASID is out of range (e.g. exceeds the maximum PASID - * supported by the IOMMU) or disabled. - */ - IOMMU_FAULT_REASON_PASID_INVALID, - - /* - * An external abort occurred fetching (or updating) a translation - * table descriptor - */ - IOMMU_FAULT_REASON_WALK_EABT, - - /* - * Could not access the page table entry (Bad address), - * actual translation fault - */ - IOMMU_FAULT_REASON_PTE_FETCH, - - /* Protection flag check failed */ - IOMMU_FAULT_REASON_PERMISSION, - - /* access flag check failed */ - IOMMU_FAULT_REASON_ACCESS, - - /* Output address of a translation stage caused Address Size fault */ - IOMMU_FAULT_REASON_OOR_ADDRESS, -}; - -/** - * struct iommu_fault_unrecoverable - Unrecoverable fault data - * @reason: reason of the fault, from &enum iommu_fault_reason - * @flags: parameters of this fault (IOMMU_FAULT_UNRECOV_* values) - * @pasid: Process Address Space ID - * @perm: requested permission access using by the incoming transaction - * (IOMMU_FAULT_PERM_* values) - * @addr: offending page address - * @fetch_addr: address that caused a fetch abort, if any - */ -struct iommu_fault_unrecoverable { - __u32 reason; -#define IOMMU_FAULT_UNRECOV_PASID_VALID (1 << 0) -#define IOMMU_FAULT_UNRECOV_ADDR_VALID (1 << 1) -#define IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID (1 << 2) - __u32 flags; - __u32 pasid; - __u32 perm; - __u64 addr; - __u64 fetch_addr; + IOMMU_FAULT_PAGE_REQ = 1, /* page request fault */ }; /** @@ -142,19 +82,11 @@ struct iommu_fault_page_request { /** * struct iommu_fault - Generic fault data * @type: fault type from &enum iommu_fault_type - * @padding: reserved for future use (should be zero) - * @event: fault event, when @type is %IOMMU_FAULT_DMA_UNRECOV * @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ - * @padding2: sets the fault size to allow for future extensions */ struct iommu_fault { __u32 type; - __u32 padding; - union { - struct iommu_fault_unrecoverable event; - struct iommu_fault_page_request prm; - __u8 padding2[56]; - }; + struct iommu_fault_page_request prm; }; /** From bc80b159be589caffcd43cf4c60ad535c018491f Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:15 +0800 Subject: [PATCH 487/548] iommu: Cleanup iopf data structure definitions struct iommu_fault_page_request and struct iommu_page_response are not part of uAPI anymore. Convert them to data structures for kAPI. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Yi Liu Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 8b32a3bea2629049c484f595af7aad797e24453e) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 1 - drivers/iommu/iommu.c | 4 ---- include/linux/iommu.h | 27 +++++++++++---------------- 3 files changed, 11 insertions(+), 21 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index e5b8b9110c132..24b5545352ae7 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -56,7 +56,6 @@ static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf, enum iommu_page_response_code status) { struct iommu_page_response resp = { - .version = IOMMU_PAGE_RESP_VERSION_1, .pasid = iopf->fault.prm.pasid, .grpid = iopf->fault.prm.grpid, .code = status, diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3c9858b36fd78..11f15aa9f30ba 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1536,10 +1536,6 @@ int iommu_page_response(struct device *dev, if (!param || !param->fault_param) return -EINVAL; - if (msg->version != IOMMU_PAGE_RESP_VERSION_1 || - msg->flags & ~IOMMU_PAGE_RESP_PASID_VALID) - return -EINVAL; - /* Only send response if there is a fault report pending */ mutex_lock(¶m->fault_param->lock); if (list_empty(¶m->fault_param->faults)) { diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 1d9454c1b3a11..dc3538cc65120 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -71,12 +71,12 @@ struct iommu_fault_page_request { #define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE (1 << 1) #define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA (1 << 2) #define IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID (1 << 3) - __u32 flags; - __u32 pasid; - __u32 grpid; - __u32 perm; - __u64 addr; - __u64 private_data[2]; + u32 flags; + u32 pasid; + u32 grpid; + u32 perm; + u64 addr; + u64 private_data[2]; }; /** @@ -85,7 +85,7 @@ struct iommu_fault_page_request { * @prm: Page Request message, when @type is %IOMMU_FAULT_PAGE_REQ */ struct iommu_fault { - __u32 type; + u32 type; struct iommu_fault_page_request prm; }; @@ -106,8 +106,6 @@ enum iommu_page_response_code { /** * struct iommu_page_response - Generic page response information - * @argsz: User filled size of this data - * @version: API version of this structure * @flags: encodes whether the corresponding fields are valid * (IOMMU_FAULT_PAGE_RESPONSE_* values) * @pasid: Process Address Space ID @@ -115,14 +113,11 @@ enum iommu_page_response_code { * @code: response code from &enum iommu_page_response_code */ struct iommu_page_response { - __u32 argsz; -#define IOMMU_PAGE_RESP_VERSION_1 1 - __u32 version; #define IOMMU_PAGE_RESP_PASID_VALID (1 << 0) - __u32 flags; - __u32 pasid; - __u32 grpid; - __u32 code; + u32 flags; + u32 pasid; + u32 grpid; + u32 code; }; From dba2da594a658ca0da021391a6ebf59d5eca776b Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:16 +0800 Subject: [PATCH 488/548] iommu: Merge iopf_device_param into iommu_fault_param The struct dev_iommu contains two pointers, fault_param and iopf_param. The fault_param pointer points to a data structure that is used to store pending faults that are awaiting responses. The iopf_param pointer points to a data structure that is used to store partial faults that are part of a Page Request Group. The fault_param and iopf_param pointers are essentially duplicate. This causes memory waste. Merge the iopf_device_param pointer into the iommu_fault_param pointer to consolidate the code and save memory. The consolidated pointer would be allocated on demand when the device driver enables the iopf on device, and would be freed after iopf is disabled. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 15fc60cdd2d236a73b32c99d21fc0f7b7ce6cbbb) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 110 ++++++++++++++++++------------------- drivers/iommu/iommu.c | 34 ++---------- include/linux/iommu.h | 18 ++++-- 3 files changed, 72 insertions(+), 90 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 24b5545352ae7..f948303b2a91a 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -25,21 +25,6 @@ struct iopf_queue { struct mutex lock; }; -/** - * struct iopf_device_param - IO Page Fault data attached to a device - * @dev: the device that owns this param - * @queue: IOPF queue - * @queue_list: index into queue->devices - * @partial: faults that are part of a Page Request Group for which the last - * request hasn't been submitted yet. - */ -struct iopf_device_param { - struct device *dev; - struct iopf_queue *queue; - struct list_head queue_list; - struct list_head partial; -}; - struct iopf_fault { struct iommu_fault fault; struct list_head list; @@ -144,7 +129,7 @@ int iommu_queue_iopf(struct iommu_fault *fault, void *cookie) int ret; struct iopf_group *group; struct iopf_fault *iopf, *next; - struct iopf_device_param *iopf_param; + struct iommu_fault_param *iopf_param; struct device *dev = cookie; struct dev_iommu *param = dev->iommu; @@ -159,7 +144,7 @@ int iommu_queue_iopf(struct iommu_fault *fault, void *cookie) * As long as we're holding param->lock, the queue can't be unlinked * from the device and therefore cannot disappear. */ - iopf_param = param->iopf_param; + iopf_param = param->fault_param; if (!iopf_param) return -ENODEV; @@ -229,14 +214,14 @@ EXPORT_SYMBOL_GPL(iommu_queue_iopf); int iopf_queue_flush_dev(struct device *dev) { int ret = 0; - struct iopf_device_param *iopf_param; + struct iommu_fault_param *iopf_param; struct dev_iommu *param = dev->iommu; if (!param) return -ENODEV; mutex_lock(¶m->lock); - iopf_param = param->iopf_param; + iopf_param = param->fault_param; if (iopf_param) flush_workqueue(iopf_param->queue->wq); else @@ -260,7 +245,7 @@ EXPORT_SYMBOL_GPL(iopf_queue_flush_dev); int iopf_queue_discard_partial(struct iopf_queue *queue) { struct iopf_fault *iopf, *next; - struct iopf_device_param *iopf_param; + struct iommu_fault_param *iopf_param; if (!queue) return -EINVAL; @@ -287,34 +272,36 @@ EXPORT_SYMBOL_GPL(iopf_queue_discard_partial); */ int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) { - int ret = -EBUSY; - struct iopf_device_param *iopf_param; + int ret = 0; struct dev_iommu *param = dev->iommu; - - if (!param) - return -ENODEV; - - iopf_param = kzalloc(sizeof(*iopf_param), GFP_KERNEL); - if (!iopf_param) - return -ENOMEM; - - INIT_LIST_HEAD(&iopf_param->partial); - iopf_param->queue = queue; - iopf_param->dev = dev; + struct iommu_fault_param *fault_param; mutex_lock(&queue->lock); mutex_lock(¶m->lock); - if (!param->iopf_param) { - list_add(&iopf_param->queue_list, &queue->devices); - param->iopf_param = iopf_param; - ret = 0; + if (param->fault_param) { + ret = -EBUSY; + goto done_unlock; } + + fault_param = kzalloc(sizeof(*fault_param), GFP_KERNEL); + if (!fault_param) { + ret = -ENOMEM; + goto done_unlock; + } + + mutex_init(&fault_param->lock); + INIT_LIST_HEAD(&fault_param->faults); + INIT_LIST_HEAD(&fault_param->partial); + fault_param->dev = dev; + list_add(&fault_param->queue_list, &queue->devices); + fault_param->queue = queue; + + param->fault_param = fault_param; + +done_unlock: mutex_unlock(¶m->lock); mutex_unlock(&queue->lock); - if (ret) - kfree(iopf_param); - return ret; } EXPORT_SYMBOL_GPL(iopf_queue_add_device); @@ -330,34 +317,41 @@ EXPORT_SYMBOL_GPL(iopf_queue_add_device); */ int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) { - int ret = -EINVAL; + int ret = 0; struct iopf_fault *iopf, *next; - struct iopf_device_param *iopf_param; struct dev_iommu *param = dev->iommu; - - if (!param || !queue) - return -EINVAL; + struct iommu_fault_param *fault_param = param->fault_param; mutex_lock(&queue->lock); mutex_lock(¶m->lock); - iopf_param = param->iopf_param; - if (iopf_param && iopf_param->queue == queue) { - list_del(&iopf_param->queue_list); - param->iopf_param = NULL; - ret = 0; + if (!fault_param) { + ret = -ENODEV; + goto unlock; } - mutex_unlock(¶m->lock); - mutex_unlock(&queue->lock); - if (ret) - return ret; + + if (fault_param->queue != queue) { + ret = -EINVAL; + goto unlock; + } + + if (!list_empty(&fault_param->faults)) { + ret = -EBUSY; + goto unlock; + } + + list_del(&fault_param->queue_list); /* Just in case some faults are still stuck */ - list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) + list_for_each_entry_safe(iopf, next, &fault_param->partial, list) kfree(iopf); - kfree(iopf_param); + param->fault_param = NULL; + kfree(fault_param); +unlock: + mutex_unlock(¶m->lock); + mutex_unlock(&queue->lock); - return 0; + return ret; } EXPORT_SYMBOL_GPL(iopf_queue_remove_device); @@ -403,7 +397,7 @@ EXPORT_SYMBOL_GPL(iopf_queue_alloc); */ void iopf_queue_free(struct iopf_queue *queue) { - struct iopf_device_param *iopf_param, *next; + struct iommu_fault_param *iopf_param, *next; if (!queue) return; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 11f15aa9f30ba..f292a23f1887f 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1397,27 +1397,18 @@ int iommu_register_device_fault_handler(struct device *dev, struct dev_iommu *param = dev->iommu; int ret = 0; - if (!param) + if (!param || !param->fault_param) return -EINVAL; mutex_lock(¶m->lock); /* Only allow one fault handler registered for each device */ - if (param->fault_param) { + if (param->fault_param->handler) { ret = -EBUSY; goto done_unlock; } - get_device(dev); - param->fault_param = kzalloc(sizeof(*param->fault_param), GFP_KERNEL); - if (!param->fault_param) { - put_device(dev); - ret = -ENOMEM; - goto done_unlock; - } param->fault_param->handler = handler; param->fault_param->data = data; - mutex_init(¶m->fault_param->lock); - INIT_LIST_HEAD(¶m->fault_param->faults); done_unlock: mutex_unlock(¶m->lock); @@ -1438,29 +1429,16 @@ EXPORT_SYMBOL_GPL(iommu_register_device_fault_handler); int iommu_unregister_device_fault_handler(struct device *dev) { struct dev_iommu *param = dev->iommu; - int ret = 0; - if (!param) + if (!param || !param->fault_param) return -EINVAL; mutex_lock(¶m->lock); - - if (!param->fault_param) - goto unlock; - - /* we cannot unregister handler if there are pending faults */ - if (!list_empty(¶m->fault_param->faults)) { - ret = -EBUSY; - goto unlock; - } - - kfree(param->fault_param); - param->fault_param = NULL; - put_device(dev); -unlock: + param->fault_param->handler = NULL; + param->fault_param->data = NULL; mutex_unlock(¶m->lock); - return ret; + return 0; } EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index dc3538cc65120..0151b42ed7c98 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -42,6 +42,7 @@ struct notifier_block; struct iommu_sva; struct iommu_fault_event; struct iommu_dma_cookie; +struct iopf_queue; #define IOMMU_FAULT_PERM_READ (1 << 0) /* read */ #define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */ @@ -603,21 +604,31 @@ struct iommu_fault_event { * struct iommu_fault_param - per-device IOMMU fault data * @handler: Callback function to handle IOMMU faults at device level * @data: handler private data - * @faults: holds the pending faults which needs response * @lock: protect pending faults list + * @dev: the device that owns this param + * @queue: IOPF queue + * @queue_list: index into queue->devices + * @partial: faults that are part of a Page Request Group for which the last + * request hasn't been submitted yet. + * @faults: holds the pending faults which need response */ struct iommu_fault_param { iommu_dev_fault_handler_t handler; void *data; - struct list_head faults; struct mutex lock; + + struct device *dev; + struct iopf_queue *queue; + struct list_head queue_list; + + struct list_head partial; + struct list_head faults; }; /** * struct dev_iommu - Collection of per-device IOMMU data * * @fault_param: IOMMU detected device fault reporting data - * @iopf_param: I/O Page Fault queue and data * @fwspec: IOMMU fwspec data * @iommu_dev: IOMMU device this device is linked to * @priv: IOMMU Driver private data @@ -632,7 +643,6 @@ struct iommu_fault_param { struct dev_iommu { struct mutex lock; struct iommu_fault_param *fault_param; - struct iopf_device_param *iopf_param; struct iommu_fwspec *fwspec; struct iommu_device *iommu_dev; void *priv; From 7e9914b5d4fee78fa7b160f196973e89ff1d559a Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:17 +0800 Subject: [PATCH 489/548] iommu: Remove iommu_[un]register_device_fault_handler() The individual iommu driver reports the iommu page faults by calling iommu_report_device_fault(), where a pre-registered device fault handler is called to route the fault to another fault handler installed on the corresponding iommu domain. The pre-registered device fault handler is static and won't be dynamic as the fault handler is eventually per iommu domain. Replace calling device fault handler with iommu_queue_iopf(). After this replacement, the registering and unregistering fault handler interfaces are not needed anywhere. Remove the interfaces and the related data structures to avoid dead code. Convert cookie parameter of iommu_queue_iopf() into a device pointer that is really passed. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 1ff25d798e52943d037accf15c675a6845d9776f) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 13 +--- drivers/iommu/intel/iommu.c | 24 ++---- drivers/iommu/io-pgfault.c | 6 +- drivers/iommu/iommu-sva.h | 4 +- drivers/iommu/iommu.c | 76 +------------------ include/linux/iommu.h | 23 ------ 6 files changed, 13 insertions(+), 133 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 2610e82c0ecd0..547e77121a3cd 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -476,7 +476,6 @@ bool arm_smmu_master_sva_enabled(struct arm_smmu_master *master) static int arm_smmu_master_sva_enable_iopf(struct arm_smmu_master *master) { - int ret; struct device *dev = master->dev; /* @@ -489,16 +488,7 @@ static int arm_smmu_master_sva_enable_iopf(struct arm_smmu_master *master) if (!master->iopf_enabled) return -EINVAL; - ret = iopf_queue_add_device(master->smmu->evtq.iopf, dev); - if (ret) - return ret; - - ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev); - if (ret) { - iopf_queue_remove_device(master->smmu->evtq.iopf, dev); - return ret; - } - return 0; + return iopf_queue_add_device(master->smmu->evtq.iopf, dev); } static void arm_smmu_master_sva_disable_iopf(struct arm_smmu_master *master) @@ -508,7 +498,6 @@ static void arm_smmu_master_sva_disable_iopf(struct arm_smmu_master *master) if (!master->iopf_enabled) return; - iommu_unregister_device_fault_handler(dev); iopf_queue_remove_device(master->smmu->evtq.iopf, dev); } diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d96f270fa9e7d..2bac09d04ae11 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4628,23 +4628,15 @@ static int intel_iommu_enable_iopf(struct device *dev) if (ret) return ret; - ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev); - if (ret) - goto iopf_remove_device; - ret = pci_enable_pri(pdev, PRQ_DEPTH); - if (ret) - goto iopf_unregister_handler; + if (ret) { + iopf_queue_remove_device(iommu->iopf_queue, dev); + return ret; + } + info->pri_enabled = 1; return 0; - -iopf_unregister_handler: - iommu_unregister_device_fault_handler(dev); -iopf_remove_device: - iopf_queue_remove_device(iommu->iopf_queue, dev); - - return ret; } static int intel_iommu_disable_iopf(struct device *dev) @@ -4667,11 +4659,9 @@ static int intel_iommu_disable_iopf(struct device *dev) info->pri_enabled = 0; /* - * With PRI disabled and outstanding PRQs drained, unregistering - * fault handler and removing device from iopf queue should never - * fail. + * With PRI disabled and outstanding PRQs drained, removing device + * from iopf queue should never fail. */ - WARN_ON(iommu_unregister_device_fault_handler(dev)); WARN_ON(iopf_queue_remove_device(iommu->iopf_queue, dev)); return 0; diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index f948303b2a91a..4fda01de55898 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -87,7 +87,7 @@ static void iopf_handler(struct work_struct *work) /** * iommu_queue_iopf - IO Page Fault handler * @fault: fault event - * @cookie: struct device, passed to iommu_register_device_fault_handler. + * @dev: struct device. * * Add a fault to the device workqueue, to be handled by mm. * @@ -124,14 +124,12 @@ static void iopf_handler(struct work_struct *work) * * Return: 0 on success and <0 on error. */ -int iommu_queue_iopf(struct iommu_fault *fault, void *cookie) +int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) { int ret; struct iopf_group *group; struct iopf_fault *iopf, *next; struct iommu_fault_param *iopf_param; - - struct device *dev = cookie; struct dev_iommu *param = dev->iommu; lockdep_assert_held(¶m->lock); diff --git a/drivers/iommu/iommu-sva.h b/drivers/iommu/iommu-sva.h index 54946b5a7cafe..de7819c796cee 100644 --- a/drivers/iommu/iommu-sva.h +++ b/drivers/iommu/iommu-sva.h @@ -13,7 +13,7 @@ struct iommu_fault; struct iopf_queue; #ifdef CONFIG_IOMMU_SVA -int iommu_queue_iopf(struct iommu_fault *fault, void *cookie); +int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev); int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev); int iopf_queue_remove_device(struct iopf_queue *queue, @@ -26,7 +26,7 @@ enum iommu_page_response_code iommu_sva_handle_iopf(struct iommu_fault *fault, void *data); #else /* CONFIG_IOMMU_SVA */ -static inline int iommu_queue_iopf(struct iommu_fault *fault, void *cookie) +static inline int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) { return -ENODEV; } diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index f292a23f1887f..7cddb73605385 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1372,76 +1372,6 @@ void iommu_group_put(struct iommu_group *group) } EXPORT_SYMBOL_GPL(iommu_group_put); -/** - * iommu_register_device_fault_handler() - Register a device fault handler - * @dev: the device - * @handler: the fault handler - * @data: private data passed as argument to the handler - * - * When an IOMMU fault event is received, this handler gets called with the - * fault event and data as argument. The handler should return 0 on success. If - * the fault is recoverable (IOMMU_FAULT_PAGE_REQ), the consumer should also - * complete the fault by calling iommu_page_response() with one of the following - * response code: - * - IOMMU_PAGE_RESP_SUCCESS: retry the translation - * - IOMMU_PAGE_RESP_INVALID: terminate the fault - * - IOMMU_PAGE_RESP_FAILURE: terminate the fault and stop reporting - * page faults if possible. - * - * Return 0 if the fault handler was installed successfully, or an error. - */ -int iommu_register_device_fault_handler(struct device *dev, - iommu_dev_fault_handler_t handler, - void *data) -{ - struct dev_iommu *param = dev->iommu; - int ret = 0; - - if (!param || !param->fault_param) - return -EINVAL; - - mutex_lock(¶m->lock); - /* Only allow one fault handler registered for each device */ - if (param->fault_param->handler) { - ret = -EBUSY; - goto done_unlock; - } - - param->fault_param->handler = handler; - param->fault_param->data = data; - -done_unlock: - mutex_unlock(¶m->lock); - - return ret; -} -EXPORT_SYMBOL_GPL(iommu_register_device_fault_handler); - -/** - * iommu_unregister_device_fault_handler() - Unregister the device fault handler - * @dev: the device - * - * Remove the device fault handler installed with - * iommu_register_device_fault_handler(). - * - * Return 0 on success, or an error. - */ -int iommu_unregister_device_fault_handler(struct device *dev) -{ - struct dev_iommu *param = dev->iommu; - - if (!param || !param->fault_param) - return -EINVAL; - - mutex_lock(¶m->lock); - param->fault_param->handler = NULL; - param->fault_param->data = NULL; - mutex_unlock(¶m->lock); - - return 0; -} -EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler); - /** * iommu_report_device_fault() - Report fault event to device driver * @dev: the device @@ -1466,10 +1396,6 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) /* we only report device fault if there is a handler registered */ mutex_lock(¶m->lock); fparam = param->fault_param; - if (!fparam || !fparam->handler) { - ret = -EINVAL; - goto done_unlock; - } if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { @@ -1484,7 +1410,7 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) mutex_unlock(&fparam->lock); } - ret = fparam->handler(&evt->fault, fparam->data); + ret = iommu_queue_iopf(&evt->fault, dev); if (ret && evt_pending) { mutex_lock(&fparam->lock); list_del(&evt_pending->list); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 0151b42ed7c98..bac15c96a4bcf 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -128,7 +128,6 @@ struct iommu_page_response { typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); -typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault *, void *); struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ @@ -602,8 +601,6 @@ struct iommu_fault_event { /** * struct iommu_fault_param - per-device IOMMU fault data - * @handler: Callback function to handle IOMMU faults at device level - * @data: handler private data * @lock: protect pending faults list * @dev: the device that owns this param * @queue: IOPF queue @@ -613,8 +610,6 @@ struct iommu_fault_event { * @faults: holds the pending faults which need response */ struct iommu_fault_param { - iommu_dev_fault_handler_t handler; - void *data; struct mutex lock; struct device *dev; @@ -735,11 +730,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data, extern struct iommu_group *iommu_group_get(struct device *dev); extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group); extern void iommu_group_put(struct iommu_group *group); -extern int iommu_register_device_fault_handler(struct device *dev, - iommu_dev_fault_handler_t handler, - void *data); - -extern int iommu_unregister_device_fault_handler(struct device *dev); extern int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt); @@ -1153,19 +1143,6 @@ static inline void iommu_group_put(struct iommu_group *group) { } -static inline -int iommu_register_device_fault_handler(struct device *dev, - iommu_dev_fault_handler_t handler, - void *data) -{ - return -ENODEV; -} - -static inline int iommu_unregister_device_fault_handler(struct device *dev) -{ - return 0; -} - static inline int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) { From 88ac2fdc494529b4164ef71384eb8ff7838a2844 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:18 +0800 Subject: [PATCH 490/548] iommu: Merge iommu_fault_event and iopf_fault The iommu_fault_event and iopf_fault data structures store the same information about an iopf fault. They are also used in the same way. Merge these two data structures into a single one to make the code more concise and easier to maintain. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Yi Liu Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 3f02a9dc70007c0e6299fda9c4f7a1e2277ec3d2) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +-- drivers/iommu/intel/iommu.h | 2 +- drivers/iommu/intel/svm.c | 5 ++-- drivers/iommu/io-pgfault.c | 5 ---- drivers/iommu/iommu.c | 8 +++--- include/linux/iommu.h | 27 ++++++--------------- 6 files changed, 17 insertions(+), 34 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 62a11afc92e58..66f11a0720259 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -928,7 +928,7 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu, } static int arm_smmu_page_response(struct device *dev, - struct iommu_fault_event *unused, + struct iopf_fault *unused, struct iommu_page_response *resp) { struct arm_smmu_cmdq_ent cmd = {0}; @@ -1648,7 +1648,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) struct arm_smmu_master *master; bool ssid_valid = evt[0] & EVTQ_0_SSV; u32 sid = FIELD_GET(EVTQ_0_SID, evt[0]); - struct iommu_fault_event fault_evt = { }; + struct iopf_fault fault_evt = { }; struct iommu_fault *flt = &fault_evt.fault; switch (FIELD_GET(EVTQ_0_ID, evt[0])) { diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 49ea164cb006f..0b1a9fd426889 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -868,7 +868,7 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn); void intel_svm_check(struct intel_iommu *iommu); int intel_svm_enable_prq(struct intel_iommu *iommu); int intel_svm_finish_prq(struct intel_iommu *iommu); -int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt, +int intel_svm_page_response(struct device *dev, struct iopf_fault *evt, struct iommu_page_response *msg); struct iommu_domain *intel_svm_domain_alloc(void); void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid); diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index fe7ecf36eb783..e262a4304ab42 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -570,13 +570,12 @@ static int prq_to_iommu_prot(struct page_req_dsc *req) static int intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev, struct page_req_dsc *desc) { - struct iommu_fault_event event; + struct iopf_fault event = { }; if (!dev || !dev_is_pci(dev)) return -ENODEV; /* Fill in event data for device specific processing */ - memset(&event, 0, sizeof(struct iommu_fault_event)); event.fault.type = IOMMU_FAULT_PAGE_REQ; event.fault.prm.addr = (u64)desc->addr << VTD_PAGE_SHIFT; event.fault.prm.pasid = desc->pasid; @@ -748,7 +747,7 @@ static irqreturn_t prq_event_thread(int irq, void *d) } int intel_svm_page_response(struct device *dev, - struct iommu_fault_event *evt, + struct iopf_fault *evt, struct iommu_page_response *msg) { struct iommu_fault_page_request *prm; diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 4fda01de55898..10d48eb72608c 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -25,11 +25,6 @@ struct iopf_queue { struct mutex lock; }; -struct iopf_fault { - struct iommu_fault fault; - struct list_head list; -}; - struct iopf_group { struct iopf_fault last_fault; struct list_head faults; diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 7cddb73605385..aaf7ec954d390 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1383,10 +1383,10 @@ EXPORT_SYMBOL_GPL(iommu_group_put); * * Return 0 on success, or an error. */ -int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { struct dev_iommu *param = dev->iommu; - struct iommu_fault_event *evt_pending = NULL; + struct iopf_fault *evt_pending = NULL; struct iommu_fault_param *fparam; int ret = 0; @@ -1399,7 +1399,7 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { - evt_pending = kmemdup(evt, sizeof(struct iommu_fault_event), + evt_pending = kmemdup(evt, sizeof(struct iopf_fault), GFP_KERNEL); if (!evt_pending) { ret = -ENOMEM; @@ -1428,7 +1428,7 @@ int iommu_page_response(struct device *dev, { bool needs_pasid; int ret = -EINVAL; - struct iommu_fault_event *evt; + struct iopf_fault *evt; struct iommu_fault_page_request *prm; struct dev_iommu *param = dev->iommu; const struct iommu_ops *ops = dev_iommu_ops(dev); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index bac15c96a4bcf..52c63cde47245 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -40,7 +40,6 @@ struct iommu_domain_ops; struct iommu_dirty_ops; struct notifier_block; struct iommu_sva; -struct iommu_fault_event; struct iommu_dma_cookie; struct iopf_queue; @@ -121,6 +120,11 @@ struct iommu_page_response { u32 code; }; +struct iopf_fault { + struct iommu_fault fault; + /* node for pending lists */ + struct list_head list; +}; /* iommu fault flags */ #define IOMMU_FAULT_READ 0x0 @@ -489,7 +493,7 @@ struct iommu_ops { int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f); int (*page_response)(struct device *dev, - struct iommu_fault_event *evt, + struct iopf_fault *evt, struct iommu_page_response *msg); int (*def_domain_type)(struct device *dev); @@ -585,20 +589,6 @@ struct iommu_device { u32 max_pasids; }; -/** - * struct iommu_fault_event - Generic fault event - * - * Can represent recoverable faults such as a page requests or - * unrecoverable faults such as DMA or IRQ remapping faults. - * - * @fault: fault descriptor - * @list: pending fault event list, used for tracking responses - */ -struct iommu_fault_event { - struct iommu_fault fault; - struct list_head list; -}; - /** * struct iommu_fault_param - per-device IOMMU fault data * @lock: protect pending faults list @@ -731,8 +721,7 @@ extern struct iommu_group *iommu_group_get(struct device *dev); extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group); extern void iommu_group_put(struct iommu_group *group); -extern int iommu_report_device_fault(struct device *dev, - struct iommu_fault_event *evt); +extern int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); extern int iommu_page_response(struct device *dev, struct iommu_page_response *msg); @@ -1144,7 +1133,7 @@ static inline void iommu_group_put(struct iommu_group *group) } static inline -int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { return -ENODEV; } From f27d1ec59370703ee832ae7ed4e9f0cf87f0f1cf Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:19 +0800 Subject: [PATCH 491/548] iommu: Prepare for separating SVA and IOPF Move iopf_group data structure to iommu.h to make it a minimal set of faults that a domain's page fault handler should handle. Add a new function, iopf_free_group(), to free a fault group after all faults in the group are handled. This function will be made global so that it can be called from other files, such as iommu-sva.c. Move iopf_queue data structure to iommu.h to allow the workqueue to be scheduled out of this file. This will simplify the sequential patches. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Yi Liu Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 24b5d268b5ab95c12b5ae58a054d04bfa442f58f) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 39 ++++++++++++++------------------------ include/linux/iommu.h | 20 ++++++++++++++++++- 2 files changed, 33 insertions(+), 26 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 10d48eb72608c..c7e6bbed5c05b 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -13,24 +13,17 @@ #include "iommu-sva.h" -/** - * struct iopf_queue - IO Page Fault queue - * @wq: the fault workqueue - * @devices: devices attached to this queue - * @lock: protects the device list - */ -struct iopf_queue { - struct workqueue_struct *wq; - struct list_head devices; - struct mutex lock; -}; - -struct iopf_group { - struct iopf_fault last_fault; - struct list_head faults; - struct work_struct work; - struct device *dev; -}; +static void iopf_free_group(struct iopf_group *group) +{ + struct iopf_fault *iopf, *next; + + list_for_each_entry_safe(iopf, next, &group->faults, list) { + if (!(iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) + kfree(iopf); + } + + kfree(group); +} static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf, enum iommu_page_response_code status) @@ -50,9 +43,9 @@ static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf, static void iopf_handler(struct work_struct *work) { + struct iopf_fault *iopf; struct iopf_group *group; struct iommu_domain *domain; - struct iopf_fault *iopf, *next; enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS; group = container_of(work, struct iopf_group, work); @@ -61,7 +54,7 @@ static void iopf_handler(struct work_struct *work) if (!domain || !domain->iopf_handler) status = IOMMU_PAGE_RESP_INVALID; - list_for_each_entry_safe(iopf, next, &group->faults, list) { + list_for_each_entry(iopf, &group->faults, list) { /* * For the moment, errors are sticky: don't handle subsequent * faults in the group if there is an error. @@ -69,14 +62,10 @@ static void iopf_handler(struct work_struct *work) if (status == IOMMU_PAGE_RESP_SUCCESS) status = domain->iopf_handler(&iopf->fault, domain->fault_data); - - if (!(iopf->fault.prm.flags & - IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) - kfree(iopf); } iopf_complete_group(group->dev, &group->last_fault, status); - kfree(group); + iopf_free_group(group); } /** diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 52c63cde47245..bbc6931acf677 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -41,7 +41,6 @@ struct iommu_dirty_ops; struct notifier_block; struct iommu_sva; struct iommu_dma_cookie; -struct iopf_queue; #define IOMMU_FAULT_PERM_READ (1 << 0) /* read */ #define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */ @@ -126,6 +125,25 @@ struct iopf_fault { struct list_head list; }; +struct iopf_group { + struct iopf_fault last_fault; + struct list_head faults; + struct work_struct work; + struct device *dev; +}; + +/** + * struct iopf_queue - IO Page Fault queue + * @wq: the fault workqueue + * @devices: devices attached to this queue + * @lock: protects the device list + */ +struct iopf_queue { + struct workqueue_struct *wq; + struct list_head devices; + struct mutex lock; +}; + /* iommu fault flags */ #define IOMMU_FAULT_READ 0x0 #define IOMMU_FAULT_WRITE 0x1 From 1d00ac2984162df2804b25eff43e5f04be8777ff Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:20 +0800 Subject: [PATCH 492/548] iommu: Make iommu_queue_iopf() more generic Make iommu_queue_iopf() more generic by making the iopf_group a minimal set of iopf's that an iopf handler of domain should handle and respond to. Add domain parameter to struct iopf_group so that the handler can retrieve and use it directly. Change iommu_queue_iopf() to forward groups of iopf's to the domain's iopf handler. This is also a necessary step to decouple the sva iopf handling code from this interface. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Reviewed-by: Yi Liu Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 351ffcb11ca0ff64e399982e279cfa131e7cb1aa) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 68 +++++++++++++++++++++++++++++++------- drivers/iommu/iommu-sva.c | 3 +- drivers/iommu/iommu-sva.h | 6 ++-- include/linux/iommu.h | 4 +-- 4 files changed, 61 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index c7e6bbed5c05b..13cd0929e7661 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -13,6 +13,9 @@ #include "iommu-sva.h" +enum iommu_page_response_code +iommu_sva_handle_mm(struct iommu_fault *fault, struct mm_struct *mm); + static void iopf_free_group(struct iopf_group *group) { struct iopf_fault *iopf, *next; @@ -45,29 +48,48 @@ static void iopf_handler(struct work_struct *work) { struct iopf_fault *iopf; struct iopf_group *group; - struct iommu_domain *domain; enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS; group = container_of(work, struct iopf_group, work); - domain = iommu_get_domain_for_dev_pasid(group->dev, - group->last_fault.fault.prm.pasid, 0); - if (!domain || !domain->iopf_handler) - status = IOMMU_PAGE_RESP_INVALID; - list_for_each_entry(iopf, &group->faults, list) { /* * For the moment, errors are sticky: don't handle subsequent * faults in the group if there is an error. */ - if (status == IOMMU_PAGE_RESP_SUCCESS) - status = domain->iopf_handler(&iopf->fault, - domain->fault_data); + if (status != IOMMU_PAGE_RESP_SUCCESS) + break; + + status = iommu_sva_handle_mm(&iopf->fault, group->domain->mm); } iopf_complete_group(group->dev, &group->last_fault, status); iopf_free_group(group); } +static struct iommu_domain *get_domain_for_iopf(struct device *dev, + struct iommu_fault *fault) +{ + struct iommu_domain *domain; + + if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) { + domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0); + if (IS_ERR(domain)) + domain = NULL; + } else { + domain = iommu_get_domain_for_dev(dev); + } + + if (!domain || !domain->iopf_handler) { + dev_warn_ratelimited(dev, + "iopf (pasid %d) without domain attached or handler installed\n", + fault->prm.pasid); + + return NULL; + } + + return domain; +} + /** * iommu_queue_iopf - IO Page Fault handler * @fault: fault event @@ -112,6 +134,7 @@ int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) { int ret; struct iopf_group *group; + struct iommu_domain *domain; struct iopf_fault *iopf, *next; struct iommu_fault_param *iopf_param; struct dev_iommu *param = dev->iommu; @@ -143,6 +166,12 @@ int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) return 0; } + domain = get_domain_for_iopf(dev, fault); + if (!domain) { + ret = -EINVAL; + goto cleanup_partial; + } + group = kzalloc(sizeof(*group), GFP_KERNEL); if (!group) { /* @@ -157,8 +186,8 @@ int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) group->dev = dev; group->last_fault.fault = *fault; INIT_LIST_HEAD(&group->faults); + group->domain = domain; list_add(&group->last_fault.list, &group->faults); - INIT_WORK(&group->work, iopf_handler); /* See if we have partial faults for this group */ list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { @@ -167,9 +196,13 @@ int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) list_move(&iopf->list, &group->faults); } - queue_work(iopf_param->queue->wq, &group->work); - return 0; + mutex_unlock(&iopf_param->lock); + ret = domain->iopf_handler(group); + mutex_lock(&iopf_param->lock); + if (ret) + iopf_free_group(group); + return ret; cleanup_partial: list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { if (iopf->fault.prm.grpid == fault->prm.grpid) { @@ -181,6 +214,17 @@ int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) } EXPORT_SYMBOL_GPL(iommu_queue_iopf); +int iommu_sva_handle_iopf(struct iopf_group *group) +{ + struct iommu_fault_param *fault_param = group->dev->iommu->fault_param; + + INIT_WORK(&group->work, iopf_handler); + if (!queue_work(fault_param->queue->wq, &group->work)) + return -EBUSY; + + return 0; +} + /** * iopf_queue_flush_dev - Ensure that all queued faults have been processed * @dev: the endpoint whose faults need to be flushed. diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 5175e8d85247b..d247d5330bb97 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -162,11 +162,10 @@ EXPORT_SYMBOL_GPL(iommu_sva_get_pasid); * I/O page fault handler for SVA */ enum iommu_page_response_code -iommu_sva_handle_iopf(struct iommu_fault *fault, void *data) +iommu_sva_handle_mm(struct iommu_fault *fault, struct mm_struct *mm) { vm_fault_t ret; struct vm_area_struct *vma; - struct mm_struct *mm = data; unsigned int access_flags = 0; unsigned int fault_flags = FAULT_FLAG_REMOTE; struct iommu_fault_page_request *prm = &fault->prm; diff --git a/drivers/iommu/iommu-sva.h b/drivers/iommu/iommu-sva.h index de7819c796cee..27c8da115b418 100644 --- a/drivers/iommu/iommu-sva.h +++ b/drivers/iommu/iommu-sva.h @@ -22,8 +22,7 @@ int iopf_queue_flush_dev(struct device *dev); struct iopf_queue *iopf_queue_alloc(const char *name); void iopf_queue_free(struct iopf_queue *queue); int iopf_queue_discard_partial(struct iopf_queue *queue); -enum iommu_page_response_code -iommu_sva_handle_iopf(struct iommu_fault *fault, void *data); +int iommu_sva_handle_iopf(struct iopf_group *group); #else /* CONFIG_IOMMU_SVA */ static inline int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) @@ -62,8 +61,7 @@ static inline int iopf_queue_discard_partial(struct iopf_queue *queue) return -ENODEV; } -static inline enum iommu_page_response_code -iommu_sva_handle_iopf(struct iommu_fault *fault, void *data) +static inline int iommu_sva_handle_iopf(struct iopf_group *group) { return IOMMU_PAGE_RESP_INVALID; } diff --git a/include/linux/iommu.h b/include/linux/iommu.h index bbc6931acf677..9db575baebeff 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -130,6 +130,7 @@ struct iopf_group { struct list_head faults; struct work_struct work; struct device *dev; + struct iommu_domain *domain; }; /** @@ -209,8 +210,7 @@ struct iommu_domain { unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */ struct iommu_domain_geometry geometry; struct iommu_dma_cookie *iova_cookie; - enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault, - void *data); + int (*iopf_handler)(struct iopf_group *group); void *fault_data; union { struct { From 5e40500d17775d47e4f6c713c2fe156cf07229bd Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:21 +0800 Subject: [PATCH 493/548] iommu: Separate SVA and IOPF Add CONFIG_IOMMU_IOPF for page fault handling framework and select it from its real consumer. Move iopf function declaration from iommu-sva.h to iommu.h and remove iommu-sva.h as it's empty now. Consolidate all SVA related code into iommu-sva.c: - Move iommu_sva_domain_alloc() from iommu.c to iommu-sva.c. - Move sva iopf handling code from io-pgfault.c to iommu-sva.c. Consolidate iommu_report_device_fault() and iommu_page_response() into io-pgfault.c. Export iopf_free_group() and iopf_group_response() for iopf handlers implemented in modules. Some functions are renamed with more meaningful names. No other intentional functionality changes. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 17c51a0ea36b800e7a5998a92d83016c82935dff) Signed-off-by: Koba Ko --- drivers/iommu/Kconfig | 4 + drivers/iommu/Makefile | 3 +- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 1 - drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 1 - drivers/iommu/intel/Kconfig | 1 + drivers/iommu/intel/iommu.c | 1 - drivers/iommu/intel/svm.c | 1 - drivers/iommu/io-pgfault.c | 188 +++++++++++++----- drivers/iommu/iommu-sva.c | 68 ++++++- drivers/iommu/iommu-sva.h | 69 ------- drivers/iommu/iommu.c | 134 ------------- include/linux/iommu.h | 98 ++++++--- 12 files changed, 277 insertions(+), 292 deletions(-) delete mode 100644 drivers/iommu/iommu-sva.h diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index d648fe0627eba..055059ceb34a2 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -163,6 +163,9 @@ config IOMMU_SVA select IOMMU_MM_DATA bool +config IOMMU_IOPF + bool + config FSL_PAMU bool "Freescale IOMMU support" depends on PCI @@ -397,6 +400,7 @@ config ARM_SMMU_V3_SVA bool "Shared Virtual Addressing support for the ARM SMMUv3" depends on ARM_SMMU_V3 select IOMMU_SVA + select IOMMU_IOPF select MMU_NOTIFIER help Support for sharing process address spaces with devices using the diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 95ad9dbfbda02..542760d963ec7 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -26,6 +26,7 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o obj-$(CONFIG_S390_IOMMU) += s390-iommu.o obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o -obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o +obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o +obj-$(CONFIG_IOMMU_IOPF) += io-pgfault.o obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o obj-$(CONFIG_APPLE_DART) += apple-dart.o diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 547e77121a3cd..2cd433a9c8a0f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -10,7 +10,6 @@ #include #include "arm-smmu-v3.h" -#include "../../iommu-sva.h" #include "../../io-pgtable-arm.h" struct arm_smmu_mmu_notifier { diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 66f11a0720259..f005a0f5206ba 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -29,7 +29,6 @@ #include "arm-smmu-v3.h" #include "../../dma-iommu.h" -#include "../../iommu-sva.h" static bool disable_bypass = true; module_param(disable_bypass, bool, 0444); diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig index f5348b80652b6..f6f46b1b3dadd 100644 --- a/drivers/iommu/intel/Kconfig +++ b/drivers/iommu/intel/Kconfig @@ -51,6 +51,7 @@ config INTEL_IOMMU_SVM depends on X86_64 select MMU_NOTIFIER select IOMMU_SVA + select IOMMU_IOPF help Shared Virtual Memory (SVM) provides a facility for devices to access DMA resources through process address space by diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 2bac09d04ae11..8c8a46736f85b 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -27,7 +27,6 @@ #include "iommu.h" #include "../dma-iommu.h" #include "../irq_remapping.h" -#include "../iommu-sva.h" #include "pasid.h" #include "cap_audit.h" #include "perfmon.h" diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index e262a4304ab42..4da09a60cba6a 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -22,7 +22,6 @@ #include "iommu.h" #include "pasid.h" #include "perf.h" -#include "../iommu-sva.h" #include "trace.h" static irqreturn_t prq_event_thread(int irq, void *d); diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 13cd0929e7661..c1e88da973cef 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -11,12 +11,9 @@ #include #include -#include "iommu-sva.h" +#include "iommu-priv.h" -enum iommu_page_response_code -iommu_sva_handle_mm(struct iommu_fault *fault, struct mm_struct *mm); - -static void iopf_free_group(struct iopf_group *group) +void iopf_free_group(struct iopf_group *group) { struct iopf_fault *iopf, *next; @@ -27,44 +24,7 @@ static void iopf_free_group(struct iopf_group *group) kfree(group); } - -static int iopf_complete_group(struct device *dev, struct iopf_fault *iopf, - enum iommu_page_response_code status) -{ - struct iommu_page_response resp = { - .pasid = iopf->fault.prm.pasid, - .grpid = iopf->fault.prm.grpid, - .code = status, - }; - - if ((iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) && - (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID)) - resp.flags = IOMMU_PAGE_RESP_PASID_VALID; - - return iommu_page_response(dev, &resp); -} - -static void iopf_handler(struct work_struct *work) -{ - struct iopf_fault *iopf; - struct iopf_group *group; - enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS; - - group = container_of(work, struct iopf_group, work); - list_for_each_entry(iopf, &group->faults, list) { - /* - * For the moment, errors are sticky: don't handle subsequent - * faults in the group if there is an error. - */ - if (status != IOMMU_PAGE_RESP_SUCCESS) - break; - - status = iommu_sva_handle_mm(&iopf->fault, group->domain->mm); - } - - iopf_complete_group(group->dev, &group->last_fault, status); - iopf_free_group(group); -} +EXPORT_SYMBOL_GPL(iopf_free_group); static struct iommu_domain *get_domain_for_iopf(struct device *dev, struct iommu_fault *fault) @@ -91,7 +51,7 @@ static struct iommu_domain *get_domain_for_iopf(struct device *dev, } /** - * iommu_queue_iopf - IO Page Fault handler + * iommu_handle_iopf - IO Page Fault handler * @fault: fault event * @dev: struct device. * @@ -130,7 +90,7 @@ static struct iommu_domain *get_domain_for_iopf(struct device *dev, * * Return: 0 on success and <0 on error. */ -int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) +static int iommu_handle_iopf(struct iommu_fault *fault, struct device *dev) { int ret; struct iopf_group *group; @@ -212,18 +172,117 @@ int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) } return ret; } -EXPORT_SYMBOL_GPL(iommu_queue_iopf); -int iommu_sva_handle_iopf(struct iopf_group *group) +/** + * iommu_report_device_fault() - Report fault event to device driver + * @dev: the device + * @evt: fault event data + * + * Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ + * handler. When this function fails and the fault is recoverable, it is the + * caller's responsibility to complete the fault. + * + * Return 0 on success, or an error. + */ +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { - struct iommu_fault_param *fault_param = group->dev->iommu->fault_param; + struct dev_iommu *param = dev->iommu; + struct iopf_fault *evt_pending = NULL; + struct iommu_fault_param *fparam; + int ret = 0; - INIT_WORK(&group->work, iopf_handler); - if (!queue_work(fault_param->queue->wq, &group->work)) - return -EBUSY; + if (!param || !evt) + return -EINVAL; - return 0; + /* we only report device fault if there is a handler registered */ + mutex_lock(¶m->lock); + fparam = param->fault_param; + + if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && + (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { + evt_pending = kmemdup(evt, sizeof(struct iopf_fault), + GFP_KERNEL); + if (!evt_pending) { + ret = -ENOMEM; + goto done_unlock; + } + mutex_lock(&fparam->lock); + list_add_tail(&evt_pending->list, &fparam->faults); + mutex_unlock(&fparam->lock); + } + + ret = iommu_handle_iopf(&evt->fault, dev); + if (ret && evt_pending) { + mutex_lock(&fparam->lock); + list_del(&evt_pending->list); + mutex_unlock(&fparam->lock); + kfree(evt_pending); + } +done_unlock: + mutex_unlock(¶m->lock); + return ret; +} +EXPORT_SYMBOL_GPL(iommu_report_device_fault); + +int iommu_page_response(struct device *dev, + struct iommu_page_response *msg) +{ + bool needs_pasid; + int ret = -EINVAL; + struct iopf_fault *evt; + struct iommu_fault_page_request *prm; + struct dev_iommu *param = dev->iommu; + const struct iommu_ops *ops = dev_iommu_ops(dev); + bool has_pasid = msg->flags & IOMMU_PAGE_RESP_PASID_VALID; + + if (!ops->page_response) + return -ENODEV; + + if (!param || !param->fault_param) + return -EINVAL; + + /* Only send response if there is a fault report pending */ + mutex_lock(¶m->fault_param->lock); + if (list_empty(¶m->fault_param->faults)) { + dev_warn_ratelimited(dev, "no pending PRQ, drop response\n"); + goto done_unlock; + } + /* + * Check if we have a matching page request pending to respond, + * otherwise return -EINVAL + */ + list_for_each_entry(evt, ¶m->fault_param->faults, list) { + prm = &evt->fault.prm; + if (prm->grpid != msg->grpid) + continue; + + /* + * If the PASID is required, the corresponding request is + * matched using the group ID, the PASID valid bit and the PASID + * value. Otherwise only the group ID matches request and + * response. + */ + needs_pasid = prm->flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID; + if (needs_pasid && (!has_pasid || msg->pasid != prm->pasid)) + continue; + + if (!needs_pasid && has_pasid) { + /* No big deal, just clear it. */ + msg->flags &= ~IOMMU_PAGE_RESP_PASID_VALID; + msg->pasid = 0; + } + + ret = ops->page_response(dev, evt, msg); + list_del(&evt->list); + kfree(evt); + break; + } + +done_unlock: + mutex_unlock(¶m->fault_param->lock); + return ret; } +EXPORT_SYMBOL_GPL(iommu_page_response); /** * iopf_queue_flush_dev - Ensure that all queued faults have been processed @@ -258,6 +317,31 @@ int iopf_queue_flush_dev(struct device *dev) } EXPORT_SYMBOL_GPL(iopf_queue_flush_dev); +/** + * iopf_group_response - Respond a group of page faults + * @group: the group of faults with the same group id + * @status: the response code + * + * Return 0 on success and <0 on error. + */ +int iopf_group_response(struct iopf_group *group, + enum iommu_page_response_code status) +{ + struct iopf_fault *iopf = &group->last_fault; + struct iommu_page_response resp = { + .pasid = iopf->fault.prm.pasid, + .grpid = iopf->fault.prm.grpid, + .code = status, + }; + + if ((iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) && + (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID)) + resp.flags = IOMMU_PAGE_RESP_PASID_VALID; + + return iommu_page_response(group->dev, &resp); +} +EXPORT_SYMBOL_GPL(iopf_group_response); + /** * iopf_queue_discard_partial - Remove all pending partial fault * @queue: the queue whose partial faults need to be discarded diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index d247d5330bb97..d061271e30cfb 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -7,7 +7,7 @@ #include #include -#include "iommu-sva.h" +#include "iommu-priv.h" static DEFINE_MUTEX(iommu_sva_lock); @@ -158,10 +158,21 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle) } EXPORT_SYMBOL_GPL(iommu_sva_get_pasid); +void mm_pasid_drop(struct mm_struct *mm) +{ + struct iommu_mm_data *iommu_mm = mm->iommu_mm; + + if (!iommu_mm) + return; + + iommu_free_global_pasid(iommu_mm->pasid); + kfree(iommu_mm); +} + /* * I/O page fault handler for SVA */ -enum iommu_page_response_code +static enum iommu_page_response_code iommu_sva_handle_mm(struct iommu_fault *fault, struct mm_struct *mm) { vm_fault_t ret; @@ -215,13 +226,54 @@ iommu_sva_handle_mm(struct iommu_fault *fault, struct mm_struct *mm) return status; } -void mm_pasid_drop(struct mm_struct *mm) +static void iommu_sva_handle_iopf(struct work_struct *work) { - struct iommu_mm_data *iommu_mm = mm->iommu_mm; + struct iopf_fault *iopf; + struct iopf_group *group; + enum iommu_page_response_code status = IOMMU_PAGE_RESP_SUCCESS; + + group = container_of(work, struct iopf_group, work); + list_for_each_entry(iopf, &group->faults, list) { + /* + * For the moment, errors are sticky: don't handle subsequent + * faults in the group if there is an error. + */ + if (status != IOMMU_PAGE_RESP_SUCCESS) + break; + + status = iommu_sva_handle_mm(&iopf->fault, group->domain->mm); + } - if (!iommu_mm) - return; + iopf_group_response(group, status); + iopf_free_group(group); +} - iommu_free_global_pasid(iommu_mm->pasid); - kfree(iommu_mm); +static int iommu_sva_iopf_handler(struct iopf_group *group) +{ + struct iommu_fault_param *fault_param = group->dev->iommu->fault_param; + + INIT_WORK(&group->work, iommu_sva_handle_iopf); + if (!queue_work(fault_param->queue->wq, &group->work)) + return -EBUSY; + + return 0; +} + +struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm) +{ + const struct iommu_ops *ops = dev_iommu_ops(dev); + struct iommu_domain *domain; + + domain = ops->domain_alloc(IOMMU_DOMAIN_SVA); + if (!domain) + return NULL; + + domain->type = IOMMU_DOMAIN_SVA; + mmgrab(mm); + domain->mm = mm; + domain->owner = ops; + domain->iopf_handler = iommu_sva_iopf_handler; + + return domain; } diff --git a/drivers/iommu/iommu-sva.h b/drivers/iommu/iommu-sva.h deleted file mode 100644 index 27c8da115b418..0000000000000 --- a/drivers/iommu/iommu-sva.h +++ /dev/null @@ -1,69 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * SVA library for IOMMU drivers - */ -#ifndef _IOMMU_SVA_H -#define _IOMMU_SVA_H - -#include - -/* I/O Page fault */ -struct device; -struct iommu_fault; -struct iopf_queue; - -#ifdef CONFIG_IOMMU_SVA -int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev); - -int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev); -int iopf_queue_remove_device(struct iopf_queue *queue, - struct device *dev); -int iopf_queue_flush_dev(struct device *dev); -struct iopf_queue *iopf_queue_alloc(const char *name); -void iopf_queue_free(struct iopf_queue *queue); -int iopf_queue_discard_partial(struct iopf_queue *queue); -int iommu_sva_handle_iopf(struct iopf_group *group); - -#else /* CONFIG_IOMMU_SVA */ -static inline int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev) -{ - return -ENODEV; -} - -static inline int iopf_queue_add_device(struct iopf_queue *queue, - struct device *dev) -{ - return -ENODEV; -} - -static inline int iopf_queue_remove_device(struct iopf_queue *queue, - struct device *dev) -{ - return -ENODEV; -} - -static inline int iopf_queue_flush_dev(struct device *dev) -{ - return -ENODEV; -} - -static inline struct iopf_queue *iopf_queue_alloc(const char *name) -{ - return NULL; -} - -static inline void iopf_queue_free(struct iopf_queue *queue) -{ -} - -static inline int iopf_queue_discard_partial(struct iopf_queue *queue) -{ - return -ENODEV; -} - -static inline int iommu_sva_handle_iopf(struct iopf_group *group) -{ - return IOMMU_PAGE_RESP_INVALID; -} -#endif /* CONFIG_IOMMU_SVA */ -#endif /* _IOMMU_SVA_H */ diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index aaf7ec954d390..fd4514800d81e 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -36,9 +36,6 @@ #include "dma-iommu.h" #include "iommu-priv.h" -#include "iommu-sva.h" -#include "iommu-priv.h" - static struct kset *iommu_group_kset; static DEFINE_IDA(iommu_group_ida); static DEFINE_IDA(iommu_global_pasid_ida); @@ -1372,117 +1369,6 @@ void iommu_group_put(struct iommu_group *group) } EXPORT_SYMBOL_GPL(iommu_group_put); -/** - * iommu_report_device_fault() - Report fault event to device driver - * @dev: the device - * @evt: fault event data - * - * Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ - * handler. When this function fails and the fault is recoverable, it is the - * caller's responsibility to complete the fault. - * - * Return 0 on success, or an error. - */ -int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) -{ - struct dev_iommu *param = dev->iommu; - struct iopf_fault *evt_pending = NULL; - struct iommu_fault_param *fparam; - int ret = 0; - - if (!param || !evt) - return -EINVAL; - - /* we only report device fault if there is a handler registered */ - mutex_lock(¶m->lock); - fparam = param->fault_param; - - if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && - (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { - evt_pending = kmemdup(evt, sizeof(struct iopf_fault), - GFP_KERNEL); - if (!evt_pending) { - ret = -ENOMEM; - goto done_unlock; - } - mutex_lock(&fparam->lock); - list_add_tail(&evt_pending->list, &fparam->faults); - mutex_unlock(&fparam->lock); - } - - ret = iommu_queue_iopf(&evt->fault, dev); - if (ret && evt_pending) { - mutex_lock(&fparam->lock); - list_del(&evt_pending->list); - mutex_unlock(&fparam->lock); - kfree(evt_pending); - } -done_unlock: - mutex_unlock(¶m->lock); - return ret; -} -EXPORT_SYMBOL_GPL(iommu_report_device_fault); - -int iommu_page_response(struct device *dev, - struct iommu_page_response *msg) -{ - bool needs_pasid; - int ret = -EINVAL; - struct iopf_fault *evt; - struct iommu_fault_page_request *prm; - struct dev_iommu *param = dev->iommu; - const struct iommu_ops *ops = dev_iommu_ops(dev); - bool has_pasid = msg->flags & IOMMU_PAGE_RESP_PASID_VALID; - - if (!ops->page_response) - return -ENODEV; - - if (!param || !param->fault_param) - return -EINVAL; - - /* Only send response if there is a fault report pending */ - mutex_lock(¶m->fault_param->lock); - if (list_empty(¶m->fault_param->faults)) { - dev_warn_ratelimited(dev, "no pending PRQ, drop response\n"); - goto done_unlock; - } - /* - * Check if we have a matching page request pending to respond, - * otherwise return -EINVAL - */ - list_for_each_entry(evt, ¶m->fault_param->faults, list) { - prm = &evt->fault.prm; - if (prm->grpid != msg->grpid) - continue; - - /* - * If the PASID is required, the corresponding request is - * matched using the group ID, the PASID valid bit and the PASID - * value. Otherwise only the group ID matches request and - * response. - */ - needs_pasid = prm->flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID; - if (needs_pasid && (!has_pasid || msg->pasid != prm->pasid)) - continue; - - if (!needs_pasid && has_pasid) { - /* No big deal, just clear it. */ - msg->flags &= ~IOMMU_PAGE_RESP_PASID_VALID; - msg->pasid = 0; - } - - ret = ops->page_response(dev, evt, msg); - list_del(&evt->list); - kfree(evt); - break; - } - -done_unlock: - mutex_unlock(¶m->fault_param->lock); - return ret; -} -EXPORT_SYMBOL_GPL(iommu_page_response); - /** * iommu_group_id - Return ID for a group * @group: the group to ID @@ -3629,26 +3515,6 @@ struct iommu_domain *iommu_get_domain_for_dev_pasid(struct device *dev, } EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_pasid); -struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, - struct mm_struct *mm) -{ - const struct iommu_ops *ops = dev_iommu_ops(dev); - struct iommu_domain *domain; - - domain = ops->domain_alloc(IOMMU_DOMAIN_SVA); - if (!domain) - return NULL; - - domain->type = IOMMU_DOMAIN_SVA; - mmgrab(mm); - domain->mm = mm; - domain->owner = ops; - domain->iopf_handler = iommu_sva_handle_iopf; - domain->fault_data = mm; - - return domain; -} - ioasid_t iommu_alloc_global_pasid(struct device *dev) { int ret; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 9db575baebeff..e39b7778620ee 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -739,10 +739,6 @@ extern struct iommu_group *iommu_group_get(struct device *dev); extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group); extern void iommu_group_put(struct iommu_group *group); -extern int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); -extern int iommu_page_response(struct device *dev, - struct iommu_page_response *msg); - extern int iommu_group_id(struct iommu_group *group); extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *); @@ -960,8 +956,6 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group); int iommu_device_claim_dma_owner(struct device *dev, void *owner); void iommu_device_release_dma_owner(struct device *dev); -struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, - struct mm_struct *mm); int iommu_attach_device_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t pasid); void iommu_detach_device_pasid(struct iommu_domain *domain, @@ -1150,18 +1144,6 @@ static inline void iommu_group_put(struct iommu_group *group) { } -static inline -int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) -{ - return -ENODEV; -} - -static inline int iommu_page_response(struct device *dev, - struct iommu_page_response *msg) -{ - return -ENODEV; -} - static inline int iommu_group_id(struct iommu_group *group) { return -ENODEV; @@ -1310,12 +1292,6 @@ static inline int iommu_device_claim_dma_owner(struct device *dev, void *owner) return -ENODEV; } -static inline struct iommu_domain * -iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm) -{ - return NULL; -} - static inline int iommu_attach_device_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t pasid) { @@ -1463,6 +1439,8 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm); void iommu_sva_unbind_device(struct iommu_sva *handle); u32 iommu_sva_get_pasid(struct iommu_sva *handle); +struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm); #else static inline struct iommu_sva * iommu_sva_bind_device(struct device *dev, struct mm_struct *mm) @@ -1487,6 +1465,78 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm) } static inline void mm_pasid_drop(struct mm_struct *mm) {} + +static inline struct iommu_domain * +iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm) +{ + return NULL; +} #endif /* CONFIG_IOMMU_SVA */ +#ifdef CONFIG_IOMMU_IOPF +int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev); +int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev); +int iopf_queue_flush_dev(struct device *dev); +struct iopf_queue *iopf_queue_alloc(const char *name); +void iopf_queue_free(struct iopf_queue *queue); +int iopf_queue_discard_partial(struct iopf_queue *queue); +void iopf_free_group(struct iopf_group *group); +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); +int iommu_page_response(struct device *dev, struct iommu_page_response *msg); +int iopf_group_response(struct iopf_group *group, + enum iommu_page_response_code status); +#else +static inline int +iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) +{ + return -ENODEV; +} + +static inline int +iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) +{ + return -ENODEV; +} + +static inline int iopf_queue_flush_dev(struct device *dev) +{ + return -ENODEV; +} + +static inline struct iopf_queue *iopf_queue_alloc(const char *name) +{ + return NULL; +} + +static inline void iopf_queue_free(struct iopf_queue *queue) +{ +} + +static inline int iopf_queue_discard_partial(struct iopf_queue *queue) +{ + return -ENODEV; +} + +static inline void iopf_free_group(struct iopf_group *group) +{ +} + +static inline int +iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) +{ + return -ENODEV; +} + +static inline int +iommu_page_response(struct device *dev, struct iommu_page_response *msg) +{ + return -ENODEV; +} + +static inline int iopf_group_response(struct iopf_group *group, + enum iommu_page_response_code status) +{ + return -ENODEV; +} +#endif /* CONFIG_IOMMU_IOPF */ #endif /* __LINUX_IOMMU_H */ From 06085b6d8c302dea8006298fe84fa7110d513276 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:22 +0800 Subject: [PATCH 494/548] iommu: Refine locking for per-device fault data management The per-device fault data is a data structure that is used to store information about faults that occur on a device. This data is allocated when IOPF is enabled on the device and freed when IOPF is disabled. The data is used in the paths of iopf reporting, handling, responding, and draining. The fault data is protected by two locks: - dev->iommu->lock: This lock is used to protect the allocation and freeing of the fault data. - dev->iommu->fault_parameter->lock: This lock is used to protect the fault data itself. Apply the locking mechanism to the fault reporting and responding paths. The fault_parameter->lock is also added in iopf_queue_discard_partial(). It does not fix any real issue, as iopf_queue_discard_partial() is only used in the VT-d driver's prq_event_thread(), which is a single-threaded path that reports the IOPFs. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Yan Zhao Tested-by: Longfang Liu Link: https://lore.kernel.org/r/20240212012227.119381-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit cc7338e9d807e20e60e6720a62956f0e9d46f0f8) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 61 +++++++++++++++++++------------------- 1 file changed, 30 insertions(+), 31 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index c1e88da973cef..5aea8402be476 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -53,7 +53,7 @@ static struct iommu_domain *get_domain_for_iopf(struct device *dev, /** * iommu_handle_iopf - IO Page Fault handler * @fault: fault event - * @dev: struct device. + * @iopf_param: the fault parameter of the device. * * Add a fault to the device workqueue, to be handled by mm. * @@ -90,29 +90,21 @@ static struct iommu_domain *get_domain_for_iopf(struct device *dev, * * Return: 0 on success and <0 on error. */ -static int iommu_handle_iopf(struct iommu_fault *fault, struct device *dev) +static int iommu_handle_iopf(struct iommu_fault *fault, + struct iommu_fault_param *iopf_param) { int ret; struct iopf_group *group; struct iommu_domain *domain; struct iopf_fault *iopf, *next; - struct iommu_fault_param *iopf_param; - struct dev_iommu *param = dev->iommu; + struct device *dev = iopf_param->dev; - lockdep_assert_held(¶m->lock); + lockdep_assert_held(&iopf_param->lock); if (fault->type != IOMMU_FAULT_PAGE_REQ) /* Not a recoverable page fault */ return -EOPNOTSUPP; - /* - * As long as we're holding param->lock, the queue can't be unlinked - * from the device and therefore cannot disappear. - */ - iopf_param = param->fault_param; - if (!iopf_param) - return -ENODEV; - if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { iopf = kzalloc(sizeof(*iopf), GFP_KERNEL); if (!iopf) @@ -186,18 +178,19 @@ static int iommu_handle_iopf(struct iommu_fault *fault, struct device *dev) */ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { - struct dev_iommu *param = dev->iommu; + struct iommu_fault_param *fault_param; struct iopf_fault *evt_pending = NULL; - struct iommu_fault_param *fparam; + struct dev_iommu *param = dev->iommu; int ret = 0; - if (!param || !evt) - return -EINVAL; - - /* we only report device fault if there is a handler registered */ mutex_lock(¶m->lock); - fparam = param->fault_param; + fault_param = param->fault_param; + if (!fault_param) { + mutex_unlock(¶m->lock); + return -EINVAL; + } + mutex_lock(&fault_param->lock); if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { evt_pending = kmemdup(evt, sizeof(struct iopf_fault), @@ -206,20 +199,18 @@ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) ret = -ENOMEM; goto done_unlock; } - mutex_lock(&fparam->lock); - list_add_tail(&evt_pending->list, &fparam->faults); - mutex_unlock(&fparam->lock); + list_add_tail(&evt_pending->list, &fault_param->faults); } - ret = iommu_handle_iopf(&evt->fault, dev); + ret = iommu_handle_iopf(&evt->fault, fault_param); if (ret && evt_pending) { - mutex_lock(&fparam->lock); list_del(&evt_pending->list); - mutex_unlock(&fparam->lock); kfree(evt_pending); } done_unlock: + mutex_unlock(&fault_param->lock); mutex_unlock(¶m->lock); + return ret; } EXPORT_SYMBOL_GPL(iommu_report_device_fault); @@ -232,18 +223,23 @@ int iommu_page_response(struct device *dev, struct iopf_fault *evt; struct iommu_fault_page_request *prm; struct dev_iommu *param = dev->iommu; + struct iommu_fault_param *fault_param; const struct iommu_ops *ops = dev_iommu_ops(dev); bool has_pasid = msg->flags & IOMMU_PAGE_RESP_PASID_VALID; if (!ops->page_response) return -ENODEV; - if (!param || !param->fault_param) + mutex_lock(¶m->lock); + fault_param = param->fault_param; + if (!fault_param) { + mutex_unlock(¶m->lock); return -EINVAL; + } /* Only send response if there is a fault report pending */ - mutex_lock(¶m->fault_param->lock); - if (list_empty(¶m->fault_param->faults)) { + mutex_lock(&fault_param->lock); + if (list_empty(&fault_param->faults)) { dev_warn_ratelimited(dev, "no pending PRQ, drop response\n"); goto done_unlock; } @@ -251,7 +247,7 @@ int iommu_page_response(struct device *dev, * Check if we have a matching page request pending to respond, * otherwise return -EINVAL */ - list_for_each_entry(evt, ¶m->fault_param->faults, list) { + list_for_each_entry(evt, &fault_param->faults, list) { prm = &evt->fault.prm; if (prm->grpid != msg->grpid) continue; @@ -279,7 +275,8 @@ int iommu_page_response(struct device *dev, } done_unlock: - mutex_unlock(¶m->fault_param->lock); + mutex_unlock(&fault_param->lock); + mutex_unlock(¶m->lock); return ret; } EXPORT_SYMBOL_GPL(iommu_page_response); @@ -362,11 +359,13 @@ int iopf_queue_discard_partial(struct iopf_queue *queue) mutex_lock(&queue->lock); list_for_each_entry(iopf_param, &queue->devices, queue_list) { + mutex_lock(&iopf_param->lock); list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { list_del(&iopf->list); kfree(iopf); } + mutex_unlock(&iopf_param->lock); } mutex_unlock(&queue->lock); return 0; From e622607588d06400ea1c1660292f605ad8550a2a Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:23 +0800 Subject: [PATCH 495/548] iommu: Use refcount for fault data access The per-device fault data structure stores information about faults occurring on a device. Its lifetime spans from IOPF enablement to disablement. Multiple paths, including IOPF reporting, handling, and responding, may access it concurrently. Previously, a mutex protected the fault data from use after free. But this is not performance friendly due to the critical nature of IOPF handling paths. Refine this with a refcount-based approach. The fault data pointer is obtained within an RCU read region with a refcount. The fault data pointer is returned for usage only when the pointer is valid and a refcount is successfully obtained. The fault data is freed with kfree_rcu(), ensuring data is only freed after all RCU critical regions complete. An iopf handling work starts once an iopf group is created. The handling work continues until iommu_page_response() is called to respond to the iopf and the iopf group is freed. During this time, the device fault parameter should always be available. Add a pointer to the device fault parameter in the iopf_group structure and hold the reference until the iopf_group is freed. Make iommu_page_response() static as it is only used in io-pgfault.c. Co-developed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Yan Zhao Link: https://lore.kernel.org/r/20240212012227.119381-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit a74c077b9021b36c785095c571336e5b204d3c2d) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 127 +++++++++++++++++++++++-------------- drivers/iommu/iommu-sva.c | 2 +- include/linux/iommu.h | 17 +++-- 3 files changed, 88 insertions(+), 58 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 5aea8402be476..ce7058892b598 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -13,6 +13,32 @@ #include "iommu-priv.h" +/* + * Return the fault parameter of a device if it exists. Otherwise, return NULL. + * On a successful return, the caller takes a reference of this parameter and + * should put it after use by calling iopf_put_dev_fault_param(). + */ +static struct iommu_fault_param *iopf_get_dev_fault_param(struct device *dev) +{ + struct dev_iommu *param = dev->iommu; + struct iommu_fault_param *fault_param; + + rcu_read_lock(); + fault_param = rcu_dereference(param->fault_param); + if (fault_param && !refcount_inc_not_zero(&fault_param->users)) + fault_param = NULL; + rcu_read_unlock(); + + return fault_param; +} + +/* Caller must hold a reference of the fault parameter. */ +static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param) +{ + if (refcount_dec_and_test(&fault_param->users)) + kfree_rcu(fault_param, rcu); +} + void iopf_free_group(struct iopf_group *group) { struct iopf_fault *iopf, *next; @@ -22,6 +48,8 @@ void iopf_free_group(struct iopf_group *group) kfree(iopf); } + /* Pair with iommu_report_device_fault(). */ + iopf_put_dev_fault_param(group->fault_param); kfree(group); } EXPORT_SYMBOL_GPL(iopf_free_group); @@ -135,7 +163,7 @@ static int iommu_handle_iopf(struct iommu_fault *fault, goto cleanup_partial; } - group->dev = dev; + group->fault_param = iopf_param; group->last_fault.fault = *fault; INIT_LIST_HEAD(&group->faults); group->domain = domain; @@ -178,64 +206,61 @@ static int iommu_handle_iopf(struct iommu_fault *fault, */ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { + bool last_prq = evt->fault.type == IOMMU_FAULT_PAGE_REQ && + (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE); struct iommu_fault_param *fault_param; - struct iopf_fault *evt_pending = NULL; - struct dev_iommu *param = dev->iommu; - int ret = 0; + struct iopf_fault *evt_pending; + int ret; - mutex_lock(¶m->lock); - fault_param = param->fault_param; - if (!fault_param) { - mutex_unlock(¶m->lock); + fault_param = iopf_get_dev_fault_param(dev); + if (!fault_param) return -EINVAL; - } mutex_lock(&fault_param->lock); - if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && - (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { + if (last_prq) { evt_pending = kmemdup(evt, sizeof(struct iopf_fault), GFP_KERNEL); if (!evt_pending) { ret = -ENOMEM; - goto done_unlock; + goto err_unlock; } list_add_tail(&evt_pending->list, &fault_param->faults); } ret = iommu_handle_iopf(&evt->fault, fault_param); - if (ret && evt_pending) { + if (ret) + goto err_free; + + mutex_unlock(&fault_param->lock); + /* The reference count of fault_param is now held by iopf_group. */ + if (!last_prq) + iopf_put_dev_fault_param(fault_param); + + return 0; +err_free: + if (last_prq) { list_del(&evt_pending->list); kfree(evt_pending); } -done_unlock: +err_unlock: mutex_unlock(&fault_param->lock); - mutex_unlock(¶m->lock); + iopf_put_dev_fault_param(fault_param); return ret; } EXPORT_SYMBOL_GPL(iommu_report_device_fault); -int iommu_page_response(struct device *dev, - struct iommu_page_response *msg) +static int iommu_page_response(struct iopf_group *group, + struct iommu_page_response *msg) { bool needs_pasid; int ret = -EINVAL; struct iopf_fault *evt; struct iommu_fault_page_request *prm; - struct dev_iommu *param = dev->iommu; - struct iommu_fault_param *fault_param; + struct device *dev = group->fault_param->dev; const struct iommu_ops *ops = dev_iommu_ops(dev); bool has_pasid = msg->flags & IOMMU_PAGE_RESP_PASID_VALID; - - if (!ops->page_response) - return -ENODEV; - - mutex_lock(¶m->lock); - fault_param = param->fault_param; - if (!fault_param) { - mutex_unlock(¶m->lock); - return -EINVAL; - } + struct iommu_fault_param *fault_param = group->fault_param; /* Only send response if there is a fault report pending */ mutex_lock(&fault_param->lock); @@ -276,10 +301,9 @@ int iommu_page_response(struct device *dev, done_unlock: mutex_unlock(&fault_param->lock); - mutex_unlock(¶m->lock); + return ret; } -EXPORT_SYMBOL_GPL(iommu_page_response); /** * iopf_queue_flush_dev - Ensure that all queued faults have been processed @@ -295,22 +319,20 @@ EXPORT_SYMBOL_GPL(iommu_page_response); */ int iopf_queue_flush_dev(struct device *dev) { - int ret = 0; struct iommu_fault_param *iopf_param; - struct dev_iommu *param = dev->iommu; - if (!param) + /* + * It's a driver bug to be here after iopf_queue_remove_device(). + * Therefore, it's safe to dereference the fault parameter without + * holding the lock. + */ + iopf_param = rcu_dereference_check(dev->iommu->fault_param, true); + if (WARN_ON(!iopf_param)) return -ENODEV; - mutex_lock(¶m->lock); - iopf_param = param->fault_param; - if (iopf_param) - flush_workqueue(iopf_param->queue->wq); - else - ret = -ENODEV; - mutex_unlock(¶m->lock); + flush_workqueue(iopf_param->queue->wq); - return ret; + return 0; } EXPORT_SYMBOL_GPL(iopf_queue_flush_dev); @@ -335,7 +357,7 @@ int iopf_group_response(struct iopf_group *group, (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID)) resp.flags = IOMMU_PAGE_RESP_PASID_VALID; - return iommu_page_response(group->dev, &resp); + return iommu_page_response(group, &resp); } EXPORT_SYMBOL_GPL(iopf_group_response); @@ -384,10 +406,15 @@ int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) int ret = 0; struct dev_iommu *param = dev->iommu; struct iommu_fault_param *fault_param; + const struct iommu_ops *ops = dev_iommu_ops(dev); + + if (!ops->page_response) + return -ENODEV; mutex_lock(&queue->lock); mutex_lock(¶m->lock); - if (param->fault_param) { + if (rcu_dereference_check(param->fault_param, + lockdep_is_held(¶m->lock))) { ret = -EBUSY; goto done_unlock; } @@ -402,10 +429,11 @@ int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) INIT_LIST_HEAD(&fault_param->faults); INIT_LIST_HEAD(&fault_param->partial); fault_param->dev = dev; + refcount_set(&fault_param->users, 1); list_add(&fault_param->queue_list, &queue->devices); fault_param->queue = queue; - param->fault_param = fault_param; + rcu_assign_pointer(param->fault_param, fault_param); done_unlock: mutex_unlock(¶m->lock); @@ -429,10 +457,12 @@ int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) int ret = 0; struct iopf_fault *iopf, *next; struct dev_iommu *param = dev->iommu; - struct iommu_fault_param *fault_param = param->fault_param; + struct iommu_fault_param *fault_param; mutex_lock(&queue->lock); mutex_lock(¶m->lock); + fault_param = rcu_dereference_check(param->fault_param, + lockdep_is_held(¶m->lock)); if (!fault_param) { ret = -ENODEV; goto unlock; @@ -454,8 +484,9 @@ int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) list_for_each_entry_safe(iopf, next, &fault_param->partial, list) kfree(iopf); - param->fault_param = NULL; - kfree(fault_param); + /* dec the ref owned by iopf_queue_add_device() */ + rcu_assign_pointer(param->fault_param, NULL); + iopf_put_dev_fault_param(fault_param); unlock: mutex_unlock(¶m->lock); mutex_unlock(&queue->lock); diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index d061271e30cfb..4616c8c2fa7d0 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -250,7 +250,7 @@ static void iommu_sva_handle_iopf(struct work_struct *work) static int iommu_sva_iopf_handler(struct iopf_group *group) { - struct iommu_fault_param *fault_param = group->dev->iommu->fault_param; + struct iommu_fault_param *fault_param = group->fault_param; INIT_WORK(&group->work, iommu_sva_handle_iopf); if (!queue_work(fault_param->queue->wq, &group->work)) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index e39b7778620ee..0e75ede024dc9 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -41,6 +41,7 @@ struct iommu_dirty_ops; struct notifier_block; struct iommu_sva; struct iommu_dma_cookie; +struct iommu_fault_param; #define IOMMU_FAULT_PERM_READ (1 << 0) /* read */ #define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */ @@ -129,8 +130,9 @@ struct iopf_group { struct iopf_fault last_fault; struct list_head faults; struct work_struct work; - struct device *dev; struct iommu_domain *domain; + /* The device's fault data parameter. */ + struct iommu_fault_param *fault_param; }; /** @@ -610,6 +612,8 @@ struct iommu_device { /** * struct iommu_fault_param - per-device IOMMU fault data * @lock: protect pending faults list + * @users: user counter to manage the lifetime of the data + * @rcu: rcu head for kfree_rcu() * @dev: the device that owns this param * @queue: IOPF queue * @queue_list: index into queue->devices @@ -619,6 +623,8 @@ struct iommu_device { */ struct iommu_fault_param { struct mutex lock; + refcount_t users; + struct rcu_head rcu; struct device *dev; struct iopf_queue *queue; @@ -645,7 +651,7 @@ struct iommu_fault_param { */ struct dev_iommu { struct mutex lock; - struct iommu_fault_param *fault_param; + struct iommu_fault_param __rcu *fault_param; struct iommu_fwspec *fwspec; struct iommu_device *iommu_dev; void *priv; @@ -1482,7 +1488,6 @@ void iopf_queue_free(struct iopf_queue *queue); int iopf_queue_discard_partial(struct iopf_queue *queue); void iopf_free_group(struct iopf_group *group); int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); -int iommu_page_response(struct device *dev, struct iommu_page_response *msg); int iopf_group_response(struct iopf_group *group, enum iommu_page_response_code status); #else @@ -1527,12 +1532,6 @@ iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) return -ENODEV; } -static inline int -iommu_page_response(struct device *dev, struct iommu_page_response *msg) -{ - return -ENODEV; -} - static inline int iopf_group_response(struct iopf_group *group, enum iommu_page_response_code status) { From 4812f2456fdf0324ee9c1f5fe815a511470a245e Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:24 +0800 Subject: [PATCH 496/548] iommu: Improve iopf_queue_remove_device() Convert iopf_queue_remove_device() to return void instead of an error code, as the return value is never used. This removal helper is designed to be never-failed, so there's no need for error handling. Ack all outstanding page requests from the device with the response code of IOMMU_PAGE_RESP_INVALID, indicating device should not attempt any retry. Add comments to this helper explaining the steps involved in removing a device from the iopf queue and disabling its PRI. The individual drivers are expected to be adjusted accordingly. Here we just define the expected behaviors of the individual iommu driver from the core's perspective. Suggested-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Tested-by: Yan Zhao Link: https://lore.kernel.org/r/20240212012227.119381-14-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 0095bf83554f8e7a681961656608101bdf40e9ef) Signed-off-by: Koba Ko --- drivers/iommu/intel/iommu.c | 7 +---- drivers/iommu/io-pgfault.c | 57 ++++++++++++++++++++++++------------- include/linux/iommu.h | 5 ++-- 3 files changed, 40 insertions(+), 29 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 8c8a46736f85b..008c396c583b1 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4656,12 +4656,7 @@ static int intel_iommu_disable_iopf(struct device *dev) */ pci_disable_pri(to_pci_dev(dev)); info->pri_enabled = 0; - - /* - * With PRI disabled and outstanding PRQs drained, removing device - * from iopf queue should never fail. - */ - WARN_ON(iopf_queue_remove_device(iommu->iopf_queue, dev)); + iopf_queue_remove_device(iommu->iopf_queue, dev); return 0; } diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index ce7058892b598..ece09552e5cf9 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -448,41 +448,60 @@ EXPORT_SYMBOL_GPL(iopf_queue_add_device); * @queue: IOPF queue * @dev: device to remove * - * Caller makes sure that no more faults are reported for this device. + * Removing a device from an iopf_queue. It's recommended to follow these + * steps when removing a device: * - * Return: 0 on success and <0 on error. + * - Disable new PRI reception: Turn off PRI generation in the IOMMU hardware + * and flush any hardware page request queues. This should be done before + * calling into this helper. + * - Acknowledge all outstanding PRQs to the device: Respond to all outstanding + * page requests with IOMMU_PAGE_RESP_INVALID, indicating the device should + * not retry. This helper function handles this. + * - Disable PRI on the device: After calling this helper, the caller could + * then disable PRI on the device. + * + * Calling iopf_queue_remove_device() essentially disassociates the device. + * The fault_param might still exist, but iommu_page_response() will do + * nothing. The device fault parameter reference count has been properly + * passed from iommu_report_device_fault() to the fault handling work, and + * will eventually be released after iommu_page_response(). */ -int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) +void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) { - int ret = 0; struct iopf_fault *iopf, *next; + struct iommu_page_response resp; struct dev_iommu *param = dev->iommu; struct iommu_fault_param *fault_param; + const struct iommu_ops *ops = dev_iommu_ops(dev); mutex_lock(&queue->lock); mutex_lock(¶m->lock); fault_param = rcu_dereference_check(param->fault_param, lockdep_is_held(¶m->lock)); - if (!fault_param) { - ret = -ENODEV; - goto unlock; - } - if (fault_param->queue != queue) { - ret = -EINVAL; + if (WARN_ON(!fault_param || fault_param->queue != queue)) goto unlock; - } - if (!list_empty(&fault_param->faults)) { - ret = -EBUSY; - goto unlock; - } + mutex_lock(&fault_param->lock); + list_for_each_entry_safe(iopf, next, &fault_param->partial, list) + kfree(iopf); - list_del(&fault_param->queue_list); + list_for_each_entry_safe(iopf, next, &fault_param->faults, list) { + memset(&resp, 0, sizeof(struct iommu_page_response)); + resp.pasid = iopf->fault.prm.pasid; + resp.grpid = iopf->fault.prm.grpid; + resp.code = IOMMU_PAGE_RESP_INVALID; - /* Just in case some faults are still stuck */ - list_for_each_entry_safe(iopf, next, &fault_param->partial, list) + if (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID) + resp.flags = IOMMU_PAGE_RESP_PASID_VALID; + + ops->page_response(dev, iopf, &resp); + list_del(&iopf->list); kfree(iopf); + } + mutex_unlock(&fault_param->lock); + + list_del(&fault_param->queue_list); /* dec the ref owned by iopf_queue_add_device() */ rcu_assign_pointer(param->fault_param, NULL); @@ -490,8 +509,6 @@ int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) unlock: mutex_unlock(¶m->lock); mutex_unlock(&queue->lock); - - return ret; } EXPORT_SYMBOL_GPL(iopf_queue_remove_device); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 0e75ede024dc9..413df55a70c7f 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1481,7 +1481,7 @@ iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm) #ifdef CONFIG_IOMMU_IOPF int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev); -int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev); +void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev); int iopf_queue_flush_dev(struct device *dev); struct iopf_queue *iopf_queue_alloc(const char *name); void iopf_queue_free(struct iopf_queue *queue); @@ -1497,10 +1497,9 @@ iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) return -ENODEV; } -static inline int +static inline void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) { - return -ENODEV; } static inline int iopf_queue_flush_dev(struct device *dev) From ab413c5ad155645044e5449c14a99ca2015c520d Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:25 +0800 Subject: [PATCH 497/548] iommu: Track iopf group instead of last fault Previously, before a group of page faults was passed to the domain's iopf handler, the last page fault of the group was kept in the list of iommu_fault_param::faults. In the page fault response path, the group's last page fault was used to look up the list, and the page faults were responded to device only if there was a matched fault. The previous approach seems unnecessarily complex and not performance friendly. Put the page fault group itself to the outstanding fault list. It can be removed in the page fault response path or in the iopf_queue_remove_device() path. The pending list is protected by iommu_fault_param::lock. To allow checking for the group's presence in the list using list_empty(), the iopf group should be removed from the list with list_del_init(). IOMMU_PAGE_RESP_PASID_VALID is set in the code but not used anywhere. Remove it to make the code clean. IOMMU_PAGE_RESP_PASID_VALID is set in the response message indicating that the response message includes a valid PASID value. Actually, we should keep this hardware detail in the individual driver. When the page fault handling framework in IOMMU and IOMMUFD subsystems includes a valid PASID in the fault message, the response message should always contain the same PASID value. Individual drivers should be responsible for deciding whether to include the PASID in the messages they provide for the hardware. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Yan Zhao Link: https://lore.kernel.org/r/20240212012227.119381-15-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 19911232713573a2ebea84a25bd4d71d024ed86b) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 242 +++++++++++++------------------------ include/linux/iommu.h | 6 +- 2 files changed, 86 insertions(+), 162 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index ece09552e5cf9..05e49e2e6a52e 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -78,12 +78,33 @@ static struct iommu_domain *get_domain_for_iopf(struct device *dev, return domain; } +/* Non-last request of a group. Postpone until the last one. */ +static int report_partial_fault(struct iommu_fault_param *fault_param, + struct iommu_fault *fault) +{ + struct iopf_fault *iopf; + + iopf = kzalloc(sizeof(*iopf), GFP_KERNEL); + if (!iopf) + return -ENOMEM; + + iopf->fault = *fault; + + mutex_lock(&fault_param->lock); + list_add(&iopf->list, &fault_param->partial); + mutex_unlock(&fault_param->lock); + + return 0; +} + /** - * iommu_handle_iopf - IO Page Fault handler - * @fault: fault event - * @iopf_param: the fault parameter of the device. + * iommu_report_device_fault() - Report fault event to device driver + * @dev: the device + * @evt: fault event data * - * Add a fault to the device workqueue, to be handled by mm. + * Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ + * handler. When this function fails and the fault is recoverable, it is the + * caller's responsibility to complete the fault. * * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't @@ -118,34 +139,37 @@ static struct iommu_domain *get_domain_for_iopf(struct device *dev, * * Return: 0 on success and <0 on error. */ -static int iommu_handle_iopf(struct iommu_fault *fault, - struct iommu_fault_param *iopf_param) +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { - int ret; - struct iopf_group *group; - struct iommu_domain *domain; + struct iommu_fault *fault = &evt->fault; + struct iommu_fault_param *iopf_param; struct iopf_fault *iopf, *next; - struct device *dev = iopf_param->dev; - - lockdep_assert_held(&iopf_param->lock); + struct iommu_domain *domain; + struct iopf_group *group; + int ret; if (fault->type != IOMMU_FAULT_PAGE_REQ) - /* Not a recoverable page fault */ return -EOPNOTSUPP; - if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { - iopf = kzalloc(sizeof(*iopf), GFP_KERNEL); - if (!iopf) - return -ENOMEM; - - iopf->fault = *fault; + iopf_param = iopf_get_dev_fault_param(dev); + if (!iopf_param) + return -ENODEV; - /* Non-last request of a group. Postpone until the last one */ - list_add(&iopf->list, &iopf_param->partial); + if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { + ret = report_partial_fault(iopf_param, fault); + iopf_put_dev_fault_param(iopf_param); - return 0; + return ret; } + /* + * This is the last page fault of a group. Allocate an iopf group and + * pass it to domain's page fault handler. The group holds a reference + * count of the fault parameter. It will be released after response or + * error path of this function. If an error is returned, the caller + * will send a response to the hardware. We need to clean up before + * leaving, otherwise partial faults will be stuck. + */ domain = get_domain_for_iopf(dev, fault); if (!domain) { ret = -EINVAL; @@ -154,11 +178,6 @@ static int iommu_handle_iopf(struct iommu_fault *fault, group = kzalloc(sizeof(*group), GFP_KERNEL); if (!group) { - /* - * The caller will send a response to the hardware. But we do - * need to clean up before leaving, otherwise partial faults - * will be stuck. - */ ret = -ENOMEM; goto cleanup_partial; } @@ -166,145 +185,45 @@ static int iommu_handle_iopf(struct iommu_fault *fault, group->fault_param = iopf_param; group->last_fault.fault = *fault; INIT_LIST_HEAD(&group->faults); + INIT_LIST_HEAD(&group->pending_node); group->domain = domain; list_add(&group->last_fault.list, &group->faults); /* See if we have partial faults for this group */ + mutex_lock(&iopf_param->lock); list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { if (iopf->fault.prm.grpid == fault->prm.grpid) /* Insert *before* the last fault */ list_move(&iopf->list, &group->faults); } - + list_add(&group->pending_node, &iopf_param->faults); mutex_unlock(&iopf_param->lock); + ret = domain->iopf_handler(group); - mutex_lock(&iopf_param->lock); - if (ret) + if (ret) { + mutex_lock(&iopf_param->lock); + list_del_init(&group->pending_node); + mutex_unlock(&iopf_param->lock); iopf_free_group(group); + } return ret; + cleanup_partial: + mutex_lock(&iopf_param->lock); list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { if (iopf->fault.prm.grpid == fault->prm.grpid) { list_del(&iopf->list); kfree(iopf); } } - return ret; -} - -/** - * iommu_report_device_fault() - Report fault event to device driver - * @dev: the device - * @evt: fault event data - * - * Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ - * handler. When this function fails and the fault is recoverable, it is the - * caller's responsibility to complete the fault. - * - * Return 0 on success, or an error. - */ -int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) -{ - bool last_prq = evt->fault.type == IOMMU_FAULT_PAGE_REQ && - (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE); - struct iommu_fault_param *fault_param; - struct iopf_fault *evt_pending; - int ret; - - fault_param = iopf_get_dev_fault_param(dev); - if (!fault_param) - return -EINVAL; - - mutex_lock(&fault_param->lock); - if (last_prq) { - evt_pending = kmemdup(evt, sizeof(struct iopf_fault), - GFP_KERNEL); - if (!evt_pending) { - ret = -ENOMEM; - goto err_unlock; - } - list_add_tail(&evt_pending->list, &fault_param->faults); - } - - ret = iommu_handle_iopf(&evt->fault, fault_param); - if (ret) - goto err_free; - - mutex_unlock(&fault_param->lock); - /* The reference count of fault_param is now held by iopf_group. */ - if (!last_prq) - iopf_put_dev_fault_param(fault_param); - - return 0; -err_free: - if (last_prq) { - list_del(&evt_pending->list); - kfree(evt_pending); - } -err_unlock: - mutex_unlock(&fault_param->lock); - iopf_put_dev_fault_param(fault_param); + mutex_unlock(&iopf_param->lock); + iopf_put_dev_fault_param(iopf_param); return ret; } EXPORT_SYMBOL_GPL(iommu_report_device_fault); -static int iommu_page_response(struct iopf_group *group, - struct iommu_page_response *msg) -{ - bool needs_pasid; - int ret = -EINVAL; - struct iopf_fault *evt; - struct iommu_fault_page_request *prm; - struct device *dev = group->fault_param->dev; - const struct iommu_ops *ops = dev_iommu_ops(dev); - bool has_pasid = msg->flags & IOMMU_PAGE_RESP_PASID_VALID; - struct iommu_fault_param *fault_param = group->fault_param; - - /* Only send response if there is a fault report pending */ - mutex_lock(&fault_param->lock); - if (list_empty(&fault_param->faults)) { - dev_warn_ratelimited(dev, "no pending PRQ, drop response\n"); - goto done_unlock; - } - /* - * Check if we have a matching page request pending to respond, - * otherwise return -EINVAL - */ - list_for_each_entry(evt, &fault_param->faults, list) { - prm = &evt->fault.prm; - if (prm->grpid != msg->grpid) - continue; - - /* - * If the PASID is required, the corresponding request is - * matched using the group ID, the PASID valid bit and the PASID - * value. Otherwise only the group ID matches request and - * response. - */ - needs_pasid = prm->flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID; - if (needs_pasid && (!has_pasid || msg->pasid != prm->pasid)) - continue; - - if (!needs_pasid && has_pasid) { - /* No big deal, just clear it. */ - msg->flags &= ~IOMMU_PAGE_RESP_PASID_VALID; - msg->pasid = 0; - } - - ret = ops->page_response(dev, evt, msg); - list_del(&evt->list); - kfree(evt); - break; - } - -done_unlock: - mutex_unlock(&fault_param->lock); - - return ret; -} - /** * iopf_queue_flush_dev - Ensure that all queued faults have been processed * @dev: the endpoint whose faults need to be flushed. @@ -346,18 +265,26 @@ EXPORT_SYMBOL_GPL(iopf_queue_flush_dev); int iopf_group_response(struct iopf_group *group, enum iommu_page_response_code status) { + struct iommu_fault_param *fault_param = group->fault_param; struct iopf_fault *iopf = &group->last_fault; + struct device *dev = group->fault_param->dev; + const struct iommu_ops *ops = dev_iommu_ops(dev); struct iommu_page_response resp = { .pasid = iopf->fault.prm.pasid, .grpid = iopf->fault.prm.grpid, .code = status, }; + int ret = -EINVAL; - if ((iopf->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) && - (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID)) - resp.flags = IOMMU_PAGE_RESP_PASID_VALID; + /* Only send response if there is a fault report pending */ + mutex_lock(&fault_param->lock); + if (!list_empty(&group->pending_node)) { + ret = ops->page_response(dev, &group->last_fault, &resp); + list_del_init(&group->pending_node); + } + mutex_unlock(&fault_param->lock); - return iommu_page_response(group, &resp); + return ret; } EXPORT_SYMBOL_GPL(iopf_group_response); @@ -468,8 +395,9 @@ EXPORT_SYMBOL_GPL(iopf_queue_add_device); */ void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) { - struct iopf_fault *iopf, *next; - struct iommu_page_response resp; + struct iopf_fault *partial_iopf; + struct iopf_fault *next; + struct iopf_group *group, *temp; struct dev_iommu *param = dev->iommu; struct iommu_fault_param *fault_param; const struct iommu_ops *ops = dev_iommu_ops(dev); @@ -483,21 +411,19 @@ void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) goto unlock; mutex_lock(&fault_param->lock); - list_for_each_entry_safe(iopf, next, &fault_param->partial, list) - kfree(iopf); - - list_for_each_entry_safe(iopf, next, &fault_param->faults, list) { - memset(&resp, 0, sizeof(struct iommu_page_response)); - resp.pasid = iopf->fault.prm.pasid; - resp.grpid = iopf->fault.prm.grpid; - resp.code = IOMMU_PAGE_RESP_INVALID; + list_for_each_entry_safe(partial_iopf, next, &fault_param->partial, list) + kfree(partial_iopf); - if (iopf->fault.prm.flags & IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID) - resp.flags = IOMMU_PAGE_RESP_PASID_VALID; + list_for_each_entry_safe(group, temp, &fault_param->faults, pending_node) { + struct iopf_fault *iopf = &group->last_fault; + struct iommu_page_response resp = { + .pasid = iopf->fault.prm.pasid, + .grpid = iopf->fault.prm.grpid, + .code = IOMMU_PAGE_RESP_INVALID + }; ops->page_response(dev, iopf, &resp); - list_del(&iopf->list); - kfree(iopf); + list_del_init(&group->pending_node); } mutex_unlock(&fault_param->lock); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 413df55a70c7f..94d2ae245fb25 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -106,15 +106,11 @@ enum iommu_page_response_code { /** * struct iommu_page_response - Generic page response information - * @flags: encodes whether the corresponding fields are valid - * (IOMMU_FAULT_PAGE_RESPONSE_* values) * @pasid: Process Address Space ID * @grpid: Page Request Group Index * @code: response code from &enum iommu_page_response_code */ struct iommu_page_response { -#define IOMMU_PAGE_RESP_PASID_VALID (1 << 0) - u32 flags; u32 pasid; u32 grpid; u32 code; @@ -129,6 +125,8 @@ struct iopf_fault { struct iopf_group { struct iopf_fault last_fault; struct list_head faults; + /* list node for iommu_fault_param::faults */ + struct list_head pending_node; struct work_struct work; struct iommu_domain *domain; /* The device's fault data parameter. */ From 210e639f33f2db1d99f8616cefb1fce1ed5a12c8 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:26 +0800 Subject: [PATCH 498/548] iommu: Make iopf_group_response() return void The iopf_group_response() should return void, as nothing can do anything with the failure. This implies that ops->page_response() must also return void; this is consistent with what the drivers do. The failure paths, which are all integrity validations of the fault, should be WARN_ON'd, not return codes. If the iommu core fails to enqueue the fault, it should respond the fault directly by calling ops->page_response() instead of returning an error number and relying on the iommu drivers to do so. Consolidate the error fault handling code in the core. Co-developed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240212012227.119381-16-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit b554e396e51ce3d378a560666f85c6836a8323fd) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 50 +++----- drivers/iommu/intel/iommu.h | 4 +- drivers/iommu/intel/svm.c | 17 +-- drivers/iommu/io-pgfault.c | 132 +++++++++++--------- include/linux/iommu.h | 14 +-- 5 files changed, 98 insertions(+), 119 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index f005a0f5206ba..9cb4680958c0a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -926,31 +926,29 @@ static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu, return arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true); } -static int arm_smmu_page_response(struct device *dev, - struct iopf_fault *unused, - struct iommu_page_response *resp) +static void arm_smmu_page_response(struct device *dev, struct iopf_fault *unused, + struct iommu_page_response *resp) { struct arm_smmu_cmdq_ent cmd = {0}; struct arm_smmu_master *master = dev_iommu_priv_get(dev); int sid = master->streams[0].id; - if (master->stall_enabled) { - cmd.opcode = CMDQ_OP_RESUME; - cmd.resume.sid = sid; - cmd.resume.stag = resp->grpid; - switch (resp->code) { - case IOMMU_PAGE_RESP_INVALID: - case IOMMU_PAGE_RESP_FAILURE: - cmd.resume.resp = CMDQ_RESUME_0_RESP_ABORT; - break; - case IOMMU_PAGE_RESP_SUCCESS: - cmd.resume.resp = CMDQ_RESUME_0_RESP_RETRY; - break; - default: - return -EINVAL; - } - } else { - return -ENODEV; + if (WARN_ON(!master->stall_enabled)) + return; + + cmd.opcode = CMDQ_OP_RESUME; + cmd.resume.sid = sid; + cmd.resume.stag = resp->grpid; + switch (resp->code) { + case IOMMU_PAGE_RESP_INVALID: + case IOMMU_PAGE_RESP_FAILURE: + cmd.resume.resp = CMDQ_RESUME_0_RESP_ABORT; + break; + case IOMMU_PAGE_RESP_SUCCESS: + cmd.resume.resp = CMDQ_RESUME_0_RESP_RETRY; + break; + default: + break; } arm_smmu_cmdq_issue_cmd(master->smmu, &cmd); @@ -960,8 +958,6 @@ static int arm_smmu_page_response(struct device *dev, * terminated... at some point in the future. PRI_RESP is fire and * forget. */ - - return 0; } /* Context descriptor manipulation functions */ @@ -1699,16 +1695,6 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) } ret = iommu_report_device_fault(master->dev, &fault_evt); - if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) { - /* Nobody cared, abort the access */ - struct iommu_page_response resp = { - .pasid = flt->prm.pasid, - .grpid = flt->prm.grpid, - .code = IOMMU_PAGE_RESP_FAILURE, - }; - arm_smmu_page_response(master->dev, &fault_evt, &resp); - } - out_unlock: mutex_unlock(&smmu->streams_mutex); return ret; diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 0b1a9fd426889..32fde11bf92fb 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -868,8 +868,8 @@ struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn); void intel_svm_check(struct intel_iommu *iommu); int intel_svm_enable_prq(struct intel_iommu *iommu); int intel_svm_finish_prq(struct intel_iommu *iommu); -int intel_svm_page_response(struct device *dev, struct iopf_fault *evt, - struct iommu_page_response *msg); +void intel_svm_page_response(struct device *dev, struct iopf_fault *evt, + struct iommu_page_response *msg); struct iommu_domain *intel_svm_domain_alloc(void); void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid); void intel_drain_pasid_prq(struct device *dev, u32 pasid); diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 4da09a60cba6a..10db5aa6b3f78 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -745,9 +745,8 @@ static irqreturn_t prq_event_thread(int irq, void *d) return IRQ_RETVAL(handled); } -int intel_svm_page_response(struct device *dev, - struct iopf_fault *evt, - struct iommu_page_response *msg) +void intel_svm_page_response(struct device *dev, struct iopf_fault *evt, + struct iommu_page_response *msg) { struct iommu_fault_page_request *prm; struct intel_iommu *iommu; @@ -774,16 +773,6 @@ int intel_svm_page_response(struct device *dev, private_present = prm->flags & IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA; last_page = prm->flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE; - if (!pasid_present) { - ret = -EINVAL; - goto out; - } - - if (prm->pasid == 0 || prm->pasid >= PASID_MAX) { - ret = -EINVAL; - goto out; - } - /* * Per VT-d spec. v3.0 ch7.7, system software must respond * with page group response if private data is present (PDP) @@ -812,8 +801,6 @@ int intel_svm_page_response(struct device *dev, qi_submit_sync(iommu, &desc, 1, 0); } -out: - return ret; } static int intel_svm_set_dev_pasid(struct iommu_domain *domain, diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 05e49e2e6a52e..6a325bff8164e 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -39,7 +39,7 @@ static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param) kfree_rcu(fault_param, rcu); } -void iopf_free_group(struct iopf_group *group) +static void __iopf_free_group(struct iopf_group *group) { struct iopf_fault *iopf, *next; @@ -50,6 +50,11 @@ void iopf_free_group(struct iopf_group *group) /* Pair with iommu_report_device_fault(). */ iopf_put_dev_fault_param(group->fault_param); +} + +void iopf_free_group(struct iopf_group *group) +{ + __iopf_free_group(group); kfree(group); } EXPORT_SYMBOL_GPL(iopf_free_group); @@ -97,14 +102,49 @@ static int report_partial_fault(struct iommu_fault_param *fault_param, return 0; } +static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param, + struct iopf_fault *evt, + struct iopf_group *abort_group) +{ + struct iopf_fault *iopf, *next; + struct iopf_group *group; + + group = kzalloc(sizeof(*group), GFP_KERNEL); + if (!group) { + /* + * We always need to construct the group as we need it to abort + * the request at the driver if it can't be handled. + */ + group = abort_group; + } + + group->fault_param = iopf_param; + group->last_fault.fault = evt->fault; + INIT_LIST_HEAD(&group->faults); + INIT_LIST_HEAD(&group->pending_node); + list_add(&group->last_fault.list, &group->faults); + + /* See if we have partial faults for this group */ + mutex_lock(&iopf_param->lock); + list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { + if (iopf->fault.prm.grpid == evt->fault.prm.grpid) + /* Insert *before* the last fault */ + list_move(&iopf->list, &group->faults); + } + list_add(&group->pending_node, &iopf_param->faults); + mutex_unlock(&iopf_param->lock); + + return group; +} + /** * iommu_report_device_fault() - Report fault event to device driver * @dev: the device * @evt: fault event data * * Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ - * handler. When this function fails and the fault is recoverable, it is the - * caller's responsibility to complete the fault. + * handler. If this function fails then ops->page_response() was called to + * complete evt if required. * * This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard * them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't @@ -143,22 +183,18 @@ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { struct iommu_fault *fault = &evt->fault; struct iommu_fault_param *iopf_param; - struct iopf_fault *iopf, *next; - struct iommu_domain *domain; + struct iopf_group abort_group = {}; struct iopf_group *group; int ret; - if (fault->type != IOMMU_FAULT_PAGE_REQ) - return -EOPNOTSUPP; - iopf_param = iopf_get_dev_fault_param(dev); - if (!iopf_param) + if (WARN_ON(!iopf_param)) return -ENODEV; if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { ret = report_partial_fault(iopf_param, fault); iopf_put_dev_fault_param(iopf_param); - + /* A request that is not the last does not need to be ack'd */ return ret; } @@ -170,56 +206,33 @@ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) * will send a response to the hardware. We need to clean up before * leaving, otherwise partial faults will be stuck. */ - domain = get_domain_for_iopf(dev, fault); - if (!domain) { - ret = -EINVAL; - goto cleanup_partial; - } - - group = kzalloc(sizeof(*group), GFP_KERNEL); - if (!group) { + group = iopf_group_alloc(iopf_param, evt, &abort_group); + if (group == &abort_group) { ret = -ENOMEM; - goto cleanup_partial; + goto err_abort; } - group->fault_param = iopf_param; - group->last_fault.fault = *fault; - INIT_LIST_HEAD(&group->faults); - INIT_LIST_HEAD(&group->pending_node); - group->domain = domain; - list_add(&group->last_fault.list, &group->faults); - - /* See if we have partial faults for this group */ - mutex_lock(&iopf_param->lock); - list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { - if (iopf->fault.prm.grpid == fault->prm.grpid) - /* Insert *before* the last fault */ - list_move(&iopf->list, &group->faults); - } - list_add(&group->pending_node, &iopf_param->faults); - mutex_unlock(&iopf_param->lock); - - ret = domain->iopf_handler(group); - if (ret) { - mutex_lock(&iopf_param->lock); - list_del_init(&group->pending_node); - mutex_unlock(&iopf_param->lock); - iopf_free_group(group); + group->domain = get_domain_for_iopf(dev, fault); + if (!group->domain) { + ret = -EINVAL; + goto err_abort; } - return ret; - -cleanup_partial: - mutex_lock(&iopf_param->lock); - list_for_each_entry_safe(iopf, next, &iopf_param->partial, list) { - if (iopf->fault.prm.grpid == fault->prm.grpid) { - list_del(&iopf->list); - kfree(iopf); - } - } - mutex_unlock(&iopf_param->lock); - iopf_put_dev_fault_param(iopf_param); + /* + * On success iopf_handler must call iopf_group_response() and + * iopf_free_group() + */ + ret = group->domain->iopf_handler(group); + if (ret) + goto err_abort; + return 0; +err_abort: + iopf_group_response(group, IOMMU_PAGE_RESP_FAILURE); + if (group == &abort_group) + __iopf_free_group(group); + else + iopf_free_group(group); return ret; } EXPORT_SYMBOL_GPL(iommu_report_device_fault); @@ -259,11 +272,9 @@ EXPORT_SYMBOL_GPL(iopf_queue_flush_dev); * iopf_group_response - Respond a group of page faults * @group: the group of faults with the same group id * @status: the response code - * - * Return 0 on success and <0 on error. */ -int iopf_group_response(struct iopf_group *group, - enum iommu_page_response_code status) +void iopf_group_response(struct iopf_group *group, + enum iommu_page_response_code status) { struct iommu_fault_param *fault_param = group->fault_param; struct iopf_fault *iopf = &group->last_fault; @@ -274,17 +285,14 @@ int iopf_group_response(struct iopf_group *group, .grpid = iopf->fault.prm.grpid, .code = status, }; - int ret = -EINVAL; /* Only send response if there is a fault report pending */ mutex_lock(&fault_param->lock); if (!list_empty(&group->pending_node)) { - ret = ops->page_response(dev, &group->last_fault, &resp); + ops->page_response(dev, &group->last_fault, &resp); list_del_init(&group->pending_node); } mutex_unlock(&fault_param->lock); - - return ret; } EXPORT_SYMBOL_GPL(iopf_group_response); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 94d2ae245fb25..f59029bfd3941 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -510,9 +510,8 @@ struct iommu_ops { int (*dev_enable_feat)(struct device *dev, enum iommu_dev_features f); int (*dev_disable_feat)(struct device *dev, enum iommu_dev_features f); - int (*page_response)(struct device *dev, - struct iopf_fault *evt, - struct iommu_page_response *msg); + void (*page_response)(struct device *dev, struct iopf_fault *evt, + struct iommu_page_response *msg); int (*def_domain_type)(struct device *dev); void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid); @@ -1486,8 +1485,8 @@ void iopf_queue_free(struct iopf_queue *queue); int iopf_queue_discard_partial(struct iopf_queue *queue); void iopf_free_group(struct iopf_group *group); int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); -int iopf_group_response(struct iopf_group *group, - enum iommu_page_response_code status); +void iopf_group_response(struct iopf_group *group, + enum iommu_page_response_code status); #else static inline int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev) @@ -1529,10 +1528,9 @@ iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) return -ENODEV; } -static inline int iopf_group_response(struct iopf_group *group, - enum iommu_page_response_code status) +static inline void iopf_group_response(struct iopf_group *group, + enum iommu_page_response_code status) { - return -ENODEV; } #endif /* CONFIG_IOMMU_IOPF */ #endif /* __LINUX_IOMMU_H */ From f90345568d00adfd9167484ed1d420cfb5ecca91 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Mon, 12 Feb 2024 09:22:27 +0800 Subject: [PATCH 499/548] iommu: Make iommu_report_device_fault() return void As the iommu_report_device_fault() has been converted to auto-respond a page fault if it fails to enqueue it, there's no need to return a code in any case. Make it return void. Suggested-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240212012227.119381-17-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 3dfa64aecbafc288216b2790438d395add192c30) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 ++-- drivers/iommu/intel/svm.c | 19 ++++++---------- drivers/iommu/io-pgfault.c | 25 +++++++-------------- include/linux/iommu.h | 5 ++--- 4 files changed, 19 insertions(+), 34 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9cb4680958c0a..b666f665d3504 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1638,7 +1638,7 @@ arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) /* IRQ and event handlers */ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) { - int ret; + int ret = 0; u32 perm = 0; struct arm_smmu_master *master; bool ssid_valid = evt[0] & EVTQ_0_SSV; @@ -1694,7 +1694,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) goto out_unlock; } - ret = iommu_report_device_fault(master->dev, &fault_evt); + iommu_report_device_fault(master->dev, &fault_evt); out_unlock: mutex_unlock(&smmu->streams_mutex); return ret; diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 10db5aa6b3f78..a7d19ccf8b8a6 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -566,14 +566,11 @@ static int prq_to_iommu_prot(struct page_req_dsc *req) return prot; } -static int intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev, - struct page_req_dsc *desc) +static void intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev, + struct page_req_dsc *desc) { struct iopf_fault event = { }; - if (!dev || !dev_is_pci(dev)) - return -ENODEV; - /* Fill in event data for device specific processing */ event.fault.type = IOMMU_FAULT_PAGE_REQ; event.fault.prm.addr = (u64)desc->addr << VTD_PAGE_SHIFT; @@ -606,7 +603,7 @@ static int intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev, event.fault.prm.private_data[0] = ktime_to_ns(ktime_get()); } - return iommu_report_device_fault(dev, &event); + iommu_report_device_fault(dev, &event); } static void handle_bad_prq_event(struct intel_iommu *iommu, @@ -709,12 +706,10 @@ static irqreturn_t prq_event_thread(int irq, void *d) if (!pdev) goto bad_req; - if (intel_svm_prq_report(iommu, &pdev->dev, req)) - handle_bad_prq_event(iommu, req, QI_RESP_INVALID); - else - trace_prq_report(iommu, &pdev->dev, req->qw_0, req->qw_1, - req->priv_data[0], req->priv_data[1], - iommu->prq_seq_number++); + intel_svm_prq_report(iommu, &pdev->dev, req); + trace_prq_report(iommu, &pdev->dev, req->qw_0, req->qw_1, + req->priv_data[0], req->priv_data[1], + iommu->prq_seq_number++); pci_dev_put(pdev); prq_advance: head = (head + sizeof(*req)) & PRQ_RING_MASK; diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 6a325bff8164e..06d78fcc79fdb 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -176,26 +176,22 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param, * freed after the device has stopped generating page faults (or the iommu * hardware has been set to block the page faults) and the pending page faults * have been flushed. - * - * Return: 0 on success and <0 on error. */ -int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) +void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { struct iommu_fault *fault = &evt->fault; struct iommu_fault_param *iopf_param; struct iopf_group abort_group = {}; struct iopf_group *group; - int ret; iopf_param = iopf_get_dev_fault_param(dev); if (WARN_ON(!iopf_param)) - return -ENODEV; + return; if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { - ret = report_partial_fault(iopf_param, fault); + report_partial_fault(iopf_param, fault); iopf_put_dev_fault_param(iopf_param); /* A request that is not the last does not need to be ack'd */ - return ret; } /* @@ -207,25 +203,21 @@ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) * leaving, otherwise partial faults will be stuck. */ group = iopf_group_alloc(iopf_param, evt, &abort_group); - if (group == &abort_group) { - ret = -ENOMEM; + if (group == &abort_group) goto err_abort; - } group->domain = get_domain_for_iopf(dev, fault); - if (!group->domain) { - ret = -EINVAL; + if (!group->domain) goto err_abort; - } /* * On success iopf_handler must call iopf_group_response() and * iopf_free_group() */ - ret = group->domain->iopf_handler(group); - if (ret) + if (group->domain->iopf_handler(group)) goto err_abort; - return 0; + + return; err_abort: iopf_group_response(group, IOMMU_PAGE_RESP_FAILURE); @@ -233,7 +225,6 @@ int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) __iopf_free_group(group); else iopf_free_group(group); - return ret; } EXPORT_SYMBOL_GPL(iommu_report_device_fault); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index f59029bfd3941..a051bb3e5fdad 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1484,7 +1484,7 @@ struct iopf_queue *iopf_queue_alloc(const char *name); void iopf_queue_free(struct iopf_queue *queue); int iopf_queue_discard_partial(struct iopf_queue *queue); void iopf_free_group(struct iopf_group *group); -int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); +void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); void iopf_group_response(struct iopf_group *group, enum iommu_page_response_code status); #else @@ -1522,10 +1522,9 @@ static inline void iopf_free_group(struct iopf_group *group) { } -static inline int +static inline void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { - return -ENODEV; } static inline void iopf_group_response(struct iopf_group *group, From 4f2ef92411d93230176f8e84a6c0f5921906d391 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Fri, 17 Jan 2025 13:58:00 +0800 Subject: [PATCH 500/548] iommu: Fix potential memory leak in iopf_queue_remove_device() The iopf_queue_remove_device() helper removes a device from the per-iommu iopf queue when PRI is disabled on the device. It responds to all outstanding iopf's with an IOMMU_PAGE_RESP_INVALID code and detaches the device from the queue. However, it fails to release the group structure that represents a group of iopf's awaiting for a response after responding to the hardware. This can cause a memory leak if iopf_queue_remove_device() is called with pending iopf's. Fix it by calling iopf_free_group() after the iopf group is responded. Fixes: 199112327135 ("iommu: Track iopf group instead of last fault") Cc: stable@vger.kernel.org Suggested-by: Kevin Tian Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20250117055800.782462-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit 9759ae2cee7cd42b95f1c48aa3749bd02b5ddb08) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 06d78fcc79fdb..350ccd20c7a04 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -423,6 +423,7 @@ void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) ops->page_response(dev, iopf, &resp); list_del_init(&group->pending_node); + iopf_free_group(group); } mutex_unlock(&fault_param->lock); From dd5240acb7787323f37427601f4a7a94cf2cf931 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Thu, 22 Feb 2024 10:07:41 -0400 Subject: [PATCH 501/548] iommu/sva: Restore SVA handle sharing Prior to commit 092edaddb660 ("iommu: Support mm PASID 1:n with sva domains") the code allowed a SVA handle to be bound multiple times to the same (mm, device) pair. This was alluded to in the kdoc comment, but we had understood this to be more a remark about allowing multiple devices, not a literal same-driver re-opening the same SVA. It turns out uacce and idxd were both relying on the core code to handle reference counting for same-device same-mm scenarios. As this looks hard to resolve in the drivers bring it back to the core code. The new design has changed the meaning of the domain->users refcount to refer to the number of devices that are sharing that domain for the same mm. This is part of the design to lift the SVA domain de-duplication out of the drivers. Return the old behavior by explicitly de-duplicating the struct iommu_sva handle. The same (mm, device) will return the same handle pointer and the core code will handle tracking this. The last unbind of the handle will destroy it. Fixes: 092edaddb660 ("iommu: Support mm PASID 1:n with sva domains") Reported-by: Zhangfei Gao Closes: https://lore.kernel.org/all/20240221110658.529-1-zhangfei.gao@linaro.org/ Tested-by: Zhangfei Gao Signed-off-by: Jason Gunthorpe Reviewed-by: Lu Baolu Link: https://lore.kernel.org/r/0-v1-9455fc497a6f+3b4-iommu_sva_sharing_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 65d4418c5002ec5b0e529455bf4152fd43459079) Signed-off-by: Koba Ko --- drivers/iommu/iommu-sva.c | 17 +++++++++++++++++ include/linux/iommu.h | 3 +++ 2 files changed, 20 insertions(+) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 4616c8c2fa7d0..121dddf5015f5 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -41,6 +41,7 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de } iommu_mm->pasid = pasid; INIT_LIST_HEAD(&iommu_mm->sva_domains); + INIT_LIST_HEAD(&iommu_mm->sva_handles); /* * Make sure the write to mm->iommu_mm is not reordered in front of * initialization to iommu_mm fields. If it does, readers may see a @@ -82,6 +83,14 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm goto out_unlock; } + list_for_each_entry(handle, &mm->iommu_mm->sva_handles, handle_item) { + if (handle->dev == dev) { + refcount_inc(&handle->users); + mutex_unlock(&iommu_sva_lock); + return handle; + } + } + handle = kzalloc(sizeof(*handle), GFP_KERNEL); if (!handle) { ret = -ENOMEM; @@ -108,7 +117,9 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm if (ret) goto out_free_domain; domain->users = 1; + refcount_set(&handle->users, 1); list_add(&domain->next, &mm->iommu_mm->sva_domains); + list_add(&handle->handle_item, &mm->iommu_mm->sva_handles); out: mutex_unlock(&iommu_sva_lock); @@ -140,6 +151,12 @@ void iommu_sva_unbind_device(struct iommu_sva *handle) struct device *dev = handle->dev; mutex_lock(&iommu_sva_lock); + if (!refcount_dec_and_test(&handle->users)) { + mutex_unlock(&iommu_sva_lock); + return; + } + list_del(&handle->handle_item); + iommu_detach_device_pasid(domain, dev, iommu_mm->pasid); if (--domain->users == 0) { list_del(&domain->next); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a051bb3e5fdad..1d5fc107cf525 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -903,11 +903,14 @@ struct iommu_fwspec { struct iommu_sva { struct device *dev; struct iommu_domain *domain; + struct list_head handle_item; + refcount_t users; }; struct iommu_mm_data { u32 pasid; struct list_head sva_domains; + struct list_head sva_handles; }; int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, From b0a0eb2e249a97e06868424798a590c87b211da6 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Sat, 23 Mar 2024 13:46:58 +0000 Subject: [PATCH 502/548] iommu/arm-smmu-v3: Fix access for STE.SHCFG STE attributes(NSCFG, PRIVCFG, INSTCFG) use value 0 for "Use Icomming", for some reason SHCFG doesn't follow that, and it is defined as "0b01". Currently the driver sets SHCFG to Use Incoming for stage-2 and bypass domains. However according to the User Manual (ARM IHI 0070 F.b): When SMMU_IDR1.ATTR_TYPES_OVR == 0, this field is RES0 and the incoming Shareability attribute is used. This patch adds a condition for writing SHCFG to Use incoming to be compliant with the architecture, and defines ATTR_TYPE_OVR as a new feature discovered from IDR1. This also required to propagate the SMMU through some functions args. There is no need to add similar condition for the newly introduced function arm_smmu_get_ste_used() as the values of the STE are the same before and after any transition, so this will not trigger any change. (we already do the same for the VMID). Although this is a misconfiguration from the driver, this has been there for a long time, so probably no HW running Linux is affected by it. Reported-by: Will Deacon Closes: https://lore.kernel.org/all/20240215134952.GA690@willie-the-truck/ Signed-off-by: Mostafa Saleh Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20240323134658.464743-1-smostafa@google.com Signed-off-by: Will Deacon (cherry picked from commit ec9098d6bffea6e82d63640134c123a3d96e0781) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 35 ++++++++++++++------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++ 2 files changed, 25 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index b666f665d3504..935d408cdb71d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1453,14 +1453,17 @@ static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT)); } -static void arm_smmu_make_bypass_ste(struct arm_smmu_ste *target) +static void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, + struct arm_smmu_ste *target) { memset(target, 0, sizeof(*target)); target->data[0] = cpu_to_le64( STRTAB_STE_0_V | FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS)); - target->data[1] = cpu_to_le64( - FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); + + if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) + target->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, + STRTAB_STE_1_SHCFG_INCOMING)); } static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, @@ -1523,6 +1526,7 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, typeof(&pgtbl_cfg->arm_lpae_s2_cfg.vtcr) vtcr = &pgtbl_cfg->arm_lpae_s2_cfg.vtcr; u64 vtcr_val; + struct arm_smmu_device *smmu = master->smmu; memset(target, 0, sizeof(*target)); target->data[0] = cpu_to_le64( @@ -1531,9 +1535,11 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, target->data[1] = cpu_to_le64( FIELD_PREP(STRTAB_STE_1_EATS, - master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0) | - FIELD_PREP(STRTAB_STE_1_SHCFG, - STRTAB_STE_1_SHCFG_INCOMING)); + master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); + + if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) + target->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, + STRTAB_STE_1_SHCFG_INCOMING)); vtcr_val = FIELD_PREP(STRTAB_STE_2_VTCR_S2T0SZ, vtcr->tsz) | FIELD_PREP(STRTAB_STE_2_VTCR_S2SL0, vtcr->sl) | @@ -1560,7 +1566,8 @@ static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, * This can safely directly manipulate the STE memory without a sync sequence * because the STE table has not been installed in the SMMU yet. */ -static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab, +static void arm_smmu_init_initial_stes(struct arm_smmu_device *smmu, + struct arm_smmu_ste *strtab, unsigned int nent) { unsigned int i; @@ -1569,7 +1576,7 @@ static void arm_smmu_init_initial_stes(struct arm_smmu_ste *strtab, if (disable_bypass) arm_smmu_make_abort_ste(strtab); else - arm_smmu_make_bypass_ste(strtab); + arm_smmu_make_bypass_ste(smmu, strtab); strtab++; } } @@ -1597,7 +1604,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return -ENOMEM; } - arm_smmu_init_initial_stes(desc->l2ptr, 1 << STRTAB_SPLIT); + arm_smmu_init_initial_stes(smmu, desc->l2ptr, 1 << STRTAB_SPLIT); arm_smmu_write_strtab_l1_desc(strtab, desc); return 0; } @@ -2652,8 +2659,9 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain, struct device *dev) { struct arm_smmu_ste ste; + struct arm_smmu_master *master = dev_iommu_priv_get(dev); - arm_smmu_make_bypass_ste(&ste); + arm_smmu_make_bypass_ste(master->smmu, &ste); return arm_smmu_attach_dev_ste(dev, &ste); } @@ -3273,7 +3281,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu) reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits); cfg->strtab_base_cfg = reg; - arm_smmu_init_initial_stes(strtab, cfg->num_l1_ents); + arm_smmu_init_initial_stes(smmu, strtab, cfg->num_l1_ents); return 0; } @@ -3785,6 +3793,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) return -ENXIO; } + if (reg & IDR1_ATTR_TYPES_OVR) + smmu->features |= ARM_SMMU_FEAT_ATTR_TYPES_OVR; + /* Queue sizes, capped to ensure natural alignment */ smmu->cmdq.q.llq.max_n_shift = min_t(u32, CMDQ_MAX_SZ_SHIFT, FIELD_GET(IDR1_CMDQS, reg)); @@ -4000,7 +4011,7 @@ static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu) * STE table is not programmed to HW, see * arm_smmu_initial_bypass_stes() */ - arm_smmu_make_bypass_ste( + arm_smmu_make_bypass_ste(smmu, arm_smmu_get_step_for_sid(smmu, rmr->sids[i])); } } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index a28101e3b9751..eecf12f816f17 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -44,6 +44,7 @@ #define IDR1_TABLES_PRESET (1 << 30) #define IDR1_QUEUES_PRESET (1 << 29) #define IDR1_REL (1 << 28) +#define IDR1_ATTR_TYPES_OVR (1 << 27) #define IDR1_CMDQS GENMASK(25, 21) #define IDR1_EVTQS GENMASK(20, 16) #define IDR1_PRIQS GENMASK(15, 11) @@ -652,6 +653,7 @@ struct arm_smmu_device { #define ARM_SMMU_FEAT_SVA (1 << 17) #define ARM_SMMU_FEAT_E2H (1 << 18) #define ARM_SMMU_FEAT_NESTING (1 << 19) +#define ARM_SMMU_FEAT_ATTR_TYPES_OVR (1 << 20) u32 features; #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0) From 3b9ab76e8801570068ac8ef8f6ba210e0ceca2f0 Mon Sep 17 00:00:00 2001 From: Yi Liu Date: Thu, 28 Mar 2024 05:29:58 -0700 Subject: [PATCH 503/548] iommu: Pass domain to remove_dev_pasid() op Existing remove_dev_pasid() callbacks of the underlying iommu drivers get the attached domain from the group->pasid_array. However, the domain stored in group->pasid_array is not always correct in all scenarios. A wrong domain may result in failure in remove_dev_pasid() callback. To avoid such problems, it is more reliable to pass the domain to the remove_dev_pasid() op. Suggested-by: Jason Gunthorpe Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240328122958.83332-3-yi.l.liu@intel.com Signed-off-by: Joerg Roedel (cherry picked from commit d2f85a263883b679f87ed8f911746105658e9c47) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 ++------- drivers/iommu/intel/iommu.c | 11 +++-------- drivers/iommu/iommu.c | 9 +++++---- include/linux/iommu.h | 3 ++- 4 files changed, 12 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 935d408cdb71d..bbce7cca34d0c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -3061,14 +3061,9 @@ static int arm_smmu_def_domain_type(struct device *dev) return 0; } -static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid) +static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid, + struct iommu_domain *domain) { - struct iommu_domain *domain; - - domain = iommu_get_domain_for_dev_pasid(dev, pasid, IOMMU_DOMAIN_SVA); - if (WARN_ON(IS_ERR(domain)) || !domain) - return; - arm_smmu_sva_remove_dev_pasid(domain, dev, pasid); } diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 008c396c583b1..96dc2b2afa366 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4728,18 +4728,14 @@ static void intel_iommu_iotlb_sync_map(struct iommu_domain *domain, __mapping_notify_one(info->iommu, dmar_domain, pfn, pages); } -static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid) +static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid, + struct iommu_domain *domain) { struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL); + struct dmar_domain *dmar_domain = to_dmar_domain(domain); struct dev_pasid_info *curr, *dev_pasid = NULL; - struct dmar_domain *dmar_domain; - struct iommu_domain *domain; unsigned long flags; - domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0); - if (WARN_ON_ONCE(!domain)) - goto out_tear_down; - /* * The SVA implementation needs to handle its own stuffs like the mm * notification. Before consolidating that code into iommu core, let @@ -4750,7 +4746,6 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid) goto out_tear_down; } - dmar_domain = to_dmar_domain(domain); spin_lock_irqsave(&dmar_domain->lock, flags); list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) { if (curr->dev == dev && curr->pasid == pasid) { diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index fd4514800d81e..6713c44225f03 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3396,20 +3396,21 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain, if (device == last_gdev) break; - ops->remove_dev_pasid(device->dev, pasid); + ops->remove_dev_pasid(device->dev, pasid, domain); } return ret; } static void __iommu_remove_group_pasid(struct iommu_group *group, - ioasid_t pasid) + ioasid_t pasid, + struct iommu_domain *domain) { struct group_device *device; const struct iommu_ops *ops; for_each_group_device(group, device) { ops = dev_iommu_ops(device->dev); - ops->remove_dev_pasid(device->dev, pasid); + ops->remove_dev_pasid(device->dev, pasid, domain); } } @@ -3471,7 +3472,7 @@ void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev, struct iommu_group *group = iommu_group_get(dev); mutex_lock(&group->mutex); - __iommu_remove_group_pasid(group, pasid); + __iommu_remove_group_pasid(group, pasid, domain); WARN_ON(xa_erase(&group->pasid_array, pasid) != domain); mutex_unlock(&group->mutex); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 1d5fc107cf525..c3ab4a3089dc1 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -514,7 +514,8 @@ struct iommu_ops { struct iommu_page_response *msg); int (*def_domain_type)(struct device *dev); - void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid); + void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid, + struct iommu_domain *domain); const struct iommu_domain_ops *default_domain_ops; unsigned long pgsize_bitmap; From 40c8a7a1f7283667055ad4781ef3cc991a6efe69 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Thu, 18 Apr 2024 10:33:59 +0000 Subject: [PATCH 504/548] iommu: Add ops->domain_alloc_sva() Make a new op that receives the device and the mm_struct that the SVA domain should be created for. Unlike domain_alloc_paging() the dev argument is never NULL here. This allows drivers to fully initialize the SVA domain and allocate the mmu_notifier during allocation. It allows the notifier lifetime to follow the lifetime of the iommu_domain. Since we have only one call site, upgrade the new op to return ERR_PTR instead of NULL. Signed-off-by: Jason Gunthorpe [Removed smmu3 related changes - Vasant] Signed-off-by: Vasant Hegde Reviewed-by: Tina Zhang Link: https://lore.kernel.org/r/20240418103400.6229-15-vasant.hegde@amd.com Signed-off-by: Joerg Roedel (cherry picked from commit 80af5a45202422db957549a241e00bf4d4e0ce89) Signed-off-by: Koba Ko --- drivers/iommu/iommu-sva.c | 16 +++++++++++----- include/linux/iommu.h | 3 +++ 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 121dddf5015f5..09d673bb0b400 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -108,8 +108,8 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm /* Allocate a new domain and set it on device pasid. */ domain = iommu_sva_domain_alloc(dev, mm); - if (!domain) { - ret = -ENOMEM; + if (IS_ERR(domain)) { + ret = PTR_ERR(domain); goto out_unlock; } @@ -282,9 +282,15 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, const struct iommu_ops *ops = dev_iommu_ops(dev); struct iommu_domain *domain; - domain = ops->domain_alloc(IOMMU_DOMAIN_SVA); - if (!domain) - return NULL; + if (ops->domain_alloc_sva) { + domain = ops->domain_alloc_sva(dev, mm); + if (IS_ERR(domain)) + return domain; + } else { + domain = ops->domain_alloc(IOMMU_DOMAIN_SVA); + if (!domain) + return ERR_PTR(-ENOMEM); + } domain->type = IOMMU_DOMAIN_SVA; mmgrab(mm); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index c3ab4a3089dc1..d3dc8986c7f9f 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -440,6 +440,7 @@ static inline int __iommu_copy_struct_from_user( * the caller iommu_domain_alloc() returns. * @domain_alloc_paging: Allocate an iommu_domain that can be used for * UNMANAGED, DMA, and DMA_FQ domain types. + * @domain_alloc_sva: Allocate an iommu_domain for Shared Virtual Addressing. * @domain_alloc_user: Allocate an iommu domain corresponding to the input * parameters as defined in include/uapi/linux/iommufd.h. * Unlike @domain_alloc, it is called only by IOMMUFD and @@ -493,6 +494,8 @@ struct iommu_ops { struct device *dev, u32 flags, struct iommu_domain *parent, const struct iommu_user_data *user_data); struct iommu_domain *(*domain_alloc_paging)(struct device *dev); + struct iommu_domain *(*domain_alloc_sva)(struct device *dev, + struct mm_struct *mm); struct iommu_device *(*probe_device)(struct device *dev); void (*release_device)(struct device *dev); From 6fa87a9d6fb55b3c2ce32159ec7ae2fe37c287fa Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 28 May 2024 12:54:58 +0800 Subject: [PATCH 505/548] iommu: Make iommu_sva_domain_alloc() static iommu_sva_domain_alloc() is only called in iommu-sva.c, hence make it static. On the other hand, iommu_sva_domain_alloc() should not return NULL anymore after commit <80af5a452024> ("iommu: Add ops->domain_alloc_sva()"), the removal of inline code avoids potential confusion. Fixes: 80af5a452024 ("iommu: Add ops->domain_alloc_sva()") Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20240528045458.81458-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel (cherry picked from commit b5c29fba72a6c950655d1cb0f6aa16b60dc83be7) Signed-off-by: Koba Ko --- drivers/iommu/iommu-sva.c | 6 ++++-- include/linux/iommu.h | 8 -------- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 09d673bb0b400..d29f92c03c860 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -10,6 +10,8 @@ #include "iommu-priv.h" static DEFINE_MUTEX(iommu_sva_lock); +static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm); /* Allocate a PASID for the mm within range (inclusive) */ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct device *dev) @@ -276,8 +278,8 @@ static int iommu_sva_iopf_handler(struct iopf_group *group) return 0; } -struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, - struct mm_struct *mm) +static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm) { const struct iommu_ops *ops = dev_iommu_ops(dev); struct iommu_domain *domain; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index d3dc8986c7f9f..016a7b3eb2cf4 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1449,8 +1449,6 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm); void iommu_sva_unbind_device(struct iommu_sva *handle); u32 iommu_sva_get_pasid(struct iommu_sva *handle); -struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, - struct mm_struct *mm); #else static inline struct iommu_sva * iommu_sva_bind_device(struct device *dev, struct mm_struct *mm) @@ -1475,12 +1473,6 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm) } static inline void mm_pasid_drop(struct mm_struct *mm) {} - -static inline struct iommu_domain * -iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm) -{ - return NULL; -} #endif /* CONFIG_IOMMU_SVA */ #ifdef CONFIG_IOMMU_IOPF From 285593a7da098bfc86a51f214781763abae609f6 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:35 +0800 Subject: [PATCH 506/548] iommu: Introduce domain attachment handle Currently, when attaching a domain to a device or its PASID, domain is stored within the iommu group. It could be retrieved for use during the window between attachment and detachment. With new features introduced, there's a need to store more information than just a domain pointer. This information essentially represents the association between a domain and a device. For example, the SVA code already has a custom struct iommu_sva which represents a bond between sva domain and a PASID of a device. Looking forward, the IOMMUFD needs a place to store the iommufd_device pointer in the core, so that the device object ID could be quickly retrieved in the critical fault handling path. Introduce domain attachment handle that explicitly represents the attachment relationship between a domain and a device or its PASID. Co-developed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240702063444.105814-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon (cherry picked from commit 14678219cf4093e897ab353fd78eab7994d1be7d) Signed-off-by: Koba Ko --- drivers/dma/idxd/init.c | 2 +- drivers/iommu/iommu-sva.c | 13 ++++++++----- drivers/iommu/iommu.c | 27 +++++++++++++++++---------- include/linux/iommu.h | 18 +++++++++++++++--- 4 files changed, 41 insertions(+), 19 deletions(-) diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c index 92e86ae9db29d..fd67a4834bf0f 100644 --- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -659,7 +659,7 @@ static int idxd_enable_system_pasid(struct idxd_device *idxd) * DMA domain is owned by the driver, it should support all valid * types such as DMA-FQ, identity, etc. */ - ret = iommu_attach_device_pasid(domain, dev, pasid); + ret = iommu_attach_device_pasid(domain, dev, pasid, NULL); if (ret) { dev_err(dev, "failed to attach device pasid %d, domain type %d", pasid, domain->type); diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index d29f92c03c860..c5f0d3e2438db 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -101,7 +101,9 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm /* Search for an existing domain. */ list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) { - ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid); + handle->handle.domain = domain; + ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid, + &handle->handle); if (!ret) { domain->users++; goto out; @@ -115,7 +117,9 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm goto out_unlock; } - ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid); + handle->handle.domain = domain; + ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid, + &handle->handle); if (ret) goto out_free_domain; domain->users = 1; @@ -126,7 +130,6 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm out: mutex_unlock(&iommu_sva_lock); handle->dev = dev; - handle->domain = domain; return handle; out_free_domain: @@ -148,7 +151,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device); */ void iommu_sva_unbind_device(struct iommu_sva *handle) { - struct iommu_domain *domain = handle->domain; + struct iommu_domain *domain = handle->handle.domain; struct iommu_mm_data *iommu_mm = domain->mm->iommu_mm; struct device *dev = handle->dev; @@ -171,7 +174,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_device); u32 iommu_sva_get_pasid(struct iommu_sva *handle) { - struct iommu_domain *domain = handle->domain; + struct iommu_domain *domain = handle->handle.domain; return mm_get_enqcmd_pasid(domain->mm); } diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 6713c44225f03..0e1d8c46ac071 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3419,14 +3419,15 @@ static void __iommu_remove_group_pasid(struct iommu_group *group, * @domain: the iommu domain. * @dev: the attached device. * @pasid: the pasid of the device. + * @handle: the attach handle. * * Return: 0 on success, or an error. */ int iommu_attach_device_pasid(struct iommu_domain *domain, - struct device *dev, ioasid_t pasid) + struct device *dev, ioasid_t pasid, + struct iommu_attach_handle *handle) { struct iommu_group *group; - void *curr; int ret; if (!domain->ops->set_dev_pasid) @@ -3440,11 +3441,13 @@ int iommu_attach_device_pasid(struct iommu_domain *domain, return -EINVAL; mutex_lock(&group->mutex); - curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL); - if (curr) { - ret = xa_err(curr) ? : -EBUSY; + + if (handle) + handle->domain = domain; + + ret = xa_insert(&group->pasid_array, pasid, handle, GFP_KERNEL); + if (ret) goto out_unlock; - } ret = __iommu_set_group_pasid(domain, group, pasid); if (ret) @@ -3473,7 +3476,7 @@ void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev, mutex_lock(&group->mutex); __iommu_remove_group_pasid(group, pasid, domain); - WARN_ON(xa_erase(&group->pasid_array, pasid) != domain); + xa_erase(&group->pasid_array, pasid); mutex_unlock(&group->mutex); iommu_group_put(group); @@ -3498,17 +3501,21 @@ struct iommu_domain *iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid, unsigned int type) { - struct iommu_domain *domain; struct iommu_group *group; + struct iommu_attach_handle *handle; + struct iommu_domain *domain = NULL; group = iommu_group_get(dev); if (!group) return NULL; xa_lock(&group->pasid_array); - domain = xa_load(&group->pasid_array, pasid); + handle = xa_load(&group->pasid_array, pasid); + if (handle) + domain = handle->domain; + if (type && domain && domain->type != type) - domain = ERR_PTR(-EBUSY); + domain = NULL; xa_unlock(&group->pasid_array); iommu_group_put(group); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 016a7b3eb2cf4..b6832b2dacf1c 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -901,12 +901,22 @@ struct iommu_fwspec { /* ATS is supported */ #define IOMMU_FWSPEC_PCI_RC_ATS (1 << 0) +/* + * An iommu attach handle represents a relationship between an iommu domain + * and a PASID or RID of a device. It is allocated and managed by the component + * that manages the domain and is stored in the iommu group during the time the + * domain is attached. + */ +struct iommu_attach_handle { + struct iommu_domain *domain; +}; + /** * struct iommu_sva - handle to a device-mm bond */ struct iommu_sva { + struct iommu_attach_handle handle; struct device *dev; - struct iommu_domain *domain; struct list_head handle_item; refcount_t users; }; @@ -967,7 +977,8 @@ int iommu_device_claim_dma_owner(struct device *dev, void *owner); void iommu_device_release_dma_owner(struct device *dev); int iommu_attach_device_pasid(struct iommu_domain *domain, - struct device *dev, ioasid_t pasid); + struct device *dev, ioasid_t pasid, + struct iommu_attach_handle *handle); void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t pasid); struct iommu_domain * @@ -1303,7 +1314,8 @@ static inline int iommu_device_claim_dma_owner(struct device *dev, void *owner) } static inline int iommu_attach_device_pasid(struct iommu_domain *domain, - struct device *dev, ioasid_t pasid) + struct device *dev, ioasid_t pasid, + struct iommu_attach_handle *handle) { return -ENODEV; } From bc920771a058ab80f9b198c79324587da93c0690 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:36 +0800 Subject: [PATCH 507/548] iommu: Remove sva handle list The struct sva_iommu represents an association between an SVA domain and a PASID of a device. It's stored in the iommu group's pasid array and also tracked by a list in the per-mm data structure. Removes duplicate tracking of sva_iommu by eliminating the list. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240702063444.105814-3-baolu.lu@linux.intel.com Signed-off-by: Will Deacon (backported from commit 3e7f57d1ef3f5fbed58974fae38d35e430f57d35) Signed-off-by: Koba Ko --- drivers/iommu/iommu-priv.h | 3 +++ drivers/iommu/iommu-sva.c | 32 +++++++++++++++++++++----------- drivers/iommu/iommu.c | 31 +++++++++++++++++++++++++++++++ include/linux/iommu.h | 2 -- 4 files changed, 55 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h index e6ef72415505c..95b6f9510a4cb 100644 --- a/drivers/iommu/iommu-priv.h +++ b/drivers/iommu/iommu-priv.h @@ -29,4 +29,7 @@ void iommu_device_unregister_bus(struct iommu_device *iommu, int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu); +struct iommu_attach_handle *iommu_attach_handle_get(struct iommu_group *group, + ioasid_t pasid, + unsigned int type); #endif /* __LINUX_IOMMU_PRIV_H */ diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index c5f0d3e2438db..c7138f0a75e35 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -43,7 +43,6 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de } iommu_mm->pasid = pasid; INIT_LIST_HEAD(&iommu_mm->sva_domains); - INIT_LIST_HEAD(&iommu_mm->sva_handles); /* * Make sure the write to mm->iommu_mm is not reordered in front of * initialization to iommu_mm fields. If it does, readers may see a @@ -71,11 +70,16 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de */ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm) { + struct iommu_group *group = dev->iommu_group; + struct iommu_attach_handle *attach_handle; struct iommu_mm_data *iommu_mm; struct iommu_domain *domain; struct iommu_sva *handle; int ret; + if (!group) + return ERR_PTR(-ENODEV); + mutex_lock(&iommu_sva_lock); /* Allocate mm->pasid if necessary. */ @@ -85,12 +89,22 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm goto out_unlock; } - list_for_each_entry(handle, &mm->iommu_mm->sva_handles, handle_item) { - if (handle->dev == dev) { - refcount_inc(&handle->users); - mutex_unlock(&iommu_sva_lock); - return handle; + /* A bond already exists, just take a reference`. */ + attach_handle = iommu_attach_handle_get(group, iommu_mm->pasid, IOMMU_DOMAIN_SVA); + if (!IS_ERR(attach_handle)) { + handle = container_of(attach_handle, struct iommu_sva, handle); + if (attach_handle->domain->mm != mm) { + ret = -EBUSY; + goto out_unlock; } + refcount_inc(&handle->users); + mutex_unlock(&iommu_sva_lock); + return handle; + } + + if (PTR_ERR(attach_handle) != -ENOENT) { + ret = PTR_ERR(attach_handle); + goto out_unlock; } handle = kzalloc(sizeof(*handle), GFP_KERNEL); @@ -101,7 +115,6 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm /* Search for an existing domain. */ list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) { - handle->handle.domain = domain; ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid, &handle->handle); if (!ret) { @@ -117,17 +130,15 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm goto out_unlock; } - handle->handle.domain = domain; ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid, &handle->handle); if (ret) goto out_free_domain; domain->users = 1; - refcount_set(&handle->users, 1); list_add(&domain->next, &mm->iommu_mm->sva_domains); - list_add(&handle->handle_item, &mm->iommu_mm->sva_handles); out: + refcount_set(&handle->users, 1); mutex_unlock(&iommu_sva_lock); handle->dev = dev; return handle; @@ -160,7 +171,6 @@ void iommu_sva_unbind_device(struct iommu_sva *handle) mutex_unlock(&iommu_sva_lock); return; } - list_del(&handle->handle_item); iommu_detach_device_pasid(domain, dev, iommu_mm->pasid); if (--domain->users == 0) { diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 0e1d8c46ac071..0651ab271ee5a 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3549,3 +3549,34 @@ void iommu_free_global_pasid(ioasid_t pasid) ida_free(&iommu_global_pasid_ida, pasid); } EXPORT_SYMBOL_GPL(iommu_free_global_pasid); + +/** + * iommu_attach_handle_get - Return the attach handle + * @group: the iommu group that domain was attached to + * @pasid: the pasid within the group + * @type: matched domain type, 0 for any match + * + * Return handle or ERR_PTR(-ENOENT) on none, ERR_PTR(-EBUSY) on mismatch. + * + * Return the attach handle to the caller. The life cycle of an iommu attach + * handle is from the time when the domain is attached to the time when the + * domain is detached. Callers are required to synchronize the call of + * iommu_attach_handle_get() with domain attachment and detachment. The attach + * handle can only be used during its life cycle. + */ +struct iommu_attach_handle * +iommu_attach_handle_get(struct iommu_group *group, ioasid_t pasid, unsigned int type) +{ + struct iommu_attach_handle *handle; + + xa_lock(&group->pasid_array); + handle = xa_load(&group->pasid_array, pasid); + if (!handle) + handle = ERR_PTR(-ENOENT); + else if (type && handle->domain->type != type) + handle = ERR_PTR(-EBUSY); + xa_unlock(&group->pasid_array); + + return handle; +} +EXPORT_SYMBOL_NS_GPL(iommu_attach_handle_get, IOMMUFD_INTERNAL); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index b6832b2dacf1c..941968a9171ff 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -917,14 +917,12 @@ struct iommu_attach_handle { struct iommu_sva { struct iommu_attach_handle handle; struct device *dev; - struct list_head handle_item; refcount_t users; }; struct iommu_mm_data { u32 pasid; struct list_head sva_domains; - struct list_head sva_handles; }; int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode, From 69790d34a870edb367be745d13f8e900378203dc Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:37 +0800 Subject: [PATCH 508/548] iommu: Add attach handle to struct iopf_group Previously, the domain that a page fault targets is stored in an iopf_group, which represents a minimal set of page faults. With the introduction of attach handle, replace the domain with the handle so that the fault handler can obtain more information as needed when handling the faults. iommu_report_device_fault() is currently used for SVA page faults, which handles the page fault in an internal cycle. The domain is retrieved with iommu_get_domain_for_dev_pasid() if the pasid in the fault message is valid. This doesn't work in IOMMUFD case, where if the pasid table of a device is wholly managed by user space, there is no domain attached to the PASID of the device, and all page faults are forwarded through a NESTING domain attaching to RID. Add a static flag in iommu ops, which indicates if the IOMMU driver supports user-managed PASID tables. In the iopf deliver path, if no attach handle found for the iopf PASID, roll back to RID domain when the IOMMU driver supports this capability. iommu_get_domain_for_dev_pasid() is no longer used and can be removed. Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240702063444.105814-4-baolu.lu@linux.intel.com Signed-off-by: Will Deacon (cherry picked from commit 06cdcc32d65759d42c6340700796e2906045b6a5) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 61 +++++++++++++++++++++----------------- drivers/iommu/iommu-sva.c | 3 +- drivers/iommu/iommu.c | 40 ------------------------- include/linux/iommu.h | 17 ++++------- 4 files changed, 42 insertions(+), 79 deletions(-) diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 350ccd20c7a04..24292688cdcdd 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -59,30 +59,6 @@ void iopf_free_group(struct iopf_group *group) } EXPORT_SYMBOL_GPL(iopf_free_group); -static struct iommu_domain *get_domain_for_iopf(struct device *dev, - struct iommu_fault *fault) -{ - struct iommu_domain *domain; - - if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) { - domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0); - if (IS_ERR(domain)) - domain = NULL; - } else { - domain = iommu_get_domain_for_dev(dev); - } - - if (!domain || !domain->iopf_handler) { - dev_warn_ratelimited(dev, - "iopf (pasid %d) without domain attached or handler installed\n", - fault->prm.pasid); - - return NULL; - } - - return domain; -} - /* Non-last request of a group. Postpone until the last one. */ static int report_partial_fault(struct iommu_fault_param *fault_param, struct iommu_fault *fault) @@ -206,20 +182,51 @@ void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) if (group == &abort_group) goto err_abort; - group->domain = get_domain_for_iopf(dev, fault); - if (!group->domain) + if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) { + group->attach_handle = iommu_attach_handle_get(dev->iommu_group, + fault->prm.pasid, + 0); + if (IS_ERR(group->attach_handle)) { + const struct iommu_ops *ops = dev_iommu_ops(dev); + + if (!ops->user_pasid_table) + goto err_abort; + + /* + * The iommu driver for this device supports user- + * managed PASID table. Therefore page faults for + * any PASID should go through the NESTING domain + * attached to the device RID. + */ + group->attach_handle = + iommu_attach_handle_get(dev->iommu_group, + IOMMU_NO_PASID, + IOMMU_DOMAIN_NESTED); + if (IS_ERR(group->attach_handle)) + goto err_abort; + } + } else { + group->attach_handle = + iommu_attach_handle_get(dev->iommu_group, IOMMU_NO_PASID, 0); + if (IS_ERR(group->attach_handle)) + goto err_abort; + } + + if (!group->attach_handle->domain->iopf_handler) goto err_abort; /* * On success iopf_handler must call iopf_group_response() and * iopf_free_group() */ - if (group->domain->iopf_handler(group)) + if (group->attach_handle->domain->iopf_handler(group)) goto err_abort; return; err_abort: + dev_warn_ratelimited(dev, "iopf with pasid %d aborted\n", + fault->prm.pasid); iopf_group_response(group, IOMMU_PAGE_RESP_FAILURE); if (group == &abort_group) __iopf_free_group(group); diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index c7138f0a75e35..e314cd99dc0ac 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -273,7 +273,8 @@ static void iommu_sva_handle_iopf(struct work_struct *work) if (status != IOMMU_PAGE_RESP_SUCCESS) break; - status = iommu_sva_handle_mm(&iopf->fault, group->domain->mm); + status = iommu_sva_handle_mm(&iopf->fault, + group->attach_handle->domain->mm); } iopf_group_response(group, status); diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 0651ab271ee5a..a02e9e24e96bd 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3483,46 +3483,6 @@ void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev, } EXPORT_SYMBOL_GPL(iommu_detach_device_pasid); -/* - * iommu_get_domain_for_dev_pasid() - Retrieve domain for @pasid of @dev - * @dev: the queried device - * @pasid: the pasid of the device - * @type: matched domain type, 0 for any match - * - * This is a variant of iommu_get_domain_for_dev(). It returns the existing - * domain attached to pasid of a device. Callers must hold a lock around this - * function, and both iommu_attach/detach_dev_pasid() whenever a domain of - * type is being manipulated. This API does not internally resolve races with - * attach/detach. - * - * Return: attached domain on success, NULL otherwise. - */ -struct iommu_domain *iommu_get_domain_for_dev_pasid(struct device *dev, - ioasid_t pasid, - unsigned int type) -{ - struct iommu_group *group; - struct iommu_attach_handle *handle; - struct iommu_domain *domain = NULL; - - group = iommu_group_get(dev); - if (!group) - return NULL; - - xa_lock(&group->pasid_array); - handle = xa_load(&group->pasid_array, pasid); - if (handle) - domain = handle->domain; - - if (type && domain && domain->type != type) - domain = NULL; - xa_unlock(&group->pasid_array); - iommu_group_put(group); - - return domain; -} -EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev_pasid); - ioasid_t iommu_alloc_global_pasid(struct device *dev) { int ret; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 941968a9171ff..ed795ce8d4c3e 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -128,7 +128,7 @@ struct iopf_group { /* list node for iommu_fault_param::faults */ struct list_head pending_node; struct work_struct work; - struct iommu_domain *domain; + struct iommu_attach_handle *attach_handle; /* The device's fault data parameter. */ struct iommu_fault_param *fault_param; }; @@ -483,6 +483,10 @@ static inline int __iommu_copy_struct_from_user( * @default_domain: If not NULL this will always be set as the default domain. * This should be an IDENTITY/BLOCKED/PLATFORM domain. * Do not use in new drivers. + * @user_pasid_table: IOMMU driver supports user-managed PASID table. There is + * no user domain for each PASID and the I/O page faults are + * forwarded through the user domain attached to the device + * RID. */ struct iommu_ops { bool (*capable)(struct device *dev, enum iommu_cap); @@ -526,6 +530,7 @@ struct iommu_ops { struct iommu_domain *identity_domain; struct iommu_domain *blocked_domain; struct iommu_domain *default_domain; + u8 user_pasid_table:1; }; /** @@ -979,9 +984,6 @@ int iommu_attach_device_pasid(struct iommu_domain *domain, struct iommu_attach_handle *handle); void iommu_detach_device_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t pasid); -struct iommu_domain * -iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid, - unsigned int type); ioasid_t iommu_alloc_global_pasid(struct device *dev); void iommu_free_global_pasid(ioasid_t pasid); #else /* CONFIG_IOMMU_API */ @@ -1323,13 +1325,6 @@ static inline void iommu_detach_device_pasid(struct iommu_domain *domain, { } -static inline struct iommu_domain * -iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid, - unsigned int type) -{ - return NULL; -} - static inline ioasid_t iommu_alloc_global_pasid(struct device *dev) { return IOMMU_PASID_INVALID; From 604cd0e621a8682568a729171f2dcbfdaa002f8d Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:38 +0800 Subject: [PATCH 509/548] iommu: Extend domain attach group with handle support Unlike the SVA case where each PASID of a device has an SVA domain attached to it, the I/O page faults are handled by the fault handler of the SVA domain. The I/O page faults for a user page table might be handled by the domain attached to RID or the domain attached to the PASID, depending on whether the PASID table is managed by user space or kernel. As a result, there is a need for the domain attach group interfaces to have attach handle support. The attach handle will be forwarded to the fault handler of the user domain. Add some variants of the domain attaching group interfaces so that they could support the attach handle and export them for use in IOMMUFD. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240702063444.105814-5-baolu.lu@linux.intel.com Signed-off-by: Will Deacon (cherry picked from commit 8519e689834a3ecf9a36332a9abb1844bd34e459) Signed-off-by: Koba Ko --- drivers/iommu/iommu-priv.h | 8 +++ drivers/iommu/iommu.c | 103 +++++++++++++++++++++++++++++++++++++ 2 files changed, 111 insertions(+) diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h index 95b6f9510a4cb..f396c7063a325 100644 --- a/drivers/iommu/iommu-priv.h +++ b/drivers/iommu/iommu-priv.h @@ -32,4 +32,12 @@ int iommu_mock_device_add(struct device *dev, struct iommu_device *iommu); struct iommu_attach_handle *iommu_attach_handle_get(struct iommu_group *group, ioasid_t pasid, unsigned int type); +int iommu_attach_group_handle(struct iommu_domain *domain, + struct iommu_group *group, + struct iommu_attach_handle *handle); +void iommu_detach_group_handle(struct iommu_domain *domain, + struct iommu_group *group); +int iommu_replace_group_handle(struct iommu_group *group, + struct iommu_domain *new_domain, + struct iommu_attach_handle *handle); #endif /* __LINUX_IOMMU_PRIV_H */ diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index a02e9e24e96bd..53ee79a6babb4 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -3540,3 +3540,106 @@ iommu_attach_handle_get(struct iommu_group *group, ioasid_t pasid, unsigned int return handle; } EXPORT_SYMBOL_NS_GPL(iommu_attach_handle_get, IOMMUFD_INTERNAL); + +/** + * iommu_attach_group_handle - Attach an IOMMU domain to an IOMMU group + * @domain: IOMMU domain to attach + * @group: IOMMU group that will be attached + * @handle: attach handle + * + * Returns 0 on success and error code on failure. + * + * This is a variant of iommu_attach_group(). It allows the caller to provide + * an attach handle and use it when the domain is attached. This is currently + * used by IOMMUFD to deliver the I/O page faults. + */ +int iommu_attach_group_handle(struct iommu_domain *domain, + struct iommu_group *group, + struct iommu_attach_handle *handle) +{ + int ret; + + if (handle) + handle->domain = domain; + + mutex_lock(&group->mutex); + ret = xa_insert(&group->pasid_array, IOMMU_NO_PASID, handle, GFP_KERNEL); + if (ret) + goto err_unlock; + + ret = __iommu_attach_group(domain, group); + if (ret) + goto err_erase; + mutex_unlock(&group->mutex); + + return 0; +err_erase: + xa_erase(&group->pasid_array, IOMMU_NO_PASID); +err_unlock: + mutex_unlock(&group->mutex); + return ret; +} +EXPORT_SYMBOL_NS_GPL(iommu_attach_group_handle, IOMMUFD_INTERNAL); + +/** + * iommu_detach_group_handle - Detach an IOMMU domain from an IOMMU group + * @domain: IOMMU domain to attach + * @group: IOMMU group that will be attached + * + * Detach the specified IOMMU domain from the specified IOMMU group. + * It must be used in conjunction with iommu_attach_group_handle(). + */ +void iommu_detach_group_handle(struct iommu_domain *domain, + struct iommu_group *group) +{ + mutex_lock(&group->mutex); + __iommu_group_set_core_domain(group); + xa_erase(&group->pasid_array, IOMMU_NO_PASID); + mutex_unlock(&group->mutex); +} +EXPORT_SYMBOL_NS_GPL(iommu_detach_group_handle, IOMMUFD_INTERNAL); + +/** + * iommu_replace_group_handle - replace the domain that a group is attached to + * @group: IOMMU group that will be attached to the new domain + * @new_domain: new IOMMU domain to replace with + * @handle: attach handle + * + * This is a variant of iommu_group_replace_domain(). It allows the caller to + * provide an attach handle for the new domain and use it when the domain is + * attached. + */ +int iommu_replace_group_handle(struct iommu_group *group, + struct iommu_domain *new_domain, + struct iommu_attach_handle *handle) +{ + void *curr; + int ret; + + if (!new_domain) + return -EINVAL; + + mutex_lock(&group->mutex); + if (handle) { + ret = xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL); + if (ret) + goto err_unlock; + } + + ret = __iommu_group_set_domain(group, new_domain); + if (ret) + goto err_release; + + curr = xa_store(&group->pasid_array, IOMMU_NO_PASID, handle, GFP_KERNEL); + WARN_ON(xa_is_err(curr)); + + mutex_unlock(&group->mutex); + + return 0; +err_release: + xa_release(&group->pasid_array, IOMMU_NO_PASID); +err_unlock: + mutex_unlock(&group->mutex); + return ret; +} +EXPORT_SYMBOL_NS_GPL(iommu_replace_group_handle, IOMMUFD_INTERNAL); From 05cf257c4b9cbf529e1ae4f49c6e01cbf5da7f50 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:39 +0800 Subject: [PATCH 510/548] iommufd: Add fault and response message definitions iommu_hwpt_pgfaults represent fault messages that the userspace can retrieve. Multiple iommu_hwpt_pgfaults might be put in an iopf group, with the IOMMU_PGFAULT_FLAGS_LAST_PAGE flag set only for the last iommu_hwpt_pgfault. An iommu_hwpt_page_response is a response message that the userspace should send to the kernel after finishing handling a group of fault messages. The @dev_id, @pasid, and @grpid fields in the message identify an outstanding iopf group for a device. The @cookie field, which matches the cookie field of the last fault in the group, will be used by the kernel to look up the pending message. Link: https://lore.kernel.org/r/20240702063444.105814-6-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit c714f15860fcca02fe0fd7c3f1f1fc35b1768ac1) Signed-off-by: Koba Ko --- include/uapi/linux/iommufd.h | 82 ++++++++++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index d816deac906fd..409cc1a89e71b 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -573,4 +573,86 @@ struct iommu_hwpt_get_dirty_bitmap { #define IOMMU_HWPT_GET_DIRTY_BITMAP _IO(IOMMUFD_TYPE, \ IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP) +/** + * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault + * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is + * valid. + * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group. + */ +enum iommu_hwpt_pgfault_flags { + IOMMU_PGFAULT_FLAGS_PASID_VALID = (1 << 0), + IOMMU_PGFAULT_FLAGS_LAST_PAGE = (1 << 1), +}; + +/** + * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault + * @IOMMU_PGFAULT_PERM_READ: request for read permission + * @IOMMU_PGFAULT_PERM_WRITE: request for write permission + * @IOMMU_PGFAULT_PERM_EXEC: (PCIE 10.4.1) request with a PASID that has the + * Execute Requested bit set in PASID TLP Prefix. + * @IOMMU_PGFAULT_PERM_PRIV: (PCIE 10.4.1) request with a PASID that has the + * Privileged Mode Requested bit set in PASID TLP + * Prefix. + */ +enum iommu_hwpt_pgfault_perm { + IOMMU_PGFAULT_PERM_READ = (1 << 0), + IOMMU_PGFAULT_PERM_WRITE = (1 << 1), + IOMMU_PGFAULT_PERM_EXEC = (1 << 2), + IOMMU_PGFAULT_PERM_PRIV = (1 << 3), +}; + +/** + * struct iommu_hwpt_pgfault - iommu page fault data + * @flags: Combination of enum iommu_hwpt_pgfault_flags + * @dev_id: id of the originated device + * @pasid: Process Address Space ID + * @grpid: Page Request Group Index + * @perm: Combination of enum iommu_hwpt_pgfault_perm + * @addr: Fault address + * @length: a hint of how much data the requestor is expecting to fetch. For + * example, if the PRI initiator knows it is going to do a 10MB + * transfer, it could fill in 10MB and the OS could pre-fault in + * 10MB of IOVA. It's default to 0 if there's no such hint. + * @cookie: kernel-managed cookie identifying a group of fault messages. The + * cookie number encoded in the last page fault of the group should + * be echoed back in the response message. + */ +struct iommu_hwpt_pgfault { + __u32 flags; + __u32 dev_id; + __u32 pasid; + __u32 grpid; + __u32 perm; + __u64 addr; + __u32 length; + __u32 cookie; +}; + +/** + * enum iommufd_page_response_code - Return status of fault handlers + * @IOMMUFD_PAGE_RESP_SUCCESS: Fault has been handled and the page tables + * populated, retry the access. This is the + * "Success" defined in PCI 10.4.2.1. + * @IOMMUFD_PAGE_RESP_INVALID: Could not handle this fault, don't retry the + * access. This is the "Invalid Request" in PCI + * 10.4.2.1. + * @IOMMUFD_PAGE_RESP_FAILURE: General error. Drop all subsequent faults from + * this device if possible. This is the "Response + * Failure" in PCI 10.4.2.1. + */ +enum iommufd_page_response_code { + IOMMUFD_PAGE_RESP_SUCCESS = 0, + IOMMUFD_PAGE_RESP_INVALID, + IOMMUFD_PAGE_RESP_FAILURE, +}; + +/** + * struct iommu_hwpt_page_response - IOMMU page fault response + * @cookie: The kernel-managed cookie reported in the fault message. + * @code: One of response code in enum iommufd_page_response_code. + */ +struct iommu_hwpt_page_response { + __u32 cookie; + __u32 code; +}; #endif From 77ad3aa4d8ec6e4d3888d45d5bf07fa4009c1830 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:40 +0800 Subject: [PATCH 511/548] iommufd: Add iommufd fault object An iommufd fault object provides an interface for delivering I/O page faults to user space. These objects are created and destroyed by user space, and they can be associated with or dissociated from hardware page table objects during page table allocation or destruction. User space interacts with the fault object through a file interface. This interface offers a straightforward and efficient way for user space to handle page faults. It allows user space to read fault messages sequentially and respond to them by writing to the same file. The file interface supports reading messages in poll mode, so it's recommended that user space applications use io_uring to enhance read and write efficiency. A fault object can be associated with any iopf-capable iommufd_hw_pgtable during the pgtable's allocation. All I/O page faults triggered by devices when accessing the I/O addresses of an iommufd_hw_pgtable are routed through the fault object to user space. Similarly, user space's responses to these page faults are routed back to the iommu device driver through the same fault object. Link: https://lore.kernel.org/r/20240702063444.105814-7-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 07838f7fd529c8a6de44b601d4b7057e6c8d36ed) Signed-off-by: Koba Ko --- drivers/iommu/io-pgfault.c | 2 + drivers/iommu/iommufd/Makefile | 1 + drivers/iommu/iommufd/fault.c | 226 ++++++++++++++++++++++++ drivers/iommu/iommufd/iommufd_private.h | 30 ++++ drivers/iommu/iommufd/main.c | 6 + include/linux/iommu.h | 4 + include/uapi/linux/iommufd.h | 18 ++ 7 files changed, 287 insertions(+) create mode 100644 drivers/iommu/iommufd/fault.c diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index 24292688cdcdd..e017902f90539 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -110,6 +110,8 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param, list_add(&group->pending_node, &iopf_param->faults); mutex_unlock(&iopf_param->lock); + group->fault_count = list_count_nodes(&group->faults); + return group; } diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile index 34b446146961c..cf4605962bea6 100644 --- a/drivers/iommu/iommufd/Makefile +++ b/drivers/iommu/iommufd/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only iommufd-y := \ device.o \ + fault.o \ hw_pagetable.o \ io_pagetable.o \ ioas.o \ diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c new file mode 100644 index 0000000000000..68ff94671d489 --- /dev/null +++ b/drivers/iommu/iommufd/fault.c @@ -0,0 +1,226 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (C) 2024 Intel Corporation + */ +#define pr_fmt(fmt) "iommufd: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../iommu-priv.h" +#include "iommufd_private.h" + +void iommufd_fault_destroy(struct iommufd_object *obj) +{ + struct iommufd_fault *fault = container_of(obj, struct iommufd_fault, obj); + struct iopf_group *group, *next; + + /* + * The iommufd object's reference count is zero at this point. + * We can be confident that no other threads are currently + * accessing this pointer. Therefore, acquiring the mutex here + * is unnecessary. + */ + list_for_each_entry_safe(group, next, &fault->deliver, node) { + list_del(&group->node); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } +} + +static void iommufd_compose_fault_message(struct iommu_fault *fault, + struct iommu_hwpt_pgfault *hwpt_fault, + struct iommufd_device *idev, + u32 cookie) +{ + hwpt_fault->flags = fault->prm.flags; + hwpt_fault->dev_id = idev->obj.id; + hwpt_fault->pasid = fault->prm.pasid; + hwpt_fault->grpid = fault->prm.grpid; + hwpt_fault->perm = fault->prm.perm; + hwpt_fault->addr = fault->prm.addr; + hwpt_fault->length = 0; + hwpt_fault->cookie = cookie; +} + +static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *buf, + size_t count, loff_t *ppos) +{ + size_t fault_size = sizeof(struct iommu_hwpt_pgfault); + struct iommufd_fault *fault = filep->private_data; + struct iommu_hwpt_pgfault data; + struct iommufd_device *idev; + struct iopf_group *group; + struct iopf_fault *iopf; + size_t done = 0; + int rc = 0; + + if (*ppos || count % fault_size) + return -ESPIPE; + + mutex_lock(&fault->mutex); + while (!list_empty(&fault->deliver) && count > done) { + group = list_first_entry(&fault->deliver, + struct iopf_group, node); + + if (group->fault_count * fault_size > count - done) + break; + + rc = xa_alloc(&fault->response, &group->cookie, group, + xa_limit_32b, GFP_KERNEL); + if (rc) + break; + + idev = to_iommufd_handle(group->attach_handle)->idev; + list_for_each_entry(iopf, &group->faults, list) { + iommufd_compose_fault_message(&iopf->fault, + &data, idev, + group->cookie); + if (copy_to_user(buf + done, &data, fault_size)) { + xa_erase(&fault->response, group->cookie); + rc = -EFAULT; + break; + } + done += fault_size; + } + + list_del(&group->node); + } + mutex_unlock(&fault->mutex); + + return done == 0 ? rc : done; +} + +static ssize_t iommufd_fault_fops_write(struct file *filep, const char __user *buf, + size_t count, loff_t *ppos) +{ + size_t response_size = sizeof(struct iommu_hwpt_page_response); + struct iommufd_fault *fault = filep->private_data; + struct iommu_hwpt_page_response response; + struct iopf_group *group; + size_t done = 0; + int rc = 0; + + if (*ppos || count % response_size) + return -ESPIPE; + + mutex_lock(&fault->mutex); + while (count > done) { + rc = copy_from_user(&response, buf + done, response_size); + if (rc) + break; + + group = xa_erase(&fault->response, response.cookie); + if (!group) { + rc = -EINVAL; + break; + } + + iopf_group_response(group, response.code); + iopf_free_group(group); + done += response_size; + } + mutex_unlock(&fault->mutex); + + return done == 0 ? rc : done; +} + +static __poll_t iommufd_fault_fops_poll(struct file *filep, + struct poll_table_struct *wait) +{ + struct iommufd_fault *fault = filep->private_data; + __poll_t pollflags = EPOLLOUT; + + poll_wait(filep, &fault->wait_queue, wait); + mutex_lock(&fault->mutex); + if (!list_empty(&fault->deliver)) + pollflags |= EPOLLIN | EPOLLRDNORM; + mutex_unlock(&fault->mutex); + + return pollflags; +} + +static int iommufd_fault_fops_release(struct inode *inode, struct file *filep) +{ + struct iommufd_fault *fault = filep->private_data; + + refcount_dec(&fault->obj.users); + iommufd_ctx_put(fault->ictx); + return 0; +} + +static const struct file_operations iommufd_fault_fops = { + .owner = THIS_MODULE, + .open = nonseekable_open, + .read = iommufd_fault_fops_read, + .write = iommufd_fault_fops_write, + .poll = iommufd_fault_fops_poll, + .release = iommufd_fault_fops_release, + .llseek = no_llseek, +}; + +int iommufd_fault_alloc(struct iommufd_ucmd *ucmd) +{ + struct iommu_fault_alloc *cmd = ucmd->cmd; + struct iommufd_fault *fault; + struct file *filep; + int fdno; + int rc; + + if (cmd->flags) + return -EOPNOTSUPP; + + fault = iommufd_object_alloc(ucmd->ictx, fault, IOMMUFD_OBJ_FAULT); + if (IS_ERR(fault)) + return PTR_ERR(fault); + + fault->ictx = ucmd->ictx; + INIT_LIST_HEAD(&fault->deliver); + xa_init_flags(&fault->response, XA_FLAGS_ALLOC1); + mutex_init(&fault->mutex); + init_waitqueue_head(&fault->wait_queue); + + filep = anon_inode_getfile("[iommufd-pgfault]", &iommufd_fault_fops, + fault, O_RDWR); + if (IS_ERR(filep)) { + rc = PTR_ERR(filep); + goto out_abort; + } + + refcount_inc(&fault->obj.users); + iommufd_ctx_get(fault->ictx); + fault->filep = filep; + + fdno = get_unused_fd_flags(O_CLOEXEC); + if (fdno < 0) { + rc = fdno; + goto out_fput; + } + + cmd->out_fault_id = fault->obj.id; + cmd->out_fault_fd = fdno; + + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); + if (rc) + goto out_put_fdno; + iommufd_object_finalize(ucmd->ictx, &fault->obj); + + fd_install(fdno, fault->filep); + + return 0; +out_put_fdno: + put_unused_fd(fdno); +out_fput: + fput(filep); + refcount_dec(&fault->obj.users); + iommufd_ctx_put(fault->ictx); +out_abort: + iommufd_object_abort_and_destroy(ucmd->ictx, &fault->obj); + + return rc; +} diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 8913d4359cd30..227d2be169cdb 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -128,6 +128,7 @@ enum iommufd_object_type { IOMMUFD_OBJ_HWPT_NESTED, IOMMUFD_OBJ_IOAS, IOMMUFD_OBJ_ACCESS, + IOMMUFD_OBJ_FAULT, #ifdef CONFIG_IOMMUFD_TEST IOMMUFD_OBJ_SELFTEST, #endif @@ -417,6 +418,35 @@ void iopt_remove_access(struct io_pagetable *iopt, u32 iopt_access_list_id); void iommufd_access_destroy_object(struct iommufd_object *obj); +/* + * An iommufd_fault object represents an interface to deliver I/O page faults + * to the user space. These objects are created/destroyed by the user space and + * associated with hardware page table objects during page-table allocation. + */ +struct iommufd_fault { + struct iommufd_object obj; + struct iommufd_ctx *ictx; + struct file *filep; + + /* The lists of outstanding faults protected by below mutex. */ + struct mutex mutex; + struct list_head deliver; + struct xarray response; + + struct wait_queue_head wait_queue; +}; + +struct iommufd_attach_handle { + struct iommu_attach_handle handle; + struct iommufd_device *idev; +}; + +/* Convert an iommu attach handle to iommufd handle. */ +#define to_iommufd_handle(hdl) container_of(hdl, struct iommufd_attach_handle, handle) + +int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); +void iommufd_fault_destroy(struct iommufd_object *obj); + #ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); void iommufd_selftest_destroy(struct iommufd_object *obj); diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index c9091e46d208a..9ea903a2cb0ce 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -319,6 +319,7 @@ static int iommufd_option(struct iommufd_ucmd *ucmd) union ucmd_buffer { struct iommu_destroy destroy; + struct iommu_fault_alloc fault; struct iommu_hw_info info; struct iommu_hwpt_alloc hwpt; struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap; @@ -354,6 +355,8 @@ struct iommufd_ioctl_op { } static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = { IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id), + IOCTL_OP(IOMMU_FAULT_QUEUE_ALLOC, iommufd_fault_alloc, struct iommu_fault_alloc, + out_fault_fd), IOCTL_OP(IOMMU_GET_HW_INFO, iommufd_get_hw_info, struct iommu_hw_info, __reserved), IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc, @@ -510,6 +513,9 @@ static const struct iommufd_object_ops iommufd_object_ops[] = { .destroy = iommufd_hwpt_nested_destroy, .abort = iommufd_hwpt_nested_abort, }, + [IOMMUFD_OBJ_FAULT] = { + .destroy = iommufd_fault_destroy, + }, #ifdef CONFIG_IOMMUFD_TEST [IOMMUFD_OBJ_SELFTEST] = { .destroy = iommufd_selftest_destroy, diff --git a/include/linux/iommu.h b/include/linux/iommu.h index ed795ce8d4c3e..3e0250aafa00f 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -125,12 +125,16 @@ struct iopf_fault { struct iopf_group { struct iopf_fault last_fault; struct list_head faults; + size_t fault_count; /* list node for iommu_fault_param::faults */ struct list_head pending_node; struct work_struct work; struct iommu_attach_handle *attach_handle; /* The device's fault data parameter. */ struct iommu_fault_param *fault_param; + /* Used by handler provider to hook the group on its own lists. */ + struct list_head node; + u32 cookie; }; /** diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 409cc1a89e71b..73b33d74336c9 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -49,6 +49,7 @@ enum { IOMMUFD_CMD_GET_HW_INFO, IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING, IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP, + IOMMUFD_CMD_FAULT_QUEUE_ALLOC, }; /** @@ -655,4 +656,21 @@ struct iommu_hwpt_page_response { __u32 cookie; __u32 code; }; + +/** + * struct iommu_fault_alloc - ioctl(IOMMU_FAULT_QUEUE_ALLOC) + * @size: sizeof(struct iommu_fault_alloc) + * @flags: Must be 0 + * @out_fault_id: The ID of the new FAULT + * @out_fault_fd: The fd of the new FAULT + * + * Explicitly allocate a fault handling object. + */ +struct iommu_fault_alloc { + __u32 size; + __u32 flags; + __u32 out_fault_id; + __u32 out_fault_fd; +}; +#define IOMMU_FAULT_QUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_FAULT_QUEUE_ALLOC) #endif From 7f12a1569c0f7d0ecb685e430db4a6a4a7709503 Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:41 +0800 Subject: [PATCH 512/548] iommufd: Fault-capable hwpt attach/detach/replace Add iopf-capable hw page table attach/detach/replace helpers. The pointer to iommufd_device is stored in the domain attachment handle, so that it can be echo'ed back in the iopf_group. The iopf-capable hw page tables can only be attached to devices that support the IOMMU_DEV_FEAT_IOPF feature. On the first attachment of an iopf-capable hw_pagetable to the device, the IOPF feature is enabled on the device. Similarly, after the last iopf-capable hwpt is detached from the device, the IOPF feature is disabled on the device. The current implementation allows a replacement between iopf-capable and non-iopf-capable hw page tables. This matches the nested translation use case, where a parent domain is attached by default and can then be replaced with a nested user domain with iopf support. Link: https://lore.kernel.org/r/20240702063444.105814-8-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit b7d8833677baad8c80ed1aac8c396d687e64a376) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 7 +- drivers/iommu/iommufd/fault.c | 190 ++++++++++++++++++++++++ drivers/iommu/iommufd/iommufd_private.h | 41 +++++ 3 files changed, 235 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 68ac138d925ca..59ae9a4ad0170 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -215,6 +215,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx, refcount_inc(&idev->obj.users); /* igroup refcount moves into iommufd_device */ idev->igroup = igroup; + mutex_init(&idev->iopf_lock); /* * If the caller fails after this success it must call @@ -376,7 +377,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, * attachment. */ if (list_empty(&idev->igroup->device_list)) { - rc = iommu_attach_group(hwpt->domain, idev->igroup->group); + rc = iommufd_hwpt_attach_device(hwpt, idev); if (rc) goto err_unresv; idev->igroup->hwpt = hwpt; @@ -402,7 +403,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev) mutex_lock(&idev->igroup->lock); list_del(&idev->group_item); if (list_empty(&idev->igroup->device_list)) { - iommu_detach_group(hwpt->domain, idev->igroup->group); + iommufd_hwpt_detach_device(hwpt, idev); idev->igroup->hwpt = NULL; } if (hwpt_is_paging(hwpt)) @@ -513,7 +514,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, goto err_unlock; } - rc = iommu_group_replace_domain(igroup->group, hwpt->domain); + rc = iommufd_hwpt_replace_device(idev, hwpt, old_hwpt); if (rc) goto err_unresv; diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c index 68ff94671d489..4934ae5726383 100644 --- a/drivers/iommu/iommufd/fault.c +++ b/drivers/iommu/iommufd/fault.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -15,6 +16,195 @@ #include "../iommu-priv.h" #include "iommufd_private.h" +static int iommufd_fault_iopf_enable(struct iommufd_device *idev) +{ + struct device *dev = idev->dev; + int ret; + + /* + * Once we turn on PCI/PRI support for VF, the response failure code + * should not be forwarded to the hardware due to PRI being a shared + * resource between PF and VFs. There is no coordination for this + * shared capability. This waits for a vPRI reset to recover. + */ + if (dev_is_pci(dev) && to_pci_dev(dev)->is_virtfn) + return -EINVAL; + + mutex_lock(&idev->iopf_lock); + /* Device iopf has already been on. */ + if (++idev->iopf_enabled > 1) { + mutex_unlock(&idev->iopf_lock); + return 0; + } + + ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF); + if (ret) + --idev->iopf_enabled; + mutex_unlock(&idev->iopf_lock); + + return ret; +} + +static void iommufd_fault_iopf_disable(struct iommufd_device *idev) +{ + mutex_lock(&idev->iopf_lock); + if (!WARN_ON(idev->iopf_enabled == 0)) { + if (--idev->iopf_enabled == 0) + iommu_dev_disable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF); + } + mutex_unlock(&idev->iopf_lock); +} + +static int __fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + struct iommufd_attach_handle *handle; + int ret; + + handle = kzalloc(sizeof(*handle), GFP_KERNEL); + if (!handle) + return -ENOMEM; + + handle->idev = idev; + ret = iommu_attach_group_handle(hwpt->domain, idev->igroup->group, + &handle->handle); + if (ret) + kfree(handle); + + return ret; +} + +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + int ret; + + if (!hwpt->fault) + return -EINVAL; + + ret = iommufd_fault_iopf_enable(idev); + if (ret) + return ret; + + ret = __fault_domain_attach_dev(hwpt, idev); + if (ret) + iommufd_fault_iopf_disable(idev); + + return ret; +} + +static void iommufd_auto_response_faults(struct iommufd_hw_pagetable *hwpt, + struct iommufd_attach_handle *handle) +{ + struct iommufd_fault *fault = hwpt->fault; + struct iopf_group *group, *next; + unsigned long index; + + if (!fault) + return; + + mutex_lock(&fault->mutex); + list_for_each_entry_safe(group, next, &fault->deliver, node) { + if (group->attach_handle != &handle->handle) + continue; + list_del(&group->node); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } + + xa_for_each(&fault->response, index, group) { + if (group->attach_handle != &handle->handle) + continue; + xa_erase(&fault->response, index); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } + mutex_unlock(&fault->mutex); +} + +static struct iommufd_attach_handle * +iommufd_device_get_attach_handle(struct iommufd_device *idev) +{ + struct iommu_attach_handle *handle; + + handle = iommu_attach_handle_get(idev->igroup->group, IOMMU_NO_PASID, 0); + if (!handle) + return NULL; + + return to_iommufd_handle(handle); +} + +void iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + struct iommufd_attach_handle *handle; + + handle = iommufd_device_get_attach_handle(idev); + iommu_detach_group_handle(hwpt->domain, idev->igroup->group); + iommufd_auto_response_faults(hwpt, handle); + iommufd_fault_iopf_disable(idev); + kfree(handle); +} + +static int __fault_domain_replace_dev(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old) +{ + struct iommufd_attach_handle *handle, *curr = NULL; + int ret; + + if (old->fault) + curr = iommufd_device_get_attach_handle(idev); + + if (hwpt->fault) { + handle = kzalloc(sizeof(*handle), GFP_KERNEL); + if (!handle) + return -ENOMEM; + + handle->handle.domain = hwpt->domain; + handle->idev = idev; + ret = iommu_replace_group_handle(idev->igroup->group, + hwpt->domain, &handle->handle); + } else { + ret = iommu_replace_group_handle(idev->igroup->group, + hwpt->domain, NULL); + } + + if (!ret && curr) { + iommufd_auto_response_faults(old, curr); + kfree(curr); + } + + return ret; +} + +int iommufd_fault_domain_replace_dev(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old) +{ + bool iopf_off = !hwpt->fault && old->fault; + bool iopf_on = hwpt->fault && !old->fault; + int ret; + + if (iopf_on) { + ret = iommufd_fault_iopf_enable(idev); + if (ret) + return ret; + } + + ret = __fault_domain_replace_dev(idev, hwpt, old); + if (ret) { + if (iopf_on) + iommufd_fault_iopf_disable(idev); + return ret; + } + + if (iopf_off) + iommufd_fault_iopf_disable(idev); + + return 0; +} + void iommufd_fault_destroy(struct iommufd_object *obj) { struct iommufd_fault *fault = container_of(obj, struct iommufd_fault, obj); diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 227d2be169cdb..a25431c689fd1 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -11,6 +11,7 @@ #include #include #include +#include "../iommu-priv.h" struct iommu_domain; struct iommu_group; @@ -293,6 +294,7 @@ int iommufd_check_iova_range(struct io_pagetable *iopt, struct iommufd_hw_pagetable { struct iommufd_object obj; struct iommu_domain *domain; + struct iommufd_fault *fault; }; struct iommufd_hwpt_paging { @@ -387,6 +389,9 @@ struct iommufd_device { /* always the physical device */ struct device *dev; bool enforce_cache_coherency; + /* protect iopf_enabled counter */ + struct mutex iopf_lock; + unsigned int iopf_enabled; }; static inline struct iommufd_device * @@ -447,6 +452,42 @@ struct iommufd_attach_handle { int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); void iommufd_fault_destroy(struct iommufd_object *obj); +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); +void iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); +int iommufd_fault_domain_replace_dev(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old); + +static inline int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + if (hwpt->fault) + return iommufd_fault_domain_attach_dev(hwpt, idev); + + return iommu_attach_group(hwpt->domain, idev->igroup->group); +} + +static inline void iommufd_hwpt_detach_device(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + if (hwpt->fault) + iommufd_fault_domain_detach_dev(hwpt, idev); + + iommu_detach_group(hwpt->domain, idev->igroup->group); +} + +static inline int iommufd_hwpt_replace_device(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old) +{ + if (old->fault || hwpt->fault) + return iommufd_fault_domain_replace_dev(idev, hwpt, old); + + return iommu_group_replace_domain(idev->igroup->group, hwpt->domain); +} + #ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); void iommufd_selftest_destroy(struct iommufd_object *obj); From 61d7abe7b50b2ffb42ed59a1b9ce45dca17d133e Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:42 +0800 Subject: [PATCH 513/548] iommufd: Associate fault object with iommufd_hw_pgtable When allocating a user iommufd_hw_pagetable, the user space is allowed to associate a fault object with the hw_pagetable by specifying the fault object ID in the page table allocation data and setting the IOMMU_HWPT_FAULT_ID_VALID flag bit. On a successful return of hwpt allocation, the user can retrieve and respond to page faults by reading and writing the file interface of the fault object. Once a fault object has been associated with a hwpt, the hwpt is iopf-capable, indicated by hwpt->fault is non NULL. Attaching, detaching, or replacing an iopf-capable hwpt to an RID or PASID will differ from those that are not iopf-capable. Link: https://lore.kernel.org/r/20240702063444.105814-9-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 34765cbc679c59ea5d952d738d2d16bf4aadc497) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/fault.c | 17 +++++++++++ drivers/iommu/iommufd/hw_pagetable.c | 38 +++++++++++++++++++------ drivers/iommu/iommufd/iommufd_private.h | 9 ++++++ include/uapi/linux/iommufd.h | 8 ++++++ 4 files changed, 64 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c index 4934ae5726383..54d6cd20a6730 100644 --- a/drivers/iommu/iommufd/fault.c +++ b/drivers/iommu/iommufd/fault.c @@ -414,3 +414,20 @@ int iommufd_fault_alloc(struct iommufd_ucmd *ucmd) return rc; } + +int iommufd_fault_iopf_handler(struct iopf_group *group) +{ + struct iommufd_hw_pagetable *hwpt; + struct iommufd_fault *fault; + + hwpt = group->attach_handle->domain->fault_data; + fault = hwpt->fault; + + mutex_lock(&fault->mutex); + list_add_tail(&group->node, &fault->deliver); + mutex_unlock(&fault->mutex); + + wake_up_interruptible(&fault->wait_queue); + + return 0; +} diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index a2604955a7345..0350f90047416 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -8,6 +8,15 @@ #include "../iommu-priv.h" #include "iommufd_private.h" +static void __iommufd_hwpt_destroy(struct iommufd_hw_pagetable *hwpt) +{ + if (hwpt->domain) + iommu_domain_free(hwpt->domain); + + if (hwpt->fault) + refcount_dec(&hwpt->fault->obj.users); +} + void iommufd_hwpt_paging_destroy(struct iommufd_object *obj) { struct iommufd_hwpt_paging *hwpt_paging = @@ -22,9 +31,7 @@ void iommufd_hwpt_paging_destroy(struct iommufd_object *obj) hwpt_paging->common.domain); } - if (hwpt_paging->common.domain) - iommu_domain_free(hwpt_paging->common.domain); - + __iommufd_hwpt_destroy(&hwpt_paging->common); refcount_dec(&hwpt_paging->ioas->obj.users); } @@ -49,9 +56,7 @@ void iommufd_hwpt_nested_destroy(struct iommufd_object *obj) struct iommufd_hwpt_nested *hwpt_nested = container_of(obj, struct iommufd_hwpt_nested, common.obj); - if (hwpt_nested->common.domain) - iommu_domain_free(hwpt_nested->common.domain); - + __iommufd_hwpt_destroy(&hwpt_nested->common); refcount_dec(&hwpt_nested->parent->common.obj.users); } @@ -213,7 +218,8 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, struct iommufd_hw_pagetable *hwpt; int rc; - if (flags || !user_data->len || !ops->domain_alloc_user) + if ((flags & ~IOMMU_HWPT_FAULT_ID_VALID) || + !user_data->len || !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); if (parent->auto_domain || !parent->nest_parent || parent->common.domain->owner != ops) @@ -228,7 +234,8 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, refcount_inc(&parent->common.obj.users); hwpt_nested->parent = parent; - hwpt->domain = ops->domain_alloc_user(idev->dev, flags, + hwpt->domain = ops->domain_alloc_user(idev->dev, + flags & ~IOMMU_HWPT_FAULT_ID_VALID, parent->common.domain, user_data); if (IS_ERR(hwpt->domain)) { rc = PTR_ERR(hwpt->domain); @@ -308,6 +315,21 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) goto out_put_pt; } + if (cmd->flags & IOMMU_HWPT_FAULT_ID_VALID) { + struct iommufd_fault *fault; + + fault = iommufd_get_fault(ucmd, cmd->fault_id); + if (IS_ERR(fault)) { + rc = PTR_ERR(fault); + goto out_hwpt; + } + hwpt->fault = fault; + hwpt->domain->iopf_handler = iommufd_fault_iopf_handler; + hwpt->domain->fault_data = hwpt; + refcount_inc(&fault->obj.users); + iommufd_put_object(ucmd->ictx, &fault->obj); + } + cmd->out_hwpt_id = hwpt->obj.id; rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); if (rc) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index a25431c689fd1..c921e87464cda 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -449,8 +449,17 @@ struct iommufd_attach_handle { /* Convert an iommu attach handle to iommufd handle. */ #define to_iommufd_handle(hdl) container_of(hdl, struct iommufd_attach_handle, handle) +static inline struct iommufd_fault * +iommufd_get_fault(struct iommufd_ucmd *ucmd, u32 id) +{ + return container_of(iommufd_get_object(ucmd->ictx, id, + IOMMUFD_OBJ_FAULT), + struct iommufd_fault, obj); +} + int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); void iommufd_fault_destroy(struct iommufd_object *obj); +int iommufd_fault_iopf_handler(struct iopf_group *group); int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 73b33d74336c9..cd47af424f7e3 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -356,10 +356,13 @@ struct iommu_vfio_ioas { * the parent HWPT in a nesting configuration. * @IOMMU_HWPT_ALLOC_DIRTY_TRACKING: Dirty tracking support for device IOMMU is * enforced on device attachment + * @IOMMU_HWPT_FAULT_ID_VALID: The fault_id field of hwpt allocation data is + * valid. */ enum iommufd_hwpt_alloc_flags { IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0, IOMMU_HWPT_ALLOC_DIRTY_TRACKING = 1 << 1, + IOMMU_HWPT_FAULT_ID_VALID = 1 << 2, }; /** @@ -381,6 +384,9 @@ enum iommu_hwpt_data_type { * @data_type: One of enum iommu_hwpt_data_type * @data_len: Length of the type specific data * @data_uptr: User pointer to the type specific data + * @fault_id: The ID of IOMMUFD_FAULT object. Valid only if flags field of + * IOMMU_HWPT_FAULT_ID_VALID is set. + * @__reserved2: Padding to 64-bit alignment. Must be 0. * * Explicitly allocate a hardware page table object. This is the same object * type that is returned by iommufd_device_attach() and represents the @@ -411,6 +417,8 @@ struct iommu_hwpt_alloc { __u32 data_type; __u32 data_len; __aligned_u64 data_uptr; + __u32 fault_id; + __u32 __reserved2; }; #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC) From fd061393fb2b7593b8dd9496c541244fbcdb307f Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:43 +0800 Subject: [PATCH 514/548] iommufd/selftest: Add IOPF support for mock device Extend the selftest mock device to support generating and responding to an IOPF. Also add an ioctl interface to userspace applications to trigger the IOPF on the mock device. This would allow userspace applications to test the IOMMUFD's handling of IOPFs without having to rely on any real hardware. Link: https://lore.kernel.org/r/20240702063444.105814-10-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit ddee19971081b42615d62f4fdada21274708ed4d) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/iommufd_test.h | 8 ++++ drivers/iommu/iommufd/selftest.c | 64 ++++++++++++++++++++++++++++ 2 files changed, 72 insertions(+) diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index 7910fbe1962d7..1827f11de865c 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -21,6 +21,7 @@ enum { IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS, IOMMU_TEST_OP_DIRTY, + IOMMU_TEST_OP_TRIGGER_IOPF, }; enum { @@ -121,6 +122,13 @@ struct iommu_test_cmd { __aligned_u64 uptr; __aligned_u64 out_nr_dirty; } dirty; + struct { + __u32 dev_id; + __u32 pasid; + __u32 grpid; + __u32 perm; + __u64 addr; + } trigger_iopf; }; __u32 last; }; diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 62aab1178eff3..00e9fa8256add 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -495,6 +495,8 @@ static bool mock_domain_capable(struct device *dev, enum iommu_cap cap) return false; } +static struct iopf_queue *mock_iommu_iopf_queue; + static struct iommu_device mock_iommu_device = { }; @@ -505,6 +507,29 @@ static struct iommu_device *mock_probe_device(struct device *dev) return &mock_iommu_device; } +static void mock_domain_page_response(struct device *dev, struct iopf_fault *evt, + struct iommu_page_response *msg) +{ +} + +static int mock_dev_enable_feat(struct device *dev, enum iommu_dev_features feat) +{ + if (feat != IOMMU_DEV_FEAT_IOPF || !mock_iommu_iopf_queue) + return -ENODEV; + + return iopf_queue_add_device(mock_iommu_iopf_queue, dev); +} + +static int mock_dev_disable_feat(struct device *dev, enum iommu_dev_features feat) +{ + if (feat != IOMMU_DEV_FEAT_IOPF || !mock_iommu_iopf_queue) + return -ENODEV; + + iopf_queue_remove_device(mock_iommu_iopf_queue, dev); + + return 0; +} + static const struct iommu_ops mock_ops = { /* * IOMMU_DOMAIN_BLOCKED cannot be returned from def_domain_type() @@ -520,6 +545,10 @@ static const struct iommu_ops mock_ops = { .capable = mock_domain_capable, .device_group = generic_device_group, .probe_device = mock_probe_device, + .page_response = mock_domain_page_response, + .dev_enable_feat = mock_dev_enable_feat, + .dev_disable_feat = mock_dev_disable_feat, + .user_pasid_table = true, .default_domain_ops = &(struct iommu_domain_ops){ .free = mock_domain_free, @@ -1289,6 +1318,31 @@ static int iommufd_test_dirty(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, return rc; } +static int iommufd_test_trigger_iopf(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct iopf_fault event = { }; + struct iommufd_device *idev; + + idev = iommufd_get_device(ucmd, cmd->trigger_iopf.dev_id); + if (IS_ERR(idev)) + return PTR_ERR(idev); + + event.fault.prm.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE; + if (cmd->trigger_iopf.pasid != IOMMU_NO_PASID) + event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; + event.fault.type = IOMMU_FAULT_PAGE_REQ; + event.fault.prm.addr = cmd->trigger_iopf.addr; + event.fault.prm.pasid = cmd->trigger_iopf.pasid; + event.fault.prm.grpid = cmd->trigger_iopf.grpid; + event.fault.prm.perm = cmd->trigger_iopf.perm; + + iommu_report_device_fault(idev->dev, &event); + iommufd_put_object(ucmd->ictx, &idev->obj); + + return 0; +} + void iommufd_selftest_destroy(struct iommufd_object *obj) { struct selftest_obj *sobj = container_of(obj, struct selftest_obj, obj); @@ -1360,6 +1414,8 @@ int iommufd_test(struct iommufd_ucmd *ucmd) cmd->dirty.page_size, u64_to_user_ptr(cmd->dirty.uptr), cmd->dirty.flags); + case IOMMU_TEST_OP_TRIGGER_IOPF: + return iommufd_test_trigger_iopf(ucmd, cmd); default: return -EOPNOTSUPP; } @@ -1401,6 +1457,9 @@ int __init iommufd_test_init(void) &iommufd_mock_bus_type.nb); if (rc) goto err_sysfs; + + mock_iommu_iopf_queue = iopf_queue_alloc("mock-iopfq"); + return 0; err_sysfs: @@ -1416,6 +1475,11 @@ int __init iommufd_test_init(void) void iommufd_test_exit(void) { + if (mock_iommu_iopf_queue) { + iopf_queue_free(mock_iommu_iopf_queue); + mock_iommu_iopf_queue = NULL; + } + iommu_device_sysfs_remove(&mock_iommu_device); iommu_device_unregister_bus(&mock_iommu_device, &iommufd_mock_bus_type.bus, From edb707b2939b587a3b9c4720d901df10a9bc35cc Mon Sep 17 00:00:00 2001 From: Lu Baolu Date: Tue, 2 Jul 2024 14:34:44 +0800 Subject: [PATCH 515/548] iommufd/selftest: Add coverage for IOPF test Extend the selftest tool to add coverage of testing IOPF handling. This would include the following tests: - Allocating and destroying an iommufd fault object. - Allocating and destroying an IOPF-capable HWPT. - Attaching/detaching/replacing an IOPF-capable HWPT on a device. - Triggering an IOPF on the mock device. - Retrieving and responding to the IOPF through the file interface. Link: https://lore.kernel.org/r/20240702063444.105814-11-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit d1211768b62d02e27b46a3ff78f739c4776a0f03) Signed-off-by: Koba Ko --- tools/testing/selftests/iommu/iommufd.c | 22 +++++ .../selftests/iommu/iommufd_fail_nth.c | 2 +- tools/testing/selftests/iommu/iommufd_utils.h | 86 +++++++++++++++++-- 3 files changed, 104 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 0b24976409a17..4e9329e09e93f 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -273,6 +273,9 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) uint32_t parent_hwpt_id = 0; uint32_t parent_hwpt_id_not_work = 0; uint32_t test_hwpt_id = 0; + uint32_t iopf_hwpt_id; + uint32_t fault_id; + uint32_t fault_fd; if (self->device_id) { /* Negative tests */ @@ -320,6 +323,7 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) sizeof(data)); /* Allocate two nested hwpts sharing one common parent hwpt */ + test_ioctl_fault_alloc(&fault_id, &fault_fd); test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, &nested_hwpt_id[0], IOMMU_HWPT_DATA_SELFTEST, &data, @@ -328,6 +332,14 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) &nested_hwpt_id[1], IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data)); + test_err_hwpt_alloc_iopf(ENOENT, self->device_id, parent_hwpt_id, + UINT32_MAX, IOMMU_HWPT_FAULT_ID_VALID, + &iopf_hwpt_id, IOMMU_HWPT_DATA_SELFTEST, + &data, sizeof(data)); + test_cmd_hwpt_alloc_iopf(self->device_id, parent_hwpt_id, fault_id, + IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id, + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); /* Negative test: a nested hwpt on top of a nested hwpt */ test_err_hwpt_alloc_nested(EINVAL, self->device_id, @@ -349,14 +361,24 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) _test_ioctl_destroy(self->fd, nested_hwpt_id[1])); test_ioctl_destroy(nested_hwpt_id[0]); + /* Switch from nested_hwpt_id[1] to iopf_hwpt_id */ + test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id); + EXPECT_ERRNO(EBUSY, + _test_ioctl_destroy(self->fd, iopf_hwpt_id)); + /* Trigger an IOPF on the device */ + test_cmd_trigger_iopf(self->device_id, fault_fd); + /* Detach from nested_hwpt_id[1] and destroy it */ test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id); test_ioctl_destroy(nested_hwpt_id[1]); + test_ioctl_destroy(iopf_hwpt_id); /* Detach from the parent hw_pagetable and destroy it */ test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); test_ioctl_destroy(parent_hwpt_id); test_ioctl_destroy(parent_hwpt_id_not_work); + close(fault_fd); + test_ioctl_destroy(fault_id); } else { test_err_hwpt_alloc(ENOENT, self->device_id, self->ioas_id, 0, &parent_hwpt_id); diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index 17ff2d605355a..4ee39af6b883c 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -615,7 +615,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL)) return -1; - if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id, + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 0, &hwpt_id, IOMMU_HWPT_DATA_NONE, 0, 0)) return -1; diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index c24d00fd8542f..d9df1f3233332 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -153,7 +153,7 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id, EXPECT_ERRNO(_errno, _test_cmd_mock_domain_replace(self->fd, stdev_id, \ pt_id, NULL)) -static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, +static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, __u32 ft_id, __u32 flags, __u32 *hwpt_id, __u32 data_type, void *data, size_t data_len) { @@ -165,6 +165,7 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, .data_type = data_type, .data_len = data_len, .data_uptr = (uint64_t)data, + .fault_id = ft_id, }; int ret; @@ -177,24 +178,36 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, } #define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, 0, flags, \ hwpt_id, IOMMU_HWPT_DATA_NONE, NULL, \ 0)) #define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc( \ - self->fd, device_id, pt_id, flags, \ + self->fd, device_id, pt_id, 0, flags, \ hwpt_id, IOMMU_HWPT_DATA_NONE, NULL, 0)) #define test_cmd_hwpt_alloc_nested(device_id, pt_id, flags, hwpt_id, \ data_type, data, data_len) \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, 0, flags, \ hwpt_id, data_type, data, data_len)) #define test_err_hwpt_alloc_nested(_errno, device_id, pt_id, flags, hwpt_id, \ data_type, data, data_len) \ EXPECT_ERRNO(_errno, \ - _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, 0, flags, \ hwpt_id, data_type, data, data_len)) +#define test_cmd_hwpt_alloc_iopf(device_id, pt_id, fault_id, flags, hwpt_id, \ + data_type, data, data_len) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, fault_id, \ + flags, hwpt_id, data_type, data, \ + data_len)) +#define test_err_hwpt_alloc_iopf(_errno, device_id, pt_id, fault_id, flags, \ + hwpt_id, data_type, data, data_len) \ + EXPECT_ERRNO(_errno, \ + _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, fault_id, \ + flags, hwpt_id, data_type, data, \ + data_len)) + static int _test_cmd_access_replace_ioas(int fd, __u32 access_id, unsigned int ioas_id) { @@ -629,3 +642,66 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data, #define test_cmd_get_hw_capabilities(device_id, caps, mask) \ ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps)) + +static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_fd) +{ + struct iommu_fault_alloc cmd = { + .size = sizeof(cmd), + }; + int ret; + + ret = ioctl(fd, IOMMU_FAULT_QUEUE_ALLOC, &cmd); + if (ret) + return ret; + *fault_id = cmd.out_fault_id; + *fault_fd = cmd.out_fault_fd; + return 0; +} + +#define test_ioctl_fault_alloc(fault_id, fault_fd) \ + ({ \ + ASSERT_EQ(0, _test_ioctl_fault_alloc(self->fd, fault_id, \ + fault_fd)); \ + ASSERT_NE(0, *(fault_id)); \ + ASSERT_NE(0, *(fault_fd)); \ + }) + +static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd) +{ + struct iommu_test_cmd trigger_iopf_cmd = { + .size = sizeof(trigger_iopf_cmd), + .op = IOMMU_TEST_OP_TRIGGER_IOPF, + .trigger_iopf = { + .dev_id = device_id, + .pasid = 0x1, + .grpid = 0x2, + .perm = IOMMU_PGFAULT_PERM_READ | IOMMU_PGFAULT_PERM_WRITE, + .addr = 0xdeadbeaf, + }, + }; + struct iommu_hwpt_page_response response = { + .code = IOMMUFD_PAGE_RESP_SUCCESS, + }; + struct iommu_hwpt_pgfault fault = {}; + ssize_t bytes; + int ret; + + ret = ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_TRIGGER_IOPF), &trigger_iopf_cmd); + if (ret) + return ret; + + bytes = read(fault_fd, &fault, sizeof(fault)); + if (bytes <= 0) + return -EIO; + + response.cookie = fault.cookie; + + bytes = write(fault_fd, &response, sizeof(response)); + if (bytes <= 0) + return -EIO; + + return 0; +} + +#define test_cmd_trigger_iopf(device_id, fault_fd) \ + ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, fault_fd)) From 6dd64f74c38b1eec2597b469a8ca1e7f7df94d04 Mon Sep 17 00:00:00 2001 From: Pranjal Shrivastava Date: Fri, 16 Aug 2024 10:49:06 +0000 Subject: [PATCH 516/548] iommu: Handle iommu faults for a bad iopf setup The iommu_report_device_fault function was updated to return void while assuming that drivers only need to call iommu_report_device_fault() for reporting an iopf. This implementation causes following problems: 1. The drivers rely on the core code to call it's page_reponse, however, when a fault is received and no fault capable domain is attached / iopf_param is NULL, the ops->page_response is NOT called causing the device to stall in case the fault type was PAGE_REQ. 2. The arm_smmu_v3 driver relies on the returned value to log errors returning void from iommu_report_device_fault causes these events to be missed while logging. Modify the iommu_report_device_fault function to return -EINVAL for cases where no fault capable domain is attached or iopf_param was NULL and calls back to the driver (ops->page_response) in case the fault type was IOMMU_FAULT_PAGE_REQ. The returned value can be used by the drivers to log the fault/event as needed. Reported-by: Kunkun Jiang Closes: https://lore.kernel.org/all/6147caf0-b9a0-30ca-795e-a1aa502a5c51@huawei.com/ Fixes: 3dfa64aecbaf ("iommu: Make iommu_report_device_fault() return void") Signed-off-by: Jason Gunthorpe Signed-off-by: Pranjal Shrivastava Reviewed-by: Jason Gunthorpe Reviewed-by: Lu Baolu Link: https://lore.kernel.org/r/20240816104906.1010626-1-praan@google.com Signed-off-by: Joerg Roedel (cherry picked from commit b58b133e680b20d219940e0fdb6f6132c2b60f38) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- drivers/iommu/io-pgfault.c | 120 ++++++++++++++------ include/linux/iommu.h | 5 +- 3 files changed, 87 insertions(+), 40 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index bbce7cca34d0c..0576ca33be343 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1701,7 +1701,7 @@ static int arm_smmu_handle_evt(struct arm_smmu_device *smmu, u64 *evt) goto out_unlock; } - iommu_report_device_fault(master->dev, &fault_evt); + ret = iommu_report_device_fault(master->dev, &fault_evt); out_unlock: mutex_unlock(&smmu->streams_mutex); return ret; diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c index e017902f90539..8b5926c1452ed 100644 --- a/drivers/iommu/io-pgfault.c +++ b/drivers/iommu/io-pgfault.c @@ -115,6 +115,59 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param, return group; } +static struct iommu_attach_handle *find_fault_handler(struct device *dev, + struct iopf_fault *evt) +{ + struct iommu_fault *fault = &evt->fault; + struct iommu_attach_handle *attach_handle; + + if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) { + attach_handle = iommu_attach_handle_get(dev->iommu_group, + fault->prm.pasid, 0); + if (IS_ERR(attach_handle)) { + const struct iommu_ops *ops = dev_iommu_ops(dev); + + if (!ops->user_pasid_table) + return NULL; + /* + * The iommu driver for this device supports user- + * managed PASID table. Therefore page faults for + * any PASID should go through the NESTING domain + * attached to the device RID. + */ + attach_handle = iommu_attach_handle_get( + dev->iommu_group, IOMMU_NO_PASID, + IOMMU_DOMAIN_NESTED); + if (IS_ERR(attach_handle)) + return NULL; + } + } else { + attach_handle = iommu_attach_handle_get(dev->iommu_group, + IOMMU_NO_PASID, 0); + + if (IS_ERR(attach_handle)) + return NULL; + } + + if (!attach_handle->domain->iopf_handler) + return NULL; + + return attach_handle; +} + +static void iopf_error_response(struct device *dev, struct iopf_fault *evt) +{ + const struct iommu_ops *ops = dev_iommu_ops(dev); + struct iommu_fault *fault = &evt->fault; + struct iommu_page_response resp = { + .pasid = fault->prm.pasid, + .grpid = fault->prm.grpid, + .code = IOMMU_PAGE_RESP_INVALID + }; + + ops->page_response(dev, evt, &resp); +} + /** * iommu_report_device_fault() - Report fault event to device driver * @dev: the device @@ -153,23 +206,39 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param, * handling framework should guarantee that the iommu domain could only be * freed after the device has stopped generating page faults (or the iommu * hardware has been set to block the page faults) and the pending page faults - * have been flushed. + * have been flushed. In case no page fault handler is attached or no iopf params + * are setup, then the ops->page_response() is called to complete the evt. + * + * Returns 0 on success, or an error in case of a bad/failed iopf setup. */ -void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { + struct iommu_attach_handle *attach_handle; struct iommu_fault *fault = &evt->fault; struct iommu_fault_param *iopf_param; struct iopf_group abort_group = {}; struct iopf_group *group; + attach_handle = find_fault_handler(dev, evt); + if (!attach_handle) + goto err_bad_iopf; + + /* + * Something has gone wrong if a fault capable domain is attached but no + * iopf_param is setup + */ iopf_param = iopf_get_dev_fault_param(dev); if (WARN_ON(!iopf_param)) - return; + goto err_bad_iopf; if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { - report_partial_fault(iopf_param, fault); + int ret; + + ret = report_partial_fault(iopf_param, fault); iopf_put_dev_fault_param(iopf_param); /* A request that is not the last does not need to be ack'd */ + + return ret; } /* @@ -184,38 +253,7 @@ void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) if (group == &abort_group) goto err_abort; - if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) { - group->attach_handle = iommu_attach_handle_get(dev->iommu_group, - fault->prm.pasid, - 0); - if (IS_ERR(group->attach_handle)) { - const struct iommu_ops *ops = dev_iommu_ops(dev); - - if (!ops->user_pasid_table) - goto err_abort; - - /* - * The iommu driver for this device supports user- - * managed PASID table. Therefore page faults for - * any PASID should go through the NESTING domain - * attached to the device RID. - */ - group->attach_handle = - iommu_attach_handle_get(dev->iommu_group, - IOMMU_NO_PASID, - IOMMU_DOMAIN_NESTED); - if (IS_ERR(group->attach_handle)) - goto err_abort; - } - } else { - group->attach_handle = - iommu_attach_handle_get(dev->iommu_group, IOMMU_NO_PASID, 0); - if (IS_ERR(group->attach_handle)) - goto err_abort; - } - - if (!group->attach_handle->domain->iopf_handler) - goto err_abort; + group->attach_handle = attach_handle; /* * On success iopf_handler must call iopf_group_response() and @@ -224,7 +262,7 @@ void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) if (group->attach_handle->domain->iopf_handler(group)) goto err_abort; - return; + return 0; err_abort: dev_warn_ratelimited(dev, "iopf with pasid %d aborted\n", @@ -234,6 +272,14 @@ void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) __iopf_free_group(group); else iopf_free_group(group); + + return 0; + +err_bad_iopf: + if (fault->type == IOMMU_FAULT_PAGE_REQ) + iopf_error_response(dev, evt); + + return -EINVAL; } EXPORT_SYMBOL_GPL(iommu_report_device_fault); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 3e0250aafa00f..bc221bc51e8f8 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1492,7 +1492,7 @@ struct iopf_queue *iopf_queue_alloc(const char *name); void iopf_queue_free(struct iopf_queue *queue); int iopf_queue_discard_partial(struct iopf_queue *queue); void iopf_free_group(struct iopf_group *group); -void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); +int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt); void iopf_group_response(struct iopf_group *group, enum iommu_page_response_code status); #else @@ -1530,9 +1530,10 @@ static inline void iopf_free_group(struct iopf_group *group) { } -static inline void +static inline int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt) { + return -ENODEV; } static inline void iopf_group_response(struct iopf_group *group, From 452b70a5136d4dd6c88f8d7ee7c0b9b1f9d29dc1 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:33 -0300 Subject: [PATCH 517/548] iommu/arm-smmu-v3: Add an ops indirection to the STE code Prepare to put the CD code into the same mechanism. Add an ops indirection around all the STE specific code and make the worker functions independent of the entry content being processed. get_used and sync ops are provided to hook the correct code. Signed-off-by: Michael Shavit Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/1-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit de31c355541286aa4c938c982dfcafbf062fcb93) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 176 ++++++++++++-------- 1 file changed, 104 insertions(+), 72 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 0576ca33be343..dd42305b2bbf1 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -47,8 +47,19 @@ enum arm_smmu_msi_index { ARM_SMMU_MAX_MSIS, }; -static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, - ioasid_t sid); +struct arm_smmu_entry_writer_ops; +struct arm_smmu_entry_writer { + const struct arm_smmu_entry_writer_ops *ops; + struct arm_smmu_master *master; +}; + +struct arm_smmu_entry_writer_ops { + void (*get_used)(const __le64 *entry, __le64 *used); + void (*sync)(struct arm_smmu_entry_writer *writer); +}; + +#define NUM_ENTRY_QWORDS 8 +static_assert(sizeof(struct arm_smmu_ste) == NUM_ENTRY_QWORDS * sizeof(u64)); static phys_addr_t arm_smmu_msi_cfg[ARM_SMMU_MAX_MSIS][3] = { [EVTQ_MSI_INDEX] = { @@ -977,43 +988,42 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) * would be nice if this was complete according to the spec, but minimally it * has to capture the bits this driver uses. */ -static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent, - struct arm_smmu_ste *used_bits) +static void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits) { - unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent->data[0])); + unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent[0])); - used_bits->data[0] = cpu_to_le64(STRTAB_STE_0_V); - if (!(ent->data[0] & cpu_to_le64(STRTAB_STE_0_V))) + used_bits[0] = cpu_to_le64(STRTAB_STE_0_V); + if (!(ent[0] & cpu_to_le64(STRTAB_STE_0_V))) return; - used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_CFG); + used_bits[0] |= cpu_to_le64(STRTAB_STE_0_CFG); /* S1 translates */ if (cfg & BIT(0)) { - used_bits->data[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT | - STRTAB_STE_0_S1CTXPTR_MASK | - STRTAB_STE_0_S1CDMAX); - used_bits->data[1] |= + used_bits[0] |= cpu_to_le64(STRTAB_STE_0_S1FMT | + STRTAB_STE_0_S1CTXPTR_MASK | + STRTAB_STE_0_S1CDMAX); + used_bits[1] |= cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR | STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH | STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW | STRTAB_STE_1_EATS); - used_bits->data[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID); + used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID); } /* S2 translates */ if (cfg & BIT(1)) { - used_bits->data[1] |= + used_bits[1] |= cpu_to_le64(STRTAB_STE_1_EATS | STRTAB_STE_1_SHCFG); - used_bits->data[2] |= + used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID | STRTAB_STE_2_VTCR | STRTAB_STE_2_S2AA64 | STRTAB_STE_2_S2ENDI | STRTAB_STE_2_S2PTW | STRTAB_STE_2_S2R); - used_bits->data[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK); + used_bits[3] |= cpu_to_le64(STRTAB_STE_3_S2TTB_MASK); } if (cfg == STRTAB_STE_0_CFG_BYPASS) - used_bits->data[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG); + used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG); } /* @@ -1022,57 +1032,55 @@ static void arm_smmu_get_ste_used(const struct arm_smmu_ste *ent, * unused_update is an intermediate value of entry that has unused bits set to * their new values. */ -static u8 arm_smmu_entry_qword_diff(const struct arm_smmu_ste *entry, - const struct arm_smmu_ste *target, - struct arm_smmu_ste *unused_update) +static u8 arm_smmu_entry_qword_diff(struct arm_smmu_entry_writer *writer, + const __le64 *entry, const __le64 *target, + __le64 *unused_update) { - struct arm_smmu_ste target_used = {}; - struct arm_smmu_ste cur_used = {}; + __le64 target_used[NUM_ENTRY_QWORDS] = {}; + __le64 cur_used[NUM_ENTRY_QWORDS] = {}; u8 used_qword_diff = 0; unsigned int i; - arm_smmu_get_ste_used(entry, &cur_used); - arm_smmu_get_ste_used(target, &target_used); + writer->ops->get_used(entry, cur_used); + writer->ops->get_used(target, target_used); - for (i = 0; i != ARRAY_SIZE(target_used.data); i++) { + for (i = 0; i != NUM_ENTRY_QWORDS; i++) { /* * Check that masks are up to date, the make functions are not * allowed to set a bit to 1 if the used function doesn't say it * is used. */ - WARN_ON_ONCE(target->data[i] & ~target_used.data[i]); + WARN_ON_ONCE(target[i] & ~target_used[i]); /* Bits can change because they are not currently being used */ - unused_update->data[i] = (entry->data[i] & cur_used.data[i]) | - (target->data[i] & ~cur_used.data[i]); + unused_update[i] = (entry[i] & cur_used[i]) | + (target[i] & ~cur_used[i]); /* * Each bit indicates that a used bit in a qword needs to be * changed after unused_update is applied. */ - if ((unused_update->data[i] & target_used.data[i]) != - target->data[i]) + if ((unused_update[i] & target_used[i]) != target[i]) used_qword_diff |= 1 << i; } return used_qword_diff; } -static bool entry_set(struct arm_smmu_device *smmu, ioasid_t sid, - struct arm_smmu_ste *entry, - const struct arm_smmu_ste *target, unsigned int start, +static bool entry_set(struct arm_smmu_entry_writer *writer, __le64 *entry, + const __le64 *target, unsigned int start, unsigned int len) { bool changed = false; unsigned int i; for (i = start; len != 0; len--, i++) { - if (entry->data[i] != target->data[i]) { - WRITE_ONCE(entry->data[i], target->data[i]); + if (entry[i] != target[i]) { + WRITE_ONCE(entry[i], target[i]); changed = true; } } if (changed) - arm_smmu_sync_ste_for_sid(smmu, sid); + writer->ops->sync(writer); return changed; } @@ -1102,24 +1110,21 @@ static bool entry_set(struct arm_smmu_device *smmu, ioasid_t sid, * V=0 process. This relies on the IGNORED behavior described in the * specification. */ -static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, - struct arm_smmu_ste *entry, - const struct arm_smmu_ste *target) +static void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer, + __le64 *entry, const __le64 *target) { - unsigned int num_entry_qwords = ARRAY_SIZE(target->data); - struct arm_smmu_device *smmu = master->smmu; - struct arm_smmu_ste unused_update; + __le64 unused_update[NUM_ENTRY_QWORDS]; u8 used_qword_diff; used_qword_diff = - arm_smmu_entry_qword_diff(entry, target, &unused_update); + arm_smmu_entry_qword_diff(writer, entry, target, unused_update); if (hweight8(used_qword_diff) == 1) { /* * Only one qword needs its used bits to be changed. This is a - * hitless update, update all bits the current STE is ignoring - * to their new values, then update a single "critical qword" to - * change the STE and finally 0 out any bits that are now unused - * in the target configuration. + * hitless update, update all bits the current STE/CD is + * ignoring to their new values, then update a single "critical + * qword" to change the STE/CD and finally 0 out any bits that + * are now unused in the target configuration. */ unsigned int critical_qword_index = ffs(used_qword_diff) - 1; @@ -1128,22 +1133,21 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, * writing it in the next step anyways. This can save a sync * when the only change is in that qword. */ - unused_update.data[critical_qword_index] = - entry->data[critical_qword_index]; - entry_set(smmu, sid, entry, &unused_update, 0, num_entry_qwords); - entry_set(smmu, sid, entry, target, critical_qword_index, 1); - entry_set(smmu, sid, entry, target, 0, num_entry_qwords); + unused_update[critical_qword_index] = + entry[critical_qword_index]; + entry_set(writer, entry, unused_update, 0, NUM_ENTRY_QWORDS); + entry_set(writer, entry, target, critical_qword_index, 1); + entry_set(writer, entry, target, 0, NUM_ENTRY_QWORDS); } else if (used_qword_diff) { /* * At least two qwords need their inuse bits to be changed. This * requires a breaking update, zero the V bit, write all qwords * but 0, then set qword 0 */ - unused_update.data[0] = entry->data[0] & - cpu_to_le64(~STRTAB_STE_0_V); - entry_set(smmu, sid, entry, &unused_update, 0, 1); - entry_set(smmu, sid, entry, target, 1, num_entry_qwords - 1); - entry_set(smmu, sid, entry, target, 0, 1); + unused_update[0] = 0; + entry_set(writer, entry, unused_update, 0, 1); + entry_set(writer, entry, target, 1, NUM_ENTRY_QWORDS - 1); + entry_set(writer, entry, target, 0, 1); } else { /* * No inuse bit changed. Sanity check that all unused bits are 0 @@ -1151,18 +1155,7 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, * compute_qword_diff(). */ WARN_ON_ONCE( - entry_set(smmu, sid, entry, target, 0, num_entry_qwords)); - } - - /* It's likely that we'll want to use the new STE soon */ - if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) { - struct arm_smmu_cmdq_ent - prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG, - .prefetch = { - .sid = sid, - } }; - - arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd); + entry_set(writer, entry, target, 0, NUM_ENTRY_QWORDS)); } } @@ -1432,17 +1425,56 @@ arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc) WRITE_ONCE(*dst, cpu_to_le64(val)); } -static void arm_smmu_sync_ste_for_sid(struct arm_smmu_device *smmu, u32 sid) +struct arm_smmu_ste_writer { + struct arm_smmu_entry_writer writer; + u32 sid; +}; + +static void arm_smmu_ste_writer_sync_entry(struct arm_smmu_entry_writer *writer) { + struct arm_smmu_ste_writer *ste_writer = + container_of(writer, struct arm_smmu_ste_writer, writer); struct arm_smmu_cmdq_ent cmd = { .opcode = CMDQ_OP_CFGI_STE, .cfgi = { - .sid = sid, + .sid = ste_writer->sid, .leaf = true, }, }; - arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd); + arm_smmu_cmdq_issue_cmd_with_sync(writer->master->smmu, &cmd); +} + +static const struct arm_smmu_entry_writer_ops arm_smmu_ste_writer_ops = { + .sync = arm_smmu_ste_writer_sync_entry, + .get_used = arm_smmu_get_ste_used, +}; + +static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, + struct arm_smmu_ste *ste, + const struct arm_smmu_ste *target) +{ + struct arm_smmu_device *smmu = master->smmu; + struct arm_smmu_ste_writer ste_writer = { + .writer = { + .ops = &arm_smmu_ste_writer_ops, + .master = master, + }, + .sid = sid, + }; + + arm_smmu_write_entry(&ste_writer.writer, ste->data, target->data); + + /* It's likely that we'll want to use the new STE soon */ + if (!(smmu->options & ARM_SMMU_OPT_SKIP_PREFETCH)) { + struct arm_smmu_cmdq_ent + prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG, + .prefetch = { + .sid = sid, + } }; + + arm_smmu_cmdq_issue_cmd(smmu, &prefetch_cmd); + } } static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) From 7a410669102ed88451bea1695984efd1dd709bbf Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:34 -0300 Subject: [PATCH 518/548] iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry() CD table entries and STE's have the same essential programming sequence, just with different types. Use the new ops indirection to link CD programming to the common writer. In a few more patches all CD writers will call an appropriate make function and then directly call arm_smmu_write_cd_entry(). arm_smmu_write_ctx_desc() will be removed. Until then lightly tweak arm_smmu_write_ctx_desc() to also use the new programmer by using the same logic as right now to build the target CD on the stack, sanitizing it to meet the used rules, and then using the writer. Sanitizing is necessary because the writer expects that the currently programmed CD follows the used rules. Next patches add new make functions and new direct calls to arm_smmu_write_cd_entry() which will require this. Signed-off-by: Michael Shavit Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Moritz Fischer Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/2-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 78a5fbe8395b365d58142ff9b7a6aeb556481a1f) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 ++++++++++++++++----- 1 file changed, 67 insertions(+), 22 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index dd42305b2bbf1..5ad4830d77e8e 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -60,6 +60,7 @@ struct arm_smmu_entry_writer_ops { #define NUM_ENTRY_QWORDS 8 static_assert(sizeof(struct arm_smmu_ste) == NUM_ENTRY_QWORDS * sizeof(u64)); +static_assert(sizeof(struct arm_smmu_cd) == NUM_ENTRY_QWORDS * sizeof(u64)); static phys_addr_t arm_smmu_msi_cfg[ARM_SMMU_MAX_MSIS][3] = { [EVTQ_MSI_INDEX] = { @@ -1235,6 +1236,59 @@ static struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, return &l1_desc->l2ptr[idx]; } +struct arm_smmu_cd_writer { + struct arm_smmu_entry_writer writer; + unsigned int ssid; +}; + +static void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits) +{ + used_bits[0] = cpu_to_le64(CTXDESC_CD_0_V); + if (!(ent[0] & cpu_to_le64(CTXDESC_CD_0_V))) + return; + memset(used_bits, 0xFF, sizeof(struct arm_smmu_cd)); + + /* + * If EPD0 is set by the make function it means + * T0SZ/TG0/IR0/OR0/SH0/TTB0 are IGNORED + */ + if (ent[0] & cpu_to_le64(CTXDESC_CD_0_TCR_EPD0)) { + used_bits[0] &= ~cpu_to_le64( + CTXDESC_CD_0_TCR_T0SZ | CTXDESC_CD_0_TCR_TG0 | + CTXDESC_CD_0_TCR_IRGN0 | CTXDESC_CD_0_TCR_ORGN0 | + CTXDESC_CD_0_TCR_SH0); + used_bits[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK); + } +} + +static void arm_smmu_cd_writer_sync_entry(struct arm_smmu_entry_writer *writer) +{ + struct arm_smmu_cd_writer *cd_writer = + container_of(writer, struct arm_smmu_cd_writer, writer); + + arm_smmu_sync_cd(writer->master, cd_writer->ssid, true); +} + +static const struct arm_smmu_entry_writer_ops arm_smmu_cd_writer_ops = { + .sync = arm_smmu_cd_writer_sync_entry, + .get_used = arm_smmu_get_cd_used, +}; + +static void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, + struct arm_smmu_cd *cdptr, + const struct arm_smmu_cd *target) +{ + struct arm_smmu_cd_writer cd_writer = { + .writer = { + .ops = &arm_smmu_cd_writer_ops, + .master = master, + }, + .ssid = ssid, + }; + + arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data); +} + int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, struct arm_smmu_ctx_desc *cd) { @@ -1251,23 +1305,31 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, */ u64 val; bool cd_live; - struct arm_smmu_cd *cdptr; + struct arm_smmu_cd target; + struct arm_smmu_cd *cdptr = ⌖ + struct arm_smmu_cd *cd_table_entry; struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) return -E2BIG; - cdptr = arm_smmu_get_cd_ptr(master, ssid); - if (!cdptr) + cd_table_entry = arm_smmu_get_cd_ptr(master, ssid); + if (!cd_table_entry) return -ENOMEM; + target = *cd_table_entry; val = le64_to_cpu(cdptr->data[0]); cd_live = !!(val & CTXDESC_CD_0_V); if (!cd) { /* (5) */ + memset(cdptr, 0, sizeof(*cdptr)); val = 0; } else if (cd == &quiet_cd) { /* (4) */ + val &= ~(CTXDESC_CD_0_TCR_T0SZ | CTXDESC_CD_0_TCR_TG0 | + CTXDESC_CD_0_TCR_IRGN0 | CTXDESC_CD_0_TCR_ORGN0 | + CTXDESC_CD_0_TCR_SH0); val |= CTXDESC_CD_0_TCR_EPD0; + cdptr->data[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK); } else if (cd_live) { /* (3) */ val &= ~CTXDESC_CD_0_ASID; val |= FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid); @@ -1280,13 +1342,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, cdptr->data[2] = 0; cdptr->data[3] = cpu_to_le64(cd->mair); - /* - * STE may be live, and the SMMU might read dwords of this CD in any - * order. Ensure that it observes valid values before reading - * V=1. - */ - arm_smmu_sync_cd(master, ssid, true); - val = cd->tcr | #ifdef __BIG_ENDIAN CTXDESC_CD_0_ENDI | @@ -1300,18 +1355,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, if (cd_table->stall_enabled) val |= CTXDESC_CD_0_S; } - - /* - * The SMMU accesses 64-bit values atomically. See IHI0070Ca 3.21.3 - * "Configuration structures and configuration invalidation completion" - * - * The size of single-copy atomic reads made by the SMMU is - * IMPLEMENTATION DEFINED but must be at least 64 bits. Any single - * field within an aligned 64-bit span of a structure can be altered - * without first making the structure invalid. - */ - WRITE_ONCE(cdptr->data[0], cpu_to_le64(val)); - arm_smmu_sync_cd(master, ssid, true); + cdptr->data[0] = cpu_to_le64(val); + arm_smmu_write_cd_entry(master, ssid, cd_table_entry, &target); return 0; } From 7bd3df5bf6d8212c1354b1f9691d1972d5306e87 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:35 -0300 Subject: [PATCH 519/548] iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function Introduce arm_smmu_make_s1_cd() to build the CD from the paging S1 domain, and reorganize all the places programming S1 domain CD table entries to call it. Split arm_smmu_update_s1_domain_cd_entry() from arm_smmu_update_ctx_desc_devices() so that the S1 path has its own call chain separate from the unrelated SVA path. arm_smmu_update_s1_domain_cd_entry() only works on S1 domains attached to RIDs and refreshes all their CDs. Remove case (3) from arm_smmu_write_ctx_desc() as it is now handled by directly calling arm_smmu_write_cd_entry(). Remove the forced clear of the CD during S1 domain attach, arm_smmu_write_cd_entry() will do this automatically if necessary. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/3-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com [will: Drop unused arm_smmu_clean_cd_entry() function] Signed-off-by: Will Deacon (cherry picked from commit e9d1e4ff74b96cf180d04be38541a245c8c574c1) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 25 ++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 71 +++++++++++-------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 +++ 3 files changed, 76 insertions(+), 29 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 2cd433a9c8a0f..c6232bc870080 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -53,6 +53,29 @@ static void arm_smmu_update_ctx_desc_devices(struct arm_smmu_domain *smmu_domain spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); } +static void +arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain) +{ + struct arm_smmu_master *master; + struct arm_smmu_cd target_cd; + unsigned long flags; + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) { + struct arm_smmu_cd *cdptr; + + /* S1 domains only support RID attachment right now */ + cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID); + if (WARN_ON(!cdptr)) + continue; + + arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); + arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr, + &target_cd); + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); +} + /* * Check if the CPU ASID is available on the SMMU side. If a private context * descriptor is using it, try to replace it. @@ -96,7 +119,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid) * be some overlap between use of both ASIDs, until we invalidate the * TLB. */ - arm_smmu_update_ctx_desc_devices(smmu_domain, IOMMU_NO_PASID, cd); + arm_smmu_update_s1_domain_cd_entry(smmu_domain); /* Invalidate TLB entries previously associated with that context */ arm_smmu_tlb_inv_asid(smmu, asid); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 5ad4830d77e8e..cc13f48de9093 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1208,8 +1208,8 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst, WRITE_ONCE(*dst, cpu_to_le64(val)); } -static struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, - u32 ssid) +struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, + u32 ssid) { __le64 *l1ptr; unsigned int idx; @@ -1274,9 +1274,9 @@ static const struct arm_smmu_entry_writer_ops arm_smmu_cd_writer_ops = { .get_used = arm_smmu_get_cd_used, }; -static void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, - struct arm_smmu_cd *cdptr, - const struct arm_smmu_cd *target) +void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, + struct arm_smmu_cd *cdptr, + const struct arm_smmu_cd *target) { struct arm_smmu_cd_writer cd_writer = { .writer = { @@ -1289,6 +1289,32 @@ static void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data); } +void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, + struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain) +{ + struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; + + memset(target, 0, sizeof(*target)); + + target->data[0] = cpu_to_le64( + cd->tcr | +#ifdef __BIG_ENDIAN + CTXDESC_CD_0_ENDI | +#endif + CTXDESC_CD_0_V | + CTXDESC_CD_0_AA64 | + (master->stall_enabled ? CTXDESC_CD_0_S : 0) | + CTXDESC_CD_0_R | + CTXDESC_CD_0_A | + CTXDESC_CD_0_ASET | + FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) + ); + + target->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); + target->data[3] = cpu_to_le64(cd->mair); +} + int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, struct arm_smmu_ctx_desc *cd) { @@ -1297,14 +1323,11 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, * * (1) Install primary CD, for normal DMA traffic (SSID = IOMMU_NO_PASID = 0). * (2) Install a secondary CD, for SID+SSID traffic. - * (3) Update ASID of a CD. Atomically write the first 64 bits of the - * CD, then invalidate the old entry and mappings. * (4) Quiesce the context without clearing the valid bit. Disable * translation, and ignore any translation fault. * (5) Remove a secondary CD. */ u64 val; - bool cd_live; struct arm_smmu_cd target; struct arm_smmu_cd *cdptr = ⌖ struct arm_smmu_cd *cd_table_entry; @@ -1319,7 +1342,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, target = *cd_table_entry; val = le64_to_cpu(cdptr->data[0]); - cd_live = !!(val & CTXDESC_CD_0_V); if (!cd) { /* (5) */ memset(cdptr, 0, sizeof(*cdptr)); @@ -1330,13 +1352,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, CTXDESC_CD_0_TCR_SH0); val |= CTXDESC_CD_0_TCR_EPD0; cdptr->data[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK); - } else if (cd_live) { /* (3) */ - val &= ~CTXDESC_CD_0_ASID; - val |= FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid); - /* - * Until CD+TLB invalidation, both ASIDs may be used for tagging - * this substream's traffic - */ } else { /* (1) and (2) */ cdptr->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); cdptr->data[2] = 0; @@ -2651,29 +2666,29 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); switch (smmu_domain->stage) { - case ARM_SMMU_DOMAIN_S1: + case ARM_SMMU_DOMAIN_S1: { + struct arm_smmu_cd target_cd; + struct arm_smmu_cd *cdptr; + if (!master->cd_table.cdtab) { ret = arm_smmu_alloc_cd_tables(master); if (ret) goto out_list_del; - } else { - /* - * arm_smmu_write_ctx_desc() relies on the entry being - * invalid to work, clear any existing entry. - */ - ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, - NULL); - if (ret) - goto out_list_del; } - ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd); - if (ret) + cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID); + if (!cdptr) { + ret = -ENOMEM; goto out_list_del; + } + arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); + arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr, + &target_cd); arm_smmu_make_cdtable_ste(&target, master); arm_smmu_install_ste_for_dev(master, &target); break; + } case ARM_SMMU_DOMAIN_S2: arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); arm_smmu_install_ste_for_dev(master, &target); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index eecf12f816f17..9f48b1309bdd6 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -752,6 +752,15 @@ extern struct xarray arm_smmu_asid_xa; extern struct mutex arm_smmu_asid_lock; extern struct arm_smmu_ctx_desc quiet_cd; +struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, + u32 ssid); +void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, + struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain); +void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, + struct arm_smmu_cd *cdptr, + const struct arm_smmu_cd *target); + int arm_smmu_write_ctx_desc(struct arm_smmu_master *smmu_master, int ssid, struct arm_smmu_ctx_desc *cd); void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); From 5f454092b236143ac258523a547271c998d14bc6 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:36 -0300 Subject: [PATCH 520/548] iommu/arm-smmu-v3: Consolidate clearing a CD table entry A cleared entry is all 0's. Make arm_smmu_clear_cd() do this sequence. If we are clearing an entry and for some reason it is not already allocated in the CD table then something has gone wrong. Remove case (5) from arm_smmu_write_ctx_desc(). Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Moritz Fischer Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/4-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit af8f0b83ea2bcc7cd365c32044f31bdadc07c351) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++++++------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 + 3 files changed, 18 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index c6232bc870080..7da22bebac1d6 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -569,7 +569,7 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, mutex_lock(&sva_lock); - arm_smmu_write_ctx_desc(master, id, NULL); + arm_smmu_clear_cd(master, id); list_for_each_entry(t, &master->bonds, list) { if (t->mm == mm) { diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index cc13f48de9093..edd219bdd7c5c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1315,6 +1315,19 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, target->data[3] = cpu_to_le64(cd->mair); } +void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid) +{ + struct arm_smmu_cd target = {}; + struct arm_smmu_cd *cdptr; + + if (!master->cd_table.cdtab) + return; + cdptr = arm_smmu_get_cd_ptr(master, ssid); + if (WARN_ON(!cdptr)) + return; + arm_smmu_write_cd_entry(master, ssid, cdptr, &target); +} + int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, struct arm_smmu_ctx_desc *cd) { @@ -1325,7 +1338,6 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, * (2) Install a secondary CD, for SID+SSID traffic. * (4) Quiesce the context without clearing the valid bit. Disable * translation, and ignore any translation fault. - * (5) Remove a secondary CD. */ u64 val; struct arm_smmu_cd target; @@ -1343,10 +1355,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, target = *cd_table_entry; val = le64_to_cpu(cdptr->data[0]); - if (!cd) { /* (5) */ - memset(cdptr, 0, sizeof(*cdptr)); - val = 0; - } else if (cd == &quiet_cd) { /* (4) */ + if (cd == &quiet_cd) { /* (4) */ val &= ~(CTXDESC_CD_0_TCR_T0SZ | CTXDESC_CD_0_TCR_TG0 | CTXDESC_CD_0_TCR_IRGN0 | CTXDESC_CD_0_TCR_ORGN0 | CTXDESC_CD_0_TCR_SH0); @@ -2692,9 +2701,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) case ARM_SMMU_DOMAIN_S2: arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); arm_smmu_install_ste_for_dev(master, &target); - if (master->cd_table.cdtab) - arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, - NULL); + arm_smmu_clear_cd(master, IOMMU_NO_PASID); break; } @@ -2742,8 +2749,7 @@ static int arm_smmu_attach_dev_ste(struct device *dev, * arm_smmu_domain->devices to avoid races updating the same context * descriptor from arm_smmu_share_asid(). */ - if (master->cd_table.cdtab) - arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL); + arm_smmu_clear_cd(master, IOMMU_NO_PASID); return 0; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 9f48b1309bdd6..6b4fe91b88a72 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -752,6 +752,7 @@ extern struct xarray arm_smmu_asid_xa; extern struct mutex arm_smmu_asid_lock; extern struct arm_smmu_ctx_desc quiet_cd; +void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid); struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid); void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, From 8e998b03099b1960b9db39c7a55f16f989f2619b Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:37 -0300 Subject: [PATCH 521/548] iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr() Only the attach callers can perform an allocation for the CD table entry, the other callers must not do so, they do not have the correct locking and they cannot sleep. Split up the functions so this is clear. arm_smmu_get_cd_ptr() will return pointer to a CD table entry without doing any kind of allocation. arm_smmu_alloc_cd_ptr() will allocate the table and any required leaf. A following patch will add lockdep assertions to arm_smmu_alloc_cd_ptr() once the restructuring is completed and arm_smmu_alloc_cd_ptr() is never called in the wrong context. Tested-by: Nicolin Chen Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/5-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit b2f4c0fcf094dacd2d1fb96a6fd6598919501589) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 59 +++++++++++++-------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +- 2 files changed, 39 insertions(+), 23 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index edd219bdd7c5c..befe1a7924190 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -102,6 +102,7 @@ static struct arm_smmu_option_prop arm_smmu_options[] = { static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, struct arm_smmu_device *smmu); +static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master); static void parse_driver_options(struct arm_smmu_device *smmu) { @@ -1211,29 +1212,51 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst, struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid) { - __le64 *l1ptr; - unsigned int idx; struct arm_smmu_l1_ctx_desc *l1_desc; - struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; + if (!cd_table->cdtab) + return NULL; + if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) return (struct arm_smmu_cd *)(cd_table->cdtab + ssid * CTXDESC_CD_DWORDS); - idx = ssid >> CTXDESC_SPLIT; - l1_desc = &cd_table->l1_desc[idx]; - if (!l1_desc->l2ptr) { - if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc)) + l1_desc = &cd_table->l1_desc[ssid / CTXDESC_L2_ENTRIES]; + if (!l1_desc->l2ptr) + return NULL; + return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES]; +} + +static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, + u32 ssid) +{ + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; + struct arm_smmu_device *smmu = master->smmu; + + if (!cd_table->cdtab) { + if (arm_smmu_alloc_cd_tables(master)) return NULL; + } + + if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_64K_L2) { + unsigned int idx = ssid / CTXDESC_L2_ENTRIES; + struct arm_smmu_l1_ctx_desc *l1_desc; + + l1_desc = &cd_table->l1_desc[idx]; + if (!l1_desc->l2ptr) { + __le64 *l1ptr; - l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS; - arm_smmu_write_cd_l1_desc(l1ptr, l1_desc); - /* An invalid L1CD can be cached */ - arm_smmu_sync_cd(master, ssid, false); + if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc)) + return NULL; + + l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS; + arm_smmu_write_cd_l1_desc(l1ptr, l1_desc); + /* An invalid L1CD can be cached */ + arm_smmu_sync_cd(master, ssid, false); + } } - idx = ssid & (CTXDESC_L2_ENTRIES - 1); - return &l1_desc->l2ptr[idx]; + return arm_smmu_get_cd_ptr(master, ssid); } struct arm_smmu_cd_writer { @@ -1348,7 +1371,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) return -E2BIG; - cd_table_entry = arm_smmu_get_cd_ptr(master, ssid); + cd_table_entry = arm_smmu_alloc_cd_ptr(master, ssid); if (!cd_table_entry) return -ENOMEM; @@ -2679,13 +2702,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) struct arm_smmu_cd target_cd; struct arm_smmu_cd *cdptr; - if (!master->cd_table.cdtab) { - ret = arm_smmu_alloc_cd_tables(master); - if (ret) - goto out_list_del; - } - - cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID); + cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID); if (!cdptr) { ret = -ENOMEM; goto out_list_del; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 6b4fe91b88a72..831618ccfde83 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -275,8 +275,7 @@ struct arm_smmu_ste { * 2lvl: at most 1024 L1 entries, * 1024 lazy entries per table. */ -#define CTXDESC_SPLIT 10 -#define CTXDESC_L2_ENTRIES (1 << CTXDESC_SPLIT) +#define CTXDESC_L2_ENTRIES 1024 #define CTXDESC_L1_DESC_DWORDS 1 #define CTXDESC_L1_DESC_V (1UL << 0) From 8e4f416f0a5be463c1e1b8b01ca7a60e5c543e47 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:38 -0300 Subject: [PATCH 522/548] iommu/arm-smmu-v3: Allocate the CD table entry in advance Avoid arm_smmu_attach_dev() having to undo the changes to the smmu_domain->devices list, acquire the cdptr earlier so we don't need to handle that error. Now there is a clear break in arm_smmu_attach_dev() where all the prep-work has been done non-disruptively and we commit to making the HW change, which cannot fail. This completes transforming arm_smmu_attach_dev() so that it does not disturb the HW if it fails. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Mostafa Saleh Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/6-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 13abe4faac4348da0cf1c4eeb2b1b39fcfdb4b8f) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 +++++++-------------- 1 file changed, 8 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index befe1a7924190..7b06f39ccec36 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2653,6 +2653,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) struct arm_smmu_device *smmu; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_master *master; + struct arm_smmu_cd *cdptr; if (!fwspec) return -ENOENT; @@ -2681,6 +2682,12 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) if (ret) return ret; + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { + cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID); + if (!cdptr) + return -ENOMEM; + } + /* * Prevent arm_smmu_share_asid() from trying to change the ASID * of either the old or new domain while we are working on it. @@ -2700,13 +2707,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: { struct arm_smmu_cd target_cd; - struct arm_smmu_cd *cdptr; - - cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID); - if (!cdptr) { - ret = -ENOMEM; - goto out_list_del; - } arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr, @@ -2723,16 +2723,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) } arm_smmu_enable_ats(master, smmu_domain); - goto out_unlock; - -out_list_del: - spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_del_init(&master->domain_head); - spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); - -out_unlock: mutex_unlock(&arm_smmu_asid_lock); - return ret; + return 0; } static int arm_smmu_attach_dev_ste(struct device *dev, From 91f0a433d34de888a678b9a3aa07801885381df5 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:39 -0300 Subject: [PATCH 523/548] iommu/arm-smmu-v3: Move the CD generation for SVA into a function Pull all the calculations for building the CD table entry for a mmu_struct into arm_smmu_make_sva_cd(). Call it in the two places installing the SVA CD table entry. Open code the last caller of arm_smmu_update_ctx_desc_devices() and remove the function. Remove arm_smmu_write_ctx_desc() since all callers are gone. Add the locking assertions to arm_smmu_alloc_cd_ptr() since arm_smmu_update_ctx_desc_devices() was the last problematic caller. Remove quiet_cd since all users are gone, arm_smmu_make_sva_cd() creates the same value. The behavior of quiet_cd changes slightly, the old implementation edited the CD in place to set CTXDESC_CD_0_TCR_EPD0 assuming it was a SVA CD entry. This version generates a full CD entry with a 0 TTB0 and relies on arm_smmu_write_cd_entry() to install it hitlessly. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/7-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 7b87c93c8b86d9d9b9567d83f0ca3d3046fdfc5a) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 155 +++++++++++------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 74 +-------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +- 3 files changed, 107 insertions(+), 129 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 7da22bebac1d6..3fb3040f57b0e 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -34,25 +34,6 @@ struct arm_smmu_bond { static DEFINE_MUTEX(sva_lock); -/* - * Write the CD to the CD tables for all masters that this domain is attached - * to. Note that this is only used to update existing CD entries in the target - * CD table, for which it's assumed that arm_smmu_write_ctx_desc can't fail. - */ -static void arm_smmu_update_ctx_desc_devices(struct arm_smmu_domain *smmu_domain, - int ssid, - struct arm_smmu_ctx_desc *cd) -{ - struct arm_smmu_master *master; - unsigned long flags; - - spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_for_each_entry(master, &smmu_domain->devices, domain_head) { - arm_smmu_write_ctx_desc(master, ssid, cd); - } - spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); -} - static void arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain) { @@ -128,11 +109,85 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid) return NULL; } +static u64 page_size_to_cd(void) +{ + static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K || + PAGE_SIZE == SZ_64K); + if (PAGE_SIZE == SZ_64K) + return ARM_LPAE_TCR_TG0_64K; + if (PAGE_SIZE == SZ_16K) + return ARM_LPAE_TCR_TG0_16K; + return ARM_LPAE_TCR_TG0_4K; +} + +static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, + struct arm_smmu_master *master, + struct mm_struct *mm, u16 asid) +{ + u64 par; + + memset(target, 0, sizeof(*target)); + + par = cpuid_feature_extract_unsigned_field( + read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1), + ID_AA64MMFR0_EL1_PARANGE_SHIFT); + + target->data[0] = cpu_to_le64( + CTXDESC_CD_0_TCR_EPD1 | +#ifdef __BIG_ENDIAN + CTXDESC_CD_0_ENDI | +#endif + CTXDESC_CD_0_V | + FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par) | + CTXDESC_CD_0_AA64 | + (master->stall_enabled ? CTXDESC_CD_0_S : 0) | + CTXDESC_CD_0_R | + CTXDESC_CD_0_A | + CTXDESC_CD_0_ASET | + FIELD_PREP(CTXDESC_CD_0_ASID, asid)); + + /* + * If no MM is passed then this creates a SVA entry that faults + * everything. arm_smmu_write_cd_entry() can hitlessly go between these + * two entries types since TTB0 is ignored by HW when EPD0 is set. + */ + if (mm) { + target->data[0] |= cpu_to_le64( + FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, + 64ULL - vabits_actual) | + FIELD_PREP(CTXDESC_CD_0_TCR_TG0, page_size_to_cd()) | + FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, + ARM_LPAE_TCR_RGN_WBWA) | + FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, + ARM_LPAE_TCR_RGN_WBWA) | + FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS)); + + target->data[1] = cpu_to_le64(virt_to_phys(mm->pgd) & + CTXDESC_CD_1_TTB0_MASK); + } else { + target->data[0] |= cpu_to_le64(CTXDESC_CD_0_TCR_EPD0); + + /* + * Disable stall and immediately generate an abort if stall + * disable is permitted. This speeds up cleanup for an unclean + * exit if the device is still doing a lot of DMA. + */ + if (!(master->smmu->features & ARM_SMMU_FEAT_STALL_FORCE)) + target->data[0] &= + cpu_to_le64(~(CTXDESC_CD_0_S | CTXDESC_CD_0_R)); + } + + /* + * MAIR value is pretty much constant and global, so we can just get it + * from the current CPU register + */ + target->data[3] = cpu_to_le64(read_sysreg(mair_el1)); +} + static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) { u16 asid; int err = 0; - u64 tcr, par, reg; struct arm_smmu_ctx_desc *cd; struct arm_smmu_ctx_desc *ret = NULL; @@ -166,39 +221,6 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) if (err) goto out_free_asid; - tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, 64ULL - vabits_actual) | - FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, ARM_LPAE_TCR_RGN_WBWA) | - FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, ARM_LPAE_TCR_RGN_WBWA) | - FIELD_PREP(CTXDESC_CD_0_TCR_SH0, ARM_LPAE_TCR_SH_IS) | - CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; - - switch (PAGE_SIZE) { - case SZ_4K: - tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_4K); - break; - case SZ_16K: - tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_16K); - break; - case SZ_64K: - tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_TG0, ARM_LPAE_TCR_TG0_64K); - break; - default: - WARN_ON(1); - err = -EINVAL; - goto out_free_asid; - } - - reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); - par = cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR0_EL1_PARANGE_SHIFT); - tcr |= FIELD_PREP(CTXDESC_CD_0_TCR_IPS, par); - - cd->ttbr = virt_to_phys(mm->pgd); - cd->tcr = tcr; - /* - * MAIR value is pretty much constant and global, so we can just get it - * from the current CPU register - */ - cd->mair = read_sysreg(mair_el1); cd->asid = asid; cd->mm = mm; @@ -276,6 +298,8 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) { struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn); struct arm_smmu_domain *smmu_domain = smmu_mn->domain; + struct arm_smmu_master *master; + unsigned long flags; mutex_lock(&sva_lock); if (smmu_mn->cleared) { @@ -287,8 +311,19 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events, * but disable translation. */ - arm_smmu_update_ctx_desc_devices(smmu_domain, mm_get_enqcmd_pasid(mm), - &quiet_cd); + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) { + struct arm_smmu_cd target; + struct arm_smmu_cd *cdptr; + + cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm)); + if (WARN_ON(!cdptr)) + continue; + arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid); + arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr, + &target); + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid); arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0); @@ -383,6 +418,8 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid, struct mm_struct *mm) { int ret; + struct arm_smmu_cd target; + struct arm_smmu_cd *cdptr; struct arm_smmu_bond *bond; struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct iommu_domain *domain = iommu_get_domain_for_dev(dev); @@ -409,9 +446,13 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid, goto err_free_bond; } - ret = arm_smmu_write_ctx_desc(master, pasid, bond->smmu_mn->cd); - if (ret) + cdptr = arm_smmu_alloc_cd_ptr(master, mm_get_enqcmd_pasid(mm)); + if (!cdptr) { + ret = -ENOMEM; goto err_put_notifier; + } + arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid); + arm_smmu_write_cd_entry(master, pasid, cdptr, &target); list_add(&bond->list, &master->bonds); return 0; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7b06f39ccec36..7d3fe6b8fffb3 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -88,12 +88,6 @@ struct arm_smmu_option_prop { DEFINE_XARRAY_ALLOC1(arm_smmu_asid_xa); DEFINE_MUTEX(arm_smmu_asid_lock); -/* - * Special value used by SVA when a process dies, to quiesce a CD without - * disabling it. - */ -struct arm_smmu_ctx_desc quiet_cd = { 0 }; - static struct arm_smmu_option_prop arm_smmu_options[] = { { ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" }, { ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"}, @@ -1205,7 +1199,7 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst, u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) | CTXDESC_L1_DESC_V; - /* See comment in arm_smmu_write_ctx_desc() */ + /* The HW has 64 bit atomicity with stores to the L2 CD table */ WRITE_ONCE(*dst, cpu_to_le64(val)); } @@ -1228,12 +1222,15 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES]; } -static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, - u32 ssid) +struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, + u32 ssid) { struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; struct arm_smmu_device *smmu = master->smmu; + might_sleep(); + iommu_group_mutex_assert(master->dev); + if (!cd_table->cdtab) { if (arm_smmu_alloc_cd_tables(master)) return NULL; @@ -1351,62 +1348,6 @@ void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid) arm_smmu_write_cd_entry(master, ssid, cdptr, &target); } -int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid, - struct arm_smmu_ctx_desc *cd) -{ - /* - * This function handles the following cases: - * - * (1) Install primary CD, for normal DMA traffic (SSID = IOMMU_NO_PASID = 0). - * (2) Install a secondary CD, for SID+SSID traffic. - * (4) Quiesce the context without clearing the valid bit. Disable - * translation, and ignore any translation fault. - */ - u64 val; - struct arm_smmu_cd target; - struct arm_smmu_cd *cdptr = ⌖ - struct arm_smmu_cd *cd_table_entry; - struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; - - if (WARN_ON(ssid >= (1 << cd_table->s1cdmax))) - return -E2BIG; - - cd_table_entry = arm_smmu_alloc_cd_ptr(master, ssid); - if (!cd_table_entry) - return -ENOMEM; - - target = *cd_table_entry; - val = le64_to_cpu(cdptr->data[0]); - - if (cd == &quiet_cd) { /* (4) */ - val &= ~(CTXDESC_CD_0_TCR_T0SZ | CTXDESC_CD_0_TCR_TG0 | - CTXDESC_CD_0_TCR_IRGN0 | CTXDESC_CD_0_TCR_ORGN0 | - CTXDESC_CD_0_TCR_SH0); - val |= CTXDESC_CD_0_TCR_EPD0; - cdptr->data[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK); - } else { /* (1) and (2) */ - cdptr->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); - cdptr->data[2] = 0; - cdptr->data[3] = cpu_to_le64(cd->mair); - - val = cd->tcr | -#ifdef __BIG_ENDIAN - CTXDESC_CD_0_ENDI | -#endif - CTXDESC_CD_0_R | CTXDESC_CD_0_A | - (cd->mm ? 0 : CTXDESC_CD_0_ASET) | - CTXDESC_CD_0_AA64 | - FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | - CTXDESC_CD_0_V; - - if (cd_table->stall_enabled) - val |= CTXDESC_CD_0_S; - } - cdptr->data[0] = cpu_to_le64(val); - arm_smmu_write_cd_entry(master, ssid, cd_table_entry, &target); - return 0; -} - static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master) { int ret; @@ -1415,7 +1356,6 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master) struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; - cd_table->stall_enabled = master->stall_enabled; cd_table->s1cdmax = master->ssid_bits; max_contexts = 1 << cd_table->s1cdmax; @@ -1513,7 +1453,7 @@ arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc) val |= FIELD_PREP(STRTAB_L1_DESC_SPAN, desc->span); val |= desc->l2ptr_dma & STRTAB_L1_DESC_L2PTR_MASK; - /* See comment in arm_smmu_write_ctx_desc() */ + /* The HW has 64 bit atomicity with stores to the L2 STE table */ WRITE_ONCE(*dst, cpu_to_le64(val)); } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 831618ccfde83..4fd3cd657d17b 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -608,8 +608,6 @@ struct arm_smmu_ctx_desc_cfg { u8 s1fmt; /* log2 of the maximum number of CDs supported by this table */ u8 s1cdmax; - /* Whether CD entries in this table have the stall bit set. */ - u8 stall_enabled:1; }; struct arm_smmu_s2_cfg { @@ -749,11 +747,12 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) extern struct xarray arm_smmu_asid_xa; extern struct mutex arm_smmu_asid_lock; -extern struct arm_smmu_ctx_desc quiet_cd; void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid); struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid); +struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, + u32 ssid); void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain); @@ -761,8 +760,6 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, struct arm_smmu_cd *cdptr, const struct arm_smmu_cd *target); -int arm_smmu_write_ctx_desc(struct arm_smmu_master *smmu_master, int ssid, - struct arm_smmu_ctx_desc *cd); void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, size_t granule, bool leaf, From 49064856d08e8834b893258c8f7b057553ea9207 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:40 -0300 Subject: [PATCH 524/548] iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd() Half the code was living in arm_smmu_domain_finalise_s1(), just move it here and take the values directly from the pgtbl_ops instead of storing copies. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Mostafa Saleh Reviewed-by: Nicolin Chen Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/8-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 04905c17f64890311e6b5a5065d8c220602712e5) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 ++++++++------------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 -- 2 files changed, 18 insertions(+), 32 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 7d3fe6b8fffb3..0d0a8cedaeda4 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1314,15 +1314,25 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, struct arm_smmu_domain *smmu_domain) { struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; + const struct io_pgtable_cfg *pgtbl_cfg = + &io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops)->cfg; + typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = + &pgtbl_cfg->arm_lpae_s1_cfg.tcr; memset(target, 0, sizeof(*target)); target->data[0] = cpu_to_le64( - cd->tcr | + FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) | + FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) | + FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) | + FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) | + FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) | #ifdef __BIG_ENDIAN CTXDESC_CD_0_ENDI | #endif + CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_V | + FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) | CTXDESC_CD_0_AA64 | (master->stall_enabled ? CTXDESC_CD_0_S : 0) | CTXDESC_CD_0_R | @@ -1330,9 +1340,9 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, CTXDESC_CD_0_ASET | FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) ); - - target->data[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); - target->data[3] = cpu_to_le64(cd->mair); + target->data[1] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.ttbr & + CTXDESC_CD_1_TTB0_MASK); + target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.mair); } void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid) @@ -2304,13 +2314,11 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) } static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu, - struct arm_smmu_domain *smmu_domain, - struct io_pgtable_cfg *pgtbl_cfg) + struct arm_smmu_domain *smmu_domain) { int ret; u32 asid; struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; - typeof(&pgtbl_cfg->arm_lpae_s1_cfg.tcr) tcr = &pgtbl_cfg->arm_lpae_s1_cfg.tcr; refcount_set(&cd->refs, 1); @@ -2318,31 +2326,13 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu, mutex_lock(&arm_smmu_asid_lock); ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd, XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); - if (ret) - goto out_unlock; - cd->asid = (u16)asid; - cd->ttbr = pgtbl_cfg->arm_lpae_s1_cfg.ttbr; - cd->tcr = FIELD_PREP(CTXDESC_CD_0_TCR_T0SZ, tcr->tsz) | - FIELD_PREP(CTXDESC_CD_0_TCR_TG0, tcr->tg) | - FIELD_PREP(CTXDESC_CD_0_TCR_IRGN0, tcr->irgn) | - FIELD_PREP(CTXDESC_CD_0_TCR_ORGN0, tcr->orgn) | - FIELD_PREP(CTXDESC_CD_0_TCR_SH0, tcr->sh) | - FIELD_PREP(CTXDESC_CD_0_TCR_IPS, tcr->ips) | - CTXDESC_CD_0_TCR_EPD1 | CTXDESC_CD_0_AA64; - cd->mair = pgtbl_cfg->arm_lpae_s1_cfg.mair; - - mutex_unlock(&arm_smmu_asid_lock); - return 0; - -out_unlock: mutex_unlock(&arm_smmu_asid_lock); return ret; } static int arm_smmu_domain_finalise_s2(struct arm_smmu_device *smmu, - struct arm_smmu_domain *smmu_domain, - struct io_pgtable_cfg *pgtbl_cfg) + struct arm_smmu_domain *smmu_domain) { int vmid; struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; @@ -2366,8 +2356,7 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, struct io_pgtable_cfg pgtbl_cfg; struct io_pgtable_ops *pgtbl_ops; int (*finalise_stage_fn)(struct arm_smmu_device *smmu, - struct arm_smmu_domain *smmu_domain, - struct io_pgtable_cfg *pgtbl_cfg); + struct arm_smmu_domain *smmu_domain); /* Restrict the stage to what we can actually support */ if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) @@ -2411,7 +2400,7 @@ static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain, smmu_domain->domain.geometry.aperture_end = (1UL << pgtbl_cfg.ias) - 1; smmu_domain->domain.geometry.force_aperture = true; - ret = finalise_stage_fn(smmu, smmu_domain, &pgtbl_cfg); + ret = finalise_stage_fn(smmu, smmu_domain); if (ret < 0) { free_io_pgtable_ops(pgtbl_ops); return ret; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 4fd3cd657d17b..72cacd3c24db3 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc { struct arm_smmu_ctx_desc { u16 asid; - u64 ttbr; - u64 tcr; - u64 mair; refcount_t refs; struct mm_struct *mm; From 1b55708e8888bac688e8d2240e58c6e4aec8cb82 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 30 Apr 2024 14:21:41 -0300 Subject: [PATCH 525/548] iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry Add tests for some of the more common STE update operations that we expect to see, as well as some artificial STE updates to test the edges of arm_smmu_write_entry. These also serve as a record of which common operation is expected to be hitless, and how many syncs they require. arm_smmu_write_entry implements a generic algorithm that updates an STE/CD to any other abritrary STE/CD configuration. The update requires a sequence of write+sync operations with some invariants that must be held true after each sync. arm_smmu_write_entry lends itself well to unit-testing since the function's interaction with the STE/CD is already abstracted by input callbacks that we can hook to introspect into the sequence of operations. We can use these hooks to guarantee that invariants are held throughout the entire update operation. Link: https://lore.kernel.org/r/20240106083617.1173871-3-mshavit@google.com Tested-by: Nicolin Chen Signed-off-by: Michael Shavit Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/9-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 56e1a4cc2588a7cb9664457a62fd7a77e005aa01) Signed-off-by: Koba Ko --- drivers/iommu/Kconfig | 13 +- drivers/iommu/arm/arm-smmu-v3/Makefile | 1 + .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 8 +- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 465 ++++++++++++++++++ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 43 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 30 ++ 6 files changed, 533 insertions(+), 27 deletions(-) create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 055059ceb34a2..be4bcc4f4b84b 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -396,9 +396,9 @@ config ARM_SMMU_V3 Say Y here if your system includes an IOMMU device implementing the ARM SMMUv3 architecture. +if ARM_SMMU_V3 config ARM_SMMU_V3_SVA bool "Shared Virtual Addressing support for the ARM SMMUv3" - depends on ARM_SMMU_V3 select IOMMU_SVA select IOMMU_IOPF select MMU_NOTIFIER @@ -409,6 +409,17 @@ config ARM_SMMU_V3_SVA Say Y here if your system supports SVA extensions such as PCIe PASID and PRI. +config ARM_SMMU_V3_KUNIT_TEST + bool "KUnit tests for arm-smmu-v3 driver" if !KUNIT_ALL_TESTS + depends on KUNIT + depends on ARM_SMMU_V3_SVA + default KUNIT_ALL_TESTS + help + Enable this option to unit-test arm-smmu-v3 driver functions. + + If unsure, say N. +endif + config S390_IOMMU def_bool y if S390 && PCI depends on S390 && PCI diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile index 54feb1ecccad8..0b97054b3929b 100644 --- a/drivers/iommu/arm/arm-smmu-v3/Makefile +++ b/drivers/iommu/arm/arm-smmu-v3/Makefile @@ -2,4 +2,5 @@ obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o arm_smmu_v3-objs-y += arm-smmu-v3.o arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o +arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_KUNIT_TEST) += arm-smmu-v3-test.o arm_smmu_v3-objs := $(arm_smmu_v3-objs-y) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 3fb3040f57b0e..480014ad4a23b 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -8,6 +8,7 @@ #include #include #include +#include #include "arm-smmu-v3.h" #include "../../io-pgtable-arm.h" @@ -120,9 +121,10 @@ static u64 page_size_to_cd(void) return ARM_LPAE_TCR_TG0_4K; } -static void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, - struct arm_smmu_master *master, - struct mm_struct *mm, u16 asid) +VISIBLE_IF_KUNIT +void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, + struct arm_smmu_master *master, struct mm_struct *mm, + u16 asid) { u64 par; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c new file mode 100644 index 0000000000000..417804392ff08 --- /dev/null +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c @@ -0,0 +1,465 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright 2024 Google LLC. + */ +#include +#include + +#include "arm-smmu-v3.h" + +struct arm_smmu_test_writer { + struct arm_smmu_entry_writer writer; + struct kunit *test; + const __le64 *init_entry; + const __le64 *target_entry; + __le64 *entry; + + bool invalid_entry_written; + unsigned int num_syncs; +}; + +#define NUM_ENTRY_QWORDS 8 +#define NUM_EXPECTED_SYNCS(x) x + +static struct arm_smmu_ste bypass_ste; +static struct arm_smmu_ste abort_ste; +static struct arm_smmu_device smmu = { + .features = ARM_SMMU_FEAT_STALLS | ARM_SMMU_FEAT_ATTR_TYPES_OVR +}; +static struct mm_struct sva_mm = { + .pgd = (void *)0xdaedbeefdeadbeefULL, +}; + +static bool arm_smmu_entry_differs_in_used_bits(const __le64 *entry, + const __le64 *used_bits, + const __le64 *target, + unsigned int length) +{ + bool differs = false; + unsigned int i; + + for (i = 0; i < length; i++) { + if ((entry[i] & used_bits[i]) != target[i]) + differs = true; + } + return differs; +} + +static void +arm_smmu_test_writer_record_syncs(struct arm_smmu_entry_writer *writer) +{ + struct arm_smmu_test_writer *test_writer = + container_of(writer, struct arm_smmu_test_writer, writer); + __le64 *entry_used_bits; + + entry_used_bits = kunit_kzalloc( + test_writer->test, sizeof(*entry_used_bits) * NUM_ENTRY_QWORDS, + GFP_KERNEL); + KUNIT_ASSERT_NOT_NULL(test_writer->test, entry_used_bits); + + pr_debug("STE value is now set to: "); + print_hex_dump_debug(" ", DUMP_PREFIX_NONE, 16, 8, + test_writer->entry, + NUM_ENTRY_QWORDS * sizeof(*test_writer->entry), + false); + + test_writer->num_syncs += 1; + if (!test_writer->entry[0]) { + test_writer->invalid_entry_written = true; + } else { + /* + * At any stage in a hitless transition, the entry must be + * equivalent to either the initial entry or the target entry + * when only considering the bits used by the current + * configuration. + */ + writer->ops->get_used(test_writer->entry, entry_used_bits); + KUNIT_EXPECT_FALSE( + test_writer->test, + arm_smmu_entry_differs_in_used_bits( + test_writer->entry, entry_used_bits, + test_writer->init_entry, NUM_ENTRY_QWORDS) && + arm_smmu_entry_differs_in_used_bits( + test_writer->entry, entry_used_bits, + test_writer->target_entry, + NUM_ENTRY_QWORDS)); + } +} + +static void +arm_smmu_v3_test_debug_print_used_bits(struct arm_smmu_entry_writer *writer, + const __le64 *ste) +{ + __le64 used_bits[NUM_ENTRY_QWORDS] = {}; + + arm_smmu_get_ste_used(ste, used_bits); + pr_debug("STE used bits: "); + print_hex_dump_debug(" ", DUMP_PREFIX_NONE, 16, 8, used_bits, + sizeof(used_bits), false); +} + +static const struct arm_smmu_entry_writer_ops test_ste_ops = { + .sync = arm_smmu_test_writer_record_syncs, + .get_used = arm_smmu_get_ste_used, +}; + +static const struct arm_smmu_entry_writer_ops test_cd_ops = { + .sync = arm_smmu_test_writer_record_syncs, + .get_used = arm_smmu_get_cd_used, +}; + +static void arm_smmu_v3_test_ste_expect_transition( + struct kunit *test, const struct arm_smmu_ste *cur, + const struct arm_smmu_ste *target, unsigned int num_syncs_expected, + bool hitless) +{ + struct arm_smmu_ste cur_copy = *cur; + struct arm_smmu_test_writer test_writer = { + .writer = { + .ops = &test_ste_ops, + }, + .test = test, + .init_entry = cur->data, + .target_entry = target->data, + .entry = cur_copy.data, + .num_syncs = 0, + .invalid_entry_written = false, + + }; + + pr_debug("STE initial value: "); + print_hex_dump_debug(" ", DUMP_PREFIX_NONE, 16, 8, cur_copy.data, + sizeof(cur_copy), false); + arm_smmu_v3_test_debug_print_used_bits(&test_writer.writer, cur->data); + pr_debug("STE target value: "); + print_hex_dump_debug(" ", DUMP_PREFIX_NONE, 16, 8, target->data, + sizeof(cur_copy), false); + arm_smmu_v3_test_debug_print_used_bits(&test_writer.writer, + target->data); + + arm_smmu_write_entry(&test_writer.writer, cur_copy.data, target->data); + + KUNIT_EXPECT_EQ(test, test_writer.invalid_entry_written, !hitless); + KUNIT_EXPECT_EQ(test, test_writer.num_syncs, num_syncs_expected); + KUNIT_EXPECT_MEMEQ(test, target->data, cur_copy.data, sizeof(cur_copy)); +} + +static void arm_smmu_v3_test_ste_expect_hitless_transition( + struct kunit *test, const struct arm_smmu_ste *cur, + const struct arm_smmu_ste *target, unsigned int num_syncs_expected) +{ + arm_smmu_v3_test_ste_expect_transition(test, cur, target, + num_syncs_expected, true); +} + +static const dma_addr_t fake_cdtab_dma_addr = 0xF0F0F0F0F0F0; + +static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste, + const dma_addr_t dma_addr) +{ + struct arm_smmu_master master = { + .cd_table.cdtab_dma = dma_addr, + .cd_table.s1cdmax = 0xFF, + .cd_table.s1fmt = STRTAB_STE_0_S1FMT_64K_L2, + .smmu = &smmu, + }; + + arm_smmu_make_cdtable_ste(ste, &master); +} + +static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test) +{ + /* + * Bypass STEs has used bits in the first two Qwords, while abort STEs + * only have used bits in the first QWord. Transitioning from bypass to + * abort requires two syncs: the first to set the first qword and make + * the STE into an abort, the second to clean up the second qword. + */ + arm_smmu_v3_test_ste_expect_hitless_transition( + test, &bypass_ste, &abort_ste, NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_abort_to_bypass(struct kunit *test) +{ + /* + * Transitioning from abort to bypass also requires two syncs: the first + * to set the second qword data required by the bypass STE, and the + * second to set the first qword and switch to bypass. + */ + arm_smmu_v3_test_ste_expect_hitless_transition( + test, &abort_ste, &bypass_ste, NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_cdtable_to_abort(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &abort_ste, + NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_abort_to_cdtable(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &abort_ste, &ste, + NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_cdtable_to_bypass(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &bypass_ste, + NUM_EXPECTED_SYNCS(3)); +} + +static void arm_smmu_v3_write_ste_test_bypass_to_cdtable(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &bypass_ste, &ste, + NUM_EXPECTED_SYNCS(3)); +} + +static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste, + bool ats_enabled) +{ + struct arm_smmu_master master = { + .smmu = &smmu, + .ats_enabled = ats_enabled, + }; + struct io_pgtable io_pgtable = {}; + struct arm_smmu_domain smmu_domain = { + .pgtbl_ops = &io_pgtable.ops, + }; + + io_pgtable.cfg.arm_lpae_s2_cfg.vttbr = 0xdaedbeefdeadbeefULL; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.ps = 1; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tg = 2; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sh = 3; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.orgn = 1; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.irgn = 2; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3; + io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4; + + arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain); +} + +static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_s2_ste(&ste, true); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &abort_ste, + NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_abort_to_s2(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_s2_ste(&ste, true); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &abort_ste, &ste, + NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_s2_to_bypass(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_s2_ste(&ste, true); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &bypass_ste, + NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_ste_test_bypass_to_s2(struct kunit *test) +{ + struct arm_smmu_ste ste; + + arm_smmu_test_make_s2_ste(&ste, true); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &bypass_ste, &ste, + NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_test_cd_expect_transition( + struct kunit *test, const struct arm_smmu_cd *cur, + const struct arm_smmu_cd *target, unsigned int num_syncs_expected, + bool hitless) +{ + struct arm_smmu_cd cur_copy = *cur; + struct arm_smmu_test_writer test_writer = { + .writer = { + .ops = &test_cd_ops, + }, + .test = test, + .init_entry = cur->data, + .target_entry = target->data, + .entry = cur_copy.data, + .num_syncs = 0, + .invalid_entry_written = false, + + }; + + pr_debug("CD initial value: "); + print_hex_dump_debug(" ", DUMP_PREFIX_NONE, 16, 8, cur_copy.data, + sizeof(cur_copy), false); + arm_smmu_v3_test_debug_print_used_bits(&test_writer.writer, cur->data); + pr_debug("CD target value: "); + print_hex_dump_debug(" ", DUMP_PREFIX_NONE, 16, 8, target->data, + sizeof(cur_copy), false); + arm_smmu_v3_test_debug_print_used_bits(&test_writer.writer, + target->data); + + arm_smmu_write_entry(&test_writer.writer, cur_copy.data, target->data); + + KUNIT_EXPECT_EQ(test, test_writer.invalid_entry_written, !hitless); + KUNIT_EXPECT_EQ(test, test_writer.num_syncs, num_syncs_expected); + KUNIT_EXPECT_MEMEQ(test, target->data, cur_copy.data, sizeof(cur_copy)); +} + +static void arm_smmu_v3_test_cd_expect_non_hitless_transition( + struct kunit *test, const struct arm_smmu_cd *cur, + const struct arm_smmu_cd *target, unsigned int num_syncs_expected) +{ + arm_smmu_v3_test_cd_expect_transition(test, cur, target, + num_syncs_expected, false); +} + +static void arm_smmu_v3_test_cd_expect_hitless_transition( + struct kunit *test, const struct arm_smmu_cd *cur, + const struct arm_smmu_cd *target, unsigned int num_syncs_expected) +{ + arm_smmu_v3_test_cd_expect_transition(test, cur, target, + num_syncs_expected, true); +} + +static void arm_smmu_test_make_s1_cd(struct arm_smmu_cd *cd, unsigned int asid) +{ + struct arm_smmu_master master = { + .smmu = &smmu, + }; + struct io_pgtable io_pgtable = {}; + struct arm_smmu_domain smmu_domain = { + .pgtbl_ops = &io_pgtable.ops, + .cd = { + .asid = asid, + }, + }; + + io_pgtable.cfg.arm_lpae_s1_cfg.ttbr = 0xdaedbeefdeadbeefULL; + io_pgtable.cfg.arm_lpae_s1_cfg.tcr.ips = 1; + io_pgtable.cfg.arm_lpae_s1_cfg.tcr.tg = 2; + io_pgtable.cfg.arm_lpae_s1_cfg.tcr.sh = 3; + io_pgtable.cfg.arm_lpae_s1_cfg.tcr.orgn = 1; + io_pgtable.cfg.arm_lpae_s1_cfg.tcr.irgn = 2; + io_pgtable.cfg.arm_lpae_s1_cfg.tcr.tsz = 4; + io_pgtable.cfg.arm_lpae_s1_cfg.mair = 0xabcdef012345678ULL; + + arm_smmu_make_s1_cd(cd, &master, &smmu_domain); +} + +static void arm_smmu_v3_write_cd_test_s1_clear(struct kunit *test) +{ + struct arm_smmu_cd cd = {}; + struct arm_smmu_cd cd_2; + + arm_smmu_test_make_s1_cd(&cd_2, 1997); + arm_smmu_v3_test_cd_expect_non_hitless_transition( + test, &cd, &cd_2, NUM_EXPECTED_SYNCS(2)); + arm_smmu_v3_test_cd_expect_non_hitless_transition( + test, &cd_2, &cd, NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_cd_test_s1_change_asid(struct kunit *test) +{ + struct arm_smmu_cd cd = {}; + struct arm_smmu_cd cd_2; + + arm_smmu_test_make_s1_cd(&cd, 778); + arm_smmu_test_make_s1_cd(&cd_2, 1997); + arm_smmu_v3_test_cd_expect_hitless_transition(test, &cd, &cd_2, + NUM_EXPECTED_SYNCS(1)); + arm_smmu_v3_test_cd_expect_hitless_transition(test, &cd_2, &cd, + NUM_EXPECTED_SYNCS(1)); +} + +static void arm_smmu_test_make_sva_cd(struct arm_smmu_cd *cd, unsigned int asid) +{ + struct arm_smmu_master master = { + .smmu = &smmu, + }; + + arm_smmu_make_sva_cd(cd, &master, &sva_mm, asid); +} + +static void arm_smmu_test_make_sva_release_cd(struct arm_smmu_cd *cd, + unsigned int asid) +{ + struct arm_smmu_master master = { + .smmu = &smmu, + }; + + arm_smmu_make_sva_cd(cd, &master, NULL, asid); +} + +static void arm_smmu_v3_write_cd_test_sva_clear(struct kunit *test) +{ + struct arm_smmu_cd cd = {}; + struct arm_smmu_cd cd_2; + + arm_smmu_test_make_sva_cd(&cd_2, 1997); + arm_smmu_v3_test_cd_expect_non_hitless_transition( + test, &cd, &cd_2, NUM_EXPECTED_SYNCS(2)); + arm_smmu_v3_test_cd_expect_non_hitless_transition( + test, &cd_2, &cd, NUM_EXPECTED_SYNCS(2)); +} + +static void arm_smmu_v3_write_cd_test_sva_release(struct kunit *test) +{ + struct arm_smmu_cd cd; + struct arm_smmu_cd cd_2; + + arm_smmu_test_make_sva_cd(&cd, 1997); + arm_smmu_test_make_sva_release_cd(&cd_2, 1997); + arm_smmu_v3_test_cd_expect_hitless_transition(test, &cd, &cd_2, + NUM_EXPECTED_SYNCS(2)); + arm_smmu_v3_test_cd_expect_hitless_transition(test, &cd_2, &cd, + NUM_EXPECTED_SYNCS(2)); +} + +static struct kunit_case arm_smmu_v3_test_cases[] = { + KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_abort), + KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_bypass), + KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_abort), + KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_cdtable), + KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_bypass), + KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_cdtable), + KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_abort), + KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_s2), + KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_bypass), + KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_s2), + KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_clear), + KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_change_asid), + KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_clear), + KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_release), + {}, +}; + +static int arm_smmu_v3_test_suite_init(struct kunit_suite *test) +{ + arm_smmu_make_bypass_ste(&smmu, &bypass_ste); + arm_smmu_make_abort_ste(&abort_ste); + return 0; +} + +static struct kunit_suite arm_smmu_v3_test_module = { + .name = "arm-smmu-v3-kunit-test", + .suite_init = arm_smmu_v3_test_suite_init, + .test_cases = arm_smmu_v3_test_cases, +}; +kunit_test_suites(&arm_smmu_v3_test_module); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 0d0a8cedaeda4..ec1c9e106b5ba 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -26,6 +26,7 @@ #include #include #include +#include #include "arm-smmu-v3.h" #include "../../dma-iommu.h" @@ -47,17 +48,6 @@ enum arm_smmu_msi_index { ARM_SMMU_MAX_MSIS, }; -struct arm_smmu_entry_writer_ops; -struct arm_smmu_entry_writer { - const struct arm_smmu_entry_writer_ops *ops; - struct arm_smmu_master *master; -}; - -struct arm_smmu_entry_writer_ops { - void (*get_used)(const __le64 *entry, __le64 *used); - void (*sync)(struct arm_smmu_entry_writer *writer); -}; - #define NUM_ENTRY_QWORDS 8 static_assert(sizeof(struct arm_smmu_ste) == NUM_ENTRY_QWORDS * sizeof(u64)); static_assert(sizeof(struct arm_smmu_cd) == NUM_ENTRY_QWORDS * sizeof(u64)); @@ -984,7 +974,8 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid) * would be nice if this was complete according to the spec, but minimally it * has to capture the bits this driver uses. */ -static void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits) +VISIBLE_IF_KUNIT +void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits) { unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent[0])); @@ -1106,8 +1097,9 @@ static bool entry_set(struct arm_smmu_entry_writer *writer, __le64 *entry, * V=0 process. This relies on the IGNORED behavior described in the * specification. */ -static void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer, - __le64 *entry, const __le64 *target) +VISIBLE_IF_KUNIT +void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer, __le64 *entry, + const __le64 *target) { __le64 unused_update[NUM_ENTRY_QWORDS]; u8 used_qword_diff; @@ -1261,7 +1253,8 @@ struct arm_smmu_cd_writer { unsigned int ssid; }; -static void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits) +VISIBLE_IF_KUNIT +void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits) { used_bits[0] = cpu_to_le64(CTXDESC_CD_0_V); if (!(ent[0] & cpu_to_le64(CTXDESC_CD_0_V))) @@ -1519,7 +1512,8 @@ static void arm_smmu_write_ste(struct arm_smmu_master *master, u32 sid, } } -static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) +VISIBLE_IF_KUNIT +void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) { memset(target, 0, sizeof(*target)); target->data[0] = cpu_to_le64( @@ -1527,8 +1521,9 @@ static void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT)); } -static void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, - struct arm_smmu_ste *target) +VISIBLE_IF_KUNIT +void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, + struct arm_smmu_ste *target) { memset(target, 0, sizeof(*target)); target->data[0] = cpu_to_le64( @@ -1540,8 +1535,9 @@ static void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, STRTAB_STE_1_SHCFG_INCOMING)); } -static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, - struct arm_smmu_master *master) +VISIBLE_IF_KUNIT +void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, + struct arm_smmu_master *master) { struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; struct arm_smmu_device *smmu = master->smmu; @@ -1590,9 +1586,10 @@ static void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, } } -static void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, - struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain) +VISIBLE_IF_KUNIT +void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, + struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain) { struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg; const struct io_pgtable_cfg *pgtbl_cfg = diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 72cacd3c24db3..f6b0a573f4698 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -737,6 +737,36 @@ struct arm_smmu_domain { struct list_head mmu_notifiers; }; +/* The following are exposed for testing purposes. */ +struct arm_smmu_entry_writer_ops; +struct arm_smmu_entry_writer { + const struct arm_smmu_entry_writer_ops *ops; + struct arm_smmu_master *master; +}; + +struct arm_smmu_entry_writer_ops { + void (*get_used)(const __le64 *entry, __le64 *used); + void (*sync)(struct arm_smmu_entry_writer *writer); +}; + +#if IS_ENABLED(CONFIG_KUNIT) +void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits); +void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer, __le64 *cur, + const __le64 *target); +void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits); +void arm_smmu_make_abort_ste(struct arm_smmu_ste *target); +void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, + struct arm_smmu_ste *target); +void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, + struct arm_smmu_master *master); +void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, + struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain); +void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, + struct arm_smmu_master *master, struct mm_struct *mm, + u16 asid); +#endif + static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) { return container_of(dom, struct arm_smmu_domain, domain); From 2a717dd9916513a6c84254897c6c14716fdb6f17 Mon Sep 17 00:00:00 2001 From: Mostafa Saleh Date: Tue, 4 Jun 2024 18:52:18 +0000 Subject: [PATCH 526/548] iommu/arm-smmu-v3: Avoid uninitialized asid in case of error Static checker is complaining about the ASID possibly set uninitialized. This only happens in case of error and this value would be ignored anyway. A simple fix would be just to initialize the local variable to zero, this path will only be reached on the first attach to a domain where the CD is already initialized to zero. This avoids having to bloat the function with an error path. Closes: https://lore.kernel.org/linux-iommu/849e3d77-0a3c-43c4-878d-a0e061c8cd61@moroto.mountain/T/#u Reported-by: Dan Carpenter Signed-off-by: Mostafa Saleh Fixes: 04905c17f648 ("iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()") Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20240604185218.2602058-1-smostafa@google.com Signed-off-by: Will Deacon (cherry picked from commit d3867e7148318e12b5d69b64950622f5ed06fe86) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index ec1c9e106b5ba..55f5c177191b8 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2314,7 +2314,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu, struct arm_smmu_domain *smmu_domain) { int ret; - u32 asid; + u32 asid = 0; struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; refcount_set(&cd->refs, 1); From fe1120f780ee4cd38168309c33a1824062fb09e0 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:32 -0300 Subject: [PATCH 527/548] iommu/arm-smmu-v3: Convert to domain_alloc_sva() This allows the driver the receive the mm and always a device during allocation. Later patches need this to properly setup the notifier when the domain is first allocated. Remove ops->domain_alloc() as SVA was the only remaining purpose. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/1-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 678d79b98028ce2365b30e35479bea0e555c23d3) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 6 ++++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 +--------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 8 +++----- 3 files changed, 8 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 480014ad4a23b..535c37133f443 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -652,13 +652,15 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = { .free = arm_smmu_sva_domain_free }; -struct iommu_domain *arm_smmu_sva_domain_alloc(void) +struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm) { struct iommu_domain *domain; domain = kzalloc(sizeof(*domain), GFP_KERNEL); if (!domain) - return NULL; + return ERR_PTR(-ENOMEM); + domain->type = IOMMU_DOMAIN_SVA; domain->ops = &arm_smmu_sva_domain_ops; return domain; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 55f5c177191b8..180800a7ba090 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2249,14 +2249,6 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap) } } -static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) -{ - - if (type == IOMMU_DOMAIN_SVA) - return arm_smmu_sva_domain_alloc(); - return ERR_PTR(-EOPNOTSUPP); -} - static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) { struct arm_smmu_domain *smmu_domain; @@ -3104,8 +3096,8 @@ static struct iommu_ops arm_smmu_ops = { .identity_domain = &arm_smmu_identity_domain, .blocked_domain = &arm_smmu_blocked_domain, .capable = arm_smmu_capable, - .domain_alloc = arm_smmu_domain_alloc, .domain_alloc_paging = arm_smmu_domain_alloc_paging, + .domain_alloc_sva = arm_smmu_sva_domain_alloc, .probe_device = arm_smmu_probe_device, .release_device = arm_smmu_release_device, .device_group = arm_smmu_device_group, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index f6b0a573f4698..d1e2d2a16d939 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -803,7 +803,8 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master); int arm_smmu_master_disable_sva(struct arm_smmu_master *master); bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master); void arm_smmu_sva_notifier_synchronize(void); -struct iommu_domain *arm_smmu_sva_domain_alloc(void); +struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm); void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t id); #else /* CONFIG_ARM_SMMU_V3_SVA */ @@ -839,10 +840,7 @@ static inline bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master static inline void arm_smmu_sva_notifier_synchronize(void) {} -static inline struct iommu_domain *arm_smmu_sva_domain_alloc(void) -{ - return NULL; -} +#define arm_smmu_sva_domain_alloc NULL static inline void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, struct device *dev, From c2e53af0d9d1ad13d90fb225e8d5d2e4f0506cc6 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:33 -0300 Subject: [PATCH 528/548] iommu/arm-smmu-v3: Start building a generic PASID layer Add arm_smmu_set_pasid()/arm_smmu_remove_pasid() which are to be used by callers that already constructed the arm_smmu_cd they wish to program. These functions will encapsulate the shared logic to setup a CD entry that will be shared by SVA and S1 domain cases. Prior fixes had already moved most of this logic up into __arm_smmu_sva_bind(), move it to it's final home. Following patches will relieve some of the remaining SVA restrictions: - The RID domain is a S1 domain and has already setup the STE to point to the CD table - The programmed PASID is the mm_get_enqcmd_pasid() - Nothing changes while SVA is running (sva_enable) SVA invalidation will still iterate over the S1 domain's master list, later patches will resolve that. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/2-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 85f2fb6ef4137c631c9d2663716d998d7e4f164f) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 57 ++++++++++--------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 ++- 3 files changed, 67 insertions(+), 31 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 535c37133f443..c216e30178c80 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -416,29 +416,27 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) arm_smmu_free_shared_cd(cd); } -static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid, - struct mm_struct *mm) +static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev, + struct mm_struct *mm) { int ret; - struct arm_smmu_cd target; - struct arm_smmu_cd *cdptr; struct arm_smmu_bond *bond; struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct arm_smmu_domain *smmu_domain; if (!(domain->type & __IOMMU_DOMAIN_PAGING)) - return -ENODEV; + return ERR_PTR(-ENODEV); smmu_domain = to_smmu_domain(domain); if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) - return -ENODEV; + return ERR_PTR(-ENODEV); if (!master || !master->sva_enabled) - return -ENODEV; + return ERR_PTR(-ENODEV); bond = kzalloc(sizeof(*bond), GFP_KERNEL); if (!bond) - return -ENOMEM; + return ERR_PTR(-ENOMEM); bond->mm = mm; @@ -448,22 +446,12 @@ static int __arm_smmu_sva_bind(struct device *dev, ioasid_t pasid, goto err_free_bond; } - cdptr = arm_smmu_alloc_cd_ptr(master, mm_get_enqcmd_pasid(mm)); - if (!cdptr) { - ret = -ENOMEM; - goto err_put_notifier; - } - arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid); - arm_smmu_write_cd_entry(master, pasid, cdptr, &target); - list_add(&bond->list, &master->bonds); - return 0; + return bond; -err_put_notifier: - arm_smmu_mmu_notifier_put(bond->smmu_mn); err_free_bond: kfree(bond); - return ret; + return ERR_PTR(ret); } bool arm_smmu_sva_supported(struct arm_smmu_device *smmu) @@ -610,10 +598,9 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, struct arm_smmu_bond *bond = NULL, *t; struct arm_smmu_master *master = dev_iommu_priv_get(dev); - mutex_lock(&sva_lock); - - arm_smmu_clear_cd(master, id); + arm_smmu_remove_pasid(master, to_smmu_domain(domain), id); + mutex_lock(&sva_lock); list_for_each_entry(t, &master->bonds, list) { if (t->mm == mm) { bond = t; @@ -632,14 +619,30 @@ void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t id) { - int ret = 0; + struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct mm_struct *mm = domain->mm; + struct arm_smmu_bond *bond; + struct arm_smmu_cd target; + int ret; mutex_lock(&sva_lock); - ret = __arm_smmu_sva_bind(dev, id, mm); - mutex_unlock(&sva_lock); + bond = __arm_smmu_sva_bind(dev, mm); + if (IS_ERR(bond)) { + mutex_unlock(&sva_lock); + return PTR_ERR(bond); + } - return ret; + arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid); + ret = arm_smmu_set_pasid(master, NULL, id, &target); + if (ret) { + list_del(&bond->list); + arm_smmu_mmu_notifier_put(bond->smmu_mn); + kfree(bond); + mutex_unlock(&sva_lock); + return ret; + } + mutex_unlock(&sva_lock); + return 0; } static void arm_smmu_sva_domain_free(struct iommu_domain *domain) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 180800a7ba090..9f5bedc432656 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1214,8 +1214,8 @@ struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES]; } -struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, - u32 ssid) +static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, + u32 ssid) { struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; struct arm_smmu_device *smmu = master->smmu; @@ -2425,6 +2425,10 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master, int i, j; struct arm_smmu_device *smmu = master->smmu; + master->cd_table.in_ste = + FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) == + STRTAB_STE_0_CFG_S1_TRANS; + for (i = 0; i < master->num_streams; ++i) { u32 sid = master->streams[i].id; struct arm_smmu_ste *step = @@ -2645,6 +2649,30 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return 0; } +int arm_smmu_set_pasid(struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain, ioasid_t pasid, + const struct arm_smmu_cd *cd) +{ + struct arm_smmu_cd *cdptr; + + /* The core code validates pasid */ + + if (!master->cd_table.in_ste) + return -ENODEV; + + cdptr = arm_smmu_alloc_cd_ptr(master, pasid); + if (!cdptr) + return -ENOMEM; + arm_smmu_write_cd_entry(master, pasid, cdptr, cd); + return 0; +} + +void arm_smmu_remove_pasid(struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain, ioasid_t pasid) +{ + arm_smmu_clear_cd(master, pasid); +} + static int arm_smmu_attach_dev_ste(struct device *dev, struct arm_smmu_ste *ste) { diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index d1e2d2a16d939..1fd5b73872753 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -602,6 +602,7 @@ struct arm_smmu_ctx_desc_cfg { dma_addr_t cdtab_dma; struct arm_smmu_l1_ctx_desc *l1_desc; unsigned int num_l1_ents; + u8 in_ste; u8 s1fmt; /* log2 of the maximum number of CDs supported by this table */ u8 s1cdmax; @@ -778,8 +779,6 @@ extern struct mutex arm_smmu_asid_lock; void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid); struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid); -struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master, - u32 ssid); void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain); @@ -787,6 +786,12 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, struct arm_smmu_cd *cdptr, const struct arm_smmu_cd *target); +int arm_smmu_set_pasid(struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain, ioasid_t pasid, + const struct arm_smmu_cd *cd); +void arm_smmu_remove_pasid(struct arm_smmu_master *master, + struct arm_smmu_domain *smmu_domain, ioasid_t pasid); + void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, size_t granule, bool leaf, From ded407e36ff719886aa7ac1b04c153ba61ad0183 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:34 -0300 Subject: [PATCH 529/548] iommu/arm-smmu-v3: Make smmu_domain->devices into an allocated list The next patch will need to store the same master twice (with different SSIDs), so allocate memory for each list element. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/3-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit ad10dce61303d82f7bdd2dbb116e02146778f728) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 11 ++++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 39 ++++++++++++++++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 +++- 3 files changed, 47 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index c216e30178c80..c09790c702c9d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -38,12 +38,13 @@ static DEFINE_MUTEX(sva_lock); static void arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain) { - struct arm_smmu_master *master; + struct arm_smmu_master_domain *master_domain; struct arm_smmu_cd target_cd; unsigned long flags; spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_for_each_entry(master, &smmu_domain->devices, domain_head) { + list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) { + struct arm_smmu_master *master = master_domain->master; struct arm_smmu_cd *cdptr; /* S1 domains only support RID attachment right now */ @@ -300,7 +301,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) { struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn); struct arm_smmu_domain *smmu_domain = smmu_mn->domain; - struct arm_smmu_master *master; + struct arm_smmu_master_domain *master_domain; unsigned long flags; mutex_lock(&sva_lock); @@ -314,7 +315,9 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) * but disable translation. */ spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_for_each_entry(master, &smmu_domain->devices, domain_head) { + list_for_each_entry(master_domain, &smmu_domain->devices, + devices_elm) { + struct arm_smmu_master *master = master_domain->master; struct arm_smmu_cd target; struct arm_smmu_cd *cdptr; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9f5bedc432656..b7b027deaf06b 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2027,10 +2027,10 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master) int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, unsigned long iova, size_t size) { + struct arm_smmu_master_domain *master_domain; int i; unsigned long flags; struct arm_smmu_cmdq_ent cmd; - struct arm_smmu_master *master; struct arm_smmu_cmdq_batch cmds; if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS)) @@ -2058,7 +2058,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, cmds.num = 0; spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_for_each_entry(master, &smmu_domain->devices, domain_head) { + list_for_each_entry(master_domain, &smmu_domain->devices, + devices_elm) { + struct arm_smmu_master *master = master_domain->master; + if (!master->ats_enabled) continue; @@ -2547,9 +2550,26 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master) pci_disable_pasid(pdev); } +static struct arm_smmu_master_domain * +arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain, + struct arm_smmu_master *master) +{ + struct arm_smmu_master_domain *master_domain; + + lockdep_assert_held(&smmu_domain->devices_lock); + + list_for_each_entry(master_domain, &smmu_domain->devices, + devices_elm) { + if (master_domain->master == master) + return master_domain; + } + return NULL; +} + static void arm_smmu_detach_dev(struct arm_smmu_master *master) { struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev); + struct arm_smmu_master_domain *master_domain; struct arm_smmu_domain *smmu_domain; unsigned long flags; @@ -2560,7 +2580,11 @@ static void arm_smmu_detach_dev(struct arm_smmu_master *master) arm_smmu_disable_ats(master, smmu_domain); spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_del_init(&master->domain_head); + master_domain = arm_smmu_find_master_domain(smmu_domain, master); + if (master_domain) { + list_del(&master_domain->devices_elm); + kfree(master_domain); + } spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); master->ats_enabled = false; @@ -2574,6 +2598,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct arm_smmu_device *smmu; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_master_domain *master_domain; struct arm_smmu_master *master; struct arm_smmu_cd *cdptr; @@ -2610,6 +2635,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return -ENOMEM; } + master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL); + if (!master_domain) + return -ENOMEM; + master_domain->master = master; + /* * Prevent arm_smmu_share_asid() from trying to change the ASID * of either the old or new domain while we are working on it. @@ -2623,7 +2653,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) master->ats_enabled = arm_smmu_ats_supported(master); spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_add(&master->domain_head, &smmu_domain->devices); + list_add(&master_domain->devices_elm, &smmu_domain->devices); spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); switch (smmu_domain->stage) { @@ -2932,7 +2962,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) master->dev = dev; master->smmu = smmu; INIT_LIST_HEAD(&master->bonds); - INIT_LIST_HEAD(&master->domain_head); dev_iommu_priv_set(dev, master); ret = arm_smmu_insert_master(smmu, master); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 1fd5b73872753..69f73aa2f87d0 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -697,7 +697,6 @@ struct arm_smmu_stream { struct arm_smmu_master { struct arm_smmu_device *smmu; struct device *dev; - struct list_head domain_head; struct arm_smmu_stream *streams; /* Locked by the iommu core using the group mutex */ struct arm_smmu_ctx_desc_cfg cd_table; @@ -732,6 +731,7 @@ struct arm_smmu_domain { struct iommu_domain domain; + /* List of struct arm_smmu_master_domain */ struct list_head devices; spinlock_t devices_lock; @@ -768,6 +768,11 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, u16 asid); #endif +struct arm_smmu_master_domain { + struct list_head devices_elm; + struct arm_smmu_master *master; +}; + static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) { return container_of(dom, struct arm_smmu_domain, domain); From 61b1dd068a9eee922876d6c6d84fdfd6b1fd2fac Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:35 -0300 Subject: [PATCH 530/548] iommu/arm-smmu-v3: Make changing domains be hitless for ATS The core code allows the domain to be changed on the fly without a forced stop in BLOCKED/IDENTITY. In this flow the driver should just continually maintain the ATS with no change while the STE is updated. ATS relies on a linked list smmu_domain->devices to keep track of which masters have the domain programmed, but this list is also used by arm_smmu_share_asid(), unrelated to ats. Create two new functions to encapsulate this combined logic: arm_smmu_attach_prepare() arm_smmu_attach_commit() The two functions can sequence both enabling ATS and disabling across the STE store. Have every update of the STE use this sequence. Installing a S1/S2 domain always enables the ATS if the PCIe device supports it. The enable flow is now ordered differently to allow it to be hitless: 1) Add the master to the new smmu_domain->devices list 2) Program the STE 3) Enable ATS at PCIe 4) Remove the master from the old smmu_domain This flow ensures that invalidations to either domain will generate an ATC invalidation to the device while the STE is being switched. Thus we don't need to turn off the ATS anymore for correctness. The disable flow is the reverse: 1) Disable ATS at PCIe 2) Program the STE 3) Invalidate the ATC 4) Remove the master from the old smmu_domain Move the nr_ats_masters adjustments to be close to the list manipulations. It is a count of the number of ATS enabled masters currently in the list. This is stricly before and after the STE/CD are revised, and done under the list's spin_lock. This is part of the bigger picture to allow changing the RID domain while a PASID is in use. If a SVA PASID is relying on ATS to function then changing the RID domain cannot just temporarily toggle ATS off without also wrecking the SVA PASID. The new infrastructure here is organized so that the PASID attach/detach flows will make use of it as well in following patches. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Michael Shavit Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/4-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 7497f4211f4fbdcec5fc5bb4df7f6ccd345966e8) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 5 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 237 +++++++++++++----- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 6 +- 3 files changed, 177 insertions(+), 71 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c index 417804392ff08..68aa1c61fdb38 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c @@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste, .smmu = &smmu, }; - arm_smmu_make_cdtable_ste(ste, &master); + arm_smmu_make_cdtable_ste(ste, &master, true); } static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test) @@ -231,7 +231,6 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste, { struct arm_smmu_master master = { .smmu = &smmu, - .ats_enabled = ats_enabled, }; struct io_pgtable io_pgtable = {}; struct arm_smmu_domain smmu_domain = { @@ -247,7 +246,7 @@ static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste, io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.sl = 3; io_pgtable.cfg.arm_lpae_s2_cfg.vtcr.tsz = 4; - arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain); + arm_smmu_make_s2_domain_ste(ste, &master, &smmu_domain, ats_enabled); } static void arm_smmu_v3_write_ste_test_s2_to_abort(struct kunit *test) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index b7b027deaf06b..cfd9ac26e341d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1537,7 +1537,7 @@ void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, VISIBLE_IF_KUNIT void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, - struct arm_smmu_master *master) + struct arm_smmu_master *master, bool ats_enabled) { struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; struct arm_smmu_device *smmu = master->smmu; @@ -1560,7 +1560,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, STRTAB_STE_1_S1STALLD : 0) | FIELD_PREP(STRTAB_STE_1_EATS, - master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); + ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); if (smmu->features & ARM_SMMU_FEAT_E2H) { /* @@ -1589,7 +1589,8 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, VISIBLE_IF_KUNIT void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain) + struct arm_smmu_domain *smmu_domain, + bool ats_enabled) { struct arm_smmu_s2_cfg *s2_cfg = &smmu_domain->s2_cfg; const struct io_pgtable_cfg *pgtbl_cfg = @@ -1606,7 +1607,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, target->data[1] = cpu_to_le64( FIELD_PREP(STRTAB_STE_1_EATS, - master->ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); + ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); if (smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) target->data[1] |= cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, @@ -2463,22 +2464,16 @@ static bool arm_smmu_ats_supported(struct arm_smmu_master *master) return dev_is_pci(dev) && pci_ats_supported(to_pci_dev(dev)); } -static void arm_smmu_enable_ats(struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain) +static void arm_smmu_enable_ats(struct arm_smmu_master *master) { size_t stu; struct pci_dev *pdev; struct arm_smmu_device *smmu = master->smmu; - /* Don't enable ATS at the endpoint if it's not enabled in the STE */ - if (!master->ats_enabled) - return; - /* Smallest Translation Unit: log2 of the smallest supported granule */ stu = __ffs(smmu->pgsize_bitmap); pdev = to_pci_dev(master->dev); - atomic_inc(&smmu_domain->nr_ats_masters); /* * ATC invalidation of PASID 0 causes the entire ATC to be flushed. */ @@ -2487,22 +2482,6 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master, dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu); } -static void arm_smmu_disable_ats(struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain) -{ - if (!master->ats_enabled) - return; - - pci_disable_ats(to_pci_dev(master->dev)); - /* - * Ensure ATS is disabled at the endpoint before we issue the - * ATC invalidation via the SMMU. - */ - wmb(); - arm_smmu_atc_inv_master(master); - atomic_dec(&smmu_domain->nr_ats_masters); -} - static int arm_smmu_enable_pasid(struct arm_smmu_master *master) { int ret; @@ -2566,46 +2545,181 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain, return NULL; } -static void arm_smmu_detach_dev(struct arm_smmu_master *master) +/* + * If the domain uses the smmu_domain->devices list return the arm_smmu_domain + * structure, otherwise NULL. These domains track attached devices so they can + * issue invalidations. + */ +static struct arm_smmu_domain * +to_smmu_domain_devices(struct iommu_domain *domain) +{ + /* The domain can be NULL only when processing the first attach */ + if (!domain) + return NULL; + if (domain->type & __IOMMU_DOMAIN_PAGING) + return to_smmu_domain(domain); + return NULL; +} + +static void arm_smmu_remove_master_domain(struct arm_smmu_master *master, + struct iommu_domain *domain) { - struct iommu_domain *domain = iommu_get_domain_for_dev(master->dev); + struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain); struct arm_smmu_master_domain *master_domain; - struct arm_smmu_domain *smmu_domain; unsigned long flags; - if (!domain || !(domain->type & __IOMMU_DOMAIN_PAGING)) + if (!smmu_domain) return; - smmu_domain = to_smmu_domain(domain); - arm_smmu_disable_ats(master, smmu_domain); - spin_lock_irqsave(&smmu_domain->devices_lock, flags); master_domain = arm_smmu_find_master_domain(smmu_domain, master); if (master_domain) { list_del(&master_domain->devices_elm); kfree(master_domain); + if (master->ats_enabled) + atomic_dec(&smmu_domain->nr_ats_masters); } spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); +} + +struct arm_smmu_attach_state { + /* Inputs */ + struct iommu_domain *old_domain; + struct arm_smmu_master *master; + /* Resulting state */ + bool ats_enabled; +}; + +/* + * Start the sequence to attach a domain to a master. The sequence contains three + * steps: + * arm_smmu_attach_prepare() + * arm_smmu_install_ste_for_dev() + * arm_smmu_attach_commit() + * + * If prepare succeeds then the sequence must be completed. The STE installed + * must set the STE.EATS field according to state.ats_enabled. + * + * If the device supports ATS then this determines if EATS should be enabled + * in the STE, and starts sequencing EATS disable if required. + * + * The change of the EATS in the STE and the PCI ATS config space is managed by + * this sequence to be in the right order so that if PCI ATS is enabled then + * STE.ETAS is enabled. + * + * new_domain can be a non-paging domain. In this case ATS will not be enabled, + * and invalidations won't be tracked. + */ +static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state, + struct iommu_domain *new_domain) +{ + struct arm_smmu_master *master = state->master; + struct arm_smmu_master_domain *master_domain; + struct arm_smmu_domain *smmu_domain = + to_smmu_domain_devices(new_domain); + unsigned long flags; + + /* + * arm_smmu_share_asid() must not see two domains pointing to the same + * arm_smmu_master_domain contents otherwise it could randomly write one + * or the other to the CD. + */ + lockdep_assert_held(&arm_smmu_asid_lock); + + if (smmu_domain) { + /* + * The SMMU does not support enabling ATS with bypass/abort. + * When the STE is in bypass (STE.Config[2:0] == 0b100), ATS + * Translation Requests and Translated transactions are denied + * as though ATS is disabled for the stream (STE.EATS == 0b00), + * causing F_BAD_ATS_TREQ and F_TRANSL_FORBIDDEN events + * (IHI0070Ea 5.2 Stream Table Entry). Thus ATS can only be + * enabled if we have arm_smmu_domain, those always have page + * tables. + */ + state->ats_enabled = arm_smmu_ats_supported(master); + + master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL); + if (!master_domain) + return -ENOMEM; + master_domain->master = master; - master->ats_enabled = false; + /* + * During prepare we want the current smmu_domain and new + * smmu_domain to be in the devices list before we change any + * HW. This ensures that both domains will send ATS + * invalidations to the master until we are done. + * + * It is tempting to make this list only track masters that are + * using ATS, but arm_smmu_share_asid() also uses this to change + * the ASID of a domain, unrelated to ATS. + * + * Notice if we are re-attaching the same domain then the list + * will have two identical entries and commit will remove only + * one of them. + */ + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + if (state->ats_enabled) + atomic_inc(&smmu_domain->nr_ats_masters); + list_add(&master_domain->devices_elm, &smmu_domain->devices); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + } + + if (!state->ats_enabled && master->ats_enabled) { + pci_disable_ats(to_pci_dev(master->dev)); + /* + * This is probably overkill, but the config write for disabling + * ATS should complete before the STE is configured to generate + * UR to avoid AER noise. + */ + wmb(); + } + return 0; +} + +/* + * Commit is done after the STE/CD are configured with the EATS setting. It + * completes synchronizing the PCI device's ATC and finishes manipulating the + * smmu_domain->devices list. + */ +static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state) +{ + struct arm_smmu_master *master = state->master; + + lockdep_assert_held(&arm_smmu_asid_lock); + + if (state->ats_enabled && !master->ats_enabled) { + arm_smmu_enable_ats(master); + } else if (master->ats_enabled) { + /* + * The translation has changed, flush the ATC. At this point the + * SMMU is translating for the new domain and both the old&new + * domain will issue invalidations. + */ + arm_smmu_atc_inv_master(master); + } + master->ats_enabled = state->ats_enabled; + + arm_smmu_remove_master_domain(master, state->old_domain); } static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) { int ret = 0; - unsigned long flags; struct arm_smmu_ste target; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct arm_smmu_device *smmu; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); - struct arm_smmu_master_domain *master_domain; + struct arm_smmu_attach_state state = { + .old_domain = iommu_get_domain_for_dev(dev), + }; struct arm_smmu_master *master; struct arm_smmu_cd *cdptr; if (!fwspec) return -ENOENT; - master = dev_iommu_priv_get(dev); + state.master = master = dev_iommu_priv_get(dev); smmu = master->smmu; /* @@ -2635,11 +2749,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return -ENOMEM; } - master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL); - if (!master_domain) - return -ENOMEM; - master_domain->master = master; - /* * Prevent arm_smmu_share_asid() from trying to change the ASID * of either the old or new domain while we are working on it. @@ -2648,13 +2757,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) */ mutex_lock(&arm_smmu_asid_lock); - arm_smmu_detach_dev(master); - - master->ats_enabled = arm_smmu_ats_supported(master); - - spin_lock_irqsave(&smmu_domain->devices_lock, flags); - list_add(&master_domain->devices_elm, &smmu_domain->devices); - spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + ret = arm_smmu_attach_prepare(&state, domain); + if (ret) { + mutex_unlock(&arm_smmu_asid_lock); + return ret; + } switch (smmu_domain->stage) { case ARM_SMMU_DOMAIN_S1: { @@ -2663,18 +2770,19 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr, &target_cd); - arm_smmu_make_cdtable_ste(&target, master); + arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled); arm_smmu_install_ste_for_dev(master, &target); break; } case ARM_SMMU_DOMAIN_S2: - arm_smmu_make_s2_domain_ste(&target, master, smmu_domain); + arm_smmu_make_s2_domain_ste(&target, master, smmu_domain, + state.ats_enabled); arm_smmu_install_ste_for_dev(master, &target); arm_smmu_clear_cd(master, IOMMU_NO_PASID); break; } - arm_smmu_enable_ats(master, smmu_domain); + arm_smmu_attach_commit(&state); mutex_unlock(&arm_smmu_asid_lock); return 0; } @@ -2703,10 +2811,14 @@ void arm_smmu_remove_pasid(struct arm_smmu_master *master, arm_smmu_clear_cd(master, pasid); } -static int arm_smmu_attach_dev_ste(struct device *dev, - struct arm_smmu_ste *ste) +static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, + struct device *dev, struct arm_smmu_ste *ste) { struct arm_smmu_master *master = dev_iommu_priv_get(dev); + struct arm_smmu_attach_state state = { + .master = master, + .old_domain = iommu_get_domain_for_dev(dev), + }; if (arm_smmu_master_sva_enabled(master)) return -EBUSY; @@ -2717,16 +2829,9 @@ static int arm_smmu_attach_dev_ste(struct device *dev, */ mutex_lock(&arm_smmu_asid_lock); - /* - * The SMMU does not support enabling ATS with bypass/abort. When the - * STE is in bypass (STE.Config[2:0] == 0b100), ATS Translation Requests - * and Translated transactions are denied as though ATS is disabled for - * the stream (STE.EATS == 0b00), causing F_BAD_ATS_TREQ and - * F_TRANSL_FORBIDDEN events (IHI0070Ea 5.2 Stream Table Entry). - */ - arm_smmu_detach_dev(master); - + arm_smmu_attach_prepare(&state, domain); arm_smmu_install_ste_for_dev(master, ste); + arm_smmu_attach_commit(&state); mutex_unlock(&arm_smmu_asid_lock); /* @@ -2745,7 +2850,7 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain, struct arm_smmu_master *master = dev_iommu_priv_get(dev); arm_smmu_make_bypass_ste(master->smmu, &ste); - return arm_smmu_attach_dev_ste(dev, &ste); + return arm_smmu_attach_dev_ste(domain, dev, &ste); } static const struct iommu_domain_ops arm_smmu_identity_ops = { @@ -2763,7 +2868,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain, struct arm_smmu_ste ste; arm_smmu_make_abort_ste(&ste); - return arm_smmu_attach_dev_ste(dev, &ste); + return arm_smmu_attach_dev_ste(domain, dev, &ste); } static const struct iommu_domain_ops arm_smmu_blocked_ops = { diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 69f73aa2f87d0..5e4a21057545a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -759,10 +759,12 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target); void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, struct arm_smmu_ste *target); void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, - struct arm_smmu_master *master); + struct arm_smmu_master *master, + bool ats_enabled); void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain); + struct arm_smmu_domain *smmu_domain, + bool ats_enabled); void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, struct arm_smmu_master *master, struct mm_struct *mm, u16 asid); From 95942d4fafc0a7366e8cfd7403007b05a542ccc1 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:36 -0300 Subject: [PATCH 531/548] iommu/arm-smmu-v3: Add ssid to struct arm_smmu_master_domain Prepare to allow a S1 domain to be attached to a PASID as well. Keep track of the SSID the domain is using on each master in the arm_smmu_master_domain. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/5-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 64efb3def3a53effe01fa750eec6e7369f65e386) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 15 ++++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 42 +++++++++++++++---- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++- 3 files changed, 43 insertions(+), 19 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index c09790c702c9d..755b87eb3910d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -47,13 +47,12 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain) struct arm_smmu_master *master = master_domain->master; struct arm_smmu_cd *cdptr; - /* S1 domains only support RID attachment right now */ - cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID); + cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid); if (WARN_ON(!cdptr)) continue; arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); - arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr, + arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr, &target_cd); } spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); @@ -293,8 +292,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn, smmu_domain); } - arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), start, - size); + arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start, + size); } static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) @@ -331,7 +330,7 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid); - arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0); + arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0); smmu_mn->cleared = true; mutex_unlock(&sva_lock); @@ -410,8 +409,8 @@ static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) */ if (!smmu_mn->cleared) { arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid); - arm_smmu_atc_inv_domain(smmu_domain, mm_get_enqcmd_pasid(mm), 0, - 0); + arm_smmu_atc_inv_domain_sva(smmu_domain, + mm_get_enqcmd_pasid(mm), 0, 0); } /* Frees smmu_mn */ diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index cfd9ac26e341d..e836dc37b7654 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2025,8 +2025,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master) return arm_smmu_cmdq_batch_submit(master->smmu, &cmds); } -int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, - unsigned long iova, size_t size) +static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, + ioasid_t ssid, unsigned long iova, size_t size) { struct arm_smmu_master_domain *master_domain; int i; @@ -2054,8 +2054,6 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, if (!atomic_read(&smmu_domain->nr_ats_masters)) return 0; - arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd); - cmds.num = 0; spin_lock_irqsave(&smmu_domain->devices_lock, flags); @@ -2066,6 +2064,16 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, if (!master->ats_enabled) continue; + /* + * Non-zero ssid means SVA is co-opting the S1 domain to issue + * invalidations for SVA PASIDs. + */ + if (ssid != IOMMU_NO_PASID) + arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd); + else + arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, + &cmd); + for (i = 0; i < master->num_streams; i++) { cmd.atc.sid = master->streams[i].id; arm_smmu_cmdq_batch_add(smmu_domain->smmu, &cmds, &cmd); @@ -2076,6 +2084,19 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds); } +static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, + unsigned long iova, size_t size) +{ + return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, + size); +} + +int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain, + ioasid_t ssid, unsigned long iova, size_t size) +{ + return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size); +} + /* IO_PGTABLE API */ static void arm_smmu_tlb_inv_context(void *cookie) { @@ -2097,7 +2118,7 @@ static void arm_smmu_tlb_inv_context(void *cookie) cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid; arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd); } - arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0); + arm_smmu_atc_inv_domain(smmu_domain, 0, 0); } static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd, @@ -2195,7 +2216,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size, * Unfortunately, this can't be leaf-only since we may have * zapped an entire table. */ - arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, size); + arm_smmu_atc_inv_domain(smmu_domain, iova, size); } void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, @@ -2531,7 +2552,8 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master) static struct arm_smmu_master_domain * arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain, - struct arm_smmu_master *master) + struct arm_smmu_master *master, + ioasid_t ssid) { struct arm_smmu_master_domain *master_domain; @@ -2539,7 +2561,8 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain, list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) { - if (master_domain->master == master) + if (master_domain->master == master && + master_domain->ssid == ssid) return master_domain; } return NULL; @@ -2572,7 +2595,8 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master, return; spin_lock_irqsave(&smmu_domain->devices_lock, flags); - master_domain = arm_smmu_find_master_domain(smmu_domain, master); + master_domain = arm_smmu_find_master_domain(smmu_domain, master, + IOMMU_NO_PASID); if (master_domain) { list_del(&master_domain->devices_elm); kfree(master_domain); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 5e4a21057545a..73483817f462d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -773,6 +773,7 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, struct arm_smmu_master_domain { struct list_head devices_elm; struct arm_smmu_master *master; + ioasid_t ssid; }; static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) @@ -804,8 +805,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, size_t granule, bool leaf, struct arm_smmu_domain *smmu_domain); bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd); -int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid, - unsigned long iova, size_t size); +int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain, + ioasid_t ssid, unsigned long iova, size_t size); #ifdef CONFIG_ARM_SMMU_V3_SVA bool arm_smmu_sva_supported(struct arm_smmu_device *smmu); From d25898e123eb1af81bb2a4a362725aeb0870c409 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:37 -0300 Subject: [PATCH 532/548] iommu/arm-smmu-v3: Do not use master->sva_enable to restrict attaches We no longer need a master->sva_enable to control what attaches are allowed. Instead we can tell if the attach is legal based on the current configuration of the master. Keep track of the number of valid CD entries for SSID's in the cd_table and if the cd_table has been installed in the STE directly so we know what the configuration is. The attach logic is then made into: - SVA bind, check if the CD is installed - RID attach of S2, block if SSIDs are used - RID attach of IDENTITY/BLOCKING, block if SSIDs are used arm_smmu_set_pasid() is already checking if it is possible to setup a CD entry, at this patch it means the RID path already set a STE pointing at the CD table. Tested-by: Nicolin Chen Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/6-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit be7c90de39fdebdba4f9cce7575b71c6b2506ea0) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 ++++++++++----------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 7 ++++++ 2 files changed, 19 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index e836dc37b7654..634368b6282bc 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1291,6 +1291,8 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, struct arm_smmu_cd *cdptr, const struct arm_smmu_cd *target) { + bool target_valid = target->data[0] & cpu_to_le64(CTXDESC_CD_0_V); + bool cur_valid = cdptr->data[0] & cpu_to_le64(CTXDESC_CD_0_V); struct arm_smmu_cd_writer cd_writer = { .writer = { .ops = &arm_smmu_cd_writer_ops, @@ -1299,6 +1301,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, .ssid = ssid, }; + if (ssid != IOMMU_NO_PASID && cur_valid != target_valid) { + if (cur_valid) + master->cd_table.used_ssids--; + else + master->cd_table.used_ssids++; + } + arm_smmu_write_entry(&cd_writer.writer, cdptr->data, target->data); } @@ -2746,16 +2755,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) state.master = master = dev_iommu_priv_get(dev); smmu = master->smmu; - /* - * Checking that SVA is disabled ensures that this device isn't bound to - * any mm, and can be safely detached from its old domain. Bonds cannot - * be removed concurrently since we're holding the group mutex. - */ - if (arm_smmu_master_sva_enabled(master)) { - dev_err(dev, "cannot attach - SVA enabled\n"); - return -EBUSY; - } - mutex_lock(&smmu_domain->init_mutex); if (!smmu_domain->smmu) { @@ -2771,7 +2770,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID); if (!cdptr) return -ENOMEM; - } + } else if (arm_smmu_ssids_in_use(&master->cd_table)) + return -EBUSY; /* * Prevent arm_smmu_share_asid() from trying to change the ASID @@ -2844,7 +2844,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, .old_domain = iommu_get_domain_for_dev(dev), }; - if (arm_smmu_master_sva_enabled(master)) + if (arm_smmu_ssids_in_use(&master->cd_table)) return -EBUSY; /* diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 73483817f462d..772ef24c3c7fe 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -602,12 +602,19 @@ struct arm_smmu_ctx_desc_cfg { dma_addr_t cdtab_dma; struct arm_smmu_l1_ctx_desc *l1_desc; unsigned int num_l1_ents; + unsigned int used_ssids; u8 in_ste; u8 s1fmt; /* log2 of the maximum number of CDs supported by this table */ u8 s1cdmax; }; +/* True if the cd table has SSIDS > 0 in use. */ +static inline bool arm_smmu_ssids_in_use(struct arm_smmu_ctx_desc_cfg *cd_table) +{ + return cd_table->used_ssids; +} + struct arm_smmu_s2_cfg { u16 vmid; }; From bbc6054b63712dc00e56e4f83bba5129493a422a Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:38 -0300 Subject: [PATCH 533/548] iommu/arm-smmu-v3: Thread SSID through the arm_smmu_attach_*() interface Allow creating and managing arm_smmu_mater_domain's with a non-zero SSID through the arm_smmu_attach_*() family of functions. This triggers ATC invalidation for the correct SSID in PASID cases and tracks the per-attachment SSID in the struct arm_smmu_master_domain. Generalize arm_smmu_attach_remove() to be able to remove SSID's as well by ensuring the ATC for the PASID is flushed properly. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/7-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 1d5f34f0002f9f56d0ca153022cfdead07d45dc6) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++++++++------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 634368b6282bc..9cda85586e4d0 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2017,13 +2017,14 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size, cmd->atc.size = log2_span; } -static int arm_smmu_atc_inv_master(struct arm_smmu_master *master) +static int arm_smmu_atc_inv_master(struct arm_smmu_master *master, + ioasid_t ssid) { int i; struct arm_smmu_cmdq_ent cmd; struct arm_smmu_cmdq_batch cmds; - arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd); + arm_smmu_atc_inv_to_cmd(ssid, 0, 0, &cmd); cmds.num = 0; for (i = 0; i < master->num_streams; i++) { @@ -2507,7 +2508,7 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master) /* * ATC invalidation of PASID 0 causes the entire ATC to be flushed. */ - arm_smmu_atc_inv_master(master); + arm_smmu_atc_inv_master(master, IOMMU_NO_PASID); if (pci_enable_ats(pdev, stu)) dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu); } @@ -2594,7 +2595,8 @@ to_smmu_domain_devices(struct iommu_domain *domain) } static void arm_smmu_remove_master_domain(struct arm_smmu_master *master, - struct iommu_domain *domain) + struct iommu_domain *domain, + ioasid_t ssid) { struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain); struct arm_smmu_master_domain *master_domain; @@ -2604,8 +2606,7 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master, return; spin_lock_irqsave(&smmu_domain->devices_lock, flags); - master_domain = arm_smmu_find_master_domain(smmu_domain, master, - IOMMU_NO_PASID); + master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid); if (master_domain) { list_del(&master_domain->devices_elm); kfree(master_domain); @@ -2619,6 +2620,7 @@ struct arm_smmu_attach_state { /* Inputs */ struct iommu_domain *old_domain; struct arm_smmu_master *master; + ioasid_t ssid; /* Resulting state */ bool ats_enabled; }; @@ -2676,6 +2678,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state, if (!master_domain) return -ENOMEM; master_domain->master = master; + master_domain->ssid = state->ssid; /* * During prepare we want the current smmu_domain and new @@ -2723,17 +2726,20 @@ static void arm_smmu_attach_commit(struct arm_smmu_attach_state *state) if (state->ats_enabled && !master->ats_enabled) { arm_smmu_enable_ats(master); - } else if (master->ats_enabled) { + } else if (state->ats_enabled && master->ats_enabled) { /* * The translation has changed, flush the ATC. At this point the * SMMU is translating for the new domain and both the old&new * domain will issue invalidations. */ - arm_smmu_atc_inv_master(master); + arm_smmu_atc_inv_master(master, state->ssid); + } else if (!state->ats_enabled && master->ats_enabled) { + /* ATS is being switched off, invalidate the entire ATC */ + arm_smmu_atc_inv_master(master, IOMMU_NO_PASID); } master->ats_enabled = state->ats_enabled; - arm_smmu_remove_master_domain(master, state->old_domain); + arm_smmu_remove_master_domain(master, state->old_domain, state->ssid); } static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) @@ -2745,6 +2751,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_attach_state state = { .old_domain = iommu_get_domain_for_dev(dev), + .ssid = IOMMU_NO_PASID, }; struct arm_smmu_master *master; struct arm_smmu_cd *cdptr; @@ -2842,6 +2849,7 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, struct arm_smmu_attach_state state = { .master = master, .old_domain = iommu_get_domain_for_dev(dev), + .ssid = IOMMU_NO_PASID, }; if (arm_smmu_ssids_in_use(&master->cd_table)) From adb54a3c85c1eebc3f10062edcd878757993b9d4 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:39 -0300 Subject: [PATCH 534/548] iommu/arm-smmu-v3: Make SVA allocate a normal arm_smmu_domain Currently the SVA domain is a naked struct iommu_domain, allocate a struct arm_smmu_domain instead. This is necessary to be able to use the struct arm_master_domain mechanism. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/8-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit d7b2d2ba1b84f4ae7cd94de22f74d6c6c5419de6) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 21 ++++++++------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 +++++++++++++------ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++ 3 files changed, 33 insertions(+), 17 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 755b87eb3910d..c40de57c3436c 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -635,7 +635,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain, } arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid); - ret = arm_smmu_set_pasid(master, NULL, id, &target); + ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target); if (ret) { list_del(&bond->list); arm_smmu_mmu_notifier_put(bond->smmu_mn); @@ -649,7 +649,7 @@ static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain, static void arm_smmu_sva_domain_free(struct iommu_domain *domain) { - kfree(domain); + kfree(to_smmu_domain(domain)); } static const struct iommu_domain_ops arm_smmu_sva_domain_ops = { @@ -660,13 +660,16 @@ static const struct iommu_domain_ops arm_smmu_sva_domain_ops = { struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev, struct mm_struct *mm) { - struct iommu_domain *domain; + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + struct arm_smmu_device *smmu = master->smmu; + struct arm_smmu_domain *smmu_domain; - domain = kzalloc(sizeof(*domain), GFP_KERNEL); - if (!domain) - return ERR_PTR(-ENOMEM); - domain->type = IOMMU_DOMAIN_SVA; - domain->ops = &arm_smmu_sva_domain_ops; + smmu_domain = arm_smmu_domain_alloc(); + if (IS_ERR(smmu_domain)) + return ERR_CAST(smmu_domain); + smmu_domain->domain.type = IOMMU_DOMAIN_SVA; + smmu_domain->domain.ops = &arm_smmu_sva_domain_ops; + smmu_domain->smmu = smmu; - return domain; + return &smmu_domain->domain; } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 9cda85586e4d0..6bee8e7ca1189 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2284,15 +2284,10 @@ static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap) } } -static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) +struct arm_smmu_domain *arm_smmu_domain_alloc(void) { struct arm_smmu_domain *smmu_domain; - /* - * Allocate the domain and initialise some of its data structures. - * We can't really do anything meaningful until we've added a - * master. - */ smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL); if (!smmu_domain) return ERR_PTR(-ENOMEM); @@ -2302,6 +2297,22 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) spin_lock_init(&smmu_domain->devices_lock); INIT_LIST_HEAD(&smmu_domain->mmu_notifiers); + return smmu_domain; +} + +static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) +{ + struct arm_smmu_domain *smmu_domain; + + /* + * Allocate the domain and initialise some of its data structures. + * We can't really do anything meaningful until we've added a + * master. + */ + smmu_domain = arm_smmu_domain_alloc(); + if (IS_ERR(smmu_domain)) + return ERR_CAST(smmu_domain); + if (dev) { struct arm_smmu_master *master = dev_iommu_priv_get(dev); int ret; @@ -2315,7 +2326,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev) return &smmu_domain->domain; } -static void arm_smmu_domain_free(struct iommu_domain *domain) +static void arm_smmu_domain_free_paging(struct iommu_domain *domain) { struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_device *smmu = smmu_domain->smmu; @@ -3312,7 +3323,7 @@ static struct iommu_ops arm_smmu_ops = { .iotlb_sync = arm_smmu_iotlb_sync, .iova_to_phys = arm_smmu_iova_to_phys, .enable_nesting = arm_smmu_enable_nesting, - .free = arm_smmu_domain_free, + .free = arm_smmu_domain_free_paging, } }; diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 772ef24c3c7fe..3cf9d2358a2b2 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -791,6 +791,8 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) extern struct xarray arm_smmu_asid_xa; extern struct mutex arm_smmu_asid_lock; +struct arm_smmu_domain *arm_smmu_domain_alloc(void); + void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid); struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master, u32 ssid); From b601b795de4df3837697cbe27e0004fef68d2752 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:40 -0300 Subject: [PATCH 535/548] iommu/arm-smmu-v3: Keep track of arm_smmu_master_domain for SVA Fill in the smmu_domain->devices list in the new struct arm_smmu_domain that SVA allocates. Keep track of every SSID and master that is using the domain reusing the logic for the RID attach. This is the first step to making the SVA invalidation follow the same design as S1/S2 invalidation. At present nothing will read this list. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/9-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 49db2ed23c52f8371c12ab8646df23fa1daad4b2) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 30 +++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6bee8e7ca1189..3c8267f7fcd21 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2600,7 +2600,8 @@ to_smmu_domain_devices(struct iommu_domain *domain) /* The domain can be NULL only when processing the first attach */ if (!domain) return NULL; - if (domain->type & __IOMMU_DOMAIN_PAGING) + if ((domain->type & __IOMMU_DOMAIN_PAGING) || + domain->type == IOMMU_DOMAIN_SVA) return to_smmu_domain(domain); return NULL; } @@ -2833,7 +2834,16 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, ioasid_t pasid, const struct arm_smmu_cd *cd) { + struct arm_smmu_attach_state state = { + .master = master, + /* + * For now the core code prevents calling this when a domain is + * already attached, no need to set old_domain. + */ + .ssid = pasid, + }; struct arm_smmu_cd *cdptr; + int ret; /* The core code validates pasid */ @@ -2843,14 +2853,30 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, cdptr = arm_smmu_alloc_cd_ptr(master, pasid); if (!cdptr) return -ENOMEM; + + mutex_lock(&arm_smmu_asid_lock); + ret = arm_smmu_attach_prepare(&state, &smmu_domain->domain); + if (ret) + goto out_unlock; + arm_smmu_write_cd_entry(master, pasid, cdptr, cd); - return 0; + + arm_smmu_attach_commit(&state); + +out_unlock: + mutex_unlock(&arm_smmu_asid_lock); + return ret; } void arm_smmu_remove_pasid(struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, ioasid_t pasid) { + mutex_lock(&arm_smmu_asid_lock); arm_smmu_clear_cd(master, pasid); + if (master->ats_enabled) + arm_smmu_atc_inv_master(master, pasid); + arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid); + mutex_unlock(&arm_smmu_asid_lock); } static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, From c2f134ba4b56a8c0bea8559ad96bd8e7a05bc126 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:41 -0300 Subject: [PATCH 536/548] iommu/arm-smmu-v3: Put the SVA mmu notifier in the smmu_domain This removes all the notifier de-duplication logic in the driver and relies on the core code to de-duplicate and allocate only one SVA domain per mm per smmu instance. This naturally gives a 1:1 relationship between SVA domain and mmu notifier. It is a significant simplication of the flow, as we end up with a single struct arm_smmu_domain for each MM and the invalidation can then be shifted to properly use the masters list like S1/S2 do. Remove all of the previous mmu_notifier, bond, shared cd, and cd refcount logic entirely. The logic here is tightly wound together with the unusued BTM support. Since the BTM logic requires holding all the iommu_domains in a global ASID xarray it conflicts with the design to have a single SVA domain per PASID, as multiple SMMU instances will need to have different domains. Following patches resolve this by making the ASID xarray per-instance instead of global. However, converting the BTM code over to this methodology requires many changes. Thus, since ARM_SMMU_FEAT_BTM is never enabled, remove the parts of the BTM support for ASID sharing that interact with SVA as well. A followup series is already working on fully enabling the BTM support, that requires iommufd's VIOMMU feature to bring in the KVM's VMID as well. It will come with an already written patch to bring back the ASID sharing using a per-instance ASID xarray. https://lore.kernel.org/linux-iommu/20240208151837.35068-1-shameerali.kolothum.thodi@huawei.com/ https://lore.kernel.org/linux-iommu/26-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com/ Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Michael Shavit Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/10-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit d38c28dbefeee03d7dd02004ad80d9676ac54d86) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 396 ++++-------------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 69 +-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 15 +- 3 files changed, 88 insertions(+), 392 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index c40de57c3436c..95e70a304ae20 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -13,29 +13,9 @@ #include "arm-smmu-v3.h" #include "../../io-pgtable-arm.h" -struct arm_smmu_mmu_notifier { - struct mmu_notifier mn; - struct arm_smmu_ctx_desc *cd; - bool cleared; - refcount_t refs; - struct list_head list; - struct arm_smmu_domain *domain; -}; - -#define mn_to_smmu(mn) container_of(mn, struct arm_smmu_mmu_notifier, mn) - -struct arm_smmu_bond { - struct mm_struct *mm; - struct arm_smmu_mmu_notifier *smmu_mn; - struct list_head list; -}; - -#define sva_to_bond(handle) \ - container_of(handle, struct arm_smmu_bond, sva) - static DEFINE_MUTEX(sva_lock); -static void +static void __maybe_unused arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain) { struct arm_smmu_master_domain *master_domain; @@ -58,58 +38,6 @@ arm_smmu_update_s1_domain_cd_entry(struct arm_smmu_domain *smmu_domain) spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); } -/* - * Check if the CPU ASID is available on the SMMU side. If a private context - * descriptor is using it, try to replace it. - */ -static struct arm_smmu_ctx_desc * -arm_smmu_share_asid(struct mm_struct *mm, u16 asid) -{ - int ret; - u32 new_asid; - struct arm_smmu_ctx_desc *cd; - struct arm_smmu_device *smmu; - struct arm_smmu_domain *smmu_domain; - - cd = xa_load(&arm_smmu_asid_xa, asid); - if (!cd) - return NULL; - - if (cd->mm) { - if (WARN_ON(cd->mm != mm)) - return ERR_PTR(-EINVAL); - /* All devices bound to this mm use the same cd struct. */ - refcount_inc(&cd->refs); - return cd; - } - - smmu_domain = container_of(cd, struct arm_smmu_domain, cd); - smmu = smmu_domain->smmu; - - ret = xa_alloc(&arm_smmu_asid_xa, &new_asid, cd, - XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); - if (ret) - return ERR_PTR(-ENOSPC); - /* - * Race with unmap: TLB invalidations will start targeting the new ASID, - * which isn't assigned yet. We'll do an invalidate-all on the old ASID - * later, so it doesn't matter. - */ - cd->asid = new_asid; - /* - * Update ASID and invalidate CD in all associated masters. There will - * be some overlap between use of both ASIDs, until we invalidate the - * TLB. - */ - arm_smmu_update_s1_domain_cd_entry(smmu_domain); - - /* Invalidate TLB entries previously associated with that context */ - arm_smmu_tlb_inv_asid(smmu, asid); - - xa_erase(&arm_smmu_asid_xa, asid); - return NULL; -} - static u64 page_size_to_cd(void) { static_assert(PAGE_SIZE == SZ_4K || PAGE_SIZE == SZ_16K || @@ -186,69 +114,6 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, target->data[3] = cpu_to_le64(read_sysreg(mair_el1)); } -static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm) -{ - u16 asid; - int err = 0; - struct arm_smmu_ctx_desc *cd; - struct arm_smmu_ctx_desc *ret = NULL; - - /* Don't free the mm until we release the ASID */ - mmgrab(mm); - - asid = arm64_mm_context_get(mm); - if (!asid) { - err = -ESRCH; - goto out_drop_mm; - } - - cd = kzalloc(sizeof(*cd), GFP_KERNEL); - if (!cd) { - err = -ENOMEM; - goto out_put_context; - } - - refcount_set(&cd->refs, 1); - - mutex_lock(&arm_smmu_asid_lock); - ret = arm_smmu_share_asid(mm, asid); - if (ret) { - mutex_unlock(&arm_smmu_asid_lock); - goto out_free_cd; - } - - err = xa_insert(&arm_smmu_asid_xa, asid, cd, GFP_KERNEL); - mutex_unlock(&arm_smmu_asid_lock); - - if (err) - goto out_free_asid; - - cd->asid = asid; - cd->mm = mm; - - return cd; - -out_free_asid: - arm_smmu_free_asid(cd); -out_free_cd: - kfree(cd); -out_put_context: - arm64_mm_context_put(mm); -out_drop_mm: - mmdrop(mm); - return err < 0 ? ERR_PTR(err) : ret; -} - -static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd) -{ - if (arm_smmu_free_asid(cd)) { - /* Unpin ASID */ - arm64_mm_context_put(cd->mm); - mmdrop(cd->mm); - kfree(cd); - } -} - /* * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this * is used as a threshold to replace per-page TLBI commands to issue in the @@ -263,8 +128,8 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn, unsigned long start, unsigned long end) { - struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn); - struct arm_smmu_domain *smmu_domain = smmu_mn->domain; + struct arm_smmu_domain *smmu_domain = + container_of(mn, struct arm_smmu_domain, mmu_notifier); size_t size; /* @@ -281,34 +146,22 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn, size = 0; } - if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) { - if (!size) - arm_smmu_tlb_inv_asid(smmu_domain->smmu, - smmu_mn->cd->asid); - else - arm_smmu_tlb_inv_range_asid(start, size, - smmu_mn->cd->asid, - PAGE_SIZE, false, - smmu_domain); - } + if (!size) + arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid); + else + arm_smmu_tlb_inv_range_asid(start, size, smmu_domain->cd.asid, + PAGE_SIZE, false, smmu_domain); - arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), start, - size); + arm_smmu_atc_inv_domain(smmu_domain, start, size); } static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) { - struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn); - struct arm_smmu_domain *smmu_domain = smmu_mn->domain; + struct arm_smmu_domain *smmu_domain = + container_of(mn, struct arm_smmu_domain, mmu_notifier); struct arm_smmu_master_domain *master_domain; unsigned long flags; - mutex_lock(&sva_lock); - if (smmu_mn->cleared) { - mutex_unlock(&sva_lock); - return; - } - /* * DMA may still be running. Keep the cd valid to avoid C_BAD_CD events, * but disable translation. @@ -320,25 +173,23 @@ static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm) struct arm_smmu_cd target; struct arm_smmu_cd *cdptr; - cdptr = arm_smmu_get_cd_ptr(master, mm_get_enqcmd_pasid(mm)); + cdptr = arm_smmu_get_cd_ptr(master, master_domain->ssid); if (WARN_ON(!cdptr)) continue; - arm_smmu_make_sva_cd(&target, master, NULL, smmu_mn->cd->asid); - arm_smmu_write_cd_entry(master, mm_get_enqcmd_pasid(mm), cdptr, + arm_smmu_make_sva_cd(&target, master, NULL, + smmu_domain->cd.asid); + arm_smmu_write_cd_entry(master, master_domain->ssid, cdptr, &target); } spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); - arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_mn->cd->asid); - arm_smmu_atc_inv_domain_sva(smmu_domain, mm_get_enqcmd_pasid(mm), 0, 0); - - smmu_mn->cleared = true; - mutex_unlock(&sva_lock); + arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid); + arm_smmu_atc_inv_domain(smmu_domain, 0, 0); } static void arm_smmu_mmu_notifier_free(struct mmu_notifier *mn) { - kfree(mn_to_smmu(mn)); + kfree(container_of(mn, struct arm_smmu_domain, mmu_notifier)); } static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = { @@ -347,115 +198,6 @@ static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = { .free_notifier = arm_smmu_mmu_notifier_free, }; -/* Allocate or get existing MMU notifier for this {domain, mm} pair */ -static struct arm_smmu_mmu_notifier * -arm_smmu_mmu_notifier_get(struct arm_smmu_domain *smmu_domain, - struct mm_struct *mm) -{ - int ret; - struct arm_smmu_ctx_desc *cd; - struct arm_smmu_mmu_notifier *smmu_mn; - - list_for_each_entry(smmu_mn, &smmu_domain->mmu_notifiers, list) { - if (smmu_mn->mn.mm == mm) { - refcount_inc(&smmu_mn->refs); - return smmu_mn; - } - } - - cd = arm_smmu_alloc_shared_cd(mm); - if (IS_ERR(cd)) - return ERR_CAST(cd); - - smmu_mn = kzalloc(sizeof(*smmu_mn), GFP_KERNEL); - if (!smmu_mn) { - ret = -ENOMEM; - goto err_free_cd; - } - - refcount_set(&smmu_mn->refs, 1); - smmu_mn->cd = cd; - smmu_mn->domain = smmu_domain; - smmu_mn->mn.ops = &arm_smmu_mmu_notifier_ops; - - ret = mmu_notifier_register(&smmu_mn->mn, mm); - if (ret) { - kfree(smmu_mn); - goto err_free_cd; - } - - list_add(&smmu_mn->list, &smmu_domain->mmu_notifiers); - return smmu_mn; - -err_free_cd: - arm_smmu_free_shared_cd(cd); - return ERR_PTR(ret); -} - -static void arm_smmu_mmu_notifier_put(struct arm_smmu_mmu_notifier *smmu_mn) -{ - struct mm_struct *mm = smmu_mn->mn.mm; - struct arm_smmu_ctx_desc *cd = smmu_mn->cd; - struct arm_smmu_domain *smmu_domain = smmu_mn->domain; - - if (!refcount_dec_and_test(&smmu_mn->refs)) - return; - - list_del(&smmu_mn->list); - - /* - * If we went through clear(), we've already invalidated, and no - * new TLB entry can have been formed. - */ - if (!smmu_mn->cleared) { - arm_smmu_tlb_inv_asid(smmu_domain->smmu, cd->asid); - arm_smmu_atc_inv_domain_sva(smmu_domain, - mm_get_enqcmd_pasid(mm), 0, 0); - } - - /* Frees smmu_mn */ - mmu_notifier_put(&smmu_mn->mn); - arm_smmu_free_shared_cd(cd); -} - -static struct arm_smmu_bond *__arm_smmu_sva_bind(struct device *dev, - struct mm_struct *mm) -{ - int ret; - struct arm_smmu_bond *bond; - struct arm_smmu_master *master = dev_iommu_priv_get(dev); - struct iommu_domain *domain = iommu_get_domain_for_dev(dev); - struct arm_smmu_domain *smmu_domain; - - if (!(domain->type & __IOMMU_DOMAIN_PAGING)) - return ERR_PTR(-ENODEV); - smmu_domain = to_smmu_domain(domain); - if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) - return ERR_PTR(-ENODEV); - - if (!master || !master->sva_enabled) - return ERR_PTR(-ENODEV); - - bond = kzalloc(sizeof(*bond), GFP_KERNEL); - if (!bond) - return ERR_PTR(-ENOMEM); - - bond->mm = mm; - - bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm); - if (IS_ERR(bond->smmu_mn)) { - ret = PTR_ERR(bond->smmu_mn); - goto err_free_bond; - } - - list_add(&bond->list, &master->bonds); - return bond; - -err_free_bond: - kfree(bond); - return ERR_PTR(ret); -} - bool arm_smmu_sva_supported(struct arm_smmu_device *smmu) { unsigned long reg, fld; @@ -572,11 +314,6 @@ int arm_smmu_master_enable_sva(struct arm_smmu_master *master) int arm_smmu_master_disable_sva(struct arm_smmu_master *master) { mutex_lock(&sva_lock); - if (!list_empty(&master->bonds)) { - dev_err(master->dev, "cannot disable SVA, device is bound\n"); - mutex_unlock(&sva_lock); - return -EBUSY; - } arm_smmu_master_sva_disable_iopf(master); master->sva_enabled = false; mutex_unlock(&sva_lock); @@ -593,63 +330,51 @@ void arm_smmu_sva_notifier_synchronize(void) mmu_notifier_synchronize(); } -void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, - struct device *dev, ioasid_t id) -{ - struct mm_struct *mm = domain->mm; - struct arm_smmu_bond *bond = NULL, *t; - struct arm_smmu_master *master = dev_iommu_priv_get(dev); - - arm_smmu_remove_pasid(master, to_smmu_domain(domain), id); - - mutex_lock(&sva_lock); - list_for_each_entry(t, &master->bonds, list) { - if (t->mm == mm) { - bond = t; - break; - } - } - - if (!WARN_ON(!bond)) { - list_del(&bond->list); - arm_smmu_mmu_notifier_put(bond->smmu_mn); - kfree(bond); - } - mutex_unlock(&sva_lock); -} - static int arm_smmu_sva_set_dev_pasid(struct iommu_domain *domain, struct device *dev, ioasid_t id) { + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_master *master = dev_iommu_priv_get(dev); - struct mm_struct *mm = domain->mm; - struct arm_smmu_bond *bond; struct arm_smmu_cd target; int ret; - mutex_lock(&sva_lock); - bond = __arm_smmu_sva_bind(dev, mm); - if (IS_ERR(bond)) { - mutex_unlock(&sva_lock); - return PTR_ERR(bond); - } + /* Prevent arm_smmu_mm_release from being called while we are attaching */ + if (!mmget_not_zero(domain->mm)) + return -EINVAL; - arm_smmu_make_sva_cd(&target, master, mm, bond->smmu_mn->cd->asid); - ret = arm_smmu_set_pasid(master, to_smmu_domain(domain), id, &target); - if (ret) { - list_del(&bond->list); - arm_smmu_mmu_notifier_put(bond->smmu_mn); - kfree(bond); - mutex_unlock(&sva_lock); - return ret; - } - mutex_unlock(&sva_lock); - return 0; + /* + * This does not need the arm_smmu_asid_lock because SVA domains never + * get reassigned + */ + arm_smmu_make_sva_cd(&target, master, domain->mm, smmu_domain->cd.asid); + ret = arm_smmu_set_pasid(master, smmu_domain, id, &target); + + mmput(domain->mm); + return ret; } static void arm_smmu_sva_domain_free(struct iommu_domain *domain) { - kfree(to_smmu_domain(domain)); + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + + /* + * Ensure the ASID is empty in the iommu cache before allowing reuse. + */ + arm_smmu_tlb_inv_asid(smmu_domain->smmu, smmu_domain->cd.asid); + + /* + * Notice that the arm_smmu_mm_arch_invalidate_secondary_tlbs op can + * still be called/running at this point. We allow the ASID to be + * reused, and if there is a race then it just suffers harmless + * unnecessary invalidation. + */ + xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid); + + /* + * Actual free is defered to the SRCU callback + * arm_smmu_mmu_notifier_free() + */ + mmu_notifier_put(&smmu_domain->mmu_notifier); } static const struct iommu_domain_ops arm_smmu_sva_domain_ops = { @@ -663,6 +388,8 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev, struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_domain *smmu_domain; + u32 asid; + int ret; smmu_domain = arm_smmu_domain_alloc(); if (IS_ERR(smmu_domain)) @@ -671,5 +398,22 @@ struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev, smmu_domain->domain.ops = &arm_smmu_sva_domain_ops; smmu_domain->smmu = smmu; + ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain, + XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); + if (ret) + goto err_free; + + smmu_domain->cd.asid = asid; + smmu_domain->mmu_notifier.ops = &arm_smmu_mmu_notifier_ops; + ret = mmu_notifier_register(&smmu_domain->mmu_notifier, mm); + if (ret) + goto err_asid; + return &smmu_domain->domain; + +err_asid: + xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid); +err_free: + kfree(smmu_domain); + return ERR_PTR(ret); } diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 3c8267f7fcd21..5edc7a9d77487 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1440,22 +1440,6 @@ static void arm_smmu_free_cd_tables(struct arm_smmu_master *master) cd_table->cdtab = NULL; } -bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd) -{ - bool free; - struct arm_smmu_ctx_desc *old_cd; - - if (!cd->asid) - return false; - - free = refcount_dec_and_test(&cd->refs); - if (free) { - old_cd = xa_erase(&arm_smmu_asid_xa, cd->asid); - WARN_ON(old_cd != cd); - } - return free; -} - /* Stream table manipulation functions */ static void arm_smmu_write_strtab_l1_desc(__le64 *dst, struct arm_smmu_strtab_l1_desc *desc) @@ -2035,8 +2019,8 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master, return arm_smmu_cmdq_batch_submit(master->smmu, &cmds); } -static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, - ioasid_t ssid, unsigned long iova, size_t size) +int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, + unsigned long iova, size_t size) { struct arm_smmu_master_domain *master_domain; int i; @@ -2074,15 +2058,7 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, if (!master->ats_enabled) continue; - /* - * Non-zero ssid means SVA is co-opting the S1 domain to issue - * invalidations for SVA PASIDs. - */ - if (ssid != IOMMU_NO_PASID) - arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd); - else - arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, - &cmd); + arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd); for (i = 0; i < master->num_streams; i++) { cmd.atc.sid = master->streams[i].id; @@ -2094,19 +2070,6 @@ static int __arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, return arm_smmu_cmdq_batch_submit(smmu_domain->smmu, &cmds); } -static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, - unsigned long iova, size_t size) -{ - return __arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, - size); -} - -int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain, - ioasid_t ssid, unsigned long iova, size_t size) -{ - return __arm_smmu_atc_inv_domain(smmu_domain, ssid, iova, size); -} - /* IO_PGTABLE API */ static void arm_smmu_tlb_inv_context(void *cookie) { @@ -2295,7 +2258,6 @@ struct arm_smmu_domain *arm_smmu_domain_alloc(void) mutex_init(&smmu_domain->init_mutex); INIT_LIST_HEAD(&smmu_domain->devices); spin_lock_init(&smmu_domain->devices_lock); - INIT_LIST_HEAD(&smmu_domain->mmu_notifiers); return smmu_domain; } @@ -2337,7 +2299,7 @@ static void arm_smmu_domain_free_paging(struct iommu_domain *domain) if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) { /* Prevent SVA from touching the CD while we're freeing it */ mutex_lock(&arm_smmu_asid_lock); - arm_smmu_free_asid(&smmu_domain->cd); + xa_erase(&arm_smmu_asid_xa, smmu_domain->cd.asid); mutex_unlock(&arm_smmu_asid_lock); } else { struct arm_smmu_s2_cfg *cfg = &smmu_domain->s2_cfg; @@ -2355,11 +2317,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_device *smmu, u32 asid = 0; struct arm_smmu_ctx_desc *cd = &smmu_domain->cd; - refcount_set(&cd->refs, 1); - /* Prevent SVA from modifying the ASID until it is written to the CD */ mutex_lock(&arm_smmu_asid_lock); - ret = xa_alloc(&arm_smmu_asid_xa, &asid, cd, + ret = xa_alloc(&arm_smmu_asid_xa, &asid, smmu_domain, XA_LIMIT(1, (1 << smmu->asid_bits) - 1), GFP_KERNEL); cd->asid = (u16)asid; mutex_unlock(&arm_smmu_asid_lock); @@ -2847,6 +2807,9 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, /* The core code validates pasid */ + if (smmu_domain->smmu != master->smmu) + return -EINVAL; + if (!master->cd_table.in_ste) return -ENODEV; @@ -2868,9 +2831,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, return ret; } -void arm_smmu_remove_pasid(struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain, ioasid_t pasid) +static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid, + struct iommu_domain *domain) { + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + struct arm_smmu_domain *smmu_domain; + + smmu_domain = to_smmu_domain(domain); + mutex_lock(&arm_smmu_asid_lock); arm_smmu_clear_cd(master, pasid); if (master->ats_enabled) @@ -3135,7 +3103,6 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) master->dev = dev; master->smmu = smmu; - INIT_LIST_HEAD(&master->bonds); dev_iommu_priv_set(dev, master); ret = arm_smmu_insert_master(smmu, master); @@ -3317,12 +3284,6 @@ static int arm_smmu_def_domain_type(struct device *dev) return 0; } -static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid, - struct iommu_domain *domain) -{ - arm_smmu_sva_remove_dev_pasid(domain, dev, pasid); -} - static struct iommu_ops arm_smmu_ops = { .identity_domain = &arm_smmu_identity_domain, .blocked_domain = &arm_smmu_blocked_domain, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 3cf9d2358a2b2..a5aa50d916ed9 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -587,9 +587,6 @@ struct arm_smmu_strtab_l1_desc { struct arm_smmu_ctx_desc { u16 asid; - - refcount_t refs; - struct mm_struct *mm; }; struct arm_smmu_l1_ctx_desc { @@ -712,7 +709,6 @@ struct arm_smmu_master { bool stall_enabled; bool sva_enabled; bool iopf_enabled; - struct list_head bonds; unsigned int ssid_bits; }; @@ -742,7 +738,7 @@ struct arm_smmu_domain { struct list_head devices; spinlock_t devices_lock; - struct list_head mmu_notifiers; + struct mmu_notifier mmu_notifier; }; /* The following are exposed for testing purposes. */ @@ -806,16 +802,13 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, int arm_smmu_set_pasid(struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, ioasid_t pasid, const struct arm_smmu_cd *cd); -void arm_smmu_remove_pasid(struct arm_smmu_master *master, - struct arm_smmu_domain *smmu_domain, ioasid_t pasid); void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, size_t granule, bool leaf, struct arm_smmu_domain *smmu_domain); -bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd); -int arm_smmu_atc_inv_domain_sva(struct arm_smmu_domain *smmu_domain, - ioasid_t ssid, unsigned long iova, size_t size); +int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, + unsigned long iova, size_t size); #ifdef CONFIG_ARM_SMMU_V3_SVA bool arm_smmu_sva_supported(struct arm_smmu_device *smmu); @@ -827,8 +820,6 @@ bool arm_smmu_master_iopf_supported(struct arm_smmu_master *master); void arm_smmu_sva_notifier_synchronize(void); struct iommu_domain *arm_smmu_sva_domain_alloc(struct device *dev, struct mm_struct *mm); -void arm_smmu_sva_remove_dev_pasid(struct iommu_domain *domain, - struct device *dev, ioasid_t id); #else /* CONFIG_ARM_SMMU_V3_SVA */ static inline bool arm_smmu_sva_supported(struct arm_smmu_device *smmu) { From ee9adca7060bd263d3b6a1fbfa349232287daa1b Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:42 -0300 Subject: [PATCH 537/548] iommu/arm-smmu-v3: Allow IDENTITY/BLOCKED to be set while PASID is used The HW supports this, use the S1DSS bits to configure the behavior of SSID=0 which is the RID's translation. If SSID's are currently being used in the CD table then just update the S1DSS bits in the STE, remove the master_domain and leave ATS alone. For iommufd the driver design has a small problem that all the unused CD table entries are set with V=0 which will generate an event if VFIO userspace tries to use the CD entry. This patch extends this problem to include the RID as well if PASID is being used. For BLOCKED with used PASIDs the F_STREAM_DISABLED (STRTAB_STE_1_S1DSS_TERMINATE) event is generated on untagged traffic and a substream CD table entry with V=0 (removed pasid) will generate C_BAD_CD. Arguably there is no advantage to using S1DSS over the CD entry 0 with V=0. As we don't yet support PASID in iommufd this is a problem to resolve later, possibly by using EPD0 for unused CD table entries instead of V=0, and not using S1DSS for BLOCKED. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/11-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit ce26ea9e6e12df01432bd2a1cb8cbfa025b8a977) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 2 +- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 60 +++++++++++++++---- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +- 3 files changed, 50 insertions(+), 16 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c index 68aa1c61fdb38..8608504fe7865 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c @@ -164,7 +164,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste, .smmu = &smmu, }; - arm_smmu_make_cdtable_ste(ste, &master, true); + arm_smmu_make_cdtable_ste(ste, &master, true, STRTAB_STE_1_S1DSS_SSID0); } static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 5edc7a9d77487..fe9afb6497f77 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -996,6 +996,14 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits) STRTAB_STE_1_S1STALLD | STRTAB_STE_1_STRW | STRTAB_STE_1_EATS); used_bits[2] |= cpu_to_le64(STRTAB_STE_2_S2VMID); + + /* + * See 13.5 Summary of attribute/permission configuration fields + * for the SHCFG behavior. + */ + if (FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent[1])) == + STRTAB_STE_1_S1DSS_BYPASS) + used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG); } /* S2 translates */ @@ -1530,7 +1538,8 @@ void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, VISIBLE_IF_KUNIT void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, - struct arm_smmu_master *master, bool ats_enabled) + struct arm_smmu_master *master, bool ats_enabled, + unsigned int s1dss) { struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table; struct arm_smmu_device *smmu = master->smmu; @@ -1544,7 +1553,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, FIELD_PREP(STRTAB_STE_0_S1CDMAX, cd_table->s1cdmax)); target->data[1] = cpu_to_le64( - FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | + FIELD_PREP(STRTAB_STE_1_S1DSS, s1dss) | FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) | @@ -1555,6 +1564,11 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, FIELD_PREP(STRTAB_STE_1_EATS, ats_enabled ? STRTAB_STE_1_EATS_TRANS : 0)); + if ((smmu->features & ARM_SMMU_FEAT_ATTR_TYPES_OVR) && + s1dss == STRTAB_STE_1_S1DSS_BYPASS) + target->data[1] |= cpu_to_le64(FIELD_PREP( + STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); + if (smmu->features & ARM_SMMU_FEAT_E2H) { /* * To support BTM the streamworld needs to match the @@ -2592,6 +2606,7 @@ struct arm_smmu_attach_state { /* Inputs */ struct iommu_domain *old_domain; struct arm_smmu_master *master; + bool cd_needs_ats; ioasid_t ssid; /* Resulting state */ bool ats_enabled; @@ -2633,7 +2648,7 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state, */ lockdep_assert_held(&arm_smmu_asid_lock); - if (smmu_domain) { + if (smmu_domain || state->cd_needs_ats) { /* * The SMMU does not support enabling ATS with bypass/abort. * When the STE is in bypass (STE.Config[2:0] == 0b100), ATS @@ -2645,7 +2660,9 @@ static int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state, * tables. */ state->ats_enabled = arm_smmu_ats_supported(master); + } + if (smmu_domain) { master_domain = kzalloc(sizeof(*master_domain), GFP_KERNEL); if (!master_domain) return -ENOMEM; @@ -2773,7 +2790,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); arm_smmu_write_cd_entry(master, IOMMU_NO_PASID, cdptr, &target_cd); - arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled); + arm_smmu_make_cdtable_ste(&target, master, state.ats_enabled, + STRTAB_STE_1_S1DSS_SSID0); arm_smmu_install_ste_for_dev(master, &target); break; } @@ -2847,8 +2865,10 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid, mutex_unlock(&arm_smmu_asid_lock); } -static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, - struct device *dev, struct arm_smmu_ste *ste) +static void arm_smmu_attach_dev_ste(struct iommu_domain *domain, + struct device *dev, + struct arm_smmu_ste *ste, + unsigned int s1dss) { struct arm_smmu_master *master = dev_iommu_priv_get(dev); struct arm_smmu_attach_state state = { @@ -2857,16 +2877,28 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, .ssid = IOMMU_NO_PASID, }; - if (arm_smmu_ssids_in_use(&master->cd_table)) - return -EBUSY; - /* * Do not allow any ASID to be changed while are working on the STE, * otherwise we could miss invalidations. */ mutex_lock(&arm_smmu_asid_lock); - arm_smmu_attach_prepare(&state, domain); + /* + * If the CD table is not in use we can use the provided STE, otherwise + * we use a cdtable STE with the provided S1DSS. + */ + if (arm_smmu_ssids_in_use(&master->cd_table)) { + /* + * If a CD table has to be present then we need to run with ATS + * on even though the RID will fail ATS queries with UR. This is + * because we have no idea what the PASID's need. + */ + state.cd_needs_ats = true; + arm_smmu_attach_prepare(&state, domain); + arm_smmu_make_cdtable_ste(ste, master, state.ats_enabled, s1dss); + } else { + arm_smmu_attach_prepare(&state, domain); + } arm_smmu_install_ste_for_dev(master, ste); arm_smmu_attach_commit(&state); mutex_unlock(&arm_smmu_asid_lock); @@ -2877,7 +2909,6 @@ static int arm_smmu_attach_dev_ste(struct iommu_domain *domain, * descriptor from arm_smmu_share_asid(). */ arm_smmu_clear_cd(master, IOMMU_NO_PASID); - return 0; } static int arm_smmu_attach_dev_identity(struct iommu_domain *domain, @@ -2887,7 +2918,8 @@ static int arm_smmu_attach_dev_identity(struct iommu_domain *domain, struct arm_smmu_master *master = dev_iommu_priv_get(dev); arm_smmu_make_bypass_ste(master->smmu, &ste); - return arm_smmu_attach_dev_ste(domain, dev, &ste); + arm_smmu_attach_dev_ste(domain, dev, &ste, STRTAB_STE_1_S1DSS_BYPASS); + return 0; } static const struct iommu_domain_ops arm_smmu_identity_ops = { @@ -2905,7 +2937,9 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain, struct arm_smmu_ste ste; arm_smmu_make_abort_ste(&ste); - return arm_smmu_attach_dev_ste(domain, dev, &ste); + arm_smmu_attach_dev_ste(domain, dev, &ste, + STRTAB_STE_1_S1DSS_TERMINATE); + return 0; } static const struct iommu_domain_ops arm_smmu_blocked_ops = { diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index a5aa50d916ed9..ada45530074c7 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -762,8 +762,8 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target); void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, struct arm_smmu_ste *target); void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, - struct arm_smmu_master *master, - bool ats_enabled); + struct arm_smmu_master *master, bool ats_enabled, + unsigned int s1dss); void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, From 67306188f2863fa714a956b551fc74a9b002f657 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:43 -0300 Subject: [PATCH 538/548] iommu/arm-smmu-v3: Test the STE S1DSS functionality S1DSS brings in quite a few new transition pairs that are interesting. Test to/from S1DSS_BYPASS <-> S1DSS_SSID0, and BYPASS <-> S1DSS_SSID0. Test a contrived non-hitless flow to make sure that the logic works. Tested-by: Nicolin Chen Signed-off-by: Michael Shavit Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/12-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 3b5302cbb06af6b62022360066944a1ff6aea0d1) Signed-off-by: Koba Ko --- .../iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 113 +++++++++++++++++- 1 file changed, 108 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c index 8608504fe7865..ee90dee4865c8 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c @@ -144,6 +144,14 @@ static void arm_smmu_v3_test_ste_expect_transition( KUNIT_EXPECT_MEMEQ(test, target->data, cur_copy.data, sizeof(cur_copy)); } +static void arm_smmu_v3_test_ste_expect_non_hitless_transition( + struct kunit *test, const struct arm_smmu_ste *cur, + const struct arm_smmu_ste *target, unsigned int num_syncs_expected) +{ + arm_smmu_v3_test_ste_expect_transition(test, cur, target, + num_syncs_expected, false); +} + static void arm_smmu_v3_test_ste_expect_hitless_transition( struct kunit *test, const struct arm_smmu_ste *cur, const struct arm_smmu_ste *target, unsigned int num_syncs_expected) @@ -155,6 +163,7 @@ static void arm_smmu_v3_test_ste_expect_hitless_transition( static const dma_addr_t fake_cdtab_dma_addr = 0xF0F0F0F0F0F0; static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste, + unsigned int s1dss, const dma_addr_t dma_addr) { struct arm_smmu_master master = { @@ -164,7 +173,7 @@ static void arm_smmu_test_make_cdtable_ste(struct arm_smmu_ste *ste, .smmu = &smmu, }; - arm_smmu_make_cdtable_ste(ste, &master, true, STRTAB_STE_1_S1DSS_SSID0); + arm_smmu_make_cdtable_ste(ste, &master, true, s1dss); } static void arm_smmu_v3_write_ste_test_bypass_to_abort(struct kunit *test) @@ -194,7 +203,8 @@ static void arm_smmu_v3_write_ste_test_cdtable_to_abort(struct kunit *test) { struct arm_smmu_ste ste; - arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &abort_ste, NUM_EXPECTED_SYNCS(2)); } @@ -203,7 +213,8 @@ static void arm_smmu_v3_write_ste_test_abort_to_cdtable(struct kunit *test) { struct arm_smmu_ste ste; - arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); arm_smmu_v3_test_ste_expect_hitless_transition(test, &abort_ste, &ste, NUM_EXPECTED_SYNCS(2)); } @@ -212,7 +223,8 @@ static void arm_smmu_v3_write_ste_test_cdtable_to_bypass(struct kunit *test) { struct arm_smmu_ste ste; - arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); arm_smmu_v3_test_ste_expect_hitless_transition(test, &ste, &bypass_ste, NUM_EXPECTED_SYNCS(3)); } @@ -221,11 +233,54 @@ static void arm_smmu_v3_write_ste_test_bypass_to_cdtable(struct kunit *test) { struct arm_smmu_ste ste; - arm_smmu_test_make_cdtable_ste(&ste, fake_cdtab_dma_addr); + arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); arm_smmu_v3_test_ste_expect_hitless_transition(test, &bypass_ste, &ste, NUM_EXPECTED_SYNCS(3)); } +static void arm_smmu_v3_write_ste_test_cdtable_s1dss_change(struct kunit *test) +{ + struct arm_smmu_ste ste; + struct arm_smmu_ste s1dss_bypass; + + arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); + arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS, + fake_cdtab_dma_addr); + + /* + * Flipping s1dss on a CD table STE only involves changes to the second + * qword of an STE and can be done in a single write. + */ + arm_smmu_v3_test_ste_expect_hitless_transition( + test, &ste, &s1dss_bypass, NUM_EXPECTED_SYNCS(1)); + arm_smmu_v3_test_ste_expect_hitless_transition( + test, &s1dss_bypass, &ste, NUM_EXPECTED_SYNCS(1)); +} + +static void +arm_smmu_v3_write_ste_test_s1dssbypass_to_stebypass(struct kunit *test) +{ + struct arm_smmu_ste s1dss_bypass; + + arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS, + fake_cdtab_dma_addr); + arm_smmu_v3_test_ste_expect_hitless_transition( + test, &s1dss_bypass, &bypass_ste, NUM_EXPECTED_SYNCS(2)); +} + +static void +arm_smmu_v3_write_ste_test_stebypass_to_s1dssbypass(struct kunit *test) +{ + struct arm_smmu_ste s1dss_bypass; + + arm_smmu_test_make_cdtable_ste(&s1dss_bypass, STRTAB_STE_1_S1DSS_BYPASS, + fake_cdtab_dma_addr); + arm_smmu_v3_test_ste_expect_hitless_transition( + test, &bypass_ste, &s1dss_bypass, NUM_EXPECTED_SYNCS(2)); +} + static void arm_smmu_test_make_s2_ste(struct arm_smmu_ste *ste, bool ats_enabled) { @@ -285,6 +340,48 @@ static void arm_smmu_v3_write_ste_test_bypass_to_s2(struct kunit *test) NUM_EXPECTED_SYNCS(2)); } +static void arm_smmu_v3_write_ste_test_s1_to_s2(struct kunit *test) +{ + struct arm_smmu_ste s1_ste; + struct arm_smmu_ste s2_ste; + + arm_smmu_test_make_cdtable_ste(&s1_ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); + arm_smmu_test_make_s2_ste(&s2_ste, true); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &s1_ste, &s2_ste, + NUM_EXPECTED_SYNCS(3)); +} + +static void arm_smmu_v3_write_ste_test_s2_to_s1(struct kunit *test) +{ + struct arm_smmu_ste s1_ste; + struct arm_smmu_ste s2_ste; + + arm_smmu_test_make_cdtable_ste(&s1_ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); + arm_smmu_test_make_s2_ste(&s2_ste, true); + arm_smmu_v3_test_ste_expect_hitless_transition(test, &s2_ste, &s1_ste, + NUM_EXPECTED_SYNCS(3)); +} + +static void arm_smmu_v3_write_ste_test_non_hitless(struct kunit *test) +{ + struct arm_smmu_ste ste; + struct arm_smmu_ste ste_2; + + /* + * Although no flow resembles this in practice, one way to force an STE + * update to be non-hitless is to change its CD table pointer as well as + * s1 dss field in the same update. + */ + arm_smmu_test_make_cdtable_ste(&ste, STRTAB_STE_1_S1DSS_SSID0, + fake_cdtab_dma_addr); + arm_smmu_test_make_cdtable_ste(&ste_2, STRTAB_STE_1_S1DSS_BYPASS, + 0x4B4B4b4B4B); + arm_smmu_v3_test_ste_expect_non_hitless_transition( + test, &ste, &ste_2, NUM_EXPECTED_SYNCS(3)); +} + static void arm_smmu_v3_test_cd_expect_transition( struct kunit *test, const struct arm_smmu_cd *cur, const struct arm_smmu_cd *target, unsigned int num_syncs_expected, @@ -438,10 +535,16 @@ static struct kunit_case arm_smmu_v3_test_cases[] = { KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_cdtable), KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_to_bypass), KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_cdtable), + KUNIT_CASE(arm_smmu_v3_write_ste_test_cdtable_s1dss_change), + KUNIT_CASE(arm_smmu_v3_write_ste_test_s1dssbypass_to_stebypass), + KUNIT_CASE(arm_smmu_v3_write_ste_test_stebypass_to_s1dssbypass), KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_abort), KUNIT_CASE(arm_smmu_v3_write_ste_test_abort_to_s2), KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_bypass), KUNIT_CASE(arm_smmu_v3_write_ste_test_bypass_to_s2), + KUNIT_CASE(arm_smmu_v3_write_ste_test_s1_to_s2), + KUNIT_CASE(arm_smmu_v3_write_ste_test_s2_to_s1), + KUNIT_CASE(arm_smmu_v3_write_ste_test_non_hitless), KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_clear), KUNIT_CASE(arm_smmu_v3_write_cd_test_s1_change_asid), KUNIT_CASE(arm_smmu_v3_write_cd_test_sva_clear), From 1603c0d9f543eeeea4cc28b1060c7c4038592a7c Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:44 -0300 Subject: [PATCH 539/548] iommu/arm-smmu-v3: Allow a PASID to be set when RID is IDENTITY/BLOCKED If the STE doesn't point to the CD table we can upgrade it by reprogramming the STE with the appropriate S1DSS. We may also need to turn on ATS at the same time. Keep track if the installed STE is pointing at the cd_table and the ATS state to trigger this path. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/13-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit 8ee9175c25827240dd84a7adffbfa9c16938ac5d) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49 ++++++++++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 3 +- 2 files changed, 49 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index fe9afb6497f77..6caa842363ed0 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2448,6 +2448,9 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master, master->cd_table.in_ste = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(target->data[0])) == STRTAB_STE_0_CFG_S1_TRANS; + master->ste_ats_enabled = + FIELD_GET(STRTAB_STE_1_EATS, le64_to_cpu(target->data[1])) == + STRTAB_STE_1_EATS_TRANS; for (i = 0; i < master->num_streams; ++i) { u32 sid = master->streams[i].id; @@ -2808,10 +2811,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return 0; } +static void arm_smmu_update_ste(struct arm_smmu_master *master, + struct iommu_domain *sid_domain, + bool ats_enabled) +{ + unsigned int s1dss = STRTAB_STE_1_S1DSS_TERMINATE; + struct arm_smmu_ste ste; + + if (master->cd_table.in_ste && master->ste_ats_enabled == ats_enabled) + return; + + if (sid_domain->type == IOMMU_DOMAIN_IDENTITY) + s1dss = STRTAB_STE_1_S1DSS_BYPASS; + else + WARN_ON(sid_domain->type != IOMMU_DOMAIN_BLOCKED); + + /* + * Change the STE into a cdtable one with SID IDENTITY/BLOCKED behavior + * using s1dss if necessary. If the cd_table is already installed then + * the S1DSS is correct and this will just update the EATS. Otherwise it + * installs the entire thing. This will be hitless. + */ + arm_smmu_make_cdtable_ste(&ste, master, ats_enabled, s1dss); + arm_smmu_install_ste_for_dev(master, &ste); +} + int arm_smmu_set_pasid(struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, ioasid_t pasid, const struct arm_smmu_cd *cd) { + struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev); struct arm_smmu_attach_state state = { .master = master, /* @@ -2828,8 +2857,10 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, if (smmu_domain->smmu != master->smmu) return -EINVAL; - if (!master->cd_table.in_ste) - return -ENODEV; + if (!master->cd_table.in_ste && + sid_domain->type != IOMMU_DOMAIN_IDENTITY && + sid_domain->type != IOMMU_DOMAIN_BLOCKED) + return -EINVAL; cdptr = arm_smmu_alloc_cd_ptr(master, pasid); if (!cdptr) @@ -2841,6 +2872,7 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, goto out_unlock; arm_smmu_write_cd_entry(master, pasid, cdptr, cd); + arm_smmu_update_ste(master, sid_domain, state.ats_enabled); arm_smmu_attach_commit(&state); @@ -2863,6 +2895,19 @@ static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid, arm_smmu_atc_inv_master(master, pasid); arm_smmu_remove_master_domain(master, &smmu_domain->domain, pasid); mutex_unlock(&arm_smmu_asid_lock); + + /* + * When the last user of the CD table goes away downgrade the STE back + * to a non-cd_table one. + */ + if (!arm_smmu_ssids_in_use(&master->cd_table)) { + struct iommu_domain *sid_domain = + iommu_get_domain_for_dev(master->dev); + + if (sid_domain->type == IOMMU_DOMAIN_IDENTITY || + sid_domain->type == IOMMU_DOMAIN_BLOCKED) + sid_domain->ops->attach_dev(sid_domain, dev); + } } static void arm_smmu_attach_dev_ste(struct iommu_domain *domain, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index ada45530074c7..e240969addb4a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -705,7 +705,8 @@ struct arm_smmu_master { /* Locked by the iommu core using the group mutex */ struct arm_smmu_ctx_desc_cfg cd_table; unsigned int num_streams; - bool ats_enabled; + bool ats_enabled : 1; + bool ste_ats_enabled : 1; bool stall_enabled; bool sva_enabled; bool iopf_enabled; From e807d26bea627346395317b23c861f948ce98bf7 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 25 Jun 2024 09:37:45 -0300 Subject: [PATCH 540/548] iommu/arm-smmu-v3: Allow setting a S1 domain to a PASID The SVA cleanup made the SSID logic entirely general so all we need to do is call it with the correct cd table entry for a S1 domain. This is slightly tricky because of the ASID and how the locking works, the simple fix is to just update the ASID once we get the right locks. Tested-by: Nicolin Chen Tested-by: Shameer Kolothum Reviewed-by: Nicolin Chen Reviewed-by: Jerry Snitselaar Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/14-v9-5cd718286059+79186-smmuv3_newapi_p2b_jgg@nvidia.com Signed-off-by: Will Deacon (cherry picked from commit f3b273b7c7e42ff7ef5b6063834d768d33c7ba79) Signed-off-by: Koba Ko --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +- 2 files changed, 41 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 6caa842363ed0..f380847888c45 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2811,6 +2811,36 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) return 0; } +static int arm_smmu_s1_set_dev_pasid(struct iommu_domain *domain, + struct device *dev, ioasid_t id) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_master *master = dev_iommu_priv_get(dev); + struct arm_smmu_device *smmu = master->smmu; + struct arm_smmu_cd target_cd; + int ret = 0; + + mutex_lock(&smmu_domain->init_mutex); + if (!smmu_domain->smmu) + ret = arm_smmu_domain_finalise(smmu_domain, smmu); + else if (smmu_domain->smmu != smmu) + ret = -EINVAL; + mutex_unlock(&smmu_domain->init_mutex); + if (ret) + return ret; + + if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) + return -EINVAL; + + /* + * We can read cd.asid outside the lock because arm_smmu_set_pasid() + * will fix it + */ + arm_smmu_make_s1_cd(&target_cd, master, smmu_domain); + return arm_smmu_set_pasid(master, to_smmu_domain(domain), id, + &target_cd); +} + static void arm_smmu_update_ste(struct arm_smmu_master *master, struct iommu_domain *sid_domain, bool ats_enabled) @@ -2838,7 +2868,7 @@ static void arm_smmu_update_ste(struct arm_smmu_master *master, int arm_smmu_set_pasid(struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, ioasid_t pasid, - const struct arm_smmu_cd *cd) + struct arm_smmu_cd *cd) { struct iommu_domain *sid_domain = iommu_get_domain_for_dev(master->dev); struct arm_smmu_attach_state state = { @@ -2871,6 +2901,14 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master, if (ret) goto out_unlock; + /* + * We don't want to obtain to the asid_lock too early, so fix up the + * caller set ASID under the lock in case it changed. + */ + cd->data[0] &= ~cpu_to_le64(CTXDESC_CD_0_ASID); + cd->data[0] |= cpu_to_le64( + FIELD_PREP(CTXDESC_CD_0_ASID, smmu_domain->cd.asid)); + arm_smmu_write_cd_entry(master, pasid, cdptr, cd); arm_smmu_update_ste(master, sid_domain, state.ats_enabled); @@ -3383,6 +3421,7 @@ static struct iommu_ops arm_smmu_ops = { .owner = THIS_MODULE, .default_domain_ops = &(const struct iommu_domain_ops) { .attach_dev = arm_smmu_attach_dev, + .set_dev_pasid = arm_smmu_s1_set_dev_pasid, .map_pages = arm_smmu_map_pages, .unmap_pages = arm_smmu_unmap_pages, .flush_iotlb_all = arm_smmu_flush_iotlb_all, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index e240969addb4a..739afc7d8fc7f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -802,7 +802,7 @@ void arm_smmu_write_cd_entry(struct arm_smmu_master *master, int ssid, int arm_smmu_set_pasid(struct arm_smmu_master *master, struct arm_smmu_domain *smmu_domain, ioasid_t pasid, - const struct arm_smmu_cd *cd); + struct arm_smmu_cd *cd); void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid); void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid, From 3eb476df3f208eeae400c8c7f907da611669eed2 Mon Sep 17 00:00:00 2001 From: Koba Ko Date: Thu, 28 Aug 2025 06:44:47 -0700 Subject: [PATCH 541/548] arm64: configs: Add kernel configuration for 6.6.63 with 64k pages Add a comprehensive kernel configuration file for Linux 6.6.63 targeting ARM64 architecture with 64k page size. This configuration enables essential features for NVIDIA ARM64 platforms including: - 64k page size configuration (CONFIG_ARM64_64K_PAGES) - Full ARM64 architecture support with NUMA balancing - BPF subsystem with JIT compilation - Control groups (cgroups) support for resource management - Memory controller with swap support - CPU isolation and RCU subsystem configuration - Process accounting and task statistics - Standard kernel debugging and security features The configuration is built with GCC 12.2.0 and includes the build salt "6.6.y.bsk.z-64k-arm64" to identify this specific kernel build variant. This configuration serves as the baseline for NVIDIA ARM64 kernel builds on the linux-nvidia-6.6 branch. Signed-off-by: Koba Ko --- config-6.6.63 | 10949 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 10949 insertions(+) create mode 100644 config-6.6.63 diff --git a/config-6.6.63 b/config-6.6.63 new file mode 100644 index 0000000000000..d4d63c9ae0037 --- /dev/null +++ b/config-6.6.63 @@ -0,0 +1,10949 @@ +# +# Automatically generated file; DO NOT EDIT. +# Linux/arm64 6.6.63 Kernel Configuration +# +CONFIG_CC_VERSION_TEXT="gcc (Debian 12.2.0-14+deb12u1) 12.2.0" +CONFIG_CC_IS_GCC=y +CONFIG_GCC_VERSION=120200 +CONFIG_CLANG_VERSION=0 +CONFIG_AS_IS_GNU=y +CONFIG_AS_VERSION=24000 +CONFIG_LD_IS_BFD=y +CONFIG_LD_VERSION=24000 +CONFIG_LLD_VERSION=0 +CONFIG_CC_CAN_LINK=y +CONFIG_CC_CAN_LINK_STATIC=y +CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y +CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y +CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUND=y +CONFIG_CC_HAS_ASM_INLINE=y +CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y +CONFIG_PAHOLE_VERSION=124 +CONFIG_IRQ_WORK=y +CONFIG_BUILDTIME_TABLE_SORT=y +CONFIG_THREAD_INFO_IN_TASK=y + +# +# General setup +# +CONFIG_INIT_ENV_ARG_LIMIT=32 +# CONFIG_COMPILE_TEST is not set +# CONFIG_WERROR is not set +CONFIG_LOCALVERSION="" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_BUILD_SALT="6.6.y.bsk.z-64k-arm64" +CONFIG_DEFAULT_INIT="" +CONFIG_DEFAULT_HOSTNAME="(none)" +CONFIG_SYSVIPC=y +CONFIG_SYSVIPC_SYSCTL=y +CONFIG_SYSVIPC_COMPAT=y +CONFIG_POSIX_MQUEUE=y +CONFIG_POSIX_MQUEUE_SYSCTL=y +# CONFIG_WATCH_QUEUE is not set +CONFIG_CROSS_MEMORY_ATTACH=y +# CONFIG_USELIB is not set +CONFIG_AUDIT=y +CONFIG_HAVE_ARCH_AUDITSYSCALL=y +CONFIG_AUDITSYSCALL=y + +# +# IRQ subsystem +# +CONFIG_GENERIC_IRQ_PROBE=y +CONFIG_GENERIC_IRQ_SHOW=y +CONFIG_GENERIC_IRQ_SHOW_LEVEL=y +CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y +CONFIG_GENERIC_IRQ_MIGRATION=y +CONFIG_GENERIC_IRQ_INJECTION=y +CONFIG_HARDIRQS_SW_RESEND=y +CONFIG_GENERIC_IRQ_CHIP=y +CONFIG_IRQ_DOMAIN=y +CONFIG_IRQ_DOMAIN_HIERARCHY=y +CONFIG_IRQ_FASTEOI_HIERARCHY_HANDLERS=y +CONFIG_GENERIC_IRQ_IPI=y +CONFIG_GENERIC_MSI_IRQ=y +CONFIG_IRQ_MSI_IOMMU=y +CONFIG_IRQ_FORCED_THREADING=y +CONFIG_SPARSE_IRQ=y +# CONFIG_GENERIC_IRQ_DEBUGFS is not set +# end of IRQ subsystem + +CONFIG_GENERIC_TIME_VSYSCALL=y +CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_ARCH_HAS_TICK_BROADCAST=y +CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y +CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_CONTEXT_TRACKING=y +CONFIG_CONTEXT_TRACKING_IDLE=y + +# +# Timers subsystem +# +CONFIG_TICK_ONESHOT=y +CONFIG_NO_HZ_COMMON=y +# CONFIG_HZ_PERIODIC is not set +CONFIG_NO_HZ_IDLE=y +# CONFIG_NO_HZ_FULL is not set +# CONFIG_NO_HZ is not set +CONFIG_HIGH_RES_TIMERS=y +# end of Timers subsystem + +CONFIG_BPF=y +CONFIG_HAVE_EBPF_JIT=y +CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y + +# +# BPF subsystem +# +CONFIG_BPF_SYSCALL=y +CONFIG_BPF_JIT=y +# CONFIG_BPF_JIT_ALWAYS_ON is not set +CONFIG_BPF_JIT_DEFAULT_ON=y +# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set +# CONFIG_BPF_PRELOAD is not set +CONFIG_BPF_LSM=y +# end of BPF subsystem + +CONFIG_PREEMPT_BUILD=y +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set +# CONFIG_PREEMPT is not set +CONFIG_PREEMPT_COUNT=y +CONFIG_PREEMPTION=y +CONFIG_PREEMPT_DYNAMIC=y +# CONFIG_SCHED_CORE is not set + +# +# CPU/Task time and stats accounting +# +CONFIG_TICK_CPU_ACCOUNTING=y +# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set +# CONFIG_IRQ_TIME_ACCOUNTING is not set +CONFIG_HAVE_SCHED_AVG_IRQ=y +CONFIG_SCHED_THERMAL_PRESSURE=y +CONFIG_BSD_PROCESS_ACCT=y +CONFIG_BSD_PROCESS_ACCT_V3=y +CONFIG_TASKSTATS=y +CONFIG_TASK_DELAY_ACCT=y +CONFIG_TASK_XACCT=y +CONFIG_TASK_IO_ACCOUNTING=y +CONFIG_PSI=y +# CONFIG_PSI_DEFAULT_DISABLED is not set +# end of CPU/Task time and stats accounting + +CONFIG_CPU_ISOLATION=y + +# +# RCU Subsystem +# +CONFIG_TREE_RCU=y +CONFIG_PREEMPT_RCU=y +# CONFIG_RCU_EXPERT is not set +CONFIG_TREE_SRCU=y +CONFIG_TASKS_RCU_GENERIC=y +CONFIG_TASKS_RCU=y +CONFIG_TASKS_RUDE_RCU=y +CONFIG_TASKS_TRACE_RCU=y +CONFIG_RCU_STALL_COMMON=y +CONFIG_RCU_NEED_SEGCBLIST=y +# end of RCU Subsystem + +# CONFIG_IKCONFIG is not set +# CONFIG_IKHEADERS is not set +CONFIG_LOG_BUF_SHIFT=17 +CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 +# CONFIG_PRINTK_INDEX is not set +CONFIG_GENERIC_SCHED_CLOCK=y + +# +# Scheduler features +# +# CONFIG_UCLAMP_TASK is not set +# end of Scheduler features + +CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y +CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y +CONFIG_CC_HAS_INT128=y +CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5" +CONFIG_GCC10_NO_ARRAY_BOUNDS=y +CONFIG_CC_NO_ARRAY_BOUNDS=y +CONFIG_ARCH_SUPPORTS_INT128=y +CONFIG_NUMA_BALANCING=y +# CONFIG_NUMA_BALANCING_DEFAULT_ENABLED is not set +CONFIG_CGROUPS=y +CONFIG_PAGE_COUNTER=y +# CONFIG_CGROUP_FAVOR_DYNMODS is not set +CONFIG_MEMCG=y +CONFIG_MEMCG_SWAP=y +CONFIG_MEMCG_BGD_RECLAIM=y +CONFIG_MEMCG_KMEM=y +CONFIG_BLK_CGROUP=y +CONFIG_CGROUP_WRITEBACK=y +CONFIG_CGROUP_SCHED=y +CONFIG_FAIR_GROUP_SCHED=y +CONFIG_CFS_BANDWIDTH=y +# CONFIG_RT_GROUP_SCHED is not set +CONFIG_SCHED_MM_CID=y +CONFIG_CGROUP_PIDS=y +CONFIG_CGROUP_RDMA=y +CONFIG_CGROUP_FREEZER=y +CONFIG_CGROUP_HUGETLB=y +CONFIG_CPUSETS=y +CONFIG_PROC_PID_CPUSET=y +CONFIG_CGROUP_DEVICE=y +CONFIG_CGROUP_CPUACCT=y +CONFIG_CGROUP_PERF=y +CONFIG_CGROUP_BPF=y +# CONFIG_CGROUP_MISC is not set +# CONFIG_CGROUP_DEBUG is not set +CONFIG_SOCK_CGROUP_DATA=y +CONFIG_NAMESPACES=y +CONFIG_UTS_NS=y +CONFIG_TIME_NS=y +CONFIG_IPC_NS=y +CONFIG_USER_NS=y +CONFIG_PID_NS=y +CONFIG_NET_NS=y +CONFIG_CHECKPOINT_RESTORE=y +CONFIG_SCHED_AUTOGROUP=y +CONFIG_RELAY=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="" +CONFIG_RD_GZIP=y +CONFIG_RD_BZIP2=y +CONFIG_RD_LZMA=y +CONFIG_RD_XZ=y +CONFIG_RD_LZO=y +CONFIG_RD_LZ4=y +CONFIG_RD_ZSTD=y +# CONFIG_BOOT_CONFIG is not set +CONFIG_INITRAMFS_PRESERVE_MTIME=y +CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_LD_ORPHAN_WARN=y +CONFIG_LD_ORPHAN_WARN_LEVEL="warn" +CONFIG_SYSCTL=y +CONFIG_HAVE_UID16=y +CONFIG_SYSCTL_EXCEPTION_TRACE=y +CONFIG_EXPERT=y +CONFIG_UID16=y +CONFIG_MULTIUSER=y +# CONFIG_SGETMASK_SYSCALL is not set +# CONFIG_SYSFS_SYSCALL is not set +CONFIG_FHANDLE=y +CONFIG_POSIX_TIMERS=y +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_ELF_CORE=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_FUTEX_PI=y +CONFIG_EPOLL=y +CONFIG_SIGNALFD=y +CONFIG_TIMERFD=y +CONFIG_EVENTFD=y +CONFIG_SHMEM=y +CONFIG_AIO=y +CONFIG_IO_URING=y +CONFIG_ADVISE_SYSCALLS=y +CONFIG_MEMBARRIER=y +CONFIG_KALLSYMS=y +# CONFIG_KALLSYMS_SELFTEST is not set +CONFIG_KALLSYMS_ALL=y +CONFIG_KALLSYMS_BASE_RELATIVE=y +CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y +CONFIG_KCMP=y +CONFIG_RSEQ=y +CONFIG_CACHESTAT_SYSCALL=y +# CONFIG_DEBUG_RSEQ is not set +CONFIG_HAVE_PERF_EVENTS=y +CONFIG_GUEST_PERF_EVENTS=y +# CONFIG_PC104 is not set + +# +# Kernel Performance Events And Counters +# +CONFIG_PERF_EVENTS=y +# CONFIG_DEBUG_PERF_USE_VMALLOC is not set +# end of Kernel Performance Events And Counters + +CONFIG_SYSTEM_DATA_VERIFICATION=y +CONFIG_PROFILING=y +CONFIG_TRACEPOINTS=y + +# +# Kexec and crash features +# +CONFIG_CRASH_CORE=y +CONFIG_KEXEC_CORE=y +CONFIG_KEXEC=y +CONFIG_KEXEC_FILE=y +# CONFIG_KEXEC_SIG is not set +CONFIG_CRASH_DUMP=y +# end of Kexec and crash features +# end of General setup + +CONFIG_ARM64=y +CONFIG_GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_64BIT=y +CONFIG_MMU=y +CONFIG_ARM64_PAGE_SHIFT=16 +CONFIG_ARM64_CONT_PTE_SHIFT=5 +CONFIG_ARM64_CONT_PMD_SHIFT=5 +CONFIG_ARCH_MMAP_RND_BITS_MIN=14 +CONFIG_ARCH_MMAP_RND_BITS_MAX=29 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 +CONFIG_STACKTRACE_SUPPORT=y +CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000 +CONFIG_LOCKDEP_SUPPORT=y +CONFIG_GENERIC_BUG=y +CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y +CONFIG_GENERIC_HWEIGHT=y +CONFIG_GENERIC_CSUM=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_SMP=y +CONFIG_KERNEL_MODE_NEON=y +CONFIG_FIX_EARLYCON_MEM=y +CONFIG_PGTABLE_LEVELS=3 +CONFIG_ARCH_SUPPORTS_UPROBES=y +CONFIG_ARCH_PROC_KCORE_TEXT=y +CONFIG_BUILTIN_RETURN_ADDRESS_STRIPS_PAC=y + +# +# Platform selection +# +# CONFIG_ARCH_ACTIONS is not set +CONFIG_ARCH_SUNXI=y +# CONFIG_ARCH_ALPINE is not set +# CONFIG_ARCH_APPLE is not set +# CONFIG_ARCH_BCM is not set +# CONFIG_ARCH_BERLIN is not set +# CONFIG_ARCH_BITMAIN is not set +# CONFIG_ARCH_EXYNOS is not set +# CONFIG_ARCH_SPARX5 is not set +# CONFIG_ARCH_K3 is not set +# CONFIG_ARCH_LG1K is not set +CONFIG_ARCH_HISI=y +# CONFIG_ARCH_KEEMBAY is not set +# CONFIG_ARCH_MEDIATEK is not set +CONFIG_ARCH_MESON=y +CONFIG_ARCH_MVEBU=y +# CONFIG_ARCH_NXP is not set +# CONFIG_ARCH_MA35 is not set +# CONFIG_ARCH_NPCM is not set +CONFIG_ARCH_QCOM=y +# CONFIG_ARCH_REALTEK is not set +# CONFIG_ARCH_RENESAS is not set +CONFIG_ARCH_ROCKCHIP=y +CONFIG_ARCH_SEATTLE=y +# CONFIG_ARCH_INTEL_SOCFPGA is not set +# CONFIG_ARCH_STM32 is not set +CONFIG_ARCH_SYNQUACER=y +CONFIG_ARCH_TEGRA=y +# CONFIG_ARCH_SPRD is not set +CONFIG_ARCH_THUNDER=y +CONFIG_ARCH_THUNDER2=y +# CONFIG_ARCH_UNIPHIER is not set +CONFIG_ARCH_VEXPRESS=y +# CONFIG_ARCH_VISCONTI is not set +CONFIG_ARCH_XGENE=y +CONFIG_ARCH_ZYNQMP=y +# end of Platform selection + +# +# Kernel Features +# + +# +# ARM errata workarounds via the alternatives framework +# +CONFIG_AMPERE_ERRATUM_AC03_CPU_38=y +CONFIG_ARM64_WORKAROUND_CLEAN_CACHE=y +CONFIG_ARM64_ERRATUM_826319=y +CONFIG_ARM64_ERRATUM_827319=y +CONFIG_ARM64_ERRATUM_824069=y +CONFIG_ARM64_ERRATUM_819472=y +CONFIG_ARM64_ERRATUM_832075=y +CONFIG_ARM64_ERRATUM_834220=y +CONFIG_ARM64_ERRATUM_1742098=y +CONFIG_ARM64_ERRATUM_845719=y +CONFIG_ARM64_ERRATUM_843419=y +CONFIG_ARM64_LD_HAS_FIX_ERRATUM_843419=y +CONFIG_ARM64_ERRATUM_1024718=y +CONFIG_ARM64_ERRATUM_1418040=y +CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT=y +CONFIG_ARM64_ERRATUM_1165522=y +CONFIG_ARM64_ERRATUM_1319367=y +CONFIG_ARM64_ERRATUM_1530923=y +CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y +CONFIG_ARM64_ERRATUM_2441007=y +CONFIG_ARM64_ERRATUM_1286807=y +CONFIG_ARM64_ERRATUM_1463225=y +CONFIG_ARM64_ERRATUM_1542419=y +CONFIG_ARM64_ERRATUM_1508412=y +CONFIG_ARM64_ERRATUM_2051678=y +CONFIG_ARM64_ERRATUM_2077057=y +CONFIG_ARM64_ERRATUM_2658417=y +CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y +CONFIG_ARM64_ERRATUM_2054223=y +CONFIG_ARM64_ERRATUM_2067961=y +CONFIG_ARM64_ERRATUM_2441009=y +CONFIG_ARM64_ERRATUM_2457168=y +CONFIG_ARM64_ERRATUM_2645198=y +CONFIG_ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD=y +CONFIG_ARM64_ERRATUM_2966298=y +CONFIG_ARM64_ERRATUM_3117295=y +CONFIG_ARM64_ERRATUM_3194386=y +CONFIG_CAVIUM_ERRATUM_22375=y +CONFIG_CAVIUM_ERRATUM_23144=y +CONFIG_CAVIUM_ERRATUM_23154=y +CONFIG_CAVIUM_ERRATUM_27456=y +CONFIG_CAVIUM_ERRATUM_30115=y +CONFIG_CAVIUM_TX2_ERRATUM_219=y +CONFIG_FUJITSU_ERRATUM_010001=y +CONFIG_HISILICON_ERRATUM_161600802=y +CONFIG_QCOM_FALKOR_ERRATUM_1003=y +CONFIG_QCOM_FALKOR_ERRATUM_1009=y +CONFIG_QCOM_QDF2400_ERRATUM_0065=y +CONFIG_QCOM_FALKOR_ERRATUM_E1041=y +CONFIG_NVIDIA_CARMEL_CNP_ERRATUM=y +CONFIG_ROCKCHIP_ERRATUM_3588001=y +CONFIG_SOCIONEXT_SYNQUACER_PREITS=y +# end of ARM errata workarounds via the alternatives framework + +# CONFIG_ARM64_4K_PAGES is not set +# CONFIG_ARM64_16K_PAGES is not set +CONFIG_ARM64_64K_PAGES=y +# CONFIG_ARM64_VA_BITS_42 is not set +CONFIG_ARM64_VA_BITS_48=y +# CONFIG_ARM64_VA_BITS_52 is not set +CONFIG_ARM64_VA_BITS=48 +CONFIG_ARM64_PA_BITS_48=y +# CONFIG_ARM64_PA_BITS_52 is not set +CONFIG_ARM64_PA_BITS=48 +# CONFIG_CPU_BIG_ENDIAN is not set +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_SCHED_MC=y +# CONFIG_SCHED_CLUSTER is not set +CONFIG_SCHED_SMT=y +CONFIG_NR_CPUS=512 +CONFIG_HOTPLUG_CPU=y +CONFIG_NUMA=y +CONFIG_NODES_SHIFT=6 +# CONFIG_HZ_100 is not set +CONFIG_HZ_250=y +# CONFIG_HZ_300 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=250 +CONFIG_SCHED_HRTICK=y +CONFIG_ARCH_SPARSEMEM_ENABLE=y +CONFIG_HW_PERF_EVENTS=y +CONFIG_CC_HAVE_SHADOW_CALL_STACK=y +CONFIG_PARAVIRT=y +CONFIG_PARAVIRT_TIME_ACCOUNTING=y +CONFIG_ARCH_SUPPORTS_KEXEC=y +CONFIG_ARCH_SUPPORTS_KEXEC_FILE=y +CONFIG_ARCH_SELECTS_KEXEC_FILE=y +CONFIG_ARCH_SUPPORTS_KEXEC_SIG=y +CONFIG_ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG=y +CONFIG_ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG=y +CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y +CONFIG_TRANS_TABLE=y +CONFIG_XEN_DOM0=y +CONFIG_XEN=y +CONFIG_ARCH_FORCE_MAX_ORDER=13 +CONFIG_UNMAP_KERNEL_AT_EL0=y +CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY=y +CONFIG_RODATA_FULL_DEFAULT_ENABLED=y +# CONFIG_ARM64_SW_TTBR0_PAN is not set +CONFIG_ARM64_TAGGED_ADDR_ABI=y +CONFIG_COMPAT=y +CONFIG_KUSER_HELPERS=y +# CONFIG_COMPAT_ALIGNMENT_FIXUPS is not set +CONFIG_ARMV8_DEPRECATED=y +CONFIG_SWP_EMULATION=y +CONFIG_CP15_BARRIER_EMULATION=y +CONFIG_SETEND_EMULATION=y + +# +# ARMv8.1 architectural features +# +CONFIG_ARM64_HW_AFDBM=y +CONFIG_ARM64_PAN=y +CONFIG_AS_HAS_LSE_ATOMICS=y +CONFIG_ARM64_LSE_ATOMICS=y +CONFIG_ARM64_USE_LSE_ATOMICS=y +# end of ARMv8.1 architectural features + +# +# ARMv8.2 architectural features +# +CONFIG_AS_HAS_ARMV8_2=y +CONFIG_AS_HAS_SHA3=y +CONFIG_ARM64_PMEM=y +CONFIG_ARM64_RAS_EXTN=y +CONFIG_ARM64_CNP=y +# end of ARMv8.2 architectural features + +# +# ARMv8.3 architectural features +# +CONFIG_ARM64_PTR_AUTH=y +CONFIG_ARM64_PTR_AUTH_KERNEL=y +CONFIG_CC_HAS_BRANCH_PROT_PAC_RET=y +CONFIG_CC_HAS_SIGN_RETURN_ADDRESS=y +CONFIG_AS_HAS_ARMV8_3=y +CONFIG_AS_HAS_CFI_NEGATE_RA_STATE=y +CONFIG_AS_HAS_LDAPR=y +# end of ARMv8.3 architectural features + +# +# ARMv8.4 architectural features +# +CONFIG_ARM64_AMU_EXTN=y +CONFIG_AS_HAS_ARMV8_4=y +CONFIG_ARM64_TLB_RANGE=y +# end of ARMv8.4 architectural features + +# +# ARMv8.5 architectural features +# +CONFIG_AS_HAS_ARMV8_5=y +CONFIG_ARM64_BTI=y +CONFIG_CC_HAS_BRANCH_PROT_PAC_RET_BTI=y +CONFIG_ARM64_E0PD=y +CONFIG_ARM64_AS_HAS_MTE=y +CONFIG_ARM64_MTE=y +# end of ARMv8.5 architectural features + +# +# ARMv8.7 architectural features +# +CONFIG_ARM64_EPAN=y +# end of ARMv8.7 architectural features + +CONFIG_ARM64_SVE=y +CONFIG_ARM64_PSEUDO_NMI=y +# CONFIG_ARM64_DEBUG_PRIORITY_MASKING is not set +CONFIG_RELOCATABLE=y +CONFIG_RANDOMIZE_BASE=y +CONFIG_RANDOMIZE_MODULE_REGION_FULL=y +CONFIG_CC_HAVE_STACKPROTECTOR_SYSREG=y +CONFIG_STACKPROTECTOR_PER_TASK=y +CONFIG_ARM64_CONTPTE=y +CONFIG_HAVE_LIVEPATCH=y +CONFIG_LIVEPATCH=y +# end of Kernel Features + +# +# Boot options +# +CONFIG_ARM64_ACPI_PARKING_PROTOCOL=y +CONFIG_CMDLINE="" +CONFIG_EFI_STUB=y +CONFIG_EFI=y +CONFIG_DMI=y +# end of Boot options + +# +# Power management options +# +CONFIG_SUSPEND=y +CONFIG_SUSPEND_FREEZER=y +# CONFIG_SUSPEND_SKIP_SYNC is not set +CONFIG_HIBERNATE_CALLBACKS=y +CONFIG_HIBERNATION=y +CONFIG_HIBERNATION_SNAPSHOT_DEV=y +CONFIG_PM_STD_PARTITION="" +CONFIG_PM_SLEEP=y +CONFIG_PM_SLEEP_SMP=y +# CONFIG_PM_AUTOSLEEP is not set +# CONFIG_PM_USERSPACE_AUTOSLEEP is not set +# CONFIG_PM_WAKELOCKS is not set +CONFIG_PM=y +CONFIG_PM_DEBUG=y +CONFIG_PM_ADVANCED_DEBUG=y +# CONFIG_PM_TEST_SUSPEND is not set +CONFIG_PM_SLEEP_DEBUG=y +# CONFIG_DPM_WATCHDOG is not set +CONFIG_PM_CLK=y +CONFIG_PM_GENERIC_DOMAINS=y +# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set +CONFIG_PM_GENERIC_DOMAINS_SLEEP=y +CONFIG_PM_GENERIC_DOMAINS_OF=y +CONFIG_CPU_PM=y +CONFIG_ENERGY_MODEL=y +CONFIG_ARCH_HIBERNATION_POSSIBLE=y +CONFIG_ARCH_HIBERNATION_HEADER=y +CONFIG_ARCH_SUSPEND_POSSIBLE=y +# end of Power management options + +# +# CPU Power Management +# + +# +# CPU Idle +# +CONFIG_CPU_IDLE=y +CONFIG_CPU_IDLE_GOV_LADDER=y +CONFIG_CPU_IDLE_GOV_MENU=y +# CONFIG_CPU_IDLE_GOV_TEO is not set + +# +# ARM CPU Idle Drivers +# +# CONFIG_ARM_PSCI_CPUIDLE is not set +# end of ARM CPU Idle Drivers +# end of CPU Idle + +# +# CPU Frequency scaling +# +CONFIG_CPU_FREQ=y +CONFIG_CPU_FREQ_GOV_ATTR_SET=y +CONFIG_CPU_FREQ_GOV_COMMON=y +CONFIG_CPU_FREQ_STAT=y +CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y +# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set +CONFIG_CPU_FREQ_GOV_PERFORMANCE=y +CONFIG_CPU_FREQ_GOV_POWERSAVE=m +CONFIG_CPU_FREQ_GOV_USERSPACE=m +CONFIG_CPU_FREQ_GOV_ONDEMAND=m +CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m +CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y + +# +# CPU frequency scaling drivers +# +CONFIG_CPUFREQ_DT=m +CONFIG_CPUFREQ_DT_PLATDEV=y +CONFIG_ACPI_CPPC_CPUFREQ=m +CONFIG_ACPI_CPPC_CPUFREQ_FIE=y +CONFIG_ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM=m +CONFIG_ARM_ARMADA_37XX_CPUFREQ=m +# CONFIG_ARM_ARMADA_8K_CPUFREQ is not set +# CONFIG_ARM_QCOM_CPUFREQ_HW is not set +CONFIG_ARM_TEGRA20_CPUFREQ=m +CONFIG_ARM_TEGRA124_CPUFREQ=y +# CONFIG_ARM_TEGRA186_CPUFREQ is not set +CONFIG_ARM_TEGRA194_CPUFREQ=y +# end of CPU Frequency scaling +# end of CPU Power Management + +CONFIG_ARCH_SUPPORTS_ACPI=y +CONFIG_ACPI=y +CONFIG_ACPI_GENERIC_GSI=y +CONFIG_ACPI_CCA_REQUIRED=y +CONFIG_ACPI_THERMAL_LIB=y +# CONFIG_ACPI_DEBUGGER is not set +CONFIG_ACPI_SPCR_TABLE=y +# CONFIG_ACPI_FPDT is not set +# CONFIG_ACPI_EC_DEBUGFS is not set +CONFIG_ACPI_AC=y +CONFIG_ACPI_BATTERY=y +CONFIG_ACPI_BUTTON=y +CONFIG_ACPI_VIDEO=m +CONFIG_ACPI_FAN=y +CONFIG_ACPI_TAD=m +# CONFIG_ACPI_DOCK is not set +CONFIG_ACPI_PROCESSOR_IDLE=y +CONFIG_ACPI_MCFG=y +CONFIG_ACPI_CPPC_LIB=y +CONFIG_ACPI_PROCESSOR=y +CONFIG_ACPI_IPMI=m +CONFIG_ACPI_HOTPLUG_CPU=y +CONFIG_ACPI_THERMAL=y +CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y +CONFIG_ACPI_TABLE_UPGRADE=y +# CONFIG_ACPI_DEBUG is not set +CONFIG_ACPI_PCI_SLOT=y +CONFIG_ACPI_CONTAINER=y +# CONFIG_ACPI_HOTPLUG_MEMORY is not set +CONFIG_ACPI_HED=y +# CONFIG_ACPI_CUSTOM_METHOD is not set +CONFIG_ACPI_BGRT=y +CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y +CONFIG_ACPI_NFIT=m +# CONFIG_NFIT_SECURITY_DEBUG is not set +CONFIG_ACPI_NUMA=y +# CONFIG_ACPI_HMAT is not set +CONFIG_HAVE_ACPI_APEI=y +CONFIG_ACPI_APEI=y +CONFIG_ACPI_APEI_GHES=y +CONFIG_ACPI_APEI_PCIEAER=y +CONFIG_ACPI_APEI_SEA=y +CONFIG_ACPI_APEI_MEMORY_FAILURE=y +CONFIG_ACPI_APEI_EINJ=m +# CONFIG_ACPI_APEI_ERST_DEBUG is not set +CONFIG_ACPI_WATCHDOG=y +# CONFIG_ACPI_CONFIGFS is not set +# CONFIG_ACPI_PFRUT is not set +CONFIG_ACPI_IORT=y +CONFIG_ACPI_GTDT=y +# CONFIG_ACPI_AGDI is not set +CONFIG_ACPI_APMT=y +CONFIG_ACPI_PPTT=y +CONFIG_ACPI_PCC=y +# CONFIG_ACPI_FFH is not set +# CONFIG_PMIC_OPREGION is not set +CONFIG_ACPI_PRMT=y +CONFIG_IRQ_BYPASS_MANAGER=y +CONFIG_HAVE_KVM=y +CONFIG_HAVE_KVM_IRQCHIP=y +CONFIG_HAVE_KVM_IRQFD=y +CONFIG_HAVE_KVM_IRQ_ROUTING=y +CONFIG_HAVE_KVM_DIRTY_RING=y +CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y +CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP=y +CONFIG_HAVE_KVM_EVENTFD=y +CONFIG_KVM_MMIO=y +CONFIG_HAVE_KVM_MSI=y +CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y +CONFIG_KVM_VFIO=y +CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y +CONFIG_HAVE_KVM_IRQ_BYPASS=y +CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE=y +CONFIG_KVM_XFER_TO_GUEST_WORK=y +CONFIG_KVM_GENERIC_HARDWARE_ENABLING=y +CONFIG_KVM_GENERIC_MMU_NOTIFIER=y +CONFIG_VIRTUALIZATION=y +CONFIG_KVM=y +# CONFIG_NVHE_EL2_DEBUG is not set +CONFIG_CPU_MITIGATIONS=y + +# +# General architecture-dependent options +# +CONFIG_ARCH_HAS_SUBPAGE_FAULTS=y +CONFIG_HOTPLUG_CORE_SYNC=y +CONFIG_HOTPLUG_CORE_SYNC_DEAD=y +CONFIG_KPROBES=y +CONFIG_JUMP_LABEL=y +# CONFIG_STATIC_KEYS_SELFTEST is not set +CONFIG_UPROBES=y +CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y +CONFIG_KRETPROBES=y +CONFIG_HAVE_IOREMAP_PROT=y +CONFIG_HAVE_KPROBES=y +CONFIG_HAVE_KRETPROBES=y +CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y +CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y +CONFIG_HAVE_NMI=y +CONFIG_TRACE_IRQFLAGS_SUPPORT=y +CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y +CONFIG_HAVE_ARCH_TRACEHOOK=y +CONFIG_HAVE_DMA_CONTIGUOUS=y +CONFIG_GENERIC_SMP_IDLE_THREAD=y +CONFIG_GENERIC_IDLE_POLL_SETUP=y +CONFIG_ARCH_HAS_FORTIFY_SOURCE=y +CONFIG_ARCH_HAS_KEEPINITRD=y +CONFIG_ARCH_HAS_SET_MEMORY=y +CONFIG_ARCH_HAS_SET_DIRECT_MAP=y +CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y +CONFIG_ARCH_WANTS_NO_INSTR=y +CONFIG_HAVE_ASM_MODVERSIONS=y +CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y +CONFIG_HAVE_RSEQ=y +CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y +CONFIG_HAVE_HW_BREAKPOINT=y +CONFIG_HAVE_PERF_EVENTS_NMI=y +CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y +CONFIG_HAVE_PERF_REGS=y +CONFIG_HAVE_PERF_USER_STACK_DUMP=y +CONFIG_HAVE_ARCH_JUMP_LABEL=y +CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y +CONFIG_MMU_GATHER_TABLE_FREE=y +CONFIG_MMU_GATHER_RCU_TABLE_FREE=y +CONFIG_MMU_LAZY_TLB_REFCOUNT=y +CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y +CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS=y +CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y +CONFIG_HAVE_CMPXCHG_LOCAL=y +CONFIG_HAVE_CMPXCHG_DOUBLE=y +CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y +CONFIG_HAVE_ARCH_SECCOMP=y +CONFIG_HAVE_ARCH_SECCOMP_FILTER=y +CONFIG_SECCOMP=y +CONFIG_SECCOMP_FILTER=y +# CONFIG_SECCOMP_CACHE_DEBUG is not set +CONFIG_HAVE_ARCH_STACKLEAK=y +CONFIG_HAVE_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR_STRONG=y +CONFIG_ARCH_SUPPORTS_SHADOW_CALL_STACK=y +# CONFIG_SHADOW_CALL_STACK is not set +CONFIG_ARCH_SUPPORTS_LTO_CLANG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y +CONFIG_LTO_NONE=y +CONFIG_ARCH_SUPPORTS_CFI_CLANG=y +CONFIG_HAVE_CONTEXT_TRACKING_USER=y +CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y +CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y +CONFIG_HAVE_MOVE_PUD=y +CONFIG_HAVE_MOVE_PMD=y +CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y +CONFIG_HAVE_ARCH_HUGE_VMAP=y +CONFIG_HAVE_ARCH_HUGE_VMALLOC=y +CONFIG_ARCH_WANT_PMD_MKWRITE=y +CONFIG_HAVE_MOD_ARCH_SPECIFIC=y +CONFIG_MODULES_USE_ELF_RELA=y +CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y +CONFIG_SOFTIRQ_ON_OWN_STACK=y +CONFIG_ARCH_HAS_ELF_RANDOMIZE=y +CONFIG_HAVE_ARCH_MMAP_RND_BITS=y +CONFIG_ARCH_MMAP_RND_BITS=18 +CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y +CONFIG_ARCH_MMAP_RND_COMPAT_BITS=11 +CONFIG_PAGE_SIZE_LESS_THAN_256KB=y +CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT=y +CONFIG_HAVE_RELIABLE_STACKTRACE=y +CONFIG_CLONE_BACKWARDS=y +CONFIG_OLD_SIGSUSPEND3=y +CONFIG_COMPAT_OLD_SIGACTION=y +CONFIG_COMPAT_32BIT_TIME=y +CONFIG_HAVE_ARCH_VMAP_STACK=y +CONFIG_VMAP_STACK=y +CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y +CONFIG_RANDOMIZE_KSTACK_OFFSET=y +# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set +CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y +CONFIG_STRICT_KERNEL_RWX=y +CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y +CONFIG_STRICT_MODULE_RWX=y +CONFIG_HAVE_ARCH_COMPILER_H=y +CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y +CONFIG_ARCH_USE_MEMREMAP_PROT=y +# CONFIG_LOCK_EVENT_COUNTS is not set +CONFIG_ARCH_HAS_RELR=y +CONFIG_HAVE_PREEMPT_DYNAMIC=y +CONFIG_HAVE_PREEMPT_DYNAMIC_KEY=y +CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y +CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y +CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y +CONFIG_ARCH_HAVE_TRACE_MMIO_ACCESS=y + +# +# GCOV-based kernel profiling +# +# CONFIG_GCOV_KERNEL is not set +CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y +# end of GCOV-based kernel profiling + +CONFIG_HAVE_GCC_PLUGINS=y +CONFIG_FUNCTION_ALIGNMENT_4B=y +CONFIG_FUNCTION_ALIGNMENT_8B=y +CONFIG_FUNCTION_ALIGNMENT=8 +# end of General architecture-dependent options + +CONFIG_RT_MUTEXES=y +CONFIG_BASE_SMALL=0 +CONFIG_MODULE_SIG_FORMAT=y +CONFIG_MODULES=y +# CONFIG_MODULE_DEBUG is not set +CONFIG_MODULE_FORCE_LOAD=y +CONFIG_MODULE_UNLOAD=y +CONFIG_MODULE_FORCE_UNLOAD=y +# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set +CONFIG_MODVERSIONS=y +CONFIG_ASM_MODVERSIONS=y +# CONFIG_MODULE_SRCVERSION_ALL is not set +CONFIG_MODULE_SIG=y +# CONFIG_MODULE_SIG_FORCE is not set +# CONFIG_MODULE_SIG_ALL is not set +# CONFIG_MODULE_SIG_SHA1 is not set +# CONFIG_MODULE_SIG_SHA224 is not set +CONFIG_MODULE_SIG_SHA256=y +# CONFIG_MODULE_SIG_SHA384 is not set +# CONFIG_MODULE_SIG_SHA512 is not set +CONFIG_MODULE_SIG_HASH="sha256" +CONFIG_MODULE_COMPRESS_NONE=y +# CONFIG_MODULE_COMPRESS_GZIP is not set +# CONFIG_MODULE_COMPRESS_XZ is not set +# CONFIG_MODULE_COMPRESS_ZSTD is not set +# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set +CONFIG_MODPROBE_PATH="/sbin/modprobe" +# CONFIG_TRIM_UNUSED_KSYMS is not set +CONFIG_MODULES_TREE_LOOKUP=y +CONFIG_BLOCK=y +CONFIG_BLOCK_LEGACY_AUTOLOAD=y +CONFIG_BLK_RQ_ALLOC_TIME=y +CONFIG_BLK_CGROUP_RWSTAT=y +CONFIG_BLK_CGROUP_PUNT_BIO=y +CONFIG_BLK_DEV_BSG_COMMON=y +CONFIG_BLK_ICQ=y +CONFIG_BLK_DEV_BSGLIB=y +CONFIG_BLK_DEV_INTEGRITY=y +CONFIG_BLK_DEV_INTEGRITY_T10=m +CONFIG_BLK_DEV_ZONED=y +CONFIG_BLK_DEV_THROTTLING=y +# CONFIG_BLK_DEV_THROTTLING_LOW is not set +CONFIG_BLK_WBT=y +CONFIG_BLK_WBT_MQ=y +# CONFIG_BLK_CGROUP_IOLATENCY is not set +# CONFIG_BLK_CGROUP_FC_APPID is not set +CONFIG_BLK_CGROUP_IOCOST=y +# CONFIG_BLK_CGROUP_IOPRIO is not set +CONFIG_BLK_DEBUG_FS=y +CONFIG_BLK_DEBUG_FS_ZONED=y +CONFIG_BLK_SED_OPAL=y +# CONFIG_BLK_INLINE_ENCRYPTION is not set + +# +# Partition Types +# +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_AIX_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +# CONFIG_MAC_PARTITION is not set +CONFIG_MSDOS_PARTITION=y +# CONFIG_BSD_DISKLABEL is not set +# CONFIG_MINIX_SUBPARTITION is not set +# CONFIG_SOLARIS_X86_PARTITION is not set +# CONFIG_UNIXWARE_DISKLABEL is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set +CONFIG_KARMA_PARTITION=y +CONFIG_EFI_PARTITION=y +# CONFIG_SYSV68_PARTITION is not set +# CONFIG_CMDLINE_PARTITION is not set +# end of Partition Types + +CONFIG_BLK_MQ_PCI=y +CONFIG_BLK_MQ_VIRTIO=y +CONFIG_BLK_PM=y +CONFIG_BLOCK_HOLDER_DEPRECATED=y +CONFIG_BLK_MQ_STACKING=y + +# +# IO Schedulers +# +CONFIG_MQ_IOSCHED_DEADLINE=y +CONFIG_MQ_IOSCHED_KYBER=m +CONFIG_IOSCHED_BFQ=m +CONFIG_BFQ_GROUP_IOSCHED=y +# CONFIG_BFQ_CGROUP_DEBUG is not set +# end of IO Schedulers + +CONFIG_PREEMPT_NOTIFIERS=y +CONFIG_PADATA=y +CONFIG_ASN1=y +CONFIG_UNINLINE_SPIN_UNLOCK=y +CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y +CONFIG_MUTEX_SPIN_ON_OWNER=y +CONFIG_RWSEM_SPIN_ON_OWNER=y +CONFIG_LOCK_SPIN_ON_OWNER=y +CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y +CONFIG_QUEUED_SPINLOCKS=y +CONFIG_ARCH_USE_QUEUED_RWLOCKS=y +CONFIG_QUEUED_RWLOCKS=y +CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y +CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y +CONFIG_FREEZER=y + +# +# Executable file formats +# +CONFIG_BINFMT_ELF=y +CONFIG_COMPAT_BINFMT_ELF=y +CONFIG_ARCH_BINFMT_ELF_STATE=y +CONFIG_ARCH_BINFMT_ELF_EXTRA_PHDRS=y +CONFIG_ARCH_HAVE_ELF_PROT=y +CONFIG_ARCH_USE_GNU_PROPERTY=y +CONFIG_ELFCORE=y +CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y +CONFIG_BINFMT_SCRIPT=y +CONFIG_BINFMT_MISC=y +CONFIG_COREDUMP=y +# end of Executable file formats + +# +# Memory Management options +# +CONFIG_ZPOOL=y +CONFIG_SWAP=y +CONFIG_ZSWAP=y +# CONFIG_ZSWAP_DEFAULT_ON is not set +# CONFIG_ZSWAP_SHRINKER_DEFAULT_ON is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_DEFLATE is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZO=y +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_842 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4HC is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT="lzo" +CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y +# CONFIG_ZSWAP_ZPOOL_DEFAULT_Z3FOLD_DEPRECATED is not set +# CONFIG_ZSWAP_ZPOOL_DEFAULT_ZSMALLOC is not set +CONFIG_ZSWAP_ZPOOL_DEFAULT="zbud" +CONFIG_ZBUD=y +# CONFIG_Z3FOLD_DEPRECATED is not set +CONFIG_ZSMALLOC=m +# CONFIG_ZSMALLOC_STAT is not set +CONFIG_ZSMALLOC_CHAIN_SIZE=8 + +# +# SLAB allocator options +# +# CONFIG_SLAB_DEPRECATED is not set +CONFIG_SLUB=y +# CONFIG_SLUB_TINY is not set +CONFIG_SLAB_MERGE_DEFAULT=y +CONFIG_SLAB_FREELIST_RANDOM=y +CONFIG_SLAB_FREELIST_HARDENED=y +# CONFIG_SLUB_STATS is not set +CONFIG_SLUB_CPU_PARTIAL=y +# CONFIG_RANDOM_KMALLOC_CACHES is not set +# end of SLAB allocator options + +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SPARSEMEM=y +CONFIG_SPARSEMEM_EXTREME=y +CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y +CONFIG_SPARSEMEM_VMEMMAP=y +CONFIG_HAVE_FAST_GUP=y +CONFIG_ARCH_KEEP_MEMBLOCK=y +CONFIG_NUMA_KEEP_MEMINFO=y +CONFIG_MEMORY_ISOLATION=y +CONFIG_EXCLUSIVE_SYSTEM_RAM=y +CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y +CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y +CONFIG_MEMORY_HOTPLUG=y +# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set +CONFIG_MEMORY_HOTREMOVE=y +CONFIG_MHP_MEMMAP_ON_MEMORY=y +CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y +CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y +CONFIG_MEMORY_BALLOON=y +CONFIG_BALLOON_COMPACTION=y +CONFIG_COMPACTION=y +CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1 +CONFIG_PAGE_REPORTING=y +CONFIG_MIGRATION=y +CONFIG_DEVICE_MIGRATION=y +CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y +CONFIG_ARCH_ENABLE_THP_MIGRATION=y +CONFIG_CONTIG_ALLOC=y +CONFIG_PCP_BATCH_SCALE_MAX=5 +CONFIG_PHYS_ADDR_T_64BIT=y +CONFIG_MMU_NOTIFIER=y +CONFIG_KSM=y +CONFIG_DEFAULT_MMAP_MIN_ADDR=4096 +CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y +CONFIG_MEMORY_FAILURE=y +CONFIG_HWPOISON_INJECT=m +CONFIG_TRANSPARENT_HUGEPAGE=y +# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set +CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y +# CONFIG_READ_ONLY_THP_FOR_FS is not set +CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y +CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y +CONFIG_USE_PERCPU_NUMA_NODE_ID=y +CONFIG_HAVE_SETUP_PER_CPU_AREA=y +# CONFIG_CMA is not set +CONFIG_GENERIC_EARLY_IOREMAP=y +CONFIG_DEFERRED_STRUCT_PAGE_INIT=y +# CONFIG_IDLE_PAGE_TRACKING is not set +CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y +CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y +CONFIG_ARCH_HAS_PTE_DEVMAP=y +CONFIG_ARCH_HAS_ZONE_DMA_SET=y +CONFIG_ZONE_DMA=y +CONFIG_ZONE_DMA32=y +CONFIG_ZONE_DEVICE=y +CONFIG_HMM_MIRROR=y +CONFIG_GET_FREE_REGION=y +CONFIG_DEVICE_PRIVATE=y +CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y +CONFIG_ARCH_USES_PG_ARCH_X=y +CONFIG_VM_EVENT_COUNTERS=y +# CONFIG_PERCPU_STATS is not set +# CONFIG_GUP_TEST is not set +# CONFIG_DMAPOOL_TEST is not set +CONFIG_ARCH_HAS_PTE_SPECIAL=y +CONFIG_MEMFD_CREATE=y +CONFIG_SECRETMEM=y +# CONFIG_ANON_VMA_NAME is not set +CONFIG_USERFAULTFD=y +CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y +# CONFIG_LRU_GEN is not set +CONFIG_ARCH_SUPPORTS_PER_VMA_LOCK=y +CONFIG_PER_VMA_LOCK=y +CONFIG_LOCK_MM_AND_FIND_VMA=y + +# +# Data Access Monitoring +# +# CONFIG_DAMON is not set +# end of Data Access Monitoring +# end of Memory Management options + +CONFIG_NET=y +CONFIG_COMPAT_NETLINK_MESSAGES=y +CONFIG_NET_INGRESS=y +CONFIG_NET_EGRESS=y +CONFIG_NET_XGRESS=y +CONFIG_NET_REDIRECT=y +CONFIG_SKB_EXTENSIONS=y + +# +# Networking options +# +CONFIG_PACKET=y +CONFIG_PACKET_DIAG=m +CONFIG_UNIX=y +CONFIG_UNIX_SCM=y +CONFIG_AF_UNIX_OOB=y +CONFIG_UNIX_DIAG=m +# CONFIG_TLS is not set +CONFIG_XFRM=y +CONFIG_XFRM_OFFLOAD=y +CONFIG_XFRM_ALGO=m +CONFIG_XFRM_USER=m +CONFIG_XFRM_INTERFACE=m +CONFIG_XFRM_SUB_POLICY=y +CONFIG_XFRM_MIGRATE=y +CONFIG_XFRM_STATISTICS=y +CONFIG_XFRM_AH=m +CONFIG_XFRM_ESP=m +CONFIG_XFRM_IPCOMP=m +CONFIG_NET_KEY=m +CONFIG_NET_KEY_MIGRATE=y +CONFIG_SMC=m +CONFIG_SMC_DIAG=m +CONFIG_XDP_SOCKETS=y +# CONFIG_XDP_SOCKETS_DIAG is not set +CONFIG_NET_HANDSHAKE=y +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_FIB_TRIE_STATS=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_ROUTE_CLASSID=y +# CONFIG_IP_PNP is not set +CONFIG_NET_IPIP=m +CONFIG_NET_IPGRE_DEMUX=m +CONFIG_NET_IP_TUNNEL=m +CONFIG_NET_IPGRE=m +CONFIG_NET_IPGRE_BROADCAST=y +CONFIG_IP_MROUTE_COMMON=y +CONFIG_IP_MROUTE=y +CONFIG_IP_MROUTE_MULTIPLE_TABLES=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +CONFIG_NET_IPVTI=m +CONFIG_NET_UDP_TUNNEL=m +CONFIG_NET_FOU=m +CONFIG_NET_FOU_IP_TUNNELS=y +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_ESP_OFFLOAD=m +# CONFIG_INET_ESPINTCP is not set +CONFIG_INET_IPCOMP=m +CONFIG_INET_TABLE_PERTURB_ORDER=16 +CONFIG_INET_XFRM_TUNNEL=m +CONFIG_INET_TUNNEL=m +CONFIG_INET_DIAG=m +CONFIG_INET_TCP_DIAG=m +CONFIG_INET_UDP_DIAG=m +CONFIG_INET_RAW_DIAG=m +CONFIG_INET_DIAG_DESTROY=y +CONFIG_TCP_CONG_ADVANCED=y +CONFIG_TCP_CONG_BIC=m +CONFIG_TCP_CONG_CUBIC=y +CONFIG_TCP_CONG_WESTWOOD=m +CONFIG_TCP_CONG_HTCP=m +CONFIG_TCP_CONG_HSTCP=m +CONFIG_TCP_CONG_HYBLA=m +CONFIG_TCP_CONG_VEGAS=m +CONFIG_TCP_CONG_NV=m +CONFIG_TCP_CONG_SCALABLE=m +CONFIG_TCP_CONG_LP=m +CONFIG_TCP_CONG_VENO=m +CONFIG_TCP_CONG_YEAH=m +CONFIG_TCP_CONG_ILLINOIS=m +CONFIG_TCP_CONG_DCTCP=m +CONFIG_TCP_CONG_CDG=m +CONFIG_TCP_CONG_BBR=m +CONFIG_DEFAULT_CUBIC=y +# CONFIG_DEFAULT_RENO is not set +CONFIG_DEFAULT_TCP_CONG="cubic" +CONFIG_TCP_MD5SIG=y +CONFIG_IPV6=y +CONFIG_IPV6_ROUTER_PREF=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_OPTIMISTIC_DAD=y +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_ESP_OFFLOAD=m +# CONFIG_INET6_ESPINTCP is not set +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_MIP6=y +CONFIG_IPV6_ILA=m +CONFIG_INET6_XFRM_TUNNEL=m +CONFIG_INET6_TUNNEL=m +CONFIG_IPV6_VTI=m +CONFIG_IPV6_SIT=m +CONFIG_IPV6_SIT_6RD=y +CONFIG_IPV6_NDISC_NODETYPE=y +CONFIG_IPV6_TUNNEL=m +CONFIG_IPV6_GRE=m +CONFIG_IPV6_FOU=m +CONFIG_IPV6_FOU_TUNNEL=m +CONFIG_IPV6_MULTIPLE_TABLES=y +CONFIG_IPV6_SUBTREES=y +CONFIG_IPV6_MROUTE=y +CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y +CONFIG_IPV6_PIMSM_V2=y +CONFIG_IPV6_SEG6_LWTUNNEL=y +CONFIG_IPV6_SEG6_HMAC=y +CONFIG_IPV6_SEG6_BPF=y +# CONFIG_IPV6_RPL_LWTUNNEL is not set +# CONFIG_IPV6_IOAM6_LWTUNNEL is not set +# CONFIG_NETLABEL is not set +# CONFIG_MPTCP is not set +CONFIG_NETWORK_SECMARK=y +CONFIG_NET_PTP_CLASSIFY=y +# CONFIG_NETWORK_PHY_TIMESTAMPING is not set +CONFIG_NETFILTER=y +CONFIG_NETFILTER_ADVANCED=y +CONFIG_BRIDGE_NETFILTER=m + +# +# Core Netfilter Configuration +# +CONFIG_NETFILTER_INGRESS=y +CONFIG_NETFILTER_EGRESS=y +CONFIG_NETFILTER_SKIP_EGRESS=y +CONFIG_NETFILTER_NETLINK=m +CONFIG_NETFILTER_FAMILY_BRIDGE=y +CONFIG_NETFILTER_FAMILY_ARP=y +CONFIG_NETFILTER_BPF_LINK=y +# CONFIG_NETFILTER_NETLINK_HOOK is not set +CONFIG_NETFILTER_NETLINK_ACCT=m +CONFIG_NETFILTER_NETLINK_QUEUE=m +CONFIG_NETFILTER_NETLINK_LOG=m +CONFIG_NETFILTER_NETLINK_OSF=m +CONFIG_NF_CONNTRACK=m +CONFIG_NF_LOG_SYSLOG=m +CONFIG_NETFILTER_CONNCOUNT=m +CONFIG_NF_CONNTRACK_MARK=y +CONFIG_NF_CONNTRACK_SECMARK=y +CONFIG_NF_CONNTRACK_ZONES=y +CONFIG_NF_CONNTRACK_PROCFS=y +CONFIG_NF_CONNTRACK_EVENTS=y +CONFIG_NF_CONNTRACK_TIMEOUT=y +CONFIG_NF_CONNTRACK_TIMESTAMP=y +CONFIG_NF_CONNTRACK_LABELS=y +CONFIG_NF_CONNTRACK_OVS=y +CONFIG_NF_CT_PROTO_DCCP=y +CONFIG_NF_CT_PROTO_GRE=y +CONFIG_NF_CT_PROTO_SCTP=y +CONFIG_NF_CT_PROTO_UDPLITE=y +CONFIG_NF_CONNTRACK_AMANDA=m +CONFIG_NF_CONNTRACK_FTP=m +CONFIG_NF_CONNTRACK_H323=m +CONFIG_NF_CONNTRACK_IRC=m +CONFIG_NF_CONNTRACK_BROADCAST=m +CONFIG_NF_CONNTRACK_NETBIOS_NS=m +CONFIG_NF_CONNTRACK_SNMP=m +CONFIG_NF_CONNTRACK_PPTP=m +CONFIG_NF_CONNTRACK_SANE=m +CONFIG_NF_CONNTRACK_SIP=m +CONFIG_NF_CONNTRACK_TFTP=m +CONFIG_NF_CT_NETLINK=m +CONFIG_NF_CT_NETLINK_TIMEOUT=m +CONFIG_NF_CT_NETLINK_HELPER=m +CONFIG_NETFILTER_NETLINK_GLUE_CT=y +CONFIG_NF_NAT=m +CONFIG_NF_NAT_AMANDA=m +CONFIG_NF_NAT_FTP=m +CONFIG_NF_NAT_IRC=m +CONFIG_NF_NAT_SIP=m +CONFIG_NF_NAT_TFTP=m +CONFIG_NF_NAT_REDIRECT=y +CONFIG_NF_NAT_MASQUERADE=y +CONFIG_NF_NAT_OVS=y +CONFIG_NETFILTER_SYNPROXY=m +CONFIG_NF_TABLES=m +CONFIG_NF_TABLES_INET=y +CONFIG_NF_TABLES_NETDEV=y +CONFIG_NFT_NUMGEN=m +CONFIG_NFT_CT=m +CONFIG_NFT_FLOW_OFFLOAD=m +CONFIG_NFT_CONNLIMIT=m +CONFIG_NFT_LOG=m +CONFIG_NFT_LIMIT=m +CONFIG_NFT_MASQ=m +CONFIG_NFT_REDIR=m +CONFIG_NFT_NAT=m +CONFIG_NFT_TUNNEL=m +CONFIG_NFT_QUEUE=m +CONFIG_NFT_QUOTA=m +CONFIG_NFT_REJECT=m +CONFIG_NFT_REJECT_INET=m +CONFIG_NFT_COMPAT=m +CONFIG_NFT_HASH=m +CONFIG_NFT_FIB=m +CONFIG_NFT_FIB_INET=m +# CONFIG_NFT_XFRM is not set +CONFIG_NFT_SOCKET=m +CONFIG_NFT_OSF=m +CONFIG_NFT_TPROXY=m +# CONFIG_NFT_SYNPROXY is not set +CONFIG_NF_DUP_NETDEV=m +CONFIG_NFT_DUP_NETDEV=m +CONFIG_NFT_FWD_NETDEV=m +CONFIG_NFT_FIB_NETDEV=m +# CONFIG_NFT_REJECT_NETDEV is not set +CONFIG_NF_FLOW_TABLE_INET=m +CONFIG_NF_FLOW_TABLE=m +# CONFIG_NF_FLOW_TABLE_PROCFS is not set +CONFIG_NETFILTER_XTABLES=m +CONFIG_NETFILTER_XTABLES_COMPAT=y + +# +# Xtables combined modules +# +CONFIG_NETFILTER_XT_MARK=m +CONFIG_NETFILTER_XT_CONNMARK=m +CONFIG_NETFILTER_XT_SET=m + +# +# Xtables targets +# +CONFIG_NETFILTER_XT_TARGET_AUDIT=m +CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m +CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m +CONFIG_NETFILTER_XT_TARGET_CONNMARK=m +CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m +CONFIG_NETFILTER_XT_TARGET_CT=m +CONFIG_NETFILTER_XT_TARGET_DSCP=m +CONFIG_NETFILTER_XT_TARGET_HL=m +CONFIG_NETFILTER_XT_TARGET_HMARK=m +CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m +CONFIG_NETFILTER_XT_TARGET_LED=m +CONFIG_NETFILTER_XT_TARGET_LOG=m +CONFIG_NETFILTER_XT_TARGET_MARK=m +CONFIG_NETFILTER_XT_NAT=m +CONFIG_NETFILTER_XT_TARGET_NETMAP=m +CONFIG_NETFILTER_XT_TARGET_NFLOG=m +CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m +# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set +CONFIG_NETFILTER_XT_TARGET_RATEEST=m +CONFIG_NETFILTER_XT_TARGET_REDIRECT=m +CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m +CONFIG_NETFILTER_XT_TARGET_TEE=m +CONFIG_NETFILTER_XT_TARGET_TPROXY=m +CONFIG_NETFILTER_XT_TARGET_TRACE=m +CONFIG_NETFILTER_XT_TARGET_SECMARK=m +CONFIG_NETFILTER_XT_TARGET_TCPMSS=m +CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m + +# +# Xtables matches +# +CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m +CONFIG_NETFILTER_XT_MATCH_BPF=m +CONFIG_NETFILTER_XT_MATCH_CGROUP=m +CONFIG_NETFILTER_XT_MATCH_CLUSTER=m +CONFIG_NETFILTER_XT_MATCH_COMMENT=m +CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m +CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m +CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m +CONFIG_NETFILTER_XT_MATCH_CONNMARK=m +CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m +CONFIG_NETFILTER_XT_MATCH_CPU=m +CONFIG_NETFILTER_XT_MATCH_DCCP=m +CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m +CONFIG_NETFILTER_XT_MATCH_DSCP=m +CONFIG_NETFILTER_XT_MATCH_ECN=m +CONFIG_NETFILTER_XT_MATCH_ESP=m +CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m +CONFIG_NETFILTER_XT_MATCH_HELPER=m +CONFIG_NETFILTER_XT_MATCH_HL=m +CONFIG_NETFILTER_XT_MATCH_IPCOMP=m +CONFIG_NETFILTER_XT_MATCH_IPRANGE=m +CONFIG_NETFILTER_XT_MATCH_IPVS=m +CONFIG_NETFILTER_XT_MATCH_L2TP=m +CONFIG_NETFILTER_XT_MATCH_LENGTH=m +CONFIG_NETFILTER_XT_MATCH_LIMIT=m +CONFIG_NETFILTER_XT_MATCH_MAC=m +CONFIG_NETFILTER_XT_MATCH_MARK=m +CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m +CONFIG_NETFILTER_XT_MATCH_NFACCT=m +CONFIG_NETFILTER_XT_MATCH_OSF=m +CONFIG_NETFILTER_XT_MATCH_OWNER=m +CONFIG_NETFILTER_XT_MATCH_POLICY=m +CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m +CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m +CONFIG_NETFILTER_XT_MATCH_QUOTA=m +CONFIG_NETFILTER_XT_MATCH_RATEEST=m +CONFIG_NETFILTER_XT_MATCH_REALM=m +CONFIG_NETFILTER_XT_MATCH_RECENT=m +CONFIG_NETFILTER_XT_MATCH_SCTP=m +CONFIG_NETFILTER_XT_MATCH_SOCKET=m +CONFIG_NETFILTER_XT_MATCH_STATE=m +CONFIG_NETFILTER_XT_MATCH_STATISTIC=m +CONFIG_NETFILTER_XT_MATCH_STRING=m +CONFIG_NETFILTER_XT_MATCH_TCPMSS=m +CONFIG_NETFILTER_XT_MATCH_TIME=m +CONFIG_NETFILTER_XT_MATCH_U32=m +# end of Core Netfilter Configuration + +CONFIG_IP_SET=m +CONFIG_IP_SET_MAX=256 +CONFIG_IP_SET_BITMAP_IP=m +CONFIG_IP_SET_BITMAP_IPMAC=m +CONFIG_IP_SET_BITMAP_PORT=m +CONFIG_IP_SET_HASH_IP=m +CONFIG_IP_SET_HASH_IPMARK=m +CONFIG_IP_SET_HASH_IPPORT=m +CONFIG_IP_SET_HASH_IPPORTIP=m +CONFIG_IP_SET_HASH_IPPORTNET=m +CONFIG_IP_SET_HASH_IPMAC=m +CONFIG_IP_SET_HASH_MAC=m +CONFIG_IP_SET_HASH_NETPORTNET=m +CONFIG_IP_SET_HASH_NET=m +CONFIG_IP_SET_HASH_NETNET=m +CONFIG_IP_SET_HASH_NETPORT=m +CONFIG_IP_SET_HASH_NETIFACE=m +CONFIG_IP_SET_LIST_SET=m +CONFIG_IP_VS=m +CONFIG_IP_VS_IPV6=y +# CONFIG_IP_VS_DEBUG is not set +CONFIG_IP_VS_TAB_BITS=12 + +# +# IPVS transport protocol load balancing support +# +CONFIG_IP_VS_PROTO_TCP=y +CONFIG_IP_VS_PROTO_UDP=y +CONFIG_IP_VS_PROTO_AH_ESP=y +CONFIG_IP_VS_PROTO_ESP=y +CONFIG_IP_VS_PROTO_AH=y +CONFIG_IP_VS_PROTO_SCTP=y + +# +# IPVS scheduler +# +CONFIG_IP_VS_RR=m +CONFIG_IP_VS_WRR=m +CONFIG_IP_VS_LC=m +CONFIG_IP_VS_WLC=m +CONFIG_IP_VS_FO=m +CONFIG_IP_VS_OVF=m +CONFIG_IP_VS_LBLC=m +CONFIG_IP_VS_LBLCR=m +CONFIG_IP_VS_DH=m +CONFIG_IP_VS_SH=m +CONFIG_IP_VS_MH=m +CONFIG_IP_VS_SED=m +CONFIG_IP_VS_NQ=m +# CONFIG_IP_VS_TWOS is not set + +# +# IPVS SH scheduler +# +CONFIG_IP_VS_SH_TAB_BITS=8 + +# +# IPVS MH scheduler +# +CONFIG_IP_VS_MH_TAB_INDEX=12 + +# +# IPVS application helper +# +CONFIG_IP_VS_FTP=m +CONFIG_IP_VS_NFCT=y +CONFIG_IP_VS_PE_SIP=m + +# +# IP: Netfilter Configuration +# +CONFIG_NF_DEFRAG_IPV4=m +CONFIG_NF_SOCKET_IPV4=m +CONFIG_NF_TPROXY_IPV4=m +CONFIG_NF_TABLES_IPV4=y +CONFIG_NFT_REJECT_IPV4=m +CONFIG_NFT_DUP_IPV4=m +CONFIG_NFT_FIB_IPV4=m +CONFIG_NF_TABLES_ARP=y +CONFIG_NF_DUP_IPV4=m +CONFIG_NF_LOG_ARP=m +CONFIG_NF_LOG_IPV4=m +CONFIG_NF_REJECT_IPV4=m +CONFIG_NF_NAT_SNMP_BASIC=m +CONFIG_NF_NAT_PPTP=m +CONFIG_NF_NAT_H323=m +CONFIG_IP_NF_IPTABLES=m +CONFIG_IP_NF_MATCH_AH=m +CONFIG_IP_NF_MATCH_ECN=m +CONFIG_IP_NF_MATCH_RPFILTER=m +CONFIG_IP_NF_MATCH_TTL=m +CONFIG_IP_NF_FILTER=m +CONFIG_IP_NF_TARGET_REJECT=m +CONFIG_IP_NF_TARGET_SYNPROXY=m +CONFIG_IP_NF_NAT=m +CONFIG_IP_NF_TARGET_MASQUERADE=m +CONFIG_IP_NF_TARGET_NETMAP=m +CONFIG_IP_NF_TARGET_REDIRECT=m +CONFIG_IP_NF_MANGLE=m +CONFIG_IP_NF_TARGET_ECN=m +CONFIG_IP_NF_TARGET_TTL=m +CONFIG_IP_NF_RAW=m +CONFIG_IP_NF_SECURITY=m +CONFIG_IP_NF_ARPTABLES=m +CONFIG_IP_NF_ARPFILTER=m +CONFIG_IP_NF_ARP_MANGLE=m +# end of IP: Netfilter Configuration + +# +# IPv6: Netfilter Configuration +# +CONFIG_NF_SOCKET_IPV6=m +CONFIG_NF_TPROXY_IPV6=m +CONFIG_NF_TABLES_IPV6=y +CONFIG_NFT_REJECT_IPV6=m +CONFIG_NFT_DUP_IPV6=m +CONFIG_NFT_FIB_IPV6=m +CONFIG_NF_DUP_IPV6=m +CONFIG_NF_REJECT_IPV6=m +CONFIG_NF_LOG_IPV6=m +CONFIG_IP6_NF_IPTABLES=m +CONFIG_IP6_NF_MATCH_AH=m +CONFIG_IP6_NF_MATCH_EUI64=m +CONFIG_IP6_NF_MATCH_FRAG=m +CONFIG_IP6_NF_MATCH_OPTS=m +CONFIG_IP6_NF_MATCH_HL=m +CONFIG_IP6_NF_MATCH_IPV6HEADER=m +CONFIG_IP6_NF_MATCH_MH=m +CONFIG_IP6_NF_MATCH_RPFILTER=m +CONFIG_IP6_NF_MATCH_RT=m +# CONFIG_IP6_NF_MATCH_SRH is not set +CONFIG_IP6_NF_TARGET_HL=m +CONFIG_IP6_NF_FILTER=m +CONFIG_IP6_NF_TARGET_REJECT=m +CONFIG_IP6_NF_TARGET_SYNPROXY=m +CONFIG_IP6_NF_MANGLE=m +CONFIG_IP6_NF_RAW=m +CONFIG_IP6_NF_SECURITY=m +CONFIG_IP6_NF_NAT=m +CONFIG_IP6_NF_TARGET_MASQUERADE=m +CONFIG_IP6_NF_TARGET_NPT=m +# end of IPv6: Netfilter Configuration + +CONFIG_NF_DEFRAG_IPV6=m +CONFIG_NF_TABLES_BRIDGE=m +# CONFIG_NFT_BRIDGE_META is not set +CONFIG_NFT_BRIDGE_REJECT=m +# CONFIG_NF_CONNTRACK_BRIDGE is not set +CONFIG_BRIDGE_NF_EBTABLES=m +CONFIG_BRIDGE_EBT_BROUTE=m +CONFIG_BRIDGE_EBT_T_FILTER=m +CONFIG_BRIDGE_EBT_T_NAT=m +CONFIG_BRIDGE_EBT_802_3=m +CONFIG_BRIDGE_EBT_AMONG=m +CONFIG_BRIDGE_EBT_ARP=m +CONFIG_BRIDGE_EBT_IP=m +CONFIG_BRIDGE_EBT_IP6=m +CONFIG_BRIDGE_EBT_LIMIT=m +CONFIG_BRIDGE_EBT_MARK=m +CONFIG_BRIDGE_EBT_PKTTYPE=m +CONFIG_BRIDGE_EBT_STP=m +CONFIG_BRIDGE_EBT_VLAN=m +CONFIG_BRIDGE_EBT_ARPREPLY=m +CONFIG_BRIDGE_EBT_DNAT=m +CONFIG_BRIDGE_EBT_MARK_T=m +CONFIG_BRIDGE_EBT_REDIRECT=m +CONFIG_BRIDGE_EBT_SNAT=m +CONFIG_BRIDGE_EBT_LOG=m +CONFIG_BRIDGE_EBT_NFLOG=m +# CONFIG_BPFILTER is not set +CONFIG_IP_DCCP=m +CONFIG_INET_DCCP_DIAG=m + +# +# DCCP CCIDs Configuration +# +# CONFIG_IP_DCCP_CCID2_DEBUG is not set +CONFIG_IP_DCCP_CCID3=y +# CONFIG_IP_DCCP_CCID3_DEBUG is not set +CONFIG_IP_DCCP_TFRC_LIB=y +# end of DCCP CCIDs Configuration + +# +# DCCP Kernel Hacking +# +# CONFIG_IP_DCCP_DEBUG is not set +# end of DCCP Kernel Hacking + +CONFIG_IP_SCTP=m +# CONFIG_SCTP_DBG_OBJCNT is not set +CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5=y +# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set +# CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE is not set +CONFIG_SCTP_COOKIE_HMAC_MD5=y +CONFIG_SCTP_COOKIE_HMAC_SHA1=y +CONFIG_INET_SCTP_DIAG=m +CONFIG_RDS=m +CONFIG_RDS_RDMA=m +CONFIG_RDS_TCP=m +# CONFIG_RDS_DEBUG is not set +CONFIG_TIPC=m +CONFIG_TIPC_MEDIA_IB=y +CONFIG_TIPC_MEDIA_UDP=y +CONFIG_TIPC_CRYPTO=y +CONFIG_TIPC_DIAG=m +CONFIG_ATM=m +CONFIG_ATM_CLIP=m +# CONFIG_ATM_CLIP_NO_ICMP is not set +CONFIG_ATM_LANE=m +CONFIG_ATM_MPOA=m +CONFIG_ATM_BR2684=m +# CONFIG_ATM_BR2684_IPFILTER is not set +CONFIG_L2TP=m +CONFIG_L2TP_DEBUGFS=m +CONFIG_L2TP_V3=y +CONFIG_L2TP_IP=m +CONFIG_L2TP_ETH=m +CONFIG_STP=m +CONFIG_GARP=m +CONFIG_MRP=m +CONFIG_BRIDGE=m +CONFIG_BRIDGE_IGMP_SNOOPING=y +CONFIG_BRIDGE_VLAN_FILTERING=y +# CONFIG_BRIDGE_MRP is not set +# CONFIG_BRIDGE_CFM is not set +CONFIG_NET_DSA=m +# CONFIG_NET_DSA_TAG_NONE is not set +# CONFIG_NET_DSA_TAG_AR9331 is not set +# CONFIG_NET_DSA_TAG_BRCM is not set +# CONFIG_NET_DSA_TAG_BRCM_LEGACY is not set +# CONFIG_NET_DSA_TAG_BRCM_PREPEND is not set +# CONFIG_NET_DSA_TAG_HELLCREEK is not set +# CONFIG_NET_DSA_TAG_GSWIP is not set +CONFIG_NET_DSA_TAG_DSA_COMMON=m +CONFIG_NET_DSA_TAG_DSA=m +CONFIG_NET_DSA_TAG_EDSA=m +# CONFIG_NET_DSA_TAG_MTK is not set +# CONFIG_NET_DSA_TAG_KSZ is not set +# CONFIG_NET_DSA_TAG_OCELOT is not set +# CONFIG_NET_DSA_TAG_OCELOT_8021Q is not set +# CONFIG_NET_DSA_TAG_QCA is not set +# CONFIG_NET_DSA_TAG_RTL4_A is not set +# CONFIG_NET_DSA_TAG_RTL8_4 is not set +# CONFIG_NET_DSA_TAG_RZN1_A5PSW is not set +# CONFIG_NET_DSA_TAG_LAN9303 is not set +# CONFIG_NET_DSA_TAG_SJA1105 is not set +CONFIG_NET_DSA_TAG_TRAILER=m +# CONFIG_NET_DSA_TAG_XRS700X is not set +CONFIG_VLAN_8021Q=m +CONFIG_VLAN_8021Q_GVRP=y +CONFIG_VLAN_8021Q_MVRP=y +CONFIG_LLC=m +CONFIG_LLC2=m +CONFIG_ATALK=m +CONFIG_DEV_APPLETALK=m +CONFIG_IPDDP=m +CONFIG_IPDDP_ENCAP=y +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +CONFIG_PHONET=m +CONFIG_6LOWPAN=m +# CONFIG_6LOWPAN_DEBUGFS is not set +CONFIG_6LOWPAN_NHC=m +CONFIG_6LOWPAN_NHC_DEST=m +CONFIG_6LOWPAN_NHC_FRAGMENT=m +CONFIG_6LOWPAN_NHC_HOP=m +CONFIG_6LOWPAN_NHC_IPV6=m +CONFIG_6LOWPAN_NHC_MOBILITY=m +CONFIG_6LOWPAN_NHC_ROUTING=m +CONFIG_6LOWPAN_NHC_UDP=m +CONFIG_6LOWPAN_GHC_EXT_HDR_HOP=m +CONFIG_6LOWPAN_GHC_UDP=m +CONFIG_6LOWPAN_GHC_ICMPV6=m +CONFIG_6LOWPAN_GHC_EXT_HDR_DEST=m +CONFIG_6LOWPAN_GHC_EXT_HDR_FRAG=m +CONFIG_6LOWPAN_GHC_EXT_HDR_ROUTE=m +CONFIG_IEEE802154=m +# CONFIG_IEEE802154_NL802154_EXPERIMENTAL is not set +CONFIG_IEEE802154_SOCKET=m +CONFIG_IEEE802154_6LOWPAN=m +CONFIG_MAC802154=m +CONFIG_NET_SCHED=y + +# +# Queueing/Scheduling +# +CONFIG_NET_SCH_HTB=m +CONFIG_NET_SCH_HFSC=m +CONFIG_NET_SCH_PRIO=m +CONFIG_NET_SCH_MULTIQ=m +CONFIG_NET_SCH_RED=m +CONFIG_NET_SCH_SFB=m +CONFIG_NET_SCH_SFQ=m +CONFIG_NET_SCH_TEQL=m +CONFIG_NET_SCH_TBF=m +CONFIG_NET_SCH_CBS=m +CONFIG_NET_SCH_ETF=m +CONFIG_NET_SCH_MQPRIO_LIB=m +# CONFIG_NET_SCH_TAPRIO is not set +CONFIG_NET_SCH_GRED=m +CONFIG_NET_SCH_NETEM=m +CONFIG_NET_SCH_DRR=m +CONFIG_NET_SCH_MQPRIO=m +CONFIG_NET_SCH_SKBPRIO=m +CONFIG_NET_SCH_CHOKE=m +CONFIG_NET_SCH_QFQ=m +CONFIG_NET_SCH_CODEL=m +CONFIG_NET_SCH_FQ_CODEL=m +CONFIG_NET_SCH_CAKE=m +CONFIG_NET_SCH_FQ=m +CONFIG_NET_SCH_HHF=m +CONFIG_NET_SCH_PIE=m +# CONFIG_NET_SCH_FQ_PIE is not set +CONFIG_NET_SCH_INGRESS=m +CONFIG_NET_SCH_PLUG=m +# CONFIG_NET_SCH_ETS is not set +# CONFIG_NET_SCH_DEFAULT is not set + +# +# Classification +# +CONFIG_NET_CLS=y +CONFIG_NET_CLS_BASIC=m +CONFIG_NET_CLS_ROUTE4=m +CONFIG_NET_CLS_FW=m +CONFIG_NET_CLS_U32=m +CONFIG_CLS_U32_PERF=y +CONFIG_CLS_U32_MARK=y +CONFIG_NET_CLS_FLOW=m +CONFIG_NET_CLS_CGROUP=m +CONFIG_NET_CLS_BPF=m +CONFIG_NET_CLS_FLOWER=m +CONFIG_NET_CLS_MATCHALL=m +CONFIG_NET_EMATCH=y +CONFIG_NET_EMATCH_STACK=32 +CONFIG_NET_EMATCH_CMP=m +CONFIG_NET_EMATCH_NBYTE=m +CONFIG_NET_EMATCH_U32=m +CONFIG_NET_EMATCH_META=m +CONFIG_NET_EMATCH_TEXT=m +CONFIG_NET_EMATCH_CANID=m +CONFIG_NET_EMATCH_IPSET=m +CONFIG_NET_EMATCH_IPT=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_ACT_POLICE=m +CONFIG_NET_ACT_GACT=m +CONFIG_GACT_PROB=y +CONFIG_NET_ACT_MIRRED=m +CONFIG_NET_ACT_SAMPLE=m +CONFIG_NET_ACT_IPT=m +CONFIG_NET_ACT_NAT=m +CONFIG_NET_ACT_PEDIT=m +CONFIG_NET_ACT_SIMP=m +CONFIG_NET_ACT_SKBEDIT=m +CONFIG_NET_ACT_CSUM=m +# CONFIG_NET_ACT_MPLS is not set +CONFIG_NET_ACT_VLAN=m +CONFIG_NET_ACT_BPF=m +CONFIG_NET_ACT_CONNMARK=m +# CONFIG_NET_ACT_CTINFO is not set +CONFIG_NET_ACT_SKBMOD=m +CONFIG_NET_ACT_IFE=m +CONFIG_NET_ACT_TUNNEL_KEY=m +CONFIG_NET_ACT_CT=m +# CONFIG_NET_ACT_GATE is not set +CONFIG_NET_IFE_SKBMARK=m +CONFIG_NET_IFE_SKBPRIO=m +CONFIG_NET_IFE_SKBTCINDEX=m +# CONFIG_NET_TC_SKB_EXT is not set +CONFIG_NET_SCH_FIFO=y +CONFIG_DCB=y +CONFIG_DNS_RESOLVER=m +CONFIG_BATMAN_ADV=m +# CONFIG_BATMAN_ADV_BATMAN_V is not set +CONFIG_BATMAN_ADV_BLA=y +CONFIG_BATMAN_ADV_DAT=y +CONFIG_BATMAN_ADV_NC=y +CONFIG_BATMAN_ADV_MCAST=y +# CONFIG_BATMAN_ADV_DEBUG is not set +# CONFIG_BATMAN_ADV_TRACING is not set +CONFIG_OPENVSWITCH=m +CONFIG_OPENVSWITCH_GRE=m +CONFIG_OPENVSWITCH_VXLAN=m +CONFIG_OPENVSWITCH_GENEVE=m +CONFIG_VSOCKETS=m +CONFIG_VSOCKETS_DIAG=m +CONFIG_VSOCKETS_LOOPBACK=m +CONFIG_VIRTIO_VSOCKETS=m +CONFIG_VIRTIO_VSOCKETS_COMMON=m +CONFIG_HYPERV_VSOCKETS=m +CONFIG_NETLINK_DIAG=m +CONFIG_MPLS=y +CONFIG_NET_MPLS_GSO=y +CONFIG_MPLS_ROUTING=m +CONFIG_MPLS_IPTUNNEL=m +CONFIG_NET_NSH=m +# CONFIG_HSR is not set +CONFIG_NET_SWITCHDEV=y +CONFIG_NET_L3_MASTER_DEV=y +# CONFIG_QRTR is not set +# CONFIG_NET_NCSI is not set +CONFIG_PCPU_DEV_REFCNT=y +CONFIG_MAX_SKB_FRAGS=17 +CONFIG_RPS=y +CONFIG_RFS_ACCEL=y +CONFIG_SOCK_RX_QUEUE_MAPPING=y +CONFIG_XPS=y +CONFIG_CGROUP_NET_PRIO=y +CONFIG_CGROUP_NET_CLASSID=y +CONFIG_NET_RX_BUSY_POLL=y +CONFIG_BQL=y +CONFIG_BPF_STREAM_PARSER=y +CONFIG_NET_FLOW_LIMIT=y + +# +# Network testing +# +CONFIG_NET_PKTGEN=m +CONFIG_NET_DROP_MONITOR=y +# end of Network testing +# end of Networking options + +CONFIG_HAMRADIO=y + +# +# Packet Radio protocols +# +CONFIG_AX25=m +CONFIG_AX25_DAMA_SLAVE=y +CONFIG_NETROM=m +CONFIG_ROSE=m + +# +# AX.25 network device drivers +# +CONFIG_MKISS=m +CONFIG_6PACK=m +CONFIG_BPQETHER=m +CONFIG_BAYCOM_SER_FDX=m +CONFIG_BAYCOM_SER_HDX=m +CONFIG_BAYCOM_PAR=m +CONFIG_YAM=m +# end of AX.25 network device drivers + +CONFIG_CAN=m +CONFIG_CAN_RAW=m +CONFIG_CAN_BCM=m +CONFIG_CAN_GW=m +# CONFIG_CAN_J1939 is not set +# CONFIG_CAN_ISOTP is not set +CONFIG_BT=m +CONFIG_BT_BREDR=y +CONFIG_BT_RFCOMM=m +CONFIG_BT_RFCOMM_TTY=y +CONFIG_BT_BNEP=m +CONFIG_BT_BNEP_MC_FILTER=y +CONFIG_BT_BNEP_PROTO_FILTER=y +CONFIG_BT_HIDP=m +CONFIG_BT_LE=y +CONFIG_BT_LE_L2CAP_ECRED=y +CONFIG_BT_6LOWPAN=m +CONFIG_BT_LEDS=y +# CONFIG_BT_MSFTEXT is not set +# CONFIG_BT_AOSPEXT is not set +CONFIG_BT_DEBUGFS=y +# CONFIG_BT_SELFTEST is not set + +# +# Bluetooth device drivers +# +CONFIG_BT_INTEL=m +CONFIG_BT_BCM=m +CONFIG_BT_RTL=m +CONFIG_BT_QCA=m +CONFIG_BT_MTK=m +CONFIG_BT_HCIBTUSB=m +CONFIG_BT_HCIBTUSB_AUTOSUSPEND=y +CONFIG_BT_HCIBTUSB_POLL_SYNC=y +CONFIG_BT_HCIBTUSB_BCM=y +# CONFIG_BT_HCIBTUSB_MTK is not set +CONFIG_BT_HCIBTUSB_RTL=y +CONFIG_BT_HCIBTSDIO=m +CONFIG_BT_HCIUART=m +CONFIG_BT_HCIUART_SERDEV=y +CONFIG_BT_HCIUART_H4=y +CONFIG_BT_HCIUART_NOKIA=m +# CONFIG_BT_HCIUART_BCSP is not set +CONFIG_BT_HCIUART_ATH3K=y +CONFIG_BT_HCIUART_LL=y +CONFIG_BT_HCIUART_3WIRE=y +CONFIG_BT_HCIUART_INTEL=y +CONFIG_BT_HCIUART_RTL=y +CONFIG_BT_HCIUART_QCA=y +CONFIG_BT_HCIUART_AG6XX=y +CONFIG_BT_HCIUART_MRVL=y +# CONFIG_BT_HCIBCM203X is not set +# CONFIG_BT_HCIBCM4377 is not set +# CONFIG_BT_HCIBPA10X is not set +# CONFIG_BT_HCIBFUSB is not set +# CONFIG_BT_HCIVHCI is not set +CONFIG_BT_MRVL=m +CONFIG_BT_MRVL_SDIO=m +CONFIG_BT_ATH3K=m +# CONFIG_BT_MTKSDIO is not set +CONFIG_BT_MTKUART=m +CONFIG_BT_QCOMSMD=m +CONFIG_BT_HCIRSI=m +# CONFIG_BT_VIRTIO is not set +# CONFIG_BT_NXPUART is not set +# end of Bluetooth device drivers + +CONFIG_AF_RXRPC=m +CONFIG_AF_RXRPC_IPV6=y +# CONFIG_AF_RXRPC_INJECT_LOSS is not set +# CONFIG_AF_RXRPC_INJECT_RX_DELAY is not set +# CONFIG_AF_RXRPC_DEBUG is not set +CONFIG_RXKAD=y +# CONFIG_RXPERF is not set +# CONFIG_AF_KCM is not set +CONFIG_STREAM_PARSER=y +# CONFIG_MCTP is not set +CONFIG_FIB_RULES=y +CONFIG_WIRELESS=y +CONFIG_WIRELESS_EXT=y +CONFIG_WEXT_CORE=y +CONFIG_WEXT_PROC=y +CONFIG_WEXT_SPY=y +CONFIG_WEXT_PRIV=y +CONFIG_CFG80211=m +# CONFIG_NL80211_TESTMODE is not set +# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set +# CONFIG_CFG80211_CERTIFICATION_ONUS is not set +CONFIG_CFG80211_REQUIRE_SIGNED_REGDB=y +CONFIG_CFG80211_USE_KERNEL_REGDB_KEYS=y +CONFIG_CFG80211_DEFAULT_PS=y +# CONFIG_CFG80211_DEBUGFS is not set +CONFIG_CFG80211_CRDA_SUPPORT=y +CONFIG_CFG80211_WEXT=y +CONFIG_CFG80211_WEXT_EXPORT=y +CONFIG_LIB80211=m +CONFIG_LIB80211_CRYPT_WEP=m +CONFIG_LIB80211_CRYPT_CCMP=m +CONFIG_LIB80211_CRYPT_TKIP=m +# CONFIG_LIB80211_DEBUG is not set +CONFIG_MAC80211=m +CONFIG_MAC80211_HAS_RC=y +CONFIG_MAC80211_RC_MINSTREL=y +CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y +CONFIG_MAC80211_RC_DEFAULT="minstrel_ht" +CONFIG_MAC80211_MESH=y +CONFIG_MAC80211_LEDS=y +# CONFIG_MAC80211_DEBUGFS is not set +# CONFIG_MAC80211_MESSAGE_TRACING is not set +# CONFIG_MAC80211_DEBUG_MENU is not set +CONFIG_MAC80211_STA_HASH_MAX_SIZE=0 +CONFIG_RFKILL=m +CONFIG_RFKILL_LEDS=y +CONFIG_RFKILL_INPUT=y +# CONFIG_RFKILL_GPIO is not set +CONFIG_NET_9P=m +CONFIG_NET_9P_FD=m +CONFIG_NET_9P_VIRTIO=m +CONFIG_NET_9P_XEN=m +CONFIG_NET_9P_RDMA=m +# CONFIG_NET_9P_DEBUG is not set +# CONFIG_CAIF is not set +CONFIG_CEPH_LIB=m +# CONFIG_CEPH_LIB_PRETTYDEBUG is not set +# CONFIG_CEPH_LIB_USE_DNS_RESOLVER is not set +CONFIG_NFC=m +CONFIG_NFC_DIGITAL=m +# CONFIG_NFC_NCI is not set +# CONFIG_NFC_HCI is not set + +# +# Near Field Communication (NFC) devices +# +# CONFIG_NFC_TRF7970A is not set +CONFIG_NFC_SIM=m +CONFIG_NFC_PORT100=m +CONFIG_NFC_PN533=m +CONFIG_NFC_PN533_USB=m +# CONFIG_NFC_PN533_I2C is not set +# CONFIG_NFC_PN532_UART is not set +# CONFIG_NFC_ST95HF is not set +# end of Near Field Communication (NFC) devices + +CONFIG_PSAMPLE=m +CONFIG_NET_IFE=m +CONFIG_LWTUNNEL=y +CONFIG_LWTUNNEL_BPF=y +CONFIG_DST_CACHE=y +CONFIG_GRO_CELLS=y +CONFIG_NET_SELFTESTS=m +CONFIG_NET_SOCK_MSG=y +CONFIG_NET_DEVLINK=y +CONFIG_PAGE_POOL=y +CONFIG_PAGE_POOL_STATS=y +CONFIG_FAILOVER=m +CONFIG_ETHTOOL_NETLINK=y + +# +# Device Drivers +# +CONFIG_ARM_AMBA=y +CONFIG_TEGRA_AHB=y +CONFIG_HAVE_PCI=y +CONFIG_PCI=y +CONFIG_PCI_DOMAINS=y +CONFIG_PCI_DOMAINS_GENERIC=y +CONFIG_PCI_SYSCALL=y +CONFIG_PCIEPORTBUS=y +CONFIG_HOTPLUG_PCI_PCIE=y +CONFIG_PCIEAER=y +CONFIG_PCIEAER_INJECT=m +# CONFIG_PCIE_ECRC is not set +CONFIG_PCIEASPM=y +CONFIG_PCIEASPM_DEFAULT=y +# CONFIG_PCIEASPM_POWERSAVE is not set +# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set +# CONFIG_PCIEASPM_PERFORMANCE is not set +CONFIG_PCIE_PME=y +CONFIG_PCIE_DPC=y +CONFIG_PCIE_PTM=y +CONFIG_PCIE_EDR=y +CONFIG_PCI_MSI=y +CONFIG_PCI_QUIRKS=y +# CONFIG_PCI_DEBUG is not set +CONFIG_PCI_REALLOC_ENABLE_AUTO=y +CONFIG_PCI_STUB=m +CONFIG_PCI_PF_STUB=m +CONFIG_PCI_ATS=y +CONFIG_PCI_ECAM=y +CONFIG_PCI_BRIDGE_EMUL=y +CONFIG_PCI_IOV=y +CONFIG_PCI_PRI=y +CONFIG_PCI_PASID=y +CONFIG_PCI_P2PDMA=y +CONFIG_PCI_LABEL=y +CONFIG_PCI_HYPERV=m +# CONFIG_PCI_DYNAMIC_OF_NODES is not set +# CONFIG_PCIE_BUS_TUNE_OFF is not set +CONFIG_PCIE_BUS_DEFAULT=y +# CONFIG_PCIE_BUS_SAFE is not set +# CONFIG_PCIE_BUS_PERFORMANCE is not set +# CONFIG_PCIE_BUS_PEER2PEER is not set +CONFIG_VGA_ARB=y +CONFIG_VGA_ARB_MAX_GPUS=16 +CONFIG_HOTPLUG_PCI=y +CONFIG_HOTPLUG_PCI_ACPI=y +CONFIG_HOTPLUG_PCI_ACPI_IBM=m +CONFIG_HOTPLUG_PCI_CPCI=y +CONFIG_HOTPLUG_PCI_SHPC=y + +# +# PCI controller drivers +# +CONFIG_PCI_AARDVARK=y +# CONFIG_PCIE_ALTERA is not set +CONFIG_PCI_HOST_THUNDER_PEM=y +CONFIG_PCI_HOST_THUNDER_ECAM=y +# CONFIG_PCI_FTPCI100 is not set +CONFIG_PCI_HOST_COMMON=y +CONFIG_PCI_HOST_GENERIC=y +# CONFIG_PCIE_HISI_ERR is not set +# CONFIG_PCIE_MICROCHIP_HOST is not set +CONFIG_PCI_HYPERV_INTERFACE=m +CONFIG_PCI_TEGRA=y +CONFIG_PCIE_ROCKCHIP=y +CONFIG_PCIE_ROCKCHIP_HOST=y +CONFIG_PCI_XGENE=y +CONFIG_PCI_XGENE_MSI=y +# CONFIG_PCIE_XILINX is not set +CONFIG_PCIE_XILINX_NWL=y +# CONFIG_PCIE_XILINX_CPM is not set + +# +# Cadence-based PCIe controllers +# +# CONFIG_PCIE_CADENCE_PLAT_HOST is not set +# CONFIG_PCI_J721E_HOST is not set +# end of Cadence-based PCIe controllers + +# +# DesignWare-based PCIe controllers +# +CONFIG_PCIE_DW=y +CONFIG_PCIE_DW_HOST=y +# CONFIG_PCIE_AL is not set +# CONFIG_PCI_MESON is not set +CONFIG_PCI_HISI=y +CONFIG_PCIE_KIRIN=y +# CONFIG_PCIE_HISI_STB is not set +CONFIG_PCIE_ARMADA_8K=y +# CONFIG_PCIE_TEGRA194_HOST is not set +# CONFIG_PCIE_DW_PLAT_HOST is not set +CONFIG_PCIE_QCOM=y +# CONFIG_PCIE_ROCKCHIP_DW_HOST is not set +# end of DesignWare-based PCIe controllers + +# +# Mobiveil-based PCIe controllers +# +# CONFIG_PCIE_MOBIVEIL_PLAT is not set +# end of Mobiveil-based PCIe controllers +# end of PCI controller drivers + +# +# PCI Endpoint +# +# CONFIG_PCI_ENDPOINT is not set +# end of PCI Endpoint + +# +# PCI switch controller drivers +# +# CONFIG_PCI_SW_SWITCHTEC is not set +# end of PCI switch controller drivers + +# CONFIG_CXL_BUS is not set +# CONFIG_PCCARD is not set +# CONFIG_RAPIDIO is not set + +# +# Generic Driver Options +# +CONFIG_AUXILIARY_BUS=y +# CONFIG_UEVENT_HELPER is not set +CONFIG_DEVTMPFS=y +# CONFIG_DEVTMPFS_MOUNT is not set +# CONFIG_DEVTMPFS_SAFE is not set +CONFIG_STANDALONE=y +CONFIG_PREVENT_FIRMWARE_BUILD=y + +# +# Firmware loader +# +CONFIG_FW_LOADER=y +CONFIG_FW_LOADER_DEBUG=y +CONFIG_EXTRA_FIRMWARE="" +# CONFIG_FW_LOADER_USER_HELPER is not set +# CONFIG_FW_LOADER_COMPRESS is not set +CONFIG_FW_CACHE=y +# CONFIG_FW_UPLOAD is not set +# end of Firmware loader + +CONFIG_WANT_DEV_COREDUMP=y +CONFIG_ALLOW_DEV_COREDUMP=y +CONFIG_DEV_COREDUMP=y +# CONFIG_DEBUG_DRIVER is not set +# CONFIG_DEBUG_DEVRES is not set +# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set +# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set +CONFIG_SYS_HYPERVISOR=y +CONFIG_GENERIC_CPU_AUTOPROBE=y +CONFIG_GENERIC_CPU_VULNERABILITIES=y +CONFIG_SOC_BUS=y +CONFIG_REGMAP=y +CONFIG_REGMAP_I2C=y +CONFIG_REGMAP_SPI=m +CONFIG_REGMAP_SPMI=y +CONFIG_REGMAP_MMIO=y +CONFIG_REGMAP_IRQ=y +CONFIG_DMA_SHARED_BUFFER=y +# CONFIG_DMA_FENCE_TRACE is not set +CONFIG_GENERIC_ARCH_TOPOLOGY=y +CONFIG_GENERIC_ARCH_NUMA=y +# CONFIG_FW_DEVLINK_SYNC_STATE_TIMEOUT is not set +# end of Generic Driver Options + +# +# Bus devices +# +CONFIG_ARM_CCI=y +CONFIG_ARM_CCI400_COMMON=y +# CONFIG_BRCMSTB_GISB_ARB is not set +# CONFIG_MOXTET is not set +CONFIG_HISILICON_LPC=y +CONFIG_QCOM_EBI2=y +# CONFIG_QCOM_SSC_BLOCK_BUS is not set +CONFIG_SUN50I_DE2_BUS=y +CONFIG_SUNXI_RSB=y +CONFIG_TEGRA_ACONNECT=y +# CONFIG_TEGRA_GMI is not set +CONFIG_VEXPRESS_CONFIG=y +# CONFIG_MHI_BUS is not set +# CONFIG_MHI_BUS_EP is not set +# end of Bus devices + +# +# Cache Drivers +# +# end of Cache Drivers + +CONFIG_CONNECTOR=y +CONFIG_PROC_EVENTS=y + +# +# Firmware Drivers +# + +# +# ARM System Control and Management Interface Protocol +# +# CONFIG_ARM_SCMI_PROTOCOL is not set +# end of ARM System Control and Management Interface Protocol + +# CONFIG_ARM_SCPI_PROTOCOL is not set +CONFIG_ARM_SDE_INTERFACE=y +# CONFIG_FIRMWARE_MEMMAP is not set +CONFIG_DMIID=y +CONFIG_DMI_SYSFS=y +# CONFIG_ISCSI_IBFT is not set +CONFIG_FW_CFG_SYSFS=m +# CONFIG_FW_CFG_SYSFS_CMDLINE is not set +CONFIG_QCOM_SCM=y +# CONFIG_QCOM_SCM_DOWNLOAD_MODE_DEFAULT is not set +CONFIG_SYSFB=y +# CONFIG_SYSFB_SIMPLEFB is not set +CONFIG_TURRIS_MOX_RWTM=m +CONFIG_ARM_FFA_TRANSPORT=y +CONFIG_ARM_FFA_SMCCC=y +CONFIG_GOOGLE_FIRMWARE=y +# CONFIG_GOOGLE_CBMEM is not set +CONFIG_GOOGLE_COREBOOT_TABLE=m +CONFIG_GOOGLE_MEMCONSOLE=m +# CONFIG_GOOGLE_FRAMEBUFFER_COREBOOT is not set +CONFIG_GOOGLE_MEMCONSOLE_COREBOOT=m +# CONFIG_GOOGLE_VPD is not set + +# +# EFI (Extensible Firmware Interface) Support +# +CONFIG_EFI_ESRT=y +CONFIG_EFI_VARS_PSTORE=m +# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set +CONFIG_EFI_PARAMS_FROM_FDT=y +CONFIG_EFI_RUNTIME_WRAPPERS=y +CONFIG_EFI_GENERIC_STUB=y +# CONFIG_EFI_ZBOOT is not set +CONFIG_EFI_ARMSTUB_DTB_LOADER=y +CONFIG_EFI_BOOTLOADER_CONTROL=m +CONFIG_EFI_CAPSULE_LOADER=m +# CONFIG_EFI_TEST is not set +# CONFIG_RESET_ATTACK_MITIGATION is not set +# CONFIG_EFI_DISABLE_PCI_DMA is not set +CONFIG_EFI_EARLYCON=y +CONFIG_EFI_CUSTOM_SSDT_OVERLAYS=y +# CONFIG_EFI_DISABLE_RUNTIME is not set +# CONFIG_EFI_COCO_SECRET is not set +# end of EFI (Extensible Firmware Interface) Support + +CONFIG_UEFI_CPER=y +CONFIG_UEFI_CPER_ARM=y +CONFIG_ARM_PSCI_FW=y +# CONFIG_ARM_PSCI_CHECKER is not set +CONFIG_HAVE_ARM_SMCCC=y +CONFIG_HAVE_ARM_SMCCC_DISCOVERY=y +CONFIG_ARM_SMCCC_SOC_ID=y + +# +# Tegra firmware driver +# +CONFIG_TEGRA_IVC=y +CONFIG_TEGRA_BPMP=y +# end of Tegra firmware driver + +# +# Zynq MPSoC Firmware Drivers +# +CONFIG_ZYNQMP_FIRMWARE=y +# CONFIG_ZYNQMP_FIRMWARE_DEBUG is not set +# end of Zynq MPSoC Firmware Drivers +# end of Firmware Drivers + +CONFIG_GNSS=m +CONFIG_GNSS_SERIAL=m +# CONFIG_GNSS_MTK_SERIAL is not set +CONFIG_GNSS_SIRF_SERIAL=m +CONFIG_GNSS_UBX_SERIAL=m +# CONFIG_GNSS_USB is not set +CONFIG_MTD=m +# CONFIG_MTD_TESTS is not set + +# +# Partition parsers +# +CONFIG_MTD_AR7_PARTS=m +# CONFIG_MTD_CMDLINE_PARTS is not set +CONFIG_MTD_OF_PARTS=m +# CONFIG_MTD_AFS_PARTS is not set +# CONFIG_MTD_REDBOOT_PARTS is not set +# end of Partition parsers + +# +# User Modules And Translation Layers +# +CONFIG_MTD_BLKDEVS=m +CONFIG_MTD_BLOCK=m +CONFIG_MTD_BLOCK_RO=m + +# +# Note that in some cases UBI block is preferred. See MTD_UBI_BLOCK. +# +# CONFIG_FTL is not set +# CONFIG_NFTL is not set +# CONFIG_INFTL is not set +CONFIG_RFD_FTL=m +CONFIG_SSFDC=m +# CONFIG_SM_FTL is not set +CONFIG_MTD_OOPS=m +CONFIG_MTD_SWAP=m +# CONFIG_MTD_PARTITIONED_MASTER is not set + +# +# RAM/ROM/Flash chip drivers +# +# CONFIG_MTD_CFI is not set +# CONFIG_MTD_JEDECPROBE is not set +CONFIG_MTD_MAP_BANK_WIDTH_1=y +CONFIG_MTD_MAP_BANK_WIDTH_2=y +CONFIG_MTD_MAP_BANK_WIDTH_4=y +CONFIG_MTD_CFI_I1=y +CONFIG_MTD_CFI_I2=y +CONFIG_MTD_RAM=m +# CONFIG_MTD_ROM is not set +# CONFIG_MTD_ABSENT is not set +# end of RAM/ROM/Flash chip drivers + +# +# Mapping drivers for chip access +# +# CONFIG_MTD_COMPLEX_MAPPINGS is not set +# CONFIG_MTD_PHYSMAP is not set +CONFIG_MTD_INTEL_VR_NOR=m +CONFIG_MTD_PLATRAM=m +# end of Mapping drivers for chip access + +# +# Self-contained MTD device drivers +# +# CONFIG_MTD_PMC551 is not set +CONFIG_MTD_DATAFLASH=m +# CONFIG_MTD_DATAFLASH_WRITE_VERIFY is not set +# CONFIG_MTD_DATAFLASH_OTP is not set +# CONFIG_MTD_MCHP23K256 is not set +# CONFIG_MTD_MCHP48L640 is not set +CONFIG_MTD_SST25L=m +# CONFIG_MTD_SLRAM is not set +# CONFIG_MTD_PHRAM is not set +# CONFIG_MTD_MTDRAM is not set +# CONFIG_MTD_BLOCK2MTD is not set + +# +# Disk-On-Chip Device Drivers +# +# CONFIG_MTD_DOCG3 is not set +# end of Self-contained MTD device drivers + +# +# NAND +# +CONFIG_MTD_ONENAND=m +CONFIG_MTD_ONENAND_VERIFY_WRITE=y +# CONFIG_MTD_ONENAND_GENERIC is not set +# CONFIG_MTD_ONENAND_OTP is not set +CONFIG_MTD_ONENAND_2X_PROGRAM=y +# CONFIG_MTD_RAW_NAND is not set +# CONFIG_MTD_SPI_NAND is not set + +# +# ECC engine support +# +# CONFIG_MTD_NAND_ECC_SW_HAMMING is not set +# CONFIG_MTD_NAND_ECC_SW_BCH is not set +# CONFIG_MTD_NAND_ECC_MXIC is not set +# end of ECC engine support +# end of NAND + +# +# LPDDR & LPDDR2 PCM memory drivers +# +CONFIG_MTD_LPDDR=m +CONFIG_MTD_QINFO_PROBE=m +# end of LPDDR & LPDDR2 PCM memory drivers + +CONFIG_MTD_SPI_NOR=m +CONFIG_MTD_SPI_NOR_USE_4K_SECTORS=y +# CONFIG_MTD_SPI_NOR_SWP_DISABLE is not set +CONFIG_MTD_SPI_NOR_SWP_DISABLE_ON_VOLATILE=y +# CONFIG_MTD_SPI_NOR_SWP_KEEP is not set +CONFIG_SPI_HISI_SFC=m +CONFIG_MTD_UBI=m +CONFIG_MTD_UBI_WL_THRESHOLD=4096 +CONFIG_MTD_UBI_BEB_LIMIT=20 +# CONFIG_MTD_UBI_FASTMAP is not set +# CONFIG_MTD_UBI_GLUEBI is not set +CONFIG_MTD_UBI_BLOCK=y +# CONFIG_MTD_HYPERBUS is not set +CONFIG_DTC=y +CONFIG_OF=y +# CONFIG_OF_UNITTEST is not set +CONFIG_OF_FLATTREE=y +CONFIG_OF_EARLY_FLATTREE=y +CONFIG_OF_KOBJ=y +CONFIG_OF_ADDRESS=y +CONFIG_OF_IRQ=y +CONFIG_OF_RESERVED_MEM=y +# CONFIG_OF_OVERLAY is not set +CONFIG_OF_NUMA=y +CONFIG_PARPORT=m +# CONFIG_PARPORT_PC is not set +CONFIG_PARPORT_1284=y +CONFIG_PARPORT_NOT_PC=y +CONFIG_PNP=y +# CONFIG_PNP_DEBUG_MESSAGES is not set + +# +# Protocols +# +CONFIG_PNPACPI=y +CONFIG_BLK_DEV=y +CONFIG_BLK_DEV_NULL_BLK=m +CONFIG_CDROM=m +CONFIG_BLK_DEV_PCIESSD_MTIP32XX=m +CONFIG_ZRAM=m +CONFIG_ZRAM_DEF_COMP_LZORLE=y +# CONFIG_ZRAM_DEF_COMP_ZSTD is not set +# CONFIG_ZRAM_DEF_COMP_LZ4 is not set +# CONFIG_ZRAM_DEF_COMP_LZO is not set +# CONFIG_ZRAM_DEF_COMP_LZ4HC is not set +CONFIG_ZRAM_DEF_COMP="lzo-rle" +CONFIG_ZRAM_WRITEBACK=y +CONFIG_ZRAM_MEMORY_TRACKING=y +# CONFIG_ZRAM_MULTI_COMP is not set +CONFIG_BLK_DEV_LOOP=m +CONFIG_BLK_DEV_LOOP_MIN_COUNT=8 +CONFIG_BLK_DEV_DRBD=m +# CONFIG_DRBD_FAULT_INJECTION is not set +CONFIG_BLK_DEV_NBD=m +CONFIG_BLK_DEV_RAM=m +CONFIG_BLK_DEV_RAM_COUNT=16 +CONFIG_BLK_DEV_RAM_SIZE=16384 +# CONFIG_CDROM_PKTCDVD is not set +CONFIG_ATA_OVER_ETH=m +CONFIG_XEN_BLKDEV_FRONTEND=m +CONFIG_XEN_BLKDEV_BACKEND=m +CONFIG_VIRTIO_BLK=m +CONFIG_BLK_DEV_RBD=m +# CONFIG_BLK_DEV_UBLK is not set + +# +# NVME Support +# +CONFIG_NVME_CORE=m +CONFIG_BLK_DEV_NVME=m +CONFIG_NVME_MULTIPATH=y +# CONFIG_NVME_VERBOSE_ERRORS is not set +# CONFIG_NVME_HWMON is not set +CONFIG_NVME_FABRICS=m +CONFIG_NVME_RDMA=m +CONFIG_NVME_FC=m +CONFIG_NVME_TCP=m +# CONFIG_NVME_AUTH is not set +CONFIG_NVME_TARGET=m +# CONFIG_NVME_TARGET_PASSTHRU is not set +# CONFIG_NVME_TARGET_LOOP is not set +CONFIG_NVME_TARGET_RDMA=m +CONFIG_NVME_TARGET_FC=m +# CONFIG_NVME_TARGET_FCLOOP is not set +CONFIG_NVME_TARGET_TCP=m +# CONFIG_NVME_TARGET_AUTH is not set +# end of NVME Support + +# +# Misc devices +# +CONFIG_SENSORS_LIS3LV02D=m +CONFIG_AD525X_DPOT=m +CONFIG_AD525X_DPOT_I2C=m +CONFIG_AD525X_DPOT_SPI=m +# CONFIG_DUMMY_IRQ is not set +# CONFIG_PHANTOM is not set +CONFIG_TIFM_CORE=m +CONFIG_TIFM_7XX1=m +CONFIG_ICS932S401=m +CONFIG_ENCLOSURE_SERVICES=m +# CONFIG_HI6421V600_IRQ is not set +# CONFIG_HP_ILO is not set +CONFIG_QCOM_COINCELL=m +# CONFIG_QCOM_FASTRPC is not set +CONFIG_APDS9802ALS=m +CONFIG_ISL29003=m +CONFIG_ISL29020=m +CONFIG_SENSORS_TSL2550=m +CONFIG_SENSORS_BH1770=m +CONFIG_SENSORS_APDS990X=m +CONFIG_HMC6352=m +CONFIG_DS1682=m +# CONFIG_LATTICE_ECP3_CONFIG is not set +CONFIG_SRAM=y +# CONFIG_DW_XDATA_PCIE is not set +# CONFIG_PCI_ENDPOINT_TEST is not set +# CONFIG_XILINX_SDFEC is not set +CONFIG_MISC_RTSX=m +# CONFIG_HISI_HIKEY_USB is not set +# CONFIG_OPEN_DICE is not set +# CONFIG_VCPU_STALL_DETECTOR is not set +CONFIG_C2PORT=m + +# +# EEPROM support +# +CONFIG_EEPROM_AT24=m +CONFIG_EEPROM_AT25=m +CONFIG_EEPROM_LEGACY=m +CONFIG_EEPROM_MAX6875=m +CONFIG_EEPROM_93CX6=m +# CONFIG_EEPROM_93XX46 is not set +# CONFIG_EEPROM_IDT_89HPESX is not set +# CONFIG_EEPROM_EE1004 is not set +# end of EEPROM support + +CONFIG_CB710_CORE=m +# CONFIG_CB710_DEBUG is not set +CONFIG_CB710_DEBUG_ASSUMPTIONS=y + +# +# Texas Instruments shared transport line discipline +# +CONFIG_TI_ST=m +# end of Texas Instruments shared transport line discipline + +CONFIG_SENSORS_LIS3_I2C=m +CONFIG_ALTERA_STAPL=m +# CONFIG_VMWARE_VMCI is not set +# CONFIG_GENWQE is not set +# CONFIG_ECHO is not set +# CONFIG_BCM_VK is not set +# CONFIG_MISC_ALCOR_PCI is not set +CONFIG_MISC_RTSX_PCI=m +CONFIG_MISC_RTSX_USB=m +# CONFIG_UACCE is not set +CONFIG_PVPANIC=y +CONFIG_PVPANIC_MMIO=m +CONFIG_PVPANIC_PCI=m +# CONFIG_GP_PCI1XXXX is not set +# end of Misc devices + +# +# SCSI device support +# +CONFIG_SCSI_MOD=m +CONFIG_RAID_ATTRS=m +CONFIG_SCSI_COMMON=m +CONFIG_SCSI=m +CONFIG_SCSI_DMA=y +CONFIG_SCSI_NETLINK=y +# CONFIG_SCSI_PROC_FS is not set + +# +# SCSI support type (disk, tape, CD-ROM) +# +CONFIG_BLK_DEV_SD=m +CONFIG_CHR_DEV_ST=m +CONFIG_BLK_DEV_SR=m +CONFIG_CHR_DEV_SG=m +CONFIG_BLK_DEV_BSG=y +CONFIG_CHR_DEV_SCH=m +CONFIG_SCSI_ENCLOSURE=m +CONFIG_SCSI_CONSTANTS=y +CONFIG_SCSI_LOGGING=y +CONFIG_SCSI_SCAN_ASYNC=y + +# +# SCSI Transports +# +CONFIG_SCSI_SPI_ATTRS=m +CONFIG_SCSI_FC_ATTRS=m +CONFIG_SCSI_ISCSI_ATTRS=m +CONFIG_SCSI_SAS_ATTRS=m +CONFIG_SCSI_SAS_LIBSAS=m +CONFIG_SCSI_SAS_ATA=y +CONFIG_SCSI_SAS_HOST_SMP=y +CONFIG_SCSI_SRP_ATTRS=m +# end of SCSI Transports + +CONFIG_SCSI_LOWLEVEL=y +CONFIG_ISCSI_TCP=m +CONFIG_ISCSI_BOOT_SYSFS=m +CONFIG_SCSI_CXGB3_ISCSI=m +CONFIG_SCSI_CXGB4_ISCSI=m +CONFIG_SCSI_BNX2_ISCSI=m +CONFIG_SCSI_BNX2X_FCOE=m +CONFIG_BE2ISCSI=m +CONFIG_BLK_DEV_3W_XXXX_RAID=m +CONFIG_SCSI_HPSA=m +CONFIG_SCSI_3W_9XXX=m +CONFIG_SCSI_3W_SAS=m +CONFIG_SCSI_ACARD=m +CONFIG_SCSI_AACRAID=m +CONFIG_SCSI_AIC7XXX=m +CONFIG_AIC7XXX_CMDS_PER_DEVICE=32 +CONFIG_AIC7XXX_RESET_DELAY_MS=15000 +CONFIG_AIC7XXX_DEBUG_ENABLE=y +CONFIG_AIC7XXX_DEBUG_MASK=0 +CONFIG_AIC7XXX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC79XX=m +CONFIG_AIC79XX_CMDS_PER_DEVICE=32 +CONFIG_AIC79XX_RESET_DELAY_MS=15000 +CONFIG_AIC79XX_DEBUG_ENABLE=y +CONFIG_AIC79XX_DEBUG_MASK=0 +CONFIG_AIC79XX_REG_PRETTY_PRINT=y +CONFIG_SCSI_AIC94XX=m +# CONFIG_AIC94XX_DEBUG is not set +CONFIG_SCSI_HISI_SAS=m +CONFIG_SCSI_HISI_SAS_PCI=m +# CONFIG_SCSI_HISI_SAS_DEBUGFS_DEFAULT_ENABLE is not set +CONFIG_SCSI_MVSAS=m +# CONFIG_SCSI_MVSAS_DEBUG is not set +# CONFIG_SCSI_MVSAS_TASKLET is not set +CONFIG_SCSI_MVUMI=m +CONFIG_SCSI_ADVANSYS=m +# CONFIG_SCSI_ARCMSR is not set +CONFIG_SCSI_ESAS2R=m +# CONFIG_MEGARAID_NEWGEN is not set +# CONFIG_MEGARAID_LEGACY is not set +CONFIG_MEGARAID_SAS=m +CONFIG_SCSI_MPT3SAS=m +CONFIG_SCSI_MPT2SAS_MAX_SGE=128 +CONFIG_SCSI_MPT3SAS_MAX_SGE=128 +CONFIG_SCSI_MPT2SAS=m +# CONFIG_SCSI_MPI3MR is not set +CONFIG_SCSI_SMARTPQI=m +CONFIG_SCSI_HPTIOP=m +# CONFIG_SCSI_BUSLOGIC is not set +# CONFIG_SCSI_MYRB is not set +# CONFIG_SCSI_MYRS is not set +CONFIG_XEN_SCSI_FRONTEND=m +CONFIG_HYPERV_STORAGE=m +CONFIG_LIBFC=m +CONFIG_LIBFCOE=m +CONFIG_FCOE=m +CONFIG_SCSI_SNIC=m +# CONFIG_SCSI_SNIC_DEBUG_FS is not set +CONFIG_SCSI_DMX3191D=m +# CONFIG_SCSI_FDOMAIN_PCI is not set +# CONFIG_SCSI_IPS is not set +# CONFIG_SCSI_INITIO is not set +# CONFIG_SCSI_INIA100 is not set +CONFIG_SCSI_STEX=m +CONFIG_SCSI_SYM53C8XX_2=m +CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 +CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 +CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 +CONFIG_SCSI_SYM53C8XX_MMIO=y +# CONFIG_SCSI_IPR is not set +# CONFIG_SCSI_QLOGIC_1280 is not set +CONFIG_SCSI_QLA_FC=m +CONFIG_TCM_QLA2XXX=m +# CONFIG_TCM_QLA2XXX_DEBUG is not set +CONFIG_SCSI_QLA_ISCSI=m +CONFIG_QEDI=m +CONFIG_QEDF=m +CONFIG_SCSI_LPFC=m +# CONFIG_SCSI_LPFC_DEBUG_FS is not set +# CONFIG_SCSI_EFCT is not set +# CONFIG_SCSI_DC395x is not set +# CONFIG_SCSI_AM53C974 is not set +CONFIG_SCSI_WD719X=m +# CONFIG_SCSI_DEBUG is not set +CONFIG_SCSI_PMCRAID=m +CONFIG_SCSI_PM8001=m +CONFIG_SCSI_BFA_FC=m +CONFIG_SCSI_VIRTIO=m +CONFIG_SCSI_CHELSIO_FCOE=m +CONFIG_SCSI_DH=y +CONFIG_SCSI_DH_RDAC=m +CONFIG_SCSI_DH_HP_SW=m +CONFIG_SCSI_DH_EMC=m +CONFIG_SCSI_DH_ALUA=m +# end of SCSI device support + +CONFIG_ATA=m +CONFIG_SATA_HOST=y +CONFIG_PATA_TIMINGS=y +CONFIG_ATA_VERBOSE_ERROR=y +CONFIG_ATA_FORCE=y +CONFIG_ATA_ACPI=y +CONFIG_SATA_ZPODD=y +CONFIG_SATA_PMP=y + +# +# Controllers with non-SFF native interface +# +CONFIG_SATA_AHCI=m +CONFIG_SATA_MOBILE_LPM_POLICY=3 +CONFIG_SATA_AHCI_PLATFORM=m +# CONFIG_AHCI_DWC is not set +CONFIG_AHCI_CEVA=m +CONFIG_AHCI_MVEBU=m +# CONFIG_AHCI_SUNXI is not set +CONFIG_AHCI_TEGRA=m +CONFIG_AHCI_XGENE=m +CONFIG_SATA_AHCI_SEATTLE=m +# CONFIG_SATA_INIC162X is not set +CONFIG_SATA_ACARD_AHCI=m +CONFIG_SATA_SIL24=m +CONFIG_ATA_SFF=y + +# +# SFF controllers with custom DMA interface +# +CONFIG_PDC_ADMA=m +CONFIG_SATA_QSTOR=m +CONFIG_SATA_SX4=m +CONFIG_ATA_BMDMA=y + +# +# SATA SFF controllers with BMDMA +# +CONFIG_ATA_PIIX=m +# CONFIG_SATA_DWC is not set +CONFIG_SATA_MV=m +CONFIG_SATA_NV=m +CONFIG_SATA_PROMISE=m +CONFIG_SATA_SIL=m +CONFIG_SATA_SIS=m +CONFIG_SATA_SVW=m +CONFIG_SATA_ULI=m +CONFIG_SATA_VIA=m +CONFIG_SATA_VITESSE=m + +# +# PATA SFF controllers with BMDMA +# +# CONFIG_PATA_ALI is not set +# CONFIG_PATA_AMD is not set +CONFIG_PATA_ARTOP=m +# CONFIG_PATA_ATIIXP is not set +CONFIG_PATA_ATP867X=m +CONFIG_PATA_CMD64X=m +# CONFIG_PATA_CYPRESS is not set +# CONFIG_PATA_EFAR is not set +# CONFIG_PATA_HPT366 is not set +# CONFIG_PATA_HPT37X is not set +# CONFIG_PATA_HPT3X2N is not set +# CONFIG_PATA_HPT3X3 is not set +CONFIG_PATA_IT8213=m +CONFIG_PATA_IT821X=m +CONFIG_PATA_JMICRON=m +CONFIG_PATA_MARVELL=m +# CONFIG_PATA_NETCELL is not set +CONFIG_PATA_NINJA32=m +# CONFIG_PATA_NS87415 is not set +# CONFIG_PATA_OLDPIIX is not set +# CONFIG_PATA_OPTIDMA is not set +# CONFIG_PATA_PDC2027X is not set +# CONFIG_PATA_PDC_OLD is not set +# CONFIG_PATA_RADISYS is not set +CONFIG_PATA_RDC=m +CONFIG_PATA_SCH=m +# CONFIG_PATA_SERVERWORKS is not set +# CONFIG_PATA_SIL680 is not set +CONFIG_PATA_SIS=m +CONFIG_PATA_TOSHIBA=m +# CONFIG_PATA_TRIFLEX is not set +# CONFIG_PATA_VIA is not set +# CONFIG_PATA_WINBOND is not set + +# +# PIO-only SFF controllers +# +# CONFIG_PATA_CMD640_PCI is not set +# CONFIG_PATA_MPIIX is not set +# CONFIG_PATA_NS87410 is not set +# CONFIG_PATA_OPTI is not set +# CONFIG_PATA_OF_PLATFORM is not set +# CONFIG_PATA_RZ1000 is not set + +# +# Generic fallback / legacy drivers +# +# CONFIG_PATA_ACPI is not set +CONFIG_ATA_GENERIC=m +# CONFIG_PATA_LEGACY is not set +CONFIG_MD=y +CONFIG_BLK_DEV_MD=m +CONFIG_MD_BITMAP_FILE=y +CONFIG_MD_LINEAR=m +CONFIG_MD_RAID0=m +CONFIG_MD_RAID1=m +CONFIG_MD_RAID10=m +CONFIG_MD_RAID456=m +CONFIG_MD_MULTIPATH=m +CONFIG_MD_FAULTY=m +# CONFIG_MD_CLUSTER is not set +CONFIG_BCACHE=m +# CONFIG_BCACHE_DEBUG is not set +# CONFIG_BCACHE_CLOSURES_DEBUG is not set +# CONFIG_BCACHE_ASYNC_REGISTRATION is not set +CONFIG_BLK_DEV_DM_BUILTIN=y +CONFIG_BLK_DEV_DM=m +# CONFIG_DM_DEBUG is not set +CONFIG_DM_BUFIO=m +# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set +CONFIG_DM_BIO_PRISON=m +CONFIG_DM_PERSISTENT_DATA=m +CONFIG_DM_UNSTRIPED=m +CONFIG_DM_CRYPT=m +CONFIG_DM_SNAPSHOT=m +CONFIG_DM_THIN_PROVISIONING=m +CONFIG_DM_CACHE=m +CONFIG_DM_CACHE_SMQ=m +CONFIG_DM_WRITECACHE=m +# CONFIG_DM_EBS is not set +CONFIG_DM_ERA=m +# CONFIG_DM_CLONE is not set +CONFIG_DM_MIRROR=m +CONFIG_DM_LOG_USERSPACE=m +CONFIG_DM_RAID=m +CONFIG_DM_ZERO=m +CONFIG_DM_MULTIPATH=m +CONFIG_DM_MULTIPATH_QL=m +CONFIG_DM_MULTIPATH_ST=m +# CONFIG_DM_MULTIPATH_HST is not set +# CONFIG_DM_MULTIPATH_IOA is not set +CONFIG_DM_DELAY=m +# CONFIG_DM_DUST is not set +CONFIG_DM_UEVENT=y +CONFIG_DM_FLAKEY=m +CONFIG_DM_VERITY=m +# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set +# CONFIG_DM_VERITY_FEC is not set +CONFIG_DM_SWITCH=m +CONFIG_DM_LOG_WRITES=m +CONFIG_DM_INTEGRITY=m +CONFIG_DM_ZONED=m +CONFIG_DM_AUDIT=y +CONFIG_TARGET_CORE=m +CONFIG_TCM_IBLOCK=m +CONFIG_TCM_FILEIO=m +CONFIG_TCM_PSCSI=m +CONFIG_TCM_USER2=m +CONFIG_LOOPBACK_TARGET=m +CONFIG_TCM_FC=m +CONFIG_ISCSI_TARGET=m +CONFIG_ISCSI_TARGET_CXGB4=m +CONFIG_SBP_TARGET=m +# CONFIG_REMOTE_TARGET is not set +CONFIG_FUSION=y +CONFIG_FUSION_SPI=m +CONFIG_FUSION_FC=m +CONFIG_FUSION_SAS=m +CONFIG_FUSION_MAX_SGE=128 +CONFIG_FUSION_CTL=m +# CONFIG_FUSION_LOGGING is not set + +# +# IEEE 1394 (FireWire) support +# +CONFIG_FIREWIRE=m +CONFIG_FIREWIRE_OHCI=m +CONFIG_FIREWIRE_SBP2=m +CONFIG_FIREWIRE_NET=m +CONFIG_FIREWIRE_NOSY=m +# end of IEEE 1394 (FireWire) support + +CONFIG_NETDEVICES=y +CONFIG_MII=m +CONFIG_NET_CORE=y +CONFIG_BONDING=m +CONFIG_DUMMY=m +# CONFIG_WIREGUARD is not set +CONFIG_EQUALIZER=m +# CONFIG_NET_FC is not set +CONFIG_IFB=m +CONFIG_NET_TEAM=m +CONFIG_NET_TEAM_MODE_BROADCAST=m +CONFIG_NET_TEAM_MODE_ROUNDROBIN=m +CONFIG_NET_TEAM_MODE_RANDOM=m +CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=m +CONFIG_NET_TEAM_MODE_LOADBALANCE=m +CONFIG_MACVLAN=m +CONFIG_MACVTAP=m +CONFIG_IPVLAN_L3S=y +CONFIG_IPVLAN=m +CONFIG_IPVTAP=m +CONFIG_VXLAN=m +CONFIG_GENEVE=m +# CONFIG_BAREUDP is not set +CONFIG_GTP=m +# CONFIG_AMT is not set +CONFIG_MACSEC=m +CONFIG_NETCONSOLE=m +CONFIG_NETCONSOLE_DYNAMIC=y +# CONFIG_NETCONSOLE_EXTENDED_LOG is not set +CONFIG_NETPOLL=y +CONFIG_NET_POLL_CONTROLLER=y +CONFIG_TUN=m +CONFIG_TAP=m +# CONFIG_TUN_VNET_CROSS_LE is not set +CONFIG_VETH=m +CONFIG_VIRTIO_NET=m +CONFIG_NLMON=m +CONFIG_NETKIT=y +CONFIG_NET_VRF=m +CONFIG_VSOCKMON=m +# CONFIG_ARCNET is not set +CONFIG_ATM_DRIVERS=y +CONFIG_ATM_DUMMY=m +# CONFIG_ATM_TCP is not set +# CONFIG_ATM_LANAI is not set +# CONFIG_ATM_ENI is not set +CONFIG_ATM_NICSTAR=m +CONFIG_ATM_NICSTAR_USE_SUNI=y +CONFIG_ATM_NICSTAR_USE_IDT77105=y +# CONFIG_ATM_IDT77252 is not set +CONFIG_ATM_IA=m +# CONFIG_ATM_IA_DEBUG is not set +CONFIG_ATM_FORE200E=m +# CONFIG_ATM_FORE200E_USE_TASKLET is not set +CONFIG_ATM_FORE200E_TX_RETRY=16 +CONFIG_ATM_FORE200E_DEBUG=0 +# CONFIG_ATM_HE is not set +CONFIG_ATM_SOLOS=m + +# +# Distributed Switch Architecture drivers +# +# CONFIG_B53 is not set +# CONFIG_NET_DSA_BCM_SF2 is not set +# CONFIG_NET_DSA_LOOP is not set +# CONFIG_NET_DSA_LANTIQ_GSWIP is not set +# CONFIG_NET_DSA_MT7530 is not set +CONFIG_NET_DSA_MV88E6060=m +# CONFIG_NET_DSA_MICROCHIP_KSZ_COMMON is not set +CONFIG_NET_DSA_MV88E6XXX=m +# CONFIG_NET_DSA_MV88E6XXX_PTP is not set +# CONFIG_NET_DSA_MSCC_OCELOT_EXT is not set +# CONFIG_NET_DSA_MSCC_SEVILLE is not set +# CONFIG_NET_DSA_AR9331 is not set +# CONFIG_NET_DSA_QCA8K is not set +# CONFIG_NET_DSA_SJA1105 is not set +# CONFIG_NET_DSA_XRS700X_I2C is not set +# CONFIG_NET_DSA_XRS700X_MDIO is not set +# CONFIG_NET_DSA_REALTEK is not set +# CONFIG_NET_DSA_SMSC_LAN9303_I2C is not set +# CONFIG_NET_DSA_SMSC_LAN9303_MDIO is not set +# CONFIG_NET_DSA_VITESSE_VSC73XX_SPI is not set +# CONFIG_NET_DSA_VITESSE_VSC73XX_PLATFORM is not set +# end of Distributed Switch Architecture drivers + +CONFIG_ETHERNET=y +CONFIG_MDIO=m +CONFIG_NET_VENDOR_3COM=y +CONFIG_VORTEX=m +CONFIG_TYPHOON=m +CONFIG_NET_VENDOR_ADAPTEC=y +CONFIG_ADAPTEC_STARFIRE=m +CONFIG_NET_VENDOR_AGERE=y +CONFIG_ET131X=m +CONFIG_NET_VENDOR_ALACRITECH=y +# CONFIG_SLICOSS is not set +CONFIG_NET_VENDOR_ALLWINNER=y +# CONFIG_SUN4I_EMAC is not set +CONFIG_NET_VENDOR_ALTEON=y +CONFIG_ACENIC=m +# CONFIG_ACENIC_OMIT_TIGON_I is not set +# CONFIG_ALTERA_TSE is not set +CONFIG_NET_VENDOR_AMAZON=y +CONFIG_ENA_ETHERNET=m +CONFIG_NET_VENDOR_AMD=y +# CONFIG_AMD8111_ETH is not set +CONFIG_PCNET32=m +CONFIG_AMD_XGBE=m +CONFIG_AMD_XGBE_DCB=y +# CONFIG_PDS_CORE is not set +CONFIG_NET_XGENE=m +CONFIG_NET_XGENE_V2=m +CONFIG_NET_VENDOR_AQUANTIA=y +CONFIG_AQTION=m +# CONFIG_NET_VENDOR_ARC is not set +CONFIG_NET_VENDOR_ASIX=y +# CONFIG_SPI_AX88796C is not set +CONFIG_NET_VENDOR_ATHEROS=y +CONFIG_ATL2=m +CONFIG_ATL1=m +CONFIG_ATL1E=m +CONFIG_ATL1C=m +CONFIG_ALX=m +CONFIG_NET_VENDOR_BROADCOM=y +# CONFIG_B44 is not set +# CONFIG_BCMGENET is not set +CONFIG_BNX2=m +CONFIG_CNIC=m +CONFIG_TIGON3=m +CONFIG_TIGON3_HWMON=y +CONFIG_BNX2X=m +CONFIG_BNX2X_SRIOV=y +# CONFIG_SYSTEMPORT is not set +CONFIG_BNXT=m +CONFIG_BNXT_SRIOV=y +CONFIG_BNXT_FLOWER_OFFLOAD=y +CONFIG_BNXT_DCB=y +CONFIG_BNXT_HWMON=y +CONFIG_NET_VENDOR_CADENCE=y +CONFIG_MACB=m +CONFIG_MACB_USE_HWSTAMP=y +# CONFIG_MACB_PCI is not set +CONFIG_NET_VENDOR_CAVIUM=y +CONFIG_THUNDER_NIC_PF=m +CONFIG_THUNDER_NIC_VF=m +CONFIG_THUNDER_NIC_BGX=m +CONFIG_THUNDER_NIC_RGX=m +CONFIG_CAVIUM_PTP=m +CONFIG_LIQUIDIO_CORE=m +CONFIG_LIQUIDIO=m +CONFIG_LIQUIDIO_VF=m +CONFIG_NET_VENDOR_CHELSIO=y +CONFIG_CHELSIO_T1=m +CONFIG_CHELSIO_T1_1G=y +CONFIG_CHELSIO_T3=m +CONFIG_CHELSIO_T4=m +CONFIG_CHELSIO_T4_DCB=y +CONFIG_CHELSIO_T4_FCOE=y +CONFIG_CHELSIO_T4VF=m +CONFIG_CHELSIO_LIB=m +CONFIG_CHELSIO_INLINE_CRYPTO=y +# CONFIG_CHELSIO_IPSEC_INLINE is not set +CONFIG_NET_VENDOR_CISCO=y +CONFIG_ENIC=m +CONFIG_NET_VENDOR_CORTINA=y +# CONFIG_GEMINI_ETHERNET is not set +CONFIG_NET_VENDOR_DAVICOM=y +# CONFIG_DM9051 is not set +# CONFIG_DNET is not set +CONFIG_NET_VENDOR_DEC=y +CONFIG_NET_TULIP=y +CONFIG_DE2104X=m +CONFIG_DE2104X_DSL=0 +CONFIG_TULIP=m +# CONFIG_TULIP_MWI is not set +# CONFIG_TULIP_MMIO is not set +CONFIG_TULIP_NAPI=y +CONFIG_TULIP_NAPI_HW_MITIGATION=y +CONFIG_WINBOND_840=m +CONFIG_DM9102=m +CONFIG_ULI526X=m +CONFIG_NET_VENDOR_DLINK=y +CONFIG_DL2K=m +CONFIG_SUNDANCE=m +# CONFIG_SUNDANCE_MMIO is not set +CONFIG_NET_VENDOR_EMULEX=y +CONFIG_BE2NET=m +CONFIG_BE2NET_HWMON=y +CONFIG_BE2NET_BE2=y +CONFIG_BE2NET_BE3=y +CONFIG_BE2NET_LANCER=y +CONFIG_BE2NET_SKYHAWK=y +CONFIG_NET_VENDOR_ENGLEDER=y +# CONFIG_TSNEP is not set +CONFIG_NET_VENDOR_EZCHIP=y +# CONFIG_EZCHIP_NPS_MANAGEMENT_ENET is not set +CONFIG_NET_VENDOR_FUNGIBLE=y +# CONFIG_FUN_ETH is not set +CONFIG_NET_VENDOR_GOOGLE=y +# CONFIG_GVE is not set +CONFIG_NET_VENDOR_HISILICON=y +CONFIG_HIX5HD2_GMAC=m +CONFIG_HISI_FEMAC=m +CONFIG_HIP04_ETH=m +# CONFIG_HI13X1_GMAC is not set +CONFIG_HNS_MDIO=m +# CONFIG_HNS_DSAF is not set +# CONFIG_HNS_ENET is not set +# CONFIG_HNS3 is not set +CONFIG_NET_VENDOR_HUAWEI=y +CONFIG_HINIC=m +CONFIG_NET_VENDOR_I825XX=y +CONFIG_NET_VENDOR_INTEL=y +CONFIG_E100=m +CONFIG_E1000=m +CONFIG_E1000E=m +CONFIG_IGB=m +CONFIG_IGB_HWMON=y +CONFIG_IGBVF=m +CONFIG_IXGBE=m +CONFIG_IXGBE_HWMON=y +CONFIG_IXGBE_DCB=y +CONFIG_IXGBE_IPSEC=y +CONFIG_IXGBEVF=m +CONFIG_IXGBEVF_IPSEC=y +CONFIG_I40E=m +CONFIG_I40E_DCB=y +CONFIG_IAVF=m +CONFIG_I40EVF=m +CONFIG_ICE=m +CONFIG_ICE_SWITCHDEV=y +# CONFIG_FM10K is not set +# CONFIG_IGC is not set +CONFIG_JME=m +CONFIG_NET_VENDOR_ADI=y +# CONFIG_ADIN1110 is not set +CONFIG_NET_VENDOR_LITEX=y +# CONFIG_LITEX_LITEETH is not set +CONFIG_NET_VENDOR_MARVELL=y +CONFIG_MVMDIO=m +CONFIG_MVNETA=m +CONFIG_MVPP2=m +# CONFIG_MVPP2_PTP is not set +CONFIG_SKGE=m +# CONFIG_SKGE_DEBUG is not set +CONFIG_SKGE_GENESIS=y +CONFIG_SKY2=m +# CONFIG_SKY2_DEBUG is not set +# CONFIG_OCTEONTX2_AF is not set +# CONFIG_OCTEONTX2_PF is not set +# CONFIG_OCTEON_EP is not set +# CONFIG_PRESTERA is not set +CONFIG_NET_VENDOR_MELLANOX=y +CONFIG_MLX4_EN=m +CONFIG_MLX4_EN_DCB=y +CONFIG_MLX4_CORE=m +CONFIG_MLX4_DEBUG=y +CONFIG_MLX4_CORE_GEN2=y +CONFIG_MLX5_CORE=m +CONFIG_MLX5_FPGA=y +CONFIG_MLX5_CORE_EN=y +CONFIG_MLX5_EN_ARFS=y +CONFIG_MLX5_EN_RXNFC=y +CONFIG_MLX5_MPFS=y +CONFIG_MLX5_ESWITCH=y +CONFIG_MLX5_BRIDGE=y +CONFIG_MLX5_CORE_EN_DCB=y +CONFIG_MLX5_CORE_IPOIB=y +# CONFIG_MLX5_MACSEC is not set +# CONFIG_MLX5_EN_IPSEC is not set +CONFIG_MLX5_SW_STEERING=y +# CONFIG_MLX5_SF is not set +# CONFIG_MLXSW_CORE is not set +CONFIG_MLXFW=m +# CONFIG_MLXBF_GIGE is not set +CONFIG_NET_VENDOR_MICREL=y +# CONFIG_KS8842 is not set +# CONFIG_KS8851 is not set +# CONFIG_KS8851_MLL is not set +CONFIG_KSZ884X_PCI=m +CONFIG_NET_VENDOR_MICROCHIP=y +# CONFIG_ENC28J60 is not set +# CONFIG_ENCX24J600 is not set +CONFIG_LAN743X=m +# CONFIG_LAN966X_SWITCH is not set +# CONFIG_VCAP is not set +CONFIG_NET_VENDOR_MICROSEMI=y +# CONFIG_MSCC_OCELOT_SWITCH is not set +CONFIG_NET_VENDOR_MICROSOFT=y +# CONFIG_MICROSOFT_MANA is not set +CONFIG_NET_VENDOR_MYRI=y +CONFIG_MYRI10GE=m +CONFIG_FEALNX=m +CONFIG_NET_VENDOR_NI=y +# CONFIG_NI_XGE_MANAGEMENT_ENET is not set +CONFIG_NET_VENDOR_NATSEMI=y +CONFIG_NATSEMI=m +CONFIG_NS83820=m +CONFIG_NET_VENDOR_NETERION=y +CONFIG_S2IO=m +CONFIG_NET_VENDOR_NETRONOME=y +CONFIG_NFP=m +CONFIG_NFP_APP_FLOWER=y +CONFIG_NFP_APP_ABM_NIC=y +CONFIG_NFP_NET_IPSEC=y +# CONFIG_NFP_DEBUG is not set +CONFIG_NET_VENDOR_8390=y +CONFIG_NE2K_PCI=m +CONFIG_NET_VENDOR_NVIDIA=y +# CONFIG_FORCEDETH is not set +CONFIG_NET_VENDOR_OKI=y +# CONFIG_ETHOC is not set +CONFIG_NET_VENDOR_PACKET_ENGINES=y +CONFIG_HAMACHI=m +CONFIG_YELLOWFIN=m +CONFIG_NET_VENDOR_PENSANDO=y +# CONFIG_IONIC is not set +CONFIG_NET_VENDOR_QLOGIC=y +CONFIG_QLA3XXX=m +CONFIG_QLCNIC=m +CONFIG_QLCNIC_SRIOV=y +CONFIG_QLCNIC_DCB=y +CONFIG_QLCNIC_HWMON=y +CONFIG_NETXEN_NIC=m +CONFIG_QED=m +CONFIG_QED_LL2=y +CONFIG_QED_SRIOV=y +CONFIG_QEDE=m +CONFIG_QED_RDMA=y +CONFIG_QED_ISCSI=y +CONFIG_QED_FCOE=y +CONFIG_QED_OOO=y +CONFIG_NET_VENDOR_BROCADE=y +CONFIG_BNA=m +CONFIG_NET_VENDOR_QUALCOMM=y +# CONFIG_QCA7000_SPI is not set +# CONFIG_QCA7000_UART is not set +CONFIG_QCOM_EMAC=m +# CONFIG_RMNET is not set +CONFIG_NET_VENDOR_RDC=y +CONFIG_R6040=m +CONFIG_NET_VENDOR_REALTEK=y +CONFIG_8139CP=m +CONFIG_8139TOO=m +# CONFIG_8139TOO_PIO is not set +CONFIG_8139TOO_TUNE_TWISTER=y +CONFIG_8139TOO_8129=y +# CONFIG_8139_OLD_RX_RESET is not set +CONFIG_R8169=m +CONFIG_NET_VENDOR_RENESAS=y +CONFIG_NET_VENDOR_ROCKER=y +# CONFIG_ROCKER is not set +CONFIG_NET_VENDOR_SAMSUNG=y +# CONFIG_SXGBE_ETH is not set +# CONFIG_NET_VENDOR_SEEQ is not set +CONFIG_NET_VENDOR_SILAN=y +CONFIG_SC92031=m +CONFIG_NET_VENDOR_SIS=y +# CONFIG_SIS900 is not set +CONFIG_SIS190=m +CONFIG_NET_VENDOR_SOLARFLARE=y +CONFIG_SFC=m +CONFIG_SFC_MTD=y +CONFIG_SFC_MCDI_MON=y +CONFIG_SFC_SRIOV=y +CONFIG_SFC_MCDI_LOGGING=y +CONFIG_SFC_FALCON=m +CONFIG_SFC_FALCON_MTD=y +# CONFIG_SFC_SIENA is not set +CONFIG_NET_VENDOR_SMSC=y +CONFIG_SMC91X=m +CONFIG_EPIC100=m +CONFIG_SMSC911X=m +CONFIG_SMSC9420=m +CONFIG_NET_VENDOR_SOCIONEXT=y +CONFIG_SNI_NETSEC=m +CONFIG_NET_VENDOR_STMICRO=y +CONFIG_STMMAC_ETH=m +# CONFIG_STMMAC_SELFTESTS is not set +CONFIG_STMMAC_PLATFORM=m +# CONFIG_DWMAC_DWC_QOS_ETH is not set +CONFIG_DWMAC_GENERIC=m +CONFIG_DWMAC_IPQ806X=m +CONFIG_DWMAC_MESON=m +CONFIG_DWMAC_QCOM_ETHQOS=m +CONFIG_DWMAC_ROCKCHIP=m +CONFIG_DWMAC_SUNXI=m +CONFIG_DWMAC_SUN8I=m +# CONFIG_DWMAC_INTEL_PLAT is not set +# CONFIG_DWMAC_TEGRA is not set +# CONFIG_STMMAC_PCI is not set +CONFIG_NET_VENDOR_SUN=y +# CONFIG_HAPPYMEAL is not set +# CONFIG_SUNGEM is not set +CONFIG_CASSINI=m +CONFIG_NIU=m +CONFIG_NET_VENDOR_SYNOPSYS=y +# CONFIG_DWC_XLGMAC is not set +CONFIG_NET_VENDOR_TEHUTI=y +CONFIG_TEHUTI=m +CONFIG_NET_VENDOR_TI=y +# CONFIG_TI_CPSW_PHY_SEL is not set +CONFIG_TLAN=m +CONFIG_NET_VENDOR_VERTEXCOM=y +# CONFIG_MSE102X is not set +CONFIG_NET_VENDOR_VIA=y +# CONFIG_VIA_RHINE is not set +CONFIG_VIA_VELOCITY=m +CONFIG_NET_VENDOR_WANGXUN=y +# CONFIG_NGBE is not set +# CONFIG_TXGBE is not set +CONFIG_NET_VENDOR_WIZNET=y +# CONFIG_WIZNET_W5100 is not set +# CONFIG_WIZNET_W5300 is not set +CONFIG_NET_VENDOR_XILINX=y +# CONFIG_XILINX_EMACLITE is not set +# CONFIG_XILINX_AXI_EMAC is not set +# CONFIG_XILINX_LL_TEMAC is not set +CONFIG_FDDI=y +CONFIG_DEFXX=m +CONFIG_SKFP=m +# CONFIG_HIPPI is not set +# CONFIG_NET_SB1000 is not set +CONFIG_PHYLINK=m +CONFIG_PHYLIB=m +CONFIG_SWPHY=y +CONFIG_LED_TRIGGER_PHY=y +CONFIG_PHYLIB_LEDS=y +CONFIG_FIXED_PHY=m +CONFIG_SFP=m + +# +# MII PHY device drivers +# +CONFIG_AMD_PHY=m +CONFIG_MESON_GXL_PHY=m +CONFIG_ADIN_PHY=m +# CONFIG_ADIN1100_PHY is not set +CONFIG_AQUANTIA_PHY=m +CONFIG_AX88796B_PHY=m +CONFIG_BROADCOM_PHY=m +# CONFIG_BCM54140_PHY is not set +# CONFIG_BCM7XXX_PHY is not set +# CONFIG_BCM84881_PHY is not set +CONFIG_BCM87XX_PHY=m +CONFIG_BCM_NET_PHYLIB=m +CONFIG_CICADA_PHY=m +CONFIG_CORTINA_PHY=m +CONFIG_DAVICOM_PHY=m +CONFIG_ICPLUS_PHY=m +CONFIG_LXT_PHY=m +# CONFIG_INTEL_XWAY_PHY is not set +CONFIG_LSI_ET1011C_PHY=m +CONFIG_MARVELL_PHY=m +CONFIG_MARVELL_10G_PHY=m +# CONFIG_MARVELL_88Q2XXX_PHY is not set +# CONFIG_MARVELL_88X2222_PHY is not set +# CONFIG_MAXLINEAR_GPHY is not set +# CONFIG_MEDIATEK_GE_PHY is not set +CONFIG_MICREL_PHY=m +# CONFIG_MICROCHIP_T1S_PHY is not set +CONFIG_MICROCHIP_PHY=m +CONFIG_MICROCHIP_T1_PHY=m +CONFIG_MICROSEMI_PHY=m +# CONFIG_MOTORCOMM_PHY is not set +CONFIG_NATIONAL_PHY=m +# CONFIG_NXP_CBTX_PHY is not set +# CONFIG_NXP_C45_TJA11XX_PHY is not set +# CONFIG_NXP_TJA11XX_PHY is not set +# CONFIG_NCN26000_PHY is not set +CONFIG_AT803X_PHY=m +CONFIG_QSEMI_PHY=m +CONFIG_REALTEK_PHY=m +CONFIG_RENESAS_PHY=m +CONFIG_ROCKCHIP_PHY=m +CONFIG_SMSC_PHY=m +CONFIG_STE10XP=m +CONFIG_TERANETICS_PHY=m +CONFIG_DP83822_PHY=m +CONFIG_DP83TC811_PHY=m +CONFIG_DP83848_PHY=m +CONFIG_DP83867_PHY=m +# CONFIG_DP83869_PHY is not set +# CONFIG_DP83TD510_PHY is not set +CONFIG_VITESSE_PHY=m +# CONFIG_XILINX_GMII2RGMII is not set +# CONFIG_MICREL_KS8995MA is not set +# CONFIG_PSE_CONTROLLER is not set +CONFIG_CAN_DEV=m +CONFIG_CAN_VCAN=m +CONFIG_CAN_VXCAN=m +CONFIG_CAN_NETLINK=y +CONFIG_CAN_CALC_BITTIMING=y +CONFIG_CAN_RX_OFFLOAD=y +# CONFIG_CAN_CAN327 is not set +# CONFIG_CAN_FLEXCAN is not set +# CONFIG_CAN_GRCAN is not set +# CONFIG_CAN_KVASER_PCIEFD is not set +CONFIG_CAN_SLCAN=m +# CONFIG_CAN_XILINXCAN is not set +# CONFIG_CAN_C_CAN is not set +# CONFIG_CAN_CC770 is not set +# CONFIG_CAN_CTUCANFD_PCI is not set +# CONFIG_CAN_CTUCANFD_PLATFORM is not set +# CONFIG_CAN_IFI_CANFD is not set +# CONFIG_CAN_M_CAN is not set +CONFIG_CAN_PEAK_PCIEFD=m +CONFIG_CAN_SJA1000=m +CONFIG_CAN_EMS_PCI=m +# CONFIG_CAN_F81601 is not set +CONFIG_CAN_KVASER_PCI=m +CONFIG_CAN_PEAK_PCI=m +CONFIG_CAN_PEAK_PCIEC=y +CONFIG_CAN_PLX_PCI=m +CONFIG_CAN_SJA1000_ISA=m +# CONFIG_CAN_SJA1000_PLATFORM is not set +CONFIG_CAN_SOFTING=m + +# +# CAN SPI interfaces +# +# CONFIG_CAN_HI311X is not set +# CONFIG_CAN_MCP251X is not set +# CONFIG_CAN_MCP251XFD is not set +# end of CAN SPI interfaces + +# +# CAN USB interfaces +# +CONFIG_CAN_8DEV_USB=m +CONFIG_CAN_EMS_USB=m +# CONFIG_CAN_ESD_USB is not set +# CONFIG_CAN_ETAS_ES58X is not set +# CONFIG_CAN_F81604 is not set +CONFIG_CAN_GS_USB=m +CONFIG_CAN_KVASER_USB=m +CONFIG_CAN_MCBA_USB=m +CONFIG_CAN_PEAK_USB=m +CONFIG_CAN_UCAN=m +# end of CAN USB interfaces + +# CONFIG_CAN_DEBUG_DEVICES is not set +CONFIG_MDIO_DEVICE=m +CONFIG_MDIO_BUS=m +CONFIG_FWNODE_MDIO=m +CONFIG_OF_MDIO=m +CONFIG_ACPI_MDIO=m +CONFIG_MDIO_DEVRES=m +# CONFIG_MDIO_SUN4I is not set +CONFIG_MDIO_XGENE=m +# CONFIG_MDIO_BITBANG is not set +# CONFIG_MDIO_BCM_UNIMAC is not set +CONFIG_MDIO_CAVIUM=m +CONFIG_MDIO_HISI_FEMAC=m +CONFIG_MDIO_I2C=m +# CONFIG_MDIO_MVUSB is not set +# CONFIG_MDIO_MSCC_MIIM is not set +# CONFIG_MDIO_OCTEON is not set +# CONFIG_MDIO_IPQ4019 is not set +# CONFIG_MDIO_IPQ8064 is not set +CONFIG_MDIO_THUNDER=m + +# +# MDIO Multiplexers +# +CONFIG_MDIO_BUS_MUX=m +CONFIG_MDIO_BUS_MUX_MESON_G12A=m +CONFIG_MDIO_BUS_MUX_MESON_GXL=m +# CONFIG_MDIO_BUS_MUX_GPIO is not set +# CONFIG_MDIO_BUS_MUX_MULTIPLEXER is not set +CONFIG_MDIO_BUS_MUX_MMIOREG=m + +# +# PCS device drivers +# +CONFIG_PCS_XPCS=m +# end of PCS device drivers + +# CONFIG_PLIP is not set +CONFIG_PPP=m +CONFIG_PPP_BSDCOMP=m +CONFIG_PPP_DEFLATE=m +CONFIG_PPP_FILTER=y +CONFIG_PPP_MPPE=m +CONFIG_PPP_MULTILINK=y +CONFIG_PPPOATM=m +CONFIG_PPPOE=m +# CONFIG_PPPOE_HASH_BITS_1 is not set +# CONFIG_PPPOE_HASH_BITS_2 is not set +CONFIG_PPPOE_HASH_BITS_4=y +# CONFIG_PPPOE_HASH_BITS_8 is not set +CONFIG_PPPOE_HASH_BITS=4 +CONFIG_PPTP=m +CONFIG_PPPOL2TP=m +CONFIG_PPP_ASYNC=m +CONFIG_PPP_SYNC_TTY=m +CONFIG_SLIP=m +CONFIG_SLHC=m +CONFIG_SLIP_COMPRESSED=y +CONFIG_SLIP_SMART=y +CONFIG_SLIP_MODE_SLIP6=y + +# +# Host-side USB support is needed for USB Network Adapter support +# +CONFIG_USB_NET_DRIVERS=m +CONFIG_USB_CATC=m +CONFIG_USB_KAWETH=m +CONFIG_USB_PEGASUS=m +CONFIG_USB_RTL8150=m +CONFIG_USB_RTL8152=m +CONFIG_USB_LAN78XX=m +CONFIG_USB_USBNET=m +CONFIG_USB_NET_AX8817X=m +CONFIG_USB_NET_AX88179_178A=m +CONFIG_USB_NET_CDCETHER=m +CONFIG_USB_NET_CDC_EEM=m +CONFIG_USB_NET_CDC_NCM=m +CONFIG_USB_NET_HUAWEI_CDC_NCM=m +CONFIG_USB_NET_CDC_MBIM=m +CONFIG_USB_NET_DM9601=m +CONFIG_USB_NET_SR9700=m +CONFIG_USB_NET_SR9800=m +CONFIG_USB_NET_SMSC75XX=m +CONFIG_USB_NET_SMSC95XX=m +CONFIG_USB_NET_GL620A=m +CONFIG_USB_NET_NET1080=m +CONFIG_USB_NET_PLUSB=m +CONFIG_USB_NET_MCS7830=m +CONFIG_USB_NET_RNDIS_HOST=m +CONFIG_USB_NET_CDC_SUBSET_ENABLE=m +CONFIG_USB_NET_CDC_SUBSET=m +CONFIG_USB_ALI_M5632=y +CONFIG_USB_AN2720=y +CONFIG_USB_BELKIN=y +CONFIG_USB_ARMLINUX=y +CONFIG_USB_EPSON2888=y +CONFIG_USB_KC2190=y +CONFIG_USB_NET_ZAURUS=m +CONFIG_USB_NET_CX82310_ETH=m +CONFIG_USB_NET_KALMIA=m +CONFIG_USB_NET_QMI_WWAN=m +CONFIG_USB_HSO=m +CONFIG_USB_NET_INT51X1=m +CONFIG_USB_CDC_PHONET=m +CONFIG_USB_IPHETH=m +CONFIG_USB_SIERRA_NET=m +CONFIG_USB_VL600=m +CONFIG_USB_NET_CH9200=m +# CONFIG_USB_NET_AQC111 is not set +CONFIG_USB_RTL8153_ECM=m +CONFIG_WLAN=y +CONFIG_WLAN_VENDOR_ADMTEK=y +CONFIG_ADM8211=m +CONFIG_ATH_COMMON=m +CONFIG_WLAN_VENDOR_ATH=y +# CONFIG_ATH_DEBUG is not set +CONFIG_ATH5K=m +# CONFIG_ATH5K_DEBUG is not set +# CONFIG_ATH5K_TRACER is not set +CONFIG_ATH5K_PCI=y +CONFIG_ATH9K_HW=m +CONFIG_ATH9K_COMMON=m +CONFIG_ATH9K_BTCOEX_SUPPORT=y +CONFIG_ATH9K=m +CONFIG_ATH9K_PCI=y +# CONFIG_ATH9K_AHB is not set +# CONFIG_ATH9K_DEBUGFS is not set +# CONFIG_ATH9K_DYNACK is not set +# CONFIG_ATH9K_WOW is not set +CONFIG_ATH9K_RFKILL=y +CONFIG_ATH9K_CHANNEL_CONTEXT=y +CONFIG_ATH9K_PCOEM=y +CONFIG_ATH9K_PCI_NO_EEPROM=m +CONFIG_ATH9K_HTC=m +# CONFIG_ATH9K_HTC_DEBUGFS is not set +# CONFIG_ATH9K_HWRNG is not set +CONFIG_CARL9170=m +CONFIG_CARL9170_LEDS=y +CONFIG_CARL9170_WPC=y +# CONFIG_CARL9170_HWRNG is not set +CONFIG_ATH6KL=m +CONFIG_ATH6KL_SDIO=m +CONFIG_ATH6KL_USB=m +# CONFIG_ATH6KL_DEBUG is not set +# CONFIG_ATH6KL_TRACING is not set +CONFIG_AR5523=m +CONFIG_WIL6210=m +CONFIG_WIL6210_ISR_COR=y +CONFIG_WIL6210_TRACING=y +CONFIG_WIL6210_DEBUGFS=y +CONFIG_ATH10K=m +CONFIG_ATH10K_CE=y +CONFIG_ATH10K_PCI=m +# CONFIG_ATH10K_AHB is not set +# CONFIG_ATH10K_SDIO is not set +CONFIG_ATH10K_USB=m +# CONFIG_ATH10K_DEBUG is not set +# CONFIG_ATH10K_DEBUGFS is not set +# CONFIG_ATH10K_TRACING is not set +CONFIG_WCN36XX=m +# CONFIG_WCN36XX_DEBUGFS is not set +# CONFIG_ATH11K is not set +# CONFIG_ATH12K is not set +CONFIG_WLAN_VENDOR_ATMEL=y +# CONFIG_ATMEL is not set +CONFIG_AT76C50X_USB=m +CONFIG_WLAN_VENDOR_BROADCOM=y +CONFIG_B43=m +CONFIG_B43_BCMA=y +CONFIG_B43_SSB=y +CONFIG_B43_BUSES_BCMA_AND_SSB=y +# CONFIG_B43_BUSES_BCMA is not set +# CONFIG_B43_BUSES_SSB is not set +CONFIG_B43_PCI_AUTOSELECT=y +CONFIG_B43_PCICORE_AUTOSELECT=y +CONFIG_B43_SDIO=y +CONFIG_B43_BCMA_PIO=y +CONFIG_B43_PIO=y +CONFIG_B43_PHY_G=y +CONFIG_B43_PHY_N=y +CONFIG_B43_PHY_LP=y +CONFIG_B43_PHY_HT=y +CONFIG_B43_LEDS=y +CONFIG_B43_HWRNG=y +# CONFIG_B43_DEBUG is not set +CONFIG_B43LEGACY=m +CONFIG_B43LEGACY_PCI_AUTOSELECT=y +CONFIG_B43LEGACY_PCICORE_AUTOSELECT=y +CONFIG_B43LEGACY_LEDS=y +CONFIG_B43LEGACY_HWRNG=y +CONFIG_B43LEGACY_DEBUG=y +CONFIG_B43LEGACY_DMA=y +CONFIG_B43LEGACY_PIO=y +CONFIG_B43LEGACY_DMA_AND_PIO_MODE=y +# CONFIG_B43LEGACY_DMA_MODE is not set +# CONFIG_B43LEGACY_PIO_MODE is not set +CONFIG_BRCMUTIL=m +CONFIG_BRCMSMAC=m +CONFIG_BRCMFMAC=m +CONFIG_BRCMFMAC_PROTO_BCDC=y +CONFIG_BRCMFMAC_PROTO_MSGBUF=y +CONFIG_BRCMFMAC_SDIO=y +CONFIG_BRCMFMAC_USB=y +CONFIG_BRCMFMAC_PCIE=y +# CONFIG_BRCM_TRACING is not set +# CONFIG_BRCMDBG is not set +CONFIG_WLAN_VENDOR_CISCO=y +# CONFIG_AIRO is not set +CONFIG_WLAN_VENDOR_INTEL=y +# CONFIG_IPW2100 is not set +CONFIG_IPW2200=m +CONFIG_IPW2200_MONITOR=y +CONFIG_IPW2200_RADIOTAP=y +CONFIG_IPW2200_PROMISCUOUS=y +CONFIG_IPW2200_QOS=y +# CONFIG_IPW2200_DEBUG is not set +CONFIG_LIBIPW=m +# CONFIG_LIBIPW_DEBUG is not set +CONFIG_IWLEGACY=m +CONFIG_IWL4965=m +CONFIG_IWL3945=m + +# +# iwl3945 / iwl4965 Debugging Options +# +# CONFIG_IWLEGACY_DEBUG is not set +# end of iwl3945 / iwl4965 Debugging Options + +CONFIG_IWLWIFI=m +CONFIG_IWLWIFI_LEDS=y +CONFIG_IWLDVM=m +CONFIG_IWLMVM=m +CONFIG_IWLWIFI_OPMODE_MODULAR=y + +# +# Debugging Options +# +# CONFIG_IWLWIFI_DEBUG is not set +# CONFIG_IWLWIFI_DEVICE_TRACING is not set +# end of Debugging Options + +CONFIG_WLAN_VENDOR_INTERSIL=y +CONFIG_HOSTAP=m +CONFIG_HOSTAP_FIRMWARE=y +# CONFIG_HOSTAP_FIRMWARE_NVRAM is not set +CONFIG_HOSTAP_PLX=m +CONFIG_HOSTAP_PCI=m +# CONFIG_HERMES is not set +CONFIG_P54_COMMON=m +CONFIG_P54_USB=m +CONFIG_P54_PCI=m +# CONFIG_P54_SPI is not set +CONFIG_P54_LEDS=y +CONFIG_WLAN_VENDOR_MARVELL=y +CONFIG_LIBERTAS=m +CONFIG_LIBERTAS_USB=m +CONFIG_LIBERTAS_SDIO=m +# CONFIG_LIBERTAS_SPI is not set +# CONFIG_LIBERTAS_DEBUG is not set +CONFIG_LIBERTAS_MESH=y +CONFIG_LIBERTAS_THINFIRM=m +# CONFIG_LIBERTAS_THINFIRM_DEBUG is not set +CONFIG_LIBERTAS_THINFIRM_USB=m +CONFIG_MWIFIEX=m +# CONFIG_MWIFIEX_SDIO is not set +CONFIG_MWIFIEX_PCIE=m +# CONFIG_MWIFIEX_USB is not set +CONFIG_MWL8K=m +CONFIG_WLAN_VENDOR_MEDIATEK=y +CONFIG_MT7601U=m +CONFIG_MT76_CORE=m +CONFIG_MT76_LEDS=y +CONFIG_MT76_USB=m +CONFIG_MT76x02_LIB=m +CONFIG_MT76x02_USB=m +CONFIG_MT76x0_COMMON=m +CONFIG_MT76x0U=m +# CONFIG_MT76x0E is not set +CONFIG_MT76x2_COMMON=m +CONFIG_MT76x2E=m +CONFIG_MT76x2U=m +# CONFIG_MT7603E is not set +# CONFIG_MT7615E is not set +# CONFIG_MT7663U is not set +# CONFIG_MT7663S is not set +# CONFIG_MT7915E is not set +# CONFIG_MT7921E is not set +# CONFIG_MT7921S is not set +# CONFIG_MT7921U is not set +# CONFIG_MT7996E is not set +CONFIG_WLAN_VENDOR_MICROCHIP=y +# CONFIG_WILC1000_SDIO is not set +# CONFIG_WILC1000_SPI is not set +CONFIG_WLAN_VENDOR_PURELIFI=y +# CONFIG_PLFXLC is not set +CONFIG_WLAN_VENDOR_RALINK=y +CONFIG_RT2X00=m +CONFIG_RT2400PCI=m +CONFIG_RT2500PCI=m +CONFIG_RT61PCI=m +CONFIG_RT2800PCI=m +CONFIG_RT2800PCI_RT33XX=y +CONFIG_RT2800PCI_RT35XX=y +CONFIG_RT2800PCI_RT53XX=y +CONFIG_RT2800PCI_RT3290=y +CONFIG_RT2500USB=m +CONFIG_RT73USB=m +CONFIG_RT2800USB=m +CONFIG_RT2800USB_RT33XX=y +CONFIG_RT2800USB_RT35XX=y +CONFIG_RT2800USB_RT3573=y +CONFIG_RT2800USB_RT53XX=y +CONFIG_RT2800USB_RT55XX=y +# CONFIG_RT2800USB_UNKNOWN is not set +CONFIG_RT2800_LIB=m +CONFIG_RT2800_LIB_MMIO=m +CONFIG_RT2X00_LIB_MMIO=m +CONFIG_RT2X00_LIB_PCI=m +CONFIG_RT2X00_LIB_USB=m +CONFIG_RT2X00_LIB=m +CONFIG_RT2X00_LIB_FIRMWARE=y +CONFIG_RT2X00_LIB_CRYPTO=y +CONFIG_RT2X00_LIB_LEDS=y +# CONFIG_RT2X00_DEBUG is not set +CONFIG_WLAN_VENDOR_REALTEK=y +CONFIG_RTL8180=m +CONFIG_RTL8187=m +CONFIG_RTL8187_LEDS=y +CONFIG_RTL_CARDS=m +CONFIG_RTL8192CE=m +CONFIG_RTL8192SE=m +CONFIG_RTL8192DE=m +CONFIG_RTL8723AE=m +CONFIG_RTL8723BE=m +CONFIG_RTL8188EE=m +CONFIG_RTL8192EE=m +CONFIG_RTL8821AE=m +CONFIG_RTL8192CU=m +CONFIG_RTLWIFI=m +CONFIG_RTLWIFI_PCI=m +CONFIG_RTLWIFI_USB=m +# CONFIG_RTLWIFI_DEBUG is not set +CONFIG_RTL8192C_COMMON=m +CONFIG_RTL8723_COMMON=m +CONFIG_RTLBTCOEXIST=m +CONFIG_RTL8XXXU=m +# CONFIG_RTL8XXXU_UNTESTED is not set +CONFIG_RTW88=m +CONFIG_RTW88_CORE=m +CONFIG_RTW88_PCI=m +CONFIG_RTW88_8822B=m +CONFIG_RTW88_8822C=m +CONFIG_RTW88_8822BE=m +# CONFIG_RTW88_8822BS is not set +# CONFIG_RTW88_8822BU is not set +CONFIG_RTW88_8822CE=m +# CONFIG_RTW88_8822CS is not set +# CONFIG_RTW88_8822CU is not set +# CONFIG_RTW88_8723DE is not set +# CONFIG_RTW88_8723DS is not set +# CONFIG_RTW88_8723DU is not set +# CONFIG_RTW88_8821CE is not set +# CONFIG_RTW88_8821CS is not set +# CONFIG_RTW88_8821CU is not set +# CONFIG_RTW88_DEBUG is not set +# CONFIG_RTW88_DEBUGFS is not set +# CONFIG_RTW89 is not set +CONFIG_WLAN_VENDOR_RSI=y +CONFIG_RSI_91X=m +CONFIG_RSI_DEBUGFS=y +# CONFIG_RSI_SDIO is not set +CONFIG_RSI_USB=m +CONFIG_RSI_COEX=y +CONFIG_WLAN_VENDOR_SILABS=y +# CONFIG_WFX is not set +CONFIG_WLAN_VENDOR_ST=y +# CONFIG_CW1200 is not set +CONFIG_WLAN_VENDOR_TI=y +CONFIG_WL1251=m +CONFIG_WL1251_SPI=m +CONFIG_WL1251_SDIO=m +CONFIG_WL12XX=m +CONFIG_WL18XX=m +CONFIG_WLCORE=m +CONFIG_WLCORE_SPI=m +CONFIG_WLCORE_SDIO=m +CONFIG_WLAN_VENDOR_ZYDAS=y +# CONFIG_USB_ZD1201 is not set +CONFIG_ZD1211RW=m +# CONFIG_ZD1211RW_DEBUG is not set +CONFIG_WLAN_VENDOR_QUANTENNA=y +# CONFIG_QTNFMAC_PCIE is not set +CONFIG_USB_NET_RNDIS_WLAN=m +CONFIG_MAC80211_HWSIM=m +# CONFIG_VIRT_WIFI is not set +# CONFIG_WAN is not set +CONFIG_IEEE802154_DRIVERS=m +CONFIG_IEEE802154_FAKELB=m +CONFIG_IEEE802154_AT86RF230=m +CONFIG_IEEE802154_MRF24J40=m +CONFIG_IEEE802154_CC2520=m +CONFIG_IEEE802154_ATUSB=m +CONFIG_IEEE802154_ADF7242=m +# CONFIG_IEEE802154_CA8210 is not set +# CONFIG_IEEE802154_MCR20A is not set +CONFIG_IEEE802154_HWSIM=m + +# +# Wireless WAN +# +# CONFIG_WWAN is not set +# end of Wireless WAN + +CONFIG_XEN_NETDEV_FRONTEND=m +CONFIG_XEN_NETDEV_BACKEND=m +# CONFIG_FUJITSU_ES is not set +CONFIG_HYPERV_NET=m +# CONFIG_NETDEVSIM is not set +CONFIG_NET_FAILOVER=m +# CONFIG_ISDN is not set + +# +# Input device support +# +CONFIG_INPUT=y +CONFIG_INPUT_LEDS=y +CONFIG_INPUT_FF_MEMLESS=m +CONFIG_INPUT_SPARSEKMAP=m +CONFIG_INPUT_MATRIXKMAP=m +CONFIG_INPUT_VIVALDIFMAP=y + +# +# Userland interfaces +# +CONFIG_INPUT_MOUSEDEV=y +CONFIG_INPUT_MOUSEDEV_PSAUX=y +CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 +CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 +CONFIG_INPUT_JOYDEV=m +CONFIG_INPUT_EVDEV=m +# CONFIG_INPUT_EVBUG is not set + +# +# Input Device Drivers +# +CONFIG_INPUT_KEYBOARD=y +# CONFIG_KEYBOARD_ADC is not set +CONFIG_KEYBOARD_ADP5588=m +# CONFIG_KEYBOARD_ADP5589 is not set +CONFIG_KEYBOARD_ATKBD=y +# CONFIG_KEYBOARD_QT1050 is not set +# CONFIG_KEYBOARD_QT1070 is not set +CONFIG_KEYBOARD_QT2160=m +# CONFIG_KEYBOARD_DLINK_DIR685 is not set +# CONFIG_KEYBOARD_LKKBD is not set +CONFIG_KEYBOARD_GPIO=m +# CONFIG_KEYBOARD_GPIO_POLLED is not set +# CONFIG_KEYBOARD_TCA6416 is not set +# CONFIG_KEYBOARD_TCA8418 is not set +# CONFIG_KEYBOARD_MATRIX is not set +CONFIG_KEYBOARD_LM8323=m +# CONFIG_KEYBOARD_LM8333 is not set +CONFIG_KEYBOARD_MAX7359=m +# CONFIG_KEYBOARD_MCS is not set +# CONFIG_KEYBOARD_MPR121 is not set +# CONFIG_KEYBOARD_NEWTON is not set +CONFIG_KEYBOARD_TEGRA=m +CONFIG_KEYBOARD_OPENCORES=m +# CONFIG_KEYBOARD_PINEPHONE is not set +# CONFIG_KEYBOARD_SAMSUNG is not set +CONFIG_KEYBOARD_STOWAWAY=m +# CONFIG_KEYBOARD_SUNKBD is not set +# CONFIG_KEYBOARD_SUN4I_LRADC is not set +# CONFIG_KEYBOARD_OMAP4 is not set +# CONFIG_KEYBOARD_TM2_TOUCHKEY is not set +# CONFIG_KEYBOARD_XTKBD is not set +CONFIG_KEYBOARD_CROS_EC=m +# CONFIG_KEYBOARD_CAP11XX is not set +# CONFIG_KEYBOARD_BCM is not set +# CONFIG_KEYBOARD_CYPRESS_SF is not set +CONFIG_INPUT_MOUSE=y +CONFIG_MOUSE_PS2=y +CONFIG_MOUSE_PS2_ALPS=y +CONFIG_MOUSE_PS2_BYD=y +CONFIG_MOUSE_PS2_LOGIPS2PP=y +CONFIG_MOUSE_PS2_SYNAPTICS=y +CONFIG_MOUSE_PS2_SYNAPTICS_SMBUS=y +CONFIG_MOUSE_PS2_CYPRESS=y +CONFIG_MOUSE_PS2_TRACKPOINT=y +CONFIG_MOUSE_PS2_ELANTECH=y +CONFIG_MOUSE_PS2_ELANTECH_SMBUS=y +CONFIG_MOUSE_PS2_SENTELIC=y +# CONFIG_MOUSE_PS2_TOUCHKIT is not set +CONFIG_MOUSE_PS2_FOCALTECH=y +CONFIG_MOUSE_PS2_SMBUS=y +# CONFIG_MOUSE_SERIAL is not set +# CONFIG_MOUSE_APPLETOUCH is not set +# CONFIG_MOUSE_BCM5974 is not set +# CONFIG_MOUSE_CYAPA is not set +CONFIG_MOUSE_ELAN_I2C=m +CONFIG_MOUSE_ELAN_I2C_I2C=y +# CONFIG_MOUSE_ELAN_I2C_SMBUS is not set +# CONFIG_MOUSE_VSXXXAA is not set +# CONFIG_MOUSE_GPIO is not set +CONFIG_MOUSE_SYNAPTICS_I2C=m +CONFIG_MOUSE_SYNAPTICS_USB=m +# CONFIG_INPUT_JOYSTICK is not set +CONFIG_INPUT_TABLET=y +CONFIG_TABLET_USB_ACECAD=m +CONFIG_TABLET_USB_AIPTEK=m +CONFIG_TABLET_USB_HANWANG=m +CONFIG_TABLET_USB_KBTAB=m +CONFIG_TABLET_USB_PEGASUS=m +CONFIG_TABLET_SERIAL_WACOM4=m +CONFIG_INPUT_TOUCHSCREEN=y +CONFIG_TOUCHSCREEN_ADS7846=m +CONFIG_TOUCHSCREEN_AD7877=m +CONFIG_TOUCHSCREEN_AD7879=m +CONFIG_TOUCHSCREEN_AD7879_I2C=m +# CONFIG_TOUCHSCREEN_AD7879_SPI is not set +# CONFIG_TOUCHSCREEN_ADC is not set +# CONFIG_TOUCHSCREEN_AR1021_I2C is not set +CONFIG_TOUCHSCREEN_ATMEL_MXT=m +# CONFIG_TOUCHSCREEN_ATMEL_MXT_T37 is not set +# CONFIG_TOUCHSCREEN_AUO_PIXCIR is not set +# CONFIG_TOUCHSCREEN_BU21013 is not set +# CONFIG_TOUCHSCREEN_BU21029 is not set +# CONFIG_TOUCHSCREEN_CHIPONE_ICN8318 is not set +# CONFIG_TOUCHSCREEN_CHIPONE_ICN8505 is not set +# CONFIG_TOUCHSCREEN_CY8CTMA140 is not set +# CONFIG_TOUCHSCREEN_CY8CTMG110 is not set +# CONFIG_TOUCHSCREEN_CYTTSP_CORE is not set +# CONFIG_TOUCHSCREEN_CYTTSP4_CORE is not set +# CONFIG_TOUCHSCREEN_CYTTSP5 is not set +CONFIG_TOUCHSCREEN_DYNAPRO=m +CONFIG_TOUCHSCREEN_HAMPSHIRE=m +CONFIG_TOUCHSCREEN_EETI=m +# CONFIG_TOUCHSCREEN_EGALAX is not set +# CONFIG_TOUCHSCREEN_EGALAX_SERIAL is not set +# CONFIG_TOUCHSCREEN_EXC3000 is not set +CONFIG_TOUCHSCREEN_FUJITSU=m +CONFIG_TOUCHSCREEN_GOODIX=m +# CONFIG_TOUCHSCREEN_HIDEEP is not set +# CONFIG_TOUCHSCREEN_HYCON_HY46XX is not set +# CONFIG_TOUCHSCREEN_HYNITRON_CSTXXX is not set +# CONFIG_TOUCHSCREEN_ILI210X is not set +# CONFIG_TOUCHSCREEN_ILITEK is not set +# CONFIG_TOUCHSCREEN_S6SY761 is not set +CONFIG_TOUCHSCREEN_GUNZE=m +# CONFIG_TOUCHSCREEN_EKTF2127 is not set +CONFIG_TOUCHSCREEN_ELAN=m +CONFIG_TOUCHSCREEN_ELO=m +CONFIG_TOUCHSCREEN_WACOM_W8001=m +# CONFIG_TOUCHSCREEN_WACOM_I2C is not set +# CONFIG_TOUCHSCREEN_MAX11801 is not set +CONFIG_TOUCHSCREEN_MCS5000=m +# CONFIG_TOUCHSCREEN_MMS114 is not set +# CONFIG_TOUCHSCREEN_MELFAS_MIP4 is not set +# CONFIG_TOUCHSCREEN_MSG2638 is not set +CONFIG_TOUCHSCREEN_MTOUCH=m +# CONFIG_TOUCHSCREEN_NOVATEK_NVT_TS is not set +# CONFIG_TOUCHSCREEN_IMAGIS is not set +# CONFIG_TOUCHSCREEN_IMX6UL_TSC is not set +CONFIG_TOUCHSCREEN_INEXIO=m +CONFIG_TOUCHSCREEN_PENMOUNT=m +# CONFIG_TOUCHSCREEN_EDT_FT5X06 is not set +CONFIG_TOUCHSCREEN_TOUCHRIGHT=m +CONFIG_TOUCHSCREEN_TOUCHWIN=m +# CONFIG_TOUCHSCREEN_PIXCIR is not set +# CONFIG_TOUCHSCREEN_WDT87XX_I2C is not set +CONFIG_TOUCHSCREEN_WM97XX=m +CONFIG_TOUCHSCREEN_WM9705=y +CONFIG_TOUCHSCREEN_WM9712=y +CONFIG_TOUCHSCREEN_WM9713=y +CONFIG_TOUCHSCREEN_USB_COMPOSITE=m +CONFIG_TOUCHSCREEN_USB_EGALAX=y +CONFIG_TOUCHSCREEN_USB_PANJIT=y +CONFIG_TOUCHSCREEN_USB_3M=y +CONFIG_TOUCHSCREEN_USB_ITM=y +CONFIG_TOUCHSCREEN_USB_ETURBO=y +CONFIG_TOUCHSCREEN_USB_GUNZE=y +CONFIG_TOUCHSCREEN_USB_DMC_TSC10=y +CONFIG_TOUCHSCREEN_USB_IRTOUCH=y +CONFIG_TOUCHSCREEN_USB_IDEALTEK=y +CONFIG_TOUCHSCREEN_USB_GENERAL_TOUCH=y +CONFIG_TOUCHSCREEN_USB_GOTOP=y +CONFIG_TOUCHSCREEN_USB_JASTEC=y +CONFIG_TOUCHSCREEN_USB_ELO=y +CONFIG_TOUCHSCREEN_USB_E2I=y +CONFIG_TOUCHSCREEN_USB_ZYTRONIC=y +CONFIG_TOUCHSCREEN_USB_ETT_TC45USB=y +CONFIG_TOUCHSCREEN_USB_NEXIO=y +CONFIG_TOUCHSCREEN_USB_EASYTOUCH=y +CONFIG_TOUCHSCREEN_TOUCHIT213=m +# CONFIG_TOUCHSCREEN_TSC_SERIO is not set +# CONFIG_TOUCHSCREEN_TSC2004 is not set +# CONFIG_TOUCHSCREEN_TSC2005 is not set +CONFIG_TOUCHSCREEN_TSC2007=m +# CONFIG_TOUCHSCREEN_TSC2007_IIO is not set +# CONFIG_TOUCHSCREEN_RM_TS is not set +# CONFIG_TOUCHSCREEN_SILEAD is not set +# CONFIG_TOUCHSCREEN_SIS_I2C is not set +# CONFIG_TOUCHSCREEN_ST1232 is not set +# CONFIG_TOUCHSCREEN_STMFTS is not set +# CONFIG_TOUCHSCREEN_SUN4I is not set +CONFIG_TOUCHSCREEN_SUR40=m +# CONFIG_TOUCHSCREEN_SURFACE3_SPI is not set +# CONFIG_TOUCHSCREEN_SX8654 is not set +CONFIG_TOUCHSCREEN_TPS6507X=m +# CONFIG_TOUCHSCREEN_ZET6223 is not set +# CONFIG_TOUCHSCREEN_ZFORCE is not set +# CONFIG_TOUCHSCREEN_COLIBRI_VF50 is not set +# CONFIG_TOUCHSCREEN_ROHM_BU21023 is not set +# CONFIG_TOUCHSCREEN_IQS5XX is not set +# CONFIG_TOUCHSCREEN_IQS7211 is not set +# CONFIG_TOUCHSCREEN_ZINITIX is not set +# CONFIG_TOUCHSCREEN_HIMAX_HX83112B is not set +CONFIG_INPUT_MISC=y +# CONFIG_INPUT_AD714X is not set +# CONFIG_INPUT_ATMEL_CAPTOUCH is not set +# CONFIG_INPUT_BMA150 is not set +# CONFIG_INPUT_E3X0_BUTTON is not set +CONFIG_INPUT_PM8941_PWRKEY=m +# CONFIG_INPUT_PM8XXX_VIBRATOR is not set +# CONFIG_INPUT_MMA8450 is not set +# CONFIG_INPUT_GPIO_BEEPER is not set +# CONFIG_INPUT_GPIO_DECODER is not set +# CONFIG_INPUT_GPIO_VIBRA is not set +CONFIG_INPUT_ATI_REMOTE2=m +CONFIG_INPUT_KEYSPAN_REMOTE=m +# CONFIG_INPUT_KXTJ9 is not set +CONFIG_INPUT_POWERMATE=m +CONFIG_INPUT_YEALINK=m +CONFIG_INPUT_CM109=m +# CONFIG_INPUT_REGULATOR_HAPTIC is not set +CONFIG_INPUT_AXP20X_PEK=m +CONFIG_INPUT_UINPUT=m +# CONFIG_INPUT_PCF8574 is not set +# CONFIG_INPUT_PWM_BEEPER is not set +# CONFIG_INPUT_PWM_VIBRA is not set +# CONFIG_INPUT_GPIO_ROTARY_ENCODER is not set +# CONFIG_INPUT_DA7280_HAPTICS is not set +# CONFIG_INPUT_ADXL34X is not set +# CONFIG_INPUT_IBM_PANEL is not set +# CONFIG_INPUT_IMS_PCU is not set +# CONFIG_INPUT_IQS269A is not set +# CONFIG_INPUT_IQS626A is not set +# CONFIG_INPUT_IQS7222 is not set +# CONFIG_INPUT_CMA3000 is not set +CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y +# CONFIG_INPUT_SOC_BUTTON_ARRAY is not set +# CONFIG_INPUT_DRV260X_HAPTICS is not set +# CONFIG_INPUT_DRV2665_HAPTICS is not set +# CONFIG_INPUT_DRV2667_HAPTICS is not set +CONFIG_INPUT_HISI_POWERKEY=m +CONFIG_RMI4_CORE=m +# CONFIG_RMI4_I2C is not set +# CONFIG_RMI4_SPI is not set +# CONFIG_RMI4_SMB is not set +CONFIG_RMI4_F03=y +CONFIG_RMI4_F03_SERIO=m +CONFIG_RMI4_2D_SENSOR=y +CONFIG_RMI4_F11=y +CONFIG_RMI4_F12=y +CONFIG_RMI4_F30=y +CONFIG_RMI4_F34=y +# CONFIG_RMI4_F3A is not set +# CONFIG_RMI4_F54 is not set +CONFIG_RMI4_F55=y + +# +# Hardware I/O ports +# +CONFIG_SERIO=y +CONFIG_SERIO_SERPORT=y +# CONFIG_SERIO_PARKBD is not set +# CONFIG_SERIO_AMBAKMI is not set +# CONFIG_SERIO_PCIPS2 is not set +CONFIG_SERIO_LIBPS2=y +# CONFIG_SERIO_RAW is not set +CONFIG_SERIO_ALTERA_PS2=m +# CONFIG_SERIO_PS2MULT is not set +# CONFIG_SERIO_ARC_PS2 is not set +# CONFIG_SERIO_APBPS2 is not set +CONFIG_HYPERV_KEYBOARD=m +# CONFIG_SERIO_SUN4I_PS2 is not set +# CONFIG_SERIO_GPIO_PS2 is not set +# CONFIG_USERIO is not set +# CONFIG_GAMEPORT is not set +# end of Hardware I/O ports +# end of Input device support + +# +# Character devices +# +CONFIG_TTY=y +CONFIG_VT=y +CONFIG_CONSOLE_TRANSLATIONS=y +CONFIG_VT_CONSOLE=y +CONFIG_VT_CONSOLE_SLEEP=y +CONFIG_HW_CONSOLE=y +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_UNIX98_PTYS=y +# CONFIG_LEGACY_PTYS is not set +CONFIG_LEGACY_TIOCSTI=y +CONFIG_LDISC_AUTOLOAD=y + +# +# Serial drivers +# +CONFIG_SERIAL_EARLYCON=y +CONFIG_SERIAL_8250=y +# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set +CONFIG_SERIAL_8250_PNP=y +CONFIG_SERIAL_8250_16550A_VARIANTS=y +# CONFIG_SERIAL_8250_FINTEK is not set +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_DMA=y +CONFIG_SERIAL_8250_PCILIB=y +CONFIG_SERIAL_8250_PCI=y +CONFIG_SERIAL_8250_EXAR=m +CONFIG_SERIAL_8250_NR_UARTS=4 +CONFIG_SERIAL_8250_RUNTIME_UARTS=4 +CONFIG_SERIAL_8250_EXTENDED=y +# CONFIG_SERIAL_8250_MANY_PORTS is not set +# CONFIG_SERIAL_8250_PCI1XXXX is not set +CONFIG_SERIAL_8250_SHARE_IRQ=y +# CONFIG_SERIAL_8250_DETECT_IRQ is not set +# CONFIG_SERIAL_8250_RSA is not set +CONFIG_SERIAL_8250_DWLIB=y +CONFIG_SERIAL_8250_FSL=y +CONFIG_SERIAL_8250_DW=y +# CONFIG_SERIAL_8250_RT288X is not set +CONFIG_SERIAL_8250_PERICOM=y +CONFIG_SERIAL_8250_TEGRA=y +CONFIG_SERIAL_OF_PLATFORM=y + +# +# Non-8250 serial port support +# +CONFIG_SERIAL_AMBA_PL010=y +CONFIG_SERIAL_AMBA_PL010_CONSOLE=y +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +# CONFIG_SERIAL_EARLYCON_SEMIHOST is not set +CONFIG_SERIAL_MESON=y +CONFIG_SERIAL_MESON_CONSOLE=y +CONFIG_SERIAL_TEGRA=y +# CONFIG_SERIAL_TEGRA_TCU is not set +# CONFIG_SERIAL_MAX3100 is not set +# CONFIG_SERIAL_MAX310X is not set +# CONFIG_SERIAL_UARTLITE is not set +CONFIG_SERIAL_CORE=y +CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_JSM is not set +CONFIG_SERIAL_MSM=y +CONFIG_SERIAL_MSM_CONSOLE=y +# CONFIG_SERIAL_SIFIVE is not set +# CONFIG_SERIAL_SCCNXP is not set +# CONFIG_SERIAL_SC16IS7XX is not set +# CONFIG_SERIAL_ALTERA_JTAGUART is not set +# CONFIG_SERIAL_ALTERA_UART is not set +CONFIG_SERIAL_XILINX_PS_UART=y +CONFIG_SERIAL_XILINX_PS_UART_CONSOLE=y +# CONFIG_SERIAL_ARC is not set +CONFIG_SERIAL_RP2=m +CONFIG_SERIAL_RP2_NR_UARTS=32 +# CONFIG_SERIAL_FSL_LPUART is not set +# CONFIG_SERIAL_FSL_LINFLEXUART is not set +# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set +# CONFIG_SERIAL_SPRD is not set +CONFIG_SERIAL_MVEBU_UART=y +CONFIG_SERIAL_MVEBU_CONSOLE=y +# end of Serial drivers + +CONFIG_SERIAL_MCTRL_GPIO=y +# CONFIG_SERIAL_NONSTANDARD is not set +CONFIG_N_GSM=m +CONFIG_NOZOMI=m +# CONFIG_NULL_TTY is not set +CONFIG_HVC_DRIVER=y +CONFIG_HVC_IRQ=y +CONFIG_HVC_XEN=y +CONFIG_HVC_XEN_FRONTEND=y +# CONFIG_HVC_DCC is not set +# CONFIG_RPMSG_TTY is not set +CONFIG_SERIAL_DEV_BUS=m +CONFIG_TTY_PRINTK=m +CONFIG_TTY_PRINTK_LEVEL=6 +# CONFIG_PRINTER is not set +# CONFIG_PPDEV is not set +CONFIG_VIRTIO_CONSOLE=m +CONFIG_IPMI_HANDLER=m +CONFIG_IPMI_DMI_DECODE=y +CONFIG_IPMI_PLAT_DATA=y +# CONFIG_IPMI_PANIC_EVENT is not set +CONFIG_IPMI_DEVICE_INTERFACE=m +CONFIG_IPMI_SI=m +CONFIG_IPMI_SSIF=m +# CONFIG_IPMI_IPMB is not set +CONFIG_IPMI_WATCHDOG=m +CONFIG_IPMI_POWEROFF=m +# CONFIG_SSIF_IPMI_BMC is not set +# CONFIG_IPMB_DEVICE_INTERFACE is not set +CONFIG_HW_RANDOM=m +# CONFIG_HW_RANDOM_TIMERIOMEM is not set +# CONFIG_HW_RANDOM_BA431 is not set +CONFIG_HW_RANDOM_OMAP=m +CONFIG_HW_RANDOM_VIRTIO=m +CONFIG_HW_RANDOM_HISI=m +CONFIG_HW_RANDOM_HISTB=m +CONFIG_HW_RANDOM_XGENE=m +CONFIG_HW_RANDOM_MESON=m +CONFIG_HW_RANDOM_CAVIUM=m +CONFIG_HW_RANDOM_OPTEE=m +# CONFIG_HW_RANDOM_CCTRNG is not set +# CONFIG_HW_RANDOM_XIPHERA is not set +CONFIG_HW_RANDOM_ARM_SMCCC_TRNG=m +CONFIG_HW_RANDOM_CN10K=m +# CONFIG_APPLICOM is not set +CONFIG_DEVMEM=y +CONFIG_DEVPORT=y +CONFIG_TCG_TPM=m +CONFIG_HW_RANDOM_TPM=y +CONFIG_TCG_TIS_CORE=m +CONFIG_TCG_TIS=m +CONFIG_TCG_TIS_SPI=m +# CONFIG_TCG_TIS_SPI_CR50 is not set +# CONFIG_TCG_TIS_I2C is not set +# CONFIG_TCG_TIS_SYNQUACER is not set +# CONFIG_TCG_TIS_I2C_CR50 is not set +# CONFIG_TCG_TIS_I2C_ATMEL is not set +CONFIG_TCG_TIS_I2C_INFINEON=m +# CONFIG_TCG_TIS_I2C_NUVOTON is not set +CONFIG_TCG_ATMEL=m +# CONFIG_TCG_INFINEON is not set +# CONFIG_TCG_XEN is not set +CONFIG_TCG_CRB=m +CONFIG_TCG_VTPM_PROXY=m +# CONFIG_TCG_FTPM_TEE is not set +# CONFIG_TCG_TIS_ST33ZP24_I2C is not set +# CONFIG_TCG_TIS_ST33ZP24_SPI is not set +# CONFIG_XILLYBUS is not set +# CONFIG_XILLYUSB is not set +# end of Character devices + +# +# I2C support +# +CONFIG_I2C=y +CONFIG_ACPI_I2C_OPREGION=y +CONFIG_I2C_BOARDINFO=y +CONFIG_I2C_COMPAT=y +CONFIG_I2C_CHARDEV=m +CONFIG_I2C_MUX=m + +# +# Multiplexer I2C Chip support +# +# CONFIG_I2C_ARB_GPIO_CHALLENGE is not set +# CONFIG_I2C_MUX_GPIO is not set +# CONFIG_I2C_MUX_GPMUX is not set +# CONFIG_I2C_MUX_LTC4306 is not set +# CONFIG_I2C_MUX_PCA9541 is not set +# CONFIG_I2C_MUX_PCA954x is not set +# CONFIG_I2C_MUX_PINCTRL is not set +# CONFIG_I2C_MUX_REG is not set +# CONFIG_I2C_DEMUX_PINCTRL is not set +# CONFIG_I2C_MUX_MLXCPLD is not set +# end of Multiplexer I2C Chip support + +CONFIG_I2C_HELPER_AUTO=y +CONFIG_I2C_SMBUS=m +CONFIG_I2C_ALGOBIT=m +CONFIG_I2C_ALGOPCA=m + +# +# I2C Hardware Bus support +# + +# +# PC SMBus host controller drivers +# +# CONFIG_I2C_ALI1535 is not set +# CONFIG_I2C_ALI1563 is not set +# CONFIG_I2C_ALI15X3 is not set +# CONFIG_I2C_AMD756 is not set +# CONFIG_I2C_AMD8111 is not set +# CONFIG_I2C_AMD_MP2 is not set +# CONFIG_I2C_HIX5HD2 is not set +# CONFIG_I2C_I801 is not set +CONFIG_I2C_ISCH=m +# CONFIG_I2C_PIIX4 is not set +# CONFIG_I2C_NFORCE2 is not set +# CONFIG_I2C_NVIDIA_GPU is not set +# CONFIG_I2C_SIS5595 is not set +# CONFIG_I2C_SIS630 is not set +# CONFIG_I2C_SIS96X is not set +# CONFIG_I2C_VIA is not set +# CONFIG_I2C_VIAPRO is not set + +# +# ACPI drivers +# +# CONFIG_I2C_SCMI is not set + +# +# I2C system bus drivers (mostly embedded / system-on-chip) +# +# CONFIG_I2C_CADENCE is not set +# CONFIG_I2C_CBUS_GPIO is not set +CONFIG_I2C_DESIGNWARE_CORE=m +# CONFIG_I2C_DESIGNWARE_SLAVE is not set +CONFIG_I2C_DESIGNWARE_PLATFORM=m +# CONFIG_I2C_DESIGNWARE_PCI is not set +# CONFIG_I2C_EMEV2 is not set +CONFIG_I2C_GPIO=m +# CONFIG_I2C_GPIO_FAULT_INJECTOR is not set +# CONFIG_I2C_HISI is not set +CONFIG_I2C_MESON=m +CONFIG_I2C_MV64XXX=m +# CONFIG_I2C_NOMADIK is not set +CONFIG_I2C_OCORES=m +CONFIG_I2C_PCA_PLATFORM=m +CONFIG_I2C_PXA=m +# CONFIG_I2C_PXA_SLAVE is not set +# CONFIG_I2C_QCOM_CCI is not set +CONFIG_I2C_QUP=m +CONFIG_I2C_RK3X=m +CONFIG_I2C_SIMTEC=m +# CONFIG_I2C_SYNQUACER is not set +CONFIG_I2C_TEGRA=m +CONFIG_I2C_TEGRA_BPMP=y +# CONFIG_I2C_VERSATILE is not set +CONFIG_I2C_THUNDERX=m +# CONFIG_I2C_XILINX is not set +CONFIG_I2C_XLP9XX=m + +# +# External I2C/SMBus adapter drivers +# +CONFIG_I2C_DIOLAN_U2C=m +# CONFIG_I2C_CP2615 is not set +# CONFIG_I2C_PARPORT is not set +# CONFIG_I2C_PCI1XXXX is not set +CONFIG_I2C_ROBOTFUZZ_OSIF=m +CONFIG_I2C_TAOS_EVM=m +CONFIG_I2C_TINY_USB=m +CONFIG_I2C_VIPERBOARD=m + +# +# Other I2C/SMBus bus drivers +# +# CONFIG_I2C_MLXCPLD is not set +CONFIG_I2C_CROS_EC_TUNNEL=m +CONFIG_I2C_XGENE_SLIMPRO=m +# CONFIG_I2C_VIRTIO is not set +# end of I2C Hardware Bus support + +# CONFIG_I2C_STUB is not set +CONFIG_I2C_SLAVE=y +# CONFIG_I2C_SLAVE_EEPROM is not set +# CONFIG_I2C_SLAVE_TESTUNIT is not set +# CONFIG_I2C_DEBUG_CORE is not set +# CONFIG_I2C_DEBUG_ALGO is not set +# CONFIG_I2C_DEBUG_BUS is not set +# end of I2C support + +# CONFIG_I3C is not set +CONFIG_SPI=y +# CONFIG_SPI_DEBUG is not set +CONFIG_SPI_MASTER=y +CONFIG_SPI_MEM=y + +# +# SPI Master Controller Drivers +# +# CONFIG_SPI_ALTERA is not set +# CONFIG_SPI_AMLOGIC_SPIFC_A1 is not set +CONFIG_SPI_ARMADA_3700=m +# CONFIG_SPI_AXI_SPI_ENGINE is not set +CONFIG_SPI_BITBANG=m +CONFIG_SPI_BUTTERFLY=m +# CONFIG_SPI_CADENCE is not set +# CONFIG_SPI_CADENCE_QUADSPI is not set +# CONFIG_SPI_CADENCE_XSPI is not set +# CONFIG_SPI_DESIGNWARE is not set +# CONFIG_SPI_HISI_KUNPENG is not set +# CONFIG_SPI_HISI_SFC_V3XX is not set +# CONFIG_SPI_GPIO is not set +CONFIG_SPI_LM70_LLP=m +# CONFIG_SPI_FSL_SPI is not set +# CONFIG_SPI_MESON_SPICC is not set +CONFIG_SPI_MESON_SPIFC=m +# CONFIG_SPI_MICROCHIP_CORE is not set +# CONFIG_SPI_MICROCHIP_CORE_QSPI is not set +# CONFIG_SPI_OC_TINY is not set +# CONFIG_SPI_ORION is not set +# CONFIG_SPI_PCI1XXXX is not set +# CONFIG_SPI_PL022 is not set +# CONFIG_SPI_PXA2XX is not set +CONFIG_SPI_ROCKCHIP=m +# CONFIG_SPI_ROCKCHIP_SFC is not set +# CONFIG_SPI_QCOM_QSPI is not set +CONFIG_SPI_QUP=m +# CONFIG_SPI_SC18IS602 is not set +# CONFIG_SPI_SIFIVE is not set +# CONFIG_SPI_SN_F_OSPI is not set +# CONFIG_SPI_SUN4I is not set +# CONFIG_SPI_SUN6I is not set +# CONFIG_SPI_SYNQUACER is not set +# CONFIG_SPI_MXIC is not set +CONFIG_SPI_TEGRA210_QUAD=y +CONFIG_SPI_TEGRA114=m +CONFIG_SPI_TEGRA20_SFLASH=m +CONFIG_SPI_TEGRA20_SLINK=m +CONFIG_SPI_THUNDERX=m +# CONFIG_SPI_XCOMM is not set +# CONFIG_SPI_XILINX is not set +CONFIG_SPI_XLP=m +# CONFIG_SPI_ZYNQMP_GQSPI is not set +# CONFIG_SPI_AMD is not set + +# +# SPI Multiplexer support +# +# CONFIG_SPI_MUX is not set + +# +# SPI Protocol Masters +# +CONFIG_SPI_SPIDEV=y +# CONFIG_SPI_LOOPBACK_TEST is not set +# CONFIG_SPI_TLE62X0 is not set +# CONFIG_SPI_SLAVE is not set +CONFIG_SPI_DYNAMIC=y +CONFIG_SPMI=y +# CONFIG_SPMI_HISI3670 is not set +CONFIG_SPMI_MSM_PMIC_ARB=y +# CONFIG_HSI is not set +CONFIG_PPS=m +# CONFIG_PPS_DEBUG is not set + +# +# PPS clients support +# +# CONFIG_PPS_CLIENT_KTIMER is not set +CONFIG_PPS_CLIENT_LDISC=m +CONFIG_PPS_CLIENT_PARPORT=m +# CONFIG_PPS_CLIENT_GPIO is not set + +# +# PPS generators support +# + +# +# PTP clock support +# +CONFIG_PTP_1588_CLOCK=m +CONFIG_PTP_1588_CLOCK_OPTIONAL=m + +# +# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks. +# +CONFIG_PTP_1588_CLOCK_KVM=m +# CONFIG_PTP_1588_CLOCK_IDT82P33 is not set +# CONFIG_PTP_1588_CLOCK_IDTCM is not set +# CONFIG_PTP_1588_CLOCK_MOCK is not set +# CONFIG_PTP_1588_CLOCK_OCP is not set +# end of PTP clock support + +CONFIG_PINCTRL=y +CONFIG_GENERIC_PINCTRL_GROUPS=y +CONFIG_PINMUX=y +CONFIG_GENERIC_PINMUX_FUNCTIONS=y +CONFIG_PINCONF=y +CONFIG_GENERIC_PINCONF=y +# CONFIG_DEBUG_PINCTRL is not set +CONFIG_PINCTRL_AMD=y +CONFIG_PINCTRL_AXP209=m +# CONFIG_PINCTRL_CY8C95X0 is not set +CONFIG_PINCTRL_MAX77620=y +# CONFIG_PINCTRL_MCP23S08 is not set +# CONFIG_PINCTRL_MICROCHIP_SGPIO is not set +# CONFIG_PINCTRL_OCELOT is not set +CONFIG_PINCTRL_ROCKCHIP=y +CONFIG_PINCTRL_SINGLE=y +# CONFIG_PINCTRL_STMFX is not set +# CONFIG_PINCTRL_SX150X is not set +CONFIG_PINCTRL_ZYNQMP=y +CONFIG_PINCTRL_MESON=y +CONFIG_PINCTRL_MESON_GXBB=y +CONFIG_PINCTRL_MESON_GXL=y +CONFIG_PINCTRL_MESON8_PMX=y +CONFIG_PINCTRL_MESON_AXG=y +CONFIG_PINCTRL_MESON_AXG_PMX=y +CONFIG_PINCTRL_MESON_G12A=y +CONFIG_PINCTRL_MESON_A1=y +CONFIG_PINCTRL_MESON_S4=y +CONFIG_PINCTRL_AMLOGIC_C3=y +CONFIG_PINCTRL_MVEBU=y +CONFIG_PINCTRL_ARMADA_AP806=y +CONFIG_PINCTRL_ARMADA_CP110=y +CONFIG_PINCTRL_AC5=y +CONFIG_PINCTRL_ARMADA_37XX=y +CONFIG_PINCTRL_MSM=y +# CONFIG_PINCTRL_IPQ5018 is not set +# CONFIG_PINCTRL_IPQ5332 is not set +# CONFIG_PINCTRL_IPQ8074 is not set +# CONFIG_PINCTRL_IPQ6018 is not set +# CONFIG_PINCTRL_IPQ9574 is not set +# CONFIG_PINCTRL_MDM9607 is not set +CONFIG_PINCTRL_MSM8916=y +# CONFIG_PINCTRL_MSM8953 is not set +# CONFIG_PINCTRL_MSM8976 is not set +# CONFIG_PINCTRL_MSM8994 is not set +CONFIG_PINCTRL_MSM8996=y +# CONFIG_PINCTRL_MSM8998 is not set +# CONFIG_PINCTRL_QCM2290 is not set +# CONFIG_PINCTRL_QCS404 is not set +# CONFIG_PINCTRL_QDF2XXX is not set +# CONFIG_PINCTRL_QDU1000 is not set +# CONFIG_PINCTRL_SA8775P is not set +# CONFIG_PINCTRL_SC7180 is not set +# CONFIG_PINCTRL_SC7280 is not set +# CONFIG_PINCTRL_SC8180X is not set +# CONFIG_PINCTRL_SC8280XP is not set +# CONFIG_PINCTRL_SDM660 is not set +# CONFIG_PINCTRL_SDM670 is not set +# CONFIG_PINCTRL_SDM845 is not set +# CONFIG_PINCTRL_SDX75 is not set +# CONFIG_PINCTRL_SM6115 is not set +# CONFIG_PINCTRL_SM6125 is not set +# CONFIG_PINCTRL_SM6350 is not set +# CONFIG_PINCTRL_SM6375 is not set +# CONFIG_PINCTRL_SM7150 is not set +# CONFIG_PINCTRL_SM8150 is not set +# CONFIG_PINCTRL_SM8250 is not set +# CONFIG_PINCTRL_SM8350 is not set +# CONFIG_PINCTRL_SM8450 is not set +# CONFIG_PINCTRL_SM8550 is not set +CONFIG_PINCTRL_QCOM_SPMI_PMIC=y +CONFIG_PINCTRL_QCOM_SSBI_PMIC=y +# CONFIG_PINCTRL_LPASS_LPI is not set + +# +# Renesas pinctrl drivers +# +# end of Renesas pinctrl drivers + +CONFIG_PINCTRL_SUNXI=y +# CONFIG_PINCTRL_SUN4I_A10 is not set +# CONFIG_PINCTRL_SUN5I is not set +# CONFIG_PINCTRL_SUN6I_A31 is not set +# CONFIG_PINCTRL_SUN6I_A31_R is not set +# CONFIG_PINCTRL_SUN8I_A23 is not set +# CONFIG_PINCTRL_SUN8I_A33 is not set +# CONFIG_PINCTRL_SUN8I_A83T is not set +# CONFIG_PINCTRL_SUN8I_A83T_R is not set +# CONFIG_PINCTRL_SUN8I_A23_R is not set +# CONFIG_PINCTRL_SUN8I_H3 is not set +CONFIG_PINCTRL_SUN8I_H3_R=y +# CONFIG_PINCTRL_SUN8I_V3S is not set +# CONFIG_PINCTRL_SUN9I_A80 is not set +# CONFIG_PINCTRL_SUN9I_A80_R is not set +# CONFIG_PINCTRL_SUN20I_D1 is not set +CONFIG_PINCTRL_SUN50I_A64=y +CONFIG_PINCTRL_SUN50I_A64_R=y +CONFIG_PINCTRL_SUN50I_A100=y +CONFIG_PINCTRL_SUN50I_A100_R=y +CONFIG_PINCTRL_SUN50I_H5=y +CONFIG_PINCTRL_SUN50I_H6=y +CONFIG_PINCTRL_SUN50I_H6_R=y +CONFIG_PINCTRL_SUN50I_H616=y +CONFIG_PINCTRL_SUN50I_H616_R=y +CONFIG_PINCTRL_TEGRA=y +CONFIG_PINCTRL_TEGRA124=y +CONFIG_PINCTRL_TEGRA210=y +CONFIG_PINCTRL_TEGRA194=y +CONFIG_PINCTRL_TEGRA_XUSB=y +CONFIG_GPIOLIB=y +CONFIG_GPIOLIB_FASTPATH_LIMIT=512 +CONFIG_OF_GPIO=y +CONFIG_GPIO_ACPI=y +CONFIG_GPIOLIB_IRQCHIP=y +# CONFIG_DEBUG_GPIO is not set +CONFIG_GPIO_SYSFS=y +CONFIG_GPIO_CDEV=y +CONFIG_GPIO_CDEV_V1=y +CONFIG_GPIO_GENERIC=y +CONFIG_GPIO_REGMAP=m +CONFIG_GPIO_IDIO_16=m + +# +# Memory mapped GPIO drivers +# +# CONFIG_GPIO_74XX_MMIO is not set +# CONFIG_GPIO_ALTERA is not set +# CONFIG_GPIO_AMDPT is not set +# CONFIG_GPIO_CADENCE is not set +# CONFIG_GPIO_DWAPB is not set +CONFIG_GPIO_EXAR=m +# CONFIG_GPIO_FTGPIO010 is not set +CONFIG_GPIO_GENERIC_PLATFORM=y +# CONFIG_GPIO_GRGPIO is not set +# CONFIG_GPIO_HISI is not set +# CONFIG_GPIO_HLWD is not set +# CONFIG_GPIO_LOGICVC is not set +CONFIG_GPIO_MB86S7X=m +CONFIG_GPIO_MVEBU=y +CONFIG_GPIO_PL061=y +CONFIG_GPIO_ROCKCHIP=y +# CONFIG_GPIO_SIFIVE is not set +# CONFIG_GPIO_SYSCON is not set +CONFIG_GPIO_TEGRA=y +CONFIG_GPIO_TEGRA186=y +# CONFIG_GPIO_THUNDERX is not set +CONFIG_GPIO_XGENE=y +CONFIG_GPIO_XGENE_SB=m +# CONFIG_GPIO_XILINX is not set +CONFIG_GPIO_XLP=y +CONFIG_GPIO_ZYNQ=m +CONFIG_GPIO_ZYNQMP_MODEPIN=y +# CONFIG_GPIO_AMD_FCH is not set +# end of Memory mapped GPIO drivers + +# +# I2C GPIO expanders +# +# CONFIG_GPIO_ADNP is not set +# CONFIG_GPIO_FXL6408 is not set +# CONFIG_GPIO_DS4520 is not set +# CONFIG_GPIO_GW_PLD is not set +# CONFIG_GPIO_MAX7300 is not set +# CONFIG_GPIO_MAX732X is not set +CONFIG_GPIO_PCA953X=y +CONFIG_GPIO_PCA953X_IRQ=y +# CONFIG_GPIO_PCA9570 is not set +# CONFIG_GPIO_PCF857X is not set +# CONFIG_GPIO_TPIC2810 is not set +# end of I2C GPIO expanders + +# +# MFD GPIO expanders +# +CONFIG_GPIO_MAX77620=y +# end of MFD GPIO expanders + +# +# PCI GPIO expanders +# +CONFIG_GPIO_PCI_IDIO_16=m +CONFIG_GPIO_PCIE_IDIO_24=m +# CONFIG_GPIO_RDC321X is not set +# end of PCI GPIO expanders + +# +# SPI GPIO expanders +# +# CONFIG_GPIO_74X164 is not set +# CONFIG_GPIO_MAX3191X is not set +# CONFIG_GPIO_MAX7301 is not set +# CONFIG_GPIO_MC33880 is not set +# CONFIG_GPIO_PISOSR is not set +# CONFIG_GPIO_XRA1403 is not set +# end of SPI GPIO expanders + +# +# USB GPIO expanders +# +CONFIG_GPIO_VIPERBOARD=m +# end of USB GPIO expanders + +# +# Virtual GPIO drivers +# +# CONFIG_GPIO_AGGREGATOR is not set +# CONFIG_GPIO_LATCH is not set +# CONFIG_GPIO_MOCKUP is not set +# CONFIG_GPIO_VIRTIO is not set +# CONFIG_GPIO_SIM is not set +# end of Virtual GPIO drivers + +CONFIG_W1=m +CONFIG_W1_CON=y + +# +# 1-wire Bus Masters +# +# CONFIG_W1_MASTER_MATROX is not set +CONFIG_W1_MASTER_DS2490=m +CONFIG_W1_MASTER_DS2482=m +CONFIG_W1_MASTER_GPIO=m +# CONFIG_W1_MASTER_SGI is not set +# end of 1-wire Bus Masters + +# +# 1-wire Slaves +# +CONFIG_W1_SLAVE_THERM=m +CONFIG_W1_SLAVE_SMEM=m +CONFIG_W1_SLAVE_DS2405=m +CONFIG_W1_SLAVE_DS2408=m +CONFIG_W1_SLAVE_DS2408_READBACK=y +CONFIG_W1_SLAVE_DS2413=m +CONFIG_W1_SLAVE_DS2406=m +CONFIG_W1_SLAVE_DS2423=m +CONFIG_W1_SLAVE_DS2805=m +# CONFIG_W1_SLAVE_DS2430 is not set +CONFIG_W1_SLAVE_DS2431=m +CONFIG_W1_SLAVE_DS2433=m +# CONFIG_W1_SLAVE_DS2433_CRC is not set +CONFIG_W1_SLAVE_DS2438=m +# CONFIG_W1_SLAVE_DS250X is not set +CONFIG_W1_SLAVE_DS2780=m +CONFIG_W1_SLAVE_DS2781=m +CONFIG_W1_SLAVE_DS28E04=m +CONFIG_W1_SLAVE_DS28E17=m +# end of 1-wire Slaves + +CONFIG_POWER_RESET=y +# CONFIG_POWER_RESET_BRCMSTB is not set +# CONFIG_POWER_RESET_GPIO is not set +# CONFIG_POWER_RESET_GPIO_RESTART is not set +CONFIG_POWER_RESET_HISI=y +# CONFIG_POWER_RESET_LINKSTATION is not set +CONFIG_POWER_RESET_MSM=y +# CONFIG_POWER_RESET_QCOM_PON is not set +# CONFIG_POWER_RESET_ODROID_GO_ULTRA_POWEROFF is not set +# CONFIG_POWER_RESET_LTC2952 is not set +# CONFIG_POWER_RESET_REGULATOR is not set +# CONFIG_POWER_RESET_RESTART is not set +CONFIG_POWER_RESET_VEXPRESS=y +CONFIG_POWER_RESET_XGENE=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_POWER_RESET_SYSCON_POWEROFF=y +# CONFIG_SYSCON_REBOOT_MODE is not set +# CONFIG_NVMEM_REBOOT_MODE is not set +CONFIG_POWER_SUPPLY=y +# CONFIG_POWER_SUPPLY_DEBUG is not set +CONFIG_POWER_SUPPLY_HWMON=y +# CONFIG_GENERIC_ADC_BATTERY is not set +# CONFIG_IP5XXX_POWER is not set +# CONFIG_TEST_POWER is not set +# CONFIG_CHARGER_ADP5061 is not set +# CONFIG_BATTERY_CW2015 is not set +CONFIG_BATTERY_DS2760=m +# CONFIG_BATTERY_DS2780 is not set +# CONFIG_BATTERY_DS2781 is not set +# CONFIG_BATTERY_DS2782 is not set +# CONFIG_BATTERY_SAMSUNG_SDI is not set +CONFIG_BATTERY_SBS=m +# CONFIG_CHARGER_SBS is not set +# CONFIG_MANAGER_SBS is not set +CONFIG_BATTERY_BQ27XXX=m +# CONFIG_BATTERY_BQ27XXX_I2C is not set +CONFIG_BATTERY_BQ27XXX_HDQ=m +CONFIG_CHARGER_AXP20X=m +CONFIG_BATTERY_AXP20X=m +CONFIG_AXP20X_POWER=m +# CONFIG_BATTERY_MAX17040 is not set +# CONFIG_BATTERY_MAX17042 is not set +# CONFIG_BATTERY_MAX1721X is not set +# CONFIG_CHARGER_ISP1704 is not set +# CONFIG_CHARGER_MAX8903 is not set +# CONFIG_CHARGER_LP8727 is not set +# CONFIG_CHARGER_GPIO is not set +# CONFIG_CHARGER_MANAGER is not set +# CONFIG_CHARGER_LT3651 is not set +# CONFIG_CHARGER_LTC4162L is not set +# CONFIG_CHARGER_DETECTOR_MAX14656 is not set +# CONFIG_CHARGER_MAX77976 is not set +CONFIG_CHARGER_QCOM_SMBB=m +# CONFIG_CHARGER_BQ2415X is not set +# CONFIG_CHARGER_BQ24190 is not set +# CONFIG_CHARGER_BQ24257 is not set +# CONFIG_CHARGER_BQ24735 is not set +# CONFIG_CHARGER_BQ2515X is not set +# CONFIG_CHARGER_BQ25890 is not set +# CONFIG_CHARGER_BQ25980 is not set +# CONFIG_CHARGER_BQ256XX is not set +# CONFIG_CHARGER_SMB347 is not set +# CONFIG_BATTERY_GAUGE_LTC2941 is not set +# CONFIG_BATTERY_GOLDFISH is not set +# CONFIG_BATTERY_RT5033 is not set +# CONFIG_CHARGER_RT9455 is not set +# CONFIG_CHARGER_RT9467 is not set +# CONFIG_CHARGER_RT9471 is not set +CONFIG_CHARGER_CROS_USBPD=m +CONFIG_CHARGER_CROS_PCHG=y +# CONFIG_CHARGER_UCS1002 is not set +# CONFIG_CHARGER_BD99954 is not set +# CONFIG_BATTERY_UG3105 is not set +# CONFIG_CHARGER_QCOM_SMB2 is not set +CONFIG_HWMON=y +CONFIG_HWMON_VID=m +# CONFIG_HWMON_DEBUG_CHIP is not set + +# +# Native drivers +# +# CONFIG_SENSORS_AD7314 is not set +CONFIG_SENSORS_AD7414=m +CONFIG_SENSORS_AD7418=m +# CONFIG_SENSORS_ADM1021 is not set +# CONFIG_SENSORS_ADM1025 is not set +# CONFIG_SENSORS_ADM1026 is not set +CONFIG_SENSORS_ADM1029=m +# CONFIG_SENSORS_ADM1031 is not set +# CONFIG_SENSORS_ADM1177 is not set +CONFIG_SENSORS_ADM9240=m +# CONFIG_SENSORS_ADT7310 is not set +# CONFIG_SENSORS_ADT7410 is not set +CONFIG_SENSORS_ADT7411=m +CONFIG_SENSORS_ADT7462=m +CONFIG_SENSORS_ADT7470=m +CONFIG_SENSORS_ADT7475=m +# CONFIG_SENSORS_AHT10 is not set +# CONFIG_SENSORS_AQUACOMPUTER_D5NEXT is not set +# CONFIG_SENSORS_AS370 is not set +CONFIG_SENSORS_ASC7621=m +# CONFIG_SENSORS_AXI_FAN_CONTROL is not set +CONFIG_SENSORS_ATXP1=m +# CONFIG_SENSORS_CORSAIR_CPRO is not set +# CONFIG_SENSORS_CORSAIR_PSU is not set +# CONFIG_SENSORS_DRIVETEMP is not set +CONFIG_SENSORS_DS620=m +# CONFIG_SENSORS_DS1621 is not set +CONFIG_SENSORS_I5K_AMB=m +# CONFIG_SENSORS_F71805F is not set +CONFIG_SENSORS_F71882FG=m +CONFIG_SENSORS_F75375S=m +CONFIG_SENSORS_FTSTEUTATES=m +# CONFIG_SENSORS_GL518SM is not set +# CONFIG_SENSORS_GL520SM is not set +CONFIG_SENSORS_G760A=m +# CONFIG_SENSORS_G762 is not set +# CONFIG_SENSORS_GPIO_FAN is not set +# CONFIG_SENSORS_HIH6130 is not set +# CONFIG_SENSORS_HS3001 is not set +CONFIG_SENSORS_IBMAEM=m +CONFIG_SENSORS_IBMPEX=m +# CONFIG_SENSORS_IIO_HWMON is not set +# CONFIG_SENSORS_IT87 is not set +CONFIG_SENSORS_JC42=m +# CONFIG_SENSORS_POWR1220 is not set +CONFIG_SENSORS_LINEAGE=m +# CONFIG_SENSORS_LTC2945 is not set +# CONFIG_SENSORS_LTC2947_I2C is not set +# CONFIG_SENSORS_LTC2947_SPI is not set +# CONFIG_SENSORS_LTC2990 is not set +# CONFIG_SENSORS_LTC2992 is not set +CONFIG_SENSORS_LTC4151=m +CONFIG_SENSORS_LTC4215=m +# CONFIG_SENSORS_LTC4222 is not set +CONFIG_SENSORS_LTC4245=m +# CONFIG_SENSORS_LTC4260 is not set +CONFIG_SENSORS_LTC4261=m +CONFIG_SENSORS_MAX1111=m +# CONFIG_SENSORS_MAX127 is not set +CONFIG_SENSORS_MAX16065=m +# CONFIG_SENSORS_MAX1619 is not set +CONFIG_SENSORS_MAX1668=m +# CONFIG_SENSORS_MAX197 is not set +# CONFIG_SENSORS_MAX31722 is not set +# CONFIG_SENSORS_MAX31730 is not set +# CONFIG_SENSORS_MAX31760 is not set +# CONFIG_MAX31827 is not set +# CONFIG_SENSORS_MAX6620 is not set +# CONFIG_SENSORS_MAX6621 is not set +CONFIG_SENSORS_MAX6639=m +CONFIG_SENSORS_MAX6642=m +CONFIG_SENSORS_MAX6650=m +# CONFIG_SENSORS_MAX6697 is not set +# CONFIG_SENSORS_MAX31790 is not set +# CONFIG_SENSORS_MC34VR500 is not set +# CONFIG_SENSORS_MCP3021 is not set +# CONFIG_SENSORS_TC654 is not set +# CONFIG_SENSORS_TPS23861 is not set +# CONFIG_SENSORS_MR75203 is not set +CONFIG_SENSORS_ADCXX=m +# CONFIG_SENSORS_LM63 is not set +CONFIG_SENSORS_LM70=m +CONFIG_SENSORS_LM73=m +# CONFIG_SENSORS_LM75 is not set +# CONFIG_SENSORS_LM77 is not set +# CONFIG_SENSORS_LM78 is not set +# CONFIG_SENSORS_LM80 is not set +# CONFIG_SENSORS_LM83 is not set +# CONFIG_SENSORS_LM85 is not set +# CONFIG_SENSORS_LM87 is not set +# CONFIG_SENSORS_LM90 is not set +# CONFIG_SENSORS_LM92 is not set +CONFIG_SENSORS_LM93=m +# CONFIG_SENSORS_LM95234 is not set +CONFIG_SENSORS_LM95241=m +CONFIG_SENSORS_LM95245=m +# CONFIG_SENSORS_PC87360 is not set +CONFIG_SENSORS_PC87427=m +CONFIG_SENSORS_NTC_THERMISTOR=m +CONFIG_SENSORS_NCT6683=m +CONFIG_SENSORS_NCT6775_CORE=m +CONFIG_SENSORS_NCT6775=m +# CONFIG_SENSORS_NCT6775_I2C is not set +CONFIG_SENSORS_NCT7802=m +CONFIG_SENSORS_NCT7904=m +CONFIG_SENSORS_NPCM7XX=m +# CONFIG_SENSORS_NZXT_KRAKEN2 is not set +# CONFIG_SENSORS_NZXT_SMART2 is not set +# CONFIG_SENSORS_OCC_P8_I2C is not set +# CONFIG_SENSORS_PCF8591 is not set +# CONFIG_PMBUS is not set +# CONFIG_SENSORS_PWM_FAN is not set +# CONFIG_SENSORS_SBTSI is not set +# CONFIG_SENSORS_SBRMI is not set +# CONFIG_SENSORS_SHT15 is not set +CONFIG_SENSORS_SHT21=m +# CONFIG_SENSORS_SHT3x is not set +# CONFIG_SENSORS_SHT4x is not set +# CONFIG_SENSORS_SHTC1 is not set +# CONFIG_SENSORS_SIS5595 is not set +CONFIG_SENSORS_DME1737=m +CONFIG_SENSORS_EMC1403=m +CONFIG_SENSORS_EMC2103=m +# CONFIG_SENSORS_EMC2305 is not set +CONFIG_SENSORS_EMC6W201=m +# CONFIG_SENSORS_SMSC47M1 is not set +CONFIG_SENSORS_SMSC47M192=m +# CONFIG_SENSORS_SMSC47B397 is not set +CONFIG_SENSORS_SCH56XX_COMMON=m +CONFIG_SENSORS_SCH5627=m +# CONFIG_SENSORS_SCH5636 is not set +# CONFIG_SENSORS_STTS751 is not set +# CONFIG_SENSORS_ADC128D818 is not set +CONFIG_SENSORS_ADS7828=m +CONFIG_SENSORS_ADS7871=m +CONFIG_SENSORS_AMC6821=m +# CONFIG_SENSORS_INA209 is not set +# CONFIG_SENSORS_INA2XX is not set +# CONFIG_SENSORS_INA238 is not set +# CONFIG_SENSORS_INA3221 is not set +# CONFIG_SENSORS_TC74 is not set +CONFIG_SENSORS_THMC50=m +CONFIG_SENSORS_TMP102=m +# CONFIG_SENSORS_TMP103 is not set +# CONFIG_SENSORS_TMP108 is not set +CONFIG_SENSORS_TMP401=m +CONFIG_SENSORS_TMP421=m +# CONFIG_SENSORS_TMP464 is not set +# CONFIG_SENSORS_TMP513 is not set +# CONFIG_SENSORS_VEXPRESS is not set +# CONFIG_SENSORS_VIA686A is not set +CONFIG_SENSORS_VT1211=m +CONFIG_SENSORS_VT8231=m +CONFIG_SENSORS_W83773G=m +# CONFIG_SENSORS_W83781D is not set +CONFIG_SENSORS_W83791D=m +CONFIG_SENSORS_W83792D=m +CONFIG_SENSORS_W83793=m +CONFIG_SENSORS_W83795=m +# CONFIG_SENSORS_W83795_FANCTRL is not set +# CONFIG_SENSORS_W83L785TS is not set +CONFIG_SENSORS_W83L786NG=m +# CONFIG_SENSORS_W83627HF is not set +CONFIG_SENSORS_W83627EHF=m +CONFIG_SENSORS_XGENE=m + +# +# ACPI drivers +# +CONFIG_SENSORS_ACPI_POWER=m +CONFIG_THERMAL=y +# CONFIG_THERMAL_NETLINK is not set +CONFIG_THERMAL_STATISTICS=y +CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0 +CONFIG_THERMAL_HWMON=y +CONFIG_THERMAL_OF=y +# CONFIG_THERMAL_WRITABLE_TRIPS is not set +CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y +# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set +# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set +CONFIG_THERMAL_GOV_FAIR_SHARE=y +CONFIG_THERMAL_GOV_STEP_WISE=y +# CONFIG_THERMAL_GOV_BANG_BANG is not set +# CONFIG_THERMAL_GOV_USER_SPACE is not set +# CONFIG_THERMAL_GOV_POWER_ALLOCATOR is not set +CONFIG_CPU_THERMAL=y +CONFIG_CPU_FREQ_THERMAL=y +CONFIG_DEVFREQ_THERMAL=y +# CONFIG_THERMAL_EMULATION is not set +# CONFIG_THERMAL_MMIO is not set +CONFIG_HISI_THERMAL=m +# CONFIG_MAX77620_THERMAL is not set +# CONFIG_SUN8I_THERMAL is not set +CONFIG_ROCKCHIP_THERMAL=m +CONFIG_ARMADA_THERMAL=m +CONFIG_AMLOGIC_THERMAL=y + +# +# NVIDIA Tegra thermal drivers +# +CONFIG_TEGRA_SOCTHERM=y +# CONFIG_TEGRA_BPMP_THERMAL is not set +# end of NVIDIA Tegra thermal drivers + +# CONFIG_GENERIC_ADC_THERMAL is not set + +# +# Qualcomm thermal drivers +# +# CONFIG_QCOM_SPMI_ADC_TM5 is not set +CONFIG_QCOM_SPMI_TEMP_ALARM=m +# CONFIG_QCOM_LMH is not set +# end of Qualcomm thermal drivers + +CONFIG_WATCHDOG=y +CONFIG_WATCHDOG_CORE=y +# CONFIG_WATCHDOG_NOWAYOUT is not set +CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y +CONFIG_WATCHDOG_OPEN_TIMEOUT=0 +CONFIG_WATCHDOG_SYSFS=y +# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set + +# +# Watchdog Pretimeout Governors +# +CONFIG_WATCHDOG_PRETIMEOUT_GOV=y +CONFIG_WATCHDOG_PRETIMEOUT_GOV_SEL=m +CONFIG_WATCHDOG_PRETIMEOUT_GOV_NOOP=m +CONFIG_WATCHDOG_PRETIMEOUT_GOV_PANIC=m +CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_NOOP=y +# CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_PANIC is not set + +# +# Watchdog Device Drivers +# +CONFIG_SOFT_WATCHDOG=m +# CONFIG_SOFT_WATCHDOG_PRETIMEOUT is not set +CONFIG_GPIO_WATCHDOG=m +CONFIG_WDAT_WDT=m +# CONFIG_XILINX_WATCHDOG is not set +# CONFIG_XILINX_WINDOW_WATCHDOG is not set +# CONFIG_ZIIRAVE_WATCHDOG is not set +CONFIG_ARM_SP805_WATCHDOG=m +CONFIG_ARM_SBSA_WATCHDOG=m +# CONFIG_ARMADA_37XX_WATCHDOG is not set +# CONFIG_CADENCE_WATCHDOG is not set +CONFIG_DW_WATCHDOG=m +CONFIG_SUNXI_WATCHDOG=m +# CONFIG_MAX63XX_WATCHDOG is not set +# CONFIG_MAX77620_WATCHDOG is not set +CONFIG_TEGRA_WATCHDOG=m +CONFIG_QCOM_WDT=m +CONFIG_MESON_GXBB_WATCHDOG=m +CONFIG_MESON_WATCHDOG=m +# CONFIG_ARM_SMC_WATCHDOG is not set +# CONFIG_PM8916_WATCHDOG is not set +# CONFIG_ALIM7101_WDT is not set +# CONFIG_I6300ESB_WDT is not set +# CONFIG_HP_WATCHDOG is not set +CONFIG_MARVELL_GTI_WDT=y +# CONFIG_MEN_A21_WDT is not set +CONFIG_XEN_WDT=m + +# +# PCI-based Watchdog Cards +# +# CONFIG_PCIPCWATCHDOG is not set +# CONFIG_WDTPCI is not set + +# +# USB-based Watchdog Cards +# +# CONFIG_USBPCWATCHDOG is not set +CONFIG_SSB_POSSIBLE=y +CONFIG_SSB=m +CONFIG_SSB_SPROM=y +CONFIG_SSB_BLOCKIO=y +CONFIG_SSB_PCIHOST_POSSIBLE=y +CONFIG_SSB_PCIHOST=y +CONFIG_SSB_B43_PCI_BRIDGE=y +CONFIG_SSB_SDIOHOST_POSSIBLE=y +CONFIG_SSB_SDIOHOST=y +CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y +CONFIG_SSB_DRIVER_PCICORE=y +# CONFIG_SSB_DRIVER_GPIO is not set +CONFIG_BCMA_POSSIBLE=y +CONFIG_BCMA=m +CONFIG_BCMA_BLOCKIO=y +CONFIG_BCMA_HOST_PCI_POSSIBLE=y +CONFIG_BCMA_HOST_PCI=y +# CONFIG_BCMA_HOST_SOC is not set +CONFIG_BCMA_DRIVER_PCI=y +# CONFIG_BCMA_DRIVER_GMAC_CMN is not set +# CONFIG_BCMA_DRIVER_GPIO is not set +# CONFIG_BCMA_DEBUG is not set + +# +# Multifunction device drivers +# +CONFIG_MFD_CORE=y +# CONFIG_MFD_ACT8945A is not set +# CONFIG_MFD_SUN4I_GPADC is not set +# CONFIG_MFD_AS3711 is not set +# CONFIG_MFD_SMPRO is not set +# CONFIG_MFD_AS3722 is not set +# CONFIG_PMIC_ADP5520 is not set +# CONFIG_MFD_AAT2870_CORE is not set +# CONFIG_MFD_ATMEL_FLEXCOM is not set +# CONFIG_MFD_ATMEL_HLCDC is not set +# CONFIG_MFD_BCM590XX is not set +# CONFIG_MFD_BD9571MWV is not set +# CONFIG_MFD_AC100 is not set +CONFIG_MFD_AXP20X=m +# CONFIG_MFD_AXP20X_I2C is not set +CONFIG_MFD_AXP20X_RSB=m +CONFIG_MFD_CROS_EC_DEV=y +# CONFIG_MFD_CS42L43_I2C is not set +# CONFIG_MFD_MADERA is not set +# CONFIG_MFD_MAX5970 is not set +# CONFIG_PMIC_DA903X is not set +# CONFIG_MFD_DA9052_SPI is not set +# CONFIG_MFD_DA9052_I2C is not set +# CONFIG_MFD_DA9055 is not set +# CONFIG_MFD_DA9062 is not set +# CONFIG_MFD_DA9063 is not set +# CONFIG_MFD_DA9150 is not set +# CONFIG_MFD_DLN2 is not set +# CONFIG_MFD_GATEWORKS_GSC is not set +# CONFIG_MFD_MC13XXX_SPI is not set +# CONFIG_MFD_MC13XXX_I2C is not set +# CONFIG_MFD_MP2629 is not set +# CONFIG_MFD_HI6421_PMIC is not set +# CONFIG_MFD_HI6421_SPMI is not set +CONFIG_MFD_HI655X_PMIC=m +# CONFIG_LPC_ICH is not set +CONFIG_LPC_SCH=m +# CONFIG_MFD_IQS62X is not set +# CONFIG_MFD_JANZ_CMODIO is not set +# CONFIG_MFD_KEMPLD is not set +# CONFIG_MFD_88PM800 is not set +# CONFIG_MFD_88PM805 is not set +# CONFIG_MFD_88PM860X is not set +# CONFIG_MFD_MAX14577 is not set +# CONFIG_MFD_MAX77541 is not set +CONFIG_MFD_MAX77620=y +# CONFIG_MFD_MAX77650 is not set +# CONFIG_MFD_MAX77686 is not set +# CONFIG_MFD_MAX77693 is not set +# CONFIG_MFD_MAX77714 is not set +# CONFIG_MFD_MAX77843 is not set +# CONFIG_MFD_MAX8907 is not set +# CONFIG_MFD_MAX8925 is not set +# CONFIG_MFD_MAX8997 is not set +# CONFIG_MFD_MAX8998 is not set +# CONFIG_MFD_MT6360 is not set +# CONFIG_MFD_MT6370 is not set +# CONFIG_MFD_MT6397 is not set +# CONFIG_MFD_MENF21BMC is not set +# CONFIG_MFD_OCELOT is not set +# CONFIG_EZX_PCAP is not set +# CONFIG_MFD_CPCAP is not set +CONFIG_MFD_VIPERBOARD=m +# CONFIG_MFD_NTXEC is not set +# CONFIG_MFD_RETU is not set +# CONFIG_MFD_PCF50633 is not set +CONFIG_MFD_QCOM_RPM=m +CONFIG_MFD_SPMI_PMIC=m +# CONFIG_MFD_SY7636A is not set +# CONFIG_MFD_RDC321X is not set +# CONFIG_MFD_RT4831 is not set +# CONFIG_MFD_RT5033 is not set +# CONFIG_MFD_RT5120 is not set +# CONFIG_MFD_RC5T583 is not set +# CONFIG_MFD_RK8XX_I2C is not set +# CONFIG_MFD_RK8XX_SPI is not set +# CONFIG_MFD_RN5T618 is not set +# CONFIG_MFD_SEC_CORE is not set +# CONFIG_MFD_SI476X_CORE is not set +# CONFIG_MFD_SM501 is not set +# CONFIG_MFD_SKY81452 is not set +# CONFIG_MFD_STMPE is not set +CONFIG_MFD_SUN6I_PRCM=y +CONFIG_MFD_SYSCON=y +# CONFIG_MFD_LP3943 is not set +# CONFIG_MFD_LP8788 is not set +# CONFIG_MFD_TI_LMU is not set +# CONFIG_MFD_PALMAS is not set +# CONFIG_TPS6105X is not set +# CONFIG_TPS65010 is not set +# CONFIG_TPS6507X is not set +# CONFIG_MFD_TPS65086 is not set +# CONFIG_MFD_TPS65090 is not set +# CONFIG_MFD_TPS65217 is not set +# CONFIG_MFD_TI_LP873X is not set +# CONFIG_MFD_TI_LP87565 is not set +# CONFIG_MFD_TPS65218 is not set +# CONFIG_MFD_TPS65219 is not set +# CONFIG_MFD_TPS6586X is not set +# CONFIG_MFD_TPS65910 is not set +# CONFIG_MFD_TPS65912_I2C is not set +# CONFIG_MFD_TPS65912_SPI is not set +# CONFIG_MFD_TPS6594_I2C is not set +# CONFIG_MFD_TPS6594_SPI is not set +# CONFIG_TWL4030_CORE is not set +# CONFIG_TWL6040_CORE is not set +# CONFIG_MFD_WL1273_CORE is not set +# CONFIG_MFD_LM3533 is not set +# CONFIG_MFD_TC3589X is not set +# CONFIG_MFD_TQMX86 is not set +# CONFIG_MFD_VX855 is not set +# CONFIG_MFD_LOCHNAGAR is not set +# CONFIG_MFD_ARIZONA_I2C is not set +# CONFIG_MFD_ARIZONA_SPI is not set +# CONFIG_MFD_WM8400 is not set +# CONFIG_MFD_WM831X_I2C is not set +# CONFIG_MFD_WM831X_SPI is not set +# CONFIG_MFD_WM8350_I2C is not set +# CONFIG_MFD_WM8994 is not set +# CONFIG_MFD_ROHM_BD718XX is not set +# CONFIG_MFD_ROHM_BD71828 is not set +# CONFIG_MFD_ROHM_BD957XMUF is not set +# CONFIG_MFD_STPMIC1 is not set +# CONFIG_MFD_STMFX is not set +# CONFIG_MFD_ATC260X_I2C is not set +# CONFIG_MFD_KHADAS_MCU is not set +# CONFIG_MFD_QCOM_PM8008 is not set +CONFIG_MFD_VEXPRESS_SYSREG=y +# CONFIG_RAVE_SP_CORE is not set +# CONFIG_MFD_INTEL_M10_BMC_SPI is not set +# CONFIG_MFD_RSMU_I2C is not set +# CONFIG_MFD_RSMU_SPI is not set +# end of Multifunction device drivers + +CONFIG_REGULATOR=y +# CONFIG_REGULATOR_DEBUG is not set +CONFIG_REGULATOR_FIXED_VOLTAGE=m +# CONFIG_REGULATOR_VIRTUAL_CONSUMER is not set +# CONFIG_REGULATOR_USERSPACE_CONSUMER is not set +# CONFIG_REGULATOR_88PG86X is not set +# CONFIG_REGULATOR_ACT8865 is not set +# CONFIG_REGULATOR_AD5398 is not set +# CONFIG_REGULATOR_AW37503 is not set +CONFIG_REGULATOR_AXP20X=m +# CONFIG_REGULATOR_CROS_EC is not set +# CONFIG_REGULATOR_DA9121 is not set +# CONFIG_REGULATOR_DA9210 is not set +# CONFIG_REGULATOR_DA9211 is not set +CONFIG_REGULATOR_FAN53555=m +# CONFIG_REGULATOR_FAN53880 is not set +CONFIG_REGULATOR_GPIO=m +CONFIG_REGULATOR_HI655X=m +# CONFIG_REGULATOR_ISL9305 is not set +# CONFIG_REGULATOR_ISL6271A is not set +# CONFIG_REGULATOR_LP3971 is not set +# CONFIG_REGULATOR_LP3972 is not set +# CONFIG_REGULATOR_LP872X is not set +# CONFIG_REGULATOR_LP8755 is not set +# CONFIG_REGULATOR_LTC3589 is not set +# CONFIG_REGULATOR_LTC3676 is not set +# CONFIG_REGULATOR_MAX1586 is not set +CONFIG_REGULATOR_MAX77620=m +# CONFIG_REGULATOR_MAX77857 is not set +# CONFIG_REGULATOR_MAX8649 is not set +# CONFIG_REGULATOR_MAX8660 is not set +# CONFIG_REGULATOR_MAX8893 is not set +# CONFIG_REGULATOR_MAX8952 is not set +# CONFIG_REGULATOR_MAX8973 is not set +# CONFIG_REGULATOR_MAX20086 is not set +# CONFIG_REGULATOR_MAX20411 is not set +# CONFIG_REGULATOR_MAX77826 is not set +# CONFIG_REGULATOR_MCP16502 is not set +# CONFIG_REGULATOR_MP5416 is not set +# CONFIG_REGULATOR_MP8859 is not set +# CONFIG_REGULATOR_MP886X is not set +# CONFIG_REGULATOR_MPQ7920 is not set +# CONFIG_REGULATOR_MT6311 is not set +# CONFIG_REGULATOR_MT6315 is not set +# CONFIG_REGULATOR_PCA9450 is not set +# CONFIG_REGULATOR_PF8X00 is not set +# CONFIG_REGULATOR_PFUZE100 is not set +# CONFIG_REGULATOR_PV88060 is not set +# CONFIG_REGULATOR_PV88080 is not set +# CONFIG_REGULATOR_PV88090 is not set +CONFIG_REGULATOR_PWM=m +# CONFIG_REGULATOR_QCOM_REFGEN is not set +CONFIG_REGULATOR_QCOM_RPM=m +CONFIG_REGULATOR_QCOM_SMD_RPM=m +CONFIG_REGULATOR_QCOM_SPMI=m +# CONFIG_REGULATOR_QCOM_USB_VBUS is not set +# CONFIG_REGULATOR_RAA215300 is not set +# CONFIG_REGULATOR_RASPBERRYPI_TOUCHSCREEN_ATTINY is not set +# CONFIG_REGULATOR_RT4801 is not set +# CONFIG_REGULATOR_RT4803 is not set +# CONFIG_REGULATOR_RT5190A is not set +# CONFIG_REGULATOR_RT5739 is not set +# CONFIG_REGULATOR_RT5759 is not set +# CONFIG_REGULATOR_RT6160 is not set +# CONFIG_REGULATOR_RT6190 is not set +# CONFIG_REGULATOR_RT6245 is not set +# CONFIG_REGULATOR_RTQ2134 is not set +# CONFIG_REGULATOR_RTMV20 is not set +# CONFIG_REGULATOR_RTQ6752 is not set +# CONFIG_REGULATOR_RTQ2208 is not set +# CONFIG_REGULATOR_SLG51000 is not set +# CONFIG_REGULATOR_SY8106A is not set +# CONFIG_REGULATOR_SY8824X is not set +# CONFIG_REGULATOR_SY8827N is not set +# CONFIG_REGULATOR_TPS51632 is not set +# CONFIG_REGULATOR_TPS62360 is not set +# CONFIG_REGULATOR_TPS6286X is not set +# CONFIG_REGULATOR_TPS6287X is not set +# CONFIG_REGULATOR_TPS65023 is not set +# CONFIG_REGULATOR_TPS6507X is not set +# CONFIG_REGULATOR_TPS65132 is not set +# CONFIG_REGULATOR_TPS6524X is not set +CONFIG_REGULATOR_VCTRL=m +# CONFIG_REGULATOR_VEXPRESS is not set +# CONFIG_REGULATOR_VQMMC_IPQ4019 is not set +# CONFIG_REGULATOR_QCOM_LABIBB is not set +CONFIG_RC_CORE=m +CONFIG_LIRC=y +CONFIG_RC_MAP=m +CONFIG_RC_DECODERS=y +CONFIG_IR_IMON_DECODER=m +CONFIG_IR_JVC_DECODER=m +CONFIG_IR_MCE_KBD_DECODER=m +CONFIG_IR_NEC_DECODER=m +CONFIG_IR_RC5_DECODER=m +CONFIG_IR_RC6_DECODER=m +# CONFIG_IR_RCMM_DECODER is not set +CONFIG_IR_SANYO_DECODER=m +CONFIG_IR_SHARP_DECODER=m +CONFIG_IR_SONY_DECODER=m +CONFIG_IR_XMP_DECODER=m +CONFIG_RC_DEVICES=y +CONFIG_IR_ENE=m +# CONFIG_IR_FINTEK is not set +# CONFIG_IR_GPIO_CIR is not set +# CONFIG_IR_GPIO_TX is not set +# CONFIG_IR_HIX5HD2 is not set +CONFIG_IR_IGORPLUGUSB=m +CONFIG_IR_IGUANA=m +CONFIG_IR_IMON=m +CONFIG_IR_IMON_RAW=m +# CONFIG_IR_ITE_CIR is not set +CONFIG_IR_MCEUSB=m +# CONFIG_IR_MESON is not set +# CONFIG_IR_MESON_TX is not set +# CONFIG_IR_NUVOTON is not set +# CONFIG_IR_PWM_TX is not set +CONFIG_IR_REDRAT3=m +# CONFIG_IR_SERIAL is not set +# CONFIG_IR_SPI is not set +CONFIG_IR_STREAMZAP=m +# CONFIG_IR_SUNXI is not set +# CONFIG_IR_TOY is not set +CONFIG_IR_TTUSBIR=m +CONFIG_RC_ATI_REMOTE=m +CONFIG_RC_LOOPBACK=m +# CONFIG_RC_XBOX_DVD is not set +CONFIG_CEC_CORE=m + +# +# CEC support +# +# CONFIG_MEDIA_CEC_RC is not set +CONFIG_MEDIA_CEC_SUPPORT=y +# CONFIG_CEC_CH7322 is not set +# CONFIG_CEC_CROS_EC is not set +# CONFIG_CEC_MESON_AO is not set +# CONFIG_CEC_MESON_G12A_AO is not set +# CONFIG_CEC_GPIO is not set +# CONFIG_CEC_TEGRA is not set +CONFIG_USB_PULSE8_CEC=m +CONFIG_USB_RAINSHADOW_CEC=m +# end of CEC support + +CONFIG_MEDIA_SUPPORT=m +# CONFIG_MEDIA_SUPPORT_FILTER is not set +CONFIG_MEDIA_SUBDRV_AUTOSELECT=y + +# +# Media device types +# +CONFIG_MEDIA_CAMERA_SUPPORT=y +CONFIG_MEDIA_ANALOG_TV_SUPPORT=y +CONFIG_MEDIA_DIGITAL_TV_SUPPORT=y +CONFIG_MEDIA_RADIO_SUPPORT=y +CONFIG_MEDIA_SDR_SUPPORT=y +CONFIG_MEDIA_PLATFORM_SUPPORT=y +CONFIG_MEDIA_TEST_SUPPORT=y +# end of Media device types + +# +# Media core support +# +CONFIG_VIDEO_DEV=m +CONFIG_MEDIA_CONTROLLER=y +CONFIG_DVB_CORE=m +# end of Media core support + +# +# Video4Linux options +# +CONFIG_VIDEO_V4L2_I2C=y +CONFIG_VIDEO_V4L2_SUBDEV_API=y +# CONFIG_VIDEO_ADV_DEBUG is not set +# CONFIG_VIDEO_FIXED_MINOR_RANGES is not set +CONFIG_VIDEO_TUNER=m +CONFIG_V4L2_FWNODE=m +CONFIG_V4L2_ASYNC=m +# end of Video4Linux options + +# +# Media controller options +# +CONFIG_MEDIA_CONTROLLER_DVB=y +CONFIG_MEDIA_CONTROLLER_REQUEST_API=y +# end of Media controller options + +# +# Digital TV options +# +# CONFIG_DVB_MMAP is not set +CONFIG_DVB_NET=y +CONFIG_DVB_MAX_ADAPTERS=16 +CONFIG_DVB_DYNAMIC_MINORS=y +# CONFIG_DVB_DEMUX_SECTION_LOSS_LOG is not set +# CONFIG_DVB_ULE_DEBUG is not set +# end of Digital TV options + +# +# Media drivers +# + +# +# Media drivers +# +CONFIG_MEDIA_USB_SUPPORT=y + +# +# Webcam devices +# +CONFIG_USB_GSPCA=m +CONFIG_USB_GSPCA_BENQ=m +CONFIG_USB_GSPCA_CONEX=m +CONFIG_USB_GSPCA_CPIA1=m +CONFIG_USB_GSPCA_DTCS033=m +CONFIG_USB_GSPCA_ETOMS=m +CONFIG_USB_GSPCA_FINEPIX=m +CONFIG_USB_GSPCA_JEILINJ=m +CONFIG_USB_GSPCA_JL2005BCD=m +CONFIG_USB_GSPCA_KINECT=m +CONFIG_USB_GSPCA_KONICA=m +CONFIG_USB_GSPCA_MARS=m +CONFIG_USB_GSPCA_MR97310A=m +CONFIG_USB_GSPCA_NW80X=m +CONFIG_USB_GSPCA_OV519=m +CONFIG_USB_GSPCA_OV534=m +CONFIG_USB_GSPCA_OV534_9=m +CONFIG_USB_GSPCA_PAC207=m +CONFIG_USB_GSPCA_PAC7302=m +CONFIG_USB_GSPCA_PAC7311=m +CONFIG_USB_GSPCA_SE401=m +CONFIG_USB_GSPCA_SN9C2028=m +CONFIG_USB_GSPCA_SN9C20X=m +CONFIG_USB_GSPCA_SONIXB=m +CONFIG_USB_GSPCA_SONIXJ=m +CONFIG_USB_GSPCA_SPCA1528=m +CONFIG_USB_GSPCA_SPCA500=m +CONFIG_USB_GSPCA_SPCA501=m +CONFIG_USB_GSPCA_SPCA505=m +CONFIG_USB_GSPCA_SPCA506=m +CONFIG_USB_GSPCA_SPCA508=m +CONFIG_USB_GSPCA_SPCA561=m +CONFIG_USB_GSPCA_SQ905=m +CONFIG_USB_GSPCA_SQ905C=m +CONFIG_USB_GSPCA_SQ930X=m +CONFIG_USB_GSPCA_STK014=m +CONFIG_USB_GSPCA_STK1135=m +CONFIG_USB_GSPCA_STV0680=m +CONFIG_USB_GSPCA_SUNPLUS=m +CONFIG_USB_GSPCA_T613=m +CONFIG_USB_GSPCA_TOPRO=m +CONFIG_USB_GSPCA_TOUPTEK=m +CONFIG_USB_GSPCA_TV8532=m +CONFIG_USB_GSPCA_VC032X=m +CONFIG_USB_GSPCA_VICAM=m +CONFIG_USB_GSPCA_XIRLINK_CIT=m +CONFIG_USB_GSPCA_ZC3XX=m +CONFIG_USB_GL860=m +CONFIG_USB_M5602=m +CONFIG_USB_STV06XX=m +CONFIG_USB_PWC=m +# CONFIG_USB_PWC_DEBUG is not set +CONFIG_USB_PWC_INPUT_EVDEV=y +CONFIG_USB_S2255=m +CONFIG_VIDEO_USBTV=m +CONFIG_USB_VIDEO_CLASS=m +CONFIG_USB_VIDEO_CLASS_INPUT_EVDEV=y + +# +# Analog TV USB devices +# +CONFIG_VIDEO_GO7007=m +CONFIG_VIDEO_GO7007_USB=m +CONFIG_VIDEO_GO7007_LOADER=m +CONFIG_VIDEO_GO7007_USB_S2250_BOARD=m +CONFIG_VIDEO_HDPVR=m +CONFIG_VIDEO_PVRUSB2=m +CONFIG_VIDEO_PVRUSB2_SYSFS=y +CONFIG_VIDEO_PVRUSB2_DVB=y +# CONFIG_VIDEO_PVRUSB2_DEBUGIFC is not set +CONFIG_VIDEO_STK1160=m + +# +# Analog/digital TV USB devices +# +CONFIG_VIDEO_AU0828=m +CONFIG_VIDEO_AU0828_V4L2=y +CONFIG_VIDEO_AU0828_RC=y +CONFIG_VIDEO_CX231XX=m +CONFIG_VIDEO_CX231XX_RC=y +CONFIG_VIDEO_CX231XX_ALSA=m +CONFIG_VIDEO_CX231XX_DVB=m + +# +# Digital TV USB devices +# +CONFIG_DVB_AS102=m +CONFIG_DVB_B2C2_FLEXCOP_USB=m +# CONFIG_DVB_B2C2_FLEXCOP_USB_DEBUG is not set +CONFIG_DVB_USB_V2=m +CONFIG_DVB_USB_AF9015=m +CONFIG_DVB_USB_AF9035=m +CONFIG_DVB_USB_ANYSEE=m +CONFIG_DVB_USB_AU6610=m +CONFIG_DVB_USB_AZ6007=m +CONFIG_DVB_USB_CE6230=m +CONFIG_DVB_USB_DVBSKY=m +CONFIG_DVB_USB_EC168=m +CONFIG_DVB_USB_GL861=m +CONFIG_DVB_USB_LME2510=m +CONFIG_DVB_USB_MXL111SF=m +CONFIG_DVB_USB_RTL28XXU=m +CONFIG_DVB_USB_ZD1301=m +CONFIG_DVB_USB=m +# CONFIG_DVB_USB_DEBUG is not set +CONFIG_DVB_USB_A800=m +CONFIG_DVB_USB_AF9005=m +CONFIG_DVB_USB_AF9005_REMOTE=m +CONFIG_DVB_USB_AZ6027=m +CONFIG_DVB_USB_CINERGY_T2=m +CONFIG_DVB_USB_CXUSB=m +# CONFIG_DVB_USB_CXUSB_ANALOG is not set +CONFIG_DVB_USB_DIB0700=m +CONFIG_DVB_USB_DIB3000MC=m +CONFIG_DVB_USB_DIBUSB_MB=m +CONFIG_DVB_USB_DIBUSB_MB_FAULTY=y +CONFIG_DVB_USB_DIBUSB_MC=m +CONFIG_DVB_USB_DIGITV=m +CONFIG_DVB_USB_DTT200U=m +CONFIG_DVB_USB_DTV5100=m +CONFIG_DVB_USB_DW2102=m +CONFIG_DVB_USB_GP8PSK=m +CONFIG_DVB_USB_M920X=m +CONFIG_DVB_USB_NOVA_T_USB2=m +CONFIG_DVB_USB_OPERA1=m +CONFIG_DVB_USB_PCTV452E=m +CONFIG_DVB_USB_TECHNISAT_USB2=m +CONFIG_DVB_USB_TTUSB2=m +CONFIG_DVB_USB_UMT_010=m +CONFIG_DVB_USB_VP702X=m +CONFIG_DVB_USB_VP7045=m +CONFIG_SMS_USB_DRV=m +CONFIG_DVB_TTUSB_BUDGET=m +CONFIG_DVB_TTUSB_DEC=m + +# +# Webcam, TV (analog/digital) USB devices +# +CONFIG_VIDEO_EM28XX=m +CONFIG_VIDEO_EM28XX_V4L2=m +CONFIG_VIDEO_EM28XX_ALSA=m +CONFIG_VIDEO_EM28XX_DVB=m +CONFIG_VIDEO_EM28XX_RC=m + +# +# Software defined radio USB devices +# +CONFIG_USB_AIRSPY=m +CONFIG_USB_HACKRF=m +CONFIG_USB_MSI2500=m +CONFIG_MEDIA_PCI_SUPPORT=y + +# +# Media capture support +# +CONFIG_VIDEO_SOLO6X10=m +CONFIG_VIDEO_TW5864=m +CONFIG_VIDEO_TW68=m +CONFIG_VIDEO_TW686X=m +# CONFIG_VIDEO_ZORAN is not set + +# +# Media capture/analog TV support +# +CONFIG_VIDEO_DT3155=m +CONFIG_VIDEO_IVTV=m +CONFIG_VIDEO_IVTV_ALSA=m +CONFIG_VIDEO_FB_IVTV=m +CONFIG_VIDEO_HEXIUM_GEMINI=m +CONFIG_VIDEO_HEXIUM_ORION=m +CONFIG_VIDEO_MXB=m + +# +# Media capture/analog/hybrid TV support +# +CONFIG_VIDEO_BT848=m +CONFIG_DVB_BT8XX=m +CONFIG_VIDEO_CX18=m +CONFIG_VIDEO_CX18_ALSA=m +CONFIG_VIDEO_CX23885=m +CONFIG_MEDIA_ALTERA_CI=m +# CONFIG_VIDEO_CX25821 is not set +CONFIG_VIDEO_CX88=m +CONFIG_VIDEO_CX88_ALSA=m +CONFIG_VIDEO_CX88_BLACKBIRD=m +CONFIG_VIDEO_CX88_DVB=m +CONFIG_VIDEO_CX88_ENABLE_VP3054=y +CONFIG_VIDEO_CX88_VP3054=m +CONFIG_VIDEO_CX88_MPEG=m +CONFIG_VIDEO_SAA7134=m +CONFIG_VIDEO_SAA7134_ALSA=m +CONFIG_VIDEO_SAA7134_RC=y +CONFIG_VIDEO_SAA7134_DVB=m +# CONFIG_VIDEO_SAA7134_GO7007 is not set +CONFIG_VIDEO_SAA7164=m + +# +# Media digital TV PCI Adapters +# +CONFIG_DVB_B2C2_FLEXCOP_PCI=m +# CONFIG_DVB_B2C2_FLEXCOP_PCI_DEBUG is not set +CONFIG_DVB_DDBRIDGE=m +# CONFIG_DVB_DDBRIDGE_MSIENABLE is not set +CONFIG_DVB_DM1105=m +CONFIG_MANTIS_CORE=m +CONFIG_DVB_MANTIS=m +CONFIG_DVB_HOPPER=m +CONFIG_DVB_NETUP_UNIDVB=m +CONFIG_DVB_NGENE=m +CONFIG_DVB_PLUTO2=m +CONFIG_DVB_PT1=m +CONFIG_DVB_PT3=m +CONFIG_DVB_SMIPCIE=m +CONFIG_DVB_BUDGET_CORE=m +CONFIG_DVB_BUDGET=m +CONFIG_DVB_BUDGET_CI=m +CONFIG_DVB_BUDGET_AV=m +# CONFIG_IPU_BRIDGE is not set +CONFIG_RADIO_ADAPTERS=m +# CONFIG_RADIO_MAXIRADIO is not set +# CONFIG_RADIO_SAA7706H is not set +CONFIG_RADIO_SHARK=m +CONFIG_RADIO_SHARK2=m +# CONFIG_RADIO_SI4713 is not set +CONFIG_RADIO_TEA575X=m +# CONFIG_RADIO_TEA5764 is not set +# CONFIG_RADIO_TEF6862 is not set +# CONFIG_RADIO_WL1273 is not set +# CONFIG_USB_DSBR is not set +CONFIG_USB_KEENE=m +CONFIG_USB_MA901=m +CONFIG_USB_MR800=m +CONFIG_USB_RAREMONO=m +CONFIG_RADIO_SI470X=m +CONFIG_USB_SI470X=m +# CONFIG_I2C_SI470X is not set +# CONFIG_RADIO_WL128X is not set +CONFIG_MEDIA_PLATFORM_DRIVERS=y +CONFIG_V4L_PLATFORM_DRIVERS=y +# CONFIG_SDR_PLATFORM_DRIVERS is not set +# CONFIG_DVB_PLATFORM_DRIVERS is not set +CONFIG_V4L_MEM2MEM_DRIVERS=y +# CONFIG_VIDEO_MEM2MEM_DEINTERLACE is not set +# CONFIG_VIDEO_MUX is not set + +# +# Allegro DVT media platform drivers +# +# CONFIG_VIDEO_ALLEGRO_DVT is not set + +# +# Amlogic media platform drivers +# +# CONFIG_VIDEO_MESON_GE2D is not set + +# +# Amphion drivers +# + +# +# Aspeed media platform drivers +# + +# +# Atmel media platform drivers +# + +# +# Cadence media platform drivers +# +# CONFIG_VIDEO_CADENCE_CSI2RX is not set +# CONFIG_VIDEO_CADENCE_CSI2TX is not set + +# +# Chips&Media media platform drivers +# + +# +# Intel media platform drivers +# + +# +# Marvell media platform drivers +# +CONFIG_VIDEO_CAFE_CCIC=m + +# +# Mediatek media platform drivers +# + +# +# Microchip Technology, Inc. media platform drivers +# + +# +# NVidia media platform drivers +# +# CONFIG_VIDEO_TEGRA_VDE is not set + +# +# NXP media platform drivers +# + +# +# Qualcomm media platform drivers +# +# CONFIG_VIDEO_QCOM_CAMSS is not set + +# +# Renesas media platform drivers +# + +# +# Rockchip media platform drivers +# +# CONFIG_VIDEO_ROCKCHIP_RGA is not set +# CONFIG_VIDEO_ROCKCHIP_ISP1 is not set + +# +# Samsung media platform drivers +# + +# +# STMicroelectronics media platform drivers +# + +# +# Sunxi media platform drivers +# +# CONFIG_VIDEO_SUN4I_CSI is not set +# CONFIG_VIDEO_SUN6I_CSI is not set +# CONFIG_VIDEO_SUN8I_A83T_MIPI_CSI2 is not set +# CONFIG_VIDEO_SUN8I_DEINTERLACE is not set +# CONFIG_VIDEO_SUN8I_ROTATE is not set + +# +# Texas Instruments drivers +# + +# +# Verisilicon media platform drivers +# +# CONFIG_VIDEO_HANTRO is not set + +# +# VIA media platform drivers +# + +# +# Xilinx media platform drivers +# +# CONFIG_VIDEO_XILINX is not set + +# +# MMC/SDIO DVB adapters +# +CONFIG_SMS_SDIO_DRV=m +CONFIG_V4L_TEST_DRIVERS=y +# CONFIG_VIDEO_VIM2M is not set +# CONFIG_VIDEO_VICODEC is not set +# CONFIG_VIDEO_VIMC is not set +CONFIG_VIDEO_VIVID=m +CONFIG_VIDEO_VIVID_CEC=y +CONFIG_VIDEO_VIVID_MAX_DEVS=64 +# CONFIG_VIDEO_VISL is not set +# CONFIG_DVB_TEST_DRIVERS is not set + +# +# FireWire (IEEE 1394) Adapters +# +CONFIG_DVB_FIREDTV=m +CONFIG_DVB_FIREDTV_INPUT=y +CONFIG_MEDIA_COMMON_OPTIONS=y + +# +# common driver options +# +CONFIG_CYPRESS_FIRMWARE=m +CONFIG_TTPCI_EEPROM=m +CONFIG_UVC_COMMON=m +CONFIG_VIDEO_CX2341X=m +CONFIG_VIDEO_TVEEPROM=m +CONFIG_DVB_B2C2_FLEXCOP=m +CONFIG_VIDEO_SAA7146=m +CONFIG_VIDEO_SAA7146_VV=m +CONFIG_SMS_SIANO_MDTV=m +CONFIG_SMS_SIANO_RC=y +# CONFIG_SMS_SIANO_DEBUGFS is not set +CONFIG_VIDEO_V4L2_TPG=m +CONFIG_VIDEOBUF2_CORE=m +CONFIG_VIDEOBUF2_V4L2=m +CONFIG_VIDEOBUF2_MEMOPS=m +CONFIG_VIDEOBUF2_DMA_CONTIG=m +CONFIG_VIDEOBUF2_VMALLOC=m +CONFIG_VIDEOBUF2_DMA_SG=m +CONFIG_VIDEOBUF2_DVB=m +# end of Media drivers + +# +# Media ancillary drivers +# +CONFIG_MEDIA_ATTACH=y + +# +# IR I2C driver auto-selected by 'Autoselect ancillary drivers' +# +CONFIG_VIDEO_IR_I2C=m +CONFIG_VIDEO_CAMERA_SENSOR=y +# CONFIG_VIDEO_AR0521 is not set +# CONFIG_VIDEO_HI556 is not set +# CONFIG_VIDEO_HI846 is not set +# CONFIG_VIDEO_HI847 is not set +# CONFIG_VIDEO_IMX208 is not set +# CONFIG_VIDEO_IMX214 is not set +# CONFIG_VIDEO_IMX219 is not set +# CONFIG_VIDEO_IMX258 is not set +# CONFIG_VIDEO_IMX274 is not set +# CONFIG_VIDEO_IMX290 is not set +# CONFIG_VIDEO_IMX296 is not set +# CONFIG_VIDEO_IMX319 is not set +# CONFIG_VIDEO_IMX334 is not set +# CONFIG_VIDEO_IMX335 is not set +# CONFIG_VIDEO_IMX355 is not set +# CONFIG_VIDEO_IMX412 is not set +# CONFIG_VIDEO_IMX415 is not set +# CONFIG_VIDEO_MT9M001 is not set +# CONFIG_VIDEO_MT9M111 is not set +# CONFIG_VIDEO_MT9P031 is not set +# CONFIG_VIDEO_MT9T112 is not set +CONFIG_VIDEO_MT9V011=m +# CONFIG_VIDEO_MT9V032 is not set +# CONFIG_VIDEO_MT9V111 is not set +# CONFIG_VIDEO_OG01A1B is not set +# CONFIG_VIDEO_OV01A10 is not set +# CONFIG_VIDEO_OV02A10 is not set +# CONFIG_VIDEO_OV08D10 is not set +# CONFIG_VIDEO_OV08X40 is not set +# CONFIG_VIDEO_OV13858 is not set +# CONFIG_VIDEO_OV13B10 is not set +CONFIG_VIDEO_OV2640=m +# CONFIG_VIDEO_OV2659 is not set +# CONFIG_VIDEO_OV2680 is not set +# CONFIG_VIDEO_OV2685 is not set +# CONFIG_VIDEO_OV2740 is not set +# CONFIG_VIDEO_OV4689 is not set +# CONFIG_VIDEO_OV5640 is not set +# CONFIG_VIDEO_OV5645 is not set +# CONFIG_VIDEO_OV5647 is not set +# CONFIG_VIDEO_OV5648 is not set +# CONFIG_VIDEO_OV5670 is not set +# CONFIG_VIDEO_OV5675 is not set +# CONFIG_VIDEO_OV5693 is not set +# CONFIG_VIDEO_OV5695 is not set +# CONFIG_VIDEO_OV6650 is not set +# CONFIG_VIDEO_OV7251 is not set +CONFIG_VIDEO_OV7640=m +CONFIG_VIDEO_OV7670=m +# CONFIG_VIDEO_OV772X is not set +# CONFIG_VIDEO_OV7740 is not set +# CONFIG_VIDEO_OV8856 is not set +# CONFIG_VIDEO_OV8858 is not set +# CONFIG_VIDEO_OV8865 is not set +# CONFIG_VIDEO_OV9282 is not set +# CONFIG_VIDEO_OV9640 is not set +# CONFIG_VIDEO_OV9650 is not set +# CONFIG_VIDEO_OV9734 is not set +# CONFIG_VIDEO_RDACM20 is not set +# CONFIG_VIDEO_RDACM21 is not set +# CONFIG_VIDEO_RJ54N1 is not set +# CONFIG_VIDEO_S5C73M3 is not set +# CONFIG_VIDEO_S5K5BAF is not set +# CONFIG_VIDEO_S5K6A3 is not set +# CONFIG_VIDEO_ST_VGXY61 is not set +# CONFIG_VIDEO_CCS is not set +# CONFIG_VIDEO_ET8EK8 is not set + +# +# Lens drivers +# +# CONFIG_VIDEO_AD5820 is not set +# CONFIG_VIDEO_AK7375 is not set +# CONFIG_VIDEO_DW9714 is not set +# CONFIG_VIDEO_DW9719 is not set +# CONFIG_VIDEO_DW9768 is not set +# CONFIG_VIDEO_DW9807_VCM is not set +# end of Lens drivers + +# +# Flash devices +# +# CONFIG_VIDEO_ADP1653 is not set +# CONFIG_VIDEO_LM3560 is not set +# CONFIG_VIDEO_LM3646 is not set +# end of Flash devices + +# +# Audio decoders, processors and mixers +# +CONFIG_VIDEO_CS3308=m +CONFIG_VIDEO_CS5345=m +CONFIG_VIDEO_CS53L32A=m +CONFIG_VIDEO_MSP3400=m +CONFIG_VIDEO_SONY_BTF_MPX=m +# CONFIG_VIDEO_TDA1997X is not set +CONFIG_VIDEO_TDA7432=m +CONFIG_VIDEO_TDA9840=m +CONFIG_VIDEO_TEA6415C=m +CONFIG_VIDEO_TEA6420=m +CONFIG_VIDEO_TLV320AIC23B=m +CONFIG_VIDEO_TVAUDIO=m +CONFIG_VIDEO_UDA1342=m +CONFIG_VIDEO_VP27SMPX=m +CONFIG_VIDEO_WM8739=m +CONFIG_VIDEO_WM8775=m +# end of Audio decoders, processors and mixers + +# +# RDS decoders +# +CONFIG_VIDEO_SAA6588=m +# end of RDS decoders + +# +# Video decoders +# +# CONFIG_VIDEO_ADV7180 is not set +# CONFIG_VIDEO_ADV7183 is not set +# CONFIG_VIDEO_ADV748X is not set +# CONFIG_VIDEO_ADV7604 is not set +# CONFIG_VIDEO_ADV7842 is not set +CONFIG_VIDEO_BT819=m +CONFIG_VIDEO_BT856=m +# CONFIG_VIDEO_BT866 is not set +# CONFIG_VIDEO_ISL7998X is not set +CONFIG_VIDEO_KS0127=m +# CONFIG_VIDEO_MAX9286 is not set +# CONFIG_VIDEO_ML86V7667 is not set +CONFIG_VIDEO_SAA7110=m +CONFIG_VIDEO_SAA711X=m +# CONFIG_VIDEO_TC358743 is not set +# CONFIG_VIDEO_TC358746 is not set +# CONFIG_VIDEO_TVP514X is not set +CONFIG_VIDEO_TVP5150=m +# CONFIG_VIDEO_TVP7002 is not set +CONFIG_VIDEO_TW2804=m +CONFIG_VIDEO_TW9903=m +CONFIG_VIDEO_TW9906=m +# CONFIG_VIDEO_TW9910 is not set +CONFIG_VIDEO_VPX3220=m + +# +# Video and audio decoders +# +CONFIG_VIDEO_SAA717X=m +CONFIG_VIDEO_CX25840=m +# end of Video decoders + +# +# Video encoders +# +CONFIG_VIDEO_ADV7170=m +CONFIG_VIDEO_ADV7175=m +# CONFIG_VIDEO_ADV7343 is not set +# CONFIG_VIDEO_ADV7393 is not set +# CONFIG_VIDEO_AK881X is not set +CONFIG_VIDEO_SAA7127=m +CONFIG_VIDEO_SAA7185=m +# CONFIG_VIDEO_THS8200 is not set +# end of Video encoders + +# +# Video improvement chips +# +CONFIG_VIDEO_UPD64031A=m +CONFIG_VIDEO_UPD64083=m +# end of Video improvement chips + +# +# Audio/Video compression chips +# +CONFIG_VIDEO_SAA6752HS=m +# end of Audio/Video compression chips + +# +# SDR tuner chips +# +# CONFIG_SDR_MAX2175 is not set +# end of SDR tuner chips + +# +# Miscellaneous helper chips +# +# CONFIG_VIDEO_I2C is not set +CONFIG_VIDEO_M52790=m +# CONFIG_VIDEO_ST_MIPID02 is not set +# CONFIG_VIDEO_THS7303 is not set +# end of Miscellaneous helper chips + +# +# Video serializers and deserializers +# +# CONFIG_VIDEO_DS90UB913 is not set +# CONFIG_VIDEO_DS90UB953 is not set +# CONFIG_VIDEO_DS90UB960 is not set +# end of Video serializers and deserializers + +# +# Media SPI Adapters +# +# CONFIG_CXD2880_SPI_DRV is not set +# CONFIG_VIDEO_GS1662 is not set +# end of Media SPI Adapters + +CONFIG_MEDIA_TUNER=m + +# +# Customize TV tuners +# +CONFIG_MEDIA_TUNER_E4000=m +CONFIG_MEDIA_TUNER_FC0011=m +CONFIG_MEDIA_TUNER_FC0012=m +CONFIG_MEDIA_TUNER_FC0013=m +CONFIG_MEDIA_TUNER_FC2580=m +CONFIG_MEDIA_TUNER_IT913X=m +CONFIG_MEDIA_TUNER_M88RS6000T=m +CONFIG_MEDIA_TUNER_MAX2165=m +CONFIG_MEDIA_TUNER_MC44S803=m +CONFIG_MEDIA_TUNER_MSI001=m +CONFIG_MEDIA_TUNER_MT2060=m +CONFIG_MEDIA_TUNER_MT2063=m +CONFIG_MEDIA_TUNER_MT20XX=m +CONFIG_MEDIA_TUNER_MT2131=m +CONFIG_MEDIA_TUNER_MT2266=m +CONFIG_MEDIA_TUNER_MXL301RF=m +CONFIG_MEDIA_TUNER_MXL5005S=m +CONFIG_MEDIA_TUNER_MXL5007T=m +CONFIG_MEDIA_TUNER_QM1D1B0004=m +CONFIG_MEDIA_TUNER_QM1D1C0042=m +CONFIG_MEDIA_TUNER_QT1010=m +CONFIG_MEDIA_TUNER_R820T=m +CONFIG_MEDIA_TUNER_SI2157=m +CONFIG_MEDIA_TUNER_SIMPLE=m +CONFIG_MEDIA_TUNER_TDA18212=m +CONFIG_MEDIA_TUNER_TDA18218=m +CONFIG_MEDIA_TUNER_TDA18250=m +CONFIG_MEDIA_TUNER_TDA18271=m +CONFIG_MEDIA_TUNER_TDA827X=m +CONFIG_MEDIA_TUNER_TDA8290=m +CONFIG_MEDIA_TUNER_TDA9887=m +CONFIG_MEDIA_TUNER_TEA5761=m +CONFIG_MEDIA_TUNER_TEA5767=m +CONFIG_MEDIA_TUNER_TUA9001=m +CONFIG_MEDIA_TUNER_XC2028=m +CONFIG_MEDIA_TUNER_XC4000=m +CONFIG_MEDIA_TUNER_XC5000=m +# end of Customize TV tuners + +# +# Customise DVB Frontends +# + +# +# Multistandard (satellite) frontends +# +CONFIG_DVB_M88DS3103=m +CONFIG_DVB_MXL5XX=m +CONFIG_DVB_STB0899=m +CONFIG_DVB_STB6100=m +CONFIG_DVB_STV090x=m +CONFIG_DVB_STV0910=m +CONFIG_DVB_STV6110x=m +CONFIG_DVB_STV6111=m + +# +# Multistandard (cable + terrestrial) frontends +# +CONFIG_DVB_DRXK=m +CONFIG_DVB_MN88472=m +CONFIG_DVB_MN88473=m +CONFIG_DVB_SI2165=m +CONFIG_DVB_TDA18271C2DD=m + +# +# DVB-S (satellite) frontends +# +CONFIG_DVB_CX24110=m +CONFIG_DVB_CX24116=m +CONFIG_DVB_CX24117=m +CONFIG_DVB_CX24120=m +CONFIG_DVB_CX24123=m +CONFIG_DVB_DS3000=m +CONFIG_DVB_MB86A16=m +CONFIG_DVB_MT312=m +CONFIG_DVB_S5H1420=m +CONFIG_DVB_SI21XX=m +CONFIG_DVB_STB6000=m +CONFIG_DVB_STV0288=m +CONFIG_DVB_STV0299=m +CONFIG_DVB_STV0900=m +CONFIG_DVB_STV6110=m +CONFIG_DVB_TDA10071=m +CONFIG_DVB_TDA10086=m +CONFIG_DVB_TDA8083=m +CONFIG_DVB_TDA8261=m +CONFIG_DVB_TDA826X=m +CONFIG_DVB_TS2020=m +CONFIG_DVB_TUA6100=m +CONFIG_DVB_TUNER_CX24113=m +CONFIG_DVB_TUNER_ITD1000=m +CONFIG_DVB_VES1X93=m +CONFIG_DVB_ZL10036=m +CONFIG_DVB_ZL10039=m + +# +# DVB-T (terrestrial) frontends +# +CONFIG_DVB_AF9013=m +CONFIG_DVB_AS102_FE=m +CONFIG_DVB_CX22700=m +CONFIG_DVB_CX22702=m +CONFIG_DVB_CXD2820R=m +CONFIG_DVB_CXD2841ER=m +CONFIG_DVB_DIB3000MB=m +CONFIG_DVB_DIB3000MC=m +CONFIG_DVB_DIB7000M=m +CONFIG_DVB_DIB7000P=m +# CONFIG_DVB_DIB9000 is not set +CONFIG_DVB_DRXD=m +CONFIG_DVB_EC100=m +CONFIG_DVB_GP8PSK_FE=m +CONFIG_DVB_L64781=m +CONFIG_DVB_MT352=m +CONFIG_DVB_NXT6000=m +CONFIG_DVB_RTL2830=m +CONFIG_DVB_RTL2832=m +CONFIG_DVB_RTL2832_SDR=m +# CONFIG_DVB_S5H1432 is not set +CONFIG_DVB_SI2168=m +CONFIG_DVB_SP887X=m +CONFIG_DVB_STV0367=m +CONFIG_DVB_TDA10048=m +CONFIG_DVB_TDA1004X=m +CONFIG_DVB_ZD1301_DEMOD=m +CONFIG_DVB_ZL10353=m +# CONFIG_DVB_CXD2880 is not set + +# +# DVB-C (cable) frontends +# +CONFIG_DVB_STV0297=m +CONFIG_DVB_TDA10021=m +CONFIG_DVB_TDA10023=m +CONFIG_DVB_VES1820=m + +# +# ATSC (North American/Korean Terrestrial/Cable DTV) frontends +# +CONFIG_DVB_AU8522=m +CONFIG_DVB_AU8522_DTV=m +CONFIG_DVB_AU8522_V4L=m +CONFIG_DVB_BCM3510=m +CONFIG_DVB_LG2160=m +CONFIG_DVB_LGDT3305=m +CONFIG_DVB_LGDT3306A=m +CONFIG_DVB_LGDT330X=m +CONFIG_DVB_MXL692=m +CONFIG_DVB_NXT200X=m +CONFIG_DVB_OR51132=m +CONFIG_DVB_OR51211=m +CONFIG_DVB_S5H1409=m +CONFIG_DVB_S5H1411=m + +# +# ISDB-T (terrestrial) frontends +# +CONFIG_DVB_DIB8000=m +CONFIG_DVB_MB86A20S=m +CONFIG_DVB_S921=m + +# +# ISDB-S (satellite) & ISDB-T (terrestrial) frontends +# +# CONFIG_DVB_MN88443X is not set +CONFIG_DVB_TC90522=m + +# +# Digital terrestrial only tuners/PLL +# +CONFIG_DVB_PLL=m +CONFIG_DVB_TUNER_DIB0070=m +CONFIG_DVB_TUNER_DIB0090=m + +# +# SEC control devices for DVB-S +# +CONFIG_DVB_A8293=m +CONFIG_DVB_AF9033=m +CONFIG_DVB_ASCOT2E=m +CONFIG_DVB_ATBM8830=m +CONFIG_DVB_HELENE=m +CONFIG_DVB_HORUS3A=m +CONFIG_DVB_ISL6405=m +CONFIG_DVB_ISL6421=m +CONFIG_DVB_ISL6423=m +CONFIG_DVB_IX2505V=m +# CONFIG_DVB_LGS8GL5 is not set +CONFIG_DVB_LGS8GXX=m +CONFIG_DVB_LNBH25=m +# CONFIG_DVB_LNBH29 is not set +CONFIG_DVB_LNBP21=m +CONFIG_DVB_LNBP22=m +CONFIG_DVB_M88RS2000=m +CONFIG_DVB_TDA665x=m +CONFIG_DVB_DRX39XYJ=m + +# +# Common Interface (EN50221) controller drivers +# +CONFIG_DVB_CXD2099=m +CONFIG_DVB_SP2=m +# end of Customise DVB Frontends + +# +# Tools to develop new frontends +# +CONFIG_DVB_DUMMY_FE=m +# end of Media ancillary drivers + +# +# Graphics support +# +CONFIG_APERTURE_HELPERS=y +CONFIG_SCREEN_INFO=y +CONFIG_VIDEO_CMDLINE=y +CONFIG_VIDEO_NOMODESET=y +# CONFIG_AUXDISPLAY is not set +# CONFIG_PANEL is not set +CONFIG_TEGRA_HOST1X_CONTEXT_BUS=y +CONFIG_TEGRA_HOST1X=m +CONFIG_TEGRA_HOST1X_FIREWALL=y +CONFIG_DRM=m +CONFIG_DRM_MIPI_DSI=y +CONFIG_DRM_KMS_HELPER=m +# CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS is not set +# CONFIG_DRM_DEBUG_MODESET_LOCK is not set +CONFIG_DRM_FBDEV_EMULATION=y +CONFIG_DRM_FBDEV_OVERALLOC=100 +# CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM is not set +CONFIG_DRM_LOAD_EDID_FIRMWARE=y +CONFIG_DRM_DP_AUX_BUS=m +CONFIG_DRM_DISPLAY_HELPER=m +CONFIG_DRM_DISPLAY_DP_HELPER=y +CONFIG_DRM_DISPLAY_HDCP_HELPER=y +CONFIG_DRM_DISPLAY_HDMI_HELPER=y +CONFIG_DRM_DP_AUX_CHARDEV=y +# CONFIG_DRM_DP_CEC is not set +CONFIG_DRM_TTM=m +CONFIG_DRM_EXEC=m +CONFIG_DRM_BUDDY=m +CONFIG_DRM_VRAM_HELPER=m +CONFIG_DRM_TTM_HELPER=m +CONFIG_DRM_GEM_DMA_HELPER=m +CONFIG_DRM_GEM_SHMEM_HELPER=m +CONFIG_DRM_SUBALLOC_HELPER=m +CONFIG_DRM_SCHED=m + +# +# I2C encoder or helper chips +# +# CONFIG_DRM_I2C_CH7006 is not set +# CONFIG_DRM_I2C_SIL164 is not set +# CONFIG_DRM_I2C_NXP_TDA998X is not set +# CONFIG_DRM_I2C_NXP_TDA9950 is not set +# end of I2C encoder or helper chips + +# +# ARM devices +# +CONFIG_DRM_HDLCD=m +# CONFIG_DRM_HDLCD_SHOW_UNDERRUN is not set +CONFIG_DRM_MALI_DISPLAY=m +# CONFIG_DRM_KOMEDA is not set +# end of ARM devices + +CONFIG_DRM_RADEON=m +# CONFIG_DRM_RADEON_USERPTR is not set +CONFIG_DRM_AMDGPU=m +CONFIG_DRM_AMDGPU_SI=y +CONFIG_DRM_AMDGPU_CIK=y +# CONFIG_DRM_AMDGPU_USERPTR is not set +# CONFIG_DRM_AMDGPU_WERROR is not set + +# +# ACP (Audio CoProcessor) Configuration +# +# CONFIG_DRM_AMD_ACP is not set +# end of ACP (Audio CoProcessor) Configuration + +# +# Display Engine Configuration +# +CONFIG_DRM_AMD_DC=y +CONFIG_DRM_AMD_DC_FP=y +# CONFIG_DRM_AMD_DC_SI is not set +# CONFIG_DRM_AMD_SECURE_DISPLAY is not set +# end of Display Engine Configuration + +# CONFIG_HSA_AMD is not set +CONFIG_DRM_NOUVEAU=m +CONFIG_NOUVEAU_PLATFORM_DRIVER=y +CONFIG_NOUVEAU_DEBUG=5 +CONFIG_NOUVEAU_DEBUG_DEFAULT=3 +# CONFIG_NOUVEAU_DEBUG_MMU is not set +# CONFIG_NOUVEAU_DEBUG_PUSH is not set +CONFIG_DRM_NOUVEAU_BACKLIGHT=y +# CONFIG_DRM_NOUVEAU_SVM is not set +CONFIG_DRM_VGEM=m +# CONFIG_DRM_VKMS is not set +CONFIG_DRM_ROCKCHIP=m +CONFIG_ROCKCHIP_VOP=y +# CONFIG_ROCKCHIP_VOP2 is not set +CONFIG_ROCKCHIP_ANALOGIX_DP=y +CONFIG_ROCKCHIP_CDN_DP=y +CONFIG_ROCKCHIP_DW_HDMI=y +CONFIG_ROCKCHIP_DW_MIPI_DSI=y +# CONFIG_ROCKCHIP_INNO_HDMI is not set +# CONFIG_ROCKCHIP_LVDS is not set +# CONFIG_ROCKCHIP_RGB is not set +# CONFIG_ROCKCHIP_RK3066_HDMI is not set +# CONFIG_DRM_VMWGFX is not set +CONFIG_DRM_UDL=m +CONFIG_DRM_AST=m +# CONFIG_DRM_MGAG200 is not set +CONFIG_DRM_SUN4I=m +# CONFIG_DRM_SUN6I_DSI is not set +CONFIG_DRM_SUN8I_DW_HDMI=m +# CONFIG_DRM_SUN8I_MIXER is not set +CONFIG_DRM_QXL=m +CONFIG_DRM_VIRTIO_GPU=m +CONFIG_DRM_VIRTIO_GPU_KMS=y +CONFIG_DRM_MSM=m +CONFIG_DRM_MSM_GPU_STATE=y +# CONFIG_DRM_MSM_GPU_SUDO is not set +CONFIG_DRM_MSM_MDSS=y +CONFIG_DRM_MSM_MDP4=y +CONFIG_DRM_MSM_MDP5=y +CONFIG_DRM_MSM_DPU=y +CONFIG_DRM_MSM_DP=y +CONFIG_DRM_MSM_DSI=y +CONFIG_DRM_MSM_DSI_28NM_PHY=y +CONFIG_DRM_MSM_DSI_20NM_PHY=y +CONFIG_DRM_MSM_DSI_28NM_8960_PHY=y +CONFIG_DRM_MSM_DSI_14NM_PHY=y +CONFIG_DRM_MSM_DSI_10NM_PHY=y +CONFIG_DRM_MSM_DSI_7NM_PHY=y +CONFIG_DRM_MSM_HDMI=y +CONFIG_DRM_MSM_HDMI_HDCP=y +CONFIG_DRM_TEGRA=m +# CONFIG_DRM_TEGRA_DEBUG is not set +CONFIG_DRM_TEGRA_STAGING=y +CONFIG_DRM_PANEL=y + +# +# Display Panels +# +# CONFIG_DRM_PANEL_ABT_Y030XX067A is not set +# CONFIG_DRM_PANEL_ARM_VERSATILE is not set +# CONFIG_DRM_PANEL_ASUS_Z00T_TM5P5_NT35596 is not set +# CONFIG_DRM_PANEL_AUO_A030JTN01 is not set +# CONFIG_DRM_PANEL_BOE_BF060Y8M_AJ0 is not set +# CONFIG_DRM_PANEL_BOE_HIMAX8279D is not set +# CONFIG_DRM_PANEL_BOE_TV101WUM_NL6 is not set +# CONFIG_DRM_PANEL_DSI_CM is not set +# CONFIG_DRM_PANEL_LVDS is not set +CONFIG_DRM_PANEL_SIMPLE=m +# CONFIG_DRM_PANEL_EDP is not set +# CONFIG_DRM_PANEL_EBBG_FT8719 is not set +# CONFIG_DRM_PANEL_ELIDA_KD35T133 is not set +# CONFIG_DRM_PANEL_FEIXIN_K101_IM2BA02 is not set +# CONFIG_DRM_PANEL_FEIYANG_FY07024DI26A30D is not set +# CONFIG_DRM_PANEL_HIMAX_HX8394 is not set +# CONFIG_DRM_PANEL_ILITEK_IL9322 is not set +# CONFIG_DRM_PANEL_ILITEK_ILI9341 is not set +# CONFIG_DRM_PANEL_ILITEK_ILI9881C is not set +# CONFIG_DRM_PANEL_INNOLUX_EJ030NA is not set +# CONFIG_DRM_PANEL_INNOLUX_P079ZCA is not set +# CONFIG_DRM_PANEL_JADARD_JD9365DA_H3 is not set +# CONFIG_DRM_PANEL_JDI_LT070ME05000 is not set +# CONFIG_DRM_PANEL_JDI_R63452 is not set +# CONFIG_DRM_PANEL_KHADAS_TS050 is not set +# CONFIG_DRM_PANEL_KINGDISPLAY_KD097D04 is not set +# CONFIG_DRM_PANEL_LEADTEK_LTK050H3146W is not set +# CONFIG_DRM_PANEL_LEADTEK_LTK500HD1829 is not set +# CONFIG_DRM_PANEL_SAMSUNG_LD9040 is not set +# CONFIG_DRM_PANEL_LG_LB035Q02 is not set +# CONFIG_DRM_PANEL_LG_LG4573 is not set +# CONFIG_DRM_PANEL_MAGNACHIP_D53E6EA8966 is not set +# CONFIG_DRM_PANEL_NEC_NL8048HL11 is not set +# CONFIG_DRM_PANEL_NEWVISION_NV3051D is not set +# CONFIG_DRM_PANEL_NEWVISION_NV3052C is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35510 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35560 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT35950 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT36523 is not set +# CONFIG_DRM_PANEL_NOVATEK_NT36672A is not set +# CONFIG_DRM_PANEL_NOVATEK_NT39016 is not set +# CONFIG_DRM_PANEL_MANTIX_MLAF057WE51 is not set +# CONFIG_DRM_PANEL_OLIMEX_LCD_OLINUXINO is not set +# CONFIG_DRM_PANEL_ORISETECH_OTA5601A is not set +# CONFIG_DRM_PANEL_ORISETECH_OTM8009A is not set +# CONFIG_DRM_PANEL_OSD_OSD101T2587_53TS is not set +# CONFIG_DRM_PANEL_PANASONIC_VVX10F034N00 is not set +CONFIG_DRM_PANEL_RASPBERRYPI_TOUCHSCREEN=m +# CONFIG_DRM_PANEL_RAYDIUM_RM67191 is not set +# CONFIG_DRM_PANEL_RAYDIUM_RM68200 is not set +# CONFIG_DRM_PANEL_RONBO_RB070D30 is not set +# CONFIG_DRM_PANEL_SAMSUNG_ATNA33XC20 is not set +# CONFIG_DRM_PANEL_SAMSUNG_DB7430 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D16D0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D27A1 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6D7AA0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E3HA2 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E63J0X03 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E63M0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E88A0_AMS452EF01 is not set +# CONFIG_DRM_PANEL_SAMSUNG_S6E8AA0 is not set +# CONFIG_DRM_PANEL_SAMSUNG_SOFEF00 is not set +# CONFIG_DRM_PANEL_SEIKO_43WVF1G is not set +# CONFIG_DRM_PANEL_SHARP_LQ101R1SX01 is not set +# CONFIG_DRM_PANEL_SHARP_LS037V7DW01 is not set +# CONFIG_DRM_PANEL_SHARP_LS043T1LE01 is not set +# CONFIG_DRM_PANEL_SHARP_LS060T1SX01 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7701 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7703 is not set +# CONFIG_DRM_PANEL_SITRONIX_ST7789V is not set +# CONFIG_DRM_PANEL_SONY_ACX565AKM is not set +# CONFIG_DRM_PANEL_SONY_TD4353_JDI is not set +# CONFIG_DRM_PANEL_SONY_TULIP_TRULY_NT35521 is not set +# CONFIG_DRM_PANEL_STARTEK_KD070FHFID015 is not set +# CONFIG_DRM_PANEL_TDO_TL070WSH30 is not set +# CONFIG_DRM_PANEL_TPO_TD028TTEC1 is not set +# CONFIG_DRM_PANEL_TPO_TD043MTEA1 is not set +# CONFIG_DRM_PANEL_TPO_TPG110 is not set +# CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA is not set +# CONFIG_DRM_PANEL_VISIONOX_RM69299 is not set +# CONFIG_DRM_PANEL_VISIONOX_VTDR6130 is not set +# CONFIG_DRM_PANEL_VISIONOX_R66451 is not set +# CONFIG_DRM_PANEL_WIDECHIPS_WS2401 is not set +# CONFIG_DRM_PANEL_XINPENG_XPP055C272 is not set +# end of Display Panels + +CONFIG_DRM_BRIDGE=y +CONFIG_DRM_PANEL_BRIDGE=y + +# +# Display Interface Bridges +# +# CONFIG_DRM_CHIPONE_ICN6211 is not set +# CONFIG_DRM_CHRONTEL_CH7033 is not set +# CONFIG_DRM_CROS_EC_ANX7688 is not set +CONFIG_DRM_DISPLAY_CONNECTOR=m +# CONFIG_DRM_ITE_IT6505 is not set +# CONFIG_DRM_LONTIUM_LT8912B is not set +# CONFIG_DRM_LONTIUM_LT9211 is not set +# CONFIG_DRM_LONTIUM_LT9611 is not set +# CONFIG_DRM_LONTIUM_LT9611UXC is not set +# CONFIG_DRM_ITE_IT66121 is not set +# CONFIG_DRM_LVDS_CODEC is not set +# CONFIG_DRM_MEGACHIPS_STDPXXXX_GE_B850V3_FW is not set +# CONFIG_DRM_NWL_MIPI_DSI is not set +# CONFIG_DRM_NXP_PTN3460 is not set +# CONFIG_DRM_PARADE_PS8622 is not set +# CONFIG_DRM_PARADE_PS8640 is not set +# CONFIG_DRM_SAMSUNG_DSIM is not set +# CONFIG_DRM_SIL_SII8620 is not set +# CONFIG_DRM_SII902X is not set +# CONFIG_DRM_SII9234 is not set +# CONFIG_DRM_SIMPLE_BRIDGE is not set +# CONFIG_DRM_THINE_THC63LVD1024 is not set +# CONFIG_DRM_TOSHIBA_TC358762 is not set +# CONFIG_DRM_TOSHIBA_TC358764 is not set +# CONFIG_DRM_TOSHIBA_TC358767 is not set +# CONFIG_DRM_TOSHIBA_TC358768 is not set +# CONFIG_DRM_TOSHIBA_TC358775 is not set +# CONFIG_DRM_TI_DLPC3433 is not set +# CONFIG_DRM_TI_TFP410 is not set +# CONFIG_DRM_TI_SN65DSI83 is not set +# CONFIG_DRM_TI_SN65DSI86 is not set +# CONFIG_DRM_TI_TPD12S015 is not set +# CONFIG_DRM_ANALOGIX_ANX6345 is not set +# CONFIG_DRM_ANALOGIX_ANX78XX is not set +CONFIG_DRM_ANALOGIX_DP=m +# CONFIG_DRM_ANALOGIX_ANX7625 is not set +CONFIG_DRM_I2C_ADV7511=m +CONFIG_DRM_I2C_ADV7511_AUDIO=y +CONFIG_DRM_I2C_ADV7511_CEC=y +# CONFIG_DRM_CDNS_DSI is not set +# CONFIG_DRM_CDNS_MHDP8546 is not set +CONFIG_DRM_DW_HDMI=m +# CONFIG_DRM_DW_HDMI_AHB_AUDIO is not set +CONFIG_DRM_DW_HDMI_I2S_AUDIO=m +# CONFIG_DRM_DW_HDMI_GP_AUDIO is not set +# CONFIG_DRM_DW_HDMI_CEC is not set +CONFIG_DRM_DW_MIPI_DSI=m +# end of Display Interface Bridges + +# CONFIG_DRM_LOONGSON is not set +# CONFIG_DRM_ETNAVIV is not set +CONFIG_DRM_HISI_HIBMC=m +CONFIG_DRM_HISI_KIRIN=m +# CONFIG_DRM_LOGICVC is not set +CONFIG_DRM_MESON=m +CONFIG_DRM_MESON_DW_HDMI=m +CONFIG_DRM_MESON_DW_MIPI_DSI=m +# CONFIG_DRM_ARCPGU is not set +CONFIG_DRM_BOCHS=m +CONFIG_DRM_CIRRUS_QEMU=m +# CONFIG_DRM_GM12U320 is not set +# CONFIG_DRM_PANEL_MIPI_DBI is not set +# CONFIG_DRM_SIMPLEDRM is not set +# CONFIG_TINYDRM_HX8357D is not set +# CONFIG_TINYDRM_ILI9163 is not set +# CONFIG_TINYDRM_ILI9225 is not set +# CONFIG_TINYDRM_ILI9341 is not set +# CONFIG_TINYDRM_ILI9486 is not set +# CONFIG_TINYDRM_MI0283QT is not set +# CONFIG_TINYDRM_REPAPER is not set +# CONFIG_TINYDRM_ST7586 is not set +# CONFIG_TINYDRM_ST7735R is not set +# CONFIG_DRM_PL111 is not set +# CONFIG_DRM_XEN_FRONTEND is not set +CONFIG_DRM_LIMA=m +CONFIG_DRM_PANFROST=m +# CONFIG_DRM_TIDSS is not set +# CONFIG_DRM_GUD is not set +# CONFIG_DRM_SSD130X is not set +# CONFIG_DRM_HYPERV is not set +CONFIG_DRM_LEGACY=y +CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y + +# +# Frame buffer Devices +# +CONFIG_FB=y +CONFIG_FB_SVGALIB=m +# CONFIG_FB_CIRRUS is not set +# CONFIG_FB_PM2 is not set +CONFIG_FB_ARMCLCD=y +# CONFIG_FB_CYBER2000 is not set +# CONFIG_FB_ASILIANT is not set +# CONFIG_FB_IMSTT is not set +# CONFIG_FB_UVESA is not set +CONFIG_FB_EFI=y +# CONFIG_FB_OPENCORES is not set +# CONFIG_FB_S1D13XXX is not set +# CONFIG_FB_NVIDIA is not set +# CONFIG_FB_RIVA is not set +# CONFIG_FB_I740 is not set +# CONFIG_FB_MATROX is not set +# CONFIG_FB_RADEON is not set +# CONFIG_FB_ATY128 is not set +# CONFIG_FB_ATY is not set +CONFIG_FB_S3=m +CONFIG_FB_S3_DDC=y +# CONFIG_FB_SAVAGE is not set +# CONFIG_FB_SIS is not set +# CONFIG_FB_NEOMAGIC is not set +# CONFIG_FB_KYRO is not set +CONFIG_FB_3DFX=m +# CONFIG_FB_3DFX_ACCEL is not set +CONFIG_FB_3DFX_I2C=y +# CONFIG_FB_VOODOO1 is not set +CONFIG_FB_VT8623=m +# CONFIG_FB_TRIDENT is not set +CONFIG_FB_ARK=m +CONFIG_FB_PM3=m +# CONFIG_FB_CARMINE is not set +CONFIG_FB_SMSCUFX=m +CONFIG_FB_UDL=m +# CONFIG_FB_IBM_GXT4500 is not set +# CONFIG_FB_XILINX is not set +# CONFIG_FB_VIRTUAL is not set +CONFIG_XEN_FBDEV_FRONTEND=y +# CONFIG_FB_METRONOME is not set +CONFIG_FB_MB862XX=m +CONFIG_FB_MB862XX_PCI_GDC=y +CONFIG_FB_MB862XX_I2C=y +CONFIG_FB_HYPERV=m +CONFIG_FB_SIMPLE=y +# CONFIG_FB_SSD1307 is not set +# CONFIG_FB_SM712 is not set +CONFIG_FB_CORE=y +CONFIG_FB_NOTIFY=y +CONFIG_FIRMWARE_EDID=y +CONFIG_FB_DEVICE=y +CONFIG_FB_DDC=m +CONFIG_FB_CFB_FILLRECT=y +CONFIG_FB_CFB_COPYAREA=y +CONFIG_FB_CFB_IMAGEBLIT=y +CONFIG_FB_SYS_FILLRECT=y +CONFIG_FB_SYS_COPYAREA=y +CONFIG_FB_SYS_IMAGEBLIT=y +# CONFIG_FB_FOREIGN_ENDIAN is not set +CONFIG_FB_SYS_FOPS=y +CONFIG_FB_DEFERRED_IO=y +CONFIG_FB_DMAMEM_HELPERS=y +CONFIG_FB_IOMEM_FOPS=y +CONFIG_FB_IOMEM_HELPERS=y +CONFIG_FB_SYSMEM_HELPERS=y +CONFIG_FB_SYSMEM_HELPERS_DEFERRED=y +CONFIG_FB_MODE_HELPERS=y +CONFIG_FB_TILEBLITTING=y +# end of Frame buffer Devices + +# +# Backlight & LCD device support +# +# CONFIG_LCD_CLASS_DEVICE is not set +CONFIG_BACKLIGHT_CLASS_DEVICE=y +# CONFIG_BACKLIGHT_KTD253 is not set +# CONFIG_BACKLIGHT_KTZ8866 is not set +CONFIG_BACKLIGHT_PWM=m +# CONFIG_BACKLIGHT_QCOM_WLED is not set +# CONFIG_BACKLIGHT_ADP8860 is not set +# CONFIG_BACKLIGHT_ADP8870 is not set +# CONFIG_BACKLIGHT_LM3630A is not set +# CONFIG_BACKLIGHT_LM3639 is not set +CONFIG_BACKLIGHT_LP855X=m +# CONFIG_BACKLIGHT_GPIO is not set +# CONFIG_BACKLIGHT_LV5207LP is not set +# CONFIG_BACKLIGHT_BD6107 is not set +# CONFIG_BACKLIGHT_ARCXCNN is not set +# CONFIG_BACKLIGHT_LED is not set +# end of Backlight & LCD device support + +CONFIG_VGASTATE=m +CONFIG_VIDEOMODE_HELPERS=y +CONFIG_HDMI=y + +# +# Console display driver support +# +CONFIG_DUMMY_CONSOLE=y +CONFIG_DUMMY_CONSOLE_COLUMNS=80 +CONFIG_DUMMY_CONSOLE_ROWS=25 +CONFIG_FRAMEBUFFER_CONSOLE=y +# CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION is not set +CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y +CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y +# CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER is not set +# end of Console display driver support + +# CONFIG_LOGO is not set +# end of Graphics support + +# CONFIG_DRM_ACCEL is not set +CONFIG_SOUND=m +CONFIG_SOUND_OSS_CORE=y +# CONFIG_SOUND_OSS_CORE_PRECLAIM is not set +CONFIG_SND=m +CONFIG_SND_TIMER=m +CONFIG_SND_PCM=m +CONFIG_SND_PCM_ELD=y +CONFIG_SND_PCM_IEC958=y +CONFIG_SND_DMAENGINE_PCM=m +CONFIG_SND_HWDEP=m +CONFIG_SND_SEQ_DEVICE=m +CONFIG_SND_RAWMIDI=m +CONFIG_SND_JACK=y +CONFIG_SND_JACK_INPUT_DEV=y +CONFIG_SND_OSSEMUL=y +CONFIG_SND_MIXER_OSS=m +CONFIG_SND_PCM_OSS=m +CONFIG_SND_PCM_OSS_PLUGINS=y +CONFIG_SND_PCM_TIMER=y +CONFIG_SND_HRTIMER=m +CONFIG_SND_DYNAMIC_MINORS=y +CONFIG_SND_MAX_CARDS=32 +CONFIG_SND_SUPPORT_OLD_API=y +CONFIG_SND_PROC_FS=y +CONFIG_SND_VERBOSE_PROCFS=y +# CONFIG_SND_VERBOSE_PRINTK is not set +CONFIG_SND_CTL_FAST_LOOKUP=y +# CONFIG_SND_DEBUG is not set +# CONFIG_SND_CTL_INPUT_VALIDATION is not set +CONFIG_SND_VMASTER=y +CONFIG_SND_CTL_LED=m +CONFIG_SND_SEQUENCER=m +CONFIG_SND_SEQ_DUMMY=m +# CONFIG_SND_SEQUENCER_OSS is not set +CONFIG_SND_SEQ_HRTIMER_DEFAULT=y +CONFIG_SND_SEQ_MIDI_EVENT=m +CONFIG_SND_SEQ_MIDI=m +CONFIG_SND_SEQ_MIDI_EMUL=m +# CONFIG_SND_SEQ_UMP is not set +CONFIG_SND_MPU401_UART=m +CONFIG_SND_OPL3_LIB=m +CONFIG_SND_OPL3_LIB_SEQ=m +CONFIG_SND_AC97_CODEC=m +CONFIG_SND_DRIVERS=y +# CONFIG_SND_DUMMY is not set +CONFIG_SND_ALOOP=m +# CONFIG_SND_PCMTEST is not set +# CONFIG_SND_VIRMIDI is not set +# CONFIG_SND_MTPAV is not set +CONFIG_SND_MTS64=m +# CONFIG_SND_SERIAL_U16550 is not set +# CONFIG_SND_SERIAL_GENERIC is not set +# CONFIG_SND_MPU401 is not set +CONFIG_SND_PORTMAN2X4=m +CONFIG_SND_AC97_POWER_SAVE=y +CONFIG_SND_AC97_POWER_SAVE_DEFAULT=0 +CONFIG_SND_PCI=y +CONFIG_SND_AD1889=m +# CONFIG_SND_ALS300 is not set +# CONFIG_SND_ALI5451 is not set +# CONFIG_SND_ATIIXP is not set +# CONFIG_SND_ATIIXP_MODEM is not set +# CONFIG_SND_AU8810 is not set +# CONFIG_SND_AU8820 is not set +# CONFIG_SND_AU8830 is not set +# CONFIG_SND_AW2 is not set +# CONFIG_SND_AZT3328 is not set +# CONFIG_SND_BT87X is not set +# CONFIG_SND_CA0106 is not set +# CONFIG_SND_CMIPCI is not set +CONFIG_SND_OXYGEN_LIB=m +CONFIG_SND_OXYGEN=m +# CONFIG_SND_CS4281 is not set +# CONFIG_SND_CS46XX is not set +CONFIG_SND_CTXFI=m +CONFIG_SND_DARLA20=m +CONFIG_SND_GINA20=m +CONFIG_SND_LAYLA20=m +CONFIG_SND_DARLA24=m +CONFIG_SND_GINA24=m +CONFIG_SND_LAYLA24=m +CONFIG_SND_MONA=m +CONFIG_SND_MIA=m +CONFIG_SND_ECHO3G=m +CONFIG_SND_INDIGO=m +CONFIG_SND_INDIGOIO=m +CONFIG_SND_INDIGODJ=m +CONFIG_SND_INDIGOIOX=m +CONFIG_SND_INDIGODJX=m +# CONFIG_SND_EMU10K1 is not set +# CONFIG_SND_EMU10K1X is not set +# CONFIG_SND_ENS1370 is not set +# CONFIG_SND_ENS1371 is not set +# CONFIG_SND_ES1938 is not set +# CONFIG_SND_ES1968 is not set +# CONFIG_SND_FM801 is not set +# CONFIG_SND_HDSP is not set +CONFIG_SND_HDSPM=m +# CONFIG_SND_ICE1712 is not set +# CONFIG_SND_ICE1724 is not set +# CONFIG_SND_INTEL8X0 is not set +# CONFIG_SND_INTEL8X0M is not set +# CONFIG_SND_KORG1212 is not set +CONFIG_SND_LOLA=m +CONFIG_SND_LX6464ES=m +# CONFIG_SND_MAESTRO3 is not set +# CONFIG_SND_MIXART is not set +# CONFIG_SND_NM256 is not set +CONFIG_SND_PCXHR=m +CONFIG_SND_RIPTIDE=m +# CONFIG_SND_RME32 is not set +# CONFIG_SND_RME96 is not set +# CONFIG_SND_RME9652 is not set +# CONFIG_SND_SONICVIBES is not set +# CONFIG_SND_TRIDENT is not set +# CONFIG_SND_VIA82XX is not set +# CONFIG_SND_VIA82XX_MODEM is not set +CONFIG_SND_VIRTUOSO=m +# CONFIG_SND_VX222 is not set +# CONFIG_SND_YMFPCI is not set + +# +# HD-Audio +# +CONFIG_SND_HDA=m +CONFIG_SND_HDA_GENERIC_LEDS=y +CONFIG_SND_HDA_INTEL=m +CONFIG_SND_HDA_TEGRA=m +CONFIG_SND_HDA_HWDEP=y +CONFIG_SND_HDA_RECONFIG=y +CONFIG_SND_HDA_INPUT_BEEP=y +CONFIG_SND_HDA_INPUT_BEEP_MODE=1 +CONFIG_SND_HDA_PATCH_LOADER=y +# CONFIG_SND_HDA_SCODEC_CS35L41_I2C is not set +# CONFIG_SND_HDA_SCODEC_CS35L41_SPI is not set +# CONFIG_SND_HDA_SCODEC_CS35L56_I2C is not set +# CONFIG_SND_HDA_SCODEC_CS35L56_SPI is not set +# CONFIG_SND_HDA_SCODEC_TAS2781_I2C is not set +CONFIG_SND_HDA_CODEC_REALTEK=m +CONFIG_SND_HDA_CODEC_ANALOG=m +CONFIG_SND_HDA_CODEC_SIGMATEL=m +CONFIG_SND_HDA_CODEC_VIA=m +CONFIG_SND_HDA_CODEC_HDMI=m +CONFIG_SND_HDA_CODEC_CIRRUS=m +# CONFIG_SND_HDA_CODEC_CS8409 is not set +CONFIG_SND_HDA_CODEC_CONEXANT=m +CONFIG_SND_HDA_CODEC_CA0110=m +CONFIG_SND_HDA_CODEC_CA0132=m +CONFIG_SND_HDA_CODEC_CA0132_DSP=y +CONFIG_SND_HDA_CODEC_CMEDIA=m +CONFIG_SND_HDA_CODEC_SI3054=m +CONFIG_SND_HDA_GENERIC=m +CONFIG_SND_HDA_POWER_SAVE_DEFAULT=1 +# CONFIG_SND_HDA_INTEL_HDMI_SILENT_STREAM is not set +# CONFIG_SND_HDA_CTL_DEV_ID is not set +# end of HD-Audio + +CONFIG_SND_HDA_CORE=m +CONFIG_SND_HDA_DSP_LOADER=y +CONFIG_SND_HDA_ALIGNED_MMIO=y +CONFIG_SND_HDA_COMPONENT=y +CONFIG_SND_HDA_PREALLOC_SIZE=2048 +CONFIG_SND_INTEL_NHLT=y +CONFIG_SND_INTEL_DSP_CONFIG=m +CONFIG_SND_INTEL_SOUNDWIRE_ACPI=m +CONFIG_SND_SPI=y +CONFIG_SND_USB=y +CONFIG_SND_USB_AUDIO=m +# CONFIG_SND_USB_AUDIO_MIDI_V2 is not set +CONFIG_SND_USB_AUDIO_USE_MEDIA_CONTROLLER=y +CONFIG_SND_USB_UA101=m +CONFIG_SND_USB_CAIAQ=m +CONFIG_SND_USB_CAIAQ_INPUT=y +CONFIG_SND_USB_6FIRE=m +CONFIG_SND_USB_HIFACE=m +CONFIG_SND_BCD2000=m +CONFIG_SND_USB_LINE6=m +CONFIG_SND_USB_POD=m +CONFIG_SND_USB_PODHD=m +CONFIG_SND_USB_TONEPORT=m +CONFIG_SND_USB_VARIAX=m +CONFIG_SND_FIREWIRE=y +CONFIG_SND_FIREWIRE_LIB=m +CONFIG_SND_DICE=m +CONFIG_SND_OXFW=m +CONFIG_SND_ISIGHT=m +CONFIG_SND_FIREWORKS=m +CONFIG_SND_BEBOB=m +CONFIG_SND_FIREWIRE_DIGI00X=m +CONFIG_SND_FIREWIRE_TASCAM=m +CONFIG_SND_FIREWIRE_MOTU=m +CONFIG_SND_FIREFACE=m +CONFIG_SND_SOC=m +CONFIG_SND_SOC_GENERIC_DMAENGINE_PCM=y +# CONFIG_SND_SOC_ADI is not set +# CONFIG_SND_SOC_AMD_ACP is not set +# CONFIG_SND_AMD_ACP_CONFIG is not set +# CONFIG_SND_ATMEL_SOC is not set +# CONFIG_SND_BCM63XX_I2S_WHISTLER is not set +# CONFIG_SND_DESIGNWARE_I2S is not set + +# +# SoC Audio for Freescale CPUs +# + +# +# Common SoC Audio options for Freescale CPUs: +# +# CONFIG_SND_SOC_FSL_ASRC is not set +# CONFIG_SND_SOC_FSL_SAI is not set +# CONFIG_SND_SOC_FSL_AUDMIX is not set +# CONFIG_SND_SOC_FSL_SSI is not set +# CONFIG_SND_SOC_FSL_SPDIF is not set +# CONFIG_SND_SOC_FSL_ESAI is not set +# CONFIG_SND_SOC_FSL_MICFIL is not set +# CONFIG_SND_SOC_FSL_XCVR is not set +# CONFIG_SND_SOC_FSL_RPMSG is not set +# CONFIG_SND_SOC_IMX_AUDMUX is not set +# end of SoC Audio for Freescale CPUs + +# CONFIG_SND_SOC_CHV3_I2S is not set +CONFIG_SND_I2S_HI6210_I2S=m +# CONFIG_SND_KIRKWOOD_SOC is not set +# CONFIG_SND_SOC_IMG is not set +# CONFIG_SND_SOC_MTK_BTCVSD is not set + +# +# ASoC support for Amlogic platforms +# +# CONFIG_SND_MESON_AIU is not set +# CONFIG_SND_MESON_AXG_FRDDR is not set +# CONFIG_SND_MESON_AXG_TODDR is not set +# CONFIG_SND_MESON_AXG_TDMIN is not set +# CONFIG_SND_MESON_AXG_TDMOUT is not set +# CONFIG_SND_MESON_AXG_SOUND_CARD is not set +# CONFIG_SND_MESON_AXG_SPDIFOUT is not set +# CONFIG_SND_MESON_AXG_SPDIFIN is not set +# CONFIG_SND_MESON_AXG_PDM is not set +# CONFIG_SND_MESON_GX_SOUND_CARD is not set +# CONFIG_SND_MESON_G12A_TOACODEC is not set +# CONFIG_SND_MESON_G12A_TOHDMITX is not set +# CONFIG_SND_SOC_MESON_T9015 is not set +# end of ASoC support for Amlogic platforms + +CONFIG_SND_SOC_QCOM=m +CONFIG_SND_SOC_LPASS_CPU=m +CONFIG_SND_SOC_LPASS_PLATFORM=m +CONFIG_SND_SOC_LPASS_APQ8016=m +# CONFIG_SND_SOC_STORM is not set +CONFIG_SND_SOC_APQ8016_SBC=m +CONFIG_SND_SOC_QCOM_COMMON=m +# CONFIG_SND_SOC_SC7180 is not set +CONFIG_SND_SOC_ROCKCHIP=m +CONFIG_SND_SOC_ROCKCHIP_I2S=m +# CONFIG_SND_SOC_ROCKCHIP_I2S_TDM is not set +# CONFIG_SND_SOC_ROCKCHIP_PDM is not set +CONFIG_SND_SOC_ROCKCHIP_SPDIF=m +# CONFIG_SND_SOC_ROCKCHIP_MAX98090 is not set +CONFIG_SND_SOC_ROCKCHIP_RT5645=m +# CONFIG_SND_SOC_RK3288_HDMI_ANALOG is not set +CONFIG_SND_SOC_RK3399_GRU_SOUND=m +# CONFIG_SND_SOC_SOF_TOPLEVEL is not set + +# +# STMicroelectronics STM32 SOC audio support +# +# end of STMicroelectronics STM32 SOC audio support + +# +# Allwinner SoC Audio support +# +# CONFIG_SND_SUN4I_CODEC is not set +CONFIG_SND_SUN8I_CODEC=m +# CONFIG_SND_SUN8I_CODEC_ANALOG is not set +CONFIG_SND_SUN50I_CODEC_ANALOG=m +CONFIG_SND_SUN4I_I2S=m +# CONFIG_SND_SUN4I_SPDIF is not set +# CONFIG_SND_SUN50I_DMIC is not set +CONFIG_SND_SUN8I_ADDA_PR_REGMAP=m +# end of Allwinner SoC Audio support + +CONFIG_SND_SOC_TEGRA=m +# CONFIG_SND_SOC_TEGRA20_AC97 is not set +# CONFIG_SND_SOC_TEGRA20_DAS is not set +# CONFIG_SND_SOC_TEGRA20_I2S is not set +CONFIG_SND_SOC_TEGRA20_SPDIF=m +# CONFIG_SND_SOC_TEGRA30_AHUB is not set +# CONFIG_SND_SOC_TEGRA30_I2S is not set +# CONFIG_SND_SOC_TEGRA210_AHUB is not set +# CONFIG_SND_SOC_TEGRA210_DMIC is not set +# CONFIG_SND_SOC_TEGRA210_I2S is not set +# CONFIG_SND_SOC_TEGRA210_OPE is not set +# CONFIG_SND_SOC_TEGRA186_ASRC is not set +# CONFIG_SND_SOC_TEGRA186_DSPK is not set +# CONFIG_SND_SOC_TEGRA210_ADMAIF is not set +# CONFIG_SND_SOC_TEGRA210_MVC is not set +# CONFIG_SND_SOC_TEGRA210_SFC is not set +# CONFIG_SND_SOC_TEGRA210_AMX is not set +# CONFIG_SND_SOC_TEGRA210_ADX is not set +# CONFIG_SND_SOC_TEGRA210_MIXER is not set +# CONFIG_SND_SOC_TEGRA_AUDIO_GRAPH_CARD is not set +CONFIG_SND_SOC_TEGRA_MACHINE_DRV=m +# CONFIG_SND_SOC_TEGRA_RT5631 is not set +CONFIG_SND_SOC_TEGRA_RT5640=m +CONFIG_SND_SOC_TEGRA_WM8753=m +CONFIG_SND_SOC_TEGRA_WM8903=m +# CONFIG_SND_SOC_TEGRA_WM9712 is not set +CONFIG_SND_SOC_TEGRA_TRIMSLICE=m +CONFIG_SND_SOC_TEGRA_ALC5632=m +CONFIG_SND_SOC_TEGRA_MAX98090=m +# CONFIG_SND_SOC_TEGRA_MAX98088 is not set +CONFIG_SND_SOC_TEGRA_RT5677=m +# CONFIG_SND_SOC_TEGRA_SGTL5000 is not set +# CONFIG_SND_SOC_XILINX_I2S is not set +# CONFIG_SND_SOC_XILINX_AUDIO_FORMATTER is not set +# CONFIG_SND_SOC_XILINX_SPDIF is not set +# CONFIG_SND_SOC_XTFPGA_I2S is not set +CONFIG_SND_SOC_I2C_AND_SPI=m + +# +# CODEC drivers +# +# CONFIG_SND_SOC_AC97_CODEC is not set +# CONFIG_SND_SOC_ADAU1372_I2C is not set +# CONFIG_SND_SOC_ADAU1372_SPI is not set +# CONFIG_SND_SOC_ADAU1701 is not set +# CONFIG_SND_SOC_ADAU1761_I2C is not set +# CONFIG_SND_SOC_ADAU1761_SPI is not set +# CONFIG_SND_SOC_ADAU7002 is not set +# CONFIG_SND_SOC_ADAU7118_HW is not set +# CONFIG_SND_SOC_ADAU7118_I2C is not set +# CONFIG_SND_SOC_AK4104 is not set +# CONFIG_SND_SOC_AK4118 is not set +# CONFIG_SND_SOC_AK4375 is not set +# CONFIG_SND_SOC_AK4458 is not set +# CONFIG_SND_SOC_AK4554 is not set +# CONFIG_SND_SOC_AK4613 is not set +# CONFIG_SND_SOC_AK4642 is not set +# CONFIG_SND_SOC_AK5386 is not set +# CONFIG_SND_SOC_AK5558 is not set +# CONFIG_SND_SOC_ALC5623 is not set +CONFIG_SND_SOC_ALC5632=m +# CONFIG_SND_SOC_AUDIO_IIO_AUX is not set +# CONFIG_SND_SOC_AW8738 is not set +# CONFIG_SND_SOC_AW88395 is not set +# CONFIG_SND_SOC_AW88261 is not set +# CONFIG_SND_SOC_BD28623 is not set +# CONFIG_SND_SOC_BT_SCO is not set +# CONFIG_SND_SOC_CHV3_CODEC is not set +# CONFIG_SND_SOC_CROS_EC_CODEC is not set +# CONFIG_SND_SOC_CS35L32 is not set +# CONFIG_SND_SOC_CS35L33 is not set +# CONFIG_SND_SOC_CS35L34 is not set +# CONFIG_SND_SOC_CS35L35 is not set +# CONFIG_SND_SOC_CS35L36 is not set +# CONFIG_SND_SOC_CS35L41_SPI is not set +# CONFIG_SND_SOC_CS35L41_I2C is not set +# CONFIG_SND_SOC_CS35L45_SPI is not set +# CONFIG_SND_SOC_CS35L45_I2C is not set +# CONFIG_SND_SOC_CS35L56_I2C is not set +# CONFIG_SND_SOC_CS35L56_SPI is not set +# CONFIG_SND_SOC_CS42L42 is not set +# CONFIG_SND_SOC_CS42L51_I2C is not set +# CONFIG_SND_SOC_CS42L52 is not set +# CONFIG_SND_SOC_CS42L56 is not set +# CONFIG_SND_SOC_CS42L73 is not set +# CONFIG_SND_SOC_CS42L83 is not set +# CONFIG_SND_SOC_CS4234 is not set +# CONFIG_SND_SOC_CS4265 is not set +# CONFIG_SND_SOC_CS4270 is not set +# CONFIG_SND_SOC_CS4271_I2C is not set +# CONFIG_SND_SOC_CS4271_SPI is not set +# CONFIG_SND_SOC_CS42XX8_I2C is not set +# CONFIG_SND_SOC_CS43130 is not set +# CONFIG_SND_SOC_CS4341 is not set +# CONFIG_SND_SOC_CS4349 is not set +# CONFIG_SND_SOC_CS53L30 is not set +# CONFIG_SND_SOC_CX2072X is not set +# CONFIG_SND_SOC_DA7213 is not set +CONFIG_SND_SOC_DA7219=m +CONFIG_SND_SOC_DMIC=m +CONFIG_SND_SOC_HDMI_CODEC=m +# CONFIG_SND_SOC_ES7134 is not set +# CONFIG_SND_SOC_ES7241 is not set +# CONFIG_SND_SOC_ES8316 is not set +# CONFIG_SND_SOC_ES8326 is not set +# CONFIG_SND_SOC_ES8328_I2C is not set +# CONFIG_SND_SOC_ES8328_SPI is not set +# CONFIG_SND_SOC_GTM601 is not set +# CONFIG_SND_SOC_HDA is not set +# CONFIG_SND_SOC_ICS43432 is not set +# CONFIG_SND_SOC_IDT821034 is not set +# CONFIG_SND_SOC_INNO_RK3036 is not set +# CONFIG_SND_SOC_MAX98088 is not set +CONFIG_SND_SOC_MAX98090=m +CONFIG_SND_SOC_MAX98357A=m +# CONFIG_SND_SOC_MAX98504 is not set +# CONFIG_SND_SOC_MAX9867 is not set +# CONFIG_SND_SOC_MAX98927 is not set +# CONFIG_SND_SOC_MAX98520 is not set +# CONFIG_SND_SOC_MAX98373_I2C is not set +# CONFIG_SND_SOC_MAX98388 is not set +# CONFIG_SND_SOC_MAX98390 is not set +# CONFIG_SND_SOC_MAX98396 is not set +# CONFIG_SND_SOC_MAX9860 is not set +# CONFIG_SND_SOC_MSM8916_WCD_ANALOG is not set +# CONFIG_SND_SOC_MSM8916_WCD_DIGITAL is not set +# CONFIG_SND_SOC_PCM1681 is not set +# CONFIG_SND_SOC_PCM1789_I2C is not set +# CONFIG_SND_SOC_PCM179X_I2C is not set +# CONFIG_SND_SOC_PCM179X_SPI is not set +# CONFIG_SND_SOC_PCM186X_I2C is not set +# CONFIG_SND_SOC_PCM186X_SPI is not set +# CONFIG_SND_SOC_PCM3060_I2C is not set +# CONFIG_SND_SOC_PCM3060_SPI is not set +# CONFIG_SND_SOC_PCM3168A_I2C is not set +# CONFIG_SND_SOC_PCM3168A_SPI is not set +# CONFIG_SND_SOC_PCM5102A is not set +# CONFIG_SND_SOC_PCM512x_I2C is not set +# CONFIG_SND_SOC_PCM512x_SPI is not set +# CONFIG_SND_SOC_PEB2466 is not set +# CONFIG_SND_SOC_RK3328 is not set +CONFIG_SND_SOC_RL6231=m +CONFIG_SND_SOC_RT5514=m +CONFIG_SND_SOC_RT5514_SPI=m +# CONFIG_SND_SOC_RT5616 is not set +# CONFIG_SND_SOC_RT5631 is not set +CONFIG_SND_SOC_RT5640=m +CONFIG_SND_SOC_RT5645=m +# CONFIG_SND_SOC_RT5659 is not set +CONFIG_SND_SOC_RT5677=m +CONFIG_SND_SOC_RT5677_SPI=m +# CONFIG_SND_SOC_RT9120 is not set +# CONFIG_SND_SOC_SGTL5000 is not set +CONFIG_SND_SOC_SIMPLE_AMPLIFIER=m +# CONFIG_SND_SOC_SIMPLE_MUX is not set +# CONFIG_SND_SOC_SMA1303 is not set +# CONFIG_SND_SOC_SPDIF is not set +# CONFIG_SND_SOC_SRC4XXX_I2C is not set +# CONFIG_SND_SOC_SSM2305 is not set +# CONFIG_SND_SOC_SSM2518 is not set +# CONFIG_SND_SOC_SSM2602_SPI is not set +# CONFIG_SND_SOC_SSM2602_I2C is not set +# CONFIG_SND_SOC_SSM3515 is not set +# CONFIG_SND_SOC_SSM4567 is not set +# CONFIG_SND_SOC_STA32X is not set +# CONFIG_SND_SOC_STA350 is not set +# CONFIG_SND_SOC_STI_SAS is not set +# CONFIG_SND_SOC_TAS2552 is not set +# CONFIG_SND_SOC_TAS2562 is not set +# CONFIG_SND_SOC_TAS2764 is not set +# CONFIG_SND_SOC_TAS2770 is not set +# CONFIG_SND_SOC_TAS2780 is not set +# CONFIG_SND_SOC_TAS2781_I2C is not set +# CONFIG_SND_SOC_TAS5086 is not set +# CONFIG_SND_SOC_TAS571X is not set +# CONFIG_SND_SOC_TAS5720 is not set +# CONFIG_SND_SOC_TAS5805M is not set +# CONFIG_SND_SOC_TAS6424 is not set +# CONFIG_SND_SOC_TDA7419 is not set +# CONFIG_SND_SOC_TFA9879 is not set +# CONFIG_SND_SOC_TFA989X is not set +# CONFIG_SND_SOC_TLV320ADC3XXX is not set +CONFIG_SND_SOC_TLV320AIC23=m +CONFIG_SND_SOC_TLV320AIC23_I2C=m +# CONFIG_SND_SOC_TLV320AIC23_SPI is not set +# CONFIG_SND_SOC_TLV320AIC31XX is not set +# CONFIG_SND_SOC_TLV320AIC32X4_I2C is not set +# CONFIG_SND_SOC_TLV320AIC32X4_SPI is not set +# CONFIG_SND_SOC_TLV320AIC3X_I2C is not set +# CONFIG_SND_SOC_TLV320AIC3X_SPI is not set +# CONFIG_SND_SOC_TLV320ADCX140 is not set +# CONFIG_SND_SOC_TS3A227E is not set +# CONFIG_SND_SOC_TSCS42XX is not set +# CONFIG_SND_SOC_TSCS454 is not set +# CONFIG_SND_SOC_UDA1334 is not set +# CONFIG_SND_SOC_WM8510 is not set +# CONFIG_SND_SOC_WM8523 is not set +# CONFIG_SND_SOC_WM8524 is not set +# CONFIG_SND_SOC_WM8580 is not set +# CONFIG_SND_SOC_WM8711 is not set +# CONFIG_SND_SOC_WM8728 is not set +# CONFIG_SND_SOC_WM8731_I2C is not set +# CONFIG_SND_SOC_WM8731_SPI is not set +# CONFIG_SND_SOC_WM8737 is not set +# CONFIG_SND_SOC_WM8741 is not set +# CONFIG_SND_SOC_WM8750 is not set +CONFIG_SND_SOC_WM8753=m +# CONFIG_SND_SOC_WM8770 is not set +# CONFIG_SND_SOC_WM8776 is not set +# CONFIG_SND_SOC_WM8782 is not set +# CONFIG_SND_SOC_WM8804_I2C is not set +# CONFIG_SND_SOC_WM8804_SPI is not set +CONFIG_SND_SOC_WM8903=m +# CONFIG_SND_SOC_WM8904 is not set +# CONFIG_SND_SOC_WM8940 is not set +# CONFIG_SND_SOC_WM8960 is not set +# CONFIG_SND_SOC_WM8961 is not set +# CONFIG_SND_SOC_WM8962 is not set +# CONFIG_SND_SOC_WM8974 is not set +# CONFIG_SND_SOC_WM8978 is not set +# CONFIG_SND_SOC_WM8985 is not set +# CONFIG_SND_SOC_ZL38060 is not set +# CONFIG_SND_SOC_MAX9759 is not set +# CONFIG_SND_SOC_MT6351 is not set +# CONFIG_SND_SOC_MT6358 is not set +# CONFIG_SND_SOC_MT6660 is not set +# CONFIG_SND_SOC_NAU8315 is not set +# CONFIG_SND_SOC_NAU8540 is not set +# CONFIG_SND_SOC_NAU8810 is not set +# CONFIG_SND_SOC_NAU8821 is not set +# CONFIG_SND_SOC_NAU8822 is not set +# CONFIG_SND_SOC_NAU8824 is not set +# CONFIG_SND_SOC_TPA6130A2 is not set +# CONFIG_SND_SOC_LPASS_WSA_MACRO is not set +# CONFIG_SND_SOC_LPASS_VA_MACRO is not set +# CONFIG_SND_SOC_LPASS_RX_MACRO is not set +# CONFIG_SND_SOC_LPASS_TX_MACRO is not set +# end of CODEC drivers + +CONFIG_SND_SIMPLE_CARD_UTILS=m +CONFIG_SND_SIMPLE_CARD=m +CONFIG_SND_AUDIO_GRAPH_CARD=m +# CONFIG_SND_AUDIO_GRAPH_CARD2 is not set +# CONFIG_SND_TEST_COMPONENT is not set +CONFIG_SND_XEN_FRONTEND=m +# CONFIG_SND_VIRTIO is not set +CONFIG_AC97_BUS=m +CONFIG_HID_SUPPORT=y +CONFIG_HID=m +CONFIG_HID_BATTERY_STRENGTH=y +CONFIG_HIDRAW=y +CONFIG_UHID=m +CONFIG_HID_GENERIC=m + +# +# Special HID drivers +# +CONFIG_HID_A4TECH=m +CONFIG_HID_ACCUTOUCH=m +CONFIG_HID_ACRUX=m +CONFIG_HID_ACRUX_FF=y +CONFIG_HID_APPLE=m +# CONFIG_HID_APPLEIR is not set +CONFIG_HID_ASUS=m +CONFIG_HID_AUREAL=m +CONFIG_HID_BELKIN=m +CONFIG_HID_BETOP_FF=m +CONFIG_HID_BIGBEN_FF=m +CONFIG_HID_CHERRY=m +CONFIG_HID_CHICONY=m +CONFIG_HID_CORSAIR=m +CONFIG_HID_COUGAR=m +CONFIG_HID_MACALLY=m +CONFIG_HID_PRODIKEYS=m +CONFIG_HID_CMEDIA=m +CONFIG_HID_CP2112=m +# CONFIG_HID_CREATIVE_SB0540 is not set +CONFIG_HID_CYPRESS=m +CONFIG_HID_DRAGONRISE=m +CONFIG_DRAGONRISE_FF=y +CONFIG_HID_EMS_FF=m +CONFIG_HID_ELAN=m +CONFIG_HID_ELECOM=m +CONFIG_HID_ELO=m +# CONFIG_HID_EVISION is not set +CONFIG_HID_EZKEY=m +# CONFIG_HID_FT260 is not set +CONFIG_HID_GEMBIRD=m +CONFIG_HID_GFRM=m +# CONFIG_HID_GLORIOUS is not set +CONFIG_HID_HOLTEK=m +CONFIG_HOLTEK_FF=y +# CONFIG_HID_GOOGLE_HAMMER is not set +# CONFIG_HID_GOOGLE_STADIA_FF is not set +# CONFIG_HID_VIVALDI is not set +CONFIG_HID_GT683R=m +CONFIG_HID_KEYTOUCH=m +CONFIG_HID_KYE=m +CONFIG_HID_UCLOGIC=m +CONFIG_HID_WALTOP=m +CONFIG_HID_VIEWSONIC=m +# CONFIG_HID_VRC2 is not set +# CONFIG_HID_XIAOMI is not set +CONFIG_HID_GYRATION=m +CONFIG_HID_ICADE=m +CONFIG_HID_ITE=m +CONFIG_HID_JABRA=m +CONFIG_HID_TWINHAN=m +CONFIG_HID_KENSINGTON=m +CONFIG_HID_LCPOWER=m +CONFIG_HID_LED=m +CONFIG_HID_LENOVO=m +# CONFIG_HID_LETSKETCH is not set +CONFIG_HID_LOGITECH=m +CONFIG_HID_LOGITECH_DJ=m +CONFIG_HID_LOGITECH_HIDPP=m +CONFIG_LOGITECH_FF=y +CONFIG_LOGIRUMBLEPAD2_FF=y +CONFIG_LOGIG940_FF=y +CONFIG_LOGIWHEELS_FF=y +CONFIG_HID_MAGICMOUSE=m +CONFIG_HID_MALTRON=m +CONFIG_HID_MAYFLASH=m +# CONFIG_HID_MEGAWORLD_FF is not set +CONFIG_HID_REDRAGON=m +CONFIG_HID_MICROSOFT=m +CONFIG_HID_MONTEREY=m +CONFIG_HID_MULTITOUCH=m +# CONFIG_HID_NINTENDO is not set +CONFIG_HID_NTI=m +CONFIG_HID_NTRIG=m +# CONFIG_HID_NVIDIA_SHIELD is not set +CONFIG_HID_ORTEK=m +CONFIG_HID_PANTHERLORD=m +CONFIG_PANTHERLORD_FF=y +CONFIG_HID_PENMOUNT=m +CONFIG_HID_PETALYNX=m +CONFIG_HID_PICOLCD=m +CONFIG_HID_PICOLCD_FB=y +CONFIG_HID_PICOLCD_BACKLIGHT=y +CONFIG_HID_PICOLCD_LEDS=y +CONFIG_HID_PICOLCD_CIR=y +CONFIG_HID_PLANTRONICS=m +# CONFIG_HID_PXRC is not set +# CONFIG_HID_RAZER is not set +CONFIG_HID_PRIMAX=m +CONFIG_HID_RETRODE=m +CONFIG_HID_ROCCAT=m +CONFIG_HID_SAITEK=m +CONFIG_HID_SAMSUNG=m +# CONFIG_HID_SEMITEK is not set +# CONFIG_HID_SIGMAMICRO is not set +CONFIG_HID_SONY=m +CONFIG_SONY_FF=y +CONFIG_HID_SPEEDLINK=m +CONFIG_HID_STEAM=m +# CONFIG_STEAM_FF is not set +CONFIG_HID_STEELSERIES=m +CONFIG_HID_SUNPLUS=m +CONFIG_HID_RMI=m +CONFIG_HID_GREENASIA=m +CONFIG_GREENASIA_FF=y +# CONFIG_HID_HYPERV_MOUSE is not set +CONFIG_HID_SMARTJOYPLUS=m +CONFIG_SMARTJOYPLUS_FF=y +CONFIG_HID_TIVO=m +CONFIG_HID_TOPSEED=m +# CONFIG_HID_TOPRE is not set +CONFIG_HID_THINGM=m +CONFIG_HID_THRUSTMASTER=m +CONFIG_THRUSTMASTER_FF=y +CONFIG_HID_UDRAW_PS3=m +CONFIG_HID_U2FZERO=m +CONFIG_HID_WACOM=m +CONFIG_HID_WIIMOTE=m +CONFIG_HID_XINMO=m +CONFIG_HID_ZEROPLUS=m +CONFIG_ZEROPLUS_FF=y +CONFIG_HID_ZYDACRON=m +CONFIG_HID_SENSOR_HUB=m +CONFIG_HID_SENSOR_CUSTOM_SENSOR=m +CONFIG_HID_ALPS=m +# CONFIG_HID_MCP2200 is not set +# CONFIG_HID_MCP2221 is not set +# end of Special HID drivers + +# +# HID-BPF support +# +# CONFIG_HID_BPF is not set +# end of HID-BPF support + +# +# USB HID support +# +CONFIG_USB_HID=m +CONFIG_HID_PID=y +CONFIG_USB_HIDDEV=y + +# +# USB HID Boot Protocol drivers +# +# CONFIG_USB_KBD is not set +# CONFIG_USB_MOUSE is not set +# end of USB HID Boot Protocol drivers +# end of USB HID support + +CONFIG_I2C_HID=m +# CONFIG_I2C_HID_ACPI is not set +# CONFIG_I2C_HID_OF is not set +# CONFIG_I2C_HID_OF_ELAN is not set +# CONFIG_I2C_HID_OF_GOODIX is not set +CONFIG_USB_OHCI_LITTLE_ENDIAN=y +CONFIG_USB_SUPPORT=y +CONFIG_USB_COMMON=m +CONFIG_USB_LED_TRIG=y +CONFIG_USB_ULPI_BUS=m +CONFIG_USB_CONN_GPIO=m +CONFIG_USB_ARCH_HAS_HCD=y +CONFIG_USB=m +CONFIG_USB_PCI=y +CONFIG_USB_ANNOUNCE_NEW_DEVICES=y + +# +# Miscellaneous USB options +# +CONFIG_USB_DEFAULT_PERSIST=y +# CONFIG_USB_FEW_INIT_RETRIES is not set +CONFIG_USB_DYNAMIC_MINORS=y +# CONFIG_USB_OTG is not set +# CONFIG_USB_OTG_PRODUCTLIST is not set +# CONFIG_USB_OTG_DISABLE_EXTERNAL_HUB is not set +CONFIG_USB_LEDS_TRIGGER_USBPORT=m +CONFIG_USB_AUTOSUSPEND_DELAY=2 +CONFIG_USB_MON=m + +# +# USB Host Controller Drivers +# +# CONFIG_USB_C67X00_HCD is not set +CONFIG_USB_XHCI_HCD=m +# CONFIG_USB_XHCI_DBGCAP is not set +CONFIG_USB_XHCI_PCI=m +CONFIG_USB_XHCI_PCI_RENESAS=m +CONFIG_USB_XHCI_PLATFORM=m +# CONFIG_USB_XHCI_HISTB is not set +# CONFIG_USB_XHCI_MVEBU is not set +CONFIG_USB_XHCI_TEGRA=m +CONFIG_USB_EHCI_HCD=m +CONFIG_USB_EHCI_ROOT_HUB_TT=y +CONFIG_USB_EHCI_TT_NEWSCHED=y +CONFIG_USB_EHCI_PCI=m +# CONFIG_USB_EHCI_FSL is not set +CONFIG_USB_EHCI_HCD_ORION=m +CONFIG_USB_EHCI_TEGRA=m +CONFIG_USB_EHCI_HCD_PLATFORM=m +# CONFIG_USB_OXU210HP_HCD is not set +# CONFIG_USB_ISP116X_HCD is not set +# CONFIG_USB_MAX3421_HCD is not set +CONFIG_USB_OHCI_HCD=m +CONFIG_USB_OHCI_HCD_PCI=m +# CONFIG_USB_OHCI_HCD_SSB is not set +CONFIG_USB_OHCI_HCD_PLATFORM=m +# CONFIG_USB_UHCI_HCD is not set +# CONFIG_USB_SL811_HCD is not set +# CONFIG_USB_R8A66597_HCD is not set +# CONFIG_USB_HCD_BCMA is not set +# CONFIG_USB_HCD_SSB is not set +# CONFIG_USB_HCD_TEST_MODE is not set +# CONFIG_USB_XEN_HCD is not set + +# +# USB Device Class drivers +# +CONFIG_USB_ACM=m +CONFIG_USB_PRINTER=m +CONFIG_USB_WDM=m +CONFIG_USB_TMC=m + +# +# NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may +# + +# +# also be needed; see USB_STORAGE Help for more info +# +CONFIG_USB_STORAGE=m +# CONFIG_USB_STORAGE_DEBUG is not set +CONFIG_USB_STORAGE_REALTEK=m +CONFIG_REALTEK_AUTOPM=y +CONFIG_USB_STORAGE_DATAFAB=m +CONFIG_USB_STORAGE_FREECOM=m +CONFIG_USB_STORAGE_ISD200=m +CONFIG_USB_STORAGE_USBAT=m +CONFIG_USB_STORAGE_SDDR09=m +CONFIG_USB_STORAGE_SDDR55=m +CONFIG_USB_STORAGE_JUMPSHOT=m +CONFIG_USB_STORAGE_ALAUDA=m +CONFIG_USB_STORAGE_ONETOUCH=m +CONFIG_USB_STORAGE_KARMA=m +CONFIG_USB_STORAGE_CYPRESS_ATACB=m +CONFIG_USB_STORAGE_ENE_UB6250=m +CONFIG_USB_UAS=m + +# +# USB Imaging devices +# +CONFIG_USB_MDC800=m +CONFIG_USB_MICROTEK=m +CONFIG_USBIP_CORE=m +CONFIG_USBIP_VHCI_HCD=m +CONFIG_USBIP_VHCI_HC_PORTS=15 +CONFIG_USBIP_VHCI_NR_HCS=8 +CONFIG_USBIP_HOST=m +CONFIG_USBIP_VUDC=m +# CONFIG_USBIP_DEBUG is not set + +# +# USB dual-mode controller drivers +# +# CONFIG_USB_CDNS_SUPPORT is not set +CONFIG_USB_MUSB_HDRC=m +# CONFIG_USB_MUSB_HOST is not set +# CONFIG_USB_MUSB_GADGET is not set +CONFIG_USB_MUSB_DUAL_ROLE=y + +# +# Platform Glue Layer +# +CONFIG_USB_MUSB_SUNXI=m + +# +# MUSB DMA mode +# +# CONFIG_MUSB_PIO_ONLY is not set +CONFIG_USB_DWC3=m +CONFIG_USB_DWC3_ULPI=y +# CONFIG_USB_DWC3_HOST is not set +# CONFIG_USB_DWC3_GADGET is not set +CONFIG_USB_DWC3_DUAL_ROLE=y + +# +# Platform Glue Driver Support +# +CONFIG_USB_DWC3_PCI=m +CONFIG_USB_DWC3_HAPS=m +CONFIG_USB_DWC3_MESON_G12A=m +CONFIG_USB_DWC3_OF_SIMPLE=m +CONFIG_USB_DWC3_QCOM=m +CONFIG_USB_DWC3_XILINX=m +CONFIG_USB_DWC2=m +# CONFIG_USB_DWC2_HOST is not set + +# +# Gadget/Dual-role mode requires USB Gadget support to be enabled +# +# CONFIG_USB_DWC2_PERIPHERAL is not set +CONFIG_USB_DWC2_DUAL_ROLE=y +# CONFIG_USB_DWC2_PCI is not set +# CONFIG_USB_DWC2_DEBUG is not set +# CONFIG_USB_DWC2_TRACK_MISSED_SOFS is not set +CONFIG_USB_CHIPIDEA=m +CONFIG_USB_CHIPIDEA_UDC=y +CONFIG_USB_CHIPIDEA_HOST=y +CONFIG_USB_CHIPIDEA_PCI=m +CONFIG_USB_CHIPIDEA_MSM=m +CONFIG_USB_CHIPIDEA_IMX=m +CONFIG_USB_CHIPIDEA_GENERIC=m +CONFIG_USB_CHIPIDEA_TEGRA=m +CONFIG_USB_ISP1760=m +CONFIG_USB_ISP1760_HCD=y +CONFIG_USB_ISP1761_UDC=y +# CONFIG_USB_ISP1760_HOST_ROLE is not set +# CONFIG_USB_ISP1760_GADGET_ROLE is not set +CONFIG_USB_ISP1760_DUAL_ROLE=y + +# +# USB port drivers +# +CONFIG_USB_SERIAL=m +CONFIG_USB_SERIAL_GENERIC=y +CONFIG_USB_SERIAL_SIMPLE=m +CONFIG_USB_SERIAL_AIRCABLE=m +CONFIG_USB_SERIAL_ARK3116=m +CONFIG_USB_SERIAL_BELKIN=m +CONFIG_USB_SERIAL_CH341=m +CONFIG_USB_SERIAL_WHITEHEAT=m +CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m +CONFIG_USB_SERIAL_CP210X=m +CONFIG_USB_SERIAL_CYPRESS_M8=m +CONFIG_USB_SERIAL_EMPEG=m +CONFIG_USB_SERIAL_FTDI_SIO=m +CONFIG_USB_SERIAL_VISOR=m +CONFIG_USB_SERIAL_IPAQ=m +CONFIG_USB_SERIAL_IR=m +CONFIG_USB_SERIAL_EDGEPORT=m +CONFIG_USB_SERIAL_EDGEPORT_TI=m +CONFIG_USB_SERIAL_F81232=m +CONFIG_USB_SERIAL_F8153X=m +CONFIG_USB_SERIAL_GARMIN=m +CONFIG_USB_SERIAL_IPW=m +CONFIG_USB_SERIAL_IUU=m +CONFIG_USB_SERIAL_KEYSPAN_PDA=m +CONFIG_USB_SERIAL_KEYSPAN=m +CONFIG_USB_SERIAL_KLSI=m +CONFIG_USB_SERIAL_KOBIL_SCT=m +CONFIG_USB_SERIAL_MCT_U232=m +CONFIG_USB_SERIAL_METRO=m +CONFIG_USB_SERIAL_MOS7720=m +CONFIG_USB_SERIAL_MOS7715_PARPORT=y +CONFIG_USB_SERIAL_MOS7840=m +CONFIG_USB_SERIAL_MXUPORT=m +CONFIG_USB_SERIAL_NAVMAN=m +CONFIG_USB_SERIAL_PL2303=m +CONFIG_USB_SERIAL_OTI6858=m +CONFIG_USB_SERIAL_QCAUX=m +CONFIG_USB_SERIAL_QUALCOMM=m +CONFIG_USB_SERIAL_SPCP8X5=m +CONFIG_USB_SERIAL_SAFE=m +# CONFIG_USB_SERIAL_SAFE_PADDED is not set +CONFIG_USB_SERIAL_SIERRAWIRELESS=m +CONFIG_USB_SERIAL_SYMBOL=m +CONFIG_USB_SERIAL_TI=m +CONFIG_USB_SERIAL_CYBERJACK=m +CONFIG_USB_SERIAL_WWAN=m +CONFIG_USB_SERIAL_OPTION=m +CONFIG_USB_SERIAL_OMNINET=m +CONFIG_USB_SERIAL_OPTICON=m +CONFIG_USB_SERIAL_XSENS_MT=m +CONFIG_USB_SERIAL_WISHBONE=m +CONFIG_USB_SERIAL_SSU100=m +CONFIG_USB_SERIAL_QT2=m +CONFIG_USB_SERIAL_UPD78F0730=m +# CONFIG_USB_SERIAL_XR is not set +CONFIG_USB_SERIAL_DEBUG=m + +# +# USB Miscellaneous drivers +# +# CONFIG_USB_USS720 is not set +CONFIG_USB_EMI62=m +CONFIG_USB_EMI26=m +CONFIG_USB_ADUTUX=m +CONFIG_USB_SEVSEG=m +CONFIG_USB_LEGOTOWER=m +CONFIG_USB_LCD=m +CONFIG_USB_CYPRESS_CY7C63=m +CONFIG_USB_CYTHERM=m +CONFIG_USB_IDMOUSE=m +CONFIG_USB_APPLEDISPLAY=m +# CONFIG_USB_QCOM_EUD is not set +# CONFIG_APPLE_MFI_FASTCHARGE is not set +CONFIG_USB_SISUSBVGA=m +CONFIG_USB_LD=m +CONFIG_USB_TRANCEVIBRATOR=m +CONFIG_USB_IOWARRIOR=m +CONFIG_USB_TEST=m +CONFIG_USB_EHSET_TEST_FIXTURE=m +CONFIG_USB_ISIGHTFW=m +CONFIG_USB_YUREX=m +CONFIG_USB_EZUSB_FX2=m +# CONFIG_USB_HUB_USB251XB is not set +CONFIG_USB_HSIC_USB3503=m +# CONFIG_USB_HSIC_USB4604 is not set +# CONFIG_USB_LINK_LAYER_TEST is not set +CONFIG_USB_CHAOSKEY=m +# CONFIG_USB_ONBOARD_HUB is not set +# CONFIG_USB_ATM is not set + +# +# USB Physical Layer drivers +# +CONFIG_USB_PHY=y +CONFIG_NOP_USB_XCEIV=m +# CONFIG_USB_GPIO_VBUS is not set +# CONFIG_USB_ISP1301 is not set +CONFIG_USB_TEGRA_PHY=m +CONFIG_USB_ULPI=y +CONFIG_USB_ULPI_VIEWPORT=y +# end of USB Physical Layer drivers + +CONFIG_USB_GADGET=m +# CONFIG_USB_GADGET_DEBUG is not set +# CONFIG_USB_GADGET_DEBUG_FILES is not set +# CONFIG_USB_GADGET_DEBUG_FS is not set +CONFIG_USB_GADGET_VBUS_DRAW=2 +CONFIG_USB_GADGET_STORAGE_NUM_BUFFERS=2 +# CONFIG_U_SERIAL_CONSOLE is not set + +# +# USB Peripheral Controller +# +# CONFIG_USB_GR_UDC is not set +# CONFIG_USB_R8A66597 is not set +# CONFIG_USB_PXA27X is not set +# CONFIG_USB_MV_UDC is not set +# CONFIG_USB_MV_U3D is not set +# CONFIG_USB_SNP_UDC_PLAT is not set +# CONFIG_USB_M66592 is not set +# CONFIG_USB_BDC_UDC is not set +# CONFIG_USB_AMD5536UDC is not set +# CONFIG_USB_NET2272 is not set +CONFIG_USB_NET2280=m +# CONFIG_USB_GOKU is not set +# CONFIG_USB_EG20T is not set +# CONFIG_USB_GADGET_XILINX is not set +# CONFIG_USB_MAX3420_UDC is not set +# CONFIG_USB_TEGRA_XUDC is not set +# CONFIG_USB_CDNS2_UDC is not set +# CONFIG_USB_DUMMY_HCD is not set +# end of USB Peripheral Controller + +CONFIG_USB_LIBCOMPOSITE=m +CONFIG_USB_F_ACM=m +CONFIG_USB_F_SS_LB=m +CONFIG_USB_U_SERIAL=m +CONFIG_USB_U_ETHER=m +CONFIG_USB_U_AUDIO=m +CONFIG_USB_F_SERIAL=m +CONFIG_USB_F_OBEX=m +CONFIG_USB_F_NCM=m +CONFIG_USB_F_ECM=m +CONFIG_USB_F_PHONET=m +CONFIG_USB_F_EEM=m +CONFIG_USB_F_SUBSET=m +CONFIG_USB_F_RNDIS=m +CONFIG_USB_F_MASS_STORAGE=m +CONFIG_USB_F_FS=m +CONFIG_USB_F_UAC1=m +CONFIG_USB_F_UAC2=m +CONFIG_USB_F_UVC=m +CONFIG_USB_F_MIDI=m +CONFIG_USB_F_HID=m +CONFIG_USB_F_PRINTER=m +CONFIG_USB_CONFIGFS=m +CONFIG_USB_CONFIGFS_SERIAL=y +CONFIG_USB_CONFIGFS_ACM=y +CONFIG_USB_CONFIGFS_OBEX=y +CONFIG_USB_CONFIGFS_NCM=y +CONFIG_USB_CONFIGFS_ECM=y +CONFIG_USB_CONFIGFS_ECM_SUBSET=y +CONFIG_USB_CONFIGFS_RNDIS=y +CONFIG_USB_CONFIGFS_EEM=y +CONFIG_USB_CONFIGFS_PHONET=y +CONFIG_USB_CONFIGFS_MASS_STORAGE=y +CONFIG_USB_CONFIGFS_F_LB_SS=y +CONFIG_USB_CONFIGFS_F_FS=y +CONFIG_USB_CONFIGFS_F_UAC1=y +# CONFIG_USB_CONFIGFS_F_UAC1_LEGACY is not set +CONFIG_USB_CONFIGFS_F_UAC2=y +CONFIG_USB_CONFIGFS_F_MIDI=y +# CONFIG_USB_CONFIGFS_F_MIDI2 is not set +CONFIG_USB_CONFIGFS_F_HID=y +CONFIG_USB_CONFIGFS_F_UVC=y +CONFIG_USB_CONFIGFS_F_PRINTER=y +# CONFIG_USB_CONFIGFS_F_TCM is not set + +# +# USB Gadget precomposed configurations +# +# CONFIG_USB_ZERO is not set +# CONFIG_USB_AUDIO is not set +CONFIG_USB_ETH=m +CONFIG_USB_ETH_RNDIS=y +# CONFIG_USB_ETH_EEM is not set +# CONFIG_USB_G_NCM is not set +CONFIG_USB_GADGETFS=m +CONFIG_USB_FUNCTIONFS=m +CONFIG_USB_FUNCTIONFS_ETH=y +CONFIG_USB_FUNCTIONFS_RNDIS=y +CONFIG_USB_FUNCTIONFS_GENERIC=y +# CONFIG_USB_MASS_STORAGE is not set +# CONFIG_USB_GADGET_TARGET is not set +CONFIG_USB_G_SERIAL=m +# CONFIG_USB_MIDI_GADGET is not set +# CONFIG_USB_G_PRINTER is not set +# CONFIG_USB_CDC_COMPOSITE is not set +# CONFIG_USB_G_NOKIA is not set +# CONFIG_USB_G_ACM_MS is not set +# CONFIG_USB_G_MULTI is not set +# CONFIG_USB_G_HID is not set +# CONFIG_USB_G_DBGP is not set +# CONFIG_USB_G_WEBCAM is not set +# CONFIG_USB_RAW_GADGET is not set +# end of USB Gadget precomposed configurations + +# CONFIG_TYPEC is not set +CONFIG_USB_ROLE_SWITCH=m +CONFIG_MMC=y +CONFIG_PWRSEQ_EMMC=y +# CONFIG_PWRSEQ_SD8787 is not set +CONFIG_PWRSEQ_SIMPLE=y +CONFIG_MMC_BLOCK=y +CONFIG_MMC_BLOCK_MINORS=256 +CONFIG_SDIO_UART=m +# CONFIG_MMC_TEST is not set + +# +# MMC/SD/SDIO Host Controller Drivers +# +# CONFIG_MMC_DEBUG is not set +CONFIG_MMC_ARMMMCI=m +CONFIG_MMC_QCOM_DML=y +CONFIG_MMC_STM32_SDMMC=y +CONFIG_MMC_SDHCI=m +CONFIG_MMC_SDHCI_IO_ACCESSORS=y +CONFIG_MMC_SDHCI_PCI=m +CONFIG_MMC_RICOH_MMC=y +CONFIG_MMC_SDHCI_ACPI=m +CONFIG_MMC_SDHCI_PLTFM=m +CONFIG_MMC_SDHCI_OF_ARASAN=m +# CONFIG_MMC_SDHCI_OF_AT91 is not set +# CONFIG_MMC_SDHCI_OF_DWCMSHC is not set +# CONFIG_MMC_SDHCI_CADENCE is not set +CONFIG_MMC_SDHCI_TEGRA=m +# CONFIG_MMC_SDHCI_PXAV3 is not set +CONFIG_MMC_SDHCI_F_SDH30=m +# CONFIG_MMC_SDHCI_MILBEAUT is not set +CONFIG_MMC_MESON_GX=m +# CONFIG_MMC_MESON_MX_SDIO is not set +CONFIG_MMC_SDHCI_MSM=m +CONFIG_MMC_TIFM_SD=m +CONFIG_MMC_SPI=m +CONFIG_MMC_CB710=m +CONFIG_MMC_VIA_SDMMC=m +CONFIG_MMC_DW=m +CONFIG_MMC_DW_PLTFM=m +# CONFIG_MMC_DW_BLUEFIELD is not set +# CONFIG_MMC_DW_EXYNOS is not set +# CONFIG_MMC_DW_HI3798CV200 is not set +CONFIG_MMC_DW_K3=m +# CONFIG_MMC_DW_PCI is not set +CONFIG_MMC_DW_ROCKCHIP=m +CONFIG_MMC_VUB300=m +CONFIG_MMC_USHC=m +# CONFIG_MMC_USDHI6ROL0 is not set +CONFIG_MMC_REALTEK_PCI=m +CONFIG_MMC_REALTEK_USB=m +CONFIG_MMC_SUNXI=m +CONFIG_MMC_CQHCI=m +# CONFIG_MMC_HSQ is not set +CONFIG_MMC_TOSHIBA_PCI=m +# CONFIG_MMC_MTK is not set +CONFIG_MMC_SDHCI_XENON=m +CONFIG_SCSI_UFSHCD=m +# CONFIG_SCSI_UFS_BSG is not set +# CONFIG_SCSI_UFS_FAULT_INJECTION is not set +# CONFIG_SCSI_UFS_HWMON is not set +CONFIG_SCSI_UFSHCD_PCI=m +# CONFIG_SCSI_UFS_DWC_TC_PCI is not set +# CONFIG_SCSI_UFSHCD_PLATFORM is not set +CONFIG_MEMSTICK=m +# CONFIG_MEMSTICK_DEBUG is not set + +# +# MemoryStick drivers +# +# CONFIG_MEMSTICK_UNSAFE_RESUME is not set +CONFIG_MSPRO_BLOCK=m +# CONFIG_MS_BLOCK is not set + +# +# MemoryStick Host Controller Drivers +# +CONFIG_MEMSTICK_TIFM_MS=m +CONFIG_MEMSTICK_JMICRON_38X=m +CONFIG_MEMSTICK_R592=m +CONFIG_MEMSTICK_REALTEK_PCI=m +CONFIG_MEMSTICK_REALTEK_USB=m +CONFIG_NEW_LEDS=y +CONFIG_LEDS_CLASS=y +# CONFIG_LEDS_CLASS_FLASH is not set +# CONFIG_LEDS_CLASS_MULTICOLOR is not set +CONFIG_LEDS_BRIGHTNESS_HW_CHANGED=y + +# +# LED drivers +# +# CONFIG_LEDS_AN30259A is not set +# CONFIG_LEDS_AW200XX is not set +# CONFIG_LEDS_AW2013 is not set +# CONFIG_LEDS_BCM6328 is not set +# CONFIG_LEDS_BCM6358 is not set +# CONFIG_LEDS_CR0014114 is not set +# CONFIG_LEDS_EL15203000 is not set +# CONFIG_LEDS_LM3530 is not set +# CONFIG_LEDS_LM3532 is not set +# CONFIG_LEDS_LM3642 is not set +# CONFIG_LEDS_LM3692X is not set +# CONFIG_LEDS_PCA9532 is not set +CONFIG_LEDS_GPIO=m +CONFIG_LEDS_LP3944=m +# CONFIG_LEDS_LP3952 is not set +# CONFIG_LEDS_LP50XX is not set +# CONFIG_LEDS_LP55XX_COMMON is not set +# CONFIG_LEDS_LP8860 is not set +CONFIG_LEDS_PCA955X=m +# CONFIG_LEDS_PCA955X_GPIO is not set +# CONFIG_LEDS_PCA963X is not set +# CONFIG_LEDS_PCA995X is not set +CONFIG_LEDS_DAC124S085=m +CONFIG_LEDS_PWM=m +CONFIG_LEDS_REGULATOR=m +# CONFIG_LEDS_BD2606MVV is not set +CONFIG_LEDS_BD2802=m +CONFIG_LEDS_LT3593=m +# CONFIG_LEDS_TCA6507 is not set +# CONFIG_LEDS_TLC591XX is not set +# CONFIG_LEDS_LM355x is not set +# CONFIG_LEDS_IS31FL319X is not set +# CONFIG_LEDS_IS31FL32XX is not set + +# +# LED driver for blink(1) USB RGB LED is under Special HID drivers (HID_THINGM) +# +# CONFIG_LEDS_BLINKM is not set +# CONFIG_LEDS_SYSCON is not set +# CONFIG_LEDS_MLXREG is not set +# CONFIG_LEDS_USER is not set +# CONFIG_LEDS_SPI_BYTE is not set +# CONFIG_LEDS_LM3697 is not set + +# +# Flash and Torch LED drivers +# + +# +# RGB LED drivers +# + +# +# LED Triggers +# +CONFIG_LEDS_TRIGGERS=y +CONFIG_LEDS_TRIGGER_TIMER=m +CONFIG_LEDS_TRIGGER_ONESHOT=m +CONFIG_LEDS_TRIGGER_DISK=y +CONFIG_LEDS_TRIGGER_MTD=y +CONFIG_LEDS_TRIGGER_HEARTBEAT=m +CONFIG_LEDS_TRIGGER_BACKLIGHT=m +CONFIG_LEDS_TRIGGER_CPU=y +# CONFIG_LEDS_TRIGGER_ACTIVITY is not set +CONFIG_LEDS_TRIGGER_DEFAULT_ON=m + +# +# iptables trigger is under Netfilter config (LED target) +# +CONFIG_LEDS_TRIGGER_TRANSIENT=m +CONFIG_LEDS_TRIGGER_CAMERA=m +CONFIG_LEDS_TRIGGER_PANIC=y +# CONFIG_LEDS_TRIGGER_NETDEV is not set +# CONFIG_LEDS_TRIGGER_PATTERN is not set +CONFIG_LEDS_TRIGGER_AUDIO=m +# CONFIG_LEDS_TRIGGER_TTY is not set + +# +# Simple LED drivers +# +CONFIG_ACCESSIBILITY=y +CONFIG_A11Y_BRAILLE_CONSOLE=y + +# +# Speakup console speech +# +CONFIG_SPEAKUP=m +CONFIG_SPEAKUP_SYNTH_ACNTSA=m +CONFIG_SPEAKUP_SYNTH_APOLLO=m +CONFIG_SPEAKUP_SYNTH_AUDPTR=m +CONFIG_SPEAKUP_SYNTH_BNS=m +CONFIG_SPEAKUP_SYNTH_DECTLK=m +CONFIG_SPEAKUP_SYNTH_DECEXT=m +CONFIG_SPEAKUP_SYNTH_LTLK=m +CONFIG_SPEAKUP_SYNTH_SOFT=m +CONFIG_SPEAKUP_SYNTH_SPKOUT=m +CONFIG_SPEAKUP_SYNTH_TXPRT=m +CONFIG_SPEAKUP_SYNTH_DUMMY=m +# end of Speakup console speech + +CONFIG_INFINIBAND=m +CONFIG_INFINIBAND_USER_MAD=m +CONFIG_INFINIBAND_USER_ACCESS=m +CONFIG_INFINIBAND_USER_MEM=y +CONFIG_INFINIBAND_ON_DEMAND_PAGING=y +CONFIG_INFINIBAND_ADDR_TRANS=y +CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS=y +CONFIG_INFINIBAND_VIRT_DMA=y +# CONFIG_INFINIBAND_BNXT_RE is not set +CONFIG_INFINIBAND_CXGB4=m +# CONFIG_INFINIBAND_EFA is not set +# CONFIG_INFINIBAND_ERDMA is not set +# CONFIG_INFINIBAND_IRDMA is not set +CONFIG_MLX4_INFINIBAND=m +CONFIG_MLX5_INFINIBAND=m +CONFIG_INFINIBAND_MTHCA=m +CONFIG_INFINIBAND_MTHCA_DEBUG=y +CONFIG_INFINIBAND_OCRDMA=m +CONFIG_INFINIBAND_QEDR=m +CONFIG_RDMA_RXE=m +# CONFIG_RDMA_SIW is not set +CONFIG_INFINIBAND_IPOIB=m +CONFIG_INFINIBAND_IPOIB_CM=y +CONFIG_INFINIBAND_IPOIB_DEBUG=y +# CONFIG_INFINIBAND_IPOIB_DEBUG_DATA is not set +CONFIG_INFINIBAND_SRP=m +CONFIG_INFINIBAND_SRPT=m +CONFIG_INFINIBAND_ISER=m +CONFIG_INFINIBAND_ISERT=m +# CONFIG_INFINIBAND_RTRS_CLIENT is not set +# CONFIG_INFINIBAND_RTRS_SERVER is not set +CONFIG_EDAC_SUPPORT=y +CONFIG_EDAC=y +CONFIG_EDAC_LEGACY_SYSFS=y +CONFIG_EDAC_DEBUG=y +CONFIG_EDAC_GHES=y +CONFIG_EDAC_THUNDERX=m +# CONFIG_EDAC_SYNOPSYS is not set +CONFIG_EDAC_XGENE=m +# CONFIG_EDAC_DMC520 is not set +# CONFIG_EDAC_ZYNQMP is not set +CONFIG_RTC_LIB=y +CONFIG_RTC_CLASS=y +CONFIG_RTC_HCTOSYS=y +CONFIG_RTC_HCTOSYS_DEVICE="rtc0" +CONFIG_RTC_SYSTOHC=y +CONFIG_RTC_SYSTOHC_DEVICE="rtc0" +# CONFIG_RTC_DEBUG is not set +CONFIG_RTC_NVMEM=y + +# +# RTC interfaces +# +CONFIG_RTC_INTF_SYSFS=y +CONFIG_RTC_INTF_PROC=y +CONFIG_RTC_INTF_DEV=y +# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set +# CONFIG_RTC_DRV_TEST is not set + +# +# I2C RTC drivers +# +# CONFIG_RTC_DRV_ABB5ZES3 is not set +# CONFIG_RTC_DRV_ABEOZ9 is not set +# CONFIG_RTC_DRV_ABX80X is not set +CONFIG_RTC_DRV_DS1307=y +# CONFIG_RTC_DRV_DS1307_CENTURY is not set +# CONFIG_RTC_DRV_DS1374 is not set +# CONFIG_RTC_DRV_DS1672 is not set +# CONFIG_RTC_DRV_HYM8563 is not set +# CONFIG_RTC_DRV_MAX6900 is not set +CONFIG_RTC_DRV_MAX77686=y +# CONFIG_RTC_DRV_NCT3018Y is not set +# CONFIG_RTC_DRV_RS5C372 is not set +# CONFIG_RTC_DRV_ISL1208 is not set +# CONFIG_RTC_DRV_ISL12022 is not set +# CONFIG_RTC_DRV_ISL12026 is not set +# CONFIG_RTC_DRV_X1205 is not set +# CONFIG_RTC_DRV_PCF8523 is not set +# CONFIG_RTC_DRV_PCF85063 is not set +# CONFIG_RTC_DRV_PCF85363 is not set +CONFIG_RTC_DRV_PCF8563=y +# CONFIG_RTC_DRV_PCF8583 is not set +# CONFIG_RTC_DRV_M41T80 is not set +# CONFIG_RTC_DRV_BQ32K is not set +# CONFIG_RTC_DRV_S35390A is not set +# CONFIG_RTC_DRV_FM3130 is not set +# CONFIG_RTC_DRV_RX8010 is not set +# CONFIG_RTC_DRV_RX8581 is not set +# CONFIG_RTC_DRV_RX8025 is not set +# CONFIG_RTC_DRV_EM3027 is not set +# CONFIG_RTC_DRV_RV3028 is not set +# CONFIG_RTC_DRV_RV3032 is not set +# CONFIG_RTC_DRV_RV8803 is not set +# CONFIG_RTC_DRV_SD3078 is not set + +# +# SPI RTC drivers +# +# CONFIG_RTC_DRV_M41T93 is not set +# CONFIG_RTC_DRV_M41T94 is not set +# CONFIG_RTC_DRV_DS1302 is not set +# CONFIG_RTC_DRV_DS1305 is not set +# CONFIG_RTC_DRV_DS1343 is not set +# CONFIG_RTC_DRV_DS1347 is not set +# CONFIG_RTC_DRV_DS1390 is not set +# CONFIG_RTC_DRV_MAX6916 is not set +# CONFIG_RTC_DRV_R9701 is not set +# CONFIG_RTC_DRV_RX4581 is not set +# CONFIG_RTC_DRV_RS5C348 is not set +# CONFIG_RTC_DRV_MAX6902 is not set +# CONFIG_RTC_DRV_PCF2123 is not set +# CONFIG_RTC_DRV_MCP795 is not set +CONFIG_RTC_I2C_AND_SPI=y + +# +# SPI and I2C RTC drivers +# +# CONFIG_RTC_DRV_DS3232 is not set +# CONFIG_RTC_DRV_PCF2127 is not set +# CONFIG_RTC_DRV_RV3029C2 is not set +# CONFIG_RTC_DRV_RX6110 is not set + +# +# Platform RTC drivers +# +# CONFIG_RTC_DRV_DS1286 is not set +# CONFIG_RTC_DRV_DS1511 is not set +# CONFIG_RTC_DRV_DS1553 is not set +# CONFIG_RTC_DRV_DS1685_FAMILY is not set +# CONFIG_RTC_DRV_DS1742 is not set +# CONFIG_RTC_DRV_DS2404 is not set +CONFIG_RTC_DRV_EFI=y +# CONFIG_RTC_DRV_STK17TA8 is not set +# CONFIG_RTC_DRV_M48T86 is not set +# CONFIG_RTC_DRV_M48T35 is not set +# CONFIG_RTC_DRV_M48T59 is not set +# CONFIG_RTC_DRV_MSM6242 is not set +# CONFIG_RTC_DRV_RP5C01 is not set +# CONFIG_RTC_DRV_OPTEE is not set +# CONFIG_RTC_DRV_ZYNQMP is not set +CONFIG_RTC_DRV_CROS_EC=m + +# +# on-CPU RTC drivers +# +CONFIG_RTC_DRV_MESON_VRTC=m +# CONFIG_RTC_DRV_PL030 is not set +CONFIG_RTC_DRV_PL031=y +CONFIG_RTC_DRV_SUN6I=y +CONFIG_RTC_DRV_MV=m +CONFIG_RTC_DRV_ARMADA38X=m +# CONFIG_RTC_DRV_CADENCE is not set +# CONFIG_RTC_DRV_FTRTC010 is not set +CONFIG_RTC_DRV_PM8XXX=m +CONFIG_RTC_DRV_TEGRA=y +CONFIG_RTC_DRV_XGENE=y +# CONFIG_RTC_DRV_R7301 is not set + +# +# HID Sensor RTC drivers +# +# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set +# CONFIG_RTC_DRV_GOLDFISH is not set +CONFIG_DMADEVICES=y +# CONFIG_DMADEVICES_DEBUG is not set + +# +# DMA Devices +# +CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH=y +CONFIG_DMA_ENGINE=y +CONFIG_DMA_VIRTUAL_CHANNELS=y +CONFIG_DMA_ACPI=y +CONFIG_DMA_OF=y +# CONFIG_ALTERA_MSGDMA is not set +# CONFIG_AMBA_PL08X is not set +# CONFIG_AXI_DMAC is not set +# CONFIG_BCM_SBA_RAID is not set +CONFIG_DMA_SUN6I=m +# CONFIG_DW_AXI_DMAC is not set +# CONFIG_FSL_EDMA is not set +# CONFIG_FSL_QDMA is not set +# CONFIG_HISI_DMA is not set +# CONFIG_INTEL_IDMA64 is not set +CONFIG_K3_DMA=m +CONFIG_MV_XOR=y +CONFIG_MV_XOR_V2=y +CONFIG_PL330_DMA=m +# CONFIG_PLX_DMA is not set +# CONFIG_TEGRA186_GPC_DMA is not set +CONFIG_TEGRA20_APB_DMA=y +CONFIG_TEGRA210_ADMA=y +CONFIG_XGENE_DMA=m +# CONFIG_XILINX_DMA is not set +# CONFIG_XILINX_XDMA is not set +# CONFIG_XILINX_ZYNQMP_DMA is not set +# CONFIG_XILINX_ZYNQMP_DPDMA is not set +CONFIG_QCOM_BAM_DMA=m +# CONFIG_QCOM_GPI_DMA is not set +CONFIG_QCOM_HIDMA_MGMT=m +CONFIG_QCOM_HIDMA=m +# CONFIG_DW_DMAC is not set +# CONFIG_DW_DMAC_PCI is not set +# CONFIG_DW_EDMA is not set +# CONFIG_SF_PDMA is not set + +# +# DMA Clients +# +CONFIG_ASYNC_TX_DMA=y +# CONFIG_DMATEST is not set +CONFIG_DMA_ENGINE_RAID=y + +# +# DMABUF options +# +CONFIG_SYNC_FILE=y +# CONFIG_SW_SYNC is not set +# CONFIG_UDMABUF is not set +# CONFIG_DMABUF_MOVE_NOTIFY is not set +# CONFIG_DMABUF_DEBUG is not set +# CONFIG_DMABUF_SELFTESTS is not set +CONFIG_DMABUF_HEAPS=y +# CONFIG_DMABUF_SYSFS_STATS is not set +CONFIG_DMABUF_HEAPS_SYSTEM=y +# end of DMABUF options + +CONFIG_UIO=m +CONFIG_UIO_CIF=m +# CONFIG_UIO_PDRV_GENIRQ is not set +# CONFIG_UIO_DMEM_GENIRQ is not set +CONFIG_UIO_AEC=m +CONFIG_UIO_SERCOS3=m +CONFIG_UIO_PCI_GENERIC=m +CONFIG_UIO_NETX=m +# CONFIG_UIO_PRUSS is not set +CONFIG_UIO_MF624=m +# CONFIG_UIO_HV_GENERIC is not set +CONFIG_VFIO=m +CONFIG_VFIO_GROUP=y +CONFIG_VFIO_CONTAINER=y +CONFIG_VFIO_IOMMU_TYPE1=m +CONFIG_VFIO_NOIOMMU=y +CONFIG_VFIO_VIRQFD=y + +# +# VFIO support for PCI devices +# +CONFIG_VFIO_PCI_CORE=m +CONFIG_VFIO_PCI_MMAP=y +CONFIG_VFIO_PCI_INTX=y +CONFIG_VFIO_PCI=m +# CONFIG_MLX5_VFIO_PCI is not set +# end of VFIO support for PCI devices + +# +# VFIO support for platform devices +# +CONFIG_VFIO_PLATFORM_BASE=m +CONFIG_VFIO_PLATFORM=m +# CONFIG_VFIO_AMBA is not set + +# +# VFIO platform reset drivers +# +# CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET is not set +# CONFIG_VFIO_PLATFORM_AMDXGBE_RESET is not set +# end of VFIO platform reset drivers +# end of VFIO support for platform devices + +CONFIG_VIRT_DRIVERS=y +CONFIG_VMGENID=y +# CONFIG_NITRO_ENCLAVES is not set +CONFIG_VIRTIO_ANCHOR=y +CONFIG_VIRTIO=m +CONFIG_VIRTIO_PCI_LIB=m +CONFIG_VIRTIO_PCI_LIB_LEGACY=m +CONFIG_VIRTIO_MENU=y +CONFIG_VIRTIO_PCI=m +CONFIG_VIRTIO_PCI_LEGACY=y +# CONFIG_VIRTIO_PMEM is not set +CONFIG_VIRTIO_BALLOON=m +# CONFIG_VIRTIO_MEM is not set +CONFIG_VIRTIO_INPUT=m +CONFIG_VIRTIO_MMIO=m +# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set +CONFIG_VIRTIO_DMA_SHARED_BUFFER=m +# CONFIG_VDPA is not set +CONFIG_VHOST_IOTLB=m +CONFIG_VHOST_TASK=y +CONFIG_VHOST=m +CONFIG_VHOST_MENU=y +CONFIG_VHOST_NET=m +CONFIG_VHOST_SCSI=m +CONFIG_VHOST_VSOCK=m +# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set + +# +# Microsoft Hyper-V guest support +# +CONFIG_HYPERV=m +CONFIG_HYPERV_UTILS=m +CONFIG_HYPERV_BALLOON=m +# end of Microsoft Hyper-V guest support + +# +# Xen driver support +# +CONFIG_XEN_BALLOON=y +CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y +CONFIG_XEN_SCRUB_PAGES_DEFAULT=y +CONFIG_XEN_DEV_EVTCHN=m +CONFIG_XEN_BACKEND=y +CONFIG_XENFS=m +CONFIG_XEN_COMPAT_XENFS=y +CONFIG_XEN_SYS_HYPERVISOR=y +CONFIG_XEN_XENBUS_FRONTEND=y +CONFIG_XEN_GNTDEV=m +CONFIG_XEN_GRANT_DEV_ALLOC=m +# CONFIG_XEN_GRANT_DMA_ALLOC is not set +CONFIG_SWIOTLB_XEN=y +CONFIG_XEN_PCI_STUB=y +CONFIG_XEN_PCIDEV_STUB=m +# CONFIG_XEN_PVCALLS_FRONTEND is not set +# CONFIG_XEN_PVCALLS_BACKEND is not set +CONFIG_XEN_SCSI_BACKEND=m +CONFIG_XEN_PRIVCMD=m +CONFIG_XEN_EFI=y +CONFIG_XEN_AUTO_XLATE=y +CONFIG_XEN_FRONT_PGDIR_SHBUF=m +CONFIG_XEN_UNPOPULATED_ALLOC=y +# CONFIG_XEN_VIRTIO is not set +# end of Xen driver support + +# CONFIG_GREYBUS is not set +# CONFIG_COMEDI is not set +CONFIG_STAGING=y +# CONFIG_PRISM2_USB is not set +# CONFIG_RTL8192U is not set +# CONFIG_RTLLIB is not set +CONFIG_RTL8723BS=m +CONFIG_R8712U=m +# CONFIG_RTS5208 is not set +# CONFIG_VT6655 is not set +# CONFIG_VT6656 is not set + +# +# IIO staging drivers +# + +# +# Accelerometers +# +# CONFIG_ADIS16203 is not set +# CONFIG_ADIS16240 is not set +# end of Accelerometers + +# +# Analog to digital converters +# +# CONFIG_AD7816 is not set +# end of Analog to digital converters + +# +# Analog digital bi-direction converters +# +# CONFIG_ADT7316 is not set +# end of Analog digital bi-direction converters + +# +# Direct Digital Synthesis +# +# CONFIG_AD9832 is not set +# CONFIG_AD9834 is not set +# end of Direct Digital Synthesis + +# +# Network Analyzer, Impedance Converters +# +# CONFIG_AD5933 is not set +# end of Network Analyzer, Impedance Converters + +# +# Resolver to digital converters +# +# CONFIG_AD2S1210 is not set +# end of Resolver to digital converters +# end of IIO staging drivers + +# CONFIG_FB_SM750 is not set +# CONFIG_MFD_NVEC is not set +# CONFIG_STAGING_MEDIA is not set +# CONFIG_STAGING_BOARD is not set +# CONFIG_LTE_GDM724X is not set +# CONFIG_FB_TFT is not set +# CONFIG_KS7010 is not set +# CONFIG_PI433 is not set +# CONFIG_XIL_AXIS_FIFO is not set +# CONFIG_FIELDBUS_DEV is not set +CONFIG_QLGE=m +# CONFIG_VME_BUS is not set +# CONFIG_GOLDFISH is not set +CONFIG_CHROME_PLATFORMS=y +# CONFIG_CHROMEOS_ACPI is not set +# CONFIG_CHROMEOS_TBMC is not set +CONFIG_CROS_EC=y +CONFIG_CROS_EC_I2C=m +# CONFIG_CROS_EC_RPMSG is not set +CONFIG_CROS_EC_SPI=m +# CONFIG_CROS_EC_UART is not set +CONFIG_CROS_EC_PROTO=y +CONFIG_CROS_KBD_LED_BACKLIGHT=m +CONFIG_CROS_EC_CHARDEV=y +CONFIG_CROS_EC_LIGHTBAR=m +CONFIG_CROS_EC_VBC=m +CONFIG_CROS_EC_DEBUGFS=m +CONFIG_CROS_EC_SENSORHUB=y +CONFIG_CROS_EC_SYSFS=m +# CONFIG_CROS_HPS_I2C is not set +CONFIG_CROS_USBPD_LOGGER=m +CONFIG_CROS_USBPD_NOTIFY=y +# CONFIG_CHROMEOS_PRIVACY_SCREEN is not set +# CONFIG_MELLANOX_PLATFORM is not set +CONFIG_SURFACE_PLATFORMS=y +# CONFIG_SURFACE_3_POWER_OPREGION is not set +# CONFIG_SURFACE_GPE is not set +# CONFIG_SURFACE_HOTPLUG is not set +# CONFIG_SURFACE_PRO3_BUTTON is not set +# CONFIG_SURFACE_AGGREGATOR is not set +CONFIG_HAVE_CLK=y +CONFIG_HAVE_CLK_PREPARE=y +CONFIG_COMMON_CLK=y + +# +# Clock driver for ARM Reference designs +# +# CONFIG_CLK_ICST is not set +CONFIG_CLK_SP810=y +CONFIG_CLK_VEXPRESS_OSC=y +# end of Clock driver for ARM Reference designs + +# CONFIG_LMK04832 is not set +# CONFIG_COMMON_CLK_MAX77686 is not set +# CONFIG_COMMON_CLK_MAX9485 is not set +CONFIG_COMMON_CLK_HI655X=m +# CONFIG_COMMON_CLK_SI5341 is not set +# CONFIG_COMMON_CLK_SI5351 is not set +# CONFIG_COMMON_CLK_SI514 is not set +# CONFIG_COMMON_CLK_SI544 is not set +# CONFIG_COMMON_CLK_SI570 is not set +# CONFIG_COMMON_CLK_CDCE706 is not set +# CONFIG_COMMON_CLK_CDCE925 is not set +# CONFIG_COMMON_CLK_CS2000_CP is not set +# CONFIG_COMMON_CLK_AXI_CLKGEN is not set +CONFIG_COMMON_CLK_XGENE=y +# CONFIG_COMMON_CLK_PWM is not set +# CONFIG_COMMON_CLK_RS9_PCIE is not set +# CONFIG_COMMON_CLK_SI521XX is not set +# CONFIG_COMMON_CLK_VC3 is not set +# CONFIG_COMMON_CLK_VC5 is not set +# CONFIG_COMMON_CLK_VC7 is not set +# CONFIG_COMMON_CLK_FIXED_MMIO is not set +CONFIG_COMMON_CLK_HI3516CV300=y +CONFIG_COMMON_CLK_HI3519=y +CONFIG_COMMON_CLK_HI3559A=y +CONFIG_COMMON_CLK_HI3660=y +CONFIG_COMMON_CLK_HI3670=y +CONFIG_COMMON_CLK_HI3798CV200=y +CONFIG_COMMON_CLK_HI6220=y +CONFIG_RESET_HISI=y +CONFIG_STUB_CLK_HI6220=y +CONFIG_STUB_CLK_HI3660=y + +# +# Clock support for Amlogic platforms +# +CONFIG_COMMON_CLK_MESON_REGMAP=y +CONFIG_COMMON_CLK_MESON_DUALDIV=y +CONFIG_COMMON_CLK_MESON_MPLL=y +CONFIG_COMMON_CLK_MESON_PLL=y +CONFIG_COMMON_CLK_MESON_VID_PLL_DIV=y +CONFIG_COMMON_CLK_MESON_CLKC_UTILS=y +CONFIG_COMMON_CLK_MESON_AO_CLKC=y +CONFIG_COMMON_CLK_MESON_EE_CLKC=y +CONFIG_COMMON_CLK_MESON_CPU_DYNDIV=y +CONFIG_COMMON_CLK_GXBB=y +CONFIG_COMMON_CLK_AXG=y +# CONFIG_COMMON_CLK_AXG_AUDIO is not set +# CONFIG_COMMON_CLK_A1_PLL is not set +# CONFIG_COMMON_CLK_A1_PERIPHERALS is not set +CONFIG_COMMON_CLK_G12A=y +# end of Clock support for Amlogic platforms + +CONFIG_ARMADA_AP_CP_HELPER=y +CONFIG_ARMADA_37XX_CLK=y +CONFIG_ARMADA_AP806_SYSCON=y +CONFIG_ARMADA_CP110_SYSCON=y +CONFIG_QCOM_GDSC=y +CONFIG_QCOM_RPMCC=y +CONFIG_COMMON_CLK_QCOM=y +CONFIG_QCOM_A53PLL=y +# CONFIG_QCOM_A7PLL is not set +CONFIG_QCOM_CLK_APCS_MSM8916=m +# CONFIG_QCOM_CLK_APCC_MSM8996 is not set +CONFIG_QCOM_CLK_RPM=m +CONFIG_QCOM_CLK_SMD_RPM=m +# CONFIG_IPQ_APSS_PLL is not set +# CONFIG_IPQ_GCC_4019 is not set +# CONFIG_IPQ_GCC_5018 is not set +# CONFIG_IPQ_GCC_5332 is not set +# CONFIG_IPQ_GCC_6018 is not set +# CONFIG_IPQ_GCC_8074 is not set +# CONFIG_IPQ_GCC_9574 is not set +CONFIG_MSM_GCC_8916=y +# CONFIG_MSM_GCC_8917 is not set +# CONFIG_MSM_GCC_8939 is not set +# CONFIG_MSM_GCC_8953 is not set +# CONFIG_MSM_GCC_8976 is not set +# CONFIG_MSM_MMCC_8994 is not set +# CONFIG_MSM_GCC_8994 is not set +CONFIG_MSM_GCC_8996=y +CONFIG_MSM_MMCC_8996=y +# CONFIG_MSM_GCC_8998 is not set +# CONFIG_MSM_GPUCC_8998 is not set +# CONFIG_MSM_MMCC_8998 is not set +# CONFIG_QCM_GCC_2290 is not set +# CONFIG_QCM_DISPCC_2290 is not set +# CONFIG_QCS_GCC_404 is not set +# CONFIG_SC_CAMCC_7180 is not set +# CONFIG_SC_CAMCC_7280 is not set +# CONFIG_SC_DISPCC_7180 is not set +# CONFIG_SC_DISPCC_7280 is not set +# CONFIG_SC_DISPCC_8280XP is not set +# CONFIG_SA_GCC_8775P is not set +# CONFIG_SA_GPUCC_8775P is not set +# CONFIG_SC_GCC_7180 is not set +# CONFIG_SC_GCC_7280 is not set +# CONFIG_SC_GCC_8180X is not set +# CONFIG_SC_GCC_8280XP is not set +# CONFIG_SC_GPUCC_7180 is not set +# CONFIG_SC_GPUCC_7280 is not set +# CONFIG_SC_GPUCC_8280XP is not set +# CONFIG_SC_LPASSCC_7280 is not set +# CONFIG_SC_LPASSCC_8280XP is not set +# CONFIG_SC_LPASS_CORECC_7180 is not set +# CONFIG_SC_LPASS_CORECC_7280 is not set +# CONFIG_SC_MSS_7180 is not set +# CONFIG_SC_VIDEOCC_7180 is not set +# CONFIG_SC_VIDEOCC_7280 is not set +# CONFIG_SDM_CAMCC_845 is not set +# CONFIG_SDM_GCC_660 is not set +# CONFIG_SDM_MMCC_660 is not set +# CONFIG_SDM_GPUCC_660 is not set +# CONFIG_QCS_TURING_404 is not set +# CONFIG_QCS_Q6SSTOP_404 is not set +# CONFIG_QDU_GCC_1000 is not set +# CONFIG_SDM_GCC_845 is not set +# CONFIG_SDM_GPUCC_845 is not set +# CONFIG_SDM_VIDEOCC_845 is not set +# CONFIG_SDM_DISPCC_845 is not set +# CONFIG_SDM_LPASSCC_845 is not set +# CONFIG_SDX_GCC_75 is not set +# CONFIG_SM_CAMCC_6350 is not set +# CONFIG_SM_CAMCC_8250 is not set +# CONFIG_SM_CAMCC_8450 is not set +# CONFIG_SM_GCC_6115 is not set +# CONFIG_SM_GCC_6125 is not set +# CONFIG_SM_GCC_6350 is not set +# CONFIG_SM_GCC_6375 is not set +# CONFIG_SM_GCC_7150 is not set +# CONFIG_SM_GCC_8150 is not set +# CONFIG_SM_GCC_8250 is not set +# CONFIG_SM_GCC_8350 is not set +# CONFIG_SM_GCC_8450 is not set +# CONFIG_SM_GCC_8550 is not set +# CONFIG_SM_GPUCC_6115 is not set +# CONFIG_SM_GPUCC_6125 is not set +# CONFIG_SM_GPUCC_6375 is not set +# CONFIG_SM_GPUCC_6350 is not set +# CONFIG_SM_GPUCC_8150 is not set +# CONFIG_SM_GPUCC_8250 is not set +# CONFIG_SM_GPUCC_8350 is not set +# CONFIG_SM_GPUCC_8450 is not set +# CONFIG_SM_GPUCC_8550 is not set +# CONFIG_SM_TCSRCC_8550 is not set +# CONFIG_SM_VIDEOCC_8150 is not set +# CONFIG_SM_VIDEOCC_8250 is not set +# CONFIG_SM_VIDEOCC_8350 is not set +# CONFIG_SM_VIDEOCC_8550 is not set +# CONFIG_SPMI_PMIC_CLKDIV is not set +# CONFIG_QCOM_HFPLL is not set +# CONFIG_KPSS_XCC is not set +# CONFIG_CLK_GFM_LPASS_SM8250 is not set +# CONFIG_SM_VIDEOCC_8450 is not set +CONFIG_COMMON_CLK_ROCKCHIP=y +CONFIG_CLK_PX30=y +CONFIG_CLK_RK3308=y +CONFIG_CLK_RK3328=y +CONFIG_CLK_RK3368=y +CONFIG_CLK_RK3399=y +CONFIG_CLK_RK3568=y +CONFIG_CLK_RK3588=y +CONFIG_SUNXI_CCU=y +CONFIG_SUN50I_A64_CCU=y +CONFIG_SUN50I_A100_CCU=y +CONFIG_SUN50I_A100_R_CCU=y +CONFIG_SUN50I_H6_CCU=y +CONFIG_SUN50I_H616_CCU=y +CONFIG_SUN50I_H6_R_CCU=y +CONFIG_SUN6I_RTC_CCU=y +CONFIG_SUN8I_H3_CCU=y +CONFIG_SUN8I_DE2_CCU=y +CONFIG_SUN8I_R_CCU=y +CONFIG_CLK_TEGRA_BPMP=y +CONFIG_TEGRA_CLK_DFLL=y +# CONFIG_XILINX_VCU is not set +# CONFIG_COMMON_CLK_XLNX_CLKWZRD is not set +# CONFIG_COMMON_CLK_ZYNQMP is not set +# CONFIG_HWSPINLOCK is not set + +# +# Clock Source drivers +# +CONFIG_TIMER_OF=y +CONFIG_TIMER_ACPI=y +CONFIG_TIMER_PROBE=y +CONFIG_CLKSRC_MMIO=y +CONFIG_ROCKCHIP_TIMER=y +CONFIG_SUN4I_TIMER=y +CONFIG_TEGRA_TIMER=y +# CONFIG_TEGRA186_TIMER is not set +CONFIG_ARM_ARCH_TIMER=y +CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y +CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND=y +CONFIG_FSL_ERRATUM_A008585=y +CONFIG_HISILICON_ERRATUM_161010101=y +CONFIG_ARM64_ERRATUM_858921=y +CONFIG_SUN50I_ERRATUM_UNKNOWN1=y +CONFIG_ARM_TIMER_SP804=y +# end of Clock Source drivers + +CONFIG_MAILBOX=y +# CONFIG_ARM_MHU is not set +# CONFIG_ARM_MHU_V2 is not set +# CONFIG_PLATFORM_MHU is not set +# CONFIG_PL320_MBOX is not set +CONFIG_ARMADA_37XX_RWTM_MBOX=m +# CONFIG_ROCKCHIP_MBOX is not set +CONFIG_PCC=y +# CONFIG_ALTERA_MBOX is not set +CONFIG_HI3660_MBOX=y +CONFIG_HI6220_MBOX=y +# CONFIG_MAILBOX_TEST is not set +CONFIG_QCOM_APCS_IPC=m +CONFIG_TEGRA_HSP_MBOX=y +CONFIG_XGENE_SLIMPRO_MBOX=m +CONFIG_ZYNQMP_IPI_MBOX=y +CONFIG_SUN6I_MSGBOX=y +# CONFIG_QCOM_IPCC is not set +CONFIG_IOMMU_IOVA=y +CONFIG_IOMMU_API=y +CONFIG_IOMMU_SUPPORT=y + +# +# Generic IOMMU Pagetable Support +# +CONFIG_IOMMU_IO_PGTABLE=y +CONFIG_IOMMU_IO_PGTABLE_LPAE=y +# CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST is not set +# CONFIG_IOMMU_IO_PGTABLE_ARMV7S is not set +# CONFIG_IOMMU_IO_PGTABLE_DART is not set +# end of Generic IOMMU Pagetable Support + +# CONFIG_IOMMU_DEBUGFS is not set +CONFIG_IOMMU_DEFAULT_DMA_STRICT=y +# CONFIG_IOMMU_DEFAULT_DMA_LAZY is not set +# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set +CONFIG_OF_IOMMU=y +CONFIG_IOMMU_DMA=y +CONFIG_IOMMU_SVA=y +# CONFIG_IOMMUFD is not set +CONFIG_ROCKCHIP_IOMMU=y +# CONFIG_SUN50I_IOMMU is not set +CONFIG_TEGRA_IOMMU_SMMU=y +CONFIG_ARM_SMMU=y +# CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS is not set +CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y +CONFIG_ARM_SMMU_QCOM=y +# CONFIG_ARM_SMMU_QCOM_DEBUG is not set +CONFIG_ARM_SMMU_V3=y +CONFIG_ARM_SMMU_V3_SVA=y +CONFIG_QCOM_IOMMU=y +# CONFIG_VIRTIO_IOMMU is not set + +# +# Remoteproc drivers +# +# CONFIG_REMOTEPROC is not set +# end of Remoteproc drivers + +# +# Rpmsg drivers +# +CONFIG_RPMSG=m +# CONFIG_RPMSG_CHAR is not set +# CONFIG_RPMSG_CTRL is not set +# CONFIG_RPMSG_NS is not set +CONFIG_RPMSG_QCOM_GLINK=m +CONFIG_RPMSG_QCOM_GLINK_RPM=m +# CONFIG_RPMSG_VIRTIO is not set +# end of Rpmsg drivers + +# CONFIG_SOUNDWIRE is not set + +# +# SOC (System On Chip) specific Drivers +# + +# +# Amlogic SoC drivers +# +CONFIG_MESON_CANVAS=m +CONFIG_MESON_CLK_MEASURE=y +CONFIG_MESON_GX_SOCINFO=y +CONFIG_MESON_GX_PM_DOMAINS=y +CONFIG_MESON_EE_PM_DOMAINS=y +# end of Amlogic SoC drivers + +# +# Broadcom SoC drivers +# +# CONFIG_SOC_BRCMSTB is not set +# end of Broadcom SoC drivers + +# +# NXP/Freescale QorIQ SoC drivers +# +# CONFIG_QUICC_ENGINE is not set +# CONFIG_FSL_RCPM is not set +# end of NXP/Freescale QorIQ SoC drivers + +# +# fujitsu SoC drivers +# +# CONFIG_A64FX_DIAG is not set +# end of fujitsu SoC drivers + +# +# Hisilicon SoC drivers +# +# CONFIG_KUNPENG_HCCS is not set +# end of Hisilicon SoC drivers + +# +# i.MX SoC drivers +# +# end of i.MX SoC drivers + +# +# Enable LiteX SoC Builder specific drivers +# +# CONFIG_LITEX_SOC_CONTROLLER is not set +# end of Enable LiteX SoC Builder specific drivers + +# CONFIG_WPCM450_SOC is not set + +# +# Qualcomm SoC drivers +# +# CONFIG_QCOM_AOSS_QMP is not set +CONFIG_QCOM_COMMAND_DB=y +# CONFIG_QCOM_CPR is not set +# CONFIG_QCOM_GENI_SE is not set +CONFIG_QCOM_GSBI=m +# CONFIG_QCOM_LLCC is not set +CONFIG_QCOM_KRYO_L2_ACCESSORS=y +CONFIG_QCOM_MDT_LOADER=m +# CONFIG_QCOM_OCMEM is not set +# CONFIG_QCOM_RAMP_CTRL is not set +# CONFIG_QCOM_RMTFS_MEM is not set +# CONFIG_QCOM_RPM_MASTER_STATS is not set +# CONFIG_QCOM_RPMH is not set +# CONFIG_QCOM_RPMPD is not set +CONFIG_QCOM_SMD_RPM=m +# CONFIG_QCOM_SPM is not set +CONFIG_QCOM_WCNSS_CTRL=m +# CONFIG_QCOM_APR is not set +# CONFIG_QCOM_ICC_BWMON is not set +# end of Qualcomm SoC drivers + +CONFIG_ROCKCHIP_GRF=y +CONFIG_ROCKCHIP_IODOMAIN=m +CONFIG_ROCKCHIP_PM_DOMAINS=y +CONFIG_SUNXI_MBUS=y +CONFIG_SUNXI_SRAM=y +# CONFIG_SUN20I_PPU is not set +CONFIG_ARCH_TEGRA_132_SOC=y +CONFIG_ARCH_TEGRA_210_SOC=y +CONFIG_ARCH_TEGRA_186_SOC=y +CONFIG_ARCH_TEGRA_194_SOC=y +# CONFIG_ARCH_TEGRA_234_SOC is not set +CONFIG_SOC_TEGRA_FUSE=y +CONFIG_SOC_TEGRA_FLOWCTRL=y +CONFIG_SOC_TEGRA_PMC=y +CONFIG_SOC_TEGRA_POWERGATE_BPMP=y +CONFIG_SOC_TEGRA_CBB=y +# CONFIG_SOC_TI is not set + +# +# Xilinx SoC drivers +# +CONFIG_ZYNQMP_POWER=y +CONFIG_ZYNQMP_PM_DOMAINS=y +CONFIG_XLNX_EVENT_MANAGER=y +# end of Xilinx SoC drivers +# end of SOC (System On Chip) specific Drivers + +CONFIG_PM_DEVFREQ=y + +# +# DEVFREQ Governors +# +CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=m +# CONFIG_DEVFREQ_GOV_PERFORMANCE is not set +# CONFIG_DEVFREQ_GOV_POWERSAVE is not set +# CONFIG_DEVFREQ_GOV_USERSPACE is not set +# CONFIG_DEVFREQ_GOV_PASSIVE is not set + +# +# DEVFREQ Drivers +# +# CONFIG_ARM_TEGRA_DEVFREQ is not set +CONFIG_ARM_RK3399_DMC_DEVFREQ=m +# CONFIG_ARM_SUN8I_A33_MBUS_DEVFREQ is not set +CONFIG_PM_DEVFREQ_EVENT=y +CONFIG_DEVFREQ_EVENT_ROCKCHIP_DFI=m +CONFIG_EXTCON=y + +# +# Extcon Device Drivers +# +# CONFIG_EXTCON_ADC_JACK is not set +# CONFIG_EXTCON_FSA9480 is not set +# CONFIG_EXTCON_GPIO is not set +# CONFIG_EXTCON_MAX3355 is not set +# CONFIG_EXTCON_PTN5150 is not set +CONFIG_EXTCON_QCOM_SPMI_MISC=m +# CONFIG_EXTCON_RT8973A is not set +# CONFIG_EXTCON_SM5502 is not set +CONFIG_EXTCON_USB_GPIO=m +CONFIG_EXTCON_USBC_CROS_EC=m +CONFIG_MEMORY=y +# CONFIG_ARM_PL172_MPMC is not set +CONFIG_TEGRA_MC=y +# CONFIG_TEGRA210_EMC is not set +CONFIG_IIO=m +CONFIG_IIO_BUFFER=y +# CONFIG_IIO_BUFFER_CB is not set +# CONFIG_IIO_BUFFER_DMA is not set +# CONFIG_IIO_BUFFER_DMAENGINE is not set +# CONFIG_IIO_BUFFER_HW_CONSUMER is not set +CONFIG_IIO_KFIFO_BUF=m +CONFIG_IIO_TRIGGERED_BUFFER=m +# CONFIG_IIO_CONFIGFS is not set +CONFIG_IIO_TRIGGER=y +CONFIG_IIO_CONSUMERS_PER_TRIGGER=2 +# CONFIG_IIO_SW_DEVICE is not set +# CONFIG_IIO_SW_TRIGGER is not set +# CONFIG_IIO_TRIGGERED_EVENT is not set + +# +# Accelerometers +# +# CONFIG_ADIS16201 is not set +# CONFIG_ADIS16209 is not set +# CONFIG_ADXL313_I2C is not set +# CONFIG_ADXL313_SPI is not set +# CONFIG_ADXL345_I2C is not set +# CONFIG_ADXL345_SPI is not set +# CONFIG_ADXL355_I2C is not set +# CONFIG_ADXL355_SPI is not set +# CONFIG_ADXL367_SPI is not set +# CONFIG_ADXL367_I2C is not set +# CONFIG_ADXL372_SPI is not set +# CONFIG_ADXL372_I2C is not set +# CONFIG_BMA180 is not set +# CONFIG_BMA220 is not set +# CONFIG_BMA400 is not set +# CONFIG_BMC150_ACCEL is not set +# CONFIG_BMI088_ACCEL is not set +# CONFIG_DA280 is not set +# CONFIG_DA311 is not set +# CONFIG_DMARD06 is not set +# CONFIG_DMARD09 is not set +# CONFIG_DMARD10 is not set +# CONFIG_FXLS8962AF_I2C is not set +# CONFIG_FXLS8962AF_SPI is not set +CONFIG_HID_SENSOR_ACCEL_3D=m +CONFIG_IIO_CROS_EC_ACCEL_LEGACY=m +# CONFIG_IIO_ST_ACCEL_3AXIS is not set +# CONFIG_IIO_KX022A_SPI is not set +# CONFIG_IIO_KX022A_I2C is not set +# CONFIG_KXSD9 is not set +# CONFIG_KXCJK1013 is not set +# CONFIG_MC3230 is not set +# CONFIG_MMA7455_I2C is not set +# CONFIG_MMA7455_SPI is not set +# CONFIG_MMA7660 is not set +# CONFIG_MMA8452 is not set +# CONFIG_MMA9551 is not set +# CONFIG_MMA9553 is not set +# CONFIG_MSA311 is not set +# CONFIG_MXC4005 is not set +# CONFIG_MXC6255 is not set +# CONFIG_SCA3000 is not set +# CONFIG_SCA3300 is not set +# CONFIG_STK8312 is not set +# CONFIG_STK8BA50 is not set +# end of Accelerometers + +# +# Analog to digital converters +# +# CONFIG_AD4130 is not set +# CONFIG_AD7091R5 is not set +# CONFIG_AD7124 is not set +# CONFIG_AD7192 is not set +# CONFIG_AD7266 is not set +# CONFIG_AD7280 is not set +# CONFIG_AD7291 is not set +# CONFIG_AD7292 is not set +# CONFIG_AD7298 is not set +# CONFIG_AD7476 is not set +# CONFIG_AD7606_IFACE_PARALLEL is not set +# CONFIG_AD7606_IFACE_SPI is not set +# CONFIG_AD7766 is not set +# CONFIG_AD7768_1 is not set +# CONFIG_AD7780 is not set +# CONFIG_AD7791 is not set +# CONFIG_AD7793 is not set +# CONFIG_AD7887 is not set +# CONFIG_AD7923 is not set +# CONFIG_AD7949 is not set +# CONFIG_AD799X is not set +# CONFIG_AD9467 is not set +# CONFIG_ADI_AXI_ADC is not set +CONFIG_AXP20X_ADC=m +CONFIG_AXP288_ADC=m +# CONFIG_CC10001_ADC is not set +# CONFIG_ENVELOPE_DETECTOR is not set +# CONFIG_HI8435 is not set +# CONFIG_HX711 is not set +# CONFIG_INA2XX_ADC is not set +# CONFIG_LTC2471 is not set +# CONFIG_LTC2485 is not set +# CONFIG_LTC2496 is not set +# CONFIG_LTC2497 is not set +# CONFIG_MAX1027 is not set +# CONFIG_MAX11100 is not set +# CONFIG_MAX1118 is not set +# CONFIG_MAX11205 is not set +# CONFIG_MAX11410 is not set +# CONFIG_MAX1241 is not set +# CONFIG_MAX1363 is not set +# CONFIG_MAX9611 is not set +# CONFIG_MCP320X is not set +# CONFIG_MCP3422 is not set +# CONFIG_MCP3911 is not set +CONFIG_MESON_SARADC=m +# CONFIG_NAU7802 is not set +CONFIG_QCOM_VADC_COMMON=m +# CONFIG_QCOM_SPMI_RRADC is not set +CONFIG_QCOM_SPMI_IADC=m +CONFIG_QCOM_SPMI_VADC=m +# CONFIG_QCOM_SPMI_ADC5 is not set +CONFIG_ROCKCHIP_SARADC=m +# CONFIG_RICHTEK_RTQ6056 is not set +# CONFIG_SD_ADC_MODULATOR is not set +# CONFIG_SUN20I_GPADC is not set +# CONFIG_TI_ADC081C is not set +# CONFIG_TI_ADC0832 is not set +# CONFIG_TI_ADC084S021 is not set +# CONFIG_TI_ADC12138 is not set +# CONFIG_TI_ADC108S102 is not set +# CONFIG_TI_ADC128S052 is not set +# CONFIG_TI_ADC161S626 is not set +# CONFIG_TI_ADS1015 is not set +# CONFIG_TI_ADS7924 is not set +# CONFIG_TI_ADS1100 is not set +# CONFIG_TI_ADS7950 is not set +# CONFIG_TI_ADS8344 is not set +# CONFIG_TI_ADS8688 is not set +# CONFIG_TI_ADS124S08 is not set +# CONFIG_TI_ADS131E08 is not set +# CONFIG_TI_LMP92064 is not set +# CONFIG_TI_TLC4541 is not set +# CONFIG_TI_TSC2046 is not set +# CONFIG_VF610_ADC is not set +CONFIG_VIPERBOARD_ADC=m +# CONFIG_XILINX_XADC is not set +# CONFIG_XILINX_AMS is not set +# end of Analog to digital converters + +# +# Analog to digital and digital to analog converters +# +# CONFIG_AD74115 is not set +# CONFIG_AD74413R is not set +# end of Analog to digital and digital to analog converters + +# +# Analog Front Ends +# +# CONFIG_IIO_RESCALE is not set +# end of Analog Front Ends + +# +# Amplifiers +# +# CONFIG_AD8366 is not set +# CONFIG_ADA4250 is not set +# CONFIG_HMC425 is not set +# end of Amplifiers + +# +# Capacitance to digital converters +# +# CONFIG_AD7150 is not set +# CONFIG_AD7746 is not set +# end of Capacitance to digital converters + +# +# Chemical Sensors +# +# CONFIG_ATLAS_PH_SENSOR is not set +# CONFIG_ATLAS_EZO_SENSOR is not set +# CONFIG_BME680 is not set +# CONFIG_CCS811 is not set +# CONFIG_IAQCORE is not set +# CONFIG_PMS7003 is not set +# CONFIG_SCD30_CORE is not set +# CONFIG_SCD4X is not set +# CONFIG_SENSIRION_SGP30 is not set +# CONFIG_SENSIRION_SGP40 is not set +# CONFIG_SPS30_I2C is not set +# CONFIG_SPS30_SERIAL is not set +# CONFIG_SENSEAIR_SUNRISE_CO2 is not set +# CONFIG_VZ89X is not set +# end of Chemical Sensors + +CONFIG_IIO_CROS_EC_SENSORS_CORE=m +CONFIG_IIO_CROS_EC_SENSORS=m +# CONFIG_IIO_CROS_EC_SENSORS_LID_ANGLE is not set + +# +# Hid Sensor IIO Common +# +CONFIG_HID_SENSOR_IIO_COMMON=m +CONFIG_HID_SENSOR_IIO_TRIGGER=m +# end of Hid Sensor IIO Common + +# +# IIO SCMI Sensors +# +# end of IIO SCMI Sensors + +# +# SSP Sensor Common +# +# CONFIG_IIO_SSP_SENSORHUB is not set +# end of SSP Sensor Common + +# +# Digital to analog converters +# +# CONFIG_AD3552R is not set +# CONFIG_AD5064 is not set +# CONFIG_AD5360 is not set +# CONFIG_AD5380 is not set +# CONFIG_AD5421 is not set +CONFIG_AD5446=m +# CONFIG_AD5449 is not set +# CONFIG_AD5592R is not set +# CONFIG_AD5593R is not set +# CONFIG_AD5504 is not set +# CONFIG_AD5624R_SPI is not set +# CONFIG_LTC2688 is not set +# CONFIG_AD5686_SPI is not set +# CONFIG_AD5696_I2C is not set +# CONFIG_AD5755 is not set +# CONFIG_AD5758 is not set +# CONFIG_AD5761 is not set +# CONFIG_AD5764 is not set +# CONFIG_AD5766 is not set +# CONFIG_AD5770R is not set +# CONFIG_AD5791 is not set +# CONFIG_AD7293 is not set +# CONFIG_AD7303 is not set +# CONFIG_AD8801 is not set +# CONFIG_DPOT_DAC is not set +# CONFIG_DS4424 is not set +# CONFIG_LTC1660 is not set +# CONFIG_LTC2632 is not set +# CONFIG_M62332 is not set +# CONFIG_MAX517 is not set +# CONFIG_MAX5522 is not set +# CONFIG_MAX5821 is not set +# CONFIG_MCP4725 is not set +# CONFIG_MCP4728 is not set +# CONFIG_MCP4922 is not set +# CONFIG_TI_DAC082S085 is not set +# CONFIG_TI_DAC5571 is not set +# CONFIG_TI_DAC7311 is not set +# CONFIG_TI_DAC7612 is not set +# CONFIG_VF610_DAC is not set +# end of Digital to analog converters + +# +# IIO dummy driver +# +# end of IIO dummy driver + +# +# Filters +# +# CONFIG_ADMV8818 is not set +# end of Filters + +# +# Frequency Synthesizers DDS/PLL +# + +# +# Clock Generator/Distribution +# +# CONFIG_AD9523 is not set +# end of Clock Generator/Distribution + +# +# Phase-Locked Loop (PLL) frequency synthesizers +# +# CONFIG_ADF4350 is not set +# CONFIG_ADF4371 is not set +# CONFIG_ADF4377 is not set +# CONFIG_ADMV1013 is not set +# CONFIG_ADMV1014 is not set +# CONFIG_ADMV4420 is not set +# CONFIG_ADRF6780 is not set +# end of Phase-Locked Loop (PLL) frequency synthesizers +# end of Frequency Synthesizers DDS/PLL + +# +# Digital gyroscope sensors +# +# CONFIG_ADIS16080 is not set +# CONFIG_ADIS16130 is not set +# CONFIG_ADIS16136 is not set +# CONFIG_ADIS16260 is not set +# CONFIG_ADXRS290 is not set +# CONFIG_ADXRS450 is not set +# CONFIG_BMG160 is not set +# CONFIG_FXAS21002C is not set +CONFIG_HID_SENSOR_GYRO_3D=m +# CONFIG_MPU3050_I2C is not set +# CONFIG_IIO_ST_GYRO_3AXIS is not set +# CONFIG_ITG3200 is not set +# end of Digital gyroscope sensors + +# +# Health Sensors +# + +# +# Heart Rate Monitors +# +# CONFIG_AFE4403 is not set +# CONFIG_AFE4404 is not set +# CONFIG_MAX30100 is not set +# CONFIG_MAX30102 is not set +# end of Heart Rate Monitors +# end of Health Sensors + +# +# Humidity sensors +# +# CONFIG_AM2315 is not set +CONFIG_DHT11=m +# CONFIG_HDC100X is not set +# CONFIG_HDC2010 is not set +# CONFIG_HID_SENSOR_HUMIDITY is not set +# CONFIG_HTS221 is not set +# CONFIG_HTU21 is not set +# CONFIG_SI7005 is not set +# CONFIG_SI7020 is not set +# end of Humidity sensors + +# +# Inertial measurement units +# +# CONFIG_ADIS16400 is not set +# CONFIG_ADIS16460 is not set +# CONFIG_ADIS16475 is not set +# CONFIG_ADIS16480 is not set +# CONFIG_BMI160_I2C is not set +# CONFIG_BMI160_SPI is not set +# CONFIG_BOSCH_BNO055_SERIAL is not set +# CONFIG_BOSCH_BNO055_I2C is not set +# CONFIG_FXOS8700_I2C is not set +# CONFIG_FXOS8700_SPI is not set +# CONFIG_KMX61 is not set +# CONFIG_INV_ICM42600_I2C is not set +# CONFIG_INV_ICM42600_SPI is not set +# CONFIG_INV_MPU6050_I2C is not set +# CONFIG_INV_MPU6050_SPI is not set +# CONFIG_IIO_ST_LSM6DSX is not set +# CONFIG_IIO_ST_LSM9DS0 is not set +# end of Inertial measurement units + +# +# Light sensors +# +CONFIG_ACPI_ALS=m +# CONFIG_ADJD_S311 is not set +# CONFIG_ADUX1020 is not set +# CONFIG_AL3010 is not set +# CONFIG_AL3320A is not set +# CONFIG_APDS9300 is not set +# CONFIG_APDS9960 is not set +# CONFIG_AS73211 is not set +# CONFIG_BH1750 is not set +CONFIG_BH1780=m +# CONFIG_CM32181 is not set +# CONFIG_CM3232 is not set +# CONFIG_CM3323 is not set +# CONFIG_CM3605 is not set +# CONFIG_CM36651 is not set +CONFIG_IIO_CROS_EC_LIGHT_PROX=m +# CONFIG_GP2AP002 is not set +# CONFIG_GP2AP020A00F is not set +# CONFIG_SENSORS_ISL29018 is not set +# CONFIG_SENSORS_ISL29028 is not set +# CONFIG_ISL29125 is not set +CONFIG_HID_SENSOR_ALS=m +CONFIG_HID_SENSOR_PROX=m +# CONFIG_JSA1212 is not set +# CONFIG_ROHM_BU27008 is not set +# CONFIG_ROHM_BU27034 is not set +# CONFIG_RPR0521 is not set +# CONFIG_LTR501 is not set +# CONFIG_LTRF216A is not set +# CONFIG_LV0104CS is not set +# CONFIG_MAX44000 is not set +# CONFIG_MAX44009 is not set +# CONFIG_NOA1305 is not set +# CONFIG_OPT3001 is not set +# CONFIG_OPT4001 is not set +# CONFIG_PA12203001 is not set +# CONFIG_SI1133 is not set +# CONFIG_SI1145 is not set +# CONFIG_STK3310 is not set +# CONFIG_ST_UVIS25 is not set +# CONFIG_TCS3414 is not set +# CONFIG_TCS3472 is not set +# CONFIG_SENSORS_TSL2563 is not set +# CONFIG_TSL2583 is not set +# CONFIG_TSL2591 is not set +# CONFIG_TSL2772 is not set +# CONFIG_TSL4531 is not set +# CONFIG_US5182D is not set +# CONFIG_VCNL4000 is not set +# CONFIG_VCNL4035 is not set +# CONFIG_VEML6030 is not set +# CONFIG_VEML6070 is not set +# CONFIG_VL6180 is not set +# CONFIG_ZOPT2201 is not set +# end of Light sensors + +# +# Magnetometer sensors +# +# CONFIG_AK8974 is not set +# CONFIG_AK8975 is not set +# CONFIG_AK09911 is not set +# CONFIG_BMC150_MAGN_I2C is not set +# CONFIG_BMC150_MAGN_SPI is not set +# CONFIG_MAG3110 is not set +CONFIG_HID_SENSOR_MAGNETOMETER_3D=m +# CONFIG_MMC35240 is not set +# CONFIG_IIO_ST_MAGN_3AXIS is not set +# CONFIG_SENSORS_HMC5843_I2C is not set +# CONFIG_SENSORS_HMC5843_SPI is not set +# CONFIG_SENSORS_RM3100_I2C is not set +# CONFIG_SENSORS_RM3100_SPI is not set +# CONFIG_TI_TMAG5273 is not set +# CONFIG_YAMAHA_YAS530 is not set +# end of Magnetometer sensors + +# +# Multiplexers +# +# CONFIG_IIO_MUX is not set +# end of Multiplexers + +# +# Inclinometer sensors +# +CONFIG_HID_SENSOR_INCLINOMETER_3D=m +CONFIG_HID_SENSOR_DEVICE_ROTATION=m +# end of Inclinometer sensors + +# +# Triggers - standalone +# +# CONFIG_IIO_INTERRUPT_TRIGGER is not set +# CONFIG_IIO_SYSFS_TRIGGER is not set +# end of Triggers - standalone + +# +# Linear and angular position sensors +# +# CONFIG_HID_SENSOR_CUSTOM_INTEL_HINGE is not set +# end of Linear and angular position sensors + +# +# Digital potentiometers +# +# CONFIG_AD5110 is not set +# CONFIG_AD5272 is not set +# CONFIG_DS1803 is not set +# CONFIG_MAX5432 is not set +# CONFIG_MAX5481 is not set +# CONFIG_MAX5487 is not set +# CONFIG_MCP4018 is not set +# CONFIG_MCP4131 is not set +# CONFIG_MCP4531 is not set +# CONFIG_MCP41010 is not set +# CONFIG_TPL0102 is not set +# CONFIG_X9250 is not set +# end of Digital potentiometers + +# +# Digital potentiostats +# +# CONFIG_LMP91000 is not set +# end of Digital potentiostats + +# +# Pressure sensors +# +# CONFIG_ABP060MG is not set +# CONFIG_BMP280 is not set +CONFIG_IIO_CROS_EC_BARO=m +# CONFIG_DLHL60D is not set +# CONFIG_DPS310 is not set +CONFIG_HID_SENSOR_PRESS=m +# CONFIG_HP03 is not set +# CONFIG_ICP10100 is not set +# CONFIG_MPL115_I2C is not set +# CONFIG_MPL115_SPI is not set +# CONFIG_MPL3115 is not set +# CONFIG_MPRLS0025PA is not set +# CONFIG_MS5611 is not set +# CONFIG_MS5637 is not set +# CONFIG_IIO_ST_PRESS is not set +# CONFIG_T5403 is not set +# CONFIG_HP206C is not set +# CONFIG_ZPA2326 is not set +# end of Pressure sensors + +# +# Lightning sensors +# +# CONFIG_AS3935 is not set +# end of Lightning sensors + +# +# Proximity and distance sensors +# +# CONFIG_CROS_EC_MKBP_PROXIMITY is not set +# CONFIG_IRSD200 is not set +# CONFIG_ISL29501 is not set +# CONFIG_LIDAR_LITE_V2 is not set +# CONFIG_MB1232 is not set +# CONFIG_PING is not set +# CONFIG_RFD77402 is not set +# CONFIG_SRF04 is not set +# CONFIG_SX9310 is not set +# CONFIG_SX9324 is not set +# CONFIG_SX9360 is not set +# CONFIG_SX9500 is not set +# CONFIG_SRF08 is not set +# CONFIG_VCNL3020 is not set +# CONFIG_VL53L0X_I2C is not set +# end of Proximity and distance sensors + +# +# Resolver to digital converters +# +# CONFIG_AD2S90 is not set +# CONFIG_AD2S1200 is not set +# end of Resolver to digital converters + +# +# Temperature sensors +# +# CONFIG_LTC2983 is not set +# CONFIG_MAXIM_THERMOCOUPLE is not set +# CONFIG_HID_SENSOR_TEMP is not set +# CONFIG_MLX90614 is not set +# CONFIG_MLX90632 is not set +# CONFIG_TMP006 is not set +# CONFIG_TMP007 is not set +# CONFIG_TMP117 is not set +# CONFIG_TSYS01 is not set +# CONFIG_TSYS02D is not set +# CONFIG_MAX30208 is not set +# CONFIG_MAX31856 is not set +# CONFIG_MAX31865 is not set +# end of Temperature sensors + +# CONFIG_NTB is not set +CONFIG_PWM=y +CONFIG_PWM_SYSFS=y +# CONFIG_PWM_DEBUG is not set +# CONFIG_PWM_ATMEL_TCB is not set +# CONFIG_PWM_CLK is not set +CONFIG_PWM_CROS_EC=m +# CONFIG_PWM_DWC is not set +# CONFIG_PWM_FSL_FTM is not set +# CONFIG_PWM_HIBVT is not set +CONFIG_PWM_MESON=m +# CONFIG_PWM_PCA9685 is not set +CONFIG_PWM_ROCKCHIP=m +CONFIG_PWM_SUN4I=m +CONFIG_PWM_TEGRA=m +# CONFIG_PWM_XILINX is not set + +# +# IRQ chip support +# +CONFIG_IRQCHIP=y +CONFIG_ARM_GIC=y +CONFIG_ARM_GIC_PM=y +CONFIG_ARM_GIC_MAX_NR=1 +CONFIG_ARM_GIC_V2M=y +CONFIG_ARM_GIC_V3=y +CONFIG_ARM_GIC_V3_ITS=y +CONFIG_ARM_GIC_V3_ITS_PCI=y +# CONFIG_AL_FIC is not set +CONFIG_HISILICON_IRQ_MBIGEN=y +CONFIG_SUN6I_R_INTC=y +CONFIG_SUNXI_NMI_INTC=y +# CONFIG_XILINX_INTC is not set +CONFIG_MVEBU_GICP=y +CONFIG_MVEBU_ICU=y +CONFIG_MVEBU_ODMI=y +CONFIG_MVEBU_PIC=y +CONFIG_MVEBU_SEI=y +CONFIG_PARTITION_PERCPU=y +CONFIG_QCOM_IRQ_COMBINER=y +CONFIG_MESON_IRQ_GPIO=y +# CONFIG_QCOM_PDC is not set +# CONFIG_QCOM_MPM is not set +# end of IRQ chip support + +# CONFIG_IPACK_BUS is not set +CONFIG_ARCH_HAS_RESET_CONTROLLER=y +CONFIG_RESET_CONTROLLER=y +CONFIG_RESET_MESON=y +# CONFIG_RESET_MESON_AUDIO_ARB is not set +# CONFIG_RESET_QCOM_AOSS is not set +# CONFIG_RESET_QCOM_PDC is not set +CONFIG_RESET_SIMPLE=y +CONFIG_RESET_SUNXI=y +# CONFIG_RESET_TI_SYSCON is not set +# CONFIG_RESET_TI_TPS380X is not set +CONFIG_COMMON_RESET_HI3660=y +CONFIG_COMMON_RESET_HI6220=y +CONFIG_RESET_TEGRA_BPMP=y + +# +# PHY Subsystem +# +CONFIG_GENERIC_PHY=y +CONFIG_GENERIC_PHY_MIPI_DPHY=y +CONFIG_PHY_XGENE=m +# CONFIG_PHY_CAN_TRANSCEIVER is not set +CONFIG_PHY_SUN4I_USB=m +# CONFIG_PHY_SUN6I_MIPI_DPHY is not set +# CONFIG_PHY_SUN9I_USB is not set +# CONFIG_PHY_SUN50I_USB3 is not set +CONFIG_PHY_MESON8B_USB2=m +CONFIG_PHY_MESON_GXL_USB2=y +CONFIG_PHY_MESON_G12A_MIPI_DPHY_ANALOG=y +CONFIG_PHY_MESON_G12A_USB2=y +CONFIG_PHY_MESON_G12A_USB3_PCIE=y +CONFIG_PHY_MESON_AXG_PCIE=y +CONFIG_PHY_MESON_AXG_MIPI_PCIE_ANALOG=y +CONFIG_PHY_MESON_AXG_MIPI_DPHY=y + +# +# PHY drivers for Broadcom platforms +# +# CONFIG_BCM_KONA_USB2_PHY is not set +# end of PHY drivers for Broadcom platforms + +# CONFIG_PHY_CADENCE_TORRENT is not set +# CONFIG_PHY_CADENCE_DPHY is not set +# CONFIG_PHY_CADENCE_DPHY_RX is not set +# CONFIG_PHY_CADENCE_SIERRA is not set +# CONFIG_PHY_CADENCE_SALVO is not set +CONFIG_PHY_HI6220_USB=m +# CONFIG_PHY_HI3660_USB is not set +# CONFIG_PHY_HI3670_USB is not set +# CONFIG_PHY_HI3670_PCIE is not set +# CONFIG_PHY_HISTB_COMBPHY is not set +# CONFIG_PHY_HISI_INNO_USB2 is not set +CONFIG_PHY_MVEBU_A3700_COMPHY=y +CONFIG_PHY_MVEBU_A3700_UTMI=y +# CONFIG_PHY_MVEBU_A38X_COMPHY is not set +CONFIG_PHY_MVEBU_CP110_COMPHY=m +# CONFIG_PHY_MVEBU_CP110_UTMI is not set +# CONFIG_PHY_PXA_28NM_HSIC is not set +# CONFIG_PHY_PXA_28NM_USB2 is not set +# CONFIG_PHY_LAN966X_SERDES is not set +# CONFIG_PHY_CPCAP_USB is not set +# CONFIG_PHY_MAPPHONE_MDM6600 is not set +# CONFIG_PHY_OCELOT_SERDES is not set +CONFIG_PHY_QCOM_APQ8064_SATA=m +# CONFIG_PHY_QCOM_EDP is not set +# CONFIG_PHY_QCOM_IPQ4019_USB is not set +CONFIG_PHY_QCOM_IPQ806X_SATA=m +# CONFIG_PHY_QCOM_PCIE2 is not set +CONFIG_PHY_QCOM_QMP=m +CONFIG_PHY_QCOM_QMP_COMBO=m +CONFIG_PHY_QCOM_QMP_PCIE=m +CONFIG_PHY_QCOM_QMP_PCIE_8996=m +CONFIG_PHY_QCOM_QMP_UFS=m +CONFIG_PHY_QCOM_QMP_USB=m +# CONFIG_PHY_QCOM_QMP_USB_LEGACY is not set +CONFIG_PHY_QCOM_QUSB2=m +# CONFIG_PHY_QCOM_SNPS_EUSB2 is not set +# CONFIG_PHY_QCOM_EUSB2_REPEATER is not set +# CONFIG_PHY_QCOM_M31_USB is not set +CONFIG_PHY_QCOM_USB_HS=m +# CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2 is not set +CONFIG_PHY_QCOM_USB_HSIC=m +# CONFIG_PHY_QCOM_USB_HS_28NM is not set +# CONFIG_PHY_QCOM_USB_SS is not set +# CONFIG_PHY_QCOM_IPQ806X_USB is not set +# CONFIG_PHY_QCOM_SGMII_ETH is not set +CONFIG_PHY_ROCKCHIP_DP=m +# CONFIG_PHY_ROCKCHIP_DPHY_RX0 is not set +CONFIG_PHY_ROCKCHIP_EMMC=m +CONFIG_PHY_ROCKCHIP_INNO_HDMI=m +CONFIG_PHY_ROCKCHIP_INNO_USB2=m +# CONFIG_PHY_ROCKCHIP_INNO_CSIDPHY is not set +# CONFIG_PHY_ROCKCHIP_INNO_DSIDPHY is not set +# CONFIG_PHY_ROCKCHIP_NANENG_COMBO_PHY is not set +CONFIG_PHY_ROCKCHIP_PCIE=m +# CONFIG_PHY_ROCKCHIP_SNPS_PCIE3 is not set +CONFIG_PHY_ROCKCHIP_TYPEC=m +CONFIG_PHY_ROCKCHIP_USB=m +# CONFIG_PHY_SAMSUNG_USB2 is not set +CONFIG_PHY_TEGRA_XUSB=m +# CONFIG_PHY_TEGRA194_P2U is not set +# CONFIG_PHY_TUSB1210 is not set +# CONFIG_PHY_XILINX_ZYNQMP is not set +# end of PHY Subsystem + +# CONFIG_POWERCAP is not set +# CONFIG_MCB is not set + +# +# Performance monitor support +# +CONFIG_ARM_CCI_PMU=y +CONFIG_ARM_CCI400_PMU=y +CONFIG_ARM_CCI5xx_PMU=y +CONFIG_ARM_CCN=y +CONFIG_ARM_CMN=m +CONFIG_ARM_PMU=y +CONFIG_ARM_PMU_ACPI=y +CONFIG_ARM_SMMU_V3_PMU=y +CONFIG_ARM_PMUV3=y +CONFIG_ARM_DSU_PMU=m +CONFIG_QCOM_L2_PMU=y +CONFIG_QCOM_L3_PMU=y +CONFIG_THUNDERX2_PMU=m +CONFIG_XGENE_PMU=y +CONFIG_ARM_SPE_PMU=m +CONFIG_ARM_DMC620_PMU=y +# CONFIG_MARVELL_CN10K_TAD_PMU is not set +# CONFIG_ALIBABA_UNCORE_DRW_PMU is not set +CONFIG_HISI_PMU=y +# CONFIG_HISI_PCIE_PMU is not set +# CONFIG_HNS3_PMU is not set +# CONFIG_MARVELL_CN10K_DDR_PMU is not set +CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m +CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m +# CONFIG_MESON_DDR_PMU is not set +# end of Performance monitor support + +CONFIG_RAS=y +# CONFIG_USB4 is not set + +# +# Android +# +# CONFIG_ANDROID_BINDER_IPC is not set +# end of Android + +CONFIG_LIBNVDIMM=m +CONFIG_BLK_DEV_PMEM=m +CONFIG_ND_CLAIM=y +CONFIG_ND_BTT=m +CONFIG_BTT=y +CONFIG_ND_PFN=m +CONFIG_NVDIMM_PFN=y +CONFIG_NVDIMM_DAX=y +CONFIG_OF_PMEM=m +CONFIG_DAX=y +CONFIG_DEV_DAX=m +CONFIG_DEV_DAX_PMEM=m +CONFIG_DEV_DAX_KMEM=m +CONFIG_NVMEM=y +CONFIG_NVMEM_SYSFS=y + +# +# Layout Types +# +# CONFIG_NVMEM_LAYOUT_SL28_VPD is not set +# CONFIG_NVMEM_LAYOUT_ONIE_TLV is not set +# end of Layout Types + +# CONFIG_NVMEM_MESON_MX_EFUSE is not set +# CONFIG_NVMEM_QCOM_QFPROM is not set +# CONFIG_NVMEM_QCOM_SEC_QFPROM is not set +# CONFIG_NVMEM_RMEM is not set +# CONFIG_NVMEM_ROCKCHIP_EFUSE is not set +# CONFIG_NVMEM_ROCKCHIP_OTP is not set +# CONFIG_NVMEM_SPMI_SDAM is not set +CONFIG_NVMEM_SUNXI_SID=m +# CONFIG_NVMEM_U_BOOT_ENV is not set +# CONFIG_NVMEM_ZYNQMP is not set + +# +# HW tracing support +# +# CONFIG_STM is not set +# CONFIG_INTEL_TH is not set +# CONFIG_HISI_PTT is not set +# end of HW tracing support + +# CONFIG_FPGA is not set +# CONFIG_FSI is not set +CONFIG_TEE=m +CONFIG_OPTEE=m +# CONFIG_OPTEE_INSECURE_LOAD_IMAGE is not set +CONFIG_PM_OPP=y +# CONFIG_SIOX is not set +# CONFIG_SLIMBUS is not set +CONFIG_INTERCONNECT=y +# CONFIG_INTERCONNECT_QCOM is not set +# CONFIG_COUNTER is not set +# CONFIG_MOST is not set +# CONFIG_PECI is not set +# CONFIG_HTE is not set +# CONFIG_CDX_BUS is not set +# end of Device Drivers + +# +# File systems +# +CONFIG_DCACHE_WORD_ACCESS=y +# CONFIG_VALIDATE_FS_PARSER is not set +CONFIG_FS_IOMAP=y +CONFIG_BUFFER_HEAD=y +CONFIG_LEGACY_DIRECT_IO=y +# CONFIG_EXT2_FS is not set +# CONFIG_EXT3_FS is not set +CONFIG_EXT4_FS=m +CONFIG_EXT4_USE_FOR_EXT2=y +CONFIG_EXT4_FS_POSIX_ACL=y +CONFIG_EXT4_FS_SECURITY=y +# CONFIG_EXT4_DEBUG is not set +CONFIG_JBD2=m +# CONFIG_JBD2_DEBUG is not set +CONFIG_FS_MBCACHE=m +CONFIG_REISERFS_FS=m +# CONFIG_REISERFS_CHECK is not set +# CONFIG_REISERFS_PROC_INFO is not set +CONFIG_REISERFS_FS_XATTR=y +CONFIG_REISERFS_FS_POSIX_ACL=y +CONFIG_REISERFS_FS_SECURITY=y +CONFIG_JFS_FS=m +CONFIG_JFS_POSIX_ACL=y +CONFIG_JFS_SECURITY=y +# CONFIG_JFS_DEBUG is not set +# CONFIG_JFS_STATISTICS is not set +CONFIG_XFS_FS=m +CONFIG_XFS_SUPPORT_V4=y +CONFIG_XFS_SUPPORT_ASCII_CI=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +CONFIG_XFS_RT=y +# CONFIG_XFS_ONLINE_SCRUB is not set +# CONFIG_XFS_WARN is not set +# CONFIG_XFS_DEBUG is not set +CONFIG_GFS2_FS=m +CONFIG_GFS2_FS_LOCKING_DLM=y +CONFIG_OCFS2_FS=m +CONFIG_OCFS2_FS_O2CB=m +CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m +CONFIG_OCFS2_FS_STATS=y +CONFIG_OCFS2_DEBUG_MASKLOG=y +# CONFIG_OCFS2_DEBUG_FS is not set +CONFIG_BTRFS_FS=m +CONFIG_BTRFS_FS_POSIX_ACL=y +# CONFIG_BTRFS_FS_CHECK_INTEGRITY is not set +# CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set +# CONFIG_BTRFS_DEBUG is not set +# CONFIG_BTRFS_ASSERT is not set +# CONFIG_BTRFS_FS_REF_VERIFY is not set +CONFIG_NILFS2_FS=m +CONFIG_F2FS_FS=m +CONFIG_F2FS_STAT_FS=y +CONFIG_F2FS_FS_XATTR=y +CONFIG_F2FS_FS_POSIX_ACL=y +CONFIG_F2FS_FS_SECURITY=y +# CONFIG_F2FS_CHECK_FS is not set +# CONFIG_F2FS_FAULT_INJECTION is not set +# CONFIG_F2FS_FS_COMPRESSION is not set +CONFIG_F2FS_IOSTAT=y +# CONFIG_F2FS_UNFAIR_RWSEM is not set +# CONFIG_ZONEFS_FS is not set +# CONFIG_FS_DAX is not set +CONFIG_FS_POSIX_ACL=y +CONFIG_EXPORTFS=y +CONFIG_EXPORTFS_BLOCK_OPS=y +CONFIG_FILE_LOCKING=y +# CONFIG_FS_ENCRYPTION is not set +# CONFIG_FS_VERITY is not set +CONFIG_FSNOTIFY=y +CONFIG_DNOTIFY=y +CONFIG_INOTIFY_USER=y +CONFIG_FANOTIFY=y +CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y +CONFIG_QUOTA=y +CONFIG_QUOTA_NETLINK_INTERFACE=y +# CONFIG_QUOTA_DEBUG is not set +CONFIG_QUOTA_TREE=m +CONFIG_QFMT_V1=m +CONFIG_QFMT_V2=m +CONFIG_QUOTACTL=y +CONFIG_AUTOFS_FS=m +CONFIG_FUSE_FS=m +CONFIG_CUSE=m +# CONFIG_VIRTIO_FS is not set +CONFIG_OVERLAY_FS=m +# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set +CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y +# CONFIG_OVERLAY_FS_INDEX is not set +# CONFIG_OVERLAY_FS_XINO_AUTO is not set +# CONFIG_OVERLAY_FS_METACOPY is not set +# CONFIG_OVERLAY_FS_DEBUG is not set + +# +# Caches +# +CONFIG_NETFS_SUPPORT=m +CONFIG_NETFS_STATS=y +CONFIG_FSCACHE=m +CONFIG_FSCACHE_STATS=y +# CONFIG_FSCACHE_DEBUG is not set +CONFIG_CACHEFILES=m +# CONFIG_CACHEFILES_DEBUG is not set +# CONFIG_CACHEFILES_ERROR_INJECTION is not set +# CONFIG_CACHEFILES_ONDEMAND is not set +# end of Caches + +# +# CD-ROM/DVD Filesystems +# +CONFIG_ISO9660_FS=m +CONFIG_JOLIET=y +CONFIG_ZISOFS=y +CONFIG_UDF_FS=m +# end of CD-ROM/DVD Filesystems + +# +# DOS/FAT/EXFAT/NT Filesystems +# +CONFIG_FAT_FS=y +CONFIG_MSDOS_FS=m +CONFIG_VFAT_FS=y +CONFIG_FAT_DEFAULT_CODEPAGE=437 +CONFIG_FAT_DEFAULT_IOCHARSET="ascii" +CONFIG_FAT_DEFAULT_UTF8=y +# CONFIG_EXFAT_FS is not set +CONFIG_NTFS_FS=m +# CONFIG_NTFS_DEBUG is not set +# CONFIG_NTFS3_FS is not set +# end of DOS/FAT/EXFAT/NT Filesystems + +# +# Pseudo filesystems +# +CONFIG_PROC_FS=y +CONFIG_PROC_KCORE=y +CONFIG_PROC_VMCORE=y +# CONFIG_PROC_VMCORE_DEVICE_DUMP is not set +CONFIG_PROC_SYSCTL=y +CONFIG_PROC_PAGE_MONITOR=y +CONFIG_PROC_CHILDREN=y +CONFIG_KERNFS=y +CONFIG_SYSFS=y +CONFIG_TMPFS=y +CONFIG_TMPFS_POSIX_ACL=y +CONFIG_TMPFS_XATTR=y +# CONFIG_TMPFS_INODE64 is not set +# CONFIG_TMPFS_QUOTA is not set +CONFIG_ARCH_SUPPORTS_HUGETLBFS=y +CONFIG_HUGETLBFS=y +CONFIG_HUGETLB_PAGE=y +CONFIG_ARCH_HAS_GIGANTIC_PAGE=y +CONFIG_CONFIGFS_FS=m +CONFIG_EFIVAR_FS=m +# end of Pseudo filesystems + +CONFIG_MISC_FILESYSTEMS=y +CONFIG_ORANGEFS_FS=m +CONFIG_ADFS_FS=m +# CONFIG_ADFS_FS_RW is not set +CONFIG_AFFS_FS=m +CONFIG_ECRYPT_FS=m +CONFIG_ECRYPT_FS_MESSAGING=y +CONFIG_HFS_FS=m +CONFIG_HFSPLUS_FS=m +CONFIG_BEFS_FS=m +# CONFIG_BEFS_DEBUG is not set +CONFIG_BFS_FS=m +CONFIG_EFS_FS=m +CONFIG_JFFS2_FS=m +CONFIG_JFFS2_FS_DEBUG=0 +CONFIG_JFFS2_FS_WRITEBUFFER=y +# CONFIG_JFFS2_FS_WBUF_VERIFY is not set +CONFIG_JFFS2_SUMMARY=y +CONFIG_JFFS2_FS_XATTR=y +CONFIG_JFFS2_FS_POSIX_ACL=y +CONFIG_JFFS2_FS_SECURITY=y +CONFIG_JFFS2_COMPRESSION_OPTIONS=y +CONFIG_JFFS2_ZLIB=y +CONFIG_JFFS2_LZO=y +CONFIG_JFFS2_RTIME=y +# CONFIG_JFFS2_RUBIN is not set +# CONFIG_JFFS2_CMODE_NONE is not set +CONFIG_JFFS2_CMODE_PRIORITY=y +# CONFIG_JFFS2_CMODE_SIZE is not set +# CONFIG_JFFS2_CMODE_FAVOURLZO is not set +CONFIG_UBIFS_FS=m +CONFIG_UBIFS_FS_ADVANCED_COMPR=y +CONFIG_UBIFS_FS_LZO=y +CONFIG_UBIFS_FS_ZLIB=y +CONFIG_UBIFS_FS_ZSTD=y +# CONFIG_UBIFS_ATIME_SUPPORT is not set +CONFIG_UBIFS_FS_XATTR=y +CONFIG_UBIFS_FS_SECURITY=y +# CONFIG_UBIFS_FS_AUTHENTICATION is not set +# CONFIG_CRAMFS is not set +CONFIG_SQUASHFS=m +CONFIG_SQUASHFS_FILE_CACHE=y +# CONFIG_SQUASHFS_FILE_DIRECT is not set +CONFIG_SQUASHFS_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_CHOICE_DECOMP_BY_MOUNT is not set +CONFIG_SQUASHFS_COMPILE_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI is not set +# CONFIG_SQUASHFS_COMPILE_DECOMP_MULTI_PERCPU is not set +CONFIG_SQUASHFS_XATTR=y +CONFIG_SQUASHFS_ZLIB=y +CONFIG_SQUASHFS_LZ4=y +CONFIG_SQUASHFS_LZO=y +CONFIG_SQUASHFS_XZ=y +CONFIG_SQUASHFS_ZSTD=y +# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set +# CONFIG_SQUASHFS_EMBEDDED is not set +CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3 +CONFIG_VXFS_FS=m +CONFIG_MINIX_FS=m +CONFIG_OMFS_FS=m +CONFIG_HPFS_FS=m +CONFIG_QNX4FS_FS=m +CONFIG_QNX6FS_FS=m +# CONFIG_QNX6FS_DEBUG is not set +CONFIG_ROMFS_FS=m +# CONFIG_ROMFS_BACKED_BY_BLOCK is not set +# CONFIG_ROMFS_BACKED_BY_MTD is not set +CONFIG_ROMFS_BACKED_BY_BOTH=y +CONFIG_ROMFS_ON_BLOCK=y +CONFIG_ROMFS_ON_MTD=y +CONFIG_PSTORE=y +CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240 +CONFIG_PSTORE_COMPRESS=y +# CONFIG_PSTORE_CONSOLE is not set +# CONFIG_PSTORE_PMSG is not set +# CONFIG_PSTORE_FTRACE is not set +CONFIG_PSTORE_RAM=m +# CONFIG_PSTORE_BLK is not set +CONFIG_SYSV_FS=m +CONFIG_UFS_FS=m +# CONFIG_UFS_FS_WRITE is not set +# CONFIG_UFS_DEBUG is not set +# CONFIG_EROFS_FS is not set +CONFIG_NETWORK_FILESYSTEMS=y +CONFIG_NFS_FS=m +CONFIG_NFS_V2=m +CONFIG_NFS_V3=m +CONFIG_NFS_V3_ACL=y +CONFIG_NFS_V4=m +CONFIG_NFS_SWAP=y +CONFIG_NFS_V4_1=y +CONFIG_NFS_V4_2=y +CONFIG_PNFS_FILE_LAYOUT=m +CONFIG_PNFS_BLOCK=m +CONFIG_PNFS_FLEXFILE_LAYOUT=m +CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org" +# CONFIG_NFS_V4_1_MIGRATION is not set +CONFIG_NFS_V4_SECURITY_LABEL=y +CONFIG_NFS_FSCACHE=y +# CONFIG_NFS_USE_LEGACY_DNS is not set +CONFIG_NFS_USE_KERNEL_DNS=y +CONFIG_NFS_DEBUG=y +CONFIG_NFS_DISABLE_UDP_SUPPORT=y +# CONFIG_NFS_V4_2_READ_PLUS is not set +CONFIG_NFSD=m +# CONFIG_NFSD_V2 is not set +CONFIG_NFSD_V3_ACL=y +CONFIG_NFSD_V4=y +CONFIG_NFSD_PNFS=y +CONFIG_NFSD_BLOCKLAYOUT=y +# CONFIG_NFSD_SCSILAYOUT is not set +# CONFIG_NFSD_FLEXFILELAYOUT is not set +# CONFIG_NFSD_V4_2_INTER_SSC is not set +CONFIG_NFSD_V4_SECURITY_LABEL=y +CONFIG_GRACE_PERIOD=m +CONFIG_LOCKD=m +CONFIG_LOCKD_V4=y +CONFIG_NFS_ACL_SUPPORT=m +CONFIG_NFS_COMMON=y +CONFIG_NFS_V4_2_SSC_HELPER=y +CONFIG_SUNRPC=m +CONFIG_SUNRPC_GSS=m +CONFIG_SUNRPC_BACKCHANNEL=y +CONFIG_SUNRPC_SWAP=y +CONFIG_RPCSEC_GSS_KRB5=m +CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y +# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_CAMELLIA is not set +# CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2 is not set +CONFIG_SUNRPC_DEBUG=y +CONFIG_SUNRPC_XPRT_RDMA=m +CONFIG_CEPH_FS=m +CONFIG_CEPH_FSCACHE=y +CONFIG_CEPH_FS_POSIX_ACL=y +# CONFIG_CEPH_FS_SECURITY_LABEL is not set +CONFIG_CIFS=m +# CONFIG_CIFS_STATS2 is not set +CONFIG_CIFS_ALLOW_INSECURE_LEGACY=y +CONFIG_CIFS_UPCALL=y +CONFIG_CIFS_XATTR=y +CONFIG_CIFS_POSIX=y +CONFIG_CIFS_DEBUG=y +# CONFIG_CIFS_DEBUG2 is not set +# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set +CONFIG_CIFS_DFS_UPCALL=y +# CONFIG_CIFS_SWN_UPCALL is not set +# CONFIG_CIFS_SMB_DIRECT is not set +CONFIG_CIFS_FSCACHE=y +# CONFIG_SMB_SERVER is not set +CONFIG_SMBFS=m +CONFIG_CODA_FS=m +CONFIG_AFS_FS=m +# CONFIG_AFS_DEBUG is not set +CONFIG_AFS_FSCACHE=y +# CONFIG_AFS_DEBUG_CURSOR is not set +CONFIG_9P_FS=m +CONFIG_9P_FSCACHE=y +CONFIG_9P_FS_POSIX_ACL=y +CONFIG_9P_FS_SECURITY=y +CONFIG_NLS=y +CONFIG_NLS_DEFAULT="utf8" +CONFIG_NLS_CODEPAGE_437=m +CONFIG_NLS_CODEPAGE_737=m +CONFIG_NLS_CODEPAGE_775=m +CONFIG_NLS_CODEPAGE_850=m +CONFIG_NLS_CODEPAGE_852=m +CONFIG_NLS_CODEPAGE_855=m +CONFIG_NLS_CODEPAGE_857=m +CONFIG_NLS_CODEPAGE_860=m +CONFIG_NLS_CODEPAGE_861=m +CONFIG_NLS_CODEPAGE_862=m +CONFIG_NLS_CODEPAGE_863=m +CONFIG_NLS_CODEPAGE_864=m +CONFIG_NLS_CODEPAGE_865=m +CONFIG_NLS_CODEPAGE_866=m +CONFIG_NLS_CODEPAGE_869=m +CONFIG_NLS_CODEPAGE_936=m +CONFIG_NLS_CODEPAGE_950=m +CONFIG_NLS_CODEPAGE_932=m +CONFIG_NLS_CODEPAGE_949=m +CONFIG_NLS_CODEPAGE_874=m +CONFIG_NLS_ISO8859_8=m +CONFIG_NLS_CODEPAGE_1250=m +CONFIG_NLS_CODEPAGE_1251=m +CONFIG_NLS_ASCII=m +CONFIG_NLS_ISO8859_1=m +CONFIG_NLS_ISO8859_2=m +CONFIG_NLS_ISO8859_3=m +CONFIG_NLS_ISO8859_4=m +CONFIG_NLS_ISO8859_5=m +CONFIG_NLS_ISO8859_6=m +CONFIG_NLS_ISO8859_7=m +CONFIG_NLS_ISO8859_9=m +CONFIG_NLS_ISO8859_13=m +CONFIG_NLS_ISO8859_14=m +CONFIG_NLS_ISO8859_15=m +CONFIG_NLS_KOI8_R=m +CONFIG_NLS_KOI8_U=m +CONFIG_NLS_MAC_ROMAN=m +CONFIG_NLS_MAC_CELTIC=m +CONFIG_NLS_MAC_CENTEURO=m +CONFIG_NLS_MAC_CROATIAN=m +CONFIG_NLS_MAC_CYRILLIC=m +CONFIG_NLS_MAC_GAELIC=m +CONFIG_NLS_MAC_GREEK=m +CONFIG_NLS_MAC_ICELAND=m +CONFIG_NLS_MAC_INUIT=m +CONFIG_NLS_MAC_ROMANIAN=m +CONFIG_NLS_MAC_TURKISH=m +CONFIG_NLS_UTF8=m +CONFIG_NLS_UCS2_UTILS=m +CONFIG_DLM=m +CONFIG_DLM_DEBUG=y +# CONFIG_UNICODE is not set +CONFIG_IO_WQ=y +# end of File systems + +# +# Security options +# +CONFIG_KEYS=y +# CONFIG_KEYS_REQUEST_CACHE is not set +# CONFIG_PERSISTENT_KEYRINGS is not set +# CONFIG_TRUSTED_KEYS is not set +# CONFIG_ENCRYPTED_KEYS is not set +CONFIG_KEY_DH_OPERATIONS=y +CONFIG_SECURITY_DMESG_RESTRICT=y +CONFIG_PROC_MEM_ALWAYS_FORCE=y +# CONFIG_PROC_MEM_FORCE_PTRACE is not set +# CONFIG_PROC_MEM_NO_FORCE is not set +CONFIG_SECURITY=y +CONFIG_SECURITYFS=y +CONFIG_SECURITY_NETWORK=y +# CONFIG_SECURITY_INFINIBAND is not set +CONFIG_SECURITY_NETWORK_XFRM=y +CONFIG_SECURITY_PATH=y +CONFIG_LSM_MMAP_MIN_ADDR=32768 +CONFIG_HARDENED_USERCOPY=y +CONFIG_FORTIFY_SOURCE=y +# CONFIG_STATIC_USERMODEHELPER is not set +CONFIG_SECURITY_SELINUX=y +CONFIG_SECURITY_SELINUX_BOOTPARAM=y +CONFIG_SECURITY_SELINUX_DEVELOP=y +CONFIG_SECURITY_SELINUX_AVC_STATS=y +CONFIG_SECURITY_SELINUX_SIDTAB_HASH_BITS=9 +CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE=256 +# CONFIG_SECURITY_SELINUX_DEBUG is not set +# CONFIG_SECURITY_SMACK is not set +CONFIG_SECURITY_TOMOYO=y +CONFIG_SECURITY_TOMOYO_MAX_ACCEPT_ENTRY=2048 +CONFIG_SECURITY_TOMOYO_MAX_AUDIT_LOG=1024 +# CONFIG_SECURITY_TOMOYO_OMIT_USERSPACE_LOADER is not set +CONFIG_SECURITY_TOMOYO_POLICY_LOADER="/sbin/tomoyo-init" +CONFIG_SECURITY_TOMOYO_ACTIVATION_TRIGGER="/sbin/init" +# CONFIG_SECURITY_TOMOYO_INSECURE_BUILTIN_SETTING is not set +CONFIG_SECURITY_APPARMOR=y +# CONFIG_SECURITY_APPARMOR_DEBUG is not set +CONFIG_SECURITY_APPARMOR_INTROSPECT_POLICY=y +CONFIG_SECURITY_APPARMOR_HASH=y +CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y +CONFIG_SECURITY_APPARMOR_EXPORT_BINARY=y +CONFIG_SECURITY_APPARMOR_PARANOID_LOAD=y +# CONFIG_SECURITY_LOADPIN is not set +CONFIG_SECURITY_YAMA=y +# CONFIG_SECURITY_SAFESETID is not set +# CONFIG_SECURITY_LOCKDOWN_LSM is not set +# CONFIG_SECURITY_LANDLOCK is not set +CONFIG_INTEGRITY=y +CONFIG_INTEGRITY_SIGNATURE=y +CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y +# CONFIG_INTEGRITY_TRUSTED_KEYRING is not set +CONFIG_INTEGRITY_PLATFORM_KEYRING=y +# CONFIG_INTEGRITY_MACHINE_KEYRING is not set +CONFIG_LOAD_UEFI_KEYS=y +CONFIG_INTEGRITY_AUDIT=y +# CONFIG_IMA is not set +# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set +# CONFIG_EVM is not set +# CONFIG_DEFAULT_SECURITY_SELINUX is not set +# CONFIG_DEFAULT_SECURITY_TOMOYO is not set +CONFIG_DEFAULT_SECURITY_APPARMOR=y +# CONFIG_DEFAULT_SECURITY_DAC is not set +CONFIG_LSM="yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" + +# +# Kernel hardening options +# + +# +# Memory initialization +# +CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y +CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y +CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y +CONFIG_INIT_STACK_NONE=y +# CONFIG_INIT_STACK_ALL_PATTERN is not set +# CONFIG_INIT_STACK_ALL_ZERO is not set +# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set +# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set +CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y +# CONFIG_ZERO_CALL_USED_REGS is not set +# end of Memory initialization + +# +# Hardening of kernel data structures +# +CONFIG_LIST_HARDENED=y +CONFIG_BUG_ON_DATA_CORRUPTION=y +# end of Hardening of kernel data structures + +CONFIG_RANDSTRUCT_NONE=y +# end of Kernel hardening options +# end of Security options + +CONFIG_XOR_BLOCKS=m +CONFIG_ASYNC_CORE=m +CONFIG_ASYNC_MEMCPY=m +CONFIG_ASYNC_XOR=m +CONFIG_ASYNC_PQ=m +CONFIG_ASYNC_RAID6_RECOV=m +CONFIG_CRYPTO=y + +# +# Crypto core or helper +# +CONFIG_CRYPTO_FIPS=y +CONFIG_CRYPTO_FIPS_NAME="Linux Kernel Cryptographic API" +# CONFIG_CRYPTO_FIPS_CUSTOM_VERSION is not set +CONFIG_CRYPTO_ALGAPI=y +CONFIG_CRYPTO_ALGAPI2=y +CONFIG_CRYPTO_AEAD=m +CONFIG_CRYPTO_AEAD2=y +CONFIG_CRYPTO_SIG=y +CONFIG_CRYPTO_SIG2=y +CONFIG_CRYPTO_SKCIPHER=y +CONFIG_CRYPTO_SKCIPHER2=y +CONFIG_CRYPTO_HASH=y +CONFIG_CRYPTO_HASH2=y +CONFIG_CRYPTO_RNG=m +CONFIG_CRYPTO_RNG2=y +CONFIG_CRYPTO_RNG_DEFAULT=m +CONFIG_CRYPTO_AKCIPHER2=y +CONFIG_CRYPTO_AKCIPHER=y +CONFIG_CRYPTO_KPP2=y +CONFIG_CRYPTO_KPP=y +CONFIG_CRYPTO_ACOMP2=y +CONFIG_CRYPTO_MANAGER=y +CONFIG_CRYPTO_MANAGER2=y +CONFIG_CRYPTO_USER=m +# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set +# CONFIG_CRYPTO_MANAGER_EXTRA_TESTS is not set +CONFIG_CRYPTO_NULL=m +CONFIG_CRYPTO_NULL2=m +CONFIG_CRYPTO_PCRYPT=m +CONFIG_CRYPTO_CRYPTD=m +CONFIG_CRYPTO_AUTHENC=m +CONFIG_CRYPTO_TEST=m +CONFIG_CRYPTO_ENGINE=m +# end of Crypto core or helper + +# +# Public-key cryptography +# +CONFIG_CRYPTO_RSA=y +CONFIG_CRYPTO_DH=y +# CONFIG_CRYPTO_DH_RFC7919_GROUPS is not set +CONFIG_CRYPTO_ECC=m +CONFIG_CRYPTO_ECDH=m +# CONFIG_CRYPTO_ECDSA is not set +# CONFIG_CRYPTO_ECRDSA is not set +# CONFIG_CRYPTO_SM2 is not set +# CONFIG_CRYPTO_CURVE25519 is not set +# end of Public-key cryptography + +# +# Block ciphers +# +CONFIG_CRYPTO_AES=y +# CONFIG_CRYPTO_AES_TI is not set +CONFIG_CRYPTO_ANUBIS=m +# CONFIG_CRYPTO_ARIA is not set +CONFIG_CRYPTO_BLOWFISH=m +CONFIG_CRYPTO_BLOWFISH_COMMON=m +CONFIG_CRYPTO_CAMELLIA=m +CONFIG_CRYPTO_CAST_COMMON=m +CONFIG_CRYPTO_CAST5=m +CONFIG_CRYPTO_CAST6=m +CONFIG_CRYPTO_DES=m +CONFIG_CRYPTO_FCRYPT=m +CONFIG_CRYPTO_KHAZAD=m +CONFIG_CRYPTO_SEED=m +CONFIG_CRYPTO_SERPENT=m +CONFIG_CRYPTO_SM4=m +# CONFIG_CRYPTO_SM4_GENERIC is not set +CONFIG_CRYPTO_TEA=m +CONFIG_CRYPTO_TWOFISH=m +CONFIG_CRYPTO_TWOFISH_COMMON=m +# end of Block ciphers + +# +# Length-preserving ciphers and modes +# +# CONFIG_CRYPTO_ADIANTUM is not set +CONFIG_CRYPTO_ARC4=m +CONFIG_CRYPTO_CHACHA20=m +CONFIG_CRYPTO_CBC=y +# CONFIG_CRYPTO_CFB is not set +CONFIG_CRYPTO_CTR=m +CONFIG_CRYPTO_CTS=y +CONFIG_CRYPTO_ECB=y +# CONFIG_CRYPTO_HCTR2 is not set +# CONFIG_CRYPTO_KEYWRAP is not set +CONFIG_CRYPTO_LRW=m +# CONFIG_CRYPTO_OFB is not set +CONFIG_CRYPTO_PCBC=m +CONFIG_CRYPTO_XTS=y +# end of Length-preserving ciphers and modes + +# +# AEAD (authenticated encryption with associated data) ciphers +# +CONFIG_CRYPTO_AEGIS128=m +CONFIG_CRYPTO_AEGIS128_SIMD=y +CONFIG_CRYPTO_CHACHA20POLY1305=m +CONFIG_CRYPTO_CCM=m +CONFIG_CRYPTO_GCM=m +CONFIG_CRYPTO_GENIV=m +CONFIG_CRYPTO_SEQIV=m +CONFIG_CRYPTO_ECHAINIV=m +CONFIG_CRYPTO_ESSIV=m +# end of AEAD (authenticated encryption with associated data) ciphers + +# +# Hashes, digests, and MACs +# +CONFIG_CRYPTO_BLAKE2B=m +CONFIG_CRYPTO_CMAC=m +CONFIG_CRYPTO_GHASH=m +CONFIG_CRYPTO_HMAC=y +CONFIG_CRYPTO_MD4=m +CONFIG_CRYPTO_MD5=y +CONFIG_CRYPTO_MICHAEL_MIC=m +CONFIG_CRYPTO_POLYVAL=m +CONFIG_CRYPTO_POLY1305=m +CONFIG_CRYPTO_RMD160=m +CONFIG_CRYPTO_SHA1=y +CONFIG_CRYPTO_SHA256=y +CONFIG_CRYPTO_SHA512=y +CONFIG_CRYPTO_SHA3=m +CONFIG_CRYPTO_SM3=m +# CONFIG_CRYPTO_SM3_GENERIC is not set +# CONFIG_CRYPTO_STREEBOG is not set +CONFIG_CRYPTO_VMAC=m +CONFIG_CRYPTO_WP512=m +CONFIG_CRYPTO_XCBC=m +CONFIG_CRYPTO_XXHASH=m +# end of Hashes, digests, and MACs + +# +# CRCs (cyclic redundancy checks) +# +CONFIG_CRYPTO_CRC32C=m +CONFIG_CRYPTO_CRC32=m +CONFIG_CRYPTO_CRCT10DIF=y +CONFIG_CRYPTO_CRC64_ROCKSOFT=m +# end of CRCs (cyclic redundancy checks) + +# +# Compression +# +CONFIG_CRYPTO_DEFLATE=y +CONFIG_CRYPTO_LZO=y +# CONFIG_CRYPTO_842 is not set +CONFIG_CRYPTO_LZ4=m +CONFIG_CRYPTO_LZ4HC=m +CONFIG_CRYPTO_ZSTD=m +# end of Compression + +# +# Random number generation +# +CONFIG_CRYPTO_ANSI_CPRNG=m +CONFIG_CRYPTO_DRBG_MENU=m +CONFIG_CRYPTO_DRBG_HMAC=y +# CONFIG_CRYPTO_DRBG_HASH is not set +# CONFIG_CRYPTO_DRBG_CTR is not set +CONFIG_CRYPTO_DRBG=m +CONFIG_CRYPTO_JITTERENTROPY=m +# CONFIG_CRYPTO_JITTERENTROPY_TESTINTERFACE is not set +CONFIG_CRYPTO_KDF800108_CTR=y +# end of Random number generation + +# +# Userspace interface +# +CONFIG_CRYPTO_USER_API=m +CONFIG_CRYPTO_USER_API_HASH=m +CONFIG_CRYPTO_USER_API_SKCIPHER=m +CONFIG_CRYPTO_USER_API_RNG=m +# CONFIG_CRYPTO_USER_API_RNG_CAVP is not set +CONFIG_CRYPTO_USER_API_AEAD=m +CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y +# CONFIG_CRYPTO_STATS is not set +# end of Userspace interface + +CONFIG_CRYPTO_HASH_INFO=y +# CONFIG_CRYPTO_NHPOLY1305_NEON is not set +# CONFIG_CRYPTO_CHACHA20_NEON is not set + +# +# Accelerated Cryptographic Algorithms for CPU (arm64) +# +CONFIG_CRYPTO_GHASH_ARM64_CE=m +# CONFIG_CRYPTO_POLY1305_NEON is not set +CONFIG_CRYPTO_SHA1_ARM64_CE=m +CONFIG_CRYPTO_SHA256_ARM64=m +CONFIG_CRYPTO_SHA2_ARM64_CE=m +CONFIG_CRYPTO_SHA512_ARM64=m +CONFIG_CRYPTO_SHA512_ARM64_CE=m +CONFIG_CRYPTO_SHA3_ARM64=m +CONFIG_CRYPTO_SM3_NEON=m +CONFIG_CRYPTO_SM3_ARM64_CE=m +CONFIG_CRYPTO_POLYVAL_ARM64_CE=m +CONFIG_CRYPTO_AES_ARM64=m +CONFIG_CRYPTO_AES_ARM64_CE=m +CONFIG_CRYPTO_AES_ARM64_CE_BLK=m +CONFIG_CRYPTO_AES_ARM64_NEON_BLK=m +# CONFIG_CRYPTO_AES_ARM64_BS is not set +CONFIG_CRYPTO_SM4_ARM64_CE=m +CONFIG_CRYPTO_SM4_ARM64_CE_BLK=m +CONFIG_CRYPTO_SM4_ARM64_NEON_BLK=m +CONFIG_CRYPTO_AES_ARM64_CE_CCM=m +CONFIG_CRYPTO_SM4_ARM64_CE_CCM=m +CONFIG_CRYPTO_SM4_ARM64_CE_GCM=m +# CONFIG_CRYPTO_CRCT10DIF_ARM64_CE is not set +# end of Accelerated Cryptographic Algorithms for CPU (arm64) + +CONFIG_CRYPTO_HW=y +CONFIG_CRYPTO_DEV_ALLWINNER=y +# CONFIG_CRYPTO_DEV_SUN4I_SS is not set +# CONFIG_CRYPTO_DEV_SUN8I_CE is not set +# CONFIG_CRYPTO_DEV_SUN8I_SS is not set +# CONFIG_CRYPTO_DEV_ATMEL_ECC is not set +# CONFIG_CRYPTO_DEV_ATMEL_SHA204A is not set +# CONFIG_CRYPTO_DEV_CCP is not set +CONFIG_CRYPTO_DEV_CPT=m +CONFIG_CAVIUM_CPT=m +CONFIG_CRYPTO_DEV_NITROX=m +CONFIG_CRYPTO_DEV_NITROX_CNN55XX=m +CONFIG_CRYPTO_DEV_MARVELL=m +CONFIG_CRYPTO_DEV_MARVELL_CESA=m +# CONFIG_CRYPTO_DEV_OCTEONTX_CPT is not set +# CONFIG_CRYPTO_DEV_OCTEONTX2_CPT is not set +# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set +# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set +# CONFIG_CRYPTO_DEV_QAT_C62X is not set +# CONFIG_CRYPTO_DEV_QAT_4XXX is not set +# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set +# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set +# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set +# CONFIG_CRYPTO_DEV_CAVIUM_ZIP is not set +CONFIG_CRYPTO_DEV_QCE=m +CONFIG_CRYPTO_DEV_QCE_SKCIPHER=y +CONFIG_CRYPTO_DEV_QCE_SHA=y +CONFIG_CRYPTO_DEV_QCE_AEAD=y +CONFIG_CRYPTO_DEV_QCE_ENABLE_ALL=y +# CONFIG_CRYPTO_DEV_QCE_ENABLE_SKCIPHER is not set +# CONFIG_CRYPTO_DEV_QCE_ENABLE_SHA is not set +# CONFIG_CRYPTO_DEV_QCE_ENABLE_AEAD is not set +CONFIG_CRYPTO_DEV_QCE_SW_MAX_LEN=512 +CONFIG_CRYPTO_DEV_QCOM_RNG=m +# CONFIG_CRYPTO_DEV_ROCKCHIP is not set +# CONFIG_CRYPTO_DEV_ZYNQMP_AES is not set +# CONFIG_CRYPTO_DEV_ZYNQMP_SHA3 is not set +CONFIG_CRYPTO_DEV_CHELSIO=m +CONFIG_CRYPTO_DEV_VIRTIO=m +CONFIG_CRYPTO_DEV_SAFEXCEL=m +# CONFIG_CRYPTO_DEV_CCREE is not set +# CONFIG_CRYPTO_DEV_HISI_SEC is not set +# CONFIG_CRYPTO_DEV_HISI_SEC2 is not set +# CONFIG_CRYPTO_DEV_HISI_ZIP is not set +# CONFIG_CRYPTO_DEV_HISI_HPRE is not set +# CONFIG_CRYPTO_DEV_HISI_TRNG is not set +CONFIG_CRYPTO_DEV_AMLOGIC_GXL=m +# CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG is not set +CONFIG_ASYMMETRIC_KEY_TYPE=y +CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y +CONFIG_X509_CERTIFICATE_PARSER=y +CONFIG_PKCS8_PRIVATE_KEY_PARSER=m +CONFIG_PKCS7_MESSAGE_PARSER=y +# CONFIG_PKCS7_TEST_KEY is not set +# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set +# CONFIG_FIPS_SIGNATURE_SELFTEST is not set + +# +# Certificates for signature checking +# +CONFIG_MODULE_SIG_KEY="" +CONFIG_MODULE_SIG_KEY_TYPE_RSA=y +# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set +CONFIG_SYSTEM_TRUSTED_KEYRING=y +CONFIG_SYSTEM_TRUSTED_KEYS="" +# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set +CONFIG_SECONDARY_TRUSTED_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_HASH_LIST="" +# CONFIG_SYSTEM_REVOCATION_LIST is not set +# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set +# end of Certificates for signature checking + +CONFIG_BINARY_PRINTF=y + +# +# Library routines +# +CONFIG_RAID6_PQ=m +CONFIG_RAID6_PQ_BENCHMARK=y +CONFIG_LINEAR_RANGES=y +# CONFIG_PACKING is not set +CONFIG_BITREVERSE=y +CONFIG_HAVE_ARCH_BITREVERSE=y +CONFIG_GENERIC_STRNCPY_FROM_USER=y +CONFIG_GENERIC_STRNLEN_USER=y +CONFIG_GENERIC_NET_UTILS=y +CONFIG_CORDIC=m +# CONFIG_PRIME_NUMBERS is not set +CONFIG_RATIONAL=y +CONFIG_GENERIC_PCI_IOMAP=y +CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y +CONFIG_ARCH_HAS_FAST_MULTIPLIER=y +CONFIG_ARCH_USE_SYM_ANNOTATIONS=y +CONFIG_INDIRECT_PIO=y +# CONFIG_TRACE_MMIO_ACCESS is not set + +# +# Crypto library routines +# +CONFIG_CRYPTO_LIB_UTILS=y +CONFIG_CRYPTO_LIB_AES=y +CONFIG_CRYPTO_LIB_ARC4=m +CONFIG_CRYPTO_LIB_GF128MUL=m +CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y +CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m +# CONFIG_CRYPTO_LIB_CHACHA is not set +# CONFIG_CRYPTO_LIB_CURVE25519 is not set +CONFIG_CRYPTO_LIB_DES=m +CONFIG_CRYPTO_LIB_POLY1305_RSIZE=9 +CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m +# CONFIG_CRYPTO_LIB_POLY1305 is not set +# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set +CONFIG_CRYPTO_LIB_SHA1=y +CONFIG_CRYPTO_LIB_SHA256=y +# end of Crypto library routines + +CONFIG_CRC_CCITT=m +CONFIG_CRC16=m +CONFIG_CRC_T10DIF=y +CONFIG_CRC64_ROCKSOFT=m +CONFIG_CRC_ITU_T=m +CONFIG_CRC32=y +# CONFIG_CRC32_SELFTEST is not set +CONFIG_CRC32_SLICEBY8=y +# CONFIG_CRC32_SLICEBY4 is not set +# CONFIG_CRC32_SARWATE is not set +# CONFIG_CRC32_BIT is not set +CONFIG_CRC64=m +# CONFIG_CRC4 is not set +CONFIG_CRC7=m +CONFIG_LIBCRC32C=m +CONFIG_CRC8=y +CONFIG_XXHASH=y +CONFIG_AUDIT_GENERIC=y +CONFIG_AUDIT_ARCH_COMPAT_GENERIC=y +CONFIG_AUDIT_COMPAT_GENERIC=y +# CONFIG_RANDOM32_SELFTEST is not set +CONFIG_ZLIB_INFLATE=y +CONFIG_ZLIB_DEFLATE=y +CONFIG_LZO_COMPRESS=y +CONFIG_LZO_DECOMPRESS=y +CONFIG_LZ4_COMPRESS=m +CONFIG_LZ4HC_COMPRESS=m +CONFIG_LZ4_DECOMPRESS=y +CONFIG_ZSTD_COMMON=y +CONFIG_ZSTD_COMPRESS=y +CONFIG_ZSTD_DECOMPRESS=y +CONFIG_XZ_DEC=y +# CONFIG_XZ_DEC_X86 is not set +# CONFIG_XZ_DEC_POWERPC is not set +# CONFIG_XZ_DEC_IA64 is not set +# CONFIG_XZ_DEC_ARM is not set +# CONFIG_XZ_DEC_ARMTHUMB is not set +# CONFIG_XZ_DEC_SPARC is not set +# CONFIG_XZ_DEC_MICROLZMA is not set +# CONFIG_XZ_DEC_TEST is not set +CONFIG_DECOMPRESS_GZIP=y +CONFIG_DECOMPRESS_BZIP2=y +CONFIG_DECOMPRESS_LZMA=y +CONFIG_DECOMPRESS_XZ=y +CONFIG_DECOMPRESS_LZO=y +CONFIG_DECOMPRESS_LZ4=y +CONFIG_DECOMPRESS_ZSTD=y +CONFIG_GENERIC_ALLOCATOR=y +CONFIG_REED_SOLOMON=m +CONFIG_REED_SOLOMON_ENC8=y +CONFIG_REED_SOLOMON_DEC8=y +CONFIG_TEXTSEARCH=y +CONFIG_TEXTSEARCH_KMP=m +CONFIG_TEXTSEARCH_BM=m +CONFIG_TEXTSEARCH_FSM=m +CONFIG_BTREE=y +CONFIG_INTERVAL_TREE=y +CONFIG_XARRAY_MULTI=y +CONFIG_ASSOCIATIVE_ARRAY=y +CONFIG_HAS_IOMEM=y +CONFIG_HAS_IOPORT=y +CONFIG_HAS_IOPORT_MAP=y +CONFIG_HAS_DMA=y +CONFIG_DMA_OPS=y +CONFIG_NEED_SG_DMA_FLAGS=y +CONFIG_NEED_SG_DMA_LENGTH=y +CONFIG_NEED_DMA_MAP_STATE=y +CONFIG_ARCH_DMA_ADDR_T_64BIT=y +CONFIG_DMA_DECLARE_COHERENT=y +CONFIG_ARCH_HAS_SETUP_DMA_OPS=y +CONFIG_ARCH_HAS_TEARDOWN_DMA_OPS=y +CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y +CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU=y +CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y +CONFIG_SWIOTLB=y +# CONFIG_SWIOTLB_DYNAMIC is not set +CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC=y +# CONFIG_DMA_RESTRICTED_POOL is not set +CONFIG_DMA_NONCOHERENT_MMAP=y +CONFIG_DMA_COHERENT_POOL=y +CONFIG_DMA_DIRECT_REMAP=y +# CONFIG_DMA_API_DEBUG is not set +# CONFIG_DMA_MAP_BENCHMARK is not set +CONFIG_SGL_ALLOC=y +CONFIG_CHECK_SIGNATURE=y +CONFIG_CPU_RMAP=y +CONFIG_DQL=y +CONFIG_GLOB=y +# CONFIG_GLOB_SELFTEST is not set +CONFIG_NLATTR=y +CONFIG_LRU_CACHE=m +CONFIG_CLZ_TAB=y +CONFIG_IRQ_POLL=y +CONFIG_MPILIB=y +CONFIG_SIGNATURE=y +CONFIG_DIMLIB=y +CONFIG_LIBFDT=y +CONFIG_OID_REGISTRY=y +CONFIG_UCS2_STRING=y +CONFIG_HAVE_GENERIC_VDSO=y +CONFIG_GENERIC_GETTIMEOFDAY=y +CONFIG_GENERIC_VDSO_TIME_NS=y +CONFIG_FONT_SUPPORT=y +# CONFIG_FONTS is not set +CONFIG_FONT_8x8=y +CONFIG_FONT_8x16=y +CONFIG_SG_POOL=y +CONFIG_ARCH_HAS_PMEM_API=y +CONFIG_MEMREGION=y +CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y +CONFIG_ARCH_STACKWALK=y +CONFIG_STACKDEPOT=y +CONFIG_SBITMAP=y +# end of Library routines + +CONFIG_GENERIC_IOREMAP=y +CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED=y +CONFIG_PLDMFW=y + +# +# Kernel hacking +# + +# +# printk and dmesg options +# +CONFIG_PRINTK_TIME=y +CONFIG_PRINTK_CALLER=y +# CONFIG_STACKTRACE_BUILD_ID is not set +CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7 +CONFIG_CONSOLE_LOGLEVEL_QUIET=4 +CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4 +CONFIG_BOOT_PRINTK_DELAY=y +CONFIG_DYNAMIC_DEBUG=y +CONFIG_DYNAMIC_DEBUG_CORE=y +CONFIG_SYMBOLIC_ERRNAME=y +CONFIG_DEBUG_BUGVERBOSE=y +# end of printk and dmesg options + +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_MISC=y + +# +# Compile-time checks and compiler options +# +CONFIG_DEBUG_INFO=y +CONFIG_AS_HAS_NON_CONST_LEB128=y +# CONFIG_DEBUG_INFO_NONE is not set +CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y +# CONFIG_DEBUG_INFO_DWARF4 is not set +# CONFIG_DEBUG_INFO_DWARF5 is not set +# CONFIG_DEBUG_INFO_REDUCED is not set +CONFIG_DEBUG_INFO_COMPRESSED_NONE=y +# CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set +# CONFIG_DEBUG_INFO_SPLIT is not set +CONFIG_DEBUG_INFO_BTF=y +CONFIG_PAHOLE_HAS_SPLIT_BTF=y +CONFIG_PAHOLE_HAS_LANG_EXCLUDE=y +CONFIG_DEBUG_INFO_BTF_MODULES=y +# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set +# CONFIG_GDB_SCRIPTS is not set +CONFIG_FRAME_WARN=2048 +CONFIG_STRIP_ASM_SYMS=y +# CONFIG_READABLE_ASM is not set +# CONFIG_HEADERS_INSTALL is not set +# CONFIG_DEBUG_SECTION_MISMATCH is not set +CONFIG_SECTION_MISMATCH_WARN_ONLY=y +# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set +CONFIG_ARCH_WANT_FRAME_POINTERS=y +CONFIG_FRAME_POINTER=y +# CONFIG_VMLINUX_MAP is not set +# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set +# end of Compile-time checks and compiler options + +# +# Generic Kernel Debugging Instruments +# +CONFIG_MAGIC_SYSRQ=y +CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x01b6 +CONFIG_MAGIC_SYSRQ_SERIAL=y +CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE="" +CONFIG_DEBUG_FS=y +CONFIG_DEBUG_FS_ALLOW_ALL=y +# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set +# CONFIG_DEBUG_FS_ALLOW_NONE is not set +CONFIG_HAVE_ARCH_KGDB=y +# CONFIG_KGDB is not set +CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y +# CONFIG_UBSAN is not set +CONFIG_HAVE_ARCH_KCSAN=y +CONFIG_HAVE_KCSAN_COMPILER=y +# CONFIG_KCSAN is not set +# end of Generic Kernel Debugging Instruments + +# +# Networking Debugging +# +# CONFIG_NET_DEV_REFCNT_TRACKER is not set +# CONFIG_NET_NS_REFCNT_TRACKER is not set +# CONFIG_DEBUG_NET is not set +# end of Networking Debugging + +# +# Memory Debugging +# +CONFIG_PAGE_EXTENSION=y +# CONFIG_DEBUG_PAGEALLOC is not set +CONFIG_SLUB_DEBUG=y +# CONFIG_SLUB_DEBUG_ON is not set +CONFIG_PAGE_OWNER=y +# CONFIG_PAGE_TABLE_CHECK is not set +CONFIG_PAGE_POISONING=y +# CONFIG_DEBUG_PAGE_REF is not set +# CONFIG_DEBUG_RODATA_TEST is not set +CONFIG_ARCH_HAS_DEBUG_WX=y +# CONFIG_DEBUG_WX is not set +CONFIG_GENERIC_PTDUMP=y +# CONFIG_PTDUMP_DEBUGFS is not set +CONFIG_HAVE_DEBUG_KMEMLEAK=y +# CONFIG_DEBUG_KMEMLEAK is not set +# CONFIG_PER_VMA_LOCK_STATS is not set +# CONFIG_DEBUG_OBJECTS is not set +# CONFIG_SHRINKER_DEBUG is not set +# CONFIG_DEBUG_STACK_USAGE is not set +CONFIG_SCHED_STACK_END_CHECK=y +CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y +# CONFIG_DEBUG_VM is not set +# CONFIG_DEBUG_VM_PGTABLE is not set +CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y +# CONFIG_DEBUG_VIRTUAL is not set +CONFIG_DEBUG_MEMORY_INIT=y +CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m +# CONFIG_DEBUG_PER_CPU_MAPS is not set +CONFIG_HAVE_ARCH_KASAN=y +CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y +CONFIG_HAVE_ARCH_KASAN_HW_TAGS=y +CONFIG_HAVE_ARCH_KASAN_VMALLOC=y +CONFIG_CC_HAS_KASAN_GENERIC=y +CONFIG_CC_HAS_KASAN_SW_TAGS=y +CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y +# CONFIG_KASAN is not set +CONFIG_HAVE_ARCH_KFENCE=y +CONFIG_KFENCE=y +CONFIG_KFENCE_SAMPLE_INTERVAL=0 +CONFIG_KFENCE_NUM_OBJECTS=511 +# CONFIG_KFENCE_DEFERRABLE is not set +# CONFIG_KFENCE_STATIC_KEYS is not set +CONFIG_KFENCE_STRESS_TEST_FAULTS=0 +# end of Memory Debugging + +# CONFIG_DEBUG_SHIRQ is not set + +# +# Debug Oops, Lockups and Hangs +# +# CONFIG_PANIC_ON_OOPS is not set +CONFIG_PANIC_ON_OOPS_VALUE=0 +CONFIG_PANIC_TIMEOUT=0 +CONFIG_LOCKUP_DETECTOR=y +CONFIG_SOFTLOCKUP_DETECTOR=y +# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set +CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y +# CONFIG_HARDLOCKUP_DETECTOR is not set +CONFIG_DETECT_HUNG_TASK=y +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120 +# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set +# CONFIG_WQ_WATCHDOG is not set +# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set +# CONFIG_TEST_LOCKUP is not set +# end of Debug Oops, Lockups and Hangs + +# +# Scheduler Debugging +# +CONFIG_SCHED_DEBUG=y +CONFIG_SCHED_INFO=y +CONFIG_SCHEDSTATS=y +# end of Scheduler Debugging + +# CONFIG_DEBUG_TIMEKEEPING is not set +# CONFIG_DEBUG_PREEMPT is not set + +# +# Lock Debugging (spinlocks, mutexes, etc...) +# +CONFIG_LOCK_DEBUGGING_SUPPORT=y +# CONFIG_PROVE_LOCKING is not set +# CONFIG_LOCK_STAT is not set +# CONFIG_DEBUG_RT_MUTEXES is not set +# CONFIG_DEBUG_SPINLOCK is not set +# CONFIG_DEBUG_MUTEXES is not set +# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set +# CONFIG_DEBUG_RWSEMS is not set +# CONFIG_DEBUG_LOCK_ALLOC is not set +# CONFIG_DEBUG_ATOMIC_SLEEP is not set +# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set +# CONFIG_LOCK_TORTURE_TEST is not set +# CONFIG_WW_MUTEX_SELFTEST is not set +# CONFIG_SCF_TORTURE_TEST is not set +# CONFIG_CSD_LOCK_WAIT_DEBUG is not set +# end of Lock Debugging (spinlocks, mutexes, etc...) + +# CONFIG_DEBUG_IRQFLAGS is not set +CONFIG_STACKTRACE=y +# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set +# CONFIG_DEBUG_KOBJECT is not set + +# +# Debug kernel data structures +# +CONFIG_DEBUG_LIST=y +# CONFIG_DEBUG_PLIST is not set +# CONFIG_DEBUG_SG is not set +# CONFIG_DEBUG_NOTIFIERS is not set +# CONFIG_DEBUG_MAPLE_TREE is not set +# end of Debug kernel data structures + +# +# RCU Debugging +# +# CONFIG_RCU_SCALE_TEST is not set +# CONFIG_RCU_TORTURE_TEST is not set +# CONFIG_RCU_REF_SCALE_TEST is not set +CONFIG_RCU_CPU_STALL_TIMEOUT=21 +CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0 +# CONFIG_RCU_CPU_STALL_CPUTIME is not set +# CONFIG_RCU_TRACE is not set +# CONFIG_RCU_EQS_DEBUG is not set +# end of RCU Debugging + +# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set +# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set +# CONFIG_LATENCYTOP is not set +# CONFIG_DEBUG_CGROUP_REF is not set +CONFIG_NOP_TRACER=y +CONFIG_HAVE_FUNCTION_TRACER=y +CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y +CONFIG_HAVE_FUNCTION_GRAPH_RETVAL=y +CONFIG_HAVE_DYNAMIC_FTRACE=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y +CONFIG_HAVE_SYSCALL_TRACEPOINTS=y +CONFIG_HAVE_C_RECORDMCOUNT=y +CONFIG_TRACER_MAX_TRACE=y +CONFIG_TRACE_CLOCK=y +CONFIG_RING_BUFFER=y +CONFIG_EVENT_TRACING=y +CONFIG_CONTEXT_SWITCH_TRACER=y +CONFIG_TRACING=y +CONFIG_GENERIC_TRACER=y +CONFIG_TRACING_SUPPORT=y +CONFIG_FTRACE=y +# CONFIG_BOOTTIME_TRACING is not set +CONFIG_FUNCTION_TRACER=y +CONFIG_FUNCTION_GRAPH_TRACER=y +# CONFIG_FUNCTION_GRAPH_RETVAL is not set +CONFIG_DYNAMIC_FTRACE=y +CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y +CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y +# CONFIG_FUNCTION_PROFILER is not set +CONFIG_STACK_TRACER=y +# CONFIG_IRQSOFF_TRACER is not set +# CONFIG_PREEMPT_TRACER is not set +# CONFIG_SCHED_TRACER is not set +# CONFIG_HWLAT_TRACER is not set +# CONFIG_OSNOISE_TRACER is not set +# CONFIG_TIMERLAT_TRACER is not set +CONFIG_FTRACE_SYSCALLS=y +CONFIG_TRACER_SNAPSHOT=y +# CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP is not set +CONFIG_BRANCH_PROFILE_NONE=y +# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set +CONFIG_BLK_DEV_IO_TRACE=y +CONFIG_PROBE_EVENTS_BTF_ARGS=y +CONFIG_KPROBE_EVENTS=y +# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set +CONFIG_UPROBE_EVENTS=y +CONFIG_BPF_EVENTS=y +CONFIG_DYNAMIC_EVENTS=y +CONFIG_PROBE_EVENTS=y +# CONFIG_BPF_KPROBE_OVERRIDE is not set +CONFIG_FTRACE_MCOUNT_RECORD=y +CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y +# CONFIG_SYNTH_EVENTS is not set +# CONFIG_USER_EVENTS is not set +# CONFIG_HIST_TRIGGERS is not set +# CONFIG_TRACE_EVENT_INJECT is not set +# CONFIG_TRACEPOINT_BENCHMARK is not set +# CONFIG_RING_BUFFER_BENCHMARK is not set +# CONFIG_TRACE_EVAL_MAP_FILE is not set +# CONFIG_FTRACE_RECORD_RECURSION is not set +# CONFIG_FTRACE_STARTUP_TEST is not set +# CONFIG_RING_BUFFER_STARTUP_TEST is not set +# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set +# CONFIG_PREEMPTIRQ_DELAY_TEST is not set +# CONFIG_KPROBE_EVENT_GEN_TEST is not set +# CONFIG_RV is not set +# CONFIG_SAMPLES is not set +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y +CONFIG_STRICT_DEVMEM=y +# CONFIG_IO_STRICT_DEVMEM is not set + +# +# arm64 Debugging +# +# CONFIG_PID_IN_CONTEXTIDR is not set +# CONFIG_DEBUG_EFI is not set +# CONFIG_ARM64_RELOC_TEST is not set +# CONFIG_CORESIGHT is not set +# end of arm64 Debugging + +# +# Kernel Testing and Coverage +# +# CONFIG_KUNIT is not set +CONFIG_NOTIFIER_ERROR_INJECTION=m +CONFIG_PM_NOTIFIER_ERROR_INJECT=m +# CONFIG_NETDEV_NOTIFIER_ERROR_INJECT is not set +CONFIG_FUNCTION_ERROR_INJECTION=y +CONFIG_FAULT_INJECTION=y +# CONFIG_FAILSLAB is not set +# CONFIG_FAIL_PAGE_ALLOC is not set +# CONFIG_FAULT_INJECTION_USERCOPY is not set +# CONFIG_FAIL_MAKE_REQUEST is not set +# CONFIG_FAIL_IO_TIMEOUT is not set +# CONFIG_FAIL_FUTEX is not set +# CONFIG_FAULT_INJECTION_DEBUG_FS is not set +# CONFIG_FAULT_INJECTION_CONFIGFS is not set +CONFIG_ARCH_HAS_KCOV=y +CONFIG_CC_HAS_SANCOV_TRACE_PC=y +# CONFIG_KCOV is not set +CONFIG_RUNTIME_TESTING_MENU=y +# CONFIG_TEST_DHRY is not set +# CONFIG_LKDTM is not set +# CONFIG_TEST_MIN_HEAP is not set +# CONFIG_TEST_DIV64 is not set +# CONFIG_BACKTRACE_SELF_TEST is not set +# CONFIG_TEST_REF_TRACKER is not set +# CONFIG_RBTREE_TEST is not set +# CONFIG_REED_SOLOMON_TEST is not set +# CONFIG_INTERVAL_TREE_TEST is not set +# CONFIG_PERCPU_TEST is not set +# CONFIG_ATOMIC64_SELFTEST is not set +# CONFIG_ASYNC_RAID6_TEST is not set +# CONFIG_TEST_HEXDUMP is not set +# CONFIG_STRING_SELFTEST is not set +# CONFIG_TEST_STRING_HELPERS is not set +# CONFIG_TEST_KSTRTOX is not set +# CONFIG_TEST_PRINTF is not set +# CONFIG_TEST_SCANF is not set +# CONFIG_TEST_BITMAP is not set +# CONFIG_TEST_UUID is not set +# CONFIG_TEST_XARRAY is not set +# CONFIG_TEST_MAPLE_TREE is not set +# CONFIG_TEST_RHASHTABLE is not set +# CONFIG_TEST_IDA is not set +# CONFIG_TEST_LKM is not set +# CONFIG_TEST_BITOPS is not set +# CONFIG_TEST_VMALLOC is not set +CONFIG_TEST_USER_COPY=m +CONFIG_TEST_BPF=m +# CONFIG_TEST_BLACKHOLE_DEV is not set +# CONFIG_FIND_BIT_BENCHMARK is not set +CONFIG_TEST_FIRMWARE=m +# CONFIG_TEST_SYSCTL is not set +# CONFIG_TEST_UDELAY is not set +CONFIG_TEST_STATIC_KEYS=m +# CONFIG_TEST_DYNAMIC_DEBUG is not set +# CONFIG_TEST_KMOD is not set +# CONFIG_TEST_MEMCAT_P is not set +CONFIG_TEST_LIVEPATCH=m +# CONFIG_TEST_MEMINIT is not set +# CONFIG_TEST_HMM is not set +# CONFIG_TEST_FREE_PAGES is not set +CONFIG_ARCH_USE_MEMTEST=y +# CONFIG_MEMTEST is not set +# CONFIG_HYPERV_TESTING is not set +# end of Kernel Testing and Coverage + +# +# Rust hacking +# +# end of Rust hacking +# end of Kernel hacking From e8ae6593cbec63bf99c0c1e1a446817188e81eb3 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 4 Oct 2023 09:08:32 -0300 Subject: [PATCH 542/548] iommu: Fix return code in iommu_group_alloc_default_domain() This function returns NULL on errors, not ERR_PTR. Fixes: 1c68cbc64fe6 ("iommu: Add IOMMU_DOMAIN_PLATFORM") Reported-by: Dan Carpenter Link: https://lore.kernel.org/r/8fb75157-6c81-4a9c-9992-d73d49902fa8@moroto.mountain Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/0-v2-ee2bae9af0f2+96-iommu_ga_err_ptr_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit b85b4f30846bb169c114e99ceee17cc119f02a4b) Signed-off-by: Koba Ko --- drivers/iommu/iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 53ee79a6babb4..a20136be12670 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1607,7 +1607,7 @@ iommu_group_alloc_default_domain(struct iommu_group *group, int req_type) */ if (ops->default_domain) { if (req_type != ops->default_domain->type) - return ERR_PTR(-EINVAL); + return NULL; return ops->default_domain; } From 348a37e0eec6779fcbc594ee382602eee342195e Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Mon, 14 Apr 2025 12:16:35 -0700 Subject: [PATCH 543/548] iommu: Fix two issues in iommu_copy_struct_from_user() In the review for iommu_copy_struct_to_user() helper, Matt pointed out that a NULL pointer should be rejected prior to dereferencing it: https://lore.kernel.org/all/86881827-8E2D-461C-BDA3-FA8FD14C343C@nvidia.com And Alok pointed out a typo at the same time: https://lore.kernel.org/all/480536af-6830-43ce-a327-adbd13dc3f1d@oracle.com Since both issues were copied from iommu_copy_struct_from_user(), fix them first in the current header. Fixes: e9d36c07bb78 ("iommu: Add iommu_copy_struct_from_user helper") Cc: stable@vger.kernel.org Signed-off-by: Nicolin Chen Reviewed-by: Kevin Tian Acked-by: Alok Tiwari Reviewed-by: Matthew R. Ochs Link: https://lore.kernel.org/r/20250414191635.450472-1-nicolinc@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit 30a3f2f3e4bd6335b727c83c08a982d969752bc1) Signed-off-by: Koba Ko --- include/linux/iommu.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index bc221bc51e8f8..0379be56dc91b 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -407,10 +407,10 @@ static inline int __iommu_copy_struct_from_user( void *dst_data, const struct iommu_user_data *src_data, unsigned int data_type, size_t data_len, size_t min_len) { - if (src_data->type != data_type) - return -EINVAL; if (WARN_ON(!dst_data || !src_data)) return -EINVAL; + if (src_data->type != data_type) + return -EINVAL; if (src_data->len < min_len || data_len < src_data->len) return -EINVAL; return copy_struct_from_user(dst_data, data_len, src_data->uptr, @@ -423,8 +423,8 @@ static inline int __iommu_copy_struct_from_user( * include/uapi/linux/iommufd.h * @user_data: Pointer to a struct iommu_user_data for user space data info * @data_type: The data type of the @kdst. Must match with @user_data->type - * @min_last: The last memember of the data structure @kdst points in the - * initial version. + * @min_last: The last member of the data structure @kdst points in the initial + * version. * Return 0 for success, otherwise -error. */ #define iommu_copy_struct_from_user(kdst, user_data, data_type, min_last) \ From 36186bb639890a4fc3952f51e60733ab4fb15bd4 Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Tue, 1 Apr 2025 22:27:31 +0200 Subject: [PATCH 544/548] iommu/exynos: Fix suspend/resume with IDENTITY domain Commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path") changed the sequence of probing the SYSMMU controller devices and calls to arm_iommu_attach_device(), what results in resuming SYSMMU controller earlier, when it is still set to IDENTITY mapping. Such change revealed the bug in IDENTITY handling in the exynos-iommu driver. When SYSMMU controller is set to IDENTITY mapping, data->domain is NULL, so adjust checks in suspend & resume callbacks to handle this case correctly. Fixes: b3d14960e629 ("iommu/exynos: Implement an IDENTITY domain") Signed-off-by: Marek Szyprowski Link: https://lore.kernel.org/r/20250401202731.2810474-1-m.szyprowski@samsung.com Signed-off-by: Joerg Roedel (cherry picked from commit 99deffc409b69000ac4877486e69ec6516becd53) Signed-off-by: Koba Ko --- drivers/iommu/exynos-iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c index d6dead2ed10c1..10c13ac413e3f 100644 --- a/drivers/iommu/exynos-iommu.c +++ b/drivers/iommu/exynos-iommu.c @@ -830,7 +830,7 @@ static int __maybe_unused exynos_sysmmu_suspend(struct device *dev) struct exynos_iommu_owner *owner = dev_iommu_priv_get(master); mutex_lock(&owner->rpm_lock); - if (&data->domain->domain != &exynos_identity_domain) { + if (data->domain) { dev_dbg(data->sysmmu, "saving state\n"); __sysmmu_disable(data); } @@ -848,7 +848,7 @@ static int __maybe_unused exynos_sysmmu_resume(struct device *dev) struct exynos_iommu_owner *owner = dev_iommu_priv_get(master); mutex_lock(&owner->rpm_lock); - if (&data->domain->domain != &exynos_identity_domain) { + if (data->domain) { dev_dbg(data->sysmmu, "restoring state\n"); __sysmmu_enable(data); } From bec607693e8f539d9bdeafa9769ef8df1b212d81 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 7 May 2024 10:21:10 -0300 Subject: [PATCH 545/548] iommu/arm-smmu-v3: Make the kunit into a module It turns out kconfig has problems ensuring the SMMU module and the KUNIT module are consistently y/m to allow linking. It will permit KUNIT to be a module while SMMU is built in. Also, Fedora apparently enables kunit on production kernels. So, put the entire kunit in its own module using the VISIBLE_IF_KUNIT/EXPORT_SYMBOL_IF_KUNIT machinery. This keeps it out of vmlinus on Fedora and makes the kconfig work in the normal way. There is no cost if kunit is disabled. Fixes: 56e1a4cc2588 ("iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry") Reported-by: Thorsten Leemhuis Link: https://lore.kernel.org/all/aeea8546-5bce-4c51-b506-5d2008e52fef@leemhuis.info Signed-off-by: Jason Gunthorpe Tested-by: Thorsten Leemhuis Acked-by: Will Deacon Link: https://lore.kernel.org/r/0-v1-24cba6c0f404+2ae-smmu_kunit_module_jgg@nvidia.com Signed-off-by: Joerg Roedel (cherry picked from commit da55da5a42d4247d7a48b843fa5fcd9a4a10f4fe) Signed-off-by: Koba Ko --- drivers/iommu/Kconfig | 2 +- drivers/iommu/arm/arm-smmu-v3/Makefile | 3 ++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 1 + drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c | 3 +++ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++++++++ 5 files changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index be4bcc4f4b84b..0a6f739450643 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -410,7 +410,7 @@ config ARM_SMMU_V3_SVA and PRI. config ARM_SMMU_V3_KUNIT_TEST - bool "KUnit tests for arm-smmu-v3 driver" if !KUNIT_ALL_TESTS + tristate "KUnit tests for arm-smmu-v3 driver" if !KUNIT_ALL_TESTS depends on KUNIT depends on ARM_SMMU_V3_SVA default KUNIT_ALL_TESTS diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile index 0b97054b3929b..014a997753a8a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/Makefile +++ b/drivers/iommu/arm/arm-smmu-v3/Makefile @@ -2,5 +2,6 @@ obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o arm_smmu_v3-objs-y += arm-smmu-v3.o arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o -arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_KUNIT_TEST) += arm-smmu-v3-test.o arm_smmu_v3-objs := $(arm_smmu_v3-objs-y) + +obj-$(CONFIG_ARM_SMMU_V3_KUNIT_TEST) += arm-smmu-v3-test.o diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c index 95e70a304ae20..a7c36654dee5a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c @@ -113,6 +113,7 @@ void arm_smmu_make_sva_cd(struct arm_smmu_cd *target, */ target->data[3] = cpu_to_le64(read_sysreg(mair_el1)); } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_sva_cd); /* * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c index ee90dee4865c8..e0fce31eba54d 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.c @@ -565,3 +565,6 @@ static struct kunit_suite arm_smmu_v3_test_module = { .test_cases = arm_smmu_v3_test_cases, }; kunit_test_suites(&arm_smmu_v3_test_module); + +MODULE_IMPORT_NS(EXPORTED_FOR_KUNIT_TESTING); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index f380847888c45..88b9fa9542a49 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1020,6 +1020,7 @@ void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits) if (cfg == STRTAB_STE_0_CFG_BYPASS) used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG); } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_get_ste_used); /* * Figure out if we can do a hitless update of entry to become target. Returns a @@ -1154,6 +1155,7 @@ void arm_smmu_write_entry(struct arm_smmu_entry_writer *writer, __le64 *entry, entry_set(writer, entry, target, 0, NUM_ENTRY_QWORDS)); } } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_write_entry); static void arm_smmu_sync_cd(struct arm_smmu_master *master, int ssid, bool leaf) @@ -1281,6 +1283,7 @@ void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits) used_bits[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK); } } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_get_cd_used); static void arm_smmu_cd_writer_sync_entry(struct arm_smmu_entry_writer *writer) { @@ -1354,6 +1357,7 @@ void arm_smmu_make_s1_cd(struct arm_smmu_cd *target, CTXDESC_CD_1_TTB0_MASK); target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s1_cfg.mair); } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_s1_cd); void arm_smmu_clear_cd(struct arm_smmu_master *master, ioasid_t ssid) { @@ -1521,6 +1525,7 @@ void arm_smmu_make_abort_ste(struct arm_smmu_ste *target) STRTAB_STE_0_V | FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT)); } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_abort_ste); VISIBLE_IF_KUNIT void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, @@ -1535,6 +1540,7 @@ void arm_smmu_make_bypass_ste(struct arm_smmu_device *smmu, target->data[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING)); } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_bypass_ste); VISIBLE_IF_KUNIT void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, @@ -1592,6 +1598,7 @@ void arm_smmu_make_cdtable_ste(struct arm_smmu_ste *target, cpu_to_le64(FIELD_PREP(STRTAB_STE_2_S2VMID, 0)); } } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_cdtable_ste); VISIBLE_IF_KUNIT void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, @@ -1640,6 +1647,7 @@ void arm_smmu_make_s2_domain_ste(struct arm_smmu_ste *target, target->data[3] = cpu_to_le64(pgtbl_cfg->arm_lpae_s2_cfg.vttbr & STRTAB_STE_3_S2TTB_MASK); } +EXPORT_SYMBOL_IF_KUNIT(arm_smmu_make_s2_domain_ste); /* * This can safely directly manipulate the STE memory without a sync sequence From 6bb8a06e7020781d6063bc31b396c7a48fe04b15 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Tue, 4 Feb 2025 15:18:19 -0400 Subject: [PATCH 546/548] gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU Previously with tegra-smmu, even with CONFIG_IOMMU_DMA, the default domain could have been left as NULL. The NULL domain is specially recognized by host1x_iommu_attach() as meaning it is not the DMA domain and should be replaced with the special shared domain. This happened prior to the below commit because tegra-smmu was using the NULL domain to mean IDENTITY. Now that the domain is properly labled the test in DRM doesn't see NULL. Check for IDENTITY as well to enable the special domains. This is the same issue and basic fix as seen in commit fae6e669cdc5 ("drm/tegra: Do not assume that a NULL domain means no DMA IOMMU"). Fixes: c8cc2655cc6c ("iommu/tegra-smmu: Implement an IDENTITY domain") Reported-by: Diogo Ivo Closes: https://lore.kernel.org/all/c6a6f114-3acd-4d56-a13b-b88978e927dc@tecnico.ulisboa.pt/ Tested-by: Diogo Ivo Signed-off-by: Jason Gunthorpe Signed-off-by: Thierry Reding Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-10dcc8ce3869+3a7-host1x_identity_jgg@nvidia.com (cherry picked from commit cb83f4b965a66d85e9a03621ef3b22c044f4a033) Signed-off-by: Koba Ko --- drivers/gpu/host1x/dev.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 7c6699aed7d2a..2485167ed3613 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -342,6 +342,10 @@ static bool host1x_wants_iommu(struct host1x *host1x) return true; } +/* + * Returns ERR_PTR on failure, NULL if the translation is IDENTITY, otherwise a + * valid paging domain. + */ static struct iommu_domain *host1x_iommu_attach(struct host1x *host) { struct iommu_domain *domain = iommu_get_domain_for_dev(host->dev); @@ -366,6 +370,8 @@ static struct iommu_domain *host1x_iommu_attach(struct host1x *host) * Similarly, if host1x is already attached to an IOMMU (via the DMA * API), don't try to attach again. */ + if (domain && domain->type == IOMMU_DOMAIN_IDENTITY) + domain = NULL; if (!host1x_wants_iommu(host) || domain) return domain; From 9313563a3580e2cce70401362bd9f86a445221f9 Mon Sep 17 00:00:00 2001 From: Shivaprasad G Bhat Date: Fri, 26 Jan 2024 09:09:18 -0600 Subject: [PATCH 547/548] powerpc: iommu: Bring back table group release_ownership() call The commit 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") refactored the code removing the set_platform_dma_ops(). It missed out the table group release_ownership() call which would have got called otherwise during the guest shutdown via vfio_group_detach_container(). On PPC64, this particular call actually sets up the 32-bit TCE table, and enables the 64-bit DMA bypass etc. Now after guest shutdown, the subsequent host driver (e.g megaraid-sas) probe post unbind from vfio-pci fails like, megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0x7fffffffffffffff, table unavailable megaraid_sas 0031:01:00.0: Warning: IOMMU dma not supported: mask 0xffffffff, table unavailable megaraid_sas 0031:01:00.0: Failed to set DMA mask megaraid_sas 0031:01:00.0: Failed from megasas_init_fw 6539 The patch brings back the call to table_group release_ownership() call when switching back to PLATFORM domain from BLOCKED, while also separates the domain_ops for both. Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops") Signed-off-by: Shivaprasad G Bhat Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/170628173462.3742.18330000394415935845.stgit@ltcd48-lp2.aus.stglab.ibm.com Signed-off-by: Joerg Roedel (cherry picked from commit d2d00e15808c37ec476a5c040ee2cdd23854ef18) Signed-off-by: Koba Ko --- arch/powerpc/kernel/iommu.c | 37 ++++++++++++++++++++++++++++--------- 1 file changed, 28 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index 88debde1d8d02..e70924500022d 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -1287,20 +1287,20 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain, struct iommu_domain *domain = iommu_get_domain_for_dev(dev); struct iommu_group *grp = iommu_group_get(dev); struct iommu_table_group *table_group; - int ret = -EINVAL; /* At first attach the ownership is already set */ if (!domain) return 0; - if (!grp) - return -ENODEV; - table_group = iommu_group_get_iommudata(grp); - ret = table_group->ops->take_ownership(table_group); + /* + * The domain being set to PLATFORM from earlier + * BLOCKED. The table_group ownership has to be released. + */ + table_group->ops->release_ownership(table_group); iommu_group_put(grp); - return ret; + return 0; } static const struct iommu_domain_ops spapr_tce_platform_domain_ops = { @@ -1312,13 +1312,32 @@ static struct iommu_domain spapr_tce_platform_domain = { .ops = &spapr_tce_platform_domain_ops, }; -static struct iommu_domain spapr_tce_blocked_domain = { - .type = IOMMU_DOMAIN_BLOCKED, +static int +spapr_tce_blocked_iommu_attach_dev(struct iommu_domain *platform_domain, + struct device *dev) +{ + struct iommu_group *grp = iommu_group_get(dev); + struct iommu_table_group *table_group; + int ret = -EINVAL; + /* * FIXME: SPAPR mixes blocked and platform behaviors, the blocked domain * also sets the dma_api ops */ - .ops = &spapr_tce_platform_domain_ops, + table_group = iommu_group_get_iommudata(grp); + ret = table_group->ops->take_ownership(table_group); + iommu_group_put(grp); + + return ret; +} + +static const struct iommu_domain_ops spapr_tce_blocked_domain_ops = { + .attach_dev = spapr_tce_blocked_iommu_attach_dev, +}; + +static struct iommu_domain spapr_tce_blocked_domain = { + .type = IOMMU_DOMAIN_BLOCKED, + .ops = &spapr_tce_blocked_domain_ops, }; static bool spapr_tce_iommu_capable(struct device *dev, enum iommu_cap cap) From 5c5c4acfe63dade61f3dfe3843d610a74212b75f Mon Sep 17 00:00:00 2001 From: Nicolin Chen Date: Wed, 17 Jul 2024 22:01:30 -0700 Subject: [PATCH 548/548] iommufd/device: Fix hwpt at err_unresv in iommufd_device_do_replace() The rewind routine should remove the reserved iovas added to the new hwpt. Fixes: 89db31635c87 ("iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/r/20240718050130.1956804-1-nicolinc@nvidia.com Signed-off-by: Nicolin Chen Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe (cherry picked from commit 950aeefb34923fe3c28ade35fe05f24e2c5b1d55) Signed-off-by: Koba Ko --- drivers/iommu/iommufd/device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 59ae9a4ad0170..21021a2cecba8 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -542,7 +542,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, err_unresv: if (hwpt_is_paging(hwpt)) iommufd_group_remove_reserved_iova(igroup, - to_hwpt_paging(old_hwpt)); + to_hwpt_paging(hwpt)); err_unlock: mutex_unlock(&idev->igroup->lock); return ERR_PTR(rc);