Граф коммитов

33 Коммитов

Автор SHA1 Сообщение Дата
Dan Williams 86c8ea0f3b cxl/core/port: Use dedicated lock for decoder target list
Lockdep reports:

 ======================================================
 WARNING: possible circular locking dependency detected
 5.16.0-rc1+ #142 Tainted: G           OE
 ------------------------------------------------------
 cxl/1220 is trying to acquire lock:
 ffff979b85475460 (kn->active#144){++++}-{0:0}, at: __kernfs_remove+0x1ab/0x1e0

 but task is already holding lock:
 ffff979b87ab38e8 (&dev->lockdep_mutex#2/4){+.+.}-{3:3}, at: cxl_remove_ep+0x50c/0x5c0 [cxl_core]

...where cxl_remove_ep() is a helper that wants to delete ports while
holding a lock on the host device for that port. That sets up a lockdep
violation whereby target_list_show() can not rely holding the decoder's
device lock while walking the target_list. Switch to a dedicated seqlock
for this purpose.

Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164367209095.208169.1171673319121271280.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams 3c5b903955 cxl: Prove CXL locking
When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets
an additional mutex that is not clobbered by
lockdep_set_novalidate_class() like the typical device_lock(). This
allows for local annotation of subsystem locks with mutex_lock_nested()
per the subsystem's object/lock hierarchy. For CXL, this primarily needs
the ability to lock ports by depth and child objects of ports by their
parent parent-port lock.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Ben Widawsky 53fa1bff34 cxl/core: Track port depth
In preparation for proving CXL subsystem usage of the device_lock()
order track the depth of ports with the expectation that  shallower port
locks can be held over deeper port locks.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298419321.3018233.4469731547378993606.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Ben Widawsky d54c1bbe2d cxl/core/port: Clarify decoder creation
Add wrappers for the creation of decoder objects at the root level and
switch level, and keep the core helper private to cxl/core/port.c. Root
decoders are static descriptors conveyed from platform firmware (e.g.
ACPI CFMWS). Switch decoders are CXL standard decoders enumerated via
the HDM decoder capability structure. The base address for the HDM
decoder capability structure may be conveyed either by PCIe or platform
firmware (ACPI CEDT.CHBS).

Additionally, the kdoc descriptions for these helpers and their
dependencies is updated.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
[djbw: fixup changelog, clarify kdoc]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164366463014.111117.9714595404002687111.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky 608135db1b cxl/core: Convert decoder range to resource
CXL decoders manage address ranges in a hierarchical fashion whereby a
leaf is a unique subregion of its parent decoder (midlevel or root). It
therefore makes sense to use the resource API for handling this.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v1)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298417191.3018233.5201055578165414714.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky c57cae78bf cxl: Introduce module_cxl_driver
Many CXL drivers simply want to register and unregister themselves.
module_driver already supported this. A simple wrapper around that
reduces a decent amount of boilerplate in upcoming patches.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164298415591.3018233.13608495220547681412.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Ben Widawsky 303ebc1b17 cxl/acpi: Map component registers for Root Ports
This implements the TODO in cxl_acpi for mapping component registers.
cxl_acpi becomes the second consumer of CXL register block enumeration
(cxl_pci being the first). Moving the functionality to cxl_core allows
both of these drivers to use the functionality. Equally importantly it
allows cxl_core to use the functionality in the future.

CXL 2.0 root ports are similar to CXL 2.0 Downstream Ports with the main
distinction being they're a part of the CXL 2.0 host bridge. While
mapping their component registers is not immediately useful for the CXL
drivers, the movement of register block enumeration into core is a vital
step towards HDM decoder programming.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: fix cxl_regmap_to_base() failure cases]
Link: https://lore.kernel.org/r/164298415080.3018233.14694957480228676592.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:28 -08:00
Nathan Chancellor be185c2988 cxl/core: Remove cxld_const_init in cxl_decoder_alloc()
Commit 48667f6761 ("cxl/core: Split decoder setup into alloc + add")
aimed to fix a large stack frame warning but from v5 to v6, it
introduced a new instance of the warning due to allocating
cxld_const_init on the stack, which was done due to the use of const on
the nr_target member of the cxl_decoder struct. With ARCH=arm
allmodconfig minus CONFIG_KASAN:

GCC 11.2.0:

drivers/cxl/core/bus.c: In function ‘cxl_decoder_alloc’:
drivers/cxl/core/bus.c:523:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
  523 | }
      | ^
cc1: all warnings being treated as errors

Clang 12.0.1:

drivers/cxl/core/bus.c:486:21: error: stack frame size of 1056 bytes in function 'cxl_decoder_alloc' [-Werror,-Wframe-larger-than=]
struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
                    ^
1 error generated.

Revert that part of the change, which makes the stack frame of
cxl_decoder_alloc() much more reasonable.

Fixes: 48667f6761 ("cxl/core: Split decoder setup into alloc + add")
Link: https://github.com/ClangBuiltLinux/linux/issues/1539
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211210213627.2477370-1-nathan@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-01-04 17:29:31 -08:00
Dan Williams 53989fad12 cxl/pmem: Fix module reload vs workqueue state
A test of the form:

    while true; do modprobe -r cxl_pmem; modprobe cxl_pmem; done

May lead to a crash signature of the form:

    BUG: unable to handle page fault for address: ffffffffc0660030
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    [..]
    Workqueue: cxl_pmem 0xffffffffc0660030
    RIP: 0010:0xffffffffc0660030
    Code: Unable to access opcode bytes at RIP 0xffffffffc0660006.
    [..]
    Call Trace:
     ? process_one_work+0x4ec/0x9c0
     ? pwq_dec_nr_in_flight+0x100/0x100
     ? rwlock_bug.part.0+0x60/0x60
     ? worker_thread+0x2eb/0x700

In that report the 0xffffffffc0660030 address corresponds to the former
function address of cxl_nvb_update_state() from a previous load of the
module, not the current address. Fix that by arranging for ->state_work
in the 'struct cxl_nvdimm_bridge' object to be reinitialized on cxl_pmem
module reload.

Details:

Recall that CXL subsystem wants to link a CXL memory expander device to
an NVDIMM sub-hierarchy when both a persistent memory range has been
registered by the CXL platform driver (cxl_acpi) *and* when that CXL
memory expander has published persistent memory capacity (Get Partition
Info). To this end the cxl_nvdimm_bridge driver arranges to rescan the
CXL bus when either of those conditions change. The helper
bus_rescan_devices() can not be called underneath the device_lock() for
any device on that bus, so the cxl_nvdimm_bridge driver uses a workqueue
for the rescan.

Typically a driver allocates driver data to hold a 'struct work_struct'
for a driven device, but for a workqueue that may run after ->remove()
returns, driver data will have been freed. The 'struct
cxl_nvdimm_bridge' object holds the state and work_struct directly.
Unfortunately it was only arranging for that infrastructure to be
initialized once per device creation rather than the necessary once per
workqueue (cxl_pmem_wq) creation.

Introduce is_cxl_nvdimm_bridge() and cxl_nvdimm_bridge_reset() in
support of invalidating stale references to a recently destroyed
cxl_pmem_wq.

Cc: <stable@vger.kernel.org>
Fixes: 8fdcb1704f ("cxl/pmem: Add initial infrastructure for pmem support")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/163665474585.3505991.8397182770066720755.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:03:00 -08:00
Linus Torvalds dd72945c43 cxl for v5.16
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
   Host Bridge) in the CHBS (ACPI Host Bridge Structure).
 
 - Introduce a common pci_find_dvsec_capability() helper, clean up open
   coded implementations in various drivers.
 
 - Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test'
   is a module built from tools/testing/cxl/ that mocks up a CXL topology
   to augment the nascent support for emulation of CXL devices in QEMU.
 
 - Convert libnvdimm to use the uuid API.
 
 - Complete the definition of CXL namespace labels in libnvdimm.
 
 - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
   mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
 
 - Continue to sort and refactor functionality into distinct driver and
   core-infrastructure buckets. For example, mailbox handling is now a
   generic core capability consumed by the PCI and cxl_test drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYYRtyQAKCRDfioYZHlFs
 Z7UsAP9DzUN6IWWnYk1R95YXYVxFriRtRsBjujAqTg49EMghawEAoHaA9lxO3Hho
 l25TLYUOmB/zFTlUbe6YQptMJZ5YLwY=
 =im9j
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "More preparation and plumbing work in the CXL subsystem.

  From an end user perspective the highlight here is lighting up the CXL
  Persistent Memory related commands (label read / write) with the
  generic ioctl() front-end in LIBNVDIMM.

  Otherwise, the ability to instantiate new persistent and volatile
  memory regions is still on track for v5.17.

  Summary:

   - Fix support for platforms that do not enumerate every ACPI0016 (CXL
     Host Bridge) in the CHBS (ACPI Host Bridge Structure).

   - Introduce a common pci_find_dvsec_capability() helper, clean up
     open coded implementations in various drivers.

   - Add 'cxl_test' for regression testing CXL subsystem ABIs.
     'cxl_test' is a module built from tools/testing/cxl/ that mocks up
     a CXL topology to augment the nascent support for emulation of CXL
     devices in QEMU.

   - Convert libnvdimm to use the uuid API.

   - Complete the definition of CXL namespace labels in libnvdimm.

   - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
     mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.

   - Continue to sort and refactor functionality into distinct driver
     and core-infrastructure buckets. For example, mailbox handling is
     now a generic core capability consumed by the PCI and cxl_test
     drivers"

* tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits)
  ocxl: Use pci core's DVSEC functionality
  cxl/pci: Use pci core's DVSEC functionality
  PCI: Add pci_find_dvsec_capability to find designated VSEC
  cxl/pci: Split cxl_pci_setup_regs()
  cxl/pci: Add @base to cxl_register_map
  cxl/pci: Make more use of cxl_register_map
  cxl/pci: Remove pci request/release regions
  cxl/pci: Fix NULL vs ERR_PTR confusion
  cxl/pci: Remove dev_dbg for unknown register blocks
  cxl/pci: Convert register block identifiers to an enum
  cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
  cxl/pci: Disambiguate cxl_pci further from cxl_mem
  Documentation/cxl: Add bus internal docs
  cxl/core: Split decoder setup into alloc + add
  tools/testing/cxl: Introduce a mock memory device + driver
  cxl/mbox: Move command definitions to common location
  cxl/bus: Populate the target list at decoder create
  tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  cxl/pmem: Add support for multiple nvdimm-bridge objects
  cxl/pmem: Translate NVDIMM label commands to CXL label commands
  ...
2021-11-08 11:49:48 -08:00
Dan Williams a261e9a157 cxl/pci: Add @base to cxl_register_map
In addition to carrying @barno, @block_offset, and @reg_type, add @base
to keep all map/unmap parameters in one object. The helpers
cxl_{map,unmap}_regblock() handle adjusting @base to the @block_offset
at map and unmap time.

Document that @base incorporates @block_offset so that downstream
consumers of a mapped cxl_register_map instance do not need perform any
fixups / can use @base directly.

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163433497228.889435.11271988238496181536.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:51 -07:00
Kees Cook 301e68dd9b cxl/core: Replace unions with struct_group()
Use the newly introduced struct_group_typed() macro to clean up the
declaration of struct cxl_regs.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: linux-cxl@vger.kernel.org
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2021-09-25 08:20:47 -07:00
Dan Williams 48667f6761 cxl/core: Split decoder setup into alloc + add
The kbuild robot reports:

    drivers/cxl/core/bus.c:516:1: warning: stack frame size (1032) exceeds
    limit (1024) in function 'devm_cxl_add_decoder'

It is also the case the devm_cxl_add_decoder() is unwieldy to use for
all the different decoder types. Fix the stack usage by splitting the
creation into alloc and add steps. This also allows for context
specific construction before adding.

With the split the caller is responsible for registering a devm callback
to trigger device_unregister() for the decoder rather than it being
implicit in the decoder registration. I.e. the routine that calls alloc
is responsible for calling put_device() if the "add" operation fails.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163225205828.3038145.6831131648369404859.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams 7d3eb23c4c tools/testing/cxl: Introduce a mock memory device + driver
Introduce an emulated device-set plus driver to register CXL memory
devices, 'struct cxl_memdev' instances, in the mock cxl_test topology.
This enables the development of HDM Decoder (Host-managed Device Memory
Decoder) programming flow (region provisioning) in an environment that
can be updated alongside the kernel as it gains more functionality.

Whereas the cxl_pci module looks for CXL memory expanders on the 'pci'
bus, the cxl_mock_mem module attaches to CXL expanders on the platform
bus emitted by cxl_test.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116440099.2460985.10692549614409346604.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams a5c2580216 cxl/bus: Populate the target list at decoder create
As found by cxl_test, the implementation populated the target_list for
the single dport exceptional case, it missed populating the target_list
for the typical multi-dport case. Root decoders always know their target
list at the beginning of time, and even switch-level decoders should
have a target list of one or more zeros by default, depending on the
interleave-ways setting.

Walk the hosting port's dport list and populate based on the passed in
map.

Move devm_cxl_add_passthrough_decoder() out of line now that it does the
work of generating a target_map.

Before:
$ cat /sys/bus/cxl/devices/root2/decoder*/target_list
0

0

After:
$ cat /sys/bus/cxl/devices/root2/decoder*/target_list
0
0,1,2,3
0
0,1,2,3

Where root2 is a CXL topology root object generated by 'cxl_test'.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116439000.2460985.11713777051267946018.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams 67dcdd4d3b tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
Create an environment for CXL plumbing unit tests. Especially when it
comes to an algorithm for HDM Decoder (Host-managed Device Memory
Decoder) programming, the availability of an in-kernel-tree emulation
environment for CXL configuration complexity and corner cases speeds
development and deters regressions.

The approach taken mirrors what was done for tools/testing/nvdimm/. I.e.
an external module, cxl_test.ko built out of the tools/testing/cxl/
directory, provides mock implementations of kernel APIs and kernel
objects to simulate a real world device hierarchy.

One feedback for the tools/testing/nvdimm/ proposal was "why not do this
in QEMU?". In fact, the CXL development community has developed a QEMU
model for CXL [1]. However, there are a few blocking issues that keep
QEMU from being a tight fit for topology + provisioning unit tests:

1/ The QEMU community has yet to show interest in merging any of this
   support that has had patches on the list since November 2020. So,
   testing CXL to date involves building custom QEMU with out-of-tree
   patches.

2/ CXL mechanisms like cross-host-bridge interleave do not have a clear
   path to be emulated by QEMU without major infrastructure work. This
   is easier to achieve with the alloc_mock_res() approach taken in this
   patch to shortcut-define emulated system physical address ranges with
   interleave behavior.

The QEMU enabling has been critical to get the driver off the ground,
and may still move forward, but it does not address the ongoing needs of
a regression testing environment and test driven development.

This patch adds an ACPI CXL Platform definition with emulated CXL
multi-ported host-bridges. A follow on patch adds emulated memory
expander devices.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20210202005948.241655-1-ben.widawsky@intel.com [1]
Link: https://lore.kernel.org/r/163164680798.2831381.838684634806668012.stgit@dwillia2-desk3.amr.corp.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:10 -07:00
Dan Williams 2e52b6256b cxl/pmem: Add support for multiple nvdimm-bridge objects
In preparation for a mocked unit test environment for CXL objects, allow
for multiple unique nvdimm-bridge objects.

For now, just allow multiple bridges to be registered. Later, when there
are multiple present, further updates are needed to
cxl_find_nvdimm_bridge() to identify which bridge is associated with
which CXL hierarchy for nvdimm registration.

Note that this does change the kernel device-name for the bridge object.
User space should not have any attachment to the device name at this
point as it is still early days in the CXL driver development.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163164647007.2831228.2150246954620721526.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:10 -07:00
Ben Widawsky 5b68705d1e cxl/pci: Simplify register setup
It is desirable to retain the mappings from the calling function. By
simplifying this code, it will be much more straightforward to do that.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20210716231548.174778-3-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:27:02 -07:00
Dan Williams 21083f5152 cxl/pmem: Register 'pmem' / cxl_nvdimm devices
While a memX device on /sys/bus/cxl represents a CXL memory expander
control interface, a pmemX device represents the persistent memory
sub-functionality. It bridges the CXL subystem to the libnvdimm nmemX
control interface.

With this skeleton ndctl can now see persistent memory devices on a
"CXL" bus. Later patches add support for translating libnvdimm native
commands to CXL commands.

# ndctl list -BDiu -b CXL
{
  "provider":"CXL",
  "dev":"ndbus1",
  "dimms":[
    {
      "dev":"nmem1",
      "state":"disabled"
    },
    {
      "dev":"nmem0",
      "state":"disabled"
    }
  ]
}

Given nvdimm_bus_unregister() removes all devices on an ndbus0 the
cxl_pmem infrastructure needs to arrange ->remove() to be triggered on
cxl_nvdimm devices to keep their enabled state synchronized with the
registration state of their corresponding device on the nvdimm_bus. In
other words, always arrange for cxl_nvdimm_driver.remove() to unregister
nvdimms from an nvdimm_bus ahead of the bus being unregistered.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162380012696.3039556.4293801691038740850.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:47:34 -07:00
Dan Williams 8fdcb1704f cxl/pmem: Add initial infrastructure for pmem support
Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm
bus hierarchy. Also, flesh out the cxl_bus definition to allow a
cxl_nvdimm_bridge_driver to attach to the bridge and trigger the
nvdimm-bus registration.

The creation of the bridge is gated on the detection of a PMEM capable
address space registered to the root. The bridge indirection allows the
libnvdimm module to remain unloaded on platforms without PMEM support.

Given that the probing of ACPI0017 is asynchronous to CXL endpoint
devices, and the expectation that CXL endpoint devices register other
PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The
workqueue is needed to run bus_rescan_devices() outside of the
device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources
as they arrive. For now only the bus is taken online/offline in the
workqueue.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162379909706.2993820.14051258608641140169.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:47:14 -07:00
Dan Williams 6af7139c97 cxl/core: Add cxl-bus driver infrastructure
Enable devices on the 'cxl' bus to be attached to drivers. The initial
user of this functionality is a driver for an 'nvdimm-bridge' device
that anchors a libnvdimm hierarchy attached to CXL persistent memory
resources. Other device types that will leverage this include:

cxl_port: map and use component register functionality (HDM Decoders)

cxl_nvdimm: translate CXL memory expander endpoints to libnvdimm
	    'nvdimm' objects

cxl_region: translate CXL interleave sets to libnvdimm 'region' objects

The pairing of devices to drivers is handled through the cxl_device_id()
matching to cxl_driver.id values. A cxl_device_id() of '0' indicates no
driver support.

In addition to ->match(), ->probe(), and ->remove() support for the
'cxl' bus introduce MODULE_ALIAS_CXL() to autoload modules containing
cxl-drivers. Drivers are added in follow-on changes.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162379909190.2993820.6134168109678004186.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:46:34 -07:00
Ben Widawsky 6423035fd2 cxl/hdm: Fix decoder count calculation
The decoder count in the HDM decoder capability structure is an encoded
field. As defined in the spec:

Decoder Count: Reports the number of memory address decoders implemented
by the component.
0 – 1 Decoder
1 – 2 Decoders
2 – 4 Decoders
3 – 6 Decoders
4 – 8 Decoders
5 – 10 Decoders
All other values are reserved

Nothing is actually fixed by this as nothing actually used this mapping
yet.

Cc: Ira Weiny <ira.weiny@intel.com>
Fixes: 08422378c4 ("cxl/pci: Add HDM decoder capabilities")
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210611190111.121295-1-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-12 10:29:03 -07:00
Dan Williams 40ba17afdf cxl/acpi: Introduce cxl_decoder objects
A cxl_decoder is a child of a cxl_port. It represents a hardware decoder
configuration of an upstream port to one or more of its downstream
ports. The decoder is either represented in CXL standard HDM decoder
registers (see CXL 2.0 section 8.2.5.12 CXL HDM Decoder Capability
Structure), or it is a static decode configuration communicated by
platform firmware (see the CXL Early Discovery Table: Fixed Memory
Window Structure).

The firmware described and hardware described decoders differ slightly
leading to 2 different sub-types of decoders, cxl_decoder_root and
cxl_decoder_switch. At the root level the decode capabilities restrict
what can be mapped beneath them. Mid-level switch decoders are
configured for either acclerator (type-2) or memory-expander (type-3)
operation, but they are otherwise agnostic to the type of memory
(volatile vs persistent) being mapped.

Here is an example topology from a single-ported host-bridge environment
without CFMWS decodes enumerated.

    /sys/bus/cxl/devices/root0
    ├── devtype
    ├── dport0 -> ../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00
    ├── port1
    │   ├── decoder1.0
    │   │   ├── devtype
    │   │   ├── locked
    │   │   ├── size
    │   │   ├── start
    │   │   ├── subsystem -> ../../../../../../bus/cxl
    │   │   ├── target_list
    │   │   ├── target_type
    │   │   └── uevent
    │   ├── devtype
    │   ├── dport0 -> ../../../../pci0000:34/0000:34:00.0
    │   ├── subsystem -> ../../../../../bus/cxl
    │   ├── uevent
    │   └── uport -> ../../../../LNXSYSTM:00/LNXSYBUS:00/ACPI0016:00
    ├── subsystem -> ../../../../bus/cxl
    ├── uevent
    └── uport -> ../../ACPI0017:00

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162325695128.2293823.17519927266014762694.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:39 -07:00
Dan Williams 7d4b5ca2e2 cxl/acpi: Add downstream port data to cxl_port instances
In preparation for infrastructure that enumerates and configures the CXL
decode mechanism of an upstream port to its downstream ports, add a
representation of a CXL downstream port.

On ACPI systems the top-most logical downstream ports in the hierarchy
are the host bridges (ACPI0016 devices) that decode the memory windows
described by the CXL Early Discovery Table Fixed Memory Window
Structures (CEDT.CFMWS).

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/162325450624.2293126.3533006409920271718.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:39 -07:00
Dan Williams 4812be97c0 cxl/acpi: Introduce the root of a cxl_port topology
While CXL builds upon the PCI software model for enumeration and
endpoint control, a static platform component is required to bootstrap
the CXL memory layout. Similar to how ACPI identifies root-level PCI
memory resources, ACPI data enumerates the address space and interleave
configuration for CXL Memory.

In addition to identifying host bridges, ACPI is responsible for
enumerating the CXL memory space that can be addressed by downstream
decoders. This is similar to the requirement for ACPI to publish
resources via the _CRS method for PCI host bridges. Specifically, ACPI
publishes a table, CXL Early Discovery Table (CEDT), which includes a
list of CXL Memory resources, CXL Fixed Memory Window Structures
(CFMWS).

For now, introduce the core infrastructure for a cxl_port hierarchy
starting with a root level anchor represented by the ACPI0017 device.

Follow on changes model support for the configurable decode capabilities
of cxl_port instances, i.e. CXL switch support.

Co-developed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162325449515.2293126.15303270193010154608.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-09 18:02:38 -07:00
Ben Widawsky 08422378c4 cxl/pci: Add HDM decoder capabilities
An HDM decoder is defined in the CXL 2.0 specification as a mechanism
that allow devices and upstream ports to claim memory address ranges and
participate in interleave sets. HDM decoder registers are within the
component register block defined in CXL 2.0 8.2.3 CXL 2.0 Component
Registers as part of the CXL.cache and CXL.mem subregion.

The Component Register Block is found via the Register Locator DVSEC
in a similar fashion to how the CXL Device Register Block is found. The
primary difference is the capability id size of the Component Register
Block is a single DWORD instead of 4 DWORDS.

It's now possible to configure a CXL type 3 device's HDM decoder. Such
programming is expected for CXL devices with persistent memory, and hot
plugged CXL devices that participate in CXL.mem with volatile memory.

Add probe and mapping functions for the component register blocks.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Co-developed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20210528004922.3980613-6-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-05 17:39:12 -07:00
Ira Weiny 30af97296f cxl/pci: Map registers based on capabilities
The information required to map registers based on capabilities is
contained within the bars themselves.  This means the bar must be mapped
to read the information needed and then unmapped to map the individual
parts of the BAR based on capabilities.

Change cxl_setup_device_regs() to return a new cxl_register_map, change
the name to cxl_probe_device_regs().  Allocate and place
cxl_register_maps on a list while processing all of the specified
register blocks.

After probing all the register blocks go back and map smaller registers
blocks based on their capabilities and dispose of the cxl_register_maps.

NOTE: pci_iomap() is not managed automatically via pcim_enable_device()
so be careful to call pci_iounmap() correctly.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210604005036.4187184-1-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-05 17:37:16 -07:00
Dan Williams 399d34ebc2 cxl/core: Refactor CXL register lookup for bridge reuse
While CXL Memory Device endpoints locate the CXL MMIO registers in a PCI
BAR, CXL root bridges have their MMIO base address described by platform
firmware. Refactor the existing register lookup into a generic facility
for endpoints and bridges to share.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162096972534.1865304.3218686216153688039.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-14 16:13:19 -07:00
Dan Williams 8ac75dd6ab cxl/mem: Introduce 'struct cxl_regs' for "composable" CXL devices
CXL MMIO register blocks are organized by device type and capabilities.
There are Component registers, Device registers (yes, an ambiguous
name), and Memory Device registers (a specific extension of Device
registers).

It is possible for a given device instance (endpoint or port) to
implement register sets from multiple of the above categories.

The driver code that enumerates and maps the registers is type specific
so it is useful to have a dedicated type and helpers for each block
type.

At the same time, once the registers are mapped the origin type does not
matter. It is overly pedantic to reference the register block type in
code that is using the registers.

In preparation for the endpoint driver to incorporate Component registers
into its MMIO operations reorganize the registers to allow typed
enumeration + mapping, but anonymous usage. With the end state of
'struct cxl_regs' to be:

struct cxl_regs {
	union {
		struct {
			CXL_DEVICE_REGS();
		};
		struct cxl_device_regs device_regs;
	};
	union {
		struct {
			CXL_COMPONENT_REGS();
		};
		struct cxl_component_regs component_regs;
	};
};

With this arrangement the driver can share component init code with
ports, but when using the registers it can directly reference the
component register block type by name without the 'component_regs'
prefix.

So, map + enumerate can be shared across drivers of different CXL
classes e.g.:

void cxl_setup_device_regs(struct device *dev, void __iomem *base,
			   struct cxl_device_regs *regs);

void cxl_setup_component_regs(struct device *dev, void __iomem *base,
			      struct cxl_component_regs *regs);

...while inline usage in the driver need not indicate where the
registers came from:

readl(cxlm->regs.mbox + MBOX_OFFSET);
readl(cxlm->regs.hdm + HDM_OFFSET);

...instead of:

readl(cxlm->regs.device_regs.mbox + MBOX_OFFSET);
readl(cxlm->regs.component_regs.hdm + HDM_OFFSET);

This complexity of the definition in .h yields improvement in code
readability in .c while maintaining type-safety for organization of
setup code. It prepares the implementation to maintain organization in
the face of CXL devices that compose register interfaces consisting of
multiple types.

Given that this new container is named 'regs' rename the common register
base pointer @base, and fixup the kernel-doc for the missing @cxlmd
description.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/162096971451.1865304.13540251513463515153.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-14 16:13:19 -07:00
Dan Williams 5f50d6b20c cxl/mem: Move some definitions to mem.h
In preparation for sharing cxl.h with other generic CXL consumers,
move / consolidate some of the memory device specifics to mem.h.

The motivation for moving out of cxl.h is to maintain least privilege
access to memory-device details since cxl.h is used in multiple files.
The motivation for moving definitions into a new mem.h header is for
code readability and organization. I.e. minimize implementation details
when reading data structures and other definitions.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162096970932.1865304.14510894426562947262.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-14 16:13:18 -07:00
Ben Widawsky 472b1ce6e9 cxl/mem: Enable commands via CEL
CXL devices identified by the memory-device class code must implement
the Device Command Interface (described in 8.2.9 of the CXL 2.0 spec).
While the driver already maintains a list of commands it supports, there
is still a need to be able to distinguish between commands that the
driver knows about from commands that are optionally supported by the
hardware.

The Command Effects Log (CEL) is specified in the CXL 2.0 specification.
The CEL is one of two types of logs, the other being vendor specific.
They are distinguished in hardware/spec via UUID. The CEL is useful for
2 things:
1. Determine which optional commands are supported by the CXL device.
2. Enumerate any vendor specific commands

The CEL is used by the driver to determine which commands are available
in the hardware and therefore which commands userspace is allowed to
execute. The set of enabled commands might be a subset of commands which
are advertised in UAPI via CXL_MEM_SEND_COMMAND IOCTL.

With the CEL enabling comes a internal flag to indicate a base set of
commands that are enabled regardless of CEL. Such commands are required
for basic interaction with the hardware and thus can be useful in debug
cases, for example if the CEL is corrupted.

The implementation leaves the statically defined table of commands and
supplements it with a bitmap to determine commands that are enabled.
This organization was chosen for the following reasons:
- Smaller memory footprint. Doesn't need a table per device.
- Reduce memory allocation complexity.
- Fixed command IDs to opcode mapping for all devices makes development
  and debugging easier.
- Certain helpers are easily achievable, like cxl_for_each_cmd().

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v3)
Link: https://lore.kernel.org/r/20210217040958.1354670-7-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-02-16 20:36:38 -08:00
Dan Williams b39cb1052a cxl/mem: Register CXL memX devices
Create the /sys/bus/cxl hierarchy to enumerate:

* Memory Devices (per-endpoint control devices)

* Memory Address Space Devices (platform address ranges with
  interleaving, performance, and persistence attributes)

* Memory Regions (active provisioned memory from an address space device
  that is in use as System RAM or delegated to libnvdimm as Persistent
  Memory regions).

For now, only the per-endpoint control devices are registered on the
'cxl' bus. However, going forward it will provide a mechanism to
coordinate cross-device interleave.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v2)
Link: https://lore.kernel.org/r/20210217040958.1354670-4-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-02-16 20:36:38 -08:00
Ben Widawsky 8adaf747c9 cxl/mem: Find device capabilities
Provide enough functionality to utilize the mailbox of a memory device.
The mailbox is used to interact with the firmware running on the memory
device. The flow is proven with one implemented command, "identify".
Because the class code has already told the driver this is a memory
device and the identify command is mandatory.

CXL devices contain an array of capabilities that describe the
interactions software can have with the device or firmware running on
the device. A CXL compliant device must implement the device status and
the mailbox capability. Additionally, a CXL compliant memory device must
implement the memory device capability. Each of the capabilities can
[will] provide an offset within the MMIO region for interacting with the
CXL device.

The capabilities tell the driver how to find and map the register space
for CXL Memory Devices. The registers are required to utilize the CXL
spec defined mailbox interface. The spec outlines two mailboxes, primary
and secondary. The secondary mailbox is earmarked for system firmware,
and not handled in this driver.

Primary mailboxes are capable of generating an interrupt when submitting
a background command. That implementation is saved for a later time.

Reported-by: Colin Ian King <colin.king@canonical.com> (coverity)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> (smatch)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Link: https://www.computeexpresslink.org/download-the-specification
Link: https://lore.kernel.org/r/20210217040958.1354670-3-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-02-16 20:36:38 -08:00