Merge branch 'for-6.3/cxl-ram-region' into cxl/next

Include the support for enumerating and provisioning ram regions for
v6.3. This also includes a default policy change for ram / volatile
device-dax instances to assign them to the dax_kmem driver by default.
This commit is contained in:
Dan Williams 2023-02-10 18:11:01 -08:00
Parents dfd423e0a3 09d09e04d2
Commit b8b9ffced0
29 changed files: 1491 additions and 320 deletions
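To make the new flow concrete, below is a minimal userspace sketch (not part of this commit) that provisions a volatile region against the sysfs ABI documented in the first file of this diff. The names decoder0.0/decoder2.0, the 256MiB size, and the exact attribute ordering are illustrative assumptions; real names depend on the topology.

/* sketch: provision a CXL ram region via sysfs; all paths/names assumed */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void sysfs_write(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0 || write(fd, val, strlen(val)) < 0) {
		perror(path);
		exit(1);
	}
	close(fd);
}

int main(void)
{
	char region[64] = { 0 };
	char path[128];
	int fd = open("/sys/bus/cxl/devices/decoder0.0/create_ram_region",
		      O_RDONLY);

	if (fd < 0 || read(fd, region, sizeof(region) - 1) < 0) {
		perror("create_ram_region");
		return 1;
	}
	close(fd);
	region[strcspn(region, "\n")] = '\0';

	/* set the endpoint decoder to 'ram' mode and give it DPA capacity */
	sysfs_write("/sys/bus/cxl/devices/decoder2.0/mode", "ram");
	sysfs_write("/sys/bus/cxl/devices/decoder2.0/dpa_size", "268435456");

	/* claim the region name, then shape and commit the region */
	sysfs_write("/sys/bus/cxl/devices/decoder0.0/create_ram_region", region);
	snprintf(path, sizeof(path),
		 "/sys/bus/cxl/devices/%s/interleave_granularity", region);
	sysfs_write(path, "256");
	snprintf(path, sizeof(path),
		 "/sys/bus/cxl/devices/%s/interleave_ways", region);
	sysfs_write(path, "1");
	snprintf(path, sizeof(path), "/sys/bus/cxl/devices/%s/size", region);
	sysfs_write(path, "268435456");
	snprintf(path, sizeof(path), "/sys/bus/cxl/devices/%s/target0", region);
	sysfs_write(path, "decoder2.0");
	snprintf(path, sizeof(path), "/sys/bus/cxl/devices/%s/commit", region);
	sysfs_write(path, "1");

	printf("%s committed\n", region);
	return 0;
}

When the region commits, the cxl_dax_region driver added by this merge creates a device-dax instance that, under the new default policy, binds dax_kmem and surfaces the capacity as System RAM.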


@@ -198,7 +198,7 @@ Description:

 What:		/sys/bus/cxl/devices/endpointX/CDAT
 Date:		July, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RO) If this sysfs entry is not present no DOE mailbox was

@@ -209,7 +209,7 @@ Description:

 What:		/sys/bus/cxl/devices/decoderX.Y/mode
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it

@@ -229,7 +229,7 @@ Description:

 What:		/sys/bus/cxl/devices/decoderX.Y/dpa_resource
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",

@@ -240,7 +240,7 @@ Description:

 What:		/sys/bus/cxl/devices/decoderX.Y/dpa_size
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it

@@ -260,7 +260,7 @@ Description:

 What:		/sys/bus/cxl/devices/decoderX.Y/interleave_ways
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RO) The number of targets across which this decoder's host

@@ -275,7 +275,7 @@ Description:

 What:		/sys/bus/cxl/devices/decoderX.Y/interleave_granularity
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RO) The number of consecutive bytes of host physical address

@@ -285,25 +285,25 @@ Description:
 	interleave_granularity).

-What:		/sys/bus/cxl/devices/decoderX.Y/create_pmem_region
-Date:		May, 2022
-KernelVersion:	v5.20
+What:		/sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region
+Date:		May, 2022, January, 2023
+KernelVersion:	v6.0 (pmem), v6.3 (ram)
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) Write a string in the form 'regionZ' to start the process
-	of defining a new persistent memory region (interleave-set)
-	within the decode range bounded by root decoder 'decoderX.Y'.
-	The value written must match the current value returned from
-	reading this attribute. An atomic compare exchange operation is
-	done on write to assign the requested id to a region and
-	allocate the region-id for the next creation attempt. EBUSY is
-	returned if the region name written does not match the current
-	cached value.
+	of defining a new persistent, or volatile memory region
+	(interleave-set) within the decode range bounded by root decoder
+	'decoderX.Y'. The value written must match the current value
+	returned from reading this attribute. An atomic compare exchange
+	operation is done on write to assign the requested id to a
+	region and allocate the region-id for the next creation attempt.
+	EBUSY is returned if the region name written does not match the
+	current cached value.
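The compare-exchange protocol above can be driven from a small helper; a hedged sketch (the decoder path is an assumption, the EBUSY semantics are from the ABI text above):

/* read the next region name, try to claim it, retry if another agent won */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int claim_region(const char *attr, char *region, size_t len)
{
	for (;;) {
		ssize_t n;
		int fd = open(attr, O_RDONLY);

		if (fd < 0)
			return -errno;
		n = read(fd, region, len - 1);
		close(fd);
		if (n < 0)
			return -errno;
		region[strcspn(region, "\n")] = '\0';

		fd = open(attr, O_WRONLY);
		if (fd < 0)
			return -errno;
		n = write(fd, region, strlen(region));
		close(fd);
		if (n >= 0)
			return 0;	/* the name in @region is ours */
		if (errno != EBUSY)
			return -errno;	/* real failure */
		/* lost the race: re-read the new cached value and retry */
	}
}

For example, claim_region("/sys/bus/cxl/devices/decoder0.0/create_ram_region", name, sizeof(name)) returns 0 with the claimed region name in @name.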
 What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(WO) Write a string in the form 'regionZ' to delete that region,

@@ -312,17 +312,18 @@ Description:

 What:		/sys/bus/cxl/devices/regionZ/uuid
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) Write a unique identifier for the region. This field must
 	be set for persistent regions and it must not conflict with the
-	UUID of another region.
+	UUID of another region. For volatile ram regions this
+	attribute is a read-only empty string.

 What:		/sys/bus/cxl/devices/regionZ/interleave_granularity
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) Set the number of consecutive bytes each device in the

@@ -333,7 +334,7 @@ Description:

 What:		/sys/bus/cxl/devices/regionZ/interleave_ways
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) Configures the number of devices participating in the

@@ -343,7 +344,7 @@ Description:

 What:		/sys/bus/cxl/devices/regionZ/size
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) System physical address space to be consumed by the region.

@@ -358,9 +359,20 @@ Description:
 	results in the same address being allocated.

+What:		/sys/bus/cxl/devices/regionZ/mode
+Date:		January, 2023
+KernelVersion:	v6.3
+Contact:	linux-cxl@vger.kernel.org
+Description:
+	(RO) The mode of a region is established at region creation time
+	and dictates the mode of the endpoint decoders that comprise the
+	region. For more details on the possible modes see
+	/sys/bus/cxl/devices/decoderX.Y/mode
+
 What:		/sys/bus/cxl/devices/regionZ/resource
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RO) A region is a contiguous partition of a CXL root decoder

@@ -372,7 +384,7 @@ Description:

 What:		/sys/bus/cxl/devices/regionZ/target[0..N]
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) Write an endpoint decoder object name to 'targetX' where X

@@ -391,7 +403,7 @@ Description:

 What:		/sys/bus/cxl/devices/regionZ/commit
 Date:		May, 2022
-KernelVersion:	v5.20
+KernelVersion:	v6.0
 Contact:	linux-cxl@vger.kernel.org
 Description:
 	(RW) Write a boolean 'true' string value to this attribute to


@@ -6034,6 +6034,7 @@ M:	Dan Williams <dan.j.williams@intel.com>
 M:	Vishal Verma <vishal.l.verma@intel.com>
 M:	Dave Jiang <dave.jiang@intel.com>
 L:	nvdimm@lists.linux.dev
+L:	linux-cxl@vger.kernel.org
 S:	Supported
 F:	drivers/dax/


@@ -718,7 +718,7 @@ static void hmat_register_target_devices(struct memory_target *target)
 	for (res = target->memregions.child; res; res = res->sibling) {
 		int target_nid = pxm_to_node(target->memory_pxm);

-		hmem_register_device(target_nid, res);
+		hmem_register_resource(target_nid, res);
 	}
 }

@@ -869,4 +869,4 @@ out_put:
 	acpi_put_table(tbl);
 	return 0;
 }
-device_initcall(hmat_init);
+subsys_initcall(hmat_init);


@@ -104,12 +104,22 @@ config CXL_SUSPEND
 	depends on SUSPEND && CXL_MEM

 config CXL_REGION
-	bool
+	bool "CXL: Region Support"
 	default CXL_BUS
 	# For MAX_PHYSMEM_BITS
 	depends on SPARSEMEM
 	select MEMREGION
 	select GET_FREE_REGION
+	help
+	  Enable the CXL core to enumerate and provision CXL regions. A CXL
+	  region is defined by one or more CXL expanders that decode a given
+	  system-physical address range. For CXL regions established by
+	  platform-firmware this option enables memory error handling to
+	  identify the devices participating in a given interleaved memory
+	  range. Otherwise, platform-firmware managed CXL is enabled by being
+	  placed in the system address map and does not need a driver.
+
+	  If unsure say 'y'

 config CXL_REGION_INVALIDATION_TEST
 	bool "CXL: Region Cache Management Bypass (TEST)"


@@ -731,7 +731,8 @@ static void __exit cxl_acpi_exit(void)
 	cxl_bus_drain();
 }

-module_init(cxl_acpi_init);
+/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
+subsys_initcall(cxl_acpi_init);
 module_exit(cxl_acpi_exit);
 MODULE_LICENSE("GPL v2");
 MODULE_IMPORT_NS(CXL);
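The initcall promotion works because initcall level, not link order, decides cross-driver ordering: every subsys_initcall (level 4) runs before any device_initcall (level 6). A toy built-in illustration (not from the commit) of that guarantee:

#include <linux/init.h>
#include <linux/printk.h>

static int __init claim_ranges_early(void)
{
	/* CXL-style driver claims its address ranges first */
	pr_info("subsys_initcall runs first\n");
	return 0;
}
subsys_initcall(claim_ranges_early);

static int __init consume_leftovers(void)
{
	/* dax_hmem-style fallback only sees what is left over */
	pr_info("device_initcall runs second\n");
	return 0;
}
device_initcall(consume_leftovers);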


@@ -11,15 +11,18 @@ extern struct attribute_group cxl_base_attribute_group;

 #ifdef CONFIG_CXL_REGION
 extern struct device_attribute dev_attr_create_pmem_region;
+extern struct device_attribute dev_attr_create_ram_region;
 extern struct device_attribute dev_attr_delete_region;
 extern struct device_attribute dev_attr_region;
 extern const struct device_type cxl_pmem_region_type;
+extern const struct device_type cxl_dax_region_type;
 extern const struct device_type cxl_region_type;
 void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
 #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
 #define CXL_REGION_TYPE(x) (&cxl_region_type)
 #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
 #define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
+#define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
 int cxl_region_init(void);
 void cxl_region_exit(void);
 #else

@@ -37,6 +40,7 @@ static inline void cxl_region_exit(void)
 #define CXL_REGION_TYPE(x) NULL
 #define SET_CXL_REGION_ATTR(x)
 #define CXL_PMEM_REGION_TYPE(x) NULL
+#define CXL_DAX_REGION_TYPE(x) NULL
 #endif

 struct cxl_send_command;

@@ -56,9 +60,6 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
 resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
 extern struct rw_semaphore cxl_dpa_rwsem;

-bool is_switch_decoder(struct device *dev);
-struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
-
 int cxl_memdev_init(void);
 void cxl_memdev_exit(void);
 void cxl_mbox_init(void);


@@ -279,7 +279,7 @@ success:
 	return 0;
 }

-static int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
+int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 				resource_size_t base, resource_size_t len,
 				resource_size_t skipped)
 {

@@ -295,6 +295,7 @@ static int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,

 	return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled);
 }
+EXPORT_SYMBOL_NS_GPL(devm_cxl_dpa_reserve, CXL);

 resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled)
 {

@@ -676,6 +677,14 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld)
 	port->commit_end--;
 	cxld->flags &= ~CXL_DECODER_F_ENABLE;

+	/* Userspace is now responsible for reconfiguring this decoder */
+	if (is_endpoint_decoder(&cxld->dev)) {
+		struct cxl_endpoint_decoder *cxled;
+
+		cxled = to_cxl_endpoint_decoder(&cxld->dev);
+		cxled->state = CXL_DECODER_STATE_MANUAL;
+	}
+
 	return 0;
 }

@@ -783,6 +792,9 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 			return rc;
 	}
 	*dpa_base += dpa_size + skip;
+
+	cxled->state = CXL_DECODER_STATE_AUTO;
+
 	return 0;
 }

@@ -826,7 +838,8 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 			cxled = cxl_endpoint_decoder_alloc(port);
 			if (IS_ERR(cxled)) {
 				dev_warn(&port->dev,
-					 "Failed to allocate the decoder\n");
+					 "Failed to allocate decoder%d.%d\n",
+					 port->id, i);
 				return PTR_ERR(cxled);
 			}
 			cxld = &cxled->cxld;

@@ -836,7 +849,8 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 			cxlsd = cxl_switch_decoder_alloc(port, target_count);
 			if (IS_ERR(cxlsd)) {
 				dev_warn(&port->dev,
-					 "Failed to allocate the decoder\n");
+					 "Failed to allocate decoder%d.%d\n",
+					 port->id, i);
 				return PTR_ERR(cxlsd);
 			}
 			cxld = &cxlsd->cxld;

@@ -844,13 +858,16 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)

 		rc = init_hdm_decoder(port, cxld, target_map, hdm, i, &dpa_base);
 		if (rc) {
+			dev_warn(&port->dev,
+				 "Failed to initialize decoder%d.%d\n",
+				 port->id, i);
 			put_device(&cxld->dev);
 			return rc;
 		}
 		rc = add_hdm_decoder(port, cxld, target_map);
 		if (rc) {
 			dev_warn(&port->dev,
-				 "Failed to add decoder to port\n");
+				 "Failed to add decoder%d.%d\n", port->id, i);
 			return rc;
 		}
 	}


@@ -246,6 +246,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 	if (rc < 0)
 		goto err;
 	cxlmd->id = rc;
+	cxlmd->depth = -1;

 	dev = &cxlmd->dev;
 	device_initialize(dev);


@@ -214,11 +214,6 @@ static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
 	return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
 }

-static bool range_contains(struct range *r1, struct range *r2)
-{
-	return r1->start <= r2->start && r1->end >= r2->end;
-}
-
 /* require dvsec ranges to be covered by a locked platform window */
 static int dvsec_range_allowed(struct device *dev, void *arg)
 {

Просмотреть файл

@@ -46,6 +46,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM;
 	if (dev->type == CXL_PMEM_REGION_TYPE())
 		return CXL_DEVICE_PMEM_REGION;
+	if (dev->type == CXL_DAX_REGION_TYPE())
+		return CXL_DEVICE_DAX_REGION;
 	if (is_cxl_port(dev)) {
 		if (is_cxl_root(to_cxl_port(dev)))
 			return CXL_DEVICE_ROOT;

@@ -180,17 +182,7 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr,
 {
 	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);

-	switch (cxled->mode) {
-	case CXL_DECODER_RAM:
-		return sysfs_emit(buf, "ram\n");
-	case CXL_DECODER_PMEM:
-		return sysfs_emit(buf, "pmem\n");
-	case CXL_DECODER_NONE:
-		return sysfs_emit(buf, "none\n");
-	case CXL_DECODER_MIXED:
-	default:
-		return sysfs_emit(buf, "mixed\n");
-	}
+	return sysfs_emit(buf, "%s\n", cxl_decoder_mode_name(cxled->mode));
 }

 static ssize_t mode_store(struct device *dev, struct device_attribute *attr,

@@ -304,6 +296,7 @@ static struct attribute *cxl_decoder_root_attrs[] = {
 	&dev_attr_cap_type3.attr,
 	&dev_attr_target_list.attr,
 	SET_CXL_REGION_ATTR(create_pmem_region)
+	SET_CXL_REGION_ATTR(create_ram_region)
 	SET_CXL_REGION_ATTR(delete_region)
 	NULL,
 };

@@ -315,6 +308,13 @@ static bool can_create_pmem(struct cxl_root_decoder *cxlrd)
 	return (cxlrd->cxlsd.cxld.flags & flags) == flags;
 }

+static bool can_create_ram(struct cxl_root_decoder *cxlrd)
+{
+	unsigned long flags = CXL_DECODER_F_TYPE3 | CXL_DECODER_F_RAM;
+
+	return (cxlrd->cxlsd.cxld.flags & flags) == flags;
+}
+
 static umode_t cxl_root_decoder_visible(struct kobject *kobj, struct attribute *a, int n)
 {
 	struct device *dev = kobj_to_dev(kobj);

@@ -323,7 +323,11 @@ static umode_t cxl_root_decoder_visible(struct kobject *kobj, struct attribute *
 	if (a == CXL_REGION_ATTR(create_pmem_region) && !can_create_pmem(cxlrd))
 		return 0;

-	if (a == CXL_REGION_ATTR(delete_region) && !can_create_pmem(cxlrd))
+	if (a == CXL_REGION_ATTR(create_ram_region) && !can_create_ram(cxlrd))
+		return 0;
+
+	if (a == CXL_REGION_ATTR(delete_region) &&
+	    !(can_create_pmem(cxlrd) || can_create_ram(cxlrd)))
 		return 0;

 	return a->mode;

@@ -444,6 +448,7 @@ bool is_endpoint_decoder(struct device *dev)
 {
 	return dev->type == &cxl_decoder_endpoint_type;
 }
+EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, CXL);

 bool is_root_decoder(struct device *dev)
 {

@@ -455,6 +460,7 @@ bool is_switch_decoder(struct device *dev)
 {
 	return is_root_decoder(dev) || dev->type == &cxl_decoder_switch_type;
 }
+EXPORT_SYMBOL_NS_GPL(is_switch_decoder, CXL);

 struct cxl_decoder *to_cxl_decoder(struct device *dev)
 {

@@ -482,6 +488,7 @@ struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev)
 		return NULL;
 	return container_of(dev, struct cxl_switch_decoder, cxld.dev);
 }
+EXPORT_SYMBOL_NS_GPL(to_cxl_switch_decoder, CXL);

 static void cxl_ep_release(struct cxl_ep *ep)
 {

@@ -1207,6 +1214,7 @@ int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)

 	get_device(&endpoint->dev);
 	dev_set_drvdata(dev, endpoint);
+	cxlmd->depth = endpoint->depth;
 	return devm_add_action_or_reset(dev, delete_endpoint, cxlmd);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_autoremove, CXL);

@@ -1241,50 +1249,55 @@ static void reap_dports(struct cxl_port *port)
 	}
 }

+struct detach_ctx {
+	struct cxl_memdev *cxlmd;
+	int depth;
+};
+
+static int port_has_memdev(struct device *dev, const void *data)
+{
+	const struct detach_ctx *ctx = data;
+	struct cxl_port *port;
+
+	if (!is_cxl_port(dev))
+		return 0;
+
+	port = to_cxl_port(dev);
+	if (port->depth != ctx->depth)
+		return 0;
+
+	return !!cxl_ep_load(port, ctx->cxlmd);
+}
+
 static void cxl_detach_ep(void *data)
 {
 	struct cxl_memdev *cxlmd = data;
-	struct device *iter;

-	for (iter = &cxlmd->dev; iter; iter = grandparent(iter)) {
-		struct device *dport_dev = grandparent(iter);
+	for (int i = cxlmd->depth - 1; i >= 1; i--) {
 		struct cxl_port *port, *parent_port;
+		struct detach_ctx ctx = {
+			.cxlmd = cxlmd,
+			.depth = i,
+		};
+		struct device *dev;
 		struct cxl_ep *ep;
 		bool died = false;

-		if (!dport_dev)
-			break;
-
-		port = find_cxl_port(dport_dev, NULL);
-		if (!port)
-			continue;
-
-		if (is_cxl_root(port)) {
-			put_device(&port->dev);
+		dev = bus_find_device(&cxl_bus_type, NULL, &ctx,
+				      port_has_memdev);
+		if (!dev)
 			continue;
-		}
+		port = to_cxl_port(dev);

 		parent_port = to_cxl_port(port->dev.parent);
 		device_lock(&parent_port->dev);
-		if (!parent_port->dev.driver) {
-			/*
-			 * The bottom-up race to delete the port lost to a
-			 * top-down port disable, give up here, because the
-			 * parent_port ->remove() will have cleaned up all
-			 * descendants.
-			 */
-			device_unlock(&parent_port->dev);
-			put_device(&port->dev);
-			continue;
-		}
-
 		device_lock(&port->dev);
 		ep = cxl_ep_load(port, cxlmd);
 		dev_dbg(&cxlmd->dev, "disconnect %s from %s\n",
 			ep ? dev_name(ep->ep) : "", dev_name(&port->dev));
 		cxl_ep_remove(port, ep);
 		if (ep && !port->dead && xa_empty(&port->endpoints) &&
-		    !is_cxl_root(parent_port)) {
+		    !is_cxl_root(parent_port) && parent_port->dev.driver) {
 			/*
 			 * This was the last ep attached to a dynamically
 			 * enumerated port. Block new cxl_add_ep() and garbage

@@ -1620,6 +1633,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 	}

 	cxlrd->calc_hb = calc_hb;
+	mutex_init(&cxlrd->range_lock);

 	cxld = &cxlsd->cxld;
 	cxld->dev.type = &cxl_decoder_root_type;

@@ -2003,6 +2017,6 @@ static void cxl_core_exit(void)
 	debugfs_remove_recursive(cxl_debugfs);
 }

-module_init(cxl_core_init);
+subsys_initcall(cxl_core_init);
 module_exit(cxl_core_exit);
 MODULE_LICENSE("GPL v2");

File diff suppressed because it is too large.


@@ -277,6 +277,8 @@ resource_size_t cxl_rcrb_to_component(struct device *dev,
  * cxl_decoder flags that define the type of memory / devices this
  * decoder supports as well as configuration lock status See "CXL 2.0
  * 8.2.5.12.7 CXL HDM Decoder 0 Control Register" for details.
+ * Additionally indicate whether decoder settings were autodetected,
+ * user customized.
  */
 #define CXL_DECODER_F_RAM   BIT(0)
 #define CXL_DECODER_F_PMEM  BIT(1)

@@ -336,12 +338,36 @@ enum cxl_decoder_mode {
 	CXL_DECODER_DEAD,
 };

+static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
+{
+	static const char * const names[] = {
+		[CXL_DECODER_NONE] = "none",
+		[CXL_DECODER_RAM] = "ram",
+		[CXL_DECODER_PMEM] = "pmem",
+		[CXL_DECODER_MIXED] = "mixed",
+	};
+
+	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED)
+		return names[mode];
+	return "mixed";
+}
+
+/*
+ * Track whether this decoder is reserved for region autodiscovery, or
+ * free for userspace provisioning.
+ */
+enum cxl_decoder_state {
+	CXL_DECODER_STATE_MANUAL,
+	CXL_DECODER_STATE_AUTO,
+};
+
 /**
  * struct cxl_endpoint_decoder - Endpoint / SPA to DPA decoder
  * @cxld: base cxl_decoder_object
  * @dpa_res: actively claimed DPA span of this decoder
  * @skip: offset into @dpa_res where @cxld.hpa_range maps
  * @mode: which memory type / access-mode-partition this decoder targets
+ * @state: autodiscovery state
  * @pos: interleave position in @cxld.region
  */
 struct cxl_endpoint_decoder {

@@ -349,6 +375,7 @@ struct cxl_endpoint_decoder {
 	struct resource *dpa_res;
 	resource_size_t skip;
 	enum cxl_decoder_mode mode;
+	enum cxl_decoder_state state;
 	int pos;
 };

@@ -382,6 +409,7 @@ typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
  * @region_id: region id for next region provisioning event
  * @calc_hb: which host bridge covers the n'th position by granularity
  * @platform_data: platform specific configuration data
+ * @range_lock: sync region autodiscovery by address range
  * @cxlsd: base cxl switch decoder
  */
 struct cxl_root_decoder {

@@ -389,6 +417,7 @@ struct cxl_root_decoder {
 	atomic_t region_id;
 	cxl_calc_hb_fn calc_hb;
 	void *platform_data;
+	struct mutex range_lock;
 	struct cxl_switch_decoder cxlsd;
 };

@@ -438,6 +467,13 @@ struct cxl_region_params {
  */
 #define CXL_REGION_F_INCOHERENT 0

+/*
+ * Indicate whether this region has been assembled by autodetection or
+ * userspace assembly. Prevent endpoint decoders outside of automatic
+ * detection from being added to the region.
+ */
+#define CXL_REGION_F_AUTO 1
+
 /**
  * struct cxl_region - CXL region
  * @dev: This region's device

@@ -493,6 +529,12 @@ struct cxl_pmem_region {
 	struct cxl_pmem_region_mapping mapping[];
 };

+struct cxl_dax_region {
+	struct device dev;
+	struct cxl_region *cxlr;
+	struct range hpa_range;
+};
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory

@@ -633,8 +675,10 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port,

 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev);
+struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
 struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
+bool is_switch_decoder(struct device *dev);
 bool is_endpoint_decoder(struct device *dev);
 struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 						unsigned int nr_targets,

@@ -685,6 +729,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 #define CXL_DEVICE_MEMORY_EXPANDER	5
 #define CXL_DEVICE_REGION		6
 #define CXL_DEVICE_PMEM_REGION		7
+#define CXL_DEVICE_DAX_REGION		8

 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"

@@ -701,6 +746,9 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct device *dev);
 #ifdef CONFIG_CXL_REGION
 bool is_cxl_pmem_region(struct device *dev);
 struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
+int cxl_add_to_region(struct cxl_port *root,
+		      struct cxl_endpoint_decoder *cxled);
+struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
 #else
 static inline bool is_cxl_pmem_region(struct device *dev)
 {

@@ -710,6 +758,15 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev)
 {
 	return NULL;
 }
+static inline int cxl_add_to_region(struct cxl_port *root,
+				    struct cxl_endpoint_decoder *cxled)
+{
+	return 0;
+}
+static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
+{
+	return NULL;
+}
 #endif

 /*
/* /*


@@ -39,6 +39,7 @@
  * @cxl_nvb: coordinate removal of @cxl_nvd if present
  * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
  * @id: id number of this memdev instance.
+ * @depth: endpoint port depth
  */
 struct cxl_memdev {
 	struct device dev;

@@ -48,6 +49,7 @@ struct cxl_memdev {
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct cxl_nvdimm *cxl_nvd;
 	int id;
+	int depth;
 };

 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)

@@ -80,6 +82,9 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
 }

 struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
+int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
+			 resource_size_t base, resource_size_t len,
+			 resource_size_t skipped);

 static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port,
 					 struct cxl_memdev *cxlmd)


@@ -30,34 +30,69 @@ static void schedule_detach(void *cxlmd)
 	schedule_cxl_memdev_detach(cxlmd);
 }

-static int cxl_port_probe(struct device *dev)
+static int discover_region(struct device *dev, void *root)
+{
+	struct cxl_endpoint_decoder *cxled;
+	int rc;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
+		return 0;
+
+	if (cxled->state != CXL_DECODER_STATE_AUTO)
+		return 0;
+
+	/*
+	 * Region enumeration is opportunistic, if this add-event fails,
+	 * continue to the next endpoint decoder.
+	 */
+	rc = cxl_add_to_region(root, cxled);
+	if (rc)
+		dev_dbg(dev, "failed to add to region: %#llx-%#llx\n",
+			cxled->cxld.hpa_range.start, cxled->cxld.hpa_range.end);
+
+	return 0;
+}
+
+static int cxl_switch_port_probe(struct cxl_port *port)
 {
-	struct cxl_port *port = to_cxl_port(dev);
 	struct cxl_hdm *cxlhdm;
 	int rc;

-	if (!is_cxl_endpoint(port)) {
-		rc = devm_cxl_port_enumerate_dports(port);
-		if (rc < 0)
-			return rc;
-		if (rc == 1)
-			return devm_cxl_add_passthrough_decoder(port);
-	}
+	rc = devm_cxl_port_enumerate_dports(port);
+	if (rc < 0)
+		return rc;
+
+	if (rc == 1)
+		return devm_cxl_add_passthrough_decoder(port);

 	cxlhdm = devm_cxl_setup_hdm(port);
 	if (IS_ERR(cxlhdm))
 		return PTR_ERR(cxlhdm);

-	if (is_cxl_endpoint(port)) {
-		struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
-		struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	return devm_cxl_enumerate_decoders(cxlhdm);
+}

-		/* Cache the data early to ensure is_visible() works */
-		read_cdat_data(port);
+static int cxl_endpoint_port_probe(struct cxl_port *port)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_hdm *cxlhdm;
+	struct cxl_port *root;
+	int rc;
+
+	cxlhdm = devm_cxl_setup_hdm(port);
+	if (IS_ERR(cxlhdm))
+		return PTR_ERR(cxlhdm);
+
+	/* Cache the data early to ensure is_visible() works */
+	read_cdat_data(port);

-		get_device(&cxlmd->dev);
-		rc = devm_add_action_or_reset(dev, schedule_detach, cxlmd);
-		if (rc)
-			return rc;
+	get_device(&cxlmd->dev);
+	rc = devm_add_action_or_reset(&port->dev, schedule_detach, cxlmd);
+	if (rc)
+		return rc;

@@ -67,20 +102,39 @@ static int cxl_port_probe(struct device *dev)
-		rc = cxl_await_media_ready(cxlds);
-		if (rc) {
-			dev_err(dev, "Media not active (%d)\n", rc);
-			return rc;
-		}
-	}
+	rc = cxl_await_media_ready(cxlds);
+	if (rc) {
+		dev_err(&port->dev, "Media not active (%d)\n", rc);
+		return rc;
+	}

 	rc = devm_cxl_enumerate_decoders(cxlhdm);
-	if (rc) {
-		dev_err(dev, "Couldn't enumerate decoders (%d)\n", rc);
+	if (rc)
 		return rc;
-	}
+
+	/*
+	 * This can't fail in practice as CXL root exit unregisters all
+	 * descendant ports and that in turn synchronizes with cxl_port_probe()
+	 */
+	root = find_cxl_root(&cxlmd->dev);
+
+	/*
+	 * Now that all endpoint decoders are successfully enumerated, try to
+	 * assemble regions from committed decoders
+	 */
+	device_for_each_child(&port->dev, root, discover_region);
+	put_device(&root->dev);

 	return 0;
 }

+static int cxl_port_probe(struct device *dev)
+{
+	struct cxl_port *port = to_cxl_port(dev);
+
+	if (is_cxl_endpoint(port))
+		return cxl_endpoint_port_probe(port);
+	return cxl_switch_port_probe(port);
+}
+
 static ssize_t CDAT_read(struct file *filp, struct kobject *kobj,
 			 struct bin_attribute *bin_attr, char *buf,
 			 loff_t offset, size_t count)


@@ -45,12 +45,25 @@ config DEV_DAX_HMEM

 	  Say M if unsure.

+config DEV_DAX_CXL
+	tristate "CXL DAX: direct access to CXL RAM regions"
+	depends on CXL_REGION && DEV_DAX
+	default CXL_REGION && DEV_DAX
+	help
+	  CXL RAM regions are either mapped by platform-firmware
+	  and published in the initial system-memory map as "System RAM", mapped
+	  by platform-firmware as "Soft Reserved", or dynamically provisioned
+	  after boot by the CXL driver. In the latter two cases a device-dax
+	  instance is created to access that unmapped-by-default address range.
+	  Per usual it can remain as dedicated access via a device interface, or
+	  converted to "System RAM" via the dax_kmem facility.
+
 config DEV_DAX_HMEM_DEVICES
-	depends on DEV_DAX_HMEM && DAX=y
+	depends on DEV_DAX_HMEM && DAX
 	def_bool y

 config DEV_DAX_KMEM
-	tristate "KMEM DAX: volatile-use of persistent memory"
+	tristate "KMEM DAX: map dax-devices as System-RAM"
 	default DEV_DAX
 	depends on DEV_DAX
 	depends on MEMORY_HOTPLUG # for add_memory() and friends


@@ -3,10 +3,12 @@ obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
+obj-$(CONFIG_DEV_DAX_CXL) += dax_cxl.o

 dax-y := super.o
 dax-y += bus.o
 device_dax-y := device.o
 dax_pmem-y := pmem.o
+dax_cxl-y := cxl.o

 obj-y += hmem/


@@ -56,6 +56,25 @@ static int dax_match_id(struct dax_device_driver *dax_drv, struct device *dev)
 	return match;
 }

+static int dax_match_type(struct dax_device_driver *dax_drv, struct device *dev)
+{
+	enum dax_driver_type type = DAXDRV_DEVICE_TYPE;
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+
+	if (dev_dax->region->res.flags & IORESOURCE_DAX_KMEM)
+		type = DAXDRV_KMEM_TYPE;
+
+	if (dax_drv->type == type)
+		return 1;
+
+	/* default to device mode if dax_kmem is disabled */
+	if (dax_drv->type == DAXDRV_DEVICE_TYPE &&
+	    !IS_ENABLED(CONFIG_DEV_DAX_KMEM))
+		return 1;
+
+	return 0;
+}
+
 enum id_action {
 	ID_REMOVE,
 	ID_ADD,
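For contrast with the match_always scheme this replaces, here is a hedged sketch of what a dax driver author now declares: a .type that dax_bus_match() routes devices to. It assumes the dax_driver_register() convenience macro from drivers/dax/bus.h; dax_driver_unregister() is visible later in this diff.

/* hypothetical module: serve IORESOURCE_DAX_KMEM regions by type match */
#include <linux/module.h>
#include "bus.h"

static int example_probe(struct dev_dax *dev_dax)
{
	return 0;	/* claim the device */
}

static struct dax_device_driver example_kmem_driver = {
	.probe = example_probe,
	.type = DAXDRV_KMEM_TYPE,	/* only kmem-flagged regions match */
};

static int __init example_init(void)
{
	return dax_driver_register(&example_kmem_driver);
}
module_init(example_init);

static void __exit example_exit(void)
{
	dax_driver_unregister(&example_kmem_driver);
}
module_exit(example_exit);
MODULE_LICENSE("GPL");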
@@ -216,14 +235,9 @@ static int dax_bus_match(struct device *dev, struct device_driver *drv)
 {
 	struct dax_device_driver *dax_drv = to_dax_drv(drv);

-	/*
-	 * All but the 'device-dax' driver, which has 'match_always'
-	 * set, requires an exact id match.
-	 */
-	if (dax_drv->match_always)
+	if (dax_match_id(dax_drv, dev))
 		return 1;
-
-	return dax_match_id(dax_drv, dev);
+	return dax_match_type(dax_drv, dev);
 }

 /*

@@ -1413,13 +1427,10 @@ err_id:
 }
 EXPORT_SYMBOL_GPL(devm_create_dev_dax);

-static int match_always_count;
 int __dax_driver_register(struct dax_device_driver *dax_drv,
 		struct module *module, const char *mod_name)
 {
 	struct device_driver *drv = &dax_drv->drv;
-	int rc = 0;

 	/*
 	 * dax_bus_probe() calls dax_drv->probe() unconditionally.

@@ -1434,26 +1445,7 @@ int __dax_driver_register(struct dax_device_driver *dax_drv,
 	drv->mod_name = mod_name;
 	drv->bus = &dax_bus_type;

-	/* there can only be one default driver */
-	mutex_lock(&dax_bus_lock);
-	match_always_count += dax_drv->match_always;
-	if (match_always_count > 1) {
-		match_always_count--;
-		WARN_ON(1);
-		rc = -EINVAL;
-	}
-	mutex_unlock(&dax_bus_lock);
-	if (rc)
-		return rc;
-
-	rc = driver_register(drv);
-	if (rc && dax_drv->match_always) {
-		mutex_lock(&dax_bus_lock);
-		match_always_count -= dax_drv->match_always;
-		mutex_unlock(&dax_bus_lock);
-	}
-
-	return rc;
+	return driver_register(drv);
 }
 EXPORT_SYMBOL_GPL(__dax_driver_register);

@@ -1463,7 +1455,6 @@ void dax_driver_unregister(struct dax_device_driver *dax_drv)
 	struct dax_id *dax_id, *_id;

 	mutex_lock(&dax_bus_lock);
-	match_always_count -= dax_drv->match_always;
 	list_for_each_entry_safe(dax_id, _id, &dax_drv->ids, list) {
 		list_del(&dax_id->list);
 		kfree(dax_id);


@@ -11,7 +11,10 @@ struct dax_device;
 struct dax_region;
 void dax_region_put(struct dax_region *dax_region);

-#define IORESOURCE_DAX_STATIC (1UL << 0)
+/* dax bus specific ioresource flags */
+#define IORESOURCE_DAX_STATIC BIT(0)
+#define IORESOURCE_DAX_KMEM BIT(1)
+
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 		struct range *range, int target_node, unsigned int align,
 		unsigned long flags);

@@ -25,10 +28,15 @@ struct dev_dax_data {

 struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data);

+enum dax_driver_type {
+	DAXDRV_KMEM_TYPE,
+	DAXDRV_DEVICE_TYPE,
+};
+
 struct dax_device_driver {
 	struct device_driver drv;
 	struct list_head ids;
-	int match_always;
+	enum dax_driver_type type;
 	int (*probe)(struct dev_dax *dev);
 	void (*remove)(struct dev_dax *dev);
 };

drivers/dax/cxl.c (new file, 53 lines)

@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2023 Intel Corporation. All rights reserved. */
+#include <linux/module.h>
+#include <linux/dax.h>
+
+#include "../cxl/cxl.h"
+#include "bus.h"
+
+static int cxl_dax_region_probe(struct device *dev)
+{
+	struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev);
+	int nid = phys_to_target_node(cxlr_dax->hpa_range.start);
+	struct cxl_region *cxlr = cxlr_dax->cxlr;
+	struct dax_region *dax_region;
+	struct dev_dax_data data;
+	struct dev_dax *dev_dax;
+
+	if (nid == NUMA_NO_NODE)
+		nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start);
+
+	dax_region = alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid,
+				      PMD_SIZE, IORESOURCE_DAX_KMEM);
+	if (!dax_region)
+		return -ENOMEM;
+
+	data = (struct dev_dax_data) {
+		.dax_region = dax_region,
+		.id = -1,
+		.size = range_len(&cxlr_dax->hpa_range),
+	};
+	dev_dax = devm_create_dev_dax(&data);
+	if (IS_ERR(dev_dax))
+		return PTR_ERR(dev_dax);
+
+	/* child dev_dax instances now own the lifetime of the dax_region */
+	dax_region_put(dax_region);
+	return 0;
+}
+
+static struct cxl_driver cxl_dax_region_driver = {
+	.name = "cxl_dax_region",
+	.probe = cxl_dax_region_probe,
+	.id = CXL_DEVICE_DAX_REGION,
+	.drv = {
+		.suppress_bind_attrs = true,
+	},
+};
+
+module_cxl_driver(cxl_dax_region_driver);
+MODULE_ALIAS_CXL(CXL_DEVICE_DAX_REGION);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_IMPORT_NS(CXL);


@@ -475,8 +475,7 @@ EXPORT_SYMBOL_GPL(dev_dax_probe);

 static struct dax_device_driver device_dax_driver = {
 	.probe = dev_dax_probe,
-	/* all probe actions are unwound by devm, so .remove isn't necessary */
-	.match_always = 1,
+	.type = DAXDRV_DEVICE_TYPE,
 };

 static int __init dax_init(void)


@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o
+# device_hmem.o deliberately precedes dax_hmem.o for initcall ordering
 obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device_hmem.o
+obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o

 device_hmem-y := device.o
 dax_hmem-y := hmem.o


@@ -8,6 +8,8 @@
 static bool nohmem;
 module_param_named(disable, nohmem, bool, 0444);

+static bool platform_initialized;
+static DEFINE_MUTEX(hmem_resource_lock);
 static struct resource hmem_active = {
 	.name = "HMEM devices",
 	.start = 0,

@@ -15,80 +17,66 @@ static struct resource hmem_active = {
 	.flags = IORESOURCE_MEM,
 };

-void hmem_register_device(int target_nid, struct resource *r)
+int walk_hmem_resources(struct device *host, walk_hmem_fn fn)
+{
+	struct resource *res;
+	int rc = 0;
+
+	mutex_lock(&hmem_resource_lock);
+	for (res = hmem_active.child; res; res = res->sibling) {
+		rc = fn(host, (int) res->desc, res);
+		if (rc)
+			break;
+	}
+	mutex_unlock(&hmem_resource_lock);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(walk_hmem_resources);
+
+static void __hmem_register_resource(int target_nid, struct resource *res)
 {
-	/* define a clean / non-busy resource for the platform device */
-	struct resource res = {
-		.start = r->start,
-		.end = r->end,
-		.flags = IORESOURCE_MEM,
-		.desc = IORES_DESC_SOFT_RESERVED,
-	};
 	struct platform_device *pdev;
-	struct memregion_info info;
-	int rc, id;
+	struct resource *new;
+	int rc;

-	if (nohmem)
-		return;
-
-	rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM,
-			       IORES_DESC_SOFT_RESERVED);
-	if (rc != REGION_INTERSECTS)
-		return;
-
-	id = memregion_alloc(GFP_KERNEL);
-	if (id < 0) {
-		pr_err("memregion allocation failure for %pr\n", &res);
+	new = __request_region(&hmem_active, res->start, resource_size(res), "",
+			       0);
+	if (!new) {
+		pr_debug("hmem range %pr already active\n", res);
 		return;
 	}

-	pdev = platform_device_alloc("hmem", id);
+	new->desc = target_nid;
+
+	if (platform_initialized)
+		return;
+
+	pdev = platform_device_alloc("hmem_platform", 0);
 	if (!pdev) {
-		pr_err("hmem device allocation failure for %pr\n", &res);
-		goto out_pdev;
-	}
-
-	if (!__request_region(&hmem_active, res.start, resource_size(&res),
-			      dev_name(&pdev->dev), 0)) {
-		dev_dbg(&pdev->dev, "hmem range %pr already active\n", &res);
-		goto out_active;
-	}
-	pdev->dev.numa_node = numa_map_to_online_node(target_nid);
-	info = (struct memregion_info) {
-		.target_node = target_nid,
-	};
-	rc = platform_device_add_data(pdev, &info, sizeof(info));
-	if (rc < 0) {
-		pr_err("hmem memregion_info allocation failure for %pr\n", &res);
-		goto out_resource;
-	}
-	rc = platform_device_add_resources(pdev, &res, 1);
-	if (rc < 0) {
-		pr_err("hmem resource allocation failure for %pr\n", &res);
-		goto out_resource;
+		pr_err_once("failed to register device-dax hmem_platform device\n");
+		return;
 	}

 	rc = platform_device_add(pdev);
-	if (rc < 0) {
-		dev_err(&pdev->dev, "device add failed for %pr\n", &res);
-		goto out_resource;
-	}
+	if (rc)
+		platform_device_put(pdev);
+	else
+		platform_initialized = true;
+}

-	return;
+void hmem_register_resource(int target_nid, struct resource *res)
+{
+	if (nohmem)
+		return;

-out_resource:
-	__release_region(&hmem_active, res.start, resource_size(&res));
-out_active:
-	platform_device_put(pdev);
-out_pdev:
-	memregion_free(id);
+	mutex_lock(&hmem_resource_lock);
+	__hmem_register_resource(target_nid, res);
+	mutex_unlock(&hmem_resource_lock);
 }

 static __init int hmem_register_one(struct resource *res, void *data)
 {
-	hmem_register_device(phys_to_target_node(res->start), res);
+	hmem_register_resource(phys_to_target_node(res->start), res);

 	return 0;
 }

@@ -104,4 +92,4 @@ static __init int hmem_init(void)
  * As this is a fallback for address ranges unclaimed by the ACPI HMAT
  * parsing it must be at an initcall level greater than hmat_init().
  */
-late_initcall(hmem_init);
+device_initcall(hmem_init);
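The rewrite above leans on a driver-private resource tree: the first __request_region() claim of a range wins, and overlapping claims fail, which replaces the old alloc-then-release error unwinding. A reduced sketch of that claim-once pattern (names hypothetical):

#include <linux/ioport.h>

static struct resource example_active = {
	.name = "example ranges",
	.start = 0,
	.end = -1,
	.flags = IORESOURCE_MEM,
};

static bool example_claim(resource_size_t start, resource_size_t size, int nid)
{
	struct resource *new;

	new = __request_region(&example_active, start, size, "", 0);
	if (!new)
		return false;	/* range already claimed */
	new->desc = nid;	/* stash the target node, as device.c does */
	return true;
}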


@@ -3,6 +3,7 @@
 #include <linux/memregion.h>
 #include <linux/module.h>
 #include <linux/pfn_t.h>
+#include <linux/dax.h>
 #include "../bus.h"

 static bool region_idle;

@@ -10,30 +11,32 @@ module_param_named(region_idle, region_idle, bool, 0644);

 static int dax_hmem_probe(struct platform_device *pdev)
 {
+	unsigned long flags = IORESOURCE_DAX_KMEM;
 	struct device *dev = &pdev->dev;
 	struct dax_region *dax_region;
 	struct memregion_info *mri;
 	struct dev_dax_data data;
 	struct dev_dax *dev_dax;
-	struct resource *res;
-	struct range range;

-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!res)
-		return -ENOMEM;
+	/*
+	 * @region_idle == true indicates that an administrative agent
+	 * wants to manipulate the range partitioning before the devices
+	 * are created, so do not send them to the dax_kmem driver by
+	 * default.
+	 */
+	if (region_idle)
+		flags = 0;

 	mri = dev->platform_data;
-	range.start = res->start;
-	range.end = res->end;
-	dax_region = alloc_dax_region(dev, pdev->id, &range, mri->target_node,
-			PMD_SIZE, 0);
+	dax_region = alloc_dax_region(dev, pdev->id, &mri->range,
+				      mri->target_node, PMD_SIZE, flags);
 	if (!dax_region)
 		return -ENOMEM;

 	data = (struct dev_dax_data) {
 		.dax_region = dax_region,
 		.id = -1,
-		.size = region_idle ? 0 : resource_size(res),
+		.size = region_idle ? 0 : range_len(&mri->range),
 	};
 	dev_dax = devm_create_dev_dax(&data);
 	if (IS_ERR(dev_dax))

@@ -44,22 +47,131 @@ static int dax_hmem_probe(struct platform_device *pdev)
 	return 0;
 }

-static int dax_hmem_remove(struct platform_device *pdev)
-{
-	/* devm handles teardown */
-	return 0;
-}
-
 static struct platform_driver dax_hmem_driver = {
 	.probe = dax_hmem_probe,
-	.remove = dax_hmem_remove,
 	.driver = {
 		.name = "hmem",
 	},
 };

-module_platform_driver(dax_hmem_driver);
+static void release_memregion(void *data)
+{
+	memregion_free((long) data);
+}
+
+static void release_hmem(void *pdev)
+{
+	platform_device_unregister(pdev);
+}
+
+static int hmem_register_device(struct device *host, int target_nid,
+				const struct resource *res)
+{
+	struct platform_device *pdev;
+	struct memregion_info info;
+	long id;
+	int rc;
+
+	if (IS_ENABLED(CONFIG_CXL_REGION) &&
+	    region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
+			      IORES_DESC_CXL) != REGION_DISJOINT) {
+		dev_dbg(host, "deferring range to CXL: %pr\n", res);
+		return 0;
+	}
+
+	rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
+			       IORES_DESC_SOFT_RESERVED);
+	if (rc != REGION_INTERSECTS)
+		return 0;
+
+	id = memregion_alloc(GFP_KERNEL);
+	if (id < 0) {
+		dev_err(host, "memregion allocation failure for %pr\n", res);
+		return -ENOMEM;
+	}
+	rc = devm_add_action_or_reset(host, release_memregion, (void *) id);
+	if (rc)
+		return rc;
+
+	pdev = platform_device_alloc("hmem", id);
+	if (!pdev) {
+		dev_err(host, "device allocation failure for %pr\n", res);
+		return -ENOMEM;
+	}
+
+	pdev->dev.numa_node = numa_map_to_online_node(target_nid);
+	info = (struct memregion_info) {
+		.target_node = target_nid,
+		.range = {
+			.start = res->start,
+			.end = res->end,
+		},
+	};
+	rc = platform_device_add_data(pdev, &info, sizeof(info));
+	if (rc < 0) {
+		dev_err(host, "memregion_info allocation failure for %pr\n",
+			res);
+		goto out_put;
+	}
+
+	rc = platform_device_add(pdev);
+	if (rc < 0) {
+		dev_err(host, "%s add failed for %pr\n", dev_name(&pdev->dev),
+			res);
+		goto out_put;
+	}
+
+	return devm_add_action_or_reset(host, release_hmem, pdev);
+
+out_put:
+	platform_device_put(pdev);
+	return rc;
+}
+
+static int dax_hmem_platform_probe(struct platform_device *pdev)
+{
+	return walk_hmem_resources(&pdev->dev, hmem_register_device);
+}
+
+static struct platform_driver dax_hmem_platform_driver = {
+	.probe = dax_hmem_platform_probe,
+	.driver = {
+		.name = "hmem_platform",
+	},
+};
+
+static __init int dax_hmem_init(void)
+{
+	int rc;
+
+	rc = platform_driver_register(&dax_hmem_platform_driver);
+	if (rc)
+		return rc;
+
+	rc = platform_driver_register(&dax_hmem_driver);
+	if (rc)
+		platform_driver_unregister(&dax_hmem_platform_driver);
+
+	return rc;
+}
+
+static __exit void dax_hmem_exit(void)
+{
+	platform_driver_unregister(&dax_hmem_driver);
+	platform_driver_unregister(&dax_hmem_platform_driver);
+}
+
+module_init(dax_hmem_init);
+module_exit(dax_hmem_exit);
+
+/* Allow for CXL to define its own dax regions */
+#if IS_ENABLED(CONFIG_CXL_REGION)
+#if IS_MODULE(CONFIG_CXL_ACPI)
+MODULE_SOFTDEP("pre: cxl_acpi");
+#endif
+#endif

 MODULE_ALIAS("platform:hmem*");
+MODULE_ALIAS("platform:hmem_platform*");
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Intel Corporation");


@@ -239,6 +239,7 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 static struct dax_device_driver device_dax_kmem_driver = {
 	.probe = dev_dax_kmem_probe,
 	.remove = dev_dax_kmem_remove,
+	.type = DAXDRV_KMEM_TYPE,
 };

 static int __init dax_kmem_init(void)


@@ -262,11 +262,14 @@ static inline bool dax_mapping(struct address_space *mapping)
 }

 #ifdef CONFIG_DEV_DAX_HMEM_DEVICES
-void hmem_register_device(int target_nid, struct resource *r);
+void hmem_register_resource(int target_nid, struct resource *r);
 #else
-static inline void hmem_register_device(int target_nid, struct resource *r)
+static inline void hmem_register_resource(int target_nid, struct resource *r)
 {
 }
 #endif
+
+typedef int (*walk_hmem_fn)(struct device *dev, int target_nid,
+			    const struct resource *res);
+int walk_hmem_resources(struct device *dev, walk_hmem_fn fn);
 #endif
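A hypothetical consumer of the new walk API, mirroring how dax_hmem_platform_probe() drives hmem_register_device() above (the callback and host device are illustrative):

#include <linux/dax.h>
#include <linux/device.h>
#include <linux/ioport.h>

static int log_hmem_range(struct device *host, int target_nid,
			  const struct resource *res)
{
	dev_info(host, "hmem range %pr (target node %d)\n", res, target_nid);
	return 0;	/* non-zero would stop the walk */
}

static int log_all_hmem(struct device *host)
{
	return walk_hmem_resources(host, log_hmem_range);
}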


@@ -3,10 +3,12 @@
 #define _MEMREGION_H_
 #include <linux/types.h>
 #include <linux/errno.h>
+#include <linux/range.h>
 #include <linux/bug.h>

 struct memregion_info {
 	int target_node;
+	struct range range;
 };

 #ifdef CONFIG_MEMREGION
#ifdef CONFIG_MEMREGION #ifdef CONFIG_MEMREGION


@@ -13,6 +13,11 @@ static inline u64 range_len(const struct range *range)
 	return range->end - range->start + 1;
 }

+static inline bool range_contains(struct range *r1, struct range *r2)
+{
+	return r1->start <= r2->start && r1->end >= r2->end;
+}
+
 int add_range(struct range *range, int az, int nr_range,
 		u64 start, u64 end);
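A worked example of the hoisted helper (values illustrative): containment is inclusive on both ends, which is exactly the check cxl/core/pci.c's dvsec_range_allowed() relies on after this move.

#include <linux/range.h>

static bool dvsec_covered_example(void)
{
	struct range window = { .start = 0x100000000ULL, .end = 0x1ffffffffULL };
	struct range dvsec  = { .start = 0x180000000ULL, .end = 0x18fffffffULL };

	return range_contains(&window, &dvsec);	/* true */
}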


@@ -31,7 +31,7 @@ static volatile u8 forced_mask = 0xff;
 static void *fill_start, *target_start;
 static size_t fill_size, target_size;

-static bool range_contains(char *haystack_start, size_t haystack_size,
-			   char *needle_start, size_t needle_size)
+static bool stackinit_range_contains(char *haystack_start, size_t haystack_size,
+				     char *needle_start, size_t needle_size)
 {
 	if (needle_start >= haystack_start &&

@@ -175,7 +175,7 @@ static noinline void test_ ## name (struct kunit *test) \
 	\
 	/* Validate that compiler lined up fill and target. */	\
 	KUNIT_ASSERT_TRUE_MSG(test,				\
-		range_contains(fill_start, fill_size,		\
+		stackinit_range_contains(fill_start, fill_size,	\
 			    target_start, target_size),		\
 		"stack fill missed target!? "			\
 		"(fill %zu wide, target offset by %d)\n",	\


@@ -703,6 +703,142 @@ static int mock_decoder_reset(struct cxl_decoder *cxld)
 	return 0;
 }

+static void default_mock_decoder(struct cxl_decoder *cxld)
+{
+	cxld->hpa_range = (struct range){
+		.start = 0,
+		.end = -1,
+	};
+
+	cxld->interleave_ways = 1;
+	cxld->interleave_granularity = 256;
+	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->commit = mock_decoder_commit;
+	cxld->reset = mock_decoder_reset;
+}
+
+static int first_decoder(struct device *dev, void *data)
+{
+	struct cxl_decoder *cxld;
+
+	if (!is_switch_decoder(dev))
+		return 0;
+	cxld = to_cxl_decoder(dev);
+	if (cxld->id == 0)
+		return 1;
+	return 0;
+}
+
+static void mock_init_hdm_decoder(struct cxl_decoder *cxld)
+{
+	struct acpi_cedt_cfmws *window = mock_cfmws[0];
+	struct platform_device *pdev = NULL;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_switch_decoder *cxlsd;
+	struct cxl_port *port, *iter;
+	const int size = SZ_512M;
+	struct cxl_memdev *cxlmd;
+	struct cxl_dport *dport;
+	struct device *dev;
+	bool hb0 = false;
+	u64 base;
+	int i;
+
+	if (is_endpoint_decoder(&cxld->dev)) {
+		cxled = to_cxl_endpoint_decoder(&cxld->dev);
+		cxlmd = cxled_to_memdev(cxled);
+		WARN_ON(!dev_is_platform(cxlmd->dev.parent));
+		pdev = to_platform_device(cxlmd->dev.parent);
+
+		/* check if the endpoint is attached to host-bridge0 */
+		port = cxled_to_port(cxled);
+		do {
+			if (port->uport == &cxl_host_bridge[0]->dev) {
+				hb0 = true;
+				break;
+			}
+			if (is_cxl_port(port->dev.parent))
+				port = to_cxl_port(port->dev.parent);
+			else
+				port = NULL;
+		} while (port);
+		port = cxled_to_port(cxled);
+	}
+
+	/*
+	 * The first decoder on the first 2 devices on the first switch
+	 * attached to host-bridge0 mock a fake / static RAM region. All
+	 * other decoders are default disabled. Given the round robin
+	 * assignment those devices are named cxl_mem.0, and cxl_mem.4.
+	 *
+	 * See 'cxl list -BMPu -m cxl_mem.0,cxl_mem.4'
+	 */
+	if (!hb0 || pdev->id % 4 || pdev->id > 4 || cxld->id > 0) {
+		default_mock_decoder(cxld);
+		return;
+	}
+
+	base = window->base_hpa;
+	cxld->hpa_range = (struct range) {
+		.start = base,
+		.end = base + size - 1,
+	};
+
+	cxld->interleave_ways = 2;
+	eig_to_granularity(window->granularity, &cxld->interleave_granularity);
+	cxld->target_type = CXL_DECODER_EXPANDER;
+	cxld->flags = CXL_DECODER_F_ENABLE;
+	cxled->state = CXL_DECODER_STATE_AUTO;
+	port->commit_end = cxld->id;
+	devm_cxl_dpa_reserve(cxled, 0, size / cxld->interleave_ways, 0);
+	cxld->commit = mock_decoder_commit;
+	cxld->reset = mock_decoder_reset;
+
+	/*
+	 * Now that endpoint decoder is set up, walk up the hierarchy
+	 * and setup the switch and root port decoders targeting @cxlmd.
+	 */
+	iter = port;
+	for (i = 0; i < 2; i++) {
+		dport = iter->parent_dport;
+		iter = dport->port;
+		dev = device_find_child(&iter->dev, NULL, first_decoder);
+		/*
+		 * Ancestor ports are guaranteed to be enumerated before
+		 * @port, and all ports have at least one decoder.
+		 */
+		if (WARN_ON(!dev))
+			continue;
+		cxlsd = to_cxl_switch_decoder(dev);
+		if (i == 0) {
+			/* put cxl_mem.4 second in the decode order */
+			if (pdev->id == 4)
+				cxlsd->target[1] = dport;
+			else
+				cxlsd->target[0] = dport;
+		} else
+			cxlsd->target[0] = dport;
+		cxld = &cxlsd->cxld;
+		cxld->target_type = CXL_DECODER_EXPANDER;
+		cxld->flags = CXL_DECODER_F_ENABLE;
+		iter->commit_end = 0;
+		/*
+		 * Switch targets 2 endpoints, while host bridge targets
+		 * one root port
+		 */
+		if (i == 0)
+			cxld->interleave_ways = 2;
+		else
+			cxld->interleave_ways = 1;
+		cxld->interleave_granularity = 256;
+		cxld->hpa_range = (struct range) {
+			.start = base,
+			.end = base + size - 1,
+		};
+		put_device(dev);
+	}
+}
+
 static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 {
 	struct cxl_port *port = cxlhdm->port;

@@ -748,16 +884,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 			cxld = &cxled->cxld;
 		}

-		cxld->hpa_range = (struct range) {
-			.start = 0,
-			.end = -1,
-		};
-
-		cxld->interleave_ways = min_not_zero(target_count, 1);
-		cxld->interleave_granularity = SZ_4K;
-		cxld->target_type = CXL_DECODER_EXPANDER;
-		cxld->commit = mock_decoder_commit;
-		cxld->reset = mock_decoder_reset;
+		mock_init_hdm_decoder(cxld);

 		if (target_count) {
 			rc = device_for_each_child(port->uport, &ctx,