// SPDX-License-Identifier: GPL-2.0
/*
 * PCI Express I/O Virtualization (IOV) support
 *   Single Root IOV 1.0
 *   Address Translation Service 1.0
 *
 * Copyright (C) 2009 Intel Corporation, Yu Zhao <yu.zhao@intel.com>
 */

#include <linux/pci.h>
#include <linux/slab.h>
#include <linux/export.h>
#include <linux/string.h>
#include <linux/delay.h>
#include "pci.h"

#define VIRTFN_ID_LEN	16

int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
{
	if (!dev->is_physfn)
		return -EINVAL;
	return dev->bus->number + ((dev->devfn + dev->sriov->offset +
				    dev->sriov->stride * vf_id) >> 8);
}

int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
{
	if (!dev->is_physfn)
		return -EINVAL;
	return (dev->devfn + dev->sriov->offset +
		dev->sriov->stride * vf_id) & 0xff;
}
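
/*
 * Illustration only, not part of the kernel source: a minimal user-space
 * sketch of the Routing ID arithmetic performed by pci_iov_virtfn_bus()
 * and pci_iov_virtfn_devfn(). struct pf_routing, vf_rid(), vf_bus() and
 * vf_devfn() are hypothetical names introduced for this example, and the
 * is_physfn check of the real helpers is left out.
 */

```c
#include <stdint.h>

/* Hypothetical stand-in for the PF fields the kernel helpers read. */
struct pf_routing {
	uint8_t  bus;		/* bus number the PF sits on */
	uint8_t  devfn;		/* device/function number of the PF */
	uint16_t offset;	/* First VF Offset */
	uint16_t stride;	/* VF Stride */
};

/* Routing ID of VF vf_id, counted from devfn 0 on the PF's bus. */
static unsigned int vf_rid(const struct pf_routing *pf, unsigned int vf_id)
{
	return pf->devfn + pf->offset + pf->stride * vf_id;
}

/* Every 256 Routing IDs roll over onto the next bus number. */
static unsigned int vf_bus(const struct pf_routing *pf, unsigned int vf_id)
{
	return pf->bus + (vf_rid(pf, vf_id) >> 8);
}

/* The low 8 bits are the VF's devfn on that bus. */
static unsigned int vf_devfn(const struct pf_routing *pf, unsigned int vf_id)
{
	return vf_rid(pf, vf_id) & 0xff;
}
```

/*
 * For a PF at bus 3, devfn 0 with offset 128 and stride 2, VF 63 stays on
 * bus 3 at devfn 0xfe, while VF 64 spills onto bus 4 at devfn 0x00.
 */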

/*
 * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
 * change when NumVFs changes.
 *
 * Update iov->offset and iov->stride when NumVFs is written.
 */
static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
{
	struct pci_sriov *iov = dev->sriov;

	pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn);
	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &iov->offset);
	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &iov->stride);
}

/*
 * The PF consumes one bus number. NumVFs, First VF Offset, and VF Stride
 * determine how many additional bus numbers will be consumed by VFs.
 *
 * Iterate over all valid NumVFs, validate offset and stride, and calculate
 * the maximum number of bus numbers that could ever be required.
 */
static int compute_max_vf_buses(struct pci_dev *dev)
{
	struct pci_sriov *iov = dev->sriov;
	int nr_virtfn, busnr, rc = 0;

	for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) {
		pci_iov_set_numvfs(dev, nr_virtfn);
		if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) {
			rc = -EIO;
			goto out;
		}

		busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
		if (busnr > iov->max_VF_buses)
			iov->max_VF_buses = busnr;
	}

out:
	pci_iov_set_numvfs(dev, 0);
	return rc;
}

static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
{
	struct pci_bus *child;

	if (bus->number == busnr)
		return bus;

	child = pci_find_bus(pci_domain_nr(bus), busnr);
	if (child)
		return child;

	child = pci_add_new_bus(bus, NULL, busnr);
	if (!child)
		return NULL;

	pci_bus_insert_busn_res(child, busnr, busnr);

	return child;
}

static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
{
	if (physbus != virtbus && list_empty(&virtbus->devices))
		pci_remove_bus(virtbus);
}

resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
{
	if (!dev->is_physfn)
		return 0;

	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
}

static void pci_read_vf_config_common(struct pci_dev *virtfn)
{
	struct pci_dev *physfn = virtfn->physfn;

	/*
	 * Some config registers are the same across all associated VFs.
	 * Read them once from VF0 so we can skip reading them from the
	 * other VFs.
	 *
	 * PCIe r4.0, sec 9.3.4.1, technically doesn't require all VFs to
	 * have the same Revision ID and Subsystem ID, but we assume they
	 * do.
	 */
	pci_read_config_dword(virtfn, PCI_CLASS_REVISION,
			      &physfn->sriov->class);
	pci_read_config_byte(virtfn, PCI_HEADER_TYPE,
			     &physfn->sriov->hdr_type);
	pci_read_config_word(virtfn, PCI_SUBSYSTEM_VENDOR_ID,
			     &physfn->sriov->subsystem_vendor);
	pci_read_config_word(virtfn, PCI_SUBSYSTEM_ID,
			     &physfn->sriov->subsystem_device);
}

int pci_iov_add_virtfn(struct pci_dev *dev, int id)
{
	int i;
	int rc = -ENOMEM;
	u64 size;
	char buf[VIRTFN_ID_LEN];
	struct pci_dev *virtfn;
	struct resource *res;
	struct pci_sriov *iov = dev->sriov;
	struct pci_bus *bus;

	bus = virtfn_add_bus(dev->bus, pci_iov_virtfn_bus(dev, id));
	if (!bus)
		goto failed;

	virtfn = pci_alloc_dev(bus);
	if (!virtfn)
		goto failed0;

	virtfn->devfn = pci_iov_virtfn_devfn(dev, id);
	virtfn->vendor = dev->vendor;
	virtfn->device = iov->vf_device;
	virtfn->is_virtfn = 1;
	virtfn->physfn = pci_dev_get(dev);

	if (id == 0)
		pci_read_vf_config_common(virtfn);

	rc = pci_setup_device(virtfn);
	if (rc)
		goto failed1;

	virtfn->dev.parent = dev->dev.parent;
	virtfn->multifunction = 0;

	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
		res = &dev->resource[i + PCI_IOV_RESOURCES];
		if (!res->parent)
			continue;
		virtfn->resource[i].name = pci_name(virtfn);
		virtfn->resource[i].flags = res->flags;
		size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
		virtfn->resource[i].start = res->start + size * id;
		virtfn->resource[i].end = virtfn->resource[i].start + size - 1;
		rc = request_resource(res, &virtfn->resource[i]);
		BUG_ON(rc);
	}

	pci_device_add(virtfn, virtfn->bus);

	sprintf(buf, "virtfn%u", id);
	rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf);
	if (rc)
		goto failed1;
	rc = sysfs_create_link(&virtfn->dev.kobj, &dev->dev.kobj, "physfn");
	if (rc)
		goto failed2;

	kobject_uevent(&virtfn->dev.kobj, KOBJ_CHANGE);

	pci_bus_add_device(virtfn);

	return 0;

failed2:
	sysfs_remove_link(&dev->dev.kobj, buf);
failed1:
	pci_stop_and_remove_bus_device(virtfn);
	pci_dev_put(dev);
failed0:
	virtfn_remove_bus(dev->bus, bus);
failed:

	return rc;
}

void pci_iov_remove_virtfn(struct pci_dev *dev, int id)
{
	char buf[VIRTFN_ID_LEN];
	struct pci_dev *virtfn;

	virtfn = pci_get_domain_bus_and_slot(pci_domain_nr(dev->bus),
					     pci_iov_virtfn_bus(dev, id),
					     pci_iov_virtfn_devfn(dev, id));
	if (!virtfn)
		return;

	sprintf(buf, "virtfn%u", id);
	sysfs_remove_link(&dev->dev.kobj, buf);
	/*
	 * pci_stop_dev() could have been called for this virtfn already,
	 * so the directory for the virtfn may have been removed before.
	 * Double check to avoid spurious sysfs warnings.
	 */
	if (virtfn->dev.kobj.sd)
		sysfs_remove_link(&virtfn->dev.kobj, "physfn");

	pci_stop_and_remove_bus_device(virtfn);
	virtfn_remove_bus(dev->bus, virtfn->bus);

	/* balance pci_get_domain_bus_and_slot() */
	pci_dev_put(virtfn);
	pci_dev_put(dev);
}

static ssize_t sriov_totalvfs_show(struct device *dev,
				   struct device_attribute *attr,
				   char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);

	return sprintf(buf, "%u\n", pci_sriov_get_totalvfs(pdev));
}

static ssize_t sriov_numvfs_show(struct device *dev,
				 struct device_attribute *attr,
				 char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);
	u16 num_vfs;

	/* Serialize vs sriov_numvfs_store() so readers see valid num_VFs */
	device_lock(&pdev->dev);
	num_vfs = pdev->sriov->num_VFs;
	device_unlock(&pdev->dev);

	return sprintf(buf, "%u\n", num_vfs);
}

/*
 * num_vfs > 0; number of VFs to enable
 * num_vfs = 0; disable all VFs
 *
 * Note: SRIOV spec does not allow partial VF
 *	 disable, so it's all or none.
 */
static ssize_t sriov_numvfs_store(struct device *dev,
				  struct device_attribute *attr,
				  const char *buf, size_t count)
{
	struct pci_dev *pdev = to_pci_dev(dev);
	int ret;
	u16 num_vfs;

	ret = kstrtou16(buf, 0, &num_vfs);
	if (ret < 0)
		return ret;

	if (num_vfs > pci_sriov_get_totalvfs(pdev))
		return -ERANGE;

	device_lock(&pdev->dev);

	if (num_vfs == pdev->sriov->num_VFs)
		goto exit;

	/* is PF driver loaded w/callback */
	if (!pdev->driver || !pdev->driver->sriov_configure) {
		pci_info(pdev, "Driver does not support SRIOV configuration via sysfs\n");
		ret = -ENOENT;
		goto exit;
	}

	if (num_vfs == 0) {
		/* disable VFs */
		ret = pdev->driver->sriov_configure(pdev, 0);
		goto exit;
	}

	/* enable VFs */
	if (pdev->sriov->num_VFs) {
		pci_warn(pdev, "%d VFs already enabled. Disable before enabling %d VFs\n",
			 pdev->sriov->num_VFs, num_vfs);
		ret = -EBUSY;
		goto exit;
	}

	ret = pdev->driver->sriov_configure(pdev, num_vfs);
	if (ret < 0)
		goto exit;

	if (ret != num_vfs)
		pci_warn(pdev, "%d VFs requested; only %d enabled\n",
			 num_vfs, ret);

exit:
	device_unlock(&pdev->dev);

	if (ret < 0)
		return ret;

	return count;
}
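
/*
 * Illustration only, not part of the kernel source: a small user-space
 * sketch of the decision order sriov_numvfs_store() follows once the
 * input has been parsed. enum numvfs_action and numvfs_classify() are
 * hypothetical names for this example. The check for a bound PF driver
 * with an sriov_configure callback (the -ENOENT path) is deliberately
 * left out to keep the function pure.
 */

```c
/* Hypothetical outcome labels for a write to sriov_numvfs. */
enum numvfs_action {
	NUMVFS_ERANGE,	/* request exceeds TotalVFs */
	NUMVFS_NOOP,	/* request equals the current count */
	NUMVFS_DISABLE,	/* request is 0: disable all VFs */
	NUMVFS_EBUSY,	/* VFs already enabled: must disable first */
	NUMVFS_ENABLE,	/* enable the requested number of VFs */
};

/* Mirrors the order of the checks: range, no-op, disable, busy, enable. */
static enum numvfs_action numvfs_classify(unsigned int requested,
					  unsigned int current_vfs,
					  unsigned int total_vfs)
{
	if (requested > total_vfs)
		return NUMVFS_ERANGE;
	if (requested == current_vfs)
		return NUMVFS_NOOP;
	if (requested == 0)
		return NUMVFS_DISABLE;
	if (current_vfs)
		return NUMVFS_EBUSY;
	return NUMVFS_ENABLE;
}
```

/*
 * Because partial disable is not allowed, going from 4 VFs to 2 is
 * classified as NUMVFS_EBUSY: the caller has to write 0 first and then
 * write 2.
 */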

static ssize_t sriov_offset_show(struct device *dev,
				 struct device_attribute *attr,
				 char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);

	return sprintf(buf, "%u\n", pdev->sriov->offset);
}

static ssize_t sriov_stride_show(struct device *dev,
				 struct device_attribute *attr,
				 char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);

	return sprintf(buf, "%u\n", pdev->sriov->stride);
}

static ssize_t sriov_vf_device_show(struct device *dev,
				    struct device_attribute *attr,
				    char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);

	return sprintf(buf, "%x\n", pdev->sriov->vf_device);
}

static ssize_t sriov_drivers_autoprobe_show(struct device *dev,
					    struct device_attribute *attr,
					    char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);

	return sprintf(buf, "%u\n", pdev->sriov->drivers_autoprobe);
}

static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
					     struct device_attribute *attr,
					     const char *buf, size_t count)
{
	struct pci_dev *pdev = to_pci_dev(dev);
	bool drivers_autoprobe;

	if (kstrtobool(buf, &drivers_autoprobe) < 0)
		return -EINVAL;

	pdev->sriov->drivers_autoprobe = drivers_autoprobe;

	return count;
}

static DEVICE_ATTR_RO(sriov_totalvfs);
static DEVICE_ATTR_RW(sriov_numvfs);
static DEVICE_ATTR_RO(sriov_offset);
static DEVICE_ATTR_RO(sriov_stride);
static DEVICE_ATTR_RO(sriov_vf_device);
static DEVICE_ATTR_RW(sriov_drivers_autoprobe);

static struct attribute *sriov_dev_attrs[] = {
	&dev_attr_sriov_totalvfs.attr,
	&dev_attr_sriov_numvfs.attr,
	&dev_attr_sriov_offset.attr,
	&dev_attr_sriov_stride.attr,
	&dev_attr_sriov_vf_device.attr,
	&dev_attr_sriov_drivers_autoprobe.attr,
	NULL,
};

static umode_t sriov_attrs_are_visible(struct kobject *kobj,
				       struct attribute *a, int n)
{
	struct device *dev = kobj_to_dev(kobj);

	if (!dev_is_pf(dev))
		return 0;

	return a->mode;
}

const struct attribute_group sriov_dev_attr_group = {
	.attrs = sriov_dev_attrs,
	.is_visible = sriov_attrs_are_visible,
};

int __weak pcibios_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
{
	return 0;
}

int __weak pcibios_sriov_disable(struct pci_dev *pdev)
{
	return 0;
}

static int sriov_add_vfs(struct pci_dev *dev, u16 num_vfs)
{
	unsigned int i;
	int rc;

	if (dev->no_vf_scan)
		return 0;

	for (i = 0; i < num_vfs; i++) {
		rc = pci_iov_add_virtfn(dev, i);
		if (rc)
			goto failed;
	}
	return 0;
failed:
	while (i--)
		pci_iov_remove_virtfn(dev, i);

	return rc;
}

static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
{
	int rc;
	int i;
	int nres;
	u16 initial;
	struct resource *res;
	struct pci_dev *pdev;
	struct pci_sriov *iov = dev->sriov;
	int bars = 0;
	int bus;

	if (!nr_virtfn)
		return 0;

	if (iov->num_VFs)
		return -EINVAL;

	pci_read_config_word(dev, iov->pos + PCI_SRIOV_INITIAL_VF, &initial);
	if (initial > iov->total_VFs ||
	    (!(iov->cap & PCI_SRIOV_CAP_VFM) && (initial != iov->total_VFs)))
		return -EIO;

	if (nr_virtfn < 0 || nr_virtfn > iov->total_VFs ||
	    (!(iov->cap & PCI_SRIOV_CAP_VFM) && (nr_virtfn > initial)))
		return -EINVAL;

	nres = 0;
	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
		bars |= (1 << (i + PCI_IOV_RESOURCES));
		res = &dev->resource[i + PCI_IOV_RESOURCES];
		if (res->parent)
			nres++;
	}
	if (nres != iov->nres) {
		pci_err(dev, "not enough MMIO resources for SR-IOV\n");
		return -ENOMEM;
	}

	bus = pci_iov_virtfn_bus(dev, nr_virtfn - 1);
	if (bus > dev->bus->busn_res.end) {
		pci_err(dev, "can't enable %d VFs (bus %02x out of range of %pR)\n",
			nr_virtfn, bus, &dev->bus->busn_res);
		return -ENOMEM;
	}

	if (pci_enable_resources(dev, bars)) {
		pci_err(dev, "SR-IOV: IOV BARS not allocated\n");
		return -ENOMEM;
	}

	if (iov->link != dev->devfn) {
		pdev = pci_get_slot(dev->bus, iov->link);
		if (!pdev)
			return -ENODEV;

		if (!pdev->is_physfn) {
			pci_dev_put(pdev);
			return -ENOSYS;
		}

		rc = sysfs_create_link(&dev->dev.kobj,
					&pdev->dev.kobj, "dep_link");
		pci_dev_put(pdev);
		if (rc)
			return rc;
	}

	iov->initial_VFs = initial;
	if (nr_virtfn < initial)
		initial = nr_virtfn;

	rc = pcibios_sriov_enable(dev, initial);
	if (rc) {
		pci_err(dev, "failure %d from pcibios_sriov_enable()\n", rc);
		goto err_pcibios;
	}

	pci_iov_set_numvfs(dev, nr_virtfn);
	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
	pci_cfg_access_lock(dev);
	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
	msleep(100);
	pci_cfg_access_unlock(dev);

	rc = sriov_add_vfs(dev, initial);
	if (rc)
		goto err_pcibios;

	kobject_uevent(&dev->dev.kobj, KOBJ_CHANGE);
	iov->num_VFs = nr_virtfn;

	return 0;

err_pcibios:
	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
	pci_cfg_access_lock(dev);
	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
	ssleep(1);
	pci_cfg_access_unlock(dev);

	pcibios_sriov_disable(dev);

	if (iov->link != dev->devfn)
		sysfs_remove_link(&dev->dev.kobj, "dep_link");

	pci_iov_set_numvfs(dev, 0);
	return rc;
}

static void sriov_del_vfs(struct pci_dev *dev)
{
	struct pci_sriov *iov = dev->sriov;
	int i;

	if (dev->no_vf_scan)
		return;

	for (i = 0; i < iov->num_VFs; i++)
		pci_iov_remove_virtfn(dev, i);
}

static void sriov_disable(struct pci_dev *dev)
{
	struct pci_sriov *iov = dev->sriov;

	if (!iov->num_VFs)
		return;

	sriov_del_vfs(dev);
	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
	pci_cfg_access_lock(dev);
	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
	ssleep(1);
	pci_cfg_access_unlock(dev);

	pcibios_sriov_disable(dev);

	if (iov->link != dev->devfn)
		sysfs_remove_link(&dev->dev.kobj, "dep_link");

	iov->num_VFs = 0;
	pci_iov_set_numvfs(dev, 0);
}

static int sriov_init(struct pci_dev *dev, int pos)
{
	int i, bar64;
	int rc;
	int nres;
	u32 pgsz;
	u16 ctrl, total;
	struct pci_sriov *iov;
	struct resource *res;
	struct pci_dev *pdev;

	pci_read_config_word(dev, pos + PCI_SRIOV_CTRL, &ctrl);
	if (ctrl & PCI_SRIOV_CTRL_VFE) {
		pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, 0);
		ssleep(1);
	}

	ctrl = 0;
	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
		if (pdev->is_physfn)
			goto found;

	pdev = NULL;
	if (pci_ari_enabled(dev->bus))
		ctrl |= PCI_SRIOV_CTRL_ARI;

found:
	pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl);

	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total);
	if (!total)
		return 0;

	pci_read_config_dword(dev, pos + PCI_SRIOV_SUP_PGSIZE, &pgsz);
	i = PAGE_SHIFT > 12 ? PAGE_SHIFT - 12 : 0;
	pgsz &= ~((1 << i) - 1);
	if (!pgsz)
		return -EIO;

	pgsz &= ~(pgsz - 1);
	pci_write_config_dword(dev, pos + PCI_SRIOV_SYS_PGSIZE, pgsz);

	iov = kzalloc(sizeof(*iov), GFP_KERNEL);
	if (!iov)
		return -ENOMEM;

	nres = 0;
	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
		res = &dev->resource[i + PCI_IOV_RESOURCES];
		/*
		 * If it is already FIXED, don't change it, something
		 * (perhaps EA or header fixups) wants it this way.
		 */
		if (res->flags & IORESOURCE_PCI_FIXED)
			bar64 = (res->flags & IORESOURCE_MEM_64) ? 1 : 0;
		else
			bar64 = __pci_read_base(dev, pci_bar_unknown, res,
						pos + PCI_SRIOV_BAR + i * 4);
		if (!res->flags)
			continue;
		if (resource_size(res) & (PAGE_SIZE - 1)) {
			rc = -EIO;
			goto failed;
		}
		iov->barsz[i] = resource_size(res);
		res->end = res->start + resource_size(res) * total - 1;
		pci_info(dev, "VF(n) BAR%d space: %pR (contains BAR%d for %d VFs)\n",
			 i, res, i, total);
		i += bar64;
		nres++;
	}

	iov->pos = pos;
	iov->nres = nres;
	iov->ctrl = ctrl;
	iov->total_VFs = total;
	iov->driver_max_VFs = total;
	pci_read_config_word(dev, pos + PCI_SRIOV_VF_DID, &iov->vf_device);
	iov->pgsz = pgsz;
	iov->self = dev;
	iov->drivers_autoprobe = true;
	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
	pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, &iov->link);
	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END)
		iov->link = PCI_DEVFN(PCI_SLOT(dev->devfn), iov->link);

	if (pdev)
		iov->dev = pci_dev_get(pdev);
	else
		iov->dev = dev;

	dev->sriov = iov;
	dev->is_physfn = 1;
	rc = compute_max_vf_buses(dev);
	if (rc)
		goto fail_max_buses;

	return 0;

fail_max_buses:
	dev->sriov = NULL;
	dev->is_physfn = 0;
failed:
	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
		res = &dev->resource[i + PCI_IOV_RESOURCES];
		res->flags = 0;
	}

	kfree(iov);
	return rc;
}
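
/*
 * Illustration only, not part of the kernel source: a minimal user-space
 * sketch of how sriov_init() narrows the Supported Page Sizes bitmask to
 * a single System Page Size bit. pick_vf_page_size() is a hypothetical
 * name, and page_shift stands in for PAGE_SHIFT. The sketch assumes
 * page_shift is well below 44 so the shift stays inside 32 bits.
 */

```c
#include <stdint.h>

/*
 * supported:  bit n set means a page size of 2^(n + 12) bytes is supported.
 * page_shift: log2 of the system page size (12 for 4 KiB pages).
 * Returns a mask with exactly one bit set, or 0 if no supported size is
 * at least as large as the system page size.
 */
static uint32_t pick_vf_page_size(uint32_t supported, unsigned int page_shift)
{
	unsigned int skip = page_shift > 12 ? page_shift - 12 : 0;

	/* Drop every size smaller than the system page size. */
	supported &= ~((1u << skip) - 1);
	if (!supported)
		return 0;

	/* Keep only the lowest remaining bit, i.e. the smallest usable size. */
	return supported & ~(supported - 1);
}
```

/*
 * With a supported mask of 0x553, a 4 KiB system page (shift 12) selects
 * bit 0 (0x1), and a 64 KiB system page (shift 16) selects bit 4 (0x10).
 * A device that only advertises 4 KiB (0x1) yields 0 on a 64 KiB system,
 * which corresponds to the -EIO return in sriov_init().
 */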

static void sriov_release(struct pci_dev *dev)
{
	BUG_ON(dev->sriov->num_VFs);

	if (dev != dev->sriov->dev)
		pci_dev_put(dev->sriov->dev);

	kfree(dev->sriov);
	dev->sriov = NULL;
}

static void sriov_restore_state(struct pci_dev *dev)
{
	int i;
	u16 ctrl;
	struct pci_sriov *iov = dev->sriov;

	pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &ctrl);
	if (ctrl & PCI_SRIOV_CTRL_VFE)
		return;
PCI: Restore ARI Capable Hierarchy before setting numVFs
In the restore path, we previously read PCI_SRIOV_VF_OFFSET and
PCI_SRIOV_VF_STRIDE before restoring PCI_SRIOV_CTRL_ARI:
pci_restore_state
pci_restore_iov_state
sriov_restore_state
pci_iov_set_numvfs
pci_read_config_word(... PCI_SRIOV_VF_OFFSET, &iov->offset)
pci_read_config_word(... PCI_SRIOV_VF_STRIDE, &iov->stride)
pci_write_config_word(... PCI_SRIOV_CTRL, iov->ctrl)
But per SR-IOV r1.1, sec 3.3.3.5, the device can use PCI_SRIOV_CTRL_ARI to
determine PCI_SRIOV_VF_OFFSET and PCI_SRIOV_VF_STRIDE. Therefore, this
path, which is used for suspend/resume and AER recovery, can corrupt
iov->offset and iov->stride.
Since the iov state is associated with the device, not the driver, if we
reload the driver, it will use the corrupted data, which may cause
crashes like this:
kernel BUG at drivers/pci/iov.c:157!
RIP: 0010:pci_iov_add_virtfn+0x2eb/0x350
Call Trace:
pci_enable_sriov+0x353/0x440
ixgbe_pci_sriov_configure+0xd5/0x1f0 [ixgbe]
sriov_numvfs_store+0xf7/0x170
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
kernfs_fop_write+0x120/0x1b0
vfs_write+0xb5/0x1a0
SyS_write+0x55/0xc0
Restore PCI_SRIOV_CTRL_ARI before calling pci_iov_set_numvfs(), then
restore the rest of PCI_SRIOV_CTRL (which may set PCI_SRIOV_CTRL_VFE)
afterwards.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[bhelgaas: changelog, add comment, also clear ARI if necessary]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Emil Tantilov <emil.s.tantilov@intel.com>
2017-10-04 18:52:58 +03:00
|
|
|
/*
|
|
|
|
* Restore PCI_SRIOV_CTRL_ARI before pci_iov_set_numvfs() because
|
|
|
|
* it reads offset & stride, which depend on PCI_SRIOV_CTRL_ARI.
|
|
|
|
*/
|
|
|
|
ctrl &= ~PCI_SRIOV_CTRL_ARI;
|
|
|
|
ctrl |= iov->ctrl & PCI_SRIOV_CTRL_ARI;
|
|
|
|
pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, ctrl);
|
|
|
|
|
2019-08-06 17:07:15 +03:00
|
|
|
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
|
|
|
|
pci_update_resource(dev, i + PCI_IOV_RESOURCES);
|
2009-03-20 06:25:12 +03:00
|
|
|
|
|
|
|
pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
|
2015-03-25 11:23:46 +03:00
|
|
|
pci_iov_set_numvfs(dev, iov->num_VFs);
|
2009-03-20 06:25:12 +03:00
|
|
|
pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
|
|
|
|
if (iov->ctrl & PCI_SRIOV_CTRL_VFE)
|
|
|
|
msleep(100);
|
|
|
|
}
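The first PCI_SRIOV_CTRL write in sriov_restore_state() changes only the ARI bit, so that the VF offset and stride are read under the saved ARI setting. Below is a minimal userspace sketch of that masking step. The helper name ctrl_with_saved_ari() is hypothetical and does not exist in iov.c; the bit values are the ones defined in include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_SRIOV_CTRL_VFE 0x0001 /* VF Enable */
#define PCI_SRIOV_CTRL_ARI 0x0010 /* ARI Capable Hierarchy */

/*
 * Hypothetical helper (not part of iov.c): take the control word just
 * read from the device and replace only its ARI bit with the saved one.
 * Every other bit, including VFE, keeps its current value.
 */
static uint16_t ctrl_with_saved_ari(uint16_t current, uint16_t saved)
{
	uint16_t out = current;

	out &= (uint16_t)~PCI_SRIOV_CTRL_ARI;
	out |= saved & PCI_SRIOV_CTRL_ARI;

	/* Only the ARI bit may differ from the value read from the device. */
	assert(((out ^ current) & (uint16_t)~PCI_SRIOV_CTRL_ARI) == 0);
	return out;
}
```

With a saved value that has both VFE and ARI set, the intermediate word carries ARI only; VFE is written later, together with the rest of the saved control word.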
|
|
|
|
|
2009-03-20 06:25:11 +03:00
|
|
|
/**
|
|
|
|
* pci_iov_init - initialize the IOV capability
|
|
|
|
* @dev: the PCI device
|
|
|
|
*
|
|
|
|
* Returns 0 on success, or negative on failure.
|
|
|
|
*/
|
|
|
|
int pci_iov_init(struct pci_dev *dev)
|
|
|
|
{
|
|
|
|
int pos;
|
|
|
|
|
2009-11-11 08:36:17 +03:00
|
|
|
if (!pci_is_pcie(dev))
|
2009-03-20 06:25:11 +03:00
|
|
|
return -ENODEV;
|
|
|
|
|
|
|
|
pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV);
|
|
|
|
if (pos)
|
|
|
|
return sriov_init(dev, pos);
|
|
|
|
|
|
|
|
return -ENODEV;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* pci_iov_release - release resources used by the IOV capability
|
|
|
|
* @dev: the PCI device
|
|
|
|
*/
|
|
|
|
void pci_iov_release(struct pci_dev *dev)
|
|
|
|
{
|
|
|
|
if (dev->is_physfn)
|
|
|
|
sriov_release(dev);
|
|
|
|
}
|
|
|
|
|
PCI/IOV: Reset total_VFs limit after detaching PF driver
The TotalVFs register in the SR-IOV capability is the hardware limit on the
number of VFs. A PF driver can limit the number of VFs further with
pci_sriov_set_totalvfs(). When the PF driver is removed, reset any VF
limit that was imposed by the driver because that limit may not apply to
other drivers.
Before 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0"),
pci_sriov_set_totalvfs(pdev, 0) meant "we can enable TotalVFs virtual
functions", and the nfp driver used that to remove the VF limit when the
driver unloads.
8d85a7a4f2c9 broke that because instead of removing the VF limit,
pci_sriov_set_totalvfs(pdev, 0) actually sets the limit to zero, and that
limit persists even if another driver is loaded.
We could fix that by making the nfp driver reset the limit when it unloads,
but it seems more robust to do it in the PCI core instead of relying on the
driver.
The regression scenario is:
nfp_pci_probe (driver 1)
...
nfp_pci_remove
pci_sriov_set_totalvfs(pf->pdev, 0) # limits VFs to 0
...
nfp_pci_probe (driver 2)
nfp_rtsym_read_le("nfd_vf_cfg_max_vfs")
# no VF limit from firmware
Now driver 2 is broken because the VF limit is still 0 from driver 1.
Fixes: 8d85a7a4f2c9 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[bhelgaas: changelog, rename functions]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-29 23:08:52 +03:00
|
|
|
/**
|
|
|
|
* pci_iov_remove - clean up SR-IOV state after PF driver is detached
|
|
|
|
* @dev: the PCI device
|
|
|
|
*/
|
|
|
|
void pci_iov_remove(struct pci_dev *dev)
|
|
|
|
{
|
|
|
|
struct pci_sriov *iov = dev->sriov;
|
|
|
|
|
|
|
|
if (!dev->is_physfn)
|
|
|
|
return;
|
|
|
|
|
|
|
|
iov->driver_max_VFs = iov->total_VFs;
|
|
|
|
if (iov->num_VFs)
|
|
|
|
pci_warn(dev, "driver left SR-IOV enabled after remove\n");
|
|
|
|
}
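The regression described in the commit message can be replayed with a small userspace model. This is a sketch under stated assumptions: struct sriov_model and the model_*() functions are hypothetical stand-ins for struct pci_sriov, pci_sriov_set_totalvfs() and pci_iov_remove(), and a plain vfs_enabled flag replaces the PCI_SRIOV_CTRL_VFE check.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct pci_sriov. */
struct sriov_model {
	uint16_t total_VFs;      /* hardware limit (TotalVFs register) */
	uint16_t driver_max_VFs; /* limit imposed by the PF driver */
	int vfs_enabled;         /* stands in for PCI_SRIOV_CTRL_VFE */
};

/* Models pci_sriov_set_totalvfs(): a driver may only lower the limit. */
static int model_set_totalvfs(struct sriov_model *iov, uint16_t numvfs)
{
	assert(iov);
	if (numvfs > iov->total_VFs)
		return -EINVAL;
	if (iov->vfs_enabled)
		return -EBUSY;
	iov->driver_max_VFs = numvfs;
	return 0;
}

/* Models pci_iov_remove(): drop the departing driver's limit. */
static void model_remove(struct sriov_model *iov)
{
	assert(iov);
	iov->driver_max_VFs = iov->total_VFs;
}
```

If the first driver limits the PF to 0 VFs and is then removed, the next driver again sees the hardware limit instead of inheriting the stale 0.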
|
|
|
|
|
2016-11-28 18:15:52 +03:00
|
|
|
/**
|
|
|
|
* pci_iov_update_resource - update a VF BAR
|
|
|
|
* @dev: the PCI device
|
|
|
|
* @resno: the resource number
|
|
|
|
*
|
|
|
|
* Update a VF BAR in the SR-IOV capability of a PF.
|
|
|
|
*/
|
|
|
|
void pci_iov_update_resource(struct pci_dev *dev, int resno)
|
|
|
|
{
|
|
|
|
struct pci_sriov *iov = dev->is_physfn ? dev->sriov : NULL;
|
|
|
|
struct resource *res = dev->resource + resno;
|
|
|
|
int vf_bar = resno - PCI_IOV_RESOURCES;
|
|
|
|
struct pci_bus_region region;
|
2016-11-29 01:43:06 +03:00
|
|
|
u16 cmd;
|
2016-11-28 18:15:52 +03:00
|
|
|
u32 new;
|
|
|
|
int reg;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The generic pci_restore_bars() path calls this for all devices,
|
|
|
|
* including VFs and non-SR-IOV devices. If this is not a PF, we
|
|
|
|
* have nothing to do.
|
|
|
|
*/
|
|
|
|
if (!iov)
|
|
|
|
return;
|
|
|
|
|
2016-11-29 01:43:06 +03:00
|
|
|
pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &cmd);
|
|
|
|
if ((cmd & PCI_SRIOV_CTRL_VFE) && (cmd & PCI_SRIOV_CTRL_MSE)) {
|
|
|
|
dev_WARN(&dev->dev, "can't update enabled VF BAR%d %pR\n",
|
|
|
|
vf_bar, res);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2016-11-28 18:15:52 +03:00
|
|
|
/*
|
|
|
|
* Ignore unimplemented BARs, unused resource slots for 64-bit
|
|
|
|
* BARs, and non-movable resources, e.g., those described via
|
|
|
|
* Enhanced Allocation.
|
|
|
|
*/
|
|
|
|
if (!res->flags)
|
|
|
|
return;
|
|
|
|
|
|
|
|
if (res->flags & IORESOURCE_UNSET)
|
|
|
|
return;
|
|
|
|
|
|
|
|
if (res->flags & IORESOURCE_PCI_FIXED)
|
|
|
|
return;
|
|
|
|
|
|
|
|
pcibios_resource_to_bus(dev->bus, &region, res);
|
|
|
|
new = region.start;
|
|
|
|
new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
|
|
|
|
|
|
|
|
reg = iov->pos + PCI_SRIOV_BAR + 4 * vf_bar;
|
|
|
|
pci_write_config_dword(dev, reg, new);
|
|
|
|
if (res->flags & IORESOURCE_MEM_64) {
|
|
|
|
new = region.start >> 16 >> 16;
|
|
|
|
pci_write_config_dword(dev, reg + 4, new);
|
|
|
|
}
|
|
|
|
}
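The tail of pci_iov_update_resource() splits a bus address into the two 32-bit values written to a VF BAR. The sketch below reproduces that arithmetic in userspace. The struct vf_bar_dwords type and encode_vf_bar() are hypothetical names introduced here for illustration; PCI_BASE_ADDRESS_MEM_MASK follows the definition in pci_regs.h. The double shift (>> 16 >> 16) is kept as in the original, presumably because a single shift by 32 would be undefined when the address type is only 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Address bits of a memory BAR; the low four bits hold type flags. */
#define PCI_BASE_ADDRESS_MEM_MASK (~0x0fULL)

/* Hypothetical container for the two config dwords of one VF BAR. */
struct vf_bar_dwords {
	uint32_t lo; /* written at reg */
	uint32_t hi; /* written at reg + 4, only used for 64-bit BARs */
};

/* Hypothetical helper mirroring the encoding in pci_iov_update_resource(). */
static struct vf_bar_dwords encode_vf_bar(uint64_t bus_addr, uint64_t flags)
{
	struct vf_bar_dwords out;

	/* A memory BAR address never uses the low four flag bits. */
	assert((bus_addr & ~PCI_BASE_ADDRESS_MEM_MASK) == 0);

	out.lo = (uint32_t)bus_addr;
	out.lo |= (uint32_t)(flags & ~PCI_BASE_ADDRESS_MEM_MASK);
	out.hi = (uint32_t)(bus_addr >> 16 >> 16);
	return out;
}
```

Unlike the kernel function, this sketch always computes the upper dword; the kernel writes it only when IORESOURCE_MEM_64 is set.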
|
|
|
|
|
2015-03-25 11:23:50 +03:00
|
|
|
resource_size_t __weak pcibios_iov_resource_alignment(struct pci_dev *dev,
|
|
|
|
int resno)
|
|
|
|
{
|
|
|
|
return pci_iov_resource_size(dev, resno);
|
|
|
|
}
|
|
|
|
|
2009-08-29 00:00:06 +04:00
|
|
|
/**
|
|
|
|
* pci_sriov_resource_alignment - get resource alignment for VF BAR
|
|
|
|
* @dev: the PCI device
|
|
|
|
* @resno: the resource number
|
|
|
|
*
|
|
|
|
* Returns the alignment of the VF BAR found in the SR-IOV capability.
|
|
|
|
* This is not the same as the resource size which is defined as
|
|
|
|
* the VF BAR size multiplied by the number of VFs. The alignment
|
|
|
|
* is just the VF BAR size.
|
|
|
|
*/
|
2010-09-08 04:25:20 +04:00
|
|
|
resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
|
2009-08-29 00:00:06 +04:00
|
|
|
{
|
2015-03-25 11:23:50 +03:00
|
|
|
return pcibios_iov_resource_alignment(dev, resno);
|
2009-08-29 00:00:06 +04:00
|
|
|
}
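The kernel-doc for pci_sriov_resource_alignment() separates the resource size (VF BAR size times the number of VFs) from the alignment (one VF BAR size). A short sketch of why the smaller alignment is enough, assuming that VF n's slice starts at base + n * VF BAR size; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Total size of the PF's IOV resource for one VF BAR. */
static uint64_t iov_resource_size(uint64_t vf_bar_size, unsigned int nr_vfs)
{
	return vf_bar_size * nr_vfs;
}

/*
 * Start of the slice used by VF number vf_id (assumption: slices are
 * laid out back to back from base, one VF BAR size apart).
 */
static uint64_t vf_bar_start(uint64_t base, uint64_t vf_bar_size,
			     unsigned int vf_id)
{
	/* BAR sizes are powers of two; base must be aligned to one slice. */
	assert(vf_bar_size && (vf_bar_size & (vf_bar_size - 1)) == 0);
	assert((base & (vf_bar_size - 1)) == 0);
	return base + vf_bar_size * vf_id;
}
```

For eight VFs with 16 KiB BARs, the resource spans 128 KiB, yet a base aligned to only 16 KiB still leaves every VF slice naturally aligned.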
|
|
|
|
|
2009-03-20 06:25:12 +03:00
|
|
|
/**
|
|
|
|
* pci_restore_iov_state - restore the state of the IOV capability
|
|
|
|
* @dev: the PCI device
|
|
|
|
*/
|
|
|
|
void pci_restore_iov_state(struct pci_dev *dev)
|
|
|
|
{
|
|
|
|
if (dev->is_physfn)
|
|
|
|
sriov_restore_state(dev);
|
|
|
|
}
|
2009-03-20 06:25:13 +03:00
|
|
|
|
2017-11-09 17:00:35 +03:00
|
|
|
/**
|
|
|
|
* pci_vf_drivers_autoprobe - set PF property drivers_autoprobe for VFs
|
|
|
|
* @dev: the PCI device
|
|
|
|
* @auto_probe: set VF drivers auto probe flag
|
|
|
|
*/
|
|
|
|
void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool auto_probe)
|
|
|
|
{
|
|
|
|
if (dev->is_physfn)
|
|
|
|
dev->sriov->drivers_autoprobe = auto_probe;
|
|
|
|
}
|
|
|
|
|
2009-03-20 06:25:13 +03:00
|
|
|
/**
|
|
|
|
* pci_iov_bus_range - find bus range used by Virtual Function
|
|
|
|
* @bus: the PCI bus
|
|
|
|
*
|
|
|
|
* Returns max number of buses (excluding the current one) used by Virtual
|
|
|
|
* Functions.
|
|
|
|
*/
|
|
|
|
int pci_iov_bus_range(struct pci_bus *bus)
|
|
|
|
{
|
|
|
|
int max = 0;
|
|
|
|
struct pci_dev *dev;
|
|
|
|
|
|
|
|
list_for_each_entry(dev, &bus->devices, bus_list) {
|
|
|
|
if (!dev->is_physfn)
|
|
|
|
continue;
|
2015-03-25 11:23:47 +03:00
|
|
|
if (dev->sriov->max_VF_buses > max)
|
|
|
|
max = dev->sriov->max_VF_buses;
|
2009-03-20 06:25:13 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
return max ? max - bus->number : 0;
|
|
|
|
}
|
2009-03-20 06:25:15 +03:00
|
|
|
|
|
|
|
/**
|
|
|
|
* pci_enable_sriov - enable the SR-IOV capability
|
|
|
|
* @dev: the PCI device
|
2009-04-02 04:45:30 +04:00
|
|
|
* @nr_virtfn: number of virtual functions to enable
|
2009-03-20 06:25:15 +03:00
|
|
|
*
|
|
|
|
* Returns 0 on success, or negative on failure.
|
|
|
|
*/
|
|
|
|
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
|
|
|
|
{
|
|
|
|
might_sleep();
|
|
|
|
|
|
|
|
if (!dev->is_physfn)
|
2013-08-01 02:47:56 +04:00
|
|
|
return -ENOSYS;
|
2009-03-20 06:25:15 +03:00
|
|
|
|
|
|
|
return sriov_enable(dev, nr_virtfn);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_enable_sriov);
|
|
|
|
|
|
|
|
/**
|
|
|
|
* pci_disable_sriov - disable the SR-IOV capability
|
|
|
|
* @dev: the PCI device
|
|
|
|
*/
|
|
|
|
void pci_disable_sriov(struct pci_dev *dev)
|
|
|
|
{
|
|
|
|
might_sleep();
|
|
|
|
|
|
|
|
if (!dev->is_physfn)
|
|
|
|
return;
|
|
|
|
|
|
|
|
sriov_disable(dev);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_disable_sriov);
|
2009-03-20 06:25:16 +03:00
|
|
|
|
2010-02-10 04:43:04 +03:00
|
|
|
/**
|
|
|
|
* pci_num_vf - return number of VFs associated with a PF
|
|
|
|
* @dev: the PCI device
|
|
|
|
*
|
|
|
|
* Returns number of VFs, or 0 if SR-IOV is not enabled.
|
|
|
|
*/
|
|
|
|
int pci_num_vf(struct pci_dev *dev)
|
|
|
|
{
|
2012-11-10 07:35:01 +04:00
|
|
|
if (!dev->is_physfn)
|
2010-02-10 04:43:04 +03:00
|
|
|
return 0;
|
2012-11-10 07:35:01 +04:00
|
|
|
|
|
|
|
return dev->sriov->num_VFs;
|
2010-02-10 04:43:04 +03:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_num_vf);
|
2012-11-06 00:20:37 +04:00
|
|
|
|
2013-04-25 08:42:29 +04:00
|
|
|
/**
|
|
|
|
* pci_vfs_assigned - return number of VFs that are assigned to a guest
|
|
|
|
* @dev: the PCI device
|
|
|
|
*
|
|
|
|
* Returns number of VFs belonging to this device that are assigned to a guest.
|
2013-08-01 02:47:56 +04:00
|
|
|
* If device is not a physical function returns 0.
|
2013-04-25 08:42:29 +04:00
|
|
|
*/
|
|
|
|
int pci_vfs_assigned(struct pci_dev *dev)
|
|
|
|
{
|
|
|
|
struct pci_dev *vfdev;
|
|
|
|
unsigned int vfs_assigned = 0;
|
|
|
|
unsigned short dev_id;
|
|
|
|
|
|
|
|
/* only search if we are a PF */
|
|
|
|
if (!dev->is_physfn)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* determine the device ID for the VFs, the vendor ID will be the
|
|
|
|
* same as the PF so there is no need to check for that one
|
|
|
|
*/
|
2017-08-28 16:38:49 +03:00
|
|
|
dev_id = dev->sriov->vf_device;
|
2013-04-25 08:42:29 +04:00
|
|
|
|
|
|
|
/* loop through all the VFs to see if we own any that are assigned */
|
|
|
|
vfdev = pci_get_device(dev->vendor, dev_id, NULL);
|
|
|
|
while (vfdev) {
|
|
|
|
/*
|
|
|
|
* It is considered assigned if it is a virtual function with
|
|
|
|
* our dev as the physical function and the assigned bit is set
|
|
|
|
*/
|
|
|
|
if (vfdev->is_virtfn && (vfdev->physfn == dev) &&
|
2014-09-09 06:21:28 +04:00
|
|
|
pci_is_dev_assigned(vfdev))
|
2013-04-25 08:42:29 +04:00
|
|
|
vfs_assigned++;
|
|
|
|
|
|
|
|
vfdev = pci_get_device(dev->vendor, dev_id, vfdev);
|
|
|
|
}
|
|
|
|
|
|
|
|
return vfs_assigned;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_vfs_assigned);
|
|
|
|
|
2012-11-06 00:20:37 +04:00
|
|
|
/**
|
|
|
|
* pci_sriov_set_totalvfs -- reduce the TotalVFs available
|
|
|
|
* @dev: the PCI PF device
|
2013-01-10 05:12:52 +04:00
|
|
|
* @numvfs: number that should be used for TotalVFs supported
|
2012-11-06 00:20:37 +04:00
|
|
|
*
|
|
|
|
* Should be called from PF driver's probe routine with
|
|
|
|
* device's mutex held.
|
|
|
|
*
|
|
|
|
* Returns 0 if PF is an SRIOV-capable device and
|
2013-08-01 02:47:56 +04:00
|
|
|
* value of numvfs is valid. If not a PF, return -ENOSYS;
|
|
|
|
* if numvfs is invalid return -EINVAL;
|
2012-11-06 00:20:37 +04:00
|
|
|
* if VFs already enabled, return -EBUSY.
|
|
|
|
*/
|
|
|
|
int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
|
|
|
|
{
|
2013-08-01 02:47:56 +04:00
|
|
|
if (!dev->is_physfn)
|
|
|
|
return -ENOSYS;
|
2018-05-25 16:51:50 +03:00
|
|
|
|
2013-08-01 02:47:56 +04:00
|
|
|
if (numvfs > dev->sriov->total_VFs)
|
2012-11-06 00:20:37 +04:00
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
/* Shouldn't change if VFs already enabled */
|
|
|
|
if (dev->sriov->ctrl & PCI_SRIOV_CTRL_VFE)
|
|
|
|
return -EBUSY;
|
|
|
|
|
2018-05-25 16:51:50 +03:00
|
|
|
dev->sriov->driver_max_VFs = numvfs;
|
2012-11-06 00:20:37 +04:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_sriov_set_totalvfs);
|
|
|
|
|
|
|
|
/**
|
2013-07-09 00:02:43 +04:00
|
|
|
* pci_sriov_get_totalvfs -- get total VFs supported on this device
|
2012-11-06 00:20:37 +04:00
|
|
|
* @dev: the PCI PF device
|
|
|
|
*
|
|
|
|
* For a PCIe device with SRIOV support, return the PCIe
|
2012-11-10 07:27:53 +04:00
|
|
|
* SRIOV capability value of TotalVFs or the value of driver_max_VFs
|
2013-08-01 02:47:56 +04:00
|
|
|
* if the driver reduced it. Otherwise 0.
|
2012-11-06 00:20:37 +04:00
|
|
|
*/
|
|
|
|
int pci_sriov_get_totalvfs(struct pci_dev *dev)
|
|
|
|
{
|
2012-11-10 07:35:01 +04:00
|
|
|
if (!dev->is_physfn)
|
2013-08-01 02:47:56 +04:00
|
|
|
return 0;
|
2012-11-06 00:20:37 +04:00
|
|
|
|
2018-05-25 16:18:34 +03:00
|
|
|
return dev->sriov->driver_max_VFs;
|
2012-11-06 00:20:37 +04:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs);
|
2018-04-21 23:23:09 +03:00
|
|
|
|
|
|
|
/**
|
|
|
|
* pci_sriov_configure_simple - helper to configure SR-IOV
|
|
|
|
* @dev: the PCI device
|
|
|
|
* @nr_virtfn: number of virtual functions to enable, 0 to disable
|
|
|
|
*
|
|
|
|
* Enable or disable SR-IOV for devices that don't require any PF setup
|
|
|
|
* before enabling SR-IOV. Return value is negative on error, or number of
|
|
|
|
* VFs allocated on success.
|
|
|
|
*/
|
|
|
|
int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn)
|
|
|
|
{
|
|
|
|
int rc;
|
|
|
|
|
|
|
|
might_sleep();
|
|
|
|
|
|
|
|
if (!dev->is_physfn)
|
|
|
|
return -ENODEV;
|
|
|
|
|
|
|
|
if (pci_vfs_assigned(dev)) {
|
|
|
|
pci_warn(dev, "Cannot modify SR-IOV while VFs are assigned\n");
|
|
|
|
return -EPERM;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (nr_virtfn == 0) {
|
|
|
|
sriov_disable(dev);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
rc = sriov_enable(dev, nr_virtfn);
|
|
|
|
if (rc < 0)
|
|
|
|
return rc;
|
|
|
|
|
|
|
|
return nr_virtfn;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL_GPL(pci_sriov_configure_simple);
|