Enable resizable BAR support
Resizable BAR support is a PCIe extension that allows resizing a PCIe device's
mappable memory/register space (also referred to as BARs - after the Base
Address Register that sets up the region). An important use case is GPUs.
While data center GPUs generally have BAR sizes that match the size of video
memory, consumer and workstation GPUs generally declare only 256 MiB worth
of BARs mapping GPU memory to maintain compatibility with 32-bit operating
systems. However, for performance (particularly when using PCIe P2P), it is
desirable to be able to map the entirety of GPU memory, necessitating
resizable BARs.
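
As a rough illustration of the size encoding involved: the Resizable BAR
capability describes sizes as an index n meaning 2^(n+20) bytes, so index 0 is
1 MiB and index 8 is the usual 256 MiB. The user-space sketch below is not
part of this patch and all helper names in it are hypothetical; it mirrors
what `pci_rebar_bytes_to_size()` and `fls(sizes) - 1` compute in the driver
code, assuming a bitmask in which bit n marks size index n as supported.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ReBAR size index -> bytes.
 * Index 0 is 1 MiB; every step doubles the size. */
static uint64_t rebar_index_to_bytes(unsigned index)
{
    assert(index < 44); /* keep the shift inside 64 bits */
    return (uint64_t)1 << (index + 20);
}

/* Hypothetical helper: bytes -> smallest ReBAR size index that covers them.
 * Sizes that are not a power of two are rounded up. */
static unsigned rebar_bytes_to_index(uint64_t bytes)
{
    unsigned index = 0;

    assert(bytes >= ((uint64_t)1 << 20));
    while ((bytes >> 20) > ((uint64_t)1 << index))
        index++;
    return index;
}

/* Hypothetical helper: highest size index set in a "possible sizes" bitmask.
 * Equivalent to fls(sizes) - 1; returns -1 when no bit is set. */
static int rebar_largest_index(uint32_t sizes)
{
    int index = -1;

    while (sizes) {
        index++;
        sizes >>= 1;
    }
    return index;
}
```

For example, a device advertising every size from 256 MiB to 64 GiB would
report the mask 0x1FF00, and `rebar_largest_index(0x1FF00)` yields index 16,
which `rebar_index_to_bytes()` maps to 64 GiB.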

However, while PCIe ReBAR has been a standard for more than 10 years,
it was rarely used until a few years ago, so support is still lacking.
On very recent motherboards (generally after 2020), a BIOS update might
be available that causes the firmware to read and reserve space for a
ReBAR expansion (or even perform the BAR resize itself). However, older
motherboards do not have this support.

Fortunately for us, the Linux kernel has some support to do its own PCIe
enumeration without relying on the firmware to do everything. Linux even has
support for resizable BARs, though in practice there are a number of important
limitations:

* There is currently no support for movable BARs in Linux. This means that if
  there are adjacent address space allocations, it is quite possible for BAR
  resizing to fail. There was a WIP patch series to resolve this issue at
  https://patchwork.ozlabs.org/project/linux-pci/cover/20201218174011.340514-1-s.miroshnichenko@yadro.com/
  but it appears to have faded out.
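
To make the adjacency problem concrete, here is a toy user-space model, not
the kernel's actual allocation algorithm. It assumes a BAR must be naturally
aligned to its power-of-two size and that the space available to it ends
where the next immovable allocation begins. The function names and the
addresses in the usage note are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to a multiple of size (size must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
    return (addr + size - 1) & ~(size - 1);
}

/* Toy check: can a naturally aligned BAR of bar_size bytes be placed at or
 * above window_start without reaching limit, the start of the next
 * allocation that cannot be moved?
 * Assumes addresses small enough that the additions do not overflow. */
static bool bar_fits(uint64_t window_start, uint64_t limit, uint64_t bar_size)
{
    uint64_t base = align_up(window_start, bar_size);

    return base + bar_size <= limit;
}
```

With a window starting at 0xE0000000 and a neighbour at 0xF0000000, a 256 MiB
BAR fits exactly, but an 8 GiB BAR does not, because nothing can be moved out
of the way. The same 8 GiB BAR fits in a hypothetical 64-bit window that
starts at 0x800000000 and is free up to 0x1000000000.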
Keno authored and sjkelly committed May 11, 2022
1 parent 1739a20 commit 8d3c522
Showing 2 changed files with 83 additions and 0 deletions.
3 changes: 3 additions & 0 deletions kernel-open/common/inc/nv-pci.h
@@ -38,6 +38,9 @@
#define nv_dev_is_pci(dev) (true)
#endif

#define NV_GPU_BAR1 1
#define NV_GPU_BAR3 3

int nv_pci_register_driver(void);
void nv_pci_unregister_driver(void);
int nv_pci_count_devices(void);
80 changes: 80 additions & 0 deletions kernel-open/nvidia/nv-pci.c
@@ -223,7 +223,81 @@ static void nv_init_dynamic_power_management



static int nv_resize_pcie_bars(struct pci_dev *pci_dev) {
    struct pci_host_bridge *host;
    u16 cmd;
    int r, old_size, requested_size;
    int ret = 0;

    // Check if BAR1 has PCIe rebar capabilities
    u32 sizes = pci_rebar_get_possible_sizes(pci_dev, NV_GPU_BAR1);
    if (sizes == 0) {
        /* ReBAR not available. Nothing to do. */
        return 0;
    }

    /* Try to resize the BAR to the largest supported size */
    requested_size = fls(sizes) - 1;

    /* If the kernel will refuse us, don't even try to resize,
       but give an informative error */
    host = pci_find_host_bridge(pci_dev->bus);
    if (host->preserve_config) {
        nv_printf(NV_DBG_INFO, "NVRM: Not resizing BAR because the firmware forbids moving windows.\n");
        return 0;
    }

    nv_printf(NV_DBG_INFO, "NVRM: %04x:%02x:%02x.%x: Attempting to resize BAR1.\n",
              NV_PCI_DOMAIN_NUMBER(pci_dev), NV_PCI_BUS_NUMBER(pci_dev),
              NV_PCI_SLOT_NUMBER(pci_dev), PCI_FUNC(pci_dev->devfn));

    /* Disable memory decoding - required by the kernel APIs */
    pci_read_config_word(pci_dev, PCI_COMMAND, &cmd);
    pci_write_config_word(pci_dev, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY);

    /* Release BAR1 */
    pci_release_resource(pci_dev, NV_GPU_BAR1);

    /* Release BAR3 - we don't want to resize it, it's in the same bridge, so we'll want to move it */
    pci_release_resource(pci_dev, NV_GPU_BAR3);

    /* Save the current size, just in case things go wrong */
    old_size = pci_rebar_bytes_to_size(pci_resource_len(pci_dev, NV_GPU_BAR1));

resize:
    /* Attempt to resize BAR1 to the largest supported size */
    r = pci_resize_resource(pci_dev, NV_GPU_BAR1, requested_size);

    if (r) {
        if (r == -ENOSPC)
            nv_printf(NV_DBG_ERRORS, "NVRM: No address space to allocate resized BAR1.\n");
        else
            nv_printf(NV_DBG_ERRORS, "NVRM: BAR resizing failed with error `%d`.\n", r);
    }

    /* Re-attempt assignment of PCIe resources */
    pci_assign_unassigned_bus_resources(pci_dev->bus);

    if ((pci_resource_flags(pci_dev, NV_GPU_BAR1) & IORESOURCE_UNSET) ||
        (pci_resource_flags(pci_dev, NV_GPU_BAR3) & IORESOURCE_UNSET)) {
        if (requested_size != old_size) {
            /* Try to get the BAR back with the original size */
            requested_size = old_size;
            goto resize;
        }
        /* Something went horribly wrong and the kernel didn't manage to re-allocate BAR1.
           This is unlikely (because we had space before), but can happen. */
        nv_printf(NV_DBG_ERRORS, "NVRM: FATAL: Failed to re-allocate BAR1.\n");
        ret = -ENODEV;
        goto done;
    }

done:
    /* Re-enable memory decoding */
    pci_write_config_word(pci_dev, PCI_COMMAND, cmd);

    return ret;
}

/* find nvidia devices and set initial state */
static int
@@ -434,6 +508,12 @@ nv_pci_probe
        goto failed;
    }

    if (nv_resize_pcie_bars(pci_dev)) {
        nv_printf(NV_DBG_ERRORS,
            "NVRM: Fatal Error while attempting to resize PCIe BARs.\n");
        goto failed;
    }

    NV_KMALLOC(nvl, sizeof(nv_linux_state_t));
    if (nvl == NULL)
    {