Add mig support in specs #288

Merged (3 commits) on Apr 9, 2024
1 change: 1 addition & 0 deletions aws/infrastructure.tf
@@ -207,6 +207,7 @@ locals {
cpus = data.aws_ec2_instance_type.instance_type[values.prefix].default_vcpus
ram = data.aws_ec2_instance_type.instance_type[values.prefix].memory_size
gpus = try(one(data.aws_ec2_instance_type.instance_type[values.prefix].gpus).count, 0)
+ mig = lookup(values, "mig", null)
}
}
}
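
Note: the added line relies on Terraform's built-in `lookup(map, key, default)` function, so instance definitions without a `mig` key keep working and simply yield `null`. A minimal standalone sketch of that behaviour follows; the local names `with_mig` and `without_mig` and their contents are hypothetical and are not part of this PR.

```hcl
# Standalone sketch; `with_mig` and `without_mig` are hypothetical instance definitions.
locals {
  with_mig    = { type = "gpu", mig = { "1g.5gb" = 7 } }
  without_mig = { type = "cpu" }

  # lookup() returns the value stored at "mig" when the key exists...
  mig_defined = lookup(local.with_mig, "mig", null)
  # ...and falls back to the default (null here) when the key is absent.
  mig_missing = lookup(local.without_mig, "mig", null)
}

output "mig_defined" {
  value = local.mig_defined
}

output "mig_is_unset" {
  value = local.mig_missing == null
}
```

Using `null` as the default lets downstream consumers of the specs distinguish "no MIG partitioning requested" from an empty map.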
1 change: 1 addition & 0 deletions azure/infrastructure.tf
@@ -170,6 +170,7 @@ locals {
cpus = local.vmsizes[values.type].vcpus
ram = local.vmsizes[values.type].ram
gpus = local.vmsizes[values.type].gpus
+ mig = lookup(values, "mig", null)
}
}
}
32 changes: 18 additions & 14 deletions docs/README.md
@@ -546,19 +546,23 @@ Optional attributes can be defined:
2. `image`: specification of the image to use for this instance type. (default: global [`image`](#46-image) value).
Refer to section [10.12 - Create a compute node image](#1012-create-a-compute-node-image) to learn how this attribute can
be leveraged to accelerate compute node configuration.
- 3. `disk_size`: size in gibibytes (GiB) of the instance's root disk containing
+ 3. `disk_type`: type of the instance's root disk (default: see the next table).
+ | Provider | `disk_type` | `disk_size` (GiB) |
+ | -------- | :---------- | ----------------: |
+ | Azure |`Premium_LRS`| 30 |
+ | AWS | `gp2` | 10 |
+ | GCP | `pd-ssd` | 20 |
+ | OpenStack| `null` | 10 |
+ | OVH | `null` | 10 |
+ 4. `disk_size`: size in gibibytes (GiB) of the instance's root disk containing
  the operating system and service software
- (default: see the next table).
- 4. `disk_type`: type of the instance's root disk (default: see the next table).
-
- Default root disk's attribute value per provider:
- | Provider | `disk_type` | `disk_size` (GiB) |
- | -------- | :---------- | ----------------: |
- | Azure |`Premium_LRS`| 30 |
- | AWS | `gp2` | 10 |
- | GCP | `pd-ssd` | 20 |
- | OpenStack| `null` | 10 |
- | OVH | `null` | 10 |
+ (default: see the previous table).
+ 5. `mig`: map of [NVIDIA Multi-Instance GPU (MIG)](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html) short profile names and count used to partition the instances' GPU, example for an A100:
+ ```
+ mig = { "1g.5gb" = 2, "2g.10gb" = 1, "3g.20gb" = 1 }
+ ```
+ This is only functional with [MIG supported GPUs](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#supported-gpus),
+ and with x86-64 processors (see [NVIDIA/mig-parted issue #30](https://github.com/NVIDIA/mig-parted/issues/30)).
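
Note: a minimal sketch of how the new `mig` attribute might sit next to the other per-instance attributes, written as a `.tfvars`-style fragment. It assumes an `instances` map keyed by instance name; the name `gpu-node`, the `type` value, the tags, and the count are hypothetical placeholders. Only the `mig` map is taken from the documentation added in this PR.

```hcl
# Hypothetical excerpt of a cluster definition; only `mig` comes from the docs above.
instances = {
  # "gpu-node", its type, tags and count are placeholder values (assumptions).
  gpu-node = {
    type  = "a100-instance-type" # replace with a provider flavor exposing an A100
    tags  = ["node"]
    count = 1
    # Partition each GPU into four MIG instances:
    # two 1g.5gb, one 2g.10gb and one 3g.20gb
    # (2*1 + 2 + 3 = 7 compute slices, 2*5 + 10 + 20 = 40 GB of memory).
    mig = { "1g.5gb" = 2, "2g.10gb" = 1, "3g.20gb" = 1 }
  }
}
```

Leaving `mig` out of an instance definition should be equivalent to not partitioning the GPU, since the specs default it to `null`.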

For some cloud providers, it is possible to define additional attributes.
The following sections present the available attributes per provider.
@@ -864,8 +868,8 @@ if previously instantiated.

**default_value** = `false`

- Determines whether the base image packages will be upgraded during the first boot or not. By default,
- all packages are upgraded. If `skip_upgrade` set to `true`, no package will be upgraded on first boot.
+ If true, the base image packages will not be upgraded during the first boot. By default,
+ all packages are upgraded.

**Post build modification effect**: No effect on currently built instances. Ones created
after the modification will take into consideration the new value of the parameter to determine
1 change: 1 addition & 0 deletions gcp/infrastructure.tf
@@ -180,6 +180,7 @@ locals {
cpus = data.external.machine_type[values["prefix"]].result["vcpus"]
ram = data.external.machine_type[values["prefix"]].result["ram"]
gpus = try(data.external.machine_type[values["prefix"]].result["gpus"], lookup(values, "gpu_count", 0))
+ mig = lookup(values, "mig", null)
}
}
}
1 change: 1 addition & 0 deletions openstack/infrastructure.tf
@@ -137,6 +137,7 @@ locals {
parseint(lookup(data.openstack_compute_flavor_v2.flavors[values.prefix].extra_specs, "resources:VGPU", "0"), 10),
parseint(split(":", lookup(data.openstack_compute_flavor_v2.flavors[values.prefix].extra_specs, "pci_passthrough:alias", "gpu:0"))[1], 10)
])
+ mig = lookup(values, "mig", null)
}
}
}