-
Notifications
You must be signed in to change notification settings - Fork 584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework Python driver/device creation. #9330
Conversation
6a2ca2a
to
a55003d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me noodle on this for a minute; my goal this week is to fix up driver/device stuff, and in particular the issue today is that the way the drivers are setup is that it's extremely expensive to create them (involves loading vulkan/cuda/creating thread pools/etc). As a result the query devices logic is not cheap - the cost of querying whether there's a vulkan device is nearly as expensive as creating the device, and we want to be positive it's never on a code path beyond a --list-devices flag. Loading cuda/vulkan/moltenvk/metal/etc into a process is quite destructive.
Basically, a user needs to treat drivers and devices as a tiered step and not as a flattened space: so iree.runtime.get_device_by_name is no good unless there was some spooky global iree.runtime.make_devices_available(driver_name) or something that populated the list. I think it'd be better to just have them be a tier (list drivers, then get a driver and list devices in that driver). It's too big of a footgun to have a "get me a device from some driver" API unless it's explicitly "get me any device from a particular driver" as that doesn't require creating and querying first.
The other big thing I need to fix up related to this is that vmvx/dylib are not drivers - they are just feature capabilities of a single driver (local w/ threading, and vmvx-sync/dylib-sync are local w/o threading). I'm still trying to figure out how to invert that properly and select it. In non-test cases things are easy (choose whether you want dylib/vmvx with iree-compile flags and then at runtime choose whether you want threaded/non-threaded) but in a module compiled with multiple formats it's useful to be able to force one or the other - that may need an additional configuration mechanism (+vmvx,+embedded-elf-x64-avx512
etc).
I had some notes from awhile back and will see if I can find them and then start a sketch or two.
runtime/bindings/python/iree/runtime/scripts/iree_devices/__main__.py
Outdated
Show resolved
Hide resolved
I'd be happy if the lower level API evolved as you say. Let me know. I can sit on this patch for a little while, but the features here are reasonably important. Lmk if you end up with any C API sketches to adapt this towards. |
👍 my goal is to get to #9343 this week, so I don't think it needs to sit long, and once I have a draft of that C API we can chat about how it lines up - I think it's not too disruptive to what you have here |
Ok I think the bulk of the changes landed; #9443 has what all the native tools now do with example output in https://gist.github.com/benvanik/059c5773068b114ea393bf5b95d791c2. The new device URI stuff should replace the need for duplicated python string manipulation and the goal is that they are opaque from the perspective of API layers - you can pass in The APIs didn't change much but the driver registry/driver interface grew some new functions and helpers. There's some new flag helpers under Everything is setup now for multiple devices, though I need to work on #5724 to get the right APIs in-place for passing that down to the HAL. For now anywhere you'd have one device (like Config constructor) taking in a list of devices would be better even if for now it just asserts there's only 1 in the list - that'll make it easier for me to come in and plumb that through. I did the bare minimum to get things rearranged and working for command line tools/C API - happy to add more/expose things/etc that make python better - just LMK! |
…pipe hidden. This updates enumeration and selection by index to respect whether a device is hidden, based on some characteristics. Since we've had continuous problems with lavapipe as a compute device (and since, when accidentally using since it is often installed by default, it spews stderr warnings about only being for testing), I opted to make this the first heuristic for hiding a device. Broken out of iree-org#9330.
…pipe hidden. (#9621) This updates enumeration and selection by index to respect whether a device is hidden, based on some characteristics. Since we've had continuous problems with lavapipe as a compute device (and since, when accidentally using since it is often installed by default, it spews stderr warnings about only being for testing), I opted to make this the first heuristic for hiding a device. Broken out of #9330.
APIs removed: * HalDriver.create() (use iree.runtime.get_driver(driver_name) to get a cached instance). * Environment variable IREE_DEFAULT_DRIVER renamed to IREE_DEFAULT_DEVICE to better reflect the new syntax. * Config.driver attribute (no longer captured by this class) APIs added: * iree.runtime.query_available_drivers() (alias of HalDriver.query()) * iree.runtime.get_driver(driver_name) * iree.runtime.get_device_by_name(name_spec) * iree.runtime.get_first_device_by_name(name_specs) * iree.runtime.Config(, device: HalDevice) (to configure with an explicit device) * HalDriver.create_device(device_id: Union[int, tuple]) * HalDriver.query_available_devices() Devices can now be queried/constructed explicitly using HalDriver.query_available_devices() and passing found device_ids to HalDriver.create_device. Default configuration is extended to take "name specs" instead of driver name. This can either be a raw driver name (i.e. "vmvx") but can also be driver:index (i.e. "vulkan:3"). Some logging is added to make it clearer what was selected. Devices created in this way are cached now, since this facility is meant to be used for trivial/default configuration. If explicitly creating devices, the user is on their own to cache as desired. Fixes iree-org#9277
66341e9
to
0f9c510
Compare
APIs removed:
cached instance).
IREE_DEFAULT_DEVICE to better reflect the new syntax.
APIs added:
explicit device)
Both driver and device lookup is done by a device URI, as defined by the runtime (when creating a driver, only the 'scheme' is used). Driver instances are cached by name in the native code, which should avoid various bad behavior in terms of driver lifetimes and lack of care to process state. Devices are optionally (default True) cached at the Python level.
Fixes #9277
Expected to fix #9936