Implementation:Mlc ai Mlc llm Auto Device
Overview
python/mlc_llm/support/auto_device.py provides automatic detection of locally available compute devices (GPUs and CPU) for the MLC LLM runtime. It probes the system for supported device types by spawning a subprocess to check device availability, caches the results, and returns an appropriate TVM Device object. This module is used whenever a user specifies "auto" as the device hint.
Location
- File: python/mlc_llm/support/auto_device.py
- Module: mlc_llm.support.auto_device
- Lines: 95
Module-Level Constants
FOUND = green("Found")
NOT_FOUND = red("Not found")
AUTO_DETECT_DEVICES = ["cuda", "rocm", "metal", "vulkan", "opencl", "cpu"]
_RESULT_CACHE: Dict[str, bool] = {}
- AUTO_DETECT_DEVICES: The ordered list of device types to probe during automatic detection. The order reflects priority: CUDA and ROCm (discrete GPUs) are checked first, followed by Metal and Vulkan, then OpenCL, and finally CPU as a fallback.
- _RESULT_CACHE: A module-level dictionary caching device existence results to avoid redundant subprocess calls within a session.
Functions
detect_device
def detect_device(device_hint: str) -> Optional[Device]:
The primary entry point for device detection.
When device_hint is "auto":
Iterates through AUTO_DETECT_DEVICES in order, probing each device type at index 0. Returns the first device found.
if device_hint == "auto":
    device = None
    for device_type in AUTO_DETECT_DEVICES:
        cur_device = tvm.device(device_type=device_type, index=0)
        if _device_exists(cur_device):
            if device is None:
                device = cur_device
    if device is None:
        logger.info("%s: No available device detected", NOT_FOUND)
        return None
    logger.info("Using device: %s", bold(device2str(device)))
    return device
Note that the loop does not stop at the first hit: every device type is still probed, so each one is logged and its result cached. Only the first device found, in priority order, is returned.
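The selection logic can be tried without TVM or a GPU. The following is a minimal sketch, not the module's code: pick_first_available and fake_probe are hypothetical names, device names are plain strings rather than TVM Device objects, and the probe is injected so that no subprocess is spawned.

```python
from typing import Callable, List, Optional

AUTO_DETECT_DEVICES = ["cuda", "rocm", "metal", "vulkan", "opencl", "cpu"]


def pick_first_available(probe: Callable[[str], bool]) -> Optional[str]:
    """Probe every device type in priority order and keep the first hit."""
    chosen: Optional[str] = None
    for device_type in AUTO_DETECT_DEVICES:
        name = f"{device_type}:0"
        # probe() is evaluated first, so every type is probed even after a hit.
        if probe(name) and chosen is None:
            chosen = name
    return chosen


probed: List[str] = []


def fake_probe(name: str) -> bool:
    """Hypothetical probe: pretend only Vulkan and CPU exist on this machine."""
    probed.append(name)
    return name in {"vulkan:0", "cpu:0"}


result = pick_first_available(fake_probe)
print(result, len(probed))
```

With this fake probe, Vulkan is selected because it precedes CPU in the priority list, and all six types are probed.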
When device_hint is a specific device:
Creates a TVM device from the hint string and validates its existence. Raises ValueError if the device name is invalid or the device is not found locally.
try:
    device = tvm.device(device_hint)
except Exception as err:
    raise ValueError(f"Invalid device name: {device_hint}") from err
if not _device_exists(device):
    raise ValueError(f"Device is not found on your local environment: {device_hint}")
return device
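The two failure modes can be told apart by the error message. The sketch below imitates that behavior under stated assumptions: resolve_hint, try_resolve, and KNOWN_DEVICE_TYPES are hypothetical stand-ins, the "type" or "type:index" parsing rule is assumed rather than taken from tvm.device(), and availability is a plain set instead of a subprocess probe.

```python
from typing import Set

# Hypothetical stand-in for the device names that tvm.device() accepts.
KNOWN_DEVICE_TYPES = {"cuda", "rocm", "metal", "vulkan", "opencl", "cpu"}


def resolve_hint(device_hint: str, available: Set[str]) -> str:
    """Return 'type:index' for a valid, available hint; raise ValueError otherwise."""
    device_type, _, index = device_hint.partition(":")
    # Assumed parsing rule: a known type, optionally followed by ':<digits>'.
    if device_type not in KNOWN_DEVICE_TYPES or (index and not index.isdigit()):
        raise ValueError(f"Invalid device name: {device_hint}")
    name = f"{device_type}:{index or '0'}"
    if name not in available:
        raise ValueError(f"Device is not found on your local environment: {device_hint}")
    return name


def try_resolve(device_hint: str, available: Set[str]) -> str:
    """Caller-side handling: turn either failure into a printable message."""
    try:
        return resolve_hint(device_hint, available)
    except ValueError as err:
        return f"error: {err}"


local = {"cuda:0", "cpu:0"}
ok = try_resolve("cuda", local)
print(ok)
print(try_resolve("cuda:1", local))
print(try_resolve("tpu", local))
```

A caller of the real detect_device would wrap the call in the same kind of try/except ValueError block.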
device2str
def device2str(device: Device) -> str:
Converts a TVM Device object to a human-readable string in the format "device_type:index" (e.g., "cuda:0").
Implementation:
return f"{tvm.runtime.Device._DEVICE_TYPE_TO_NAME[device.dlpack_device_type()]}:{device.index}"
Uses TVM's internal _DEVICE_TYPE_TO_NAME mapping to convert the DLPack device type enum to a string name.
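To illustrate the lookup without importing TVM, here is a small sketch. DEVICE_TYPE_TO_NAME, FakeDevice, and device_to_str are hypothetical stand-ins; the integer codes are a subset of the DLPack DLDeviceType enum, and the real table in TVM contains more entries.

```python
from dataclasses import dataclass

# Stand-in for TVM's internal table; codes follow the DLPack DLDeviceType enum (subset).
DEVICE_TYPE_TO_NAME = {
    1: "cpu",
    2: "cuda",
    4: "opencl",
    7: "vulkan",
    8: "metal",
    10: "rocm",
}


@dataclass(frozen=True)
class FakeDevice:
    """Hypothetical replacement for a TVM Device: a type code plus an index."""

    device_type: int
    index: int


def device_to_str(device: FakeDevice) -> str:
    """Format a device as 'device_type:index'."""
    return f"{DEVICE_TYPE_TO_NAME[device.device_type]}:{device.index}"


label = device_to_str(FakeDevice(device_type=2, index=0))
print(label)
```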
_device_exists (Private)
def _device_exists(device: Device) -> bool:
Checks whether a specific device exists on the local machine by spawning a subprocess.
Process:
- Checks the _RESULT_CACHE for a cached result.
- If not cached, runs a subprocess command:
cmd = [sys.executable, "-m", "mlc_llm.cli.check_device", device_type]
The subprocess is run via subprocess.run with capture_output=True and the current environment.
Output parsing:
The subprocess output is expected to contain lines prefixed with "check_device:". The function extracts the content after this prefix, which is a comma-separated list of available device indices.
prefix = "check_device:"
subproc_outputs = [
    line[len(prefix):].strip()
    for line in subprocess.run(cmd, capture_output=True, text=True, check=False, env=os.environ)
    .stdout.strip().splitlines()
    if line.startswith(prefix)
]
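The parsing step can be exercised on a canned string. This is a sketch under assumptions: parse_check_device_output is a hypothetical helper, the sample text is invented for illustration, indices are assumed to be integers, and all prefixed lines are merged into one list, which may differ from how the module treats multiple lines.

```python
from typing import List

PREFIX = "check_device:"


def parse_check_device_output(stdout: str) -> List[int]:
    """Collect device indices from lines that start with the expected prefix."""
    indices: List[int] = []
    for line in stdout.strip().splitlines():
        if not line.startswith(PREFIX):
            continue  # ignore unrelated output such as driver warnings
        payload = line[len(PREFIX):].strip()
        # An empty payload means the device type exists in TVM but no device was found.
        indices.extend(int(token) for token in payload.split(",") if token.strip())
    return indices


# Invented sample output: one noise line, then the prefixed result line.
sample = "some driver warning\ncheck_device:0,1\n"
parsed = parse_check_device_output(sample)
print(parsed)
```

Filtering on the prefix keeps unrelated lines printed by drivers or libraries out of the result.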
Cache population:
For each discovered device index, the result is cached as True in _RESULT_CACHE with the key "device_type:index". For CPU devices (kDLCPU), only the first index is cached (via break).
Error handling:
If no "check_device:" output lines are found, an error is logged asking the user to report the issue with the subprocess command.
If the device string is still not in the cache after processing, it is cached as False.
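The caching rules above can be sketched as follows. record_probe is a hypothetical helper, the cache is passed in explicitly instead of being a module-level global, and the CPU check compares a string rather than the kDLCPU constant.

```python
from typing import Dict, List


def record_probe(cache: Dict[str, bool], device_type: str, index: int, found: List[int]) -> bool:
    """Cache every discovered index as True, then resolve the requested device."""
    for i in found:
        cache[f"{device_type}:{i}"] = True
        if device_type == "cpu":
            break  # only the first CPU index is cached
    key = f"{device_type}:{index}"
    if key not in cache:
        cache[key] = False  # probed but absent: remember the negative result too
    return cache[key]


cache: Dict[str, bool] = {}
hit = record_probe(cache, "cuda", 0, [0, 1])
miss = record_probe(cache, "rocm", 0, [])
print(hit, miss, cache)
```

Caching negative results matters as much as caching positive ones, because a missing backend would otherwise trigger a new subprocess on every lookup.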
Detection Flow
The overall device detection flow is:
- User provides "auto" or a specific device string.
- For "auto", iterate through ["cuda", "rocm", "metal", "vulkan", "opencl", "cpu"].
- For each device type, spawn python -m mlc_llm.cli.check_device <type>.
- Parse output for available device indices.
- Cache results and return the first available device.
Dependencies
- tvm: For tvm.device() creation and Device type constants.
- tvm_ffi.DLDeviceType: For the kDLCPU constant used in CPU-specific logic.
- subprocess: For spawning the device check process.
- os: For passing the current environment to the subprocess.
- sys: For getting the current Python executable path.
- mlc_llm.support.logging: Custom logging.
- mlc_llm.support.style: For styled terminal output (bold, green, red).
Design Notes
- Device detection is performed in a separate subprocess via mlc_llm.cli.check_device to isolate potential crashes or GPU driver issues from the main process.
- The module-level _RESULT_CACHE ensures that a device whose result is already cached is not probed again during the process lifetime, avoiding expensive repeated subprocess calls.
- The detection priority order places GPU backends first and CPU last, ensuring GPU acceleration is preferred when available.
- The CPU special case (breaking after the first index) avoids unnecessary enumeration, since CPU is not typically multi-indexed in the same way as GPUs.
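The isolation argument can be demonstrated with the standard library alone. In this sketch the child is an inline script, not mlc_llm.cli.check_device, and it simulates a failing probe by exiting abruptly with a nonzero code after printing a result line; a real driver crash would more likely end the child with a signal.

```python
import subprocess
import sys

# Inline child script: prints a result line, then dies without normal cleanup.
child_code = "import os; print('check_device:0', flush=True); os._exit(3)"

proc = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=False,  # a nonzero exit must not raise in the parent
)

# The parent is unaffected and can still inspect what the child reported.
print(proc.returncode, proc.stdout.strip())
```

Because check=False is passed, the abnormal exit surfaces only as a return code, and the parent continues to the next device type.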