AI Workloads
Training • Inference • Batch • Pipelines
HAMi enables sharing, isolation and scheduling for GPU/NPU/MLU resources so mixed accelerators run efficiently on one platform.
Virtualization • Sharing • Isolation • Scheduling
GPU • NPU • MLU • DCU
The project is developed in the open under CNCF governance, with contributions from a growing global community of companies and individuals.
Use one Kubernetes-native workflow to schedule GPU, NPU, MLU and other AI accelerators.
Allocate memory/core slices precisely for training and inference jobs in mixed workloads, with hard isolation enforced at runtime.
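As a sketch of what such a sliced request can look like (resource names follow HAMi's documented NVIDIA example; exact names and units depend on your device-plugin configuration, and the Pod name here is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-slice-demo          # hypothetical name
spec:
  containers:
    - name: worker
      image: ubuntu:22.04
      resources:
        limits:
          nvidia.com/gpu: 1        # number of virtual GPUs
          nvidia.com/gpumem: 3000  # device-memory slice, in MiB
          nvidia.com/gpucores: 30  # approximate share of compute cores, in percent
```

HAMi then enforces these limits inside the container at runtime, which is the hard isolation referred to above.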
Supports binpack, spread, node-topology-aware, and task-topology-aware scheduling policies to optimize resource utilization and placement.
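Per-task policy selection is typically expressed through Pod annotations. A minimal sketch, assuming annotation keys as described in HAMi's scheduler-policy documentation (verify the exact keys and accepted values against your release):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: policy-demo             # hypothetical name
  annotations:
    hami.io/node-scheduler-policy: "binpack"  # consolidate Pods onto busier nodes
    hami.io/gpu-scheduler-policy: "spread"    # spread slices across GPUs within a node
spec:
  containers:
    - name: worker
      image: ubuntu:22.04
      resources:
        limits:
          nvidia.com/gpu: 1
```

Binpack favors utilization by filling devices before opening new ones; spread trades some packing efficiency for better fault and contention isolation.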
Build on standard interfaces to avoid lock-in and simplify long-term platform evolution.
Zero-change adoption path with Kubernetes-compatible APIs and deployment model.
Community-driven governance and hardware ecosystem support for diverse environments.
Control memory/core usage to improve fairness, reliability and utilization.
Provide consistent metrics and operational visibility across device vendors.
HAMi works through two core paths: GPU virtualization/slicing, and heterogeneous scheduling that carries a request from submission to isolated execution.
View full architecture docs →
The same Pod requests (nvidia.com/gpu plus gpumem/gpucores) enter whole-GPU allocation on the left and HAMi slicing on the right, showing how scheduling semantics change placement.
Each Pod is scheduled with whole-GPU semantics, so the unused portion of that card cannot be shared with another Pod.
Pod C claims an entire GPU, so the remaining capacity on that card becomes stranded.
The same Pod requests are sliced first, then placed by policy to pack, spread, or respect topology locality.
Pod C is sliced and packed onto the most loaded compatible GPU.
Broad accelerator ecosystem across vendors. See docs for full support matrix.
View full supported devices list →
The organizations below are evaluating or using HAMi in production environments.
Submit your organization through the contributor guide process.
See submission instructions →