web.eecs.umich.edu
[PDF] Vortex: Overcoming Memory Capacity Limitations in GPU ...
Excerpt
is not without drawbacks. It necessitates the acceptance of bundled GPU compute resources when the primary issue is memory capacity. Such an inflexible strategy may lead to resource underutilization and consequently increase overall costs. Additionally, there is a limit to the number of GPU cards a single node can sup- … unintentionally blocked by the runtime in our baseline solution. Challenge #2. Non-uniform IO bandwidth. When both directions of all PCIe links are used simultaneously for data transfer, the combined IO bandwidth requirement exceeds the CPU-side memory controller’s capacity, resulting in some PCIe links not achieving
Related Pain Points
Non-uniform PCIe bandwidth bottlenecks in multi-GPU systems
When all PCIe links in a multi-GPU system carry data in both directions at once, the combined bandwidth requirement exceeds the capacity of the CPU-side memory controller, so some PCIe links fall short of their target throughput and overall performance degrades.
GPU memory underutilization from inflexible resource bundling
Cloud GPU offerings bundle compute with memory in fixed ratios, forcing organizations to buy extra compute capacity when their primary constraint is memory. This inflexible strategy can lead to resource underutilization and higher overall costs.
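
The non-uniform PCIe bandwidth pain point above comes down to simple arithmetic: aggregate link demand compared with what the CPU-side memory controller can sustain. The sketch below is a rough illustration only. The function names are hypothetical, and the link count, per-direction bandwidth, and memory controller capacity are made-up numbers, not figures from the paper. It also says nothing about which links lose out, only how large the overall deficit is.

```python
def aggregate_demand(num_links, per_direction_gbs, bidirectional=True):
    """Total bandwidth requested by all PCIe links, in GB/s.

    Assumes every link asks for the same per-direction bandwidth.
    """
    directions = 2 if bidirectional else 1
    return num_links * directions * per_direction_gbs


def shortfall(num_links, per_direction_gbs, mem_ctrl_gbs, bidirectional=True):
    """Bandwidth (GB/s) that the memory controller cannot supply.

    Returns 0.0 when the controller can satisfy the full demand.
    """
    demand = aggregate_demand(num_links, per_direction_gbs, bidirectional)
    return max(0.0, demand - mem_ctrl_gbs)


if __name__ == "__main__":
    # Hypothetical system: 4 links, 25 GB/s per direction,
    # 150 GB/s of sustainable memory controller bandwidth.
    links, per_dir, ctrl = 4, 25.0, 150.0

    for bidir in (False, True):
        demand = aggregate_demand(links, per_dir, bidir)
        deficit = shortfall(links, per_dir, ctrl, bidir)
        mode = "bidirectional" if bidir else "unidirectional"
        print(f"{mode}: demand={demand} GB/s, shortfall={deficit} GB/s")
```

With these assumed numbers, one-directional traffic fits within the controller's capacity, while using both directions of every link at once does not, which is the situation the excerpt describes.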