torch.compile caching is slow and incomplete, causing long warm-up times
6/10 Medium
Multiple gaps in PyTorch's compilation caching pipeline (slow Triton cache artifact loading, excessive small network requests for remote caches with many small graphs, and an incomplete AOTAutograd cache rollout) collectively add significant overhead even on warm-cache runs.
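For readers who want to check which caches are active in their own setup, the sketch below opts in to the local Inductor caches through environment variables. It assumes the variables `TORCHINDUCTOR_FX_GRAPH_CACHE`, `TORCHINDUCTOR_AUTOGRAD_CACHE`, and `TORCHINDUCTOR_CACHE_DIR` described in PyTorch's compile-time caching documentation. Defaults and availability differ between PyTorch versions, and the cache directory name used here is purely illustrative.

```python
import os
import tempfile

# Illustrative cache location; any writable directory would do.
cache_dir = os.path.join(tempfile.gettempdir(), "inductor_cache_example")
os.makedirs(cache_dir, exist_ok=True)

# Set these before importing torch so the config picks them up.
# Assumption: the installed PyTorch version honors these variables.
os.environ["TORCHINDUCTOR_FX_GRAPH_CACHE"] = "1"   # Inductor FX graph cache
os.environ["TORCHINDUCTOR_AUTOGRAD_CACHE"] = "1"   # AOTAutograd cache (opt-in where not default)
os.environ["TORCHINDUCTOR_CACHE_DIR"] = cache_dir  # where on-disk artifacts are stored

# import torch  # import only after the environment is configured
```

Enabling these caches does not remove the overheads described above; it only ensures the caches that exist are actually in use when warm-up time is measured.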
Collection History
Query: “What are the most common pain points with PyTorch for developers in 2025?”
4/4/2026
loading Triton cache artifacts takes a long time because we still re-parse the Triton code before doing a cache lookup... if you have a lot of small graphs, remote cache ends up having to do lots of small network requests, instead of one batched network request... AOTAutograd cache is not fully rolled out yet.
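The remote-cache complaint in the excerpt above comes down to round trips: one request per small graph versus one request for all of them. The sketch below illustrates that difference with a hypothetical in-memory stand-in. `FakeRemoteCache`, `get`, `get_many`, and the key names are invented for illustration and are not PyTorch APIs.

```python
from typing import Dict, Iterable, Optional


class FakeRemoteCache:
    """Hypothetical stand-in for a remote cache that counts round trips."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self._store = dict(store)
        self.round_trips = 0

    def get(self, key: str) -> Optional[bytes]:
        # One network request per key.
        self.round_trips += 1
        return self._store.get(key)

    def get_many(self, keys: Iterable[str]) -> Dict[str, Optional[bytes]]:
        # One network request for the whole batch of keys.
        self.round_trips += 1
        return {k: self._store.get(k) for k in keys}


def lookup_one_by_one(cache: FakeRemoteCache, keys: Iterable[str]) -> Dict[str, Optional[bytes]]:
    return {k: cache.get(k) for k in keys}


def lookup_batched(cache: FakeRemoteCache, keys: Iterable[str]) -> Dict[str, Optional[bytes]]:
    return cache.get_many(list(keys))


# 100 small graphs, each with its own cache entry.
keys = [f"graph_{i}" for i in range(100)]
store = {k: f"artifact_{i}".encode() for i, k in enumerate(keys)}

naive = FakeRemoteCache(store)
batched = FakeRemoteCache(store)
naive_result = lookup_one_by_one(naive, keys)
batched_result = lookup_batched(batched, keys)

print(naive.round_trips, batched.round_trips)
```

Both strategies return the same artifacts. The per-key strategy pays a fixed network latency once per graph, which is why many small graphs make remote-cache lookups disproportionately expensive.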
Created: 4/4/2026
Updated: 4/4/2026