Memory leaks and crashes in production
8/10 High
TensorFlow exhibits reliability issues, including memory leaks that impede development and crashes, especially with heavier architectures, that result in lost work and restart delays. These issues are particularly problematic in production environments.
Sources
Memory failures comprised 18% of incidents. Failures correlated strongly with workloads that had high parameter counts and large batch sizes. Associated effects included model accuracy degradation and increased training volatility.
Reliability issues. Although it may be tempting to keep working with the initial version of TensorFlow, that version may be less secure and less reliable. Memory leaks were reported in quite a few cases, and they significantly impeded the development process.
Crashes. Despite its speed and flexibility, TensorFlow is still prone to crashes, especially with heavier architectures.
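One way to check whether a training loop is leaking, before it reaches production, is to measure how much memory grows per step once warmup is over. The following is a minimal sketch using only Python's standard-library tracemalloc module. The names measure_growth, leaky_step, and clean_step are hypothetical and do not come from the sources above, and the step functions are stand-ins for a real training step. tracemalloc only sees allocations made through Python's allocator, so this would not catch native or GPU memory held by TensorFlow itself.

```python
import tracemalloc


def measure_growth(step_fn, steps=50, warmup=5):
    """Return average growth in traced Python memory (bytes) per step.

    Warmup steps run first so one-time allocations (caches, lazy
    initialization) are not counted as a leak.
    """
    tracemalloc.start()
    try:
        for _ in range(warmup):
            step_fn()
        baseline, _ = tracemalloc.get_traced_memory()
        for _ in range(steps):
            step_fn()
        current, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return (current - baseline) / steps


# Hypothetical stand-ins for a training step (assumptions for illustration).
_retained = []


def leaky_step():
    # Keeps a reference to every buffer, so memory grows on each call.
    _retained.append(bytearray(100_000))


def clean_step():
    # The buffer is released when the function returns.
    buf = bytearray(100_000)
    return len(buf)


if __name__ == "__main__":
    print("leaky step growth (bytes/step):", measure_growth(leaky_step))
    print("clean step growth (bytes/step):", measure_growth(clean_step))
```

A per-step growth that stays well above zero after warmup suggests that something is retaining references between steps. A value near zero does not rule out a leak in native code, which would need a process-level measurement instead.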