arxiv.org

[PDF] An Empirical Study on Bugs Inside PyTorch: A Replication ... - arXiv

Updated 1/31/2026

Excerpt

TABLE I: ROOT CAUSES OF BUGS IDENTIFIED IN PYTORCH.

Root Cause | Description | Freq.
Logic Error | Wrong programming logic | 25.77%
Inconsistency | Inconsistent changes in the API | 25.26%
Algorithm | Wrong implementation of algorithms | 12.37%
Corner case | Wrong handling of corner cases | 9.79%
Configuration error | Wrong configurations | 8.76%
Type confusion | Type mismatches | 8.25%
Memory | Incorrect usage of memory | 3.09%
Referenced type error | Incorrect import of libraries | 2.58%
Processing | Incorrect variable initialization or assignment | 2.06%

… were caused by inconsistencies in the APIs which demonstrate that PyTorch requires more time and development effort in order to be a truly reliable framework. In the following, we discuss the 11 categories for root causes of bugs in PyTorch from the 194 bugs analyzed.

1) Logic error (25.77%). The bugs in this category were … tation caused it to not copy part of the object (gradient buffer), causing users to experience undefined behavior errors.

2) Inconsistency (25.26%). The bugs in this category were caused by changing the APIs or updating the framework’s version which resulted in inconsistencies or incompati- … corner cases since most developers will not use PyTorch functions in such a way.

5) Configuration error (8.76%). The bugs in this category were caused by wrong configurations. For example, issue #22389 [36] reports a bug which caused the developers to be unable to use TensorBoard. This bug happened … variables being initialized or assigned incorrectly, using incorrect formats for variables, or other incorrect data processing related usages.

Concurrency (1.55%) and dimension mismatch (0.52%) type errors were caused by synchronization problems (such as issue #67626 [41]) and dimension mismatch during tensor computation and transformation operations (such as PR … libraries.
Inconsistencies are the second most important bug root cause in both libraries, where changes in the APIs caused breaking changes or incompatible behaviour in the library. Another common theme is the prevalence of type confusion bugs across both libraries, a common issue in dynamically typed languages such as Python and configuration errors, due to the … that we find a much higher occurrence of bugs caused by wrong implementation of algorithms (12% in PyTorch) than the figures reported in TensorFlow (3%).

Root Causes: PyTorch bugs are caused mainly by logic errors (25%), API inconsistency (25%), and wrong algorithm implementation (12%). Both PyTorch …

Build Failure | Program fails to compile | 11.34%
Warning-style error | Display of warning message | 8.25%
Hang | Program gets stuck mid-run | 0.53%

and SyncBatchNorm operations behaving incorrectly and causing the program to generate incorrect results.

3) Performance degradation (12.89%). This symptom in- … Torch reports more frequent performance degradation (13%). Warning-style errors are comparably similar, and bugs that cause the library to become unresponsive are rare.

Symptoms: Both PyTorch and TensorFlow report functional errors and program crashes as the most frequent bug symptoms. While PyTorch reports more frequent performance degradation, build failures
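
As a quick arithmetic check on the root-cause figures quoted in this excerpt, the sketch below converts each reported percentage into the bug count it would imply, assuming every percentage is a share of the 194 analyzed bugs. The per-category counts are inferred here and are not reported in the excerpt; the names ROOT_CAUSE_PERCENT and implied_counts are illustrative only.

```python
# Root-cause frequencies as quoted in the excerpt: the nine rows of Table I
# plus the two categories mentioned only in the running text.
ROOT_CAUSE_PERCENT = {
    "Logic error": 25.77,
    "Inconsistency": 25.26,
    "Algorithm": 12.37,
    "Corner case": 9.79,
    "Configuration error": 8.76,
    "Type confusion": 8.25,
    "Memory": 3.09,
    "Referenced type error": 2.58,
    "Processing": 2.06,
    "Concurrency": 1.55,
    "Dimension mismatch": 0.52,
}

TOTAL_BUGS = 194  # number of analyzed bugs stated in the excerpt


def implied_counts(percentages, total):
    """Return the whole-number bug count implied by each percentage.

    Assumption: each percentage is count / total * 100, rounded to two decimals.
    """
    return {name: round(pct * total / 100) for name, pct in percentages.items()}


counts = implied_counts(ROOT_CAUSE_PERCENT, TOTAL_BUGS)

for name, count in counts.items():
    print(f"{name}: {count} of {TOTAL_BUGS}")
print("sum of implied counts:", sum(counts.values()))
```

Under that assumption the eleven implied counts add up to 194, which is consistent with the excerpt's statement that 11 root-cause categories were derived from 194 bugs.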

Source URL

https://arxiv.org/pdf/2307.13777.pdf

Related Pain Points