As the scale of deep learning models continues to expand, distributed training has become an inevitable trend, which is not only time-consuming but also expensive. Therefore, effective debugging and ...