MUFIN: Improving Neural Repair Models with Back-Translation


Automated program repair is the task of automatically repairing software bugs. A promising direction in this field is self-supervised learning, a learning paradigm in which repair models are trained without commits representing pairs of bug/fix. In self-supervised neural program repair, those bug/fix pairs are generated in some ways. The main problem is to generate interesting and diverse pairs that maximize the effectiveness of training. As a contribution to this problem, we propose to use back-translation, a technique coming from neural machine translation. We devise and implement MUFIN, a back-translation training technique for program repair, with specifically designed code critics to select high-quality training samples. Our results show that MUFIN’s back-translation loop generates valuable training samples in a fully automated, selfsupervised manner, generating more than half-a-million pairs of bug/fix. The code critic design is key because of a fundamental trade-off between how restrictive a critic is and how many samples are available for optimization during back-translation.

Preprint under submission

Computer Scientist

My research interests include software reliability, software verification, and formal methods applied to software engineering. I am also interested in interactive storytelling. For more details, see some of my projects or my selected (or recent) publications. More posts are available in my blog. Follow me on Twitter or add me on LinkedIn.