Project Details
Implicit Bias in Adversarial Training
Subject Area
Mathematics
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 464121491
Despite all the successes of deep learning, trained deep neural networks can be very sensitive to small adversarial perturbations, which makes deep learning methods vulnerable to attacks. Several approaches to make neural networks more robust have been proposed; a particularly promising one is adversarial training, where worst-case perturbations are taken into account during the learning stage. The proposed project aims at investigating such strategies in several contexts. We will put a significant emphasis on neural ODEs, which may be seen as an infinite-depth limit of standard neural networks. Through this lens, we will scrutinize adversarial training by framing it as a mean-field minimax optimal control problem.

Due to the current trend towards ever larger machine learning models, the number of neural network parameters usually exceeds the number of training samples by far. In this scenario the empirical loss function possesses infinitely many global minimizers, and the employed learning algorithms, variants of (stochastic) gradient descent, impose an implicit bias on the computed solution. Surprisingly, common learning algorithms show very good generalization to unseen data. Initial theoretical results on simplified (linear) network models indicate an implicit bias towards sparse or low-rank solutions. For adversarial training algorithms, however, theoretical (and to a large extent numerical) results are not yet available, and we plan to close this gap within this project.

A recent approach to improving the generalization properties of learned networks is sharpness-aware minimization, which aims at favoring flat minimizers of the loss function. We will explore the implicit bias of this approach for several network architectures including neural ODEs, and will investigate combinations with adversarial training. Moreover, we will also address transformer architectures, for which implicit bias and adversarial training are widely open as well.
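To illustrate the minimax structure of adversarial training mentioned above, the following minimal sketch performs one FGSM-style training step for logistic regression: the inner maximization picks the worst-case perturbation in an L-infinity ball around each input, and the outer minimization takes a gradient step on the perturbed loss. All names and the choice of model are illustrative assumptions, not part of the project itself.

```python
import numpy as np

def adversarial_training_step(w, X, y, eps=0.1, lr=0.01):
    """One FGSM-style adversarial training step for logistic regression.

    Inner max: perturb each input by eps * sign(dloss/dx), the worst case
    in the L-infinity ball of radius eps around x (for a first-order model).
    Outer min: a gradient step on the loss at the perturbed inputs.
    """
    p = 1.0 / (1.0 + np.exp(-X @ w))        # sigmoid predictions
    g_x = (p - y)[:, None] * w[None, :]     # dloss/dx for each sample
    X_adv = X + eps * np.sign(g_x)          # worst-case input perturbation
    p_adv = 1.0 / (1.0 + np.exp(-X_adv @ w))
    grad_w = X_adv.T @ (p_adv - y) / len(y) # gradient of the adversarial loss
    return w - lr * grad_w
```

Full PGD-based adversarial training iterates the inner perturbation step several times; the one-step version above only conveys the min-max structure.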
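The infinite-depth view of neural ODEs can be made concrete in a few lines: a residual network with shared weights and step size 1/L is exactly a forward-Euler discretization of an ODE, and its output stabilizes as the number of layers L grows. The tanh vector field below is a simplifying assumption for illustration.

```python
import numpy as np

def resnet_forward(x, W, L):
    """Residual network with L layers, shared weight W and step size 1/L:
        x_{k+1} = x_k + (1/L) * tanh(W @ x_k),
    i.e. forward Euler for the neural ODE dx/dt = tanh(W x(t)) on [0, 1].
    As L grows, the output converges to the ODE solution at time t = 1."""
    for _ in range(L):
        x = x + (1.0 / L) * np.tanh(W @ x)
    return x
```

Comparing the output for increasing L (e.g. L = 1000 versus L = 4000) shows the depth limit numerically: the outputs agree up to the O(1/L) Euler discretization error.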
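The implicit-bias phenomenon is easiest to see in the simplest overparameterized model: underdetermined least squares has infinitely many interpolating solutions, yet gradient descent started at zero converges to exactly one of them, the minimum-l2-norm solution. This is a textbook special case, sketched here under that assumption; the linear and nonlinear network settings studied in the project are substantially harder.

```python
import numpy as np

def gd_least_squares(X, y, lr=0.01, steps=20000):
    """Gradient descent on the loss ||X w - y||^2 / 2, started at zero.

    With more parameters than samples (X has shape (n, d), d > n) there are
    infinitely many global minimizers. From zero initialization the iterates
    stay in the row space of X, so GD implicitly selects the interpolating
    solution of minimum Euclidean norm, X^T (X X^T)^{-1} y."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y)
    return w
```

Sparse or low-rank implicit biases, as in the linear-network results cited above, arise analogously from other parameterizations of the same loss.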
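Sharpness-aware minimization can likewise be sketched in a few lines: each step first ascends to the worst nearby point within a radius rho, then descends using the gradient taken there, which penalizes sharp minima. This is a generic one-step SAM sketch; the step sizes and the abstract grad_fn interface are illustrative assumptions.

```python
import numpy as np

def sam_step(w, grad_fn, rho=0.05, lr=0.1):
    """One sharpness-aware minimization (SAM) step.

    Inner max (linearized): move distance rho along the gradient direction
    to the locally worst point. Outer min: gradient step at that point."""
    g = grad_fn(w)
    e = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction, norm rho
    return w - lr * grad_fn(w + e)             # descend with the sharp gradient
```

On a flat minimum the perturbed gradient is close to the plain gradient, so SAM behaves like gradient descent; near sharp minima the perturbed gradient is larger, pushing the iterates away.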
DFG Programme
Priority Programmes
Subproject of
SPP 2298: Theoretical Foundations of Deep Learning