PRIN Grant 2022E3WYTY (2023-2025) "SELF-MADE - SELF-supervised learning: a Model-based Analysis to Decode its Effectiveness"

ABSTRACT

The first wave of successes with deep learning was obtained using supervised learning (SL), where neural networks are trained to perform a task using carefully labelled examples. This approach yields specialised models that perform extremely well on that task, yet it has several shortcomings: the networks easily fail outside of the controlled environment in which they were trained; they are unable to acquire new skills without forgetting what they learnt previously; and the approach requires large amounts of labelled data, which can be prohibitively expensive. Self-supervised learning (SSL) is an alternative approach in which a network is first pretrained by predicting a hidden part or property of an input given the rest of that input, for example by predicting a hidden word in a sentence. The pretrained model can then be successfully trained on the task of interest with only a few labelled examples. While SSL now achieves state-of-the-art results in many applications, we lack any understanding of what networks learn with SSL and how they do it. The key technical difficulties are the non-linearities of the neural networks and the need to go beyond simplistic data models in which the data has no structure that SSL could discover. The goals of this project are to develop a theory of SSL and to explore alternative algorithmic approaches, building on recent advances in analysing non-linear neural networks trained on structured datasets, to which the authors have made significant contributions.
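
To make the pretraining objective described above concrete, the following is a minimal sketch of masked-input prediction as a self-supervised objective, written in PyTorch on synthetic data. The architecture, dimensions, masking rate, and the toy data are illustrative assumptions, not the project's method.

```python
# Minimal sketch (assumptions only): pretrain a network by predicting the
# hidden part of an input from the visible part, as described in the abstract.
import torch
import torch.nn as nn

torch.manual_seed(0)

d, n = 32, 1024                     # input dimension, number of unlabelled examples
x = torch.randn(n, d)               # stand-in for unlabelled data

encoder = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Linear(16, d)          # reconstructs the input; loss taken on the hidden part
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(200):
    mask = torch.rand(n, d) < 0.25          # hide ~25% of each input's coordinates
    visible = x.masked_fill(mask, 0.0)      # the network only sees the unmasked part
    pred = decoder(encoder(visible))
    loss = ((pred - x)[mask] ** 2).mean()   # score predictions only on the hidden part
    opt.zero_grad()
    loss.backward()
    opt.step()

# After pretraining, `encoder` can be fine-tuned on the downstream task with
# only a few labelled examples, e.g. by adding a small classification head.
```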