SATW: Distributed Training of Transformers - Scalable, Private, Robust and Personalized

Lay summary

Transformer-based machine learning models have recently led to revolutionary performance gains in models for natural language and text, which are relevant to a growing number of practical applications affecting billions of end users, such as personal assistants, writing support, or machine translation services. However, significant problems remain to be solved in enabling decentralized, privacy-preserving collaborative training of such models, in keeping with the use of participants' sensitive data.

The present proposal will develop, and bring to application readiness, distributed and decentralized alternatives to current training algorithms for transformer models that are scalable and energy-efficient, privacy-preserving, and fault-tolerant, and that allow individual personalization of the resulting models.

In addition to foundational research on the algorithms, we will also produce open-source software that can be used in practice and in industry applications.

Abstract

Our proposal aims to enable the distributed training of recent self-attention neural network models such as transformers, in a way that is efficiently scalable, privacy-preserving, robust to malicious actors, and personalized to each user's needs.

Transformer-based machine learning models have recently led to revolutionary performance gains on natural language and text tasks, which are relevant for a growing set of practical applications affecting billions of end users, such as personal assistants, writing support, or machine translation services. However, significant problems remain to be solved in enabling decentralized and privacy-preserving collaborative training of such models, to better align the participants' data ownership with the utility resulting from such models.

The present proposal will develop and help productionize distributed and decentralized analogues of current training algorithms for transformer models, which are scalable and efficient, privacy-preserving, and fault-tolerant, and which support individual personalization of the resulting models.

In addition to foundational research on algorithms, we will also produce open-source software ready to be deployed in practice and in industry applications.

Last updated: 10.03.2023

SNSF
Project funding (Div. I-III)
Original data source 200342

Information Technology
Mathematics, Natural and Engineering Sciences; Engineering Sciences
