Disentangling Linguistic Intelligence: Current limitations and future trends in generalisation in Large Language Models
The currently reported success of large language models rests on computationally (and environmentally) expensive algorithms and on prohibitively large amounts of data that are available for only a few, non-representative languages. This limitation restricts access to natural language processing technology to a few dominant languages and modalities, and leads to the development of systems that are not human-like, with great potential for unfairness and bias.
To reach better, possibly human-like, abilities in neural networks' abstraction and generalisation, we need to develop tasks and data that train the networks towards more complex and compositional linguistic abilities. We identify these abilities as the intelligent ability to infer regular patterns in unstructured data and to generalise from few examples, using abstractions that are valid across possibly very different languages.
We have developed a new set of tasks inspired by IQ tests. These tasks are developed specifically for language and are intended to drive the learning of disentangled representations of the underlying rules of grammar.
These investigations can lead to three beneficial improvements of methods and practices: (i) deep, compositional representations would be learnt, thus reducing the amount of data needed; (ii) current machine learning methods would be extended to low-resource languages or low-resource modalities and scenarios; (iii) higher-level abstractions would be learnt, avoiding the use of superficial, associative cues that are the cause of bias and potential harm in the representations learned by current artificial linguistic systems.
Paola Merlo is the head of the interdisciplinary research group Computational Learning and Computational Linguistics (CLCL). The group is concerned with interdisciplinary research combining linguistic modelling with machine learning techniques. She is currently involved in the centre of competence in research on the evolution of language, NCCR Evolving Language, where she develops work on compositionality through computational methods.
Over the years, Prof. Merlo has provided service to the Association for Computational Linguistics as past editor of its main journal, Computational Linguistics, as a member of several executive committees, and as general chair of some of its main conferences. She holds the distinction of ACL Fellow for pioneering research on foundational problems in the automatic acquisition of structure and meaning.
Prof. Merlo holds a doctorate from the University of Maryland, has been an associate research fellow at the University of Pennsylvania, and has been a visiting scholar at Edinburgh, Stanford and Uppsala University.
She has recently been awarded an SNSF Advanced Grant for the project Disentangling Linguistic Intelligence: automatic generalisation of structure and meaning across languages.