SATW: Recurrence-independent inference of rare and non-coding functional mutations in cancer

Lay summary

Predicting oncogenic mutations has been approached in many ways in recent years, taking advantage of the ability to analyze data from thousands of patients with algorithmic and statistical methods. These approaches rely mainly on two criteria: 1) if a mutation is observed in many tumors, and/or 2) if a mutation lies in a region known to be important for the function of the affected protein, then it is more likely to be oncogenic. Effective as they are, these criteria do not allow us to study the potential oncogenicity of relatively rare mutations or of mutations whose functional impact is unknown, such as mutations affecting the part of our DNA that does not produce proteins (also called non-coding DNA), which makes up about 98% of our DNA. In this project, we will use new approaches based on artificial intelligence techniques and evolutionary theory to predict which rare and/or non-coding mutations are oncogenic. Both types of approach that we will develop are largely independent of how frequently a mutation is observed and can be used to analyze non-coding mutations. In addition, we will analyze how mutations predicted to be oncogenic give rise to tumors with different characteristics when present individually or in combination with one another. This analysis will make it possible to predict, for example, whether the efficacy of a drug against a given mutation is compromised by the presence of a second mutation. This project will open new horizons in the search for oncogenic mutations, with important implications for the design and use of personalized therapies in oncology.

Abstract

Human tumors typically exhibit hundreds or even thousands of somatic mutations across their genome (Rheinbay et al., 2020), of which only a small fraction plays a role in driving and sustaining the disease (Sanchez-Vega et al., 2018). Identifying these few driver events is key to developing anti-cancer therapies.

The search for driver mutations has been a central challenge of modern cancer genomics. Computational and statistical approaches have been designed and applied to large-scale cohorts of DNA-sequenced tumors, with the goal of mining mutations that are over-represented across multiple patients (Martínez-Jiménez et al., 2020). However, statistically significant recurrence is hard to estimate in the non-coding portion of the genome, which accounts for ~98% of the total, and most current approaches consider each mutation independently of the others, i.e. they lack a systemic perspective.

Here, we propose to develop approaches orthogonal to recurrence-based models, leveraging cutting-edge machine learning techniques and systemic interactions among somatic alterations. In detail, our proposed project will:
1. Develop a deep learning framework to learn a continuous embedding of coding and non-coding variants in cancer, which can be cross-linked with functional phenotypes.
2. Infer functional coding and non-coding alterations using evolutionary dependencies, i.e. functional interactions among altered genes. This will allow us to investigate functional alterations as systems of synergistic and antagonistic events.
3. Associate putative functional variants with cell phenotypes: we will integrate genetic, phenotypic, and experimental evidence to corroborate our predictions and link single driver events and combinations of driver events to cell phenotypes and therapeutic response.

To achieve these objectives, we will develop computational frameworks able to integrate large-scale genomic cohorts comprising, in total, ~20,000 tumors and >100,000 normal samples. Importantly, the collaboration between our labs at the University of Lausanne, Switzerland, and the Indraprastha Institute of Information Technology, India, will be key to combining specific expertise in machine learning, statistical inference, and cancer genomics.

Overall, our proposal will tackle two major challenges that were left largely unresolved by recurrence-based models: the prediction of rare functional coding and non-coding variants observed in cancer, and the association between combinations of alterations (rather than single alterations) and molecular and clinical phenotypes. The success of this project will provide resources and computational tools to advance our understanding of cancer somatic alterations and their interpretation in the clinic.

Last updated: 14.05.2022

SNSF
Bilateral programmes
Original data source 200239

Cancer
Biology and Medicine;Preventive Medicine (Epidemiology/Early Diagnosis/Prevention)
