Chromosomal rearrangements can lead to the fusion of genes located in different regions of the genome. Gene fusions have a high impact on cancer initiation, progression, and morbidity. A vast amount of knowledge about gene fusions is now available, providing enough information to answer some key questions about fusion-related oncogenesis. Moreover, rapid progress in high-throughput sequencing technologies has generated a large stream of novel oncogenic gene fusion predictions.
The first objective of this thesis is to identify the hallmarks of oncogenic fusion genes using integrative statistical screening of high-throughput genomic data. The data presented indicate that the pool of fusion partner genes, their tissue specificity, and their specific pairing are non-random and correlate with gene expression levels. The functional characteristics of fusion proteins (the retained domain combinations and protein interaction interfaces) also appear to be very specific. Genome organization at different levels (chromosome conformation and replication timing) was also found to influence fusion partner gene selection. However, the results indicate that spatial proximity in the nucleus is unlikely to directly shape fusion partner gene pairing.
The aforementioned hallmarks were used to construct a pipeline that distinguishes oncogenic fusions from noise and passenger events in predictions made from next-generation sequencing data. The classifier performed well both on training and testing datasets comprising several hundred gene fusions and in an extensive follow-up validation that used recent genomic data and recently discovered oncogenic fusions.