PhD PROJECT
Genetic and phenotypic comorbidities between Mendelian and common diseases from a network perspective
Host teams
Aitor González / TAGC / [email protected]
Anaïs Baudot / MMG / [email protected]
Scientific background
Mendelian diseases show discrete phenotypes, usually triggered by monogenic mutations, whereas common diseases show continuous phenotypes that depend on numerous weak polygenic variations. However, Mendelian and common diseases share both molecular and phenotypic features, leading to comorbidity relationships [1]. For instance, mutations in the transcription factor (TF) GATA5 cause congenital heart defects (CHD), and SNP variations around the same gene are involved in hypertension [2]. Hypertension is also a classical complication of CHD. We propose here to study these relationships between Mendelian and common diseases from a biological interaction network point of view. To this end, we will map disease features (e.g., SNP variations, mutations, phenotypes) to networks containing interactions between diseases and between genes/proteins, as well as interactions with non-coding genomic regions, and develop innovative algorithms to extract comorbidity subnetworks from these multiplex multipartite networks.
PhD Objectives
The objective of this PhD project is to investigate the shared architecture between Mendelian and common diseases at both the phenotypic and molecular levels by identifying subnetworks enriched in mutations and variations. As a consequence, this approach will also help predict the relevant variations (SNPs, eQTLs) implicated in common polygenic diseases, or acting as modifiers that modulate Mendelian diseases.
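As an illustration of what "enriched in mutations and variations" could mean in practice, the sketch below scores a candidate subnetwork with a one-sided hypergeometric test. The project description does not name a statistical test, so this choice, the function name and the example counts are all assumptions made for illustration only.

```python
from math import comb


def hypergeom_enrichment_pvalue(network_size, hits_in_network,
                                subnetwork_size, hits_in_subnetwork):
    """Probability of seeing at least `hits_in_subnetwork` disease-associated
    nodes in a subnetwork drawn at random (without replacement) from a network
    of `network_size` nodes, `hits_in_network` of which are disease-associated.

    This is the upper tail of the hypergeometric distribution. The use of this
    test here is an assumption, not the method stated in the project.
    """
    upper = min(hits_in_network, subnetwork_size)
    total = comb(network_size, subnetwork_size)
    tail = sum(
        comb(hits_in_network, i)
        * comb(network_size - hits_in_network, subnetwork_size - i)
        for i in range(hits_in_subnetwork, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 genes in the network, 50 carrying disease-linked
# variants, and a 20-gene subnetwork containing 6 of them.
p_value = hypergeom_enrichment_pvalue(1000, 50, 20, 6)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value would suggest that the subnetwork holds more disease-associated nodes than expected by chance; in a real analysis it would need correction for the number of subnetworks tested.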
Proposed approach (experimental / theoretical / computational)
We will first create a disease-disease network containing links between diseases sharing phenotypes. Then, we will build a classical multiplex network composed of different layers of biological relationships. It will contain protein-protein interactions, but also molecular complexes and pathway interactions [3; 4]. We will extend this multiplex framework to consider networks of relationships with non-coding DNA loci by including TF-DNA [5] and DNA-DNA/Hi-C interactions. Then, mutation and variation loci linked to Mendelian and common polygenic diseases will be mapped. Dedicated algorithms, such as random walks with restart and community detection strategies, will be developed and adapted to explore these extended multiplex and multipartite networks. They will help define subnetworks enriched in comorbid associations, thereby predicting regulatory interactions and biological processes linking the different disorders.
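To make the random walk with restart idea concrete, here is a minimal sketch on a single-layer, undirected toy network. It is not the multiplex and heterogeneous formulation of [4], and the node names, the restart probability of 0.7 and the function name are assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Return the stationary visiting probability of every node.

    adjacency: dict mapping each node to the set of its neighbours
               (undirected, unweighted graph; a simplifying assumption).
    seeds:     nodes the walker jumps back to, e.g. known disease genes.
    restart:   probability of jumping back to a seed at each step.
    """
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one seed node is required")
    nodes = list(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            moving = (1.0 - restart) * p[n]
            if not neighbours:
                # Isolated node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += moving / len(seeds)
                continue
            for m in neighbours:
                nxt[m] += moving / len(neighbours)
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: one connected module plus an unrelated pair.
toy_network = {
    "gene_A": {"gene_B", "gene_C"},
    "gene_B": {"gene_A", "gene_C"},
    "gene_C": {"gene_A", "gene_B", "gene_D"},
    "gene_D": {"gene_C"},
    "gene_E": {"gene_F"},
    "gene_F": {"gene_E"},
}
scores = random_walk_with_restart(toy_network, seeds={"gene_A"})
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}\t{score:.4f}")
```

Nodes are ranked by their proximity to the seed, and nodes in a disconnected component receive no score. In the multiplex setting the walker can additionally jump between layers, which this single-layer sketch does not model.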
PhD student’s expected profile
The PhD student should have a Master’s degree in an area related to Bioinformatics, Computer Science or Mathematics, with an interest in data analysis, graph theory and human genetics. The project and the PhD student will benefit from the expertise of numerous experimental biologists working on Mendelian and common diseases in both laboratories. In particular, we will have access to in-house datasets of exomes and transcriptomes for some diseases of interest. The TAGC laboratory is interested in the study of complex traits and diseases, with a particular focus on the analysis of gene regulatory regions [5]. Within the TAGC, Aitor González investigates and models the non-coding regions and variants of the genome [6]. Marseille Medical Genetics is a research center located at the La Timone Faculty of Medicine. The research team "Networks and Systems Biology for Diseases", led by Anaïs Baudot, applies and develops network approaches to extract information from large-scale biological data in order to investigate human disorders [3; 4; 7]. The project will be developed in collaboration with Daniel Rico and his team at Newcastle University, experts in network approaches to analyze genomic data [8].
References
[1] Blair, D. R.; Lyttle, C. S.; Mortensen, J. M.; Bearden, C. F.; Jensen, A. B.; Khiabanian, H.; Melamed, R.; Rabadan, R.; Bernstam, E. V.; Brunak, S.; Jensen, L. J.; Nicolae, D.; Shah, N. H.; Grossman, R. L.; Cox, N. J.; White, K. P. and Rzhetsky, A. (2013). A nondegenerate code of deleterious variants in Mendelian loci contributes to complex disease risk. Cell 155: 70-80.
[2] Padang, R.; Bagnall, R. D.; Richmond, D. R.; Bannon, P. G. and Semsarian, C. (2012). Rare non-synonymous variations in the transcriptional activation domains of GATA5 in bicuspid aortic valve disease. Journal of Molecular and Cellular Cardiology 53: 277-281.
[3] Didier, G.; Brun, C. and Baudot, A. (2015). Identifying communities from multiplex biological networks. PeerJ 3: e1525.
[4] Valdeolivas, A.; Tichit, L.; Navarro, C.; Perrin, S.; Odelin, G.; Levy, N.; Cau, P.; Remy, E. and Baudot, A. (2018). Random Walk with Restart on Multiplex and Heterogeneous Biological Networks. Bioinformatics (Oxford, England).
[5] Chèneby, J.; Gheorghe, M.; Artufel, M.; Mathelier, A. and Ballester, B. (2018). ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-seq experiments. Nucleic Acids Research 46: D267-D275.
[6] Seyres, D.; Darbo, E.; Perrin, L.; Herrmann, C. and González, A. (2016). LedPred: an R/bioconductor package to predict regulatory sequences using support vector machines. Bioinformatics (Oxford, England) 32: 1091-1093.
[7] Ibáñez, K.; Boullosa, C.; Tabarés-Seisdedos, R.; Baudot, A. and Valencia, A. (2014). Molecular evidence for the inverse comorbidity between central nervous system disorders and cancers detected by transcriptomic meta-analyses. PLoS Genetics 10: e1004173.
[8] Pancaldi, V.; Carrillo-de-Santa-Pau, E.; Javierre, B. M.; Juan, D.; Fraser, P.; Spivakov, M.; Valencia, A. and Rico, D. (2016). Integrating epigenomic data and 3D genomic structure with a new measure of chromatin assortativity. Genome Biology 17: 152.