M2 Internship: Bioinformatics / Network Enrichment Analysis

Internship · M2 internship · 6 months · Bac+5 / Master · Center for Research in Epidemiology and Population Health (CESP) · Villejuif (France)

Start date: 1 February 2022

Keywords

Enrichment Analysis · Network Analysis · Pathway Analysis · R Programming

Description

Background:

Some biological pathways are suspected of being involved in several different cancers [1]. In particular, thyroid and breast cancers share many biological similarities: both are more frequent in women and are influenced by hormonal and reproductive factors and by obesity. Epidemiological studies have repeatedly reported that women with thyroid carcinoma are at increased risk of breast cancer (BC), and conversely [2]. Other thyroid disorders have also been suspected of being linked to BC risk. These results do not seem to be explained by the consequences of treatment for thyroid cancer (TC) or by increased surveillance; rather, they suggest common causes (aetiologies) and/or common genetic mechanisms for the two diseases. The association between thyroid dysfunction and BC risk remains unclear, as most epidemiological studies included in the meta-analyses had no information on important potential confounders. Further exploring pleiotropy (the fact that one genetic variant affects multiple diseases) between BC and thyroid disorders could help us better understand the aetiology of these diseases by identifying common biological mechanisms. A comparison of results from genome-wide association studies (GWAS), which test each genetic variant one by one for association with a disease, showed only one genetic risk locus shared by BC and TC [3,4]. We have access to GWAS results on thyroid cancer (EPITHYR consortium, coordinated by our team) and on thyroid disorders, and to data from the most recent GWAS conducted by the Breast Cancer Association Consortium (BCAC). Using commonly used approaches and the GCPBayes method recently developed by our team [5], we identified pleiotropic genes associated with both cancers. The next step after finding candidate pleiotropic genes is to use enrichment methods to explore important components [6]. Enrichment methods add information to the analysis from a Systems Biology point of view [7].
A Systems Biology approach also considers the interactions between the different components of a biological system. Using network-based information, it may therefore be possible to suggest how key genes interact with each other and affect their neighbors in various pathways.

Objectives of the internship:

In this project, we would like to explore the mechanisms shared between BC and thyroid disorders by using network-based and pathway-based enrichment analyses, in order to identify pathways or functional networks commonly involved in both cancers. These analyses will be performed on a list of genes previously selected as candidates with potential pleiotropic effects between the two diseases.

Mission:

  • Performing network-based gene set enrichment analysis (for instance, using GSA-SNP2 or PINBPA). The candidate may write scripts in the R programming language and work with network analysis software (such as Cytoscape).
  • Performing gene set enrichment analysis (for instance, using GIGSEA, FUMA or MAGMA). The candidate may write scripts in the R programming language and work in a Unix environment.
  • Comparing the results of the two strategies.
  • Preparing standard scientific documentation (using GitHub).
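
For candidates unfamiliar with the idea, the simplest form of gene set enrichment (an over-representation test) can be sketched as below. This is only an illustration, not the project's pipeline: the gene names, the pathway and the background size are hypothetical toy data, and tools such as MAGMA or GSA-SNP2 rely on more elaborate models. The internship itself uses R; Python is used here purely for the sketch.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_candidates, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    X is the number of pathway genes found when n_candidates genes are
    drawn without replacement from n_background genes, of which n_pathway
    belong to the pathway.
    """
    max_overlap = min(n_pathway, n_candidates)
    total = comb(n_background, n_candidates)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_candidates - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy data: 20 background genes, one 5-gene pathway,
# and 4 candidate pleiotropic genes (all names are made up).
background = {f"GENE{i}" for i in range(1, 21)}
pathway = {"GENE1", "GENE2", "GENE3", "GENE4", "GENE5"}
candidates = {"GENE1", "GENE2", "GENE3", "GENE10"}

overlap = pathway & candidates
p_value = hypergeom_enrichment_p(
    len(background), len(pathway), len(candidates), len(overlap)
)
print(f"{len(overlap)} candidate genes in pathway, p = {p_value:.4f}")
```

A small p-value suggests that the candidate list contains more pathway genes than expected by chance. In a real analysis the test would be repeated over many pathways and corrected for multiple testing.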

Candidate’s profile:

Master's degree (M2) or equivalent in bioinformatics. Strong interest in genetics and statistics. Knowledge of the R programming language and the UNIX environment. Experience with standard scientific repositories such as GitHub would be appreciated. The internship will be conducted in English.

Contact:

Please send your CV and cover letter to Yazdan Asgari, yazdan.asgari@inserm.fr (Inserm U1018, CESP - Centre de Recherche en Épidémiologie et Santé des Populations, Villejuif) and Pierre-Emmanuel Sugier, pierre-emmanuel.sugier@inserm.fr (Inserm U1018, CESP - Centre de Recherche en Épidémiologie et Santé des Populations, Villejuif).

Location:

The intern will be based at the Center for Research in Epidemiology and Population Health (CESP) at Paul Brousse Hospital, Villejuif (http://cesp.inserm.fr/).

Start date:

February 2022

The AMLAP Project:

This work is part of the AMLAP project (Advanced Machine Learning Algorithms for Leveraging Pleiotropy effect), funded by the “ITMO Cancer d’Aviesan”, which involves two teams: the “Exposome and heredity” team of the CESP research center (Centre de Recherche en Épidémiologie et Santé des Populations, Villejuif, France, http://cesp.inserm.fr/), and the LMA of the University of Pau and Adour Countries (Laboratory of Mathematics and its Applications, UMR CNRS 5142, Anglet, France, https://lma-umr5142.univ-pau.fr/fr/index.html). The overall aims of the project are two-fold: to develop novel big data analytics methods for leveraging pleiotropy using specific data structures (gene or pathway level), and to apply these methods to large individual-level data sets and to massive data sets using summary statistics. This work falls under the second objective of the project.

References:

[1] Solovieff et al., Pleiotropy in complex traits: challenges and strategies. Nat. Rev. Genet., 14(7):483-495, 2013

[2] Bolf et al., A linkage between thyroid and breast cancer: A common etiology? Cancer Epidemiol. Biomarkers Prev., 28:643-649, 2019

[3] Gudmundsson et al., A genome-wide association study yields five novel thyroid cancer risk loci. Nat. Commun., 8:14517, 2017

[4] Stacey et al., Common variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen receptor-positive breast cancer. Nat. Genet., 39:865-869, 2007

[5] Baghfalaki et al., Bayesian meta-analysis models for cross cancer genomic investigation of pleiotropic effects using group structure. Stat. Med., 40(6):1498-1518, 2021

[6] de Leeuw et al., The statistical properties of gene-set analysis. Nat. Rev. Genet., 17(6):353-364, 2016

[7] Yoon et al., Efficient pathway enrichment and network analysis of GWAS summary data using GSA-SNP2. Nucleic Acids Res., 46(10):e60, 2018

Application

Procedure: Please send your CV and cover letter to yazdan.asgari@inserm.fr and pierre-emmanuel.sugier@inserm.fr

Deadline: 14 January 2022

Contacts

Yazdan Asgari

 yazdan.asgari@inserm.fr

Offer published on 25 November 2021, displayed until 28 February 2022