Keywords
bioinformatic pipeline
complex samples
microbial communities
metabolomic, multi-omic
association study
Description
Scientific context:
Isoprenoid quinones are small lipophilic molecules that are essential for energy generation in most organisms, and their biosynthetic pathways are evolutionarily related (1). Our team has contributed significantly to the characterization of these pathways (2-4), and we suspect that novel isoprenoid quinones remain to be discovered, especially in microorganisms that cannot be cultured in the lab. We aim to identify novel quinones through molecular network analyses of high-accuracy mass spectrometry metabolomic datasets (6). To do so, a pipeline will first be assembled to run the molecular network analyses using existing software (GNPS, SIRIUS, pyOpenMS). Novel quinones will then be sought in environmental samples whose archaeal and bacterial composition has already been established by metagenomic sequencing. Finally, the quinones will be linked to their producing species, and candidate biosynthetic pathways will be proposed based on known pathways.
Main objectives and methods:
- Calculating a molecular network
This step consists of pre-processing the raw metabolomic data and extracting the relevant features (e.g. with UmetaFlow). The molecular network will then be computed with existing software or packages, and the metadata will be mapped onto it (e.g. with pyOpenMS and GNPS). A visualization of the network, including the mapped metadata, will be set up in Cytoscape.
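To give an idea of what computing a molecular network involves, here is a minimal sketch in plain Python. It assumes each feature is represented by a list of (m/z, intensity) fragment peaks and links two features when a simplified cosine score exceeds a threshold. The function names, the tolerance `tol` and the threshold `min_score` are illustrative assumptions. The actual pipeline would rely on GNPS or pyOpenMS, whose scoring schemes are more elaborate.

```python
import math
from itertools import combinations


def cosine_score(spec_a, spec_b, tol=0.02):
    """Simplified cosine similarity between two peak lists [(mz, intensity), ...].

    Peaks are matched greedily (highest intensity product first) when their
    m/z values differ by at most `tol`. Illustrative scoring only.
    """
    candidate_pairs = [
        (int_a * int_b, i, j)
        for i, (mz_a, int_a) in enumerate(spec_a)
        for j, (mz_b, int_b) in enumerate(spec_b)
        if abs(mz_a - mz_b) <= tol
    ]
    candidate_pairs.sort(reverse=True)

    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in candidate_pairs:
        # Each peak may be used in at most one match.
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(intensity ** 2 for _, intensity in spec_a))
    norm_b = math.sqrt(sum(intensity ** 2 for _, intensity in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(spectra, min_score=0.7, tol=0.02):
    """Return edges (feature_a, feature_b, score) for all pairs scoring >= min_score."""
    edges = []
    for a, b in combinations(sorted(spectra), 2):
        score = cosine_score(spectra[a], spectra[b], tol)
        if score >= min_score:
            edges.append((a, b, score))
    return edges


# Hypothetical toy spectra, not real data.
toy_spectra = {
    "feat_1": [(100.00, 10.0), (150.00, 5.0)],
    "feat_2": [(100.01, 10.0), (150.00, 5.0)],
    "feat_3": [(300.00, 8.0)],
}
toy_edges = build_network(toy_spectra)
```

With these toy spectra, only `feat_1` and `feat_2` share fragments, so they are the only pair connected in the resulting network.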
- Analysis of the molecular network
The analysis will consist of identifying the molecules of interest (isoprenoid quinones) in the network and extracting the unknown molecules connected to them. These unknown molecules will then be characterized using the available metadata, confirmed by inspection of the raw metabolomic data, and verified by a second round of targeted experimental analyses.
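The extraction step can be pictured as a graph traversal. The sketch below is an illustration under stated assumptions, not the team's method: it takes the network as a list of edges plus a set of features already annotated as known quinones, and returns every unannotated feature reachable from one of them. The function name and the feature labels are hypothetical.

```python
from collections import defaultdict, deque


def unknowns_linked_to_known(edges, known):
    """Return the unannotated nodes that share a connected component with a known node.

    `edges` is an iterable of (node_a, node_b) pairs; `known` is a set of
    node names annotated as isoprenoid quinones.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Breadth-first search starting from every known quinone present in the network.
    queue = deque(node for node in known if node in adjacency)
    seen = set(queue)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    return sorted(seen - set(known))


# Hypothetical toy network: "UQ8" stands for an annotated quinone.
toy_edges = [("UQ8", "f1"), ("f1", "f2"), ("f3", "f4")]
candidates = unknowns_linked_to_known(toy_edges, {"UQ8"})
```

Here `f1` and `f2` are returned as candidates, whereas `f3` and `f4` form a separate cluster with no known quinone and are left out. Restricting the search to direct neighbours would be a stricter alternative.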
- Statistical analyses to associate quinones with producing species
The link between the quinones and their producing species will be established using a multi-omic approach (5): the metabolomic data will be connected to the metagenomic data already available for the same samples. Then, based on the genome of each candidate producing species and the annotation of known biosynthetic pathways (using an annotation tool developed by the team), a candidate biosynthetic pathway will be proposed.
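One simple way to picture such an association, given here only as an assumption since the statistical method is not specified above, is to rank species by the Spearman correlation between their abundance and the intensity of a quinone across the same samples. The sketch uses plain Python and hypothetical names. A real analysis would also need to handle compositional data, sparsity and multiple testing.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def rank_candidate_producers(quinone_intensity, species_abundance):
    """Sort species by decreasing correlation with the quinone signal."""
    scores = {
        species: spearman(quinone_intensity, abundance)
        for species, abundance in species_abundance.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values for four samples, not real data.
toy_quinone = [1.0, 2.0, 3.0, 4.0]
toy_species = {
    "species_A": [10.0, 20.0, 30.0, 40.0],
    "species_B": [40.0, 30.0, 20.0, 10.0],
}
ranking = rank_candidate_producers(toy_quinone, toy_species)
```

In this toy case `species_A` comes first because its abundance rises with the quinone signal, while `species_B` varies in the opposite direction.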
Profile of the candidate:
We are seeking a highly motivated candidate keen to explore large metabolomic datasets to discover new molecules of high biological relevance.
The candidate should have training in data analysis and/or bioinformatics. In particular, knowledge of a programming language (R, Python…) and proficiency in a Linux environment are required.
If interested, the student will be able to contribute to the experiments conducted in our team: sample preparation, generation of metabolomic datasets, and experimental characterization of the newly discovered quinones.
Environment:
The TREE team @TIMC lab (CNRS, Université Grenoble Alpes): We are a highly interdisciplinary team of biochemists, biophysicists, molecular microbiologists, biostatisticians and bioinformaticians, with a strong common interest in microbial evolution. The lab is located on the La Tronche campus, close to Grenoble (Tram B).
References:
(1) Abby et al., BBA Bioenergetics (2020) 1861:148259
(2) Pelosi et al., mBio (2019) 10:e01319-19
(3) Kazemzadeh et al., Mol Biol Evol (2023) msad219
(4) Elling et al., bioRxiv (2024)
(5) Nothias et al., Nature Methods (2020) 905-908 (not from our team)