Unravelling the determinants of gene transfer in bacterial genomes

Fixed-term contract (CDD) · PhD thesis · 36 months · Bac+5 / Master's degree · Centre de Bioinformatique (Mines Paris & Institut Curie) · Paris (France)

Start date: 1 October 2023


Bacterial genomics, horizontal gene transfer, statistics, evolutionary biology


The spread of antibiotic resistance among bacteria is a growing
challenge, which the WHO has declared one of the top 10 health threats for
humanity. The rapid spread of antibiotic resistance genes is made
possible by the very frequent gene exchanges between bacteria, which can occur
via various evolutionary processes. Closely related organisms can exchange
genes via homologous recombination, also called horizontal allele transfer
(HAT). More distantly related organisms can also exchange genes via other
mechanisms, collectively called horizontal gene transfer (HGT). Finally, genes
also spread between different ecosystems via migration of bacteria between
environments. Yet HAT, HGT and species migration all remain poorly described
[1,2]. As a consequence, we only have a preliminary understanding of the
factors that favour gene spread via HAT, HGT or migration in bacterial
populations.
In recent years, the genomes of numerous species of bacteria living in diverse
habitats have been sequenced [3]. These large datasets represent a
tremendous opportunity to unravel the properties of gene exchanges. However,
analysing such large datasets is computationally costly and requires the
development of efficient dedicated methods. The goal of this thesis will be to
develop methods to handle the huge amount of sequencing data collected in
environmental and medical science to shed light on the determinants of
genetic exchanges in the bacterial world. To this end, we propose to study the
maximal exact matches found in bacterial genomes. We have recently
demonstrated the power of studying maximal exact matches to uncover the
properties of genetic exchanges [4] and measure the rate of exchange between
The project has several independent goals:
- Extending our mathematical method for evaluating HGT rates. We will use
mathematical and statistical approaches to evaluate and improve the
robustness of our method for measuring gene exchange rates. In addition, we
will extend the evolutionary model of the method to make it possible to
measure exchange rates in closely related species, which is not possible in the
current version.
- Applying a similar framework to measure migration rates between
environments. To do so, we will use metagenomic samples, which result from the
simultaneous sequencing of many different organisms living in a given
environment. The goal of this part of the project will be to evaluate the
feasibility of applying our method to metagenomic samples, and to adapt the
evolutionary and mathematical model to these data.
- Using network approaches to analyse the dissemination of genetic material
across species and environments. This will first require constructing the
gene exchange network inferred from the gene exchange rates. We will then
study the properties of this network, notably to identify hub species that
play a particular role in gene exchange, and test whether different
categories of genes (in particular antibiotic resistance genes) use
different dissemination routes.
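
To give candidates a concrete feel for the third aim, here is a minimal sketch of how a gene exchange network could be built from pairwise exchange rates and screened for hub species. It is an illustration only, not the method used by the group: the function names, the `min_rate` threshold, the species labels and the rate values are all hypothetical placeholders, and weighted degree is just one of many possible hub criteria.

```python
from collections import defaultdict


def build_exchange_network(rates, min_rate=0.0):
    """Build an undirected weighted graph from pairwise exchange rates.

    rates: dict mapping (species_a, species_b) -> estimated exchange rate.
    Pairs whose rate is not strictly above min_rate are dropped.
    """
    graph = defaultdict(dict)
    for (a, b), rate in rates.items():
        if a != b and rate > min_rate:
            graph[a][b] = rate
            graph[b][a] = rate
    return dict(graph)


def node_strength(graph):
    """Weighted degree: sum of exchange rates over the edges of each node."""
    return {node: sum(neigh.values()) for node, neigh in graph.items()}


def find_hubs(graph, top_k=1):
    """Return the top_k nodes ranked by strength (ties broken by name)."""
    strength = node_strength(graph)
    ranked = sorted(strength, key=lambda n: (-strength[n], n))
    return ranked[:top_k]


# Toy input: placeholder labels and made-up rates, for illustration only.
toy_rates = {
    ("sp_A", "sp_B"): 0.5,
    ("sp_A", "sp_C"): 0.25,
    ("sp_A", "sp_D"): 0.125,
    ("sp_B", "sp_C"): 0.0625,
}
network = build_exchange_network(toy_rates, min_rate=0.1)
hubs = find_hubs(network, top_k=1)
```

Building one such network per gene category (for instance, antibiotic resistance genes versus all other genes) and comparing the resulting hubs and edges would be one simple way to ask whether different categories follow different dissemination routes.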

Preferred Experience/Educational Background:
• Second-year Master's degree (M2 or equivalent engineering degree) in
machine learning, bioinformatics or applied statistics.
• Programming experience (R or Python).
• Personal qualities: autonomy, good communication skills (English).
The candidate should have a strong taste for interdisciplinary approaches at
the crossroads of mathematics and biology. Knowledge of evolutionary
biology, genomics or microbiology is a plus but not strictly mandatory.

Host Laboratory & supervision
The PhD will be supervised by Florian Massip at the Center for Computational
Biology (CBIO) in Paris. The CBIO is co-hosted by Mines Paris (a leading French
engineering school) and the Institut Curie, and is part of the U900 unit.
This project is part of an international collaboration with P.F. Arndt (Max Planck
Institute for Molecular Genetics, Berlin). Visits to Berlin lasting weeks or
months will be possible during the PhD.


The candidate will have to apply to the ISMEE doctoral school (École des Mines).

To apply, please send a CV, a cover letter and a recommendation letter

[1] W. Liu, T. A. Tokuyasu, X. Fu, and C. Liu. The spatial organization of microbial
communities during range expansion. Current Opinion in Microbiology, 63:109–116,
2021.
[2] T. Sakoparnig, C. Field, and E. van Nimwegen. Whole genome phylogenies
reflect the distributions of recombination rates for many bacterial species. eLife,
10:e65366, 2021.
[3] G. A. Blackwell, M. Hunt, K. M. Malone, L. Lima, G. Horesh, B. T. Alako, N. R.
Thomson, and Z. Iqbal. Exploring bacterial diversity via a curated and
searchable snapshot of archived DNA sequences. PLoS Biology, 19(11):e3001421,
2021.
[4] M. Sheinman, K. Arkhipova, et al. Identical sequences found in distant
genomes reveal frequent horizontal transfer across the bacterial domain. eLife,
10:e62719, 2021.


Procedure: To apply, please send a CV, a cover letter and a recommendation letter to

Deadline: 30 April 2023


Florian Massip

Posted on 17 March 2023, displayed until 30 April 2023