An Approximate Bayesian Computation approach for modeling the evolution of centromeric DNA in Primat

 Fixed-term contract · M2 internship · 6 months · Bac+5 / Master's level · Laboratoire Structure et Instabilité des Génomes - Muséum National d'Histoire Naturelle · Paris (France)

 Start date: 1 January 2024


satellite DNA, genome evolution, modeling, simulations


The centromeres of eukaryotic chromosomes are made up of tandemly repeated DNA sequences, also known as satellite DNA. Thousands of copies of a monomer lie next to each other in a head-to-tail configuration. Monomer sequences vary widely between species, but their length usually remains between 150 and 200 bp. At the sequence level, they show a broad range of diversity, and several families often coexist along the chromosomes of a single species. The diversity and structural organization of satellite DNA are believed to result from specific evolutionary mechanisms, sometimes referred to as concerted evolution, that involve recurrent sequence amplification and homogenization and can be explained by unequal crossing-over and gene conversion (Smith 1976).
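As an illustration only, the two homogenization mechanisms named above can be sketched as operations on a list of monomers. This is a toy model, not the team's method: the function names, the `max_shift` parameter, and the representation of monomers as arbitrary labels are all assumptions made for the example.

```python
import random

def gene_conversion(array, rng):
    """Copy one randomly chosen monomer over another (homogenization).

    The array length is unchanged. Donor and acceptor may coincide,
    in which case the event has no effect.
    """
    new = list(array)
    donor = rng.randrange(len(new))
    acceptor = rng.randrange(len(new))
    new[acceptor] = new[donor]
    return new

def unequal_crossover(array, rng, max_shift=3):
    """Recombine two copies of the array misaligned by `shift` monomers.

    The product keeps the left part of one copy and the right part of
    the other: a positive shift deletes monomers, a negative shift
    duplicates them. `max_shift` is a hypothetical parameter.
    """
    n = len(array)
    # Bound the shift so the product always keeps at least one monomer.
    shift = rng.randint(-min(max_shift, n), min(max_shift, n - 1))
    lo = max(0, -shift)       # smallest valid breakpoint
    hi = min(n, n - shift)    # largest valid breakpoint
    cut = rng.randint(lo, hi)
    return list(array[:cut]) + list(array[cut + shift:])
```

Applying these two functions repeatedly to a small array shows the expected qualitative behavior: the array length drifts, and some monomer variants spread while others are lost.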

Additionally, mutations (substitutions or indels) can accumulate over time. The relative importance of these mechanisms and the values of the parameters that describe them are still unknown. The best-studied satellite DNA, which represents 3.5% of the human genome, is called alpha satellite DNA and has a monomer length of 171 bp. Similar sequences are found in almost all primates.

Our team has started to address the evolution of alpha satellite DNA by focusing on a clade of Old World monkeys for which numerous species are available (Cacheux et al. 2016, Cacheux et al. 2018). We have also developed specific bioinformatic approaches for the classification of alpha satellite DNA (Haschka et al. 2020). The recent explosion of genome sequencing projects and the advent of long-read sequencing methods have led to the release of the first complete chromosome sequences (telomere to telomere, T2T) for the human genome and for a few other species, providing a wealth of new experimental datasets for exploring the diversity of alpha satellite DNA across species.

In this context, the main objective of the internship will be to assess the relative importance of the mechanisms that drive the evolution of alpha satellite DNA, and to estimate the values of the parameters that describe them, using simulation-based approaches. More precisely, the intern will have to:

1. design evolutionary models of varying complexity to describe the evolution of satellite DNA

2. develop tools to simulate the evolution of satellite DNA according to those models

3. develop a probabilistic approach to infer model parameters from the simulations and the available datasets using an Approximate Bayesian Computation (ABC) framework (see Beaumont, 2010)
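To give a concrete idea of the three steps above, here is a minimal sketch of ABC by rejection sampling on a toy model. Everything in it is an assumption made for illustration: the simulator (substitutions plus gene conversion on a fixed-size array), the summary statistic (mean pairwise distance between monomers), the uniform prior on the mutation rate `mu`, and all default values. The models, statistics, and ABC variant actually used during the internship remain to be designed.

```python
import random
from itertools import combinations

def simulate(mu, conv_rate, n_monomers=20, length=50, generations=100, rng=None):
    """Toy simulator: evolve an array of initially identical monomers."""
    rng = rng or random.Random()
    ancestor = [rng.choice("ACGT") for _ in range(length)]
    array = [list(ancestor) for _ in range(n_monomers)]
    for _ in range(generations):
        for monomer in array:
            # At most one substitution per monomer per generation;
            # the new base may be identical to the old one.
            if rng.random() < mu:
                monomer[rng.randrange(length)] = rng.choice("ACGT")
        # Gene conversion: one monomer overwrites another.
        if rng.random() < conv_rate:
            donor = rng.randrange(n_monomers)
            acceptor = rng.randrange(n_monomers)
            array[acceptor] = list(array[donor])
    return array

def mean_pairwise_distance(array):
    """Summary statistic: mean fraction of differing sites between monomer pairs."""
    pairs = list(combinations(array, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * len(array[0]))

def abc_rejection(observed_stat, n_draws, epsilon, rng, conv_rate=0.5):
    """Keep the prior draws of mu whose simulated statistic is within epsilon."""
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(0.0, 0.1)  # assumed uniform prior on the mutation rate
        stat = mean_pairwise_distance(simulate(mu, conv_rate, rng=rng))
        if abs(stat - observed_stat) <= epsilon:
            accepted.append(mu)
    return accepted
```

In use, `observed_stat` would be computed from a real dataset, and the accepted values of `mu` would approximate its posterior distribution. A smaller `epsilon` gives a better approximation at the cost of a lower acceptance rate.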

The outputs of the internship will be new tools and a framework for studying the evolution of satellite DNA, as well as new insights into the evolutionary mechanisms that shape it.



Application procedure: To apply, please submit your CV and a cover letter to

Deadline: 15 December 2023


Loic Ponger

Offer published on 7 September 2023, displayed until 15 December 2023