Master 2 Bioinformatics - Ruminant Endogenous Retrovirus expression

 Internship · M2 internship · 6 months    Bac+5 / Master's level   IVPC/LBBE/PRABI-AMSB · Lyon (France)

 Start date: 6 January 2025

Keywords

Bioinformatics, endogenous retroviruses, active transposable elements, transcriptomics

Description

Retroviridae is a large family of RNA viruses that infect humans and animals and cause a variety of diseases, including cancers and severe immunodeficiencies such as AIDS in humans. Retroviruses are also major components of mammalian genomes. Known as endogenous retroviruses (ERVs), they result from ancient integrations of exogenous infectious retroviruses into the DNA of germ cells. Approximately 8% of the human genome is made of ERV sequences. Long considered "junk DNA", they are now well established as contributors to genome evolution and even provide functions that are indispensable for mammalian life and development (Almojil et al., 2021).
    Most ERVs originated from now-extinct exogenous retroviruses. Still, a few animal species have been described that carry both ERVs and their still-circulating exogenous counterparts (Chiu and VandeWoude, 2021). Among them, small ruminant genomes carry multiple ERV copies of β-retroviruses that co-exist with their exogenous counterparts, namely JSRV (Jaagsiekte sheep retrovirus) and ENTV (Enzootic Nasal Tumor Virus) (Arnaud et al., 2007). JSRV and ENTV are oncogenic viruses responsible for respiratory cancers in sheep and goats (Monot et al., 2015).
    We have recently characterized and annotated the ERV families across sheep and goat genomes (wild and domesticated), showing different dynamics among the analyzed species (Verneret et al., 2024). Our results suggest that two of these families may still have transpositional potential. The first is closely related to the circulating oncogenic exogenous retroviruses and is represented by many full-length copies with conserved ORFs and no syntenic insertions between sheep and goats. The second is composed of shorter copies with only partial retroviral genes and with insertions shared between small ruminants. This suggests that the two families rely on different transposition mechanisms.
The main goal of this project is to gather evidence that these two ERV families are still active by analyzing their expression. We will also determine the evolutionary implications for neighboring gene expression in the different species. The project is divided into three parts, each corresponding to a different approach:
-    Blind approach: using the taxonomic assignment of high-throughput sequencing data (SRA-STAT), which covers over 25 million samples (Katz et al., 2022), including over 15k RNA-Seq samples from domestic sheep and goats. This approach, based on k-mer dictionaries, reports the taxonomic composition of the reads within a sequencing run, and will be exploited with the help of the Virome@tlas consortium. SRA-STAT references include k-mers for JSRV/ENTV and their endogenous counterparts, which will be used to identify, without a priori assumptions, in which tissues these ERVs might be expressed and which factors (organ, age, sex, genus, species, breed…) are associated with their expression.
-    Targeted approach: we will analyze RNA-Seq data available in public databases to measure the global expression of all ERV families in targeted samples that may play a role in the transmission of ERVs (genital organs, conceptus, semen, etc.), and compare the results among the different species.
-    Locus-specific expression of ERVs: ERVs are polymorphic, repeated elements and are consequently difficult to map (Lanciano and Cristofari, 2020). Available tools for allele-specific expression analysis from short- and long-read sequencing will be benchmarked to identify the active copies.
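
To give candidates a feel for the k-mer dictionary idea behind the blind approach, here is a minimal toy sketch in Python. It is not the SRA-STAT implementation: the sequences, the labels `exogenous_like` and `endogenous_like`, the k-mer length and the function names are all made up for illustration, and the "most hits wins" rule is an assumption chosen for simplicity.

```python
from collections import Counter

K = 5  # toy k-mer length chosen for illustration only (assumption)


def kmers(seq, k=K):
    """Return every overlapping k-mer of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_dictionary(references, k=K):
    """Map each k-mer found in exactly one reference to that reference's label."""
    owners = {}
    for label, seq in references.items():
        for kmer in set(kmers(seq, k)):
            owners.setdefault(kmer, set()).add(label)
    # k-mers shared by several references are ambiguous and are dropped.
    return {kmer: next(iter(labels))
            for kmer, labels in owners.items() if len(labels) == 1}


def classify_read(read, dictionary, k=K):
    """Assign a read to the label with the most k-mer hits, or None if nothing matches."""
    hits = Counter(dictionary[kmer] for kmer in kmers(read, k) if kmer in dictionary)
    return hits.most_common(1)[0][0] if hits else None


def composition(reads, dictionary, k=K):
    """Count how many reads of a run are assigned to each label."""
    return Counter(classify_read(read, dictionary, k) for read in reads)


# Made-up sequences, not real JSRV, ENTV or ERV data.
REFERENCES = {
    "exogenous_like": "ACGTTGCAAGCTTAGGCTA",
    "endogenous_like": "TTGACCGATAGGCATCCGA",
}
READS = ["ACGTTGCAAG", "CGATAGGCAT", "GGGGGGGGGG"]

DICTIONARY = build_dictionary(REFERENCES)
RUN_COMPOSITION = composition(READS, DICTIONARY)
print(RUN_COMPOSITION)
```

In this toy run, one read is assigned to each label and the third read stays unassigned (`None`). Per-run counts of this kind could then be related to sample metadata such as organ or breed.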

The student will develop expertise in handling high-throughput sequencing data and performing statistical analysis using R. He/She will become familiar with key computational methods commonly used in transcriptomics and gain a solid understanding of endogenous retroviruses (ERVs) and the analysis of gene expression data.
 

Application

Procedure: Please send a CV and a cover letter to: Jocelyn Turpin (IVPC, INRAE) jocelyn.turpin@univ-lyon1.fr; Emmanuelle Lerat (LBBE, CNRS) emmanuelle.lerat@univ-lyon1.fr; Vincent Navratil (PRABI, Univ Lyon 1) vincent.navratil@univ-lyon1.fr

Deadline: 5 July 2025

Contacts

Jocelyn Turpin; Emmanuelle Lerat; Vincent Navratil

 joNOSPAMcelyn.turpin@univ-lyon1.fr

Offer published on 5 November 2024, displayed until 5 July 2025