Keywords
mathematical modeling
machine learning
molecular modeling
single-stranded oligonucleotides
Description
As part of the Instituts & Initiatives program of Sorbonne Université, we are looking for a suitable candidate to be interviewed by SCAI for a doctoral contract on the interdisciplinary thesis project "Mathematical and molecular modeling for single-stranded nucleic acids design".
Single-stranded nucleic acids (ssNAs) play an important role in keeping cells thriving, since they are involved in structural, functional and regulatory processes and in protein synthesis. In addition, synthetic ssNAs can be exploited as biosensors or as therapeutic and diagnostic tools. Indeed, thanks to the specific conformations they can adopt, they can recognize a wide range of molecular targets, from small molecules to whole cells.
The function of ssNAs depends strongly on their secondary structure (i.e. their base-pairing pattern) and their tertiary structure (i.e. their 3D organization). Therefore, when designing an ssNA able to mimic a natural one or to bind a given molecular target, it is essential to consider the fold the ssNA should adopt. In addition, ssNAs are characterized by a high level of structural flexibility, which is relevant to their function since it allows them to increase their interaction with the molecular target.
Because of the relevance of ssNA structures, much effort has been devoted to predicting the 3D folding of this kind of molecule. However, so far, none of the available tools, including AlphaFold3 (AF3), accounts for the intrinsic flexibility of ssNAs, since only a single or a few conformations are provided as output.
In addition, the rational design of ssNAs requires not only predicting the most probable conformations for a given sequence (i.e. solving the folding problem), but also retrieving all the sequences that have the desired conformation among their most probable ones (i.e. solving the inverse folding problem).
Therefore, in this PhD thesis project we want to address the inverse folding problem by focusing directly on the two levels of ssNA structure: secondary and tertiary. The PhD candidate will tune a new machine learning model for the generation of ssNA sequences satisfying a desired secondary structure. Subsequently, the retrieved sequences will be modelled in 3D, and their conformational space will be explored by means of enhanced sampling molecular dynamics techniques.
This PhD project is part of a research project aimed at developing new diagnostic tools for Lyme disease; the developed workflow will therefore be applied to design ssNAs targeting a surface protein of the bacterium causing the disease.
For further information about the project, please refer to https://www.sorbonne-universite.fr/sites/default/files/media/2025-03/GAYRAUD%20Ghislaine_ED071.pdf
The PhD candidate should have a bioinformatics / biostatistics background. Expertise in molecular modeling is a plus. Knowledge of machine learning models is strongly recommended. Programming skills are required.
The thesis will start in October 2025 at the University of Technology of Compiègne (UTC), co-supervised by Prof. Ghislaine Gayraud and Dr Miraine Dávila Felipe from the Laboratory of Applied Mathematics of Compiègne (LMAC) and Dr Irene Maffucci from the Enzymatic and Cellular Engineering Laboratory (GEC).
Application
Procedure: Send an email to Irene Maffucci (irene.maffucci@utc.fr), Ghislaine Gayraud (ghislaine.gayraud@utc.fr) and Miraine Dávila Felipe (miraine.davila-felipe@utc.fr) with a CV, a cover letter, at least one reference contact, and master's degree transcripts.
Deadline: 20 April 2025
Contacts
Irene Maffucci
irene.maffucci@utc.fr
Ghislaine Gayraud
ghislaine.gayraud@utc.fr