Evaluation of learning methods on Semantic Knowledge Graph for finding personalized actionable drugs

Internship · M2 internship · 6 months · Bac+5 / Master's · Computational Systems Biomedicine - Institut Pasteur · Paris (France)

Start date: 6 January 2025

Keywords

knowledge graph, graph-based learning algorithm, systems biology, ovarian cancer

Description

Project description

Finding personalized treatments for High Grade Serous Ovarian Carcinoma (HGSOC) patients is a very challenging task. It requires patient data (clinical data, mutational profile, gene expression) along with biological knowledge concerning drugs, actionable variants, and the proteins or biological pathways targeted by drugs. However, the large amounts of data and knowledge involved require scalable approaches to provide actionable information that supports clinicians in decision-making [1, 2].

To propose personalized actionable drugs, our group is developing a semantic knowledge graph (SKG) database to integrate all this heterogeneous data (https://github.com/oncodash/oncodashkb). This type of database represents medical and biological data in the form of objects and relationships [3, 4]. With this type of structure, it is possible to connect HGSOC patients to potential drugs in an explainable directed graph that links previously unconnected information. Next, we aim to apply AI algorithms to the graph-structured data to extract paths of interest and to suggest drugs to clinicians that they might not otherwise have considered [5, 6].
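To make the idea of "paths of interest" concrete, here is a minimal sketch of how a directed graph of objects and relationships can link a patient to candidate drugs. It does not reflect the actual OncodashKB schema or tooling: the triples, relation names, the `drug_` prefix convention, and the function names are all hypothetical placeholders chosen for illustration.

```python
from collections import deque

# Hypothetical toy graph as (source, relation, target) triples.
# All identifiers are placeholders, not real patients, genes, or drugs.
TRIPLES = [
    ("patient_1", "has_variant", "variant_A"),
    ("variant_A", "affects_gene", "gene_X"),
    ("gene_X", "targeted_by", "drug_Y"),
    ("gene_X", "member_of", "pathway_P"),
    ("pathway_P", "targeted_by", "drug_Z"),
    ("patient_2", "has_variant", "variant_B"),
]


def build_adjacency(triples):
    """Index outgoing edges by source node."""
    adjacency = {}
    for source, relation, target in triples:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency


def paths_to_drugs(triples, patient, max_hops=4):
    """Breadth-first search for directed paths from a patient to drug nodes.

    A path is a list alternating nodes and relations:
    [node, relation, node, relation, node, ...].
    Drug nodes are recognised by the (assumed) "drug_" prefix.
    """
    adjacency = build_adjacency(triples)
    found = []
    queue = deque([[patient]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node.startswith("drug_"):
            found.append(path)
            continue
        if len(path) // 2 >= max_hops:  # number of edges already walked
            continue
        for relation, target in adjacency.get(node, []):
            if target in path[::2]:  # skip nodes already on this path
                continue
            queue.append(path + [relation, target])
    return found


if __name__ == "__main__":
    for path in paths_to_drugs(TRIPLES, "patient_1"):
        print(" -> ".join(path))
```

Each returned path is itself the explanation for a suggestion, since it lists every intermediate object and relationship between the patient and the drug. In a graph database such as Neo4j, the same traversal would typically be written as a query rather than hand-coded.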

This project is ideal for students interested in the intersection of systems biology, programming, and graph-based learning algorithms. It provides hands-on experience working with knowledge graphs, a rapidly evolving technology, and evaluating various machine learning methods on graph structures.

Internship objectives

  • Integrate new data sources from cancer databases into the SKG database.

  • Work with researchers to evaluate different learning algorithms applied to the SKG and assess how much the integration of these different data sources contributes to prediction.

Profile

  • Currently enrolled in a master's program in Computer Science, Statistics, Computational Biology, or a related field.

  • Programming skills in Python (Cypher is a plus).

  • Knowledge of machine learning and deep learning and their frameworks (PyTorch).

  • Proficient in working in a Unix environment.

  • Knowledge of graph databases, such as Neo4j, is a plus.

Context

The European DECIDER research project (http://deciderproject.eu) aims to develop diagnostic tools and treatments for HGSOC patients with the help of AI methods, in particular to identify earlier those patients who do not respond well to first-line treatments and to find effective complementary treatments. The DECIDER project has been active for more than three years and has produced several datasets for hundreds of ovarian cancer patients, which are being integrated into a common database. The data sources integrated during this internship are key ingredients for the Oncodash decision support system, which will integrate genomics and clinical data and allow AI model-based inference to support clinical decision-making for HGSOC patients.

Start Date

Flexible – internships can begin anytime in January or February 2025, depending on the candidate's availability.

References

[1] Reisle, C. et al. A platform for oncogenomic reporting and interpretation. Nat Commun 13, 756 (2022). https://doi.org/10.1038/s41467-022-28348-y
[2] Dreo, J., Lobentanzer, S., Gaydukova, E., Baric, M., Maarala, IA. et al. High-level Biomedical Data Integration in a Semantic Knowledge Graph with OncodashKB for finding Personalized Actionable Drugs in Ovarian Cancer. Cancer Genomics, Multiomics and Computational Biology, European Association for Cancer Research (2024), Bergamo, Italy. hal-04509599
[3] Lobentanzer, S., Aloy, P., Baumbach, J. et al. Democratizing knowledge representation with BioCypher. Nat Biotechnol 41, 1056–1059 (2023). https://doi.org/10.1038/s41587-023-01848-y
[4] Dreo, J., Laudy, C., Lobentanzer, S., Baric, M., Gaydukova, E. and Schwikowski, B. Reproducible Mapping of Tabular Data into Semantic Knowledge Graphs with OntoWeaver and BioCypher. 27th International Conference on Information Fusion (FUSION) (2024), Venice, Italy, pp. 1-8. doi: 10.23919/FUSION59988.2024.10706497
[5] Brouard, C., Mourad, R., Vialaneix, N. Should we really use graph neural networks for transcriptomic prediction? Briefings in Bioinformatics, Volume 25, Issue 2 (2024). https://doi.org/10.1093/bib/bbae027
[6] Esser-Skala, W., Fortelny, N. Reliable interpretability of biology-inspired deep neural networks. npj Systems Biology and Applications 9, 50 (2023). https://doi.org/10.1038/s41540-023-00310-8

Application

Procedure: Please send your CV and motivation letter to johann.dreo@pasteur.fr and matthieu.najm@pasteur.fr.

Deadline: 20 December 2024

Contacts

Matthieu Najm

 matthieu.najm@pasteur.fr

Offer published on 19 November 2024, displayed until 6 January 2025