About the role:
This position is primarily based in Paris, France. Ideally you are able to join the team in Nov/Dec 2024, or early Jan 2025. Please submit your CV in English.
At Owkin, we build high-performance diagnosis and prognosis models from digital histopathology whole-slide images for cancer patients. These models are at the core of our diagnostic products such as RlapsRisk® BC and MSIntuit® CRC. Our focus is on developing a series of models that identify, analyze and extract relevant histological structures within these slides. These structures include various cell types (e.g., normal cells, tumor cells, immune cells), tissue types (e.g., muscle, epidermis, fibrosis), and critical morphological features (e.g., lymphovascular invasion, necrosis, perineural invasion). We propose two different internships to improve our internal models.
Internship 1: Multi-task learning
The goal of this internship is to contribute to our ongoing research on cell-level predictions by developing innovative multi-task learning approaches. Specifically, you will work on creating models that can simultaneously perform cell segmentation, detection, and classification, while also predicting the overall tile type within the histopathology slides. This challenging task involves integrating multiple levels of analysis – from individual cell identification to broader tissue classification – into a unified model architecture.
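As a rough illustration of this shared-backbone idea, the minimal PyTorch sketch below shows a single encoder feeding both a pixel-level cell head and a tile-level head, trained with a weighted sum of the two task losses; a detection head would follow the same pattern and is omitted for brevity. All module names, sizes and loss weights are illustrative assumptions, not our actual architecture.

import torch
import torch.nn as nn

class MultiTaskTileModel(nn.Module):
    """Hypothetical multi-task model: shared encoder, cell-level and tile-level heads."""
    def __init__(self, n_cell_classes: int = 3, n_tile_classes: int = 5):
        super().__init__()
        # Shared convolutional encoder over the input tile.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Pixel-level head: per-pixel cell class map (segmentation + classification).
        self.cell_head = nn.Conv2d(64, n_cell_classes, kernel_size=1)
        # Tile-level head: one label for the whole tile (e.g., tissue type).
        self.tile_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_tile_classes)
        )

    def forward(self, x):
        features = self.encoder(x)
        return self.cell_head(features), self.tile_head(features)

model = MultiTaskTileModel()
tiles = torch.randn(2, 3, 128, 128)                # batch of RGB tiles
cell_targets = torch.randint(0, 3, (2, 128, 128))  # per-pixel cell labels
tile_targets = torch.randint(0, 5, (2,))           # one label per tile

cell_logits, tile_logits = model(tiles)
# Joint objective: weighted sum of per-task losses (the weighting is a free design choice).
loss = nn.functional.cross_entropy(cell_logits, cell_targets) \
    + 0.5 * nn.functional.cross_entropy(tile_logits, tile_targets)
loss.backward()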
Internship 2: Model distillation
Our current biomarker prediction workflow processes slides at the patch level, a more convenient input for histology feature extractors. These models, also known as Foundation Models (FMs), are pre-trained on massive amounts of unlabeled data using self-supervised learning (e.g., Phikon) and are responsible for extracting rich and compact information from tissue patches. The goal of this internship is to address two major challenges in the development of our FMs, which have a strong impact on the clinical validation and routine deployment of our diagnostic products:
- Inference optimization. State-of-the-art FMs in histology contain more than 1 billion parameters, which makes inference costly in routine deployment.
- Improved robustness. Despite their size, FMs are not consistently robust to the different scanners, staining protocols, and sample preparations specific to new centers.
To address these challenges, we would like to jointly investigate model distillation (condensing the knowledge of larger models into smaller, more efficient ones) and domain alignment (a technique that promotes robustness across different scanner types). Both approaches will be integrated into a self-supervised framework.
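As a rough illustration of the distillation side, the minimal PyTorch sketch below trains a small student encoder to reproduce the patch embeddings of a frozen, larger teacher; in this setup, domain alignment would simply add a further term to the same objective. The stand-in networks, dimensions and cosine loss are illustrative assumptions, not our actual FM setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

teacher_dim, student_dim = 1024, 256

# Stand-ins for the real networks: the teacher (e.g., a large FM) is frozen,
# only the student and its projection head are trained.
teacher = nn.Sequential(nn.Conv2d(3, teacher_dim, 8, stride=8),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten()).eval()
student = nn.Sequential(nn.Conv2d(3, student_dim, 8, stride=8),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
projector = nn.Linear(student_dim, teacher_dim)  # maps student features into teacher space

optimizer = torch.optim.AdamW(
    list(student.parameters()) + list(projector.parameters()), lr=1e-4
)

patches = torch.randn(8, 3, 224, 224)  # a batch of unlabeled tissue patches

with torch.no_grad():
    target = teacher(patches)          # teacher embeddings, no gradients
pred = projector(student(patches))     # student embeddings, projected

# Cosine distillation loss: push student embeddings toward the teacher's.
# A domain-alignment term (e.g., penalizing embedding-statistic shifts across
# scanners or centers) could be added to this objective.
loss = 1 - F.cosine_similarity(pred, target, dim=-1).mean()
loss.backward()
optimizer.step()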
For both internships, your work will directly contribute to improving Owkin’s diagnostic and internal pipelines to 1) discover new targets, 2) subtype patients more accurately, 3) match the right treatment for the right patient and 4) build more reliable diagnostic tools.
In particular, you will:
- Collaborate closely with, and receive mentorship from, the other members of the R&D team;
- Conduct primary research and numerical validation on your topic of study, including reusable software implementations;
- Contribute to regular research reviews;
- Report your detailed findings to the group.
About you
Required qualifications / experience:
- You are enrolled in a master's degree in mathematics, statistics, biomedical engineering, computer science or a related field
- Authorization to work legally in France
- Fluent in English (spoken and written)
- Proficient in Python and relevant libraries (NumPy, Pandas, PyTorch, TensorFlow)
- Strong understanding of deep learning concepts and algorithms
- Previous experience applying deep learning algorithms to real-world data
- Interest in medical imaging applications
- Strong communication skills
- Good team player
Optional Qualifications/Experience for Internship 2 on Model Distillation:
- Prior knowledge of self-supervised learning.
- Experience training large deep-learning models in a distributed environment.
- Familiarity with the SLURM workload manager.
Please submit your CV in English.