Job Description
AI Research Scientist, Foundation Models
Our Artificial Intelligence and Machine Learning (AI/ML) capabilities are critical accelerators to our mission of inventing new medicines that save and improve lives. Core to the Data, AI, and Genome Sciences (DAGS) function is an AI/ML-first approach to improving target and biomarker discovery, validation, and selection, and to elucidating complex disease mechanisms. As a senior AI scientist, you will be responsible for pre-training and fine-tuning biological foundation models, analyzing pre-trained models post hoc, building rigorous benchmarks for evaluating foundation models, and serving foundation models trained in-house. Your work will advance our understanding of complex diseases and support the development of innovative therapeutic strategies. You will be part of a cross-functional team of computational biologists, bioinformaticians, data scientists, software engineers, and machine learning engineers who strive to identify therapeutic targets.
Primary Responsibilities:
- Collaborate with cross-functional teams to identify research questions and data requirements and develop appropriate solutions.
- Develop and train transformer-based foundation models (and related state-space models) for -omics data.
- Interpret pre-trained models and analyze them post hoc.
- Rigorously benchmark and evaluate the performance of both in-house and publicly available models.
- Host and serve in-house models and make them accessible to scientists across our Company.
- Stay up to date with the latest advancements in machine learning and statistics and apply relevant advancements to improve existing methodologies and models.
- Publish research findings in relevant conferences and journals and actively contribute to the scientific community through knowledge sharing and collaborations.
Required Education, Experience and Skills:
- PhD, MS, or BS in Computer Science, Statistics, Physics, or a related field and 0-3+ years of full-time experience (with PhD), 4+ years of experience (with MS), or 7+ years of experience (with BS).
- Expertise in machine learning and in training, evaluating, and debugging models and data at scale.
- Excellent software design and development skills and strong proficiency in Python.
- Experience with standard deep learning frameworks such as PyTorch and with the Hugging Face ecosystem for working with transformer-based foundation models.
- Excellent communication skills and ability to work collaboratively in a multi-disciplinary team.
- Interest in life sciences problems and disease biology, and willingness to learn from and teach others.
Preferred Skills and Experience:
- Demonstrated experience working with models that require multiple GPUs for training and inference.
- Relevant publications in scientific journals and experience contributing to research communities such as NeurIPS, ICML, and ICLR.
- Experience with pre-training and multi-modal training of (biological or otherwise) foundation models is a strong plus.
- Familiarity with biological data and previous experience with protein language models and foundation models for -omics data is a strong plus.
Employee Status: Regular
Relocation: Domestic/International
VISA Sponsorship: Yes
Travel Requirements: 10%
Flexible Work Arrangements: Hybrid
Shift: Not Indicated
Valid Driving License: No
Hazardous Material(s): n/a
Requisition ID: R318354