Senior AI/ML Engineer 1, Computational Biology
Ginkgo Bioworks · Remote (USA)
Our mission is to make biology easier to engineer. Ginkgo is constructing, editing, and redesigning the living world in order to answer the globe’s growing challenges in health, energy, food, materials, and more. Our bioengineers make use of an in-house automated foundry for designing and building new organisms.
Ginkgo believes that if we are to grow a thriving, sustainable bioeconomy, we must also grow a new market in biosecurity. Our biosecurity and public health initiative, Ginkgo Biosecurity, launched a nationwide emergency response to the COVID-19 pandemic, providing end-to-end pathogen monitoring services to schools, communities, and travelers. As we continue to scale, our work is also evolving into new and exciting directions, from global expansion to the integration of new technologies and capabilities, including our Traveler-Based Genomic Surveillance Plan with the CDC.
The convergence of AI and foundation models that can read/write DNA as well as biological design tools is a complex and rapidly changing field. We know that it requires deep technical understanding and diverse perspectives to ensure we maximize the realization of scientific advancements and minimize misuse. Ginkgo Biosecurity is at the forefront of monitoring the world’s biology for potential threats, including those enhanced and enabled by AI. Our team has deployed a global genomic surveillance platform that generates data-driven insights to support decision-making at the highest levels of public health and national security around the world.
Ginkgo recently announced a large partnership with Google Cloud to build a generative AI platform for engineering biology and for biosecurity. In collaboration with Ginkgo’s Design and AI teams, we are delivering best-in-class foundation and application-specific models for design, analysis, and functional prediction, and to support biosecurity.
Role Introduction:
As a Senior Machine Learning / Artificial Intelligence (ML/AI) Engineer on Ginkgo’s Biosecurity Team, you will leverage Ginkgo’s wealth of proprietary sequence and experimental data to design, train, and experimentally validate novel foundation and application-specific (e.g. fine-tuned) AI models for application to classification and design of genes, regulatory elements, multi-gene pathways, and even genomes. In addition, you will collaborate with experts throughout the organization to identify transformational opportunities for application of AI in genetic design and engineering, bioinformatics, and prediction/forecasting using a diverse array of multimodal data.
We have access to nearly limitless compute capacity: CPU, GPU, or TPU. In addition, you will have access to Ginkgo's experimental platform and resulting large datasets for generating and testing hypotheses using machine learning approaches as well as informing strategic training data acquisition.
The successful candidate will bring practical experience in modern AI/ML methods, creativity in solving problems at the intersections of scientific domains, a collaborative mindset, and enthusiasm for professional growth. We are looking for someone who is as comfortable dreaming big as rolling up their sleeves and digging through the weeds.
Responsibilities
- Foundation Model (FM) development: Conceive, develop, and validate best-in-class foundation models for DNA and/or RNA, leveraging Ginkgo’s large and diverse proprietary sequencing datasets.
- Application-specific model development: Conceive, develop, and validate purpose-built models (e.g. fine-tuned) for a range of DNA and RNA design and prediction applications. Analyze multimodal biological data, develop tools to detect anomalies, and predict potential biological threats.
- Biosecurity: Identify and evaluate areas where AI tools could be misused or developed irresponsibly in ways that pose risks to biosecurity. Develop frameworks to evaluate and reduce risk that might arise from foundation models, biological design tools, and other emergent AI tools.
- Collaborate in cross-functional teams to design and experiment with model architectures, data models, and data representations.
- Influence strategic dataset acquisition: Partner with world-class experimentalists and hundreds of robots to conceive and design experiments to collect high-value training data at unprecedented scale. Influence how routine experiments are performed to maximize future learning potential.
- Maintain high-quality documentation of your work and discoveries, creating written reports, technical presentations for internal or external audiences, electronic lab notebooks, internal database records, code comments, and software documentation.
- Take part in something big: This is a growing team, a significant company focus, and a rapidly evolving field. You will be able to influence where things go and how they change.
Minimum Requirements
- PhD graduate with 1+ years of relevant post-academic experience in applying AI/ML (or BS with 7+ years or MS with 5+ years of professional experience)
- Hands-on, proven experience in developing deep neural networks. Deep knowledge of currently available AI model architectures and data schemes. Perspective on advantages and drawbacks of various approaches. Direct experience applying AI/ML to problems in modeling of RNA or DNA preferred
- Subject matter expertise in genetics, genomics, transcription, translation, or RNA biology
- Familiarity with recent literature and state of the art for large model architectures and training approaches
- Deep knowledge of AI/ML literature as applied to DNA or RNA
- Deep knowledge of biological design tools
- Proficiency with at least one programming language (Python preferred)
- Experience with building machine/deep learning models with at least one common framework such as PyTorch, TensorFlow, or JAX
- Proficiency with best practices for collaborative software development (version control systems, test-driven development, and good documentation habits)
- Strong communication skills, including the ability to speak the technical jargon of AI, biology, and software engineering, and present to internal customers from a variety of disciplines
- Ability to independently set priorities and advance multiple concurrent, collaborative projects
- People management or project management experience
- Ability to thrive and stay calm in a fast-paced, ever-changing environment
Preferred Capabilities and Experience
- PhD in artificial intelligence, computer science, synthetic biology, computational biology, genomics, bioinformatics, quantitative biology, chemical engineering, or another related field. Interdisciplinary work is strongly preferred.
- Experience developing deep neural networks. Deep knowledge of currently available AI model architectures and data schemes. Perspective on advantages and drawbacks of various approaches
- Broad knowledge of state-of-the-art machine learning approaches to biological sequence analysis
- Significant hands-on experience in using software libraries such as TensorFlow, PyTorch, JAX, and Keras for model construction
- Exposure to ML and data orchestration and workflow engines like Airflow, Kubeflow, Flyte, or Dagster
- Expertise in best practices for software development, including version control, code reviews, unit testing, and continuous integration. Experience in ML model management (MLOps) is a plus
- Proven track record of delivering in cross-functional teams and in project management. Excellence in scientific communication
- Enthusiasm to learn new techniques. Strong curiosity about areas of biology previously unknown to you
Total compensation for this role is market driven, with a starting salary of $130k+, as well as company stock awards. Base pay is ultimately determined based on a candidate's skills, expertise, and experience. We also offer a comprehensive benefits package including medical, dental & vision coverage, health spending accounts, voluntary benefits, leave of absence policies, Employee Assistance Program, 401(k) program with employer contribution, 8 paid holidays in addition to a full-week winter shutdown, and an unlimited Paid Time Off policy.
- What is it really like to take your company public via a SPAC? One Boston biotech shares its journey (Fortune)
- Ginkgo Bioworks resizes the definition of going big in biotech, raising $2.5B in a record SPAC deal that weighs in with a whopping $15B-plus valuation (Endpoints News)
- Ginkgo Bioworks CEO on scaling up Covid-19 testing: ‘If we try, we can win’ (CNBC)
- Ginkgo raises $70 million to ramp up COVID-19 testing for employers, universities (Boston Globe)
- Ginkgo Bioworks Redirects Its Biotech Platform to Coronavirus (Wall Street Journal)
- Ginkgo Bioworks Provides Support on Process Optimization to Moderna for COVID-19 Response (PRNewswire)
- The Life Factory: Synthetic Organisms From This $1.4 Billion Startup Will Revolutionize Manufacturing (Forbes)
- Synthetic Bio Pioneer Ginkgo Raises $290 Million in New Funding (Bloomberg)
- Ginkgo Bioworks raises $350 million fund for biotech spinouts (Reuters)
- Can This Company Convince You to Love GMOs? (The Atlantic)