TrialSpark · New York, NY · Engineering · Health & Well-Being · Posted 1 month ago
Before new medical treatments can be administered to the public, they must demonstrate safety and efficacy in a clinical trial. These trials protect consumers from ineffective and dangerous products, but the clinical trial process also presents a tremendous bottleneck in delivering life-saving treatments to patients. A typical trial involves coordinating between numerous parties and data formats to gather, store, analyze, and audit clinical data. Mistakes and delays are common, and fewer than 10% of trials finish on time.
At TrialSpark, we are looking for talented software engineers to help us reimagine the clinical trial process from first principles and build the backbone technology platform.
This is a foundational software engineering role at TrialSpark. Platform Engineering is a small team of systems-oriented Engineers tasked with elevating our Product and Data Teams. We deliver and maintain secure, reliable infrastructure including Cloud components, CI/CD, Observability, complex ETL pipelines, and numerous libraries and frameworks essential to fullstack development.
We are looking for a strong engineering teammate with deep experience in data infrastructure to further our mission. In this role, you will collaborate with Engineering and Data Squads to understand infrastructure requirements and guide their evolution as our technical operations grow in both scale and complexity. You will develop domain expertise in healthcare data and clinical trials, and build clinical data infrastructure to enable fast and efficient decision-making. Ultimately, your work will define the quality and reliability of TrialSpark’s technology, and thus our ability to deliver new treatments to patients faster and more efficiently.
- Build, maintain, and evolve our data infrastructure and overall data architecture to accommodate increasingly complex data and use cases
- Partner with Data Analysts to assess the quality of our clinical data and implement targeted improvements with automated data cleansing and transformation
- Create tools to continuously monitor, test, and optimize our clinical data pipeline to ensure timely delivery and high quality
- Work with operational partners and product management to align business and product needs, particularly around high-quality data
- Partner with our Data team to maintain our data warehouse (Redshift) and scale as necessary
- Manage and evolve TrialSpark’s cloud ecosystem (AWS and Aptible) and CI/CD infrastructure (CircleCI)
- Maintain and promote observability across our systems (Datadog and Sumo Logic)
- Develop frameworks, APIs, and libraries to support and enable our fullstack developers (TypeScript/Python3) and data analysts (dbt/SQL, Python3, Jenkins/Groovy)
- Help enforce best practices and promote testability and maintainability throughout our systems and codebase
- If it can be automated, you will automate it
- Minimum 3 years of software development experience with at least 2 years in a data-heavy role
- Fluency in SQL and at least one other programming language (Python preferred)
- Comfortable with Linux, Docker, and cloud technologies (AWS)
- Strong knowledge of data modeling, pipeline scheduling and flows (e.g. Airflow), database design and architecture, ETL (e.g. dbt), OLAP
- Experience performance-tuning row-based (e.g. PostgreSQL) and columnar (e.g. Redshift) data stores
- Experience with infrastructure-as-code tools (Ansible, Terraform, etc.)
- Excellent problem-solving and debugging skills
- Exceptional communication skills with the ability to convey complicated systems to both technical and non-technical audiences
- B.S. in Computer Science or related field, or equivalent experience