Data Engineer

Pelago Health

Software Engineering, Data Science
New York, NY, USA
USD 140k-160k / year + Equity
Posted on Aug 8, 2025
Pelago is the world’s leading virtual clinic for Substance Use Management. Our program provides guidance, support and treatment for members seeking to overcome their tobacco, alcohol and opioid use. From unhealthy habits to active substance use disorders, Pelago delivers a personalized solution based on individual health, habits, genetics, and goals, providing care for members wherever they might be on the substance use spectrum.
Pelago's suite of virtual services ranges from education to cognitive behavioral therapy (CBT) to comprehensive medication-assisted treatment (MAT). Pelago enables employers and health plans to deliver accessible, affordable, and effective treatment for substance misuse.
Pelago has scaled to helping hundreds of employers and health plans and has already helped more than 750,000 members manage their substance use better. We have recently closed our Series C and raised over $151m from leading global investors. If you are passionate about making an impact on the health of others, join us and make it happen!

Role Overview:

As we expand our AI capabilities, we’re seeking a Data Engineer who thrives at the intersection of traditional data infrastructure and next-gen AI workflows. This role is ideal for someone who is excited about structured data and pipelines and equally enthusiastic about LLMs, agents, vector search, and human-in-the-loop systems. You'll collaborate cross-functionally to enable intelligent, real-time workflows while ensuring core data infrastructure continues to meet analytics and operational needs.

This is a hybrid role with a high-collaboration rhythm (4 days/week in our NYC office).

In this role, you will...

Build and scale Pelago’s data infrastructure

  • Design, develop, and maintain production-grade ELT/ETL pipelines using modern tools like dbt, Airbyte, Census, and Airflow
  • Architect scalable, modular data systems leveraging Redshift or Snowflake to support analytics and operational use cases
  • Collaborate with analytics and product teams to deliver clean, governed, and high-impact data models
  • Ensure performance, reliability, and observability across batch and streaming workflows
  • Uphold best practices for security, compliance, and handling of regulated healthcare data, including PHI

Engineer intelligent, AI-driven workflows

  • Build and orchestrate LLM-based agents using frameworks like LangChain, LlamaIndex, or DSPy
  • Integrate pipelines with vector databases (e.g., Pinecone, Weaviate) to enable retrieval-augmented generation (RAG)
  • Develop lightweight APIs for inference using FastAPI or similar frameworks
  • Implement feedback loops, prompt evaluation, and observability tools (e.g., TruLens, Ragas) to improve AI system quality
  • Partner with ML engineers and platform teams to integrate models, embeddings, and other intelligent components

Collaborate cross-functionally to drive innovation

  • Work closely with product, clinical, and platform teams to define technical requirements and deliver impactful solutions
  • Document and share best practices to onboard teammates into LLM workflows and tools
  • Influence the roadmap for AI-augmented reporting, automation, and operational intelligence across the organization

The background we're looking for...

Minimum Qualifications:

  • 3+ years in a data engineering or related role
  • Strong SQL and Python programming skills
  • Experience with modern data stacks (dbt, Airflow, Redshift or Snowflake)
  • Familiarity with LLM frameworks (e.g., LangChain, LlamaIndex, DSPy)
  • Exposure to vector databases and retrieval-based pipelines
  • Effective communicator with experience working cross-functionally

Nice to Haves:

  • Experience with FastAPI, MLflow, or similar tooling
  • Background in handling regulated or healthcare data (e.g., PHI, HIPAA)
  • Knowledge of AI observability or prompt tuning frameworks (e.g., TruLens, Ragas)
  • Familiarity with event-driven or streaming architectures

What you’ll love about us…

We have a whole host of perks for our people! From life essentials to nice-to-haves, there are more than a few good reasons to love working with us. We strive to ensure Pelago employees have equitable access to healthcare, wellbeing, time away, and then some.

  • Generous and meaningful equity package
  • Full Medical, Dental, & Vision coverage
  • 401k Plan
  • Unlimited PTO Policy, 10 paid holidays, & company-wide “Me Time” Days
  • Paid maternity, paternity & new parent leave
  • Annual Learning and Development stipend to support continued learning and career development
  • Wellness Reimbursement Program
  • Access to Reproductive & Family Planning Care
  • Substance Use Support for employees and family members

At this time, we are unable to offer visa sponsorship for this position.

The provided range reflects our US target salary range for this full-time position, which is part of our broader total compensation package, including incentive bonus program, stock options, comprehensive benefits, and incentive pay applicable to eligible roles. Individual pay within the range will vary based on a variety of factors like role-related experience and education, internal pay equity, and other relevant business factors. At Pelago, we are committed to an equitable and fair pay philosophy and review total compensation for our employees at least twice a year.

Base Pay Range
$140,000 - $160,000 USD