Replicate

Run machine learning models in the cloud

Engineering Manager

$235K - $300K · San Francisco, CA, US
Job type
Full-time
Role
Engineering, Full stack
Experience
3+ years
Visa
US citizen/visa only

About the role

The models team keeps Replicate’s model library stocked with the latest generative AI models. We make popular models fast, reliable, and easy to use. We also add features — things people ask for and things they didn’t know they needed.

We’re hiring an engineering manager to help lead this team of six to eight engineers working at the edge of open-source AI and high-performance computing. You’ll support the team, shape the technical direction, and stay close to the code. The team focuses on three things:

  1. Turn research into APIs. We make it easy to package models with cog and run them on Replicate.
  2. Make models faster. CUDA, quantization, parallelism — we use whatever works to make models run faster and cheaper.
  3. Build new model features. This could mean training video models, adding inpainting or ControlNet-style conditioning, or inventing new ways to use models.

We build in the open. That means contributing upstream, releasing internal tools, and sharing what we learn.

What we’re looking for

You’re a strong leader. You bring energy and clarity. You help people do their best work. You like solving real problems and moving fast. If that sounds like you, we’d love to hear from you.

What you’ll do

  • Lead and grow a team that packages, optimizes, and improves generative models. Push model quality and performance every day.
  • Collaborate with other teams to improve performance, tooling, and usability across the platform. Represent and advocate for our customers as an internal user of our platform.
  • Bring momentum and clarity to projects: set goals, unblock the team, and keep things moving.
  • Work with company leadership to prioritize and align on strategy; help shape the technical roadmap — from ML tooling to infrastructure.
  • Give back to the Replicate community — contribute to open-source projects, and support the team in doing the same.

You should apply if…

  • You’ve helped teams grow and thrive, especially in fast-paced or startup environments. You mentor engineers and know how to scale a team.
  • You’ve worked in machine learning, data science, or adjacent fields. You know a bit about model optimization and performance.
  • You’re great at communication and project management. You bring order to ambiguity and keep the team focused.
  • You care about AI and want to build practical tools that help real people.
  • You want to make generative AI easier for developers and creators to use.
  • You’re part of the generative AI or open-source infrastructure community.

You’ll get to work on some of the most interesting problems in AI infrastructure — while contributing to the open-source communities that make this work possible.

About Replicate

What we're doing

Machine learning can now do some extraordinary things: it can understand the world, drive cars, write code, make art.

But it is still extremely hard to use. Research is typically published as a PDF, with scraps of code on GitHub and weights on Google Drive (if you’re lucky!). It is nearly impossible to take that work and apply it to a real-world problem unless you are an expert.

We’re making machine learning accessible to everyone. People creating machine learning models should be able to share them in a way that other people can use, and people who want to use machine learning should be able to do it without getting a PhD.

With great power also comes great responsibility. We believe that with better tools and safeguards, we will make this powerful technology safer and easier to understand.

How we work

We're a bunch of hackers, engineers, researchers, and artists.

We obsess about the details of API design and the right words for things. We're defining how AI works so we'd better get it right.

We make fast and reliable infrastructure. That's what a good infrastructure product is. We're not afraid to build things from scratch to make it the fastest.

We use AI for work. We use AI for play. We find unexplored parts of the map and create new techniques ourselves. We open-source it all.

We build in public, for the community. We want AI to work like open-source software so everyone benefits from it.

We're led by engineers. We all write code. (Or, we get ChatGPT to help.) There aren’t any meetings about meetings.

We've worked at places like Docker, Dropbox, GitHub, Heroku, NVIDIA, Scale AI, and Spotify. We've created technologies like Docker Compose and OpenAPI.

We're here to build a big company. We're ambitious and hard-working. We're not here to just build nice things.

Replicate
Founded: 2019
Batch: W20
Team Size: 36
Status: Active
Location: San Francisco
Founders
Ben Firshman
Founder
Andreas Jansson
Founder