Run machine learning models in the cloud
Replicate lets you run machine learning models in the cloud. We’re not just another AI company; we’re a team of developers, engineers, and innovators from organizations like Docker, Spotify, Dropbox, GitHub, Heroku, NVIDIA, and more. We’ve built foundational technologies like Docker Compose and OpenAPI, and now, we’re applying that expertise to make AI deployment as intuitive and reliable as web deployment.
The Models team at Replicate keeps our public model library up to date with the latest generative AI models. We make sure the popular models are fast, reliable, and easy to use, and we use and test them ourselves to understand each model’s unique features. We keep our ear to the ground, open up the black boxes of open-source models, and build the new models and pipelines that our users and the community want.
As a Machine Learning Engineer on the Models team, you’ll deploy, optimize, and customize image, audio, and video models, implement cutting-edge research, and develop tools that empower users to fine-tune open-source foundation models.
About you:
You’re a machine learning engineer who is an expert in image, audio, and video models: making them fast, making them customizable and controllable, and inventing new techniques.
You’re a strong software engineer with at least 5 years of full-time experience. You know the good tools and aren’t just using single-letter variable names.
You don’t need a PhD, but you do need to understand the math behind machine learning and be able to parse a research paper.
What you’ll be doing:
We have a huge library of models on Replicate. You’ll make sure they have all the latest features and are fast and reliable.
You’ll write training code so that Replicate users can train their own LoRAs to fine-tune open-source models to fit their needs (see the sketch after this list).
You’ll find the latest papers, turn them into useful products, and publish them on Replicate first. You might do some new research, too.
You’ll use cutting-edge techniques to enable users to fine-tune open-source foundation models.
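To make the LoRA training work above concrete, here is a minimal sketch of adapter-based fine-tuning using the Hugging Face PEFT library. The base checkpoint name, target modules, and hyperparameters are illustrative assumptions, not a Replicate-specific recipe.

```python
# Minimal LoRA fine-tuning setup (sketch): wrap an open-source base model with
# low-rank adapters so only a small set of extra weights is trained.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Any open-source base model works here; this checkpoint name is a placeholder.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (varies by architecture)
    lora_dropout=0.05,
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# ...train with your usual loop or Trainer, then save just the adapter weights:
# model.save_pretrained("my-lora-adapter")
```

Shipping only the small adapter alongside the frozen base model is what makes user-specific fine-tunes cheap to train and serve.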
These aren’t hard requirements, but we definitely want to talk with you if…
You’ve invented some new techniques and put them on GitHub.
You’re an expert at PyTorch, down to its internals, using torch.compile(), and so on (see the sketch after this list).
You know how to run a model on multiple GPUs with tensor parallelism.
You’re involved in the generative AI community and are in the right Discords.
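As a rough illustration of the torch.compile() point above, here is a minimal sketch of compiling an inference path in PyTorch 2.x. The model and shapes are placeholders; a real deployment would compile the actual generative model and tune the compile mode.

```python
# Minimal sketch: speeding up inference with torch.compile (PyTorch 2.x).
import torch

# Placeholder network standing in for a real image/audio/video model.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 1024),
).cuda().eval()

# "reduce-overhead" trades longer compile time for lower per-call latency.
compiled = torch.compile(model, mode="reduce-overhead")

with torch.inference_mode():
    x = torch.randn(8, 1024, device="cuda")
    y = compiled(x)  # first call triggers compilation; later calls reuse the compiled graph
```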
This role can be remote (anywhere in the United States) or in-person. We have a preference for timezones closer to PST. If possible, we like people to come into our San Francisco office at least 3 days a week.
Answer this question to be considered for this role:
Have you worked on any interesting generative AI projects? Tell us more (and share a link if you can).
Machine learning can now do some extraordinary things: it can understand the world, drive cars, write code, make art.
But it is still extremely hard to use. Research is typically published as a PDF, with scraps of code on GitHub and weights on Google Drive (if you’re lucky!). It is near-impossible to take that work and apply it to a real-world problem unless you’re an expert.
We’re making machine learning accessible to everyone. People creating machine learning models should be able to share them in a way that other people can use, and people who want to use machine learning should be able to do it without getting a PhD.
With great power also comes great responsibility. We believe that with better tools and safeguards, we will make this powerful technology safer and easier to understand.
We're a bunch of hackers, engineers, researchers, and artists.
We obsess over the details of API design and the right words for things. We’re defining how AI works, so we’d better get it right.
We make fast and reliable infrastructure. That's what a good infrastructure product is. We're not afraid to build things from scratch to make it the fastest.
We use AI for work. We use AI for play. We find unexplored parts of the map and create new techniques ourselves. We open-source it all.
We build in public, for the community. We want AI to work like open-source software so everyone benefits from it.
We're led by engineers. We all write code. (Or, we get ChatGPT to help.) There aren’t any meetings about meetings.
We've worked at places like Docker, Dropbox, GitHub, Heroku, NVIDIA, Scale AI, and Spotify. We've created technologies like Docker Compose and OpenAPI.
We're here to build a big company. We're ambitious and hard-working. We're not here to just build nice things.