Reprompt AI: Helping AI apps get to production quality 10x faster

AI teams use Reprompt to evaluate and improve their apps without code

Rob Balian

a year ago

tldr: Productionizing AI apps is hard. Reprompt lets teams evaluate and improve their apps without code and get to production quality 10x faster.

90% of Generative AI pilots never make it to production

Generative AI applications are notoriously hard to scale to production quality:

Hallucinations are hard to detect and fix
Function calling is a huge opportunity but still hit-or-miss
RAG pipelines aren’t as scalable as we hoped, and they often retrieve the wrong documents
AI responses can become a legal and compliance liability

Getting to production takes more than great prompt engineering

Behind the scenes of every great product (Uber, DoorDash, Waymo, Robinhood) there's an ops team with a dashboard full of if statements and data overrides that keeps the whole company from collapsing.

From what we’re seeing, AI apps will require similar tuning, using strategies like human-in-the-loop review, custom overrides, multi-queries, and prompt splitting/routing.
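
To make those strategies concrete, here is a minimal sketch of how three of them (custom overrides, prompt routing, and human-in-the-loop review) could fit together in application code. This is not Reprompt's API: every name in it (`Override`, `Result`, `answer`, the stub prompt functions, the 0.7 confidence threshold) is a hypothetical chosen for illustration, and the "model" is a stub that returns a canned answer with a made-up confidence score.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Override:
    """A hand-written rule that takes priority over the model (hypothetical)."""
    matches: Callable[[str], bool]
    response: str


@dataclass
class Result:
    text: str
    source: str  # "override", "model", or "human_review"


def answer(
    query: str,
    overrides: list[Override],
    routes: dict[str, Callable[[str], tuple[str, float]]],
    pick_route: Callable[[str], str],
    min_confidence: float = 0.7,  # assumed threshold, tune per app
) -> Result:
    # 1. Custom overrides: known edge cases get a fixed, reviewed answer.
    for rule in overrides:
        if rule.matches(query):
            return Result(rule.response, "override")

    # 2. Prompt routing: send the query to the prompt specialised for its topic.
    model_call = routes[pick_route(query)]
    text, confidence = model_call(query)

    # 3. Human-in-the-loop: low-confidence answers are flagged for review.
    if confidence < min_confidence:
        return Result(text, "human_review")
    return Result(text, "model")


# Stub "prompts" standing in for real model calls; each returns (text, confidence).
def billing_prompt(q: str) -> tuple[str, float]:
    return ("Billing answer for: " + q, 0.9)


def general_prompt(q: str) -> tuple[str, float]:
    return ("General answer for: " + q, 0.4)


overrides = [
    Override(lambda q: "refund" in q.lower(),
             "Refunds are handled by support; please open a ticket."),
]
routes = {"billing": billing_prompt, "general": general_prompt}


def pick_route(q: str) -> str:
    # Naive keyword router; a real app might use a classifier here.
    return "billing" if "invoice" in q.lower() else "general"


for q in ["Can I get a refund?", "Where is my invoice?", "Tell me a joke"]:
    print(q, "->", answer(q, overrides, routes, pick_route).source)
```

The point of the ordering is that overrides run first, so an edge case can be patched without touching the main prompts, and anything the model is unsure about is held for a person instead of being shipped to the user.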

Reprompt helps AI teams fix “last mile” AI issues so they can ship faster

Here’s what you can do with Reprompt:

Trace your AI calls across chat, RAG, and function calls to make debugging easy
Automatically track and highlight hallucinations and file them as bugs
Write custom prompt overrides to handle the edge cases without changing your main prompt or pushing code

About the team

Lukas Martinelli was an Engineering Director and GM of Search at Mapbox. He also led Mapbox's 120-person labeling and evaluation team.

Rob Balian led Robinhood's Growth team through COVID, GameStop, and the company's IPO. He was an early PM at Facebook.


If your team has challenges getting to production-level quality, we’d love to show you a demo. See you all in production!

-Rob and Lukas