Track quality, cost and latency of your LLM apps
⭐ Star us on GitHub & follow us on Twitter
TLDR: Langfuse is building open-source product analytics (think ‘Mixpanel’) for LLM apps. We help companies track and analyze quality, cost and latency across product releases and use cases.
Hi everyone, we’re Max, Marc and Clemens. We were part of the Winter 23 batch and work on Langfuse, where we help teams make sense of how their LLM applications perform.
LLMs represent a new paradigm in software. Single LLM calls are probabilistic and add substantial latency and cost. Applications use LLMs in new ways via advanced prompting, embedding-based retrieval, chains, and agents with tools. Teams building production-grade LLM applications therefore have new product analytics and monitoring needs.
Langfuse derives actionable insights from production data. Our customers use Langfuse to answer questions such as: ‘How helpful are my LLM app’s outputs? What is my LLM API spend by customer? What do latencies look like across geographies and steps of LLM chains? Did the quality of the application improve in newer versions? What was the impact of switching from zero-shot GPT-4 to few-shot Llama calls?’
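To make the spend-by-customer question concrete, here is a back-of-the-envelope sketch of how per-customer API cost falls out of per-call token counts. This is plain Python, not the Langfuse SDK, and the per-1K-token prices and record shape are hypothetical placeholders:

```python
from collections import defaultdict

# Hypothetical (input, output) USD prices per 1K tokens; real prices
# vary by model and change over time.
PRICES = {"gpt-4": (0.03, 0.06), "llama-2-70b": (0.001, 0.001)}

def call_cost(model, prompt_tokens, completion_tokens):
    """Cost of a single LLM call from its token counts."""
    in_price, out_price = PRICES[model]
    return (prompt_tokens / 1000) * in_price + (completion_tokens / 1000) * out_price

def spend_by_customer(usage_records):
    """Aggregate spend from (customer_id, model, prompt_toks, completion_toks) records."""
    totals = defaultdict(float)
    for customer, model, prompt_toks, completion_toks in usage_records:
        totals[customer] += call_cost(model, prompt_toks, completion_toks)
    return dict(totals)

records = [
    ("acme", "gpt-4", 1200, 300),
    ("acme", "gpt-4", 800, 200),
    ("globex", "llama-2-70b", 5000, 1000),
]
print(spend_by_customer(records))
```

The same aggregation generalizes to latency or quality scores once each call is recorded with its metadata (customer, model, version, geography).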
Langfuse’s feature areas: metrics, insights, and integrations.
Langfuse can be self-hosted or used with a generous free tier in our managed cloud version.
Based on the ingested data, Langfuse helps developers debug complex LLM apps in production.
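Debugging a chain means attributing latency and failures to individual steps. A minimal sketch of that idea, a trace grouping the nested steps of one request, is below; this is an illustrative data model only, not the actual Langfuse SDK, and the step names are made up:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Span:
    """One timed step of a request; a trace is a Span with children."""
    name: str
    started: float = field(default_factory=time.perf_counter)
    ended: float = 0.0
    children: list = field(default_factory=list)

    def end(self):
        self.ended = time.perf_counter()

    def latency_ms(self):
        return (self.ended - self.started) * 1000

def traced_chain():
    trace = Span("answer-question")
    retrieval = Span("embedding-retrieval")
    time.sleep(0.01)              # stand-in for a vector-store lookup
    retrieval.end()
    trace.children.append(retrieval)
    llm = Span("llm-call")
    time.sleep(0.02)              # stand-in for the model call
    llm.end()
    trace.children.append(llm)
    trace.end()
    return trace

trace = traced_chain()
for span in trace.children:
    print(f"{span.name}: {span.latency_ms():.1f} ms")
```

Aggregated across production traffic, per-step timings like these are what let you see which link of a chain drives tail latency.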
Star us on GitHub + follow along on Twitter & LinkedIn.