Phospho 🧪 - Open-source text analytics for LLM apps
Turn your LLM prototype into a product with testing, evaluation, monitoring, and guardrails at the semantic level
✨ Star us on GitHub and follow us on Twitter
TL;DR: Phospho is building open-source text analytics tools for LLM Apps. We help companies turn their LLM prototype into a product with testing, evaluation, monitoring, and guardrails at the semantic level.
Hey everyone, we’re Paul-Louis and Pierre-Louis. phospho is on a mission to help you monitor, test, and improve your LLM app at the semantic level.
🚨Problem
Building LLM apps has never been easier. There are TONS of tools. Yet, companies that ship to production are scarce. And lots of AI tools that have made it to production have a HIGH churn rate and LOW usage rate. Why?
Unfortunately, many AI builders are trapped at ground zero:
- They don’t know what to improve in their products, because there are so many ways to improve (and many yet to come!)
- To make decisions, they rely either on KPIs that are irrelevant to their use case or on gut feelings from everyone but their users
- Who the users are, what they do, and what they say are usually big unknowns.
No wonder they feel stuck. Their best chance at improving is either guesswork or reading through thousands of logs.
🧩 Solution
There is no secret. Here is what the best companies shipping LLM products do that others don’t:
- They release often, and fast… because they have a clear set of custom metrics based on their use case that act as a simple “green light/red light”
- They improve on precise product issues… because they understand in great detail who uses their products and why
- They act on feedback quickly… because they listen to it every day in their Slack channels or via mail and get alerted when something is going wrong
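To make the "green light/red light" idea concrete, here is a minimal sketch of a release gate built on custom metrics. It is not phospho's API: `Metric`, `release_gate`, both scoring functions, and the thresholds are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Metric:
    """A hypothetical use-case metric with a pass threshold."""
    name: str
    score: Callable[[str, str], float]  # (user_input, app_output) -> score in [0, 1]
    threshold: float                    # minimum average score for a green light


def is_concise(user_input: str, app_output: str) -> float:
    # Illustrative rule: answers of at most 50 words count as concise.
    return 1.0 if len(app_output.split()) <= 50 else 0.0


def avoids_apology(user_input: str, app_output: str) -> float:
    # Illustrative rule: an apology is treated as a sign of an unhelpful answer.
    return 0.0 if "sorry" in app_output.lower() else 1.0


def release_gate(metrics: List[Metric], samples: List[Tuple[str, str]]) -> Dict[str, str]:
    """Average each metric over the samples and map it to "green" or "red"."""
    if not samples:
        raise ValueError("need at least one (input, output) sample")
    report = {}
    for metric in metrics:
        scores = [metric.score(user_input, output) for user_input, output in samples]
        average = sum(scores) / len(scores)
        report[metric.name] = "green" if average >= metric.threshold else "red"
    return report


METRICS = [
    Metric("concise", is_concise, threshold=0.9),
    Metric("no_apology", avoids_apology, threshold=0.9),
]
SAMPLES = [
    ("How do I reset my password?", "Go to Settings, then Security, then Reset password."),
    ("Can I export my data?", "Sorry, I cannot help with that."),
]
REPORT = release_gate(METRICS, SAMPLES)
print(REPORT)
```

In a real app the scoring functions would likely be semantic checks (for example an LLM-based evaluator) rather than string rules; the gate logic stays the same.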
🧪 This is why we are building phospho. phospho is an open-source text analytics tool for LLM apps.
We are gathering all the tools that enable your team to go from prototype to product at record pace: testing, evaluation, monitoring, and guardrails at the semantic level. Let’s dive in.
Build and Test
- Define your own textual event detection pipeline
- Set up webhooks and enforce guardrails
- Assess quality before each release with personalized evals, and A/B test continuously
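The first two bullets above can be sketched as a small event detection pipeline with a guardrail. This is an illustration under stated assumptions, not phospho's SDK: `DETECTORS`, `BLOCKING_EVENTS`, `detect_events`, and `guard` are hypothetical names, the regex detectors stand in for semantic detection, and `notify` stands in for a webhook call.

```python
import json
import re
from typing import Callable, Dict, List

Detector = Callable[[str], bool]

# Hypothetical detectors: simple regexes used as stand-ins for semantic detection.
DETECTORS: Dict[str, Detector] = {
    "asks_for_human": lambda text: bool(
        re.search(r"\b(human|agent|representative)\b", text, re.IGNORECASE)
    ),
    "shares_email": lambda text: bool(re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text)),
}

# Events that should block the message (the guardrail).
BLOCKING_EVENTS = {"shares_email"}


def detect_events(text: str) -> List[str]:
    """Return the names of all detectors that fire on the text."""
    return [name for name, detector in DETECTORS.items() if detector(text)]


def guard(text: str, notify: Callable[[str], None]) -> bool:
    """Notify once per detected event and return True if the text may pass."""
    events = detect_events(text)
    for event in events:
        # In production, notify would POST this JSON payload to a webhook URL.
        notify(json.dumps({"event": event, "text": text}))
    return not BLOCKING_EVENTS.intersection(events)


if __name__ == "__main__":
    outbox: List[str] = []
    allowed = guard("Can I talk to a human? My email is jane@example.com", outbox.append)
    print(allowed, len(outbox))
```

Passing `notify` as a callable keeps the sketch free of network calls and makes the guardrail easy to test.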
Understand and Analyze
- Detect usage patterns, categorize interactions by type, topics, intents, and more
- Evaluate app response quality
- Run tests at scale and in real-time
Improve and Take Action
- Trigger workflows, escalations, and alerts based on detected events or evaluations
- Dive deeper into the data; get consolidated reports through the platform or via the API
- Break down the analysis by users or sessions
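As a rough sketch of breaking analysis down by user or session and alerting on it, the snippet below counts detected events per group and flags groups above a limit. The `Interaction` record, the `breakdown` and `groups_to_alert` helpers, and the event names are all hypothetical; they do not describe phospho's data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Interaction(NamedTuple):
    """A hypothetical logged interaction with its detected events."""
    user_id: str
    session_id: str
    events: List[str]


def breakdown(interactions: Iterable[Interaction], key: str) -> Dict[str, Dict[str, int]]:
    """Count events per group, where key is "user_id" or "session_id"."""
    if key not in ("user_id", "session_id"):
        raise ValueError("key must be 'user_id' or 'session_id'")
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for interaction in interactions:
        group = getattr(interaction, key)
        for event in interaction.events:
            counts[group][event] += 1
    return {group: dict(events) for group, events in counts.items()}


def groups_to_alert(report: Dict[str, Dict[str, int]], event: str, limit: int) -> List[str]:
    """Return the groups where the event occurred at least `limit` times."""
    return sorted(g for g, events in report.items() if events.get(event, 0) >= limit)


LOGS = [
    Interaction("u1", "s1", ["frustration"]),
    Interaction("u1", "s2", ["frustration", "asks_for_human"]),
    Interaction("u2", "s3", ["thanks"]),
]
BY_USER = breakdown(LOGS, "user_id")
BY_SESSION = breakdown(LOGS, "session_id")
print(groups_to_alert(BY_USER, "frustration", limit=2))
```

Here the same logs yield an alert at the user level but none at the session level, which is the kind of difference a per-user versus per-session breakdown is meant to surface.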
Integrations
- Python and JavaScript SDKs to easily integrate into your LLM stack
- Phospho can be self-hosted or used in our managed cloud version
💥 Our Ask
- ✨ Star us on GitHub and join our Discord
- 👀 If you’re curious, take our open-source package for a spin. We welcome contributions from everyone.
- ⚙️ If you run an LLM app, try our platform for free. Feel free to reach out; we'd love to see how we can be helpful.