Phospho 🧪 - Open source text analytics for LLM apps

Turn your LLM prototype into a product with testing, evaluation, monitoring, and guardrails at the semantic level

Paul-Louis Venard

a year ago

#robotics #artificial_intelligence

✨ Star us on GitHub and follow us on Twitter

TL;DR: Phospho is building open-source text analytics tools for LLM apps. We help companies turn their LLM prototype into a product with testing, evaluation, monitoring, and guardrails at the semantic level.

Hey everyone, we’re Paul-Louis and Pierre-Louis. phospho is on a mission to help you monitor, test, and improve your LLM app at the semantic level.

🚨Problem

Building LLM apps has never been easier. There are TONS of tools. Yet, companies that ship to production are scarce. And lots of AI tools that have made it to production have a HIGH churn rate and LOW usage rate. Why?

Unfortunately, many AI builders are trapped at ground zero:

They don’t know what to improve in their products, because there are so many ways to improve (and many yet to come!)
To make decisions, they rely either on KPIs that are irrelevant to their use case or on gut feelings from everyone but their users
Who their users are, what they do, and what they say are usually big unknowns

No wonder they feel stuck. Their best chance at improving is either guesswork or reading through thousands of logs.

🧩 Solution

There is no secret. Here is what the best companies shipping LLM products do that others don’t:

They release often, and fast… because they have a clear set of custom metrics based on their use case that act as a simple “green light/red light”
They improve on precise product issues… because they understand in great detail who uses their products and why
They act on feedback quickly… because they listen to it every day in their Slack channels or by email, and get alerted when something goes wrong

🧪 This is why we are building phospho. phospho is an open-source text analytics tool for LLM apps.

We are gathering all the tools your team needs to go from prototype to product at record pace: testing, evaluation, monitoring, and guardrails at the semantic level. Let’s dive in.

Build and Test

Define your own textual event detection pipeline
Set up webhooks and enforce guardrails
Assess quality before releasing with custom evals, and A/B test continuously
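
To make the idea of an event detection pipeline with a guardrail more concrete, here is a minimal sketch in plain Python. It is not phospho's API: every name in it (`Interaction`, `DETECTORS`, `detect_events`, `apply_guardrail`) is hypothetical, and the keyword checks are crude stand-ins for the semantic checks a real pipeline would run.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical names throughout: this illustrates the concept, not phospho's API.
@dataclass
class Interaction:
    user_input: str
    app_output: str


# A detector returns True when its event occurs in an interaction.
Detector = Callable[[Interaction], bool]


def mentions_refund(interaction: Interaction) -> bool:
    return "refund" in interaction.user_input.lower()


def leaks_email(interaction: Interaction) -> bool:
    # Crude stand-in for a semantic check; a real pipeline might call a model here.
    return "@" in interaction.app_output


# The pipeline is a mapping from event name to detector.
DETECTORS: Dict[str, Detector] = {
    "refund_request": mentions_refund,
    "email_leak": leaks_email,
}

# Events that should block the response before it reaches the user.
BLOCKING_EVENTS = {"email_leak"}


def detect_events(interaction: Interaction) -> List[str]:
    """Return the names of all events detected in the interaction."""
    return [name for name, detector in DETECTORS.items() if detector(interaction)]


def apply_guardrail(interaction: Interaction) -> str:
    """Return the app output, or a fallback message if a blocking event fired."""
    events = detect_events(interaction)
    if BLOCKING_EVENTS.intersection(events):
        return "Sorry, I can't share that."
    return interaction.app_output
```

Keeping detectors as plain functions in a dictionary means a team can add a new event without touching the guardrail logic; a webhook call could sit in the same place as the fallback message.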

Understand and Analyze

Detect usage patterns, categorize interactions by type, topics, intents, and more
Evaluate app response quality
Run tests at scale and in real-time

Improve and Take Action

Trigger workflows, escalations, and alerts based on detected events or evaluations
Dive deeper into the data; get consolidated reports through the platform or via the API
Break down the analysis by users or sessions
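
As a rough illustration of breaking an analysis down by user and turning it into an alert, the sketch below computes how often a given event shows up per user and flags users above a threshold. The record shape, the function names, and the `"frustration"` event are all assumptions made for the example, not part of phospho.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (user_id, session_id, detected event names).
Record = Tuple[str, str, List[str]]


def event_rate_by_user(records: Iterable[Record], event: str) -> Dict[str, float]:
    """Return, for each user, the share of their interactions containing the event."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for user_id, _session_id, events in records:
        totals[user_id] += 1
        if event in events:
            hits[user_id] += 1
    return {user: hits[user] / totals[user] for user in totals}


def users_to_alert(rates: Dict[str, float], threshold: float) -> List[str]:
    """Return the users whose event rate is at or above the threshold, sorted by name."""
    return sorted(user for user, rate in rates.items() if rate >= threshold)


if __name__ == "__main__":
    records = [
        ("alice", "s1", ["frustration"]),
        ("alice", "s2", []),
        ("bob", "s3", ["frustration"]),
        ("bob", "s3", ["frustration"]),
    ]
    rates = event_rate_by_user(records, "frustration")
    print(users_to_alert(rates, threshold=0.75))
```

Grouping by `session_id` instead of `user_id` would give the per-session view with the same logic.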

Integrations

Python and JavaScript SDKs to easily integrate into your LLM stack

Phospho can be self-hosted or used through our managed cloud version

💥 Our Ask

✨ Star us on GitHub and join our Discord
👀 If you’re curious, take our open-source package for a spin. We welcome contributions from everyone.
⚙️ If you run an LLM app, try our platform for free. Feel free to reach out; we'd love to see how we can be helpful.