Rootly is the AI-native on-call and incident management platform that helps teams detect, manage, learn from, and resolve incidents faster—without leaving Slack or MS Teams. It brings Incident Response, On-Call, AI SRE agents, Status Pages, and Retrospectives into one purpose-built, end-to-end platform.
We met at Instacart 🥕—where Quentin was the first SRE and JJ was leading product management efforts. As Instacart grew from a $1B to a $40B company, everything had to scale with it: infrastructure, teams, and the processes that keep the business running when things go wrong. With millions of orders flowing through the platform, incidents weren’t rare edge cases—they were an inevitable part of operating at that level. Checkout issues, platform outages, weird cascading failures at the worst possible times… we saw it all.
What surprised us wasn’t that incidents happened. It was how quickly the process broke down under pressure. Our manual way of responding—jumping between Slack, PagerDuty, Datadog, and a dozen dashboards—just wasn’t enough. Every incident started with the same scramble: spinning up channels, pulling people in, figuring out who’s running point, trying to keep comms coherent, and stitching together context from different tools and threads. It worked when the team was small and the system was simpler. At scale, it turned chaotic and inconsistent.
And the hardest part wasn’t even the debugging. It was the inconsistency. You’d rely on runbook checklists and tribal knowledge, hoping someone remembered what to do next—who to page, which workflow to follow, how to communicate internally and externally, what steps to capture for the retrospective. In the moment, that’s a lot to ask. Afterward, it meant retros were slow (or skipped), action items drifted, and the same classes of incidents kept coming back because learning didn’t keep pace with growth.
Rootly gives you an end-to-end incident lifecycle—from first page to resolution to post-incident learning—built where teams actually operate (Slack & MS Teams), and it’s all powered by AI.
https://www.youtube.com/watch?v=56oyjjeNzqY
Incident Response
Rootly helps you declare incidents, spin up the right Slack channels and roles, coordinate responders, track tasks, and keep communication structured—while automating the operational “busywork” that normally slows teams down.
On-Call
Rootly On-Call is built to notify the right person the first time, with scheduling and escalation that scale from a small team to a global org. It’s designed to be reliable and easy to use, with a beautiful mobile experience that makes getting paged somewhat enjoyable.
AI SRE Agents
Rootly’s AI SRE agents help teams move faster during incidents by connecting the dots across signals like recent code changes, telemetry, and past incidents. Instead of starting from scratch every time, responders can quickly narrow down probable causes, understand what changed, and get suggestions for fixes and next steps, along with the reasoning behind them.
Status Pages
Rootly makes it easy to keep customers and stakeholders updated with public and private status pages, so incident comms don’t become another manual process during a high-stress event.
Retrospectives
Rootly helps you turn incidents into learning. Create collaborative retros that teams can edit together, draft timelines and summaries faster, and track action items so the work actually gets done—not just written down.
AI everywhere responders actually feel it
During incidents, the biggest tax is cognitive load: listening, typing, summarizing, updating, and repeating yourself. Rootly uses AI to take that load off the responder.
A full platform, or just what you need
Use Rootly as a complete incident management platform—or adopt products individually based on where you are today (Incident Response, On-Call, AI SRE, Status Pages, Retrospectives).