Hey,
We’re Darren and Ronit of Kura. We just built the state of the art in browser agents and wanted to share what we learned.
First, here’s what an 87% accurate browser agent looks like:
https://www.youtube.com/watch?v=G62KRiW4ses
Browser agents are generalized agents that can perform a task on the internet using computer vision and by reading the DOM.
How does it work
TL;DR: our agent is a multi-agent system composed of a planner, an executor, and a critic that debate each action before it is taken. Each agent gets a different degree of vision and HTML DOM context.
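To make the planner/critic/executor debate concrete, here is a minimal sketch of what such a loop could look like. It is not Kura's implementation: the `Action` type, the role signatures, and the `max_rounds` / `max_steps` limits are all assumptions made up for illustration, and each role is just a plain callable so any model could sit behind it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str    # hypothetical action type, e.g. "click" or "type"
    target: str  # hypothetical description of the element to act on


# Hypothetical role signatures (assumptions, not Kura's API).
Planner = Callable[[str, List[Action], List[str]], Optional[Action]]
Critic = Callable[[str, Action], Optional[str]]  # objection text, or None to approve
Executor = Callable[[Action], None]


def run_task(
    task: str,
    planner: Planner,
    critic: Critic,
    executor: Executor,
    max_rounds: int = 3,
    max_steps: int = 20,
) -> List[Action]:
    """Debate every action before executing it; return the executed actions."""
    history: List[Action] = []
    for _ in range(max_steps):
        objections: List[str] = []
        approved: Optional[Action] = None
        for _ in range(max_rounds):
            proposal = planner(task, history, objections)
            if proposal is None:
                return history  # planner considers the task finished
            objection = critic(task, proposal)
            if objection is None:
                approved = proposal
                break
            objections.append(objection)  # feed the objection back to the planner
        if approved is None:
            break  # the debate did not converge within max_rounds
        executor(approved)
        history.append(approved)
    return history
```

In this sketch the critic never touches the page; it only vetoes proposals, and the planner sees every objection raised so far in the current step.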
For more information please refer to https://trykura.com/technical-deep-dive.
Why is it better
This debate architecture lets the agent self-heal and backtrack, so it succeeds at several tasks that other agents currently fail. It's also largely model-agnostic: a user can lower costs by up to 90% compared to Claude’s Computer Use by swapping in cheaper models, with only a slight impact on accuracy.
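As a toy illustration of what "backtrack" means here, the sketch below does a depth-first search over a made-up site map: when a page turns out to be a dead end, it returns to the previous page and tries the next option. The `find_path` function, its parameters, and the page names are hypothetical and only illustrate the general technique, not how Kura's agent works internally.

```python
from typing import Callable, Iterable, List, Optional, Set


def find_path(
    page: str,
    is_goal: Callable[[str], bool],
    next_pages: Callable[[str], Iterable[str]],
    visited: Optional[Set[str]] = None,
) -> Optional[List[str]]:
    """Return a list of pages leading from `page` to a goal page, or None."""
    visited = set() if visited is None else visited
    if is_goal(page):
        return [page]
    visited.add(page)
    for candidate in next_pages(page):
        if candidate in visited:
            continue  # never retry a page that was already explored
        rest = find_path(candidate, is_goal, next_pages, visited)
        if rest is not None:
            return [page] + rest
        # Dead end: fall through, which backs up to `page` and tries the next candidate.
    return None


# Hypothetical site map: "pricing" is a dead end, "login" leads to the goal.
site = {"home": ["pricing", "login"], "login": ["dashboard"]}
path = find_path("home", lambda p: p == "dashboard", lambda p: site.get(p, []))
```

Because `next_pages` and `is_goal` are plain callables, whatever proposes and checks candidates can be swapped out without changing the search itself, which is one simple way to keep this kind of loop model-agnostic.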
Why does this matter
We believe that, thanks in part to many of the companies here, there will be more agents than humans on the internet in 10 years. However, the internet today is built for humans: significant functionality is locked behind UIs that have no APIs. That’s where generalized browser agents come in.
Our browser agent converts any series of UIs into an API ready for consumption by the next generation of agentic companies.
Schedule a demo