🐉 Skyvern Cloud - Open source AI Agent to automate browser-based workflows

The easiest way to automate similar tasks on a lot of different websites

TL;DR

Skyvern helps companies automate browser-based workflows using AI. We provide a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable scripts. We’re open source — check out our repository here. We just launched Skyvern Cloud — check it out at app.skyvern.com.

—

đŸ§™â€â™‚ïžđŸ‰Â Skip the boring explanation, show me the magic!

We have good news! We just launched Skyvern Cloud. You can go to app.skyvern.com and try playing around with it! You’ll get $5 of credit when you sign up and add a payment method.

Here’s an example of Skyvern filling out a job application form on Lever’s demo jobs website.

🤔 What are browser workflows?

Most companies we’ve talked to have manual or semi-automated workflows that support their core product. Many of these workflows started out manual while the company was getting off the ground and doing things that don’t scale. Over time they either grew to occupy a larger amount of human capital or were automated with unreliable scripts.

Example 1 — Downloading invoices on a large number of websites

Let’s say it’s the end of the month, and your accountant is hounding you for invoices for all of this month’s transactions. Traditionally, you would have to log into each portal and download them yourself. Skyvern can help automate this for you! PS: If you’d like to automate a complex flow like this, we’d love to help get you set up.

Example 2 — Filling out forms + uploading documents on Government websites

Let’s say you’re an accounting company or a law firm, and you want to help your clients get set up for payroll in California. You wouldn’t want to register each potential customer with the EDD manually, let alone repeat that in every state! Call in a favor from Skyvern to help automate this.

Example 3 — Automating materials procurement from commerce websites

Let’s say you’re a B2B marketplace for car parts, and your users place orders with multiple vendors through you. Some of the vendors you work with don’t support ordering via an API, so you keep a human in the loop to place orders with those vendors. Here’s an example of Skyvern navigating to finditparts.com and ordering an airbag, all via an API call!

Example 4 — Completing dynamic multi-step workflows

Let’s say that you’re trying to automate complex workflows, such as generating insurance quotes. Skyvern is able to navigate each step until its specified goal is achieved — even if the steps change depending on the user’s situation. Here’s an example of Skyvern navigating to Geico.com and generating an auto insurance quote with your personalized information.

🤩 How does it work??

Skyvern uses LLMs to become an AI Agent capable of interacting with websites like you or I would — all via a simple API call.

Traditional approaches required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions, which would break whenever the website layouts changed.
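To make that brittleness concrete, here is a minimal Python sketch that uses only the standard library. The two page layouts, the `BRITTLE_PATH` selector, and both helper functions are made up for illustration. Matching on visible text is only a crude stand-in for reading a page the way a person would; it is not a description of how Skyvern works internally.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page: the second is a redesign
# that keeps the visible label but changes the surrounding structure.
LAYOUT_V1 = "<html><body><div><a id='dl'>Download invoice</a></div></body></html>"
LAYOUT_V2 = (
    "<html><body><main><section>"
    "<button>Download invoice</button>"
    "</section></main></body></html>"
)

# A structural selector written against LAYOUT_V1 only.
BRITTLE_PATH = "./body/div/a"


def find_by_path(markup, path):
    """Locate an element by its position in the tree (the scripted approach)."""
    root = ET.fromstring(markup)
    return root.find(path)


def find_by_visible_text(markup, text):
    """Locate an element by the label a user would see, ignoring structure."""
    wanted = text.strip().lower()
    root = ET.fromstring(markup)
    for element in root.iter():
        if (element.text or "").strip().lower() == wanted:
            return element
    return None


if __name__ == "__main__":
    # The path works on the layout it was written for...
    print(find_by_path(LAYOUT_V1, BRITTLE_PATH) is not None)
    # ...and silently finds nothing after the redesign.
    print(find_by_path(LAYOUT_V2, BRITTLE_PATH) is not None)
    # Looking for the visible label still succeeds on the new layout.
    print(find_by_visible_text(LAYOUT_V2, "Download invoice").tag)
```

The path-based lookup depends on every ancestor staying put, so a redesign that wraps the link in new containers is enough to break it even though the page looks the same to a user.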

Skyvern’s prompt → automation breaks these constraints and relies on multi-modal LLMs to parse items in the viewport and interact with them the way a human would.

This approach gives us a few advantages:

  1. Skyvern can take a single prompt (e.g., “Download an invoice from the order history page”) and repeat it across a large number of similar websites. This would traditionally require one script per website, which makes tackling the long tail of website interactions very challenging.
  2. Skyvern is able to operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow without any customized code
  3. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
  4. We’re able to circumvent or navigate through many bot detection methods, as many of them rely on looking for outlier behavior
  5. We rely on LLMs to reason through interactions to make sure we can cover complex situations. Examples include:
    • If you wanted to get an auto insurance quote from Geico, the answer to a common question, “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
    • If you were doing competitor analysis, it would understand that a 22 oz Arnold Palmer can at 7-Eleven is almost definitely the same product as a 23 oz can at Gopuff (the sizes are slightly different, which could be a rounding error!)
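
The first advantage above, one prompt reused across many sites, can be sketched roughly as follows. This is an illustration only: the portal URLs are placeholders, and the payload field names (`url`, `prompt`) are assumptions rather than Skyvern's documented request schema, so check the actual API reference before sending anything.

```python
import json

# Hypothetical vendor portals; swap in the sites you actually use.
PORTAL_URLS = [
    "https://vendor-a.example.com/orders",
    "https://vendor-b.example.com/account/history",
    "https://vendor-c.example.com/billing",
]

# The one instruction shared by every site.
PROMPT = "Download an invoice from the order history page"


def build_task_payloads(urls, prompt):
    """Pair one shared prompt with every target URL.

    The keys "url" and "prompt" are illustrative assumptions,
    not a confirmed request format.
    """
    return [{"url": url, "prompt": prompt} for url in urls]


payloads = build_task_payloads(PORTAL_URLS, PROMPT)

if __name__ == "__main__":
    # Print one JSON body per site; each would be submitted as its own task.
    for payload in payloads:
        print(json.dumps(payload))
```

The point of the sketch is the shape of the work: the per-site part shrinks to a URL, while the instruction stays the same, instead of one hand-written script per website.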

😼 And you’re open source?

Yes! You can check out our code here. We’re open source for two big reasons:

  1. It lets developers look at, understand, and dive deep into Skyvern’s implementation details to (1) expand its capabilities by adding support for new functionality and (2) decode why it’s doing what it’s doing.
  2. It allows security-minded enterprises to escape “security theater” and keep data on-prem by self-hosting Skyvern.
    1. If this is at all interesting to you, let’s chat.

Our Ask

Do you have any complex workflows that you’d love to automate? We’d love to chat! Shoot me an email at suchintan@skyvern.com or grab some time via my calendar.