/extract by Firecrawl - Get structured website data with just a prompt

Turn entire websites into structured data with AI

Eric Ciarla

2 months ago

https://www.firecrawl.dev

#ai #open_source #developer_tools

Hey everyone! We’re Eric, Caleb, and Nick from Firecrawl (S22). Today, we’re launching /extract — an endpoint that turns entire websites into structured data with a prompt.

TL;DR

With Firecrawl's new /extract endpoint, any website can be turned into structured data with a simple API call and a prompt. We handle the complexity so you can focus on building your company.

The Problem

If you need to pull data from websites - maybe to enrich your CRM, track competitors, or onboard users - you're stuck with:

Manually researching and copy-pasting from multiple sources
Building and maintaining scrapers that break at the slightest site change
Stitching together scraping services and complex LLM pipelines with limited context windows

Each approach wastes the engineering time you could spend shipping a product.

Our Solution

/extract is an API that turns a prompt into structured web data.

Here's how to use it:

Give us URLs + Prompt
Write what data you want, and point us at websites. Use wildcards like example.com/* to scan entire sites.
We Find Relevant Content
Our crawler finds and ranks the pages that matter, automatically.
AI Extracts Data
Intelligent agents split, search and parallelize the work, handling sites of any size.
Get Clean JSON
Receive structured data ready to use - no post-processing needed.
Integrate anywhere via API
With our API, you can use Firecrawl anywhere, whether it's in your applications or in no-code tools like Zapier.
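
To make the steps above concrete, here is a minimal Python sketch of the "URLs + Prompt" call using only the standard library. Treat it as an illustration, not the official client: the endpoint URL, the `urls` and `prompt` field names, the bearer-token header, and the `FIRECRAWL_API_KEY` environment variable are assumptions made for this example, so check the docs for the exact request shape.

```python
import json
import os
import urllib.request

# Assumption: the endpoint URL and field names below are illustrative.
EXTRACT_URL = "https://api.firecrawl.dev/v1/extract"


def build_extract_payload(urls, prompt):
    """Bundle the target URLs (wildcards allowed) with a plain-English prompt."""
    if not urls:
        raise ValueError("at least one URL is required")
    return {"urls": list(urls), "prompt": prompt}


def extract(urls, prompt, api_key):
    """POST the payload and return the decoded JSON response."""
    body = json.dumps(build_extract_payload(urls, prompt)).encode("utf-8")
    request = urllib.request.Request(
        EXTRACT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical environment variable name; nothing is sent if it is unset.
    api_key = os.environ.get("FIRECRAWL_API_KEY")
    if api_key:
        result = extract(
            ["example.com/*"],
            "List every pricing plan and its monthly price.",
            api_key,
        )
        print(json.dumps(result, indent=2))
```

Keeping the payload builder separate from the network call makes it easy to check what will be sent before spending any tokens.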

Why It Works

Handle Any Website: Built on proven scraping infrastructure that just works
Natural Language Input: Describe what you want in plain English - we figure out the schema
No Size Limits: Process massive sites by automatically splitting the work
Use It Anywhere: Full API + ready-made integrations for Python, Node, and Zapier

Limitations (and the road ahead)

Let's be honest - while /extract is pretty awesome at grabbing web data, it's not perfect yet. Here's what we're still working on:

Big sites are tricky - It can't (yet!) grab every single product on Amazon in one go
Complex searches need work - Things like "find all posts posted after 2024" aren't quite there
Sometimes, it's a bit quirky - Results can vary between runs, though it usually gets what you need
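
Because results can vary between runs, one defensive pattern is to check each result for the fields you need and retry a few times before giving up. The sketch below is our own illustration, not a Firecrawl feature: `missing_fields`, `first_complete`, and the `run_extraction` callable are hypothetical names, and `run_extraction` stands in for whatever function you use to call /extract.

```python
def missing_fields(record, required):
    """Return the required keys that are absent or empty in a result."""
    return [key for key in required if record.get(key) in (None, "", [])]


def first_complete(run_extraction, required, attempts=3):
    """Call run_extraction up to `attempts` times and return the first
    result containing every required field, or None if none qualifies."""
    for _ in range(attempts):
        record = run_extraction()
        if not missing_fields(record, required):
            return record
    return None
```

Each retry is a new extraction and so consumes tokens, which is why the number of attempts is capped.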

But here's the exciting part: we're seeing the future of web scraping take shape.

Get Started

Try the Open Beta
- For a limited time, get 500,000 free tokens to start
- Explore the playground at www.firecrawl.dev/playground?mode=extract
- Read the docs at https://docs.firecrawl.dev/features/extract
Join Our Community
- Star us on GitHub at www.github.com/mendableai/firecrawl - we're open source!
- Share your use cases and feedback.

Ready to turn web data into your competitive advantage? Get started in less than 5 minutes.

Get your API key at www.firecrawl.dev/app

— Eric, Caleb, and Nick at Firecrawl 🔥