About Us

We’re building the first end-to-end testing platform for web agents, including a Browser Gym for RL-driven optimization. Our platform helps teams evaluate, benchmark, and improve web agents before they go live, ensuring they can handle real-world, dynamic environments.

With synthetic user simulations, automated evaluations, and large-scale benchmarking, we’re setting a new standard for web agent testing.

We’re a YC-backed team, and this is a founding engineering role—you’ll be one of the first hires defining how we crawl, structure, and analyze the open web at scale.

The Role

We need a Founding Web Scraping Engineer to build internet-scale web crawling infrastructure: not a scraper for a single site, but a system that handles millions of domains and keeps pace with evolving anti-bot defenses.

You’ll be responsible for designing robust, distributed crawling systems that adapt dynamically to web changes, optimize for efficiency, and ensure reliable data extraction.

What You’ll Do

Build large-scale, distributed crawlers that intelligently prioritize, schedule, and optimize requests across millions of domains.
Develop adaptive web scraping systems that handle DOM changes, WebSockets, AJAX-heavy sites, and dynamically loaded content.
Optimize scraping performance and resilience, ensuring high-throughput data extraction with proxy/network optimizations and behavior-driven stealth tactics.
Solve captchas at scale, integrating third-party solvers, heuristic-based workarounds, and behavior-driven bypass techniques.
Manage proxy and identity rotation, implementing session-aware scraping, JA3/TLS fingerprint spoofing, and request signature control.
Structure and clean extracted data for downstream analytics, AI training, and benchmarking applications.

What We’re Looking For

Expert-level experience in large-scale web scraping & crawling (Selenium, Puppeteer, Playwright, Scrapy, undetected-chromedriver).
Deep knowledge of anti-bot detection strategies (TLS fingerprinting, JA3 signatures, request header anomalies, and bot behavior tracking).
Hands-on expertise with captcha-solving strategies, including leveraging APIs, OCR-based approaches, and behavior-driven evasion.
Proven experience building efficient proxy management systems, including rotating IPs across residential, datacenter, and mobile networks.
Proficiency in Python, Go, or JavaScript, with experience in high-performance, parallelized scraping frameworks.
Understanding of HTTP/2, HTTP/3, WebSockets, GraphQL, and browser-based fingerprinting.
Experience designing scalable, fault-tolerant scraping infrastructure that adapts to changes in real time.

Bonus Points

Experience with search engine-scale crawling.
Background in LLM-driven web extraction or RL-enhanced adaptive crawling.
Contributions to open-source scraping tools or web automation projects.

Why Join?

Founding role—you’ll define and own our web crawling infrastructure from day one.
Work at internet scale—building a system that dynamically adapts and scales across millions of domains.
YC-backed—we’re building something that doesn’t exist yet, and you’ll be part of the core team making it happen.

Foundry

Founding Engineer: Large-Scale Web Scraping & Crawling