Datafold

Deal with analytical data quality in the pull request

Senior Software Engineer - AI Agents

$175K - $245K
Location: Remote (US)
Job Type: Full-time
Experience: 3+ years
Gleb Mezhanskiy
Founder

About the role

Datafold is a fast-growing, Series A startup at the forefront of data quality and observability—think Datadog for data engineers. Backed by top-tier investors including YC, Amplify, and NEA, we're redefining how companies like Disney, FanDuel, and Perplexity maintain quality across the entire data lifecycle. Although headquartered in the US, we’re a fully remote team with employees across the US and EU.

We’re looking for an experienced backend (or full-stack) engineer to help build and scale the Datafold Migration Agent (DMA)—an AI-powered tool that’s changing the game in data migration. Combining large language models with our unique data diffing technology, DMA automates SQL dialect translation and data reconciliation, slashing migration timelines by 5-10x and eliminating the need for manual work and costly consultants.

About the Role

  • Drive the development of DMA, shaping both the technical and architectural foundation for a product that combines AI and data engineering to solve complex migration challenges.
  • Design, develop, and maintain backend systems focused on AI-driven SQL translation and cross-database data comparison.
  • Collaborate with early customers to refine features and iterate toward product-market fit.
  • Make strategic technical decisions to ensure DMA is scalable, robust, and optimized for high-performance data migration.
  • Take ownership of projects end-to-end, troubleshooting and resolving complex challenges in real time to deliver high-quality, impactful results.

About You

  • Experience: 5+ years as a software engineer with a strong backend focus.
  • Tech Stack: Proficiency in Python is required; experience building with LLMs and/or JavaScript/TypeScript is a plus.
  • Ownership: Proven ability to manage projects end-to-end, from design through deployment.
  • Startup Mindset: Thoughtful about balancing speed, quality, and business impact.

If building a high-impact, innovative product at the intersection of AI and data engineering sounds exciting, we’d love to hear from you.

Datafold is an equal opportunity employer and does not discriminate against any employee or applicant for employment based on race, color, religion, sex, national origin, age, disability, genetic information, sexual orientation, gender identity, marital status, military status, or any other protected characteristic. We are committed to providing equal employment opportunities to all individuals. We strive to create an inclusive and diverse work environment where all employees are valued and unique perspectives are respected and celebrated.

About Datafold

At Datafold, we build tools for data practitioners to automate the most error-prone and time-consuming parts of the data engineering workflow: testing data to guarantee its quality. While data quality (just like software quality) is a complex and multifaceted problem, we draw from decades of our team’s combined experience in the data domain to build opinionated tools our users love. Specifically, we believe that:

Data quality is a byproduct of a great data engineering workflow. That means that, rather than building yet another app for data practitioners to switch to and from, we insert our tools into existing workflows: for example, CI/CD for deployment testing and IDEs for testing during development.

Data quality issues should be addressed before deploying the code. Most data quality issues are bugs in the code that processes data, and applying a proactive, shift-left approach is the most effective way to achieve high shipping velocity and data quality simultaneously.

Lack of metadata (data about data) is the biggest gap in the data engineering workflow. We bring powerful tools such as data diffing and column-level lineage to every data engineer’s workflow to help them validate the code and underlying data and fully understand the dependencies in complex data pipelines.

Datafold is used by data teams at Patreon, Thumbtack, Substack, and AngelList, among others, and has raised $22M from YC, NEA, and Amplify Partners.

Datafold
Founded: 2020
Team Size: 24
Status: Active
Location: New York
Founders
Gleb Mezhanskiy
Founder
Alex Morozov
Founder