Airtrain AI

No-code data curation for LLM fine-tuning and evaluation.

Airtrain AI is a no-code data platform for Large Language Models. Proprietary AI models such as GPT-4 are very powerful but also very costly, slow, unreliable, and insecure. As businesses look to scale their AI prototypes into production-grade products, they struggle with large AI bills, slow APIs, and high failure rates. Smaller language models, on the other hand, have been shown to perform on par with large ones when fine-tuned on high-quality datasets. Airtrain AI lets AI practitioners explore alternatives to proprietary models, build training datasets, and evaluate, fine-tune, and serve a large selection of open-source LLMs.

Founded: 2022
Team Size: 5
Location: San Francisco
Group Partner: Nicolas Dessaigne

Active Founders

Emmanuel Turlay, Founder

Hi, I am Emmanuel. I spent my foundational years in academia (particle physics research at CERN), then branched out into tech. I built the order and payment infrastructure at Instacart and led the ML platform at Cruise. I am the CEO and co-founder of Sematic (S22). Reach out and say hello.

Company Launches

TL;DR: Sematic is an open-source framework for building and running arbitrary end-to-end ML/DS pipelines, developed by two ex-Cruisers.

When COVID hit and restaurant parklets popped up around San Francisco, the Cruise robotaxis needed to update their ML models in a matter of days.

This fast turnaround in data processing > model training > evaluation > simulation > metrics generation > deployment was enabled by talented ML Engineers using state-of-the-art pipelining tools.

Your problem

If you are an ML/DS developer 🧑‍💻, you likely spend a lot of time battling obscure infrastructure instead of leveraging your core competency: extracting insights and predictions from data.

If you are a business leader 🧑‍💼, you likely want more models to grow and optimize your business, and you want to retrain them frequently to adapt to ever-changing world conditions (hello COVID!).

Existing solutions (e.g. Kubeflow, Argo, Airflow, Prefect) are still too low-level and not usable by ML/DS developers.

Tools such as Ruby on Rails and Heroku enabled generations of web developers to build prototypes into unicorns. Where is that experience for ML/DS development?

Our solution

Sematic Beta Launch Demo

Sematic is a lightweight open-source ML/DS pipeline development and execution framework based on learnings from working at Cruise.

With easy-as-pie onboarding, simply use native Python to develop and run arbitrary end-to-end pipelines that track and version all your assets and artifacts (models, datasets, plots, metrics, etc.), and visualize them in a slick UI.
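
To make the idea of plain Python functions whose outputs are tracked and versioned more concrete, here is a minimal sketch using only the standard library. It is not Sematic's API: the `tracked` decorator, the `ARTIFACTS` list, and the toy `load_data`, `train`, and `evaluate` steps are all hypothetical names invented for illustration, and a real framework would persist artifacts and render them in a UI rather than keep them in memory.

```python
import functools
import hashlib
import json

# Hypothetical in-memory artifact store (illustration only, not Sematic's API).
ARTIFACTS = []


def tracked(step):
    """Record each call to `step` along with a content hash of its output."""

    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        output = step(*args, **kwargs)
        # Derive a short version identifier from the JSON-serialized output.
        payload = json.dumps(output, sort_keys=True).encode("utf-8")
        ARTIFACTS.append(
            {
                "step": step.__name__,
                "version": hashlib.sha256(payload).hexdigest()[:12],
                "output": output,
            }
        )
        return output

    return wrapper


@tracked
def load_data():
    # Toy dataset standing in for a real data-processing step.
    return [1.0, 2.0, 3.0, 4.0]


@tracked
def train(data):
    # Stand-in "model": the mean of the data.
    return {"mean": sum(data) / len(data)}


@tracked
def evaluate(model, data):
    # Mean absolute error when predicting the mean for every point.
    errors = [abs(x - model["mean"]) for x in data]
    return {"mae": sum(errors) / len(errors)}


def pipeline():
    # An end-to-end pipeline is just ordinary function composition.
    data = load_data()
    model = train(data)
    return evaluate(model, data)
```

Calling `pipeline()` runs the three steps in order and leaves one entry per step in `ARTIFACTS`, each with a short hash that changes whenever the step's output changes.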

Run locally or leverage your cloud resources (GPUs!!) seamlessly without infrastructure work.

Collaborate with your team in Sematic to keep the conversation close to the data context.

Sematic aims to facilitate the journey from Notebook prototypes to automated production-grade pipelines.

Our ask

⭐ Star our GitHub repo

Try us out:

$ pip install sematic
$ sematic start

Join our Discord and give us your feedback to make the product more useful to you and your company.

Early Bird offer

We offer a white-glove onboarding service and dedicated support to build out your end-to-end pipelines.

Try us out, reach out on Discord, or email us at emmanuel@sematic.dev.