
Developer platform for fine-tuned LLMs

Empower is a developer platform for fine-tuned LLMs. It provides best-in-class infrastructure and prebuilt, task-specific base models as building blocks, enabling developers to cost-effectively build and deploy fine-tuned LLMs for their specific use cases. The result is an alternative to expensive, slow general-purpose LLMs that does not compromise on response quality.
Empower
Founded: 2023
Team Size: 2
Location: San Mateo, CA
Group Partner: Diana Hu

Active Founders

Yulong Liu, Founder

Co-founder at Empower. Previously a machine learning engineering manager at Snap and a senior software engineer at Google Research.

Daiyi Yang, Founder

Co-founder at Empower. Previously an uber TL at Meta across multiple areas, including Lead Ads, News Feed Experience, and Metaverse Avatars, and before Meta, Director of Engineering at Revinate, leading product development and infrastructure.

Company Launches

Problem: Fine-tuning SLMs to replace general-purpose LLMs is hard

Despite recent price drops, general-purpose large language models like GPT-4 and Claude Sonnet remain costly for many use cases. With rates averaging around $5 per million tokens, even simple tasks can cost more than $0.10 per request, significantly limiting their use in many scenarios.

Fine-tuned small language models (SLMs), such as Llama 3 8B, can match or even surpass general-purpose LLMs in task-specific scenarios. However, fine-tuning an SLM requires significant engineering effort: data collection, model iteration and evaluation, and deployment management are all time-consuming for engineering teams.

Solution

Empower’s Auto Fine-Tuning (AFT) platform offers a one-stop solution for model fine-tuning. With AFT, users need to modify just five lines of code, while the platform handles everything else, including data collection, SLM training, evaluation, hosting, and traffic management. Additionally, AFT offers automatic model retraining to ensure consistent fine-tuned model performance over time.

How It Works

In the Empower AFT platform, tasks serve as the core units for organizing and managing LLM requests. When a new task is created, all traffic is initially directed to the designated general-purpose LLM. As the system gathers data and fine-tunes a specialized model for the task, the platform gradually shifts traffic from the general-purpose LLM to the newly fine-tuned SLM. This automatic transition optimizes performance and reduces costs, ensuring that customers’ applications benefit from the most efficient and effective model over time.

Below, we will explain in detail how the AFT platform works:

Integration

After a task is created, integrating with the Empower AFT platform is as simple as changing five lines of code.
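The exact five lines are not shown here, but gateways like this commonly expose an OpenAI-compatible endpoint, in which case the change amounts to repointing the client configuration. The sketch below illustrates that pattern; the gateway URL, header name, and task name are hypothetical, not Empower's documented values.

```python
# Before: requests go directly to the general-purpose LLM provider.
llm_config = {
    "base_url": "https://api.openai.com/v1",
    "model": "gpt-4o",
}

# After: the same requests flow through the Empower gateway, which proxies
# them to the designated LLM while capturing data for fine-tuning.
# (URL, header, and task name are illustrative, not documented values.)
llm_config = {
    "base_url": "https://gateway.empower.example/v1",
    "model": "gpt-4o",
    "extra_headers": {"X-Empower-Task": "support-triage"},
}
```

Because only the endpoint and a task identifier change, the application's prompts and response handling stay untouched.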

Once the changes are deployed, all LLM requests are routed through the Empower AFT’s gateway. The gateway proxies traffic to the designated general-purpose LLM while simultaneously capturing request and response data. This data is then utilized for fine-tuning SLMs.
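Conceptually, the gateway's proxy-and-capture behavior can be sketched in a few lines; this is a minimal illustration of the flow described above, not Empower's implementation:

```python
def gateway_call(llm, capture_store, messages):
    """Proxy sketch: forward the request to the designated LLM unchanged,
    and record the request/response pair for later fine-tuning."""
    response = llm(messages)
    capture_store.append({"messages": messages, "response": response})
    return response

# Stand-in for a real LLM call, so the sketch is self-contained.
fake_llm = lambda msgs: "ack: " + msgs[-1]["content"]

store = []
out = gateway_call(fake_llm, store, [{"role": "user", "content": "hello"}])
assert out == "ack: hello"          # the caller sees the LLM's answer as usual
assert store[0]["response"] == "ack: hello"  # the pair is captured on the side
```

The key property is that capture is a side effect: the application still receives exactly the general-purpose LLM's response.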

Data Capturing

LLM requests proxied by the gateway are stored in a task-specific dataset and reviewed by the verifier. The verifier ensures the integrity of these requests through the designated mechanism, either an auto-verification LLM call, heuristic rules, or an additional manual verification API request. Once verified, these requests are injected into the training dataset used to fine-tune the task-specific SLM.
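As a minimal sketch of the heuristic-rules variant of the verifier (the rules and thresholds here are illustrative, not Empower's actual checks):

```python
def verify_example(request: str, response: str) -> bool:
    """Heuristic verifier sketch: accept a captured request/response pair
    into the training set only if it passes simple integrity rules. A real
    deployment could instead use an auto-verification LLM call or a manual
    verification API request."""
    if not response.strip():
        return False                       # empty completions carry no signal
    if "i'm sorry" in response.lower():    # crude refusal filter (illustrative)
        return False
    if len(response) > 8000:               # suspiciously long outputs
        return False
    return True

dataset = [
    ("Summarize this ticket", "Customer reports login failures after reset."),
    ("Summarize this ticket", ""),
    ("Summarize this ticket", "I'm sorry, I can't help with that."),
]
training_set = [ex for ex in dataset if verify_example(*ex)]
assert len(training_set) == 1  # only the clean pair survives verification
```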

Auto Model Fine-Tuning

AFT automatically initiates the fine-tuning job once sufficient data is collected for a given task. During this process, the AFT platform determines the optimal parameters for training the model, including base model selection, hyperparameters, and dataset sampling strategies, then iterates and evaluates the model to select the best candidate.
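The iterate-and-evaluate step above boils down to scoring candidate training configurations on held-out data and keeping the best one. A toy sketch, with made-up configurations and scores:

```python
def pick_best_candidate(candidates):
    """Each candidate pairs a training config (base model, hyperparameters,
    dataset sampling strategy) with its held-out eval score; keep the
    highest scorer."""
    return max(candidates, key=lambda c: c["eval_score"])

# Illustrative candidate runs, not real benchmark numbers.
candidates = [
    {"base_model": "llama3-8b",  "lr": 2e-4, "eval_score": 0.86},
    {"base_model": "llama3-8b",  "lr": 1e-4, "eval_score": 0.91},
    {"base_model": "mistral-7b", "lr": 2e-4, "eval_score": 0.88},
]
best = pick_best_candidate(candidates)
assert best["lr"] == 1e-4  # the best-evaluated run wins
```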

Traffic Splitting and Model Refreshing

Once a fine-tuned SLM is ready, subsequent LLM requests routed through the Empower gateway will be automatically split between the fine-tuned SLM and general-purpose LLMs. By default, AFT directs 90% of incoming requests to the fine-tuned SLM, while the remaining 10% are sent to the designated general-purpose LLM. This 10% split ensures that the model remains accurate and current by continuously evaluating the SLM’s performance and facilitating automatic updates.
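The default 90/10 split described above amounts to weighted random routing per request; a minimal sketch:

```python
import random

def route(rng: random.Random, slm_share: float = 0.9) -> str:
    """Default split: 90% of requests go to the fine-tuned SLM; the
    remaining 10% stay on the general-purpose LLM so the two can be
    compared continuously for evaluation and retraining."""
    return "slm" if rng.random() < slm_share else "llm"

rng = random.Random(42)
routes = [route(rng) for _ in range(10_000)]
slm_fraction = routes.count("slm") / len(routes)
assert 0.88 < slm_fraction < 0.92  # close to the configured 90% share
```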

As LLM requests evolve, AFT keeps the fine-tuned models up to date on a designated schedule. With the auto model refreshing feature, users can customize the update cadence, enabling the fine-tuned SLMs to continually adapt to new data and maintain consistent performance.

Pricing

We offer a straightforward pricing model: 20% of your LLM bill savings, inclusive of model training, data storage, and inference usage.
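As a worked example of this pricing model (the dollar amounts are illustrative, not quotes):

```python
# A workload that cost $10,000/month on a general-purpose LLM and
# $2,000/month after moving traffic to a fine-tuned SLM saves $8,000;
# the fee is 20% of that saving and covers training, storage, and inference.
baseline_bill = 10_000.0
new_bill = 2_000.0

saving = baseline_bill - new_bill   # $8,000 saved per month
empower_fee = 0.20 * saving         # $1,600 fee
net_saving = saving - empower_fee   # $6,400 kept by the customer

assert empower_fee == 1600.0
assert net_saving == 6400.0
```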

Get Access

Ready to explore how the Empower AFT platform can help reduce your LLM costs? We are currently conducting a private beta program. We are looking for customers who:

  • Utilize mainstream general-purpose LLMs, such as GPT-4o, Claude Sonnet/Claude Opus, Gemini Pro, etc.
  • Have AI-powered products in production that consume at least 200 million LLM tokens per month.

Sign up for the beta program by submitting this form, scheduling a meeting, or emailing us to discuss how we can support your use case!

Other Company Launches

Empower-Functions - Function calling model tailored for real-world use cases

GPT-4-level function calling model, but with 3X faster response time and 10X lower cost

DSensei - Pinpoint the root cause of metric fluctuations in one minute

An AI-powered key driver analysis engine