Guide Labs: Interpretable foundation models

Foundation models that can explain their reasoning and are easy to align

At Guide Labs, we build interpretable foundation models that can reliably explain their reasoning and are easy to align.

The Problem: foundation models are black boxes and difficult to align

Current transformer-based large language models (LLMs) and diffusion generative models are largely inscrutable and do not provide reliable explanations for their output. In medicine, lending, and drug discovery, it is not enough to only provide an answer; domain experts would also like to know why the model arrived at its output.

  • Current foundation models don’t explain their outputs. Would you trust a black-box model to propose medications for your illness or decide whether you should get a job interview?
  • You can't debug a system you don't understand: When you call a model API and the response is incorrect, what do you do? Change the prompt? What part of your prompt should you change? Switch to a new model API?
  • Difficult to reliably align or control model outputs: Even when you've identified the cause of the problem, how do you control the model so that it no longer makes the mistake you identified?

Our Solution

We’ve developed interpretable foundation models that can explain their reasoning and are easy to align.

These models:

  • provide human-understandable explanations;
  • indicate which part of the prompt is important; and
  • specify which tokens led to the model's output.

Using all these explanations, we can:

  • identify the part of the prompt that causes the model to err;
  • isolate the samples that cause those errors; and
  • use explanations to control and align the model to fix its errors.
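
This page does not describe how these explanations are computed. Purely as a generic illustration of what "identifying the part of the prompt that matters" can look like, the sketch below applies leave-one-out (occlusion) attribution to a toy scoring function. Everything in it is an assumption made for illustration: `toy_score`, `occlusion_attribution`, and the keyword weights are hypothetical, and none of it represents our models or any Guide Labs API.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model: scores a prompt by summing keyword weights.
# A real model would be far more complex; this exists only to make the sketch runnable.
TOY_WEIGHTS = {"fever": 2.0, "cough": 1.0, "rash": 0.5}


def toy_score(tokens: List[str]) -> float:
    """Return a toy 'model output' for a tokenized prompt."""
    return sum(TOY_WEIGHTS.get(tok.lower(), 0.0) for tok in tokens)


def occlusion_attribution(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Leave-one-out attribution: how much the score drops when each token is removed."""
    base = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]  # prompt with token i removed
        attributions.append((tok, base - score_fn(reduced)))
    return attributions


prompt = "patient reports fever and mild cough".split()
attributions = occlusion_attribution(prompt, toy_score)

# The token whose removal changes the output the most.
most_important = max(attributions, key=lambda pair: pair[1])
```

In this toy setting, a token with a large attribution is one whose removal changes the output the most, which is the kind of signal one could inspect when a response is wrong. Occlusion is only one of several generic attribution techniques and is used here because it is easy to read.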

About Us

We are a team of machine learning researchers with PhDs from MIT who have worked on interpretable models for the past 6 years. We have published more than 10 papers at top machine learning conferences on the topic. We recently developed interpretable diffusion and large language models that can explain their outputs using human-understandable concepts. We have previously built and trained interpretable models at Google and Meta.

Our Ask

We are interested in working with companies across lending/finance, healthcare/medicine, insurance, and drug discovery who are looking for alternative models that can reliably explain their outputs. Please reach out to us at info@guidelabs.ai.