Introduction to Braintrust


Braintrust provides visibility into the performance of AI-driven features. It covers three main areas:

  • Evals: Experiments allow you to test changes to AI features before shipping them.
  • Observability: Logs let you verify that production code is delivering AI features successfully.
  • IDE: Playgrounds allow you to quickly iterate by exploring changes to prompts and other parameters.

How it works

Run evals

  1. Create an Eval() by plugging in your data, a task function, and scoring functions.
  2. Run the code to see results in Braintrust's Experiments page.
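
To make the three parts of an eval concrete, here is a minimal plain-Python sketch of the data, task, and scorer pattern. It deliberately does not import the Braintrust SDK: run_eval, exact_match, and the greeting task are hypothetical stand-ins for illustration, and unlike a real Eval() this sketch does not send anything to the Experiments page.

```python
# Hypothetical dataset: each case pairs an input with the expected output.
data = [
    {"input": "Foo", "expected": "Hi Foo"},
    {"input": "Bar", "expected": "Hello Bar"},
]


def task(name: str) -> str:
    """The function under test; in practice this would call a model."""
    return "Hi " + name


def exact_match(output: str, expected: str) -> float:
    """A scoring function: 1.0 for an exact match, 0.0 otherwise."""
    return 1.0 if output == expected else 0.0


def run_eval(data, task, scorers):
    """Stand-in for Eval(): run the task on every case and score its output."""
    results = []
    for case in data:
        output = task(case["input"])
        scores = {s.__name__: s(output, case["expected"]) for s in scorers}
        results.append(
            {"input": case["input"], "output": output, "scores": scores}
        )
    return results


results = run_eval(data, task, [exact_match])
for row in results:
    print(row)
```

The second case is written to fail on purpose, so the scores show how a change to the task (for example, a different greeting) would move the results between runs.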

Log runs

  1. Instrument your code using traced() or logger.traced().
  2. Run the code to see results in Braintrust's Logs page.
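
As a rough illustration of what instrumenting a function means, the sketch below wraps a function so that each call records its name, input, output, and duration. It is a conceptual stand-in and not the Braintrust SDK: traced_sketch, the in-memory spans list, and answer_question are all hypothetical names, and a real logger would send this information to the Logs page rather than keep it in a list.

```python
import functools
import time

# In-memory stand-in for a log store (assumption for this sketch only).
spans = []


def traced_sketch(fn):
    """Hypothetical decorator: record name, input, output, and duration per call."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = fn(*args, **kwargs)
        spans.append(
            {
                "name": fn.__name__,
                "input": {"args": args, "kwargs": kwargs},
                "output": output,
                "duration_s": time.perf_counter() - start,
            }
        )
        return output

    return wrapper


@traced_sketch
def answer_question(question: str) -> str:
    # Placeholder for a model call.
    return "You asked: " + question


answer_question("What is Braintrust?")
print(spans[0]["name"], spans[0]["output"])
```

Because the wrapper returns the original output unchanged, instrumentation of this kind can be added to production code without altering its behavior.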

Create prompts

  1. Visit the Prompts page to create a new prompt.
  2. Visit the Playgrounds page to try out one or more prompts on your data.