Skip to main content

Introduction

Weave is a lightweight toolkit for tracking and evaluating LLM applications, built by Weights & Biases.

Our goal is to bring rigor, best-practices, and composability to the inherently experimental process of developing AI applications, without introducing cognitive overhead.

Get started by decorating Python functions with @weave.op().

Weave Hero

Seriously, try the 🍪 quickstart 🍪 or Open In Colab

You can use Weave to:

  • Log and debug language model inputs, outputs, and traces
  • Build rigorous, apples-to-apples evaluations for language model use cases
  • Organize all the information generated across the LLM workflow, from experimentation to evaluations to production

Key concepts

Weave's core types layer contains everything you need for organizing Generative AI projects, with built-in lineage, tracking, and reproducibility.

  • Datasets: Version, store, and share rich tabular data.
  • Models: Version, store, and share parameterized functions.
  • Evaluations: Test suites for AI models.
  • [soon] Agents: ...

Weave's tracking layer brings immutable tracing and versioning to your programs and experiments.

  • Objects: Weave's extensible serialization lets you easily version, track, and share Python objects.
  • Ops: Versioned, reproducible functions, with automatic tracing.
  • Tracing: Automatic organization of function calls and data lineage.
  • Feedback: Simple utilities to capture user feedback and attach them to the underlying tracked call.

Weave offers integrations with many language model APIs and LLM frameworks to streamline tracking and evaluation:

Weave's tools layer contains utilities for making use of Weave objects.

  • Serve: FastAPI server for Weave Ops and Models
  • Deploy: Deploy Weave Ops and Models to various targets

What's next?

Try the Quickstart to see Weave in action.