TensorZero Evaluations Overview

TensorZero offers two types of evaluations:

Static Evaluations measure the performance of a single TensorZero variant (i.e., a choice of prompt, model, inference strategy, etc.) on a given dataset.

Dynamic Evaluations focus on evaluating complex workflows that might include multiple TensorZero inference calls, arbitrary application logic, and more.

As a rough analogy, static evaluations are like unit tests for individual inference calls, and dynamic evaluations are like integration tests for complex workflows.
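To make the analogy concrete, here is a minimal sketch in plain Python. It does not use the TensorZero API: `fake_inference`, `exact_match`, `workflow`, the variant name `"shout"`, and the sample data are all hypothetical stand-ins chosen for illustration. The static-style function scores one variant on a fixed dataset with one inference per datapoint, while the dynamic-style function scores the final result of a workflow that chains two inference calls with some application logic in between.

```python
from statistics import mean


def fake_inference(variant: str, prompt: str) -> str:
    """Hypothetical stand-in for one inference call (not the TensorZero API)."""
    return prompt.upper() if variant == "shout" else prompt.lower()


def exact_match(output: str, expected: str) -> float:
    """Hypothetical evaluator: 1.0 on an exact match, otherwise 0.0."""
    return 1.0 if output == expected else 0.0


def static_evaluation(variant: str, dataset: list[dict]) -> float:
    """Unit-test style: one inference per datapoint, scored against the dataset."""
    return mean(
        exact_match(fake_inference(variant, d["input"]), d["expected"])
        for d in dataset
    )


def workflow(variant: str, text: str) -> str:
    """Hypothetical workflow: two inference calls plus application logic."""
    first = fake_inference(variant, text)
    trimmed = first.strip()  # arbitrary application logic between calls
    return fake_inference(variant, trimmed + "!")


def dynamic_evaluation(variant: str, cases: list[dict]) -> float:
    """Integration-test style: score the end result of the whole workflow."""
    return mean(
        exact_match(workflow(variant, c["input"]), c["expected"]) for c in cases
    )


dataset = [
    {"input": "Hello", "expected": "HELLO"},
    {"input": "World", "expected": "world"},
]
cases = [{"input": " Hello ", "expected": "HELLO!"}]

static_score = static_evaluation("shout", dataset)
dynamic_score = dynamic_evaluation("shout", cases)
print(f"static: {static_score}, dynamic: {dynamic_score}")
```

In this toy setup the static score isolates how one variant behaves on each datapoint, whereas the dynamic score only reflects whether the full chain of calls and logic produced the expected final output.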

Tutorial: Static Evaluations

Tutorial: Dynamic Evaluations