Workflow Evaluations focus on evaluating complex workflows that might include multiple TensorZero inference calls, arbitrary application logic, and more. You can initialize and run workflow evaluations using the TensorZero Gateway, either through the TensorZero client or the gateway’s HTTP API. Unlike inference evaluations, workflow evaluations are not defined in the TensorZero configuration file. See the Workflow Evaluations Tutorial for a step-by-step guide.Documentation Index
Fetch the complete documentation index at: https://www.tensorzero.com/docs/llms.txt
Use this file to discover all available pages before exploring further.
Endpoints & Methods
Starting a workflow evaluation run
- Gateway Endpoint:
POST /workflow_evaluation_run - Client Method:
workflow_evaluation_run - Parameters:
variants: an object (dictionary) mapping function names to variant namesproject_name(string, optional): the name of the project to associate the run withdisplay_name(string, optional): the display (human-readable) name of the runtags(dictionary, optional): a dictionary of key-value pairs to tag the run’s inferences with
- Returns:
run_id(UUID): the ID of the run
Starting an episode in a workflow evaluation run
- Gateway Endpoint:
POST /workflow_evaluation_run/{run_id}/episode - Client Method:
workflow_evaluation_run_episode - Parameters:
run_id(UUID): the ID of the run generated by theworkflow_evaluation_runmethodtask_name(string, optional): the name of the task to associate the episode withtags(dictionary, optional): a dictionary of key-value pairs to tag the episode’s inferences with
- Returns:
episode_id(UUID): the ID of the episode
Making inference and feedback calls during a workflow evaluation run
After initializing a run and an episode, you can make inference and feedback API calls like you normally would. By providing the specialepisode_id parameter generated by the workflow_evaluation_run_episode method , the TensorZero Gateway will associate the inference and feedback with the evaluation run, handle variant pinning, and more.