Example
You can also find the runnable code for this example on GitHub.
To start a batch inference job, send a POST request to the /batch_inference endpoint. The inputs field is a list of inputs, where each input is equal to the input field in a regular inference request.
The response includes a batch_id as well as inference_ids and episode_ids for each inference in the batch.
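The snippet below is a minimal sketch of submitting a batch job over HTTP with Python and the requests library. The gateway URL, the function name, and the example inputs are illustrative assumptions; only the endpoint path and the inputs, batch_id, inference_ids, and episode_ids fields come from the description above.

```python
import requests

# Assumption: the TensorZero Gateway is running locally on this port.
TENSORZERO_URL = "http://localhost:3000"

# Assumption: a function named "my_function" exists in your configuration.
# Each entry in "inputs" mirrors the input field of a regular inference request.
payload = {
    "function_name": "my_function",
    "inputs": [
        {"messages": [{"role": "user", "content": "What is the capital of France?"}]},
        {"messages": [{"role": "user", "content": "What is the capital of Japan?"}]},
    ],
}

response = requests.post(f"{TENSORZERO_URL}/batch_inference", json=payload)
response.raise_for_status()
job = response.json()

# The response carries a batch_id plus per-inference inference_ids and episode_ids.
batch_id = job["batch_id"]
print(batch_id, job["inference_ids"], job["episode_ids"])
```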
You can use this batch_id to poll for the status of the job or retrieve the results using the GET /batch_inference/{batch_id} endpoint.
While the job is pending, the response only includes the status field.
Once the job completes, the response includes both the status field and the inferences field.
Each inference object is the same as the response from a regular inference request.
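As a rough sketch, a polling loop might look like the following. The poll interval and the assumption that any non-pending status means the job is finished are choices made here for brevity, not behavior guaranteed by the gateway.

```python
import time

import requests

TENSORZERO_URL = "http://localhost:3000"  # assumed gateway URL
batch_id = "..."  # the batch_id returned when the job was submitted

while True:
    response = requests.get(f"{TENSORZERO_URL}/batch_inference/{batch_id}")
    response.raise_for_status()
    result = response.json()

    if result["status"] == "pending":
        # While pending, the response only carries the status field.
        time.sleep(10)  # assumed poll interval
        continue

    # Once the job completes, the response also includes the inferences field.
    # Each inference object matches a regular inference response.
    for inference in result.get("inferences", []):
        print(inference)
    break
```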
Technical Notes
- Observability
  - For now, pending batch inference jobs are not shown in the TensorZero UI. You can find the relevant information in the BatchRequest and BatchModelInference tables on ClickHouse (a hedged query sketch follows this list). See Data Model for more information.
  - Inferences from completed batch inference jobs are shown in the UI alongside regular inferences.
- Experimentation
  - The gateway samples the same variant for the entire batch.
- Python Client
  - The TensorZero Python client doesn’t natively support batch inference yet. You’ll need to submit batch requests using HTTP requests, as shown above.
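Finally, here is the query sketch referenced in the Observability note above. It assumes the clickhouse-connect package, a local ClickHouse instance with a database named tensorzero, and that both tables can be filtered by a batch_id column; check the Data Model documentation for the actual schema.

```python
import clickhouse_connect

# Assumptions: local ClickHouse, default credentials, and a database named
# "tensorzero"; adjust these to match your deployment.
client = clickhouse_connect.get_client(
    host="localhost", port=8123, database="tensorzero"
)

batch_id = "00000000-0000-0000-0000-000000000000"  # your batch_id here

# Assumption: both tables expose a batch_id column to filter on.
for table in ("BatchRequest", "BatchModelInference"):
    result = client.query(
        f"SELECT * FROM {table} WHERE batch_id = %(batch_id)s",
        parameters={"batch_id": batch_id},
    )
    print(table, result.result_rows)
```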