Learn how to process multiple requests at once with batch inference to save on inference costs at the expense of longer wait times.
The `inputs` field is a list of inputs, where each entry is equal to the `input` field in a regular inference request.
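As a concrete sketch, a batch request body might look like the following. Only the `inputs` field itself comes from this page; the `function_name` value and the message shape are illustrative assumptions:

```python
# Sketch of a batch inference request body. The function name and
# message contents are illustrative; each entry in "inputs" mirrors
# the "input" field of a regular inference request.
batch_request = {
    "function_name": "generate_haiku",  # assumption: example function name
    "inputs": [
        {"messages": [{"role": "user", "content": "Write a haiku about the sea."}]},
        {"messages": [{"role": "user", "content": "Write a haiku about the moon."}]},
    ],
}
```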
The response includes a `batch_id` as well as `inference_ids` and `episode_ids` for each inference in the batch.
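For illustration, a submission response could be shaped like this. Only the three field names are taken from this page; the IDs are placeholders:

```python
# Illustrative submission response (IDs are made-up placeholders).
# There is one inference_id and one episode_id per input in the batch.
submit_response = {
    "batch_id": "00000000-0000-0000-0000-0000000000aa",
    "inference_ids": [
        "00000000-0000-0000-0000-000000000001",
        "00000000-0000-0000-0000-000000000002",
    ],
    "episode_ids": [
        "00000000-0000-0000-0000-000000000011",
        "00000000-0000-0000-0000-000000000012",
    ],
}
```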
You can use the `batch_id` to poll for the status of the job or retrieve the results using the `GET /batch_inference/{batch_id}` endpoint.
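A minimal polling sketch using only the standard library. The base URL and the `"pending"` status value are assumptions; the endpoint path comes from above:

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: local gateway address


def batch_status_url(batch_id: str) -> str:
    """Build the URL for the GET /batch_inference/{batch_id} endpoint."""
    return f"{BASE_URL}/batch_inference/{batch_id}"


def poll_batch(batch_id: str, interval_s: float = 10.0) -> dict:
    """Poll until the job is no longer pending, then return the body.

    The "pending" status value is an assumption; check the API
    reference for the exact set of status values.
    """
    while True:
        with urllib.request.urlopen(batch_status_url(batch_id)) as resp:
            body = json.load(resp)
        if body.get("status") != "pending":
            return body
        time.sleep(interval_s)
```

Batch jobs can take a while to complete, so a generous polling interval is usually appropriate.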
While the job is still pending, the response only contains the `status` field.
Once the job completes, the response contains the `status` field and the `inferences` field.
Each inference object is the same as the response from a regular inference request.
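Since each inference object matches a regular inference response, the results can be handled the same way. In this sketch, only `status` and `inferences` come from this page; the inner layout (an `inference_id` plus text content blocks) is an assumption:

```python
# Illustrative completed response. Only "status" and "inferences" are
# documented above; the inner fields mirror a regular inference
# response and are assumptions here.
completed = {
    "status": "completed",
    "inferences": [
        {
            "inference_id": "00000000-0000-0000-0000-000000000001",
            "content": [{"type": "text", "text": "Waves fold into foam"}],
        },
        {
            "inference_id": "00000000-0000-0000-0000-000000000002",
            "content": [{"type": "text", "text": "Moonlight on the tide"}],
        },
    ],
}

# Extract the text blocks exactly as you would from a regular response.
texts = [
    block["text"]
    for inference in completed["inferences"]
    for block in inference["content"]
    if block["type"] == "text"
]
```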
The batch inference data is stored in the `BatchRequest` and `BatchModelInference` tables in ClickHouse.
See Data Model for more information.
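As a sketch, you can inspect the stored rows with ad-hoc queries against those tables. Only the table names come from this page; the query shape, row limit, and whichever ClickHouse client you use to run them are illustrative:

```python
# Sketch: queries for inspecting batch inference rows in ClickHouse.
# The table names come from this page; everything else is illustrative.
BATCH_TABLES = ["BatchRequest", "BatchModelInference"]

queries = {table: f"SELECT * FROM {table} LIMIT 5" for table in BATCH_TABLES}
# Run each query with your ClickHouse client of choice to see the
# columns described in the Data Model documentation.
```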