- chat: the default choice for most LLM chat completion use cases
- json: a specialized function type when your goal is generating structured outputs
Generate structured outputs with a static schema
Let’s create a JSON function for one of its typical use cases: data extraction.
1. Configure your JSON function
Create a configuration file that defines your JSON function with the output schema and JSON mode.
If you don’t specify an output_schema, the gateway will default to accepting any valid JSON output.
tensorzero.toml
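A minimal sketch of such a configuration, assuming a data extraction use case; the function name (extract_email), variant name, model, and file paths are illustrative:

```toml
# Hypothetical JSON function for extracting structured data from emails
[functions.extract_email]
type = "json"
output_schema = "functions/extract_email/output_schema.json"

# A variant pairing this function with a model and a prompt template
[functions.extract_email.variants.gpt_4o_mini]
type = "chat_completion"
model = "openai::gpt-4o-mini"
system_template = "functions/extract_email/system_template.minijinja"
json_mode = "strict"
```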
The field json_mode can be one of the following: off, on, strict, or tool.
The tool strategy is a custom TensorZero implementation that leverages tool use under the hood for generating JSON.
See the Configuration Reference for details.
2. Configure your output schema
If you choose to specify a schema, place it in the relevant file:
output_schema.json
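For example, a schema for the hypothetical email-extraction function might look like this (the fields are illustrative):

```json
{
  "type": "object",
  "properties": {
    "email": { "type": "string" }
  },
  "required": ["email"],
  "additionalProperties": false
}
```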
3. Create your prompt template
Create a template that instructs the model to extract the information you need.
system_template.minijinja
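A sketch of a system template for the hypothetical extraction task; the wording is illustrative, and this template takes no variables:

```
You are an assistant that extracts email addresses from the user's message.
Return a JSON object with a single field, email, containing the address you find.
```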
4. Call the function
You can call the function with the TensorZero Python SDK, the OpenAI SDK (Python or Node), or plain HTTP.
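A minimal sketch using the TensorZero Python SDK, assuming a local gateway at http://localhost:3000 and the hypothetical extract_email function defined above:

```python
from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_http(gateway_url="http://localhost:3000") as client:
    response = client.inference(
        function_name="extract_email",
        input={
            "messages": [
                {
                    "role": "user",
                    "content": "Reach me at blake@example.com if anything comes up.",
                }
            ]
        },
    )

# `parsed` is the schema-validated object; `raw` is the raw string output
print(response.output.parsed)
print(response.output.raw)
```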
When using the TensorZero SDK, the response will include raw and parsed values.
The parsed field contains the validated JSON object.
If the output doesn’t match the schema or isn’t valid JSON, parsed will be None and you can fall back to the raw string output.
Sample Response
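A sketch of the response shape; the identifiers, values, and token counts are illustrative:

```json
{
  "inference_id": "00000000-0000-0000-0000-000000000000",
  "episode_id": "00000000-0000-0000-0000-000000000000",
  "variant_name": "gpt_4o_mini",
  "output": {
    "raw": "{\"email\": \"blake@example.com\"}",
    "parsed": {
      "email": "blake@example.com"
    }
  },
  "usage": {
    "input_tokens": 100,
    "output_tokens": 10
  }
}
```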
Generate structured outputs with a dynamic schema
While we recommend specifying a fixed schema in the configuration whenever possible, you can provide the output schema dynamically at inference time if your use case demands it. See output_schema in the Inference API Reference or response_format in the Inference (OpenAI) API Reference.
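For example, with the TensorZero Python SDK, a schema can be supplied per request; the schema and message here are illustrative:

```python
from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_http(gateway_url="http://localhost:3000") as client:
    response = client.inference(
        function_name="extract_email",
        input={
            "messages": [
                {"role": "user", "content": "Reach me at blake@example.com."}
            ]
        },
        # Overrides the configured output_schema for this request only
        output_schema={
            "type": "object",
            "properties": {"email": {"type": "string"}},
            "required": ["email"],
            "additionalProperties": False,
        },
    )
```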
You can also override json_mode at inference time if necessary.
Set json_mode at inference time
You can set json_mode for a particular request using params.
This value takes precedence over any default behaviors or json_mode in the configuration.
As with the basic example, you can use the TensorZero Python SDK, the OpenAI SDK (Python or Node), or plain HTTP. You can set json_mode by adding params to the request body. See the Inference API Reference for more details.
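A sketch using the TensorZero Python SDK; the params key matches the variant type (chat_completion) of the hypothetical variant defined above:

```python
from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_http(gateway_url="http://localhost:3000") as client:
    response = client.inference(
        function_name="extract_email",
        input={
            "messages": [
                {"role": "user", "content": "Reach me at blake@example.com."}
            ]
        },
        # Takes precedence over json_mode in the variant configuration
        params={"chat_completion": {"json_mode": "strict"}},
    )
```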
Handle model provider limitations
Anthropic
Anthropic supports native structured outputs through their beta API. To use this feature with TensorZero, enable beta_structured_outputs = true in your Anthropic provider configuration and set json_mode = "strict".
Alternatively, you can use extra_headers.
tensorzero.toml
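A sketch of such a provider configuration; the model and variant names are illustrative, while beta_structured_outputs and json_mode are the settings described above:

```toml
[models.claude-sonnet]
routing = ["anthropic"]

[models.claude-sonnet.providers.anthropic]
type = "anthropic"
model_name = "claude-sonnet-4-20250514"
# Enables Anthropic's beta structured outputs API
beta_structured_outputs = true

[functions.extract_email.variants.claude]
type = "chat_completion"
model = "claude-sonnet"
json_mode = "strict"
```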
Gemini (GCP Vertex AI, Google AI Studio)
GCP Vertex AI Gemini and Google AI Studio support structured outputs, but only a subset of the JSON Schema specification. TensorZero automatically handles some known limitations, but certain output schemas will still be rejected by the model provider. Refer to the Google documentation for details on supported JSON Schema features.
Lack of native support (e.g. AWS Bedrock)
Some model providers (e.g. OpenAI, Google) support strictly enforcing output schemas natively, but others (e.g. AWS Bedrock) do not. For providers without native support, you can still generate structured outputs with json_mode = "tool".
TensorZero converts your output schema into a tool call, then transforms the tool response back into JSON output.
You can set json_mode = "tool" in your configuration file or at inference time.
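For example, a variant targeting a Bedrock-backed model could set json_mode in its configuration; the variant and model names are illustrative:

```toml
[functions.extract_email.variants.bedrock]
type = "chat_completion"
model = "my_bedrock_model"  # a model backed by an AWS Bedrock provider
# Emulates structured outputs via tool use for providers without native support
json_mode = "tool"
```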