TensorZero Functions come in two flavors:
  • chat: the default choice for most LLM chat completion use cases
  • json: a specialized function type when your goal is generating structured outputs
As a rule of thumb, you should use JSON functions if you have a single, well-defined output schema. If you need more flexibility (e.g. letting the model pick between multiple tools, or whether to pick a tool at all), then Chat Functions with tool use might be a better fit.

Generate structured outputs with a static schema

Let’s create a JSON function for one of the most common use cases: data extraction.
We provide complete code examples on GitHub.
1. Configure your JSON function

Create a configuration file that defines your JSON function with the output schema and JSON mode. If you don’t specify an output_schema, the gateway will default to accepting any valid JSON output.
tensorzero.toml
[functions.extract_data]
type = "json"
output_schema = "output_schema.json"  # optional

[functions.extract_data.variants.baseline]
type = "chat_completion"
model = "openai::gpt-5-mini"
system_template = "system_template.minijinja"
json_mode = "strict"
The field json_mode can be one of the following: off, on, strict, or tool. The tool strategy is a custom TensorZero implementation that leverages tool use under the hood for generating JSON. See Configuration Reference for details.
Use "strict" mode for providers that support it (e.g. OpenAI) or "tool" for others.
2. Configure your output schema

If you choose to specify a schema, place it in the relevant file:
output_schema.json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": {
      "type": ["string", "null"],
      "description": "The customer's full name"
    },
    "email": {
      "type": ["string", "null"],
      "description": "The customer's email address"
    }
  },
  "required": ["name", "email"],
  "additionalProperties": false
}
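To see what this schema accepts and rejects, here is a minimal stdlib-only sketch (not part of the TensorZero SDK) that mirrors its rules: both keys are `required`, values may be strings or `null`, and `additionalProperties: false` forbids extra keys.

```python
import json

# Hypothetical model outputs to check against the schema above.
valid = '{"name": null, "email": "[email protected]"}'
invalid = '{"name": "Sarah Johnson"}'  # missing the required "email" key

def matches_schema(raw: str) -> bool:
    """Minimal check mirroring the schema: exactly the keys "name" and
    "email" must be present, and each value must be a string or null."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if set(obj) != {"name", "email"}:
        return False
    return all(v is None or isinstance(v, str) for v in obj.values())

print(matches_schema(valid))    # True
print(matches_schema(invalid))  # False
```

In production the gateway performs this validation for you; this sketch only illustrates why nullable fields still belong in `required` when `additionalProperties` is `false`.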
3. Create your prompt template

Create a template that instructs the model to extract the information you need.
system_template.minijinja
You are a helpful AI assistant that extracts customer information from messages.

Extract the customer's name and email address if present. Use null for any fields that are not found.

Your output should be a JSON object with the following schema:

{
  "name": string or null,
  "email": string or null
}

---

Examples:

User: Hi, I'm Sarah Johnson and you can reach me at [email protected]
Assistant: {"name": "Sarah Johnson", "email": "[email protected]"}

User: My email is [email protected]
Assistant: {"name": null, "email": "[email protected]"}

User: This is John Doe reaching out
Assistant: {"name": "John Doe", "email": null}
Including examples in your prompt helps the model understand the expected output format and improves accuracy.
4. Call the function

When using the TensorZero SDK, the response will include raw and parsed values. The parsed field contains the validated JSON object. If the output doesn’t match the schema or isn’t valid JSON, parsed will be None and you can fall back to the raw string output.
from tensorzero import TensorZeroGateway

t0 = TensorZeroGateway.build_http(gateway_url="http://localhost:3000")

response = t0.inference(
    function_name="extract_data",
    input={
        "messages": [
            {
                "role": "user",
                "content": "Hi, I'm Sarah Johnson and you can reach me at [email protected]",
            }
        ]
    },
)
Sample response:
JsonInferenceResponse(
    inference_id=UUID('019a78dc-0045-79e2-9629-cbcd47674abe'),
    episode_id=UUID('019a78dc-0045-79e2-9629-cbdaf9d830bd'),
    variant_name='baseline',
    output=JsonInferenceOutput(
        raw='{"name":"Sarah Johnson","email":"[email protected]"}',
        parsed={'name': 'Sarah Johnson', 'email': '[email protected]'}
    ),
    usage=Usage(input_tokens=252, output_tokens=26),
    finish_reason=<FinishReason.STOP: 'stop'>,
    original_response=None
)

Generate structured outputs with a dynamic schema

While we recommend specifying a fixed schema in the configuration whenever possible, you can provide the output schema dynamically at inference time if your use case demands it. See output_schema in the Inference API Reference or response_format in the Inference (OpenAI) API Reference. You can also override json_mode at inference time if necessary.
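For example, a dynamic schema might look like the following sketch, assuming the Python SDK's `inference` method accepts an `output_schema` argument as described in the Inference API Reference (the phone-extraction schema itself is hypothetical):

```python
from tensorzero import TensorZeroGateway

t0 = TensorZeroGateway.build_http(gateway_url="http://localhost:3000")

# Override the configured schema for this request only.
response = t0.inference(
    function_name="extract_data",
    input={
        "messages": [
            {"role": "user", "content": "You can call me at 555-0123"}
        ]
    },
    output_schema={
        "type": "object",
        "properties": {
            "phone": {"type": ["string", "null"]},
        },
        "required": ["phone"],
        "additionalProperties": False,
    },
)
```

Note that a dynamically supplied schema bypasses the one in your configuration for that request, so downstream consumers must be prepared for the different shape.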

Set json_mode at inference time

You can set json_mode for a particular request by adding params to the request body. This value takes precedence over any default behaviors or json_mode in the configuration.
response = t0.inference(
    # ...
    params={
        "chat_completion": {
            "json_mode": "strict",  # or: "tool", "on", "off"
        }
    },
    # ...
)
See the Inference API Reference for more details.
Dynamic inference parameters like json_mode apply to specific variant types. Unless you’re using an advanced variant type, the variant type will be chat_completion.

Handle model provider limitations

Anthropic

Anthropic supports native structured outputs through their beta API. To use this feature with TensorZero, enable beta_structured_outputs = true in your Anthropic provider configuration and set json_mode = "strict". Alternatively, you can use extra_headers.
tensorzero.toml
[models.claude_structured]
routing = ["anthropic"]

[models.claude_structured.providers.anthropic]
type = "anthropic"
model_name = "claude-sonnet-4-5-20250929"
beta_structured_outputs = true

Gemini (GCP Vertex AI, Google AI Studio)

GCP Vertex AI Gemini and Google AI Studio support structured outputs, but only for a subset of the JSON Schema specification. TensorZero automatically handles some known limitations, but certain output schemas will still be rejected by the model provider. Refer to the Google documentation for details on supported JSON Schema features.

Lack of native support (e.g. AWS Bedrock)

Some model providers (e.g. OpenAI, Google) support strictly enforcing output schemas natively, but others (e.g. AWS Bedrock) do not. For providers without native support, you can still generate structured outputs with json_mode = "tool". TensorZero converts your output schema into a tool call, then transforms the tool response back into JSON output. You can set json_mode = "tool" in your configuration file or at inference time.
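As a sketch, a Bedrock-backed variant might be configured like this (the `model_id` is only an example; check the AWS Bedrock model catalog for the identifiers available in your region):

```toml
[models.claude_bedrock]
routing = ["aws_bedrock"]

[models.claude_bedrock.providers.aws_bedrock]
type = "aws_bedrock"
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # example ID

[functions.extract_data.variants.bedrock]
type = "chat_completion"
model = "claude_bedrock"
json_mode = "tool"  # emulate strict structured outputs via tool use
```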