This page shows how to:
  • Send images and files to any LLM. TensorZero supports multimodal inputs like images, PDFs, and more across all major providers.
  • Use remote file URLs or base64-encoded data. You can reference files hosted online or embed them directly in requests.
  • Configure file storage for observability. Store files in S3-compatible object storage, the local filesystem, or disable storage entirely.
You can also find the runnable code for this example on GitHub.
1

Configure file storage

TensorZero can store files used during multimodal inference for observability and other downstream workflows. You can configure the object storage service in the object_storage section of the configuration file.

For production, we recommend using an S3-compatible object storage service. For local development, you can also use the filesystem. If you don't need to store files, you can disable object storage entirely.

See Configuration Reference for more details.
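As a sketch, an object_storage section pointing at the MinIO service from the Docker Compose file below might look like the following. The type value and field names here are assumptions for illustration; check the Configuration Reference for the exact schema.

```toml
# Illustrative object_storage section for tensorzero.toml.
# Endpoint and bucket match the MinIO service defined in the
# Docker Compose example below; credentials are supplied via the
# S3_ACCESS_KEY_ID / S3_SECRET_ACCESS_KEY environment variables.
[object_storage]
type = "s3_compatible"
endpoint = "http://minio:9000"
bucket_name = "tensorzero"
```

For local development without S3, the filesystem can be used instead, or object storage can be disabled entirely, as described above.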
2

Deploy TensorZero

We’ll use Docker Compose to deploy the TensorZero Gateway, ClickHouse, and MinIO (an open-source S3-compatible object storage service).
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  clickhouse:
    image: clickhouse:lts
    environment:
      CLICKHOUSE_USER: chuser
      CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
      CLICKHOUSE_PASSWORD: chpassword
    ports:
      - "8123:8123"
    volumes:
      - clickhouse-data:/var/lib/clickhouse
    healthcheck:
      test: wget --spider --tries 1 http://chuser:chpassword@clickhouse:8123/ping
      start_period: 30s
      start_interval: 1s
      timeout: 1s

  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount our tensorzero.toml file into the container
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY:?Environment variable OPENAI_API_KEY must be set.}
      S3_ACCESS_KEY_ID: miniouser
      S3_SECRET_ACCESS_KEY: miniopassword
      TENSORZERO_CLICKHOUSE_URL: http://chuser:chpassword@clickhouse:8123/tensorzero
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      clickhouse:
        condition: service_healthy
      minio:
        condition: service_healthy

  # For a production deployment, you can use AWS S3, GCP Cloud Storage, Cloudflare R2, etc.
  minio:
    image: bitnamilegacy/minio:2025.7.23
    ports:
      - "9000:9000" # API port
      - "9001:9001" # Console port
    environment:
      MINIO_ROOT_USER: miniouser
      MINIO_ROOT_PASSWORD: miniopassword
      MINIO_DEFAULT_BUCKETS: tensorzero
    healthcheck:
      test: "mc ls local/tensorzero || exit 1"
      start_period: 30s
      start_interval: 1s
      timeout: 1s

volumes:
  clickhouse-data:
See Deploy the TensorZero Gateway and Deploy ClickHouse for production deployment instructions.
3

Call LLMs with file inputs

The TensorZero Gateway accepts both embedded files (encoded as base64 strings) and remote files (specified by a URL).
You can use the OpenAI Python SDK to send images to the TensorZero Gateway.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Do the images share any common features?",
                },
                # Remote image of Ferris the crab
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/tensorzero/tensorzero/eac2a230d4a4db1ea09e9c876e45bdb23a300364/tensorzero-core/tests/e2e/providers/ferris.png",
                    },
                },
                # One-pixel orange image encoded as a base64 string
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAA1JREFUGFdj+O/P8B8ABe0CTsv8mHgAAAAASUVORK5CYII=",
                    },
                },
            ],
        }
    ],
)

print(response)
See Integrations for a list of providers that support multimodal inference.

Advanced

Tune the image detail level

When working with image files, you can optionally specify a detail parameter to control the fidelity of image processing. This parameter accepts three values: low, high, or auto. The detail parameter only applies to image files and is ignored for other file types like PDFs or audio files. Using low detail reduces token consumption and processing time at the cost of image quality, while high detail provides better image quality but consumes more tokens. The auto setting allows the model provider to automatically choose the appropriate detail level based on the image characteristics.
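With the OpenAI-compatible endpoint shown earlier, the detail level is set on an image content part, following the OpenAI chat format. This is a minimal sketch of such a part; it can be dropped into the content list of the user message in the example above:

```python
# Image content part with an explicit detail level, in the same
# OpenAI-style chat format used in the inference example above.
image_part = {
    "type": "image_url",
    "image_url": {
        "url": "https://raw.githubusercontent.com/tensorzero/tensorzero/eac2a230d4a4db1ea09e9c876e45bdb23a300364/tensorzero-core/tests/e2e/providers/ferris.png",
        "detail": "low",  # one of "low", "high", or "auto"
    },
}

print(image_part["image_url"]["detail"])  # → low
```

Using "low" here trades image fidelity for fewer tokens; non-image content parts (e.g. PDFs) ignore the field entirely.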

Ensure reproducibility by fetching input files before inference

By default, the TensorZero Gateway forwards remote file URLs directly to the model provider and fetches them separately for observability, in parallel with inference. This means that in rare cases, the file the model provider fetches may differ from the one TensorZero stores (e.g. if the file at the URL changes between the two fetches). To ensure that TensorZero and the model provider see identical inputs, you can set gateway.fetch_and_encode_input_files_before_inference = true in your configuration. When enabled, the gateway will fetch remote input files and send them as base64-encoded payloads in the prompt. This is recommended if you require strict observability and reproducibility. See Configuration Reference for more details.
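In tensorzero.toml, that setting lives under the gateway section:

```toml
[gateway]
fetch_and_encode_input_files_before_inference = true
```

Note that this adds a fetch on the critical path of inference for remote file URLs, so expect slightly higher latency in exchange for guaranteed input consistency.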