- Send images and files to any LLM. TensorZero supports multimodal inputs like images, PDFs, and more across all major providers.
- Use remote file URLs or base64-encoded data. You can reference files hosted online or embed them directly in requests.
- Configure file storage for observability. Store files in S3-compatible object storage, the local filesystem, or disable storage entirely.
Configure file storage
TensorZero can store files used during multimodal inference for observability and other downstream workflows. You can configure the object storage service in the `object_storage` section of the configuration file. See Configuration Reference for more details.
For production, we recommend using an S3-compatible object storage service.
For local development, you can also use the filesystem.
If you don’t need to store files, you can disable object storage entirely.
TensorZero supports any S3-compatible object storage service, including AWS S3, GCP Cloud Storage, Cloudflare R2, and many more. The TensorZero Gateway will attempt to retrieve credentials from the following sources, in order of priority:
- `S3_ACCESS_KEY_ID` and `S3_SECRET_ACCESS_KEY` environment variables
- `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables
- Credentials from the AWS SDK (default profile)
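As a sketch, an `object_storage` block for an S3-compatible bucket might look like the following. The endpoint and bucket name are placeholders; consult the Configuration Reference for the canonical field names and values.

```toml
# tensorzero.toml (sketch; see the Configuration Reference for exact fields)
[object_storage]
type = "s3_compatible"          # other modes: filesystem storage, or disabled
endpoint = "http://minio:9000"  # placeholder: a local MinIO endpoint
bucket_name = "tensorzero"      # placeholder bucket name
```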
Deploy TensorZero
We’ll use Docker Compose to deploy the TensorZero Gateway, ClickHouse, and MinIO (an open-source S3-compatible object storage service).
See Deploy the TensorZero Gateway and Deploy ClickHouse for production deployment instructions.
docker-compose.yml
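A minimal `docker-compose.yml` along these lines might look like the sketch below. The image tags, ports, and credentials are illustrative placeholders, not the official example; adjust them to your setup.

```yaml
# Sketch: TensorZero Gateway + ClickHouse + MinIO (placeholder credentials)
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    environment:
      CLICKHOUSE_USER: chuser            # placeholder
      CLICKHOUSE_PASSWORD: chpassword    # placeholder

  minio:
    image: minio/minio
    command: server /data
    environment:
      MINIO_ROOT_USER: miniouser         # placeholder
      MINIO_ROOT_PASSWORD: miniopassword # placeholder

  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro          # mount your tensorzero.toml here
    environment:
      TENSORZERO_CLICKHOUSE_URL: http://chuser:chpassword@clickhouse:8123/tensorzero
      S3_ACCESS_KEY_ID: miniouser
      S3_SECRET_ACCESS_KEY: miniopassword
    ports:
      - "3000:3000"
    depends_on:
      - clickhouse
      - minio
```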
Call LLMs with file inputs
The TensorZero Gateway accepts both embedded files (encoded as base64 strings) and remote files (specified by a URL). See Integrations for a list of providers that support multimodal inference.
You can call the TensorZero Gateway with the OpenAI Python SDK, the OpenAI Node SDK, or directly over HTTP. For example, you can use the OpenAI Python SDK to send images to the TensorZero Gateway.
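As a sketch of the request shape, an embedded image can be sent as a base64 data URL inside an OpenAI-style chat message. The helper name, prompt text, and image bytes below are placeholders for illustration:

```python
import base64

def build_image_message(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build an OpenAI-style chat message embedding an image as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            # Embedded file: a data URL carrying the base64-encoded bytes
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }

# Placeholder bytes; in practice, read them from an image file on disk.
message = build_image_message(b"fake image bytes")
```

You would then pass `[message]` as the `messages` list to an OpenAI client pointed at the gateway's OpenAI-compatible endpoint; to use a remote file instead, put its URL directly in the `image_url` object rather than a data URL.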
Advanced
Tune the image detail level
When working with image files, you can optionally specify a `detail` parameter to control the fidelity of image processing.
This parameter accepts three values: `low`, `high`, or `auto`.
The detail parameter only applies to image files and is ignored for other file types like PDFs or audio files.
Using low detail reduces token consumption and processing time at the cost of image quality, while high detail provides better image quality but consumes more tokens.
The auto setting allows the model provider to automatically choose the appropriate detail level based on the image characteristics.
Ensure reproducibility by fetching input files before inference
By default, the TensorZero Gateway forwards remote file URLs directly to the model provider and fetches them separately for observability, in parallel with inference. This means that in rare cases, the file the model provider fetches may differ from the one TensorZero stores (e.g. if the file at the URL changes between the two fetches). To ensure that TensorZero and the model provider see identical inputs, you can set `gateway.fetch_and_encode_input_files_before_inference = true` in your configuration.
When enabled, the gateway will fetch remote input files and send them as base64-encoded payloads in the prompt.
This is recommended if you require strict observability and reproducibility.
See Configuration Reference for more details.
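In the configuration file, that amounts to the following (a sketch; see the Configuration Reference for the canonical form):

```toml
# tensorzero.toml
[gateway]
fetch_and_encode_input_files_before_inference = true
```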