This guide shows how to set up a minimal deployment to use the TensorZero Gateway with the AWS SageMaker API. The AWS SageMaker model provider is a wrapper around other TensorZero model providers that handles AWS SageMaker-specific logic (e.g. authentication). For example, you can use it to infer self-hosted model providers like Ollama and TGI deployed on AWS SageMaker.
Setup
For this minimal setup, you'll need just two files in your project directory: config/tensorzero.toml and docker-compose.yml.
Configuration
Create a minimal configuration file (config/tensorzero.toml) that defines a model and a simple chat function.
The hosted_provider field specifies the model provider that you deployed on AWS SageMaker.
For example, Ollama is OpenAI-compatible, so we use openai as the hosted provider.
Alternatively, you can use hosted_provider = "tgi" if you deployed TGI instead.
You can specify the endpoint's region explicitly, or use region = "sdk" to auto-detect the region with the AWS SDK.
If you're using AWS China regions (cn-north-1, cn-northwest-1) or AWS GovCloud, you must also specify the endpoint_url field, since these partitions use different DNS suffixes.
For example: endpoint_url = "https://runtime.sagemaker.cn-north-1.amazonaws.com.cn"
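Putting the fields above together, a config/tensorzero.toml along these lines should work. This is a hedged sketch: the names my_model, my_function, and the endpoint name are placeholders, and the exact field names should be verified against the Configuration Reference.

```toml
# Sketch of config/tensorzero.toml — model, function, and endpoint names
# below are illustrative placeholders; replace them with your own.

[models.my_model]
routing = ["sagemaker"]

[models.my_model.providers.sagemaker]
type = "aws_sagemaker"
endpoint_name = "my-sagemaker-endpoint"  # your SageMaker endpoint name
model_name = "llama3"                    # model name as the hosted provider expects it
hosted_provider = "openai"               # e.g. Ollama exposes an OpenAI-compatible API
region = "us-east-1"                     # or region = "sdk" to auto-detect

[functions.my_function]
type = "chat"

[functions.my_function.variants.baseline]
type = "chat_completion"
model = "my_model"
```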
Credentials
Make sure that the gateway has the necessary permissions to access AWS SageMaker. By default, the TensorZero Gateway uses the AWS SDK to retrieve the relevant credentials. The simplest way is to set the standard AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION) before running the gateway. Alternatively, you can supply credentials dynamically at inference time with the dynamic:: prefix (e.g. access_key_id = "dynamic::aws_access_key_id").
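For example, you can export the standard AWS SDK environment variables in the shell that launches the gateway. The values below are placeholders; substitute your own credentials and region.

```shell
# Placeholder credentials — replace with your own before running the gateway.
export AWS_ACCESS_KEY_ID="AKIA...EXAMPLE"
export AWS_SECRET_ACCESS_KEY="example-secret-access-key"
export AWS_REGION="us-east-1"
```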
See the Credential Management guide and Configuration Reference for more details on authentication options.
Deployment (Docker Compose)
Create a minimal Docker Compose configuration (docker-compose.yml).
You can then start the TensorZero Gateway with docker compose up.
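A minimal docker-compose.yml might look like the following. This is a sketch, not a definitive configuration: the image tag, config mount path, and port follow TensorZero's published examples, but you should verify them against the current documentation for your gateway version.

```yaml
# Sketch of a minimal docker-compose.yml — verify image tag, paths, and
# ports against the current TensorZero documentation.
services:
  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount the config directory created above (read-only).
      - ./config:/app/config:ro
    environment:
      # Pass AWS credentials through from the host environment.
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:?}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:?}
      - AWS_REGION=${AWS_REGION:-us-east-1}
    ports:
      - "3000:3000"
```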