Setup
For this minimal setup, you’ll need just two files in your project directory: `config/tensorzero.toml` and `docker-compose.yml`. You can also find the complete code for this example on GitHub.
Configuration
Create a minimal configuration file that defines a model and a simple chat function:

config/tensorzero.toml
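The exact fields depend on your deployment; the following is a minimal sketch in which the endpoint name, model name, and function name are placeholder assumptions:

```toml
# A model backed by an Ollama deployment on AWS SageMaker.
[models.my_model]
routing = ["aws_sagemaker"]

[models.my_model.providers.aws_sagemaker]
type = "aws_sagemaker"
endpoint_name = "my-sagemaker-endpoint"  # hypothetical: the SageMaker endpoint you deployed
model_name = "gemma3:1b"                 # hypothetical: the model served by the endpoint
hosted_provider = "openai"               # Ollama is OpenAI-compatible
region = "us-east-1"

# A simple chat function with a single variant that uses the model above.
[functions.my_function]
type = "chat"

[functions.my_function.variants.my_variant]
type = "chat_completion"
model = "my_model"
```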
The `hosted_provider` field specifies the model provider that you deployed on AWS SageMaker. For example, Ollama is OpenAI-compatible, so we use `openai` as the hosted provider. Alternatively, you can use `hosted_provider = "tgi"` if you deployed TGI instead.

You can specify the endpoint's `region` explicitly, or use `allow_auto_detect_region = true` to infer the region with the AWS SDK.

See the Configuration Reference for optional fields. The relevant fields will depend on the `hosted_provider`.
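For instance, a sketch of the TGI alternative with region auto-detection (the endpoint name is again a hypothetical placeholder):

```toml
[models.my_model.providers.aws_sagemaker]
type = "aws_sagemaker"
endpoint_name = "my-sagemaker-endpoint"  # hypothetical endpoint name
hosted_provider = "tgi"                  # if you deployed TGI instead of Ollama
allow_auto_detect_region = true          # infer the region with the AWS SDK
```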
Credentials
You must make sure that the gateway has the necessary permissions to access AWS SageMaker. The TensorZero Gateway will use the AWS SDK to retrieve the relevant credentials. The simplest way is to set the following environment variables before running the gateway:

- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
- `AWS_REGION`

Deployment (Docker Compose)
Create a minimal Docker Compose configuration:

docker-compose.yml
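A minimal sketch along the lines of other TensorZero examples; verify the image tag and paths against your setup:

```yaml
services:
  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount the configuration directory created above (read-only).
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      # Forward the AWS credentials from the host environment.
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:?Environment variable AWS_ACCESS_KEY_ID must be set.}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:?Environment variable AWS_SECRET_ACCESS_KEY must be set.}
      - AWS_REGION=${AWS_REGION:-us-east-1}
    ports:
      - "3000:3000" # the gateway listens on port 3000 by default
```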
You can start the gateway with `docker compose up`.
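Once the gateway is running, you can sanity-check the setup with a test request to its inference endpoint (using the hypothetical function name `my_function` from the configuration sketch above):

```bash
curl http://localhost:3000/inference \
  -H "Content-Type: application/json" \
  -d '{
    "function_name": "my_function",
    "input": {
      "messages": [
        {"role": "user", "content": "What is the capital of Japan?"}
      ]
    }
  }'
```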