Learn how to set up TensorZero API keys to authenticate your inference requests and manage access control for your workflows.
You can create TensorZero API keys to authenticate your requests to the TensorZero Gateway.
This way, your clients don’t need access to model provider credentials, making it easier to manage access and security.

This page shows how to:
- Create API keys for the TensorZero Gateway
- Require clients to use these API keys for requests
- Manage and disable API keys
TensorZero supports authentication for the gateway.
Authentication for the UI is coming soon.
In the meantime, we recommend pairing the UI with complementary products like Nginx, OAuth2 Proxy, or Tailscale.
You can create API keys using the TensorZero UI.
If you’re running a standard local deployment, visit http://localhost:4000/api-keys to create a key.

Alternatively, you can create API keys programmatically in the CLI using the gateway binary with the --create-api-key flag.
For example:
```bash
docker compose run --rm gateway --create-api-key
```
The API key is a secret and should be kept secure.
Once you’ve created an API key, set the TENSORZERO_API_KEY environment variable.
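For example, in your shell (the key value below is a placeholder following the format described later on this page):

```bash
# Placeholder value; use the key you created in the previous step
export TENSORZERO_API_KEY="sk-t0-xxxxxxxxxxxx-yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy"
```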
Make an authenticated inference request
Python (TensorZero SDK)
You can make authenticated requests by setting the api_key parameter in your TensorZero client:
tensorzero_sdk.py
```python
import os

from tensorzero import TensorZeroGateway

t0 = TensorZeroGateway.build_http(
    api_key=os.environ["TENSORZERO_API_KEY"],
    gateway_url="http://localhost:3000",
)

response = t0.inference(
    model_name="openai::gpt-5-mini",
    input={
        "messages": [
            {
                "role": "user",
                "content": "Tell me a fun fact.",
            }
        ]
    },
)

print(response)
```
The client will automatically read the TENSORZERO_API_KEY environment variable if you don’t set api_key.
Authentication is not supported in the embedded (in-memory) gateway in Python.
Please use the HTTP client with a standalone gateway to make authenticated requests.
Python (OpenAI SDK)

You can make authenticated requests by setting the api_key parameter in your OpenAI client:
openai_sdk.py
```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TENSORZERO_API_KEY"],
    base_url="http://localhost:3000/openai/v1",
)

response = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-5-mini",
    messages=[
        {
            "role": "user",
            "content": "Tell me a fun fact.",
        }
    ],
)

print(response)
```
Node (OpenAI SDK)

You can make authenticated requests by setting the apiKey parameter in your OpenAI client:
openai_sdk.ts
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TENSORZERO_API_KEY,
  baseURL: "http://localhost:3000/openai/v1",
});

const response = await client.chat.completions.create({
  model: "tensorzero::model_name::openai::gpt-5-mini",
  messages: [
    {
      role: "user",
      content: "Tell me a fun fact.",
    },
  ],
});
```
HTTP

You can make authenticated requests by setting the Authorization HTTP header to Bearer <API_KEY>:
curl.sh
```bash
curl -X POST http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TENSORZERO_API_KEY}" \
  -d '{
    "model": "tensorzero::model_name::openai::gpt-5-mini",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a fun fact."
      }
    ]
  }'
```
Manage API keys
You can manage and disable API keys in the TensorZero UI.
If you’re running a standard local deployment, visit http://localhost:4000/api-keys to manage your keys.

Alternatively, you can disable API keys programmatically in the CLI using the gateway binary with the --disable-api-key flag.
Pass the public ID of the key you want to disable (the 12-character portion after sk-t0-).
For example:
```bash
docker compose run --rm gateway --disable-api-key xxxxxxxxxxxx
```
Once you have authentication enabled, you can apply rate limits on a per-API-key basis using the api_key_public_id scope in your rate limiting rules.
This allows you to enforce different usage limits for different API keys, which is useful for implementing tiered access or preventing individual keys from consuming too many resources.
TensorZero API keys have the following format:

`sk-t0-xxxxxxxxxxxx-yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy`

The xxxxxxxxxxxx portion is the 12-character public ID that you can use in rate limiting rules.
The remaining portion of the key is secret and should be kept secure.
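If you have a key stored in TENSORZERO_API_KEY and want to recover its public ID, a minimal shell sketch (assuming the format above, where the public ID is the third dash-delimited field) is:

```bash
# Prints the 12-character public ID for use in rate limiting rules;
# assumes the sk-t0-<public_id>-<secret> format shown above
echo "$TENSORZERO_API_KEY" | cut -d '-' -f 3
```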
For example, you can limit each API key to 100 model inferences per hour, but allow a specific API key to make 1000 inferences:
```toml
# Each API key can make up to 100 model inferences per hour
[[rate_limiting.rules]]
priority = 0
model_inferences_per_hour = 100
scope = [
  { api_key_public_id = "tensorzero::each" }
]

# But override the limit for a specific API key
[[rate_limiting.rules]]
priority = 1
model_inferences_per_hour = 1000
scope = [
  { api_key_public_id = "xxxxxxxxxxxx" }
]
```
Centralize auth across multiple TensorZero deployments
If you have multiple TensorZero deployments (e.g. one per team), you can centralize auth using gateway relay.

With gateway relay, an LLM inference request can be routed through multiple independent TensorZero Gateway deployments before reaching a model provider.
This enables you to enforce organization-wide controls without restricting how teams build their LLM features.

See Centralize auth, rate limits, and more for details.