You can also find the runnable code for this example on GitHub.
Feedback
TensorZero currently supports the following types of feedback:

Feedback Type | Examples |
---|---|
Boolean Metric | Thumbs up, task success |
Float Metric | Star rating, clicks, number of mistakes made |
Comment | Natural-language feedback from users or developers |
Demonstration | Edited drafts, labels, human-generated content |
You can send feedback to the gateway through the `/feedback` endpoint.
Metrics
You can define metrics in your `tensorzero.toml` configuration file.
The skeleton of a metric looks like the following configuration entry.
tensorzero.toml
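As a rough sketch of that skeleton, each metric can be thought of as a named entry with three fields. The Python snippet below lists the values this guide relies on and checks a candidate entry against them. `ALLOWED` and `validate_metric` are hypothetical helpers for illustration, not part of TensorZero, and the TOML in the leading comment shows the shape such an entry is assumed to take.

```python
# Assumed shape of a metric entry in tensorzero.toml:
#
#   [metrics.my_metric_name]
#   type = "boolean"      # or "float"
#   level = "inference"   # or "episode"
#   optimize = "max"      # or "min"

# Allowed values per field (an assumption based on the options this guide describes).
ALLOWED = {
    "type": {"boolean", "float"},
    "level": {"inference", "episode"},
    "optimize": {"min", "max"},
}


def validate_metric(entry: dict) -> list[str]:
    """Return a list of problems found in a metric entry (empty if it looks valid)."""
    problems = []
    for field, allowed in ALLOWED.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif entry[field] not in allowed:
            problems.append(f"invalid {field}: {entry[field]!r}")
    return problems


print(validate_metric({"type": "boolean", "level": "inference", "optimize": "max"}))
print(validate_metric({"type": "integer", "level": "inference"}))
```

The first call reports no problems; the second flags the unknown type and the missing `optimize` field.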
Comments and demonstrations are available by default and don’t need to be configured.
Example: Rating Haikus
In the Quickstart, we built a simple LLM application that writes haikus about artificial intelligence. Imagine we wanted to assign 👍 or 👎 to these haikus. Later, we can use this data to fine-tune a model using only haikus that match our tastes. We should use a metric of type `boolean` to capture this behavior since we’re optimizing for a binary outcome: whether we liked the haikus or not.
The metric applies to individual inference requests, so we’ll set `level = "inference"`.
And finally, we’ll set `optimize = "max"` because we want to maximize this metric.
Our metric configuration should look like this:
tensorzero.toml
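Putting the three choices together, the entry can be sketched as follows. The snippet only renders TOML text from the values chosen above. The metric name `haiku_rating` and the helper `render_metric` are illustrative assumptions, so substitute whatever name your configuration uses.

```python
def render_metric(name: str, metric_type: str, level: str, optimize: str) -> str:
    """Render a metric entry as TOML text (illustrative helper, not a TensorZero API)."""
    return (
        f"[metrics.{name}]\n"
        f'type = "{metric_type}"\n'
        f'level = "{level}"\n'
        f'optimize = "{optimize}"\n'
    )


# "haiku_rating" is a hypothetical metric name used for this example.
HAIKU_METRIC = render_metric("haiku_rating", "boolean", "inference", "max")
print(HAIKU_METRIC)
```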
Full Configuration
tensorzero.toml
To link the feedback to the inference it rates, we use the `inference_id` returned by the first API call.
run.py
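A minimal sketch of the feedback call, written against the Python standard library rather than the TensorZero client. It assumes the gateway listens at `http://localhost:3000`, that `/feedback` accepts a JSON body with `metric_name`, `inference_id`, and `value`, and that the metric is named `haiku_rating`. The helper names are hypothetical.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:3000"  # assumed local gateway address


def build_feedback_request(metric_name: str, inference_id: str, value) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /feedback endpoint."""
    body = {
        "metric_name": metric_name,
        "inference_id": inference_id,  # links the feedback to the earlier inference
        "value": value,
    }
    return urllib.request.Request(
        f"{GATEWAY_URL}/feedback",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_feedback(metric_name: str, inference_id: str, value) -> dict:
    """Send the feedback and return the gateway's parsed JSON response."""
    request = build_feedback_request(metric_name, inference_id, value)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# A thumbs-up for a haiku. The ID is a placeholder for the inference_id
# returned by the first API call.
request = build_feedback_request("haiku_rating", "<inference_id>", True)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Calling `send_feedback` with a real `inference_id` would submit the rating to a running gateway; the example above only builds and prints the request so it can be inspected offline.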
Sample Output
Demonstrations
Demonstrations are a special type of feedback that represent the ideal output for an inference. For example, you can use demonstrations to provide corrections from human review, labels for supervised learning, or other ground truth data that represents the ideal output. You can assign demonstrations to an inference using the special metric name `demonstration`.
You can’t assign demonstrations to an episode.
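As a sketch, a demonstration can be expressed as a feedback payload whose metric name is `demonstration`. The field names follow an assumed `/feedback` JSON shape (`metric_name`, `inference_id`, `value`), the helper `demonstration_payload` is hypothetical, and the plain-string value assumes a function that returns text.

```python
import json


def demonstration_payload(inference_id: str, ideal_output: str) -> dict:
    """Build a feedback payload that records the ideal output for one inference."""
    if not inference_id:
        # Demonstrations attach to an inference, never to an episode.
        raise ValueError("a demonstration requires an inference_id")
    return {
        "metric_name": "demonstration",  # special built-in metric name
        "inference_id": inference_id,
        "value": ideal_output,
    }


payload = demonstration_payload(
    "<inference_id>",  # placeholder for a real inference ID
    "An edited haiku would go here",  # made-up example text
)
print(json.dumps(payload, indent=2))
```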
Comments
You can assign natural-language feedback to an inference or episode using the special metric name `comment`.
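Since a comment can target either an inference or an episode, a payload builder might accept exactly one of the two IDs. This is a sketch under assumptions: the `inference_id` and `episode_id` field names reflect an assumed `/feedback` JSON shape, and `comment_payload` is a hypothetical helper.

```python
from typing import Optional


def comment_payload(
    text: str,
    inference_id: Optional[str] = None,
    episode_id: Optional[str] = None,
) -> dict:
    """Build a comment payload targeting exactly one inference or episode."""
    if (inference_id is None) == (episode_id is None):
        raise ValueError("provide exactly one of inference_id or episode_id")
    if inference_id is not None:
        target = {"inference_id": inference_id}
    else:
        target = {"episode_id": episode_id}
    return {"metric_name": "comment", "value": text, **target}


# Placeholder IDs; real values come from earlier gateway responses.
print(comment_payload("The second line has too many syllables.", inference_id="<inference_id>"))
print(comment_payload("The user seemed happy overall.", episode_id="<episode_id>"))
```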