A fast way to turn a behaviour spec into a small, local, open-weight model, with regression reports that measure each run.
New to fine-tuning?
The open community spec library gives you simple JSON behaviour specs to inspect, run, and adapt. Start with the email safety triage spec, then use the optional dataset, model, and eval notes when you want more context.
Tuned Tensor turns a behaviour spec into a tuned open-weight model, then uses regression reports to improve the next run.
01. Write the spec: rules, constraints, examples, and target base model.
02. Compile training data and tune a small open-weight model.
03. Score outputs, inspect failures, and compare against baselines.
04. Use AI feedback to improve the spec and start the next run.
Auto-tune can repeat the loop until the target score is reached or the iteration limit is hit.
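The loop and its two stopping conditions can be sketched in a few lines of Python. This is an illustrative sketch only, not Tuned Tensor's implementation: `auto_tune`, `RunResult`, the spec dictionary shape, and the stub functions with their scripted scores are all hypothetical names and values introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Spec = Dict[str, Any]  # hypothetical spec shape, not the product's schema


@dataclass
class RunResult:
    iteration: int
    score: float  # evaluation score in [0, 1]
    failures: List[str] = field(default_factory=list)


def auto_tune(
    spec: Spec,
    tune_and_eval: Callable[[Spec, int], RunResult],
    revise_spec: Callable[[Spec, RunResult], Spec],
    target_score: float,
    max_iterations: int,
) -> Tuple[Spec, List[RunResult]]:
    """Repeat tune -> evaluate -> revise until the target score or the limit."""
    history: List[RunResult] = []
    for iteration in range(1, max_iterations + 1):
        result = tune_and_eval(spec, iteration)
        history.append(result)
        if result.score >= target_score:
            break  # stopping condition 1: target score reached
        if iteration < max_iterations:
            spec = revise_spec(spec, result)  # feed failures into the next run
    # Falling out of the loop is stopping condition 2: iteration limit hit.
    return spec, history


# Stubs standing in for real tuning and evaluation (scores are made up).
SCRIPTED_SCORES = [0.55, 0.70, 0.91]


def stub_tune_and_eval(spec: Spec, iteration: int) -> RunResult:
    score = SCRIPTED_SCORES[min(iteration, len(SCRIPTED_SCORES)) - 1]
    failures = [] if score >= 0.9 else [f"failure seen in run {iteration}"]
    return RunResult(iteration, score, failures)


def stub_revise_spec(spec: Spec, result: RunResult) -> Spec:
    rules = list(spec["rules"]) + [f"rule added after run {result.iteration}"]
    return {**spec, "rules": rules}  # return a new spec, leave the old one intact


start_spec: Spec = {"base_model": "Qwen/Qwen3.5-2B", "rules": ["Classify each email."]}
final_spec, history = auto_tune(
    start_spec, stub_tune_and_eval, stub_revise_spec,
    target_score=0.9, max_iterations=5,
)
print([run.score for run in history])
```

Returning the full run history alongside the final spec keeps every intermediate score available for comparison, which is what makes a regression between runs visible.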
Use case story
We fine-tuned Qwen 3.5 2B to turn raw emails into structured category, priority, and next-action decisions, then measured the gains against the base model on validation and held-out test examples.
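As a rough illustration of what "structured" means here, the sketch below parses a model reply into the three decision fields and rejects anything incomplete. The field names, the JSON reply format, and the sample values are assumptions made for the example; the article's actual spec may use different ones.

```python
import json
from dataclasses import dataclass

# Assumed field names, taken from the three decisions named above.
REQUIRED_FIELDS = ("category", "priority", "next_action")


@dataclass(frozen=True)
class TriageDecision:
    category: str
    priority: str
    next_action: str


def parse_decision(model_output: str) -> TriageDecision:
    """Parse a JSON reply and require all three fields as non-empty strings."""
    data = json.loads(model_output)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [
        name for name in REQUIRED_FIELDS
        if not isinstance(data.get(name), str) or not data[name].strip()
    ]
    if missing:
        raise ValueError(f"missing or empty fields: {missing}")
    return TriageDecision(**{name: data[name].strip() for name in REQUIRED_FIELDS})


# Hypothetical model reply, for illustration only.
reply = '{"category": "billing", "priority": "high", "next_action": "escalate"}'
decision = parse_decision(reply)
print(decision)
```

A strict parser like this gives evaluation a clear pass or fail signal: a reply that is not valid JSON, or that omits a field, fails before any scoring of the content begins.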
Email triage, from spec to local model
A focused fine-tune measured against baseline behaviour.
Dataset: 10,000 labelled email rows
Validation pass rate: 57.5% (base) → 89.5% (fine-tuned), +32.0 pp
Test avg score: 0.537 (base) → 0.862 (fine-tuned), +0.325
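The gains are plain differences between the fine-tuned and base figures. A minimal sketch of the arithmetic follows; the helper names are hypothetical, and the four-item pass/fail list is a toy input, not the real validation set.

```python
from typing import Sequence


def pass_rate(passed: Sequence[bool]) -> float:
    """Share of examples that passed, as a percentage."""
    return 100.0 * sum(passed) / len(passed)


def gain(base: float, tuned: float, digits: int) -> float:
    """Absolute gain of the tuned model over the base model."""
    return round(tuned - base, digits)


# Toy pass/fail flags to show how a pass rate is formed.
toy_rate = pass_rate([True, True, False, True])

# Reported figures from the use case.
validation_gain_pp = gain(57.5, 89.5, digits=1)  # percentage points
test_score_gain = gain(0.537, 0.862, digits=3)   # average score

print(toy_rate, validation_gain_pp, test_score_gain)
```

The validation gain is stated in percentage points (pp) because it is a difference between two percentages, not a relative change.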
The article includes the behaviour spec, dataset shape, run command, evaluation results, lessons learned, and local serving notes.
Start from small open-weight models that are ready for managed LoRA fine-tuning.
google/gemma-4-E2B-it
google/gemma-4-E4B-it
Qwen/Qwen3.5-2B
Qwen/Qwen3.5-4B
meta-llama/Llama-3.2-3B-Instruct
microsoft/Phi-4-mini-instruct
ibm-granite/granite-3.3-2b-instruct
bigcode/starcoder2-3b
Explore the documentation to see how behaviour specs, runs, and evaluations work under the hood.