Turn behaviour specs into local models

The fastest way to turn a behaviour spec into a small, local, open-weight model, with regression reports to measure each run.

npm i -g @tuned-tensor/cli
tt init
tt runs start <spec-id>

Open-source CLI (MIT-licensed) on GitHub. An agent-readable workflow is available at /skill.md.

New to fine-tuning?

Start from a working spec

The open community spec library gives you simple JSON behaviour specs to inspect, run, and adapt. Start with the email safety triage spec, then use the optional dataset, model, and eval notes when you want more context.
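To make the idea of a JSON behaviour spec concrete, here is a minimal sketch of what one might bundle: rules, constraints, examples, and a target base model. Every field name, the spec name, and the example category labels below are illustrative assumptions, not the actual Tuned Tensor schema; check a real spec from the community library for the true shape.

```python
import json

# Hypothetical spec layout. The field names and values are assumptions
# made for illustration; they are NOT the real Tuned Tensor schema.
spec = {
    "name": "email-safety-triage",
    "base_model": "Qwen/Qwen3.5-2B",
    "rules": [
        "Classify every email into exactly one category.",
    ],
    "constraints": [
        "Respond with a single JSON object and nothing else.",
    ],
    "examples": [
        {
            "input": "Subject: Verify your account now or lose access",
            "output": {"category": "phishing", "priority": "high"},
        },
    ],
}

# Serialise to JSON so the spec can be stored in a file and versioned.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Keeping the spec as plain JSON means it can be diffed and versioned like any other source file.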

A closed loop from behaviour spec to model

Tuned Tensor turns a behaviour spec into a tuned open-weight model, then uses regression reports to improve the next run.

01

Write the behaviour spec

Rules, constraints, examples, and target base model.

02

Fine-tune a model sized to run locally

Compile training data and tune a small open-weight model.

03

Measure regressions

Score outputs, inspect failures, and compare against baselines.

04

Auto-tune loop

Use AI feedback to improve the spec and start the next run.

Each run preserves (versioned):
Spec version
Run report
AI feedback
Next iteration

Auto-tune can repeat the loop until the target score is reached or the iteration limit is hit.
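The stopping rule described above can be sketched as a simple loop. This is a minimal illustration, not Tuned Tensor's implementation: `auto_tune`, `run_once`, and the placeholder scores are hypothetical names and values invented for the example.

```python
from typing import Callable, List, Tuple


def auto_tune(
    run_once: Callable[[int], float],
    target_score: float,
    max_iterations: int,
) -> Tuple[List[float], str]:
    """Repeat runs until the target score is reached or the limit is hit.

    `run_once` is a hypothetical stand-in for "tune, evaluate, and apply
    feedback to the spec"; it returns the score for that iteration.
    """
    scores: List[float] = []
    for iteration in range(1, max_iterations + 1):
        score = run_once(iteration)
        scores.append(score)
        if score >= target_score:
            return scores, "target_reached"
    return scores, "iteration_limit"


# Placeholder scores, made up purely to exercise the loop.
placeholder_scores = [0.5, 0.6, 0.7, 0.8]


def fake_run(iteration: int) -> float:
    return placeholder_scores[iteration - 1]


history, reason = auto_tune(fake_run, target_score=0.75, max_iterations=4)
print(history, reason)  # [0.5, 0.6, 0.7, 0.8] target_reached

capped, capped_reason = auto_tune(fake_run, target_score=0.95, max_iterations=3)
print(capped, capped_reason)  # [0.5, 0.6, 0.7] iteration_limit
```

Returning the full score history alongside the stop reason keeps every iteration available for comparison, which mirrors the idea that each run is preserved.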

Use case story

Fine-tune a small model to triage email

We fine-tuned Qwen 3.5 2B to turn raw emails into structured category, priority, and next-action decisions, then measured the gains against the base model on validation and held-out test examples.

Email triage, from spec to local model

A focused fine-tune measured against baseline behaviour.

Dataset: 10,000 labelled email rows

Metric                 Base    Tuned   Lift
Validation pass rate   57.5%   89.5%   +32.0 pp
Test avg score         0.537   0.862   +0.325
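The lift figures are plain differences between the tuned and base values. As a quick check of the arithmetic, and to show how a percentage-point lift differs from a relative improvement, here is a short sketch; the helper name `lift` is introduced only for this example.

```python
def lift(base: float, tuned: float) -> float:
    """Absolute lift: tuned minus base, rounded to avoid float noise."""
    return round(tuned - base, 3)


# Validation pass rate, in percent: the lift is in percentage points (pp).
pass_rate_lift_pp = lift(57.5, 89.5)
# Test average score, on a 0 to 1 scale.
score_lift = lift(0.537, 0.862)

# Relative improvement of the pass rate over the base model, for contrast.
pass_rate_relative = pass_rate_lift_pp / 57.5

print(f"{pass_rate_lift_pp:+.1f} pp")   # +32.0 pp
print(f"{score_lift:+.3f}")             # +0.325
print(f"{pass_rate_relative:.1%}")      # 55.7%
```

A +32.0 pp lift means the pass rate rose by 32 points on the percentage scale, which is a relative gain of roughly 56% over the base model's 57.5%.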

The article includes the behaviour spec, dataset shape, run command, evaluation results, lessons learned, and local serving notes.

Supported base models

Start from small open-weight models that are ready for managed LoRA fine-tuning.

Which model to use
Org           Size   Model                     Model ID
google        E2B    gemma-4-E2B-it            google/gemma-4-E2B-it
google        E4B    gemma-4-E4B-it            google/gemma-4-E4B-it
Qwen          2B     Qwen3.5-2B                Qwen/Qwen3.5-2B
Qwen          4B     Qwen3.5-4B                Qwen/Qwen3.5-4B
meta-llama    3B     Llama-3.2-3B-Instruct     meta-llama/Llama-3.2-3B-Instruct
microsoft     3.8B   Phi-4-mini-instruct       microsoft/Phi-4-mini-instruct
ibm-granite   2B     granite-3.3-2b-instruct   ibm-granite/granite-3.3-2b-instruct
bigcode       3B     starcoder2-3b             bigcode/starcoder2-3b

Want to learn more?

Explore the documentation to see how behaviour specs, runs, and evaluations work under the hood.