Runs
A run is a single end-to-end cycle: compile your behaviour spec into training data, augment examples with AI, fine-tune the model, and auto-evaluate the result.
The Run Object
{
  "id": "e0b7694b-2c65-4199-89a1-fc54a6a6010c",
  "behavior_spec_id": "cafd8799-...",
  "run_number": 1,
  "status": "completed",
  "spec_snapshot": { ... },
  "dataset_id": "dc66546b-...",
  "fine_tune_job_id": "b3e2b918-...",
  "model_id": "96e9f0d9-...",
  "hyperparameters": {
    "augment": true,
    "n_epochs": 4,
    "lora_rank": 8,
    "lora_alpha": 16
  },
  "eval_summary": {
    "total": 5,
    "avg_score": 0.82,
    "pass_rate": 0.8,
    "scoring_method": "llm_judge",
    "regressions": 0,
    "improvements": 3
  },
  "started_at": "2026-03-06T10:30:00.000Z",
  "completed_at": "2026-03-06T10:57:50.000Z"
}
Run Lifecycle
| Status | Description |
|---|---|
| preparing | Compiling spec → augmenting examples → uploading to provider |
| training | Fine-tuning job running on Together AI |
| evaluating | Model being tested against the spec's examples |
| completed | Eval results available |
| failed | Error; check the error field |
| cancelled | Manually cancelled |
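The lifecycle above can be sketched in client code as a small polling helper. This is a minimal sketch under stated assumptions, not a description of how tt runs watch works: the names TERMINAL_STATUSES, is_terminal, and wait_for_run are hypothetical, and fetch_run stands in for whatever function you use to call GET /api/v1/runs/:id. Treating completed, failed, and cancelled as final is inferred from the table.

```python
import time
from typing import Callable

# Statuses from the lifecycle table that end a run (assumed to be final).
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def is_terminal(status: str) -> bool:
    """Return True once a run can no longer change status."""
    return status in TERMINAL_STATUSES


def wait_for_run(
    fetch_run: Callable[[], dict],
    poll_seconds: float = 10.0,
    max_polls: int = 360,
) -> dict:
    """Poll fetch_run() until the run reaches a terminal status.

    fetch_run is a hypothetical callable returning the run object as a dict.
    """
    for _ in range(max_polls):
        run = fetch_run()
        if is_terminal(run["status"]):
            return run
        time.sleep(poll_seconds)
    raise TimeoutError("run did not reach a terminal status in time")
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.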
Spec Snapshot
Every run captures a spec_snapshot — a frozen copy of the behaviour spec at run time. You can freely edit your spec between runs; each run preserves exactly what it trained on.
Eval Summary
| Field | Description |
|---|---|
| avg_score | Mean score across all examples (0–1) |
| pass_rate | Fraction of examples that passed (score ≥ 0.7) |
| exact_match_rate | Fraction of near-perfect scores (≥ 0.95) |
| avg_latency_ms | Mean inference latency per example, in milliseconds |
| scoring_method | llm_judge or similarity |
| regressions | Examples that scored ≥ 0.1 worse than in the previous run |
| improvements | Examples that scored ≥ 0.1 better than in the previous run |
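To make the thresholds in the table concrete, here is a minimal sketch of how such a summary could be derived from per-example scores. It is an illustration only, not the service's implementation: the function name summarize and its arguments are hypothetical, and it assumes the current and previous score lists cover the same examples in the same order.

```python
from typing import Optional

PASS_THRESHOLD = 0.7    # "passed" cut-off from the table
EXACT_THRESHOLD = 0.95  # "near-perfect" cut-off from the table
DELTA_THRESHOLD = 0.1   # minimum change counted as a regression or improvement


def summarize(scores: list[float], previous: Optional[list[float]] = None) -> dict:
    """Aggregate per-example scores (each 0-1) into summary fields."""
    if not scores:
        raise ValueError("scores must not be empty")
    total = len(scores)
    summary = {
        "total": total,
        "avg_score": sum(scores) / total,
        "pass_rate": sum(s >= PASS_THRESHOLD for s in scores) / total,
        "exact_match_rate": sum(s >= EXACT_THRESHOLD for s in scores) / total,
        "regressions": 0,
        "improvements": 0,
    }
    if previous is not None:
        # Assumes both lists hold the same examples in the same order.
        for new, old in zip(scores, previous):
            delta = round(new - old, 9)  # round away float noise near 0.1
            if delta >= DELTA_THRESHOLD:
                summary["improvements"] += 1
            elif delta <= -DELTA_THRESHOLD:
                summary["regressions"] += 1
    return summary
```

For example, summarize([1.0, 0.9, 0.8, 0.75, 0.5]) gives a pass_rate of 0.8, since four of the five scores are at or above 0.7.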
The recommended way to work with runs is the tt CLI — each endpoint below shows the tt command first, followed by the equivalent REST call.
Start a Run
POST /api/v1/behavior-specs/:id/runs
CLI
tt runs start <spec-id> --epochs 4 --lr 0.00002 --batch-size 8
Use tt runs watch <run-id> after starting to stream the run through preparing → training → evaluating → completed.
Equivalent REST call
curl -X POST https://api.tunedtensor.com/v1/behavior-specs/:id/runs \
  -H "Authorization: Bearer tt_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "augment": true,
    "hyperparameters": {
      "n_epochs": 4,
      "learning_rate": 0.00002,
      "lora_rank": 8,
      "lora_alpha": 16
    }
  }'
| Parameter | Default | Description |
|---|---|---|
| augment | true | Use AI to expand examples into a larger training set |
| hyperparameters.n_epochs | 4 | Number of training epochs (1–20) |
| hyperparameters.learning_rate | auto | Learning rate |
| hyperparameters.batch_size | 8 | Training batch size (min 8) |
| hyperparameters.lora_rank | 8 | LoRA adapter rank |
| hyperparameters.lora_alpha | 16 | LoRA alpha scaling factor |
Returns immediately with status preparing. Work happens asynchronously.
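For callers who prefer Python to curl, the request can be sketched with the standard library. This is a hedged illustration rather than an official client: build_start_run_request is a hypothetical helper, the base URL is copied from the curl example, and the range checks simply mirror the parameter table on the client side. The function only builds the request; it does not send it.

```python
import json
import urllib.request

API_BASE = "https://api.tunedtensor.com/v1"  # base URL taken from the curl examples


def build_start_run_request(
    spec_id: str,
    api_key: str,
    n_epochs: int = 4,
    batch_size: int = 8,
    augment: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a run."""
    # Client-side checks mirroring the documented ranges.
    if not 1 <= n_epochs <= 20:
        raise ValueError("n_epochs must be between 1 and 20")
    if batch_size < 8:
        raise ValueError("batch_size must be at least 8")
    body = {
        "augment": augment,
        "hyperparameters": {"n_epochs": n_epochs, "batch_size": batch_size},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/behavior-specs/{spec_id}/runs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the result with urllib.request.urlopen(request) should, per the text above, return a run whose status is preparing; follow it with a status check rather than waiting on the HTTP call.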
Cost & credits
Runs are charged from your prepaid credit balance. Cost is calculated as:
cost_cents = ceil(epochs × training_tokens × model_rate / 1_000_000)
The model rate is per 1M training tokens, per epoch; see the rate card. Tokens are counted by the training provider after tokenisation. At start time we reserve the estimated cost from your available credits, then debit the final provider-reported cost on successful completion. Failed and cancelled runs release the hold and are free.
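The formula above translates directly into code. In this sketch the function name estimate_cost_cents is hypothetical, and the rate is assumed to be expressed in cents per 1M training tokens per epoch, which is what the cost_cents result implies. The rate of 40 used in the example is made up for illustration; real rates come from the rate card.

```python
import math


def estimate_cost_cents(epochs: int, training_tokens: int, model_rate_cents: float) -> int:
    """Apply the documented formula.

    model_rate_cents is assumed to be cents per 1M training tokens, per epoch.
    """
    return math.ceil(epochs * training_tokens * model_rate_cents / 1_000_000)


# Illustrative numbers only: 4 epochs over 262,500 tokens at a made-up rate of 40.
estimate = estimate_cost_cents(epochs=4, training_tokens=262_500, model_rate_cents=40)
print(f"Estimated hold: ${estimate / 100:.2f}")
```

Because of the ceil, any non-zero amount of training rounds up to at least one cent. The final debit may differ from an estimate like this, since the provider reports the token count after tokenisation.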
If your available balance is too low at start time, the request returns 402 insufficient_credits. This can happen even when your total balance is positive, because credits held for active runs are not available:
{
  "error": {
    "code": "insufficient_credits",
    "message": "This run is estimated at $0.42 but you have $0.10 available.",
    "details": {
      "required_cents": 42,
      "available_cents": 10,
      "topup_url": "/dashboard/billing"
    }
  }
}
Top up at Dashboard → Billing or run tt topup. See Billing & credits for full pricing details.
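A client can use the details object in the error response shown earlier to tell the user how much to add. The following is a minimal sketch under the assumption that the response body has exactly the shape of that example; describe_credit_error is a hypothetical helper name.

```python
import json
from typing import Optional


def describe_credit_error(status_code: int, body: str) -> Optional[str]:
    """Return a top-up hint for a 402 insufficient_credits response, else None."""
    if status_code != 402:
        return None
    error = json.loads(body).get("error", {})
    if error.get("code") != "insufficient_credits":
        return None
    details = error.get("details", {})
    # Shortfall in cents between the estimated cost and what is available.
    shortfall = details["required_cents"] - details["available_cents"]
    return f"Top up at least ${shortfall / 100:.2f} at {details['topup_url']}"
```

With the example response, the shortfall is 42 − 10 = 32 cents, so the hint asks for at least $0.32.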
List Runs for a Spec
GET /api/v1/behavior-specs/:id/runs
CLI
tt runs list --spec <spec-id>
Equivalent REST call
curl https://api.tunedtensor.com/v1/behavior-specs/:id/runs \
  -H "Authorization: Bearer tt_your_api_key"
List All Runs
GET /api/v1/runs
CLI
tt runs list
Equivalent REST call
curl https://api.tunedtensor.com/v1/runs \
  -H "Authorization: Bearer tt_your_api_key"
Returns runs across all specs with _spec_name for display.
Get Run Detail
GET /api/v1/runs/:id
CLI
# One-shot fetch
tt runs get <run-id>
# Live streaming until terminal state
tt runs watch <run-id>
Equivalent REST call
curl https://api.tunedtensor.com/v1/runs/:id \
  -H "Authorization: Bearer tt_your_api_key"
Returns the full run with _evals: per-example results sorted by score (worst first). Each eval includes:
- prompt, expected, actual
- score (0–1), passed (boolean)
- reasoning: the LLM judge's explanation
- latency_ms: inference time
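Since the per-example results carry score and passed, a short helper can pull out the examples most worth inspecting. This is a sketch only: failing_evals is a hypothetical name, and it assumes the run detail has been parsed into a dict with the fields listed above.

```python
def failing_evals(run: dict, limit: int = 3) -> list[dict]:
    """Return up to `limit` lowest-scoring evals that did not pass."""
    evals = run.get("_evals", [])
    # Sort defensively, even though the API is documented to return worst first.
    failed = sorted((e for e in evals if not e["passed"]), key=lambda e: e["score"])
    return failed[:limit]
```

Reading the reasoning field of each returned eval is a natural next step when deciding which spec examples to revise before the next run.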
Cancel a Run
POST /api/v1/runs/:id/cancel
CLI
tt runs cancel <run-id>
Equivalent REST call
curl -X POST https://api.tunedtensor.com/v1/runs/:id/cancel \
  -H "Authorization: Bearer tt_your_api_key"
Cancels runs in preparing, training, or evaluating status. Also cancels the provider fine-tuning job if running.