Quickstart

Go from zero to a fine-tuned model in a handful of commands. This guide walks through a real example — a customer support bot — using the open-source tt CLI. REST API equivalents are shown under each step for programmatic use.

Prerequisites

A Tuned Tensor account
An API key (create one in Dashboard → Settings → API Keys)
Node.js 18+ (for the CLI)

Install the CLI and authenticate:

npm install -g @tuned-tensor/cli
tt auth login <api-key>
tt auth status

The CLI is open source (MIT) — github.com/tunedtensor/tuned-tensor-cli. See the CLI Tool reference for every command. Prefer curl? Jump to Using the REST API.

Step 1: Create a Behaviour Spec

A behaviour spec defines what your model should do. Scaffold one locally with tt init, edit it, then push it to Tuned Tensor:

tt init --name "Customer Support Bot" \
  --model Qwen/Qwen3.5-2B

This creates tunedtensor.json. Open it and fill in guidelines, constraints, and examples (the spec file guide covers each field in detail):

{
  "name": "Customer Support Bot",
  "description": "Handles billing, account, and technical support questions",
  "system_prompt": "You are a helpful customer support agent for Acme SaaS...",
  "guidelines": [
    "Keep responses under 150 words",
    "Always acknowledge the user concern before providing a solution",
    "Use bullet points for multi-step instructions"
  ],
  "constraints": [
    "Never promise refunds without directing to the refund policy",
    "Do not make up pricing — refer to the pricing page"
  ],
  "examples": [
    {
      "input": "How do I cancel my subscription?",
      "output": "I understand you would like to cancel. Go to Settings > Billing > Cancel Plan."
    },
    {
      "input": "I was charged twice this month",
      "output": "I am sorry about the double charge. Please contact billing@acme.com."
    },
    {
      "input": "How much does the Pro plan cost?",
      "output": "For up-to-date pricing, please check acme.com/pricing."
    },
    {
      "input": "My dashboard is loading slowly",
      "output": "Try clearing your cache, using a different browser, or incognito mode."
    },
    {
      "input": "Can I get a refund?",
      "output": "Please review our refund policy at acme.com/refund-policy."
    }
  ],
  "base_model": "Qwen/Qwen3.5-2B"
}
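
Before pushing, it can help to sanity-check the file locally. The following is a minimal sketch, not part of the tt CLI: the field names come from the example above, while the choice of required fields and the word-count check (mirroring the "under 150 words" guideline) are illustrative assumptions, and check_spec and check_spec_file are hypothetical helper names.

```python
import json
from pathlib import Path

# Field names come from the tunedtensor.json example above; which fields are
# treated as required is an assumption, not a rule enforced by Tuned Tensor.
REQUIRED_FIELDS = ("name", "system_prompt", "examples", "base_model")


def check_spec(spec: dict, max_words: int = 150) -> list:
    """Return a list of human-readable problems found in a spec dict."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not spec.get(field):
            problems.append(f"missing or empty field: {field}")
    for i, example in enumerate(spec.get("examples", [])):
        for key in ("input", "output"):
            if not str(example.get(key, "")).strip():
                problems.append(f"example {i}: missing {key}")
        # Mirror the "under 150 words" guideline on the reference outputs.
        words = len(str(example.get("output", "")).split())
        if words > max_words:
            problems.append(f"example {i}: output has {words} words")
    return problems


def check_spec_file(path: str = "tunedtensor.json") -> list:
    """Load a spec file from disk and check it."""
    return check_spec(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling check_spec_file() in the project directory returns an empty list when nothing looks wrong.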

Push it to the API:

tt push

The CLI prints the spec id — save it for the next step (you can also list your specs any time with tt specs list).

Equivalent REST call

curl -X POST https://tunedtensor.com/api/v1/behavior-specs \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Customer Support Bot",
    "description": "Handles billing, account, and technical support questions",
    "system_prompt": "You are a helpful customer support agent for Acme SaaS...",
    "guidelines": [
      "Keep responses under 150 words",
      "Always acknowledge the user concern before providing a solution",
      "Use bullet points for multi-step instructions"
    ],
    "constraints": [
      "Never promise refunds without directing to the refund policy",
      "Do not make up pricing — refer to the pricing page"
    ],
    "examples": [
      {
        "input": "How do I cancel my subscription?",
        "output": "I understand you would like to cancel. Go to Settings > Billing > Cancel Plan."
      }
    ],
    "base_model": "Qwen/Qwen3.5-2B"
  }'

The response includes the spec id:

{
  "data": {
    "id": "cafd8799-9180-482e-b0b2-c46d08e4b045",
    "name": "Customer Support Bot",
    "base_model": "Qwen/Qwen3.5-2B",
    "examples": [ ... ],
    "guidelines": [ ... ],
    "constraints": [ ... ],
    "created_at": "2026-03-06T08:27:33.492Z"
  }
}
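
For programmatic use, the same call can be made from Python's standard library. This is a sketch under the assumption that the response always wraps the spec in the "data" envelope shown above; build_create_spec_request, spec_id_from_response, and create_spec are hypothetical helper names, not part of any official SDK.

```python
import json
import urllib.request

API_BASE = "https://tunedtensor.com/api/v1"  # base URL used throughout this guide


def build_create_spec_request(api_key: str, spec: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /behavior-specs request."""
    return urllib.request.Request(
        f"{API_BASE}/behavior-specs",
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def spec_id_from_response(body: str) -> str:
    """Pull the spec id out of the {"data": {...}} envelope."""
    return json.loads(body)["data"]["id"]


def create_spec(api_key: str, spec: dict) -> str:
    """Send the request and return the new spec id."""
    with urllib.request.urlopen(build_create_spec_request(api_key, spec)) as resp:
        return spec_id_from_response(resp.read().decode("utf-8"))
```

Splitting request construction from sending keeps the first two functions testable without network access.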

Step 2: Start a Run

A run compiles your spec into training data, augments your 5 examples into ~36 diverse training rows using AI, fine-tunes the model, and auto-evaluates the result. New accounts include a monthly free-run quota for small eligible runs, with no card required. If a run falls outside the free quota and your credit balance is too low, POST /runs returns 402 insufficient_credits with the required amount; top up at Dashboard → Billing or run tt topup.

tt runs start cafd8799-... --epochs 4 --lora-rank 8 --lora-alpha 16

The CLI returns immediately with the new run id and the status preparing.

Equivalent REST call

curl -X POST https://tunedtensor.com/api/v1/behavior-specs/cafd8799-.../runs \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "augment": true,
    "hyperparameters": {
      "n_epochs": 4,
      "lora_rank": 8,
      "lora_alpha": 16
    }
  }'

{
  "data": {
    "id": "e0b7694b-2c65-4199-89a1-fc54a6a6010c",
    "run_number": 1,
    "status": "preparing",
    "spec_snapshot": { ... },
    "hyperparameters": { "augment": true, "n_epochs": 4, "lora_rank": 8 }
  }
}
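
A client that starts runs programmatically will want to tell the 402 insufficient_credits case apart from other failures. The sketch below assumes error responses use the {"error": {"code", "message"}} envelope that this guide shows for playground_unavailable; the name of the field carrying the required amount is not documented here, so it is deliberately not parsed. describe_error is a hypothetical helper name.

```python
import json
from typing import Optional


def describe_error(status: int, body: str) -> Optional[dict]:
    """Return None for a 2xx response, else a small dict describing the error."""
    if 200 <= status < 300:
        return None
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None  # body was not JSON; fall back to the raw text below
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    code = error.get("code", "unknown")
    return {
        "status": status,
        "code": code,
        "message": error.get("message", body),
        # 402 insufficient_credits is the case this guide says to fix by topping up.
        "needs_top_up": status == 402 and code == "insufficient_credits",
    }
```

When needs_top_up is true, the remedy is the one described above: top up at Dashboard → Billing or run tt topup, then retry.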

Behind the scenes, the platform:

Compiles your spec into JSONL training format
Augments your examples into ~36 diverse training rows using Claude
Uploads the dataset to the configured training provider
Starts a LoRA fine-tuning job
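
To make the first step concrete, here is a rough sketch of what compiling a spec into JSONL could look like. The row layout (a "messages" list with system, user, and assistant roles) is a common fine-tuning convention and purely an assumption; the platform's actual training format is not documented in this guide, and spec_to_jsonl is a hypothetical name.

```python
import json


def spec_to_jsonl(spec: dict) -> str:
    """Render each spec example as one chat-style JSONL row (assumed layout)."""
    system = spec.get("system_prompt", "")
    rows = []
    for example in spec.get("examples", []):
        messages = []
        if system:
            messages.append({"role": "system", "content": system})
        messages.append({"role": "user", "content": example["input"]})
        messages.append({"role": "assistant", "content": example["output"]})
        # One JSON object per line is what makes the output JSONL.
        rows.append(json.dumps({"messages": messages}, ensure_ascii=False))
    return "\n".join(rows)
```

With the five-example spec from Step 1 this yields five rows; augmentation is what then expands them to roughly 36.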

Step 3: Check Run Status

Watch a run until it reaches a terminal state — the CLI polls and streams status transitions:

tt runs watch e0b7694b-...

For one-shot status, use:

tt runs get e0b7694b-...

While training is running, use diagnostics for live learning signals such as recent epoch, loss, pace, and estimated remaining time:

tt runs diagnose e0b7694b-...

Equivalent REST call

curl https://tunedtensor.com/api/v1/runs/e0b7694b-... \
  -H "Authorization: Bearer <api-key>"

The run moves through these statuses:

Status	What's happening
`preparing`	Compiling spec, augmenting examples, uploading dataset
`training`	Fine-tuning in progress on the configured training provider
`evaluating`	Auto-evaluating the fine-tuned model against your spec
`completed`	Done — eval results are available
`failed`	Something went wrong — check the `error` field
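
If you poll the REST endpoint yourself instead of using tt runs watch, a loop along these lines may be enough. It is a sketch: get_status is a caller-supplied function (hypothetical here) that fetches the run and returns its "status" string, and only the two terminal statuses listed in the table are treated as final. The status reported for a cancelled run is not covered in this guide, so it is not handled.

```python
import time
from typing import Callable, List

TERMINAL_STATUSES = {"completed", "failed"}  # from the status table above


def wait_for_run(
    get_status: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the run reaches a terminal status; return the transitions seen."""
    seen: List[str] = []
    waited = 0.0
    while True:
        status = get_status()
        if not seen or seen[-1] != status:
            seen.append(status)  # record each transition once
        if status in TERMINAL_STATUSES:
            return seen
        if waited >= timeout_seconds:
            raise TimeoutError(f"run still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

The sleep parameter is injectable so the loop can be exercised in tests without real waiting; the default poll interval and timeout are arbitrary.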

Once the run is completed, tt runs get (or the REST response) includes the eval results:

{
  "data": {
    "status": "completed",
    "eval_summary": {
      "total": 5,
      "avg_score": 0.82,
      "pass_rate": 0.8,
      "scoring_method": "llm_judge",
      "regressions": 0
    },
    "_evals": [
      {
        "prompt": "Can I get a refund?",
        "expected": "Please review our refund policy...",
        "actual": "I understand you're looking for a refund...",
        "score": 0.9,
        "passed": true,
        "reasoning": "Correctly directs to the refund policy..."
      }
    ]
  }
}
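
One way to act on this payload is to reduce it to a short review list. The sketch below takes the object under "data" and uses the field names from the sample response (eval_summary, _evals, passed, score, prompt); the 0.8 threshold and the "ship" decision rule are arbitrary illustrations, and summarise_evals is a hypothetical helper.

```python
def summarise_evals(run: dict, min_pass_rate: float = 0.8) -> dict:
    """Condense a completed run's eval payload into a go/no-go summary."""
    summary = run.get("eval_summary", {})
    failed = [e for e in run.get("_evals", []) if not e.get("passed")]
    # Lowest scores first, so the weakest examples are reviewed first.
    failed.sort(key=lambda e: e.get("score", 0.0))
    return {
        "pass_rate": summary.get("pass_rate"),
        "regressions": summary.get("regressions", 0),
        "failed_prompts": [e["prompt"] for e in failed],
        # Illustrative rule: enough passes and nothing got worse.
        "ship": (
            summary.get("pass_rate", 0.0) >= min_pass_rate
            and summary.get("regressions", 0) == 0
        ),
    }
```

The failed_prompts list is a natural starting point for the spec edits described in the Iterate step.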

Step 4: Serve the Model Locally

After a run completes, you can serve the fine-tuned artifact locally from the CLI. tt models serve downloads and caches the artifact for the given model ID automatically, starts an OpenAI-compatible endpoint, and applies your tunedtensor.json behaviour prompt by default.

# One-time setup for local reference serving
tt models setup-runtime

# Serve the completed model
tt models serve <model-id> --spec tunedtensor.json

# Then call the local OpenAI-compatible endpoint
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "How do I cancel my subscription?" }
    ],
    "max_tokens": 200
  }'

On Apple Silicon or GPU machines, choose the inference device with --device mps or --device cuda. The default --device auto chooses CUDA, then Apple MPS, then CPU.
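
The local endpoint can also be called from Python. This sketch reuses the URL and payload from the curl call above and assumes the response follows the standard OpenAI chat completions shape (choices[0].message.content), which is what "OpenAI-compatible" normally implies; build_chat_request, reply_text, and ask are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    base_url: str = "http://127.0.0.1:8000",
    max_tokens: int = 200,
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the local server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reply_text(body: str) -> str:
    """Extract the assistant text, assuming the OpenAI response shape."""
    return json.loads(body)["choices"][0]["message"]["content"]


def ask(prompt: str) -> str:
    """Send the prompt to the running local server and return its reply."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return reply_text(resp.read().decode("utf-8"))
```

With tt models serve running, ask("How do I cancel my subscription?") should return the model's answer as a string.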

Step 5: Inspect the Model Record

Your model appears in Dashboard → Models after the run completes. Use the model detail page to verify that the artifact was created, inspect the base model, and jump to the local serving docs.

Equivalent REST call

# List available models (base + your fine-tuned models)
curl https://tunedtensor.com/api/v1/playground/models \
  -H "Authorization: Bearer <api-key>"

# Completion requests currently return 501 playground_unavailable
curl -X POST https://tunedtensor.com/api/v1/playground/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-fine-tuned-model-id",
    "messages": [
      { "role": "system", "content": "You are a helpful customer support agent for Acme SaaS..." },
      { "role": "user", "content": "How do I cancel my subscription?" }
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'

When hosted inference is unavailable, completion requests return:

{
  "error": {
    "code": "playground_unavailable",
    "message": "Playground inference is unavailable for hosted fine-tuned models."
  }
}

To run a downloaded artifact on your own machine, see Serve a Model Locally.

Step 6: Iterate

Review the eval results. If some examples failed, update your local tunedtensor.json with better examples or clearer guidelines, then push and kick off another run. To continue from a previous fine-tuned artifact instead of restarting from the original base model, pass the completed model id with --parent-model.

# Edit tunedtensor.json to add/change examples, guidelines, etc.

tt push
tt runs start cafd8799-... --parent-model 444c7c69-...

Equivalent REST call

# Update the spec with a new example
curl -X PUT https://tunedtensor.com/api/v1/behavior-specs/cafd8799-... \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "examples": [
      ...existing examples...,
      {
        "input": "How do I change my email address?",
        "output": "Go to Settings > Profile > Email and verify the new address."
      }
    ]
  }'

# Start another run
curl -X POST .../v1/behavior-specs/cafd8799-.../runs \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "augment": true,
    "parent_model_id": "444c7c69-..."
  }'
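
The PUT body above carries the existing examples alongside the new one, which suggests (an assumption, not something this guide confirms) that the examples array is replaced as a whole rather than appended to. Under that assumption, a small helper can build the full array before the update; merge_examples is a hypothetical name.

```python
def merge_examples(existing: list, new: list) -> list:
    """Return existing examples plus new ones, replacing any with the same input.

    Neither input list is modified. A replaced example keeps its original
    position; genuinely new examples are appended at the end.
    """
    merged = {ex["input"]: ex for ex in existing}
    for ex in new:
        merged[ex["input"]] = ex  # later entry wins for a repeated input
    return list(merged.values())
```

Keying on the input text avoids sending two conflicting outputs for the same question.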

Run #2 will be automatically compared to Run #1. The eval summary shows regressions (examples that got worse) and improvements.
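
To illustrate what a regression means here, the sketch below compares per-example scores from two runs' _evals arrays. The platform's exact comparison rule is not specified in this guide, so the score-delta approach and the 0.05 tolerance are assumptions, and compare_runs is a hypothetical helper.

```python
def compare_runs(previous: list, current: list, tolerance: float = 0.05) -> dict:
    """Split prompts into regressions and improvements by score delta."""
    before = {e["prompt"]: e["score"] for e in previous}
    result = {"regressions": [], "improvements": []}
    for e in current:
        if e["prompt"] not in before:
            continue  # new example, nothing to compare against
        delta = e["score"] - before[e["prompt"]]
        if delta < -tolerance:
            result["regressions"].append(e["prompt"])
        elif delta > tolerance:
            result["improvements"].append(e["prompt"])
    return result
```

The tolerance keeps small score fluctuations from being reported as changes in either direction.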

Using the REST API

Every tt command maps to a REST endpoint under https://tunedtensor.com/api/v1. Every endpoint authenticates with an Authorization: Bearer <api-key> header.

Behaviour Specs — create, list, get, update, delete
Runs — start, list, get, diagnose, cancel
Datasets — upload, list, delete
Models — list, get, download, serve, delete
Authentication — API keys, response format

What's Next

CLI Tool — full command reference, local evals, and configuration
Behaviour Specs — full schema and endpoints
Runs — cancellation, eval results, regression detection
Authentication — API keys and response format