Tuned Tensor

Datasets

Datasets are JSONL files used for fine-tuning. They are usually auto-generated when you start a run from a behaviour spec, but you can also upload them manually.

Auto-Generated Datasets

When you start a run, the platform automatically:

  1. Compiles your behaviour spec into JSONL chat format (system + user + assistant messages)
  2. If augmentation is enabled, uses Claude to expand your examples into a larger, more diverse training set (typically 5–10 examples → 30–40 rows)
  3. Uploads the compiled dataset to storage

Auto-generated datasets are named "Spec Name - Run #N".
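The compile step can be pictured with a minimal sketch. The behaviour spec schema is not shown on this page, so the spec shape below (name, systemPrompt, examples) and the messages/role/content row layout are assumptions for illustration only, not the platform's actual implementation. Augmentation is not modelled.

```javascript
// Hypothetical spec shape: { name, systemPrompt, examples: [{ input, output }] }.
// The real behaviour spec schema and the exact chat row layout may differ.
function compileSpecToJsonl(spec) {
  return spec.examples
    .map((example) =>
      JSON.stringify({
        messages: [
          { role: "system", content: spec.systemPrompt },
          { role: "user", content: example.input },
          { role: "assistant", content: example.output },
        ],
      })
    )
    .join("\n"); // JSONL: one JSON object per line
}

// Naming convention described above: "Spec Name - Run #N".
function datasetName(specName, runNumber) {
  return `${specName} - Run #${runNumber}`;
}

const spec = {
  name: "Customer Support Bot",
  systemPrompt: "You are a helpful support agent.",
  examples: [
    {
      input: "How do I reset my password?",
      output: "Go to Settings > Security and choose Reset password.",
    },
  ],
};

console.log(datasetName(spec.name, 8));
console.log(compileSpecToJsonl(spec));
```

Each example becomes one line containing a system, user, and assistant message, which is the structure the list above describes.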

The Dataset Object

{
  "id": "dc66546b-48b3-4490-8baf-9b50aa78130c",
  "name": "Customer Support Bot - Run #8",
  "description": "Auto-compiled from behaviour spec. 36 examples (augmented).",
  "format": "jsonl",
  "status": "validated",
  "row_count": 36,
  "file_size_bytes": 36922,
  "created_at": "2026-03-06T10:44:30.000Z"
}
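As a small illustration of reading these fields, the sketch below turns a dataset object into a display summary. The helper name summarizeDataset and its return shape are hypothetical; only the field names come from the object above.

```javascript
// Hypothetical helper: builds a human-readable summary from a dataset object.
function summarizeDataset(dataset) {
  const sizeKb = (dataset.file_size_bytes / 1024).toFixed(1);
  return {
    label: `${dataset.name} (${dataset.row_count} rows, ${sizeKb} KB)`,
    isValidated: dataset.status === "validated",
    createdAt: new Date(dataset.created_at), // ISO 8601 timestamp
  };
}

const example = {
  id: "dc66546b-48b3-4490-8baf-9b50aa78130c",
  name: "Customer Support Bot - Run #8",
  format: "jsonl",
  status: "validated",
  row_count: 36,
  file_size_bytes: 36922,
  created_at: "2026-03-06T10:44:30.000Z",
};

console.log(summarizeDataset(example).label);
```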

The recommended way to work with datasets is the tt CLI. Each endpoint below shows the tt command first, followed by the equivalent REST call.

Upload a Dataset

Large uploads use a signed S3 upload URL so file bytes do not pass through the app API.

CLI

tt datasets upload training.jsonl \
  --name "my-training-data" \
  --description "Custom training dataset"

Equivalent API flow

// `file` is assumed to be a browser File object, for example from an <input type="file">.
// 1. Request an upload URL from the app API.
const uploadUrl = await fetch("/api/v1/datasets/upload-url", {
  method: "POST",
  headers: {
    "Authorization": "Bearer tt_your_api_key",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    name: "my-training-data",
    description: "Custom training dataset",
    filename: file.name,
    size: file.size,
    contentType: file.type || "application/octet-stream"
  })
}).then((res) => res.json());

// 2. Upload the file directly to S3.
await fetch(uploadUrl.data.upload_url, {
  method: uploadUrl.data.method,
  headers: uploadUrl.data.headers,
  body: file
});

// 3. Finalize and validate the uploaded dataset.
await fetch("/api/v1/datasets/finalize", {
  method: "POST",
  headers: {
    "Authorization": "Bearer tt_your_api_key",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    path: uploadUrl.data.path,
    name: "my-training-data",
    description: "Custom training dataset"
  })
});

The file must be in JSONL format and no larger than 100 MB. Each line must be a JSON object with input and output strings. After finalization, the dataset's status is validated if every line parses correctly; otherwise it is invalid, with error details.

JSONL Format

Each line must be a single JSON object with input and output string fields:

{"input": "How do I reset my password?", "output": "Go to Settings > Security and choose Reset password."}
{"input": "Where can I see invoices?", "output": "Open Billing > Invoices from the dashboard."}
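Checking a file locally before uploading can catch format errors early. The sketch below mirrors the rules stated on this page (each line is an object with string input and output, file size at most 100 MB). It is not the platform's validator: the function name, the error shape, the treatment of blank lines, and the reading of 100 MB as 100 × 1024 × 1024 bytes are all assumptions.

```javascript
// Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
const MAX_BYTES = 100 * 1024 * 1024;

// Hypothetical local pre-check; the server's actual rules may differ.
function validateJsonl(text) {
  const errors = [];
  let rowCount = 0;

  if (new TextEncoder().encode(text).length > MAX_BYTES) {
    errors.push({ line: 0, message: "file exceeds 100 MB" });
  }

  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === "") return; // assumption: blank lines are ignored
    const lineNumber = index + 1;
    let row;
    try {
      row = JSON.parse(line);
    } catch (err) {
      errors.push({ line: lineNumber, message: "not valid JSON" });
      return;
    }
    if (row === null || typeof row !== "object" || Array.isArray(row)) {
      errors.push({ line: lineNumber, message: "not a JSON object" });
    } else if (typeof row.input !== "string" || typeof row.output !== "string") {
      errors.push({ line: lineNumber, message: "input and output must be strings" });
    } else {
      rowCount += 1;
    }
  });

  return { status: errors.length === 0 ? "validated" : "invalid", rowCount, errors };
}

const sample = [
  '{"input": "How do I reset my password?", "output": "Go to Settings > Security and choose Reset password."}',
  '{"input": "Where can I see invoices?", "output": "Open Billing > Invoices from the dashboard."}',
].join("\n");

console.log(validateJsonl(sample).status);
```

The returned status reuses the validated and invalid values from the dataset object so the local result reads the same way as the server's.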

List Datasets

GET /api/v1/datasets

CLI

tt datasets list

Equivalent REST call

curl https://tunedtensor.com/api/v1/datasets \
  -H "Authorization: Bearer tt_your_api_key"

Get a Dataset

GET /api/v1/datasets/:id

CLI

tt datasets get <dataset-id>

Equivalent REST call

curl https://tunedtensor.com/api/v1/datasets/:id \
  -H "Authorization: Bearer tt_your_api_key"

Delete a Dataset

DELETE /api/v1/datasets/:id

CLI

tt datasets delete <dataset-id>

Equivalent REST call

curl -X DELETE https://tunedtensor.com/api/v1/datasets/:id \
  -H "Authorization: Bearer tt_your_api_key"

Deletes the dataset record and the underlying file from storage.
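The list, get, and delete calls share the same base URL and Authorization header, so they can be wrapped in one small helper. The sketch below is illustrative: the function names are hypothetical, and the response handling (JSON bodies, a possibly empty body on DELETE) is an assumption, since this page does not show response payloads for these endpoints.

```javascript
const BASE_URL = "https://tunedtensor.com/api/v1/datasets";

// Builds the URL and fetch options for a dataset endpoint.
// Omit datasetId to target the collection (list).
function buildDatasetRequest(apiKey, method, datasetId) {
  const url = datasetId
    ? `${BASE_URL}/${encodeURIComponent(datasetId)}`
    : BASE_URL;
  return {
    url,
    options: { method, headers: { Authorization: `Bearer ${apiKey}` } },
  };
}

// Sends the request and fails loudly on non-2xx responses.
async function callDatasetApi(apiKey, method, datasetId) {
  const { url, options } = buildDatasetRequest(apiKey, method, datasetId);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`${method} ${url} failed with HTTP ${res.status}`);
  }
  // Assumption: responses are JSON; a DELETE may return an empty body.
  const body = await res.text();
  return body ? JSON.parse(body) : null;
}

// Usage (not executed here):
//   callDatasetApi("tt_your_api_key", "GET");                 // list
//   callDatasetApi("tt_your_api_key", "GET", datasetId);      // get
//   callDatasetApi("tt_your_api_key", "DELETE", datasetId);   // delete
```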