Datasets
Datasets are JSONL files used for fine-tuning. They are usually auto-generated when you start a run from a behaviour spec, but you can also upload them manually.
Auto-Generated Datasets
When you start a run, the platform automatically:
- Compiles your behaviour spec into JSONL chat format (system + user + assistant messages)
- If augmentation is enabled, uses Claude to expand your examples into a larger, more diverse training set (typically 5–10 examples → 30–40 rows)
- Uploads the compiled dataset to storage
Auto-generated datasets are named "Spec Name - Run #N".
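As a rough illustration of the compile step, the sketch below turns a small spec into JSONL chat rows and builds the dataset name. The spec shape (`systemPrompt`, `examples`), the `messages`/`role`/`content` row layout, and both function names are assumptions made for this example; the platform's internal format is not specified here, and augmentation is left out.

```javascript
// Hypothetical spec shape for illustration; not the platform's real schema.
const spec = {
  name: "Customer Support Bot",
  systemPrompt: "You are a helpful support agent.",
  examples: [
    { user: "How do I reset my password?", assistant: "Go to Settings > Security and choose Reset password." },
    { user: "Where can I see invoices?", assistant: "Open Billing > Invoices from the dashboard." }
  ]
};

// Build one chat row (system + user + assistant) per example and
// serialise each row as a single JSON line. The row layout is an assumption.
function compileSpecToJsonl(spec) {
  return spec.examples
    .map((ex) => JSON.stringify({
      messages: [
        { role: "system", content: spec.systemPrompt },
        { role: "user", content: ex.user },
        { role: "assistant", content: ex.assistant }
      ]
    }))
    .join("\n");
}

// Dataset name following the "Spec Name - Run #N" pattern.
function datasetName(spec, runNumber) {
  return `${spec.name} - Run #${runNumber}`;
}

const jsonl = compileSpecToJsonl(spec);
console.log(datasetName(spec, 8));
console.log(jsonl);
```

Without augmentation, each spec example yields exactly one row, so the row count equals the number of examples.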
The Dataset Object
{
"id": "dc66546b-48b3-4490-8baf-9b50aa78130c",
"name": "Customer Support Bot - Run #8",
"description": "Auto-compiled from behaviour spec. 36 examples (augmented).",
"format": "jsonl",
"status": "validated",
"row_count": 36,
"file_size_bytes": 36922,
"created_at": "2026-03-06T10:44:30.000Z"
}
The recommended way to work with datasets is the tt CLI. Each endpoint below shows the tt command first, followed by the equivalent REST call.
Upload a Dataset
Large uploads use a signed S3 upload URL so file bytes do not pass through the app API.
CLI
tt datasets upload training.jsonl \
--name "my-training-data" \
--description "Custom training dataset"
Equivalent API flow
// `file` is assumed to be a browser File, e.g. from an <input type="file"> element.
// 1. Request an upload URL from the app API.
const uploadUrl = await fetch("/api/v1/datasets/upload-url", {
method: "POST",
headers: {
"Authorization": "Bearer tt_your_api_key",
"Content-Type": "application/json"
},
body: JSON.stringify({
name: "my-training-data",
description: "Custom training dataset",
filename: file.name,
size: file.size,
contentType: file.type || "application/octet-stream"
})
}).then((res) => res.json());
// 2. Upload the file directly to S3.
await fetch(uploadUrl.data.upload_url, {
method: uploadUrl.data.method,
headers: uploadUrl.data.headers,
body: file
});
// 3. Finalize and validate the uploaded dataset.
await fetch("/api/v1/datasets/finalize", {
method: "POST",
headers: {
"Authorization": "Bearer tt_your_api_key",
"Content-Type": "application/json"
},
body: JSON.stringify({
path: uploadUrl.data.path,
name: "my-training-data",
description: "Custom training dataset"
})
});
The file must be in JSONL format and no larger than 100 MB. Each line must be a JSON object with input and output strings. After finalization, the dataset's status is validated if every line parses correctly, or invalid (with error details) if any line fails.
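The constraints above can be checked locally before requesting an upload URL, so that a malformed file is caught before it is uploaded. The following is a minimal sketch, not the platform's validator: `validateJsonl` and `MAX_BYTES` are hypothetical names, 100 MB is assumed to mean 100 × 1024 × 1024 bytes, and blank lines are assumed to be skipped.

```javascript
// Assumption: "100 MB" means 100 * 1024 * 1024 bytes.
const MAX_BYTES = 100 * 1024 * 1024;

// Check a JSONL string and return { valid, rowCount, errors }.
function validateJsonl(text) {
  const errors = [];
  const sizeBytes = new TextEncoder().encode(text).length;
  if (sizeBytes > MAX_BYTES) {
    errors.push(`file is ${sizeBytes} bytes, over the ${MAX_BYTES}-byte limit`);
  }

  let rowCount = 0;
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === "") return; // assumption: blank lines are ignored
    const lineNo = index + 1;

    let row;
    try {
      row = JSON.parse(line);
    } catch (err) {
      errors.push(`line ${lineNo}: not valid JSON`);
      return;
    }

    // Each row must be a plain object with string "input" and "output".
    const isObject = row !== null && typeof row === "object" && !Array.isArray(row);
    if (!isObject || typeof row.input !== "string" || typeof row.output !== "string") {
      errors.push(`line ${lineNo}: expected an object with string "input" and "output"`);
      return;
    }
    rowCount += 1;
  });

  return { valid: errors.length === 0, rowCount, errors };
}

// Usage: the second line is missing "output", so it is reported as an error.
const sample = [
  '{"input": "How do I reset my password?", "output": "Go to Settings > Security and choose Reset password."}',
  '{"input": "Where can I see invoices?"}'
].join("\n");
console.log(validateJsonl(sample));
```

In a browser, the text could come from `await file.text()` on the File being uploaded. The server-side validation may apply additional rules that this sketch does not cover.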
JSONL Format
Each line should be a JSON object with input and output strings:
{"input": "How do I reset my password?", "output": "Go to Settings > Security and choose Reset password."}
{"input": "Where can I see invoices?", "output": "Open Billing > Invoices from the dashboard."}
List Datasets
GET /api/v1/datasets
CLI
tt datasets list
Equivalent REST call
curl https://tunedtensor.com/api/v1/datasets \
-H "Authorization: Bearer tt_your_api_key"
Get a Dataset
GET /api/v1/datasets/:id
CLI
tt datasets get <dataset-id>
Equivalent REST call
curl https://tunedtensor.com/api/v1/datasets/:id \
-H "Authorization: Bearer tt_your_api_key"
Delete a Dataset
DELETE /api/v1/datasets/:id
CLI
tt datasets delete <dataset-id>
Equivalent REST call
curl -X DELETE https://tunedtensor.com/api/v1/datasets/:id \
-H "Authorization: Bearer tt_your_api_key"
Deletes the dataset record and the underlying file from storage.
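The list, get, and delete calls share the same base URL and bearer-token header, so from JavaScript they can go through one small helper. This is a sketch under assumptions: `datasetRequest` and `callDatasets` are hypothetical names, and the response body shape is not documented in this section, so the helper returns whatever JSON the server sends (or null for an empty body).

```javascript
const BASE_URL = "https://tunedtensor.com/api/v1";

// Build the URL and fetch options for a datasets request.
// Omit datasetId to target the collection (list).
function datasetRequest(method, apiKey, datasetId) {
  const path = datasetId
    ? `/datasets/${encodeURIComponent(datasetId)}`
    : "/datasets";
  return {
    url: `${BASE_URL}${path}`,
    options: { method, headers: { Authorization: `Bearer ${apiKey}` } }
  };
}

// Send the request and throw on a non-2xx status.
// Assumption: a non-empty response body is JSON.
async function callDatasets(method, apiKey, datasetId, fetchImpl = globalThis.fetch) {
  const { url, options } = datasetRequest(method, apiKey, datasetId);
  const res = await fetchImpl(url, options);
  if (!res.ok) {
    throw new Error(`${method} ${url} failed with status ${res.status}`);
  }
  const text = await res.text();
  return text ? JSON.parse(text) : null;
}
```

For example, `callDatasets("GET", "tt_your_api_key")` would list datasets, and passing a dataset ID as the third argument with `"GET"` or `"DELETE"` would fetch or delete that dataset.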