Tuned Tensor

Playground

Legacy model-inventory and completion endpoints. The dashboard Playground UI has been removed while hosted inference is unavailable for SageMaker-backed fine-tuned artifacts.

Use the tt CLI for current model workflows. tt models serve can run a completed artifact locally with its behaviour spec prompt applied.

Overview

The legacy Playground REST endpoints remain documented for compatibility. The model-inventory endpoint lists the fine-tunable base models configured in Tuned Tensor plus any fine-tuned models you have created through runs.

Key capabilities:

  • Model inventory — base models and fine-tuned artifacts are listed from the same account-scoped API
  • Completion endpoint — currently returns 501 playground_unavailable while hosted inference is disabled

Available Base Models

These models are available for hosted fine-tuning.

Model                                 Context Length
google/gemma-4-E2B-it                 128,000
google/gemma-4-E4B-it                 128,000
Qwen/Qwen3.5-2B                       131,072
Qwen/Qwen3.5-4B                       262,144
meta-llama/Llama-3.2-3B-Instruct      128,000
microsoft/Phi-4-mini-instruct         128,000
ibm-granite/granite-3.3-2b-instruct   128,000
bigcode/starcoder2-3b                 16,384

Fine-tuned models you create through runs are listed alongside these base models, identified by their provider_model_id.

List Playground Models

GET /api/v1/playground/models

curl https://tunedtensor.com/api/v1/playground/models \
  -H "Authorization: Bearer <api-key>"

Response:

{
  "data": {
    "base_models": [
      {
        "id": "Qwen/Qwen3.5-2B",
        "name": "Qwen3.5-2B",
        "type": "base"
      }
    ],
    "fine_tuned_models": [
      {
        "id": "user/Qwen3.5-2B-ft-abc123",
        "name": "Qwen3.5-2B-ft-abc123",
        "type": "fine-tuned",
        "base_model": "Qwen/Qwen3.5-2B"
      }
    ]
  }
}

Field               Description
base_models         Supported base models
fine_tuned_models   Your fine-tuned models (includes base_model for reference)
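
The response above can be flattened into a single list on the client side. The following is a minimal Python sketch, not an official client: the helper names build_models_request and flatten_models are hypothetical, and the sample payload is copied from the response shown above. The request is only constructed, not sent.

```python
import urllib.request

# URL taken from the curl example above.
MODELS_URL = "https://tunedtensor.com/api/v1/playground/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the model inventory."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def flatten_models(payload: dict) -> list:
    """Merge base and fine-tuned entries into one list of dicts.

    Missing keys are treated as empty lists, so a partial payload
    does not raise.
    """
    data = payload.get("data", {})
    models = []
    for key in ("base_models", "fine_tuned_models"):
        for entry in data.get(key, []):
            models.append(
                {
                    "id": entry["id"],
                    "type": entry.get("type"),
                    # Only fine-tuned entries carry base_model in the sample.
                    "base_model": entry.get("base_model"),
                }
            )
    return models


# Sample payload copied from the documented response.
SAMPLE = {
    "data": {
        "base_models": [
            {"id": "Qwen/Qwen3.5-2B", "name": "Qwen3.5-2B", "type": "base"}
        ],
        "fine_tuned_models": [
            {
                "id": "user/Qwen3.5-2B-ft-abc123",
                "name": "Qwen3.5-2B-ft-abc123",
                "type": "fine-tuned",
                "base_model": "Qwen/Qwen3.5-2B",
            }
        ],
    }
}

for model in flatten_models(SAMPLE):
    print(model["id"], model["type"], model["base_model"])
```

To call the live endpoint, pass the built request to urllib.request.urlopen and decode the body with json.load before handing it to flatten_models.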

Run a Completion

POST /api/v1/playground/completions returns 501 playground_unavailable when hosted inference is not enabled.

curl -X POST https://tunedtensor.com/api/v1/playground/completions \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3.5-2B",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Explain LoRA fine-tuning in one paragraph." }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Request Body

Parameter     Type                Default   Description
model         string              -         Model ID (base or fine-tuned). Required.
messages      {role, content}[]   -         Chat messages. At least one required. Roles: system, user, assistant.
temperature   number              0.7       Sampling temperature (0–2).
max_tokens    integer             1024      Maximum tokens to generate (1–4,096).
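
The constraints in the request body table can be checked on the client before sending. This is a rough Python sketch under the assumption that the documented ranges are inclusive; validate_completion_request is a hypothetical helper, and the server's own validation may differ in detail.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_completion_request(body: dict) -> list:
    """Return a list of problems; an empty list means the body looks valid."""
    errors = []

    model = body.get("model")
    if not isinstance(model, str) or not model:
        errors.append("model: required, must be a non-empty string")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append("messages: at least one message is required")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                errors.append(f"messages[{i}]: must be an object")
                continue
            role = msg.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                errors.append(f"messages[{i}].role: must be system, user, or assistant")
            if not isinstance(msg.get("content"), str):
                errors.append(f"messages[{i}].content: must be a string")

    # Defaults from the table: temperature 0.7, max_tokens 1024.
    temperature = body.get("temperature", 0.7)
    if (
        isinstance(temperature, bool)
        or not isinstance(temperature, (int, float))
        or not 0 <= temperature <= 2
    ):
        errors.append("temperature: must be a number between 0 and 2")

    max_tokens = body.get("max_tokens", 1024)
    if (
        isinstance(max_tokens, bool)
        or not isinstance(max_tokens, int)
        or not 1 <= max_tokens <= 4096
    ):
        errors.append("max_tokens: must be an integer between 1 and 4096")

    return errors


# Request body copied from the curl example above.
body = {
    "model": "Qwen/Qwen3.5-2B",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain LoRA fine-tuning in one paragraph."},
    ],
    "temperature": 0.7,
    "max_tokens": 1024,
}
print(validate_completion_request(body))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.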

Current Response

{
  "error": {
    "code": "playground_unavailable",
    "message": "Playground inference is unavailable for hosted fine-tuned models."
  }
}

Inference Response Fields

When interactive inference is enabled, successful responses use this shape:

Field                     Description
content                   Generated text from the model
latency_ms                Wall-clock inference time in milliseconds
usage.prompt_tokens       Tokens consumed by the input prompt
usage.completion_tokens   Tokens generated in the response
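
As an illustration of how these fields might be consumed, the Python sketch below derives total tokens and an approximate generation rate. It is not a documented client: summarize_completion is a hypothetical helper, the example values are invented for illustration, and because the page does not state whether the fields sit at the top level or inside a "data" envelope (as in the model list response), the sketch accepts either.

```python
def summarize_completion(payload: dict) -> dict:
    """Summarize a successful completion response."""
    # Assumption: fields may be top-level or wrapped in "data".
    result = payload.get("data", payload)
    usage = result.get("usage", {})
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    latency_ms = result.get("latency_ms")

    # Rate is only defined when a positive latency is reported.
    tokens_per_second = None
    if latency_ms:
        tokens_per_second = completion_tokens / (latency_ms / 1000)

    return {
        "content": result.get("content", ""),
        "total_tokens": prompt_tokens + completion_tokens,
        "tokens_per_second": tokens_per_second,
    }


# Hypothetical values, for illustration only.
example = {
    "content": "LoRA adds small trainable matrices to a frozen model.",
    "latency_ms": 2000,
    "usage": {"prompt_tokens": 20, "completion_tokens": 100},
}
print(summarize_completion(example))
```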

Error Codes

Status   Code                     Meaning
400      validation_error         Invalid request body (missing model, empty messages, etc.)
501      playground_unavailable   Interactive inference is not enabled for hosted fine-tuned models
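
A client can branch on the status and error code pairs listed above. The Python sketch below is one possible mapping, assuming that error bodies always use the {"error": {"code", "message"}} envelope shown under Current Response; classify_response and its return labels are hypothetical.

```python
def classify_response(status: int, payload: dict) -> str:
    """Map an HTTP status and decoded JSON body to a coarse outcome label."""
    error = payload.get("error") or {}
    code = error.get("code")

    if status == 501 and code == "playground_unavailable":
        # Hosted inference is disabled; serve the artifact locally instead.
        return "unavailable"
    if status == 400 and code == "validation_error":
        return "invalid_request"
    if 200 <= status < 300:
        return "ok"
    # Anything not documented on this page.
    return "unexpected"


# Error body copied from the Current Response section.
current = {
    "error": {
        "code": "playground_unavailable",
        "message": "Playground inference is unavailable for hosted fine-tuned models.",
    }
}
print(classify_response(501, current))
```

On the "unavailable" outcome, the tt models serve workflow described on this page is the documented alternative.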

Comparing Base vs Fine-Tuned

For current workflows, download or serve the fine-tuned artifact with the CLI and compare outputs locally:

tt models setup-runtime
tt models serve <model-id> --spec tunedtensor.json

See Serve a Model Locally for the recommended local serving flow.