Model playground

/playground is a side-by-side model comparison tool. Run the same prompt against up to 4 hosted models at once, compare the responses, and see latency, cost, and token counts for each run. Results can be shared through a public link.

Why this exists

Picking the right model for a job is hard. A blog benchmark from last year doesn't tell you how privaterouter/qwen-pro handles your specific use case. The playground lets you test before you integrate.

Quickstart

Open /playground from the dashboard sidebar.
Type or paste a prompt.
(Optional) Pick a template from the dropdown: Summarize, Code review, JSON extraction, Translate.
Click "Add model" up to 4 times to pick comparison models.
Adjust temperature / max_tokens / top_p sliders.
Click "Run on all" → responses appear in a 2×2 grid.
Each result card shows latency, tokens used, cost, and the response.
Click "Pick this model" on the winner → opens /routing with that model preselected for auto-routing.
Click "Save & share" → generates a public URL anyone can view.

Sharing

Hit "Save & share" and a public URL is copied to your clipboard. It looks like https://privaterouter.com/playground/shared/abc123xyz. Anyone with the link sees a read-only view of your prompt and responses. Useful for:

Showing teammates which model wins your eval
Demonstrating PrivateRouter quality to prospects
Publishing benchmark results to your blog

You can unshare any time — the slug is preserved so re-sharing returns the same URL.
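The share and unshare behavior described above can be modeled in a few lines. This is only an illustrative sketch, not the service's implementation: the `SessionShare` class, the `is_public` flag, and the slug format are assumptions made for the example. Only the URL prefix comes from this page.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

# URL prefix taken from the example link on this page.
BASE_URL = "https://privaterouter.com/playground/shared/"


@dataclass
class SessionShare:
    """Hypothetical in-memory model of one session's share state."""
    slug: Optional[str] = None   # kept after unsharing
    is_public: bool = False

    def share(self) -> str:
        # Generate a slug only on the first share; reuse it afterwards.
        if self.slug is None:
            self.slug = secrets.token_urlsafe(9)
        self.is_public = True
        return BASE_URL + self.slug

    def unshare(self) -> None:
        # Turn off public access but keep the slug for later re-sharing.
        self.is_public = False


state = SessionShare()
first_url = state.share()
state.unshare()
second_url = state.share()  # same URL as first_url
```

Keeping the slug on unshare is what makes links already pasted into blog posts or chat threads work again after a re-share.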

Cost

Playground runs are free in the MVP. Each run still hits the real LiteLLM-routed model, but cost_usd is recorded as 0 on the playground_run row and your account isn't charged.

Free runs may be restricted in a future milestone if free-tier abuse becomes a real problem. For now, treat the playground as a no-risk evaluation tool.

Limits

1–4 models per session
50,000 character prompt limit
Session history: last 50 sessions per user retained in the dashboard
Public shared sessions do not expire (delete via dashboard to remove)
No streaming on the playground — responses come back when complete
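If you call the API from a script, it can help to check the model count and prompt length before sending anything. The sketch below assumes the prompt limit is counted the way Python's `len()` counts characters; the server may count differently. The function name `check_playground_limits` is hypothetical.

```python
from typing import List

MAX_MODELS = 4             # "1–4 models per session"
MAX_PROMPT_CHARS = 50_000  # "50,000 character prompt limit"


def check_playground_limits(prompt: str, models: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the input fits the limits."""
    problems = []
    if not 1 <= len(models) <= MAX_MODELS:
        problems.append(f"expected 1-{MAX_MODELS} models, got {len(models)}")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt is {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}"
        )
    return problems


issues = check_playground_limits(
    "Translate to German: 'I love Berlin in autumn.'",
    ["privaterouter/qwen-pro", "privaterouter/qwen-fast"],
)
```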

API

Member-side (session auth)

GET    /api/playground           # list last 50
POST   /api/playground           # create + run
GET    /api/playground/{id}      # detail w/ runs
DELETE /api/playground/{id}      # 204
POST   /api/playground/{id}/share    # → {share_slug, public_url}
DELETE /api/playground/{id}/share    # unshare

Public (no auth)

GET /api/playground/shared/{slug}

POST /api/playground request body:

{
  "prompt": "Translate to German: 'I love Berlin in autumn.'",
  "models": ["privaterouter/qwen-pro", "privaterouter/qwen-fast"],
  "system_prompt": "You are a professional translator.",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 1024,
    "top_p": 1.0
  }
}
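As a rough sketch, the same body can be assembled and wrapped in a request using only the Python standard library. The base URL and the way the session is passed (a `Cookie` header here) are assumptions; this page only says the member-side endpoints use session auth. The helper name `build_run_request` is hypothetical.

```python
import json
import urllib.request
from typing import Iterable, Optional


def build_run_request(
    base_url: str,
    session_cookie: str,
    prompt: str,
    models: Iterable[str],
    system_prompt: Optional[str] = None,
    **parameters,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/playground request."""
    body = {"prompt": prompt, "models": list(models)}
    if system_prompt is not None:
        body["system_prompt"] = system_prompt
    if parameters:
        # e.g. temperature, max_tokens, top_p
        body["parameters"] = parameters
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/playground",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Cookie": session_cookie,  # assumed auth transport
        },
    )


req = build_run_request(
    "https://privaterouter.com",  # assumed API host
    "session=PLACEHOLDER",        # placeholder, not a real cookie name or value
    "Translate to German: 'I love Berlin in autumn.'",
    ["privaterouter/qwen-pro", "privaterouter/qwen-fast"],
    system_prompt="You are a professional translator.",
    temperature=0.7,
    max_tokens=1024,
    top_p=1.0,
)
```

Passing `req` to `urllib.request.urlopen` would send it. Since the playground does not stream, expect the call to block until every model has finished.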

Response includes every run with response_content, status, and performance metrics.
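Once the runs are back, choosing a winner can be scripted. The sketch below picks the fastest successful run. Only `response_content` and `status` are named on this page; the `model` and `latency_ms` keys, the `"success"` status value, and the sample data are assumptions for illustration, so check a real response before relying on them.

```python
from typing import Dict, List, Optional


def fastest_successful_run(
    runs: List[Dict], ok_status: str = "success"
) -> Optional[Dict]:
    """Return the successful run with the lowest latency, or None if there is none."""
    ok_runs = [r for r in runs if r.get("status") == ok_status]
    if not ok_runs:
        return None
    return min(ok_runs, key=lambda r: r["latency_ms"])


# Made-up sample data in an assumed shape.
sample_runs = [
    {"model": "privaterouter/qwen-pro", "status": "success",
     "latency_ms": 1800, "response_content": "..."},
    {"model": "privaterouter/qwen-fast", "status": "success",
     "latency_ms": 600, "response_content": "..."},
    {"model": "privaterouter/other", "status": "error",
     "latency_ms": 50, "response_content": ""},
]

winner = fastest_successful_run(sample_runs)
```

Latency alone rarely decides the question; in practice you would read the responses as well, which is what the result cards are for.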

Privacy

Playground prompts and responses are stored in playground_sessions and playground_runs rows, NOT in usage_events. They're encrypted at rest using the same per-user DEK as chat messages (see /docs/observability). When you delete a session, it's a hard delete with FK cascade, with no soft-delete.
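To illustrate what a hard delete with FK cascade means, here is a small sketch using SQLite as a stand-in. The production database engine and the real column layout are not stated on this page, so the schema below is hypothetical apart from the two table names and the cost_usd column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE playground_sessions (
    id     INTEGER PRIMARY KEY,
    prompt TEXT NOT NULL
);
CREATE TABLE playground_runs (
    id         INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL
               REFERENCES playground_sessions(id) ON DELETE CASCADE,
    model      TEXT NOT NULL,
    cost_usd   REAL NOT NULL DEFAULT 0
);
""")

conn.execute(
    "INSERT INTO playground_sessions (id, prompt) VALUES (1, 'Translate to German')"
)
conn.executemany(
    "INSERT INTO playground_runs (session_id, model) VALUES (?, ?)",
    [(1, "privaterouter/qwen-pro"), (1, "privaterouter/qwen-fast")],
)

# Deleting the session removes its runs too; nothing is flagged as deleted and kept.
conn.execute("DELETE FROM playground_sessions WHERE id = 1")
remaining_runs = conn.execute("SELECT COUNT(*) FROM playground_runs").fetchone()[0]
```

With the cascade in place, `remaining_runs` ends up at 0: the rows are gone rather than hidden behind a deleted flag.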