How PrivateRouter models actually perform

Updated hourly from real production traffic. No synthetic benchmarks, no vendor self-reports.

Last refreshed Wed, 22 Jul 2026 18:53:07 GMT

No traffic data yet. Models need 100+ requests in this window to appear.

Not enough data yet

privaterouter/deepseek-ocr
privaterouter/gemma4-31b
privaterouter/deepseek-r1
privaterouter/qwen3-coder
privaterouter/gpt-oss-120b
privaterouter/qwen3.6-35b

Why these numbers?

Every request that flows through PrivateRouter writes a row into our usage log: model, latency, token counts, status, and cost. Once an hour, a rollup job aggregates those rows into per-model windowed buckets (7 days and 30 days). This page reads the latest rollup directly — there's no hand-curation between the proxy and the table above.
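The hourly rollup described above can be sketched roughly as follows. This is a minimal illustration, not PrivateRouter's actual job: the `UsageRow` fields, the `rollup` function, and the bucket keys are hypothetical names chosen to mirror the columns listed in the paragraph (model, latency, token counts, status, cost).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from statistics import median

# Hypothetical shape of one usage-log row; field names are assumptions.
@dataclass
class UsageRow:
    model: str
    ts: datetime
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    status: int
    cost_usd: float

# The two windows named on the page.
WINDOWS = {"7d": timedelta(days=7), "30d": timedelta(days=30)}

def rollup(rows, now):
    """Aggregate raw rows into one bucket per (model, window)."""
    buckets = {}
    for window_name, span in WINDOWS.items():
        cutoff = now - span
        per_model = {}
        for row in rows:
            if cutoff <= row.ts <= now:
                per_model.setdefault(row.model, []).append(row)
        for model, model_rows in per_model.items():
            tokens = sum(r.prompt_tokens + r.completion_tokens for r in model_rows)
            cost = sum(r.cost_usd for r in model_rows)
            buckets[(model, window_name)] = {
                "requests": len(model_rows),
                "p50_latency_ms": median(r.latency_ms for r in model_rows),
                "tokens": tokens,
                "cost_usd": cost,
                # Blended price: total cost over total tokens, per 1M tokens.
                "usd_per_1m_tokens": cost / tokens * 1_000_000 if tokens else None,
            }
    return buckets
```

A scheduler would call `rollup` once an hour with the current time, and the page would read the resulting buckets as-is.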

Minimum traffic threshold

A model needs at least 100 requests in the selected window before it's ranked. Below that, the numbers are too noisy to publish — those models appear in the "Not enough data yet" section instead.
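As a rough sketch of that cutoff, the split between ranked models and the "Not enough data yet" section might look like this. The function name and the input shape (a mapping from model name to request count in the selected window) are assumptions for illustration.

```python
MIN_REQUESTS = 100  # threshold stated on this page

def partition_by_traffic(request_counts, min_requests=MIN_REQUESTS):
    """Split models into (ranked, not_enough_data) by request count."""
    ranked = sorted(m for m, n in request_counts.items() if n >= min_requests)
    pending = sorted(m for m, n in request_counts.items() if n < min_requests)
    return ranked, pending
```

A model with exactly 100 requests is ranked; one with 99 is not.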

What "success" means

A request is counted as successful only if it returned 200 OK and the token stream completed cleanly (no mid-stream disconnect, no provider-side abort, no timeout). Anything else — 4xx, 5xx, dropped stream — counts as an error.
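The rule above can be written as a small predicate. This is a sketch under assumptions: `RequestOutcome` and its fields are hypothetical, with `stream_completed` standing in for "no mid-stream disconnect and no provider-side abort" and `timed_out` tracked separately.

```python
from dataclasses import dataclass

# Hypothetical per-request outcome record; field names are assumptions.
@dataclass
class RequestOutcome:
    status: int             # HTTP status returned to the caller
    stream_completed: bool  # False on mid-stream disconnect or provider abort
    timed_out: bool = False

def is_success(outcome):
    """Success requires 200 OK and a cleanly completed token stream."""
    return outcome.status == 200 and outcome.stream_completed and not outcome.timed_out

def success_rate(outcomes):
    """Fraction of successful requests, or None when there is no traffic."""
    outcomes = list(outcomes)
    if not outcomes:
        return None
    return sum(is_success(o) for o in outcomes) / len(outcomes)
```

Note that a 200 response whose stream later drops still counts as an error under this definition.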

How the ranking score works

In plain English, models are ranked by a composite score that balances three things:

Speed — lower p50 latency is better.
Reliability — higher success rate is better.
Price — lower blended $/1M tokens is better.

The three are normalized within the current window, then combined. No single metric dominates: a fast-but-flaky model won't beat a slightly-slower-but-reliable one, and a cheap model with terrible latency won't top the list either.
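The page does not publish the exact formula, so the following is only one plausible reading of "normalized, then combined": min-max normalization within the window and an equal-weight average. Both choices, along with the function names and dictionary keys, are assumptions made for illustration.

```python
def _minmax(values, lower_is_better):
    """Scale values to [0, 1] within the window so that 1.0 is always best."""
    lo, hi = min(values.values()), max(values.values())
    scaled = {}
    for model, value in values.items():
        if hi == lo:
            scaled[model] = 1.0  # every model ties on this metric
        else:
            frac = (value - lo) / (hi - lo)
            scaled[model] = 1.0 - frac if lower_is_better else frac
    return scaled

def composite_scores(stats):
    """Equal-weight average of normalized speed, reliability, and price.

    `stats` maps model name -> dict with the assumed keys
    p50_latency_ms, success_rate, usd_per_1m_tokens.
    """
    speed = _minmax({m: s["p50_latency_ms"] for m, s in stats.items()}, True)
    reliability = _minmax({m: s["success_rate"] for m, s in stats.items()}, False)
    price = _minmax({m: s["usd_per_1m_tokens"] for m, s in stats.items()}, True)
    return {m: (speed[m] + reliability[m] + price[m]) / 3 for m in stats}

def rank(stats):
    """Model names ordered from best to worst composite score."""
    scores = composite_scores(stats)
    return sorted(scores, key=scores.get, reverse=True)
```

Because each metric is rescaled to the same range before averaging, a model that is best on one metric but worst on another loses ground to one that is solid on all three. A different weighting or normalization would shift the ordering.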

Privacy

These numbers are aggregates across all traffic. No prompts, completions, user IDs, or API key fingerprints are exposed on this page — just counts, latencies, and dollar amounts rolled up per model.