privaterouter/gpt-oss-120b

gpt-oss · active · 131k ctx
p50, p95, Success, Requests (7d): not yet available
$/1M in: $0.40
$/1M out: $1.20
Context: 131k
Not enough traffic in the last 7 days to compute aggregate stats yet. Charts below show whatever hourly buckets we do have.
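
As a rough illustration of the listed prices ($0.40 per 1M input tokens, $1.20 per 1M output tokens), the sketch below estimates the cost of one request. The helper name `estimate_cost` and the token counts are hypothetical and not part of any PrivateRouter API. Actual billing may round or meter differently.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1M tokens.
PRICE_IN_PER_M = Decimal("0.40")
PRICE_OUT_PER_M = Decimal("1.20")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of a single request."""
    total = PRICE_IN_PER_M * input_tokens + PRICE_OUT_PER_M * output_tokens
    return total / TOKENS_PER_M


# Example with made-up token counts: 2,000 tokens in, 500 tokens out.
print(estimate_cost(2_000, 500))
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift.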

[Chart: Latency (last 7 days, hourly), p50 and p95]

[Chart: Requests per hour (last 7 days)]

Switch from OpenAI
Drop-in: set the API key and base URL on the client, swap the model name, and keep the rest of your code.
# Before:
#   client = OpenAI()
#   client.chat.completions.create(model="gpt-4o", ...)
#
# After:
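import os

from openai import OpenAI

client = OpenAI(
    # Read the key from the environment; Python does not expand a
    # literal "$PRIVATEROUTER_API_KEY" string.
    api_key=os.environ["PRIVATEROUTER_API_KEY"],
    base_url="https://api.privaterouter.com/v1",
)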

resp = client.chat.completions.create(
    model="privaterouter/gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

Sample prompts

Prompt: Explain transformers in two sentences.
Response: Transformers use self-attention to weigh how every token in a sequence relates to every other token, all in parallel. That replaces the slow recurrence of RNNs with a single big matrix multiply, which is why they scale so well on GPUs.

Prompt: Write a regex that matches a US ZIP code.
Response: ^\d{5}(-\d{4})?$
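
To check what the ZIP code regex above accepts, here is a minimal sketch using Python's standard `re` module. The helper name `is_zip` is hypothetical. It uses `fullmatch` because, in Python, `$` also matches just before a trailing newline, which a plain `match` would let through.

```python
import re

# The pattern from the sample response: five digits, optionally
# followed by a hyphen and four more digits (ZIP+4).
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")


def is_zip(text: str) -> bool:
    """Hypothetical helper: True if text is a 5-digit ZIP or ZIP+4."""
    return ZIP_RE.fullmatch(text) is not None


for candidate in ["12345", "12345-6789", "1234", "12345-678", "123456"]:
    print(candidate, is_zip(candidate))
```

The pattern checks format only. It does not confirm that a ZIP code is actually assigned.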