Document Library

Persistent, searchable per-user documents that the chat AI can ground its answers against. Upload a PDF, DOCX, Markdown, or TXT file once, and from then on:

Enable the Docs chat plugin and every chat completion auto-injects the top-K matching excerpts as a system message before calling the LLM.
Or, pass "plugins": [{"id": "docs", "doc_ids": [...]}] in the request body of an API call to scope grounding to specific documents.
Or, hit POST /api/documents/search directly for raw retrieval (useful for agents that want to build their own prompts).

Manage uploads at /library in the dashboard.

How it works

upload  →  extract text  →  chunk (~800 tokens, 150 overlap)
              ↓
        embed each chunk (privaterouter/nomic-embed, 768d vectors)
              ↓
        store in pgvector with HNSW index, content encrypted at rest
              ↓
        chat: embed query → cosine top-K → decrypt → inject as system msg
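
The chunking step in the diagram can be pictured as a sliding window over the token sequence. The sketch below is a minimal illustration, assuming fixed windows of 800 tokens where each window starts 650 tokens after the previous one (800 minus the 150-token overlap). The name chunk_tokens is hypothetical, and the real splitter may also respect sentence or page boundaries.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int = 800, overlap: int = 150) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # distance between the starts of consecutive windows
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks
```

With the defaults, a 2,000-token document yields three chunks starting at tokens 0, 650, and 1300, and the last 150 tokens of each chunk repeat at the start of the next.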

Encryption. Chunk text is encrypted at rest with your account's per-user data encryption key (DEK), same scheme as encrypted chat messages. Embeddings are stored unencrypted so the HNSW index can do its job — reversing a 768-dim vector back to text is research-grade attack territory and not practical.

Isolation. Every read path filters by user_id. You can't search, list, or retrieve another user's chunks even by passing their document IDs as a whitelist. Verified by the cross-user isolation tests.
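
To make the order of the filters concrete, here is an in-memory sketch of the retrieval step. It is not the service's implementation (which runs against pgvector); the function names and the dictionary keys user_id and embedding are assumptions for illustration. The point is that the ownership check runs before the doc_ids whitelist, so a whitelist containing someone else's document ID matches nothing.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_chunks(chunks: List[Dict], user_id: str, query_vec: Sequence[float],
                  top_k: int = 5, min_similarity: float = 0.3,
                  doc_ids: Optional[List[str]] = None) -> List[Dict]:
    hits = []
    for chunk in chunks:
        if chunk["user_id"] != user_id:
            continue  # ownership filter, applied unconditionally
        if doc_ids is not None and chunk["document_id"] not in doc_ids:
            continue  # optional whitelist only narrows the caller's own chunks
        score = cosine(query_vec, chunk["embedding"])
        if score >= min_similarity:
            hits.append({"chunk_id": chunk["chunk_id"],
                         "document_id": chunk["document_id"],
                         "similarity": score})
    hits.sort(key=lambda h: h["similarity"], reverse=True)
    return hits[:top_k]
```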

Storage. 50 MB free per user. Beyond that, storage is billed daily against your credit balance at $0.02/GB/day (about $0.60 per GB per month). You'll see a warning banner before any billable usage starts.
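
The billing arithmetic can be sketched as follows. This assumes decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes) and that only the bytes above the free tier are billed; neither detail is stated above, and the function name is hypothetical.

```python
from decimal import Decimal

FREE_TIER_BYTES = 50 * 1000 * 1000     # assumption: 1 MB = 10**6 bytes
BYTES_PER_GB = Decimal(10 ** 9)        # assumption: 1 GB = 10**9 bytes
RATE_PER_GB_DAY = Decimal("0.02")      # USD per GB per day, from the text above


def daily_storage_charge(used_bytes: int) -> Decimal:
    """USD charged for one day, assuming only usage above the free tier is billable."""
    billable = max(0, used_bytes - FREE_TIER_BYTES)
    return Decimal(billable) / BYTES_PER_GB * RATE_PER_GB_DAY
```

Under these assumptions, 1 GB above the free tier costs $0.02 per day, or $0.60 over a 30-day month.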

Supported file types

| Type | MIME type |
| --- | --- |
| PDF | `application/pdf` |
| DOCX | `application/vnd.openxmlformats-officedocument.wordprocessingml.document` |
| Markdown | `text/markdown` |
| Plain | `text/plain` |

Scanned-image PDFs return status: "failed" with a friendly message — OCR support is on the roadmap.

Max upload: 100 MB per file.
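
A client can reject unsupported files before uploading. The sketch below is one way to do that, assuming the usual file extensions for the four MIME types (the extension mapping is not part of the documented API) and treating 100 MB as 10**8 bytes. The helper name check_upload is hypothetical.

```python
import os

# Assumption: conventional extensions for the four supported MIME types.
SUPPORTED_MIME_BY_EXT = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".md": "text/markdown",
    ".txt": "text/plain",
}
MAX_UPLOAD_BYTES = 100 * 1000 * 1000  # assumption: 100 MB means 10**8 bytes


def check_upload(filename: str, size_bytes: int) -> str:
    """Return the MIME type to send, or raise ValueError if the file would be rejected."""
    ext = os.path.splitext(filename)[1].lower()
    mime = SUPPORTED_MIME_BY_EXT.get(ext)
    if mime is None:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 100 MB upload limit")
    return mime
```

This check cannot detect scanned-image PDFs; those are only identified server-side during text extraction.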

REST API

All endpoints take a dashboard session cookie. The /search endpoint is also available via API key for agent use.

Upload

curl -X POST https://privaterouter.com/api/documents \
  -H "Cookie: session=..." \
  -F "file=@manual.pdf"

Response (201):

{
  "id": "8f3b...",
  "filename": "manual.pdf",
  "status": "embedding",
  "chunk_count": 24,
  "size_bytes": 153842,
  "mime_type": "application/pdf",
  "created_at": "2026-05-22T03:14:00Z"
}

Status transitions: processing → embedding → ready (or failed). Poll GET /api/documents/{id} until status is ready or failed; the dashboard polls every 4s while any row is unsettled.
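
A polling loop for the status transitions might look like the sketch below. The HTTP call is left to the caller as fetch_status, a hypothetical callable that returns the current status string from GET /api/documents/{id}. The 4-second interval mirrors the dashboard, and the 300-second timeout is an arbitrary choice.

```python
import time
from typing import Callable

SETTLED = {"ready", "failed"}


def wait_until_settled(fetch_status: Callable[[], str], interval_s: float = 4.0,
                       timeout_s: float = 300.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the document reaches a settled status, then return that status."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in SETTLED:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing sleep as a parameter keeps the loop easy to test without real delays.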

List

GET /api/documents?limit=50&offset=0

Returns the document index plus a storage block with used_bytes, free_tier_bytes, and over_free_tier for the dashboard usage meter.
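
As a rough illustration of how a client could turn the storage block into a usage meter, the sketch below assumes used_bytes and free_tier_bytes are integer byte counts in a flat object. The exact response shape and the type of over_free_tier are not documented here, so that field is left untouched, and the helper name is hypothetical.

```python
from typing import Dict, Optional, Union


def usage_meter(storage: Dict) -> Dict[str, Optional[Union[int, float]]]:
    """Summarize a storage block as percent of the free tier used and bytes above it."""
    used = storage["used_bytes"]
    free = storage["free_tier_bytes"]
    return {
        "percent_of_free_tier": round(100 * used / free, 1) if free else None,
        "billable_bytes": max(0, used - free),
    }
```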

Search (raw retrieval, no LLM)

curl -X POST https://privaterouter.com/api/documents/search \
  -H "Cookie: session=..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what is the rotation policy?",
    "top_k": 5,
    "min_similarity": 0.3,
    "doc_ids": null
  }'

Response:

{
  "hits": [
    {
      "chunk_id": "...",
      "document_id": "...",
      "document_filename": "policy.pdf",
      "chunk_index": 3,
      "content": "All encryption keys must rotate quarterly...",
      "similarity": 0.74,
      "metadata": { "page": 7 }
    }
  ],
  "query_tokens": 12,
  "cost_usd": "0.00000024",
  "latency_ms": 91,
  "embedding_model": "privaterouter/nomic-embed"
}

Billing: each search is charged the embedding cost of the query (typically ~12 tokens, about $2.4e-7 per search). There is no charge for the hits themselves.
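
The same search call can be prepared from Python with the standard library. This sketch only builds the request object and mirrors the curl example's cookie authentication; the helper names are hypothetical. The second function derives the per-token price implied by a response, which for the sample response above works out to $0.00000002 per token.

```python
import json
import urllib.request
from decimal import Decimal
from typing import Dict, List, Optional

SEARCH_URL = "https://privaterouter.com/api/documents/search"


def build_search_request(query: str, session_cookie: str, top_k: int = 5,
                         min_similarity: float = 0.3,
                         doc_ids: Optional[List[str]] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = {"query": query, "top_k": top_k,
            "min_similarity": min_similarity, "doc_ids": doc_ids}
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Cookie": f"session={session_cookie}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def cost_per_token(response: Dict) -> Decimal:
    """Per-token embedding price implied by one search response."""
    return Decimal(response["cost_usd"]) / response["query_tokens"]
```

Sending the request is then a matter of passing it to urllib.request.urlopen and decoding the JSON body.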

Delete

DELETE /api/documents/{id}

Removes the row, cascades to all chunks, and unlinks the on-disk file. Returns 204.

Reindex

POST /api/documents/{id}/reindex

Wipes every chunk's embedding and re-runs the embedding worker against them. Use when:

The embedding model is upgraded
A previous embed failed and you want to retry
You suspect stale vectors

Reindexing is charged the per-token embedding cost again.
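
For a rough upper bound on what a reindex costs, one can multiply the document's chunk_count by the nominal chunk size and a per-token price. Both constants below are assumptions: the price is inferred from the search example ($0.00000024 for 12 tokens), and documents may be embedded at a different rate or with shorter chunks.

```python
from decimal import Decimal

ASSUMED_PRICE_PER_TOKEN = Decimal("0.00000002")  # inferred from the search example, not a published rate
ASSUMED_TOKENS_PER_CHUNK = 800                   # nominal chunk size; real chunks may be shorter


def estimate_reindex_cost(chunk_count: int) -> Decimal:
    """Approximate USD cost of re-embedding every chunk of one document."""
    return chunk_count * ASSUMED_TOKENS_PER_CHUNK * ASSUMED_PRICE_PER_TOKEN
```

Under these assumptions, the 24-chunk manual.pdf from the upload example would cost about $0.000384 to reindex.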

Chat plugin

Enable at /settings/plugins → toggle "Document Library" on. Configure top_k (default 5, max 20) and min_similarity (default 0.30, range 0.0–1.0).

When enabled, every chat completion — whether through the web chat, an API key, or an embedded widget — runs a vector search over your documents and prepends matching excerpts as a system message. The LLM is instructed to cite with bracketed numbers ([1], [2]) matching the injected sources.
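
The injection step can be sketched as below. The wording of the system message and the per-source line format are assumptions made for illustration; only the bracketed numbering and the "prepend as a system message" behavior come from the description above. The hit fields match the /search response.

```python
from typing import Dict, List


def build_grounding_message(hits: List[Dict]) -> Dict[str, str]:
    """Render search hits as a numbered source list inside one system message."""
    lines = ["Answer using the sources below and cite them as [1], [2], ..."]
    for number, hit in enumerate(hits, start=1):
        lines.append(f"[{number}] ({hit['document_filename']}) {hit['content']}")
    return {"role": "system", "content": "\n".join(lines)}


def inject_grounding(messages: List[Dict], hits: List[Dict]) -> List[Dict]:
    """Prepend the grounding message, or return the conversation unchanged if there are no hits."""
    if not hits:
        return list(messages)
    return [build_grounding_message(hits)] + list(messages)
```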

Per-request override

Pass plugins in any chat completion body:

{
  "model": "privaterouter/qwen3-32b",
  "messages": [{"role": "user", "content": "What's the rotation policy?"}],
  "plugins": [
    {
      "id": "docs",
      "doc_ids": ["8f3b...", "a91c..."],
      "top_k": 3,
      "min_similarity": 0.4
    }
  ]
}

doc_ids is optional — omit to search the whole library.
doc_ids: [] is an explicit opt-out (no grounding even if the per-user default is on).
Per-request plugins override the user's saved default entirely.
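
The three rules above amount to a small precedence function. The sketch below is one reading of them, with a hypothetical function name. It additionally assumes that a per-request plugins list with no "docs" entry disables grounding, and it does not merge unspecified fields such as top_k from the saved default, since the per-request value is said to override it entirely.

```python
from typing import Dict, List, Optional


def resolve_docs_plugin(saved_default: Optional[Dict],
                        request_plugins: Optional[List[Dict]]) -> Optional[Dict]:
    """Return the effective docs settings, or None when no grounding should run."""
    if request_plugins is None:
        return saved_default  # no per-request plugins: the saved default applies
    for plugin in request_plugins:
        if plugin.get("id") == "docs":
            if plugin.get("doc_ids") == []:
                return None  # explicit opt-out
            return plugin    # doc_ids missing means "search the whole library"
    return None  # assumption: a plugins list without "docs" disables grounding
```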

Failure modes

Insufficient credit at search time. The chat hook silently no-ops and the LLM answers without grounding; the REST /search endpoint returns 402.
Embedding service down. The chat hook degrades silently in the same way; /search returns 503.
No hits above min_similarity. The chat hook doesn't inject anything — you get an ungrounded answer rather than a wrong one hallucinated from low-signal chunks.
Document status: "frozen". This happens when your storage is over the free tier and your credit balance is zero. Search skips frozen documents. Top up your credit and they thaw automatically on the next daily cron run.
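
An agent that calls /search itself can reproduce the chat hook's silent degradation. In the sketch below, SearchUnavailable, search, and inject are hypothetical: the search callable is assumed to raise SearchUnavailable on a 402 or 503 response, and inject is whatever function adds the hits to the prompt.

```python
from typing import Callable, Dict, List


class SearchUnavailable(Exception):
    """Raised by a (hypothetical) search client for the 402 and 503 cases."""

    def __init__(self, status_code: int):
        super().__init__(f"search failed with HTTP {status_code}")
        self.status_code = status_code


def ground_or_passthrough(messages: List[Dict], query: str,
                          search: Callable[[str], List[Dict]],
                          inject: Callable[[List[Dict], List[Dict]], List[Dict]]) -> List[Dict]:
    """Ground the conversation when possible; otherwise return it unchanged."""
    try:
        hits = search(query)
    except SearchUnavailable:
        return list(messages)  # no credit or embedding outage: answer ungrounded
    if not hits:
        return list(messages)  # nothing above min_similarity: inject nothing
    return inject(messages, hits)
```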

GDPR

Deleting your account hard-deletes every document, every chunk, and unlinks every file from disk. The GDPR export at GET /api/account/export includes the document index (filename, size, status, sha256, dates) but not the chunk contents — those are encrypted with your key, and you already have the original files.
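
Because the export lists a sha256 for each document, you can confirm that a local copy is the file you uploaded. The sketch below assumes the value is the hex-encoded SHA-256 digest of the original file bytes and that each export entry exposes it under a sha256 key; both are assumptions about the export format, and the helper names are hypothetical.

```python
import hashlib
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_export(local_bytes: bytes, export_entry: Dict) -> bool:
    """True if the local file content hashes to the digest recorded in the export entry."""
    return sha256_hex(local_bytes) == export_entry["sha256"].lower()
```

For a file on disk, read it in binary mode (for example with open(path, "rb")) and pass the bytes to matches_export.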