Document Library
Persistent, searchable per-user documents that the chat AI can ground its answers against. Upload a PDF, DOCX, Markdown, or TXT file once, and from then on:
- Enable the Docs chat plugin and every chat completion auto-injects the top-K matching excerpts as a system message before calling the LLM.
- Or, pass plugins: [{"id": "docs", "doc_ids": [...]}] per request through the API to scope grounding to specific documents.
- Or, hit POST /api/documents/search directly for raw retrieval (useful for agents that want to build their own prompts).
Manage uploads at /library in the dashboard.
How it works
upload → extract text → chunk (~800 tokens, 150 overlap)
↓
embed each chunk (privaterouter/nomic-embed, 768d vectors)
↓
store in pgvector with HNSW index, content encrypted at rest
↓
chat: embed query → cosine top-K → decrypt → inject as system msg
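The chunking step in the pipeline above can be pictured as a sliding window. The following is a minimal sketch, not the service's actual implementation: it assumes the text has already been tokenized into a list, and `chunk_tokens` is a hypothetical name. The tokenizer and the handling of the final partial window are not specified in this document.

```python
def chunk_tokens(tokens, size=800, overlap=150):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # with the defaults, each window starts 650 tokens later
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
        start += step
    return chunks
```

With the defaults, a 2,000-token document yields three chunks, and neighbouring chunks share 150 tokens so that a sentence cut at a boundary still appears whole in one of them.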
Encryption. Chunk text is encrypted at rest with your account's per-user data encryption key (DEK), same scheme as encrypted chat messages. Embeddings are stored unencrypted so the HNSW index can do its job — reversing a 768-dim vector back to text is research-grade attack territory and not practical.
Isolation. Every read path filters by user_id. You can't search,
list, or retrieve another user's chunks even by passing their document
IDs as a whitelist. Verified by the cross-user isolation tests.
Storage. 50 MB free per user. Beyond that, storage is billed daily against your credit balance at $0.02/GB/day (about $0.60 per GB per month). You'll see a warning banner before any billable usage starts.
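To make the storage arithmetic concrete, here is a small sketch of the daily charge. It assumes that only the bytes above the free tier are billed and that MB and GB are decimal units (10^6 and 10^9 bytes); neither detail is stated here, and `daily_storage_charge` is a hypothetical helper, not part of the API.

```python
from decimal import Decimal

FREE_TIER_BYTES = 50 * 1000 * 1000   # assumption: 50 MB as decimal megabytes
BYTES_PER_GB = Decimal(1000 ** 3)    # assumption: decimal gigabytes
RATE_PER_GB_DAY = Decimal("0.02")    # rate stated in the Storage paragraph

def daily_storage_charge(used_bytes):
    """Estimated USD charged per day for `used_bytes` of stored documents."""
    excess = max(0, used_bytes - FREE_TIER_BYTES)  # assumption: free tier is deducted
    return (Decimal(excess) / BYTES_PER_GB) * RATE_PER_GB_DAY
```

Under these assumptions, 1 GB above the free tier costs $0.02 per day, which is where the figure of roughly $0.60 per 30-day month comes from.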
Supported file types
| Type | MIME |
|---|---|
| PDF | application/pdf |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| Markdown | text/markdown |
| Plain | text/plain |
Scanned-image PDFs return status: "failed" with a friendly message —
OCR support is on the roadmap.
Max upload: 100 MB per file.
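A client can pre-check a file before uploading it. The sketch below is illustrative only: the MIME types come from the table above, but the extension mapping, the reading of 100 MB as 100 × 1024 × 1024 bytes, and the name `check_upload` are assumptions. The server remains the authority on what it accepts.

```python
import os

# MIME types are from the table above; the extension keys are an assumption.
SUPPORTED = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".md": "text/markdown",
    ".txt": "text/plain",
}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # assumption: 100 MB read as binary megabytes

def check_upload(filename, size_bytes):
    """Return the MIME type to send, or raise ValueError for a likely rejection."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 100 MB upload limit")
    return SUPPORTED[ext]
```

This check cannot detect scanned-image PDFs; those are only reported after upload, through the failed status.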
REST API
All endpoints take a dashboard session cookie. The /search endpoint
is also available via API key for agent use.
Upload
curl -X POST https://privaterouter.com/api/documents \
  -H "Cookie: session=..." \
  -F "file=@manual.pdf"
Response (201):
{
  "id": "8f3b...",
  "filename": "manual.pdf",
  "status": "embedding",
  "chunk_count": 24,
  "size_bytes": 153842,
  "mime_type": "application/pdf",
  "created_at": "2026-05-22T03:14:00Z"
}
Status transitions: processing → embedding → ready (or failed).
Poll GET /api/documents/{id} to watch it complete; the dashboard polls
every 4s while any row is unsettled.
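A polling loop along the lines described above might look like the following sketch. `fetch_status` is a hypothetical callable that performs GET /api/documents/{id} and returns the status string; the 4-second interval mirrors the dashboard, and the timeout is an arbitrary assumption.

```python
import time

SETTLED = {"ready", "failed"}  # terminal states from the status transitions above

def wait_until_settled(fetch_status, interval_s=4.0, timeout_s=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the document is ready or failed; raise TimeoutError otherwise."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in SETTLED:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.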
List
GET /api/documents?limit=50&offset=0
Returns the document index plus a storage block with used_bytes,
free_tier_bytes, and over_free_tier for the dashboard usage meter.
Search (raw retrieval, no LLM)
curl -X POST https://privaterouter.com/api/documents/search \
  -H "Cookie: session=..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what is the rotation policy?",
    "top_k": 5,
    "min_similarity": 0.3,
    "doc_ids": null
  }'
Response:
{
  "hits": [
    {
      "chunk_id": "...",
      "document_id": "...",
      "document_filename": "policy.pdf",
      "chunk_index": 3,
      "content": "All encryption keys must rotate quarterly...",
      "similarity": 0.74,
      "metadata": { "page": 7 }
    }
  ],
  "query_tokens": 12,
  "cost_usd": "0.00000024",
  "latency_ms": 91,
  "embedding_model": "privaterouter/nomic-embed"
}
Billing: charges the embedding cost of the query (typically ~12 tokens → ~$2.4e-7 per search). No charge for the hits themselves.
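For agents written in Python, the same search call can be assembled with the standard library. This is a sketch that mirrors the curl example and the response shape above; `build_search_request` and `best_hits` are hypothetical helper names, and session-cookie authentication is used because the API-key header format is not described here.

```python
import json
import urllib.request

SEARCH_URL = "https://privaterouter.com/api/documents/search"

def build_search_request(query, session_cookie, top_k=5, min_similarity=0.3, doc_ids=None):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {"query": query, "top_k": top_k,
            "min_similarity": min_similarity, "doc_ids": doc_ids}
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Cookie": f"session={session_cookie}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def best_hits(response_body, limit=3):
    """Return (filename, similarity, content) for the strongest hits."""
    hits = sorted(response_body["hits"], key=lambda h: h["similarity"], reverse=True)
    return [(h["document_filename"], h["similarity"], h["content"]) for h in hits[:limit]]
```

To send the request, pass it to `urllib.request.urlopen` and decode the body with `json.load`.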
Delete
DELETE /api/documents/{id}
Removes the row, cascades to all chunks, and unlinks the on-disk file.
Returns 204.
Reindex
POST /api/documents/{id}/reindex
Wipes every chunk's embedding and re-runs the embedding worker against them. Use when:
- The embedding model is upgraded
- A previous embed failed and you want to retry
- You suspect stale vectors
Reindexing is charged at the per-token embedding cost again.
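A rough way to estimate what a reindex might cost is sketched below. The per-token rate is inferred from the search example earlier in this document (12 tokens for $0.00000024) and is not a published price; the 800-token chunk size is the approximate figure from the pipeline description, and `estimate_reindex_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Inferred from the search example (12 tokens -> $0.00000024); not a published price.
IMPLIED_USD_PER_TOKEN = Decimal("0.00000024") / 12

def estimate_reindex_cost(chunk_count, tokens_per_chunk=800):
    """Approximate USD cost of re-embedding every chunk of one document."""
    return IMPLIED_USD_PER_TOKEN * chunk_count * tokens_per_chunk
```

For the 24-chunk document from the upload example, this comes to about $0.000384, assuming every chunk is a full 800 tokens.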
Chat plugin
Enable at /settings/plugins → toggle "Document
Library" on. Configure top_k (default 5, max 20) and min_similarity
(default 0.30, range 0.0–1.0).
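The effect of top_k and min_similarity can be illustrated with a brute-force version of the retrieval step. This is a sketch only: the service uses an HNSW index in pgvector rather than a linear scan, and clamping out-of-range settings (instead of rejecting them) is an assumption. `cosine` and `top_k_matches` are hypothetical names.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_matches(query_vec, chunks, top_k=5, min_similarity=0.30):
    """chunks: iterable of (chunk_id, vector). Returns [(chunk_id, similarity)]."""
    top_k = max(1, min(top_k, 20))                        # documented maximum is 20
    min_similarity = max(0.0, min(min_similarity, 1.0))   # documented range 0.0 to 1.0
    scored = [(cosine(query_vec, vec), cid) for cid, vec in chunks]
    kept = [(s, cid) for s, cid in scored if s >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)     # best match first
    return [(cid, s) for s, cid in kept[:top_k]]
```

Raising min_similarity trades recall for precision: fewer excerpts are injected, but each one is more likely to be relevant.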
When enabled, every chat completion — whether through the web chat, an
API key, or an embedded widget — runs a vector search over your
documents and prepends matching excerpts as a system message. The LLM
is instructed to cite with bracketed numbers ([1], [2]) matching
the injected sources.
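Agents that call the search endpoint directly can build a similar grounding message themselves. The sketch below shows one possible layout with numbered sources; the exact wording and format the chat hook uses are not documented here, so both the instruction text and the name `build_grounding_message` are assumptions.

```python
def build_grounding_message(hits):
    """Turn search hits into one system message with numbered sources."""
    if not hits:
        return None  # nothing cleared the threshold, so nothing is injected
    lines = ["Answer using the sources below and cite them as [1], [2], ..."]
    for n, hit in enumerate(hits, start=1):
        lines.append(f"[{n}] {hit['document_filename']}: {hit['content']}")
    return {"role": "system", "content": "\n".join(lines)}
```

The returned dictionary can be placed ahead of the user's messages in a chat completion request.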
Per-request override
Pass plugins in any chat completion body:
{
  "model": "privaterouter/qwen3-32b",
  "messages": [{"role": "user", "content": "What's the rotation policy?"}],
  "plugins": [
    {
      "id": "docs",
      "doc_ids": ["8f3b...", "a91c..."],
      "top_k": 3,
      "min_similarity": 0.4
    }
  ]
}
- doc_ids is optional: omit it to search the whole library.
- doc_ids: [] is an explicit opt-out (no grounding even if the per-user default is on).
- Per-request plugins override the user's saved default entirely.
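The override rules above can be read as a small decision function. This sketch reflects one interpretation of those rules, not the server's code: the shape of the saved default is hypothetical, and treating a plugins list without a docs entry as "no grounding" is an assumption drawn from the statement that the per-request list replaces the saved default entirely.

```python
def resolve_docs_plugin(saved_default, request_plugins):
    """Return the effective docs config, or None when no grounding should run.

    saved_default: dict or None (the user's saved plugin setting, shape assumed)
    request_plugins: list or None (the `plugins` field of the request body)
    """
    if request_plugins is None:
        config = saved_default  # no override: fall back to the saved default
    else:
        # A per-request list replaces the saved default entirely.
        config = next((p for p in request_plugins if p.get("id") == "docs"), None)
    if config is None:
        return None
    if config.get("doc_ids") == []:
        return None  # explicit opt-out
    return config  # a missing doc_ids key means "search the whole library"
```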
Failure modes
- Insufficient credit at search time. The hook silently no-ops and the LLM answers without grounding. The REST endpoint returns 402.
- Embedding service down. Same silent degradation in the chat hook; /search returns 503.
- No hits above min_similarity. The chat hook doesn't inject anything, so you get an ungrounded answer rather than a wrong one hallucinated from low-signal chunks.
- Document status: "frozen". Happens when storage is over the free tier and your credit balance is zero. Search skips frozen docs. Top up credit and they thaw automatically on the next daily cron.
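Clients that call the search endpoint directly may want to treat the 402 and 503 responses differently. The following is a sketch of one reasonable policy, not a prescribed one: `send` is a hypothetical callable that performs the request and returns `(status_code, body)`, a successful search is assumed to return 200, and the retry count and backoff delays are arbitrary.

```python
import time

class InsufficientCredit(Exception):
    """Raised on HTTP 402; retrying will not help until credit is added."""

def search_with_retry(send, max_attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying only on 503 with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:  # assumption: success is reported as 200
            return body
        if status == 402:
            raise InsufficientCredit("top up credit before searching again")
        if status == 503 and attempt < max_attempts - 1:
            sleep(base_delay_s * 2 ** attempt)  # 1s, 2s, 4s, ... with the default base
            continue
        raise RuntimeError(f"search failed with HTTP {status}")
```

Unlike the chat hook, which degrades silently, this surfaces both failures to the caller so an agent can decide whether to answer without grounding.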
GDPR
Deleting your account hard-deletes every document, every chunk, and
unlinks every file from disk. The GDPR export at
GET /api/account/export includes the document index (filename, size,
status, sha256, dates) but not the chunk contents — those are
encrypted with your key, and you already have the original files.