Reader API contract (v1.2+ preview)
Status: contract-only. No server yet — today the static site is the API. This doc locks the shape so when we add a hosted / SPA reader we don't have to rewrite the content model. Freezing this now protects the build pipeline (
site/outputs) and the sibling.txt/.jsonfiles from drift (#116).
Why a contract first
llmwiki is, and will stay, static-site-first. But a few near-term bets depend on the data being reachable without HTML parsing:
- A browser extension that answers "what do I know about X" from the
wiki's
.jsonsibling of the current tab. - A Raycast/Alfred plugin that hits
manifest.json+search-index.jsonto open a page. - A future lightweight SPA reader that can live on the same origin as the generated site.
- Downstream LLM agents consuming
llms-full.txt+ per-page.jsonto answer questions without pulling HTML.
Every one of those wants the same shape of data. This doc says what that
shape is, so refactors of llmwiki/build.py can't silently break
clients.
Shipped today (v1.0+) — read-only, file-based
The static build writes these to site/ on every llmwiki build:
| Path | Shape | Purpose |
|---|---|---|
/index.html |
HTML | Home page |
/<group>/index.html |
HTML | Project / sessions / models / vs index |
/<group>/<slug>.html |
HTML | Individual page |
/<group>/<slug>.txt |
Plain text | HTML-free body + first-line frontmatter |
/<group>/<slug>.json |
JSON | Structured metadata + body + outbound wikilinks |
/llms.txt |
Markdown | Short AI-agent index (llmstxt.org spec) |
/llms-full.txt |
Plain text | Flattened dump (≤ 5 MB) |
/graph.jsonld |
JSON-LD | Schema.org entity/concept/source graph |
/graph.html |
HTML | Interactive vis-network graph (#118) |
/search-index.json |
JSON | Top-level search index + facets + chunk manifest |
/search-chunks/<project>.json |
JSON | Per-project search chunk (lazy-loaded) |
/manifest.json |
JSON | Every file + SHA-256 + performance budget |
/sitemap.xml |
XML | Standard sitemap with lastmod |
/rss.xml |
XML | RSS 2.0 feed of newest sessions |
/robots.txt |
Text | AI-friendly, references llms.txt |
/ai-readme.md |
Markdown | AI-agent navigation instructions |
These are already the API. Everything below in this doc describes the future hosted/SPA surface that will be fed by the same data shapes — no new content pipeline, just new transports.
Future endpoint contract
Every endpoint below maps 1:1 to a file that llmwiki build already
produces. The server is a thin JSON wrapper; the content model is what's
already on disk.
Base URL: <root>/api/v1 (TBD — static deploy keeps /api/v1/*.json as
files).
GET /api/v1/bootstrap
One-shot payload the reader fetches on first load so it doesn't have to chain three requests before showing anything.
{
"version": "1.1.0rc2",
"generated_at": "2026-04-19T08:34:42Z",
"stats": {
"sessions": 647,
"projects": 30,
"entities": 2,
"concepts": 0,
"total_bytes": 62691698
},
"nav": [
{ "id": "home", "label": "Home", "href": "/" },
{ "id": "projects", "label": "Projects", "href": "/projects/" },
{ "id": "sessions", "label": "Sessions", "href": "/sessions/" },
{ "id": "models", "label": "Models", "href": "/models/" },
{ "id": "vs", "label": "Compare", "href": "/vs/" },
{ "id": "graph", "label": "Graph", "href": "/graph.html" },
{ "id": "changelog", "label": "Changelog", "href": "/changelog.html" }
],
"theme": {
"accent": "#7C3AED",
"default": "dark"
},
"search": {
"mode": "flat",
"chunks": "/search-chunks/",
"index": "/search-index.json"
},
"cache_tiers": ["L1", "L2", "L3", "L4"]
}
Client contract. Safe to cache for 5 minutes. Never returns partial data — if the site rebuilds mid-request, the server serves the previous full payload until the new one is ready.
GET /api/v1/article?path=<url>
The article shell already rendered as structured data — lets a SPA skip HTML parsing entirely.
{
"url": "sessions/llm-wiki/2026-04-17T10-12-llm-wiki-refactor.html",
"slug": "2026-04-17T10-12-llm-wiki-refactor",
"title": "LLM Wiki refactor",
"type": "source",
"project": "llm-wiki",
"model": "claude-sonnet-4-6",
"date": "2026-04-17",
"last_updated": "2026-04-17",
"confidence": 0.75,
"lifecycle": "reviewed",
"cache_tier": "L3",
"entity_type": null,
"tags": ["claude-code", "refactor"],
"breadcrumbs": [
{ "label": "Home", "href": "/" },
{ "label": "Projects", "href": "/projects/" },
{ "label": "llm-wiki", "href": "/projects/llm-wiki.html" },
{ "label": "LLM Wiki refactor" }
],
"body_html": "<article>…</article>",
"body_text": "Raw markdown body without frontmatter, suitable for LLM context.",
"wikilinks_out": ["Obsidian", "Karpathy"],
"wikilinks_in": ["llm-wiki", "AndrejKarpathy"],
"related": [
{ "slug": "2026-04-16T18-30-llm-wiki-seed", "title": "LLM Wiki seed", "score": 0.82 }
],
"reading_time_minutes": 4,
"summary": "First-paragraph summary for L2 pre-loading."
}
Required fields: url, slug, title, type, body_html,
body_text, wikilinks_out. Everything else is optional and may be
null/missing.
Client contract. The reader MUST gracefully render when optional
fields are missing (a newly ingested page may not have confidence or
cache_tier yet).
GET /api/v1/search?q=<query>&type=<optional>&project=<optional>
Thin wrapper over the existing client-side index + chunks. Returns the matches the palette would surface.
{
"query": "karpathy",
"mode": "flat",
"total": 12,
"hits": [
{
"id": "session:llm-wiki/2026-04-16T18-30-llm-wiki-seed",
"url": "sessions/llm-wiki/2026-04-16T18-30-llm-wiki-seed.html",
"title": "LLM Wiki seed",
"type": "source",
"project": "llm-wiki",
"snippet": "Karpathy's pattern spells out what…",
"score": 0.91,
"headings": [
{ "depth": 2, "text": "Summary" },
{ "depth": 3, "text": "Karpathy's pattern" }
]
}
],
"facets": {
"entity_type": { },
"lifecycle": { },
"tags": { },
"confidence": { "none": 647 }
}
}
Mode. "flat" vs "tree" — the client-side router today picks the
mode by heuristic (#53 lands the auto-router). The server MUST return
the same mode it used so the client can tell the user in the palette
footer.
Client contract. hits is capped at 100; the client does its own
pagination. score is 0–1 but not calibrated — use for ranking, not
thresholds.
POST /api/v1/sync (internal only)
Trigger a rebuild without waiting for the next watcher tick. Used by
/wiki-sync after a successful ingest.
POST /api/v1/sync
Authorization: Bearer <local-token>
{
"reason": "ingest",
"pages_changed": ["sources/llm-wiki-refactor.md"]
}
Response:
{
"accepted": true,
"build_id": "2026-04-19T10:22:01Z",
"eta_seconds": 2
}
Auth. Local bearer token only — this endpoint is never exposed to
the public internet. manifest.json is the read-side proof that the
build finished (its generated_at advances).
Data model invariants
Anything a client can depend on:
- Slugs are stable. A page's slug is set at ingest and never changes on rebuild. Renames produce a new slug and a redirect stub.
- Timestamps are UTC ISO-8601 with
Zsuffix. Never local time. cache_tieris always one ofL1,L2,L3,L4(#52). Missing = treat asL3.lifecycleis always one ofdraft,reviewed,verified,stale,archived(#11).confidenceis always in[0, 1]or missing. Never percent.entity_type(when set) is one ofperson,org,tool,concept,api,library,project(#137).- Wikilinks resolve to slugs, not URLs.
[[Karpathy]]→"Karpathy"— the client resolves to a URL via the index. - Frontmatter is authoritative for metadata. The body is authoritative for prose.
Versioning
/api/v1/*is the long-term contract. Breaking changes bump to/v2/and keep/v1/live for one minor version.- Additive-only changes (new optional fields, new top-level keys on
bootstrap) don't bump the version. - Rename of an existing required field is a breaking change.
Content negotiation
Today's static site already does this implicitly:
curl .../page.html→ HTMLcurl .../page.txt→ plain textcurl .../page.json→ structured
The future server will keep those three paths exactly as-is.
Accept: application/json on .html routes should redirect to
the .json sibling rather than serving JSON on the HTML URL —
that way caches and proxies stay simple.
Migration path — static → hosted
- Today:
llmwiki buildwrites the JSON/txt files. External tools read them directly. (Done — #116 is this doc.) - v1.2: Add a tiny
llmwiki serve --apiflag that wraps the same files behind/api/v1/*paths so the reader SPA can fetch them uniformly in dev. No new data, just routing. - v1.3+: If a hosted multi-tenant reader ships, the server reuses the same routes with per-user auth. The content pipeline doesn't change.
At no point does the contract require a rewrite of llmwiki/build.py —
every endpoint maps to something build.py already emits.
Related
llmwiki/build.py— produces every file referenced abovellmwiki/exporters.py—llms.txt+ JSON-LD + per-page siblingsdocs/reference/cache-tiers.md—cache_tierinvariant (#52)docs/design/brand-system.md— theme tokens returned by/bootstrap#116— this issue#112— reader-first article shell (one client of this contract)