Entity schema reference (v0.7 · #55)

The wiki's entity pages are free-form markdown by default — a file like wiki/entities/OpenAI.md can be whatever you want, and the slash-command workflow just edits the body. AI model entities are a special case: they carry structured frontmatter so llmwiki can render a sortable /models/ index, inline info-cards, and future comparison pages (#58).

This reference describes the schema. It's opt-in — any entity page that doesn't set entity_kind: ai-model is ignored by the model pipeline and continues to render as normal markdown.

Minimum viable model page

---
title: "Claude Sonnet 4"
type: entity
entity_kind: ai-model
provider: Anthropic
---

Free-form markdown body here.

With just this, the page will appear in the /models/ table with an em-dash in every numeric column. Add the structured blocks below to populate them.

Full schema

---
title: "Claude Sonnet 4"
type: entity
entity_kind: ai-model
provider: Anthropic

# Nested blocks are written as inline JSON so llmwiki's lightweight
# frontmatter parser can store them without a full YAML library. The
# schema validator parses them back out at build time.
model: {"context_window": 200000, "max_output": 8192, "license": "proprietary", "released": "2026-03-18"}
pricing: {"input_per_1m": 3.00, "output_per_1m": 15.00, "cache_read_per_1m": 0.30, "currency": "USD", "effective": "2026-03-18"}
modalities: [text, vision]
benchmarks: {"gpqa_diamond": 0.725, "swe_bench": 0.619, "mmlu": 0.887}
---
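The inline-JSON convention described in the comment above can be illustrated with a minimal sketch. It assumes a line-oriented `key: value` format; `parse_frontmatter` is a hypothetical helper written for this example, not llmwiki's actual parser, and it relies only on the standard-library `json` module.

```python
import json


def parse_frontmatter(text: str) -> dict:
    """Hypothetical line-oriented parser: one `key: value` pair per line."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and the `---` delimiters.
        if not line or line.startswith("#") or line == "---":
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if value.startswith("{"):
            # Nested block stored as inline JSON.
            try:
                meta[key] = json.loads(value)
            except json.JSONDecodeError:
                meta[key] = value  # keep the raw text so a validator can warn
        elif value.startswith("[") and value.endswith("]"):
            # Plain flow list such as [text, vision].
            meta[key] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key] = value.strip('"')
    return meta


sample = """---
title: "Claude Sonnet 4"
entity_kind: ai-model
model: {"context_window": 200000, "max_output": 8192}
modalities: [text, vision]
---"""

meta = parse_frontmatter(sample)
print(meta["model"]["context_window"])
print(meta["modalities"])
```

Splitting on the first colon only is what lets the JSON value (which contains colons of its own) survive intact on a single line.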

`model` block

| Key | Type | Notes |
| --- | --- | --- |
| `context_window` | int | Max input context, tokens. Must be > 0. |
| `max_output` | int | Max single-response output tokens. |
| `license` | string | `"proprietary"`, `"apache-2.0"`, `"mit"`, etc. |
| `released` | ISO date | `YYYY-MM-DD` |
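The two constraints stated for the `model` block (a positive `context_window` and an ISO `released` date) could be checked roughly as follows. `check_model_block` is a hypothetical name for illustration; the real validator may be structured differently and may check more fields.

```python
from datetime import date


def check_model_block(model: dict) -> list[str]:
    """Return a list of warnings; an empty list means the block looks fine."""
    warnings = []

    cw = model.get("context_window")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if cw is not None and (not isinstance(cw, int) or isinstance(cw, bool) or cw <= 0):
        warnings.append("model.context_window must be an integer > 0")

    released = model.get("released")
    if released is not None:
        try:
            # Note: Python 3.11+ also accepts some other ISO 8601 spellings here.
            date.fromisoformat(released)
        except (TypeError, ValueError):
            warnings.append("model.released must be an ISO date (YYYY-MM-DD)")

    return warnings


print(check_model_block({"context_window": 200000, "released": "2026-03-18"}))
print(check_model_block({"context_window": 0, "released": "March 2026"}))
```

Returning warnings instead of raising matches the non-blocking behaviour the build pipeline describes for validation problems.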

`pricing` block

| Key | Type | Notes |
| --- | --- | --- |
| `input_per_1m` | float | USD per 1M input tokens. Must be ≥ 0. |
| `output_per_1m` | float | USD per 1M output tokens. |
| `cache_read_per_1m` | float | Discounted price for cached context reads. |
| `cache_write_per_1m` | float | Price for writing to the prompt cache. |
| `currency` | string | `"USD"`, `"EUR"`, `"GBP"`, ... |
| `effective` | ISO date | When this pricing took effect. |
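Because every price is quoted per 1M tokens, turning a `pricing` block into a per-request cost is a matter of dividing by 1,000,000. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and it assumes cached tokens are passed separately from `input_tokens` and billed solely at the cache-read rate.

```python
def estimate_cost(pricing: dict, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Return the request cost in the unit named by the block's `currency`."""
    def rate(key: str) -> float:
        # Missing prices are treated as 0 here; a real tool might warn instead.
        return pricing.get(key, 0.0) / 1_000_000

    return (input_tokens * rate("input_per_1m")
            + output_tokens * rate("output_per_1m")
            + cached_tokens * rate("cache_read_per_1m"))


pricing = {"input_per_1m": 3.00, "output_per_1m": 15.00,
           "cache_read_per_1m": 0.30, "currency": "USD"}

# 100k input tokens and 2k output tokens: 0.30 + 0.03
cost = estimate_cost(pricing, input_tokens=100_000, output_tokens=2_000)
print(f"{cost:.2f} {pricing['currency']}")
```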

`modalities`

Plain YAML list. Common values: text, vision, audio, video, function-calling, tool-use.

`benchmarks` block

Benchmark scores are fractions in [0, 1] (0.725 = 72.5%). The validator rejects values outside that range with a warning, so don't paste raw percentages such as 72.5.
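A range check of this kind might look like the sketch below. `check_benchmarks` is a hypothetical function, and dropping the offending value while recording a warning is an assumption about what "rejects with a warning" means in practice.

```python
def check_benchmarks(benchmarks: dict) -> tuple[dict, list[str]]:
    """Split a benchmarks block into accepted scores and warnings."""
    accepted, warnings = {}, []
    for key, value in benchmarks.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0 <= value <= 1:
            warnings.append(
                f"benchmarks.{key}: expected a fraction in [0, 1], got {value!r}"
            )
            continue  # assumption: rejected values are left out of the profile
        accepted[key] = float(value)
    return accepted, warnings


# 88.7 is a raw percentage and should be written as 0.887.
scores, problems = check_benchmarks({"gpqa_diamond": 0.725, "mmlu": 88.7})
print(scores)
print(problems)
```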

Known keys get pretty labels automatically:

| Key | Label |
| --- | --- |
| `gpqa_diamond` | GPQA Diamond |
| `swe_bench` | SWE-bench |
| `swe_bench_verified` | SWE-bench Verified |
| `aime_2025` | AIME 2025 |
| `livecodebench` | LiveCodeBench |
| `arc_agi_2` | ARC-AGI 2 |
| `mmlu` | MMLU |
| `mmlu_pro` | MMLU-Pro |
| `humaneval` | HumanEval |
| `hellaswag` | HellaSwag |
| `drop` | DROP |
| `bbh` | BIG-Bench Hard |
| `math_500` | MATH-500 |

Unknown keys pass through. You can add my_new_bench_2027: 0.42 and it will render with a title-cased label, with no code change required.
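The lookup-with-fallback behaviour can be pictured as a small dictionary plus a title-casing rule. This is a sketch under assumptions: `KNOWN_LABELS` holds only a few entries from the table above, `benchmark_label` is a hypothetical name, and the exact title-casing llmwiki applies to unknown keys may differ.

```python
# A few entries from the known-keys table; the full mapping is longer.
KNOWN_LABELS = {
    "gpqa_diamond": "GPQA Diamond",
    "swe_bench": "SWE-bench",
    "mmlu_pro": "MMLU-Pro",
    "bbh": "BIG-Bench Hard",
}


def benchmark_label(key: str) -> str:
    """Return the curated label, or a title-cased fallback for unknown keys."""
    if key in KNOWN_LABELS:
        return KNOWN_LABELS[key]
    # Assumed fallback: underscores become spaces, each word is capitalised.
    return key.replace("_", " ").title()


print(benchmark_label("swe_bench"))
print(benchmark_label("my_new_bench_2027"))
```

The curated table exists because a naive fallback cannot recover labels like "SWE-bench" or "BIG-Bench Hard" from their keys.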

What the build pipeline does

1. discover_model_entities(wiki/entities/) walks the directory and picks out any page where entity_kind == "ai-model".
2. parse_model_profile(meta) validates each page's frontmatter against the schema, returning a ModelProfile TypedDict plus a list of warnings. Warnings are surfaced in a collapsible <details> block on the detail page; they don't block the build.
3. render_model_info_card(profile) inlines a structured card at the top of each detail page, above the free-form body.
4. render_models_index(entries) emits the sortable /models/index.html table with every benchmark key used anywhere as a column.
5. The nav bar gains a Models link so readers can jump there from any page.
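Two of the steps above, the opt-in filter and the "every benchmark key used anywhere" column rule, can be sketched on in-memory data. The function names below are simplified stand-ins invented for this example rather than the pipeline's real functions; sorting the columns alphabetically and formatting scores as percentages are assumptions.

```python
def select_model_pages(pages: dict[str, dict]) -> dict[str, dict]:
    """Keep only pages that opted in with entity_kind: ai-model."""
    return {name: meta for name, meta in pages.items()
            if meta.get("entity_kind") == "ai-model"}


def benchmark_columns(model_pages: dict[str, dict]) -> list[str]:
    """Union of every benchmark key used on any model page."""
    keys = set()
    for meta in model_pages.values():
        keys.update(meta.get("benchmarks", {}))
    return sorted(keys)  # assumption: alphabetical column order


def index_rows(model_pages: dict[str, dict]) -> tuple[list[str], list[list[str]]]:
    """Build table rows, using an em-dash placeholder for missing scores."""
    columns = benchmark_columns(model_pages)
    rows = []
    for name, meta in sorted(model_pages.items()):
        scores = meta.get("benchmarks", {})
        cells = [f"{scores[c]:.1%}" if c in scores else "\u2014" for c in columns]
        rows.append([meta.get("title", name)] + cells)
    return columns, rows


pages = {
    "OpenAI.md": {"title": "OpenAI", "type": "entity"},
    "ModelA.md": {"title": "Model A", "entity_kind": "ai-model",
                  "benchmarks": {"mmlu": 0.5}},
    "ModelB.md": {"title": "Model B", "entity_kind": "ai-model",
                  "benchmarks": {"humaneval": 0.25}},
}

columns, rows = index_rows(select_model_pages(pages))
print(columns)
for row in rows:
    print(row)
```

A page without `entity_kind: ai-model`, such as the OpenAI entry here, never reaches the table, which is the opt-in behaviour the reference describes.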

Example

See wiki/entities/ClaudeSonnet4.md for a complete real-world page.
