Foundations · critical

Structured Outputs (JSON Mode, Tool Use)

Structured outputs let an LLM emit machine-parseable data (JSON matching a schema) instead of free text. Three common mechanisms: tool-use (Claude — define a tool with an input schema; the model emits a tool_use block), OpenAI's `response_format: { type: 'json_schema', schema }` (constrained decoding to a JSON Schema), and OpenAI's older `json_mode` (validates JSON, no schema). Reliability: tool-use and json_schema are near-100% schema-conformant; free-text-then-parse fails 5–15% of the time at scale.
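The three mechanisms can be compared as request payloads. A minimal sketch — field names and model names are illustrative, not exact API signatures; check the provider docs before relying on them:

```python
import json

# 1. Claude tool-use: define a tool whose input schema IS your output schema.
claude_request = {
    "model": "claude-model",  # placeholder name
    "tools": [{
        "name": "record_extraction",
        "description": "Capture the extracted fields.",
        "input_schema": {
            "type": "object",
            "properties": {"title": {"type": "string"}, "year": {"type": "integer"}},
            "required": ["title", "year"],
        },
    }],
    "messages": [{"role": "user", "content": "Extract title and year from ..."}],
}

# 2. OpenAI json_schema: constrained decoding against the same schema.
openai_json_schema_request = {
    "model": "gpt-model",  # placeholder name
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "extraction",
            "schema": claude_request["tools"][0]["input_schema"],
        },
    },
    "messages": [{"role": "user", "content": "Extract title and year from ..."}],
}

# 3. Older json_mode: valid JSON guaranteed, schema NOT enforced.
openai_json_mode_request = {
    "model": "gpt-model",
    "response_format": {"type": "json_object"},
    "messages": [{"role": "user", "content": "Return JSON with title and year."}],
}

# All three are plain JSON payloads; only the enforcement mechanism differs.
for payload in (claude_request, openai_json_schema_request, openai_json_mode_request):
    json.dumps(payload)  # serializes cleanly
```

Note how the same JSON Schema object is reused across mechanisms — the schema is the stable artifact; the wrapper around it is provider-specific.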

Memory anchor

Structured outputs are a customs declaration form — fields are typed and validated at the gate. Free-text answers are a rambling letter the agent then has to translate. The form is faster and never lies about what it contains.

Expected depth

Use cases: data extraction from text, classifiers (route to category X), tool routers (which downstream service to call), agent control flow (next action). Pattern: define the schema once; let the runtime enforce it. With Claude tool-use, you can define a 'fake' tool whose only purpose is to capture structured output — the harness receives the validated JSON without ever executing anything. Validation: even with json_schema, validate at the boundary — retry on parse failure or schema violation (1–2 retries, then surface an error). Cost: schema tokens count toward input on every turn — keep schemas tight.
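The validate-at-the-boundary-then-retry pattern can be sketched with a stubbed model call and a deliberately minimal validator (in practice you'd use a real library such as jsonschema; `call_model`, `validate`, and `extract_with_retry` are hypothetical names for illustration):

```python
import json

def validate(payload: dict, schema: dict) -> list:
    """Minimal structural check: required keys present, primitive types match."""
    errors = []
    types = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for key in schema.get("required", []):
        if key not in payload:
            errors.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in payload and not isinstance(payload[key], types[spec["type"]]):
            errors.append(f"wrong type for {key}")
    return errors

def extract_with_retry(call_model, schema: dict, max_retries: int = 2) -> dict:
    """Parse, check against the schema, retry 1-2 times, then error out.
    `call_model` stands in for the real API call; the retry hint lets you
    feed the validation errors back into the next prompt."""
    last_errors = None
    for _attempt in range(1 + max_retries):
        raw = call_model(retry_hint=last_errors)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_errors = [f"malformed JSON: {exc}"]
            continue
        last_errors = validate(payload, schema)
        if not last_errors:
            return payload
    raise ValueError(f"schema violation after {max_retries} retries: {last_errors}")

# Stubbed model: first response has a wrong type, second is valid —
# simulating one schema-violation retry.
schema = {"type": "object", "required": ["category"],
          "properties": {"category": {"type": "string"}}}
responses = iter(['{"category": 7}', '{"category": "billing"}'])
result = extract_with_retry(lambda retry_hint=None: next(responses), schema)
assert result == {"category": "billing"}
```

Passing the previous attempt's errors back as a retry hint is what makes the retry cheap — the model usually fixes a named violation on the next turn.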

Deep — senior internals

Constrained decoding (json_schema mode) works by masking logits at each token to disallow tokens that would break the schema. This guarantees parseability but may degrade reasoning quality on hard cases — the model can't 'think out loud' before answering. Mitigations: have the model emit a 'reasoning' free-text field BEFORE the structured fields (chain-of-thought baked in), or run two passes (think → extract). Tool-use vs json_schema: tool-use lets you mix structured outputs with text in the same turn (model can explain then call tool); json_schema gives you a single structured response. For agent control, tool-use is usually better. For pure data extraction, json_schema is simpler.
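The reasoning-field-first mitigation amounts to ordering the schema so the free-text field is generated before the answer fields. A sketch, assuming the decoder emits fields in declared property order (true for several providers, but verify with yours; the schema and field names are illustrative):

```python
import json

extraction_schema = {
    "type": "object",
    "properties": {
        # Generated first: the model "thinks out loud" here, so the
        # reasoning tokens are already in context when it commits below.
        "reasoning": {"type": "string",
                      "description": "Step-by-step analysis before answering."},
        # Generated after the reasoning — chain-of-thought baked in.
        "category": {"type": "string",
                     "enum": ["billing", "support", "sales"]},
        "confidence": {"type": "number"},
    },
    "required": ["reasoning", "category", "confidence"],
}

# Python dicts (3.7+) preserve insertion order, so serializing the schema
# keeps "reasoning" ahead of the answer fields.
keys = list(json.loads(json.dumps(extraction_schema))["properties"].keys())
assert keys == ["reasoning", "category", "confidence"]
```

If your provider does not guarantee field ordering, the two-pass approach (free-text think turn, then a separate extraction turn) achieves the same effect at the cost of an extra call.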

🎤Interview-ready answer

Structured outputs are how I get reliably parseable data from an LLM. Three mechanisms: Claude's tool-use (define a tool with an input schema, model emits tool_use), OpenAI's json_schema mode (constrained decoding to a JSON Schema), and the older json_mode (validates JSON without enforcing schema). For agent control flow I use tool-use — it lets the model explain before acting and naturally fits the agent loop. For pure data extraction I use json_schema. I always validate at the boundary and retry on schema violation. Schemas count toward input tokens, so I keep them as tight as possible.

Common trap

Asking the model to 'return JSON' in a prompt without enforcement. ~5–15% of responses will contain malformed JSON, especially on edge cases (escaped quotes, unicode, trailing commas). Use tool-use or json_schema; don't paper over the failure mode with a hand-rolled JSON repair parser.
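The failure modes are easy to reproduce with a strict parser. These three strings are illustrative examples of outputs a prompted-but-unenforced model plausibly emits, and each one breaks `json.loads`:

```python
import json

bad_outputs = [
    '{"name": "Ada", "age": 36,}',             # trailing comma
    "{'name': 'Ada'}",                         # single quotes instead of double
    'Sure! Here is the JSON: {"name": "Ada"}', # chatty preamble around the JSON
]

failures = 0
for raw in bad_outputs:
    try:
        json.loads(raw)
    except json.JSONDecodeError:
        failures += 1

assert failures == 3  # a strict parser rejects all three
```

Each of these is exactly the class of edge case that enforcement (tool-use or json_schema) eliminates at the source, rather than at your parser.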

Related concepts