Context Window & Management
The context window is the maximum number of tokens (input plus output) a model can process in a single request. Claude Sonnet 4 launched with a 200K-token window (roughly 150K words); newer Sonnet and Opus releases (4.5+) offer up to 1M tokens for longer contexts. GPT-4 Turbo has 128K; OpenAI's o-series and GPT-5 limits vary by model. Once the window is full, you must compress, prune, or fail.
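Budgeting against these limits starts with counting tokens. A minimal sketch, assuming the common rough heuristic of ~4 characters per English token (real counts come from the model's tokenizer; the function names and the 200K/8K defaults are illustrative, not from any API):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English prose.
    A budget heuristic only; the model's own tokenizer gives exact counts."""
    return max(1, len(text) // 4)

def fits_in_window(messages: list[str], window: int = 200_000,
                   reserve: int = 8_000) -> bool:
    """Check whether the conversation plus a reserved output budget
    fits inside the window."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reserve <= window
```

The `reserve` parameter captures the "room for response" idea: output tokens count against the same limit, so a request that exactly fills the window leaves no space to answer.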
The context window is a desk: pile it with unread papers (whole-file reads) and there is no room left to write. The tidier the desk, the better the work.
Window structure: system prompt (CLAUDE.md, tool definitions, persistent memory) + conversation history + the current user message + room for the response. Long agent runs fill the window with tool results: file contents, command output, search results. Past roughly 80% utilization, quality degrades (recency bias, lost-in-the-middle effects). Mitigations: file reads with offset/limit, summarizing old turns, dispatching subagents to absorb verbose work, and caching the static prefix.
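One of those mitigations, trimming old turns, can be sketched as a sliding window that always keeps the system prompt and as many recent turns as fit a budget. This is a toy version under the ~4 chars/token assumption; a real harness would summarize the dropped turns rather than discard them:

```python
def prune_history(system: str, turns: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus the newest turns that fit the token
    budget; older turns fall off. Costs are approximated at ~4 chars/token."""
    cost = lambda t: len(t) // 4 + 1
    kept: list[str] = []
    remaining = budget - cost(system)
    for turn in reversed(turns):        # walk newest-first
        if cost(turn) > remaining:
            break                       # everything older is dropped
        kept.append(turn)
        remaining -= cost(turn)
    return [system] + list(reversed(kept))
```

Keeping the system prompt unconditionally mirrors the structure above: the static prefix is the one part of the window that must survive every prune.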
Different models behave differently at the edges of the window. Claude Code has auto-compaction, which summarizes old turns automatically. Some harnesses implement memory files (Claude Code's CLAUDE.md-style memory) so persistent facts don't live in the rolling window. Position matters: information at the very start (system prompt) and very end (most recent turn) is recalled best; the middle is the "lost in the middle" valley. For RAG-style agents, retrieval beats stuffing: fetch only what's relevant per turn. Budgeting: a rough rule is to keep working context under 50K tokens for predictable quality, even with a 200K or 1M limit, because quality degrades long before the hard cap.
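The "retrieval beats stuffing" point can be illustrated with a deliberately tiny retriever: score candidate chunks by word overlap with the current query and put only the top few into the window. The scoring is a stand-in for real embedding search; the function name and `k` parameter are illustrative:

```python
def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Fetch only the chunks most relevant to this turn (toy scoring:
    shared-word count) instead of stuffing every document into context."""
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]
```

Per-turn retrieval also sidesteps the lost-in-the-middle problem: relevant material arrives at the end of the window, the position recalled best.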
The context window is the LLM's working memory: everything the model "sees" in a single forward pass. The practical challenge in agents is that tool results accumulate fast; reading three files, running a grep, and inspecting a build log can easily consume 50K tokens. The mitigations above apply directly: read file ranges instead of whole files, use grep for targeted searches, dispatch subagents to absorb verbose work, and prompt-cache the static prefix. Since quality degrades well before the hard limit, I aim to keep the working portion under roughly 50% utilization.
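"Absorbing verbose work" ultimately means keeping raw tool output out of the parent's window. A minimal sketch of one such compressor (the head/tail elision strategy and the function name are assumptions, not any harness's actual API):

```python
def absorb(tool_output: str, head: int = 20, tail: int = 5) -> str:
    """Compress a verbose tool result before it enters the window:
    keep the first and last lines, elide the middle with a count."""
    lines = tool_output.splitlines()
    if len(lines) <= head + tail:
        return tool_output              # short output passes through intact
    omitted = len(lines) - head - tail
    return "\n".join(lines[:head]
                     + [f"... [{omitted} lines omitted] ..."]
                     + lines[-tail:])
```

A subagent is the stronger version of the same idea: the full output lives in the subagent's window, and only a summary returns to the parent.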
A common anti-pattern: reading an entire file when you need 20 lines. Each unnecessary whole-file read shrinks the room left for reasoning and future tool calls, and trains a habit that breaks at scale.
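The fix is the offset/limit read mentioned earlier. A minimal sketch (the function name and defaults are illustrative; `itertools.islice` avoids loading the rest of the file into memory):

```python
from itertools import islice

def read_range(path: str, offset: int = 0, limit: int = 50) -> str:
    """Read only `limit` lines starting at line `offset`,
    instead of pulling the whole file into context."""
    with open(path) as f:
        return "".join(islice(f, offset, offset + limit))
```

Pairing this with a targeted grep (find the line number first, then read a small range around it) keeps even large-codebase navigation within a few hundred tokens per step.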