Tool Use (Function Calling)
Tool use is the mechanism by which an LLM requests an action: the model emits a structured tool_use block containing a tool name and JSON arguments, the harness executes the call, and the harness returns a tool_result block. This is native to the Claude, GPT-4, and Gemini APIs; it is not 'agent-specific.'
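A minimal sketch of the two block shapes, using plain dicts in the Anthropic-style wire format (other APIs use analogous structures under different field names); the tool name and values here are hypothetical:

```python
# The model emits a tool_use block: a name plus JSON arguments.
tool_use = {
    "type": "tool_use",
    "id": "toolu_01A",            # the harness uses this id to pair the result
    "name": "get_weather",        # hypothetical tool name
    "input": {"city": "Berlin"},  # arguments matching the tool's JSON Schema
}

# The harness executes the tool and sends back a tool_result block,
# linked to the request via tool_use_id.
tool_result = {
    "type": "tool_result",
    "tool_use_id": tool_use["id"],
    "content": "12°C, overcast",
}
```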
Tool use is a vending machine: the model presses a button (the tool name), pays in tokens, and gets the snack (the result). Too many buttons = wrong snack. Vague labels = wrong snack.
Each tool carries a JSON Schema describing its parameters. The model picks tools based on (a) the tool name and description, (b) the parameter schema, and (c) examples in the system prompt. Models can emit multiple tool calls in one turn; the harness should run independent ones in parallel (e.g., reading three unrelated files), while tools that depend on prior output must run sequentially. Failed tool calls return errors as tool_result blocks with is_error: true, and the model usually retries or asks for help.
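These mechanics can be sketched end to end: a tool definition with a JSON Schema, parallel dispatch of independent calls in one turn, and a failure surfaced as an is_error result. The read_file tool and the harness's execute function are hypothetical; the block/result shapes follow the Anthropic-style format:

```python
import concurrent.futures
import os
import tempfile

# Hypothetical tool definition: name + description + JSON Schema for parameters.
read_file_tool = {
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string", "description": "File path"}},
        "required": ["path"],
    },
}

def execute(call):
    """Run one tool call; wrap failures as tool_result blocks with is_error."""
    try:
        with open(call["input"]["path"], encoding="utf-8") as f:
            return {"type": "tool_result", "tool_use_id": call["id"],
                    "content": f.read()}
    except OSError as e:
        return {"type": "tool_result", "tool_use_id": call["id"],
                "content": str(e), "is_error": True}

# Set up two real files and one missing path to exercise the error branch.
tmp = tempfile.mkdtemp()
paths = []
for i in range(2):
    p = os.path.join(tmp, f"f{i}.txt")
    with open(p, "w", encoding="utf-8") as f:
        f.write(f"contents of file {i}")
    paths.append(p)
paths.append(os.path.join(tmp, "missing.txt"))  # will fail -> is_error

# Three independent reads requested in one turn: run them in parallel.
calls = [{"id": f"toolu_{i}", "name": "read_file", "input": {"path": p}}
         for i, p in enumerate(paths)]
with concurrent.futures.ThreadPoolExecutor() as pool:
    results = list(pool.map(execute, calls))
```

The third result carries is_error: true, which goes back to the model verbatim so it can retry with a corrected path or ask for help.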
Tool schemas count toward input tokens: too many tools or verbose descriptions bloat the prompt. Anthropic supports 'deferred tools' (ToolSearch in Claude Code), where only tool names are sent up front and full schemas are loaded on demand. Tool choice modes: auto (the model decides), any (the model must call some tool), tool (the model must call one specific tool), and none (no tool calls). For agents, 'auto' is standard. Tool result content can be text, images, or both. Critically, tool descriptions are part of the prompt and matter as much as the system prompt: vague descriptions lead to bad tool selection.
Tool use lets the model emit structured action requests that the harness executes. The model sees tool definitions (name + description + JSON schema) in its system prompt and chooses tools by matching the user's intent against descriptions. Independent tool calls in the same turn can be parallelized. The harness is responsible for executing tools and feeding results back. The art is writing tool descriptions clearly enough that the model picks the right one without needing examples.
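The harness's execute-and-feed-back responsibility reduces to a short loop. This is a toy sketch: model_call is a stand-in for the real API (here it scripts one tool request, then a final answer), and the echo tool and TOOLS registry are hypothetical:

```python
# Hypothetical tool registry: name -> callable taking the JSON arguments.
TOOLS = {"echo": lambda args: args["text"]}

def model_call(messages):
    """Stand-in for the real model API: first requests one tool, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"id": "t1", "name": "echo",
                                "input": {"text": "hello"}}]}
    return {"text": "Done: " + messages[-1]["content"]}

def run_agent(user_prompt):
    """Core harness loop: execute requested tools, feed results back, repeat."""
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = model_call(messages)
        if "tool_calls" not in reply:      # no tool request: final answer
            return reply["text"]
        for call in reply["tool_calls"]:   # execute each requested tool
            result = TOOLS[call["name"]](call["input"])
            messages.append({"role": "tool", "tool_use_id": call["id"],
                             "content": result})
```

A real harness adds the parallel dispatch, error wrapping, and turn limits discussed above, but the shape of the loop is the same.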
The classic mistake is adding too many tools 'just in case.' Each tool's schema costs tokens on every turn, and more tools means more selection errors. Aim for the smallest tool set that covers the task; description quality beats tool quantity.