Think, act, observe
The ReAct loop (short for "reasoning and acting") combines reasoning with tool use. The model alternates between a thought about what to do next, an action that calls a tool, and an observation of the result. It repeats this cycle, updating its plan each time, until it reaches an answer.
The cycle
- Thought describes the next step the model intends to take.
- Action invokes a tool with chosen arguments.
- Observation records what the tool returned.
- The loop continues until a final answer is produced.
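The cycle above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not a reference implementation: `scripted_policy` is a hand-written stand-in for a language model, and `Step`, `TOOLS`, and `react_loop` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    thought: str
    action: Optional[str] = None       # tool name, or None on the final step
    argument: str = ""
    observation: Optional[str] = None  # filled in after the tool runs
    answer: Optional[str] = None       # set only on the final step


# Hypothetical tool registry: each tool maps a string argument to a string result.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda key: {"capital of France": "Paris"}.get(key, "not found"),
}


def scripted_policy(question: str, trace: list[Step]) -> Step:
    """Stand-in for a model: picks the next step from the trace so far."""
    if not trace:
        return Step(thought="I should look this up.",
                    action="lookup", argument=question)
    last = trace[-1].observation
    return Step(thought=f"The tool returned {last!r}, so I can answer.",
                answer=last)


def react_loop(question: str, policy, max_steps: int = 5):
    trace: list[Step] = []
    for _ in range(max_steps):
        step = policy(question, trace)      # Thought (plus a chosen action)
        trace.append(step)
        if step.answer is not None:         # final answer ends the loop
            return step.answer, trace
        # Action, then Observation: run the tool and record what it returned.
        step.observation = TOOLS[step.action](step.argument)
    return None, trace                      # step limit reached, no answer


answer, trace = react_loop("capital of France", scripted_policy)
for step in trace:
    print(step)
```

With a real model, the policy would be a prompt-and-parse step rather than a fixed script; the loop structure stays the same.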
Why interleaving helps
- Reasoning makes tool choices deliberate rather than blind.
- Observations let the model adapt when a step fails or surprises it.
- The trace is inspectable, which aids debugging.
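As a rough sketch of the second and third points, the example below turns a tool failure into an observation, lets the next choice react to it, and prints the resulting trace. All names here (`flaky_search`, `cached_lookup`, `choose_tool`) are hypothetical, and `choose_tool` is a simple rule standing in for the model's reasoning.

```python
def flaky_search(query: str) -> str:
    # Hypothetical tool that always fails, to simulate an outage.
    raise TimeoutError("search backend timed out")


def cached_lookup(query: str) -> str:
    # Hypothetical fallback tool backed by a tiny in-memory table.
    table = {"boiling point of water": "100 degrees Celsius at sea level"}
    return table.get(query, "unknown")


TOOLS = {"search": flaky_search, "cache": cached_lookup}


def observe(tool: str, query: str) -> str:
    """Run a tool and report failures as an observation instead of crashing."""
    try:
        return TOOLS[tool](query)
    except Exception as exc:
        return f"ERROR: {exc}"


def choose_tool(trace):
    """Stand-in for the model's thought: skip tools that already failed."""
    failed = {tool for tool, obs in trace if obs.startswith("ERROR")}
    for tool in TOOLS:
        if tool not in failed:
            return tool
    return None


trace = []
query = "boiling point of water"
while (tool := choose_tool(trace)) is not None:
    obs = observe(tool, query)
    trace.append((tool, obs))
    if not obs.startswith("ERROR"):
        break

# The trace shows both the failed attempt and the recovery.
for tool, obs in trace:
    print(f"{tool}: {obs}")
```

Because the failure is recorded rather than swallowed, a reader of the trace can see why the second tool was chosen.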
Risks and limits
The loop can wander or repeat itself, so set a step limit to stop runaway behavior. Each cycle adds latency and tokens, so cost grows with the number of steps. As with chain of thought, the written thoughts may not faithfully reflect the model's actual computation, and a single wrong observation can derail the rest of the chain.
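One way to guard against wandering and repetition is sketched below. It is an illustration, not a standard recipe: `run_with_guards`, `stuck`, `wandering`, and `echo_tool` are hypothetical names, and stopping on an exact repeated action is an assumption that would be wrong for tools where repeating a call is legitimate, such as polling.

```python
Action = tuple[str, str]  # (tool name, argument); ("finish", answer) ends the run


def run_with_guards(next_action, call_tool, max_steps: int = 8):
    """Cycle until the policy finishes, repeats itself, or hits the step limit."""
    seen: set[Action] = set()
    observations: list[str] = []
    for step in range(1, max_steps + 1):
        action = next_action(observations)
        if action[0] == "finish":
            return "finished", step
        if action in seen:                 # same tool, same argument as before
            return "repeated action", step
        seen.add(action)
        observations.append(call_tool(action))
    return "step limit", max_steps


def echo_tool(action: Action) -> str:
    # Hypothetical tool that never finds anything useful.
    return f"no results for {action[1]}"


def stuck(observations):
    # A policy that keeps issuing the identical call.
    return ("search", "same query")


def wandering(observations):
    # A policy that keeps trying new queries and never finishes.
    return ("search", f"query {len(observations)}")


stuck_result = run_with_guards(stuck, echo_tool)
wandering_result = run_with_guards(wandering, echo_tool, max_steps=4)
print(stuck_result)
print(wandering_result)
```

The step count returned here doubles as a rough proxy for cost, since each step corresponds to one more round of model output and tool latency.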
Key idea
The ReAct loop interleaves thought, action, and observation so the model plans and adapts step by step, but it needs a step limit to keep wandering and cost in check.