Agent Control Patterns — Part 3: Reflexion — When Review Triggers Research

Last Updated on March 4, 2026 by Editorial Team. Author(s): Vahe Sahakyan. Originally published on Towards AI.

A model can review its own answer and still return incorrect information. It may recognize uncertainty, improve its wording, and clarify its reasoning. But if the missing piece is factual, reviewing the answer again will not fix it. A system cannot produce information it does not have. At that point, improving the answer requires external evidence.

Introduction — Reflection Cannot Add Missing Knowledge

In Part 2, we introduced Reflection — a pattern that improves answers by separating generation from evaluation. Reflection works well when the issue is clarity, structure, or incomplete reasoning. However, it works only with the knowledge already inside the model. Reflection can improve how an answer is written; it cannot correct facts the model does not know.

Consider a time-sensitive question: Who won the Nobel Prize in Physics in 2025, and why? If the model was trained before the announcement, it does not contain that information, and reviewing the answer again will not produce the correct result. A reflection loop can improve phrasing, point out uncertainty, and highlight possible gaps. But it cannot add new facts that are not already in the model. Improving the wording does not make an outdated answer correct.

This is the limit of reflection: when correctness depends on external evidence, reviewing the answer again is not enough. At that point, the system needs access to new information. That transition — from internal review to external research — is the focus of this article.

What Is Reflexion?

Reflexion builds on Reflection by allowing review to trigger research. In Reflection, the loop stays internal:

Generate → Critique → Revise → Stop

In Reflexion, the process adds a research step:

Draft → Self-Critique → Generate Search Queries → Run Tools → Revise Using Evidence → Stop (or repeat)

The main difference is not just the use of tools.
It is that research becomes part of the improvement process.

The workflow starts the same way as Reflection: the system produces a draft and reviews it to find gaps, uncertainty, or unsupported claims. However, instead of revising immediately, the critique produces structured search queries. Those queries are executed using external tools, and the retrieved results are added before revision happens. Revision is no longer based only on internal reasoning; it uses new information gathered during the research step.

Reflection improves clarity. Reflexion improves factual accuracy.

Research is not always required. It is triggered only when the critique identifies missing or uncertain information. This avoids unnecessary tool calls while still allowing the system to verify important claims. Reflexion changes the loop from internal review to review followed by evidence collection.

Difference from Reflection

Reflection and Reflexion may look similar because both include critique and revision. The difference is simple.

In Reflection, the entire process stays inside the model. The system reviews the draft and revises it using only the knowledge it already has. No new information is added.

In Reflexion, review can trigger a research step. If the critique identifies missing or uncertain facts, the system generates search queries, retrieves external information, and then revises the draft using that evidence.

Here is the comparison:

Reflection improves answers that are unclear or incomplete. Reflexion improves answers that may be factually wrong or outdated.
In Reflection, the question is: can this be expressed more clearly? In Reflexion, the question is: is something missing, and do we need to verify it?

That is the core difference.

Structured Draft Package

Reflexion requires structure from the beginning. If the initial draft is just free-form text, the system cannot reliably tell what is uncertain, what needs verification, or what should be searched.
In that case, research becomes inconsistent and harder to control. To connect critique to research in a predictable way, the first response must follow a defined format.

In the Reflexion pattern, the model returns three components:

1. The draft answer
2. A self-critique describing gaps or uncertainty
3. A list of specific search queries

This is enforced using a Pydantic model:

```python
from pydantic import BaseModel, Field
from typing import List


class DraftOutput(BaseModel):
    """Structure for the first draft and intermediate revisions."""
    answer: str = Field(description="Initial answer to the user's question.")
    reflection: str = Field(
        description="Self-critique of the initial answer. Identify gaps and uncertainty."
    )
    search_queries: List[str] = Field(
        description="1-3 focused web search queries to fill gaps."
    )
```

This model defines how reasoning connects to research.

Why Structure Helps

1. The critique becomes concrete. The reflection field forces the system to clearly describe what is uncertain or incomplete. Problems must be stated before they can be addressed.

2. Queries are ready to execute. The search_queries field turns critique into specific search instructions. Instead of deciding whether to search, the system already has defined queries tied to the draft.

3. Research can be triggered programmatically. Because the queries are structured, the system can detect when research is required, execute the searches, and pass the results to the revision step. The workflow follows clear phases instead of relying on loose prompt behavior.

4. Easier debugging. Each part of the output is inspectable. If something goes wrong, you can check whether the issue is in the draft, the critique, or the search queries.

Reflexion is not simply "reflection plus search." It defines a structured draft format that allows research to be triggered and handled in a controlled way. Without that structure, the system cannot reliably move from critique to evidence.
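Points 2 and 3 above can be sketched in plain Python. The dataclass below is a standard-library stand-in for the Pydantic DraftOutput model, and fake_search is a hypothetical tool used only for illustration; a real system would call an LLM and a web search API here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


# Standard-library stand-in for the Pydantic DraftOutput model above.
@dataclass
class DraftOutput:
    answer: str
    reflection: str
    search_queries: List[str] = field(default_factory=list)


def needs_research(draft: DraftOutput) -> bool:
    """Research is triggered only when the critique proposed queries."""
    return len(draft.search_queries) > 0


def run_research(draft: DraftOutput, search_tool: Callable[[str], str]) -> List[str]:
    """Execute each proposed query with the external tool and collect evidence."""
    return [search_tool(query) for query in draft.search_queries]


# Hypothetical search tool standing in for a real web search API.
def fake_search(query: str) -> str:
    return f"[result for: {query}]"


draft = DraftOutput(
    answer="The 2025 Nobel Prize in Physics was won by ...",
    reflection="Uncertain: my training data may predate the announcement.",
    search_queries=["Nobel Prize in Physics 2025 winner"],
)

if needs_research(draft):
    evidence = run_research(draft, fake_search)
    # evidence is handed to the revision step together with the draft
```

Because the queries live in a typed field rather than free-form prose, the decision to search is a simple check on the draft object, not a judgment call made inside a prompt.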
Research Phase (Tool Integration)

Reflexion becomes different from Reflection when review can trigger research. After the model returns a structured draft package — answer, reflection, and search_queries — the system moves to a research step. In this step, the proposed queries are executed using an external tool, and the results are added before revision.

What Tool Integration Means

In this example, the external tool is a web search API (Tavily) used to retrieve up-to-date facts. The specific tool is not the main point. What matters is how responsibilities are split: the model does not run searches directly. It produces search queries as […]
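The split described above, where the model proposes queries and the system executes them, can be sketched as a small research-phase function. The function name, the evidence format, and stub_search are illustrative stand-ins, not the article's code; a real implementation would replace the stub with a client for a search API such as Tavily.

```python
from typing import Callable, List


def research_phase(
    queries: List[str],
    search: Callable[[str], List[str]],
    max_queries: int = 3,
) -> str:
    """Run each proposed query through the external search tool and format
    the retrieved snippets into an evidence block for the revision prompt."""
    evidence_lines = []
    for query in queries[:max_queries]:  # cap tool calls per round
        for snippet in search(query):
            evidence_lines.append(f"- ({query}) {snippet}")
    if not evidence_lines:
        return ""  # no queries proposed, so revision proceeds without evidence
    return "Evidence gathered:\n" + "\n".join(evidence_lines)


# Stub standing in for a real web search API.
def stub_search(query: str) -> List[str]:
    return [f"snippet about {query}"]


context = research_phase(["Nobel Prize in Physics 2025 winner"], stub_search)
print(context)
```

Capping the number of executed queries keeps each Reflexion round bounded: the loop cannot spiral into open-ended searching even if the critique proposes many queries.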