Claude Code’s ‘/goals’ separates the agent that works from the one that decides it’s done
A code migration agent finishes its run, and the pipeline looks green. But several pieces were never compiled — and it took days to catch. That’s not a model failure; that’s an agent deciding it was done before it actually was. Many enterprises are now seeing that production AI agent pipelines fail not because of the models’ abilities but because the model behind the agent decides to stop. Several methods to prevent premature task exits are now available from LangChain, Google and OpenAI, though these often rely on separate evaluation systems. The newest method comes from Anthropic: /goals on Claude Code, which formally separates task execution and task evaluation. Coding agents work in a loop: they read files, run commands, edit code and then check whether the task is done. Claude Code /goals essentially adds a second layer to that loop. After a user defines a goal, Claude will continue to turn by turn, but an evaluator model comes in after every step to review and decide if the goal has been achieved. The two …



