Anthropic Advisor Strategy: Optimize Your Costs and Performance with Claude Opus, Sonnet, and Haiku

The rapid evolution of LLMs creates a constant dilemma: should you prioritize the power of a model like Claude Opus, or the speed and cost-efficiency of a lighter model like Haiku? Anthropic's Advisor strategy is designed to sidestep that trade-off.

This approach pairs Opus's superior intelligence with the efficiency of Sonnet or Haiku: a low-cost "executor" model handles routine tasks and only calls on the "advisor" (Opus) when complex reasoning is needed.

The Concept and Cost Structure ▶ 0:00

The strategy is based on a pragmatic observation: most tasks performed by an AI agent don't require maximum power at every step. If only one step out of three requires high-level reasoning, using Opus everywhere is a waste of resources.

To understand the financial benefit, let's look at the pricing ▶ 1:35:

Claude Opus: $5 per million input tokens, $25 per million output tokens.
Claude Sonnet: $3 per million input tokens, $15 per million output tokens.
Claude Haiku: $1 per million input tokens, $5 per million output tokens.

Screenshot of a Visual Studio Code editor showing the project structure with files such as app.py and CLAUDE.md

With the Advisor strategy, the bulk of text generation (the most expensive part) is shifted to Haiku or Sonnet, reserving Opus's costly tokens for critical decision-making. It's a surgical optimization of your AI budget.
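To make the arithmetic concrete, here is a minimal sketch that prices a single task using the per-million-token rates listed above. The token counts are hypothetical, chosen only for illustration, and the split between executor and advisor is an assumption about how a typical task might divide, not a measurement from the demo.

```python
# Prices in USD per million tokens, as listed in the pricing section.
PRICES = {
    "opus": {"input": 5.00, "output": 25.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 1.00, "output": 5.00},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical task: 2,000 input tokens and 1,500 output tokens.
opus_solo = call_cost("opus", 2_000, 1_500)

# Hypothetical advisor split: Haiku writes the full answer, and Opus
# reads the same 2,000-token context but returns only 200 tokens of advice.
advisor_split = call_cost("haiku", 2_000, 1_500) + call_cost("opus", 2_000, 200)

print(f"Opus solo:     ${opus_solo:.4f}")
print(f"Haiku + Opus:  ${advisor_split:.4f}")
```

Under these assumed numbers, the split is cheaper because most output tokens are billed at the Haiku rate. The gap narrows as the advisor's share of the output grows.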

Demo: Performance Comparison ▶ 4:12

A customer support dashboard ("TechFlow Support") was tested with different configurations. For simple questions like "What are your business hours?", Haiku handles it efficiently on its own. Using Opus alone for the same question costs 21 times more for a nearly identical result.

Screenshot of a code editor displaying the claude.md file with markdown about business hours and a knowledge base JSON object

For complex queries involving conditional product returns, the dynamic shifts. One striking finding: Sonnet, used as the executor, chose to call the Opus Advisor, whereas Haiku judged that it could handle the query on its own. The Sonnet + Opus result was noticeably more nuanced, which suggests that the mid-tier model has a better "awareness" of its own limitations. In other words, the choice of executor directly influences how appropriately the advisor is called upon.

The strategy is available through the Messages API ▶ 2:16. Its max_uses parameter caps the number of Advisor interventions, which keeps costs in check.
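The effect of a cap like max_uses can be illustrated with a small local simulation. This is not the Messages API and makes no claim about its request format: the class AdvisorBudget, the function run_steps, and the boolean "hard step" flags are all hypothetical stand-ins. In the real feature the executor model itself decides when to ask for advice.

```python
from dataclasses import dataclass


@dataclass
class AdvisorBudget:
    """Tracks how many advisor consultations remain (hypothetical helper)."""
    max_uses: int
    used: int = 0

    def try_consult(self) -> bool:
        """Count one consultation and return True if budget remains."""
        if self.used >= self.max_uses:
            return False
        self.used += 1
        return True


def run_steps(step_is_hard: list[bool], max_uses: int) -> list[str]:
    """Label each step with who handled it, respecting the advisor cap."""
    budget = AdvisorBudget(max_uses=max_uses)
    handled_by = []
    for hard in step_is_hard:
        # Only hard steps ask for advice, and only while budget remains.
        if hard and budget.try_consult():
            handled_by.append("executor+advisor")
        else:
            handled_by.append("executor")
    return handled_by


# Five steps, three of them hard, but at most two advisor consultations.
plan = run_steps([False, True, True, False, True], max_uses=2)
print(plan)
```

In this sketch the third hard step falls back to the executor alone once the budget is spent, which is the cost-control behavior the cap is meant to provide.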

Optimizing Claude Code with "Opus Plan" ▶ 10:31

In Claude Code, each model consumes part of your session limit. The trick is the /model opusplan command. This mode uses Opus 4.6 for planning (understanding the problem, designing the architecture) and switches to Sonnet for code execution, which is faster and more cost-effective.

Slide with an interactive cost calculator showing a task distribution slider and a chart comparing monthly costs across different modes

The financial impact is substantial ▶ 13:59. For a workload of 10,000 monthly requests (70% easy, 20% medium, 10% hard), the calculator shows up to 85% savings compared to Opus Solo and about 23% compared to Sonnet Solo, all while benefiting from Opus's intelligence on complex tasks.

Conclusion

The Advisor strategy marks a major milestone: we're moving from a "brute force" approach to intelligent resource orchestration.

Efficiency: Near-Opus quality on the steps that need it, with most tokens billed at Haiku or Sonnet rates.
Control: max_uses caps premium token consumption.
Hybridization: The opusplan mode in Claude Code stretches your session limit without sacrificing quality.

Still in beta, this feature represents the future of AI assistance: optimally distributed intelligence, where every cent spent delivers real value.