Anthropic Advisor Strategy: Optimize Your Costs and Performance with Claude Opus, Sonnet, and Haiku

The rapid evolution of large language models (LLMs) presents developers and businesses with a constant dilemma: should you prioritize the raw power of a premium model like Claude Opus, or the speed and cost-efficiency of a lighter model like Haiku? Until now, this choice was binary. However, Anthropic has just changed the game with the introduction of the Advisor Strategy.

This innovative approach allows you to combine the superior intelligence of Opus with the operational efficiency of Sonnet or Haiku. The idea is simple yet powerful: use a low-cost "executor" model for routine tasks, which only calls upon the "advisor" (Opus) when complex reasoning is truly needed. In this article, we'll explore how this strategy transforms the economics of AI agents, the measured performance gains, and how to concretely implement it in your workflows.

The Advisor Strategy Concept: Intelligence on Demand

The Advisor Strategy is built on a pragmatic observation: most tasks assigned to an AI agent don't require maximum computing power at every step. Imagine a process composed of three steps: A, B, and C. If only step A requires high-level reasoning, using Opus for steps B and C amounts to wasting precious resources.
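As a rough sketch of that idea, an orchestrator can route each step of a workflow to the cheapest model that can handle it, escalating only where deep reasoning is flagged. The model names and difficulty labels below are illustrative, not part of any official API:

```python
# Illustrative routing sketch: send each step of a workflow to the
# cheapest model that can handle it. Model names and difficulty labels
# are placeholders, not an official Anthropic feature.

ROUTES = {
    "deep_reasoning": "claude-opus",   # step A: needs the premium brain
    "routine": "claude-haiku",         # steps B and C: cheap and fast
}

def pick_model(step_difficulty: str) -> str:
    """Return the model for a step, defaulting to the cheap executor."""
    return ROUTES.get(step_difficulty, ROUTES["routine"])

steps = [("A", "deep_reasoning"), ("B", "routine"), ("C", "routine")]
plan = {name: pick_model(diff) for name, diff in steps}
# Only step A is routed to Opus; B and C stay on Haiku.
```

In a real agent the "difficulty" signal comes from the executor model itself rather than a static label, which is exactly what the Advisor Strategy automates.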

[Screenshot: Visual Studio Code showing the project structure (app.py, CLAUDE.md) and code related to Claude Code v2.1.98 and Opus 4.6]

Understanding the Cost Structure of Claude Models

To grasp the financial benefit of this strategy, it's crucial to take a close look at Anthropic's pricing. Output tokens cost five times as much as input tokens across the lineup, and the gap between models is just as stark:

  • Claude Opus: $5 per million input tokens / $25 per million output tokens. This is the premium brain.
  • Claude Sonnet: $3 per million input tokens / $15 per million output tokens. An excellent performance/price trade-off.
  • Claude Haiku: $1 per million input tokens / $5 per million output tokens. The fastest and most affordable model.

By implementing the Advisor Strategy, you shift the majority of text generation (the most expensive part) to Haiku or Sonnet, reserving Opus's costly tokens only for critical decision-making. It's a surgical optimization of your AI budget that enables deploying agents at scale without watching your bills skyrocket.
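To make the arithmetic concrete, here is a small cost helper built directly from the price list above; the token counts in the final comment are invented for illustration:

```python
# Per-million-token prices (USD) from the pricing list above.
PRICES = {
    "opus":   {"input": 5.0, "output": 25.0},
    "sonnet": {"input": 3.0, "output": 15.0},
    "haiku":  {"input": 1.0, "output": 5.0},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request for the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# For an illustrative request with 1,000 input and 1,000 output tokens:
# Opus costs $0.030 versus Haiku's $0.006 -- 5x, before accounting for
# Opus tending to produce longer (and therefore pricier) outputs.
```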

Messages API vs Claude Code: Two Different Approaches

It's important to distinguish between the tool and the infrastructure. The Advisor Strategy is natively available through the Messages API, the HTTP endpoint developers use to build their own applications and automations. This API allows you to define tools (functions the model can request to call), analyze documents, and structure complex responses.

[Screenshot: the TechFlow Support demo application with system messages, prompts, and a model activity panel, with Opus Solo selected]

Conversely, Claude Code is a finished product: a coding assistant that runs in your terminal. Although they share the same "brains" (the Opus, Sonnet, and Haiku models), they operate differently. The API is stateless (no memory between requests, unless you manage it yourself), while Claude Code has access to your local files and can execute terminal commands. Within the API, the Advisor Strategy offers total control over when and how often the superior model is called; notably, the max_uses parameter caps the number of Advisor interventions to keep costs in check.

Hands-On Demo: Comparing Performance in Real Time

To illustrate the effectiveness of this system, a customer support dashboard ("TechFlow Support") was tested with different configurations. The goal was to see whether the executor model knew when to hand things off.

[Screenshot: Visual Studio Code displaying the claude.md file, with markdown describing business hours and a knowledge-base JSON object]

For simple questions like "What are your business hours?", Haiku responds efficiently on its own at negligible cost. By comparison, using Opus alone for the same question costs 21 times more for a nearly identical result. For complex queries involving conditional product returns (hardware vs. software), however, the dynamic shifts.

A striking finding during testing was that Sonnet, used as the executor, chose to call the Opus Advisor where Haiku thought it could handle things on its own. The Sonnet + Opus result was noticeably more professional and nuanced, suggesting that the intermediate model has a better "awareness" of its own limitations. This highlights a crucial point: the choice of executor directly influences how appropriately the advisor is consulted.

Optimizing Claude Code with "Opus Plan" Mode

If you use Claude Code for development, you can apply a variant of this strategy without going through the API. In Claude Code, each model consumes a portion of your session limit. Opus is the most resource-hungry, followed by Sonnet, then Haiku.

[Slide: an interactive cost calculator with a task-distribution slider and a bar chart comparing monthly costs for Haiku Solo, Advisor Mode, Sonnet Solo, and Opus Solo]

The trick is to use the /model opus plan command. This mode sets the model to Opus 4.6 for the planning phase (understanding the problem, architecting the solution) and automatically switches to Sonnet for code execution.

  • Planning Phase: Opus ensures the strategy is correct.
  • Execution Phase: Sonnet applies the changes, which is faster and conserves your session limits.

This hybrid approach lets you maintain "Opus-level" code quality while considerably extending the duration of your work sessions before hitting quotas.

Cost-Benefit Analysis: The Cost Calculator

The financial impact of the Advisor Strategy can be simulated through a workload calculator. Taking a baseline of 10,000 requests per month with a mix of easy (70%), medium (20%), and hard (10%) tasks, the savings are massive:

  • Compared to Opus Solo: You can save up to 85% on your monthly bill.
  • Compared to Sonnet Solo: Savings are around 23%, while still benefiting from Opus's intelligence on the 10% of complex tasks.
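The shape of those savings is easy to reproduce with a few lines of arithmetic. The per-request token counts below are invented for illustration, so the exact percentages differ from the calculator's (with these assumptions, routing easy and medium traffic to Haiku yields roughly 70% savings versus Opus Solo); real savings depend entirely on your traffic profile:

```python
# Rough monthly-cost sketch for 10,000 requests with a 70/20/10 mix.
# Prices are USD per million tokens (from the pricing section);
# per-request token counts are illustrative assumptions.
PRICES = {"opus": (5, 25), "sonnet": (3, 15), "haiku": (1, 5)}
TOKENS = (500, 300)  # assumed (input, output) tokens per request
REQUESTS = 10_000
MIX = {"easy": 0.70, "medium": 0.20, "hard": 0.10}

def cost(model: str, share: float = 1.0) -> float:
    """Monthly cost in USD for the given share of the traffic."""
    price_in, price_out = PRICES[model]
    tok_in, tok_out = TOKENS
    return REQUESTS * share * (tok_in * price_in + tok_out * price_out) / 1_000_000

opus_solo = cost("opus")      # $100 with these assumptions
sonnet_solo = cost("sonnet")  # $60
# Advisor mode: Haiku handles easy + medium traffic, Opus the hard 10%.
advisor = cost("haiku", MIX["easy"] + MIX["medium"]) + cost("opus", MIX["hard"])

savings_vs_opus = 1 - advisor / opus_solo  # ~72% with these assumptions
```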

These figures demonstrate that Advisor Mode isn't just a technical feat—it's an economic necessity for any business looking to industrialize LLM usage.

Conclusion and Key Takeaways

Anthropic's Advisor Strategy marks a major milestone in the maturity of AI agents. It moves us from a "brute force" usage of the most expensive models to an intelligent and resource-efficient orchestration.

Key takeaways:

  • Efficiency: Achieve near-Opus performance at Sonnet or Haiku prices.
  • Flexibility: The executor dynamically decides whether it needs help, optimizing every API call.
  • Control: Parameters like max_uses let you cap premium token consumption.
  • Hybridization: Use the opus plan mode in Claude Code to maximize your session limits without sacrificing the architectural quality of your code.

Although this feature is still in beta, it represents the future of AI-assisted software engineering: a world where intelligence is optimally distributed, ensuring that every cent spent on tokens delivers real value.