Claude Code users have reported shallower reasoning since February 2026. The cause is Adaptive Thinking: a change where the model decides its own thinking duration instead of following a fixed budget. A controlled A/B test on a real codebase shows that disabling it restores 28% deeper analysis, 3x better proactive reasoning, and 32% fewer output tokens. The fix takes two minutes.
I noticed it about three weeks ago. Tasks that Claude Code used to handle well started going sideways. Security implications were ignored. Breaking changes were not flagged. The model stopped reading files before editing them. I did not change my setup. I did not change my prompts. Something else changed.
I started investigating and found I was not alone. According to the Pragmatic Engineer Survey from February 2026 (15,000 developers), 73% of engineering teams now use AI coding tools daily, up from 41% in 2025. And 71% of developers who regularly use AI agents use Claude Code. A regression in the most-used coding agent is not a niche complaint. It affects daily workflows at scale.
What happened in February
Three changes shipped between February and March 2026.
On February 9, Anthropic launched Opus 4.6 with Adaptive Thinking. Instead of a fixed reasoning budget, the model now decides per turn how long to think. On March 3, the default effort level dropped from high to medium (value: 85). On February 12, thinking content was hidden from the UI via a redact-thinking header, a cosmetic change with no impact on reasoning depth.
Each change is defensible on its own. Together, they created a system where Claude Code under-allocates reasoning on complex tasks. The model thinks less, edits faster, and produces confident results that are wrong more often.
The community measured it
Stella Laurenzo, Senior Director of AI at AMD, filed GitHub Issue #42796 after mining months of Claude Code session logs across 50+ concurrent agent sessions. The analysis covers 17,871 thinking blocks and 234,760 tool calls across 6,852 sessions. The findings are specific.
Thinking depth dropped approximately 67% by late February, before the redaction rollout even started. The Read:Edit ratio collapsed from 6.6 to 2.0. One in three edits happened on files the model never read. Usage of phrases like 'simplest fix' increased 642%. The issue attracted 328+ comments and coverage from The Register, InfoWorld, and TechCrunch.
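Metrics like the Read:Edit ratio come from mining session logs for tool calls. As a minimal sketch of how such a metric can be computed, assuming a simplified JSONL event format with a "tool" key (the real on-disk schema of Claude Code session logs may differ):

```python
# Sketch: compute a Read:Edit ratio from session logs.
# ASSUMPTION: each log line is a JSON object with a "tool" key naming the
# tool call; the actual Claude Code log schema may differ.
import json
from collections import Counter

def read_edit_ratio(lines: list[str]) -> float:
    counts = Counter()
    for line in lines:
        event = json.loads(line)
        tool = event.get("tool")
        if tool in ("Read", "Edit"):
            counts[tool] += 1
    # Avoid division by zero when a session contains no edits
    return counts["Read"] / max(counts["Edit"], 1)

# Tiny synthetic example: 4 Reads and 2 Edits give a ratio of 2.0,
# the collapsed figure reported in the issue
log = [json.dumps({"tool": t}) for t in
       ["Read", "Read", "Edit", "Read", "Read", "Edit"]]
ratio = read_edit_ratio(log)
```

A healthy session trends toward the 6.6 end; anything near 2.0 means the model is editing files it has barely looked at.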
On Hacker News, an Anthropic engineer confirmed the core problem: Adaptive Thinking was allocating zero thinking budget on certain turns, even when effort was set to high. Turns with zero reasoning produced fabrications. Turns with deep thinking produced accurate output. The recommended workaround: CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1.
The model did not get dumber. The defaults got conservative.
And conservative defaults on a tool at this scale have consequences. Claude Code alone generates an estimated $2.5 billion in annual revenue as of early 2026. This is not a beta experiment. It is production infrastructure for hundreds of thousands of developers.
My setup
I run Claude Code on a Max Plan through a multi-agent pipeline on WSL Ubuntu. My settings.json already had effortLevel: "high". Context pressure was not the issue: my auto-loaded rules consume roughly 6,700 tokens, about 3.3% of the 200K context budget. I was on the latest version (2.1.104).
But I had not disabled Adaptive Thinking. That was the missing piece.
A controlled A/B test
I wanted numbers, not impressions. So I designed a within-subject A/B test with a pre-registered 12-point rubric.
The task: A realistic analysis prompt on a real monorepo (React, Node.js, PostgreSQL). Add a required field to an API endpoint. What needs to change? Analyze only, do not make changes.
This prompt touches all layers: API, database, frontend, pipeline. It has obvious impacts (schema, types, UI) and non-obvious impacts (security, tests, breaking changes, documentation) that require proactive reasoning.
Controlled variables: Same model (Opus), same effort (high), identical prompt, identical repo state, pipe mode with no session persistence. Two runs per condition, four runs total.
The rubric: 12 binary dimensions scored before the test ran. Dimensions 1 through 6 cover obvious impacts (API route, Zod schema, database, types, frontend, pipeline). Dimensions 7 through 12 cover proactive impacts (tests, SSE endpoints, security, documentation, error handling, adjacent consumers).
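The rubric above is simple enough to score mechanically. A hypothetical scorer, with dimension names paraphrased from the article (the actual test script may organize this differently):

```python
# Hypothetical scorer for the pre-registered 12-point rubric.
# Dimension names are paraphrased from the article, not the real script.
OBVIOUS = ["api_route", "zod_schema", "database", "types", "frontend", "pipeline"]
PROACTIVE = ["tests", "sse_endpoints", "security", "documentation",
             "error_handling", "adjacent_consumers"]

def score_run(hits: set[str]) -> dict:
    """Score one run: each dimension is binary (addressed or not)."""
    obvious = sum(d in hits for d in OBVIOUS)
    proactive = sum(d in hits for d in PROACTIVE)
    return {"obvious": obvious, "proactive": proactive, "total": obvious + proactive}

# Example: a run that covers every obvious layer but, proactively,
# only tests and error handling -- the typical Adaptive-Thinking-ON shape
run = score_run({"api_route", "zod_schema", "database", "types", "frontend",
                 "pipeline", "tests", "error_handling"})
```

Binary dimensions keep scoring honest: either the run mentioned SQL injection or it did not, with no partial credit to argue about.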
Results
Both conditions scored 6 out of 6 on obvious dimensions. The difference is entirely in proactive reasoning.
With Adaptive Thinking ON, neither run mentioned security implications. Neither flagged breaking changes for existing API consumers. With Adaptive Thinking OFF, one run produced a structured security assessment covering input validation, SQL injection, XSS, and prompt injection. Another proactively identified that a new required field would break existing clients with HTTP 400 errors.
The outlier tells the story: Run A1 used 14 tool calls (vs. 2 for all other runs) and scored 8 out of 12. Run B1 scored 8 out of 12 with just 2 tool calls. More research does not equal deeper analysis. The model with Adaptive Thinking ON reads more files but thinks less about what it read.
The cost paradox: Run B1 cost more ($0.71 vs. $0.65 for Run A1) despite producing far fewer output tokens (1,897 vs. 3,319). Fixed thinking budgets allocate more internal reasoning tokens, which show up in cost but not in output. The model actually thinks longer.
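The gap can be turned into a rough count of invisible reasoning tokens. A back-of-envelope sketch, using illustrative Opus-class rates ($15/M input, $75/M output, with thinking tokens assumed to bill at the output rate) and assuming input tokens were equal across runs; the real rates and billing rules may differ:

```python
# Back-of-envelope: how many billed-but-invisible reasoning tokens would
# explain B1 costing more while emitting fewer output tokens?
# Rates are ILLUSTRATIVE, not official pricing.
OUT_RATE = 75 / 1e6  # $ per output/thinking token (assumed)

def hidden_reasoning_tokens(cost_delta: float, output_token_delta: int) -> int:
    """Extra thinking tokens implied by a cost gap, net of visible output."""
    visible_cost_delta = output_token_delta * OUT_RATE
    return round((cost_delta - visible_cost_delta) / OUT_RATE)

# B1 vs A1: +$0.06 total cost, 1,422 fewer output tokens
extra = hidden_reasoning_tokens(0.71 - 0.65, 1897 - 3319)
```

Under these assumptions the cost gap implies roughly two thousand extra reasoning tokens in the fixed-budget run, which is consistent with "thinks longer, says less."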
Limitation I want to be transparent about: N=2 per condition. These results are indicative, not statistically significant. The test script is open for anyone to reproduce.
The fix: 2 minutes, 3 settings
Setting 1: Disable Adaptive Thinking
This is the highest-impact change. It forces a fixed reasoning budget instead of letting the model decide per turn.
```shell
# Add to ~/.bashrc or ~/.zshrc
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
```

Reload your shell or run source ~/.bashrc.
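To confirm the variable actually survives into a fresh shell (assuming bash or zsh):

```shell
# Prints "1" if the variable is exported, "unset" otherwise
echo "${CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING:-unset}"
```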
Setting 2: Make thinking visible
This lets you see how deeply the model is reasoning on each turn. When thinking summaries are shallow or missing, you know something is off before the output goes wrong.
```json
// ~/.claude/settings.json
{
  "effortLevel": "high",
  "showThinkingSummaries": true
}
```

Setting 3: Use /effort max for complex tasks
For architecture analyses, multi-file refactors, or anything where you need the model to consider adjacent impacts, start the session with:
```
/effort max
```

This goes beyond high and allocates the maximum reasoning budget.
Bonus: Tell the model what you expect
Add this to your project's CLAUDE.md:
```markdown
## Analysis Standards

- Research the codebase before editing. Never change code you haven't read.
- For every change, consider: security implications, breaking changes for existing consumers, test coverage, and documentation updates.
- When analyzing an API change, explicitly check: input validation, error handling, downstream consumers, and related endpoints.
```

This does not replace the environment variable. But it sets expectations for what 'thorough analysis' means in your project.
The copy-paste fix
Here is a single prompt you can give Claude Code to diagnose and fix your setup:
```
Check my Claude Code configuration for the Adaptive Thinking regression.

1. Check if CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING is set:
   echo $CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING
2. Check my current settings:
   cat ~/.claude/settings.json
3. If CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING is not set to 1, add it to my shell profile (~/.bashrc or ~/.zshrc, whichever exists).
4. If settings.json doesn't contain "showThinkingSummaries": true, add it.
5. If effortLevel is not "high", set it to "high".

Show me what changed and remind me to run: source ~/.bashrc

Context: https://blog.intentic.io/claude-code-thinking-regression-fix
Related: https://github.com/anthropics/claude-code/issues/42796
```

Three takeaways for builders
When your AI tool feels different, measure it. Subjective impressions are easy to dismiss. A 12-point rubric with controlled variables is not. The difference between 'it feels worse' and 'proactive reasoning dropped from 3.0/6 to 1.0/6' is the difference between a complaint and a diagnosis.
Defaults matter more than features. The model did not lose capability. The default configuration shifted from 'think thoroughly' to 'think efficiently.' For simple tasks, this is fine. For complex engineering, it cuts exactly the reasoning that makes AI-assisted analysis valuable.
Always check the meta-layer. I spent weeks assuming my prompts were the problem. The actual problem was an environment variable I did not know existed. Before you debug your prompts, check whether the model's reasoning infrastructure changed underneath you.
Frequently Asked Questions
Does this fix work on all Claude Code plans?
The CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING environment variable works on all plans that support Claude Code, including Pro, Max, and API. The effortLevel and /effort max settings are available across all tiers.
Will Anthropic fix this permanently?
Anthropic acknowledged the issue on Hacker News and confirmed that Adaptive Thinking was under-allocating on certain turns. A permanent fix has not been announced. The environment variable is the official interim workaround. Track GitHub Issue #42796 for updates.
Does disabling Adaptive Thinking increase costs?
In the A/B test, costs were nearly identical ($0.57 vs. $0.55 per run). Fixed thinking budgets use more internal reasoning tokens but produce fewer output tokens and fewer correction cycles. For complex tasks, it may actually reduce total cost.
Is N=2 enough to trust these results?
No, and I say that in the report. N=2 is indicative, not statistically significant. But the direction is consistent across all runs, the rubric was pre-registered, and the community data from 6,852 sessions supports the same conclusion. The test script is open for anyone to reproduce with higher N.
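For anyone reproducing at higher N: a one-sided Fisher exact test on any single binary dimension (e.g. "mentioned security: yes/no" per run, OFF vs. ON) is a simple first significance check. A pure-stdlib sketch; scipy.stats.fisher_exact computes the same quantity:

```python
# One-sided Fisher exact test for a 2x2 table of binary rubric outcomes.
# Table layout: [[a, b], [c, d]] = [[OFF hits, OFF misses], [ON hits, ON misses]]
from math import comb

def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(row 1 has >= a successes) with all table margins held fixed."""
    n, r1, c1 = a + b + c + d, a + b, a + c
    denom = comb(n, c1)
    return sum(comb(r1, k) * comb(n - r1, c1 - k)
               for k in range(a, min(r1, c1) + 1)) / denom

# Example from this test: OFF hit 3 of 6 proactive dimensions, ON hit 1 of 6.
# p ~ 0.27 -- visibly different, but nowhere near significant at N=2.
p = fisher_one_sided(3, 3, 1, 5)
```

This is exactly why the article calls N=2 indicative: the direction is consistent, but you need more runs before the p-value means anything.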
What is the difference between /effort high and /effort max?
high is the second-highest effort level and can be set as a persistent default in settings.json. max is the highest level, available per session via the /effort max command. Use max for complex architecture analyses. Some users report that max can occasionally over-explain, so monitor the output.
Sources
- GitHub: Issue #42796, "Claude Code unusable for complex engineering tasks" (April 2, 2026)
- Pragmatic Engineer: AI coding tool adoption survey (February 2026, n=15,000)
- DEV Community: Technical deep-dive on the Feb/Mar updates (April 2026)
- TokenCost: Data analysis of the Claude Code regression (April 2026)
- lilting.ch: Thinking block analysis with 17,871 samples (April 2026)
- Hacker News: Anthropic engineer confirmation of the Adaptive Thinking bug (April 2026)
- GetPanto: Claude Code revenue estimates (early 2026)