
Context Rot: Why Your AI Gets Dumber the Longer You Use It

Your AI starts brilliant and ends useless. It's called context rot, and it's probably your fault. Here's why it happens and how to avoid it.

You start a session and Claude is brilliant. Sharp suggestions, remembers every constraint, nails the context.

Two hours later, it’s suggesting things you already tried. Forgetting constraints you mentioned an hour ago. Giving generic answers to specific questions.

You’re not imagining it. The AI actually got dumber. It’s called context rot, and it’s probably your fault 😆☠️

What Is Context, Anyway?

Every AI conversation lives inside a fixed-size buffer called the context window. Think of it as the AI’s working memory — everything you say, everything it says, every code block, every error log. All of it sits in this one buffer.

It’s measured in tokens. A token is roughly ¾ of a word — “JavaScript” is 3 tokens, a blank line is 1. Here are the actual limits:

| Model | Context Window | Roughly… |
| --- | --- | --- |
| Claude (Sonnet/Opus) | 200K tokens | ~150,000 words |
| GPT-4.1 | 1M tokens | ~750,000 words |
| Gemini 2.5 | 1M tokens | ~750,000 words |

200K sounds like a lot. It’s not. A single large source file can eat 5,000 tokens. A stack trace? 500. One hour of back-and-forth debugging? 20,000+ tokens gone before you’ve fixed anything.
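Don't take the ¾-of-a-word heuristic on faith; you can measure. A minimal sketch using OpenAI's tiktoken library (Claude tokenizes differently, so treat the counts as ballpark, not gospel):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is an OpenAI tokenizer; Claude's differs, so counts are approximate.
enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Approximate token count for a chunk of conversation."""
    return len(enc.encode(text))

# A 20-frame stack trace pasted into chat:
stack_trace = "  File 'app.py', line 42, in handler\n    raise ValueError\n" * 20
print(count_tokens(stack_trace))  # runs into the hundreds, in line with the ~500 above
```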

And those bigger windows? The research shows every model degrades well before hitting its limit. More context window doesn’t mean more usable context. Claude gets unreliable around 130K. The others aren’t much better.

What Context Rot Looks Like

  • Minute 5: Claude remembers your tech stack, your constraints, your preferences. Suggestions are targeted.
  • Hour 1: Starts repeating ideas. Misses a detail you mentioned earlier.
  • Hour 2: Suggests the exact approach you rejected 45 minutes ago. Same session.

The context window filled up. The AI isn’t forgetting — it’s drowning.

Why It Happens

Context windows are finite. Even 200K tokens fills up faster than you think. Every message adds to the pile: your questions, Claude’s answers, code blocks, error messages, stack traces, that random tangent about lunch.

Imagine a whiteboard that never gets erased. Eventually you’re writing in the margins of the margins. The important stuff is still there somewhere, but good luck finding it.

[Figure: Performance drops as context grows — across Claude, GPT-4.1, Qwen, and Gemini. Source: Chroma Research.]

Research backs this up. A Chroma study tested 18 leading LLMs — all showed degradation as context expanded. Models claiming 200K tokens typically become unreliable around 130K. The drops are sudden, not gradual.

The Tangent Tax

Every side conversation pollutes the context. “Hey, quick question about something unrelated” — that’s in the context forever now.

Debugging rabbit holes are the worst. Every failed attempt, every stack trace, every “let’s try this instead” — all eating tokens. By the time you find the fix, the context is 40% debugging garbage.

You’re not just using context. You’re spending it. Every token is a resource.
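If it helps to see the spend, here's a toy ledger using the rough numbers from this post; nothing official, just the arithmetic of a normal session:

```python
# Toy budget tracker. 130K is this post's rule-of-thumb "usable" ceiling,
# per the Chroma study, not an official limit.
USABLE = 130_000

ledger: list[tuple[str, int]] = [
    ("big source file pasted in", 5_000),
    ("hour of debugging back-and-forth", 20_000),
    ("failed attempts + stack traces", 15_000),
    ("random tangent about lunch", 2_000),
]

spent = 0
for label, tokens in ledger:
    spent += tokens
    print(f"{label:<35} {tokens:>7,}  running total: {spent:>7,} ({spent / USABLE:.0%})")
```

That's about a third of the usable window gone, and you haven't shipped the feature yet.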

The Compaction Tax

When context hits ~95% capacity, Claude Code compacts — summarizes the conversation to make room.

This takes time. You sit there waiting while it figures out what to keep.

[Screenshot: Claude Code compaction details — compaction triggers at ~95% capacity and summarizes everything, old and new context flattened together. See my pink colored bar at 60%? Getting up there 🫠]

Worse: it sometimes gets it wrong. I’ve had compaction end and Claude pick up an OLD plan from earlier in the session. Started doing work I’d already moved past. Minutes lost, plus the cognitive overhead of catching the mistake.

Here’s the thing: compaction doesn’t selectively keep messages. It summarizes everything into a condensed form. Old context and new context get flattened together. Nuance disappears.
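A rough sketch of that behavior (not Claude Code's actual implementation, just the shape of it):

```python
# Rough model of auto-compaction. NOT Claude Code's real algorithm,
# just the failure mode described above.
WINDOW = 200_000
TRIGGER = 0.95           # compaction kicks in around 95% capacity
CHARS_PER_TOKEN = 4      # crude approximation

def maybe_compact(history: list[str], summarize) -> list[str]:
    used = sum(len(msg) for msg in history) // CHARS_PER_TOKEN
    if used < TRIGGER * WINDOW:
        return history
    # No per-message triage: old plans and new decisions alike get
    # flattened into a single summary. This is where nuance disappears.
    return [summarize("\n".join(history))]
```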

If you’re hitting compaction regularly, you’ve already lost. The session is too long.

How to Avoid It

Start fresh more often. New session isn’t failure — it’s hygiene. “But I’ll lose context!” Only if it’s not written down. If the important stuff is in files (CLAUDE.md, AGENTS.md), fresh sessions start smart.

Pro tip: /clear in Claude Code gives you fresh context without restarting. Keeps your working directory, clears the conversation.

Compact before it compacts you. Finished a big chunk of work but not the whole feature? Don’t wait for the 95% auto-trigger — /compact in Claude Code lets you manually compress context whenever you want. You can even tell it what to keep:

```
/compact Keep context around our RSS feed work and the schema decisions
```

You control what gets summarized instead of letting the AI decide under pressure at the worst possible moment.

Build durable markdown files. CLAUDE.md, AGENTS.md, decision logs in /docs — these are your AI’s long-term memory. Every decision you write down is a decision you never have to re-explain. Next session reads those files and starts with full context, zero token debt. I accidentally turned this habit into a whole system.
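What goes in those files? A hypothetical CLAUDE.md sketch; every stack choice and decision below is made up, so swap in your own:

```markdown
# CLAUDE.md (hypothetical example)

## Stack
- Next.js 14, TypeScript, Postgres via Prisma

## Conventions
- Server components by default; client components only when interactive
- All dates stored in UTC, formatted at the edge

## Decisions (full log in /docs/decisions)
- RSS feed is generated at build time, not on request
- Rejected Redis cache; Postgres is fast enough for now
```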

Scope your sessions. One session, one goal. Don’t do feature work and debugging and refactoring in the same session. Finished a task? Start fresh for the next one.

Consider subagents. What if you could spin up fresh context on demand? A subagent does one task with clean context, returns the result. Parent orchestrates, children execute fresh. No pollution between tasks.

More on that in the next post.

The Mindset Shift

Stop treating sessions like relationships — continuous, precious, hard to end.

Start treating them like sprints — focused, finite, designed for fresh starts.

The AI doesn’t have feelings about being restarted. Your codebase doesn’t care how many sessions it took. Only your output matters.

Context is a resource. Spend it wisely, and know when to get a fresh wallet.


Context rot isn’t a bug in the AI. It’s physics. Finite windows, infinite conversations, something has to give.

Once you understand it, you can work with it. Start fresh more often. Scope your sessions. Externalize your decisions.

Your AI will feel smarter. It’s the same AI — you’re just not drowning it anymore.

Next up: I kept restarting sessions and tracking progress in a file. Felt dumb. Then I realized I was manually doing what subagents do automatically.


Sources