Context rot, retrieval-augmented generation, and how to give your AI the right information at the right time. A practical guide for teams moving from casual AI use to structured AI workflows.
These are the immediate changes that will improve how your team works with ChatGPT. The rest of this document explains the technical reasons why.
Stop copy-pasting large documents into the chat window. Upload them as files instead — either into a Custom GPT's Knowledge section or a ChatGPT Project — and let the retrieval system surface what's relevant.
When you upload files, ChatGPT breaks them into chunks and searches them semantically, pulling only the relevant passages into context for each question.
If you've been pasting business documents into ChatGPT and finding that quality degrades the more you add, the cause is well-documented. Researchers call it context rot — the measurable degradation in LLM output quality as input length increases.
Chroma's 2025 study tested 18 frontier models (GPT-4, Claude, Gemini, and others) and found that every single one degrades as input length grows, at every length increment tested. Even a model with a 1M-token window shows measurable degradation by 50K tokens. The issue isn't capacity; it's noise accumulation.
Models attend well to the start and end of context but poorly to the middle. Critical information placed mid-context can sit in a blind spot even if the model technically "sees" it.
Transformer attention is quadratic. At 100K tokens, the model computes 10 billion pairwise relationships. Each token's attention weight shrinks as context grows — the noise floor rises.
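The arithmetic behind those two claims can be sketched with toy numbers (illustrative only; real models add many refinements on top of plain self-attention):

```python
# Toy sketch of why attention dilutes as context grows.
# These are back-of-envelope numbers, not any specific model's internals.

def pairwise_relationships(n_tokens: int) -> int:
    """Self-attention compares every token with every other: n^2 pairs."""
    return n_tokens * n_tokens

def mean_attention_weight(n_tokens: int) -> float:
    """Softmax weights sum to 1 for each token, so the average weight any
    single token can receive is 1/n. The signal thins as n grows."""
    return 1 / n_tokens

print(pairwise_relationships(100_000))  # 10 billion pairs at 100K tokens
print(mean_attention_weight(1_000))     # average weight at 1K tokens
print(mean_attention_weight(100_000))   # 100x smaller at 100K tokens
```

The second function is the "noise floor" intuition in miniature: with 100x more tokens competing for the same unit of attention, each one gets, on average, 100x less of it.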
Semantically similar but irrelevant content actively misleads the model. Five contracts pasted when you need one aren't neutral — they compete for the model's attention budget.
[Figure: illustrative composite of degradation curves; individual model curves vary. The key finding is universality: no model is immune.]
When you upload files to a Custom GPT's Knowledge section or to a ChatGPT Project, the system does not paste everything into the conversation. It uses a technique called Retrieval-Augmented Generation (RAG).
The key insight: RAG keeps context lean. The model only sees information relevant to your specific question — not your entire document library. This directly combats context rot.
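The retrieve-then-answer loop can be sketched in a few lines. This is a deliberately simplified stand-in: real RAG systems use a neural embedding model and a vector database, not the word-overlap scoring below, but the shape of the pipeline is the same.

```python
# Minimal RAG retrieval sketch. The "embedding" here is just word counts,
# a stand-in for the learned embeddings a real retrieval system uses.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy embedding: bag of lowercase words."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Score every chunk against the query and keep only the top k.
    Only these retrieved chunks enter the model's context window."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Refund policy: customers may request a refund within 30 days.",
    "Office hours are Monday to Friday, nine to five.",
    "Refund requests go through the billing portal.",
]
print(retrieve("how do I get a refund", chunks, k=2))
```

Note what never happens here: the office-hours chunk is never shown to the model. That is the whole point. Irrelevant content is filtered out before it can compete for attention.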
| Method | How It Works | Context Rot Risk | Best For |
|---|---|---|---|
| Copy-paste | Everything goes directly into the context window | HIGH | Quick questions on short text |
| GPT Knowledge | Files indexed via RAG; relevant chunks retrieved per query | LOW | Stable reference (SOPs, policies, FAQs) |
| ChatGPT Projects | RAG retrieval + cross-conversation memory | LOW | Ongoing work with evolving context |
| Assistants API | Full control over chunking, embedding model, retrieval | LOWEST | Production apps, enterprise integration |
Note: PDFs uploaded as GPT Knowledge use text-only retrieval. PDFs uploaded by users during conversation can use visual retrieval for layout-aware parsing.
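The "chunks" these retrieval systems index are typically overlapping windows of text. A toy chunker shows the idea (the sizes and overlap below are illustrative; actual chunking parameters vary by platform and are not always configurable):

```python
# Hypothetical chunker, sketching the splitting step a retrieval system
# performs before indexing. Window size and overlap are made-up defaults.
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` words, overlapping by `overlap`
    words so a sentence cut at one boundary still appears whole in the
    neighboring chunk."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]
```

The overlap is why well-structured documents retrieve better: a fact stated once, near a clear heading, lands cleanly inside a chunk, while a point scattered across many pages may never appear complete in any single retrieved window.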
Structured retrieval (GPTs, Projects) is the first step. The next level is building repeatable AI workflows that integrate directly with your files and business processes. Here are two approaches worth exploring.
Metrized can help you scope, build, and deploy custom AI workflows — from GPT configuration to full Cowork + Skills implementations. Reach out to discuss what's possible for your team.