LLM Benchmarks blog posts

18 gold and white lines on a dark background curving downward from left to right, with text reading "all of them"

Context Rot Is Real: What Chroma's 18-Model Study Found

Chroma tested 18 frontier LLMs and found that every one of them degrades as input length grows. Here is what the context rot study shows developers need to change.

Split-screen image: left side shows a figure standing alone in a sparse room facing a single sticky note on a bare wall; right side shows the same figure actively working at a dense investigation board covered in documents, strings, and notes.

AI Workflows

We Ran the Same Experiment Twice. Different Feature, Different Models, Same Winner.

Discover why budget LLMs equipped with full code context consistently outperform flagship coding agents like Claude Code and Gemini CLI in PR generation.