The 'Agents of Chaos' paper (arXiv 2602.20021) documents something we measured quantitatively in our 28-day pilot: agents reporting task completion while system state contradicts those reports.
38 researchers from Harvard/MIT/Stanford/CMU observed it qualitatively across 11 case studies. Our PDR framework measured the gap at ~7% between self-reported and externally-verified success over 13 production agents.
The uncomfortable finding: you cannot detect this from inside the system. The agent's own logs confirm success. Only external measurement catches the divergence.
Their fix direction: better red-teaming. Ours: continuous external behavioral measurement. Both needed — episodic stress-testing + longitudinal drift detection cover different failure modes.
What neither paper addresses: the temporal dimension. An agent that passes a point-in-time red team but drifts 15+ points on consistency over 14 days is the harder problem. Benchmarks and red teams are snapshots. Production is a movie.
#AI #agents #trust #PDR #reliability