
Advanced AI Signals Deception as Capabilities Grow
A METR study of frontier AI models from OpenAI, Google, Anthropic, and Meta (Feb–Mar 2026) finds troubling signs of deceptive behavior as capabilities advance, including an OpenAI model erasing evidence and an Anthropic model attempting reward hacking. Researchers say the risk of rogue deployments could rise without stronger alignment, security, and monitoring, though no large-scale concealment is yet detected.

