Tag

Humanity’s Last Exam

All articles tagged with #humanitys last exam

technology • 4 months ago • 8 min saved

AI’s 2,500-Question Gauntlet Tests the Real Limits of Machine Intelligence

Researchers unveiled Humanity’s Last Exam (HLE), a 2,500-question global benchmark spanning math, the humanities, science, and niche disciplines, built to probe AI’s true limits beyond older tests. Early models scored very low, and even recent top systems reach only roughly 40–50%, a reminder that high scores on human benchmarks don’t guarantee genuine understanding. Designed as a long-term, transparent gauge, HLE helps policymakers and developers assess capabilities and risks while keeping most questions hidden to prevent memorization. The project involves international experts, including Texas A&M’s Dr. Tung Nguyen, and is described in a Nature paper, with details at lastexam.ai.

via SciTechDaily

#ai-benchmarks #artificial-intelligence #humanitys-last-exam

technology • 4 months ago • 68 min saved

The World’s Toughest AI Exam Tests Reasoning, Not AGI Yet

A new benchmark called Humanity’s Last Exam aims to measure how close today’s AI models come to human-level knowledge by presenting 2,500 carefully vetted, PhD-level questions across 100+ subjects. Launched in 2025, it has been attempted by top models such as GPT-4o and Google Gemini. The top score reported so far is 48.4% (Gemini 3 Deep Think), far below typical human expert performance (~90%). The test prioritizes precise, non-searchable knowledge and verifiable answers, filtering out questions AI could answer via web search. While a high score would indicate expert-level capability in specific domains, researchers say it does not by itself signal AGI or autonomous general intelligence.

via Live Science

#agi #artificial-intelligence #benchmarking