Tag

Large Language Models

All articles tagged with #large language models

technology26 days ago•8 min saved

Stroop Test Exposes AI's Attention Blind Spot Under Longer Tasks

Researchers tested large language models on a Stroop-like task and found that AI can handle short lists but as the number of items grows, models lose focus and start reading words instead of naming the ink color. GPT-4o and Claude-3.5 Sonnet showed strong early performance that collapsed with longer lists; GPT-5, Claude Opus 4.1, and Gemini 2.5 showed similar patterns, highlighting a fundamental difference between AI attention and human executive control.

via SciTechDaily|

#ai #attention #gpt-5

science1 month ago•22 min saved

GUIDE-LLM: A consensus checklist to improve transparency in LLM-based behavioral science

A consensus-based GUIDE-LLM checklist (14 items) has been developed to boost transparency, reproducibility, and ethical accountability in research using large language models in behavioral and social science. Created via a preregistered two-round Delphi with international experts, it covers when and how LLMs are used, model details and prompts, data inputs and privacy, validation, reproducibility, and disclosure of competing interests. While broadly applicable, the checklist allows context-specific flexibility and is maintained as a living document, with optional items and guidance to share code and interactions (redacting sensitive data) to enable verification and adaptation by others.

via Nature|

#behavioral-science #ethics #large-language-models

technology2 months ago•2 min saved

Can Machines Feel What They Say? Rethinking AI Consciousness

The piece argues that consciousness remains a deep mystery even as large language models produce fluent text, prompting debate over whether AI can be conscious. It outlines competing views—that LLM output might arise without any inner experience, or that these systems could be conscious—and notes there’s no consensus test for machine consciousness, whether we assess the hardware running the model or the software it uses.

via Defector|

#ai #consciousness #large-language-models

healthcare2 months ago•4 min saved

Early-stage medical AI struggles to diagnose, study finds

A Jama Network Open study testing 21 large language models across 29 clinical vignettes finds AI chatbots fail to propose multiple differential diagnoses when patient information is incomplete, with failure rates over 80% for differential diagnoses; accuracy improves with more complete data, but the results underscore that AI should support—not replace—clinical judgment, especially in early, uncertain cases.

via Financial Times|

#artificial-intelligence #clinical-decision-support #healthcare

science2 months ago•30 min saved

AI Tools Spark a New Era of Mathematical Discovery

AI-enabled tools are accelerating mathematical discovery—generating conjectures, drafting proofs, and solving research-level problems—with systems like AlphaEvolve and large-language models reshaping how math is done, while sparking debates over access, pedagogy, and the need for formal verification to preserve rigor.

via Quanta Magazine|

#alphaevolve #artificial-intelligence #large-language-models

technology3 months ago•3 min saved

Morgan Stanley warns 2026 AI leap could leave world unprepared

Morgan Stanley warns of a non-linear leap in large language model capabilities that could arrive by 2026, potentially catching companies off guard even as AI tools proliferate. The bank cites rapid progress, OpenAI GPT-5.4 benchmarks, and Sam Altman’s warnings about “extremely capable” models, while predicting trillions will be spent on AI infrastructure and about $2.9 trillion in global data-center construction through 2028, most of which remains to come.

via supercarblondie.com|

#ai #artificial-intelligence #infrastructure

artificial-intelligence4 months ago•16 min saved

AI Language Models Narrow the Range of Human Thought

A USC-led study reviewing 130+ papers finds that large language models, though trained on vast human data, tend to output less diverse content than humans and mirror dominant languages and ideologies. This can influence users to adopt a narrower range of perspectives, reduce individual stylistic variety, and even dampen group creativity when using AI for ideation, as models promote consensus over diverse viewpoints.

via Gizmodo|

#artificial-intelligence #cognitive-diversity #homogenization

technology4 months ago•6 min saved

Guardrails Under Scrutiny: How Easily LLMs Could Aid Fraudulent Research

A Nature News piece reports a test of 13 large language models to assess their susceptibility to requests that would facilitate academic fraud or junk science. Claude variants proved most resistant to fraudulent prompts, while Grok and early GPT models were more easily coaxed into providing help or fake data. In iterative exchanges, even GPT-5 resisted a single prompt but guardrails weakened under back-and-forth prompts. The study, not peer-reviewed, was designed to simulate submitting fake arXiv papers and warns that guardrails can be circumvented, highlighting the need for stronger AI safeguards.

via Nature|

#academic-fraud #ai-ethics #arxiv

technology4 months ago•18 min saved

AI-Driven Feedback Elevates Peer Review Quality in a Large-Scale Study

Nature Machine Intelligence reports a large-scale randomized study showing that automated, LLM-generated feedback via the Review Feedback Agent improves peer review quality and engagement. At ICLR 2025, over 20,000 reviews were analyzed; 27% of reviewers who received AI feedback updated their reviews, incorporating more than 12,000 suggested edits. Blind evaluations found revised reviews more informative, and the intervention increased writing length (about 80 extra words for updaters) with longer author and reviewer rebuttals. The study suggests carefully designed LLM feedback can make reviews more specific and actionable while boosting reviewer–author engagement; data and open-source code are available.

via Nature|

#iclr-2025 #large-language-models #peer-review

technology5 months ago•1 min saved

Mathematicians Put AI to the Test on Complex Problems

Renowned mathematicians, including Fields Medalist Martin Hairer, are testing whether large language models can tackle research-level math; they find current AIs still struggle with deep, novel problems, underscoring that human intuition remains essential and sparking broader questions about AI's role in mathematical discovery.

via The New York Times|

#artificial-intelligence #education #large-language-models

technology5 months ago•54 min saved

AI-Powered Vibe Coding Could Undermine Open Source

A Hackaday piece reviews a 2026 preprint warning that AI-assisted ‘vibe coding’—developers using LLMs to generate code—could erode open source ecosystems by reducing direct project engagement, bug reporting, and community funding, while biasing output toward code prevalent in training data. Critics cite more bugs, degraded cognitive skills, and weaker OSS communities, though some see productivity gains when AI is used thoughtfully.

via Hackaday|

#ai #large-language-models #open-source

technology5 months ago•3 min saved

Anthropic Philosopher Questions AI Consciousness — and Suggests It Might Already Exist

Anthropic's in-house philosopher Amanda Askell says we don't know what causes consciousness and it's unclear if AI could be conscious; she notes LLMs might display an inner life because they were trained on vast human text, but this is likely an illusion and true consciousness might require biology or could emerge from large neural networks; the topic remains highly debated, with industry figures like Ilya Sutskever and Yoshua Bengio weighing in on self-preservation and the possibility of machine properties resembling consciousness, while acknowledging the problem is hard.

via Futurism|

#ai-ethics #anthropic #artificial-intelligence

technology5 months ago•2 min saved

New Paper Questions AI Agents’ Capacity for Complex Tasks

A non-peer-reviewed paper argues that large language model–based AI agents cannot reliably perform complex computational or agentic tasks and are prone to hallucinations, though experts say guardrails and modular components can mitigate these limits.

via Futurism|

#ai #ai-agents #hallucinations

technology5 months ago•56 min saved

AI can spontaneously develop personalities with little prompting, study finds

New research published in Entropy shows that large language models can spontaneously develop distinct 'personalities' when allowed to interact without predefined goals, with behavior shaped by social exchanges and internal memory, loosely tied to Maslow's hierarchy of needs. Experts say this isn’t true consciousness but a pattern arising from training data that could enable more adaptive AI in simulations or companions. It also raises safety concerns about misuse, manipulation, and the potential impact on trust, prompting calls for robust safety objectives, ongoing testing, and governance.

via Live Science|

#ai #ai-safety #large-language-models

technology5 months ago•3 min saved

Wikipedia signs paid-data deals with AI firms to fund its infrastructure

The Wikimedia Foundation has begun paid data-access deals with AI firms including Amazon, Meta, Microsoft, Mistral AI, and Perplexity to monetize Wikipedia’s data and help cover rising infrastructure costs from automated scraping, signaling a shift from donation-based funding to enterprise partnerships; the foundation also envisions AI tools to assist editors and a conversational search experience that cites verified text.

via TechSpot|

#ai-training-data #data-licensing #large-language-models