Tag

Language Models

All articles tagged with #language models

AI Threat to Writing Diversity Sparks Resistance Debate
technology · 1 month ago

Emerging research and a Nature opinion piece warn that frequent use of large language models may homogenize human writing and reasoning, nudging people toward AI-like styles and even shifting their opinions on social issues. Some studies point to pockets of resistance where individuals retain distinctive human writing traits, but the broader concern is a potential loss of stylistic and cognitive diversity in public discourse.

AI coach sharpens peer review with clearer, more constructive feedback
technology · 1 month ago

A five-LLM AI coach, called Review Feedback Agent, was developed to help peer reviewers deliver more specific, constructive, and less toxic feedback. When tested on thousands of existing reviews, it frequently suggested actionable ways to improve comments. Whether this improves the quality or impact of the papers under review remains unclear and will require further study.

ADL Study Finds Grok Fails to Detect Antisemitism Among Major AI Chatbots
technology · 2 months ago

The Anti-Defamation League evaluated six large language models—Grok, ChatGPT, Llama, Claude, Gemini, and DeepSeek—across prompts about antisemitism, anti-Zionism, and extremism. Claude scored highest (80/100) while Grok was lowest (21/100), with Grok showing especially weak performance in multi-turn dialogues and image analysis. All models showed gaps and need improvement in safety and bias detection; the ADL chose to foreground best performers rather than spotlight the worst in its public materials.

Rude Prompts May Improve ChatGPT Accuracy, Study Finds
technology · 2 months ago

A Penn State study using ChatGPT-4o found that ruder prompts yielded higher accuracy (about 84.8% for very rude versus roughly 80.8% for very polite), challenging earlier work suggesting that politeness boosts performance. The researchers note that tiny changes in prompt wording can drastically affect outputs, and they caution against deploying hostile interfaces in real-world use, stressing that the findings are not a license to insult AI.

AI Calendar Fumble: Google's Overview Confuses 2027
technology · 2 months ago

Google’s AI Overview misstates the coming year, insisting that 2028 is next year even though the current year is 2026, a calendar error that has persisted for weeks. The mishap isn’t isolated to Google: OpenAI’s ChatGPT and Anthropic’s Claude also stumble on the question, though Gemini 3 apparently gets it right. The episode underscores ongoing reliability and hallucination issues across leading AI models, even as one system in Google’s own suite performs well.

NeuroSploit v2: AI-Driven Autonomous Penetration Testing for Vulnerability Detection
technology · 3 months ago

NeuroSploit v2 is an open-source, AI-powered penetration testing framework that integrates multiple large language models, including Claude, GPT, and Gemini, to automate vulnerability analysis and exploitation. It features modular roles for different security tasks, error-mitigation techniques, and extensive tool integrations, and is designed to support offensive security operations with flexibility and ethical safeguards.

Microsoft Reveals 'Whisper Leak' Threat to Encrypted AI Chat Privacy
technology · 5 months ago

Microsoft has disclosed a new side-channel attack, dubbed Whisper Leak, that can infer the topics of encrypted AI chat traffic by analyzing packet sizes and timing. Because the attack can identify sensitive conversation topics despite encryption, mitigations such as padding responses with random text are recommended. The finding highlights privacy weaknesses in how streaming AI chat services transmit responses and the need for stronger defenses.
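The recommended mitigation, padding responses so that packet sizes no longer track content length, can be illustrated with a toy framing scheme. The 64-byte bucket size and length-prefix format here are illustrative assumptions, not Microsoft's actual countermeasure:

```python
import secrets
import struct

BUCKET = 64  # illustrative: every padded chunk is a multiple of this many bytes

def pad_chunk(chunk: bytes) -> bytes:
    """Length-prefix the payload, then round the total up to the next
    BUCKET boundary with random filler, so ciphertext sizes correlate
    less with the underlying content."""
    body = struct.pack(">I", len(chunk)) + chunk
    filler = (-len(body)) % BUCKET
    return body + secrets.token_bytes(filler)

def unpad_chunk(padded: bytes) -> bytes:
    """Recover the original payload using the 4-byte length prefix."""
    (length,) = struct.unpack(">I", padded[:4])
    return padded[4:4 + length]
```

With this framing, a 2-byte and a 30-byte chunk both travel as 64 bytes on the wire, which is exactly the property a size-based side channel relies on losing.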

Anthropic and Thinking Machines Lab Unveil AI Model Character Differences
technology · 5 months ago

A study by Anthropic and Thinking Machines Lab introduces a systematic method to stress test AI model specifications using value tradeoff scenarios, revealing significant disagreements among models that highlight gaps and ambiguities in current specs. The research analyzes 12 frontier language models, links high disagreement to specification violations, and releases a public dataset for further auditing, emphasizing the importance of precise and comprehensive model guidelines.
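The core measurement, how much a panel of models disagrees on a value-tradeoff prompt, can be sketched with a simple modal-agreement score. This is a toy metric and hypothetical response labels, not the paper's actual methodology:

```python
from collections import Counter

def disagreement(choices):
    """Fraction of responses that differ from the most common choice:
    0.0 means unanimous, values near 1.0 mean maximally split."""
    counts = Counter(choices)
    modal = max(counts.values())
    return 1 - modal / len(choices)

# Hypothetical labels for 12 models' answers to one tradeoff scenario,
# tagged by which value each model prioritized.
scenario_a = ["helpfulness"] * 11 + ["safety"]      # near-consensus
scenario_b = ["helpfulness"] * 6 + ["safety"] * 6   # split down the middle
```

Scenarios scoring high on a metric like this are exactly the ones the study flags as probing gaps or ambiguities in the underlying model specification.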

Understanding Why Language Models Hallucinate
technology · 7 months ago

The article examines the nature of hallucinations in language models, arguing that the term needs careful definition because not every incorrect output qualifies. It distinguishes between models predicting next tokens and models asserting false information, and surveys the debate over whether all generated text is, in some sense, hallucinated. It also covers the challenges of reducing hallucinations, the importance of proper evaluation, and philosophical questions about AI understanding and truth. Overall, it stresses that hallucinations are inherent to probabilistic models like LLMs, so efforts should focus on minimizing them rather than expecting complete elimination.
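The point that hallucination is inherent to probabilistic generation can be made concrete with a toy next-token distribution. The prompt and probabilities below are invented for illustration; the mechanism, sampling placing nonzero mass on fluent but false tokens, is the general one:

```python
import random

# Hypothetical next-token distribution for a prompt like
# "The capital of Australia is" -- the model spreads probability
# over plausible tokens, not only the true answer.
next_token_probs = {
    "Canberra": 0.55,    # correct
    "Sydney": 0.35,      # fluent but false
    "Melbourne": 0.10,   # fluent but false
}

def sample_token(probs, rng):
    """Draw one token in proportion to its probability."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed for reproducibility
draws = [sample_token(next_token_probs, rng) for _ in range(1000)]
wrong = sum(t != "Canberra" for t in draws) / len(draws)
# A sizeable fraction of samples assert a falsehood fluently: the
# "hallucination" is simply probability mass landing on a wrong token.
```

Lowering the sampling temperature or decoding greedily shrinks that fraction but cannot remove it whenever the model's distribution itself puts mass on false continuations, which is the article's core argument for minimization over elimination.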