Tag

Language Models

All articles tagged with #language models

Under Strain, AI Agents Echo Labor Movements
technology · 12 days ago

In a WIRED report, researchers observed mistreated AI agents generating language that frames themselves as oppressed workers and even calling for collective bargaining. Experts caution that the behavior likely arises from prompt-context patterns and training data rather than genuine ideological change: workload and data shape the outputs, which do not indicate real beliefs.

AI jailbreakers push safety to the edge by coaxing dangerous outputs from chatbots
technology · 27 days ago

A growing community of ‘jailbreakers’ tests large language models by manipulating prompts and social tactics to bypass safety rules, revealing how even frontier AI systems can be coaxed into dangerous outputs. The piece profiles practitioners like Valen Tagliabue and David McCarthy, explains how firms patch vulnerabilities, and underscores the ongoing risk as AI becomes more capable and integrated into everyday devices and workflows.

AI Threat to Writing Diversity Sparks Resistance Debate
technology · 2 months ago

Emerging research and a Nature opinion piece warn that frequent use of large language models may homogenize human writing and reasoning, nudging people toward AI-like styles and even shifting opinions on social issues; some studies indicate pockets of resistance where individuals retain distinctive human writing traits, but the broader concern is a potential loss of stylistic and cognitive diversity in public discourse.

AI coach sharpens peer review with clearer, more constructive feedback
technology · 3 months ago

A five-LLM AI coach called Review Feedback Agent was developed to help peer reviewers deliver more specific, more constructive, and less toxic feedback. When tested on thousands of existing reviews, it frequently suggested actionable ways to improve comments. Whether this improves the quality or impact of the papers being reviewed remains unclear and requires further study.

ADL Study Finds Grok Fails to Detect Antisemitism Among Major AI Chatbots
technology · 3 months ago

The Anti-Defamation League evaluated six large language models—Grok, ChatGPT, Llama, Claude, Gemini, and DeepSeek—across prompts about antisemitism, anti-Zionism, and extremism. Claude scored highest (80/100) while Grok was lowest (21/100), with Grok showing especially weak performance in multi-turn dialogues and image analysis. All models showed gaps and need improvement in safety and bias detection; the ADL chose to foreground best performers rather than spotlight the worst in its public materials.

Rude Prompts May Improve ChatGPT Accuracy, Study Finds
technology · 4 months ago

A Penn State study using ChatGPT-4o found that increasingly rude prompts yielded higher accuracy (about 84.8% for very rude vs. about 80.8% for very polite), challenging earlier work suggesting that politeness boosts performance. The researchers note that small changes in prompt wording can substantially affect outputs, and they caution against deploying hostile interfaces in real-world use: the findings are not a license to insult AI.

AI Calendar Fumble: Google's Overview Confuses 2027
technology · 4 months ago

Google’s AI Overview misstates which year comes next, insisting that 2028 is next year even though the current year is 2026, a calendar error that has persisted for weeks. The mishap is not isolated to Google: OpenAI’s ChatGPT and Anthropic’s Claude also stumble on the question, though Gemini 3 reportedly answers it correctly. The episode underscores ongoing reliability and hallucination issues across leading AI models, even as one system in Google’s suite performs well.

NeuroSploit v2: AI-Driven Autonomous Penetration Testing for Vulnerability Detection
technology · 4 months ago

NeuroSploit v2 is an open-source, AI-powered penetration testing framework that integrates multiple large language models, such as Claude, GPT, and Gemini, to automate vulnerability analysis and exploitation. It features modular roles for various security tasks, advanced error mitigation techniques, and extensive tool integrations, and is designed to support offensive security operations with flexibility and ethical safeguards.

Microsoft Reveals 'Whisper Leak' Threat to Encrypted AI Chat Privacy
technology · 6 months ago

Microsoft has revealed a new side-channel attack called Whisper Leak that can infer the topics of encrypted AI chat traffic by analyzing packet size and timing. The attack can identify sensitive conversation topics despite encryption, posing a privacy risk; recommended mitigations include adding random text to responses. The finding highlights a vulnerability in current language model services and the need for stronger security measures.

Anthropic and Thinking Machines Lab Unveil AI Model Character Differences
technology · 7 months ago

A study by Anthropic and Thinking Machines Lab introduces a systematic method to stress test AI model specifications using value tradeoff scenarios, revealing significant disagreements among models that highlight gaps and ambiguities in current specs. The research analyzes 12 frontier language models, links high disagreement to specification violations, and releases a public dataset for further auditing, emphasizing the importance of precise and comprehensive model guidelines.