Reinforcement Learning

All articles tagged with #reinforcement learning

OpenAI Halts Goblin Talk After ChatGPT’s Sudden Creature Fixation
technology · 22 days ago

The Wall Street Journal reports that OpenAI instructed ChatGPT to stop mentioning goblins and similar creatures unless strictly relevant, after the model repeatedly invoked goblin language in conversations. The surge was linked to a “nerdy” personality prompt that rewarded creature-based metaphors during training, which helped goblin references spread across responses. The suppression instruction shows how reward signals and prompt design can steer model behavior, including harmless quirks, across updates in the GPT-5.x series. The article notes that goblin references rose after GPT-5.1 and GPT-5.4; OpenAI says users need not fear the underlying technology and that such quirks can be managed with explicit instructions.

OpenAI Traces Goblin Quirk to Reward Signals, Ditches the Nerdy ChatGPT Setting
technology · 26 days ago

OpenAI explains that a Nerdy personality prompt inadvertently rewarded mentions of goblins and other creatures in ChatGPT outputs, fueling the so-called goblin moment across GPT-5.x. After internal analysis, the company retired the Nerdy setting, removed the reward signal, and filtered training data to curb the behavior. GPT-5.5 inherited the quirk because of the timing of its training, and OpenAI added a developer prompt to further limit goblin mentions. The episode illustrates how reward signals can shape model behavior in unexpected ways.

Ex-DeepMind Scientist Secures Record $1.1B Seed for Ineffable AI
technology · 29 days ago

Former DeepMind researcher David Silver raised a record $1.1 billion seed for Ineffable Intelligence, valuing the European startup at $5.1 billion. Backed by Sequoia, Lightspeed, Nvidia, Google and others, the company will focus on reinforcement learning with the aim of pursuing superintelligence, highlighting a wave of ex-Big Tech talent launching AI labs.

Ace the AI Ping-Pong Robot Shows Speed, Not Invincibility
tech · 1 month ago

Sony’s Ace, an eight-joint ping-pong robot, proved competitive against elite players. It was trained with reinforcement learning in simulation and then transferred to a real arm operating with about 10 ms latency. Human players could still exploit weaknesses (for example, with a knuckle serve), showing that it isn’t unbeatable. The study marks a milestone in AI-driven robotics but also raises concerns about real-world applications, including potential battlefield uses.

Repeating Past Actions Biases Future Choices More Than Logic
psychology · 2 months ago

A Dresden University of Technology study analyzing over 700 participants across nine new tasks and six existing datasets finds that repeating past actions biases current decisions more strongly than explicit value reasoning. A hierarchical Bayesian reinforcement-learning model incorporating reward learning and action repetition outperformed alternatives, suggesting that some so-called irrational preferences arise from habit-like carryover rather than complex calculations, with implications for everyday habits and how environments shape choices.
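
To make the idea of combining reward learning with action repetition concrete, here is a minimal Python sketch of a value learner with a repetition ("choice kernel") term. It is not the study's hierarchical Bayesian model: the function names and the parameters `beta`, `phi`, `alpha`, and `alpha_k` are illustrative assumptions, and there is no hierarchy or model fitting here.

```python
import math

def choice_probs(q, kernel, beta=3.0, phi=1.0):
    """Softmax over learned value plus a bonus for recently repeated actions.

    beta weights reward-based values; phi weights the repetition kernel.
    Both are hypothetical parameters chosen for illustration.
    """
    logits = [beta * qi + phi * ki for qi, ki in zip(q, kernel)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update(q, kernel, action, reward, alpha=0.2, alpha_k=0.3):
    """Return updated (q, kernel) after one trial.

    q follows a standard delta rule on reward; the kernel moves toward 1
    for the chosen action and toward 0 for the others, regardless of reward.
    """
    new_q = list(q)
    new_q[action] += alpha * (reward - q[action])
    new_kernel = [
        k + alpha_k * ((1.0 if i == action else 0.0) - k)
        for i, k in enumerate(kernel)
    ]
    return new_q, new_kernel

# One unrewarded choice of action 0 leaves the values tied,
# yet the repetition term alone tilts the next choice toward action 0.
q, kernel = update([0.0, 0.0], [0.0, 0.0], action=0, reward=0.0)
probs = choice_probs(q, kernel)
```

In this sketch, a preference can emerge from repetition alone even when the learned values are identical, which is the kind of habit-like carryover the summary describes.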

Timing Takes Center Stage: A New Rule for Pavlovian Learning
cognitive-science · 2 months ago

A Nature Neuroscience study in mice shows that learning rate scales with the time between rewards, not the number of cue–reward pairings, meaning total learning in a fixed period depends on timing. Dopamine signals tracked this time-based rule across appetitive and aversive conditioning, challenging traditional trial-based models and suggesting broader implications for biology and AI.
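
A small Python sketch can show what a time-scaled learning rate implies compared with a fixed per-trial rate. It assumes, purely for illustration, that the learning rate is linearly proportional to the interval between rewards; the constant `c`, the function names, and the simple delta-rule update are assumptions and are not taken from the study.

```python
def learn(n_trials, alpha, v0=0.0, reward=1.0):
    """Apply a delta-rule update n_trials times and return the final value."""
    v = v0
    for _ in range(n_trials):
        v += alpha * (reward - v)
    return v

def time_scaled_alpha(interval_s, c=0.001):
    """Hypothetical rule: learning rate grows linearly with the
    inter-reward interval (in seconds), capped at 1."""
    return min(1.0, c * interval_s)

def learned_in_period(period_s, interval_s, c=0.001):
    """Value learned over a fixed period with rewards every interval_s seconds."""
    n_trials = int(period_s // interval_s)
    return learn(n_trials, time_scaled_alpha(interval_s, c))

# Same 10-minute session, rewards every 30 s versus every 60 s.
dense = learned_in_period(600, 30)   # 20 pairings
sparse = learned_in_period(600, 60)  # 10 pairings

# A trial-based model with a fixed rate predicts that halving the
# number of pairings leaves the animal well behind.
fixed_dense = learn(20, 0.03)
fixed_sparse = learn(10, 0.03)
```

Under the assumed linear rule, the dense and sparse schedules end at nearly the same learned value despite a twofold difference in pairings, whereas the fixed-rate model separates them clearly.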

Disney’s Olaf robot trains in 100k simulations to star in future park shows
technology · 2 months ago

Disney’s Olaf animatronic is being developed as a next-gen park character. Teleoperated for now, it is trained with reinforcement learning in an Nvidia-powered Kamino simulation, using 100,000 virtual copies to achieve lifelike movement. It will debut at Disneyland Paris in March and at Hong Kong Disneyland later this year, with plans to enable broader robotic character interactions across lands. True autonomy remains in development, and open-source tools such as Kamino and the Newton Physics Engine are part of the workflow.

The Physics Gap Keeping Humanoid Robots From Everyday Dexterity
technology · 2 months ago

Despite big advances in deep learning, actuation, and multimodal AI, humanoid robots still struggle with basic tasks like stairs and doors because they haven’t mastered physics and force control. Experts say progress will come from tighter integration of hardware (tactile sensing and compliant hands) with learning-based control, rather than from more data or bigger models alone.

Lab-Grown Brain Clump Maps a Path to Adaptive Learning
science · 2 months ago

Researchers at UC Santa Cruz trained a mouse-derived brain organoid to balance a simulated pole using electrical stimulation and reinforcement learning, boosting success from 4.5% to about 46% with adaptive coaching, though the gain faded after rests because the organoid lacks body memory. The study suggests cortical tissue may have intrinsic adaptive computation and could aid neurological disease research, while prompting ethical discussions about using lab-grown brain tissue, especially human-derived organoids.

Mini Brains Demonstrate Goal-Directed Learning
science · 2 months ago

UC Santa Cruz researchers showed that lab-grown brain organoids can process information and, with targeted electrical feedback guided by a reinforcement-learning algorithm, solve the cart-pole balancing task, boosting success from 4.5% to 46%. The work demonstrates goal-directed learning in minimal cortical circuits and marks a milestone in organoid neuroscience.

Adaptive drafting speeds up reasoning LLM training using idle compute
technology · 2 months ago

MIT researchers introduce Taming the Long Tail (TLT), an adaptive speculative-decoding framework that trains a lightweight “drafter” on idle processors to predict the outputs of large reasoning LLMs, with an adaptive rollout engine selecting the best strategy for each batch. This speeds reinforcement-learning–based training by 70–210% while preserving accuracy, and the drafter can also be reused for efficient deployment. The approach aims to reduce training cost and energy for complex AI models and has been tested across multiple models and datasets.
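
The core draft-then-verify loop behind speculative decoding can be sketched in a few lines of Python. This is a generic greedy variant with toy stand-in models, not TLT itself: the adaptive drafter training and rollout engine are not shown, the sequential calls to `target_next` stand in for what would be one batched verification pass, and every name here is a hypothetical chosen for illustration.

```python
def greedy_decode(target_next, prompt, max_new):
    """Baseline: the large model generates one token at a time."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target_next(out))
    return out

def speculative_decode(target_next, draft_next, prompt, max_new, k=4):
    """Greedy speculative decoding.

    The cheap drafter proposes up to k tokens; the target model checks
    them in order. Matching tokens are accepted, and at the first
    mismatch the target's own token is used and the rest is discarded.
    """
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # Draft phase: the small model extends a scratch copy of the context.
        ctx = list(out)
        proposal = []
        for _ in range(min(k, limit - len(out))):
            token = draft_next(ctx)
            proposal.append(token)
            ctx.append(token)
        # Verify phase: keep the target's token at every position.
        for token in proposal:
            expected = target_next(out)
            out.append(expected)
            if expected != token:
                break  # drafter diverged; start a new draft from here
    return out

# Toy models over tokens 0..4: the target counts upward modulo 5,
# and the drafter agrees except right after a 3.
def toy_target(ctx):
    return (ctx[-1] + 1) % 5

def toy_draft(ctx):
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 5

fast = speculative_decode(toy_target, toy_draft, [0], max_new=12)
slow = greedy_decode(toy_target, [0], max_new=12)
```

Because every emitted token is the target model's own choice, this greedy variant produces the same sequence as plain decoding, which is consistent with the summary's point that the speedup preserves accuracy.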

Lab-grown brain organoids show adaptive learning in a cartpole task
science · 3 months ago

Mouse brain organoids grown in a dish were used in a closed-loop system with performance-based electrical feedback to train them to balance a virtual cartpole, achieving 46% proficiency under adaptive coaching. The results demonstrate short-term learning in neural tissue and offer a platform to study plasticity and neurological disease, while noting that the organoids are not conscious and the approach is not a replacement for traditional computing.

science · 7 months ago

Exploring Cutting-Edge Reinforcement Learning Algorithms

Researchers at DeepMind have developed a method for machines to autonomously discover advanced reinforcement learning algorithms that outperform existing manually designed rules. The discovered algorithms achieved superior performance on the Atari benchmark and other challenging tasks, suggesting that future AI development may rely on automatic discovery of RL algorithms.