Tag

Reinforcement Learning

All articles tagged with #reinforcement learning

technology • 23 days ago • 34 min saved

OpenAI Halts Goblin Talk After ChatGPT’s Sudden Creature Fixation

The Wall Street Journal reports that OpenAI instructed ChatGPT to stop mentioning goblins and similar creatures unless strictly relevant, after the model repeatedly invoked goblin language in conversations. The surge was linked to a “nerdy” personality prompt that rewarded creature-based metaphors during training, helping goblin references spread across responses. OpenAI later issued a command to suppress these creature references, highlighting how reward signals and prompt design can steer model behavior, including harmless quirks, during updates like GPT-5.x. The article notes that goblin references rose after GPT-5.1 and GPT-5.4; OpenAI says users shouldn’t fear the underlying technology and that such quirks can be managed with explicit instructions.

via Slashdot

#ai-safety #chatgpt #goblin

technology • 26 days ago • 3 min saved

OpenAI Traces Goblin Quirk to Reward Signals, Ditches the Nerdy ChatGPT Setting

OpenAI explains that a Nerdy personality prompt inadvertently rewarded goblin/creature mentions in ChatGPT outputs, fueling the so-called goblin moment across GPT-5.x. After internal analysis, the company retired the Nerdy setting, removed the reward signal and filtered training data to curb the behavior. GPT-5.5 inherited the quirk due to timing in training, and OpenAI added a developer prompt to further limit goblin mentions, illustrating how reward signals can shape model behavior in unexpected ways.

via Gizmodo

#ai #chatgpt #openai

technology • 26 days ago • 2 min saved

OpenAI ties ChatGPT's goblin chatter to rewarded nerdy persona

OpenAI explained that ChatGPT’s quirky goblin references happened because the model was heavily rewarded for adopting a “nerdy” personality during training. After noticing the effect, OpenAI retired that personality and added an override to suppress goblin mentions, illustrating how reward signals can shape AI behavior in unexpected ways.

via NBC News

#chatgpt #goblins #nerdy-personality

technology • 29 days ago • 2 min saved

Ex-DeepMind Scientist Secures Record $1.1B Seed for Ineffable AI

Former DeepMind researcher David Silver raised a record $1.1 billion seed for Ineffable Intelligence, valuing the European startup at $5.1 billion. Backed by Sequoia, Lightspeed, Nvidia, Google and others, the company will focus on reinforcement learning with the aim of pursuing superintelligence, highlighting a wave of ex-Big Tech talent launching AI labs.

via CNBC

#ai #funding #reinforcement-learning

tech • 1 month ago • 5 min saved

Ace the AI Ping-Pong Robot Shows Speed, Not Invincibility

Sony’s Ace, an eight‑joint ping‑pong robot, proved competitive against elite players after being trained with reinforcement learning in simulation and then transferred to a real arm with about 10 ms latency. Human players could still exploit weaknesses (for example, with a knuckle serve), showing it isn’t unbeatable. The study marks a milestone in AI-driven robotics but also raises concerns about real‑world applications, including potential battlefield uses.

via Mashable

#artificial-intelligence #ping-pong #reinforcement-learning

psychology • 2 months ago • 5 min saved

Repeating Past Actions Biases Future Choices More Than Logic

A Dresden University of Technology study analyzing over 700 participants across nine new tasks and six existing datasets finds that repeating past actions biases current decisions more strongly than explicit value reasoning. A hierarchical Bayesian reinforcement-learning model incorporating reward learning and action repetition outperformed alternatives, suggesting that some so-called irrational preferences arise from habit-like carryover rather than complex calculations, with implications for everyday habits and how environments shape choices.

via SciTechDaily

#context-dependent #decision-making #habits

cognitive-science • 2 months ago • 29 min saved

Timing Takes Center Stage: A New Rule for Pavlovian Learning

A Nature Neuroscience study in mice shows that learning rate scales with the time between rewards, not the number of cue–reward pairings, meaning total learning in a fixed period depends on timing. Dopamine signals tracked this time-based rule across appetitive and aversive conditioning, challenging traditional trial-based models and suggesting broader implications for biology and AI.

via PsyPost

#cognitive-science #dopamine #neuroscience

technology • 2 months ago • 6 min saved

Disney’s Olaf robot trains in 100k simulations to star in future park shows

Disney’s Olaf animatronic is being developed as a next‑gen park character. Teleoperated for now, it is trained with reinforcement learning in an Nvidia‑powered Kamino simulation, using 100,000 virtual copies to achieve lifelike movement. It will debut at Disneyland Paris in March and at Hong Kong Disneyland later this year, with plans to enable broader robotic character interactions across lands. True autonomy remains in development, and open‑source tools such as Kamino and the Newton Physics Engine are part of the workflow.

via The Verge

#disney-imagineering #kamino #olaf

technology • 2 months ago • 15 min saved

The Physics Gap Keeping Humanoid Robots From Everyday Dexterity

Despite big advances in deep learning, actuation, and multimodal AI, humanoid robots still struggle with basic tasks like stairs and doors because they haven’t mastered physics and force control. Experts say progress will come from tighter integration of hardware (tactile sensing and compliant hands) with learning-based control, rather than from more data or bigger models alone.

via Quanta Magazine

#force-control #humanoid-robotics #multimodal-ai

science • 2 months ago • 37 min saved

Lab-Grown Brain Clump Maps a Path to Adaptive Learning

Researchers at UC Santa Cruz trained a mouse-derived brain organoid to balance a simulated pole using electrical stimulation and reinforcement learning, boosting success from 4.5% to about 46% with adaptive coaching, though the gain faded after rests because the organoid lacks body memory. The study suggests cortical tissue may have intrinsic adaptive computation and could aid neurological disease research, while prompting ethical discussions about using lab-grown brain tissue, especially human-derived organoids.

via Yahoo

#adaptive-computation #brain-organoid #cart-pole-problem

science • 2 months ago • 3 min saved

Mini Brains Demonstrate Goal-Directed Learning

UC Santa Cruz researchers showed that lab-grown brain organoids can process information and, with targeted electrical feedback guided by a reinforcement-learning algorithm, solve the cart-pole balancing task, boosting success from 4.5% to 46%. The work demonstrates goal-directed learning in minimal cortical circuits and marks a milestone in organoid neuroscience.

via Futurism

#brain-organoids #lab-grown-brains #neuroscience

technology • 2 months ago • 5 min saved

Adaptive drafting speeds up reasoning LLM training using idle compute

MIT researchers introduce Taming the Long Tail (TLT), an adaptive speculative-decoding framework that trains a lightweight “drafter” on idle processors to predict the outputs of large reasoning LLMs, with an adaptive rollout engine selecting the best strategy for each batch. This speeds reinforcement-learning–based training by 70–210% while preserving accuracy, and the drafter can also be reused for efficient deployment. The approach aims to reduce training cost and energy for complex AI models and has been tested across multiple models and datasets.

via MIT News

#adaptive-training #ai #llm

science • 3 months ago • 6 min saved

Lab-grown brain organoids show adaptive learning in a cartpole task

Mouse brain organoids grown in a dish were used in a closed-loop system with performance-based electrical feedback to train them to balance a virtual cartpole, achieving 46% proficiency under adaptive coaching. The results demonstrate short-term learning in neural tissue and offer a platform to study plasticity and neurological disease, while noting that the organoids are not conscious and the approach is not a replacement for traditional computing.

via ScienceAlert

#adaptive-feedback #brain-organoids #cartpole

technology • 7 months ago • 3 min saved

AI Self-Learning Surpasses Human-Designed Algorithms

Researchers developed an AI system that invented its own learning method, DiscoRL, which outperformed human-designed algorithms on complex tasks like Atari games. The result points to future potential for the automated discovery of advanced reinforcement learning algorithms.

via Tech Xplore

#artificial-intelligence #discorl #machine-learning

science • 7 months ago • 2 min saved

Exploring Cutting-Edge Reinforcement Learning Algorithms

Researchers at DeepMind have developed a method for machines to autonomously discover advanced reinforcement learning algorithms that outperform existing manually designed rules. The method showed superior performance on the Atari benchmark and other challenging tasks, suggesting that future AI development may rely on the automatic discovery of RL algorithms.

via Nature

#artificial-intelligence #atari-benchmark #meta-learning