
Reinforcement Learning

All articles tagged with #reinforcement learning

Repeating Past Actions Biases Future Choices More Than Logic
psychology · 18 days ago

A Dresden University of Technology study analyzing over 700 participants across nine new tasks and six existing datasets finds that repeating past actions biases current decisions more strongly than explicit value reasoning. A hierarchical Bayesian reinforcement-learning model that combines reward learning with action repetition outperformed alternatives, suggesting that some so-called irrational preferences arise from habit-like carryover rather than complex calculation. The findings have implications for everyday habits and for how environments shape choices.
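The habit-plus-value mechanism described here is often modeled by adding a "choice kernel" that tracks recent actions alongside standard reward learning. A minimal sketch of that idea, not the study's hierarchical Bayesian model; all parameter values are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def simulate(n_trials=200, alpha_q=0.3, alpha_c=0.5,
             beta_q=3.0, beta_c=3.0, p_reward=(0.7, 0.3), seed=0):
    """Two-armed bandit agent whose choices mix learned value (Q)
    with a habit-like choice kernel (C) that tracks past actions."""
    rng = np.random.default_rng(seed)
    Q = np.zeros(2)   # reward-based action values
    C = np.zeros(2)   # choice kernel: recency-weighted action history
    choices = []
    for _ in range(n_trials):
        p = softmax(beta_q * Q + beta_c * C)  # value and habit both weigh in
        a = rng.choice(2, p=p)
        r = float(rng.random() < p_reward[a])
        Q[a] += alpha_q * (r - Q[a])           # reward learning
        C += alpha_c * (np.eye(2)[a] - C)      # action-repetition carryover
        choices.append(a)
    return np.array(choices), Q, C
```

Setting `beta_c` to zero recovers a pure value-based chooser; a large `beta_c` makes the agent repeat whatever it did recently, regardless of reward.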

Timing Takes Center Stage: A New Rule for Pavlovian Learning
cognitive-science · 21 days ago

A Nature Neuroscience study in mice shows that learning rate scales with the time between rewards, not the number of cue–reward pairings, meaning total learning in a fixed period depends on timing. Dopamine signals tracked this time-based rule across appetitive and aversive conditioning, challenging traditional trial-based models and suggesting broader implications for biology and AI.
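The time-based rule can be illustrated with a toy delta-rule learner whose per-pairing learning rate scales with the inter-reward interval. The linear scaling and the constant `k` are assumptions for illustration, not the paper's fitted rule:

```python
def total_learning(session_seconds, n_pairings, k=0.001):
    """Delta-rule learning where the per-pairing learning rate is
    proportional to the inter-reward interval (IRI). Under this rule,
    total learning in a fixed session depends on elapsed time, not on
    how many cue-reward pairings are packed into it."""
    iri = session_seconds / n_pairings
    alpha = k * iri                 # time-scaled learning rate
    V = 0.0                         # associative strength
    for _ in range(n_pairings):
        V += alpha * (1.0 - V)      # update toward asymptote 1
    return V
```

With this rule, ten widely spaced pairings and sixty tightly spaced pairings in the same ten-minute session end up at nearly the same associative strength, whereas a fixed learning rate would make sixty pairings learn far more.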

Disney’s Olaf robot trains in 100k simulations to star in future park shows
technology · 25 days ago

Disney’s Olaf animatronic is being developed as a next‑gen park character: teleoperated for now, it is trained in an Nvidia‑powered Kamino simulation using reinforcement learning and 100,000 virtual copies to achieve lifelike movement. It will debut at Disneyland Paris in March and at Hong Kong Disneyland later this year, with plans to enable broader robotic character interactions across park lands. True autonomy remains in development, and open‑source tools like Kamino and the Newton Physics Engine are part of the workflow.

The Physics Gap Keeping Humanoid Robots From Everyday Dexterity
technology · 27 days ago

Despite big advances in deep learning, actuation, and multimodal AI, humanoid robots still struggle with basic tasks like stairs and doors because they haven’t mastered physics and force control. Experts say progress will come from tighter integration of hardware (tactile sensing and compliant hands) with learning-based control, rather than from more data or bigger models alone.

Lab-Grown Brain Clump Maps a Path to Adaptive Learning
science · 1 month ago

Researchers at UC Santa Cruz trained a mouse-derived brain organoid to balance a simulated pole using electrical stimulation and reinforcement learning, boosting success from 4.5% to about 46% with adaptive coaching, though the gain faded after rests because the organoid lacks body memory. The study suggests cortical tissue may have intrinsic adaptive computation and could aid neurological disease research, while prompting ethical discussions about using lab-grown brain tissue, especially human-derived organoids.

Mini Brains Demonstrate Goal-Directed Learning
science · 1 month ago

UC Santa Cruz researchers showed that lab-grown brain organoids can process information and, with targeted electrical feedback guided by a reinforcement-learning algorithm, solve the cart-pole balancing task, boosting success from 4.5% to 46%. The work demonstrates goal-directed learning in minimal cortical circuits and marks a milestone in organoid neuroscience.

Adaptive drafting speeds up reasoning LLM training using idle compute
technology · 1 month ago

MIT researchers introduce Taming the Long Tail (TLT), an adaptive speculative-decoding framework that trains a lightweight “drafter” on idle processors to predict the outputs of large reasoning LLMs, with an adaptive rollout engine selecting the best strategy for each batch. This speeds reinforcement-learning–based training by 70–210% while preserving accuracy, and the drafter can also be reused for efficient deployment. The approach aims to reduce training cost and energy for complex AI models and has been tested across multiple models and datasets.
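The drafter-plus-verifier idea behind speculative decoding can be sketched with toy deterministic "models." The functions below are stand-ins, not TLT's actual drafter or adaptive rollout engine; the key property is that the output matches what the large model alone would have produced:

```python
def target_next(ctx):
    # stand-in for the large reasoning model (deterministic toy rule)
    return (sum(ctx) * 7 + 3) % 11

def draft_next(ctx):
    # stand-in for the lightweight drafter: agrees with the target
    # except when the context sum is divisible by 4
    t = (sum(ctx) * 7 + 3) % 11
    return (t + 1) % 11 if sum(ctx) % 4 == 0 else t

def speculative_decode(prompt, n_tokens, k=4):
    """Greedy speculative decoding: the drafter proposes k tokens,
    the target checks them in one verification pass, and the matched
    prefix is accepted plus the target's token at the first mismatch.
    The output is identical to decoding with the target alone."""
    ctx = list(prompt)
    verify_passes = 0
    while len(ctx) - len(prompt) < n_tokens:
        # drafter proposes k tokens cheaply
        d_ctx = list(ctx)
        proposals = []
        for _ in range(k):
            t = draft_next(d_ctx)
            proposals.append(t)
            d_ctx.append(t)
        # target verifies all k positions in one pass
        verify_passes += 1
        for t in proposals:
            correct = target_next(ctx)
            if t == correct:
                ctx.append(t)        # draft token accepted
            else:
                ctx.append(correct)  # target's token replaces the miss
                break
    return ctx[len(prompt):][:n_tokens], verify_passes
```

Because each verification pass accepts at least one token, the number of expensive target calls is at most the number of tokens generated, and drops well below it whenever the drafter's guesses land.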

Lab-grown brain organoids show adaptive learning in a cartpole task
science · 1 month ago

Mouse brain organoids grown in a dish were used in a closed-loop system with performance-based electrical feedback to train them to balance a virtual cartpole, achieving 46% proficiency under adaptive coaching. The results demonstrate short-term learning in neural tissue and offer a platform to study plasticity and neurological disease, while noting that the organoids are not conscious and the approach is not a replacement for traditional computing.
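For context, the cart-pole task itself is simple to reproduce in software. Below is a toy version with a linear policy tuned by random search; the organoid work instead closed the loop with electrical stimulation, and the constants here follow the common textbook formulation of the dynamics:

```python
import math
import random

def step(state, action, dt=0.02):
    """One Euler step of the classic cart-pole dynamics."""
    x, x_dot, theta, theta_dot = state
    force = 10.0 if action == 1 else -10.0
    g, m_cart, m_pole, length = 9.8, 1.0, 0.1, 0.5
    total = m_cart + m_pole
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + m_pole * length * theta_dot**2 * sin_t) / total
    theta_acc = (g * sin_t - cos_t * temp) / (
        length * (4.0 / 3.0 - m_pole * cos_t**2 / total))
    x_acc = temp - m_pole * length * theta_acc * cos_t / total
    return (x + dt * x_dot, x_dot + dt * x_acc,
            theta + dt * theta_dot, theta_dot + dt * theta_acc)

def episode(w, max_steps=500):
    """Run one episode with a linear policy; score = steps balanced."""
    state = (0.0, 0.0, 0.02, 0.0)  # slight initial tilt
    for t in range(max_steps):
        action = 1 if sum(wi * si for wi, si in zip(w, state)) > 0 else 0
        state = step(state, action)
        x, _, theta, _ = state
        if abs(x) > 2.4 or abs(theta) > 0.21:  # pole fell or cart ran off
            return t
    return max_steps

def random_search(n_tries=200, seed=0):
    """Crude trainer: sample random linear policies, keep the best."""
    rng = random.Random(seed)
    best_w, best_score = None, -1
    for _ in range(n_tries):
        w = [rng.uniform(-1, 1) for _ in range(4)]
        score = episode(w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score
```

Random search is a deliberately minimal baseline; the point is only to show the closed loop of act, observe, score that the organoid experiments implemented with stimulation feedback.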

Exploring Cutting-Edge Reinforcement Learning Algorithms
science · 5 months ago

Researchers at DeepMind have developed a method for machines to autonomously discover advanced reinforcement learning algorithms that outperform existing manually designed rules. Demonstrated through superior performance on the Atari benchmark and other challenging tasks, the work suggests that future AI development may rely on automatic discovery of RL algorithms.

DeepSeek AI Model in China Cost $294,000 to Train, Developer Reveals
technology · 6 months ago

DeepSeek's reported $294,000 training cost is misleading; the actual cost to train their base model was around $5.87 million, with the lower figure referring only to a specific reinforcement learning phase, not the entire training process. The article clarifies misconceptions about the expenses involved in developing large AI models and compares DeepSeek's efforts to Western counterparts like Meta's Llama 4.

Apple research reveals LLMs gain from classic productivity techniques
technology · 7 months ago

A study by Apple researchers demonstrates that large language models (LLMs) can significantly improve their performance and alignment by using a simple checklist-based reinforcement learning method called RLCF, which scores responses based on checklist items. This approach enhances complex instruction following and could be crucial for future AI-powered assistants, although it has limitations in safety alignment and applicability to other use cases.
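The checklist-scoring idea can be sketched as a reward function. RLCF scores checklist items with a judge model, so the simple predicate checks below are only a hypothetical stand-in for that scoring step:

```python
def checklist_reward(response, checklist):
    """Score a response by the fraction of checklist items it passes.
    Each item is a predicate over the response text; the paper uses
    model-based judgments rather than these toy checks."""
    passed = sum(1 for item in checklist if item(response))
    return passed / len(checklist)

def pick_best(candidates, checklist):
    """Rank candidate responses by checklist score, a stand-in for
    using the score as the reinforcement-learning reward signal."""
    return max(candidates, key=lambda r: checklist_reward(r, checklist))
```

For example, a hypothetical checklist for a summarization prompt might require that the response mentions it is a summary, stays under a word limit, and ends with a period; `checklist_reward` then returns the fraction of those requirements met.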