Richard S. Sutton - Wikipedia · #Awards #Computer Science #Reinforcement Learning #Biography #ACM · en.wikipedia.org · today at 6:42 PM
Model-Based Transfer Learning for Contextual Reinforcement Learning · #Machine Learning #Transfer Learning #Reinforcement Learning #Paper #PDF #Performance · arxiv.org · Nov 23, 2024
RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning · #Reinforcement Learning #Machine Learning · arxiv.org · Oct 5, 2024
Training Language Models to Self-Correct via Reinforcement Learning · #Large Language Models #Accuracy #Reinforcement Learning #DeepMind #Paper #PDF · arxiv.org · Sep 22, 2024
Random robots are more reliable: New AI algorithm for robots consistently outperforms state-of-the-art systems · #Robotics #Diffusion #Reinforcement Learning · techxplore.com · May 3, 2024
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs · #RLHF #Reinforcement Learning #Large Language Models #Paper #PDF · arxiv.org · Feb 26, 2024
Pearl: A Production-ready Reinforcement Learning Agent · #Reinforcement Learning #Meta #Paper #PDF · arxiv.org · Dec 14, 2023
The third New England RLHF Hackers Hackathon · #EleutherAI #Event #Reinforcement Learning #Feedback · blog.eleuther.ai · Nov 26, 2023
Chip Placement with Deep Reinforcement Learning · #Reinforcement Learning #Design #Automation #Google #Paper #PDF · arxiv.org · May 30, 2023
Using reinforcement learning for dynamic planning in open-ended conversations · #Reinforcement Learning #Conversational AI #Google #Research · ai.googleblog.com · May 17, 2023
An Overview of Environmental Features that Impact Deep Reinforcement Learning in Sparse-Reward Domains | Journal of Artificial Intelligence Research · #Performance #Deep Learning #Reinforcement Learning #Paper #PDF · jair.org · Apr 26, 2023
Pre-training generalist agents using offline reinforcement learning · #Reinforcement Learning #Google #Paper #PDF · ai.googleblog.com · Feb 23, 2023
PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations · #Reinforcement Learning #Robotics #Computer Vision #Blog #Google · ai.googleblog.com · Oct 21, 2022
Quantization for Fast and Environmentally Sustainable Reinforcement Learning · #Reinforcement Learning #Google · ai.googleblog.com · Sep 28, 2022
Combining AI and computational science for better, faster, energy efficient predictions · #Reinforcement Learning #Simulation #Computer Science · techxplore.com · Apr 13, 2022
Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents · #Reinforcement Learning #Social Science #AGI #Normativity #Alignment · youtube.com · Mar 9, 2022
Microsoft AI Research Introduces A New Reinforcement Learning Based Method, Called 'Dead-end Discovery' (DeD), To Identify the High-Risk States And Treatments In Healthcare Using Machine Learning · #Reinforcement Learning #Health #Microsoft #Research · marktechpost.com · Feb 12, 2022