Found 20 bookmarks

UniVG-R1: Reasoning Guided Universal Visual Grounding with...

#Reasoning #Reinforcement Learning #Large Language Models #Multimodal #Paper #PDF

·arxiv.org·May 22, 2025

Introducing OpenAI o3 and o4-mini | OpenAI

"For the first time, these models can integrate images directly into their chain of thought. They don’t just see an image—they think with it. This unlocks a new class of problem-solving that blends visual and textual reasoning, reflected in their state-of-the-art performance across multimodal benchmarks."

#OpenAI #Reasoning #Agents #Reinforcement Learning #Blog

·openai.com·Apr 17, 2025

Bertsekas, D. P. (2019). Rollout, Policy Iteration, and Distributed Reinforcement Learning

(Any relation to economics or tariffs? Recently it looked as though a Gemini chat question came out of nowhere, possibly the result of agents on Android, though it was about marketing and seemingly inverted demographics, like a sales projection. Maybe the future of ads will somehow be linked to the reading list or Discover feed if conventional channels evaporate. Or this may have resulted from a partially typed prompt that was auto-completed.)

#Book #Reinforcement Learning #History #PDF

·web.mit.edu·Apr 12, 2025

Richard S. Sutton - Wikipedia

#Awards #Computer Science #Reinforcement Learning #Biography #ACM

·en.wikipedia.org·Mar 10, 2025

Model-Based Transfer Learning for Contextual Reinforcement Learning

#Machine Learning #Transfer Learning #Reinforcement Learning #Paper #PDF #Performance

·arxiv.org·Nov 23, 2024

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

#Reinforcement Learning #Machine Learning

·arxiv.org·Oct 5, 2024

Training Language Models to Self-Correct via Reinforcement Learning

#Large Language Models #Accuracy #Reinforcement Learning #DeepMind #Paper #PDF

·arxiv.org·Sep 22, 2024

Random robots are more reliable: New AI algorithm for robots consistently outperforms state-of-the-art systems

#Robotics #Diffusion #Reinforcement Learning

·techxplore.com·May 3, 2024

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

#RLHF #Reinforcement Learning #Large Language Models #Paper #PDF

·arxiv.org·Feb 26, 2024

Pearl: A Production-ready Reinforcement Learning Agent

#Reinforcement Learning #Meta #Paper #PDF

·arxiv.org·Dec 14, 2023

The third New England RLHF Hackers Hackathon

#EleutherAI #Event #Reinforcement Learning #Feedback

·blog.eleuther.ai·Nov 26, 2023

Chip Placement with Deep Reinforcement Learning

#Reinforcement Learning #Design #Automation #Google #Paper #PDF

·arxiv.org·May 30, 2023

Using reinforcement learning for dynamic planning in open-ended conversations

#Reinforcement Learning #Conversational AI #Google #Research

·ai.googleblog.com·May 17, 2023

An Overview of Environmental Features that Impact Deep Reinforcement Learning in Sparse-Reward Domains | Journal of Artificial Intelligence Research

#Performance #Deep Learning #Reinforcement Learning #Paper #PDF

·jair.org·Apr 26, 2023

Pre-training generalist agents using offline reinforcement learning

#Reinforcement Learning #Google #Paper #PDF

·ai.googleblog.com·Feb 23, 2023

PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations

#Reinforcement Learning #Robotics #Computer Vision #Blog #Google

·ai.googleblog.com·Oct 21, 2022

Quantization for Fast and Environmentally Sustainable Reinforcement Learning

#Reinforcement Learning #Google

·ai.googleblog.com·Sep 28, 2022

Combining AI and computational science for better, faster, energy efficient predictions

#Reinforcement Learning #Simulation #Computer Science

·techxplore.com·Apr 13, 2022

Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents

#Reinforcement Learning #Social Science #AGI #Normativity #Alignment

·youtube.com·Mar 9, 2022

Microsoft AI Research Introduces A New Reinforcement Learning Based Method, Called 'Dead-end Discovery' (DeD), To Identify the High-Risk States And Treatments In Healthcare Using Machine Learning

#Reinforcement Learning #Health #Microsoft #Research

·marktechpost.com·Feb 12, 2022
