2025

10 years ago: the reinforcement learning (RL) prompt engineer [1] (Sec. 5.3). Adaptive chain of thought: an RL neural net learns to query its "world model" net for abstract reasoning & decision making. Going beyond the 1990 neural world model [2] for millisecond-by-millisecond
·x.com·