Search Test Information Space

Found 5 bookmarks

Forecasting rare language model behaviors \ Anthropic

·anthropic.com·Feb 25, 2025

Reframing Superintelligence (FHI-TR-2019-1)

Drexler, K. E. (2019). Reframing superintelligence. Future of Humanity Institute.

·fhi.ox.ac.uk·Dec 15, 2023

Weak-to-strong generalization

·cdn.openai.com·Dec 15, 2023

LIMA: Less Is More for Alignment

·arxiv.org·May 23, 2023

Researching Alignment Research: Unsupervised Analysis

·arxiv.org·Apr 21, 2023