Does Transformer Interpretability Transfer to RNNs?
Neural Networks Learn Statistics of Increasing Complexity
Download PDF
Llemma: An Open Language Model For Mathematics
Can Transformers Learn to Solve Problems Recursively?
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling