Defeating Nondeterminism in LLM Inference
Reproducibility is a bedrock of scientific progress. However, it's remarkably difficult to get reproducible results out of large language models. For example, you might observe that asking ChatGPT the same question multiple times provides different results. This by itself is not surprising, since getting a result from a language model involves "sampling", a process that converts the language model's output into a probability distribution and probabilistically selects a token. What might be more surprising is that even when we turn the temperature down to 0 (meaning the LLM always chooses the highest-probability token, a strategy called greedy sampling, which makes sampling theoretically deterministic), LLM APIs are still not deterministic in practice (see past discussions here, here, or here). Even when running inference on your own hardware with an OSS inference library like vLLM or SGLang, sampling still isn't deterministic (see here or here).
·thinkingmachines.ai·
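A minimal sketch of the sampling step described above, assuming a toy NumPy setup (the function name and the raw `logits` array are illustrative, not from the article): at temperature 0 the softmax-and-draw step collapses to an argmax over the logits.

```python
import numpy as np

def sample_token(logits: np.ndarray, temperature: float,
                 rng: np.random.Generator) -> int:
    """Pick the next token id from raw logits (illustrative toy version)."""
    if temperature == 0.0:
        # Greedy sampling: always take the highest-probability token,
        # which is theoretically deterministic.
        return int(np.argmax(logits))
    # Otherwise: softmax over temperature-scaled logits, then draw a token.
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

rng = np.random.default_rng(0)
logits = np.array([1.0, 3.5, 0.2, 3.4])
print(sample_token(logits, 0.0, rng))  # always token 1
print(sample_token(logits, 0.8, rng))  # varies with the rng state
```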
Defeating Nondeterminism in LLM Inference
A very common question I see about LLMs concerns why they can't be made to deliver the same response to the same prompt by setting a fixed random number seed. …
·simonwillison.net·
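One commonly cited ingredient behind this (the linked posts go deeper, and the article above argues the fuller story involves inference kernels that are not batch-invariant) is that floating-point addition is not associative, so the same mathematical sum can change with the order in which a kernel reduces it. A tiny self-contained illustration:

```python
# Floating-point addition is not associative: the same three numbers summed
# in a different order give different results, so any change in reduction
# order (e.g. from different batching) can perturb a model's logits.
a, b, c = 0.1, 1e16, -1e16
print((a + b) + c)  # 0.0 -- the 0.1 is absorbed when added to 1e16 first
print(a + (b + c))  # 0.1
```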
Yann LeCun "Mathematical Obstacles on the Way to Human-Level AI"
Yann LeCun (Meta) gives the AMS Josiah Willard Gibbs Lecture at the 2025 Joint Mathematics Meetings on "Mathematical Obstacles on the Way to Human-Level AI." The talk was introduced by Bryna Kra (Northwestern University), President of the AMS.
·youtube.com·
The Most Important Algorithm in Machine Learning
Shortform link: https://shortform.com/artem

In this video we talk about backpropagation, the algorithm powering the entire field of machine learning, and try to derive it from first principles.

OUTLINE:
00:00 Introduction
01:28 Historical background
02:50 Curve Fitting problem
06:26 Random vs guided adjustments
09:43 Derivatives
14:34 Gradient Descent
16:23 Higher dimensions
21:36 Chain Rule Intuition
27:01 Computational Graph and Autodiff
36:24 Summary
38:16 Shortform
39:20 Outro

USEFUL RESOURCES:
Andrej Karpathy's playlist: https://youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&si=zBUZW5kufVPLVy9E
Jürgen Schmidhuber's blog on the history of backprop: https://people.idsia.ch/~juergen/who-invented-backpropagation.html

CREDITS: Icons by https://www.freepik.com/
·youtube.com·
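As a companion to the video's curve-fitting thread, here is a minimal sketch (my own toy example, not taken from the video) of gradient descent on a line fit, with the gradients written out by hand via the chain rule, the step the video generalizes into backpropagation over a computational graph:

```python
# Minimal gradient descent on a curve-fitting problem, with gradients derived
# by hand via the chain rule.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 2.0 * x + 0.5 + rng.normal(0, 0.05, size=x.shape)  # noisy line to fit

w, b, lr = 0.0, 0.0, 0.1
for step in range(200):
    y_hat = w * x + b                 # forward pass
    loss = np.mean((y_hat - y) ** 2)  # mean squared error
    # Backward pass via the chain rule:
    #   dL/dw = mean(2 * (y_hat - y) * x),  dL/db = mean(2 * (y_hat - y))
    grad = 2 * (y_hat - y)
    w -= lr * np.mean(grad * x)
    b -= lr * np.mean(grad)

print(f"w={w:.3f}, b={b:.3f}, loss={loss:.5f}")  # recovers roughly w=2.0, b=0.5
```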