Search Test Information Space

Found 2 bookmarks

The Homework Machine Episodes

#Large Language Models #Education #Academics #Review #Criticism #Evaluation #Generative AI

·teachlabpodcast.com·Aug 25, 2025

Andrew Ng on X: "I’ve noticed that many GenAI application projects put in automated evaluations (evals) of the system’s output probably later — and rely on humans to judge outputs longer — than they should. This is because building evals is viewed as a massive investment (say, creating 100 or" / X

#Generative AI #Evaluation #Automation #Prototype #Tips

·x.com·Apr 17, 2025
