Found 2 bookmarks
Custom sorting
(3) Andrew Ng on X: "I’ve noticed that many GenAI application projects put in automated evaluations (evals) of the system’s output probably later — and rely on humans to judge outputs longer — than they should. This is because building evals is viewed as a massive investment (say, creating 100 or" / X
(3) Andrew Ng on X: "I’ve noticed that many GenAI application projects put in automated evaluations (evals) of the system’s output probably later — and rely on humans to judge outputs longer — than they should. This is because building evals is viewed as a massive investment (say, creating 100 or" / X
·x.com·
(3) Andrew Ng on X: "I’ve noticed that many GenAI application projects put in automated evaluations (evals) of the system’s output probably later — and rely on humans to judge outputs longer — than they should. This is because building evals is viewed as a massive investment (say, creating 100 or" / X