AI_bookmarks

1490 bookmarks
AI hype is built on high test scores. Those tests are flawed.
With hopes and fears about the technology running wild, it's time to agree on what it can and can't do.
When Taylor Webb played around with GPT-3 in early 2022, he was blown away by what OpenAI’s large language model appeared to be able to do. Here was a neural network trained only to predict the next word in a block of text—a jumped-up autocomplete. And yet it gave correct answers to many of the abstract problems that Webb set for it—the kind of thing you’d find in an IQ test. “I was really shocked by its ability to solve these problems,” he says. “It completely upended everything I would have predicted.”

Webb is a psychologist at the University of California, Los Angeles, who studies the different ways people and computers solve abstract problems. He was used to building neural networks that had specific reasoning capabilities bolted on. But GPT-3 seemed to have learned them for free.

Last month Webb and his colleagues published an article in Nature, in which they describe GPT-3’s ability to pass a variety of tests devised to assess the use of analogy to solve problems (known as analogical reasoning). On some of those tests GPT-3 scored better than a group of undergrads. “Analogy is central to human reasoning,” says Webb. “We think of it as being one of the major things that any kind of machine intelligence would need to demonstrate.”

What Webb’s research highlights is only the latest in a long string of remarkable tricks pulled off by large language models. For example, when OpenAI unveiled GPT-3’s successor, GPT-4, in March, the company published an eye-popping list of professional and academic assessments that it claimed its new large language model had aced, including a couple of dozen high school tests and the bar exam. OpenAI later worked with Microsoft to show that GPT-4 could pass parts of the United States Medical Licensing Examination.
And multiple researchers claim to have shown that large language models can pass tests designed to identify certain cognitive abilities in humans, from chain-of-thought reasoning (working through a problem step by step) to theory of mind (guessing what other people are thinking).

Such results are feeding a hype machine that predicts computers will soon come for white-collar jobs, replacing teachers, journalists, lawyers, and more. Geoffrey Hinton has called out GPT-4’s apparent ability to string together thoughts as one reason he is now scared of the technology he helped create.

But there’s a problem: there is little agreement on what those results really mean. Some people are dazzled by what they see as glimmers of human-like intelligence; others aren’t convinced one bit. “There are several critical issues with current evaluation techniques for large language models,” says Natalie Shapira, a computer scientist at Bar-Ilan University in Ramat Gan, Israel. “It creates the illusion that they have greater capabilities than what truly exists.”

That’s why a growing number of researchers—computer scientists, cognitive scientists, neuroscientists, linguists—want to overhaul the way large language models are assessed, calling for more rigorous and exhaustive evaluation. Some think that the practice of scoring machines on human tests is wrongheaded, period, and should be ditched.

“People have been giving human intelligence tests—IQ tests and so on—to machines since the very beginning of AI,” says Melanie Mitchell, an artificial-intelligence researcher at the Santa Fe Institute in New Mexico. “The issue throughout has been what it means when you test a machine like this. It doesn’t mean the same thing that it means for a human.”

“There’s a lot of anthropomorphizing going on,” she says.
“And that’s kind of coloring the way that we think about these systems and how we test them.”

With hopes and fears for this technology at an all-time high, it is crucial that we get a solid grip on what large language models can and cannot do.

Open to interpretation

Most of the problems with testing large language models boil down to the question of how to interpret the results.

Assessments designed for humans, like high school exams and IQ tests, take a lot for granted. When people score well, it is safe to assume that they possess the knowledge, understanding, or cognitive skills that the test is meant to measure. (In practice, that assumption only goes so far. Academic exams do not always reflect students’ true abilities. IQ tests measure a specific set of skills, not overall intelligence. Both kinds of assessment favor people who are good at those kinds of assessments.)

But when a large language model scores well on such tests, it is not clear at all what has been measured. Is it evidence of actual understanding? A mindl
·technologyreview.com·
A Generative AI Primer - National centre for AI
The primer is intended as a short introduction to generative AI, exploring some of the main points and areas relevant to education. It includes two main elements: an introduction to generative AI technology and the implications of generative AI for education.
·nationalcentreforai.jiscinvolve.org·
What is Multimodal Generative Artificial Intelligence?
The term multimodal generative intelligence is getting thrown around a lot recently – even more so now that the most popular models like GPT have added features like image recognition and gen…
Transduction, on the other hand, is changing meaning across modes, such as from text to image. So in image generation, or audio, or video, we are changing the meaning from one mode to another – or, rather, the algorithm is changing the meaning in response to our prompt.
·leonfurze.com·
AWS Selects 15 Ed Tech Startups for Education Accelerator Program -- THE Journal
Amazon Web Services (AWS) has announced the selection of 15 innovative ed tech startup companies to take part in its inaugural Education Accelerator program. The program will support these companies in applying data, analytics, and AI to transform teaching and learning and to tackle educational challenges that affect student success.
·thejournal.com·
Survey: GenAI Is Making Companies More Data Oriented
Although cultural change generally requires human intervention, it appears that new technology — especially a new technology like generative AI that captures human imaginations — can play a role in catalyzing a data-oriented culture. In an annual survey assessing attitudes about data, analytics, and AI, data and technology leaders in large companies reported significant improvement in their organizations’ data culture. Given that the 2023 survey was fielded just before ChatGPT was announced, generative AI seems the likely cause of the leap in positive responses around culture. To take advantage of this, companies need to invest in experimentation, production deployment, and education.
·hbr.org·
Ludia - Poe
LUDIA = UDL + AI. Designed by Beth Stark and Jérémie Rostan (http://bit.ly/LUDIA_FYI). Discover the 4T's Process and learn more about LUDIA. LUDIA, your AI-powered UDL partner, can help you reduce learning barriers and discover ways to support all learners in reaching their full potential as expert learners.

Disclaimer: Nobody is perfect. LUDIA's suggestions are here to enhance, not replace, your own professional reflection and judgment. Get in touch: ludia.chatbot@gmail.com

Please Note: As the creators of LUDIA, we do not have access to the information generated by the use of LUDIA or any identifying information about users or followers of LUDIA. We receive no financial revenue from LUDIA.

Universal Design for Learning. Universally Accessible. That is Our Goal.

Accessibility Statement: In anticipation of user variability, we have followed the Web Content Accessibility Guidelines (WCAG 2.0, Level AA), published by the World Wide Web Consortium (W3C), as closely as possible. Conformance with these guidelines will help to reduce barriers and expand web equity worldwide. If you have any comments or suggestions about improving the accessibility of LUDIA, please contact us, or the Platform for Open Exploration (Poe): https://help.poe.com/hc/en-us/requests/new
·poe.com·
Introducing ChatGPT Team
We’re launching a new ChatGPT plan for teams of all sizes, which provides a secure, collaborative workspace to get the most out of ChatGPT at work.
·openai.com·