Saved by Medicaid: New Evidence on Health Insurance and Mortality from the Universe of Low-Income Adults

We examine the causal effect of health insurance on mortality using the universe of low-income adults, a dataset of 37 million individuals identified by linking the 2010 Census to administrative tax data. Our methodology leverages state-level variation in the timing and adoption of Medicaid expansions under the Affordable Care Act (ACA) and earlier waivers and adheres to a preregistered analysis plan, a rarely used approach in observational studies in economics. We find that expansions increased Medicaid enrollment by 12 percentage points and reduced the mortality of the low-income adult population by 2.5 percent, suggesting a 21 percent reduction in the mortality hazard of new enrollees. Mortality reductions accrued not only to older age cohorts, but also to younger adults, who accounted for nearly half of life-years saved due to their longer remaining lifespans and large share of the low-income adult population. These expansions appear to be cost-effective, with direct budgetary costs of $5.4 million per life saved and $179,000 per life-year saved falling well below valuations commonly found in the literature. Our findings suggest that lack of health insurance explains about five to twenty percent of the mortality disparity between high- and low-income Americans. We contribute to a growing body of evidence that health insurance improves health and demonstrate that Medicaid’s life-saving effects extend across a broader swath of the low-income population than previously understood.
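The headline figures in this abstract hang together arithmetically. The sketch below is one reading of them, not the paper's estimator: it assumes the whole population-level mortality effect is concentrated among the newly enrolled, so the per-enrollee effect is the population effect divided by the enrollment gain. The function names are hypothetical and exist only for illustration.

```python
def implied_enrollee_effect(population_effect: float, enrollment_gain: float) -> float:
    """Scale a population-wide effect up to the people who actually gained coverage.

    Assumption (not from the paper): everyone outside the newly enrolled group is
    unaffected, so the population effect equals the enrollee effect times the
    enrollment gain.
    """
    return population_effect / enrollment_gain


def implied_life_years_per_life(cost_per_life: float, cost_per_life_year: float) -> float:
    """Both cost figures divide the same budget, so their ratio is life-years per life saved."""
    return cost_per_life / cost_per_life_year


# Figures quoted in the abstract.
effect = implied_enrollee_effect(population_effect=0.025, enrollment_gain=0.12)
life_years = implied_life_years_per_life(cost_per_life=5.4e6, cost_per_life_year=179_000)

print(f"Implied mortality reduction among new enrollees: {effect:.1%}")
print(f"Implied life-years gained per life saved: {life_years:.1f}")
```

Under that assumption the ratio lands close to the 21 percent the authors report, and the two cost figures imply roughly 30 life-years per life saved.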

·nber.org·
When ELIZA meets therapists: A Turing test for the heart and mind
“Can machines be therapists?” is a question receiving increased attention given the relative ease of working with generative artificial intelligence. Although recent (and decades-old) research has found that humans struggle to tell the difference between responses from machines and humans, recent findings suggest that artificial intelligence can write empathically and the generated content is rated highly by therapists and outperforms professionals. It is uncertain whether, in a preregistered competition where therapists and ChatGPT respond to therapeutic vignettes about couple therapy, a) a panel of participants can tell which responses are ChatGPT-generated and which are written by therapists (N = 13), b) the generated responses or the therapist-written responses fall more in line with key therapy principles, and c) linguistic differences between conditions are present. In a large sample (N = 830), we showed that a) participants could rarely tell the difference between responses written by ChatGPT and responses written by a therapist, b) the responses written by ChatGPT were generally rated higher in key psychotherapy principles, and c) the language patterns between ChatGPT and therapists were different. Using different measures, we then confirmed that responses written by ChatGPT were rated higher than the therapist’s responses suggesting these differences may be explained by part-of-speech and response sentiment. This may be an early indication that ChatGPT has the potential to improve psychotherapeutic processes. We anticipate that this work may lead to the development of different methods of testing and creating psychotherapeutic interventions. Further, we discuss limitations (including the lack of the therapeutic context), and how continued research in this area may lead to improved efficacy of psychotherapeutic interventions allowing such interventions to be placed in the hands of individuals who need them the most.
·journals.plos.org·
Differences in misinformation sharing can lead to politically asymmetric sanctions - Nature
In response to intense pressure, technology companies have enacted policies to combat misinformation [1–4]. The enforcement of these policies has, however, led to technology companies being regularly accused of political bias [5–7]. We argue that differential sharing of misinformation by people identifying with different political groups [8–15] could lead to political asymmetries in enforcement, even by unbiased policies. We first analysed 9,000 politically active Twitter users during the US 2020 presidential election. Although users estimated to be pro-Trump/conservative were indeed substantially more likely to be suspended than those estimated to be pro-Biden/liberal, users who were pro-Trump/conservative also shared far more links to various sets of low-quality news sites—even when news quality was determined by politically balanced groups of laypeople, or groups of only Republican laypeople—and had higher estimated likelihoods of being bots. We find similar associations between stated or inferred conservatism and low-quality news sharing (on the basis of both expert and politically balanced layperson ratings) in 7 other datasets of sharing from Twitter, Facebook and survey experiments, spanning 2016 to 2023 and including data from 16 different countries. Thus, even under politically neutral anti-misinformation policies, political asymmetries in enforcement should be expected. Political imbalance in enforcement need not imply bias on the part of social media companies implementing anti-misinformation policies.
·nature.com·
You and Your Research, a talk by Richard Hamming
I will talk mainly about science because that is what I have studied. But so far as I know, and I've been told by others, much of what I say applies to many fields. Outstanding work is characterized very much the same way in most fields, but I will confine myself to science.
I spoke earlier about planting acorns so that oaks will grow. You can't always know exactly where to be, but you can keep active in places where something might happen. And even if you believe that great science is a matter of luck, you can stand on a mountain top where lightning strikes; you don't have to hide in the valley where you're safe.
Most great scientists know many important problems. They have something between 10 and 20 important problems for which they are looking for an attack. And when they see a new idea come up, one hears them say “Well that bears on this problem.” They drop all the other things and get after it.
The great scientists, when an opportunity opens up, get after it and they pursue it. They drop all other things. They get rid of other things and they get after an idea because they had already thought the thing through. Their minds are prepared; they see the opportunity and they go after it. Now of course lots of times it doesn't work out, but you don't have to hit many of them to do some great science. It's kind of easy. One of the chief tricks is to live a long time!
He who works with the door open gets all kinds of interruptions, but he also occasionally gets clues as to what the world is and what might be important. Now I cannot prove the cause and effect sequence because you might say, “The closed door is symbolic of a closed mind.” I don't know. But I can say there is a pretty good correlation between those who work with the doors open and those who ultimately do important things, although people who work with doors closed often work harder.
You should do your job in such a fashion that others can build on top of it, so they will indeed say, “Yes, I've stood on so and so's shoulders and I saw further.” The essence of science is cumulative. By changing a problem slightly you can often do great work rather than merely good work. Instead of attacking isolated problems, I made the resolution that I would never again solve an isolated problem except as characteristic of a class.
by altering the problem, by looking at the thing differently, you can make a great deal of difference in your final productivity because you can either do it in such a fashion that people can indeed build on what you've done, or you can do it in such a fashion that the next person has to essentially duplicate again what you've done. It isn't just a matter of the job, it's the way you write the report, the way you write the paper, the whole attitude. It's just as easy to do a broad, general job as one very special case. And it's much more satisfying and rewarding!
it is not sufficient to do a job, you have to sell it. ‘Selling’ to a scientist is an awkward thing to do. It's very ugly; you shouldn't have to do it. The world is supposed to be waiting, and when you do something great, they should rush out and welcome it. But the fact is everyone is busy with their own work. You must present it so well that they will set aside what they are doing, look at what you've done, read it, and come back and say, “Yes, that was good.” I suggest that when you open a journal, as you turn the pages, you ask why you read some articles and not others. You had better write your report so when it is published in the Physical Review, or wherever else you want it, as the readers are turning the pages they won't just turn your pages but they will stop and read yours. If they don't stop and read it, you won't get credit.
I think it is very definitely worth the struggle to try and do first-class work because the truth is, the value is in the struggle more than it is in the result. The struggle to make something of yourself seems to be worthwhile in itself. The success and fame are sort of dividends, in my opinion.
He had his personality defect of wanting total control and was not willing to recognize that you need the support of the system. You find this happening again and again; good scientists will fight the system rather than learn to work with the system and take advantage of all the system has to offer. It has a lot, if you learn how to use it. It takes patience, but you can learn how to use the system pretty well, and you can learn how to get around it. After all, if you want a decision ‘No’, you just go to your boss and get a ‘No’ easy. If you want to do something, don't ask, do it. Present him with an accomplished fact. Don't give him a chance to tell you ‘No’. But if you want a ‘No’, it's easy to get a ‘No’.
Amusement, yes, anger, no. Anger is misdirected. You should follow and cooperate rather than struggle against the system all the time.
I found out many times, like a cornered rat in a real trap, I was surprisingly capable. I have found that it paid to say, “Oh yes, I'll get the answer for you Tuesday,” not having any idea how to do it. By Sunday night I was really hard thinking on how I was going to deliver by Tuesday. I often put my pride on the line and sometimes I failed, but as I said, like a cornered rat I'm surprised how often I did a good job. I think you need to learn to use yourself. I think you need to know how to convert a situation from one view to another which would increase the chance of success.
I do go in to strictly talk to somebody and say, “Look, I think there has to be something here. Here's what I think I see ...” and then begin talking back and forth. But you want to pick capable people. To use another analogy, you know the idea called the ‘critical mass.’ If you have enough stuff you have critical mass. There is also the idea I used to call ‘sound absorbers’. When you get too many sound absorbers, you give out an idea and they merely say, “Yes, yes, yes.” What you want to do is get that critical mass in action; “Yes, that reminds me of so and so,” or, “Have you thought about that or this?” When you talk to other people, you want to get rid of those sound absorbers who are nice people but merely say, “Oh yes,” and to find those who will stimulate you right back.
On surrounding yourself with people who provoke meaningful progress
I believed, in my early days, that you should spend at least as much time in the polish and presentation as you did in the original research. Now at least 50% of the time must go for the presentation. It's a big, big number.
Luck favors a prepared mind; luck favors a prepared person. It is not guaranteed; I don't guarantee success as being absolutely certain. I'd say luck changes the odds, but there is some definite control on the part of the individual.
If you read all the time what other people have done you will think the way they thought. If you want to think new thoughts that are different, then do what a lot of creative people do - get the problem reasonably clear and then refuse to look at any answers until you've thought the problem through carefully how you would do it, how you could slightly change the problem to be the correct one. So yes, you need to keep up. You need to keep up more to find out what the problems are than to read to find the solutions. The reading is necessary to know what is going on and what is possible. But reading to get the solutions does not seem to be the way to do great research. So I'll give you two answers. You read; but it is not the amount, it is the way you read that counts.
Avoiding excessive reading before thinking
your dreams are, to a fair extent, a reworking of the experiences of the day. If you are deeply immersed and committed to a topic, day after day after day, your subconscious has nothing to do but work on your problem. And so you wake up one morning, or on some afternoon, and there's the answer.
#dreams, subconscious processing
·blog.samaltman.com·
Judith Butler, philosopher: ‘If you sacrifice a minority like trans people, you are operating within a fascist logic’
Identity is, for me, a point of departure for alliances, which need to include all kinds of people, from trans to working people to those taxi drivers that J. K. Rowling is worried about. Identity is a great start for making connections and becoming part of larger communities. But you can’t have a politics of identity that is only about identity. If you do that, you draw sectarian lines, and you abandon our interdependent ties.
·english.elpais.com·
Meet Willow, our state-of-the-art quantum chip
Quantum engineers are essentially working with a "black box" - they can harness quantum mechanical principles to build working computers without fully understanding the deeper nature of what's happening, whether it truly involves parallel universes or some other explanation for the remarkable computational advantages quantum computers achieve.
Pioneered by our team and now widely used as a standard in the field, RCS is the classically hardest benchmark that can be done on a quantum computer today. You can think of this as an entry point for quantum computing — it checks whether a quantum computer is doing something that couldn’t be done on a classical computer. Any team building a quantum computer should check first if it can beat classical computers on RCS; otherwise there is strong reason for skepticism that it can tackle more complex quantum tasks.
Willow’s performance on this benchmark is astonishing: It performed a computation in under five minutes that would take one of today’s fastest supercomputers 10^25 or 10 septillion years. If you want to write it out, it’s 10,000,000,000,000,000,000,000,000 years. This mind-boggling number exceeds known timescales in physics and vastly exceeds the age of the universe. It lends credence to the notion that quantum computation occurs in many parallel universes, in line with the idea that we live in a multiverse, a prediction first made by David Deutsch.
·blog.google·
Data Laced with History: Causal Trees & Operational CRDTs
After mulling over my bullet points, it occurred to me that the network problems I was dealing with—background cloud sync, editing across multiple devices, real-time collaboration, offline support, and reconciliation of distant or conflicting revisions—were all pointing to the same question: was it possible to design a system where any two revisions of the same document could be merged deterministically and sensibly without requiring user intervention?
It’s what happened after sync that was troubling. On encountering a merge conflict, you’d be thrown into a busy conversation between the network, model, persistence, and UI layers just to get back into a consistent state. The data couldn’t be left alone to live its peaceful, functional life: every concurrent edit immediately became a cross-architectural matter.
I kept several questions in mind while doing my analysis. Could a given technique be generalized to arbitrary and novel data types? Did the technique pass the PhD Test? And was it possible to use the technique in an architecture with smart clients and dumb servers?
Concurrent edits are sibling branches. Subtrees are runs of characters. By the nature of reverse timestamp+UUID sort, sibling subtrees are sorted in the order of their head operations.
This is the underlying premise of the Causal Tree. In contrast to all the other CRDTs I’d been looking into, the design presented in Victor Grishchenko’s brilliant paper was simultaneously clean, performant, and consequential. Instead of dense layers of theory and labyrinthine data structures, everything was centered around the idea of atomic, immutable, metadata-tagged, and causally-linked operations, stored in low-level data structures and directly usable as the data they represented.
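The sibling ordering described in these highlights can be sketched in a few lines. This is a simplified illustration, not the article's or Grishchenko's implementation: it assumes each atom carries a logical timestamp, a short site string standing in for a device UUID, and the id of the atom it was inserted after. Deletions and the article's storage layout are left out, and the names `Atom`, `merge`, and `render` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

AtomId = Tuple[int, str]  # (logical timestamp, site identifier standing in for a UUID)


@dataclass(frozen=True)
class Atom:
    timestamp: int
    site: str
    cause: Optional[AtomId]  # id of the atom this one was inserted after
    value: str

    @property
    def id(self) -> AtomId:
        return (self.timestamp, self.site)


ROOT = Atom(0, "", None, "")  # shared starting atom with no cause


def merge(*replicas: Iterable[Atom]) -> Dict[AtomId, Atom]:
    """Merging is a set union keyed by atom id, so argument order does not matter."""
    return {atom.id: atom for replica in replicas for atom in replica}


def render(atoms: Dict[AtomId, Atom]) -> str:
    """Depth-first walk that visits the newest sibling first."""
    children: Dict[AtomId, List[Atom]] = defaultdict(list)
    for atom in atoms.values():
        if atom.cause is not None:
            children[atom.cause].append(atom)

    out: List[str] = []
    stack = [atoms[ROOT.id]]
    while stack:
        atom = stack.pop()
        out.append(atom.value)
        # Push siblings in ascending id order so the largest (newest) id is popped first.
        stack.extend(sorted(children[atom.id], key=lambda a: a.id))
    return "".join(out)


# Both devices start from the same text "ab".
a = Atom(1, "A", ROOT.id, "a")
b = Atom(2, "A", a.id, "b")
base = [ROOT, a, b]

# Offline, device A types "XX" after "a" while device B types "YY" after "a".
x1 = Atom(3, "A", a.id, "X")
x2 = Atom(4, "A", x1.id, "X")
y1 = Atom(3, "B", a.id, "Y")
y2 = Atom(4, "B", y1.id, "Y")

replica_a = base + [x1, x2]
replica_b = base + [y1, y2]

merged_ab = render(merge(replica_a, replica_b))
merged_ba = render(merge(replica_b, replica_a))
print(merged_ab, merged_ba)
```

Each concurrent run stays contiguous because its later characters hang off its own head atom rather than off the shared parent, and both merge orders produce the same string because the ordering depends only on atom ids.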
I’m going to be calling this new breed of CRDTs operational replicated data types—partly to avoid confusion with the existing term “operation-based CRDTs” (or CmRDTs), and partly because “replicated data type” (RDT) seems to be gaining popularity over “CRDT” and the term can be expanded to “ORDT” without impinging on any existing terminology.
Much like Causal Trees, ORDTs are assembled out of atomic, immutable, uniquely-identified and timestamped “operations” which are arranged in a basic container structure. (For clarity, I’m going to be referring to this container as the structured log of the ORDT.) Each operation represents an atomic change to the data while simultaneously functioning as the unit of data resultant from that action. This crucial event–data duality means that an ORDT can be understood as either a conventional data structure in which each unit of data has been augmented with event metadata; or alternatively, as an event log of atomic actions ordered to resemble its output data structure for ease of execution.
To implement a custom data type as a CT, you first have to “atomize” it, or decompose it into a set of basic operations, then figure out how to link those operations such that a mostly linear traversal of the CT will produce your output data. (In other words, make the structure analogous to a one- or two-pass parsable format.)
OT and CRDT papers often cite 50ms as the threshold at which people start to notice latency in their text editors. Therefore, any code we might want to run on a CT—including merge, initialization, and serialization/deserialization—has to fall within this range. Except for trivial cases, this precludes O(n^2) or slower complexity: a 10,000 word article at 0.01ms per character would take 7 hours to process! The essential CT functions have to be O(n log n) at the very worst.
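The 7-hour figure can be reproduced with a little arithmetic. The sketch below assumes about 5 characters per word, a rule of thumb the article does not state, and treats the quoted 0.01ms as the cost of one step. The base-2 logarithm in the second estimate is also an assumption.

```python
import math

CHARS_PER_WORD = 5        # assumption: rule of thumb, not stated in the article
COST_MS_PER_STEP = 0.01   # per-character cost quoted in the article


def seconds_for(steps: float, cost_ms: float = COST_MS_PER_STEP) -> float:
    """Convert a step count into wall-clock seconds at a fixed per-step cost."""
    return steps * cost_ms / 1000.0


n = 10_000 * CHARS_PER_WORD                            # characters in a 10,000 word article
quadratic_hours = seconds_for(n ** 2) / 3600           # O(n^2)
linearithmic_seconds = seconds_for(n * math.log2(n))   # O(n log n), log base 2 assumed

print(f"n = {n} characters")
print(f"O(n^2):     about {quadratic_hours:.1f} hours")
print(f"O(n log n): about {linearithmic_seconds:.1f} seconds")
```

With these assumptions the quadratic case comes to just under 7 hours, in line with the article, while the same per-step cost under n log n comes to seconds rather than hours.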
Of course, CRDTs aren’t without their difficulties. For instance, a CRDT-based document will always be “live”, even when offline. If a user inadvertently revises the same CRDT-based document on two offline devices, they won’t see the familiar pick-a-revision dialog on reconnection: both documents will happily merge and retain any duplicate changes. (With ORDTs, this can be fixed after the fact by filtering changes by device, but the user will still have to learn to treat their documents with a bit more caution.) In fully decentralized contexts, malicious users will have a lot of power to irrevocably screw up the data without any possibility of a rollback, and encryption schemes, permission models, and custom protocols may have to be deployed to guard against this. In terms of performance and storage, CRDTs contain a lot of metadata and require smart and performant peers, whereas centralized architectures are inherently more resource-efficient and only demand the bare minimum of their clients. You’d be hard-pressed to use CRDTs in data-heavy scenarios such as screen sharing or video editing. You also won’t necessarily be able to layer them on top of existing infrastructure without significant refactoring.
Perhaps a CRDT-based text editor will never quite be as fast or as bandwidth-efficient as Google Docs, for such is the power of centralization. But in exchange for a totally decentralized computing future? A world full of devices that control their own data and freely collaborate with one another? Data-centric code that’s entirely free from network concerns? I’d say: it’s surely worth a shot!
·archagon.net·
Psilocybin desynchronizes the human brain - Nature

Claude summary: This research provides new insights into how psilocybin affects large-scale brain activity and connectivity. The key finding is that psilocybin causes widespread desynchronization of brain activity, particularly in association cortex areas. This desynchronization correlates with the intensity of subjective psychedelic experiences and may underlie both the acute effects and potential therapeutic benefits of psilocybin. The desynchronization of brain networks may allow for increased flexibility and plasticity, potentially explaining both the acute psychedelic experience and longer-term therapeutic effects.

Psilocybin acutely caused profound and widespread brain FC changes (Fig. 1a) across most of the cerebral cortex (P < 0.05 based on two-sided linear mixed-effects (LME) model and permutation testing), but most prominent in association networks
Across psilocybin sessions and participants, FC change tracked with the intensity of the subjective experience (Fig. 1f and Extended Data Fig. 4).
·nature.com·
You Should Seriously Read ‘Stoner’ Right Now (Published 2014)
I find it tremendously hopeful that “Stoner” is thriving in a world in which capitalist energies are so hellbent on distracting us from the necessary anguish of our inner lives. “Stoner” argues that we are measured ultimately by our capacity to face the truth of who we are in private moments, not by the burnishing of our public selves.
The story of his life is not a neat crescendo of industry and triumph, but something more akin to our own lives: a muddle of desires and inhibitions and compromises.
The deepest lesson of “Stoner” is this: What makes a life heroic is the quality of attention paid to it.
Americans worship athletes and moguls and movie stars, those who possess the glittering gifts we equate with worth and happiness. The stories that flash across our screens tend to be paeans to reckless ambition.
It’s the staggering acceleration of our intellectual and emotional metabolisms: our hunger for sensation and narcissistic reward, our readiness to privilege action over contemplation. And, most of all, our desperate compulsion to be known by the world rather than seeking to know ourselves.
The emergence of a robust advertising culture reinforced the notion that Americans were more or less always on stage and thus in constant need of suitable costumes and props.
Consider our nightly parade of prime-time talent shows and ginned-up documentaries in which chefs and pawn brokers and bored housewives reinvent their private lives as theater.
If you want to be among those who count, and you don’t happen to be endowed with divine talents or a royal lineage, well then, make some noise. Put your wit — or your craft projects or your rants or your pranks — on public display.
Our most profound acts of virtue and vice, of heroism and villainy, will be known by only those closest to us and forgotten soon enough. Even our deepest feelings will, for the most part, lay concealed within the vault of our hearts. Much of the reason we construct garish fantasies of fame is to distract ourselves from these painful truths. We confess so much to so many, as if by these disclosures we might escape the terror of confronting our hidden selves.
revelation is triggered by literature. The novel is notable as art because it places such profound faith in art.
·nytimes.com·
Infrared antenna-like structures in mammalian fur | Royal Society Open Science

This study proposes that guard hairs in small mammals function as infrared antennas, tuned to detect thermal radiation from predators, challenging the conventional understanding of mammalian fur functions.

The author acknowledges that the concepts are new and require verification by other research groups, calling for more expert microscopy, broader species surveys, and well-controlled behavioral studies.

·royalsocietypublishing.org·
Psilocybin desynchronizes the human brain - Nature
  • Scientists studied how psilocybin (the active ingredient in magic mushrooms) affects the brain using advanced brain imaging techniques.
  • They found that psilocybin causes widespread disruption in how different brain areas communicate with each other, especially in regions involved in complex thinking and self-reflection.
  • This disruption, called "desynchronization," was much stronger than the effects of a stimulant drug or normal day-to-day changes in brain activity.
  • The intensity of the psychedelic experience reported by participants matched the degree of brain desynchronization observed.
  • Some brain changes lasted up to 3 weeks after taking psilocybin, particularly in areas involved in memory and emotion.
  • These findings help explain how psilocybin might work to treat mental health conditions and offer new insights into how the brain functions during altered states of consciousness.
In animal models, psilocybin induces neuroplasticity in cortex and hippocampus
·nature.com·
Synthesizer for thought - thesephist.com
Draws parallels between the evolution of music production through synthesizers and the potential for new tools in language and idea generation. The author argues that breakthroughs in mathematical understanding of media lead to new creative tools and interfaces, suggesting that recent advancements in language models could revolutionize how we interact with and manipulate ideas and text.
A synthesizer produces music very differently than an acoustic instrument. It produces music at the lowest level of abstraction, as mathematical models of sound waves.
Once we started understanding writing as a mathematical object, our vocabulary for talking about ideas expanded in depth and precision.
An idea is composed of concepts in a vector space of features, and a vector space is a kind of marvelous mathematical object that we can write theorems and prove things about and deeply and fundamentally understand.
Synthesizers enabled entirely new sounds and genres of music, like electronic pop and techno. These new sounds were easier to discover and share because new sounds didn’t require designing entirely new instruments. The synthesizer organizes the space of sound into a tangible human interface, and as we discover new sounds, we could share it with others as numbers and digital files, as the mathematical objects they’ve always been.
Because synthesizers are electronic, unlike traditional instruments, we can attach arbitrary human interfaces to it. This dramatically expands the design space of how humans can interact with music. Synthesizers can be connected to keyboards, sequencers, drum machines, touchscreens for continuous control, displays for visual feedback, and of course, software interfaces for automation and endlessly dynamic user interfaces. With this, we freed the production of music from any particular physical form.
Recently, we’ve seen neural networks learn detailed mathematical models of language that seem to make sense to humans. And with a breakthrough in mathematical understanding of a medium, come new tools that enable new creative forms and allow us to tackle new problems.
Heatmaps can be particularly useful for analyzing large corpora or very long documents, making it easier to pinpoint areas of interest or relevance at a glance.
If we apply the same idea to the experience of reading long-form writing, it may look like this. Imagine opening a story on your phone and swiping in from the scrollbar edge to reveal a vertical spectrogram, each “frequency” of the spectrogram representing the prominence of different concepts like sentiment or narrative tension varying over time. Scrubbing over a particular feature “column” could expand it to tell you what the feature is, and which part of the text that feature most correlates with.
What would a semantic diff view for text look like? Perhaps when I edit text, I’d be able to hover over a control for a particular style or concept feature like “Narrative voice” or “Figurative language”, and my highlighted passage would fan out the options like playing cards in a deck to reveal other “adjacent” sentences I could choose instead. Or, if that involves too much reading, each word could simply be highlighted to indicate whether that word would be more or less likely to appear in a sentence that was more “narrative” or more “figurative” — a kind of highlight-based indicator for the direction of a semantic edit.
Browsing through these icons felt as if we were inventing a new kind of word, or a new notation for visual concepts mediated by neural networks. This could allow us to communicate about abstract concepts and patterns found in the wild that may not correspond to any word in our dictionary today.
What visual and sensory tricks can we use to coax our visual-perceptual systems to understand and manipulate objects in higher dimensions? One way to solve this problem may involve inventing new notation, whether as literal iconic representations of visual ideas or as some more abstract system of symbols.
Photographers buy and sell filters, and cinematographers share and download LUTs to emulate specific color grading styles. If we squint, we can also imagine software developers and their package repositories like NPM to be something similar — a global, shared resource of abstractions anyone can download and incorporate into their work instantly. No such thing exists for thinking and writing. As we figure out ways to extract elements of writing style from language models, we may be able to build a similar kind of shared library for linguistic features anyone can download and apply to their thinking and writing. A catalogue of narrative voice, speaking tone, or flavor of figurative language sampled from the wild or hand-engineered from raw neural network features and shared for everyone else to use.
We’re starting to see something like this already. Today, when users interact with conversational language models like ChatGPT, they may instruct, “Explain this to me like Richard Feynman.” In that interaction, they’re invoking some style the model has learned during its training. Users today may share these prompts, which we can think of as “writing filters”, with their friends and coworkers. This kind of interaction becomes much more powerful in the space of interpretable features, because features can be combined much more cleanly than textual instructions in prompts.
·thesephist.com·
Synthesizer for thought - thesephist.com
Mapping the Mind of a Large Language Model
Summary: Anthropic has made a significant advance in understanding the inner workings of large language models by identifying how millions of concepts are represented inside Claude Sonnet, one of their deployed models. This is the first detailed look inside a modern, production-grade large language model. The researchers used a technique called "dictionary learning" to isolate patterns of neuron activations that recur across many contexts, allowing them to map features to human-interpretable concepts. They found features corresponding to a vast range of entities, abstract concepts, and even potentially problematic behaviors. By manipulating these features, they were able to change the model's responses. Anthropic hopes this interpretability discovery could help make AI models safer in the future by monitoring for dangerous behaviors, steering models towards desirable outcomes, enhancing safety techniques, and providing a "test set for safety". However, much more work remains to be done to fully understand the representations the model uses and how to leverage this knowledge to improve safety.
We mostly treat AI models as a black box: something goes in and a response comes out, and it's not clear why the model gave that particular response instead of another. This makes it hard to trust that these models are safe: if we don't know how they work, how do we know they won't give harmful, biased, untruthful, or otherwise dangerous responses? How can we trust that they’ll be safe and reliable? Opening the black box doesn't necessarily help: the internal state of the model (what the model is "thinking" before writing its response) consists of a long list of numbers ("neuron activations") without a clear meaning. From interacting with a model like Claude, it's clear that it’s able to understand and wield a wide range of concepts, but we can't discern them from looking directly at neurons. It turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts.
Just as every English word in a dictionary is made by combining letters, and every sentence is made by combining words, every feature in an AI model is made by combining neurons, and every internal state is made by combining features.
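The "internal state is made by combining features" idea can be pictured as a weighted sum of feature directions. The sketch below is a toy illustration only: the feature names are borrowed from the excerpt, but the four-neuron vectors, the strengths, and the `combine` helper are invented for the example and do not come from any real model.

```python
# Toy illustration: all vectors and strengths are made up for this example.
# Each feature is a direction over four hypothetical neurons. Note that
# neuron 2 takes part in every feature, and every feature spans several neurons.
FEATURES = {
    "uppercase_text":  [1.0, 0.0, 0.5, 0.0],
    "dna_sequence":    [0.0, 1.0, 0.5, 0.0],
    "python_argument": [0.0, 0.0, 0.5, 1.0],
}


def combine(activations):
    """Build an internal state as a weighted sum of feature directions.

    `activations` maps a feature name to its strength; features that are
    absent contribute nothing, so the combination is sparse.
    """
    width = len(next(iter(FEATURES.values())))
    state = [0.0] * width
    for name, strength in activations.items():
        for i, weight in enumerate(FEATURES[name]):
            state[i] += strength * weight
    return state


# Two active features out of three produce one list of neuron activations.
state = combine({"uppercase_text": 2.0, "python_argument": 1.0})
print(state)
```

Reading the resulting list neuron by neuron does not reveal which features were active, which is the difficulty the excerpt describes; dictionary learning works in the opposite direction, recovering the features from many such states.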
In October 2023, we reported success applying dictionary learning to a very small "toy" language model and found coherent features corresponding to concepts like uppercase text, DNA sequences, surnames in citations, nouns in mathematics, or function arguments in Python code.
We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet (a member of our current, state-of-the-art model family, currently available on claude.ai), providing a rough conceptual map of its internal states halfway through its computation.
We also find more abstract features—responding to things like bugs in computer code, discussions of gender bias in professions, and conversations about keeping secrets.
We were able to measure a kind of "distance" between features based on which neurons appeared in their activation patterns. This allowed us to look for features that are "close" to each other. Looking near a "Golden Gate Bridge" feature, we found features for Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo.
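A minimal sketch of looking for "close" features, assuming each feature is a vector over neurons and that closeness is measured with cosine similarity. The excerpt does not say which distance measure was used, so cosine similarity is an assumption here, and the three-neuron vectors and the `nearest` helper are invented for illustration.

```python
import math

# Hypothetical feature directions over three neurons; the values are made up.
TOY_FEATURES = {
    "Golden Gate Bridge": [0.9, 0.1, 0.0],
    "Alcatraz Island": [0.8, 0.2, 0.1],
    "Python function arguments": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(name, k=1):
    """Return the k features most similar to `name`, most similar first."""
    target = TOY_FEATURES[name]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in TOY_FEATURES.items()
        if other != name
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


neighbors = nearest("Golden Gate Bridge")
print(neighbors)
```

With these toy vectors the bridge feature sits next to the Alcatraz feature and far from the Python one, mirroring the kind of neighborhood the excerpt reports.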
This holds at a higher level of conceptual abstraction: looking near a feature related to the concept of "inner conflict", we find features related to relationship breakups, conflicting allegiances, logical inconsistencies, as well as the phrase "catch-22". This shows that the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity. This might be the origin of Claude's excellent ability to make analogies and metaphors.
Amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query, even in situations where it wasn’t at all relevant.
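One simple way to picture amplifying a feature is to add a scaled copy of its direction to the internal state. This is an illustrative assumption, not the procedure the researchers describe; the vectors, the scale, and the `amplify` and `readout` helpers are all hypothetical.

```python
def amplify(state, direction, scale):
    """Push an internal state along a feature direction by `scale`."""
    return [s + scale * d for s, d in zip(state, direction)]


def readout(state, direction):
    """How strongly a state expresses a feature (a plain dot product)."""
    return sum(s * d for s, d in zip(state, direction))


# Hypothetical three-neuron state and feature direction, invented for this example.
bridge_direction = [1.0, 0.0, 0.0]
state = [0.25, 0.5, 0.25]

steered = amplify(state, bridge_direction, 8.0)

print(readout(state, bridge_direction))    # feature strength before steering
print(readout(steered, bridge_direction))  # feature strength after steering
```

After the push, the bridge feature dominates the state regardless of what the other neurons encode, which is a rough analogue of the model bringing up the bridge in answer to unrelated queries.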
Anthropic wants to make models safe in a broad sense, including everything from mitigating bias to ensuring an AI is acting honestly to preventing misuse, including in scenarios of catastrophic risk. It’s therefore particularly interesting that, in addition to the aforementioned scam emails feature, we found features corresponding to: capabilities with misuse potential (code backdoors, developing biological weapons); different forms of bias (gender discrimination, racist claims about crime); and potentially problematic AI behaviors (power-seeking, manipulation, secrecy).
Finding a full set of features using our current techniques would be cost-prohibitive (the computation required by our current approach would vastly exceed the compute used to train the model in the first place). Understanding the representations the model uses doesn't tell us how it uses them; even though we have the features, we still need to find the circuits they are involved in. And we need to show that the safety-relevant features we have begun to find can actually be used to improve safety. There's much more to be done.
·anthropic.com·
Mapping the Mind of a Large Language Model
The Californian Ideology
Summary: The Californian Ideology is a mix of cybernetics, free market economics, and counter-culture libertarianism that originated in California and has become a global orthodoxy. It asserts that technological progress will inevitably lead to a future of Jeffersonian democracy and unrestrained free markets. However, this ideology ignores the critical role of government intervention in technological development and the social inequalities perpetuated by free market capitalism.
·metamute.org·
The Californian Ideology
How we use generative AI tools | Communications | University of Cambridge
The ability of generative AI tools to analyse huge datasets can also be used to help spark creative inspiration. This can help us if we’re struggling for time or battling writer’s block. For example, if a social media manager is looking for ideas on how to engage alumni on Instagram, they could ask ChatGPT for suggestions based on recent popular content. They could then pick the best ideas from ChatGPT’s response and adapt them. We may use these tools in a similar way to how we ask a colleague for an idea on how to approach a creative task.
We may use these tools in a similar way to how we use search engines for researching topics and will always carefully fact-check before publication.
We will not publish any press releases, articles, social media posts, blog posts, internal emails or other written content that is 100% produced by generative AI. We will always apply brand guidelines, fact-check responses, and re-write in our own words.
We may use these tools to make minor changes to a photo to make it more usable without changing the subject matter or original essence. For example, if a website manager needs a photo in a landscape ratio but only has one in a portrait ratio, they could use Photoshop’s inbuilt AI tools to extend the background of the photo to create an image with the correct dimensions for the website.
·communications.cam.ac.uk·
How we use generative AI tools | Communications | University of Cambridge