Found 611 bookmarks
Synthesizer for thought - thesephist.com
Synthesizer for thought - thesephist.com
Draws parallels between the evolution of music production through synthesizers and the potential for new tools in language and idea generation. The author argues that breakthroughs in mathematical understanding of media lead to new creative tools and interfaces, suggesting that recent advancements in language models could revolutionize how we interact with and manipulate ideas and text.
A synthesizer produces music very differently than an acoustic instrument. It produces music at the lowest level of abstraction, as mathematical models of sound waves.
Once we started understanding writing as a mathematical object, our vocabulary for talking about ideas expanded in depth and precision.
An idea is composed of concepts in a vector space of features, and a vector space is a kind of marvelous mathematical object that we can write theorems and prove things about and deeply and fundamentally understand.
Synthesizers enabled entirely new sounds and genres of music, like electronic pop and techno. These new sounds were easier to discover and share because new sounds didn’t require designing entirely new instruments. The synthesizer organizes the space of sound into a tangible human interface, and as we discover new sounds, we can share them with others as numbers and digital files, as the mathematical objects they’ve always been.
Because synthesizers are electronic, unlike traditional instruments, we can attach arbitrary human interfaces to it. This dramatically expands the design space of how humans can interact with music. Synthesizers can be connected to keyboards, sequencers, drum machines, touchscreens for continuous control, displays for visual feedback, and of course, software interfaces for automation and endlessly dynamic user interfaces. With this, we freed the production of music from any particular physical form.
Recently, we’ve seen neural networks learn detailed mathematical models of language that seem to make sense to humans. And with a breakthrough in mathematical understanding of a medium, come new tools that enable new creative forms and allow us to tackle new problems.
Heatmaps can be particularly useful for analyzing large corpora or very long documents, making it easier to pinpoint areas of interest or relevance at a glance.
If we apply the same idea to the experience of reading long-form writing, it may look like this. Imagine opening a story on your phone and swiping in from the scrollbar edge to reveal a vertical spectrogram, each “frequency” of the spectrogram representing the prominence of different concepts like sentiment or narrative tension varying over time. Scrubbing over a particular feature “column” could expand it to tell you what the feature is, and which part of the text that feature most correlates with.
What would a semantic diff view for text look like? Perhaps when I edit text, I’d be able to hover over a control for a particular style or concept feature like “Narrative voice” or “Figurative language”, and my highlighted passage would fan out the options like playing cards in a deck to reveal other “adjacent” sentences I could choose instead. Or, if that involves too much reading, each word could simply be highlighted to indicate whether that word would be more or less likely to appear in a sentence that was more “narrative” or more “figurative” — a kind of highlight-based indicator for the direction of a semantic edit.
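A minimal sketch of how such a highlight-based semantic edit indicator might be prototyped without access to model internals: estimate a feature direction (say, “figurative vs. literal”) from contrast examples, then score each word by how much dropping it moves the sentence along that direction. The `embed()` function is a placeholder for any sentence-embedding model; this approximates the idea with embeddings rather than the interpretable features the essay imagines.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: return a vector for `text` from any sentence-embedding model."""
    raise NotImplementedError

def feature_direction(positive_examples, negative_examples) -> np.ndarray:
    # Estimate a "figurative vs. literal" direction as the difference of mean embeddings.
    pos = np.mean([embed(s) for s in positive_examples], axis=0)
    neg = np.mean([embed(s) for s in negative_examples], axis=0)
    d = pos - neg
    return d / np.linalg.norm(d)

def word_scores(sentence: str, direction: np.ndarray):
    # Score each word by how much removing it shifts the sentence along the direction --
    # a crude proxy for "this word pushes the sentence toward or away from the feature."
    words = sentence.split()
    base = embed(sentence) @ direction
    scores = []
    for i in range(len(words)):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores.append((words[i], base - (embed(reduced) @ direction)))
    return scores  # positive score = word contributes to the feature
```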
Browsing through these icons felt as if we were inventing a new kind of word, or a new notation for visual concepts mediated by neural networks. This could allow us to communicate about abstract concepts and patterns found in the wild that may not correspond to any word in our dictionary today.
What visual and sensory tricks can we use to coax our visual-perceptual systems to understand and manipulate objects in higher dimensions? One way to solve this problem may involve inventing new notation, whether as literal iconic representations of visual ideas or as some more abstract system of symbols.
Photographers buy and sell filters, and cinematographers share and download LUTs to emulate specific color grading styles. If we squint, we can also imagine software developers and their package repositories like NPM to be something similar — a global, shared resource of abstractions anyone can download and incorporate into their work instantly. No such thing exists for thinking and writing. As we figure out ways to extract elements of writing style from language models, we may be able to build a similar kind of shared library for linguistic features anyone can download and apply to their thinking and writing. A catalogue of narrative voice, speaking tone, or flavor of figurative language sampled from the wild or hand-engineered from raw neural network features and shared for everyone else to use.
We’re starting to see something like this already. Today, when users interact with conversational language models like ChatGPT, they may instruct, “Explain this to me like Richard Feynman.” In that interaction, they’re invoking some style the model has learned during its training. Users today may share these prompts, which we can think of as “writing filters”, with their friends and coworkers. This kind of an interaction becomes much more powerful in the space of interpretable features, because features can be combined together much more cleanly than textual instructions in prompts.
·thesephist.com·
Synthesizer for thought - thesephist.com
the best way to please is not to please
the best way to please is not to please
I wanted to take care of everyone’s feelings. If I made them feel good, I would be rewarded with their affection. For a long time, socializing involved playing a weird form of Mad-Libs: I wanted to say whatever you wanted to hear. I wanted to be assertive, but also understanding and reasonable and thoughtful.
I really took what I learned and ran with it. I wanted to master what I was bad at and make other people happy. I realized that it was:
  • bad to talk too much about yourself
  • good to show interest in other people’s hobbies, problems, and interests
  • important to pay attention to body language
  • my job to make sure that whatever social situation we were in was a delightful experience for everyone involved
·avabear.xyz·
the best way to please is not to please
Blessed and emoji-pilled: why language online is so absurd
Blessed and emoji-pilled: why language online is so absurd
AI: This article explores the evolution of online language and communication, highlighting the increasing absurdity and surrealism in digital discourse. It discusses how traditional language is being replaced by memes, emojis, and seemingly nonsensical phrases, reflecting the influence of social media platforms and algorithms on our communication styles. The piece examines the implications of this shift, touching on themes of information overload, AI-like speech patterns, and the potential consequences of this new form of digital dialect.
Layers upon layers of references are stacked together in a single post, while the posts themselves fly by faster than ever in our feeds. To someone who isn’t “chronically online” a few dislocated images or words may trigger a flash of recognition – a member of the royal family, a beloved cartoon character – but their relationship with each other is impossible to unpick. Add the absurdist language of online culture and the impenetrable algorithms that decide what we see in our feeds, and it seems like all hope is lost when it comes to making sense of the internet.
Forget words! Don’t think! In today’s digitally-mediated landscape, there’s no need for knowledge or understanding, just information. Scroll the feed and you’ll find countless video clips and posts advocating this smooth-brained agenda: lobotomy chic, sludge content, silly girl summer.
“With memes, images are converging more on the linguistic, becoming flattened into something more like symbols/hieroglyphs/words,” says writer Olivia Kan-Sperling, who specialises in programming language critique. For the meme-fluent, the form isn’t important, but rather the message it carries. “A meme is lower-resolution in terms of its aesthetic affordances than a normal pic because you barely have to look at it to know what it’s ‘doing’,” she expands. “For the literate, its full meaning unfolds at a glance.” To understand this way of “speaking writing posting” means we must embrace the malleability of language, the ambiguities and interpretations – and free it from ‘real-world’ rules.
Hey guys, I just got an order in from Sephora – here’s everything that I got. Get ready with me for a boat day in Miami. Come and spend the day with me – starting off with coffee. TikTok influencers engage in a high-pitched and breathless way of speaking that over-emphasises keywords in a youthful, singsong cadence. For the Attention Economy, it’s the sort of algorithm-friendly repetition that’s quantified by clicks and likes, monetised by engagement for short attention spans. “Now, we have to speak machine with machines that were trained on humans,” says Basar, who refers to this algorithm-led style as promptcore.
As algorithms digest our online behaviour into data, we resemble a swarm, a hivemind. We are beginning to think and speak like machines, in UI-friendly keywords and emoji-pilled phrases.
·dazeddigital.com·
Blessed and emoji-pilled: why language online is so absurd
The secret digital behaviors of Gen Z
The secret digital behaviors of Gen Z

Describes a shift from traditional notions of information literacy to "information sensibility" among Gen Zers, who prioritize social signals and peer influence over fact-checking. The research by Jigsaw, a Google subsidiary, reveals that Gen Zers spend their digital lives in "timepass" mode, engaging with light content and trusting influencers over traditional news sources.

Comment sections for social validation and information signaling

·businessinsider.com·
The secret digital behaviors of Gen Z
Apple intelligence and AI maximalism — Benedict Evans
Apple intelligence and AI maximalism — Benedict Evans
The chatbot might replace all software with a prompt - ‘software is dead’. I’m skeptical about this, as I’ve written here, but Apple is proposing the opposite: that generative AI is a technology, not a product.
Apple is, I think, signalling a view that generative AI, and ChatGPT itself, is a commodity technology that is most useful when it is: Embedded in a system that gives it broader context about the user (which might be search, social, a device OS, or a vertical application) and Unbundled into individual features (ditto), which are inherently easier to run as small power-efficient models on small power-efficient devices on the edge (paid for by users, not your capex budget) - which is just as well, because… This stuff will never work for the mass-market if we have marginal cost every time the user presses ‘OK’ and we need a fleet of new nuclear power-stations to run it all.
Apple has built its own foundation models, which (on the benchmarks it published) are comparable to anything else on the market, but there’s nowhere that you can plug a raw prompt directly into the model and get a raw output back - there are always sets of buttons and options shaping what you ask, and that’s presented to the user in different ways for different features. In most of these features, there’s no visible bot at all. You don’t ask a question and get a response: instead, your emails are prioritised, or you press ‘summarise’ and a summary appears. You can type a request into Siri (and Siri itself is only one of the many features using Apple’s models), but even then you don’t get raw model output back: you get GUI. The LLM is abstracted away as an API call.
Apple is treating this as a technology to enable new classes of features and capabilities, where there is design and product management shaping what the technology does and what the user sees, not as an oracle that you ask for things.
Apple is drawing a split between a ‘context model’ and a ‘world model’. Apple’s models have access to all the context that your phone has about you, powering those features, and this is all private, both on device and in Apple’s ‘Private Cloud’. But if you ask for ideas for what to make with a photo of your grocery shopping, then this is no longer about your context, and Apple will offer to send that to a third-party world model - today, ChatGPT.
that’s clearly separated into a different experience where you should have different expectations, and it’s also, of course, OpenAI’s brand risk, not Apple’s. Meanwhile, that world model gets none of your context, only your one-off prompt.
Neither OpenAI nor any of the other cloud models from new companies (Anthropic, Mistral etc) have your emails, messages, locations, photos, files and so on.
Apple is letting OpenAI take the brand risk of creating pizza glue recipes, and making error rates and abuse someone else’s problem, while Apple watches from a safe distance.
The next step, probably, is to take bids from Bing and Google for the default slot, but meanwhile, more and more use-cases will be quietly shifted from the third party to Apple’s own models. It’s Apple’s own software that decides where the queries go, after all, and which ones need the third party at all.
A lot of the compute to run Apple Intelligence is in end-user devices paid for by the users, not Apple’s capex budget, and Apple Intelligence is free.
Commoditisation is often also integration. There was a time when ‘spell check’ was a separate product that you had to buy, for hundreds of dollars, and there were dozens of competing products on the market, but over time it was integrated first into the word processor and then the OS. The same thing happened with the last wave of machine learning - style transfer or image recognition were products for five minutes and then became features. Today ‘summarise this document’ is AI, and you need a cloud LLM that costs $20/month, but tomorrow the OS will do that for free. ‘AI is whatever doesn’t work yet.’
Apple is big enough to take its own path, just as it did moving the Mac to its own silicon: it controls the software and APIs on top of the silicon that are the basis of those developer network effects, and it has a world class chip team and privileged access to TSMC.
Apple is doing something slightly different - it’s proposing a single context model for everything you do on your phone, and powering features from that, rather than adding disconnected LLM-powered features at disconnected points across the company.
·ben-evans.com·
Apple intelligence and AI maximalism — Benedict Evans
written in the body
written in the body
I spent so many years of my life trying to live mostly in my head. Intellectualizing everything made me feel like it was manageable. I was always trying to manage my own reactions and the reactions of everyone else around me. Learning how to manage people was the skill that I had been lavishly rewarded for in my childhood and teens. Growing up, you’re being reprimanded in a million different ways all the time, and I learned to modify my behavior so that over time I got more and more positive feedback. People like it when you do X and not Y, say X and not Y. I kept track of all of it in my head and not in my body. Intellectualizing kept me numbed out, and for a long time what I wanted was nothing more than to be numbed out, because when things hurt they hurt less. Whatever I felt like I couldn’t show people or tell people I hid away. I compartmentalized, and what I put in the compartment I never looked at became my shadow.
So much of what I care about can be boiled down to this: when you’re able to really inhabit and pay attention to your body, it becomes obvious what you want and don’t want, and the path towards your desires is clear. If you’re not in your body, you’re constantly rationalizing what you should do next, and that can leave you inert or trapped or simply choosing the wrong thing over and over. “I know I should, but I can’t do it” is often another way of saying “I’ve reached this conclusion intellectually, but I’m so frozen out of my body I can’t feel a deeper certainty.”
It was so incredibly hard when people gave me negative feedback—withdrew, or rejected me, or were just preoccupied with their own problems—because I relied on other people to figure out whether everything was alright.
When I started living in my body I started feeling for the first time that I could trust myself in a way that extended beyond trust of my intelligence, of my ability to pick up on cues in my external environment.
I can keep my attention outwards, I don’t direct it inwards in a self-conscious way. It’s the difference between noticing whether someone seems to be having a good time in the moment by watching their face vs agonizing about whether they enjoyed something after the fact. I can tell the difference between when I’m tired because I didn’t sleep well versus tired because I’m bored versus tired because I’m avoiding something. When I’m in my body, I’m aware of myself instead of obsessing over my state, and this allows me to have more room for other people.
·avabear.xyz·
written in the body
Richard Linklater Sees the Killer Inside Us All
Richard Linklater Sees the Killer Inside Us All
What’s your relationship now to the work back then? Are you as passionate? I really had to think about that. My analysis of that is, you’re a different person with different needs. A lot of that is based on confidence. When you’re starting out in an art form or anything in life, you can’t have confidence because you don’t have experience, and you can only get confidence through experience. But you have to be pretty confident to make a film. So the only way you counterbalance that lack of experience and confidence is absolute passion, fanatical spirit. And I’ve had this conversation over the years with filmmaker friends: Am I as passionate as I was in my 20s? Would I risk my whole life? If it was my best friend or my negative drowning, which do I save? The 20-something self goes, I’m saving my film! Now it’s not that answer. I’m not ashamed to say that, because all that passion doesn’t go away. It disperses a little healthfully. I’m passionate about more things in the world. I care about more things, and that serves me. The most fascinating relationship we all have is to ourselves at different times in our lives. You look back, and it’s like, I’m not as passionate as I was at 25. Thank God. That person was very insecure, very unkind. You’re better than that now. Hopefully.
·nytimes.com·
Richard Linklater Sees the Killer Inside Us All
How to read a movie - Roger Ebert
How to read a movie - Roger Ebert
When the Sun-Times appointed me film critic, I hadn't taken a single film course (the University of Illinois didn't offer them in those days). One of the reasons I started teaching was to teach myself. Look at a couple dozen New Wave films, you know more about the New Wave. Same with silent films, documentaries, specific directors.
visual compositions have "intrinsic weighting." By that I believe he means that certain areas of the available visual space have tendencies to stir emotional or aesthetic reactions. These are not "laws." To "violate" them can be as meaningful as to "follow" them. I have never heard of a director or cinematographer who ever consciously applied them.
I suspect that filmmakers compose shots from images that well up emotionally, instinctively or strategically, just as a good pianist never thinks about the notes.
I already knew about the painter's "Golden Mean," or the larger concept of the "golden ratio." For a complete explanation, see Wiki, and also look up the "Rule of Thirds." To reduce the concept to a crude rule of thumb in the composition of a shot in a movie: A person located somewhat to the right of center will seem ideally placed. A person to the right of that position will seem more positive; to the left, more negative. A centered person will seem objectified, like a mug shot. I call that position somewhat to the right of center the "strong axis."
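For concreteness, those rules of thumb reduce to simple arithmetic. The tiny illustration below assumes a 1920-pixel-wide frame; the golden-ratio and rule-of-thirds positions are the standard formulas, and the specific pixel values are only examples, not anything Ebert prescribes.

```python
frame_width = 1920                                # assumed frame width in pixels

golden_ratio = (1 + 5 ** 0.5) / 2                 # ≈ 1.618
strong_axis_golden = frame_width / golden_ratio   # ≈ 1187 px, a bit right of center
strong_axis_thirds = frame_width * 2 / 3          # = 1280 px, the right rule-of-thirds line
center = frame_width / 2                          # = 960 px, the "mug shot" position

print(f"center {center:.0f}px | golden-ratio axis {strong_axis_golden:.0f}px | thirds {strong_axis_thirds:.0f}px")
```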
They are not absolutes. But in general terms, in a two-shot, the person on the right will "seem" dominant over the person on the left
In simplistic terms: Right is more positive, left more negative. Movement to the right seems more favorable; to the left, less so. The future seems to live on the right, the past on the left. The top is dominant over the bottom. The foreground is stronger than the background. Symmetrical compositions seem at rest. Diagonals in a composition seem to "move" in the direction of the sharpest angle they form, even though of course they may not move at all. Therefore, a composition could lead us into a background that becomes dominant over a foreground.
Of course I should employ quotation marks every time I write such words as positive, negative, stronger, weaker, stable, past, future, dominant or submissive. All of these are tendencies, not absolutes, and as I said, can work as well by being violated as by being followed. Think of "intrinsic weighting" as a process that gives all areas of the screen complete freedom, but acts like an invisible rubber band to create tension or attention when stretched. Never make the mistake of thinking of these things as absolutes. They exist in the realm of emotional tendencies. I often use the cautionary phrase, "all things being equal" -- which of course they never are.
·rogerebert.com·
How to read a movie - Roger Ebert
What Is the Best Way to Cut an Onion?
What Is the Best Way to Cut an Onion?
As it turns out, cutting radially is, in fact, marginally worse than the traditional method. With all your knife strokes converging at a single central point, the thin wedges of onion that you create with your first strokes taper drastically as they get toward the center, resulting in large dice cut from the outer layers and much larger dice from the center. But even the classic method doesn’t produce particularly even dice, with a standard deviation of about 48 percent.
For the next set of simulations, I wondered what would happen if, instead of making radial cuts with the knife pointed directly at the circle’s center, we aimed our knife at an imaginary point somewhere below the surface of the cutting board, producing cuts somewhere between perfectly vertical and completely radial.
This proved to be key. By plotting the standard deviation of the onion pieces against the point below the cutting board surface at which your knife is aimed, Dr. Poulsen produced a chart that revealed the ideal point to be exactly .557 onion radiuses below the surface of the cutting board. Or, if it’s easier: Angle your knife toward a point roughly six-tenths of an onion’s height below the surface of the cutting board. If you want to be even more lax about it, making sure your knife isn’t quite oriented vertically or radially for those initial cuts is enough to make a measurable difference in dice evenness.
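A back-of-the-envelope Monte Carlo of the idea, not Dr. Poulsen's actual calculation: model the half-onion cross-section as concentric layers, aim evenly angled cuts at a point some number of radii below the board, and compare how uneven the resulting pieces are. The layer count, number of cuts, and even-angle spacing are all assumptions of this sketch.

```python
import numpy as np

def dice_cv(depth_in_radii: float, num_cuts: int = 10, num_layers: int = 8,
            samples: int = 200_000, seed: int = 0) -> float:
    """Coefficient of variation (std/mean) of piece areas for a half-onion cross-section
    of radius 1, with cuts aimed at a point `depth_in_radii` below the cutting board.
    Piece areas are estimated by Monte Carlo sampling."""
    rng = np.random.default_rng(seed)
    # Sample points uniformly in the upper half-disk (the onion cross-section).
    x = rng.uniform(-1, 1, samples)
    y = rng.uniform(0, 1, samples)
    keep = x**2 + y**2 <= 1
    x, y = x[keep], y[keep]

    # Layer index: which concentric onion layer the point falls in.
    layer = np.minimum((np.sqrt(x**2 + y**2) * num_layers).astype(int), num_layers - 1)

    # Sector index: cuts are lines through the aim point (0, -depth), spaced
    # evenly in angle as seen from that point (a simplification).
    aim_y = -depth_in_radii
    theta = np.arctan2(y - aim_y, x)
    lo, hi = np.arctan2(-aim_y, 1.0), np.arctan2(-aim_y, -1.0)
    sector = np.clip(((theta - lo) / (hi - lo) * (num_cuts + 1)).astype(int), 0, num_cuts)

    # Piece area ~ number of sampled points sharing a (layer, sector) cell.
    counts = np.bincount(layer * (num_cuts + 1) + sector,
                         minlength=num_layers * (num_cuts + 1))
    counts = counts[counts > 0]
    return counts.std() / counts.mean()

# Compare nearly vertical cuts, fully radial cuts, and an aim point below the board.
for depth in (100.0, 0.0, 0.56):   # large depth ≈ vertical; 0 = radial; 0.56 ≈ the article's optimum
    print(f"aim depth {depth}: spread {dice_cv(depth):.3f}")
```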
·nytimes.com·
What Is the Best Way to Cut an Onion?
Spreadsheet Assassins | Matthew King
Spreadsheet Assassins | Matthew King
The real key to SaaS success is often less about innovative software and more about locking in customers and extracting maximum value. Many SaaS products simply digitize spreadsheet workflows into proprietary systems, making it difficult for customers to switch. As SaaS proliferates into every corner of the economy, it imposes a growing "software tax" on businesses and consumers alike. While spreadsheets remain a flexible, interoperable stalwart, the trajectory of SaaS points to an increasingly extractive model prioritizing rent-seeking over genuine productivity gains.
As a SaaS startup scales, sales and customer support staff pay for themselves, and the marginal cost to serve your one-thousandth versus one-millionth user is near-zero. The result? Some SaaS companies achieve gross profit margins of 75 to 90 percent, rivaling Windows in its monopolistic heyday.
Rent-seeking has become an explicit playbook for many shameless SaaS investors. Private equity shop Thoma Bravo has acquired over four hundred software companies, repeatedly mashing products together to amplify lock-in effects so it can slash costs and boost prices—before selling the ravaged Franken-platform to the highest bidder.
In the Kafkaesque realm of health care, software giant Epic’s 1990s-era UI is still widely used for electronic medical records, a nuisance that arguably puts millions of lives at risk, even as it accrues billions in annual revenue and actively resists system interoperability. SAP, the antiquated granddaddy of enterprise resource planning software, has endured for decades within frustrated finance and supply chain teams, even as thousands of SaaS startups try to chip away at its dominance. Salesforce continues to grow at a rapid clip, despite a clunky UI that users say is “absolutely terrible” and “stuck in the 80s”—hence, the hundreds of “SalesTech” startups that simplify a single platform workflow (and pray for a billion-dollar acquihire to Benioff’s mothership). What these SaaS overlords might laud as an ecosystem of startup innovation is actually a reflection of their own technical shortcomings and bloated inertia.
Over 1,500 software startups are focused on billing and invoicing alone. The glut of tools extends to sectors without any clear need for complex software: no fewer than 378 hair salon platforms, 166 parking management solutions, and 70 operating systems for funeral homes and cemeteries are currently on the market. Billions of public pension and university endowment dollars are being burned on what amounts to hackathon curiosities, driven by the machinations of venture capital and private equity. To visit a much-hyped “demo day” at a startup incubator like Y Combinator or Techstars is to enter a realm akin to a high-end art fair—except the objects being admired are not texts or sculptures or paintings but slightly nicer faces for the drudgery of corporate productivity.
As popular as SaaS has become, much of the modern economy still runs on the humble, unfashionable spreadsheet. For all its downsides, there are virtues. Spreadsheets are highly interoperable between firms, partly because of another monopoly (Excel) but also because the generic .csv format is recognized by countless applications. They offer greater autonomy and flexibility, with tabular cells and formulas that can be shaped into workflows, processes, calculators, databases, dashboards, calendars, to-do lists, bug trackers, accounting workbooks—the list goes on. Spreadsheets are arguably the most popular programming language on Earth.
·web.archive.org·
Spreadsheet Assassins | Matthew King
When America was ‘great,’ according to data - The Washington Post
When America was ‘great,’ according to data - The Washington Post
we looked at the data another way, measuring the gap between each person’s birth year and their ideal decade. The consistency of the resulting pattern delighted us: It shows that Americans feel nostalgia not for a specific era, but for a specific age. The good old days when America was “great” aren’t the 1950s. They’re whatever decade you were 11, your parents knew the correct answer to any question, and you’d never heard of war crimes tribunals, microplastics or improvised explosive devices.
The closest-knit communities were those in our childhood, ages 4 to 7. The happiest families, most moral society and most reliable news reporting came in our early formative years — ages 8 through 11. The best economy, as well as the best radio, television and movies, happened in our early teens — ages 12 through 15.
almost without exception, if you ask an American when times were worst, the most common response will be “right now!” This holds true even when “now” is clearly not the right answer. For example, when we ask which decade had the worst economy, the most common answer is today. The Great Depression — when, for much of a decade, unemployment exceeded what we saw in the worst month of pandemic shutdowns — comes in a grudging second.
measure after measure, Republicans were more negative about the current decade than any other group — even low-income folks in objectively difficult situations.
Hsu and her friends spent the first part of 2024 asking 2,400 Americans where they get their information about the economy. In a new analysis, she found Republicans who listen to partisan outlets are more likely to be negative, and Democrats who listen to their own version of such news are more positive — and that Republicans are a bit more likely to follow partisan news.
·archive.is·
When America was ‘great,’ according to data - The Washington Post
Culture Needs More Jerks | Defector
Culture Needs More Jerks | Defector
The function of criticism is and has always been to complicate our sense of beauty. Good criticism of music we love—or, occasionally, really hate—increases the dimensions and therefore the volume of feeling. It exercises that part of ourselves which responds to art, making it stronger.
The correction to critics’ failure to take pop music seriously is known as poptimism: the belief that pop music is just as worthy of critical consideration as genres like rock, rap or, god forbid, jazz. In my opinion, this correction was basically good. It’s fun and interesting to think seriously about music that is meant to be heard on the radio or danced to in clubs, the same way it is fun and interesting to think about crime novels or graphic design. For the critic, maybe more than for anyone else, it is important to remember that while a lot of great stuff is not popular, popular stuff can be great, too.
every good idea has a dumber version of itself on the internet. The dumb version of poptimism is the belief that anything sufficiently popular must be good. This idea is supported by certain structural forces, particularly the ability, through digitization, to count streams, pageviews, clicks, and other metrics so exactly that every artist and the music they release can be assigned a numerical value representing their popularity relative to everything else. The answer to the question “What do people like?” is right there on a chart, down to the ones digit, conclusively proving that, for example, Drake (74,706,786,894 lead streams) is more popular than The Weeknd (56,220,309,818 lead streams) on Spotify.
The question “What is good?” remains a matter of disagreement, but in the face of such precise numbers, how could you argue that the Weeknd was better? You would have to appeal to subjective aesthetic assessments (e.g. Drake’s combination of brand-checking and self-pity recreates neurasthenic consumer culture without transcending it) or socioeconomic context (e.g. Drake is a former child actor who raps about street life for listeners who want to romanticize black poverty without hearing from anyone actually affected by it, plus he’s Canadian) in a way that would ultimately just be your opinion. And who needs one jerk’s opinion when democracy is right there in the numbers?
This attitude is how you get criticism like “Why Normal Music Reviews No Longer Make Sense for Taylor Swift,” which cites streaming data (The Tortured Poets Department’s 314.5 million release-day streams versus Cowboy Carter’s 76.6 million) to argue that Swift is better understood not as a singer-songwriter but as an area of brand activity, along the lines of the Marvel Cinematic Universe or Star Wars. “The tepid music reviews often miss the fact that ‘music’ is something that Swift stopped selling long ago,” New Yorker contributor Sinéad O’Sullivan writes. “Instead, she has spent two decades building the foundation of a fan universe, filled with complex, in-sequence narratives that have been contextualized through multiple perspectives across eleven blockbuster installments. She is not creating standalone albums but, rather, a musical franchise.”
The fact that most cognitively normal adults regard these bands as children’s music is what makes their fan bases not just ticket-buyers but subcultures.
The power of the antagonist-subculture dynamic was realized by major record labels in the early 1990s, when the most popular music in America was called “alternative.”
For the person who is not into music—the person who just happens to be rapturously committed to the artists whose music you hear everywhere whether you want to or not, whose new albums are like iPhone releases and whose shows are like Disneyland—the critic is a foil.
·defector.com·
Culture Needs More Jerks | Defector
Can You Know Too Much About Your Organization?
Can You Know Too Much About Your Organization?

A study of six high-performing project teams redesigning their organizations' operations revealed:

  • Many organizations lack purposeful, integrated design
  • Systems often result from ad hoc solutions and uncoordinated decisions
  • Significant waste and redundancy in processes

The study challenges the notion that only peripheral employees push for significant organizational change. It highlights the potential consequences of exposing employees to full operational complexity and suggests organizations consider how to retain talent after redesign projects.

Despite being experienced managers, what they learned was eye-opening. One explained that “it was like the sun rose for the first time. … I saw the bigger picture.” They had never seen the pieces — the jobs, technologies, tools, and routines — connected in one place, and they realized that their prior view was narrow and fractured. A team member acknowledged, “I only thought of things in the context of my span of control.”
The maps of the organization generated by the project teams also showed that their organizations often lacked a purposeful, integrated design that was centrally monitored and managed. There may originally have been such a design, but as the organization grew, adapted to changing markets, brought on new leadership, added or subtracted divisions, and so on, this animating vision was lost. The original design had been eroded, patched, and overgrown with alternative plans. A manager explained, “Everything I see around here was developed because of specific issues that popped up, and it was all done ad hoc and added onto each other. It certainly wasn’t engineered.”
“They see problems, and the general approach, the human approach, is to try and fix them. … Functions have tried to put band-aids on every issue that comes up. It sounds good, but when they are layered one on top of the other they start to choke the organization. But they don’t see that because they are only seeing their own thing.”
Ultimately, the managers realized that what they had previously attributed to the direction and control of centralized, bureaucratic forces was actually the aggregation of the distributed work and uncoordinated decisions of people dispersed throughout the organization. Everyone was working on the part of the organization they were familiar with, assuming that another set of people were attending to the larger picture, coordinating the larger system to achieve goals and keeping the organization operating. Except no one was actually looking at how people’s work was connecting across the organization day-to-day.
as they felt a sense of empowerment about changing the organization, they felt a sense of alienation about returning to their central roles. “You really start understanding all of the waste and all of the redundancy and all of the people who are employed as what I call intervention resources,” one person told us.
In the end, a slight majority of the employees returned to their role to continue their career (25 cases). They either were promoted (7 cases), moved laterally (8 cases), or returned to their jobs (10 cases). However, 23 chose organizational change roles.
This study suggests that when companies undertake organizational change efforts, they should consider not only the implications for the organization, but also for the people tasked to do the work. Further, it highlights just how infrequently we recognize how poorly designed and managed many of our organizations really are. Not acknowledging the dysfunction of existing routines protects us from seeing how much of our work is not actually adding value, something that may lead simply to unsatisfying work, no less to larger questions about the nature of organizational design similar to those asked by the managers in my study. Knowledge of the systems we work in can be a source of power, yes. But when you realize you can’t affect the big changes your organization needs, it can also be a source of alienation.
·archive.is·
Can You Know Too Much About Your Organization?
research as leisure activity
research as leisure activity
The idea of research as leisure activity has stayed with me because it seems to describe a kind of intellectual inquiry that comes from idiosyncratic passion and interest. It’s not about the formal credentials. It’s fundamentally about play. It seems to describe a life where it’s just fun to be reading, learning, writing, and collaborating on ideas.
Research as a leisure activity includes the qualities I described above: a desire to ask and answer questions, a commitment to evidence, an understanding of what already exists, an output, a certain degree of contemporary relevance, and a community. But it also involves the following qualities:
Research as leisure activity is directed by passions and instincts. It’s fundamentally very personal: What are you interested in now? It’s fine, and maybe even better, if the topic isn’t explicitly intellectual or academic in nature. And if one topic leads you to another topic that seems totally unrelated, that’s something to get excited about—not fearful of. It’s a style of research that is well-suited for people okay with being dilettantes, who are comfortable with an idiosyncratic, non-comprehensive education in a particular domain.
Who is doing this kind of research as leisure activity? Artists, often. To return to the site that originally inspired this post—I’d say that the artist/designer/educator Laurel Schwulst uses Are.na to develop and refine particular themes, directions, topics of inquiry…some of which become artworks or essays or classes that she teaches.
People who read widely and attentively—and then publish the results of their reading—are also arguably performing research as a leisure activity. Maria Popova, who started writing a blog in 2006—now called The Marginalian—which collects her reading across literature, philosophy, psychology, the sciences. Her blog feels like leisurely research, to me, because it’s an accumulation of curious, semi-directed reading, which over time build up into a dense network of references and ideas—supported by previous reading, and enriched by her own commentary and links to similar ideas by other thinkers.
pretty much every writer, essayist, “cultural critic,” etc—especially someone who’s writing more as a vocation than a profession—has research as their leisure activity. What they do for pleasure (reading books, seeing films, listening to music) shades naturally and inevitably into what they want to write about, and the things they consume for leisure end up incorporated into some written work.
What’s also striking to me is that autodidacts often begin with some very tiny topic, and through researching that topic, they end up telescoping out into bigger-picture concerns. When research is your leisure activity, you’ll end up making connections between your existing interests and new ideas or topics. Everything gets pulled into the orbit of your intellectual curiosity. You can go deeper and deeper into a narrow topic, one that seems fascinatingly trivial and end up learning about the big topics: gender, culture, economics, nationalism, colonialism. It’s why fashion writers end up writing about the history of gender identity (through writing about masculine/feminine clothing) and cross-cultural exchange (through writing about cultural appropriation and styles borrowed from other times and places) and historical trade networks (through writing about where textiles come from).
·personalcanon.com·
research as leisure activity
How to Make a Great Government Website—Asterisk
How to Make a Great Government Website—Asterisk
Summary: Dave Guarino, who has worked extensively on improving government benefits programs like SNAP in California, discusses the challenges and opportunities in civic technology. He explains how a simplified online application, GetCalFresh.org, was designed to address barriers that prevent eligible people from accessing SNAP benefits, such as a complex application process, required interviews, and document submission. Guarino argues that while technology alone cannot solve institutional problems, it provides valuable tools for measuring and mitigating administrative burdens. He sees promise in using large language models to help navigate complex policy rules. Guarino also reflects on California's ambitious approach to benefits policy and the structural challenges, like Prop 13 property tax limits, that impact the state's ability to build up implementation capacity.
there are three big categories of barriers. The application barrier, the interview barrier, and the document barrier. And that’s what we spent most of our time iterating on and building a system that could slowly learn about those barriers and then intervene against them.
The application is asking, “Are you convicted of this? Are you convicted of that? Are you convicted of this other thing?” What is that saying to you, as a person, about what the system thinks of you?
Often they’ll call from a blocked number. They’ll send you a notice of when your interview is scheduled for, but this notice will sometimes arrive after the actual date of the interview. Most state agencies are really slammed right now for a bunch of reasons, including Medicaid unwinding. And many of the people assisting on Medicaid are the same workers who process SNAP applications. If you missed your phone interview, you have to call to reschedule it. But in many states, you can’t get through, or you have to call over and over and over again. For a lot of people, if they don’t catch that first interview call, they’re screwed and they’re not going to be approved.
getting to your point about how a website can fix this —  the end result was lowest-burden application form that actually gets a caseworker what they need to efficiently and effectively process it. We did a lot of iteration to figure out that sweet spot.
We didn’t need to do some hard system integration that would potentially take years to develop — we were just using the system as it existed. Another big advantage was that we had to do a lot of built-in data validation because we could not submit anything that was going to fail the county application. We discovered some weird edge cases by doing this.
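A small sketch of what that kind of pre-submission validation can look like. The specific rules below are hypothetical; in practice, the constraints are reverse-engineered from whatever the county system actually rejects.

```python
import re

def validate_application(app: dict) -> list[str]:
    """Return a list of problems that would cause the downstream county system
    to reject the submission. The rules here are invented examples."""
    errors = []
    if not app.get("last_name", "").strip():
        errors.append("last_name is required")
    if not re.fullmatch(r"\d{5}(-\d{4})?", app.get("zip_code", "")):
        errors.append("zip_code must be a 5-digit ZIP (optionally ZIP+4)")
    if app.get("ssn") and not re.fullmatch(r"\d{3}-\d{2}-\d{4}", app["ssn"]):
        errors.append("ssn, if provided, must look like 123-45-6789")
    if len(app.get("household_members", [])) < 1:
        errors.append("at least one household member is required")
    return errors

# Only submit once the application passes every known downstream constraint.
errors = validate_application({"last_name": "Rivera", "zip_code": "94110",
                               "household_members": [{}]})
assert errors == []
```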
A lot of times when you want to build a new front end for these programs, it becomes this multiyear, massive project where you’re replacing everything all at once. But if you think about it, there’s a lot of potential in just taking the interfaces you have today, building better ones on top of them, and then using those existing ones as the point of integration.
Government tends to take a more high-modernist approach to the software it builds, which is like “we’re going to plan and know up front how everything is, and that way we’re never going to have to make changes.” In terms of accreting layers — yes, you can get to that point. But I think a lot of the arguments I hear that call for a fundamental transformation suffer from the same high-modernist thinking that is the source of much of the status quo.
If you slowly do this kind of stuff, you can build resilient and durable interventions in the system without knocking it over wholesale. For example, I mentioned procedural denials. It would be adding regulations, it would be making technology systems changes, blah, blah, blah, to have every state report why people are denied, at what rate, across every state up to the federal government. It would take years to do that, but that would be a really, really powerful change in terms of guiding feedback loops that the program has.
Guarino argues that attempts to fundamentally transform government technology often suffer from the same "high-modernist" thinking that created problematic legacy systems in the first place. He advocates for incremental improvements that provide better measurement and feedback loops.
when you start to read about civic technology, it very, very quickly becomes clear that things that look like they are tech problems are actually about institutional culture, or about policy, or about regulatory requirements.
If you have an application where you think people are struggling, you can measure how much time people take on each page. A lot of what technology provides is more rigorous measurement of the burdens themselves. A lot of these technologies have been developed in commercial software because there’s such a massive incentive to get people who start a transaction to finish it. But we can transplant a lot of those into government services and have orders of magnitude better situational awareness.
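A minimal example of the kind of burden measurement described: deriving per-page dwell times from timestamped page-view events and ranking pages by how long people get stuck on them. The event schema and page names here are made up.

```python
from collections import defaultdict
from statistics import median

# Assumed event schema: (session_id, page, timestamp_seconds).
events = [
    ("s1", "household", 0.0), ("s1", "income", 95.0), ("s1", "expenses", 410.0),
    ("s2", "household", 0.0), ("s2", "income", 60.0), ("s2", "expenses", 150.0),
]

by_session = defaultdict(list)
for session, page, ts in events:
    by_session[session].append((ts, page))

# Dwell time on a page = gap until the next page view in the same session
# (the final page of each session is dropped for simplicity).
time_on_page = defaultdict(list)
for visits in by_session.values():
    visits.sort()
    for (t0, page), (t1, _) in zip(visits, visits[1:]):
        time_on_page[page].append(t1 - t0)

# Pages with high median dwell time are candidates for the heaviest burden.
for page, durations in sorted(time_on_page.items(), key=lambda kv: -median(kv[1])):
    print(f"{page}: median {median(durations):.0f}s over {len(durations)} visits")
```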
There’s this starting point thesis: Tech can solve these government problems, right? There’s healthcare.gov and the call to bring techies into government, blah, blah, blah. Then there’s the antithesis, where all these people say, well, no, it’s institutional problems. It’s legal problems. It’s political problems. I think either is sort of an extreme distortion of reality. I see a lot of more oblique levers that technology can pull in this area.
LLMs seem to be a fundamental breakthrough in manipulating words, and at the end of the day, a lot of government is words. I’ve been doing some active experimentation with this because I find it very promising. One common question people have is, “Who’s in my household for the purposes of SNAP?” That’s actually really complicated when you think about people who are living in poverty — they might be staying with a neighbor some of the time, or have roommates but don’t share food, or had to move back home because they lost their job.
I’ve been taking verbatim posts from Reddit that are related to the household question and inputting them into LLMs with some custom prompts that I’ve been iterating on, as well as with the full verbatim federal regulations about household definition. And these models do seem pretty capable at doing some base-level reasoning over complex, convoluted policy words in a way that I think could be really promising.
caseworkers are spending a lot of their time figuring out, wait, what rule in this 200-page policy manual is actually relevant in this specific circumstance? I think LLMs are going to be really impactful there.
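A rough sketch of the experiment described: pair the verbatim federal household-definition regulation with an applicant's own words and ask a model to reason over both. The prompt wording, file name, and `call_llm` stub are placeholders, not Guarino's actual prompts or tooling.

```python
def build_prompt(regulations: str, situation: str) -> str:
    """Pair the verbatim household-definition regulation with an applicant's own words."""
    return (
        "You are helping a SNAP caseworker apply the federal household definition.\n\n"
        f"Regulation text (verbatim):\n{regulations}\n\n"
        f"Applicant's situation, in their own words:\n{situation}\n\n"
        "1. Who is in the SNAP household, and who is excluded?\n"
        "2. Which specific provisions support each inclusion or exclusion?\n"
        "3. What follow-up facts would change the answer?"
    )

def call_llm(prompt: str) -> str:
    """Placeholder for whichever hosted model is being evaluated."""
    raise NotImplementedError

# Usage sketch (file name and situation are invented examples):
# regs = open("household_definition_regs.txt").read()
# situation = ("I lost my job and moved back in with my mom. I buy my own groceries, "
#              "but sometimes we cook together. My cousin also stays a few nights a week.")
# print(call_llm(build_prompt(regs, situation)))
```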
It is certainly the case that I’ve seen some productive tensions in counties where there’s more of a mix of that and what you might consider California-style Republicans who are like, “We want to run this like a business, we want to be efficient.” That tension between efficiency and big, ambitious policies can be a healthy, productive one. I don’t know to what extent that exists at the state level, and I think there’s hints of more of an interest in focusing on state-level government working better and getting those fundamentals right, and then doing the more ambitious things on a more steady foundation.
California seemed to really try to take every ambitious option that the feds give us on a whole lot of fronts. I think the corollary of that is that we don’t necessarily get the fundamental operational execution of these programs to a strong place, and we then go and start adding tons and tons of additional complexity on top of them.
·asteriskmag.com·
How to Make a Great Government Website—Asterisk
Mapping the Mind of a Large Language Model
Mapping the Mind of a Large Language Model
Summary: Anthropic has made a significant advance in understanding the inner workings of large language models by identifying how millions of concepts are represented inside Claude Sonnet, one of their deployed models. This is the first detailed look inside a modern, production-grade large language model. The researchers used a technique called "dictionary learning" to isolate patterns of neuron activations that recur across many contexts, allowing them to map features to human-interpretable concepts. They found features corresponding to a vast range of entities, abstract concepts, and even potentially problematic behaviors. By manipulating these features, they were able to change the model's responses. Anthropic hopes this interpretability discovery could help make AI models safer in the future by monitoring for dangerous behaviors, steering models towards desirable outcomes, enhancing safety techniques, and providing a "test set for safety". However, much more work remains to be done to fully understand the representations the model uses and how to leverage this knowledge to improve safety.
We mostly treat AI models as a black box: something goes in and a response comes out, and it's not clear why the model gave that particular response instead of another. This makes it hard to trust that these models are safe: if we don't know how they work, how do we know they won't give harmful, biased, untruthful, or otherwise dangerous responses? How can we trust that they’ll be safe and reliable? Opening the black box doesn't necessarily help: the internal state of the model—what the model is "thinking" before writing its response—consists of a long list of numbers ("neuron activations") without a clear meaning. From interacting with a model like Claude, it's clear that it’s able to understand and wield a wide range of concepts—but we can't discern them from looking directly at neurons. It turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts.
Just as every English word in a dictionary is made by combining letters, and every sentence is made by combining words, every feature in an AI model is made by combining neurons, and every internal state is made by combining features.
In October 2023, we reported success applying dictionary learning to a very small "toy" language model and found coherent features corresponding to concepts like uppercase text, DNA sequences, surnames in citations, nouns in mathematics, or function arguments in Python code.
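A toy illustration of the dictionary-learning idea using scikit-learn, with random vectors standing in for recorded activations: each activation gets approximated as a sparse combination of learned feature directions. Anthropic's actual work trains sparse autoencoders over real model activations at vastly larger scale; this only shows the shape of the computation.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in for residual-stream activations: n_samples vectors of width d_model.
rng = np.random.default_rng(0)
activations = rng.normal(size=(2000, 64))   # real work would use recorded model activations

# Learn an overcomplete dictionary of "features": each activation is approximated
# as a sparse combination of a few feature directions.
dl = DictionaryLearning(n_components=256, transform_algorithm="lasso_lars",
                        transform_alpha=0.1, max_iter=20, random_state=0)
codes = dl.fit_transform(activations)       # (2000, 256) sparse coefficients per sample
features = dl.components_                   # (256, 64) feature directions

# "Which features fire on this activation?" = the largest coefficients in its code.
example = codes[0]
top = np.argsort(-np.abs(example))[:5]
print("top features for sample 0:", top, example[top].round(3))
```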
We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet, (a member of our current, state-of-the-art model family, currently available on claude.ai), providing a rough conceptual map of its internal states halfway through its computation.
We also find more abstract features—responding to things like bugs in computer code, discussions of gender bias in professions, and conversations about keeping secrets.
We were able to measure a kind of "distance" between features based on which neurons appeared in their activation patterns. This allowed us to look for features that are "close" to each other. Looking near a "Golden Gate Bridge" feature, we found features for Alcatraz Island, Ghirardelli Square, the Golden State Warriors, California Governor Gavin Newsom, the 1906 earthquake, and the San Francisco-set Alfred Hitchcock film Vertigo.
This holds at a higher level of conceptual abstraction: looking near a feature related to the concept of "inner conflict", we find features related to relationship breakups, conflicting allegiances, logical inconsistencies, as well as the phrase "catch-22". This shows that the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity. This might be the origin of Claude's excellent ability to make analogies and metaphors.
amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query—even in situations where it wasn’t at all relevant.
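A minimal sketch of what that kind of feature amplification looks like mechanically, written as a PyTorch forward hook that adds a scaled feature direction to a layer's output. The layer index, direction, strength, and tensor shape are placeholders; doing this for real requires access to the model's internals and the learned feature dictionary, which Claude itself does not expose.

```python
import torch

def make_steering_hook(feature_direction: torch.Tensor, strength: float):
    """Return a forward hook that pushes a layer's activations along one feature direction."""
    direction = feature_direction / feature_direction.norm()

    def hook(module, inputs, output):
        # Assumes `output` is the layer's hidden states, shape (batch, seq, d_model).
        return output + strength * direction

    return hook

# Usage sketch, assuming `model` is an open-weights transformer and `feature_direction`
# is a decoder row from a learned feature dictionary for layer 20's residual stream:
# handle = model.layers[20].register_forward_hook(make_steering_hook(feature_direction, 8.0))
# ... generate text and observe the model's new obsession ...
# handle.remove()
```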
Anthropic wants to make models safe in a broad sense, including everything from mitigating bias to ensuring an AI is acting honestly to preventing misuse - including in scenarios of catastrophic risk. It’s therefore particularly interesting that, in addition to the aforementioned scam emails feature, we found features corresponding to:
  • Capabilities with misuse potential (code backdoors, developing biological weapons)
  • Different forms of bias (gender discrimination, racist claims about crime)
  • Potentially problematic AI behaviors (power-seeking, manipulation, secrecy)
finding a full set of features using our current techniques would be cost-prohibitive (the computation required by our current approach would vastly exceed the compute used to train the model in the first place). Understanding the representations the model uses doesn't tell us how it uses them; even though we have the features, we still need to find the circuits they are involved in. And we need to show that the safety-relevant features we have begun to find can actually be used to improve safety. There's much more to be done.
·anthropic.com·
Mapping the Mind of a Large Language Model
Design is compromise
Design is compromise
Having an opinionated set of tradeoffs exposes your approach to a set of weaknesses. The more you tip the scale on one side, the weaker something else will be. That’s okay! Making those difficult choices is what people pay you for. You should be proud of your compromises. My favorite products are opinionated. They make a clear statement about what they are not good at, in favor of being much better at something else.
·stephango.com·
Design is compromise
AI Integration and Modularization
AI Integration and Modularization
Summary: Examines the question of integration versus modularization in AI, drawing on the work of economists Ronald Coase and Clayton Christensen. Google is pursuing a fully integrated approach similar to Apple's, while AWS is betting on modularization, and Microsoft and Meta sit somewhere in between. Integration may provide an advantage in the consumer market and for achieving AGI, but for enterprise AI a more modular approach that leverages data gravity and treats models as commodities may prevail. Ultimately, the biggest beneficiary of this dynamic could be Nvidia.
The left side of figure 5-1 indicates that when there is a performance gap — when product functionality and reliability are not yet good enough to address the needs of customers in a given tier of the market — companies must compete by making the best possible products. In the race to do this, firms that build their products around proprietary, interdependent architectures enjoy an important competitive advantage against competitors whose product architectures are modular, because the standardization inherent in modularity takes too many degrees of design freedom away from engineers, and they cannot optimize performance.
The issue I have with this analysis of vertical integration — and this is exactly what I was taught at business school — is that the only considered costs are financial. But there are other, more difficult to quantify costs. Modularization incurs costs in the design and experience of using products that cannot be overcome, yet cannot be measured. Business buyers — and the analysts who study them — simply ignore them, but consumers don’t. Some consumers inherently know and value quality, look-and-feel, and attention to detail, and are willing to pay a premium that far exceeds the financial costs of being vertically integrated.
Google trains and runs its Gemini family of models on its own TPU processors, which are only available on Google’s cloud infrastructure. Developers can access Gemini through Vertex AI, Google’s fully-managed AI development platform; and, to the extent Vertex AI is similar to Google’s internal development environment, that is the platform on which Google is building its own consumer-facing AI apps. It’s all Google, from top-to-bottom, and there is evidence that this integration is paying off: Gemini 1.5’s industry leading 2 million token context window almost certainly required joint innovation between Google’s infrastructure team and its model-building team.
In AI, Google is pursuing an integrated strategy, building everything from chips to models to applications, similar to Apple's approach in smartphones.
On the other extreme is AWS, which doesn’t have any of its own models; instead its focus has been on its Bedrock managed development platform, which lets you use any model. Amazon’s other focus has been on developing its own chips, although the vast majority of its AI business runs on Nvidia GPUs.
Microsoft is in the middle, thanks to its close ties to OpenAI and its models. The company added Azure Models-as-a-Service last year, but its primary focus for both external customers and its own internal apps has been building on top of OpenAI’s GPT family of models; Microsoft has also launched its own chip for inference, but the vast majority of its workloads run on Nvidia.
Google is certainly building products for the consumer market, but those products are not devices; they are Internet services. And, as you might have noticed, the historical discussion didn’t really mention the Internet. Both Google and Meta, the two biggest winners of the Internet epoch, built their services on commodity hardware. Granted, those services scaled thanks to the deep infrastructure work undertaken by both companies, but even there Google’s more customized approach has been at least rivaled by Meta’s more open approach. What is notable is that both companies are integrating their models and their apps, as is OpenAI with ChatGPT.
Google's integrated AI strategy is unique but may not provide a sustainable advantage for Internet services in the way Apple's integration does for devices
It may be the case that selling hardware, which has to be perfect every year to justify a significant outlay of money by consumers, provides a much better incentive structure for maintaining excellence and execution than does being an Aggregator that users access for free.
Google’s collection of moonshots — from Waymo to Google Fiber to Nest to Project Wing to Verily to Project Loon (and the list goes on) — have mostly been science projects that have, for the most part, served to divert profits from Google Search away from shareholders. Waymo is probably the most interesting, but even if it succeeds, it is ultimately a car service rather far afield from Google’s mission statement “to organize the world’s information and make it universally accessible and useful.”
The only thing that drives meaningful shifts in platform marketshare are paradigm shifts, and while I doubt the v1 version of Pixie [Google’s rumored Pixel-only AI assistant] would be good enough to drive switching from iPhone users, there is at least a path to where it does exactly that.
the fact that Google is being mocked mercilessly for messed-up AI answers gets at why consumer-facing AI may be disruptive for the company: the reason why incumbents find it hard to respond to disruptive technologies is because they are, at least at the beginning, not good enough for the incumbent’s core offering. Time will tell if this gives more fuel to a shift in smartphone strategies, or makes the company more reticent.
while I was very impressed with Google’s enterprise pitch, which benefits from its integration with Google’s infrastructure without all of the overhead of potentially disrupting the company’s existing products, it’s going to be a heavy lift to overcome data gravity, i.e. the fact that many enterprise customers will simply find it easier to use AI services on the same clouds where they already store their data (Google does, of course, also support non-Gemini models and Nvidia GPUs for enterprise customers). To the extent Google wins in enterprise it may be by capturing the next generation of startups that are AI first and, by definition, data light; a new company has the freedom to base its decision on infrastructure and integration.
Amazon is certainly hoping that argument is correct: the company is operating as if everything in the AI value chain is modular and ultimately a commodity, which insinuates that it believes that data gravity will matter most. What is difficult to separate is to what extent this is the correct interpretation of the strategic landscape versus a convenient interpretation of the facts that happens to perfectly align with Amazon’s strengths and weaknesses, including infrastructure that is heavily optimized for commodity workloads.
Unclear if Amazon's strategy is based on true insight or motivated reasoning based on their existing strengths
Meta’s open source approach to Llama: the company is focused on products, which do benefit from integration, but there are also benefits that come from widespread usage, particularly in terms of optimization and complementary software. Open source accrues those benefits without imposing any incentives that detract from Meta’s product efforts (and don’t forget that Meta is receiving some portion of revenue from hyperscalers serving Llama models).
The iPhone maker, like Amazon, appears to be betting that AI will be a feature or an app; like Amazon, it’s not clear to what extent this is strategic foresight versus motivated reasoning.
achieving something approaching AGI, whatever that means, will require maximizing every efficiency and optimization, which rewards the integrated approach.
the most value will be derived from building platforms that treat models like processors, delivering performance improvements to developers who never need to know what is going on under the hood.
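A sketch of what "treating models like processors" could mean at the code level: application code asks a router for a capability tier and never learns which model sits underneath. The class and tier names are illustrative, not any particular platform's API.

```python
from typing import Protocol

class TextModel(Protocol):
    """Anything that can turn a prompt into text, regardless of vendor or where it runs."""
    def complete(self, prompt: str, max_tokens: int = 512) -> str: ...

class ModelRouter:
    """Application code asks for a capability tier; the platform decides which model serves it."""
    def __init__(self) -> None:
        self._models: dict[str, TextModel] = {}

    def register(self, tier: str, model: TextModel) -> None:
        self._models[tier] = model

    def complete(self, tier: str, prompt: str, **kwargs) -> str:
        return self._models[tier].complete(prompt, **kwargs)

# A developer writes router.complete("fast-cheap", prompt) and never needs to know whether
# that tier is backed by an in-house model, an open-weights model, or a vendor API --
# much as most code never cares which CPU it runs on.
```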
·stratechery.com·
AI Integration and Modularization