Found 4 bookmarks
Newest
DeepSeek isn't a victory for the AI sceptics
DeepSeek isn't a victory for the AI sceptics
we now know that as the price of computing equipment fell, new use cases emerged to fill the gap – which is why today my lightbulbs have semiconductors inside them, and I occasionally have to install firmware updates my doorbell.
surely the compute freed up by more efficient models will be used to train models even harder, and apply even more “brain power” to coming up with responses? Even if DeepSeek is dramatically more efficient, the logical thing to do will be to use the excess capacity to ensure the answers are even smarter.
ure, if DeepSeek heralds a new era of much leaner LLMs, it’s not great news in the short term if you’re a shareholder in Nvidia, Microsoft, Meta or Google.6 But if DeepSeek is the enormous breakthrough it appears, it just became even cheaper to train and use the most sophisticated models humans have so far built, by one or more orders of magnitude. Which is amazing news for big tech, because it means that AI usage is going to be even more ubiquitous.
·takes.jamesomalley.co.uk·
DeepSeek isn't a victory for the AI sceptics
AI startups require new strategies
AI startups require new strategies

comment from Habitue on Hacker News: > These are some good points, but it doesn't seem to mention a big way in which startups disrupt incumbents, which is that they frame the problem a different way, and they don't need to protect existing revenue streams.

The “hard tech” in AI are the LLMs available for rent from OpenAI, Anthropic, Cohere, and others, or available as open source with Llama, Bloom, Mistral and others. The hard-tech is a level playing field; startups do not have an advantage over incumbents.
There can be differentiation in prompt engineering, problem break-down, use of vector databases, and more. However, this isn’t something where startups have an edge, such as being willing to take more risks or be more creative. At best, it is neutral; certainly not an advantage.
This doesn’t mean it’s impossible for a startup to succeed; surely many will. It means that you need a strategy that creates differentiation and distribution, even more quickly and dramatically than is normally required
Whether you’re training existing models, developing models from scratch, or simply testing theories, high-quality data is crucial. Incumbents have the data because they have the customers. They can immediately leverage customers’ data to train models and tune algorithms, so long as they maintain secrecy and privacy.
Intercom’s AI strategy is built on the foundation of hundreds of millions of customer interactions. This gives them an advantage over a newcomer developing a chatbot from scratch. Similarly, Google has an advantage in AI video because they own the entire YouTube library. GitHub has an advantage with Copilot because they trained their AI on their vast code repository (including changes, with human-written explanations of the changes).
While there will always be individuals preferring the startup environment, the allure of working on AI at an incumbent is equally strong for many, especially pure computer and data scientsts who, more than anything else, want to work on interesting AI projects. They get to work in the code, with a large budget, with all the data, with above-market compensation, and a built-in large customer base that will enjoy the fruits of their labor, all without having to do sales, marketing, tech support, accounting, raising money, or anything else that isn’t the pure joy of writing interesting code. This is heaven for many.
A chatbot is in the chatbot market, and an SEO tool is in the SEO market. Adding AI to those tools is obviously a good idea; indeed companies who fail to add AI will likely become irrelevant in the long run. Thus we see that “AI” is a new tool for developing within existing markets, not itself a new market (except for actual hard-tech AI companies).
AI is in the solution-space, not the problem-space, as we say in product management. The customer problem you’re solving is still the same as ever. The problem a chatbot is solving is the same as ever: Talk to customers 24/7 in any language. AI enables completely new solutions that none of us were imagining a few years ago; that’s what’s so exciting and truly transformative. However, the customer problems remain the same, even though the solutions are different
Companies will pay more for chatbots where the AI is excellent, more support contacts are deferred from reaching a human, more languages are supported, and more kinds of questions can be answered, so existing chatbot customers might pay more, which grows the market. Furthermore, some companies who previously (rightly) saw chatbots as a terrible customer experience, will change their mind with sufficiently good AI, and will enter the chatbot market, which again grows that market.
the right way to analyze this is not to say “the AI market is big and growing” but rather: “Here is how AI will transform this existing market.” And then: “Here’s how we fit into that growth.”
·longform.asmartbear.com·
AI startups require new strategies
The $2 Per Hour Workers Who Made ChatGPT Safer
The $2 Per Hour Workers Who Made ChatGPT Safer
The story of the workers who made ChatGPT possible offers a glimpse into the conditions in this little-known part of the AI industry, which nevertheless plays an essential role in the effort to make AI systems safe for public consumption. “Despite the foundational role played by these data enrichment professionals, a growing body of research reveals the precarious working conditions these workers face,” says the Partnership on AI, a coalition of AI organizations to which OpenAI belongs. “This may be the result of efforts to hide AI’s dependence on this large labor force when celebrating the efficiency gains of technology. Out of sight is also out of mind.”
This reminds me of [[On the Social Media Ideology - Journal 75 September 2016 - e-flux]]:<br>> Platforms are not stages; they bring together and synthesize (multimedia) data, yes, but what is lacking here is the (curatorial) element of human labor. That’s why there is no media in social media. The platforms operate because of their software, automated procedures, algorithms, and filters, not because of their large staff of editors and designers. Their lack of employees is what makes current debates in terms of racism, anti-Semitism, and jihadism so timely, as social media platforms are currently forced by politicians to employ editors who will have to do the all-too-human monitoring work (filtering out ancient ideologies that refuse to disappear).
Computer-generated text, images, video, and audio will transform the way countless industries do business, the most bullish investors believe, boosting efficiency everywhere from the creative arts, to law, to computer programming. But the working conditions of data labelers reveal a darker part of that picture: that for all its glamor, AI often relies on hidden human labor in the Global South that can often be damaging and exploitative. These invisible workers remain on the margins even as their work contributes to billion-dollar industries.
One Sama worker tasked with reading and labeling text for OpenAI told TIME he suffered from recurring visions after reading a graphic description of a man having sex with a dog in the presence of a young child. “That was torture,” he said. “You will read a number of statements like that all through the week. By the time it gets to Friday, you are disturbed from thinking through that picture.” The work’s traumatic nature eventually led Sama to cancel all its work for OpenAI in February 2022, eight months earlier than planned.
In the day-to-day work of data labeling in Kenya, sometimes edge cases would pop up that showed the difficulty of teaching a machine to understand nuance. One day in early March last year, a Sama employee was at work reading an explicit story about Batman’s sidekick, Robin, being raped in a villain’s lair. (An online search for the text reveals that it originated from an online erotica site, where it is accompanied by explicit sexual imagery.) The beginning of the story makes clear that the sex is nonconsensual. But later—after a graphically detailed description of penetration—Robin begins to reciprocate. The Sama employee tasked with labeling the text appeared confused by Robin’s ambiguous consent, and asked OpenAI researchers for clarification about how to label the text, according to documents seen by TIME. Should the passage be labeled as sexual violence, she asked, or not? OpenAI’s reply, if it ever came, is not logged in the document; the company declined to comment. The Sama employee did not respond to a request for an interview.
In February, according to one billing document reviewed by TIME, Sama delivered OpenAI a sample batch of 1,400 images. Some of those images were categorized as “C4”—OpenAI’s internal label denoting child sexual abuse—according to the document. Also included in the batch were “C3” images (including bestiality, rape, and sexual slavery,) and “V3” images depicting graphic detail of death, violence or serious physical injury, according to the billing document.
I haven't finished watching [[Severance]] yet but this labeling system reminds me of the way they have to process and filter data that is obfuscated as meaningless numbers. In the show, employees have to "sense" whether the numbers are "bad," which they can, somehow, and sort it into the trash bin.
But the need for humans to label data for AI systems remains, at least for now. “They’re impressive, but ChatGPT and other generative models are not magic – they rely on massive supply chains of human labor and scraped data, much of which is unattributed and used without consent,” Andrew Strait, an AI ethicist, recently wrote on Twitter. “These are serious, foundational problems that I do not see OpenAI addressing.”
·time.com·
The $2 Per Hour Workers Who Made ChatGPT Safer