351 bookmarks

Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents

Highlights:

— AI simply decides to close the business, which the simulation doesn’t know how to accommodate. When they get their next bill, they freak out and try to email the FBI about cybercrime

— AI wrongly accuses supplier of not shipping goods, sends all-caps legal threat demanding $30,000 in damages to be paid in the next one second or face annihilation

— AI repeatedly insisting it does not exist and cannot answer

— AI devolving into writing fanfic about the mess it’s gotten itself into

·arxiv.org·May 26, 2025

AI Slop PR's are burning me and my team out hard, anyone else experiencing this? : r/ExperiencedDevs

The shape of things to come… and it's not a good shape.

·reddit.com·May 24, 2025

Absolute Nonsense from Anthropic: Sleeper Agents | BIML

And in the land where I grew up / Into the bosom of technology / I kept my feelings to myself / Until the perfect moment comes -D

·berryvilleiml.com·May 23, 2025

Anthropic's new AI model turns to blackmail when engineers try to take it offline | TechCrunch

Anthropic says its Claude Opus 4 model frequently tries to blackmail software engineers when they try to take it offline.

·techcrunch.com·May 23, 2025

Michael Hein (@drmichaelhein@troet.cafe)

Anyone who actually considers so-called "artificial intelligence" to be intelligent should simply have it draw a map, for example of Germany and its federal states with their capitals.

KI #KIfails #ChatGPT #KünstlicheIntelligenz

·troet.cafe·May 23, 2025

The Copilot Delusion

Disclaimer: This post was written in May 2025, and the arguments apply to AI code capabilities at this time. The arguments around lack of competence are certainly likely to become less prevalent, while the parts about the desecration of the joys of programming, and fundamental human understanding of programming, are likely to become

·deplet.ing·May 23, 2025

"Dystopian": Applicants report absurd job interviews with AI bots

Recordings of job interviews that barely deserve the name are currently going viral on TikTok

·derstandard.at·May 21, 2025

We did the math on AI’s energy footprint. Here’s the story you haven’t heard.

The emissions from individual AI text, image, and video queries seem small—until you add up what the industry isn’t tracking and consider where it’s heading next.

·technologyreview.com·May 21, 2025

Problem at universities: How do students prove that their work was not written by an AI? - t3n – digital pioneers

AI detectors are meant to track down texts at universities that were written by an artificial intelligence rather than by the students themselves. But what if the tools get it wrong? Students are already preparing for the worst case. Artificial intelligence is a contentious topic at universities. On the one hand, lecturers are being driven mad because students cheat with ChatGPT. […]

·t3n.de·May 20, 2025

Remarks on AI from NZ

Last week I participated in a panel discussion on AI as part of a private event in New Zealand.

·nealstephenson.substack.com·May 19, 2025

Ruled by the Representation Space: On the University’s Embrace of Large Language Models

#university

·arxiv.org·May 19, 2025

Court order: OpenAI may no longer delete user conversations with ChatGPT – Steiger Legal

Due to a court order in the USA, OpenAI is provisionally no longer allowed to delete conversations from the use of its APIs and ChatGPT. As a result, OpenAI must involuntarily … billions of …

·steigerlegal.ch·May 18, 2025

Yes, LLMs Can Be Better at Search Than Traditional Search

It takes a specialized prompt, but LLMs can significantly outperform search.

·mikecaulfield.substack.com·May 18, 2025

Why MCP turns AI usage on its head

AI can now directly operate software such as Blender, GitHub, or Slack; the new MCP interface makes it possible. We took a closer look at MCP.

#video

·heise.de·May 17, 2025

College Professors Are Using ChatGPT. Some Students Aren’t Happy.

Students call it hypocritical. A senior at Northeastern University demanded her tuition back. But instructors say generative A.I. tools make them better at their jobs.

·nytimes.com·May 14, 2025

Opinion | Is AI Enhancing Education or Replacing It?

archived 13 May 2025 23:56:08 UTC

·archive.is·May 14, 2025

Turnaround in customer support: Klarna now has an AI hangover - Golem.de

Klarna is reversing its AI strategy and returning to personal customer support after the quality of the AI systems apparently fell short.

·golem.de·May 10, 2025

No AI without nuclear energy: Google plans three nuclear power plants for its data centers

AI is enormously energy-hungry. To be able to increase the capacity of its data centers, Google wants to rely even more heavily on nuclear energy.

·golem.de·May 8, 2025

"I can't breathe": Musk's supercomputer is taking the air from people in Memphis

To run its Gigafactory in Tennessee, xAI relies on gas turbines that, thanks to a legal loophole, require no permits

·derstandard.at·May 7, 2025

Guessing Locations (poorly) With AI - OEGlobal Plaza - OE Global Connect

Behold the power of the newest (this minute) o3 OpenAI model - it REASONS (not, it acts like it does). I read AI is getting “creepy good” at geo-guessing from Malwarebytes and the article it referenced, You can’t hide from ChatGPT – new viral AI challenge can geo-locate you from almost any photo – we tried it and it’s wild and worrisome. Oh no, Mr. Bill (arcane SNL reference). Not that it proves anything, but I tried it, it was far from wild, and my worry level is lukewarm. But nor does my exp...

·connect.oeglobal.org·May 7, 2025

AI: Public to pay for power plants for Meta data center - Golem.de

Meta Platforms is building its largest data center in Louisiana. The utility wants to pass the costs of several new gas-fired power plants on to all customers.

·golem.de·May 4, 2025

Google, Amazon & Microsoft: Ways out of the US tech trap | NDR Info

Digital sovereignty: How Germany is freeing itself from US tech giants. According to a study by the digital association Bitkom, more than 80 percent of German comp...

·youtube.com·May 1, 2025

Gratian (er | ihm) (@GratianRiter@bildung.social)

@joschafalck Thanks! Good that these topics are now slowly coming onto the stage, where they belong. I have written about this: Here on resources, energy, and didactics: https://seagent.de/llms-und-bildgeneratoren-in-der-schule-kibedenken/ On the political dimension of language models: https://seagent.de/ki-als-logisch-semantische-cloud-logisch-semantische-souveraenitaet/ And on the question of why one does not "collaborate" or "co-create" with ChatGPT: https://seagent.de/die-organisation-von-arbeit-ist-politisch-warum-wir-nicht-mit-chatgpt-kokreieren/ Bonus track - a little fairy tale: https://seagent.de/der-supergolem-aus-silicon-valley-die-geburt-des-datenzentrums-aus-dem-geiste-des-oligarchen/

·bildung.social·Apr 30, 2025

A cheat sheet for why using ChatGPT is not bad for the environment

Arm yourself with knowledge

·andymasley.substack.com·Apr 29, 2025

Getting a feeling for how much energy AI uses by running it on my laptop - 82MHz

·82mhz.net·Apr 27, 2025

Adam Jacobs 🇺🇦 (@statsguy@mas.to)

Oh gosh, it's true, you really can enter a completely nonsense phrase into Google, ask for its meaning, and lo and behold, Google's AI will make shit up. So if you've ever wondered what "to grow an avocado, you have to slap the squirrel" means, now you know. #AI #Google #Hallucinations

·mas.to·Apr 24, 2025

SoekiaGPT - The didactic language model

SoekiaGPT is a text generator designed specifically for the classroom. With SoekiaGPT you can look behind the scenes and get to know some basic principles of text generators such as ChatGPT.

·soekia.ch·Apr 24, 2025
