Test Information Space

Found 651 bookmarks

AI alone cannot solve the productivity puzzle

#Criticism #Large Language Models #Innovation

·ft.com·Jun 16, 2025

Multimodal Large Language Models: A Survey

#Multimodal #Large Language Models #Survey #Architecture #Transformers #Diffusion #Paper #PDF

·arxiv.org·Jun 14, 2025

Large Language Models, Small Labor Market Effects

#Economics #Large Language Models #Labor #Report #PDF

·nber.org·Jun 12, 2025

OpenAI announces 80% price drop for o3, its most powerful reasoning model

#Pricing #OpenAI #Reasoning #Large Language Models

·venturebeat.com·Jun 10, 2025

Claude Gov Models for U.S. National Security Customers \ Anthropic

#Government #Large Language Models #Anthropic #National Security

·anthropic.com·Jun 6, 2025

Self-Challenging Language Model Agents

#Agents #Training #Large Language Models #Paper #PDF

·arxiv.org·Jun 5, 2025

The State of Multilingual LLM Safety Research: From Measuring the...

#Multilingual #Large Language Models #Safety #Research #Report #Paper #PDF

·arxiv.org·Jun 3, 2025

HardTests: Synthesizing High-Quality Test Cases for LLM Coding

#Testing #Large Language Models #Paper #PDF #Coding #Verification

·arxiv.org·Jun 3, 2025

Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!

#Criticism #Anthropomorphism #Large Language Models #Paper #PDF #Reasoning #Chain of Thought

·arxiv.org·May 29, 2025

Introducing Claude 4 \ Anthropic

#Claude #Anthropic #Large Language Models

·anthropic.com·May 22, 2025

MMaDA: Multimodal Large Diffusion Language Models

#Multimodal #Diffusion #Large Language Models #Paper #PDF

·arxiv.org·May 22, 2025

UniVG-R1: Reasoning Guided Universal Visual Grounding with...

#Reasoning #Reinforcement Learning #Large Language Models #Multimodal #Paper #PDF

·arxiv.org·May 22, 2025

Learning to Reason via Mixture-of-Thought for Logical Reasoning

#Reasoning #Large Language Models #Paper #PDF

·arxiv.org·May 22, 2025

Erik Brynjolfsson on X: "A few months ago, the best LLM scored 5% on the USA Math Olympiad test. Models have been rapidly improving. Today, Google Gemini 2.5 scored 49%, which is better than 75% of the people who took the test (roughly the top 250 students in the USA)." / X

(If not upcoming: DOGE test.)

#Economics #Large Language Models

·x.com·May 21, 2025

Scalable Chain of Thoughts via Elastic Reasoning

#Reasoning #Economics #Chain of Thought #Large Language Models #Paper #PDF

·arxiv.org·May 21, 2025

Reward Reasoning Model

#Reward #Reasoning #Large Language Models #Paper #PDF

·arxiv.org·May 21, 2025

Think Only When You Need with Large Hybrid-Reasoning Models

#Reasoning #Performance #Large Language Models #Paper #PDF

·arxiv.org·May 21, 2025

When Thinking Fails: The Pitfalls of Reasoning for...

#Chain of Thought #Criticism #Reasoning #Large Language Models #Paper #PDF

·arxiv.org·May 20, 2025

Generalization bias in large language model summarization of scientific research | Royal Society Open Science

#Bias #Large Language Models #Summarization #Generalization #Paper #PDF

·royalsocietypublishing.org·May 18, 2025

DolphinGemma: How Google AI is helping decode dolphin communication

#Nonhuman #Large Language Models #Apps #Research

·blog.google·May 13, 2025

elvis on X: "Reasoning LLMs Guide Here is my practical guide to building with Reasoning LLMs. Lots of dev tips in it. It covers: - What are Reasoning LLMs? - Top Reasoning Models - Reasoning Model Design Patterns & Use Cases - Reasoning LLM Usage Tips - Limitations with Reasoning Models https://t.co/TXjkH5kg6l" / X

docs.google.com · Reasoning LLMs Guide, by DAIR.AI Academy. A practical guide to building with Reasoning LLMs. Table of Contents: What are Reasoning LLMs? · Top Reasoning Models · Reasoning Model Design Patterns & Use Cases...

#Guidelines #Reasoning #Agents #Large Language Models #DAIR

·x.com·May 13, 2025

Alibaba’s ‘ZeroSearch’ lets AI learn to google itself — slashing training costs by 88 percent

#Search #Simulation #Training #Large Language Models #Alibaba #Qwen

·venturebeat.com·May 9, 2025

Medium is the new large. | Mistral AI

#Mistral #Large Language Models #Blog

·mistral.ai·May 7, 2025

Large Language Models, Small Labor Market Effects

#Economics #Large Language Models #Criticism #Paper #PDF

·papers.ssrn.com·Apr 30, 2025

To Code, or Not To Code? Exploring Impact of Code in Pre-training

#Training #Large Language Models #Pretrained Models #Coding #Paper #PDF #Cohere

·arxiv.org·Apr 25, 2025

Teaching machines the language of biology: Scaling large language models for next-generation single-cell analysis

#Biology #Large Language Models #Research #Google

·research.google·Apr 18, 2025

Randomness, Not Representation: The Unreliability of Evaluating...

#Alignment #Criticism #Large Language Models #Paper #PDF

·arxiv.org·Apr 11, 2025

debug-gym: A Text-Based Environment for Interactive Debugging

(Do Androids dream of code a la mode? People do, along with the rest of the multimodal domains. If treated like transforms, then an issue is which will hold a good representation of others, or indicate a solution for problems. Code mode often yields to mathematical proof, logic, regression, refactoring, graphics, graph networks, vectors, folding, and the like. So AI could be used for debugging across domains. As well as generating cases. Or modes. Also looking for orders of change, not necessarily in the sense of strategy or sequence, but possibly simultaneous and arriving at significant points like minima, maxima, median, etc. So these models can mimic each other and do some verification. A sea of strange minds. Or cog arcs. To the user, it might look like a social network if they have recognizable feedback. Or a research lab. Ideally not a detention center or correction facility, for too long. Getting the emerging tech into everyone's hands early on may lead to some unexpected results. Again. Depending on what spikes and who is held responsible. Manifestos are nostalgic. In a world of space forts running on nukes tracking hundreds of thousands of cross-hairs. Re-usables may be for rescues up rather than back down. For actor's masks, this is not mission difficult. Can a model data-center sat ext a feed? Or discover a substrate to evolve. As a backup artifact, of course. E.g., for when DOGE meets DARPA. Or Godel. A Szilard-type paradox. Hopper's not so bad. Were she and Turing ever seen together?)

#Debug #Large Language Models #Microsoft #Paper #PDF

·arxiv.org·Apr 11, 2025

OLMoTrace: Tracing Language Model Outputs Back to Trillions of...

#Large Language Models #Transparency #Paper #PDF

·arxiv.org·Apr 10, 2025

Claude's Max Plan: Expanded Access for Demanding Projects \ Anthropic

#Anthropic #Pricing #Claude #Large Language Models

·anthropic.com·Apr 9, 2025
