Found 1798 bookmarks
Curiosity - AI search for everything
The ultimate AI productivity app that protects your privacy. Bring all your apps and data into one AI-powered search and assistant. Get it for you and for your teams today.
·curiosity.ai·
How streaming LLM APIs work | Simon Willison’s TILs
I decided to have a poke around and see if I could figure out how the HTTP streaming APIs from the various hosted LLM providers actually worked. Here are my notes so far.
·til.simonwillison.net·
Introducing Contextual Retrieval \ Anthropic
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
·anthropic.com·
GitHub - ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level. - ictnlp/LLaMA-Omni
·github.com·
Sourcetable | Your AI Data Analyst
Sourcetable is an AI spreadsheet that helps you analyze data and create reports. Chat with your data, create charts and graphs, build financial models, + more.
·sourcetable.com·
Introducing Contextual Retrieval
Here's an interesting new embedding/RAG technique, described by Anthropic but it should work for any embedding model against any other LLM. One of the big challenges in implementing semantic search …
·simonwillison.net·
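Contextual Retrieval, as Anthropic describes it, prepends a short LLM-generated situating sentence to each chunk before it is embedded or BM25-indexed. A minimal sketch of that pipeline — `generate_context` and `fake_llm` are hypothetical stand-ins for the actual LLM call, which would prompt a model with the full document plus the chunk:

```python
def contextualize_chunks(document: str, chunks, generate_context):
    """Prepend a situating context sentence to each chunk before
    indexing. `generate_context(document, chunk)` stands in for an
    LLM call that returns a short sentence locating the chunk within
    the document."""
    return [f"{generate_context(document, chunk)}\n\n{chunk}" for chunk in chunks]

# Toy stand-in for the LLM call.
doc = "ACME Corp Q2 2023 filing. Revenue grew 3% over the previous quarter."
chunks = ["Revenue grew 3% over the previous quarter."]
fake_llm = lambda document, chunk: "This chunk is from ACME Corp's Q2 2023 filing."

contextualized = contextualize_chunks(doc, chunks, fake_llm)
print(contextualized[0])
```

The contextualized strings, not the raw chunks, are what get embedded, which is why the technique works with any embedding model paired with any LLM.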
Reverse engineering OpenAI’s o1
What productionizing test-time compute shows us about the future of AI. Exploration has landed in language model training.
·interconnects.ai·
Notes on OpenAI’s new o1 chain-of-thought models
OpenAI released two major new preview models today: o1-preview and o1-mini (that mini one is also a preview, despite the name)—previously rumored as having the codename “strawberry”. There’s a lot …
·simonwillison.net·
files-to-prompt 0.3
New version of my `files-to-prompt` CLI tool for turning a bunch of files into a prompt suitable for piping to an LLM, [described here previously](https://simonwillison.net/2024/Apr/8/files-to-prompt/). It now has a `-c/--cxml` …
·simonwillison.net·
Announcing The Assistant | Kagi Blog
Yes, the rumours are true! Kagi has been thoughtfully integrating AI into our search experience, creating a smarter, faster, and more intuitive search.
·blog.kagi.com·
dh1011/llm-term
A Rust-based CLI tool that generates and executes terminal commands using OpenAI's language models or local Ollama models.
·github.com·
Cerebras Inference: AI at Instant Speed
New hosted API for Llama running at absurdly high speeds: "1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B". How are they running so …
·simonwillison.net·