openelm/README-pretraining.md
Apple released something big three hours ago, and I'm still trying to get my head around exactly what it is. The parent project is called CoreNet, described as "A library …
·simonwillison.net·
Command R
Command R is a conversational model that excels in language tasks and supports multiple languages, making it ideal for coding use cases that require instruction models. It responds well to preambles that follow a specific structure and format, enhancing its performance.
·docs.cohere.com·
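A minimal sketch of that structured-preamble pattern, assuming the cohere Python SDK's chat endpoint and its preamble parameter; the API key and the section headings are placeholders:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# A structured preamble of the kind the docs describe: task context and
# style guidance in clearly labelled sections.
preamble = (
    "## Task and Context\n"
    "You help developers write small, well-documented Python utilities.\n\n"
    "## Style Guide\n"
    "Reply with one code block followed by a one-sentence explanation."
)

response = co.chat(
    model="command-r",
    preamble=preamble,
    message="Write a function that deduplicates a list, preserving order.",
)
print(response.text)
```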
Introduction | Ragas
Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. Existing tools and frameworks help you build these pipelines, but evaluating them and quantifying their performance can be hard. This is where Ragas (RAG Assessment) comes in.
·docs.ragas.io·
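A sketch of what an evaluation run might look like, assuming Ragas's evaluate() entry point, its metrics module, and the question/answer/contexts column convention from the docs (scoring also requires an LLM to be configured, e.g. an OpenAI key):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# One toy record from a RAG pipeline: the user question, the generated
# answer, and the retrieved contexts the answer was grounded in.
samples = Dataset.from_dict({
    "question": ["What does Ragas evaluate?"],
    "answer": ["It scores Retrieval Augmented Generation pipelines."],
    "contexts": [["Ragas is a framework for evaluating RAG pipelines."]],
})

# Both metrics judge the answer against the retrieved contexts, so no
# reference ground truth is needed for this sketch.
results = evaluate(samples, metrics=[faithfulness, answer_relevancy])
print(results)
```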
The GPT-4 barrier has finally been broken
Four weeks ago, GPT-4 remained the undisputed champion: consistently at the top of every key benchmark, but more importantly the clear winner in terms of “vibes”. Almost everyone investing serious …
·simonwillison.net·
stanfordnlp/dspy at bramadams.dev
DSPy is a framework for algorithmically optimizing LM prompts and weights, especially when LMs are used one or more times within a pipeline. To use LMs to build a complex system without DSPy, you generally have to: (1) break the problem down into steps, (2) prompt your LM well until each step works well in isolation, (3) tweak the steps to work well together, (4) generate synthetic examples to tune each step, and (5) use these examples to finetune smaller LMs to cut costs. Currently, this is hard and messy: every time you change your pipeline, your LM, or your data, all prompts (or finetuning steps) may need to change.
·github.com·
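A hedged sketch of that workflow, assuming dspy's Signature/ChainOfThought API and an OpenAI-backed client: you declare what a step takes and returns, and leave the prompt wording to the framework rather than hand-tuning it.

```python
import dspy

# Assumes an OpenAI backend; any dspy-supported LM client works here.
dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))

class GenerateAnswer(dspy.Signature):
    """Answer the question using the provided context."""
    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField()

# ChainOfThought wraps the signature in a reason-then-answer prompt; its
# exact wording is managed (and optimizable) by DSPy, not written by hand.
qa = dspy.ChainOfThought(GenerateAnswer)
prediction = qa(
    context="DSPy algorithmically optimizes LM prompts and weights.",
    question="What does DSPy optimize?",
)
print(prediction.answer)
```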
Stable Code 3B: Coding on the Edge — Stability AI
Stable Code, an upgrade from Stable Code Alpha 3B, specializes in code completion and outperforms its predecessors in efficiency and multi-language support. It runs on standard laptops, including machines without a GPU, and adds capabilities such as fill-in-the-middle (FIM) and an expanded context size. Trained on multiple …
·stability.ai·
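A sketch of the FIM capability via Hugging Face transformers; the stabilityai/stable-code-3b checkpoint id and the StarCoder-style sentinel tokens are assumptions taken from the model card, so check there for the exact format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "stabilityai/stable-code-3b"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, trust_remote_code=True
)

# FIM prompt: the model fills the gap between the prefix and the suffix.
prompt = (
    "<fim_prefix>def fib(n):\n"
    "<fim_suffix>\n    return fib(n - 1) + fib(n - 2)<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:]))
```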
The Revenge of the Cataloguers
Over the past 15 years or so, libraries around the world have de-emphasized cataloguing. While budgetary concerns and technological efficien...
·go-to-hellman.blogspot.com·
RedPajama-Data-v2: an Open Dataset with 30 Trillion Tokens for Training Large Language Models — Together AI
Releasing a new version of the RedPajama dataset, with 30 trillion filtered and deduplicated tokens (100+ trillion raw) from 84 CommonCrawl dumps covering 5 languages, along with 40+ pre-computed data quality annotations that can be used for further filtering and weighting.
·together.ai·
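A hedged sketch of pulling a slice of the corpus from the Hugging Face Hub; the dataset id and the name/partition/snapshots/languages arguments follow the dataset card and should be treated as assumptions:

```python
from datasets import load_dataset

# Stream rather than download: the full corpus is far too large to
# fetch whole.
ds = load_dataset(
    "togethercomputer/RedPajama-Data-V2",
    name="default",
    partition="head_middle",   # the slice that ships quality annotations
    snapshots=["2023-06"],     # one CommonCrawl dump out of the 84
    languages=["en"],          # one of the 5 covered languages
    split="train",
    streaming=True,
)
for row in ds.take(1):
    print(row.keys())  # raw text plus the pre-computed quality signals
```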