Found 116 bookmarks
EFloat: Entropy-coded Floating Point Format for Compressing Vector Embedding Models
In a large class of deep learning models, including vector embedding models such as word and database embeddings, we observe that floating point exponent values cluster around a few unique values, permitting entropy-based data compression. Entropy coding compresses fixed-length values with variable-length codes, encoding the most probable values with fewer bits. We propose the EFloat compressed floating point number format, which uses a variable field boundary between the exponent and significand fields. EFloat applies entropy coding to exponent values and signs to minimize the average width of the exponent and sign fields, while preserving the original FP32 exponent range unchanged. The saved bits become part of the significand field, increasing EFloat numeric precision by 4.3 bits on average compared to other reduced-precision floating point formats. EFloat makes 8-bit and even smaller floats practical without sacrificing the exponent range of a 32-bit floating point representation. We currently use the EFloat format to reduce the memory capacity and bandwidth consumption of large vector embedding models such as those used for database embeddings. Using RMS error as the metric, we demonstrate that EFloat provides higher accuracy than other floating point formats with an equal bit budget. The EF12 format, with a 12-bit budget, has less end-to-end application error than the 16-bit BFloat16. EF16, with a 16-bit budget, has an RMS error 17 to 35 times smaller than that of BF16 for a diverse set of embedding models. When making similarity and dissimilarity queries, using the NDCG ranking metric, EFloat matches the result quality of prior floating point representations with larger bit budgets.
·arxiv.org·
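The exponent-clustering observation behind EFloat is easy to check empirically. The sketch below is not the EFloat format itself: it only extracts the 8-bit biased exponent field from float32 values and computes the Shannon entropy of the exponent distribution, showing how far below 8 bits the average exponent cost can fall. The synthetic `values` list is an illustrative stand-in for real embedding weights.

```python
import math
import struct
from collections import Counter

def fp32_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of a float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF

# Toy stand-in for embedding weights: bounded values roughly in [-0.5, 0.5].
values = [math.sin(i) * 0.5 for i in range(1, 1000)]

counts = Counter(fp32_exponent(v) for v in values)
total = sum(counts.values())

# Shannon entropy of the exponent distribution, in bits per value.
# Entropy coding can approach this average, freeing the remaining bits
# of the fixed 8-bit exponent field for extra significand precision.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

print(f"distinct exponents used: {len(counts)} of 256")
print(f"exponent entropy: {entropy:.2f} bits vs. fixed 8-bit field")
```

Bounded weight distributions like this one occupy only a narrow band of the 256 possible exponent codes, which is exactly the condition that makes a variable-width, entropy-coded exponent field pay off.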
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals
Language models are increasingly attracting interest from writers. However, such models lack long-range semantic coherence, limiting their usefulness for longform creative writing. We address this limitation by applying language models hierarchically, in a system we call Dramatron. By building structural context via prompt chaining, Dramatron can generate coherent scripts and screenplays complete with title, characters, story beats, location descriptions, and dialogue. We illustrate Dramatron's usefulness as an interactive co-creative system with a user study of 15 theatre and film industry professionals. Participants co-wrote theatre scripts and screenplays with Dramatron and engaged in open-ended interviews. We report critical reflections both from our interviewees and from independent reviewers who watched stagings of the works to illustrate how both Dramatron and hierarchical text generation could be useful for human-machine co-creativity. Finally, we discuss the suitability of Dramatron for co-creativity, ethical considerations -- including plagiarism and bias -- and participatory models for the design and deployment of such tools.
·arxiv.org·
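The hierarchical prompt-chaining idea can be sketched in a few lines. Everything below is hypothetical: the stage names and the stub `llm` function are illustrative, not Dramatron's actual prompts or model API. The point is only that each stage's output is folded into the next stage's prompt, which is how a system like this builds long-range structure a single flat prompt cannot sustain.

```python
def llm(prompt: str) -> str:
    """Stand-in for a language model call; returns a placeholder string."""
    return f"<generated from: {prompt[:40]}...>"

def hierarchical_chain(logline: str) -> dict:
    """Generate a script top-down, each level conditioned on the ones above."""
    script = {"logline": logline}
    script["title"] = llm(f"Write a title for this story: {logline}")
    script["characters"] = llm(
        f"List the characters of '{script['title']}'. Story: {logline}"
    )
    script["beats"] = llm(
        f"Outline the story beats given these characters: {script['characters']}"
    )
    script["dialogue"] = llm(
        f"Write a scene of dialogue following this beat sheet: {script['beats']}"
    )
    return script

draft = hierarchical_chain("A detective discovers her partner is the culprit.")
print(list(draft.keys()))
```

Because every stage receives the accumulated structural context rather than raw prior text, coherence is enforced at the outline level first and only then refined into dialogue.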
Visual Attention Network
While originally designed for natural language processing tasks, the self-attention mechanism has recently taken various computer vision areas by storm. However, the 2D nature of images brings three challenges for applying self-attention in computer vision. (1) Treating images as 1D sequences neglects their 2D structure. (2) Quadratic complexity is too expensive for high-resolution images. (3) Self-attention captures only spatial adaptability and ignores channel adaptability. In this paper, we propose a novel linear attention mechanism named large kernel attention (LKA), which enables the self-adaptive and long-range correlations of self-attention while avoiding its shortcomings. Furthermore, we present a neural network based on LKA, namely the Visual Attention Network (VAN). While extremely simple, VAN surpasses similarly sized vision transformers (ViTs) and convolutional neural networks (CNNs) on various tasks, including image classification, object detection, semantic segmentation, panoptic segmentation, and pose estimation. For example, VAN-B6 achieves 87.8% accuracy on the ImageNet benchmark and sets a new state of the art (58.2 PQ) for panoptic segmentation. VAN-B2 also surpasses Swin-T by 4% mIoU (50.1 vs. 46.1) for semantic segmentation on the ADE20K benchmark and by 2.6% AP (48.8 vs. 46.2) for object detection on the COCO dataset. It provides a novel method and a simple yet strong baseline for the community. Code is available at https://github.com/Visual-Attention-Network.
·arxiv.org·
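A back-of-envelope sketch of why a decomposed large kernel stays linear in cost. The specific decomposition (a 5×5 depthwise conv, a 7×7 depthwise conv with dilation 3, and a 1×1 pointwise conv approximating a 21×21 kernel) follows the VAN paper's design; the channel count and the comparison itself are illustrative assumptions, not the authors' measurements.

```python
def conv_params(k: int, cin: int, cout: int, depthwise: bool = False) -> int:
    """Parameter count of a k x k convolution, bias ignored.

    Dilation changes the receptive field but not the parameter count,
    so it does not appear here.
    """
    return k * k * cin if depthwise else k * k * cin * cout

C = 64  # illustrative channel count

# Direct 21x21 depthwise convolution over C channels:
direct = conv_params(21, C, C, depthwise=True)

# LKA-style decomposition: 5x5 depthwise + 7x7 dilated depthwise + 1x1 pointwise.
decomposed = (
    conv_params(5, C, C, depthwise=True)
    + conv_params(7, C, C, depthwise=True)
    + conv_params(1, C, C)
)

print(f"direct 21x21 depthwise: {direct} params")
print(f"decomposed LKA stack:   {decomposed} params")
```

The stack covers a comparable receptive field at a fraction of the parameters, and, unlike self-attention, its cost grows linearly rather than quadratically with image resolution.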
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example. This means style can be controlled via only a short example motion clip, even for motion styles unseen during training. Our model uses a variational framework to learn a style embedding, making it easy to modify style through latent-space manipulation or the blending and scaling of style embeddings. The probabilistic nature of our framework further enables the generation of a variety of outputs given the same input, addressing the stochastic nature of gesture motion. In a series of experiments, we first demonstrate the flexibility and generalizability of our model to new speakers and styles. In a user study, we then show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal. Finally, we release a high-quality dataset of full-body gesture motion including fingers, with speech, spanning 19 different styles.
·arxiv.org·
NVIDIA Researchers Present 'RANA,' a Novel Artificial Intelligence Framework for Learning Relightable and Articulated Neural Avatars of Humans
Human-like articulated neural avatars have several uses in telepresence, animation, and visual content production. To be widely adopted, these neural avatars must be simple to create, simple to animate in new poses and views, capable of rendering at photorealistic image quality, and simple to relight in novel environments. Existing techniques frequently use monocular videos to train these neural avatars. While this approach permits movement and photorealistic image quality, the synthesized images remain constrained by the training video's lighting conditions. Other studies specifically address the relighting of human avatars. However, they do not provide the user control
·marktechpost.com·
Machine learning has revealed exactly how much of a Shakespeare play was wr
For much of his life, William Shakespeare was the house playwright for an acting company called the King’s Men that performed his plays on the banks of the River Thames in London. When Shakespeare died in 1616, the company needed a replacement and turned to one of the most prolific and famous playwrights of the…
·technologyreview.com·
A neural net solves the three-body problem 100 million times faster - MIT T
In the 18th century, the great scientific challenge of the age was to find a way for mariners to determine their position at sea. One of the most successful solutions was to measure the position of the moon in the sky relative to the fixed background of stars. Because of parallax effects, this measurement depends…
·technologyreview.com·
Machine vision has learned to use radio waves to see through walls and in d
Machine vision has an impressive record. It has the superhuman ability to recognize people, faces and objects. It can even recognize many different kinds of actions, albeit not quite as well as humans just yet. But there are limits to its performance. Machines have a particularly difficult time when people, faces, or objects are partially…
·technologyreview.com·