Found 174 bookmarks
The Design Space of Generative Models
Card et al.'s classic paper "The Design Space of Input Devices" established the value of design spaces as a tool for HCI analysis and invention. We posit that developing design spaces for emerging pre-trained, generative AI models is necessary for supporting their integration into human-centered systems and practices. We explore what it means to develop an AI model design space by proposing two design spaces relating to generative AI models: the first considers how HCI can impact generative models (i.e., interfaces for models) and the second considers how generative models can impact HCI (i.e., models as an HCI prototyping material).
·arxiv.org·
SamurAI: A Versatile IoT Node With Event-Driven Wake-Up and Embedded ML Acceleration
Increased capabilities such as recognition and self-adaptability are now required from IoT applications. While IoT node power consumption is a major concern for these applications, cloud-based processing is becoming unsustainable due to continuous sensor or image data transmission over the wireless network. Thus optimized ML capabilities and data transfers should be integrated in the IoT node. Moreover, IoT applications are torn between sporadic data-logging and energy-hungry data processing (e.g. image classification). Thus, the versatility of the node is key in addressing this wide diversity of energy and processing needs. This paper presents SamurAI, a versatile IoT node bridging this gap in processing and in energy by leveraging two on-chip sub-systems: a low power, clock-less, event-driven Always-Responsive (AR) part and an energy-efficient On-Demand (OD) part. AR contains a 1.7MOPS event-driven, asynchronous Wake-up Controller (WuC) with a 207ns wake-up time optimized for sporadic computing, while OD combines a deep-sleep RISC-V CPU and 1.3TOPS/W Machine Learning (ML) for more complex tasks up to 36GOPS. This architecture partitioning achieves best in class versatility metrics such as peak performance to idle power ratio. On an applicative classification scenario, it demonstrates system power gains, up to 3.5x compared to cloud-based processing, and thus extended battery lifetime.
·arxiv.org·
We're Afraid Language Models Aren't Modeling Ambiguity
Ambiguity is an intrinsic feature of natural language. Managing ambiguity is a key part of human language understanding, allowing us to anticipate misunderstanding as communicators and revise our interpretations as listeners. As language models (LMs) are increasingly employed as dialogue interfaces and writing aids, handling ambiguous language is critical to their success. We characterize ambiguity in a sentence by its effect on entailment relations with another sentence, and collect AmbiEnt, a linguist-annotated benchmark of 1,645 examples with diverse kinds of ambiguity. We design a suite of tests based on AmbiEnt, presenting the first evaluation of pretrained LMs to recognize ambiguity and disentangle possible meanings. We find that the task remains extremely challenging, including for the recent GPT-4, whose generated disambiguations are considered correct only 32% of the time in human evaluation, compared to 90% for disambiguations in our dataset. Finally, to illustrate the value of ambiguity-sensitive tools, we show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity. We encourage the field to rediscover the importance of ambiguity for NLP.
·arxiv.org·
EFloat: Entropy-coded Floating Point Format for Compressing Vector Embedding Models
In a large class of deep learning models, including vector embedding models such as word and database embeddings, we observe that floating point exponent values cluster around a few unique values, permitting entropy based data compression. Entropy coding compresses fixed-length values with variable-length codes, encoding most probable values with fewer bits. We propose the EFloat compressed floating point number format that uses a variable field boundary between the exponent and significand fields. EFloat uses entropy coding on exponent values and signs to minimize the average width of the exponent and sign fields, while preserving the original FP32 exponent range unchanged. Saved bits become part of the significand field increasing the EFloat numeric precision by 4.3 bits on average compared to other reduced-precision floating point formats. EFloat makes 8-bit and even smaller floats practical without sacrificing the exponent range of a 32-bit floating point representation. We currently use the EFloat format for saving memory capacity and bandwidth consumption of large vector embedding models such as those used for database embeddings. Using the RMS error as metric, we demonstrate that EFloat provides higher accuracy than other floating point formats with equal bit budget. The EF12 format with 12-bit budget has less end-to-end application error than the 16-bit BFloat16. EF16 with 16-bit budget has an RMS-error 17 to 35 times less than BF16 RMS-error for a diverse set of embedding models. When making similarity and dissimilarity queries, using the NDCG ranking metric, EFloat matches the result quality of prior floating point representations with larger bit budgets.
·arxiv.org·
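The observation behind the EFloat abstract, that FP32 exponent values cluster around a few values and so carry far less than 8 bits of information, can be checked with a small sketch. This is not the EFloat format itself; it only measures the empirical entropy of the exponent field. The Gaussian sample is a hypothetical stand-in for real embedding weights, and the helper names `fp32_exponent` and `shannon_entropy_bits` are illustrative, not from the paper.

```python
import math
import random
import struct
from collections import Counter


def fp32_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of x after rounding to FP32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF


def shannon_entropy_bits(symbols) -> float:
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


rng = random.Random(0)
# Assumption: small zero-mean Gaussian values stand in for embedding weights.
values = [rng.gauss(0.0, 0.1) for _ in range(10_000)]
exponents = [fp32_exponent(v) for v in values]
entropy = shannon_entropy_bits(exponents)

print(f"distinct exponent values: {len(set(exponents))}")
print(f"exponent entropy: {entropy:.2f} bits (fixed-width field: 8 bits)")
```

The gap between the measured entropy and the fixed 8-bit field is a rough lower bound on what an ideal entropy coder could save per value; how EFloat actually reassigns those bits to the significand is described in the paper, not in this sketch.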
Deep Learning in Music Recommendation Systems
Like in many other research areas, deep learning (DL) is increasingly adopted in music recommendation systems (MRS). Deep neural networks are used in this domain particularly for extracting latent factors of music items from audio signals or metadata and for learning sequential patterns of music items (tracks or artists) from music playlists or listening sessions. Latent item factors are commonly integrated into content-based filtering and hybrid MRS, whereas sequence models of music items are used for sequential music recommendation, e.g., automatic playlist continuation. This review article explains particularities of the music domain in RS research. It gives an overview of the state of the art that employs deep learning for music recommendation. The discussion is structured according to the dimensions of neural network type, input data, recommendation approach (content-based filtering, collaborative filtering, or both), and task (standard or sequential music recommendation). In addition, we discuss major challenges faced in MRS, in particular in the context of the current research on deep learning.
·frontiersin.org·
Quantum Neural Network Compression
Model compression, such as pruning and quantization, has been widely applied to optimize neural networks on resource-limited classical devices. Recently, there is growing interest in variational quantum circuits (VQC), that is, a type of neural network on quantum computers (a.k.a., quantum neural networks). It is well known that near-term quantum devices have high noise and limited resources (i.e., quantum bits, qubits); yet, how to compress quantum neural networks has not been thoroughly studied. One might think it is straightforward to apply the classical compression techniques to quantum scenarios. However, this paper reveals that there exist differences between the compression of quantum and classical neural networks. Based on our observations, we claim that the compilation/transpilation has to be involved in the compression process. On top of this, we propose the very first systematic framework, namely CompVQC, to compress quantum neural networks (QNNs). In CompVQC, the key component is a novel compression algorithm, which is based on the alternating direction method of multipliers (ADMM) approach. Experiments demonstrate the advantage of CompVQC, reducing the circuit depth (almost over 2.5 %) with a negligible accuracy drop (1%), which outperforms other competitors. Another promising finding is that CompVQC can indeed promote the robustness of the QNN on near-term noisy quantum devices.
·arxiv.org·
Quantum Current and Holographic Categorical Symmetry
We establish the formulation for quantum current. Given a symmetry group $G$, let $\mathcal{C}:=\mathrm{Rep}\, G$ be its representation category. Physically, symmetry charges are objects of $\mathcal{C}$ and symmetric operators are morphisms in $\mathcal{C}$. The addition of charges is given by the tensor product of representations. For any symmetric operator $O$ crossing two subsystems, the exact symmetry charge transported by $O$ can be extracted. The quantum current is defined as symmetric operators that can transport symmetry charges over an arbitrarily long distance. A quantum current exactly corresponds to an object in the Drinfeld center $Z_1(\mathcal{C})$. The condition for quantum currents to be condensed is also specified. To express the local conservation, the internal hom must be used to compute the charge difference, and the framework of enriched category is inevitable. To illustrate these ideas, we develop a rigorous scheme of renormalization in one-dimensional lattice systems and analyse the fixed-point models. It is proved that in the fixed-point models, condensed quantum currents form a Lagrangian algebra in $Z_1(\mathcal{C})$ and the boundary-bulk correspondence is verified in the enriched setting. Overall, the quantum current provides a natural physical interpretation to the holographic categorical symmetry.
·arxiv.org·
Community Engagement Manager, arXiv, Cornell Tech
Cornell University embraces diversity and seeks candidates who will contribute to a climate that supports students, faculty and staff of all identities and backgrounds. We strongly encourage individuals from underrepresented and/or marginalized identities to apply. Cornell's Culture of Inclusion and Community Standards As a university founded to be a place where “…any person can find instruction in any study,” diversity and inclusion are at the core of our values and mission. We strive to be a welcoming, caring, healthy, and equitable community where students, faculty, and staff with different backgrounds, perspectives, abilities, and experiences can learn, innovate, and work in an environment of respect, and feel empowered to engage in any community conversation. As a member of the Cornell University community, it is important to recognize our shared responsibility to each other to cultivate a culture of inclusion for all. Cornell Core values As an individual contributor you will model and support a culture of diversity, equity, inclusion, and wellbeing and continually seek to understand how your role, behaviors, and actions impact the success of this culture. While position responsibilities vary greatly, the Skills for Success and Leadership Skills for Success are foundational to what is expected of every employee and leader working at Cornell. These skills are essential for individual and organizational success. Staff Skills for Success; Leadership Skills for Success We offer competitive compensation, generous time-off, and great benefits …More on Cornell Benefits About arXiv Started in August 1991 and located at Cornell University since 2001, arXiv.org is an open access research sharing platform for scholarly articles. 
The e-print repository has transformed the scholarly communication and knowledge dissemination of multiple fields of physics, mathematics, computer science, quantitative biology, quantitative finance, and statistics, electrical engineering, systems science, and economics as new subject domains. arXiv is a global resource, with 70% of institutional use coming from countries other than the USA. arXiv resides in Cornell Tech with staff and faculty collaborations spanning both Ithaca and New York City campuses. We are looking for a self-starter with an entrepreneurial mindset to be our next Community Engagement Manager. Reporting to the Program Director, the arXiv Community Engagement Manager is part of the arXiv leadership team, together with the Faculty Director, Program Director, Scientific Director, Technical Director, and Head of Operations. The Community Engagement Manager is responsible for defining and implementing arXiv’s communication strategy and managing and expanding our membership and sponsorship programs, which contribute significantly to arXiv's revenue. Job Summary While position responsibilities vary, every member of our community is expected to foster a culture of belonging and a psychologically healthy work environment by communicating across differences; being cooperative, collaborative, open, and welcoming; showing respect, compassion, and empathy; engaging and supporting others regardless of background or perspective; speaking up when others are being excluded or treated inappropriately; and supporting work/life integration of oneself and others. Responsibilities of the Community Engagement Manager primarily fall into three areas. Manage Organization Communications (50%): Serve as a creative communications strategist, leveraging emerging communications trends, research, and techniques to connect to key audiences and stakeholders around the globe; develop campaigns to support arXiv’s mission, vision, project goals, and brand identity. 
Act as public relations point of contact for arXiv and engage with key stakeholders, such as journalists, media, and other academic institutions. Assure exceptional integrity, quality and accuracy in communications; manage content creation for marketing materials (collateral, newsletters, press releases, digital content, social media and more). Organize, schedule, and manage digital events, including webinars and workshops. Develop annual reports for leadership groups, including arXiv Members, arXiv advisory committees, and Cornell stewardship. Develop internal communications strategy to support staff in carrying out arXiv’s mission, vision, and project goals. Coordinate with Cornell University’s communications team (within the Division of University Relations) to ensure alignment with university-wide media relations, branding and related communications protocols. Manage Membership and Sponsorship (40%): Develop, manage, and maintain successful relationships with arXiv stakeholders in academic libraries and library consortia, professional societies, research institutes, and other mission-aligned organizations to ensure a thriving membership and sponsorship program. Develop a communication strategy and benefit package to maintain engagement with members, affiliates, and sponsors. Cultivate relationships through in-person meetings, webinars, and other outreach and develop marketing materials. Organize and supervise the invoicing workflow throughout the year to ensure timely payment from all members, sponsors and affiliates; liaise with colleagues across Cornell as needed for financial reporting. Respond to current and prospective member inquiries regarding membership benefits, membership agreements, and usage data. Fundraising Support (10%): Organize and implement giving campaigns to solicit support from individual arXiv users. Assist with grant writing and reporting. 
This is a full-time, benefits-eligible 3-year term position with the possibility of renewal The primary work location for this role is at the Cornell Tech campus on Roosevelt Island in New York City. This position is hybrid, which involves working at least 3 day(s) per week on campus. The flexible work schedule is subject to change according to the needs of the business. Visa sponsorship is not available for this position. Minimum Qualifications Bachelor’s degree and 3 – 5 years’ experience in scientific communication or communications targeting the academic library and scholarly communication communities. Experience building membership programs for nonprofits. Excellent written and oral communication skills with a demonstrated ability to communicate successfully with scientists and other specialists as well as generalists. Proficient in using CRM and invoicing software (e.g., Salsa, Quickbooks, etc.) and/or an aptitude for learning new systems. Experienced using common software programs, e.g., Microsoft Office Suite, Adobe Creative Suite, WordPress etc. Highly organized and detail-oriented, flexible and collaborative with proven ability to prioritize and manage multiple tasks simultaneously. Experience in and/or demonstrated commitment to supporting diversity, equity, access, inclusion, and wellbeing. Experience incorporating the perspectives of multiple communities, including communities of color. Ability to cultivate and develop inclusive and equitable working relationships with students, faculty, staff, and community members. Preferred Qualifications Master's degree in information science, communications, or other related discipline. Demonstrated knowledge of business development, specifically for nonprofit organizations. Has a curated portfolio of PR contacts, including scientific journalists and the scientific media sources. Understanding of and experience working with the academic library community. 
Knowledge of data analytics and reporting suites (Google Analytics, Tableau). Basic understanding of GitHub and Jira. Familiarize yourself with Cornell's COVID-19 workplace guidance as well as the university's COVID-19 services and information. University Job Title: Communication Spec III Job Family: Communications/Marketing Level: F Pay Rate Type: Salary Pay Range: $70,587.00 - $91,572.00 Remote Option Availability: Hybrid Remote Company: Endowed Contact Name: Evelyn Gordon Job Titles and Pay Ranges: Non-Union Positions Noted pay ranges reflect the potential pay opportunity for each job profile. The hiring rate of pay for the successful candidate will be determined considering the following criteria: Prior relevant work or industry experience Education level to the extent education is relevant to the position Unique applicable skills Academic Discipline (faculty pay ranges reflects 9-month annual salary) To learn more about Cornell’s non-union staff job titles and pay ranges, see Career Navigator. Union Positions The hiring rate of pay for the successful candidate will be determined in accordance with the rates in the respective collective bargaining agreement. To learn more about Cornell’s union wages, see Union Pay Rates. Current Employees: If you currently work at Cornell University, please exit this website and log in to Workday using your Net ID and password. Select the Career icon on your Home dashboard to view jobs at Cornell. Online Submission Guidelines: Most positions at Cornell will require you to apply online and submit both a resume/CV and cover letter. You can upload documents either by “dragging and dropping” them into the dropbox or by using the “upload” icon on the application page. For more detailed instructions on how to apply to a job at Cornell, visit How We Hire on the HR website. 
Employment Assistance: For general questions about the position or the application process, please contact the Recruiter listed in the job posting or email mycareer@cornell.edu. If you require an accommodation for a disability in order to complete an employment application or to participate in the recruiting process, you are encouraged to contact Cornell University's Office of Institutional Equity and Title IX at voice (607) 255-2242, or email at equity@cornell.edu. Applicants that do not have internet access are encouraged to visit your local library, or local Department of Labo
·cornell.wd1.myworkdayjobs.com·
Learnable latent embeddings for joint behavioural and neural analysis
Mapping behavioural actions to neural activity is a fundamental goal of neuroscience. As our ability to record large neural and behavioural data increases, there is growing interest in modeling neural dynamics during adaptive behaviors to probe neural representations. In particular, neural latent embeddings can reveal underlying correlates of behavior, yet, we lack non-linear techniques that can explicitly and flexibly leverage joint behavior and neural data. Here, we fill this gap with a novel method, CEBRA, that jointly uses behavioural and neural data in a hypothesis- or discovery-driven manner to produce consistent, high-performance latent spaces. We validate its accuracy and demonstrate our tool's utility for both calcium and electrophysiology datasets, across sensory and motor tasks, and in simple or complex behaviors across species. It allows for single and multi-session datasets to be leveraged for hypothesis testing or can be used label-free. Lastly, we show that CEBRA can be used for the mapping of space, uncovering complex kinematic features, and rapid, high-accuracy decoding of natural movies from visual cortex.
·cebra.ai·
A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT
Prompt engineering is an increasingly important skill set needed to converse effectively with large language models (LLMs), such as ChatGPT. Prompts are instructions given to an LLM to enforce rules, automate processes, and ensure specific qualities (and quantities) of generated output. Prompts are also a form of programming that can customize the outputs and interactions with an LLM. This paper describes a catalog of prompt engineering techniques presented in pattern form that have been applied to solve common problems when conversing with LLMs. Prompt patterns are a knowledge transfer method analogous to software patterns since they provide reusable solutions to common problems faced in a particular context, i.e., output generation and interaction when working with LLMs. This paper provides the following contributions to research on prompt engineering that apply LLMs to automate software development tasks. First, it provides a framework for documenting patterns for structuring prompts to solve a range of problems so that they can be adapted to different domains. Second, it presents a catalog of patterns that have been applied successfully to improve the outputs of LLM conversations. Third, it explains how prompts can be built from multiple patterns and illustrates prompt patterns that benefit from combination with other prompt patterns.
·arxiv.org·
GPT-4 Technical Report
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
·arxiv.org·
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
We present ZeroEGGS, a neural network framework for speech-driven gesture generation with zero-shot style control by example. This means style can be controlled via only a short example motion clip, even for motion styles unseen during training. Our model uses a Variational framework to learn a style embedding, making it easy to modify style through latent space manipulation or blending and scaling of style embeddings. The probabilistic nature of our framework further enables the generation of a variety of outputs given the same input, addressing the stochastic nature of gesture motion. In a series of experiments, we first demonstrate the flexibility and generalizability of our model to new speakers and styles. In a user study, we then show that our model outperforms previous state-of-the-art techniques in naturalness of motion, appropriateness for speech, and style portrayal. Finally, we release a high-quality dataset of full-body gesture motion including fingers, with speech, spanning across 19 different styles.
·arxiv.org·