ABOUT SEARCHING AND SEARCH PRACTICES

437 bookmarks
Do all-stopword queries matter?
Many search engines don’t index “stopwords”, words that are very common and have little meaning by themselves. The stopword list is often just the most frequent words in the langu…
·observer.wunderwood.org·
BM25 The Next Generation of Lucene Relevance - OpenSource Connections
There’s something new cooking in how Lucene scores text. Lucene just switched to something called BM25 in trunk. That means a new scoring formula for Solr and Elasticsearch.
·opensourceconnections.com·
How is search different than other machine learning problems? - OpenSource Connections
In this blog, we explore what makes search distinct from other machine learning problems. How does one approach search ranking as a machine learning problem? We go through a couple of approaches that give you an intuition on how to evaluate a learning to rank method.
·opensourceconnections.com·
An Introduction to Search Quality - OpenSource Connections
Welcome, dear reader, to my first OSC blog post. Let’s dive in! While search relevance is often equated with ensuring customers find what they need, that is only part...
·opensourceconnections.com·
The Unreasonable Effectiveness of Collocations - OpenSource Connections
Recently while experimenting with word2vec-based features with Learning to Rank, I was exploring using collocations to improve the accuracy of my embeddings. If you read the original word2vec paper...
·opensourceconnections.com·
Demystifying nDCG and ERR - OpenSource Connections
We unwrap the mystery behind two popular search relevance metrics nDCG and ERR through visualization, and discuss their pros and cons.
·opensourceconnections.com·
What is a 'Relevant' Search Result? - OpenSource Connections
Five years ago, I wrote an article called "What is Search Relevance?" Back then, I had to shout to convince people to even notice whether search results were accurate...
·opensourceconnections.com·
Choosing your search relevance evaluation metric - OpenSource Connections
Ensuring results are relevant is tricky but critical to a good search experience. Choosing an evaluation metric to summarize the performance of the search engine can be equally challenging because...
·opensourceconnections.com·
Metacrap
·people.well.com·
Hybrid search sum of its parts? Berlin Buzzwords 2022
Over the decades, information retrieval has been dominated by classical methods such as BM25. These lexical models are simple and effective yet vulnerable to vocabulary mismatch. With the introduction of pre-trained language models such as BERT and its relatives, deep retrieval models have achieved superior performance with their strong ability to capture semantic relationships. The downside is that training these deep models is computationally expensive, and suitable datasets are not always available for fine-tuning toward the target domain. While deep retrieval models work best on domains close to what they have been trained on, lexical models are comparatively robust across datasets and domains. This suggests that lexical and deep models can complement each other, retrieving different sets of relevant results. But how can these results effectively be combined? And can we learn something from language models to learn new indexing methods? This talk will delve into both these approaches and exemplify when they work well and not so well. We will take a closer look at different strategies to combine them to get the best of both, even in zero-shot cases where we don't have enough data to fine-tune the deep model. The Search track is presented by OpenSource Connections.
·pretalx.com·
Autocomplete
In the past decade, autocomplete has become a required feature for search engines. Today, searchers who type into a search box expect to…
·queryunderstanding.com·
Autocomplete and User Experience
The previous post focused on how to determine the best autocomplete suggestions based on query probability and query performance. In this…
·queryunderstanding.com·
Character Filtering
Search queries are made up of characters. It’s easy to take characters for granted; indeed, people who have never implemented a search…
·queryunderstanding.com·
Clarification Dialogues
As we saw in the previous post, modeling search as a conversation makes it possible to overcome breakdowns in communication between the…
·queryunderstanding.com·
Contextual Query Understanding: An Overview
So far, we’ve focused on understanding searchers based entirely on the words they type into the search box. But search doesn’t occur in a…
·queryunderstanding.com·
Entity Recognition
In the previous post on query scoping, we discussed query tagging as a special case of named-entity recognition (NER). In this post, we’ll…
·queryunderstanding.com·