SIGIR eCom

OSINT
The Love-at-First-Sight Gaze Pattern on Search-Results Pages
Eyetracking studies show that users sometimes look at only a single result on a search-results page because that result is good enough for their needs.
Search Patterns: Design for Discovery
A sandbox of search design illustrations, including a pattern library for search and discovery.
Word2Vec For Phrases — Learning Embeddings For More Than One Word
How to learn similar terms in a given unsupervised corpus using Word2Vec
Train and Test Sets Split for Evaluating Learning To Rank Models
How data splitting can be done and why it is important for the offline evaluation of Learning to Rank models.
Better than Average: Sort by Best Rating with Elasticsearch
People want to buy things that have many great reviews. But when sorting by average rating, the best results are buried by things that have one or two perfect reviews. You can solve this problem by ba...
Altroo - Your search engine for doing good
Transparent and trustworthy way to support your good cause by simply searching the Internet.
Multiword synonyms in search using Querqy - Luminis
There is one topic that gives even the toughest search engineers a headache: multi-word synonyms. There are some ways to sort of get a solution; one of them is a nice tool called Querqy. Querqy was created by René Kriegler....
Gensim: topic modelling for humans
Efficient topic modelling in Python
Complex Search-Results Pages Change Search Behavior: The Pinball Pattern
Because today’s search-results pages have many possible complex layouts, users don’t always process search results sequentially. They distribute their attention more variably across the page than in the past.
Session Context
The most immediate context for a search query is a search session, a sequence of activities the searcher performs in order to pursue an…
Synonyms and Antonyms in Python
Text Mining — Extracting Synonyms and Antonyms
Spelling Correction
Spelling correction is a must-have for any modern search engine. Estimates of the fraction of misspelled search queries vary, but a variety…
Stemming and Lemmatization
Different forms of a word often communicate essentially the same meaning. For example, there’s probably no difference in intent between a…
Boosting the power of Elasticsearch with synonyms
How to use synonyms and synonym filters in Elasticsearch. Synonyms are a powerful tool for increasing the recall of your search system, but there are many subtleties that are important to know and exp...
Locality Sensitive Hashing
An effective way of reducing the dimensionality of your data
Seasonality
Whether it’s the time of year or the time of day, the time when someone performs a search sometimes helps us determine the searcher’s…
Search Result Snippets
Search result snippets, also known as query-biased summaries, are the additional context included with each result on the search results…
Taxonomies and Ontologies
In order to understand queries, it’s important to ground that understanding in a knowledge base. Two common ways to represent a knowledge…
Search Results Clustering
Search queries that express broad intent often return intractably large result sets. We’ve already discussed faceted search as a way to…
Search Results Presentation
Until now, we’ve mostly focused on query processing — which is to be expected, given that this series is about query understanding. But…
Humans — Search for Things not for Strings
Information Retrieval (IR) systems are a vital component in the core of successful modern web platforms. The main goal of IR systems is to provide a communication layer that enables customers to establish a retrieval dialogue with underlying data.
Understanding the Search Query — Part I
Introduction
SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking
Conventional wisdom and textbooks say BK-trees are especially suited for spelling correction and fuzzy string search. But does this really…
Search as a Conversation
Most search applications assume a query-response paradigm: the searcher submits a search query, and the search engine responds with…
The Terrier Project
Tokenization
Now that we can handle characters, let’s move on to words.
Faceted Search
Faceted search is a topic broad enough to deserve its own book. It has become a standard feature of all modern search engines, including…
Location as Context
Location often provides a strong signal of searcher intent. Sometimes location acts as an implicit part of the search query, such as when…
Query Segmentation and Spelling Correction
In English, people generally type queries with words separated by spaces, but sometimes and somehow this space is found to be…