
Correspondence between Haystack queries and Xapian queries

jorgecarleitao edited this page May 21, 2014 · 5 revisions

This page describes how Xapian-Haystack translates a query constructed in Haystack into a Xapian query. For how Xapian-Haystack builds the index out of the Haystack fields, see here.

Notation

  • (unstemmed) term: a single word
    • Represented by the term itself.
    • Example: Happiness.
  • stemmed term: the stem of the term
    • Represented by Z<stemmed term>
    • Example: Zhappi (stem of Happiness)
  • term in a field: a term that belongs to a given field
    • Represented by X<field_name><unstemmed term> or ZX<field_name><stemmed term> where field_name is the field name in uppercase
    • Examples: ZXSUMMARYhappi, XSUMMARYhappiness
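The notation above can be reproduced with a small helper. This is only an illustrative sketch of the naming scheme, not the backend's own code: `xapian_term` is a hypothetical name, and the caller is assumed to pass the already-stemmed word when `stemmed=True` (no stemmer is invoked here).

```python
def xapian_term(word, field=None, stemmed=False):
    """Render a term in the notation used on this page.

    Hypothetical helper: `word` must already be the stem when stemmed=True.
    """
    # Stemmed terms carry a leading "Z".
    prefix = "Z" if stemmed else ""
    # Fielded terms carry "X" followed by the field name in uppercase.
    if field is not None:
        prefix += "X" + field.upper()
    return prefix + word


print(xapian_term("Happiness"))                              # Happiness
print(xapian_term("happi", stemmed=True))                    # Zhappi
print(xapian_term("happiness", field="summary"))             # XSUMMARYhappiness
print(xapian_term("happi", field="summary", stemmed=True))   # ZXSUMMARYhappi
```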

All Haystack queries use __contains as the default filter and content as the default field. This implies that

filter('FoO bar') == filter(content='FoO bar') == filter(content__contains='FoO bar')
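The equivalence above amounts to filling in two defaults. The sketch below models it in plain Python; `normalize_filter` is a hypothetical function written for this page, not part of Haystack or Xapian-Haystack, and it handles a single condition only.

```python
def normalize_filter(*args, **kwargs):
    """Return (field, filter_type, value) with both defaults made explicit.

    Hypothetical model of the equivalence stated on this page.
    """
    if len(args) + len(kwargs) != 1:
        raise ValueError("this sketch handles exactly one condition")
    if args:
        # A bare value targets the default field with the default filter.
        return ("content", "contains", args[0])
    key, value = next(iter(kwargs.items()))
    # Split "summary__exact" into the field and the filter type.
    field, sep, filter_type = key.partition("__")
    return (field, filter_type if sep else "contains", value)


print(normalize_filter("FoO bar"))
print(normalize_filter(content="FoO bar"))
print(normalize_filter(content__contains="FoO bar"))
```

All three calls return the same triple, which is the sense in which the three filters are equal.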

We mainly use three Xapian operators: OR, AND and PHRASE, which are described in detail here. These are not purely boolean: two documents that match the same query can receive different scores under BM25, Xapian's scoring algorithm.

Fields

content query

Such as filter('FoO bar'). Corresponds to a Xapian OR for every term in the sentence, both stemmed and unstemmed. Example:

filter(content='FoO bar')  # FoO OR Zfoo OR bar OR Zbar

This matches terms in every field since terms in fields are also indexed as non-fielded terms.

field query

Such as filter(summary='david this'). Corresponds to a Xapian OR for every term in the query, both stemmed and unstemmed, but restricted to the field. Example:

filter(summary='david this')  # XSUMMARYdavid OR ZXSUMMARYdavid OR XSUMMARYthis OR ZXSUMMARYthis
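The content and field queries follow the same pattern, which can be sketched as a function that renders the query string in this page's notation. This is not the backend's implementation: `or_query` is a hypothetical name, and `str.lower` is used as a stand-in stemmer only because it happens to reproduce the stems in the examples above; a real stemmer would be plugged in through the `stem` argument.

```python
def or_query(text, field="content", stem=str.lower):
    """Render a contains-style query as an OR of unstemmed and stemmed terms.

    Hypothetical sketch; `stem` is a placeholder for a real stemmer.
    """
    # Terms in the default field are not prefixed; other fields use X<FIELD>.
    prefix = "" if field == "content" else "X" + field.upper()
    parts = []
    for word in text.split():
        parts.append(prefix + word)              # unstemmed term
        parts.append("Z" + prefix + stem(word))  # stemmed term
    return " OR ".join(parts)


print(or_query("FoO bar"))
print(or_query("david this", field="summary"))
```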

field filters

exact

Such as filter(summary__exact='david this'). Corresponds to a Xapian PHRASE of unstemmed terms if there is more than one term in the sentence, or an exact match if there is only one term. Examples:

filter(summary__exact='david this')  # XSUMMARYdavid PHRASE 2 XSUMMARYthis
filter(content__exact='david')  # ^david$
filter(summary__exact='david')  # XSUMMARY^david$
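The two cases of the exact filter can be sketched as follows. `exact_query` is a hypothetical helper that only renders the notation used on this page. Setting the PHRASE window to the number of terms is an assumption extrapolated from the two-term example; the page does not show longer phrases.

```python
def exact_query(text, field="content"):
    """Render an exact filter in this page's notation (hypothetical sketch)."""
    prefix = "" if field == "content" else "X" + field.upper()
    words = text.split()
    if len(words) == 1:
        # Single term: exact match, written with ^ and $ markers.
        return f"{prefix}^{words[0]}$"
    # Several terms: a PHRASE over the unstemmed, prefixed terms.
    # Assumption: the window size equals the number of terms.
    terms = [prefix + word for word in words]
    return f" PHRASE {len(terms)} ".join(terms)


print(exact_query("david this", field="summary"))
print(exact_query("david"))
print(exact_query("david", field="summary"))
```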

in

Such as filter(summary__in=['david this', 'foo']). Corresponds to an OR join of exact matches on each item in the list. Example:

filter(summary__in=['david this', 'foo'])  # (XSUMMARYdavid PHRASE 2 XSUMMARYthis) OR XSUMMARY^foo$
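As a sketch, the in filter is the exact filter applied to each item and joined with OR. Both functions below are hypothetical and only render this page's notation; the window size in `exact` is assumed to equal the number of terms, as in the two-term example.

```python
def exact(text, field):
    """Exact match on one item (hypothetical; mirrors the exact filter)."""
    prefix = "X" + field.upper()
    words = text.split()
    if len(words) == 1:
        return f"{prefix}^{words[0]}$"
    return f" PHRASE {len(words)} ".join(prefix + word for word in words)


def in_query(values, field):
    """OR together the exact match of every item in `values`."""
    parts = []
    for value in values:
        clause = exact(value, field)
        # Parenthesise multi-term clauses so the OR grouping stays readable.
        parts.append(f"({clause})" if " " in clause else clause)
    return " OR ".join(parts)


print(in_query(["david this", "foo"], "summary"))
```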