Improving Xapian backed Haystack searches in Django

January 9, 2019, Arthur Pemberton

I recently switched a Django project that uses Haystack from the Whoosh engine to Xapian. Performance improved significantly, but the search results had some deficiencies, which turned out to be caused by the default settings.

Here are the necessary settings to improve the search result quality:

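import xapian

# Query parser features to enable. FLAG_PARTIAL treats the final word of a
# query as a prefix, so partially typed terms can still match.
HAYSTACK_XAPIAN_FLAGS = (
    xapian.QueryParser.FLAG_PHRASE |
    xapian.QueryParser.FLAG_BOOLEAN |
    xapian.QueryParser.FLAG_LOVEHATE |
    xapian.QueryParser.FLAG_WILDCARD |
    xapian.QueryParser.FLAG_PURE_NOT |
    xapian.QueryParser.FLAG_PARTIAL
)
# Stem every term, including capitalized ones.
HAYSTACK_XAPIAN_STEMMING_STRATEGY = 'STEM_ALL'

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'xapian_backend.XapianEngine',
        # `root` is assumed to be defined earlier in this settings module.
        'PATH': root.path('data/xapian')(),
        'FLAGS': HAYSTACK_XAPIAN_FLAGS,
    },
}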
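
The flags value in the settings above is just an integer bitmask built with bitwise OR. Here is a minimal sketch of that idea using a stand-in `IntFlag` class, so it runs without Xapian installed. The names `ParserFlag` and `has_flag` and the numeric values are assumptions for illustration only, not Xapian's real constants; in a real settings file the same check would be applied to the `xapian.QueryParser.FLAG_*` values.

```python
from enum import IntFlag


class ParserFlag(IntFlag):
    """Hypothetical stand-in for the xapian.QueryParser.FLAG_* constants."""
    # Illustrative values only; read the real ones from the xapian module.
    BOOLEAN = 1
    PHRASE = 2
    LOVEHATE = 4
    WILDCARD = 16
    PURE_NOT = 32
    PARTIAL = 64


def has_flag(mask: int, flag: int) -> bool:
    """Return True when every bit of `flag` is set in `mask`."""
    return (mask & flag) == flag


# Same combination as in the settings, built with bitwise OR.
flags = (
    ParserFlag.PHRASE
    | ParserFlag.BOOLEAN
    | ParserFlag.LOVEHATE
    | ParserFlag.WILDCARD
    | ParserFlag.PURE_NOT
    | ParserFlag.PARTIAL
)

# The same mask with the PARTIAL bit cleared, for comparison.
flags_without_partial = int(flags) & ~int(ParserFlag.PARTIAL)

print(has_flag(flags, ParserFlag.PARTIAL))
print(has_flag(flags_without_partial, ParserFlag.PARTIAL))
```

A check like this can be handy when debugging a settings file, to confirm that a flag such as FLAG_PARTIAL really made it into the mask passed as 'FLAGS'.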

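Before these settings, I was having problems with queries that were apparently case sensitive, at least when using AutoQuery, and with partial searches failing. The likely cause is the STEM_SOME stemming strategy (the xapian-haystack default, as far as I can tell), which does not stem terms that begin with a capital letter. These changes improve case insensitivity and stemming while still giving almost 100x better performance than Whoosh.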

Hopefully this helps someone out there looking for solutions.

