I recently switched a Django project that uses Haystack from Whoosh to Xapian as the search engine. Performance improved significantly, but the search results had some deficiencies, which turned out to be caused by the default settings.
Here are the necessary settings to improve the search result quality:
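import xapian

# Query parser features to enable, combined into one bitmask.
HAYSTACK_XAPIAN_FLAGS = (
    xapian.QueryParser.FLAG_PHRASE |    # quoted phrase searches
    xapian.QueryParser.FLAG_BOOLEAN |   # AND / OR / NOT operators
    xapian.QueryParser.FLAG_LOVEHATE |  # +term / -term prefixes
    xapian.QueryParser.FLAG_WILDCARD |  # trailing * wildcard
    xapian.QueryParser.FLAG_PURE_NOT |  # queries of the form "NOT term"
    xapian.QueryParser.FLAG_PARTIAL     # partial matching of the last word
)

HAYSTACK_XAPIAN_STEMMING_STRATEGY = 'STEM_ALL'

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'xapian_backend.XapianEngine',
        # `root` is defined elsewhere in the settings module.
        'PATH': root.path('data/xapian')(),
        'FLAGS': HAYSTACK_XAPIAN_FLAGS,
    },
}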
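The flags value is a plain bitmask, so each feature is switched on by OR-ing its constant into the total. Here is a minimal sketch of that idea that runs without Xapian installed. The `ParserFlag` enum and the `is_enabled` helper are hypothetical stand-ins made up for illustration, and the numeric values are arbitrary rather than Xapian's real constants.

```python
from enum import IntFlag, auto


class ParserFlag(IntFlag):
    """Hypothetical stand-ins for the xapian.QueryParser.FLAG_* constants.

    The numeric values are illustrative only, not Xapian's real ones.
    """
    PHRASE = auto()
    BOOLEAN = auto()
    LOVEHATE = auto()
    WILDCARD = auto()
    PURE_NOT = auto()
    PARTIAL = auto()
    SPELLING_CORRECTION = auto()  # deliberately left out of the mask below


# Same combination as in the settings, built with bitwise OR.
flags = (
    ParserFlag.PHRASE
    | ParserFlag.BOOLEAN
    | ParserFlag.LOVEHATE
    | ParserFlag.WILDCARD
    | ParserFlag.PURE_NOT
    | ParserFlag.PARTIAL
)


def is_enabled(mask, flag):
    """Return True if every bit of `flag` is set in `mask`."""
    return (mask & flag) == flag


if __name__ == "__main__":
    for flag in ParserFlag:
        state = "on" if is_enabled(flags, flag) else "off"
        print(f"{flag.name}: {state}")
```

A check like this can be handy when debugging a settings file, since a flag that was never OR-ed in simply reads as off.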
Before these settings, I was having problems with queries apparently being case sensitive, at least when using AutoQuery, and with partial searches failing. These changes improve case insensitivity and stemming, while still performing almost 100x better than Whoosh.
Hopefully this helps someone out there looking for solutions.