mirror of https://github.com/mozilla/kitsune.git
[bug 721411] Document search scoring
This adds documentation for search scoring as it's currently implemented. Additionally, it adds some minor notes about where the ES-related code is and links to the search view where the filters are. Also adds a link to elasticsearch-head.
Parent: d477bb7436
Commit: 1afcc2fc91
@@ -22,11 +22,11 @@ search or Google's site search.

 .. Note::

-   Right now we're rewriting our search system to use Elastic and
-   switching between Sphinx and Elastic. At some point, the results
-   we're getting with our Elastic-based code will be good enough to
-   switch over. At that point, we'll remove the Sphinx-based search
-   code.
+   Right now we're rewriting our search system to use Elastic Search
+   and switching between Sphinx and Elastic Search. At some point,
+   the results we're getting with our Elastic Search-based code will
+   be good enough to switch over. At that point, we'll remove the
+   Sphinx-based search code.

    Until then, we have instructions for installing both Sphinx Search
    and Elastic Search.

@@ -281,3 +281,139 @@ You can see Elastic Search statistics/health with::

The last few lines tell you how many documents are in the index by
doctype. I use this to make sure I've got stuff in my index.


Tools
-----

One tool that's helpful for Elastic Search work is `elasticsearch-head
<https://github.com/mobz/elasticsearch-head>`_. It's like
phpMyAdmin for Elastic Search.


Implementation details
----------------------

Kitsune uses `elasticutils
<https://github.com/davedash/elasticutils>`_ and `pyes
<https://github.com/aparo/pyes>`_.

Most of our code is in the ``search`` app in ``apps/search/``.

Models in Kitsune that are indexable use ``SearchMixin`` defined in
``models.py``.

Utility functions are implemented in ``es_utils.py``.

Subcommands for ``manage.py`` are implemented in
``management/commands/``.


Search Scoring
==============

These are the defaults that apply to all searches:

kb:

    query fields: title, content, summary, keywords

    weights:

    ======== =====
    name     value
    ======== =====
    title    6
    content  1
    keywords 4
    summary  2
    ======== =====

questions:

    query fields: title, question_content, answer_content

    weights:

    ================ =====
    name             value
    ================ =====
    title            4
    question_content 3
    answer_content   3
    ================ =====

forums:

    query fields: title, content

    weights:

    ======== =====
    name     value
    ======== =====
    title    2
    content  1
    ======== =====

.. Note::

   The query fields and weights are shared between our Sphinx code and
   our Elastic Search code.


Elastic Search is built on top of Lucene, so the `Lucene documentation
on scoring <http://lucene.apache.org/java/3_5_0/scoring.html>`_ covers
how a document is scored with regard to the search query and its
contents. The weights modify that score: they are query-level boosts.

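
As a rough illustration of what a query-level boost does, the sketch
below combines per-field scores using the weights from the tables
above. It is a simplification: real Lucene scoring involves more
factors than a weighted sum, and ``WEIGHTS`` and ``boosted_score`` are
names made up for this example, not Kitsune code.

```python
# Query-level boosts copied from the weight tables above.
WEIGHTS = {
    "kb": {"title": 6, "content": 1, "keywords": 4, "summary": 2},
    "questions": {"title": 4, "question_content": 3, "answer_content": 3},
    "forums": {"title": 2, "content": 1},
}


def boosted_score(doctype, field_scores):
    """Combine per-field scores into one score using the doctype's boosts.

    ``field_scores`` maps a field name to the (hypothetical) score that
    field earned for the query before any boost was applied.
    """
    weights = WEIGHTS[doctype]
    return sum(weights[field] * score for field, score in field_scores.items())


# The same raw match counts six times as much in a kb title as in kb content.
title_hit = boosted_score("kb", {"title": 0.5})
content_hit = boosted_score("kb", {"content": 0.5})
```

Under this simplified model, a match in a kb title outranks an equally
strong match in the body, which is the intent behind the higher title
weight.
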

Additionally, we use a series of filters on tags, q_tags, and other
properties of the documents like has_helpful, is_locked, is_archived,
etc. In Elastic Search, filters remove items from the result set, but
don't otherwise affect the scoring.

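
The difference between scoring and filtering can be sketched in plain
Python. This is not how Elastic Search implements filters; it only
shows the observable behavior described above. ``apply_filters`` and
the sample documents are hypothetical.

```python
def apply_filters(scored_docs, **required):
    """Keep only documents whose properties equal the required values.

    ``scored_docs`` is a list of (document, score) pairs. Scores pass
    through untouched: filtering narrows the result set, nothing more.
    """
    return [
        (doc, score)
        for doc, score in scored_docs
        if all(doc.get(name) == value for name, value in required.items())
    ]


results = [
    ({"id": 1, "is_archived": False, "is_locked": False}, 4.2),
    ({"id": 2, "is_archived": True, "is_locked": False}, 9.7),
    ({"id": 3, "is_archived": False, "is_locked": True}, 1.3),
]

# Document 2 is dropped even though it had the highest score.
visible = apply_filters(results, is_archived=False)
```

The surviving documents keep the scores they had before filtering, so
their relative order does not change.
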


Front page search
-----------------

A front page search is what happens when you start on the front page,
enter a search query in the search box, and click the green arrow.

Front page search does the following:

1. searches only kb and questions
2. (filter) kb articles are tagged with the product (e.g. "desktop")
3. (filter) kb articles must not be archived
4. (filter) kb articles must be in the Troubleshooting (10) or
   How-to (20) category
5. (filter) questions are tagged with the product (e.g. "desktop")
6. (filter) questions must have an answer marked as helpful


Results are scored as specified above.

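
The front page rules listed above can be sketched as a pair of
predicates. The document layout here (a ``doctype`` key, a ``tags``
list, a numeric ``category``) and all function names are assumptions
made for illustration; the real filters live in the search view.

```python
# Category ids as given in the list above.
TROUBLESHOOTING = 10
HOW_TO = 20


def kb_passes(doc, product):
    """Rules 2-4: tagged with the product, not archived, allowed category."""
    return (
        product in doc["tags"]
        and not doc["is_archived"]
        and doc["category"] in (TROUBLESHOOTING, HOW_TO)
    )


def question_passes(doc, product):
    """Rules 5-6: tagged with the product and has a helpful answer."""
    return product in doc["tags"] and doc["has_helpful"]


def front_page_filter(docs, product):
    """Rule 1 plus the filters: only kb and questions, each with its own checks."""
    checks = {"kb": kb_passes, "question": question_passes}
    return [
        doc
        for doc in docs
        if doc["doctype"] in checks and checks[doc["doctype"]](doc, product)
    ]


docs = [
    {"id": 1, "doctype": "kb", "tags": ["desktop"], "is_archived": False, "category": 10},
    {"id": 2, "doctype": "kb", "tags": ["desktop"], "is_archived": True, "category": 20},
    {"id": 3, "doctype": "kb", "tags": ["desktop"], "is_archived": False, "category": 30},
    {"id": 4, "doctype": "question", "tags": ["desktop"], "has_helpful": True},
    {"id": 5, "doctype": "question", "tags": ["mobile"], "has_helpful": True},
    {"id": 6, "doctype": "forum", "tags": ["desktop"]},
]

kept_ids = [doc["id"] for doc in front_page_filter(docs, "desktop")]
```
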


Advanced search
---------------

Each field on the advanced search form corresponds to a filter.

For example, if you search for knowledge base articles in the
Troubleshooting category, then we add a filter where the result has to
be in the Troubleshooting category.

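
A minimal sketch of that mapping, assuming the form data arrives as a
dictionary and that a blank field means "no filter". The field names
and ``filters_from_form`` are hypothetical and do not come from the
Kitsune form.

```python
def filters_from_form(form_data):
    """Turn filled-in form fields into filters; blank fields add no filter."""
    return {
        name: value
        for name, value in form_data.items()
        if value not in (None, "", [])
    }


# Only the category was chosen, so only one filter is added.
form_data = {"category": 10, "is_archived": None, "tags": ""}
filters = filters_from_form(form_data)
```
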


Link to the code
----------------

Here's a link to the search view in the master branch. This is what's
on dev:

https://github.com/mozilla/kitsune/blob/master/apps/search/views.py


Here's a link to the search view in the next branch. This is what's
on staging:

https://github.com/mozilla/kitsune/blob/next/apps/search/views.py