mirror of https://github.com/mozilla/kitsune.git
[bug 721411] Document search scoring
This adds documentation for search scoring as it's currently implemented. Additionally, it adds some minor notes about where the ES-related code is and links to the search view where the filters are. Also adds a link to elasticsearch-head.
Parent
d477bb7436
Commit
1afcc2fc91
@ -22,11 +22,11 @@ search or Google's site search.
 .. Note::
 
-   Right now we're rewriting our search system to use Elastic and
-   switching between Sphinx and Elastic. At some point, the results
-   we're getting with our Elastic-based code will be good enough to
-   switch over. At that point, we'll remove the Sphinx-based search
-   code.
+   Right now we're rewriting our search system to use Elastic Search
+   and switching between Sphinx and Elastic Search. At some point,
+   the results we're getting with our Elastic Search-based code will
+   be good enough to switch over. At that point, we'll remove the
+   Sphinx-based search code.
 
    Until then, we have instructions for installing both Sphinx Search
    and Elastic Search.
 
@ -281,3 +281,139 @@ You can see Elastic Search statistics/health with::
The last few lines tell you how many documents are in the index by
doctype. I use this to make sure I've got stuff in my index.


Tools
-----

One tool that's helpful for Elastic Search work is `elasticsearch-head
<https://github.com/mobz/elasticsearch-head>`_. It's like
phpMyAdmin for Elastic Search.


Implementation details
----------------------

Kitsune uses `elasticutils
<https://github.com/davedash/elasticutils>`_ and `pyes
<https://github.com/aparo/pyes>`_.

Most of our code is in the ``search`` app in ``apps/search/``.

Models in Kitsune that are indexable use ``SearchMixin`` defined in
``models.py``.

Utility functions are implemented in ``es_utils.py``.

Subcommands for ``manage.py`` are implemented in
``management/commands/``.


Search Scoring
==============

These are the defaults that apply to all searches:

kb:

query fields: title, content, summary, keywords

weights:

======== =====
name     value
======== =====
title    6
content  1
keywords 4
summary  2
======== =====

questions:

query fields: title, question_content, answer_content

weights:

================ =====
name             value
================ =====
title            4
question_content 3
answer_content   3
================ =====

forums:

query fields: title, content

weights:

======== =====
name     value
======== =====
title    2
content  1
======== =====
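
As a rough illustration of how the weights above act as per-field boosts, the sketch below stores them in plain dictionaries and combines per-field relevance scores into one number. The names ``WEIGHTS`` and ``boosted_score`` are made up for this example and are not Kitsune code. Real scoring is done by Lucene and is more involved than a weighted sum.

```python
# Illustrative only: WEIGHTS and boosted_score are hypothetical names,
# not part of Kitsune, elasticutils, or pyes.
WEIGHTS = {
    "kb": {"title": 6, "content": 1, "keywords": 4, "summary": 2},
    "questions": {"title": 4, "question_content": 3, "answer_content": 3},
    "forums": {"title": 2, "content": 1},
}


def boosted_score(doctype, field_scores):
    """Combine per-field relevance scores using the doctype's weights.

    field_scores maps a field name to the raw relevance score for that
    field. Fields that are not query fields for the doctype are ignored.
    """
    weights = WEIGHTS[doctype]
    return sum(
        weights[field] * score
        for field, score in field_scores.items()
        if field in weights
    )
```

With these numbers, a kb article that matches equally well in ``title`` and ``content`` gets six times as much credit for the title match as for the content match.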

.. Note::

   The query fields and weights are shared between our Sphinx code and
   our Elastic Search code.


Elastic Search is built on top of Lucene so the `Lucene documentation
on scoring <http://lucene.apache.org/java/3_5_0/scoring.html>`_ covers
how a document is scored with regard to the search query and its
contents. The weights modify that score: they're query-level boosts.

Additionally, we use a series of filters on tags, q_tags, and other
properties of the documents like has_helpful, is_locked, is_archived,
etc. In Elastic Search, filters remove items from the result set, but
don't otherwise affect the scoring.
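
To make the difference between filtering and scoring concrete, here is a small sketch in plain Python. It is not how Elastic Search works internally, and ``apply_filters`` is a hypothetical helper invented for this example. It only shows that a filter drops documents and leaves the scores of the remaining ones untouched.

```python
def apply_filters(scored_docs, **filters):
    """Return the (score, doc) pairs whose doc matches every filter.

    scored_docs is a list of (score, doc) tuples where doc is a dict.
    Scores are passed through unchanged; filtering never re-ranks.
    """
    return [
        (score, doc)
        for score, doc in scored_docs
        if all(doc.get(key) == value for key, value in filters.items())
    ]


# Hypothetical scored results for a query.
results = [
    (2.5, {"id": 1, "is_locked": False}),
    (1.0, {"id": 2, "is_locked": True}),
]
unlocked = apply_filters(results, is_locked=False)
```

Here ``unlocked`` contains only document 1, and its score is the same as before the filter was applied.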


Front page search
-----------------

A front page search is what happens when you start on the front page,
enter a search query in the search box, and click on the green
arrow.

Front page search does the following:

1. searches only kb and questions
2. (filter) kb articles must be tagged with the product (e.g. "desktop")
3. (filter) kb articles must not be archived
4. (filter) kb articles must be in either the Troubleshooting (10) or
   How-to (20) category
5. (filter) questions must be tagged with the product (e.g. "desktop")
6. (filter) questions must have an answer marked as helpful


Results are scored with the default weights listed above.
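
The front page rules above can be sketched as a single predicate over a document. This is an illustration, not the code in the search view: ``front_page_match`` is a hypothetical function, documents are modelled as plain dicts, and the ``doctype`` and ``category`` keys are assumptions made for the example. The other keys reuse the property names mentioned earlier.

```python
# Category ids taken from the list above.
FRONT_PAGE_KB_CATEGORIES = {10, 20}  # Troubleshooting, How-to


def front_page_match(doc, product):
    """Return True if doc would pass the front page filters for product."""
    if doc["doctype"] == "kb":
        return (
            product in doc["tags"]
            and not doc["is_archived"]
            and doc["category"] in FRONT_PAGE_KB_CATEGORIES
        )
    if doc["doctype"] == "question":
        return product in doc["q_tags"] and doc["has_helpful"]
    # Front page search only covers kb and questions.
    return False
```

For example, an archived kb article tagged "desktop" fails the check, while a "desktop" question with a helpful answer passes.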


Advanced search
---------------

Each field in the advanced search form corresponds to a filter applied
to the search.

For example, if you search for knowledge base articles in the
Troubleshooting category, then we add a filter where the result has to
be in the Troubleshooting category.
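
One way to picture the form-to-filter mapping is the sketch below. It is an assumption-laden illustration rather than the actual view code: ``FILTERABLE_FIELDS`` and ``build_filters`` are hypothetical names, and the form is modelled as a plain dict of submitted values.

```python
# Hypothetical set of form fields that translate directly into filters.
FILTERABLE_FIELDS = {"category", "is_locked", "is_archived", "has_helpful"}


def build_filters(form_data):
    """Turn filled-in form fields into a dict of filters.

    Fields the user left empty (None or "") add no filter, and fields
    that are not filterable (such as the query text) are skipped.
    """
    return {
        name: value
        for name, value in form_data.items()
        if name in FILTERABLE_FIELDS and value not in (None, "")
    }


# Searching kb articles in the Troubleshooting category (id 10).
filters = build_filters({"q": "crash", "category": 10, "is_locked": None})
```

In this example ``filters`` ends up holding only the category restriction, which matches the Troubleshooting case described above.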


Link to the code
----------------

Here's a link to the search view in the master branch. This is what's
on dev:

https://github.com/mozilla/kitsune/blob/master/apps/search/views.py


Here's a link to the search view in the next branch. This is what's
on staging:

https://github.com/mozilla/kitsune/blob/next/apps/search/views.py