Telemetry-Aware Addon Recommender
Victor Ng dac25c5b43 damn you dependencies 2018-03-02 12:32:07 -05:00
taar Updated all recommenders to use JSON cache 2018-03-02 12:32:07 -05:00
tests Updated all recommenders to use JSON cache 2018-03-02 12:32:07 -05:00
.gitignore Ignore vscode 2018-03-01 11:47:46 -05:00
.travis.yml Fail if flake8 fails 2018-03-01 11:47:46 -05:00
LICENSE Initial commit 2017-03-15 21:59:28 +00:00
MANIFEST.in Add MANIFEST.in 2017-03-17 11:07:31 +00:00
Makefile Added testing instructions to close off #29 2018-03-01 14:16:16 -05:00
README.md Added more docs for running integration tests and a microbenchmark 2018-03-01 14:16:16 -05:00
environment_emr.yaml damn you dependencies 2018-03-02 12:32:07 -05:00
setup.cfg added a `python setup.py test` directive that invokes pytest 2018-01-11 09:52:42 -05:00
setup.py Bumped to 0.0.25 2018-03-01 14:26:03 -06:00
test_requirements.txt Update Travis to do multiple builds with different environmnts 2018-03-01 11:47:46 -05:00

README.md

Taar

Telemetry-Aware Addon Recommender

How does it work?

The recommendation strategy is implemented by the RecommendationManager. When a recommendation is requested for a specific client id, the manager iterates through all the registered models (e.g. CollaborativeRecommender) linearly, in their registered order. Results are returned from the first model that can perform a recommendation.

Each model specifies its own set of rules and requirements, so it can decide whether it can perform a recommendation independently of the other models.
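
The first-match strategy described above can be sketched roughly as follows. This is a minimal illustration, not the actual taar code: the class names, the `can_recommend` / `recommend` method names, the `client_data` dictionary layout, and the add-on ids are all assumptions made for the example, and the lookup of telemetry data by client id is left out.

```python
class EnabledAddonsRecommender:
    """Hypothetical model: requires at least one enabled add-on."""

    def can_recommend(self, client_data):
        return len(client_data.get("enabled_addons", [])) > 0

    def recommend(self, client_data, limit):
        # Placeholder results; a real model would compute these.
        return ["addon-a", "addon-b"][:limit]


class LocaleFallbackRecommender:
    """Hypothetical model: requires a locale with known top add-ons."""

    TOP_ADDONS = {"en-US": ["addon-x", "addon-y", "addon-z"]}

    def can_recommend(self, client_data):
        return client_data.get("locale") in self.TOP_ADDONS

    def recommend(self, client_data, limit):
        return self.TOP_ADDONS[client_data["locale"]][:limit]


class FirstMatchManager:
    """Hypothetical stand-in for the manager: tries models in order."""

    def __init__(self, recommenders):
        # Registration order is the order in which models are tried.
        self._recommenders = list(recommenders)

    def recommend(self, client_data, limit):
        for recommender in self._recommenders:
            # Each model decides on its own whether it can serve this client.
            if recommender.can_recommend(client_data):
                return recommender.recommend(client_data, limit)
        # No registered model was able to recommend anything.
        return []


manager = FirstMatchManager(
    [EnabledAddonsRecommender(), LocaleFallbackRecommender()]
)
with_addons = manager.recommend(
    {"enabled_addons": ["some-addon"], "locale": "en-US"}, 2
)
locale_only = manager.recommend({"enabled_addons": [], "locale": "en-US"}, 2)
no_match = manager.recommend({"locale": "fr"}, 2)
```

In this sketch, a client with an enabled add-on is served by the first model, a client without one falls through to the locale model, and a client that satisfies no model gets an empty list.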

Supported models

This is the ordered list of the currently supported models:

| Order | Model | Description | Conditions | Generator job |
|-------|-------|-------------|------------|---------------|
| 1 | Legacy | recommends WebExtensions based on the reported and disabled legacy add-ons | Telemetry data is available for the user and the user has at least one disabled add-on | source |
| 2 | Collaborative | recommends add-ons based on add-ons installed by other users (i.e. collaborative filtering) | Telemetry data is available for the user and the user has at least one enabled add-on | source |
| 3 | Similarity | recommends add-ons based on add-ons installed by similar representative users | Telemetry data is available for the user and a suitable representative donor can be found | source |
| 4 | Locale | recommends add-ons based on the top addons for the user's locale | Telemetry data is available for the user and the locale has enough users | source |

Instructions for releasing updates

New releases can be shipped by using the normal GitHub workflow. Once a new release is created, it will be automatically uploaded to PyPI.

A note on cdist optimization

cdist can speed up distance computation by a factor of 10 for the computations we're doing. We can use it without problems for the Canberra distance calculation.

Unfortunately, cdist has multiple problems accepting a string array. The problems differ between scipy 0.18.1 (the version available on EMR) and later versions, but in both cases cdist attempts to convert a string to a double, which fails. For versions of scipy later than 0.18.1, this could be worked around with:

distance.cdist(v1, v2, lambda x, y: distance.hamming(x, y))

However, when you manually provide a callable to cdist, cdist cannot apply its built-in optimizations (https://github.com/scipy/scipy/blob/v1.0.0/scipy/spatial/distance.py#L2408), so we can just apply the function distance.hamming to our array manually and get the same performance.
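
As a rough illustration of the split described above, the sketch below uses cdist for the numeric Canberra distance and applies distance.hamming row by row for the string features. The arrays and variable names are made up for the example and are not taken from the taar code; it assumes numpy and scipy are installed.

```python
import numpy as np
from scipy.spatial import distance

# Hypothetical numeric features: two donors and one client.
donors_numeric = np.array([[1.0, 2.0, 3.0], [4.0, 0.0, 1.0]])
client_numeric = np.array([[1.0, 2.0, 4.0]])

# Numeric input: cdist handles the Canberra metric on its optimized path.
canberra = distance.cdist(donors_numeric, client_numeric, "canberra")

# Hypothetical categorical features stored as strings.
donors_categorical = np.array([["en-US", "Windows"], ["de", "Mac"]])
client_categorical = np.array(["en-US", "Linux"])

# String input: skip cdist and call distance.hamming once per donor row.
hamming = np.array(
    [distance.hamming(row, client_categorical) for row in donors_categorical]
)
```

Here `canberra` has one row per donor and one column per client, while `hamming` holds the fraction of mismatching categorical fields for each donor.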

Build and run tests

You should be able to build taar using Python 2.7 or Python 3.5. To run the test suite, execute:

$ python setup.py develop
$ python setup.py test

Alternatively, if you've got GNU Make installed, you can just run make test, which will do all of that for you and run flake8 on the codebase.

There are additional integration tests and a microbenchmark available in tests/test_integration.py. See the source code for more information.