Commit Graph

314 Commits

Author SHA1 Message Date
Evgeny Pavlov f542a1ec1e
Update TAARLITE-README.md 2022-06-27 16:07:49 -07:00
Evgeny Pavlov 3bdfcaf428
GCS migration, packaging fixes (#188)
* Switch from S3 to GCS
* Refactoring of taar cache to provide independent context for Ensemble spark job
* Reduce external dependencies of recommenders used in Ensemble Spark job
* Settings refactoring
* Update docs
2021-02-05 15:23:11 -08:00
Evgeny Pavlov 47c797d9bd
Fix Related recommender results logging (#186) 2021-01-18 10:10:13 -08:00
Evgeny Pavlov bb835d3d30
Fix randomization bug and logging (#185)
* fix edge case when too few addons are recommended

* add results logging after randomization

* decrease logging levels, get rid of srgutils

* logging fixes, add extra info, structured logging

* fix tests

* move imports inside default context

* add sentry exception capturing

* fix code formatting

* move python logging level option to settings
2021-01-08 15:41:49 -08:00
Evgeny Pavlov 754b402601
Enable randomization (#184)
* Fix randomization to work with negative weights

* Fix tests to take into account randomization

* Remove experiment prob from settings
2020-12-21 14:38:15 -08:00
Evgeny Pavlov f5693b7177
fix checking whether client id is in experiment (#181) 2020-12-09 09:53:17 -08:00
Victor Ng c688525b35 bump to 0.7.5 2020-09-02 12:41:33 -04:00
Victor Ng 4badf3f74d
Perftune (#180)
* Load redis data once and only once per request

* Cache all redis data into the local instance

* Remove warm_caches function as it does nothing

* Added more verbose logging with performance profiling data

* added redis read timer to logging
2020-09-02 12:40:41 -04:00
Victor Ng b33a1b684c
Features/taar in redis (#179)
* Migrated taar locale recommender to use redis

* added a noop fixture loader module for tests

* Converted TAAR Collaborative recommender to use redis

* Dropped hybrid recommender

* Ported similarity recommender to use redis

* Ported ensemble and recommendation manager to use redis

* dropped LazyJSONLoader

* dropped moto dependency

* Renamed AddonsCoinstallCache to TAARCache

* renamed bin/taarlite-redis to bin/taar-redis

* Execute data preprocess step on redis ptr change

* bumped to 0.7.4
2020-09-01 21:39:09 -04:00
Victor Ng 9773053739 bump to 0.7.3 2020-08-31 16:52:32 -04:00
Victor Ng 756f761680
Features/new test client ids (#178)
* initial draft of fix to test client_ids

* added a test to exercise the ensemble recommender with mock client_ids
2020-08-31 16:52:13 -04:00
Victor Ng 12cb0eed46 bump to 0.7.2 2020-08-27 17:55:36 -04:00
Victor Ng c7d0fedfba
Added documentation for taarlite-redis tool (#177)
* Added documentation for taarlite-redis tool

A new run target has also been added to execute the taarlite-redis tool

* Example typo fix
2020-08-27 17:54:47 -04:00
Victor Ng 3700f7979b bumped to 0.7.1 2020-08-27 16:50:34 -04:00
Victor Ng 0047fbbb4e TAARlite returns empty lists if the cache is empty 2020-08-27 16:49:19 -04:00
Victor Ng d2e711aad5 Added .cache_ready() to GuidGuidRecommender
Also force empty lists to return if the cache is cold
2020-08-27 16:15:25 -04:00
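The cold-cache behaviour described in the commit above can be sketched as follows. This is a minimal illustration and not the repository's code: only the names `GuidGuidRecommender` and `.cache_ready()` come from the commit message, while the class `GuidGuidRecommenderSketch`, its `load()` and `recommend()` methods, and the in-memory dictionary are assumptions made for the example.

```python
class GuidGuidRecommenderSketch:
    """Hypothetical stand-in; only cache_ready() is named in the commit."""

    def __init__(self):
        # None means the cache is cold (nothing has been loaded yet).
        self._coinstalls = None

    def load(self, coinstalls):
        # Assumed shape: {guid: {related_guid: install_count}}.
        self._coinstalls = dict(coinstalls)

    def cache_ready(self):
        return self._coinstalls is not None

    def recommend(self, guid, limit=4):
        # Return an empty list instead of failing while the cache is cold.
        if not self.cache_ready():
            return []
        related = self._coinstalls.get(guid, {})
        ranked = sorted(related.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:limit]
```

Returning an empty list keeps callers on a single code path whether or not the cache has been populated.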
Victor Ng cd0bb8a2ee bumped taarlite lock ttl to 60 minutes 2020-08-27 16:04:14 -04:00
Victor Ng d2622ae53b Added taarlite-redis script 2020-08-27 15:55:54 -04:00
Victor Ng 6180a62de0 bumped to 0.7.0 2020-08-26 14:08:13 -04:00
Victor Ng 99f278eaa3
Features/175 taarlite limits (#176)
* Consolidate env configuration into taar.settings and add a TAARLITE_MAX_RESULTS configuration
* Rework TAARLite recommender to use threaded caching
* Added cache warmup prior to process starting
* add target for a local start into makefile
* pytest updates to accommodate new redis requirements
* Added TAARlite GUID Ranking caching
* Reworked taarlite to precompute all values and run with redis
* Truncate the length of the GUID coinstallation list to keep the normalization times bounded for taarlite

Most of the performance problems with TAARlite had to do with normalizing very long lists of GUIDs.  

The normalization method now truncates the list to a multiple of the maximum number of TAARLITE suggestions (controlled by `TAARLITE_TRUNCATE` which defaults to 5x `TAARLITE_MAX_RESULTS`)
2020-08-26 14:01:19 -04:00
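The truncation described in the commit above might look roughly like the sketch below. The setting names `TAARLITE_MAX_RESULTS` and `TAARLITE_TRUNCATE` and the 5x default ratio come from the commit message. The concrete value of `TAARLITE_MAX_RESULTS`, the function `truncate_and_normalize`, and the sum-to-one normalization are assumptions for illustration, so the actual TAARlite normalization may differ.

```python
# Assumed value for illustration; the commit only states the 5x ratio.
TAARLITE_MAX_RESULTS = 4
TAARLITE_TRUNCATE = 5 * TAARLITE_MAX_RESULTS


def truncate_and_normalize(coinstalls, limit=TAARLITE_TRUNCATE):
    """Keep the `limit` most frequent GUIDs, then scale counts to sum to 1.

    `coinstalls` is assumed to map a GUID to its coinstallation count.
    """
    # Truncate first so the normalization cost is bounded by `limit`.
    top = sorted(coinstalls.items(), key=lambda kv: kv[1], reverse=True)[:limit]
    total = sum(count for _, count in top)
    if total == 0:
        return {}
    return {guid: count / total for guid, count in top}
```

With these assumed values, a coinstallation list of any length is cut to at most 20 entries before any per-entry arithmetic happens.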
Victor Ng 70989662e4 Fix CircleCI badge 2020-07-20 12:17:14 -04:00
Victor Ng 6d9690ceb9
Remove basic auth from taarlite (#174) 2020-07-13 13:08:37 -04:00
Victor Ng f88d31b599
Merge taar-lite into main TAAR repository (#173)
* Merge taar-lite into taar

* Enable API endpoint for TAARlite

* Bump version number and add an amended TAARLITE-README.md

* Added curl instruction for invoking TAARlite

* flake8/black fixes

* Add markus[datadog] for statsd metrics

* Add markus statsd metrics all over TAAR
2020-07-13 12:50:19 -04:00
Victor Ng 3891419bcb
Fixed typo in markdown for ETL source 2020-07-06 14:22:11 -04:00
Victor Ng b185583d42
Remerge #157 for weighted randomization (#171)
* Unified patch for #157

* Update ETL job links

Update documentation and removed unnecessary env variables.
Split up some test cases
2020-07-06 14:19:18 -04:00
Victor Ng 37a3fb7bbb
Use newrelic to startup gunicorn service (#172)
* Update newrelic dependency and enable newrelic

* Bump taar to 0.5.1
2020-06-30 15:59:08 -04:00
Victor Ng 316aee7c4f
Update documentation to reflect modern deployment (#170) 2020-06-26 11:42:22 -04:00
Victor Ng fef44f8368 Rename 'dynamo' to 'storage backend' for logging 2020-06-25 12:49:06 -04:00
Victor Ng 4ac0a31afa
Remove AWS Dynamo and port to GCP Cloud BigTable (#169)
* Add error handling for missing records in BigTable

* Switch from Travis to CircleCI badge

* flake8 fixes
2020-06-24 21:36:35 -04:00
Victor Ng 0f57bd5072
Add better documentation (#168)
* Update documentation to reflect prod setup

* dirty commit

* Add stubs for GCP resources

* Add instructions for deletion of user data

* Add link to production YAML configuration

* merged missing docs

* Fill in GCP and Airflow variable information
2020-06-24 16:51:34 -04:00
Victor Ng e1da916205 Bump from 0.4.5 to revision 0.5.0 and require python 3.7 2020-06-16 16:12:36 -04:00
Victor Ng 5be7a59315 Implement a new GCP BigTable Profile database for TAAR 2020-06-16 16:12:36 -04:00
Victor Ng 41f81c9c31 Modernize bin/run to use miniconda
Also simplified the use of codecov.io for the test target in bin/run
2020-06-16 16:12:36 -04:00
Victor Ng efe0baf988 Delete deprecated Dynamo configuration 2020-06-16 16:12:36 -04:00
Victor Ng 3e49cacca7 Update CircleCI to use docker container and miniconda 2020-06-16 16:12:36 -04:00
Victor Ng 3994d24887 Modernize build configuration with miniconda
Remove travis, docker-compose and tox dependency
Delete requirements.txt for virtualenv
Add environment.yml for conda
Update Makefile to use conda build container images
2020-06-16 16:12:36 -04:00
Victor Ng d771ad1d46 updated S3 file location docs 2020-04-27 18:25:26 -04:00
Victor Ng ff4ff1371b Added another test for client_id+addon_id lookup
Tests now differentiate between failure to find a client_id and
a failure to find an addon for a found client_id.
2019-04-29 13:40:04 -04:00
Victor Ng 3f8e588f29 Lots of security updates
* bumped jinja2 and urllib3 per github security
* bumped base python docker image to 3.6.8-stretch as jessie is
  deprecated
* bump up pip to latest version
* bumped version of flake8 and fixed whitespace issues
* dropped Python 2.7 setup as databricks provides a python3 environment now
* dropped enviroment.emr required for python2.7 environment
* dropped thriftpy
* updated travis.yml to use dockerized tests and flake8
* split tests into coverage and no-coverage versions because travis
* disable coverage plugin for tests in travis until write permissions
  are sorted out
2019-04-25 14:48:04 -04:00
Victor Ng 34af9eabd7 added tests for client+addon query for issue 151 2019-04-25 14:48:04 -04:00
Victor Ng b78d821746 patch to enable querying for existence of a (hashed_client_id, addon_id) pair 2019-04-25 14:48:04 -04:00
Mozilla-GitHub-Standards 64569674a2 Add Mozilla Code of Conduct file
Fixes #153.

_(Message COC002)_
2019-03-31 18:08:08 -05:00
Victor Ng ce781f99d1 dropped library requirements in setup.py
Databricks uses it for dependency loading
2019-03-12 13:16:24 -04:00
Victor Ng bf9375fbac fixed link for ensemble recommender code in README.md 2019-03-11 15:41:03 -04:00
Victor Ng 7110da8f85 lowered logging level for failed client lookups in dynamo 2019-02-21 13:23:34 -05:00
Victor Ng ac4f3a3c26 Reworked the matrix construction in similarity recommender so that a fresh JSON file will force the cached matrices to be recomputed.
Added tests to verify matrix reconstruction
2019-02-20 18:28:07 -05:00
Victor Ng 7624fc90a6 default SENTRY_DSN to empty string 2019-02-20 14:26:44 -05:00
Victor Ng b8e9956693 applied black formatting 2019-02-20 14:26:44 -05:00
Victor Ng 2f6c63f718 More logging and safety around the recommend method for each recommender
I've added an exception handler that forces reload of underlying model in the case of recommendation error.
Extra exception logging occurs if any recommender crashes
2019-02-20 14:26:44 -05:00
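One way to picture the exception handler described in the commit above is the sketch below. It is an assumption-laden illustration and not the project's implementation: the class `SafeRecommender`, the `load_model` callable, and the logger name `taar.sketch` are all hypothetical.

```python
import logging

logger = logging.getLogger("taar.sketch")  # hypothetical logger name


class SafeRecommender:
    """Hypothetical wrapper: on a recommendation error, log and reload the model."""

    def __init__(self, load_model):
        # `load_model` is an assumed callable returning a model function
        # with the signature model(client_data, limit) -> list.
        self._load_model = load_model
        self._model = load_model()

    def recommend(self, client_data, limit):
        try:
            return self._model(client_data, limit)
        except Exception:
            # Log the full traceback, then force a fresh copy of the model
            # so a corrupted in-memory state does not persist.
            logger.exception("Recommender crashed; reloading model")
            self._model = self._load_model()
            return []
```

Returning an empty list after a crash trades one missed recommendation for a service that keeps answering requests.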
Victor Ng c1df5b1efb Added a force_expiry() method to the JSON loader 2019-02-20 14:26:44 -05:00