* Switch from S3 to GCS
* Refactor the TAAR cache to provide an independent context for the Ensemble Spark job
* Reduce external dependencies of recommenders used in the Ensemble Spark job
* Settings refactoring
* Update docs
* Load redis data once and only once per request
* Cache all redis data into the local instance
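The per-request caching described above can be sketched roughly as follows. The class and method names here are illustrative, not the actual TAAR API; the sketch only assumes a client exposing `keys()` and `get()`:

```python
class RequestLocalCache:
    """Copy the Redis data needed for one request into process-local
    memory so each key is read from Redis at most once per request."""

    def __init__(self, redis_client):
        self._redis = redis_client
        self._data = None  # filled on first access, then reused

    def _ensure_loaded(self):
        if self._data is None:
            # Single bulk load; every later lookup is a local dict read.
            self._data = {key: self._redis.get(key) for key in self._redis.keys()}

    def get(self, key):
        self._ensure_loaded()
        return self._data.get(key)
```

Instantiating one cache per request gives the "once and only once" property: repeated lookups within the request never touch Redis again.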
* Remove warm_caches function as it does nothing
* Added more verbose logging with performance profiling data
* Added a Redis read timer to logging
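A read timer of this kind could look roughly like the sketch below; the helper name and log format are assumptions, not TAAR's actual code:

```python
import logging
import time

logger = logging.getLogger(__name__)


def timed_redis_read(read_fn, *args, **kwargs):
    # Wrap a single Redis read, logging the wall-clock time it took so
    # slow reads show up in the performance profiling output.
    start = time.perf_counter()
    result = read_fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    logger.info("redis read took %0.4f seconds", elapsed)
    return result, elapsed
```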
* Migrated taar locale recommender to use redis
* Added a no-op fixture loader module for tests
* Converted TAAR Collaborative recommender to use redis
* Dropped hybrid recommender
* Ported similarity recommender to use redis
* Ported ensemble and recommendation manager to use redis
* Dropped LazyJSONLoader
* Dropped the moto dependency
* Renamed AddonsCoinstallCache to TAARCache
* Renamed bin/taarlite-redis to bin/taar-redis
* Execute data preprocess step on redis ptr change
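The pointer-change trigger can be sketched as below. The pointer key name, class, and preprocessing hook are hypothetical; the sketch only shows the pattern of re-running preprocessing when the Redis pointer moves to a freshly loaded buffer:

```python
class PreprocessingCache:
    """Re-run the data preprocessing step whenever the Redis pointer key
    (naming the currently active data buffer) changes."""

    PTR_KEY = "taar_active_buffer"  # hypothetical pointer key name

    def __init__(self, redis_client, preprocess):
        self._redis = redis_client
        self._preprocess = preprocess  # callable: buffer name -> derived data
        self._seen_ptr = None
        self._derived = None

    def get(self):
        ptr = self._redis.get(self.PTR_KEY)
        if ptr != self._seen_ptr:
            # The loader swapped the pointer to a freshly loaded buffer,
            # so rebuild the precomputed structures against it.
            self._derived = self._preprocess(ptr)
            self._seen_ptr = ptr
        return self._derived
```

Between pointer swaps, every call returns the already-derived data without any recomputation.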
* Bumped version to 0.7.4
* Consolidate env configuration into taar.settings and add a TAARLITE_MAX_RESULTS configuration
* Rework TAARLite recommender to use threaded caching
* Added cache warmup prior to process starting
* Added a Makefile target for a local start
* pytest updates to accommodate the new Redis requirements
* Added TAARlite GUID Ranking caching
* Reworked taarlite to precompute all values and run with redis
* Truncate the GUID coinstallation list to keep normalization times bounded for TAARlite
Most of the performance problems with TAARlite had to do with normalizing very long lists of GUIDs.
The normalization method now truncates each list to a multiple of the maximum number of TAARlite suggestions (controlled by `TAARLITE_TRUNCATE`, which defaults to 5x `TAARLITE_MAX_RESULTS`).
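A minimal sketch of the truncation step, assuming illustrative default values (the real ones come from `taar.settings`):

```python
# Assumed defaults for illustration only.
TAARLITE_MAX_RESULTS = 4
TAARLITE_TRUNCATE = 5 * TAARLITE_MAX_RESULTS  # 5x the max suggestions


def truncate_coinstalls(coinstall_guids):
    # Normalization cost grows with list length, so cap the list before
    # normalizing; entries past the cutoff are unlikely to affect the
    # final top-N suggestions.
    return coinstall_guids[:TAARLITE_TRUNCATE]
```

Truncating before normalization is an approximation: it trades a small amount of ranking fidelity in the long tail for a bounded per-request cost.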
* Merge taar-lite into taar
* Enable API endpoint for TAARlite
* Bump version number and add an amended TAARLITE-README.md
* Added curl instruction for invoking TAARlite
* flake8/black fixes
* Add markus[datadog] for statsd metrics
* Add markus statsd metrics all over TAAR
* Update documentation to reflect prod setup
* dirty commit
* Add stubs for GCP resources
* Add instructions for deletion of user data
* Add link to production YAML configuration
* merged missing docs
* Fill in GCP and Airflow variable information
* Remove travis, docker-compose, and tox dependencies
* Delete requirements.txt for virtualenv
* Add environment.yml for conda
* Update Makefile to use conda build container images
* Bumped jinja2 and urllib3 per GitHub security advisories
* Bumped the base Python Docker image to 3.6.8-stretch, as jessie is deprecated
* Bumped pip to the latest version
* Bumped flake8 and fixed whitespace issues
* Dropped the Python 2.7 setup, as Databricks now provides a Python 3 environment
* Dropped enviroment.emr, which was required for the Python 2.7 environment
* Dropped thriftpy
* Updated travis.yml to use dockerized tests and flake8
* Split tests into coverage and no-coverage versions for Travis
* Disabled the coverage plugin for tests in Travis until write permissions are sorted out
* Added an exception handler that forces a reload of the underlying model in the case of a recommendation error
* Extra exception logging occurs if any recommender crashes
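The reload-on-error handler can be sketched as below. The function name and the `reload_model()` hook are hypothetical, not TAAR's actual API; the sketch only shows the pattern of logging the traceback, forcing a model reload, and degrading gracefully:

```python
import logging

logger = logging.getLogger(__name__)


def recommend_with_reload(recommender, client_data, limit):
    # On any recommendation error: log the full traceback, force a
    # reload of the underlying model, and fall back to an empty result
    # so the request still returns cleanly.
    try:
        return recommender.recommend(client_data, limit)
    except Exception:
        logger.exception("recommender crashed; forcing model reload")
        recommender.reload_model()  # hypothetical reload hook
        return []
```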