Commit graph

5087 commits

Author SHA1 Message Date
dependabot[bot] 6c31acb11f
Bump System.Text.RegularExpressions
Bumps System.Text.RegularExpressions from 4.3.0 to 4.3.1.

---
updated-dependencies:
- dependency-name: System.Text.RegularExpressions
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-04 21:16:55 +00:00
Alexey Taymanov 18536b679d
Ataymano/fixes (#185)
* logs sampling

* cleanup

* output path fix

* uniform sampling

* ccb num actions fix

* U-Sql db functions + fixes

* CCB stats renaming

Co-authored-by: ataymano@microsoft.com <Alexey Taymanov>
2020-09-29 10:26:21 -06:00
Yaran Fan 034953f2a4
Code refactoring; output improvements (#181)
Co-authored-by: Yaran Fan <yarfan@microsoft.com>
2020-05-04 01:39:07 -06:00
Yaran Fan 859da7aada
Data Generator improvements (#179)
* Three updates: 1. Single best action guaranteed if increase_winning_margin is greater than 0. 2. Sample with replacement. 3. Add greyscale to the plot.

* Include the accuracy of the latest model in the plot
2020-02-27 13:17:50 -08:00
Yaran Fan 9ef28ccc6a
Group numerical values in the reports; style improvement; bug fix (#178) 2020-02-19 10:33:33 -08:00
Yaran Fan e46f69cfd4
Add Context Explorer (#175)
* Add context explorer

* Data Generator: Add README; Expose add_control_group

* Context explorer add IPS fix, improve visualization, add examples
2020-02-12 19:09:28 -08:00
Alexey Taymanov d23720653e
CB-like stats for CCB (#164)
* cb-like stats for ccb

* SessionId/Timestamp parsing fix (ccb)

* slot switch fix

* Failing CCB dangling reward parsing test

* Ccb dangling reward parsing fix

* CCb dangling rewards analysis fix

* statistics path fix

* ccb cost fix
2020-02-10 07:29:04 -08:00
cheng-tan 8da20bef68
CCB log aggregation (#169)
* online and baseline for ccb log

* ccb prediction

* ccb confidence interval

* Fix the order of online, baseline1 and baseline random

* Add do_decode option to json_cooked

Co-authored-by: Marco Rossi <marco1650@gmail.com>
2020-02-04 11:16:21 -08:00
cheng-tan 3c28f29626
Bug fix: prediction command & loss are inconsistent (#170)
* Bug fix: prediction command & loss are inconsistent

* incorporated policy name inside Command class

* Cleaned settings printout and namespaces extraction

Co-authored-by: Marco Rossi <marco1650@gmail.com>
2020-01-30 11:34:29 -08:00
Alexey Taymanov 72cc7840df Slim logs input (#167) 2020-01-17 17:21:42 -05:00
Alexey Taymanov f7f3524907
Merge pull request #168 from cheng-tan/chunk_log
Converge ExperimentationAzure.py and dashboard_e2e.py
2019-12-31 13:07:39 -05:00
Cheng Tan 8a09fea78e Supports custom policy 2019-12-10 17:50:07 -05:00
Cheng Tan 3f726956e4 Remove vw path argument 2019-12-10 14:23:59 -05:00
Cheng Tan 8c6761370a Remove app folder 2019-12-10 11:22:07 -05:00
Cheng Tan da72c1fd5c Fix memory error 2019-12-10 11:19:24 -05:00
Cheng Tan cd8e71db95 Converge backend 2019-12-10 11:11:11 -05:00
Cheng Tan ed9a2eb3cd Support sas token 2019-12-09 17:28:11 -05:00
Cheng Tan bf75e68aa5 download logs in chunks 2019-12-09 17:09:46 -05:00
Alexey Taymanov 78e6db5247 cost fix (#166) 2019-11-19 14:05:53 -08:00
Marco Rossi ddda35a8bd LogDownloader.py: Add max_download_size to inputs
Download only the head of each Azure Storage blob
2019-10-29 14:20:29 -07:00
Alexey Taymanov ca53084d61 ccb extractor. first version (#163)
* multi-file ccb processing

* cleanup slimlogs settings for ccb
2019-10-24 17:45:42 -07:00
Alexey Taymanov 8b981b9512 Ataymano/stats improvement (#162)
* Statistics over multiple days

* cleanup

* Dangling rewards analysis

* Dangling rewards timing into statistics
2019-10-24 12:04:40 -07:00
Marco Rossi 9c03dbfc1f ds_parse.py: process_files(): added counter of corrupted lines 2019-10-24 12:02:23 -07:00
Marco Rossi 543a9c1d34 ds_parse.py: process_dsjson_file(): improved stats fields readability by using dict 2019-10-24 10:54:55 -07:00
Marco Rossi ec3161c69d ds_parse.py: consolidated all corrupted lines checks into json_cooked() 2019-10-24 10:52:03 -07:00
Marco Rossi 8a43cc9954 LogDownloader.py: Preserve the legacy format in ds.config 2019-10-24 10:03:53 -07:00
Jacob Alber 65e4d66fcc
Merge pull request #160 from ataymano/ataymano/pdrop_stats
Add computation of "pdrop" rate
2019-09-24 10:15:36 -04:00
Alexey Taymanov ae1606d7b1 pdrop statistics 2019-09-23 23:31:38 -04:00
Marco Rossi fa1018175e Removed deprecated Visualization.py -> Use dashboard_utils.py instead 2019-09-04 18:15:28 -07:00
Marco Rossi 9ebe05353e
ActionSetVisualization.py: xticks every hour (#154) 2019-08-29 15:40:50 -07:00
Marco Rossi 3c9b69f9ab Ensure that downloaded file is terminated by \n + skip invalid lines 2019-08-29 15:36:01 -07:00
Dwaipayan Mukhopadhyay 615b28232a Experimentationazuresasktoken (#159)
* add storage account name and sas token to experimentationazure

* Update ExperimentationAzure to use AzureUtil
2019-08-28 10:46:24 -07:00
Marco Rossi afe8865c25 LogDownloader: catch badly formatted ds.config 2019-08-27 14:57:32 -07:00
Marco Rossi 2cdfb7bd4e
DashboardMPI: Hyper parameters grid (#150)
default hp: --power_t 0
--cb_type and --marginal added to hp sweeping
it mirrors commit fcb700febd
2019-08-27 10:16:40 -07:00
Marco Rossi 9b27fe19d6 LogDownloader.py: added if_match to inputs 2019-08-26 11:49:13 -07:00
Marco Rossi af66afc2a5
LogDownloader.py: SAS token and improvements (#149)
* LogDownloader.py: SAS token and improvements
- allow authentication with SAS token
- allow container name to be different from app_id
- removed configparser
- added SAS token support in ds.config
2019-08-26 11:45:32 -07:00
Marco Rossi f6f4c6a5f3 Dashboard Utils: Improvements and fixes
- added default output file path
- added abs to "c" field to ensure it is not negative
- fixed wrong warning error when reward is negative
2019-08-23 08:36:12 -07:00
Marco Rossi 1ed2faf5ae Dashboard Utils: create_stats(): Fixed initialization of d 2019-08-22 15:55:37 -07:00
Marco Rossi fcb700febd
Experimentation.py: Improvements (#138)
- Default hp: --power_t 0
- cb_type can be passed as input
- --cb_type and --marginal added to hp sweeping
- consider input hp values < 1e-8 as 0
2019-08-20 14:08:56 -07:00
Alexey Taymanov 4d4c35d4de Pdrop, IsDangling, ParseError support in USQL extractor (#143)
* Pdrop, IsDangling, ParseError support in USQL extractor

* skipLearn support

* db project + process script

* date in statistics
2019-08-20 11:51:16 -07:00
Sharath Malladi 5b60474ca7 freeze package versions (#145) 2019-08-14 08:55:04 -07:00
Sharath Malladi ca1ea4a550 Users/sharathm/prettyprintfeatureimportance (#144)
* Pretty print feature importance strings to make them more readable

* Remove Constant and Action.Constant from the list of features
2019-08-12 14:31:14 -07:00
Marco Rossi 08199dbe39
dashboard_utils: fixed bug when "_" is in .pred file (#141)
Fixes issue #140
2019-07-23 11:07:42 -07:00
Sharath Malladi 9b759c8f58 Improved file path handling logic (#137) 2019-06-28 19:03:49 -07:00
cheng-tan dd76a0df85 Add dashboard mpi (#133)
* Add dashboard mpi

* change write mode to append, fix azure experiment

* download current date log, enable/disable sweep

* Fix dashboard_utils import path
2019-06-28 08:16:11 -07:00
Marco Rossi 58ddb6d099 dashboard_utils.py: skip error lines 2019-06-06 14:48:18 -07:00
Sharath Malladi 9fa2e78ccb Pass app insights key as parameter rather than environment setting (#135) 2019-05-31 15:04:46 -07:00
Sharath Malladi 19154e0d71 Users/sharathm/avoidemptyfeaturebucket (#134)
* Improve log message

* Avoid empty feature buckets

* Added app insights telemetry logging
2019-05-28 14:32:00 -07:00
Marco Rossi 3171b035de Experimentation.py: Fixed confusing variable name 2019-05-14 16:45:26 -04:00
Sharath Malladi b392bdae41 Azure Batch pipeline for Offline Experimentation (Users/sharathm/azurebatchcf) (#132)
* Azure batch setup to call Experimentation

* Change cleanup logic to delete all files

* Copied vw-important-features.py from https://raw.githubusercontent.com/marco-rossi29/vowpal_wabbit/important-features/utl/vw-important-features

* Changed script to iterate over several l1 values and return the buckets

* Default logdownloader to report progress of downloaded data file

* Fixed issue if model file is empty

* Rename important features to feature importance
2019-05-14 15:25:52 -04:00