docker-etl

Commit Graph

Author	SHA1	Message	Date
Julio	75aaea0a4b	feat(code): Delete channels without members and msgs. Add custom msg when archiving a channel	2024-09-12 05:23:38 -04:00
Julio	4a4c9cd786	feat(code):Workday/Netsuite first commit	2024-09-04 08:08:03 -04:00
Julio Cezar Moscon	b230747dc4	fix(code):Remove comments	2024-08-21 17:19:32 -04:00
Julio Cezar Moscon	25937cbbad	feat(code): Initial commit for the Slack Channel Integration	2024-08-21 17:18:00 -04:00
Jared Snyder	9109957e11	Refactor Config files (#263 ) * refactored base_forecast and prophet_forecast to enable easier testing * Apply suggestions from code review change signatures of `fit` and `predict` to take arguments that default to attributes Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> * add test for fit * revert signatures * made timezone-aware stamps naive * finished base_forecast tests * added tests for prophet class * linting * fixed divide by zero * linting again * adding tests to funnel_forecast * added tests for funnel_forecast * feat(workday):remove unwanted fields (#249) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * fix(exit):Added sys.exit() call (#250) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * fix issue with call to _get_crossvalidation_metric * fixed type check * added string case to aggregate_to_period and added tests * revert file * added more tests to prophet_forecast * removed DotMap * modified README to make it match better between FunnelForecast and ProphetForecast * Update jobs/kpi-forecasting/kpi_forecasting/models/base_forecast.py Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> * Brad easy fixes * remove magic year * removed DotMap * modified README to make it match better between FunnelForecast and ProphetForecast * added test for more complex segments * renamed use_holidays to use_all_us_holidays * typo * added detail to prophet parameter descriptions * updated setting of default start date and added tests * remove print * moved filter and updated tests to relfect this --------- Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> Co-authored-by: JCMOSCON1976 <167822375+JCMOSCON1976@users.noreply.github.com> Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>	2024-08-16 10:01:58 -05:00
Jared Snyder	9a2bc3a34e	added more tests to prophet_forecast (#264 ) * refactored base_forecast and prophet_forecast to enable easier testing * Apply suggestions from code review change signatures of `fit` and `predict` to take arguments that default to attributes Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> * add test for fit * revert signatures * made timezone-aware stamps naive * finished base_forecast tests * added tests for prophet class * linting * fixed divide by zero * linting again * adding tests to funnel_forecast * added tests for funnel_forecast * feat(workday):remove unwanted fields (#249) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * fix(exit):Added sys.exit() call (#250) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * fix issue with call to _get_crossvalidation_metric * fixed type check * added string case to aggregate_to_period and added tests * revert file * added more tests to prophet_forecast * Update jobs/kpi-forecasting/kpi_forecasting/models/base_forecast.py Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> * Brad easy fixes * remove magic year * feat(code):increasing the max_limit from 10 to 40. (#259) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * typo * revert bugfix in _add_regressors * update tests to reflect reversion --------- Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> Co-authored-by: JCMOSCON1976 <167822375+JCMOSCON1976@users.noreply.github.com> Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>	2024-08-14 09:30:00 -05:00
Jared Snyder	bae0202d0b	feat: kpi forecasting add funnel_forecast unit tests (#248 ) * refactored base_forecast and prophet_forecast to enable easier testing * Apply suggestions from code review change signatures of `fit` and `predict` to take arguments that default to attributes Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> * add test for fit * revert signatures * made timezone-aware stamps naive * finished base_forecast tests * added tests for prophet class * linting * fixed divide by zero * linting again * adding tests to funnel_forecast * added tests for funnel_forecast * feat(workday):remove unwanted fields (#249) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * fix(exit):Added sys.exit() call (#250) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> * fix issue with call to _get_crossvalidation_metric * fixed type check * added string case to aggregate_to_period and added tests * revert file * typo * revert bugfix in _add_regressors * update tests to reflect reversion --------- Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> Co-authored-by: JCMOSCON1976 <167822375+JCMOSCON1976@users.noreply.github.com> Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>	2024-08-14 09:11:26 -05:00
JCMOSCON1976	7324256178	fix(code):Fix the num_changes logic that was causing false deletions (#268 ) * fix(code):Fix the num_changes logic that was causing false deletions * fix(code):Remove comments. * fix(code):Add comments --------- Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-08-14 10:08:23 +01:00
JCMOSCON1976	2f4b0231ed	feat(link):Change sites_url and people_url to Prod (#267 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-08-12 17:56:03 +02:00
JCMOSCON1976	3773846de0	feat(code):Add max_limit parameter and more logs (#266 ) * feat(code):added max_limit parameter and more logs * feat(code):added limit parameter and logs --------- Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-08-09 17:39:45 -05:00
Andrew Halberstadt	c3a1a325af	fix(fxci): batch rows when inserting into BigQuery (#265 ) It turns out there's a 10MB limit when inserting data into BigQuery, and we can sometimes exceed that. The chunk size is set to 25000 because ~50k rows was about 11MB of json, so halving that seems safe. This also upgrades to Python 3.12 as that's when `itertools.batched` was introduced. Possibly a silly reason to upgrade, but why not.	2024-08-09 17:15:27 -04:00
Ksenia	bd37bed017	Add bugzilla token as an env variable as a default option for argparse (#262 )	2024-08-09 10:47:19 -07:00
Andrew Halberstadt	c359712d08	fix(fxci): use the proper class in the pulse callbacks (#261 )	2024-08-08 13:01:26 -07:00
Ksenia	7e51ec40f2	Add chunking to avoid long GET request to bugzilla (#260 )	2024-08-07 11:49:57 -07:00
Andrew Halberstadt	2068dd0128	Change fxci metric export to process all data from yesterday, rather than from 10 minutes ago (#253 )	2024-08-07 17:29:55 +02:00
JCMOSCON1976	4c4891b07a	feat(code):increasing the max_limit from 10 to 40. (#259 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-08-07 08:58:28 -05:00
JCMOSCON1976	5baee746b0	fix(sso_id):Fixed the sso and changed the default max_limit parameter to 10 (#258 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com> Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>	2024-08-06 12:09:45 -05:00
Jared Snyder	5600019205	fix _fit method to avoid error (#257 )	2024-08-06 11:04:26 -05:00
Andrew Halberstadt	cd8aa09457	Add some tests to the fxci-taskcluster-export (#252 ) * chore(fxci): create separate lockfile for test requirements * test(fxci): add some basic tests for the fxci-taskcluster-export	2024-08-06 11:15:53 -04:00
JCMOSCON1976	9be3446f2a	fix(get_users): fixed hire_dates_inv get (#256 ) * fix(get_users): fixed hire_dates_inv get * fix(hire_dates_inv):fixed hire dtes inv --------- Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-08-05 18:31:04 +02:00
JCMOSCON1976	781e2746ea	feat(link): Changed URL Links to prod (#255 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-08-05 09:49:05 -05:00
JCMOSCON1976	74e4aad170	feat(limit):Added a limit parameter and added some more logs. (#254 )	2024-08-02 14:34:39 -04:00
Ksenia	d0b969f98f	Use issue body for bugbug classification when translation is missing (#251 )	2024-08-01 13:59:03 -07:00
JCMOSCON1976	e2b17d0c45	fix(exit):Added sys.exit() call (#250 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-07-30 08:25:36 -07:00
JCMOSCON1976	8dea005087	feat(workday):remove unwanted fields (#249 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-07-29 13:35:12 -07:00
Andrew Halberstadt	54185cdfcd	docs(fxci): improve configuration docs for the fxci_etl module (#247 )	2024-07-26 15:14:20 -04:00
JCMOSCON1976	8977256f26	feat():add more logs, and changes in the api_adapter and everfi files (#246 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-07-25 08:50:36 -07:00
Jared Snyder	2ac8f534bb	refactored base_forecast and prophet_forecast to enable easier testing (#232 ) * refactored base_forecast and prophet_forecast to enable easier testing * Apply suggestions from code review change signatures of `fit` and `predict` to take arguments that default to attributes Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com> * add test for fit * revert signatures * made timezone-aware stamps naive * finished base_forecast tests * added tests for prophet class * linting * fixed divide by zero * linting again * choose less confusing dates for tests and leave a comment --------- Co-authored-by: Brad Ochocki Szasz <bochocki@mozilla.com>	2024-07-23 12:34:16 -05:00
Ksenia	3ec84ea090	Add scopes for access to gdrive (#243 )	2024-07-22 15:10:10 -07:00
jgraham	9f2e2e5687	Webcompat metric history (#240 ) * Remove click dependency. This doesn't really save any code vs argparse in this case. * Fix logging and enable command line configuration of level * Record history of metric scores To enable us to see what the metrics looked like previously, run a job that reads the metrics once per day and appends the results to a history table. --------- Co-authored-by: Ksenia <kberezina@mozilla.com>	2024-07-22 10:03:27 -07:00
Jeffrey Silverman	2ab4ed3ef2	fixed some typos in the README (#239 )	2024-07-17 09:26:52 -07:00
Daniel Mueller	d93fd68b90	fix incorrect value for timestamp field in results table (#241 )	2024-07-16 12:42:46 -04:00
Daniel Mueller	57eda920a5	[dev dap collector] support more time precision settings on tasks (#234 ) * support more time precision settings on tasks * make the same separate functions for checking time precision and date validity * review feedback	2024-07-16 10:42:34 -04:00
Daniel Mueller	ef4e640ab9	[prod dap collector] support more time precision options (#236 ) * updating prod ppa collector to support more time precision options * variable names for start/end of collection periods, split out functions to confirm collection date and task are valid	2024-07-12 15:32:34 -05:00
Andrew Halberstadt	160eb5f677	fix(fxci): Make the pulse and monitoring configs optional (#238 ) This way you don't need to define configuration for commands you aren't even going to use.	2024-07-12 14:37:24 -05:00
Andrew Halberstadt	bb45a20f05	feat(fxci): add logic to automatically create tables with correct schemas if they don't exist (#237 ) This ended up being a bit more complicated than I wanted, but since I had all the types of the records defined in the dataclass, I didn't want to hard code them all just for a BigQuery schema. So instead this uses the `generate_schema` function to automatically derive the table schema from the `Record` classes. Was it worth doing this way? Probably not. But it works well, so I think it's worth keeping. This code can be re-used for future tables as well.	2024-07-12 13:21:26 -04:00
Daniel Mueller	96eff62047	record the collection time to match when the aggregations are being queried for (#216 )	2024-07-11 10:31:54 -04:00
Jared Snyder	4fb2fdbc88	Kpi forecasting: add model performance analysis (#223 ) * feat: add class to do validation and module to run it * replaced black with ruff and updated files * ruff updates * update circle ci configs * revert configs and set config to overwrite existing data * add tests file with one test * added aggegation_period to groupby and join * added more tests * renamed validator to result_processing * added to and cleaned up README * fixes for name change * fix ambiguous wording * update modelperformanceanalysis class and rename * finished tests and made small updates * modify code to not create client when output project is empty string * fixing ci * add more to query_cte docstring * refactored model_performance_analysis.py to take a config file * update configs to use real names * renamed model to forecast * Jeff Changes	2024-07-11 09:09:55 -05:00
Ksenia	4c8be08d9e	Update README for broken-site-report-ml job (#228 )	2024-07-10 23:46:43 -04:00
Andrew Halberstadt	c4e47e8a00	feat(fxci): add an implicit 'submission_date' field to all records (#233 ) We need a date field to use for partitioning. Apparently the standard name for this is 'submission_date'. So add this to all records across all tables. Bug: 1904928	2024-07-10 11:55:32 -04:00
Andrew Halberstadt	2d39b8b0c2	Miscellaneous fixes for the firefox-ci export job (#230 ) * fix(fxci): make bigquery table names configurable The tables are defined and versioned in bigquery-etl. Make the names configurable so we don't need to rebuild the image when they change. * fix(fxci): don't assume the last run is the one that generated the pulse event While testing, I discovered that it's possible for a pulse event to contain runs later than the one that generated the event. So it's not safe to simply grab the last run in the list. Instead we should use the run from the index in the top-level `runId` field. * fix(fxci): skip tasks that hit 'deadline-exceeded'	2024-07-09 14:43:38 -04:00
JCMOSCON1976	7a9d942896	Removing the getpocket.com domain name from the list of accepted emails from Workday (#231 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-07-09 08:40:23 -07:00
Andrew Halberstadt	a2d6c2be54	Bug 1904928 - Add new image for exporting data for the Firefox-CI Taskcluster instance (#227 )	2024-07-08 12:23:12 -04:00
JCMOSCON1976	68ae66610a	fix(file) removing test.py (#226 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-07-08 08:30:21 -07:00
jgraham	f8c19fb6d6	Webcompat site report (#212 ) * Import bugzilla bugs by use Rather than importing all the Web Compat product bugs in one query and then filtering the bugs in code, try to match the queries to the kind of bug, to reduce the amount of filtering code required. * Import webcompat:site-report bugs These are added to those in the Web Compatibility::Site Reports component. * Add mypy type annotations This also changes the way we store bugs from lists to dictionaries mapping bug id to the bug, which reflects the fact that bugs should be unique by id and we want to look them up. * Don't set start_time as a class property * Add mypy type checker * Update CI config * Move to returning fetch status from the function The advantage over setting a class variable is that we aren't relying on state, and can depend on mypy to validate that we always check the returned value and take appropriate action. * Add breakage_reports entry for kb entries that are also site reports If a core bug has both `webcompat:platform-bug` and also `webcompat:site-report` as keywords it will already be considered a kb entry and a site report, but needs an entry pointing to itself in the breakage_reports table. * Apply suggestions from code review Co-authored-by: Ksenia <kberezina@mozilla.com> --------- Co-authored-by: Ksenia <kberezina@mozilla.com> Co-authored-by: Anna Scholtz <anna@scholtzan.net>	2024-07-03 08:59:21 -07:00
JCMOSCON1976	77b937cfc1	Workday-Everfi integration first commit (#224 ) Co-authored-by: Julio Cezar Moscon <jcmoscon@gmail.com>	2024-07-03 13:54:03 +02:00
Jared Snyder	6717a07105	Kpi forecasting: fix tests (#222 ) * replaced black with ruff and updated files * ruff updates * update circle ci configs * reverted configs	2024-07-01 14:59:33 -05:00
m-d-bowerman	c4f31afd0a	Change references to forecast dataset and table (#221 )	2024-07-01 11:19:23 -05:00
Daniel Mueller	d300d70bb1	PPA Prod adjust how secrets are handled (#219 ) * handle secrets differently * removed secret params from main method signature * simpler adjustment to read values from env vars, still use @click & specify the envvar name for the couple of values	2024-07-01 11:54:45 -04:00
m-d-bowerman	629c4926c7	Fix underscore in project name (#220 )	2024-07-01 08:40:28 -07:00

358 commits (all branches)
