- reran on our Databricks cluster
- added compatibility code so it runs under Python 2
- added some description
- print out each count as it is computed
- cleaned up final print statements
- reran to capture outputs
- added compatibility code so it runs under Python 2
- refactored analysis to create a cached base dataset at the start and run analyses against that
- load session replay sites from file rather than Spark temp table
- refactored analysis to drop (page, script) duplicates at the top level
- restructured the top summary section and updated all the numbers
- refactored the suffix analysis to use a single joined DF
Script URLs were not being counted correctly, so a number of analysis
results were incorrect. This fixes those errors.
Also adds an analysis that looks at the percentage of function calls
created using eval in the sample, to compare with the Spark results.
- How many function calls are created using eval.
- How many web pages use eval.
- How many scripts with functions created using eval are hosted on
  different domains than the web pages that use them.
The previous analysis of eval usage lacked an initial preamble and
problem justification, only looked at one sample, and compared website
domains rather than the companies that control those domains. This
change fixes those issues and makes a number of other improvements,
including:
- Adding a summary of results in the analysis introduction section to
  make the analysis easier to read.
- Giving justification for why each analysis is being performed.
- Giving an interpretation of the results so that their significance
  can be more easily understood.