OpenWPM/test/openwpm_jstest.py

45 lines
1.6 KiB
Python

Arbitrary WebAPI JS instrumentation (#642) * Add mdn-browser-compat-data * js_instrument_modules as list * Add mdn-browser-compat * Pass a list of instrumentingFunctions * Script to generate api data * Working give or take Getting errors like OpenWPM: Error name: TypeError post_request_ajax.html:237:17 OpenWPM: Error message: can't redefine non-configurable property "UNSENT" post_request_ajax.html:238:17 * Small naming cleanup * Handle non-configurable properties * Lint * Add aspirational API * Begin migration to new JSInstrumentationRequest interface. * We build and mandate LogSettings. * We have a new JSInstrumentatinRequest that everything runs through * Preset, fingerprinting, will be specified in JSON * Continue making progress Enum for Operation * Begin implementing jsModuleRequest validation. Changing my mind - all validation and construction to be done python side. This will reduce JS overhead at runtime. * Big cleanout- js-instrumentation work moving to python. * Continue update to python js-instrumentation * Lint Can't do all the things I want to with typing due to scope when content is loaded into page. * noqa on wip jsinstrumentation file * Begin updating existing js instrument tests. * Small cleanups * Fix naming in calling instrumentJS * No display mode native for testing * Restore py test file to orig. * Support null propertiesToInstrument * Re-work instrumentObject tests * Clean-up text in test page. * Add default to getLogSettings function * Don't re-assign logSettings.propertiesToInstrument * Revert "Don't re-assign logSettings.propertiesToInstrument" This reverts commit 87ccdabf9a97a5b50e33d66754b764fa91738a90. * Better assign propertiesToInstrument * Small cleanup * Make new logSettings object * Prettify * Small clean * -- BREAK -- JS Rework complete With this commit, the JS side of this PR is complete. 
Tests are still failing as fingerprinting implementation has not been completed on the python side, but all test_js_* tests are passing due to the core JS API rework being in place. * Write-out mdn compat data to js_instrumentation .py * Dry out js test code * Consolidate JS tests * Finish missing renames, and add test js via browser_params * pep8 * New files and failing tests. * Add a json schema for js_instrument_modules * Latest py tests * Flake8 * Ongoing progress. * More code, more tests. * flake * Rename mdn file * Add latest tests - just implement fingerprinting.json * flake8 * Add fingerprinting.json (incomplete) Mimetypes and plugins * Correct logSettings property name * Restore create_xpi as function Needed by manual_test * Make explicit option for logging to console * Process browser_params in task manager * Start being able to pass browser_params to selenium Also update manual_test to use click * Revert "Make explicit option for logging to console" This reverts commit c840fbc5d86350aa4d2f59e645036299a5a8195e. * Get manual_test working with browser_params From toplevel directory run: `python -m test.manual_test --selenium --browser-params --browser-params-file=debug_params.json` * More robust test for simple fingerprinting output Can't guarantee order of string output * Add timing information when testing * Make recheck really fast. You'll never hit this recheck as it all happens before page load. * Handle all inputs properly * Debug with all window params instrumented * Load xpi we just built * Check for ff version support * Save a bunch of properties * Relax constraints on what we can instrument. Let failing happen during instrumentation by using subscript notation. Don't restrict to MDN list. * Correct stringifying * Better name example params, fix some bugs, sample a_f Some example browser_params - a_f is just working - but crushes on a page like google.com. g_l and m_z haven't been vetted yet. 
* flake8 * Move example browser_params file out of harms way * Add failing test for regression I introduced. * Fix for regression. * Add simple mimeTypes and plugins to fingerprinting. * Lint JS * Rm mdn_browser_comat stuff no longer needed * Remove example_browser_params They're not used in tests, were just for my testing. * Load JS_INSTRUMENT_MODULES from JSON string * Rename JS_INSTRUMENT_MODULES to JS_INSTRUMENT_SETTINGS * Fixes #28 - Instrument all window.navigator properties. * Finish removing unused mdn-compat pieces. * EventID as a shadow variable * Flake8 * Remove $ prefix and rename $instrumentionRequests -> jsInstrumentationSettings * Rename jsInstrumentationRequests->jsInstrumentationSettings * TS Lint * Remove use of "request". Rename python side as per discussion with @englehardt. Privatize most methods Numpy docstrings for public methods * Convert assertions to ValueErrors * Rename file/folder and fingerprinting -> collection_fingerprinting file JSInstrumentation.py -> js_instrumentation/__init__.py collections have their own folder * Clean-up naming in schema * Add processing of json schema to documentation * Rename js_instrumentation again and ref schema location * Pass JSON not a js string * Do copying to xpi in npm postbuild step * Fix import in manual_test * Revert "Pass JSON not a js string" This reverts commit 8eb4edb5422fee30208882ca940cddc9688852dd. 
* Add titles to schema pieces * Add docs for js_instrument_settings * Bit more README cleanup * Update README.md Co-authored-by: Steven Englehardt <englehardt@gmail.com> * Move updating schema docs section * Add title * Fix typo in mac-osx hyperlink * Make the single-key dictionary clearer * Remove versions from npm package files * Clean up instrument_existing_window_property.html and js We're not using the js in two htmls now, so unify like other test files * Fix pyside instrumentation test, add more clarificaiton to README * pyside test must instrument browser apis * add more to readme to clarify instrumenting * Use example.com and example.org as localDomains * context-manage open, and flake8 Co-authored-by: Steven Englehardt <englehardt@gmail.com>
2020-07-09 01:55:22 +03:00
import re
Data Aggregator Rewrite (#753) * First steps in the rewrite * Fixed import paths * One giant refactor * Fixing tests * Adding mypy * Removed mypy from pre-commit workflow * First draft on DataAggregator * Wrote a DataAggregator that starts and shuts down * Created tests and added more empty types * Got demo.py working * Created sql_provider * Cleaned up imports in TaskManager * Added async * Fixed minor bugs * First steps at porting arrow * Introduced TableName and different Task handling * Added more failing tests * First first completes others don't * It works * Started working on arrow_provider * Implemented ArrowProvider * Added logger fixture * Fixed test_storage_controller * Fixing OpenWPMTest.visit() * Moved test/storage_providers to test/storage * Fixing up tests * Moved automation to openwpm * Readded datadir to .gitignore * Ran repin.sh * Fixed formatting * Let's see if this works * Fixed imports * Got arrow_memory_provider working * Starting to rewrite tests * Setting up fixtures * Attempting to fix all the tests * Still fixing tests * Broken content saving * Added node * Fixed screenshot tests * Fixing more tests * Fixed tests * Implemented local_storage.py * Cleaned up flush_cache * Fixing more tests * Wrote test for LocalArrowProvider * Introduced tests for local_storage_provider.py * Asserting test dir is empty * Creating subfolder for different aggregators * New depencies and init() * Everything is terribly broken * Figured out finalize_visit_id * Running two event loops kinda works??? 
* Rearming the event * Introduced mypy * Downgraded black in pre-commit * Modifying the database directly * Fixed formatting * Made mypy a lil stricter * Fixing docs and config printing * Realising I've been using the wrong with * Trying to figure arrow_storage * Moving lock initialization in in_memory_storage * Fixing tests * Fixing up tests and adding more typechecking * Fixed num_browsers in test_cache_hits_recorded * Parametrized unstructured * String fix * Added failing test * New test * Review changes with Steven * Fixed repin.sh and test_arrow_cache * Minor change * Fixed prune-environment.py * Removing references to DataAggregator * Fixed test_seed_persistance * More paths * Fixed test display shutdown * Made cache test more robust * Update crawler.py Co-authored-by: Steven Englehardt <senglehardt@mozilla.com> * Slimming down ManagerParams * Fixing more tests * Update test/storage/test_storage_controller.py Co-authored-by: Steven Englehardt <senglehardt@mozilla.com> * Purging references to DataAggregator * Reverted changes to .travis.yml * Demo.py saves locally again * Readjusting test paths * Expanded comment on initialize to reference #846 * Made token optional in finalize_visit_id * Simplified test paramtetrization * Fixed callback semantics change * Removed test_parse_http_stack_trace_str * Added DataSocket * WIP need to fix path encoding * Fixed path encoding * Added task and crawl to schema * Fixed paths in GitHub actions * Refactored completion handling * Fix tests * Trying to fix tests on CI * Removed redundant setting of tag * Removing references to S3 * Purging more DataAggregator references * Craking up logging to figure out test failure * Moved test_values into a fixture * Fixing GcpUnstructuredProvider * Fixed paths for future crawls * Renamed sqllite to official sqlite * Restored demo.py * Update openwpm/commands/profile_commands.py Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com> * Restored previous behaviour of DumpProfileCommand 
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com> * Removed leftovers * Cleaned up comments * Expanded lock check * Fixed more stuff * More comment updates * Update openwpm/socket_interface.py Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com> * Removed outdated comment * Using config_encoder * Renamed tar_location to tar_path * Removed references to database_name in docs * Cleanup * Moved screenshot_path and source_dump_path to ManagerParamsInternal * Fixed imports * Fixing up comments * Fixing up comments * More docs * updated dependencies * Fixed test_task_manager * Reupgraded to python 3.9.1 * Restoring crawl_reference in mp_logger * Removed unused imports * Apply suggestions from code review Co-authored-by: Steven Englehardt <senglehardt@mozilla.com> * Cleaned up socket handling * Fixed TaskManager.__exit__ * Moved validation code into config.py * Removed comment * Removed comment * Removed comment Co-authored-by: Steven Englehardt <senglehardt@mozilla.com> Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
2021-02-22 19:51:32 +03:00
from pathlib import Path
from typing import List, Optional, Tuple
from openwpm.config import BrowserParams, ManagerParams
from openwpm.utilities import db_utils
from .openwpmtest import OpenWPMTest


class OpenWPMJSTest(OpenWPMTest):
    def get_config(
        self, data_dir: Optional[Path]
    ) -> Tuple[ManagerParams, List[BrowserParams]]:
        manager_params, browser_params = self.get_test_config(data_dir)
        browser_params[0].js_instrument = True
        manager_params.testing = True
        return manager_params, browser_params

    def _check_calls(
        self,
        db,
        symbol_prefix,
        doc_url,
        top_url,
        expected_method_calls,
        expected_gets_and_sets,
    ):
"""Helper to check method calls and accesses in each frame"""
rows = db_utils.get_javascript_entries(db, all_columns=True)
observed_gets_and_sets = set()
observed_calls = set()
for row in rows:
2020-09-11 16:14:09 +03:00
if not row["symbol"].startswith(symbol_prefix):
                continue
            # Strip the literal prefix once; escape it so "." is not treated
            # as a regex wildcard.
            symbol = re.sub(re.escape(symbol_prefix), "", row["symbol"], count=1)
            assert row["document_url"] == doc_url
            assert row["top_level_url"] == top_url
            if row["operation"] in ("get", "set"):
                observed_gets_and_sets.add((symbol, row["operation"], row["value"]))
2020-07-09 01:55:22 +03:00
else:
    observed_calls.add((symbol, row["operation"], row["arguments"]))
assert observed_calls == expected_method_calls
assert observed_gets_and_sets == expected_gets_and_sets