OpenWPM/demo.py

72 lines
2.6 KiB
Python

Command refactoring (#750) * Refactored GetCommand, BrowseCommand to have execute method * Fixed type name format issues in __issue_command * Fixed everything I broke * Changed import style so tests can run * Added BrowseCommad to imports * Added some more self * Added logging to explain failing test * Added one more self * attempt at refactoring save_screenshot * fixed indentation, attempt at refactoring save_screenshot * refactored SaveScreenshot command to have execute method * reformatted code using black * Ported SaveScreenshotCommand It now uses the new command.execute(...) syntax * refactored savefullscreenshot command to follow command sequence * formatted files with black * removed extraneous commands * Ported SaveScreenshotFullPage #763 * refactored dump page source and formatted code with black * reformatted recursive dump page source command and formatted code w black * formatted files using isort * formatted all files with isort * Ported DumpPageSource and RecursiveDumpPageSource (#767) * refactor finalize command * refactored initalize command and formatted with black and isort * missed a conflict * Command refactoring (#770) * attempt at refactoring save_screenshot * fixed indentation, attempt at refactoring save_screenshot * refactored SaveScreenshot command to have execute method * reformatted code using black * refactored savefullscreenshot command to follow command sequence * formatted files with black * removed extraneous commands * refactored dump page source and formatted code with black * reformatted recursive dump page source command and formatted code w black * formatted files using isort * formatted all files with isort * refactor finalize command * refactored initalize command and formatted with black and isort * missed a conflict * Ran isort * Added append_command * remove custom function command and format code * Refactored GetCommand, BrowseCommand to have execute method * Fixed type name format issues in __issue_command * Fixed 
everything I broke * Changed import style so tests can run * Added BrowseCommad to imports * Added some more self * Added logging to explain failing test * Added one more self * Ported SaveScreenshotCommand It now uses the new command.execute(...) syntax * Ported SaveScreenshotFullPage #763 * Ported DumpPageSource and RecursiveDumpPageSource (#767) * Command refactoring (#770) * attempt at refactoring save_screenshot * fixed indentation, attempt at refactoring save_screenshot * refactored SaveScreenshot command to have execute method * reformatted code using black * refactored savefullscreenshot command to follow command sequence * formatted files with black * removed extraneous commands * refactored dump page source and formatted code with black * reformatted recursive dump page source command and formatted code w black * formatted files using isort * formatted all files with isort * refactor finalize command * refactored initalize command and formatted with black and isort * missed a conflict * Ran isort * Added append_command * remove duplicate append_command * Refactored GetCommand, BrowseCommand to have execute method * Fixed type name format issues in __issue_command * Fixed everything I broke * Changed import style so tests can run * Added BrowseCommad to imports * Added some more self * Added logging to explain failing test * Added one more self * Ported SaveScreenshotCommand It now uses the new command.execute(...) 
syntax * Ported SaveScreenshotFullPage #763 * Ported DumpPageSource and RecursiveDumpPageSource (#767) * Command refactoring (#770) * attempt at refactoring save_screenshot * fixed indentation, attempt at refactoring save_screenshot * refactored SaveScreenshot command to have execute method * reformatted code using black * refactored savefullscreenshot command to follow command sequence * formatted files with black * removed extraneous commands * refactored dump page source and formatted code with black * reformatted recursive dump page source command and formatted code w black * formatted files using isort * formatted all files with isort * refactor finalize command * refactored initalize command and formatted with black and isort * missed a conflict * Ran isort * Added append_command * generate new xpi * Fixing tests * Fixing tests * Fixing up more tests * Removed type annotations * Fixing tests * Fixing tests * Removed command_executor * Moved Commands to commands * Fixing imports * Fixed skipped test * Removed duplicate append_command * docs: update adding command in usingOpenWPM * Forgot to save * Removed datadir * Cleaning up imports * Implemented simple command * Added documentation to simple_command.py * Renamed to custom_command.py * Moved docs around * Referencing BaseCommand.execute * Update docs/Using_OpenWPM.md Co-authored-by: Steven Englehardt <senglehardt@mozilla.com> Co-authored-by: Cyrus <cyruskarsan@gmail.com> Co-authored-by: cyruskarsan <55566678+cyruskarsan@users.noreply.github.com> Co-authored-by: Steven Englehardt <senglehardt@mozilla.com>
2021-01-09 13:15:01 +03:00
from custom_command import LinkCountingCommand
from openwpm.command_sequence import CommandSequence
from openwpm.commands.browser_commands import GetCommand
Refactoring browser and manager params into dataclasses (#807) * initial file commit * add new dependency for dataclasses * implemeted basic BrowserParams dataclass * dependencies update * file reformat * implemented basic ManagerParams dataclass * Update environment dependencies * Added new error class to validate browser and manager params * file reformat * Update scripts/environment-unpinned.yaml Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * added validations for BrowserParams dataclass * Update openwpm/config.py Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Removed unnecessary checks Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Changed error string formatting Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Changed filenamea and necessary imports to resolve conflicts with new master branch(refering to PEP-8 reformatting) * Revert "Changed filenamea and necessary imports to resolve conflicts with new master branch(refering to PEP-8 reformatting)" This reverts commit e550c3bd604f415272bd05ee3d9c76397ad98006. * Revert "Merge branch 'master' into turn_browser_and_manager_params_into_dataclasses" This reverts commit aff5a384e737477746d6a38d3b2be6244f8dfd11, reversing changes made to 6ecaf5d0a94d376126692c3785692ba10626d88a. * Revert "Update environment dependencies" This reverts commit 385825b10aee4610a6e304122bec4ab2b7219a5b. * Revert "Merge branch 'turn_browser_and_manager_params_into_dataclasses' of https://github.com/ankushduacodes/OpenWPM into turn_browser_and_manager_params_into_dataclasses" This reverts commit 6ecaf5d0a94d376126692c3785692ba10626d88a, reversing changes made to e550c3bd604f415272bd05ee3d9c76397ad98006. 
* file reformat * finalized validate_browser_params function * fixed typo in error string * added validations for manager_params * Explanation for using list for supported browser * Revert "Revert "Merge branch 'master' into turn_browser_and_manager_params_into_dataclasses"" This reverts commit 6c3e98e57bd9c42acd029c74649742dcc81de86c. * Revert "Revert "Changed filenamea and necessary imports to resolve conflicts with new master branch(refering to PEP-8 reformatting)"" This reverts commit fc8f48f1878ea7c43b342989ce581dc3d6eab929. * import name change from .Error to .error * moved call_instrument check to config.py * fixed accidental use of dict syntax in a class * moved save_content check from deploy_firefox.py * deleting redundent file * deleted more redundent files * removed redundant imports * added new save_content check * property name changevariables can not have '-' * added new attribute to ManagerParams * adapted files to validate manager & broswer params - also added logic to convert the objects(BrowserParams and ManagerParams) to dictionaries to not break the functionality - also updated demo.py to work with new file names on this branch * removed obsolete documentaion * Dependency Update * Revert "Dependency Update" This reverts commit 8ee3a02b1764883a1f5922e0b52e9f17f8e098db. * Dependencies Update * unset memory and process watchdogs * add new output_format and failure_limit checks * inheriting dataclasses and added type hints to fn * added todo * fixed inheritance of dataclasses acc. 
to plan * refactor use of dict to use dataclasses(pending) * more refactoring use of dict to dataclasses - Also changed some type hints related to new refactoring * fixed screenshot directory issue - because of which some of the tests were failing * added try-except clause for unexpected errors * added tests to cover dataclasses * added some new and edited some old docs * refactor use of __dict__ to dataclass.to_dict() * Revert "refactor use of __dict__ to dataclass.to_dict()" This reverts commit a4f35513fa26d23a073c16af9fb332045826dcb2. * fixed some tests * refactor use of __dict__ in favor of dataclass.to_dict() method * removed some TODOS * fixed dataclases validation tests * Update docs/Configuration.md Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Update docs/Configuration.md Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Update docs/Configuration.md Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Update openwpm/config.py Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Update openwpm/config.py Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * Update openwpm/task_manager.py Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de> * minor fixed wrt polishing the PR * added new check and test for crawl configs Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de>
2020-12-02 12:10:45 +03:00
from openwpm.config import BrowserParams, ManagerParams
from openwpm.task_manager import TaskManager
2014-07-01 20:37:17 +04:00
# The list of sites that we wish to crawl
2020-02-28 19:39:39 +03:00
NUM_BROWSERS = 1
2020-05-08 02:27:52 +03:00
sites = [
2020-09-11 16:14:09 +03:00
    "http://www.example.com",
    "http://www.princeton.edu",
    "http://citp.princeton.edu/",
]
# Loads the default ManagerParams
# and NUM_BROWSERS copies of the default BrowserParams
manager_params = ManagerParams(
    num_browsers=NUM_BROWSERS
)  # num_browsers is necessary to let TaskManager know how many browsers to spawn
browser_params = [BrowserParams(display_mode="headless") for _ in range(NUM_BROWSERS)]
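The list comprehension above matters because each browser needs its own params object if per-browser settings are to stay independent. The sketch below illustrates this with a hypothetical stand-in dataclass, `StubBrowserParams`, which is not part of OpenWPM; its two fields mirror the attributes used in this demo, and its default values are placeholders rather than OpenWPM's actual defaults.

```python
from dataclasses import dataclass


# Hypothetical stand-in for openwpm.config.BrowserParams; only the two
# attributes touched in this demo are modelled, with placeholder defaults.
@dataclass
class StubBrowserParams:
    display_mode: str = "native"
    http_instrument: bool = False


NUM_STUB_BROWSERS = 3

# A comprehension constructs a separate object for every browser ...
independent = [
    StubBrowserParams(display_mode="headless") for _ in range(NUM_STUB_BROWSERS)
]
independent[0].http_instrument = True

# ... whereas list multiplication repeats a reference to one shared object.
shared = [StubBrowserParams(display_mode="headless")] * NUM_STUB_BROWSERS
shared[0].http_instrument = True

print([p.http_instrument for p in independent])  # [True, False, False]
print([p.http_instrument for p in shared])       # [True, True, True]
```

With the comprehension, changing one entry leaves the others untouched; with list multiplication, every entry is the same object, so a setting applied to one browser would silently apply to all of them.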
# Update browser configuration (use this for per-browser settings)
for i in range(NUM_BROWSERS):
2017-07-28 23:37:35 +03:00
    # Record HTTP Requests and Responses
    browser_params[i].http_instrument = True
    # Record cookie changes
    browser_params[i].cookie_instrument = True
    # Record Navigations
    browser_params[i].navigation_instrument = True
    # Record JS Web API calls
    browser_params[i].js_instrument = True
    # Record the callstack of all WebRequests made
    browser_params[i].callstack_instrument = True
    # Record DNS resolution
    browser_params[i].dns_instrument = True
# Update TaskManager configuration (use this for crawl-wide settings)
manager_params.data_directory = "~/Desktop/"
manager_params.log_directory = "~/Desktop/"
# memory_watchdog and process_watchdog are useful for large scale cloud crawls.
# Please refer to docs/Configuration.md#platform-configuration-options for more information
# manager_params.memory_watchdog = True
# manager_params.process_watchdog = True
# Instantiates the measurement platform
# Commands time out by default after 60 seconds
manager = TaskManager(manager_params, browser_params)
# Visits the sites
for site in sites:
    # Parallelize sites over the number of browsers set above.
    command_sequence = CommandSequence(
        site,
        reset=True,
        callback=lambda success, val=site: print("CommandSequence {} done".format(val)),
    )
    # Start by visiting the page
    command_sequence.append_command(GetCommand(url=site, sleep=3), timeout=60)
    # Have a look at custom_command.py to see how to implement your own command
    command_sequence.append_command(LinkCountingCommand())
    # Run commands across the three browsers (simple parallelization)
    manager.execute_command_sequence(command_sequence)
# Shuts down the browsers and waits for the data to finish logging
manager.close()
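
# Aside on the callback passed to CommandSequence in the loop above: writing
# `val=site` as a default argument binds the value `site` has at the moment the
# lambda is created. Without it, a lambda that is called later would look up
# `site` at call time and could report whichever site the loop reached last.
# The helper below is a minimal, standalone sketch of that Python behaviour.
# Its name and arguments are hypothetical; it is not part of OpenWPM and is
# never called by this demo.
def make_callbacks(names):
    """Return (late_bound, early_bound) lists of zero-argument callbacks."""
    # Each lambda closes over the loop variable itself, so every callback
    # sees the final value of `name` once the loop has finished.
    late_bound = [lambda: name for name in names]
    # The default argument is evaluated when the lambda is created, so every
    # callback keeps the value `name` had in its own iteration.
    early_bound = [lambda val=name: val for name in names]
    return late_bound, early_bound


# Example use (left commented out so the demo's behaviour is unchanged):
# late, early = make_callbacks(["a", "b"])
# [f() for f in late]   # every entry is "b"
# [f() for f in early]  # entries are "a" and "b"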