* Update README.md
Since the Dockerfile has a CMD instruction, we don't need to specify the command in `docker run`
* Updated Configuration.md to have proper instructions for LevelDB usage
* Update docs/Configuration.md
Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de>
* Update Configuration.md
Co-authored-by: Stefan Zabka <zabkaste@informatik.hu-berlin.de>
* Stop AssertionErrors crashing production crawls
Fixes #166
* Wrote tests for propagating exceptions
* Logs were too noisy
* Test_crawl should run like a real crawl
* Converted TSLint to ESLint
* Ran npm run fix
* Type refinement around JSInstrumentation config
* More typing
* Fixed types
* Reverted change to arrow function
Co-authored-by: Ayushsunny <ayush100anand@gmail.com>
* Test browser profile recovery in various scenarios
Add a test that implements the following test matrix and checks that the
profile behavior is as expected in each case:
* Crash types: normal operation, crash during page visit,
crash during launch, timeout during page visit
* Operational mode: stateful, stateless
* Seed profile: exists, doesn't exist
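The matrix above can be enumerated as a cross product. This is only a sketch: the label strings and the `build_matrix` and `case_id` helpers are hypothetical and are not taken from the actual test module.

```python
from itertools import product

# Hypothetical labels for the three dimensions listed above.
CRASH_TYPES = [
    "normal_operation",
    "crash_during_visit",
    "crash_during_launch",
    "timeout_during_visit",
]
MODES = ["stateful", "stateless"]
SEED_PROFILES = ["with_seed_tar", "without_seed_tar"]


def build_matrix():
    """Return every (crash type, mode, seed profile) combination."""
    return list(product(CRASH_TYPES, MODES, SEED_PROFILES))


def case_id(case):
    """Build a readable test id such as 'normal_operation-stateful-with_seed_tar'."""
    return "-".join(case)


if __name__ == "__main__":
    for case in build_matrix():
        print(case_id(case))
```

A list like this could be passed to a parameterized test so that each combination is reported as its own test case.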
* Move profile tests to a separate CI job
* Test saving the browser profile using a command
* Remove unused manager_params arg from load_profile
* Test that load_profile does not modify tar file
* Test crashing during Task Manager initialization
* Remove some redundant testcases
Adjust the parameterization of test_profile_recovery in order to remove
the `stateless-without_seed_tar` testcases, as they are not adding any
value.
* Preparations for v0.15.0 release
* removed warning about v0.14.0 and v0.14.1
* Comment formatting
* Fixed date in Changelog
As well as formatting and spelling
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
* Update openwpm/task_manager.py
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
* Updated VERSION file
* Getting GCSFS to work again
* Set black to the version
* Updated JS dependencies
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
* Save full Firefox profile
Save the whole Firefox profile directory instead of only saving a few of
its subcomponents. Remove an unused import of shutil from
profile_commands.py.
Additionally, remove the `extension_port.txt` file after reading the
port from it, to prevent reading stale port information when a browser
is restarted after a crash.
Finally, remove a part of the documentation that references the old way
of dumping the profile and update a leftover reference to the
`log_directory` config option.
Closes #62.
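The read-then-delete handling of `extension_port.txt` described above might look roughly like this. The function name `read_and_remove_port` and the assumption that the file holds a single integer are illustrative only, not the project's actual code.

```python
from pathlib import Path


def read_and_remove_port(profile_dir: Path) -> int:
    """Read the extension port and delete the file so it cannot go stale.

    Assumes (for illustration) that extension_port.txt contains one integer.
    """
    port_file = profile_dir / "extension_port.txt"
    port = int(port_file.read_text().strip())
    # Remove the file right away: a browser restarted after a crash must
    # write a fresh file instead of us re-reading the old port.
    port_file.unlink()
    return port
```

Deleting the file immediately after a successful read means a later read can only succeed if the restarted browser has written a new port.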
* Test saving full profile
Add a test that checks that attempting to save an incomplete profile
raises an error. Also, extend `test_saving` to check that a few basic
files and directories of the Firefox profile are present in the archived
profile.
* Combined log_directory and log_file to log_path
* Updated documentation
* Fixed tests
* Implemented test, need to change CSP
* Extension logging restored and tested
* Renamed extra to custom_params
* Reverting stackdump changes
* Re-enable test_profile_saved_when_launch_crashes
Update `test_profile_saved_when_launch_crashes` so that it does not
depend on the no longer supported proxy to make browser restarts fail.
Instead, set the `FIREFOX_BINARY` environment variable to point to a
wrong path.
Also, fix a bug in `kill_browser_manager()`, which would cause OpenWPM
to crash because of a `psutil.NoSuchProcess` exception and without
archiving the browser profile, whenever a browser restart failed to
launch geckodriver.
Finally, make `kill_process_and_children()` use the timeout set via its
arguments, which it previously ignored.
* Update docstring of dump_profile
Add a note for callers that they should make sure Firefox is closed, in
order to prevent ending up with a corrupted database in the archived
profile.
* Update test_browser_profile_coverage
Remove the buggy and outdated for loop that determined whether a url is
expected to be missing from the places.sqlite database of the browser
profile, as we have not observed any missing urls when running this
test.
* Fix documentation module index
Populate the module index by setting up Sphinx to automatically run
sphinx-apidoc for every build. Also, move readthedocs dependencies under
docs/ and make prune-environment.py automatically generate the
environment-rtd.yaml file whenever we run repin.sh.
* Fix black and mypy errors
We can now generate documentation in a variety of display formats, including HTML, using Sphinx.
With this new infrastructure we are also able to publish the documentation on readthedocs.io.
Co-authored-by: jhabarsingh <jhabarsinghbhati23@gmail.com>
Co-authored-by: Cyrus <cyruskarsan@gmail.com>
Co-authored-by: cyruskarsan <55566678+cyruskarsan@users.noreply.github.com>
Co-authored-by: Steven Englehardt <senglehardt@mozilla.com>
Co-authored-by: ankushduacodes <61025943+ankushduacodes@users.noreply.github.com>
Co-authored-by: Mollie Bakal <bakalm@umich.edu>
Co-authored-by: MollieBakal <molliebakal@gmail.com>
Co-authored-by: jhabarsingh <43932986+jhabarsingh@users.noreply.github.com>
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
* Removing localtest.me
As it has been highly unreliable when running
local tests (returning DnsNotFound errors)
* Fixing tests
* Switched to localhost
* Localtest.me to localhost
* Renamed Browser to BrowserManagerHandle
* Renamed TaskManager._issue_command to BrowserManagerHandle.execute_command_sequence
* Fixing stuff
* Apply suggestions from code review
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
* tm to task_manager
* Found and renamed only mention of in the docs
Co-authored-by: Georgia Kokkinou <geor5ko@gmail.com>
Introduced `cleaned_js_instrument_settings` in BrowserParamsInternal to hold the expanded config dict.
Propagating the `js_instrument_settings` through the extension as an object for as long as possible.
`webdriver.switch_to.alert`, unlike most other members of the `switch_to` API, is a property rather than a function.
Calling it led to `TypeError: 'Alert' object is not callable` whenever there was actually an Alert to switch to.
This PR fixes that behaviour.
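The failure mode can be reproduced without a browser. The classes below are minimal stand-ins written for this note, not Selenium's real implementation; they only mimic the property-versus-method distinction.

```python
class NoAlertPresent(Exception):
    """Stand-in for the error raised when no alert is open."""


class Alert:
    """Stand-in alert object; like the real one, it is not callable."""

    def __init__(self, text: str) -> None:
        self.text = text


class SwitchTo:
    """Stand-in for driver.switch_to with `alert` exposed as a property."""

    def __init__(self, pending_alert=None) -> None:
        self._pending_alert = pending_alert

    @property
    def alert(self) -> Alert:
        if self._pending_alert is None:
            raise NoAlertPresent()
        return self._pending_alert


def alert_text_buggy(switch_to: SwitchTo) -> str:
    # Bug: the property already returned an Alert, so the extra call
    # parentheses try to call that Alert object.
    return switch_to.alert().text


def alert_text_fixed(switch_to: SwitchTo) -> str:
    # Fix: access the property without calling it.
    return switch_to.alert.text
```

With no alert open, both variants fail inside the property itself, which is why the `TypeError` only surfaced when an alert was actually present.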
Make the `PatchedGeckoDriverService` class a subclass of
selenium.webdriver.firefox.service.Service instead of
selenium.webdriver.common.service.Service, so that we only have to keep
track of the changes in the `__init__()` method of the former class.
Use the public suffix + 1 instead of the public suffix when comparing
the domains in the crawl database with those in the profile history.
Also, update an incorrectly formed query to the crawl database.
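To illustrate why the public suffix alone is too coarse for this comparison, here is a sketch using a tiny hard-coded suffix set. `PUBLIC_SUFFIXES`, `public_suffix`, and `ps_plus_one` are hypothetical names for this example; a real implementation would consult the full Public Suffix List.

```python
# Tiny illustrative subset; the real Public Suffix List is far larger.
PUBLIC_SUFFIXES = {"com", "org", "uk", "co.uk"}


def public_suffix(host: str) -> str:
    """Return the longest known suffix of host, else its last label."""
    labels = host.lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in PUBLIC_SUFFIXES:
            return candidate
    return labels[-1]


def ps_plus_one(host: str) -> str:
    """Return the public suffix plus one additional label (PS+1)."""
    labels = host.lower().split(".")
    suffix_len = len(public_suffix(host).split("."))
    if len(labels) <= suffix_len:
        # The host is itself a public suffix; there is no extra label.
        return ".".join(labels)
    return ".".join(labels[-(suffix_len + 1):])
```

Under this sketch, `a.example.com` and `b.other.com` share the public suffix `com` but have different PS+1 values (`example.com` and `other.com`), so only the PS+1 comparison tells the two sites apart.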
Move the core implementation of profile dumping into a `dump_profile`
function, which can be used both internally when closing or restarting a
crashed browser and from the `execute()` method of `DumpProfileCommand`.
Also, make compression the default in `DumpProfileCommand`. Finally, do
not compress the tar archive of the crashed browser's profile when
restarting from a crash. We should avoid the extra compression/
decompression step as this is a short-lived tar file.
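A rough sketch of a shared dump helper with an optional compression flag follows. The signature and the `profile` archive prefix are assumptions made for illustration and do not reflect the actual `dump_profile` function.

```python
import tarfile
from pathlib import Path


def dump_profile(profile_dir: Path, tar_path: Path, compress: bool) -> None:
    """Archive profile_dir into tar_path, gzip-compressed if requested.

    Hypothetical signature: the real function may take other arguments.
    tar_path should live outside profile_dir.
    """
    mode = "w:gz" if compress else "w"
    with tarfile.open(str(tar_path), mode) as tar:
        # Store everything under a fixed top-level name in the archive.
        tar.add(str(profile_dir), arcname="profile")
```

A command intended for long-term storage would pass `compress=True`, while the crash-recovery path would pass `compress=False` to skip the compress/decompress round trip for a short-lived archive.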