OpenWPM/CHANGELOG.md

363 строки
15 KiB
Markdown

# Changelog
## v0.18.0 - 2021-12-07
Updates OpenWPM to Firefox 95
## v0.17.0 - 2021-07-23
Updates OpenWPM to Firefox 90.0.2
- Moved Extension folder to top-level (#939)
- Replaced TSLint with ESLint (#940)
- Stopped AssertionErrors crashing production crawls (#945)
## v0.16.0 - 2021-06-10
Updates OpenWPM to Firefox 89
This release didn't introduce any new functionality.
Things that happened in the last release:
- Enabled extension tests (#929)
- Fixed deault extension config (#927)
- Expanded profile tests (#924)
## v0.15.0 - 2021-05-03
Updates OpenWPM to Firefox 88
This release reenables the support for stateful crawling
If you are unsure what this means please have a look at our documentation on
[ReadTheDocs](https://openwpm.readthedocs.io/en/latest/Configuration.html#stateful-vs-stateless-crawls)
Other things that happened since last release:
- Restored Docker build (#871) - All of our releases are now available on Dockerhub again
- OpenWPM now monitors speculative connections (#872)
- We introduced sphinx and publish our documentation to RTD (#863, #894 and #900)
- Combined log_directory and log_file to log_path (#911)
- Fixed a bug in the socket code of the WebExtension
Thanks @shashigharti for the excellent bug report! (#912)
- Fixed data loss issue reported by @bkrumnow (#902)
## v0.14.0 - 2021-03-12
Firefox 86.0.1
This release is defined by a major push towards a more typed and
easier to extend OpenWPM
Here are the highlights:
- Refactored BrowserParams and ManagerParams into Dataclasses (#807)
By [@Ankushduacodes](https://github.com/Ankushduacodes)
This PR replaced the manager_params and browser `dict`s with objects
allowing for property access and type annotations.
This change makes it a lot harder to use invalid configuration options.
- Turned Commands into Objects inheriting from a BaseCommand (#750)
This change allows users to easily write their own commands without having to
touch OpenWPM internals.
It restructured a lot of internal code to bundle the definition of a command
with its implementation
- Data Aggregator Rewrite (#753)
This PR rewrote the way we do data saving from the ground up to enable a more
modular and scalable design.
Users are now empowered to provide their own storage backend if need be and can
easily pick and choose between provided implementations.
Additional changes:
- #844 Switched to Github Actions
- #854 Fix behavior of failure_limit (by [@boolean5](https://github.com/boolean5))
Thanks to all the external contributors that worked on this release.
Besides the people mentioned above we also merged contributions from:
[@LordReigns](https://github.com/LordReigns)
- #803 Removed unneccessary List unpacking
- #804 Updated stored commands in test_webdriver_utils.py
[@FukurouMakoto](https://github.com/FukurouMakoto)
- #806 Module & Imports conformed to PEP8
[@Ankushduacodes](https://github.com/Ankushduacodes)
- #811 Changed imports to avoid errors
- #815 Fixed nodejs version typo
- #822 Removed references and leftovers from #807
- #841 Updated webdriver syntax
[@MollieBakal](https://github.com/MollieBakal)
- #817 Fixed `manual_test.py`
- #844 Pyvirtualdisplay update
## v0.13.0 - 2020-11-16
Firefox 83 release
There has been a lot happening in OpenWPM over the last three months.
Here are the highlights:
- Introduced `seed_profile` ([#735](https://github.com/openwpm/OpenWPM/issues/735))
As part of our long-standing effort to restore stateful crawling support we now allow
specifying a `seed_profile` that gets loaded for each fresh start of a browser.
More documentation can be found [here](docs/Configuration.md#load-a-profile).
- Added new WebExtension instrument `dns_instrument` ([#721](https://github.com/openwpm/OpenWPM/issues/721))
This instrument was contributed by [@turban1988](https://github.com/turban1988) .
It allows to log the resolution of DNS requests. Unfortunately it is currently
undocumented. Adding docs is tracked in [#758](https://github.com/openwpm/OpenWPM/issues/758).
- Moved `automation` to `openwpm` ([#793](https://github.com/openwpm/OpenWPM/issues/793))
This change might have the biggest impact on users as we changed the name of the
top-level package. We apologize for the inconvenience caused but felt it was
a good change overall as the new name is a lot more meaningful.
Internal changes:
- We replaced flake8 with black as our formatting tool [#740](https://github.com/openwpm/OpenWPM/issues/740)
- [@Metropass](https://github.com/Metropass) removed the built in extensions as they were horribly out of date [#754](https://github.com/openwpm/OpenWPM/issues/754)
- [@Ankushduacodes](https://github.com/Ankushduacodes) removed the `browser_settings` as they required a lot of code for very little benefit [#775](https://github.com/openwpm/OpenWPM/issues/775)
- [@Ankushduacodes](https://github.com/Ankushduacodes) made the `memory_watchdog` and `process_watchdog` part of the `manager_params` [#785](https://github.com/openwpm/OpenWPM/issues/785) [#787](https://github.com/openwpm/OpenWPM/issues/787)
Thanks to all the external contributors that worked on this release.
Besides the people mentioned above we also merged contributions from:
- [@jyothisjagan](https://github.com/jyothisjagan) [#769](https://github.com/openwpm/OpenWPM/issues/769)
- [@Prajwal7842](https://github.com/Prajwal7842) [#760](https://github.com/openwpm/OpenWPM/issues/760)
- [@7brokenmirrors](https://github.com/7brokenmirrors) [#776](https://github.com/openwpm/OpenWPM/issues/776)
- [@LordReigns](https://github.com/LordReigns) [#801](https://github.com/openwpm/OpenWPM/pull/801)
## v0.12.0 - 2020-08-26
Firefox 80.0.0 release
There have been no new features added in this release.
However there are two significant bugfixes worth highlighting:
- We hopefully fixed [a bug when hashing content](https://github.com/openwpm/OpenWPM/issues/711) where the same file could have multiple hashes
If you ran a big crawl and could repeat @birdsarah's analysis, we'd be grateful if you reported your results [here](https://github.com/openwpm/OpenWPM/issues/719)
- Fixed longstanding bug when [propagating exceptions from the BrowserManager to the TaskManager](https://github.com/openwpm/OpenWPM/issues/547) you should now be seeing
the exception that happened in the BrowserManager in your logs
__NOTE:__ Please be aware that this release contains a regression related to https://bugzilla.mozilla.org/show_bug.cgi?id=1656405 and https://bugzilla.mozilla.org/show_bug.cgi?id=1599160.
This means some requests with cached responses might not show up as requests or responses in your instrumentation. We assume this will be fixed in FF81.
## v0.11.0 - 2020-07-08
Firefox 78.0.1 scheduled release. This release contains some minor bug fixes
and one new feature: Arbitrary JS Instrumentation.
New features:
- Arbitrary JS Instrumentation allows users to specify, python side the set
APIs they would like to instrument in their crawl. The default remains the
set of fingerprinting apis, now called ``collection_fingerprinting``. This has
meant a number of API changes including the renaming of the browser_param
``js_instrument_modules`` to ``js_instrument_settings``. As
``js_instrument_modules`` was not actually configurable previously, we do not
anticipate too much disruption to users. Details of how to configure the
new ``js_instrument_settings`` are in the
[Instrumentation and Data Access section of the README](./README.md#instrumentation-and-data-access).
- [Issue 641](https://github.com/openwpm/OpenWPM/issues/641)
- [PR 642](https://github.com/openwpm/OpenWPM/pull/642)
Minor fixes:
- Asserting that unpickled exception is an exception [PR 705](https://github.com/openwpm/OpenWPM/pull/705)
- Minor fixes [PR 703](https://github.com/openwpm/OpenWPM/pull/703)
- Better crawling experience [PR 696](https://github.com/openwpm/OpenWPM/pull/696)
No OpenWPM release was made with Firefox 78.0.
## v0.10.0 - 2020-06-11
This release is a long overdue release of OpenWPM, and contains too many
changes to list here. Instead, we highlight the major architectural since the
previous release. The instrumentation has been completely rewritten as part of
this release to the new WebExtensions framework. Older versions of OpenWPM
should not be used.
Changes:
* Migrate instrumentation from the addon-sdk framework to WebExtensions.
* Migrate to unbranded builds of Firefox, and off of the ESR channel to the
Release channel
* Add support for MacOS development
* Add an S3Aggregator that saves data in Parquet format on S3
* Use conda for dependency management
* Disable stateful crawling due to intermittent loss of profiles and
geckodriver incompatibilities
* Refactor extension instumentation to live in a separate module
* Re-write logger and add support for logging to sentry
* Add a crawler.py crawl script that can be used for cloud deployments like
the type documented in https://github.com/openwpm/OpenWPM-crawler
* Add support for Firefox's native headless mode alongside XVFB
* Add Dockerfile and automatically deploy builds to dockerhub
* Add a Navigation instrument that records navigation events
* Drop support for Python 2
* Remove support for Flash
* Numerous stability and data saving improvements (particularly for cloud
crawls / the S3Aggregator)
* Numerous bugfixes + improved testing
## v0.9.0 - 2019-04-15
A checkpoint release for the final version of OpenWPM to support Firefox 52
and the addon-sdk framework. We recommend against using this release as Firefox
52ESR is no longer receiving security updates.
Changes:
* The ``automation`` library can now be used with Python 3.4 or later,
as well as Python 2.7.
* Bump to Firefox 52 ESR, Selenium 3.4.0+, and geckodriver 0.15.0.
* geckodriver is required for Selenium 3+. ``install.sh`` will download
and install it.
* geckodriver 0.16.0+ does not support Firefox 52 or lower, so we are
stuck with 0.15.0 (and any bugs it may have) until the next ESR release.
* These versions of geckodriver and Selenium require Firefox 48+.
* MITMProxy support has been removed. Use ``http_instrument`` instead.
* Bundled Firefox privacy extensions have been updated.
* AdBlock Plus support has been removed.
* uBlock Origin and Disconnect added.
* Ghostery has been updated.
* Extensions built using the WebExtensions API are now supported. Our
extension still uses the add-on sdk.
* Experimental support for saving Parquet files on S3
* Numerous bug fixes
## v0.8.0 - 2017-10-09
A long overdue version bump to checkpoint the final version to support
Selenium 2 + FF 45. Note we recommend against using the release as Firefox
45ESR is no longer receiving security patches.
Changes:
* Add extension-based HTTP instrumentation, including POST body processing
* Deprecate proxy-based HTTP instrumentation
* Save stacktrace of HTTP requests
* Prevent Selenium 2 from self identifying in the DOM
* Add support for blocking commands
* Improve exception handling in child processes
* Refactor of socket interface in extension
* Improvements to manual testing code
* Add a logging module to the extension, logs to central log file
* Instrument ``document.cookie``
* A number of improvements to the ``instrumentObject`` instrumentation
interface in extension
* Make ``install.sh`` scriptable
## v0.7.0 - 2016-11-15
Changes:
* Bugfixes to extension instrumentation where records would be dropped when
the extension was under heavy load and fail to re-enable until the browser
was restarted.
* Bugfix to extension / socket interface
* Add ``run_custom_function`` command
* Using alternative serialization/parallelization with ``dill`` and
``multiprocess``
* Better documentation
* Bugfixes to install script
* Add ``save_screenshot`` and ``dump_page_source`` commands
* Add Audio API instrumentation
* Bugfix to ``browse`` command
* Bugfix to extension instrumentation injection to avoid Security Errors
## v0.6.2 - 2016-04-08
Changes:
* Bugfix to browse command. Now supports sleeping after get.
## v0.6.1 - 2016-04-08
Critical:
* Bugfix in LevelDBAggregator preventing data loss
Changes:
* Bump to Firefox 45 & Selenium 2.53.0
* Update certificate stored
* Added sleep argument to ``get`` command
* Added install script for development dependencies
* Improved error handling in TaskManager and Proxy
* Version bumps and bugfixes in HTTPS Everywhere, Ghostery, and ABP
* Tests added!
* Numerous bugfixes and improvements in Javascript Instrumentation
## v0.6.0 - 2015-12-22
Changes:
* Cleanup of Firefox prefs to make browsers faster and reduce phoning home
* Use LevelDB for javascript file storage
* Improved HTTP Cookie Parsing
* Several bugfixes to extension instrumentation
* Improved profile handling during shutdown and crashes
* Improved handling of child Exceptions
* Inital platform tests
* Improvements to javascript instrumentation
## v0.5.1 - 2015-10-15
Changes:
* Save json serialized headers and fix cookie parsing
## v0.5.0 - 2015-10-14
Changes:
* Added support for saving all javascript files de-duplicated and compressed
* Created two configuration dictionaries. One for individual browsers and
another for the entire infrastructure
* Support for using OpenWPM as a submodule
* Firefox (v39) and Selenium (v2.47.1)
* Added support for launching Ghostery, HTTPS Everywhere, and AdBlock Plus
* Removed Random Extension Support
* Bugfix for broken profile saving.
* Bugfix for profile clearing when memory limits are exceeded
* Numerous stability fixes
* Full Logging support in all commands
## v0.4.0
Changes:
* Significant stability improvements for long crawls
* Support for logging with logging module
* A large number of bugfixes related to process handling
* Prevention of a large number of stray tmp files/folders during long crawls
* Process/memory watchdog to handle orphaned processes and keep memory usage
reasonable
* Numerous bugfixes for extension
* Failure thresholds to prevent infinite loops of browser respawns or
command execution attempts (instead, Errors and raised)
* Script to install dependencies
* API changes to command timeouts
* Move SocketInterface from pickle to json serialization
Known Issues:
* Encoding issues cause a very small percentage of data to be dropped by the
extension
* Malformed queries are occassionally sent to the DataAggregator and are
dropped. The cause is unknown.
* Forking can be done in a more memory efficient way
## Older releases
* 0.3.1 - Fixes #5
* 0.3.0 - Experimental merge of Fourthparty + framework to allow additional
javascript instrumentation.
* 0.2.3 - Timeout logging
* 0.2.2 - Browse command + better scrolling + bugfixes
* 0.2.1 - Support for MITMProxy v0.11 + minor bugfixes
* 0.2.0 - Complete re-write of HTTP Cookie parsing
* 0.1.1 - Simplfied load of default settings, including wiki demo
* 0.1.0 - Initial Public Release