mirror of https://github.com/openwpm/OpenWPM.git
Added type hints to development pointers
Also now autogenerating the ToC and removed trailing whitespace
Parent: f74a660770
Commit: 467fe164ef
README.md (43 lines changed)
@@ -1,6 +1,7 @@
-OpenWPM
+
+OpenWPM
 [![Build Status](https://travis-ci.org/mozilla/OpenWPM.svg?branch=master)](https://travis-ci.org/mozilla/OpenWPM)
-[![OpenWPM Matrix Channel](https://img.shields.io/matrix/OpenWPM:mozilla.org?label=Join%20us%20on%20matrix&server_fqdn=mozilla.modular.im)](https://matrix.to/#/!pFJihVSEWzcMCcOzSH:mozilla.org?via=mozilla.org)
+[![OpenWPM Matrix Channel](https://img.shields.io/matrix/OpenWPM:mozilla.org?label=Join%20us%20on%20matrix&server_fqdn=mozilla.modular.im)](https://matrix.to/#/!pFJihVSEWzcMCcOzSH:mozilla.org?via=mozilla.org) <!-- omit in toc -->
 =======
 
 OpenWPM is a web privacy measurement framework which makes it easy to
@@ -9,21 +10,25 @@ of websites. OpenWPM is built on top of Firefox, with automation provided
 by Selenium. It includes several hooks for data collection. Check out
 the instrumentation section below for more details.
 
-Table of Contents
------------------
+Table of Contents <!-- omit in toc -->
+------------------
 
 * [Installation](#installation)
 * [Quick Start](#quick-start)
 * [Instrumentation and Data Access](#instrumentation-and-data-access)
-* [Output Formats](#output-format)
-* [Local Databases](#local-databases)
-* [Parquet on Amazon S3 (Experimental)](#parquet-on-amazon-s3-experimental)
+* [Output Format](#output-format)
+* [Local Databases](#local-databases)
+* [Parquet on Amazon S3 **Experimental**](#parquet-on-amazon-s3-experimental)
 * [Browser and Platform Configuration](#browser-and-platform-configuration)
-* [Browser Configuration Options](#platform-configuration-options)
+* [Platform Configuration Options](#platform-configuration-options)
+* [Browser Configuration Options](#browser-configuration-options)
 * [Browser Profile Support](#browser-profile-support)
 * [Stateful vs Stateless crawls](#stateful-vs-stateless-crawls)
 * [Loading and saving a browser profile](#loading-and-saving-a-browser-profile)
+* [Save a profile](#save-a-profile)
+* [Load a profile](#load-a-profile)
 * [Development pointers](#development-pointers)
+* [Types Annotations in Python](#types-annotations-in-python)
 * [Editing instrumentation](#editing-instrumentation)
 * [Debugging the platform](#debugging-the-platform)
 * [Managing requirements](#managing-requirements)
@@ -129,9 +134,9 @@ available [below](#output-format).
 LevelDB content database.
 * NOTE: this instrumentation may lead to performance issues when a large
 number of browsers are in use.
-* Set `browser_params['save_content']` to a comma-separated list of
+* Set `browser_params['save_content']` to a comma-separated list of
 [resource_types](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/ResourceType)
-to save only specific types of files, for instance
+to save only specific types of files, for instance
 `browser_params['save_content'] = "script"` to save only Javascript
 files. This will lessen the performance impact of this instrumentation
 when a large number of browsers are used in parallel.
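Review note: the `save_content` setting touched by this hunk takes a comma-separated string of resource types. The sketch below shows one way to build that string. It assumes `browser_params` is a plain dict, as the README snippet writes it. The helper `build_save_content` and the `KNOWN_TYPES` subset are hypothetical and not part of OpenWPM.

```python
# Hypothetical helper (not part of OpenWPM): builds the comma-separated
# string that the README describes for `browser_params['save_content']`.

# Assumed subset of the webRequest.ResourceType values; the full list is
# in the MDN page linked from the README.
KNOWN_TYPES = {"script", "stylesheet", "image", "font", "main_frame", "sub_frame"}


def build_save_content(resource_types):
    """Join resource types with commas, rejecting names outside KNOWN_TYPES."""
    unknown = [t for t in resource_types if t not in KNOWN_TYPES]
    if unknown:
        raise ValueError(f"unknown resource types: {unknown}")
    return ",".join(resource_types)


# Stand-in for the parameter dict of a single browser.
browser_params = {}
browser_params["save_content"] = build_save_content(["script", "stylesheet"])
print(browser_params["save_content"])
```

Restricting the list to the types a crawl actually needs is what reduces the performance cost the README warns about.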
@@ -220,8 +225,8 @@ inline by sending a `create_table` message to the data aggregator.
 #### Parquet on Amazon S3 **Experimental**
 As an option, OpenWPM can save data directly to an Amazon S3 bucket as a
 Parquet Dataset. This is currently experimental and hasn't been thoroughly
-tested. Screenshots, and page source saving is not currently supported and
-will still be stored in local databases and directories. To enable S3
+tested. Screenshots, and page source saving is not currently supported and
+will still be stored in local databases and directories. To enable S3
 saving specify the following configuration parameters in `manager_params`:
 * Output format: `manager_params['output_format'] = 's3'`
 * S3 bucket name: `manager_params['s3_bucket'] = 'openwpm-test-crawl'`
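Review note: the two `manager_params` keys shown in this hunk can be set together, as sketched below. This treats `manager_params` as a plain dict and uses only the two keys visible here. The helper `s3_output_params` is hypothetical, and any further S3 settings the project may require are outside this excerpt.

```python
from typing import Any, Dict


def s3_output_params(bucket: str) -> Dict[str, Any]:
    """Return the two S3-related settings named in the README excerpt."""
    if not bucket:
        raise ValueError("an S3 bucket name is required")
    return {"output_format": "s3", "s3_bucket": bucket}


# Stand-in for the manager parameter dict.
manager_params: Dict[str, Any] = {}
manager_params.update(s3_output_params("openwpm-test-crawl"))
print(manager_params)
```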
@@ -411,12 +416,20 @@ Much of OpenWPM's instrumentation is included in a Firefox add-on SDK extension.
 Thus, in order to add or change instrumentation you will need a few additional
 dependencies, which can be installed with `install-dev.sh`.
 
+### Types Annotations in Python
+
+We as maintainers have decided it would be helpful to have Python3 type annotations
+for the python part of this project to catch errors earlier, get better
+code completion and allow bigger changes down the line with more confidence.
+As such you should strive to add type annotations to all new code you add to
+the project as well as the one you plan to change fundamentally.
+
 ### Editing instrumentation
 
 The instrumentation extension is included in `/automation/Extension/firefox/`.
-The instrumentation itself (used by the above extension) is included in
+The instrumentation itself (used by the above extension) is included in
 `/automation/Extension/webext-instrumentation/`.
-Any edits within these directories will require the extension to be re-built to produce
+Any edits within these directories will require the extension to be re-built to produce
 a new `openwpm.xpi` with your updates. You can use `build_extension.sh` to do this.
 
 ### Debugging the platform
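Review note: a small illustration of the type-annotation guidance added in this hunk. The function `split_resource_types` is hypothetical and does not come from the OpenWPM code base. It only shows the annotation style the new section asks for. A static checker such as mypy (an assumption, since the commit does not name a tool) could then flag a call that passes a list instead of a string.

```python
from typing import List, Optional


def split_resource_types(save_content: Optional[str]) -> List[str]:
    """Split a comma-separated string into a list of resource type names."""
    if not save_content:
        # None or an empty string means no types were requested.
        return []
    return [part.strip() for part in save_content.split(",") if part.strip()]


print(split_resource_types("script, stylesheet"))  # ['script', 'stylesheet']
```

With the annotations in place, an editor can also offer string methods on `save_content` and list methods on the return value, which is the code-completion benefit the section mentions.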
@@ -429,7 +442,7 @@ continuing the crawl). We recommend using
 This utility allows manual debugging of the extension instrumentation with or
 without Selenium enabled, as well as makes it easy to launch a Selenium
 instance (without any instrumentation)
-* `build-extension.sh`
+* `build-extension.sh`
 * `python -m test.manual_test` builds the current extension directory
 and launches a Firefox instance with it.
 * `python -m test.manual_test --selenium` launches a Firefox Selenium instance