Added type hints to development pointers

Also now autogenerating the ToC and removed trailing
whitespace
Stefan Zabka 2020-03-13 16:14:43 +01:00
Parent f74a660770
Commit 467fe164ef
1 changed file: 28 additions and 15 deletions


@ -1,6 +1,7 @@
OpenWPM
OpenWPM
[![Build Status](https://travis-ci.org/mozilla/OpenWPM.svg?branch=master)](https://travis-ci.org/mozilla/OpenWPM)
[![OpenWPM Matrix Channel](https://img.shields.io/matrix/OpenWPM:mozilla.org?label=Join%20us%20on%20matrix&server_fqdn=mozilla.modular.im)](https://matrix.to/#/!pFJihVSEWzcMCcOzSH:mozilla.org?via=mozilla.org)
[![OpenWPM Matrix Channel](https://img.shields.io/matrix/OpenWPM:mozilla.org?label=Join%20us%20on%20matrix&server_fqdn=mozilla.modular.im)](https://matrix.to/#/!pFJihVSEWzcMCcOzSH:mozilla.org?via=mozilla.org) <!-- omit in toc -->
=======
OpenWPM is a web privacy measurement framework which makes it easy to
@ -9,21 +10,25 @@ of websites. OpenWPM is built on top of Firefox, with automation provided
by Selenium. It includes several hooks for data collection. Check out
the instrumentation section below for more details.
Table of Contents
-----------------
Table of Contents <!-- omit in toc -->
------------------
* [Installation](#installation)
* [Quick Start](#quick-start)
* [Instrumentation and Data Access](#instrumentation-and-data-access)
* [Output Formats](#output-format)
* [Local Databases](#local-databases)
* [Parquet on Amazon S3 (Experimental)](#parquet-on-amazon-s3-experimental)
* [Output Format](#output-format)
* [Local Databases](#local-databases)
* [Parquet on Amazon S3 **Experimental**](#parquet-on-amazon-s3-experimental)
* [Browser and Platform Configuration](#browser-and-platform-configuration)
* [Browser Configuration Options](#platform-configuration-options)
* [Platform Configuration Options](#platform-configuration-options)
* [Browser Configuration Options](#browser-configuration-options)
* [Browser Profile Support](#browser-profile-support)
* [Stateful vs Stateless crawls](#stateful-vs-stateless-crawls)
* [Loading and saving a browser profile](#loading-and-saving-a-browser-profile)
* [Save a profile](#save-a-profile)
* [Load a profile](#load-a-profile)
* [Development pointers](#development-pointers)
* [Types Annotations in Python](#types-annotations-in-python)
* [Editing instrumentation](#editing-instrumentation)
* [Debugging the platform](#debugging-the-platform)
* [Managing requirements](#managing-requirements)
@ -129,9 +134,9 @@ available [below](#output-format).
LevelDB content database.
* NOTE: this instrumentation may lead to performance issues when a large
number of browsers are in use.
* Set `browser_params['save_content']` to a comma-separated list of
* Set `browser_params['save_content']` to a comma-separated list of
[resource_types](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/ResourceType)
to save only specific types of files, for instance
to save only specific types of files, for instance
`browser_params['save_content'] = "script"` to save only JavaScript
files. This will lessen the performance impact of this instrumentation
when a large number of browsers are used in parallel.
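
As a rough sketch of the `save_content` setting described above, the value can be built from a list of resource types. The helper `save_content_value` and the stand-alone `browser_params` dict are hypothetical stand-ins for illustration, not part of OpenWPM. The sketch also assumes the list is joined with commas and no spaces.

```python
from typing import Any, Dict, List


def save_content_value(resource_types: List[str]) -> str:
    """Join resource types into a comma-separated string (hypothetical helper)."""
    return ",".join(resource_types)


# Stand-in for one browser's parameter dictionary (assumption: a plain dict).
browser_params: Dict[str, Any] = {}

# Save only scripts and stylesheets instead of every response body.
browser_params["save_content"] = save_content_value(["script", "stylesheet"])
```

Restricting the list to the types you actually need keeps the content database smaller when many browsers run in parallel.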
@ -220,8 +225,8 @@ inline by sending a `create_table` message to the data aggregator.
#### Parquet on Amazon S3 **Experimental**
As an option, OpenWPM can save data directly to an Amazon S3 bucket as a
Parquet Dataset. This is currently experimental and hasn't been thoroughly
tested. Saving screenshots and page source to S3 is not currently supported; they
will still be stored in local databases and directories. To enable S3
tested. Saving screenshots and page source to S3 is not currently supported; they
will still be stored in local databases and directories. To enable S3
saving specify the following configuration parameters in `manager_params`:
* Output format: `manager_params['output_format'] = 's3'`
* S3 bucket name: `manager_params['s3_bucket'] = 'openwpm-test-crawl'`
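
The two parameters listed above could be set together as in the following minimal sketch. Here `manager_params` is a plain dictionary standing in for the real configuration object, which is an assumption made for illustration. The bucket name is the example value from the text.

```python
from typing import Any, Dict

# Stand-in for the manager configuration (assumption: a plain dict).
manager_params: Dict[str, Any] = {}

# Switch the output format from local databases to S3.
manager_params["output_format"] = "s3"

# Name of the bucket that will receive the Parquet dataset.
manager_params["s3_bucket"] = "openwpm-test-crawl"
```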
@ -411,12 +416,20 @@ Much of OpenWPM's instrumentation is included in a Firefox add-on SDK extension.
Thus, in order to add or change instrumentation you will need a few additional
dependencies, which can be installed with `install-dev.sh`.
### Types Annotations in Python
We as maintainers have decided to adopt Python 3 type annotations
for the Python part of this project in order to catch errors earlier, get better
code completion, and make larger changes down the line with more confidence.
Please add type annotations to all new code you contribute, as well as to
any existing code you plan to change fundamentally.
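
As an illustration of the style this asks for, here is a small annotated sketch. The function names are hypothetical and do not come from the OpenWPM code base. They only show parameter, return, and variable annotations using the standard `typing` module.

```python
from typing import Dict, List, Optional


def count_visits_per_site(site_urls: List[str]) -> Dict[str, int]:
    """Count how often each URL appears in a list of visited URLs."""
    counts: Dict[str, int] = {}
    for url in site_urls:
        counts[url] = counts.get(url, 0) + 1
    return counts


def most_visited(counts: Dict[str, int]) -> Optional[str]:
    """Return the URL with the highest count, or None for an empty input."""
    if not counts:
        return None
    return max(counts, key=lambda url: counts[url])
```

With annotations like these, a static type checker (mypy is one option, though the text does not name a tool) can flag a call such as `most_visited(["a", "b"])` before the code runs.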
### Editing instrumentation
The instrumentation extension is included in `/automation/Extension/firefox/`.
The instrumentation itself (used by the above extension) is included in
The instrumentation itself (used by the above extension) is included in
`/automation/Extension/webext-instrumentation/`.
Any edits within these directories will require the extension to be re-built to produce
Any edits within these directories will require the extension to be re-built to produce
a new `openwpm.xpi` with your updates. You can use `build-extension.sh` to do this.
### Debugging the platform
@ -429,7 +442,7 @@ continuing the crawl). We recommend using
This utility allows manual debugging of the extension instrumentation with or
without Selenium enabled, as well as makes it easy to launch a Selenium
instance (without any instrumentation)
* `build-extension.sh`
* `build-extension.sh`
* `python -m test.manual_test` builds the current extension directory
and launches a Firefox instance with it.
* `python -m test.manual_test --selenium` launches a Firefox Selenium instance