mirror of https://github.com/openwpm/OpenWPM.git
Merge branch 'master' into turn_browser_and_manager_params_into_dataclasses
Commit
aff5a384e7
|
@ -8,4 +8,3 @@ repos:
|
|||
hooks:
|
||||
- id: black
|
||||
language_version: python3 # Should be a command that runs python3.6+
|
||||
|
||||
|
|
|
@ -126,7 +126,7 @@ Troubleshooting
|
|||
2. In older versions of firefox (pre 74) the setting to enable extensions was called
|
||||
`extensions.legacy.enabled`. If you need to work with earlier firefox, update the
|
||||
setting name `extensions.experiments.enabled` in
|
||||
`openwpm/DeployBrowsers/configure_firefox.py`.
|
||||
`openwpm/deploy_browsers/configure_firefox.py`.
|
||||
|
||||
3. Make sure your conda environment is activated (`conda activate openwpm`). You can see
|
||||
your environments and the active one by running `conda env list`; the active environment
|
||||
|
|
8
demo.py
|
@ -1,4 +1,4 @@
|
|||
from openwpm import CommandSequence, TaskManager
|
||||
from openwpm import command_sequence, task_manager
|
||||
|
||||
# The list of sites that we wish to crawl
|
||||
NUM_BROWSERS = 1
|
||||
|
@ -10,7 +10,7 @@ sites = [
|
|||
|
||||
# Loads the default manager params
|
||||
# and NUM_BROWSERS copies of the default browser params
|
||||
manager_params, browser_params = TaskManager.load_default_params(NUM_BROWSERS)
|
||||
manager_params, browser_params = task_manager.load_default_params(NUM_BROWSERS)
|
||||
|
||||
# Update browser configuration (use this for per-browser settings)
|
||||
for i in range(NUM_BROWSERS):
|
||||
|
@ -39,13 +39,13 @@ manager_params["process_watchdog"] = True
|
|||
|
||||
# Instantiates the measurement platform
|
||||
# Commands time out by default after 60 seconds
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
# Visits the sites
|
||||
for site in sites:
|
||||
|
||||
# Parallelize sites over all browsers set above.
|
||||
command_sequence = CommandSequence.CommandSequence(
|
||||
command_sequence = command_sequence.CommandSequence(
|
||||
site,
|
||||
reset=True,
|
||||
callback=lambda success, val=site: print("CommandSequence {} done".format(val)),
|
||||
|
|
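One thing to watch in the updated `demo.py` hunk above: inside the `for site in sites:` loop, the new `CommandSequence` instance is assigned to a variable named `command_sequence`, which is also the name of the imported module. The sketch below shows what that rebinding appears to do on the second iteration. It uses a stand-in module built with `types.ModuleType` instead of the real `openwpm` package; `make_stand_in_module`, `crawl_shadowing` and `crawl_distinct_name` are hypothetical names introduced only for this illustration.

```python
import types


def make_stand_in_module():
    """Build a fake `command_sequence` module (illustration only, not OpenWPM)."""
    module = types.ModuleType("command_sequence")

    class CommandSequence:
        def __init__(self, url):
            self.url = url

    module.CommandSequence = CommandSequence
    return module


def crawl_shadowing(sites):
    """Same shape as the hunk: the loop rebinds the module's name."""
    command_sequence = make_stand_in_module()  # plays the imported module
    built = []
    for site in sites:
        # After the first pass, `command_sequence` is an instance,
        # so the attribute lookup below fails on the second pass.
        command_sequence = command_sequence.CommandSequence(site)
        built.append(command_sequence)
    return built


def crawl_distinct_name(sites):
    """Variant that keeps the module name intact by using a different variable."""
    command_sequence = make_stand_in_module()
    built = []
    for site in sites:
        sequence = command_sequence.CommandSequence(site)
        built.append(sequence)
    return built
```

With a single site both variants behave the same; with two or more sites the shadowing variant raises `AttributeError`, so a distinct variable name (or importing the class directly) is the safer choice.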
|
@ -70,7 +70,7 @@ with https://github.com/mozilla/OpenWPM/issues/743.
|
|||
|
||||
Contained in `openwpm/BrowserManager.py`, Browser Managers provide a wrapper around the drivers used to automate full browser instances. In particular, we opted to use [Selenium](http://docs.seleniumhq.org/) to drive full browser instances as bot detection frameworks can more easily detect lightweight alternatives such as PhantomJS.
|
||||
|
||||
Browser Managers receive commands from the Task Manager, which they then pass to the command executor (located in `openwpm/Commands/command_executor.py`), which receives a command object and converts it into web driver actions. Browser Managers also receive browser parameters which they use to instantiate the Selenium web driver using one of the browser initialization functions contained in `openwpm/DeployBrowsers`.
|
||||
Browser Managers receive commands from the Task Manager, which they then pass to the command executor (located in `openwpm/commands/command_executor.py`), which receives a command object and converts it into web driver actions. Browser Managers also receive browser parameters which they use to instantiate the Selenium web driver using one of the browser initialization functions contained in `openwpm/deploy_browsers`.
|
||||
|
||||
The Browser class, contained in the same file, is the Task Manager's wrapper around Browser Managers, which allow it to cleanly kill and restart Browser Managers as necessary.
|
||||
|
||||
|
|
|
@ -20,7 +20,7 @@ Suppose we want to add a top-level command to cause the browser to jiggle the mo
|
|||
|
||||
To add a new command you need to modify the following four files:
|
||||
|
||||
1. Define all required parameters in a type in `openwpm/Commands/Types.py`
|
||||
1. Define all required parameters in a type in `openwpm/commands/types.py`
|
||||
In our case this looks like this:
|
||||
```python
|
||||
class JiggleCommand(BaseCommand):
|
||||
|
@ -31,9 +31,9 @@ To add a new command you need to modify the following four files:
|
|||
return "JiggleCommand({})".format(self.num_jiggles)
|
||||
```
|
||||
|
||||
2. Define the behaviour of our new command in `*_commands.py` in `openwpm/Commands/`,
|
||||
2. Define the behaviour of our new command in `*_commands.py` in `openwpm/commands/`,
|
||||
e.g. `browser_commands.py`.
|
||||
Feel free to add a new module within `openwpm/Commands/` for your own custom commands
|
||||
Feel free to add a new module within `openwpm/commands/` for your own custom commands
|
||||
In our case this looks like this:
|
||||
```python
|
||||
from selenium.webdriver.common.action_chains import ActionChains
|
||||
|
@ -48,7 +48,7 @@ To add a new command you need to modify the following four files:
|
|||
```
|
||||
|
||||
3. Make our function be called when the command_sequence reaches our Command, by adding it to the
|
||||
`execute_command` function in `openwpm/Commands/command_executer.py`
|
||||
`execute_command` function in `openwpm/commands/command_executer.py`
|
||||
In our case this looks like this:
|
||||
```python
|
||||
elif type(command) is JiggleCommand:
|
||||
|
@ -117,4 +117,3 @@ print list(fp_sites)
|
|||
````
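The documentation hunks above describe adding a command in three steps: define a command type, define its behaviour, and dispatch to it from `execute_command`. The sketch below strings those steps together in one file so the flow can be run without a browser. Everything except the names `BaseCommand`, `JiggleCommand`, `num_jiggles` and `execute_command` is an assumption: `RecordingDriver`, `jiggle_mouse` and the 10-pixel offsets are hypothetical stand-ins, and placing the format string in `__repr__` is a guess at where the documented `return` line lives.

```python
class BaseCommand:
    """Stand-in for the base class in openwpm/commands/types.py."""


class JiggleCommand(BaseCommand):
    """Step 1: a type that carries the command's parameters."""

    def __init__(self, num_jiggles):
        self.num_jiggles = num_jiggles

    def __repr__(self):
        return "JiggleCommand({})".format(self.num_jiggles)


class RecordingDriver:
    """Fake driver that records mouse moves instead of steering a browser."""

    def __init__(self):
        self.moves = []

    def move_by_offset(self, dx, dy):
        self.moves.append((dx, dy))


def jiggle_mouse(driver, num_jiggles):
    """Step 2: the behaviour, alternating right and left by 10 pixels."""
    for i in range(num_jiggles):
        dx = 10 if i % 2 == 0 else -10
        driver.move_by_offset(dx, 0)


def execute_command(command, driver):
    """Step 3: dispatch on the command type, as the docs' `elif` branch does."""
    if type(command) is JiggleCommand:
        jiggle_mouse(driver, command.num_jiggles)
    else:
        raise ValueError("Unknown command: {!r}".format(command))
```

Calling `execute_command(JiggleCommand(3), RecordingDriver())` records three moves on the fake driver; in the real project the behaviour would use Selenium's `ActionChains` instead.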
|
||||
|
||||
The variety of data stored in OpenWPM databases (with all instrumentation enabled) allows the above script to easily be expanded into a larger study. For instance, one step would be to see which parties are the recipients of the email address. Do these recipients later place cookies containing the email? Besides the site on which the original email leak was made, on which other first parties do these recipients appear as a third party? All of these questions are answerable through OpenWPM database instances.
|
||||
|
||||
|
|
|
@ -18,7 +18,7 @@ from botocore.client import Config
|
|||
from botocore.exceptions import ClientError, EndpointConnectionError
|
||||
from pyarrow.filesystem import S3FSWrapper # noqa
|
||||
|
||||
from .BaseAggregator import (
|
||||
from .base_aggregator import (
|
||||
RECORD_TYPE_CONTENT,
|
||||
RECORD_TYPE_CREATE,
|
||||
RECORD_TYPE_SPECIAL,
|
|
@ -7,7 +7,7 @@ from typing import Any, Dict, List, Optional, Tuple
|
|||
|
||||
from multiprocess import Queue
|
||||
|
||||
from ..SocketInterface import serversocket
|
||||
from ..socket_interface import ServerSocket
|
||||
from ..utilities.multiprocess_utils import Process
|
||||
|
||||
RECORD_TYPE_CONTENT = "page_content"
|
||||
|
@ -61,7 +61,7 @@ class BaseListener:
|
|||
self.record_queue: Queue = None # Initialized on `startup`
|
||||
self.logger = logging.getLogger("openwpm")
|
||||
self.curent_visit_ids: List[int] = list() # All visit_ids in flight
|
||||
self.sock: Optional[serversocket] = None
|
||||
self.sock: Optional[ServerSocket] = None
|
||||
|
||||
@abc.abstractmethod
|
||||
def process_record(self, record):
|
||||
|
@ -98,7 +98,7 @@ class BaseListener:
|
|||
"""Run listener startup tasks
|
||||
|
||||
Note: Child classes should call this method"""
|
||||
self.sock = serversocket(name=type(self).__name__)
|
||||
self.sock = ServerSocket(name=type(self).__name__)
|
||||
self.status_queue.put(self.sock.sock.getsockname())
|
||||
self.sock.start_accepting()
|
||||
self.record_queue = self.sock.queue
|
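The `startup` hunk above creates the listener's socket and puts its address on `status_queue`, presumably so that another process can learn where to connect. A minimal sketch of that handshake using only the standard library follows. `start_listener` and the plain `queue.Queue` are stand-ins chosen for illustration; they are not OpenWPM's `ServerSocket` or its multiprocess queue.

```python
import queue
import socket


def start_listener(status_queue):
    """Bind to a free local port and publish the address on the queue."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    status_queue.put(server.getsockname())
    return server


if __name__ == "__main__":
    status_queue = queue.Queue()
    server = start_listener(status_queue)
    host, port = status_queue.get()  # the "parent" side reads the address
    client = socket.create_connection((host, port))
    client.close()
    server.close()
```

Binding to port 0 avoids hard-coding a port, which is why the address has to be passed back through the queue at all.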
|
@ -8,7 +8,7 @@ from typing import Any, Dict, Tuple, Union
|
|||
|
||||
import plyvel
|
||||
|
||||
from .BaseAggregator import (
|
||||
from .base_aggregator import (
|
||||
RECORD_TYPE_CONTENT,
|
||||
RECORD_TYPE_CREATE,
|
||||
RECORD_TYPE_SPECIAL,
|
|
@ -11919,10 +11919,10 @@
|
|||
"from": "github:conventional-changelog/standard-version#master",
|
||||
"requires": {
|
||||
"chalk": "^2.4.2",
|
||||
"conventional-changelog": "3.1.23",
|
||||
"conventional-changelog": "3.1.24",
|
||||
"conventional-changelog-config-spec": "2.1.0",
|
||||
"conventional-changelog-conventionalcommits": "4.4.0",
|
||||
"conventional-recommended-bump": "6.0.10",
|
||||
"conventional-recommended-bump": "6.0.11",
|
||||
"detect-indent": "^6.0.0",
|
||||
"detect-newline": "^3.1.0",
|
||||
"dotgitignore": "^2.1.0",
|
||||
|
|
|
@ -1345,25 +1345,6 @@
|
|||
"xdg-basedir": "^4.0.0"
|
||||
}
|
||||
},
|
||||
"conventional-changelog": {
|
||||
"version": "3.1.23",
|
||||
"resolved": "https://registry.npmjs.org/conventional-changelog/-/conventional-changelog-3.1.23.tgz",
|
||||
"integrity": "sha512-sScUu2NHusjRC1dPc5p8/b3kT78OYr95/Bx7Vl8CPB8tF2mG1xei5iylDTRjONV5hTlzt+Cn/tBWrKdd299b7A==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"conventional-changelog-angular": "^5.0.11",
|
||||
"conventional-changelog-atom": "^2.0.7",
|
||||
"conventional-changelog-codemirror": "^2.0.7",
|
||||
"conventional-changelog-conventionalcommits": "^4.4.0",
|
||||
"conventional-changelog-core": "^4.2.0",
|
||||
"conventional-changelog-ember": "^2.0.8",
|
||||
"conventional-changelog-eslint": "^3.0.8",
|
||||
"conventional-changelog-express": "^2.0.5",
|
||||
"conventional-changelog-jquery": "^3.0.10",
|
||||
"conventional-changelog-jshint": "^2.0.8",
|
||||
"conventional-changelog-preset-loader": "^2.3.4"
|
||||
}
|
||||
},
|
||||
"conventional-changelog-angular": {
|
||||
"version": "5.0.12",
|
||||
"resolved": "https://registry.npmjs.org/conventional-changelog-angular/-/conventional-changelog-angular-5.0.12.tgz",
|
||||
|
@ -1834,59 +1815,6 @@
|
|||
}
|
||||
}
|
||||
},
|
||||
"conventional-recommended-bump": {
|
||||
"version": "6.0.10",
|
||||
"resolved": "https://registry.npmjs.org/conventional-recommended-bump/-/conventional-recommended-bump-6.0.10.tgz",
|
||||
"integrity": "sha512-2ibrqAFMN3ZA369JgVoSbajdD/BHN6zjY7DZFKTHzyzuQejDUCjQ85S5KHxCRxNwsbDJhTPD5hOKcis/jQhRgg==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"concat-stream": "^2.0.0",
|
||||
"conventional-changelog-preset-loader": "^2.3.4",
|
||||
"conventional-commits-filter": "^2.0.6",
|
||||
"conventional-commits-parser": "^3.1.0",
|
||||
"git-raw-commits": "2.0.0",
|
||||
"git-semver-tags": "^4.1.0",
|
||||
"meow": "^7.0.0",
|
||||
"q": "^1.5.1"
|
||||
},
|
||||
"dependencies": {
|
||||
"meow": {
|
||||
"version": "7.1.1",
|
||||
"resolved": "https://registry.npmjs.org/meow/-/meow-7.1.1.tgz",
|
||||
"integrity": "sha512-GWHvA5QOcS412WCo8vwKDlTelGLsCGBVevQB5Kva961rmNfun0PCbv5+xta2kUMFJyR8/oWnn7ddeKdosbAPbA==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"@types/minimist": "^1.2.0",
|
||||
"camelcase-keys": "^6.2.2",
|
||||
"decamelize-keys": "^1.1.0",
|
||||
"hard-rejection": "^2.1.0",
|
||||
"minimist-options": "4.1.0",
|
||||
"normalize-package-data": "^2.5.0",
|
||||
"read-pkg-up": "^7.0.1",
|
||||
"redent": "^3.0.0",
|
||||
"trim-newlines": "^3.0.0",
|
||||
"type-fest": "^0.13.1",
|
||||
"yargs-parser": "^18.1.3"
|
||||
}
|
||||
},
|
||||
"type-fest": {
|
||||
"version": "0.13.1",
|
||||
"resolved": "https://registry.npmjs.org/type-fest/-/type-fest-0.13.1.tgz",
|
||||
"integrity": "sha512-34R7HTnG0XIJcBSn5XhDd7nNFPRcXYRZrBB2O2jdKqYODldSzBAqzsWoZYYvduky73toYS/ESqxPvkDf/F0XMg==",
|
||||
"dev": true
|
||||
},
|
||||
"yargs-parser": {
|
||||
"version": "18.1.3",
|
||||
"resolved": "https://registry.npmjs.org/yargs-parser/-/yargs-parser-18.1.3.tgz",
|
||||
"integrity": "sha512-o50j0JeToy/4K6OZcaQmW6lyXXKhq7csREXcDwk2omFPJEwUNOVtJKvmDr9EI1fAJZUyZcRF7kxGBWmRXudrCQ==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"camelcase": "^5.0.0",
|
||||
"decamelize": "^1.2.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"convert-source-map": {
|
||||
"version": "1.7.0",
|
||||
"resolved": "https://registry.npmjs.org/convert-source-map/-/convert-source-map-1.7.0.tgz",
|
||||
|
@ -7336,10 +7264,10 @@
|
|||
"dev": true,
|
||||
"requires": {
|
||||
"chalk": "^2.4.2",
|
||||
"conventional-changelog": "3.1.23",
|
||||
"conventional-changelog": "3.1.24",
|
||||
"conventional-changelog-config-spec": "2.1.0",
|
||||
"conventional-changelog-conventionalcommits": "4.4.0",
|
||||
"conventional-recommended-bump": "6.0.10",
|
||||
"conventional-recommended-bump": "6.0.11",
|
||||
"detect-indent": "^6.0.0",
|
||||
"detect-newline": "^3.1.0",
|
||||
"dotgitignore": "^2.1.0",
|
||||
|
@ -7398,6 +7326,54 @@
|
|||
"integrity": "sha1-p9BVi9icQveV3UIyj3QIMcpTvCU=",
|
||||
"dev": true
|
||||
},
|
||||
"conventional-changelog": {
|
||||
"version": "3.1.24",
|
||||
"resolved": "https://registry.npmjs.org/conventional-changelog/-/conventional-changelog-3.1.24.tgz",
|
||||
"integrity": "sha512-ed6k8PO00UVvhExYohroVPXcOJ/K1N0/drJHx/faTH37OIZthlecuLIRX/T6uOp682CAoVoFpu+sSEaeuH6Asg==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"conventional-changelog-angular": "^5.0.12",
|
||||
"conventional-changelog-atom": "^2.0.8",
|
||||
"conventional-changelog-codemirror": "^2.0.8",
|
||||
"conventional-changelog-conventionalcommits": "^4.5.0",
|
||||
"conventional-changelog-core": "^4.2.1",
|
||||
"conventional-changelog-ember": "^2.0.9",
|
||||
"conventional-changelog-eslint": "^3.0.9",
|
||||
"conventional-changelog-express": "^2.0.6",
|
||||
"conventional-changelog-jquery": "^3.0.11",
|
||||
"conventional-changelog-jshint": "^2.0.9",
|
||||
"conventional-changelog-preset-loader": "^2.3.4"
|
||||
},
|
||||
"dependencies": {
|
||||
"conventional-changelog-conventionalcommits": {
|
||||
"version": "4.5.0",
|
||||
"resolved": "https://registry.npmjs.org/conventional-changelog-conventionalcommits/-/conventional-changelog-conventionalcommits-4.5.0.tgz",
|
||||
"integrity": "sha512-buge9xDvjjOxJlyxUnar/+6i/aVEVGA7EEh4OafBCXPlLUQPGbRUBhBUveWRxzvR8TEjhKEP4BdepnpG2FSZXw==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"compare-func": "^2.0.0",
|
||||
"lodash": "^4.17.15",
|
||||
"q": "^1.5.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"conventional-recommended-bump": {
|
||||
"version": "6.0.11",
|
||||
"resolved": "https://registry.npmjs.org/conventional-recommended-bump/-/conventional-recommended-bump-6.0.11.tgz",
|
||||
"integrity": "sha512-FciYBMwzwwBZ1K4NS8c57rsOfSc51e1V6UVSNIosrjH+A6xXkyiA4ELwoWyRKdMhJ+m3O6ru9ZJ7F2QFjjYJdQ==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"concat-stream": "^2.0.0",
|
||||
"conventional-changelog-preset-loader": "^2.3.4",
|
||||
"conventional-commits-filter": "^2.0.7",
|
||||
"conventional-commits-parser": "^3.2.0",
|
||||
"git-raw-commits": "2.0.0",
|
||||
"git-semver-tags": "^4.1.1",
|
||||
"meow": "^8.0.0",
|
||||
"q": "^1.5.1"
|
||||
}
|
||||
},
|
||||
"find-up": {
|
||||
"version": "5.0.0",
|
||||
"resolved": "https://registry.npmjs.org/find-up/-/find-up-5.0.0.tgz",
|
||||
|
@ -7414,6 +7390,15 @@
|
|||
"integrity": "sha1-tdRU3CGZriJWmfNGfloH87lVuv0=",
|
||||
"dev": true
|
||||
},
|
||||
"hosted-git-info": {
|
||||
"version": "3.0.7",
|
||||
"resolved": "https://registry.npmjs.org/hosted-git-info/-/hosted-git-info-3.0.7.tgz",
|
||||
"integrity": "sha512-fWqc0IcuXs+BmE9orLDyVykAG9GJtGLGuZAAqgcckPgv5xad4AcXGIv8galtQvlwutxSlaMcdw7BUtq2EIvqCQ==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"lru-cache": "^6.0.0"
|
||||
}
|
||||
},
|
||||
"locate-path": {
|
||||
"version": "6.0.0",
|
||||
"resolved": "https://registry.npmjs.org/locate-path/-/locate-path-6.0.0.tgz",
|
||||
|
@ -7423,6 +7408,54 @@
|
|||
"p-locate": "^5.0.0"
|
||||
}
|
||||
},
|
||||
"lru-cache": {
|
||||
"version": "6.0.0",
|
||||
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-6.0.0.tgz",
|
||||
"integrity": "sha512-Jo6dJ04CmSjuznwJSS3pUeWmd/H0ffTlkXXgwZi+eq1UCmqQwCh+eLsYOYCwY991i2Fah4h1BEMCx4qThGbsiA==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"yallist": "^4.0.0"
|
||||
}
|
||||
},
|
||||
"meow": {
|
||||
"version": "8.0.0",
|
||||
"resolved": "https://registry.npmjs.org/meow/-/meow-8.0.0.tgz",
|
||||
"integrity": "sha512-nbsTRz2fwniJBFgUkcdISq8y/q9n9VbiHYbfwklFh5V4V2uAcxtKQkDc0yCLPM/kP0d+inZBewn3zJqewHE7kg==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"@types/minimist": "^1.2.0",
|
||||
"camelcase-keys": "^6.2.2",
|
||||
"decamelize-keys": "^1.1.0",
|
||||
"hard-rejection": "^2.1.0",
|
||||
"minimist-options": "4.1.0",
|
||||
"normalize-package-data": "^3.0.0",
|
||||
"read-pkg-up": "^7.0.1",
|
||||
"redent": "^3.0.0",
|
||||
"trim-newlines": "^3.0.0",
|
||||
"type-fest": "^0.18.0",
|
||||
"yargs-parser": "^20.2.3"
|
||||
},
|
||||
"dependencies": {
|
||||
"yargs-parser": {
|
||||
"version": "20.2.4",
|
||||
"resolved": "https://registry.npmjs.org/yargs-parser/-/yargs-parser-20.2.4.tgz",
|
||||
"integrity": "sha512-WOkpgNhPTlE73h4VFAFsOnomJVaovO8VqLDzy5saChRBFQFBoMYirowyW+Q9HB4HFF4Z7VZTiG3iSzJJA29yRA==",
|
||||
"dev": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"normalize-package-data": {
|
||||
"version": "3.0.0",
|
||||
"resolved": "https://registry.npmjs.org/normalize-package-data/-/normalize-package-data-3.0.0.tgz",
|
||||
"integrity": "sha512-6lUjEI0d3v6kFrtgA/lOx4zHCWULXsFNIjHolnZCKCTLA6m/G625cdn3O7eNmT0iD3jfo6HZ9cdImGZwf21prw==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"hosted-git-info": "^3.0.6",
|
||||
"resolve": "^1.17.0",
|
||||
"semver": "^7.3.2",
|
||||
"validate-npm-package-license": "^3.0.1"
|
||||
}
|
||||
},
|
||||
"p-limit": {
|
||||
"version": "3.0.2",
|
||||
"resolved": "https://registry.npmjs.org/p-limit/-/p-limit-3.0.2.tgz",
|
||||
|
@ -7450,6 +7483,12 @@
|
|||
"has-flag": "^3.0.0"
|
||||
}
|
||||
},
|
||||
"type-fest": {
|
||||
"version": "0.18.1",
|
||||
"resolved": "https://registry.npmjs.org/type-fest/-/type-fest-0.18.1.tgz",
|
||||
"integrity": "sha512-OIAYXk8+ISY+qTOwkHtKqzAuxchoMiD9Udx+FSGQDuiRR+PJKJHc2NJAXlbhkGwTt/4/nKZxELY1w3ReWOL8mw==",
|
||||
"dev": true
|
||||
},
|
||||
"wrap-ansi": {
|
||||
"version": "6.2.0",
|
||||
"resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-6.2.0.tgz",
|
||||
|
@ -7493,6 +7532,12 @@
|
|||
"integrity": "sha512-r9S/ZyXu/Xu9q1tYlpsLIsa3EeLXXk0VwlxqTcFRfg9EhMW+17kbt9G0NrgCmhGb5vT2hyhJZLfDGx+7+5Uj/w==",
|
||||
"dev": true
|
||||
},
|
||||
"yallist": {
|
||||
"version": "4.0.0",
|
||||
"resolved": "https://registry.npmjs.org/yallist/-/yallist-4.0.0.tgz",
|
||||
"integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==",
|
||||
"dev": true
|
||||
},
|
||||
"yargs": {
|
||||
"version": "15.4.1",
|
||||
"resolved": "https://registry.npmjs.org/yargs/-/yargs-15.4.1.tgz",
|
||||
|
|
|
@ -16,11 +16,11 @@ from multiprocess import Queue
|
|||
from selenium.common.exceptions import WebDriverException
|
||||
from tblib import pickling_support
|
||||
|
||||
from .Commands import command_executor
|
||||
from .Commands.Types import ShutdownCommand
|
||||
from .DeployBrowsers import deploy_browser
|
||||
from .Errors import BrowserConfigError, BrowserCrashError, ProfileLoadError
|
||||
from .SocketInterface import clientsocket
|
||||
from .commands import command_executor
|
||||
from .commands.types import ShutdownCommand
|
||||
from .deploy_browsers import deploy_browser
|
||||
from .errors import BrowserConfigError, BrowserCrashError, ProfileLoadError
|
||||
from .socket_interface import ClientSocket
|
||||
from .utilities.multiprocess_utils import (
|
||||
Process,
|
||||
kill_process_and_children,
|
||||
|
@ -471,7 +471,7 @@ def BrowserManager(
|
|||
"BROWSER %i: Connecting to extension on port %i"
|
||||
% (browser_params["browser_id"], port)
|
||||
)
|
||||
extension_socket = clientsocket(serialization="json")
|
||||
extension_socket = ClientSocket(serialization="json")
|
||||
extension_socket.connect("127.0.0.1", int(port))
|
||||
else:
|
||||
extension_socket = None
|
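Most of the import edits in this commit follow one mechanical pattern: CamelCase module and package names become snake_case (`Commands.Types` to `commands.types`, `SocketInterface` to `socket_interface`, `MPLogger` to `mp_logger`). The helper below is a hypothetical sketch of that mapping, not a tool from the OpenWPM repository. It does not reproduce every choice in the commit: the task manager imports spell one module `S3_aggregator`, and the `DataAggregator` package keeps its CamelCase name there.

```python
import re


def camel_to_snake(name):
    """Convert a CamelCase identifier to snake_case."""
    # Split before a capitalised word that follows any character: "MPLogger" -> "MP_Logger".
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Split between a lowercase letter or digit and a following capital.
    partial = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial)
    return partial.lower()


if __name__ == "__main__":
    for old in ["TaskManager", "CommandSequence", "SocketInterface", "MPLogger"]:
        print(old, "->", camel_to_snake(old))
```

A mapping like this can help when updating downstream imports, but class names such as `TaskManager` and `CommandSequence` stay CamelCase; only the modules that contain them were renamed.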
|
@ -1,6 +1,6 @@
|
|||
from typing import Callable, List, Tuple
|
||||
|
||||
from .Commands.Types import (
|
||||
from .commands.types import (
|
||||
BaseCommand,
|
||||
BrowseCommand,
|
||||
DumpPageSourceCommand,
|
||||
|
@ -13,7 +13,7 @@ from .Commands.Types import (
|
|||
SaveScreenshotCommand,
|
||||
ScreenshotFullPageCommand,
|
||||
)
|
||||
from .Errors import CommandExecutionError
|
||||
from .errors import CommandExecutionError
|
||||
|
||||
|
||||
class CommandSequence:
|
|
@ -20,7 +20,7 @@ from selenium.webdriver.remote.webdriver import WebDriver
|
|||
from selenium.webdriver.support import expected_conditions as EC
|
||||
from selenium.webdriver.support.ui import WebDriverWait
|
||||
|
||||
from ..SocketInterface import clientsocket
|
||||
from ..socket_interface import ClientSocket
|
||||
from .utils.webdriver_utils import (
|
||||
execute_in_all_frames,
|
||||
execute_script_with_retry,
|
||||
|
@ -112,7 +112,7 @@ def tab_restart_browser(webdriver):
|
|||
|
||||
|
||||
def get_website(
|
||||
url, sleep, visit_id, webdriver, browser_params, extension_socket: clientsocket
|
||||
url, sleep, visit_id, webdriver, browser_params, extension_socket: ClientSocket
|
||||
):
|
||||
"""
|
||||
goes to <url> using the given <webdriver> instance
|
||||
|
@ -372,7 +372,7 @@ def recursive_dump_page_source(visit_id, driver, manager_params, suffix=""):
|
|||
|
||||
|
||||
def finalize(
|
||||
visit_id: int, webdriver: WebDriver, extension_socket: clientsocket, sleep: int
|
||||
visit_id: int, webdriver: WebDriver, extension_socket: ClientSocket, sleep: int
|
||||
) -> None:
|
||||
""" Informs the extension that a visit is done """
|
||||
tab_restart_browser(webdriver)
|
||||
|
@ -383,6 +383,6 @@ def finalize(
|
|||
extension_socket.send(msg)
|
||||
|
||||
|
||||
def initialize(visit_id: int, extension_socket: clientsocket) -> None:
|
||||
def initialize(visit_id: int, extension_socket: ClientSocket) -> None:
|
||||
msg = {"action": "Initialize", "visit_id": visit_id}
|
||||
extension_socket.send(msg)
|
|
@ -1,6 +1,6 @@
|
|||
from ..Errors import CommandExecutionError
|
||||
from ..errors import CommandExecutionError
|
||||
from . import browser_commands, profile_commands
|
||||
from .Types import (
|
||||
from .types import (
|
||||
BrowseCommand,
|
||||
DumpPageSourceCommand,
|
||||
DumpProfCommand,
|
|
@ -4,7 +4,7 @@ import pickle
|
|||
import shutil
|
||||
import tarfile
|
||||
|
||||
from ..Errors import ProfileLoadError
|
||||
from ..errors import ProfileLoadError
|
||||
from .utils.firefox_profile import sleep_until_sqlite_checkpoint
|
||||
|
||||
logger = logging.getLogger("openwpm")
|
|
@ -1,4 +1,4 @@
|
|||
from ..Errors import BrowserConfigError
|
||||
from ..errors import BrowserConfigError
|
||||
from . import deploy_firefox
|
||||
|
||||
|
|
@ -8,8 +8,8 @@ from pyvirtualdisplay import Display
|
|||
from selenium import webdriver
|
||||
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
|
||||
|
||||
from ..Commands.profile_commands import load_profile
|
||||
from ..Errors import BrowserConfigError
|
||||
from ..commands.profile_commands import load_profile
|
||||
from ..errors import BrowserConfigError
|
||||
from ..utilities.platform_utils import get_firefox_binary_path
|
||||
from . import configure_firefox
|
||||
from .selenium_firefox import FirefoxBinary, FirefoxLogInterceptor, Options
|
|
@ -16,8 +16,8 @@ from multiprocess import JoinableQueue
|
|||
from sentry_sdk.integrations.logging import BreadcrumbHandler, EventHandler
|
||||
from tblib import pickling_support
|
||||
|
||||
from .Commands.utils.webdriver_utils import parse_neterror
|
||||
from .SocketInterface import serversocket
|
||||
from .commands.utils.webdriver_utils import parse_neterror
|
||||
from .socket_interface import ServerSocket
|
||||
|
||||
pickling_support.install()
|
||||
|
||||
|
@ -218,7 +218,7 @@ class MPLogger(object):
|
|||
|
||||
def _start_listener(self):
|
||||
"""Start listening socket for remote logs from extension"""
|
||||
socket = serversocket(name="loggingserver")
|
||||
socket = ServerSocket(name="loggingserver")
|
||||
self._status_queue.put(socket.sock.getsockname())
|
||||
socket.start_accepting()
|
||||
self._status_queue.join() # block to allow parent to retrieve address
|
|
@ -11,7 +11,7 @@ import dill
|
|||
# see: https://stackoverflow.com/a/1148237
|
||||
|
||||
|
||||
class serversocket:
|
||||
class ServerSocket:
|
||||
"""
|
||||
A server socket to receive and process string messages
|
||||
from client sockets to a central queue
|
||||
|
@ -111,7 +111,7 @@ class serversocket:
|
|||
self.sock.close()
|
||||
|
||||
|
||||
class clientsocket:
|
||||
class ClientSocket:
|
||||
"""A client socket for sending messages"""
|
||||
|
||||
def __init__(self, serialization="json", verbose=False):
|
||||
|
@ -171,7 +171,7 @@ def main():
|
|||
|
||||
# Just for testing
|
||||
if sys.argv[1] == "s":
|
||||
sock = serversocket(verbose=True)
|
||||
sock = ServerSocket(verbose=True)
|
||||
sock.start_accepting()
|
||||
input("Press enter to exit...")
|
||||
sock.close()
|
||||
|
@ -181,7 +181,7 @@ def main():
|
|||
serialization = input("Enter the serialization type (default: 'json'):\n")
|
||||
if serialization == "":
|
||||
serialization = "json"
|
||||
sock = clientsocket(serialization=serialization)
|
||||
sock = ClientSocket(serialization=serialization)
|
||||
sock.connect(host, int(port))
|
||||
msg = None
|
||||
|
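The hunks above rename `serversocket` and `clientsocket` to `ServerSocket` and `ClientSocket`, which breaks any external code still using the old spellings. The commit itself does not add compatibility aliases. As a sketch under that assumption, downstream code could bridge the change with a thin deprecated wrapper. The `ClientSocket` class below is a stand-in that copies only the constructor signature shown in the diff; the wrapper is hypothetical.

```python
import warnings


class ClientSocket:
    """Stand-in for openwpm.socket_interface.ClientSocket (illustration only)."""

    def __init__(self, serialization="json", verbose=False):
        self.serialization = serialization
        self.verbose = verbose


def clientsocket(*args, **kwargs):
    """Old spelling kept as a wrapper that warns and forwards to ClientSocket."""
    warnings.warn(
        "clientsocket is deprecated; use ClientSocket",
        DeprecationWarning,
        stacklevel=2,
    )
    return ClientSocket(*args, **kwargs)
```

A wrapper function is used here instead of a plain alias (`clientsocket = ClientSocket`) so that each use of the old name emits a warning.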
|
@ -12,15 +12,15 @@ from typing import Any, Dict, List, Optional, Set, Tuple
|
|||
import psutil
|
||||
import tblib
|
||||
|
||||
from .BrowserManager import Browser
|
||||
from .Commands.utils.webdriver_utils import parse_neterror
|
||||
from .CommandSequence import CommandSequence
|
||||
from .DataAggregator import BaseAggregator, LocalAggregator, S3Aggregator
|
||||
from .DataAggregator.BaseAggregator import ACTION_TYPE_FINALIZE, RECORD_TYPE_SPECIAL
|
||||
from .Errors import CommandExecutionError
|
||||
from .browser_manager import Browser
|
||||
from .command_sequence import CommandSequence
|
||||
from .commands.utils.webdriver_utils import parse_neterror
|
||||
from .DataAggregator import S3_aggregator, base_aggregator, local_aggregator
|
||||
from .DataAggregator.base_aggregator import ACTION_TYPE_FINALIZE, RECORD_TYPE_SPECIAL
|
||||
from .errors import CommandExecutionError
|
||||
from .js_instrumentation import clean_js_instrumentation_settings
|
||||
from .MPLogger import MPLogger
|
||||
from .SocketInterface import clientsocket
|
||||
from .mp_logger import MPLogger
|
||||
from .socket_interface import ClientSocket
|
||||
from .utilities.multiprocess_utils import kill_process_and_children
|
||||
from .utilities.platform_utils import get_configuration_string, get_version
|
||||
|
||||
|
@ -273,13 +273,13 @@ class TaskManager:
|
|||
|
||||
def _launch_aggregators(self) -> None:
|
||||
"""Launch the necessary data aggregators"""
|
||||
self.data_aggregator: BaseAggregator.BaseAggregator
|
||||
self.data_aggregator: base_aggregator.BaseAggregator
|
||||
if self.manager_params["output_format"] == "local":
|
||||
self.data_aggregator = LocalAggregator.LocalAggregator(
|
||||
self.data_aggregator = local_aggregator.LocalAggregator(
|
||||
self.manager_params, self.browser_params
|
||||
)
|
||||
elif self.manager_params["output_format"] == "s3":
|
||||
self.data_aggregator = S3Aggregator.S3Aggregator(
|
||||
self.data_aggregator = S3_aggregator.S3Aggregator(
|
||||
self.manager_params, self.browser_params
|
||||
)
|
||||
else:
|
||||
|
@ -292,7 +292,7 @@ class TaskManager:
|
|||
] = self.data_aggregator.listener_address
|
||||
|
||||
# open connection to aggregator for saving crawl details
|
||||
self.sock = clientsocket(serialization="dill")
|
||||
self.sock = ClientSocket(serialization="dill")
|
||||
self.sock.connect(*self.manager_params["aggregator_address"])
|
||||
|
||||
def _shutdown_manager(
|
|
@ -9,7 +9,7 @@ from selenium import webdriver
|
|||
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
|
||||
|
||||
from openwpm import js_instrumentation as jsi
|
||||
from openwpm.DeployBrowsers import configure_firefox
|
||||
from openwpm.deploy_browsers import configure_firefox
|
||||
from openwpm.TaskManager import load_default_params
|
||||
from openwpm.utilities.platform_utils import get_firefox_binary_path
|
||||
|
||||
|
@ -18,7 +18,7 @@ from .utilities import BASE_TEST_URL, start_server
|
|||
|
||||
# import commonly used modules and utilities so they can be easily accessed
|
||||
# in the interactive session
|
||||
from openwpm.Commands.utils import webdriver_utils as wd_util # noqa isort:skip
|
||||
from openwpm.commands.utils import webdriver_utils as wd_util # noqa isort:skip
|
||||
import domain_utils as du # noqa isort:skip
|
||||
from selenium.webdriver.common.keys import Keys # noqa isort:skip
|
||||
from selenium.common.exceptions import * # noqa isort:skip
|
||||
|
|
|
@ -3,7 +3,7 @@ from os.path import isfile, join
|
|||
|
||||
import pytest
|
||||
|
||||
from openwpm import TaskManager
|
||||
from openwpm import task_manager
|
||||
|
||||
from . import utilities
|
||||
|
||||
|
@ -23,7 +23,7 @@ class OpenWPMTest(object):
|
|||
def visit(self, page_url, data_dir="", sleep_after=0):
|
||||
"""Visit a test page with the given parameters."""
|
||||
manager_params, browser_params = self.get_config(data_dir)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
if not page_url.startswith("http"):
|
||||
page_url = utilities.BASE_TEST_URL + page_url
|
||||
manager.get(url=page_url, sleep=sleep_after)
|
||||
|
@ -36,7 +36,7 @@ class OpenWPMTest(object):
|
|||
"""Load and return the default test parameters."""
|
||||
if not data_dir:
|
||||
data_dir = self.tmpdir
|
||||
manager_params, browser_params = TaskManager.load_default_params(num_browsers)
|
||||
manager_params, browser_params = task_manager.load_default_params(num_browsers)
|
||||
manager_params["data_directory"] = data_dir
|
||||
manager_params["log_directory"] = data_dir
|
||||
for i in range(num_browsers):
|
||||
|
|
|
@ -1,8 +1,8 @@
|
|||
from functools import partial
|
||||
from typing import List
|
||||
|
||||
from openwpm.CommandSequence import CommandSequence
|
||||
from openwpm.TaskManager import TaskManager
|
||||
from openwpm.command_sequence import CommandSequence
|
||||
from openwpm.task_manager import TaskManager
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
from .utilities import BASE_TEST_URL
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
from openwpm import TaskManager
|
||||
from openwpm import task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
from openwpm.utilities.platform_utils import parse_http_stack_trace_str
|
||||
|
||||
|
@ -70,7 +70,7 @@ class TestCallstackInstrument(OpenWPMTest):
|
|||
def test_http_stacktrace(self):
|
||||
test_url = utilities.BASE_TEST_URL + "/http_stacktrace.html"
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(test_url, sleep=10)
|
||||
db = manager_params["db"]
|
||||
manager.close()
|
||||
|
|
|
@ -4,7 +4,7 @@ import tarfile
|
|||
import domain_utils as du
|
||||
import pytest
|
||||
|
||||
from openwpm import TaskManager
|
||||
from openwpm import task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
|
@ -66,7 +66,7 @@ class TestCrawl(OpenWPMTest):
|
|||
# Run the test crawl
|
||||
data_dir = os.path.join(str(tmpdir), "data_dir")
|
||||
manager_params, browser_params = self.get_config(data_dir)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
for site in TEST_SITES:
|
||||
manager.get(site)
|
||||
ff_db_tar = os.path.join(
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
from openwpm import CommandSequence, TaskManager
|
||||
from openwpm import command_sequence, task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from . import utilities
|
||||
|
@ -31,7 +31,7 @@ class TestCustomFunctionCommand(OpenWPMTest):
|
|||
def test_custom_function(self):
|
||||
""" Test `custom_function` with an inline func that collects links """
|
||||
|
||||
from openwpm.SocketInterface import clientsocket
|
||||
from openwpm.socket_interface import ClientSocket
|
||||
|
||||
def collect_links(table_name, scheme, **kwargs):
|
||||
""" Collect links with `scheme` and save in table `table_name` """
|
||||
|
@ -49,7 +49,7 @@ class TestCustomFunctionCommand(OpenWPMTest):
|
|||
]
|
||||
current_url = driver.current_url
|
||||
|
||||
sock = clientsocket()
|
||||
sock = ClientSocket()
|
||||
sock.connect(*manager_params["aggregator_address"])
|
||||
|
||||
query = (
|
||||
|
@ -73,8 +73,8 @@ class TestCustomFunctionCommand(OpenWPMTest):
|
|||
sock.close()
|
||||
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
cs = CommandSequence.CommandSequence(url_a)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
cs = command_sequence.CommandSequence(url_a)
|
||||
cs.get(sleep=0, timeout=60)
|
||||
cs.run_custom_function(collect_links, ("page_links", "http"))
|
||||
manager.execute_command_sequence(cs)
|
||||
|
|
|
@ -3,7 +3,7 @@ from datetime import datetime
|
|||
|
||||
import pytest
|
||||
|
||||
from openwpm import TaskManager
|
||||
from openwpm import task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from . import utilities
|
||||
|
@ -281,7 +281,7 @@ class TestExtension(OpenWPMTest):
|
|||
|
||||
def test_extension_gets_correct_visit_id(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
url_a = utilities.BASE_TEST_URL + "/simple_a.html"
|
||||
url_b = utilities.BASE_TEST_URL + "/simple_b.html"
|
||||
|
|
|
@ -10,7 +10,7 @@ from urllib.parse import urlparse
|
|||
|
||||
import pytest
|
||||
|
||||
from openwpm import CommandSequence, TaskManager
|
||||
from openwpm import command_sequence, task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from . import utilities
|
||||
|
@ -670,7 +670,7 @@ class TestHTTPInstrument(OpenWPMTest):
|
|||
"""
|
||||
test_url = utilities.BASE_TEST_URL + "/http_test_page.html"
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(test_url, sleep=5)
|
||||
manager.get(test_url, sleep=5)
|
||||
manager.close()
|
||||
|
@ -738,7 +738,7 @@ class TestHTTPInstrument(OpenWPMTest):
|
|||
manager_params, browser_params = self.get_test_config(str(tmpdir))
|
||||
browser_params[0]["http_instrument"] = True
|
||||
browser_params[0]["save_content"] = "script"
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(url=test_url, sleep=1)
|
||||
manager.close()
|
||||
expected_hashes = {
|
||||
|
@ -763,7 +763,7 @@ class TestHTTPInstrument(OpenWPMTest):
|
|||
manager_params, browser_params = self.get_test_config(str(tmpdir))
|
||||
browser_params[0]["http_instrument"] = True
|
||||
browser_params[0]["save_content"] = "main_frame,sub_frame"
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(url=test_url, sleep=1)
|
||||
manager.close()
|
||||
for chash, content in db_utils.get_content(str(tmpdir)):
|
||||
|
@ -780,7 +780,7 @@ class TestHTTPInstrument(OpenWPMTest):
|
|||
manager_params, browser_params = self.get_test_config(str(tmpdir))
|
||||
browser_params[0]["http_instrument"] = True
|
||||
browser_params[0]["save_content"] = True
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(url=test_url, sleep=1)
|
||||
manager.close()
|
||||
db = manager_params["db"]
|
||||
|
@ -1007,9 +1007,9 @@ class TestPOSTInstrument(OpenWPMTest):
|
|||
sleep(5) # wait for the form submission (3 sec after onload)
|
||||
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
test_url = utilities.BASE_TEST_URL + "/post_file_upload.html"
|
||||
cs = CommandSequence.CommandSequence(test_url)
|
||||
cs = command_sequence.CommandSequence(test_url)
|
||||
cs.get(sleep=0, timeout=60)
|
||||
cs.run_custom_function(type_filenames_into_form, ())
|
||||
manager.execute_command_sequence(cs)
|
||||
|
|
|
@ -4,7 +4,7 @@ import time
|
|||
|
||||
import pytest
|
||||
|
||||
from openwpm import MPLogger
|
||||
from openwpm import mp_logger
|
||||
from openwpm.utilities.multiprocess_utils import Process
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
|
@ -90,7 +90,7 @@ class TestMPLogger(OpenWPMTest):
|
|||
def test_multiprocess(self, tmpdir):
|
||||
# Set up loggingserver
|
||||
log_file = self.get_logfile_path(str(tmpdir))
|
||||
openwpm_logger = MPLogger.MPLogger(log_file)
|
||||
openwpm_logger = mp_logger.MPLogger(log_file)
|
||||
|
||||
child_process_1 = Process(target=child_proc, args=(0,))
|
||||
child_process_1.daemon = True
|
||||
|
@ -144,7 +144,7 @@ class TestMPLogger(OpenWPMTest):
|
|||
)
|
||||
def test_child_process_with_exception(self, tmpdir):
|
||||
log_file = self.get_logfile_path(str(tmpdir))
|
||||
openwpm_logger = MPLogger.MPLogger(log_file)
|
||||
openwpm_logger = mp_logger.MPLogger(log_file)
|
||||
|
||||
child_process_1 = Process(target=child_proc_with_exception, args=(0,))
|
||||
child_process_1.daemon = True
|
||||
|
@ -172,7 +172,7 @@ class TestMPLogger(OpenWPMTest):
|
|||
)
|
||||
def test_child_process_logging(self, tmpdir):
|
||||
log_file = self.get_logfile_path(str(tmpdir))
|
||||
openwpm_logger = MPLogger.MPLogger(log_file)
|
||||
openwpm_logger = mp_logger.MPLogger(log_file)
|
||||
child_process = Process(target=child_proc_logging_exception())
|
||||
child_process.daemon = True
|
||||
child_process.start()
|
||||
|
|
|
@ -2,9 +2,9 @@ from os.path import isfile, join
|
|||
|
||||
import pytest
|
||||
|
||||
from openwpm import TaskManager
|
||||
from openwpm.CommandSequence import CommandSequence
|
||||
from openwpm.Errors import CommandExecutionError, ProfileLoadError
|
||||
from openwpm import task_manager
|
||||
from openwpm.command_sequence import CommandSequence
|
||||
from openwpm.errors import CommandExecutionError, ProfileLoadError
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
|
@ -23,7 +23,7 @@ class TestProfile(OpenWPMTest):
|
|||
@pytest.mark.xfail(run=False)
|
||||
def test_saving(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get("http://example.com")
|
||||
manager.close()
|
||||
assert isfile(join(browser_params[0]["profile_archive_dir"], "profile.tar.gz"))
|
||||
|
@ -32,7 +32,7 @@ class TestProfile(OpenWPMTest):
|
|||
def test_crash(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager_params["failure_limit"] = 0
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
with pytest.raises(CommandExecutionError):
|
||||
manager.get("http://example.com") # So we have a profile
|
||||
manager.get("example.com") # Selenium requires scheme prefix
|
||||
|
@ -42,7 +42,7 @@ class TestProfile(OpenWPMTest):
|
|||
def test_crash_profile(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager_params["failure_limit"] = 2
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
try:
|
||||
manager.get("http://example.com") # So we have a profile
|
||||
manager.get("example.com") # Selenium requires scheme prefix
|
||||
|
@ -58,14 +58,14 @@ class TestProfile(OpenWPMTest):
|
|||
manager_params, browser_params = self.get_config()
|
||||
browser_params[0]["seed_tar"] = "/tmp/NOTREAL"
|
||||
with pytest.raises(ProfileLoadError):
|
||||
TaskManager.TaskManager(manager_params, browser_params) # noqa
|
||||
task_manager.TaskManager(manager_params, browser_params) # noqa
|
||||
|
||||
@pytest.mark.skip(reason="proxy no longer supported, need to update")
|
||||
def test_profile_saved_when_launch_crashes(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
browser_params[0]["proxy"] = True
|
||||
browser_params[0]["save_content"] = "script"
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get("http://example.com")
|
||||
|
||||
# Kill the LevelDBAggregator
|
||||
|
@ -109,7 +109,7 @@ class TestProfile(OpenWPMTest):
|
|||
cs.get()
|
||||
cs.run_custom_function(test_config_is_set)
|
||||
command_sequences.append(cs)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
for cs in command_sequences:
|
||||
manager.execute_command_sequence(cs)
|
||||
manager.close()
|
||||
|
|
|
@ -8,8 +8,8 @@ import pytest
|
|||
from localstack.services import infra
|
||||
from multiprocess import Queue
|
||||
|
||||
from openwpm import TaskManager
|
||||
from openwpm.CommandSequence import CommandSequence
|
||||
from openwpm import task_manager
|
||||
from openwpm.command_sequence import CommandSequence
|
||||
from openwpm.DataAggregator.parquet_schema import PQ_SCHEMAS
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
|
@ -54,7 +54,7 @@ class TestS3Aggregator(OpenWPMTest):
|
|||
NUM_VISITS = 2
|
||||
NUM_BROWSERS = 4
|
||||
manager_params, browser_params = self.get_config(num_browsers=NUM_BROWSERS)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
for _ in range(NUM_VISITS * NUM_BROWSERS):
|
||||
manager.get(TEST_SITE, sleep=1)
|
||||
manager.close()
|
||||
|
@ -100,7 +100,7 @@ class TestS3Aggregator(OpenWPMTest):
|
|||
TEST_SITE = "%s/s3_aggregator.html" % BASE_TEST_URL
|
||||
manager_params, browser_params = self.get_config(num_browsers=1)
|
||||
manager_params["s3_directory"] = "s3-aggregator-tests-2"
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(TEST_SITE, sleep=1)
|
||||
dataset = LocalS3Dataset(
|
||||
manager_params["s3_bucket"], manager_params["s3_directory"]
|
||||
|
@ -125,7 +125,7 @@ class TestS3Aggregator(OpenWPMTest):
|
|||
dataset = LocalS3Dataset(
|
||||
manager_params["s3_bucket"], manager_params["s3_directory"]
|
||||
)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
queue = Queue()
|
||||
|
||||
def ensure_site_in_s3(success: bool):
|
||||
|
|
|
@ -7,7 +7,7 @@ from urllib.parse import urlparse
|
|||
|
||||
from PIL import Image
|
||||
|
||||
from openwpm import CommandSequence, TaskManager
|
||||
from openwpm import command_sequence, task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from . import utilities
|
||||
|
@ -113,12 +113,12 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""Check that get works and populates db correctly."""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
# Set up two sequential get commands to two URLS
|
||||
cs_a = CommandSequence.CommandSequence(url_a)
|
||||
cs_a = command_sequence.CommandSequence(url_a)
|
||||
cs_a.get(sleep=1)
|
||||
cs_b = CommandSequence.CommandSequence(url_b)
|
||||
cs_b = command_sequence.CommandSequence(url_b)
|
||||
cs_b.get(sleep=1)
|
||||
|
||||
# Perform the get commands
|
||||
|
@ -140,12 +140,12 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""Check that get works and populates http tables correctly."""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
# Set up two sequential get commands to two URLS
|
||||
cs_a = CommandSequence.CommandSequence(url_a)
|
||||
cs_a = command_sequence.CommandSequence(url_a)
|
||||
cs_a.get(sleep=1)
|
||||
cs_b = CommandSequence.CommandSequence(url_b)
|
||||
cs_b = command_sequence.CommandSequence(url_b)
|
||||
cs_b.get(sleep=1)
|
||||
|
||||
manager.execute_command_sequence(cs_a)
|
||||
|
@ -193,12 +193,12 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""Check that CommandSequence.browse() populates db correctly."""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
# Set up two sequential browse commands to two URLS
|
||||
cs_a = CommandSequence.CommandSequence(url_a, site_rank=0)
|
||||
cs_a = command_sequence.CommandSequence(url_a, site_rank=0)
|
||||
cs_a.browse(num_links=1, sleep=1)
|
||||
cs_b = CommandSequence.CommandSequence(url_b, site_rank=1)
|
||||
cs_b = command_sequence.CommandSequence(url_b, site_rank=1)
|
||||
cs_b.browse(num_links=1, sleep=1)
|
||||
|
||||
manager.execute_command_sequence(cs_a)
|
||||
|
@ -226,12 +226,12 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
# Set up two sequential browse commands to two URLS
|
||||
cs_a = CommandSequence.CommandSequence(url_a)
|
||||
cs_a = command_sequence.CommandSequence(url_a)
|
||||
cs_a.browse(num_links=20, sleep=1)
|
||||
cs_b = CommandSequence.CommandSequence(url_b)
|
||||
cs_b = command_sequence.CommandSequence(url_b)
|
||||
cs_b.browse(num_links=1, sleep=1)
|
||||
|
||||
manager.execute_command_sequence(cs_a)
|
||||
|
@ -313,7 +313,7 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
|
||||
# Set up two sequential browse commands to two URLS
|
||||
manager.browse(url_a, num_links=20, sleep=1)
|
||||
|
@ -389,8 +389,8 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""Check that 'save_screenshot' works"""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
cs = CommandSequence.CommandSequence(url_a)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
cs = command_sequence.CommandSequence(url_a)
|
||||
cs.get(sleep=1)
|
||||
cs.save_screenshot("test")
|
||||
cs.screenshot_full_page("test_full")
|
||||
|
@ -417,8 +417,8 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""Check that 'dump_page_source' works and source is saved properly."""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
cs = CommandSequence.CommandSequence(url_a)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
cs = command_sequence.CommandSequence(url_a)
|
||||
cs.get(sleep=1)
|
||||
cs.dump_page_source(suffix="test")
|
||||
manager.execute_command_sequence(cs)
|
||||
|
@ -440,8 +440,8 @@ class TestSimpleCommands(OpenWPMTest):
|
|||
"""Check that 'recursive_dump_page_source' works"""
|
||||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config(display_mode)
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
cs = CommandSequence.CommandSequence(NESTED_FRAMES_URL)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
cs = command_sequence.CommandSequence(NESTED_FRAMES_URL)
|
||||
cs.get(sleep=1)
|
||||
cs.recursive_dump_page_source()
|
||||
manager.execute_command_sequence(cs)
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
from openwpm import CommandSequence, TaskManager
|
||||
from openwpm import command_sequence, task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from . import utilities
|
||||
|
@ -36,9 +36,9 @@ class TestStorageVectors(OpenWPMTest):
|
|||
# Run the test crawl
|
||||
manager_params, browser_params = self.get_config()
|
||||
browser_params[0]["cookie_instrument"] = True
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
url = utilities.BASE_TEST_URL + "/js_cookie.html"
|
||||
cs = CommandSequence.CommandSequence(url)
|
||||
cs = command_sequence.CommandSequence(url)
|
||||
cs.get(sleep=3, timeout=120)
|
||||
manager.execute_command_sequence(cs)
|
||||
manager.close()
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
from openwpm import TaskManager
|
||||
from openwpm import task_manager
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
|
@ -14,13 +14,13 @@ class TestCommandDuration(OpenWPMTest):
|
|||
|
||||
def test_command_duration(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get(url=TEST_URL, sleep=5)
|
||||
manager.close()
|
||||
|
||||
get_command = db_utils.query_db(
|
||||
manager_params["db"],
|
||||
"SELECT duration FROM crawl_history WHERE command = \"<class 'openwpm.Commands.Types.GetCommand'>\"",
|
||||
"SELECT duration FROM crawl_history WHERE command = \"<class 'openwpm.commands.types.GetCommand'>\"",
|
||||
as_tuple=True,
|
||||
)[0]
|
||||
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
from openwpm import TaskManager
|
||||
from openwpm.Commands.utils.webdriver_utils import parse_neterror
|
||||
from openwpm import task_manager
|
||||
from openwpm.commands.utils.webdriver_utils import parse_neterror
|
||||
from openwpm.utilities import db_utils
|
||||
|
||||
from .openwpmtest import OpenWPMTest
|
||||
|
@ -22,13 +22,13 @@ class TestCustomFunctionCommand(OpenWPMTest):
|
|||
|
||||
def test_parse_neterror_integration(self):
|
||||
manager_params, browser_params = self.get_config()
|
||||
manager = TaskManager.TaskManager(manager_params, browser_params)
|
||||
manager = task_manager.TaskManager(manager_params, browser_params)
|
||||
manager.get("http://website.invalid")
|
||||
manager.close()
|
||||
|
||||
get_command = db_utils.query_db(
|
||||
manager_params["db"],
|
||||
"SELECT command_status, error FROM crawl_history WHERE command = \"<class 'openwpm.Commands.Types.GetCommand'>\"",
|
||||
"SELECT command_status, error FROM crawl_history WHERE command = \"<class 'openwpm.commands.types.GetCommand'>\"",
|
||||
as_tuple=True,
|
||||
)[0]
|
||||
|
||||
|
|