Merge pull request #525 from mozilla/movingSchema_sql

Moved schema.sql to the DataAggregator directory
2019-11-05 10:42:40 -08:00 · 2019-11-05 10:42:40 -08:00 · bab2a629fc
--- a/README.md
+++ b/README.md
@@ -181,7 +181,7 @@ bodies are saved in a LevelDB database named `content.ldb`, and are keyed by
 the hash of the content. In addition, the browser commands that dump page
 source and save screenshots save them in the `sources` and `screenshots`
 subdirectories of the main output directory. The SQLite schema
-specified by: `automation/schema.sql`. You can specify additional tables
+specified by: `automation/DataAggregator/schema.sql`. You can specify additional tables
 inline by sending a `create_table` message to the data aggregator.

 #### Parquet on Amazon S3 **Experimental**
@@ -201,7 +201,7 @@ location.
 **NOTE:** The schemas should be kept in sync with the exception of
 output-specific columns (e.g., `instance_id` in the S3 output). You can compare
 the two schemas by running
-`diff -y automation/schema.sql automation/DataAggregator/parquet_schema.py`.
+`diff -y automation/DataAggregator/schema.sql automation/DataAggregator/parquet_schema.py`.

 Browser and Platform Configuration
 ----------------------------------
--- a/automation/DataAggregator/LocalAggregator.py
+++ b/automation/DataAggregator/LocalAggregator.py
@@ -17,7 +17,7 @@ from .BaseAggregator import RECORD_TYPE_CONTENT, BaseAggregator, BaseListener
 SQL_BATCH_SIZE = 1000
 LDB_BATCH_SIZE = 100
 MIN_TIME = 5  # seconds
-SCHEMA_FILE = os.path.join(os.path.dirname(__file__), '..', 'schema.sql')
+SCHEMA_FILE = os.path.join(os.path.dirname(__file__), 'schema.sql')
 LDB_NAME = 'content.ldb'


--- a/automation/DataAggregator/schema.sql
+++ b/automation/DataAggregator/schema.sql
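
Reviewer note: the `SCHEMA_FILE` change in `LocalAggregator.py` resolves `schema.sql` relative to the module's own directory rather than its parent. A minimal sketch of the old and new path resolution follows. The helper names `old_schema_path` and `new_schema_path` are hypothetical and not part of the codebase, and the old path is normalized here only to make the comparison readable (the original constant was not normalized).

```python
import os


def old_schema_path(module_file: str) -> str:
    # Hypothetical helper: mirrors the pre-move constant, which looked one
    # directory above the module. normpath is added only for readability.
    return os.path.normpath(
        os.path.join(os.path.dirname(module_file), '..', 'schema.sql')
    )


def new_schema_path(module_file: str) -> str:
    # Hypothetical helper: mirrors the post-move constant, which looks for
    # schema.sql next to the module itself.
    return os.path.join(os.path.dirname(module_file), 'schema.sql')


# Stand-in for __file__ inside LocalAggregator.py (relative path for illustration).
module_file = os.path.join('automation', 'DataAggregator', 'LocalAggregator.py')

old_path = old_schema_path(module_file)
new_path = new_schema_path(module_file)

print(old_path)
print(new_path)
```

Because both forms are anchored on `os.path.dirname(__file__)`, the lookup should not depend on the current working directory, only on where the module file lives.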