DEPRECATED

watchdog-proxy

This is a simple proxy which interfaces with Microsoft's PhotoDNA Service.

Systems Diagram

Quick summary of operation

  1. A third-party Consumer sends an HTTP POST request to the AWS API gateway to invoke the Accept lambda function
  2. The Accept function authenticates the Consumer's credentials supplied via Hawk against a DynamoDB table
  3. If the credentials & parameters are valid, details of the Consumer's submission are sent to the SQS queue and the uploaded image is saved in a private S3 bucket.
  4. Every 60 seconds, a CloudWatch alarm executes the Queue Processor lambda function.
  5. The Queue Processor attempts an atomic write to a DynamoDB table as a form of mutex to ensure only one Queue Processor is running at any given time. The function exits if the atomic write fails.
  6. The Queue Processor lambda function runs for most of 60 seconds, using long-polling on the SQS queue for submissions and exiting when the budgeted time remaining for execution is less than a second.
  7. The Queue Processor receives SQS messages, up to a rate limit (currently 5 per second)
  8. An Event Processor lambda function is invoked for each received SQS message
  9. The Event Processor function calls the upstream web service (i.e. PhotoDNA) with the details of a Consumer submission
  10. On a response from the upstream web service, the Event Processor makes a request back to a URL included in the Consumer submission
  11. Finally, on success, the Event Processor deletes the message from the SQS queue to acknowledge completion
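
Steps 5 through 8 above can be sketched as a single loop. This is a minimal illustration, not the repository's code: every dependency is injected so the sketch runs without AWS, and the names runQueueProcessor, acquireLock, receiveMessages and invokeEventProcessor are hypothetical. In a real Lambda function, getRemainingTimeInMillis would come from the handler's context object, and the 5-per-second limit is assumed here to mean "at most 5 messages per one-second window".

```javascript
// Hypothetical sketch of steps 5-8; none of these names come from the repo.
const RATE_LIMIT_PER_SECOND = 5; // "currently 5 per second" in the summary
const MIN_REMAINING_MS = 1000; // stop when less than a second of budget is left

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function runQueueProcessor({
  acquireLock, // async () => boolean, e.g. an atomic write to DynamoDB
  receiveMessages, // async (max) => array of messages, e.g. SQS long polling
  invokeEventProcessor, // async (message) => void, e.g. a Lambda invocation
  getRemainingTimeInMillis, // () => number, as on the Lambda context object
  now = Date.now,
  wait = sleep,
}) {
  // Step 5: exit immediately if another Queue Processor holds the mutex.
  if (!(await acquireLock())) {
    return { ran: false, dispatched: 0 };
  }

  let dispatched = 0;
  // Step 6: keep polling until the execution budget is nearly spent.
  while (getRemainingTimeInMillis() >= MIN_REMAINING_MS) {
    const windowStart = now();
    // Step 7: never receive more than the rate limit in one window.
    const messages = await receiveMessages(RATE_LIMIT_PER_SECOND);
    // Step 8: one Event Processor invocation per received message.
    await Promise.all(messages.map(message => invokeEventProcessor(message)));
    dispatched += messages.length;

    // If the limit was reached, pad the window out to a full second.
    const elapsed = now() - windowStart;
    if (messages.length >= RATE_LIMIT_PER_SECOND && elapsed < 1000) {
      await wait(1000 - elapsed);
    }
  }
  return { ran: true, dispatched };
}
```

Passing the lock, queue and clock in as functions keeps the time-budget and rate-limit logic testable with simple fakes; how the actual functions in this repository are wired together may differ.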

Note: images in the S3 bucket are not currently deleted explicitly once processing completes, though objects in the bucket have a 30-day expiration.

Development

Useful NPM scripts

  • npm run lint - check JS syntax & formatting
  • npm run test - run JS tests
  • npm run watch - start a file watcher that runs tests & lint
  • npm run prettier - clean up JS formatting
  • npm run deploy - deploy a stack configured for production
  • npm run deploy:dev - deploy a stack configured for development (e.g. with ENABLE_DEV_AUTH=1)
  • npm run info - display information about the currently deployed stack (e.g. handy for checking the stack's API URL)
  • npm run logs -- -f accept -t - watch logs for the function accept
  • npm run client -- [--id <id> --key <key> --url <url>] - make an authenticated request, defaults to an auto-detected service URL for your stack with credentials devuser / devkey
  • npm run client -- --url https://watchdog-proxy.dev.mozaws.net - make an authenticated request to the dev stack
  • npm run client -- --help - see further options accepted by the client
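
For readers curious what "an authenticated request" involves, here is a rough sketch of how a Hawk Authorization header can be computed with only Node's standard library. It assumes SHA-256 and omits the optional payload hash and ext field; the function name buildHawkHeader and its parameters are hypothetical, and the real client script may well use a Hawk library and different options.

```javascript
const crypto = require("crypto");
const { URL } = require("url");

// Hypothetical helper: builds a Hawk header without payload validation.
function buildHawkHeader({ id, key, method, url, ts, nonce }) {
  const { protocol, hostname, port, pathname, search } = new URL(url);
  // URL omits the port when it is the default one for the protocol.
  const resolvedPort = port || (protocol === "https:" ? "443" : "80");
  const timestamp = ts || Math.floor(Date.now() / 1000);
  const resolvedNonce = nonce || crypto.randomBytes(6).toString("hex");

  // Hawk signs a newline-delimited "normalized string" describing the request.
  const normalized =
    [
      "hawk.1.header",
      timestamp,
      resolvedNonce,
      method.toUpperCase(),
      pathname + search,
      hostname.toLowerCase(),
      resolvedPort,
      "", // payload hash (not used in this sketch)
      "", // ext data (not used in this sketch)
    ].join("\n") + "\n";

  const mac = crypto
    .createHmac("sha256", key)
    .update(normalized)
    .digest("base64");

  return `Hawk id="${id}", ts="${timestamp}", nonce="${resolvedNonce}", mac="${mac}"`;
}
```

Called with the development credentials mentioned above (id devuser, key devkey), the POST method and a stack's accept URL, the returned string would be sent as the request's Authorization header.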

Quickstart Notes

First, ensure Node.js 8.11.1 or newer is installed. Then, the steps to get started look something like this:

git clone git@github.com:mozilla/watchdog-proxy.git
cd watchdog-proxy
npm install
npm start

After cloning the repository and installing dependencies, npm start will launch several file watchers that build assets as needed, run unit tests, and check code quality as you edit files.

Now, create your own version of serverless.local.yml:

  1. Copy serverless.local.yml-dist to serverless.local.yml
  2. Edit serverless.local.yml
  3. Change at least the stage property to a name that's unique to you
  4. (optional) Change upstreamService.url to the URL of a debugging service like webhook.site

The next step is to get the service running on AWS. You'll need to sign up for an account or request a Dev IAM account from Mozilla Cloud Operations. (The latter is available only to Mozillians.)

Optional: Install AWS CLI. This gives you tools to work with AWS from the command line.

If you already have an AWS key ID and secret, you can follow the quick start docs for Serverless to configure your credentials.

If you don't already have an AWS key ID and secret, follow the guide to acquire and configure these credentials.

Try deploying the service to AWS:

npm run deploy:dev

You should see output like the following:

$ npm run deploy:dev
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Creating Stack...
Serverless: Checking Stack create progress...
.....
Serverless: Stack create finished...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (6.39 MB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
...........................................................................
Serverless: Stack update finished...
Service Information
service: watchdog-proxy
stage: lmorchard
region: us-east-1
stack: watchdog-proxy-lmorchard
api keys:
  None
endpoints:
  GET - https://30r00qsyhf.execute-api.us-east-1.amazonaws.com/lmorchard/accept
functions:
  accept: watchdog-proxy-lmorchard-accept
  pollQueue: watchdog-proxy-lmorchard-pollQueue
  processQueueItem: watchdog-proxy-lmorchard-processQueueItem

If everything was successful, you should now have a running stack with an HTTPS resource to accept requests listed as one of the endpoints. Copy the listed endpoint URL and keep it handy.

To send your first request, use the client script with the GET endpoint URL:

npm run client

With no options, this command should attempt to auto-detect the endpoint URL for your deployed stack. You can check to see the results of this request working its way through the stack with the following log commands:

# Client request is accepted into the queue
npm run logs -- -f accept
# Client request is received from the queue
npm run logs -- -f pollQueue
# Queued job is processed
npm run logs -- -f processQueueItem
# Upstream service receives a request
npm run logs -- -f mockUpstream
# Client callback service receives a negative result
npm run logs -- -f mockClientNegative
# Client callback service receives a positive result
npm run logs -- -f mockClientPositive

If you want to remove this stack from AWS and delete everything, run npm run remove.

The Serverless docs on workflow are useful.

Custom stable domain name for local development

By default, no custom domain name is created. You can use the semi-random domain name that Serverless assigns on deployment; serverless info reports the same name.

If you want to create a domain name for local development (e.g. watchdog-proxy-lmorchard.dev.mozaws.net):

  1. Edit your serverless.local.yml to contain an enabled customDomain section with appropriate details
  2. Run npx serverless create_domain - this only needs to be done once, to create the new custom domain name in Route53 and an accompanying CloudFront distribution
  3. Run npm run deploy:dev to update your stack

Read this Serverless Blog post for more details: https://serverless.com/blog/serverless-api-gateway-domain/

Deployment

Environment variables

When using serverless deploy to deploy the stack, you can use several environment variables to alter configuration:

  • STAGE - Stage for building and deploying - e.g. dev, stage, production
  • DOMAIN - Custom domain config selection for Route 53 and CloudFront distribution - e.g. local, dev, stage, production. If omitted, custom domain handling is disabled
  • NODE_ENV - Use production for a more optimized production build, development for a development build with more verbose logging and other conveniences
  • GIT_COMMIT - The value reported by the __version__ resource as commit. If not set, Serverless config will attempt to run the git command to discover the current commit.
  • UPSTREAM_SERVICE_URL - the URL of the production upstream web service (i.e. PhotoDNA)
  • UPSTREAM_SERVICE_KEY - the private subscription key for the upstream web service
  • ENABLE_DEV_AUTH=1 - This enables a hardcoded user id / key for development (off by default)
  • DISABLE_AUTH_CACHE=1 - Authentication credentials are cached in memory in the accept API function. This lasts until AWS recycles the container hosting the function. Setting this variable disables the cache.
  • METRICS_URL - Override for Ping Centre service URL used for internal metrics. By default, the stage or production Ping Centre URL is used based on NODE_ENV

You can see these variables used by scripts defined in package.json for development convenience.
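
As an illustration of how the variables listed above might be read, the following sketch maps them onto a plain object. It is not the contents of serverless.dynamicConfig.js: the function name resolveDeployConfig and the field names of the returned object are hypothetical, and only the variable names and the on/off semantics come from the list above.

```javascript
// Hypothetical helper: reads the documented environment variables.
function resolveDeployConfig(env = process.env) {
  return {
    stage: env.STAGE || null,
    // Custom domain handling is disabled when DOMAIN is omitted.
    customDomain: env.DOMAIN
      ? { enabled: true, selection: env.DOMAIN }
      : { enabled: false },
    production: env.NODE_ENV === "production",
    // null means "fall back to asking the git command for the commit".
    gitCommit: env.GIT_COMMIT || null,
    upstreamService: {
      url: env.UPSTREAM_SERVICE_URL || null,
      key: env.UPSTREAM_SERVICE_KEY || null,
    },
    // Development auth is off unless explicitly set to "1".
    enableDevAuth: env.ENABLE_DEV_AUTH === "1",
    // The in-memory credential cache is on unless explicitly disabled.
    authCacheEnabled: env.DISABLE_AUTH_CACHE !== "1",
    // null means "use the default Ping Centre URL for this NODE_ENV".
    metricsUrl: env.METRICS_URL || null,
  };
}
```

For example, resolveDeployConfig({ STAGE: "dev", ENABLE_DEV_AUTH: "1" }) would describe a development stack with the hardcoded credentials enabled and no custom domain.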