RAPPOR: Privacy-Preserving Reporting Algorithms
Ilya Mironov 60c64ea194 Changing the number of iterations from 20 to 10. 2015-09-11 21:16:26 +00:00
analysis Changing the number of iterations from 20 to 10. 2015-09-11 21:16:26 +00:00
apps Move apps/gce-setup.sh to setup.sh to get rid of duplication. 2015-05-06 14:45:57 -07:00
bin Dropping reference to alternative.R 2015-09-09 11:55:03 -07:00
client Make the Python simulation into a Python client library. 2015-07-13 16:14:23 -07:00
doc Merge branch 'split' of github.com:google/rappor into split 2014-11-13 14:33:09 -08:00
tests Elapsed time reporting by regtest.sh is fixed. 2015-08-04 16:53:51 -07:00
.gitignore Initial import. 2014-10-17 16:17:57 -07:00
LICENSE Initial import. 2014-10-17 16:17:57 -07:00
README.md Fixing the output of demo.sh 2015-04-17 00:58:57 -07:00
build.sh Merge branch 'master' into split 2014-11-12 12:14:18 -08:00
demo.sh Make the Python simulation into a Python client library. 2015-07-13 16:14:23 -07:00
regtest.sh Make the Python simulation into a Python client library. 2015-07-13 16:14:23 -07:00
run.sh Minor fix to line counting script 2015-03-31 22:50:28 -07:00
setup.sh Create an analysis tool with a stable interface: bin/decode-dist. 2015-09-02 14:44:02 -07:00
test.sh Make the Python simulation into a Python client library. 2015-07-13 16:14:23 -07:00
util.sh Add a regression test harness. 2015-03-18 17:39:10 -07:00

README.md

RAPPOR

RAPPOR is a novel privacy technology that allows inferring statistics about populations while preserving the privacy of individual users.

This repository contains simulation and analysis code in Python and R.

For a detailed description of the algorithms, see the paper and links below.

Feel free to send feedback to rappor-discuss@googlegroups.com.

Running the Demo

Although the Python and R libraries should be portable to any platform, our end-to-end demo has only been tested on Linux.

If you don't have a Linux box handy, you can view the generated output.

To get your feet wet, install the R dependencies (details below). The R session should look something like this:

$ R
...
> install.packages(c('glmnet', 'optparse', 'ggplot2'))

Then run:

$ ./demo.sh build  # optional speedup, it's OK for now if it fails

This compiles and tests the fastrand C extension module for Python, which speeds up the simulation.

$ ./demo.sh run

The demo strings together the Python and R code. It:

  1. Generates simulated input data with different distributions
  2. Runs it through the RAPPOR privacy-preserving reporting mechanisms
  3. Analyzes and plots the aggregated reports against the true input
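The three steps above can be illustrated with a much simplified, self-contained sketch. It is not the code in this repository and does not use the rappor.py interface: it replaces Bloom filter encoding with a one-hot vector, skips cohorts and the permanent randomized response, and debiases with a closed-form estimate instead of the R analysis. The function names, the candidate values, the weights, and the choice of p = 0.25 and q = 0.75 are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical candidate values and true distribution (illustration only).
CANDIDATES = ["v1", "v2", "v3", "v4"]
WEIGHTS = [0.5, 0.3, 0.15, 0.05]


def gen_sim_input(num_clients, rng):
    """Step 1: draw one true value per simulated client."""
    values = []
    for _ in range(num_clients):
        r = rng.random()
        cumulative = 0.0
        for value, weight in zip(CANDIDATES, WEIGHTS):
            cumulative += weight
            if r < cumulative:
                values.append(value)
                break
        else:
            # Guard against floating-point rounding in the cumulative sum.
            values.append(CANDIDATES[-1])
    return values


def randomize(value, p, q, rng):
    """Step 2: one-hot encode the value, then report each bit noisily.

    A bit whose true value is 1 is reported as 1 with probability q;
    a bit whose true value is 0 is reported as 1 with probability p.
    """
    report = []
    for candidate in CANDIDATES:
        prob_one = q if candidate == value else p
        report.append(1 if rng.random() < prob_one else 0)
    return report


def estimate_counts(reports, p, q):
    """Step 3: debias the per-bit sums into estimated true counts.

    E[sum_i] = p * n + (q - p) * t_i, so t_i = (sum_i - p * n) / (q - p).
    """
    n = len(reports)
    sums = [sum(bits) for bits in zip(*reports)]
    return [(s - p * n) / (q - p) for s in sums]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for reproducibility; simulation only
    p, q = 0.25, 0.75
    values = gen_sim_input(20000, rng)
    reports = [randomize(v, p, q, rng) for v in values]
    true_counts = Counter(values)
    for cand, est in zip(CANDIDATES, estimate_counts(reports, p, q)):
        print("%s true=%d estimated=%.0f" % (cand, true_counts[cand], est))
```

No single report reveals a client's value with certainty, yet the aggregate estimates track the true counts. The gap between the two shrinks as the number of clients grows and as q moves further from p.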

The output is written to _tmp/regtest/results.html, and can be opened with a browser.

Dependencies

R analysis (analysis/R):

Demo dependencies (demo.sh):

These are necessary if you want to test changes to the code.

Python client (client/python):

  • None. You should be able to just import the rappor.py file.

Platform:

  • R: tested on R 3.0.
  • Python: tested on Python 2.7.
  • OS: the shell script tests have been tested on Linux, but may work on Mac/Cygwin. The R and Python code should work on any OS.

Development

To run tests:

$ ./test.sh all

This currently runs Python unit tests, lints Python source files, and runs R unit tests.

API

rappor.py is a tiny standalone Python file, and you can easily copy it into a Python program.

NOTE: Its interface is subject to change. We are in the demo stage now, but if there's demand, we will document and publish the interface.

The R interface is also subject to change.

The fastrand C module is optional. It's likely only useful when simulating thousands of clients. It doesn't use cryptographically strong randomness, and thus should not be used in production.
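To make the distinction between simulation-grade and cryptographically strong randomness concrete, here is a minimal sketch using only the Python standard library. It does not show the fastrand or rappor.py interfaces. The helper name `random_bits` is hypothetical, and the point is only that the same bit-sampling code can run on either a seeded pseudo-random generator or one backed by the operating system's entropy source.

```python
import random


def random_bits(num_bits, prob_one, rng):
    """Return an int whose low num_bits bits are each 1 with probability prob_one.

    Hypothetical helper for illustration; rng is any object with a
    random() method that returns a float in [0, 1).
    """
    bits = 0
    for i in range(num_bits):
        if rng.random() < prob_one:
            bits |= 1 << i
    return bits


# Simulation: fast and reproducible, but NOT cryptographically strong.
sim_rng = random.Random(42)

# Production-style: random.SystemRandom draws from os.urandom.
secure_rng = random.SystemRandom()

if __name__ == "__main__":
    print(bin(random_bits(16, 0.5, sim_rng)))
    print(bin(random_bits(16, 0.5, secure_rng)))
```

Passing the generator in as a parameter keeps the sampling logic identical in both settings, so a simulation can be seeded for repeatable runs while real reports use the secure source.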

Directory Structure

client/             # client libraries
  python/
    rappor.py
    rappor_test.py  # Unit tests go next to the implementation.
  cpp/              # placeholder
analysis/
  R/                # R code for analysis
  tools/            # command line tools for analysis
apps/               # web apps to help you use RAPPOR
tests/              # system tests
  gen_sim_input.py  # generate test input data
  rappor_sim.py     # run simulation
  run.sh            # driver for unit tests, lint
doc/
build.sh            # build docs, C extension
demo.sh             # run demo
run.sh              # misc automation

Documentation

Publications

  • Google Blog Post about RAPPOR
  • RAPPOR implementation in Chrome
    • This is a production-quality C++ implementation, but it's somewhat tied to Chrome and doesn't support all privacy parameters (e.g. it allows only a few values of p and q). The code in this repo, by contrast, is not yet production quality, but it supports experimentation with different parameters and data sets. Of course, anyone is free to implement RAPPOR independently as well.
  • Mailing list: rappor-discuss@googlegroups.com