RAPPOR: Privacy-Preserving Reporting Algorithms

RAPPOR

RAPPOR is a novel privacy technology that allows inferring statistics about populations while preserving the privacy of individual users.

This repository contains simulation and analysis code in Python and R.

For a detailed description of the algorithms, see the paper and links below.

Feel free to send feedback to rappor-discuss@googlegroups.com.

Running the Demo

Although the Python and R libraries should be portable to any platform, our end-to-end demo has only been tested on Linux.

If you don't have a Linux box handy, you can view the generated output.

To get your feet wet, install the R dependencies (details below). The installation should look something like this:

$ R
...
> install.packages(c('glmnet', 'optparse', 'ggplot2'))

Then run:

$ ./demo.sh build  # optional speedup, it's OK for now if it fails

This compiles and tests the fastrand C extension module for Python, which speeds up the simulation.

$ ./demo.sh run

The demo strings together the Python and R code. It:

  1. Generates simulated input data with different distributions
  2. Runs it through the RAPPOR privacy-preserving reporting mechanisms
  3. Analyzes and plots the aggregated reports against the true input

The output is written to _tmp/report.html, and can be opened with a browser.
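The three demo steps can be illustrated with a toy pipeline. This is only a sketch, written for Python 3: it uses plain randomized response over a small known set of values, not the RAPPOR encoding implemented in rappor.py, and every name in it (gen_inputs, randomize, estimate_counts, run_toy_demo, p_truth) is hypothetical and not part of this repository.

```python
import random
from collections import Counter


def gen_inputs(num_clients, values, weights, rng):
    """Step 1: draw one true value per simulated client."""
    return rng.choices(values, weights=weights, k=num_clients)


def randomize(true_value, values, p_truth, rng):
    """Step 2: report the truth with probability p_truth, else a uniform draw."""
    if rng.random() < p_truth:
        return true_value
    return rng.choice(values)


def estimate_counts(reports, values, p_truth):
    """Step 3: invert the noise to estimate the true count of each value."""
    n = len(reports)
    k = len(values)
    observed = Counter(reports)
    # E[observed[v]] = p_truth * true[v] + (1 - p_truth) * n / k
    noise = (1.0 - p_truth) * n / k
    return {v: (observed[v] - noise) / p_truth for v in values}


def run_toy_demo(num_clients=20000, p_truth=0.5, seed=0):
    """Run all three steps and return (true counts, estimated counts)."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    values = ["a", "b", "c"]   # hypothetical candidate strings
    true_values = gen_inputs(num_clients, values, [0.6, 0.3, 0.1], rng)
    reports = [randomize(v, values, p_truth, rng) for v in true_values]
    return Counter(true_values), estimate_counts(reports, values, p_truth)
```

Calling run_toy_demo() returns the true counts alongside the estimates recovered from the noisy reports. No single report reveals a client's value with certainty, yet the aggregate estimates track the true distribution, which is the property the real demo plots.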

Dependencies

R analysis (analysis/R):

Demo dependencies (demo.sh):

These are necessary if you want to test changes to the code.

Python client (client/python):

  • None. You should be able to just import the rappor.py file.

Platform:

  • R: tested on R 3.0.
  • Python: tested on Python 2.7.
  • OS: the shell-script tests have only been run on Linux, but may also work on Mac/Cygwin. The R and Python code should work on any OS.

Development

To run tests:

$ tests/run.sh all

This currently runs Python unit tests and lints the Python files.

API

rappor.py is a tiny standalone Python file, and you can easily copy it into a Python program.

NOTE: Its interface is subject to change. We are in the demo stage now, but if there's demand, we will document and publish the interface.

The R interface is also subject to change.

The fastrand C module is optional. It is likely useful only when simulating thousands of clients. It doesn't use cryptographically strong randomness, and thus should not be used in production.
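To make the distinction between simulation-grade and production-grade randomness concrete, here is a minimal sketch using only the Python standard library. It does not show how rappor.py or fastrand choose their random source; make_rng and noisy_bits are hypothetical helpers introduced for illustration.

```python
import random


def make_rng(for_production):
    """Pick a random source (hypothetical helper, not part of rappor.py)."""
    if for_production:
        # Backed by the operating system's entropy source (os.urandom).
        return random.SystemRandom()
    # Seeded Mersenne Twister: fast and repeatable, but predictable.
    return random.Random(0)


def noisy_bits(num_bits, prob_one, rng):
    """Return num_bits bits, each set to 1 with probability prob_one."""
    return [1 if rng.random() < prob_one else 0 for _ in range(num_bits)]
```

A seeded generator gives identical output on every run, which is convenient for simulations and tests. random.SystemRandom cannot be seeded or replayed, which is the behavior one would want when the noise protects real users.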

Directory Structure

client/             # client libraries
  python/
    rappor.py
    rappor_test.py  # Unit tests go next to the implementation.
  cpp/              # placeholder
analysis/
  R/                # R code for analysis
  tools/            # command line tools for analysis
tests/              # system tests
  gen_sim_input.py  # generate test input data
  rappor_sim.py     # run simulation
  run.sh            # driver for unit tests, lint
doc/
build.sh            # build docs, C extension
demo.sh             # run demo
run.sh              # misc automation

Documentation