# Jackknife Variational Inference, Python implementation
This repository contains code related to the following
[ICLR 2018](https://iclr.cc/Conferences/2018) paper:
* _Sebastian Nowozin_, "Debiasing Evidence Approximations: On
Importance-weighted Autoencoders and Jackknife Variational Inference",
[Forum](https://openreview.net/forum?id=HyZoi-WRb),
[PDF](https://openreview.net/pdf?id=HyZoi-WRb).
## Citation
If you use this code or build upon it, please cite the following paper (BibTeX
format):
```
@InProceedings{nowozin2018jvi,
title = "Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference",
author = "Sebastian Nowozin",
booktitle = "International Conference on Learning Representations (ICLR 2018)",
year = "2018"
}
```
## Installation
Install the required Python 2 prerequisites by running:
```
pip install -r requirements.txt
```
Currently this installs:
* [Chainer](http://chainer.org/), the deep learning framework, version 3.1.0
* [CuPy](http://cupy.chainer.org/), a CUDA linear algebra framework compatible
with NumPy, version 2.1.0
* [NumPy](http://www.numpy.org/), numerical linear algebra for Python, version 1.11.0
* [SciPy](http://www.scipy.org/), scientific computing framework for Python, version 1.0.0
* [H5py](http://www.h5py.org/), an HDF5 interface for Python, version 2.6.0
* [docopt](http://docopt.org/), a Pythonic command-line argument parser, version 0.6.2
* [PyYAML](https://github.com/yaml/pyyaml), a Python library for the
[YAML](http://yaml.org) data language, version 3.12
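For reference, a `requirements.txt` pinning these versions might look roughly as follows; the exact PyPI package names and pinning style here are an assumption, and the file shipped in the repository is authoritative:
```
chainer==3.1.0
cupy==2.1.0
numpy==1.11.0
scipy==1.0.0
h5py==2.6.0
docopt==0.6.2
PyYAML==3.12
```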
## Running the MNIST experiment
To train the MNIST model from the paper, use the following parameters:
```
python ./train.py -g 0 -d mnist -e 1000 -b 2048 --opt adam \
--vae-type jvi --vae-samples 8 --jvi-order 1 --nhidden 300 --nlatent 40 \
-o modeloutput
```
Here the parameters are:
* `-g 0`: train on GPU device 0
* `-d mnist`: use the dynamically binarized MNIST data set
* `-e 1000`: train for 1000 epochs
* `-b 2048`: use a batch size of 2048 samples
* `--opt adam`: use the Adam optimizer
* `--vae-type jvi`: use _jackknife_ variational inference
* `--vae-samples 8`: use eight Monte Carlo samples
* `--jvi-order 1`: use first-order JVI bias correction
* `--nhidden 300`: in each hidden layer use 300 hidden neurons
* `--nlatent 40`: use 40 dimensions for the VAE latent variable
The training process creates a file `modeloutput.meta.yaml` containing the
training parameters, as well as a directory `modeloutput/` which contains a log
file and the serialized model that performed best on the validation set.
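To inspect those training parameters later, a minimal sketch using the PyYAML dependency can be used; this is not part of the repository and assumes `modeloutput.meta.yaml` is plain YAML:
```python
# Minimal sketch (not part of the repository) for inspecting the training
# metadata written by train.py; assumes modeloutput.meta.yaml is plain YAML.
import yaml

with open("modeloutput.meta.yaml") as f:
    meta = yaml.safe_load(f)

# Pretty-print whatever parameters were recorded for this run.
print(yaml.safe_dump(meta, default_flow_style=False))
```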
To evaluate the trained model on the test set, use
```
python ./evaluate.py -g 0 -d mnist -E iwae -s 256 modeloutput
```
This evaluates the model trained previously using the following test-time
evaluation setup:
* `-g 0`: use GPU device 0 for evaluation
* `-d mnist`: evaluate on the mnist data set
* `-E iwae`: use the IWAE objective for evaluation
* `-s 256`: use 256 Monte Carlo samples in the IWAE objective
Because test-time evaluation does not require backpropagation, we can evaluate
the IWAE and JVI objectives accurately using a large number of samples, e.g.
`-s 65536`.
The `evaluate.py` script also supports a `--reps 10` parameter which would
evaluate the same model ten times to investigate variance in the Monte Carlo
approximation to the evaluation objective.
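Putting these options together, a higher-accuracy evaluation run might look like the following; this simply combines the flags described above, and whether the combination is practical depends on your GPU memory and time budget:
```
python ./evaluate.py -g 0 -d mnist -E iwae -s 65536 --reps 10 modeloutput
```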
## Choosing different objectives
As illustrated in the paper, the JVI objective generalizes both the ELBO and
the IWAE objectives.
For example, you can train on the importance-weighted autoencoder (IWAE)
objective using the parameter `--jvi-order 0` instead of `--jvi-order 1`.
You can train using the regular evidence lower bound (ELBO) by using the
special case of JVI, `--jvi-order 0 --vae-samples 1`, or directly via
`--vae-type vae`.
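To make the relationship concrete, here is a small NumPy sketch, not taken from this repository, of how the three estimates relate for a single data point. It assumes the standard first-order jackknife debiasing of the IWAE estimate, `n * IWAE_n - (n - 1) * mean of the leave-one-out IWAE_{n-1} estimates`:
```python
# Illustrative NumPy sketch (not repository code) of the ELBO, IWAE and
# first-order JVI estimates for one data point, given K log importance weights.
# Assumes the standard first-order jackknife debiasing of the IWAE estimate.
import numpy as np
from scipy.special import logsumexp

def elbo(logw):
    # ELBO: average of the log importance weights.
    return np.mean(logw)

def iwae(logw):
    # IWAE: log of the average importance weight, log((1/K) * sum_i w_i).
    return logsumexp(logw) - np.log(len(logw))

def jvi_order1(logw):
    # First-order jackknife bias correction applied to the IWAE estimate:
    # n * IWAE_n - (n - 1) * mean of the leave-one-out IWAE_{n-1} estimates.
    n = len(logw)
    loo = [iwae(np.delete(logw, j)) for j in range(n)]
    return n * iwae(logw) - (n - 1) * np.mean(loo)

logw = np.random.randn(8) - 3.0  # stand-in log importance weights
print(elbo(logw), iwae(logw), jvi_order1(logw))
```
With a single sample the ELBO and IWAE estimates coincide, which is why `--jvi-order 0 --vae-samples 1` recovers the ELBO.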
## Counting JVI sets
We include a small utility to count the number of subsets used by the
different JVI approximations. There are two parameters, `n` and `order`,
where `n` is the number of samples of latent space variables per instance, and
`order` is the order of the JVI approximation (order zero corresponds to the
IWAE).
To run the utility, use:
```
python ./jvicount.py 16 2
```
This utility is useful because the number of subsets grows very rapidly with the
JVI order. It lets you quickly assess the total number of terms and make
informed choices about the batch size and the order of the approximation.
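As a rough illustration of how such a count can be formed, the following sketch, which is not the repository's `jvicount.py`, assumes the order-`r` JVI estimator evaluates terms on all subsets obtained by leaving out up to `r` of the `n` samples, i.e. the sum of `C(n, j)` for `j = 0..r`:
```python
# Rough sketch of counting the subsets used by an order-r JVI approximation.
# Assumption: order r uses every subset obtained by removing at most r of the
# n samples, i.e. sum_{j=0}^{r} C(n, j) subsets (order 0 uses only the full set).
from scipy.special import comb

def jvi_subset_count(n, order):
    return sum(comb(n, j, exact=True) for j in range(order + 1))

print(jvi_subset_count(16, 2))  # the example above: n=16 samples, order-2 JVI
```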
## Contact
_Sebastian Nowozin_, `Sebastian.Nowozin@microsoft.com`
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.