[docs] moved example md’s to examples/**/md’s and added script to gather them for publication
This commit is contained in:
Sergey Karayev 2014-07-11 13:28:55 -07:00
Parent 583b84ef65
Commit 0de282ce21
17 changed files with 394 additions and 318 deletions

1
.gitignore (vendored)

@ -53,6 +53,7 @@ examples/*
# Generated documentation
docs/_site
docs/gathered
_site
# Sublime Text settings


@ -1,3 +1,5 @@
To generate stuff you can paste in an .md page from an IPython notebook, run
# Caffe Documentation
ipython nbconvert --to markdown <notebook_file>
To generate the documentation, run `$CAFFE_ROOT/scripts/build_docs.sh`.
To push your changes to the documentation to the gh-pages branch of your or the BVLC repo, run `$CAFFE_ROOT/scripts/deploy_docs.sh <repo_name>`.


@ -9,14 +9,14 @@ Caffe is released under the [BSD 2-Clause license](https://github.com/BVLC/caffe
Check out our web image classification [demo](http://demo.caffe.berkeleyvision.org)!
## Why
## Why use Caffe?
**Clean architecture** enables rapid deployment.
Networks are specified in simple config files, with no hard-coded parameters in the code.
Switching between CPU and GPU code is as simple as setting a flag -- so models can be trained on a GPU machine, and then used on commodity clusters.
Switching between CPU and GPU is as simple as setting a flag -- so models can be trained on a GPU machine, and then used on commodity clusters.
**Readable & modifiable implementation** fosters active development.
In Caffe's first six months, it has been forked by over 300 developers on Github, and many have contributed significant changes.
In Caffe's first six months, it has been forked by over 300 developers on Github, and many have pushed significant changes.
**Speed** makes Caffe perfect for industry use.
Caffe can process over **40M images per day** with a single NVIDIA K40 or Titan GPU\*.
@ -31,27 +31,34 @@ There is an active discussion and support community on [Github](https://github.c
Consult performance [details](/performance_hardware.html).
</p>
## How
## Documentation
* [Introductory slides](http://dl.caffe.berkeleyvision.org/caffe-presentation.pdf): slides about the Caffe architecture, *updated 03/14*.
* [ACM MM paper](http://ucb-icsi-vision-group.github.io/caffe-paper/caffe.pdf): a 4-page report for the ACM Multimedia Open Source competition.
* [Installation instructions](/installation.html): tested on Ubuntu, Red Hat, OS X.
* [Pre-trained models](/getting_pretrained_models.html): BVLC provides ready-to-use models for non-commercial use.
* [Development](/development.html): Guidelines for development and contributing to Caffe.
- [Introductory slides](http://dl.caffe.berkeleyvision.org/caffe-presentation.pdf)<br />
Slides about the Caffe architecture, *updated 03/14*.
- [ACM MM paper](http://ucb-icsi-vision-group.github.io/caffe-paper/caffe.pdf)<br />
A 4-page report for the ACM Multimedia Open Source competition.
- [Installation instructions](/installation.html)<br />
Tested on Ubuntu, Red Hat, OS X.
* [Pre-trained models](/getting_pretrained_models.html)<br />
BVLC provides ready-to-use models for non-commercial use.
* [Development](/development.html)<br />
Guidelines for development and contributing to Caffe.
### Tutorials and Examples
### Examples
* [Image Classification \[notebook\]][imagenet_classification]: classify images with the pretrained ImageNet model by the Python interface.
* [Detection \[notebook\]][detection]: run a pretrained model as a detector in Python.
* [Visualizing Features and Filters \[notebook\]][visualizing_filters]: extracting features and visualizing trained filters with an example image, viewed layer-by-layer.
* [LeNet / MNIST Demo](/mnist.html): end-to-end training and testing of LeNet on MNIST.
* [CIFAR-10 Demo](/cifar10.html): training and testing on the CIFAR-10 data.
* [Training ImageNet](/imagenet_training.html): recipe for end-to-end training of an ImageNet classifier.
* [Feature extraction with C++](/feature_extraction.html): feature extraction using pre-trained model.
{% for page in site.pages %}
{% if page.category == 'example' %}
- <div><a href="{{page.url}}">{{page.title}}</a><br />{{page.description}}</div>
{% endif %}
{% endfor %}
[imagenet_classification]: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/imagenet_classification.ipynb
[detection]: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/detection.ipynb
[visualizing_filters]: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/filter_visualization.ipynb
### Notebook examples
{% for page in site.pages %}
{% if page.category == 'notebook' %}
- <div><a href="http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/{{page.original_path}}">{{page.title}}</a><br />{{page.description}}</div>
{% endif %}
{% endfor %}
## Citing Caffe


@ -1,91 +0,0 @@
---
layout: default
title: Caffe
---
Training MNIST with Caffe
================
We will assume that you have caffe successfully compiled. If not, please refer to the [Installation page](installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.
Prepare Datasets
----------------
You will first need to download and convert the data format from the MNIST website. To do this, simply run the following commands:
cd $CAFFE_ROOT/data/mnist
./get_mnist.sh
cd $CAFFE_ROOT/examples/mnist
./create_mnist.sh
If the script complains that `wget` or `gunzip` is not installed, install the missing tool and run it again. After running the scripts there should be two datasets, `mnist-train-leveldb` and `mnist-test-leveldb`.
LeNet: the MNIST Classification Model
-------------------------------------
Before we actually run the training program, let's explain what will happen. We will use the [LeNet](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf) network, which is known to work well on digit classification tasks. We will use a slightly different version from the original LeNet implementation, replacing the sigmoid activations with Rectified Linear Unit (ReLU) activations for the neurons.
The design of LeNet contains the essence of CNNs that are still used in larger models such as the ones in ImageNet. In general, it consists of a convolutional layer followed by a pooling layer, another convolution layer followed by a pooling layer, and then two fully connected layers similar to the conventional multilayer perceptrons. We have defined the layers in `CAFFE_ROOT/data/lenet.prototxt`.
If you would like to read step-by-step instructions on how the protobuf definitions are written, see [MNIST: Define the Network](mnist_prototxt.html) and [MNIST: Define the Solver](mnist_solver_prototxt.html).
Training and Testing the Model
------------------------------
Training the model is simple once you have written the network definition and solver protobuf files. Simply run `train_lenet.sh`:
cd $CAFFE_ROOT/examples/mnist
./train_lenet.sh
`train_lenet.sh` is a simple script, but here are a few explanations: `GLOG_logtostderr=1` is the google logging flag that prints all the logging messages directly to stderr. The main tool for training is `train_net.bin`, with the solver protobuf text file as its argument.
When you run the code, you will see a lot of messages flying by like this:
I1203 net.cpp:66] Creating Layer conv1
I1203 net.cpp:76] conv1 <- data
I1203 net.cpp:101] conv1 -> conv1
I1203 net.cpp:116] Top shape: 20 24 24
I1203 net.cpp:127] conv1 needs backward computation.
These messages tell you the details about each layer, its connections and its output shape, which may be helpful in debugging. After the initialization, the training will start:
I1203 net.cpp:142] Network initialization done.
I1203 solver.cpp:36] Solver scaffolding done.
I1203 solver.cpp:44] Solving LeNet
Based on the solver settings, the training loss is printed every 100 iterations and the network is tested every 500 iterations. You will see messages like this:
I1203 solver.cpp:204] Iteration 100, lr = 0.00992565
I1203 solver.cpp:66] Iteration 100, loss = 0.26044
...
I1203 solver.cpp:84] Testing net
I1203 solver.cpp:111] Test score #0: 0.9785
I1203 solver.cpp:111] Test score #1: 0.0606671
For each training iteration, `lr` is the learning rate of that iteration, and `loss` is the training loss. In the output of the testing phase, score 0 is the accuracy and score 1 is the testing loss.
And after a few minutes, you are done!
I1203 solver.cpp:84] Testing net
I1203 solver.cpp:111] Test score #0: 0.9897
I1203 solver.cpp:111] Test score #1: 0.0324599
I1203 solver.cpp:126] Snapshotting to lenet_iter_10000
I1203 solver.cpp:133] Snapshotting solver state to lenet_iter_10000.solverstate
I1203 solver.cpp:78] Optimization Done.
The final model, a binary protobuf file, is saved at
lenet_iter_10000
which you can deploy as a trained model in your application, if you are training on a real-world application dataset.
Um... How about GPU training?
-----------------------------
You just did! All the training was carried out on the GPU. In fact, if you would like to do training on CPU, you can simply change one line in `lenet_solver.prototxt`:
# solver mode: CPU or GPU
solver_mode: CPU
and you will be using CPU for training. Isn't that easy?
MNIST is a small dataset, so training with GPU does not really introduce too much benefit due to communication overheads. On larger datasets with more complex models, such as ImageNet, the computation speed difference will be more significant.


@ -1,153 +0,0 @@
---
layout: default
title: Caffe
---
Define the MNIST Network
=========================
This page explains the prototxt file `lenet_train.prototxt` used in the MNIST demo. We assume that you are familiar with [Google Protobuf](https://developers.google.com/protocol-buffers/docs/overview), and assume that you have read the protobuf definitions used by Caffe, which can be found at [src/caffe/proto/caffe.proto](https://github.com/Yangqing/caffe/blob/master/src/caffe/proto/caffe.proto).
Specifically, we will write a `caffe::NetParameter` (or in Python, `caffe.proto.caffe_pb2.NetParameter`) protobuf. We will start by giving the network a name:
name: "LeNet"
Writing the Data Layer
----------------------
Currently, we will read the MNIST data from the leveldb we created earlier in the demo. This is defined by a data layer:
layers {
name: "mnist"
type: DATA
data_param {
source: "mnist-train-leveldb"
batch_size: 64
scale: 0.00390625
}
top: "data"
top: "label"
}
Specifically, this layer has name `mnist`, type `data`, and it reads the data from the given leveldb source. We will use a batch size of 64, and scale the incoming pixels so that they are in the range \[0,1\). Why 0.00390625? It is 1 divided by 256. And finally, this layer produces two blobs, one is the `data` blob, and one is the `label` blob.
Writing the Convolution Layer
--------------------------------------------
Let's define the first convolution layer:
layers {
name: "conv1"
type: CONVOLUTION
blobs_lr: 1.
blobs_lr: 2.
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "data"
top: "conv1"
}
This layer takes the `data` blob (it is provided by the data layer), and produces the `conv1` layer. It produces outputs of 20 channels, with the convolutional kernel size 5 and carried out with stride 1.
The fillers allow us to randomly initialize the value of the weights and bias. For the weight filler, we will use the `xavier` algorithm that automatically determines the scale of initialization based on the number of input and output neurons. For the bias filler, we will simply initialize it as constant, with the default filling value 0.
`blobs_lr` are the learning rate adjustments for the layer's learnable parameters. In this case, we will set the weight learning rate to be the same as the learning rate given by the solver during runtime, and the bias learning rate to be twice as large as that - this usually leads to better convergence rates.
Writing the Pooling Layer
-------------------------
Phew. Pooling layers are actually much easier to define:
layers {
name: "pool1"
type: POOLING
pooling_param {
kernel_size: 2
stride: 2
pool: MAX
}
bottom: "conv1"
top: "pool1"
}
This says we will perform max pooling with a pool kernel size 2 and a stride of 2 (so no overlapping between neighboring pooling regions).
Similarly, you can write up the second convolution and pooling layers. Check `data/lenet.prototxt` for details.
Writing the Fully Connected Layer
----------------------------------
Writing a fully connected layer is also simple:
layers {
name: "ip1"
type: INNER_PRODUCT
blobs_lr: 1.
blobs_lr: 2.
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "pool2"
top: "ip1"
}
This defines a fully connected layer (for some legacy reason, Caffe calls it an `innerproduct` layer) with 500 outputs. All other lines look familiar, right?
Writing the ReLU Layer
----------------------
A ReLU Layer is also simple:
layers {
name: "relu1"
type: RELU
bottom: "ip1"
top: "ip1"
}
Since ReLU is an element-wise operation, we can do *in-place* operations to save some memory. This is achieved by simply giving the same name to the bottom and top blobs. Of course, do NOT use duplicated blob names for other layer types!
After the ReLU layer, we will write another innerproduct layer:
layers {
name: "ip2"
type: INNER_PRODUCT
blobs_lr: 1.
blobs_lr: 2.
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "ip1"
top: "ip2"
}
Writing the Loss Layer
-------------------------
Finally, we will write the loss!
layers {
name: "loss"
type: SOFTMAX_LOSS
bottom: "ip2"
bottom: "label"
}
The `softmax_loss` layer implements both the softmax and the multinomial logistic loss (combining them saves time and improves numerical stability). It takes two blobs, the first one being the prediction and the second one being the `label` provided by the data layer (remember it?). It does not produce any outputs: all it does is compute the loss function value, report it when backpropagation starts, and initiate the gradient with respect to `ip2`. This is where all the magic starts.
Now that we have demonstrated how to write the MNIST layer definition prototxt, maybe check out [how we write a solver prototxt](mnist_solver_prototxt.html)?


@ -1,37 +0,0 @@
---
layout: default
title: Caffe
---
Define the MNIST Solver
=======================
The page is under construction. For now, check out the comments in the solver prototxt file, which explains each line in the prototxt:
# The training protocol buffer definition
train_net: "lenet_train.prototxt"
# The testing protocol buffer definition
test_net: "lenet_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "lenet"
# solver mode: 0 for CPU and 1 for GPU
solver_mode: 1


@ -1,6 +1,9 @@
---
title: CIFAR-10 tutorial
category: example
description: Train and test Caffe on CIFAR-10 data.
include_in_docs: true
layout: default
title: Caffe
---
Alex's CIFAR-10 tutorial, Caffe style


@ -1,6 +1,8 @@
{
"metadata": {
"name": ""
"name": "ImageNet detection",
"description": "Run a pretrained model as a detector in Python.",
"include_in_docs": true
},
"nbformat": 3,
"nbformat_minor": 0,
@ -652,4 +654,4 @@
"metadata": {}
}
]
}
}


@ -1,6 +1,9 @@
---
title: Feature extraction with Caffe C++ code.
description: Extract AlexNet features using the Caffe binary.
category: example
include_in_docs: true
layout: default
title: Caffe
---
Extracting Features
@ -57,7 +60,7 @@ The last parameter above is the number of data mini-batches.
The features are stored to LevelDB `examples/_temp/features`, ready for access by some other code.
If you meet with the error "Check failed: status.ok() Failed to open leveldb examples/_temp/features", it is because the directory examples/_temp/features has been created the last time you run the command. Remove it and run again.
If you meet with the error "Check failed: status.ok() Failed to open leveldb examples/_temp/features", it is because the directory examples/_temp/features has been created the last time you run the command. Remove it and run again.
rm -rf examples/_temp/features/


@ -1,6 +1,8 @@
{
"metadata": {
"name": ""
"name": "Filter visualization",
"description": "Extracting features and visualizing trained filters with an example image, viewed layer-by-layer.",
"include_in_docs": true
},
"nbformat": 3,
"nbformat_minor": 0,
@ -611,4 +613,4 @@
"metadata": {}
}
]
}
}


@ -1,6 +1,9 @@
---
title: ImageNet tutorial
description: Train and test "AlexNet" on ImageNet challenge data.
category: example
include_in_docs: true
layout: default
title: Caffe
---
Yangqing's Recipe on Brewing ImageNet


@ -1,6 +1,8 @@
{
"metadata": {
"name": ""
"description": "Use the pre-trained ImageNet model to classify images with the Python interface.",
"name": "ImageNet Classification",
"include_in_docs": true
},
"nbformat": 3,
"nbformat_minor": 0,
@ -407,4 +409,4 @@
"metadata": {}
}
]
}
}

266
examples/mnist/readme.md Normal file

@ -0,0 +1,266 @@
---
title: MNIST Tutorial
description: Train and test "LeNet" on MNIST data.
category: example
include_in_docs: true
layout: default
---
# Training MNIST with Caffe
We will assume that you have caffe successfully compiled. If not, please refer to the [Installation page](installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.
## Prepare Datasets
You will first need to download and convert the data format from the MNIST website. To do this, simply run the following commands:
cd $CAFFE_ROOT/data/mnist
./get_mnist.sh
cd $CAFFE_ROOT/examples/mnist
./create_mnist.sh
If the script complains that `wget` or `gunzip` is not installed, install the missing tool and run it again. After running the scripts there should be two datasets, `mnist-train-leveldb` and `mnist-test-leveldb`.
## LeNet: the MNIST Classification Model
Before we actually run the training program, let's explain what will happen. We will use the [LeNet](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf) network, which is known to work well on digit classification tasks. We will use a slightly different version from the original LeNet implementation, replacing the sigmoid activations with Rectified Linear Unit (ReLU) activations for the neurons.
The design of LeNet contains the essence of CNNs that are still used in larger models such as the ones in ImageNet. In general, it consists of a convolutional layer followed by a pooling layer, another convolution layer followed by a pooling layer, and then two fully connected layers similar to the conventional multilayer perceptrons. We have defined the layers in `CAFFE_ROOT/data/lenet.prototxt`.
## Define the MNIST Network
This section explains the prototxt file `lenet_train.prototxt` used in the MNIST demo. We assume that you are familiar with [Google Protobuf](https://developers.google.com/protocol-buffers/docs/overview), and assume that you have read the protobuf definitions used by Caffe, which can be found at [src/caffe/proto/caffe.proto](https://github.com/Yangqing/caffe/blob/master/src/caffe/proto/caffe.proto).
Specifically, we will write a `caffe::NetParameter` (or in Python, `caffe.proto.caffe_pb2.NetParameter`) protobuf. We will start by giving the network a name:
name: "LeNet"
### Writing the Data Layer
Currently, we will read the MNIST data from the leveldb we created earlier in the demo. This is defined by a data layer:
layers {
name: "mnist"
type: DATA
data_param {
source: "mnist-train-leveldb"
batch_size: 64
scale: 0.00390625
}
top: "data"
top: "label"
}
Specifically, this layer has name `mnist`, type `data`, and it reads the data from the given leveldb source. We will use a batch size of 64, and scale the incoming pixels so that they are in the range \[0,1\). Why 0.00390625? It is 1 divided by 256. And finally, this layer produces two blobs, one is the `data` blob, and one is the `label` blob.
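The effect of the `scale` value can be checked with a few lines of Python. This is only an illustrative sketch: the function name `scale_pixels` and the sample pixel values are hypothetical and not part of Caffe.

```python
# Scale factor used by the data layer above: 1 / 256.
SCALE = 0.00390625

def scale_pixels(pixels):
    """Multiply each raw 8-bit pixel value (0..255) by SCALE."""
    return [p * SCALE for p in pixels]

raw = [0, 128, 255]           # hypothetical raw pixel values
scaled = scale_pixels(raw)
print(scaled)                 # [0.0, 0.5, 0.99609375]
```

Because the largest 8-bit value is 255, the largest scaled value is 255/256, which is why the range is half-open.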
### Writing the Convolution Layer
Let's define the first convolution layer:
layers {
name: "conv1"
type: CONVOLUTION
blobs_lr: 1.
blobs_lr: 2.
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "data"
top: "conv1"
}
This layer takes the `data` blob (it is provided by the data layer), and produces the `conv1` layer. It produces outputs of 20 channels, with the convolutional kernel size 5 and carried out with stride 1.
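As a rough check of the shape this layer produces, the spatial output size of a convolution can be computed by hand. The sketch below assumes 28x28 MNIST input images and no padding; the helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * pad - kernel_size) // stride + 1

# 28x28 input (assumed MNIST image size), kernel size 5, stride 1, no padding.
side = conv_output_size(28, kernel_size=5, stride=1)
top_shape = (20, side, side)  # 20 output channels from num_output
print(top_shape)              # (20, 24, 24)
```

This agrees with the `Top shape: 20 24 24` line that appears in the training log for `conv1`.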
The fillers allow us to randomly initialize the value of the weights and bias. For the weight filler, we will use the `xavier` algorithm that automatically determines the scale of initialization based on the number of input and output neurons. For the bias filler, we will simply initialize it as constant, with the default filling value 0.
`blobs_lr` are the learning rate adjustments for the layer's learnable parameters. In this case, we will set the weight learning rate to be the same as the learning rate given by the solver during runtime, and the bias learning rate to be twice as large as that - this usually leads to better convergence rates.
### Writing the Pooling Layer
Phew. Pooling layers are actually much easier to define:
layers {
name: "pool1"
type: POOLING
pooling_param {
kernel_size: 2
stride: 2
pool: MAX
}
bottom: "conv1"
top: "pool1"
}
This says we will perform max pooling with a pool kernel size 2 and a stride of 2 (so no overlapping between neighboring pooling regions).
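To make the non-overlapping behaviour concrete, here is a minimal pure-Python sketch of max pooling with kernel size 2 and stride 2 on a small grid. It is an illustration only, not Caffe's implementation, and the function name `max_pool` and the sample grid are hypothetical.

```python
def max_pool(grid, kernel=2, stride=2):
    """Max pooling over a 2D list of numbers (no padding)."""
    rows = range(0, len(grid) - kernel + 1, stride)
    cols = range(0, len(grid[0]) - kernel + 1, stride)
    return [
        [
            max(grid[r + i][c + j] for i in range(kernel) for j in range(kernel))
            for c in cols
        ]
        for r in rows
    ]

grid = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
pooled = max_pool(grid)
print(pooled)  # [[4, 8], [12, 16]]
```

Each input value falls into exactly one 2x2 window, so the output is half the input size in each dimension.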
Similarly, you can write up the second convolution and pooling layers. Check `data/lenet.prototxt` for details.
### Writing the Fully Connected Layer
Writing a fully connected layer is also simple:
layers {
name: "ip1"
type: INNER_PRODUCT
blobs_lr: 1.
blobs_lr: 2.
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "pool2"
top: "ip1"
}
This defines a fully connected layer (for some legacy reason, Caffe calls it an `innerproduct` layer) with 500 outputs. All other lines look familiar, right?
### Writing the ReLU Layer
A ReLU Layer is also simple:
layers {
name: "relu1"
type: RELU
bottom: "ip1"
top: "ip1"
}
Since ReLU is an element-wise operation, we can do *in-place* operations to save some memory. This is achieved by simply giving the same name to the bottom and top blobs. Of course, do NOT use duplicated blob names for other layer types!
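The idea of an in-place element-wise operation can be sketched in plain Python: the same buffer serves as input and output, so no second array is allocated. The function name `relu_in_place` and the sample values are hypothetical.

```python
def relu_in_place(blob):
    """Overwrite negative entries with zero, reusing the same list."""
    for i, value in enumerate(blob):
        if value < 0:
            blob[i] = 0.0
    return blob

ip1 = [-1.5, 0.0, 2.0, -0.25]   # hypothetical activations
result = relu_in_place(ip1)
print(result is ip1)            # True: bottom and top share storage
print(ip1)                      # [0.0, 0.0, 2.0, 0.0]
```

This works because each output element depends only on the input element at the same position.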
After the ReLU layer, we will write another innerproduct layer:
layers {
name: "ip2"
type: INNER_PRODUCT
blobs_lr: 1.
blobs_lr: 2.
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "ip1"
top: "ip2"
}
### Writing the Loss Layer
Finally, we will write the loss!
layers {
name: "loss"
type: SOFTMAX_LOSS
bottom: "ip2"
bottom: "label"
}
The `softmax_loss` layer implements both the softmax and the multinomial logistic loss (combining them saves time and improves numerical stability). It takes two blobs, the first one being the prediction and the second one being the `label` provided by the data layer (remember it?). It does not produce any outputs: all it does is compute the loss function value, report it when backpropagation starts, and initiate the gradient with respect to `ip2`. This is where all the magic starts.
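The numerical-stability point can be illustrated with a small sketch that computes the softmax and the log loss in one step for a single example. This is a common formulation (subtracting the maximum score before exponentiating), not a transcription of Caffe's code; the function name `softmax_loss` and the inputs are hypothetical.

```python
import math

def softmax_loss(scores, label):
    """Negative log-probability of `label` under softmax(scores)."""
    m = max(scores)
    # log(sum(exp(s))) computed as m + log(sum(exp(s - m))) to avoid overflow.
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[label]

# Ten equal scores: every class has probability 1/10.
uniform_loss = softmax_loss([0.0] * 10, label=3)
# Very large scores would overflow a naive exp(), but are fine here.
large_loss = softmax_loss([1000.0, 0.0], label=0)
print(uniform_loss, large_loss)
```

With ten classes and no information, the loss equals ln(10), which is roughly where training on MNIST would be expected to start.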
## Define the MNIST Solver
Check out the comments explaining each line in the prototxt:
# The training protocol buffer definition
train_net: "lenet_train.prototxt"
# The testing protocol buffer definition
test_net: "lenet_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "lenet"
# solver mode: 0 for CPU and 1 for GPU
solver_mode: 1
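Two of the solver settings above can be checked with simple arithmetic. The sketch below assumes the `inv` policy computes `base_lr * (1 + gamma * iter) ^ (-power)`; the function name `inv_learning_rate` is hypothetical.

```python
def inv_learning_rate(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the assumed "inv" policy formula."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

lr_100 = inv_learning_rate(100)
print("%.8f" % lr_100)

# test_iter forward passes, each over a test batch of 100 images.
test_iter = 100
test_batch_size = 100
images_tested = test_iter * test_batch_size
print(images_tested)  # 10000
```

The learning rate at iteration 100 comes out close to the `lr = 0.00992565` value shown in the training log.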
## Training and Testing the Model
Training the model is simple once you have written the network definition and solver protobuf files. Simply run `train_lenet.sh`:
cd $CAFFE_ROOT/examples/mnist
./train_lenet.sh
`train_lenet.sh` is a simple script, but here are a few explanations: `GLOG_logtostderr=1` is the google logging flag that prints all the logging messages directly to stderr. The main tool for training is `train_net.bin`, with the solver protobuf text file as its argument.
When you run the code, you will see a lot of messages flying by like this:
I1203 net.cpp:66] Creating Layer conv1
I1203 net.cpp:76] conv1 <- data
I1203 net.cpp:101] conv1 -> conv1
I1203 net.cpp:116] Top shape: 20 24 24
I1203 net.cpp:127] conv1 needs backward computation.
These messages tell you the details about each layer, its connections and its output shape, which may be helpful in debugging. After the initialization, the training will start:
I1203 net.cpp:142] Network initialization done.
I1203 solver.cpp:36] Solver scaffolding done.
I1203 solver.cpp:44] Solving LeNet
Based on the solver settings, the training loss is printed every 100 iterations and the network is tested every 500 iterations. You will see messages like this:
I1203 solver.cpp:204] Iteration 100, lr = 0.00992565
I1203 solver.cpp:66] Iteration 100, loss = 0.26044
...
I1203 solver.cpp:84] Testing net
I1203 solver.cpp:111] Test score #0: 0.9785
I1203 solver.cpp:111] Test score #1: 0.0606671
For each training iteration, `lr` is the learning rate of that iteration, and `loss` is the training loss. In the output of the testing phase, score 0 is the accuracy and score 1 is the testing loss.
And after a few minutes, you are done!
I1203 solver.cpp:84] Testing net
I1203 solver.cpp:111] Test score #0: 0.9897
I1203 solver.cpp:111] Test score #1: 0.0324599
I1203 solver.cpp:126] Snapshotting to lenet_iter_10000
I1203 solver.cpp:133] Snapshotting solver state to lenet_iter_10000.solverstate
I1203 solver.cpp:78] Optimization Done.
The final model, a binary protobuf file, is saved at
lenet_iter_10000
which you can deploy as a trained model in your application, if you are training on a real-world application dataset.
### Um... How about GPU training?
You just did! All the training was carried out on the GPU. In fact, if you would like to do training on CPU, you can simply change one line in `lenet_solver.prototxt`:
# solver mode: CPU or GPU
solver_mode: CPU
and you will be using CPU for training. Isn't that easy?
MNIST is a small dataset, so training with GPU does not really introduce too much benefit due to communication overheads. On larger datasets with more complex models, such as ImageNet, the computation speed difference will be more significant.


@ -1,11 +1,17 @@
#!/bin/bash
# Build documentation for display in web browser.
PORT=${1:-4000}
echo "usage: build_docs.sh [port]"
echo "usage: build.sh [port]"
# Find the docs dir, no matter where the script is called
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd $DIR/../docs
ROOT_DIR="$( cd "$(dirname "$0")"/.. ; pwd -P )"
cd $ROOT_DIR
# Gather docs.
scripts/gather_examples.sh
# Display docs using web server.
cd docs
jekyll serve -w -s . -d _site --port=$PORT

32
scripts/copy_notebook.py Executable file

@ -0,0 +1,32 @@
#!/usr/bin/env python
"""
Takes as arguments:
1. the path to a JSON file (such as an IPython notebook).
2. the path to output file
If 'metadata' dict in the JSON file contains 'include_in_docs': true,
then copies the file to output file, appending the 'metadata' property
as YAML front-matter, adding the field 'category' with value 'notebook'.
"""
import os
import sys
import json
filename = sys.argv[1]
output_filename = sys.argv[2]
content = json.load(open(filename))
if 'include_in_docs' in content['metadata'] and content['metadata']['include_in_docs']:
    yaml_frontmatter = ['---']
    for key, val in content['metadata'].iteritems():
        if key == 'name':
            key = 'title'
            if val == '':
                val = os.path.basename(filename)
        yaml_frontmatter.append('{}: {}'.format(key, val))
    yaml_frontmatter += ['category: notebook']
    yaml_frontmatter += ['original_path: ' + filename]
    with open(output_filename, 'w') as fo:
        fo.write('\n'.join(yaml_frontmatter + ['---']) + '\n')
        fo.write(open(filename).read())
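For readers on Python 3, where `dict.iteritems()` no longer exists, the front-matter step of the script above might look like the following sketch. It only builds the header string and does no file I/O; the function name `build_front_matter` and the sample metadata are hypothetical, and this is not part of the commit.

```python
import os

def build_front_matter(metadata, filename):
    """Return YAML front-matter text for a notebook's metadata dict."""
    lines = ['---']
    for key, val in metadata.items():
        if key == 'name':
            # Notebooks call it 'name'; the docs pages expect 'title'.
            key = 'title'
            if val == '':
                val = os.path.basename(filename)
        lines.append('{}: {}'.format(key, val))
    lines.append('category: notebook')
    lines.append('original_path: ' + filename)
    lines.append('---')
    return '\n'.join(lines) + '\n'

meta = {'name': '', 'include_in_docs': True}   # hypothetical metadata
header = build_front_matter(meta, 'examples/detection.ipynb')
print(header)
```

An empty notebook name falls back to the file's base name, so the generated page still gets a title.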


@ -1,5 +1,5 @@
#!/usr/bin/env sh
# Publish/ Pull-request documentation to the gh-pages site.
#!/bin/bash
# Publish documentation to the gh-pages site.
# The remote for pushing the docs (defaults to origin).
# This is where you will submit the PR to BVLC:gh-pages from.

28
scripts/gather_examples.sh Executable file

@ -0,0 +1,28 @@
#!/bin/bash
# Assemble documentation for the project into one directory via symbolic links.
# Find the docs dir, no matter where the script is called
ROOT_DIR="$( cd "$(dirname "$0")"/.. ; pwd -P )"
cd $ROOT_DIR
# Gather docs from examples/**/readme.md
rm -r docs/gathered
mkdir docs/gathered
for README_FILENAME in $(find examples -iname "readme.md"); do
    # Only use file if it is to be included in docs.
    if grep -Fxq "include_in_docs: true" $README_FILENAME; then
        # Make link to readme.md in docs/gathered/.
        # Since everything is called readme.md, rename it by its dirname.
        README_DIRNAME=`dirname $README_FILENAME`
        DOCS_FILENAME=docs/gathered/$README_DIRNAME.md
        mkdir -p `dirname $DOCS_FILENAME`
        ln -s $ROOT_DIR/$README_FILENAME $DOCS_FILENAME
    fi
done
# Gather docs from examples/*.ipynb and add YAML front-matter.
for NOTEBOOK_FILENAME in $(find examples -d 1 -iname "*.ipynb"); do
    DOCS_FILENAME=docs/gathered/$NOTEBOOK_FILENAME
    mkdir -p `dirname $DOCS_FILENAME`
    python scripts/copy_notebook.py $NOTEBOOK_FILENAME $DOCS_FILENAME
done
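The renaming rule used for the readme files (every file is called `readme.md`, so it is renamed after its directory) can be sketched as a small path mapping. This is an illustrative Python equivalent of the `dirname` logic in the script, not part of the commit; the function name `gathered_path` is hypothetical.

```python
import posixpath

def gathered_path(readme_path, gathered_dir='docs/gathered'):
    """Map examples/<dir>/readme.md to docs/gathered/examples/<dir>.md."""
    dirname = posixpath.dirname(readme_path)
    return posixpath.join(gathered_dir, dirname + '.md')

target = gathered_path('examples/mnist/readme.md')
print(target)  # docs/gathered/examples/mnist.md
```

Keeping the `examples/...` prefix under `docs/gathered` avoids name clashes between examples in different directories.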