flickr style fine-tuning model (separated from example read me)

2014-09-04 02:50:53 +01:00 · 2014-09-04 02:50:53 +01:00 · c6827bf3fc
--- a/examples/finetune_flickr_style/readme.md
+++ b/examples/finetune_flickr_style/readme.md
@@ -34,7 +34,7 @@ All steps are to be done from the caffe root directory.
 The dataset is distributed as a list of URLs with corresponding labels.
 Using a script, we will download a small subset of the data and split it into train and val sets.

-    caffe % ./examples/finetune_flickr_style/assemble_data.py -h
+    caffe % ./models/finetune_flickr_style/assemble_data.py -h
    usage: assemble_data.py [-h] [-s SEED] [-i IMAGES] [-w WORKERS]

    Download a subset of Flickr Style to a directory
@@ -48,7 +48,7 @@ Using a script, we will download a small subset of the data and split it into tr
                            num workers used to download images. -x uses (all - x)
                            cores.

-    caffe % python examples/finetune_flickr_style/assemble_data.py --workers=-1 --images=2000 --seed 831486
+    caffe % python models/finetune_flickr_style/assemble_data.py --workers=-1 --images=2000 --seed 831486
    Downloading 2000 images with 7 workers...
    Writing train/val for 1939 successfully downloaded images.

@@ -56,11 +56,11 @@ This script downloads images and writes train/val file lists into `data/flickr_s
 With this random seed there are 1,557 train images and 382 test images.
 The prototxts in this example assume this, and also assume the presence of the ImageNet mean file (run `get_ilsvrc_aux.sh` from `data/ilsvrc12` to obtain this if you haven't yet).

-We'll also need the ImageNet-trained model, which you can obtain by running `get_caffe_reference_imagenet_model.sh` from `examples/imagenet`.
+We'll also need the ImageNet-trained model, which you can obtain by running `get_caffe_reference_imagenet_model.sh` from `models/imagenet`.

 Now we can train! (You can fine-tune in CPU mode by leaving out the `-gpu` flag.)

-    caffe % ./build/tools/caffe train -solver examples/finetune_flickr_style/flickr_style_solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0
+    caffe % ./build/tools/caffe train -solver models/finetune_flickr_style/flickr_style_solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0

    [...]

@@ -149,10 +149,16 @@ This model is only beginning to learn.
 Fine-tuning can be feasible when training from scratch would not be for lack of time or data.
 Even in CPU mode each pass through the training set takes ~100 s. GPU fine-tuning is of course faster still and can learn a useful model in minutes or hours instead of days or weeks.
 Furthermore, note that the model has only trained on < 2,000 instances. Transfer learning a new task like style recognition from the ImageNet pretraining can require much less data than training from scratch.
+
 Now try fine-tuning to your own tasks and data!

+## Trained model
+
+We provide a model trained on all 80K images, with a final test accuracy of 91.64%.
+Run `./scripts/download_model_binary.py models/finetune_flickr_style` to obtain it.
+
 ## License

 The Flickr Style dataset as distributed here contains only URLs to images.
 Some of the images may have copyright.
-Training a category-recognition model for research/non-commercial use may constitute fair use of this data.
+Training a category-recognition model for research/non-commercial use may constitute fair use of this data, but the result should not be used for commercial purposes.
--- a/models/finetune_flickr_style/readme.md
+++ b/models/finetune_flickr_style/readme.md
@@ -0,0 +1,20 @@
+---
+name: Finetuning CaffeNet on Flickr Style
+caffemodel: finetune_flickr_style.caffemodel
+caffemodel_url: http://dl.caffe.berkeleyvision.org/finetune_flickr_style.caffemodel
+license: non-commercial
+sha1: 443ad95a61fb0b5cd3cee55951bcc1f299186b5e
+caffe_commit: 41751046f18499b84dbaf529f64c0e664e2a09fe
+---
+
+This model is trained exactly as described in `examples/finetune_flickr_style/readme.md`, using all 80,000 images.
+The final performance on the test set:
+
+    I0903 18:40:59.211707 11585 caffe.cpp:167] Loss: 0.407405
+    I0903 18:40:59.211717 11585 caffe.cpp:179] accuracy = 0.9164
+
+## License
+
+The Flickr Style dataset contains only URLs to images.
+Some of the images may have copyright.
+Training a category-recognition model for research/non-commercial use may constitute fair use of this data, but the result should not be used for commercial purposes.
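
Reviewer note: the front matter added above carries a `sha1` field, so a downloaded `finetune_flickr_style.caffemodel` can be checked against it. The following is a minimal sketch of such a check, not the logic of `scripts/download_model_binary.py` (which this diff does not show). The helper names `parse_front_matter`, `sha1_of_file`, and `matches_readme` are hypothetical, and the parser assumes the front matter is a flat list of `key: value` lines, as it is in this readme.

```python
import hashlib


def parse_front_matter(text):
    """Return the key/value pairs found between the first two '---' lines.

    Assumption: the front matter is flat 'key: value' text, so no YAML
    library is needed. Only the first ':' on a line separates key and value,
    which keeps URLs such as 'http://...' intact.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no front matter found")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("front matter is not terminated")


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large model is never fully held in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_readme(readme_text, model_path):
    """True if the file at model_path has the sha1 recorded in the readme."""
    expected = parse_front_matter(readme_text)["sha1"]
    return sha1_of_file(model_path) == expected
```

Typical use would be `matches_readme(open("models/finetune_flickr_style/readme.md").read(), "models/finetune_flickr_style/finetune_flickr_style.caffemodel")`, assuming the binary was saved next to the readme under the `caffemodel` name.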
--- a/examples/finetune_flickr_style/flickr_style_solver.prototxt
+++ b/examples/finetune_flickr_style/flickr_style_solver.prototxt
@@ -1,4 +1,4 @@
-net: "examples/finetune_flickr_style/flickr_style_train_val.prototxt"
+net: "models/finetune_flickr_style/train_val.prototxt"
 test_iter: 100
 test_interval: 1000
 # lr for fine-tuning should be lower than when starting from scratch
@@ -12,6 +12,6 @@ max_iter: 100000
 momentum: 0.9
 weight_decay: 0.0005
 snapshot: 10000
-snapshot_prefix: "examples/finetune_flickr_style/flickr_style"
+snapshot_prefix: "models/finetune_flickr_style/finetune_flickr_style"
 # uncomment the following to default to CPU mode solving
 # solver_mode: CPU
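
Reviewer note: the new `snapshot_prefix` changes where training checkpoints land. The sketch below works out which snapshot files the solver settings in this diff (`max_iter: 100000`, `snapshot: 10000`, `test_interval: 1000`) would produce. It assumes Caffe's usual `<prefix>_iter_<N>.caffemodel` naming, which this diff does not itself show. The function names are hypothetical, and any test pass Caffe may run before the first iteration is not counted.

```python
def multiples_up_to(step, max_iter):
    """Iterations step, 2*step, ... that do not exceed max_iter."""
    return list(range(step, max_iter + 1, step))


def snapshot_paths(prefix, max_iter, snapshot):
    """Expected weight-snapshot paths, assuming '<prefix>_iter_<N>.caffemodel'."""
    return ["%s_iter_%d.caffemodel" % (prefix, it)
            for it in multiples_up_to(snapshot, max_iter)]


# Values copied from the solver prototxt in this diff.
PREFIX = "models/finetune_flickr_style/finetune_flickr_style"
MAX_ITER = 100000
SNAPSHOT = 10000
TEST_INTERVAL = 1000

paths = snapshot_paths(PREFIX, MAX_ITER, SNAPSHOT)
test_points = multiples_up_to(TEST_INTERVAL, MAX_ITER)
```

Under these assumptions `paths` holds 10 entries, from `..._iter_10000.caffemodel` to `..._iter_100000.caffemodel`, and `test_points` holds 100 iterations at which the validation net is evaluated.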
--- a/examples/finetune_flickr_style/flickr_style_train_val.prototxt
+++ b/examples/finetune_flickr_style/flickr_style_train_val.prototxt