flickr style fine-tuning model (separated from example read me)

Sergey Karayev 2014-09-04 02:50:53 +01:00
Parent bc601e9060
Commit c6827bf3fc
4 changed files: 33 additions and 7 deletions

@@ -34,7 +34,7 @@ All steps are to be done from the caffe root directory.
The dataset is distributed as a list of URLs with corresponding labels.
Using a script, we will download a small subset of the data and split it into train and val sets.
-caffe % ./examples/finetune_flickr_style/assemble_data.py -h
+caffe % ./models/finetune_flickr_style/assemble_data.py -h
usage: assemble_data.py [-h] [-s SEED] [-i IMAGES] [-w WORKERS]
Download a subset of Flickr Style to a directory
@@ -48,7 +48,7 @@ Using a script, we will download a small subset of the data and split it into tr
num workers used to download images. -x uses (all - x)
cores.
-caffe % python examples/finetune_flickr_style/assemble_data.py --workers=-1 --images=2000 --seed 831486
+caffe % python models/finetune_flickr_style/assemble_data.py --workers=-1 --images=2000 --seed 831486
Downloading 2000 images with 7 workers...
Writing train/val for 1939 successfully downloaded images.
@@ -56,11 +56,11 @@ This script downloads images and writes train/val file lists into `data/flickr_s
With this random seed there are 1,557 train images and 382 test images.
The prototxts in this example assume this, and also assume the presence of the ImageNet mean file (run `get_ilsvrc_aux.sh` from `data/ilsvrc12` to obtain this if you haven't yet).
-We'll also need the ImageNet-trained model, which you can obtain by running `get_caffe_reference_imagenet_model.sh` from `examples/imagenet`.
+We'll also need the ImageNet-trained model, which you can obtain by running `get_caffe_reference_imagenet_model.sh` from `models/imagenet`.
Now we can train! (You can fine-tune in CPU mode by leaving out the `-gpu` flag.)
-caffe % ./build/tools/caffe train -solver examples/finetune_flickr_style/flickr_style_solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0
+caffe % ./build/tools/caffe train -solver models/finetune_flickr_style/flickr_style_solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0
[...]
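The training invocation in this hunk differs between GPU and CPU mode only in the `-gpu` flag. As a minimal sketch, the command line could be assembled in Python as shown here. The helper name `build_train_command` and its parameters are hypothetical and exist only for this illustration; the binary path and the `-solver`, `-weights`, and `-gpu` flags are taken from the readme text.

```python
def build_train_command(solver, weights, gpu=None,
                        caffe_bin="./build/tools/caffe"):
    """Return the argument list for a fine-tuning run.

    Hypothetical helper for illustration. `gpu=None` leaves out the
    `-gpu` flag, which the readme says selects CPU mode.
    """
    cmd = [caffe_bin, "train", "-solver", solver, "-weights", weights]
    if gpu is not None:
        cmd += ["-gpu", str(gpu)]
    return cmd


# Paths as they appear on the updated readme line.
SOLVER = "models/finetune_flickr_style/flickr_style_solver.prototxt"
WEIGHTS = "models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel"

if __name__ == "__main__":
    # Print both variants; passing a list to subprocess.run(cmd, check=True)
    # from the caffe root directory would launch the run.
    print(" ".join(build_train_command(SOLVER, WEIGHTS, gpu=0)))
    print(" ".join(build_train_command(SOLVER, WEIGHTS)))
```

Building the command as a list rather than a single string avoids shell quoting issues if a path ever contains spaces.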
@@ -149,10 +149,16 @@ This model is only beginning to learn.
Fine-tuning can be feasible when training from scratch would not be for lack of time or data.
Even in CPU mode each pass through the training set takes ~100 s. GPU fine-tuning is of course faster still and can learn a useful model in minutes or hours instead of days or weeks.
Furthermore, note that the model has only trained on < 2,000 instances. Transfer learning a new task like style recognition from the ImageNet pretraining can require much less data than training from scratch.
Now try fine-tuning to your own tasks and data!
+## Trained model
+We provide a model trained on all 80K images, with final accuracy of 98%.
+Simply do `./scripts/download_model_binary.py models/finetune_flickr_style` to obtain it.
## License
The Flickr Style dataset as distributed here contains only URLs to images.
Some of the images may have copyright.
-Training a category-recognition model for research/non-commercial use may constitute fair use of this data.
+Training a category-recognition model for research/non-commercial use may constitute fair use of this data, but the result should not be used for commercial purposes.

@@ -0,0 +1,20 @@
---
name: Finetuning CaffeNet on Flickr Style
caffemodel: finetune_flickr_style.caffemodel
caffemodel_url: http://dl.caffe.berkeleyvision.org/finetune_flickr_style.caffemodel
license: non-commercial
sha1: 443ad95a61fb0b5cd3cee55951bcc1f299186b5e
caffe_commit: 41751046f18499b84dbaf529f64c0e664e2a09fe
---
This model is trained exactly as described in `docs/finetune_flickr_style/readme.md`, using all 80000 images.
The final performance on the test set:
I0903 18:40:59.211707 11585 caffe.cpp:167] Loss: 0.407405
I0903 18:40:59.211717 11585 caffe.cpp:179] accuracy = 0.9164
## License
The Flickr Style dataset contains only URLs to images.
Some of the images may have copyright.
Training a category-recognition model for research/non-commercial use may constitute fair use of this data, but the result should not be used for commercial purposes.
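The `sha1` field in the front matter of this new file gives a way to check a downloaded model binary. The following is a minimal sketch of such a check using Python's standard `hashlib`. The function names are hypothetical, and the on-disk location of the downloaded file is an assumption; whether `download_model_binary.py` performs an equivalent check itself is not shown in this diff.

```python
import hashlib

# Value copied from the `sha1` field of the front matter above.
EXPECTED_SHA1 = "443ad95a61fb0b5cd3cee55951bcc1f299186b5e"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`.

    The file is read in chunks so a large model binary does not
    have to fit in memory at once.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA1):
    """True if the file's SHA-1 equals `expected` (case-insensitive)."""
    return sha1_of_file(path) == expected.lower()
```

A call such as `matches_expected("models/finetune_flickr_style/finetune_flickr_style.caffemodel")` would then confirm the download, assuming the binary is saved under the `caffemodel` name inside the model directory.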

@@ -1,4 +1,4 @@
-net: "examples/finetune_flickr_style/flickr_style_train_val.prototxt"
+net: "models/finetune_flickr_style/train_val.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
@@ -12,6 +12,6 @@ max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
-snapshot_prefix: "examples/finetune_flickr_style/flickr_style"
+snapshot_prefix: "models/finetune_flickr_style/finetune_flickr_style"
# uncomment the following to default to CPU mode solving
# solver_mode: CPU
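To see what the new `snapshot_prefix` means in practice, the sketch below lists the snapshot files a full run would be expected to leave behind, using the `snapshot: 10000` and `max_iter: 100000` values visible in this hunk. The `_iter_<N>.caffemodel` / `_iter_<N>.solverstate` naming is an assumption about how Caffe of this era names snapshots and is not stated in the diff; the helper `snapshot_paths` is hypothetical.

```python
def snapshot_paths(prefix, snapshot, max_iter):
    """List (caffemodel, solverstate) path pairs for every snapshot.

    Assumes one snapshot every `snapshot` iterations up to and
    including `max_iter`, named `<prefix>_iter_<N>.<ext>`.
    """
    pairs = []
    for i in range(snapshot, max_iter + 1, snapshot):
        stem = f"{prefix}_iter_{i}"
        pairs.append((stem + ".caffemodel", stem + ".solverstate"))
    return pairs


# Values taken from the solver settings shown in this diff.
PREFIX = "models/finetune_flickr_style/finetune_flickr_style"
paths = snapshot_paths(PREFIX, snapshot=10000, max_iter=100000)

if __name__ == "__main__":
    for model_path, state_path in paths:
        print(model_path, state_path)
```

With these settings the sketch yields ten pairs, all now written under `models/finetune_flickr_style/` rather than `examples/finetune_flickr_style/`.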