Merge branch 'master' of https://github.com/Microsoft/CNTK into nikosk/pyexamples-new-io

2016-10-25 01:22:42 -07:00 · 2016-10-25 01:22:42 -07:00 · 284bdc287a
--- a/bindings/python/tutorials/CNTK_201A_CIFAR-10_DataLoader.ipynb
+++ b/bindings/python/tutorials/CNTK_201A_CIFAR-10_DataLoader.ipynb
@ -0,0 +1,339 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# CNTK 201A Part A: CIFAR-10 Data Loader\n",
+    "\n",
+    "This tutorial will show how to prepare image data sets for use with deep learning algorithms in CNTK. The CIFAR-10 dataset (http://www.cs.toronto.edu/~kriz/cifar.html) is a popular dataset for image classification, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It is a labeled subset of the [80 million tiny images](http://people.csail.mit.edu/torralba/tinyimages/) dataset.\n",
+    "\n",
+    "The CIFAR-10 dataset is not included in the CNTK distribution but can be easily downloaded and converted to CNTK-supported format   \n",
+    "\n",
+    "CNTK 201A tutorial is divided into two parts:\n",
+    "- Part A: Familiarizes you with the CIFAR-10 data and converts them into CNTK supported format. This data will be used later in the tutorial for image classification tasks.\n",
+    "- Part B: We will introduce image understanding tutorials.\n",
+    "\n",
+    "If you are curious about how well computers can perform on CIFAR-10 today, Rodrigo Benenson maintains a [blog](http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#43494641522d3130) on the state-of-the-art performance of various algorithms.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "from __future__ import print_function\n",
+    "\n",
+    "from PIL import Image\n",
+    "import getopt\n",
+    "import numpy as np\n",
+    "import pickle as cp\n",
+    "import os\n",
+    "import shutil\n",
+    "import struct\n",
+    "import sys\n",
+    "import tarfile\n",
+    "import xml.etree.cElementTree as et\n",
+    "import xml.dom.minidom\n",
+    "\n",
+    "try: \n",
+    "    from urllib.request import urlretrieve \n",
+    "except ImportError: \n",
+    "    from urllib import urlretrieve\n",
+    "\n",
+    "# Config matplotlib for inline plotting\n",
+    "%matplotlib inline"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Data download\n",
+    "\n",
+    "The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. \n",
+    "There are 50,000 training images and 10,000 test images. The 10 classes are: airplane, automobile, bird, \n",
+    "cat, deer, dog, frog, horse, ship, and truck."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# CIFAR Image data\n",
+    "imgSize = 32\n",
+    "numFeature = imgSize * imgSize * 3"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We first setup a few helper functions to download the CIFAR data. The archive contains the files data_batch_1, data_batch_2, ..., data_batch_5, as well as test_batch. Each of these files is a Python \"pickled\" object produced with cPickle. To prepare the input data for use in CNTK we use three oprations:\n",
+    "> `readBatch`: Unpack the pickle files\n",
+    "\n",
+    "> `loadData`: Compose the data into single train and test objects\n",
+    "\n",
+    "> `saveTxt`: As the name suggests, saves the label and the features into text files for both training and testing. \n",
+    "  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def readBatch(src):\n",
+    "    with open(src, 'rb') as f:\n",
+    "        if sys.version_info[0] < 3: \n",
+    "            d = cp.load(f) \n",
+    "        else:\n",
+    "            d = cp.load(f, encoding='latin1')\n",
+    "        data = d['data']\n",
+    "        feat = data\n",
+    "    res = np.hstack((feat, np.reshape(d['labels'], (len(d['labels']), 1))))\n",
+    "    return res.astype(np.int)\n",
+    "\n",
+    "def loadData(src):\n",
+    "    print ('Downloading ' + src)\n",
+    "    fname, h = urlretrieve(src, './delete.me')\n",
+    "    print ('Done.')\n",
+    "    try:\n",
+    "        print ('Extracting files...')\n",
+    "        with tarfile.open(fname) as tar:\n",
+    "            tar.extractall()\n",
+    "        print ('Done.')\n",
+    "        print ('Preparing train set...')\n",
+    "        trn = np.empty((0, numFeature + 1), dtype=np.int)\n",
+    "        for i in range(5):\n",
+    "            batchName = './cifar-10-batches-py/data_batch_{0}'.format(i + 1)\n",
+    "            trn = np.vstack((trn, readBatch(batchName)))\n",
+    "        print ('Done.')\n",
+    "        print ('Preparing test set...')\n",
+    "        tst = readBatch('./cifar-10-batches-py/test_batch')\n",
+    "        print ('Done.')\n",
+    "    finally:\n",
+    "        os.remove(fname)\n",
+    "    return (trn, tst)\n",
+    "\n",
+    "def saveTxt(filename, ndarray):\n",
+    "    with open(filename, 'w') as f:\n",
+    "        labels = list(map(' '.join, np.eye(10, dtype=np.uint).astype(str)))\n",
+    "        for row in ndarray:\n",
+    "            row_str = row.astype(str)\n",
+    "            label_str = labels[row[-1]]\n",
+    "            feature_str = ' '.join(row_str[:-1])\n",
+    "            f.write('|labels {} |features {}\\n'.format(label_str, feature_str))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In addition to saving the images in the text format, we would save the images in PNG format. In addition we also compute the mean of the image. `saveImage` and `saveMean` are two functions used for this purpose."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def saveImage(fname, data, label, mapFile, regrFile, pad, **key_parms):\n",
+    "    # data in CIFAR-10 dataset is in CHW format.\n",
+    "    pixData = data.reshape((3, imgSize, imgSize))\n",
+    "    if ('mean' in key_parms):\n",
+    "        key_parms['mean'] += pixData\n",
+    "\n",
+    "    if pad > 0:\n",
+    "        pixData = np.pad(pixData, ((0, 0), (pad, pad), (pad, pad)), mode='constant', constant_values=128) \n",
+    "\n",
+    "    img = Image.new('RGB', (imgSize + 2 * pad, imgSize + 2 * pad))\n",
+    "    pixels = img.load()\n",
+    "    for x in range(img.size[0]):\n",
+    "        for y in range(img.size[1]):\n",
+    "            pixels[x, y] = (pixData[0][y][x], pixData[1][y][x], pixData[2][y][x])\n",
+    "    img.save(fname)\n",
+    "    mapFile.write(\"%s\\t%d\\n\" % (fname, label))\n",
+    "    \n",
+    "    # compute per channel mean and store for regression example\n",
+    "    channelMean = np.mean(pixData, axis=(1,2))\n",
+    "    regrFile.write(\"|regrLabels\\t%f\\t%f\\t%f\\n\" % (channelMean[0]/255.0, channelMean[1]/255.0, channelMean[2]/255.0))\n",
+    "    \n",
+    "def saveMean(fname, data):\n",
+    "    root = et.Element('opencv_storage')\n",
+    "    et.SubElement(root, 'Channel').text = '3'\n",
+    "    et.SubElement(root, 'Row').text = str(imgSize)\n",
+    "    et.SubElement(root, 'Col').text = str(imgSize)\n",
+    "    meanImg = et.SubElement(root, 'MeanImg', type_id='opencv-matrix')\n",
+    "    et.SubElement(meanImg, 'rows').text = '1'\n",
+    "    et.SubElement(meanImg, 'cols').text = str(imgSize * imgSize * 3)\n",
+    "    et.SubElement(meanImg, 'dt').text = 'f'\n",
+    "    et.SubElement(meanImg, 'data').text = ' '.join(['%e' % n for n in np.reshape(data, (imgSize * imgSize * 3))])\n",
+    "\n",
+    "    tree = et.ElementTree(root)\n",
+    "    tree.write(fname)\n",
+    "    x = xml.dom.minidom.parse(fname)\n",
+    "    with open(fname, 'w') as f:\n",
+    "        f.write(x.toprettyxml(indent = '  '))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "`saveTrainImages` and `saveTestImages` are simple wrapper functions to iterate through the data set that ca"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "def saveTrainImages(filename, foldername):\n",
+    "    if not os.path.exists(foldername):\n",
+    "        os.makedirs(foldername)\n",
+    "    data = {}\n",
+    "    dataMean = np.zeros((3, imgSize, imgSize)) # mean is in CHW format.\n",
+    "    with open('train_map.txt', 'w') as mapFile:\n",
+    "        with open('train_regrLabels.txt', 'w') as regrFile:\n",
+    "            for ifile in range(1, 6):\n",
+    "                with open(os.path.join('./cifar-10-batches-py', 'data_batch_' + str(ifile)), 'rb') as f:\n",
+    "                    if sys.version_info[0] < 3: \n",
+    "                        data = cp.load(f)\n",
+    "                    else: \n",
+    "                        data = cp.load(f, encoding='latin1')\n",
+    "                    for i in range(10000):\n",
+    "                        fname = os.path.join(os.path.abspath(foldername), ('%05d.png' % (i + (ifile - 1) * 10000)))\n",
+    "                        saveImage(fname, data['data'][i, :], data['labels'][i], mapFile, regrFile, 4, mean=dataMean)\n",
+    "    dataMean = dataMean / (50 * 1000)\n",
+    "    saveMean('CIFAR-10_mean.xml', dataMean)\n",
+    "\n",
+    "def saveTestImages(filename, foldername):\n",
+    "    if not os.path.exists(foldername):\n",
+    "      os.makedirs(foldername)\n",
+    "    with open('test_map.txt', 'w') as mapFile:\n",
+    "        with open('test_regrLabels.txt', 'w') as regrFile:\n",
+    "            with open(os.path.join('./cifar-10-batches-py', 'test_batch'), 'rb') as f:\n",
+    "                if sys.version_info[0] < 3: \n",
+    "                    data = cp.load(f)\n",
+    "                else: \n",
+    "                    data = cp.load(f, encoding='latin1')\n",
+    "                for i in range(10000):\n",
+    "                    fname = os.path.join(os.path.abspath(foldername), ('%05d.png' % i))\n",
+    "                    saveImage(fname, data['data'][i, :], data['labels'][i], mapFile, regrFile, 0)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# URLs for the train image and labels data\n",
+    "url_cifar_data = 'http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'\n",
+    "\n",
+    "# Paths for saving the text files\n",
+    "data_dir = './data/CIFAR-10/'\n",
+    "train_filename = data_dir + '/Train_cntk_text.txt'\n",
+    "test_filename = data_dir + '/Test_cntk_text.txt'\n",
+    "\n",
+    "train_img_directory = data_dir + '/Train'\n",
+    "test_img_directory = data_dir + '/Test'\n",
+    "\n",
+    "root_dir = os.getcwd()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Downloading http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz\n",
+      "Done.\n",
+      "Extracting files...\n",
+      "Done.\n",
+      "Preparing train set...\n",
+      "Done.\n",
+      "Preparing test set...\n",
+      "Done.\n",
+      "Writing train text file...\n",
+      "Done.\n",
+      "Writing test text file...\n",
+      "Done.\n",
+      "Converting train data to png images...\n",
+      "Done.\n",
+      "Converting test data to png images...\n",
+      "Done.\n"
+     ]
+    }
+   ],
+   "source": [
+    "if not os.path.exists(data_dir):\n",
+    "    os.makedirs(data_dir)\n",
+    "\n",
+    "try:\n",
+    "    os.chdir(data_dir)   \n",
+    "    trn, tst= loadData('http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz')\n",
+    "    print ('Writing train text file...')\n",
+    "    saveTxt(r'./Train_cntk_text.txt', trn)\n",
+    "    print ('Done.')\n",
+    "    print ('Writing test text file...')\n",
+    "    saveTxt(r'./Test_cntk_text.txt', tst)\n",
+    "    print ('Done.')\n",
+    "    print ('Converting train data to png images...')\n",
+    "    saveTrainImages(r'./Train_cntk_text.txt', 'train')\n",
+    "    print ('Done.')\n",
+    "    print ('Converting test data to png images...')\n",
+    "    saveTestImages(r'./Test_cntk_text.txt', 'test')\n",
+    "    print ('Done.')\n",
+    "finally:\n",
+    "    os.chdir(\"..\\..\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.4.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
--- a/bindings/python/tutorials/CNTK_201B_CIFAR-10_ImageHandsOn.ipynb
+++ b/bindings/python/tutorials/CNTK_201B_CIFAR-10_ImageHandsOn.ipynb
--- a/bindings/python/tutorials/img/CNN.png
+++ b/bindings/python/tutorials/img/CNN.png
--- a/bindings/python/tutorials/img/Conv2D.png
+++ b/bindings/python/tutorials/img/Conv2D.png
--- a/bindings/python/tutorials/img/Conv2DFeatures.png
+++ b/bindings/python/tutorials/img/Conv2DFeatures.png
--- a/bindings/python/tutorials/img/MaxPooling.png
+++ b/bindings/python/tutorials/img/MaxPooling.png
--- a/bindings/python/tutorials/img/ResNetBlock2.png
+++ b/bindings/python/tutorials/img/ResNetBlock2.png