{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CNTK 302 Part A: Evaluation of pretrained super-resolution models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Contributed by** [Borna Vukorepa](https://www.linkedin.com/in/borna-vukorepa-32a35283/) October 30, 2017\n",
"\n",
"## Introduction\n",
"<p>The aim of the CNTK 302 tutorial is to familiarize the user with the image super-resolution problem and to showcase how we can use tools from CNTK to train and evaluate models which perform image super-resolution. This notebook serves as a guide to experimenting with pre-trained super-resolution CNTK models. The tutorial on how to prepare the data and train the models is contained in the notebook CNTK 302B. It is recommended that the user complete tutorial CNTK 206 before continuing. Some familiarity with convolutional neural networks is also advised.</p>\n",
"<p>Imagine you are given a low-resolution image (for example, 200 x 200 pixels) and you want to make it larger (for example, 400 x 400 pixels). Adding more pixels can add more detail (often high-frequency content) to the image, if done correctly. With naive methods, such as bicubic interpolation, details are lost, leading to a blurry image.</p>\n",
"<p>This problem is commonly referred to as the <b>Single Image Super-Resolution (SISR)</b> problem, and many methods exist that address it. The methods that have given the best results so far all involve deep learning and convolutional neural networks. Below are links to several papers which discuss <b>SISR</b>. Some of the methods mentioned in them will be used in this study.</p>\n",
"\n",
"<a href=\"https://arxiv.org/pdf/1608.00367.pdf\">Accelerating the Super-Resolution Convolutional Neural Network</a>\n",
"\n",
"<a href=\"http://cvlab.cse.msu.edu/pdfs/Tai_Yang_Liu_CVPR2017.pdf\">Image Super-Resolution via Deep Recursive Residual Network</a>\n",
"\n",
"<a href=\"https://arxiv.org/pdf/1511.04587.pdf\">Accurate Image Super-Resolution Using Very Deep Convolutional Networks</a>\n",
"\n",
"<a href=\"https://arxiv.org/pdf/1609.04802.pdf\">Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network</a>\n",
"\n",
"<a href=\"https://arxiv.org/pdf/1606.01299.pdf\">RAISR: Rapid and Accurate Image Super Resolution</a>\n",
"\n",
"Encouraged by the recent results in deep learning (e.g. GANs), our goal is to explore the space of image super-resolution and look at both <b>GAN and non-GAN</b> approaches. All work is done for an upscaling factor of 2. For other factors, the dataset preparation and methods are completely analogous.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Motivation\n",
"<p>Since super-resolution models could be applied to a large number of problems, the study of <b>SISR</b> can be of great use to the community.</p>\n",
"<p>For example, super-resolution models could be applied in fields like medicine, where they could help doctors read X-ray scans and similar imagery. Likewise, they could help the police identify people from low-resolution camera photos. However, it is important to note that each potential application would require a training set designed specifically for that problem. Here we will not focus on any specific subproblem of <b>SISR</b>, so we will not need such special datasets.</p>"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Evaluation and filters\n",
"<p>We trained four super-resolution models: <b>VDSR</b>, <b>DRNN</b>, <b>SRResNet</b>, <b>SRGAN</b>. We describe their working principles in a nutshell here; CNTK 302B describes them in more detail, together with their architectures and training procedures.</p>\n",
"<ul>\n",
"<li>The <b>VDSR</b> model takes a blurry 64 x 64 image patch as input and outputs the predicted difference between the blurry patch and the original clear version. Adding that difference to the blurry patch gives our prediction, which is also 64 x 64. The idea is to upscale the target image with bicubic interpolation before running the model, so in the end, after we evaluate patch by patch, we get a clearer version of the upscaled image. The related paper can be found <a href=\"https://arxiv.org/pdf/1511.04587.pdf\">here</a>.</li>\n",
"<li>The <b>DRNN</b> model works the same way as <b>VDSR</b>, but it has a different architecture. The related paper can be found <a href=\"http://cvlab.cse.msu.edu/pdfs/Tai_Yang_Liu_CVPR2017.pdf\">here</a>.</li>\n",
"<li><b>SRResNet</b> and <b>SRGAN</b> take a 112 x 112 original patch and upscale it to the predicted 224 x 224 version inside the model (there is no pre-upscaling involved). Both share the same architecture, but their loss functions differ, as we will see in the 302B tutorial. The paper describing both methods can be found <a href=\"https://arxiv.org/pdf/1609.04802.pdf\">here</a>.</li>\n",
"</ul>\n",
"<p>Generally, <b>VDSR</b> and <b>DRNN</b> tend to give sharper resulting images, but they sometimes introduce minor artifacts, making the results look less convincing. On the other hand, <b>SRResNet</b> and <b>SRGAN</b> tend to give less sharpness, but also no artifacts.</p>\n",
"<p>This suggests combining these models, in the hope of getting better results than by applying just one of our four basic models. Notice that <b>VDSR</b> and <b>DRNN</b> do not change the image dimensions because all upscaling is done in preprocessing. This is why we can use them to further clean up the images produced by the four basic models.</p>\n",
"<p>We call such model combinations <b>filters</b>. A <b>filter</b> consists of two models, applied one after another. The upscaling is done by the first model, so the second one can be either <b>VDSR</b> or <b>DRNN</b>.</p>\n",
"<p>For example, one filter might consist of <b>SRGAN</b> and <b>VDSR</b>. First, we would apply <b>SRGAN</b> to the starting image and then <b>VDSR</b> to the result, but without the bicubic interpolation preprocessing, since <b>SRGAN</b> has already upscaled the image. Since <b>VDSR</b> was trained to clean up a blurry image, we can expect additional improvement, because <b>SRGAN</b> is clearly better at upscaling than bicubic interpolation.</p>\n",
"<p>Similarly, we could combine <b>VDSR</b> with itself. First, we would upscale the starting image with bicubic interpolation and then clean it up twice in a row with <b>VDSR</b> (with no pre-upscaling the second time). This filter is expected to give great sharpness, but there is a higher chance of artifacts appearing, which can reduce the image quality.</p>\n",
"<p>We also need to enable evaluation of arbitrarily sized images. Remember that our models expect input with fixed dimensions. This is easily handled by evaluating the image patch by patch. Since boundary pixels tend to be predicted less accurately, we slightly overlap the patches and do not take pixels close to a patch boundary into account. This ensures that every pixel is predicted as a non-boundary pixel of some patch, except for those that lie on the actual image boundary. We must also remember that some of the models learn only the residual image, and we must not forget to scale the result back by 255.</p>\n",
"<p>We start with some basic evaluation configuration.</p>"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import cntk as C\n",
"from PIL import Image\n",
"import os\n",
"import numpy as np\n",
"from scipy.misc import imsave\n",
"\n",
"#urlretrieve lives in different modules in Python 2 and Python 3\n",
"try:\n",
"    from urllib.request import urlretrieve, urlopen\n",
"except ImportError:\n",
"    from urllib import urlretrieve, urlopen\n",
"\n",
"try:\n",
"    C.device.try_set_default_device(C.device.gpu(0))\n",
"except:\n",
"    print(\"GPU unavailable. Using CPU instead.\")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Determine the data path for testing\n",
"# Check for an environment variable defined in CNTK's test infrastructure\n",
"envvar = 'CNTK_EXTERNAL_TESTDATA_SOURCE_DIRECTORY'\n",
"def is_test(): return envvar in os.environ\n",
"\n",
"if is_test():\n",
"    test_data_path_base = os.path.join(os.environ[envvar], \"Tutorials\", \"data\")\n",
"    test_data_dir = os.path.join(test_data_path_base, \"BerkeleySegmentationDataset\")\n",
"    test_data_dir = os.path.normpath(test_data_dir)\n",
"\n",
"#prefer our default path for the data\n",
"data_dir = os.path.join(\"data\", \"BerkeleySegmentationDataset\")\n",
"\n",
"if not os.path.exists(data_dir):\n",
"    os.makedirs(data_dir)\n",
"    \n",
"#folder with images to be evaluated\n",
"example_folder = os.path.join(data_dir, \"example_images\")\n",
"if not os.path.exists(example_folder):\n",
"    os.makedirs(example_folder)\n",
"\n",
"#folder for the resulting images (one subfolder per model/filter)\n",
"results_folder = os.path.join(data_dir, \"example_results\")\n",
"if not os.path.exists(results_folder):\n",
"    os.makedirs(results_folder)\n",
"\n",
"#names of the models used\n",
"model_names = [\"VDSR\", \"DRNN\", \"SRResNet\", \"SRGAN\"]\n",
"\n",
"#output dimensions of the respective models (the output is assumed to be square)\n",
"output_dims = [64, 64, 224, 224]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The evaluation algorithm described above is implemented in the function <code>evaluate</code> below. See the code comments for details about each step."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#filename - relative path of the image being processed\n",
"#model - the model for super-resolution\n",
"#outfile - relative path under which the resulting image will be saved\n",
"\n",
"#output_dims - dimension of the current model's output image\n",
"# - it is assumed that the model output image is a square\n",
"\n",
"#pre_upscale - if True, the image is first upscaled by the specified factor with bicubic interpolation\n",
"# - the resulting image then replaces the original one in the subsequent operations\n",
"# - if False, that step is skipped\n",
"# - this should be set to True for models which clear up the image and do not upscale by themselves\n",
"\n",
"#clear_up - if True, the forwarded image will be cleared up by the model and not upscaled\n",
"# - this matters because the step variables differ in that case (see code)\n",
"# - notice that we exit the function if pre_upscale is True and clear_up is False: if the image was pre-upscaled,\n",
"# it should be cleared up afterwards\n",
"\n",
"#residual_model - does the model learn only the residual image (the difference between the blurry and the original patch)?\n",
"# - if True, the residual is added to the low resolution image to produce the result\n",
"# - otherwise, we only need to scale back the result (see code below)\n",
"def evaluate(filename, model, outfile, output_dims, pre_upscale = False, clear_up = False, residual_model = False):\n",
"    img = Image.open(filename)\n",
"    \n",
"    #upscaling coefficient\n",
"    coef = 2\n",
"    \n",
"    #at each step, we will evaluate the subpatch (x : x + range_x, y : y + range_y) of the original image\n",
"    #patch by patch, we will resolve the whole image\n",
"    range_x = output_dims // coef\n",
"    range_y = output_dims // coef\n",
"    \n",
"    #how many boundary pixels of the resulting patch should be excluded?\n",
"    #this is important because boundaries tend to be predicted less accurately\n",
"    offset = output_dims // 10\n",
"    \n",
"    #after we evaluate a subpatch, how far we move down/right to get the next one\n",
"    #we subtract the offset to cover those pixels which were boundary pixels in the previous subpatch\n",
"    step_x = range_x - offset\n",
"    step_y = range_y - offset\n",
"    \n",
"    #this situation should not occur: if we pre-upscale, we need to clear up the result afterwards\n",
"    if((pre_upscale) and (not clear_up)):\n",
"        print(\"Pre-magnified image is not being cleared up.\")\n",
"        return\n",
"    \n",
"    #pre-magnify the picture if needed\n",
"    if(pre_upscale):\n",
"        img = img.resize((coef * img.width, coef * img.height), Image.BICUBIC)\n",
"    \n",
"    #if the current image is being cleared up with no further upscaling,\n",
"    #set coef to 1 and the other parameters accordingly\n",
"    if(clear_up):\n",
"        result = np.zeros((img.height, img.width, 3))\n",
"        range_x = output_dims\n",
"        range_y = output_dims\n",
"        step_x = range_x - 2 * offset\n",
"        step_y = range_y - 2 * offset\n",
"        coef = 1\n",
"    #otherwise, set result to be coef (2 by default) times larger than the image\n",
"    else:\n",
"        result = np.zeros((coef * img.height, coef * img.width, 3))\n",
"    \n",
"    rect = np.array(img, dtype = np.float32)\n",
"    \n",
"    #if the image is too small for some models to work on it, pad it with zeros\n",
"    if(rect.shape[0] < range_y):\n",
"        pad = np.zeros((range_y - rect.shape[0], rect.shape[1], rect.shape[2]))\n",
"        rect = np.concatenate((rect, pad), axis = 0).astype(dtype = np.float32)\n",
"    \n",
"    if(rect.shape[1] < range_x):\n",
"        pad = np.zeros((rect.shape[0], range_x - rect.shape[1], rect.shape[2]))\n",
"        rect = np.concatenate((rect, pad), axis = 1).astype(dtype = np.float32)\n",
"    \n",
"    x = 0\n",
"    y = 0\n",
"    \n",
"    #take subpatch by subpatch and resolve them to get the final image result\n",
"    while(y < img.width):\n",
"        x = 0\n",
"        while(x < img.height):\n",
"            #convert the patch to the BGR, channels-first layout the models expect\n",
"            rgb_patch = rect[x : x + range_x, y : y + range_y]\n",
"            rgb_patch = rgb_patch[..., [2, 1, 0]]\n",
"            rgb_patch = np.ascontiguousarray(np.rollaxis(rgb_patch, 2))\n",
"            pred = np.squeeze(model.eval({model.arguments[0] : [rgb_patch]}))\n",
"            \n",
"            img1 = np.ascontiguousarray(rgb_patch.transpose(2, 1, 0))\n",
"            img2 = np.ascontiguousarray(pred.transpose(2, 1, 0))\n",
"            \n",
"            #if the model predicts the residual image,\n",
"            #scale back the prediction and add it to the starting patch\n",
"            #otherwise just scale back\n",
"            if(residual_model):\n",
"                img2 = 255.0 * img2 + img1\n",
"            else:\n",
"                img2 = pred.transpose(2, 1, 0)\n",
"                img2 = img2 * 255.0\n",
"            \n",
"            #make sure img2 is C contiguous as we just transposed it\n",
"            img2 = np.ascontiguousarray(img2)\n",
"            #make sure no pixels are outside the [0, 255] interval:\n",
"            #the first relu clamps values below 0; flipping around 255 and\n",
"            #applying relu again clamps values above 255, then flips back\n",
"            for _ in range(2):\n",
"                img2 = C.relu(img2).eval()\n",
"                img2 = np.ones(img2.shape) * 255.0 - img2\n",
"            \n",
"            rgb = img2[..., ::-1]\n",
"            patch = rgb.transpose(1, 0, 2)\n",
"            \n",
"            #fill in the pixels in the middle of the subpatch\n",
"            #don't fill those within offset range of the boundary\n",
"            for h in range(coef * x + offset, coef * x + output_dims - offset):\n",
"                for w in range(coef * y + offset, coef * y + output_dims - offset):\n",
"                    for col in range(0, 3):\n",
"                        result[h][w][col] = patch[h - coef * x][w - coef * y][col]\n",
"            \n",
"            #fill the top boundary rows\n",
"            if(x == 0):\n",
"                for h in range(offset):\n",
"                    for w in range(coef * y, coef * y + output_dims):\n",
"                        for col in range(0, 3):\n",
"                            result[h][w][col] = patch[h][w - coef * y][col]\n",
"            \n",
"            #fill the left boundary columns\n",
"            if(y == 0):\n",
"                for h in range(coef * x, coef * x + output_dims):\n",
"                    for w in range(offset):\n",
"                        for col in range(0, 3):\n",
"                            result[h][w][col] = patch[h - coef * x][w][col]\n",
"            \n",
"            #fill the bottom boundary rows\n",
"            if(x == img.height - range_x):\n",
"                for h in range(coef * img.height - offset, coef * img.height):\n",
"                    for w in range(coef * y, coef * y + output_dims):\n",
"                        for col in range(0, 3):\n",
"                            result[h][w][col] = patch[h - coef * x][w - coef * y][col]\n",
"            \n",
"            #fill the right boundary columns\n",
"            if(y == img.width - range_y):\n",
"                for h in range(coef * x, coef * x + output_dims):\n",
"                    for w in range(coef * img.width - offset, coef * img.width):\n",
"                        for col in range(0, 3):\n",
"                            result[h][w][col] = patch[h - coef * x][w - coef * y][col]\n",
"            \n",
"            #reached the bottom of the image\n",
"            if(x == img.height - range_x):\n",
"                break\n",
"            #next step by x, we must not go out of bounds\n",
"            x = min(x + step_x, img.height - range_x)\n",
"        \n",
"        #reached the right edge of the image\n",
"        if(y == img.width - range_y):\n",
"            break\n",
"        #next step by y, we must not go out of bounds\n",
"        y = min(y + step_y, img.width - range_y)\n",
"    \n",
"    result = np.ascontiguousarray(result)\n",
"    \n",
"    #save the result\n",
"    imsave(outfile, result.astype(np.uint8))"
]
},
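{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>As a quick sanity check on the tiling arithmetic above, the minimal sketch below (with a hypothetical patch size, margin and image length, not taken from the models) verifies that stepping by <code>range - 2 * offset</code> along one axis covers every interior pixel as a non-boundary pixel of some patch; only the <code>offset</code>-wide borders remain, and those are handled by the boundary-filling branches of <code>evaluate</code>.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#a minimal sketch with hypothetical values for the patch size, interior margin and image length\n",
"range_, offset, length = 64, 6, 200\n",
"step = range_ - 2 * offset\n",
"\n",
"covered = set()\n",
"x = 0\n",
"while True:\n",
"    #interior pixels of the current patch only\n",
"    covered.update(range(x + offset, x + range_ - offset))\n",
"    if x == length - range_:\n",
"        break\n",
"    x = min(x + step, length - range_)\n",
"\n",
"#everything except the offset-wide borders is covered by some patch interior\n",
"assert covered == set(range(offset, length - offset))\n",
"print(\"Covered\", len(covered), \"of\", length, \"pixels; borders are filled separately.\")"
]
},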
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>Now we load our pretrained models.</p>"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model directory C:\\Data\\CNTKTestData\\Tutorials\\data\\BerkeleySegmentationDataset\\PretrainedModels\n",
"Image directory C:\\Data\\CNTKTestData\\Tutorials\\data\\BerkeleySegmentationDataset\\Images\n"
]
}
],
"source": [
"#get the paths for the pre-trained models and example images\n",
"if is_test():\n",
"    models_dir = os.path.join(test_data_dir, \"PretrainedModels\")\n",
"    image_dir = os.path.join(test_data_dir, \"Images\")\n",
"else:\n",
"    models_dir = os.path.join(data_dir, \"PretrainedModels\")\n",
"    if not os.path.exists(models_dir):\n",
"        os.makedirs(models_dir)\n",
"    \n",
"    image_dir = os.path.join(data_dir, \"Images\")\n",
"    if not os.path.exists(image_dir):\n",
"        os.makedirs(image_dir)\n",
"    \n",
"print(\"Model directory\", models_dir)\n",
"print(\"Image directory\", image_dir)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using cached VDSR model\n",
"Using cached DRNN model\n",
"Using cached SRResNet model\n",
"Using cached SRGAN model\n",
"Loading pretrained models...\n",
"Loaded pretrained models.\n"
]
}
],
"source": [
"if not os.path.isfile(os.path.join(models_dir, \"VDSR.model\")):\n",
"    print(\"Downloading VDSR model...\")\n",
"    urlretrieve(\"https://www.cntk.ai/Models/SuperResolution/VDSR.model\", os.path.join(models_dir, \"VDSR.model\"))\n",
"else:\n",
"    print(\"Using cached VDSR model\")\n",
"\n",
"if not os.path.isfile(os.path.join(models_dir, \"DRNN.model\")):\n",
"    print(\"Downloading DRNN model...\")\n",
"    urlretrieve(\"https://www.cntk.ai/Models/SuperResolution/DRNN.model\", os.path.join(models_dir, \"DRNN.model\"))\n",
"else:\n",
"    print(\"Using cached DRNN model\")\n",
"    \n",
"if not os.path.isfile(os.path.join(models_dir, \"SRResNet.model\")):\n",
"    print(\"Downloading SRResNet model...\")\n",
"    urlretrieve(\"https://www.cntk.ai/Models/SuperResolution/SRResNet.model\", os.path.join(models_dir, \"SRResNet.model\"))\n",
"else:\n",
"    print(\"Using cached SRResNet model\")\n",
"\n",
"if not os.path.isfile(os.path.join(models_dir, \"SRGAN.model\")):\n",
"    print(\"Downloading SRGAN model...\")\n",
"    urlretrieve(\"https://www.cntk.ai/Models/SuperResolution/SRGAN.model\", os.path.join(models_dir, \"SRGAN.model\"))\n",
"else:\n",
"    print(\"Using cached SRGAN model\")\n",
"\n",
"print(\"Loading pretrained models...\")\n",
"VDSR_model = C.load_model(os.path.join(models_dir, \"VDSR.model\"))\n",
"DRNN_model = C.load_model(os.path.join(models_dir, \"DRNN.model\"))\n",
"SRResNet_model = C.load_model(os.path.join(models_dir, \"SRResNet.model\"))\n",
"SRGAN_model = C.load_model(os.path.join(models_dir, \"SRGAN.model\"))\n",
"\n",
"models = [VDSR_model, DRNN_model, SRResNet_model, SRGAN_model]\n",
"\n",
"print(\"Loaded pretrained models.\")"
]
},
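{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>As an optional check (a sketch, assuming each loaded model takes a single image input), we can inspect the input shape each model expects; it should agree with the patch sizes discussed above.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#print the (channels, height, width) input shape of each loaded model\n",
"for name, model in zip(model_names, models):\n",
"    print(name, model.arguments[0].shape)"
]
},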
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>We take an example image on which to try out our models.</p>"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using cached image file\n"
]
}
],
"source": [
"from shutil import copyfile\n",
"\n",
"if not os.path.isfile(os.path.join(image_dir, \"253027.jpg\")):\n",
"    print(\"Downloading example image ...\")\n",
"    link = \"https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/BSDS300/html/images/plain/normal/color/253027.jpg\"\n",
"    urlretrieve(link, os.path.join(example_folder, \"253027.jpg\"))\n",
"else:\n",
"    print(\"Using cached image file\")\n",
"    copyfile(os.path.join(image_dir, \"253027.jpg\"), os.path.join(example_folder, \"253027.jpg\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We also save a bicubically upscaled copy of every image we test on, just for reference."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"save_folder = os.path.join(results_folder, \"bicubic\")\n",
"\n",
"#upscale with bicubic interpolation and save for reference\n",
"for entry in os.listdir(example_folder):\n",
"    filename = os.path.join(example_folder, entry)\n",
"    \n",
"    if not os.path.exists(save_folder):\n",
"        os.makedirs(save_folder)\n",
"    \n",
"    img = Image.open(filename)\n",
"    out = img.resize((2 * img.width, 2 * img.height), Image.BICUBIC)\n",
"    out.save(os.path.join(save_folder, entry))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>Now we are finally ready to evaluate our models. The code below applies every possible filter to every image. The parameters in the calls to <code>evaluate</code> are set according to the model in question. For example, the models at indices 0 and 1 learn only the residual image, so we set <code>residual_model = True</code> for them.</p>\n",
"<p>Also, all upscaling is done by the first element of the filter. Therefore, only <b>VDSR</b> and <b>DRNN</b> can be the second element of a filter, and their preprocessing step must be skipped in that case. We process the images and save them in the appropriate folders. The folder names reflect which filter was used; a worked-out example of a single filter follows the loop below. See the code for additional comments.</p>"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\VDSR_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\VDSR_VDSR_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\DRNN_VDSR_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\DRNN_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\VDSR_DRNN_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\DRNN_DRNN_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\SRResNet_results\\253027.jpg\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\repos\\cntk\\bindings\\python\\cntk\\core.py:82: RuntimeWarning: data is not C contiguous; rearrange your data/computation to avoid costly data conversions\n",
"  RuntimeWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\VDSR_SRResNet_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\DRNN_SRResNet_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\SRGAN_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\VDSR_SRGAN_results\\253027.jpg\n",
"Now creating: data\\BerkeleySegmentationDataset\\example_results\\DRNN_SRGAN_results\\253027.jpg\n"
]
}
],
"source": [
"#loop through every model\n",
"for i in range(4):\n",
"    save_folder = os.path.join(results_folder, model_names[i] + \"_results\")\n",
"    \n",
"    #loop through every image in example_folder\n",
"    for entry in os.listdir(example_folder):\n",
"        filename = os.path.join(example_folder, entry)\n",
"        \n",
"        if not os.path.exists(save_folder):\n",
"            os.makedirs(save_folder)\n",
"        \n",
"        outfile = os.path.join(save_folder, entry)\n",
"        \n",
"        print(\"Now creating: \" + outfile)\n",
"        \n",
"        #function calls for the different models\n",
"        if(i < 2):\n",
"            #residual learning, the image is pre-upscaled and then cleared up\n",
"            evaluate(filename, models[i], outfile, output_dims[i], pre_upscale = True, clear_up = True, residual_model = True)\n",
"        else:\n",
"            #all upscaling happens within the model\n",
"            evaluate(filename, models[i], outfile, output_dims[i], pre_upscale = False, clear_up = False, residual_model = False)\n",
"    \n",
"    #loop through the models which can additionally clear up an image after it has been enlarged (VDSR and DRNN)\n",
"    for j in range(2):\n",
"        #loop through the results of the previously applied model\n",
"        for entry in os.listdir(save_folder):\n",
"            filename = os.path.join(save_folder, entry)\n",
"            filter_folder = os.path.join(results_folder, model_names[j] + \"_\" + model_names[i] + \"_results\")\n",
"            \n",
"            if not os.path.exists(filter_folder):\n",
"                os.makedirs(filter_folder)\n",
"            \n",
"            outfile = os.path.join(filter_folder, entry)\n",
"            \n",
"            print(\"Now creating: \" + outfile)\n",
"            \n",
"            #additionally clear up the image without pre-magnifying\n",
"            evaluate(filename, models[j], outfile, output_dims[j], pre_upscale = False, clear_up = True, residual_model = True)"
]
},
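{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>For reference, here is what a single filter looks like when written out by hand (a sketch; the output file names below are chosen purely for illustration). We apply <b>SRGAN</b>, which upscales by itself, and then <b>VDSR</b>, which clears up the already-enlarged result, exactly as one iteration of the loops above does.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#illustrative file names for one SRGAN -> VDSR filter application\n",
"src = os.path.join(example_folder, \"253027.jpg\")\n",
"tmp = os.path.join(results_folder, \"single_filter_SRGAN.jpg\")\n",
"dst = os.path.join(results_folder, \"single_filter_VDSR_SRGAN.jpg\")\n",
"\n",
"#SRGAN upscales 2x inside the model: no pre-upscaling, no residual\n",
"evaluate(src, SRGAN_model, tmp, 224, pre_upscale = False, clear_up = False, residual_model = False)\n",
"#VDSR then clears up the enlarged image: no pre-upscaling, residual learning\n",
"evaluate(tmp, VDSR_model, dst, 64, pre_upscale = False, clear_up = True, residual_model = True)"
]
},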
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Result analysis and future work\n",
"<p>Below is an example of what we were able to get. There is no mathematical formula that fully captures how well our models perform perceptually. The best and most widely used method is opinion scoring, that is, presenting generated images to people so they can grade them from 1 to 5 depending on how realistic they look.</p>"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<img src=\"https://superresolution.blob.core.windows.net/superresolutionresources/example.PNG\"/>"
],
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import Image\n",
"Image(url = \"https://superresolution.blob.core.windows.net/superresolutionresources/example.PNG\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>For a more detailed analysis, we present several original low-resolution images, highlight a small piece of each, and see how the different filters performed and what results we were able to get. We will see that in some examples we accomplished our goal quite well, but not in all of them.</p>"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<img src=\"https://superresolution.blob.core.windows.net/superresolutionresources/analysis_zebra.PNG\"/>"
],
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(url = \"https://superresolution.blob.core.windows.net/superresolutionresources/analysis_zebra.PNG\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<img src=\"https://superresolution.blob.core.windows.net/superresolutionresources/analysis_village.PNG\"/>"
],
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(url = \"https://superresolution.blob.core.windows.net/superresolutionresources/analysis_village.PNG\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<img src=\"https://superresolution.blob.core.windows.net/superresolutionresources/analysis_pot.PNG\"/>"
],
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(url = \"https://superresolution.blob.core.windows.net/superresolutionresources/analysis_pot.PNG\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"<p>As <b>future work</b>, it would be worthwhile to try to get better results from the <b>SRGAN</b> model. The key problem is finding appropriate coefficients for the different terms of the loss function. In most of our attempts, we ended up with completely unusable and unrecognizable images. After some experimentation, we were able to get decent results, and there is more room for improvement.</p>\n",
"<p>The idea of creating model combinations <b>(filters)</b> has provided us with considerably better results. Combinations of two models usually give better results than a single model by itself.</p>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
|