* Moved to support CUDA 10, cuDNN 7.3, and cub 1.8.
* Fixed the "pointer to pin pointer is disallowed" bug (#3063), which is exposed by newer versions of VCTools.
* Added a workaround for a potential Visual Studio 2017 15.9 bug affecting the CNTK Debug build.
CNTK now supports CUDA 9/cuDNN 7. This requires updating the build environment to Ubuntu 16/GCC 5 for Linux, and Visual Studio 2017/VCTools 14.11 for Windows. With CUDA 9, CNTK also adds a preview of 16-bit floating-point (a.k.a. FP16) computation.
Please check out the FP16 example for ResNet50 at /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py
Notes on the FP16 preview:
* The FP16 implementation on CPU is not optimized and is not intended to be used for CPU inference directly. Convert the model to 32-bit floating point before running it on CPU.
* The loss/criterion for FP16 training needs to be 32-bit so that accumulation does not overflow; use the cast function. Please check the example above and the sketch after these notes.
* Readers do not produce FP16 output; unless you feed data via numpy, a cast from FP32 to FP16 is needed. Please check the example above.
* FP16 gradient aggregation is currently implemented only on GPU using NCCL2. Distributed FP16 training over MPI is not supported.
* FP16 math covers only a subset of the current FP32 implementation. Some models may raise a "Feature Not Implemented" exception when using FP16.
* FP16 is currently not supported in BrainScript. Please use Python for FP16.
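As a quick illustration, here is a minimal sketch of the casting pattern used in the ResNet50 example referenced above (`create_resnet50` is a hypothetical model factory standing in for the real network definition):

```python
import numpy as np
import cntk as C

# Readers and numpy feeds deliver FP32, so declare the inputs as FP32 ...
features = C.input_variable((3, 224, 224), dtype=np.float32, name='features')
labels   = C.input_variable(1000, dtype=np.float32, name='labels')

# ... then cast them to FP16 inside the graph.
features_fp16 = C.cast(features, dtype=np.float16)
labels_fp16   = C.cast(labels, dtype=np.float16)

# Build the model and criterion with FP16 defaults.
with C.default_options(dtype=np.float16):
    z = create_resnet50(features_fp16)          # hypothetical model factory
    loss = C.cross_entropy_with_softmax(z, labels_fp16)
    metric = C.classification_error(z, labels_fp16)

# Cast the criterion back to FP32 so accumulation does not overflow.
loss = C.cast(loss, dtype=np.float32)
metric = C.cast(metric, dtype=np.float32)
```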
To set up the build and runtime environment on Windows:
* Install [Visual Studio 2017](https://www.visualstudio.com/downloads/) with the following workloads and components. From the command line (using the Community edition installer as an example):
vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
* Install [NVidia CUDA 9](https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
* From PowerShell, run:
/Tools/devInstall/Windows/DevInstall.ps1
* Start the VCTools 14.11 command line and run:
cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
* Open /CNTK.sln from the VCTools 14.11 command line. Note that opening CNTK.sln from anything other than the VCTools 14.11 command line causes a CUDA 9 [build error](https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).
To set up the build and runtime environment on Linux using Docker, please build an Ubuntu 16.04 Docker image using the Dockerfiles under /Tools/docker. For other Linux systems, please refer to the Dockerfiles to set up the dependent libraries for CNTK.
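For example, a build invocation might look like the following (the image tag is arbitrary; substitute the Dockerfile under /Tools/docker that matches your target configuration):
docker build -t cntk-dev -f Tools/docker/<Dockerfile> .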
On Linux:
sudo mkdir /usr/local/mklml
sudo wget https://github.com/01org/mkl-dnn/releases/download/v0.11/mklml_lnx_2018.0.1.20171007.tgz
sudo tar -xzf mklml_lnx_2018.0.1.20171007.tgz -C /usr/local/mklml
On Windows:
Create a directory on your machine to hold MKLML, e.g. mkdir c:\local\mklml
Download the file [mklml_win_2018.0.1.20171007.zip](https://github.com/01org/mkl-dnn/releases/download/v0.11/mklml_win_2018.0.1.20171007.zip).
Unzip it into your MKLML path, creating a versioned subdirectory within.
Set the environment variable `MKLML_PATH` to the versioned subdirectory, e.g. setx MKLML_PATH c:\local\mklml\mklml_win_2018.0.1.20171007
This change also enables CPU convolution forward/backward using MKL, which leads to ~4x speedup in AlexNet training.
* Minor fixes.
* Fixed the native utils path, the loader, and the Native Load Utils.
* Removed path and classpath settings in tests.
* Cleaned up native utils.
* Unified the CNTKNativeUtils class. Changed the code generation to use CNTKNativeUtils directly instead of the intermediary CNTK.init().
* Added fixes to post-build.
* Added NATIVE_LOAD_MANIFEST for the cases when only specific high-level libraries should be loaded.
* Linux side.
* Added GPU support.
* Linux GPU .so copying.
* Moved and refreshed from bindings/python/swig_install.sh. The install location is now /usr/local/swig-*, which ./configure picks up automatically.
* Specify --without-alllang for the SWIG build, which should suppress building tests and examples for specific languages.
* Windows: updated source setup links for SWIG.
* Added some documentation.
* Use $$.
* Added comments.
* Automatically call loadLibrary in Java.
* Semicolon fix.
* Removed .iml file.
* Added more dependencies.
* Moved the Java static block code to cntk_java.i.
* Also removed class files.
* Added .java files to the jar.
* Changed test location.
* Typo fix.
* Ignore the DeviceDescriptorVector size constructor.
* DRY in Main.java.
* Use the List interface instead of the ArrayList implementation.
* Moved test location, expanded tests to GPU, and fixed comments.
* Moved Linux Java tests.
* Updated baseline.txt.