Commit Graph

19 Commits

Author SHA1 Message Date
David Brownell 5aa511d10a Project updates to use a consistent native runtime 2018-07-10 08:29:11 -07:00
Günter Loch d1e2c91ff2 Fix native ProposalLayerLib dimensions indexes 2018-04-24 14:16:20 +00:00
jaliyaek a7a52d7402 Adding Halide-based binary convolution operators and their dependencies 2018-01-24 16:41:17 -08:00
KeDengMS 3cf3af5df6 CNTK support for CUDA 9
CNTK now supports CUDA 9/cuDNN 7. This requires an update to build environment to Ubuntu 16/GCC 5 for Linux, and Visual Studio 2017/VCTools 14.11 for Windows. With CUDA 9, CNTK also added a preview for 16-bit floating point (a.k.a FP16) computation.

Please check out the example of FP16 in ResNet50 at /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py

Notes on FP16 preview:
* The FP16 implementation on CPU is not optimized and is not meant to be used directly for CPU inference. Users need to convert the model to 32-bit floating point before running on CPU.
* The loss/criterion for FP16 training needs to be 32-bit so that accumulation does not overflow; use the cast function. Please check the example above.
* Readers do not produce FP16 output. Unless data is fed through numpy, a cast from FP32 to FP16 is needed. Please check the example above.
* FP16 gradient aggregation is currently only implemented on GPU using NCCL2. Distributed FP16 training with MPI is not supported.
* FP16 math is a subset of the current FP32 implementation. Some models may hit a Feature Not Implemented exception when using FP16.
* FP16 is currently not supported in BrainScript. Please use Python for FP16.
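
The note about keeping the loss/criterion in 32-bit can be illustrated with a minimal sketch in plain numpy. It does not call CNTK or its cast function; the helper `accumulate` and the loss values are hypothetical and chosen only to show why a 16-bit running sum can overflow (the largest finite float16 value is 65504).

```python
import numpy as np

def accumulate(values, dtype):
    """Sum `values` while holding the running total in `dtype` (hypothetical helper)."""
    total = dtype(0)
    for v in values:
        # Round back to `dtype` after every addition, as a same-width accumulator would.
        total = dtype(total + dtype(v))
    return total

# Made-up per-sample losses: 100 values of 1000.0, whose true sum (100000)
# exceeds the float16 range.
losses = np.full(100, 1000.0, dtype=np.float16)

with np.errstate(over="ignore"):  # silence the expected overflow warning
    fp16_total = accumulate(losses, np.float16)

# Casting the same values to float32 first keeps the sum finite.
fp32_total = accumulate(losses.astype(np.float32), np.float32)

print(fp16_total, fp32_total)
```

Under these assumptions the float16 total overflows to infinity, while the float32 total stays finite.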

To set up the build and runtime environment on Windows:
* Install [Visual Studio 2017](https://www.visualstudio.com/downloads/) with the following workloads and components. From the command line (using the Community version installer as an example):
    vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
* Install [NVidia CUDA 9](https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
* From PowerShell, run:
    /Tools/devInstall/Windows/DevInstall.ps1
* Start VCTools 14.11 command line, run:
    cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
* Open /CNTK.sln from the VCTools 14.11 command line. Note that opening CNTK.sln from anywhere other than the VCTools 14.11 command line causes a CUDA 9 [build error](https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).

To set up the build and runtime environment on Linux using docker, please build an Ubuntu 16.04 docker image using the Dockerfiles in /Tools/docker. For other Linux systems, please refer to the Dockerfiles to set up the libraries CNTK depends on.
2018-01-22 16:58:56 -08:00
Philipp Kranen 63488568fe enabling native proposal layer and dlib selective search 2017-08-30 17:12:35 +02:00
Alexey Reznichenko 65471cdcd9 Add Proposal Layer as a native UDF 2017-08-19 10:03:10 +02:00
Cha Zhang 033cf05be0 Eliminate warnings during build for binary convolution. 2017-06-04 14:17:22 -07:00
Mark Hillebrand d5b331f4b2 Examples/Extensibility/BinaryConvolution/README.md: tune 2017-05-31 13:49:13 +02:00
Amit Agarwal 563b3a4481 CNTK v2 library: Add a non-SSE/AVX version of the halide_convolve static library for windows and use that by default to enable the tests to be run on machines without AVX support. 2017-05-30 11:02:35 -07:00
Amit Agarwal 71b8a35771 CNTK v2 library: Add Binary convolution native user-defined Function library. 2017-05-27 22:52:08 -07:00
Alexey Reznichenko 5bc3661988 Improve UDF serialization
* Add a base class for User-Defined Functions
  * Add support for native udf serialization
  * Change API to accept std::function callbacks
2017-05-09 10:08:23 +02:00
Amit Agarwal f9e14544db CNTK v2 library: Enable passing dictionary of attributes to native UserFunction factory methods when instantiated from python. 2017-04-28 18:34:53 -07:00
Amit Agarwal c7c2547f55 CNTK v2 library: Fixed handling of references of existing network matrix storage handed out from Forward/Backward 2017-04-14 01:33:24 -07:00
Amit Agarwal 1f23f6e161 CNTK v2 library: Add ability to register and instantiate native C++ user-defined functions from python 2017-04-14 01:33:24 -07:00
Friedel van Megen f599fe0619 renaming cntk.library to cntk.core
(that's what other apps link to)
2017-03-21 17:45:46 +01:00
Friedel van Megen 8c90baf161 fix one more dependency 2017-03-20 11:06:48 +01:00
Junjie Qian bd93d7ec5a Copy gradients into a continuous buffer in v2 aggregator 2017-03-10 09:15:40 -08:00
Amit Agarwal 0514a40c55 CNTK core: Made actual forward prop evaluation order consistent with the order in which we setup matrix sharing structure, to ensure proper evaluation of a set of nodes in accordance with the assumed evaluation order during allocating node value/gradient matrices. 2017-03-02 22:32:07 -08:00
Amit Agarwal de6371bb0d CNTK v2 library: C++ user-defined Function example 2017-03-01 18:11:10 -08:00