# Makefile for a Linux/GCC build of CNTK
#
# The Linux and Windows versions are not different branches, but rather build off the same
# source files, using different makefiles. This current makefile has the purpose of enabling
# work to make all sources compile with GCC, and also to check for GCC-compat regressions due to
# modifications which are currently done under Windows.
#
# To use this Makefile, create a directory to build in and make a Config.make in the directory
# that provides
#   BUILDTYPE= One of release or debug
#     defaults to release
#   BUILD_VERSION= CNTK version number to be used while building
#   BUILD_PUBLIC= One of yes or no
#   MKL_PATH= path to MKLML installation
#     only needed if MATHLIB=mkl
#   GDK_INCLUDE_PATH= path to CUDA GDK include path, so $(GDK_INCLUDE_PATH)/nvml.h exists
#     defaults to /usr/include/nvidia/gdk
#   GDK_NVML_LIB_PATH= path to CUDA GDK (stub) library path, so $(GDK_NVML_LIB_PATH)/libnvidia-ml.so exists
#     defaults to /usr/src/gdk/nvml/lib
#   MATHLIB= mkl
#     defaults to mkl
#   CUDA_PATH= Path to CUDA
#     If not specified, GPU will not be enabled
#   CUB_PATH= path to NVIDIA CUB installation, so $(CUB_PATH)/cub/cub.cuh exists
#     defaults to /usr/local/cub-1.4.1
#   CUDNN_PATH= path to NVIDIA cuDNN installation, so $(CUDNN_PATH)/cuda/include/cudnn.h exists
#     cuDNN version needs to be 5.0 or higher.
#   KALDI_PATH= Path to Kaldi
#     If not specified, Kaldi plugins will not be built
#   OPENCV_PATH= path to OpenCV 3.1.0 installation, so $(OPENCV_PATH) exists
#     defaults to /usr/local/opencv-3.1.0
#   PROTOBUF_PATH= path to Protocol Buffers 3.1.0 installation, so $(PROTOBUF_PATH) exists
#     defaults to /usr/local/protobuf-3.1.0
#   LIBZIP_PATH= path to libzip installation, so $(LIBZIP_PATH) exists
#     defaults to /usr/local/
#   BOOST_PATH= path to Boost installation, so $(BOOST_PATH)/include/boost/test/unit_test.hpp exists
#     defaults to /usr/local/boost-1.60.0
#   PYTHON_SUPPORT=true iff the CNTK v2 Python module should be built
#   SWIG_PATH= path to SWIG (>= 3.0.10)
#   PYTHON_VERSIONS= list of Python versions to build for
#     A Python version is identified by "27", "35", or "36".
#   PYTHON27_PATH= path to Python 2.7 interpreter
#   PYTHON34_PATH= path to Python 3.4 interpreter
#   PYTHON35_PATH= path to Python 3.5 interpreter
#   PYTHON36_PATH= path to Python 3.6 interpreter
#   MPI_PATH= path to MPI installation, so $(MPI_PATH) exists
#     defaults to /usr/local/mpi
# These can be overridden on the command line, e.g. make BUILDTYPE=debug
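For reference, a minimal Config.make built from the options above might look like the following sketch. All paths are illustrative assumptions for a typical install, not verified defaults; adjust them for your system.

```make
# Example Config.make (illustrative paths; adjust for your system)
BUILDTYPE = release
MATHLIB = mkl
MKL_PATH = /usr/local/mklml
CUDA_PATH = /usr/local/cuda
CUB_PATH = /usr/local/cub-1.4.1
CUDNN_PATH = /usr/local/cudnn
BOOST_PATH = /usr/local/boost-1.60.0
MPI_PATH = /usr/local/mpi
```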
# CNTK support for CUDA 9:
# CNTK now supports CUDA 9/cuDNN 7. This requires updating the build environment to
# Ubuntu 16/GCC 5 on Linux, and Visual Studio 2017/VCTools 14.11 on Windows. With CUDA 9,
# CNTK also adds a preview of 16-bit floating-point (FP16) computation.
# For an FP16 example with ResNet50, see /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py
#
# Notes on the FP16 preview:
# * The FP16 implementation on the CPU is not optimized and is not meant for direct CPU
#   inference. Convert the model to 32-bit floating point before running it on the CPU.
# * The loss/criterion for FP16 training needs to be 32-bit so it can accumulate without
#   overflow; use the cast function, as shown in the example above.
# * Readers do not produce FP16 output; when feeding data via numpy, a cast from FP32 to
#   FP16 is needed, as shown in the example above.
# * FP16 gradient aggregation is currently implemented only on the GPU via NCCL2.
#   Distributed FP16 training over MPI is not supported.
# * FP16 math is a subset of the current FP32 implementation. Some models may raise a
#   "Feature Not Implemented" exception when using FP16.
# * FP16 is currently not supported in BrainScript; use Python for FP16.
#
# To set up the build and runtime environment on Windows:
# * Install Visual Studio 2017 (https://www.visualstudio.com/downloads/) with the
#   following workloads and components. From the command line (using the Community
#   edition installer as an example):
#     vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
# * Install NVIDIA CUDA 9 (https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
# * From PowerShell, run /Tools/devInstall/Windows/DevInstall.ps1
# * Start the VCTools 14.11 command line:
#     cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
# * Open /CNTK.sln from the VCTools 14.11 command line. Opening CNTK.sln from anything
#   other than the VCTools 14.11 command line causes a CUDA 9 build error
#   (https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).
#
# To set up the build and runtime environment on Linux using Docker, build an Ubuntu 16.04
# Docker image using the Dockerfiles under /Tools/docker. For other Linux systems, refer
# to the Dockerfiles to set up the dependent libraries for CNTK.
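The FP32-to-FP16 cast for numpy-fed data mentioned in the notes above can be sketched as follows. The array name and shape are illustrative, not taken from the CNTK examples:

```python
import numpy as np

# Readers produce FP32; cast explicitly before feeding an FP16 model input.
batch_fp32 = np.ones((2, 3), dtype=np.float32)
batch_fp16 = batch_fp32.astype(np.float16)

print(batch_fp16.dtype)  # float16
```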
# TODO: Build static libraries for common dependencies that are shared by multiple
# targets, e.g. eval and CNTK.
ARCH=$(shell uname)

ifndef BUILD_TOP
BUILD_TOP=.
endif

ifneq ("$(wildcard $(BUILD_TOP)/Config.make)","")
  include $(BUILD_TOP)/Config.make
else
  $(error Cannot find $(BUILD_TOP)/Config.make. Please see the CNTK documentation at https://docs.microsoft.com/en-us/cognitive-toolkit/Setup-CNTK-on-Linux for configuration instructions.)
endif

ifndef BUILDTYPE
  $(info Defaulting BUILDTYPE=release)
  BUILDTYPE=release
endif

ifndef MATHLIB
  $(info Defaulting MATHLIB=mkl)
  MATHLIB=mkl
endif
#### Configure based on options above
# The mpic++ wrapper only adds MPI specific flags to the g++ command line.
# The actual compiler/linker flags added can be viewed by running 'mpic++ --showme:compile' and 'mpic++ --showme:link'
ifneq ($(HAS_MPI),0)
CXX = $(MPI_PATH)/bin/mpic++
endif
SSE_FLAGS = -msse4.1 -mssse3

PROTOC = $(PROTOBUF_PATH)/bin/protoc

# Settings for ARM64 architectures that use a cross-compiler on a host machine.
#CXX = aarch64-linux-gnu-g++
#SSE_FLAGS =

SOURCEDIR := Source
INCLUDEPATH := $(addprefix $(SOURCEDIR)/, Common/Include CNTKv2LibraryDll CNTKv2LibraryDll/API CNTKv2LibraryDll/proto ../Examples/Extensibility/CPP Math CNTK ActionsLib ComputationNetworkLib SGDLib SequenceTrainingLib CNTK/BrainScript Readers/ReaderLib PerformanceProfilerDll)
INCLUDEPATH += $(PROTOBUF_PATH)/include

# COMMON_FLAGS include settings that are passed both to NVCC and C++ compilers.
COMMON_FLAGS := -DHAS_MPI=$(HAS_MPI) -D_POSIX_SOURCE -D_XOPEN_SOURCE=600 -D__USE_XOPEN2K -std=c++11 -DCUDA_NO_HALF -D__CUDA_NO_HALF_OPERATORS__
CPPFLAGS :=
CXXFLAGS := $(SSE_FLAGS) -std=c++0x -fopenmp -fpermissive -fPIC -Werror -fcheck-new
LIBPATH :=
LIBS_LIST :=
LDFLAGS :=
CXXVER_GE480 := $(shell expr `$(CXX) -dumpversion | sed -e 's/\.\([0-9][0-9]\)/\1/g' -e 's/\.\([0-9]\)/0\1/g' -e 's/^[0-9]\{3,4\}$$/&00/'` \>= 40800)
ifeq ($(CXXVER_GE480),1)
  CXXFLAGS += -Wno-error=literal-suffix
endif
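The sed pipeline above normalizes a dotted GCC version into a zero-padded integer so it can be compared numerically against 40800 (i.e. 4.8.0). A stand-alone sketch of the same transformation; note that the Makefile writes `$$` where plain shell uses a single `$`:

```shell
# Normalize a dotted GCC version to a zero-padded integer for numeric comparison.
ver_num() {
  echo "$1" | sed -e 's/\.\([0-9][0-9]\)/\1/g' \
                  -e 's/\.\([0-9]\)/0\1/g' \
                  -e 's/^[0-9]\{3,4\}$/&00/'
}

ver_num 4.8.4   # 40804
ver_num 4.8     # 40800 (padded so a two-component version compares correctly)
```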
SEPARATOR = "=-----------------------------------------------------------="
ALL :=
ALL_LIBS :=
PYTHON_LIBS :=
JAVA_LIBS :=
LIBS_FULLPATH :=
SRC :=
# Make sure all is the first (i.e. default) target, but we can't actually define it
# this early in the file, so let buildall do the work.
all : buildall
# Set up basic nvcc options and add CUDA targets from above
2016-02-11 04:50:53 +03:00
CUFLAGS = -m 64
2015-09-02 18:43:31 +03:00
i f d e f C U D A _ P A T H
2016-07-27 17:14:34 +03:00
ifndef GDK_INCLUDE_PATH
GDK_INCLUDE_PATH = /usr/include/nvidia/gdk
$( info defaulting GDK_INCLUDE_PATH to $( GDK_INCLUDE_PATH) )
endif
ifndef GDK_NVML_LIB_PATH
GDK_NVML_LIB_PATH = /usr/src/gdk/nvml/lib
$( info defaulting GDK_NVML_LIB_PATH to $( GDK_NVML_LIB_PATH) )
2015-10-13 02:36:40 +03:00
endif
ifndef CUB_PATH
$( info defaulting CUB_PATH to /usr/local/cub-1.4.1)
CUB_PATH = /usr/local/cub-1.4.1
endif
2015-09-02 18:43:31 +03:00
DEVICE = gpu
NVCC = $( CUDA_PATH) /bin/nvcc
2016-07-27 17:14:34 +03:00
INCLUDEPATH += $( GDK_INCLUDE_PATH)
2015-10-13 02:36:40 +03:00
INCLUDEPATH += $( CUB_PATH)
2015-09-02 18:43:31 +03:00
# Set up CUDA includes and libraries
INCLUDEPATH += $( CUDA_PATH) /include
LIBPATH += $( CUDA_PATH) /lib64
2016-10-06 13:19:13 +03:00
LIBS_LIST += cublas cudart cuda curand cusparse nvidia-ml
2015-09-02 18:43:31 +03:00
2015-11-17 23:58:19 +03:00
# Set up cuDNN if needed
ifdef CUDNN_PATH
INCLUDEPATH += $( CUDNN_PATH) /cuda/include
LIBPATH += $( CUDNN_PATH) /cuda/lib64
2016-10-06 13:19:13 +03:00
LIBS_LIST += cudnn
2016-02-11 04:50:53 +03:00
COMMON_FLAGS += -DUSE_CUDNN
2015-11-17 23:58:19 +03:00
endif
2016-09-16 04:10:53 +03:00
# Set up NCCL if needed
ifdef NCCL_PATH
INCLUDEPATH += $( NCCL_PATH) /include
LIBPATH += $( NCCL_PATH) /lib
LIBS_LIST += nccl
COMMON_FLAGS += -DUSE_NCCL
endif
2015-09-02 18:43:31 +03:00
e l s e
DEVICE = cpu
2016-02-11 04:50:53 +03:00
COMMON_FLAGS += -DCPUONLY
2015-09-02 18:43:31 +03:00
e n d i f
ifeq ("$(MATHLIB)","mkl")
  INCLUDEPATH += $(MKL_PATH)/include
  LIBS_LIST += m iomp5 pthread mklml_intel mkldnn
  MKL_LIB_PATH := $(MKL_PATH)/lib
  LIBPATH += $(MKL_LIB_PATH)
  COMMON_FLAGS += -DUSE_MKL -DUSE_MKLDNN
endif

ifeq ($(CUDA_GDR),1)
  COMMON_FLAGS += -DUSE_CUDA_GDR
endif
ifeq ("$(MATHLIB)","openblas")
  INCLUDEPATH += $(OPENBLAS_PATH)/include
  LIBPATH += $(OPENBLAS_PATH)/lib
  LIBS_LIST += openblas m pthread
  CPPFLAGS += -DUSE_OPENBLAS
endif
ifdef KALDI_PATH
  ########## Copy includes and defines from $(KALDI_PATH)/src/kaldi.mk ##########
  FSTROOT = $(KALDI_PATH)/tools/openfst
  ATLASINC = $(KALDI_PATH)/tools/ATLAS/include

  INCLUDEPATH += $(KALDI_PATH)/src $(ATLASINC) $(FSTROOT)/include
  CPPFLAGS += -DKALDI_DOUBLEPRECISION=0 -DHAVE_POSIX_MEMALIGN -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -DHAVE_OPENFST_GE_10400

  KALDI_LIBPATH += $(KALDI_PATH)/src/lib
  KALDI_LIBS_LIST := kaldi-util kaldi-matrix kaldi-base kaldi-hmm kaldi-cudamatrix kaldi-nnet kaldi-lat
  KALDI_LIBS := $(addprefix -l,$(KALDI_LIBS_LIST))
endif
ifdef SUPPORT_AVX2
  CPPFLAGS += -mavx2
endif
# Set up nvcc target architectures (will generate code to support them all, i.e. fat-binary, in release mode)
# In debug mode we only include cubin/PTX for 30 and rely on PTX / JIT to generate the required native cubin format
# see also http://docs.nvidia.com/cuda/pascal-compatibility-guide/index.html#building-applications-with-pascal-support
GENCODE_SM30 := -gencode arch=compute_30,code=\"sm_30,compute_30\"
GENCODE_SM35 := -gencode arch=compute_35,code=\"sm_35,compute_35\"
GENCODE_SM50 := -gencode arch=compute_50,code=\"sm_50,compute_50\"
GENCODE_SM52 := -gencode arch=compute_52,code=\"sm_52,compute_52\"
GENCODE_SM60 := -gencode arch=compute_60,code=\"sm_60,compute_60\"
GENCODE_SM61 := -gencode arch=compute_61,code=\"sm_61,compute_61\"
GENCODE_SM70 := -gencode arch=compute_70,code=\"sm_70,compute_70\"
# Should we relocate *.gcno and *.gcda files using the -fprofile-dir option?
# Use GCOV_PREFIX and GCOV_PREFIX_STRIP if relocating:
# For example, if the object file /user/build/foo.o was built with -fprofile-arcs, the final executable will try to create the data file
# /user/build/foo.gcda when running on the target system. This will fail if the corresponding directory does not exist and it is unable
# to create it. This can be overcome by, for example, setting the environment as 'GCOV_PREFIX=/target/run' and 'GCOV_PREFIX_STRIP=1'.
# Such a setting will name the data file /target/run/build/foo.gcda.
ifdef CNTK_CODE_COVERAGE
  CXXFLAGS += -fprofile-arcs -ftest-coverage
  LDFLAGS += -lgcov --coverage
endif
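The GCOV_PREFIX/GCOV_PREFIX_STRIP relocation described in the comments above can be sketched in plain shell: strip N leading path components from the compile-time path, then prepend the prefix. The function name is hypothetical; the paths follow the /user/build/foo.gcda example from the comments:

```shell
# Compute where gcov writes a .gcda file under GCOV_PREFIX/GCOV_PREFIX_STRIP.
relocate_gcda() {
  orig=$1 prefix=$2 strip=$3
  rel=${orig#/}                      # drop the leading slash
  i=0
  while [ "$i" -lt "$strip" ]; do    # drop $strip leading path components
    rel=${rel#*/}
    i=$((i + 1))
  done
  echo "$prefix/$rel"
}

relocate_gcda /user/build/foo.gcda /target/run 1   # /target/run/build/foo.gcda
```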
ifeq ("$(BUILDTYPE)","debug")
  ifdef CNTK_CUDA_CODEGEN_DEBUG
    GENCODE_FLAGS := $(CNTK_CUDA_CODEGEN_DEBUG)
  else
    GENCODE_FLAGS := $(GENCODE_SM30)
  endif

  CXXFLAGS += -g
  LDFLAGS += -rdynamic
  COMMON_FLAGS += -D_DEBUG -DNO_SYNC
  CUFLAGS += -O0 -g -use_fast_math -lineinfo $(GENCODE_FLAGS)
endif

ifeq ("$(BUILDTYPE)","release")
  ifdef CNTK_CUDA_CODEGEN_RELEASE
    GENCODE_FLAGS := $(CNTK_CUDA_CODEGEN_RELEASE)
  else
    GENCODE_FLAGS := $(GENCODE_SM30) $(GENCODE_SM35) $(GENCODE_SM50) $(GENCODE_SM60) $(GENCODE_SM61) $(GENCODE_SM70)
  endif

  CXXFLAGS += -g -O4
  LDFLAGS += -rdynamic
  COMMON_FLAGS += -DNDEBUG -DNO_SYNC
  CUFLAGS += -O3 -g -use_fast_math $(GENCODE_FLAGS)
endif
ifdef CNTK_CUDA_DEVICE_DEBUGINFO
  CUFLAGS += -G
endif
# Create the library link options for the linker.
# LIBS_LIST must not be changed beyond this point.
LIBS := $(addprefix -l,$(LIBS_LIST))
OBJDIR := $(BUILD_TOP)/.build
BINDIR := $(BUILD_TOP)/bin
LIBDIR := $(BUILD_TOP)/lib
PYTHONDIR := $(BUILD_TOP)/python
ORIGINLIBDIR := '$$ORIGIN/../lib'
ORIGINDIR := '$$ORIGIN'
########################################
# Components VERSION info
########################################
# CNTK version to be used wherever a CNTK version is required, e.g. to print the version or tag CNTK binaries.
CNTK_VERSION := $(BUILD_VERSION)
# The CNTK version banner is printed wherever CNTK_VERSION should be displayed, e.g. python -c 'import cntk; print(cntk.__version__)'.
CNTK_VERSION_BANNER := $(CNTK_VERSION)
ifeq ("$(BUILD_PUBLIC)","no")
CNTK_VERSION_BANNER := $(CNTK_VERSION_BANNER)+
endif
# CNTK binaries (generated by the build) are suffixed with CNTK_COMPONENT_VERSION, e.g. libCntk.Core-$(CNTK_COMPONENT_VERSION).so.
CNTK_COMPONENT_VERSION := $(CNTK_VERSION)
ifeq ("$(BUILDTYPE)","debug")
CNTK_COMPONENT_VERSION := $(CNTK_COMPONENT_VERSION)d
endif
CPPFLAGS += -DCNTK_VERSION="$(CNTK_VERSION)"
CPPFLAGS += -DCNTK_VERSION_BANNER="$(CNTK_VERSION_BANNER)"
CPPFLAGS += -DCNTK_COMPONENT_VERSION="$(CNTK_COMPONENT_VERSION)"
CNTKMATH := Cntk.Math-$(CNTK_COMPONENT_VERSION)

RPATH = -Wl,-rpath,
########################################
# Build info
########################################

BUILDINFO := $(SOURCEDIR)/CNTKv2LibraryDll/buildinfo.h
GENBUILD := Tools/generate_build_info
BUILDINFO_OUTPUT := $(shell $(GENBUILD) $(BUILD_TOP)/Config.make && echo Success)
ifneq ("$(BUILDINFO_OUTPUT)","Success")
  $(error Could not generate $(BUILDINFO))
endif
########################################
# Performance profiler library
########################################
PERF_PROFILER := Cntk.PerformanceProfiler-$(CNTK_COMPONENT_VERSION)
PP_SRC = \
	$(SOURCEDIR)/PerformanceProfilerDll/PerformanceProfiler.cpp \
	$(SOURCEDIR)/Common/File.cpp \
	$(SOURCEDIR)/Common/fileutil.cpp \
	$(SOURCEDIR)/Common/ExceptionWithCallStack.cpp \

PP_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(PP_SRC))

PERF_PROFILER_LIB := $(LIBDIR)/lib$(PERF_PROFILER).so
ALL_LIBS += $(PERF_PROFILER_LIB)
PYTHON_LIBS += $(PERF_PROFILER_LIB)
JAVA_LIBS += $(PERF_PROFILER_LIB)
SRC += $(PP_SRC)
$(PERF_PROFILER_LIB): $(PP_OBJ)
	@echo $(SEPARATOR)
	@echo creating $@ for $(ARCH) with build type $(BUILDTYPE)
	@mkdir -p $(dir $@)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,$(RPATH)%, $(ORIGINDIR)) -o $@ $^
########################################
# Math library
########################################
# Define all sources that need to be built
READER_SRC = \
	$(SOURCEDIR)/Readers/ReaderLib/BlockRandomizer.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/Bundler.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/NoRandomizer.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/LTNoRandomizer.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/LTTumblingWindowRandomizer.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/LocalTimelineRandomizerBase.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/ReaderShim.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/ChunkRandomizer.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/SequenceRandomizer.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/SequencePacker.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/TruncatedBpttPacker.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/PackerBase.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/FramePacker.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/ReaderBase.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/Index.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/IndexBuilder.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/BufferedFileReader.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/DataDeserializerBase.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/ChunkCache.cpp \
	$(SOURCEDIR)/Readers/ReaderLib/ReaderUtil.cpp \

COMMON_SRC = \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Common/Config.cpp \
2016-08-26 12:10:36 +03:00
$( SOURCEDIR) /Common/Globals.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Common/DataReader.cpp \
$( SOURCEDIR) /Common/DataWriter.cpp \
2016-02-18 18:35:10 +03:00
$( SOURCEDIR) /Common/ExceptionWithCallStack.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Common/Eval.cpp \
$( SOURCEDIR) /Common/File.cpp \
$( SOURCEDIR) /Common/TimerUtility.cpp \
$( SOURCEDIR) /Common/fileutil.cpp \
2016-09-15 16:44:45 +03:00
$( SOURCEDIR) /Common/Sequences.cpp \
2017-07-18 18:56:32 +03:00
$( SOURCEDIR) /Common/EnvironmentUtil.cpp \
2015-09-02 18:43:31 +03:00
MATH_SRC = \
2016-08-25 06:54:08 +03:00
$( SOURCEDIR) /Math/BatchNormalizationEngine.cpp \
$( SOURCEDIR) /Math/CUDAPageLockedMemAllocator.cpp \
2017-04-18 19:39:58 +03:00
$( SOURCEDIR) /Math/CPUMatrixFloat.cpp \
$( SOURCEDIR) /Math/CPUMatrixDouble.cpp \
CNTK support for CUDA 9
CNTK now supports CUDA 9/cuDNN 7. This requires an update to build environment to Ubuntu 16/GCC 5 for Linux, and Visual Studio 2017/VCTools 14.11 for Windows. With CUDA 9, CNTK also added a preview for 16-bit floating point (a.k.a FP16) computation.
Please check out the example of FP16 in ResNet50 at /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py
Notes on FP16 preview:
* FP16 implementation on CPU is not optimized, and it's not supposed to be used in CPU inference directly. User needs to convert the model to 32-bit floating point before running on CPU.
* Loss/Criterion for FP16 training needs to be 32bit for accumulation without overflow, using cast function. Please check the example above.
* Readers do not have FP16 output unless using numpy to feed data, cast from FP32 to FP16 is needed. Please check the example above.
* FP16 gradient aggregation is currently only implemented on GPU using NCCL2. Distributed training with FP16 with MPI is not supported.
* FP16 math is a subset of current FP32 implementation. Some model may get Feature Not Implemented exception using FP16.
* FP16 is currently not supported in BrainScript. Please use Python for FP16.
To setup build and runtime environment on Windows:
* Install [Visual Studio 2017](https://www.visualstudio.com/downloads/) with following workloads and components. From command line (use Community version installer as example):
vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
* Install [NVidia CUDA 9](https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
* From PowerShell, run:
/Tools/devInstall/Windows/DevInstall.ps1
* Start VCTools 14.11 command line, run:
cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
* Open /CNTK.sln from the VCTools 14.11 command line. Note that starting CNTK.sln other than VCTools 14.11 command line, would causes CUDA 9 [build error](https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).
To setup build and runtime environment on Linux using docker, please build Unbuntu 16.04 docker image using Dockerfiles /Tools/docker. For other Linux systems, please refer to the Dockerfiles to setup dependent libraries for CNTK.
2018-01-23 03:58:56 +03:00
$( SOURCEDIR) /Math/CPUMatrixHalf.cpp \
$( SOURCEDIR) /Math/CPUMatrixTensorFloat.cpp \
$( SOURCEDIR) /Math/CPUMatrixTensorDouble.cpp \
$( SOURCEDIR) /Math/CPUMatrixTensorHalf.cpp \
2018-02-09 09:22:27 +03:00
$( SOURCEDIR) /Math/CPUMatrixTensorSpecial.cpp \
2016-05-05 20:59:05 +03:00
$( SOURCEDIR) /Math/CPURNGHandle.cpp \
2016-08-25 06:54:08 +03:00
$( SOURCEDIR) /Math/CPUSparseMatrix.cpp \
$( SOURCEDIR) /Math/ConvolutionEngine.cpp \
2016-01-05 21:42:38 +03:00
$( SOURCEDIR) /Math/MatrixQuantizerImpl.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Math/MatrixQuantizerCPU.cpp \
$( SOURCEDIR) /Math/Matrix.cpp \
2016-08-25 06:54:08 +03:00
$( SOURCEDIR) /Math/QuantizedMatrix.cpp \
2016-09-07 16:46:20 +03:00
$( SOURCEDIR) /Math/DataTransferer.cpp \
2016-05-05 20:59:05 +03:00
$( SOURCEDIR) /Math/RNGHandle.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Math/TensorView.cpp \
2016-09-16 04:10:53 +03:00
$( SOURCEDIR) /Math/NcclComm.cpp \
2015-09-02 18:43:31 +03:00
i f d e f C U D A _ P A T H
MATH_SRC += \
2016-08-25 06:54:08 +03:00
$( SOURCEDIR) /Math/CuDnnBatchNormalization.cu \
$( SOURCEDIR) /Math/CuDnnCommon.cu \
$( SOURCEDIR) /Math/CuDnnConvolutionEngine.cu \
$( SOURCEDIR) /Math/CuDnnRNN.cpp \
$( SOURCEDIR) /Math/GPUDataTransferer.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Math/GPUMatrix.cu \
$( SOURCEDIR) /Math/GPUSparseMatrix.cu \
2016-08-25 06:54:08 +03:00
$( SOURCEDIR) /Math/GPUTensor.cu \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Math/GPUWatcher.cu \
2016-05-05 20:59:05 +03:00
$( SOURCEDIR) /Math/GPURNGHandle.cu \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Math/MatrixQuantizerGPU.cu \
2015-09-02 18:43:31 +03:00
e l s e
MATH_SRC += \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Math/NoGPU.cpp
2015-09-02 18:43:31 +03:00
e n d i f
MATH_SRC += $( COMMON_SRC)
2016-01-25 18:49:09 +03:00
MATH_SRC += $( READER_SRC)
2015-09-02 18:43:31 +03:00
MATH_OBJ := $( patsubst %.cu, $( OBJDIR) /%.o, $( patsubst %.cpp, $( OBJDIR) /%.o, $( MATH_SRC) ) )
CNTKMATH_LIB := $( LIBDIR) /lib$( CNTKMATH) .so
2016-10-06 13:19:13 +03:00
ALL_LIBS += $( CNTKMATH_LIB)
2017-02-08 22:37:48 +03:00
PYTHON_LIBS += $( CNTKMATH_LIB)
2017-01-13 00:24:35 +03:00
JAVA_LIBS += $( CNTKMATH_LIB)
2015-09-02 18:43:31 +03:00
SRC += $( MATH_SRC)
2016-11-09 01:15:55 +03:00
$(CNTKMATH_LIB) : $( MATH_OBJ ) | $( PERF_PROFILER_LIB )
2015-09-02 18:43:31 +03:00
@echo $( SEPARATOR)
2015-12-15 11:58:24 +03:00
@echo creating $@ for $( ARCH) with build type $( BUILDTYPE)
2015-09-02 18:43:31 +03:00
@mkdir -p $( dir $@ )
2016-11-09 01:15:55 +03:00
$( CXX) $( LDFLAGS) -shared $( patsubst %,-L%, $( LIBPATH) $( LIBDIR) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -fopenmp -l$( PERF_PROFILER)
CNTK support for CUDA 9
CNTK now supports CUDA 9/cuDNN 7. This requires an update to build environment to Ubuntu 16/GCC 5 for Linux, and Visual Studio 2017/VCTools 14.11 for Windows. With CUDA 9, CNTK also added a preview for 16-bit floating point (a.k.a FP16) computation.
Please check out the example of FP16 in ResNet50 at /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py
Notes on FP16 preview:
* FP16 implementation on CPU is not optimized, and it's not supposed to be used in CPU inference directly. User needs to convert the model to 32-bit floating point before running on CPU.
* Loss/Criterion for FP16 training needs to be 32bit for accumulation without overflow, using cast function. Please check the example above.
* Readers do not have FP16 output unless using numpy to feed data, cast from FP32 to FP16 is needed. Please check the example above.
* FP16 gradient aggregation is currently only implemented on GPU using NCCL2. Distributed training with FP16 with MPI is not supported.
* FP16 math is a subset of current FP32 implementation. Some model may get Feature Not Implemented exception using FP16.
* FP16 is currently not supported in BrainScript. Please use Python for FP16.
To setup build and runtime environment on Windows:
* Install [Visual Studio 2017](https://www.visualstudio.com/downloads/) with following workloads and components. From command line (use Community version installer as example):
vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
* Install [NVidia CUDA 9](https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
* From PowerShell, run:
/Tools/devInstall/Windows/DevInstall.ps1
* Start VCTools 14.11 command line, run:
cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
* Open /CNTK.sln from the VCTools 14.11 command line. Note that starting CNTK.sln other than VCTools 14.11 command line, would causes CUDA 9 [build error](https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).
To setup build and runtime environment on Linux using docker, please build Unbuntu 16.04 docker image using Dockerfiles /Tools/docker. For other Linux systems, please refer to the Dockerfiles to setup dependent libraries for CNTK.
2018-01-23 03:58:56 +03:00
# Any executable using Common or ReaderLib needs to link these libraries.
2016-11-09 01:15:55 +03:00
READER_LIBS := $( CNTKMATH_LIB) $( PERF_PROFILER_LIB)
L_READER_LIBS := -l$( CNTKMATH) -l$( PERF_PROFILER)
2015-09-02 18:43:31 +03:00
2016-06-20 14:51:38 +03:00
########################################
2016-06-11 22:21:15 +03:00
# CNTKLibrary
2015-09-02 18:43:31 +03:00
########################################
2016-06-11 22:21:15 +03:00
CNTK_COMMON_SRC = \
$( SOURCEDIR) /Common/BestGpu.cpp \
$( SOURCEDIR) /Common/MPIWrapper.cpp \
COMPUTATION_NETWORK_LIB_SRC = \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNode.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNodeScripting.cpp \
$( SOURCEDIR) /ComputationNetworkLib/InputAndParamNodes.cpp \
2016-09-05 04:36:17 +03:00
$( SOURCEDIR) /ComputationNetworkLib/RecurrentNodes.cpp \
2016-09-01 12:32:52 +03:00
$( SOURCEDIR) /ComputationNetworkLib/LinearAlgebraNodes.cpp \
2016-06-11 22:21:15 +03:00
$( SOURCEDIR) /ComputationNetworkLib/ReshapingNodes.cpp \
2016-08-25 06:42:48 +03:00
$( SOURCEDIR) /ComputationNetworkLib/RNNNodes.cpp \
2016-06-11 22:21:15 +03:00
$( SOURCEDIR) /ComputationNetworkLib/SpecialPurposeNodes.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNetwork.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNetworkEvaluation.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNetworkAnalysis.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNetworkEditing.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNetworkBuilder.cpp \
$( SOURCEDIR) /ComputationNetworkLib/ComputationNetworkScripting.cpp \
2016-09-30 18:05:00 +03:00
$( SOURCEDIR) /ComputationNetworkLib/TrainingNodes.cpp \
2016-06-11 22:21:15 +03:00
SEQUENCE_TRAINING_LIB_SRC = \
$( SOURCEDIR) /SequenceTrainingLib/latticeforwardbackward.cpp \
$( SOURCEDIR) /SequenceTrainingLib/parallelforwardbackward.cpp \
i f d e f C U D A _ P A T H
SEQUENCE_TRAINING_LIB_SRC += \
$( SOURCEDIR) /Math/cudalatticeops.cu \
$( SOURCEDIR) /Math/cudalattice.cpp \
$( SOURCEDIR) /Math/cudalib.cpp \
e l s e
SEQUENCE_TRAINING_LIB_SRC += \
$( SOURCEDIR) /SequenceTrainingLib/latticeNoGPU.cpp \
e n d i f
2016-10-17 15:07:22 +03:00
CNTKLIBRARY_COMMON_SRC = \
2016-07-18 02:44:44 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/BackCompat.cpp \
2016-06-11 22:21:15 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/Common.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/Function.cpp \
2016-11-24 05:57:17 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/PrimitiveFunction.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/CompositeFunction.cpp \
2017-04-28 17:53:26 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/UserDefinedFunction.cpp \
2016-06-11 22:21:15 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/NDArrayView.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/NDMask.cpp \
2016-07-24 23:02:55 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/Trainer.cpp \
2017-03-08 18:08:57 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/Evaluator.cpp \
2016-06-11 22:21:15 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/Utils.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/Value.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/Variable.cpp \
2016-10-18 18:38:40 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/Learner.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/Serialization.cpp \
2016-10-17 15:07:22 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/DistributedCommunicator.cpp \
2016-11-28 13:44:21 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/DistributedLearnerBase.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/DataParallelDistributedLearner.cpp \
2017-02-04 01:14:12 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/ProgressWriter.cpp \
2017-11-03 01:09:05 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/CNTKLibraryC.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/EvaluatorWrapper.cpp \
2016-10-25 01:33:31 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/proto/CNTK.pb.cc \
2017-01-19 04:03:29 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/tensorboard/tensorboard.pb.cc \
$( SOURCEDIR) /CNTKv2LibraryDll/tensorboard/TensorBoardFileWriter.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/tensorboard/TensorBoardUtils.cpp \
2017-11-30 04:51:27 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/protobuf/onnx-ml.pb.cc \
2017-11-16 08:44:46 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/activation/defs.cpp \
2017-10-10 11:01:43 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/generator/defs.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/logical/defs.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/math/defs.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/nn/defs.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/reduction/defs.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/rnn/defs.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/tensor/defs.cpp \
2017-11-16 08:44:46 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/defs/traditionalml/defs.cpp \
2017-10-10 11:01:43 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/constants.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/status.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/utils.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/opsignature.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/op.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/shape_inference.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/graph.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/core/model.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/Operators.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/CNTKToONNX.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/ONNXToCNTK.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/proto/onnx/ONNX.cpp \
2016-10-17 15:07:22 +03:00
CNTKLIBRARY_SRC = \
$( SOURCEDIR) /CNTKv2LibraryDll/ComputeInputStatistics.cpp \
$( SOURCEDIR) /CNTKv2LibraryDll/MinibatchSource.cpp \
2017-07-25 13:48:59 +03:00
$( SOURCEDIR) /CNTKv2LibraryDll/TrainingSession.cpp \
2016-06-11 22:21:15 +03:00
2016-10-17 15:07:22 +03:00
CNTKLIBRARY_SRC += $( CNTKLIBRARY_COMMON_SRC)
2016-06-11 22:21:15 +03:00
CNTKLIBRARY_SRC += $( CNTK_COMMON_SRC)
CNTKLIBRARY_SRC += $( COMPUTATION_NETWORK_LIB_SRC)
CNTKLIBRARY_SRC += $( SEQUENCE_TRAINING_LIB_SRC)
2017-03-21 19:45:46 +03:00
CNTKLIBRARY := Cntk.Core-$( CNTK_COMPONENT_VERSION)
2016-06-11 22:21:15 +03:00
2016-10-25 01:33:31 +03:00
CNTKLIBRARY_OBJ := \
$( patsubst %.cu, $( OBJDIR) /%.o, $( filter %.cu, $( CNTKLIBRARY_SRC) ) ) \
$( patsubst %.pb.cc, $( OBJDIR) /%.pb.o, $( filter %.pb.cc, $( CNTKLIBRARY_SRC) ) ) \
$( patsubst %.cpp, $( OBJDIR) /%.o, $( filter %.cpp, $( CNTKLIBRARY_SRC) ) )
2016-06-11 22:21:15 +03:00
CNTKLIBRARY_LIB := $( LIBDIR) /lib$( CNTKLIBRARY) .so
2016-10-06 13:19:13 +03:00
ALL_LIBS += $( CNTKLIBRARY_LIB)
2017-02-08 22:37:48 +03:00
PYTHON_LIBS += $( CNTKLIBRARY_LIB)
2017-01-13 00:24:35 +03:00
JAVA_LIBS += $( CNTKLIBRARY_LIB)
2016-06-11 22:21:15 +03:00
SRC += $( CNTKLIBRARY_SRC)
$(CNTKLIBRARY_LIB) : $( CNTKLIBRARY_OBJ ) | $( CNTKMATH_LIB )
@echo $( SEPARATOR)
@mkdir -p $( dir $@ )
2016-10-25 01:33:31 +03:00
@echo building $@ for $( ARCH) with build type $( BUILDTYPE)
2017-11-17 05:11:33 +03:00
$( CXX) $( LDFLAGS) -shared $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -l$( CNTKMATH) $( PROTOBUF_PATH) /lib/libprotobuf.a -ldl -fopenmp
2017-09-06 07:08:01 +03:00
2017-04-12 11:40:18 +03:00
########################################
# C++ extensibility examples library
########################################
CPP_EXTENSIBILITY_EXAMPLES_LIBRARY_SRC = \
2017-04-14 02:51:13 +03:00
$( SOURCEDIR) /../Examples/Extensibility/CPPLib/CPPExtensibilityExamplesLibrary.cpp \
2017-04-12 11:40:18 +03:00
CPP_EXTENSIBILITY_EXAMPLES_LIBRARY_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( CPP_EXTENSIBILITY_EXAMPLES_LIBRARY_SRC) )
CPP_EXTENSIBILITY_EXAMPLES_LIB := $( LIBDIR) /Cntk.ExtensibilityExamples-$( CNTK_COMPONENT_VERSION) .so
ALL_LIBS += $( CPP_EXTENSIBILITY_EXAMPLES_LIB)
PYTHON_LIBS += $( CPP_EXTENSIBILITY_EXAMPLES_LIB)
SRC += $( CPP_EXTENSIBILITY_EXAMPLES_LIBRARY_SRC)
$(CPP_EXTENSIBILITY_EXAMPLES_LIB) : $( CPP_EXTENSIBILITY_EXAMPLES_LIBRARY_OBJ ) | $( CNTKLIBRARY_LIB )
@echo $( SEPARATOR)
@echo creating $@ for $( ARCH) with build type $( BUILDTYPE)
@mkdir -p $( dir $@ )
$( CXX) $( LDFLAGS) -shared $( patsubst %,-L%, $( LIBDIR) ) $( patsubst %,$( RPATH) %, $( LIBDIR) $( ORIGINDIR) ) -o $@ $^ -l$( CNTKLIBRARY)
2016-06-11 22:21:15 +03:00
2017-05-26 11:30:39 +03:00
##############################################
2018-01-23 05:12:25 +03:00
# Binary convolution library
2017-05-26 11:30:39 +03:00
##############################################
2018-01-27 04:17:58 +03:00
i f d e f H A L I D E _ P A T H
2018-01-23 05:12:25 +03:00
INCLUDEPATH += $( HALIDE_PATH) /include
BINARY_CONVOLUTION_LIBRARY_SRC = \
$( SOURCEDIR) /Extensibility/BinaryConvolutionLib/BinaryConvolutionLib.cpp \
2017-05-26 11:30:39 +03:00
2018-01-23 05:12:25 +03:00
BINARY_CONVOLUTION_LIBRARY_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( BINARY_CONVOLUTION_LIBRARY_SRC) )
2017-05-26 11:30:39 +03:00
2018-01-23 05:12:25 +03:00
BINARY_CONVOLUTION_LIB := $( LIBDIR) /Cntk.BinaryConvolution-$( CNTK_COMPONENT_VERSION) .so
ALL_LIBS += $( BINARY_CONVOLUTION_LIB)
PYTHON_LIBS += $( BINARY_CONVOLUTION_LIB)
SRC += $( BINARY_CONVOLUTION_LIBRARY_SRC)
2017-05-26 11:30:39 +03:00
2018-01-23 05:12:25 +03:00
$(BINARY_CONVOLUTION_LIB) : $( BINARY_CONVOLUTION_LIBRARY_OBJ ) | $( CNTKLIBRARY_LIB )
2017-05-26 11:30:39 +03:00
@echo $( SEPARATOR)
@echo creating $@ for $( ARCH) with build type $( BUILDTYPE)
@mkdir -p $( dir $@ )
2018-01-23 05:12:25 +03:00
$( CXX) $( LDFLAGS) -shared $( patsubst %,-L%, $( LIBDIR) ) $( patsubst %,$( RPATH) %, $( LIBDIR) $( ORIGINDIR) ) -o $@ $^ -l$( CNTKLIBRARY) $( HALIDE_PATH) /bin/libHalide.so
e n d i f
2017-05-26 11:30:39 +03:00
2017-08-07 19:34:57 +03:00
##############################################
# Native implementation of the Proposal Layer
##############################################
PROPOSAL_LAYER_LIBRARY_SRC = \
$( SOURCEDIR) /../Examples/Extensibility/ProposalLayer/ProposalLayerLib/ProposalLayerLib.cpp \
PROPOSAL_LAYER_LIBRARY_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( PROPOSAL_LAYER_LIBRARY_SRC) )
PROPOSAL_LAYER_LIB := $( LIBDIR) /Cntk.ProposalLayerLib-$( CNTK_COMPONENT_VERSION) .so
ALL_LIBS += $( PROPOSAL_LAYER_LIB)
PYTHON_LIBS += $( PROPOSAL_LAYER_LIB)
SRC += $( PROPOSAL_LAYER_LIBRARY_SRC)
$(PROPOSAL_LAYER_LIB) : $( PROPOSAL_LAYER_LIBRARY_OBJ ) | $( CNTKLIBRARY_LIB )
@echo $( SEPARATOR)
@echo creating $@ for $( ARCH) with build type $( BUILDTYPE)
@mkdir -p $( dir $@ )
2018-01-23 05:12:25 +03:00
$( CXX) $( LDFLAGS) -shared $( CXXFLAGS) $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) ) $( patsubst %,$( RPATH) %, $( LIBDIR) $( LIBPATH) $( ORIGINDIR) ) -o $@ $^ -l$( CNTKLIBRARY)
2017-08-07 19:34:57 +03:00
2016-06-20 14:51:38 +03:00
########################################
# LibEval
########################################
2017-03-20 17:26:29 +03:00
EVAL := Cntk.Eval-$( CNTK_COMPONENT_VERSION)
2016-06-27 12:43:56 +03:00
SGDLIB_SRC = \
2016-11-01 16:59:10 +03:00
$( SOURCEDIR) /SGDLib/ASGDHelper.cpp \
2016-06-27 12:43:56 +03:00
$( SOURCEDIR) /SGDLib/Profiler.cpp \
2016-09-09 11:41:14 +03:00
$( SOURCEDIR) /SGDLib/SGD.cpp \
$( SOURCEDIR) /SGDLib/PostComputingActions.cpp \
2016-10-25 01:33:31 +03:00
2016-10-17 15:07:22 +03:00
SGDLIB_SRC += $( CNTKLIBRARY_COMMON_SRC)
2016-06-27 12:43:56 +03:00
EVAL_SRC = \
2016-06-20 14:51:38 +03:00
$( SOURCEDIR) /EvalDll/CNTKEval.cpp \
$( SOURCEDIR) /CNTK/BrainScript/BrainScriptEvaluator.cpp \
$( SOURCEDIR) /CNTK/BrainScript/BrainScriptParser.cpp \
$( SOURCEDIR) /CNTK/ModelEditLanguage.cpp \
$( SOURCEDIR) /ActionsLib/EvalActions.cpp \
$( SOURCEDIR) /ActionsLib/NetworkFactory.cpp \
$( SOURCEDIR) /ActionsLib/NetworkDescriptionLanguage.cpp \
$( SOURCEDIR) /ActionsLib/SimpleNetworkBuilder.cpp \
2016-06-30 15:57:16 +03:00
$( SOURCEDIR) /ActionsLib/NDLNetworkBuilder.cpp \
2016-06-20 14:51:38 +03:00
2016-06-27 12:43:56 +03:00
EVAL_SRC += $( SGDLIB_SRC)
EVAL_SRC += $( COMPUTATION_NETWORK_LIB_SRC)
EVAL_SRC += $( CNTK_COMMON_SRC)
EVAL_SRC += $( SEQUENCE_TRAINING_LIB_SRC)
2016-10-25 02:07:42 +03:00
EVAL_OBJ := \
$( patsubst %.cu, $( OBJDIR) /%.o, $( filter %.cu, $( EVAL_SRC) ) ) \
$( patsubst %.pb.cc, $( OBJDIR) /%.pb.o, $( filter %.pb.cc, $( EVAL_SRC) ) ) \
$( patsubst %.cpp, $( OBJDIR) /%.o, $( filter %.cpp, $( EVAL_SRC) ) )
2016-06-20 14:51:38 +03:00
2016-06-27 12:43:56 +03:00
EVAL_LIB := $( LIBDIR) /lib$( EVAL) .so
2016-10-06 13:19:13 +03:00
ALL_LIBS += $( EVAL_LIB)
2016-06-27 12:43:56 +03:00
SRC += $( EVAL_SRC)
2016-06-20 14:51:38 +03:00
2016-07-29 19:58:49 +03:00
$(EVAL_LIB) : $( EVAL_OBJ ) | $( CNTKMATH_LIB )
2016-06-20 14:51:38 +03:00
@echo $( SEPARATOR)
@mkdir -p $( dir $@ )
2016-06-27 12:43:56 +03:00
@echo Building $( EVAL_LIB) for $( ARCH) with build type $( BUILDTYPE)
2017-04-12 11:40:18 +03:00
$( CXX) $( LDFLAGS) -shared $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -l$( CNTKMATH) -ldl $( lMULTIVERSO) $( PROTOBUF_PATH) /lib/libprotobuf.a
2016-06-20 14:51:38 +03:00
########################################
2016-11-07 09:53:06 +03:00
# Eval Sample clients
2016-06-20 14:51:38 +03:00
########################################
2016-11-07 09:53:06 +03:00
EVAL_CLIENT := $( BINDIR) /cppevalclient
2016-06-27 12:43:56 +03:00
2016-11-07 09:53:06 +03:00
EVAL_CLIENT_SRC = \
2017-07-18 19:25:03 +03:00
$( SOURCEDIR) /../Examples/Evaluation/LegacyEvalDll/CPPEvalClient/CPPEvalClient.cpp
2016-06-20 14:51:38 +03:00
2016-11-07 09:53:06 +03:00
EVAL_CLIENT_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( EVAL_CLIENT_SRC) )
2016-06-20 14:51:38 +03:00
2016-11-07 09:53:06 +03:00
ALL += $( EVAL_CLIENT)
SRC += $( EVAL_CLIENT_SRC)
2016-06-20 14:51:38 +03:00
2016-11-09 01:15:55 +03:00
$(EVAL_CLIENT) : $( EVAL_CLIENT_OBJ ) | $( EVAL_LIB ) $( READER_LIBS )
2016-06-20 14:51:38 +03:00
@echo $( SEPARATOR)
@mkdir -p $( dir $@ )
2016-11-07 09:53:06 +03:00
@echo building $( EVAL_CLIENT) for $( ARCH) with build type $( BUILDTYPE)
2017-11-17 05:11:33 +03:00
$( CXX) $( LDFLAGS) $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINLIBDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -l$( EVAL) $( L_READER_LIBS) $( lMULTIVERSO)
2016-06-11 22:21:15 +03:00
2016-11-07 09:53:06 +03:00
EVAL_EXTENDED_CLIENT := $( BINDIR) /cppevalextendedclient
2016-06-11 22:21:15 +03:00
2016-11-07 09:53:06 +03:00
EVAL_EXTENDED_CLIENT_SRC = \
2017-07-18 19:25:03 +03:00
$( SOURCEDIR) /../Examples/Evaluation/LegacyEvalDll/CPPEvalExtendedClient/CPPEvalExtendedClient.cpp
2016-11-07 09:53:06 +03:00
EVAL_EXTENDED_CLIENT_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( EVAL_EXTENDED_CLIENT_SRC) )
ALL += $( EVAL_EXTENDED_CLIENT)
SRC += $( EVAL_EXTENDED_CLIENT_SRC)
2016-11-09 01:15:55 +03:00
$(EVAL_EXTENDED_CLIENT) : $( EVAL_EXTENDED_CLIENT_OBJ ) | $( EVAL_LIB ) $( READER_LIBS )
2016-11-07 09:53:06 +03:00
@echo $( SEPARATOR)
@mkdir -p $( dir $@ )
@echo building $( EVAL_EXTENDED_CLIENT) for $( ARCH) with build type $( BUILDTYPE)
2017-11-17 05:11:33 +03:00
$( CXX) $( LDFLAGS) $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINLIBDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -l$( EVAL) $( L_READER_LIBS) $( lMULTIVERSO)
2016-10-20 21:19:28 +03:00
########################################
# Eval V2 Sample client
########################################
2017-01-23 19:10:25 +03:00
CNTKLIBRARY_CPP_EVAL_EXAMPLES := $( BINDIR) /CNTKLibraryCPPEvalExamples
2016-10-20 21:19:28 +03:00
2018-01-12 11:34:03 +03:00
i f d e f C U D A _ P A T H
2017-01-23 19:10:25 +03:00
CNTKLIBRARY_CPP_EVAL_EXAMPLES_SRC = \
2017-02-15 15:11:34 +03:00
$( SOURCEDIR) /../Examples/Evaluation/CNTKLibraryCPPEvalGPUExamples/CNTKLibraryCPPEvalGPUExamples.cpp\
2017-08-04 15:20:04 +03:00
$( SOURCEDIR) /../Examples/Evaluation/CNTKLibraryCPPEvalCPUOnlyExamples/CNTKLibraryCPPEvalExamples.cpp
2017-02-14 21:37:51 +03:00
2018-01-12 11:34:03 +03:00
e l s e
2017-02-14 21:37:51 +03:00
CNTKLIBRARY_CPP_EVAL_EXAMPLES_SRC = \
2017-02-15 15:11:34 +03:00
$( SOURCEDIR) /../Examples/Evaluation/CNTKLibraryCPPEvalCPUOnlyExamples/CNTKLibraryCPPEvalCPUOnlyExamples.cpp\
2017-08-04 15:20:04 +03:00
$( SOURCEDIR) /../Examples/Evaluation/CNTKLibraryCPPEvalCPUOnlyExamples/CNTKLibraryCPPEvalExamples.cpp
2018-01-12 11:34:03 +03:00
e n d i f
2016-10-20 21:19:28 +03:00
2017-01-23 19:10:25 +03:00
CNTKLIBRARY_CPP_EVAL_EXAMPLES_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( CNTKLIBRARY_CPP_EVAL_EXAMPLES_SRC) )
2016-10-20 21:19:28 +03:00
2017-01-23 19:10:25 +03:00
ALL += $( CNTKLIBRARY_CPP_EVAL_EXAMPLES)
SRC += $( CNTKLIBRARY_CPP_EVAL_EXAMPLES_SRC)
2016-10-20 21:19:28 +03:00
2017-01-23 19:10:25 +03:00
$(CNTKLIBRARY_CPP_EVAL_EXAMPLES) : $( CNTKLIBRARY_CPP_EVAL_EXAMPLES_OBJ ) | $( CNTKLIBRARY_LIB ) $( READER_LIBS )
2016-10-20 21:19:28 +03:00
@echo $( SEPARATOR)
@mkdir -p $( dir $@ )
2017-01-23 19:10:25 +03:00
@echo building $( CNTKLIBRARY_CPP_EVAL_EXAMPLES) for $( ARCH) with build type $( BUILDTYPE)
2016-11-09 01:15:55 +03:00
$( CXX) $( LDFLAGS) $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINLIBDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -l$( CNTKLIBRARY) $( L_READER_LIBS)
2017-02-14 21:37:51 +03:00
########################################
CNTK support for CUDA 9
CNTK now supports CUDA 9/cuDNN 7. This requires an update to build environment to Ubuntu 16/GCC 5 for Linux, and Visual Studio 2017/VCTools 14.11 for Windows. With CUDA 9, CNTK also added a preview for 16-bit floating point (a.k.a FP16) computation.
Please check out the example of FP16 in ResNet50 at /Examples/Image/Classification/ResNet/Python/TrainResNet_ImageNet_Distributed.py
Notes on FP16 preview:
* FP16 implementation on CPU is not optimized, and it's not supposed to be used in CPU inference directly. User needs to convert the model to 32-bit floating point before running on CPU.
* Loss/Criterion for FP16 training needs to be 32bit for accumulation without overflow, using cast function. Please check the example above.
* Readers do not have FP16 output unless using numpy to feed data, cast from FP32 to FP16 is needed. Please check the example above.
* FP16 gradient aggregation is currently only implemented on GPU using NCCL2. Distributed training with FP16 with MPI is not supported.
* FP16 math is a subset of current FP32 implementation. Some model may get Feature Not Implemented exception using FP16.
* FP16 is currently not supported in BrainScript. Please use Python for FP16.
To setup build and runtime environment on Windows:
* Install [Visual Studio 2017](https://www.visualstudio.com/downloads/) with following workloads and components. From command line (use Community version installer as example):
vs_community.exe --add Microsoft.VisualStudio.Workload.NativeDesktop --add Microsoft.VisualStudio.Workload.ManagedDesktop --add Microsoft.VisualStudio.Workload.Universal --add Microsoft.Component.PythonTools --add Microsoft.VisualStudio.Component.VC.Tools.14.11
* Install [NVidia CUDA 9](https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64)
* From PowerShell, run:
/Tools/devInstall/Windows/DevInstall.ps1
* Start VCTools 14.11 command line, run:
cmd /k "%VS2017INSTALLDIR%\VC\Auxiliary\Build\vcvarsall.bat" x64 --vcvars_ver=14.11
* Open /CNTK.sln from the VCTools 14.11 command line. Note that starting CNTK.sln other than VCTools 14.11 command line, would causes CUDA 9 [build error](https://developercommunity.visualstudio.com/content/problem/163758/vs-2017-155-doesnt-support-cuda-9.html).
To setup build and runtime environment on Linux using docker, please build Unbuntu 16.04 docker image using Dockerfiles /Tools/docker. For other Linux systems, please refer to the Dockerfiles to setup dependent libraries for CNTK.
2018-01-23 03:58:56 +03:00
# Eval V2 Sample test
2017-02-14 21:37:51 +03:00
########################################
CNTKLIBRARY_CPP_EVAL_TEST := $( BINDIR) /CNTKLibraryCPPEvalExamplesTest
CNTKLIBRARY_CPP_EVAL_TEST_SRC = \
2017-08-04 15:20:04 +03:00
$( SOURCEDIR) /../Examples/Evaluation/CNTKLibraryCPPEvalCPUOnlyExamples/CNTKLibraryCPPEvalExamples.cpp\
2017-02-15 15:11:34 +03:00
$( SOURCEDIR) /../Tests/EndToEndTests/EvalClientTests/CNTKLibraryCPPEvalExamplesTest/CNTKLibraryCPPEvalExamplesTest.cpp\
2017-08-04 15:20:04 +03:00
$( SOURCEDIR) /../Tests/EndToEndTests/EvalClientTests/CNTKLibraryCPPEvalExamplesTest/EvalMultithreads.cpp\
2017-02-14 21:37:51 +03:00
$( SOURCEDIR) /../Tests/EndToEndTests/CNTKv2Library/Common/Common.cpp
CNTKLIBRARY_CPP_EVAL_TEST_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( CNTKLIBRARY_CPP_EVAL_TEST_SRC) )
ALL += $( CNTKLIBRARY_CPP_EVAL_TEST)
SRC += $( CNTKLIBRARY_CPP_EVAL_TEST_SRC)
$(CNTKLIBRARY_CPP_EVAL_TEST) : $( CNTKLIBRARY_CPP_EVAL_TEST_OBJ ) | $( CNTKLIBRARY_LIB ) $( READER_LIBS )
@mkdir -p $( dir $@ )
@echo building $( CNTKLIBRARY_CPP_EVAL_TEST) for $( ARCH) with build type $( BUILDTYPE)
$( CXX) $( LDFLAGS) $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINLIBDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) -l$( CNTKLIBRARY) $( L_READER_LIBS)
2016-10-20 21:19:28 +03:00
2015-09-02 18:43:31 +03:00
########################################
# HTKMLFReader plugin
########################################
HTKMLFREADER_SRC = \
2016-02-29 06:01:07 +03:00
$( SOURCEDIR) /Readers/HTKMLFReader/Exports.cpp \
$( SOURCEDIR) /Readers/HTKMLFReader/DataWriterLocal.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /Readers/HTKMLFReader/HTKMLFReader.cpp \
$( SOURCEDIR) /Readers/HTKMLFReader/HTKMLFWriter.cpp \
2015-09-02 18:43:31 +03:00
2015-12-15 11:58:24 +03:00
HTKMLFREADER_OBJ := $( patsubst %.cpp, $( OBJDIR) /%.o, $( HTKMLFREADER_SRC) )
2015-09-02 18:43:31 +03:00
2017-03-20 17:26:29 +03:00
HTKMLFREADER := $( LIBDIR) /Cntk.Reader.HTKMLF-$( CNTK_COMPONENT_VERSION) .so
2016-10-06 13:19:13 +03:00
ALL_LIBS += $( HTKMLFREADER)
2015-12-15 11:58:24 +03:00
SRC += $( HTKMLFREADER_SRC)
2015-09-02 18:43:31 +03:00
2017-03-20 17:26:29 +03:00
$(HTKMLFREADER) : $( HTKMLFREADER_OBJ ) | $( CNTKMATH_LIB )
2015-09-02 18:43:31 +03:00
@echo $( SEPARATOR)
$( CXX) $( LDFLAGS) -shared $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) ) $( patsubst %,$( RPATH) %, $( ORIGINDIR) $( LIBPATH) ) -o $@ $^ -l$( CNTKMATH)
2016-04-28 16:50:37 +03:00
########################################
# CompositeDataReader plugin
########################################

COMPOSITEDATAREADER_SRC = \
	$(SOURCEDIR)/Readers/CompositeDataReader/CompositeDataReader.cpp \
	$(SOURCEDIR)/Readers/CompositeDataReader/Exports.cpp \

COMPOSITEDATAREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(COMPOSITEDATAREADER_SRC))

COMPOSITEDATAREADER := $(LIBDIR)/Cntk.Composite-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(COMPOSITEDATAREADER)
PYTHON_LIBS += $(COMPOSITEDATAREADER)
SRC += $(COMPOSITEDATAREADER_SRC)

$(COMPOSITEDATAREADER): $(COMPOSITEDATAREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# HTKDeserializers plugin
########################################

HTKDESERIALIZERS_SRC = \
	$(SOURCEDIR)/Readers/HTKMLFReader/DataWriterLocal.cpp \
	$(SOURCEDIR)/Readers/HTKMLFReader/HTKMLFWriter.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/ConfigHelper.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/Exports.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/HTKDeserializer.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/LatticeDeserializer.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/LatticeIndexBuilder.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/HTKMLFReader.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/MLFDeserializer.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/MLFIndexBuilder.cpp \
	$(SOURCEDIR)/Readers/HTKDeserializers/MLFUtils.cpp \

HTKDESERIALIZERS_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(HTKDESERIALIZERS_SRC))

HTKDESERIALIZERS := $(LIBDIR)/Cntk.Deserializers.HTK-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(HTKDESERIALIZERS)
PYTHON_LIBS += $(HTKDESERIALIZERS)
SRC += $(HTKDESERIALIZERS_SRC)

$(HTKDESERIALIZERS): $(HTKDESERIALIZERS_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# LMSequenceReader plugin
########################################

LMSEQUENCEREADER_SRC = \
	$(SOURCEDIR)/Readers/LMSequenceReader/Exports.cpp \
	$(SOURCEDIR)/Readers/LMSequenceReader/SequenceParser.cpp \
	$(SOURCEDIR)/Readers/LMSequenceReader/SequenceReader.cpp \
	$(SOURCEDIR)/Readers/LMSequenceReader/SequenceWriter.cpp \

LMSEQUENCEREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(LMSEQUENCEREADER_SRC))

LMSEQUENCEREADER := $(LIBDIR)/Cntk.Reader.LMSequence-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(LMSEQUENCEREADER)
SRC += $(LMSEQUENCEREADER_SRC)

$(LMSEQUENCEREADER): $(LMSEQUENCEREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# LUSequenceReader plugin
########################################

LUSEQUENCEREADER_SRC = \
	$(SOURCEDIR)/Readers/LUSequenceReader/Exports.cpp \
	$(SOURCEDIR)/Readers/LUSequenceReader/DataWriterLocal.cpp \
	$(SOURCEDIR)/Readers/LUSequenceReader/LUSequenceParser.cpp \
	$(SOURCEDIR)/Readers/LUSequenceReader/LUSequenceReader.cpp \
	$(SOURCEDIR)/Readers/LUSequenceReader/LUSequenceWriter.cpp \

LUSEQUENCEREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(LUSEQUENCEREADER_SRC))

LUSEQUENCEREADER := $(LIBDIR)/Cntk.Reader.LUSequence-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(LUSEQUENCEREADER)
SRC += $(LUSEQUENCEREADER_SRC)

$(LUSEQUENCEREADER): $(LUSEQUENCEREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# UCIFastReader plugin
########################################

UCIFASTREADER_SRC = \
	$(SOURCEDIR)/Readers/UCIFastReader/Exports.cpp \
	$(SOURCEDIR)/Readers/UCIFastReader/UCIFastReader.cpp \
	$(SOURCEDIR)/Readers/UCIFastReader/UCIParser.cpp \

UCIFASTREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(UCIFASTREADER_SRC))

UCIFASTREADER := $(LIBDIR)/Cntk.Reader.UCIFast-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(UCIFASTREADER)
SRC += $(UCIFASTREADER_SRC)

$(UCIFASTREADER): $(UCIFASTREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# LibSVMBinaryReader plugin
########################################

LIBSVMBINARYREADER_SRC = \
	$(SOURCEDIR)/Readers/LibSVMBinaryReader/Exports.cpp \
	$(SOURCEDIR)/Readers/LibSVMBinaryReader/LibSVMBinaryReader.cpp \

LIBSVMBINARYREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(LIBSVMBINARYREADER_SRC))

LIBSVMBINARYREADER := $(LIBDIR)/Cntk.Reader.SVMBinary-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(LIBSVMBINARYREADER)
SRC += $(LIBSVMBINARYREADER_SRC)

$(LIBSVMBINARYREADER): $(LIBSVMBINARYREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# SparsePCReader plugin
########################################

SPARSEPCREADER_SRC = \
	$(SOURCEDIR)/Readers/SparsePCReader/Exports.cpp \
	$(SOURCEDIR)/Readers/SparsePCReader/SparsePCReader.cpp \

SPARSEPCREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(SPARSEPCREADER_SRC))

SPARSEPCREADER := $(LIBDIR)/Cntk.Reader.SparsePC-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(SPARSEPCREADER)
SRC += $(SPARSEPCREADER_SRC)

$(SPARSEPCREADER): $(SPARSEPCREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# CNTKBinaryReader plugin
########################################

CNTKBINARYREADER_SRC = \
	$(SOURCEDIR)/Readers/CNTKBinaryReader/Exports.cpp \
	$(SOURCEDIR)/Readers/CNTKBinaryReader/BinaryChunkDeserializer.cpp \
	$(SOURCEDIR)/Readers/CNTKBinaryReader/BinaryConfigHelper.cpp \
	$(SOURCEDIR)/Readers/CNTKBinaryReader/CNTKBinaryReader.cpp \

CNTKBINARYREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(CNTKBINARYREADER_SRC))

CNTKBINARYREADER := $(LIBDIR)/Cntk.Deserializers.Binary-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(CNTKBINARYREADER)
PYTHON_LIBS += $(CNTKBINARYREADER)
SRC += $(CNTKBINARYREADER_SRC)

$(CNTKBINARYREADER): $(CNTKBINARYREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)

########################################
# CNTKTextFormatReader plugin
########################################

CNTKTEXTFORMATREADER_SRC = \
	$(SOURCEDIR)/Readers/CNTKTextFormatReader/Exports.cpp \
	$(SOURCEDIR)/Readers/CNTKTextFormatReader/TextParser.cpp \
	$(SOURCEDIR)/Readers/CNTKTextFormatReader/CNTKTextFormatReader.cpp \
	$(SOURCEDIR)/Readers/CNTKTextFormatReader/TextConfigHelper.cpp \

CNTKTEXTFORMATREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(CNTKTEXTFORMATREADER_SRC))

CNTKTEXTFORMATREADER := $(LIBDIR)/Cntk.Deserializers.TextFormat-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(CNTKTEXTFORMATREADER)
PYTHON_LIBS += $(CNTKTEXTFORMATREADER)
SRC += $(CNTKTEXTFORMATREADER_SRC)

$(CNTKTEXTFORMATREADER): $(CNTKTEXTFORMATREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH)
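
# Every reader plugin above repeats the same four steps: map sources to
# objects, name the .so, register it, and link against the math library.
# A hypothetical define/eval template (NOT used by this build, shown only
# to document the pattern) could collapse that boilerplate:
#
#   define make-reader-plugin  # $(1)=variable prefix, $(2)=library base name
#   $(1)_OBJ := $$(patsubst %.cpp, $$(OBJDIR)/%.o, $$($(1)_SRC))
#   $(1)_LIB := $$(LIBDIR)/Cntk.Reader.$(2)-$$(CNTK_COMPONENT_VERSION).so
#   ALL_LIBS += $$($(1)_LIB)
#   SRC += $$($(1)_SRC)
#   $$($(1)_LIB): $$($(1)_OBJ) | $$(CNTKMATH_LIB)
#   	@echo $$(SEPARATOR)
#   	$$(CXX) $$(LDFLAGS) -shared $$(patsubst %,-L%, $$(LIBDIR) $$(LIBPATH)) $$(patsubst %,$$(RPATH)%, $$(ORIGINDIR) $$(LIBPATH)) -o $$@ $$^ -l$$(CNTKMATH)
#   endef
#
#   # e.g.: $(eval $(call make-reader-plugin,UCIFASTREADER,UCIFast))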

########################################
# Kaldi plugins
########################################

ifdef KALDI_PATH

KALDI2READER_SRC = \
	$(SOURCEDIR)/Readers/Kaldi2Reader/DataReader.cpp \
	$(SOURCEDIR)/Readers/Kaldi2Reader/DataWriter.cpp \
	$(SOURCEDIR)/Readers/Kaldi2Reader/HTKMLFReader.cpp \
	$(SOURCEDIR)/Readers/Kaldi2Reader/HTKMLFWriter.cpp \
	$(SOURCEDIR)/Readers/Kaldi2Reader/KaldiSequenceTrainingDerivative.cpp \
	$(SOURCEDIR)/Readers/Kaldi2Reader/UtteranceDerivativeBuffer.cpp \

KALDI2READER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(KALDI2READER_SRC))

KALDI2READER := $(LIBDIR)/Cntk.Reader.Kaldi2-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(KALDI2READER)
SRC += $(KALDI2READER_SRC)

$(KALDI2READER): $(KALDI2READER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(KALDI_LIBPATH) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(KALDI_LIBPATH) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH) $(KALDI_LIBS)

endif

########################################
# ImageReader plugin
########################################

ifdef OPENCV_PATH
ifdef BOOST_PATH

INCLUDEPATH += $(BOOST_PATH)/include

IMAGEREADER_LIBS_LIST := opencv_core opencv_imgproc opencv_imgcodecs

ifdef LIBZIP_PATH
CPPFLAGS += -DUSE_ZIP
# Both directories are needed for building libzip
INCLUDEPATH += $(LIBZIP_PATH)/include $(LIBZIP_PATH)/lib/libzip/include
LIBPATH += $(LIBZIP_PATH)/lib
IMAGEREADER_LIBS_LIST += zip
endif

IMAGEREADER_LIBS := $(addprefix -l,$(IMAGEREADER_LIBS_LIST))

IMAGEREADER_SRC = \
	$(SOURCEDIR)/Readers/ImageReader/Base64ImageDeserializer.cpp \
	$(SOURCEDIR)/Readers/ImageReader/ImageDeserializerBase.cpp \
	$(SOURCEDIR)/Readers/ImageReader/Exports.cpp \
	$(SOURCEDIR)/Readers/ImageReader/ImageConfigHelper.cpp \
	$(SOURCEDIR)/Readers/ImageReader/ImageDataDeserializer.cpp \
	$(SOURCEDIR)/Readers/ImageReader/ImageTransformers.cpp \
	$(SOURCEDIR)/Readers/ImageReader/ImageReader.cpp \
	$(SOURCEDIR)/Readers/ImageReader/ZipByteReader.cpp \

IMAGEREADER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(IMAGEREADER_SRC))

IMAGEREADER := $(LIBDIR)/Cntk.Deserializers.Image-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(IMAGEREADER)
PYTHON_LIBS += $(IMAGEREADER)
JAVA_LOAD_DEPS += $(IMAGEREADER_LIBS)
SRC += $(IMAGEREADER_SRC)

INCLUDEPATH += $(OPENCV_PATH)/include
LIBPATH += $(OPENCV_PATH)/lib $(OPENCV_PATH)/release/lib

$(IMAGEREADER): $(IMAGEREADER_OBJ) | $(CNTKMATH_LIB)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ -l$(CNTKMATH) $(IMAGEREADER_LIBS)

endif # BOOST_PATH
endif # OPENCV_PATH

########################################
# ImageWriter plugin
########################################

ifdef OPENCV_PATH

IMAGEWRITER_LIBS_LIST := opencv_core opencv_imgproc opencv_imgcodecs
IMAGEWRITER_LIBS := $(addprefix -l,$(IMAGEWRITER_LIBS_LIST))

IMAGEWRITER_SRC = \
	$(SOURCEDIR)/ImageWriterDll/ImageWriter.cpp \

IMAGEWRITER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(IMAGEWRITER_SRC))

IMAGEWRITER := $(LIBDIR)/Cntk.ImageWriter-$(CNTK_COMPONENT_VERSION).so
ALL_LIBS += $(IMAGEWRITER)
PYTHON_LIBS += $(IMAGEWRITER)
JAVA_LOAD_DEPS += $(IMAGEWRITER_LIBS)
SRC += $(IMAGEWRITER_SRC)

INCLUDEPATH += $(OPENCV_PATH)/include
LIBPATH += $(OPENCV_PATH)/lib $(OPENCV_PATH)/release/lib

$(IMAGEWRITER): $(IMAGEWRITER_OBJ)
	@echo $(SEPARATOR)
	$(CXX) $(LDFLAGS) -shared $(patsubst %,-L%, $(LIBDIR) $(LIBPATH)) $(patsubst %,$(RPATH)%, $(ORIGINDIR) $(LIBPATH)) -o $@ $^ $(IMAGEWRITER_LIBS)

endif

########################################
# 1bit SGD setup
########################################

ifeq ("$(CNTK_ENABLE_1BitSGD)","true")

ifeq (,$(wildcard Source/1BitSGD/*.h))
  $(error Build with 1bit-SGD was requested but cannot find the code. Please check https://docs.microsoft.com/en-us/cognitive-toolkit/Enabling-1bit-SGD for instructions)
endif
INCLUDEPATH += $(SOURCEDIR)/1BitSGD

COMMON_FLAGS += -DCNTK_PARALLEL_TRAINING_SUPPORT
# temporarily adding to 1bit, need to work with others to fix it

endif
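
# As described in the header comment, all of the settings referenced so far
# come from a Config.make in the build directory. A minimal illustrative
# Config.make might look like the following -- every path below is an
# example value, not a requirement of this Makefile:
#
#   BUILDTYPE = release
#   MATHLIB = mkl
#   MKL_PATH = /usr/local/mklml
#   CUDA_PATH = /usr/local/cuda
#   CUB_PATH = /usr/local/cub-1.4.1
#   CUDNN_PATH = /usr/local/cudnn
#   OPENCV_PATH = /usr/local/opencv-3.1.0
#   PROTOBUF_PATH = /usr/local/protobuf-3.1.0
#   BOOST_PATH = /usr/local/boost
#   CNTK_ENABLE_1BitSGD = true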
########################################
# ASGD (Multiverso) setup
########################################

ifeq ("$(CNTK_ENABLE_ASGD)","true")

ifeq (,$(wildcard Source/Multiverso/include/multiverso/*.h))
  $(error Build with Multiverso was requested but cannot find the code. Please check https://docs.microsoft.com/en-us/cognitive-toolkit/Multiple-GPUs-and-machines#8-data-parallel-training-with-parameter-server to learn more.)
endif

lMULTIVERSO := -lmultiverso

INCLUDEPATH += $(SOURCEDIR)/Multiverso/include
COMMON_FLAGS += -DASGD_PARALLEL_SUPPORT

MULTIVERSO_LIB := $(LIBDIR)/libmultiverso.so
ALL_LIBS += $(MULTIVERSO_LIB)

ifeq ("$(BUILDTYPE)","release")
MULTIVERSO_CMAKE_BUILDTYPE = Release
endif
ifeq ("$(BUILDTYPE)","debug")
MULTIVERSO_CMAKE_BUILDTYPE = Debug
endif

# TODO: need to align Multiverso's OpenMP with the one we use (libiomp). For now, it is disabled.
$(MULTIVERSO_LIB):
	@echo "Build Multiverso lib"
	@mkdir -p $(LIBDIR)
	@mkdir -p $(BINDIR)
	@mkdir -p $(SOURCEDIR)/Multiverso/build/$(BUILDTYPE)
	@cmake -DCMAKE_VERBOSE_MAKEFILE=TRUE \
		-DCMAKE_CXX_COMPILER=$(CXX) \
		-DOpenMP_CXX_FLAGS="" \
		-DOpenMP_C_FLAGS="" \
		-DBoost_NO_BOOST_CMAKE=TRUE \
		-DBoost_NO_SYSTEM_PATHS=TRUE \
		-DBOOST_ROOT:PATHNAME=$(BOOST_PATH) \
		-DBOOST_LIBRARY_DIRS:FILEPATH=$(BOOST_PATH) \
		-DLIBRARY_OUTPUT_PATH=$(shell readlink -f $(LIBDIR)) \
		-DEXECUTABLE_OUTPUT_PATH=$(shell readlink -f $(BINDIR)) \
		-DCMAKE_BUILD_TYPE=$(MULTIVERSO_CMAKE_BUILDTYPE) \
		-B./Source/Multiverso/build/$(BUILDTYPE) -H./Source/Multiverso
	@make VERBOSE=1 -C ./Source/Multiverso/build/$(BUILDTYPE) -j multiverso

UNITTEST_MULTIVERSO_SRC = \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_array.cpp \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_blob.cpp \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_kv.cpp \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_message.cpp \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_multiverso.cpp \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_node.cpp \
	$(SOURCEDIR)/Multiverso/Test/unittests/test_sync.cpp \

UNITTEST_MULTIVERSO_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(UNITTEST_MULTIVERSO_SRC))

UNITTEST_MULTIVERSO := $(BINDIR)/multiversotests
ALL += $(UNITTEST_MULTIVERSO)

$(UNITTEST_MULTIVERSO): $(UNITTEST_MULTIVERSO_OBJ) | $(MULTIVERSO_LIB)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(BOOSTLIB_PATH)) $(patsubst %,$(RPATH)%, $(ORIGINLIBDIR) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(lMULTIVERSO) -ldl

endif
2015-09-02 18:43:31 +03:00
########################################
# cntk
########################################
CNTK_SRC = \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /CNTK/CNTK.cpp \
$( SOURCEDIR) /CNTK/ModelEditLanguage.cpp \
$( SOURCEDIR) /CNTK/tests.cpp \
$( SOURCEDIR) /ActionsLib/TrainActions.cpp \
$( SOURCEDIR) /ActionsLib/EvalActions.cpp \
$( SOURCEDIR) /ActionsLib/OtherActions.cpp \
2016-01-22 20:42:57 +03:00
$( SOURCEDIR) /ActionsLib/SpecialPurposeActions.cpp \
2016-03-22 07:34:41 +03:00
$( SOURCEDIR) /ActionsLib/NetworkFactory.cpp \
2016-03-10 11:19:54 +03:00
$( SOURCEDIR) /ActionsLib/NetworkDescriptionLanguage.cpp \
$( SOURCEDIR) /ActionsLib/SimpleNetworkBuilder.cpp \
2016-03-11 16:31:27 +03:00
$( SOURCEDIR) /ActionsLib/NDLNetworkBuilder.cpp \
2015-12-15 11:58:24 +03:00
$( SOURCEDIR) /CNTK/BrainScript/BrainScriptEvaluator.cpp \
$( SOURCEDIR) /CNTK/BrainScript/BrainScriptParser.cpp \
2015-09-02 18:43:31 +03:00
2016-06-27 12:43:56 +03:00
CNTK_SRC += $( SGDLIB_SRC)
2016-06-11 22:21:15 +03:00
CNTK_SRC += $( CNTK_COMMON_SRC)
CNTK_SRC += $( COMPUTATION_NETWORK_LIB_SRC)
CNTK_SRC += $( SEQUENCE_TRAINING_LIB_SRC)
2015-09-26 03:22:58 +03:00
2016-10-25 02:07:42 +03:00
CNTK_OBJ := \
$( patsubst %.cu, $( OBJDIR) /%.o, $( filter %.cu, $( CNTK_SRC) ) ) \
$( patsubst %.pb.cc, $( OBJDIR) /%.pb.o, $( filter %.pb.cc, $( CNTK_SRC) ) ) \
$( patsubst %.cpp, $( OBJDIR) /%.o, $( filter %.cpp, $( CNTK_SRC) ) )
2015-09-02 18:43:31 +03:00
CNTK := $( BINDIR) /cntk
ALL += $( CNTK)
2016-03-29 10:19:00 +03:00
SRC += $( CNTK_SRC)
2015-09-02 18:43:31 +03:00
2016-11-09 01:15:55 +03:00
$(CNTK) : $( CNTK_OBJ ) | $( READER_LIBS ) $( MULTIVERSO_LIB )
2015-09-02 18:43:31 +03:00
@echo $( SEPARATOR)
@mkdir -p $( dir $@ )
2016-10-25 01:33:31 +03:00
@echo building $@ for $( ARCH) with build type $( BUILDTYPE)
2017-11-17 05:11:33 +03:00
$( CXX) $( LDFLAGS) $( patsubst %,-L%, $( LIBDIR) $( LIBPATH) $( GDK_NVML_LIB_PATH) ) $( patsubst %,$( RPATH) %, $( ORIGINLIBDIR) $( LIBPATH) ) -o $@ $^ $( LIBS) $( L_READER_LIBS) $( lMULTIVERSO) -ldl -fopenmp $( PROTOBUF_PATH) /lib/libprotobuf.a
2015-09-02 18:43:31 +03:00
2016-03-07 04:40:58 +03:00
# deployable resources: standard library of BS
2016-03-06 11:25:39 +03:00
CNTK_CORE_BS := $( BINDIR) /cntk.core.bs
ALL += $( CNTK_CORE_BS)
$(CNTK_CORE_BS) : $( SOURCEDIR ) /CNTK /BrainScript /CNTKCoreLib /CNTK .core .bs
2016-03-07 04:40:58 +03:00
@mkdir -p $( dir $@ )
@echo bin-placing deployable resource files
2016-03-06 08:27:04 +03:00
cp -f $^ $@
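
# The CNTK_OBJ construction above routes each kind of source (.cu, .pb.cc,
# .cpp) through its own suffix mapping via filter/patsubst. A standalone
# sketch of the idiom, with hypothetical file names:
#
#   OBJDIR := build
#   SRCS   := a.cpp b.cu c.pb.cc
#
#   # OBJS expands to: build/b.o build/c.pb.o build/a.o
#   OBJS := \
#   	$(patsubst %.cu, $(OBJDIR)/%.o, $(filter %.cu, $(SRCS))) \
#   	$(patsubst %.pb.cc, $(OBJDIR)/%.pb.o, $(filter %.pb.cc, $(SRCS))) \
#   	$(patsubst %.cpp, $(OBJDIR)/%.o, $(filter %.cpp, $(SRCS)))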

########################################
# V2Library EndToEndTests
########################################

CNTKLIBRARY_END_TO_END_TESTS_PATH = \
	Tests/EndToEndTests/CNTKv2Library

CNTKLIBRARY_END_TO_END_COMMON_SRC_PATH = \
	$(CNTKLIBRARY_END_TO_END_TESTS_PATH)/Common

INCLUDEPATH += $(CNTKLIBRARY_END_TO_END_COMMON_SRC_PATH)

CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH = \
	$(CNTKLIBRARY_END_TO_END_TESTS_PATH)/EndToEndTests

CNTKLIBRARY_END_TO_END_TESTS_SRC = \
	$(CNTKLIBRARY_END_TO_END_COMMON_SRC_PATH)/Common.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/Main.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/CifarResNet.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/MNISTClassifier.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/Seq2Seq.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/SequenceClassification.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/TruncatedLSTMAcousticModel.cpp \
	$(CNTKLIBRARY_END_TO_END_TESTS_SRC_PATH)/FrameMode.cpp \

CNTKLIBRARY_END_TO_END_TESTS := $(BINDIR)/V2LibraryEndToEndTests
CNTKLIBRARY_END_TO_END_TESTS_OBJ := $(patsubst %.cu, $(OBJDIR)/%.o, $(patsubst %.cpp, $(OBJDIR)/%.o, $(CNTKLIBRARY_END_TO_END_TESTS_SRC)))

ALL += $(CNTKLIBRARY_END_TO_END_TESTS)
SRC += $(CNTKLIBRARY_END_TO_END_TESTS_SRC)

$(CNTKLIBRARY_END_TO_END_TESTS): $(CNTKLIBRARY_END_TO_END_TESTS_OBJ) | $(CNTKLIBRARY_LIB) $(READER_LIBS)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(LIBPATH) $(GDK_NVML_LIB_PATH)) $(patsubst %,$(RPATH)%, $(ORIGINLIBDIR) $(LIBPATH)) -o $@ $^ $(LIBS) -l$(CNTKLIBRARY) $(L_READER_LIBS)

########################################
# Unit Tests
########################################

# only build unit tests when Boost is available
ifdef BOOST_PATH

INCLUDEPATH += $(BOOST_PATH)/include
BOOSTLIB_PATH = $(BOOST_PATH)/lib
BOOSTLIBS := -lboost_unit_test_framework -lboost_filesystem -lboost_system

UNITTEST_EVAL_SRC = \
	$(SOURCEDIR)/../Tests/UnitTests/EvalTests/EvalExtendedTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/EvalTests/stdafx.cpp

UNITTEST_EVAL_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(UNITTEST_EVAL_SRC))

UNITTEST_EVAL := $(BINDIR)/evaltests
ALL += $(UNITTEST_EVAL)
SRC += $(UNITTEST_EVAL_SRC)

$(UNITTEST_EVAL): $(UNITTEST_EVAL_OBJ) | $(EVAL_LIB) $(READER_LIBS)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(LIBPATH) $(GDK_NVML_LIB_PATH) $(BOOSTLIB_PATH)) $(patsubst %,$(RPATH)%, $(ORIGINLIBDIR) $(LIBPATH) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(LIBS) -l$(EVAL) $(L_READER_LIBS) $(lMULTIVERSO)

# TODO: create project-specific makefiles or rules to avoid adding project-specific paths to the global path
INCLUDEPATH += $(SOURCEDIR)/Readers/CNTKTextFormatReader

UNITTEST_READER_SRC = \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/CNTKBinaryReaderTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/CNTKTextFormatReaderTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/HTKLMFReaderTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/ImageReaderTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/ReaderLibTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/ReaderUtilTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/ReaderTests/stdafx.cpp \
	$(SOURCEDIR)/Readers/CNTKTextFormatReader/TextParser.cpp \

UNITTEST_READER_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(UNITTEST_READER_SRC))

UNITTEST_READER := $(BINDIR)/readertests
ALL += $(UNITTEST_READER)
SRC += $(UNITTEST_READER_SRC)

$(UNITTEST_READER): $(UNITTEST_READER_OBJ) | $(HTKMLFREADER) $(HTKDESERIALIZERS) $(UCIFASTREADER) $(COMPOSITEDATAREADER) $(IMAGEREADER) $(READER_LIBS)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(BOOSTLIB_PATH)) $(patsubst %,$(RPATH)%, $(ORIGINLIBDIR) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(L_READER_LIBS) -ldl -fopenmp
2016-06-30 15:57:16 +03:00
UNITTEST_NETWORK_SRC = \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/AccumulatorNodeTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/BatchNormalizationTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/CropNodeTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/OperatorEvaluation.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/stdafx.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/TestHelpers.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/NetworkTests/EditDistanceTests.cpp \
	$(SOURCEDIR)/CNTK/ModelEditLanguage.cpp \
	$(SOURCEDIR)/ActionsLib/TrainActions.cpp \
	$(SOURCEDIR)/ActionsLib/EvalActions.cpp \
	$(SOURCEDIR)/ActionsLib/OtherActions.cpp \
	$(SOURCEDIR)/ActionsLib/SpecialPurposeActions.cpp \
	$(SOURCEDIR)/ActionsLib/NetworkFactory.cpp \
	$(SOURCEDIR)/ActionsLib/NetworkDescriptionLanguage.cpp \
	$(SOURCEDIR)/ActionsLib/SimpleNetworkBuilder.cpp \
	$(SOURCEDIR)/ActionsLib/NDLNetworkBuilder.cpp \
	$(SOURCEDIR)/CNTK/BrainScript/BrainScriptEvaluator.cpp \
	$(SOURCEDIR)/CNTK/BrainScript/BrainScriptParser.cpp

UNITTEST_NETWORK_SRC += $(COMPUTATION_NETWORK_LIB_SRC)
UNITTEST_NETWORK_SRC += $(CNTK_COMMON_SRC)
UNITTEST_NETWORK_SRC += $(SEQUENCE_TRAINING_LIB_SRC)
UNITTEST_NETWORK_SRC += $(SGDLIB_SRC)

UNITTEST_NETWORK_OBJ := \
	$(patsubst %.cu, $(OBJDIR)/%.o, $(filter %.cu, $(UNITTEST_NETWORK_SRC))) \
	$(patsubst %.pb.cc, $(OBJDIR)/%.pb.o, $(filter %.pb.cc, $(UNITTEST_NETWORK_SRC))) \
	$(patsubst %.cpp, $(OBJDIR)/%.o, $(filter %.cpp, $(UNITTEST_NETWORK_SRC)))

UNITTEST_NETWORK := $(BINDIR)/networktests

ALL += $(UNITTEST_NETWORK)
SRC += $(UNITTEST_NETWORK_SRC)

$(UNITTEST_NETWORK): $(UNITTEST_NETWORK_OBJ) | $(READER_LIBS) $(CNTKTEXTFORMATREADER) $(MULTIVERSO_LIB)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(LIBPATH) $(GDK_NVML_LIB_PATH) $(BOOSTLIB_PATH)) $(patsubst %, $(RPATH)%, $(ORIGINLIBDIR) $(LIBPATH) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(LIBS) $(lMULTIVERSO) $(L_READER_LIBS) -ldl -fopenmp $(PROTOBUF_PATH)/lib/libprotobuf.a
UNITTEST_MATH_SRC = \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/BatchNormalizationEngineTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/constants.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/ConvolutionEngineTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/CPUMatrixTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/CPUSparseMatrixTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/fixtures.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/QuantizersTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/QuantizedOperationsTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/TensorTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/GPUMatrixCudaBlasTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/GPUMatrixTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/GPUSparseMatrixTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixBlasTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixDataSynchronizationTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixFileWriteReadTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixQuantizerTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixSparseDenseInteractionsTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/MatrixLearnerTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/stdafx.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/MathTests/HalfGPUTests.cpp

UNITTEST_MATH_SRC += $(CNTK_COMMON_SRC)

UNITTEST_MATH_OBJ := $(patsubst %.cpp, $(OBJDIR)/%.o, $(UNITTEST_MATH_SRC))

UNITTEST_MATH := $(BINDIR)/mathtests

ALL += $(UNITTEST_MATH)
SRC += $(UNITTEST_MATH_SRC)

$(UNITTEST_MATH): $(UNITTEST_MATH_OBJ) | $(READER_LIBS)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(LIBPATH) $(GDK_NVML_LIB_PATH) $(BOOSTLIB_PATH)) $(patsubst %, $(RPATH)%, $(ORIGINLIBDIR) $(LIBPATH) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(LIBS) $(L_READER_LIBS) -ldl -fopenmp
UNITTEST_BRAINSCRIPT_SRC = \
	$(SOURCEDIR)/CNTK/BrainScript/BrainScriptEvaluator.cpp \
	$(SOURCEDIR)/CNTK/BrainScript/BrainScriptParser.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/BrainScriptTests/ParserTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/BrainScriptTests/ComputationNetworkTests.cpp \
	$(SOURCEDIR)/../Tests/UnitTests/BrainScriptTests/stdafx.cpp

UNITTEST_BRAINSCRIPT_SRC += $(COMPUTATION_NETWORK_LIB_SRC)
UNITTEST_BRAINSCRIPT_SRC += $(SEQUENCE_TRAINING_LIB_SRC)

UNITTEST_BRAINSCRIPT_OBJ := $(patsubst %.cu, $(OBJDIR)/%.o, $(patsubst %.cpp, $(OBJDIR)/%.o, $(UNITTEST_BRAINSCRIPT_SRC)))

UNITTEST_BRAINSCRIPT := $(BINDIR)/brainscripttests
ALL += $(UNITTEST_BRAINSCRIPT)
SRC += $(UNITTEST_BRAINSCRIPT_SRC)

$(UNITTEST_BRAINSCRIPT): $(UNITTEST_BRAINSCRIPT_OBJ) | $(READER_LIBS)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(LIBPATH) $(GDK_NVML_LIB_PATH) $(BOOSTLIB_PATH)) $(patsubst %, $(RPATH)%, $(ORIGINLIBDIR) $(LIBPATH) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(LIBS) -ldl $(L_READER_LIBS) -fopenmp
########################################
# CNTKLibrary tests
########################################

CNTKLIBRARY_TESTS_SRC_PATH = Tests/UnitTests/V2LibraryTests
CNTKLIBRARY_TESTS_SRC = \
	$(CNTKLIBRARY_END_TO_END_COMMON_SRC_PATH)/Common.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/FeedForwardTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/NDArrayViewTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/RecurrentFunctionTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/BlockTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/TensorTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/ValueTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/SerializationTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/LearnerTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/FunctionTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/DeviceSelectionTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/MinibatchSourceTest.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/UserDefinedFunctionTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/LoadLegacyModelTests.cpp \
	$(CNTKLIBRARY_TESTS_SRC_PATH)/stdafx.cpp

CNTKLIBRARY_TESTS := $(BINDIR)/v2librarytests
CNTKLIBRARY_TESTS_OBJ := $(patsubst %.cu, $(OBJDIR)/%.o, $(patsubst %.cpp, $(OBJDIR)/%.o, $(CNTKLIBRARY_TESTS_SRC)))

ALL += $(CNTKLIBRARY_TESTS)
SRC += $(CNTKLIBRARY_TESTS_SRC)

$(CNTKLIBRARY_TESTS): $(CNTKLIBRARY_TESTS_OBJ) | $(CNTKLIBRARY_LIB) $(READER_LIBS)
	@echo $(SEPARATOR)
	@mkdir -p $(dir $@)
	@echo building $@ for $(ARCH) with build type $(BUILDTYPE)
	$(CXX) $(LDFLAGS) $(patsubst %,-L%, $(LIBDIR) $(LIBPATH) $(GDK_NVML_LIB_PATH) $(BOOSTLIB_PATH)) $(patsubst %, $(RPATH)%, $(ORIGINLIBDIR) $(LIBPATH) $(BOOSTLIB_PATH)) -o $@ $^ $(BOOSTLIBS) $(LIBS) -ldl -l$(CNTKLIBRARY) $(L_READER_LIBS)

unittests: $(UNITTEST_EVAL) $(UNITTEST_READER) $(UNITTEST_NETWORK) $(UNITTEST_MATH) $(UNITTEST_BRAINSCRIPT) $(CNTKLIBRARY_TESTS)
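# A typical invocation (illustration only; assumes BOOST_PATH was set in
# Config.make and the build succeeded; the --run_test / --log_level flags are
# standard Boost.Test options, not specific to this Makefile):
#   make -j"$(nproc)" unittests
#   $(BINDIR)/mathtests --log_level=test_suite
#   $(BINDIR)/readertests --run_test=ReaderTestSuite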
endif
ifeq ("$(PYTHON_SUPPORT)","true")
# Libraries needed for the run-time (i.e., excluding test binaries)
# TODO MPI doesn't appear explicitly here, hidden by mpic++ usage (but currently, it must be user-installed)
PYTHON_LIBS_LIST := $(LIBS_LIST) $(IMAGEREADER_LIBS_LIST)
PYTHON_LIBS_EXCLUDE_LIST := m pthread nvidia-ml
PYTHON_EXTRA_LIBS_BASENAMES := $(addsuffix .so,$(addprefix lib,$(filter-out $(PYTHON_LIBS_EXCLUDE_LIST),$(PYTHON_LIBS_LIST))))

# TODO dependencies
# TODO intermediate build results should go below $OBJDIR
.PHONY: python
python: $(PYTHON_LIBS)
	@bash -c '\
	set -x -e; \
	declare -A py_paths; \
	py_paths[27]=$(PYTHON27_PATH); \
	py_paths[34]=$(PYTHON34_PATH); \
	py_paths[35]=$(PYTHON35_PATH); \
	py_paths[36]=$(PYTHON36_PATH); \
	export LD_LIBRARY_PATH=$$LD_LIBRARY_PATH:$$(echo $(GDK_NVML_LIB_PATH) $(LIBPATH) | tr " " :); \
	ldd $$(find $(LIBDIR) -maxdepth 1 -type f -print) | grep "not found" && false; \
	export CNTK_VERSION=$(CNTK_VERSION); \
	export CNTK_VERSION_BANNER=$(CNTK_VERSION_BANNER); \
	export CNTK_COMPONENT_VERSION=$(CNTK_COMPONENT_VERSION); \
	export CNTK_LIBRARIES="$(PYTHON_LIBS)"; \
	export CNTK_EXTRA_LIBRARIES=$$(ldd $(LIBDIR)/* | grep "^\s.*=> " | cut -d ">" -f 2- --only-delimited | cut -d "(" -f 1 --only-delimited | sort -u | grep -Ff <(echo $(PYTHON_EXTRA_LIBS_BASENAMES) | xargs -n1)); \
	test -x $(SWIG_PATH); \
	export CNTK_LIB_PATH=$$(readlink -f $(LIBDIR)); \
	PYTHONDIR=$$(readlink -f $(PYTHONDIR)); \
	test $$? -eq 0; \
	cd bindings/python; \
	export PATH=$(SWIG_PATH):$$PATH; \
	for ver in $(PYTHON_VERSIONS); \
	do \
		test -x $${py_paths[$$ver]}; \
		$${py_paths[$$ver]} setup.py \
			build_ext --inplace \
			bdist_wheel \
			--dist-dir $$PYTHONDIR || exit $$?; \
	done'
ALL += python
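# Example Config.make fragment for a single-Python build (every path below is an
# assumption for illustration only; point them at your own installations):
#   PYTHON_SUPPORT=true
#   PYTHON_VERSIONS=35
#   PYTHON35_PATH=/usr/local/anaconda3/envs/cntk-py35/bin/python
#   SWIG_PATH=/usr/local/swig-3.0.10/bin
# With that in place, `make python` builds the extension in bindings/python and
# writes the wheel to $(PYTHONDIR).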
endif
ifeq ("$(JAVA_SUPPORT)","true")
BINDINGS_DIR = bindings
JAVA_SWIG_DIR = $(BINDINGS_DIR)/java/Swig
JAVA_TEST_DIR = Tests/EndToEndTests/EvalClientTests/JavaEvalTest
GENERATED_JAVA_DIR = $(JAVA_SWIG_DIR)/com/microsoft/CNTK
JDK_BIN_PATH = $(JDK_PATH)/bin
JDK_INCLUDE_PATH := $(JDK_PATH)/include
JDK_INCLUDE_PATH += $(JDK_INCLUDE_PATH)/linux

JAVA_SO_NAME = $(LIBDIR)/libCntk.Core.JavaBinding-$(CNTK_COMPONENT_VERSION).so
JAVA_LOAD_DEPS := $(CNTKMATH_LIB) $(PERF_PROFILER_LIB) $(CNTKLIBRARY_LIB) $(JAVA_SO_NAME)
JAVA_LOAD_DEPS := $(JAVA_LOAD_DEPS:$(LIBDIR)/%=%)

JAVA_DEP_SO_NAMES_GPU := libcublas.so libcudart.so libcurand.so libcusparse.so
.PHONY: java
java: $(JAVA_LIBS)
	@echo $(SEPARATOR)
	@echo creating $@ for $(ARCH) with build type $(BUILDTYPE)
	rm -rf $(GENERATED_JAVA_DIR)
	mkdir -p $(GENERATED_JAVA_DIR)
	$(SWIG_PATH)/swig -c++ -java -package com.microsoft.CNTK $(INCLUDEPATH:%=-I%) -I$(BINDINGS_DIR)/common -outdir $(GENERATED_JAVA_DIR) $(JAVA_SWIG_DIR)/cntk_java.i
	$(CXX) $(LDFLAGS) -shared $(COMMON_FLAGS) $(CPPFLAGS) $(CXXFLAGS) $(INCLUDEPATH:%=-I%) $(JDK_INCLUDE_PATH:%=-I%) $(patsubst %,$(RPATH)%, $(ORIGINDIR)) -L$(LIBDIR) $(JAVA_SWIG_DIR)/cntk_java_wrap.cxx -l$(CNTKMATH) -l$(CNTKLIBRARY) -o $(JAVA_SO_NAME)
	mkdir -p $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux
	echo $(JAVA_SO_NAME:$(LIBDIR)/%=%) > $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux/NATIVE_LOAD_MANIFEST
	for so in libiomp5.so libmklml_intel.so; do \
		cp -p $(MKL_LIB_PATH)/$$so $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux; \
		echo $$so >> $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux/NATIVE_MANIFEST; \
	done
	for so in $(JAVA_LOAD_DEPS); do \
		cp -p $(LIBDIR)/$$so $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux; \
		echo $$so >> $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux/NATIVE_MANIFEST; \
	done
ifdef CUDA_PATH
	for so in $(JAVA_DEP_SO_NAMES_GPU); do \
		cp -p $(CUDA_PATH)/lib64/$$so $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux; \
		echo $$so >> $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux/NATIVE_MANIFEST; \
	done
	cp -p $(CUDNN_PATH)/cuda/lib64/libcudnn.so $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux
	echo 'libcudnn.so' >> $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux/NATIVE_MANIFEST
	cp -p $(GDK_NVML_LIB_PATH)/libnvidia-ml.so $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux
	echo 'libnvidia-ml.so' >> $(JAVA_SWIG_DIR)/com/microsoft/CNTK/lib/linux/NATIVE_MANIFEST
endif
	cp -p $(JAVA_SWIG_DIR)/CNTKNativeUtils.java $(JAVA_SWIG_DIR)/com/microsoft/CNTK/CNTKNativeUtils.java
	$(JDK_BIN_PATH)/javac $(GENERATED_JAVA_DIR)/*.java
	mkdir -p $(LIBDIR)/java
	cd $(JAVA_SWIG_DIR) && $(JDK_BIN_PATH)/jar -cvf cntk.jar com
	cp $(JAVA_SWIG_DIR)/cntk.jar $(LIBDIR)/java
	javac -cp $(JAVA_SWIG_DIR) $(JAVA_TEST_DIR)/src/Main.java -d $(LIBDIR)/java
ALL += java
endif
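# Example of running the compiled Java test class (illustration only; the
# classpath entries follow the jar and class output locations in the `java`
# target above, and the library path flag is an assumption for a default build):
#   java -cp $(LIBDIR)/java:$(LIBDIR)/java/cntk.jar \
#        -Djava.library.path=$(LIBDIR) Main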
########################################
# General compile and dependency rules
########################################

ALL += $(ALL_LIBS)
VPATH := $(sort $(dir $(SRC)))

# Define object files
OBJ := \
	$(patsubst %.cu, $(OBJDIR)/%.o, $(filter %.cu, $(SRC))) \
	$(patsubst %.pb.cc, $(OBJDIR)/%.pb.o, $(filter %.pb.cc, $(SRC))) \
	$(patsubst %.cpp, $(OBJDIR)/%.o, $(filter %.cpp, $(SRC)))

# C++ include dependencies generated by the -MF compiler option
DEP := $(patsubst %.o, %.d, $(OBJ))
# Include all C++ dependencies, like header files, to ensure that a change in
# those will result in a rebuild.
-include ${DEP}

BUILD_CONFIGURATION := Makefile $(BUILD_TOP)/Config.make
%.pb.cc: %.proto $(BUILD_CONFIGURATION)
	@echo $(SEPARATOR)
	@echo compiling protobuf $<
	$(PROTOC) --proto_path=$(dir $<) --cpp_out=$(dir $<) $<

$(OBJDIR)/%.o: %.cu $(BUILD_CONFIGURATION)
	@echo $(SEPARATOR)
	@echo creating $@ for $(ARCH) with build type $(BUILDTYPE)
	@mkdir -p $(dir $@)
	$(NVCC) -c $< -o $@ $(COMMON_FLAGS) $(CUFLAGS) $(INCLUDEPATH:%=-I%) -Xcompiler "-fPIC -Werror"
$(OBJDIR)/%.pb.o: %.pb.cc $(BUILD_CONFIGURATION)
	@echo $(SEPARATOR)
	@echo creating $@ for $(ARCH) with build type $(BUILDTYPE)
	@mkdir -p $(dir $@)
	$(CXX) -c $< -o $@ $(COMMON_FLAGS) $(CPPFLAGS) $(CXXFLAGS) $(INCLUDEPATH:%=-I%) -MD -MP -MF ${@:.o=.d}
$(OBJDIR)/%.o: %.cpp $(BUILD_CONFIGURATION)
	@echo $(SEPARATOR)
	@echo creating $@ for $(ARCH) with build type $(BUILDTYPE)
	@mkdir -p $(dir $@)
	$(CXX) -c $< -o $@ $(COMMON_FLAGS) $(CPPFLAGS) $(CXXFLAGS) $(INCLUDEPATH:%=-I%) -MD -MP -MF ${@:.o=.d}
2016-06-30 15:57:16 +03:00
.PHONY : clean buildall all unittests
2015-09-02 18:43:31 +03:00
clean :
@echo $( SEPARATOR)
@rm -rf $( OBJDIR)
@rm -rf $( ALL)
2016-02-01 22:16:08 +03:00
@rm -rf $( BUILDINFO)
2015-12-15 11:58:24 +03:00
@echo finished cleaning up the project
2015-09-02 18:43:31 +03:00
buildall : $( ALL )
@echo $( SEPARATOR)
@echo finished building for $( ARCH) with build type $( BUILDTYPE)