Commit Graph

787 Commits

Author SHA1 Message Date
Thilo Will e7c884a047 removing some unneeded (int) casts. 2016-08-26 11:45:05 +02:00
Thilo Will 3f5f0028aa Fixing issue that for Sparse * Dense, the Dense * Dense path was taken afterwards 2016-08-26 11:14:33 +02:00
Mark Hillebrand 54b07705b7 Merge remote-tracking branch 'origin/master' into mahilleb/cuDNN5 2016-08-26 10:59:09 +02:00
Thilo Will ad17a78209 Fixing some newly introduced bugs 2016-08-26 10:19:32 +02:00
yuxiaoguo be79c3ca64 add matrix pool, add basic impl 2016-08-26 14:28:58 +08:00
Nikos Karampatziakis 65dbb7a7e5 Merge branch 'DanielMerget-DanielMerget/fix_atomicAdd' into nikosk/pascal-and-cuda8-fixes 2016-08-25 16:01:23 -07:00
Nikos Karampatziakis 002e920bc5 Update CNTK.Cpp.props to CUDA8;
Add atomicAdd fix by DanielMerget
2016-08-25 15:57:17 -07:00
Nikos Karampatziakis 58b7186777 Merge branch 'DanielMerget/fix_atomicAdd' of https://github.com/DanielMerget/CNTK into DanielMerget-DanielMerget/fix_atomicAdd 2016-08-25 14:29:09 -07:00
Thilo Will 3786b88683 Added path CPU: SPARSE * DENSE -> DENSE to MultiplyAndWeightedAdd in Matrix.cpp 2016-08-25 17:52:27 +02:00
Thilo Will 5658594c69 CPU Sparse*Dense->Dense compiles 2016-08-25 17:39:26 +02:00
Thilo Will 1810e7cafe First implementation of new sparse*dense for CPU 2016-08-25 15:56:13 +02:00
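The three commits above add a CPU Sparse * Dense -> Dense path. As a rough illustration of what such a path computes, here is a hypothetical sketch (names and layout are illustrative, not CNTK's actual code): a CSC sparse matrix A (m x k) times a column-major dense matrix B (k x n), written into a column-major dense C (m x n).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSC (compressed sparse column) layout.
struct CscMatrix {
    int rows, cols;
    std::vector<int> colStart;   // size cols+1
    std::vector<int> rowIndex;   // size nnz
    std::vector<float> values;   // size nnz
};

// C = A * B, with A sparse (CSC), B and C dense column-major.
void SparseTimesDense(const CscMatrix& A, const std::vector<float>& B,
                      int n, std::vector<float>& C) {
    C.assign(static_cast<size_t>(A.rows) * n, 0.0f);
    for (int j = 0; j < A.cols; ++j) {           // column j of A
        for (int p = A.colStart[j]; p < A.colStart[j + 1]; ++p) {
            int i = A.rowIndex[p];
            float v = A.values[p];
            for (int c = 0; c < n; ++c)          // C(i,c) += A(i,j) * B(j,c)
                C[i + c * A.rows] += v * B[j + c * A.cols];
        }
    }
}
```

Iterating over the nonzeros of A (rather than over all of A) is what makes the dedicated path cheaper than falling through to the dense * dense routine.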
Mark Hillebrand d9e9c885bd Merge remote-tracking branch 'origin/fseide/cudnn5' into mahilleb/CuDnn5Test 2016-08-25 15:44:53 +02:00
Mark Hillebrand 6827182791 Merge remote-tracking branch 'origin/mahilleb/CuDnn5Test' into mahilleb/CuDnn5Test 2016-08-25 15:38:08 +02:00
Mark Hillebrand fc3a071a71 CntkBatchNormalization.cuh: fix for batchSize == 1 2016-08-25 15:37:23 +02:00
Thilo Will f0aa69d365 merge from master 2016-08-25 11:31:09 +02:00
Frank Seide e576c3d6b7 missing NoGPU.cpp entries for RNN node;
fixed shared_ptr to incomplete CuDnnRNNExecutor
2016-08-24 20:24:31 -07:00
Frank Seide 686078fdfd (made gcc happy) 2016-08-24 19:51:02 -07:00
Frank Seide 5f14fcaea0 merged from mahilleb/CuDnn5Test 2016-08-24 18:45:39 -07:00
Frank Seide 769b2602a2 updated SLUHandsOn tests 2016-08-24 18:15:09 -07:00
Frank Seide d9c7e82031 OptimizedRNNStackNode: renamed some variables, renamed recurrentOps to camelCase, added weight inference 2016-08-24 17:17:23 -07:00
Frank Seide 8a86da8f02 renamed RNNNode to OptimizedRNNStackNode, also updated parameter names 2016-08-24 16:10:01 -07:00
Mark Hillebrand 0285fa9a13 Merge remote-tracking branch 'origin/master' into mahilleb/CuDnn5Test
Conflicts:
	Source/ComputationNetworkLib/ComputationNode.h
	Source/ComputationNetworkLib/TrainingNodes.h
	Tests/EndToEndTests/Examples/Image/Miscellaneous/CIFAR-10/02_BatchNormConv/baseline.linux.txt
	Tests/EndToEndTests/Examples/Image/Miscellaneous/CIFAR-10/02_BatchNormConv/baseline.windows.txt
	Tests/UnitTests/MathTests/ConvolutionEngineTests.cpp
2016-08-25 00:37:49 +02:00
Frank Seide d8c3c15be5 merged from CuDnn5Test and cudnn-rnn 2016-08-24 09:44:29 -07:00
Mark Hillebrand 493744d922 Source/Math/CntkBatchNormalization.cuh: fix variance conversion 2016-08-24 14:00:22 +02:00
Frank Seide d1b1127c9d Merge branch 'jdroppo/cudnn-rnn-lstm' of https://github.com/Microsoft/cntk into fseide/cudnn5 2016-08-24 00:20:13 -07:00
Mark Hillebrand bb155ef563 tune 2016-08-24 00:06:49 +02:00
Mark Hillebrand e1a9cabbde Address CR comments 2016-08-23 20:32:03 +02:00
Jasha Droppo 373adfd9ac Changes Addressing Code Review for CUDNN RNNStack Node 2016-08-23 10:40:34 -07:00
Thilo Will f8f663551b formatting code 2016-08-23 17:02:12 +02:00
Thilo Will 2de2ffb10b In backprop of Times(Dense, Sparse), no switching to sparse 2016-08-23 16:45:45 +02:00
thilow 8c2ec53cde Adding comments in method MultiplyAndWeightedAdd of Matrix.cpp 2016-08-23 06:19:22 -07:00
Mark Hillebrand 66498cf414 Merge remote-tracking branch 'origin/master' into mahilleb/CuDnn5Test
Note: baselines need to be fixed for
Tests/EndToEndTests/BatchNormalization and
Tests/EndToEndTests/Examples/Image/Miscellaneous/CIFAR-10/02_BatchNormConv.
2016-08-23 11:12:35 +02:00
Frank Seide 1f9c539c61 Merge branch 'jdroppo/cudnn-rnn-lstm' of https://github.com/Microsoft/cntk into fseide/cudnn5 2016-08-22 20:11:07 -07:00
Frank Seide 54f096083d merged from mahilleb/CuDnn5Test 2016-08-22 18:51:22 -07:00
Frank Seide 5b969bac70 merged from master. Undid the ClassificationError baseline updates due to merge conflicts 2016-08-22 14:36:28 -07:00
Jasha Droppo 2fa1b7033d Merge commit 'origin/master' 8493f11 into jdroppo/cudnn-rnn-lstm 2016-08-22 13:28:42 -07:00
Amit Agarwal 37b6897e94 Merge branch 'master' of https://github.com/Microsoft/CNTK into amitaga/cntkv2Library 2016-08-22 10:48:54 -07:00
Mark Hillebrand f76afa2b7e Switch to CuDNN v5
For batch normalization, running inverse standard deviation becomes
running variance. We mirror this CuDNN v5 change in the CNTK batch
normalization engine. Model version is bumped. When old models are
loaded, this parameter is (approximately) converted.

In the same model version change, let batch normalization count
samples seen rather than minibatches (this deals with incorrect averaging
when minibatch size is varied across epochs).

For batch normalization averaging and blending handle initialization
cases, don't rely on mean and variance initial values (set in
NDL/BrainScript).

Update Windows / Linux / Docker build.
With this commit, CuDNN v4 is not supported anymore.
2016-08-22 17:55:10 +02:00
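The commit above converts an old model's running inverse standard deviation into a running variance on load. A minimal sketch of what such a conversion could look like, assuming invStdDev was computed as 1/sqrt(variance + epsilon) (an assumption; the actual CNTK conversion code is not shown here), which is why the commit calls it "(approximately) converted":

```cpp
#include <cassert>
#include <cmath>

// Invert invStdDev = 1/sqrt(variance + epsilon) back to a variance.
// Hypothetical helper; the name and exact formula are illustrative.
float InvStdDevToVariance(float invStdDev, float epsilon) {
    return 1.0f / (invStdDev * invStdDev) - epsilon;
}
```

Round-tripping a known variance through the inverse standard deviation recovers it up to floating-point error, which is the sense in which the stored parameter can be converted at load time.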
Amit Agarwal fa4b99d102 CNTK v2 Library: a) Add dynamic axis support b) New primitive functions and some higher level functions and c) Sequence classification test 2016-08-21 03:49:03 -07:00
Frank Seide 050a84035f tuned >1-bit SGD: odd #quantization levels, range now 4 stddevs (before: 5) 2016-08-21 01:33:01 -07:00
Frank Seide db74d6b468 changed ImageHandsOn from "gaussian" to "heNormal" initialization, and also most layers defaults in CNTK.core.bs 2016-08-19 23:34:17 -07:00
Daniel Merget 6eaacc7a98 clarified comment 2016-08-19 14:15:51 +02:00
Daniel Merget fc2e6c2427 avoid double definition of atomicAdd on modern GPUs 2016-08-19 13:47:39 +02:00
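Context for the atomicAdd fix above: CUDA 8 / sm_60 introduces a native `atomicAdd` for `double`, so a hand-rolled version must be compile-guarded (e.g. on `__CUDA_ARCH__ < 600`) or it collides with the built-in. The classic pre-sm_60 emulation is a compare-and-swap loop on the 64-bit bit pattern; here it is sketched on the host with `std::atomic` standing in for `atomicCAS` (an illustrative analogue, not the device code itself):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-side analogue of the pre-sm_60 double atomicAdd emulation:
// CAS-loop on the value's 64-bit representation.
double EmulatedAtomicAdd(std::atomic<uint64_t>* address, double val) {
    uint64_t expected = address->load();
    uint64_t desired;
    double old;
    do {
        std::memcpy(&old, &expected, sizeof(double));
        double updated = old + val;
        std::memcpy(&desired, &updated, sizeof(double));
        // On failure, compare_exchange_weak reloads 'expected' for the retry.
    } while (!address->compare_exchange_weak(expected, desired));
    return old; // like atomicAdd, returns the previous value
}
```

On devices that already ship the native version, defining this again at file scope is exactly the "double definition" the commit avoids.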
Zhou Wang 3d80725a16 Define THREAD_LOCAL and force currentDevice to be THREAD_LOCAL 2016-08-19 11:16:33 +02:00
Jasha Droppo 971b7d0003 Change PrepareDevice() to have a Thread Dependent Value Cache
The cached value of currentDevice is meant to avoid redundant calls
to cudaSetDevice(). But this setting is thread-specific, so the
cache should be thread-specific too. This fixes the problem on Windows.
2016-08-18 15:41:59 -07:00
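The fix described in the commit message above can be sketched as follows. This is an illustrative mock, not CNTK's PrepareDevice(): `cudaSetDevice()` is stubbed with a call counter, and the cache is made `thread_local` so each thread tracks its own device.

```cpp
#include <cassert>

// Stub standing in for cudaSetDevice(); counts real driver calls.
static int g_setDeviceCalls = 0;
void FakeCudaSetDevice(int /*deviceId*/) { ++g_setDeviceCalls; }

void PrepareDevice(int deviceId) {
    thread_local int currentDevice = -1; // one cache per thread, not per process
    if (deviceId == currentDevice)
        return;                          // skip the redundant driver call
    FakeCudaSetDevice(deviceId);
    currentDevice = deviceId;
}
```

With a process-wide cache, thread B could see thread A's cached value and wrongly skip its own `cudaSetDevice()`; the `thread_local` cache removes that hazard while still deduplicating repeat calls on the same thread.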
Wolfgang Manousek 786ec99da2 fixed broken ifdef statement 2016-08-17 10:04:56 +02:00
Wolfgang Manousek 79cfcf7d4f more acml removal 2016-08-17 10:04:56 +02:00
Jasha Droppo b99b3832fb CuDNN-RNN Fix Merge Error in Math/*filters Visual Studio files 2016-08-16 15:50:56 -07:00
Jasha Droppo 2fb185b1fe CUDNN-RNN Fix Parameter Count in Error Message 2016-08-16 10:10:06 -07:00
Jasha Droppo 80d077054d Merge branch 'master' into jdroppo/cudnn-rnn-lstm 2016-08-15 16:11:16 -07:00
Frank Seide f5e77e4efb minor fixes 2016-08-13 12:39:27 -07:00
Project Philly a269e0e6b5 Integrate thilow/FixTimes4SparseOnCPU into master 2016-08-12 13:45:12 -07:00
Thilo Will 8d7ed085e1 Fix of fix 2016-08-12 16:24:01 +02:00
Thilo Will 6f7505f656 Fixing another bug in Times(Dense,Sparse) and restructure code 2016-08-12 14:52:27 +02:00
Eldar Akchurin 66e45348fa Fixing bug in sparse matrix buffer estimation 2016-08-12 14:47:30 +02:00
Thilo Will edd3a948f6 Fixing initialisation of result in Times(dense, sparse) on CPU 2016-08-12 09:53:09 +02:00
Frank Seide d6fb3786ae bug fix: CPUSparseMatrix<ElemType>::MultiplyAndWeightedAdd() should handle transposed inputs in all combinations 2016-08-07 21:27:54 -07:00
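The bug fix above requires handling all four transpose combinations. A minimal dense sketch of the contract (illustrative only; CNTK's actual routine operates on its sparse matrix types): C = alpha * op(A) * op(B) + beta * C, where op(X) is X or its transpose depending on a flag.

```cpp
#include <cassert>
#include <vector>

// Row-major dense sketch of MultiplyAndWeightedAdd's transpose handling.
void MultiplyAndWeightedAdd(float alpha,
                            const std::vector<float>& A, int aRows, int aCols, bool transA,
                            const std::vector<float>& B, int bRows, int bCols, bool transB,
                            float beta, std::vector<float>& C) {
    int m = transA ? aCols : aRows;  // rows of op(A)
    int k = transA ? aRows : aCols;  // inner dimension
    int n = transB ? bRows : bCols;  // cols of op(B)
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int p = 0; p < k; ++p) {
                float a = transA ? A[p * aCols + i] : A[i * aCols + p];
                float b = transB ? B[j * bCols + p] : B[p * bCols + j];
                sum += a * b;
            }
            C[i * n + j] = alpha * sum + beta * C[i * n + j];
        }
}
```

The point of the fix is that each of the four (transA, transB) combinations must index its operand correctly; missing one combination silently produces wrong products for that call pattern.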
Frank Seide db5fff2a02 merged from master 2016-08-05 14:11:38 -07:00
Thilo Will 4e17fd5175 Fixing typo in reduction test and reformatting 2016-08-05 14:40:51 +02:00
Thilo Will 253e65b432 ReduceLogSum: beautifications 2016-08-05 14:08:54 +02:00
Thilo Will 69470799c1 merged from master 2016-08-05 12:33:39 +02:00
Thilo Will 0d3b9e57f6 Added comment regarding default axis values in python bindings 2016-08-03 10:59:12 +02:00
Frank Seide e80562feda trying a fix to lazy init 2016-08-02 19:24:42 -07:00
Frank Seide bc06c3c4be CNTK BatchNorm engine Backprop() should honor blendFactor 2016-08-02 19:13:30 -07:00
Project Philly e2e15e0b18 Integrate mahilleb/AssertRemoval into master 2016-08-02 06:53:24 -07:00
Mark A. Hillebrand fa7befb882 Address CR comment 2016-08-02 15:48:54 +02:00
thilow 5a7e77b4c5 ElementwiseProductWithExpOffDiff 2016-08-02 00:32:57 -07:00
Vadim Mazalov c81eb6fd3f Add Array struct, quantizer unit tests, minor fixes. 2016-08-01 12:33:03 -07:00
Vadim Mazalov 7cfc3f358e Remove LearnableParameterQuantized and MEL command to quantize a node 2016-08-01 12:33:03 -07:00
Vadim Mazalov 90079c6fa3 Introduce Matrix<short>. 2016-08-01 12:28:11 -07:00
Vadim Mazalov 15e9cf8e94 Refine MEL command for quantization of LearnableParameter node, changes to InputAndParamNodes. 2016-08-01 12:28:11 -07:00
Vadim Mazalov 77ff661930 Quantization of learnable parameter node 2016-08-01 12:28:11 -07:00
Mark A. Hillebrand 97c0c98b39 Remove assertion that's not true for the CuDNN engine. 2016-08-01 18:46:09 +02:00
Frank Seide 0c86e36310 bug fix: BatchNormEngine::Forward() should assert saveMean/InvStdDev as a post-condition now 2016-07-29 12:42:23 -07:00
thilow 39c60b5c12 ReduceLogSum: adapted core.bs. Tests still failing 2016-07-28 23:20:38 +02:00
Thilo Will 7397854908 ReduceLogSum backward path and core.bs 2016-07-28 17:49:52 +02:00
Thilo Will 38e4b2b402 merged from master 2016-07-28 14:23:59 +02:00
Frank Seide 540cd0be04 addressed CR feedback 2016-07-27 15:11:57 -07:00
thilow dde483fee7 Adding ReduceLogSum 2016-07-27 22:07:26 +02:00
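The ReduceLogSum commits above presumably implement the numerically stable log-sum-exp reduction: shifting by the maximum keeps `exp()` from overflowing. A minimal sketch (illustrative, not CNTK's tensor-kernel code):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Stable log(sum_i exp(x_i)): factor out the max before exponentiating.
double ReduceLogSum(const std::vector<double>& x) {
    double m = *std::max_element(x.begin(), x.end());
    double sum = 0.0;
    for (double v : x)
        sum += std::exp(v - m);
    return m + std::log(sum);
}
```

The backward path mentioned in the commits would then be the softmax of the inputs, since d/dx_i logsumexp(x) = exp(x_i - logsumexp(x)).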
Frank Seide ee23bb2000 (fix for previous fix) 2016-07-26 18:03:52 -07:00
Frank Seide 8f716986ae renamed reduction kernels that expect a hard-coded number of threads to reflect that number in their names 2016-07-26 17:48:37 -07:00
Frank Seide 9dbf806c39 merged from master 2016-07-26 16:51:19 -07:00
Frank Seide 1a80a6a1c1 undid accidental change of Shuffle() 2016-07-26 13:55:12 -07:00
Frank Seide 5e357fee8b addressed minor feedback from Amit's CR;
addressed feedback from Simon Layton (NVidia) that the constants defined in GridDim are too small.
2016-07-26 13:52:21 -07:00
Thilo Will dcc7e9b3f1 Added comments 2016-07-26 09:40:12 +02:00
Thilo Will 524c5278c7 Using double as aggregator, hoping to fix issue with tests TWRGE, TWRGS, TLRGS 2016-07-25 17:56:15 +02:00
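Why a double aggregator (as in the commit above) can fix a reduction test: once a float running sum grows past 2^24, adding small values is lost to rounding, while a double accumulator keeps them. An illustrative demonstration:

```cpp
#include <cassert>
#include <cstdint>

// Accumulate n copies of v in float: stalls once the sum exceeds 2^24.
float SumWithFloatAggregator(uint32_t n, float v) {
    float sum = 0.0f;
    for (uint32_t i = 0; i < n; ++i) sum += v;
    return sum;
}

// Same reduction with a double aggregator: exact for these magnitudes.
float SumWithDoubleAggregator(uint32_t n, float v) {
    double sum = 0.0;
    for (uint32_t i = 0; i < n; ++i) sum += v;
    return static_cast<float>(sum);
}
```

Summing twenty million ones, the float aggregator saturates at 16777216 (2^24) while the double aggregator returns the true count; large tensor reductions hit exactly this regime.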
Thilo Will 52dda16053 Removed 'typedef' in partial specialisation of TensorOpReduction in hope to fix Linux build. 2016-07-25 16:06:03 +02:00
Thilo Will 24c0ad1cf5 Fixed comments 2016-07-25 15:35:14 +02:00
Thilo Will 2115db661e Renamed variable to reductionOp 2016-07-25 15:17:13 +02:00
Thilo Will 5dbb7254fa factored aggregation op into a lambda 2016-07-25 15:00:11 +02:00
U-FAREAST\fseide a5d15f3078 Merge branch 'master' of https://github.com/Microsoft/CNTK into fseide/clonebs 2016-07-22 10:04:08 -07:00
Frank Seide 27ff6f7177 (typo) 2016-07-22 08:53:31 -07:00
Frank Seide 07a6fa25f9 BatchNorm: moved allocation of saveMean() to where they are produced, and allocating them empty when they are not produced at all 2016-07-22 08:46:02 -07:00
Frank Seide 02700105a6 added new interface IFreezable to tell a node to freeze itself, in order to allow BatchNormalization to honor CloneFunction (..., parameters="constant") 2016-07-22 08:24:56 -07:00
Thilo Will fd954772ea Converted AggregationOp to pure function template 2016-07-22 17:02:33 +02:00
Thilo Will bd776dc849 Changed "NeutralValue" to function templates 2016-07-22 16:49:54 +02:00
Thilo Will 80fdb8f53d using function overloading for neutral 2016-07-22 16:28:53 +02:00
Thilo Will fec05bffe8 Improved formatting and comments 2016-07-22 15:51:15 +02:00
Frank Seide ce350dda68 (trying around with saveMean) 2016-07-21 19:47:45 -07:00
Frank Seide 3d70ff34e0 heavily commented batch-normalization code, including several bugs;
new interface IParameterNode for identifying LearnableParameters;
first implementation of CloneFunctionConfigLambda (except for returning the result)
2016-07-21 17:37:44 -07:00
Thilo Will 6dce931c19 merged from master 2016-07-21 10:28:22 +02:00
Amit Agarwal f3dec438d6 a) Made CUDA sync mode execution of kernels a runtime config option instead of a build flavor b) Added perf instrumentation to show accurate per MB read, compute and parameter update time 2016-07-20 17:19:00 -07:00
Frank Seide 39a9175097 merged from master 2016-07-19 16:40:51 -07:00
Jasha Droppo 3c8e63f1d5 Fix Bug Introduced in Merge 2016-07-18 11:57:31 -07:00
Jasha Droppo a4e42744c2 Merge branch 'master' into jdroppo/cudnn-rnn-lstm
Conflicts:
	Makefile
	Source/CNTK/BrainScript/CNTKCoreLib/CNTK.core.bs
	Source/Math/CuDnnBatchNormalization.cu
	Source/Math/CuDnnConvolutionEngine.cu
	Source/Math/Math.vcxproj
	Source/SGDLib/SGD.cpp
2016-07-18 11:11:58 -07:00
Ivan Rodriguez be64a3958d Using shared_ptr again 2016-07-18 13:43:37 +02:00
Ivan Rodriguez 7d8657b1a8 Change code according to review 2016-07-18 13:43:37 +02:00
Ivan Rodriguez dac4aca396 refactor tensor tests 2016-07-18 13:40:15 +02:00
Frank Seide e3b1b66aba added tensor test(s) to MathPerformanceTests 2016-07-18 13:32:14 +02:00
Ivan Rodriguez 64ecb3c659 Change code according to review 2016-07-18 13:30:00 +02:00
Ivan Rodriguez 64f6978ad9 remove unused forward declared struct 2016-07-18 13:30:00 +02:00
Ivan Rodriguez 934dd082a0 Fix crash when running BiasGradient test. Remove the original test code. 2016-07-18 13:28:11 +02:00
Ivan Rodriguez 936b736c1f refactor tensor tests 2016-07-18 13:28:11 +02:00
Ivan Rodriguez af6ddf9c04 Move math performance tests to MathTests 2016-07-18 13:28:11 +02:00
Frank Seide a172c89111 added tensor test(s) to MathPerformanceTests 2016-07-18 13:24:56 +02:00
Zhou Wang e3927bb717 Add math unit tests and adapt them for Linux
This is a combination of 7 commits.
minor format changes

adapt makefile and math tests

enable sse4.1 support

adapt to linux

fix shadow param, and adjust order of functions

network tests need .cu

move constant definition into a .cpp file, instead of .h
2016-07-13 16:03:06 +02:00
Thilo Will 873d988115 Improved formatting and comments 2016-07-13 11:50:18 +02:00
Thilo Will 4bcc0d1b85 Trying to avoid: template instantiation depth exceeds maximum 2016-07-12 17:47:16 +02:00
Thilo Will 9c0cf7123a Factored out the reduction operations 2016-07-12 16:20:58 +02:00
Thilo Will 248753faa6 Factored out neutral element of binary ops. 2016-07-12 14:32:47 +02:00
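The commits above factor each reduction into a binary combiner plus its neutral element (Sum: 0, Max: -infinity, Min: +infinity), so one generic reduce routine covers all ops. A hypothetical sketch of that factoring (names are illustrative, not CNTK's tensor code):

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Generic reduction: the op and its neutral element are supplied together,
// so the same loop implements Sum, Max, Min, etc.
template <typename ElemType, typename Op>
ElemType Reduce(const std::vector<ElemType>& x, Op op, ElemType neutral) {
    ElemType acc = neutral;      // correct even for empty input
    for (ElemType v : x)
        acc = op(acc, v);
    return acc;
}
```

For example, `Reduce(x, plus, 0.0f)` sums, while `Reduce(x, max, -inf)` takes the maximum; seeding with the op's neutral element is what lets the one routine stay correct on empty and partial tiles alike.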
Thilo Will 204aca8563 ReduceOp passed through on all paths 2016-07-12 10:22:32 +02:00
Thilo Will 2cc578ef0e passed through reduceOp till end on some paths. In TensorOpReduce still missing 2016-07-12 10:01:22 +02:00
Thilo Will 8493e614e2 passing reductionop to _launchTensorOpWithReduction 2016-07-11 17:05:04 +02:00
Thilo Will beb797b0db implemented min/max reduction inside TensorOpElement 2016-07-11 16:56:32 +02:00
Thilo Will 98f9e8ac39 Passing reduction op further down 2016-07-11 15:49:13 +02:00
Thilo Will 73d1e32d3a Revert "passing through reductionOp. Not yet compiling."
This reverts commit 84c9b0caa0.
2016-07-11 14:38:29 +02:00
Thilo Will 84c9b0caa0 passing through reductionOp. Not yet compiling. 2016-07-11 11:45:08 +02:00
Thilo Will f4c3821302 passing the reduction op down the call hierarchy for reduction on GPU into LaunchTensorOpWithReduction. In ReduceElementsNode renaming m_op to m_reductionOP 2016-07-08 16:26:58 +02:00
Thilo Will 94bd96eaba merge with master 2016-07-08 13:57:01 +02:00
anthonyaue 3f41c0c9c5 Add a bunch of new tests to exercise block multiplier.
Change spacing of comments.
Reset omp threads in d'tor.
2016-06-30 08:23:13 -07:00
anthonyaue 03e504e7ab Allow block multiplier to support arbitrary number of rows in A 2016-06-29 13:19:35 -07:00
Jasha Droppo e3dd352d20 RNN Debug info on object creation/deletion 2016-06-29 11:28:53 -07:00
Thilo Will 693b9a6c45 merged with master 2016-06-27 11:51:20 +02:00
Project Philly d39410d2fc Integrate anthaue/addblockmultiplier into master 2016-06-24 16:50:41 -07:00
anthonyaue d2bf769c83 Implement code review feedback from clemensm 2016-06-24 08:40:48 -07:00
Thilo Will 681c805cf9 Added some code that should force ReduceMax to run on CPU 2016-06-24 17:28:52 +02:00
Thilo Will 723441ea5c ReduceMin/Max work (cpu only). Missing: move to cpu. 2016-06-24 16:20:37 +02:00
Thilo Will 20c9444176 ReduceMax/Min builds 2016-06-24 15:40:25 +02:00
anthonyaue a2fda5f7f5 Put AVX support under SUPPORT_AVX2 so compilation on Linux will work 2016-06-21 13:45:37 -07:00
anthonyaue 1e6fb55338 Add Stdafx.h headers to fix release win build breaks; add -mavx2 flag to
fix linux build break
2016-06-21 10:43:29 -07:00
anthonyaue 96e40865b8 Fix some capitalization issues 2016-06-21 08:45:07 -07:00
Eldar Akchurin 3bf21c13e3 Fixing invalid address 2016-06-21 17:38:54 +02:00
Eldar Akchurin 3e61e9fa84 Removing unused dependencies 2016-06-21 17:38:54 +02:00
Eldar Akchurin 39a44a58de Inlining checks 2016-06-21 17:38:54 +02:00
Eldar Akchurin d564ed7cab Adding cuda runtime api 2016-06-21 17:38:54 +02:00
Eldar Akchurin 5b0a3aa55a Fixing comments 2016-06-21 17:38:54 +02:00
Eldar Akchurin f75a32301e Fixing Cuda return code checks 2016-06-21 17:38:54 +02:00
Frank Seide 4a79ef3d8c added tensor test(s) to MathPerformanceTests 2016-06-17 12:10:37 -07:00
Anthony Aue 12140c993b Makefile and *.vcxproj changes 2016-06-17 08:57:55 -07:00
Anthony Aue 567a7ff421 Implement comments from code review. Have not tried to compile. 2016-06-16 16:17:21 -07:00