add linux baseline | bump model version

Jian Jiao 2016-10-14 09:53:43 -07:00
Parent cc4eab4627
Commit e33d6ea56f
6 changed files with 192 additions and 5 deletions

View file

@@ -1263,7 +1263,7 @@ Project("{888888A0-9F3D-457C-B088-3A5042F75D52}") = "CNTKv2", "bindings\python\e
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "IRMetric", "IRMetric", "{E844AB9A-A48F-4A99-9625-F528C5C46D83}"
ProjectSection(SolutionItems) = preProject
Tests\EndToEndTests\Text\IRMetric\baseline.txt = Tests\EndToEndTests\Text\IRMetric\baseline.txt
Tests\EndToEndTests\Text\IRMetric\baseline.windows.txt = Tests\EndToEndTests\Text\IRMetric\baseline.windows.txt
Tests\EndToEndTests\Text\IRMetric\ir.cntk = Tests\EndToEndTests\Text\IRMetric\ir.cntk
Tests\EndToEndTests\Text\IRMetric\run-test = Tests\EndToEndTests\Text\IRMetric\run-test
Tests\EndToEndTests\Text\IRMetric\testcases.yml = Tests\EndToEndTests\Text\IRMetric\testcases.yml

View file

@@ -43,7 +43,8 @@
#define CNTK_MODEL_VERSION_12 12 // Times() m_inputRank to support parameter-rank inference
#define CNTK_MODEL_VERSION_13 13 // batch norm: switch running inverse std deviation -> variance, MB count -> samplesSeen; CuDNN v5
#define CNTK_MODEL_VERSION_14 14 // axis parameter in OptimizedRNNStackNode
#define CURRENT_CNTK_MODEL_VERSION CNTK_MODEL_VERSION_14
#define CNTK_MODEL_VERSION_15 15 // add new nodes: LambdaRankNode and NDCG1Eval
#define CURRENT_CNTK_MODEL_VERSION CNTK_MODEL_VERSION_15
extern bool g_shareNodeValueMatrices;
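
Context on the hunk above: bumping CURRENT_CNTK_MODEL_VERSION to 15 marks model files that may contain the two new node types, so a loader can refuse them in files written with an older version. Below is a minimal illustrative sketch of such a guard; it is not code from this commit, and the helper name CheckNodeSupported and its parameters are hypothetical.

#include <cstddef>
#include <stdexcept>
#include <string>

#define CNTK_MODEL_VERSION_15 15  // mirrors the constant added in the hunk above

// Hypothetical helper (not part of this commit): reject the two node types
// introduced with version 15 when the model file reports an older version.
static void CheckNodeSupported(std::size_t modelVersion, const std::string& opName)
{
    if ((opName == "LambdaRankNode" || opName == "NDCG1Eval") &&
        modelVersion < CNTK_MODEL_VERSION_15)
    {
        throw std::runtime_error("Node '" + opName +
                                 "' requires CNTK model version >= 15.");
    }
}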

View file

@@ -0,0 +1,189 @@
CPU info:
CPU Model Name: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
Hardware threads: 40
Total Memory: 134051080 kB
-------------------------------------------------------------------
=== Running /cygdrive/d/jiajia/github/cntk/x64/debug/cntk.exe configFile=D:\jiajia\github\cntk\Tests\EndToEndTests\Text\IRMetric/ir.cntk currentDirectory=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu\TestData RunDir=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu DataDir=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu\TestData ConfigDir=D:\jiajia\github\cntk\Tests\EndToEndTests\Text\IRMetric OutputDir=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu DeviceId=0 timestamping=true makeMode=true
CNTK 1.7.1 (Sep 28 2016 15:16:12) on SAADSRNRDEV038 at 2016/09/29 21:50:10
D:\jiajia\github\cntk\x64\debug\cntk.exe configFile=D:\jiajia\github\cntk\Tests\EndToEndTests\Text\IRMetric/ir.cntk currentDirectory=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu\TestData RunDir=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu DataDir=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu\TestData ConfigDir=D:\jiajia\github\cntk\Tests\EndToEndTests\Text\IRMetric OutputDir=C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu DeviceId=0 timestamping=true makeMode=true
Changed current directory to C:\cygwin64\tmp\cntk-test-20160929135008.377590\Text_IRMetric@debug_gpu\TestData
09/29/2016 21:50:12: ##############################################################################
09/29/2016 21:50:12: # #
09/29/2016 21:50:12: # train command (train action) #
09/29/2016 21:50:12: # #
09/29/2016 21:50:12: ##############################################################################
parallelTrain option is not enabled. ParallelTrain config will be ignored.
09/29/2016 21:50:12:
Creating virgin network.
Post-processing network...
3 roots:
irm = IRMetric()
irm1 = IRMetricEval()
s = Plus()
Validating network. 41 nodes to process in pass 1.
Validating --> Gain = InputValue() : -> [1 x *]
Validating --> s.z.W = LearnableParameter() : -> [1 x 204]
Validating --> Q.z.W = LearnableParameter() : -> [56 x 49293]
Validating --> Query = SparseInputValue() : -> [49293 x *]
Validating --> Q.z.z.PlusArgs[0] = Times (Q.z.W, Query) : [56 x 49293], [49293 x *] -> [56 x *]
Validating --> Q.z.B = LearnableParameter() : -> [56 x 1]
Validating --> Q.z.z = Plus (Q.z.z.PlusArgs[0], Q.z.B) : [56 x *], [56 x 1] -> [56 x 1 x *]
Validating --> Q = RectifiedLinear (Q.z.z) : [56 x 1 x *] -> [56 x 1 x *]
Validating --> T.z.W = LearnableParameter() : -> [56 x 49293]
Validating --> Title = SparseInputValue() : -> [49293 x *]
Validating --> T.z.z.PlusArgs[0] = Times (T.z.W, Title) : [56 x 49293], [49293 x *] -> [56 x *]
Validating --> T.z.B = LearnableParameter() : -> [56 x 1]
Validating --> T.z.z = Plus (T.z.z.PlusArgs[0], T.z.B) : [56 x *], [56 x 1] -> [56 x 1 x *]
Validating --> T = RectifiedLinear (T.z.z) : [56 x 1 x *] -> [56 x 1 x *]
Validating --> U.z.W = LearnableParameter() : -> [56 x 49293]
Validating --> NormalizedUrl = SparseInputValue() : -> [49293 x *]
Validating --> U.z.z.PlusArgs[0] = Times (U.z.W, NormalizedUrl) : [56 x 49293], [49293 x *] -> [56 x *]
Validating --> U.z.B = LearnableParameter() : -> [56 x 1]
Validating --> U.z.z = Plus (U.z.z.PlusArgs[0], U.z.B) : [56 x *], [56 x 1] -> [56 x 1 x *]
Validating --> U = RectifiedLinear (U.z.z) : [56 x 1 x *] -> [56 x 1 x *]
Validating --> RS0._ = RowStack (Q, T, U) : [56 x 1 x *], [56 x 1 x *], [56 x 1 x *] -> [168 x 1 x *]
Validating --> RS0 = Dropout (RS0._) : [168 x 1 x *] -> [168 x 1 x *]
Validating --> BingCounts = InputValue() : -> [36 x *]
Validating --> R0 = RowStack (RS0, BingCounts) : [168 x 1 x *], [36 x *] -> [204 x 1 x *]
Validating --> R1.l2.W = LearnableParameter() : -> [204 x 512]
Validating --> R1.l1.z.W = LearnableParameter() : -> [512 x 204]
Validating --> R1.l1.z.z.PlusArgs[0] = Times (R1.l1.z.W, R0) : [512 x 204], [204 x 1 x *] -> [512 x 1 x *]
Validating --> R1.l1.z.B = LearnableParameter() : -> [512 x 1]
Validating --> R1.l1.z.z = Plus (R1.l1.z.z.PlusArgs[0], R1.l1.z.B) : [512 x 1 x *], [512 x 1] -> [512 x 1 x *]
Validating --> R1.l1.act = RectifiedLinear (R1.l1.z.z) : [512 x 1 x *] -> [512 x 1 x *]
Validating --> R1.l2.z.PlusArgs[0] = Times (R1.l2.W, R1.l1.act) : [204 x 512], [512 x 1 x *] -> [204 x 1 x *]
Validating --> R1.l2.B = LearnableParameter() : -> [204 x 1]
Validating --> R1.l2.z = Plus (R1.l2.z.PlusArgs[0], R1.l2.B) : [204 x 1 x *], [204 x 1] -> [204 x 1 x *]
Validating --> R1.act._ = Plus (R0, R1.l2.z) : [204 x 1 x *], [204 x 1 x *] -> [204 x 1 x *]
Validating --> R1 = RectifiedLinear (R1.act._) : [204 x 1 x *] -> [204 x 1 x *]
Validating --> s.z.z.PlusArgs[0] = Times (s.z.W, R1) : [1 x 204], [204 x 1 x *] -> [1 x 1 x *]
Validating --> s.z.B = LearnableParameter() : -> [1 x 1]
Validating --> s = Plus (s.z.z.PlusArgs[0], s.z.B) : [1 x 1 x *], [1 x 1] -> [1 x 1 x *]
Validating --> GroupId = InputValue() : -> [1 x *]
Validating --> irm = IRMetric (Gain, s, GroupId) : [1 x *], [1 x 1 x *], [1 x *] -> [1]
Validating --> irm1 = IRMetricEval (Gain, s, GroupId) : [1 x *], [1 x 1 x *], [1 x *] -> [1]
Validating network. 23 nodes to process in pass 2.
Validating network, final pass.
Post-processing network complete.
09/29/2016 21:50:15:
Model has 41 nodes. Using GPU 0.
09/29/2016 21:50:15: Training criterion: irm = IRMetric
09/29/2016 21:50:15: Evaluation criterion: irm1 = IRMetricEval
Allocating matrices for forward and/or backward propagation.
Memory Sharing: Out of 75 matrices, 44 are shared as 19, and 31 are not shared.
{ T : [56 x 1 x *]
T.z.z.PlusArgs[0] : [56 x *] (gradient) }
{ R1.l1.z.W : [512 x 204] (gradient)
R1.l1.z.z : [512 x 1 x *] }
{ R1.l2.W : [204 x 512] (gradient)
R1.l2.z : [204 x 1 x *] }
{ R1.l1.act : [512 x 1 x *]
R1.l1.z.z.PlusArgs[0] : [512 x 1 x *] (gradient) }
{ Q : [56 x 1 x *]
Q.z.z.PlusArgs[0] : [56 x *] (gradient) }
{ R1.l1.z.z.PlusArgs[0] : [512 x 1 x *]
RS0 : [168 x 1 x *] (gradient)
U : [56 x 1 x *] (gradient)
U.z.B : [56 x 1] (gradient) }
{ R1.l1.z.z : [512 x 1 x *] (gradient)
R1.l2.z.PlusArgs[0] : [204 x 1 x *] }
{ R1.act._ : [204 x 1 x *] (gradient)
R1.l2.B : [204 x 1] (gradient)
s.z.z.PlusArgs[0] : [1 x 1 x *] }
{ T.z.z : [56 x 1 x *] (gradient)
U.z.z.PlusArgs[0] : [56 x *] }
{ RS0._ : [168 x 1 x *]
U.z.z : [56 x 1 x *] (gradient) }
{ R1 : [204 x 1 x *] (gradient)
R1.l1.act : [512 x 1 x *] (gradient)
R1.l1.z.B : [512 x 1] (gradient)
R1.l2.z : [204 x 1 x *] (gradient) }
{ Q.z.z : [56 x 1 x *] (gradient)
T.z.z.PlusArgs[0] : [56 x *] }
{ U : [56 x 1 x *]
U.z.z.PlusArgs[0] : [56 x *] (gradient) }
{ R1.act._ : [204 x 1 x *]
R1.l2.z.PlusArgs[0] : [204 x 1 x *] (gradient) }
{ s : [1 x 1 x *] (gradient)
s.z.W : [1 x 204] (gradient) }
{ R0 : [204 x 1 x *] (gradient)
R1 : [204 x 1 x *] }
{ R0 : [204 x 1 x *]
RS0._ : [168 x 1 x *] (gradient) }
{ RS0 : [168 x 1 x *]
T : [56 x 1 x *] (gradient)
T.z.B : [56 x 1] (gradient) }
{ Q : [56 x 1 x *] (gradient)
Q.z.B : [56 x 1] (gradient) }
09/29/2016 21:50:15: Training 8491209 parameters in 12 out of 12 parameter tensors and 34 nodes with gradient:
09/29/2016 21:50:15: Node 'Q.z.B' (LearnableParameter operation) : [56 x 1]
09/29/2016 21:50:15: Node 'Q.z.W' (LearnableParameter operation) : [56 x 49293]
09/29/2016 21:50:15: Node 'R1.l1.z.B' (LearnableParameter operation) : [512 x 1]
09/29/2016 21:50:15: Node 'R1.l1.z.W' (LearnableParameter operation) : [512 x 204]
09/29/2016 21:50:15: Node 'R1.l2.B' (LearnableParameter operation) : [204 x 1]
09/29/2016 21:50:15: Node 'R1.l2.W' (LearnableParameter operation) : [204 x 512]
09/29/2016 21:50:15: Node 'T.z.B' (LearnableParameter operation) : [56 x 1]
09/29/2016 21:50:15: Node 'T.z.W' (LearnableParameter operation) : [56 x 49293]
09/29/2016 21:50:15: Node 'U.z.B' (LearnableParameter operation) : [56 x 1]
09/29/2016 21:50:15: Node 'U.z.W' (LearnableParameter operation) : [56 x 49293]
09/29/2016 21:50:15: Node 's.z.B' (LearnableParameter operation) : [1 x 1]
09/29/2016 21:50:15: Node 's.z.W' (LearnableParameter operation) : [1 x 204]
09/29/2016 21:50:15: No PreCompute nodes found, or all already computed. Skipping pre-computation step.
10/12/2016 01:08:44: Starting Epoch 1: learning rate per sample = 0.000100 effective momentum = 0.900000 momentum as time constant = 9719.0 samples
10/12/2016 01:08:44: Starting minibatch loop.
10/14/2016 16:18:54: Finished Epoch[ 1 of 5]: [Training] irm = 45.42444375 * 5000; irm1 = 18.67998750 * 5000; totalSamplesSeen = 5000; learningRatePerSample = 9.9999997e-05; epochTime=1.08043s
10/12/2016 01:08:48: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161011170840.409571\Text_IRMetric@release_gpu/Models/ir.net.1'
10/12/2016 01:08:49: Starting Epoch 2: learning rate per sample = 0.000100 effective momentum = 0.900000 momentum as time constant = 9719.0 samples
10/12/2016 01:08:49: Starting minibatch loop.
10/14/2016 16:18:55: Finished Epoch[ 2 of 5]: [Training] irm = 41.23000625 * 5000; irm1 = 25.73576875 * 5000; totalSamplesSeen = 10000; learningRatePerSample = 9.9999997e-05; epochTime=0.288752s
10/12/2016 01:08:50: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161011170840.409571\Text_IRMetric@release_gpu/Models/ir.net.2'
10/12/2016 01:08:50: Starting Epoch 3: learning rate per sample = 0.000100 effective momentum = 0.900000 momentum as time constant = 9719.0 samples
10/12/2016 01:08:50: Starting minibatch loop.
10/14/2016 16:18:56: Finished Epoch[ 3 of 5]: [Training] irm = 39.80844688 * 5000; irm1 = 27.15674063 * 5000; totalSamplesSeen = 15000; learningRatePerSample = 9.9999997e-05; epochTime=0.282538s
10/12/2016 01:08:51: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161011170840.409571\Text_IRMetric@release_gpu/Models/ir.net.3'
10/12/2016 01:08:52: Starting Epoch 4: learning rate per sample = 0.000100 effective momentum = 0.900000 momentum as time constant = 9719.0 samples
10/12/2016 01:08:52: Starting minibatch loop.
10/14/2016 16:18:57: Finished Epoch[ 4 of 5]: [Training] irm = 38.94093125 * 5000; irm1 = 28.90999375 * 5000; totalSamplesSeen = 20000; learningRatePerSample = 9.9999997e-05; epochTime=0.271508s
10/12/2016 01:08:52: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161011170840.409571\Text_IRMetric@release_gpu/Models/ir.net.4'
10/12/2016 01:08:53: Starting Epoch 5: learning rate per sample = 0.000100 effective momentum = 0.900000 momentum as time constant = 9719.0 samples
10/12/2016 01:08:53: Starting minibatch loop.
10/14/2016 16:18:59: Finished Epoch[ 5 of 5]: [Training] irm = 38.99923125 * 5000; irm1 = 28.86687188 * 5000; totalSamplesSeen = 25000; learningRatePerSample = 9.9999997e-05; epochTime=0.278808s
10/12/2016 01:08:54: SGD: Saving checkpoint model 'C:\cygwin64\tmp\cntk-test-20161011170840.409571\Text_IRMetric@release_gpu/Models/ir.net'
09/29/2016 21:50:38: Action "train" complete.
09/29/2016 21:50:38: __COMPLETED__
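
Read together, the validation lines in this baseline describe a small residual ranking network. As a compact summary (a paraphrase of the log above, not part of the baseline file), with q, t, u the sparse 49293-dimensional Query/Title/NormalizedUrl inputs and c the 36-dimensional BingCounts input:

\[
R_0 = \begin{bmatrix} \operatorname{ReLU}(W_Q q + b_Q) \\ \operatorname{ReLU}(W_T t + b_T) \\ \operatorname{ReLU}(W_U u + b_U) \\ c \end{bmatrix}, \qquad
R_1 = \operatorname{ReLU}\bigl(R_0 + W_2\,\operatorname{ReLU}(W_1 R_0 + b_1) + b_2\bigr), \qquad
s = w_s^{\top} R_1 + b_s,
\]

where Dropout is applied to the stacked 168-dimensional block before c is appended (giving the 204-dimensional R_0), W_1 is 512 x 204, W_2 is 204 x 512, and the scalar score s is passed with Gain and GroupId to the IRMetric training criterion and the IRMetricEval evaluation criterion.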

View file

@@ -1,6 +1,5 @@
WorkingDir = C:\temp
DataDir = \\storage.ccp.philly.selfhost.corp.microsoft.com\public\CNTKTestData\Text\IRMetric
RunDir = $WorkingDir$
ConfigDir = $WorkingDir$

View file

@@ -24,5 +24,3 @@ ConfigDir=$TEST_DIR/
# cntkrun <CNTK config file name> <additional CNTK args>
cntkrun ir.cntk 'makeMode=true' || exit $?
#TEST_DATA_DIR