Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
Alexey Kamenev dede055e1c Change CUDA settings for Debug builds. 2015-09-17 16:37:00 -07:00
BrainScript moved most parts of BrainScriptEvaluator.h (those that are now independent of BrainScript) to BrainScriptObjects.h, and then renamed it ScriptableObjects.h; 2015-09-14 21:38:45 +02:00
Common Moved the parallel trainign guard when writing model/checkpoint files to the actual save functions instead of guarding at the call sites 2015-09-15 21:04:09 -07:00
DataReader Merge branch 'master' into erw/MA-latest 2015-09-16 21:36:59 -07:00
Demos Fixed a couple of bugs in the ModelAveraging implementation and some code cleanup 2015-08-26 13:01:42 -07:00
Documentation update CNTK book. 2015-08-10 14:59:42 -07:00
ExampleSetups Minor changes in readme.txt and .config samples 2015-08-26 14:37:09 -07:00
MachineLearning Fix a bug introduced during branch merging. 2015-09-16 21:57:48 -07:00
Math Change CUDA settings for Debug builds. 2015-09-17 16:37:00 -07:00
Scripts Updating build-and-test script and e2e tests to reflect new build directory layout 2015-08-07 19:00:28 -07:00
Tests Introducing a flexible test tagging system: 2015-09-09 15:54:09 -07:00
license First Release of CNTK 2014-08-29 16:21:42 -07:00
.gitattributes Added ability to run speech e2e tests on Windows (cygwin) 2015-08-11 18:17:37 -07:00
.gitignore Add a configure script for setting build parameters 2015-08-07 13:03:27 -04:00
CNTK.sln made BrainScriptEvaluator.h independent of BrainScriptParser.h, in prep of separating out ConfigValue, ConfigRecord, ConfigArray, and ConfigLambda from BrainScript so that they can be used by other language wrappers as well. This required to replace ConfigValuePtr::textLocation by a lambda that prints an error string, annotated with a text location hidden inside the lambda (think Python wrapper--the lambda knows how to pinpoint the location in the Python source). Instead of throwing EvaluationError(msg, val.GetTextLocation()), one now instead says val.Fail(msg); 2015-09-14 20:19:35 +02:00
CppCntk.vssettings add Visual Studio C++ settings files. Make sure everyone uses the same tab, indentation and other settings in CNTK. 2015-07-09 11:23:27 -07:00
Makefile Change CUDA settings for Debug builds. 2015-09-17 16:37:00 -07:00
Makefile_kaldi.cpu Merge remote-tracking branch 'origin/master' into linux-gcc 2015-07-05 22:28:21 -07:00
Makefile_kaldi.gpu Merge remote-tracking branch 'origin/master' into linux-gcc 2015-07-05 22:28:21 -07:00
Makefile_kaldi2.cpu Adding interface for utterance derivative computation in the Kaldi2Reader 2015-07-29 03:30:56 +00:00
Makefile_kaldi2.gpu Adding interface for utterance derivative computation in the Kaldi2Reader 2015-07-29 03:30:56 +00:00
README Add a configure script for setting build parameters 2015-08-07 13:03:27 -04:00
ThirdPartyNotices.txt Added end-to-end test for CNTK speech workloads using AN4 dataset 2015-08-05 12:17:09 -07:00
configure Fix configure script permissions... again 2015-09-04 12:22:12 -07:00
kaldi_vars.mk Merge remote-tracking branch 'origin/dongyu/dev' into linux-gcc 2015-06-30 15:00:58 +00:00

README

== Dev branch           ==
This branch contains some features that are not yet checked into the main branch. To check out this branch, run:

git checkout origin/Dev 

== To-do                ==
Add descriptions to LSTMnode
Add descriptions to 0/1 mask segmentation in feature reader, delay node, and crossentropywithsoftmax node
Change criterion node to use the 0/1 mask, following example in crossentropywithsoftmax node
Add description of encoder-decoder simple network builder
Add description of time-reverse node, simple network builder and NDL builder for bi-directional models

== Authors of the README ==
    Kaisheng Yao
    Microsoft Research
    email: kaisheny@microsoft.com

    Wengong Jin
    Shanghai Jiao Tong University
    email: acmgokun@gmail.com

    Yu Zhang, Leo Liu
    CSAIL, Massachusetts Institute of Technology
    email: yzhang87@csail.mit.edu
    email: leoliu_cu@sbcglobal.net

    Guoguo Chen
    CLSP, Johns Hopkins University
    email: guoguo@jhu.edu

== Preliminaries ==
To build the CPU version, you must first install either the Intel MKL BLAS library or the ACML library. Note that ACML is free, whereas MKL may not be.

for MKL:
1. Download from https://software.intel.com/en-us/intel-mkl

for ACML:
1. Download from http://developer.amd.com/tools-and-sdks/cpu-development/amd-core-math-library-acml/

for Kaldi:
1. In kaldi-trunk/tools/Makefile, uncomment the line # OPENFST_VERSION = 1.4.1, and
   re-install OpenFst using the makefile.
2. In kaldi-trunk/src/, re-compile Kaldi by running
   ./configure --shared; make depend -j 8; make -j 8
   (the -j option sets the number of parallel jobs).

To build the GPU version, you must first install NVIDIA CUDA.

== Build Preparation ==
Let $CNTK be the CNTK directory.
>mkdir build
>cd build
>$CNTK/configure -h

You will see the various options for configure, as well as their default
values.  CNTK needs a CPU math directory, either acml or mkl.  If you
do not specify one and both are available, acml will be used.  For GPU
use, cuda and gdk directories are also required.  Similarly, to build
the Kaldi plugin, a kaldi directory is required.  You may also specify
whether you want a debug or release build.  Rerun configure with the
desired options.

>$CNTK/configure ...

This will create a Config.make and a Makefile (if you are in the $CNTK
directory, a Makefile will not be created).  The Config.make file
records the configuration parameters and the Makefile reinvokes the
$CNTK/Makefile, passing it the build directory where it can find the
Config.make.
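
The configure step above can be wrapped in a small helper. This is only a
sketch: the function name cntk_configure_build is hypothetical, and it assumes
(as the text above suggests) that configure writes Config.make and the
forwarding Makefile into the current working directory. No configure option
names are assumed; whatever you pass after the two directories is forwarded
unchanged.

```shell
# Sketch: configure an out-of-tree CNTK build directory.
# Usage: cntk_configure_build <absolute-cntk-source-dir> <build-dir> [configure options...]
# The source directory must be an absolute path because we cd before calling configure.
cntk_configure_build() {
    src_dir=$1
    build_dir=$2
    shift 2
    mkdir -p "$build_dir" || return 1
    # Run configure from inside the build directory so that its output files
    # (assumed: Config.make and a forwarding Makefile) land there.
    ( cd "$build_dir" && "$src_dir/configure" "$@" )
}
```

For example, cntk_configure_build "$CNTK" "$CNTK/build" followed by the options
chosen from the configure -h output; afterwards run make inside the build
directory.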

After make completes, you will have the following directories:

  .build contains object files and can be deleted
  bin contains the cntk program
  lib contains libraries and plugins

  The bin and lib directories can safely be moved as long as they remain siblings.

To clean the build:

>make clean

== Run ==
All executables are in the bin directory:
	cntk: the main executable for CNTK
	*.so: shared libraries for the corresponding readers; these are loaded dynamically at runtime.

	./cntk configFile=${your cntk config file}
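
A minimal sketch of a wrapper around the command above. The function name
cntk_run is hypothetical; it only checks that the config file exists before
invoking the cntk binary with the configFile= argument shown above.

```shell
# Sketch: run cntk with a config file, failing early if the file is missing.
# Usage: cntk_run <bin-dir> <config-file>
cntk_run() {
    bin_dir=$1
    config_file=$2
    if [ ! -f "$config_file" ]; then
        echo "config file not found: $config_file" >&2
        return 1
    fi
    # Invoke the cntk binary from the given bin directory.
    "$bin_dir/cntk" configFile="$config_file"
}
```

For example: cntk_run ./bin my_experiment.config (the config file name is a
placeholder).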

== Kaldi Reader ==
This is an HTKMLF reader and Kaldi writer (for decoding).

To build, set KALDI_PATH in your Config.make

The writer section looks like this:

writer=[
    writerType=KaldiReader
    readMethod=blockRandomize
    frameMode=false
    miniBatchMode=Partial
    randomize=Auto
    verbosity=1
    ScaledLogLikelihood=[
        dim=$labelDim$
        Kaldicmd="ark:-" # will pipe to the Kaldi decoder latgen-faster-mapped
        scpFile=$outputSCP$ # the file key of the features
    ]
]

== Kaldi2 Reader ==
This is a Kaldi reader and Kaldi writer (for decoding).

To build, set KALDI_PATH in your Config.make

The features section is different:

features=[
    dim=
    rx=
    scpFile=
    featureTransform=
]

rx is a text file which contains:

    one Kaldi feature rxspecifier readable by RandomAccessBaseFloatMatrixReader.
    'ark:' specifiers don't work; only 'scp:' specifiers work.

scpFile is a text file generated by running:

    feat-to-len FEATURE_RXSPECIFIER_FROM_ABOVE ark,t:- > TEXT_FILE_NAME

    scpFile should contain one line per utterance.

    If you want to run with fewer utterances, just shorten this file.
    (It will load the feature rxspecifier but ignore utterances not present in scpFile).
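
The rx and scpFile steps above can be sketched as three small shell helpers.
The function names and all file paths are hypothetical; feat-to-len is the
Kaldi tool named above and must be on PATH for the second helper. The sketch
assumes the features are available through a Kaldi .scp file, since only
'scp:' specifiers work.

```shell
# Write the rx file: a single line holding one 'scp:' feature rxspecifier.
# Usage: kaldi2_write_rx <feats.scp path> <rx-file>
kaldi2_write_rx() {
    feats_scp=$1
    rx_file=$2
    printf 'scp:%s\n' "$feats_scp" > "$rx_file"
}

# Generate scpFile (one "utterance-id length" line per utterance) from the
# rxspecifier stored in the rx file. Requires Kaldi's feat-to-len.
# Usage: kaldi2_write_scpfile <rx-file> <scp-file>
kaldi2_write_scpfile() {
    rx_file=$1
    scp_file=$2
    feat-to-len "$(cat "$rx_file")" ark,t:- > "$scp_file"
}

# Keep only the first N utterances, to run with a smaller data set.
# Usage: kaldi2_first_utts <scp-file> <count> <out-file>
kaldi2_first_utts() {
    scp_file=$1
    count=$2
    out_file=$3
    head -n "$count" "$scp_file" > "$out_file"
}
```

Point scpFile in the features section at the shortened file; the rx file can
stay unchanged.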

featureTransform is the name of a Kaldi feature transform file:
    
    Kaldi feature transform files are used for stacking / applying transforms to features.

    To ignore this option, use an empty string (if the config file reader permits it)
    or the special string NO_FEATURE_TRANSFORM.

********** Labels **********

The labels section is also different.

labels=[
    mlfFile=
    labelDim=
    labelMappingFile=
]

The only difference is mlfFile, which now has a different format. It is a text file which contains:

    one Kaldi label rxspecifier readable by Kaldi's copy-post binary.
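
A minimal sketch of preparing such a file. The function names and paths are
hypothetical, and the rxspecifier you store depends on how your posteriors
were produced. The optional check assumes Kaldi's copy-post is on PATH and
simply asks it to read the specifier and discard the output.

```shell
# Write mlfFile: a single line holding one Kaldi label (posterior) rxspecifier.
# Usage: kaldi2_write_mlf <rxspecifier> <mlf-file>
kaldi2_write_mlf() {
    post_rxspec=$1
    mlf_file=$2
    printf '%s\n' "$post_rxspec" > "$mlf_file"
}

# Optional sanity check: let copy-post read the stored rxspecifier and
# write the result to /dev/null. Requires Kaldi's copy-post.
# Usage: kaldi2_check_mlf <mlf-file>
kaldi2_check_mlf() {
    mlf_file=$1
    copy-post "$(cat "$mlf_file")" ark:/dev/null
}
```

For example: kaldi2_write_mlf 'ark:exp/train.post' labels.mlf, where
exp/train.post is a placeholder for your own posterior archive.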