change CNTK book author list to include additional contributors and sort in alphabetical order.
Change SLU/lstmNDl.txt to use outputs as the output node to be consistent with simple network builder. Modify README and rename old README to KaldiReaderReadme
Parent
2370bcfd27
Commit
802b2194b5
File diff suppressed because it is too large
|
@ -98,32 +98,44 @@ An Introduction to Computational Networks and the Computational Network
|
|||
\end_layout
|
||||
|
||||
\begin_layout Author
|
||||
Dong Yu, Adam Eversole, Michael L.
|
||||
Seltzer, Kaisheng Yao,
|
||||
Amit Agarwal, Eldar Akchurin, Chris Basoglu, Guoguo Chen,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Brian Guenter, Oleksii Kuchaiev, Yu Zhang, Frank Seide, Guoguo Chen,
|
||||
Scott Cyphers, Jasha Droppo, Adam Eversole, Brian Guenter,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Huaming Wang, Jasha Droppo, Amit Agarwal, Chris Basoglu,
|
||||
Mark Hillebrand, Xuedong Huang, Zhiheng Huang, Vladimir Ivanov,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Marko Padmilac, Alexey Kamenev, Vladimir Ivanov, Scott Cypher,
|
||||
Alexey Kamenev, Philipp Kranen, Oleksii Kuchaiev, Wolfgang Manousek,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Hari Parthasarathi, Bhaskar Mitra, Zhiheng Huang, Geoffrey Zweig,
|
||||
Avner May, Bhaskar Mitra, Olivier Nano, Gaizka Navarro,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Chris Rossbach, Jon Currey,Jie Gao, Avner May, Baolin Peng,
|
||||
Alexey Orlov, Marko Padmilac, Hari Parthasarathi, Baolin Peng,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Andreas Stolcke, Malcolm Slaney, Xuedong Huang
|
||||
Alexey Reznichenko, Frank Seide, Michael L.
|
||||
Seltzer, Malcolm Slaney,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Andreas Stolcke, Huaming Wang, Kaisheng Yao, Dong Yu,
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
Yu Zhang, Geoffrey Zweig
|
||||
\begin_inset Newline newline
|
||||
\end_inset
|
||||
|
||||
(*authors are listed in alphabetical order)
|
||||
\end_layout
|
||||
|
||||
\begin_layout Date
|
||||
|
|
|
@ -206,13 +206,13 @@ ndlCreateNetwork=[
|
|||
|
||||
#else if you don't want to include a bias, uncomment the following and comment out the above block
|
||||
|
||||
outvalue = Times(W, LSTMoutput3);
|
||||
outputs = Times(W, LSTMoutput3);
|
||||
|
||||
#end if
|
||||
|
||||
cr = CrossEntropyWithSoftmax(labels, outvalue);
|
||||
cr = CrossEntropyWithSoftmax(labels, outputs);
|
||||
|
||||
CriterionNodes = (cr)
|
||||
EvalNodes = (cr)
|
||||
OutputNodes = (outvalue)
|
||||
OutputNodes = (outputs)
|
||||
]
|
|
@ -0,0 +1,164 @@
|
|||
|
||||
== Authors of the Linux Building README ==
|
||||
|
||||
Kaisheng Yao
|
||||
Microsoft Research
|
||||
email: kaisheny@microsoft.com
|
||||
|
||||
Wengong Jin,
|
||||
Shanghai Jiao Tong University
|
||||
email: acmgokun@gmail.com
|
||||
|
||||
Yu Zhang, Leo Liu, Scott Cyphers
|
||||
CSAIL, Massachusetts Institute of Technology
|
||||
email: yzhang87@csail.mit.edu
|
||||
email: leoliu_cu@sbcglobal.net
|
||||
email: cyphers@mit.edu
|
||||
|
||||
Guoguo Chen
|
||||
CLSP, Johns Hopkins University
|
||||
email: guoguo@jhu.edu
|
||||
|
||||
== Preliminaries ==
|
||||
|
||||
To build the CPU version, you must first install the Intel MKL BLAS library
|
||||
or the ACML library. Note that ACML is free, whereas MKL may not be.
|
||||
|
||||
for MKL:
|
||||
1. Download from https://software.intel.com/en-us/intel-mkl
|
||||
|
||||
for ACML:
|
||||
1. Download from
|
||||
http://developer.amd.com/tools-and-sdks/archive/amd-core-math-library-acml/acml-downloads-resources/
|
||||
We have seen some problems with some versions of the library on Intel
|
||||
processors, but have had success with acml-5-3-1-ifort-64bit.tgz
|
||||
|
||||
for Kaldi:
|
||||
1. In kaldi-trunk/tools/Makefile, uncomment # OPENFST_VERSION = 1.4.1, and
|
||||
re-install OpenFst using the makefile.
|
||||
2. In kaldi-trunk/src/, do ./configure --shared; make depend -j 8; make -j 8;
|
||||
and re-compile Kaldi (the -j option is for parallelization).
|
||||
|
||||
To build the GPU version, you must first install NVIDIA CUDA.
|
||||
|
||||
== Build Preparation ==
|
||||
|
||||
You can do an out-of-source build in any directory, as well as an
|
||||
in-source build. Let $CNTK be the CNTK directory. For an out-of-source
|
||||
build in the directory "build", type:
|
||||
|
||||
>mkdir build
|
||||
>cd build
|
||||
>$CNTK/configure -h
|
||||
|
||||
(For an in source build, just run configure in the $CNTK directory).
|
||||
|
||||
You will see various options for configure, as well as their default
|
||||
values. CNTK needs a CPU math directory, either acml or mkl. If you
|
||||
do not specify one and both are available, acml will be used. For GPU
|
||||
use, a cuda and gdk directory are also required. Similarly, to build
|
||||
the kaldi plugin a kaldi directory is required. You may also specify
|
||||
whether you want a debug or release build, as well as add additional
|
||||
path roots to use in searching for libraries.
|
||||
|
||||
Rerun configure with the desired options:
|
||||
|
||||
>$CNTK/configure ...
|
||||
|
||||
This will create a Config.make and a Makefile (if you are doing an in
|
||||
source build, a Makefile will not be created). The Config.make file
|
||||
records the configuration parameters and the Makefile reinvokes the
|
||||
$CNTK/Makefile, passing it the build directory where it can find the
|
||||
Config.make.
|
||||
|
||||
After make completes, you will have the following directories:
|
||||
|
||||
.build will contain object files, and can be deleted
|
||||
bin contains the cntk program
|
||||
lib contains libraries and plugins
|
||||
|
||||
The bin and lib directories can safely be moved as long as they
|
||||
remain siblings.
|
||||
|
||||
To clean
|
||||
|
||||
>make clean
|
||||
|
||||
== Run ==
|
||||
All executables are in the bin directory:
|
||||
cntk: The main executable for CNTK
|
||||
*.so: shared libraries for the corresponding readers; these readers are linked and loaded dynamically at runtime.
|
||||
|
||||
./cntk configFile=${your cntk config file}
|
||||
|
||||
== Kaldi Reader ==
|
||||
This is an HTKMLF reader and Kaldi writer (for decoding).
|
||||
|
||||
To build, set --with-kaldi when you configure.
|
||||
|
||||
The feature section looks like:
|
||||
|
||||
writer=[
|
||||
writerType=KaldiReader
|
||||
readMethod=blockRandomize
|
||||
frameMode=false
|
||||
miniBatchMode=Partial
|
||||
randomize=Auto
|
||||
verbosity=1
|
||||
ScaledLogLikelihood=[
|
||||
dim=$labelDim$
|
||||
Kaldicmd="ark:-" # will pipe to the Kaldi decoder latgen-faster-mapped
|
||||
scpFile=$outputSCP$ # the file key of the features
|
||||
]
|
||||
]
|
||||
|
||||
== Kaldi2 Reader ==
|
||||
This is a Kaldi reader and Kaldi writer (for decoding).
|
||||
|
||||
To build, set --with-kaldi in your Config.make
|
||||
|
||||
The features section is different:
|
||||
|
||||
features=[
|
||||
dim=
|
||||
rx=
|
||||
scpFile=
|
||||
featureTransform=
|
||||
]
|
||||
|
||||
rx is a text file which contains:
|
||||
|
||||
one Kaldi feature rxspecifier readable by RandomAccessBaseFloatMatrixReader.
|
||||
'ark:' specifiers don't work; only 'scp:' specifiers work.
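
The two rules above can be checked before a run. The following is a minimal sketch in Python, not part of CNTK or Kaldi: the helper name check_rx_specifier and the example path are hypothetical, and it assumes a plain 'scp:' prefix with no extra specifier options.

```python
def check_rx_specifier(text):
    """Return the single rxspecifier found in `text`, or raise ValueError.

    Hypothetical helper: it only enforces the two rules stated in this
    README (exactly one specifier, and it must be an 'scp:' specifier).
    Assumption: the specifier carries no options before the colon.
    """
    specifiers = [line.strip() for line in text.splitlines() if line.strip()]
    if len(specifiers) != 1:
        raise ValueError("expected exactly one rxspecifier, found %d" % len(specifiers))
    spec = specifiers[0]
    if not spec.startswith("scp:"):
        raise ValueError("only 'scp:' specifiers work, got: %s" % spec)
    return spec


if __name__ == "__main__":
    # The path below is a made-up example, not a real data location.
    print(check_rx_specifier("scp:/data/train/feats.scp\n"))
```

In practice the text would be read from the file named by rx= before calling the helper.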
|
||||
|
||||
scpFile is a text file generated by running:
|
||||
|
||||
feat-to-len FEATURE_RXSPECIFIER_FROM_ABOVE ark,t:- > TEXT_FILE_NAME
|
||||
|
||||
scpFile should contain one line per utterance.
|
||||
|
||||
If you want to run with fewer utterances, just shorten this file.
|
||||
(It will load the feature rxspecifier but ignore utterances not present in scpFile).
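
Shortening the file can be scripted. This is a minimal Python sketch under stated assumptions: the helper name keep_first_utterances is hypothetical, and the example lines use invented utterance ids and lengths in the "id length" text form that feat-to-len writes with ark,t:-.

```python
def keep_first_utterances(lines, max_utterances):
    """Return the first `max_utterances` non-empty lines of an scpFile.

    Hypothetical helper: each kept line stands for one utterance, so
    dropping lines is how the run is limited to fewer utterances.
    """
    kept = [line for line in lines if line.strip()]
    return kept[:max_utterances]


if __name__ == "__main__":
    # Invented example contents; real ids and lengths come from feat-to-len.
    example = ["utt001 312\n", "utt002 287\n", "utt003 451\n"]
    for line in keep_first_utterances(example, 2):
        print(line, end="")
```

Keeping the leading lines is only one choice; any subset of lines would serve, since utterances missing from scpFile are ignored.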
|
||||
|
||||
featureTransform is the name of a Kaldi feature transform file:
|
||||
|
||||
Kaldi feature transform files are used for stacking / applying transforms to features.
|
||||
|
||||
An empty string (if the config file reader permits one) or the special string NO_FEATURE_TRANSFORM
|
||||
tells the reader to ignore this option.
|
||||
|
||||
********** Labels **********
|
||||
|
||||
The labels section is also different.
|
||||
|
||||
labels=[
|
||||
mlfFile=
|
||||
labelDim=
|
||||
labelMappingFile=
|
||||
]
|
||||
|
||||
The only difference is mlfFile, which now has a different format. It is a text file which contains:
|
||||
|
||||
one Kaldi label rxspecifier readable by Kaldi's copy-post binary.
|
||||
|
||||
|
223
README
|
@ -1,169 +1,106 @@
|
|||
== To-do ==
|
||||
Add descriptions to LSTMnode
|
||||
Add descriptions to 0/1 mask segmentation in feature reader, delay node, and crossentropywithsoftmax node
|
||||
Change criterion node to use the 0/1 mask, following example in crossentropywithsoftmax node
|
||||
Add description of encoder-decoder simple network builder
|
||||
Add description of time-reverse node, simple network builder and NDL builder for bi-directional models
|
||||
######################
|
||||
1. User Manual
|
||||
######################
|
||||
|
||||
== Authors of the README ==
|
||||
Kaisheng Yao
|
||||
Microsoft Research
|
||||
email: kaisheny@microsoft.com
|
||||
A detailed introduction to computational networks and their implementation, as well as the user manual for the Computational Network Toolkit (CNTK), can be found in:
|
||||
|
||||
Wengong Jin,
|
||||
Shanghai Jiao Tong University
|
||||
email: acmgokun@gmail.com
|
||||
Amit Agarwal, Eldar Akchurin, Chris Basoglu, Guoguo Chen, Scott Cyphers, Jasha Droppo, Adam Eversole, Brian Guenter, Mark Hillebrand, Xuedong Huang, Zhiheng Huang, Vladimir Ivanov, Alexey Kamenev, Philipp Kranen, Oleksii Kuchaiev, Wolfgang Manousek, Avner May, Bhaskar Mitra, Olivier Nano, Gaizka Navarro, Alexey Orlov, Marko Padmilac, Hari Parthasarathi, Baolin Peng, Alexey Reznichenko, Frank Seide, Michael L. Seltzer, Malcolm Slaney, Andreas Stolcke, Huaming Wang, Kaisheng Yao, Dong Yu, Yu Zhang, Geoffrey Zweig (in alphabetical order), "An Introduction to Computational Networks and the Computational Network Toolkit", Microsoft Technical Report MSR-TR-2014-112, 2014.
|
||||
|
||||
Yu Zhang, Leo Liu, Scott Cyphers
|
||||
CSAIL, Massachusetts Institute of Technology
|
||||
email: yzhang87@csail.mit.edu
|
||||
email: leoliu_cu@sbcglobal.net
|
||||
email: cyphers@mit.edu
|
||||
Note: In builds before May 18, 2015, the executable is called cn; it has since been renamed to cntk.
|
||||
|
||||
Guoguo Chen
|
||||
CLSP, Johns Hopkins University
|
||||
email: guoguo@jhu.edu
|
||||
There are also four files in the documentation directory of the source that contain additional details.
|
||||
|
||||
== Preliminaries ==
|
||||
######################
|
||||
2. Clone Source Code (Windows)
|
||||
######################
|
||||
|
||||
To build the cpu version, you have to install intel MKL blas library
|
||||
or ACML library first. Note that ACML is free, whereas MKL may not be.
|
||||
The CNTK project uses Git as the source version control system.
|
||||
|
||||
for MKL:
|
||||
1. Download from https://software.intel.com/en-us/intel-mkl
|
||||
If you have Visual Studio 2013 installed, Git is already available. Follow the "Clone a remote Git repository from a third-party service" section in "Set up Git on your dev machine (configure, create, clone, add)" and connect to https://git01.codeplex.com/cntk to clone the source code. We found that installing Git Extension for VS is still helpful, especially for new users.
|
||||
|
||||
for ACML:
|
||||
1. Download from
|
||||
http://developer.amd.com/tools-and-sdks/archive/amd-core-math-library-acml/acml-downloads-resources/
|
||||
We have seen some problems with some versions of the library on Intel
|
||||
processors, but have had success with acml-5-3-1-ifort-64bit.tgz
|
||||
Otherwise, install Git for your OS from the "Using Git with CodePlex" page and clone the CNTK source code with the command "git clone https://git01.codeplex.com/cntk".
|
||||
|
||||
for Kaldi:
|
||||
1. In kaldi-trunk/tools/Makefile, uncomment # OPENFST_VERSION = 1.4.1, and
|
||||
re-install OpenFst using the makefile.
|
||||
2. In kaldi-trunk/src/, do ./configure --shared; make depend -j 8; make -j 8;
|
||||
and re-compile Kaldi (the -j option is for parallelization).
|
||||
######################
|
||||
3. Clone Source Code (Linux/mac)
|
||||
######################
|
||||
|
||||
To build the GPU version, you must first install NVIDIA CUDA.
|
||||
Linux users should replace "git01" with "git" (otherwise you will get an RPC error):
|
||||
|
||||
== Build Preparation ==
|
||||
git clone https://git.codeplex.com/cntk
|
||||
|
||||
You can do an out of source build in any directory, as well as an in
|
||||
source build. Let $CNTK be the CNTK directory. For an out of source
|
||||
build in the directory "build" type
|
||||
For more detail, see this thread: http://codeplex.codeplex.com/workitem/26133
|
||||
|
||||
>mkdir build
|
||||
>cd build
|
||||
>$CNTK/configure -h
|
||||
######################
|
||||
4. Windows Visual Studio Setup (CNTK only runs on 64-bit OS)
|
||||
######################
|
||||
|
||||
(For an in source build, just run configure in the $CNTK directory).
|
||||
# 4.1 Install Visual Studio 2013. After installation make sure to install Update 5 or higher: Go to menu Tools -> Extensions and Updates -> Updates -> Product Updates -> Visual Studio 2013 Update 5 (or higher if applicable)
|
||||
|
||||
You will see various options for configure, as well as their default
|
||||
values. CNTK needs a CPU math directory, either acml or mkl. If you
|
||||
do not specify one and both are available, acml will be used. For GPU
|
||||
use, a cuda and gdk directory are also required. Similarly, to build
|
||||
the kaldi plugin a kaldi directory is required. You may also specify
|
||||
whether you want a debug or release build, as well as add additional
|
||||
path roots to use in searching for libraries.
|
||||
# 4.2 Install CUDA 7.0 from https://developer.nvidia.com/cuda-toolkit-70.
|
||||
# 4.3 Install NVIDIA CUB from https://github.com/NVlabs/cub/archive/1.4.1.zip and unzip the archive.
|
||||
Set environment variable CUB_PATH to CUB folder, e.g.:
|
||||
CUB_PATH=c:\src\cub-1.4.1
|
||||
|
||||
Rerun configure with the desired options:
|
||||
# 4.4 Install ACML 5.3.1 or above (specifically the ifort64 variant, e.g., acml5.3.1-ifort64.exe) from http://developer.amd.com/tools/cpu-development/amd-core-math-library-acml/acml-downloads-resources/. Before launching Visual Studio, set the system environment variable ACML_PATH, e.g.:
|
||||
ACML_PATH=C:\AMD\acml5.3.1\ifort64_mp
|
||||
or to the folder where you installed ACML. (The easiest way to do this on Windows 8 is to press the Windows key, and then in the Metro interface start typing: edit environment variables.)
|
||||
If you are running on an Intel processor with FMA3 support, we also advise setting ACML_FMA=0 in your environment to work around an issue in the ACML library.
|
||||
|
||||
>$CNTK/configure ...
|
||||
# 4.5 Alternatively, if you have an MKL license, you can install the Intel MKL library instead of ACML from https://software.intel.com/en-us/intel-math-kernel-library-evaluation-options and define USE_MKL in the CNTKMath project. MKL is faster and more reliable on Intel chips if you have the license.
|
||||
|
||||
This will create a Config.make and a Makefile (if you are doing an in
|
||||
source build, a Makefile will not be created). The Config.make file
|
||||
records the configuration parameters and the Makefile reinvokes the
|
||||
$CNTK/Makefile, passing it the build directory where it can find the
|
||||
Config.make.
|
||||
# 4.6 Install the latest Microsoft MS-MPI SDK and runtime from https://msdn.microsoft.com/en-us/library/bb524831(v=vs.85).aspx
|
||||
|
||||
After make completes, you will have the following directories:
|
||||
# 4.7 If you want to use ImageReader, download and install OpenCV v3.0.0 for Windows from http://opencv.org/downloads.html
|
||||
Set the OPENCV_PATH environment variable to the OpenCV build folder (e.g. C:\src\opencv\build)
|
||||
|
||||
.build will contain object files, and can be deleted
|
||||
bin contains the cntk program
|
||||
lib contains libraries and plugins
|
||||
# 4.8 Load the CNTKSolution and build the cntk project.
|
||||
|
||||
The bin and lib directories can safely be moved as long as they
|
||||
remain siblings.
|
||||
######################
|
||||
5. Linux GCC Setup
|
||||
######################
|
||||
|
||||
To clean
|
||||
# 5.1 Install the libraries listed in section 4 on your Linux box.
|
||||
|
||||
>make clean
|
||||
# 5.2 Create a directory to build in and create a Config.make in that directory
|
||||
that provides:
|
||||
ACML_PATH= path to ACML library installation
|
||||
only needed if MATHLIB=acml
|
||||
MKL_PATH= path to MKL library installation
|
||||
only needed if MATHLIB=mkl
|
||||
GDK_PATH= path to cuda gdk installation, so $(GDK_PATH)/include/nvidia/gdk/nvml.h exists
|
||||
defaults to /usr
|
||||
BUILDTYPE= One of release or debug
|
||||
defaults to release
|
||||
MATHLIB= One of acml or mkl
|
||||
defaults to acml
|
||||
CUDA_PATH= Path to CUDA
|
||||
If not specified, GPU will not be enabled
|
||||
CUB_PATH= path to NVIDIA CUB installation, so $(CUB_PATH)/cub/cub.cuh exists
|
||||
defaults to /usr/local/cub-1.4.1
|
||||
KALDI_PATH= Path to Kaldi
|
||||
If not specified, Kaldi plugins will not be built
|
||||
OPENCV_PATH= path to OpenCV 3.0.0 installation, so $(OPENCV_PATH) exists
|
||||
defaults to /usr/local/opencv-3.0.0
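
As an illustration of the variables above, the following Python sketch renders the text of a Config.make. It is a sketch under stated assumptions, not the project's tooling: the helper name render_config_make is hypothetical, the NAME=value line format is assumed from the list above, and the path in the usage example is invented.

```python
KNOWN_VARIABLES = ["BUILDTYPE", "MATHLIB", "ACML_PATH", "MKL_PATH", "GDK_PATH",
                   "CUDA_PATH", "CUB_PATH", "KALDI_PATH", "OPENCV_PATH"]


def render_config_make(settings):
    """Render NAME=value lines for a Config.make file (hypothetical helper).

    Enforces only what the list above states: MATHLIB is acml or mkl
    (acml by default), and the matching math library path must be given.
    """
    unknown = sorted(set(settings) - set(KNOWN_VARIABLES))
    if unknown:
        raise ValueError("unknown Config.make variables: " + ", ".join(unknown))
    mathlib = settings.get("MATHLIB", "acml")
    if mathlib not in ("acml", "mkl"):
        raise ValueError("MATHLIB must be acml or mkl")
    needed = "ACML_PATH" if mathlib == "acml" else "MKL_PATH"
    if needed not in settings:
        raise ValueError(needed + " is required when MATHLIB=" + mathlib)
    # Emit variables in a fixed order so the output is reproducible.
    return "".join("%s=%s\n" % (name, settings[name])
                   for name in KNOWN_VARIABLES if name in settings)


if __name__ == "__main__":
    # Invented install path; CUDA_PATH is omitted, so GPU support stays off.
    print(render_config_make({"BUILDTYPE": "release",
                              "MATHLIB": "acml",
                              "ACML_PATH": "/opt/acml5.3.1/ifort64_mp"}), end="")
```

Leaving a variable out relies on the default listed for it above.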
|
||||
|
||||
== Run ==
|
||||
All executables are in the bin directory:
|
||||
cntk: The main executable for CNTK
|
||||
*.so: shared libraries for the corresponding readers; these readers are linked and loaded dynamically at runtime.
|
||||
|
||||
./cntk configFile=${your cntk config file}
|
||||
|
||||
== Kaldi Reader ==
|
||||
This is an HTKMLF reader and Kaldi writer (for decoding).
|
||||
|
||||
To build, set --with-kaldi when you configure.
|
||||
|
||||
The feature section looks like:
|
||||
|
||||
writer=[
|
||||
writerType=KaldiReader
|
||||
readMethod=blockRandomize
|
||||
frameMode=false
|
||||
miniBatchMode=Partial
|
||||
randomize=Auto
|
||||
verbosity=1
|
||||
ScaledLogLikelihood=[
|
||||
dim=$labelDim$
|
||||
Kaldicmd="ark:-" # will pipe to the Kaldi decoder latgen-faster-mapped
|
||||
scpFile=$outputSCP$ # the file key of the features
|
||||
]
|
||||
]
|
||||
|
||||
== Kaldi2 Reader ==
|
||||
This is a Kaldi reader and Kaldi writer (for decoding).
|
||||
|
||||
To build, set --with-kaldi in your Config.make
|
||||
|
||||
The features section is different:
|
||||
|
||||
features=[
|
||||
dim=
|
||||
rx=
|
||||
scpFile=
|
||||
featureTransform=
|
||||
]
|
||||
|
||||
rx is a text file which contains:
|
||||
|
||||
one Kaldi feature rxspecifier readable by RandomAccessBaseFloatMatrixReader.
|
||||
'ark:' specifiers don't work; only 'scp:' specifiers work.
|
||||
|
||||
scpFile is a text file generated by running:
|
||||
|
||||
feat-to-len FEATURE_RXSPECIFIER_FROM_ABOVE ark,t:- > TEXT_FILE_NAME
|
||||
|
||||
scpFile should contain one line per utterance.
|
||||
|
||||
If you want to run with fewer utterances, just shorten this file.
|
||||
(It will load the feature rxspecifier but ignore utterances not present in scpFile).
|
||||
|
||||
featureTransform is the name of a Kaldi feature transform file:
|
||||
|
||||
Kaldi feature transform files are used for stacking / applying transforms to features.
|
||||
|
||||
An empty string (if the config file reader permits one) or the special string NO_FEATURE_TRANSFORM
|
||||
tells the reader to ignore this option.
|
||||
|
||||
********** Labels **********
|
||||
|
||||
The labels section is also different.
|
||||
|
||||
labels=[
|
||||
mlfFile=
|
||||
labelDim=
|
||||
labelMappingFile=
|
||||
]
|
||||
|
||||
The only difference is mlfFile, which now has a different format. It is a text file which contains:
|
||||
|
||||
one Kaldi label rxspecifier readable by Kaldi's copy-post binary.
|
||||
# 5.3 Build the clean version with the command
|
||||
make -j all
|
||||
|
||||
|
||||
######################
|
||||
6. Coding Standard
|
||||
######################
|
||||
|
||||
No TABs. Each TAB should be replaced with 4 spaces in the source code. If you use Visual Studio as your editor, go to Tools|Options|Text Editor|C/C++|Tabs and make sure Indenting is set to Smart, Tab Size and Indent Size are set to 4, and the "Insert Spaces" option is selected.
|
||||
Follow the same naming conventions as shown in the ComputationNetwork.h file.
|
||||
Open/close braces should be on lines by themselves aligned with previous code indent level, e.g.,
|
||||
|
||||
if (true)
|
||||
{
|
||||
Function();
|
||||
}
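
The no-TAB rule can be checked outside the editor as well. The following Python sketch is illustrative only: the helper names find_tab_lines and replace_tabs are hypothetical and are not part of the CNTK source tree.

```python
def find_tab_lines(source_text):
    """Return the 1-based numbers of lines that contain a TAB character."""
    return [number
            for number, line in enumerate(source_text.splitlines(), start=1)
            if "\t" in line]


def replace_tabs(source_text):
    """Replace every TAB with 4 spaces, as the coding standard asks."""
    return source_text.replace("\t", " " * 4)


if __name__ == "__main__":
    # A small C++ fragment with one TAB-indented line, used as sample input.
    sample = "if (true)\n{\n\tFunction();\n}\n"
    print(find_tab_lines(sample))
    print(replace_tabs(sample), end="")
```

A literal one-for-one replacement matches the wording of the rule; it does not align TABs that appear in the middle of a line to tab stops.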
|
||||
|
||||
If you are using Visual Studio 2013 as your main development environment, you can load the CppCntk.vssettings file (in the CNTK home directory), which contains settings for the C++ editor with defaults such as using spaces for tabs, curly brace positioning, and other preferences that meet the CNTK style guidelines. Note that this file will not change any other settings, such as your window layout, but it is still a good idea to back up your current settings just in case.
|
||||
|
||||
To import or export the settings, use the Tools -> Import and Export Settings... menu option in Visual Studio.
|
||||
|
||||
Once the settings are loaded, you can use Edit -> Advanced -> Format Document (shortcut Ctrl+E,D) or, as recommended, Edit -> Advanced -> Format Selection (shortcut Ctrl+E,F) to format only the selected fragment.
|
||||
|
|