Distributed Multisense Word Embedding

The Distributed Multisense Word Embedding (DMWE) tool is a parallelization of the Skip-Gram Mixture [1] algorithm on top of the DMTK parameter server. It provides an efficient solution for scaling multisense word embedding to industry-size datasets.
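To give a rough sense of the underlying model (a hedged sketch in our own notation; see [1] for the precise formulation): each word keeps several sense vectors instead of a single one, and the usual skip-gram conditional probability becomes a mixture over the senses of the center word, learned with an EM-style procedure:

P(w_{t+j} | w_t) = \sum_{k=1}^{N_{w_t}} P(k | w_t) P(w_{t+j} | v_{w_t,k})

Here N_{w_t} is the number of senses of the center word w_t, v_{w_t,k} is its k-th sense vector, and P(k | w_t) is the probability of picking sense k.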

For more details, please visit our website: http://www.dmtk.io

Download

$ git clone https://github.com/Microsoft/distributed_skipgram_mixture

Build

Prerequisite

DMWE is built on top of the DMTK parameter server, so please download and build DMTK Multiverso first (https://github.com/Microsoft/multiverso).
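A typical sequence looks like the following (hedged: the exact build steps depend on your multiverso version, so follow its README if they differ):

$ git clone https://github.com/Microsoft/multiverso
$ cd multiverso
$ make all -j4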

For Windows

Open windows\distributed_skipgram_mixture\distributed_skipgram_mixture.sln with Visual Studio 2013, add the necessary include and lib paths (for example, the paths to DMTK multiverso), and then build the solution.

For Ubuntu (Tested on Ubuntu 12.04)

First run sh build.sh to download and build the dependencies. Then modify the include and lib paths in the Makefile so that they point to your multiverso build, and run make all -j4.
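Put together, the sequence is roughly the following (the Makefile edit is described as a comment; the paths you set depend on where you built multiverso):

$ sh build.sh
$ # edit Makefile: point the include and lib paths at your multiverso build
$ make all -j4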

Hyperparameter Settings

See ./scripts/parameters_settings.txt and ./scripts/run.py.
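As a hedged sketch of the workflow (the parameter kinds named in the comment are illustrative; the authoritative parameter names are the ones listed in parameters_settings.txt):

$ # edit ./scripts/parameters_settings.txt (e.g. embedding size, context window, senses per word),
$ # then launch training through the driver script
$ python scripts/run.py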

Reference

[1] Tian, F., Dai, H., Bian, J., Gao, B., Zhang, R., Chen, E., & Liu, T. Y. (2014). A probabilistic model for learning multi-prototype word embeddings. In Proceedings of COLING (pp. 151-160).