Taku Kudo 2018-05-01 19:14:31 +09:00 committed by GitHub
Parent c18b5949cc
Commit c0d9a5d263
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 1 addition and 1 deletion

@@ -89,7 +89,7 @@ Subword regularization [[Kudo.](https://arxiv.org/abs/1804.10959)] is a simple r
that virtually augments training data with on-the-fly subword sampling, which helps to improve the accuracy as well as robustness of NMT models.
To enable subword regularization, you would like to use the SentencePiece library
-([C++](doc/api.md)/[Python](python/README.md)) to sample one segmentation for each parameter updates, which is different from the standard off-line data preparations. Here's the example of [Python library](python/README.md). You can find that 'New York' is segmented differently on each ``SampleEncode`` call. The details of sampling parameters are found in [sentencepiece_processor.h](src/sentencepiece_processor.h).
+([C++](doc/api.md#sampling-subword-regularization)/[Python](python/README.md)) to sample one segmentation for each parameter updates, which is different from the standard off-line data preparations. Here's the example of [Python library](python/README.md). You can find that 'New York' is segmented differently on each ``SampleEncode`` call. The details of sampling parameters are found in [sentencepiece_processor.h](src/sentencepiece_processor.h).
```
>>> import sentencepiece as spm