Added link on string #32
Parent: 20e68fa7c4
Commit: b915766b48
@@ -29,7 +29,7 @@ You can use custom normalization rule by preparing a TSV file formatted as follo
 ```
 In this sample, UCS4 sequence [41 302 300] (hex) is converted into [1EA6] (hex). When there are ambiguities in the conversions, the longest rule is used.
 Note that the tab is used as a delimiter for source and target sequence and space is used as a delimiter for UCS4 characters.
-See data/nfkc.tsv as an example. Once a TSV file is prepared, you can specify it with `--normalization_rule_tsv` flag.
+See [data/nfkc.tsv](data/nfkc.tsv) as an example. Once a TSV file is prepared, you can specify it with `--normalization_rule_tsv` flag.
 ```
 % spm_train --normalization_rule_tsv=<rule tsv file> --input=<input> --model_prefix=<model file> --vocab_size=8000
 ```
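
For reviewers unfamiliar with the rule format described in the context lines, here is a minimal Python sketch of how one such TSV line could be read and applied. It is an illustration only, not the project's implementation: `parse_rule` and `normalize` are hypothetical helper names, and the second rule (`41 302` to `C2`) is made up here purely to show the longest-rule preference.

```python
def parse_rule(line):
    """Parse one rule line (hypothetical helper, not part of any library).

    Source and target are separated by a tab; the code points inside each
    side are separated by spaces and written in hexadecimal.
    """
    source_field, target_field = line.rstrip("\n").split("\t")
    source = "".join(chr(int(cp, 16)) for cp in source_field.split())
    target = "".join(chr(int(cp, 16)) for cp in target_field.split())
    return source, target


def normalize(text, rules):
    """Apply rules left to right, preferring the longest matching source."""
    max_len = max((len(src) for src in rules), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so longer rules win over shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # No rule starts at this position: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# The first rule is the sample from the README text; the second is an
# assumed extra rule added only to create an ambiguity.
rules = dict([
    parse_rule("41 302 300\t1EA6"),
    parse_rule("41 302\tC2"),
])
```

With these two rules, `normalize("A\u0302\u0300", rules)` picks the three-code-point rule and yields U+1EA6, while `normalize("A\u0302x", rules)` falls back to the shorter rule and yields U+00C2 followed by `x`.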