Add note to resize token embeddings matrix when adding new tokens to voc (#10331)
Parent
19e737b93e
Commit
e73a3e1891
@@ -971,6 +971,12 @@ class SpecialTokensMixin:
        Add a list of new tokens to the tokenizer class. If the new tokens are not in the vocabulary, they are added to
        it with indices starting from length of the current vocabulary.

        .. Note::
            When adding new tokens to the vocabulary, you should make sure to also resize the token embedding matrix of
            the model so that its embedding matrix matches the tokenizer.

            In order to do that, please use the :meth:`~transformers.PreTrainedModel.resize_token_embeddings` method.

        Args:
            new_tokens (:obj:`str`, :obj:`tokenizers.AddedToken` or a list of `str` or :obj:`tokenizers.AddedToken`):
                Tokens are only added if they are not already in the vocabulary. :obj:`tokenizers.AddedToken` wraps a
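To illustrate why the note in this commit matters, here is a minimal, library-free sketch of the mismatch it warns about. The helpers `add_tokens` and `resize_embeddings` below are hypothetical toy stand-ins written for this example, not the library's implementation. With the library itself, the pattern the note describes is typically `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


def add_tokens(vocab, new_tokens):
    """Toy stand-in for a tokenizer: add unseen tokens, return how many were added."""
    added = 0
    for token in new_tokens:
        if token not in vocab:
            # New indices start from the current vocabulary length.
            vocab[token] = len(vocab)
            added += 1
    return added


def resize_embeddings(matrix, new_size, dim, rng):
    """Toy stand-in for resizing: keep existing rows, append freshly initialised ones."""
    resized = [row[:] for row in matrix[:new_size]]
    while len(resized) < new_size:
        resized.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return resized


rng = random.Random(0)
dim = 4
vocab = {"[PAD]": 0, "hello": 1, "world": 2}
embeddings = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in vocab]

# "hello" is already present, so only the two new tokens are added.
num_added = add_tokens(vocab, ["hello", "<new_tok1>", "<new_tok2>"])

# Ids 3 and 4 now exist in the vocabulary but have no row in the 3-row matrix.
needs_resize = len(vocab) > len(embeddings)

old_rows = [row[:] for row in embeddings]
embeddings = resize_embeddings(embeddings, len(vocab), dim, rng)

print(num_added, needs_resize, len(embeddings))
```

In this sketch the rows for existing tokens are left untouched and only the new rows are initialised, so the new tokens would still need training before their embeddings carry meaning.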