* Add {decoder_,}head_mask to LED
* Fix create_custom_forward signature in encoder
* Add head_mask to longformer
* Add head_mask to longformer to fix the dependency of LED on Longformer.
* Not working yet
* Add one missing input in longformer_modeling.py
* make fix-copies
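A usage sketch of the masks these commits add (illustrative, not taken from the PR; the checkpoint name and mask values are only examples):

```python
# Illustrative sketch: after this change, LED forward passes accept per-layer,
# per-head masks for the encoder (head_mask) and the decoder (decoder_head_mask).
# A value of 1.0 keeps a head, 0.0 masks it for this forward pass.
import torch
from transformers import LEDForConditionalGeneration, LEDTokenizer

model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
inputs = tokenizer("Summarize this document.", return_tensors="pt")

cfg = model.config
head_mask = torch.ones(cfg.encoder_layers, cfg.encoder_attention_heads)
decoder_head_mask = torch.ones(cfg.decoder_layers, cfg.decoder_attention_heads)
head_mask[0, 0] = 0.0  # e.g. silence head 0 in the first encoder layer

outputs = model(
    **inputs,
    decoder_input_ids=inputs["input_ids"],
    head_mask=head_mask,
    decoder_head_mask=decoder_head_mask,
)
```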
* change tokenizer requirement
* split line
* Correct typo from list to str
* improve style
* make the other function pretty as well
* add comment
* correct typo
* add new test
* pass tests for tokenizers without a padding token
* Apply suggestions from code review
* Change documentation to correctly specify loss tensor size
* Change documentation to correct input format for labels
* Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering
This affects Adafactor with relative_step=False and scale_parameter=True.
Updating group["lr"] makes the result of ._get_lr() depend on the previous call,
i.e., on the scale of other parameters. This isn't supposed to happen.
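A minimal sketch of the failure mode, using a simplified stand-in for Adafactor's `_get_lr()` (the real method also handles relative_step and eps; names and values here are illustrative):

```python
# Simplified stand-in for Adafactor._get_lr() with scale_parameter=True and
# relative_step=False: the base lr is scaled per parameter by its RMS.
def _get_lr(group, state):
    param_scale = max(group["eps"], state["rms"])
    return group["lr"] * param_scale

# Buggy pattern: writing the scaled value back into the shared param group
# makes the next parameter's lr depend on the previous parameter's scale.
group = {"lr": 1e-3, "eps": 1e-3}
for state in ({"rms": 10.0}, {"rms": 0.1}):
    group["lr"] = _get_lr(group, state)  # second call sees lr=1e-2, not 1e-3

# Fixed pattern: keep the scaled lr in a local variable so the group is never mutated.
group = {"lr": 1e-3, "eps": 1e-3}
for state in ({"rms": 10.0}, {"rms": 0.1}):
    lr = _get_lr(group, state)           # each call sees the original base lr
```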
* MOD: fit Chinese wwm to new datasets
* MOD: move wwm to new folder
* MOD: format code
* Styling
* MOD: add param and recover trainer
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
* [t5 doc] typos
a few runaway backticks
@sgugger
* style
* [trainer] put fp16 args together
This PR proposes a purely cosmetic change that puts all the fp16 args together, so they are easier to manage/read.
@sgugger
* style
* [wandb] make WANDB_DISABLED disable wandb with any value
This PR solves part of https://github.com/huggingface/transformers/issues/9623
It implements what https://github.com/huggingface/transformers/issues/9699 requested/discussed: any value of `WANDB_DISABLED` should disable wandb.
The current behavior is that the value has to be one of `ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}`.
I have been using `WANDB_DISABLED=true` everywhere in scripts, as it was originally advertised. I have no idea why this was changed to a subset of possible values, and it's not documented anywhere.
@sgugger
* WANDB_DISABLED=true to disable; make tf trainer consistent
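A minimal sketch of the requested check (illustrative only; the helper name is hypothetical and this is not necessarily the exact code that was merged):

```python
import os

# Illustrative sketch of the requested behavior: any non-empty value of
# WANDB_DISABLED (e.g. "true", "1", "yes") turns the wandb integration off,
# rather than only the values in ENV_VARS_TRUE_VALUES = {"1", "ON", "YES"}.
def is_wandb_disabled() -> bool:
    return bool(os.getenv("WANDB_DISABLED", "").strip())
```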
* style
* Add {decoder_,}head_mask to fsmt_modeling.py
* Enable test_headmasking and make some changes to docs
* Remove test_head_masking flag from fsmt test file
Remove test_head_masking flag from test_modeling_fsmt.py,
since test_head_masking is True by default (so storing it is redundant).
* Merge master and remove test_head_masking = True
* Rebase necessary due to an update of jaxlib
* Remove test_head_masking=True in tests/test_modeling_fsmt.py
as it is redundant.
* TFBart labels consider both pad token and -100
* make style
* fix for all other models
Co-authored-by: kykim <kykim>
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
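An illustrative sketch of the masking idea (simplified; the function and argument names are not the library's):

```python
import tensorflow as tf

def masked_lm_loss(labels, logits, pad_token_id):
    """Illustrative only: ignore positions whose label is -100 OR the pad token id."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    # Keep only positions that are neither -100 nor padding.
    active = tf.logical_and(
        tf.not_equal(labels, -100), tf.not_equal(labels, pad_token_id)
    )
    return loss_fn(tf.boolean_mask(labels, active), tf.boolean_mask(logits, active))
```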
* Adding a new `return_full_text` parameter to TextGenerationPipeline.
For text-generation, the input text is sometimes used only as a prompt.
In that context, prefixing `generated_text` with the actual input
forces the caller to take an extra step to remove it.
The proposed change adds a new parameter, `return_full_text` (defaulting to
the current behavior for backward compatibility), that lets the caller
prevent the prefix from being added.
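A usage sketch of the new parameter (the model choice and generation settings are arbitrary):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Default behavior: "generated_text" contains the prompt plus the continuation.
full = generator("Once upon a time", max_length=30)

# With return_full_text=False only the newly generated continuation is returned,
# so the caller no longer has to strip the prompt manually.
continuation = generator("Once upon a time", max_length=30, return_full_text=False)
```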
* Doc quality.
* expand install instructions
* fix
* white space
* rewrite as discussed in the PR
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* change the wording to encourage issue reports
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove redundant test_head_masking = True flags
* Remove all redundant test_head_masking flags in PyTorch test_modeling_* files
* Make test_head_masking = True the default in test_modeling_tf_common.py
* Remove all redundant test_head_masking flags in TensorFlow
test_modeling_tf_* files
* Put back test_head_masking=False for TFT5 models
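A simplified sketch of the resulting pattern (class names are illustrative, not the actual test files):

```python
# Simplified sketch: the shared tester mixin now defaults test_head_masking to
# True, so model-specific test classes only set the flag when they opt out.
class ModelTesterMixin:
    test_head_masking = True  # default after this change

    def test_headmasking(self):
        if not self.test_head_masking:
            return
        ...  # shared head-masking checks run here


class TFT5ModelTest(ModelTesterMixin):
    test_head_masking = False  # explicit opt-out is still meaningful


class FSMTModelTest(ModelTesterMixin):
    pass  # no need to restate test_head_masking = True
```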
* Fix computation of attention_probs when head_mask is provided.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
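A simplified sketch of where the head mask enters the computation (not the actual template code):

```python
import torch

def attend(scores, value, head_mask=None):
    # scores: (batch, num_heads, query_len, key_len)
    # value:  (batch, num_heads, key_len, head_dim)
    attention_probs = torch.softmax(scores, dim=-1)
    if head_mask is not None:
        # head_mask: (num_heads,), broadcast over batch and sequence dimensions.
        attention_probs = attention_probs * head_mask.view(1, -1, 1, 1)
    return torch.matmul(attention_probs, value)
```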
* Apply changes to the template
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>