Suraj Patil
f6e74a63ca
Add m2m100 ( #10236 )
...
* m2m_100
* no layernorm_embedding
* sinusoidal positional embeddings
* update pos embeddings
* add default config values
* tokenizer
* add conversion script
* fix config
* fix pos embed
* remove _float_tensor
* update tokenizer
* update lang codes
* handle lang codes
* fix pos embeds
* fix spm key
* put embedding weights on device
* remove qa and seq classification heads
* fix convert script
* lang codes on one line
* fix embeds
* fix tokenizer
* fix tokenizer
* add fast tokenizer
* style
* M2M100MT => M2M100
* fix copyright, style
* tokenizer converter
* vocab file
* remove fast tokenizer
* fix embeds
* fix tokenizer
* fix tests
* add tokenizer tests
* add integration test
* quality
* fix model name
* fix test
* doc
* doc
* fix doc
* add copied from statements
* fix tokenizer tests
* apply review suggestions
* fix urls
* fix shift_tokens_right
* apply review suggestions
* fix
* fix doc
* add lang code to id
* remove unused function
* update checkpoint names
* fix copy
* fix tokenizer
* fix checkpoint names
* fix merge issue
* style
2021-03-06 22:14:16 +05:30
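For reference, a minimal usage sketch of the M2M100 model added above, handling the language codes the sub-commits mention. The checkpoint name and language pair follow the model documentation of the time and are illustrative, not part of the commit itself.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

# Source language is set on the tokenizer; the target language is forced via
# its lang-code token id at generation time.
tokenizer.src_lang = "en"
encoded = tokenizer("Hello world", return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("fr"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```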
Lysandre
fd01104435
Temporarily disable stale bot
2021-03-06 00:21:50 -05:00
Stas Bekman
88a951e3cc
offline mode for firewalled envs ( #10407 )
...
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* more strict check
* cleaner test
* pt-only test
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-03-05 17:27:48 -08:00
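A hedged sketch of the offline mode documented in the commit above, assuming the `TRANSFORMERS_OFFLINE` environment variable and a model that was cached while online; the checkpoint name is illustrative only.

```python
import os

# Assumption: setting the flag before importing transformers makes all
# from_pretrained calls rely on the local cache instead of the network.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel, AutoTokenizer

# These calls should succeed without network access if the files were cached
# earlier, and fail fast in a firewalled environment otherwise.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```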
Daniel Hug
90ecc29656
Refactoring checkpoint names for multiple models ( #10527 )
...
* Refactor checkpoint name in ALBERT and ALBERT_tf
* Refactor checkpoint name in BART and BART_tf
* Refactor checkpoint name in BERT generation
* Refactor checkpoint name in Blenderbot_tf
* Refactor checkpoint name in Blenderbot_small_tf
* Refactor checkpoint name in ConvBERT and ConvBERT_tf
* Refactor checkpoint name in CTRL and CTRL_tf
* Refactor checkpoint name in DistilBERT and DistilBERT_tf
* Refactor checkpoint name in DistilBERT redo
* Refactor checkpoint name in Electra and Electra_tf
* Refactor checkpoint name in FlauBERT and FlauBERT_tf
* Refactor checkpoint name in FSMT
* Refactor checkpoint name in GPT2 and GPT2_tf
* Refactor checkpoint name in IBERT
* Refactor checkpoint name in LED and LED_tf
* Refactor checkpoint name in Longformer and Longformer_tf
* Refactor checkpoint name in Lxmert and Lxmert_tf
* Refactor checkpoint name in Marian_tf
* Refactor checkpoint name in MBART and MBART_tf
* Refactor checkpoint name in MobileBERT and MobileBERT_tf
* Refactor checkpoint name in mpnet and mpnet_tf
* Refactor checkpoint name in openai and openai_tf
* Refactor checkpoint name in pegasus_tf
* Refactor checkpoint name in reformer
* Refactor checkpoint name in Roberta and Roberta_tf
* Refactor checkpoint name in SqueezeBert
* Refactor checkpoint name in Transformer_xl and Transformer_xl_tf
* Refactor checkpoint name in XLM and XLM_tf
* Refactor checkpoint name in XLNET and XLNET_tf
* Refactor checkpoint name in BERT_tf
* run make tests, style, quality, fixup
2021-03-05 18:06:55 -05:00
Lysandre Debut
defe9e20fe
Stale Bot ( #10509 )
...
* Add stale bot to Github Actions
* Update message
* Message for assignee
* Update scripts/stale.py
* Uncomment & stop testing
2021-03-05 16:41:50 -05:00
Sylvain Gugger
7da995c00c
Fix embeddings for PyTorch 1.8 ( #10549 )
...
* Fix embeddings for PyTorch 1.8
* Try with PyTorch 1.8.0
* Fix embeddings init
* Fix copies
* Typo
* More typos
2021-03-05 16:18:48 -05:00
Chen Liang
3e056c1003
Typo correction. ( #10531 )
...
DEBERTA_PRETRAINED_MODEL_ARCHIVE_LIST => DEBERTA_V2_PRETRAINED_MODEL_ARCHIVE_LIST on line 31.
2021-03-05 15:27:09 -05:00
Joakim Warholm
9f8bc87cbe
fixed dead link in trainer doc ( #10554 )
2021-03-05 14:56:37 -05:00
Lysandre Debut
6b58e15507
Fix torch 1.8.0 segmentation fault ( #10546 )
...
* Only run one test
* Patch segfault
* Fix summarization pipeline
* Ready for merge
2021-03-05 12:10:19 -05:00
Patrick von Platen
395ffcd757
fix run seq2seq ( #10547 )
2021-03-05 18:17:12 +03:00
Nicolas Patry
54e55b52d4
Fixing conversation test for torch 1.8 ( #10545 )
2021-03-05 09:24:14 -05:00
Lysandre
dc9aaa3848
Pin torch to 1.7.1 in tests while we resolve issues
2021-03-05 07:57:35 -05:00
lewtun
12b66215cf
Fix example of custom Trainer to reflect signature of compute_loss ( #10537 )
2021-03-05 07:44:53 -05:00
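For context on the commit above, a hedged sketch of a custom Trainer whose `compute_loss` override matches the documented signature of the time (including `return_outputs`); the loss computation itself is illustrative.

```python
import torch
from transformers import Trainer

class MyTrainer(Trainer):
    # return_outputs lets the Trainer reuse the forward pass during evaluation.
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        # Illustrative loss; replace with whatever the task requires.
        loss_fct = torch.nn.CrossEntropyLoss()
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```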
Lysandre
093b88f4e9
Update scatter to use torch 1.8.0
2021-03-05 07:31:51 -05:00
Patrick von Platen
c503a1c15e
[ProphetNet] Bart-like Refactor ( #10501 )
...
* first step to refactor
* make all fast tests pass
* make all slow tests pass
* save intermediate
* correct cache
* finish PR
* make fp16 work
2021-03-04 23:27:12 +03:00
Sylvain Gugger
6290169eb3
Rework TPU checkpointing in Trainer ( #10504 )
...
* Rework TPU checkpointing in Trainer
* Wraps the barrier in a dist test
* Address review comments
* Remove line
2021-03-04 11:46:11 -05:00
Philipp Schmid
805c5200dc
Removes overwrites for output_dir ( #10521 )
...
* removed overwrites
* remove default value for output_dir
* adjusted typing
2021-03-04 17:12:37 +01:00
Sylvain Gugger
a5bd40b75c
Not always consider a local model a checkpoint in run_glue ( #10517 )
2021-03-04 11:11:39 -05:00
Sylvain Gugger
745ea78dcc
Revert "Not always consider a local model a checkpoint in run_glue"
...
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger
f3660613bc
Not always consider a local model a checkpoint in run_glue
2021-03-04 09:44:02 -05:00
Sylvain Gugger
948b730f97
Remove unsupported methods from ModelOutput doc ( #10505 )
2021-03-03 14:55:18 -05:00
Sylvain Gugger
b70f441b72
Smp grad accum ( #10488 )
...
* Fix gradient accumulation for SM Model Parallelism
* Style and divide loss by grad accum steps
2021-03-03 12:13:29 -05:00
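The second bullet above refers to the standard gradient-accumulation convention. A generic, self-contained PyTorch sketch (not the SageMaker-specific code from the PR) of why the loss is divided by the number of accumulation steps:

```python
import torch

# Tiny illustrative setup; any model, optimizer, and dataloader work the same way.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(8)]
loss_fct = torch.nn.CrossEntropyLoss()

accumulation_steps = 4
optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    loss = loss_fct(model(x), y)
    # Dividing by the accumulation count keeps the accumulated gradient equal
    # to the average over micro-batches rather than their sum.
    (loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```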
felixgwu
d064fb5647
Fix the bug in constructing the all_hidden_states of DeBERTa v2 ( #10466 )
...
* fix all_hidden_states
* use output_states instead of next_kv
2021-03-03 12:05:21 -05:00
Stas Bekman
188574ac50
remap MODEL_FOR_QUESTION_ANSWERING_MAPPING classes to names in an auto-generated file ( #10487 )
...
* remap classes to strings
* missing new util
* style
* doc
* move the autogenerated file
* Trigger CI
2021-03-03 08:54:00 -08:00
Sylvain Gugger
801ff969ce
Refactor checkpoint name in BERT and MobileBERT ( #10424 )
...
* Refactor checkpoint name in BERT and MobileBERT
* Add option to check copies
* Add QuestionAnswering
* Add last models
* Make black happy
2021-03-03 11:21:17 -05:00
Jeff Yang
39f70a4058
feat(docs): navigate with left/right arrow keys ( #10481 )
...
* feat(docs): navigate with left/right arrow keys
* fix: add missing comma
2021-03-03 11:17:12 -05:00
Patrick von Platen
2d2ed2cc18
[T5] Fix speed degradation bug ( #10496 )
...
* fix speed degradation bug t5
* fix for all models
* fix code quality
2021-03-03 12:42:41 +03:00
WybeKoper
5dc303e281
Fixed minor spelling mistakes ( #10489 )
...
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-03 14:17:25 +05:30
Mehrad Moradshahi
1750e62900
Generate can return cross-attention weights too ( #10493 )
2021-03-03 13:57:02 +05:30
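A hedged sketch of using the cross-attention output enabled by the commit above, assuming an encoder-decoder model and the generate output dataclasses of that era; the checkpoint is illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tok("translate English to German: Hello", return_tensors="pt")
out = model.generate(
    **inputs,
    output_attentions=True,
    return_dict_in_generate=True,
)
# One tuple of per-layer cross-attention tensors per generated step
# (assumption based on the generate output objects of the time).
print(len(out.cross_attentions))
```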
Martin Schmitt
b013842244
Changed `num_beams` to `num_beams // num_beam_groups` when initialising `PrefixConstrainedLogitsProcessor` in `_get_logits_processor` to fix a compatibility issue when constrained decoding is used together with grouped beam search ( #10475 )
2021-03-02 10:41:54 +03:00
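A hedged sketch of the setting the fix above targets: grouped (diverse) beam search combined with prefix-constrained decoding via `prefix_allowed_tokens_fn`. The checkpoint and the constraint function are illustrative, not taken from the PR.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # illustrative checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def allowed_tokens(batch_id, input_ids):
    # Toy constraint: allow every token. A real constraint would return only
    # the ids permitted after the prefix generated so far.
    return list(range(model.config.vocab_size))

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
# Before the fix, the logits processor was built with num_beams rather than
# num_beams // num_beam_groups, breaking this combination.
out = model.generate(
    **inputs,
    num_beams=4,
    num_beam_groups=2,
    diversity_penalty=0.5,
    prefix_allowed_tokens_fn=allowed_tokens,
    num_return_sequences=2,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```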
Lysandre Debut
0c2325198f
Add I-BERT to README ( #10462 )
2021-03-01 12:12:31 -05:00
Lysandre Debut
9248e27037
Remove Anthony from the bug reports in Transformers
2021-03-01 10:23:40 -05:00
Suraj Patil
a106bde5a7
[Wav2Vec2FeatureExtractor] small fixes ( #10455 )
...
* small fixes
* don't check for None
2021-03-01 20:19:52 +05:30
Patrick von Platen
11655fafdd
remove feature extraction config ( #10457 )
2021-03-01 12:30:12 +03:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen
3c733f3208
Update ibert.rst ( #10445 )
2021-02-28 19:03:49 +03:00
Darigov Research
aeba4f95bb
Adds terms to Glossary ( #10443 )
...
* feat: Adds three definitions to glossary from @cronoik
Needed a definition for transformer, which in turn needed two more definitions.
Relates to issue https://github.com/huggingface/transformers/issues/9078
* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg
256482ac92
Introduce save_strategy training argument ( #10286 )
...
* Introduce save_strategy training argument
* deprecate EvaluationStrategy
* collapse EvaluationStrategy and LoggingStrategy into a single IntervalStrategy enum
* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
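A hedged sketch of the arguments unified by the commit above, assuming the `IntervalStrategy` values ("no", "steps", "epoch") it introduces; the concrete numbers are illustrative.

```python
from transformers import TrainingArguments

# After this change, evaluation, logging, and saving all accept the same
# interval-strategy values instead of separate enums.
args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",
    logging_strategy="steps",
    save_strategy="steps",
    eval_steps=500,
    logging_steps=500,
    save_steps=500,
)
```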
Bhadresh Savani
aca6288ff4
updated logging and saving metrics ( #10436 )
...
* updated logging and saving metrics
* space removal
2021-02-27 09:53:44 -08:00
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt ( #10428 )
...
This PR restores the original functionality that for some reason was modified.
Fixes: https://github.com/huggingface/transformers/issues/10381
@sgugger
2021-02-27 08:21:50 -08:00
Lysandre Debut
311b7048c5
Fix conda-build ( #10431 )
2021-02-26 20:20:30 -05:00
Stas Bekman
ee04b69822
[examples] better model example ( #10427 )
...
* refactors
* typo
2021-02-26 17:01:01 -08:00
Amog Kamsetty
a85eb616f7
Ray Tune Integration Bug Fixes ( #10406 )
...
* fixes
* update resources
* formatting
* remove import
* add log statement
* use fstring
* add period
* Update src/transformers/integrations.py
2021-02-26 19:06:08 -05:00
Kai Fricke
98569d4ba2
Add Ray Tune hyperparameter search integration test ( #10414 )
2021-02-26 10:18:33 -05:00
Patrick von Platen
d03695f3a2
[LED] Correct Docs ( #10419 )
...
* correct docs
* correct tf model docs as well
2021-02-26 17:53:28 +03:00
Mansi Mane
7fc686efb1
Sagemaker Model Parallel tensorboard writing fix ( #10403 )
...
* Added tb fix
* Removed local rank condition
* Updated reference to args
2021-02-26 08:04:55 -05:00
Julien Chaumond
83d2d55c94
[ci, flax] non-existing models are unlikely to pass tests ( #10409 )
...
😂
2021-02-26 12:35:36 +03:00
Sylvain Gugger
17b6e0d474
Fix run_glue evaluation when model has a label correspondence ( #10401 )
2021-02-25 15:30:38 -05:00
Sylvain Gugger
26f8b2cb10
Make Barthez tokenizer tests a bit faster ( #10399 )
...
* Make Barthez tokenizer tests a bit faster
* Quality
2021-02-25 11:42:25 -05:00
Andrea Bacciu
b040e6efc1
Fix None in add_token_positions - issue #10210 ( #10374 )
...
* Fix None in add_token_positions - issue #10210
Fix None in add_token_positions related to the issue #10210
* add_token_positions fix None values in end_positions vector
add_token_positions fix None in end_positions vector as proposed by @joeddav
2021-02-25 09:18:33 -07:00