Lysandre
10d72390c0
Revert #4446 since it introduces a new dependency
2020-05-22 10:49:45 -04:00
Lysandre
e0db6bbd65
Release: v2.10.0
2020-05-22 10:37:44 -04:00
Frankie Liuzzi
bd6e301832
added functionality for electra classification head ( #4257 )
...
* added functionality for electra classification head
* unneeded dropout
* Test ELECTRA for sequence classification
* Style
Co-authored-by: Frankie <frankie@frase.io>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-05-22 09:48:21 -04:00
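A minimal usage sketch for the ELECTRA sequence-classification head added in the commit above; the checkpoint name and label count are placeholders, and the tuple-style return indexing follows the v2.x API.
```python
import torch
from transformers import ElectraTokenizer, ElectraForSequenceClassification

# Placeholder checkpoint; any ELECTRA checkpoint works the same way.
tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraForSequenceClassification.from_pretrained(
    "google/electra-small-discriminator", num_labels=2
)

inputs = tokenizer.encode_plus("This movie was great!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs)[0]  # (batch_size, num_labels) in the tuple-return API
print(logits.argmax(dim=-1))
```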
Lysandre
a086527727
Unused Union should not be imported
2020-05-21 09:42:47 -04:00
Lysandre Debut
9d2ce253de
TPU hangs when saving optimizer/scheduler ( #4467 )
...
* TPU hangs when saving optimizer/scheduler
* Style
* ParallelLoader is not a DataLoader
* Style
* Addressing @julien-c's comments
2020-05-21 09:18:27 -04:00
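A hedged sketch of the torch_xla saving pattern the commit above is about: every process reaches the rendezvous and `xm.save` handles writing from the master ordinal, so no process is left waiting. This illustrates the general pattern, not the exact Trainer code.
```python
import torch_xla.core.xla_model as xm

def save_checkpoint(optimizer, scheduler, output_dir):
    # All processes must hit the rendezvous; if some skip the saving branch,
    # the others wait forever and the TPU job hangs.
    xm.rendezvous("saving_optimizer_states")
    # xm.save moves tensors to CPU and writes only from the master ordinal.
    xm.save(optimizer.state_dict(), f"{output_dir}/optimizer.pt")
    xm.save(scheduler.state_dict(), f"{output_dir}/scheduler.pt")
```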
Zhangyx
49296533ca
Adds predict stage for GLUE tasks, and generates result files which can be submitted to gluebenchmark.com ( #4463 )
...
* Adds a predict stage for GLUE tasks and generates result files that can be submitted to the gluebenchmark.com website.
* Use Split enum + always output the label name
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-21 09:17:44 -04:00
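A hedged sketch of what the added predict stage does: run the trained model over the test split and write a TSV of predicted label names in the format gluebenchmark.com expects. Variable names (`trainer`, `test_dataset`, `label_list`, `task_name`, `output_dir`) are assumed as in the run_glue example, not quoted from the script.
```python
import os
import numpy as np

predictions = trainer.predict(test_dataset=test_dataset).predictions
predicted_ids = np.argmax(predictions, axis=1)

output_file = os.path.join(output_dir, f"test_results_{task_name}.txt")
with open(output_file, "w") as writer:
    writer.write("index\tprediction\n")
    for index, pred in enumerate(predicted_ids):
        # Always output the human-readable label name, as in the commit body.
        writer.write(f"{index}\t{label_list[pred]}\n")
```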
Tobias Lee
271bedb485
[examples] fix no grad in second pruning in run_bertology ( #4479 )
...
* fix no grad in second pruning and typo
* fix prune heads attention mismatch problem
* fix
* fix
* fix
* run make style
* run make style
2020-05-21 09:17:03 -04:00
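A hedged illustration of the kind of fix described above: wrap the second head-importance pass in `torch.no_grad()` so no autograd graph is built. `compute_heads_importance` and its arguments are placeholders standing in for the run_bertology helpers.
```python
import torch

# Second pruning evaluation: gradients are not needed here, so disable them.
with torch.no_grad():
    head_importance, preds, labels = compute_heads_importance(
        args, model, eval_dataloader, compute_entropy=False
    )
```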
Julien Chaumond
865d4d595e
[ci] Close #4481
2020-05-20 18:27:42 -04:00
Julien Chaumond
a3af8e86cb
Update test_trainer_distributed.py
2020-05-20 18:26:51 -04:00
Cola
eacea530c1
🚨 Remove warning of deprecation ( #4477 )
...
Remove warning of deprecated overload of addcdiv_
Fix #4451
2020-05-20 16:48:29 -04:00
Julien Plu
fa2fbed3e5
Better None gradients handling in TF Trainer ( #4469 )
...
* Better None gradients handling
* Apply Style
* Apply Style
2020-05-20 16:46:21 -04:00
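A hedged sketch of one common way to handle None gradients in a TF training step (replace them with zeros so `apply_gradients` still works for variables that did not contribute to the loss); not necessarily the exact logic adopted in the TF Trainer.
```python
import tensorflow as tf

def training_step(model, optimizer, features, labels, loss_fn):
    with tf.GradientTape() as tape:
        logits = model(features, training=True)
        loss = loss_fn(labels, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    # Variables that do not contribute to the loss get a None gradient;
    # substitute zeros so the optimizer update does not fail.
    grads = [
        g if g is not None else tf.zeros_like(v)
        for g, v in zip(grads, model.trainable_variables)
    ]
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```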
Oliver Åstrand
e708bb75bf
Correct TF formatting to exclude LayerNorms from weight decay ( #4448 )
...
* Exclude LayerNorms from weight decay
* Include both formats of layer norm
2020-05-20 16:45:59 -04:00
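A hedged sketch of the exclusion logic described above: match both naming formats of layer norm variables when deciding which weights get weight decay. Illustrative only, not the exact optimizer code.
```python
import re

# Keras/TF variable names may contain either "LayerNorm" or "layer_norm".
EXCLUDE_FROM_WEIGHT_DECAY = ["LayerNorm", "layer_norm", "bias"]

def use_weight_decay(variable_name: str) -> bool:
    # Apply weight decay only to variables that match none of the patterns.
    return not any(re.search(p, variable_name) for p in EXCLUDE_FROM_WEIGHT_DECAY)
```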
Rens
49c06132df
pass on tokenizer to pipeline ( #4489 )
2020-05-20 22:23:21 +02:00
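A minimal sketch of what the fix above enables: the tokenizer passed to `pipeline(...)` is actually forwarded to the underlying pipeline rather than being ignored. The checkpoint name is a placeholder.
```python
from transformers import AutoTokenizer, pipeline

# Placeholder checkpoint; any compatible model/tokenizer pair works the same way.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    tokenizer=tokenizer,  # explicitly passed tokenizer is now honored
)
print(classifier("Transformers pipelines are easy to use."))
```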
Nathan Cooper
cacb654c7f
Add Fine-tune DialoGPT on new datasets notebook ( #4473 )
2020-05-20 16:17:52 -04:00
Timo Moeller
30a09f3827
Adjust German BERT model card, add new model card ( #4488 )
2020-05-20 16:08:29 -04:00
Lysandre Debut
14cb5b35fa
Fix slow gpu tests lysandre ( #4487 )
...
* There is one missing key in BERT
* Correct device for CamemBERT model
* RoBERTa tokenization adding prefix space
* Style
2020-05-20 11:59:45 -04:00
Manuel Romero
6dc52c78d8
Create README.md ( #4482 )
2020-05-20 09:45:50 -04:00
Manuel Romero
ed5456daf4
Model card for RuPERTa-base fine-tuned for NER ( #4466 )
2020-05-20 09:45:24 -04:00
Oleksandr Bushkovskyi
c76450e20c
Model card for Tereveni-AI/gpt2-124M-uk-fiction ( #4470 )
...
Create model card for "Tereveni-AI/gpt2-124M-uk-fiction" model
2020-05-20 09:44:26 -04:00
Hu Xu
9907dc523a
add BERT trained from review corpus. ( #4405 )
...
* add model_cards for BERT trained on reviews.
* add link to repository.
* refine README.md for each review model
2020-05-20 09:42:35 -04:00
Sam Shleifer
efbc1c5a9d
[MarianTokenizer] implement save_vocabulary and other common methods ( #4389 )
2020-05-19 19:45:49 -04:00
Sam Shleifer
956c4c4eb4
[gpu slow tests] fix mbart-large-enro gpu tests ( #4472 )
2020-05-19 19:45:31 -04:00
Patrick von Platen
48c3a70b4e
[Longformer] Docs and clean API ( #4464 )
...
* add longformer docs
* improve docs
2020-05-19 21:52:36 +02:00
Patrick von Platen
aa925a52fa
[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch ( #4468 )
...
* fix gpu slow tests in pytorch
* change model to device syntax
2020-05-19 21:35:04 +02:00
Suraj Patil
5856999a9f
add T5 fine-tuning notebook [Community notebooks] ( #4462 )
...
* add T5 fine-tuning notebook [Community notebooks]
* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-19 18:26:28 +02:00
Sam Shleifer
07dd7c2fd8
[cleanup] test_tokenization_common.py ( #4390 )
2020-05-19 10:46:55 -04:00
Iz Beltagy
8f1d047148
Longformer ( #4352 )
...
* first commit
* bug fixes
* better examples
* undo padding
* remove wrong VOCAB_FILES_NAMES
* License
* make style
* make isort happy
* unit tests
* integration test
* make `black` happy by undoing `isort` changes!!
* lint
* no need for the padding value
* batch_size not bsz
* remove unused type casting
* seqlen not seq_len
* staticmethod
* `bert` selfattention instead of `n2`
* uint8 instead of bool + lints
* pad inputs_embeds using embeddings not a constant
* black
* unit test with padding
* fix unit tests
* remove redundant unit test
* upload model weights
* resolve todo
* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
* increase unittest coverage
2020-05-19 16:04:43 +02:00
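A hedged usage sketch for the Longformer model introduced above, in the v2.x tuple-return style. The attention-mask convention shown (0 = padding, 1 = local attention, 2 = global attention) follows the initial implementation's docs and should be treated as illustrative.
```python
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

text = "Long documents are handled with windowed self-attention."
input_ids = tokenizer.encode(text, return_tensors="pt")

# 1 = local (sliding-window) attention everywhere, 2 = global attention
# on the first token only.
attention_mask = torch.ones_like(input_ids)
attention_mask[:, 0] = 2

outputs = model(input_ids, attention_mask=attention_mask)
sequence_output = outputs[0]  # (batch, seq_len, hidden) in the tuple-return API
```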
Girishkumar
31eedff5a0
Refactored the README.md file ( #4427 )
2020-05-19 09:56:24 -04:00
Shaoyen
384f0eb2f9
Map optimizer to correct device after loading from checkpoint. ( #4403 )
...
* Map optimizer to correct device after loading from checkpoint.
* Make style test pass
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 23:16:05 -04:00
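A hedged sketch of the idea behind the fix above: load the optimizer state with a `map_location` so resumed training does not mix CPU and GPU tensors. `optimizer`, `scheduler`, and the checkpoint paths are assumed, as in the Trainer's resume path.
```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Map the saved optimizer state onto the device the model is training on.
optimizer.load_state_dict(
    torch.load("checkpoint/optimizer.pt", map_location=device)
)
scheduler.load_state_dict(torch.load("checkpoint/scheduler.pt"))
```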
Julien Chaumond
bf14ef75f1
[Trainer] move model to device before setting optimizer ( #4450 )
2020-05-18 23:13:33 -04:00
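A hedged sketch of the ordering this change enforces: place the model on its target device before constructing the optimizer, so any state the optimizer builds is created against parameters that already live on the right device. `model` and `device` are assumed to exist.
```python
from transformers import AdamW

model.to(device)                               # model on its target device first ...
optimizer = AdamW(model.parameters(), lr=5e-5)  # ... then hand its parameters to the optimizer
```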
Julien Chaumond
5e7fe8b585
Distributed eval: SequentialDistributedSampler + gather all results ( #4243 )
...
* Distributed eval: SequentialDistributedSampler + gather all results
* For consistency only write to disk from world_master
Close https://github.com/huggingface/transformers/issues/4272
* Working distributed eval
* Hook into scripts
* Fix #3721 again
* TPU.mesh_reduce: stay in tensor space
Thanks @jysohn23
* Just a small comment
* whitespace
* torch.hub: pip install packaging
* Add test scenarios
2020-05-18 22:02:39 -04:00
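A hedged sketch of the gather step described above: each process evaluates a contiguous shard and the per-process result tensors are all-gathered so the world master can compute metrics over the full dataset. This illustrates the torch.distributed pattern rather than quoting the exact Trainer helper.
```python
import torch
import torch.distributed as dist

def distributed_concat(tensor: torch.Tensor, num_total_examples: int) -> torch.Tensor:
    # Gather this shard's predictions from every process ...
    output_tensors = [tensor.clone() for _ in range(dist.get_world_size())]
    dist.all_gather(output_tensors, tensor)
    concatenated = torch.cat(output_tensors, dim=0)
    # ... and drop the samples the sampler added to make shards equal-sized.
    return concatenated[:num_total_examples]
```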
Julien Chaumond
4c06893610
Fix nn.DataParallel compatibility in PyTorch 1.5 ( #4300 )
...
* Test case for #3936
* multigpu tests pass on pytorch 1.4.0
* Fixup
* multigpu tests pass on pytorch 1.5.0
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* rename multigpu to require_multigpu
* mode doc
2020-05-18 20:34:50 -04:00
Rakesh Chada
9de4afa897
Make get_last_lr in trainer backward compatible ( #4446 )
...
* makes fetching the last learning rate in the trainer backward compatible
* split comment to multiple lines
* fixes black styling issue
* uses version to create a more explicit logic
2020-05-18 20:17:36 -04:00
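A hedged sketch of the version-based logic described above: newer torch LR schedulers expose `get_last_lr()`, older ones only `get_lr()`. The version cutoff shown is illustrative.
```python
import torch
from packaging import version

def get_current_learning_rate(scheduler) -> float:
    # get_last_lr() only exists on newer torch schedulers; fall back to
    # get_lr() on older releases (cutoff here is illustrative).
    if version.parse(torch.__version__) >= version.parse("1.4.0"):
        return scheduler.get_last_lr()[0]
    return scheduler.get_lr()[0]
```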
Stefan Dumitrescu
42e8fbfc51
Added model cards for Romanian BERT models ( #4437 )
...
* Create README.md
* Create README.md
* Update README.md
* Update README.md
* Apply suggestions from code review
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 18:48:56 -04:00
Oliver Guhr
54065d68b8
added model card for german-sentiment-bert ( #4435 )
2020-05-18 18:44:41 -04:00
Martin Müller
e28b7e2311
Create README.md ( #4433 )
2020-05-18 18:41:34 -04:00
sy-wada
09b933f19d
Update README.md (model_card) ( #4424 )
...
- add a citation.
- modify the table of the BLUE benchmark.
The table in the first version was not displayed correctly on https://huggingface.co/seiya/oubiobert-base-uncased.
Could you please confirm that this fix displays it correctly?
2020-05-18 18:18:17 -04:00
Manuel Romero
235777ccc9
Modify example of usage ( #4413 )
...
I followed the Google usage example for its ELECTRA small model, but I found it was not meaningful, so I created a better example.
2020-05-18 18:17:33 -04:00
Suraj Patil
9ddd3a6548
add model card for t5-base-squad ( #4409 )
...
* add model card for t5-base-squad
* Update model_cards/valhalla/t5-base-squad/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 18:17:14 -04:00
HUSEIN ZOLKEPLI
c5aa114392
Added README huseinzol05/t5-base-bahasa-cased ( #4377 )
...
* add bert bahasa readme
* update readme
* update readme
* added xlnet
* added tiny-bert and fix xlnet readme
* added albert base
* added albert tiny
* added electra model
* added gpt2 117m bahasa readme
* added gpt2 345m bahasa readme
* added t5-base-bahasa
* fix readme
* Update model_cards/huseinzol05/t5-base-bahasa-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 18:10:23 -04:00
Funtowicz Morgan
ca4a3f4da9
Adding optimizations block from ONNXRuntime. ( #4431 )
...
* Adding optimizations block from ONNXRuntime.
* Turn off external data format by default for PyTorch export.
* Correct the way use_external_format is passed through the cmdline args.
2020-05-18 20:32:33 +02:00
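A hedged sketch of enabling ONNX Runtime graph optimizations when loading an exported model; the file paths are placeholders and this shows the generic onnxruntime API rather than the exact conversion-script code.
```python
import onnxruntime as ort

options = ort.SessionOptions()
# Let ONNX Runtime apply all available graph optimizations to the exported model.
options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
options.optimized_model_filepath = "model-optimized.onnx"  # persist the optimized graph

session = ort.InferenceSession("model.onnx", options)
```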
Patrick von Platen
24538df919
[Community notebooks] General notebooks ( #4441 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
2020-05-18 20:23:57 +02:00
Sam Shleifer
a699525d25
[test_pipelines] Mark tests > 10s @slow, small speedups ( #4421 )
2020-05-18 12:23:21 -04:00
Boris Dayma
d9ece8233d
fix(run_language_modeling): use arg overwrite_cache ( #4407 )
2020-05-18 11:37:35 -04:00
Patrick von Platen
d39bf0ac2d
better naming in tf t5 ( #4401 )
2020-05-18 11:34:00 -04:00
Patrick von Platen
590adb130b
improve docstring ( #4422 )
2020-05-18 11:31:35 -04:00
Patrick von Platen
026a5d0888
[T5 fp16] Fix fp16 in T5 ( #4436 )
...
* fix fp16 in t5
* make style
* refactor invert_attention_mask fn
* fix typo
2020-05-18 17:25:58 +02:00
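A hedged sketch of the kind of change involved above: when inverting an attention mask under fp16, the additive penalty must stay within the fp16 range (e.g. -1e4 rather than -1e9). This illustrates the idea, not the exact refactored `invert_attention_mask` method.
```python
import torch

def invert_attention_mask(encoder_attention_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # 1 -> keep, 0 -> mask out; masked positions receive a large negative bias.
    inverted = 1.0 - encoder_attention_mask.to(dtype)
    # -1e9 overflows to -inf in float16, which can produce NaNs downstream;
    # use a value representable in the active dtype instead.
    penalty = -1e4 if dtype == torch.float16 else -1e9
    return inverted * penalty
```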
Soham Chatterjee
fa6113f9a0
Fixed spelling of training ( #4416 )
2020-05-18 11:23:29 -04:00
Julien Chaumond
757baee846
Fix un-prefixed f-string
...
see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693
Hat/tip @girishponkiya
2020-05-18 11:20:46 -04:00
Patrick von Platen
a27c795908
fix ( #4419 )
2020-05-18 15:51:40 +02:00