Sylvain Gugger
a5bd40b75c
Don't always consider a local model a checkpoint in run_glue ( #10517 )
2021-03-04 11:11:39 -05:00
Sylvain Gugger
745ea78dcc
Revert "Not always consider a local model a checkpoint in run_glue"
...
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger
f3660613bc
Don't always consider a local model a checkpoint in run_glue
2021-03-04 09:44:02 -05:00
Sylvain Gugger
948b730f97
Remove unsupported methods from ModelOutput doc ( #10505 )
2021-03-03 14:55:18 -05:00
Sylvain Gugger
b70f441b72
SMP grad accum ( #10488 )
...
* Fix gradient accumulation for SM Model Parallelism
* Style and divide loss by grad accum steps
2021-03-03 12:13:29 -05:00
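A minimal generic sketch of the loss-scaling pattern the second bullet refers to (names like `training_step` and `accum_steps` are illustrative, not the SageMaker-specific code in the PR):

```python
def training_step(model, batch, optimizer, step, accum_steps):
    """Generic gradient-accumulation pattern: scale each micro-batch loss by
    accum_steps so the summed gradients match one full-batch update."""
    loss = model(**batch).loss
    (loss / accum_steps).backward()  # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
    return loss.detach()
```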
felixgwu
d064fb5647
Fix the bug in constructing the all_hidden_states of DeBERTa v2 ( #10466 )
...
* fix all_hidden_states
* use output_states instead of next_kv
2021-03-03 12:05:21 -05:00
Stas Bekman
188574ac50
remap MODEL_FOR_QUESTION_ANSWERING_MAPPING classes to names in an auto-generated file ( #10487 )
...
* remap classes to strings
* missing new util
* style
* doc
* move the autogenerated file
* Trigger CI
2021-03-03 08:54:00 -08:00
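The change stores model classes as strings in an auto-generated file so the mapping can be consulted without importing every model. A hedged sketch of the idea; the entries and the `resolve_model_class` helper here are illustrative, not the repo's actual generated contents:

```python
import importlib
from collections import OrderedDict

# illustrative string-based mapping in the spirit of the auto-generated file;
# the real contents are generated into the transformers repo
MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES = OrderedDict(
    [
        ("BertConfig", "BertForQuestionAnswering"),
        ("RobertaConfig", "RobertaForQuestionAnswering"),
    ]
)

def resolve_model_class(config_class_name):
    # resolve lazily: only the requested class is actually imported/looked up
    class_name = MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES[config_class_name]
    return getattr(importlib.import_module("transformers"), class_name)
```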
Sylvain Gugger
801ff969ce
Refactor checkpoint name in BERT and MobileBERT ( #10424 )
...
* Refactor checkpoint name in BERT and MobileBERT
* Add option to check copies
* Add QuestionAnswering
* Add last models
* Make black happy
2021-03-03 11:21:17 -05:00
Jeff Yang
39f70a4058
feat(docs): navigate with left/right arrow keys ( #10481 )
...
* feat(docs): navigate with left/right arrow keys
* fix: add missing comma
2021-03-03 11:17:12 -05:00
Patrick von Platen
2d2ed2cc18
[T5] Fix speed degradation bug ( #10496 )
...
* fix speed degradation bug in t5
* fix for all models
* fix code quality
2021-03-03 12:42:41 +03:00
WybeKoper
5dc303e281
Fixed minor spelling mistakes ( #10489 )
...
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-03 14:17:25 +05:30
Mehrad Moradshahi
1750e62900
Generate can return cross-attention weights too ( #10493 )
2021-03-03 13:57:02 +05:30
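A short usage sketch, assuming an encoder-decoder model; with this change, asking `generate` for attentions also surfaces the decoder-to-encoder cross-attention weights:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
out = model.generate(
    **inputs,
    output_attentions=True,        # request attention weights
    return_dict_in_generate=True,  # get a structured output object
)
# with this change, encoder-decoder generate outputs expose cross-attentions
print(len(out.cross_attentions))
```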
Martin Schmitt
b013842244
Change `num_beams` to `num_beams // num_beam_groups` when initialising `PrefixConstrainedLogitsProcessor` in `_get_logits_processor`, fixing a compatibility issue when constrained decoding is used together with grouped beam search ( #10475 )
2021-03-02 10:41:54 +03:00
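A sketch of the setup where the fix matters: with grouped beam search, each group only runs `num_beams // num_beam_groups` beams, so the prefix-constrained logits processor must be sized per group. The constraint function below is a toy placeholder:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def allowed_tokens(batch_id, input_ids):
    # toy constraint: allow the whole vocabulary (a real fn would restrict it)
    return list(range(model.config.vocab_size))

inputs = tokenizer("summarize: some text", return_tensors="pt")
out = model.generate(
    **inputs,
    num_beams=4,
    num_beam_groups=2,  # each group runs 4 // 2 = 2 beams, hence the fix
    diversity_penalty=0.5,
    prefix_allowed_tokens_fn=allowed_tokens,
)
```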
Lysandre Debut
0c2325198f
Add I-BERT to README ( #10462 )
2021-03-01 12:12:31 -05:00
Lysandre Debut
9248e27037
Remove Anthony from the bug reports in Transformers
2021-03-01 10:23:40 -05:00
Suraj Patil
a106bde5a7
[Wav2Vec2FeatureExtractor] small fixes ( #10455 )
...
* small fixes
* don't check for None
2021-03-01 20:19:52 +05:30
Patrick von Platen
11655fafdd
remove feature extraction config ( #10457 )
2021-03-01 12:30:12 +03:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
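A minimal fine-tuning sketch of what this PR enables, assuming `speech` is a list of 16 kHz mono float arrays and `transcripts` the matching reference texts (both placeholders):

```python
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# speech: list of 16 kHz mono float arrays; transcripts: reference texts
inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)
with processor.as_target_processor():
    labels = processor(transcripts, return_tensors="pt", padding=True).input_ids
labels[labels == processor.tokenizer.pad_token_id] = -100  # mask padding in the CTC loss

loss = model(inputs.input_values, labels=labels).loss
loss.backward()
```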
Patrick von Platen
3c733f3208
Update ibert.rst ( #10445 )
2021-02-28 19:03:49 +03:00
Darigov Research
aeba4f95bb
Adds terms to Glossary ( #10443 )
...
* feat: Adds three definitions to glossary from @cronoik
Needed a definition for transformer which in turn needed 2 more definitions
To do with issue https://github.com/huggingface/transformers/issues/9078
* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg
256482ac92
Introduce save_strategy training argument ( #10286 )
...
* Introduce save_strategy training argument
* deprecate EvaluationStrategy
* collapse EvaluationStrategy and LoggingStrategy into a single
IntervalStrategy enum
* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
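A short sketch of the new argument; `save_strategy` accepts the same `IntervalStrategy` values (`"no"`, `"steps"`, `"epoch"`) now shared with evaluation and logging:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="epoch",
    save_strategy="epoch",  # checkpoint once per epoch instead of every save_steps
)
```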
Bhadresh Savani
aca6288ff4
updated logging and saving metrics ( #10436 )
...
* updated logging and saving metrics
* space removal
2021-02-27 09:53:44 -08:00
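A sketch of the logging/saving pattern these example scripts converge on, assuming a configured `Trainer`:

```python
# assumes a configured Trainer instance
train_result = trainer.train()
metrics = train_result.metrics
trainer.log_metrics("train", metrics)   # pretty-print the metrics to the console
trainer.save_metrics("train", metrics)  # write train_results.json to output_dir
```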
Stas Bekman
f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt ( #10428 )
...
This PR restores the original functionality that for some reason was modified.
Fixes: https://github.com/huggingface/transformers/issues/10381
@sgugger
2021-02-27 08:21:50 -08:00
Lysandre Debut
311b7048c5
Fix conda-build ( #10431 )
2021-02-26 20:20:30 -05:00
Stas Bekman
ee04b69822
[examples] better model example ( #10427 )
...
* refactors
* typo
2021-02-26 17:01:01 -08:00
Amog Kamsetty
a85eb616f7
Ray Tune Integration Bug Fixes ( #10406 )
...
* fixes
* update resources
* formatting
* remove import
* add log statement
* use fstring
* add period
* Update src/transformers/integrations.py
2021-02-26 19:06:08 -05:00
Kai Fricke
98569d4ba2
Add Ray Tune hyperparameter search integration test ( #10414 )
2021-02-26 10:18:33 -05:00
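A minimal sketch of the call path the integration test exercises, assuming a `Trainer` built with `model_init` so each trial starts from a fresh model:

```python
# assumes a Trainer built with model_init so each trial gets a fresh model
best_run = trainer.hyperparameter_search(
    backend="ray",
    n_trials=4,
    direction="minimize",
)
print(best_run.hyperparameters)
```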
Patrick von Platen
d03695f3a2
[LED] Correct Docs ( #10419 )
...
* correct docs
* correct tf model docs as well
2021-02-26 17:53:28 +03:00
Mansi Mane
7fc686efb1
SageMaker Model Parallel TensorBoard writing fix ( #10403 )
...
* Added tb fix
* Removed local rank condition
* Updated reference to args
2021-02-26 08:04:55 -05:00
Julien Chaumond
83d2d55c94
[ci, flax] non-existing models are unlikely to pass tests ( #10409 )
...
😂
2021-02-26 12:35:36 +03:00
Sylvain Gugger
17b6e0d474
Fix run_glue evaluation when model has a label correspondence ( #10401 )
2021-02-25 15:30:38 -05:00
Sylvain Gugger
26f8b2cb10
Make Barthez tokenizer tests a bit faster ( #10399 )
...
* Make Barthez tokenizer tests a bit faster
* Quality
2021-02-25 11:42:25 -05:00
Andrea Bacciu
b040e6efc1
Fix None in add_token_positions - issue #10210 ( #10374 )
...
* Fix None in add_token_positions, related to issue #10210
* Fix None values in the end_positions vector, as proposed by @joeddav
2021-02-25 09:18:33 -07:00
Sylvain Gugger
9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale ( #10354 )
...
* Add support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
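A hedged sketch of the new options; after this PR, `sharded_ddp` accepts the fairscale ZeRO variants alongside `"simple"` (the exact accepted strings here are an assumption based on the PR description):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    sharded_ddp="zero_dp_2 offload",  # ZeRO-2 sharding with CPU optimizer-state offload
)
```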
Lysandre Debut
88cc26dcd1
Ignore unexpected weights from PT conversion ( #10397 )
2021-02-25 10:42:27 -05:00
Sehoon Kim
63645b3b11
I-BERT model support ( #10153 )
...
* IBertConfig, IBertTokenizer added
* IBert model names modified
* tokenizer bugfix
* embedding -> QuantEmbedding
* quant utils added
* quant_mode added to configuration
* QuantAct added, Embedding layer + QuantAct addition
* QuantAct added
* unused path removed, QKV quantized
* self attention layer all quantized, except softmax
* temporary commit
* all linear layers quantized
* quant_utils bugfix
* bugfix: requantization missing
* IntGELU added
* IntSoftmax added
* LayerNorm implemented
* LayerNorm implemented all
* names changed: roberta->ibert
* config no longer inherits from RoBERTa
* No support for CausalLM
* static quantization added, quantize_model.py removed
* import modules uncommented
* copyrights fixed
* minor bugfix
* quant_modules, quant_utils merged as one file
* import * fixed
* unused runfile removed
* make style run
* configuration.py docstring fixed
* refactoring: comments removed, function name fixed
* unused dependency removed
* typo fixed
* comments(Copied from), assertion string added
* refactoring: super(..) -> super(), etc.
* refactoring
* refactoring
* make style
* refactoring
* cuda -> to(x.device)
* weight initialization removed
* QuantLinear set_param removed
* QuantEmbedding set_param removed
* IntLayerNorm set_param removed
* assert string added
* assertion error message fixed
* is_decoder removed
* enc-dec arguments/functions removed
* Converter removed
* quant_modules docstring fixed
* convert_slow_tokenizer rolled back
* quant_utils docstring fixed
* unused arguments, e.g. use_cache, removed from config
* weight initialization condition fixed
* x_min, x_max initialized with small values to avoid div-zero exceptions
* testing code for ibert
* test emb, linear, gelu, softmax added
* test ln and act added
* style reformatted
* force_dequant added
* error tests overridden
* make style
* Style + Docs
* force dequant tests added
* Fix fast tokenizer in init
* Fix doc
* Remove space
* docstring, IBertConfig, chunk_size
* test_modeling_ibert refactoring
* quant_modules.py refactoring
* e2e integration test added
* tokenizers removed
* IBertConfig added to tokenizer_auto.py
* bugfix
* fix docs & test
* fix style num 2
* final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
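A short usage sketch; `quant_mode` and `force_dequant` are the configuration switches the bullet list above refers to:

```python
from transformers import IBertConfig, IBertModel

# quant_mode=True activates the integer-only path (QuantEmbedding, QuantAct,
# IntGELU, IntSoftmax, IntLayerNorm above); force_dequant keeps chosen ops in float
config = IBertConfig(quant_mode=True, force_dequant="none")
model = IBertModel(config)
```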
Patrick von Platen
cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer ( #10324 )
...
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply Suraj's recommendations
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
2021-02-25 17:42:46 +03:00
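An inference-side usage sketch of the new processor, assuming `audio` is a 16 kHz mono float array (placeholder):

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# audio: a 16 kHz mono float array
inputs = processor(audio, sampling_rate=16000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values).logits
transcription = processor.batch_decode(torch.argmax(logits, dim=-1))
```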
abhishek thakur
9dc7825744
Remove unused variable in example for Q&A ( #10392 )
2021-02-25 09:18:47 -05:00
mingruimingrui
894db6701e
Bugfix: Removal of padding_idx in BartLearnedPositionalEmbedding ( #10200 )
...
* Assumption of padding_idx < 2 might not hold
* Use offset instead of 2
* Fix with black
* Change behavior to warning instead for backward compatibility.
* Fix with black
* Remove warning
* Make padding_idx non-required
* padding_idx fix for blenderbot
* padding_idx fix for blenderbot_small
* padding_idx fix for led
* padding_idx fix for mbart
* Remove extra whitespaces
* padding_idx fix for template
* Fix padding_idx passed to nn.Embedding mistake
* Fixed padding_idx passed to positional embedding in template
* Remove padding_idx from pytorch learned positional embeddings
* Remove accidentally added quotes
* Remove padding_idx from tf learned positional embeddings
* Remove zeroing of weights in __init__
Co-authored-by: Wang Ming Rui <mingrui.wang@C02CJTUYMD6M.local>
2021-02-25 14:33:13 +03:00
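A sketch of the offset approach these commits converge on: rather than reserving `padding_idx`, shift every position by a fixed offset (2 for Bart-style models) and allocate that many extra embedding rows:

```python
import torch
from torch import nn

class LearnedPositionalEmbedding(nn.Embedding):
    """Shift every position by a fixed offset instead of reserving padding_idx,
    and allocate offset extra embedding rows."""

    def __init__(self, num_embeddings: int, embedding_dim: int):
        self.offset = 2  # Bart-style models historically reserved positions 0-1
        super().__init__(num_embeddings + self.offset, embedding_dim)

    def forward(self, input_ids_shape: torch.Size, past_key_values_length: int = 0):
        bsz, seq_len = input_ids_shape[:2]
        positions = torch.arange(
            past_key_values_length, past_key_values_length + seq_len, device=self.weight.device
        )
        return super().forward(positions + self.offset)
```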
Lysandre Debut
55fe80d084
Only run model templates tests once ( #10388 )
2021-02-24 19:48:00 -05:00
Lysandre Debut
22bd047e91
Run GA on every push even on forks ( #10383 )
2021-02-24 19:23:39 -05:00
Lysandre
3591844306
v4.3.3 docs
2021-02-24 15:19:01 -05:00
Stas Bekman
bdbb2c756b
[trainer] move secondary methods into a separate file ( #10363 )
...
* move secondary methods into a separate file
* cleanup
* style
2021-02-24 08:32:52 -08:00
Poedator
5f2a3d721c
fix deprecated ref to `tokenizer.max_len` ( #10220 )
...
This is to fix a deprecated reference to `tokenizer.max_len` by using `tokenizer.model_max_length` instead - similar to [issue 8739](https://github.com/huggingface/transformers/issues/8739) and [PR 8604](https://github.com/huggingface/transformers/pull/8604).
Example [here](https://colab.research.google.com/gist/poedator/f8776349e5c625ce287fc6fcd312fa1e/tokenizer-max_len-error-in-transformers_glue.ipynb). The error happens when `glue_convert_examples_to_features` is called without the `max_length` parameter; in that case, line 119, which contains the wrong reference, gets called. This simple fix should do it.
2021-02-24 09:01:28 -05:00
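The one-line replacement the fix applies, sketched:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# deprecated: tokenizer.max_len
max_length = tokenizer.model_max_length  # the replacement this PR applies
encoded = tokenizer("some text", truncation=True, max_length=max_length)
```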
Julien Plu
cdcdd5f03a
Rework casts ( #10274 )
2021-02-24 08:38:29 -05:00
abhishek thakur
2d458b2c7d
ConvBERT: fix torch <-> tf weights conversion ( #10314 )
...
* convbert conversion test
* fin
* fin
* fin
* clean up tf<->pt conversion
* remove from_pt
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-02-24 14:55:34 +03:00
Stas Bekman
3437d12134
[Trainer/Deepspeed] handle get_last_lr() before first step() ( #10362 )
...
* handle get_last_lr() before first step()
* abstract away the lr getting logic
* cleanup
* add test
* move to utils
2021-02-23 17:42:25 -08:00
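A hedged sketch of the guard this PR adds (the helper name is hypothetical; the real logic lives inside `Trainer`):

```python
def get_current_lr(lr_scheduler, fallback_lr=0.0):
    """With DeepSpeed, get_last_lr() raises before the first optimizer step,
    so fall back to a placeholder instead of crashing the logging code."""
    try:
        return lr_scheduler.get_last_lr()[0]
    except AssertionError as e:
        if "need to call step" in str(e):
            return fallback_lr  # scheduler has not stepped yet
        raise
```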
Julien Chaumond
4a1ab7cb6c
[bert-base-german-cased] copy to hardcoded URLs ( #10353 )
2021-02-23 12:30:47 -05:00
Akmal
23e87c27be
Fix broken examples/seq2seq/README.md markdown ( #10344 )
2021-02-23 10:49:25 -05:00
Lysandre
83f890ddd1
Easier self-scheduled debugging
2021-02-23 08:53:55 -05:00