Граф коммитов

6667 Коммитов

Автор SHA1 Сообщение Дата
Sylvain Gugger a5bd40b75c
Not always consider a local model a checkpoint in run_glue (#10517) 2021-03-04 11:11:39 -05:00
Sylvain Gugger 745ea78dcc Revert "Not always consider a local model a checkpoint in run_glue"
This reverts commit f3660613bc.
2021-03-04 09:45:18 -05:00
Sylvain Gugger f3660613bc Not always consider a local model a checkpoint in run_glue 2021-03-04 09:44:02 -05:00
Sylvain Gugger 948b730f97
Remove unsupported methods from ModelOutput doc (#10505) 2021-03-03 14:55:18 -05:00
Sylvain Gugger b70f441b72
Smp grad accum (#10488)
* Fix gradient accumulation for SM Model Parallelism

* Style and divide loss by grad accum steps
2021-03-03 12:13:29 -05:00
felixgwu d064fb5647
Fix the bug in constructing the all_hidden_states of DeBERTa v2 (#10466)
* fix all_hidden_states

* use output_states instead of next_kv
2021-03-03 12:05:21 -05:00
Stas Bekman 188574ac50
remap MODEL_FOR_QUESTION_ANSWERING_MAPPING classes to names auto-generated file (#10487)
* remap classes to strings

* missing new util

* style

* doc

* move the autogenerated file

* Trigger CI
2021-03-03 08:54:00 -08:00
Sylvain Gugger 801ff969ce
Refactor checkpoint name in BERT and MobileBERT (#10424)
* Refactor checkpoint name in BERT and MobileBERT

* Add option to check copies

* Add QuestionAnswering

* Add last models

* Make black happy
2021-03-03 11:21:17 -05:00
Jeff Yang 39f70a4058
feat(docs): navigate with left/right arrow keys (#10481)
* feat(docs): navigate with left/right arrow keys

* fix: add missing comma
2021-03-03 11:17:12 -05:00
Patrick von Platen 2d2ed2cc18
[T5] Fix speed degradation bug t5 (#10496)
* fix speed degradation bug t5

* fix for all models

* fix code quality
2021-03-03 12:42:41 +03:00
WybeKoper 5dc303e281
Fixed minor spelling mistakes (#10489)
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
2021-03-03 14:17:25 +05:30
Mehrad Moradshahi 1750e62900
Generate can return cross-attention weights too (#10493) 2021-03-03 13:57:02 +05:30
Martin Schmitt b013842244
Changed `num_beams` to `num_beams // num_beam_groups` when initialising `PrefixConstrainedLogitsProcessor` in `_get_logits_processor` to fix compatibility issue when constrained decoding is used together with grouped beam search (#10475) 2021-03-02 10:41:54 +03:00
Lysandre Debut 0c2325198f
Add I-BERT to README (#10462) 2021-03-01 12:12:31 -05:00
Lysandre Debut 9248e27037
Remove Anthony from the bug reports in Transformers 2021-03-01 10:23:40 -05:00
Suraj Patil a106bde5a7
[Wav2Vec2FeatureExtractor] smal fixes (#10455)
* smal fixes

* don't check for None
2021-03-01 20:19:52 +05:30
Patrick von Platen 11655fafdd
remove feature extraction config (#10457) 2021-03-01 12:30:12 +03:00
Patrick von Platen 0234de8418
Add Fine-Tuning for Wav2Vec2 (#10145)
* add encode labels function to tokenizer

* start adding finetuning

* init dropout

* upload

* correct convert script

* apply changes

* fix second typo

* make first dummy training run

* adapt convert script

* push confg for comparison

* remove conf

* finish training

* adapt data collator

* add research folder

* update according to fairseq feedback

* some minor corrections

* refactor masking indices a bit

* some minor changes

* clean tokenizer

* finish clean-up

* remove previous logic

* update run script

* correct training

* finish changes

* finish model

* correct bug

* fix training a bit more

* add some tests

* finish gradient checkpointing

* finish example

* correct gradient checkpointing

* improve tokenization method

* revert changes in tokenizer

* revert general change

* adapt fine-tuning

* update

* save intermediate test

* Update README.md

* finish finetuning

* delete conversion script

* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* finish wav2vec2 script

* finish wav2vec2 fine-tuning

* finalize test

* correct test

* adapt tests

* finish

* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Patrick von Platen 3c733f3208
Update ibert.rst (#10445) 2021-02-28 19:03:49 +03:00
Darigov Research aeba4f95bb
Adds terms to Glossary (#10443)
* feat: Adds three definitions to glossary from @cronoik

Needed a definition for transformer which in turn needed 2 more definitions

To do with issue https://github.com/huggingface/transformers/issues/9078

* fix: Adjusts definition of neural network to make it easier to read
2021-02-28 08:27:54 -05:00
Tanmay Garg 256482ac92
Introduce save_strategy training argument (#10286)
* Introduce save_strategy training argument

* deprecate EvaluationStrategy

* collapse EvaluationStrategy and LoggingStrategy into a single
  IntervalStrategy enum

* modify tests to use modified enum
2021-02-27 19:34:22 -05:00
Bhadresh Savani aca6288ff4
updated logging and saving metrics (#10436)
* updated logging and saving metrics

* space removal
2021-02-27 09:53:44 -08:00
Stas Bekman f52a15897b
[run_seq2seq.py] restore functionality: saving to test_generations.txt (#10428)
This PR restores the original functionality that for some reason was modified.

Fixes: https://github.com/huggingface/transformers/issues/10381

@sgugger
2021-02-27 08:21:50 -08:00
Lysandre Debut 311b7048c5
Fix conda-build (#10431) 2021-02-26 20:20:30 -05:00
Stas Bekman ee04b69822
[examples] better model example (#10427)
* refactors

* typo
2021-02-26 17:01:01 -08:00
Amog Kamsetty a85eb616f7
Ray Tune Integration Bug Fixes (#10406)
* fixes

* update resources

* formatting

* remove import

* add log statement

* use fstring

* add period

* Update src/transformers/integrations.py
2021-02-26 19:06:08 -05:00
Kai Fricke 98569d4ba2
Add Ray Tune hyperparameter search integration test (#10414) 2021-02-26 10:18:33 -05:00
Patrick von Platen d03695f3a2
[LED] Correct Docs (#10419)
* correct docs

* correct tf model docs as well
2021-02-26 17:53:28 +03:00
Mansi Mane 7fc686efb1
Sagemaker Model Parallel tensoboard writing fix (#10403)
* Added tb fix

* Removed local rank condition

* Updated reference to args
2021-02-26 08:04:55 -05:00
Julien Chaumond 83d2d55c94
[ci, flax] non-existing models are unlikely to pass tests (#10409)
😂
2021-02-26 12:35:36 +03:00
Sylvain Gugger 17b6e0d474
Fix run_glue evaluation when model has a label correspondence (#10401) 2021-02-25 15:30:38 -05:00
Sylvain Gugger 26f8b2cb10
Make Barthez tokenizer tests a bit faster (#10399)
* Make Barthez tokenizer tests a bit faster

* Quality
2021-02-25 11:42:25 -05:00
Andrea Bacciu b040e6efc1
Fix None in add_token_positions - issue #10210 (#10374)
* Fix None in add_token_positions - issue #10210

Fix None in add_token_positions related to the issue #10210

* add_token_positions fix None values in end_positions vector

add_token_positions fix None in end_positions vector as proposed by @joeddav
2021-02-25 09:18:33 -07:00
Sylvain Gugger 9d14be5c20
Add support for ZeRO-2/3 and ZeRO-offload in fairscale (#10354)
* Ass support for ZeRO-2/3 and ZeRO-offload in fairscale

* Quality

* Rework from review comments

* Add doc

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-02-25 11:07:53 -05:00
Lysandre Debut 88cc26dcd1
Ignore unexpected weights from PT conversion (#10397) 2021-02-25 10:42:27 -05:00
Sehoon Kim 63645b3b11
I-BERT model support (#10153)
* IBertConfig, IBertTokentizer added

* IBert Model names moified

* tokenizer bugfix

* embedding -> QuantEmbedding

* quant utils added

* quant_mode added to configuration

* QuantAct added, Embedding layer + QuantAct addition

* QuantAct added

* unused path removed, QKV quantized

* self attention layer all quantized, except softmax

* temporarl commit

* all liner layers quantized

* quant_utils bugfix

* bugfix: requantization missing

* IntGELU added

* IntSoftmax added

* LayerNorm implemented

* LayerNorm implemented all

* names changed: roberta->ibert

* config not inherit from ROberta

* No support for CausalLM

* static quantization added, quantize_model.py removed

* import modules uncommented

* copyrights fixed

* minor bugfix

* quant_modules, quant_utils merged as one file

* import * fixed

* unused runfile removed

* make style run

* configutration.py docstring fixed

* refactoring: comments removed, function name fixed

* unused dependency removed

* typo fixed

* comments(Copied from), assertion string added

* refactoring: super(..) -> super(), etc.

* refactoring

* refarctoring

* make style

* refactoring

* cuda -> to(x.device)

* weight initialization removed

* QuantLinear set_param removed

* QuantEmbedding set_param removed

* IntLayerNorm set_param removed

* assert string added

* assertion error message fixed

* is_decoder removed

* enc-dec arguments/functions removed

* Converter removed

* quant_modules docstring fixed

* conver_slow_tokenizer rolled back

* quant_utils docstring fixed

* unused aruments e.g. use_cache removed from config

* weight initialization condition fixed

* x_min, x_max initialized with small values to avoid div-zero exceptions

* testing code for ibert

* test emb, linear, gelu, softmax added

* test ln and act added

* style reformatted

* force_dequant added

* error tests overrided

* make style

* Style + Docs

* force dequant tests added

* Fix fast tokenizer in init

* Fix doc

* Remove space

* docstring, IBertConfig, chunk_size

* test_modeling_ibert refactoring

* quant_modules.py refactoring

* e2e integration test added

* tokenizers removed

* IBertConfig added to tokenizer_auto.py

* bugfix

* fix docs & test

* fix style num 2

* final fixes

Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00
Patrick von Platen cb38ffcc5e
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show

* small improvement

* small improvement

* Update src/transformers/feature_extraction_utils.py

* Update src/transformers/feature_extraction_utils.py

* implement base

* add common tests

* make all tests pass for wav2vec2

* make padding work & add more tests

* finalize feature extractor utils

* add call method to feature extraction

* finalize feature processor

* finish tokenizer

* finish general processor design

* finish tests

* typo

* remove bogus file

* finish docstring

* add docs

* finish docs

* small fix

* correct docs

* save intermediate

* load changes

* apply changes

* apply changes to doc

* change tests

* apply surajs recommend

* final changes

* Apply suggestions from code review

* fix typo

* fix import

* correct docstring
2021-02-25 17:42:46 +03:00
abhishek thakur 9dc7825744
Remove unused variable in example for Q&A (#10392) 2021-02-25 09:18:47 -05:00
mingruimingrui 894db6701e
Bugfix: Removal of padding_idx in BartLearnedPositionalEmbedding (#10200)
* Assumption of padding_idx <2 might not stand

* Use offset instead of 2

* Fix with black

* Change behavior to warning instead for backward compatibility.

* Fix with black

* Remove warning

* Make padding_idx non-required

* padding_idx fix for blenderbot

* padding_idx fix for blenderbot_small

* padding_idx fix for led

* padding_idx fix for mbart

* Remove extra whitespaces

* padding_idx fix for template

* Fix padding_idx passed to nn.Embedding mistake

* Fixed padding_idx passed to positional embedding in template

* Remove padding_idx from pytorch learned positional embeddings

* Remove accidentally added quotes

* Remove padding_idx from tf learned positional embeddings

* Remove zeroing of weights in __init__

Co-authored-by: Wang Ming Rui <mingrui.wang@C02CJTUYMD6M.local>
2021-02-25 14:33:13 +03:00
Lysandre Debut 55fe80d084
Only run model templates tests once (#10388) 2021-02-24 19:48:00 -05:00
Lysandre Debut 22bd047e91
Run GA on every push even on forks (#10383) 2021-02-24 19:23:39 -05:00
Lysandre 3591844306 v4.3.3 docs 2021-02-24 15:19:01 -05:00
Stas Bekman bdbb2c756b
[trainer] move secondary methods into a separate file (#10363)
* move secondary methods into a separate file

* cleanup

* style
2021-02-24 08:32:52 -08:00
Poedator 5f2a3d721c
fix deprecated ref to `tokenizer.max_len` (#10220)
This is to fix deprecated reference to `tokenizer.max_len` with `tokenizer.model_max_length` - similar to [issue 8739](https://github.com/huggingface/transformers/issues/8739) and [PR 8604](https://github.com/huggingface/transformers/pull/8604). 
Example [here](https://colab.research.google.com/gist/poedator/f8776349e5c625ce287fc6fcd312fa1e/tokenizer-max_len-error-in-transformers_glue.ipynb). The error happens when `glue_convert_examples_to_features` is called without `max_length` parameter specified. In that case line 119 with wrong reference gets called. This simple fix should  do it.
2021-02-24 09:01:28 -05:00
Julien Plu cdcdd5f03a
Rework casts (#10274) 2021-02-24 08:38:29 -05:00
abhishek thakur 2d458b2c7d
ConvBERT fix torch <> tf weights conversion (#10314)
* convbert conversion test

* fin

* fin

* fin

* clean up tf<->pt conversion

* remove from_pt

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-02-24 14:55:34 +03:00
Stas Bekman 3437d12134
[Trainer/Deepspeed] handle get_last_lr() before first step() (#10362)
* handle get_last_lr() before first step()

* abstract away the lr getting logic

* cleanup

* add test

* move to utils
2021-02-23 17:42:25 -08:00
Julien Chaumond 4a1ab7cb6c
[bert-base-german-cased] cp to hardcoded urls (#10353) 2021-02-23 12:30:47 -05:00
Akmal 23e87c27be
Fix broken examples/seq2seq/README.md markdown (#10344) 2021-02-23 10:49:25 -05:00
Lysandre 83f890ddd1 Easier self-scheduled debugging 2021-02-23 08:53:55 -05:00