* Add a script to check all models are tested and documented
* Apply suggestions from code review
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Address comments
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* Add strip_accents to basic tokenizer
* Add tests for strip_accents.
* fix style with black
* Fix strip_accents test
* empty commit to trigger CI
* Improved strip_accents check
* Improve code quality with an `is not False` check
* TF outputs and test on BERT
* Albert to DistilBert
* All remaining TF models except T5
* Documentation
* One file forgotten
* Add new models and fix issues
* Quality improvements
* Add T5
* A bit of cleanup
* Fix for slow tests
* Style
* Add SequenceClassification and MultipleChoice TF models to Electra
* Apply style
* Add summary_proj_to_labels to Electra config
* Finally mirroring the PT version of these models
* Apply style
* Fix Electra test
* improve unit tests
This is a sample of one test, per the request in https://github.com/huggingface/transformers/issues/5973, before applying it to the rest.
* batch 1
* batch 2
* batch 3
* batch 4
* batch 5
* style
* non-tf template
* last deletion of check_loss_output
* Fix TF Serving when output_hidden_states and output_attentions are True
* Add tests for saved model creation + bug fix for multiple choices models
* remove unused import
* Fix the input for several layers
* Fix test
* Fix conflict printing
* Apply style
* Fix XLM and Flaubert for TensorFlow
* Apply style
* Fix TF version check
* Apply style
* Trigger CI
* enable easy checkout switch
Allows having multiple repository checkouts without needing to remember to rerun `pip install -e .[dev]` when switching between checkouts to run tests.
* make isort happy
* examples needs one too
* initial commit for pipeline implementation
Addition of input processing and history concatenation
* Conversation pipeline tested and working for single & multiple conversation inputs
* Added docstrings for dialogue pipeline
* Addition of dialogue pipeline integration tests
* Delete test_t5.py
* Fixed max code length
* Updated styling
* Fixed test broken by formatting tools
* Removed unused import
* Added unit test for DialoguePipeline
* Fixed Tensorflow compatibility
* Fixed multi-framework support using framework flag
* - Fixed docstring
- Added `min_length_for_response` as an initialization parameter
- Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
- Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input
* - renamed pipeline name from dialogue to conversational
- removed hardcoded default value of 1000 and use config.max_length instead
- added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
- fixed bug in history truncation method
* - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)
* - Simplified input tensor conversion
* - Updated attention_mask value for Tensorflow compatibility
* - Updated last dialogue reference to conversational & fixed integration tests
* Fixed conflict with master
* Updates following review comments
* Updated formatting
* Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs
* Update src/transformers/pipelines.py
Updated docstring following review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
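The history handling described above can be sketched as follows (a simplified, hypothetical version of the `Conversation` container; the real class also tracks a UUID and which inputs have been processed):

```python
class Conversation:
    """Minimal sketch of a conversation container: history is updated
    through methods rather than direct field mutation."""

    def __init__(self, text):
        self.past_user_inputs = []
        self.generated_responses = []
        self.new_user_input = text

    def append_response(self, response):
        # Record the bot response and move the pending user input
        # into history, instead of mutating fields directly.
        self.generated_responses.append(response)
        self.past_user_inputs.append(self.new_user_input)
        self.new_user_input = None

    def add_user_input(self, text):
        self.new_user_input = text

    def history(self):
        # Interleave user inputs and bot responses in turn order.
        turns = []
        for user, bot in zip(self.past_user_inputs, self.generated_responses):
            turns.extend([user, bot])
        return turns
```

Routing all mutation through `append_response`/`add_user_input` matches the design note above about avoiding direct field mutation.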
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels
Tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038)
* Fully rework training/prediction loops
* fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* fix eval loss value
* use global step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use TF dataset enumerate instead of the Python one
* revert previous commit
* Fix data_dir
* Apply style
* rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre's comments
* Trigger CI
* Remove unused import
* Switch from return_tuple to return_dict
* Fix test
* Add recent model
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
* Added capability to quantize a model while exporting through ONNX.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
We do not support multiple extensions
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Reformat files
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* More quality
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Ensure test_generate_identified_name compares the same object types
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added documentation everywhere on ONNX exporter
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use pathlib.Path instead of plain-old string
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use f-string everywhere
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use the correct parameters for black formatting
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use Python 3 super() style.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use packaging.version to ensure installed onnxruntime version match requirements
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fixing imports sorting order.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Missing raise(s)
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added quantization documentation
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fix some spelling.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Fix bad list header format
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* DataParallel fixes:
1. Switched to a more precise check:
   - if self.args.n_gpu > 1:
   + if isinstance(model, nn.DataParallel):
2. Fixed tests: require the same fixup under DataParallel as in the training module
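The precision argument above can be sketched in isolation. `DataParallelLike` is a hypothetical stand-in for `torch.nn.DataParallel` (which keeps the real model under `.module`); the point is that inspecting the wrapper type is more reliable than counting visible GPUs:

```python
class DataParallelLike:
    # Hypothetical stand-in for torch.nn.DataParallel: the wrapper
    # keeps the original model under the .module attribute.
    def __init__(self, module):
        self.module = module


def unwrap(model):
    # Precise check: ask whether the model was actually wrapped,
    # instead of inferring it from n_gpu > 1 (several GPUs can be
    # visible even when the model was never wrapped).
    if isinstance(model, DataParallelLike):
        return model.module
    return model
```

For example, `unwrap(DataParallelLike(net))` returns the inner model, while `unwrap(net)` is a no-op on an unwrapped model.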
* another fix
* Don't pass sampler for iterable dataset
* Added check for test and eval dataloaders.
* Formatting
* Cleaner if nesting.
* Added test for trainer and iterable dataset
* Formatting for test
* Fixed import when torch is available only.
* Added require torch decorator to helper class
* Moved dataset class inside unittest
* Removed nested if and changed model in test
* Checking torch availability for IterableDataset
Slightly breaking change to the functionality of `use_cache` in XLNet: if `use_cache` is True and `mem_len` is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time, `use_cache` is overridden and always True.
* fix merge rebase
* add intermediate reformer code
* save intermediate caching results
* save intermediate
* save intermediate results
* save intermediate
* upload next step
* fix generate tests
* make tests work
* add named tuple output
* Apply suggestions from code review
* fix use_cache for False case
* fix tensor to gpu
* fix tensor to gpu
* refactor
* refactor and make style
* Reformer model head classification implementation for text classification
* Reformat the reformer model classification code
* Addressed PR review comments and implemented test cases for the reformer classification head changes
* Fixed CI/CD import error in the reformer classification head test
* Added ReformerForSequenceClassification to all_model_classes for CI/CD test cases
* Fixed code formatting
* Added normal test cases for the reformer classification head
* Fixed test case implementation for the reformer classification head
* Removed the token_type_id parameter from the reformer classification head
* Fixed the test case for the reformer classification head
* Fixed merge conflict with master
* Fixed merge conflict; changed reformer classification to accept the `choice_label` parameter added in the latest code
* Refactored the reformer classification head test code
* Fixed common transform test cases for the reformer classification head
* Final set of review comments: rearranged the reformer classes and added a docstring to the classification forward method
* Fixed the compilation error and test case for the reformer classification head
* Apply suggestions from code review
Remove unnecessary dup
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add B-/I- handling to grouping
* Add fix to include separate entity as last token
* move last_idx definition outside loop
* Use first entity in entity group as reference for entity type
* Add test cases
* Take out extra class accidentally added
* Return tf ner grouped test to original
* Take out redundant last entity
* Get last_idx safely
Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>
* Fix first entity comment
* Create separate functions for group_sub_entities and group_entities (splitting the call method into testable functions)
* Take out unnecessary last_idx
* Remove additional forward pass test
* Move token classification basic tests to separate class
* Move token classification basic tests back to MonoColumnInputTestCase
* Move base NER tests to NerPipelineTests
* Take out unused kwargs
* Add back mandatory_keys argument
* Add unitary tests for group_entities in _test_ner_pipeline
* Fix last entity handling
* Fix grouping function used
* Add typing to group_sub_entities and group_entities
Co-authored-by: ColleterVi <36503688+ColleterVi@users.noreply.github.com>
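A minimal sketch of the B-/I- grouping logic described above (simplified and hypothetical in its details; the real `group_entities` also aggregates scores and tokenizer offsets, and the first entity in a group sets the group's type):

```python
def group_entities(tokens):
    """Group (word, tag) pairs with B-/I- prefixes into entity spans."""
    groups, current = [], None
    for word, tag in tokens:
        if tag == "O":
            # Outside any entity: close the open group, if any.
            if current:
                groups.append(current)
                current = None
            continue
        prefix, _, etype = tag.partition("-")
        # Continue the open group only on I- with a matching type;
        # otherwise start a new group whose type comes from its
        # first entity.
        if current and prefix == "I" and current["entity"] == etype:
            current["word"] += " " + word
        else:
            if current:
                groups.append(current)
            current = {"entity": etype, "word": word}
    if current:
        groups.append(current)
    return groups
```

For instance, `[("Hugging", "B-ORG"), ("Face", "I-ORG")]` collapses into a single ORG span.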
* Default decoder inputs to encoder ones for T5 if neither are specified.
* Fixing typo, now all tests are passing.
* Changing einsum to operations supported by onnx
* Adding a test to ensure T5 can be exported to onnx op>9
* Modified test for onnx export to make it faster
* Styling changes.
* Styling changes.
* Changing notation for matrix multiplication
Co-authored-by: Abel Riboulot <tkai@protomail.com>
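The einsum-to-supported-ops change can be illustrated with NumPy (illustrative shapes; the actual T5 code operates on framework tensors): an attention-score `einsum` is equivalent to a matmul against a transposed key tensor, which older ONNX opsets can represent.

```python
import numpy as np

# (batch, heads, seq, head_dim) -- illustrative shapes
q = np.random.rand(2, 4, 8, 16)
k = np.random.rand(2, 4, 8, 16)

# einsum form, as commonly written for attention scores
scores_einsum = np.einsum("bnqd,bnkd->bnqk", q, k)

# equivalent matmul + transpose, expressible with basic ONNX ops
scores_matmul = np.matmul(q, k.transpose(0, 1, 3, 2))

assert np.allclose(scores_einsum, scores_matmul)
```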
* Added data collator for XLNet language modeling and related calls
Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.
Resolves: #4739, #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`
Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is handled similarly to `mems` for XLNet).
Changed calls and imports appropriately.
* Added detailed comments, changed variable names
Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up variable names and made them more informative.
* Added tests for new data collator
Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
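The span-masking idea behind the collator can be sketched in plain Python (a simplified, hypothetical helper; the real `DataCollatorForPermutationLanguageModeling` works on token tensors and also builds the permutation and target mappings):

```python
import random

def plm_mask(seq_len, plm_probability=1 / 6, max_span_length=5, seed=0):
    """Walk the sequence, masking spans of random length so that
    roughly `plm_probability` of positions end up masked."""
    rng = random.Random(seed)
    masked = [False] * seq_len
    cur = 0
    while cur < seq_len:
        span = rng.randint(1, max_span_length)
        # Reserve a context window around the span so the overall
        # masked fraction stays near plm_probability.
        context = int(span / plm_probability)
        start = cur + rng.randint(0, max(context - span, 0))
        for i in range(start, min(start + span, seq_len)):
            masked[i] = True
        cur += context
    return masked
```

With the defaults, each span of up to 5 tokens is placed inside a context window six times its length, so about one sixth of positions get masked.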