Lysandre
10d72390c0
Revert #4446 since it introduces a new dependency
2020-05-22 10:49:45 -04:00
Lysandre
e0db6bbd65
Release: v2.10.0
2020-05-22 10:37:44 -04:00
Frankie Liuzzi
bd6e301832
added functionality for electra classification head ( #4257 )
...
* added functionality for electra classification head
* unneeded dropout
* Test ELECTRA for sequence classification
* Style
Co-authored-by: Frankie <frankie@frase.io>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-05-22 09:48:21 -04:00
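A minimal usage sketch for the ELECTRA sequence-classification head added in the commit above; the checkpoint name and label count are placeholders, and the tuple-style return indexing follows the v2.x API.
```python
import torch
from transformers import ElectraTokenizer, ElectraForSequenceClassification

# Placeholder checkpoint; any ELECTRA checkpoint works the same way.
tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraForSequenceClassification.from_pretrained(
    "google/electra-small-discriminator", num_labels=2
)

inputs = tokenizer.encode_plus("This movie was great!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs)[0]  # (batch_size, num_labels) in the tuple-return API
print(logits.argmax(dim=-1))
```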
Lysandre
a086527727
Unused Union should not be imported
2020-05-21 09:42:47 -04:00
Lysandre Debut
9d2ce253de
TPU hangs when saving optimizer/scheduler ( #4467 )
...
* TPU hangs when saving optimizer/scheduler
* Style
* ParallelLoader is not a DataLoader
* Style
* Addressing @julien-c's comments
2020-05-21 09:18:27 -04:00
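A hedged sketch of the torch_xla saving pattern the commit above is about: every process reaches the rendezvous and `xm.save` handles writing from the master ordinal, so no process is left waiting. This illustrates the general pattern, not the exact Trainer code.
```python
import torch_xla.core.xla_model as xm

def save_checkpoint(optimizer, scheduler, output_dir):
    # All processes must hit the rendezvous; if some skip the saving branch,
    # the others wait forever and the TPU job hangs.
    xm.rendezvous("saving_optimizer_states")
    # xm.save moves tensors to CPU and writes only from the master ordinal.
    xm.save(optimizer.state_dict(), f"{output_dir}/optimizer.pt")
    xm.save(scheduler.state_dict(), f"{output_dir}/scheduler.pt")
```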
Zhangyx
49296533ca
Adds predict stage for GLUE tasks, and generates result files which can be submitted to gluebenchmark.com ( #4463 )
...
* Adds a predict stage for GLUE tasks and generates result files that can be submitted to the gluebenchmark.com website.
* Use Split enum + always output the label name
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-21 09:17:44 -04:00
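A hedged sketch of what the added predict stage does: run the trained model over the test split and write a TSV of predicted label names in the format gluebenchmark.com expects. Variable names (`trainer`, `test_dataset`, `label_list`, `task_name`, `output_dir`) are assumed as in the run_glue example, not quoted from the script.
```python
import os
import numpy as np

predictions = trainer.predict(test_dataset=test_dataset).predictions
predicted_ids = np.argmax(predictions, axis=1)

output_file = os.path.join(output_dir, f"test_results_{task_name}.txt")
with open(output_file, "w") as writer:
    writer.write("index\tprediction\n")
    for index, pred in enumerate(predicted_ids):
        # Always output the human-readable label name, as in the commit body.
        writer.write(f"{index}\t{label_list[pred]}\n")
```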
Tobias Lee
271bedb485
[examples] fix no grad in second pruning in run_bertology ( #4479 )
...
* fix no grad in second pruning and typo
* fix prune heads attention mismatch problem
* fix
* fix
* fix
* run make style
* run make style
2020-05-21 09:17:03 -04:00
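A hedged illustration of the kind of fix described above: wrap the second head-importance pass in `torch.no_grad()` so no autograd graph is built. `compute_heads_importance` and its arguments are placeholders standing in for the run_bertology helpers.
```python
import torch

# Second pruning evaluation: gradients are not needed here, so disable them.
with torch.no_grad():
    head_importance, preds, labels = compute_heads_importance(
        args, model, eval_dataloader, compute_entropy=False
    )
```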
Julien Chaumond
865d4d595e
[ci] Close #4481
2020-05-20 18:27:42 -04:00
Julien Chaumond
a3af8e86cb
Update test_trainer_distributed.py
2020-05-20 18:26:51 -04:00
Cola
eacea530c1
🚨 Remove warning of deprecation ( #4477 )
...
Remove warning of deprecated overload of addcdiv_
Fix #4451
2020-05-20 16:48:29 -04:00
Julien Plu
fa2fbed3e5
Better None gradients handling in TF Trainer ( #4469 )
...
* Better None gradients handling
* Apply Style
* Apply Style
2020-05-20 16:46:21 -04:00
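A hedged sketch of one common way to handle None gradients in a TF training step (replace them with zeros so `apply_gradients` still works for variables that did not contribute to the loss); not necessarily the exact logic adopted in the TF Trainer.
```python
import tensorflow as tf

def training_step(model, optimizer, features, labels, loss_fn):
    with tf.GradientTape() as tape:
        logits = model(features, training=True)
        loss = loss_fn(labels, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    # Variables that do not contribute to the loss get a None gradient;
    # substitute zeros so the optimizer update does not fail.
    grads = [
        g if g is not None else tf.zeros_like(v)
        for g, v in zip(grads, model.trainable_variables)
    ]
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```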
Oliver Åstrand
e708bb75bf
Correct TF formatting to exclude LayerNorms from weight decay ( #4448 )
...
* Exclude LayerNorms from weight decay
* Include both formats of layer norm
2020-05-20 16:45:59 -04:00
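A hedged sketch of the exclusion logic described above: match both naming formats of layer norm variables when deciding which weights get weight decay. Illustrative only, not the exact optimizer code.
```python
import re

# Keras/TF variable names may contain either "LayerNorm" or "layer_norm".
EXCLUDE_FROM_WEIGHT_DECAY = ["LayerNorm", "layer_norm", "bias"]

def use_weight_decay(variable_name: str) -> bool:
    # Apply weight decay only to variables that match none of the patterns.
    return not any(re.search(p, variable_name) for p in EXCLUDE_FROM_WEIGHT_DECAY)
```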
Rens
49c06132df
pass on tokenizer to pipeline ( #4489 )
2020-05-20 22:23:21 +02:00
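A minimal sketch of what the fix above enables: the tokenizer passed to `pipeline(...)` is actually forwarded to the underlying pipeline rather than being ignored. The checkpoint name is a placeholder.
```python
from transformers import AutoTokenizer, pipeline

# Placeholder checkpoint; any compatible model/tokenizer pair works the same way.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    tokenizer=tokenizer,  # explicitly passed tokenizer is now honored
)
print(classifier("Transformers pipelines are easy to use."))
```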
Nathan Cooper
cacb654c7f
Add Fine-tune DialoGPT on new datasets notebook ( #4473 )
2020-05-20 16:17:52 -04:00
Timo Moeller
30a09f3827
Adjust German BERT model card, add new model card ( #4488 )
2020-05-20 16:08:29 -04:00
Lysandre Debut
14cb5b35fa
Fix slow gpu tests lysandre ( #4487 )
...
* There is one missing key in BERT
* Correct device for CamemBERT model
* RoBERTa tokenization adding prefix space
* Style
2020-05-20 11:59:45 -04:00
Manuel Romero
6dc52c78d8
Create README.md ( #4482 )
2020-05-20 09:45:50 -04:00
Manuel Romero
ed5456daf4
Model card for RuPERTa-base fine-tuned for NER ( #4466 )
2020-05-20 09:45:24 -04:00
Oleksandr Bushkovskyi
c76450e20c
Model card for Tereveni-AI/gpt2-124M-uk-fiction ( #4470 )
...
Create model card for "Tereveni-AI/gpt2-124M-uk-fiction" model
2020-05-20 09:44:26 -04:00
Hu Xu
9907dc523a
add BERT trained from review corpus. ( #4405 )
...
* add model_cards for BERT trained on reviews.
* add link to repository.
* refine README.md for each review model
2020-05-20 09:42:35 -04:00
Sam Shleifer
efbc1c5a9d
[MarianTokenizer] implement save_vocabulary and other common methods ( #4389 )
2020-05-19 19:45:49 -04:00
Sam Shleifer
956c4c4eb4
[gpu slow tests] fix mbart-large-enro gpu tests ( #4472 )
2020-05-19 19:45:31 -04:00
Patrick von Platen
48c3a70b4e
[Longformer] Docs and clean API ( #4464 )
...
* add longformer docs
* improve docs
2020-05-19 21:52:36 +02:00
Patrick von Platen
aa925a52fa
[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch ( #4468 )
...
* fix gpu slow tests in pytorch
* change model to device syntax
2020-05-19 21:35:04 +02:00
Suraj Patil
5856999a9f
add T5 fine-tuning notebook [Community notebooks] ( #4462 )
...
* add T5 fine-tuning notebook [Community notebooks]
* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-05-19 18:26:28 +02:00
Sam Shleifer
07dd7c2fd8
[cleanup] test_tokenization_common.py ( #4390 )
2020-05-19 10:46:55 -04:00
Iz Beltagy
8f1d047148
Longformer ( #4352 )
...
* first commit
* bug fixes
* better examples
* undo padding
* remove wrong VOCAB_FILES_NAMES
* License
* make style
* make isort happy
* unit tests
* integration test
* make `black` happy by undoing `isort` changes!!
* lint
* no need for the padding value
* batch_size not bsz
* remove unused type casting
* seqlen not seq_len
* staticmethod
* `bert` selfattention instead of `n2`
* uint8 instead of bool + lints
* pad inputs_embeds using embeddings not a constant
* black
* unit test with padding
* fix unit tests
* remove redundant unit test
* upload model weights
* resolve todo
* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
* increase unittest coverage
2020-05-19 16:04:43 +02:00
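A hedged usage sketch for the Longformer model introduced above, in the v2.x tuple-return style. The attention-mask convention shown (0 = padding, 1 = local attention, 2 = global attention) follows the initial implementation's docs and should be treated as illustrative.
```python
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

text = "Long documents are handled with windowed self-attention."
input_ids = tokenizer.encode(text, return_tensors="pt")

# 1 = local (sliding-window) attention everywhere, 2 = global attention
# on the first token only.
attention_mask = torch.ones_like(input_ids)
attention_mask[:, 0] = 2

outputs = model(input_ids, attention_mask=attention_mask)
sequence_output = outputs[0]  # (batch, seq_len, hidden) in the tuple-return API
```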
Girishkumar
31eedff5a0
Refactored the README.md file ( #4427 )
2020-05-19 09:56:24 -04:00
Shaoyen
384f0eb2f9
Map optimizer to correct device after loading from checkpoint. ( #4403 )
...
* Map optimizer to correct device after loading from checkpoint.
* Make style test pass
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 23:16:05 -04:00
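A hedged sketch of the idea behind the fix above: load the optimizer state with a `map_location` so resumed training does not mix CPU and GPU tensors. `optimizer`, `scheduler`, and the checkpoint paths are assumed, as in the Trainer's resume path.
```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Map the saved optimizer state onto the device the model is training on.
optimizer.load_state_dict(
    torch.load("checkpoint/optimizer.pt", map_location=device)
)
scheduler.load_state_dict(torch.load("checkpoint/scheduler.pt"))
```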
Julien Chaumond
bf14ef75f1
[Trainer] move model to device before setting optimizer ( #4450 )
2020-05-18 23:13:33 -04:00
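A hedged sketch of the ordering this change enforces: place the model on its target device before constructing the optimizer, so any state the optimizer builds is created against parameters that already live on the right device. `model` and `device` are assumed to exist.
```python
from transformers import AdamW

model.to(device)                               # model on its target device first ...
optimizer = AdamW(model.parameters(), lr=5e-5)  # ... then hand its parameters to the optimizer
```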
Julien Chaumond
5e7fe8b585
Distributed eval: SequentialDistributedSampler + gather all results ( #4243 )
...
* Distributed eval: SequentialDistributedSampler + gather all results
* For consistency only write to disk from world_master
Close https://github.com/huggingface/transformers/issues/4272
* Working distributed eval
* Hook into scripts
* Fix #3721 again
* TPU.mesh_reduce: stay in tensor space
Thanks @jysohn23
* Just a small comment
* whitespace
* torch.hub: pip install packaging
* Add test scenarios
2020-05-18 22:02:39 -04:00
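A hedged sketch of the gather step described above: each process evaluates a contiguous shard and the per-process result tensors are all-gathered so the world master can compute metrics over the full dataset. This illustrates the torch.distributed pattern rather than quoting the exact Trainer helper.
```python
import torch
import torch.distributed as dist

def distributed_concat(tensor: torch.Tensor, num_total_examples: int) -> torch.Tensor:
    # Gather this shard's predictions from every process ...
    output_tensors = [tensor.clone() for _ in range(dist.get_world_size())]
    dist.all_gather(output_tensors, tensor)
    concatenated = torch.cat(output_tensors, dim=0)
    # ... and drop the samples the sampler added to make shards equal-sized.
    return concatenated[:num_total_examples]
```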
Julien Chaumond
4c06893610
Fix nn.DataParallel compatibility in PyTorch 1.5 ( #4300 )
...
* Test case for #3936
* multigpu tests pass on pytorch 1.4.0
* Fixup
* multigpu tests pass on pytorch 1.5.0
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* rename multigpu to require_multigpu
* mode doc
2020-05-18 20:34:50 -04:00
Rakesh Chada
9de4afa897
Make get_last_lr in trainer backward compatible ( #4446 )
...
* makes fetching the last learning rate in the trainer backward compatible
* split comment to multiple lines
* fixes black styling issue
* uses version to create a more explicit logic
2020-05-18 20:17:36 -04:00
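A hedged sketch of the version-based logic described above: newer torch LR schedulers expose `get_last_lr()`, older ones only `get_lr()`. The version cutoff shown is illustrative.
```python
import torch
from packaging import version

def get_current_learning_rate(scheduler) -> float:
    # get_last_lr() only exists on newer torch schedulers; fall back to
    # get_lr() on older releases (cutoff here is illustrative).
    if version.parse(torch.__version__) >= version.parse("1.4.0"):
        return scheduler.get_last_lr()[0]
    return scheduler.get_lr()[0]
```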
Stefan Dumitrescu
42e8fbfc51
Added model cards for Romanian BERT models ( #4437 )
...
* Create README.md
* Create README.md
* Update README.md
* Update README.md
* Apply suggestions from code review
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 18:48:56 -04:00
Oliver Guhr
54065d68b8
added model card for german-sentiment-bert ( #4435 )
2020-05-18 18:44:41 -04:00
Martin Müller
e28b7e2311
Create README.md ( #4433 )
2020-05-18 18:41:34 -04:00
sy-wada
09b933f19d
Update README.md (model_card) ( #4424 )
...
- add a citation.
- modify the table of the BLUE benchmark.
The table in the first version was not displayed correctly on https://huggingface.co/seiya/oubiobert-base-uncased.
Could you please confirm that this fix displays it correctly?
2020-05-18 18:18:17 -04:00
Manuel Romero
235777ccc9
Modify example of usage ( #4413 )
...
I followed the Google usage example for its ELECTRA small model, but I found it was not meaningful, so I created a better example.
2020-05-18 18:17:33 -04:00
Suraj Patil
9ddd3a6548
add model card for t5-base-squad ( #4409 )
...
* add model card for t5-base-squad
* Update model_cards/valhalla/t5-base-squad/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 18:17:14 -04:00
HUSEIN ZOLKEPLI
c5aa114392
Added README huseinzol05/t5-base-bahasa-cased ( #4377 )
...
* add bert bahasa readme
* update readme
* update readme
* added xlnet
* added tiny-bert and fix xlnet readme
* added albert base
* added albert tiny
* added electra model
* added gpt2 117m bahasa readme
* added gpt2 345m bahasa readme
* added t5-base-bahasa
* fix readme
* Update model_cards/huseinzol05/t5-base-bahasa-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-05-18 18:10:23 -04:00
Funtowicz Morgan
ca4a3f4da9
Adding optimizations block from ONNXRuntime. ( #4431 )
...
* Adding optimizations block from ONNXRuntime.
* Turn off external data format by default for PyTorch export.
* Correct the way use_external_format is passed through the cmdline args.
2020-05-18 20:32:33 +02:00
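A hedged sketch of enabling ONNX Runtime graph optimizations when loading an exported model; the file paths are placeholders and this shows the generic onnxruntime API rather than the exact conversion-script code.
```python
import onnxruntime as ort

options = ort.SessionOptions()
# Let ONNX Runtime apply all available graph optimizations to the exported model.
options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
options.optimized_model_filepath = "model-optimized.onnx"  # persist the optimized graph

session = ort.InferenceSession("model.onnx", options)
```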
Patrick von Platen
24538df919
[Community notebooks] General notebooks ( #4441 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
2020-05-18 20:23:57 +02:00
Sam Shleifer
a699525d25
[test_pipelines] Mark tests > 10s @slow, small speedups ( #4421 )
2020-05-18 12:23:21 -04:00
Boris Dayma
d9ece8233d
fix(run_language_modeling): use arg overwrite_cache ( #4407 )
2020-05-18 11:37:35 -04:00
Patrick von Platen
d39bf0ac2d
better naming in tf t5 ( #4401 )
2020-05-18 11:34:00 -04:00
Patrick von Platen
590adb130b
improve docstring ( #4422 )
2020-05-18 11:31:35 -04:00
Patrick von Platen
026a5d0888
[T5 fp16] Fix fp16 in T5 ( #4436 )
...
* fix fp16 in t5
* make style
* refactor invert_attention_mask fn
* fix typo
2020-05-18 17:25:58 +02:00
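A hedged sketch of the kind of change involved above: when inverting an attention mask under fp16, the additive penalty must stay within the fp16 range (e.g. -1e4 rather than -1e9). This illustrates the idea, not the exact refactored `invert_attention_mask` method.
```python
import torch

def invert_attention_mask(encoder_attention_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # 1 -> keep, 0 -> mask out; masked positions receive a large negative bias.
    inverted = 1.0 - encoder_attention_mask.to(dtype)
    # -1e9 overflows to -inf in float16, which can produce NaNs downstream;
    # use a value representable in the active dtype instead.
    penalty = -1e4 if dtype == torch.float16 else -1e9
    return inverted * penalty
```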
Soham Chatterjee
fa6113f9a0
Fixed spelling of training ( #4416 )
2020-05-18 11:23:29 -04:00
Julien Chaumond
757baee846
Fix un-prefixed f-string
...
see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693
Hat/tip @girishponkiya
2020-05-18 11:20:46 -04:00
Patrick von Platen
a27c795908
fix ( #4419 )
2020-05-18 15:51:40 +02:00