huggingface-transformers

Commit graph

Author	SHA1	Message	Date
Patrick von Platen	306f1a2695	Add Reformer MLM notebook (#5450 ) * Add Reformer MLM notebook * Update notebooks/README.md	2020-07-02 00:20:49 +02:00
Patrick von Platen	4bcc35cd69	[Docs] Benchmark docs (#5360 ) * first doc version * add benchmark docs * fix typos * improve README * Update docs/source/benchmarks.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix naming and docs Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-06-29 16:08:57 +02:00
Thomas Wolf	601d4d699c	[tokenizers] Updates data processors, docstring, examples and model cards to the new API (#5308 ) * remove references to old API in docstring - update data processors * style * fix tests - better type checking error messages * better type checking * include awesome fix by @LysandreJik for #5310 * updated doc and examples	2020-06-26 19:48:14 +02:00
Patrick von Platen	834b6884c5	Add benchmark notebook (#5312 ) * add notebook * Créé avec Colaboratory * move notebook to correct folder * correct link * correct filename * correct filename * better name	2020-06-26 17:38:13 +02:00
Sylvain Gugger	7c41057d50	Add hugs (#5225 )	2020-06-24 07:56:14 -04:00
Michaël Benesty	0cca61925c	Add link to new comunity notebook (optimization) (#5195 ) * Add link to new comunity notebook (optimization) related to https://github.com/huggingface/transformers/issues/4842#event-3469184635 This notebook is about benchmarking model training with/without dynamic padding optimization. https://github.com/ELS-RD/transformers-notebook Using dynamic padding on MNLI provides a 4.7 times training time reduction, with max pad length set to 512. The effect is strong because few examples are >> 400 tokens in this dataset. IRL, it will depend of the dataset, but it always bring improvement and, after more than 20 experiments listed in this [article](https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9q-21bf7129db9e?source=friends_link&sk=10a45a0ace94b3255643d81b6475f409), it seems to not hurt performance. Following advice from @patrickvonplaten I do the PR myself :-) * Update notebooks/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-06-22 23:47:33 +02:00
Pri Oberoi	a258982af3	Add missing arg in 02-transformers notebook (#5085 ) * Add missing arg when creating model * Fix typos * Remove from_tf flag when creating model	2020-06-18 19:04:04 -04:00
Abhishek Kumar Mishra	3e5928c57d	Adding notebooks for Fine Tuning [Community Notebook] (#4732 ) * Added links to more community notebooks Added links to 3 more community notebooks from the git repo: https://github.com/abhimishra91/transformers-tutorials Different Transformers models are fine tuned on Dataset using PyTorch * Update README.md * Update README.md * Update README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-06-03 11:07:26 +02:00
Lorenzo Ampil	d3ef14f931	Add community notebook for sentiment span extraction (#4700 )	2020-06-02 09:59:53 +02:00
Patrick von Platen	6f82aea66b	Include `nlp` notebook for model evaluation (#4676 )	2020-05-29 19:38:56 +02:00
Iz Beltagy	91487cbb8e	[Longformer] fix model name in examples (#4653 ) * fix longformer model names in examples * a better name for the notebook	2020-05-29 13:12:35 +02:00
Iz Beltagy	fe5cb1a1c8	Adding community notebook (#4642 ) Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-05-28 22:35:15 +02:00
Suraj Patil	aecaaf73a4	[Community notebooks] add longformer-for-qa notebook (#4652 )	2020-05-28 22:27:22 +02:00
Lavanya Shukla	3cc2c2a150	add 2 colab notebooks (#4505 ) Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-05-28 11:18:16 +02:00
ohmeow	5ddd8d6531	Add BART fine-tuning summarization community notebook (#4539 ) * adding BART summarization how-to community notebook * Update notebooks/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-05-26 16:43:41 +02:00
Patrick von Platen	0f6969b7e9	Better github link for Reformer Colab Notebook	2020-05-22 23:51:36 +02:00
Patrick von Platen	12e6afe900	Add Reformer colab to community noteboos	2020-05-22 17:03:34 +02:00
Nathan Cooper	cacb654c7f	Add Fine-tune DialoGPT on new datasets notebook (#4473 )	2020-05-20 16:17:52 -04:00
Suraj Patil	5856999a9f	add T5 fine-tuning notebook [Community notebooks] (#4462 ) * add T5 fine-tuning notebook [Community notebooks] * Update README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2020-05-19 18:26:28 +02:00
Funtowicz Morgan	ca4a3f4da9	Adding optimizations block from ONNXRuntime. (#4431 ) * Adding optimizations block from ONNXRuntime. * Turn off external data format by default for PyTorch export. * Correct the way use_external_format is passed through the cmdline args.	2020-05-18 20:32:33 +02:00
Patrick von Platen	24538df919	[Community notebooks] General notebooks (#4441 ) * Update README.md * Update README.md * Update README.md * Update README.md	2020-05-18 20:23:57 +02:00
Nikita	62427d0815	rerun notebook 02-transformers (#4341 )	2020-05-15 10:33:08 -04:00
Morgan Funtowicz	84894974bd	Updated ONNX notebook link in README.	2020-05-14 22:40:59 +02:00
Funtowicz Morgan	db0076a9df	Conversion script to export transformers models to ONNX IR. (#4253 ) * Added generic ONNX conversion script for PyTorch model. * WIP initial TF support. * TensorFlow/Keras ONNX export working. * Print framework version info * Add possibility to check the model is correctly loading on ONNX runtime. * Remove quantization option. * Specify ONNX opset version when exporting. * Formatting. * Remove unused imports. * Make functions more generally reusable from other part of the code. * isort happy. * flake happy * Export only feature-extraction for now * Correctly check inputs order / filter before export. * Removed task variable * Fix invalid args call in load_graph_from_args. * Fix invalid args call in convert. * Fix invalid args call in infer_shapes. * Raise exception and catch in caller function instead of exit. * Add 04-onnx-export.ipynb notebook * More WIP on the notebook * Remove unused imports * Simplify & remove unused constants. * Export with constant_folding in PyTorch * Let's try to put function args in the right order this time ... * Disable external_data_format temporary * ONNX notebook draft ready. * Updated notebooks charts + wording * Correct error while exporting last chart in notebook. * Adressing @LysandreJik comment. * Set ONNX opset to 11 as default value. * Set opset param mandatory * Added ONNX export unittests * Quality. * flake8 happy * Add keras2onnx dependency on extras["tf"] * Pin keras2onnx on github master to v1.6.5 * Second attempt. * Third attempt. * Use the right repo URL this time ... * Do the same for onnxconverter-common * Added keras2onnx and onnxconveter-common to 1.7.0 to supports TF2.2 * Correct commit hash. * Addressing PR review: Optimization are enabled by default. * Addressing PR review: small changes in the notebook * setup.py comment about keras2onnx versioning.	2020-05-14 16:35:52 -04:00
Patrick von Platen	839bfaedb2	[Docs, Notebook] Include generation pipeline (#4295 ) * add first text for generation * add generation pipeline to usage * Created using Colaboratory * correct docstring * finish	2020-05-13 14:24:08 -04:00
Stefan Schweter	b5c6d3d4c7	notebooks: minor fix for community provided models example (#4025 )	2020-04-28 09:12:25 +02:00
Jonathan Sum	0cec4fab7d	typo: fine-grained token-leven Changing from "fine-grained token-leven" to "fine-grained token-level"	2020-04-16 15:11:23 -04:00
Anthony MOI	b7cf9f43d2	Update tokenizers to 0.7.0-rc5 (#3705 )	2020-04-10 14:23:49 -04:00
Lysandre Debut	261c4ff4e2	Update notebooks (#3620 ) * Update notebooks * From local to global link * from local links to actual global links	2020-04-06 14:32:39 -04:00
Patrick von Platen	00ea100e96	add summarization and translation to notebook (#3478 )	2020-03-27 11:05:37 -04:00
Kyeongpil Kang	8eeefcb576	Update 01-training-tokenizers.ipynb (typo issue) (#3343 ) I found there are two grammar errors or typo issues in the explanation of the encoding properties. The original sentences: If your was made of multiple \"parts\" such as (question, context), then this would be a vector with for each token the segment it belongs to If your has been truncated into multiple subparts because of a length limit (for BERT for example the sequence length is limited to 512), this will contain all the remaining overflowing parts. I think "input" should be inserted after the phrase "If your".	2020-03-19 23:21:49 +01:00
Kyeongpil Kang	3bedfd3347	Fix wrong link for the notebook file (#3344 ) For the tutorial of "How to generate text", the URL link was wrong (it was linked to the tutorial of "How to train a language model"). I fixed the URL.	2020-03-19 17:22:47 +01:00
Morgan Funtowicz	cae334c43c	Improve fill-mask pipeline example in 03-pipelines notebook. Remove hardcoded mask_token and use the value provided by the tokenizer.	2020-03-18 17:11:42 +01:00
Patrick von Platen	efdb46b6e2	add link to blog post (#3326 )	2020-03-18 13:24:28 +01:00
Param bhavsar	b29fed790b	Updated `Tokenw ise` in print statement to `Token wise`	2020-03-08 10:55:30 -04:00
Morgan Funtowicz	7ac47bfe69	Updated notebook dependencies for Colab. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-03-05 16:07:51 +01:00
Morgan Funtowicz	be02176a4b	Fixing sentiment pipeline in 03-pipelines notebook. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-03-05 16:07:51 +01:00
Morgan Funtowicz	012cbdb0f5	Updating colab links in notebooks README. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-03-05 15:34:15 +01:00
Morgan Funtowicz	30624f7056	Fix Colab links + install dependencies first. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-03-05 11:40:15 +01:00
Morgan Funtowicz	1bca97ec7f	Update notebook link and fix few working issues. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>	2020-03-04 21:19:33 +01:00
Julien Chaumond	256cbbc4a2	[doc] Fix link to how-to-train Colab	2020-03-04 12:01:45 -05:00
Funtowicz Morgan	71c8711970	Adding Docker images for transformers + notebooks (#3051 ) * Added transformers-pytorch-cpu and gpu Docker images Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added automatic jupyter launch for Docker image. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Move image from alpine to Ubuntu to align with NVidia container images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added TRANSFORMERS_VERSION argument to Dockerfile. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added Pytorch-GPU based Docker image Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added Tensorflow images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Use python 3.7 as Tensorflow doesnt provide 3.8 compatible wheel. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Remove double FROM instructions on transformers-pytorch-cpu image. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added transformers-tensorflow-gpu Docker image. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * use the correct ubuntu version for tensorflow-gpu Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added pipelines example notebook Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added transformers-cpu and transformers-gpu (including both PyTorch and TensorFlow) images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Docker images doesnt start jupyter notebook by default. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Tokenizers notebook Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Update images links Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Update Docker images to python 3.7.6 and transformers 2.5.1 Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added 02-transformers notebook. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Trying to realign 02-transformers notebook ? 
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added Transformer image schema * Some tweaks on tokenizers notebook * Removed old notebooks. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Attempt to provide table of content for each notebooks Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Second attempt. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Reintroduce transformer image. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Keep trying Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * It's going to fly ! Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Remaining of the Table of Content Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix inlined elements for the table of content Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Removed anaconda dependencies for Docker images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Removing notebooks ToC Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added LABEL to each docker image. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Removed old Dockerfile Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Directly use the context and include transformers from here. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Reduce overall size of compiled Docker images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Install jupyter by default and use CMD for easier launching of the images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Reduce number of layers in the images. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added README.md for notebooks. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix notebooks link in README Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Fix some wording issues. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Added blog notebooks too. 
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> * Addressing spelling errors in review comments. Signed-off-by: Morgan Funtowicz <morgan@huggingface.co> Co-authored-by: MOI Anthony <xn1t0x@gmail.com>	2020-03-04 11:45:57 -05:00
alberduris	81d6841b4b	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
alberduris	dd4df80f0b	Moved the encoded_prompts to correct device	2020-01-06 15:11:12 +01:00
thomwolf	f31154cb9d	Merge branch 'xlnet'	2019-07-16 11:51:13 +02:00
thomwolf	0bab55d5d5	[BIG] name change	2019-07-05 11:55:36 +02:00
thomwolf	c41f2bad69	WIP XLM + refactoring	2019-07-03 22:54:39 +02:00
thomwolf	62d78aa37e	updating GLUE utils for compatibility with XLNet	2019-06-24 14:36:11 +02:00
chrislarson1	a8e071c690	added notebook to check correctness of the pytorch->tensorflow conversion	2019-06-19 23:08:08 -04:00
thomwolf	32167cdf4b	remove convert_to_unicode and printable_text from examples	2018-11-26 23:33:22 +01:00


57 commits