* Including El Attn optimization
* minor changes
* Update benchmark_fs.sh
* Update benchmark_fs.sh
* minor changes
* minor changes
* env variable for testing fairseq optimizer
* CI tests to gpu3
* Trigger Build
* Optimize GPT2
* Add benchmarking scripts for GPT2
* Replace the line break in the generation hypo and update the benchmarking data
* Update README and benchmark script
* Disable the transformers test_gpt2_model_att_mask_past test. The current cache behavior is not compatible with that unit test because the cache key and value will be updated if past is None when the model is called. This unit test would pass if the order of the second and third model calls were switched.
* Add readme file for gpt2
* Minor updates
* Use bigger data for prophetnet. Reformat benchmarks.
* Use real hf baseline.
* Fix Transformers rouge metric scale.
* Support ngram in hf.
* Fix BLEU score thresholds.
* Update install_requires and enable fairseq to work with torch 1.6 & 1.7
* Better error message and address some warnings in torch1.7
* Raise an error if fairseq/transformers are installed but the optimizations cannot be applied
* Move transformers/fairseq to extra_require
* Remove the outdated build files for the ngram cuda op
* Run fastseq units before transformers and fairseq
* Generate an XML log file for each unit test
* run all fastseq unit tests
* Add Nikhil's changes on pipeline to publish XML
* Just use a small unit test to test pipeline
* Change the xml folder path
* Add more tests
* Add env var for xml log dir and test the failures
* Enable all fastseq unit tests
* Enable all tests
* Generate xml files for fairseq and transformers unit tests
* Fix an issue in pytest command
* Trigger the CI pipeline
* Cuda op for ngram repeat blocking
* clean up
* Unit test for cuda op
* unit test updated, minor updates in cpp/cu code
* Rebased on the new codebase, updated all benchmarks
* Update README.md
* Update README.md
* Update README.md
* minor change in kernel
* changing install order
* Refactoring to target only one version of transformers + fairseq in the main branch
* Address the comments
* Add the error handling for applying optimizations
* Use the same version var for code and setup
* Refactor the class-replace-related parts
* Change log error to warning
* Skip the registration of the ProphetNet model if fairseq cannot be imported.
* Guarding user env with test wrappers
* minor changes
* cleaning up tests
* minor change
* minor change
* Ensuring zero errors, changing dir for transformers
* Using Pytest instead of Unittest
* Remove reformer test
Co-authored-by: Jiusheng Chen <chenjiusheng@outlook.com>
* Added parallelization in hypo collection code; approx 2x inference speedup
* Guarding user env with bash wrapper
* Revert "Guarding user env with bash wrapper"
This reverts commit d020cc56c6.
* Removed small tensor copies from GPU to CPU; added linting corrections.
* small changes after running transformers Unit tests
* creating gpu tensor
* reverting all formatting changes due to YAPF
* linting checks
* linting checks