FBGEMM/include/fbgemm
James Reed d4bfa96cda int8 specialization for AVX2 Quantize routine (#120)
Summary:
This adds a specialization for `int8` to the AVX2 `Quantize` routine.
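For context, a minimal sketch of what an AVX2 `int8` quantize step can look like (the function name, signature, and tail handling here are assumptions for illustration; the actual routine lives in `src/QuantUtilsAvx2.cc`):

```cpp
#include <immintrin.h>
#include <cstdint>

// Hypothetical sketch: quantize 8 floats to int8 with AVX2, i.e.
// q = round(x * inverse_scale) + zero_point, saturated to [-128, 127].
// The real FBGEMM kernel handles arbitrary lengths, tails, and
// precision-dependent clamping; this shows only the core 8-wide step.
inline void QuantizeAvx2Int8Sketch(
    const float* src,
    std::int8_t* dst,
    float inverse_scale,
    std::int32_t zero_point) {
  const __m256 inv_scale_v = _mm256_set1_ps(inverse_scale);
  const __m256i zp_v = _mm256_set1_epi32(zero_point);

  const __m256 x = _mm256_loadu_ps(src);
  // cvtps rounds per the current rounding mode (round-to-nearest-even
  // by default, matching std::nearbyint in its default mode).
  __m256i q = _mm256_cvtps_epi32(_mm256_mul_ps(x, inv_scale_v));
  q = _mm256_add_epi32(q, zp_v);

  // Saturating narrows: int32 -> int16 -> int8.
  const __m256i q16 = _mm256_packs_epi32(q, q);
  const __m256i q8 = _mm256_packs_epi16(q16, q16);

  // packs works within 128-bit lanes, so the 8 result bytes sit in
  // dwords 0 and 4; gather them and store the low 8 bytes.
  const __m256i perm = _mm256_setr_epi32(0, 4, 0, 0, 0, 0, 0, 0);
  const __m256i packed = _mm256_permutevar8x32_epi32(q8, perm);
  _mm_storel_epi64(
      reinterpret_cast<__m128i*>(dst), _mm256_castsi256_si128(packed));
}
```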

I also tried adding a specialization for `int32` (the final datatype we support in PyTorch quantization), but it seemed to introduce numerical issues stemming from the difference between these two implementations:

https://github.com/pytorch/FBGEMM/blob/master/include/fbgemm/QuantUtils.h#L63

vs

https://github.com/pytorch/FBGEMM/blob/master/src/QuantUtilsAvx2.cc#L82
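
As a hedged illustration of how such a divergence can arise (the formulas below are assumptions modeled on those two files, not copied from them): two mathematically equivalent quantize expressions can disagree in float arithmetic, for example when the zero point is added before versus after rounding.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

int main() {
  const float src = 0.15f;
  const float scale = 0.1f;
  const std::int32_t zero_point = 1;

  // Reference-style: divide by scale, add zero_point in float, then round.
  const auto a = static_cast<std::int32_t>(
      std::nearbyint(zero_point + src / scale));

  // Vectorized-style: multiply by the reciprocal, round, then add
  // zero_point as an integer.
  const float inverse_scale = 1.0f / scale;
  const auto b = static_cast<std::int32_t>(
      std::nearbyint(src * inverse_scale)) + zero_point;

  // Here src / scale and src * inverse_scale both round to 1.5f, but
  // nearbyint(2.5f) = 2 (ties to even) while nearbyint(1.5f) + 1 = 3.
  std::printf("a=%d b=%d\n", a, b); // prints a=2 b=3
}
```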
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/120

Reviewed By: driazati

Differential Revision: D17115198

Pulled By: jamesr66a

fbshipit-source-id: 119145bb99235a7545389afa61483060200cc2b7
2019-08-29 11:26:05 -07:00
ConvUtils.h Unified convolution interface 2019-06-05 12:50:08 -07:00
Fbgemm.h Per channel support in fbgemmConv (#119) 2019-08-20 16:58:08 -07:00
FbgemmBuild.h Only export symbols that are required while building shared library 2018-11-30 11:31:33 -08:00
FbgemmFP16.h Add unpack to PackedGemmMatrixFP16 (#112) 2019-08-08 16:17:04 -07:00
FbgemmI8DepthwiseAvx2.h Unpack data for 3x3 (and 3x3x3) depthwise convolution 2019-07-05 18:05:46 -07:00
FbgemmI8Spmdm.h Update with clang format (#51) 2018-12-21 11:21:05 -08:00
OutputProcessing-inl.h optimize requantize for float out processing (#85) 2019-03-12 20:17:49 -07:00
PackingTraits-inl.h Integrate VNNI into FBGEMM master branch (#114) 2019-08-09 11:33:13 -07:00
QuantUtils.h Per channel and groupwise quantization (#99) 2019-06-20 12:21:51 -07:00
QuantUtilsAvx2.h int8 specialization for AVX2 Quantize routine (#120) 2019-08-29 11:26:05 -07:00
Types.h clang-format (#11) 2018-11-18 20:19:53 -08:00
Utils.h fix error message (#117) 2019-08-12 10:50:35 -07:00
UtilsAvx2.h optimize requantize for float out processing (#85) 2019-03-12 20:17:49 -07:00