FBGEMM

f78e609883 Merge pull request #2 from XapaJIaMnu/restore_mac_support master Young Jin Kim 2020-03-04 09:24:47 -0800
b7a88185fb Restore mac support Nikolay Bogoychev 2020-02-25 18:10:59 +0000
84e66a9760 Merge pull request #1 from marian-nmt/youki/win-jit-debug-int8 Young Jin Kim 2019-12-03 11:53:14 -0800
e6e9b16742 Merge branch 'master' into youki/win-jit-debug-int8 Young Jin Kim 2019-12-03 11:52:20 -0800
501f92e531 Remove unused code Young Jin Kim 2019-10-18 17:30:00 -0700
197acffac6 Change AVX2 compile check to runtime check Young Jin Kim 2019-10-18 11:47:58 -0700
21f93c950b Linux memory fix Young Jin Kim 2019-09-26 09:17:42 -0700
13347764fc debugging linux unit tests Young Jin Kim 2019-09-26 03:30:38 -0700
abf37c64da fix linux build error Young Jin Kim 2019-09-25 13:32:13 -0700
604620b786 All functions are running well on windows Young Jin Kim 2019-09-25 11:46:49 -0700
d02815ffed Fix windows build errors Young Jin Kim 2019-09-25 09:43:01 -0700
7bd598c9e9 Merge remote-tracking branch 'upstream/master' into youki/win-jit-debug-int8 Fix for windows build errors Young Jin Kim 2019-09-25 09:40:48 -0700
08763b198e Fix jit code (AVX512) on windows Young Jin Kim 2019-09-24 16:55:03 -0700
97caeee5af JIT code working on windows (AVX512) Young Jin Kim 2019-09-24 16:38:34 -0700
518d8a1832 remove template parameter from PackedDepthWiseConvMatrix (#128) Jongsoo Park 2019-09-24 07:06:47 -0700
f0b354327a Enable AVX2 query API when compiled with AVX Young Jin Kim 2019-09-17 09:52:24 -0700
53f0c0d175 A bit more refactoring Aleks Zi 2019-09-16 11:03:32 -0700
96f2b9db2e Small refactoring of FBGEMM GenerateKernel class Aleks Zi 2019-09-16 11:03:32 -0700
57dcf55075 (fixed an error message) Frank Seide 2019-09-13 17:22:58 -0700
d53e6d709a fixed a build error for non-AVX2 builds Frank Seide 2019-09-13 17:17:20 -0700
2f1477dfee Minor changes in initialization of dilation (#126) Daya Khudia 2019-09-13 14:36:54 -0700
c8cac64995 add missing instantiation for float bias for gconv (#127) Daya Khudia 2019-09-13 13:35:17 -0700
ea787e8278 fbgemmPacked and fbgemmConv apis with float bias + tests Daya Khudia 2019-09-11 11:47:58 -0700
637288bff9 ReQuantization with FP32 bias Daya Khudia 2019-09-11 11:47:58 -0700
415035019c API changes to take unquantized bias for depthwise conv Daya Khudia 2019-09-11 11:47:58 -0700
685b1855d7 Add assert to ensure the divisor is not 0 (#25960) Jianyu Huang 2019-09-10 21:00:43 -0700
9f096ab12c CodeCache implemented with correct initialization of Static variables (#123) Aleks Zi 2019-09-05 12:11:17 -0700
f9078fdd81 Revert D16968373: Introduced CodeCache container to share the microkernels among different threads. Edward Yang 2019-09-04 15:04:56 -0700
823284ec8f Modifying PackAWithIm2Col to support dilated convolution and adding test cases Protonu Basu 2019-09-04 14:52:03 -0700
624d098f67 remove dw conv refs and use conv_ref instead (#122) Jongsoo Park 2019-09-04 12:04:31 -0700
ab2d527807 Introduced CodeCache container to share the microkernels among different threads. Aleks Zi 2019-09-04 11:27:35 -0700
21782ffd9e Modifying reference conv2d/3d, im2col2d.3d to support dilated convolutions Protonu Basu 2019-09-03 20:42:16 -0700
3ace43b21a Adding Support for dilations in the conv_param_t constructor Protonu Basu 2019-09-03 14:30:56 -0700
e55a59653b disable clang formatting in a few array definitions (#121) Jongsoo Park 2019-09-03 05:01:41 -0700
dc76fd4ce3 Adopt Contributor Covenant Paul O'Shannessy 2019-08-29 23:19:10 -0700
d4bfa96cda int8 specialization for AVX2 Quantize routine (#120) James Reed 2019-08-29 11:11:31 -0700
280fa17349 Per channel support in fbgemmConv (#119) Daya Khudia 2019-08-20 16:52:53 -0700
bb50635332 Merge branch 'upstream/master' into youki/prepack_constr Young Jin Kim 2019-08-14 16:01:38 -0700
a6d1d3eed7 Update asmjit to version that includes a bug fix (#118) Daya Khudia 2019-08-14 15:47:50 -0700
1be081503e fix error message (#117) Daya Khudia 2019-08-12 10:42:13 -0700
aceefe3e0c Update README.md with mentioning PyTorch (#116) Jianyu Huang 2019-08-12 09:19:22 -0700
7b156071d8 Integrate VNNI into FBGEMM master branch (#114) Jianyu Huang 2019-08-09 11:23:22 -0700
122135c29b Add unpack to PackedGemmMatrixFP16 (#112) Yinghai Lu 2019-08-08 16:12:20 -0700
cf34b9a26b Back out "[fbgemm] Integrate VNNI into FBGEMM master branch" Jianyu Huang 2019-08-06 11:55:17 -0700
d8b3323668 Integrate VNNI into FBGEMM master branch (#113) Jianyu Huang 2019-08-06 09:35:42 -0700
0d5d057ca9 Pass blocking param pointer into packedBufferSize() in PackBMatrix.cc Mike Tsai 2019-08-01 16:03:29 -0700
eb8fede25b Merge upstream master Young Jin Kim 2019-08-01 12:38:23 -0700
e4ed5196cb adding a constructor for PackBMatrix with pre-packed data Young Jin Kim 2019-08-01 10:10:23 -0700
f712cb2328 Update OSS build instructions for submodule build (#110) Jianyu Huang 2019-07-22 22:03:48 -0700
1419a6e114 Fix fbgemm OSS failure Jianyu Huang 2019-07-18 20:04:58 -0700
09493fd291 Support pointwise with unified convolution interface as well (#108) Daya Khudia 2019-07-18 15:58:33 -0700
28b7332a03 Fix missing blocking params in conv im2col code path. Mike Tsai 2019-07-17 17:07:02 -0700
6e903d5f7f While calling fbgemmConv with packed weights, packed weights should be compliant with convolution parameters Daya Khudia 2019-07-16 19:24:07 -0700
931b3b71f0 changes to remove warnings when building in opt mode Protonu Basu 2019-07-16 13:10:44 -0700
feca34d3d0 Add functions needed for unpacking in PackWeightsForConv (#106) Daya Khudia 2019-07-15 17:34:48 -0700
e69972dad1 unpack through unified convolution interface (#105) Daya Khudia 2019-07-15 17:34:48 -0700
1568107cd1 Assume input weights to be in transposed format for convUnified (#104) Daya Khudia 2019-07-15 17:34:48 -0700
49e8018ab2 Add cpuinfo_initialize() into the isa query functions Young Jin Kim 2019-07-10 15:44:23 -0700
f08039388a Refactoring unpack weight function (#103) Jianyu Huang 2019-07-09 20:59:38 -0700
815139b1ba Unpack data for 3x3 (and 3x3x3) depthwise convolution Daya Khudia 2019-07-05 18:02:43 -0700
64a2c73a42 Implement ::unpack() for PackWeightMatrixForGConv Jaewon Lee 2019-07-05 14:58:49 -0700
b0cf97df8e Refactor the code and avoid the duplication (#102) Jianyu Huang 2019-07-01 12:08:42 -0700
61928df38b Clean up some code for JIT code generator (#101) Jianyu Huang 2019-07-01 11:55:51 -0700
278c146b92 fix flaky test (#100) Daya Khudia 2019-06-21 18:10:52 -0700
bc33ed9474 Fix a compile error in assert Young Jin Kim 2019-06-21 10:40:34 -0700
c4269a772d Fix some compile error on windows Young Jin Kim 2019-06-21 09:05:39 -0700
5b64af1469 Per channel and groupwise quantization (#99) Daya Khudia 2019-06-20 12:13:35 -0700
25c1595ceb Merged PR 8337: Enable windows build and FP16 packed GEMM on windows Young Jin Kim 2019-06-18 00:39:01 +0000
604575ff5d Update the logic of checking valid parameters. Mike Tsai 2019-06-14 17:04:25 -0700
a838fc2a9c Fix memory allocation bug Young Jin Kim 2019-06-14 14:41:53 -0700
696a8f5a6e missed file Young Jin Kim 2019-06-14 09:23:22 -0700
b4e3a9ceb7 Improve some memroy allocation codes Young Jin Kim 2019-06-14 09:22:49 -0700
d402bed4f1 turn off forceinline due to the compile speed Young Jin Kim 2019-06-13 13:36:28 -0700
24ff10d324 Compile both on windows and linux Young Jin Kim 2019-06-12 17:49:16 -0700
5e71d2c304 Print packed matrix for each group as well Daya Khudia 2019-06-12 11:26:39 -0700
bf2f45f35c Remove duplicated header and undo some changes in D15399811 Daya Khudia 2019-06-07 10:03:21 -0700
8197494f3a Unified convolution interface Daya Khudia 2019-06-05 12:44:57 -0700
77868418c7 Add quantized::fbgemm_linear_unpack operator for serialization (#97) Jianyu Huang 2019-06-03 20:31:51 -0700
85f4105ceb Adding -02 flag to the cmake build Protonu Basu 2019-05-30 12:14:24 -0700
846785dc12 fix broken test Daya S Khudia 2019-05-23 17:52:47 -0700
d05944835b Fix kernel logging Mike Tsai 2019-05-23 15:22:46 -0700
9ae8912fc9 update readme Daya S Khudia 2019-05-16 15:49:03 -0700
d8f740de76 fixing compiler warnings for uninitialized MR, NCB, KCB Protonu Basu 2019-05-16 09:12:35 -0700
1b14303544 Fix CI indent error Daya S Khudia 2019-05-14 15:09:08 -0700
49be9f84e0 fix circleci build with submodules Daya S Khudia 2019-05-14 14:38:15 -0700
7de846addb Use submodules instead of cmake downloads Daya S Khudia 2019-05-14 13:21:21 -0700
b14f582ca6 Back out "[FBGEMM][PR] switch from cmake downloads to git submodules" Daya S Khudia 2019-05-13 11:24:34 -0700
c21a93d628 switch from cmake downloads to git submodules (#95) David Pollack 2019-05-13 10:46:36 -0700
6ec218e6ed make sure cpuinfo_initialize called before fbgemmHasAvx2/512Support (#94) Jongsoo Park 2019-04-18 17:51:07 -0700
c6e86067e4 optimize dw conv for symmetric quant (#73) Jongsoo Park 2019-04-03 07:59:57 -0700
f12ec122be Exposing tuning parameters in FBGEMM (MCB, NCB, KCB, MR, NR, Row Interleave) (#90) Protonu Basu 2019-04-02 05:22:44 -0700
d8e0d440ef reduce the number of shapes tested in GConvTest (#91) Jongsoo Park 2019-04-01 12:21:26 -0700
23aa5a35ec Packing B documentation Daya S Khudia 2019-03-25 09:53:30 -0700
f65f0ebe54 Improves small N cases back to what they were Daya S Khudia 2019-03-21 10:03:36 -0700
452627c5f2 Allocate some registers for B matrix loading and reuse loaded results Daya S Khudia 2019-03-21 10:03:36 -0700
d53c0220cf Further optimize acc16 kernel and cache blocking dimension for B matrix is now free to be autotuned (#88) Daya S Khudia 2019-03-21 10:03:36 -0700
fe1c3d9177 Further optimize acc32 kernel and cache blocking dimension for B matrix is now free to be autotuned (#89) Daya S Khudia 2019-03-21 10:03:36 -0700
0d8f88c12d Dump generated kernels in files Daya S Khudia 2019-03-18 19:24:57 -0700
1351790c8c Add the Naive bfloat16 implementation based on MKL Jianyu Huang 2019-03-17 23:35:08 -0700
6011ce3b0c optimize requantize for float out processing (#85) Jongsoo Park 2019-03-12 20:14:32 -0700