mozilla/aom - aom

Commit graph

Author	SHA1	Message	Date
James Zern	97946622c0	Revert "mips msa vp9 subpel variance optimization" This reverts commit `a42df86c03`. this change causes MSA/VP9SubpelVarianceTest.Ref and MSA/VP9SubpelVarianceTest.ExtremeRef failures under mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu Change-Id: I40b71a0b774eaeb31f66f795733f95cf360909f7	2015-07-02 12:06:51 -07:00
James Zern	ced982640b	Revert "mips msa vp9 avg subpel variance optimization" This reverts commit `61774ad1c4`. this change causes MSA/VP9SubpelAvgVarianceTest.Ref failures under mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu Change-Id: I7fb520c12b2a3b212d5e84b7619a380a48e49bb0	2015-07-02 12:06:29 -07:00
Parag Salasakar	61774ad1c4	mips msa vp9 avg subpel variance optimization average improvement ~3x-5x Change-Id: Iefbcafc05daab77b38a4e63b551e427867a501a4	2015-07-01 13:46:41 +05:30
Parag Salasakar	a42df86c03	mips msa vp9 subpel variance optimization average improvement ~3x-5x Change-Id: I4cbba2711467b0e205904769ebbb4a1fcbb1a311	2015-07-01 07:51:34 +05:30
Parag Salasakar	2d730a289a	mips msa vpx_dsp variance optimization average improvement ~2x-4x Change-Id: Ia3eef3f390148c2eb5cdc580a94cb26369737f82	2015-06-30 12:22:18 +05:30
James Zern	e0e4045db8	variance_test: fix build w/--disable-vp8-encoder s/CONFIG_VP8\b/CONFIG_VP8_ENCODER/ Change-Id: I616aace9cf8f18d7e83f00f7aef3b8a26fc4c17b	2015-06-11 23:15:30 -07:00
James Zern	47fe535422	disable vp8_sub_pixel_variance8x8_neon fails unit tests: [ FAILED ] NEON/VP8SubpelVarianceTest.ExtremeRef/0, where GetParam() = (3, 3, 0x14e36d, 0) [ FAILED ] NEON/VP8SubpelVarianceTest.Ref/0, where GetParam() = (3, 3, 0x14e36d, 0) the tests were recently enabled in: `eb88b17` Make vp9 subpixel match vp8 the functions likely haven't changed since being converted from assembly Change-Id: I6141717b111b8f735f436c160d74270af53ef722	2015-06-05 20:18:51 -07:00
Johann	eb88b172fe	Make vp9 subpixel match vp8 The only difference between the two was that the vp9 function allowed for every step in the bilinear filter (16 steps) while vp8 only allowed for half of those. Since all the call sites in vp9 (<< 1) the input, it only ever used the same steps as vp8. This will allow moving the subpel variance to vpx_dsp with the rest of the variance functions. Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75	2015-06-03 22:10:51 -07:00
Johann	d90536c1a2	Unify reference variance functions Use uint32_t for all output and make all functions static Change-Id: I2c9c6f6310732dc53444607d1c1a268ac1ab83ba	2015-06-02 15:14:55 -07:00
Johann	fdc549994a	Cast variance reference output The larger internal variables are required for the intermediates but RoundHighBitDepth brings them down to uint32_t/unsigned int. Fixes type warnings in visual studio. Change-Id: I48d35284d6cbde330ccdc1f46b6215a645d5eb00	2015-06-01 10:56:52 -07:00
Johann	a927aec5f8	Merge "Use correct parameters for NEON variance tests"	2015-05-28 19:53:50 +00:00
Johann	efc2e9844e	Use correct parameters for NEON variance tests Change-Id: Ib2949d0a3e9273e7952bbf91956357c1138093f1	2015-05-28 11:28:06 -07:00
Johann	c855ed72a6	Remove conversion warnings from hbd shifts ROUND_POWER_OF_TWO has some poor side effects when used with [u]int64_t such as doing the shifting in 32bits. Change-Id: Ic85a19765cd316fb43657cb21c86f35ceb772773	2015-05-27 17:54:22 -07:00
Johann	c5a7c89e89	Correct case in Get4x4SSEFunc Change-Id: Ie8a7508798fa8e65c579a77cedb8305cee4ddc81	2015-05-27 11:38:43 -07:00
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00
Johann	1d7ccd5325	Relocate memory operations for common code With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca	2015-05-13 11:41:15 -07:00
Frank Galligan	ec1d8387e1	Add 64x64 sub_pel_variance Neon function On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334	2015-01-14 08:36:24 -08:00
Frank Galligan	74d40cd507	Add 64x variance Neon functions Add optimized Neon functions of: vp9_variance32x64 vp9_variance64x32 vp9_variance64x64 On Nexus 7 speed -5 and -6 saw about a 4% increase in perf. Speeds -7 and -8 saw about a 6% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa	2015-01-13 15:08:13 -08:00
Peter de Rivaz	48032bfcdb	Added sse2 acceleration for highbitdepth variance Change-Id: I446bdf3a405e4e9d2aa633d6281d66ea0cdfd79f (cherry picked from commit d7422b2b1eb9f0011a8c379c2be680d6892b16bc) (cherry picked from commit 6d741e4d76a7d9ece69ca117d1d9e2f9ee48ef8c)	2014-11-14 15:18:53 -08:00
Scott LaVarnway	fe2cc873dc	VP8 encoder for ARMv8 by using NEON intrinsics 1 Add vp8_mse16x16_neon.c - vp8_mse16x16_neon - vp8_get4x4sse_cs_neon Change-Id: I108952f60a9ae50613f0ce3903c2c81df19d99d0 Signed-off-by: James Yu <james.yu@linaro.org>	2014-09-15 12:04:09 -07:00
Dmitry Kovalev	1f19ebbab6	Replacing vp9_get_mb_ss_sse2 asm implementation with intrinsics. Change-Id: Ib4f5dd733eb2939b108070a01e83da5d9990bac0	2014-09-06 00:10:25 -07:00
Dmitry Kovalev	202edb3d23	Actually resetting random generator for all variance test cases. Calling Reset(int) method instead of overloaded operator()(int). Adding underscore at the end of class member name. Change-Id: I01934e7bc056d4b594e5d05d693328febd34ac3c	2014-09-04 12:24:52 -07:00
Dmitry Kovalev	12cd6f421d	Removing variance MMX code. Removed functions: * vp9_mse16x16_mmx * vp9_get_mb_ss_mmx * vp9_get4x4var_mmx * vp9_get8x8var_mmx * vp9_variance4x4_mmx * vp9_variance8x8_mmx * vp9_variance16x16_mmx * vp9_variance16x8_mmx * vp9_variance8x16_mmx They all have SSE2 equivalent. Change-Id: I3796f2477c4f59b35b4828f46a300c16e62a2615	2014-08-29 10:26:42 -07:00
levytamar82	69a5f5ecf7	Fix bug 807 in the sub_pixel_variance function the dst is aligned to 16 bytes and not to 32 bytes - now load unaligned data Change-Id: I2e0b9745543697efc56fefa32857ea10117af135	2014-08-07 18:51:02 -07:00
Scott LaVarnway	98165ec074	Neon version of vp9_sub_pixel_variance8x8(), vp9_variance8x8(), and vp9_get8x8var(). On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~1.2%. Change-Id: I8a66ac2a0f550b407caa27816833bdc563395102	2014-08-01 11:35:55 -07:00
Scott LaVarnway	d39448e2d4	Neon version of vp9_sub_pixel_variance32x32(), vp9_variance32x32(), and vp9_get32x32var(). Change-Id: I8137e2540e50984744da59ae3a41e94f8af4a548	2014-07-31 08:00:36 -07:00
Scott LaVarnway	521cf7e879	Neon version of vp9_sub_pixel_variance16x16(), vp9_variance16x16(), and vp9_get16x16var(). On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~16.7%. Change-Id: Ib163aa99f56e680194aabe00dacdd7f0899a4ecb	2014-07-30 08:17:32 -07:00
Yunqing Wang	5c93e62e0a	Allocate aligned source in variance test The source buffer is an aligned buffer in VP9. Added the alignment to make it consistent with libvpx. Change-Id: I3ebb9d2e8555ed532951da479dd5cbbb8812e02d	2014-07-24 17:11:58 -07:00
James Zern	29e1b1a4b0	tests: add API_REGISTER_STATE_CHECK used to wrap API functions to ensure full environment consistency as opposed to the renamed ASM_REGISTER_STATE_CHECK which is used with assembly functions. currently checks the FPU tag word in x86/x86_64 gcc builds to ensure emms has been called. Change-Id: Ie241772dbf903d33d516a1add4c8c6783f2e1490	2014-07-10 12:40:31 -07:00
James Zern	520cb3f39f	vp9_sub_pixel_variance: disable avx2 variants tests failing under Win32/Win64 + variance_test: add missing avx2 functions (partially disabled) Change-Id: I6abc0657ea076379ab9ca65c12678b9ea199849d	2014-06-10 16:11:15 -07:00
James Zern	6e5e75fa21	Revert "Removing redundant variables from variance_test.cc." This reverts commit `4725ab7e51`. The constants are necessary to avoid breakage in vs9 builds: warning C4180: qualifier applied to function type has no meaning; ignored error C2436: 'f2_' : member function or nested class in constructor initializer list while compiling class template member function 'std::tr1::tuple<T0,T1,T2,T3,T4,T5,T6,T7,T8,T9>::tuple(const int &,const int &,unsigned int (__cdecl &))' ..\test\variance_test.cc : see reference to class template instantiation 'std::tr1::tuple<T0,T1,T2,T3,T4,T5,T6,T7,T8,T9>' being compiled Change-Id: Ia218b74fc473d40f02fee84cb7009adfbe82e5a7	2014-05-08 14:35:40 -07:00
Dmitry Kovalev	4725ab7e51	Removing redundant variables from variance_test.cc. Change-Id: Icd44bce1c9d292f6e6f4d5157b694f6170b7b289	2014-05-07 14:40:21 -07:00
James Zern	d5e07a8451	variance_test: add NEON functions note not all functions have NEON implementations: - variance4x4_neon Change-Id: I03c1ba21f3b02aa2482d7ca8feedc3ef74b5947f	2014-02-26 19:25:02 -08:00
James Zern	002ad40897	test/: remove unnecessary extern "C"s Change-Id: I826655a708010149de231ca31a2e3ba4f1842c0c	2014-01-23 19:42:59 -08:00
James Zern	a0fcbcfa5f	fix vp8-only build Change-Id: Id9ce44f3364dd57b30ea491d956a2a0d6186be05	2013-09-17 18:47:25 -07:00
Yaowu Xu	afffa3d9b0	cleanup cpplint warnings Suggested by James Zern to clear out cpplint warnings for all unit test code. Change-Id: I731a3fa4d2a257eb9ef733426ba84286fbd7ea34	2013-09-06 10:13:49 -07:00
Jim Bankoski	5b307886fb	variance x86inc guards also fixed bug in sad calcs Change-Id: I6571fcbe37556c16ae32be66dc0fd879852aac1d	2013-08-06 14:17:13 -07:00
James Zern	e247ab09a6	variance_test: add missing ClearSystemState... ...to recently added SubpelVarianceTest Change-Id: I8775e39fd5dbfba81ad42b79b47bf6dd6ca8cc0e	2013-06-26 18:32:21 -07:00
Ronald S. Bultje	ac6ea2ab91	Allocate memory using appropriate expected alignment in unit tests. Fixes crashes of test_libvpx on 32-bit Linux. Change-Id: If94e7628a86b788ca26c004861dee2f162e47ed6	2013-06-21 17:03:57 -07:00
James Zern	cc774c8bb0	variance_test: use REGISTER_STATE_CHECK Change-Id: Id54ad9a781634f075e990d5bade5be8490959975	2013-06-21 14:30:08 -07:00
Ronald S. Bultje	1e6a32f1af	SSE2/SSSE3 optimizations and unit test for sub_pixel_avg_variance(). Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to 3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions which use a bilinear filter (x_offset & 7 \|\| y_offset & 7) aren't perfectly interleaved, and can probably be improved further in the future. I've marked this with a few TODOs/FIXMEs in the code. Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9	2013-06-20 15:59:48 -07:00
Ronald S. Bultje	8fb6c58191	Implement sse2 and ssse3 versions for all sub_pixel_variance sizes. Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 -> 3min58). Specific changes to timings for each function compared to original assembly-optimized versions (or just new version timings if no previous assembly-optimized version was available): sse2 4x4: 99 -> 82 cycles sse2 4x8: 128 cycles sse2 8x4: 121 cycles sse2 8x8: 149 -> 129 cycles sse2 8x16: 235 -> 245 cycles (?) sse2 16x8: 269 -> 203 cycles sse2 16x16: 441 -> 349 cycles sse2 16x32: 641 cycles sse2 32x16: 643 cycles sse2 32x32: 1733 -> 1154 cycles sse2 32x64: 2247 cycles sse2 64x32: 2323 cycles sse2 64x64: 6984 -> 4442 cycles ssse3 4x4: 100 cycles (?) ssse3 4x8: 103 cycles ssse3 8x4: 71 cycles ssse3 8x8: 147 cycles ssse3 8x16: 158 cycles ssse3 16x8: 188 -> 162 cycles ssse3 16x16: 316 -> 273 cycles ssse3 16x32: 535 cycles ssse3 32x16: 564 cycles ssse3 32x32: 973 cycles ssse3 32x64: 1930 cycles ssse3 64x32: 1922 cycles ssse3 64x64: 3760 cycles Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d	2013-06-20 09:34:25 -07:00
James Zern	5b756748fd	tests: clear system state after non-API calls add ClearSystemState() to reset MMX registers avoiding corrupting subsequent tests. Change-Id: I668deb09aa7aa467709776e5819f936910698bc0	2013-06-18 11:32:27 -07:00
Yunqing Wang	f4fcfe3075	Optimize variance functions Added SSE2 version of variance functions for super blocks. Change-Id: Ibeaae8771ca21c99d41dd74067574a51e97b412d	2013-05-22 10:29:38 -07:00
James Zern	1711cf2dbb	add vp8 variance test Change-Id: I4e94ee2c4e2360d6a11a454c323f2899c1bb6f72	2013-02-22 16:25:14 -08:00
John Koleszar	fcccbcbb39	Add vp9_ prefix to all vp9 files Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc	2012-11-27 14:12:30 -08:00
John Koleszar	a9c7597adc	support building vp8 and vp9 into a single lib Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d	2012-11-15 10:46:17 -08:00
James Zern	984734436d	Fix variance (signed integer) overflow In the variance calculations the difference is summed and later squared. When the sum exceeds sqrt(2^31) the value is treated as a negative when it is shifted which gives incorrect results. To fix this we force the multiplication to be unsigned. The alternative fix is to shift sum down by 4 before multiplying. However that will reduce precision. For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and change). This change is based on: `1698234` Missed some variance casts `fea3556` Fix variance overflow Change-Id: I2c61856cca9db54b9b81de83b4505ea81a050a0f	2012-11-06 23:06:44 -08:00
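
The arithmetic in commit `984734436d` can be checked with a small sketch. This is not the libvpx code: it is a Python model that assumes the variance is computed as `sse - ((sum * sum) >> shift)`, and it emulates 32-bit two's-complement wraparound by hand (in C, signed overflow is undefined behavior, so the "signed" path below only models what typical hardware produces). The function names are hypothetical.

```python
def wrap_int32(x):
    """Reinterpret an integer as a 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def variance_signed(sse, total, shift):
    # Models the overflowing path: the product wraps to a negative
    # 32-bit value, and the arithmetic shift keeps it negative.
    return sse - (wrap_int32(total * total) >> shift)


def variance_unsigned(sse, total, shift):
    # Models the fix: the multiplication is done as unsigned 32-bit.
    return sse - (((total * total) & 0xFFFFFFFF) >> shift)


# Worst case for a 16x16 block: every pixel difference is 255.
pixels = 16 * 16
worst_sum = 255 * pixels        # 65280, above sqrt(2^31) ~ 46340
worst_sse = 255 * 255 * pixels  # sum of squared differences
shift = 8                       # log2(16 * 16)

print(variance_signed(worst_sse, worst_sum, shift))
print(variance_unsigned(worst_sse, worst_sum, shift))
```

A block of identical differences has zero variance, which the unsigned path returns; the modeled signed path yields a large positive value instead, because 65280 squared exceeds 2^31 but still fits in 32 unsigned bits.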

48 commits