Commit Graph

7801 Commits

Author SHA1 Message Date
Marco Paniconi 21a0c1f38f Merge "Don't use gf_update by default for 1-pass CBR." 2014-01-10 11:43:20 -08:00
hkuang b5af9d2905 Merge "Fix Issue #679: vp9 C loop filter produces valgrind warning." 2014-01-10 10:59:00 -08:00
Marco Paniconi c46538d45e Don't use gf_update by default for 1-pass CBR.
Change-Id: I5df6abceb0a2a69706feadeb820b593cae88f573
2014-01-10 10:40:12 -08:00
Dmitry Kovalev ed364b2114 Merge "Adding {get, set}_rate_correction_factor() functions." 2014-01-10 10:30:04 -08:00
hkuang 66c6f7bf61 Fix Issue #679: vp9 C loop filter produces valgrind warning.
Fix the valgrind error caused by accessing uninitialized
memory in the loop filter.

Change-Id: I52fccf5ede845ee1f4c13d3bd909b8f220c0bdff
2014-01-10 10:24:21 -08:00
Marco Paniconi a260369aa8 Merge "Keep buffer clipped to maximum in change_config." 2014-01-10 09:33:33 -08:00
Paul Wilkins b645257121 Revert "SSSE3 convolution optimization"
This reverts commit 511d218c60.

In their current form the intrinsics break the borg build.

Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9
2014-01-10 13:38:26 +00:00
Jingning Han a4c94a94cc Merge "Optimze inv 16x16 DCT with 10 non-zero coeffs - P2" 2014-01-09 18:17:25 -08:00
Jingning Han faa2ba86cc Merge "Optimze inv 16x16 DCT with 10 non-zero coeffs - P1" 2014-01-09 18:17:12 -08:00
Deb Mukherjee 36c8daed58 Merge "Cleanups on refresh flags" 2014-01-09 17:38:45 -08:00
Deb Mukherjee 412e4954c1 Cleanups on refresh flags
Cleanups on frame refresh flags and external overrides.

Change-Id: Ia6a56fe1bde906b1dc3fcbf4ef1c7b207cd2df2d
2014-01-09 17:00:23 -08:00
Johann e8192cf633 Merge "Use the correct member for initialization" 2014-01-09 15:21:19 -08:00
Yaowu Xu b1d81e19d8 Merge "Simplify set_rt_speed_feature()" 2014-01-09 15:02:24 -08:00
Marco Paniconi 193fa5c8ba Keep buffer clipped to maximum in change_config.
Under a configuration change, where the bitrate suddenly decreases,
the buffer level may be larger than maximum allowed (for that first frame to be encoded after change_config).
This change keeps it clipped to its maximum level.

Change-Id: I4d0b5b3d1fd8148600dd39e02bd630c9464baba5
2014-01-09 14:33:40 -08:00
Dmitry Kovalev c8e8d3a461 Merge "Renaming 'Sharpness' to 'sharpness'." 2014-01-09 13:42:55 -08:00
Yaowu Xu 2d381d76d8 Simplify set_rt_speed_feature()
1. Made speed choices to be progressive
2. Adjusted rt speed settings to achieve better speed/quality

Overall, rt-5 gained 2.5% in compression/quality, encoding time of 720p
niklas clip goes from 137,052ms to 121,874ms

Change-Id: Ia6e7e1e15225395a868a2f1059c3db8e266e1600
2014-01-09 13:02:15 -08:00
Jingning Han af31b27aae Optimze inv 16x16 DCT with 10 non-zero coeffs - P2
This commit further optimizes SSE2 operations in the second 1-D
inverse 16x16 DCT, with (<10) non-zero coefficients. The average
runtime of this module goes down from 779 cycles -> 725 cycles.

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
2014-01-09 12:46:09 -08:00
Yunqing Wang f3b9b97c0e Merge "SSSE3 convolution optimization" 2014-01-09 12:39:47 -08:00
levytamar82 511d218c60 SSSE3 convolution optimization
Optimizing all SSSE3 assembly for convolution:
1. vp9_filter_block1d4_h8_sse2
2. vp9_filter_block1d8_h8_sse2
3. vp9_filter_block1d16_h8_sse2
4. vp9_filter_block1d4_v8_sse2
5. vp9_filter_block1d8_v8_sse2
6. vp9_filter_block1d16_v8_sse2
My optimizations include:
- processing 2x8 elements in one 128-bit register instead of processing
  8 elements in one 128-bit register.
- removing unnecessary loads.
This optimization gives a 2.4% user-level gain for 480p input
and a 1.6% user-level gain for 720p.
This optimization was done only for 64-bit.

Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
2014-01-09 12:27:51 -07:00
Dmitry Kovalev 6d812d6f24 Merge "Removing examples code generation and making them static." 2014-01-09 11:15:46 -08:00
Dmitry Kovalev 42647fc9fe Merge "Using VP9_COMMON instead of VP9_COMP." 2014-01-09 11:15:29 -08:00
Johann c8a2aaa7e7 Merge "VP8 for ARMv8 by using NEON intrinsics 01" 2014-01-09 10:39:05 -08:00
James Yu 79395e16cf VP8 for ARMv8 by using NEON intrinsics 01
Add bilinearpredict_neon_intrinsics.c
- vp8_bilinear_predict4x4_neon
- vp8_bilinear_predict8x4_neon
- vp8_bilinear_predict8x8_neon
- vp8_bilinear_predict16x16_neon

Change-Id: I33dfa502881219841b442dda32b73220e51b716b
Signed-off-by: James Yu <james.yu@linaro.org>
2014-01-09 09:56:22 -08:00
Paul Wilkins 11569060f4 Merge "Fix rate allocation bug." 2014-01-09 03:00:15 -08:00
Johann 719dadf3ef Use the correct member for initialization
On Windows this fails with:
error C2440: 'initializing': cannot convert from int_mv to uint32_t

Change-Id: I51630efd0e83a0ce620c91aa7859dd6fc1572e99
2014-01-08 19:31:24 -08:00
Dmitry Kovalev b16fac42d4 Using VP9_COMMON instead of VP9_COMP.
Change-Id: If7d3958653104f3e170853e931f8489de3ecf3cc
2014-01-08 18:36:38 -08:00
Dmitry Kovalev d606bf93ef Merge "Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields." 2014-01-08 18:12:09 -08:00
Johann 67ad03acc3 Merge "Install test sources for MSVS" 2014-01-08 17:59:30 -08:00
Dmitry Kovalev feaad4f133 Merge "Cleanups around cpi->common." 2014-01-08 17:48:28 -08:00
Dmitry Kovalev c01fe86ccc Adding {get, set}_rate_correction_factor() functions.
Change-Id: Ib3212832953a3445fc5f021af0e1de7886f09b4f
2014-01-08 17:40:35 -08:00
Dmitry Kovalev 4fbe54d201 Merge "Renaming 'Mode' to 'mode'." 2014-01-08 16:29:29 -08:00
Johann 0239f11482 Install test sources for MSVS
Move the code outside the conditions. The test sources themselves are
also required for Visual Studio.

Change-Id: Id5e93ebc7369e1807eba0b9dc4f7d0f18033d794
2014-01-08 15:45:14 -08:00
Jingning Han ba6ab46cdc Optimze inv 16x16 DCT with 10 non-zero coeffs - P1
This commit is the first patch optimizing SSE2 implementation of inverse
16x16 DCT with <10 non-zero coefficients. It focused on the first 1-D (row)
transformation. It exploits the fact that only top-left 4x4 block contains
non-zero coefficients, in a 2-D inverse 16x16 DCT with <10 coefficients.

The average runtime of idct16x16_10 unit is reduced from
883 cycles -> 779 cycles (12% faster).

For pedestrian_area_1080p 300 frames at 4000 kbps, the speed 2 runtime goes
down from 310651 ms  -> 305910 ms. The decoding speed goes up from
80.37 fps -> 80.87 fps.

Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645
2014-01-08 15:36:45 -08:00
Dmitry Kovalev 510a828256 Removing direct references to {lst_fb, gld_fb, alt_fb}_idx fields.
Change-Id: Ib1d9628d2b538b6dc27b0db1fa7f40f70ff2072f
2014-01-08 15:21:41 -08:00
Dmitry Kovalev 0ecd583d8d Cleanups around cpi->common.
Change-Id: I0c42a729038d0f4cb7bc07f587d066fcb1dfe9d9
2014-01-08 14:51:00 -08:00
Alex Converse 8fcb74e6bb Merge "Add a C fallback for get_msb() and change inline to INLINE." 2014-01-08 14:43:46 -08:00
hkuang 5be0ed30dc Merge "Add initial intra frame neon optimization. 1~2% gain." 2014-01-08 14:41:43 -08:00
Dmitry Kovalev 962c8b241e Renaming 'Mode' to 'mode'.
Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362
2014-01-08 14:33:59 -08:00
Dmitry Kovalev 57be81369a Renaming 'Sharpness' to 'sharpness'.
Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4
2014-01-08 14:19:14 -08:00
Dmitry Kovalev feab7e1146 Merge "Using struct twopass_rc* instead of VP9_COMP*." 2014-01-08 14:14:05 -08:00
Alex Converse ce7ff3b63d Add a C fallback for get_msb() and change inline to INLINE.
For systems without __builtin_clz() or _BitScanReverse(). Taken from libwebp.

Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e
2014-01-08 12:25:47 -08:00
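A minimal sketch of what a portable C fallback for get_msb() could look like on systems without `__builtin_clz()` or `_BitScanReverse()`. The name `get_msb_fallback()` is hypothetical and the code is an illustration of the general technique (a binary search over shift widths), not a copy of the committed implementation. It assumes the input fits in 32 bits and is non-zero.

```c
#include <assert.h>

/* Hypothetical portable fallback: returns floor(log2(n)), i.e. the index of
 * the most significant set bit. n must be non-zero. */
static int get_msb_fallback(unsigned int n) {
  int log = 0;
  int i;
  assert(n != 0);
  /* Try shift widths 16, 8, 4, 2, 1. Whenever the upper part is non-zero,
   * keep it and add the shift width to the result. */
  for (i = 4; i >= 0; --i) {
    const int shift = 1 << i;
    const unsigned int x = n >> shift;
    if (x != 0) {
      n = x;
      log += shift;
    }
  }
  return log;
}
```

The loop runs a fixed five iterations regardless of the input, so the cost is constant, though typically higher than a single hardware bit-scan instruction.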
hkuang 691111aacf Add initial intra frame neon optimization. 1~2% gain.
More intra optimizations will be added.

Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
2014-01-08 11:58:42 -08:00
Yunqing Wang a84029ad9c Merge "AVX2 Variance Optimization" 2014-01-08 11:33:42 -08:00
Johann af72081818 Merge "Include gen_msvs_vcxproj.sh" 2014-01-08 11:10:03 -08:00
Alex Converse 22d83a0ab7 Merge "Replace RD modeling with a fixed point approximation." 2014-01-08 11:06:54 -08:00
levytamar82 357b65369f AVX2 Variance Optimization
Optimizing the variance functions vp9_variance16x16, vp9_variance32x32,
vp9_variance64x64, vp9_variance32x16, vp9_variance64x32, and
vp9_mse16x16 by migrating them to AVX2.
Some of the functions were optimized by processing 32 elements instead of 16.
Some of the functions were optimized by processing 2 loop strides of 16
elements in a single 256-bit register.
This optimization gives a 2.4% - 2.7% user-level performance gain
and a 42% function-level gain.

Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
2014-01-08 12:05:53 -07:00
Alex Converse f2ca665f1c Replace RD modeling with a fixed point approximation.
Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
2014-01-08 10:37:24 -08:00
Jingning Han aa9552b0b5 Merge "Fix an issue in motion vector prediction stage" 2014-01-08 10:06:03 -08:00
Johann 87784e3a99 Include gen_msvs_vcxproj.sh
Change-Id: I28e9cf9347acd7279df3b841863a248479633265
2014-01-08 09:51:15 -08:00
Deb Mukherjee 0d21d79bbc Merge "Further rate control cleanups" 2014-01-08 09:20:29 -08:00