mozilla/aom - aom

Commit graph

Author	SHA1	Message	Date
Angie Chiang	ff8c490b9a	Branch dct to new implementation for bd12 Change-Id: I9281935653aacce22ac3100f79fb956c249e2bf3	2016-04-04 12:40:10 -07:00
Angie Chiang	f1060f5bc4	Change dct32x32's range. Bitdepth 10/12: fit coefficient range into 32 bits; fit coefficient * constant range into 32 bits. Bitdepth 8: fit coefficient range into 16 bits; fit coefficient * constant range into 32 bits. Change-Id: I50b5a3132e8a9f5155c971ab0f6eb52876d2b5ca	2016-04-04 11:21:11 -07:00
Angie Chiang	39b3c025fa	Fit dct's stage range into 32-bit when bitdepth is 12 Change-Id: I807e60c6dcacc50c087adcbdb1df022f8541efc5	2016-04-04 11:13:44 -07:00
Angie Chiang	c7c40d2329	Generalize txfm scale in highbd quantizer Change-Id: I359aa49c09b244e0d44ebd09442e365a3d22556c	2016-03-30 15:25:26 -07:00
Angie Chiang	64413a6ca7	Parameterize transform scale for quantizer This is to facilitate changing transform scale later Change-Id: Ic8ca5afba57d2489ebd191ccc40c1b31605a0d8c	2016-03-30 15:25:26 -07:00
Angie Chiang	25520d8dc3	Change vp10_fwd_txfm2d_#x#_sse2 to vp10_fwd_txfm2d_#x#_sse4_1. The speed performance for running 20k times is as follows. Note that the vp10_highbd_fdct#x#_sse2 version is the 16-bit version plus range check; the rest are 32-bit versions. vp10_fwd_txfm2d_4x4_c (2 ms); vp10_fwd_txfm2d_8x8_c (9 ms); vp10_fwd_txfm2d_16x16_c (45 ms); vp10_fwd_txfm2d_32x32_c (233 ms); vp10_fwd_txfm2d_4x4_sse4_1 (2 ms); vp10_fwd_txfm2d_8x8_sse4_1 (3 ms); vp10_fwd_txfm2d_16x16_sse4_1 (16 ms); vp10_fwd_txfm2d_32x32_sse4_1 (80 ms); vp10_highbd_fdct4x4_c (1 ms); vp10_highbd_fdct8x8_c (3 ms); vp10_highbd_fdct16x16_c (17 ms); highbd_fdct32x32_c (160 ms); vp10_highbd_fdct4x4_sse2 (0 ms); vp10_highbd_fdct8x8_sse2 (2 ms); vp10_highbd_fdct16x16_sse2 (8 ms); highbd_fdct32x32_sse2 (105 ms). Change-Id: I24daf1e0d4d66e91e4ce61ef71cefa7b70ee90ce	2016-03-30 15:25:26 -07:00
Angie Chiang	c75f64780b	Remove redundant code from vp10_fwd_txfm2d.c Change-Id: I87ae5e93957616c0f5160a4f679e42f77092c33f	2016-03-30 15:25:26 -07:00
Angie Chiang	f2b311f580	Simplify rounding in vp10_[fwd/inv]_txfm[1/2]d_#x# Change-Id: I24ce46e157dc5b9c0d75000a1a48e9c136ed4ee1	2016-03-30 15:25:26 -07:00
Angie Chiang	11d2bb5429	Add vp10_fwd_txfm2d_sse2 Change-Id: Idfbe3c7f5a7eb799c03968171006f21bf3d96091	2016-03-30 15:25:26 -07:00
Debargha Mukherjee	91707ac79e	Merge "Extend superblock size to 128x128 pixels." into nextgenv2	2016-03-30 20:55:32 +00:00
Geza Lore	552d5cd715	Extend superblock size to 128x128 pixels. If --enable-ext-partition is used at build time, the superblock size (sometimes also referred to as coding unit (CU) size) is extended to 128x128 pixels. Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a	2016-03-30 18:23:06 +01:00
Debargha Mukherjee	e467627f33	Merge "Fix for ext_interp experiment" into nextgenv2	2016-03-30 14:44:39 +00:00
Yaowu Xu	37241e6f95	Merge "Merge branch 'masterbase' into nextgenv2" into nextgenv2	2016-03-29 16:05:53 +00:00
Julia Robson	068e799459	Fix for ext_interp experiment. Amends previous commit to also handle subsampling correctly. Change ID of prev commit: I6b07e6cf9b287ba4b5bd6599af4a7412e50b3bdc. Was causing occasional failures for 422 streams due to accessing elements beyond the extent of the bmi array. Change-Id: I37ebabf4c01ca84bcd1851428172bdf753805d98	2016-03-29 16:09:49 +01:00
Yaowu Xu	c810740c36	Merge branch 'masterbase' into nextgenv2 Conflicts: vp9/encoder/vp9_encoder.c vpx_dsp/x86/convolve.h Change-Id: I60c3532936bedd796a75dfe78245a95ec21e2e55	2016-03-28 17:44:28 -07:00
Angie Chiang	4144a11552	Merge "Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10" into nextgenv2	2016-03-28 19:20:48 +00:00
Hui Su	14f2d03b4b	Merge "Fix assertion fail in build_intra_predictors" into nextgenv2	2016-03-28 18:14:47 +00:00
Angie Chiang	33833aefdd	Merge "Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10" into nextgenv2	2016-03-28 18:11:47 +00:00
Angie Chiang	46b234478f	Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10 Change-Id: I996c48a90d7d71b52594a91a35cb8712c7fc212e	2016-03-28 11:08:40 -07:00
Alex Converse	72e29c3a73	Merge changes I3c72a2d8,I9905f3a8 into nextgenv2 * changes: Add pluggable bitwriters. Add pluggable bitreaders.	2016-03-28 16:59:18 +00:00
hui su	f24b91c9e1	Fix assertion fail in build_intra_predictors Change-Id: Id6683b9593b52aa0d159f8f013782d9e0bd07206	2016-03-28 09:37:54 -07:00
Alex Converse	efd566ff93	Add pluggable bitreaders. This will make the code change for a pure ANS experiment manageable. Change-Id: I9905f3a89f492a4346860463a72fa8c52aac4c8e	2016-03-25 11:02:41 -07:00
Yunqing Wang	bdcc14051b	Recover tile coding performance. After porting tile coding from VP9 to VP10, some performance degradation was seen because of the difference between the VP9 and VP10 baselines. This patch disabled some features in VP10 while tile coding is turned on. Also, an encoder control API was added back for this use case. Change-Id: I8f736db8388408a8cc35320a2f80abb02906571c	2016-03-25 09:05:25 -07:00
Geza Lore	490ba1ad25	Port large scale tile coding features from nextgen. If configured with --enable-ext-tile, the codec uses an alternative tile coding syntax in the bitstream. Changes include: - The maximum number of tile rows and columns is extended to 1024 each. - The minimum tile width/height is 64 pixels (1 superblock). - A tile copy mode is added where a tile directly reuses the coded data of a previous tile. - The meaning of the tile-columns and tile-rows codec parameters is overloaded to mean tile-width and tile-height in units of 64 pixels. - All tiles should now be independent, including rows within the same columns, so large scale parallel, or independent decoding is possible. - vpxdec also gained the options to decode only a particular tile, tile row, or tile column. Changes without --enable-ext-tile: - All tiles should now be independent, including rows within the same columns, so large scale parallel, or independent decoding is possible. - vpxenc default tile configuration changed to use 1 tile column. Change-Id: I0cd08ad550967ac18622dae5e98ad23d581cb33e	2016-03-24 09:26:05 +00:00
Jingning Han	1fcb5fc755	Refactor motion vector residual coding process This commit separates the predicted motion vector from the nearestmv motion vector in the coding process for both regular and sub8x8 block sizes. Change-Id: I703490513b0194e6669ebf719352db015facb3e1	2016-03-23 12:10:38 -07:00
Angie Chiang	d9a0cbb1b7	Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10 Change-Id: Ie35bdbd7aafae693e3106d7ccbbdd8e65ee8800c	2016-03-23 12:05:12 -07:00
Yi Luo	deb33056d1	Merge "Highbd fht4x4 SSE4.1 optimization for DCT_DCT mode - Set up function vp10_highbd_fht4x4_sse4_1 for highbd SSE4.1 intrinsics optimization. - Wrote SSE4.1 functions: load_buffer_4x4(), write_buffer_4x4(), and fdct4x4_sse4_1(). - Used logical right shift to avoid coeff memory write/read. - Turned on vp10_highbd_fht4x4_sse4_1 for DCT_DCT mode only. - Improved overall encoding performance >2.3% for a 50-frame sequence, park_joy_1080p_12.y4m, with --input-bit-depth=12, --bit-depth=12. - Unit test passed." into nextgenv2	2016-03-23 18:30:40 +00:00
Hui Su	daf2fb42e6	Merge "Add "entropy" experiment" into nextgenv2	2016-03-23 17:50:57 +00:00
Alex Converse	b5454b245a	Merge "Add some ANS helpers needed to replace the vpx bool coder with pure ANS." into nextgenv2	2016-03-23 16:21:58 +00:00
Yi Luo	977dccd12c	Highbd fht4x4 SSE4.1 optimization for DCT_DCT mode - Set up function vp10_highbd_fht4x4_sse4_1 for highbd SSE4.1 intrinsics optimization. - Wrote SSE4.1 functions: load_buffer_4x4(), write_buffer_4x4(), and fdct4x4_sse4_1(). - Used logical right shift to avoid coeff memory write/read. - Turned on vp10_highbd_fht4x4_sse4_1 for DCT_DCT mode only. - Improved overall encoding performance >2.3% for a 50-frame sequence, park_joy_1080p_12.y4m, with --input-bit-depth=12, --bit-depth=12. - Unit test passed. Change-Id: Idd6dc6e472cbbf235f0ade4f66fbe859a860a004	2016-03-23 09:13:45 -07:00
Debargha Mukherjee	7a3bae768e	Merge "Porting ext_partition experiment from nextgen" into nextgenv2	2016-03-23 04:58:38 +00:00
Alex Converse	6b9cb8c489	Add some ANS helpers needed to replace the vpx bool coder with pure ANS. Change-Id: I32b63fca020c410cef16e93379b4e6e281ccbccd	2016-03-22 16:23:23 -07:00
Yue Chen	2613b5e9d6	Merge "Refactor prediction functions of OBMC" into nextgenv2	2016-03-22 21:06:16 +00:00
Julia Robson	5cce322a09	Porting ext_partition experiment from nextgen This has been ported under ext_partition_types because it is due to be combined with the coding_unit_size experiment which is already being ported under ext_partition Change-Id: I47af869ae123ddf0aa99160dac644059d14266ee	2016-03-22 12:29:01 -07:00
Angie Chiang	9d380d8872	Merge "mv vp10_fwd_txfm2d_#x# into vp10_rtcd.h" into nextgenv2	2016-03-22 01:07:56 +00:00
Angie Chiang	063e965d7d	Merge "Passing TXFM_TYPE instead of func pointer" into nextgenv2	2016-03-22 01:07:42 +00:00
Jingning Han	4df51c8de4	Merge "Refactor sub8x8 reference motion vector search function" into nextgenv2	2016-03-22 00:07:45 +00:00
Jingning Han	bfdcccd8a1	Merge "Rework the DRL syntax entropy coding system" into nextgenv2	2016-03-22 00:07:36 +00:00
Yue Chen	2e3f77316d	Refactor prediction functions of OBMC Merge the functions that generate prediction by above/left predictors for the encoder and the decoder. Change-Id: I57e53a8f2eb8d3028c4ed0c9abdcbf00503f95a0	2016-03-21 17:04:13 -07:00
Debargha Mukherjee	1b17559327	Adds 1D transforms for ADST/FlipADST to make 16. Makes a set of 16 transforms total, adding all 1D combinations of ADST and FlipADST, and removing all DST transforms. lowres, midres both improve by about 0.1% and hdres by -0.378% in BDRATE but with fewer transforms that are also simpler. Further experiments to continue later. Change-Id: I7348a4c0e12078fdea5ae3a2d36a89a319ffcc6e	2016-03-21 11:19:36 -07:00
Angie Chiang	abd447e339	mv vp10_fwd_txfm2d_#x# into vp10_rtcd.h Change-Id: Iad7352698786791b0fd7c005a7edfd1724b71599	2016-03-21 10:51:54 -07:00
Angie Chiang	40ef86f27d	Passing TXFM_TYPE instead of func pointer This is to facilitate sse2 implementation Change-Id: Id2f53e83c5508c4445d9b1bba00a649cb4da6b74	2016-03-21 10:50:59 -07:00
Jingning Han	66df6e7c7f	Refactor sub8x8 reference motion vector search function. Rework the interface to allow the codec to store the reference motion vector list information for the coding process. Change-Id: I47e26587f6c0808655e4626f316ec7614a7ad8ed	2016-03-21 10:02:08 -07:00
Jingning Han	5c9d315572	Rework the DRL syntax entropy coding system This commit re-designs the probability model for the syntax elements of the dynamic motion vector referencing system. Change-Id: Icfb8203c7e8f64e10e99f5890e25e6f6b15fe5d1	2016-03-21 09:52:33 -07:00
Geza Lore	efe7d4e5a2	Refactor mbmi->inter_tx_size to 2D array. This is in preparation of increasing the superblock size. Change-Id: I9197e397399fbe8aec1178a45ea0337dd90412d7	2016-03-18 15:30:09 +00:00
Angie Chiang	ed2514a22c	add dct 64x64 transform Change-Id: I131c4d1216cd156e520b8a91c4438c2d3c6602cb	2016-03-16 19:37:21 -07:00
hui su	83b47af18d	Add "entropy" experiment This patch added two features to improve entropy coding efficiency for coefficient tokens. 1. Choose 1 of 4 default probability tables based on q-index for key-frames. It is ported from nextgen branch: https://chromium-review.googlesource.com/#/c/280586/ 2. Do backward update after each superblock (64X64) row using subframe token counts. Coding gain: 0.1% on lowres; 0.42% on midres; 0.36% on hdres. Much larger gain for key-frames: 2.6%, 2.3%, 1.7%. Design doc: go/huisu-entropy Change-Id: Ia3b6a615636be09247d70e4c520405637561532b	2016-03-16 11:55:50 -07:00
Geza Lore	c2005c578b	Factor out zeroing above and left context. Change-Id: I6e5d8cff869c7415a924f845c9e6ccaabe2b7a9b	2016-03-16 13:08:29 +00:00
Debargha Mukherjee	dcbbb81605	Merge "Refactor 1D transforms" into nextgenv2	2016-03-15 19:08:07 +00:00
Debargha Mukherjee	cb37db126e	Merge "Fix copy/zero macros." into nextgenv2	2016-03-15 17:45:31 +00:00

394 commits