mozilla/aom - aom

Граф коммитов

Автор	SHA1	Сообщение	Дата
Yunqing Wang	6193bc3ba8	Refactor vp9_dequant_idct_add function Provided a wrapper and removed duplicate code. Change-Id: Iaef842226ec348422e459202793b001d0983ea30	2013-02-28 14:18:46 -08:00
Dmitry Kovalev	347f3a0aa8	Code cleanup. Fixing code style, using array lookup instead of switch statements for forward hybrid transforms (in the same way as for their inverses). Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places. Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f	2013-02-27 13:51:04 -08:00
Ronald S. Bultje	e8c74e2b70	Move eob from BLOCKD to MACROBLOCKD. Consistent with VP8. Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07	2013-02-27 11:00:55 -08:00
John Koleszar	eb939f45b8	Spatial resamping of ZEROMV predictors This patch allows coding frames using references of different resolution, in ZEROMV mode. For compound prediction, either reference may be scaled. To test, I use the resize_test and enable WRITE_RECON_BUFFER in vp9_onyxd_if.c. It's also useful to apply this patch to test/i420_video_source.h: --- a/test/i420_video_source.h +++ b/test/i420_video_source.h @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource { virtual void FillFrame() { // Read a frame from input_file. + if (frame_ != 3) if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) { limit_ = frame_; } This forces the frame that the resolution changes on to be coded with no motion, only scaling, and improves the quality of the result. Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496	2013-02-26 23:54:23 -08:00
John Koleszar	6a4f708c25	Refactor inter recon functions to support scaling Ensure that all inter prediction goes through a common code path that takes scaling into account. Removes a bunch of duplicate 1st/2nd predictor code. Also introduces a 16x8 mode for 8x8 MVs, similar to the 8x4 trick we were doing before. This has an unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that case for now. Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba	2013-02-26 10:03:29 -08:00
Ronald S. Bultje	35524e2231	Remove "eobs" array in MACROBLOCKD. The information is a duplicate of "eob" in BLOCKD. Change-Id: Ia6416273bd004611da801e4bfa6e2d328d6f02a3	2013-02-21 10:07:36 -08:00
Dmitry Kovalev	e6c89a1f9b	Merge "Code cleanup." into experimental	2013-02-20 12:47:54 -08:00
Dmitry Kovalev	eb6aee50a4	Code cleanup. Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08	2013-02-20 11:36:31 -08:00
Yaowu Xu	d262e26cc7	Merge lossless experiment Change-Id: I7b7b8d4fda3a23699e0c920d727f8c15d37d43aa	2013-02-20 07:54:28 -08:00
Jingning Han	cd907b1601	16x16 butterfly inverse ADST/DCT hybrid transform rebased. This patch includes 16x16 butterfly inverse ADST/DCT hybrid transform. It uses the variant ADST of kernel sin((2k+1)*(2n+1)/4N), which allows a butterfly implementation. The coding gains as compared to DCT 16x16 are about 0.1% for both derf and std-hd. It is noteworthy that for std-hd sets many sequences gains about 0.5%, some 0.2%. There are also few points that provides -1% to -3% performance. Hence the average goes to about 0.1%. Change-Id: Ie80ac84cf403390f6e5d282caa58723739e5ec17	2013-02-19 09:07:00 -08:00
Yaowu Xu	93d6b86cfd	Use lossless for Q0 The commit changes the coding mode to lossless whenever the lowest quantizer is choosen. As expected, test results showed no difference for cif and std-hd set where Q0 is rarely used. For yt and yt-hd set, Q0 is used for a number of clips, where this commit helped a lot in the high end. Average over all clips in the sets: yt: 2.391% 1.017% 1.066% hd: 1.937% .764% .787% Change-Id: I9fa9df8646fd70cb09ffe9e4202b86b67da16765	2013-02-19 06:18:42 -08:00
Ronald S. Bultje	3af36ea8cc	Remove Y2 and Y-no-DC token types from the bitstream. Change-Id: I7a5314daca993d46b8666ba1ec2ff3766c1e5042	2013-02-15 14:06:30 -08:00
Ronald S. Bultje	46dff5d233	Remove some Y2-related code. Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78	2013-02-15 14:06:25 -08:00
Ronald S. Bultje	51afedbe28	Merge "Remove 2nd-order transform for first-order DC coefficients." into experimental	2013-02-13 13:58:02 -08:00
Ronald S. Bultje	42d6be8080	Remove 2nd-order transform for first-order DC coefficients. Since addition of the larger-scale transforms (16x16, 32x32), these don't give a benefit at macroblock-sizes anymore. At superblock-sizes, 2nd-order transform was never used over the larger transforms. Future work should test whether there is a benefit for that use case. Change-Id: I90cadfc42befaf201de3eb0c4f7330c56e33330a	2013-02-13 12:28:19 -08:00
Yaowu Xu	f01b08c96c	Merge "enable bitstream lossless support" into experimental	2013-02-13 10:26:58 -08:00
Yaowu Xu	d3de97794f	Merge "fix the lossless experiment" into experimental	2013-02-13 09:54:35 -08:00
Yaowu Xu	17db5d00be	enable bitstream lossless support 1. Added a bit in frame header to to indicate if a frame is encoded in lossless mode, so decoder does not make the decision based on Q0 2. Minor changes to make sure that lossy coding works same as when the lossless experiment is not enabled. 3. Renamed function pointers for transforms to be consistent, using prefix fwd_txm and inv_txm for forward and inverse respectively To encode in lossless mode, using "--lossless=1 --min-q=0 --max-q=0" with vpxenc. Change-Id: Ifae53b26d2ffbe378d707e29d96817b8a5e6c068	2013-02-13 09:24:39 -08:00
Yaowu Xu	16f25f9dc8	fix the lossless experiment Change-Id: I95acfc1417634b52d344586ab97f0abaa9a4b256	2013-02-13 09:20:26 -08:00
Paul Wilkins	649be94cf0	Removal of Hybrid DWT/DCT experiment. Removal of experiment to simplify code base for other changes. Change-Id: If0a33952504558511926ad212bc311fc2bffb19a	2013-02-13 15:08:48 +00:00
John Koleszar	1d60b6bcb5	Merge "Replace as_mv struct with array" into experimental	2013-02-12 13:59:04 -08:00
Jingning Han	57e995ff9c	butterfly inverse 4x4 ADST fixed format issues. Implement the inverse 4x4 ADST using 9 multiplications. For this particular dimension, the original ADST transform can be factorized into simpler operations, hence is retained. Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960	2013-02-11 10:42:39 -08:00
John Koleszar	7ca517f755	Replace as_mv struct with array Replace as_mv.{first, second} with a two element array, so that they can easily be processed with an index variable. Change-Id: I1e429155544d2a94a5b72a5b467c53d8b8728190	2013-02-08 20:23:35 -08:00
John Koleszar	6125a1ed81	Pass macroblock index to pick inter functions Pass the current mb row and column around rather than the recon_yoffset and recon_uvoffset, since those offsets will change from predictor to predictor, based on the reference frame selection. Change-Id: If3f9df059e00f5048ca729d3d083ff428e1859c1	2013-02-08 14:25:40 -08:00
John Koleszar	3de8ee6ba1	Merge changes Ife0d8147,I7d469716,Ic9a5615f into experimental * changes: Restore SSSE3 subpixel filters in new convolve framework Convert subpixel filters to use convolve framework Add 8-tap generic convolver	2013-02-08 13:19:47 -08:00
Jingning Han	d15e1da494	Butterfly ADST based hybrid transform Refactor the 8x8 inverse hybrid transform. It is now consistent with the new inverse DCT. Overall performance loss (due to the use of this variant ADST, and the rounding errors in the butterfly implementation) for std-hd is -0.02. Fixed BUILD warning. Devise a variant of the original ADST, which allows butterfly computation structure. This new transform has kernel of the form: sin((2k+1)*(2n+1) / (4N)). One of its butterfly structures using floating-point multiplications was reported in Z. Wang, "Fast algorithms for the discrete W transform and for the discrete Fourier transform", IEEE Trans. on ASSP, 1984. This patch includes the butterfly implementation of the inverse ADST/DCT hybrid transform of dimension 8x8. Change-Id: I3533cb715f749343a80b9087ce34b3e776d1581d	2013-02-07 10:07:46 -08:00
Ronald S. Bultje	1407bdc243	[WIP] Add column-based tiling. This patch adds column-based tiling. The idea is to make each tile independently decodable (after reading the common frame header) and also independendly encodable (minus within-frame cost adjustments in the RD loop) to speed-up hardware & software en/decoders if they used multi-threading. Column-based tiling has the added advantage (over other tiling methods) that it minimizes realtime use-case latency, since all threads can start encoding data as soon as the first SB-row worth of data is available to the encoder. There is some test code that does random tile ordering in the decoder, to confirm that each tile is indeed independently decodable from other tiles in the same frame. At tile edges, all contexts assume default values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode), and motion vector search and ordering do not cross tiles in the same frame. t log Tile independence is not maintained between frames ATM, i.e. tile 0 of frame 1 is free to use motion vectors that point into any tile of frame 0. We support 1 (i.e. no tiling), 2 or 4 column-tiles. The loopfilter crosses tile boundaries. I discussed this briefly with Aki and he says that's OK. An in-loop loopfilter would need to do some sync between tile threads, but that shouldn't be a big issue. Resuls: with tiling disabled, we go up slightly because of improved edge use in the intra4x4 prediction. With 2 tiles, we lose about ~1% on derf, ~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5% on derf ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is concentrated in the low-bitrate end of clips, and most of it is because of the loss of edges at tile boundaries and the resulting loss of intra predictors. TODO: - more tiles (perhaps allow row-based tiling also, and max. 8 tiles)? - maybe optionally (for EC purposes), motion vectors themselves should not cross tile edges, or we should emulate such borders as if they were off-frame, to limit error propagation to within one tile only. This doesn't have to be the default behaviour but could be an optional bitstream flag. Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f	2013-02-05 15:43:03 -08:00
John Koleszar	7a07eea13f	Convert subpixel filters to use convolve framework Update the code to call the new convolution functions to do subpixel prediction rather than the existing functions. Remove the old C and assembly code, since it is unused. This causes a 50% performance reduction on the decoder, but that will be resolved when the asm for the new functions is available. There is no consensus for whether 6-tap or 2-tap predictors will be supported in the final codec, so these filters are implemented in terms of the 8-tap code, so that quality testing of these modes can continue. Implementing the lower complexity algorithms is a simple exercise, should it be necessary. This code produces slightly better results in the EIGHTTAP_SMOOTH case, since the filter is now applied in only one direction when the subpel motion is only in one direction. Like the previous code, the filtering is skipped entirely on full-pel MVs. This combination seems to give the best quality gains, but this may be indicative of a bug in the encoder's filter selection, since the encoder could achieve the result of skipping the filtering on full-pel by selecting one of the other filters. This should be revisited. Quality gains on derf positive on almost all clips. The only clip that seemed to be hurt at all datarates was football (-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR, 0.347% SSIM. Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff	2013-02-05 14:23:17 -08:00
Ronald S. Bultje	3a4b18bc67	don't code the branch for the predicted seg_id if that flag is false. Change-Id: Icb6e21dc0c2d9918faa33c8bf70943660df7ad88	2013-01-30 09:30:46 -08:00
Paul Wilkins	0ff9b033b0	Segment Skip Flag First step in simplifying the segment mode and segment EOB flags into a simpler segment skip flag that implies 0,0 mv and EOB at position 0. Change-Id: Ib750cac31a7a02dc21082580498efd9f7d8d72a5	2013-01-28 17:28:04 +00:00
Ronald S. Bultje	c9071601a2	Remove compound intra-intra experiment. This experiment gives little gains and adds relatively much code complexity (and it hinders other experiments), so let's get rid of it. Change-Id: Id25e79a137a1b8a01138aa27a1fa0ba4a2df274a	2013-01-14 15:47:25 -08:00
Deb Mukherjee	516db21c2c	Further enhancements/fixes on dct/dwt hybrid txfm Fixes some scaling issues. Adds an option to only compute the dct on the low-low subband for 32x32 and 64x64 blocks using only a single 16x16 dct after 1 and 2 wavelet decomposition levels respectively. Also adds an option to use a 8x8 dct as building block. Currenlty with the 2/6 filter and with a single 16x16 dct on the low low band, the reuslts compared to full 32x32 dct is as follows: derf: -0.15% yt: -0.29% std-hd: -0.18% hd: -0.6% These are my current recommended settings, since the 2/6 filter is very simple. Results with 8x8 dct are about 0.3% worse. Change-Id: I00100cdc96e32deced591985785ef0d06f325e44	2013-01-12 16:00:53 -08:00
Ronald S. Bultje	aa2effa954	Merge tx32x32 experiment. Change-Id: I615651e4c7b09e576a341ad425cf80c393637833	2013-01-10 08:23:59 -08:00
Ronald S. Bultje	6884a83f06	Merge superblocks64 experiment. Change-Id: If6c88752dffdb566f8d4322f135145270716fb8e	2013-01-09 17:21:40 -08:00
Adrian Grange	7d6b5425d7	New prediction filter This patch removes the old pred-filter experiment and replaces it with one that is implemented using the switchable filter framework. If the pred-filter experiment is enabled, three interopolation filters are tested during mode selection; the standard 8-tap interpolation filter, a sharp 8-tap filter and a (new) 8-tap smoothing filter. The 6-tap filter code has been preserved for now and if the enable-6tap experiment is enabled (in addition to the pred-filter experiment) the original 6-tap filter replaces the new 8-tap smooth filter in the switchable mode. The new experiment applies the prediction filter in cases of a fractional-pel motion vector. Future patches will apply the filter where the mv is pel-aligned and also to intra predicted blocks. Change-Id: I08e8cba978f2bbf3019f8413f376b8e2cd85eba4	2013-01-09 12:00:39 -08:00
Ronald S. Bultje	4455036cfc	Merge superblocks (32x32) experiment. Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587	2013-01-08 12:54:45 -08:00
John Koleszar	879cb7d962	Merge vp9-preview changes into experimental branch Incorportate vp9-preview changes by merging master branch into experimental. Conflicts: test/test.mk vp9/common/vp9_filter.c vp9/common/vp9_idctllm.c vp9/common/vp9_invtrans.h vp9/common/vp9_mbpitch.c vp9/common/vp9_rtcd_defs.sh vp9/common/vp9_systemdependent.h vp9/common/vp9_type_aliases.h vp9/common/x86/vp9_asm_stubs.c vp9/common/x86/vp9_subpixel_mmx.asm vp9/decoder/vp9_decodframe.c vp9/decoder/vp9_dequantize.c vp9/decoder/vp9_dequantize.h vp9/decoder/vp9_onyxd_int.h vp9/encoder/vp9_bitstream.c vp9/encoder/vp9_encodeframe.c vp9/encoder/vp9_rdopt.c Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5	2013-01-08 10:19:59 -08:00
Ronald S. Bultje	c3941665e9	64x64 blocksize support. 3.2% gains on std/hd, 1.0% gains on hd. Change-Id: I481d5df23d8a4fc650a5bcba956554490b2bd200	2013-01-05 18:20:25 -08:00
Paul Wilkins	313d1100af	Added update-able mv-ref probabilities. Part of NEW_MVREF experiment. Added update-able probabilities. Change-Id: I5a4fcf4aaed1d0d1dac980f69d535639a3d59401	2013-01-02 14:22:11 +00:00
John Koleszar	5ebe94f9f1	Build fixes to merge vp9-preview into master Various fixups to resolve issues when building vp9-preview under the more stringent checks placed on the experimental branch. Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07	2012-12-26 11:21:09 -08:00
Ronald S. Bultje	4cca47b538	Use standard integer types for pixel values and coefficients. For coefficients, use int16_t (instead of short); for pixel values in 16-bit intermediates, use uint16_t (instead of unsigned short); for all others, use uint8_t (instead of unsigned char). Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da	2012-12-18 15:31:19 -08:00
Yaowu Xu	899f0fc126	clean up tokenize_b() and stuff_b() Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5	2012-12-11 13:32:16 -08:00
Yaowu Xu	ab480cede5	experiment with CONTEXT conversion This commit changed the ENTROPY_CONTEXT conversion between MBs that have different transform sizes. In additioin, this commit also did a number of cleanup/bug fix: 1. removed duplicate function vp9_fix_contexts() and changed to use vp8_reset_mb_token_contexts() for both encoder and decoder 2. fixed a bug in stuff_mb_16x16 where wrong context was used for the UV. 3. changed reset all context to 0 if a MB is skipped to simplify the logic. Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c	2012-12-07 17:25:45 -08:00
Ronald S. Bultje	c456b35fdf	32x32 transform for superblocks. This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds code all over the place to wrap that in the bitstream/encoder/decoder/RD. Some implementation notes (these probably need careful review): - token range is extended by 1 bit, since the value range out of this transform is [-16384,16383]. - the coefficients coming out of the FDCT are manually scaled back by 1 bit, or else they won't fit in int16_t (they are 17 bits). Because of this, the RD error scoring does not right-shift the MSE score by two (unlike for 4x4/8x8/16x16). - to compensate for this loss in precision, the quantizer is halved also. This is currently a little hacky. - FDCT and IDCT is double-only right now. Needs a fixed-point impl. - There are no default probabilities for the 32x32 transform yet; I'm simply using the 16x16 luma ones. A future commit will add newly generated probabilities for all transforms. - No ADST version. I don't think we'll add one for this level; if an ADST is desired, transform-size selection can scale back to 16x16 or lower, and use an ADST at that level. Additional notes specific to Debargha's DWT/DCT hybrid: - coefficient scale is different for the top/left 16x16 (DCT-over-DWT) block than for the rest (DWT pixel differences) of the block. Therefore, RD error scoring isn't easily scalable between coefficient and pixel domain. Thus, unfortunately, we need to compute the RD distortion in the pixel domain until we figure out how to scale these appropriately. Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b	2012-12-07 14:45:05 -08:00
Paul Wilkins	4cc657ec6e	Change to MV reference search. This patch reduces the cpu cost of the MV ref search by only allowing insert for candidates that would be in the current top 4. This could alter the outcome and slightly favors near candidates which are tested first but also limits the worst case loop count to 4 and means in many cases it will drop out and not happen. Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113	2012-12-05 14:03:45 +00:00
Jim Bankoski	2b8dc065d1	google style guide include guards Change-Id: I2c252f3ddcc99e96c1f5d3dab8bcb25a2a3637ea	2012-11-30 07:30:59 -08:00
Jim Bankoski	13dbf1fb17	more rtcd cleanup Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7	2012-11-28 17:27:15 -08:00
Deb Mukherjee	0742b1e4ae	Fixing 8x8/4x4 ADST for intra modes with tx select This patch allows use of 8x8 and 4x4 ADST correctly for Intra 16x16 modes and Intra 8x8 modes when the block size selected is smaller than the prediction mode. Also includes some cleanups and refactoring. Rebase. Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a	2012-11-28 16:21:12 -08:00
Jim Bankoski	c67873989f	fixed includes to be fully specified Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243	2012-11-28 10:53:17 -08:00
John Koleszar	fcccbcbb39	Add vp9_ prefix to all vp9 files Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc	2012-11-27 14:12:30 -08:00

50 Коммитов