There were several places where loop_restoration used the encoded width
and height while superres was active. This patch changes it to use the
upscaled width and height, since loop_restoration is supposed to occur
after superres has done its upscaling.
Change-Id: I2b9bbb06b5370618758bf81d8eb63f2eef26af80
This affects two places:
* Fixes a compile error with frame-superres when
highbitdepth is disabled.
* Avoids including some supertx-related code when supertx
is disabled
BUG=aomedia:602
Change-Id: Idfc478fd88ade91d48c93cfd8abdd2bea86de898
Fix a bug where the height of a partition was mistakenly used as a
stride. This fixes the regression caused by the sub8x8 tx size rd search
for partitions >= 8x8.
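To illustrate this class of bug, here is a minimal sketch (not the libaom
code; sum_block and its parameters are hypothetical names). When walking a
block stored inside a larger row-major buffer, the step between rows must be
the buffer stride, not the block height:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums a bw x bh block located at `src` inside a
 * larger row-major buffer whose rows are `stride` elements apart. */
int64_t sum_block(const int16_t *src, ptrdiff_t stride, int bw, int bh) {
  int64_t sum = 0;
  for (int r = 0; r < bh; ++r) {
    /* Correct: advance by the buffer stride. Stepping by `bh` instead
     * would read the wrong rows whenever bh != stride. */
    const int16_t *row = src + r * stride;
    for (int c = 0; c < bw; ++c) sum += row[c];
  }
  return sum;
}
```

The mistake is easy to miss in testing because both versions agree whenever
the block height happens to equal the stride.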
Change-Id: I6114814dcec70fd5198f681c0a861bc9849286fd
The single ref comp mode should now work with WEDGE and
COMPOUND_SEGMENT. For motion_var, the OBMC_CAUSAL mode uses the 2nd
predictor if the neighboring block is predicted with a single ref comp
mode.
This patch removes the SR_NEAREST_NEWMV mode, leaving four single ref
comp modes in total:
SR_NEAREST_NEARMV
SR_NEAR_NEWMV
SR_ZERO_NEWMV
SR_NEW_NEWMV
Change-Id: If6140455771f0f1a3b947766eccf82f23cc6b67a
Daala-dist replaces the luma distortion of sub8x8 partitions with
its own distortion, and therefore needs the luma distortion split out
on its own. This exposed a bug where an INT_MAX64 value appears when
the sub8x8 partition is skipped. It happened because the existing code
does not initialize rd_stats_y or tmp_rd_stats_y, i.e. the luma-only
rd_stat structs, in several places.
Change-Id: If229b53bb7a6cff0b8751138a32b1dcf02665624
Commit 12311 had a misplaced set of asserts that caused superres debug
runs to fail. Moving the asserts to where they are relevant fixes the
issue.
Change-Id: Ic370686c7156fcaf9380d8d8fd9d35b892d77e46
A new experiment, SBL_SYMBOL (superblock-level symbols), will be
explored. It allows some symbols to be coded at the superblock level
(64x64) by checking whether a symbol (e.g. motion_mode, tx depth,
and interpolation filter) is identical across macroblocks in a
superblock.
Change-Id: I38408325c9b7a4b94c11c400a5060036ce36405e
This patch implements the post-encode and post-decode upscaling for the
frame superresolution experiment to work.
Upscaling happens after cdef and before loop restoration.
For now, this patch forces on random-superres.
The patch also cleans up some broken rate control hooks from VP9
days, to be brought back later when the resize and superres tools
are stable.
Change-Id: If0a8f69224dfaa0f4ae7703bd429ea2af953c7a6
Refactoring: split prediction+extension for each plane, so we can
handle luma/chroma supertx pred in different ways.
Compatibility fix: fix conflicts with cb4x4 and chroma_sub8x8. For
chroma sub8x8 supertx, only the top-left (basic cb4x4) or the
bottom-right (cb4x4 + chroma_sub8x8) predictor is now used, without
any blending within an 8x8 unit.
Change-Id: I6cf7b12768a82d3c7e01811ada02de84af9bd8ac
The values 'offset_r' and 'offset_c', representing a random
offset into a large pre-generated block, were calculated the
wrong way around. This could cause problems when testing
rectangular convolutions.
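A minimal sketch of why the swap only bites for rectangular sizes (this is
not the test code itself; BIG_W, BIG_H and pick_offsets are hypothetical
names, and the random values are passed in to keep the example
deterministic):

```c
#include <assert.h>

/* Hypothetical dimensions of the large pre-generated block. */
enum { BIG_W = 64, BIG_H = 128 };

/* Picks the top-left corner of a w x h window inside the big block.
 * `rnd_r` and `rnd_c` stand in for values from a random number generator. */
void pick_offsets(unsigned rnd_r, unsigned rnd_c, int w, int h,
                  int *offset_r, int *offset_c) {
  assert(w > 0 && w <= BIG_W && h > 0 && h <= BIG_H);
  /* The row offset is bounded by the height and the column offset by the
   * width. Swapping the two bounds is harmless for square windows in a
   * square block, but can push a rectangular window out of bounds. */
  *offset_r = (int)(rnd_r % (unsigned)(BIG_H - h + 1));
  *offset_c = (int)(rnd_c % (unsigned)(BIG_W - w + 1));
}
```

With these bounds, offset_r + h never exceeds BIG_H and offset_c + w never
exceeds BIG_W, whatever the window shape.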
Change-Id: Ide830f275c83492abe83b61216da0fbce669fb7e
Check the availability of the reference frames at the frame level at
both encoder and decoder, and if a reference frame is not available
for a specific video frame, remove the signaling of such reference
frame info at the block level.
This patch adds the consideration of the bit saving inside the RD
optimization loop.
Change-Id: I4c22f1b843b21c7d2b47e118c99c3ad615a3d4e4
Scale the rounding factor according to the scaling factor applied
to the quantization step size. This resolves a compression
performance regression for transform sizes of 32x32 and above.
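The idea can be sketched with a simplified scalar quantizer. This is an
illustration under assumptions, not the libaom quantizer: quantize_mag,
log_scale and the integer-division form are all hypothetical. If the step
size is scaled down by 2^log_scale, the rounding term has to be scaled by
the same factor, otherwise it is too large relative to the step:

```c
/* Hypothetical simplified quantizer for a coefficient magnitude.
 * `step` is the nominal quantization step, `round` the nominal rounding
 * term, and `log_scale` the extra scaling applied for large transforms.
 * Assumes (step >> log_scale) > 0. */
int quantize_mag(int abs_coeff, int step, int round, int log_scale) {
  const int scaled_step = step >> log_scale;
  /* Scale the rounding term by the same factor as the step size. */
  const int scaled_round = (round + ((1 << log_scale) >> 1)) >> log_scale;
  return (abs_coeff + scaled_round) / scaled_step;
}
```

For example, with step 16, round 8 and log_scale 1, a magnitude of 3 still
quantizes to 0. Leaving the rounding term at 8 against the scaled step of 8
would have pushed it up to 1.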
BUG=aomedia:599
Change-Id: Id3fc9a46c4a8843ff5d77ccaa59ee3112b12d7f4
This avoids the rounding errors caused by right-shifting negative
numbers, which made the reconstructed coefficient have higher
distortion than the source coefficient.
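The effect can be reproduced in isolation. The sketch below is not the
patched code; both function names are hypothetical, and rounding the
magnitude is only one common way to avoid the asymmetry, assumed here for
illustration:

```c
#include <stdint.h>
#include <stdlib.h>

/* Rounds by shifting the signed value directly (n >= 1). With an
 * arithmetic right shift (what mainstream compilers do; the C standard
 * leaves it implementation-defined) ties round toward +infinity, so
 * negative inputs are not the mirror image of positive ones. */
int32_t round_shift_signed(int32_t v, int n) {
  return (v + (1 << (n - 1))) >> n;
}

/* Rounds the magnitude and restores the sign afterwards (n >= 1), so
 * that rounding -v gives the negation of rounding v. */
int32_t round_shift_symmetric(int32_t v, int n) {
  const int32_t mag = (abs(v) + (1 << (n - 1))) >> n;
  return v < 0 ? -mag : mag;
}
```

With n = 1, the value 3 rounds to 2 in both versions, while -3 rounds to -1
with the direct shift and to -2 with the symmetric version.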
BUG=aomedia:599
Change-Id: I11ed86bf1d41164dda4398545334a7b4e8e10513
The highbd_clip_pixel_add() function is generalized to be used in
the regular 8-bit path. Move its definition outside the highbd
experimental flag.
This resolves the compiler warning in unit tests when high bit-depth
is turned off.
Change-Id: I90a744adb2381c9bf8476aa2a2bd0c87d9afdf57
This was hard-coding the assumption that the block size for the
smallest TX size was also the smallest block size. This is no
longer true since fe67ed6af2 landed.
As a result, for TX blocks that overlapped the frame edge, it was
only measuring distortion on the upper-left 2x2 part of each 4x4
sub-block, causing the encoder to prefer larger transforms which
cause such overlap and avoid transforms which do not, causing a
regression.
This patch uses the appropriate conversion table, which fixes the
regression.
BUG=aomedia:593
Change-Id: Id253cf0f3a5252378e3f340b8350120639ff5c88
The Windows calling convention pushes any __m128i type arguments
after the 3rd (4th on x86-64) onto the stack. But on x86,
stack-allocated arguments are not guaranteed to be aligned to
a multiple of their natural alignment, leading to compile errors.
We fix this by making the functions which take >3 __m128i arguments
instead take pointers. Since the functions are marked INLINE, the
extra memory operations should optimize out.
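The shape of the change can be sketched as follows. The helper add4_epi16 is
a hypothetical example rather than one of the affected functions, and it
assumes an x86 target with SSE2:

```c
#include <emmintrin.h> /* SSE2 intrinsics */

/* Takes four vectors by pointer instead of by value, so no __m128i
 * argument has to be passed on a possibly under-aligned stack. */
static inline __m128i add4_epi16(const __m128i *a, const __m128i *b,
                                 const __m128i *c, const __m128i *d) {
  const __m128i ab = _mm_add_epi16(*a, *b);
  const __m128i cd = _mm_add_epi16(*c, *d);
  return _mm_add_epi16(ab, cd);
}
```

Callers pass the addresses of their local vectors, e.g.
add4_epi16(&a, &b, &c, &d). Once the call is inlined, the compiler is free
to keep all four values in registers.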
BUG=aomedia:587
Change-Id: I0cb2831fd12aded6f2821c037365386e6183ba5c
* Reduce bit widths of intermediate values where possible
* Change ROUND_POWER_OF_TWO_SIGNED to ROUND_POWER_OF_TWO
in av1(_highbd)_convolve_2d
* Apply offsetting and bounds checking, to match the intended
hardware implementation
* Separate the implementations of av1(_highbd)_convolve_2d
into compound-round and non-compound-round cases. This is because
there are now a significant number of differences between the
functions.
Overall, this is expected to affect the bitstream and encoder output
when convolve-round alone is enabled, but *not* when compound-round
is enabled.
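As a rough illustration of how offsetting lets the unsigned-style rounding
macro replace the signed one, consider the sketch below. The macro bodies
are written from memory and should be treated as an assumption, and
round_with_offset is a hypothetical helper, not a function from this patch:

```c
#include <stdint.h>

/* Rounding helpers in the style of the macros named above (assumed
 * definitions, reproduced from memory). */
#define ROUND_POWER_OF_TWO(value, n) (((value) + ((1 << (n)) >> 1)) >> (n))
#define ROUND_POWER_OF_TWO_SIGNED(value, n)           \
  (((value) < 0) ? -ROUND_POWER_OF_TWO(-(value), (n)) \
                 : ROUND_POWER_OF_TWO((value), (n)))

/* Hypothetical helper: adds an offset so the intermediate value is
 * non-negative, rounds with the unsigned-style macro, then removes the
 * scaled offset. Requires offset_bits > n and sum + (1 << offset_bits)
 * to be non-negative. */
int32_t round_with_offset(int32_t sum, int n, int offset_bits) {
  const int32_t offset = 1 << offset_bits;
  const int32_t biased = sum + offset;
  return ROUND_POWER_OF_TWO(biased, n) - (1 << (offset_bits - n));
}
```

The two approaches agree except on negative ties: for -3 with n = 1 the
offset version gives -1, while ROUND_POWER_OF_TWO_SIGNED gives -2.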
Change-Id: I8c21e0645fd11f64c59552885f87f4a5dd40ccf7
The 'ref' member of ConvolveParams currently serves two purposes:
* To indicate which component of a compound we're currently predicting,
e.g. for fetching interpolation filters with dual-filter enabled.
* To determine whether we should average into the destination buffer.
But there are two cases where we want to separate these out:
* In joint_motion_search, we want to try combining a fixed second
prediction with various first predictions.
* When searching masked interinter compounds, we want to predict
each component separately then try different combinations.
In these cases, we set 'ref' to 0 and use temporary variables to
make sure we use the correct interpolation filters. But this is
quite fragile.
This patch separates out the two uses into separate members.
This allows us to remove some temporary variables, but more
importantly gives easy fixes to two bugs in
build_inter_predictors_single_buf (used by rdopt):
* We previously set ref=0 but didn't fix up the interpolation filters
* For ZERO_ZEROMV modes, the second component would accidentally
average into the (uninitialized!) second prediction buffer
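A minimal sketch of the separation (illustrative only: ConvolveParamsSketch,
the do_average field name, pick_filter and store_pred are assumptions, not
taken from the patch):

```c
/* Illustrative stand-in for the struct described above. */
typedef struct {
  int ref;        /* which component of the compound is being predicted */
  int do_average; /* whether to average into the destination buffer */
} ConvolveParamsSketch;

/* Hypothetical: selects the per-component filter using `ref` only. */
int pick_filter(const ConvolveParamsSketch *p, const int filters[2]) {
  return filters[p->ref];
}

/* Hypothetical: writes or averages one predicted sample into dst,
 * controlled by `do_average` only. */
void store_pred(const ConvolveParamsSketch *p, int pred, int *dst) {
  *dst = p->do_average ? (*dst + pred + 1) >> 1 : pred;
}
```

With the two roles split, the second component can be predicted with
ref = 1 (so it picks up its own filter) and do_average = 0 (so it never
reads the existing contents of the destination buffer).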
BUG=aomedia:577
BUG=aomedia:584
BUG=aomedia:595
Change-Id: Ibc31d1ac701a029ea5efaa1197dd402bc4b7af1e
This has been found to be better than the original version in both ways:
(1) Better compression: lowres -0.229, midres -0.147
(2) Faster too in my quick test over 5 different clips with 30
frames: 2.7% to 10.5% faster.
Change-Id: I4d46e0915d6e4b8e7bfc03d0c8b88cbe3351ca20
Minor updates in test/tools_common.sh to enable use of
test/examples.sh with CMake make builds while continuing
to support configure builds.
BUG=aomedia:76,aomedia:589
Change-Id: I841aef3b61a0c9baa8ad7356fc5b51ffb0902907
This option is obsolete and confusing now that the LBD coding path
can be disabled at build time (--disable-lowbitdepth).
Removing it makes the encoder/decoder behaviours symmetric, and
prevents the "--test-16bit-internal --test-decode=fatal" mismatch.
BUG=aomedia:528
Change-Id: Ia2d9857629b789b11d37fc75433b2cecc27d6642
This fixes an assertion in get_txb_ctx() for an invalid sign value.
After fixing this bug, we have around 0.2% gain on lowres.
With cb4x4 on only, lv_map provides 1.538% gain on lowres.
With cb4x4 + chroma_sub8x8, lv_map provides 1.468% gain on lowres.
Change-Id: I06f996ec5af0d5e79e377a3dc8c012862fc4b9c7
For a tx size RD search with partition size >= 8x8 and tx size < 8x8,
the daala-dist function is applied to the whole partition after all tx
blocks are encoded, instead of to each 8x8 sub-block of the partition.
Change-Id: I27d9e2960aa641f550096e32ebcdf8dfb4de79a6
This reverts commit bf3813a166.
Reason for revert: feature not ready.
Incompatible with lossless under some circumstances,
causes the following assertion failure:
Assertion `(!is_compound) == (cm->reference_mode == SINGLE_REFERENCE)' failed
BUG=aomedia:575
Change-Id: I63a2b38ce3b7cb50108ac559cca0768b4579c9ae
This unifies the codepath for high-bitdepth transforms and deletes
all calls to the old deprecated versions. This required reworking
the way 1d configurations are combined in order to support rectangular
transforms.
There is one remaining codepath that calls the deprecated 4x4 hbd
transform from encoder/encodemb.c. I need to take a closer look
at what is happening there and will leave that for a followup
since this change has already gotten so large.
lowres 10 bit: -0.035%
lowres 12 bit: 0.021%
BUG=aomedia:524
Change-Id: I34cdeaed2461ed7942364147cef10d7d21e3779c