Removes a dupe. The CONFIG_MOTION_VAR testing support exists
in test/test.cmake.
- Remove the source file references.
- Remove setup_aom_dsp_test_targets().
Change-Id: Ifa034582223641d6d89a3274ff293c1e65cbb73d
Defer include of aom_config_defaults.cmake until config detection is
completed to allow for accurate storage of config information.
BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76
Change-Id: I7fd71696d564c625531025555e923b7ebf686451
check for googletest failures as well as mismatches. this greatly
reduces the error output and time to failure.
BUG=aomedia:371
Change-Id: Ic617905430a8ec39fbee2af9ce6655a8ef6796c0
Resolve the encoding failure when ec-multisymbol and rectangular
(rect-tx or var-tx) are both turned on.
Change-Id: I708ed66d907c5928adecfd2a53498566296594d6
promote the unsigned int calculation to uint64_t rather than int64_t for
type consistency
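
A minimal sketch of the promotion follows. The helper name
scaled_prob_sketch() is hypothetical, and the rounding expression is an
assumption for illustration rather than a quote from this tree:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: scales num/den to a value
 * in [0, 256] with rounding. Widening to uint64_t (instead of int64_t)
 * keeps the whole expression unsigned, and num * 256 cannot wrap in
 * 32 bits. The caller must guarantee den != 0. */
static unsigned int scaled_prob_sketch(unsigned int num, unsigned int den) {
  return (unsigned int)(((uint64_t)num * 256 + (den >> 1)) / den);
}
```

Without the cast, num * 256 would be evaluated in unsigned int and wrap
for num >= 2^24 on targets with a 32-bit unsigned int.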
cherry-picked from libvpx:
47d6f16a0 get_prob(): rationalize int types
Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
Adds the arguments struct HandleInterModeArgs to hold arguments that
are conditional on compiled features. This means that there are no
longer #if's in the function's argument list.
Some of the array pointers that were optional arguments have been
made array members in the new struct, but not all. This is due to the
function being called with either references to arrays that are
maintained between trying different modes OR with references to
"dummy" arrays initialized to zero. The arrays that are always used
are now members of the HandleInterModeArgs struct.
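
The pattern can be sketched as follows. This is an illustration only:
the SKETCH_CONFIG_* switches, the member names, and
handle_inter_mode_sketch() are all hypothetical and do not reflect the
real struct layout:

```c
/* Hypothetical feature switches standing in for real CONFIG_* macros. */
#define SKETCH_CONFIG_FEATURE_A 1
#define SKETCH_CONFIG_FEATURE_B 0

/* Feature-dependent arguments live inside the struct, so the #if's move
 * out of the function's parameter list. */
typedef struct {
#if SKETCH_CONFIG_FEATURE_A
  int feature_a_bias; /* hypothetical, present only with feature A */
#endif
#if SKETCH_CONFIG_FEATURE_B
  int feature_b_bias; /* hypothetical, present only with feature B */
#endif
  int always_used[4]; /* an array that is always used becomes a member */
} HandleInterModeArgsSketch;

/* The signature is identical for every feature combination. */
static int handle_inter_mode_sketch(int mode, HandleInterModeArgsSketch *args) {
  int cost = args->always_used[mode];
#if SKETCH_CONFIG_FEATURE_A
  cost += args->feature_a_bias;
#endif
#if SKETCH_CONFIG_FEATURE_B
  cost += args->feature_b_bias;
#endif
  return cost;
}
```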
Change-Id: I3076fd53c3cddf5a6d14bbe7d23a889465ed716d
This reduces the runtime profile of pvq_search_rdo_double from 37%
to 15% and improves overall encoding speed when PVQ is enabled by ~40%.
The SIMD code is not bit accurate with the C version and introduces a
slight PSNR regression on AWCY:
PSNR | PSNR Cb | PSNR Cr | PSNR HVS | SSIM | MS SSIM | CIEDE 2000
0.0607 | 0.1044 | N/A | 0.0126 | N/A | -0.0309 | N/A
Change-Id: Ie22cebc62df2e72618305f2268668d79167860c6
This is branchless on newer gcc and clang and is about 1% faster overall
at cq-level=16 frame-parallel=1.
Change-Id: I7f5608ab0f0abbc29aa3419a103addf945ea9f0a
Using 8-bit weights gives results similar to 12-bit, with only noise
level differences. Here's what 8-bit looks like compared to 12-bit:
* AWCY Objective-1-fast:
                 high latency   low latency
  ALL keyframes      0.00           0.01
  Video              0.00           0.04
* Google sets:
All Keyframes:
lowres: 0
midres: -0.001
hdres: -0.001
Video overall:
lowres: 0
midres: -0.063
hdres: 0.026
Change-Id: Ibed6015aa7cce12fcc6f314ffde76624df4ad2a1
Offsets for the least-squares for affine motion computation
are now set at the top left corner of the current block.
Improves stability and performance a little.
Change-Id: I68ca7e74c6102502daa8ca3373af2b2dd59400c3
Disable the support of compound prediction modes for sub8x8 coding
blocks. Make the rate-distortion optimization process account for
such constraints.
With the use of the 2x2 chroma prediction block, this makes the
worst-case number of inter predictors the same as vp9. It affects the
coding gains by 0.35% for lowres, 0.17% for midres, and 0.08% for hdres.
The encoding speed is up by 10%.
Change-Id: Ieb2a83030676911baa403e586f1f800cbf485d81
The segment-based lossless flag is used in selecting the transform size.
This commit fixes a bug where the wrong segment_id is used in such
selection.
BUG=aomedia:350
Change-Id: Ibc981c779739849bac00447155180abbd319eb28
The macro used in assert is defined under CONFIG_VAR_TX. This fixes a
build issue when --enable-var-tx and --enable-rd-debug are both on.
Change-Id: I497fe4a8b1fa6c7b05ac2b41c97522f7bdedc0ce
USE_TXTYPE_SEARCH_FOR_SUB8X8_IN_CB4X4 macro added to turn
tx_type search on/off for sub8x8 in cb4x4 mode.
The purpose is mainly to analyze the coding gains from cb4x4
but this later can be made into a speed feature as well.
Change-Id: Ic22026c373eebba87f324689ac5686a2844315b6
Integerizes computation of the least squares for warped motion.
The model is restricted to only Affine. Affine seems easiest
to compute and integerize since it can be split into two 3-dim
least squares problems, as opposed to rotation-zoom which needs
a 4-dim least-squares problem to be solved.
The current implementation requires only one division per block.
BDRATE impact is minimal. The upgrade to the affine model improves
coding efficiency but integerization also degrades efficiency a
little. Overall there is a net gain of about -0.07% BDRATE on
the lowres set.
BDRATE lowres: -1.113% with --enable-warped-motion vs. without
(up from -1.044%).
Change-Id: I6b9216ac0737d76f59054293eabee48e17739ec4
- Move source list vars.
- Split source list vars into common/decoder/encoder sources.
- Move target definitions into function.
- Split targets into common/decoder/encoder targets.
- Update CMakeLists.txt to include test.cmake and call
setup_aom_test_targets() at the appropriate time.
BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76
Change-Id: Icd9ce67593c2de7ebd5c8ef921e31517b6d20945
It only handles the realloc constraint (preserving low elements) by
serendipity, and we don't actually rely on that behavior anyway.
Meanwhile the calls may do extra copying that gets immediately clobbered
by the callers.
Cherry-pick from libvpx:
3063c3760 Remove vpx_realloc()
Change-Id: I8dfa89e4a81084b084889c27bd272fdf85184e8d
Since we now require C99, this is undefined behavior.
Thanks to Luc Trudeau for the report and Alex Converse for the
suggestion on how to make the macro safe for all integer sizes.
Change-Id: I99a1342dfedb3e17a6869269be317c2ed26bfe9b
This commit enables the motion vector referencing system to use
the motion information of blocks to the bottom and right of the
collocated block. This improves the compression performance by
0.3% for lowres, midres, and hdres sets.
Change-Id: I03b3fb21f3a8698880ca9ceb945fa3e32531acdb
relocate the den == 0 test to get_binary_prob(). the only other caller,
mode_mv_merge_probs(), does its own test on 0.
cherry-picked from libvpx:
93c823e24 vpx_dsp/get_prob: relocate den == 0 test
Change-Id: Ie0604ad405a97ed754e4b88c6d580eb4894ea0f6
+ inline the function directly as there was only one consumer
(get_prob())
this is an attempt to reduce the number of branches to work around an amd
bug. this change is mildly faster or neutral across x86-64, arm.
http://support.amd.com/TechDocs/44739_12h_Rev_Gd.pdf
665 Integer Divide Instruction May Cause Unpredictable Behavior
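
One possible branchless formulation is sketched below next to a
branching reference. Both function names are hypothetical, and the
expression is an assumption for illustration, not a quote from the
change. It relies on right shifts of negative ints being arithmetic,
which is implementation-defined in C but holds on the usual gcc and
clang targets:

```c
#include <stdint.h>

/* Branching reference: clamp p to [1, 255]. */
static uint8_t clip_prob_branching_sketch(int p) {
  return (uint8_t)((p > 255) ? 255 : (p < 1) ? 1 : p);
}

/* Branchless sketch, intended for p in [0, 256]:
 *  - p == 256: (255 - p) is -1, the arithmetic shift keeps it at -1
 *    (all ones), and the low 8 bits after the OR are 255.
 *  - p == 0:   (p == 0) contributes 1.
 *  - otherwise both extra terms are 0 and p passes through unchanged. */
static uint8_t clip_prob_branchless_sketch(int p) {
  return (uint8_t)(p | ((255 - p) >> 23) | (p == 0));
}
```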
cherry-picked from libvpx:
7481edb33 vpx_dsp/get_prob: make clip_prob branchless
Change-Id: I433059c61ce43ec5058cc16ca590d186bfa8aab5
Don't smash the value when assigning in CMakeLists.txt in
case the list needs an update from elsewhere in the build.
Change-Id: Icf1720f6bb4508e6a557c16dc229170f82d740b9
Merges two consecutive loops that iterated over TX_SIZES. There's no
impact to the bitstream. The 4 used as the termination threshold in the
second loop is equivalent to TX_SIZES.
Change-Id: Ic891d209b28f20907d53bcdd58139fe39c37b0fa