This option increases runtime by 20% and is only marginally
better than good cpu-used=0:
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.3382 | -0.3911 | -0.4875 |  -0.2982 | -0.2992 | -0.3164 |    -0.3686
It is also not well integrated with speed_features.c, which is
the main reason for the removal.
Change-Id: If88c50367f63b860ad57f650869b978ec7734aad
Refactors the end of handle_inter_mode into a new function. This code
is responsible for calculating an accurate RD for the SIMPLE_TRANSLATION
motion mode in the simplest case, and does the same for other motion
modes as their experiments are enabled.
This patch aims to do as little as possible to the code inside the
function - that is left to later patches to reduce the complexity of
this diff.
Change-Id: I62bf5aae34594b0a1dc4813aeba99e675d6db374
This is a simple implementation.
We will use a more precise buffer size for tcoeff once the
experiment functions correctly.
Change-Id: Ib561974f21ee1b8d72ce407882ea2be3cf0b069f
Do multisymbol coding for transform type.
Load default cdf probabilities directly.
Use CDF frame update mechanism when EC_ADAPT is
enabled.
Change-Id: Id23c927e81587b560e9df8b9bc56c0e2e3bb6f03
Adds a dependent config flag 'interintra' to turn on/off interintra
modes altogether.
Adds a dependent config flag 'wedge' to turn on/off wedge compound
for both interinter and interintra.
Adds another macro to change wedge predictors to use
only 0, 1/2, or 1 weights.
From now, use
--enable-ext-inter --enable-wedge --enable-interintra to get the
same behavior as the old --enable-ext-inter.
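
As a rough illustration of the restricted wedge weights mentioned above,
the sketch below snaps a mask weight to 0, 1/2 or 1 and blends two
predictor pixels with it. The 6-bit weight scale, the rounding
thresholds and all names (WEIGHT_BITS, quantize_weight, blend_pixel)
are assumptions made for this example, not the macro or code added by
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed weight scale: 64 represents a weight of 1. */
enum { WEIGHT_BITS = 6, WEIGHT_MAX = 1 << WEIGHT_BITS };

/* Snap a weight in [0, 64] to the nearest of 0, 32 (1/2) or 64 (1).
   Ties are rounded up; the thresholds are illustrative only. */
static int quantize_weight(int w) {
  assert(w >= 0 && w <= WEIGHT_MAX);
  if (w < WEIGHT_MAX / 4) return 0;
  if (w < 3 * WEIGHT_MAX / 4) return WEIGHT_MAX / 2;
  return WEIGHT_MAX;
}

/* Blend two predictor pixels: w applies to p0, (64 - w) to p1. */
static uint8_t blend_pixel(uint8_t p0, uint8_t p1, int w) {
  assert(w >= 0 && w <= WEIGHT_MAX);
  return (uint8_t)((w * p0 + (WEIGHT_MAX - w) * p1 + (WEIGHT_MAX >> 1)) >>
                   WEIGHT_BITS);
}
```

With only three weights, each output pixel is either a copy of one
predictor or the rounded average of the two.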
Change-Id: I2e787e6994163b6b859a9d6431b87c4217834ddc
This filter was temporarily removed due to test failures.
This patch reintroduces the filter and fixes two bugs:
* The test cases would occasionally segfault on x86, since
the highbd filter requires its inputs to be aligned to
16 bytes. This will always be true when used on real videos,
so adjust the test cases to match.
* The function calc_block was incorrect for bit_depth > 8,
due to passing an incorrect argument to _mm_srl_epi32().
This was the cause of the original test failures.
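
The alignment requirement in the first bullet can be illustrated with a
small, portable sketch. The helper names (is_aligned16, align16_up,
aligned_buffer_demo) are hypothetical and not part of the patch; the
sketch only shows one way a test could guarantee 16-byte-aligned input,
assuming 16-bit samples for the highbd path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p) {
  return ((uintptr_t)p & 15) == 0;
}

/* Rounds a pointer up to the next 16-byte boundary. */
static void *align16_up(void *p) {
  return (void *)(((uintptr_t)p + 15) & ~(uintptr_t)15);
}

/* Over-allocates by 15 bytes so that an aligned region of n samples
   always fits inside the allocation. Returns 1 on success. */
static int aligned_buffer_demo(size_t n) {
  assert(n > 0);
  void *raw = malloc(n * sizeof(uint16_t) + 15);
  if (raw == NULL) return 0;
  uint16_t *buf = (uint16_t *)align16_up(raw);
  int ok = is_aligned16(buf);
  /* The whole aligned region is usable. */
  for (size_t i = 0; i < n; ++i) buf[i] = (uint16_t)i;
  ok = ok && buf[n - 1] == (uint16_t)(n - 1);
  free(raw); /* free the original pointer, not the aligned one */
  return ok;
}
```

A test fixture built this way never hands the filter a pointer that is
offset from a 16-byte boundary.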
BUG=aomedia:392
Change-Id: Ia06b76c3e6122eebadd0995fb62f32c2fcab8b3e
This removes an instruction from the HW path. It also improves
BDR by 0.02% on all metrics (AWCY, High Latency,
objective-1-fast).
Change-Id: I9f8a86871e1c0db4a0704dee297acd6977abcbe4
get_txb_ctx is designed on the assumption that ctx is a uint8_t.
Hence, we cast ctx to uint8_t before further operations.
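
A minimal sketch of why such a cast matters, assuming a context byte
that is stored in a signed 8-bit type and packs two 4-bit fields. The
layout and the names (CTX_LOW_BITS, ctx_pack, ctx_high_field,
ctx_low_field) are hypothetical; the actual layout used by get_txb_ctx
is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low 4 bits hold one field, high 4 bits another. */
#define CTX_LOW_BITS 4

/* Packs two 4-bit fields into a signed byte. Values with the top bit
   set come out negative on the usual two's-complement targets. */
static int8_t ctx_pack(int low, int high) {
  assert(low >= 0 && low < 16 && high >= 0 && high < 16);
  return (int8_t)(uint8_t)((high << CTX_LOW_BITS) | low);
}

/* Cast to uint8_t before shifting: right-shifting a negative signed
   value is implementation-defined and usually sign-extends. */
static int ctx_high_field(int8_t ctx) {
  return (uint8_t)ctx >> CTX_LOW_BITS;
}

static int ctx_low_field(int8_t ctx) {
  return (uint8_t)ctx & ((1 << CTX_LOW_BITS) - 1);
}
```

Without the cast, a byte such as the one produced by ctx_pack(3, 9)
would typically yield a negative high field instead of 9.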
Change-Id: If8423d6e5edd346034cb9631726e930c47bc682b
Bring the following libvpx commits to aom:
e446ffd Cache optimizations in optimize_b()
50d3629 Repack vp9_token_state
Saves 24600 bytes of stack in the default configuration.
Change-Id: If9d6506cf3fe1c34ab639dedb3ef62a996293781
- Add function add_gas_asm_library() to handle conversion of asm
sources and creation of custom dependencies.
- Use add_asm_library() to create the library build.
- Add aom_dsp_common_neon_intrinsics target for the neon intrinsics.
BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76
Change-Id: Ifd99fbd69998a79613e0f5b61003a47973a804bc
The previous limit was <=1080p, which was sufficient to keep RAM
usage below 3GB for the 8 bit path, but turns out to be insufficient
for the 10 bit path.
Change-Id: I7a19261928a4e1a71f5f297125651113a2465d3d
Enabling SEPARATE_GLOBAL_MOTION will remove the ability for
a block that uses zeromv with global motion to pick warped_causal
or obmc_causal as the motion mode. When this is enabled there is:
0.05% drop on lowres for global + warped enabled
0.15% drop on midres for global + warped enabled
0.12% drop on lowres with global + motion var enabled
0.07% drop on midres with global + motion var enabled
No performance change for global, warped, or motion var individually.
Change-Id: Idbfb8dd7a93da14902438504b06a08e5212e48cb
This fix was motivated by a code generation bug in g++ on ARM, but it
seems a good idea in general to disable these unit tests when we are
not compiling with optimisations. The code under test is only intended
to be used as inlined functions. It is possible to compile without
optimisations, but the tests then become somewhat half-hearted: there
are workarounds for the case without inlining (such as for intrinsics
requiring immediate values), so the tests would partly test code that
won't be used anyway.
BUG=aomedia:377
Change-Id: I9a0515c96a7ed2f4636820dfc03fbb92323ca8ee
* Dering and clpf were merged into a single pass.
* 32x32 and 128x128 filter block sizes for clpf were removed.
* RDO for dering and clpf merged and improved:
- "0" no longer required to be in the strength selection
- Dering strength can now be 0, 1 or 2 bits per block
             LL      HL
PSNR:       -0.04   -0.01
PSNR HVS:   -0.27   -0.18
SSIM:       -0.15   +0.01
CIEDE 2000: -0.11   -0.03
APSNR:      -0.03   -0.00
MS SSIM:    -0.18   -0.11
Change-Id: I9f002a16ad218eab6007f90f1f176232443495f0
This function goes through each transform block in the
prediction block and calls av1_write_coeffs_txb to
pack the coefficients into the bitstream.
Change-Id: I6dedebef6cf8957f9173241a7de60e9936bc0be8
Replaces the int64 and int32 divisions in the least-squares and
gamma or delta computations with a mechanism that decomposes
the divisor D such that 1/D = y * 2^-k, where y is obtained
from a lookup table indexed by the 8 highest bits of the difference
D - 2^floor(log2(D)). The main complexity now comes only from
computing this decomposition, which is essentially equivalent
to finding floor(log2(D)) (the position of the highest set
bit in a 64-bit integer).
Also includes an out of memory bug fix and some cleanups.
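
The decomposition described above can be sketched as follows. This is
a minimal illustration, not the code from this patch: the 14-bit table
precision, the truncating index, the input range limits and all names
(DIV_LUT_BITS, get_div_lut, resolve_divisor, approx_div) are
assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_LUT_BITS 8       /* table is indexed by 8 bits */
#define DIV_LUT_PREC_BITS 14 /* assumed fixed-point precision of y */
#define DIV_LUT_NUM (1u << DIV_LUT_BITS)

/* Position of the highest set bit, i.e. floor(log2(d)) for d > 0. */
static int floor_log2_u64(uint64_t d) {
  int k = 0;
  while (d >>= 1) ++k;
  return k;
}

/* Builds the table once: lut[i] = round(2^14 * 256 / (256 + i)). */
static const uint16_t *get_div_lut(void) {
  static uint16_t lut[DIV_LUT_NUM];
  static int ready = 0;
  if (!ready) {
    const uint32_t one = 1u << (DIV_LUT_PREC_BITS + DIV_LUT_BITS);
    for (uint32_t i = 0; i < DIV_LUT_NUM; ++i) {
      const uint32_t den = DIV_LUT_NUM + i;
      lut[i] = (uint16_t)((one + den / 2) / den);
    }
    ready = 1;
  }
  return lut;
}

/* Returns y and sets *shift = k such that 1/d is roughly y * 2^-k. */
static uint16_t resolve_divisor(uint64_t d, int *shift) {
  assert(d > 0);
  const int n = floor_log2_u64(d);
  const uint64_t e = d - ((uint64_t)1 << n); /* d - 2^floor(log2(d)) */
  /* Keep the 8 highest bits of e, measured relative to 2^n. */
  const uint64_t idx = (n > DIV_LUT_BITS) ? (e >> (n - DIV_LUT_BITS))
                                          : (e << (DIV_LUT_BITS - n));
  *shift = n + DIV_LUT_PREC_BITS;
  return get_div_lut()[idx];
}

/* Approximates num / d with one multiply and one shift. The input
   limits below are assumptions that keep the product within int64. */
static int64_t approx_div(int64_t num, uint64_t d) {
  assert(d > 0 && d < ((uint64_t)1 << 40));
  assert(num > -((int64_t)1 << 31) && num < ((int64_t)1 << 31));
  int shift;
  const int64_t y = resolve_divisor(d, &shift);
  const int64_t prod = num * y;
  const int64_t round = (int64_t)1 << (shift - 1);
  return prod >= 0 ? (prod + round) >> shift : -((-prod + round) >> shift);
}
```

Because the index truncates the divisor to 8 fractional bits, the
result can differ from exact division by a fraction of a percent; a
rounded index or a larger table would trade memory for accuracy.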
Change-Id: I9247fdff5f6b4191175d4b4656357bfff626f02c
This is used at cmake generation time via a command line like this:
$ cmake path/to/aom -DCMAKE_TOOLCHAIN_FILE=path/to/aom/build/cmake/toolchain/armv7-ios.cmake
BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76
Change-Id: Iadacc32c43bc23e0f670b88e3c1563c44319945c
- Stop acting as if Yasm is the only assembler.
- Kill generation and report error when yasm is not found for x86
and x86_64 (remove the generic fallback).
- Use $AOM_AS_FLAGS to pass assembler specific flags.
- Add include guard in aom_optimization.cmake.
BUG=https://bugs.chromium.org/p/aomedia/issues/detail?id=76
Change-Id: Ic68d6c81071c24a8ceb6806d04ab8959be97d876
This draft version only passes the compile check; it is not working yet.
The next goal is to use the new coding system when doing bitstream
packing, but keep the old coding system in the RD loop.
Change-Id: I224a1581d1cc5c67d73e71558fb77d9faf9c2470
This commit implements support for two-pass encoding using the xiphrc
experimental rate control system. Most of the code and logic comes
from the Theora project encoder.
Currently, support is limited to the bitrate targeting mode of the
rate control system. While it visibly improves quality and brings
the rate closer to the target than the one-pass mode, there is still
tuning and bug fixing to be done.
Change-Id: Iae0d65bbce5ddfbb95b436e2238a43d6100a23b3
Tokenizing in the last step of the RD loop and then packing the
tokens in the bitstream packing phase makes debugging hard.
Therefore, we create a frame-level buffer to store the txfm coeffs
from the reconstruction in the RD loop, so that the bitstream packing
phase can code the txfm coeffs directly.
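
The buffering scheme described above can be sketched roughly as below:
the RD loop appends each block's coefficients to one frame-level
buffer, and the packing phase reads them back in the same order. The
struct, the int32_t coefficient type and all function names are
hypothetical and only illustrate the idea; they are not the data
structures introduced by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  int32_t *data;    /* one allocation for the whole frame */
  size_t capacity;  /* total number of coefficients that fit */
  size_t write_pos; /* advanced by the RD loop */
  size_t read_pos;  /* advanced by the packing phase */
} FrameCoeffBuffer;

static int coeff_buf_init(FrameCoeffBuffer *b, size_t capacity) {
  assert(b != NULL && capacity > 0);
  b->data = (int32_t *)calloc(capacity, sizeof(*b->data));
  b->capacity = capacity;
  b->write_pos = b->read_pos = 0;
  return b->data != NULL;
}

static void coeff_buf_free(FrameCoeffBuffer *b) {
  free(b->data);
  b->data = NULL;
  b->capacity = b->write_pos = b->read_pos = 0;
}

/* RD loop side: append n coefficients. Returns 0 if they do not fit. */
static int coeff_buf_store(FrameCoeffBuffer *b, const int32_t *src, size_t n) {
  if (n > b->capacity - b->write_pos) return 0;
  memcpy(b->data + b->write_pos, src, n * sizeof(*src));
  b->write_pos += n;
  return 1;
}

/* Packing side: next n coefficients in stored order, or NULL if fewer
   than n remain unread. */
static const int32_t *coeff_buf_next(FrameCoeffBuffer *b, size_t n) {
  if (n > b->write_pos - b->read_pos) return NULL;
  const int32_t *p = b->data + b->read_pos;
  b->read_pos += n;
  return p;
}

/* Usage: store two blocks, then read them back for packing. */
static int coeff_buf_demo(void) {
  FrameCoeffBuffer buf;
  const int32_t blk0[4] = { 12, -3, 0, 1 };
  const int32_t blk1[2] = { 7, 0 };
  if (!coeff_buf_init(&buf, 8)) return 0;
  int ok = coeff_buf_store(&buf, blk0, 4) && coeff_buf_store(&buf, blk1, 2);
  const int32_t *r0 = coeff_buf_next(&buf, 4);
  const int32_t *r1 = coeff_buf_next(&buf, 2);
  ok = ok && r0 != NULL && r1 != NULL && r0[1] == -3 && r1[0] == 7;
  ok = ok && coeff_buf_next(&buf, 1) == NULL; /* nothing left to read */
  coeff_buf_free(&buf);
  return ok;
}
```

Keeping the coefficients themselves, instead of tokens, means the
packing phase can be inspected against the reconstruction directly.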
Change-Id: I999470eef6e038317a91585df2bdfc20aca3573e
Fixes some rd-debug mismatches coding cat6 tokens with tx size < 32x32.
For these tokens the high extrabits are elided during tokenization and
detokenization, but the rd cost was computed with the old tables from
VP9 where these high extrabits are always coded.
Change-Id: I4a9a6ea822ff821e1932c351d43a57bdb4d6d466
The offset of neighbors is communicated to av1_make_inter_predictors
so that the correct mi is used in gm warping.
Change-Id: I471bbdf2112ed678969492b11730f15d9527eb7e
Update the first PARTITION_PLOFFSET (4) contexts with the four classic
partitions. The extended partitions are only codable above 8x8, but
there are PARTITION_PLOFFSET (4) contexts for dropping below 8x8.
Change-Id: Ib3291dded6dc24103222e8f470504c20e29adb88