Allows DomainTxfm filters to be turned off for experimentation.
Also expands the parameter set for the self-guided filters.
Change-Id: I68fdb8e079a2464d80b3a4a990005c49baaaf0b8
Wedges were wrongly indexed when cb4x4 was on.
Brings a bit of the gain back when ext-inter + cb4x4 are turned on
together.
Change-Id: Id2bd359e70546cf0ea9cf31656064711c9894177
The reference frame handling in av1_find_mv_refs introduced
in https://aomedia-review.googlesource.com/c/6067/ broke the
compile when global-motion is enabled but ref-mv is not.
This patch adds the missing logic, allowing this case to compile
again.
Change-Id: I914887eb56d28a700b2917d086447bdbb314f35d
This commit makes the adaptive scan order system support
multi-threaded encoding. It fixes the unit test failure in
AV1/AVxEncoderThreadTest.EncoderResultTest/0.
BUG=aomedia:353
Change-Id: I61cbf9531c8deab97fb3bb17428d0b2a63cf309a
The WienerInfo struct requires a 16-byte alignment on x86,
since it contains filter coefficients which are loaded using
SSE aligned load instructions. But on 32-bit x86, the default
alignment of aom_malloc/aom_realloc is only 8 bytes, leading
to occasional segfaults.
To fix this, rather than using aom_realloc to resize WienerInfo
structures, we always free and re-allocate them using aom_memalign.
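A minimal sketch of the free-and-reallocate pattern described above, using C11 aligned_alloc as a stand-in for aom_memalign. The WienerInfoSketch layout and the helper name realloc_wiener_aligned are hypothetical, not the libaom definitions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for WienerInfo: coefficients that SSE code would
 * read with aligned 16-byte loads. */
typedef struct {
  int16_t vfilter[8];
  int16_t hfilter[8];
} WienerInfoSketch;

/* Resize by freeing the old buffer and allocating a fresh 16-byte-aligned
 * one. Unlike realloc, the old contents are NOT preserved. */
static WienerInfoSketch *realloc_wiener_aligned(WienerInfoSketch *old,
                                                size_t count) {
  size_t bytes = count * sizeof(WienerInfoSketch);
  free(old); /* free(NULL) is a no-op */
  /* aligned_alloc expects the size to be a multiple of the alignment. */
  bytes = (bytes + 15) & ~(size_t)15;
  return (WienerInfoSketch *)aligned_alloc(16, bytes);
}
```

Because the old contents are discarded, a caller following this pattern has to refill the structures after every resize.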
BUG=aomedia:345
Change-Id: Ib1b2a42d4a2fa215dcc81ea481c51271ab068a37
CLPF performance had degraded by about 0.5% over the past six months,
which isn't totally surprising since the codec is a moving target.
About half of that degradation comes from the improved 7 bit filter
coefficients. Therefore, CLPF needs to be retuned for the current
codec.
This patch makes two (normative) changes to the CLPF kernel:
* The clipping function was changed from clamp(x, -s, s) to
    sign(x) * max(0, abs(x) - max(0, abs(x) - s +
                                  (abs(x) >> (bitdepth - 3 - log2(s)))))
  This adds a rampdown to 0 at -32 and 32 (for 8 bit, -128 & 128
  for 10 bit, etc), so large differences are ignored.
* 8 taps instead of 6 taps:
                      1
      4               3
    13 31    ->     13 31
      4               3
                      1
AWCY results:   low delay   high delay
PSNR:             -0.40%      -0.47%
PSNR HVS:          0.00%      -0.11%
SSIM:             -0.31%      -0.39%
CIEDE 2000:       -0.22%      -0.31%
APSNR:            -0.40%      -0.48%
MS SSIM:           0.01%      -0.12%
About 3/4 of the gains come from the new clipping function.
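The new clipping function from the first bullet can be sketched as below. The names clpf_constrain_sketch and ilog2_sketch are hypothetical, and the sketch assumes s is a power of two with log2(s) <= bitdepth - 3 so that the shift is non-negative.

```c
#include <stdlib.h>

static int max_int(int a, int b) { return a > b ? a : b; }

/* floor(log2(v)) for v >= 1. */
static int ilog2_sketch(int v) {
  int n = 0;
  while (v > 1) {
    v >>= 1;
    ++n;
  }
  return n;
}

/* sign(x) * max(0, abs(x) - max(0, abs(x) - s +
 *                               (abs(x) >> (bitdepth - 3 - log2(s))))) */
static int clpf_constrain_sketch(int x, int s, int bitdepth) {
  const int mag = abs(x);
  const int shift = bitdepth - 3 - ilog2_sketch(s);
  const int excess = max_int(0, mag - s + (mag >> shift));
  const int out = max_int(0, mag - excess);
  return x < 0 ? -out : out;
}
```

With s = 4 at 8 bit, differences up to 4 pass through unchanged, a difference of 8 is reduced to 3, and differences of 32 or more map to 0, which is the rampdown the message describes.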
Change-Id: Idad9dc4004e71a9c7ec81ba62ebd12fb76fb044a
The only compound mode used with sub-8x8 blocks is COMPOUND_AVERAGE, so
we don't have to send anything in this case.
Change-Id: I90d0162e5f7f1ad205e65094293cde2a48eb77b1
When convolve_round is on, av1_convolve_2d_facade is used for
interpolation rather than av1_convolve. The convolve_round experiment
code will be removed from av1_convolve in another CL.
So far we use 4-bit rounding in the intermediate stage on top of using
post rounding for compound mode after the last stage.
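To illustrate why rounding once after the last stage can help in compound mode, here is a small sketch. It is not the av1_convolve_2d_facade code. The function names are hypothetical, and it assumes non-negative predictions that carry `bits` extra fractional bits.

```c
/* Round-to-nearest right shift, for non-negative v and bits >= 1. */
static int round_shift(int v, int bits) {
  return (v + (1 << (bits - 1))) >> bits;
}

/* Round each prediction to pixel precision first, then average. */
static int compound_round_early(int p0, int p1, int bits) {
  return round_shift(round_shift(p0, bits) + round_shift(p1, bits), 1);
}

/* Average at the higher precision and round only once at the end. */
static int compound_post_round(int p0, int p1, int bits) {
  return round_shift(p0 + p1, bits + 1);
}
```

For p0 = 8 and p1 = 24 with 4 fractional bits (0.5 and 1.5), the exact average is 1.0. Rounding early yields 2, while post rounding yields 1.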
This gives roughly 0.45% gain on lowres, 0.39% on midres, and
roughly 0.6-0.7% on hdres.
Altogether, the gain is 1.15% on lowres, 0.74% on midres, and roughly
1.7-1.8% on hdres.
Note that there is no restriction on the use of the 12-tap filter in
this CL. Adding that restriction, we lose roughly 0.1% again on lowres.
Change-Id: I6332e1d888e28a3b3ddc29711817d66e52cb5cdf
The new_tokenset experiment replaces the unconstrained tokenset with a
multisymbol alphabet in an inventive way.
Tested configurations:
new_tokenset + ec_adapt, new_tokenset, ec_multisymbol
Change-Id: I846ab2e51c2a1dc3f2f9904ed8c47a8e98f853c5
By default, the activity masking is used with PVQ.
In addition to '--enable-pvq', '--enable-daala-dist' is also
required by configure to use the activity masking.
Change-Id: I5100a1db992f0e693e61daf5439de8ae8c64a752
For the fixed-point version of PVQ, which is the current default,
added MAXI(1, ) to limit the minimum companded or expanded gain to one.
Previously, the gain compand/expand function, which is invoked when
activity masking is enabled, sometimes output zero, which
then triggered the assert(gain != 0).
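A minimal sketch of the guard, under stated assumptions: the scaling function below is a hypothetical fixed-point stand-in for the real compand/expand step, and maxi plays the role of MAXI. It only shows how truncation can produce zero and how the lower bound of one avoids it.

```c
/* Same role as the MAXI macro named in the message. */
static int maxi(int a, int b) { return a > b ? a : b; }

/* Hypothetical fixed-point step: multiplies gain by scale_q8 / 256 and
 * truncates, so small gains can collapse to zero. */
static int scale_gain_unguarded(int gain, int scale_q8) {
  return (gain * scale_q8) >> 8;
}

/* Guarded variant: the result is never below one. */
static int scale_gain_guarded(int gain, int scale_q8) {
  return maxi(1, scale_gain_unguarded(gain, scale_q8));
}
```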
Metric change from floating-pt to fixed-pt PVQ is:
  PSNR  PSNR-HVS  SSIM  CIEDE-2000  PSNR Cb  PSNR Cr  MS-SSIM   VMAF
  0.02      0.10  0.08        0.11     0.01     0.02     0.13  -0.30
Change-Id: I64a60d1970d35a26af227841e4a5e50a89ddc44c
Preparation for merging EOB_TOKEN. The block_zero value
corresponds to the first EOB_TOKEN: other EOB_TOKEN values will
be merged with non-zero values.
Change-Id: I94036783ee240fa916a79c544ecd716a9c24fa59
This commit renames deblocking_across_tiles to loopfilter_across_tiles,
to get ready for dering and clpf integration.
Change-Id: Id25b051da9b1e5cb92f35a9619662597462d9537
These are optimized for EOB_TOKEN being associated
with the current position, not the previous.
CBP tables cover EOB_TOKEN for the whole block.
This change causes a performance regression until
EOB_TOKEN is merged into the coding scheme.
Change-Id: Ica3a12ed97285cbae204ce3cc1a7e658ebcacc9f
Allow the above combination of experiments to work together
correctly, fixing an encode/decode mismatch bug when they
were all enabled.
This change causes build_masked_compound(_highbd) to only
ever be called if CONFIG_SUPERTX is off, so wrap these functions
in an '#if !CONFIG_SUPERTX' block.
BUG=aomedia:313
Change-Id: Ic3886bc69ba9624b8fcb0a4c2d71fc64d2c0f22c
Zero, one, and two or more coded as one symbol (head).
Remaining tokens coded as a tail symbol.
The Pareto CDF distribution is adjusted to cover tokens from
two onwards.
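The head/tail split described above can be sketched as follows. The type and function names are hypothetical, and the sketch assumes tokens are consecutive non-negative integers starting at zero (zero, one, two, ...).

```c
enum { HEAD_ZERO = 0, HEAD_ONE = 1, HEAD_TWO_OR_MORE = 2 };

typedef struct {
  int head;     /* HEAD_ZERO, HEAD_ONE, or HEAD_TWO_OR_MORE */
  int has_tail; /* nonzero when a tail symbol follows the head */
  int tail;     /* token - 2, so the tail alphabet starts at "two" */
} TokenSplitSketch;

static TokenSplitSketch split_token(int token) {
  TokenSplitSketch s;
  s.head = token < 2 ? token : HEAD_TWO_OR_MORE;
  s.has_tail = token >= 2;
  s.tail = s.has_tail ? token - 2 : 0;
  return s;
}
```

Only tokens of two or more carry a tail symbol, which is why the tail distribution only needs to cover tokens from two onwards.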
Change-Id: I98b33fab6b9f52690f6ad618ac55e725a97be056
- Added comments for some tables and #defines for clarity.
- Renamed some variables to ensure we use "color_index" instead of
"color" for palette color index related variables.
Change-Id: Ica95a26e0f171a41a3259c8e6b3b891b8cd10151
This commit makes the daala-ec work in the cb4x4 mode. As compared
to --enable-experimental, --enable-experimental --enable-cb4x4
improves the coding performance by:
lowres 2.6%
midres 1.2%
Change-Id: Ifee6f011c80364492c4a547513d24eb2958b5a56
Now that we have a small number of contexts (5), use hash multipliers
(instead of base 11), so that color context hash is within a small
range. This allows us to use a lookup table to get color context
instead of a for loop.
Output bitstreams are bit-exact, so no change in compression.
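A sketch of the idea, not the actual patch: neighbor weights are accumulated per distinct color, sorted, and combined with small hash multipliers so the hash stays in a small range that indexes a lookup table directly. The weights, multipliers, table values, and the neighbor order (left, top-left, top) are illustrative assumptions chosen so that the reachable hashes map to five distinct contexts.

```c
#define NUM_NEIGHBORS 3

/* Assumed values for illustration only. */
static const int kWeights[NUM_NEIGHBORS] = { 2, 1, 2 };
static const int kHashMultipliers[NUM_NEIGHBORS] = { 1, 2, 2 };
/* Indexed by hash (0..8); -1 marks hashes with no context assigned. */
static const int kContextLookup[9] = { -1, -1, 0, -1, -1, 4, 3, 2, 1 };

/* neighbors[] holds the color indices of the left, top-left, and top
 * pixels; a negative entry means that neighbor is unavailable. */
static int color_context_sketch(const int neighbors[NUM_NEIGHBORS]) {
  int colors[NUM_NEIGHBORS];
  int scores[NUM_NEIGHBORS] = { 0, 0, 0 };
  int num_colors = 0;
  int i, j, hash = 0;

  /* Accumulate the weight of each distinct neighboring color. */
  for (i = 0; i < NUM_NEIGHBORS; ++i) {
    if (neighbors[i] < 0) continue;
    for (j = 0; j < num_colors; ++j)
      if (colors[j] == neighbors[i]) break;
    if (j == num_colors) colors[num_colors++] = neighbors[i];
    scores[j] += kWeights[i];
  }

  /* Sort the scores in descending order. */
  for (i = 0; i < NUM_NEIGHBORS - 1; ++i) {
    for (j = i + 1; j < NUM_NEIGHBORS; ++j) {
      if (scores[j] > scores[i]) {
        const int t = scores[i];
        scores[i] = scores[j];
        scores[j] = t;
      }
    }
  }

  /* The weights sum to 5, so the hash never exceeds 8. */
  for (i = 0; i < NUM_NEIGHBORS; ++i) hash += scores[i] * kHashMultipliers[i];
  return kContextLookup[hash];
}
```

With these assumed values, three identical neighbors hash to 5, and three distinct neighbors hash to 8, so a 9-entry table replaces the search loop.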
Change-Id: I8cd8c893048c2fc6b22ccbd56f652d11486e2ee9
This reduces the complexity in a number of ways:
- We need just 3 neighbors instead of 4.
- Possible contexts reduce from 16 to 5.
- On the hardware side, getting the contexts for a whole block will be more
parallelizable.
At the same time, compression performance improves very slightly:
- Screen-content set (videos) (Google): BDRate improved by 0.32
- Screenshots set (images) (AWCY): PSNR improved by 0.62:
https://arewecompressedyet.com/?job=palette_withTR2%402017-01-27T21%3A30%3A28.890Z&job=palette_noTR2%402017-01-27T21%3A41%3A34.312Z
Change-Id: Ie84ca32f05d55ad481a51c2d3abc579468597189
This commit fixes the encoding/decoding mismatch issue when
ext-partition and ext-partition-type are both turned on in cb4x4
mode.
BUG=aomedia:336
Change-Id: I4d6ad5863c9d3bc8e3a41c259b8b39f130164790
Adjusts the value by 1 to make sure that the center tap
of the Wiener filter does not drop below 0.
BUG=aomedia:315
Change-Id: I41c3a2eb3f36dd49072a4873a995003d18f94ece