av1/encoder/rdopt.c:9533 ‘zeromv[1].as_int’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
This warning was spurious given the logic in the if statement.
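The pattern behind this kind of warning can be sketched as follows. The names (int_mv_sketch, pick_zero_mv_sketch) are hypothetical and are not taken from rdopt.c. The second entry is written and read under the same condition, so it is safe, but the compiler may not be able to prove that. Initializing it up front silences the warning without changing behavior.

```c
#include <stdint.h>

/* Hypothetical stand-in for a motion-vector union; not the real libaom type. */
typedef union {
  uint32_t as_int;
  struct {
    int16_t row;
    int16_t col;
  } as_mv;
} int_mv_sketch;

/* Returns the packed MV of the second reference when is_compound is set,
 * and the (zero) MV of the first reference otherwise. */
uint32_t pick_zero_mv_sketch(int is_compound, uint32_t second_ref_mv) {
  int_mv_sketch zeromv[2];
  zeromv[0].as_int = 0;
  zeromv[1].as_int = 0; /* explicit init: keeps -Wmaybe-uninitialized quiet */
  if (is_compound) zeromv[1].as_int = second_ref_mv;
  return is_compound ? zeromv[1].as_int : zeromv[0].as_int;
}
```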
Change-Id: I8ddfe7e46d1bf5593cc8624f05c9f181243a87d4
PVQ replaces the scalar quantizer and coefficient coding with a new
design originally developed in Daala. It currently depends on the
Daala entropy coder although it could be adapted to work with another
entropy coder if needed:
./configure --enable-experimental --enable-daala_ec --enable-pvq
The version of PVQ in this commit is adapted from the following
revision of Daala:
fb51c1ade6
More information about PVQ:
- https://people.xiph.org/~jm/daala/pvq_demo/
- https://jmvalin.ca/papers/spie_pvq.pdf
The following files are copied as-is from Daala with minimal
adaptations, therefore we disable clang-format on those files
to make it easier to synchronize the AV1 and Daala codebases in the future:
av1/common/generic_code.c
av1/common/generic_code.h
av1/common/laplace_tables.c
av1/common/partition.c
av1/common/partition.h
av1/common/pvq.c
av1/common/pvq.h
av1/common/state.c
av1/common/state.h
av1/common/zigzag.h
av1/common/zigzag16.c
av1/common/zigzag32.c
av1/common/zigzag4.c
av1/common/zigzag64.c
av1/common/zigzag8.c
av1/decoder/decint.h
av1/decoder/generic_decoder.c
av1/decoder/laplace_decoder.c
av1/decoder/pvq_decoder.c
av1/decoder/pvq_decoder.h
av1/encoder/daala_compat_enc.c
av1/encoder/encint.h
av1/encoder/generic_encoder.c
av1/encoder/laplace_encoder.c
av1/encoder/pvq_encoder.c
av1/encoder/pvq_encoder.h
Known issues:
- Lossless mode is not supported, '--lossless=1' will give the same result as
'--end-usage=q --cq-level=1'.
- High bit depth is not supported by PVQ.
Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
* changes:
Add av1_ prefix on ###_rd_stats functions
Use init_rd_stats() in encodeframe.c
Add transform block coefficient cost in RD_STATS for debugging
Add helper functions to modify RD_STATS
Add mi_row and mi_col into mbmi to facilitate rd_debug process
Add token cost comparison in write_modes_b()
These functions include:
init_rd_stats()
invalid_rd_stats()
merge_rd_stats()
This CL helps simplify the code.
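A minimal sketch of what helpers like these might look like. The struct layout and the _sketch names are assumptions for illustration; the real RD_STATS type may carry different fields.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical, reduced version of an RD stats record. */
typedef struct {
  int rate;     /* bits, in whatever fixed-point unit the caller uses */
  int64_t dist; /* distortion */
  int skip;     /* 1 if every merged block was skippable */
} rd_stats_sketch;

/* Reset to the identity element for merging. */
void init_rd_stats_sketch(rd_stats_sketch *s) {
  s->rate = 0;
  s->dist = 0;
  s->skip = 1;
}

/* Mark the stats as unusable (for example, a mode that cannot be coded). */
void invalid_rd_stats_sketch(rd_stats_sketch *s) {
  s->rate = INT_MAX;
  s->dist = INT64_MAX;
  s->skip = 0;
}

/* Accumulate src into dst; an invalid operand makes the result invalid. */
void merge_rd_stats_sketch(rd_stats_sketch *dst, const rd_stats_sketch *src) {
  if (dst->rate == INT_MAX || src->rate == INT_MAX) {
    invalid_rd_stats_sketch(dst);
    return;
  }
  dst->rate += src->rate;
  dst->dist += src->dist;
  dst->skip &= src->skip;
}
```

Centralizing these three operations means call sites no longer set each field by hand.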
Change-Id: Id1704d883bd21a039b0478a940994ca14184ae1c
This is just a partial implementation.
Compare token cost of pack_mb_tokens/pack_txb_tokens with token cost
from rate-distortion loop. If there is any difference, dump out mode
info.
Change-Id: I46b373ee2522c5047f799f36baf7cec5fbc06f06
This commit replaces the offset-based block index calculation with
an incremental one. It does not change the coding statistics.
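The difference between the two approaches can be illustrated with a small sketch. The raster-order layout and the function names are assumptions, not the actual encoder code. The offset-based form recomputes the index from the block position each time, while the incremental form carries a running counter through the loop.

```c
/* Offset-based: derive the index from the block position every time. */
int block_index_offset_sketch(int row, int col, int cols) {
  return row * cols + col;
}

/* Incremental: walk the blocks in raster order and bump a counter.
 * Returns how many positions disagree with the offset-based index. */
int count_index_mismatches_sketch(int rows, int cols) {
  int block = 0;
  int mismatches = 0;
  for (int row = 0; row < rows; ++row) {
    for (int col = 0; col < cols; ++col) {
      if (block != block_index_offset_sketch(row, col, cols)) ++mismatches;
      ++block; /* one step per visited block */
    }
  }
  return mismatches;
}
```

Both forms visit the same indices in the same order, which is consistent with the coding statistics being unchanged.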
Change-Id: I3789294eb45416bd0823e773ec30f05ed41ba0dc
* changes:
Reformatting the deringing code
Introducing OD_DERING_SIZE_LOG2 constant (3)
Renaming deringing blockwise write-back functions to make code clearer
Deringing refactoring: replace last_sbc with simpler dering_left flag
Getting rid of the od_dering_in type
This commit refactors the recursive transform block partition
search process to make it support rectangular transform block size
coding.
Change-Id: I0207ae40d83c7eae3cb5d460e403f470747590d3
Allow the transform size writing, reading, and the reconstruction
process to support rectangular transform block size coding.
Change-Id: I57393c73ec60835a088d785ca838d7e3d7eb29a4
Using a struct named dlist rather than an array named bskip. Simplified some
code.
No change in output
Change-Id: Id40d40b19b5d8f2ebafe347590fa1bb8cb80e6e1
The OD_DERING_VERY_LARGE values are now explicitly copied to the buffer instead
of being read from the line buffer when we're on the edge of the frame. This
will make it possible to make the line buffer 8-bit for non-high-bitdepth.
No change in output
Change-Id: I1a4134d67ac7f8c239f08d73941405c56f01050b
Now only buffering three lines across the entire frame and four lines
over the height of one superblock.
No change in output.
Change-Id: I6b99399974e197dc02f2e4ff2e60cdd7fdaa2e43
This introduces a line buffer that holds the last three lines of each original
row so that the next row can be deringed with the original input of the upper
row.
No change in output
Change-Id: I8fad3bc48745e9ce3e440289f453477a0c5442c0
This commit makes the encoding process of the recursive transform
block partition support both rectangular and square transform block
sizes as the starting point. If the coding block size is rectangular,
the transform block size starts from the largest rectangular
transform size and is recursively partitioned down to the selected
coding sizes.
Change-Id: I576628b9166565bada6a918f0a1e67849dfef4cd
The function module in inter_predictor() has been changed to
universally support arbitrary block size inter prediction. Hence
sub8x8mc can be a standalone experiment now.
Change-Id: Ie9d87f61fc317b1d114edb4e0bf5544f918ed08e
The rectangular transform syntax is by default supported, hence
no need to put it under the experimental flag. This does not change
the coding statistics.
Change-Id: I3a147503d973a03400f8a86e11f07c7d754e6234
* changes:
Avoid the "initial copy" in the deringing filter
Only copy the deringed blocks back into the buffer
Reducing copies in deringing filter
sb_all_skip_out() now computes a list of deringed blocks
compute bskip as we go
Revert "Fix dering filter when using 4:2:2 or 4:4:0 subsampling"
This prepares the integration of rectangular transform block size
with recursive transform block partition system.
Change-Id: Id96aa3790dace15619c665f438241938992d1730
Upsampled references currently increase the size of references by
64 times. This patch limits the memory used by the encoder to
about 3GB when encoding high bit depth content.
This should be re-evaluated in the future, if doing 8-tap
resampling in the motion search becomes reasonably fast, or if
the upsampled references are reduced in size (by omitting some
subpel positions and interpolating them instead).
Change-Id: I6d84ff0d6202ec46f4fa53e268e68aa808e5df85
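To get a feel for the numbers, here is a rough size estimate. It assumes the 64x factor comes from upsampling by 8 in each dimension (8 x 8 = 64) and a 4:2:0 chroma layout. Both are assumptions for illustration, and the function name is hypothetical.

```c
#include <stdint.h>

/* Approximate bytes needed for one upsampled 4:2:0 reference frame. */
uint64_t upsampled_ref_bytes_sketch(uint64_t width, uint64_t height,
                                    uint64_t bytes_per_sample) {
  const uint64_t luma = width * height;
  /* Two chroma planes, each subsampled by 2 in both dimensions. */
  const uint64_t chroma = 2 * ((width + 1) / 2) * ((height + 1) / 2);
  const uint64_t upsample_factor = 64; /* assumed: 8x horizontally, 8x vertically */
  return (luma + chroma) * bytes_per_sample * upsample_factor;
}
```

Under these assumptions a 1920x1080 frame at 2 bytes per sample comes to 398131200 bytes, roughly 380 MiB per reference, so a handful of references is enough to reach the 3GB range.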
Reverted commit: f8306bfdc (with some changes).
Reason: This was triggering an assert in debug builds because of zero
probability values. This version replaces those values with an
"UNUSED_PROB" macro, which retains clarity.
Assertion failure can be reproduced as follows:
$ make clean; extra_cflags='-O0 -g -fno-inline' ../../configure \
--enable-debug --enable-experimental --enable-palette && make -j 16
$ ./aomenc -D --codec=av1 ~/videos/screen_content_set/gimp.y4m -o \
/tmp/foo.webm --tune-content=screen --limit=50
Pass 1/2 frame 50/51 8976B 1436b/f 86169b/s 2902620 us
(17.23 fps)
Pass 2/2 frame 25/0 0B 2933053 us 8.52 fps [ETA unknown]
aomenc: ../../av1/encoder/cost.c:46: cost: Assertion `prob != 0' failed.
Aborted (core dumped)
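The idea of the fix can be sketched as below. The macro value, the table contents, and the helper are hypothetical; the point is that a named nonzero placeholder marks unused entries without tripping a `prob != 0` assertion.

```c
#include <stdint.h>

/* Assumed placeholder value; any nonzero probability would serve. */
#define UNUSED_PROB_SKETCH 128

typedef uint8_t prob_sketch;

/* Hypothetical table in which the middle entry is never read in practice. */
static const prob_sketch table_sketch[3] = { 200, UNUSED_PROB_SKETCH, 64 };

/* Returns 1 if every entry would pass a `prob != 0` check, else 0. */
int all_probs_nonzero_sketch(const prob_sketch *probs, int count) {
  for (int i = 0; i < count; ++i) {
    if (probs[i] == 0) return 0;
  }
  return 1;
}
```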
Change-Id: I47a76b8f415060909bc8448fae3002857eb61d8e
This commit allows the partition context model to account for the
maximum transform block size of the coding block.
Change-Id: I22b91e85fff70faa974afd362ce327d3f2eda81d
- Add unit tests to verify the bit-exact result.
- User level time reduction (EXT_TX):
encoder: 3.63%
decoder: 2.36%
- Also add tx_type=V_DCT...H_FLIPADST SSE2 for 16x16 inv txfm.
Change-Id: Idc6d9e8254aa536e5f18a87fa0d37c6bd551c083
* changes:
Reverse order of CLPF and dering
Refactor: read_tx_size_probs()
Fix compiling issues with --enable-ec-adapt
Fixes compilation error on Windows/Visual Studio
This commit adds the simp-mv-pred experiment. It works on top of the
ref-mv experiment to save memory bandwidth and reduce the size of the
line buffer needed by ref-mv.
When compared to ref-mv, this experiment showed:
low-delay BDR gain: 0.03%
high-delay BDR gain: 0.01%
memory/memory bandwidth saving: 40%
local memory/gate count saving: 20%
Change-Id: Ic4006e041fc58ede411da83d0d730c464ebe1749
This commit fixes the top-right reference block location for block
sizes above 8x8. It improves the coding performance of ref-mv:
lowres 0.08%
midres 0.15%
Thanks to jiafeng@ for finding this issue.
Change-Id: I70750fc7b18bf0126d3e07abc1b63ca5a160193e
This prevents a crash if the upsample_refs speed feature is
changed as part of set_size_dependent_vars, when the recode
loop is enabled.
Change-Id: I645e389bfe961879dd2001439a34fde2993868d9
This CL will cause
0.122% PSNR drop on lowres dataset
0.059% PSNR drop on midres dataset
However, it will facilitate hardware implementation.
Change-Id: I0a0713acacbfd571509a721337711c021915dd3c
The EC_ADAPT experiment cannot work unless EC_MULTISYMBOL is also
enabled.
This patch replaces all individual checks with a centralized check in
both bitreader.h and bitwriter.h.
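A centralized check of this kind is typically a preprocessor guard near the top of the header. The sketch below assumes the CONFIG_* macros are 0/1 values supplied by the build, defaults them to 0 so the snippet compiles standalone, and adds a hypothetical helper only to make the condition testable.

```c
/* Defaults so the sketch compiles on its own; a real build would supply
 * these from its generated configuration header. */
#ifndef CONFIG_EC_ADAPT
#define CONFIG_EC_ADAPT 0
#endif
#ifndef CONFIG_EC_MULTISYMBOL
#define CONFIG_EC_MULTISYMBOL 0
#endif

/* One guard in the header replaces scattered per-call-site checks. */
#if CONFIG_EC_ADAPT && !CONFIG_EC_MULTISYMBOL
#error "CONFIG_EC_ADAPT requires CONFIG_EC_MULTISYMBOL to be enabled."
#endif

/* Hypothetical runtime mirror of the compile-time condition. */
int ec_config_is_consistent_sketch(void) {
  return !CONFIG_EC_ADAPT || CONFIG_EC_MULTISYMBOL;
}
```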
Change-Id: I418852d95c5012cc074ed65cd24997e08bc2aadd
The new ec_multisymbol experiment supersedes the rans experiment and is
used for multisymbol features that can be backed by either daala_ec or
rans.
This experiment is automatically enabled by ec_adapt and will try to
enable daala_ec or ans (in that order).
Change-Id: Ie75b4002b7a9d7f5f7b4d130c1aacb3dbe97e54f
This experiment performs symbol-by-symbol statistics
adaptation for non-binary symbols. It requires either
DAALA_EC, or both ANS and RANS, to be enabled. The adaptation
is currently based on a simple recursive filter taken from
Daala, with an adaptation rate that depends on alphabet size.
It applies wherever non-binary symbols are encoded using
cumulative distribution functions (CDFs) rather than trees.
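As a rough illustration of a recursive-filter CDF update, consider the sketch below. It is not the Daala code. The 15-bit total, the function name, and passing the rate as a parameter are assumptions; the commit does not spell out the rule that maps alphabet size to rate. Real implementations also keep every symbol's probability above zero, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

#define CDF_TOTAL_SKETCH 32768 /* assumed 15-bit probability scale */

/* cdf[i] holds the cumulative count up to and including symbol i, so
 * cdf[nsymbs - 1] == CDF_TOTAL_SKETCH. After coding symbol `val`, move
 * each entry a fraction 1/2^rate of the way toward the distribution
 * that puts all mass on `val`. */
void adapt_cdf_sketch(uint16_t *cdf, int nsymbs, int val, int rate) {
  assert(val >= 0 && val < nsymbs);
  for (int i = 0; i < nsymbs; ++i) {
    if (i >= val) {
      /* Entries at or above val drift toward the total. */
      cdf[i] += (uint16_t)((CDF_TOTAL_SKETCH - cdf[i]) >> rate);
    } else {
      /* Entries below val drift toward zero. */
      cdf[i] -= (uint16_t)(cdf[i] >> rate);
    }
  }
}
```

Starting from a uniform 4-symbol CDF {8192, 16384, 24576, 32768} and coding symbol 1 with rate 4 gives {7680, 17408, 25088, 32768}, so the probability of symbol 1 rises from 8192 to 9728 out of 32768.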
Where symbols are adapted, forward updates in the compressed
header are removed.
In the case of RANS, coefficient token values are adapted,
with the exception of the zero token, which remains a
binary symbol. In the case of DAALA_EC, other values
such as inter and intra modes are also adapted, since CDFs
are provided in those cases.
The experiment is configured with:
./configure --enable-experimental --enable-daala-ec --enable-ec-adapt
or
./configure --enable-experimental --enable-ans --enable-rans \
--enable-ec-adapt
EC_ADAPT is not currently compatible with tiles.
BDR results on Objective-1-fast give a small loss:
PSNR YCbCr: 0.51% 0.49% 0.48%
PSNRHVS: 0.50%
SSIM: 0.50%
MSSSIM: 0.51%
CIEDE2000: 0.50%
Change-Id: I3888718e42616f3fd87144de7f125228446ac984