Commit Graph

1839 Commits

Author SHA1 Message Date
Thomas Davies 0ccefe21af EC_MULTISYMBOL: merge ZERO_TOKEN into coding scheme.
Zero, one, and two or more coded as one symbol (head).
Remaining tokens coded as a tail symbol.

The pareto CDF distribution is adjusted to cover tokens from
two onwards.

Change-Id: I98b33fab6b9f52690f6ad618ac55e725a97be056
2017-01-31 11:13:04 +00:00
Urvang Joshi 23a611173b Palette code: add comments and rename some variables.
- Added comments for some tables and #defines for clarity.
- Renamed some variables to ensure we use "color_index" instead of
"color" for palette color index related variables.

Change-Id: Ica95a26e0f171a41a3259c8e6b3b891b8cd10151
2017-01-30 15:09:57 -08:00
Yue Chen d0d3bccf14 Fix conflicts between cb4x4 and warped_motion
Set mi_size properly in findSample()

Change-Id: I26bae25bf6300a107108dc5c2b7098e7d7dfa750
2017-01-30 22:04:02 +00:00
Jingning Han 1992af1b98 Make cb4x4 work with daala-ec
This commit makes the daala-ec work in the cb4x4 mode. As compared
to --enable-experimental, --enable-experimental --enable-cb4x4
improves the coding performance by:

lowres 2.6%
midres 1.2%

Change-Id: Ifee6f011c80364492c4a547513d24eb2958b5a56
2017-01-30 19:39:11 +00:00
Urvang Joshi cdbe708581 Palette Optimization: O(1) context lookup.
Now that we have a small number of contexts (5), use hash multipliers
(instead of base 11), so that color context hash is within a small
range. This allows us to use a lookup table to get color context
instead of a for loop.

Output bitstreams are bit-exact, so no change in compression.

Change-Id: I8cd8c893048c2fc6b22ccbd56f652d11486e2ee9
2017-01-30 17:48:46 +00:00
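Note on cdbe708581: the idea of replacing a search loop with a table lookup can be sketched as follows. This is only an illustration under stated assumptions: the multipliers, the table contents, the hash range 0..8, and the helper name color_context_from_scores() are made up for this sketch and are not taken from the commit. The point is that when the hash of the sorted neighbor scores stays in a small range, a fixed array indexed by the hash returns the context directly.

```c
#include <assert.h>

/* Illustrative values only: the multipliers and table contents are
 * assumptions chosen so that every reachable hash stays within 0..8. */
#define NUM_NEIGHBORS 3
#define MAX_HASH 8

static const int kHashMultipliers[NUM_NEIGHBORS] = { 1, 2, 2 };
/* Maps hash -> context; -1 marks hashes assumed never to occur. */
static const int kCtxLookup[MAX_HASH + 1] = { -1, -1, 0, -1, -1, 4, 3, 2, 1 };

/* scores[] is expected in descending order, with unused entries set to 0. */
static int color_context_from_scores(const int scores[NUM_NEIGHBORS]) {
  int hash = 0;
  for (int i = 0; i < NUM_NEIGHBORS; ++i) {
    hash += scores[i] * kHashMultipliers[i];
  }
  assert(hash >= 0 && hash <= MAX_HASH);
  /* O(1): one array read instead of searching a list of known hashes. */
  const int ctx = kCtxLookup[hash];
  assert(ctx >= 0);
  return ctx;
}
```

With these assumed values, five distinct hashes (2, 5, 6, 7, 8) are reachable, which matches the five contexts mentioned in the message.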
Urvang Joshi 199a2f4052 Palette: Don't use top-right pixel for context of color indices.
This reduces the complexity in a number of ways:
- We need just 3 neighbors instead of 4.
- Possible contexts reduce from 16 to 5.
- On hardware side, getting the contexts for a whole block will be more
parallelizable.

At the same time, compression performance improves very slightly:
- Screen-content set (videos) (Google): BDRate improved by 0.32
- screenshots set (images) (AWCY): PSNR improved by 0.62:
https://arewecompressedyet.com/?job=palette_withTR2%402017-01-27T21%3A30%3A28.890Z&job=palette_noTR2%402017-01-27T21%3A41%3A34.312Z

Change-Id: Ie84ca32f05d55ad481a51c2d3abc579468597189
2017-01-30 17:48:46 +00:00
Jean-Marc Valin 79c0f32c58 Remove DCT from od_compute_dist_8x8
Cherry-pick Daala e248823a
 Getting rid of the DCT in od_compute_dist_8x8()
Replacing the DCT and frequency weighting by a filter

Change-Id: Icc3a46e5dbb561e4e3b00fa6c2290d54299c05cb
2017-01-30 09:46:15 +00:00
Jingning Han 86e277911a Fix ext-partition/type in cb4x4 mode
This commit fixes the encoding/decoding mismatch issue when
ext-partition and ext-partition-type are both turned on in cb4x4
mode.

BUG=aomedia:336

Change-Id: I4d6ad5863c9d3bc8e3a41c259b8b39f130164790
2017-01-27 13:58:08 -08:00
Debargha Mukherjee 4bab6e4f58 Adjust WIENER_FILT_TAP2_MIDV value to fix convolve
Adjusts the value by 1 to make sure that the center tap
of the Wiener filter does not drop below 0.

BUG=aomedia:315

Change-Id: I41c3a2eb3f36dd49072a4873a995003d18f94ece
2017-01-27 17:56:17 +00:00
Jonathan Matthews 6d69ba0c74 Bugfix: decode_palette_tokens inverting stride and width.
Introduced in I745ca032f313c5041aacc98c03ae4bfc33d840de.
Stride should be plane_block_width and width should be cols,
 sanity check: cols <= plane_block_width.

Change-Id: Ic5128e94a909e498010c92fef2013da8df6d6d85
2017-01-27 17:26:15 +00:00
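Note on 6d69ba0c74: the difference between stride and width can be sketched as below. The helper copy_color_map() and its signature are hypothetical, not the function touched by the commit. The stride is the distance between row starts in the source buffer, the width (cols) is how many entries per row are actually read, and the assertion mirrors the sanity check named in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: copy a rows x cols map of color indices out of a
 * buffer whose rows start `stride` entries apart. */
static void copy_color_map(const uint8_t *src, int stride, uint8_t *dst,
                           int rows, int cols) {
  assert(cols <= stride); /* sanity check: cols <= plane_block_width */
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      /* Row starts use the stride; only `cols` entries per row are valid. */
      dst[r * cols + c] = src[r * stride + c];
    }
  }
}
```

Swapping the two arguments would read rows at the wrong offsets whenever cols is smaller than the stride.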
Thomas Davies dbfc4f9cc0 TILE_GROUPS: code a single tile group more efficiently.
Change-Id: If6efdb754558e3f237aa2d56c0eae4590fb021a4
2017-01-27 15:18:06 +00:00
Debargha Mukherjee 1a0ae84dab Fix OneByOneVideoTest for loop-restoration
Fixes and turns on the test.

BUG=aomedia:312

Change-Id: I6c7d1970e743ec2b025a798070761d22624e796a
2017-01-27 06:06:04 +00:00
Debargha Mukherjee 9868c7479a Fix crash with cb4x4 and warped-motion
BUG=aomedia:314

Change-Id: I66af7f69ca0b97b9d840918a6b9ec34708a7f4e5
2017-01-27 00:54:05 +00:00
hui su 83c2663677 Refactor rd_pick_intra_sbuv_mode()
Change-Id: Id86b48ad34059668beb9464200dd9e03fc1b8a48
2017-01-27 00:42:44 +00:00
Yaowu Xu 006ff4be43 Change to initialize correct thread_data
BUG=aomedia:307

Change-Id: Ia1d39916b3e856acd33f4e199321395455151fb6
2017-01-26 23:05:35 +00:00
Debargha Mukherjee ff59b6acb1 Fix mismatch with ref-mv and ext-partition-types
Change the list of search offsets searched when ext-partition-types
is on for square block_sizes. This is because the VERTICAL_A and
HORIZONTAL_A partitions are incompatible with the default list.

BUG=AOMEDIA:141

Change-Id: I884c45c3d11039b7dcb72336a928362f926473ed
2017-01-26 20:48:18 +00:00
Urvang Joshi 56ba91bbe4 Palette: Don't store tokens for pixels outside image boundary.
If part of a block falls outside right and/or bottom image boundary,
then only store tokens for the part of it within the boundary.

Also, consider only the part of the block within the boundary when
calculating the number of colors in the image, deciding the base
colors for palette, RD calculation etc.

The part of color map corresponding to pixels outside the image
boundary is padded with color indices copied from same row/column.
This behavior is similar to how pixels outside the boundary are padded.

For screen_content set, this improves compression performance by
0.038 overall. One clip, in particular, has a significant gain of 0.8.

Change-Id: I745ca032f313c5041aacc98c03ae4bfc33d840de
2017-01-26 18:03:33 +00:00
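Note on 56ba91bbe4: the padding of the color map outside the image boundary might look roughly like the sketch below. The helper pad_color_map() and its parameters are hypothetical, and the order of operations (rows first, then the area below) is an assumption. The sketch assumes that only the top-left rows x cols region of the block holds valid indices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: `map` is a block_rows x block_cols color map whose
 * valid indices occupy only the top-left rows x cols region. */
static void pad_color_map(uint8_t *map, int block_cols, int block_rows,
                          int cols, int rows) {
  assert(cols >= 1 && cols <= block_cols);
  assert(rows >= 1 && rows <= block_rows);
  /* Extend each valid row to the right with its last valid index. */
  for (int r = 0; r < rows; ++r) {
    for (int c = cols; c < block_cols; ++c) {
      map[r * block_cols + c] = map[r * block_cols + cols - 1];
    }
  }
  /* Fill the rows below by copying the last valid (already padded) row. */
  for (int r = rows; r < block_rows; ++r) {
    for (int c = 0; c < block_cols; ++c) {
      map[r * block_cols + c] = map[(rows - 1) * block_cols + c];
    }
  }
}
```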
ltrudeau e1c0929f51 Convert PVQ skip variable to enum
Creates the PVQ_SKIP_TYPE enum to encapsulate the different types of
skipping that can be signaled by PVQ (i.e. skip: AC, DC or both).

There is no impact on the bitstream. However, the decoder will now emit
an internal error if the decoded skip flag is out of range. The
block_skip variable is also renamed to ac_dc_coded as it stores the same
information.

Change-Id: Ib2aadaf99dc1736ea392ae5ed8948c3cdc12da9b
2017-01-26 17:36:04 +00:00
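Note on e1c0929f51: an enum with a decoder-side range check could be sketched as follows. Only the type name PVQ_SKIP_TYPE comes from the message; the enumerator names, their values, and the helper parse_skip_flag() are assumptions made for illustration.

```c
#include <assert.h>

/* Enumerator names and values are assumptions for this sketch. */
typedef enum {
  PVQ_SKIP = 0x0,    /* neither DC nor AC is coded */
  DC_CODED = 0x1,    /* only DC is coded */
  AC_CODED = 0x2,    /* only AC is coded */
  AC_DC_CODED = 0x3  /* both are coded */
} PVQ_SKIP_TYPE;

/* Hypothetical decoder-side check: returns 1 and stores the value when the
 * decoded flag is in range, 0 otherwise (where a decoder could raise an
 * internal error). */
static int parse_skip_flag(int decoded, PVQ_SKIP_TYPE *out) {
  assert(out);
  if (decoded < PVQ_SKIP || decoded > AC_DC_CODED) return 0;
  *out = (PVQ_SKIP_TYPE)decoded;
  return 1;
}
```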
Debargha Mukherjee 8b61321690 Fix mismatch w/ ext-inter/warped-motion/motion-var
Fixes a mismatch issue with ext-inter+motion-var+warped-motion
due to unset num_proj_ref values.

BUG=aomedia:311

Change-Id: I042551f6c53e8cc005f2133704a03b243c98c12a
2017-01-26 02:05:45 +00:00
hui su 78c611ab7f Speed up palette keyframe encoding with model RD
On keyframe, 18% speedup, 0.02% compression loss.

Change-Id: I29085ec23dd145effbea58852a46cd7f4dea8a46
2017-01-25 23:47:03 +00:00
hui su 8f4cc0a351 Speed up filter-intra keyframe encoding with model RD
On keyframe, 22% speedup, 0.04% compression loss.

Change-Id: I70d387cc9de86c0c0c8b0037d35cff141409d59b
2017-01-25 23:38:57 +00:00
hui su 0161a93266 Cleanup for the entropy experiment
Minor performance changes
0.03% better on lowres
0.01% better on midres

Change-Id: I7a7168f3a2a4d17a03353841a416eff6edf1e241
2017-01-25 22:45:22 +00:00
hui su 9a416f5721 Speed up ext-intra keyframe encoding with model RD
On keyframe, 18% speedup, 0.07% compression loss.

Change-Id: I98323db23251c70958a314f16fd6d789579017ec
2017-01-25 22:44:20 +00:00
Tristan Matthews 54e197749d pvq: skip gshift calculation in float pvq case
Cherry-picked from daala commit 28de40bfcd84e7df3fbd64de7b89dd7fd889bb27

Change-Id: I31af05f07514c023c5be84f7e2ae353ab7d276f0
2017-01-25 21:18:14 +00:00
James Zern e2a703e340 av1_dct_test: fix duplicate symbol link error
only expose the static functions needed in the test file to avoid link
errors for e.g., av1_fht4x4_c

Change-Id: I35111d322f30bc2bfc57b32c11f691f0717cfaba
2017-01-25 19:43:06 +00:00
Debargha Mukherjee 63131ea6e3 Silence a compiler warning
Change-Id: I130f748c076a1642f12b95051dab19bfdac5b855
2017-01-25 05:10:06 +00:00
David Barker 839467f42c Make ext-inter use new rectangular intra predictor
Now that https://aomedia-review.googlesource.com/#/c/6729/
has been merged, build_intra_predictors_for_interintra() is
now redundant, so replace it by a direct call to
av1_predict_intra_block() and remove the old function.

Reset rect_interintra back to 1.

To do this, we need to make the intra predictor take a
BLOCK_SIZE instead of a TX_SIZE. This is because we need to
be able to predict 32x64 and 64x32 blocks, but there is no
TX_32X64 or TX_64X32.

No effect on output or performance.

Change-Id: I8c185a211c97a85012cc54ec293c785a693608ed
2017-01-24 21:37:24 +00:00
Yaowu Xu a93e65e5c6 Fix a couple of typos
Change-Id: Ibec40c3cd8e14343b096e406ba233cf4f131e7b9
2017-01-24 20:35:31 +00:00
Angie Chiang c71d613097 Fix bitstream error when entropy and adapt_scan on
BUG=aomedia:310

Change-Id: I8e1a1c6d59e3d14ba132d2bbf4e203da26538bde
2017-01-24 20:06:31 +00:00
Jingning Han 61418bbd1f Fix conflicts between ext-inter and cb4x4 modes
Resolve the broken coding pipeline in ext-inter experiment when
cb4x4 mode is enabled. Turn off rectangular inter-intra mode.
This needs some more work to hook up. Given that it gives fairly
limited coding performance gains, disable it for the moment.

BUG=aomedia:309

Change-Id: I9b406df6183f75697bfd4eed5125a6e9436d84b0
2017-01-24 18:18:19 +00:00
Fangwen Fu 8d164de25c enable explicit temp mv prediction signaling
Change-Id: Ieb2922c3df4ef4f8514b8a6df6f9a8fc45ef3cf4
2017-01-23 14:22:45 -08:00
Yaowu Xu 6b763c9c9e Fix issues in --enable-entropy and --enable-cb4x4
Change-Id: I148d60d56599a238c60c429572a25cbddbe5191d
2017-01-23 21:50:06 +00:00
Emil Keyder 01770b3e20 Rename NONE to NONE_FRAME.
This follows the naming for the other frame types, and allows libaom
to be compiled against other libraries that also #define NONE.

Change-Id: Ic2e2814587bbc5ea67385a9af775396d29b7dde0
2017-01-23 21:12:35 +00:00
David Barker 13797462df Warp filter improvements
* The restriction on the parameter 'delta' was too strict, so we
  loosen it (delta only ever gets multiplied by -4, ... , 4,
  whereas beta gets multiplied by -7, ..., 7)
* Correct a comment about the border clamping
* Fix an issue with the test case

Change-Id: I30e55203455ba6e419b5a8b646151a6d1fd5cc3b
2017-01-23 20:46:22 +00:00
Yushin Cho 7a428ba243 Add a new experiment, DAALA_DIST
This commit adds a new experiment, Daala's distortion function,
which is designed to better approximate perceptual distortion
in 8x8 pixel blocks.

This experiment is expected to work best with PVQ.

It measures the variance of overlapped 4x4 regions in the 8x8 area,
then uses these variances to scale the MSE of weighted frequency domain
distortion of 8x8 block.

Since AV1 calculates distortion in blocks as small as 4x4, it is not possible to
directly replace the existing distortion functions of AV1,
such as dist_block() and block_rd_txf().
Hence, there have been substantial changes in order to apply
Daala's 8x8 distortion function.
The daala distortion function is applied
after all 4x4 tx blocks in a 8x8 block are encoded (during RDO),
as in below two cases:
1) intra/inter sub8x8 predictions and
2) 4x4 transform with prediction size >= 8.

To enable this experiment, add '--enable-daala-dist' with configure.

TODO: Significant tuning of parameters is required since the function
originally came from Daala, so most parameters would not work
correctly outside Daala.
The fact that chroma distortion is added to the distortion of AV1's RDO is
also critical since Daala's distortion function is applied to luma only
and chroma continues to use MSE.

Change-Id: If35fdd3aec7efe401f351ba1c99891ad57a3d957
2017-01-23 20:24:57 +00:00
Jingning Han 48b1cb35bb Support filter-intra in cb4x4 mode
This commit resolves an enc/dec mismatch issue when both filter-intra
and cb4x4 modes are enabled.

BUG=aomedia:253

Change-Id: I4026d93c00a819f2ce69aedba9d34a774319acbf
2017-01-23 20:20:30 +00:00
Angie Chiang 54294194c5 Fix segmentation fault of dual_filter in hbd mode
BUG=aomedia:142

Change-Id: Id21dd2d19e1e46a9225cd5f8f8b0705ae178118c
2017-01-23 16:10:28 +00:00
David Barker 561eb72f43 Fix a typo in a comment for ext-inter
Change-Id: I2a20b3eb8020e3e3592d284737acd5da13bad103
2017-01-22 03:40:58 +00:00
Jingning Han 758b2ceba3 Make adapt-scan support rectangular transform block sizes
This commit enables the adaptive scan order system to support
rectangular transform block sizes. It resolves the coding failure
when rect-tx or var-tx are enabled.

BUG=aomedia:143

Change-Id: Ic565284e811e3f7e0ebf2e08fb3748257ce8a049
2017-01-21 21:05:10 +00:00
Jingning Han 07ef967d39 Resolve coding failure in var-tx
Fix an encoding failure issue when var-tx is enabled, while ext-tx
and rect-tx are disabled. This doesn't change coding statistics
when all are enabled.

Change-Id: I4b32387a0a1497380980f8087832aaf6467cdcbe
2017-01-21 21:04:42 +00:00
Jingning Han 3daa4fda6c Support rectangular tx size in cb4x4 mode
This commit makes ext-tx and rect-tx experiments supported in the
cb4x4 mode. It resolves an enc/dec mismatch issue when all the
transform experiments are enabled.

The coding gains are
        ext-tx + rect-tx   cb4x4    vartx     total
lowres      4.0%           2.3%      0.5%     6.9%

The encoding speed is about the same when cb4x4 and vartx are
further enabled.

BUG=aomedia:139

Change-Id: I3fdabc6d5de23ceb78ac0751a9bf7332ebc0a3ac
2017-01-21 21:04:27 +00:00
Angie Chiang b9b42a0ac1 Fix mismatch when dual_filter daala-ec both are on
BUG=aomedia:132

Change-Id: I5c3214ddbc97576a2e90a070f2bdccc15be50d65
2017-01-20 23:06:29 +00:00
Yue Chen 09c0a5bcf5 Remove unused input in av1_encode_mv()
Change-Id: I54a0fbaeb59de0907a17b73dab4170cf62b4fd8d
2017-01-20 21:57:55 +00:00
Peter Boström a1f6432dfa Add decoder controls for getting last quantizer.
Also adds --framestats=file.csv to aomdec that prints a CSV file with
frame size and frame QP.

Change-Id: I3b70c4b3df35d0b97bdd83cfc4631f096573b4a2
2017-01-20 21:51:11 +00:00
Yue Chen 80a15c9bb6 Set default warped motion model to rotation-zoom
Change-Id: Ied58b6e4a15259cf24e3ee490c042767f4a48f16
2017-01-20 19:50:20 +00:00
Jingning Han 0f6a60a98f Add missing break statement in get_entropy_contexts_plane()
Change-Id: Ia2fe7cb7f0d0a98d3050c1db059ec04e3735e1ec
2017-01-20 18:14:48 +00:00
Jingning Han f4e097b486 Fix rectangular tables in cb4x4 mode
Account for the additional block sizes in these tables.

Change-Id: Iae940f28671714caaf32432940752958ef66f6d5
2017-01-20 18:14:48 +00:00
Debargha Mukherjee 3eb713e287 Fix loopfilter for rectangular transforms
Properly determine and use horizontal and vertical masks
for loop filtering when rectangular transforms are used.

Fixes an intermittent mismatch issue and improves coding
efficiency.

BDRATE results for ext-tx + rect-tx:
lowres: -3.739% (up from -3.443%)
midres: -3.366% (up from -3.006%)

Change-Id: If26fa14261f3893662eb1245f0b876d68513247c
2017-01-20 17:36:35 +00:00
Angie Chiang caa9e5adf9 Refactor av1_convolve
Move declaration of filter_params_x/y outside of if/else block

Change-Id: I4f908872b7ff85b440a12a535d939a3c137aaab5
2017-01-20 17:04:45 +00:00
Angie Chiang 117aa0dc6c Add CONVOLVE_POST_ROUNDING flag
By turning on CONVOLVE_POST_ROUNDING, in the compound inter
prediction mode, FILTER_BITS rounding is moved after the summation
of two predictions.

Note that the post rounding is only applied on non-sub8x8 block

       PSNR     BDRate
lowres -0.808%  -0.673%

Change-Id: Ib91304e6122c24d832a582ab9f5757d33eac876c
2017-01-20 17:04:45 +00:00
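Note on 117aa0dc6c: the effect of moving the rounding after the summation can be shown with a small arithmetic sketch. It assumes FILTER_BITS is 7 and uses a rounding macro written out here for the example; the two helper functions are hypothetical and operate on single non-negative filter sums rather than on whole blocks.

```c
#include <assert.h>

#define FILTER_BITS 7 /* assumed filter precision for this sketch */
#define ROUND_POWER_OF_TWO(value, n) (((value) + ((1 << (n)) >> 1)) >> (n))

/* Round each prediction first, then average the two rounded values. */
static int compound_round_then_sum(int sum0, int sum1) {
  assert(sum0 >= 0 && sum1 >= 0);
  const int p0 = ROUND_POWER_OF_TWO(sum0, FILTER_BITS);
  const int p1 = ROUND_POWER_OF_TWO(sum1, FILTER_BITS);
  return ROUND_POWER_OF_TWO(p0 + p1, 1);
}

/* Add the unrounded filter sums, then round once (post rounding). */
static int compound_sum_then_round(int sum0, int sum1) {
  assert(sum0 >= 0 && sum1 >= 0);
  return ROUND_POWER_OF_TWO(sum0 + sum1, FILTER_BITS + 1);
}
```

For the filter sums 12864 and 12800 the exact average is 100.25. Rounding each prediction first gives 101 and 100, which average to 101, while summing first and rounding once gives 100. The single rounding step avoids accumulating two rounding errors.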
Thomas Davies 77c7c40f8b EC_ADAPT: use tile context for switchable filter.
Change-Id: I7bbd3c62341ede45628641766b8683b77f3a7efb
2017-01-20 15:30:49 +00:00
Thomas Davies 1de6c88a67 EC_ADAPT: use tile context for inter mode.
Change-Id: I522dfe77cbe0ea4833d11e25386586d7312c463f
2017-01-20 15:30:27 +00:00
Thomas Davies cef09627cf EC_ADAPT: use tile context for TX size.
Change-Id: Idd7926f0539a0bd039828e5882392fbfb024e531
2017-01-20 15:30:07 +00:00
ltrudeau 472f63f4a6 Replace Skip with AC/DC coded in PVQ
Instead of returning skip, av1_pvq_encode_helper and od_pvq_encode now
return ac_dc_coded. This gives more information on whether the DC part
or the AC part was skipped.

Although it is possible to obtain ac_dc_coded from the pvq_info struct,
this struct is not always used, in which case the information was lost.

This change does not impact the bitstream.

Change-Id: Ie303de915f74e8da384f822332eb1aa27f677bd3
2017-01-20 15:14:40 +00:00
Thomas Davies c2ec0e4e3f EC_ADAPT: use tile context for partition type.
Change-Id: I4b53dab674390496d8fe7299970c5fb327b5a7be
2017-01-20 14:12:06 +00:00
Thomas Davies 489dad8ffe EC_ADAPT: use tile context for coefficients.
Change-Id: I61433d0c0bbab9b7cf74a405cbedd60965318888
2017-01-20 14:11:35 +00:00
Thomas Davies 1bfb5edac3 EC_ADAPT: use tile context for intra mode syntax.
Change-Id: Id01c785ad48134075c4f6643233413564f0b8fbc
2017-01-20 14:11:12 +00:00
Thomas Davies 2452329ad1 EC_ADAPT: use tile context for MV data.
Change-Id: I71c9bedfae2304c201fe6621a20c03f4e26a85cf
2017-01-20 14:10:39 +00:00
Jingning Han 456e0864dd Fix enc/dec mismatch due to ext-partition-type in cb4x4 mode
This commit fixes an enc/dec mismatch issue in ext-partition-type
in the cb4x4 mode.

BUG=aomedia:137

Change-Id: I19f538a967a6059a40b1668eed076bc315b46149
2017-01-20 03:53:42 +00:00
Jingning Han d9c24a33b5 Fix intra block coding order in ext-partition-type
Fix the intra block coding order when both ext-partition-type and
cb4x4 modes are turned on.

Change-Id: Iaaaf4742c53c4778526974f9d1dfdaed6ca3ce3c
2017-01-20 03:53:33 +00:00
Jingning Han 58bc4cc024 Support ext-partition in cb4x4 mode
This commit resolves the coding pipeline breakage when ext-partition
and cb4x4 are both enabled.

BUG=aomedia:138

Change-Id: Ic17da68af80d7a66565b0e1c69b895be27282a9a
2017-01-20 03:53:19 +00:00
Alex Converse 6f345c6a0d Use OD_ILOG_NZ in OD_DIVU_SMALL_CONSTS
If _d == 0 we are already off to the UB races due to out of bounds
access in OD_DIVU_SMALL_CONSTS.

Change-Id: I55a76c51483885bbb38667f14836be9830e130a8
2017-01-20 02:22:27 +00:00
Jingning Han b5bb244c31 Free up all the allocated mem space in cb4x4
Use the right number to free up the allocated memory space in
context space.

Change-Id: Ic2950c133d6234b9a4216283a6f4a1dea13128d1
2017-01-20 01:31:14 +00:00
Debargha Mukherjee e6044fecd6 Change the warp filter to use real 8-tap
The warp filter for the (0,1) case is changed to use a real
8-tap filter.

Improves coding efficiency.

BDRATE on lowres:
-0.772% (up from -0.633%) with --enable-global-motion
-1.124% (up from -1.001%) with --enable-warped-motion

Change-Id: I296efe36dbc72a7af74773b71b445f19a2aa7205
2017-01-20 01:02:50 +00:00
Sarah Parker 2e6048874c Do masked motion search based on COMPOUND_TYPE
Change-Id: I2d1b5f57a3bb19eb8c00eb4c2e6c7835047dc4ac
2017-01-19 23:09:50 +00:00
James Zern 7a266e295b ransac.c: define _POSIX_C_SOURCE for rand_r
rand_r() isn't visible by default with -std=c99. this can be changed to
_POSIX_SOURCE after -std=c99 is enabled; the portability of rand_r()
can be addressed in a future change.

BUG=aomedia:111

Change-Id: Id540f7f4a70007f70585261814b6fb09925fb32b
2017-01-19 19:38:54 +00:00
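Note on 7a266e295b: a feature test macro has to be defined before the first system header is included, otherwise it has no effect on which declarations are visible. The sketch below assumes the value 200112L, which the message does not state, and the helper random_index() is hypothetical.

```c
/* Must come before any system header so that <stdlib.h> declares rand_r()
 * even when compiling with -std=c99. The value is an assumption. */
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: draw a pseudo-random index in [0, n) from a
 * caller-owned seed, so no global generator state is touched. */
static int random_index(unsigned int *seed, int n) {
  assert(seed && n > 0);
  return rand_r(seed) % n;
}
```

Because the seed is owned by the caller, two calls that start from the same seed value produce the same index, which keeps a randomized search reproducible.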
James Zern 1a5223224f odintrin.h: define M_PI fallback
+ M_SQRT2 / M_SQRT1_2 to keep the daala diffs down

adapted from:
ebb9b28 Move math.h fills to odintrin.h.

these aren't visible by default with -std=c99.

BUG=aomedia:111

Change-Id: Iaa65986f35d914bf92c8c49a8211e0e6864c64e4
2017-01-19 19:38:54 +00:00
hui su 308a6397e5 Speed up keyframe encoding with model RD
model_rd_for_sb() can quickly compute an approximated RD cost. We
use the estimated RD cost to skip running full RD for some bad
mode candidates.

This only affects keyframe encoding. Observed 22% encoding time
reduction, and 0.03% compression loss.

Change-Id: I793f1eda98d67e8da9bc1648dcf272222b30a556
2017-01-19 19:05:40 +00:00
Alex Converse eb780e7167 Add a control to set the ANS window size
Change-Id: I3d64ec4bbc72143b30a094ece7a6c711d6b479cd
2017-01-19 17:22:44 +00:00
David Barker 838367db1e Add correctness tests for the SSE2 warp filter
Also rename warp_affine() to av1_warp_affine()

Change-Id: I945baff6be8a1ea942ce88dfcfa5344af6b3a966
2017-01-19 16:55:58 +00:00
David Barker 1b888f2e9a Optimize SSE2 warp filter
Improve the speed of the warp filter itself by ~30%. This leads
to an overall decoder speedup of 5-20%, depending on bitrate,
for the global-motion experiment, and a small speedup for
warped-motion.

Applies a very minor change to the rounding during filter
selection (ROUND_POWER_OF_TWO makes slightly more sense here
than ROUND_POWER_OF_TWO_SIGNED, and is faster)

Change-Id: I3f364221d1ec35a8aac0d2c8b0e427f527d12e43
2017-01-19 16:55:52 +00:00
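Note on 1b888f2e9a: the two rounding macros differ only for negative inputs that sit exactly halfway between two integers. The definitions below are written out as assumptions for this sketch and should be checked against the codebase; the wrapper functions are hypothetical.

```c
#include <assert.h>

/* Definitions assumed for this sketch. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + ((1 << (n)) >> 1)) >> (n))
#define ROUND_POWER_OF_TWO_SIGNED(value, n)           \
  (((value) < 0) ? -ROUND_POWER_OF_TWO(-(value), (n)) \
                 : ROUND_POWER_OF_TWO((value), (n)))

/* Adds half and shifts: one add and one shift, no branch. */
static int round_plain(int v, int n) {
  assert(n > 0 && n < 31);
  return ROUND_POWER_OF_TWO(v, n);
}

/* Rounds the magnitude and restores the sign: ties go away from zero. */
static int round_signed(int v, int n) {
  assert(n > 0 && n < 31);
  return ROUND_POWER_OF_TWO_SIGNED(v, n);
}
```

For non-negative values both variants agree. For a negative tie such as -3 with n = 1 (that is, -1.5), the signed variant yields -2, while the plain variant is expected to yield -1 on compilers that shift negative values arithmetically; right-shifting a negative int is implementation-defined in C, so that second result is a property of the compiler rather than of the language.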
David Barker 0b04e9b8b1 Bring highbd loop restoration filters in line with lowbd ones
* Use the same function for domaintxfmrf in both highbd and lowbd
  cases
* Move an assertion out of a loop in
  apply_selfguided_restoration_highbd, to match the lowbd case

No change to output, but a decoder speed improvement of ~3.5%
(roughly independent of bitrate) with loop-restoration on a
10bpp sample.

Change-Id: I970a3bb8f1c6b0ac60aa4a6fe4e7f54d1e6c1452
2017-01-19 14:34:15 +00:00
David Barker 1e8e6b9572 Miscellaneous cleaning up for loop-restoration
* Change Wiener filter storage to match the format expected
  by the convolve functions

Change-Id: I4d1fb08a13cfc31e69e12c1cb4b2e510c6d8ae30
2017-01-19 14:33:32 +00:00
Thomas Davies f77d4ad3c7 EC_ADAPT: add per tile contexts.
This will support adapting in each tile.

(https://bugs.chromium.org/p/aomedia/issues/detail?id=71)

Change-Id: I3eced47715749a48f78c4ccf151c4d0b58f36c0d
2017-01-19 14:24:56 +00:00
Jingning Han 71bf3eec3f Offset default probs in wrapped_motion
Change-Id: I2e39d2f23c8bb18878597e198b5ba7f98f07ecef
2017-01-19 05:31:32 +00:00
Jingning Han 91f01fdf9b Support non-causal obmc in the cb4x4 mode
Make the non-causal obmc option work in the cb4x4 mode.

Change-Id: If470ab61166752efc72719f9cd3e440560de1d51
2017-01-19 05:31:32 +00:00
Jingning Han 51ec505c8f Fix encoding failure in motion-var and cb4x4 mode
This commit makes the motion-var support cb4x4 mode. It resolves
the encoding failure issue when both experiments are enabled.

BUG=aomedia:136

Change-Id: I2fa963d62cbdd24cc54d5a95d02f2dc226e6d2d0
2017-01-19 05:31:32 +00:00
Jingning Han b3044ddbf1 Offset default probs in motion_var to account for cb4x4 mode
Offset the default probability set in motion_var to account for
the added block sizes in cb4x4 mode.

Change-Id: I18d90fda1678fad2fc738036e0d9caff6ac894b7
2017-01-19 05:31:32 +00:00
Jingning Han 74fd89f3d6 Fix decoding failure in cb4x4 and var-tx mode
Fix the bit-stream decoding failure introduced lately in cb4x4
and var-tx mode.

Change-Id: Id671b5ec98b32d65e4fb45812ee8d1b7037fd6ec
2017-01-19 01:18:40 +00:00
Jingning Han a6b0c4c9cd Support adaptive scan order in cb4x4 mode
This commit adds 2x2 transform block scan order to make the
adaptive scan order support cb4x4 mode.

BUG=aomedia:135

Change-Id: Ic8c3ae9ed65d577df629524b617b386b5e799d4c
2017-01-19 01:18:32 +00:00
Jingning Han 25f2f7d95f Remove an outdated assertion
The check condition on block size is deprecated. No need to keep
this assertion around.

Change-Id: Icf2dde2a678cbbce837798877634b7be54e86e67
2017-01-19 01:18:23 +00:00
Yue Chen eeacc4c07a Bug fix: determine tx_mode based on lossless mode of all segments
When the segment feature is on, frame level cm->tx_mode can be set to
ONLY_4X4 only if all segments are lossless. Otherwise this causes
bugs when xd->lossless[i] is 0 and xd->lossless[0] is 1.
Also fix the condition of coding tx_type, which should be on when
the qindex of current segment is > 0.

BUG=aomedia:106
BUG=aomedia:104

Change-Id: Ic076083bb78b3b99a6f7d17ec82ee402c64bcc52
2017-01-19 01:16:54 +00:00
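Note on eeacc4c07a: the rule that the frame-level transform mode may be ONLY_4X4 only when every segment is lossless can be sketched as below. The enum is a simplified stand-in, MAX_SEGMENTS is assumed to be 8, and pick_frame_tx_mode() is a hypothetical helper, not the function changed by the commit.

```c
#include <assert.h>

#define MAX_SEGMENTS 8 /* assumed segment count for this sketch */

typedef enum { ONLY_4X4, TX_MODE_SELECT } TxMode; /* simplified stand-in */

/* Hypothetical helper: force ONLY_4X4 only when every active segment is
 * lossless; checking segment 0 alone is not enough. */
static TxMode pick_frame_tx_mode(const int lossless[MAX_SEGMENTS],
                                 int num_segments) {
  assert(num_segments >= 1 && num_segments <= MAX_SEGMENTS);
  for (int i = 0; i < num_segments; ++i) {
    if (!lossless[i]) return TX_MODE_SELECT;
  }
  return ONLY_4X4;
}
```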
Angie Chiang f715922384 Store result on conv_params->buf when no rounding
We need uint16_t buf for storing no-rounding prediction.
Add uint16_t buf in conv_params for that.
This CL lets us avoid changing the interface of the convolve functions.

Change-Id: I079fad911327f40ffb98e17c73e7638b1719c975
2017-01-18 23:40:27 +00:00
Angie Chiang 907230ea72 Change build_inter_predictors
Separate prediction code and parameter generating code.
This will not change bitstream statistics.

Change-Id: I194480166d3f8641592e53683029be1d466cfba9
2017-01-18 23:40:27 +00:00
Yue Chen f27b16053e Add rd loop of NCOBMC
At the final round of encoding of each superblock, the encoder goes
through each prediction block to check if ncobmc mode is better than
non-overlapped prediction. Note that causal obmc mode is dumped here.

PSNR gain (MOTION_VAR + NCOBMC): -2.845% lowres

Change-Id: Ibe504f7f1882446a08ba426e1e9824bca73bf655
2017-01-18 22:50:47 +00:00
Yunqing Wang 2615d6eaac Fix the transform size/type search condition
While encoding a key frame with quantizer = 0 and aq-mode = 1,
for some segment_ids, the quantizer got modified and could be
> 0, and lossless[segment_id] might be 0 or 1 depending on the
segment_id. Namely, blocks with lossless[segment_id] = 0 were
allowed to choose transform sizes other than 4x4. This conflicted
with tx_mode which was a frame-level decision. In this patch,
the transform search condition was modified so that the transform
choice was consistent with tx_mode of that frame.

BUG=aomedia:104

Change-Id: Ia39127b5dee129283a133cf5e4000da62d9e0f1c
2017-01-18 19:13:35 +00:00
Urvang Joshi feb925fe84 Enable rectangular transforms for Intra also.
These are under EXT_TX + RECT_TX experiment combo.

Results
=======

Derf Set:
--------
All Intra frames: 1.8% avg improvement (and 1.78% BD-rate improvement)
Video: 0.230% avg improvement (and 0.262% BD-rate improvement)

Objective-1-fast set
--------------------
Video: 0.52 PSNR improvement

Change-Id: I1893465929858e38419f327752dc61c19b96b997
2017-01-18 19:04:40 +00:00
David Barker 60a055bd5c Fix compile errors with loop-restoration + highbd
Change-Id: I0d9850e082b8da3b182a3bbaf6569c45317c9659
2017-01-18 16:31:05 +00:00
Angie Chiang 9f45bc480e Pass ConvolveParams into prediction functions
Those functions include
av1_make_inter_predictor
av1_build_inter_predictor
inter_predictor

Change-Id: Ide3b744277cf30964e8b352fc8de91365d7217a8
2017-01-18 01:36:31 +00:00
Alex Converse 55c6bdeb27 Add unpoison_partition_ctx experiment
At the edges of the picture only a subset of partitions are legal. Add
new contexts for these borders so they don't distort the probabilities of
the interior of the image where all partitions are legal.

Only include one context for each block size of each border direction
because so few blocks fall into these contexts to begin with.

objective-1-fast:
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0294 | -0.0911 | -0.2382 |  -0.0481 | -0.0441 | -0.0450 |    -0.0454

derf144: -0.135
lowres: -0.124
midres: -0.076
hdres: -0.078

Change-Id: I909b98eebb7e49273cde90154c8408febe334158
2017-01-18 01:27:55 +00:00
Debargha Mukherjee 1edf9a303f Improvements on segment mask
Adds a few options to make the compound mask lightly dependent on the
two predictors.

Also adds high bit depth support

Change-Id: If57b6e8ddd140e0c00fd9d4738927d37225091cb
2017-01-17 12:01:54 -08:00
Angie Chiang 0cfaeeafa0 Refactor av1_update_neighbors
Besides the above and left positions, additional above-left,
above-right, and bottom-left positions are added as
neighbor candidates.

In av1_update_neighbors, two available positions will be picked as
context neighbors.

The picking priority is
above -> left -> above-left -> above-right -> bottom-left

Change-Id: I82eaf0b23d0189caaea008ecc86776492886a05b
2017-01-14 02:04:30 +00:00
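Note on 0cfaeeafa0: picking two context neighbors by priority can be sketched as a single pass over the candidates in priority order. The enum, the availability array, and the helper pick_context_neighbors() are hypothetical; only the ordering of the positions follows the message.

```c
#include <assert.h>

/* Candidate positions, listed in the priority order from the message. */
typedef enum {
  ABOVE,
  LEFT,
  ABOVE_LEFT,
  ABOVE_RIGHT,
  BOTTOM_LEFT,
  NUM_CANDIDATES
} NeighborPos;

/* Hypothetical helper: write up to two available positions into picked[]
 * and return how many were found. */
static int pick_context_neighbors(const int available[NUM_CANDIDATES],
                                  NeighborPos picked[2]) {
  assert(available && picked);
  int count = 0;
  for (int pos = 0; pos < NUM_CANDIDATES && count < 2; ++pos) {
    if (available[pos]) picked[count++] = (NeighborPos)pos;
  }
  return count;
}
```

When above and left are both available they are always chosen, so the extra candidates only matter at picture or tile edges where one of them is missing.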
Angie Chiang fe2a959e42 Use default scan order as a tie breaker
Change-Id: I85f059b6e2c48bcdf2edd3b7bf896fdccbaaa703
2017-01-14 02:04:30 +00:00
Yi Luo f07ddf3aa6 Fix 32x32 hybrid transform AVX2 to match C
Change-Id: I77bc383d4b2526cd9bef4d806905db0111c04f65
2017-01-13 22:51:47 +00:00
Yi Luo 3b0b5f17eb Fix 16x32, 32x16 rectangular transform SSE2 to match C
- Turn on SSE2 unit tests

Change-Id: I285771b04c0dec0501210fde570b9ac3cb9c4be0
2017-01-13 20:33:28 +00:00
Jingning Han ab9ecbabe8 Refactor end-node process in var-tx
Change-Id: If38fa7e7816a556602c937f0526f5842cc216bf3
2017-01-13 20:30:28 +00:00
hui su 8a630493b5 Fix a RD bug with palette mode
Rate and distortion stats were not correctly recorded.

Change-Id: I91829d66b42e7b7b5b8bf37a0c1c43d6b8206a9f
2017-01-13 19:16:58 +00:00
Yue Chen d193cdcfd9 Correct projection samples for local warping model estimation
When both GLOBAL_MOTION and WARPED_MOTION are enabled, identify
the neighbors using global motion, and generate correct projection
samples, from which the local warped motion is estimated.

Change-Id: I13556a49649208e6f4d30bc570a41074aabc8ae6
2017-01-13 18:50:37 +00:00
Yue Chen 86ae7b1397 Add recon functions of non-causal obmc
Change-Id: Id2537c8826e07ad6605aaa9858ba6d797bcd23a5
2017-01-13 18:06:07 +00:00
Yue Chen 9ab6d71f8d Separate mbmi coding and coeff coding+recon at sb level in NCOBMC
In order to use mvs from a future block in obmc, we first send mbmi
info for the entire superblock, and then call another recursion to
handle the coeffs and recon.

Note: this change is currently not compatible with SUPERTX, later I
will move detoken and recon for supertx to a proper place

Change-Id: I19ab77fa137f53a370e68ea777f70d0306e3e303
2017-01-13 18:06:07 +00:00
hui su de0c70a27a Refactor rd_pick_intra_sby_mode()
Simplify code.

Change-Id: Ifa65ea66e55c52ab79f32de1fc27121ddf088fc3
2017-01-13 17:57:10 +00:00
Thomas Davies ef97ec0b50 EC_ADAPT: faster CDF update.
Also fix warning.

Change-Id: Ia515360af9c3269901eb0d002d326b7af43a00e7
2017-01-13 10:52:32 +00:00
Angie Chiang 674bffdc1b Add rounding option into av1_convolve
Use a round flag in ConvolveParams to indicate if the destination buffer
has the result rounded by FILTER_BITS or not.
This CL is part of the goal of reducing interpolation rounding error in
compound prediction mode.

Change-Id: I49e522a89a67a771f5a6e7fbbc609e97923aecb6
2017-01-13 02:04:02 +00:00
Jingning Han 203b1d30a6 Clean up redundant #if statements
Change-Id: Ia4779ffb47de333d670ae110cbdfb6cc567da910
2017-01-13 01:51:24 +00:00
Yue Chen 64550b6af4 Refactor write_modes_b() and decode_block()
In order to reduce the code complexity for handling parameter
coding and recon separately for each 64x64 in non-causal obmc
experiment, we break them down to two steps calling separate
functions, one for params, the other dealing with coefficients
and recon(decoder side).
Note: actually the non-causal prediction can use the original
syntax, but right now in the decoder coeff detoken and recon are
heavily nested.

Change-Id: I72d9c42ab8f38b57850d6b0481551893f1702822
2017-01-12 23:50:19 +00:00
David Barker d5dfa96e88 Add SSE2 vectorized warp filter for lowbd
End-to-end speed improvements: (measured on tempete_cif.y4m,
20 frames for encoder and all 260 frames for decoder)

* GLOBAL_MOTION encoder: ~10% faster
* GLOBAL_MOTION decoder: 100-200% faster depending on bitrate
* WARPED_MOTION encoder: ~2.5% faster
* WARPED_MOTION decoder: ~20-40% faster depending on bitrate

The improvement in the GLOBAL_MOTION decoder is particularly
large because its runtime is dominated by calls to warp_plane().

This introduces minor changes to the output of the warp filter,
but these should be rare.

Change-Id: I5813ab9e90311e27587045153c32d400b6b9eb92
2017-01-12 17:14:35 +00:00
Yi Luo 3bd8377533 High bit depth 32x32 inverse DCT_DCT transform, AVX2
- Witness the following user-level speedup on AV1 baseline:
 Encoding time reduction: 4.26%
 Decoding time reduction: 25.35%

Change-Id: Ideaf3cd473ad45ed9256c80d5a5daed0a6e098cf
2017-01-12 17:01:08 +00:00
Jingning Han 904fd18276 Make av1_update_txb_coeff_cost() check condition support cb4x4
Replace hard coded numbers with macro defs.

Change-Id: I125ef4e4c8c3aead182c583522450626b730bbb3
2017-01-12 00:40:03 +00:00
Jingning Han c7ea761fe3 Make rd_debug aligned to var-tx
Fix the corner case and use the right rate cost update for rd_debug.
This would make the var-tx pass rd_debug test.

Change-Id: Ib0fbd2d73030c0d150222c6b7c2dfffc0c6af085
2017-01-12 00:40:03 +00:00
Jingning Han 0c70a80feb Make txfm block partition context support rectangular blocks
Make the transform block partition context model support the
rectangular transform block size partition. The coding gains
from cb4x4 and var-tx are:
          cb4x4 + var-tx
lowres         4.3%
midres         2.6%

Change-Id: I6cc1413fbf6d7707ca7fd24300623a3f0118be7c
2017-01-12 00:39:56 +00:00
Nathan E. Egge cefb409475 Move from Daala accounting to AOM accounting.
Replace all instances of Daala's OD_ACCOUNTING with those specified by
 CONFIG_ACCOUNTING.

Change-Id: Ibb59fc5df0ce4b0528b15296bf2f14029c414bc0
2017-01-12 00:12:06 +00:00
Nathan E. Egge cceac33aec Don't include Daala EC headers directly.
The generic coder now uses the AOM entropy coder API and no longer
 needs to include the entenc.h and entdec.h headers.

Change-Id: I213acb5b6bd8a3fe60dc096b83d76ae72315e9de
2017-01-12 00:12:06 +00:00
Nathan E. Egge 25007c889a Use aom_reader / aom_writer API to code rest.
The functions aom_encode_pvq_split() and aom_decode_pvq_split() code
 the rest value as raw bits using the od_ec_enc_bits() and
 od_ec_dec_bits() functions.
These code bits in the reverse order as the aom_write_literal() and
 aom_read_literal() functions, so both the encoder and decoder must
 be changed at the same time.
This commit has no impact on metrics but is a bitstream change.

Change-Id: Iee79777f35aebbb23043a7efa7fe439af70348ba
2017-01-12 00:12:06 +00:00
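The bit-order remark in the "Use aom_reader / aom_writer API to code rest" commit above can be illustrated with a minimal sketch. Everything here (bit_log, put_bit, the put_literal_* and get_literal_* helpers) is hypothetical and is not the AOM or Daala API; the sketch also makes no claim about which order either real function uses. It only shows why a writer and a reader that disagree on bit order cannot be changed independently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit sink that stores one bit per byte, for illustration. */
typedef struct {
  uint8_t bits[32];
  int count;
} bit_log;

static void put_bit(bit_log *log, int bit) {
  assert(log->count < 32);
  log->bits[log->count++] = (uint8_t)(bit & 1);
}

/* Emits an nbits-wide value most-significant bit first. */
static void put_literal_msb_first(bit_log *log, unsigned value, int nbits) {
  for (int b = nbits - 1; b >= 0; --b) put_bit(log, (int)((value >> b) & 1));
}

/* Emits the same value least-significant bit first. */
static void put_literal_lsb_first(bit_log *log, unsigned value, int nbits) {
  for (int b = 0; b < nbits; ++b) put_bit(log, (int)((value >> b) & 1));
}

/* Reader that matches the MSB-first writer only. */
static unsigned get_literal_msb_first(const bit_log *log, int *pos, int nbits) {
  unsigned value = 0;
  for (int b = 0; b < nbits; ++b) {
    assert(*pos < log->count);
    value = (value << 1) | log->bits[(*pos)++];
  }
  return value;
}
```

Writing the 3-bit value 6 with the LSB-first helper and reading it back with the MSB-first reader yields 3, which is why a switch of this kind has to land in the encoder and decoder together and counts as a bitstream change.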
Nathan E. Egge f1e2fbdc9b Use aom_reader / aom_writer API to code raw bits.
The functions aom_laplace_encode_special() and
 aom_laplace_decode_special() code the rest value as raw bits using the
 od_ec_enc_bits() and od_ec_dec_bits() functions.
These code bits in the reverse order as the aom_write_literal() and
 aom_read_literal() functions, so both the encoder and decoder must
 be changed at the same time.
This commit has no impact on metrics but is a bitstream change.

Change-Id: I428d5a83dd108c3a54f3c1dbae2c7fd5e59f5726
2017-01-12 00:12:06 +00:00
Debargha Mukherjee 8a70919e82 Expand the parameter set for sgrproj restoration
A slight improvement for lowres and midres.

Change-Id: I377ba41034e1d70320e0c694d90a058e7809b129
2017-01-11 19:53:48 +00:00
ltrudeau c875510544 Simplified PVQ skip clipping to 1
Cherry-pick Daala da7896a7

Remove double negation and add a comment explaining that this is used
for visualization. This change does not alter the bitstream.

Change-Id: I2a01ed292cc5cfa4e1bfdbc08251da6bd2c27158
2017-01-11 19:37:48 +00:00
Alex Converse 346440bd74 Use the standard aom_reader_init() interface for ans
Change-Id: I4a0f0a775362e6e43cd28ed29bf83c912cdc7df5
2017-01-11 17:29:55 +00:00
David Michael Barr 8f110572ad Use stable sort with PVQ.
Cherry-pick Daala 85433214
 Fully order the pvq search candidates
For portable and stable sorting, break ties.
Large differences in output were observed between AWCY and an OS X
 machine because of the platform qsort implementation.

Change-Id: I294dd2e167c1e0464c7f61f32d60ab478341446e
2017-01-11 10:33:24 +00:00
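The tie-breaking idea in the "Use stable sort with PVQ" commit above can be sketched as follows. This is not the Daala or AV1 code: the candidate struct, its field names, and the descending sort direction are assumptions made for illustration. The point is only that the C standard leaves the relative order of equal elements in qsort() unspecified, so a comparator that never reports a tie produces the same order on every platform.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical search candidate; field names are illustrative only. */
typedef struct {
  double score; /* primary sort key */
  int index;    /* original position, used only to break ties */
} candidate;

/* Orders by descending score. Equal scores fall back to ascending index,
 * so two distinct candidates never compare equal and the result does not
 * depend on how the platform's qsort() treats ties. */
static int compare_candidates(const void *a, const void *b) {
  const candidate *ca = (const candidate *)a;
  const candidate *cb = (const candidate *)b;
  if (ca->score > cb->score) return -1;
  if (ca->score < cb->score) return 1;
  return (ca->index > cb->index) - (ca->index < cb->index);
}

static void sort_candidates(candidate *c, size_t n) {
  assert(c != NULL || n == 0);
  qsort(c, n, sizeof(*c), compare_candidates);
}
```

With scores {1.0, 2.0, 1.0, 2.0} at indices 0..3, the sorted index order is 1, 3, 0, 2 regardless of the qsort() implementation.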
Debargha Mukherjee 5d15721d51 Fix a memory leak
Change-Id: I1f66837151a955a9fde0c1b4670ab0fc1d318111
2017-01-11 01:45:52 +00:00
Yaowu Xu a98ceea7f9 add a minor clarification in comment
Change-Id: I7320074ff7bd95d833cba8afb04fcc0730392f1e
2017-01-10 22:32:42 +00:00
Urvang Joshi f1c06a73fe Palette: use insertion sort for sorting neighbors' scores.
While sorting, preserving the order of the rest of the list when moving
an element to the top of list makes hardware implementation much simpler.

The compression performance is roughly the same: overall, average performance
on the screen-content set is in fact 0.137% better than before.

Bug=aom:127

Change-Id: Id1aa1e90254b44eae9133b47bca8f853f6a62c6b
2017-01-10 19:19:59 +00:00
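The order-preserving property described in the "Palette: use insertion sort" commit above can be sketched like this. The function name and the parallel scores/ids arrays are assumptions for illustration, not the palette code itself. An element moves up only past strictly smaller scores, and every element it passes shifts down by one slot, so the rest of the list keeps its relative order.

```c
#include <assert.h>

/* Sorts scores[] in descending order and applies the same permutation to
 * ids[]. Elements with equal scores keep their original relative order. */
static void insertion_sort_desc(int *scores, int *ids, int n) {
  assert(n >= 0);
  for (int i = 1; i < n; ++i) {
    const int s = scores[i];
    const int id = ids[i];
    int j = i;
    /* Shift smaller entries down by one slot, preserving their order. */
    while (j > 0 && scores[j - 1] < s) {
      scores[j] = scores[j - 1];
      ids[j] = ids[j - 1];
      --j;
    }
    scores[j] = s;
    ids[j] = id;
  }
}
```

For scores {2, 3, 2, 1} with ids {0, 1, 2, 3}, the result is scores {3, 2, 2, 1} with ids {1, 0, 2, 3}: only the winning element moved, and the two entries with score 2 stayed in their original order.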
Sarah Parker b9f757c7bf Refactor compound_segment to try different segmentation masks
Change-Id: I7c992c9aae895aebcfb5c147cb179cf665c0ac10
2017-01-09 18:50:51 -08:00
Yushin Cho ab44fd14c1 Rename encode_inter_mb_segment()
Rename encode_inter_mb_segment() so that it tells readers
that the function is only used for sub8x8 case.

Change-Id: I2d86d9efaf0e1e96446d9e2dec8a8d97772489a7
2017-01-10 00:51:49 +00:00
Yushin Cho ee0af21256 Correct the misleading codes in encode_inter_mb_segment()
In encode_inter_mb_segment(), when BLOCK_8X4 or BLOCK_4X8 is
passed, the nested loop inside it always iterates twice.
(For BLOCK_4X4, the loop iterates only once because encode_inter_mb_segment()
is called for each 4X4 block.)
Then, the k for the 1st iteration is always zero, and the k for the 2nd
iteration is always (idy * 2 + idx) with either idy == 1 or idx == 1
depending on the sb_type.

Using "+=" there could mislead readers into expecting more
iterations, so a simple assignment is more appropriate here.

Change-Id: I7a11255eca13403bc090ba4f0cd4785db9f0e541
2017-01-10 00:51:43 +00:00
Yushin Cho 1a2df5e295 Fix wrong stride of dst buffer in intra4x4
Change-Id: Icbd238c73323d11d60ca4da755b52c83cb11b8b5
2017-01-10 00:50:28 +00:00
Nathan E. Egge f25bae4412 Use aom_reader with od_decode_cdf_adapt().
Change the od_decode_cdf_adapt() function to take an aom_reader
 struct instead of an od_ec_dec struct.
Rename od_decode_cdf_adapt() to aom_decode_cdf_adapt().

Change-Id: I0713d2f56acfea3f67f1b4087c0feee77c2e25cb
2017-01-10 00:30:32 +00:00
Nathan E. Egge b97f1c479c Use aom_reader with laplace_decode_special().
Change the laplace_decode_special() function to take an aom_reader
 struct instead of an od_ec_dec struct.
Rename laplace_decode_special() to aom_laplace_decode_special().

Change-Id: I137ae9a4df3fb0fd0b54dea09f787f70a7d287f5
2017-01-10 00:30:32 +00:00
Nathan E. Egge 984b2327ad Replace OD_ACC_STR with __func__.
Replace the passed in bit accounting string from OD_ACCOUNTING with the
 current function name as ACCT_STR in preparation for the migration to
 CONFIG_ACCOUNTING.

Change-Id: Ib9946232b37cacfd88f6ff914b99e91c3d7b650e
2017-01-10 00:30:32 +00:00
Debargha Mukherjee 994ccd7f15 Add tiled version of UV wiener restoration
Slight improvement in midres and hdres sets of 0.02% and 0.0.9%
respectively.

This is also a better design anyways.

Change-Id: I15b60b8836070a2132641e5b1d8e9f68df426c08
2017-01-09 22:01:46 +00:00
Debargha Mukherjee d7489148d4 Refactor UV restoration to use same tilesize as Y
Change-Id: I56e741551f74624a84250d7565520db9c5127d1b
2017-01-09 20:06:26 +00:00
Yue Chen 7d2109e55b Use fast warping algorithm for warped motion mode
Disable warped motion mode when the model parameters are out of the
range of the new interpolation algorithm.
Performance: 1.1% lowres (was 1.2%)

Change-Id: I947ce3fd07e0d574d66333c1a729e85ba0294b4a
2017-01-09 18:14:12 +00:00
Angie Chiang 61dca1fd2c Use 16-bit internal precision in fdct32
Change-Id: I487995f51737be882d4f2a4c7bbd6b87297b4f55
2017-01-09 17:53:42 +00:00
Angie Chiang 8e1d0f7086 Change scales of fht 32x16 16x32 32x32 functions
Performance drop with ext_tx and rect_tx on
       BDRate
lowres -0.028
midres -0.075
hdres  -0.054

Change-Id: I50f89b9e9785d82ab05c3276a3c8b22b4dcfd408
2017-01-09 17:53:34 +00:00
Zoe Liu 705ce47f77 Remove unnecessary #if-#endif for ext-inter
Change-Id: Iab1217c7eb006c72e86e4261576b775b7debafd3
2017-01-09 17:37:37 +00:00
Jingning Han 2ee81fecc2 Fix num_4x4_blk scale in var-tx and cb4x4 mode
This resolves an out-of-boundary memory access issue in the encoding
process.

Change-Id: I9363f5a5a012880289e3370f66507126c609a41f
2017-01-09 17:16:35 +00:00
Jingning Han 6ae7564b94 Fix frame boundary block distortion computation in var-tx
Fix the computation of distortion for blocks at frame boundary.

Change-Id: Ib32b95f25e28af42abe9144a7f589030bbaab463
2017-01-09 17:16:35 +00:00
Jingning Han b98ec86f1a Remove repeated ADAPT_SCAN_UPDATE_RATE_16 defs
Change-Id: I5c2e92469d8f87f7c565acd77f12535b3f58929a
2017-01-09 17:16:29 +00:00
Nathan E. Egge e069849592 Split aom_read_cdf() from aom_read_symbol().
Separate the aom_read_cdf() functionality from aom_read_symbol() which
 can optionally adapt the cdf when run with --enable-ec_adapt.

Change-Id: I5446d6402835dfcf68d3462a2bd8835704fe6603
2017-01-09 17:03:21 +00:00
Nathan E. Egge 39051a7787 Use aom_writer with od_encode_cdf_adapt().
Change the od_encode_cdf_adapt() function to take an aom_writer
 struct instead of an od_ec_enc struct.
Rename od_encode_cdf_adapt() to aom_encode_cdf_adapt().

Change-Id: I00de05b8b7428f67139c234160ab9aaf8900f967
2017-01-09 17:03:21 +00:00
Nathan E. Egge 140069eb75 Use aom_writer with od_laplace_encode_special().
Change the od_laplace_encode_special() function to take an aom_writer
 struct instead of an od_ec_enc struct.
Rename od_laplace_encode_special() to aom_laplace_encode_special().

Change-Id: Ieba63c8519d363081124a11e633b437adccfa500
2017-01-09 17:03:21 +00:00
Nathan E. Egge 87d44dc749 Split aom_write_cdf() from aom_write_symbol().
Separate the aom_write_cdf() functionality from aom_write_symbol() which
 can optionally adapt the cdf when run with --enable-ec_adapt.

Change-Id: Ibc58690eddb647d69f08d72f0f0712779aab11d1
2017-01-09 17:03:21 +00:00
Yushin Cho 482016d0eb Rename the function rd_pick_best_sub8x8_mode()
This large function is solely used for the RDO search for
inter prediction mode. It would help readers if its name
indicated that the whole function is used for inter mode decision only.

Change-Id: Ida366b142b7129bf89498227d186c54341c3af5e
2017-01-08 23:39:08 +00:00
David Barker fa19516f2e Fix new warp filter in the case wmmat[2] == 0
In this case, calculating the shear parameters fails
with a divide-by-zero error. So disable the new filter
in this case.

We also temporarily remove the asserts blocking use
of the old filter with debugging enabled.

Change-Id: I788ff51c3bc1d841eab1099881cc3b55038ae342
2017-01-07 19:26:45 +00:00
David Barker 33f3bfdef7 Optimize Wiener filter selection
* Change the behaviour of search_wiener at borders to match
  the behaviour of the Wiener filter itself
* Reorder the calculation in compute_stats, saving ~5% of
  encode time at low bitrates (tested on bus_cif.y4m at 200kbps)

Change-Id: I5f649d77fd66584451aaf37697ce9c9af69524e4
2017-01-07 05:30:22 -08:00
David Barker 6928a5d257 Various loop-restoration optimizations
* Optimize the self-guided and domaintxfmrf filters
* Save 576KiB of buffers in the encoder and decoder
* Disable self-guided filter for videos whose width or
  height is < 5, in order to help simplify the filter.

This results in an overall 30-40% improvement in decoder
speed with loop-restoration enabled (depending on source
and bitrate), with no effect on video quality, *except* for
videos with width or height < 5 pixels.

Change-Id: Ide9181118ec3a63a0335338f316505b08df2d831
2017-01-07 05:30:22 -08:00
Debargha Mukherjee 09ad6d85b1 A mismatch fix in loop restoration
Change-Id: Icfc4645ff97d4fd6849f149f4c5296a53c204cf4
2017-01-07 13:19:07 +00:00
Angie Chiang 1733f6b77b Merge ext_interp and dual_filter
Change-Id: I0ebd6951d2b42869ae872b33f63a07db03e99c62
2017-01-07 00:50:23 +00:00
Jingning Han 4be1a4d49b Fix frame header tx_size syntax setting
Fix an intricacy due to interactions between cb4x4 and var-tx that
sets frame header away from tx_mode_select. This resolves a rare
enc/dec mismatch issue.

Change-Id: I6981f21f7e6f04f2a47ef32f744f83a8fd34355b
2017-01-06 23:57:05 +00:00
Nathan E. Egge 8324fa85ce Fix --enable-accounting with --enable-pvq.
The bit accounting was broken when portions of PVQ were refactored to use the
 aom_reader / aom_writer API because the daala_ec calls were using
 OD_ACCOUNTING instead of CONFIG_ACCOUNTING.
This fixes them so that bit accounting will still work with pvq while
 the full port to --enable-accounting is in review.

Change-Id: I99e6b6debc716f1a6780116d5602085f7a2bb827
2017-01-06 18:16:14 -05:00
Jingning Han 581d1697e7 Rework the txfm partition context to support cb4x4 mode
This commit reworks the transform block partition context update
to support cb4x4 mode in the recursive transform block partition.
It resolves the remaining enc/dec mismatch issue when both cb4x4
and var-tx are turned on.

Change-Id: I850d121204fe4c68e81488f1d2848c570d9d08b9
2017-01-06 18:08:08 +00:00
Jingning Han 030f651f8a Fix av1_iht8x4_32_add_sse2() implementation
Fix the 8x4 inverse transform for ADST row process.

Change-Id: Iceff4ab356a51218a952b53b1134606548832eac
2017-01-06 18:08:08 +00:00
Jingning Han 9ca05b7e3d Refactor var-tx pipeline to support cb4x4 mode
Replace hard coded 4x4 transform block step size assumption with
scalable table access.

Change-Id: Ib1cc555c2641e5634acdd91ca33217f00aeb0b89
2017-01-06 18:08:08 +00:00
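The idea in the "Refactor var-tx pipeline to support cb4x4 mode" commit above, replacing a hard-coded step with a table lookup, can be sketched as follows. The enum, the table, and the function are illustrative assumptions and cover only a subset of square sizes; they are not copied from the codebase.

```c
#include <assert.h>

/* Illustrative subset of square transform sizes. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32, TX_SIZES } tx_size;

/* Width of each transform size, measured in 4x4 units. */
static const int tx_wide_unit[TX_SIZES] = { 1, 2, 4, 8 };

/* Counts the transform blocks needed to cover a row that is width_units
 * wide (in 4x4 units). The step comes from the table rather than from a
 * constant that silently assumes one particular transform size. */
static int count_tx_steps(int width_units, tx_size tx) {
  assert((int)tx >= 0 && (int)tx < TX_SIZES);
  const int step = tx_wide_unit[tx];
  int n = 0;
  for (int col = 0; col < width_units; col += step) ++n;
  return n;
}
```

A 64-pixel-wide row (16 units) takes 16 steps at 4x4, 8 steps at 8x8 and 2 steps at 32x32, with no size-specific branches in the loop.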
Debargha Mukherjee a43a2d98e3 Add UV wiener loop restoration
Enables Wiener based loop restoration only for the UV
frames. The selfguided and domaintransform filters do not
work very well for UV components, hence they are disabled.
For each UV frame a single set of wiener parameters is
sent. They are applied tile-wise, but all tiles use the
same parameters.

BDRATE (Global PSNR) results:
-----------------------------
lowres: -1.266% (up from -0.666%, good improvement)
midres: -1.815% (up from -1.792%, tiny improvement)

Tiling on UV components will be explored subsequently.

Change-Id: Ib5be93121c4e88e05edf3c36c46488df3cfcd1e2
2017-01-06 00:30:06 +00:00
Nathan E. Egge 8fcfcc5798 Use aom_reader / aom_writer API to code lsb.
The functions generic_encode() and generic_decode() code the lsb values
 as raw bits using the od_ec_enc_bits() and od_ec_dec_bits() functions.
These code bits in the reverse order as the aom_write_literal() and
 aom_read_literal() functions, so both the encoder and decoder must
 be changed at the same time.
This commit has no impact on metrics but is a bitstream change.

Change-Id: I83546e2d4b73c28a7f269ddc850742df53d227ce
2017-01-05 17:20:48 -05:00
Nathan E. Egge d998a007ad Delete unused laplace decoder functions.
Delete the unused od_laplace_decode(), od_laplace_decode_vector(), and
 laplace_decode_vector_delta() functions.

Change-Id: Iec581e8cdb0bc9cac9199c09486891500c707c03
2017-01-05 14:50:24 -05:00
Nathan E. Egge 5c7acc9e47 Use aom_reader with od_decode_band_pvq_splits().
Change the od_decode_band_pvq_splits() and od_decode_pvq_split()
 functions to take an aom_reader struct instead of an od_ec_dec struct.
Rename od_decode_band_pvq_splits() to aom_decode_band_pvq_splits() and
 od_decode_pvq_split() to aom_decode_pvq_split().

Change-Id: I5979b32977377e1541c609a13242852e5cfab233
2017-01-05 14:50:24 -05:00
Nathan E. Egge 0bccd5dc48 Use aom_reader with od_decode_pvq_codeword().
Change the od_decode_pvq_codeword() function to take an aom_reader
 struct instead of an od_ec_dec struct.
Rename od_decode_pvq_codeword() to aom_decode_pvq_codeword().

Change-Id: I9fc2dda28a6169cb04410e822070991f3bcbc25a
2017-01-05 14:46:26 -05:00
Nathan E. Egge 89f5876f27 Use aom_reader with generic_decode().
Change the generic_decode() function to take an aom_reader struct
 instead of an od_ec_dec struct.

Change-Id: Ifa19ab1dbdd9fa1af19e6740839708b27ab4a44b
2017-01-05 14:44:17 -05:00
Nathan E. Egge e130861993 Use aom_reader with pvq_decode_partition().
Change the pvq_decode_partition() function to take an aom_reader struct
 instead of an od_ec_dec struct.

Change-Id: I7247aaa0be3eedd336371ba677dc2d9f16f27d20
2017-01-05 14:37:05 -05:00
Nathan E. Egge ab08397c4d Replace od_ec_dec with aom_reader in daala_dec_ctx.
Use the generic AOM entropy decoder in the daala_dec_ctx struct.
This is done in preparation for migrating other entropy coder calls to
 use the more generic entropy coding API.

Change-Id: I473a278174195401bcf35730fb5db7eb368b097a
2017-01-05 12:42:00 -05:00
Nathan E. Egge e335fb7b10 Use aom_write_bit() instead of od_ec_enc_bits().
Change-Id: Ied21efe93b2508e052087d84deebaf46c61e9c3d
2017-01-05 08:34:21 -05:00
Nathan E. Egge 9e96f79a65 Use aom_writer with od_encode_band_pvq_splits().
Change the od_encode_band_pvq_splits() and od_encode_pvq_split()
 functions to take an aom_writer struct instead of an od_ec_enc struct.
Rename od_encode_band_pvq_splits() to aom_encode_band_pvq_splits() and
 od_encode_pvq_split() to aom_encode_pvq_split().

Change-Id: I72e6684e032f4c8f9f9133c6102f870830001712
2017-01-05 08:34:21 -05:00
Nathan E. Egge 13e44fb477 Use aom_writer with od_encode_pvq_codeword().
Change the od_encode_pvq_codeword() function to take an aom_writer
 struct instead of an od_ec_enc struct.
Rename od_encode_pvq_codeword() to aom_encode_pvq_codeword().

Change-Id: I1254eca06291740770a4371dc01c78c12e613c3a
2017-01-05 08:34:21 -05:00
Nathan E. Egge 760c27f198 Use aom_writer with generic_encode().
Change the generic_encode() function to take an aom_writer struct
 instead of an od_ec_enc struct.

Change-Id: Icb447fe5ada27aba45fbaea08b28e9fe42c5a404
2017-01-05 08:34:21 -05:00
Nathan E. Egge 6b0b4a90c0 Use aom_writer with pvq_encode_partition().
Change the pvq_encode_partition() function to take an aom_writer struct
 instead of an od_ec_enc struct.

Change-Id: I459d31c600467958c9a1cbebd632fec05e01f534
2017-01-05 08:34:21 -05:00
Nathan E. Egge 8518ad5b6d Delete unused laplace encoder functions.
Delete the unused od_laplace_encode(), od_laplace_encode_vector(), and
 laplace_encode_vector_delta() functions.

Change-Id: I92e393836c0ba4e5149b2565e7142a161c44c612
2017-01-05 04:20:07 -05:00
Nathan E. Egge 6675be0ad0 Replace od_ec_enc with aom_writer in daala_enc_ctx.
Use the generic AOM entropy encoder in the daala_enc_ctx struct.
This is done in preparation for migrating other entropy coder calls to
 use the more generic entropy coding API.

Change-Id: Id627d12402a397bcb21d48d896c0de249d4d8657
2017-01-05 04:20:03 -05:00
Nathan E. Egge fddd3eb6d9 Use aom_reader with od_decode_cdf_adapt_q15().
Change the od_decode_cdf_adapt_q15() function to take an aom_reader
 struct instead of an od_ec_dec struct.
Rename od_decode_cdf_adapt_q15() to aom_decode_cdf_adapt_q15().

Change-Id: I72315c6e89d689e232c53a99a7d4e0f9cdcfbd0c
2017-01-05 04:19:50 -05:00
Nathan E. Egge a653f20d11 Use aom_writer with od_encode_cdf_adapt_q15().
Change the od_encode_cdf_adapt_q15() function to take an aom_writer
 struct instead of an od_ec_enc struct.
Rename od_encode_cdf_adapt_q15() to aom_encode_cdf_adapt_q15().

Change-Id: I631af7be4b553fbb10a4c72e1958aa48a4c8245a
2017-01-05 09:17:06 +00:00
Nathan E. Egge d6c2dc4fa3 Rename od_cdf_adapt_q15() to aom_cdf_adapt_q15().
Change-Id: I79addac857ff10c89f2ad79a5d2bf8d4c5e89ef4
2017-01-05 09:17:06 +00:00
Nathan E. Egge 850014294e Rename od_cdf_init() to aom_cdf_init().
Change-Id: I74d6cd507460d418c7be7faa31394fbcd8bb0d5d
2017-01-05 09:17:06 +00:00
Nathan E. Egge dd28aed875 Use uv_mode_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I0cacd4e8cdd07458b36bbdd56e4f005327854b34
2017-01-05 08:26:14 +00:00
Nathan E. Egge 10ba2bedf0 Use kf_y_mode_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: Ic0eba16329d7b63dd7d18e9cd28b89be4b5f2710
2017-01-05 08:26:14 +00:00
Nathan E. Egge ecc21ec854 Use y_mode_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: Ia5046d9158d5421a7f6e0397f4fa1e1925ae2ccb
2017-01-05 08:26:14 +00:00
Nathan E. Egge a59b23dd4a Use inter_mode_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I1cf27d2f029c1e985cafb468f60e7117d92593f5
2017-01-05 08:26:14 +00:00
Nathan E. Egge 00b3331ee8 Use switchable_interp_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I3f7eeff102fc30e2cef59c2c07df94826587d100
2017-01-05 08:26:14 +00:00
Nathan E. Egge 29ccee03c0 Use intra_ext_tx_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I21785bec0563299b4b0c1d17aaaa788e4e8df4d7
2017-01-05 08:26:14 +00:00
Nathan E. Egge dfa33f224b Use inter_ext_tx_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I76259d6ec925a0c7024e7c70a517debe2d3bf1ab
2017-01-05 08:26:14 +00:00
David Barker 3a0df186bd Simplify buffer management for self-guided restoration filter
* Remove some unused variables
* Reduce need for casts by typing intermediate buffers appropriately
* Avoid copying data which is never modified; use the original data
  instead.
* Reduce number of intermediate buffers required, saving allocations
  of 576KiB in the decoder and ~1MiB in the encoder

No effect on performance

Change-Id: I55243904dd8e818fb6d43fa431903736475d23ff
2017-01-04 22:49:22 +00:00
Angie Chiang 2cc057cf1f Remove fwd_txfm_opt
This CL aims to simplify the transform code.

Change-Id: Ibaf1dd8607e37d44a0f77788a72e344583f81fa0
2017-01-04 22:37:04 +00:00
Sarah Parker 409c0bb27e Bugfixes in pick_interinter_seg_mask
Change-Id: I5ad51375287b40170882c4816d34858be50afacd
2017-01-04 20:43:36 +00:00
Jingning Han 1a00cffdef Enable cb4x4 mode support to ext-tx experiment
This commit enables the cb4x4 mode to support ext-tx experiment. The
coding performance gains are:

       ext-tx   cb4x4    ext-tx + cb4x4
lowres  2.7%     2.6%      4.9%
midres  2.1%     1.2%      3.0%

Change-Id: I6c566b6073527262abcdbb1a0c6bcb8729988f3b
2017-01-04 20:38:07 +00:00
Jingning Han aa434238e3 Clean up ext-tx experiment
Remove unnecessary #if statements from the implementation.

Change-Id: I09c2f046aec2c43894f8dcfd99216fdf0a50451d
2017-01-04 20:38:07 +00:00
Angie Chiang 8fd2d7aa69 Remove speed feature use_lp32x32fdct
Change-Id: I6ce654b582f2a9d45a40bf22ba597b47d418a0be
2017-01-04 11:08:21 -08:00
Yushin Cho 3839548c49 Refactor PVQ codes for inter4x4
Similarly to the refactoring of PVQ codes for 4x4 intra,
instead of calling tx and pvq_encode_helper() in 4x4 inter,
av1_xform_quant() is called.

This commit gives no change in metrics.

Change-Id: Ib69efb00ed5a5b2254478bf5db5a19d9dac12b3b
2017-01-04 16:46:01 +00:00
Ryan Lei 7386eda0e0 Add an experiment to disable lpf on tile boundaries
This commit adds a new experiment to allow disabling of loop filtering
on tile boundaries. It is implemented by adding a syntax field
"loopfilter_across_tiles_enabled" into the uncompressed frame header.
If it is set to 0, the decoder and encoder will disable loop filtering for
block edges that are also tile boundaries.

Change-Id: Ib80bfd82d49c74f1ba46ae18ceedb30704ac8aa5
2017-01-04 04:59:42 +00:00
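The behaviour described in the "Add an experiment to disable lpf on tile boundaries" commit above can be sketched as a per-edge decision. Only the flag name loopfilter_across_tiles_enabled comes from the commit message; the helper functions, the uniform tile width, and the column units are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* True when the vertical block edge at column col lies between two tiles,
 * assuming uniform tiles that are tile_width columns wide. */
static bool is_tile_boundary(int col, int tile_width) {
  assert(tile_width > 0);
  return col > 0 && col % tile_width == 0;
}

/* Decides whether the edge at column col should be loop filtered. */
static bool should_filter_edge(int col, int tile_width,
                               bool loopfilter_across_tiles_enabled) {
  if (col == 0) return false; /* frame edge: no neighbour to filter against */
  if (!loopfilter_across_tiles_enabled && is_tile_boundary(col, tile_width))
    return false; /* flag is 0 and the edge is also a tile boundary */
  return true;
}
```

With 64-column tiles, the edge at column 64 is filtered only when the flag is set, while the interior edge at column 32 is filtered either way.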
Yushin Cho 900243b9c0 Refactor PVQ codes for intra4x4
In 4x4 intra search for RDO, the AV1 code was changed a while ago to
call av1_xform_quant(), while the PVQ path was not and instead called
txfm and pvq_encode_helper(), which caused duplicated code
and thus made maintenance and testing harder.

This refactor also fixes a long-standing bug,
which we couldn't find before refactoring.

PSNR    PSNR-HVS  SSIM  FAST-SSIM  CIEDE 2000 MS-SSIM
-2.77   -2.62     -2.90 -4.07       -2.94     -2.63

Change-Id: I6e526123a64af810897962d11d53028719e82e16
2017-01-03 13:28:06 -08:00
Debargha Mukherjee 5802ebe65c Add code to output counts for an encode run
If --enable-entropy-stats is on, the aggregate counts for each
frame are written out to a file named counts.stt.

Change-Id: I0c73ab872183a9dbd6d767a8c6f0642c5c117253
2017-01-03 17:29:04 +00:00
David Barker be6cc07d82 Add new convolve variant for loop-restoration
The convolve filters generated by loop_wiener_filter_tile
are not compatible with some existing convolve implementations
(they can have coefficients >128, sums of (certain subsets of)
coefficients >128, etc.)

So we implement a new variant, which takes a filter with 128
subtracted from its central element and which adds an extra copy
of the source just before clipping to a pixel (reinstating the
128 we subtracted). This should be easy to adapt from the existing
convolve functions, and this patch includes SSE2 highbd and
SSSE3 lowbd implementations.

Change-Id: I0abf4c2915f0665c49d88fe450dbc77b783f69e1
2017-01-03 17:15:29 +00:00
Jingning Han 469a5c80fc Format clean-up in entropymode.c
Change-Id: Id24b4ac814b8b9db8ca3dd8d8d0c19174f345c75
2016-12-28 14:01:27 -08:00
Jingning Han 8260d8b682 Make get_tx_type() to support cb4x4 mode
The bmi structure for sub8x8 block is deprecated in the cb4x4 mode.
Always fetch the transform type from coding block's mode_info
structure directly.

Change-Id: I8df8536e1a1723b292600018c4843e5fcc025284
2016-12-28 13:52:35 -08:00
Jingning Han 3174e5fd42 Change the transform operator table to support 2x2 transform
Change-Id: I6ccb0c884545049a7d428f8654df54c20f563392
2016-12-28 13:48:26 -08:00
Jingning Han 9e0976a464 Support sub8x8 chroma component prediction
This commit allows the sub8x8 blocks to compose and filter their
chroma components for supertx in cb4x4 mode. The coding gains of
supertx and cb4x4 are largely additive:

          supertx      cb4x4       cb4x4 + supertx
lowres     -1.0%       -2.7%        -3.64%
midres     -0.8%       -1.3%        -2.10%

Change-Id: Ie7d09f6fceb36ce375e56773728f05dd628786fe
2016-12-28 21:39:36 +00:00
Jingning Han 24f24a54e7 Rework spatial filter process in supertx
This makes the cb4x4 mode support supertx experiment. It resolves
the enc/dec mismatch issue when both experiments are turned on.

Change-Id: If3f70fb26862b4ea95d73f7030f86a399051e21e
2016-12-28 21:39:36 +00:00
Jingning Han 38b1bc45b6 Fix update_state_supertx() motion vector update
This allows the cb4x4 mode to work with ref-mv and supertx modes.

Change-Id: Ib9747d2c8a2b036fb246ca04bf7cc8c8f40931bf
2016-12-28 21:39:36 +00:00
Jingning Han 2511c661ab Refactor supertx decoding context
Use table access to replace integer to enum conversion.

Change-Id: Idb3e7e2e3267bccf322cffbe4bfaa969e9018296
2016-12-28 21:39:36 +00:00
Jingning Han feb517c8c3 Make cb4x4 mode support supertx
This commit makes the cb4x4 mode support supertx operation.

Change-Id: I1a713b2268c1029aebeb43aa6aeb0fa37b16810f
2016-12-28 21:39:36 +00:00
Jingning Han 1856e43f2f Refactor prediction filtering process in supertx
Change-Id: Id2ca1c1b03fc6dca33e86cdbc17ca0782f44a446
2016-12-28 21:39:36 +00:00
Jingning Han fc0476deca Remove redundant #if config from encodeframe.c
Change-Id: Ic278cc3296813b52bfd228544971fc62779a2d1c
2016-12-28 21:39:36 +00:00
Jingning Han 5b7706a7d8 Clean up supertx functions
Avoid comparing values from different enums.

Change-Id: I405f87942a64e86bda899b84a142c4d64414dd81
2016-12-28 21:39:36 +00:00
Yaowu Xu 415ba93b2d Remove redundant code
Change-Id: I53d1383bfe70e0508b0c91d77931a21be2b91682
2016-12-27 12:36:25 -08:00
Yushin Cho b27a17f243 Fix wrong place of setting dst with PVQ in intra 4x4
With PVQ, the dst buffer should be initialized as zero
before av1_inv_txfm_add_*() is called.
This bug seems to have been introduced while resolving conflicts
when nextgenv2 was merged.

BD-Rate change:
                PSNR  PSNR-HVS  SSIM  CIEDE 2000  MS SSIM
subset1-mono    -0.25 -0.25     -0.23 -0.26       -0.23
objective1-fast -0.17 -0.26     -0.14 -0.04       -0.18

Change-Id: I7c6b793ba0aa5f1e3d419312cbbe5c207a68f1f8
2016-12-23 16:39:54 -08:00
Arild Fuldseth (arilfuld) 788dc23f5b Fix: Make CONFIG_REFERENCE_BUFFER and CONFIG_EXT_REFS work together
BUG=aomedia:115

Change-Id: If67821ed084b01f26287ac5e032d4f5fd5a83024
2016-12-23 08:11:55 +00:00
Jingning Han 863694a436 Avoid divisions at decoder side in supertx
Change-Id: I3c52a4759780d987d045bb7b34a27ee9f7f55117
2016-12-21 10:07:26 -08:00
Jingning Han 9353124fe8 Refactor supertx implementation
Replace hard coded numbers with table access. Avoid comparing
values from different enums.

Change-Id: I43216db4a9b13ee317e8e517683946f526e5ca0e
2016-12-21 10:07:22 -08:00
Jingning Han 443c38d322 Fix 2x2 high bit-depth transform setups
This commit fixes the 2x2 transform system setups for high bit-
depth setting. It enables the cb4x4 mode to support high bit-depth
process. The coding performance is improved over high bit-depth +
ref-mv:

lowres  2.5%
midres  1.2%

Change-Id: I351f9d72bdc7e15b2bd00e94286b98966a295e6d
2016-12-21 10:05:58 -08:00
Jingning Han 002c814658 Clean up av1_build_inter_predictors_sb_extend()
Remove an unused chunk of code for sub8x8 inside this function.

Change-Id: Ie49d707b3ceeeee3b439e46c9d0e63abd4ae104c
2016-12-21 08:36:51 -08:00
David Barker 87fcb36af9 Fixes for new warped-motion/global-motion filter
* Fix a bug in warp_erroradv introduced by previous patch
* Add highbd version of the new warp filter

Change-Id: I791d3a97baf86f0cbfc72880776848f93df6daa6
2016-12-21 14:07:19 +00:00
Debargha Mukherjee e8c6f5f14b Adds prep work for 4:1 transforms
Also fixes a bug with rectangular transforms

Change-Id: Id459c18d8fdc767678452e0b20c4168a412f4de7
2016-12-21 12:21:18 +00:00
Jingning Han cc5bdf4920 Add 2x2 block level variance functions for high bd
Change-Id: I38259c4074f77a8941baefbe7585fff2eded6b12
2016-12-20 17:28:13 +00:00
Jingning Han b1ed8d7272 Add 2x2 forward and inverse transform for high bd
Add 2x2 forward and inverse 2D-DCT for high bit-depth.

Change-Id: I3092a2587a0cdc6675a69cc9203499a530b65325
2016-12-20 17:28:13 +00:00
Nathan E. Egge b69cb5289a Delete duplicate cdf updating code.
The mv cdfs are updated below in calls to av1_set_mv_cdfs().

av1-master@2016-12-19T17:27:14Z -> av1-set-mv-cdfs@2016-12-19T17:27:54Z

  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000

Change-Id: I6061602a4f2cc91dadf79e39d513c9678b9d2075
2016-12-19 16:47:34 -05:00
Guillaume Martres e50d9ddbfd Fix uninitialized entropy contexts
When enable_optimize_b is false in av1_encode_intra_block_plane the
entropy contexts were never initialized.

No changes on metrics for objective-1-fast when no experiment is
enabled.

Change-Id: Ic68913f6400d2becbaec3cc14214a0257530ed0b
2016-12-19 16:52:47 +00:00
Jingning Han fab16037d9 Scale reference motion vector search step size
This commit allows the dynamic motion vector referencing system to
scale its search range according to the coding block size. This
provides higher search resolution for smaller size coding unit.

The cb4x4 mode improves the compression performance across all the
test sets:

         avg     low    mid    high
lowres   2.8%    2.4%   3.1%   3.0%
midres   1.3%    0.3%   1.8%   2.7%
hdres    0.9%    0.5%   1.4%   1.5%

Change-Id: I1bc501506a9f2f06071c5274391f6bd053b235a7
2016-12-19 16:30:38 +00:00
Jingning Han ed8f396451 Refactor loop filter frame border condition
Use the proper scaling factor to decide if a block is sitting on
the frame border. This refactor does not change the coding
statistics of the code base. It fixes an enc/dec mismatch issue
due to out of boundary memory access in the cb4x4 mode.

Change-Id: Ia1e999c0f4e4ef10aac6120e69c1fb10a738dd4d
2016-12-19 16:30:38 +00:00
Jingning Han 6a9b24003c Fill the token cost for 2x2 transform blocks
Refactor the fill_token_cost() function to automatically compute
the token cost array for all transform block sizes.

Change-Id: I2f44c9c08fb169bc14282ba48bce23577b1ab184
2016-12-19 16:30:38 +00:00
Jingning Han 8363063d9e Allow 2x2 transform block forward model update
This commit allows the forward model update for 2x2 transform
block size.

Change-Id: Ie08c401e488b3872be0d92640468857f0aa0d0b3
2016-12-19 16:30:38 +00:00
Thomas Davies 0090c8fb0a Turn on delta_q by default.
Also make sure that qindex is clipped to the quantizer range.

Change-Id: I3163da4b45e190f9ab34982d1bbbefa5cba7514e
2016-12-19 11:49:48 +00:00
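The clipping mentioned in the "Turn on delta_q by default" commit above amounts to clamping the adjusted index to the valid quantizer range. In this sketch the bounds 0 and 255 and the function names are assumptions for illustration; the codebase may use different limits.

```c
#include <assert.h>

#define MIN_QINDEX 0   /* assumed lower bound of the quantizer range */
#define MAX_QINDEX 255 /* assumed upper bound of the quantizer range */

static int clamp_int(int v, int lo, int hi) {
  assert(lo <= hi);
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Applies a signed per-block delta to the base index and keeps the result
 * inside the quantizer range. */
static int apply_delta_q(int base_qindex, int delta) {
  return clamp_int(base_qindex + delta, MIN_QINDEX, MAX_QINDEX);
}
```

A base index of 250 with a delta of +16 is held at 255, and a base index of 4 with a delta of -16 is held at 0, so later table lookups indexed by qindex stay in range.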
David Barker be1286011c WIP: New warp filter for global motion
Do not merge; the highbd filter is not implemented yet.

Change-Id: I8f3322f5ab932b0f2e45f3590c135b70d711d915
2016-12-17 20:19:28 +00:00
Alex Converse 2cdf0d85a2 Specify ANS window size at initialization
Change-Id: Ia1757d580dd230d9e743b1f8c3e87df164008684
2016-12-17 03:56:10 +00:00
Jingning Han 6de954c3e0 Fix multi-thread encoding for cb4x4 mode
This commit makes the encoder properly account for all transform
block sizes when combining statistics from encoding threads.

Change-Id: I010acd3b247dc890f63756d3d1436b1fb52ea2d9
2016-12-16 22:45:53 +00:00
Sarah Parker 569eddab48 Add temporary dummy mask for compound segmentation
This uses a segmentation mask (which is temporarily computed arbitrarily)
to blend predictors in compound prediction. The mask will be computed
using a color segmentation in a followup patch.

Change-Id: I2d24cf27a8589211f8a70779a5be2d61746406b9
2016-12-16 22:18:07 +00:00
Nathan E. Egge 9d9eb6c609 Use partition_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I62b662052a4b9b1de07575824410aa9b2ce2c924
2016-12-16 21:30:50 +00:00
Nathan E. Egge 3129606766 Use segment tree_cdf with CONFIG_EC_MULTISYMBOL.
Change-Id: I0005c896a243275c052a0163a5da0f9230071743
2016-12-16 21:30:50 +00:00
David Barker 136cfe4614 Remove copy_border from loop restoration filters
This function corrected for the fact that the old bilateral and
Wiener filters would not write to the outermost 3 pixels of the
destination. Now that the bilateral filter has been removed and
the Wiener filter has been rewritten, this is no longer necessary.

No effect on performance

Change-Id: I3f3b0a759bdb9ff1e2407affe963388e76a9c9e6
2016-12-16 20:55:44 +00:00
Debargha Mukherjee 519dbcf19b Further optimizations of loop restoration
Change-Id: I4c4300f3f565d8aecf65669b77aaa874bb73a3a0
2016-12-16 20:54:53 +00:00
Yi Luo f10cba2b39 Apply the rect fwd tx changes to SSE2 optimization
- Apply changes on tx_size: 4x8, 8x4, 8x16, 16x8.
- Turn on corresponding unit tests on SSE2.
- Partially fix aomedia:113.

Change-Id: I29d15540ab8e9e3681e9caa54e5162bcbbd7af11
2016-12-16 16:54:30 +00:00
Jingning Han b6cb890fee Fix av1_has_bottom() logic in cb4x4 mode
This resolves an enc/dec mismatch issue in the intra prediction
modes.

Change-Id: I8655621332d955e718b9341e208e837c91e2acf0
2016-12-16 16:20:08 +00:00
Debargha Mukherjee 999d2f65e2 More cleanups / fixes on loop-restoration buffers
Also includes some minor renaming of macros.

Change-Id: I9493cc97c6ec9c8dae8020a05a02d6f322db9a02
2016-12-16 06:46:35 +00:00
hui su 18885360ed Fix ext-intra EndToEndTest failure under highbitdepth
Failure brought by 45dc597a

Also harmonize the high-bit-depth and regular versions
of directional intra prediction.

Change-Id: I7ed6602ccbfb53470cb7e9d8f428b17a860ca596
2016-12-15 22:00:01 +00:00
Guillaume Martres 930118c5c5 PVQ: Fix incorrect calculation of rd_stats
When PVQ is on, we re-encode at the end of choose_tx_size_type_from_rd to
get the entropy contexts right. Previously this was done using
txfm_rd_in_plane, but that differs from the encodes done in the loop,
which use txfm_yrd. As a result, rd_stats was set incorrectly at
the end of choose_tx_size_type_from_rd when PVQ is on.

Results on objective-1-fast with --limit=5:

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.5803 | -1.0598 | -1.4565 |  -0.3377 | -0.8153 | -0.5934 |    -0.9943

See https://goo.gl/Hvv0E2

Change-Id: Iccc7b0afaff849f959a0084eb48dbb838bc3cb1a
2016-12-15 20:37:04 +00:00
Jingning Han 6dbacf2859 Fix partition decoding for 4x4 block
Stop reading partition information at 4x4 block.

Change-Id: I2b33f5ad307aa3051a1b1e230b7f6953ec6cecc6
2016-12-15 20:19:19 +00:00
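The idea in 6dbacf2859 (no partition information is read once a block is 4x4) can be sketched as a recursive descent. This is a simplified model under stated assumptions: it uses a single split/no-split flag rather than the codec's full partition types, and `FlagReader`, `read_split_flag`, and `decode_partition` are hypothetical names, not the decoder's actual functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bitstream reader: one flag per byte. */
typedef struct {
  const unsigned char *flags;
  size_t count;
  size_t pos;
} FlagReader;

/* Read the next split flag; treat an exhausted stream as "no split". */
static int read_split_flag(FlagReader *r) {
  assert(r != NULL);
  return r->pos < r->count ? r->flags[r->pos++] : 0;
}

/* Returns the number of leaf blocks under a square block of the
 * given size. At 4x4 the recursion stops without reading anything. */
static int decode_partition(FlagReader *r, int block_size) {
  if (block_size <= 4) return 1;     /* smallest unit: nothing to read */
  if (!read_split_flag(r)) return 1; /* block is coded unsplit */
  int leaves = 0;
  for (int i = 0; i < 4; ++i) {
    leaves += decode_partition(r, block_size / 2);
  }
  return leaves;
}
```

For an 8x8 block whose first flag is 1, this sketch consumes exactly one flag and yields four 4x4 leaves.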
Jingning Han e2ffaf884d Add 2x4 and 4x2 variance functions
Change-Id: Ic2fbc66e9212da32930c6a8ba1a749e3a37c5b9a
2016-12-15 20:19:19 +00:00
Jingning Han bf9c6b7606 Enable 4x4 block partition search
This commit enables the 4x4 level block partition search. It turns
on the 4x4 level coding block unit.

Change-Id: I7251db10176fd6c4f853604d263170721252dd4f
2016-12-15 20:19:19 +00:00
Guillaume Martres cc60daae77 Remove no longer necessary intra block reencode again
This is the same change as a94997aa90. It
has to be applied again because it was accidentally removed in the merge of
nextgenv2 (f883b42cab).

Change-Id: Ic9c47766e9e7d189885ce2c774b92d1796a9a574
2016-12-15 19:43:45 +00:00
Debargha Mukherjee 874d36d9c9 Misc cleanups and enhancements on loop restoration
Includes:
Some cleanups/refactoring
Better buffer management.
Some preps for future chrominance restoration.

Change-Id: Ia264b8989b5f4a53c0764ed3e8258ddc212723fc
2016-12-15 19:11:46 +00:00
Zoe Liu 4d44f5ab79 Make a small cleanup on wedge compound prediction
Change-Id: I6624b12f00e3862d9c05f6c26bbfa50106212bff
2016-12-15 17:04:13 +00:00
Jingning Han 271bb2c428 Enable rate-distortion optimization search to support cb4x4
This commit makes the rate-distortion optimization search of a
given block size support 4x4 level coding block unit.

Change-Id: I0149c3576af929bf2feb1c40850b53b21b3dca71
2016-12-15 16:47:56 +00:00
Jingning Han 5226184268 Unify prediction mode write and read operations
Unify the prediction mode write and read for all block sizes.

Change-Id: I32415fa4d9413978324597f7879c29963afe8118
2016-12-15 16:47:56 +00:00
Jingning Han e03fa610a3 Rework pc_tree allocation
Support 4x4 level coding block context_tree. This would make the
leaf nodes redundant. Need to remove those after cb4x4 mode is
stable.

Change-Id: Ida33eddbca384a949bb0bf46b7dabaadcab42542
2016-12-15 16:47:56 +00:00
Jingning Han 599461d620 Make encoder mv count sync with decoder behavior
Change-Id: Ia2e60e6ac1cb342b26ffa919b40c77284921b8e0
2016-12-15 16:47:56 +00:00
Angie Chiang 84f05d2c78 One-dir 12 sharp filter only in highbitdepth mode
When both directions pick the sharp filter, the horizontal direction uses
the 12-tap sharp filter and the vertical direction uses the 8-tap sharp filter.

BDRate performance drops slightly.
       BDRate
lowres -0.083%
midres -0.073%
hdres  -0.016%

Change-Id: I6dc075af98f6b4fae558827424a7dd8f38d56503
2016-12-14 21:53:07 -08:00
Angie Chiang 9e963dc0ed Shorter-tap interp first in highbitdepth mode
BDRate varies within +-0.04%

Change-Id: I76f440c479d411c09ef39a19b46eb8dbc5330efb
2016-12-15 05:49:59 +00:00
Jingning Han b2f0c338b2 Change table definitions to support 4x4 coding block
Change-Id: I93493abe3c412fc10f5bb5a2eb157c8db277f4e0
2016-12-15 05:07:02 +00:00
Jingning Han d7d20477f3 Remove the use case of bmi->as_mode
Remove the use case of bmi->as_mode in cb4x4 mode. Its function is
covered by 4x4 level mode_info.

Change-Id: I04abc1b7a0a97c12c3b6fddc1f16f7045512772e
2016-12-15 05:07:02 +00:00
Jingning Han 8570b35d64 Streamline reference motion vector search for all block sizes
Take out the functions set for sub8x8 block sizes.

Change-Id: I15836df44051f2c8679c317d52eab9ef55fb5b17
2016-12-15 05:07:02 +00:00
Jingning Han b46540ca2b Simplify motion compensated predictor logic
No special handling is needed for sub8x8 block sizes.

Change-Id: I8487cd68eda0882fe50550af3998dc941ec13b21
2016-12-15 05:07:02 +00:00
Yaowu Xu bf1d62dd2f Move large buffers from stack to heap
This commit moves a number of large buffers from stack to heap to fix
crashes due to stack overflow.

Change-Id: I9d1592e4f6dbfa18a475d0fc5674f6d3632f39ed
2016-12-15 04:55:25 +00:00
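The pattern behind bf1d62dd2f, replacing a large local array with a heap allocation, might look like the sketch below. The buffer size `SKETCH_BUF_SAMPLES` and the function `process_with_heap_buffer` are hypothetical, and plain `malloc`/`free` stand in for whatever allocator the codebase uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer size: 1M 16-bit samples, i.e. 2 MiB. */
#define SKETCH_BUF_SAMPLES (1u << 20)

/* Returns 0 on success, -1 if the allocation fails. */
static int process_with_heap_buffer(uint32_t *checksum_out) {
  assert(checksum_out != NULL);

  /* Before: `uint16_t tmp[SKETCH_BUF_SAMPLES];` would put 2 MiB on
   * the stack, which can overflow small thread stacks. */
  uint16_t *tmp = (uint16_t *)malloc(SKETCH_BUF_SAMPLES * sizeof(*tmp));
  if (tmp == NULL) return -1;

  /* Placeholder work on the buffer. */
  memset(tmp, 0, SKETCH_BUF_SAMPLES * sizeof(*tmp));
  tmp[0] = 1;
  tmp[SKETCH_BUF_SAMPLES - 1] = 2;
  *checksum_out = (uint32_t)tmp[0] + tmp[SKETCH_BUF_SAMPLES - 1];

  free(tmp); /* heap buffers must be released explicitly */
  return 0;
}
```

The trade-off is an explicit failure path and a `free` on every exit, in exchange for stack usage that no longer depends on the buffer size.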
Yushin Cho 70669125c6 Enable the activity masking codes of PVQ
Turned off by default.

TODO: The distortion function of Daala should be added
to make activity masking fully work.

Note that the PVQ QM matrix (i.e. the scaler for each band of a
transform block) is calculated on the decoder side in exactly the same
way as in the encoder. In Daala, this matrix is written to the bitstream
and the decoder does not generate it.

Activity masking can be turned on by setting the flag below to 1:

Change-Id: I44bfb905cb4e0cad6aa830a4c355cd760a993ffe
2016-12-14 22:49:58 +00:00
Jingning Han 41bb339627 Support 4x4 block unit decoding
Unify the block decoding process for all coding block sizes.

Change-Id: I7bfb482e9b5266f144e280b3ed713927a5ddc572
2016-12-14 22:48:53 +00:00