Commit graph

219 commits

Author SHA1 Message Date
Steinar Midtskogen 4f0b3ed8b8 Retune the CLPF kernel
CLPF performance had degraded by about 0.5% over the past six months,
which isn't totally surprising since the codec is a moving target.
About half of that degradation comes from the improved 7 bit filter
coefficients.  Therefore, CLPF needs to be retuned for the current
codec.

This patch makes two (normative) changes to the CLPF kernel:

* The clipping function was changed from clamp(x, -s, s) to
      sign(x) * max(0, abs(x) - max(0, abs(x) - s +
             (abs(x) >> (bitdepth - 3 - log2(s)))))
  This adds a rampdown to 0 at -32 and 32 for 8 bit (-128 and 128
  for 10 bit, etc.), so large differences are ignored.

* 8 taps instead of 6 taps:
               1
    4          3
  13 31  ->  13 31
    4          3
               1

AWCY results: low delay  high delay
PSNR:           -0.40%     -0.47%
PSNR HVS:        0.00%     -0.11%
SSIM:           -0.31%     -0.39%
CIEDE 2000:     -0.22%     -0.31%
APSNR:          -0.40%     -0.48%
MS SSIM:         0.01%     -0.12%

About 3/4 of the gains come from the new clipping function.
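
As a rough illustration of the clipping change described above, the following Python sketch puts the old clamp next to the new formula. The function names are hypothetical, and the sketch assumes the strength s is a power of two (so that log2(s) is an integer) and small enough that the shift stays non-negative; neither assumption comes from the patch itself.

```python
def old_clip(x, s):
    """Old behaviour: clamp(x, -s, s)."""
    return max(-s, min(s, x))


def new_clip(x, s, bitdepth=8):
    """New behaviour: follow x up to s, then ramp back down to 0.

    Assumes s is a power of two, so log2(s) is an integer (illustrative
    assumption, not taken from the patch).
    """
    log2_s = s.bit_length() - 1
    shift = bitdepth - 3 - log2_s
    a = abs(x)
    magnitude = max(0, a - max(0, a - s + (a >> shift)))
    return magnitude if x >= 0 else -magnitude


if __name__ == "__main__":
    # Compare both functions for strength 4 at 8 bit.
    for x in (2, 4, 8, 16, 32, 40):
        print(x, old_clip(x, 4), new_clip(x, 4))
```

With s = 4 at 8 bit, new_clip follows x up to 4, then decays and reaches 0 at 32, whereas old_clip stays at 4 for every larger difference.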

Change-Id: Idad9dc4004e71a9c7ec81ba62ebd12fb76fb044a
2017-02-10 23:00:16 +00:00
Alex Converse b13ce13c94 ans: Increase the base state to 1<<17.
ans_multion@2017-01-25T21:00:51.374Z ->
ans_multion_rabs17@2017-01-27T19:25:33.101Z
objective-1-fast
   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0494 | -0.0494 | -0.0494 |  -0.0475 | -0.0484 | -0.0488 | -0.0497

Increasing the state any further seems to yield a compression drop.

Change-Id: Iacfd6af7e2b8a47c41033d61e338c5106bd3679c
2017-02-08 17:56:30 +00:00
Steinar Midtskogen 73ad523642 Add support for disabling CLPF on tile boundaries
Change-Id: Icb578f9b54c4020effa4b9245e343c1519bd7acb
2017-02-08 06:41:20 +00:00
Urvang Joshi 7a40600c8a ALT_INTRA: Integerize the weights for SMOOTH_PRED.
Insignificant change in BDRate.

Change-Id: Id1aa798393fd4c4c174dfcb9a8315828b531996f
2017-02-07 06:36:14 +00:00
Alex Converse c54692b5ac ans: Switch from uABS to rABS
This is in preparation for expanding the state range.

No discernible compression impact

ans_multioff@2017-01-25T20:58:18.756Z -> ans_multioff_rabs@2017-01-26T01:05:12.801Z

     PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
  -0.0001 | -0.0001 | -0.0001 |  -0.0001 | -0.0001 | -0.0001 | -0.0001

https://arewecompressedyet.com/?job=ans_multioff%402017-01-25T20%3A58%3A18.756Z&job=ans_multioff_rabs%402017-01-26T01%3A05%3A12.801Z

Change-Id: Ie1817991190f1de6d9c31e0c97f77efbd5869d35
2017-02-04 02:40:26 +00:00
Tom Finegan 0d3aeda300 Remove unused assembly sources and associated tests.
Change-Id: Ic8386743b1852ca1074528d04e2adc1d191b091b
2017-02-02 17:48:17 +00:00
Alex Converse c957af9c1d Remove unused aom_complement()
Change-Id: I84ebc5acd57aa1cf735fae9cdb56f225afcb3a63
2017-02-02 17:31:44 +00:00
Alex Converse 2ce4bd4208 ec_multisymbol: Add defines CDF_PROB_BITS, CDF_PROB_TOP
Change-Id: I6c1717ad82d05ebe22327aec6989af2c0db336e3
2017-02-02 15:09:10 +00:00
Alex Converse e8b34bb1eb ans: Remove some dead code.
This was part of the old ans zero token handling. It has been replaced
by the new ec_multisymbol zero token handling.

Change-Id: I9c1fcb42ac0d214178cf4fbf8755ad68dcbbc11f
2017-02-01 23:24:02 +00:00
Urvang Joshi 5bb97ed064 ALT_INTRA experiment: Use single set of weights for SMOOTH_PRED
2nd set of weights can be derived from the 1st.

Insignificant change in BDRate.

Change-Id: I68d6fc256f532d52573583f121dd28fd8913ce3a
2017-02-01 20:47:36 +00:00
Sebastien Alaiwan d0e23b4061 Merge dct_const_round_shift functions.
Change-Id: I73e3eec0b8fd17c3f9b9f52afc9fac43f3043028
2017-02-01 16:35:09 +00:00
Johann cda0b5e46c highbitdepth + loop restoration: fix build on x86 32 bit
When the functions were added in
https://aomedia-review.googlesource.com/6545 they were not restricted to
x86_64 builds.

Fixes "undefined reference to
`aom_highbd_convolve8_add_src_sse2'" for --target=x86-linux-gcc

Also remove SSE2 specializations from
`aom_highbd_convolve8_add_src[_horiz/_vert]`, since those functions
don't actually have SSE2 versions (this was left in by accident
in the original patch).

Change-Id: I9f7d0c11b58b6f5a0e6a1fdaed0f92175bdeab34
2017-01-27 16:36:30 +00:00
Alex Converse 822513c88c ans: Support a larger state range in reverse serialization
Change-Id: Ic3a6f9d16a16f347fb36b94e6dca70d9436b984e
2017-01-27 01:11:38 +00:00
Urvang Joshi bca73c4cb8 Make ALT_INTRA work with CB4X4.
Change-Id: Ibc1803c3d149c6a53d1817798d0cab6dc5ab5927
2017-01-24 14:54:42 -08:00
James Zern 9303b941d8 aom_subpixel_8t_intrin_avx2: tolerate unversioned clang
assume __clang_major__==0 has the latest version of
_mm256_broadcastsi128_si256. fixes builds with custom clang toolchains.

cherry-picked from libvpx:
33aef48f2 vpx_subpixel_8t_intrin_avx2: tolerate unversioned clang

BUG=b/30970831

Change-Id: I90becd56278e4716bd46e2ba9d910af977e8dfa6
2017-01-20 17:56:40 -08:00
Alex Converse eb780e7167 Add a control to set the ANS window size
Change-Id: I3d64ec4bbc72143b30a094ece7a6c711d6b479cd
2017-01-19 17:22:44 +00:00
Thomas Davies ef97ec0b50 EC_ADAPT: faster CDF update.
Also fix warning.

Change-Id: Ia515360af9c3269901eb0d002d326b7af43a00e7
2017-01-13 10:52:32 +00:00
Alex Converse 346440bd74 Use the standard aom_reader_init() interface for ans
Change-Id: I4a0f0a775362e6e43cd28ed29bf83c912cdc7df5
2017-01-11 17:29:55 +00:00
Nathan E. Egge 2d8dd96635 Use const cdf with aom_read_symbol().
Change-Id: I6e60d9083da8a2d8f7e182e4f12704eddd170df6
2017-01-09 17:03:21 +00:00
Nathan E. Egge e069849592 Split aom_read_cdf() from aom_read_symbol().
Separate the aom_read_cdf() functionality from aom_read_symbol() which
 can optionally adapt the cdf when run with --enable-ec_adapt.

Change-Id: I5446d6402835dfcf68d3462a2bd8835704fe6603
2017-01-09 17:03:21 +00:00
Nathan E. Egge 58ef551fd7 Use const cdf with aom_write_cdf().
Change-Id: I0e254b52d3e347a96f38922c3f00993d3e70538f
2017-01-09 17:03:21 +00:00
Nathan E. Egge 87d44dc749 Split aom_write_cdf() from aom_write_symbol().
Separate the aom_write_cdf() functionality from aom_write_symbol() which
 can optionally adapt the cdf when run with --enable-ec_adapt.

Change-Id: Ibc58690eddb647d69f08d72f0f0712779aab11d1
2017-01-09 17:03:21 +00:00
Steinar Midtskogen 83307f33f2 Fix typos in comments
Change-Id: Id70b49e2a77c6837da75c684d622ddfe60f3d97e
2017-01-07 10:26:28 +01:00
Steinar Midtskogen d954f2d77d Disable unsupported SIMD optimisations for CLPF for 32 bit VS targets
VS compiling for 32 bit targets does not support vector types in
structs as arguments, which makes the v256 type of the intrinsics hard
to support, so optimizations for this target are disabled.

Change-Id: I675394cf1aed0cb18a48f21216470867031b30ce
2017-01-07 08:59:56 +00:00
Nathan E. Egge c98d286385 Add API for coding symbols with unscaled CDFs.
Add aom_write_symbol_unscaled() and aom_read_symbol_unscaled() calls
 for encoding and decoding symbols with non-dyadic CDFs, e.g. CDFs that
 don't add up to 32768.
This currently only works with the DAALA_EC backend, but does support
 AOM bit accounting.

Change-Id: Icb37500f1b051dd2e8893ff0920302ece1d6ccfd
2017-01-05 04:19:55 -05:00
David Barker be6cc07d82 Add new convolve variant for loop-restoration
The convolve filters generated by loop_wiener_filter_tile
are not compatible with some existing convolve implementations
(they can have coefficients >128, sums of (certain subsets of)
coefficients >128, etc.)

So we implement a new variant, which takes a filter with 128
subtracted from its central element and which adds an extra copy
of the source just before clipping to a pixel (reinstating the
128 we subtracted). This should be easy to adapt from the existing
convolve functions, and this patch includes SSE2 highbd and
SSSE3 lowbd implementations.
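
The idea of subtracting 128 from the central tap and adding the source back before clipping can be sketched in one dimension as follows. This is a minimal Python illustration, not the code from the patch: the helper names, the 8-tap layout with the centre at index 3, and the 7-bit rounding are all assumptions made for the example.

```python
FILTER_BITS = 7  # assumed: the full filter's taps sum to 1 << 7 = 128


def clip_pixel(v, bitdepth=8):
    """Clip to the valid pixel range for the given bit depth."""
    return max(0, min((1 << bitdepth) - 1, v))


def round_shift(v, bits=FILTER_BITS):
    """Round to nearest and shift right (floor shift, also for negatives)."""
    return (v + (1 << (bits - 1))) >> bits


def centre_index(taps):
    """Assumed centre position: index 3 for an 8-tap filter."""
    return len(taps) // 2 - 1


def reduce_taps(taps):
    """Return a copy of the filter with 128 subtracted from its centre tap."""
    reduced = list(taps)
    reduced[centre_index(taps)] -= 1 << FILTER_BITS
    return reduced


def convolve_plain(src, taps, pos):
    """Reference 1-D convolution with the full filter at src[pos]."""
    c = centre_index(taps)
    acc = sum(t * src[pos - c + k] for k, t in enumerate(taps))
    return clip_pixel(round_shift(acc))


def convolve_add_src(src, reduced_taps, pos):
    """Variant: filter with the reduced taps, then add the source pixel
    back just before clipping (reinstating the subtracted 128)."""
    c = centre_index(reduced_taps)
    acc = sum(t * src[pos - c + k] for k, t in enumerate(reduced_taps))
    return clip_pixel(round_shift(acc) + src[pos])
```

Because 128 times the source pixel is an exact multiple of the rounding divisor, both functions return the same value, while the reduced filter keeps every coefficient small enough for convolve code that expects taps below 128.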

Change-Id: I0abf4c2915f0665c49d88fe450dbc77b783f69e1
2017-01-03 17:15:29 +00:00
Jingning Han cc5bdf4920 Add 2x2 block level variance functions for high bd
Change-Id: I38259c4074f77a8941baefbe7585fff2eded6b12
2016-12-20 17:28:13 +00:00
Jingning Han 324b4c6d6a Add 2x2 intra predictor for high bit-depth
Provide primitive modules for cb4x4 mode use. This resolves compiler
warnings when both high bit-depth and cb4x4 mode are turned on.

Change-Id: If6ecac50578b3e665b602419a0701c3e047ce623
2016-12-20 17:28:13 +00:00
Alex Converse 2cdf0d85a2 Specify ANS window size at initialization
Change-Id: Ia1757d580dd230d9e743b1f8c3e87df164008684
2016-12-17 03:56:10 +00:00
Jingning Han 8a7786d247 Fix 2x2 d45 intra prediction
This commit fixes the 2x2 d45 intra prediction. It avoids the use
of out-of-boundary position as reference. This resolves an enc/dec
mismatch issue in cb4x4 mode.

Change-Id: I93d01536a0c004190cc9fe3c724bf41364f6fdde
2016-12-16 16:20:08 +00:00
Jingning Han e2ffaf884d Add 2x4 and 4x2 variance functions
Change-Id: Ic2fbc66e9212da32930c6a8ba1a749e3a37c5b9a
2016-12-15 20:19:19 +00:00
Debargha Mukherjee 874d36d9c9 Misc cleanups and enhancements on loop restoration
Includes:
Some cleanups/refactoring
Better buffer management.
Some preps for future chrominance restoration.

Change-Id: Ia264b8989b5f4a53c0764ed3e8258ddc212723fc
2016-12-15 19:11:46 +00:00
Angie Chiang 9e963dc0ed Shorter-tap interp first in highbitdepth mode
BDRate varies within +-0.04%

Change-Id: I76f440c479d411c09ef39a19b46eb8dbc5330efb
2016-12-15 05:49:59 +00:00
Nathan E. Egge 67b9921bbf Fix aom_write_bit() to match aom_read_bit().
The aom_write_bit() was not calling buf_uabs_write_bit() while the
 aom_read_bit() function was calling uabs_read_bit().

Change-Id: If98975341472988e8d809aa80a647d7a2531e21e
2016-12-15 02:05:58 +00:00
Nathan E. Egge 08c99eb30f Explicitly call daala read/write bit functions.
Calling aom_write_bit() and aom_read_bit() with --enable-daala_ec
 would call aom_write() and aom_read() with probability 128 which
 would ultimately call od_ec_enc_bits() and od_ec_dec_bits().
This refactors that code and makes the call explicit.

objective-1-fast:
master@2016-12-14T18:38:33Z -> daala_ec_bits@2016-12-14T18:36:22Z

    PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
  0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000

Change-Id: Ib69e98734fadcdc8b89936b7b6fbd0574afc7e34
2016-12-15 02:05:58 +00:00
Nathan E. Egge 90b305a9b9 Compute token_stats in aom_write_bit_record() function.
The RD_DEBUG experiment computes stats in the _record() functions which
 then proxy calls through to the actual bit writer.
The aom_write_bit_record() should proxy calls through to aom_write_bit()
 instead of aom_write() with probability 128.

Change-Id: I7617fad0f2c25dc05cf111c660a90068c3f4c513
2016-12-15 00:45:26 +00:00
David Barker 025b25459d Change Wiener filter in loop-restoration
The Wiener filter now uses the same convolution code as the
inter predictors.

Change-Id: Ia3bfbc778171eb25c6a0141426d1f69d92c17992
2016-12-14 18:58:21 +00:00
Alex Converse 5b5140b06e Unfork some ANS setup code
Change-Id: I85e1b3cc4174029b6d1bfa4109b37793537071c2
2016-12-14 17:56:22 +00:00
Steinar Midtskogen ea42c4e969 Remove aom_simd.c and replace simd_check with macro
Change-Id: If2bb7ab2b16ba44e2d6e43eeb8713aa6c05d9d7c
2016-12-13 08:25:12 +00:00
Alex Converse b0be6411db ans: Use a fixed N-symbol window
Accept a small compression loss in exchange for a fixed-size encoder-side
buffering requirement.

subset1:
rans_base@2016-12-02T22:55:56.809Z -> rans_nsym@2016-12-02T22:58:19.859Z

    PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
  0.0304 |  0.0303 |  0.0305 |   0.0317 | 0.0312 |  0.0309 |     0.0301

Change-Id: I09dd143e4f1638b97dc9bba7023efa837a7d48c7
2016-12-12 21:28:43 +00:00
Yi Luo e98325848d High bit depth motion search SAD optimization on avx2
- For all blocks with width >= 16.
- Add test_count to make the unit tests harder to pass.
- Speed testing on 1080p, 100 frames, 5 Mbps, CPU, i7-6700
  User level time reduction:
   baseline:                  3.68%
   baseline + ext-partition: 36.12%

Change-Id: I78c5d9ca216f0fd91f1a360dca2190b11fd54a08
2016-12-09 21:14:48 +00:00
Angie Chiang 48c06da2d0 Remove saturate_int16 from fdct_round_shift
1) Not every transform's internal signal is designed to fit in 16 bits.
2) If overflow happens in this function, it indicates that we need to
adjust the txfm's scaling. We shouldn't mute the overflow signal.
3) Saturation might be handy when all of our transform designs are stable,
but I don't think we are at the stable point yet.
4) This will fix C/Trans16x16DCT.AccuracyCheck/1 failure in highbd mode.
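
To illustrate point 2, here is a small Python sketch of how saturation can hide an overflow that a plain rounding shift leaves visible. The function bodies and the DCT_CONST_BITS value of 14 are assumptions for the example and are not quoted from the patch.

```python
DCT_CONST_BITS = 14  # assumed rounding precision, not stated in the message

INT16_MIN, INT16_MAX = -32768, 32767


def round_shift(x, bits=DCT_CONST_BITS):
    """Round to nearest and shift right by the given number of bits."""
    return (x + (1 << (bits - 1))) >> bits


def saturate_int16(x):
    """Clamp a value into the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))


def fdct_round_shift_saturating(x):
    """Hypothetical old behaviour: overflow is silently clamped away."""
    return saturate_int16(round_shift(x))


def fdct_round_shift(x):
    """Hypothetical new behaviour: the out-of-range value is kept."""
    return round_shift(x)


def fits_int16(x):
    """True if x is representable as a signed 16-bit integer."""
    return INT16_MIN <= x <= INT16_MAX
```

For an intermediate value such as 40000 << 14, the saturating version returns 32767 and looks valid, while the plain version returns 40000, which fails fits_int16 and so signals that the transform's scaling needs adjusting.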

Change-Id: I5ef5d130c22adb4b8c3b608ffcb0f2c99dc7523f
2016-12-09 18:13:32 +00:00
Tristan Matthews 3fb5c4c0bc intrapred_sse2: Fix nasm build
Fixes Issue 96: https://bugs.chromium.org/p/aomedia/issues/detail?id=96&q=&desc=3

Change-Id: I47381ef3930368901c7c2ca6d7f9064216de8ad0
2016-12-07 18:45:30 +00:00
Jingning Han 9e7c49fc8a Add 2x2 variance function
Change-Id: I73bcb8ab5727e2d07e34ca35e9e014f3c6f63d56
2016-12-07 05:47:55 +00:00
Alex Converse 1ecdf2bf33 ans: Move buf_ans_flush to the .c file
It is called relatively rarely and doesn't need to be inlined.

Change-Id: I4ee7f95548f008f2ee29da807aaca54b9a25aecd
2016-12-03 02:35:06 +00:00
Alex Converse b0bbd60685 ans: Allow compressed buffer reversal
The final ANS state gets further compacted because aliasing the super
frame marker is not an issue.

Change-Id: I26208accb117a6748abb6f1c32c28fadbc48de09
2016-12-03 02:35:06 +00:00
Alex Converse 2a1b3af329 ans: Give buf_ans ownership of the AnsCoder
Change-Id: I509bbba0d84c1d378044e2c612dd48cd8f99848d
2016-12-02 02:07:27 +00:00
Jingning Han 7833d2bfbf Enable 2x2 intra prediction
Bring 2x2 intra prediction online for chroma components.

Change-Id: Ia56af9101b2a977691bca4156a6dcf89e644b4a7
2016-12-02 01:46:59 +00:00
Alex Converse 52a4b11d51 ans: Refill state at the end of the decoding process.
This should have no effect on the bitstream format (note that there is no
related encoder change). It is like moving code from the top of the loop to
the bottom of the loop.

This change allows us to:
* Make sure we consume the final renormalization byte after the last
symbol in an ANS partition.
* Move back toward a single renormalization operation for some ANS modes
since we know the bounds of the state mutation algorithm that got us out
of the valid state range.

Change-Id: Ia80246fd0ed805aa61b913a362546b3f08e4d79c
2016-12-02 00:35:07 +00:00
Angie Chiang 7a483cffc8 Turn on SIMD optimization for dual_filter
Let aom_convolve8_### SIMD implementation support any block width.
Turn on SIMD optimization when interpolation filter types on two
directions are different.

This reduces encoding time by 30% when dual_filter and ext_interp are
both on.

Change-Id: I539dbb2737f01835034b7269656a15b2058fa3cc
2016-12-01 21:58:03 +00:00