Commit Graph

293 Commits

Author SHA1 Message Date
Jingning Han cc5bdf4920 Add 2x2 block level variance functions for high bd
Change-Id: I38259c4074f77a8941baefbe7585fff2eded6b12
2016-12-20 17:28:13 +00:00
Jingning Han 324b4c6d6a Add 2x2 intra predictor for high bit-depth
Provide primitive modules for cb4x4 mode use. This resolves compiler
warnings when both high bit-depth and cb4x4 mode are turned on.

Change-Id: If6ecac50578b3e665b602419a0701c3e047ce623
2016-12-20 17:28:13 +00:00
Alex Converse 2cdf0d85a2 Specify ANS window size at initialization
Change-Id: Ia1757d580dd230d9e743b1f8c3e87df164008684
2016-12-17 03:56:10 +00:00
Jingning Han 8a7786d247 Fix 2x2 d45 intra prediction
This commit fixes the 2x2 d45 intra prediction. It avoids the use
of out-of-boundary position as reference. This resolves an enc/dec
mismatch issue in cb4x4 mode.

Change-Id: I93d01536a0c004190cc9fe3c724bf41364f6fdde
2016-12-16 16:20:08 +00:00
Jingning Han e2ffaf884d Add 2x4 and 4x2 variance functions
Change-Id: Ic2fbc66e9212da32930c6a8ba1a749e3a37c5b9a
2016-12-15 20:19:19 +00:00
Debargha Mukherjee 874d36d9c9 Misc cleanups and enhancements on loop restoration
Includes:
Some cleanups/refactoring
Better buffer management.
Some preps for future chrominance restoration.

Change-Id: Ia264b8989b5f4a53c0764ed3e8258ddc212723fc
2016-12-15 19:11:46 +00:00
Angie Chiang 9e963dc0ed Shorter-tap interp first in highbitdepth mode
BDRate varies within +-0.04%

Change-Id: I76f440c479d411c09ef39a19b46eb8dbc5330efb
2016-12-15 05:49:59 +00:00
Nathan E. Egge 67b9921bbf Fix aom_write_bit() to match aom_read_bit().
The aom_write_bit() was not calling buf_uabs_write_bit() while the
 aom_read_bit() function was calling uabs_read_bit().

Change-Id: If98975341472988e8d809aa80a647d7a2531e21e
2016-12-15 02:05:58 +00:00
Nathan E. Egge 08c99eb30f Explicitly call daala read/write bit functions.
Calling aom_write_bit() and aom_read_bit() with --enable-daala_ec
 would call aom_write() and aom_read() with probability 128 which
 would ultimately call od_ec_enc_bits() and od_ec_dec_bits().
This refactors that code and makes the call explicit.

objective-1-fast:
master@2016-12-14T18:38:33Z -> daala_ec_bits@2016-12-14T18:36:22Z

    PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
  0.0000 |  0.0000 |  0.0000 |   0.0000 | 0.0000 |  0.0000 |     0.0000

Change-Id: Ib69e98734fadcdc8b89936b7b6fbd0574afc7e34
2016-12-15 02:05:58 +00:00
Nathan E. Egge 90b305a9b9 Compute token_stats in aom_write_bit_record() function.
The RD_DEBUG experiment computes stats in the _record() functions which
 then proxy calls through to the actual bit writer.
The aom_write_bit_record() should proxy calls through to aom_write_bit()
 instead of aom_write() with probability 128.

Change-Id: I7617fad0f2c25dc05cf111c660a90068c3f4c513
2016-12-15 00:45:26 +00:00
David Barker 025b25459d Change Wiener filter in loop-restoration
The Wiener filter now uses the same convolution code as the
inter predictors.

Change-Id: Ia3bfbc778171eb25c6a0141426d1f69d92c17992
2016-12-14 18:58:21 +00:00
Alex Converse 5b5140b06e Unfork some ANS setup code
Change-Id: I85e1b3cc4174029b6d1bfa4109b37793537071c2
2016-12-14 17:56:22 +00:00
Steinar Midtskogen ea42c4e969 Remove aom_simd.c and replace simd_check with macro
Change-Id: If2bb7ab2b16ba44e2d6e43eeb8713aa6c05d9d7c
2016-12-13 08:25:12 +00:00
Alex Converse b0be6411db ans: Use a fixed N-symbol window
Accept a small compression loss in exchange for a fixed-size
encoder-side buffering requirement.

subset1:
rans_base@2016-12-02T22:55:56.809Z -> rans_nsym@2016-12-02T22:58:19.859Z

    PSNR | PSNR Cb | PSNR Cr | PSNR HVS |   SSIM | MS SSIM | CIEDE 2000
  0.0304 |  0.0303 |  0.0305 |   0.0317 | 0.0312 |  0.0309 |     0.0301

Change-Id: I09dd143e4f1638b97dc9bba7023efa837a7d48c7
2016-12-12 21:28:43 +00:00
Yi Luo e98325848d High bit depth motion search SAD optimization on avx2
- For all blocks with width >= 16.
- Add test_count to make the unit tests harder to pass.
- Speed testing on 1080p, 100 frames, 5 Mbps, CPU, i7-6700
  User level time reduction:
   baseline:                  3.68%
   baseline + ext-partition: 36.12%

Change-Id: I78c5d9ca216f0fd91f1a360dca2190b11fd54a08
2016-12-09 21:14:48 +00:00
Angie Chiang 48c06da2d0 Remove saturate_int16 from fdct_round_shift
1) Not every transform's internal signal is designed to fit in 16 bits.
2) If overflow happens in this function, it indicates that we need to
adjust the txfm's scaling. We shouldn't mute the overflow signal.
3) Saturation might be handy when all of our transform designs are stable,
but I don't think we are at the stable point yet.
4) This will fix C/Trans16x16DCT.AccuracyCheck/1 failure in highbd mode.

Change-Id: I5ef5d130c22adb4b8c3b608ffcb0f2c99dc7523f
2016-12-09 18:13:32 +00:00
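
The argument in commit 48c06da2d0 above (saturation hides an overflow instead of exposing it) can be illustrated with a small sketch. It is not the libaom code: `DCT_CONST_BITS = 14` and the helper names `round_shift`, `saturate_int16` and `fits_int16` are assumptions chosen for illustration.

```python
DCT_CONST_BITS = 14  # assumed fixed-point precision of the transform constants


def round_shift(value: int, bits: int = DCT_CONST_BITS) -> int:
    """Divide by 2**bits, rounding to nearest (ties go toward +infinity)."""
    return (value + (1 << (bits - 1))) >> bits


def saturate_int16(value: int) -> int:
    """Clamp to the signed 16-bit range; any overflow becomes invisible."""
    return max(-32768, min(32767, value))


def fits_int16(value: int) -> bool:
    """True if the value is representable in a signed 16-bit integer."""
    return -32768 <= value <= 32767


# A hypothetical intermediate product that is too large for 16 bits after scaling.
product = 40000 << DCT_CONST_BITS
unsaturated = round_shift(product)        # overflow stays detectable
saturated = saturate_int16(unsaturated)   # overflow is silently clamped
```

Without the clamp, a unit test can check `fits_int16(unsaturated)` and flag a transform whose scaling needs adjusting; with the clamp, the same check always passes.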
Tristan Matthews 3fb5c4c0bc intrapred_sse2: Fix nasm build
Fixes Issue 96: https://bugs.chromium.org/p/aomedia/issues/detail?id=96&q=&desc=3

Change-Id: I47381ef3930368901c7c2ca6d7f9064216de8ad0
2016-12-07 18:45:30 +00:00
Jingning Han 9e7c49fc8a Add 2x2 variance function
Change-Id: I73bcb8ab5727e2d07e34ca35e9e014f3c6f63d56
2016-12-07 05:47:55 +00:00
Alex Converse 1ecdf2bf33 ans: Move buf_ans_flush to the .c file
It is called relatively rarely and doesn't need to be inlined.

Change-Id: I4ee7f95548f008f2ee29da807aaca54b9a25aecd
2016-12-03 02:35:06 +00:00
Alex Converse b0bbd60685 ans: Allow compressed buffer reversal
The final ANS state gets further compacted because aliasing the super
frame marker is not an issue.

Change-Id: I26208accb117a6748abb6f1c32c28fadbc48de09
2016-12-03 02:35:06 +00:00
Alex Converse 2a1b3af329 ans: Give buf_ans ownership of the AnsCoder
Change-Id: I509bbba0d84c1d378044e2c612dd48cd8f99848d
2016-12-02 02:07:27 +00:00
Jingning Han 7833d2bfbf Enable 2x2 intra prediction
Bring 2x2 intra prediction online for chroma components.

Change-Id: Ia56af9101b2a977691bca4156a6dcf89e644b4a7
2016-12-02 01:46:59 +00:00
Alex Converse 52a4b11d51 ans: Refill state at the end of the decoding process.
This should have no effect on the bitstream format (see also no related
encoder change). This is like moving code from the top of the loop to
the bottom of the loop.

This change allows us to:
* Make sure we consume the final renormalization byte after the last
symbol in an ANS partition.
* Move back toward a single renormalization operation for some ANS modes
since we know the bounds of the state mutation algorithm that got us out
of the valid state range.

Change-Id: Ia80246fd0ed805aa61b913a362546b3f08e4d79c
2016-12-02 00:35:07 +00:00
Angie Chiang 7a483cffc8 Turn on SIMD optimization for dual_filter
Let aom_convolve8_### SIMD implementation support any block width.
Turn on SIMD optimization when interpolation filter types on two
directions are different.

This reduces encoding time by 30% when dual_filter and ext_interp are
both on.

Change-Id: I539dbb2737f01835034b7269656a15b2058fa3cc
2016-12-01 21:58:03 +00:00
Alex Converse 5943d41e36 ans: Factor out refill_state
Change-Id: I648f4eb2954b2d138c2128bbf3f638eea31ec28f
2016-12-01 17:59:43 +00:00
Yaowu Xu bde4ac8260 change to use AOMedia copyright notice
Change-Id: I82580120a154ecd7c41f4cd9bc0f8c669fca7774
2016-11-29 00:01:36 +00:00
Alex Converse fa9c9d1cc0 Adjust how the final ANS state is written.
The new prefixes are
0: 15 bits of state are added to the base state.
10: 22 bits of state are added to the base state.
110: Reserved for super frame marker
111: 29 bits of state are added to the base state.

The likelihood of any final state is proportional to 1 / state. Given a
state range of [2**15, 2 **23) this should save on average 0.4 bits
per serialized final state.

BDRATE
subset1: -.000%
lowres: -.010%

Change-Id: I8e66e4a6667f5692c541083e6d6edc35ff411181
2016-11-28 23:25:46 +00:00
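
The prefix table in commit fa9c9d1cc0 above can be read as a small variable-length code. The sketch below picks the prefix for a given final state. It assumes that the "base state" is 2**15, the lower bound of the quoted state range; `L_BASE`, `final_state_code` and `serialized_bits` are illustrative names, not libaom identifiers.

```python
from typing import Tuple

L_BASE = 1 << 15  # assumption: base state equals the lower bound of the state range


def final_state_code(state: int) -> Tuple[str, int]:
    """Return (prefix, payload_bits) used to serialize a final ANS state."""
    delta = state - L_BASE
    if delta < 0:
        raise ValueError("state is below the base state")
    if delta < (1 << 15):
        return "0", 15
    if delta < (1 << 22):
        return "10", 22
    if delta < (1 << 29):
        return "111", 29  # "110" is reserved for the super frame marker
    raise ValueError("state does not fit in 29 payload bits")


def serialized_bits(state: int) -> int:
    """Total bits spent on the prefix plus the payload."""
    prefix, payload_bits = final_state_code(state)
    return len(prefix) + payload_bits
```

Under these assumptions the smallest states cost 16 bits, mid-range states 24 bits, and the largest states 32 bits, which is why a distribution skewed toward small states benefits from the short prefix.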
Yi Luo 9e218747c4 SAD avg and 4D avx2 optimization for ext-partition
- User level time reduction <1% on i7-6700 cpu

Change-Id: I8f15bde07dddd938df0b065e20ae94109e7b3b5b
2016-11-28 22:42:08 +00:00
Urvang Joshi 6be4a54b89 Add a new intra prediction mode "smooth".
This is added as part of ALT_INTRA experiment.

This uses interpolation between top row and estimated bottom row; as
well as left column and estimated right column to generate the
predicted block. The interpolation is done using a predefined weight
array.

Based on experiments, the currently chosen weight array was created
to represent a quadratic curve, but can be tuned further if needed.

Improvement from baseline on Derf set:
ALL Keyframes: 1.279%

Improvement from existing ALT_INTRA:
ALL Keyframes: 1.146%

Change-Id: I12637fa1b91bd836f1c59b27d6caee2004acbdd4
2016-11-28 12:12:26 -08:00
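
The interpolation described in commit 6be4a54b89 above can be sketched as follows. This is a rough illustration and not the libaom implementation: the weight formula in `quadratic_weights`, the choice of the bottom-left pixel as the estimated bottom row, and the choice of the top-right pixel as the estimated right column are all assumptions.

```python
def quadratic_weights(n: int, scale: int = 256) -> list:
    """Hypothetical weights that fall off along a quadratic curve."""
    return [round(scale * (1 - i / n) ** 2) for i in range(n)]


def smooth_predict(top: list, left: list, n: int) -> list:
    """Predict an n x n block from the row above and the column to the left."""
    w = quadratic_weights(n)
    bottom_est = left[n - 1]  # assumption: bottom row estimated by bottom-left pixel
    right_est = top[n - 1]    # assumption: right column estimated by top-right pixel
    pred = []
    for r in range(n):
        row = []
        for c in range(n):
            # Vertical blend between the top row and the estimated bottom row.
            vert = w[r] * top[c] + (256 - w[r]) * bottom_est
            # Horizontal blend between the left column and the estimated right column.
            horz = w[c] * left[r] + (256 - w[c]) * right_est
            # Average the two blends, with rounding (each has a scale of 256).
            row.append((vert + horz + 256) >> 9)
        pred.append(row)
    return pred
```

A flat neighborhood yields a flat prediction, and pixels close to the top-left corner follow the real reference pixels most strongly.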
Steinar Midtskogen 29c61068d7 Add v256_intrinsics_v128.h to aom_dsp.mk
Change-Id: I3f0af5cf71d17f4d331d846d92728396399f187b
2016-11-23 10:06:33 +01:00
Debargha Mukherjee 84c56af017 Support 64x64 intra prediction
Change-Id: I2536b5b55f28c2ee59445c3b70d3e073e69945cd
2016-11-21 20:06:46 +00:00
Yi Luo 63bd6dc96b Fix rectangle transform computation overflow
- Add 16-bit saturation in fdct_round_shift().
- Add extreme value tests and round trip error tests.
- Fix inv 4x8 txfm calculation accuracy.
- Fix 4x8, 8x4, 8x16, 16x8, 16x32, 32x16 extreme value tests.
- BDRate: lowres: -0.034
          midres: -0.036
          hdres:  -0.013
BUG=webm:1340

Change-Id: I48365c1e50a03a7b1aa69b8856b732b483299fb5
2016-11-21 17:18:27 +00:00
Yaowu Xu 88cbc5827b Use an alternative fix to ubsan warning.
This commit reverts the previous fix of the ubsan warning for unsigned
int overflow, and uses a better fix by moving the offs-- inside the
while loop to avoid the "0-1" situation.

Change-Id: Id4a3e03859ebcdf264df0808412b30841028f87c
2016-11-17 21:11:23 +00:00
David Barker 4c12cc5fc5 Comment out code accidentally left in the bitstream_debug patch
In https://aomedia-review.googlesource.com/#/c/5864/ , some
code to stop the decoder at a preselected point was left enabled.
This code should only be uncommented when debugging, so comment
it out by default.

Change-Id: Ie168e8a1588ba92971e3ff1a056f597a7dfca136
2016-11-17 16:26:14 +00:00
David Barker fa2865b502 Implement bitstream debug for daala_ec
Change-Id: I809eb52e8a632189c49b8ea0a2b5de760cc2a34c
2016-11-16 17:17:24 +00:00
Yaowu Xu 4d34154b66 Fix IOC warnings
av1_txfm.h: left shift of a negative number
av1/encoder/quantize.c: unsigned int overflow
aom_dsp/entenc.c: unsigned int overflow

Change-Id: I6143e68f7d6e2621f97900808c8ef7ee0ad0c814
2016-11-16 17:08:02 +00:00
Thomas Davies faa7fcfe1c Add overwrite functions which do not zero bytes.
In several places bits are overwritten in the bitstream. These
functions avoid zeroing bytes during writing so that this can
happen correctly when the number of bits is not 8*N.

Re-addresses the attempted fix in
133c57c331

which broke threaded encoding tests, which relied on re-using
byte buffers.

Change-Id: I682c5e3a7869eac7ad475584db8bf170d47a56c9
2016-11-16 03:57:10 +00:00
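
The idea behind commit faa7fcfe1c above, overwriting a bit field without clearing neighboring bits, can be sketched as below. The function name `overwrite_literal` and the MSB-first bit order are assumptions for illustration; the actual libaom functions are not shown in this log.

```python
def overwrite_literal(buf: bytearray, bit_offset: int, value: int, bits: int) -> None:
    """Overwrite `bits` bits at `bit_offset`, leaving every other bit untouched."""
    for i in range(bits):
        bit = (value >> (bits - 1 - i)) & 1   # most significant bit first
        pos = bit_offset + i
        mask = 0x80 >> (pos % 8)              # assumed MSB-first order within a byte
        if bit:
            buf[pos // 8] |= mask
        else:
            buf[pos // 8] &= ~mask & 0xFF


# Overwrite 3 bits in the middle of a byte; the other 5 bits keep their value.
data = bytearray([0xFF, 0xFF])
overwrite_literal(data, 4, 0b000, 3)
```

Because only the addressed bits are set or cleared, the field may start at any offset and may have a length that is not a multiple of 8.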
Sarah Parker caaf9f9270 Revert "Support overwriting an arbitrary number of literal bits."
This reverts commit 133c57c331,
which appears to cause test failures in
AV1/AVxEncoderThreadTest.EncoderResultTest/*

Change-Id: I200b7a135ed65dc2c3f23b23b8c3dbf0872715fa
2016-11-12 01:02:11 +00:00
Yaowu Xu f42bba2522 Reinstate "fix msvc build warnings and errors"
This commit reinstates a portion of a reverted commit to fix warnings
and errors with the MSVC2013 build.

Change-Id: Ibb5fd665db6d8c897a657e5994547a1f82e3f188
2016-11-12 00:36:55 +00:00
Yaowu Xu fdb4216d6b Revert "fix msvc build warnings and errors"
This reverts commit 32dbdff1b3.

Change-Id: I94ef281223f7abceb156714e8192d5ea5fdc2581
2016-11-10 22:32:29 +00:00
Yaowu Xu 32dbdff1b3 fix msvc build warnings and errors
This commit fixes the msvc2013 build for configuration:
configure --target=x86_64-win64-vs12 --enable-experimental
 --enable-clpf --enable-dering --enable-motion-var --enable-ans

BUG=aomedia:80

Change-Id: I08b61e38e761ea4ed3175529fba4a50c57be44ac
2016-11-10 21:51:43 +00:00
Yi Luo 1f49624c7f SAD avx2 optimization for ext-partition
- User level improves 1.33% on i7-6700

Change-Id: I279fc7ec99f4c3500017ed079709227f96e9702e
2016-11-10 19:56:00 +00:00
Debargha Mukherjee 0e11912ae1 Support 64x64 quantizer functions
Also includes some refactoring and cleanups.

Change-Id: I2c2528c434a1e9e9b898251fa69489d884463929
2016-11-09 21:59:14 +00:00
Yaowu Xu febe9b06bb Fix msvc compiler warnings
BUG=aomedia:80

Change-Id: Ie4bccf053d2c24dcb64519650bcbcef4baffcdae
2016-11-09 19:38:21 +00:00
Thomas Davies 133c57c331 Support overwriting an arbitrary number of literal bits.
Overwriting was only guaranteed to work if a whole number
of bytes were being overwritten.

Change-Id: I5e72cb337ec6ff691e93288de9f751b583654a17
2016-11-09 11:11:57 +00:00
Alex Converse 1e4e29f776 Fix rans ec_multisymbol merge issues.
The rans experiment is dead. The ans experiment with the ec_multisymbol
experiment also turned on takes its place.

Change-Id: Ie9f30ec7cf73aae6b2ea580a7b1f208485a8a7a7
2016-11-09 01:25:29 +00:00
Angie Chiang d02001ddc2 Add txb_coeff_cost_map into TOKEN_STATS
This facilitates debugging in the var_tx experiment.

Change-Id: Ibd5ea7f6054c598b8e686abb4e8158ef28c67aab
2016-11-08 22:32:02 +00:00
Yushin Cho 77bba8d30a New experiment: Perceptual Vector Quantization from Daala
PVQ replaces the scalar quantizer and coefficient coding with a new
design originally developed in Daala. It currently depends on the
Daala entropy coder although it could be adapted to work with another
entropy coder if needed:
./configure --enable-experimental --enable-daala_ec --enable-pvq

The version of PVQ in this commit is adapted from the following
revision of Daala:
fb51c1ade6

More information about PVQ:
- https://people.xiph.org/~jm/daala/pvq_demo/
- https://jmvalin.ca/papers/spie_pvq.pdf

The following files are copied as-is from Daala with minimal
adaptations, therefore we disable clang-format on those files
to make it easier to synchronize the AV1 and Daala codebases in the future:
 av1/common/generic_code.c
 av1/common/generic_code.h
 av1/common/laplace_tables.c
 av1/common/partition.c
 av1/common/partition.h
 av1/common/pvq.c
 av1/common/pvq.h
 av1/common/state.c
 av1/common/state.h
 av1/common/zigzag.h
 av1/common/zigzag16.c
 av1/common/zigzag32.c
 av1/common/zigzag4.c
 av1/common/zigzag64.c
 av1/common/zigzag8.c
 av1/decoder/decint.h
 av1/decoder/generic_decoder.c
 av1/decoder/laplace_decoder.c
 av1/decoder/pvq_decoder.c
 av1/decoder/pvq_decoder.h
 av1/encoder/daala_compat_enc.c
 av1/encoder/encint.h
 av1/encoder/generic_encoder.c
 av1/encoder/laplace_encoder.c
 av1/encoder/pvq_encoder.c
 av1/encoder/pvq_encoder.h

Known issues:
- Lossless mode is not supported, '--lossless=1' will give the same result as
'--end-usage=q --cq-level=1'.
- High bit depth is not supported by PVQ.

Change-Id: I1ae0d6517b87f4c1ccea944b2e12dc906979f25e
2016-11-06 22:18:01 -08:00
Angie Chiang d402282f69 Add token cost comparison in write_modes_b()
This is just a partial implementation.
Compare token cost of pack_mb_tokens/pack_txb_tokens with token cost
from rate-distortion loop. If there is any difference, dump out mode
info.

Change-Id: I46b373ee2522c5047f799f36baf7cec5fbc06f06
2016-11-04 11:09:24 -07:00
Yi Luo 7f6bf9c70d Merge "Hybrid inverse transforms 16x16 AVX2 optimization" into nextgenv2
2016-11-02 01:43:02 +00:00
Yi Luo 7317200002 Hybrid inverse transforms 16x16 AVX2 optimization
- Add unit tests to verify the bit-exact result.
- User level time reduction (EXT_TX):
    encoder: 3.63%
    decoder: 2.36%
- Also add tx_type=V_DCT...H_FLIPADST SSE2 for 16x16 inv txfm.

Change-Id: Idc6d9e8254aa536e5f18a87fa0d37c6bd551c083
2016-11-01 13:38:20 -07:00
Yaowu Xu 8af861bbf1 Fix merge issues related to --enable-ec-adapt
1. Avoid compiler warnings.
2. Enable prob_diff_update() required by update_txfm_probs().

Change-Id: I9081b645c55a8432bdaeb600e9ba901c0d0d96f5
2016-11-01 12:36:04 -07:00
Nathan E. Egge baaaa16186 Centralize EC_MULTISYMBOL error checking.
The EC_ADAPT experiment cannot work unless EC_MULTISYMBOL is also
 enabled.
This patch replaces all individual checks with a centralized check in
 both the bitreader.h and bitwriter.h.

Change-Id: I418852d95c5012cc074ed65cd24997e08bc2aadd
2016-10-29 22:26:27 -07:00
Alex Converse 58c520afe9 Only build aom_read/write_symbol if CONFIG_EC_MULTISYMBOL
Change-Id: If86c7220ac9199a59e605dc43d42cc3db26cf8bd
2016-10-29 17:05:40 -07:00
Thomas Davies f6c04acaa3 EC_ADAPT: improved symbol adaptation.
Place a floor under symbol probabilities and
modify adaptation rate.

Change-Id: Ic9cf6d9fadfc3bf1f3027bc3d2bb198526441591
2016-10-29 17:05:40 -07:00
Alex Converse aca9feba82 Add ec_multisymbol for common daala_ec and rans code
The new ec_multisymbol experiment supersedes the rans experiment and is
used for multisymbol features that can be backed by either daala_ec or
rans.

This experiment is automatically enabled by ec_adapt and will try to
enable daala_ec or ans (in that order).

Change-Id: Ie75b4002b7a9d7f5f7b4d130c1aacb3dbe97e54f
2016-10-29 17:05:40 -07:00
Thomas 9ac5508f32 Add EC_ADAPT experiment for symbol-adaptive entropy coding.
This experiment performs symbol-by-symbol statistics
adaptation for non-binary symbols. It requires DAALA_EC or
RANS and ANS to be enabled. The adaptation is currently
based on a simple recursive filter and is taken from
Daala. It has an adaptation rate dependent on alphabet size,
taken from Daala. It applies wherever non-binary symbols
are encoded using Cumulative Probability Functions rather
than trees.

Where symbols are adapted, forward updates in the compressed
header are removed.

In the case of RANS, coefficient token values are adapted,
with the exception of the zero token which remains a
binary symbol. In the case of DAALA_EC other values
such as inter and intra modes are adapted as CDFs are
provided in those cases.

The experiment is configured with:

./configure --enable-experimental --enable-daala-ec --enable-ec-adapt

or

./configure --enable-experimental --enable-ans --enable-rans \
    --enable-ec-adapt

EC_ADAPT is not currently compatible with tiles.

BDR results on Objective-1-fast give a small loss:

PSNR YCbCr:      0.51%      0.49%      0.48%
PSNRHVS:      0.50%
SSIM:      0.50%
MSSSIM:      0.51%
CIEDE2000:      0.50%

Change-Id: I3888718e42616f3fd87144de7f125228446ac984
2016-10-29 16:57:48 -07:00
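
The "simple recursive filter" adaptation mentioned in commit 9ac5508f32 above can be sketched in a generic form. The exact Daala update rule and its alphabet-size-dependent rate are not given in this log, so `adapt_cdf` and its `rate` parameter are hypothetical; the sketch only shows the general shape of a per-symbol CDF update.

```python
CDF_TOTAL = 1 << 15  # 15-bit probability precision


def adapt_cdf(cdf: list, symbol: int, rate: int) -> None:
    """Move a cumulative distribution toward the symbol that was just coded.

    cdf[i] holds P(X <= i) scaled by CDF_TOTAL; the last entry is CDF_TOTAL.
    `rate` is a shift: larger values adapt more slowly (hypothetical parameter).
    """
    for i in range(len(cdf) - 1):        # the final entry stays at CDF_TOTAL
        if i >= symbol:
            cdf[i] += (CDF_TOTAL - cdf[i]) >> rate   # pull toward 1.0
        else:
            cdf[i] -= cdf[i] >> rate                 # pull toward 0.0


# Uniform 4-symbol distribution; symbol 1 is observed once.
cdf = [8192, 16384, 24576, 32768]
adapt_cdf(cdf, 1, 4)
```

After the update the probability mass of the observed symbol grows while the CDF stays monotonic. Since the encoder and decoder both apply the same update, no forward update needs to be signaled.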
Debargha Mukherjee 3ff8cb764b Merge "Fix aom_fdct8x8_ssse3 in high bit depth mode" into nextgenv2
2016-10-28 19:31:45 +00:00
David Barker 0602edfbc5 Fix aom_fdct8x8_ssse3 in high bit depth mode
Change-Id: I63e492163ef10e12a842837368c209b8ffc4eee0
2016-10-28 10:13:43 +01:00
Alex Converse 3fc98e86d1 rans: Use symbol coding for motion vectors
Change-Id: If497b53c3b36e32fb98c99dba2d4a490e226572a
2016-10-27 12:38:43 -07:00
Luca Barbato f0f98578df Namespace the idct/iad symbols
Make linking to libvpx and libaom at the same time possible.

Change-Id: I7bab8527a32e446e3d564e6fa5d94ccd056bc63f
2016-10-27 12:36:37 -07:00
Debargha Mukherjee a5e3bc0fbc Merge "Fix compile error with --enable-ans + --enable-accounting" into nextgenv2
2016-10-27 19:03:22 +00:00
Yi Luo 400dcc8088 Merge "Fix aom_fdct32x32_avx2 output as CONFIG_AOM_HIGHBITDEPTH=1" into nextgenv2
2016-10-26 22:42:17 +00:00
Yi Luo 133c13d637 Fix incorrect merge of forward txfm function declarations
- Restore the fwd txfm HBD function declarations exposure.

Change-Id: I1e33df6297fd37e242f4b73c8ab97063b9feb7c6
2016-10-26 10:30:53 -07:00
Yi Luo 0c552dfd82 Fix aom_fdct32x32_avx2 output as CONFIG_AOM_HIGHBITDEPTH=1
- Change FDCT32x32_2D_AVX2 output parameter to tran_low_t.
- Add unit tests for CONFIG_AOM_HIGHBITDEPTH=1.
- Update TODO notes.
BUG=webm:1323

Change-Id: If4766c919a24231fce886de74658b6dd7a011246
2016-10-25 14:33:21 -07:00
Yaowu Xu b695b1c118 dkboolwriter.c: change copyright notice
Change-Id: I1d9349a07ffd85991fc5673354d3ceff3404b358
2016-10-25 10:32:33 -07:00
David Barker 01b16baa5a Fix compile error with --enable-ans + --enable-accounting
Change-Id: I43deba9c80b324c12852750d08c62dc2dd783835
2016-10-25 16:22:24 +01:00
Alex Converse 8db9faefe8 Remove some magic numbers in aom_rans_merge_prob8_pdf.
Change-Id: I0cefae17642d7adf1b9bd637ecb81b437629aa0c
2016-10-24 09:05:03 -07:00
Yi Luo 62b6cc0bc9 Merge "Fix avx2 16x16/32x32 fwd txfm coeff output on HBD" into nextgenv2
2016-10-22 01:46:09 +00:00
Yi Luo 1a0f27aaa6 Fix avx2 16x16/32x32 fwd txfm coeff output on HBD
Change-Id: Ida036defe5688894a63007a31aa2dd0b3f0b5d59
2016-10-21 14:14:00 -07:00
Jingning Han dc90bf0737 Merge "Fix unused variable error in intrapred.c" into nextgenv2
2016-10-21 21:11:31 +00:00
Nathan E. Egge 5357dcaf71 Decoder performance improvement with daala_ec.
Cherry-pick Daala b5020bee:
 Remove redundant test in od_ec_decode_bool_q15().
Using a test that decodes 100M random binary symbols, making this change
 produced a speed up of 8.81% with gcc-4.9.3 and 3.71% with clang-3.7.1,
 both compiled with -O2.

Change-Id: If6d0077a56121a575ae53bcd4d1d9b7d800a317d
2016-10-21 12:38:30 -07:00
Yaowu Xu 91219941b1 Merge "Use divide by multiply in the ans writer." into nextgenv2
2016-10-21 18:46:29 +00:00
Angie Chiang 646e52a85a Fix unused variable error in intrapred.c
Change-Id: Icda975cd9b264c1752c3057bce8031791f91c08a
2016-10-21 11:45:31 -07:00
Yaowu Xu 2f5b9d66b5 Merge "Add support for v256 intrinsics" into nextgenv2
2016-10-21 18:00:20 +00:00
Yaowu Xu c76572af16 Merge changes Icfc16070,Ied47a248,I8af087d9,I322a1366,If04580af into nextgenv2
* changes:
  Palette: Use inverse_color_order to find color index faster.
  Rewrite some loops to avoid -Wunsafe-loop-optimizations warnings.
  Remove some useless casts
  Add compiler warning flag -Wextra and fix related warnings.
  Declare some array sizes to be constants (known at compile time).
2016-10-21 17:31:42 +00:00
Alex Converse 64e2f105a7 Use divide by multiply in the ans writer.
Change-Id: Ide4e9b3a605571ec41c265347217e103df8d0821
2016-10-21 09:54:41 -07:00
Steinar Midtskogen 045d413ca2 Add support for v256 intrinsics
Change-Id: I1da08afaa945ca1aaf4bf9f50cf649a7feef2e60
2016-10-21 08:55:37 -07:00
Urvang Joshi d71a231c49 Add compiler warning flag -Wextra and fix related warnings.
Note: some of these warnings are enabled by a combination of -Wunused
(added earlier) and -Wextra.

Cherry-picked from aomedia/master: 4790a69

Change-Id: I322a1366bd4fd6c0dec9e758c2d5e88e003b1cbf
2016-10-20 15:49:16 -07:00
Yaowu Xu ec5a1942e2 Merge changes I7d6394e4,Ia8ce1464,If20e8637,Ia9adc46b,I651db25b into nextgenv2
* changes:
  Define SIMD_INLINE using AOM_FORCE_INLINE
  AOM_FORCE_INLINE: fix always_inline attribute
  Free memory allocated by daala_ec encoder.
  Move clpf_sse4_1.c to clpf_sse4.c in agreement with convention
  sync avg_test.cc with aom/master
2016-10-20 22:30:11 +00:00
Jingning Han 7ae6ae3497 Merge "Add 2x2 directional intra predictors" into nextgenv2
2016-10-20 22:15:46 +00:00
Yaowu Xu cfc5ac5034 Merge "Partition the ans experiment into 'ans' and 'rans'" into nextgenv2
2016-10-19 22:58:05 +00:00
Steinar Midtskogen c38afedb8d Define SIMD_INLINE using AOM_FORCE_INLINE
Change-Id: I7d6394e48e9b6093e5b523387ed250f371ee7fb9
2016-10-19 15:14:27 -07:00
Nathan E. Egge e734fcb114 Free memory allocated by daala_ec encoder.
Free the two memory buffers allocated by the daala_ec encoder when
 calling od_ec_enc_clear() from aom_daala_stop_encode().

Change-Id: If20e86374ea29e51ee59111012905e56039dd4cc
2016-10-19 15:14:27 -07:00
Jingning Han 03b3514058 Add 2x2 directional intra predictors
Change-Id: Iaa25269a15231dadeaba0f4836c864fc10e858df
2016-10-19 21:58:09 +00:00
Alex Converse ec6fb649da Partition the ans experiment into 'ans' and 'rans'
The (new) ans experiment replaces the bool coder with uABS bools. The
'rans' experiment adds multisymbol coding.

This matches the setup in aom/master.

Change-Id: Ida8372ccabf1e1e9afc45fe66362cda35a491222
2016-10-19 12:03:15 -07:00
Yaowu Xu e94767ae97 Merge "Change return type of tell and tell_frac to uint32_t." into nextgenv2
2016-10-19 18:59:08 +00:00
Urvang Joshi 66b1fcc924 Merge changes I3922dea2,I3bab2848,I21f7478a,Ida5de713,Ib9f0eefe, ... into nextgenv2
* changes:
  Fix warnings reported by -Wshadow: Part4: main directory
  Fix warnings reported by -Wshadow: Part3: test/ directory
  Fix warnings reported by -Wshadow: Part2b: more from av1 directory
  Fix warnings reported by -Wshadow: Part2: av1 directory
  Fix warnings reported by -Wshadow: Part1b: scan_order struct and variable
  Fix warnings reported by -Wshadow: Part1: aom_dsp directory
  Move STAT_TYPE enum to source file.
  Code cleanup: mainly rd_pick_partition and methods called from there.
2016-10-19 18:25:52 +00:00
Nathan E. Egge b244f39627 Change return type of tell and tell_frac to uint32_t.
The bit accounting functions aom_reader_tell() and aom_reader_tell_frac()
 return the number of bits and 1/8th bits respectively.
This patch changes the return type from ptrdiff_t which is signed to
 uint32_t which is unsigned.
The size_t type is not used since we only care about the number of bits
 or 1/8 bits per entropy coder context and we don't expect to code more
 than 512 megabits per tile.

Change-Id: I84a119d1f52829dcbdb66a92656eacca06e42b11
2016-10-19 10:53:52 -07:00
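
The 512 megabit bound quoted in commit b244f39627 above follows from the width of `uint32_t` when the counter is kept in 1/8-bit units. A short worked check (the function names are illustrative only):

```python
UINT32_RANGE = 1 << 32   # number of distinct values a uint32_t can hold
EIGHTHS_PER_BIT = 8      # tell_frac counts in units of 1/8 bit


def max_whole_bits() -> int:
    """Whole bits that fit before a 1/8-bit counter wraps around."""
    return UINT32_RANGE // EIGHTHS_PER_BIT


def max_megabits() -> int:
    """The same limit expressed in units of 2**20 bits."""
    return max_whole_bits() // (1 << 20)
```

So a 32-bit counter of 1/8 bits covers 2**29 bits, which is 512 megabits when a megabit is taken as 2**20 bits.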
Michael Bebenita 6048d05225 Bit accounting.
This patch adds bit accounting infrastructure to the bit reader API.
When configured with --enable-accounting, every bit reader API
function records the number of bits necessary to decode a symbol.
Accounting symbol entries are collected in a global accounting data
structure that can be used to understand exactly where bits are
spent (http://aomanalyzer.org). The data structure is cleared and
reused each frame to reduce memory usage. When configured without
--enable-accounting, bit accounting does not incur any runtime
overhead.

All aom_read_xxx functions now have an additional string parameter
that specifies the symbol name. By default, the ACCT_STR macro is
used (which expands to __func__). For more precise accounting,
these should be replaced with more descriptive names.

Change-Id: Ia2e1343cb842c9391b12b77272587dfbe307a56d
2016-10-19 04:34:29 +00:00
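
The accounting scheme described in commit 6048d05225 above can be sketched with a toy reader. Everything here is hypothetical (`Accounting`, `ToyReader`, `read_literal`); it only illustrates passing a symbol name into each read call and tallying the bits consumed under that name.

```python
from collections import defaultdict


class Accounting:
    """Per-frame tally of bits consumed, keyed by symbol name."""

    def __init__(self):
        self.bits_by_symbol = defaultdict(int)

    def record(self, name: str, bits: int) -> None:
        self.bits_by_symbol[name] += bits

    def reset(self) -> None:
        """Clear the tally so the structure can be reused for the next frame."""
        self.bits_by_symbol.clear()


class ToyReader:
    """Reads raw bits from a list; accounting is optional."""

    def __init__(self, bits, accounting=None):
        self.bits = list(bits)
        self.pos = 0
        self.accounting = accounting

    def read_literal(self, n: int, name: str) -> int:
        start = self.pos
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        if self.accounting is not None:   # no bookkeeping when accounting is off
            self.accounting.record(name, self.pos - start)
        return value


acct = Accounting()
reader = ToyReader([1, 0, 1, 1, 0, 0, 1, 0], acct)
mode = reader.read_literal(3, "mode")
mv = reader.read_literal(2, "mv")
mode2 = reader.read_literal(3, "mode")
```

Using descriptive names instead of a default such as the calling function's name gives a finer breakdown of where the bits go.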
Urvang Joshi fdb60962f4 Fix warnings reported by -Wshadow: Part1: aom_dsp directory
While we are at it:
- Rename some variables to more meaningful names
- Reuse some common consts from a header instead of redefining them.

Cherry-picked from aomedia/master: 09eea2193

Change-Id: I61030e773137ae107d3bd43556c0d5bb26f9dbf8
2016-10-18 17:22:12 -07:00
Michael Bebenita d7baf45ff6 Adds ability to measure with a higher precision the number of bits
read per symbol.

Change-Id: I218abaa5172b769b66dba45050381c0212602668
2016-10-18 16:57:56 -07:00
Michael Bebenita 63b44c4c50 Remove stale OD_ACCOUNTING code.
Change-Id: Ie90dd06c387119ccd9c920a328c942477df00bb7
2016-10-18 09:12:06 -07:00
Michael Bebenita 868fc0b04a Port aom_reader_tell() support
This commit ports the following from aom/master:
4c46278 Add aom_reader_tell() support.
b9c9935 Remove an erroneous declaration.
56c9c3b Fix ANS build.

Change-Id: I59bd910f58c218c649a1de2a7b5fae0397e13cb1
2016-10-18 08:50:05 -07:00
Nathan E. Egge 9ac1f7d770 Create aom_cdf_prob type for 16-bit probabilities.
Change-Id: I33899eca44300037816c9f20c965aa8311a1ef52
2016-10-17 20:22:48 -07:00
Nathan E. Egge 45741e9351 Rename daala_read_tree_cdf() to daala_read_symbol().
Change-Id: I35f85bad88c637cea62577c546cdd5ced0e21bd6
2016-10-17 20:22:19 -07:00
Nathan E. Egge 19698a7084 Fix warning when discarding const qualifier.
Cherry-pick Daala 211c2a41: Clean up EC tell() and tell_frac() functions.
Add a const qualifier to the od_ec_enc and od_ec_dec parameters of
 the od_ec_enc_tell(), od_ec_enc_tell_frac(), od_ec_dec_tell(), and
 od_ec_dec_tell_frac() functions.
Add an OD_WARN_UNUSED_RESULT to od_ec_enc_tell_frac().

Change-Id: Ia50e2fd75e98d8a03d993449d658b695cf56e6fb
2016-10-17 12:16:27 -07:00
Nathan E. Egge f3035f2bc7 Revert code formatting of OD_UNIFORM_CDFS_Q15.
The formatting of OD_UNIFORM_CDFS_Q15[] in entcode.c is helpful for
 understanding what is contained in the array (e.g., the uniform
 probability distributions of small sizes 2 through 16).
This patch reverts the change made in f4b2926d and adds linter hints to
 ignore the formatting.

Change-Id: I2ad9fe6673b86e6067cb97b40f0f0e69a119cdf5
2016-10-17 12:16:26 -07:00
Nathan E. Egge 56eeaa5daf Rename aom_write_tree_cdf() to aom_write_symbol().
Change-Id: I7c088c55f1c461063976d5bd84ff2026c4f3bc69
2016-10-17 11:54:51 -07:00
Yaowu Xu 2bdb9e6344 Merge changes Ie43c599f,Icd0dbed4,Ic04e180b into nextgenv2
* changes:
  Move av1_indices_from_tree() to common code space.
  Add code to compute in-order mappings for tokens.
  Fix bug in av1_tree_to_cdf_2D() macro.
2016-10-14 23:46:48 +00:00
Yaowu Xu 73d702db7f Merge changes I339d0389,I2fa1e87a,If79fa5ae,Icb1a8cb8,Ic76de4a4, ... into nextgenv2
* changes:
  Add missing CONFIG_DAALA_EC declaration.
  Add API for writing trees using a CDF.
  Add macro to build a simple cdf table.
  Use Daala entropy coder to code trees.
  Silence clang-format code review warning.
  Use Daala entropy coder to code bits.
  Clear existing format issue in the codebase
  Add Daala entropy coder.
2016-10-14 23:42:22 +00:00
Yi Luo 1dec26e004 Merge "Zero high 128b YMM registers to avoid SSE-AVX transition penalties" into nextgenv2
2016-10-14 23:13:10 +00:00
Nathan E. Egge 8abf8673e6 Move av1_indices_from_tree() to common code space.
Move the av1_indices_from_tree() function from av1/encoder/treewriter.c
 to aom_dsp/prob.c so that it can be used by both the encoder and
 the decoder.

Change-Id: Ie43c599f425c3503b1ff93f0c77b5033a05b1bb4
2016-10-14 14:59:27 -07:00
Nathan E. Egge a67c0ff4d7 Add missing CONFIG_DAALA_EC declaration.
Without first including ./aom_config.h in aom_dsp/prob.c the memmove
 function is implicitly defined and causes a compiler warning.

Change-Id: I339d0389f10324a1085aba7d6492b2159a14da92
2016-10-14 14:59:27 -07:00
Nathan E. Egge 44460148b2 Add API for writing trees using a CDF.
Added aom_write_tree_cdf() and aom_read_tree_cdf() function calls to
 bitwriter.h and bitreader.h respectively.
These calls take a multisymbol CDF and an index and directly encode the
 symbol using the enabled entropy coder.
Currently only the daala entropy encoder supports this (enabled with
 --enable-daala_ec) and a compile error is thrown otherwise.

Change-Id: I2fa1e87af4352c94384e0cfdbfd170ac99cf3705
2016-10-14 14:59:27 -07:00
Nathan E. Egge 439c50251f Fix bug in av1_tree_to_cdf_2D() macro.
Change-Id: Ic04e180b09745fab2230d05985770c41deea4fad
2016-10-14 14:59:27 -07:00
Nathan E. Egge e2ed411836 Add macro to build a simple cdf table.
Add the av1_tree_to_cdf() macro which takes an aom_tree_index tree and
 associated aom_prob probabilities and constructs a daala uint16_t cdf.
The av1_tree_to_cdf_1D() and av1_tree_to_cdf_2D() apply av1_tree_to_cdf()
 across 1D and 2D arrays respectively.

Change-Id: If79fa5ae034263f279d7d0842493570885272fb2
2016-10-14 14:59:27 -07:00
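
The conversion described in commit e2ed411836 above, from a binary tree with per-node probabilities to a cumulative distribution, can be sketched as below. This is not the macro itself. It assumes the usual tree layout where each node is a pair of entries, a non-positive entry `-k` marks leaf `k`, a positive entry is the index of a child pair, and `probs[node >> 1]` is the 8-bit probability of taking the first branch. The function name `tree_to_cdf` is illustrative.

```python
def tree_to_cdf(tree: list, probs: list, total: int = 1 << 15) -> list:
    """Flatten a binary token tree into a 15-bit CDF ordered by symbol index."""
    leaf_weight = {}

    def walk(node: int, weight: int) -> None:
        p = probs[node >> 1]                 # probability of branch 0, out of 256
        zero_weight = (weight * p) >> 8
        branches = ((tree[node], zero_weight),
                    (tree[node + 1], weight - zero_weight))
        for child, w in branches:
            if child <= 0:                   # leaf: entry is the negated symbol
                leaf_weight[-child] = w
            else:                            # internal node: recurse into the pair
                walk(child, w)

    walk(0, total)
    cdf, acc = [], 0
    for symbol in sorted(leaf_weight):
        acc += leaf_weight[symbol]
        cdf.append(acc)
    return cdf


# Three symbols: 0 is split off at the root, then 1 versus 2.
example_tree = [0, 2, -1, -2]   # the leading 0 stands for leaf "-0"
example_cdf = tree_to_cdf(example_tree, [128, 64])
```

Because each node splits its incoming weight exactly, the last CDF entry always equals the total, which is what a multisymbol entropy coder expects.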
Nathan E. Egge 43acafdee2 Use Daala entropy coder to code trees.
When building with --enable-daala_ec, calls to aom_write_tree() and
 aom_read_tree() will convert an aom_tree_index structure with associated
 aom_prob probabilities into a CDF on the fly for use with the
 od_ec_encode_cdf_q15().
The number of symbols in the CDF is capped at 16, and trees that contain
 more than 16 leaf nodes are handled by splitting the most likely, e.g.,
 highest probability symbols, first and coding multiple symbols if
 necessary.

ntt-short-1:

         MEDIUM (%) HIGH (%)
    PSNR 0.000227   0.000213
 PSNRHVS 0.000215   0.000205
    SSIM 0.000229   0.000209
FASTSSIM 0.000229   0.000214

subset1:

          RATE (%)  DSNR (dB)
    PSNR -0.00026   0.00002
 PSNRHVS -0.00026   0.00002
    SSIM -0.00026   0.00001
FASTSSIM -0.00026   0.00001

Change-Id: Icb1a8cb854fd81fdd88fbe4bc6761c7eb4757dfe
2016-10-14 14:59:27 -07:00
Nathan E. Egge 0435f0eae6 Silence clang-format code review warning.
Change-Id: Ic76de4a4c0c39924bf04c3c2fa9214d33bcee9fb
2016-10-14 14:59:27 -07:00
Nathan E. Egge 8043cc4018 Use Daala entropy coder to code bits.
When building with --enable-daala_ec, calls to aom_write() and aom_read()
 use the daala entropy coder to write and read bits.
When the probability is exactly 0.5 (128), then raw bits are used.

ntt-short-1:

          MEDIUM (%) HIGH (%)
    PSNR -0.027556  -0.020114
 PSNRHVS -0.027401  -0.020169
    SSIM -0.027587  -0.020151
FASTSSIM -0.027592  -0.020102

subset1:

         RATE (%)  DSNR (dB)
    PSNR 0.03296  -0.00210
 PSNRHVS 0.03537  -0.00281
    SSIM 0.03299  -0.00161
FASTSSIM 0.03458  -0.00111

Change-Id: I48ad8eb40fc895d62d6e241ea8abc02820d573f7
2016-10-14 14:59:27 -07:00
Yaowu Xu 931bc2a714 Clear existing format issue in the codebase
Fix the clang-format warnings in the existing code.

Change-Id: I8e9e781b6f68f41a7fbd0a2116f6b35290d73dc8
2016-10-14 14:59:27 -07:00
Nathan E. Egge 1078dee569 Add Daala entropy coder.
Change-Id: I2849a50163268d58cc5d80aacfec1fd02299ca43
2016-10-14 14:59:27 -07:00
Alex Converse 62a94a649d Switch rANS to 15 bit precision, and adjust L_BASE.
This causes rANS to operate at the same precision as the Daala EC.

aom/master stats: rans10uabs8lbase12 → rans15uabs8lbase15

objective-1-fast
PSNR YCbCr:      0.01%      0.01%      0.01%
   PSNRHVS:      0.01%
      SSIM:      0.01%
    MSSSIM:      0.01%
 CIEDE2000:      0.01%

subset1
PSNR YCbCr:     -0.01%     -0.00%     -0.00%
   PSNRHVS:     -0.01%
      SSIM:     -0.01%
    MSSSIM:     -0.01%
 CIEDE2000:     -0.01%

(cherry picked from aom/master commit ddbc2e2a68)

Change-Id: I6ef0a4f6198784b3712a61af9f105d560a22eaea
2016-10-14 14:05:50 -07:00
Yi Luo e9fde265f7 Zero high 128b YMM registers to avoid SSE-AVX transition penalties
Documents:
- https://software.intel.com/en-us/articles/intel-avx-state-transitions-migrating-sse-code-to-avx
- https://software.intel.com/sites/default/files/m/d/4/1/d/8/11MC12_Avoiding_2BAVX-SSE_2BTransition_2BPenalties_2Brh_2Bfinal.pdf

Change-Id: I90f85fcb15a7a2c49ee068300be6ffe9c68d371c
2016-10-14 12:22:35 -07:00
James Zern fbabcad67c Merge changes I4850b36e,Ic4d7128a into nextgenv2
* changes:
  variance_avx2: sync variance functions with c-code
  Resolve -Wshorten-64-to-32 in variance.
2016-10-14 19:10:20 +00:00
Yi Luo b9fbf38bff Merge "Delete some redundant function declarations in aom_dsp_rtcd_defs.pl" into nextgenv2 2016-10-14 17:50:37 +00:00
James Zern 8c64331aa2 variance_avx2: sync variance functions with c-code
add missing int64 -> uint32 cast; quiets -Wshorten-64-to-32 warnings

Change-Id: I4850b36e18dc8b399108342be4bfe0b684aefb78
(cherry picked from commit 6acd061aad8cf62000cc9117390d0c94581a8591)
2016-10-13 20:15:18 -07:00
Alex Converse 2176b7acc2 Resolve -Wshorten-64-to-32 in variance.
The subtrahend is small enough to fit into uint32_t.

Change-Id: Ic4d7128aaa665eaf6b25d562610ba8942c46137f
(cherry picked from commit c0241664aac3a1805db9bd8e09e071ac326531e0)
2016-10-13 20:12:20 -07:00
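
The cast discussed in the two -Wshorten-64-to-32 commits above can be illustrated with a plain C variance routine. This is a sketch under assumptions, not the actual libaom function: the name block_variance_sketch and its parameter list are hypothetical. The product sum * sum is formed in 64 bits, and the quotient (the subtrahend) is explicitly narrowed to uint32_t, which is safe for 8-bit input on blocks up to 64x64 because the quotient never exceeds the sum of squared errors.

```c
#include <assert.h>
#include <stdint.h>

/* Variance of the difference between two 8-bit blocks of size w x h.
 * Writes the sum of squared errors to *sse and returns
 * sse - sum^2 / (w * h). */
uint32_t block_variance_sketch(const uint8_t *src, int src_stride,
                               const uint8_t *ref, int ref_stride, int w,
                               int h, uint32_t *sse) {
  int sum = 0;
  uint32_t sq = 0;
  assert(w > 0 && h > 0 && w <= 64 && h <= 64);
  for (int r = 0; r < h; r++) {
    for (int c = 0; c < w; c++) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += d;
      sq += (uint32_t)(d * d);
    }
  }
  *sse = sq;
  /* 64-bit product, then an explicit narrowing cast on the quotient so
   * that -Wshorten-64-to-32 has nothing to warn about. */
  return sq - (uint32_t)(((int64_t)sum * sum) / (w * h));
}
```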
Debargha Mukherjee a720f4b3b5 Merge "Add sse2 forward and inverse 16x32 and 32x16 transforms" into nextgenv2 2016-10-14 02:49:20 +00:00
Yue Chen a48764d05f Merge "Renamings for OBMC experiment" into nextgenv2 2016-10-14 01:33:00 +00:00
Yi Luo 761ae880d7 Delete some redundant function declarations in aom_dsp_rtcd_defs.pl
Change-Id: I4df57a7faba5800c048b2dc469ec31545406f55c
2016-10-13 17:53:45 -07:00
Yue Chen cb60b185c7 Renamings for OBMC experiment
To get ready for pulling AV1 into nextgenv2, replace the experimental
flag with MOTION_VAR and rename the major variables.

Change-Id: If6cf4f37b9319c46d8f90df551cc7295d66ca205
2016-10-13 15:51:22 -07:00
Steinar Midtskogen 2d5f752ae9 Don't use _mm_cvtsi128_si64 on 32 bit systems
Change-Id: I332afb8d9e35cd60f05915160a5b2e1dc8757de5
2016-10-13 14:35:00 -07:00
Yaowu Xu 410fee8de6 Fix formatting in a few files
Change-Id: Ia5175afe82b142d9e18c01c546610202c630588e
2016-10-13 13:04:29 -07:00
Yaowu Xu 8ac419f307 Merge changes Ic3a68557,Ib1dbe41a,I0da09270,Ibdbd720d into nextgenv2
* changes:
  Deringing cleanup: remove DERING_REFINEMENT (always on now)
  Don't run the deringing filter on skipped blocks within a superblock
  Don't dering skipped superblocks
  On x86 use _mm_set_epi32 when _mm_cvtsi64_si128 isn't available
2016-10-13 15:54:32 +00:00
Yaowu Xu 89d3f2fd10 Merge "Sync 2x2 intra predictors" into nextgenv2 2016-10-13 15:20:52 +00:00
Alex Converse fc4980edb7 Merge changes Ic74d9d88,Ie93b474e,I544989ea,Ic273f7d9,Idfd2d2b3, ... into nextgenv2
* changes:
  Remove custom rans types
  Remove add_token_no_extra.
  Remove unused aom_rans_build_cdf_from_pdf
  Add the tool used to generate the constrained tokenset.
  Remove the starting zero from ANS CDFs.
  Import the aom_read/write_symbol abstractions from aom/master
2016-10-13 14:03:15 +00:00
David Barker 33231d4801 Add sse2 forward and inverse 16x32 and 32x16 transforms
Change-Id: I1241257430f1e08ead1ce0f31db8272b50783102
2016-10-13 14:01:22 +01:00
Alex Converse 9ed1a2ff44 Remove custom rans types
(cherry picked from aom/master commit 11206c60d9)

Includes renames in a bunch of places not handled by the original
due to differing tree states.

Change-Id: Ic74d9d8850b8c80a51e55e425bbf472a67e2653f
2016-10-13 05:53:58 +00:00
Jingning Han e3954d8312 Sync 2x2 intra predictors
Add 2x2 DC, V, H, TM intra predictors.

Change-Id: I2a614adde553f821c45bc5a9bf09800a9f0aaa26
2016-10-12 21:04:01 -07:00
Alex Converse d5b9c730ad Remove unused aom_rans_build_cdf_from_pdf
Change-Id: I544989eae45b7dda04250365c3de99f50110a76b
(cherry picked from aom/master commit 06cce842ca)
2016-10-12 17:44:14 -07:00
Alex Converse e9f70f8f10 Remove the starting zero from ANS CDFs.
This brings it in line with the Daala CDFs and will make it easier to
share code.

Change-Id: Idfd2d2b33c3b9b2c4e72ce72fb3d8039013448b9
(cherry picked from aom/master commit af98507ca9)
2016-10-12 17:41:01 -07:00
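
As an illustration of the layout change in "Remove the starting zero from ANS CDFs", the sketch below works with a CDF that has no leading zero: cdf[i] holds the cumulative count up to and including symbol i, and the last entry equals the total. The helper names and the 15-bit total are assumptions made for the example and are not taken from the codebase.

```c
#include <assert.h>
#include <stdint.h>

#define CDF_TOTAL_SKETCH 32768u /* assumed 15-bit total */

/* Frequency of `sym`: the lower bound of symbol 0 is an implicit zero. */
uint32_t cdf_symbol_freq_sketch(const uint16_t *cdf, int nsyms, int sym) {
  assert(sym >= 0 && sym < nsyms);
  assert(cdf[nsyms - 1] == CDF_TOTAL_SKETCH);
  const uint32_t lo = sym > 0 ? cdf[sym - 1] : 0;
  return (uint32_t)cdf[sym] - lo;
}

/* First symbol whose cumulative count exceeds `value` (decoder lookup). */
int cdf_find_symbol_sketch(const uint16_t *cdf, int nsyms, uint32_t value) {
  int sym = 0;
  while (sym < nsyms - 1 && value >= cdf[sym]) sym++;
  return sym;
}
```

With the old layout, every lookup would read cdf[sym] and cdf[sym + 1] from an array one entry longer. Dropping the leading zero keeps the index equal to the symbol, which is presumably what makes sharing code with the Daala tables easier.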
Alex Converse a1ac972867 Import the aom_read/write_symbol abstractions from aom/master
Change-Id: I0b255c05108c3b97e74df1b59c34111c9e9a5770
2016-10-12 17:41:01 -07:00
Steinar Midtskogen b074823863 On x86 use _mm_set_epi32 when _mm_cvtsi64_si128 isn't available
Change-Id: Ibdbd720d4f68892da6164a9849e212e759305005
2016-10-12 15:48:13 -07:00
Yi Luo fed8e1c06d Hybrid forward transform 32x32 AVX2 optimization
- av1_fht32x32 AVX2 function level time reduction ~89% compared to C.

- av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2()
  But function replacement must go with the corresponding inverse txfm.

- No obvious user level time reduction due to 32x32 TX_TYPE selection.

- Zero high 128b YMM to avoid AVX-SSE transition penalties
  (fix 16x16 case).

- Added 32x32 AVX2 unit tests to verify bit-exactness.

- AVX2 optimization summary:
  On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results:
  C to AVX2: function level time reduction, ~86-89%.
  SSE2 to AVX2: function level time reduction, ~51%.

Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036
2016-10-12 14:19:53 -07:00
Yaowu Xu f36d0b46d1 minor updates
1. vp8->aom
2. removed no-effect statements and spaces

Change-Id: I367d05ff9bf1b9f3c71c517c45d8049d9d4236ec
2016-10-12 10:50:08 -07:00
Steinar Midtskogen b066b962a7 Fix missing parentheses in v64_align()
Change-Id: I16469062853c101965f56002be30ebc5823975b1
2016-10-11 12:36:17 -07:00
Steinar Midtskogen 9d6a53b8fd Improve v128 and v64 8 bit shifts for x86
Change-Id: I25dc61bab46895d425ce49f89fceb164bee36906
2016-10-11 12:36:17 -07:00
Steinar Midtskogen ebf209ba82 Make generic SIMD code compile if no native support
Change-Id: I7f691a0ae27f06ef3d727764829a60a8ffc509eb
2016-10-11 12:36:16 -07:00
Yaowu Xu 25faa0e9f5 Merge "Move tree writing code into bitwriter.h." into nextgenv2 2016-10-11 19:16:25 +00:00
Yaowu Xu 80eaf1a120 Merge "Extend CLPF to chroma." into nextgenv2 2016-10-11 18:44:31 +00:00
Yaowu Xu a1a7ad0c15 Merge "Make generic SIMD work with clang." into nextgenv2 2016-10-11 18:42:15 +00:00
Yaowu Xu 0bab35bf64 Merge "Fix clang-format warnings in aom_dsp/simd/v64_intrinsics_arm.h" into nextgenv2 2016-10-11 18:41:50 +00:00
Yaowu Xu 038d41045b Merge "Added high bit-depth support in CLPF." into nextgenv2 2016-10-11 18:41:15 +00:00
Yaowu Xu a2bbf621f1 Merge "Reduce memory footprint for CLPF decoding." into nextgenv2 2016-10-11 18:40:47 +00:00
Nathan E. Egge eeedc633c0 Move tree writing code into bitwriter.h.
Rename av1_write_tree() to aom_write_tree() and move it into bitwriter.h
 to match aom_read_tree() in bitreader.h.

Manually cherry-picked from aom/master:
33a143fa7a

Change-Id: I6c686cdd3e0f179d7e95c5bc6984558b62d46d67
2016-10-11 09:36:01 -07:00
Yaowu Xu 4960f7c3bd Merge "Added generic SIMD support for CLPF." into nextgenv2 2016-10-11 16:05:18 +00:00
Debargha Mukherjee fb865cf41c Merge "Add sse2 forward / inverse 4x8 and 8x4 transforms" into nextgenv2 2016-10-11 15:50:32 +00:00
Steinar Midtskogen ecf9a0c821 Extend CLPF to chroma.
Objective quality impact (low latency):

PSNR YCbCr:      0.13%     -1.37%     -1.79%
   PSNRHVS:      0.03%
      SSIM:      0.24%
    MSSSIM:      0.10%
 CIEDE2000:     -0.83%

Change-Id: I8ddf0def569286775f0f9d4d4005932766a7fc27
2016-10-10 15:23:38 -07:00
Steinar Midtskogen 7b7624e89e Make generic SIMD work with clang.
Change-Id: I2c504a078a7137bea6ba50c5768c1295878e9ea1
2016-10-10 15:18:57 -07:00