Commit graph

20318 commits

Author SHA1 Message Date
Steinar Midtskogen febe223d54 Move the CLPF damping adjustment for strength up in the call chain
Rather than having the adjustment in the leaf functions, do the
adjustment in the top-level function.  Optimising compilers would
figure this out themselves as far as the functions are inlined, but
probably no further and this patch gives a slightly reduced object
code size.

Change-Id: I104750962f613fa665391c9b2a9e99bcc6f47f93
2017-04-02 18:23:08 +00:00
Jean-Marc Valin ec70797dc3 Temporarily revert some 4:2:2 code
As part of 9cf0c9cde7 the buffering was made
to better handle 4:2:2, but that causes regressions in the tests, so we're
backing out part of it for now.

Change-Id: I9ca4cfeb159aa65514613989e3dcbc30f86ec5b2
2017-04-02 11:54:58 -04:00
Yue Chen 5558e5da12 Use 1 sample per neighbor for local warping model estimation
Only 1 sample needs to be collected. Max of 8 neighbors are
used.
In LS estimation, the projection samples (sx, sy)->(dx, dy) are
intentionally smoothed by assuming 3 shifted versions
(sx, sy+n)->(dx, dy+n), (sx+n, sy)->(dx+n, dy), (sx+n,
sy+n)->(dx+n, dy+n) also contribute to the estimation.
For example, instead of using A[0] = sx^2, we use the sum of
squares of source x of the four points, A[0] += 4*sx^2+4*n*sx+2*n^2.
Computationally, this adds little overhead. Coding gain is mostly
the same as the old version. Without the smoothing, we would
lose 0.3% on lowres.

Change-Id: I04be32cffa525f7dc8ee583c0bf211d7bdc6e609
2017-04-02 07:02:13 +00:00
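The smoothed least-squares term described in 5558e5da12 can be checked by direct expansion. The four contributing points have source x values sx, sx, sx+n and sx+n, so the sum of their squares expands to 4*sx^2 + 4*n*sx + 2*n^2. The sketch below is only a numerical check of that expansion; the function names are hypothetical and are not taken from the libaom sources.

```python
def smoothed_sum_sq_x(sx: int, n: int) -> int:
    """Sum of squared source-x values over a sample and its three shifted copies."""
    # x coordinates of (sx, sy), (sx, sy+n), (sx+n, sy), (sx+n, sy+n)
    points_x = [sx, sx, sx + n, sx + n]
    return sum(x * x for x in points_x)


def closed_form_sum_sq_x(sx: int, n: int) -> int:
    """Closed form of the same sum, obtained by expanding 2*sx^2 + 2*(sx+n)^2."""
    return 4 * sx * sx + 4 * n * sx + 2 * n * n


if __name__ == "__main__":
    # Compare the brute-force sum with the closed form on a few sample values.
    for sx in (-5, 0, 3, 17):
        for n in (1, 2, 4):
            assert smoothed_sum_sq_x(sx, n) == closed_form_sum_sq_x(sx, n)
    print("closed form matches the four-point sum")
```

The same pattern applies to the other entries of the normal-equation matrix: each accumulates the four-point sum of the corresponding product instead of a single-point product.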
Yue Chen 54723f97e2 Use only the first predictors of compound neighbors in OBMC
Loss of gain in AWCY
HL 0.23%
LL 0 (since no compound is used in LL)
lowres 0.277%
midres 0.248%

Change-Id: I46ad1e2f07411c838f2ca6765de57a60a9c68b12
2017-04-02 06:55:37 +00:00
Yue Chen 13e412ebda Not use sub8x8 mv of neighbors in obmc
Take all sub8x8 neighbors as 8x8 blocks and use mv assigned to
the last block.

Change of performance in AWCY
HL improved by 0.01%
LL improved by 0.06%

Change-Id: I55d3c5401222396d871f9157b62b3de29e5390b0
2017-04-02 06:55:37 +00:00
Steinar Midtskogen 41167b3901 Swap order of xor and offset
Change-Id: I5b202d8e57dbc8fc283f2fda7afe0fec0c3ef622
2017-04-02 02:04:11 +00:00
Steinar Midtskogen b1e04f7e6a Improve high bitdepth CLPF by using newly added v128_ssub_u16
Change-Id: I392b801f61b0d3bcd1cd6157ab783f76ea8c9e5e
2017-04-02 02:04:11 +00:00
Steinar Midtskogen 9b8444a17c Add v64_ssub_u16, v128_ssub_u16 and v256_ssub_u16
Change-Id: I60543913cbd8dc5cad524ab74697227f9e93836e
2017-04-02 02:04:11 +00:00
Michael Bebenita 131a0d5519 Add multiple of 8 copies
Change-Id: I8fb710b767a986c898fbef9e329f30bfb0a22dad
2017-04-01 15:58:26 -07:00
Michael Bebenita 68f3c3bb04 Refactor copy_rect loops
Change-Id: Id3d8034fba7ebcfeadbd753f80a4d6abadac82bd
2017-04-01 13:34:06 -07:00
Michael Bebenita 30134b15cb Refactor fill_rect loops
Change-Id: I3e9255c9b82a2fb9806718caf4f744932f617456
2017-04-01 13:13:33 -07:00
Michael Bebenita 300312dce8 Flatten out colbuf
Change-Id: Ib6f813f68a5d507c0adeec3dcb4c8dea100d23da
2017-04-01 13:13:23 -07:00
Michael Bebenita 7dc0530723 Factor out hsize and vsize in av1_cdef_frame
Change-Id: If46d338300b0db02d6ef41b2ce028c33eaa44cf0
2017-04-01 12:45:40 -07:00
Michael Bebenita 84bc7991c3 Refactor mbmi_cdef_strength lookup
Change-Id: Ib28af04a500d7217a48709f1b0fe91d60184fc46
2017-04-01 12:21:27 -07:00
Steinar Midtskogen bea96c5a1d Make CLPF unit test try all dampings
Also rebalance the iterations depending on the bitdepth, since this
made the high bitdepth tests very slow.

Change-Id: If02c52900875c01f662bc8ecf3b509d3b94529c5
2017-04-01 18:16:59 +00:00
Jean-Marc Valin 8362114e6a Fix constrain() broken in 19f7663574
Change-Id: Icd0313d12025b9dcf84f203dbfb1e0756d220dbc
2017-04-01 18:11:21 +00:00
Jingning Han 8a4de2182b Rework sb_compute_dering_list()
Use a unified is_8x8_block_skip() to find if an 8x8 block is coded
in skip mode. This covers cases w/o cb4x4 mode turned on.

Change-Id: I94b72ca7cf0fadbb61bfaa8b97f482feb94fd0f2
2017-04-01 16:04:44 +00:00
Jean-Marc Valin 03de1b2886 Implement tile boundaries
Change-Id: I7fad7934b3b43b17e762a8610cb9bf3bbb837ebd
2017-04-01 12:52:01 +00:00
Jingning Han a4ecb1bfc1 Revert changes in od_dering filter
Respect the od_dering.c file as a codec-independent process. Take out
the changes that assume a dependency on AV1 common data. No coding
stats change.

Change-Id: Ic4a9e6356469e0667f9765302a5e8b872589fc5d
2017-04-01 05:13:52 +00:00
Michael Bebenita 54170d92c9 Add SIMD code for block copies.
Change-Id: I696da03fb5e9e87d054a9aa9238ad96937a0e281
2017-04-01 04:44:25 +00:00
Jean-Marc Valin 19f7663574 Simplifying constrain() and constrain_hbd()
Change-Id: Ie6677376861cb053a946724ba32a98a33d68e123
2017-04-01 04:34:51 +00:00
Steinar Midtskogen 45544f9165 Move the contents of clpf_simd_kernels.h to clpf_simd.h
The only reason for clpf_simd_kernels.h was to share the contents with
clpf_rdo_simd.h and clpf_simd.h, but clpf_rdo_simd.h is now removed.

Change-Id: I9132b2f397101069ab7d04065a215c662985e6ce
2017-04-01 04:34:51 +00:00
Yue Chen 0c731f7fea Fix compiling error for MOTION_VAR + CB4X4
Change-Id: I50309970120bbfedc9fefb8d2e012689234fb659
2017-03-31 23:58:15 +00:00
Jingning Han 944b805b1a Retain the frame size as integer multiples of 8
Resolve potential issues in loop filter and cdef.

BUG=aomedia:414

Change-Id: I756c2a16bcb1582be60b2bdbedfc44773ed8f4f3
2017-03-31 22:57:31 +00:00
Yi Luo 13d2aee7df Add the missing IDTX type optimization to hybrid txfm
Change-Id: I99b15e5270bfefe2eb3e982aeba06ed564540d73
2017-03-31 21:33:47 +00:00
Frederic Barbier 72e2e982ee Avoid out-of-bounds issue
When accessing the reference vector list with NONE_FRAME

BUG=aomedia:412

Change-Id: I82a23591d6d9a179eb6f3b1e40f8d1f4018a53d8
2017-03-31 21:01:21 +00:00
Steinar Midtskogen 6501122f1a Improve high bitdepth CLPF SIMD
The high bitdepth was a direct translation of the low bit code, but
the tricks to keep 9 bit differences saturated within 8 bit are
redundant in high bitdepth, so these were replaced with simpler
and more readable code.

Change-Id: I0710a1f1b9dcde8039d3dfa0f74cd2ea2b3bae27
2017-03-31 19:55:08 +00:00
Jingning Han 9cf0c9cde7 Refactor av1_cdef_frame()
Make the pixel offset scalable with mode_info block size.

Change-Id: I2cd16be64240c613adf6222a7addbda5db267579
2017-03-31 19:03:43 +00:00
James Zern f791e631c7 od_filter_dering_direction_*: make int64->int explicit
clears -Wshorten-64-to-32 warnings

Change-Id: I55231a9a0eec7ade5b06328f386ec19b11860646
2017-03-31 17:22:44 +00:00
James Zern 3e2613b1da av1: normalize aom_enc_frame_flags_t usage
quiets -Wshorten-64-to-32 warnings

ported from libvpx:
710483308 vp9: normalize vpx_enc_frame_flags_t usage

Change-Id: Ice037acb675d1d81bfedf2dfcfa91a8a29a19dfd
2017-03-31 17:22:44 +00:00
Debargha Mukherjee 3b6c54479b Refactoring related to shear parameter computation
Shear parameters for global motion are now computed once
when the parameters are determined.

Change-Id: Idfd53410079a81a81ddd4728f173a0d0ec60230b
2017-03-31 17:14:24 +00:00
Urvang Joshi 5ddac0aac8 RTCD defs: Remove empty specialize statements once and for all.
A similar cleanup happened before, but the empty statements have since
reappeared. I added a check in 'specialize' subroutine to die whenever
such an empty specialize call is found, so that config+make would fail.

Change-Id: I300ca0f0b077c0aeca8096d6460d8fb1c364d9b9
2017-03-31 16:40:03 +00:00
David Barker 404b2e873c Allow NEAR_NEARMV and NEW_NEWMV modes to use ref_mv_idx
When ext-inter and ref-mv are both enabled, this patch
allows the NEAR_NEARMV and NEW_NEWMV modes to pick from
the extended reference mv list, just like the NEARMV and
NEWMV modes can.

Change-Id: Ibcc9e19dba7779422c1c9589d5498159e83bf61e
2017-03-31 16:26:12 +00:00
Alex Converse 61f37b8760 Crop distortion to visible MIs
Ported from VP9 with some heavy modifications

bsize_dist@2017-03-29T23:18:27.564Z -> bcropped_dist@2017-03-29T23:21:00.200Z

   PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
-0.0966 | -0.0922 |  0.0032 |  -0.0618 | -0.0579 | -0.0441 |  -0.0959

Change-Id: Icdfcf47a9017fd3180e7fbc963196a43c5376c4e
2017-03-31 16:25:44 +00:00
Alex Converse 29608d84af variance: Add odd size sse functions
Change-Id: I5eb7870d4b1b83bb907e539528f27f80a42e2fad
2017-03-31 16:25:44 +00:00
Yaowu Xu 19e0c4b6bd update md5 to reflect bitstream change
Making CDEF enabled by default changed the output bitstream of this
test.

Change-Id: I73a10d0cc339b7159bd30994b13127e9a4bf709a
2017-03-31 00:15:31 +00:00
Yi Luo 9d24735537 High bit depth inter prediction filter AVX2
On i7-6700:
- Function level speed improvement: 23%-29%
- User level speed improvement:
   decoder: ~2%-4%.
   encoder: <1%.

Change-Id: I02937a72304c3b356ca41e580352790df391f0a2
2017-03-30 23:12:13 +00:00
Yi Luo 9a3d29eadf Add SSE2 av1_fht32x32
BUG=aomedia:407

Change-Id: I27a7a230bbc701920a996d1e22ae4d22ca8cfead
2017-03-30 21:23:55 +00:00
James Zern e5034e341b clpf_test: mark TestSpeed disabled
performance characteristics in test environments vary, speed tests
should be for local performance testing. these can still be run with:
--gtest_filter=*TestSpeed* --gtest_also_run_disabled_tests

Change-Id: I96a05fe72336b7654ae832d3d2114dacc8203aa5
2017-03-30 20:23:52 +00:00
James Zern 22c0d57c28 dering_test: mark TestSpeed disabled
performance characteristics in test environments vary, speed tests
should be for local performance testing. these can still be run with:
--gtest_filter=*TestSpeed* --gtest_also_run_disabled_tests

Change-Id: I4c00ca6970ba7dc8387ee509e695ec922810c3ae
2017-03-30 20:23:52 +00:00
Jean-Marc Valin 133a98787c Prevent PVQ SSE search from putting pulses beyond n-1
Change-Id: Ib9fec5f2e00ecd73006a603a61d3fddee5229cf8
2017-03-30 18:55:29 +00:00
Yunqing Wang 8036058587 Enable/disable unit tests correctly in decoder-only build case
While building the decoder-only AV1, the unit tests need to be the ones
that only call decoder functions.

BUG=aomedia:395

Change-Id: Iac7b464aa222a177c06b2e037faa6717305cd59d
2017-03-30 18:36:55 +00:00
Debargha Mukherjee 11f0e40d74 A few fixes for global motion
Handles a rare division by 0 case.
Also adds a check on global motion parameters to disable
if the parameters obtained are outside the range that the
shear supports. This fixes a rare assert failure.
Also changes the recode loop threshold somewhat.

Change-Id: I4c6e74b914ac653cd9caa0563d78b0a19a2a8627
2017-03-30 16:50:58 +00:00
Alex Converse 4c5b020472 Make aom_sum_squares_2d_i16 take width and height parameters.
SSE2 may be needed for nx4 and 4xn.

Change-Id: I3c10112447fdb5fe51a68bc2c6e2f2641b102723
2017-03-30 15:49:22 +00:00
Jean-Marc Valin 2d5c201619 SSIM-like contrast term for CDEF distortion function
high-latency, cpu=0:

  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
0.0378 |  0.1946 |  0.1385 |  -0.1159 | -0.2058 | -0.2085 |     0.1353

low-latency, cpu=0:

  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
0.2388 |  0.2234 |  0.3290 |   0.0623 | -0.1716 | -0.1704 |     0.2542

low-latency, cpu=4:

  PSNR | PSNR Cb | PSNR Cr | PSNR HVS |    SSIM | MS SSIM | CIEDE 2000
0.4089 |  0.3477 |  0.6132 |   0.1729 | -0.1905 | -0.1610 |     0.5522

Change-Id: I35b8596667d82a127847b209416ad83e3b839a9a
2017-03-30 09:40:57 -04:00
Steinar Midtskogen 40fbd217a0 Optimise od_dir_find8() for SSE4.1
Change-Id: I56a35b0d3d76294cc7b3d601770f7dcef12a8bc9
2017-03-30 00:54:57 +00:00
Yue Chen 1bd42be68f Restrict # of neighbors in obmc blending
Only blend with the first N neighbors at each side. If the size of
one dimension is 8/16/32/64, the max # of neighbors to overlap
with is 1/2/3/4.
Previously we disabled obmc mode if there were too many neighbors.

Change of performance in AWCY, compared to disabling obmc if
at any side there are more than 2 overlappable neighbors.
HL improved by 0.02%
LL improved by 0.09%

Change-Id: I93d9a65c6c4aabf0b4a4946e2253d3e2ef21a662
2017-03-30 00:32:12 +00:00
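The neighbor limit described in 1bd42be68f can be sketched as a small lookup. This is a minimal illustration, assuming the limit depends only on the block dimension along the edge being blended; the names `max_obmc_neighbors` and `neighbors_to_blend` are hypothetical and are not the codec's actual API.

```python
# Limits stated in the commit message: dimension 8/16/32/64 -> 1/2/3/4 neighbors.
MAX_NEIGHBORS_BY_SIDE = {8: 1, 16: 2, 32: 3, 64: 4}


def max_obmc_neighbors(side: int) -> int:
    """Return the maximum number of neighbors blended along an edge of this length."""
    if side not in MAX_NEIGHBORS_BY_SIDE:
        raise ValueError(f"unsupported block dimension: {side}")
    return MAX_NEIGHBORS_BY_SIDE[side]


def neighbors_to_blend(neighbors: list, side: int) -> list:
    """Keep only the first N neighbors along one side, rather than disabling OBMC."""
    return neighbors[:max_obmc_neighbors(side)]


if __name__ == "__main__":
    # A 32-wide block with four neighbors above: only the first three are blended.
    print(neighbors_to_blend(["n0", "n1", "n2", "n3"], 32))
```

Truncating the list keeps OBMC available for blocks with many small neighbors, which is the behavior change the commit describes relative to disabling the mode outright.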
Steinar Midtskogen dfad2b1579 Add unit tests for SIMD optimised dering functions
Change-Id: I1d3a7dcf8be1fc8f8fc6a70a5660a8c68c60b5ea
2017-03-29 23:47:34 +00:00
Steinar Midtskogen b8ff6aaf5d Add SIMD support for CDEF dering for sse2/ssse3 and neon
Change-Id: Ibaaed850ddceba9c3db542eaf4a1c623ce6b412b
2017-03-29 23:47:21 +00:00
Luc Trudeau 7faea43653 Revert "[PVQ] Don't transform if block skipped"
This reverts commit c538d6815e.

Reason for revert: should be av1_fwd_txfm(), instead of fwd_txfm()

Change-Id: I58657e54eb0a1c20c930b32cd53b6d05493eb8f4
2017-03-29 22:13:37 +00:00