Граф коммитов

334 Коммитов

Автор SHA1 Сообщение Дата
Timothy B. Terriberry e2795e9978 Fix valgrind errors in vp8_sixtap_predict8x4_armv6().
This function was accessing values below the stack pointer, which
 can be corrupted by signal delivery at any time.

Change-Id: I92945b30817562eb0340f289e74c108da72aeaca
2010-09-24 14:34:18 -07:00
Johann f30e8dd7bd combine max values and compare once
previous implementation compared each set of values to limit and then
&'d them together, requiring a compare and & for each value.

this does the accumulation first, requiring only one compare

Change-Id: Ia5e3a1a50e47699c88470b8c41964f92a0dc1323
2010-09-24 15:42:50 -04:00
John Koleszar dbd57c2663 Merge "move reconintra_mt to decoder (for now)" 2010-09-24 08:46:35 -07:00
John Koleszar 8ca779aba8 disable compilation of debugging code
This patch avoids compiling some debugging code in onyx_if.c. The most
significant fix is to avoid generating code for vp8_write_yuv_frame,
which is never called. Some other code was removed by the dead code
elimination performed by the compiler, and this patch does it with the
preprocessor instead. There are advantages both ways.

Change-Id: I044fd43179d2e947553f0d6f2cad5b40907ac458
2010-09-24 11:42:22 -04:00
John Koleszar cbdc129895 darwin-icc: build for specific SDKs
Add the missing -isysroot and -mmacosx-version-min flags to ICC builds.
Fixes issue #185.

Change-Id: I2fb37fcaaafef7122a61ced603569f4aa17f8bbc
2010-09-24 11:40:33 -04:00
Yunqing Wang aab0f5b121 Merge "Adjust multi-thread sync ranges according to image sizes" 2010-09-24 08:34:07 -07:00
John Koleszar 48e76ff4fd move reconintra_mt to decoder (for now)
reconintra_mt.c is only required for building the decoder right now.
It could definitely be used for the encoder in the future, but it
currently depends on decoder only data structures. (onyxd_int.h,
VP8D_COMP, etc). Move it from common/ to decoder/ until the
necessary changes to the common multithread code are complete.

This patch is needed to build with --disable-vp8-decoder.

Change-Id: I568c52221a2b309234d269675cba97131ce35c86
2010-09-24 11:23:06 -04:00
John Koleszar e913eb97c9 configure: enable PIC for shared libs by default
Shared libs generally require PIC, so this saves a little typing at
configure time.

Change-Id: I357d70cc68434f3283fee78873052d2b7d77c777
2010-09-24 08:40:27 -04:00
John Koleszar f9b2ca5b99 configure: add --enable-small
Build with -O2 rather than -O3, to dissuade the compiler from inlining
so much. See issue #1.

Change-Id: Iacb8ddb59125d3f01c5fea846b45a1c004c9aee0
2010-09-24 08:40:27 -04:00
John Koleszar 329aaaf453 Merge "Add getter functions for the interface data symbols" 2010-09-24 05:39:48 -07:00
John Koleszar fa7a55bb04 Add getter functions for the interface data symbols
Having these symbols be available as functions rather than data is
occasionally more convenient. Implemented this way rather than a
get-codec-by-id style to avoid creating a link-time dependency
between the encoder and the decoder.

Fixes issue #169

Change-Id: I319f281277033a5e7e3ee3b092b9a87cce2f463d
2010-09-23 14:58:43 -04:00
Yunqing Wang 8db5da2906 Adjust multi-thread sync ranges according to image sizes
In multi-threaded decoder, set different sync ranges for
different video resolutions.

Change-Id: Iea48fd36f51919e0152c8ed3b1f10e1b723c0ca7
2010-09-23 13:53:09 -04:00
Johann 7fed3832e7 Remove dead code
The new loopfilter was originally introduced as an experimental change.
It's permanent now.

Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546
2010-09-22 11:07:34 -04:00
John Koleszar cdd2066687 unset execute bit on c source
Change-Id: I6625ee41f8872908cb015ce0729e1c7a105b5217
2010-09-21 19:48:06 -04:00
Johann a8a38bcf10 Merge "Fix typo" 2010-09-21 12:03:37 -07:00
Johann 0511cbff7a Fix typo
Also, move with other ppc32 options

Change-Id: I0b97413c767909c5682afc9bdd954f3d43401f6c
2010-09-21 14:56:42 -04:00
John Koleszar 6f4c0435d1 Merge "Don't reset mb clamping state during splitmv decoding" 2010-09-21 09:06:59 -07:00
John Koleszar 4d391e8ed2 Don't reset mb clamping state during splitmv decoding
The MV decoding changes in c5fb0eb introduced a bug where the
macroblock clamping state was reset for each partition, so if an
earlier partition needed clamping but a subsequent one didn't,
the MB wouldn't receive clamping. Instead, the state is only
set during splitmv decoding, never cleared.

Change-Id: I224fe258493405ee0f6a04596acdb622c475e845
2010-09-21 11:58:48 -04:00
John Koleszar 3d5f8291b1 Merge "gitignore: initial version" 2010-09-21 07:13:26 -07:00
John Koleszar 12651b3c2b Merge "configure: support for ppc32-linux-gcc" 2010-09-21 07:02:43 -07:00
John Koleszar 015cfcafbd Merge "Add high limit check for unsigned parameters" 2010-09-21 05:36:46 -07:00
Yunqing Wang a23ccf8f8c Merge "Restructure multi-threaded decoder" 2010-09-21 05:00:30 -07:00
Fritz Koenig b7dc9398f2 Use movq instead of movdqu.
Movdqu is more expensive (throughput, uops) than movq.  Minimal
impact for newer big cores, but ~2.25% gain on Atom.

Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f
2010-09-20 11:34:26 -07:00
Fritz Koenig 1c906448cc Merge "Better choice of instruction filter mask comparision." 2010-09-20 11:01:51 -07:00
Johann 6cf2b4aa0e Merge "reorder data to use wider instructions" 2010-09-20 10:47:33 -07:00
Johann 9c9afbab85 Merge "Update NEON wide idcts" 2010-09-20 10:47:22 -07:00
Fritz Koenig 8eae7fe7e8 Better choice of instruction filter mask comparision.
Use pmaxub instead of a combination of psubusb/por to
determine if any comparisons go over the limit.

Change-Id: I3f0bd7d2aabe5fee9ba6620508e2b60605abcb82
2010-09-20 10:20:38 -07:00
Guillermo Ballester Valor 236906863a Add high limit check for unsigned parameters
The patch related with issue #55 (5a72620) fixed some warnings, but the
fix was not optimal. It actually was a trick to confuse compiler rather
than a fix.

This patch fixes it by creating a new macro used when needed just a high
limit check for an unsigned.

Change-Id: I94b322e0f7fb07604b3b1df1f9321185f48cfcb5
2010-09-20 10:03:05 -04:00
Johann 022323bf85 reorder data to use wider instructions
the previous commit laid the groundwork by doing two sets of idcts
together. this moved that further by grouping the interesting data
(q[0], q+16[0]) together to allow using wider instructions. also
managed to drop a few instructions by recognizing that the constant
for sinpi8sqrt2 could be downshifted all the time which avoided a
dowshift as well as workarounds for a function which only accepted
signed data

looks like a modest gain for performance: at qcif, went from ~180
fps to ~183
Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf
2010-09-17 16:47:39 -04:00
Yunqing Wang f857a85088 Restructure multi-threaded decoder
On each MB, loopfiltering is done right after MB decoding. This
combines two loops in multi-threaded code into one, which reduces
number of synchronizations to half.

The above-row/left-col data are saved in temp buffers for
next-row/next MB decoding.

Tests on 4-core gLucid machine showed 10% decoder performance
gain with threads=4 (tulip clip). Testing on other platforms
isn't done yet.

Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9
2010-09-17 09:56:05 -04:00
John Koleszar 9100073e8d cleanup: remove unused xprintf
These files aren't currently used, and we can get them back if we
need them.

Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5
2010-09-16 13:14:12 -04:00
John Koleszar 147b125b15 Reduce size of tokenizer tables
This patch reduces the size of the global tables maintained by the
tokenizer to 16k from 80k-96k. See issue #177.

Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe
2010-09-16 10:00:04 -04:00
Fritz Koenig 746439ef6c Modify GET_GOT macro for performance.
GET_GOT was producing a zero length call.  This resulted in
pipeline flushes occuring when returing from the assembly
functions.  Masked on out of order cores, but evident on
Atom cores.

Change-Id: I8c375af313e8a169c77adbaf956693c0cfeb5ccd
2010-09-15 12:41:15 -07:00
Fritz Koenig 769f2424cc Removed unnecessary pxor.
There is no need to make sure that the lower byte of the
register is 0 because the downshift by 11 overwrites that byte.

Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1
2010-09-13 18:34:34 -07:00
Fritz Koenig 71a1c19754 Merge "Make block access to frame buffer sequential" 2010-09-13 11:04:22 -07:00
John Koleszar 887d6ef49a configure: support for ppc32-linux-gcc
Fixes issue 89. Thanks to josejx for the patch.

Change-Id: I7e664fed703b49f2fb3af4c5e6ce1173742000c2
2010-09-13 09:04:55 -04:00
John Koleszar 7f1a908b97 cosmetics: expand tabs in configure
Change-Id: I88ddb0afb56ef2be8184b56fe125ad938ead7a84
2010-09-13 09:02:18 -04:00
Fritz Koenig a65cd3def0 Make block access to frame buffer sequential
Sequentially accessing memory from a low address to a high
address should make it easier for the processor to predict
the cache.

Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d
2010-09-10 16:27:28 -07:00
Scott LaVarnway a32ded1d5f Merge "Improved subset block search" 2010-09-09 11:51:29 -07:00
Scott LaVarnway c5fb0eb8d9 Improved subset block search
Improved the subset block search and fill.  (about 3% improvement for
32 bit)  Modified/merged the code in order to create
vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock
level. This will allow the decode loop (in the future) to decode
modes/mvs on a frame, row, or mb level.

Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3
2010-09-09 14:42:48 -04:00
Johann 14ba764219 Update NEON wide idcts
Expand 93c32a55 which used SSE2 instructions to do two
idct/dequant/recons at a time to NEON. Initial working
commit. More work needs to be put into rearranging and
interlacing the data to take advantage of quadword
operations, which is when we'll hopefully see a much
better boost

Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1
2010-09-09 14:08:12 -04:00
John Koleszar edcbb1c199 Fix GF interval for non-lagged ARFs
When ARFs are enabled in non-lagged compress modes, the GF interval
was being reset to zero. Non-lagged ARF updates were enabled in commit
63ccfbd, but this incorrect GF interval caused a quality regression.

Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3
2010-09-09 13:18:54 -04:00
Fritz Koenig 6d90f867e4 Merge branch 'master' of git://review.webmproject.org/libvpx 2010-09-09 08:54:21 -07:00
John Koleszar c2140b8af1 Use WebM in copyright notice for consistency
Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.

Fixes issue #97.

Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
2010-09-09 10:01:21 -04:00
Jim Bankoski 69ae8f475d Skip unnecessary search of identical frames
vp8_get_compressed_data() was defeating logic in
encode_frame_to_datarate() that determined the reference buffers to
search and forcing all frames to be eligible to search. In cases
where buffers have identical contents, this is unnecessary extra
work.

Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114
2010-09-08 11:31:34 -04:00
Jim Bankoski 63ccfbd545 Enable ARFs for non-lagged compress
ARFs were explicitly disabled except in lagged compress mode. New
ARF logic allows for the ARF buffer to hold an older golden frame,
which does not require lagged compress.

Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79
2010-09-08 11:26:13 -04:00
Fritz Koenig 3fb37162a8 Bilinear subpixel optimizations for ssse3.
Used pmaddubsw for multiply and add of two filter taps
at once for 16x16 and 8x8 blocks.

Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea
2010-09-07 17:19:40 -07:00
Scott LaVarnway 0de458f6b9 Reduced the size of MB_MODE_INFO
Moved partition_bmi and partition_count out of MB_MODE_INFO and
placed into MACROBLOCK.  Also reduced the size of other members
of the MB_MODE_INFO struct.  For 1080p, the memory was reduced
by 1,209,516 bytes.  The decoder performance appeared to improve
by 3% for the clip used.
Note:  The main goal for this change is to improve the decoder
performance.  The encoder will be revisited at a later date for
further structure cleanup.

Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613
2010-09-03 16:43:23 -04:00
John Koleszar b0519a26b1 Update CHANGELOG for v0.9.2 release
Change-Id: I184e927987544e9f34f890249b589ea13a93a330
2010-09-02 14:56:47 -04:00
John Koleszar e4b5002490 Update AUTHORS
Change-Id: I0395ffa107651a773fd11d12682ab9372f76a90b
2010-09-02 13:41:03 -04:00