libjpeg-turbo has never supported non-ANSI C compilers. Per the spec,
ANSI C compilers must have locale.h, stddef.h, stdlib.h, memset(),
memcpy(), unsigned char, and unsigned short. They must also handle
undefined structures.
This fixes an oversight from the integration of the arithmetic entropy
codec from libjpeg (66f97e6820). I chose
to integrate the latest implementation available at the time, which was
from jpeg-8b. However, I naively replaced cinfo->lim_Se with
DCTSIZE2 - 1, not realizing that-- because of SmartScale-- jpeg-8b
contains additional code
(https://github.com/libjpeg-turbo/libjpeg-turbo/blob/jpeg-8b/jdinput.c#L249-L334)
that guards against illegal values of cinfo->Se >= DCTSIZE2. Thus,
libjpeg-turbo's implementation of arithmetic decoding has never guarded
against such illegal values. This commit restores the relevant check
from the original jpeg-6b arithmetic entropy codec patch ("jpeg-ari",
1e247ac854).
Fixes#564
Regression introduced by 42825b68d5
The test case
https://user-images.githubusercontent.com/3491627/101376530-fde56180-38b0-11eb-938d-734119a5b5ba.jpg
is a malformed progressive JPEG image containing an interleaved Y/Cb/Cr
DC scan followed by two non-interleaved Y DC scans. Thus, the
prev_coef_bits[] array was initialized for the Y component but not the
other components, the uninitialized values for Cb and Cr were
transferred to the prev_coef_bits_latch[] array in smoothing_ok(), and
because cinfo->master->last_good_iMCU_row was 0,
decompress_smooth_data() read those uninitialized values when attempting
to smooth the second iMCU row.
Possibly fixes#478
Because of 01e3032354 (officially
eliminating support for compilers without unsigned char, since we never
effectively supported those compilers anyhow), GETJOCTET() is now a
no-op. Since that macro is in jmorecfg.h, it is part of the de facto
libjpeg API and must remain in the public headers. However, there is no
reason to continue using it internally, and eliminating its internal use
improves code readability.
... introduced by 42825b68d5. In fact,
fault-tolerant multi-scan block smoothing cannot currently be used with
the arithmetic decoder, because that decoder doesn't have any way of
distinguishing a normal end of scan from an unexpected end of scan.
Thus, this commit also modifies the change log to reset the expectations
regarding the scope of the fault-tolerant multi-scan block smoothing
feature. If, at some point in the future, the arithmetic decoder can be
modified to detect an unexpected end of scan, then one would need only
set entropy->pub.insufficient_data = TRUE when the arithmetic decoder
encounters an unexpected end of scan in order to make fault-tolerant
block smoothing work properly with that decoder.
This commit modifies the behavior of the block smoothing algorithm in
the libjpeg API library so that, if a scan in a multi-scan JPEG image is
incomplete (due to premature termination of the image stream), the block
smoothing parameters from the previous (complete) scan are used to
smooth any iMCU rows that the incomplete scan does not contain.
Closes#343
- When referring to specific clauses, annexes, tables, and figures, a
"timed reference" (a reference that includes the year) must be used in
order to avoid confusion.
- "CCITT" = "ITU-T"
- Replace ambiguous "JPEG spec" with the specific document number.
Within the libjpeg API code, it seems to be more the convention than not
to separate the macro name and value by two or more spaces, which
improves general readability. Making this consistent across all of
libjpeg-turbo is less about my individual preferences and more about
making it easy to automatically detect variations from our chosen
formatting convention. I intend to release the script I'm using to
validate this stuff, once it matures and stabilizes a bit.
With rare exceptions ...
- Always separate line continuation characters by one space from
preceding code.
- Always use two-space indentation. Never use tabs.
- Always use K&R-style conditional blocks.
- Always surround operators with spaces, except in raw assembly code.
- Always put a space after, but not before, a comma.
- Never put a space between type casts and variables/function calls.
- Never put a space between the function name and the argument list in
function declarations and prototypes.
- Always surround braces ('{' and '}') with spaces.
- Always surround statements (if, for, else, catch, while, do, switch)
with spaces.
- Always attach pointer symbols ('*' and '**') to the variable or
function name.
- Always precede pointer symbols ('*' and '**') by a space in type
casts.
- Use the MIN() macro from jpegint.h within the libjpeg and TurboJPEG
API libraries (using min() from tjutil.h is still necessary for
TJBench.)
- Where it makes sense (particularly in the TurboJPEG code), put a blank
line after variable declaration blocks.
- Always separate statements in one-liners by two spaces.
The purpose of this was to ease maintenance on my part and also to make
it easier for contributors to figure out how to format patch
submissions. This was admittedly confusing (even to me sometimes) when
we had 3 or 4 different style conventions in the same source tree. The
new convention is more consistent with the formatting of other OSS code
bases.
This commit corrects deviations from the chosen formatting style in the
libjpeg API code and reformats the TurboJPEG API code such that it
conforms to the same standard.
NOTES:
- Although it is no longer necessary for the function name in function
declarations to begin in Column 1 (this was historically necessary
because of the ansi2knr utility, which allowed libjpeg to be built
with non-ANSI compilers), we retain that formatting for the libjpeg
code because it improves readability when using libjpeg's function
attribute macros (GLOBAL(), etc.)
- This reformatting project was accomplished with the help of AStyle and
Uncrustify, although neither was completely up to the task, and thus
a great deal of manual tweaking was required. Note to developers of
code formatting utilities: the libjpeg-turbo code base is an
excellent test bed, because AFAICT, it breaks every single one of the
utilities that are currently available.
- The legacy (MMX, SSE, 3DNow!) assembly code for i386 has been
formatted to match the SSE2 code (refer to
ff5685d5344273df321eb63a005eaae19d2496e3.) I hadn't intended to
bother with this, but the Loongson MMI implementation demonstrated
that there is still academic value to the MMX implementation, as an
algorithmic model for other 64-bit vector implementations. Thus, it
is desirable to improve its readability in the same manner as that of
the SSE2 implementation.
The convention used by libjpeg:
type * variable;
is not very common anymore, because it looks too much like
multiplication. Some (particularly C++ programmers) prefer to tuck the
pointer symbol against the type:
type* variable;
to emphasize that a pointer to a type is effectively a new type.
However, this can also be confusing, since defining multiple variables
on the same line would not work properly:
type* variable1, variable2; /* Only variable1 is actually a
pointer. */
This commit reformats the entirety of the libjpeg-turbo code base so
that it uses the same code formatting convention for pointers that the
TurboJPEG API code uses:
type *variable1, *variable2;
This seems to be the most common convention among C programmers, and
it is the convention used by other codec libraries, such as libpng and
libtiff.
These days, INT32 is a commonly-defined datatype in system headers. We
cannot eliminate the definition of that datatype from jmorecfg.h, since
the INT32 typedef has technically been part of the libjpeg API since
version 5 (1994.) However, using INT32 internally is risky, because the
inclusion of a particular header (Xmd.h, for instance) could change the
definition of INT32 from long to int on 64-bit platforms and thus change
the internal behavior of libjpeg-turbo in unexpected ways (for instance,
failing to correctly set __INT32_IS_ACTUALLY_LONG to match the INT32
typedef-- perhaps as a result of including the wrong version of
jpeglib.h-- could cause libjpeg-turbo to produce incorrect results.)
The library has always been built in environments in which INT32 is
effectively long (on Windows, long is always 32-bit, so effectively it's
the same as int), so it makes sense to turn INT32 into an explicitly
long datatype. This ensures that libjpeg-turbo will always behave
consistently, regardless of the headers included at compile time.
Addresses a concern expressed in #26.
The IJG README file has been renamed to README.ijg, in order to avoid
confusion (many people were assuming that that was our project's README
file and weren't reading README-turbo.txt) and to lay the groundwork for
markdown versions of the libjpeg-turbo README and build instructions.
Most of these involved left shifting a negative number, which is
technically undefined (although every modern compiler I'm aware of
will implement this by treating the signed integer as a 2's complement
unsigned integer-- the LEFT_SHIFT() macro just makes this behavior
explicit in order to shut up ubsan.) This also fixes a couple of
non-issues in the entropy codecs, whereby the sanitizer reported an
out-of-bounds index in the 4th argument of jpeg_make_d_derived_tbl().
In those cases, the index was actually out of bounds (caused by a
malformed JPEG image), but jpeg_make_d_derived_tbl() would have caught
the error and aborted prior to actually using the invalid address. Here
again, the fix was to make our intentions explicit so as to shut up
ubsan.