mirror of https://github.com/microsoft/git.git
Tolerate zlib deflation with window size < 32KB
Git currently reports loose objects as 'corrupt' if they have been deflated using a window size smaller than 32KB, because the experimental_loose_object() function doesn't recognise the header byte as a zlib header. This patch makes the function tolerant of all valid window sizes (15-bit down to 8-bit) without sacrificing its accuracy in distinguishing the standard loose-object format from the experimental (now abandoned) format.

On memory-constrained systems zlib may use a much smaller window size. Working on Agit, I found that Android uses a 4KB window, giving a header byte of 0x48, not 0x78. Consequently all loose objects generated appear 'corrupt', which is why Agit is a read-only Git client at this time - I don't want my client to generate Git repos that other clients treat as broken :(

This patch makes Git tolerant of different deflate settings. It might appear that it changes experimental_loose_object() to the point where it could incorrectly identify the experimental format as the standard one, but the two criteria (bitmask & checksum) can only give a false result for an experimental object where both of the following are true:

1) object size is exactly 8 bytes when uncompressed (bitmask)
2) ([single-byte in-pack git type&size header] * 256 + [1st byte of the following zlib header]) % 31 = 0 (checksum)

As it happens, for all possible combinations of valid object type (1-4) and window bits (0-7), the only time the 16-bit word is divisible by 31 is for 0x1838 - ie object type *1*, a Commit - which, due to the fields all Commit objects must contain, could never be as small as 8 bytes in size.

Given this, the combination of the two criteria (bitmask & checksum) always correctly determines the buffer format, and is more tolerant than the previous version.

The alternative to this patch is simply removing support for the experimental format, which I am also totally cool with.
References:

Android uses a 4KB window for deflation:
http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb2583409c28;hb=refs/heads/gingerbread#l53

Code snippet searching for false positives with the zlib checksum:
https://gist.github.com/1118177

Signed-off-by: Roberto Tyley <roberto.tyley@guardian.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parent
e9e0643fe6
Commit
7f684a2aff
32
sha1_file.c
@@ -1217,14 +1217,34 @@ static int experimental_loose_object(unsigned char *map)
 	unsigned int word;
 
 	/*
-	 * Is it a zlib-compressed buffer? If so, the first byte
-	 * must be 0x78 (15-bit window size, deflated), and the
-	 * first 16-bit word is evenly divisible by 31. If so,
-	 * we are looking at the official format, not the experimental
-	 * one.
+	 * We must determine if the buffer contains the standard
+	 * zlib-deflated stream or the experimental format based
+	 * on the in-pack object format. Compare the header byte
+	 * for each format:
+	 *
+	 * RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
+	 * Experimental pack-based : Stttssss : ttt = 1,2,3,4
+	 *
+	 * If bit 7 is clear and bits 0-3 equal 8, the buffer MUST be
+	 * in standard loose-object format, UNLESS it is a Git-pack
+	 * format object *exactly* 8 bytes in size when inflated.
+	 *
+	 * However, RFC1950 also specifies that the 1st 16-bit word
+	 * must be divisible by 31 - this checksum tells us our buffer
+	 * is in the standard format, giving a false positive only if
+	 * the 1st word of the Git-pack format object happens to be
+	 * divisible by 31, ie:
+	 *      ((byte0 * 256) + byte1) % 31 = 0
+	 *   =>        0ttt10000www1000 % 31 = 0
+	 *
+	 * As it happens, this case can only arise for www=3 & ttt=1
+	 * - ie, a Commit object, which would have to be 8 bytes in
+	 * size. As no Commit can be that small, we find that the
+	 * combination of these two criteria (bitmask & checksum)
+	 * can always correctly determine the buffer format.
 	 */
 	word = (map[0] << 8) + map[1];
-	if (map[0] == 0x78 && !(word % 31))
+	if ((map[0] & 0x8F) == 0x08 && !(word % 31))
 		return 0;
 	else
 		return 1;
@@ -0,0 +1,68 @@
+#!/bin/sh
+#
+# Copyright (c) 2011 Roberto Tyley
+#
+
+test_description='Correctly identify and parse loose object headers
+
+There are two file formats for loose objects - the original standard
+format, and the experimental format introduced with Git v1.4.3, later
+deprecated with v1.5.3. Although Git no longer writes the
+experimental format, objects in both formats must be read, with the
+format for a given file being determined by the header.
+
+Detecting file format based on header is not entirely trivial, not
+least because the first byte of a zlib-deflated stream will vary
+depending on how much memory was allocated for the deflation window
+buffer when the object was written out (for example 4KB on Android,
+rather than 32KB on a normal PC).
+
+The loose objects used as test vectors have been generated with the
+following Git versions:
+
+standard format: Git v1.7.4.1
+experimental format: Git v1.4.3 (legacyheaders=false)
+standard format, deflated with 4KB window size: Agit/JGit on Android
+'
+
+. ./test-lib.sh
+LF='
+'
+
+assert_blob_equals() {
+	printf "%s" "$2" >expected &&
+	git cat-file -p "$1" >actual &&
+	test_cmp expected actual
+}
+
+test_expect_success setup '
+	cp -R "$TEST_DIRECTORY/t1013/objects" .git/ &&
+	git --version
+'
+
+test_expect_success 'read standard-format loose objects' '
+	git cat-file tag 8d4e360d6c70fbd72411991c02a09c442cf7a9fa &&
+	git cat-file commit 6baee0540ea990d9761a3eb9ab183003a71c3696 &&
+	git ls-tree 7a37b887a73791d12d26c0d3e39568a8fb0fa6e8 &&
+	assert_blob_equals "257cc5642cb1a054f08cc83f2d943e56fd3ebe99" "foo$LF"
+'
+
+test_expect_success 'read experimental-format loose objects' '
+	git cat-file tag 76e7fa9941f4d5f97f64fea65a2cba436bc79cbb &&
+	git cat-file commit 7875c6237d3fcdd0ac2f0decc7d3fa6a50b66c09 &&
+	git ls-tree 95b1625de3ba8b2214d1e0d0591138aea733f64f &&
+	assert_blob_equals "2e65efe2a145dda7ee51d1741299f848e5bf752e" "a" &&
+	assert_blob_equals "9ae9e86b7bd6cb1472d9373702d8249973da0832" "ab" &&
+	assert_blob_equals "85df50785d62d3b05ab03d9cbf7e4a0b49449730" "abcd" &&
+	assert_blob_equals "1656f9233d999f61ef23ef390b9c71d75399f435" "abcdefgh" &&
+	assert_blob_equals "1e72a6b2c4a577ab0338860fa9fe87f761fc9bbd" "abcdefghi" &&
+	assert_blob_equals "70e6a83d8dcb26fc8bc0cf702e2ddeb6adca18fd" "abcdefghijklmnop" &&
+	assert_blob_equals "bd15045f6ce8ff75747562173640456a394412c8" "abcdefghijklmnopqrstuvwx"
+'
+
+test_expect_success 'read standard-format objects deflated with smaller window buffer' '
+	git cat-file tag f816d5255855ac160652ee5253b06cd8ee14165a &&
+	git cat-file tag 149cedb5c46929d18e0f118e9fa31927487af3b6
+'
+
+test_done
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.