A bitmap index is a `.bitmap` file that can be found inside
`$GIT_DIR/objects/pack/`, next to its corresponding packfile, and
contains precalculated reachability information for selected commits.
The full specification of the format for these bitmap indexes can be found
in `Documentation/technical/bitmap-format.txt`.
For a given commit SHA1, if it happens to be available in the bitmap
index, its bitmap will represent every single object that is reachable
from the commit itself. The nth bit in the bitmap is the nth object in
the packfile; if it's set to 1, the object is reachable.
By using the bitmaps available in the index, this commit implements
several new functions:
- `prepare_bitmap_git`
- `prepare_bitmap_walk`
- `traverse_bitmap_commit_list`
- `reuse_partial_packfile_from_bitmap`
The `prepare_bitmap_walk` function tries to build a bitmap of all the
objects that can be reached from the commit roots of a given `rev_info`
struct by using the following algorithm:
- If all the interesting commits for a revision walk are available in
the index, the resulting reachability bitmap is the bitwise OR of all
the individual bitmaps.
- When the full set of WANTs is not available in the index, we perform a
partial revision walk using the commits that don't have bitmaps as
roots, and limiting the revision walk as soon as we reach a commit that
has a corresponding bitmap. The earlier OR'ed bitmap with all the
indexed commits can now be completed as this walk progresses, so the end
result is the full reachability list.
- For revision walks with a HAVEs set (a set of commits that are deemed
uninteresting), first we perform the same method as for the WANTs, but
using our HAVEs as roots, in order to obtain a full reachability bitmap
of all the uninteresting commits. This bitmap then can be used to:
a) limit the subsequent walk when building the WANTs bitmap
b) finding the final set of interesting commits by performing an
AND-NOT of the WANTs and the HAVEs.
If `prepare_bitmap_walk` runs successfully, the resulting bitmap is
stored and the equivalent of a `traverse_commit_list` call can be
performed by using `traverse_bitmap_commit_list`; the bitmap version
of this call yields the objects straight from the packfile index
(without having to look them up or parse them) and hence is several
orders of magnitude faster.
As an extra optimization, when `prepare_bitmap_walk` succeeds, the
`reuse_partial_packfile_from_bitmap` call can be attempted: it will find
the amount of objects at the beginning of the on-disk packfile that can
be reused as-is, and return an offset into the packfile. The source
packfile can then be loaded and the bytes up to `offset` can be written
directly to the result without having to consider the entires inside the
packfile individually.
If the `prepare_bitmap_walk` call fails (e.g. because no bitmap files
are available), the `rev_info` struct is left untouched, and can be used
to perform a manual rev-walk using `traverse_commit_list`.
Hence, this new set of functions are a generic API that allows to
perform the equivalent of
git rev-list --objects [roots...] [^uninteresting...]
for any set of commits, even if they don't have specific bitmaps
generated for them.
In further patches, we'll use this bitmap traversal optimization to
speed up the `pack-objects` and `rev-list` commands.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
EWAH is a word-aligned compressed variant of a bitset (i.e. a data
structure that acts as a 0-indexed boolean array for many entries).
It uses a 64-bit run-length encoding (RLE) compression scheme,
trading some compression for better processing speed.
The goal of this word-aligned implementation is not to achieve
the best compression, but rather to improve query processing time.
As it stands right now, this EWAH implementation will always be more
efficient storage-wise than its uncompressed alternative.
EWAH arrays will be used as the on-disk format to store reachability
bitmaps for all objects in a repository while keeping reasonable sizes,
in the same way that JGit does.
This EWAH implementation is a mostly straightforward port of the
original `javaewah` library that JGit currently uses. The library is
self-contained and has been embedded whole (4 files) inside the `ewah`
folder to ease redistribution.
The library is re-licensed under the GPLv2 with the permission of Daniel
Lemire, the original author. The source code for the C version can
be found on GitHub:
https://github.com/vmg/libewok
The original Java implementation can also be found on GitHub:
https://github.com/lemire/javaewah
[jc: stripped debug-only code per Peff's $gmane/239768]
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hash table that stores the packing list for a given `pack-objects`
run was tightly coupled to the pack-objects code.
In this commit, we refactor the hash table and the underlying storage
array into a `packing_data` struct. The functionality for accessing and
adding entries to the packing list is hence accessible from other parts
of Git besides the `pack-objects` builtin.
This refactoring is a requirement for further patches in this series
that will require accessing the commit packing list from outside of
`pack-objects`.
The hash table implementation has been minimally altered: we now
use table sizes which are always a power of two, to ensure a uniform
index distribution in the array.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite "git repack" in C.
* sb/repack-in-c:
repack: improve warnings about failure of renaming and removing files
repack: retain the return value of pack-objects
repack: rewrite the shell script in C
Sparse issues an "using sizeof on a function" warning for each
call to curl_easy_setopt() which sets an option that takes a
function pointer parameter. (currently 12 such warnings over 4
files.)
The warnings relate to the use of the "typecheck-gcc.h" header
file which adds a layer of type-checking macros to the curl
function invocations (for gcc >= 4.3 and !__cplusplus). As part
of the type-checking layer, 'sizeof' is applied to the function
parameter of curl_easy_setopt(). Note that, in the context of
sizeof, the function to function pointer conversion is not
performed and that sizeof(f) != sizeof(&f).
A simple solution, therefore, would be to replace the function
name in each such call to curl_easy_setopt() with an explicit
function pointer expression (i.e. replace f with &f).
However, the "typecheck-gcc.h" header file is only conditionally
included, in addition to the gcc and C++ checks mentioned above,
depending on the CURL_DISABLE_TYPECHECK preprocessor variable.
In order to suppress the warnings, we use target-specific variable
assignments to add -DCURL_DISABLE_TYPECHECK to SPARSE_FLAGS for
each file affected (http-push.c, http.c, http-walker.c and
remote-curl.c).
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
The motivation of this patch is to get closer to a goal of being
able to have a core subset of git functionality built in to git.
That would mean
* people on Windows could get a copy of at least the core parts
of Git without having to install a Unix-style shell
* people using git in on servers with chrooted environments
do not need to worry about standard tools lacking for shell
scripts.
This patch is meant to be mostly a literal translation of the
git-repack script; the intent is that later patches would start using
more library facilities, but this patch is meant to be as close to a
no-op as possible so it doesn't do that kind of thing.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove now disused remote-helpers framework for helpers written in
Python.
* jk/remove-remote-helpers-in-python:
git_remote_helpers: remove little used Python library
Send a large request to read(2)/write(2) as a smaller but still
reasonably large chunks, which would improve the latency when the
operation needs to be killed and incidentally works around broken
64-bit systems that cannot take a 2GB write or read in one go.
* sp/clip-read-write-to-8mb:
Revert "compat/clipped-write.c: large write(2) fails on Mac OS X/XNU"
xread, xwrite: limit size of IO to 8MB
Allow section.<urlpattern>.var configuration variables to be
treated as a "virtual" section.var given a URL, and use the
mechanism to enhance http.* configuration variables.
This is a reroll of Kyle J. McKay's work.
* jc/url-match:
builtin/config.c: compilation fix
config: "git config --get-urlmatch" parses section.<url>.key
builtin/config: refactor collect_config()
config: parse http.<url>.<variable> using urlmatch
config: add generic callback wrapper to parse section.<url>.key
config: add helper to normalize and match URLs
http.c: fix parsing of http.sslCertPasswordProtected variable
When it was originally added, the git_remote_helpers library was used as
part of the tests of the remote-helper interface, but since commit
fc407f9 (Add new simplified git-remote-testgit, 2012-11-28) a simple
shell script is used for this.
A search on Ohloh [1] indicates that this library isn't used by any
external projects and even the Python remote helpers in contrib/ don't
use this library, so it is only used by its own test suite.
Since this is the only Python library in Git, removing it will make
packaging easier as the Python scripts only need to be installed for one
version of Python, whereas the library should be installed for all
available versions.
[1] http://code.ohloh.net/search?s=%22git_remote_helpers%22
Signed-off-by: John Keeping <john@keeping.me.uk>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* da/darwin:
OS X: Fix redeclaration of die warning
Makefile: Fix APPLE_COMMON_CRYPTO with BLK_SHA1
imap-send: use Apple's Security framework for base64 encoding
This reverts commit 6c642a8786.
The previous commit introduced a size limit on IO chunks on all
platforms. The compat clipped_write() is not needed anymore.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It used to be that APPLE_COMMON_CRYPTO did nothing when BLK_SHA1 was
set. But APPLE_COMMON_CRYPTO is now used for more than just SHA1 (see
3ef2bca) so make sure that the appropriate libraries are always set.
Signed-off-by: Brian Gernhardt <brian@gernhardtsoftware.com>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cygwin port added a "not quite correct but a lot faster and good
enough for many lstat() calls that are only used to see if the
working tree entity matches the index entry" lstat() emulation some
time ago, and it started biting us in places. This removes it and
uses the standard lstat() that comes with Cygwin.
Recent topic that uses lstat on packed-refs file is broken when
this cheating lstat is used, and this is a simplest fix that is
also the cleanest direction to go in the long run.
* rj/cygwin-clarify-use-of-cheating-lstat:
cygwin: Remove the Win32 l/stat() implementation
Use Apple's supported functions for base64 encoding instead
of the deprecated OpenSSL functions.
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new command to allow scripts to query the mailmap information.
* es/check-mailmap:
t4203: test check-mailmap command invocation
builtin: add git-check-mailmap command
Commit adbc0b6b ("cygwin: Use native Win32 API for stat", 30-09-2008)
added a Win32 specific implementation of the stat functions. In order
to handle absolute paths, cygwin mount points and symbolic links, this
implementation may fall back on the standard cygwin l/stat() functions.
Also, the choice of cygwin or Win32 functions is made lazily (by the
first call(s) to l/stat) based on the state of some config variables.
Unfortunately, this "schizophrenic stat" implementation has been the
source of many problems ever since. For example, see commits 7faee6b8,
79748439, 452993c2, 085479e7, b8a97333, 924aaf3e, 05bab3ea and 0117c2f0.
In order to avoid further problems, such as the issue raised by the new
reference handling API, remove the Win32 l/stat() implementation.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce command check-mailmap, similar to check-attr and check-ignore,
which allows direct testing of .mailmap configuration.
As plumbing accessible to scripts and other porcelain, check-mailmap
publishes the stable, well-tested .mailmap functionality employed by
built-in Git commands. Consequently, script authors need not
re-implement .mailmap functionality manually, thus avoiding potential
quirks and behavioral differences.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This script was added in 36e5e70 (Start deprecating "git-command" in
favor of "git command", 2007-06-30) with the intent of aiding the
transition away from dashed forms.
It has already been used to help the transision and served its
purpose, and is no longer very useful for follow-up work, because
the majority of remaining matches it finds are false positives.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log" learned the "--author-date-order" option, with which the
output is topologically sorted and commits in parallel histories
are shown intermixed together based on the author timestamp.
* jc/topo-author-date-sort:
t6003: add --author-date-order test
topology tests: teach a helper to set author dates as well
t6003: add --date-order test
topology tests: teach a helper to take abbreviated timestamps
t/lib-t6000: style fixes
log: --author-date-order
sort-in-topological-order: use prio-queue
prio-queue: priority queue of pointers to structs
toposort: rename "lifo" field
Mac OS X does not like to write(2) more than INT_MAX number of
bytes; work it around by chopping write(2) into smaller pieces.
* fc/macos-x-clipped-write:
compate/clipped-write.c: large write(2) fails on Mac OS X/XNU
Make it possible to call into copy-notes API from the sequencer code.
* jh/libify-note-handling:
Move create_notes_commit() from notes-merge.c into notes-utils.c
Move copy_note_for_rewrite + friends from builtin/notes.c to notes-utils.c
finish_copy_notes_for_rewrite(): Let caller provide commit message
Makefile simplification.
* fc/makefile:
Makefile: use $^ to avoid listing prerequisites on the command line
build: do not install git-remote-testgit
build: generate and clean test scripts
This is a pure code movement of the machinery for copying notes to
rewritten objects. This code was located in builtin/notes.c for
historical reasons. In order to make it available to builtin/commit.c
it was declared in builtin.h. This was more of an accident of history
than a concious design, and we now want to make this machinery more
widely available.
Hence, this patch moves the code into the new notes-utils.[hc] files
which are included into libgit.a. Except for adjusting #includes
accordingly, this patch merely moves the relevant functions verbatim
into the new files.
Cc: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally we used a singly linked list of commits to hold a set
of in-flight commits while traversing history. The most typical use
of the list is to add commits that are newly discovered to it, keep
the list sorted by commit timestamp, pick up the newest one from the
list, and keep digging. The cost of keeping the singly linked list
sorted is nontrivial, and this typical use pattern better matches a
priority queue.
Introduce a prio-queue structure, that can be used either as a LIFO
stack, or a priority queue. This will be used in the next patch to
hold in-flight commits during sort-in-topological-order.
Tests and the idea to make it usable for any "void *" pointers to
"things" are by Jeff King. Bugs are mine.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update build for Cygwin 1.[57]. Torsten Bögershausen reports that
this is fine with Cygwin 1.7 ($gmane/225824) so let's try moving it
ahead.
* rj/mingw-cygwin:
cygwin: Remove the CYGWIN_V15_WIN32API build variable
mingw: rename WIN32 cpp macro to GIT_WINDOWS_NATIVE
Add the helper test-read-cache, which can be used to call read_cache and
discard_cache in a loop as well as a performance check based on it.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to list again the prerequisites.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 416fda6 (build: do not install git-remote-testpy) made it so
git-remote-testpy is not only not installed, but also not generated
by default. From a fresh checkout, "make --test=5800 test" would
have failed.
This was not found primarily because "make clean" failed to remove
git-remote-testpy, which is another bug in the same commit.
Fix the former by having 'all' target depend on $(NO_INSTALL) and
the latter by removing $(NO_INSTALL) in the 'clean' target.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/transport-helper-error-reporting-fix:
git-remote-testgit: build it to run under $SHELL_PATH
git-remote-testgit: further remove some bashisms
git-remote-testgit: avoid process substitution
t5801: "VAR=VAL shell_func args" is forbidden
transport-helper: update remote helper namespace
transport-helper: trivial code shuffle
transport-helper: warn when refspec is not used
transport-helper: clarify pushing without refspecs
transport-helper: update refspec documentation
transport-helper: clarify *:* refspec
transport-helper: improve push messages
transport-helper: mention helper name when it dies
transport-helper: report errors properly
Conflicts:
t/t5801-remote-helpers.sh
This set of patches collects a number of build fixes that have been
used on the msysgit port for a while and merging upstream should
simplify future maintenance.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQCVAwUAUbEmnGB90JXwhOSJAQKRJgP/TdWucLnedP4tRKhRrwy3AnZ2Her4Mn5n
isrNQu3eixT3PsGzdyYUvTYLP8OPNfgYYVEzqyrRtNHKKSD2qLGXt8oyOw63z10n
tiDcHHCfI1U/W7GHK1Q9abaQz/PF6yWnYenRt9lnckyqtxNoa8o+eOCfuY9lBfNJ
ccTP/dRgoL0=
=uWg2
-----END PGP SIGNATURE-----
Merge tag 'post183-for-junio' of http://github.com/msysgit/git
Collected msysgit build patches for upstream
This set of patches collects a number of build fixes that have been
used on the msysgit port for a while and merging upstream should
simplify future maintenance.
* tag 'post183-for-junio' of http://github.com/msysgit/git:
Set the default help format to html for msys builds.
Ensure the resource file is rebuilt when the version changes.
Windows resource: handle dashes in the Git version gracefully
Provide a Windows version resource for the git executables.
msysgit: Add the --large-address-aware linker directive to the makefile.
Define NO_GETTEXT for Git for Windows
Makefile: Do not use OLD_ICONV on MINGW anymore
Update Makefile to use handy automatic variables where appropriate,
and stop installing a script that is only used for testing.
* fc/makefile:
build: do not install git-remote-testpy
build: add NO_INSTALL variable
build: cleanup using $<
build: cleanup using $^
build: trivial simplification
Acked-by: Erik Faye-Lund <kusmabite@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Embeds the git version and description into the git executable thus
implementing the request in issue #5.
Acked-by: Heiko Voigt <hvoigt@hvoigt.net>
Acked-by: Sebastian Schuberth <sschuberth@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
* tr/line-log:
git-log(1): remove --full-line-diff description
line-log: fix documentation formatting
log -L: improve comments in process_all_files()
log -L: store the path instead of a diff_filespec
log -L: test merge of parallel modify/rename
t4211: pass -M to 'git log -M -L...' test
log -L: fix overlapping input ranges
log -L: check range set invariants when we look it up
Speed up log -L... -M
log -L: :pattern:file syntax to find by funcname
Implement line-history search (git log -L)
Export rewrite_parents() for 'log -L'
Refactor parse_loc
Update the test coverage support that was left to bitrot for some
time.
* tr/coverage:
coverage: build coverage-untested-functions by default
coverage: set DEFAULT_TEST_TARGET to avoid using prove
coverage: do not delete .gcno files before building
coverage: split build target into compile and test
Newer MacOS X encourages the programs to compile and link with
their CommonCrypto, not with OpenSSL.
* da/darwin:
imap-send: eliminate HMAC deprecation warnings on Mac OS X
cache.h: eliminate SHA-1 deprecation warnings on Mac OS X
Makefile: add support for Apple CommonCrypto facility
Makefile: fix default regex settings on Darwin
Mac OS X does not like to write(2) more than INT_MAX number of
bytes.
* fc/macos-x-clipped-write:
compate/clipped-write.c: large write(2) fails on Mac OS X/XNU
This makes git use wildmatch by default for all fnmatch() calls. Users
who want to use system fnmatch (or compat fnmatch) need to set
NO_WILDMATCH flag.
wildmatch is a drop-in fnmatch replacement with more features. Using
wildmatch gives us a consistent behavior across platforms. The
tentative plan is make it default with an opt-out for about 2 cycles,
then remove NO_WILDMATCH and compat/fnmatch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When TEST_OUTPUT_DIRECTORY setting is used, it was handled somewhat
inconsistently between the test framework and t/Makefile, and logic
to summarize the results looked at a wrong place.
* jk/test-output:
t/Makefile: don't define TEST_RESULTS_DIRECTORY recursively
test output: respect $TEST_OUTPUT_DIRECTORY
t/Makefile: fix result handling with TEST_OUTPUT_DIRECTORY
Update reading and updating packed-refs file, correcting corner case
bugs.
* mh/packed-refs-various: (33 commits)
refs: handle the main ref_cache specially
refs: change do_for_each_*() functions to take ref_cache arguments
pack_one_ref(): do some cheap tests before a more expensive one
pack_one_ref(): use write_packed_entry() to do the writing
pack_one_ref(): use function peel_entry()
refs: inline function do_not_prune()
pack_refs(): change to use do_for_each_entry()
refs: use same lock_file object for both ref-packing functions
pack_one_ref(): rename "path" parameter to "refname"
pack-refs: merge code from pack-refs.{c,h} into refs.{c,h}
pack-refs: rename handle_one_ref() to pack_one_ref()
refs: extract a function write_packed_entry()
repack_without_ref(): write peeled refs in the rewritten file
t3211: demonstrate loss of peeled refs if a packed ref is deleted
refs: change how packed refs are deleted
search_ref_dir(): return an index rather than a pointer
repack_without_ref(): silence errors for dangling packed refs
t3210: test for spurious error messages for dangling packed refs
refs: change the internal reference-iteration API
refs: extract a function peel_entry()
...
So that we can specify which scripts we do not want to install (they are
for testing).
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>