For a large tree, the index needs to hold many cache entries
allocated on heap. These cache entries are now allocated out of a
dedicated memory pool to amortize malloc(3) overhead.
* jm/cache-entry-from-mem-pool:
block alloc: add validations around cache_entry lifecyle
block alloc: allocate cache entries from mem_pool
mem-pool: fill out functionality
mem-pool: add life cycle management functions
mem-pool: only search head block for available space
block alloc: add lifecycle APIs for cache_entry structs
read-cache: teach make_cache_entry to take object_id
read-cache: teach refresh_cache_entry to take istate
It has been observed that the time spent loading an index with a large
number of entries is partly dominated by malloc() calls. This change
is in preparation for using memory pools to reduce the number of
malloc() calls made to allocate cahce entries when loading an index.
Add an API to allocate and discard cache entries, abstracting the
details of managing the memory backing the cache entries. This commit
does actually change how memory is managed - this will be done in a
later commit in the series.
This change makes the distinction between cache entries that are
associated with an index and cache entries that are not associated with
an index. A main use of cache entries is with an index, and we can
optimize the memory management around this. We still have other cases
where a cache entry is not persisted with an index, and so we need to
handle the "transient" use case as well.
To keep the congnitive overhead of managing the cache entries, there
will only be a single discard function. This means there must be enough
information kept with the cache entry so that we know how to discard
them.
A summary of the main functions in the API is:
make_cache_entry: create cache entry for use in an index. Uses specified
parameters to populate cache_entry fields.
make_empty_cache_entry: Create an empty cache entry for use in an index.
Returns cache entry with empty fields.
make_transient_cache_entry: create cache entry that is not used in an
index. Uses specified parameters to populate
cache_entry fields.
make_empty_transient_cache_entry: create cache entry that is not used in
an index. Returns cache entry with
empty fields.
discard_cache_entry: A single function that knows how to discard a cache
entry regardless of how it was allocated.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of deref_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of set_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_commit_reference
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-alloc:
alloc: allow arbitrary repositories for alloc functions
object: allow create_object to handle arbitrary repositories
object: allow grow_object_hash to handle arbitrary repositories
alloc: add repository argument to alloc_commit_index
alloc: add repository argument to alloc_report
alloc: add repository argument to alloc_object_node
alloc: add repository argument to alloc_tag_node
alloc: add repository argument to alloc_commit_node
alloc: add repository argument to alloc_tree_node
alloc: add repository argument to alloc_blob_node
object: add repository argument to grow_object_hash
object: add repository argument to create_object
repository: introduce parsed objects field
The in-core "commit" object had an all-purpose "void *util" field,
which was tricky to use especially in library-ish part of the
code. All of the existing uses of the field has been migrated to a
more dedicated "commit-slab" mechanism and the field is eliminated.
* nd/commit-util-to-slab:
commit.h: delete 'util' field in struct commit
merge: use commit-slab in merge remote desc instead of commit->util
log: use commit-slab in prepare_bases() instead of commit->util
show-branch: note about its object flags usage
show-branch: use commit-slab for commit-name instead of commit->util
name-rev: use commit-slab for rev-name instead of commit->util
bisect.c: use commit-slab for commit weight instead of commit->util
revision.c: use commit-slab for show_source
sequencer.c: use commit-slab to associate todo items to commits
sequencer.c: use commit-slab to mark seen commits
shallow.c: use commit-slab for commit depth instead of commit->util
describe: use commit-slab for commit names instead of commit->util
blame: use commit-slab for blame suspects instead of commit->util
commit-slab: support shared commit-slab
commit-slab.h: code split
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.
* js/use-bug-macro:
BUG_exit_code: fix sparse "symbol not declared" warning
Convert remaining die*(BUG) messages
Replace all die("BUG: ...") calls by BUG() ones
run-command: use BUG() to report bugs, not die()
test-tool: help verifying BUG() code paths
The codepath around object-info API has been taught to take the
repository object (which in turn tells the API which object store
the objects are to be located).
* sb/oid-object-info:
cache.h: allow oid_object_info to handle arbitrary repositories
packfile: add repository argument to cache_or_unpack_entry
packfile: add repository argument to unpack_entry
packfile: add repository argument to read_object
packfile: add repository argument to packed_object_info
packfile: add repository argument to packed_to_object_type
packfile: add repository argument to retry_bad_packed_offset
cache.h: add repository argument to oid_object_info
cache.h: add repository argument to oid_object_info_extended
The code has been taught to use the duplicated information stored
in the commit-graph file to learn the tree object name for a commit
to avoid opening and parsing the commit object when it makes sense
to do so.
* ds/lazy-load-trees:
coccinelle: avoid wrong transformation suggestions from commit.cocci
commit-graph: lazy-load trees for commits
treewide: replace maybe_tree with accessor methods
commit: create get_commit_tree() method
treewide: rename tree to maybe_tree
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrate all git_path_* functions that are defined in path.c to take a
repository argument. Unlike other patches in this series, do not use the
#define trick, as we rewrite the whole function, which is rather small.
This doesn't migrate all the functions, as other builtins have their own
local path functions defined using GIT_PATH_FUNC. So keep that macro
around to serve the other locations.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have to convert all of the alloc functions at once, because alloc_report
uses a funky macro for reporting. It is better for the sake of mechanical
conversion to convert multiple functions at once rather than changing the
structure of the reporting function.
We record all memory allocation in alloc.c, and free them in
clear_alloc_state, which is called for all repositories except
the_repository.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of oid_object_info
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of making trees load lazily, create a Coccinelle
script (contrib/coccinelle/commit.cocci) to ensure that all
references to the 'maybe_tree' member of struct commit are either
mutations or accesses through get_commit_tree() or
get_commit_tree_oid().
Apply the Coccinelle script to create the rest of the patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the commit-graph file to walk commit history removes the large
cost of parsing commits during the walk. This exposes a performance
issue: lookup_tree() takes a large portion of the computation time,
even when Git never uses those trees.
In anticipation of lazy-loading these trees, rename the 'tree' member
of struct commit to 'maybe_tree'. This serves two purposes: it hints
at the future role of possibly being NULL even if the commit has a
valid tree, and it allows for unambiguous transformation from simple
member access (i.e. commit->maybe_tree) to method access.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert get_tree_entry and find_tree_entry to take pointers to struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names. Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:
@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(&E1, E2)
@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the declaration and definition of pretend_sha1_file to use
struct object_id and adjust all usages of this function. Rename it to
pretend_object_file.
Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A single-word "unsigned flags" in the diff options is being split
into a structure with many bitfields.
* bw/diff-opt-impl-to-bitfields:
diff: make struct diff_flags members lowercase
diff: remove DIFF_OPT_CLR macro
diff: remove DIFF_OPT_SET macro
diff: remove DIFF_OPT_TST macro
diff: remove touched flags
diff: add flag to indicate textconv was set via cmdline
diff: convert flags to be stored in bitfields
add, reset: use DIFF_OPT_SET macro to set a diff flag
Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_SET(&E, fld)
+ E.flags.fld = 1
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_SET(ptr, fld)
+ ptr->flags.fld = 1
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_TST(&E, fld)
+ E.flags.fld
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_TST(ptr, fld)
+ ptr->flags.fld
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert resolve_ref_unsafe to take a pointer to struct object_id by
converting one remaining caller to use struct object_id, removing the
temporary NULL pointer check in expand_ref, converting the declaration
and definition, and applying the following semantic patch:
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3.hash, E4)
+ resolve_ref_unsafe(E1, E2, &E3, E4)
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3->hash, E4)
+ resolve_ref_unsafe(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When attempting to blame a non-existing path, git should show an error
message like this:
$ git blame e83c51633 -- nonexisting-file
fatal: no such path nonexisting-file in e83c51633
Since the recent commit 835c49f7d (blame: rework methods that
determine 'final' commit, 2017-05-24) the revision name is either
missing or some scrambled characters are shown instead. The reason is
that the revision name must be duplicated, because it is invalidated
when the pending objects array is cleared in the meantime, but this
commit dropped the duplication.
Restore the duplication of the revision name in the affected functions
(find_single_final() and find_single_initial()).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Code clean-up.
* bw/ls-files-sans-the-index:
ls-files: factor out tag calculation
ls-files: factor out debug info into a function
ls-files: convert show_files to take an index
ls-files: convert show_ce_entry to take an index
ls-files: convert prune_cache to take an index
ls-files: convert ce_excluded to take an index
ls-files: convert show_ru_info to take an index
ls-files: convert show_other_files to take an index
ls-files: convert show_killed_files to take an index
ls-files: convert write_eolinfo to take an index
ls-files: convert overlay_tree_on_cache to take an index
tree: convert read_tree to take an index parameter
convert: convert renormalize_buffer to take an index
convert: convert convert_to_git to take an index
convert: convert convert_to_git_filter_fd to take an index
convert: convert crlf_to_git to take an index
convert: convert get_cached_convert_stats_ascii to take an index
A follow-up to the existing "expression" rule added in an earlier
change. This manually excludes a few occurrences, mostly things that
resulted in many FREE_AND_NULL() on one line, that'll be manually
fixed in a subsequent change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internal logic used in "git blame" has been libified to make it
easier to use by cgit.
* js/blame-lib: (29 commits)
blame: move entry prepend to libgit
blame: move scoreboard setup to libgit
blame: move scoreboard-related methods to libgit
blame: move fake-commit-related methods to libgit
blame: move origin-related methods to libgit
blame: move core structures to header
blame: create entry prepend function
blame: create scoreboard setup function
blame: create scoreboard init function
blame: rework methods that determine 'final' commit
blame: wrap blame_sort and compare_blame_final
blame: move progress updates to a scoreboard callback
blame: make sanity_check use a callback in scoreboard
blame: move no_whole_file_rename flag to scoreboard
blame: move xdl_opts flags to scoreboard
blame: move show_root flag to scoreboard
blame: move reverse flag to scoreboard
blame: move contents_from to scoreboard
blame: move copy/move thresholds to scoreboard
blame: move stat counters to scoreboard
...
Just make it take over blame's place. Documentation and command
have all stopped mentioning "git-pickaxe". The built-in synonym
is left in the command table, so you can still say "git pickaxe",
but it probably is a good idea to retire it as well.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
revision traversal: --unpacked does not limit commit list anymore.
Continue traversal when rev-list --unpacked finds a packed commit.
Use memmove instead of memcpy for overlapping areas
quote.c: ensure the same quoting across platforms.
Surround "#define DEBUG 0" with "#ifndef DEBUG..#endif"