When debugging bitmap generation performance, it is useful to know how
many bitmaps were generated from scratch, and how many were the result
of permuting the bit-order of an existing bitmap.
Keep track of the latter, and emit the count as a trace2_data line to
aid in debugging.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging MIDX and MIDX-bitmap related issues, it is useful to
figure out where Git is spending its time.
GitHub has been using the below trace2 regions to instrument various
components of generating a MIDX itself, as well time spent preparing to
build a MIDX bitmap.
These are limited to instrumenting the following functions:
- midx.c::find_commits_for_midx_bitmap()
- midx.c::midx_pack_order()
- midx.c::prepare_midx_packing_data()
- midx.c::write_midx_bitmap()
- midx.c::write_midx_internal()
- midx.c::write_midx_reverse_index()
to start and end with a trace2_region_enter() and trace2_region_leave(),
respectively.
The category for all of these is "midx", which matches the existing
convention. The region description matches the name of the function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When generating a multi-pack bitmap without a `--refs-snapshot` (e.g.,
by running `git multi-pack-index write --bitmap` directly), we determine
the set of bitmap-able commits by enumerating each reference, and adding
the referrent as the tip of a reachability traversal when it appears
somewhere in the MIDX. (Any commit we encounter during the reachability
traversal then becomes a candidate for bitmap selection).
But we incorrectly avoid peeling the object at the tip of each
reference. So if we see some reference that points at an annotated tag
(which in turn points through zero or more additional annotated tags at
a commit), that we will not add it as a tip for the reachability
traversal. This means that if some commit C is only referenced through
one or more annotated tag(s), then C won't become a bitmap candidate.
Correct this by peeling the reference tips as we enumerate them to
ensure that we consider commits which are the targets of annotated tags,
in addition to commits which are referenced directly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was unintentionally introduced via 893b563505 (midx: inline
nth_midxed_pack_entry(), 2021-09-11) where "struct repository *r"
became "struct repository * r".
The latter does not adhere to our usual style conventions, so fix that
up to look more like our usual declarations.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have small updates since -rc1 but none of them is about a new
thing and there is no updates to the release notes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The trace2 region around the call to lazy_bitmap_for_commit() in
bitmap_for_commit() was added in 28cd730680 (pack-bitmap: prepare to
read lookup table extension, 2022-08-14). While adding trace2 regions is
typically helpful for tracking performance, this method is called
possibly thousands of times as a commit walk explores commit history
looking for a matching bitmap. When trace2 output is enabled, this
region is emitted many times and performance is throttled by that
output.
For now, remove these regions entirely.
This is a critical path, and it would be valuable to measure that the
time spent in bitmap_for_commit() does not increase when using the
commit lookup table. The best way to do that would be to use a mechanism
that sums the time spent in a region and reports a single value at the
end of the process. This technique was introduced but not merged by [1]
so maybe this example presents some justification to revisit that
approach.
[1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/
To help with the 'git blame' output in this region, add a comment that
warns against adding a trace2 region. Delete a test from t5310 that used
that trace output to check that this lookup optimization was activated.
To create this kind of test again in the future, the stopwatch traces
mentioned earlier could be used as a signal that we activated this code
path.
Helpedy-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The text of this message was changed in commit
71076d0edd to avoid making any
suggestion about which strategy is better for the situation at hand.
Update the Franch translation to match.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
We attribute each documentation text file to a man section by finding a
line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
("scalar: add to 'git help -a' command list", 2022-09-02) updated this
logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
it forgot to account for the fact that after the updated regex has found
a match, the man section is no longer to be found in `$1` but now lives
in `$2`.
This makes our git(1) manpage look as follows:
Main porcelain commands
git-add(git)
Add file contents to the index.
[...]
gitk(git)
The Git repository browser.
scalar(scalar)
A tool for managing large Git repositories.
Restore the man sections by not capturing the (git|scalar) part of the
match into `$1`.
As noted by Ævar [1], we could even match any "foo" rather than just
"gitfoo" and "scalarfoo", but that's a larger change. For now, just fix
the regression in cc75e556a9.
[1] https://lore.kernel.org/git/220923.86wn9u4joo.gmgdl@evledraar.gmail.com/#t
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'main' of github.com:git/git:
list-objects-filter: initialize sub-filter structs
Git 2.38-rc1
Final batch before -rc1
builtin/diagnose.c: don't translate the two mode values
t/Makefile: remove 'test-results' on 'make clean'
gc: don't translate literal commands
Documentation: clean up various typos in technical docs
Documentation: clean up a few misspelled word typos
version: fix builtin linking & documentation
diagnose: add to command-list.txt
Documentation: add ReviewingGuidelines
commit-graph: Fix missing closedir in expire_commit_graphs
diagnose.c: refactor to safely use 'd_type'
help: fix doubled words in explanation for developer interfaces
api docs: link to html version of api-trace2
docs: fix a few recently broken links
reftable: use a pointer for pq_entry param
Fix uninitialized memory access in a recent fix-up that is already
in -rc1.
* jk/list-objects-filter-cleanup:
list-objects-filter: initialize sub-filter structs
Since commit c54980ab83 (list-objects-filter: convert filter_spec to a
strbuf, 2022-09-11), building with SANITIZE=undefined triggers an error
in t5616.
The problem is that we end up with a strbuf that has been
zero-initialized instead of via STRBUF_INIT. Feeding that strbuf to
strbuf_addbuf() in list_objects_filter_copy() means we will call memcpy
like:
memcpy(some_actual_buffer, NULL, 0);
This works on most systems because we're copying zero bytes, but it is
technically undefined behavior to ever pass NULL to memcpy.
Even though c54980ab83 is where the bug manifests, that is only because
we switched away from a string_list, which is OK with being
zero-initialized (though it may cause other problems by not duplicating
the strings, it happened to be OK in this instance).
The actual bug is caused by the commit before that, 2a01bdedf8
(list-objects-filter: add and use initializers, 2022-09-11). There we
consistently initialize the top-level filter structs, but we forgot the
dynamically allocated ones we stick in filter_options->sub when creating
combined filters.
Note that we need to fix two spots here: where we parse a "combine:"
filter, but also where we transform from a single-filter into a combined
one after seeing multiple "--filter" options. In the second spot, we'll
do some minor refactoring to avoid repeating our very-long array index.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>