We provide oid_array_for_each_unique() for iterating over the
de-duplicated items in an array. But it's awkward to use for two
reasons:
1. It uses a callback, which means marshaling arguments into a struct
and passing it to the callback with a void parameter.
2. The callback doesn't know the numeric index of the oid we're
looking at. This is useful for things like progress meters.
Iterating with a for-loop is much more natural for some cases, but the
caller has to do the de-duping itself. However, we can provide a small
helper to make this easier (see the docstring in the header for an
example use).
The caller does have to remember to sort the array first. We could add
an assertion into the helper that array->sorted is set, but I didn't
want to complicate what is otherwise a pretty fast code path.
I also considered adding a full iterator type with init/next/end
functions (similar to what we have for hashmaps). But it ended up making
the callers much harder to read. This version keeps us close to a basic
for-loop.
Yet another option would be adding an option to sort the array and
compact out the duplicates. This would mean iterating over the array an
extra time, though that's probably not a big deal (we did just do an
O(n log n) sort). But we'd still have to write a for-loop to iterate, so
it doesn't really make anything easier for the caller.
No new test, since we'll convert the callback iterator (which is covered
by t0064, among other callers) to use the new code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We sort the oid-array as a side effect of calling the lookup or
unique-iteration functions. But callers may want to sort it themselves
(especially as we add new iteration options in future patches).
We'll also move the check of the "sorted" flag into the sort function,
so callers don't have to remember to check it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>