Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are actually used as config callbacks, so use the typedef-ed type
and make future refactors easier.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug a memory leak in credential_apply_config() by adding and using a
new urlmatch_config_release() function. This just does a
string_list_clear() on the "vars" member.
This finished up work on normalizing the init/free pattern in this
API, started in 73ee449bbf (urlmatch.[ch]: add and use
URLMATCH_CONFIG_INIT, 2021-10-01).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the initialization pattern of "struct urlmatch_config" to use
an *_INIT macro and designated initializers. Right now there's no
other "struct" member of "struct urlmatch_config" which would require
its own *_INIT, but it's good practice not to assume that. Let's also
change this to a designated initializer while we're at it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the patches for CVE-2020-11008, the ability to specify credential
settings in the config for partial URLs got lost. For example, it used
to be possible to specify a credential helper for a specific protocol:
[credential "https://"]
helper = my-https-helper
Likewise, it used to be possible to configure settings for a specific
host, e.g.:
[credential "dev.azure.com"]
useHTTPPath = true
Let's reinstate this behavior.
While at it, increase the test coverage to document and verify the
behavior with a couple other categories of partial URLs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, a user will want to use a specific credential helper for
a wildcard pattern, such as https://*.corp.example.com. We have code
that handles this already with the urlmatch code, so let's use that
instead of our custom code.
Since the urlmatch code is a superset of our current matching in terms
of capabilities, there shouldn't be any cases of things that matched
previously that don't match now. However, in addition to wildcard
matching, we now use partial path matching, which can cause slightly
different behavior in the case that a helper applies to the prefix
(considering path components) of the remote URL. While different, this
is probably the behavior people were wanting anyway.
Since we're using the urlmatch code, we need to encode the components
we've gotten into a URL to match, so add a function to percent-encode
data and format the URL with it. We now also no longer need to the
custom code to match URLs, so let's remove it.
Additionally, the urlmatch code always looks for the best match, whereas
we want all matches for credential helpers to preserve existing
behavior. Let's add an optional field, select_fn, that lets us control
which items we want (in this case, all of them) and default it to the
best-match code that already exists for other users.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.
This was the Coccinelle patch used:
@@
type T;
identifier f;
@@
- extern
T f(...);
and it was run with:
$ git ls-files \*.{c,h} |
grep -v ^compat/ |
xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to be able to rank positive matches by `urlmatch`, we inspect
the path length and user part to decide whether a match is better than
another match. As all other parts are matched exactly between both URLs,
this is the correct thing to do right now.
In the future, though, we want to introduce wild cards for the domain
part. When doing this, it does not make sense anymore to only compare
the path lengths. Instead, we also want to compare the domain lengths to
determine which of both URLs matches the host part more closely.
Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `url_info` structure contains information about a normalized URL
with the URL's components being represented by different fields. The
host and port part though are to be accessed by the same `host` field,
so that getting the host and/or port separately becomes more involved
than really necessary.
To make the port more readily accessible, split up the host and port
fields. Namely, the `host_len` will not include the port length anymore
and a new `port_off` field has been added which includes the offset to
the port, if available.
The only user of these fields is `url_normalize_1`. This change makes it
easier later on to treat host and port differently when introducing
globs for domains.
Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Existing configuration parsing functions (e.g. http_options() in
http.c) know how to parse two-level configuration variable names.
We would like to exploit them and parse something like this:
[http]
sslVerify = true
[http "https://weak.example.com"]
sslVerify = false
and pretend as if http.sslVerify were set to false when talking to
"https://weak.example.com/path".
Introduce `urlmatch_config_entry()` wrapper that:
- is called with the target URL (e.g. "https://weak.example.com/path"),
and the two-level variable parser (e.g. `http_options`);
- uses `url_normalize()` and `match_urls()` to see if configuration
data matches the target URL; and
- calls the traditional two-level configuration variable parser
only for the configuration data whose <url> part matches the
target URL (and if there are multiple matches, only do so if the
current match is a better match than the ones previously seen).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some http.* configuration variables need to take values customized
for the URL we are talking to. We may want to set http.sslVerify to
true in general but to false only for a certain site, for example,
with a configuration file like this:
[http]
sslVerify = true
[http "https://weak.example.com"]
sslVerify = false
and let the configuration machinery pick up the latter only when
talking to "https://weak.example.com". The latter needs to kick in
not only when the URL is exactly "https://weak.example.com", but
also is anything that "match" it, e.g.
https://weak.example.com/testhttps://me@weak.example.com/test
The <url> in the configuration key consists of the following parts,
and is considered a match to the URL we are attempting to access
under certain conditions:
. Scheme (e.g., `https` in `https://example.com/`). This field
must match exactly between the config key and the URL.
. Host/domain name (e.g., `example.com` in `https://example.com/`).
This field must match exactly between the config key and the URL.
. Port number (e.g., `8080` in `http://example.com:8080/`). This
field must match exactly between the config key and the URL.
Omitted port numbers are automatically converted to the correct
default for the scheme before matching.
. Path (e.g., `repo.git` in `https://example.com/repo.git`). The
path field of the config key must match the path field of the
URL either exactly or as a prefix of slash-delimited path
elements. A config key with path `foo/` matches URL path
`foo/bar`. A prefix can only match on a slash (`/`) boundary.
Longer matches take precedence (so a config key with path
`foo/bar` is a better match to URL path `foo/bar` than a config
key with just path `foo/`).
. User name (e.g., `me` in `https://me@example.com/repo.git`). If
the config key has a user name, it must match the user name in
the URL exactly. If the config key does not have a user name,
that config key will match a URL with any user name (including
none), but at a lower precedence than a config key with a user
name.
Longer matches take precedence over shorter matches.
This step adds two helper functions `url_normalize()` and
`match_urls()` to help implement the above semantics. The
normalization rules are based on RFC 3986 and should result in any
two equivalent urls being a match.
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>