Граф коммитов

119 Коммитов

Автор SHA1 Сообщение Дата
Junio C Hamano 5938454cbc Merge branch 'dt/xgethostname-nul-termination'
gethostname(2) may not NUL terminate the buffer if hostname does
not fit; unfortunately there is no easy way to see if our buffer
was too small, but at least this will make sure we will not end up
using garbage past the end of the buffer.

* dt/xgethostname-nul-termination:
  xgethostname: handle long hostnames
  use HOST_NAME_MAX to size buffers for gethostname(2)
2017-04-23 22:07:57 -07:00
David Turner 5781a9a270 xgethostname: handle long hostnames
If the full hostname doesn't fit in the buffer supplied to
gethostname, POSIX does not specify whether the buffer will be
null-terminated, so to be safe, we should do it ourselves.  Introduce
new function, xgethostname, which ensures that there is always a \0
at the end of the buffer.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18 19:58:04 -07:00
René Scharfe da25bdb776 use HOST_NAME_MAX to size buffers for gethostname(2)
POSIX limits the length of host names to HOST_NAME_MAX.  Export the
fallback definition from daemon.c and use this constant to make all
buffers used with gethostname(2) big enough for any possible result
and a terminating NUL.

Inspired-by: David Turner <dturner@twosigma.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: David Turner <dturner@twosigma.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18 19:57:41 -07:00
Jeff King 94425552f3 ident: do not ignore empty config name/email
When we read user.name and user.email from a config file,
they go into strbufs. When a caller asks ident_default_name()
for the value, we fallback to auto-detecting if the strbuf
is empty.

That means that explicitly setting an empty string in the
config is identical to not setting it at all. This is
potentially confusing, as we usually accept a configured
value as the final value.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23 12:58:47 -08:00
Jeff King 13b9a24e58 ident: reject all-crud ident name
An ident name consisting of only "crud" characters (like
whitespace or punctuation) is effectively the same as an
empty one, because our strbuf_addstr_without_crud() will
remove those characters.

We reject an empty name when formatting a strict ident, but
don't notice an all-crud one because our check happens
before the crud-removal step.

We could skip past the crud before checking for an empty
name, but let's make it a separate code path, for two
reasons. One is that we can give a more specific error
message. And two is that unlike a blank name, we probably
don't want to kick in the fallback-to-username behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23 12:47:02 -08:00
Jeff King 862e80a413 ident: handle NULL email when complaining of empty name
If we see an empty name, we complain about and mention the
matching email in the error message (to give it some
context). However, the "email" pointer may be NULL here if
we were planning to fill it in later from ident_default_email().

This was broken by 59f929596 (fmt_ident: refactor strictness
checks, 2016-02-04). Prior to that commit, we would look up
the default name and email before doing any other actions.
So one solution would be to go back to that.

However, we can't just do so blindly. The logic for handling
the "!email" condition has grown since then. In particular,
looking up the default email can die if getpwuid() fails,
but there are other errors that should take precedence.
Commit 734c7789a (ident: check for useConfigOnly before
auto-detection of name/email, 2016-03-30) reordered the
checks so that we prefer the error message for
useConfigOnly.

Instead, we can observe that while the name-handling depends
on "email" being set, the reverse is not true. So we can
simply set up the email variable first.

This does mean that if both are bogus, we'll complain about
the email before the name. But between the two, there is no
reason to prefer one over the other.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23 12:46:00 -08:00
Jeff King afb6c30b27 ident: mark error messages for translation
We already translate the big "please tell me who you are"
hint, but missed the individual error messages that go with
it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23 12:44:20 -08:00
Junio C Hamano 11738ddf48 Merge branch 'jk/ident-ai-canonname-could-be-null' into maint
In the codepath that comes up with the hostname to be used in an
e-mail when the user didn't tell us, we looked at ai_canonname
field in struct addrinfo without making sure it is not NULL first.

* jk/ident-ai-canonname-could-be-null:
  ident: handle NULL ai_canonname
2016-10-03 13:22:32 -07:00
Junio C Hamano fff948fe0e Merge branch 'jk/ident-ai-canonname-could-be-null'
In the codepath that comes up with the hostname to be used in an
e-mail when the user didn't tell us, we looked at ai_canonname
field in struct addrinfo without making sure it is not NULL first.

* jk/ident-ai-canonname-could-be-null:
  ident: handle NULL ai_canonname
2016-09-29 16:57:14 -07:00
Jeff King c375a7efa3 ident: handle NULL ai_canonname
We call getaddrinfo() to try to convert a short hostname
into a fully-qualified one (to use it as an email domain).
If there isn't a canonical name, getaddrinfo() will
generally return either a NULL addrinfo list, or one in
which ai->ai_canonname is a copy of the original name.

However, if the result of gethostname() looks like an IP
address, then getaddrinfo() behaves differently on some
systems. On OS X, it will return a "struct addrinfo" with a
NULL ai_canonname, and we segfault feeding it to strchr().

This is hard to test reliably because it involves not only a
system where we we have to fallback to gethostname() to come
up with an ident, but also where the hostname is a number
with no dots. But I was able to replicate the bug by faking
a hostname, like:

    diff --git a/ident.c b/ident.c
    index e20a772..b790d28 100644
    --- a/ident.c
    +++ b/ident.c
    @@ -128,6 +128,7 @@ static void add_domainname(struct strbuf *out, int *is_bogus)
                     *is_bogus = 1;
                     return;
             }
    +        xsnprintf(buf, sizeof(buf), "1");
             if (strchr(buf, '.'))
                     strbuf_addstr(out, buf);
             else if (canonical_name(buf, out) < 0) {

and running "git var GIT_AUTHOR_IDENT" on an OS X system.

Before this patch it segfaults, and after we correctly
complain of the bogus "user@1.(none)" address (though this
bogus address would be suitable for non-object uses like
writing reflogs).

Reported-by: Jonas Thiel <jonas.lierschied@gmx.de>
Diagnosed-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-23 10:01:15 -07:00
Vasco Almeida 166e55e328 i18n: ident: mark hint for translation
Mark env_hint for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 10:20:43 -07:00
Junio C Hamano f4fd627661 Merge branch 'jk/reset-ident-time-per-commit' into maint
Not-so-recent rewrite of "git am" that started making internal
calls into the commit machinery had an unintended regression, in
that no matter how many seconds it took to apply many patches, the
resulting committer timestamp for the resulting commits were all
the same.

* jk/reset-ident-time-per-commit:
  am: reset cached ident date for each patch
2016-08-12 09:16:56 -07:00
Junio C Hamano 24fbe00490 Merge branch 'jk/reset-ident-time-per-commit'
Not-so-recent rewrite of "git am" that started making internal
calls into the commit machinery had an unintended regression, in
that no matter how many seconds it took to apply many patches, the
resulting committer timestamp for the resulting commits were all
the same.

* jk/reset-ident-time-per-commit:
  am: reset cached ident date for each patch
2016-08-10 12:33:17 -07:00
Jeff King 4d9c7e6f45 am: reset cached ident date for each patch
When we compute the date to go in author/committer lines of
commits, or tagger lines of tags, we get the current date
once and then cache it for the rest of the program.  This is
a good thing in some cases, like "git commit", because it
means we do not racily assign different times to the
author/committer fields of a single commit object.

But as more programs start to make many commits in a single
process (e.g., the recently builtin "git am"), it means that
you'll get long strings of commits with identical committer
timestamps (whereas before, we invoked "git commit" many
times and got true timestamps).

This patch addresses it by letting callers reset the cached
time, which means they'll get a fresh time on their next
call to git_committer_info() or git_author_info(). The first
caller to do so is "git am", which resets the time for each
patch it applies.

It would be nice if we could just do this automatically
before filling in the ident fields of commit and tag
objects. Unfortunately, it's hard to know where a particular
logical operation begins and ends.

For instance, if commit_tree_extended() were to call
reset_ident_date() before getting the committer/author
ident, that doesn't quite work; sometimes the author info is
passed in to us as a parameter, and it may or may not have
come from a previous call to ident_default_date(). So in
those cases, we lose the property that the committer and the
author timestamp always match.

You could similarly put a date-reset at the end of
commit_tree_extended(). That actually works in the current
code base, but it's fragile. It makes the assumption that
after commit_tree_extended() finishes, the caller has no
other operations that would logically want to fall into the
same timestamp.

So instead we provide the tool to easily do the reset, and
let the high-level callers use it to annotate their own
logical operations.

There's no automated test, because it would be inherently
racy (it depends on whether the program takes multiple
seconds to run). But you can see the effect with something
like:

  # make a fake 100-patch series
  top=$(git rev-parse HEAD)
  bottom=$(git rev-list --first-parent -100 HEAD | tail -n 1)
  git log --format=email --reverse --first-parent \
          --binary -m -p $bottom..$top >patch

  # now apply it; this presumably takes multiple seconds
  git checkout --detach $bottom
  git am <patch

  # now count the number of distinct committer times;
  # prior to this patch, there would only be one, but
  # now we'd typically see several.
  git log --format=%ct $bottom.. | sort -u

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Helped-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01 14:49:41 -07:00
Junio C Hamano e9ef83a299 Merge branch 'da/user-useconfigonly' into HEAD
The "user.useConfigOnly" configuration variable makes it an error
if users do not explicitly set user.name and user.email.  However,
its check was not done early enough and allowed another error to
trigger, reporting that the default value we guessed from the
system setting was unusable.  This was a suboptimal end-user
experience as we want the users to set user.name/user.email without
relying on the auto-detection at all.

* da/user-useconfigonly:
  ident: give "please tell me" message upon useConfigOnly error
  ident: check for useConfigOnly before auto-detection of name/email
2016-05-18 14:40:05 -07:00
Junio C Hamano 40cfc95856 Merge branch 'nd/error-errno'
The code for warning_errno/die_errno has been refactored and a new
error_errno() reporting helper is introduced.

* nd/error-errno: (41 commits)
  wrapper.c: use warning_errno()
  vcs-svn: use error_errno()
  upload-pack.c: use error_errno()
  unpack-trees.c: use error_errno()
  transport-helper.c: use error_errno()
  sha1_file.c: use {error,die,warning}_errno()
  server-info.c: use error_errno()
  sequencer.c: use error_errno()
  run-command.c: use error_errno()
  rerere.c: use error_errno() and warning_errno()
  reachable.c: use error_errno()
  mailmap.c: use error_errno()
  ident.c: use warning_errno()
  http.c: use error_errno() and warning_errno()
  grep.c: use error_errno()
  gpg-interface.c: use error_errno()
  fast-import.c: use error_errno()
  entry.c: use error_errno()
  editor.c: use error_errno()
  diff-no-index.c: use error_errno()
  ...
2016-05-17 14:38:28 -07:00
Nguyễn Thái Ngọc Duy a26f4ed682 ident.c: use warning_errno()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
Junio C Hamano e7e6826514 Merge branch 'da/user-useconfigonly'
The "user.useConfigOnly" configuration variable makes it an error
if users do not explicitly set user.name and user.email.  However,
its check was not done early enough and allowed another error to
trigger, reporting that the default value we guessed from the
system setting was unusable.  This was a suboptimal end-user
experience as we want the users to set user.name/user.email without
relying on the auto-detection at all.

* da/user-useconfigonly:
  ident: give "please tell me" message upon useConfigOnly error
  ident: check for useConfigOnly before auto-detection of name/email
2016-04-29 12:59:06 -07:00
Marios Titas d3c06c1969 ident: give "please tell me" message upon useConfigOnly error
The env_hint message applies perfectly to the case when
user.useConfigOnly is set and at least one of the user.name and the
user.email are not provided.

Additionally, use a less descriptive error message to discourage
users from disabling user.useConfigOnly configuration variable to
work around this error condition.  We want to encourage them to set
user.name or user.email instead.

Signed-off-by: Marios Titas <redneb@gmx.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-01 15:01:20 -07:00
Marios Titas 734c7789aa ident: check for useConfigOnly before auto-detection of name/email
If user.useConfigOnly is set, it does not make sense to try to
auto-detect the name and/or the email.  The auto-detection may
even result in a bogus name and trigger an error message.

Check if the use-config-only is set and die if no explicit name was
given, before attempting to auto-detect, to correct this.

Signed-off-by: Marios Titas <redneb@gmx.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-01 14:57:55 -07:00
Junio C Hamano c37f9a1bc3 Merge branch 'da/user-useconfigonly'
The "user.useConfigOnly" configuration variable can be used to
force the user to always set user.email & user.name configuration
variables, serving as a reminder for those who work on multiple
projects and do not want to put these in their $HOME/.gitconfig.

* da/user-useconfigonly:
  ident: add user.useConfigOnly boolean for when ident shouldn't be guessed
  fmt_ident: refactor strictness checks
2016-02-17 10:13:31 -08:00
Dan Aloni 4d5c295696 ident: add user.useConfigOnly boolean for when ident shouldn't be guessed
It used to be that:

   git config --global user.email "(none)"

was a viable way for people to force themselves to set user.email in
each repository.  This was helpful for people with more than one
email address, targeting different email addresses for different
clones, as it barred git from creating a commit unless the user.email
config was set in the per-repo config to the correct email address.

A recent change, 19ce497c (ident: keep a flag for bogus
default_email, 2015-12-10), however, declared that an explicitly
configured user.email is not bogus, no matter what its value is, so
this hack no longer works.

Provide the same functionality by adding a new configuration
variable user.useConfigOnly; when this variable is set, the
user must explicitly set user.email configuration.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Dan Aloni <alonid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-08 11:06:28 -08:00
Jeff King 59f929596b fmt_ident: refactor strictness checks
This function has evolved quite a bit over time, and as a
result, the logic for "is this an OK ident" has been
sprinkled throughout. This ends up with a lot of redundant
conditionals, like checking want_name repeatedly. Worse,
we want to know in many cases whether we are using the
"default" ident, and we do so by comparing directly to the
global strbuf, which violates the abstraction of the
ident_default_* functions.

Let's reorganize the function into a hierarchy of
conditionals to handle similar cases together. The only
case that doesn't just work naturally for this is that of an
empty name, where our advice is different based on whether
we came from ident_default_name() or not. We can use a
simple flag to cover this case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-04 13:18:48 -08:00
Junio C Hamano 1f3b1efd18 ident.c: read /etc/mailname with strbuf_getline()
Just in case /etc/mailname file was edited with a DOS editor,
read it with strbuf_getline() so that a stray CR is not included
as the last character of the mail hostname.

We _might_ want to more aggressively discard whitespace characters
around the line with strbuf_trim(), but that is a bit outside the
scope of this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:34:41 -08:00
Junio C Hamano 8f309aeb82 strbuf: introduce strbuf_getline_{lf,nul}()
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago.  No useful caller that uses other value has emerged.

By now, it is clear that the interface is overly broad without a
good reason.  Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.

This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them.  The changes contained in this patch are:

 * introduction of these two functions in strbuf.[ch]

 * mechanical conversion of all callers to strbuf_getline() with
   either '\n' or '\0' as the third parameter to instead call the
   respective thin wrapper.

After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller.  An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:12:51 -08:00
Junio C Hamano 5498c57cdd Merge branch 'jk/ident-loosen-getpwuid'
When getpwuid() on the system returned NULL (e.g. the user is not
in the /etc/passwd file or other uid-to-name mappings), the
codepath to find who the user is to record it in the reflog barfed
and died.  Loosen the check in this codepath, which already accepts
questionable ident string (e.g. host part of the e-mail address is
obviously bogus), and in general when we operate fmt_ident() function
in non-strict mode.

* jk/ident-loosen-getpwuid:
  ident: loosen getpwuid error in non-strict mode
  ident: keep a flag for bogus default_email
  ident: make xgetpwuid_self() a static local helper
2015-12-21 10:59:07 -08:00
Jeff King 58d29ececf ident: fix undefined variable when NO_IPV6 is set
Commit 00bce77 (ident.c: add support for IPv6, 2015-11-27)
moved the "gethostbyname" call out of "add_domainname" and
into the helper function "canonical_name". But when moving
the code, it forgot that the "buf" variable is passed as
"host" in the helper.

Reported-by: johan defries <johandefries@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-14 13:06:00 -08:00
Jeff King 92bcbb9b33 ident: loosen getpwuid error in non-strict mode
If the user has not specified an identity and we have to
turn to getpwuid() to find the username or gecos field, we
die immediately when getpwuid fails (e.g., because the user
does not exist). This is OK for making a commit, where we
have set IDENT_STRICT and would want to bail on bogus input.

But for something like a reflog, where the ident is "best
effort", it can be pain. For instance, even running "git
clone" with a UID that is not in /etc/passwd will result in
git barfing, just because we can't find an ident to put in
the reflog.

Instead of dying in xgetpwuid_self, we can instead return a
fallback value, and set a "bogus" flag. For the username in
an email, we already have a "default_email_is_bogus" flag.
For the name field, we introduce (and check) a matching
"default_name_is_bogus" flag. As a bonus, this means you now
get the usual "tell me who you are" advice instead of just a
"no such user" error.

No tests, as this is dependent on configuration outside of
git's control. However, I did confirm that it behaves
sensibly when I delete myself from the local /etc/passwd
(reflogs get written, and commits complain).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-14 11:44:38 -08:00
Jeff King 19ce497cf5 ident: keep a flag for bogus default_email
If we have to deduce the user's email address and can't come
up with something plausible for the hostname, we simply
write "(none)" or ".(none)" in the hostname.

Later, our strict-check is forced to use strstr to look for
this magic string. This is probably not a problem in
practice, but it's rather ugly. Let's keep an extra flag
that tells us the email is bogus, and check that instead.

We could get away with simply setting the global in
add_domainname(); it only gets called to write into
git_default_email. However, let's make the code a little
more obvious to future readers by actually passing a pointer
to our "bogus" flag down the call-chain.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-10 15:39:25 -08:00
Jeff King e850194c83 ident: make xgetpwuid_self() a static local helper
This function is defined in wrapper.c, but nobody besides
ident.c uses it. And nobody is likely to in the future,
either, as anything that cares about the user's name should
be going through the ident code.

Moving it here is a cleanup of the global namespace, but it
will also enable further cleanups inside ident.c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-10 15:38:59 -08:00
Elia Pinto 00bce77fe5 ident.c: add support for IPv6
Add IPv6 support by implementing name resolution with the
protocol agnostic getaddrinfo(3) API. The old gethostbyname(3)
code is still available when git is compiled with NO_IPV6.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-28 12:20:42 -05:00
Junio C Hamano 9ff700ebac Merge branch 'jk/commit-author-parsing'
Code clean-up.

* jk/commit-author-parsing:
  determine_author_info(): copy getenv output
  determine_author_info(): reuse parsing functions
  date: use strbufs in date-formatting functions
  record_author_date(): use find_commit_header()
  record_author_date(): fix memory leak on malformed commit
  commit: provide a function to find a header in a buffer
2014-09-19 11:38:33 -07:00
Jeff King c33ddc2e33 date: use strbufs in date-formatting functions
Many of the date functions write into fixed-size buffers.
This is a minor pain, as we have to take special
precautions, and frequently end up copying the result into a
strbuf or heap-allocated buffer anyway (for which we
sometimes use strcpy!).

Let's instead teach parse_date, datestamp, etc to write to a
strbuf. The obvious downside is that we might need to
perform a heap allocation where we otherwise would not need
to. However, it turns out that the only two new allocations
required are:

  1. In test-date.c, where we don't care about efficiency.

  2. In determine_author_info, which is not performance
     critical (and where the use of a strbuf will help later
     refactoring).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-27 10:32:56 -07:00
Matthieu Moy 9830534e40 config --global --edit: create a template file if needed
When the user has no ~/.gitconfig file, git config --global --edit used
to launch an editor on an nonexistant file name.

Instead, create a file with a default content before launching the
editor. The template contains only commented-out entries, to save a few
keystrokes for the user. If the values are guessed properly, the user
will only have to uncomment the entries.

Advanced users teaching newbies can create a minimalistic configuration
faster for newbies. Beginners reading a tutorial advising to run "git
config --global --edit" as a first step will be slightly more guided for
their first contact with Git.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-25 12:23:06 -07:00
Junio C Hamano 2125261b63 Merge branch 'jk/split-broken-ident'
Make the fall-back parsing of commit objects with broken author or
committer lines more robust to pick up the timestamps.

* jk/split-broken-ident:
  split_ident: parse timestamp from end of line
2013-10-28 10:43:32 -07:00
Jeff King 03818a4a94 split_ident: parse timestamp from end of line
Split_ident currently parses left to right. Given this
input:

  Your Name <email@example.com> 123456789 -0500\n

We assume the name starts the line and runs until the first
"<".  That starts the email address, which runs until the
first ">".  Everything after that is assumed to be the
timestamp.

This works fine in the normal case, but is easily broken by
corrupted ident lines that contain an extra ">". Some
examples seen in the wild are:

  1. Name <email>-<> 123456789 -0500\n

  2. Name <email> <Name<email>> 123456789 -0500\n

  3. Name1 <email1>, Name2 <email2> 123456789 -0500\n

Currently each of these produces some email address (which
is not necessarily the one the user intended) and end up
with a NULL date (which is generally interpreted as the
epoch by "git log" and friends).

But in each case we could get the correct timestamp simply
by parsing from the right-hand side, looking backwards for
the final ">", and then reading the timestamp from there.

In general, it's a losing battle to try to automatically
guess what the user meant with their broken crud. But this
particular workaround is probably worth doing.  One, it's
dirt simple, and can't impact non-broken cases. Two, it
doesn't catch a single breakage we've seen, but rather a
large class of errors (i.e., any breakage inside the email
angle brackets may affect the email, but won't spill over
into the timestamp parsing). And three, the timestamp is
arguably more valuable to get right, because it can affect
correctness (e.g., in --until cutoffs).

This patch implements the right-to-left scheme described
above. We adjust the tests in t4212, which generate a commit
with such a broken ident, and now gets the timestamp right.
We also add a test that fsck continues to detect the
breakage.

For reference, here are pointers to the breakages seen (as
numbered above):

[1] http://article.gmane.org/gmane.comp.version-control.git/221441

[2] http://article.gmane.org/gmane.comp.version-control.git/222362

[3] http://perl5.git.perl.org/perl.git/commit/13b79730adea97e660de84bbe67f9d7cbe344302

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-15 10:41:49 -07:00
Jeff King 662cc30cd0 format-patch: print in-body "From" only when needed
Commit a908047 taught format-patch the "--from" option,
which places the author ident into an in-body from header,
and uses the committer ident in the rfc822 from header.  The
documentation claims that it will omit the in-body header
when it is the same as the rfc822 header, but the code never
implemented that behavior.

This patch completes the feature by comparing the two idents
and doing nothing when they are the same (this is the same
as simply omitting the in-body header, as the two are by
definition indistinguishable in this case). This makes it
reasonable to turn on "--from" all the time (if it matches
your particular workflow), rather than only using it when
exporting other people's patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-20 11:09:51 -07:00
Junio C Hamano a14daf8b91 Merge branch 'jn/do-not-drop-username-when-reading-from-etc-mailname'
We used to stuff "user@" and then append what we read from
/etc/mailname to come up with a default e-mail ident, but a bug
lost the "user@" part.  This is to fix it.

* jn/do-not-drop-username-when-reading-from-etc-mailname:
  ident: do not drop username when reading from /etc/mailname
2013-02-01 12:40:05 -08:00
Jonathan Nieder dc342a25d1 ident: do not drop username when reading from /etc/mailname
An earlier conversion from fgets() to strbuf_getline() in the
codepath to read from /etc/mailname to learn the default host-part
of the ident e-mail address forgot that strbuf_getline() stores the
line at the beginning of the buffer just like fgets().

The "username@" the caller has prepared in the strbuf, expecting the
function to append the host-part to it, was lost because of this.

Reported-by: Mihai Rusu <dizzy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-25 10:41:49 -08:00
Jeff King d6991ceedc ident: keep separate "explicit" flags for author and committer
We keep track of whether the user ident was given to us
explicitly, or if we guessed at it from system parameters
like username and hostname. However, we kept only a single
variable. This covers the common cases (because the author
and committer will usually come from the same explicit
source), but can miss two cases:

  1. GIT_COMMITTER_* is set explicitly, but we fallback for
     GIT_AUTHOR. We claim the ident is explicit, even though
     the author is not.

  2. GIT_AUTHOR_* is set and we ask for author ident, but
     not committer ident. We will claim the ident is
     implicit, even though it is explicit.

This patch uses two variables instead of one, updates both
when we set the "fallback" values, and updates them
individually when we read from the environment.

Rather than keep user_ident_sufficiently_given as a
compatibility wrapper, we update the only two callers to
check the committer_ident, which matches their intent and
what was happening already.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-15 17:47:24 -08:00
Jeff King 452802309c ident: make user_ident_explicitly_given static
In v1.5.6-rc0~56^2 (2008-05-04) "user_ident_explicitly_given"
was introduced as a global for communication between config,
ident, and builtin-commit.  In v1.7.0-rc0~72^2 (2010-01-07)
readers switched to using the common wrapper
user_ident_sufficiently_given().  After v1.7.11-rc1~15^2~18
(2012-05-21), the var is only written in ident.c.

Now we can make it static, which will enable further
refactoring without worrying about upsetting other code.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-15 17:47:24 -08:00
Junio C Hamano dad148c359 ident.c: mark private file-scope symbols as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:21 -07:00
Junio C Hamano e27ddb6456 split_ident_line(): make best effort when parsing author/committer line
Commits made by ancient version of Git allowed committer without
human readable name, like this (00213b17c in the kernel history):

    tree 6947dba41f8b0e7fe7bccd41a4840d6de6a27079
    parent 352dd1df32e672be4cff71132eb9c06a257872fe
    author Petr Baudis <pasky@ucw.cz> 1135223044 +0100
    committer  <sam@mars.ravnborg.org> 1136151043 +0100

    kconfig: Remove support for lxdialog --checklist

    ...

    Signed-off-by: Petr Baudis <pasky@suse.cz>
    Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

When fed such a commit, --format='%ci' fails to parse it, and gives
back an empty string.  Update the split_ident_line() to be a bit
more lenient when parsing, but make sure the caller that wants to
pick up sane value from its return value does its own validation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-31 14:54:18 -07:00
Junio C Hamano 261ec7d02a Merge branch 'jk/ident-gecos-strbuf'
Fixes quite a lot of brokenness when ident information needs to be taken
from the system and cleans up the code.

By Jeff King
* jk/ident-gecos-strbuf: (22 commits)
  format-patch: do not use bogus email addresses in message ids
  ident: reject bogus email addresses with IDENT_STRICT
  ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
  format-patch: use GIT_COMMITTER_EMAIL in message ids
  ident: let callers omit name with fmt_indent
  ident: refactor NO_DATE flag in fmt_ident
  ident: reword empty ident error message
  format-patch: refactor get_patch_filename
  ident: trim whitespace from default name/email
  ident: use a dynamic strbuf in fmt_ident
  ident: use full dns names to generate email addresses
  ident: report passwd errors with a more friendly message
  drop length limitations on gecos-derived names and emails
  ident: don't write fallback username into git_default_name
  fmt_ident: drop IDENT_WARN_ON_NO_NAME code
  format-patch: use default email for generating message ids
  ident: trim trailing newline from /etc/mailname
  move git_default_* variables to ident.c
  move identity config parsing to ident.c
  fmt-merge-msg: don't use static buffer in record_person
  ...
2012-05-29 13:09:13 -07:00
Jeff King 8c5b1ae1b2 ident: reject bogus email addresses with IDENT_STRICT
If we come up with a hostname like "foo.(none)" because the
user's machine is not fully qualified, we should reject this
in strict mode (e.g., when we are making a commit object),
just as we reject an empty gecos username.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-24 20:50:05 -07:00
Jeff King f9bc573fda ident: rename IDENT_ERROR_ON_NO_NAME to IDENT_STRICT
Callers who ask for ERROR_ON_NO_NAME are not so much
concerned that the name will be blank (because, after all,
we will fall back to using the username), but rather it is a
check to make sure that low-quality identities do not end up
in things like commit messages or emails (whereas it is OK
for them to end up in things like reflogs).

When future commits add more quality checks on the identity,
each of these callers would want to use those checks, too.
Rather than modify each of them later to add a new flag,
let's refactor the flag.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-24 17:16:41 -07:00
Jeff King c15e1987ae ident: let callers omit name with fmt_indent
Most callers want to see all of "$name <$email> $date", but
a few want only limited parts, omitting the date, or even
the name. We already have IDENT_NO_DATE to handle the date
part, but there's not a good option for getting just the
email. Callers have to done one of:

  1. Call ident_default_email; this does not respect
     environment variables, nor does it promise to trim
     whitespace or other crud from the result.

  2. Call git_{committer,author}_info; this returns the name
     and email, leaving the caller to parse out the wanted
     bits.

This patch adds IDENT_NO_NAME; it stops short of adding
IDENT_NO_EMAIL, as no callers want it (nor are likely to),
and it complicates the error handling of the function.

When no name is requested, the angle brackets (<>) around
the email address are also omitted.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-24 17:16:40 -07:00
Jeff King 359b27add3 ident: refactor NO_DATE flag in fmt_ident
As a short-hand, we extract this flag into the local
variable "name_addr_only". It's more accurate to simply
negate this and refer to it as "want_date", which will be
less confusing when we add more NO_* flags.

While we're touching this part of the code, let's move the
call to ident_default_date() only when we are actually going
to use it, not when we have NO_DATE set, or when we get a
date from the environment.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-24 17:16:40 -07:00
Jeff King b00f6cfcd7 ident: reword empty ident error message
There's on point in printing the name, since it is by
definition the empty string if we have reached this code
path. Instead, let's be more clear that we are complaining
about the empty name, but still show the email address that
it is attached to (since that may provide some context to
the user).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-24 17:16:34 -07:00
Jeff King d9955fd60f fix off-by-one error in split_ident_line
Commit 4b340cf split the logic to parse an ident line out of
pretty.c's format_person_part. But in doing so, it
accidentally introduced an off-by-one error that caused it
to think that single-character names were invalid.

This manifested itself as the "%an" format failing to show
anything at all for a single-character name.

Reported-by: Brian Turner <bturner@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 11:24:11 -07:00