2014-10-01 14:28:42 +04:00
|
|
|
#ifndef LOCKFILE_H
|
|
|
|
#define LOCKFILE_H
|
|
|
|
|
|
|
|
/*
|
|
|
|
* File write-locks as used by Git.
|
|
|
|
*
|
2015-08-10 12:47:36 +03:00
|
|
|
* The lockfile API serves two purposes:
|
|
|
|
*
|
|
|
|
* * Mutual exclusion and atomic file updates. When we want to change
|
|
|
|
* a file, we create a lockfile `<filename>.lock`, write the new
|
|
|
|
* file contents into it, and then rename the lockfile to its final
|
|
|
|
* destination `<filename>`. We create the `<filename>.lock` file
|
|
|
|
* with `O_CREAT|O_EXCL` so that we can notice and fail if somebody
|
|
|
|
* else has already locked the file, then atomically rename the
|
|
|
|
* lockfile to its final destination to commit the changes and
|
|
|
|
* unlock the file.
|
|
|
|
*
|
|
|
|
* * Automatic cruft removal. If the program exits after we lock a
|
|
|
|
* file but before the changes have been committed, we want to make
|
|
|
|
* sure that we remove the lockfile. This is done by remembering the
|
|
|
|
* lockfiles we have created in a linked list and setting up an
|
|
|
|
* `atexit(3)` handler and a signal handler that clean up the
|
|
|
|
* lockfiles. This mechanism ensures that outstanding lockfiles are
|
|
|
|
* cleaned up if the program exits (including when `die()` is
|
|
|
|
* called) or if the program is terminated by a signal.
|
|
|
|
*
|
|
|
|
* Please note that lockfiles only block other writers. Readers do not
|
|
|
|
* block, but they are guaranteed to see either the old contents of
|
|
|
|
* the file or the new contents of the file (assuming that the
|
|
|
|
* filesystem implements `rename(2)` atomically).
|
|
|
|
*
|
2015-08-10 12:47:41 +03:00
|
|
|
* Most of the heavy lifting is done by the tempfile module (see
|
|
|
|
* "tempfile.h").
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
|
|
|
* Calling sequence
|
|
|
|
* ----------------
|
|
|
|
*
|
|
|
|
* The caller:
|
|
|
|
*
|
2017-09-05 15:15:12 +03:00
|
|
|
* * Allocates a `struct lock_file` with whatever storage duration you
|
|
|
|
* desire. The struct does not have to be initialized before being
|
|
|
|
* used, but it is good practice to do so using by setting it to
|
|
|
|
* all-zeros (or using the LOCK_INIT macro). This puts the object in a
|
|
|
|
* consistent state that allows you to call rollback_lock_file() even
|
|
|
|
* if the lock was never taken (in which case it is a noop).
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
2015-08-28 19:55:52 +03:00
|
|
|
* * Attempts to create a lockfile by calling `hold_lock_file_for_update()`.
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
|
|
|
* * Writes new content for the destination file by either:
|
|
|
|
*
|
|
|
|
* * writing to the file descriptor returned by the
|
|
|
|
* `hold_lock_file_for_*()` functions (also available via
|
|
|
|
* `lock->fd`).
|
|
|
|
*
|
|
|
|
* * calling `fdopen_lock_file()` to get a `FILE` pointer for the
|
|
|
|
* open file and writing to the file using stdio.
|
|
|
|
*
|
2016-08-22 15:47:55 +03:00
|
|
|
* Note that the file descriptor returned by hold_lock_file_for_update()
|
|
|
|
* is marked O_CLOEXEC, so the new contents must be written by the
|
|
|
|
* current process, not a spawned one.
|
|
|
|
*
|
2015-08-10 12:47:36 +03:00
|
|
|
* When finished writing, the caller can:
|
|
|
|
*
|
|
|
|
* * Close the file descriptor and rename the lockfile to its final
|
|
|
|
* destination by calling `commit_lock_file()` or
|
|
|
|
* `commit_lock_file_to()`.
|
|
|
|
*
|
|
|
|
* * Close the file descriptor and remove the lockfile by calling
|
|
|
|
* `rollback_lock_file()`.
|
|
|
|
*
|
|
|
|
* * Close the file descriptor without removing or renaming the
|
2017-09-05 15:14:33 +03:00
|
|
|
* lockfile by calling `close_lock_file_gently()`, and later call
|
2015-08-10 12:47:36 +03:00
|
|
|
* `commit_lock_file()`, `commit_lock_file_to()`,
|
|
|
|
* `rollback_lock_file()`, or `reopen_lock_file()`.
|
|
|
|
*
|
2017-09-05 15:15:12 +03:00
|
|
|
* After the lockfile is committed or rolled back, the `lock_file`
|
|
|
|
* object can be discarded or reused.
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
|
|
|
* If the program exits before `commit_lock_file()`,
|
2015-08-10 12:47:41 +03:00
|
|
|
* `commit_lock_file_to()`, or `rollback_lock_file()` is called, the
|
|
|
|
* tempfile module will close and remove the lockfile, thereby rolling
|
|
|
|
* back any uncommitted changes.
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
|
|
|
* If you need to close the file descriptor you obtained from a
|
|
|
|
* `hold_lock_file_for_*()` function yourself, do so by calling
|
2017-09-05 15:14:33 +03:00
|
|
|
* `close_lock_file_gently()`. See "tempfile.h" for more information.
|
2015-08-10 12:47:41 +03:00
|
|
|
*
|
|
|
|
*
|
|
|
|
* Under the covers, a lockfile is just a tempfile with a few helper
|
|
|
|
* functions. In particular, the state diagram and the cleanup
|
|
|
|
* machinery are all implemented in the tempfile module.
|
|
|
|
*
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
|
|
|
* Error handling
|
|
|
|
* --------------
|
|
|
|
*
|
|
|
|
* The `hold_lock_file_for_*()` functions return a file descriptor on
|
|
|
|
* success or -1 on failure (unless `LOCK_DIE_ON_ERROR` is used; see
|
|
|
|
* "flags" below). On errors, `errno` describes the reason for
|
|
|
|
* failure. Errors can be reported by passing `errno` to
|
|
|
|
* `unable_to_lock_message()` or `unable_to_lock_die()`.
|
|
|
|
*
|
|
|
|
* Similarly, `commit_lock_file`, `commit_lock_file_to`, and
|
|
|
|
* `close_lock_file` return 0 on success. On failure they set `errno`
|
2017-09-05 15:14:33 +03:00
|
|
|
* appropriately and return -1. The `commit` variants (but not `close`)
|
|
|
|
* do their best to delete the temporary file before returning.
|
2014-10-01 14:28:42 +04:00
|
|
|
*/
|
|
|
|
|
2015-08-10 12:47:41 +03:00
|
|
|
#include "tempfile.h"
|
|
|
|
|
2014-10-01 14:28:42 +04:00
|
|
|
struct lock_file {
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
struct tempfile *tempfile;
|
2014-10-01 14:28:42 +04:00
|
|
|
};
|
|
|
|
|
2017-09-05 15:15:12 +03:00
|
|
|
#define LOCK_INIT { NULL }
|
|
|
|
|
2014-10-01 14:28:42 +04:00
|
|
|
/* String appended to a filename to derive the lockfile name: */
|
|
|
|
#define LOCK_SUFFIX ".lock"
|
|
|
|
#define LOCK_SUFFIX_LEN 5
|
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Flags
|
|
|
|
* -----
|
|
|
|
*
|
2015-08-28 19:55:52 +03:00
|
|
|
* The following flags can be passed to `hold_lock_file_for_update()`.
|
2015-08-10 12:47:36 +03:00
|
|
|
*/
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If a lock is already taken for the file, `die()` with an error
|
|
|
|
* message. If this flag is not specified, trying to lock a file that
|
2016-12-07 21:56:26 +03:00
|
|
|
* is already locked silently returns -1 to the caller, or ...
|
2015-08-10 12:47:36 +03:00
|
|
|
*/
|
2014-10-01 14:28:42 +04:00
|
|
|
#define LOCK_DIE_ON_ERROR 1
|
2015-08-10 12:47:36 +03:00
|
|
|
|
2016-12-07 21:56:26 +03:00
|
|
|
/*
|
|
|
|
* ... this flag can be passed instead to return -1 and give the usual
|
|
|
|
* error message upon an error.
|
|
|
|
*/
|
2016-12-27 20:12:09 +03:00
|
|
|
#define LOCK_REPORT_ON_ERROR 4
|
2016-12-07 21:56:26 +03:00
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
/*
|
|
|
|
* Usually symbolic links in the destination path are resolved. This
|
|
|
|
* means that (1) the lockfile is created by adding ".lock" to the
|
|
|
|
* resolved path, and (2) upon commit, the resolved path is
|
|
|
|
* overwritten. However, if `LOCK_NO_DEREF` is set, then the lockfile
|
|
|
|
* is created by adding ".lock" to the path argument itself. This
|
|
|
|
* option is used, for example, when detaching a symbolic reference,
|
|
|
|
* which for backwards-compatibility reasons, can be a symbolic link
|
|
|
|
* containing the name of the referred-to-reference.
|
|
|
|
*/
|
2014-10-01 14:28:42 +04:00
|
|
|
#define LOCK_NO_DEREF 2
|
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
/*
|
|
|
|
* Attempt to create a lockfile for the file at `path` and return a
|
|
|
|
* file descriptor for writing to it, or -1 on error. If the file is
|
|
|
|
* currently locked, retry with quadratic backoff for at least
|
|
|
|
* timeout_ms milliseconds. If timeout_ms is 0, try exactly once; if
|
|
|
|
* timeout_ms is -1, retry indefinitely. The flags argument and error
|
|
|
|
* handling are described above.
|
|
|
|
*/
|
2015-05-11 13:35:25 +03:00
|
|
|
extern int hold_lock_file_for_update_timeout(
|
|
|
|
struct lock_file *lk, const char *path,
|
|
|
|
int flags, long timeout_ms);
|
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
/*
|
|
|
|
* Attempt to create a lockfile for the file at `path` and return a
|
|
|
|
* file descriptor for writing to it, or -1 on error. The flags
|
|
|
|
* argument and error handling are described above.
|
|
|
|
*/
|
2015-05-11 13:35:25 +03:00
|
|
|
static inline int hold_lock_file_for_update(
|
|
|
|
struct lock_file *lk, const char *path,
|
|
|
|
int flags)
|
|
|
|
{
|
|
|
|
return hold_lock_file_for_update_timeout(lk, path, flags, 0);
|
|
|
|
}
|
|
|
|
|
2017-05-22 17:17:39 +03:00
|
|
|
/*
|
|
|
|
* Return a nonzero value iff `lk` is currently locked.
|
|
|
|
*/
|
|
|
|
static inline int is_lock_file_locked(struct lock_file *lk)
|
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return is_tempfile_active(lk->tempfile);
|
2017-05-22 17:17:39 +03:00
|
|
|
}
|
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
/*
|
|
|
|
* Append an appropriate error message to `buf` following the failure
|
2015-08-28 19:55:52 +03:00
|
|
|
* of `hold_lock_file_for_update()` to lock `path`. `err` should be the
|
|
|
|
* `errno` set by the failing call.
|
2015-08-10 12:47:36 +03:00
|
|
|
*/
|
|
|
|
extern void unable_to_lock_message(const char *path, int err,
|
|
|
|
struct strbuf *buf);
|
2015-05-11 13:35:25 +03:00
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
/*
|
|
|
|
* Emit an appropriate error message and `die()` following the failure
|
2015-08-28 19:55:52 +03:00
|
|
|
* of `hold_lock_file_for_update()` to lock `path`. `err` should be the
|
|
|
|
* `errno` set by the failing
|
2015-08-10 12:47:36 +03:00
|
|
|
* call.
|
|
|
|
*/
|
|
|
|
extern NORETURN void unable_to_lock_die(const char *path, int err);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Associate a stdio stream with the lockfile (which must still be
|
|
|
|
* open). Return `NULL` (*without* rolling back the lockfile) on
|
2017-09-05 15:14:33 +03:00
|
|
|
* error. The stream is closed automatically when
|
|
|
|
* `close_lock_file_gently()` is called or when the file is committed or
|
|
|
|
* rolled back.
|
2015-08-10 12:47:36 +03:00
|
|
|
*/
|
2015-08-10 12:47:41 +03:00
|
|
|
static inline FILE *fdopen_lock_file(struct lock_file *lk, const char *mode)
|
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return fdopen_tempfile(lk->tempfile, mode);
|
2015-08-10 12:47:41 +03:00
|
|
|
}
|
2015-08-10 12:47:36 +03:00
|
|
|
|
2015-08-10 12:47:39 +03:00
|
|
|
/*
|
|
|
|
* Return the path of the lockfile. The return value is a pointer to a
|
|
|
|
* field within the lock_file object and should not be freed.
|
|
|
|
*/
|
2015-08-10 12:47:41 +03:00
|
|
|
static inline const char *get_lock_file_path(struct lock_file *lk)
|
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return get_tempfile_path(lk->tempfile);
|
2015-08-10 12:47:41 +03:00
|
|
|
}
|
2015-08-10 12:47:39 +03:00
|
|
|
|
2015-08-10 12:47:41 +03:00
|
|
|
static inline int get_lock_file_fd(struct lock_file *lk)
|
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return get_tempfile_fd(lk->tempfile);
|
2015-08-10 12:47:41 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline FILE *get_lock_file_fp(struct lock_file *lk)
|
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return get_tempfile_fp(lk->tempfile);
|
2015-08-10 12:47:41 +03:00
|
|
|
}
|
2015-08-10 12:47:38 +03:00
|
|
|
|
2015-08-10 12:47:36 +03:00
|
|
|
/*
|
|
|
|
* Return the path of the file that is locked by the specified
|
|
|
|
* lock_file object. The caller must free the memory.
|
|
|
|
*/
|
|
|
|
extern char *get_locked_file_path(struct lock_file *lk);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the lockfile is still open, close it (and the file pointer if it
|
|
|
|
* has been opened using `fdopen_lock_file()`) without renaming the
|
|
|
|
* lockfile over the file being locked. Return 0 upon success. On
|
2017-10-05 23:32:05 +03:00
|
|
|
* failure to `close(2)`, return a negative value (the lockfile is not
|
|
|
|
* rolled back). Usually `commit_lock_file()`, `commit_lock_file_to()`,
|
2017-09-05 15:14:33 +03:00
|
|
|
* or `rollback_lock_file()` should eventually be called.
|
2015-08-10 12:47:36 +03:00
|
|
|
*/
|
2017-09-05 15:14:33 +03:00
|
|
|
static inline int close_lock_file_gently(struct lock_file *lk)
|
2015-08-10 12:47:41 +03:00
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return close_tempfile_gently(lk->tempfile);
|
2015-08-10 12:47:41 +03:00
|
|
|
}
|
2015-08-10 12:47:36 +03:00
|
|
|
|
|
|
|
/*
|
2017-09-05 15:14:33 +03:00
|
|
|
* Re-open a lockfile that has been closed using `close_lock_file_gently()`
|
2015-08-10 12:47:36 +03:00
|
|
|
* but not yet committed or rolled back. This can be used to implement
|
|
|
|
* a sequence of operations like the following:
|
|
|
|
*
|
|
|
|
* * Lock file.
|
|
|
|
*
|
2017-09-05 15:14:33 +03:00
|
|
|
* * Write new contents to lockfile, then `close_lock_file_gently()` to
|
2015-08-10 12:47:36 +03:00
|
|
|
* cause the contents to be written to disk.
|
|
|
|
*
|
|
|
|
* * Pass the name of the lockfile to another program to allow it (and
|
|
|
|
* nobody else) to inspect the contents you wrote, while still
|
|
|
|
* holding the lock yourself.
|
|
|
|
*
|
reopen_tempfile(): truncate opened file
We provide a reopen_tempfile() function, which is in turn
used by reopen_lockfile(). The idea is that a caller may
want to rewrite the tempfile without letting go of the lock.
And that's what our one caller does: after running
add--interactive, "commit -p" will update the cache-tree
extension of the index and write out the result, all while
holding the lock.
However, because we open the file with only the O_WRONLY
flag, the existing index content is left in place, and we
overwrite it starting at position 0. If the new index after
updating the cache-tree is smaller than the original, those
final bytes are not overwritten and remain in the file. This
results in a corrupt index, since those cruft bytes are
interpreted as part of the trailing hash (or even as an
extension, if there are enough bytes).
This bug actually pre-dates reopen_tempfile(); the original
code from 9c4d6c0297 (cache-tree: Write updated cache-tree
after commit, 2014-07-13) has the same bug, and those lines
were eventually refactored into the tempfile module. Nobody
noticed until now for two reasons:
- the bug can only be triggered in interactive mode
("commit -p" or "commit -i")
- the size of the index must shrink after updating the
cache-tree, which implies a non-trivial deletion. Notice
that the included test actually has to create a 2-deep
hierarchy. A single level is not enough to actually cause
shrinkage.
The fix is to truncate the file before writing out the
second index. We can do that at the caller by using
ftruncate(). But we shouldn't have to do that. There is no
other place in Git where we want to open a file and
overwrite bytes, making reopen_tempfile() a confusing and
error-prone interface. Let's pass O_TRUNC there, which gives
callers the same state they had after initially opening the
file or lock.
It's possible that we could later add a caller that wants
something else (e.g., to open with O_APPEND). But this is
the only caller we've had in the history of the codebase.
Let's punt on doing anything more clever until another one
comes along.
Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-05 02:36:43 +03:00
|
|
|
* * `reopen_lock_file()` to reopen the lockfile, truncating the existing
|
|
|
|
* contents. Write out the new contents.
|
2015-08-10 12:47:36 +03:00
|
|
|
*
|
|
|
|
* * `commit_lock_file()` to make the final version permanent.
|
|
|
|
*/
|
2015-08-10 12:47:41 +03:00
|
|
|
static inline int reopen_lock_file(struct lock_file *lk)
|
|
|
|
{
|
tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:
struct tempfile t;
create_tempfile(&t, ...);
...
if (!err)
rename_tempfile(&t, ...);
else
delete_tempfile(&t);
But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.
Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).
But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it. Let's
see if we can make it harder to get this wrong.
Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.
Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).
This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.
The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:
1. Storing a pointer instead of the struct itself.
2. Passing the pointer instead of taking the struct
address.
3. Handling a "struct tempfile *" return instead of a file
descriptor.
We could play games to make this less noisy. For example, by
defining the tempfile like this:
struct tempfile {
struct heap_allocated_part_of_tempfile {
int fd;
...etc
} *actual_data;
}
Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-05 15:15:08 +03:00
|
|
|
return reopen_tempfile(lk->tempfile);
|
2015-08-10 12:47:41 +03:00
|
|
|
}
|
2015-08-10 12:47:36 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Commit the change represented by `lk`: close the file descriptor
|
|
|
|
* and/or file pointer if they are still open and rename the lockfile
|
|
|
|
* to its final destination. Return 0 upon success. On failure, roll
|
|
|
|
* back the lock file and return -1, with `errno` set to the value
|
|
|
|
* from the failing call to `close(2)` or `rename(2)`. It is a bug to
|
|
|
|
* call `commit_lock_file()` for a `lock_file` object that is not
|
|
|
|
* currently locked.
|
|
|
|
*/
|
|
|
|
extern int commit_lock_file(struct lock_file *lk);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Like `commit_lock_file()`, but rename the lockfile to the provided
|
|
|
|
* `path`. `path` must be on the same filesystem as the lock file.
|
|
|
|
*/
|
2015-08-10 12:47:41 +03:00
|
|
|
static inline int commit_lock_file_to(struct lock_file *lk, const char *path)
|
|
|
|
{
|
|
|
|
return rename_tempfile(&lk->tempfile, path);
|
|
|
|
}
|
2015-08-10 12:47:36 +03:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Roll back `lk`: close the file descriptor and/or file pointer and
|
|
|
|
* remove the lockfile. It is a NOOP to call `rollback_lock_file()`
|
|
|
|
* for a `lock_file` object that has already been committed or rolled
|
|
|
|
* back.
|
|
|
|
*/
|
2015-08-10 12:47:41 +03:00
|
|
|
static inline void rollback_lock_file(struct lock_file *lk)
|
|
|
|
{
|
|
|
|
delete_tempfile(&lk->tempfile);
|
|
|
|
}
|
2014-10-01 14:28:42 +04:00
|
|
|
|
|
|
|
#endif /* LOCKFILE_H */
|