2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* linux/fs/exec.c
|
|
|
|
*
|
|
|
|
* Copyright (C) 1991, 1992 Linus Torvalds
|
|
|
|
*/
|
|
|
|
|
|
|
|
/*
|
|
|
|
* #!-checking implemented by tytso.
|
|
|
|
*/
|
|
|
|
/*
|
|
|
|
* Demand-loading implemented 01.12.91 - no need to read anything but
|
|
|
|
* the header into memory. The inode of the executable is put into
|
|
|
|
* "current->executable", and page faults do the actual loading. Clean.
|
|
|
|
*
|
|
|
|
* Once more I can proudly say that linux stood up to being changed: it
|
|
|
|
* was less than 2 hours work to get demand-loading completely implemented.
|
|
|
|
*
|
|
|
|
* Demand loading changed July 1993 by Eric Youngdale. Use mmap instead,
|
|
|
|
* current->executable is only used by the procfs. This allows a dispatch
|
|
|
|
* table to check for several different types of binary formats. We keep
|
|
|
|
* trying until we recognize the file or we run out of supported binary
|
|
|
|
* formats.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <linux/slab.h>
|
|
|
|
#include <linux/file.h>
|
2008-04-24 15:44:08 +04:00
|
|
|
#include <linux/fdtable.h>
|
2008-07-25 12:45:43 +04:00
|
|
|
#include <linux/mm.h>
|
2005-04-17 02:20:36 +04:00
|
|
|
#include <linux/stat.h>
|
|
|
|
#include <linux/fcntl.h>
|
|
|
|
#include <linux/smp_lock.h>
|
2008-07-25 12:45:43 +04:00
|
|
|
#include <linux/swap.h>
|
2007-10-17 10:26:35 +04:00
|
|
|
#include <linux/string.h>
|
2005-04-17 02:20:36 +04:00
|
|
|
#include <linux/init.h>
|
2008-07-29 02:46:18 +04:00
|
|
|
#include <linux/pagemap.h>
|
2005-04-17 02:20:36 +04:00
|
|
|
#include <linux/highmem.h>
|
|
|
|
#include <linux/spinlock.h>
|
|
|
|
#include <linux/key.h>
|
|
|
|
#include <linux/personality.h>
|
|
|
|
#include <linux/binfmts.h>
|
|
|
|
#include <linux/utsname.h>
|
2006-12-08 13:38:01 +03:00
|
|
|
#include <linux/pid_namespace.h>
|
2005-04-17 02:20:36 +04:00
|
|
|
#include <linux/module.h>
|
|
|
|
#include <linux/namei.h>
|
|
|
|
#include <linux/proc_fs.h>
|
|
|
|
#include <linux/mount.h>
|
|
|
|
#include <linux/security.h>
|
|
|
|
#include <linux/syscalls.h>
|
2006-10-01 10:28:59 +04:00
|
|
|
#include <linux/tsacct_kern.h>
|
2005-11-07 11:59:16 +03:00
|
|
|
#include <linux/cn_proc.h>
|
2006-04-26 22:04:08 +04:00
|
|
|
#include <linux/audit.h>
|
2008-07-26 06:45:44 +04:00
|
|
|
#include <linux/tracehook.h>
|
2008-07-09 12:28:40 +04:00
|
|
|
#include <linux/kmod.h>
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
#include <asm/uaccess.h>
|
|
|
|
#include <asm/mmu_context.h>
|
2007-07-19 12:48:16 +04:00
|
|
|
#include <asm/tlb.h>
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
#include "internal.h"
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
int core_uses_pid;
|
2007-05-17 09:11:16 +04:00
|
|
|
char core_pattern[CORENAME_MAX_SIZE] = "core";
|
2005-06-23 11:09:43 +04:00
|
|
|
int suid_dumpable = 0;
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/* The maximal length of core_pattern is also specified in sysctl.c */
|
|
|
|
|
2007-10-17 10:26:03 +04:00
|
|
|
static LIST_HEAD(formats);
|
2005-04-17 02:20:36 +04:00
|
|
|
static DEFINE_RWLOCK(binfmt_lock);
|
|
|
|
|
|
|
|
int register_binfmt(struct linux_binfmt * fmt)
|
|
|
|
{
|
|
|
|
if (!fmt)
|
|
|
|
return -EINVAL;
|
|
|
|
write_lock(&binfmt_lock);
|
2007-10-17 10:26:03 +04:00
|
|
|
list_add(&fmt->lh, &formats);
|
2005-04-17 02:20:36 +04:00
|
|
|
write_unlock(&binfmt_lock);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(register_binfmt);
|
|
|
|
|
2007-10-17 10:26:04 +04:00
|
|
|
void unregister_binfmt(struct linux_binfmt * fmt)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
write_lock(&binfmt_lock);
|
2007-10-17 10:26:03 +04:00
|
|
|
list_del(&fmt->lh);
|
2005-04-17 02:20:36 +04:00
|
|
|
write_unlock(&binfmt_lock);
|
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(unregister_binfmt);
|
|
|
|
|
|
|
|
static inline void put_binfmt(struct linux_binfmt * fmt)
|
|
|
|
{
|
|
|
|
module_put(fmt->module);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Note that a shared library must be both readable and executable due to
|
|
|
|
* security reasons.
|
|
|
|
*
|
|
|
|
* Also note that we take the address to load from from the file itself.
|
|
|
|
*/
|
|
|
|
asmlinkage long sys_uselib(const char __user * library)
|
|
|
|
{
|
2008-07-26 11:33:14 +04:00
|
|
|
struct file *file;
|
2005-04-17 02:20:36 +04:00
|
|
|
struct nameidata nd;
|
2008-07-26 11:33:14 +04:00
|
|
|
char *tmp = getname(library);
|
|
|
|
int error = PTR_ERR(tmp);
|
|
|
|
|
|
|
|
if (!IS_ERR(tmp)) {
|
|
|
|
error = path_lookup_open(AT_FDCWD, tmp,
|
|
|
|
LOOKUP_FOLLOW, &nd,
|
|
|
|
FMODE_READ|FMODE_EXEC);
|
|
|
|
putname(tmp);
|
|
|
|
}
|
2005-04-17 02:20:36 +04:00
|
|
|
if (error)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
error = -EINVAL;
|
2008-02-15 06:34:32 +03:00
|
|
|
if (!S_ISREG(nd.path.dentry->d_inode->i_mode))
|
2005-04-17 02:20:36 +04:00
|
|
|
goto exit;
|
|
|
|
|
2008-07-22 08:02:33 +04:00
|
|
|
error = -EACCES;
|
|
|
|
if (nd.path.mnt->mnt_flags & MNT_NOEXEC)
|
|
|
|
goto exit;
|
|
|
|
|
2008-10-24 11:59:29 +04:00
|
|
|
error = inode_permission(nd.path.dentry->d_inode,
|
|
|
|
MAY_READ | MAY_EXEC | MAY_OPEN);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (error)
|
|
|
|
goto exit;
|
|
|
|
|
2008-02-08 15:20:23 +03:00
|
|
|
file = nameidata_to_filp(&nd, O_RDONLY|O_LARGEFILE);
|
2005-04-17 02:20:36 +04:00
|
|
|
error = PTR_ERR(file);
|
|
|
|
if (IS_ERR(file))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
error = -ENOEXEC;
|
|
|
|
if(file->f_op) {
|
|
|
|
struct linux_binfmt * fmt;
|
|
|
|
|
|
|
|
read_lock(&binfmt_lock);
|
2007-10-17 10:26:03 +04:00
|
|
|
list_for_each_entry(fmt, &formats, lh) {
|
2005-04-17 02:20:36 +04:00
|
|
|
if (!fmt->load_shlib)
|
|
|
|
continue;
|
|
|
|
if (!try_module_get(fmt->module))
|
|
|
|
continue;
|
|
|
|
read_unlock(&binfmt_lock);
|
|
|
|
error = fmt->load_shlib(file);
|
|
|
|
read_lock(&binfmt_lock);
|
|
|
|
put_binfmt(fmt);
|
|
|
|
if (error != -ENOEXEC)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
read_unlock(&binfmt_lock);
|
|
|
|
}
|
|
|
|
fput(file);
|
|
|
|
out:
|
|
|
|
return error;
|
|
|
|
exit:
|
2005-10-19 01:20:16 +04:00
|
|
|
release_open_intent(&nd);
|
2008-02-15 06:34:35 +03:00
|
|
|
path_put(&nd.path);
|
2005-04-17 02:20:36 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
#ifdef CONFIG_MMU
|
|
|
|
|
|
|
|
static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
|
|
|
|
int write)
|
|
|
|
{
|
|
|
|
struct page *page;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
#ifdef CONFIG_STACK_GROWSUP
|
|
|
|
if (write) {
|
|
|
|
ret = expand_stack_downwards(bprm->vma, pos);
|
|
|
|
if (ret < 0)
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
ret = get_user_pages(current, bprm->mm, pos,
|
|
|
|
1, write, 1, &page, NULL);
|
|
|
|
if (ret <= 0)
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
if (write) {
|
|
|
|
unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
|
2008-03-03 21:12:14 +03:00
|
|
|
struct rlimit *rlim;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We've historically supported up to 32 pages (ARG_MAX)
|
|
|
|
* of argument strings even with small stacks
|
|
|
|
*/
|
|
|
|
if (size <= ARG_MAX)
|
|
|
|
return page;
|
2007-07-19 12:48:16 +04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Limit to 1/4-th the stack size for the argv+env strings.
|
|
|
|
* This ensures that:
|
|
|
|
* - the remaining binfmt code will not run out of stack space,
|
|
|
|
* - the program will have a reasonable amount of stack left
|
|
|
|
* to work from.
|
|
|
|
*/
|
2008-03-03 21:12:14 +03:00
|
|
|
rlim = current->signal->rlim;
|
2007-07-19 12:48:16 +04:00
|
|
|
if (size > rlim[RLIMIT_STACK].rlim_cur / 4) {
|
|
|
|
put_page(page);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return page;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void put_arg_page(struct page *page)
|
|
|
|
{
|
|
|
|
put_page(page);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void free_arg_page(struct linux_binprm *bprm, int i)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
static void free_arg_pages(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
static void flush_arg_page(struct linux_binprm *bprm, unsigned long pos,
|
|
|
|
struct page *page)
|
|
|
|
{
|
|
|
|
flush_cache_page(bprm->vma, pos, page_to_pfn(page));
|
|
|
|
}
|
|
|
|
|
|
|
|
static int __bprm_mm_init(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
int err = -ENOMEM;
|
|
|
|
struct vm_area_struct *vma = NULL;
|
|
|
|
struct mm_struct *mm = bprm->mm;
|
|
|
|
|
|
|
|
bprm->vma = vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
|
|
|
|
if (!vma)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
down_write(&mm->mmap_sem);
|
|
|
|
vma->vm_mm = mm;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Place the stack at the largest stack address the architecture
|
|
|
|
* supports. Later, we'll move this to an appropriate place. We don't
|
|
|
|
* use STACK_TOP because that can depend on attributes which aren't
|
|
|
|
* configured yet.
|
|
|
|
*/
|
|
|
|
vma->vm_end = STACK_TOP_MAX;
|
|
|
|
vma->vm_start = vma->vm_end - PAGE_SIZE;
|
|
|
|
|
|
|
|
vma->vm_flags = VM_STACK_FLAGS;
|
2007-10-19 10:39:15 +04:00
|
|
|
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
|
2007-07-19 12:48:16 +04:00
|
|
|
err = insert_vm_struct(mm, vma);
|
|
|
|
if (err) {
|
|
|
|
up_write(&mm->mmap_sem);
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
|
|
|
|
mm->stack_vm = mm->total_vm = 1;
|
|
|
|
up_write(&mm->mmap_sem);
|
|
|
|
|
|
|
|
bprm->p = vma->vm_end - sizeof(void *);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
err:
|
|
|
|
if (vma) {
|
|
|
|
bprm->vma = NULL;
|
|
|
|
kmem_cache_free(vm_area_cachep, vma);
|
|
|
|
}
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool valid_arg_len(struct linux_binprm *bprm, long len)
|
|
|
|
{
|
|
|
|
return len <= MAX_ARG_STRLEN;
|
|
|
|
}
|
|
|
|
|
|
|
|
#else
|
|
|
|
|
|
|
|
static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
|
|
|
|
int write)
|
|
|
|
{
|
|
|
|
struct page *page;
|
|
|
|
|
|
|
|
page = bprm->page[pos / PAGE_SIZE];
|
|
|
|
if (!page && write) {
|
|
|
|
page = alloc_page(GFP_HIGHUSER|__GFP_ZERO);
|
|
|
|
if (!page)
|
|
|
|
return NULL;
|
|
|
|
bprm->page[pos / PAGE_SIZE] = page;
|
|
|
|
}
|
|
|
|
|
|
|
|
return page;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void put_arg_page(struct page *page)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
static void free_arg_page(struct linux_binprm *bprm, int i)
|
|
|
|
{
|
|
|
|
if (bprm->page[i]) {
|
|
|
|
__free_page(bprm->page[i]);
|
|
|
|
bprm->page[i] = NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void free_arg_pages(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < MAX_ARG_PAGES; i++)
|
|
|
|
free_arg_page(bprm, i);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void flush_arg_page(struct linux_binprm *bprm, unsigned long pos,
|
|
|
|
struct page *page)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
static int __bprm_mm_init(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
bprm->p = PAGE_SIZE * MAX_ARG_PAGES - sizeof(void *);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static bool valid_arg_len(struct linux_binprm *bprm, long len)
|
|
|
|
{
|
|
|
|
return len <= bprm->p;
|
|
|
|
}
|
|
|
|
|
|
|
|
#endif /* CONFIG_MMU */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a new mm_struct and populate it with a temporary stack
|
|
|
|
* vm_area_struct. We don't have enough context at this point to set the stack
|
|
|
|
* flags, permissions, and offset, so we use temporary values. We'll update
|
|
|
|
* them later in setup_arg_pages().
|
|
|
|
*/
|
|
|
|
int bprm_mm_init(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
int err;
|
|
|
|
struct mm_struct *mm = NULL;
|
|
|
|
|
|
|
|
bprm->mm = mm = mm_alloc();
|
|
|
|
err = -ENOMEM;
|
|
|
|
if (!mm)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
err = init_new_context(current, mm);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
err = __bprm_mm_init(bprm);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
err:
|
|
|
|
if (mm) {
|
|
|
|
bprm->mm = NULL;
|
|
|
|
mmdrop(mm);
|
|
|
|
}
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* count() counts the number of strings in array ARGV.
|
|
|
|
*/
|
|
|
|
static int count(char __user * __user * argv, int max)
|
|
|
|
{
|
|
|
|
int i = 0;
|
|
|
|
|
|
|
|
if (argv != NULL) {
|
|
|
|
for (;;) {
|
|
|
|
char __user * p;
|
|
|
|
|
|
|
|
if (get_user(p, argv))
|
|
|
|
return -EFAULT;
|
|
|
|
if (!p)
|
|
|
|
break;
|
|
|
|
argv++;
|
2008-10-16 09:01:52 +04:00
|
|
|
if (i++ >= max)
|
2005-04-17 02:20:36 +04:00
|
|
|
return -E2BIG;
|
|
|
|
cond_resched();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return i;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2007-07-19 12:48:16 +04:00
|
|
|
* 'copy_strings()' copies argument/environment strings from the old
|
|
|
|
* processes's memory to the new process's stack. The call to get_user_pages()
|
|
|
|
* ensures the destination page is created and not swapped out.
|
2005-04-17 02:20:36 +04:00
|
|
|
*/
|
2005-05-06 03:16:09 +04:00
|
|
|
static int copy_strings(int argc, char __user * __user * argv,
|
|
|
|
struct linux_binprm *bprm)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
struct page *kmapped_page = NULL;
|
|
|
|
char *kaddr = NULL;
|
2007-07-19 12:48:16 +04:00
|
|
|
unsigned long kpos = 0;
|
2005-04-17 02:20:36 +04:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
while (argc-- > 0) {
|
|
|
|
char __user *str;
|
|
|
|
int len;
|
|
|
|
unsigned long pos;
|
|
|
|
|
|
|
|
if (get_user(str, argv+argc) ||
|
2007-07-19 12:48:16 +04:00
|
|
|
!(len = strnlen_user(str, MAX_ARG_STRLEN))) {
|
2005-04-17 02:20:36 +04:00
|
|
|
ret = -EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
if (!valid_arg_len(bprm, len)) {
|
2005-04-17 02:20:36 +04:00
|
|
|
ret = -E2BIG;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
/* We're going to work our way backwords. */
|
2005-04-17 02:20:36 +04:00
|
|
|
pos = bprm->p;
|
2007-07-19 12:48:16 +04:00
|
|
|
str += len;
|
|
|
|
bprm->p -= len;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
while (len > 0) {
|
|
|
|
int offset, bytes_to_copy;
|
|
|
|
|
|
|
|
offset = pos % PAGE_SIZE;
|
2007-07-19 12:48:16 +04:00
|
|
|
if (offset == 0)
|
|
|
|
offset = PAGE_SIZE;
|
|
|
|
|
|
|
|
bytes_to_copy = offset;
|
|
|
|
if (bytes_to_copy > len)
|
|
|
|
bytes_to_copy = len;
|
|
|
|
|
|
|
|
offset -= bytes_to_copy;
|
|
|
|
pos -= bytes_to_copy;
|
|
|
|
str -= bytes_to_copy;
|
|
|
|
len -= bytes_to_copy;
|
|
|
|
|
|
|
|
if (!kmapped_page || kpos != (pos & PAGE_MASK)) {
|
|
|
|
struct page *page;
|
|
|
|
|
|
|
|
page = get_arg_page(bprm, pos, 1);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (!page) {
|
2007-07-19 12:48:16 +04:00
|
|
|
ret = -E2BIG;
|
2005-04-17 02:20:36 +04:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
if (kmapped_page) {
|
|
|
|
flush_kernel_dcache_page(kmapped_page);
|
2005-04-17 02:20:36 +04:00
|
|
|
kunmap(kmapped_page);
|
2007-07-19 12:48:16 +04:00
|
|
|
put_arg_page(kmapped_page);
|
|
|
|
}
|
2005-04-17 02:20:36 +04:00
|
|
|
kmapped_page = page;
|
|
|
|
kaddr = kmap(kmapped_page);
|
2007-07-19 12:48:16 +04:00
|
|
|
kpos = pos & PAGE_MASK;
|
|
|
|
flush_arg_page(bprm, kpos, kmapped_page);
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
2007-07-19 12:48:16 +04:00
|
|
|
if (copy_from_user(kaddr+offset, str, bytes_to_copy)) {
|
2005-04-17 02:20:36 +04:00
|
|
|
ret = -EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
ret = 0;
|
|
|
|
out:
|
2007-07-19 12:48:16 +04:00
|
|
|
if (kmapped_page) {
|
|
|
|
flush_kernel_dcache_page(kmapped_page);
|
2005-04-17 02:20:36 +04:00
|
|
|
kunmap(kmapped_page);
|
2007-07-19 12:48:16 +04:00
|
|
|
put_arg_page(kmapped_page);
|
|
|
|
}
|
2005-04-17 02:20:36 +04:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Like copy_strings, but get argv and its values from kernel memory.
|
|
|
|
*/
|
|
|
|
int copy_strings_kernel(int argc,char ** argv, struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
int r;
|
|
|
|
mm_segment_t oldfs = get_fs();
|
|
|
|
set_fs(KERNEL_DS);
|
|
|
|
r = copy_strings(argc, (char __user * __user *)argv, bprm);
|
|
|
|
set_fs(oldfs);
|
|
|
|
return r;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(copy_strings_kernel);
|
|
|
|
|
|
|
|
#ifdef CONFIG_MMU
|
2007-07-19 12:48:16 +04:00
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
2007-07-19 12:48:16 +04:00
|
|
|
* During bprm_mm_init(), we create a temporary stack at STACK_TOP_MAX. Once
|
|
|
|
* the binfmt code determines where the new stack should reside, we shift it to
|
|
|
|
* its final location. The process proceeds as follows:
|
2005-04-17 02:20:36 +04:00
|
|
|
*
|
2007-07-19 12:48:16 +04:00
|
|
|
* 1) Use shift to calculate the new vma endpoints.
|
|
|
|
* 2) Extend vma to cover both the old and new ranges. This ensures the
|
|
|
|
* arguments passed to subsequent functions are consistent.
|
|
|
|
* 3) Move vma's page tables to the new range.
|
|
|
|
* 4) Free up any cleared pgd range.
|
|
|
|
* 5) Shrink the vma to cover only the new range.
|
2005-04-17 02:20:36 +04:00
|
|
|
*/
|
2007-07-19 12:48:16 +04:00
|
|
|
static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
struct mm_struct *mm = vma->vm_mm;
|
2007-07-19 12:48:16 +04:00
|
|
|
unsigned long old_start = vma->vm_start;
|
|
|
|
unsigned long old_end = vma->vm_end;
|
|
|
|
unsigned long length = old_end - old_start;
|
|
|
|
unsigned long new_start = old_start - shift;
|
|
|
|
unsigned long new_end = old_end - shift;
|
|
|
|
struct mmu_gather *tlb;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
BUG_ON(new_start > new_end);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
/*
|
|
|
|
* ensure there are no vmas between where we want to go
|
|
|
|
* and where we are
|
|
|
|
*/
|
|
|
|
if (vma != find_vma(mm, new_start))
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* cover the whole range: [new_start, old_end)
|
|
|
|
*/
|
|
|
|
vma_adjust(vma, new_start, old_end, vma->vm_pgoff, NULL);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* move the page tables downwards, on failure we rely on
|
|
|
|
* process cleanup to remove whatever mess we made.
|
|
|
|
*/
|
|
|
|
if (length != move_page_tables(vma, old_start,
|
|
|
|
vma, new_start, length))
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
lru_add_drain();
|
|
|
|
tlb = tlb_gather_mmu(mm, 0);
|
|
|
|
if (new_end > old_start) {
|
|
|
|
/*
|
|
|
|
* when the old and new regions overlap clear from new_end.
|
|
|
|
*/
|
2008-07-24 08:27:10 +04:00
|
|
|
free_pgd_range(tlb, new_end, old_end, new_end,
|
2007-07-19 12:48:16 +04:00
|
|
|
vma->vm_next ? vma->vm_next->vm_start : 0);
|
|
|
|
} else {
|
|
|
|
/*
|
|
|
|
* otherwise, clean from old_start; this is done to not touch
|
|
|
|
* the address space in [new_end, old_start) some architectures
|
|
|
|
* have constraints on va-space that make this illegal (IA64) -
|
|
|
|
* for the others its just a little faster.
|
|
|
|
*/
|
2008-07-24 08:27:10 +04:00
|
|
|
free_pgd_range(tlb, old_start, old_end, new_end,
|
2007-07-19 12:48:16 +04:00
|
|
|
vma->vm_next ? vma->vm_next->vm_start : 0);
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
2007-07-19 12:48:16 +04:00
|
|
|
tlb_finish_mmu(tlb, new_end, old_end);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* shrink the vma to just the new range.
|
|
|
|
*/
|
|
|
|
vma_adjust(vma, new_start, new_end, vma->vm_pgoff, NULL);
|
|
|
|
|
|
|
|
return 0;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
#define EXTRA_STACK_VM_PAGES 20 /* random */
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
/*
|
|
|
|
* Finalizes the stack vm_area_struct. The flags and permissions are updated,
|
|
|
|
* the stack is optionally relocated, and some extra space is added.
|
|
|
|
*/
|
2005-04-17 02:20:36 +04:00
|
|
|
int setup_arg_pages(struct linux_binprm *bprm,
|
|
|
|
unsigned long stack_top,
|
|
|
|
int executable_stack)
|
|
|
|
{
|
2007-07-19 12:48:16 +04:00
|
|
|
unsigned long ret;
|
|
|
|
unsigned long stack_shift;
|
2005-04-17 02:20:36 +04:00
|
|
|
struct mm_struct *mm = current->mm;
|
2007-07-19 12:48:16 +04:00
|
|
|
struct vm_area_struct *vma = bprm->vma;
|
|
|
|
struct vm_area_struct *prev = NULL;
|
|
|
|
unsigned long vm_flags;
|
|
|
|
unsigned long stack_base;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
#ifdef CONFIG_STACK_GROWSUP
|
|
|
|
/* Limit stack size to 1GB */
|
|
|
|
stack_base = current->signal->rlim[RLIMIT_STACK].rlim_max;
|
|
|
|
if (stack_base > (1 << 30))
|
|
|
|
stack_base = 1 << 30;
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
/* Make sure we didn't let the argument array grow too large. */
|
|
|
|
if (vma->vm_end - vma->vm_start > stack_base)
|
|
|
|
return -ENOMEM;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
stack_base = PAGE_ALIGN(stack_top - stack_base);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
stack_shift = vma->vm_start - stack_base;
|
|
|
|
mm->arg_start = bprm->p - stack_shift;
|
|
|
|
bprm->p = vma->vm_end - stack_shift;
|
2005-04-17 02:20:36 +04:00
|
|
|
#else
|
2007-07-19 12:48:16 +04:00
|
|
|
stack_top = arch_align_stack(stack_top);
|
|
|
|
stack_top = PAGE_ALIGN(stack_top);
|
|
|
|
stack_shift = vma->vm_end - stack_top;
|
|
|
|
|
|
|
|
bprm->p -= stack_shift;
|
2005-04-17 02:20:36 +04:00
|
|
|
mm->arg_start = bprm->p;
|
|
|
|
#endif
|
|
|
|
|
|
|
|
if (bprm->loader)
|
2007-07-19 12:48:16 +04:00
|
|
|
bprm->loader -= stack_shift;
|
|
|
|
bprm->exec -= stack_shift;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
down_write(&mm->mmap_sem);
|
2008-07-11 00:19:20 +04:00
|
|
|
vm_flags = VM_STACK_FLAGS;
|
2007-07-19 12:48:16 +04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Adjust stack execute permissions; explicitly enable for
|
|
|
|
* EXSTACK_ENABLE_X, disable for EXSTACK_DISABLE_X and leave alone
|
|
|
|
* (arch default) otherwise.
|
|
|
|
*/
|
|
|
|
if (unlikely(executable_stack == EXSTACK_ENABLE_X))
|
|
|
|
vm_flags |= VM_EXEC;
|
|
|
|
else if (executable_stack == EXSTACK_DISABLE_X)
|
|
|
|
vm_flags &= ~VM_EXEC;
|
|
|
|
vm_flags |= mm->def_flags;
|
|
|
|
|
|
|
|
ret = mprotect_fixup(vma, &prev, vma->vm_start, vma->vm_end,
|
|
|
|
vm_flags);
|
|
|
|
if (ret)
|
|
|
|
goto out_unlock;
|
|
|
|
BUG_ON(prev != vma);
|
|
|
|
|
|
|
|
/* Move stack pages down in memory. */
|
|
|
|
if (stack_shift) {
|
|
|
|
ret = shift_arg_pages(vma, stack_shift);
|
|
|
|
if (ret) {
|
2005-04-17 02:20:36 +04:00
|
|
|
up_write(&mm->mmap_sem);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
#ifdef CONFIG_STACK_GROWSUP
|
|
|
|
stack_base = vma->vm_end + EXTRA_STACK_VM_PAGES * PAGE_SIZE;
|
|
|
|
#else
|
|
|
|
stack_base = vma->vm_start - EXTRA_STACK_VM_PAGES * PAGE_SIZE;
|
|
|
|
#endif
|
|
|
|
ret = expand_stack(vma, stack_base);
|
|
|
|
if (ret)
|
|
|
|
ret = -EFAULT;
|
|
|
|
|
|
|
|
out_unlock:
|
2005-04-17 02:20:36 +04:00
|
|
|
up_write(&mm->mmap_sem);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(setup_arg_pages);
|
|
|
|
|
|
|
|
#endif /* CONFIG_MMU */
|
|
|
|
|
|
|
|
struct file *open_exec(const char *name)
|
|
|
|
{
|
|
|
|
struct nameidata nd;
|
|
|
|
struct file *file;
|
2008-05-19 09:53:34 +04:00
|
|
|
int err;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2008-05-19 09:53:34 +04:00
|
|
|
err = path_lookup_open(AT_FDCWD, name, LOOKUP_FOLLOW, &nd,
|
|
|
|
FMODE_READ|FMODE_EXEC);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
err = -EACCES;
|
|
|
|
if (!S_ISREG(nd.path.dentry->d_inode->i_mode))
|
|
|
|
goto out_path_put;
|
|
|
|
|
2008-07-22 08:02:33 +04:00
|
|
|
if (nd.path.mnt->mnt_flags & MNT_NOEXEC)
|
|
|
|
goto out_path_put;
|
|
|
|
|
2008-10-24 11:59:29 +04:00
|
|
|
err = inode_permission(nd.path.dentry->d_inode, MAY_EXEC | MAY_OPEN);
|
2008-05-19 09:53:34 +04:00
|
|
|
if (err)
|
|
|
|
goto out_path_put;
|
|
|
|
|
|
|
|
file = nameidata_to_filp(&nd, O_RDONLY|O_LARGEFILE);
|
|
|
|
if (IS_ERR(file))
|
|
|
|
return file;
|
|
|
|
|
|
|
|
err = deny_write_access(file);
|
|
|
|
if (err) {
|
|
|
|
fput(file);
|
|
|
|
goto out;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
2008-05-19 09:53:34 +04:00
|
|
|
return file;
|
|
|
|
|
|
|
|
out_path_put:
|
|
|
|
release_open_intent(&nd);
|
|
|
|
path_put(&nd.path);
|
|
|
|
out:
|
|
|
|
return ERR_PTR(err);
|
|
|
|
}
|
2005-04-17 02:20:36 +04:00
|
|
|
EXPORT_SYMBOL(open_exec);
|
|
|
|
|
|
|
|
int kernel_read(struct file *file, unsigned long offset,
|
|
|
|
char *addr, unsigned long count)
|
|
|
|
{
|
|
|
|
mm_segment_t old_fs;
|
|
|
|
loff_t pos = offset;
|
|
|
|
int result;
|
|
|
|
|
|
|
|
old_fs = get_fs();
|
|
|
|
set_fs(get_ds());
|
|
|
|
/* The cast to a user pointer is valid due to the set_fs() */
|
|
|
|
result = vfs_read(file, (void __user *)addr, count, &pos);
|
|
|
|
set_fs(old_fs);
|
|
|
|
return result;
|
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(kernel_read);
|
|
|
|
|
|
|
|
static int exec_mmap(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
struct task_struct *tsk;
|
|
|
|
struct mm_struct * old_mm, *active_mm;
|
|
|
|
|
|
|
|
/* Notify parent that we're no longer interested in the old VM */
|
|
|
|
tsk = current;
|
|
|
|
old_mm = current->mm;
|
|
|
|
mm_release(tsk, old_mm);
|
|
|
|
|
|
|
|
if (old_mm) {
|
|
|
|
/*
|
|
|
|
* Make sure that if there is a core dump in progress
|
|
|
|
* for the old mm, we get out and die instead of going
|
|
|
|
* through with the exec. We must hold mmap_sem around
|
2008-07-25 12:47:41 +04:00
|
|
|
* checking core_state and changing tsk->mm.
|
2005-04-17 02:20:36 +04:00
|
|
|
*/
|
|
|
|
down_read(&old_mm->mmap_sem);
|
2008-07-25 12:47:41 +04:00
|
|
|
if (unlikely(old_mm->core_state)) {
|
2005-04-17 02:20:36 +04:00
|
|
|
up_read(&old_mm->mmap_sem);
|
|
|
|
return -EINTR;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
task_lock(tsk);
|
|
|
|
active_mm = tsk->active_mm;
|
|
|
|
tsk->mm = mm;
|
|
|
|
tsk->active_mm = mm;
|
|
|
|
activate_mm(active_mm, mm);
|
|
|
|
task_unlock(tsk);
|
|
|
|
arch_pick_mmap_layout(mm);
|
|
|
|
if (old_mm) {
|
|
|
|
up_read(&old_mm->mmap_sem);
|
2006-04-01 03:13:38 +04:00
|
|
|
BUG_ON(active_mm != old_mm);
|
mm owner: fix race between swapoff and exit
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on. The condition occurs when
try_to_unuse() runs in parallel with an exiting task. A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.
CPU0 CPU1
try_to_unuse
looks at mm = task0->mm
increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
mmput(mm) decrements mm->mm_users
task0 freed
dereferencing mm->owner fails
The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.
Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.
Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.
Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same. And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.
Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29 02:09:31 +04:00
|
|
|
mm_update_next_owner(old_mm);
|
2005-04-17 02:20:36 +04:00
|
|
|
mmput(old_mm);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
mmdrop(active_mm);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This function makes sure the current process has its own signal table,
|
|
|
|
* so that flush_signal_handlers can later reset the handlers without
|
|
|
|
* disturbing other processes. (Other processes might share the signal
|
|
|
|
* table via the CLONE_SIGHAND option to clone().)
|
|
|
|
*/
|
2006-01-15 00:20:43 +03:00
|
|
|
static int de_thread(struct task_struct *tsk)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
struct signal_struct *sig = tsk->signal;
|
2007-10-17 10:27:22 +04:00
|
|
|
struct sighand_struct *oldsighand = tsk->sighand;
|
2005-04-17 02:20:36 +04:00
|
|
|
spinlock_t *lock = &oldsighand->siglock;
|
|
|
|
int count;
|
|
|
|
|
2006-09-27 12:51:13 +04:00
|
|
|
if (thread_group_empty(tsk))
|
2005-04-17 02:20:36 +04:00
|
|
|
goto no_thread_group;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Kill all other threads in the thread group.
|
|
|
|
*/
|
|
|
|
spin_lock_irq(lock);
|
2008-02-05 09:27:24 +03:00
|
|
|
if (signal_group_exit(sig)) {
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* Another group action in progress, just
|
|
|
|
* return so that the signal is processed.
|
|
|
|
*/
|
|
|
|
spin_unlock_irq(lock);
|
|
|
|
return -EAGAIN;
|
|
|
|
}
|
2008-02-05 09:27:24 +03:00
|
|
|
sig->group_exit_task = tsk;
|
2006-09-27 12:51:13 +04:00
|
|
|
zap_other_threads(tsk);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2008-02-08 15:19:19 +03:00
|
|
|
/* Account for the thread group leader hanging around: */
|
|
|
|
count = thread_group_leader(tsk) ? 1 : 2;
|
2007-10-17 10:27:23 +04:00
|
|
|
sig->notify_count = count;
|
2005-04-17 02:20:36 +04:00
|
|
|
while (atomic_read(&sig->count) > count) {
|
|
|
|
__set_current_state(TASK_UNINTERRUPTIBLE);
|
|
|
|
spin_unlock_irq(lock);
|
|
|
|
schedule();
|
|
|
|
spin_lock_irq(lock);
|
|
|
|
}
|
|
|
|
spin_unlock_irq(lock);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* At this point all other threads have exited, all we have to
|
|
|
|
* do is to wait for the thread group leader to become inactive,
|
|
|
|
* and to assume its PID:
|
|
|
|
*/
|
2006-09-27 12:51:13 +04:00
|
|
|
if (!thread_group_leader(tsk)) {
|
2008-12-02 01:18:16 +03:00
|
|
|
struct task_struct *leader = tsk->group_leader;
|
2007-10-17 10:27:23 +04:00
|
|
|
|
2008-04-30 11:53:12 +04:00
|
|
|
sig->notify_count = -1; /* for exit_notify() */
|
2007-10-17 10:27:23 +04:00
|
|
|
for (;;) {
|
|
|
|
write_lock_irq(&tasklist_lock);
|
|
|
|
if (likely(leader->exit_state))
|
|
|
|
break;
|
|
|
|
__set_current_state(TASK_UNINTERRUPTIBLE);
|
|
|
|
write_unlock_irq(&tasklist_lock);
|
|
|
|
schedule();
|
|
|
|
}
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2006-04-11 09:54:16 +04:00
|
|
|
/*
|
|
|
|
* The only record we have of the real-time age of a
|
|
|
|
* process, regardless of execs it's done, is start_time.
|
|
|
|
* All the past CPU time is accumulated in signal_struct
|
|
|
|
* from sister threads now dead. But in this non-leader
|
|
|
|
* exec, nothing survives from the original leader thread,
|
|
|
|
* whose birth marks the true age of this process now.
|
|
|
|
* When we take on its identity by switching to its PID, we
|
|
|
|
* also take its birthdate (always earlier than our own).
|
|
|
|
*/
|
2006-09-27 12:51:13 +04:00
|
|
|
tsk->start_time = leader->start_time;
|
2006-04-11 09:54:16 +04:00
|
|
|
|
2007-10-19 10:40:18 +04:00
|
|
|
BUG_ON(!same_thread_group(leader, tsk));
|
|
|
|
BUG_ON(has_group_leader_pid(tsk));
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* An exec() starts a new thread group with the
|
|
|
|
* TGID of the previous thread group. Rehash the
|
|
|
|
* two threads with a switched PID, and release
|
|
|
|
* the former thread group leader:
|
|
|
|
*/
|
2006-03-29 04:11:03 +04:00
|
|
|
|
|
|
|
/* Become a process group leader with the old leader's pid.
|
2006-09-27 12:51:06 +04:00
|
|
|
* The old leader becomes a thread of the this thread group.
|
|
|
|
* Note: The old leader also uses this pid until release_task
|
2006-03-29 04:11:03 +04:00
|
|
|
* is called. Odd but simple and correct.
|
|
|
|
*/
|
2006-09-27 12:51:13 +04:00
|
|
|
detach_pid(tsk, PIDTYPE_PID);
|
|
|
|
tsk->pid = leader->pid;
|
2007-10-19 10:39:51 +04:00
|
|
|
attach_pid(tsk, PIDTYPE_PID, task_pid(leader));
|
2006-09-27 12:51:13 +04:00
|
|
|
transfer_pid(leader, tsk, PIDTYPE_PGID);
|
|
|
|
transfer_pid(leader, tsk, PIDTYPE_SID);
|
|
|
|
list_replace_rcu(&leader->tasks, &tsk->tasks);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2006-09-27 12:51:13 +04:00
|
|
|
tsk->group_leader = tsk;
|
|
|
|
leader->group_leader = tsk;
|
2006-04-11 03:16:49 +04:00
|
|
|
|
2006-09-27 12:51:13 +04:00
|
|
|
tsk->exit_signal = SIGCHLD;
|
2005-11-24 00:37:43 +03:00
|
|
|
|
|
|
|
BUG_ON(leader->exit_state != EXIT_ZOMBIE);
|
|
|
|
leader->exit_state = EXIT_DEAD;
|
2005-04-17 02:20:36 +04:00
|
|
|
write_unlock_irq(&tasklist_lock);
|
2008-12-02 01:18:16 +03:00
|
|
|
|
|
|
|
release_task(leader);
|
2008-02-05 09:27:24 +03:00
|
|
|
}
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-10-17 10:27:23 +04:00
|
|
|
sig->group_exit_task = NULL;
|
|
|
|
sig->notify_count = 0;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
no_thread_group:
|
|
|
|
exit_itimers(sig);
|
2008-05-26 20:55:42 +04:00
|
|
|
flush_itimer_signals();
|
2005-11-07 21:12:43 +03:00
|
|
|
|
2007-10-17 10:27:22 +04:00
|
|
|
if (atomic_read(&oldsighand->count) != 1) {
|
|
|
|
struct sighand_struct *newsighand;
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
2007-10-17 10:27:22 +04:00
|
|
|
* This ->sighand is shared with the CLONE_SIGHAND
|
|
|
|
* but not CLONE_THREAD task, switch to the new one.
|
2005-04-17 02:20:36 +04:00
|
|
|
*/
|
2007-10-17 10:27:22 +04:00
|
|
|
newsighand = kmem_cache_alloc(sighand_cachep, GFP_KERNEL);
|
|
|
|
if (!newsighand)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
atomic_set(&newsighand->count, 1);
|
|
|
|
memcpy(newsighand->action, oldsighand->action,
|
|
|
|
sizeof(newsighand->action));
|
|
|
|
|
|
|
|
write_lock_irq(&tasklist_lock);
|
|
|
|
spin_lock(&oldsighand->siglock);
|
2006-09-27 12:51:13 +04:00
|
|
|
rcu_assign_pointer(tsk->sighand, newsighand);
|
2005-04-17 02:20:36 +04:00
|
|
|
spin_unlock(&oldsighand->siglock);
|
|
|
|
write_unlock_irq(&tasklist_lock);
|
|
|
|
|
signal/timer/event: signalfd core
This patch series implements the new signalfd() system call.
I took part of the original Linus code (and you know how badly it can be
broken :), and I added even more breakage ;) Signals are fetched from the same
signal queue used by the process, so signalfd will compete with standard
kernel delivery in dequeue_signal(). If you want to reliably fetch signals on
the signalfd file, you need to block them with sigprocmask(SIG_BLOCK). This
seems to be working fine on my Dual Opteron machine. I made a quick test
program for it:
http://www.xmailserver.org/signafd-test.c
The signalfd() system call implements signal delivery into a file descriptor
receiver. The signalfd file descriptor if created with the following API:
int signalfd(int ufd, const sigset_t *mask, size_t masksize);
The "ufd" parameter allows to change an existing signalfd sigmask, w/out going
to close/create cycle (Linus idea). Use "ufd" == -1 if you want a brand new
signalfd file.
The "mask" allows to specify the signal mask of signals that we are interested
in. The "masksize" parameter is the size of "mask".
The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can use
used together with epoll(2) too.
The read(2) system call will return a "struct signalfd_siginfo" structure in
the userspace supplied buffer. The return value is the number of bytes copied
in the supplied buffer, or -1 in case of error. The read(2) call can also
return 0, in case the sighand structure to which the signalfd was attached,
has been orphaned. The O_NONBLOCK flag is also supported, and read(2) will
return -EAGAIN in case no signal is available.
If the size of the buffer passed to read(2) is lower than sizeof(struct
signalfd_siginfo), -EINVAL is returned. A read from the signalfd can also
return -ERESTARTSYS in case a signal hits the process. The format of the
struct signalfd_siginfo is, and the valid fields depends of the (->code &
__SI_MASK) value, in the same way a struct siginfo would:
struct signalfd_siginfo {
__u32 signo; /* si_signo */
__s32 err; /* si_errno */
__s32 code; /* si_code */
__u32 pid; /* si_pid */
__u32 uid; /* si_uid */
__s32 fd; /* si_fd */
__u32 tid; /* si_fd */
__u32 band; /* si_band */
__u32 overrun; /* si_overrun */
__u32 trapno; /* si_trapno */
__s32 status; /* si_status */
__s32 svint; /* si_int */
__u64 svptr; /* si_ptr */
__u64 utime; /* si_utime */
__u64 stime; /* si_stime */
__u64 addr; /* si_addr */
};
[akpm@linux-foundation.org: fix signalfd_copyinfo() on i386]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 09:23:13 +04:00
|
|
|
__cleanup_sighand(oldsighand);
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
2006-09-27 12:51:13 +04:00
|
|
|
BUG_ON(!thread_group_leader(tsk));
|
2005-04-17 02:20:36 +04:00
|
|
|
return 0;
|
|
|
|
}
|
2007-10-17 10:27:22 +04:00
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* These functions flushes out all traces of the currently running executable
|
|
|
|
* so that a new one can be started
|
|
|
|
*/
|
2006-01-15 00:20:43 +03:00
|
|
|
static void flush_old_files(struct files_struct * files)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
long j = -1;
|
2005-09-10 00:04:10 +04:00
|
|
|
struct fdtable *fdt;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
spin_lock(&files->file_lock);
|
|
|
|
for (;;) {
|
|
|
|
unsigned long set, i;
|
|
|
|
|
|
|
|
j++;
|
|
|
|
i = j * __NFDBITS;
|
2005-09-10 00:04:10 +04:00
|
|
|
fdt = files_fdtable(files);
|
2006-12-10 13:21:12 +03:00
|
|
|
if (i >= fdt->max_fds)
|
2005-04-17 02:20:36 +04:00
|
|
|
break;
|
2005-09-10 00:04:10 +04:00
|
|
|
set = fdt->close_on_exec->fds_bits[j];
|
2005-04-17 02:20:36 +04:00
|
|
|
if (!set)
|
|
|
|
continue;
|
2005-09-10 00:04:10 +04:00
|
|
|
fdt->close_on_exec->fds_bits[j] = 0;
|
2005-04-17 02:20:36 +04:00
|
|
|
spin_unlock(&files->file_lock);
|
|
|
|
for ( ; set ; i++,set >>= 1) {
|
|
|
|
if (set & 1) {
|
|
|
|
sys_close(i);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
spin_lock(&files->file_lock);
|
|
|
|
|
|
|
|
}
|
|
|
|
spin_unlock(&files->file_lock);
|
|
|
|
}
|
|
|
|
|
2008-02-05 09:27:21 +03:00
|
|
|
char *get_task_comm(char *buf, struct task_struct *tsk)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
/* buf must be at least sizeof(tsk->comm) in size */
|
|
|
|
task_lock(tsk);
|
|
|
|
strncpy(buf, tsk->comm, sizeof(tsk->comm));
|
|
|
|
task_unlock(tsk);
|
2008-02-05 09:27:21 +03:00
|
|
|
return buf;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
void set_task_comm(struct task_struct *tsk, char *buf)
|
|
|
|
{
|
|
|
|
task_lock(tsk);
|
|
|
|
strlcpy(tsk->comm, buf, sizeof(tsk->comm));
|
|
|
|
task_unlock(tsk);
|
|
|
|
}
|
|
|
|
|
|
|
|
int flush_old_exec(struct linux_binprm * bprm)
|
|
|
|
{
|
|
|
|
char * name;
|
|
|
|
int i, ch, retval;
|
|
|
|
char tcomm[sizeof(current->comm)];
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure we have a private signal table and that
|
|
|
|
* we are unassociated from the previous thread group.
|
|
|
|
*/
|
|
|
|
retval = de_thread(current);
|
|
|
|
if (retval)
|
|
|
|
goto out;
|
|
|
|
|
2008-04-29 12:01:36 +04:00
|
|
|
set_mm_exe_file(bprm->mm, bprm->file);
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* Release all of the old mmap stuff
|
|
|
|
*/
|
|
|
|
retval = exec_mmap(bprm->mm);
|
|
|
|
if (retval)
|
2008-04-22 13:11:59 +04:00
|
|
|
goto out;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
bprm->mm = NULL; /* We're using it now */
|
|
|
|
|
|
|
|
/* This is the point of no return */
|
|
|
|
current->sas_ss_sp = current->sas_ss_size = 0;
|
|
|
|
|
2008-11-14 02:39:05 +03:00
|
|
|
if (current_euid() == current_uid() && current_egid() == current_gid())
|
2007-07-19 12:48:27 +04:00
|
|
|
set_dumpable(current->mm, 1);
|
2005-06-23 11:09:43 +04:00
|
|
|
else
|
2007-07-19 12:48:27 +04:00
|
|
|
set_dumpable(current->mm, suid_dumpable);
|
2005-06-23 11:09:43 +04:00
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
name = bprm->filename;
|
2005-05-06 03:16:12 +04:00
|
|
|
|
|
|
|
/* Copies the binary name from after last slash */
|
2005-04-17 02:20:36 +04:00
|
|
|
for (i=0; (ch = *(name++)) != '\0';) {
|
|
|
|
if (ch == '/')
|
2005-05-06 03:16:12 +04:00
|
|
|
i = 0; /* overwrite what we wrote */
|
2005-04-17 02:20:36 +04:00
|
|
|
else
|
|
|
|
if (i < (sizeof(tcomm) - 1))
|
|
|
|
tcomm[i++] = ch;
|
|
|
|
}
|
|
|
|
tcomm[i] = '\0';
|
|
|
|
set_task_comm(current, tcomm);
|
|
|
|
|
|
|
|
current->flags &= ~PF_RANDOMIZE;
|
|
|
|
flush_thread();
|
|
|
|
|
2006-03-01 03:59:19 +03:00
|
|
|
/* Set the new mm task size. We have to do that late because it may
|
|
|
|
* depend on TIF_32BIT which is only updated in flush_thread() on
|
|
|
|
* some architectures like powerpc
|
|
|
|
*/
|
|
|
|
current->mm->task_size = TASK_SIZE;
|
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
/* install the new credentials */
|
|
|
|
if (bprm->cred->uid != current_euid() ||
|
|
|
|
bprm->cred->gid != current_egid()) {
|
2007-08-17 23:47:58 +04:00
|
|
|
current->pdeath_signal = 0;
|
|
|
|
} else if (file_permission(bprm->file, MAY_READ) ||
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP) {
|
2007-07-19 12:48:27 +04:00
|
|
|
set_dumpable(current->mm, suid_dumpable);
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
current->personality &= ~bprm->per_clear;
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/* An exec changes our domain. We are no longer part of the thread
|
|
|
|
group */
|
|
|
|
|
|
|
|
current->self_exec_id++;
|
|
|
|
|
|
|
|
flush_signal_handlers(current, 0);
|
|
|
|
flush_old_files(current->files);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
out:
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(flush_old_exec);
|
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
/*
|
|
|
|
* install the new credentials for this executable
|
|
|
|
*/
|
|
|
|
void install_exec_creds(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
security_bprm_committing_creds(bprm);
|
|
|
|
|
|
|
|
commit_creds(bprm->cred);
|
|
|
|
bprm->cred = NULL;
|
|
|
|
|
|
|
|
/* cred_exec_mutex must be held at least to this point to prevent
|
|
|
|
* ptrace_attach() from altering our determination of the task's
|
|
|
|
* credentials; any time after this it may be unlocked */
|
|
|
|
|
|
|
|
security_bprm_committed_creds(bprm);
|
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(install_exec_creds);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* determine how safe it is to execute the proposed program
|
|
|
|
* - the caller must hold current->cred_exec_mutex to protect against
|
|
|
|
* PTRACE_ATTACH
|
|
|
|
*/
|
|
|
|
void check_unsafe_exec(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
struct task_struct *p = current;
|
|
|
|
|
|
|
|
bprm->unsafe = tracehook_unsafe_exec(p);
|
|
|
|
|
|
|
|
if (atomic_read(&p->fs->count) > 1 ||
|
|
|
|
atomic_read(&p->files->count) > 1 ||
|
|
|
|
atomic_read(&p->sighand->count) > 1)
|
|
|
|
bprm->unsafe |= LSM_UNSAFE_SHARE;
|
|
|
|
}
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* Fill the binprm structure from the inode.
|
|
|
|
* Check permissions, then read the first 128 (BINPRM_BUF_SIZE) bytes
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
*
|
|
|
|
* This may be called multiple times for binary chains (scripts for example).
|
2005-04-17 02:20:36 +04:00
|
|
|
*/
|
|
|
|
int prepare_binprm(struct linux_binprm *bprm)
|
|
|
|
{
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
umode_t mode;
|
2006-12-08 13:36:35 +03:00
|
|
|
struct inode * inode = bprm->file->f_path.dentry->d_inode;
|
2005-04-17 02:20:36 +04:00
|
|
|
int retval;
|
|
|
|
|
|
|
|
mode = inode->i_mode;
|
|
|
|
if (bprm->file->f_op == NULL)
|
|
|
|
return -EACCES;
|
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
/* clear any previous set[ug]id data from a previous binary */
|
|
|
|
bprm->cred->euid = current_euid();
|
|
|
|
bprm->cred->egid = current_egid();
|
2005-04-17 02:20:36 +04:00
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
if (!(bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)) {
|
2005-04-17 02:20:36 +04:00
|
|
|
/* Set-uid? */
|
|
|
|
if (mode & S_ISUID) {
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
bprm->per_clear |= PER_CLEAR_ON_SETID;
|
|
|
|
bprm->cred->euid = inode->i_uid;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Set-gid? */
|
|
|
|
/*
|
|
|
|
* If setgid is set but no group execute bit then this
|
|
|
|
* is a candidate for mandatory locking, not a setgid
|
|
|
|
* executable.
|
|
|
|
*/
|
|
|
|
if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
bprm->per_clear |= PER_CLEAR_ON_SETID;
|
|
|
|
bprm->cred->egid = inode->i_gid;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* fill in binprm security blob */
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
retval = security_bprm_set_creds(bprm);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (retval)
|
|
|
|
return retval;
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
bprm->cred_prepared = 1;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
memset(bprm->buf, 0, BINPRM_BUF_SIZE);
|
|
|
|
return kernel_read(bprm->file, 0, bprm->buf, BINPRM_BUF_SIZE);
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(prepare_binprm);
|
|
|
|
|
2007-05-08 11:25:16 +04:00
|
|
|
/*
|
|
|
|
* Arguments are '\0' separated strings found at the location bprm->p
|
|
|
|
* points to; chop off the first by relocating brpm->p to right after
|
|
|
|
* the first '\0' encountered.
|
|
|
|
*/
|
2007-07-19 12:48:16 +04:00
|
|
|
int remove_arg_zero(struct linux_binprm *bprm)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
2007-07-19 12:48:16 +04:00
|
|
|
int ret = 0;
|
|
|
|
unsigned long offset;
|
|
|
|
char *kaddr;
|
|
|
|
struct page *page;
|
2007-05-08 11:25:16 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
if (!bprm->argc)
|
|
|
|
return 0;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
do {
|
|
|
|
offset = bprm->p & ~PAGE_MASK;
|
|
|
|
page = get_arg_page(bprm, bprm->p, 0);
|
|
|
|
if (!page) {
|
|
|
|
ret = -EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
kaddr = kmap_atomic(page, KM_USER0);
|
2007-05-08 11:25:16 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
for (; offset < PAGE_SIZE && kaddr[offset];
|
|
|
|
offset++, bprm->p++)
|
|
|
|
;
|
2007-05-08 11:25:16 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
kunmap_atomic(kaddr, KM_USER0);
|
|
|
|
put_arg_page(page);
|
2007-05-08 11:25:16 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
if (offset == PAGE_SIZE)
|
|
|
|
free_arg_page(bprm, (bprm->p >> PAGE_SHIFT) - 1);
|
|
|
|
} while (offset == PAGE_SIZE);
|
2007-05-08 11:25:16 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
bprm->p++;
|
|
|
|
bprm->argc--;
|
|
|
|
ret = 0;
|
2007-05-08 11:25:16 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
out:
|
|
|
|
return ret;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
EXPORT_SYMBOL(remove_arg_zero);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* cycle the list of binary formats handler, until one recognizes the image
|
|
|
|
*/
|
|
|
|
int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs)
|
|
|
|
{
|
tracehook: exec double-reporting fix
The patch 6341c39 "tracehook: exec" introduced a small regression in
2.6.27 regarding binfmt_misc exec event reporting. Since the reporting
is now done in the common search_binary_handler() function, an exec
of a misc binary will result in two (or possibly multiple) exec events
being reported, instead of just a single one, because the misc handler
contains a recursive call to search_binary_handler.
To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple
SIGTRAP signals will in fact cause only a single ptrace intercept, as the
signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger
will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC).
The test program included below demonstrates the problem.
This change fixes the bug by calling tracehook_report_exec() only in the
outermost search_binary_handler() call (bprm->recursion_depth == 0).
The additional change to restore bprm->recursion_depth after each binfmt
load_binary call is actually superfluous for this bug, since we test the
value saved on entry to search_binary_handler(). But it keeps the use of
of the depth count to its most obvious expected meaning. Depending on what
binfmt handlers do in certain cases, there could have been false-positive
tests for recursion limits before this change.
/* Test program using PTRACE_O_TRACEEXEC.
This forks and exec's the first argument with the rest of the arguments,
while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and
then a successful exit, with no other signals or events in between.
Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec:
$ gcc -g traceexec.c -o traceexec
$ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register'
$ echo 'foobar test' > ./foobar
$ chmod +x ./foobar
$ ./traceexec ./foobar; echo $?
==> good <==
foobar test
0
$
==> bad <==
foobar test
unexpected status 0x4057f != 0
3
$
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
static void
wait_for (pid_t child, int expect)
{
int status;
pid_t p = wait (&status);
if (p != child)
{
perror ("wait");
exit (2);
}
if (status != expect)
{
fprintf (stderr, "unexpected status %#x != %#x\n", status, expect);
exit (3);
}
}
int
main (int argc, char **argv)
{
pid_t child = fork ();
if (child < 0)
{
perror ("fork");
return 127;
}
else if (child == 0)
{
ptrace (PTRACE_TRACEME);
raise (SIGUSR1);
execv (argv[1], &argv[1]);
perror ("execve");
_exit (127);
}
wait_for (child, W_STOPCODE (SIGUSR1));
if (ptrace (PTRACE_SETOPTIONS, child,
0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0)
{
perror ("PTRACE_SETOPTIONS");
return 4;
}
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 5;
}
wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8)));
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 6;
}
wait_for (child, W_EXITCODE (0, 0));
return 0;
}
Reported-by: Arnd Bergmann <arnd@arndb.de>
CC: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
2008-12-10 06:36:38 +03:00
|
|
|
unsigned int depth = bprm->recursion_depth;
|
2005-04-17 02:20:36 +04:00
|
|
|
int try,retval;
|
|
|
|
struct linux_binfmt *fmt;
|
2009-01-03 10:16:33 +03:00
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
retval = security_bprm_check(bprm);
|
|
|
|
if (retval)
|
|
|
|
return retval;
|
|
|
|
|
|
|
|
/* kernel module loader fixup */
|
|
|
|
/* so we don't try to load run modprobe in kernel space. */
|
|
|
|
set_fs(USER_DS);
|
2006-04-26 22:04:08 +04:00
|
|
|
|
|
|
|
retval = audit_bprm(bprm);
|
|
|
|
if (retval)
|
|
|
|
return retval;
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
retval = -ENOENT;
|
|
|
|
for (try=0; try<2; try++) {
|
|
|
|
read_lock(&binfmt_lock);
|
2007-10-17 10:26:03 +04:00
|
|
|
list_for_each_entry(fmt, &formats, lh) {
|
2005-04-17 02:20:36 +04:00
|
|
|
int (*fn)(struct linux_binprm *, struct pt_regs *) = fmt->load_binary;
|
|
|
|
if (!fn)
|
|
|
|
continue;
|
|
|
|
if (!try_module_get(fmt->module))
|
|
|
|
continue;
|
|
|
|
read_unlock(&binfmt_lock);
|
|
|
|
retval = fn(bprm, regs);
|
tracehook: exec double-reporting fix
The patch 6341c39 "tracehook: exec" introduced a small regression in
2.6.27 regarding binfmt_misc exec event reporting. Since the reporting
is now done in the common search_binary_handler() function, an exec
of a misc binary will result in two (or possibly multiple) exec events
being reported, instead of just a single one, because the misc handler
contains a recursive call to search_binary_handler.
To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple
SIGTRAP signals will in fact cause only a single ptrace intercept, as the
signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger
will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC).
The test program included below demonstrates the problem.
This change fixes the bug by calling tracehook_report_exec() only in the
outermost search_binary_handler() call (bprm->recursion_depth == 0).
The additional change to restore bprm->recursion_depth after each binfmt
load_binary call is actually superfluous for this bug, since we test the
value saved on entry to search_binary_handler(). But it keeps the use of
of the depth count to its most obvious expected meaning. Depending on what
binfmt handlers do in certain cases, there could have been false-positive
tests for recursion limits before this change.
/* Test program using PTRACE_O_TRACEEXEC.
This forks and exec's the first argument with the rest of the arguments,
while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and
then a successful exit, with no other signals or events in between.
Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec:
$ gcc -g traceexec.c -o traceexec
$ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register'
$ echo 'foobar test' > ./foobar
$ chmod +x ./foobar
$ ./traceexec ./foobar; echo $?
==> good <==
foobar test
0
$
==> bad <==
foobar test
unexpected status 0x4057f != 0
3
$
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
static void
wait_for (pid_t child, int expect)
{
int status;
pid_t p = wait (&status);
if (p != child)
{
perror ("wait");
exit (2);
}
if (status != expect)
{
fprintf (stderr, "unexpected status %#x != %#x\n", status, expect);
exit (3);
}
}
int
main (int argc, char **argv)
{
pid_t child = fork ();
if (child < 0)
{
perror ("fork");
return 127;
}
else if (child == 0)
{
ptrace (PTRACE_TRACEME);
raise (SIGUSR1);
execv (argv[1], &argv[1]);
perror ("execve");
_exit (127);
}
wait_for (child, W_STOPCODE (SIGUSR1));
if (ptrace (PTRACE_SETOPTIONS, child,
0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0)
{
perror ("PTRACE_SETOPTIONS");
return 4;
}
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 5;
}
wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8)));
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 6;
}
wait_for (child, W_EXITCODE (0, 0));
return 0;
}
Reported-by: Arnd Bergmann <arnd@arndb.de>
CC: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
2008-12-10 06:36:38 +03:00
|
|
|
/*
|
|
|
|
* Restore the depth counter to its starting value
|
|
|
|
* in this call, so we don't have to rely on every
|
|
|
|
* load_binary function to restore it on return.
|
|
|
|
*/
|
|
|
|
bprm->recursion_depth = depth;
|
2005-04-17 02:20:36 +04:00
|
|
|
if (retval >= 0) {
|
tracehook: exec double-reporting fix
The patch 6341c39 "tracehook: exec" introduced a small regression in
2.6.27 regarding binfmt_misc exec event reporting. Since the reporting
is now done in the common search_binary_handler() function, an exec
of a misc binary will result in two (or possibly multiple) exec events
being reported, instead of just a single one, because the misc handler
contains a recursive call to search_binary_handler.
To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple
SIGTRAP signals will in fact cause only a single ptrace intercept, as the
signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger
will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC).
The test program included below demonstrates the problem.
This change fixes the bug by calling tracehook_report_exec() only in the
outermost search_binary_handler() call (bprm->recursion_depth == 0).
The additional change to restore bprm->recursion_depth after each binfmt
load_binary call is actually superfluous for this bug, since we test the
value saved on entry to search_binary_handler(). But it keeps the use of
of the depth count to its most obvious expected meaning. Depending on what
binfmt handlers do in certain cases, there could have been false-positive
tests for recursion limits before this change.
/* Test program using PTRACE_O_TRACEEXEC.
This forks and exec's the first argument with the rest of the arguments,
while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and
then a successful exit, with no other signals or events in between.
Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec:
$ gcc -g traceexec.c -o traceexec
$ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register'
$ echo 'foobar test' > ./foobar
$ chmod +x ./foobar
$ ./traceexec ./foobar; echo $?
==> good <==
foobar test
0
$
==> bad <==
foobar test
unexpected status 0x4057f != 0
3
$
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
static void
wait_for (pid_t child, int expect)
{
int status;
pid_t p = wait (&status);
if (p != child)
{
perror ("wait");
exit (2);
}
if (status != expect)
{
fprintf (stderr, "unexpected status %#x != %#x\n", status, expect);
exit (3);
}
}
int
main (int argc, char **argv)
{
pid_t child = fork ();
if (child < 0)
{
perror ("fork");
return 127;
}
else if (child == 0)
{
ptrace (PTRACE_TRACEME);
raise (SIGUSR1);
execv (argv[1], &argv[1]);
perror ("execve");
_exit (127);
}
wait_for (child, W_STOPCODE (SIGUSR1));
if (ptrace (PTRACE_SETOPTIONS, child,
0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0)
{
perror ("PTRACE_SETOPTIONS");
return 4;
}
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 5;
}
wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8)));
if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
perror ("PTRACE_CONT");
return 6;
}
wait_for (child, W_EXITCODE (0, 0));
return 0;
}
Reported-by: Arnd Bergmann <arnd@arndb.de>
CC: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
2008-12-10 06:36:38 +03:00
|
|
|
if (depth == 0)
|
|
|
|
tracehook_report_exec(fmt, bprm, regs);
|
2005-04-17 02:20:36 +04:00
|
|
|
put_binfmt(fmt);
|
|
|
|
allow_write_access(bprm->file);
|
|
|
|
if (bprm->file)
|
|
|
|
fput(bprm->file);
|
|
|
|
bprm->file = NULL;
|
|
|
|
current->did_exec = 1;
|
2005-11-07 11:59:16 +03:00
|
|
|
proc_exec_connector(current);
|
2005-04-17 02:20:36 +04:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
read_lock(&binfmt_lock);
|
|
|
|
put_binfmt(fmt);
|
|
|
|
if (retval != -ENOEXEC || bprm->mm == NULL)
|
|
|
|
break;
|
|
|
|
if (!bprm->file) {
|
|
|
|
read_unlock(&binfmt_lock);
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
read_unlock(&binfmt_lock);
|
|
|
|
if (retval != -ENOEXEC || bprm->mm == NULL) {
|
|
|
|
break;
|
2008-07-09 12:28:40 +04:00
|
|
|
#ifdef CONFIG_MODULES
|
|
|
|
} else {
|
2005-04-17 02:20:36 +04:00
|
|
|
#define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e))
|
|
|
|
if (printable(bprm->buf[0]) &&
|
|
|
|
printable(bprm->buf[1]) &&
|
|
|
|
printable(bprm->buf[2]) &&
|
|
|
|
printable(bprm->buf[3]))
|
|
|
|
break; /* -ENOEXEC */
|
|
|
|
request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(search_binary_handler);
|
|
|
|
|
2008-05-11 00:38:25 +04:00
|
|
|
void free_bprm(struct linux_binprm *bprm)
|
|
|
|
{
|
|
|
|
free_arg_pages(bprm);
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
if (bprm->cred)
|
|
|
|
abort_creds(bprm->cred);
|
2008-05-11 00:38:25 +04:00
|
|
|
kfree(bprm);
|
|
|
|
}
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
/*
|
|
|
|
* sys_execve() executes a new program.
|
|
|
|
*/
|
|
|
|
int do_execve(char * filename,
|
|
|
|
char __user *__user *argv,
|
|
|
|
char __user *__user *envp,
|
|
|
|
struct pt_regs * regs)
|
|
|
|
{
|
|
|
|
struct linux_binprm *bprm;
|
|
|
|
struct file *file;
|
2008-04-22 13:31:30 +04:00
|
|
|
struct files_struct *displaced;
|
2005-04-17 02:20:36 +04:00
|
|
|
int retval;
|
|
|
|
|
2008-04-22 13:31:30 +04:00
|
|
|
retval = unshare_files(&displaced);
|
2008-04-22 13:11:59 +04:00
|
|
|
if (retval)
|
|
|
|
goto out_ret;
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
retval = -ENOMEM;
|
2006-03-25 14:08:13 +03:00
|
|
|
bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (!bprm)
|
2008-04-22 13:11:59 +04:00
|
|
|
goto out_files;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
retval = mutex_lock_interruptible(¤t->cred_exec_mutex);
|
|
|
|
if (retval < 0)
|
|
|
|
goto out_free;
|
|
|
|
|
|
|
|
retval = -ENOMEM;
|
|
|
|
bprm->cred = prepare_exec_creds();
|
|
|
|
if (!bprm->cred)
|
|
|
|
goto out_unlock;
|
|
|
|
check_unsafe_exec(bprm);
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
file = open_exec(filename);
|
|
|
|
retval = PTR_ERR(file);
|
|
|
|
if (IS_ERR(file))
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
goto out_unlock;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
sched_exec();
|
|
|
|
|
|
|
|
bprm->file = file;
|
|
|
|
bprm->filename = filename;
|
|
|
|
bprm->interp = filename;
|
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
retval = bprm_mm_init(bprm);
|
|
|
|
if (retval)
|
|
|
|
goto out_file;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
bprm->argc = count(argv, MAX_ARG_STRINGS);
|
2005-04-17 02:20:36 +04:00
|
|
|
if ((retval = bprm->argc) < 0)
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
goto out;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-07-19 12:48:16 +04:00
|
|
|
bprm->envc = count(envp, MAX_ARG_STRINGS);
|
2005-04-17 02:20:36 +04:00
|
|
|
if ((retval = bprm->envc) < 0)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
retval = prepare_binprm(bprm);
|
|
|
|
if (retval < 0)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
retval = copy_strings_kernel(1, &bprm->filename, bprm);
|
|
|
|
if (retval < 0)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
bprm->exec = bprm->p;
|
|
|
|
retval = copy_strings(bprm->envc, envp, bprm);
|
|
|
|
if (retval < 0)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
retval = copy_strings(bprm->argc, argv, bprm);
|
|
|
|
if (retval < 0)
|
|
|
|
goto out;
|
|
|
|
|
2008-07-25 12:47:37 +04:00
|
|
|
current->flags &= ~PF_KTHREAD;
|
2005-04-17 02:20:36 +04:00
|
|
|
retval = search_binary_handler(bprm,regs);
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
if (retval < 0)
|
|
|
|
goto out;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
/* execve succeeded */
|
|
|
|
mutex_unlock(¤t->cred_exec_mutex);
|
|
|
|
acct_update_integrals(current);
|
|
|
|
free_bprm(bprm);
|
|
|
|
if (displaced)
|
|
|
|
put_files_struct(displaced);
|
|
|
|
return retval;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
out:
|
2005-04-17 02:20:36 +04:00
|
|
|
if (bprm->mm)
|
2007-07-19 12:48:16 +04:00
|
|
|
mmput (bprm->mm);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
out_file:
|
|
|
|
if (bprm->file) {
|
|
|
|
allow_write_access(bprm->file);
|
|
|
|
fput(bprm->file);
|
|
|
|
}
|
CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
The credential bits from struct linux_binprm are, for the most part,
replaced with a single credentials pointer (bprm->cred). This means that
all the creds can be calculated in advance and then applied at the point
of no return with no possibility of failure.
I would like to replace bprm->cap_effective with:
cap_isclear(bprm->cap_effective)
but this seems impossible due to special behaviour for processes of pid 1
(they always retain their parent's capability masks where normally they'd
be changed - see cap_bprm_set_creds()).
The following sequence of events now happens:
(a) At the start of do_execve, the current task's cred_exec_mutex is
locked to prevent PTRACE_ATTACH from obsoleting the calculation of
creds that we make.
(a) prepare_exec_creds() is then called to make a copy of the current
task's credentials and prepare it. This copy is then assigned to
bprm->cred.
This renders security_bprm_alloc() and security_bprm_free()
unnecessary, and so they've been removed.
(b) The determination of unsafe execution is now performed immediately
after (a) rather than later on in the code. The result is stored in
bprm->unsafe for future reference.
(c) prepare_binprm() is called, possibly multiple times.
(i) This applies the result of set[ug]id binaries to the new creds
attached to bprm->cred. Personality bit clearance is recorded,
but now deferred on the basis that the exec procedure may yet
fail.
(ii) This then calls the new security_bprm_set_creds(). This should
calculate the new LSM and capability credentials into *bprm->cred.
This folds together security_bprm_set() and parts of
security_bprm_apply_creds() (these two have been removed).
Anything that might fail must be done at this point.
(iii) bprm->cred_prepared is set to 1.
bprm->cred_prepared is 0 on the first pass of the security
calculations, and 1 on all subsequent passes. This allows SELinux
in (ii) to base its calculations only on the initial script and
not on the interpreter.
(d) flush_old_exec() is called to commit the task to execution. This
performs the following steps with regard to credentials:
(i) Clear pdeath_signal and set dumpable on certain circumstances that
may not be covered by commit_creds().
(ii) Clear any bits in current->personality that were deferred from
(c.i).
(e) install_exec_creds() [compute_creds() as was] is called to install the
new credentials. This performs the following steps with regard to
credentials:
(i) Calls security_bprm_committing_creds() to apply any security
requirements, such as flushing unauthorised files in SELinux, that
must be done before the credentials are changed.
This is made up of bits of security_bprm_apply_creds() and
security_bprm_post_apply_creds(), both of which have been removed.
This function is not allowed to fail; anything that might fail
must have been done in (c.ii).
(ii) Calls commit_creds() to apply the new credentials in a single
assignment (more or less). Possibly pdeath_signal and dumpable
should be part of struct creds.
(iii) Unlocks the task's cred_replace_mutex, thus allowing
PTRACE_ATTACH to take place.
(iv) Clears The bprm->cred pointer as the credentials it was holding
are now immutable.
(v) Calls security_bprm_committed_creds() to apply any security
alterations that must be done after the creds have been changed.
SELinux uses this to flush signals and signal handlers.
(f) If an error occurs before (d.i), bprm_free() will call abort_creds()
to destroy the proposed new credentials and will then unlock
cred_replace_mutex. No changes to the credentials will have been
made.
(2) LSM interface.
A number of functions have been changed, added or removed:
(*) security_bprm_alloc(), ->bprm_alloc_security()
(*) security_bprm_free(), ->bprm_free_security()
Removed in favour of preparing new credentials and modifying those.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
(*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
Removed; split between security_bprm_set_creds(),
security_bprm_committing_creds() and security_bprm_committed_creds().
(*) security_bprm_set(), ->bprm_set_security()
Removed; folded into security_bprm_set_creds().
(*) security_bprm_set_creds(), ->bprm_set_creds()
New. The new credentials in bprm->creds should be checked and set up
as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
second and subsequent calls.
(*) security_bprm_committing_creds(), ->bprm_committing_creds()
(*) security_bprm_committed_creds(), ->bprm_committed_creds()
New. Apply the security effects of the new credentials. This
includes closing unauthorised files in SELinux. This function may not
fail. When the former is called, the creds haven't yet been applied
to the process; when the latter is called, they have.
The former may access bprm->cred, the latter may not.
(3) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) The bprm_security_struct struct has been removed in favour of using
the credentials-under-construction approach.
(c) flush_unauthorized_files() now takes a cred pointer and passes it on
to inode_has_perm(), file_has_perm() and dentry_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:24 +03:00
|
|
|
|
|
|
|
out_unlock:
|
|
|
|
mutex_unlock(¤t->cred_exec_mutex);
|
|
|
|
|
|
|
|
out_free:
|
2008-05-11 00:38:25 +04:00
|
|
|
free_bprm(bprm);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2008-04-22 13:11:59 +04:00
|
|
|
out_files:
|
2008-04-22 13:31:30 +04:00
|
|
|
if (displaced)
|
|
|
|
reset_files_struct(displaced);
|
2005-04-17 02:20:36 +04:00
|
|
|
out_ret:
|
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
int set_binfmt(struct linux_binfmt *new)
|
|
|
|
{
|
|
|
|
struct linux_binfmt *old = current->binfmt;
|
|
|
|
|
|
|
|
if (new) {
|
|
|
|
if (!try_module_get(new->module))
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
current->binfmt = new;
|
|
|
|
if (old)
|
|
|
|
module_put(old->module);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
EXPORT_SYMBOL(set_binfmt);
|
|
|
|
|
|
|
|
/* format_corename will inspect the pattern parameter, and output a
|
|
|
|
* name into corename, which must have space for at least
|
|
|
|
* CORENAME_MAX_SIZE bytes plus one byte for the zero terminator.
|
|
|
|
*/
|
2008-10-19 07:28:22 +04:00
|
|
|
static int format_corename(char *corename, long signr)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
2008-11-14 02:39:18 +03:00
|
|
|
const struct cred *cred = current_cred();
|
2008-07-25 12:47:47 +04:00
|
|
|
const char *pat_ptr = core_pattern;
|
|
|
|
int ispipe = (*pat_ptr == '|');
|
2005-04-17 02:20:36 +04:00
|
|
|
char *out_ptr = corename;
|
|
|
|
char *const out_end = corename + CORENAME_MAX_SIZE;
|
|
|
|
int rc;
|
|
|
|
int pid_in_pattern = 0;
|
|
|
|
|
|
|
|
/* Repeat as long as we have more pattern to process and more output
|
|
|
|
space */
|
|
|
|
while (*pat_ptr) {
|
|
|
|
if (*pat_ptr != '%') {
|
|
|
|
if (out_ptr == out_end)
|
|
|
|
goto out;
|
|
|
|
*out_ptr++ = *pat_ptr++;
|
|
|
|
} else {
|
|
|
|
switch (*++pat_ptr) {
|
|
|
|
case 0:
|
|
|
|
goto out;
|
|
|
|
/* Double percent, output one percent */
|
|
|
|
case '%':
|
|
|
|
if (out_ptr == out_end)
|
|
|
|
goto out;
|
|
|
|
*out_ptr++ = '%';
|
|
|
|
break;
|
|
|
|
/* pid */
|
|
|
|
case 'p':
|
|
|
|
pid_in_pattern = 1;
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
2007-10-19 10:40:14 +04:00
|
|
|
"%d", task_tgid_vnr(current));
|
2005-04-17 02:20:36 +04:00
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
|
|
|
/* uid */
|
|
|
|
case 'u':
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
2008-11-14 02:39:18 +03:00
|
|
|
"%d", cred->uid);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
|
|
|
/* gid */
|
|
|
|
case 'g':
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
2008-11-14 02:39:18 +03:00
|
|
|
"%d", cred->gid);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
|
|
|
/* signal that caused the coredump */
|
|
|
|
case 's':
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
|
|
|
"%ld", signr);
|
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
|
|
|
/* UNIX time of coredump */
|
|
|
|
case 't': {
|
|
|
|
struct timeval tv;
|
|
|
|
do_gettimeofday(&tv);
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
|
|
|
"%lu", tv.tv_sec);
|
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
/* hostname */
|
|
|
|
case 'h':
|
|
|
|
down_read(&uts_sem);
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
2006-10-02 13:18:11 +04:00
|
|
|
"%s", utsname()->nodename);
|
2005-04-17 02:20:36 +04:00
|
|
|
up_read(&uts_sem);
|
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
|
|
|
/* executable */
|
|
|
|
case 'e':
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
|
|
|
"%s", current->comm);
|
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
2007-10-17 10:26:35 +04:00
|
|
|
/* core limit size */
|
|
|
|
case 'c':
|
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
|
|
|
"%lu", current->signal->rlim[RLIMIT_CORE].rlim_cur);
|
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
break;
|
2005-04-17 02:20:36 +04:00
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
++pat_ptr;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
/* Backward compatibility with core_uses_pid:
|
|
|
|
*
|
|
|
|
* If core_pattern does not include a %p (as is the default)
|
|
|
|
* and core_uses_pid is set, then .%pid will be appended to
|
2007-04-17 09:53:13 +04:00
|
|
|
* the filename. Do not do this for piped commands. */
|
2008-10-19 07:28:22 +04:00
|
|
|
if (!ispipe && !pid_in_pattern && core_uses_pid) {
|
2005-04-17 02:20:36 +04:00
|
|
|
rc = snprintf(out_ptr, out_end - out_ptr,
|
2007-10-19 10:40:14 +04:00
|
|
|
".%d", task_tgid_vnr(current));
|
2005-04-17 02:20:36 +04:00
|
|
|
if (rc > out_end - out_ptr)
|
|
|
|
goto out;
|
|
|
|
out_ptr += rc;
|
|
|
|
}
|
2007-04-17 09:53:13 +04:00
|
|
|
out:
|
2005-04-17 02:20:36 +04:00
|
|
|
*out_ptr = 0;
|
2007-04-17 09:53:13 +04:00
|
|
|
return ispipe;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
2008-07-25 12:47:42 +04:00
|
|
|
static int zap_process(struct task_struct *start)
|
2006-06-26 11:26:05 +04:00
|
|
|
{
|
|
|
|
struct task_struct *t;
|
2008-07-25 12:47:42 +04:00
|
|
|
int nr = 0;
|
2006-06-26 11:26:06 +04:00
|
|
|
|
2006-06-26 11:26:07 +04:00
|
|
|
start->signal->flags = SIGNAL_GROUP_EXIT;
|
|
|
|
start->signal->group_stop_count = 0;
|
2006-06-26 11:26:05 +04:00
|
|
|
|
|
|
|
t = start;
|
|
|
|
do {
|
|
|
|
if (t != current && t->mm) {
|
2006-06-26 11:26:06 +04:00
|
|
|
sigaddset(&t->pending.signal, SIGKILL);
|
|
|
|
signal_wake_up(t, 1);
|
2008-07-25 12:47:42 +04:00
|
|
|
nr++;
|
2006-06-26 11:26:05 +04:00
|
|
|
}
|
2008-07-25 12:47:31 +04:00
|
|
|
} while_each_thread(start, t);
|
2008-07-25 12:47:42 +04:00
|
|
|
|
|
|
|
return nr;
|
2006-06-26 11:26:05 +04:00
|
|
|
}
|
|
|
|
|
2006-06-26 11:26:08 +04:00
|
|
|
static inline int zap_threads(struct task_struct *tsk, struct mm_struct *mm,
|
2008-07-25 12:47:42 +04:00
|
|
|
struct core_state *core_state, int exit_code)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
|
|
|
struct task_struct *g, *p;
|
2006-06-26 11:26:09 +04:00
|
|
|
unsigned long flags;
|
2008-07-25 12:47:42 +04:00
|
|
|
int nr = -EAGAIN;
|
2006-06-26 11:26:08 +04:00
|
|
|
|
|
|
|
spin_lock_irq(&tsk->sighand->siglock);
|
2008-02-05 09:27:24 +03:00
|
|
|
if (!signal_group_exit(tsk->signal)) {
|
2008-07-25 12:47:42 +04:00
|
|
|
mm->core_state = core_state;
|
2006-06-26 11:26:08 +04:00
|
|
|
tsk->signal->group_exit_code = exit_code;
|
2008-07-25 12:47:42 +04:00
|
|
|
nr = zap_process(tsk);
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
2006-06-26 11:26:08 +04:00
|
|
|
spin_unlock_irq(&tsk->sighand->siglock);
|
2008-07-25 12:47:42 +04:00
|
|
|
if (unlikely(nr < 0))
|
|
|
|
return nr;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2008-07-25 12:47:42 +04:00
|
|
|
if (atomic_read(&mm->mm_users) == nr + 1)
|
2006-06-26 11:26:09 +04:00
|
|
|
goto done;
|
2008-07-25 12:47:31 +04:00
|
|
|
/*
|
|
|
|
* We should find and kill all tasks which use this mm, and we should
|
2008-07-25 12:47:41 +04:00
|
|
|
* count them correctly into ->nr_threads. We don't take tasklist
|
2008-07-25 12:47:31 +04:00
|
|
|
* lock, but this is safe wrt:
|
|
|
|
*
|
|
|
|
* fork:
|
|
|
|
* None of sub-threads can fork after zap_process(leader). All
|
|
|
|
* processes which were created before this point should be
|
|
|
|
* visible to zap_threads() because copy_process() adds the new
|
|
|
|
* process to the tail of init_task.tasks list, and lock/unlock
|
|
|
|
* of ->siglock provides a memory barrier.
|
|
|
|
*
|
|
|
|
* do_exit:
|
|
|
|
* The caller holds mm->mmap_sem. This means that the task which
|
|
|
|
* uses this mm can't pass exit_mm(), so it can't exit or clear
|
|
|
|
* its ->mm.
|
|
|
|
*
|
|
|
|
* de_thread:
|
|
|
|
* It does list_replace_rcu(&leader->tasks, ¤t->tasks),
|
|
|
|
* we must see either old or new leader, this does not matter.
|
|
|
|
* However, it can change p->sighand, so lock_task_sighand(p)
|
|
|
|
* must be used. Since p->mm != NULL and we hold ->mmap_sem
|
|
|
|
* it can't fail.
|
|
|
|
*
|
|
|
|
* Note also that "g" can be the old leader with ->mm == NULL
|
|
|
|
* and already unhashed and thus removed from ->thread_group.
|
|
|
|
* This is OK, __unhash_process()->list_del_rcu() does not
|
|
|
|
* clear the ->next pointer, we will find the new leader via
|
|
|
|
* next_thread().
|
|
|
|
*/
|
2006-06-26 11:26:08 +04:00
|
|
|
rcu_read_lock();
|
2006-06-26 11:26:05 +04:00
|
|
|
for_each_process(g) {
|
2006-06-26 11:26:09 +04:00
|
|
|
if (g == tsk->group_leader)
|
|
|
|
continue;
|
2008-07-25 12:47:39 +04:00
|
|
|
if (g->flags & PF_KTHREAD)
|
|
|
|
continue;
|
2006-06-26 11:26:05 +04:00
|
|
|
p = g;
|
|
|
|
do {
|
|
|
|
if (p->mm) {
|
2008-07-25 12:47:39 +04:00
|
|
|
if (unlikely(p->mm == mm)) {
|
2006-06-26 11:26:09 +04:00
|
|
|
lock_task_sighand(p, &flags);
|
2008-07-25 12:47:42 +04:00
|
|
|
nr += zap_process(p);
|
2006-06-26 11:26:09 +04:00
|
|
|
unlock_task_sighand(p, &flags);
|
|
|
|
}
|
2006-06-26 11:26:05 +04:00
|
|
|
break;
|
|
|
|
}
|
2008-07-25 12:47:31 +04:00
|
|
|
} while_each_thread(g, p);
|
2006-06-26 11:26:05 +04:00
|
|
|
}
|
2006-06-26 11:26:08 +04:00
|
|
|
rcu_read_unlock();
|
2006-06-26 11:26:09 +04:00
|
|
|
done:
|
2008-07-25 12:47:42 +04:00
|
|
|
atomic_set(&core_state->nr_threads, nr);
|
2008-07-25 12:47:42 +04:00
|
|
|
return nr;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
2008-07-25 12:47:43 +04:00
|
|
|
static int coredump_wait(int exit_code, struct core_state *core_state)
|
2005-04-17 02:20:36 +04:00
|
|
|
{
|
2006-06-26 11:26:08 +04:00
|
|
|
struct task_struct *tsk = current;
|
|
|
|
struct mm_struct *mm = tsk->mm;
|
|
|
|
struct completion *vfork_done;
|
2005-10-31 02:02:47 +03:00
|
|
|
int core_waiters;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2008-07-25 12:47:43 +04:00
|
|
|
init_completion(&core_state->startup);
|
2008-07-25 12:47:44 +04:00
|
|
|
core_state->dumper.task = tsk;
|
|
|
|
core_state->dumper.next = NULL;
|
2008-07-25 12:47:43 +04:00
|
|
|
core_waiters = zap_threads(tsk, mm, core_state, exit_code);
|
2005-10-31 02:02:47 +03:00
|
|
|
up_write(&mm->mmap_sem);
|
|
|
|
|
2006-06-26 11:26:08 +04:00
|
|
|
if (unlikely(core_waiters < 0))
|
|
|
|
goto fail;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Make sure nobody is waiting for us to release the VM,
|
|
|
|
* otherwise we can deadlock when we wait on each other
|
|
|
|
*/
|
|
|
|
vfork_done = tsk->vfork_done;
|
|
|
|
if (vfork_done) {
|
|
|
|
tsk->vfork_done = NULL;
|
|
|
|
complete(vfork_done);
|
|
|
|
}
|
|
|
|
|
2005-10-31 02:02:47 +03:00
|
|
|
if (core_waiters)
|
2008-07-25 12:47:43 +04:00
|
|
|
wait_for_completion(&core_state->startup);
|
2006-06-26 11:26:08 +04:00
|
|
|
fail:
|
|
|
|
return core_waiters;
|
2005-04-17 02:20:36 +04:00
|
|
|
}
|
|
|
|
|
2008-07-25 12:47:46 +04:00
|
|
|
static void coredump_finish(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
struct core_thread *curr, *next;
|
|
|
|
struct task_struct *task;
|
|
|
|
|
|
|
|
next = mm->core_state->dumper.next;
|
|
|
|
while ((curr = next) != NULL) {
|
|
|
|
next = curr->next;
|
|
|
|
task = curr->task;
|
|
|
|
/*
|
|
|
|
* see exit_mm(), curr->task must not see
|
|
|
|
* ->task == NULL before we read ->next.
|
|
|
|
*/
|
|
|
|
smp_mb();
|
|
|
|
curr->task = NULL;
|
|
|
|
wake_up_process(task);
|
|
|
|
}
|
|
|
|
|
|
|
|
mm->core_state = NULL;
|
|
|
|
}
|
|
|
|
|
2007-07-19 12:48:27 +04:00
|
|
|
/*
|
|
|
|
* set_dumpable converts traditional three-value dumpable to two flags and
|
|
|
|
* stores them into mm->flags. It modifies lower two bits of mm->flags, but
|
|
|
|
* these bits are not changed atomically. So get_dumpable can observe the
|
|
|
|
* intermediate state. To avoid doing unexpected behavior, get get_dumpable
|
|
|
|
* return either old dumpable or new one by paying attention to the order of
|
|
|
|
* modifying the bits.
|
|
|
|
*
|
|
|
|
* dumpable | mm->flags (binary)
|
|
|
|
* old new | initial interim final
|
|
|
|
* ---------+-----------------------
|
|
|
|
* 0 1 | 00 01 01
|
|
|
|
* 0 2 | 00 10(*) 11
|
|
|
|
* 1 0 | 01 00 00
|
|
|
|
* 1 2 | 01 11 11
|
|
|
|
* 2 0 | 11 10(*) 00
|
|
|
|
* 2 1 | 11 11 01
|
|
|
|
*
|
|
|
|
* (*) get_dumpable regards interim value of 10 as 11.
|
|
|
|
*/
|
|
|
|
void set_dumpable(struct mm_struct *mm, int value)
|
|
|
|
{
|
|
|
|
switch (value) {
|
|
|
|
case 0:
|
|
|
|
clear_bit(MMF_DUMPABLE, &mm->flags);
|
|
|
|
smp_wmb();
|
|
|
|
clear_bit(MMF_DUMP_SECURELY, &mm->flags);
|
|
|
|
break;
|
|
|
|
case 1:
|
|
|
|
set_bit(MMF_DUMPABLE, &mm->flags);
|
|
|
|
smp_wmb();
|
|
|
|
clear_bit(MMF_DUMP_SECURELY, &mm->flags);
|
|
|
|
break;
|
|
|
|
case 2:
|
|
|
|
set_bit(MMF_DUMP_SECURELY, &mm->flags);
|
|
|
|
smp_wmb();
|
|
|
|
set_bit(MMF_DUMPABLE, &mm->flags);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
int get_dumpable(struct mm_struct *mm)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = mm->flags & 0x3;
|
|
|
|
return (ret >= 2) ? 2 : ret;
|
|
|
|
}
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
int do_coredump(long signr, int exit_code, struct pt_regs * regs)
|
|
|
|
{
|
2008-07-25 12:47:43 +04:00
|
|
|
struct core_state core_state;
|
2005-04-17 02:20:36 +04:00
|
|
|
char corename[CORENAME_MAX_SIZE + 1];
|
|
|
|
struct mm_struct *mm = current->mm;
|
|
|
|
struct linux_binfmt * binfmt;
|
|
|
|
struct inode * inode;
|
|
|
|
struct file * file;
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
const struct cred *old_cred;
|
|
|
|
struct cred *cred;
|
2005-04-17 02:20:36 +04:00
|
|
|
int retval = 0;
|
2005-06-23 11:09:43 +04:00
|
|
|
int flag = 0;
|
2006-10-01 10:29:28 +04:00
|
|
|
int ispipe = 0;
|
2007-10-17 10:26:34 +04:00
|
|
|
unsigned long core_limit = current->signal->rlim[RLIMIT_CORE].rlim_cur;
|
2007-10-17 10:26:35 +04:00
|
|
|
char **helper_argv = NULL;
|
|
|
|
int helper_argc = 0;
|
|
|
|
char *delimit;
|
2005-04-17 02:20:36 +04:00
|
|
|
|
2007-04-19 18:28:21 +04:00
|
|
|
audit_core_dumps(signr);
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
binfmt = current->binfmt;
|
|
|
|
if (!binfmt || !binfmt->core_dump)
|
|
|
|
goto fail;
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
|
|
|
|
cred = prepare_creds();
|
|
|
|
if (!cred) {
|
|
|
|
retval = -ENOMEM;
|
|
|
|
goto fail;
|
|
|
|
}
|
|
|
|
|
2005-04-17 02:20:36 +04:00
|
|
|
down_write(&mm->mmap_sem);
|
2007-11-12 06:13:43 +03:00
|
|
|
/*
|
|
|
|
* If another thread got here first, or we are not dumpable, bail out.
|
|
|
|
*/
|
2008-07-25 12:47:41 +04:00
|
|
|
if (mm->core_state || !get_dumpable(mm)) {
|
2005-04-17 02:20:36 +04:00
|
|
|
up_write(&mm->mmap_sem);
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
put_cred(cred);
|
2005-04-17 02:20:36 +04:00
|
|
|
goto fail;
|
|
|
|
}
|
2005-06-23 11:09:43 +04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We cannot trust fsuid as being the "true" uid of the
|
|
|
|
* process nor do we know its entire history. We only know it
|
|
|
|
* was tainted so we dump it as root in mode 2.
|
|
|
|
*/
|
2007-07-19 12:48:27 +04:00
|
|
|
if (get_dumpable(mm) == 2) { /* Setuid core dump mode */
|
2005-06-23 11:09:43 +04:00
|
|
|
flag = O_EXCL; /* Stop rewrite attacks */
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
cred->fsuid = 0; /* Dump root private */
|
2005-06-23 11:09:43 +04:00
|
|
|
}
|
2005-10-31 02:02:54 +03:00
|
|
|
|
2008-07-25 12:47:43 +04:00
|
|
|
retval = coredump_wait(exit_code, &core_state);
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
if (retval < 0) {
|
|
|
|
put_cred(cred);
|
2005-10-31 02:02:54 +03:00
|
|
|
goto fail;
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
}
|
|
|
|
|
|
|
|
old_cred = override_creds(cred);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Clear any false indication of pending signals that might
|
|
|
|
* be seen by the filesystem code called to write the core file.
|
|
|
|
*/
|
|
|
|
clear_thread_flag(TIF_SIGPENDING);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* lock_kernel() because format_corename() is controlled by sysctl, which
|
|
|
|
* uses lock_kernel()
|
|
|
|
*/
|
|
|
|
lock_kernel();
|
2008-10-19 07:28:22 +04:00
|
|
|
ispipe = format_corename(corename, signr);
|
2005-04-17 02:20:36 +04:00
|
|
|
unlock_kernel();
|
2007-10-17 10:26:34 +04:00
|
|
|
/*
|
|
|
|
* Don't bother to check the RLIMIT_CORE value if core_pattern points
|
|
|
|
* to a pipe. Since we're not writing directly to the filesystem
|
|
|
|
* RLIMIT_CORE doesn't really apply, as no actual core file will be
|
|
|
|
* created unless the pipe reader choses to write out the core file
|
|
|
|
* at which point file size limits and permissions will be imposed
|
|
|
|
* as it does with any other process
|
|
|
|
*/
|
2007-10-17 10:26:35 +04:00
|
|
|
if ((!ispipe) && (core_limit < binfmt->min_coredump))
|
2007-10-17 10:26:34 +04:00
|
|
|
goto fail_unlock;
|
|
|
|
|
2007-04-17 09:53:13 +04:00
|
|
|
if (ispipe) {
|
2007-10-17 10:26:35 +04:00
|
|
|
helper_argv = argv_split(GFP_KERNEL, corename+1, &helper_argc);
|
|
|
|
/* Terminate the string before the first option */
|
|
|
|
delimit = strchr(corename, ' ');
|
|
|
|
if (delimit)
|
|
|
|
*delimit = '\0';
|
2007-10-17 10:26:36 +04:00
|
|
|
delimit = strrchr(helper_argv[0], '/');
|
|
|
|
if (delimit)
|
|
|
|
delimit++;
|
|
|
|
else
|
|
|
|
delimit = helper_argv[0];
|
|
|
|
if (!strcmp(delimit, current->comm)) {
|
|
|
|
printk(KERN_NOTICE "Recursive core dump detected, "
|
|
|
|
"aborting\n");
|
|
|
|
goto fail_unlock;
|
|
|
|
}
|
|
|
|
|
|
|
|
core_limit = RLIM_INFINITY;
|
|
|
|
|
2006-10-01 10:29:28 +04:00
|
|
|
/* SIGPIPE can happen, but it's just never processed */
|
2007-10-17 10:26:36 +04:00
|
|
|
if (call_usermodehelper_pipe(corename+1, helper_argv, NULL,
|
|
|
|
&file)) {
|
2006-10-01 10:29:28 +04:00
|
|
|
printk(KERN_INFO "Core dump to %s pipe failed\n",
|
|
|
|
corename);
|
|
|
|
goto fail_unlock;
|
|
|
|
}
|
|
|
|
} else
|
|
|
|
file = filp_open(corename,
|
2006-12-07 07:40:39 +03:00
|
|
|
O_CREAT | 2 | O_NOFOLLOW | O_LARGEFILE | flag,
|
|
|
|
0600);
|
2005-04-17 02:20:36 +04:00
|
|
|
if (IS_ERR(file))
|
|
|
|
goto fail_unlock;
|
2006-12-08 13:36:35 +03:00
|
|
|
inode = file->f_path.dentry->d_inode;
|
2005-04-17 02:20:36 +04:00
|
|
|
if (inode->i_nlink > 1)
|
|
|
|
goto close_fail; /* multiple links - don't dump */
|
2006-12-08 13:36:35 +03:00
|
|
|
if (!ispipe && d_unhashed(file->f_path.dentry))
|
2005-04-17 02:20:36 +04:00
|
|
|
goto close_fail;
|
|
|
|
|
2006-10-01 10:29:28 +04:00
|
|
|
/* AK: actually i see no reason to not allow this for named pipes etc.,
|
|
|
|
but keep the previous behaviour for now. */
|
|
|
|
if (!ispipe && !S_ISREG(inode->i_mode))
|
2005-04-17 02:20:36 +04:00
|
|
|
goto close_fail;
|
2007-11-28 15:59:18 +03:00
|
|
|
/*
|
|
|
|
* Dont allow local users get cute and trick others to coredump
|
|
|
|
* into their pre-created files:
|
|
|
|
*/
|
2008-11-14 02:39:05 +03:00
|
|
|
if (inode->i_uid != current_fsuid())
|
2007-11-28 15:59:18 +03:00
|
|
|
goto close_fail;
|
2005-04-17 02:20:36 +04:00
|
|
|
if (!file->f_op)
|
|
|
|
goto close_fail;
|
|
|
|
if (!file->f_op->write)
|
|
|
|
goto close_fail;
|
2006-12-08 13:36:35 +03:00
|
|
|
if (!ispipe && do_truncate(file->f_path.dentry, 0, 0, file) != 0)
|
2005-04-17 02:20:36 +04:00
|
|
|
goto close_fail;
|
|
|
|
|
2007-10-17 10:26:34 +04:00
|
|
|
retval = binfmt->core_dump(signr, regs, file, core_limit);
|
2005-04-17 02:20:36 +04:00
|
|
|
|
|
|
|
if (retval)
|
|
|
|
current->signal->group_exit_code |= 0x80;
|
|
|
|
close_fail:
|
|
|
|
filp_close(file, NULL);
|
|
|
|
fail_unlock:
|
2007-10-17 10:26:35 +04:00
|
|
|
if (helper_argv)
|
|
|
|
argv_free(helper_argv);
|
|
|
|
|
CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management. This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.
A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().
With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:
struct cred *new = prepare_creds();
int ret = blah(new);
if (ret < 0) {
abort_creds(new);
return ret;
}
return commit_creds(new);
There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const. The purpose of this is compile-time
discouragement of altering credentials through those pointers. Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:
(1) Its reference count may incremented and decremented.
(2) The keyrings to which it points may be modified, but not replaced.
The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).
This patch and the preceding patches have been tested with the LTP SELinux
testsuite.
This patch makes several logical sets of alteration:
(1) execve().
This now prepares and commits credentials in various places in the
security code rather than altering the current creds directly.
(2) Temporary credential overrides.
do_coredump() and sys_faccessat() now prepare their own credentials and
temporarily override the ones currently on the acting thread, whilst
preventing interference from other threads by holding cred_replace_mutex
on the thread being dumped.
This will be replaced in a future patch by something that hands down the
credentials directly to the functions being called, rather than altering
the task's objective credentials.
(3) LSM interface.
A number of functions have been changed, added or removed:
(*) security_capset_check(), ->capset_check()
(*) security_capset_set(), ->capset_set()
Removed in favour of security_capset().
(*) security_capset(), ->capset()
New. This is passed a pointer to the new creds, a pointer to the old
creds and the proposed capability sets. It should fill in the new
creds or return an error. All pointers, barring the pointer to the
new creds, are now const.
(*) security_bprm_apply_creds(), ->bprm_apply_creds()
Changed; now returns a value, which will cause the process to be
killed if it's an error.
(*) security_task_alloc(), ->task_alloc_security()
Removed in favour of security_prepare_creds().
(*) security_cred_free(), ->cred_free()
New. Free security data attached to cred->security.
(*) security_prepare_creds(), ->cred_prepare()
New. Duplicate any security data attached to cred->security.
(*) security_commit_creds(), ->cred_commit()
New. Apply any security effects for the upcoming installation of new
security by commit_creds().
(*) security_task_post_setuid(), ->task_post_setuid()
Removed in favour of security_task_fix_setuid().
(*) security_task_fix_setuid(), ->task_fix_setuid()
Fix up the proposed new credentials for setuid(). This is used by
cap_set_fix_setuid() to implicitly adjust capabilities in line with
setuid() changes. Changes are made to the new credentials, rather
than the task itself as in security_task_post_setuid().
(*) security_task_reparent_to_init(), ->task_reparent_to_init()
Removed. Instead the task being reparented to init is referred
directly to init's credentials.
NOTE! This results in the loss of some state: SELinux's osid no
longer records the sid of the thread that forked it.
(*) security_key_alloc(), ->key_alloc()
(*) security_key_permission(), ->key_permission()
Changed. These now take cred pointers rather than task pointers to
refer to the security context.
(4) sys_capset().
This has been simplified and uses less locking. The LSM functions it
calls have been merged.
(5) reparent_to_kthreadd().
This gives the current thread the same credentials as init by simply using
commit_thread() to point that way.
(6) __sigqueue_alloc() and switch_uid()
__sigqueue_alloc() can't stop the target task from changing its creds
beneath it, so this function gets a reference to the currently applicable
user_struct which it then passes into the sigqueue struct it returns if
successful.
switch_uid() is now called from commit_creds(), and possibly should be
folded into that. commit_creds() should take care of protecting
__sigqueue_alloc().
(7) [sg]et[ug]id() and co and [sg]et_current_groups.
The set functions now all use prepare_creds(), commit_creds() and
abort_creds() to build and check a new set of credentials before applying
it.
security_task_set[ug]id() is called inside the prepared section. This
guarantees that nothing else will affect the creds until we've finished.
The calling of set_dumpable() has been moved into commit_creds().
Much of the functionality of set_user() has been moved into
commit_creds().
The get functions all simply access the data directly.
(8) security_task_prctl() and cap_task_prctl().
security_task_prctl() has been modified to return -ENOSYS if it doesn't
want to handle a function, or otherwise return the return value directly
rather than through an argument.
Additionally, cap_task_prctl() now prepares a new set of credentials, even
if it doesn't end up using it.
(9) Keyrings.
A number of changes have been made to the keyrings code:
(a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
all been dropped and built in to the credentials functions directly.
They may want separating out again later.
(b) key_alloc() and search_process_keyrings() now take a cred pointer
rather than a task pointer to specify the security context.
(c) copy_creds() gives a new thread within the same thread group a new
thread keyring if its parent had one, otherwise it discards the thread
keyring.
(d) The authorisation key now points directly to the credentials to extend
the search into rather pointing to the task that carries them.
(e) Installing thread, process or session keyrings causes a new set of
credentials to be created, even though it's not strictly necessary for
process or session keyrings (they're shared).
(10) Usermode helper.
The usermode helper code now carries a cred struct pointer in its
subprocess_info struct instead of a new session keyring pointer. This set
of credentials is derived from init_cred and installed on the new process
after it has been cloned.
call_usermodehelper_setup() allocates the new credentials and
call_usermodehelper_freeinfo() discards them if they haven't been used. A
special cred function (prepare_usermodeinfo_creds()) is provided
specifically for call_usermodehelper_setup() to call.
call_usermodehelper_setkeys() adjusts the credentials to sport the
supplied keyring as the new session keyring.
(11) SELinux.
SELinux has a number of changes, in addition to those to support the LSM
interface changes mentioned above:
(a) selinux_setprocattr() no longer does its check for whether the
current ptracer can access processes with the new SID inside the lock
that covers getting the ptracer's SID. Whilst this lock ensures that
the check is done with the ptracer pinned, the result is only valid
until the lock is released, so there's no point doing it inside the
lock.
(12) is_single_threaded().
This function has been extracted from selinux_setprocattr() and put into
a file of its own in the lib/ directory as join_session_keyring() now
wants to use it too.
The code in SELinux just checked to see whether a task shared mm_structs
with other tasks (CLONE_VM), but that isn't good enough. We really want
to know if they're part of the same thread group (CLONE_THREAD).
(13) nfsd.
The NFS server daemon now has to use the COW credentials to set the
credentials it is going to use. It really needs to pass the credentials
down to the functions it calls, but it can't do that until other patches
in this series have been applied.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 02:39:23 +03:00
|
|
|
revert_creds(old_cred);
|
|
|
|
put_cred(cred);
|
2008-07-25 12:47:46 +04:00
|
|
|
coredump_finish(mm);
|
2005-04-17 02:20:36 +04:00
|
|
|
fail:
|
|
|
|
return retval;
|
|
|
|
}
|