x86/tls: Synchronize segment registers in set_thread_area()

The current behavior of set_thread_area() when it modifies a segment that is
currently loaded is a bit confused.

If CS [1] or SS is modified, the change will take effect on return
to userspace because CS and SS are fundamentally always reloaded on
return to userspace.

Similarly, on 32-bit kernels, if DS, ES, FS, or (depending on
configuration) GS refers to a modified segment, the change will take
effect immediately on return to user mode because the entry code
reloads these registers.

If set_thread_area() modifies DS, ES [2], FS, or GS on 64-bit kernels or
GS on 32-bit lazy-GS [3] kernels, however, the segment registers
will be left alone until something (most likely a context switch)
causes them to be reloaded.  This means that behavior visible to
user space is inconsistent.

If set_thread_area() is implicitly called via CLONE_SETTLS, then all
segment registers will be reloaded before the thread starts because
CLONE_SETTLS happens before the initial context switch into the
newly created thread.

Empirically, glibc requires the immediate reload on CLONE_SETTLS --
32-bit glibc on my system does *not* manually reload GS when
creating a new thread.

Before enabling FSGSBASE, we need to figure out what the behavior
will be, as FSGSBASE requires that we reconsider our behavior when,
e.g., GS and GSBASE are out of sync in user mode.  Given that we
must preserve the existing behavior of CLONE_SETTLS, it makes sense
to me that we simply extend similar behavior to all invocations
of set_thread_area().

This patch explicitly updates any segment register referring to a
segment that is targetted by set_thread_area().  If set_thread_area()
deletes the segment, then the segment register will be nulled out.

[1] This can't actually happen since 0e58af4e1d ("x86/tls:
    Disallow unusual TLS segments") but, if it did, this is how it
    would behave.

[2] I strongly doubt that any existing non-malicious program loads a
    TLS segment into DS or ES on a 64-bit kernel because the context
    switch code was badly broken until recently, but that's not an
    excuse to leave the current code alone.

[3] One way or another, that config option should to go away.  Yuck!

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/27d119b0d396e9b82009e40dff8333a249038225.1461698311.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit is contained in:
Andy Lutomirski 2016-04-26 12:23:30 -07:00 коммит произвёл Ingo Molnar
Родитель 296f781a4b
Коммит c9867f863e
1 изменённых файлов: 42 добавлений и 0 удалений

Просмотреть файл

@ -114,6 +114,7 @@ int do_set_thread_area(struct task_struct *p, int idx,
int can_allocate)
{
struct user_desc info;
unsigned short __maybe_unused sel, modified_sel;
if (copy_from_user(&info, u_info, sizeof(info)))
return -EFAULT;
@ -141,6 +142,47 @@ int do_set_thread_area(struct task_struct *p, int idx,
set_tls_desc(p, idx, &info, 1);
/*
* If DS, ES, FS, or GS points to the modified segment, forcibly
* refresh it. Only needed on x86_64 because x86_32 reloads them
* on return to user mode.
*/
modified_sel = (idx << 3) | 3;
if (p == current) {
#ifdef CONFIG_X86_64
savesegment(ds, sel);
if (sel == modified_sel)
loadsegment(ds, sel);
savesegment(es, sel);
if (sel == modified_sel)
loadsegment(es, sel);
savesegment(fs, sel);
if (sel == modified_sel)
loadsegment(fs, sel);
savesegment(gs, sel);
if (sel == modified_sel)
load_gs_index(sel);
#endif
#ifdef CONFIG_X86_32_LAZY_GS
savesegment(gs, sel);
if (sel == modified_sel)
loadsegment(gs, sel);
#endif
} else {
#ifdef CONFIG_X86_64
if (p->thread.fsindex == modified_sel)
p->thread.fsbase = info.base_addr;
if (p->thread.gsindex == modified_sel)
p->thread.gsbase = info.base_addr;
#endif
}
return 0;
}