WSL2-Linux-Kernel/arch
Vineet Gupta 8d56bec2f2 ARC: [mm] optimize needless full mm TLB flush on munmap
munmap ends up calling tlb_flush() which for ARC was flushing the entire
TLB unconditionally (by moving the MMU to a new ASID)

do_munmap
  unmap_region
    unmap_vmas
      unmap_single_vma
         unmap_page_range
            tlb_start_vma
            zap_pud_range
            tlb_end_vma()
  tlb_finish_mmu
    tlb_flush()  ---> unconditional flush_tlb_mm()

So even a single page munmap, a frequent operation when uClibc dynamic
linker (ldso) is loading the dependent shared libraries, would move the
the ASID multiple times - needlessly invalidating the pre-faulted TLB
entries (and increasing the rate of ASID wraparound + full TLB flush).

This is now optimised to only be called if tlb->full_mm (which means
for exit/execve) cases only. And for those cases, flush_tlb_mm() is
already optimised to be a no-op for mm->mm_users == 0.

So essentially there are no mmore full mm flushes - except for fork which
anyhow needs it for properly COW'ing parent address space.

munmap now needs to do TLB range flush, which is implemented with
tlb_end_vma()

Results
-------
1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or
   7 before).

2. LMBench microbenchmark also shows improvements

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem scal
                                                     pages line   par load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba   80     8    64 1.1000 1
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full   80     8    64 1.1200 1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc5-0 Linux 3.9.0-r   80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
3.9-rc5-0 Linux 3.9.0-r   80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page 100fd
                        Create Delete Create Delete Latency Fault  Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc5-0 Linux 3.9.0-r  314.7  223.2 1054.9  390.2  3615.0 1.590 20.1 126.6
3.9-rc5-0 Linux 3.9.0-r  265.8  183.8 1014.2  314.1  3193.0 6.910 18.8 110.4

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-07 13:44:00 +05:30
..
alpha alpha: irq: remove deprecated use of IRQF_DISABLED 2013-04-07 12:59:30 -07:00
arc ARC: [mm] optimize needless full mm TLB flush on munmap 2013-05-07 13:44:00 +05:30
arm Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm 2013-04-03 16:15:17 -07:00
arm64 Fix IS_ENABLED() usage typo (missing CONFIG_ prefix). 2013-03-28 13:45:49 -07:00
avr32 Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
blackfin Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
c6x Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
cris Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
frv Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
h8300 Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
hexagon Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
ia64 ia64 idle: delete stale (*idle)() function pointer 2013-03-29 11:12:25 -07:00
m32r UAPI: fix endianness conditionals in M32R's asm/stat.h 2013-03-13 15:21:49 -07:00
m68k Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
metag metag: Inhibit NUMA balancing. 2013-03-04 10:29:19 +00:00
microblaze Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
mips Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-04-05 12:23:12 -07:00
mn10300 Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
openrisc openrisc: remove HAVE_VIRT_TO_BUS 2013-03-13 06:12:39 +01:00
parisc Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
powerpc powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being performed before the ANDCOND test 2013-04-08 15:19:09 +10:00
s390 s390/mm: provide emtpy check_pgt_cache() function 2013-04-02 08:53:11 +02:00
score Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
sh hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
sparc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2013-03-19 14:47:11 -07:00
tile Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2013-04-01 08:17:09 -07:00
um um: Use tty_port in SIGWINCH handler 2013-03-11 10:08:04 +01:00
unicore32 Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
x86 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm 2013-04-07 13:01:25 -07:00
xtensa Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00
.gitignore
Kconfig Select VIRT_TO_BUS directly where needed 2013-03-12 11:16:40 -07:00