commit ddadb75979177635306580654460e19f778b3660 Author: Alexandre Frade Date: Sun Apr 29 16:04:38 2018 -0300 4.14.37-xanmod30 Signed-off-by: Alexandre Frade commit 498f2c04c536357b2510cebefaee74c1fadec77d Merge: c30ef0902c83 753be7e83bb8 Author: Alexandre Frade Date: Sun Apr 29 15:56:05 2018 -0300 Merge tag 'v4.14.37' into 4.14 This is the 4.14.37 stable release commit 753be7e83bb80128b4a2aa24214c98466905827c Author: Greg Kroah-Hartman Date: Thu Apr 26 11:02:22 2018 +0200 Linux 4.14.37 commit f606893fbbc64d68510c5cc3e64e39971e5f26e0 Author: Benjamin Beichler Date: Wed Mar 7 18:11:07 2018 +0100 mac80211_hwsim: fix use-after-free bug in hwsim_exit_net commit 8cfd36a0b53aeb4ec21d81eb79706697b84dfc3d upstream. When destroying a net namespace, all hwsim interfaces, which are not created in default namespace are deleted. But the async deletion of the interfaces could last longer than the actual destruction of the namespace, which results to an use after free bug. Therefore use synchronous deletion in this case. Fixes: 100cb9ff40e0 ("mac80211_hwsim: Allow managing radios from non-initial namespaces") Reported-by: syzbot+70ce058e01259de7bb1d@syzkaller.appspotmail.com Signed-off-by: Benjamin Beichler Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 679833ea18221fe703ebf56213c1b1ca1731bad6 Author: Sean Christopherson Date: Thu Mar 29 14:48:30 2018 -0700 Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown" commit 2c151b25441ae5c2da66472abd165af785c9ecd2 upstream. The bug that led to commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5 was a benign warning (no adverse affects other than the warning itself) that was detected by syzkaller. Further inspection shows that the WARN_ON in question, in handle_ept_misconfig(), is unnecessary and flawed (this was also briefly discussed in the original patch: https://patchwork.kernel.org/patch/10204649). * The WARN_ON is unnecessary as kvm_mmu_page_fault() will WARN if reserved bits are set in the SPTEs, i.e. it covers the case where an EPT misconfig occurred because of a KVM bug. * The WARN_ON is flawed because it will fire on any system error code that is hit while handling the fault, e.g. -ENOMEM can be returned by mmu_topup_memory_caches() while handling a legitmate MMIO EPT misconfig. The original behavior of returning -EFAULT when userspace munmaps an HVA without first removing the memslot is correct and desirable, i.e. KVM is letting userspace know it has generated a bad address. Returning RET_PF_EMULATE masks the WARN_ON in the EPT misconfig path, but does not fix the underlying bug, i.e. the WARN_ON is bogus. Furthermore, returning RET_PF_EMULATE has the unwanted side effect of causing KVM to attempt to emulate an instruction on any page fault with an invalid HVA translation, e.g. a not-present EPT violation on a VM_PFNMAP VMA whose fault handler failed to insert a PFN. * There is no guarantee that the fault is directly related to the instruction, i.e. the fault could have been triggered by a side effect memory access in the guest, e.g. while vectoring a #DB or writing a tracing record. This could cause KVM to effectively mask the fault if KVM doesn't model the behavior leading to the fault, i.e. emulation could succeed and resume the guest. * If emulation does fail, KVM will return EMULATION_FAILED instead of -EFAULT, which is a red herring as the user will either debug a bogus emulation attempt or scratch their head wondering why we were attempting emulation in the first place. TL;DR: revert to returning -EFAULT and remove the bogus WARN_ON in handle_ept_misconfig in a future patch. This reverts commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5. Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 75dceb6872b3526a5269a28553f9a35972d03bba Author: Leon Romanovsky Date: Sun Mar 11 13:51:32 2018 +0200 RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs commit 75a4598209cbe45540baa316c3b51d9db222e96e upstream. mlx5 modify_qp() relies on FW that the error will be thrown if wrong state is supplied. The missing check in FW causes the following crash while using XRC_TGT QPs. [ 14.769632] BUG: unable to handle kernel NULL pointer dereference at (null) [ 14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0 [ 14.773126] Oops: 0002 [#1] SMP PTI [ 14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119 [ 14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246 [ 14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000 [ 14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000 [ 14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504 [ 14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240 [ 14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000 [ 14.785800] FS: 00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000 [ 14.787073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0 [ 14.788689] Call Trace: [ 14.789007] _ib_modify_qp+0x71/0x120 [ 14.789475] modify_qp.isra.20+0x207/0x2f0 [ 14.790010] ib_uverbs_modify_qp+0x90/0xe0 [ 14.790532] ib_uverbs_write+0x1d2/0x3c0 [ 14.791049] ? __handle_mm_fault+0x93c/0xe40 [ 14.791644] __vfs_write+0x36/0x180 [ 14.792096] ? handle_mm_fault+0xc1/0x210 [ 14.792601] vfs_write+0xad/0x1e0 [ 14.793018] SyS_write+0x52/0xc0 [ 14.793422] do_syscall_64+0x75/0x180 [ 14.793888] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 14.794527] RIP: 0033:0x7f545ad76099 [ 14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099 [ 14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003 [ 14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480 [ 14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760 [ 14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000 [ 14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00 00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00 00 00 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c [ 14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8 [ 14.804838] CR2: 0000000000000000 [ 14.805288] ---[ end trace 3f1da0df5c8b7c37 ]--- Cc: syzkaller Reported-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit 01e71c218219fe40c620a53b2d37beafd95c4e14 Author: Jiri Olsa Date: Sun Apr 15 11:23:50 2018 +0200 perf: Return proper values for user stack errors commit 78b562fbfa2cf0a9fcb23c3154756b690f4905c1 upstream. Return immediately when we find issue in the user stack checks. The error value could get overwritten by following check for PERF_SAMPLE_REGS_INTR. Signed-off-by: Jiri Olsa Cc: Alexander Shishkin Cc: Andi Kleen Cc: H. Peter Anvin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 60e2364e60e8 ("perf: Add ability to sample machine state on interrupt") Link: http://lkml.kernel.org/r/20180415092352.12403-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit 66038084560d12457f4dd9e5cfb1d7a7859f70a2 Author: Jiri Olsa Date: Sun Apr 15 11:23:51 2018 +0200 perf: Fix sample_max_stack maximum check commit 5af44ca53d019de47efe6dbc4003dd518e5197ed upstream. The syzbot hit KASAN bug in perf_callchain_store having the entry stored behind the allocated bounds [1]. We miss the sample_max_stack check for the initial event that allocates callchain buffers. This missing check allows to create an event with sample_max_stack value bigger than the global sysctl maximum: # sysctl -a | grep perf_event_max_stack kernel.perf_event_max_stack = 127 # perf record -vv -C 1 -e cycles/max-stack=256/ kill ... perf_event_attr: size 112 ... sample_max_stack 256 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4 Note the '-C 1', which forces perf record to create just single event. Otherwise it opens event for every cpu, then the sample_max_stack check fails on the second event and all's fine. The fix is to run the sample_max_stack check also for the first event with callchains. [1] https://marc.info/?l=linux-kernel&m=152352732920874&w=2 Reported-by: syzbot+7c449856228b63ac951e@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa Cc: Alexander Shishkin Cc: Andi Kleen Cc: H. Peter Anvin Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 97c79a38cd45 ("perf core: Per event callchain limit") Link: http://lkml.kernel.org/r/20180415092352.12403-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit 5bcf169444540cbb8646cb415087f5ec83f60432 Author: Florian Westphal Date: Tue Feb 27 19:42:32 2018 +0100 netfilter: x_tables: limit allocation requests for blob rule heads commit 9d5c12a7c08f67999772065afd50fb222072114e upstream. This is a very conservative limit (134217728 rules), but good enough to not trigger frequent oom from syzkaller. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 764f2162d97a498269c9b67607fe163692a79aa7 Author: Florian Westphal Date: Tue Feb 27 19:42:35 2018 +0100 netfilter: compat: reject huge allocation requests commit 7d7d7e02111e9a4dc9d0658597f528f815d820fd upstream. no need to bother even trying to allocating huge compat offset arrays, such ruleset is rejected later on anyway becaus we refuse to allocate overly large rule blobs. However, compat translation happens before blob allocation, so we should add a check there too. This is supposed to help with fuzzing by avoiding oom-killer. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 8d92d53365395c97cf819a10cda4693f792cd5b6 Author: Florian Westphal Date: Tue Feb 27 19:42:34 2018 +0100 netfilter: compat: prepare xt_compat_init_offsets to return errors commit 9782a11efc072faaf91d4aa60e9d23553f918029 upstream. should have no impact, function still always returns 0. This patch is only to ease review. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 82b68ecde5d056588799f0d38e675bbb81fe3b46 Author: Florian Westphal Date: Tue Feb 27 19:42:33 2018 +0100 netfilter: x_tables: add counters allocation wrapper commit c84ca954ac9fa67a6ce27f91f01e4451c74fd8f6 upstream. allows to have size checks in a single spot. This is supposed to reduce oom situations when fuzz-testing xtables. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit fab0b3ce67a54f45848ee5d6023ef9a42153a2c9 Author: Florian Westphal Date: Tue Feb 27 19:42:31 2018 +0100 netfilter: x_tables: cap allocations at 512 mbyte commit 19926968ea86a286aa6fbea16ee3f2e7442f10f0 upstream. Arbitrary limit, however, this still allows huge rulesets (> 1 million rules). This helps with automated fuzzer as it prevents oom-killer invocation. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 89f3232c394e8946905b3c3b57ed593872003d60 Author: Thomas Gleixner Date: Mon Mar 26 15:29:57 2018 +0200 alarmtimer: Init nanosleep alarm timer on stack commit bd03143007eb9b03a7f2316c677780561b68ba2a upstream. syszbot reported the following debugobjects splat: ODEBUG: object is on stack, but not annotated WARNING: CPU: 0 PID: 4185 at lib/debugobjects.c:328 RIP: 0010:debug_object_is_on_stack lib/debugobjects.c:327 [inline] debug_object_init+0x17/0x20 lib/debugobjects.c:391 debug_hrtimer_init kernel/time/hrtimer.c:410 [inline] debug_init kernel/time/hrtimer.c:458 [inline] hrtimer_init+0x8c/0x410 kernel/time/hrtimer.c:1259 alarm_init kernel/time/alarmtimer.c:339 [inline] alarm_timer_nsleep+0x164/0x4d0 kernel/time/alarmtimer.c:787 SYSC_clock_nanosleep kernel/time/posix-timers.c:1226 [inline] SyS_clock_nanosleep+0x235/0x330 kernel/time/posix-timers.c:1204 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 This happens because the hrtimer for the alarm nanosleep is on stack, but the code does not use the proper debug objects initialization. Split out the code for the allocated use cases and invoke hrtimer_init_on_stack() for the nanosleep related functions. Reported-by: syzbot+a3e0726462b2e346a31d@syzkaller.appspotmail.com Signed-off-by: Thomas Gleixner Cc: John Stultz Cc: syzkaller-bugs@googlegroups.com Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1803261528270.1585@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman commit 76cd54fa70ce7c02e81bc4c3a06de32cdaa84b29 Author: Max Gurtovoy Date: Mon Mar 5 20:09:48 2018 +0200 RDMA/core: Reduce poll batch for direct cq polling [ Upstream commit d3b9e8ad425cfd5b9116732e057f1b48e4d3bcb8 ] Fix warning limit for kernel stack consumption: drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct': drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Using smaller ib_wc array on the stack brings us comfortably below that limit again. Fixes: 246d8b184c10 ("IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct") Reported-by: Arnd Bergmann Reviewed-by: Sergey Gorenko Signed-off-by: Max Gurtovoy Signed-off-by: Leon Romanovsky Reviewed-by: Bart Van Assche Acked-by: Arnd Bergmann Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit de16dfcc510d8ec2981e4ed97e51d17c043cdaec Author: Mark Salter Date: Fri Feb 2 09:20:29 2018 -0500 irqchip/gic-v3: Change pr_debug message to pr_devel [ Upstream commit b6dd4d83dc2f78cebc9a7e6e7e4bc2be4d29b94d ] The pr_debug() in gic-v3 gic_send_sgi() can trigger a circular locking warning: GICv3: CPU10: ICC_SGI1R_EL1 5000400 ====================================================== WARNING: possible circular locking dependency detected 4.15.0+ #1 Tainted: G W ------------------------------------------------------ dynamic_debug01/1873 is trying to acquire lock: ((console_sem).lock){-...}, at: [<0000000099c891ec>] down_trylock+0x20/0x4c but task is already holding lock: (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&rq->lock){-.-.}: __lock_acquire+0x3b4/0x6e0 lock_acquire+0xf4/0x2a8 _raw_spin_lock+0x4c/0x60 task_fork_fair+0x3c/0x148 sched_fork+0x10c/0x214 copy_process.isra.32.part.33+0x4e8/0x14f0 _do_fork+0xe8/0x78c kernel_thread+0x48/0x54 rest_init+0x34/0x2a4 start_kernel+0x45c/0x488 -> #1 (&p->pi_lock){-.-.}: __lock_acquire+0x3b4/0x6e0 lock_acquire+0xf4/0x2a8 _raw_spin_lock_irqsave+0x58/0x70 try_to_wake_up+0x48/0x600 wake_up_process+0x28/0x34 __up.isra.0+0x60/0x6c up+0x60/0x68 __up_console_sem+0x4c/0x7c console_unlock+0x328/0x634 vprintk_emit+0x25c/0x390 dev_vprintk_emit+0xc4/0x1fc dev_printk_emit+0x88/0xa8 __dev_printk+0x58/0x9c _dev_info+0x84/0xa8 usb_new_device+0x100/0x474 hub_port_connect+0x280/0x92c hub_event+0x740/0xa84 process_one_work+0x240/0x70c worker_thread+0x60/0x400 kthread+0x110/0x13c ret_from_fork+0x10/0x18 -> #0 ((console_sem).lock){-...}: validate_chain.isra.34+0x6e4/0xa20 __lock_acquire+0x3b4/0x6e0 lock_acquire+0xf4/0x2a8 _raw_spin_lock_irqsave+0x58/0x70 down_trylock+0x20/0x4c __down_trylock_console_sem+0x3c/0x9c console_trylock+0x20/0xb0 vprintk_emit+0x254/0x390 vprintk_default+0x58/0x90 vprintk_func+0xbc/0x164 printk+0x80/0xa0 __dynamic_pr_debug+0x84/0xac gic_raise_softirq+0x184/0x18c smp_cross_call+0xac/0x218 smp_send_reschedule+0x3c/0x48 resched_curr+0x60/0x9c check_preempt_curr+0x70/0xdc wake_up_new_task+0x310/0x470 _do_fork+0x188/0x78c SyS_clone+0x44/0x50 __sys_trace_return+0x0/0x4 other info that might help us debug this: Chain exists of: (console_sem).lock --> &p->pi_lock --> &rq->lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&rq->lock); lock(&p->pi_lock); lock(&rq->lock); lock((console_sem).lock); *** DEADLOCK *** 2 locks held by dynamic_debug01/1873: #0: (&p->pi_lock){-.-.}, at: [<000000001366df53>] wake_up_new_task+0x40/0x470 #1: (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc stack backtrace: CPU: 10 PID: 1873 Comm: dynamic_debug01 Tainted: G W 4.15.0+ #1 Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS T48 10/02/2017 Call trace: dump_backtrace+0x0/0x188 show_stack+0x24/0x2c dump_stack+0xa4/0xe0 print_circular_bug.isra.31+0x29c/0x2b8 check_prev_add.constprop.39+0x6c8/0x6dc validate_chain.isra.34+0x6e4/0xa20 __lock_acquire+0x3b4/0x6e0 lock_acquire+0xf4/0x2a8 _raw_spin_lock_irqsave+0x58/0x70 down_trylock+0x20/0x4c __down_trylock_console_sem+0x3c/0x9c console_trylock+0x20/0xb0 vprintk_emit+0x254/0x390 vprintk_default+0x58/0x90 vprintk_func+0xbc/0x164 printk+0x80/0xa0 __dynamic_pr_debug+0x84/0xac gic_raise_softirq+0x184/0x18c smp_cross_call+0xac/0x218 smp_send_reschedule+0x3c/0x48 resched_curr+0x60/0x9c check_preempt_curr+0x70/0xdc wake_up_new_task+0x310/0x470 _do_fork+0x188/0x78c SyS_clone+0x44/0x50 __sys_trace_return+0x0/0x4 GICv3: CPU0: ICC_SGI1R_EL1 12000 This could be fixed with printk_deferred() but that might lessen its usefulness for debugging. So change it to pr_devel to keep it out of production kernels. Developers working on gic-v3 can enable it as needed in their kernels. Signed-off-by: Mark Salter Signed-off-by: Marc Zyngier Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4032cd4fd3aec5d76d5ad9451f6d31ff6003c7cf Author: Michael Kelley Date: Wed Feb 14 02:54:03 2018 +0000 cpumask: Make for_each_cpu_wrap() available on UP as well [ Upstream commit d207af2eab3f8668b95ad02b21930481c42806fd ] for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1. Reported-by: kbuild test robot Signed-off-by: Michael Kelley Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c834b955d3f00789b2c95a6e642a058f12ae9096 Author: Stephen Boyd Date: Thu Feb 1 09:03:29 2018 -0800 irqchip/gic-v3: Ignore disabled ITS nodes [ Upstream commit 95a2562590c2f64a0398183f978d5cf3db6d0284 ] On some platforms there's an ITS available but it's not enabled because reading or writing the registers is denied by the firmware. In fact, reading or writing them will cause the system to reset. We could remove the node from DT in such a case, but it's better to skip nodes that are marked as "disabled" in DT so that we can describe the hardware that exists and use the status property to indicate how the firmware has configured things. Cc: Stuart Yoder Cc: Laurentiu Tudor Cc: Greg Kroah-Hartman Cc: Marc Zyngier Cc: Rajendra Nayak Signed-off-by: Stephen Boyd Signed-off-by: Marc Zyngier Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2d8d8d23c485d78b00c6a44629b711987ac1234b Author: Thomas Richter Date: Wed Jan 17 09:38:31 2018 +0100 perf test: Fix test trace+probe_libc_inet_pton.sh for s390x [ Upstream commit 7a92453620d42c3a5fea94a864dc6aa04c262b93 ] On Intel test case trace+probe_libc_inet_pton.sh succeeds and the output is: [root@f27 perf]# ./perf trace --no-syscalls -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms 0.000 probe_libc:inet_pton:(7fa40ac618a0)) __GI___inet_pton (/usr/lib64/libc-2.26.so) getaddrinfo (/usr/lib64/libc-2.26.so) main (/usr/bin/ping) The kernel stack unwinder is used, it is specified implicitly as call-graph=fp (frame pointer). On s390x only dwarf is available for stack unwinding. It is also done in user space. This requires different parameter setup and result checking for s390x and Intel. This patch adds separate perf trace setup and result checking for Intel and s390x. On s390x specify this command line to get a call-graph and handle the different call graph result checking: [root@s35lp76 perf]# ./perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.041 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms 0.000 probe_libc:inet_pton:(3ffb9942060)) __GI___inet_pton (/usr/lib64/libc-2.26.so) gaih_inet (inlined) __GI_getaddrinfo (inlined) main (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/ping) [root@s35lp76 perf]# Before: [root@s8360047 perf]# ./perf test -vv 58 58: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 26349 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms 0.000 probe_libc:inet_pton:(3ff925c2060)) test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! [root@s8360047 perf]# After: [root@s35lp76 perf]# ./perf test -vv 57 57: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 38708 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.038 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.038/0.038/0.038/0.000 ms 0.000 probe_libc:inet_pton:(3ff87342060)) __GI___inet_pton (/usr/lib64/libc-2.26.so) gaih_inet (inlined) __GI_getaddrinfo (inlined) main (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok [root@s35lp76 perf]# On Intel the test case runs unchanged and succeeds. Signed-off-by: Thomas Richter Reviewed-by: Hendrik Brueckner Tested-by: Arnaldo Carvalho de Melo Cc: Heiko Carstens Cc: Martin Schwidefsky Link: http://lkml.kernel.org/r/20180117083831.101001-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 74cd9414788ca5de87b2df2103d39a7705f202f1 Author: Nicholas Piggin Date: Tue Feb 13 17:45:11 2018 +1000 powerpc/powernv: IMC fix out of bounds memory access at shutdown [ Upstream commit e7bde88cdb4f0e432398a7d29ca2a15d2c18952a ] The OPAL IMC driver's shutdown handler disables nest PMU counters by walking nodes and taking the first CPU out of their cpumask, which is used to index into the paca (get_hard_smp_processor_id()). This does not always do the right thing, and in particular for CPU-less nodes it returns NR_CPUS and that overruns the paca and dereferences random memory. Fix it by being more careful about checking returned CPU, and only using online CPUs. It's not clear this shutdown code makes sense after commit 885dcd709b ("powerpc/perf: Add nest IMC PMU support"), but this should not make things worse Currently the bug causes us to call OPAL with a junk CPU number. A separate patch in development to change the way pacas are allocated escalates this bug into a crash: Unable to handle kernel paging request for data at address 0x2a21af1eeb000076 Faulting instruction address: 0xc0000000000a5468 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP opal_imc_counters_shutdown+0x148/0x1d0 LR opal_imc_counters_shutdown+0x134/0x1d0 Call Trace: opal_imc_counters_shutdown+0x134/0x1d0 (unreliable) platform_drv_shutdown+0x44/0x60 device_shutdown+0x1f8/0x350 kernel_restart_prepare+0x54/0x70 kernel_restart+0x28/0xc0 SyS_reboot+0x1d0/0x2c0 system_call+0x58/0x6c Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c74e004c62739d6a6a002730d57fe851ff42f346 Author: Will Deacon Date: Tue Feb 13 13:22:57 2018 +0000 locking/qspinlock: Ensure node->count is updated before initialising node [ Upstream commit 11dc13224c975efcec96647a4768a6f1bb7a19a8 ] When queuing on the qspinlock, the count field for the current CPU's head node is incremented. This needn't be atomic because locking in e.g. IRQ context is balanced and so an IRQ will return with node->count as it found it. However, the compiler could in theory reorder the initialisation of node[idx] before the increment of the head node->count, causing an IRQ to overwrite the initialised node and potentially corrupt the lock state. Avoid the potential for this harmful compiler reordering by placing a barrier() between the increment of the head node->count and the subsequent node initialisation. Signed-off-by: Will Deacon Acked-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1518528177-19169-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 5350cb0111d23ff50b66bcdd9ee5fdf3d64a9f01 Author: mike.travis@hpe.com Date: Mon Feb 5 16:15:04 2018 -0600 x86/platform/UV: Fix GAM Range Table entries less than 1GB [ Upstream commit c25d99d20ba69824a1e2cc118e04b877cd427afc ] The latest UV platforms include the new ApachePass NVDIMMs into the UV address space. This has introduced address ranges in the Global Address Map Table that are less than the previous lowest range, which was 2GB. Fix the address calculation so it accommodates address ranges from bytes to exabytes. Signed-off-by: Mike Travis Reviewed-by: Andrew Banman Reviewed-by: Dimitri Sivanich Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Russ Anderson Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20180205221503.190219903@stormcage.americas.sgi.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 288b373264c51f490676a860c705ec70e4d83a67 Author: Aneesh Kumar K.V Date: Tue Feb 13 16:39:33 2018 +0530 powerpc/mm/hash64: Zero PGD pages on allocation [ Upstream commit fc5c2f4a55a2c258e12013cdf287cf266dbcd2a7 ] On powerpc we allocate page table pages from slab caches of different sizes. Currently we have a constructor that zeroes out the objects when we allocate them for the first time. We expect the objects to be zeroed out when we free the the object back to slab cache. This happens in the unmap path. For hugetlb pages we call huge_pte_get_and_clear() to do that. With the current configuration of page table size, both PUD and PGD level tables are allocated from the same slab cache. At the PUD level, we use the second half of the table to store the slot information. But we never clear that when unmapping. When such a freed object is then allocated for a PGD page, the second half of the page table page will not be zeroed as expected. This results in a kernel crash. Fix it by always clearing PGD pages when they're allocated. Signed-off-by: Aneesh Kumar K.V [mpe: Change log wording and formatting, add whitespace] Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f4d6e4598a29f6f7df6937d943c01e1eeb9ee871 Author: Jia Zhang Date: Mon Feb 12 22:44:53 2018 +0800 vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page [ Upstream commit 595dd46ebfc10be041a365d0a3fa99df50b6ba73 ] Commit: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data") ... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y. However, accessing the vsyscall user page will cause an SMAP fault. Replace memcpy() with copy_from_user() to fix this bug works, but adding a common way to handle this sort of user page may be useful for future. Currently, only vsyscall page requires KCORE_USER. Signed-off-by: Jia Zhang Reviewed-by: Jiri Olsa Cc: Al Viro Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c064b7c1d203cd5d781c316ac9c8049ba772684f Author: Tony Lindgren Date: Fri Feb 9 08:11:26 2018 -0800 PM / wakeirq: Fix unbalanced IRQ enable for wakeirq [ Upstream commit 69728051f5bf15efaf6edfbcfe1b5a49a2437918 ] If a device is runtime PM suspended when we enter suspend and has a dedicated wake IRQ, we can get the following warning: WARNING: CPU: 0 PID: 108 at kernel/irq/manage.c:526 enable_irq+0x40/0x94 [ 102.087860] Unbalanced enable for IRQ 147 ... (enable_irq) from [] (dev_pm_arm_wake_irq+0x4c/0x60) (dev_pm_arm_wake_irq) from [] (device_wakeup_arm_wake_irqs+0x58/0x9c) (device_wakeup_arm_wake_irqs) from [] (dpm_suspend_noirq+0x10/0x48) (dpm_suspend_noirq) from [] (suspend_devices_and_enter+0x30c/0xf14) (suspend_devices_and_enter) from [] (enter_state+0xad4/0xbd8) (enter_state) from [] (pm_suspend+0x38/0x98) (pm_suspend) from [] (state_store+0x68/0xc8) This is because the dedicated wake IRQ for the device may have been already enabled earlier by dev_pm_enable_wake_irq_check(). Fix the issue by checking for runtime PM suspended status. This issue can be easily reproduced by setting serial console log level to zero, letting the serial console idle, and suspend the system from an ssh terminal. On resume, dmesg will have the warning above. The reason why I have not run into this issue earlier has been that I typically run my PM test cases from on a serial console instead over ssh. Fixes: c84345597558 (PM / wakeirq: Enable dedicated wakeirq for suspend) Signed-off-by: Tony Lindgren Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit afa0ce071488154e28d60b510dd0a3c3d86f7992 Author: Rafael J. Wysocki Date: Fri Feb 9 22:55:28 2018 +0100 ACPI / EC: Restore polling during noirq suspend/resume phases [ Upstream commit 3cd091a773936c54344a519f7ee1379ccb620bee ] Commit 662591461c4b (ACPI / EC: Drop EC noirq hooks to fix a regression) modified the ACPI EC driver so that it doesn't switch over to busy polling mode during noirq stages of system suspend and resume in an attempt to fix an issue resulting from that behavior. However, that modification introduced a system resume regression on Thinkpad X240, so make the EC driver switch over to the polling mode during noirq stages of system suspend and resume again, which effectively reverts the problematic commit. Fixes: 662591461c4b (ACPI / EC: Drop EC noirq hooks to fix a regression) Link: https://bugzilla.kernel.org/show_bug.cgi?id=197863 Reported-by: Markus Demleitner Tested-by: Markus Demleitner Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 85bd5c686fe9e60557b5d8610110c49381ed4a07 Author: Daniel Borkmann Date: Fri Feb 9 14:49:44 2018 +0100 bpf: fix rlimit in reuseport net selftest [ Upstream commit 941ff6f11c020913f5cddf543a9ec63475d7c082 ] Fix two issues in the reuseport_bpf selftests that were reported by Linaro CI: [...] + ./reuseport_bpf ---- IPv4 UDP ---- Testing EBPF mod 10... Reprograming, testing mod 5... ./reuseport_bpf: ebpf error. log: 0: (bf) r6 = r1 1: (20) r0 = *(u32 *)skb[0] 2: (97) r0 %= 10 3: (95) exit processed 4 insns : Operation not permitted + echo FAIL [...] ---- IPv4 TCP ---- Testing EBPF mod 10... ./reuseport_bpf: failed to bind send socket: Address already in use + echo FAIL [...] For the former adjust rlimit since this was the cause of failure for loading the BPF prog, and for the latter add SO_REUSEADDR. Reported-by: Naresh Kamboju Link: https://bugs.linaro.org/show_bug.cgi?id=3502 Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ee5fe4bdcf2aea19ee238d6c3844dfe3c7c77e8b Author: Niklas Cassel Date: Fri Feb 9 17:22:45 2018 +0100 net: stmmac: discard disabled flags in interrupt status register [ Upstream commit 1b84ca187510f60f00f4e15255043ce19bb30410 ] The interrupt status register in both dwmac1000 and dwmac4 ignores interrupt enable (for dwmac4) / interrupt mask (for dwmac1000). Therefore, if we want to check only the bits that can actually trigger an irq, we have to filter the interrupt status register manually. Commit 0a764db10337 ("stmmac: Discard masked flags in interrupt status register") fixed this for dwmac1000. Fix the same issue for dwmac4. Just like commit 0a764db10337 ("stmmac: Discard masked flags in interrupt status register"), this makes sure that we do not get spurious link up/link down prints. Signed-off-by: Niklas Cassel Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 26bebd5a7865376d5944b0206d7b271ab76f116b Author: Trond Myklebust Date: Fri Feb 9 09:39:42 2018 -0500 SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context [ Upstream commit 0afa6b4412988019db14c6bfb8c6cbdf120ca9ad ] Calling __UDPX_INC_STATS() from a preemptible context leads to a warning of the form: BUG: using __this_cpu_add() in preemptible [00000000] code: kworker/u5:0/31 caller is xs_udp_data_receive_workfn+0x194/0x270 CPU: 1 PID: 31 Comm: kworker/u5:0 Not tainted 4.15.0-rc8-00076-g90ea9f1 #2 Workqueue: xprtiod xs_udp_data_receive_workfn Call Trace: dump_stack+0x85/0xc1 check_preemption_disabled+0xce/0xe0 xs_udp_data_receive_workfn+0x194/0x270 process_one_work+0x318/0x620 worker_thread+0x20a/0x390 ? process_one_work+0x620/0x620 kthread+0x120/0x130 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x24/0x30 Since we're taking a spinlock in those functions anyway, let's fix the issue by moving the call so that it occurs under the spinlock. Reported-by: kernel test robot Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f58e4ecb9b2e75b8a2a32b990241f16363fd3673 Author: Paul Mackerras Date: Wed Feb 7 19:49:54 2018 +1100 KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code [ Upstream commit 05f2bb0313a2855e491dadfc8319b7da261d7074 ] This fixes the computation of the HPTE index to use when the HPT resizing code encounters a bolted HPTE which is stored in its secondary HPTE group. The code inverts the HPTE group number, which is correct, but doesn't then mask it with new_hash_mask. As a result, new_pteg will be effectively negative, resulting in new_hptep pointing before the new HPT, which will corrupt memory. In addition, this removes two BUG_ON statements. The condition that the BUG_ONs were testing -- that we have computed the hash value incorrectly -- has never been observed in testing, and if it did occur, would only affect the guest, not the host. Given that BUG_ON should only be used in conditions where the kernel (i.e. the host kernel, in this case) can't possibly continue execution, it is not appropriate here. Reviewed-by: David Gibson Signed-off-by: Paul Mackerras Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d6b00490a04d2434088c87c3afe2e827bdaab61b Author: Jesper Dangaard Brouer Date: Thu Feb 8 12:48:32 2018 +0100 tools/libbpf: handle issues with bpf ELF objects containing .eh_frames [ Upstream commit e3d91b0ca523d53158f435a3e13df7f0cb360ea2 ] V3: More generic skipping of relo-section (suggested by Daniel) If clang >= 4.0.1 is missing the option '-target bpf', it will cause llc/llvm to create two ELF sections for "Exception Frames", with section names '.eh_frame' and '.rel.eh_frame'. The BPF ELF loader library libbpf fails when loading files with these sections. The other in-kernel BPF ELF loader in samples/bpf/bpf_load.c, handle this gracefully. And iproute2 loader also seems to work with these "eh" sections. The issue in libbpf is caused by bpf_object__elf_collect() skipping some sections, and later when performing relocation it will be pointing to a skipped section, as these sections cannot be found by bpf_object__find_prog_by_idx() in bpf_object__collect_reloc(). This is a general issue that also occurs for other sections, like debug sections which are also skipped and can have relo section. As suggested by Daniel. To avoid keeping state about all skipped sections, instead perform a direct qlookup in the ELF object. Lookup the section that the relo-section points to and check if it contains executable machine instructions (denoted by the sh_flags SHF_EXECINSTR). Use this check to also skip irrelevant relo-sections. Note, for samples/bpf/ the '-target bpf' parameter to clang cannot be used due to incompatibility with asm embedded headers, that some of the samples include. This is explained in more details by Yonghong Song in bpf_devel_QA. Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 327aac8ccbc54a0a320428664385b1d8bd8fd033 Author: Mathieu Malaterre Date: Wed Feb 7 20:35:00 2018 +0100 net: Extra '_get' in declaration of arch_get_platform_mac_address [ Upstream commit e728789c52afccc1275cba1dd812f03abe16ea3c ] In commit c7f5d105495a ("net: Add eth_platform_get_mac_address() helper."), two declarations were added: int eth_platform_get_mac_address(struct device *dev, u8 *mac_addr); unsigned char *arch_get_platform_get_mac_address(void); An extra '_get' was introduced in arch_get_platform_get_mac_address, remove it. Fix compile warning using W=1: CC net/ethernet/eth.o net/ethernet/eth.c:523:24: warning: no previous prototype for ‘arch_get_platform_mac_address’ [-Wmissing-prototypes] unsigned char * __weak arch_get_platform_mac_address(void) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ AR net/ethernet/built-in.o Signed-off-by: Mathieu Malaterre Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0b1fa241dd86820659cb477caa73e3c232d344e2 Author: Chuck Lever Date: Fri Feb 2 14:28:59 2018 -0500 svcrdma: Fix Read chunk round-up [ Upstream commit 175e03101d36c3034f3c80038d4c28838351a7f2 ] A single NFSv4 WRITE compound can often have three operations: PUTFH, WRITE, then GETATTR. When the WRITE payload is sent in a Read chunk, the client places the GETATTR in the inline part of the RPC/RDMA message, just after the WRITE operation (sans payload). The position value in the Read chunk enables the receiver to insert the Read chunk at the correct place in the received XDR stream; that is between the WRITE and GETATTR. According to RFC 8166, an NFS/RDMA client does not have to add XDR round-up to the Read chunk that carries the WRITE payload. The receiver adds XDR round-up padding if it is absent and the receiver's XDR decoder requires it to be present. Commit 193bcb7b3719 ("svcrdma: Populate tail iovec when receiving") attempted to add support for receiving such a compound so that just the WRITE payload appears in rq_arg's page list, and the trailing GETATTR is placed in rq_arg's tail iovec. (TCP just strings the whole compound into the head iovec and page list, without regard to the alignment of the WRITE payload). The server transport logic also had to accommodate the optional XDR round-up of the Read chunk, which it did simply by lengthening the tail iovec when round-up was needed. This approach is adequate for the NFSv2 and NFSv3 WRITE decoders. Unfortunately it is not sufficient for nfsd4_decode_write. When the Read chunk length is a couple of bytes less than PAGE_SIZE, the computation at the end of nfsd4_decode_write allows argp->pagelen to go negative, which breaks the logic in read_buf that looks for the tail iovec. The result is that a WRITE operation whose payload length is just less than a multiple of a page succeeds, but the subsequent GETATTR in the same compound fails with NFS4ERR_OP_ILLEGAL because the XDR decoder can't find it. Clients ignore the error, but they must update their attribute cache via a separate round trip. As nfsd4_decode_write appears to expect the payload itself to always have appropriate XDR round-up, have svc_rdma_build_normal_read_chunk add the Read chunk XDR round-up to the page_len rather than lengthening the tail iovec. Reported-by: Olga Kornievskaia Fixes: 193bcb7b3719 ("svcrdma: Populate tail iovec when receiving") Signed-off-by: Chuck Lever Tested-by: Olga Kornievskaia Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit e781fff7b78ffd1ae7149e8f5ac7512dab32cfa5 Author: David Howells Date: Thu Feb 8 15:59:07 2018 +0000 rxrpc: Don't put crypto buffers on the stack [ Upstream commit 8c2f826dc36314059ac146c78d3bf8056b626446 ] Don't put buffers of data to be handed to crypto on the stack as this may cause an assertion failure in the kernel (see below). Fix this by using an kmalloc'd buffer instead. kernel BUG at ./include/linux/scatterlist.h:147! ... RIP: 0010:rxkad_encrypt_response.isra.6+0x191/0x1b0 [rxrpc] RSP: 0018:ffffbe2fc06cfca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff989277d59900 RCX: 0000000000000028 RDX: 0000259dc06cfd88 RSI: 0000000000000025 RDI: ffffbe30406cfd88 RBP: ffffbe2fc06cfd60 R08: ffffbe2fc06cfd08 R09: ffffbe2fc06cfd08 R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff7c5f80d9f95 R13: ffffbe2fc06cfd88 R14: ffff98927a3f7aa0 R15: ffffbe2fc06cfd08 FS: 0000000000000000(0000) GS:ffff98927fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b1ff28f0f8 CR3: 000000001b412003 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rxkad_respond_to_challenge+0x297/0x330 [rxrpc] rxrpc_process_connection+0xd1/0x690 [rxrpc] ? process_one_work+0x1c3/0x680 ? __lock_is_held+0x59/0xa0 process_one_work+0x249/0x680 worker_thread+0x3a/0x390 ? process_one_work+0x680/0x680 kthread+0x121/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50 Reported-by: Jonathan Billings Reported-by: Marc Dionne Signed-off-by: David Howells Tested-by: Jonathan Billings Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c5ce9e5b57cc24142e38b6441c23e0e8bac4ef54 Author: Steven Rostedt (VMware) Date: Tue Feb 6 17:19:03 2018 -0500 selftests/ftrace: Add some missing glob checks [ Upstream commit 97fe22adf33f06519bfdf7dad33bcd562e366c8f ] Al Viro discovered a bug in the glob ftrace filtering code where "*a*b" is treated the same as "a*b", and functions that would be selected by "*a*b" but not "a*b" are not selected with "*a*b". Add tests for patterns "*a*b" and "a*b*" to the glob selftest. Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk Cc: Shuah Khan Acked-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ae9c78af577f3741357851e6e98223d67f07d4a6 Author: Chen Yu Date: Mon Jan 29 10:27:57 2018 +0800 cpufreq: intel_pstate: Enable HWP during system resume on CPU0 [ Upstream commit 70f6bf2a3b7e40c3f802b0ea837762a8bc6c1430 ] When maxcpus=1 is in the kernel command line, the BP is responsible for re-enabling the HWP - because currently only the APs invoke intel_pstate_hwp_enable() during their online process - which might put the system into unstable state after resume. Fix this by enabling the HWP explicitly on BP during resume. Reported-by: Doug Smythies Suggested-by: Srinivas Pandruvada Signed-off-by: Yu Chen [ rjw: Subject/changelog, minor modifications ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c4c9fd55899fd780ee010eda172fe574c65bf56e Author: Tang Junhui Date: Wed Feb 7 11:41:45 2018 -0800 bcache: return attach error when no cache set exist [ Upstream commit 7f4fc93d4713394ee8f1cd44c238e046e11b4f15 ] I attach a back-end device to a cache set, and the cache set is not registered yet, this back-end device did not attach successfully, and no error returned: [root]# echo 87859280-fec6-4bcc-20df7ca8f86b > /sys/block/sde/bcache/attach [root]# In sysfs_attach(), the return value "v" is initialized to "size" in the beginning, and if no cache set exist in bch_cache_sets, the "v" value would not change any more, and return to sysfs, sysfs regard it as success since the "size" is a positive number. This patch fixes this issue by assigning "v" with "-ENOENT" in the initialization. Signed-off-by: Tang Junhui Reviewed-by: Michael Lyle Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4c8e0270dc7afb10d4e06be4adfd177db5cb3cb8 Author: Tang Junhui Date: Wed Feb 7 11:41:46 2018 -0800 bcache: fix for data collapse after re-attaching an attached device [ Upstream commit 73ac105be390c1de42a2f21643c9778a5e002930 ] back-end device sdm has already attached a cache_set with ID f67ebe1f-f8bc-4d73-bfe5-9dc88607f119, then try to attach with another cache set, and it returns with an error: [root]# cd /sys/block/sdm/bcache [root]# echo 5ccd0a63-148e-48b8-afa2-aca9cbd6279f > attach -bash: echo: write error: Invalid argument After that, execute a command to modify the label of bcache device: [root]# echo data_disk1 > label Then we reboot the system, when the system power on, the back-end device can not attach to cache_set, a messages show in the log: Feb 5 12:05:52 ceph152 kernel: [922385.508498] bcache: bch_cached_dev_attach() couldn't find uuid for sdm in set In sysfs_attach(), dc->sb.set_uuid was assigned to the value which input through sysfs, no matter whether it is success or not in bch_cached_dev_attach(). For example, If the back-end device has already attached to an cache set, bch_cached_dev_attach() would fail, but dc->sb.set_uuid was changed. Then modify the label of bcache device, it will call bch_write_bdev_super(), which would write the dc->sb.set_uuid to the super block, so we record a wrong cache set ID in the super block, after the system reboot, the cache set couldn't find the uuid of the back-end device, so the bcache device couldn't exist and use any more. In this patch, we don't assigned cache set ID to dc->sb.set_uuid in sysfs_attach() directly, but input it into bch_cached_dev_attach(), and assigned dc->sb.set_uuid to the cache set ID after the back-end device attached to the cache set successful. Signed-off-by: Tang Junhui Reviewed-by: Michael Lyle Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 311e31419b7240151de0cf7aa3f2d972ee103a5c Author: Tang Junhui Date: Wed Feb 7 11:41:43 2018 -0800 bcache: fix for allocator and register thread race [ Upstream commit 682811b3ce1a5a4e20d700939a9042f01dbc66c4 ] After long time running of random small IO writing, I reboot the machine, and after the machine power on, I found bcache got stuck, the stack is: [root@ceph153 ~]# cat /proc/2510/task/*/stack [] closure_sync+0x25/0x90 [bcache] [] bch_journal+0x118/0x2b0 [bcache] [] bch_journal_meta+0x47/0x70 [bcache] [] bch_prio_write+0x237/0x340 [bcache] [] bch_allocator_thread+0x3c8/0x3d0 [bcache] [] kthread+0xcf/0xe0 [] ret_from_fork+0x58/0x90 [] 0xffffffffffffffff [root@ceph153 ~]# cat /proc/2038/task/*/stack [] __bch_btree_map_nodes+0x12d/0x150 [bcache] [] bch_btree_insert+0xf1/0x170 [bcache] [] bch_journal_replay+0x13f/0x230 [bcache] [] run_cache_set+0x79a/0x7c2 [bcache] [] register_bcache+0xd48/0x1310 [bcache] [] kobj_attr_store+0xf/0x20 [] sysfs_write_file+0xc6/0x140 [] vfs_write+0xbd/0x1e0 [] SyS_write+0x7f/0xe0 [] system_call_fastpath+0x16/0x1 The stack shows the register thread and allocator thread were getting stuck when registering cache device. I reboot the machine several times, the issue always exsit in this machine. I debug the code, and found the call trace as bellow: register_bcache() ==>run_cache_set() ==>bch_journal_replay() ==>bch_btree_insert() ==>__bch_btree_map_nodes() ==>btree_insert_fn() ==>btree_split() //node need split ==>btree_check_reserve() In btree_check_reserve(), It will check if there is enough buckets of RESERVE_BTREE type, since allocator thread did not work yet, so no buckets of RESERVE_BTREE type allocated, so the register thread waits on c->btree_cache_wait, and goes to sleep. Then the allocator thread initialized, the call trace is bellow: bch_allocator_thread() ==>bch_prio_write() ==>bch_journal_meta() ==>bch_journal() ==>journal_wait_for_write() In journal_wait_for_write(), It will check if journal is full by journal_full(), but the long time random small IO writing causes the exhaustion of journal buckets(journal.blocks_free=0), In order to release the journal buckets, the allocator calls btree_flush_write() to flush keys to btree nodes, and waits on c->journal.wait until btree nodes writing over or there has already some journal buckets space, then the allocator thread goes to sleep. but in btree_flush_write(), since bch_journal_replay() is not finished, so no btree nodes have journal (condition "if (btree_current_write(b)->journal)" never satisfied), so we got no btree node to flush, no journal bucket released, and allocator sleep all the times. Through the above analysis, we can see that: 1) Register thread wait for allocator thread to allocate buckets of RESERVE_BTREE type; 2) Alloctor thread wait for register thread to replay journal, so it can flush btree nodes and get journal bucket. then they are all got stuck by waiting for each other. Hua Rui provided a patch for me, by allocating some buckets of RESERVE_BTREE type in advance, so the register thread can get bucket when btree node splitting and no need to waiting for the allocator thread. I tested it, it has effect, and register thread run a step forward, but finally are still got stuck, the reason is only 8 bucket of RESERVE_BTREE type were allocated, and in bch_journal_replay(), after 2 btree nodes splitting, only 4 bucket of RESERVE_BTREE type left, then btree_check_reserve() is not satisfied anymore, so it goes to sleep again, and in the same time, alloctor thread did not flush enough btree nodes to release a journal bucket, so they all got stuck again. So we need to allocate more buckets of RESERVE_BTREE type in advance, but how much is enough? By experience and test, I think it should be as much as journal buckets. Then I modify the code as this patch, and test in the machine, and it works. This patch modified base on Hua Rui’s patch, and allocate more buckets of RESERVE_BTREE type in advance to avoid register thread and allocate thread going to wait for each other. [patch v2] ca->sb.njournal_buckets would be 0 in the first time after cache creation, and no journal exists, so just 8 btree buckets is OK. Signed-off-by: Hua Rui Signed-off-by: Tang Junhui Reviewed-by: Michael Lyle Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f89edd17aff46a3970f302add6949700198dfd92 Author: Coly Li Date: Wed Feb 7 11:41:41 2018 -0800 bcache: properly set task state in bch_writeback_thread() [ Upstream commit 99361bbf26337186f02561109c17a4c4b1a7536a ] Kernel thread routine bch_writeback_thread() has the following code block, 447 down_write(&dc->writeback_lock); 448~450 if (check conditions) { 451 up_write(&dc->writeback_lock); 452 set_current_state(TASK_INTERRUPTIBLE); 453 454 if (kthread_should_stop()) 455 return 0; 456 457 schedule(); 458 continue; 459 } If condition check is true, its task state is set to TASK_INTERRUPTIBLE and call schedule() to wait for others to wake up it. There are 2 issues in current code, 1, Task state is set to TASK_INTERRUPTIBLE after the condition checks, if another process changes the condition and call wake_up_process(dc-> writeback_thread), then at line 452 task state is set back to TASK_INTERRUPTIBLE, the writeback kernel thread will lose a chance to be waken up. 2, At line 454 if kthread_should_stop() is true, writeback kernel thread will return to kernel/kthread.c:kthread() with TASK_INTERRUPTIBLE and call do_exit(). It is not good to enter do_exit() with task state TASK_INTERRUPTIBLE, in following code path might_sleep() is called and a warning message is reported by __might_sleep(): "WARNING: do not call blocking ops when !TASK_RUNNING; state=1 set at [xxxx]". For the first issue, task state should be set before condition checks. Ineed because dc->writeback_lock is required when modifying all the conditions, calling set_current_state() inside code block where dc-> writeback_lock is hold is safe. But this is quite implicit, so I still move set_current_state() before all the condition checks. For the second issue, frankley speaking it does not hurt when kernel thread exits with TASK_INTERRUPTIBLE state, but this warning message scares users, makes them feel there might be something risky with bcache and hurt their data. Setting task state to TASK_RUNNING before returning fixes this problem. In alloc.c:allocator_wait(), there is also a similar issue, and is also fixed in this patch. Changelog: v3: merge two similar fixes into one patch v2: fix the race issue in v1 patch. v1: initial buggy fix. Signed-off-by: Coly Li Reviewed-by: Hannes Reinecke Reviewed-by: Michael Lyle Cc: Michael Lyle Cc: Junhui Tang Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 05921c492fdb6046e5a91a83dba3320701421902 Author: Arnd Bergmann Date: Fri Feb 2 16:48:47 2018 +0100 cifs: silence compiler warnings showing up with gcc-8.0.0 [ Upstream commit ade7db991b47ab3016a414468164f4966bd08202 ] This bug was fixed before, but came up again with the latest compiler in another function: fs/cifs/cifssmb.c: In function 'CIFSSMBSetEA': fs/cifs/cifssmb.c:6362:3: error: 'strncpy' offset 8 is out of the bounds [0, 4] [-Werror=array-bounds] strncpy(parm_data->list[0].name, ea_name, name_len); Let's apply the same fix that was used for the other instances. Fixes: b2a3ad9ca502 ("cifs: silence compiler warnings showing up with gcc-4.7.0") Signed-off-by: Arnd Bergmann Signed-off-by: Steve French Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4b95781cb6f37398e424e70e089e380f297693f5 Author: Ulf Hansson Date: Tue Jan 23 21:43:08 2018 +0100 PM / domains: Fix up domain-idle-states OF parsing [ Upstream commit a3381e3a65cbaf612c8f584906c4dba27e84267c ] Commit b539cc82d493 (PM / Domains: Ignore domain-idle-states that are not compatible), made it possible to ignore non-compatible domain-idle-states OF nodes. However, in case that happens while doing the OF parsing, the number of elements in the allocated array would exceed the numbers actually needed, thus wasting memory. Fix this by pre-iterating the genpd OF node and counting the number of compatible domain-idle-states nodes, before doing the allocation. While doing this, it makes sense to rework the code a bit to avoid open coding, of parts responsible for the OF node iteration. Let's also take the opportunity to clarify the function header for of_genpd_parse_idle_states(), about what is being returned in case of errors. Fixes: b539cc82d493 (PM / Domains: Ignore domain-idle-states that are not compatible) Signed-off-by: Ulf Hansson Reviewed-by: Lina Iyer Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 05e52e5bd103a335288415d5399cdb120a05a5e4 Author: Alexey Dobriyan Date: Tue Feb 6 15:36:59 2018 -0800 proc: fix /proc/*/map_files lookup [ Upstream commit ac7f1061c2c11bb8936b1b6a94cdb48de732f7a4 ] Current code does: if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2) However sscanf() is broken garbage. It silently accepts whitespace between format specifiers (did you know that?). It silently accepts valid strings which result in integer overflow. Do not use sscanf() for any even remotely reliable parsing code. OK # readlink '/proc/1/map_files/55a23af39000-55a23b05b000' /lib/systemd/systemd broken # readlink '/proc/1/map_files/ 55a23af39000-55a23b05b000' /lib/systemd/systemd broken # readlink '/proc/1/map_files/55a23af39000-55a23b05b000 ' /lib/systemd/systemd very broken # readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000' /lib/systemd/systemd Andrei said: : This patch breaks criu. It was a bug in criu. And this bug is on a minor : path, which works when memfd_create() isn't available. It is a reason why : I ask to not backport this patch to stable kernels. : : In CRIU this bug can be triggered, only if this patch will be backported : to a kernel which version is lower than v3.16. Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2 Signed-off-by: Alexey Dobriyan Cc: Pavel Emelyanov Cc: Andrei Vagin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4ec317a41d80145de90bd320928da6916572b6f0 Author: Will Deacon Date: Wed Jan 31 12:12:20 2018 +0000 arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics [ Upstream commit 202fb4ef81e3ec765c23bd1e6746a5c25b797d0e ] If the spinlock "next" ticket wraps around between the initial LDR and the cmpxchg in the LSE version of spin_trylock, then we can erroneously think that we have successfuly acquired the lock because we only check whether the next ticket return by the cmpxchg is equal to the owner ticket in our updated lock word. This patch fixes the issue by performing a full 32-bit check of the lock word when trying to determine whether or not the CASA instruction updated memory. Reported-by: Catalin Marinas Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 693b9589c297e0b05359f62a6b8dc73289c0e3e4 Author: Guanglei Li Date: Tue Feb 6 10:43:21 2018 +0800 RDS: IB: Fix null pointer issue [ Upstream commit 2c0aa08631b86a4678dbc93b9caa5248014b4458 ] Scenario: 1. Port down and do fail over 2. Ap do rds_bind syscall PID: 47039 TASK: ffff89887e2fe640 CPU: 47 COMMAND: "kworker/u:6" #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9 #1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3 #2 [ffff898e35f15b30] oops_end at ffffffff8150f518 #3 [ffff898e35f15b60] no_context at ffffffff8104854c #4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675 #5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3 #6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8 #7 [ffff898e35f15d10] page_fault at ffffffff8150ea95 [exception RIP: unknown or invalid address] RIP: 0000000000000000 RSP: ffff898e35f15dc8 RFLAGS: 00010282 RAX: 00000000fffffffe RBX: ffff889b77f6fc00 RCX:ffffffff81c99d88 RDX: 0000000000000000 RSI: ffff896019ee08e8 RDI:ffff889b77f6fc00 RBP: ffff898e35f15df0 R8: ffff896019ee08c8 R9:0000000000000000 R10: 0000000000000400 R11: 0000000000000000 R12:ffff896019ee08c0 R13: ffff889b77f6fe68 R14: ffffffff81c99d80 R15: ffffffffa022a1e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm] #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6 #10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0 #11 [ffff898e35f15ee8] kthread at ffffffff81090fe6 PID: 45659 TASK: ffff880d313d2500 CPU: 31 COMMAND: "oracle_45659_ap" #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4 #1 [ffff881024ccfd40] schedule at ffffffff8150c2cf #2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7 #3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb #4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm] #5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma] #6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds] #7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds] #8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670 PID: 45659 PID: 47039 rds_ib_laddr_check /* create id_priv with a null event_handler */ rdma_create_id rdma_bind_addr cma_acquire_dev /* add id_priv to cma_dev->id_list */ cma_attach_to_dev cma_ndev_work_handler /* event_hanlder is null */ id_priv->id.event_handler Signed-off-by: Guanglei Li Signed-off-by: Honglei Wang Reviewed-by: Junxiao Bi Reviewed-by: Yanjun Zhu Reviewed-by: Leon Romanovsky Acked-by: Santosh Shilimkar Acked-by: Doug Ledford Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a8e7a4e243747fdf64a459e0eb2e5091ffee9fe1 Author: John Fastabend Date: Mon Feb 5 10:17:54 2018 -0800 bpf: sockmap, fix leaking maps with attached but not detached progs [ Upstream commit 3d9e952697de89b53227f06d4241f275eb99cfc4 ] When a program is attached to a map we increment the program refcnt to ensure that the program is not removed while it is potentially being referenced from sockmap side. However, if this same program also references the map (this is a reasonably common pattern in my programs) then the verifier will also increment the maps refcnt from the verifier. This is to ensure the map doesn't get garbage collected while the program has a reference to it. So we are left in a state where the map holds the refcnt on the program stopping it from being removed and releasing the map refcnt. And vice versa the program holds a refcnt on the map stopping it from releasing the refcnt on the prog. All this is fine as long as users detach the program while the map fd is still around. But, if the user omits this detach command we are left with a dangling map we can no longer release. To resolve this when the map fd is released decrement the program references and remove any reference from the map to the program. This fixes the issue with possibly dangling map and creates a user side API constraint. That is, the map fd must be held open for programs to be attached to a map. Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Signed-off-by: John Fastabend Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 05c062c3685e227a22c96c93e1149fcad383ecbf Author: Ross Lagerwall Date: Thu Jan 11 09:36:37 2018 +0000 xen/grant-table: Use put_page instead of free_page [ Upstream commit 3ac7292a25db1c607a50752055a18aba32ac2176 ] The page given to gnttab_end_foreign_access() to free could be a compound page so use put_page() instead of free_page() since it can handle both compound and single pages correctly. This bug was discovered when migrating a Xen VM with several VIFs and CONFIG_DEBUG_VM enabled. It hits a BUG usually after fewer than 10 iterations. All netfront devices disconnect from the backend during a suspend/resume and this will call gnttab_end_foreign_access() if a netfront queue has an outstanding skb. The mismatch between calling get_page() and free_page() on a compound page causes a reference counting error which is detected when DEBUG_VM is enabled. Signed-off-by: Ross Lagerwall Reviewed-by: Boris Ostrovsky Signed-off-by: Juergen Gross Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 70f3461c23ffb394676cb53c2eb1095208a52327 Author: Ross Lagerwall Date: Thu Jan 11 09:36:38 2018 +0000 xen-netfront: Fix race between device setup and open [ Upstream commit f599c64fdf7d9c108e8717fb04bc41c680120da4 ] When a netfront device is set up it registers a netdev fairly early on, before it has set up the queues and is actually usable. A userspace tool like NetworkManager will immediately try to open it and access its state as soon as it appears. The bug can be reproduced by hotplugging VIFs until the VM runs out of grant refs. It registers the netdev but fails to set up any queues (since there are no more grant refs). In the meantime, NetworkManager opens the device and the kernel crashes trying to access the queues (of which there are none). Fix this in two ways: * For initial setup, register the netdev much later, after the queues are setup. This avoids the race entirely. * During a suspend/resume cycle, the frontend reconnects to the backend and the queues are recreated. It is possible (though highly unlikely) to race with something opening the device and accessing the queues after they have been destroyed but before they have been recreated. Extend the region covered by the rtnl semaphore to protect against this race. There is a possibility that we fail to recreate the queues so check for this in the open function. Signed-off-by: Ross Lagerwall Reviewed-by: Boris Ostrovsky Signed-off-by: Juergen Gross Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2f79b5e52d46db124ad04c994ce0c2d43244de85 Author: Jiri Olsa Date: Thu Feb 1 09:38:10 2018 +0100 perf evsel: Fix period/freq terms setup [ Upstream commit 49c0ae80eb32426fa133246200628e529067c595 ] Stephane reported that we don't set properly PERIOD sample type for events with period term defined. Before: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME|PERIOD, ... After: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME, ... Setting PERIOD sample type based on period term setup. Committer note: When we use -c or a period=N term in the event definition, then we don't need to ask the kernel, for this event, via perf_event_attr.sample_type |= PERF_SAMPLE_PERIOD, to put the event period in each sample for this event, as we know it already, it is in perf_event_attr.sample_period. Reported-by: Stephane Eranian Signed-off-by: Jiri Olsa Tested-by: Stephane Eranian Cc: Alexander Shishkin Cc: Andi Kleen Cc: David Ahern Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20180201083812.11359-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b1f9f9fb3f99d8942991aa5976de73fd865937ec Author: Matt Redfearn Date: Fri Jan 5 10:31:07 2018 +0000 MIPS: Generic: Support GIC in EIC mode [ Upstream commit 7bf8b16d1b60419c865e423b907a05f413745b3e ] The GIC supports running in External Interrupt Controller (EIC) mode, and will signal this via cpu_has_veic if enabled in hardware. Currently the generic kernel will panic if cpu_has_veic is set - but the GIC can legitimately set this flag if either configured to boot in EIC mode, or if the GIC driver enables this mode. Make the kernel not panic in this case, and instead just check if the GIC is present. If so, use it's CPU local interrupt routing functions. If an EIC is present, but it is not the GIC, then the kernel does not know how to get the VIRQ for the CPU local interrupts and should panic. Support for alternative EICs being present is needed here for the generic kernel to support them. Suggested-by: Paul Burton Signed-off-by: Matt Redfearn Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18191/ Signed-off-by: James Hogan Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 76e3ea2f95632d23aa12e954940c69e32544330b Author: Jiri Olsa Date: Thu Feb 1 09:38:11 2018 +0100 perf record: Fix period option handling [ Upstream commit f290aa1ffa45ed7e37599840878b4dae68269ee1 ] Stephan reported we don't unset PERIOD sample type when --no-period is specified. Adding the unset check and reset PERIOD if --no-period is specified. Committer notes: Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when --no-period is used. Before: # perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # After: [root@jouet ~]# perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ] [root@jouet ~]# perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# Reported-by: Stephane Eranian Signed-off-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Tested-by: Stephane Eranian Cc: Alexander Shishkin Cc: Andi Kleen Cc: David Ahern Cc: Namhyung Kim Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/20180201083812.11359-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f938c2acc829689a29f3b1b2aa123285c31b2c67 Author: Matt Redfearn Date: Mon Jan 29 11:26:45 2018 +0000 MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS [ Upstream commit 0cde5b44a30f1daaef1c34e08191239dc63271c4 ] When commit b27311e1cace ("MIPS: TXx9: Add RBTX4939 board support") added board support for the RBTX4939, it added a call to led_classdev_register even if the LED class is built as a module. Built-in arch code cannot call module code directly like this. Commit b33b44073734 ("MIPS: TXX9: use IS_ENABLED() macro") subsequently changed the inclusion of this code to a single check that CONFIG_LEDS_CLASS is either builtin or a module, but the same issue remains. This leads to MIPS allmodconfig builds failing when CONFIG_MACH_TX49XX=y is set: arch/mips/txx9/rbtx4939/setup.o: In function `rbtx4939_led_probe': setup.c:(.init.text+0xc0): undefined reference to `of_led_classdev_register' make: *** [Makefile:999: vmlinux] Error 1 Fix this by using the IS_BUILTIN() macro instead. Fixes: b27311e1cace ("MIPS: TXx9: Add RBTX4939 board support") Signed-off-by: Matt Redfearn Reviewed-by: James Hogan Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18544/ Signed-off-by: James Hogan Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 3e01c16d87511071679d3d3d14c0f8d37b856b52 Author: Yonghong Song Date: Fri Feb 2 22:37:15 2018 -0800 bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y [ Upstream commit 09584b406742413ac4c8d7e030374d4daa045b69 ] With CONFIG_BPF_JIT_ALWAYS_ON is defined in the config file, tools/testing/selftests/bpf/test_kmod.sh failed like below: [root@localhost bpf]# ./test_kmod.sh sysctl: setting key "net.core.bpf_jit_enable": Invalid argument [ JIT enabled:0 hardened:0 ] [ 132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:0 ] [ 133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:1 ] [ 134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:2 ] [ 136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [root@localhost bpf]# The test_kmod.sh load/remove test_bpf.ko multiple times with different settings for sysctl net.core.bpf_jit_{enable,harden}. The failed test #297 of test_bpf.ko is designed such that JIT always fails. Commit 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config) introduced the following tightening logic: ... if (!bpf_prog_is_dev_bound(fp->aux)) { fp = bpf_int_jit_compile(fp); #ifdef CONFIG_BPF_JIT_ALWAYS_ON if (!fp->jited) { *err = -ENOTSUPP; return fp; } #endif ... With this logic, Test #297 always gets return value -ENOTSUPP when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure. This patch fixed the failure by marking Test #297 as expected failure when CONFIG_BPF_JIT_ALWAYS_ON is defined. Fixes: 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config) Signed-off-by: Yonghong Song Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 74abca65f1e43f747ee429441792be9394667944 Author: Hans de Goede Date: Fri Jan 26 16:02:59 2018 +0100 ACPI / scan: Use acpi_bus_get_status() to initialize ACPI_TYPE_DEVICE devs [ Upstream commit 63347db0affadcbccd5613116ea8431c70139b3e ] The acpi_get_bus_status wrapper for acpi_bus_get_status_handle has some code to handle certain device quirks, in some cases we also need this quirk handling for the initial _STA call. Specifically on some devices calling _STA before all _DEP dependencies are met results in errors like these: [ 0.123579] ACPI Error: No handler for Region [ECRM] (00000000ba9edc4c) [GenericSerialBus] (20170831/evregion-166) [ 0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20170831/exfldio-299) [ 0.123618] ACPI Error: Method parse/execution failed \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550) acpi_get_bus_status already has code to avoid this, so by using it we also silence these errors from the initial _STA call. Note that in order for the acpi_get_bus_status handling for this to work, we initialize dep_unmet to 1 until acpi_device_dep_initialize gets called, this means that battery devices will be instantiated with an initial status of 0. This is not a problem, acpi_bus_attach will get called soon after the instantiation anyways and it will update the status as first point of order. Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f920e914801c57f6853a62bba081fd6f3b11aede Author: Hans de Goede Date: Fri Jan 26 16:02:58 2018 +0100 ACPI / bus: Do not call _STA on battery devices with unmet dependencies [ Upstream commit 54ddce7062242036402242242c07c60c0b505f84 ] The battery code uses acpi_device->dep_unmet to check for unmet deps and if there are unmet deps it does not bind to the device to avoid errors about missing OpRegions when calling ACPI methods on the device. The missing OpRegions when there are unmet deps problem also applies to the _STA method of some battery devices and calling it too early results in errors like these: [ 0.123579] ACPI Error: No handler for Region [ECRM] (00000000ba9edc4c) [GenericSerialBus] (20170831/evregion-166) [ 0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20170831/exfldio-299) [ 0.123618] ACPI Error: Method parse/execution failed \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550) This commit fixes these errors happening when acpi_get_bus_status gets called by checking dep_unmet for battery devices and reporting a status of 0 until all dependencies are met. Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 51939996acde4bce855f70a32544aaf44f8e3f0a Author: Chen Yu Date: Mon Jan 29 10:26:46 2018 +0800 ACPI: processor_perflib: Do not send _PPC change notification if not ready [ Upstream commit ba1edb9a5125a617d612f98eead14b9b84e75c3a ] The following warning was triggered after resumed from S3 - if all the nonboot CPUs were put offline before suspend: [ 1840.329515] unchecked MSR access error: RDMSR from 0x771 at rIP: 0xffffffff86061e3a (native_read_msr+0xa/0x30) [ 1840.329516] Call Trace: [ 1840.329521] __rdmsr_on_cpu+0x33/0x50 [ 1840.329525] generic_exec_single+0x81/0xb0 [ 1840.329527] smp_call_function_single+0xd2/0x100 [ 1840.329530] ? acpi_ds_result_pop+0xdd/0xf2 [ 1840.329532] ? acpi_ds_create_operand+0x215/0x23c [ 1840.329534] rdmsrl_on_cpu+0x57/0x80 [ 1840.329536] ? cpumask_next+0x1b/0x20 [ 1840.329538] ? rdmsrl_on_cpu+0x57/0x80 [ 1840.329541] intel_pstate_update_perf_limits+0xf3/0x220 [ 1840.329544] ? notifier_call_chain+0x4a/0x70 [ 1840.329546] intel_pstate_set_policy+0x4e/0x150 [ 1840.329548] cpufreq_set_policy+0xcd/0x2f0 [ 1840.329550] cpufreq_update_policy+0xb2/0x130 [ 1840.329552] ? cpufreq_update_policy+0x130/0x130 [ 1840.329556] acpi_processor_ppc_has_changed+0x65/0x80 [ 1840.329558] acpi_processor_notify+0x80/0x100 [ 1840.329561] acpi_ev_notify_dispatch+0x44/0x5c [ 1840.329563] acpi_os_execute_deferred+0x14/0x20 [ 1840.329565] process_one_work+0x193/0x3c0 [ 1840.329567] worker_thread+0x35/0x3b0 [ 1840.329569] kthread+0x125/0x140 [ 1840.329571] ? process_one_work+0x3c0/0x3c0 [ 1840.329572] ? kthread_park+0x60/0x60 [ 1840.329575] ? do_syscall_64+0x67/0x180 [ 1840.329577] ret_from_fork+0x25/0x30 [ 1840.329585] unchecked MSR access error: WRMSR to 0x774 (tried to write 0x0000000000000000) at rIP: 0xffffffff86061f78 (native_write_msr+0x8/0x30) [ 1840.329586] Call Trace: [ 1840.329587] __wrmsr_on_cpu+0x37/0x40 [ 1840.329589] generic_exec_single+0x81/0xb0 [ 1840.329592] smp_call_function_single+0xd2/0x100 [ 1840.329594] ? acpi_ds_create_operand+0x215/0x23c [ 1840.329595] ? cpumask_next+0x1b/0x20 [ 1840.329597] wrmsrl_on_cpu+0x57/0x70 [ 1840.329598] ? rdmsrl_on_cpu+0x57/0x80 [ 1840.329599] ? wrmsrl_on_cpu+0x57/0x70 [ 1840.329602] intel_pstate_hwp_set+0xd3/0x150 [ 1840.329604] intel_pstate_set_policy+0x119/0x150 [ 1840.329606] cpufreq_set_policy+0xcd/0x2f0 [ 1840.329607] cpufreq_update_policy+0xb2/0x130 [ 1840.329610] ? cpufreq_update_policy+0x130/0x130 [ 1840.329613] acpi_processor_ppc_has_changed+0x65/0x80 [ 1840.329615] acpi_processor_notify+0x80/0x100 [ 1840.329617] acpi_ev_notify_dispatch+0x44/0x5c [ 1840.329619] acpi_os_execute_deferred+0x14/0x20 [ 1840.329620] process_one_work+0x193/0x3c0 [ 1840.329622] worker_thread+0x35/0x3b0 [ 1840.329624] kthread+0x125/0x140 [ 1840.329625] ? process_one_work+0x3c0/0x3c0 [ 1840.329626] ? kthread_park+0x60/0x60 [ 1840.329628] ? do_syscall_64+0x67/0x180 [ 1840.329631] ret_from_fork+0x25/0x30 This is because if there's only one online CPU, the MSR_PM_ENABLE (package wide)can not be enabled after resumed, due to intel_pstate_hwp_enable() will only be invoked on AP's online process after resumed - if there's no AP online, the HWP remains disabled after resumed (BIOS has disabled it in S3). Then if there comes a _PPC change notification which touches HWP register during this stage, the warning is triggered. Since we don't call acpi_processor_register_performance() when HWP is enabled, the pr->performance will be NULL. When this is NULL we don't need to do _PPC change notification. Reported-by: Doug Smythies Suggested-by: Srinivas Pandruvada Signed-off-by: Yu Chen Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 573cb560b4edd71a6b9dd41cdc39d27abb8dda30 Author: Jean Delvare Date: Sat Feb 3 11:25:20 2018 +0100 firmware: dmi_scan: Fix handling of empty DMI strings [ Upstream commit a7770ae194569e96a93c48aceb304edded9cc648 ] The handling of empty DMI strings looks quite broken to me: * Strings from 1 to 7 spaces are not considered empty. * True empty DMI strings (string index set to 0) are not considered empty, and result in allocating a 0-char string. * Strings with invalid index also result in allocating a 0-char string. * Strings starting with 8 spaces are all considered empty, even if non-space characters follow (sounds like a weird thing to do, but I have actually seen occurrences of this in DMI tables before.) * Strings which are considered empty are reported as 8 spaces, instead of being actually empty. Some of these issues are the result of an off-by-one error in memcmp, the rest is incorrect by design. So let's get it square: missing strings and strings made of only spaces, regardless of their length, should be treated as empty and no memory should be allocated for them. All other strings are non-empty and should be allocated. Signed-off-by: Jean Delvare Fixes: 79da4721117f ("x86: fix DMI out of memory problems") Cc: Parag Warudkar Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ee06ed9ba5182bfdbc829f150b594c7f14a9d62f Author: Arnd Bergmann Date: Fri Feb 2 15:56:17 2018 +0100 x86/dumpstack: Avoid uninitlized variable [ Upstream commit ebfc15019cfa72496c674ffcb0b8ef10790dcddc ] In some configurations, 'partial' does not get initialized, as shown by this gcc-8 warning: arch/x86/kernel/dumpstack.c: In function 'show_trace_log_lvl': arch/x86/kernel/dumpstack.c:156:4: error: 'partial' may be used uninitialized in this function [-Werror=maybe-uninitialized] show_regs_if_on_stack(&stack_info, regs, partial); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This initializes it to false, to get the previous behavior in this case. Fixes: a9cdbe72c4e8 ("x86/dumpstack: Fix partial register dumps") Signed-off-by: Arnd Bergmann Signed-off-by: Thomas Gleixner Cc: Andi Kleen Cc: Nicolas Pitre Cc: Peter Zijlstra Cc: Dave Hansen Cc: Andy Lutomirski Cc: Josh Poimboeuf Cc: Borislav Petkov Cc: Vlastimil Babka Link: https://lkml.kernel.org/r/20180202145634.200291-1-arnd@arndb.de Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 423505471f5ed18ed88afe66592931f91245fce2 Author: Arnd Bergmann Date: Fri Feb 2 15:56:18 2018 +0100 x86/power: Fix swsusp_arch_resume prototype [ Upstream commit 328008a72d38b5bde6491e463405c34a81a65d3e ] The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the definition in x86-32 does not, and it fails to include the header with the declaration. This leads to a warning when building with link-time-optimizations: kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match original declaration [-Werror=lto-type-mismatch] extern asmlinkage int swsusp_arch_resume(void); ^ arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously declared here int swsusp_arch_resume(void) This moves the declaration into a globally visible header file and fixes up both x86 definitions to match it. Signed-off-by: Arnd Bergmann Signed-off-by: Thomas Gleixner Cc: Len Brown Cc: Andi Kleen Cc: Nicolas Pitre Cc: linux-pm@vger.kernel.org Cc: "Rafael J. Wysocki" Cc: Pavel Machek Cc: Bart Van Assche Link: https://lkml.kernel.org/r/20180202145634.200291-2-arnd@arndb.de Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 074372c8124cbd62a8edf8b979fa814b4e223ece Author: Subash Abhinov Kasiviswanathan Date: Wed Jan 31 04:50:01 2018 -0700 netfilter: ipv6: nf_defrag: Kill frag queue on RFC2460 failure [ Upstream commit ea23d5e3bf340e413b8e05c13da233c99c64142b ] Failures were seen in ICMPv6 fragmentation timeout tests if they were run after the RFC2460 failure tests. Kernel was not sending out the ICMPv6 fragment reassembly time exceeded packet after the fragmentation reassembly timeout of 1 minute had elapsed. This happened because the frag queue was not released if an error in IPv6 fragmentation header was detected by RFC2460. Fixes: 83f1999caeb1 ("netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460") Signed-off-by: Subash Abhinov Kasiviswanathan Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2cd5100363b794cdd26d878cbde0c12c33687e7d Author: Sebastian Ott Date: Tue Jan 23 13:58:05 2018 +0100 s390/eadm: fix CONFIG_BLOCK include dependency [ Upstream commit 366b77ae43c5a3bf1a367f15ec8bc16e05035f14 ] Commit 2a842acab109 ("block: introduce new block status code type") added blk_status_t usage to the eadm subchannel driver. However blk_status_t is unknown when included via for CONFIG_BLOCK=n. Only include since this is the only dependency eadm has. This fixes build failures like below: In file included from drivers/s390/cio/eadm_sch.c:24:0: ./arch/s390/include/asm/eadm.h:111:4: error: unknown type name 'blk_status_t'; did you mean 'si_status'? blk_status_t error); Reported-by: Heiko Carstens Signed-off-by: Sebastian Ott Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit eb41efa1386582c00865a3dec5758cfbda6dbda8 Author: Karol Herbst Date: Mon Nov 6 16:32:41 2017 +0100 drm/nouveau/pmu/fuc: don't use movw directly anymore [ Upstream commit fe9748b7b41cee11f8db57fb8b20bc540a33102a ] Fixes failure to compile with recent envyas as a result of the 'movw' alias being removed for v5. A bit of history: v3 only has a 16-bit sign-extended immediate mov op. In order to set the high bits, there's a separate 'sethi' op. envyas validates that the value passed to mov(imm) is between -0x8000 and 0x7fff. In order to simplify macros that load both the low and high word, a 'movw' alias was added which takes an unsigned 16-bit immediate. However the actual hardware op still sign extends. v5 has a full 32-bit immediate mov op. The v3 16-bit immediate mov op is gone (loads 0 into the dst reg). However due to a bug in envyas, the movw alias still existed, and selected the no-longer-present v3 16-bit immediate mov op. As a result usage of movw on v5 is the same as mov with a 0x0 argument. The proper fix throughout is to only ever use the 'movw' alias in combination with 'sethi'. Anything else should get the sign-extended validation to ensure that the intended value ends up in the destination register. Changes in fuc3 binaries is the result of a different encoding being selected for a mov with an 8-bit value. v2: added commit message written by Ilia, thanks for that! v3: messed up rebasing, now it should apply Signed-off-by: Karol Herbst Signed-off-by: Ben Skeggs Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit fd370b8e65e3364a8a4411656506ccfaf13b4599 Author: Don Hiatt Date: Thu Feb 1 10:57:03 2018 -0800 IB/core: Map iWarp AH type to undefined in rdma_ah_find_type [ Upstream commit 87daac68f77a3e21a1113f816e6a7be0b38bdde8 ] iWarp devices do not support the creation of address handles so return AH_ATTR_TYPE_UNDEFINED for all iWarp devices. While we are here reduce the size of port_num to u8 and add a comment. Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Reported-by: Parav Pandit CC: Sean Hefty Reviewed-by: Ira Weiny Reviewed-by: Shiraz Saleem Signed-off-by: Don Hiatt Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f63bb02694f08e94a50ae334d16b9908426655f7 Author: Alex Estrin Date: Thu Feb 1 10:55:41 2018 -0800 IB/ipoib: Fix for potential no-carrier state [ Upstream commit 1029361084d18cc270f64dfd39529fafa10cfe01 ] On reboot SM can program port pkey table before ipoib registered its event handler, which could result in missing pkey event and leave root interface with initial pkey value from index 0. Since OPA port starts with invalid pkey in index 0, root interface will fail to initialize and stay down with no-carrier flag. For IB ipoib interface may end up with pkey different from value opensm put in pkey table idx 0, resulting in connectivity issues (different mcast groups, for example). Close the window by calling event handler after registration to make sure ipoib pkey is in sync with port pkey table. Reviewed-by: Mike Marciniszyn Reviewed-by: Ira Weiny Signed-off-by: Alex Estrin Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8f96d408a9547e2a814562796c6f7fac3e839332 Author: Alex Estrin Date: Thu Feb 1 10:43:58 2018 -0800 IB/hfi1: Fix for potential refcount leak in hfi1_open_file() [ Upstream commit 2b1e7fe16124e86ee9242aeeee859c79a843e3a2 ] The dd refcount is speculatively incremented prior to allocating the fd memory with kzalloc(). If that kzalloc() failed the dd refcount leaks. Increment refcount on kzalloc success. Fixes: e11ffbd57520 ("IB/hfi1: Do not free hfi1 cdev parent structure early") Reviewed-by: Michael J Ruhl Signed-off-by: Alex Estrin Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 5ceae7690f0d0575c730663ffd67493c9fa4e992 Author: Michael J. Ruhl Date: Thu Feb 1 10:43:42 2018 -0800 IB/hfi1: Re-order IRQ cleanup to address driver cleanup race [ Upstream commit 82a979265638c505e12fbe7ba40980dc0901436d ] The pci_request_irq() interfaces always adds the IRQF_SHARED bit to all IRQ requests. When the kernel is built with CONFIG_DEBUG_SHIRQ config flag, if the IRQF_SHARED bit is set, a call to the IRQ handler is made from the __free_irq() function. This is testing a race condition between the IRQ cleanup and an IRQ racing the cleanup. The HFI driver should be able to handle this race, but does not. This race can cause traces that start with this footprint: BUG: unable to handle kernel NULL pointer dereference at (null) Call Trace: ... __free_irq+0x1b3/0x2d0 free_irq+0x35/0x70 pci_free_irq+0x1c/0x30 clean_up_interrupts+0x53/0xf0 [hfi1] hfi1_start_cleanup+0x122/0x190 [hfi1] postinit_cleanup+0x1d/0x280 [hfi1] remove_one+0x233/0x250 [hfi1] pci_device_remove+0x39/0xc0 Export IRQ cleanup function so it can be called from other modules. Using the exported cleanup function: Re-order the driver cleanup code to clean up IRQ resources before other resources, eliminating the race. Re-order error path for init so that the race does not occur. Reduce severity on spurious error message for SDMA IRQs to info. Reviewed-by: Alex Estrin Reviewed-by: Patel Jay P Reviewed-by: Mike Marciniszyn Signed-off-by: Michael J. Ruhl Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 73027d80d67e151787603e65bf62bdbc4125877c Author: Jens Axboe Date: Thu Feb 1 14:01:02 2018 -0700 blk-mq: fix discard merge with scheduler attached [ Upstream commit 445251d0f4d329aa061f323546cd6388a3bb7ab5 ] I ran into an issue on my laptop that triggered a bug on the discard path: WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3/0x430 Modules linked in: rfcomm fuse ctr ccm bnep arc4 binfmt_misc snd_hda_codec_hdmi nls_iso8859_1 nls_cp437 vfat snd_hda_codec_conexant fat snd_hda_codec_generic iwlmvm snd_hda_intel snd_hda_codec snd_hwdep mac80211 snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq x86_pkg_temp_thermal intel_powerclamp kvm_intel uvcvideo iwlwifi btusb snd_seq_device videobuf2_vmalloc btintel videobuf2_memops kvm snd_timer videobuf2_v4l2 bluetooth irqbypass videobuf2_core aesni_intel aes_x86_64 crypto_simd cryptd snd glue_helper videodev cfg80211 ecdh_generic soundcore hid_generic usbhid hid i915 psmouse e1000e ptp pps_core xhci_pci xhci_hcd intel_gtt CPU: 2 PID: 207 Comm: jbd2/nvme0n1p7- Tainted: G U 4.15.0+ #176 Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET59W (1.33 ) 12/19/2017 RIP: 0010:nvme_setup_cmd+0x3d3/0x430 RSP: 0018:ffff880423e9f838 EFLAGS: 00010217 RAX: 0000000000000000 RBX: ffff880423e9f8c8 RCX: 0000000000010000 RDX: ffff88022b200010 RSI: 0000000000000002 RDI: 00000000327f0000 RBP: ffff880421251400 R08: ffff88022b200000 R09: 0000000000000009 R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000ffff R13: ffff88042341e280 R14: 000000000000ffff R15: ffff880421251440 FS: 0000000000000000(0000) GS:ffff880441500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b684795030 CR3: 0000000002e09006 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: nvme_queue_rq+0x40/0xa00 ? __sbitmap_queue_get+0x24/0x90 ? blk_mq_get_tag+0xa3/0x250 ? wait_woken+0x80/0x80 ? blk_mq_get_driver_tag+0x97/0xf0 blk_mq_dispatch_rq_list+0x7b/0x4a0 ? deadline_remove_request+0x49/0xb0 blk_mq_do_dispatch_sched+0x4f/0xc0 blk_mq_sched_dispatch_requests+0x106/0x170 __blk_mq_run_hw_queue+0x53/0xa0 __blk_mq_delay_run_hw_queue+0x83/0xa0 blk_mq_run_hw_queue+0x6c/0xd0 blk_mq_sched_insert_request+0x96/0x140 __blk_mq_try_issue_directly+0x3d/0x190 blk_mq_try_issue_directly+0x30/0x70 blk_mq_make_request+0x1a4/0x6a0 generic_make_request+0xfd/0x2f0 ? submit_bio+0x5c/0x110 submit_bio+0x5c/0x110 ? __blkdev_issue_discard+0x152/0x200 submit_bio_wait+0x43/0x60 ext4_process_freed_data+0x1cd/0x440 ? account_page_dirtied+0xe2/0x1a0 ext4_journal_commit_callback+0x4a/0xc0 jbd2_journal_commit_transaction+0x17e2/0x19e0 ? kjournald2+0xb0/0x250 kjournald2+0xb0/0x250 ? wait_woken+0x80/0x80 ? commit_timeout+0x10/0x10 kthread+0x111/0x130 ? kthread_create_worker_on_cpu+0x50/0x50 ? do_group_exit+0x3a/0xa0 ret_from_fork+0x1f/0x30 Code: 73 89 c1 83 ce 10 c1 e1 10 09 ca 83 f8 04 0f 87 0f ff ff ff 8b 4d 20 48 8b 7d 00 c1 e9 09 48 01 8c c7 00 08 00 00 e9 f8 fe ff ff <0f> ff 4c 89 c7 41 bc 0a 00 00 00 e8 0d 78 d6 ff e9 a1 fc ff ff ---[ end trace 50d361cc444506c8 ]--- print_req_error: I/O error, dev nvme0n1, sector 847167488 Decoding the assembly, the request claims to have 0xffff segments, while nvme counts two. This turns out to be because we don't check for a data carrying request on the mq scheduler path, and since blk_phys_contig_segment() returns true for a non-data request, we decrement the initial segment count of 0 and end up with 0xffff in the unsigned short. There are a few issues here: 1) We should initialize the segment count for a discard to 1. 2) The discard merging is currently using the data limits for segments and sectors. Fix this up by having attempt_merge() correctly identify the request, and by initializing the segment count correctly for discards. This can only be triggered with mq-deadline on discard capable devices right now, which isn't a common configuration. Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6eddea4ba5ccd00875d6f07ea042f16335a854ba Author: Ed Swierk Date: Wed Jan 31 18:48:02 2018 -0800 openvswitch: Remove padding from packet before L3+ conntrack processing [ Upstream commit 9382fe71c0058465e942a633869629929102843d ] IPv4 and IPv6 packets may arrive with lower-layer padding that is not included in the L3 length. For example, a short IPv4 packet may have up to 6 bytes of padding following the IP payload when received on an Ethernet device with a minimum packet length of 64 bytes. Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(), and help() in nf_conntrack_ftp) assume skb->len reflects the length of the L3 header and payload, rather than referring back to ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by lower-layer padding. In the normal IPv4 receive path, ip_rcv() trims the packet to ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly in the br_netfilter receive path, br_validate_ipv4() and br_validate_ipv6() trim the packet to the L3 length before invoking netfilter hooks. Currently in the OVS conntrack receive path, ovs_ct_execute() pulls the skb to the L3 header but does not trim it to the L3 length before calling nf_conntrack_in(NF_INET_PRE_ROUTING). When nf_conntrack_proto_tcp encounters a packet with lower-layer padding, nf_ip_checksum() fails causing a "nf_ct_tcp: bad TCP checksum" log message. While extra zero bytes don't affect the checksum, the length in the IP pseudoheader does. That length is based on skb->len, and without trimming, it doesn't match the length the sender used when computing the checksum. In ovs_ct_execute(), trim the skb to the L3 length before higher-layer processing. Signed-off-by: Ed Swierk Acked-by: Pravin B Shelar Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 3b1d9626fc58c133ae434368495a3b33c1b18ccd Author: shidao.ytt Date: Wed Jan 31 16:19:55 2018 -0800 mm/fadvise: discard partial page if endbyte is also EOF [ Upstream commit a7ab400d6fe73d0119fdc234e9982a6f80faea9f ] During our recent testing with fadvise(FADV_DONTNEED), we find that if given offset/length is not page-aligned, the last page will not be discarded. The tool we use is vmtouch (https://hoytech.com/vmtouch/), we map a 10KB-sized file into memory and then try to run this tool to evict the whole file mapping, but the last single page always remains staying in the memory: $./vmtouch -e test_10K Files: 1 Directories: 0 Evicted Pages: 3 (12K) Elapsed: 2.1e-05 seconds $./vmtouch test_10K Files: 1 Directories: 0 Resident Pages: 1/3 4K/12K 33.3% Elapsed: 5.5e-05 seconds However when we test with an older kernel, say 3.10, this problem is gone. So we wonder if this is a regression: $./vmtouch -e test_10K Files: 1 Directories: 0 Evicted Pages: 3 (12K) Elapsed: 8.2e-05 seconds $./vmtouch test_10K Files: 1 Directories: 0 Resident Pages: 0/3 0/12K 0% <-- partial page also discarded Elapsed: 5e-05 seconds After digging a little bit into this problem, we find it seems not a regression. Not discarding partial page is likely to be on purpose according to commit 441c228f817f ("mm: fadvise: document the fadvise(FADV_DONTNEED) behaviour for partial pages") written by Mel Gorman. He explained why partial pages should be preserved instead of being discarded when using fadvise(FADV_DONTNEED). However, the interesting part is that the actual code did NOT work as the same as it was described, the partial page was still discarded anyway, due to a calculation mistake of `end_index' passed to invalidate_mapping_pages(). This mistake has not been fixed until recently, that's why we fail to reproduce our problem in old kernels. The fix is done in commit 18aba41cbf ("mm/fadvise.c: do not discard partial pages with POSIX_FADV_DONTNEED") by Oleg Drokin. Back to the original testing, our problem becomes that there is a special case that, if the page-unaligned `endbyte' is also the end of file, it is not necessary at all to preserve the last partial page, as we all know no one else will use the rest of it. It should be safe enough if we just discard the whole page. So we add an EOF check in this patch. We also find a poosbile real world issue in mainline kernel. Assume such scenario: A userspace backup application want to backup a huge amount of small files (<4k) at once, the developer might (I guess) want to use fadvise(FADV_DONTNEED) to save memory. However, FADV_DONTNEED won't really happen since the only page mapped is a partial page, and kernel will preserve it. Our patch also fixes this problem, since we know the endbyte is EOF, so we discard it. Here is a simple reproducer to reproduce and verify each scenario we described above: test_fadvise.c ============================== #include #include #include #include #include #include #include int main(int argc, char **argv) { int i, fd, ret, len; struct stat buf; void *addr; unsigned char *vec; char *strbuf; ssize_t pagesize = getpagesize(); ssize_t filesize; fd = open(argv[1], O_RDWR|O_CREAT, S_IRUSR|S_IWUSR); if (fd < 0) return -1; filesize = strtoul(argv[2], NULL, 10); strbuf = malloc(filesize); memset(strbuf, 42, filesize); write(fd, strbuf, filesize); free(strbuf); fsync(fd); len = (filesize + pagesize - 1) / pagesize; printf("length of pages: %d\n", len); addr = mmap(NULL, filesize, PROT_READ, MAP_SHARED, fd, 0); if (addr == MAP_FAILED) return -1; ret = posix_fadvise(fd, 0, filesize, POSIX_FADV_DONTNEED); if (ret < 0) return -1; vec = malloc(len); ret = mincore(addr, filesize, (void *)vec); if (ret < 0) return -1; for (i = 0; i < len; i++) printf("pages[%d]: %x\n", i, vec[i] & 0x1); free(vec); close(fd); return 0; } ============================== Test 1: running on kernel with commit 18aba41cbf reverted: [root@caspar ~]# uname -r 4.15.0-rc6.revert+ [root@caspar ~]# ./test_fadvise file1 1024 length of pages: 1 pages[0]: 0 # <-- partial page discarded [root@caspar ~]# ./test_fadvise file2 8192 length of pages: 2 pages[0]: 0 pages[1]: 0 [root@caspar ~]# ./test_fadvise file3 10240 length of pages: 3 pages[0]: 0 pages[1]: 0 pages[2]: 0 # <-- partial page discarded Test 2: running on mainline kernel: [root@caspar ~]# uname -r 4.15.0-rc6+ [root@caspar ~]# ./test_fadvise test1 1024 length of pages: 1 pages[0]: 1 # <-- partial and the only page not discarded [root@caspar ~]# ./test_fadvise test2 8192 length of pages: 2 pages[0]: 0 pages[1]: 0 [root@caspar ~]# ./test_fadvise test3 10240 length of pages: 3 pages[0]: 0 pages[1]: 0 pages[2]: 1 # <-- partial page not discarded Test 3: running on kernel with this patch: [root@caspar ~]# uname -r 4.15.0-rc6.patched+ [root@caspar ~]# ./test_fadvise test1 1024 length of pages: 1 pages[0]: 0 # <-- partial page and EOF, discarded [root@caspar ~]# ./test_fadvise test2 8192 length of pages: 2 pages[0]: 0 pages[1]: 0 [root@caspar ~]# ./test_fadvise test3 10240 length of pages: 3 pages[0]: 0 pages[1]: 0 pages[2]: 0 # <-- partial page and EOF, discarded [akpm@linux-foundation.org: tweak code comment] Link: http://lkml.kernel.org/r/5222da9ee20e1695eaabb69f631f200d6e6b8876.1515132470.git.jinli.zjl@alibaba-inc.com Signed-off-by: shidao.ytt Signed-off-by: Caspar Zhang Reviewed-by: Oliver Yang Cc: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 1f9c87e251583942991c482d99c1797f1454be2a Author: Mel Gorman Date: Wed Jan 31 16:19:52 2018 -0800 mm: pin address_space before dereferencing it while isolating an LRU page [ Upstream commit 69d763fc6d3aee787a3e8c8c35092b4f4960fa5d ] Minchan Kim asked the following question -- what locks protects address_space destroying when race happens between inode trauncation and __isolate_lru_page? Jan Kara clarified by describing the race as follows CPU1 CPU2 truncate(inode) __isolate_lru_page() ... truncate_inode_page(mapping, page); delete_from_page_cache(page) spin_lock_irqsave(&mapping->tree_lock, flags); __delete_from_page_cache(page, NULL) page_cache_tree_delete(..) ... mapping = page_mapping(page); page->mapping = NULL; ... spin_unlock_irqrestore(&mapping->tree_lock, flags); page_cache_free_page(mapping, page) put_page(page) if (put_page_testzero(page)) -> false - inode now has no pages and can be freed including embedded address_space if (mapping && !mapping->a_ops->migratepage) - we've dereferenced mapping which is potentially already free. The race is theoretically possible but unlikely. Before the delete_from_page_cache, truncate_cleanup_page is called so the page is likely to be !PageDirty or PageWriteback which gets skipped by the only caller that checks the mappping in __isolate_lru_page. Even if the race occurs, a substantial amount of work has to happen during a tiny window with no preemption but it could potentially be done using a virtual machine to artifically slow one CPU or halt it during the critical window. This patch should eliminate the race with truncation by try-locking the page before derefencing mapping and aborting if the lock was not acquired. There was a suggestion from Huang Ying to use RCU as a side-effect to prevent mapping being freed. However, I do not like the solution as it's an unconventional means of preserving a mapping and it's not a context where rcu_read_lock is obviously protecting rcu data. Link: http://lkml.kernel.org/r/20180104102512.2qos3h5vqzeisrek@techsingularity.net Fixes: c82449352854 ("mm: compaction: make isolate_lru_page() filter-aware again") Signed-off-by: Mel Gorman Acked-by: Minchan Kim Cc: "Huang, Ying" Cc: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8054b87fccd4fe9c67bfc164462d81b38ea56af4 Author: Yang Shi Date: Wed Jan 31 16:18:28 2018 -0800 mm: thp: use down_read_trylock() in khugepaged to avoid long block [ Upstream commit 3b454ad35043dfbd3b5d2bb92b0991d6342afb44 ] In the current design, khugepaged needs to acquire mmap_sem before scanning an mm. But in some corner cases, khugepaged may scan a process which is modifying its memory mapping, so khugepaged blocks in uninterruptible state. But the process might hold the mmap_sem for a long time when modifying a huge memory space and it may trigger the below khugepaged hung issue: INFO: task khugepaged:270 blocked for more than 120 seconds. Tainted: G E 4.9.65-006.ali3000.alios7.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. khugepaged D 0 270 2 0x00000000  ffff883f3deae4c0 0000000000000000 ffff883f610596c0 ffff883f7d359440 ffff883f63818000 ffffc90019adfc78 ffffffff817079a5 d67e5aa8c1860a64 0000000000000246 ffff883f7d359440 ffffc90019adfc88 ffff883f610596c0 Call Trace: schedule+0x36/0x80 rwsem_down_read_failed+0xf0/0x150 call_rwsem_down_read_failed+0x18/0x30 down_read+0x20/0x40 khugepaged+0x476/0x11d0 kthread+0xe6/0x100 ret_from_fork+0x25/0x30 So it sounds pointless to just block khugepaged waiting for the semaphore so replace down_read() with down_read_trylock() to move to scan the next mm quickly instead of just blocking on the semaphore so that other processes can get more chances to install THP. Then khugepaged can come back to scan the skipped mm when it has finished the current round full_scan. And it appears that the change can improve khugepaged efficiency a little bit. Below is the test result when running LTP on a 24 cores 4GB memory 2 nodes NUMA VM: pristine w/ trylock full_scan 197 187 pages_collapsed 21 26 thp_fault_alloc 40818 44466 thp_fault_fallback 18413 16679 thp_collapse_alloc 21 150 thp_collapse_alloc_failed 14 16 thp_file_alloc 369 369 [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: tweak comment] [arnd@arndb.de: avoid uninitialized variable use] Link: http://lkml.kernel.org/r/20171215125129.2948634-1-arnd@arndb.de Link: http://lkml.kernel.org/r/1513281203-54878-1-git-send-email-yang.s@alibaba-inc.com Signed-off-by: Yang Shi Acked-by: Kirill A. Shutemov Acked-by: Michal Hocko Cc: Hugh Dickins Cc: Andrea Arcangeli Signed-off-by: Arnd Bergmann Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6acb8818eff43638b691acadb048bce32d0c89a3 Author: Nitin Gupta Date: Wed Jan 31 16:18:09 2018 -0800 sparc64: update pmdp_invalidate() to return old pmd value [ Upstream commit a8e654f01cb725d0bfd741ebca1bf4c9337969cc ] It's required to avoid losing dirty and accessed bits. [akpm@linux-foundation.org: add a `do' to the do-while loop] Link: http://lkml.kernel.org/r/20171213105756.69879-9-kirill.shutemov@linux.intel.com Signed-off-by: Nitin Gupta Signed-off-by: Kirill A. Shutemov Cc: David Miller Cc: Vlastimil Babka Cc: Andrea Arcangeli Cc: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 78185a93d42ddb9595df13b3394312d07a34e832 Author: Kirill A. Shutemov Date: Wed Jan 31 16:17:43 2018 -0800 asm-generic: provide generic_pmdp_establish() [ Upstream commit c58f0bb77ed8bf93dfdde762b01cb67eebbdfc29 ] Patch series "Do not lose dirty bit on THP pages", v4. Vlastimil noted that pmdp_invalidate() is not atomic and we can lose dirty and access bits if CPU sets them after pmdp dereference, but before set_pmd_at(). The bug can lead to data loss, but the race window is tiny and I haven't seen any reports that suggested that it happens in reality. So I don't think it worth sending it to stable. Unfortunately, there's no way to address the issue in a generic way. We need to fix all architectures that support THP one-by-one. All architectures that have THP supported have to provide atomic pmdp_invalidate() that returns previous value. If generic implementation of pmdp_invalidate() is used, architecture needs to provide atomic pmdp_estabish(). pmdp_estabish() is not used out-side generic implementation of pmdp_invalidate() so far, but I think this can change in the future. This patch (of 12): This is an implementation of pmdp_establish() that is only suitable for an architecture that doesn't have hardware dirty/accessed bits. In this case we can't race with CPU which sets these bits and non-atomic approach is fine. Link: http://lkml.kernel.org/r/20171213105756.69879-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov Cc: Vlastimil Babka Cc: Andrea Arcangeli Cc: Michal Hocko Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: David Daney Cc: David Miller Cc: H. Peter Anvin Cc: Hugh Dickins Cc: Ingo Molnar Cc: Martin Schwidefsky Cc: Nitin Gupta Cc: Ralf Baechle Cc: Thomas Gleixner Cc: Vineet Gupta Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 305e56756da71e0ef7cc8d91b8e66fafbb6e5d5a Author: Yisheng Xie Date: Wed Jan 31 16:16:15 2018 -0800 mm/mempolicy: add nodes_empty check in SYSC_migrate_pages [ Upstream commit 0486a38bcc4749808edbc848f1bcf232042770fc ] As in manpage of migrate_pages, the errno should be set to EINVAL when none of the node IDs specified by new_nodes are on-line and allowed by the process's current cpuset context, or none of the specified nodes contain memory. However, when test by following case: new_nodes = 0; old_nodes = 0xf; ret = migrate_pages(pid, old_nodes, new_nodes, MAX); The ret will be 0 and no errno is set. As the new_nodes is empty, we should expect EINVAL as documented. To fix the case like above, this patch check whether target nodes AND current task_nodes is empty, and then check whether AND node_states[N_MEMORY] is empty. Link: http://lkml.kernel.org/r/1510882624-44342-4-git-send-email-xieyisheng1@huawei.com Signed-off-by: Yisheng Xie Acked-by: Vlastimil Babka Cc: Andi Kleen Cc: Chris Salls Cc: Christopher Lameter Cc: David Rientjes Cc: Ingo Molnar Cc: Naoya Horiguchi Cc: Tan Xiaojun Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6cab60ac6a0a978ac58e27f79012122f90e75a1d Author: Yisheng Xie Date: Wed Jan 31 16:16:11 2018 -0800 mm/mempolicy: fix the check of nodemask from user [ Upstream commit 56521e7a02b7b84a5e72691a1fb15570e6055545 ] As Xiaojun reported the ltp of migrate_pages01 will fail on arm64 system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2: migrate_pages01 0 TINFO : test_invalid_nodes migrate_pages01 14 TFAIL : migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1 migrate_pages01 15 TFAIL : migrate_pages_common.c:55: call succeeded unexpectedly In this case the test_invalid_nodes of migrate_pages01 will call: SYSC_migrate_pages as: migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0 The new nodes specifies one or more node IDs that are greater than the maximum supported node ID, however, the errno is not set to EINVAL as expected. As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] mentioned, when nodemask specifies one or more node IDs that are greater than the maximum supported node ID, the errno should set to EINVAL. However, get_nodes only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES), maxnode) is zero or not, and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES) unchecked. This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to let migrate_pages set the errno to EINVAL when nodemask specifies one or more node IDs that are greater than the maximum supported node ID, which follows the manpage's guide. [1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html [2] http://man7.org/linux/man-pages/man2/mbind.2.html [3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html Link: http://lkml.kernel.org/r/1510882624-44342-3-git-send-email-xieyisheng1@huawei.com Signed-off-by: Yisheng Xie Reported-by: Tan Xiaojun Acked-by: Vlastimil Babka Cc: Andi Kleen Cc: Chris Salls Cc: Christopher Lameter Cc: David Rientjes Cc: Ingo Molnar Cc: Naoya Horiguchi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a7fbc7f3134ab4848898383a0852a8e768311498 Author: piaojun Date: Wed Jan 31 16:15:32 2018 -0800 ocfs2: return error when we attempt to access a dirty bh in jbd2 [ Upstream commit d984187e3a1ad7d12447a7ab2c43ce3717a2b5b3 ] We should not reuse the dirty bh in jbd2 directly due to the following situation: 1. When removing extent rec, we will dirty the bhs of extent rec and truncate log at the same time, and hand them over to jbd2. 2. The bhs are submitted to jbd2 area successfully. 3. The write-back thread of device help flush the bhs to disk but encounter write error due to abnormal storage link. 4. After a while the storage link become normal. Truncate log flush worker triggered by the next space reclaiming found the dirty bh of truncate log and clear its 'BH_Write_EIO' and then set it uptodate in __ocfs2_journal_access(): ocfs2_truncate_log_worker ocfs2_flush_truncate_log __ocfs2_flush_truncate_log ocfs2_replay_truncate_records ocfs2_journal_access_di __ocfs2_journal_access // here we clear io_error and set 'tl_bh' uptodata. 5. Then jbd2 will flush the bh of truncate log to disk, but the bh of extent rec is still in error state, and unfortunately nobody will take care of it. 6. At last the space of extent rec was not reduced, but truncate log flush worker have given it back to globalalloc. That will cause duplicate cluster problem which could be identified by fsck.ocfs2. Sadly we can hardly revert this but set fs read-only in case of ruining atomicity and consistency of space reclaim. Link: http://lkml.kernel.org/r/5A6E8092.8090701@huawei.com Fixes: acf8fdbe6afb ("ocfs2: do not BUG if buffer not uptodate in __ocfs2_journal_access") Signed-off-by: Jun Piao Reviewed-by: Yiwen Jiang Reviewed-by: Changwei Ge Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Joseph Qi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a66174eb4a149b7919d174d9a7ce6ad9909c5ee2 Author: piaojun Date: Wed Jan 31 16:14:59 2018 -0800 ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute [ Upstream commit 16c8d569f5704a84164f30ff01b29879f3438065 ] The race between *set_acl and *get_acl will cause getting incomplete xattr data as below: processA processB ocfs2_set_acl ocfs2_xattr_set __ocfs2_xattr_set_handle ocfs2_get_acl_nolock ocfs2_xattr_get_nolock: processB may get incomplete xattr data if processA hasn't set_acl done. So we should use 'ip_xattr_sem' to protect getting extended attribute in ocfs2_get_acl_nolock(), as other processes could be changing it concurrently. Link: http://lkml.kernel.org/r/5A5DDCFF.7030001@huawei.com Signed-off-by: Jun Piao Reviewed-by: Alex Chen Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Joseph Qi Cc: Changwei Ge Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 66aaeed2796ef50af8775cad209e62c173463d05 Author: piaojun Date: Wed Jan 31 16:14:44 2018 -0800 ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid [ Upstream commit 025bcbde3634b2c9b316f227fed13ad6ad6817fb ] If metadata is corrupted such as 'invalid inode block', we will get failed by calling 'mount()' and then set filesystem readonly as below: ocfs2_mount ocfs2_initialize_super ocfs2_init_global_system_inodes ocfs2_iget ocfs2_read_locked_inode ocfs2_validate_inode_block ocfs2_error ocfs2_handle_error ocfs2_set_ro_flag(osb, 0); // set readonly In this situation we need return -EROFS to 'mount.ocfs2', so that user can fix it by fsck. And then mount again. In addition, 'mount.ocfs2' should be updated correspondingly as it only return 1 for all errno. And I will post a patch for 'mount.ocfs2' too. Link: http://lkml.kernel.org/r/5A4302FA.2010606@huawei.com Signed-off-by: Jun Piao Reviewed-by: Alex Chen Reviewed-by: Joseph Qi Reviewed-by: Changwei Ge Reviewed-by: Gang He Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 710b5124aac67c4a48d29e32c2d9525278db405b Author: Jan H. Schönherr Date: Wed Jan 31 16:14:04 2018 -0800 fs/dax.c: release PMD lock even when there is no PMD support in DAX [ Upstream commit ee190ca6516bc8257e3d36187ca6f0f71a9ec477 ] follow_pte_pmd() can theoretically return after having acquired a PMD lock, even when DAX was not compiled with CONFIG_FS_DAX_PMD. Release the PMD lock unconditionally. Link: http://lkml.kernel.org/r/20180118133839.20587-1-jschoenh@amazon.de Fixes: f729c8c9b24f ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean") Signed-off-by: Jan H. Schönherr Reviewed-by: Ross Zwisler Reviewed-by: Andrew Morton Cc: Matthew Wilcox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit cc0600dae30fe2ca2f3783599420a1b9bb88715d Author: Vitaly Kuznetsov Date: Thu Jan 25 16:37:07 2018 +0100 x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested [ Upstream commit d391f1207067268261add0485f0f34503539c5b0 ] I was investigating an issue with seabios >= 1.10 which stopped working for nested KVM on Hyper-V. The problem appears to be in handle_ept_violation() function: when we do fast mmio we need to skip the instruction so we do kvm_skip_emulated_instruction(). This, however, depends on VM_EXIT_INSTRUCTION_LEN field being set correctly in VMCS. However, this is not the case. Intel's manual doesn't mandate VM_EXIT_INSTRUCTION_LEN to be set when EPT MISCONFIG occurs. While on real hardware it was observed to be set, some hypervisors follow the spec and don't set it; we end up advancing IP with some random value. I checked with Microsoft and they confirmed they don't fill VM_EXIT_INSTRUCTION_LEN on EPT MISCONFIG. Fix the issue by doing instruction skip through emulator when running nested. Fixes: 68c3b4d1676d870f0453c31d5a52e7e65c7448ae Suggested-by: Radim Krčmář Suggested-by: Paolo Bonzini Signed-off-by: Vitaly Kuznetsov Acked-by: Michael S. Tsirkin Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d757c3a9cf4a8af26d085054731ebd9b5bc9983c Author: KarimAllah Ahmed Date: Wed Jan 17 19:18:56 2018 +0100 kvm: Map PFN-type memory regions as writable (if possible) [ Upstream commit a340b3e229b24a56f1c7f5826b15a3af0f4b13e5 ] For EPT-violations that are triggered by a read, the pages are also mapped with write permissions (if their memory region is also writable). That would avoid getting yet another fault on the same page when a write occurs. This optimization only happens when you have a "struct page" backing the memory region. So also enable it for memory regions that do not have a "struct page". Cc: Paolo Bonzini Cc: Radim Krčmář Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: KarimAllah Ahmed Reviewed-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a6a25002e6d83c4b66297a0fd192927c8fa0988d Author: Gustavo A. R. Silva Date: Tue Jan 30 22:21:48 2018 -0600 tcp_nv: fix potential integer overflow in tcpnv_acked [ Upstream commit e4823fbd229bfbba368b40cdadb8f4eeb20604cc ] Add suffix ULL to constant 80000 in order to avoid a potential integer overflow and give the compiler complete information about the proper arithmetic to use. Notice that this constant is used in a context that expects an expression of type u64. The current cast to u64 effectively applies to the whole expression as an argument of type u64 to be passed to div64_u64, but it does not prevent it from being evaluated using 32-bit arithmetic instead of 64-bit arithmetic. Also, once the expression is properly evaluated using 64-bit arithmentic, there is no need for the parentheses and the external cast to u64. Addresses-Coverity-ID: 1357588 ("Unintentional integer overflow") Signed-off-by: Gustavo A. R. Silva Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ad10785a706e63ff155fc97860cdcc5e3bc5992d Author: Dmitry Vyukov Date: Mon Jan 29 13:21:20 2018 +0100 netfilter: x_tables: fix pointer leaks to userspace [ Upstream commit 1e98ffea5a8935ec040ab72299e349cb44b8defd ] Several netfilter matches and targets put kernel pointers into info objects, but don't set usersize in descriptors. This leads to kernel pointer leaks if a match/target is set and then read back to userspace. Properly set usersize for these matches/targets. Found with manual code inspection. Fixes: ec2318904965 ("xtables: extend matches and targets with .usersize") Signed-off-by: Dmitry Vyukov Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2b7cc93682acf720f6b2e1a690ccdae71129e6fc Author: Vitaly Kuznetsov Date: Wed Jan 24 14:23:31 2018 +0100 x86/hyperv: Check for required priviliges in hyperv_init() [ Upstream commit 89a8f6d4904c8cf3ff8fee9fdaff392a6bbb8bf6 ] In hyperv_init() its presumed that it always has access to VP index and hypercall MSRs while according to the specification it should be checked if it's allowed to access the corresponding MSRs before accessing them. Signed-off-by: Vitaly Kuznetsov Signed-off-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Cc: Stephen Hemminger Cc: kvm@vger.kernel.org Cc: Radim Krčmář Cc: Haiyang Zhang Cc: "Michael Kelley (EOSG)" Cc: Roman Kagan Cc: Andy Lutomirski Cc: devel@linuxdriverproject.org Cc: Paolo Bonzini Cc: "K. Y. Srinivasan" Cc: Cathy Avery Cc: Mohammed Gamal Link: https://lkml.kernel.org/r/20180124132337.30138-2-vkuznets@redhat.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit cdf635a66c5ba80fa03275265d6191fb3adf33c8 Author: Andy Spencer Date: Thu Jan 25 19:37:50 2018 -0800 gianfar: prevent integer wrapping in the rx handler [ Upstream commit 202a0a70e445caee1d0ec7aae814e64b1189fa4d ] When the frame check sequence (FCS) is split across the last two frames of a fragmented packet, part of the FCS gets counted twice, once when subtracting the FCS, and again when subtracting the previously received data. For example, if 1602 bytes are received, and the first fragment contains the first 1600 bytes (including the first two bytes of the FCS), and the second fragment contains the last two bytes of the FCS: 'skb->len == 1600' from the first fragment size = lstatus & BD_LENGTH_MASK; # 1602 size -= ETH_FCS_LEN; # 1598 size -= skb->len; # -2 Since the size is unsigned, it wraps around and causes a BUG later in the packet handling, as shown below: kernel BUG at ./include/linux/skbuff.h:2068! Oops: Exception in kernel mode, sig: 5 [#1] ... NIP [c021ec60] skb_pull+0x24/0x44 LR [c01e2fbc] gfar_clean_rx_ring+0x498/0x690 Call Trace: [df7edeb0] [c01e2c1c] gfar_clean_rx_ring+0xf8/0x690 (unreliable) [df7edf20] [c01e33a8] gfar_poll_rx_sq+0x3c/0x9c [df7edf40] [c023352c] net_rx_action+0x21c/0x274 [df7edf90] [c0329000] __do_softirq+0xd8/0x240 [df7edff0] [c000c108] call_do_irq+0x24/0x3c [c0597e90] [c00041dc] do_IRQ+0x64/0xc4 [c0597eb0] [c000d920] ret_from_except+0x0/0x18 --- interrupt: 501 at arch_cpu_idle+0x24/0x5c Change the size to a signed integer and then trim off any part of the FCS that was received prior to the last fragment. Fixes: 6c389fc931bc ("gianfar: fix size of scatter-gathered frames") Signed-off-by: Andy Spencer Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 67fa8bfff771973a6867ca4be8b6c29e6e2e312f Author: Logan Gunthorpe Date: Mon Dec 18 11:25:05 2017 -0700 ntb_transport: Fix bug with max_mw_size parameter [ Upstream commit cbd27448faff4843ac4b66cc71445a10623ff48d ] When using the max_mw_size parameter of ntb_transport to limit the size of the Memory windows, communication cannot be established and the queues freeze. This is because the mw_size that's reported to the peer is correctly limited but the size used locally is not. So the MW is initialized with a buffer smaller than the window but the TX side is using the full window. This means the TX side will be writing to a region of the window that points nowhere. This is easily fixed by applying the same limit to tx_size in ntb_transport_init_queue(). Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers") Signed-off-by: Logan Gunthorpe Acked-by: Allen Hubbe Cc: Dave Jiang Signed-off-by: Jon Mason Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d810c548157f9f2533864bfb220787eaa9e36e3e Author: Leon Romanovsky Date: Sun Jan 28 11:25:30 2018 +0200 RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure [ Upstream commit b081808a66345ba725b77ecd8d759bee874cd937 ] Failure in XRCD FW deallocation command leaves memory leaked and returns error to the user which he can't do anything about it. This patch changes behavior to always free memory and always return success to the user. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Majd Dibbiny Signed-off-by: Leon Romanovsky Reviewed-by: Yuval Shaia Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0bddd43ac2001d87471117ff29e789aa3bcfd18b Author: Michael Bringmann Date: Tue Nov 28 16:58:40 2017 -0600 powerpc/numa: Ensure nodes initialized for hotplug [ Upstream commit ea05ba7c559c8e5a5946c3a94a2a266e9a6680a6 ] This patch fixes some problems encountered at runtime with configurations that support memory-less nodes, or that hot-add CPUs into nodes that are memoryless during system execution after boot. The problems of interest include: * Nodes known to powerpc to be memoryless at boot, but to have CPUs in them are allowed to be 'possible' and 'online'. Memory allocations for those nodes are taken from another node that does have memory until and if memory is hot-added to the node. * Nodes which have no resources assigned at boot, but which may still be referenced subsequently by affinity or associativity attributes, are kept in the list of 'possible' nodes for powerpc. Hot-add of memory or CPUs to the system can reference these nodes and bring them online instead of redirecting the references to one of the set of nodes known to have memory at boot. Note that this software operates under the context of CPU hotplug. We are not doing memory hotplug in this code, but rather updating the kernel's CPU topology (i.e. arch_update_cpu_topology / numa_update_cpu_topology). We are initializing a node that may be used by CPUs or memory before it can be referenced as invalid by a CPU hotplug operation. CPU hotplug operations are protected by a range of APIs including cpu_maps_update_begin/cpu_maps_update_done, cpus_read/write_lock / cpus_read/write_unlock, device locks, and more. Memory hotplug operations, including try_online_node, are protected by mem_hotplug_begin/mem_hotplug_done, device locks, and more. In the case of CPUs being hot-added to a previously memoryless node, the try_online_node operation occurs wholly within the CPU locks with no overlap. Using HMC hot-add/hot-remove operations, we have been able to add and remove CPUs to any possible node without failures. HMC operations involve a degree self-serialization, though. Signed-off-by: Michael Bringmann Reviewed-by: Nathan Fontenot Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0caebc3810328467cc67dadee612603d24087a9f Author: Michael Bringmann Date: Tue Nov 28 16:58:36 2017 -0600 powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes [ Upstream commit a346137e9142b039fd13af2e59696e3d40c487ef ] On powerpc systems which allow 'hot-add' of CPU or memory resources, it may occur that the new resources are to be inserted into nodes that were not used for these resources at bootup. In the kernel, any node that is used must be defined and initialized. These empty nodes may occur when, * Dedicated vs. shared resources. Shared resources require information such as the VPHN hcall for CPU assignment to nodes. Associativity decisions made based on dedicated resource rules, such as associativity properties in the device tree, may vary from decisions made using the values returned by the VPHN hcall. * memoryless nodes at boot. Nodes need to be defined as 'possible' at boot for operation with other code modules. Previously, the powerpc code would limit the set of possible nodes to those which have memory assigned at boot, and were thus online. Subsequent add/remove of CPUs or memory would only work with this subset of possible nodes. * memoryless nodes with CPUs at boot. Due to the previous restriction on nodes, nodes that had CPUs but no memory were being collapsed into other nodes that did have memory at boot. In practice this meant that the node assignment presented by the runtime kernel differed from the affinity and associativity attributes presented by the device tree or VPHN hcalls. Nodes that might be known to the pHyp were not 'possible' in the runtime kernel because they did not have memory at boot. This patch ensures that sufficient nodes are defined to support configuration requirements after boot, as well as at boot. This patch set fixes a couple of problems. * Nodes known to powerpc to be memoryless at boot, but to have CPUs in them are allowed to be 'possible' and 'online'. Memory allocations for those nodes are taken from another node that does have memory until and if memory is hot-added to the node. * Nodes which have no resources assigned at boot, but which may still be referenced subsequently by affinity or associativity attributes, are kept in the list of 'possible' nodes for powerpc. Hot-add of memory or CPUs to the system can reference these nodes and bring them online instead of redirecting to one of the set of nodes that were known to have memory at boot. This patch extracts the value of the lowest domain level (number of allocable resources) from the device tree property "ibm,max-associativity-domains" to use as the maximum number of nodes to setup as possibly available in the system. This new setting will override the instruction: nodes_and(node_possible_map, node_possible_map, node_online_map); presently seen in the function arch/powerpc/mm/numa.c:initmem_init(). If the "ibm,max-associativity-domains" property is not present at boot, no operation will be performed to define or enable additional nodes, or enable the above 'nodes_and()'. Signed-off-by: Michael Bringmann Reviewed-by: Nathan Fontenot Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b086dd2d79d911abbc6001ef2d59dc6a042ae5d9 Author: Mickaël Salaün Date: Fri Jan 26 01:39:30 2018 +0100 samples/bpf: Partially fixes the bpf.o build [ Upstream commit c25ef6a5e62fa212d298ce24995ce239f29b5f96 ] Do not build lib/bpf/bpf.o with this Makefile but use the one from the library directory. This avoid making a buggy bpf.o file (e.g. missing symbols). This patch is useful if some code (e.g. Landlock tests) needs both the bpf.o (from tools/lib/bpf) and the bpf_load.o (from samples/bpf). Signed-off-by: Mickaël Salaün Cc: Alexei Starovoitov Cc: Daniel Borkmann Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 64e5e46cdd8b3fcfed016dc89f057087346ad742 Author: Jacob Keller Date: Wed Dec 27 08:26:33 2017 -0500 i40e: fix reported mask for ntuple filters [ Upstream commit 40339af33c703bacb336493157d43c86a8bf2fed ] In commit 36777d9fa24c ("i40e: check current configured input set when adding ntuple filters") some code was added to report the input set mask for a given filter when reporting it to the user. This code is necessary so that the reported filter correctly displays that it is or is not masking certain fields. Unfortunately the code was incorrect. Development error accidentally swapped the mask values for the IPv4 addresses with the L4 port numbers. The port numbers are only 16bits wide while IPv4 addresses are 32 bits. Unfortunately we assigned only 16 bits to the IPv4 address masks. Additionally we assigned 32bit value 0xFFFFFFF to the TCP port numbers. This second part does not matter as the value would be truncated to 16bits regardless, but it is unnecessary. Fix the reported masks to properly report that the entire field is masked. Fixes: 36777d9fa24c ("i40e: check current configured input set when adding ntuple filters") Signed-off-by: Jacob Keller Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 1ec85fe4e259a25e69c7d73fa5efe47074fcff46 Author: Jacob Keller Date: Wed Dec 27 08:24:12 2017 -0500 i40e: program fragmented IPv4 filter input set [ Upstream commit 02b4016bfe43d2d5ed043be7ffa56cda6a4d1100 ] When implementing support for IP_USER_FLOW filters, we correctly programmed a filter for both the non fragmented IPv4/Other filter, as well as the fragmented IPv4 filters. However, we did not properly program the input set for fragmented IPv4 PCTYPE. This meant that the filters would almost certainly not match, unless the user specified all of the flow types. Add support to program the fragmented IPv4 filter input set. Since we always program these filters together, we'll assume that the two input sets must match, and will thus always program the input sets to the same value. Signed-off-by: Jacob Keller Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7addb3e4ad3db6a95a953c59884921b5883dcdec Author: Emil Tantilov Date: Fri Jan 12 14:02:56 2018 -0800 ixgbe: don't set RXDCTL.RLPML for 82599 [ Upstream commit 2bafa8fac19a31ca72ae1a3e48df35f73661dbed ] commit 2de6aa3a666e ("ixgbe: Add support for padding packet") Uses RXDCTL.RLPML to limit the maximum frame size on Rx when using build_skb. Unfortunately that register does not work on 82599. Added an explicit check to avoid setting this register on 82599 MAC. Extended the comment related to the setting of RXDCTL.RLPML to better explain its purpose. Signed-off-by: Emil Tantilov Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 27eb641f236843b34819ed05d6288dd54053827b Author: Jake Daryll Obina Date: Fri Sep 22 00:00:14 2017 +0800 jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path [ Upstream commit 5bdd0c6f89fba430e18d636493398389dadc3b17 ] If jffs2_iget() fails for a newly-allocated inode, jffs2_do_clear_inode() can get called twice in the error handling path, the first call in jffs2_iget() itself and the second through iget_failed(). This can result to a use-after-free error in the second jffs2_do_clear_inode() call, such as shown by the oops below wherein the second jffs2_do_clear_inode() call was trying to free node fragments that were already freed in the first jffs2_do_clear_inode() call. [ 78.178860] jffs2: error: (1904) jffs2_do_read_inode_internal: CRC failed for read_inode of inode 24 at physical location 0x1fc00c [ 78.178914] Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b7b [ 78.185871] pgd = ffffffc03a567000 [ 78.188794] [6b6b6b6b6b6b6b7b] *pgd=0000000000000000, *pud=0000000000000000 [ 78.194968] Internal error: Oops: 96000004 [#1] PREEMPT SMP ... [ 78.513147] PC is at rb_first_postorder+0xc/0x28 [ 78.516503] LR is at jffs2_kill_fragtree+0x28/0x90 [jffs2] [ 78.520672] pc : [] lr : [] pstate: 60000105 [ 78.526757] sp : ffffff800cea38f0 [ 78.528753] x29: ffffff800cea38f0 x28: ffffffc01f3f8e80 [ 78.532754] x27: 0000000000000000 x26: ffffff800cea3c70 [ 78.536756] x25: 00000000dc67c8ae x24: ffffffc033d6945d [ 78.540759] x23: ffffffc036811740 x22: ffffff800891a5b8 [ 78.544760] x21: 0000000000000000 x20: 0000000000000000 [ 78.548762] x19: ffffffc037d48910 x18: ffffff800891a588 [ 78.552764] x17: 0000000000000800 x16: 0000000000000c00 [ 78.556766] x15: 0000000000000010 x14: 6f2065646f6e695f [ 78.560767] x13: 6461657220726f66 x12: 2064656c69616620 [ 78.564769] x11: 435243203a6c616e x10: 7265746e695f6564 [ 78.568771] x9 : 6f6e695f64616572 x8 : ffffffc037974038 [ 78.572774] x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000008 [ 78.576775] x5 : 002f91d85bd44a2f x4 : 0000000000000000 [ 78.580777] x3 : 0000000000000000 x2 : 000000403755e000 [ 78.584779] x1 : 6b6b6b6b6b6b6b6b x0 : 6b6b6b6b6b6b6b6b ... [ 79.038551] [] rb_first_postorder+0xc/0x28 [ 79.042962] [] jffs2_do_clear_inode+0x88/0x100 [jffs2] [ 79.048395] [] jffs2_evict_inode+0x3c/0x48 [jffs2] [ 79.053443] [] evict+0xb0/0x168 [ 79.056835] [] iput+0x1c0/0x200 [ 79.060228] [] iget_failed+0x30/0x3c [ 79.064097] [] jffs2_iget+0x2d8/0x360 [jffs2] [ 79.068740] [] jffs2_lookup+0xe8/0x130 [jffs2] [ 79.073434] [] lookup_slow+0x118/0x190 [ 79.077435] [] walk_component+0xfc/0x28c [ 79.081610] [] path_lookupat+0x84/0x108 [ 79.085699] [] filename_lookup+0x88/0x100 [ 79.089960] [] user_path_at_empty+0x58/0x6c [ 79.094396] [] vfs_statx+0xa4/0x114 [ 79.098138] [] SyS_newfstatat+0x58/0x98 [ 79.102227] [] __sys_trace_return+0x0/0x4 [ 79.106489] Code: d65f03c0 f9400001 b40000e1 aa0103e0 (f9400821) The jffs2_do_clear_inode() call in jffs2_iget() is unnecessary since iget_failed() will eventually call jffs2_do_clear_inode() if needed, so just remove it. Fixes: 5451f79f5f81 ("iget: stop JFFS2 from using iget() and read_inode()") Reviewed-by: Richard Weinberger Signed-off-by: Jake Daryll Obina Signed-off-by: Al Viro Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 19b3638ce46065936c2092c983227ae7f2161220 Author: Jason Gunthorpe Date: Wed Jan 24 19:58:34 2018 -0700 RDMA/uverbs: Use an unambiguous errno for method not supported [ Upstream commit 3624a8f02568f08aef299d3b117f2226f621177d ] Returning EOPNOTSUPP is problematic because it can also be returned by the method function, and we use it in quite a few places in drivers these days. Instead, dedicate EPROTONOSUPPORT to indicate that the ioctl framework is enabled but the requested object and method are not supported by the kernel. No other case will return this code, and it lets userspace know to fall back to write(). grep says we do not use it today in drivers/infiniband subsystem. Signed-off-by: Jason Gunthorpe Reviewed-by: Matan Barak Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 827aab45cb16862083415abdb3e56313d59c347e Author: Corentin LABBE Date: Wed Jan 17 19:50:56 2018 +0100 crypto: artpec6 - remove select on non-existing CRYPTO_SHA384 [ Upstream commit 980b4c95e78e4113cb7b9f430f121dab1c814b6c ] Since CRYPTO_SHA384 does not exists, Kconfig should not select it. Anyway, all SHA384 stuff is in CRYPTO_SHA512 which is already selected. Fixes: a21eb94fc4d3i ("crypto: axis - add ARTPEC-6/7 crypto accelerator driver") Signed-off-by: Corentin Labbe Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 592ea370bf1ce685bdb5bdf71eaf7e65190c551f Author: Andy Shevchenko Date: Mon Jan 22 18:01:42 2018 +0200 device property: Define type of PROPERTY_ENRTY_*() macros [ Upstream commit c505cbd45f6e9c539d57dd171d95ec7e5e9f9cd0 ] Some of the drivers may use the macro at runtime flow, like struct property_entry p[10]; ... p[index++] = PROPERTY_ENTRY_U8("u8 property", u8_data); In that case and absence of the data type compiler fails the build: drivers/char/ipmi/ipmi_dmi.c:79:29: error: Expected ; at end of statement drivers/char/ipmi/ipmi_dmi.c:79:29: error: got { Acked-by: Corey Minyard Cc: Corey Minyard Signed-off-by: Andy Shevchenko Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c5fda2b8610b96f0035210952fb1dedcd73bf2ee Author: Aaron Sierra Date: Wed Jan 24 18:19:23 2018 -0600 tty: serial: exar: Relocate sleep wake-up handling [ Upstream commit c7e1b4059075c9e8eed101d7cc5da43e95eb5e18 ] Exar sleep wake-up handling has been done on a per-channel basis by virtue of INT0 being accessible from each channel's address space. I believe this was initially done out of necessity, but now that Exar devices have their own driver, we can do things more efficiently by registering a dedicated INT0 handler at the PCI device level. I see this change providing the following benefits: 1. If more than one port is active, eliminates the redundant bus cycles for reading INT0 on every interrupt. 2. This note associated with hooking in the per-channel handler in 8250_port.c is resolved: /* Fixme: probably not the best place for this */ Cc: Matt Schulte Signed-off-by: Aaron Sierra Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 519a7119527c26984e6804abc422808681d1a117 Author: Vitaly Kuznetsov Date: Wed Jan 24 11:36:29 2018 +0100 x86/hyperv: Stop suppressing X86_FEATURE_PCID [ Upstream commit 617ab45c9a8900e64a78b43696c02598b8cad68b ] When hypercall-based TLB flush was enabled for Hyper-V guests PCID feature was deliberately suppressed as a precaution: back then PCID was never exposed to Hyper-V guests and it wasn't clear what will happen if some day it becomes available. The day came and PCID/INVPCID features are already exposed on certain Hyper-V hosts. >From TLFS (as of 5.0b) it is unclear how TLB flush hypercalls combine with PCID. In particular the usage of PCID is per-cpu based: the same mm gets different CR3 values on different CPUs. If the hypercall does exact matching this will fail. However, this is not the case. David Zhang explains: "In practice, the AddressSpace argument is ignored on any VM that supports PCIDs. Architecturally, the AddressSpace argument must match the CR3 with PCID bits stripped out (i.e., the low 12 bits of AddressSpace should be 0 in long mode). The flush hypercalls flush all PCIDs for the specified AddressSpace." With this, PCID can be enabled. Signed-off-by: Vitaly Kuznetsov Signed-off-by: Thomas Gleixner Cc: David Zhang Cc: Stephen Hemminger Cc: Haiyang Zhang Cc: "Michael Kelley (EOSG)" Cc: Andy Lutomirski Cc: devel@linuxdriverproject.org Cc: "K. Y. Srinivasan" Cc: Aditya Bhandari Link: https://lkml.kernel.org/r/20180124103629.29980-1-vkuznets@redhat.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9a1dda252663c930822c933232736cd05a8cb800 Author: Ngai-Mint Kwan Date: Wed Jan 24 14:18:22 2018 -0800 fm10k: fix "failed to kill vid" message for VF [ Upstream commit cf315ea596ec26d7aa542a9ce354990875a920c0 ] When a VF is under PF VLAN assignment: ip link set vf <#> vlan This will remove all previous entries in the VLAN table including those generated by VLAN interfaces created on the VF. The issue arises when the VF is under PF VLAN assignment and one or more of these VLAN interfaces of the VF are deleted. When deleting these VLAN interfaces, the following message will be generated in "dmesg": failed to kill vid 0081/ for device This is due to the fact that "ndo_vlan_rx_kill_vid" exits with an error. The handler for this ndo is "fm10k_update_vid". Any calls to this function while under PF VLAN management will exit prematurely and, thus, it will generate the failure message. Additionally, since "fm10k_update_vid" exits prematurely, none of the VLAN update is performed. So, even though the actual VLAN interfaces of the VF will be deleted, the active_vlans bitmask is not cleared. When the VF is no longer under PF VLAN assignment, the driver mistakenly restores the previous entries of the VLAN table based on an unsynchronized list of active VLANs. The solution to this issue involves checking the VLAN update action type before exiting "fm10k_update_vid". If the VLAN update action type is to "add", this action will not be permitted while the VF is under PF VLAN assignment and the VLAN update is abandoned like before. However, if the VLAN update action type is to "kill", then we need to also clear the active_vlans bitmask. However, we don't need to actually queue any messages to the PF, because the MAC and VLAN tables have already been cleared, and the PF would silently ignore these requests anyways. Signed-off-by: Ngai-Mint Kwan Signed-off-by: Jacob Keller Tested-by: Krishneil Singh Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0e7a0c139cbf09d602eabb935ad5a080c99cdd46 Author: Daniel Hua Date: Tue Jan 2 08:33:18 2018 +0800 igb: Clear TXSTMP when ptp_tx_work() is timeout [ Upstream commit 3a53285228165225a7f76c7d5ff1ddc0213ce0e4 ] Problem description: After ethernet cable connect and disconnect for several iterations on a device with i210, tx timestamp will stop being put into the socket. Steps to reproduce: 1. Setup a device with i210 and wire it to a 802.1AS capable switch ( Extreme Networks Summit x440 is used in our case) 2. Have the gptp daemon running on the device and make sure it is synced with the switch 3. Have the switch disable and enable the port, wait for the device gets resynced with the switch 4. Iterates step 3 until the device is not albe to get resynced 5. Review the log in dmesg and you will see warning message "igb : clearing Tx timestamp hang" Root cause: If ptp_tx_work() gets scheduled just before the port gets disabled, a LINK DOWN event will be processed before ptp_tx_work(), which may cause timeout in ptp_tx_work(). In the timeout logic, the TSYNCTXCTL's TXTT bit (Transmit timestamp valid bit) is not cleared, causing no new timestamp loaded to TXSTMP register. Consequently therefore, no new interrupt is triggerred by TSICR.TXTS bit and no more Tx timestamp send to the socket. Signed-off-by: Daniel Hua Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 187bf28199d85d12c68c836e1ba3fe903ba7fec3 Author: Corinna Vinschen Date: Mon Apr 10 10:58:14 2017 +0200 igb: Allow to remove administratively set MAC on VFs [ Upstream commit 177132df5e45b134c147f419f567a3b56aafaf2b ] Before libvirt modifies the MAC address and vlan tag for an SRIOV VF for use by a virtual machine (either using vfio device assignment or macvtap passthru mode), it saves the current MAC address and vlan tag so that it can reset them to their original value when the guest is done. Libvirt can't leave the VF MAC set to the value used by the now-defunct guest since it may be started again later using a different VF, but it certainly shouldn't just pick any random value, either. So it saves the state of everything prior to using the VF, and resets it to that. The igb driver initializes the MAC addresses of all VFs to 00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK netlink message, also visible in the list of VFs in the output of "ip link show"). But when libvirt attempts to restore the MAC address back to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel responds with "Invalid argument". Forbidding a reset back to the original value leaves the VF MAC at the value set for the now-defunct virtual machine. Especially on a system with NetworkManager enabled, this has very bad consequences, since NetworkManager forces all interfacess to be IFF_UP all the time - if the same virtual machine is restarted using a different VF (or even on a different host), there will be multiple interfaces watching for traffic with the same MAC address. To allow libvirt to revert to the original state, we need a way to remove the administrative set MAC on a VF, to allow normal host operation again, and to reset/overwrite the VF MAC via VF netdev. This patch implements the outlined scenario by allowing to set the VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF. igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0, so it's possible to reset the VF MAC back to the original value via the VF netdev. Note: Recent patches to libvirt allow for a workaround if the NIC isn't capable of resetting the administrative MAC back to all 0, but in theory the NIC should allow resetting the MAC in the first place. Signed-off-by: Corinna Vinschen Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 048af64fd48f639bb97773c33e87ee6cf250ac99 Author: Jeffy Chen Date: Tue Nov 21 16:25:17 2017 +0800 ASoC: rockchip: Use dummy_dai for rt5514 dsp dailink [ Upstream commit fde7f9dbc71365230eeb8c8ea97ce9b552c8e5bd ] The rt5514 dsp captures pcm data through spi directly, so we should not use rockchip-i2s as it's cpu dai like other codecs. Use dummy_dai for rt5514 dsp dailink to make voice wakeup work again. Reported-by: Jimmy Cheng-Yi Chiang Fixes: (72cfb0f20c75 ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp) Signed-off-by: Jeffy Chen Tested-by: Brian Norris Signed-off-by: Mark Brown Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f25ba4f6be4aa368a4ce22d225d4c09257b3667a Author: Eryu Guan Date: Wed Jan 24 01:20:00 2018 +0800 blk-mq-debugfs: don't allow write on attributes with seq_operations set [ Upstream commit 6b136a24b05c81a24e0b648a4bd938bcd0c4f69e ] Attributes that only implement .seq_ops are read-only, any write to them should be rejected. But currently kernel would crash when writing to such debugfs entries, e.g. chmod +w /sys/kernel/debug/block//requeue_list echo 0 > /sys/kernel/debug/block//requeue_list chmod -w /sys/kernel/debug/block//requeue_list Fix it by returning -EPERM in blk_mq_debugfs_write() when writing to such attributes. Cc: Ming Lei Signed-off-by: Eryu Guan Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a42ebbdae0a58e62906a47fc36b89822c82f627a Author: David Hildenbrand Date: Tue Jan 16 18:15:25 2018 +0100 KVM: s390: vsie: use READ_ONCE to access some SCB fields [ Upstream commit b3ecd4aa8632a86428605ab73393d14779019d82 ] Another VCPU might try to modify the SCB while we are creating the shadow SCB. In general this is no problem - unless the compiler decides to not load values once, but e.g. twice. For us, this is only relevant when checking/working with such values. E.g. the prefix value, the mso, state of transactional execution and addresses of satellite blocks. E.g. if we blindly forward values (e.g. general purpose registers or execution controls after masking), we don't care. Leaving unpin_blocks() untouched for now, will handle it separately. The worst thing right now that I can see would be a missed prefix un/remap (mso, prefix, tx) or using wrong guest addresses. Nothing critical, but let's try to avoid unpredictable behavior. Signed-off-by: David Hildenbrand Message-Id: <20180116171526.12343-2-david@redhat.com> Reviewed-by: Christian Borntraeger Acked-by: Cornelia Huck Signed-off-by: Christian Borntraeger Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 48d441324a58460e201f15309edf2e082392170d Author: David Herrmann Date: Fri Jan 12 12:04:45 2018 +0100 platform/x86: thinkpad_acpi: suppress warning about palm detection [ Upstream commit 587d8628fb71c3bfae29fb2bbe84c1478c59bac8 ] This patch prevents the thinkpad_acpi driver from warning about 2 event codes returned for keyboard palm-detection. No behavioral changes, other than suppressing the warning in the kernel log. The events are still forwarded via acpi-netlink channels. We could, optionally, decide to forward the event through a input-switch on the tpacpi input device. However, so far no suitable input-code exists, and no similar drivers report such events. Hence, leave it an acpi event for now. Note that the event-codes are named based on empirical studies. On the ThinkPad X1 5th Gen the sensor can be found underneath the arrow key. Cc: Matthew Thode Signed-off-by: David Herrmann Acked-by: Henrique de Moraes Holschuh Signed-off-by: Andy Shevchenko Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b9d78055c6aea80faa783b105520efa05b443075 Author: Alan Brady Date: Fri Jan 5 04:55:21 2018 -0500 i40evf: ignore link up if not running [ Upstream commit e0346f9fcb6c636d2f870e6666de8781413f34ea ] If we receive the link status message from PF with link up before queues are actually enabled, it will trigger a TX hang. This fixes the issue by ignoring a link up message if the VF state is not yet in RUNNING state. Signed-off-by: Alan Brady Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 09f6d65db13b0a58af216d11a8e3f23069dc0f47 Author: Avinash Dayanand Date: Mon Dec 18 05:16:43 2017 -0500 i40evf: Don't schedule reset_task when device is being removed [ Upstream commit 06aa040f039404a0039a5158cd12f41187487a1f ] When a host disables and enables a PF device, all the associated VFs are removed and added back in. It also generates a PFR which in turn resets all the connected VFs. This behaviour is different from that of Linux guest on Linux host. Hence we end up in a situation where there's a PFR and device removal at the same time. And watchdog doesn't have a clue about this and schedules a reset_task. This patch adds code to send signal to reset_task that the device is currently being removed. Signed-off-by: Avinash Dayanand Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7c7ae4ed2fcd5a10bdac40b82deaa7f940f001ed Author: Prashant Bhole Date: Tue Jan 23 13:30:44 2018 +0900 bpf: test_maps: cleanup sockmaps when test ends [ Upstream commit 783687810e986a15ffbf86c516a1a48ff37f38f7 ] Bug: BPF programs and maps related to sockmaps test exist in memory even after test_maps ends. This patch fixes it as a short term workaround (sockmap kernel side needs real fixing) by empyting sockmaps when test ends. Fixes: 6f6d33f3b3d0f ("bpf: selftests add sockmap tests") Signed-off-by: Prashant Bhole [ daniel: Note on workaround. ] Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c6c6e38aeff2b4131de2718a993d07995844dcc8 Author: Goldwyn Rodrigues Date: Tue Jan 23 09:10:19 2018 -0700 block: Set BIO_TRACE_COMPLETION on new bio during split [ Upstream commit 20d59023c5ec4426284af492808bcea1f39787ef ] We inadvertently set it again on the source bio, but we need to set it on the new split bio instead. Fixes: fbbaf700e7b1 ("block: trace completion of all bios.") Signed-off-by: Goldwyn Rodrigues Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f2e73df302f31b588086b6eb5a4151b196faf427 Author: Wei Yongjun Date: Tue Jan 23 02:10:27 2018 +0000 nfp: fix error return code in nfp_pci_probe() [ Upstream commit e58decc9c51eb61697aba35ba8eda33f4b80552d ] Fix to return error code -EINVAL instead of 0 when num_vfs above limit_vfs, as done elsewhere in this function. Fixes: 0dc786219186 ("nfp: handle SR-IOV already enabled when driver is probing") Signed-off-by: Wei Yongjun Acked-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8591958413bfeacbe294dbb846e53c7bbd82e685 Author: Dan Carpenter Date: Wed Jan 10 12:39:03 2018 +0300 HID: roccat: prevent an out of bounds read in kovaplus_profile_activated() [ Upstream commit 7ad81482cad67cbe1ec808490d1ddfc420c42008 ] We get the "new_profile_index" value from the mouse device when we're handling raw events. Smatch taints it as untrusted data and complains that we need a bounds check. This seems like a reasonable warning otherwise there is a small read beyond the end of the array. Fixes: 0e70f97f257e ("HID: roccat: Add support for Kova[+] mouse") Signed-off-by: Dan Carpenter Acked-by: Silvan Jegen Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6a5505da41faa98555526ba50ca9865a2bd0ef99 Author: Andi Shyti Date: Mon Jan 22 17:32:46 2018 -0800 Input: stmfts - set IRQ_NOAUTOEN to the irq flag [ Upstream commit cba04cdf437d745fac85220d1d692a9ae23d7004 ] The interrupt is requested before the device is powered on and it's value in some cases cannot be reliable. It happens on some devices that an interrupt is generated as soon as requested before having the chance to disable the irq. Set the irq flag as IRQ_NOAUTOEN before requesting it. This patch mutes the error: stmfts 2-0049: failed to read events: -11 received sometimes during boot time. Signed-off-by: Andi Shyti Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8afed2798e50f3f9b84a6b0c9eb9810119a0fc3f Author: Arnd Bergmann Date: Thu Jan 18 14:16:38 2018 +0100 scsi: fas216: fix sense buffer initialization [ Upstream commit 96d5eaa9bb74d299508d811d865c2c41b38b0301 ] While testing with the ARM specific memset() macro removed, I ran into a compiler warning that shows an old bug: drivers/scsi/arm/fas216.c: In function 'fas216_rq_sns_done': drivers/scsi/arm/fas216.c:2014:40: error: argument to 'sizeof' in 'memset' call is the same expression as the destination; did you mean to provide an explicit length? [-Werror=sizeof-pointer-memaccess] It turns out that the definition of the scsi_cmd structure changed back in linux-2.6.25, so now we clear only four bytes (sizeof(pointer)) instead of 96 (SCSI_SENSE_BUFFERSIZE). I did not check whether we actually need to initialize the buffer here, but it's clear that if we do it, we should use the correct size. Fixes: de25deb18016 ("[SCSI] use dynamically allocated sense buffer") Signed-off-by: Arnd Bergmann Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 800fda575b119312b429fe488680e990b32e2de6 Author: Xose Vazquez Perez Date: Mon Jan 15 17:47:23 2018 +0100 scsi: devinfo: fix format of the device list [ Upstream commit 3f884a0a8bdf28cfd1e9987d54d83350096cdd46 ] Replace "" with NULL for product revision level, and merge TEXEL duplicate entries. Cc: Hannes Reinecke Cc: Martin K. Petersen Cc: James E.J. Bottomley Cc: SCSI ML Signed-off-by: Xose Vazquez Perez Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a09881cfb7135787375a3975cbd05bc94e07a0d0 Author: Sheng Yong Date: Wed Jan 17 12:11:31 2018 +0800 f2fs: avoid hungtask when GC encrypted block if io_bits is set [ Upstream commit a9d572c7550044d5b217b5287d99a2e6d34b97b0 ] When io_bits is set, GCing encrypted block may hit the following hungtask. Since io_bits requires aligned block address, f2fs_submit_page_write may return -EAGAIN if new_blkaddr does not satisify io_bits alignment. As a result, the encrypted page will never be writtenback. This patch makes move_data_block aware the EAGAIN error and cancel the writeback. [ 246.751371] INFO: task kworker/u4:4:797 blocked for more than 90 seconds. [ 246.752423] Not tainted 4.15.0-rc4+ #11 [ 246.754176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 246.755336] kworker/u4:4 D25448 797 2 0x80000000 [ 246.755597] Workqueue: writeback wb_workfn (flush-7:0) [ 246.755616] Call Trace: [ 246.755695] ? __schedule+0x322/0xa90 [ 246.755761] ? blk_init_request_from_bio+0x120/0x120 [ 246.755773] ? pci_mmcfg_check_reserved+0xb0/0xb0 [ 246.755801] ? __radix_tree_create+0x19e/0x200 [ 246.755813] ? delete_node+0x136/0x370 [ 246.755838] schedule+0x43/0xc0 [ 246.755904] io_schedule+0x17/0x40 [ 246.755939] wait_on_page_bit_common+0x17b/0x240 [ 246.755950] ? wake_page_function+0xa0/0xa0 [ 246.755961] ? add_to_page_cache_lru+0x160/0x160 [ 246.755972] ? page_cache_tree_insert+0x170/0x170 [ 246.755983] ? __lru_cache_add+0x96/0xb0 [ 246.756086] __filemap_fdatawait_range+0x14f/0x1c0 [ 246.756097] ? wait_on_page_bit_common+0x240/0x240 [ 246.756120] ? __wake_up_locked_key_bookmark+0x20/0x20 [ 246.756167] ? wait_on_all_pages_writeback+0xc9/0x100 [ 246.756179] ? __remove_ino_entry+0x120/0x120 [ 246.756192] ? wait_woken+0x100/0x100 [ 246.756204] filemap_fdatawait_range+0x9/0x20 [ 246.756216] write_checkpoint+0x18a1/0x1f00 [ 246.756254] ? blk_get_request+0x10/0x10 [ 246.756265] ? cpumask_next_and+0x43/0x60 [ 246.756279] ? f2fs_sync_inode_meta+0x160/0x160 [ 246.756289] ? remove_element.isra.4+0xa0/0xa0 [ 246.756300] ? __put_compound_page+0x40/0x40 [ 246.756310] ? f2fs_sync_fs+0xec/0x1c0 [ 246.756320] ? f2fs_sync_fs+0x120/0x1c0 [ 246.756329] f2fs_sync_fs+0x120/0x1c0 [ 246.756357] ? trace_event_raw_event_f2fs__page+0x260/0x260 [ 246.756393] ? ata_build_rw_tf+0x173/0x410 [ 246.756397] f2fs_balance_fs_bg+0x198/0x390 [ 246.756405] ? drop_inmem_page+0x230/0x230 [ 246.756415] ? ahci_qc_prep+0x1bb/0x2e0 [ 246.756418] ? ahci_qc_issue+0x1df/0x290 [ 246.756422] ? __accumulate_pelt_segments+0x42/0xd0 [ 246.756426] ? f2fs_write_node_pages+0xd1/0x380 [ 246.756429] f2fs_write_node_pages+0xd1/0x380 [ 246.756437] ? sync_node_pages+0x8f0/0x8f0 [ 246.756440] ? update_curr+0x53/0x220 [ 246.756444] ? __accumulate_pelt_segments+0xa2/0xd0 [ 246.756448] ? __update_load_avg_se.isra.39+0x349/0x360 [ 246.756452] ? do_writepages+0x2a/0xa0 [ 246.756456] do_writepages+0x2a/0xa0 [ 246.756460] __writeback_single_inode+0x70/0x490 [ 246.756463] ? check_preempt_wakeup+0x199/0x310 [ 246.756467] writeback_sb_inodes+0x2a2/0x660 [ 246.756471] ? is_empty_dir_inode+0x40/0x40 [ 246.756474] ? __writeback_single_inode+0x490/0x490 [ 246.756477] ? string+0xbf/0xf0 [ 246.756480] ? down_read_trylock+0x35/0x60 [ 246.756484] __writeback_inodes_wb+0x9f/0xf0 [ 246.756488] wb_writeback+0x41d/0x4b0 [ 246.756492] ? writeback_inodes_wb.constprop.55+0x150/0x150 [ 246.756498] ? set_worker_desc+0xf7/0x130 [ 246.756502] ? current_is_workqueue_rescuer+0x60/0x60 [ 246.756511] ? _find_next_bit+0x2c/0xa0 [ 246.756514] ? wb_workfn+0x400/0x5d0 [ 246.756518] wb_workfn+0x400/0x5d0 [ 246.756521] ? finish_task_switch+0xdf/0x2a0 [ 246.756525] ? inode_wait_for_writeback+0x30/0x30 [ 246.756529] process_one_work+0x3a7/0x6f0 [ 246.756533] worker_thread+0x82/0x750 [ 246.756537] kthread+0x16f/0x1c0 [ 246.756541] ? trace_event_raw_event_workqueue_work+0x110/0x110 [ 246.756544] ? kthread_create_worker_on_cpu+0xb0/0xb0 [ 246.756548] ret_from_fork+0x1f/0x30 Signed-off-by: Sheng Yong Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 889177d172d3a1879742cdd17debbbeb300ca1b5 Author: Parav Pandit Date: Tue Jan 9 15:58:54 2018 +0200 RDMA/cma: Check existence of netdevice during port validation [ Upstream commit 00db63c128dd3daf38f481371976c24d32678142 ] If valid netdevice is not found for RoCE, GID table should not be searched with NULL netdevice. Doing so causes the search routines to ignore the netdev argument and may match the wrong GID table entry if the netdev is deleted. Fixes: abae1b71dd37 ("IB/cma: cma_validate_port should verify the port and netdevice") Signed-off-by: Parav Pandit Reviewed-by: Mark Bloch Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 48b8839d91a49cde35ddab422a99ef908391aeee Author: Liu Bo Date: Tue Jan 9 18:36:25 2018 -0700 Btrfs: raid56: fix race between merge_bio and rbio_orig_end_io [ Upstream commit 7583d8d088ff2c323b1d4f15b191ca2c23d32558 ] Before rbio_orig_end_io() goes to free rbio, rbio may get merged with more bios from other rbios and rbio->bio_list becomes non-empty, in that case, these newly merged bios don't end properly. Once unlock_stripe() is done, rbio->bio_list will not be updated any more and we can call bio_endio() on all queued bios. It should only happen in error-out cases, the normal path of recover and full stripe write have already set RBIO_RMW_LOCKED_BIT to disable merge before doing IO, so rbio_orig_end_io() called by them doesn't have the above issue. Reported-by: Jérôme Carretero Signed-off-by: Liu Bo Signed-off-by: David Sterba Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ebe064401f078ae68f2532833ad8f212b6ee2be2 Author: Liu Bo Date: Fri Jan 5 12:51:09 2018 -0700 Btrfs: fix unexpected EEXIST from btrfs_get_extent [ Upstream commit 18e83ac75bfe67009c4ddcdd581bba8eb16f4030 ] This fixes a corner case that is caused by a race of dio write vs dio read/write. Here is how the race could happen. Suppose that no extent map has been loaded into memory yet. There is a file extent [0, 32K), two jobs are running concurrently against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio read from [0, 4K) or [4K, 8K). t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K). ------------------------------------------------------ t1 t2 btrfs_get_blocks_direct() btrfs_get_blocks_direct() -> btrfs_get_extent() -> btrfs_get_extent() -> lookup_extent_mapping() -> add_extent_mapping() -> lookup_extent_mapping() # load [0, 32K) -> btrfs_new_extent_direct() -> btrfs_drop_extent_cache() # split [0, 32K) and # drop [8K, 32K) -> add_extent_mapping() # add [8K, 32K) -> add_extent_mapping() # handle -EEXIST when adding # [0, 32K) ------------------------------------------------------ About how t2(dio read/write) runs into -EEXIST: a) add_extent_mapping() gets -EEXIST for adding em [0, 32k), b) search_extent_mapping() then returns [0, 8k) as the existing em, even though start == existing->start, em is [0, 32k) so that extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k, c) then it goes thru merge_extent_mapping() which tries to add a [8k, 8k) (with a length 0) and returns -EEXIST as [8k, 32k) is already in tree, d) so btrfs_get_extent() ends up returning -EEXIST to dio read/write, which is confusing applications. Here I conclude all the possible situations, 1) start < existing->start +-----------+em+-----------+ +--prev---+ | +-------------+ | | | | | | | +---------+ + +---+existing++ ++ + | + start 2) start == existing->start +------------em------------+ | +-------------+ | | | | | + +----existing-+ + | | + start 3) start > existing->start && start < (existing->start + existing->len) +------------em------------+ | +-------------+ | | | | | + +----existing-+ + | | + start 4) start >= (existing->start + existing->len) +-----------+em+-----------+ | +-------------+ | +--next---+ | | | | | | + +---+existing++ + +---------+ + | + start As we can see, it turns out that if start is within existing em (front inclusive), then the existing em should be returned as is, otherwise, we try our best to merge candidate em with sibling ems to form a larger em (in order to reduce the total number of em). Reported-by: David Vallender Signed-off-by: Liu Bo Reviewed-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c231cec825a9d72642b712b6b62bb9c99b7407fd Author: Anand Jain Date: Tue Jan 9 09:05:43 2018 +0800 btrfs: fail mount when sb flag is not in BTRFS_SUPER_FLAG_SUPP [ Upstream commit 6f794e3c5c8f8fdd3b5bb20d9ded894e685b5bbe ] It appears from the original commit [1] that there isn't any design specific reason not to fail the mount instead of just warning. This patch will change it to fail. [1] commit 319e4d0661e5323c9f9945f0f8fb5905e5fe74c3 btrfs: Enhance super validation check Fixes: 319e4d0661e5323 ("btrfs: Enhance super validation check") Signed-off-by: Anand Jain Reviewed-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d91bb7c6988bd6450284c762b33f2e1ea3fe7c97 Author: Liu Bo Date: Tue Jan 2 13:36:41 2018 -0700 Btrfs: fix scrub to repair raid6 corruption [ Upstream commit 762221f095e3932669093466aaf4b85ed9ad2ac1 ] The raid6 corruption is that, suppose that all disks can be read without problems and if the content that was read out doesn't match its checksum, currently for raid6 btrfs at most retries twice, - the 1st retry is to rebuild with all other stripes, it'll eventually be a raid5 xor rebuild, - if the 1st fails, the 2nd retry will deliberately fail parity p so that it will do raid6 style rebuild, however, the chances are that another non-parity stripe content also has something corrupted, so that the above retries are not able to return correct content. We've fixed normal reads to rebuild raid6 correctly with more retries in Patch "Btrfs: make raid6 rebuild retry more"[1], this is to fix scrub to do the exactly same rebuild process. [1]: https://patchwork.kernel.org/patch/10091755/ Signed-off-by: Liu Bo Signed-off-by: David Sterba Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit db6d651eccded6fe512f0ed02347e0638dd95296 Author: Nikolay Borisov Date: Tue Dec 12 11:14:49 2017 +0200 btrfs: Fix out of bounds access in btrfs_search_slot [ Upstream commit 9ea2c7c9da13c9073e371c046cbbc45481ecb459 ] When modifying a tree where the root is at BTRFS_MAX_LEVEL - 1 then the level variable is going to be 7 (this is the max height of the tree). On the other hand btrfs_cow_block is always called with "level + 1" as an index into the nodes and slots arrays. This leads to an out of bounds access. Admittdely this will be benign since an OOB access of the nodes array will likely read the 0th element from the slots array, which in this case is going to be 0 (since we start CoW at the top of the tree). The OOB access into the slots array in turn will read the 0th and 1st values of the locks array, which would both be 0 at the time. However, this benign behavior relies on the fact that the path being passed hasn't been initialised, if it has already been used to query a btree then it could potentially have populated the nodes/slots arrays. Fix it by explicitly checking if we are at level 7 (the maximum allowed index in nodes/slots arrays) and explicitly call the CoW routine with NULL for parent's node/slot. Signed-off-by: Nikolay Borisov Fixes-coverity-id: 711515 Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a4909c8518f79875e06d4903d249f2524126c2cd Author: Liu Bo Date: Wed Nov 15 16:10:28 2017 -0700 Btrfs: set plug for fsync [ Upstream commit 343e4fc1c60971b0734de26dbbd475d433950982 ] Setting plug can merge adjacent IOs before dispatching IOs to the disk driver. Without plug, it'd not be a problem for single disk usecases, but for multiple disks using raid profile, a large IO can be split to several IOs of stripe length, and plug can be helpful to bring them together for each disk so that we can save several disk access. Moreover, fsync issues synchronous writes, so plug can really take effect. Signed-off-by: Liu Bo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit fb5d97a19fc39d9fd5238af989a45cc56b4c2b9a Author: Wei Yongjun Date: Thu Jan 18 01:43:19 2018 +0000 ipmi/powernv: Fix error return code in ipmi_powernv_probe() [ Upstream commit e749d328b0b450aa78d562fa26a0cd8872325dd9 ] Fix to return a negative error code from the request_irq() error handling case instead of 0, as done elsewhere in this function. Fixes: dce143c3381c ("ipmi/powernv: Convert to irq event interface") Signed-off-by: Wei Yongjun Reviewed-by: Alexey Kardashevskiy Signed-off-by: Corey Minyard Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit afadc440a1cc08895f451b4a9db551a45f2a1a21 Author: weiyongjun (A) Date: Thu Jan 18 02:23:34 2018 +0000 mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl() [ Upstream commit 0ddcff49b672239dda94d70d0fcf50317a9f4b51 ] 'hwname' is malloced in hwsim_new_radio_nl() and should be freed before leaving from the error handling cases, otherwise it will cause memory leak. Fixes: ff4dd73dd2b4 ("mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length") Signed-off-by: Wei Yongjun Reviewed-by: Ben Hutchings Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 18004e6f26ec406ee762fd3f8cceaf4ab5acde41 Author: Ulf Magnusson Date: Sun Oct 8 19:35:45 2017 +0200 kconfig: Fix expr_free() E_NOT leak [ Upstream commit 5b1374b3b3c2fc4f63a398adfa446fb8eff791a4 ] Only the E_NOT operand and not the E_NOT node itself was freed, due to accidentally returning too early in expr_free(). Outline of leak: switch (e->type) { ... case E_NOT: expr_free(e->left.expr); return; ... } *Never reached, 'e' leaked* free(e); Fix by changing the 'return' to a 'break'. Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix: LEAK SUMMARY: definitely lost: 44,448 bytes in 1,852 blocks ... Summary after the fix: LEAK SUMMARY: definitely lost: 1,608 bytes in 67 blocks ... Signed-off-by: Ulf Magnusson Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0f511f3dda8c6fb2c799eab22acfb39056047a49 Author: Ulf Magnusson Date: Sun Oct 8 19:35:44 2017 +0200 kconfig: Fix automatic menu creation mem leak [ Upstream commit ae7440ef0c8013d68c00dad6900e7cce5311bb1c ] expr_trans_compare() always allocates and returns a new expression, giving the following leak outline: ... *Allocate* basedep = expr_trans_compare(basedep, E_UNEQUAL, &symbol_no); ... for (menu = parent->next; menu; menu = menu->next) { ... *Copy* dep2 = expr_copy(basedep); ... *Free copy* expr_free(dep2); } *basedep lost!* Fix by freeing 'basedep' after the loop. Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix: LEAK SUMMARY: definitely lost: 344,376 bytes in 14,349 blocks ... Summary after the fix: LEAK SUMMARY: definitely lost: 44,448 bytes in 1,852 blocks ... Signed-off-by: Ulf Magnusson Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8bf116b258c286137635542a752ec68ffebd809b Author: Ulf Magnusson Date: Sun Oct 8 19:11:21 2017 +0200 kconfig: Don't leak main menus during parsing [ Upstream commit 0724a7c32a54e3e50d28e19e30c59014f61d4e2c ] If a 'mainmenu' entry appeared in the Kconfig files, two things would leak: - The 'struct property' allocated for the default "Linux Kernel Configuration" prompt. - The string for the T_WORD/T_WORD_QUOTE prompt after the T_MAINMENU token, allocated on the heap in zconf.l. To fix it, introduce a new 'no_mainmenu_stmt' nonterminal that matches if there's no 'mainmenu' and adds the default prompt. That means the prompt only gets allocated once regardless of whether there's a 'mainmenu' statement or not, and managing it becomes simple. Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix: LEAK SUMMARY: definitely lost: 344,568 bytes in 14,352 blocks ... Summary after the fix: LEAK SUMMARY: definitely lost: 344,440 bytes in 14,350 blocks ... Signed-off-by: Ulf Magnusson Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9f2df99f9eb020a1fa4fceb460d9a494c3b33341 Author: Guenter Roeck Date: Sun Dec 24 13:04:07 2017 -0800 watchdog: sp5100_tco: Fix watchdog disable bit [ Upstream commit f541c09ebfc61697b586b38c9ebaf4b70defb278 ] According to all published information, the watchdog disable bit for SB800 compatible controllers is bit 1 of PM register 0x48, not bit 2. For the most part that doesn't matter in practice, since the bit has to be cleared to enable watchdog address decoding, which is the default setting, but it still needs to be fixed. Cc: Zoltán Böszörményi Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ebf5ffca1bf2be571544befa56e37723e8b8356f Author: Niklas Cassel Date: Fri Jan 19 10:39:06 2018 +0100 PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build [ Upstream commit 80db6f08b7af93eddc9487535e6150b220262637 ] Some hardware can operate in either "host" or "endpoint" mode, which means there can be both a host bridge driver and an endpoint driver for the same device. Those drivers share a lot of code, so sometimes they live in the same source file. The host bridge driver requires CONFIG_PCI=y because it enumerates PCI devices below the bridge using the PCI core. The endpoint driver does not require CONFIG_PCI=y because it runs in an embedded kernel on the other side of the device, e.g., on an adapter card. pci-dra7xx.c contains both host and endpoint drivers. If we select only the endpoint driver (CONFIG_PCI=n and CONFIG_PCI_DRA7XX_EP=y), the unneeded host driver is still compiled. It references pci_irqd_intx_xlate(), which is not present when CONFIG_PCI=n, which causes this error: drivers/pci/dwc/pci-dra7xx.c:229:11: error: 'pci_irqd_intx_xlate' undeclared here (not in a function) Add a dummy pci_irqd_intx_xlate() for the CONFIG_PCI=n case. [bhelgaas: changelog] Signed-off-by: Niklas Cassel Signed-off-by: Bjorn Helgaas Acked-by: Arnd Bergmann Acked-by: Lorenzo Pieralisi Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c212c855a09d826b95eb18d5adf04f4c65ecb8ca Author: James Hogan Date: Tue Jan 16 21:38:24 2018 +0000 MIPS: Fix clean of vmlinuz.{32,ecoff,bin,srec} [ Upstream commit 5f2483eb2423152445b39f2db59d372f523e664e ] Make doesn't expand shell style "vmlinuz.{32,ecoff,bin,srec}" to the 4 separate files, so none of these files get cleaned up by make clean. List the files separately instead. Fixes: ec3352925b74 ("MIPS: Remove all generated vmlinuz* files on "make clean"") Signed-off-by: James Hogan Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18491/ Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 81fbb7e26ea1068d6c7777f5c1f8d4f9b4ec68a9 Author: Jan Chochol Date: Fri Jan 5 08:39:12 2018 +0100 nfs: Do not convert nfs_idmap_cache_timeout to jiffies [ Upstream commit cbebc6ef4fc830f4040d4140bf53484812d5d5d9 ] Since commit 57e62324e469 ("NFS: Store the legacy idmapper result in the keyring") nfs_idmap_cache_timeout changed units from jiffies to seconds. Unfortunately sysctl interface was not updated accordingly. As a effect updating /proc/sys/fs/nfs/idmap_cache_timeout with some value will incorrectly multiply this value by HZ. Also reading /proc/sys/fs/nfs/idmap_cache_timeout will show real value divided by HZ. Fixes: 57e62324e469 ("NFS: Store the legacy idmapper result in the keyring") Signed-off-by: Jan Chochol Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 35ceddc59cd4f70343072247fd5b9995d8fceb87 Author: Sagi Grimberg Date: Sun Jan 14 17:07:50 2018 +0200 IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct [ Upstream commit 246d8b184c100e8eb6b4e8c88f232c2ed2a4e672 ] polling the completion queue directly does not interfere with the existing polling logic, hence drop the requirement. Be aware that running ib_process_cq_direct with non IB_POLL_DIRECT CQ may trigger concurrent CQ processing. This can be used for polling mode ULPs. Cc: Bart Van Assche Reported-by: Steve Wise Signed-off-by: Sagi Grimberg [maxg: added wcs array argument to __ib_process_cq] Signed-off-by: Max Gurtovoy Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 58bc0fd8434d510c5e86f6739c9856b695730504 Author: Maxime Chevallier Date: Wed Jan 17 17:15:25 2018 +0100 spi: a3700: Clear DATA_OUT when performing a read [ Upstream commit 44a5f423e70374e5b42cecd85e78f2d79334e0f2 ] When performing a read using FIFO mode, the spi controller shifts out the last 2 bytes that were written in a previous transfer on MOSI. This undocumented behaviour can cause devices to misinterpret the transfer, so we explicitly clear the WFIFO before each read. This behaviour was noticed on EspressoBin. Signed-off-by: Maxime Chevallier Signed-off-by: Mark Brown Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 5bb5b9c68192b81e16eef1b7f9d4d526fe6afefe Author: Martin Blumenstingl Date: Mon Jan 15 18:10:15 2018 +0100 net: stmmac: dwmac-meson8b: propagate rate changes to the parent clock [ Upstream commit fb7d38a70e1d8ffd54f7a7464dcc4889d7e490ad ] On Meson8b the only valid input clock is MPLL2. The bootloader configures that to run at 500002394Hz which cannot be divided evenly down to 125MHz using the m250_div clock. Currently the common clock framework chooses a m250_div of 2 - with the internal fixed "divide by 10" this results in a RGMII TX clock of 125001197Hz (120Hz above the requested 125MHz). Letting the common clock framework propagate the rate changes up to the parent of m250_mux allows us to get the best possible clock rate. With this patch the common clock framework calculates a rate of very-close-to-250MHz (249999701Hz to be exact) for the MPLL2 clock (which is the mux input). Dividing that by 2 (which is an internal, fixed divider for the RGMII TX clock) gives us an RGMII TX clock of 124999850Hz (which is only 150Hz off the requested 125MHz, compared to 1197Hz based on the MPLL2 rate set by u-boot and the Amlogic GPL kernel sources). SoCs from the Meson GX series are not affected by this change because the input clock is FCLK_DIV2 whose rate cannot be changed (which is fine since it's running at 1GHz, so it's already a multiple of 250MHz and 125MHz). Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC") Suggested-by: Jerome Brunet Signed-off-by: Martin Blumenstingl Reviewed-by: Jerome Brunet Tested-by: Jerome Brunet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 5bfa11c9619283d77cf8193152fba8ae18819557 Author: Martin Blumenstingl Date: Mon Jan 15 18:10:14 2018 +0100 net: stmmac: dwmac-meson8b: fix setting the RGMII TX clock on Meson8b [ Upstream commit 433c6cab9d298687c097f6ee82e49157044dc7c6 ] Meson8b only supports MPLL2 as clock input. The rate of the MPLL2 clock set by Odroid-C1's u-boot is close to (but not exactly) 500MHz. The exact rate is 500002394Hz, which is calculated in drivers/clk/meson/clk-mpll.c using the following formula: DIV_ROUND_UP_ULL((u64)parent_rate * SDM_DEN, (SDM_DEN * n2) + sdm) Odroid-C1's u-boot configures MPLL2 with the following values: - SDM_DEN = 16384 - SDM = 1638 - N2 = 5 The 250MHz clock (m250_div) inside dwmac-meson8b driver is derived from the MPLL2 clock. Due to MPLL2 running slightly faster than 500MHz the common clock framework chooses a divider which is too big to generate the 250MHz clock (a divider of 2 would be needed, but this is rounded up to a divider of 3). This breaks the RTL8211F RGMII PHY on Odroid-C1 because it requires a (close to) 125MHz RGMII TX clock (on Gbit speeds, the IP block internally divides that down to 25MHz on 100Mbit/s connections and 2.5MHz on 10Mbit/s connections - we don't need any special configuration for that). Round the divider to the closest value to prevent this issue on Meson8b. This means we'll now end up with a clock rate for the RGMII TX clock of 125001197Hz (= 125MHz plus 1197Hz), which is close-enough to 125MHz. This has no effect on the Meson GX SoCs since there fclk_div2 is used as input clock, which has a rate of 1000MHz (and thus is divisible cleanly to 250MHz and 125MHz). Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC") Reported-by: Emiliano Ingrassia Signed-off-by: Martin Blumenstingl Reviewed-by: Jerome Brunet Tested-by: Jerome Brunet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2a71a742f09b390a200110dd3d8ff9f6f51f9443 Author: Geert Uytterhoeven Date: Sun Sep 17 10:32:20 2017 +0200 ubifs: Fix uninitialized variable in search_dh_cookie() [ Upstream commit c877154d307f4a91e0b5b85b75535713dab945ae ] fs/ubifs/tnc.c: In function ‘search_dh_cookie’: fs/ubifs/tnc.c:1893: warning: ‘err’ is used uninitialized in this function Indeed, err is always used uninitialized. According to an original review comment from Hyunchul, acknowledged by Richard, err should be initialized to -ENOENT to avoid the first call to tnc_next(). But we can achieve the same by reordering the code. Fixes: 781f675e2d7e ("ubifs: Fix unlink code wrt. double hash lookups") Reported-by: Hyunchul Lee Signed-off-by: Geert Uytterhoeven Signed-off-by: Richard Weinberger Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a1dfcb01e374729f6920c3c07867914c5e848fb4 Author: Ming Lei Date: Thu Jan 18 00:41:52 2018 +0800 blk-mq: turn WARN_ON in __blk_mq_run_hw_queue into printk [ Upstream commit 7df938fbc4ee641e70e05002ac67c24b19e86e74 ] We know this WARN_ON is harmless and in reality it may be trigged, so convert it to printk() and dump_stack() to avoid to confusing people. Also add comment about two releated races here. Cc: Christian Borntraeger Cc: Stefan Haberland Cc: Christoph Hellwig Cc: Thomas Gleixner Cc: "jianchao.wang" Signed-off-by: Ming Lei Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2e102fe86ede237ca875da9336648caa5e2fcac6 Author: Ming Lei Date: Thu Jan 11 14:01:56 2018 +0800 dm mpath: return DM_MAPIO_REQUEUE on blk-mq rq allocation failure [ Upstream commit 050af08ffb1b62af69196d61c22a0755f9a3cdbd ] blk-mq will rerun queue via RESTART or dispatch wake after one request is completed, so not necessary to wait random time for requeuing, we should trust blk-mq to do it. More importantly, we need to return BLK_STS_RESOURCE to blk-mq so that dequeuing from the I/O scheduler can be stopped, this results in improved I/O merging. Signed-off-by: Ming Lei Signed-off-by: Mike Snitzer Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 223ed638e937c53fdb49c6a55aa3c9de2c728b0f Author: mulhern Date: Mon Nov 27 10:02:39 2017 -0500 dm thin: fix documentation relative to low water mark threshold [ Upstream commit 9b28a1102efc75d81298198166ead87d643a29ce ] Fixes: 1. The use of "exceeds" when the opposite of exceeds, falls below, was meant. 2. Properly speaking, a table can not exceed a threshold. It emphasizes the important point, which is that it is the userspace daemon's responsibility to check for low free space when a device is resumed, since it won't get a special event indicating low free space in that situation. Signed-off-by: mulhern Signed-off-by: Mike Snitzer Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit e9c8a5fa078ca85b4a78dfbe837fadfef780608d Author: Peter Xu Date: Wed Jan 10 13:51:37 2018 +0800 iommu/vt-d: Use domain instead of cache fetching [ Upstream commit 9d2e6505f6d6934e681aed502f566198cb25c74a ] after commit a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi", 2015-08-12), we have domain pointer as parameter to iommu_flush_iotlb_psi(), so no need to fetch it from cache again. More importantly, a NULL reference pointer bug is reported on RHEL7 (and it can be reproduced on some old upstream kernels too, e.g., v4.13) by unplugging an 40g nic from a VM (hard to test unplug on real host, but it should be the same): https://bugzilla.redhat.com/show_bug.cgi?id=1531367 [ 24.391863] pciehp 0000:00:03.0:pcie004: Slot(0): Attention button pressed [ 24.393442] pciehp 0000:00:03.0:pcie004: Slot(0): Powering off due to button press [ 29.721068] i40evf 0000:01:00.0: Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK [ 29.783557] iommu: Removing device 0000:01:00.0 from group 3 [ 29.784662] BUG: unable to handle kernel NULL pointer dereference at 0000000000000304 [ 29.785817] IP: iommu_flush_iotlb_psi+0xcf/0x120 [ 29.786486] PGD 0 [ 29.786487] P4D 0 [ 29.786812] [ 29.787390] Oops: 0000 [#1] SMP [ 29.787876] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_ng [ 29.795371] CPU: 0 PID: 156 Comm: kworker/0:2 Not tainted 4.13.0 #14 [ 29.796366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-1.el7 04/01/2014 [ 29.797593] Workqueue: pciehp-0 pciehp_power_thread [ 29.798328] task: ffff94f5745b4a00 task.stack: ffffb326805ac000 [ 29.799178] RIP: 0010:iommu_flush_iotlb_psi+0xcf/0x120 [ 29.799919] RSP: 0018:ffffb326805afbd0 EFLAGS: 00010086 [ 29.800666] RAX: ffff94f5bc56e800 RBX: 0000000000000000 RCX: 0000000200000025 [ 29.801667] RDX: ffff94f5bc56e000 RSI: 0000000000000082 RDI: 0000000000000000 [ 29.802755] RBP: ffffb326805afbf8 R08: 0000000000000000 R09: ffff94f5bc86bbf0 [ 29.803772] R10: ffffb326805afba8 R11: 00000000000ffdc4 R12: ffff94f5bc86a400 [ 29.804789] R13: 0000000000000000 R14: 00000000ffdc4000 R15: 0000000000000000 [ 29.805792] FS: 0000000000000000(0000) GS:ffff94f5bfc00000(0000) knlGS:0000000000000000 [ 29.806923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.807736] CR2: 0000000000000304 CR3: 000000003499d000 CR4: 00000000000006f0 [ 29.808747] Call Trace: [ 29.809156] flush_unmaps_timeout+0x126/0x1c0 [ 29.809800] domain_exit+0xd6/0x100 [ 29.810322] device_notifier+0x6b/0x70 [ 29.810902] notifier_call_chain+0x4a/0x70 [ 29.812822] __blocking_notifier_call_chain+0x47/0x60 [ 29.814499] blocking_notifier_call_chain+0x16/0x20 [ 29.816137] device_del+0x233/0x320 [ 29.817588] pci_remove_bus_device+0x6f/0x110 [ 29.819133] pci_stop_and_remove_bus_device+0x1a/0x20 [ 29.820817] pciehp_unconfigure_device+0x7a/0x1d0 [ 29.822434] pciehp_disable_slot+0x52/0xe0 [ 29.823931] pciehp_power_thread+0x8a/0xa0 [ 29.825411] process_one_work+0x18c/0x3a0 [ 29.826875] worker_thread+0x4e/0x3b0 [ 29.828263] kthread+0x109/0x140 [ 29.829564] ? process_one_work+0x3a0/0x3a0 [ 29.831081] ? kthread_park+0x60/0x60 [ 29.832464] ret_from_fork+0x25/0x30 [ 29.833794] Code: 85 ed 74 0b 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 54 24 60 44 89 f8 0f b6 c4 48 8b 04 c2 48 85 c0 74 49 45 0f b6 ff 4a 8b 3c f8 <80> bf [ 29.838514] RIP: iommu_flush_iotlb_psi+0xcf/0x120 RSP: ffffb326805afbd0 [ 29.840362] CR2: 0000000000000304 [ 29.841716] ---[ end trace b10ec0d6900868d3 ]--- This patch fixes that problem if applied to v4.13 kernel. The bug does not exist on latest upstream kernel since it's fixed as a side effect of commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing", 2017-08-15). But IMHO it's still good to have this patch upstream. CC: Alex Williamson Signed-off-by: Peter Xu Fixes: a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi") Reviewed-by: Alex Williamson Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 6ec6bd8ec2e363f7955a992d97038b7fbed4974b Author: Nicholas Piggin Date: Sun Dec 24 02:49:22 2017 +1000 powerpc: System reset avoid interleaving oops using die synchronisation [ Upstream commit 4552d128c26e0f0f27a5bd2fadc24092b8f6c1d7 ] The die() oops path contains a serializing lock to prevent oops messages from being interleaved. In the case of a system reset initiated oops (e.g., qemu nmi command), __die was being called which lacks that synchronisation and oops reports could be interleaved across CPUs. A recent patch 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path") changed this to __die to avoid the debugger() call, but there is no real harm to calling it twice if the first time fell through. So go back to using die() here. This was observed to fix the problem. Fixes: 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path") Signed-off-by: Nicholas Piggin Reviewed-by: David Gibson Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit bc5fddf315f8a1664754bac4ef3752f4bc553e44 Author: Robin Murphy Date: Tue Jan 9 15:34:07 2018 +0000 iommu/exynos: Don't unconditionally steal bus ops [ Upstream commit dc98b8480d8a68c2ce9aa28b9f0d714fd258bc0b ] Removing the early device registration hook overlooked the fact that it only ran conditionally on a compatible device being present in the DT. With exynos_iommu_init() now running as an unconditional initcall, problems arise on non-Exynos systems when other IOMMU drivers find themselves unable to install their ops on the platform bus, or at worst the Exynos ops get called with someone else's domain and all hell breaks loose. The global ops/cache setup could probably all now be triggered from the first IOMMU probe, as with dma_dev assigment, but for the time being the simplest fix is to resurrect the logic from commit a7b67cd5d9af ("iommu/exynos: Play nice in multi-platform builds") to explicitly check the DT for the presence of an Exynos IOMMU before trying anything. Fixes: 928055a01b3f ("iommu/exynos: Remove custom platform device registration code") Signed-off-by: Robin Murphy Acked-by: Marek Szyprowski Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 77d17d0e8934b65864051d74b099f04420066689 Author: Thomas Richter Date: Wed Jan 17 14:16:11 2018 +0100 perf record: Fix failed memory allocation for get_cpuid_str [ Upstream commit 81fccd6ca507d3b2012eaf1edeb9b1dbf4bd22db ] In x86 architecture dependend part function get_cpuid_str() mallocs a 128 byte buffer, but does not check if the memory allocation succeeded or not. When the memory allocation fails, function __get_cpuid() is called with first parameter being a NULL pointer. However this function references its first parameter and operates on a NULL pointer which might cause core dumps. Signed-off-by: Thomas Richter Cc: Heiko Carstens Cc: Hendrik Brueckner Cc: Martin Schwidefsky Link: http://lkml.kernel.org/r/20180117131611.34319-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 1fe5e88c389a2f7bda893ed57aa0f3fff52b82b0 Author: Steven Rostedt (VMware) Date: Thu Jan 11 19:47:51 2018 -0500 tools lib traceevent: Fix get_field_str() for dynamic strings [ Upstream commit d777f8de99b05d399c0e4e51cdce016f26bd971b ] If a field is a dynamic string, get_field_str() returned just the offset/size value and not the string. Have it parse the offset/size correctly to return the actual string. Otherwise filtering fails when trying to filter fields that are dynamic strings. Reported-by: Gopanapalli Pradeep Signed-off-by: Steven Rostedt Acked-by: Namhyung Kim Cc: Andrew Morton Link: http://lkml.kernel.org/r/20180112004823.146333275@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4e63115b6b9d040c153a9d4b53cd783e0ce9ae76 Author: Arnaldo Carvalho de Melo Date: Mon Jan 15 11:07:58 2018 -0300 perf callchain: Fix attr.sample_max_stack setting [ Upstream commit 249d98e567e25dd03e015e2d31e1b7b9648f34df ] When setting the "dwarf" unwinder for a specific event and not specifying the max-stack, the attr.sample_max_stack ended up using an uninitialized callchain_param.max_stack, fix it by using designated initializers for that callchain_param variable, zeroing all non explicitely initialized struct members. Here is what happened: # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 callchain: type DWARF callchain: stack dump size 8192 perf_event_attr: type 2 size 112 config 0x730 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC exclude_callchain_user 1 { wakeup_events, wakeup_watermark } 1 sample_regs_user 0xff0fff sample_stack_user 8192 sample_max_stack 50656 sys_perf_event_open failed, error -75 Value too large for defined data type # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 callchain: type DWARF callchain: stack dump size 8192 perf_event_attr: type 2 size 112 config 0x730 sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC exclude_callchain_user 1 sample_regs_user 0xff0fff sample_stack_user 8192 sample_max_stack 30448 sys_perf_event_open failed, error -75 Value too large for defined data type # Now the attr.sample_max_stack is set to zero and the above works as expected: # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms 0.000 probe_libc:inet_pton:(7feb7a998350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa39b6108f3f] (/usr/bin/ping) # Cc: Adrian Hunter Cc: David Ahern Cc: Hendrick Brueckner Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Richter Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 448bcd67b4c51bd7e4ac3a6d4a1b49e880c5bb5a Author: Steven Rostedt (VMware) Date: Thu Jan 11 19:47:45 2018 -0500 tools lib traceevent: Simplify pointer print logic and fix %pF [ Upstream commit 38d70b7ca1769f26c0b79f3c08ff2cc949712b59 ] When processing %pX in pretty_print(), simplify the logic slightly by incrementing the ptr to the format string if isalnum(ptr[1]) is true. This follows the logic a bit more closely to what is in the kernel. Also, this fixes a small bug where %pF was not giving the offset of the function. Signed-off-by: Steven Rostedt Acked-by: Namhyung Kim Cc: Andrew Morton Link: http://lkml.kernel.org/r/20180112004822.260262257@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0eda4d03ef4c4c707a3f40c5afe634238775ff8a Author: Arnaldo Carvalho de Melo Date: Mon Jan 15 16:48:46 2018 -0300 perf unwind: Do not look just at the global callchain_param.record_mode [ Upstream commit eabad8c6856f185f876b54c426c2cc69fe0f0a7d ] When setting up DWARF callchains on specific events, without using 'record' or 'trace' --call-graph, but instead doing it like: perf trace -e cycles/call-graph=dwarf/ The unwind__prepare_access() call in thread__insert_map() when we process PERF_RECORD_MMAP(2) metadata events were not being performed, precluding us from using per-event DWARF callchains, handling them just when we asked for all events to be DWARF, using "--call-graph dwarf". We do it in the PERF_RECORD_MMAP because we have to look at one of the executable maps to figure out the executable type (64-bit, 32-bit) of the DSO laid out in that mmap. Also to look at the architecture where the perf.data file was recorded. All this probably should be deferred to when we process a sample for some thread that has callchains, so that we do this processing only for the threads with samples, not for all of them. For now, fix using DWARF on specific events. Before: # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms 0.000 probe_libc:inet_pton:(7fe9597bb350)) Problem processing probe_libc:inet_pton callchain, skipping... # After: # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms 0.000 probe_libc:inet_pton:(7fd4aa930350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa804e51af3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa804e51b379] (/usr/bin/ping) # # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms 0.000 probe_libc:inet_pton:(7f9363b9e350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffa9e8a14e0f3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffa9e8a14e1379] (/usr/bin/ping) # # perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms 0.000 probe_libc:inet_pton:(7f4947e1c350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa716d88ef3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa716d88f379] (/usr/bin/ping) # # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms 0.000 probe_libc:inet_pton:(7fa157696350)) __GI___inet_pton (/usr/lib64/libc-2.26.so) getaddrinfo (/usr/lib64/libc-2.26.so) [0xffffa9ba39c74f40] (/usr/bin/ping) # Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: David Ahern Cc: Hendrick Brueckner Cc: Jiri Olsa Cc: Thomas Richter Cc: Wang Nan Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f3a7d11834f30d04a67c68dd572493953c088834 Author: himanshu.madhani@cavium.com Date: Mon Jan 15 20:46:48 2018 -0800 scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout() [ Upstream commit 7ac0c332f96bb9688560726f5e80c097ed8de59a ] This patch fixes following Smatch warning: drivers/scsi/qla2xxx/qla_init.c:130 qla2x00_async_iocb_timeout() error: we previously assumed 'fcport' could be null (see line 107) Fixes: 5c25d451163c ("scsi: qla2xxx: Fix NULL pointer access for fcport structure") Reported by: Dan Carpenter Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit f3ce194cae63c2d1749dbfe4c92e1ce7b56e0648 Author: Shiraz Saleem Date: Thu Jan 11 18:10:51 2018 -0600 i40iw: Zero-out consumer key on allocate stag for FMR [ Upstream commit 6376e926af1a8661dd1b2e6d0896e07f84a35844 ] If the application invalidates the MR before the FMR WR, HW parses the consumer key portion of the stag and returns an invalid stag key Asynchronous Event (AE) that tears down the QP. Fix this by zeroing-out the consumer key portion of the allocated stag returned to application for FMR. Fixes: ee855d3b93f3 ("RDMA/i40iw: Add base memory management extensions") Signed-off-by: Shiraz Saleem Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b3b2ca24d9f7c5c34a073553a983c099a2862119 Author: Mustafa Ismail Date: Thu Jan 11 18:10:54 2018 -0600 i40iw: Free IEQ resources [ Upstream commit f20d429511affab6a2a9129f46042f43e6ffe396 ] The iWARP Exception Queue (IEQ) resources are not freed when a QP is destroyed. Fix this by freeing IEQ resources when freeing QP resources. Fixes: d37498417947 ("i40iw: add files for iwarp interface") Signed-off-by: Mustafa Ismail Signed-off-by: Shiraz Saleem Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0d5ef8956c84a4d6667d41085755b09ac54266c6 Author: Peter Hutterer Date: Tue Jan 16 15:20:58 2018 -0800 Input: synaptics - reset the ABS_X/Y fuzz after initializing MT axes [ Upstream commit 19eb4ed1141bd1096b9bc84ba9c4d03d5830c143 ] input_mt_init_slots() resets the ABS_X/Y fuzz to 0 and expects the driver to call input_mt_report_pointer_emulation(). That is based on the MT position bits which are already defuzzed - hence a fuzz of 0. In the case of synaptics semi-mt devices, we report the ABS_X/Y axes manually. This results in the MT position being defuzzed but the single-touch emulation missing that defuzzing. Work around this by re-initializing the ABS_X/Y axes after the MT axis to get the same fuzz value back. https://bugs.freedesktop.org/show_bug.cgi?id=104533 Signed-off-by: Peter Hutterer Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0d9a46ae3204b0d07ec6a4de00573cf12b0b1822 Author: Jesper Dangaard Brouer Date: Wed Jan 17 00:20:40 2018 +0100 libbpf: Makefile set specified permission mode [ Upstream commit 7110d80d53f472956420cd05a6297f49b558b674 ] The third parameter to do_install was not used by $(INSTALL) command. Fix this by only setting the -m option when the third parameter is supplied. The use of a third parameter was introduced in commit eb54e522a000 ("bpf: install libbpf headers on 'make install'"). Without this change, the header files are install as executables files (755). Fixes: eb54e522a000 ("bpf: install libbpf headers on 'make install'") Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Daniel Borkmann Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d925c308742284e508c984bc820b47a3d5c88397 Author: Dmitry Torokhov Date: Tue Jan 9 13:44:46 2018 -0800 Input: psmouse - fix Synaptics detection when protocol is disabled [ Upstream commit 2bc4298f59d2f15175bb568e2d356b5912d0cdd9 ] When Synaptics protocol is disabled, we still need to try and detect the hardware, so we can switch to SMBus device if SMbus is detected, or we know that it is Synaptics device and reset it properly for the bare PS/2 protocol. Fixes: c378b5119eb0 ("Input: psmouse - factor out common protocol probing code") Reported-by: Matteo Croce Tested-by: Matteo Croce Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 03fdc4ef7a67217506d48e1fcd2a03ab9c1f74bb Author: Alex Williamson Date: Tue Jan 16 10:05:26 2018 -0700 PCI: Add function 1 DMA alias quirk for Marvell 9128 [ Upstream commit aa008206634363ef800fbd5f0262016c9ff81dea ] The Marvell 9128 is the original device generating bug 42679, from which many other Marvell DMA alias quirks have been sourced, but we didn't have positive confirmation of the fix on 9128 until now. Link: https://bugzilla.kernel.org/show_bug.cgi?id=42679 Link: https://www.spinics.net/lists/kvm/msg161459.html Reported-by: Binarus Tested-by: Binarus Signed-off-by: Alex Williamson Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c45ab4fb384cb3fc05e0949ca3a6ca4d6f7613fb Author: Masami Hiramatsu Date: Sun Jan 14 22:50:07 2018 +0900 selftest: ftrace: Fix to pick text symbols for kprobes [ Upstream commit 5e46664703b364434a2cbda3e6988fc24ae0ced5 ] Fix to pick text symbols for multiple kprobe testcase. kallsyms shows text symbols with " t " or " T " but current testcase picks all symbols including "t", so it picks data symbols if it includes 't' (e.g. "str"). This fixes it to find symbol lines with " t " or " T " (including spaces). Signed-off-by: Masami Hiramatsu Reported-by: Russell King Acked-by: Steven Rostedt (VMware) Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 342d9092a50184b4b6d1c3f1f6bed06321afa88d Author: Chuck Lever Date: Thu Dec 14 20:56:09 2017 -0500 xprtrdma: Fix backchannel allocation of extra rpcrdma_reps [ Upstream commit d698c4a02ee02053bbebe051322ff427a2dad56a ] The backchannel code uses rpcrdma_recv_buffer_put to add new reps to the free rep list. This also decrements rb_recv_count, which spoofs the receive overrun logic in rpcrdma_buffer_get_rep. Commit 9b06688bc3b9 ("xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)") replaced the original open-coded list_add with a call to rpcrdma_recv_buffer_put(), but then a year later, commit 05c974669ece ("xprtrdma: Fix receive buffer accounting") added rep accounting to rpcrdma_recv_buffer_put. It was an oversight to let the backchannel continue to use this function. The fix this, let's combine the "add to free list" logic with rpcrdma_create_rep. Also, do not allocate RPCRDMA_MAX_BC_REQUESTS rpcrdma_reps in rpcrdma_buffer_create and then allocate additional rpcrdma_reps in rpcrdma_bc_setup_reps. Allocating the extra reps during backchannel set-up is sufficient. Fixes: 05c974669ece ("xprtrdma: Fix receive buffer accounting") Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 79f2ced39657f6eb539f70664c355183cf21aa79 Author: Hans de Goede Date: Thu Jan 11 15:14:39 2018 +0100 platform/x86: dell-laptop: Filter out spurious keyboard backlight change events [ Upstream commit 4d6bde512a86c32df3a1f289d2b4cd04b17758d1 ] On some Dell XPS models WMI events of type 0x0000 reporting a keycode of 0xe00c get reported when the brightness of the LCD panel changes. This leads to us reporting false-positive kbd_led change events to userspace which in turn leads to the kbd backlight OSD showing when it should not. We already read the current keyboard backlight brightness value when reporting events because the led_classdev_notify_brightness_hw_changed API requires this. Compare this value to the last known value and filter out duplicate events, fixing this. Note the fixed issue is esp. a problem on XPS models with an ambient light sensor and automatic brightness adjustments turned on, this causes the kbd backlight OSD to show all the time there. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514969 Fixes: 9c656b0799 ("platform/x86: dell-*: Call new led hw_changed API ...") Acked-by: Pali Rohár Signed-off-by: Hans de Goede Signed-off-by: Andy Shevchenko Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 80bd91ab9ad853d2f4f769b732fb8a501c90ec9d Author: Christian Borntraeger Date: Thu Nov 16 15:12:52 2017 +0100 KVM: s390: use created_vcpus in more places [ Upstream commit 241e3ec0faf5ab1a0d9b1f6c43eefa919fb9c112 ] commit a03825bbd0c3 ("KVM: s390: use kvm->created_vcpus") introduced kvm->created_vcpus to avoid races with the existing kvm->online_vcpus scheme. One place was "forgotten" and one new place was "added". Let's fix those. Reported-by: Halil Pasic Signed-off-by: Christian Borntraeger Reviewed-by: Halil Pasic Reviewed-by: Cornelia Huck Reviewed-by: David Hildenbrand Fixes: 4e0b1ab72b8a ("KVM: s390: gs support for kvm guests") Fixes: a03825bbd0c3 ("KVM: s390: use kvm->created_vcpus") Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit a5a8ca753c0c4659287416ce282ba357c30098a8 Author: Anna-Maria Gleixner Date: Thu Dec 21 11:41:37 2017 +0100 tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account [ Upstream commit 91633eed73a3ac37aaece5c8c1f93a18bae616a9 ] So far only CLOCK_MONOTONIC and CLOCK_REALTIME were taken into account as well as HRTIMER_MODE_ABS/REL in the hrtimer_init tracepoint. The query for detecting the ABS or REL timer modes is not valid anymore, it got broken by the introduction of HRTIMER_MODE_PINNED. HRTIMER_MODE_PINNED is not evaluated in the hrtimer_init() call, but for the sake of completeness print all given modes. Signed-off-by: Anna-Maria Gleixner Cc: Christoph Hellwig Cc: John Stultz Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-9-anna-maria@linutronix.de Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit e0a1cec3db0a2fcf002fc101a15fcede7551c699 Author: Subash Abhinov Kasiviswanathan Date: Fri Jan 12 17:36:27 2018 -0700 netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460 [ Upstream commit 83f1999caeb14e15df205e80d210699951733287 ] ipv6_defrag pulls network headers before fragment header. In case of an error, the netfilter layer is currently dropping these packets. This results in failure of some IPv6 standards tests which passed on older kernels due to the netfilter framework using cloning. The test case run here is a check for ICMPv6 error message replies when some invalid IPv6 fragments are sent. This specific test case is listed in https://www.ipv6ready.org/docs/Core_Conformance_Latest.pdf in the Extension Header Processing Order section. A packet with unrecognized option Type 11 is sent and the test expects an ICMP error in line with RFC2460 section 4.2 - 11 - discard the packet and, only if the packet's Destination Address was not a multicast address, send an ICMP Parameter Problem, Code 2, message to the packet's Source Address, pointing to the unrecognized Option Type. Since netfilter layer now drops all invalid IPv6 frag packets, we no longer see the ICMP error message and fail the test case. To fix this, save the transport header. If defrag is unable to process the packet due to RFC2460, restore the transport header and allow packet to be processed by stack. There is no change for other packet processing paths. Tested by confirming that stack sends an ICMP error when it receives these packets. Also tested that fragmented ICMP pings succeed. v1->v2: Instead of cloning always, save the transport_header and restore it in case of this specific error. Update the title and commit message accordingly. Signed-off-by: Subash Abhinov Kasiviswanathan Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ddf09f2a0896bb29f1574d0ef983f42eb8592346 Author: Paul Mackerras Date: Fri Jan 12 20:55:20 2018 +1100 KVM: PPC: Book3S HV: Enable migration of decrementer register [ Upstream commit 5855564c8ab2d9cefca7b2933bd19818eb795e40 ] This adds a register identifier for use with the one_reg interface to allow the decrementer expiry time to be read and written by userspace. The decrementer expiry time is in guest timebase units and is equal to the sum of the decrementer and the guest timebase. (The expiry time is used rather than the decrementer value itself because the expiry time is not constantly changing, though the decrementer value is, while the guest vcpu is not running.) Without this, a guest vcpu migrated to a new host will see its decrementer set to some random value. On POWER8 and earlier, the decrementer is 32 bits wide and counts down at 512MHz, so the guest vcpu will potentially see no decrementer interrupts for up to about 4 seconds, which will lead to a stall. With POWER9, the decrementer is now 56 bits side, so the stall can be much longer (up to 2.23 years) and more noticeable. To help work around the problem in cases where userspace has not been updated to migrate the decrementer expiry time, we now set the default decrementer expiry at vcpu creation time to the current time rather than the maximum possible value. This should mean an immediate decrementer interrupt when a migrated vcpu starts running. In cases where the decrementer is 32 bits wide and more than 4 seconds elapse between the creation of the vcpu and when it first runs, the decrementer would have wrapped around to positive values and there may still be a stall - but this is no worse than the current situation. In the large-decrementer case, we are sure to get an immediate decrementer interrupt (assuming the time from vcpu creation to first run is less than 2.23 years) and we thus avoid a very long stall. Signed-off-by: Paul Mackerras Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit b7b27e19e374e12b8bb3d12627242e6c5d69d706 Author: Parav Pandit Date: Fri Jan 12 07:58:42 2018 +0200 RDMA/core: Clarify rdma_ah_find_type [ Upstream commit a6532e7139660c103dda181aa5b2c734aa26ed6c ] iWARP does not use rdma_ah_attr_type, and for this reason we do not have a RDMA_AH_ATTR_TYPE_IWARP. rdma_ah_find_type should not even be called on iwarp ports and for clarity it shouldn't have a special test for iWarp. This changes the result from RDMA_AH_ATTR_TYPE_ROCE to RDMA_AH_ATTR_TYPE_IB when wrongly called on an iWarp port. Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Signed-off-by: Parav Pandit Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 8e40eae185f85426d9d6d667f2e55aa721b84e0d Author: Paolo Bonzini Date: Thu Oct 26 15:45:47 2017 +0200 kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl [ Upstream commit 51776043afa415435c7e4636204fbe4f7edc4501 ] This ioctl is obsolete (it was used by Xenner as far as I know) but still let's not break it gratuitously... Its handler is copying directly into struct kvm. Go through a bounce buffer instead, with the added benefit that we can actually do something useful with the flags argument---the previous code was exiting with -EINVAL but still doing the copy. This technically is a userspace ABI breakage, but since no one should be using the ioctl, it's a good occasion to see if someone actually complains. Cc: kernel-hardening@lists.openwall.com Cc: Kees Cook Cc: Radim Krčmář Signed-off-by: Paolo Bonzini Signed-off-by: Kees Cook Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 3f3017fa1540cf5eb6bd6af1f76f76a831c99652 Author: Dan Carpenter Date: Mon Jan 15 11:08:38 2018 +0300 ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read() [ Upstream commit 123af9043e93cb6f235207d260d50f832cdb5439 ] The loop timeout doesn't work because it's a post op and ends with "tmo" set to -1. I changed it from a post-op to a pre-op and I changed the initial the starting value from 5 to 6 so we still iterate 5 times. I left the other as it was because it's a large number. Fixes: b3c70c9ea62a ("ASoC: Alchemy AC97C/I2SC audio support") Signed-off-by: Dan Carpenter Signed-off-by: Mark Brown Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit d3222cfc0b58f022a17f5f936b5c5d2df8a3f868 Author: Takashi Iwai Date: Mon Jan 15 10:44:35 2018 +0100 ALSA: hda - Use IS_REACHABLE() for dependency on input [ Upstream commit c469652bb5e8fb715db7d152f46d33b3740c9b87 ] The commit ffcd28d88e4f ("ALSA: hda - Select INPUT for Realtek HD-audio codec") introduced the reverse-selection of CONFIG_INPUT for Realtek codec in order to avoid the mess with dependency between built-in and modules. Later on, we obtained IS_REACHABLE() macro exactly for this kind of problems, and now we can remove th INPUT selection in Kconfig and put IS_REACHABLE(INPUT) to the appropriate places in the code, so that the driver doesn't need to select other subsystem forcibly. Fixes: ffcd28d88e4f ("ALSA: hda - Select INPUT for Realtek HD-audio codec") Reported-by: Randy Dunlap Acked-by: Randy Dunlap # and build-tested Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4e7284b34c78a11f04452621b3eb41e1bdb83bff Author: Hans de Goede Date: Sun Jan 14 21:01:48 2018 +0100 ACPI / LPSS: Do not instiate platform_dev for devs without MMIO resources [ Upstream commit e1681599345b8466786b6e54a2db2a00a068a3f3 ] acpi_lpss_create_device() skips handling LPSS devices which do not have a mmio resources in their resource list (typically these devices are disabled by the firmware). But since the LPSS code does not bind to the device, acpi_bus_attach() ends up still creating a platform device for it and the regular platform_driver for the ACPI HID still tries to bind to it. This happens e.g. on some boards which do not use the pwm-controller and have an empty or invalid resource-table for it. Currently this causes these error messages to get logged: [ 3.281966] pwm-lpss 80862288:00: invalid resource [ 3.287098] pwm-lpss: probe of 80862288:00 failed with error -22 This commit stops the undesirable creation of a platform_device for disabled LPSS devices by setting pnp.type.platform_id to 0. Note that acpi_scan_attach_handler() also sets pnp.type.platform_id to 0 when there is a matching handler for the device and that handler has no attach callback, so we simply behave as a handler without an attach function in this case. Signed-off-by: Hans de Goede Acked-by: Mika Westerberg Reviewed-by: Andy Shevchenko Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7a420b5d95a5428887dbcbd338ddb36fdbdc9233 Author: NeilBrown Date: Wed Dec 13 09:57:09 2017 +1100 NFSv4: always set NFS_LOCK_LOST when a lock is lost. [ Upstream commit dce2630c7da73b0634686bca557cc8945cc450c8 ] There are 2 comments in the NFSv4 code which suggest that SIGLOST should possibly be sent to a process. In these cases a lock has been lost. The current practice is to set NFS_LOCK_LOST so that read/write returns EIO when a lock is lost. So change these comments to code when sets NFS_LOCK_LOST. One case is when lock recovery after apparent server restart fails with NFS4ERR_DENIED, NFS4ERR_RECLAIM_BAD, or NFS4ERRO_RECLAIM_CONFLICT. The other case is when a lock attempt as part of lease recovery fails with NFS4ERR_DENIED. In an ideal world, these should not happen. However I have a packet trace showing an NFSv4.1 session getting NFS4ERR_BADSESSION after an extended network parition. The NFSv4.1 client treats this like server reboot until/unless it get NFS4ERR_NO_GRACE, in which case it switches over to "nograce" recovery mode. In this network trace, the client attempts to recover a lock and the server (incorrectly) reports NFS4ERR_DENIED rather than NFS4ERR_NO_GRACE. This leads to the ineffective comment and the client then continues to write using the OPEN stateid. Signed-off-by: NeilBrown Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 239c948e3266adba944da4ea257362e773f39823 Author: Peter Zijlstra Date: Fri Dec 22 10:20:11 2017 +0100 x86/tsc: Allow TSC calibration without PIT [ Upstream commit 30c7e5b123673d5e570e238dbada2fb68a87212c ] Zhang Rui reported that a Surface Pro 4 will fail to boot with lapic=notscdeadline. Part of the problem is that that machine doesn't have a PIT. If, for some reason, the TSC init has to fall back to TSC calibration, it relies on the PIT to be present. Allow TSC calibration to reliably fall back to HPET. The below results in an accurate TSC measurement when forced on a IVB: tsc: Unable to calibrate against PIT tsc: No reference (HPET/PMTIMER) available tsc: Unable to calibrate against PIT tsc: using HPET reference calibration tsc: Detected 2792.451 MHz processor Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Thomas Gleixner Cc: len.brown@intel.com Cc: rui.zhang@intel.com Link: https://lkml.kernel.org/r/20171222092243.333145937@infradead.org Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4a5d70332d57bd473ffe76d8777a2e9f847c7863 Author: Hector Martin Date: Fri Nov 3 20:28:57 2017 +0900 firewire-ohci: work around oversized DMA reads on JMicron controllers [ Upstream commit 188775181bc05f29372b305ef96485840e351fde ] At least some JMicron controllers issue buggy oversized DMA reads when fetching context descriptors, always fetching 0x20 bytes at once for descriptors which are only 0x10 bytes long. This is often harmless, but can cause page faults on modern systems with IOMMUs: DMAR: [DMA Read] Request device [05:00.0] fault addr fff56000 [fault reason 06] PTE Read access is not set firewire_ohci 0000:05:00.0: DMA context IT0 has stopped, error code: evt_descriptor_read This works around the problem by always leaving 0x10 padding bytes at the end of descriptor buffer pages, which should be harmless to do unconditionally for controllers in case others have the same behavior. Signed-off-by: Hector Martin Reviewed-by: Clemens Ladisch Signed-off-by: Stefan Richter Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 1f52b0c642157bafda295c16a4650b4f0c26f4d1 Author: Merlijn Wajer Date: Tue Mar 13 09:48:40 2018 -0500 usb: musb: Fix external abort in musb_remove on omap2430 commit 94e46a4f2d5eb14059e42f313c098d4854847376 upstream. This fixes an oops on unbind / module unload (on the musb omap2430 platform). musb_remove function now calls musb_platform_exit before disabling runtime pm. Signed-off-by: Merlijn Wajer Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman commit de4c4914cce296b8dd9f5381f5d5dad70a065ba0 Author: Merlijn Wajer Date: Mon Mar 5 11:35:10 2018 -0600 usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers commit df6b074dbe248d8c43a82131e8fd429e401841a5 upstream. Without pm_runtime_{get,put}_sync calls in place, reading vbus status via /sys causes the following error: Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060 pgd = b333e822 [fa0ab060] *pgd=48011452(bad) [] (musb_default_readb) from [] (musb_vbus_show+0x58/0xe4) [] (musb_vbus_show) from [] (dev_attr_show+0x20/0x44) [] (dev_attr_show) from [] (sysfs_kf_seq_show+0x80/0xdc) [] (sysfs_kf_seq_show) from [] (seq_read+0x250/0x448) [] (seq_read) from [] (__vfs_read+0x1c/0x118) [] (__vfs_read) from [] (vfs_read+0x90/0x144) [] (vfs_read) from [] (SyS_read+0x3c/0x74) [] (SyS_read) from [] (ret_fast_syscall+0x0/0x54) Solution was suggested by Tony Lindgren . Signed-off-by: Merlijn Wajer Acked-by: Tony Lindgren Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman commit 43de32cdf0f4f73519e2df12fb93adc24f9746cb Author: Andreas Kemnade Date: Tue Feb 20 07:30:10 2018 -0600 usb: musb: fix enumeration after resume commit 17539f2f4f0b7fa906b508765c8ada07a1e45f52 upstream. On dm3730 there are enumeration problems after resume. Investigation led to the cause that the MUSB_POWER_SOFTCONN bit is not set. If it was set before suspend (because it was enabled via musb_pullup()), it is set in musb_restore_context() so the pullup is enabled. But then musb_start() is called which overwrites MUSB_POWER and therefore disables MUSB_POWER_SOFTCONN, so no pullup is enabled and the device is not enumerated. So let's do a subset of what musb_start() does in the same way as musb_suspend() does it. Platform-specific stuff it still called as there might be some phy-related stuff which needs to be enabled. Also interrupts are enabled, as it was the original idea of calling musb_start() in musb_resume() according to Commit 6fc6f4b87cb3 ("usb: musb: Disable interrupts on suspend, enable them on resume") Signed-off-by: Andreas Kemnade Tested-by: Tony Lindgren Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman commit 829239740c129fe9aaf276843fd546bdd45f114e Author: Imre Deak Date: Tue Jan 30 16:29:38 2018 +0200 drm/i915/bxt, glk: Increase PCODE timeouts during CDCLK freq changing commit 5e1df40f40ee45a97bb1066c3d71f0ae920a9672 upstream. Currently we see sporadic timeouts during CDCLK changing both on BXT and GLK as reported by the Bugzilla: ticket. It's easy to reproduce this by changing the frequency in a tight loop after blanking the display. The upper bound for the completion time is 800us based on my tests, so increase it from the current 500us to 2ms; with that I couldn't trigger the problem either on BXT or GLK. Note that timeouts happened during both the change notification and the voltage level setting PCODE request. (For the latter one BSpec doesn't require us to wait for completion before further HW programming.) This issue is similar to commit 2c7d0602c815 ("drm/i915/gen9: Fix PCODE polling during CDCLK change notification") but there the PCODE request does complete (as shown by the mbox busy flag), only the reply we get from PCODE indicates a failure. So there we keep resending the request until a success reply, here we just have to increase the timeout for the one PCODE request we send. v2: - s/snb_pcode_request/sandybridge_pcode_write_timeout/ (Ville) Cc: Chris Wilson Cc: Ville Syrjälä Cc: # v4.4+ Acked-by: Chris Wilson (v1) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103326 Reviewed-by: Ville Syrjälä Signed-off-by: Imre Deak Link: https://patchwork.freedesktop.org/patch/msgid/20180130142939.17983-1-imre.deak@intel.com (cherry picked from commit e76019a81921e87a4d9e7b3d86102bc708a6c227) Signed-off-by: Rodrigo Vivi (Rebased for v4.14 stable tree due to upstream cdclk_state and pcu_lock change) Signed-off-by: Imre Deak Signed-off-by: Greg Kroah-Hartman commit 5c825627d4e5dd1989b44b7c27feba8061084096 Author: Imre Deak Date: Mon Apr 16 18:53:09 2018 +0300 drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state commit 7eb2c4dd54ff841f2fe509a84973eb25fa20bda2 upstream. LSPCON adapters in low-power state may ignore the first I2C write during TMDS output buffer enabling, resulting in a blank screen even with an otherwise enabled pipe. Fix this by reading back and validating the written value a few times. The problem was noticed on GLK machines with an onboard LSPCON adapter after entering/exiting DC5 power state. Doing an I2C read of the adapter ID as the first transaction - instead of the I2C write to enable the TMDS buffers - returns the correct value. Based on this we assume that the transaction itself is sent properly, it's only the adapter that is not ready for some reason to accept this first write after waking from low-power state. In my case the second I2C write attempt always succeeded. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105854 Cc: Clinton Taylor Cc: Ville Syrjälä Cc: stable@vger.kernel.org Signed-off-by: Imre Deak Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20180416155309.11100-1-imre.deak@intel.com Signed-off-by: Greg Kroah-Hartman commit 6312eff3c70ed43bc0d4a86a042772e4921d649d Author: Xidong Wang Date: Wed Apr 4 10:38:24 2018 +0100 drm/i915: Do no use kfree() to free a kmem_cache_alloc() return value commit fcf1fadf4c65eea6c519c773d2d9901e8ad94f5f upstream. Along the eb_lookup_vmas() error path, the return value from kmem_cache_alloc() was freed using kfree(). Fix it to use the proper kmem_cache_free() instead. Fixes: d1b48c1e7184 ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: Xidong Wang Cc: Chris Wilson Cc: Tvrtko Ursulin Cc: # v4.14+ Reviewed-by: Chris Wilson Signed-off-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20180404093824.9313-1-chris@chris-wilson.co.uk (cherry picked from commit 6be1187dbffa0027ea379c53f7ca0c782515c610) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 8e0489cf4d098ab4fe1c30941390362fb2843b7c Author: Gaurav K Singh Date: Tue Apr 17 23:52:18 2018 +0530 drm/i915/audio: Fix audio detection issue on GLK commit b4615730530be85fc45ab4631c2ad6d8e2d0b97d upstream. On Geminilake, sometimes audio card is not getting detected after reboot. This is a spurious issue happening on Geminilake. HW codec and HD audio controller link was going out of sync for which there was a fix in i915 driver but was not getting invoked for GLK. Extending this fix to GLK as well. Tested by Du,Wenkai on GLK board. Bspec: 21829 v2: Instead of checking GEN9_BC, BXT and GLK macros, use IS_GEN9 macro (Jani N) Cc: # b651bd2a3ae3 ("drm/i915/audio: Fix audio enumeration issue on BXT") Cc: Signed-off-by: Gaurav K Singh Reviewed-by: Abhay Kumar Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/1523989338-29677-1-git-send-email-gaurav.k.singh@intel.com (cherry picked from commit 8221229046e862977ae93ec9d34aa583fbd10397) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit c53f225fd792970c75ce0cb4042a11d04ef06d09 Author: Gerd Hoffmann Date: Wed Mar 21 15:08:47 2018 +0100 drm/i915/gvt: throw error on unhandled vfio ioctls commit 9f591ae60e1be026901398ef99eede91237aa3a1 upstream. On unknown/unhandled ioctls the driver should return an error, so userspace knows it tried to use something unsupported. Cc: stable@vger.kernel.org Signed-off-by: Gerd Hoffmann Reviewed-by: Alex Williamson Signed-off-by: Zhenyu Wang Signed-off-by: Greg Kroah-Hartman commit 325abf3db041d7ca130a5f84024da2ffffa35f0e Author: Daniel J Blueman Date: Mon Apr 2 15:10:35 2018 +0800 drm/vc4: Fix memory leak during BO teardown commit c0db1b677e1d584fab5d7ac76a32e1c0157542e0 upstream. During BO teardown, an indirect list 'uniform_addr_offsets' wasn't being freed leading to leaking many 128B allocations. Fix the memory leak by releasing it at teardown time. Cc: stable@vger.kernel.org Fixes: 6d45c81d229d ("drm/vc4: Add support for branching in shader validation.") Signed-off-by: Daniel J Blueman Signed-off-by: Eric Anholt Reviewed-by: Eric Anholt Link: https://patchwork.freedesktop.org/patch/msgid/20180402071035.25356-1-daniel@quora.org Signed-off-by: Greg Kroah-Hartman commit 08641a24d4e70607f767b066def49d3b3a3f9c98 Author: Xiaoming Gao Date: Fri Apr 13 17:48:08 2018 +0800 x86/tsc: Prevent 32bit truncation in calc_hpet_ref() commit d3878e164dcd3925a237a20e879432400e369172 upstream. The TSC calibration code uses HPET as reference. The conversion normalizes the delta of two HPET timestamps: hpetref = ((tshpet1 - tshpet2) * HPET_PERIOD) / 1e6 and then divides the normalized delta of the corresponding TSC timestamps by the result to calulate the TSC frequency. tscfreq = ((tstsc1 - tstsc2 ) * 1e6) / hpetref This uses do_div() which takes an u32 as the divisor, which worked so far because the HPET frequency was low enough that 'hpetref' never exceeded 32bit. On Skylake machines the HPET frequency increased so 'hpetref' can exceed 32bit. do_div() truncates the divisor, which causes the calibration to fail. Use div64_u64() to avoid the problem. [ tglx: Fixes whitespace mangled patch and rewrote changelog ] Signed-off-by: Xiaoming Gao Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Cc: peterz@infradead.org Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/38894564-4fc9-b8ec-353f-de702839e44e@gmail.com Signed-off-by: Greg Kroah-Hartman commit c6aaaaa4d62ad885a8ca0a255d4af975f843ee98 Author: Anson Huang Date: Thu Apr 19 14:04:43 2018 +0800 clocksource/imx-tpm: Correct -ETIME return condition check commit 7407188489c62a7b5694bc75a6db2b82af94c9a5 upstream. The additional brakects added to tpm_set_next_event's return value computation causes (int) forced type conversion NOT taking effect, and the incorrect value return will cause various system timer issue, like RCU stall etc.. Remove the additional brackets to make sure tpm_set_next_event always returns correct value. Fixes: 059ab7b82eec ("clocksource/drivers/imx-tpm: Add imx tpm timer support") Signed-off-by: Anson Huang Signed-off-by: Thomas Gleixner Acked-by: Dong Aisheng Cc: stable@vger.kernel.org Cc: daniel.lezcano@linaro.org Cc: Linux-imx@nxp.com Link: https://lkml.kernel.org/r/1524117883-2484-1-git-send-email-Anson.Huang@nxp.com Signed-off-by: Greg Kroah-Hartman commit b8d4055372b58aad4a51b67e176eabdcc238fde3 Author: Dou Liyang Date: Thu Apr 12 09:40:52 2018 +0800 x86/acpi: Prevent X2APIC id 0xffffffff from being accounted commit 10daf10ab154e31237a8c07242be3063fb6a9bf4 upstream. RongQing reported that there are some X2APIC id 0xffffffff in his machine's ACPI MADT table, which makes the number of possible CPU inaccurate. The reason is that the ACPI X2APIC parser has no sanity check for APIC ID 0xffffffff, which is an invalid id in all APIC types. See "Intel® 64 Architecture x2APIC Specification", Chapter 2.4.1. Add a sanity check to acpi_parse_x2apic() which ignores the invalid id. Reported-by: Li RongQing Signed-off-by: Dou Liyang Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20180412014052.25186-1-douly.fnst@cn.fujitsu.com Signed-off-by: Greg Kroah-Hartman commit f6edc45e21c35f9e0124f7ac8b2d8eb7f551cc3b Author: David Sterba Date: Mon Apr 16 21:10:14 2018 +0200 btrfs: fix unaligned access in readdir commit 92d32170847bfff2dd08af2c016085779f2fd2a1 upstream. The last update to readdir introduced a temporary buffer to store the emitted readdir data, but as there are file names of variable length, there's a lot of unaligned access. This was observed on a sparc64 machine: Kernel unaligned access at TPC[102f3080] btrfs_real_readdir+0x51c/0x718 [btrfs] Fixes: 23b5ec74943 ("btrfs: fix readdir deadlock with pagefault") CC: stable@vger.kernel.org # 4.14+ Reported-and-tested-by: René Rebe Reviewed-by: Liu Bo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 240a528684852cc3231b276b59c3e4bf1b533de5 Author: Steve French Date: Fri Apr 20 12:19:07 2018 -0500 cifs: do not allow creating sockets except with SMB1 posix exensions commit 1d0cffa674cfa7d185a302c8c6850fc50b893bed upstream. RHBZ: 1453123 Since at least the 3.10 kernel and likely a lot earlier we have not been able to create unix domain sockets in a cifs share when mounted using the SFU mount option (except when mounted with the cifs unix extensions to Samba e.g.) Trying to create a socket, for example using the af_unix command from xfstests will cause : BUG: unable to handle kernel NULL pointer dereference at 00000000 00000040 Since no one uses or depends on being able to create unix domains sockets on a cifs share the easiest fix to stop this vulnerability is to simply not allow creation of any other special files than char or block devices when sfu is used. Added update to Ronnie's patch to handle a tcon link leak, and to address a buf leak noticed by Gustavo and Colin. Acked-by: Gustavo A. R. Silva CC: Colin Ian King Reviewed-by: Pavel Shilovsky Reported-by: Eryu Guan Signed-off-by: Ronnie Sahlberg Signed-off-by: Steve French Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit d6949f48093c2d862d9bc39a7a89f2825c55edc4 Author: Greg Kroah-Hartman Date: Tue Apr 24 09:36:40 2018 +0200 Linux 4.14.36 commit 7c9b87a78a17017abe5def1ab53e3b261be6853f Author: Greg Thelen Date: Fri Apr 20 14:55:42 2018 -0700 writeback: safer lock nesting commit 2e898e4c0a3897ccd434adac5abb8330194f527b upstream. lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if the page's memcg is undergoing move accounting, which occurs when a process leaves its memcg for a new one that has memory.move_charge_at_immigrate set. unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if the given inode is switching writeback domains. Switches occur when enough writes are issued from a new domain. This existing pattern is thus suspicious: lock_page_memcg(page); unlocked_inode_to_wb_begin(inode, &locked); ... unlocked_inode_to_wb_end(inode, locked); unlock_page_memcg(page); If both inode switch and process memcg migration are both in-flight then unlocked_inode_to_wb_end() will unconditionally enable interrupts while still holding the lock_page_memcg() irq spinlock. This suggests the possibility of deadlock if an interrupt occurs before unlock_page_memcg(). truncate __cancel_dirty_page lock_page_memcg unlocked_inode_to_wb_begin unlocked_inode_to_wb_end end_page_writeback test_clear_page_writeback lock_page_memcg unlock_page_memcg Due to configuration limitations this deadlock is not currently possible because we don't mix cgroup writeback (a cgroupv2 feature) and memory.move_charge_at_immigrate (a cgroupv1 feature). If the kernel is hacked to always claim inode switching and memcg moving_account, then this script triggers lockup in less than a minute: cd /mnt/cgroup/memory mkdir a b echo 1 > a/memory.move_charge_at_immigrate echo 1 > b/memory.move_charge_at_immigrate ( echo $BASHPID > a/cgroup.procs while true; do dd if=/dev/zero of=/mnt/big bs=1M count=256 done ) & while true; do sync done & sleep 1h & SLEEP=$! while true; do echo $SLEEP > a/cgroup.procs echo $SLEEP > b/cgroup.procs done The deadlock does not seem possible, so it's debatable if there's any reason to modify the kernel. I suggest we should to prevent future surprises. And Wang Long said "this deadlock occurs three times in our environment", so there's more reason to apply this, even to stable. Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch see "[PATCH for-4.4] writeback: safer lock nesting" https://lkml.org/lkml/2018/4/11/146 Wang Long said "this deadlock occurs three times in our environment" [gthelen@google.com: v4] Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com [akpm@linux-foundation.org: comment tweaks, struct initialization simplification] Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613 Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates") Signed-off-by: Greg Thelen Reported-by: Wang Long Acked-by: Wang Long Acked-by: Michal Hocko Reviewed-by: Andrew Morton Cc: Johannes Weiner Cc: Tejun Heo Cc: Nicholas Piggin Cc: [v4.2+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [natechancellor: Adjust context due to lack of b93b016313b3b] Signed-off-by: Nathan Chancellor Signed-off-by: Greg Kroah-Hartman commit 071ff203d962478368bbc84c3c8f9ba40e9fe2b6 Author: Sean Young Date: Sun Apr 15 10:51:51 2018 +0100 media: staging: lirc_zilog: incorrect reference counting [not upstream as the driver is deleted in 4.16 - gregkh] Whenever poll is called, the reference count is increased but never decreased. This means that on rmmod, the lirc_thread is not stopped, and will trample over freed memory. Zilog/Hauppauge IR driver unloaded BUG: unable to handle kernel paging request at ffffffffc17ba640 Oops: 0010 [#1] SMP CPU: 1 PID: 667 Comm: zilog-rx-i2c-1 Tainted: P C OE 4.13.16-302.fc27.x86_64 #1 Hardware name: Gigabyte Technology Co., Ltd. GA-MA790FXT-UD5P/GA-MA790FXT-UD5P, BIOS F6 08/06/2009 task: ffff964eb452ca00 task.stack: ffffb254414dc000 RIP: 0010:0xffffffffc17ba640 RSP: 0018:ffffb254414dfe78 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff964ec1b35890 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246 RBP: ffffb254414dff00 R08: 000000000000036e R09: ffff964ecfc8dfd0 R10: ffffb254414dfe78 R11: 00000000000f4240 R12: ffff964ec2bf28a0 R13: ffff964ec1b358a8 R14: ffff964ec1b358d0 R15: ffff964ec1b35800 FS: 0000000000000000(0000) GS:ffff964ecfc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffc17ba640 CR3: 000000023058c000 CR4: 00000000000006e0 Call Trace: kthread+0x125/0x140 ? kthread_park+0x60/0x60 ? do_syscall_64+0x67/0x140 ret_from_fork+0x25/0x30 Code: Bad RIP value. RIP: 0xffffffffc17ba640 RSP: ffffb254414dfe78 CR2: ffffffffc17ba640 Note that zilog-rx-i2c-1 should have exited by now, but hasn't due to the missing put in poll(). This code has been replaced completely in kernel v4.16 by a new driver, see commit acaa34bf06e9 ("media: rc: implement zilog transmitter"), and commit f95367a7b758 ("media: staging: remove lirc_zilog driver"). Cc: stable@vger.kernel.org # v4.15- (all up to and including v4.15) Reported-by: Warren Sturm Tested-by: Warren Sturm Signed-off-by: Sean Young Signed-off-by: Greg Kroah-Hartman commit e7a08ffb2d897f4cfbd7b2931339c4643ad04f64 Author: Sean Young Date: Sun Apr 15 10:51:50 2018 +0100 Revert "media: lirc_zilog: driver only sends LIRCCODE" [not upstream as the driver is deleted in 4.16 - gregkh] The lirc config documented here https://www.blushingpenguin.com/mark/blog/?p=24 uses raw_codes for sending IR. Each key only has one pulse, which in fact is an index into the haup-ir-blaster.bin file. Changing the driver to LIRCCODE (although more accurate) breaks this configuration. This code has been replaced completely in kernel v4.16 by a new driver, see commit acaa34bf06e9 ("media: rc: implement zilog transmitter"), and commit f95367a7b758 ("media: staging: remove lirc_zilog driver"). This reverts commit 89d8a2cc51d1f29ea24a0b44dde13253141190a0. Fixes: 615cd3fe6ccc ("[media] media: lirc_dev: make better use of file->private_data") Cc: stable@vger.kernel.org # v4.14-v4.15 Reported-by: Warren Sturm Tested-by: Warren Sturm Signed-off-by: Sean Young Signed-off-by: Greg Kroah-Hartman commit 8caa4c5fde76315b964b77c4a7dade6833be4cb1 Author: Luca Coelho Date: Wed Mar 28 11:15:09 2018 +0300 iwlwifi: add a bunch of new 9000 PCI IDs commit 9e5053ad9d590e095829a8bb07adbbdbd893f0f9 upstream. A lot of new PCI IDs were added for the 9000 series. Add them to the list of supported PCI IDs. Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Luca Coelho Signed-off-by: Greg Kroah-Hartman commit 0c61952c3d199c8179cb8d2d400ef33c5511c3a3 Author: Luca Coelho Date: Wed Feb 21 22:18:59 2018 +0200 iwlwifi: add shared clock PHY config flag for some devices commit 86a2b2043af79deff5cf000c5a08847faa4f2ee0 upstream. Some devices use a shared clock which is very sensitive to variations and cause trouble in some situations. We need to set a bit in the phy configuration to indicate that to the FW. To make this generic, add a extra_phy_config_flags element to the device configuration and OR it into the phy_cfg before sending it to the firmware. And also create a set of configurations for devices that use shared clocks and need this extra bit to be set. Fixes: c62446d2b028 ("iwlwifi: add new 9460 series PCI IDs") Signed-off-by: Luca Coelho Signed-off-by: Greg Kroah-Hartman commit 30593709f80ddf23a3e8b41f3a7edfc68c4786be Author: Andrew Lunn Date: Sat Apr 7 20:37:40 2018 +0200 net: dsa: Discard frames from unused ports commit fc5f33768cca7144f8d793205b229d46740d183b upstream. The Marvell switches under some conditions will pass a frame to the host with the port being the CPU port. Such frames are invalid, and should be dropped. Not dropping them can result in a crash when incrementing the receive statistics for an invalid port. This has been reworked for 4.14, which does not have the central dsa_master_find_slave() function, so each tag driver needs to check. Reported-by: Chris Healy Fixes: 91da11f870f0 ("net: Distributed Switch Architecture protocol support") Signed-off-by: Andrew Lunn Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 90a32d1f0ec9844b7984f52f07fe7bcb0abfb6a7 Author: Matthew Wilcox Date: Fri Apr 20 14:56:20 2018 -0700 mm/filemap.c: fix NULL pointer in page_cache_tree_insert() commit abc1be13fd113ddef5e2d807a466286b864caed3 upstream. f2fs specifies the __GFP_ZERO flag for allocating some of its pages. Unfortunately, the page cache also uses the mapping's GFP flags for allocating radix tree nodes. It always masked off the __GFP_HIGHMEM flag, and masks off __GFP_ZERO in some paths, but not all. That causes radix tree nodes to be allocated with a NULL list_head, which causes backtraces like: __list_del_entry+0x30/0xd0 list_lru_del+0xac/0x1ac page_cache_tree_insert+0xd8/0x110 The __GFP_DMA and __GFP_DMA32 flags would also be able to sneak through if they are ever used. Fix them all by using GFP_RECLAIM_MASK at the innermost location, and remove it from earlier in the callchain. Link: http://lkml.kernel.org/r/20180411060320.14458-2-willy@infradead.org Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check") Signed-off-by: Matthew Wilcox Reported-by: Chris Fries Debugged-by: Minchan Kim Acked-by: Johannes Weiner Acked-by: Michal Hocko Reviewed-by: Jan Kara Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 5e7575c6690a2871a1db5d5fe86c8fe8aef00cd8 Author: Ian Kent Date: Fri Apr 20 14:55:59 2018 -0700 autofs: mount point create should honour passed in mode commit 1e6306652ba18723015d1b4967fe9de55f042499 upstream. The autofs file system mkdir inode operation blindly sets the created directory mode to S_IFDIR | 0555, ingoring the passed in mode, which can cause selinux dac_override denials. But the function also checks if the caller is the daemon (as no-one else should be able to do anything here) so there's no point in not honouring the passed in mode, allowing the daemon to set appropriate mode when required. Link: http://lkml.kernel.org/r/152361593601.8051.14014139124905996173.stgit@pluto.themaw.net Signed-off-by: Ian Kent Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit d4d49cb1c20a13f03cf1740f5ae690cb1dd1f43d Author: Al Viro Date: Thu Apr 19 22:03:08 2018 -0400 Don't leak MNT_INTERNAL away from internal mounts commit 16a34adb9392b2fe4195267475ab5b472e55292c upstream. We want it only for the stuff created by SB_KERNMOUNT mounts, *not* for their copies. As it is, creating a deep stack of bindings of /proc/*/ns/* somewhere in a new namespace and exiting yields a stack overflow. Cc: stable@kernel.org Reported-by: Alexander Aring Bisected-by: Kirill Tkhai Tested-by: Kirill Tkhai Tested-by: Alexander Aring Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 085125572a453938bf4b0f483ccb0d76c40f1d1c Author: Al Viro Date: Tue Apr 3 01:15:46 2018 -0400 rpc_pipefs: fix double-dput() commit 4a3877c4cedd95543f8726b0a98743ed8db0c0fb upstream. if we ever hit rpc_gssd_dummy_depopulate() dentry passed to it has refcount equal to 1. __rpc_rmpipe() drops it and dput() done after that hits an already freed dentry. Cc: stable@kernel.org Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 0bb4a6f2ff1a9081a35b07760fa40f9880d16bc6 Author: Al Viro Date: Tue Apr 3 00:13:17 2018 -0400 orangefs_kill_sb(): deal with allocation failures commit 659038428cb43a66e3eff71e2c845c9de3611a98 upstream. orangefs_fill_sb() might've failed to allocate ORANGEFS_SB(s); don't oops in that case. Cc: stable@kernel.org Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit bb5def77d0a1e41b53f7049fdb0f640408194068 Author: Al Viro Date: Mon Apr 2 23:50:31 2018 -0400 hypfs_kill_super(): deal with failed allocations commit a24cd490739586a7d2da3549a1844e1d7c4f4fc4 upstream. hypfs_fill_super() might fail to allocate sbi; hypfs_kill_super() should not oops on that. Cc: stable@vger.kernel.org Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit c780ac96e120081a1d2e892f47ce34245ef1cfb2 Author: Al Viro Date: Mon Apr 2 23:56:44 2018 -0400 jffs2_kill_sb(): deal with failed allocations commit c66b23c2840446a82c389e4cb1a12eb2a71fa2e4 upstream. jffs2_fill_super() might fail to allocate jffs2_sb_info; jffs2_kill_sb() must survive that. Cc: stable@kernel.org Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 22ec5ee71086284911b21339eaf4d2c2ec1df351 Author: Ville Syrjälä Date: Wed Feb 14 21:23:23 2018 +0200 drm/i915: Correctly handle limited range YCbCr data on VLV/CHV commit 5deae9191130db6b617c94fb261804597cf9b508 upstream. Turns out the VLV/CHV fixed function sprite CSC expects full range data as input. We've been feeding it limited range data to it all along. To expand the data out to full range we'll use the color correction registers (brightness, contrast, and saturation). On CHV pipe B we were actually doing the right thing already because we progammed the custom CSC matrix to do expect limited range input. Now that well pre-expand the data out with the color correction unit, we need to change the CSC matrix to operate with full range input instead. This should make the sprite output of the other pipes match the sprite output of pipe B reasonably well. Looking at the resulting pipe CRCs, there can be a slight difference in the output, but as I don't know the formula used by the fixed function CSC of the other pipes, I don't think it's worth the effort to try to match the output exactly. It might not even be possible due to difference in internal precision etc. One slight caveat here is that the color correction registers are single bufferred, so we should really be updating them during vblank, but we still don't have a mechanism for that, so just toss in another FIXME. v2: Rebase v3: s/bri/brightness/ s/con/contrast/ (Shashank) v4: Clarify the constants and math (Shashank) Cc: Harry Wentland Cc: Daniel Vetter Cc: Daniel Stone Cc: Russell King - ARM Linux Cc: Ilia Mirkin Cc: Hans Verkuil Cc: Shashank Sharma Cc: Uma Shankar Cc: Jyri Sarha Cc: "Tang, Jun" Reported-by: "Tang, Jun" Cc: stable@vger.kernel.org Fixes: 7f1f3851feb0 ("drm/i915: sprite support for ValleyView v4") Reviewed-by: Shashank Sharma Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20180214192327.3250-5-ville.syrjala@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 4bddb1209a6df483487b241b586f9ef12c88ab8c Author: Daniel Kurtz Date: Fri Apr 6 16:07:59 2018 -0600 mmc: sdhci-pci: Only do AMD tuning for HS200 commit 300ad8992913025b4294d4fc37b6bfff4a8b7ad1 upstream. Commit c31165d7400b ("mmc: sdhci-pci: Add support for HS200 tuning mode on AMD, eMMC-4.5.1") added a HS200 tuning method for use with AMD SDHCI controllers. As described in the commit subject, this tuning is specific for HS200. However, as implemented, this method is used for all host timings, because platform_execute_tuning, if it exists, is called unconditionally by sdhci_execute_tuning(). This breaks tuning when using the AMD controller with, for example, a DDR50 SD card. Instead, we can implement an amd execute_tuning wrapper callback, and then conditionally do the HS200 specific tuning for HS200, and otherwise call back to the standard sdhci_execute_tuning(). Signed-off-by: Daniel Kurtz Acked-by: Shyam Sundar S K Acked-by: Adrian Hunter Fixes: c31165d7400b ("mmc: sdhci-pci: Add support for HS200 tuning mode on AMD, eMMC-4.5.1") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 28f46dee49d3eac6dbeac3e4c9ecf1b861d2594b Author: Amir Goldstein Date: Wed Apr 4 23:42:18 2018 +0300 fanotify: fix logic of events on child commit 54a307ba8d3cd00a3902337ffaae28f436eeb1a4 upstream. When event on child inodes are sent to the parent inode mark and parent inode mark was not marked with FAN_EVENT_ON_CHILD, the event will not be delivered to the listener process. However, if the same process also has a mount mark, the event to the parent inode will be delivered regadless of the mount mark mask. This behavior is incorrect in the case where the mount mark mask does not contain the specific event type. For example, the process adds a mark on a directory with mask FAN_MODIFY (without FAN_EVENT_ON_CHILD) and a mount mark with mask FAN_CLOSE_NOWRITE (without FAN_ONDIR). A modify event on a file inside that directory (and inside that mount) should not create a FAN_MODIFY event, because neither of the marks requested to get that event on the file. Fixes: 1968f5eed54c ("fanotify: use both marks when possible") Cc: stable Signed-off-by: Amir Goldstein Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit a2a9d0190f9964dd734348085c22582e519c145a Author: Jan Kara Date: Thu Apr 12 17:22:23 2018 +0200 udf: Fix leak of UTF-16 surrogates into encoded strings commit 44f06ba8297c7e9dfd0e49b40cbe119113cca094 upstream. OSTA UDF specification does not mention whether the CS0 charset in case of two bytes per character encoding should be treated in UTF-16 or UCS-2. The sample code in the standard does not treat UTF-16 surrogates in any special way but on systems such as Windows which work in UTF-16 internally, filenames would be treated as being in UTF-16 effectively. In Linux it is more difficult to handle characters outside of Base Multilingual plane (beyond 0xffff) as NLS framework works with 2-byte characters only. Just make sure we don't leak UTF-16 surrogates into the resulting string when loading names from the filesystem for now. CC: stable@vger.kernel.org # >= v4.6 Reported-by: Mingye Wang Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit f86815184c471dae61f3c764b7589872a91e930a Author: Michael Ellerman Date: Mon Apr 16 23:25:19 2018 +1000 powerpc/lib: Fix off-by-one in alternate feature patching commit b8858581febb050688e276b956796bc4a78299ed upstream. When we patch an alternate feature section, we have to adjust any relative branches that branch out of the alternate section. But currently we have a bug if we have a branch that points to past the last instruction of the alternate section, eg: FTR_SECTION_ELSE 1: b 2f or 6,6,6 2: ALT_FTR_SECTION_END(...) nop This will result in a relative branch at 1 with a target that equals the end of the alternate section. That branch does not need adjusting when it's moved to the non-else location. Currently we do adjust it, resulting in a branch that goes off into the link-time location of the else section, which is junk. The fix is to not patch branches that have a target == end of the alternate section. Fixes: d20fe50a7b3c ("KVM: PPC: Book3S HV: Branch inside feature section") Fixes: 9b1a735de64c ("powerpc: Add logic to patch alternative feature sections") Cc: stable@vger.kernel.org # v2.6.27+ Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit ce3b0b0589a8b7e0241c0d1b1049a715108b78fb Author: Benjamin Herrenschmidt Date: Wed Apr 11 15:17:59 2018 +1000 powerpc/xive: Fix trying to "push" an already active pool VP commit b32e56e5a87a1f9243db92bc7a5df0ffb4627cfb upstream. When setting up a CPU, we "push" (activate) a pool VP for it. However it's an error to do so if it already has an active pool VP. This happens when doing soft CPU hotplug on powernv since we don't tear down the CPU on unplug. The HW flags the error which gets captured by the diagnostics. Fix this by making sure to "pull" out any already active pool first. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Benjamin Herrenschmidt Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 49a52f7d9274da3f32d7c3b51843fa44fffd2e55 Author: Michael Neuling Date: Wed Apr 11 13:37:58 2018 +1000 powerpc/eeh: Fix enabling bridge MMIO windows commit 13a83eac373c49c0a081cbcd137e79210fe78acd upstream. On boot we save the configuration space of PCIe bridges. We do this so when we get an EEH event and everything gets reset that we can restore them. Unfortunately we save this state before we've enabled the MMIO space on the bridges. Hence if we have to reset the bridge when we come back MMIO is not enabled and we end up taking an PE freeze when the driver starts accessing again. This patch forces the memory/MMIO and bus mastering on when restoring bridges on EEH. Ideally we'd do this correctly by saving the configuration space writes later, but that will have to come later in a larger EEH rewrite. For now we have this simple fix. The original bug can be triggered on a boston machine by doing: echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound On boston, this PHB has a PCIe switch on it. Without this patch, you'll see two EEH events, 1 expected and 1 the failure we are fixing here. The second EEH event causes the anything under the PHB to disappear (i.e. the i40e eth). With this patch, only 1 EEH event occurs and devices properly recover. Fixes: 652defed4875 ("powerpc/eeh: Check PCIe link after reset") Cc: stable@vger.kernel.org # v3.11+ Reported-by: Pridhiviraj Paidipeddi Signed-off-by: Michael Neuling Acked-by: Russell Currey Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit a5f6e787b9b061624f525eff944fc73e08974567 Author: Matt Redfearn Date: Tue Apr 17 16:40:00 2018 +0100 MIPS: memset.S: Fix clobber of v1 in last_fixup commit c96eebf07692e53bf4dd5987510d8b550e793598 upstream. The label .Llast_fixup\@ is jumped to on page fault within the final byte set loop of memset (on < MIPSR6 architectures). For some reason, in this fault handler, the v1 register is randomly set to a2 & STORMASK. This clobbers v1 for the calling function. This can be observed with the following test code: static int __init __attribute__((optimize("O0"))) test_clear_user(void) { register int t asm("v1"); char *test; int j, k; pr_info("\n\n\nTesting clear_user\n"); test = vmalloc(PAGE_SIZE); for (j = 256; j < 512; j++) { t = 0xa5a5a5a5; if ((k = clear_user(test + PAGE_SIZE - 256, j)) != j - 256) { pr_err("clear_user (%px %d) returned %d\n", test + PAGE_SIZE - 256, j, k); } if (t != 0xa5a5a5a5) { pr_err("v1 was clobbered to 0x%x!\n", t); } } return 0; } late_initcall(test_clear_user); Which demonstrates that v1 is indeed clobbered (MIPS64): Testing clear_user v1 was clobbered to 0x1! v1 was clobbered to 0x2! v1 was clobbered to 0x3! v1 was clobbered to 0x4! v1 was clobbered to 0x5! v1 was clobbered to 0x6! v1 was clobbered to 0x7! Since the number of bytes that could not be set is already contained in a2, the andi placing a value in v1 is not necessary and actively harmful in clobbering v1. Reported-by: James Hogan Signed-off-by: Matt Redfearn Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/19109/ Signed-off-by: James Hogan Signed-off-by: Greg Kroah-Hartman commit 6da34ca1ca3e7a42b46e1265d04e827d3a6970d3 Author: Matt Redfearn Date: Tue Apr 17 15:52:21 2018 +0100 MIPS: memset.S: Fix return of __clear_user from Lpartial_fixup commit daf70d89f80c6e1772233da9e020114b1254e7e0 upstream. The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless. The label .Lpartial_fixup\@ is jumped to on page fault. In order to work out how many bytes failed to copy, the exception handler should find how many bytes left in the partial block (andi a2, STORMASK), add that to the partial block end address (a2), and subtract the faulting address to get the remainder. Currently it incorrectly subtracts the partial block start address (t1), which has additionally been clobbered to generate a jump target in memset_partial. Fix this by adding the block end address instead. This issue was found with the following test code: int j, k; for (j = 0; j < 512; j++) { if ((k = clear_user(NULL, j)) != j) { pr_err("clear_user (NULL %d) returned %d\n", j, k); } } Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64). Suggested-by: James Hogan Signed-off-by: Matt Redfearn Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/19108/ Signed-off-by: James Hogan Signed-off-by: Greg Kroah-Hartman commit 7b34760dc682d58f77dfab1810d4af2841828f6b Author: Matt Redfearn Date: Thu Mar 29 10:28:23 2018 +0100 MIPS: memset.S: EVA & fault support for small_memset commit 8a8158c85e1e774a44fbe81106fa41138580dfd1 upstream. The MIPS kernel memset / bzero implementation includes a small_memset branch which is used when the region to be set is smaller than a long (4 bytes on 32bit, 8 bytes on 64bit). The current small_memset implementation uses a simple store byte loop to write the destination. There are 2 issues with this implementation: 1. When EVA mode is active, user and kernel address spaces may overlap. Currently the use of the sb instruction means kernel mode addressing is always used and an intended write to userspace may actually overwrite some critical kernel data. 2. If the write triggers a page fault, for example by calling __clear_user(NULL, 2), instead of gracefully handling the fault, an OOPS is triggered. Fix these issues by replacing the sb instruction with the EX() macro, which will emit EVA compatible instuctions as required. Additionally implement a fault fixup for small_memset which sets a2 to the number of bytes that could not be cleared (as defined by __clear_user). Reported-by: Chuanhua Lei Signed-off-by: Matt Redfearn Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/18975/ Signed-off-by: James Hogan Signed-off-by: Greg Kroah-Hartman commit 23f5562852b9f999d6c36606e40bcd81c64f4765 Author: Matt Redfearn Date: Tue Apr 17 16:40:01 2018 +0100 MIPS: uaccess: Add micromips clobbers to bzero invocation commit b3d7e55c3f886493235bfee08e1e5a4a27cbcce8 upstream. The micromips implementation of bzero additionally clobbers registers t7 & t8. Specify this in the clobbers list when invoking bzero. Fixes: 26c5e07d1478 ("MIPS: microMIPS: Optimise 'memset' core library function.") Reported-by: James Hogan Signed-off-by: Matt Redfearn Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: # 3.10+ Patchwork: https://patchwork.linux-mips.org/patch/19110/ Signed-off-by: James Hogan Signed-off-by: Greg Kroah-Hartman commit 1da964d421dae559422561ef407c2557d8d0348a Author: Aaron Armstrong Skomra Date: Wed Apr 4 14:24:11 2018 -0700 HID: wacom: bluetooth: send exit report for recent Bluetooth devices commit 619d3a2922ce623ca2eca443cc936810d328317c upstream. The code path for recent Bluetooth devices omits an exit report which resets all the values of the device. Fixes: 4922cd26f0 ("HID: wacom: Support 2nd-gen Intuos Pro's Bluetooth classic interface") Cc: # 4.11 Signed-off-by: Aaron Armstrong Skomra Reviewed-by: Ping Cheng Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 0e159a9e1823469c46bf3355d87a79a2cc7f8890 Author: Rodrigo Rivas Costa Date: Fri Apr 6 01:09:36 2018 +0200 HID: hidraw: Fix crash on HIDIOCGFEATURE with a destroyed device commit a955358d54695e4ad9f7d6489a7ac4d69a8fc711 upstream. Doing `ioctl(HIDIOCGFEATURE)` in a tight loop on a hidraw device and then disconnecting the device, or unloading the driver, can cause a NULL pointer dereference. When a hidraw device is destroyed it sets 0 to `dev->exist`. Most functions check 'dev->exist' before doing its work, but `hidraw_get_report()` was missing that check. Cc: stable@vger.kernel.org Signed-off-by: Rodrigo Rivas Costa Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 27840bc4ebb2b4bf106c3cf7cb931c277f997a24 Author: Dmitry Torokhov Date: Tue Apr 3 10:52:20 2018 -0700 HID: input: fix battery level reporting on BT mice commit 2e210bbb7429cdcf1a1a3ad00c1bf98bd9bf2452 upstream. The commit 581c4484769e ("HID: input: map digitizer battery usage") assumed that devices having input (qas opposed to feature) report for battery strength would report the data on their own, without the need to be polled by the kernel; unfortunately it is not so. Many wireless mice do not send unsolicited reports with battery strength data and have to be polled explicitly. As a complication, stylus devices on digitizers are not normally connected to the base and thus can not be polled - the base can only determine battery strength in the stylus when it is in proximity. To solve this issue, we add a special flag that tells the kernel to avoid polling the device (and expect unsolicited reports) and set it when report field with physical usage of digitizer stylus (HID_DG_STYLUS). Unless this flag is set, and we have not seen the unsolicited reports, the kernel will attempt to poll the device when userspace attempts to read "capacity" and "state" attributes of power_supply object corresponding to the devices battery. Fixes: 581c4484769e ("HID: input: map digitizer battery usage") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198095 Cc: stable@vger.kernel.org Reported-and-tested-by: Martin van Es Signed-off-by: Dmitry Torokhov Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 90936d903c2f34663cffe68d9845debdeb85174c Author: Theodore Ts'o Date: Wed Apr 11 16:32:17 2018 -0400 random: add new ioctl RNDRESEEDCRNG commit d848e5f8e1ebdb227d045db55fe4f825e82965fa upstream. Add a new ioctl which forces the the crng to be reseeded. Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit d152fcc173149a99d6f707a5b8a80d83d906755b Author: Theodore Ts'o Date: Thu Apr 12 00:50:45 2018 -0400 random: crng_reseed() should lock the crng instance that it is modifying commit 0bb29a849a6433b72e249eea7695477b02056e94 upstream. Reported-by: Jann Horn Fixes: 1e7f583af67b ("random: make /dev/urandom scalable for silly...") Cc: stable@kernel.org # 4.8+ Signed-off-by: Theodore Ts'o Reviewed-by: Jann Horn Signed-off-by: Greg Kroah-Hartman commit 7b6b1f3a192372937164d1293b432c640ffc7c8f Author: Theodore Ts'o Date: Wed Apr 11 14:58:27 2018 -0400 random: use a different mixing algorithm for add_device_randomness() commit dc12baacb95f205948f64dc936a47d89ee110117 upstream. add_device_randomness() use of crng_fast_load() was highly problematic. Some callers of add_device_randomness() can pass in a large amount of static information. This would immediately promote the crng_init state from 0 to 1, without really doing much to initialize the primary_crng's internal state with something even vaguely unpredictable. Since we don't have the speed constraints of add_interrupt_randomness(), we can do a better job mixing in the what unpredictability a device driver or architecture maintainer might see fit to give us, and do it in a way which does not bump the crng_init_cnt variable. Also, since add_device_randomness() doesn't bump any entropy accounting in crng_init state 0, mix the device randomness into the input_pool entropy pool as well. This is related to CVE-2018-1108. Reported-by: Jann Horn Fixes: ee7998c50c26 ("random: do not ignore early device randomness") Cc: stable@kernel.org # 4.13+ Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 6e513bc20ca63f594632eca4e1968791240b8f18 Author: Theodore Ts'o Date: Wed Apr 11 13:27:52 2018 -0400 random: fix crng_ready() test commit 43838a23a05fbd13e47d750d3dfd77001536dd33 upstream. The crng_init variable has three states: 0: The CRNG is not initialized at all 1: The CRNG has a small amount of entropy, hopefully good enough for early-boot, non-cryptographical use cases 2: The CRNG is fully initialized and we are sure it is safe for cryptographic use cases. The crng_ready() function should only return true once we are in the last state. This addresses CVE-2018-1108. Reported-by: Jann Horn Fixes: e192be9d9a30 ("random: replace non-blocking pool...") Cc: stable@kernel.org # 4.8+ Signed-off-by: Theodore Ts'o Reviewed-by: Jann Horn Signed-off-by: Greg Kroah-Hartman commit 8036cdaa1b13144428bbf95bc9679ea7a592d7dd Author: Hui Wang Date: Thu Apr 19 13:29:05 2018 +0800 ALSA: hda/realtek - adjust the location of one mic commit a3dafb2200bf3c13905a088e82ae11f1eb275a83 upstream. There are two front mics on this machine, if we don't adjust the location for one of them, they will have the same mixer name, pulseaudio can't handle this situation. After applying this FIXUP, they will have different mixer name, then pulseaudio can handle them correctly. Cc: Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit adc02ac60919116f61ebcfbb4e5cfd0b6c3cb2f3 Author: Hui Wang Date: Thu Apr 19 13:29:04 2018 +0800 ALSA: hda/realtek - set PINCFG_HEADSET_MIC to parse_flags commit 3ce0d5aa265bcc0a4b281cb0cabf92491276101b upstream. Otherwise, the pin will be regarded as microphone, and the jack name is "Mic Phantom", it is always on in the pulseaudio even nothing is plugged into the jack. So the UI is confusing to users since the microphone always shows up in the UI even there is no microphone plugged. After adding this flag, the jack name is "Headset Mic Phantom", then the pulseaudio can handle its detection correctly. Fixes: f0ba9d699e5c ("ALSA: hda/realtek - Fix Dell headset Mic can't record") Cc: Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 11e9bed2b97199ee940226be54aed5d19c50d0e4 Author: David Wang Date: Mon Apr 16 17:48:09 2018 +0800 ALSA: hda - New VIA controller suppor no-snoop path commit af52f9982e410edac21ca4b49563053ffc9da1eb upstream. This patch is used to tell kernel that new VIA HDAC controller also support no-snoop path. [ minor coding style fix by tiwai ] Signed-off-by: David Wang Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit fcf38cf55e286f7da4a1210d8496696b855934be Author: Takashi Iwai Date: Thu Apr 19 18:16:15 2018 +0200 ALSA: rawmidi: Fix missing input substream checks in compat ioctls commit 8a56ef4f3ffba9ebf4967b61ef600b0a7ba10f11 upstream. Some rawmidi compat ioctls lack of the input substream checks (although they do check only for rfile->output). This many eventually lead to an Oops as NULL substream is passed to the rawmidi core functions. Fix it by adding the proper checks before each function call. The bug was spotted by syzkaller. Reported-by: syzbot+f7a0348affc3b67bc617@syzkaller.appspotmail.com Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 43b3e7915edd6c20be27444bd70d967fcb07394d Author: Fabián Inostroza Date: Thu Apr 12 00:37:35 2018 -0300 ALSA: line6: Use correct endpoint type for midi output commit 7ecb46e9ee9af18e304eb9e7d6804c59a408e846 upstream. Sending MIDI messages to a PODxt through the USB connection shows "usb_submit_urb failed" in dmesg and the message is not received by the POD. The error is caused because in the funcion send_midi_async() in midi.c there is a call to usb_sndbulkpipe() for endpoint 3 OUT, but the PODxt USB descriptor shows that this endpoint it's an interrupt endpoint. Patch tested with PODxt only. [ The bug has been present from the very beginning in the staging driver time, but Fixes below points to the commit moving to sound/ directory so that the fix can be cleanly applied -- tiwai ] Fixes: 61864d844c29 ("ALSA: move line6 usb driver into sound/usb") Signed-off-by: Fabián Inostroza Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit bdc6f4c3db08c7b947b5c4200e600136993fe01b Author: Paul Parsons Date: Sat Apr 2 12:32:30 2016 +0100 drm/radeon: Fix PCIe lane width calculation commit 85e290d92b4b794d0c758c53007eb4248d385386 upstream. Two years ago I tried an AMD Radeon E8860 embedded GPU with the drm driver. The dmesg output included driver warnings about an invalid PCIe lane width. Tracking the problem back led to si_set_pcie_lane_width_in_smc(). The calculation of the lane widths via ATOM_PPLIB_PCIE_LINK_WIDTH_MASK and ATOM_PPLIB_PCIE_LINK_WIDTH_SHIFT macros did not increment the resulting value, per the comment in pptable.h ("lanes - 1"), and per usage elsewhere. Applying the increment silenced the warnings. The code has not changed since, so either my analysis was incorrect or the bug has gone unnoticed. Hence submitting this as an RFC. Acked-by: Christian König Acked-by: Chunming Zhou Signed-off-by: Paul Parsons Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 110b72d9351ff9662fd27d1225301d10293cf5a8 Author: Nico Sneck Date: Sat Apr 7 15:13:04 2018 +0000 drm/radeon: add PX quirk for Asus K73TK commit b1550359d1eb392ee54f7cf47cffcfe0a602f6a7 upstream. With this the dGPU turns on correctly. Signed-off-by: Nico Sneck Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 791469d6b882c7366a7710539dc7267c5b57fed6 Author: Marc Zyngier Date: Tue Feb 20 13:01:18 2018 +0000 drm/rockchip: Clear all interrupts before requesting the IRQ commit 5f9e93fed4d45e9a8f84728aff1a8f2ab8922902 upstream. Calling request_irq() followed by disable_irq() is usually a bad idea, specially if the interrupt can be pending, and you're not yet in a position to handle it. This is exactly what happens on my kevin system when rebooting in a second kernel using kexec: Some interrupt is left pending from the previous kernel, and we take it too early, before disable_irq() could do anything. Let's clear the pending interrupts as we initialize the HW, and move the interrupt request after that point. This ensures that we're in a sane state when the interrupt is requested. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier [adapted to recent rockchip-drm changes] Signed-off-by: Heiko Stuebner Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-2-marc.zyngier@arm.com Signed-off-by: Greg Kroah-Hartman commit f188464e3d54c2d651fa710f74c8aed68aa87813 Author: Alex Deucher Date: Tue Apr 3 12:54:33 2018 -0500 drm/amdgpu/si: implement get/set pcie_lanes asic callback commit 20ca25e86c56f5490bdc80318f4fc06466e4c21b upstream. Required for dpm setup on some asics. Fixes a NULL dereference on asics that require it. Acked-by: Christian König Bug: https://bugs.freedesktop.org/show_bug.cgi?id=102553 Tested-by: Abel Garcia Dorta Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit f056e3339741cf34908d93d428c2fbe5fd6a3a3e Author: Alex Deucher Date: Mon Apr 2 12:29:26 2018 -0500 drm/amdgpu: Fix PCIe lane width calculation commit 41212e2fe72b26ded7ed78224d9eab720c2891e2 upstream. The calculation of the lane widths via ATOM_PPLIB_PCIE_LINK_WIDTH_MASK and ATOM_PPLIB_PCIE_LINK_WIDTH_SHIFT macros did not increment the resulting value, per the comment in pptable.h ("lanes - 1"), and per usage elsewhere. Port of the radeon fix to amdgpu. Acked-by: Christian König Acked-by: Chunming Zhou Bug: https://bugs.freedesktop.org/show_bug.cgi?id=102553 Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 57e56826611a158234ce75a1684d0de92af73c8e Author: Alex Deucher Date: Tue Mar 27 15:53:52 2018 -0500 drm/amdgpu/sdma: fix mask in emit_pipeline_sync commit 4a8e06f7aad797e92413a3042d09d3b385fa1fda upstream. Needs to be a 32 bit mask. Acked-by: Huang Rui Reviewed-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit c73d9e350085a479d787d6d6dff3599fa6952491 Author: Bas Nieuwenhuizen Date: Wed Jan 31 13:58:55 2018 +0100 drm/amdgpu: Fix always_valid bos multiple LRU insertions. commit a20ee0b1f8b42e2568f3a4408003d22b2dfcc706 upstream. If these bos are evicted and are in the validated list things blow up, so do not put them in there. Notably, that tries to add the bo to the LRU twice, which results in a BUG_ON in ttm_bo.c. While for the bo_list an alternative would be to not allow always valid bos in there, that does not work for the user fence. v2: Fixed whitespace issue pointed out by checkpatch.pl Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 674b6963cec22d45d0ff9565554bf5ee2628630e Author: Alex Deucher Date: Wed Mar 21 21:05:46 2018 -0500 drm/amdgpu: Add an ATPX quirk for hybrid laptop commit 13b40935cf64f59b93cf1c716a2033488e5a228c upstream. _PR3 doesn't seem to work properly, use ATPX instead. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104064 Reviewed-by: Huang Rui Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 312d02879f9f0ae9c6b011d0ead6a143115e9f93 Author: Takashi Iwai Date: Sat Apr 7 11:48:58 2018 +0200 ALSA: pcm: Fix endless loop for XRUN recovery in OSS emulation commit e15dc99dbb9cf99f6432e8e3c0b3a8f7a3403a86 upstream. The commit 02a5d6925cd3 ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write") split the PCM preparation code to a locked version, and it added a sanity check of runtime->oss.prepare flag along with the change. This leaded to an endless loop when the stream gets XRUN: namely, snd_pcm_oss_write3() and co call snd_pcm_oss_prepare() without setting runtime->oss.prepare flag and the loop continues until the PCM state reaches to another one. As the function is supposed to execute the preparation unconditionally, drop the invalid state check there. The bug was triggered by syzkaller. Fixes: 02a5d6925cd3 ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write") Reported-by: syzbot+150189c103427d31a053@syzkaller.appspotmail.com Reported-by: syzbot+7e3f31a52646f939c052@syzkaller.appspotmail.com Reported-by: syzbot+4f2016cf5185da7759dc@syzkaller.appspotmail.com Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 157113cb7c89829b1cd2e2792a1f38932f31c965 Author: Takashi Iwai Date: Tue Mar 27 14:32:23 2018 +0200 ALSA: pcm: Fix mutex unbalance in OSS emulation ioctls commit f6d297df4dd47ef949540e4a201230d0c5308325 upstream. The previous fix 40cab6e88cb0 ("ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams") introduced some mutex unbalance; the check of runtime->oss.rw_ref was inserted in a wrong place after the mutex lock. This patch fixes the inconsistency by rewriting with the helper functions to lock/unlock parameters with the stream check. Fixes: 40cab6e88cb0 ("ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams") Reported-by: Dan Carpenter Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 048747b04842cbe81386386ecd15b74db93029cf Author: Takashi Iwai Date: Fri Mar 23 08:03:26 2018 +0100 ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams commit 40cab6e88cb0b6c56d3f30b7491a20e803f948f6 upstream. OSS PCM stream management isn't modal but it allows ioctls issued at any time for changing the parameters. In the previous hardening patch ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write"), we covered these races and prevent the corruption by protecting the concurrent accesses via params_lock mutex. However, this means that some ioctls that try to change the stream parameter (e.g. channels or format) would be blocked until the read/write finishes, and it may take really long. Basically changing the parameter while reading/writing is an invalid operation, hence it's even more user-friendly from the API POV if it returns -EBUSY in such a situation. This patch adds such checks in the relevant ioctls with the addition of read/write access refcount. Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit bd889a82fb01365ca0c364166d4defad3253248c Author: Takashi Iwai Date: Thu Mar 22 18:10:14 2018 +0100 ALSA: pcm: Avoid potential races between OSS ioctls and read/write commit 02a5d6925cd34c3b774bdb8eefb057c40a30e870 upstream. Although we apply the params_lock mutex to the whole read and write operations as well as snd_pcm_oss_change_params(), we may still face some races. First off, the params_lock is taken inside the read and write loop. This is intentional for avoiding the too long locking, but it allows the in-between parameter change, which might lead to invalid pointers. We check the readiness of the stream and set up via snd_pcm_oss_make_ready() at the beginning of read and write, but it's called only once, by assuming that it remains ready in the rest. Second, many ioctls that may change the actual parameters (i.e. setting runtime->oss.params=1) aren't protected, hence they can be processed in a half-baked state. This patch is an attempt to plug these holes. The stream readiness check is moved inside the read/write inner loop, so that the stream is always set up in a proper state before further processing. Also, each ioctl that may change the parameter is wrapped with the params_lock for avoiding the races. The issues were triggered by syzkaller in a few different scenarios, particularly the one below appearing as GPF in loopback_pos_update. Reported-by: syzbot+c4227aec125487ec3efa@syzkaller.appspotmail.com Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 4d2ea307ffa16644127ba09de30cfb31f3f40c4a Author: Takashi Iwai Date: Tue Jan 9 08:51:02 2018 +0100 ALSA: pcm: Use ERESTARTSYS instead of EINTR in OSS emulation commit c64ed5dd9feba193c76eb460b451225ac2a0d87b upstream. Fix the last standing EINTR in the whole subsystem. Use more correct ERESTARTSYS for pending signals. Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 2ccdea040e81553aaaa9f380f0ff745f71d27eff Author: Alex Williamson Date: Mon Oct 2 12:39:10 2017 -0600 vfio/pci: Virtualize Maximum Read Request Size commit cf0d53ba4947aad6e471491d5b20a567cbe92e56 upstream. MRRS defines the maximum read request size a device is allowed to make. Drivers will often increase this to allow more data transfer with a single request. Completions to this request are bound by the MPS setting for the bus. Aside from device quirks (none known), it doesn't seem to make sense to set an MRRS value less than MPS, yet this is a likely scenario given that user drivers do not have a system-wide view of the PCI topology. Virtualize MRRS such that the user can set MRRS >= MPS, but use MPS as the floor value that we'll write to hardware. Signed-off-by: Alex Williamson Signed-off-by: Greg Kroah-Hartman commit 23a63d96e015d56fc69f527a012625c5b8f346e5 Author: Igor Pylypiv Date: Tue Mar 6 23:47:25 2018 -0800 watchdog: f71808e_wdt: Fix WD_EN register read commit 977f6f68331f94bb72ad84ee96b7b87ce737d89d upstream. F71808FG_FLAG_WD_EN defines bit position, not a bitmask Signed-off-by: Igor Pylypiv Reviewed-by: Guenter Roeck Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck Cc: stable Signed-off-by: Greg Kroah-Hartman commit 28fe0fba29f20d1301ae2183d424bcf6a46c0ca8 Author: Sean Wang Date: Thu Mar 1 11:27:50 2018 +0800 dt-bindings: clock: mediatek: add binding for fixed-factor clock axisel_d4 commit 55a5fcafe3a94e8a0777bb993d09107d362258d2 upstream. Just add binding for a fixed-factor clock axisel_d4, which would be referenced by PWM devices on MT7623 or MT2701 SoC. Cc: stable@vger.kernel.org Fixes: 1de9b21633d6 ("clk: mediatek: Add dt-bindings for MT2701 clocks") Signed-off-by: Sean Wang Reviewed-by: Rob Herring Cc: Mark Rutland Cc: devicetree@vger.kernel.org Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit ecb67e92d42e1fbbb710c9c86259ba4ef0839304 Author: Mikhail Lappo Date: Fri Feb 2 16:17:46 2018 -0200 thermal: imx: Fix race condition in imx_thermal_probe() commit cf1ba1d73a33944d8c1a75370a35434bf146b8a7 upstream. When device boots with T > T_trip_1 and requests interrupt, the race condition takes place. The interrupt comes before THERMAL_DEVICE_ENABLED is set. This leads to an attempt to reading sensor value from irq and disabling the sensor, based on the data->mode field, which expected to be THERMAL_DEVICE_ENABLED, but still stays as THERMAL_DEVICE_DISABLED. Afher this issue sensor is never re-enabled, as the driver state is wrong. Fix this problem by setting the 'data' members prior to requesting the interrupts. Fixes: 37713a1e8e4c ("thermal: imx: implement thermal alarm interrupt handling") Cc: Signed-off-by: Mikhail Lappo Signed-off-by: Fabio Estevam Reviewed-by: Philipp Zabel Acked-by: Dong Aisheng Signed-off-by: Zhang Rui Signed-off-by: Greg Kroah-Hartman commit c9b200ce2be5da7613e73daf356bd0ae63c0a48d Author: Ryo Kodama Date: Fri Mar 9 20:24:21 2018 +0900 pwm: rcar: Fix a condition to prevent mismatch value setting to duty commit 6225f9c64b40bc8a22503e9cda70f55d7a9dd3c6 upstream. This patch fixes an issue that is possible to set mismatch value to duty for R-Car PWM if we input the following commands: # cd /sys/class/pwm// # echo 0 > export # cd pwm0 # echo 30 > period # echo 30 > duty_cycle # echo 0 > duty_cycle # cat duty_cycle 0 # echo 1 > enable --> Then, the actual duty_cycle is 30, not 0. So, this patch adds a condition into rcar_pwm_config() to fix this issue. Signed-off-by: Ryo Kodama [shimoda: revise the commit log and add Fixes and Cc tags] Fixes: ed6c1476bf7f ("pwm: Add support for R-Car PWM Timer") Cc: Cc: # v4.4+ Signed-off-by: Yoshihiro Shimoda Signed-off-by: Thierry Reding Signed-off-by: Greg Kroah-Hartman commit ff18ffb1f81db091e916bb3afc4a264219f840ea Author: Boris Brezillon Date: Thu Mar 22 10:11:30 2018 +0100 clk: bcm2835: De-assert/assert PLL reset signal when appropriate commit 753872373b599384ac7df809aa61ea12d1c4d5d1 upstream. In order to enable a PLL, not only the PLL has to be powered up and locked, but you also have to de-assert the reset signal. The last part was missing. Add it so PLLs that were not enabled by the FW/bootloader can be enabled from Linux. Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the audio domain clocks") Cc: Signed-off-by: Boris Brezillon Reviewed-by: Eric Anholt Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit dc7a428ae26f7711be24c484b9b793853734b44a Author: Sean Wang Date: Thu Mar 1 11:27:51 2018 +0800 clk: mediatek: fix PWM clock source by adding a fixed-factor clock commit 89cd7aec21af26fd0c117bfc4bfc781724f201de upstream. The clock for which all PWM devices on MT7623 or MT2701 actually depending on has to be divided by four from its parent clock axi_sel in the clock path prior to PWM devices. Consequently, adding a fixed-factor clock axisel_d4 as one-fourth of clock axi_sel allows that PWM devices can have the correct resolution calculation. Cc: stable@vger.kernel.org Fixes: e9862118272a ("clk: mediatek: Add MT2701 clock support") Signed-off-by: Sean Wang Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit d8b6fdbe513d2f68d93dd07af4df1b65254408e1 Author: Arnd Bergmann Date: Fri Feb 16 16:27:47 2018 +0100 clk: fix false-positive Wmaybe-uninitialized warning commit ce33f284935e08229046b30635e6aadcbab02b53 upstream. When we build this driver with on x86-32, gcc produces a false-positive warning: drivers/clk/renesas/clk-sh73a0.c: In function 'sh73a0_cpg_clocks_init': drivers/clk/renesas/clk-sh73a0.c:155:10: error: 'parent_name' may be used uninitialized in this function [-Werror=maybe-uninitialized] return clk_register_fixed_factor(NULL, name, parent_name, 0, We can work around that warning by adding a fake initialization, I tried and failed to come up with any better workaround. This is currently one of few remaining warnings for a 4.14.y randconfig build, so it would be good to also have it backported at least to that version. Older versions have more randconfig warnings, so we might not care. I had not noticed this earlier, because one patch in my randconfig test tree removes the '-ffreestanding' option on x86-32, and that avoids the warning. The -ffreestanding flag was originally global but moved into arch/i386 by Andi Kleen in commit 6edfba1b33c7 ("[PATCH] x86_64: Don't define string functions to builtin") as a 'temporary workaround'. Like many temporary hacks, this turned out to be rather long-lived, from all I can tell we still need a simple fix to asm/string_32.h before it can be removed, but I'm not sure about how to best do that. Cc: stable@vger.kernel.org Cc: Andi Kleen Signed-off-by: Arnd Bergmann Acked-by: Geert Uytterhoeven Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 37d8947c0b073a27198ae477d8a8a8b6cda61abf Author: Richard Genoud Date: Tue Mar 13 16:27:02 2018 +0100 clk: mvebu: armada-38x: add support for missing clocks commit 6a4a4595804548e173f0763a0e7274a3521c59a9 upstream. Clearfog boards can come with a CPU clocked at 1600MHz (commercial) or 1333MHz (industrial). They have also some dip-switches to select a different clock (666, 800, 1066, 1200). The funny thing is that the recovery button is on the MPP34 fq selector. So, when booting an industrial board with this button down, the frequency 666MHz is selected (and the kernel didn't boot). This patch add all the missing clocks. The only mode I didn't test is 2GHz (uboot found 4294MHz instead :/ ). Fixes: 0e85aeced4d6 ("clk: mvebu: add clock support for Armada 380/385") Cc: # 3.16.x: 9593f4f56cf5: clk: mvebu: armada-38x: add support for 1866MHz variants Cc: # 3.16.x Signed-off-by: Richard Genoud Acked-by: Gregory CLEMENT Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit f13b4a61989fb29e16adba2ed2e073ffed301f40 Author: Sinan Kaya Date: Tue Apr 10 14:44:21 2018 -0500 PCI: Mark Broadcom HT1100 and HT2000 Root Port Extended Tags as broken commit 1b30dfd376e28e7f37eda5e2033f6823cdda222b upstream. Per PCIe r3.1, sec 2.2.6.2 and 7.8.4, a Requester may not use 8-bit Tags unless its Extended Tag Field Enable is set, but all Receivers/Completers must handle 8-bit Tags correctly regardless of their Extended Tag Field Enable. Some devices do not handle 8-bit Tags as Completers, so add a quirk for them. If we find such a device, we disable Extended Tags for the entire hierarchy to make peer-to-peer DMA possible. The Broadcom HT1100/HT2000/HT2100 seems to have issues with handling 8-bit tags. Mark it as broken. This fixes Xorg hangs and unresponsive keyboards with errors like this: radeon 0000:06:00.0: GPU lockup (current fence id 0x000000000000000e last fence id 0x0000000000000 [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD) [drm:r600_resume [radeon]] *ERROR* r600 startup failed on resume Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported") Link: https://bugzilla.kernel.org/show_bug.cgi?id=196197 Signed-off-by: Sinan Kaya Signed-off-by: Bjorn Helgaas CC: stable@vger.kernel.org # v4.11: 62ce94a7a5a5 PCI: Mark Broadcom HT2100 Root Port Extended Tags as broken CC: stable@vger.kernel.org # v4.11 Signed-off-by: Greg Kroah-Hartman commit 4b684fbbc58ed048a2da5e21e94eec8c98c3c914 Author: Masaharu Hayakawa Date: Tue Apr 3 23:57:03 2018 +0200 mmc: tmio: Fix error handling when issuing CMD23 commit fc167daff581c01ebce8695e9618231cae3561a1 upstream. If an error was detected when CMD23 was issued, command sequence should be terminated with errors and CMD23 should be issued after retuning. Fixes: 8b22c3c18be5 ("mmc: tmio: add CMD23 support") Signed-off-by: Masaharu Hayakawa Signed-off-by: Wolfram Sang Cc: # 4.13+ Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit e5e2841e20ff7d6bc5c886a0aa46c8623a0afe63 Author: Alex Smith Date: Wed Mar 28 18:00:43 2018 -0300 mmc: jz4740: Fix race condition in IRQ mask update commit a04f0017c22453613d5f423326b190c61e3b4f98 upstream. A spinlock is held while updating the internal copy of the IRQ mask, but not while writing it to the actual IMASK register. After the lock is released, an IRQ can occur before the IMASK register is written. If handling this IRQ causes the mask to be changed, when the handler returns back to the middle of the first mask update, a stale value will be written to the mask register. If this causes an IRQ to become unmasked that cannot have its status cleared by writing a 1 to it in the IREG register, e.g. the SDIO IRQ, then we can end up stuck with the same IRQ repeatedly being fired but not handled. Normally the MMC IRQ handler attempts to clear any unexpected IRQs by writing IREG, but for those that cannot be cleared in this way then the IRQ will just repeatedly fire. This was resulting in lockups after a while of using Wi-Fi on the CI20 (GitHub issue #19). Resolve by holding the spinlock until after the IMASK register has been updated. Cc: stable@vger.kernel.org Link: https://github.com/MIPS/CI20_linux/issues/19 Fixes: 61bfbdb85687 ("MMC: Add support for the controller on JZ4740 SoCs.") Tested-by: Mathieu Malaterre Signed-off-by: Alex Smith Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 180d28f824ce68022a4600300e3b5bc98c3c0661 Author: Lu Baolu Date: Sat Feb 24 13:42:27 2018 +0800 iommu/vt-d: Fix a potential memory leak commit bbe4b3af9d9e3172fb9aa1f8dcdfaedcb381fc64 upstream. A memory block was allocated in intel_svm_bind_mm() but never freed in a failure path. This patch fixes this by free it to avoid memory leakage. Cc: Ashok Raj Cc: Jacob Pan Cc: # v4.4+ Signed-off-by: Lu Baolu Fixes: 2f26e0a9c9860 ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 5a999c2bef6880909901b5c56c81b920f89ed72f Author: Krzysztof Mazur Date: Wed Nov 15 11:12:39 2017 +0100 um: Use POSIX ucontext_t instead of struct ucontext commit 4d1a535b8ec5e74b42dfd9dc809142653b2597f6 upstream. glibc 2.26 removed the 'struct ucontext' to "improve" POSIX compliance and break programs, including User Mode Linux. Fix User Mode Linux by using POSIX ucontext_t. This fixes: arch/um/os-Linux/signal.c: In function 'hard_handler': arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type 'struct ucontext' mcontext_t *mc = &uc->uc_mcontext; arch/x86/um/stub_segv.c: In function 'stub_segv_handler': arch/x86/um/stub_segv.c:16:13: error: dereferencing pointer to incomplete type 'struct ucontext' &uc->uc_mcontext); Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Mazur Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit f57f3f346d0516fcfcb9bee1df2a2a9f361c4396 Author: Jason A. Donenfeld Date: Thu Dec 14 03:23:37 2017 +0100 um: Compile with modern headers commit 530ba6c7cb3c22435a4d26de47037bb6f86a5329 upstream. Recent libcs have gotten a bit more strict, so we actually need to include the right headers and use the right types. This enables UML to compile again. Signed-off-by: Jason A. Donenfeld Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit dc0f003274525862bd671bf438398bf46fc62b0f Author: Steven Rostedt (VMware) Date: Mon Apr 2 10:33:56 2018 -0400 ring-buffer: Check if memory is available before allocation commit 2a872fa4e9c8adc79c830e4009e1cc0c013a9d8a upstream. The ring buffer is made up of a link list of pages. When making the ring buffer bigger, it will allocate all the pages it needs before adding to the ring buffer, and if it fails, it frees them and returns an error. This makes increasing the ring buffer size an all or nothing action. When this was first created, the pages were allocated with "NORETRY". This was to not cause any Out-Of-Memory (OOM) actions from allocating the ring buffer. But NORETRY was too strict, as the ring buffer would fail to expand even when there's memory available, but was taken up in the page cache. Commit 848618857d253 ("tracing/ring_buffer: Try harder to allocate") changed the allocating from NORETRY to RETRY_MAYFAIL. The RETRY_MAYFAIL would allocate from the page cache, but if there was no memory available, it would simple fail the allocation and not trigger an OOM. This worked fine, but had one problem. As the ring buffer would allocate one page at a time, it could take up all memory in the system before it failed to allocate and free that memory. If the allocation is happening and the ring buffer allocates all memory and then tries to take more than available, its allocation will not trigger an OOM, but if there's any allocation that happens someplace else, that could trigger an OOM, even though once the ring buffer's allocation fails, it would free up all the previous memory it tried to allocate, and allow other memory allocations to succeed. Commit d02bd27bd33dd ("mm/page_alloc.c: calculate 'available' memory in a separate function") separated out si_mem_availble() as a separate function that could be used to see how much memory is available in the system. Using this function to make sure that the ring buffer could be allocated before it tries to allocate pages we can avoid allocating all memory in the system and making it vulnerable to OOMs if other allocations are taking place. Link: http://lkml.kernel.org/r/1522320104-6573-1-git-send-email-zhaoyang.huang@spreadtrum.com CC: stable@vger.kernel.org Cc: linux-mm@kvack.org Fixes: 848618857d253 ("tracing/ring_buffer: Try harder to allocate") Requires: d02bd27bd33dd ("mm/page_alloc.c: calculate 'available' memory in a separate function") Reported-by: Zhaoyang Huang Tested-by: Joel Fernandes Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 4171ea2471a1c383e5df0a4cb6a193fc13436b9b Author: Dan Williams Date: Wed Mar 21 21:22:34 2018 -0700 nfit: skip region registration for incomplete control regions commit 0731de476a37c33485af82d64041c9d193208df8 upstream. Per the ACPI specification the only functional purpose for a DIMM Control Region to be mapped into the system physical address space, from an OSPM perspective, is to support block-apertures. However, there are some BIOSen that publish DIMM Control Region SPA entries for pre-boot environment consumption. Undo the kernel policy of generating disabled 'ndblk' regions when this configuration is detected. Cc: Fixes: 1f7df6f88b92 ("libnvdimm, nfit: regions (block-data-window...)") Reviewed-by: Toshi Kani Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 5520091356b09abbaa2205babc5312a78d741a75 Author: Dan Williams Date: Mon Apr 2 16:40:04 2018 -0700 nfit, address-range-scrub: fix scrub in-progress reporting commit 78727137fdf49edf9f731bde79d7189067b4047a upstream. There is a small window whereby ARS scan requests can schedule work that userspace will miss when polling scrub_show. Hold the init_mutex lock over calls to report the status to close this potential escape. Also, make sure that requests to cancel the ARS workqueue are treated as an idle event. Cc: Cc: Vishal Verma Fixes: 37b137ff8c83 ("nfit, libnvdimm: allow an ARS scrub...") Reviewed-by: Dave Jiang Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit b68b77c935dd1574873aac827c17fc76bb036a0f Author: Dan Williams Date: Fri Apr 6 16:37:21 2018 -0700 libnvdimm, namespace: use a safe lookup for dimm device name commit 4f8672201b7e7ed4f5f6c3cf6dcd080648580582 upstream. The following NULL dereference results from incorrectly assuming that ndd is valid in this print: struct nvdimm_drvdata *ndd = to_ndd(&nd_region->mapping[i]); /* * Give up if we don't find an instance of a uuid at each * position (from 0 to nd_region->ndr_mappings - 1), or if we * find a dimm with two instances of the same uuid. */ dev_err(&nd_region->dev, "%s missing label for %pUb\n", dev_name(ndd->dev), nd_label->uuid); BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 IP: nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 43 PID: 673 Comm: kworker/u609:10 Not tainted 4.16.0-rc4+ #1 [..] RIP: 0010:nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm] [..] Call Trace: ? devres_add+0x2f/0x40 ? devm_kmalloc+0x52/0x60 ? nd_region_activate+0x9c/0x320 [libnvdimm] nd_region_probe+0x94/0x260 [libnvdimm] ? kernfs_add_one+0xe4/0x130 nvdimm_bus_probe+0x63/0x100 [libnvdimm] Switch to using the nvdimm device directly. Fixes: 0e3b0d123c8f ("libnvdimm, namespace: allow multiple pmem...") Cc: Reported-by: Dave Jiang Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 45980ba59916718a8d8e9a4c1a27a4ac187e515c Author: Dan Williams Date: Fri Apr 6 11:25:38 2018 -0700 libnvdimm, dimm: fix dpa reservation vs uninitialized label area commit c31898c8c711f2bbbcaebe802a55827e288d875a upstream. At initialization time the 'dimm' driver caches a copy of the memory device's label area and reserves address space for each of the namespaces defined. However, as can be seen below, the reservation occurs even when the index blocks are invalid: nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0 nvdimm nmem0: config data size: 131072 nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad Gate dpa reservation on the presence of valid index blocks. Cc: Fixes: 4a826c83db4e ("libnvdimm: namespace indices: read and validate") Reported-by: Krzysztof Rusocki Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit a43d8e0ee79d748ad98129254a7969abf88f11b6 Author: Chris Chiu Date: Tue Mar 20 15:36:40 2018 +0800 tpm: self test failure should not cause suspend to fail commit 0803d7befa15cab5717d667a97a66214d2a4c083 upstream. The Acer Acer Veriton X4110G has a TPM device detected as: tpm_tis 00:0b: 1.2 TPM (device-id 0xFE, rev-id 71) After the first S3 suspend, the following error appears during resume: tpm tpm0: A TPM error(38) occurred continue selftest Any following S3 suspend attempts will now fail with this error: tpm tpm0: Error (38) sending savestate before suspend PM: Device 00:0b failed to suspend: error 38 Error 38 is TPM_ERR_INVALID_POSTINIT which means the TPM is not in the correct state. This indicates that the platform BIOS is not sending the usual TPM_Startup command during S3 resume. >From this point onwards, all TPM commands will fail. The same issue was previously reported on Foxconn 6150BK8MC and Sony Vaio TX3. The platform behaviour seems broken here, but we should not break suspend/resume because of this. When the unexpected TPM state is encountered, set a flag to skip the affected TPM_SaveState command on later suspends. Cc: stable@vger.kernel.org Signed-off-by: Chris Chiu Signed-off-by: Daniel Drake Link: http://lkml.kernel.org/r/CAB4CAwfSCvj1cudi+MWaB5g2Z67d9DwY1o475YOZD64ma23UiQ@mail.gmail.com Link: https://lkml.org/lkml/2011/3/28/192 Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591031 Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit c1edd3b19f302c85ed6fe11e753eda674c100524 Author: Frederic Barrat Date: Tue Apr 3 15:54:02 2018 +0200 cxl: Fix possible deadlock when processing page faults from cxllib commit ad7b4e8022b9864c075fe71e1328b1d25cad82f6 upstream. cxllib_handle_fault() is called by an external driver when it needs to have the host resolve page faults for a buffer. The buffer can cover several pages and VMAs. The function iterates over all the pages used by the buffer, based on the page size of the VMA. To ensure some stability while processing the faults, the thread T1 grabs the mm->mmap_sem semaphore with read access (R1). However, when processing a page fault for a single page, one of the underlying functions, copro_handle_mm_fault(), also grabs the same semaphore with read access (R2). So the thread T1 takes the semaphore twice. If another thread T2 tries to access the semaphore in write mode W1 (say, because it wants to allocate memory and calls 'brk'), then that thread T2 will have to wait because there's a reader (R1). If the thread T1 is processing a new page at that time, it won't get an automatic grant at R2, because there's now a writer thread waiting (T2). And we have a deadlock. The timeline is: 1. thread T1 owns the semaphore with read access R1 2. thread T2 requests write access W1 and waits 3. thread T1 requests read access R2 and waits The fix is for the thread T1 to release the semaphore R1 once it got the information it needs from the current VMA. The address space/VMAs could evolve while T1 iterates over the full buffer, but in the unlikely case where T1 misses a page, the external driver will raise a new page fault when retrying the memory access. Fixes: 3ced8d730063 ("cxl: Export library to support IBM XSL") Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Frederic Barrat Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 156b45ed22073e7b8aa753abb8ff4462e1cbe47f Author: Maxime Jayat Date: Thu Feb 22 12:39:55 2018 +0100 dmaengine: at_xdmac: fix rare residue corruption commit c5637476bbf9bb86c7f0413b8f4822a73d8d2d07 upstream. Despite the efforts made to correctly read the NDA and CUBC registers, the order in which the registers are read could sometimes lead to an inconsistent state. Re-using the timeline from the comments, this following timing of registers reads could lead to reading NDA with value "@desc2" and CUBC with value "MAX desc1": INITD -------- ------------ |____________________| _______________________ _______________ NDA @desc2 \/ @desc3 _______________________/\_______________ __________ ___________ _______________ CUBC 0 \/ MAX desc1 \/ MAX desc2 __________/\___________/\_______________ | | | | Events:(1)(2) (3)(4) (1) check_nda = @desc2 (2) initd = 1 (3) cur_ubc = MAX desc1 (4) cur_nda = @desc2 This is allowed by the condition ((check_nda == cur_nda) && initd), despite cur_ubc and cur_nda being in the precise state we don't want. This error leads to incorrect residue computation. Fix it by inversing the order in which CUBC and INITD are read. This makes sure that NDA and CUBC are always read together either _before_ INITD goes to 0 or _after_ it is back at 1. The case where NDA is read before INITD is at 0 and CUBC is read after INITD is back at 1 will be rejected by check_nda and cur_nda being different. Fixes: 53398f488821 ("dmaengine: at_xdmac: fix residue corruption") Cc: stable@vger.kernel.org Signed-off-by: Maxime Jayat Acked-by: Ludovic Desroches Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit e99ca1ee070dea0fcda75c127cab1b8b85189224 Author: Bart Van Assche Date: Mon Feb 12 09:50:25 2018 -0800 IB/srp: Fix completion vector assignment algorithm commit 3a148896b24adf8688dc0c59af54531931677a40 upstream. Ensure that cv_end is equal to ibdev->num_comp_vectors for the NUMA node with the highest index. This patch improves spreading of RDMA channels over completion vectors and thereby improves performance, especially on systems with only a single NUMA node. This patch drops support for the comp_vector login parameter by ignoring the value of that parameter since I have not found a good way to combine support for that parameter and automatic spreading of RDMA channels over completion vectors. Fixes: d92c0da71a35 ("IB/srp: Add multichannel support") Reported-by: Alexander Schmid Signed-off-by: Bart Van Assche Cc: Alexander Schmid Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit fe71b03e698325f5361d187fc7534b202f2a8404 Author: Bart Van Assche Date: Fri Feb 23 14:09:24 2018 -0800 IB/srp: Fix srp_abort() commit e68088e78d82920632eba112b968e49d588d02a2 upstream. Before commit e494f6a72839 ("[SCSI] improved eh timeout handler") it did not really matter whether or not abort handlers like srp_abort() called .scsi_done() when returning another value than SUCCESS. Since that commit however this matters. Hence only call .scsi_done() when returning SUCCESS. Signed-off-by: Bart Van Assche Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 0bb5579128e65d8c4334f90ca47cfe49e90187f7 Author: Takashi Iwai Date: Mon Apr 2 22:41:43 2018 +0200 ALSA: pcm: Fix UAF at PCM release via PCM timer access commit a820ccbe21e8ce8e86c39cd1d3bc8c7d1cbb949b upstream. The PCM runtime object is created and freed dynamically at PCM stream open / close time. This is tracked via substream->runtime, and it's cleared at snd_pcm_detach_substream(). The runtime object assignment is protected by PCM open_mutex, so for all PCM operations, it's safely handled. However, each PCM substream provides also an ALSA timer interface, and user-space can access to this while closing a PCM substream. This may eventually lead to a UAF, as snd_pcm_timer_resolution() tries to access the runtime while clearing it in other side. Fortunately, it's the only concurrent access from the PCM timer, and it merely reads runtime->timer_resolution field. So, we can avoid the race by reordering kfree() and wrapping the substream->runtime clearance with the corresponding timer lock. Reported-by: syzbot+8e62ff4e07aa2ce87826@syzkaller.appspotmail.com Cc: Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit d3b14a66e14bb6b248dcab04c294aceacff68c0b Author: Bart Van Assche Date: Thu Mar 1 14:00:29 2018 -0800 RDMA/rxe: Fix an out-of-bounds read commit a6544a624c3ff92a64e4aca3931fa064607bd3da upstream. This patch avoids that KASAN reports the following when the SRP initiator calls srp_post_send(): ================================================================== BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x5c4/0x980 [rdma_rxe] Read of size 8 at addr ffff880066606e30 by task 02-mq/1074 CPU: 2 PID: 1074 Comm: 02-mq Not tainted 4.16.0-rc3-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x85/0xc7 print_address_description+0x65/0x270 kasan_report+0x231/0x350 rxe_post_send+0x5c4/0x980 [rdma_rxe] srp_post_send.isra.16+0x149/0x190 [ib_srp] srp_queuecommand+0x94d/0x1670 [ib_srp] scsi_dispatch_cmd+0x1c2/0x550 [scsi_mod] scsi_queue_rq+0x843/0xa70 [scsi_mod] blk_mq_dispatch_rq_list+0x143/0xac0 blk_mq_do_dispatch_ctx+0x1c5/0x260 blk_mq_sched_dispatch_requests+0x2bf/0x2f0 __blk_mq_run_hw_queue+0xdb/0x160 __blk_mq_delay_run_hw_queue+0xba/0x100 blk_mq_run_hw_queue+0xf2/0x190 blk_mq_sched_insert_request+0x163/0x2f0 blk_execute_rq+0xb0/0x130 scsi_execute+0x14e/0x260 [scsi_mod] scsi_probe_and_add_lun+0x366/0x13d0 [scsi_mod] __scsi_scan_target+0x18a/0x810 [scsi_mod] scsi_scan_target+0x11e/0x130 [scsi_mod] srp_create_target+0x1522/0x19e0 [ib_srp] kernfs_fop_write+0x180/0x210 __vfs_write+0xb1/0x2e0 vfs_write+0xf6/0x250 SyS_write+0x99/0x110 do_syscall_64+0xee/0x2b0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 The buggy address belongs to the page: page:ffffea0001998180 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880066606d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 ffff880066606d80: f1 00 f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2 >ffff880066606e00: f2 00 00 00 00 00 f2 f2 f2 f3 f3 f3 f3 00 00 00 ^ ffff880066606e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880066606f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Bart Van Assche Cc: Moni Shoua Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 28ce82e3c8b116f09ea539e15bbb9f0f09f5fa32 Author: Leon Romanovsky Date: Tue Mar 13 15:29:24 2018 +0200 RDMA/mlx5: Protect from NULL pointer derefence commit 4289861d88d6c7b5e4c8cc7fe2ad6cdf0cdfc366 upstream. The mlx5_ib_alloc_implicit_mr() can fail to acquire pages and the returned mr pointer won't be valid. Ensure that it is not error prior to access. Cc: # 4.10 Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Reported-by: Noa Osherovich Signed-off-by: Leon Romanovsky Signed-off-by: Doug Ledford Signed-off-by: Greg Kroah-Hartman commit b140d94688707e42b75cad3b020563a706101abe Author: Roland Dreier Date: Tue Apr 3 15:33:01 2018 -0700 RDMA/ucma: Don't allow setting RDMA_OPTION_IB_PATH without an RDMA device commit 8435168d50e66fa5eae01852769d20a36f9e5e83 upstream. Check to make sure that ctx->cm_id->device is set before we use it. Otherwise userspace can trigger a NULL dereference by doing RDMA_USER_CM_CMD_SET_OPTION on an ID that is not bound to a device. Cc: Reported-by: Signed-off-by: Roland Dreier Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 79fbd052ea63e26ee739dbdd32b45bf124eb7923 Author: Mikulas Patocka Date: Sun Aug 13 22:45:08 2017 -0400 dm crypt: limit the number of allocated pages commit 5059353df86e2573ccd9d43fd9d9396dcec47ca2 upstream. dm-crypt consumes an excessive amount memory when the user attempts to zero a dm-crypt device with "blkdiscard -z". The command "blkdiscard -z" calls the BLKZEROOUT ioctl, it goes to the function __blkdev_issue_zeroout, __blkdev_issue_zeroout sends a large amount of write bios that contain the zero page as their payload. For each incoming page, dm-crypt allocates another page that holds the encrypted data, so when processing "blkdiscard -z", dm-crypt tries to allocate the amount of memory that is equal to the size of the device. This can trigger OOM killer or cause system crash. Fix this by limiting the amount of memory that dm-crypt allocates to 2% of total system memory. This limit is system-wide and is divided by the number of active dm-crypt devices and each device receives an equal share. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit e7793f2a2ac85f912690a69f7964907e31731ae7 Author: Theodore Ts'o Date: Fri Mar 30 20:04:11 2018 -0400 ext4: add extra checks to ext4_xattr_block_get() commit 54dd0e0a1b255f115f8647fc6fb93273251b01b9 upstream. Add explicit checks in ext4_xattr_block_get() just in case the e_value_offs and e_value_size fields in the the xattr block are corrupted in memory after the buffer_verified bit is set on the xattr block. Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 9703952178f1412b7a74e170298d58e96c53152f Author: Theodore Ts'o Date: Fri Mar 30 20:00:56 2018 -0400 ext4: add bounds checking to ext4_xattr_find_entry() commit 9496005d6ca4cf8f5ee8f828165a8956872dc59d upstream. Add some paranoia checks to make sure we don't stray beyond the end of the valid memory region containing ext4 xattr entries while we are scanning for a match. Also rename the function to xattr_find_entry() since it is static and thus only used in fs/ext4/xattr.c Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman commit 598e04ae2fc402c364d3f0ee76599783149f573f Author: Theodore Ts'o Date: Fri Mar 30 15:42:25 2018 -0400 ext4: move call to ext4_error() into ext4_xattr_check_block() commit de05ca8526796c7e9f7c7282b7f89a818af19818 upstream. Refactor the call to EXT4_ERROR_INODE() into ext4_xattr_check_block(). This simplifies the code, and fixes a problem where not all callers of ext4_xattr_check_block() were not resulting in ext4_error() getting called when the xattr block is corrupted. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit b2623d8166563a19bee0cc8292e4ec951a42a6da Author: Theodore Ts'o Date: Thu Mar 29 22:10:35 2018 -0400 ext4: don't allow r/w mounts if metadata blocks overlap the superblock commit 18db4b4e6fc31eda838dd1c1296d67dbcb3dc957 upstream. If some metadata block, such as an allocation bitmap, overlaps the superblock, it's very likely that if the file system is mounted read/write, the results will not be pretty. So disallow r/w mounts for file systems corrupted in this particular way. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 26dbb30c58ffb85bc015bd5e58831483d50f7d18 Author: Theodore Ts'o Date: Thu Mar 29 22:10:31 2018 -0400 ext4: always initialize the crc32c checksum driver commit a45403b51582a87872927a3e0fc0a389c26867f1 upstream. The extended attribute code now uses the crc32c checksum for hashing purposes, so we should just always always initialize it. We also want to prevent NULL pointer dereferences if one of the metadata checksum features is enabled after the file sytsem is originally mounted. This issue has been assigned CVE-2018-1094. https://bugzilla.kernel.org/show_bug.cgi?id=199183 https://bugzilla.redhat.com/show_bug.cgi?id=1560788 Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 8e0e94683f8449f4e83b4b563b80eb9c76b9e18f Author: Theodore Ts'o Date: Thu Mar 29 21:56:09 2018 -0400 ext4: fail ext4_iget for root directory if unallocated commit 8e4b5eae5decd9dfe5a4ee369c22028f90ab4c44 upstream. If the root directory has an i_links_count of zero, then when the file system is mounted, then when ext4_fill_super() notices the problem and tries to call iput() the root directory in the error return path, ext4_evict_inode() will try to free the inode on disk, before all of the file system structures are set up, and this will result in an OOPS caused by a NULL pointer dereference. This issue has been assigned CVE-2018-1092. https://bugzilla.kernel.org/show_bug.cgi?id=199179 https://bugzilla.redhat.com/show_bug.cgi?id=1560777 Reported-by: Wen Xu Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit a57eb14b740e6175aff8b8941bec628403992dfa Author: Eric Biggers Date: Thu Mar 29 14:31:42 2018 -0400 ext4: limit xattr size to INT_MAX commit ce3fd194fcc6fbdc00ce095a852f22df97baa401 upstream. ext4 isn't validating the sizes of xattrs where the value of the xattr is stored in an external inode. This is problematic because ->e_value_size is a u32, but ext4_xattr_get() returns an int. A very large size is misinterpreted as an error code, which ext4_get_acl() translates into a bogus ERR_PTR() for which IS_ERR() returns false, causing a crash. Fix this by validating that all xattrs are <= INT_MAX bytes. This issue has been assigned CVE-2018-1095. https://bugzilla.kernel.org/show_bug.cgi?id=199185 https://bugzilla.redhat.com/show_bug.cgi?id=1560793 Reported-by: Wen Xu Signed-off-by: Eric Biggers Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Signed-off-by: Greg Kroah-Hartman commit 5058b70d2118ee0304a2f3124946c4089b06abde Author: Eryu Guan Date: Thu Mar 22 11:41:25 2018 -0400 ext4: protect i_disksize update by i_data_sem in direct write path commit 73fdad00b208b139cf43f3163fbc0f67e4c6047c upstream. i_disksize update should be protected by i_data_sem, by either taking the lock explicitly or by using ext4_update_i_disksize() helper. But the i_disksize updates in ext4_direct_IO_write() are not protected at all, which may be racing with i_disksize updates in writeback path in delalloc buffer write path. This is found by code inspection, and I didn't hit any i_disksize corruption due to this bug. Thanks to Jan Kara for catching this bug and suggesting the fix! Reported-by: Jan Kara Suggested-by: Jan Kara Signed-off-by: Eryu Guan Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit bd499f55384969d53c825fa7c3a0ae21c972c945 Author: Theodore Ts'o Date: Mon Feb 19 14:16:47 2018 -0500 ext4: don't update checksum of new initialized bitmaps commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream. When reading the inode or block allocation bitmap, if the bitmap needs to be initialized, do not update the checksum in the block group descriptor. That's because we're not set up to journal those changes. Instead, just set the verified bit on the bitmap block, so that it's not necessary to validate the checksum. When a block or inode allocation actually happens, at that point the checksum will be calculated, and update of the bg descriptor block will be properly journalled. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 73297f13a003b97e37a5648bbbecd45323cc0e24 Author: Theodore Ts'o Date: Sun Feb 18 23:45:18 2018 -0500 ext4: pass -ESHUTDOWN code to jbd2 layer commit fb7c02445c497943e7296cd3deee04422b63acb8 upstream. Previously the jbd2 layer assumed that a file system check would be required after a journal abort. In the case of the deliberate file system shutdown, this should not be necessary. Allow the jbd2 layer to distinguish between these two cases by using the ESHUTDOWN errno. Also add proper locking to __journal_abort_soft(). Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 09439481998aefbe1bd6ae067079e738ba4a7a3f Author: Theodore Ts'o Date: Sun Feb 18 23:16:28 2018 -0500 ext4: eliminate sleep from shutdown ioctl commit a6d9946bb925293fda9f5ed6d33d8580b001f006 upstream. The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to avoid some races (that are now fixed), but in fact it introduced its own race. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 7ebcea25968219b4ed9b27e5708e2b96c1718d86 Author: Theodore Ts'o Date: Sun Feb 18 22:07:36 2018 -0500 ext4: shutdown should not prevent get_write_access commit 576d18ed60f5465110087c5e0eb1010de13e374d upstream. The ext4 forced shutdown flag needs to prevent new handles from being started, but it needs to allow existing handles to complete. So the forced shutdown flag should not force ext4_journal_get_write_access to fail. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 867175f944857448b83cad1370e532a1ffda9e47 Author: Theodore Ts'o Date: Mon Feb 19 12:22:53 2018 -0500 jbd2: if the journal is aborted then don't allow update of the log tail commit 85e0c4e89c1b864e763c4e3bb15d0b6d501ad5d9 upstream. This updates the jbd2 superblock unnecessarily, and on an abort we shouldn't truncate the log. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6151a5a45fc42271493c8805b533d8349954b130 Author: Mikulas Patocka Date: Wed Mar 21 12:42:25 2018 -0400 block: use 32-bit blk_status_t on Alpha commit 6e2fb22103b99c26ae30a46512abe75526d8e4c9 upstream. Early alpha processors cannot write a single byte or word; they read 8 bytes, modify the value in registers and write back 8 bytes. The type blk_status_t is defined as one byte, it is often written asynchronously by I/O completion routines, this asynchronous modification can corrupt content of nearby bytes if these nearby bytes can be written simultaneously by another CPU. - one example of such corruption is the structure dm_io where "blk_status_t status" is written by an asynchronous completion routine and "atomic_t io_count" is modified synchronously - another example is the structure dm_buffer where "unsigned hold_count" is modified synchronously from process context and "blk_status_t write_error" is modified asynchronously from bio completion routine This patch fixes the bug by changing the type blk_status_t to 32 bits if we are on Alpha and if we are compiling for a processor that doesn't have the byte-word-extension. Signed-off-by: Mikulas Patocka Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 7044bf9ef6c87cb49c71a9aa1f219e6a1cce5148 Author: Hans de Goede Date: Mon Feb 19 14:20:46 2018 +0100 extcon: intel-cht-wc: Set direction and drv flags for V5 boost GPIO commit ad49aee401dd1997ec71360df6e51a91ad3cf516 upstream. Sometimes (firmware bug?) the V5 boost GPIO is not configured as output by the BIOS, leading to the 5V boost convertor being permanently on, Explicitly set the direction and drv flags rather then inheriting them from the firmware to fix this. Fixes: 585cb239f4de ("extcon: intel-cht-wc: Disable external 5v boost ...") Cc: stable@vger.kernel.org Reviewed-by: Andy Shevchenko Signed-off-by: Hans de Goede Signed-off-by: Chanwoo Choi Signed-off-by: Greg Kroah-Hartman commit b0afd9d1cb5aeb116edaaf967321444b588849cf Author: Theodore Ts'o Date: Sat Feb 25 18:21:33 2017 -0400 random: use a tighter cap in credit_entropy_bits_safe() commit 9f886f4d1d292442b2f22a0a33321eae821bde40 upstream. This fixes a harmless UBSAN where root could potentially end up causing an overflow while bumping the entropy_total field (which is ignored once the entropy pool has been initialized, and this generally is completed during the boot sequence). This is marginal for the stable kernel series, but it's a really trivial patch, and it fixes UBSAN warning that might cause security folks to get overly excited for no reason. Signed-off-by: Theodore Ts'o Reported-by: Chen Feng Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 439e8b2dcab1963d077af193ebdabb586b8e14be Author: Aniruddha Banerjee Date: Wed Mar 28 19:12:00 2018 +0530 irqchip/gic: Take lock when updating irq type commit aa08192a254d362a4d5317647a81de6996961aef upstream. Most MMIO GIC register accesses use a 1-hot bit scheme that avoids requiring any form of locking. This isn't true for the GICD_ICFGRn registers, which require a RMW sequence. Unfortunately, we seem to be missing a lock for these particular accesses, which could result in a race condition if changing the trigger type on any two interrupts within the same set of 16 interrupts (and thus controlled by the same CFGR register). Introduce a private lock in the GIC common comde for this particular case, making it cover both GIC implementations in one go. Cc: stable@vger.kernel.org Signed-off-by: Aniruddha Banerjee [maz: updated changelog] Signed-off-by: Marc Zyngier Signed-off-by: Greg Kroah-Hartman commit 2836377857633db5a7d93106fad90a642e296eec Author: Mika Westerberg Date: Fri Mar 9 13:17:01 2018 +0300 thunderbolt: Prevent crash when ICM firmware is not running commit ea9d7bb798900096f26c585957d6ad9c532417e6 upstream. On Lenovo ThinkPad Yoga 370 (and possibly some other Lenovo models as well) the Thunderbolt host controller sometimes comes up in such way that the ICM firmware is not running properly. This is most likely an issue in BIOS/firmware but as side-effect driver crashes the kernel due to NULL pointer dereference: BUG: unable to handle kernel NULL pointer dereference at 0000000000000980 IP: pci_write_config_dword+0x5/0x20 Call Trace: pcie2cio_write+0x3b/0x70 [thunderbolt] icm_driver_ready+0x168/0x260 [thunderbolt] ? tb_ctl_start+0x50/0x70 [thunderbolt] tb_domain_add+0x73/0xf0 [thunderbolt] nhi_probe+0x182/0x300 [thunderbolt] local_pci_probe+0x42/0xa0 ? pci_match_device+0xd9/0x100 pci_device_probe+0x146/0x1b0 driver_probe_device+0x315/0x480 ... Instead of crashing update the driver to bail out gracefully if we encounter such situation. Fixes: f67cf491175a ("thunderbolt: Add support for Internal Connection Manager (ICM)") Reported-by: Jordan Glover Signed-off-by: Mika Westerberg Acked-by: Yehezkel Bernat Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 5ae695df59e1c3519b36d31671ac01697b606240 Author: Mika Westerberg Date: Tue Dec 19 12:44:56 2017 +0300 thunderbolt: Resume control channel after hibernation image is created commit f2a659f7d8d5da803836583aa16df06bdf324252 upstream. The driver misses implementation of PM hook that undoes what ->freeze_noirq() does after the hibernation image is created. This means the control channel is not resumed properly and the Thunderbolt bus becomes useless in later stages of hibernation (when the image is stored or if the operation fails). Fix this by pointing ->thaw_noirq to driver nhi_resume_noirq(). This makes sure the control channel is resumed properly. Fixes: 23dd5bb49d98 ("thunderbolt: Add suspend/hibernate support") Signed-off-by: Mika Westerberg Reviewed-by: Andy Shevchenko Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 7a4a66c504fbfae4428ac7a3784ee5017ae28682 Author: Mika Westerberg Date: Thu Jan 18 20:27:47 2018 +0300 thunderbolt: Serialize PCIe tunnel creation with PCI rescan commit a03e828915c00ed0ea5aa40647c81472cfa7a984 upstream. We need to make sure a new PCIe tunnel is not created in a middle of previous PCI rescan because otherwise the rescan code might find too much and fail to reconfigure devices properly. This is important when native PCIe hotplug is used. In BIOS assisted hotplug there should be no such issue. Fixes: f67cf491175a ("thunderbolt: Add support for Internal Connection Manager (ICM)") Signed-off-by: Mika Westerberg Reviewed-by: Andy Shevchenko Cc: Bjorn Helgaas Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6f40f6ee22b91c913b2891c26943eb37bfceb50a Author: Mika Westerberg Date: Fri Nov 24 17:51:12 2017 +0300 thunderbolt: Wait a bit longer for ICM to authenticate the active NVM commit e4be8c9b6a512e274cb6bbac4ac869d73880a8b3 upstream. Sometimes during cold boot ICM has not yet authenticated the active NVM image leading to timeout and failing the driver probe. Allow ICM to take some more time and increase the timeout to 3 seconds before we give up. While there fix icm_firmware_init() to return the real error code without overwriting it with -ENODEV. Fixes: f67cf491175a ("thunderbolt: Add support for Internal Connection Manager (ICM)") Signed-off-by: Mika Westerberg Reviewed-by: Andy Shevchenko Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 73b969f6a92062a5c1547541a2608d1755a7954b Author: Liam Girdwood Date: Tue Mar 27 12:04:04 2018 +0100 ASoC: topology: Fix kcontrol name string handling commit 267e2c6fd7ca3d4076d20f9d52d49dc91addfe9d upstream. Fix the topology kcontrol string handling so that string pointer references are strdup()ed instead of being copied. This fixes issues with kcontrol templates on the stack or ones that are freed. Remember and free the strings too when topology is unloaded. Signed-off-by: Liam Girdwood Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 7e23ef535073ef8e1684f7b6e967b55dab055bd0 Author: James Kelly Date: Mon Mar 19 21:29:50 2018 +1100 ASoC: ssm2602: Replace reg_default_raw with reg_default commit a01df75ce737951ad13a08d101306e88c3f57cb2 upstream. SSM2602 driver is broken on recent kernels (at least since 4.9). User space applications such as amixer or alsamixer get EIO when attempting to access codec controls via the relevant IOCTLs. Root cause of these failures is the regcache_hw_init function in drivers/base/regmap/regcache.c, which prevents regmap cache initalization from the reg_defaults_raw element of the regmap_config structure when registers are write only. It also disables the regmap cache entirely when all registers are write only or volatile as is the case for the SSM2602 driver. Using the reg_defaults element of the regmap_config structure rather than the reg_defaults_raw element to initalize the regmap cache avoids the logic in the regcache_hw_init function entirely. It also makes this driver consistent with other ASoC codec drivers, as this driver was the ONLY codec driver that used the reg_defaults_raw element to initalize the cache. Tested on Digilent Zybo Z7 development board which has a SSM2603 codec chip connected to a Xilinx Zynq SoC. Signed-off-by: James Kelly Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 1b3d2e7a34095edb8ff104961bdc403f8d36e405 Author: Sean Wang Date: Fri Feb 9 02:07:59 2018 +0800 soc: mediatek: fix the mistaken pointer accessed when subdomains are added commit 73ce2ce129783813e1ebc37d2c757fe5e0fab1ef upstream. Fix the pointer to struct scp_subdomian not being moved forward when each sub-domain is expected to be iteratively added through pm_genpd_add_subdomain call. Cc: stable@vger.kernel.org Fixes: 53fddb1a66dd ("soc: mediatek: reduce code duplication of scpsys_probe across all SoCs") Reported-by: Weiyi Lu Signed-off-by: Sean Wang Signed-off-by: Matthias Brugger Signed-off-by: Greg Kroah-Hartman commit 3f306336cdee80f830c46a768c08daf90a31e972 Author: Aaron Ma Date: Mon Jan 8 10:41:41 2018 +0800 HID: core: Fix size as type u32 commit 6de0b13cc0b4ba10e98a9263d7a83b940720b77a upstream. When size is negative, calling memset will make segment fault. Declare the size as type u32 to keep memset safe. size in struct hid_report is unsigned, fix return type of hid_report_len to u32. Cc: stable@vger.kernel.org Signed-off-by: Aaron Ma Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit f671ac7a5317edddcc8d4593bae3ea1dde5334ed Author: Aaron Ma Date: Sat Feb 3 23:57:15 2018 +0800 HID: Fix hid_report_len usage commit 3064a03b94e60388f0955fcc29f3e8a978d28f75 upstream. Follow the change of return type u32 of hid_report_len, fix all the types of variables those get the return value of hid_report_len to u32, and all other code already uses u32. Cc: stable@vger.kernel.org Signed-off-by: Aaron Ma Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 25b6ee378dc43fff21e778f1a21cb7e6dffa6e1a Author: Nicholas Piggin Date: Tue Apr 10 21:49:33 2018 +1000 powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops commit 3b8070335f751aac9f1526ae2e012e6f5b8b0f21 upstream. The OPAL NVRAM driver does not sleep in case it gets OPAL_BUSY or OPAL_BUSY_EVENT from firmware, which causes large scheduling latencies, and various lockup errors to trigger (again, BMC reboot can cause it). Fix this by converting it to the standard form OPAL_BUSY loop that sleeps. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Depends-on: 34dd25de9fe3 ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 7c854f2e1ff08edd35988acd733c6b1ad0e95e8b Author: Nicholas Piggin Date: Tue Apr 10 21:49:31 2018 +1000 powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops commit 34dd25de9fe3f60bfdb31b473bf04b28262d0896 upstream. This is the start of an effort to tidy up and standardise all the delays. Existing loops have a range of delay/sleep periods from 1ms to 20ms, and some have no delay. They all loop forever except rtc, which times out after 10 retries, and that uses 10ms delays. So use 10ms as our standard delay. The OPAL maintainer agrees 10ms is a reasonable starting point. The idea is to use the same recipe everywhere, once this is proven to work then it will be documented as an OPAL API standard. Then both firmware and OS can agree, and if a particular call needs something else, then that can be documented with reasoning. This is not the end-all of this effort, it's just a relatively easy change that fixes some existing high latency delays. There should be provision for standardising timeouts and/or interruptible loops where possible, so non-fatal firmware errors don't cause hangs. Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Cc: Nathan Chancellor Cc: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit a55d2c9d42f9212740911f4c8d4c3de12154ef42 Author: Thiago Jung Bauermann Date: Thu Mar 29 16:05:43 2018 -0300 powerpc/kexec_file: Fix error code when trying to load kdump kernel commit bf8a1abc3ddbd6e9a8312ea7d96e5dd89c140f18 upstream. kexec_file_load() on powerpc doesn't support kdump kernels yet, so it returns -ENOTSUPP in that case. I've recently learned that this errno is internal to the kernel and isn't supposed to be exposed to userspace. Therefore, change to -EOPNOTSUPP which is defined in an uapi header. This does indeed make kexec-tools happier. Before the patch, on ppc64le: # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz kexec_file_load failed: Unknown error 524 After the patch: # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz kexec_file_load failed: Operation not supported Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()") Cc: stable@vger.kernel.org # v4.10+ Reported-by: Dave Young Signed-off-by: Thiago Jung Bauermann Reviewed-by: Simon Horman Reviewed-by: Dave Young Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit fa99a3470e917a7a83b0648b53d2e1a5d71f3b29 Author: Naveen N. Rao Date: Wed Jan 17 17:52:24 2018 +0530 powerpc/kprobes: Fix call trace due to incorrect preempt count commit e6e133c47e6bd4d5dac05b35d06634a8e5648615 upstream. Michael Ellerman reported the following call trace when running ftracetest: BUG: using __this_cpu_write() in preemptible [00000000] code: ftracetest/6178 caller is opt_pre_handler+0xc4/0x110 CPU: 1 PID: 6178 Comm: ftracetest Not tainted 4.15.0-rc7-gcc6x-gb2cd1df #1 Call Trace: [c0000000f9ec39c0] [c000000000ac4304] dump_stack+0xb4/0x100 (unreliable) [c0000000f9ec3a00] [c00000000061159c] check_preemption_disabled+0x15c/0x170 [c0000000f9ec3a90] [c000000000217e84] opt_pre_handler+0xc4/0x110 [c0000000f9ec3af0] [c00000000004cf68] optimized_callback+0x148/0x170 [c0000000f9ec3b40] [c00000000004d954] optinsn_slot+0xec/0x10000 [c0000000f9ec3e30] [c00000000004bae0] kretprobe_trampoline+0x0/0x10 This is showing up since OPTPROBES is now enabled with CONFIG_PREEMPT. trampoline_probe_handler() considers itself to be a special kprobe handler for kretprobes. In doing so, it expects to be called from kprobe_handler() on a trap, and re-enables preemption before returning a non-zero return value so as to suppress any subsequent processing of the trap by the kprobe_handler(). However, with optprobes, we don't deal with special handlers (we ignore the return code) and just try to re-enable preemption causing the above trace. To address this, modify trampoline_probe_handler() to not be special. The only additional processing done in kprobe_handler() is to emulate the instruction (in this case, a 'nop'). We adjust the value of regs->nip for the purpose and delegate the job of re-enabling preemption and resetting current kprobe to the probe handlers (kprobe_handler() or optimized_callback()). Fixes: 8a2d71a3f273 ("powerpc/kprobes: Disable preemption before invoking probe handler for optprobes") Cc: stable@vger.kernel.org # v4.15+ Reported-by: Michael Ellerman Signed-off-by: Naveen N. Rao Acked-by: Ananth N Mavinakayanahalli Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 3df05fcf89114e9011b899dcb7437a50a978bac1 Author: Nicholas Piggin Date: Thu Mar 22 20:41:46 2018 +1000 powerpc/64: Fix smp_wmb barrier definition use use lwsync consistently commit 0bfdf598900fd62869659f360d3387ed80eb71cf upstream. asm/barrier.h is not always included after asm/synch.h, which meant it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would be eieio when it should be lwsync. kernel/time/hrtimer.c is one case. __SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in to where it's used. Previously with my small simulator config, 377 instances of eieio in the tree. After this patch there are 55. Fixes: 46d075be585e ("powerpc: Optimise smp_wmb") Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 1699bd03742d89732f2d15eb8e4a544cfe4da1fa Author: Paul Mackerras Date: Thu Feb 16 16:03:39 2017 +1100 powerpc/64: Call H_REGISTER_PROC_TBL when running as a HPT guest on POWER9 commit dbfcf3cb9c681aa0c5d0bb46068f98d5b1823dd3 upstream. On POWER9, since commit cc3d2940133d ("powerpc/64: Enable use of radix MMU under hypervisor on POWER9", 2017-01-30), we set both the radix and HPT bits in the client-architecture-support (CAS) vector, which tells the hypervisor that we can do either radix or HPT. According to PAPR, if we use this combination we are promising to do a H_REGISTER_PROC_TBL hcall later on to let the hypervisor know whether we are doing radix or HPT. We currently do this call if we are doing radix but not if we are doing HPT. If the hypervisor is able to support both radix and HPT guests, it would be entitled to defer allocation of the HPT until the H_REGISTER_PROC_TBL call, and to fail any attempts to create HPTEs until the H_REGISTER_PROC_TBL call. Thus we need to do a H_REGISTER_PROC_TBL call when we are doing HPT; otherwise we may crash at boot time. This adds the code to call H_REGISTER_PROC_TBL in this case, before we attempt to create any HPT entries using H_ENTER. Fixes: cc3d2940133d ("powerpc/64: Enable use of radix MMU under hypervisor on POWER9") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Paul Mackerras Reviewed-by: Suraj Jitindar Singh Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit f4eff13a278079ab219a182a354e8571e0bdc468 Author: Nicholas Piggin Date: Thu Apr 5 15:50:49 2018 +1000 powerpc/64s: Fix dt_cpu_ftrs to have restore_cpu clear unwanted LPCR bits commit a57ac411832384eb93df4bfed2bf644c4089720e upstream. Presently the dt_cpu_ftrs restore_cpu will only add bits to the LPCR for secondaries, but some bits must be removed (e.g., UPRT for HPT). Not clearing these bits on secondaries causes checkstops when booting with disable_radix. restore_cpu can not just set LPCR, because it is also called by the idle wakeup code which relies on opal_slw_set_reg to restore the value of LPCR, at least on P8 which does not save LPCR to stack in the idle code. Fix this by including a mask of bits to clear from LPCR as well, which is used by restore_cpu. This is a little messy now, but it's a minimal fix that can be backported. Longer term, the idle SPR save/restore code can be reworked to completely avoid calls to restore_cpu, then restore_cpu would be able to unconditionally set LPCR to match boot processor environment. Fixes: 5a61ef74f269f ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit c3baeca67d85e698201fca324368c3a704667e80 Author: Nicholas Piggin Date: Tue Mar 27 01:02:33 2018 +1000 powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write() commit 741de617661794246f84a21a02fc5e327bffc9ad upstream. opal_nvram_write currently just assumes success if it encounters an error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO on other errors instead. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin Reviewed-by: Vasant Hegde Acked-by: Stewart Smith Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 693b03f9b185f14dded38a934c744a60e99363c3 Author: Gustavo A. R. Silva Date: Mon Feb 19 11:11:13 2018 -0600 CIFS: fix sha512 check in cifs_crypto_secmech_release commit 70e80655f58e17a2e38e577e1b4fa7a8c99619a0 upstream. It seems this is a copy-paste error and that the proper variable to use in this particular case is _sha512_ instead of _md5_. Addresses-Coverity-ID: 1465358 ("Copy-paste error") Fixes: 1c6614d229e7 ("CIFS: add sha512 secmech") Signed-off-by: Gustavo A. R. Silva Reviewed-by: Aurelien Aptel CC: Stable Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 7a55d160b730da05a7ea82c6c78b8e8d012a74cc Author: Aurelien Aptel Date: Fri Feb 16 19:19:28 2018 +0100 CIFS: add sha512 secmech commit 5fcd7f3f966f37f3f9a215af4cc1597fe338d0d5 upstream. * prepare for SMB3.11 pre-auth integrity * enable sha512 when SMB311 is enabled in Kconfig * add sha512 as a soft dependency Signed-off-by: Aurelien Aptel Signed-off-by: Steve French CC: Stable Reviewed-by: Ronnie Sahlberg Signed-off-by: Greg Kroah-Hartman commit 0910e2804f2ecb081bd87c26cdb8f4ff502c39f2 Author: Aurelien Aptel Date: Fri Feb 16 19:19:27 2018 +0100 CIFS: refactor crypto shash/sdesc allocation&free commit 82fb82be05585426405667dd5f0510aa953ba439 upstream. shash and sdesc and always allocated and freed together. * abstract this in new functions cifs_alloc_hash() and cifs_free_hash(). * make smb2/3 crypto allocation independent from each other. Signed-off-by: Aurelien Aptel Signed-off-by: Steve French Reviewed-by: Ronnie Sahlberg CC: Stable Signed-off-by: Greg Kroah-Hartman commit fd5cc02cbef93b2391d295d81f16706fdc01bcea Author: Jean Delvare Date: Wed Apr 11 18:05:34 2018 +0200 i2c: i801: Restore configuration at shutdown commit f7f6d915a10f7f2bce17e3b1b7d3376562395a28 upstream. On some systems, the BIOS expects certain SMBus register values to match the hardware defaults. Restore these configuration registers at shutdown time to avoid confusing the BIOS. This avoids hard-locking such systems upon reboot. Signed-off-by: Jean Delvare Tested-by: Jason Andryuk Signed-off-by: Wolfram Sang Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 44ff2389a840c76434c943544f24909783366d90 Author: Jean Delvare Date: Wed Apr 11 18:03:31 2018 +0200 i2c: i801: Save register SMBSLVCMD value only once commit a086bb8317303dd74725dca933b9b29575159382 upstream. Saving the original value of register SMBSLVCMD in i801_enable_host_notify() doesn't work, because this function is called not only at probe time but also at resume time. Do it in i801_probe() instead, so that the saved value is not overwritten at resume time. Signed-off-by: Jean Delvare Fixes: 22e94bd6779e ("i2c: i801: store and restore the SLVCMD register at load and unload") Reviewed-by: Benjamin Tissoires Tested-by: Jason Andryuk Signed-off-by: Wolfram Sang Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Greg Kroah-Hartman commit d6b3a5c87d45ba6af5597879c98024de1ff0f327 Author: Aaron Ma Date: Mon Jan 8 10:41:40 2018 +0800 HID: i2c-hid: fix size check and type usage commit ac75a041048b8c1f7418e27621ca5efda8571043 upstream. When convert char array with signed int, if the inbuf[x] is negative then upper bits will be set to 1. Fix this by using u8 instead of char. ret_size has to be at least 3, hid_input_report use it after minus 2 bytes. Cc: stable@vger.kernel.org Signed-off-by: Aaron Ma Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 70dbed63a96d1f5ebc779c87f00340d352b3b0c5 Author: Steve French Date: Sat Mar 31 18:13:38 2018 -0500 smb3: Fix root directory when server returns inode number of zero commit 7ea884c77e5c97f1e0a1a422d961d27f78ca2745 upstream. Some servers return inode number zero for the root directory, which causes ls to display incorrect data (missing "." and ".."). If the server returns zero for the inode number of the root directory, fake an inode number for it. Signed-off-by: Steve French Reviewed-by: Pavel Shilovsky CC: Stable Signed-off-by: Greg Kroah-Hartman commit bf895b2a637dc4bbe818cabf1fa2c91a2f346dfd Author: Ronnie Sahlberg Date: Tue Feb 20 12:45:21 2018 +1100 fix smb3-encryption breakage when CONFIG_DEBUG_SG=y commit 262916bc69faf90104aa784d55e10760a4199594 upstream. We can not use the standard sg_set_buf() fucntion since when CONFIG_DEBUG_SG=y this adds a check that will BUG_ON for cifs.ko when we pass it an object from the stack. Create a new wrapper smb2_sg_set_buf() which avoids doing that particular check and use it for smb3 encryption instead. Signed-off-by: Ronnie Sahlberg Signed-off-by: Steve French CC: Stable Signed-off-by: Greg Kroah-Hartman commit fdbd795405203c22c063690f00cf79a23389f1df Author: Ronnie Sahlberg Date: Tue Feb 13 15:42:30 2018 +1100 cifs: fix memory leak in SMB2_open() commit b7a73c84eb96dabd6bb8e9d7c56f796d83efee8e upstream. Signed-off-by: Ronnie Sahlberg Signed-off-by: Steve French CC: Stable Signed-off-by: Greg Kroah-Hartman commit 59d3a952e4f3d505f9444e86db069081323351c7 Author: Felipe Balbi Date: Mon Mar 26 13:14:47 2018 +0300 usb: dwc3: gadget: never call ->complete() from ->ep_queue() commit c91815b596245fd7da349ecc43c8def670d2269e upstream. This is a requirement which has always existed but, somehow, wasn't reflected in the documentation and problems weren't found until now when Tuba Yavuz found a possible deadlock happening between dwc3 and f_hid. She described the situation as follows: spin_lock_irqsave(&hidg->write_spinlock, flags); // first acquire /* we our function has been disabled by host */ if (!hidg->req) { free_ep_req(hidg->in_ep, hidg->req); goto try_again; } [...] status = usb_ep_queue(hidg->in_ep, hidg->req, GFP_ATOMIC); => [...] => usb_gadget_giveback_request => f_hidg_req_complete => spin_lock_irqsave(&hidg->write_spinlock, flags); // second acquire Note that this happens because dwc3 would call ->complete() on a failed usb_ep_queue() due to failed Start Transfer command. This is, anyway, a theoretical situation because dwc3 currently uses "No Response Update Transfer" command for Bulk and Interrupt endpoints. It's still good to make this case impossible to happen even if the "No Reponse Update Transfer" command is changed. Reported-by: Tuba Yavuz Signed-off-by: Felipe Balbi Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit 093dcb929c8e8344ce148ba11e0313ad1890553d Author: Thinh Nguyen Date: Mon Mar 19 13:07:35 2018 -0700 usb: dwc3: pci: Properly cleanup resource commit cabdf83dadfb3d83eec31e0f0638a92dbd716435 upstream. Platform device is allocated before adding resources. Make sure to properly cleanup on error case. Cc: Fixes: f1c7e7108109 ("usb: dwc3: convert to pcim_enable_device()") Signed-off-by: Thinh Nguyen Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 30e9a1cddc4d4ef881515ff128c2c707ef81573d Author: Roger Quadros Date: Tue Feb 27 12:54:37 2018 +0200 usb: dwc3: prevent setting PRTCAP to OTG from debugfs commit daaecc6541d014dca073473ec8a4120c0babbeb4 upstream. We don't support PRTCAP == OTG yet, so prevent user from setting it via debugfs. Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly") Cc: # v4.12+ Signed-off-by: Roger Quadros Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit f7f9187a110e58c0b9988db972a0631d06c7633f Author: Zhengjun Xing Date: Wed Mar 21 13:29:42 2018 +0800 USB:fix USB3 devices behind USB3 hubs not resuming at hibernate thaw commit 64627388b50158fd24d6ad88132525b95a5ef573 upstream. USB3 hubs don't support global suspend. USB3 specification 10.10, Enhanced SuperSpeed hubs only support selective suspend and resume, they do not support global suspend/resume where the hub downstream facing ports states are not affected. When system enters hibernation it first enters freeze process where only the root hub enters suspend, usb_port_suspend() is not called for other devices, and suspend status flags are not set for them. Other devices are expected to suspend globally. Some external USB3 hubs will suspend the downstream facing port at global suspend. These devices won't be resumed at thaw as the suspend status flag is not set. A USB3 removable hard disk connected through a USB3 hub that won't resume at thaw will fail to synchronize SCSI cache, return “cmd cmplt err -71” error, and needs a 60 seconds timeout which causing system hang for 60s before the USB host reset the port for the USB3 removable hard disk to recover. Fix this by always calling usb_port_suspend() during freeze for USB3 devices. Signed-off-by: Zhengjun Xing Cc: stable Signed-off-by: Greg Kroah-Hartman commit 96dc465173a1f790e805246206aee3d18770f614 Author: Yavuz, Tuba Date: Fri Mar 23 17:00:38 2018 +0000 USB: gadget: f_midi: fixing a possible double-free in f_midi commit 7fafcfdf6377b18b2a726ea554d6e593ba44349f upstream. It looks like there is a possibility of a double-free vulnerability on an error path of the f_midi_set_alt function in the f_midi driver. If the path is feasible then free_ep_req gets called twice: req->complete = f_midi_complete; err = usb_ep_queue(midi->out_ep, req, GFP_ATOMIC); => ... usb_gadget_giveback_request => f_midi_complete (CALLBACK) (inside f_midi_complete, for various cases of status) free_ep_req(ep, req); // first kfree if (err) { ERROR(midi, "%s: couldn't enqueue request: %d\n", midi->out_ep->name, err); free_ep_req(midi->out_ep, req); // second kfree return err; } The double-free possibility was introduced with commit ad0d1a058eac ("usb: gadget: f_midi: fix leak on failed to enqueue out requests"). Found by MOXCAFE tool. Signed-off-by: Tuba Yavuz Fixes: ad0d1a058eac ("usb: gadget: f_midi: fix leak on failed to enqueue out requests") Acked-by: Felipe Balbi Cc: stable Signed-off-by: Greg Kroah-Hartman commit a2b540651d8c1938e07493e18e781e3fc609728c Author: Mika Westerberg Date: Mon Feb 12 13:55:23 2018 +0300 ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status() commit 13d3047c81505cc0fb9bdae7810676e70523c8bf upstream. Mike Lothian reported that plugging in a USB-C device does not work properly in his Dell Alienware system. This system has an Intel Alpine Ridge Thunderbolt controller providing USB-C functionality. In these systems the USB controller (xHCI) is hotplugged whenever a device is connected to the port using ACPI-based hotplug. The ACPI description of the root port in question is as follows: Device (RP01) { Name (_ADR, 0x001C0000) Device (PXSX) { Name (_ADR, 0x02) Method (_RMV, 0, NotSerialized) { // ... } } Here _ADR 0x02 means device 0, function 2 on the bus under root port (RP01) but that seems to be incorrect because device 0 is the upstream port of the Alpine Ridge PCIe switch and it has no functions other than 0 (the bridge itself). When we get ACPI Notify() to the root port resulting from connecting a USB-C device, Linux tries to read PCI_VENDOR_ID from device 0, function 2 which of course always returns 0xffffffff because there is no such function and we never find the device. In Windows this works fine. Now, since we get ACPI Notify() to the root port and not to the PXSX device we should actually start our scan from there as well and not from the non-existent PXSX device. Fix this by checking presence of the slot itself (function 0) if we fail to do that otherwise. While there use pci_bus_read_dev_vendor_id() in get_slot_status(), which is the recommended way to read Device and Vendor IDs of devices on PCI buses. Link: https://bugzilla.kernel.org/show_bug.cgi?id=198557 Reported-by: Mike Lothian Signed-off-by: Mika Westerberg Signed-off-by: Bjorn Helgaas Reviewed-by: Rafael J. Wysocki Cc: Greg Kroah-Hartman Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit d6e98387b2e994e68d3b7a6abd5c6ceef62ad5e3 Author: Hans de Goede Date: Mon Mar 19 18:01:45 2018 +0100 ACPI / video: Add quirk to force acpi-video backlight on Samsung 670Z5E commit bbf038618a24d72e2efc19146ef421bb1e1eda1a upstream. Just like many other Samsung models, the 670Z5E needs to use the acpi-video backlight interface rather then the native one for backlight control to work, add a quirk for this. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1557060 Cc: All applicable Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 3dac1fe2719d6aa894768848fecf6a2837356534 Author: Dan Carpenter Date: Thu Feb 8 10:23:44 2018 +0300 regmap: Fix reversed bounds check in regmap_raw_write() commit f00e71091ab92eba52122332586c6ecaa9cd1a56 upstream. We're supposed to be checking that "val_len" is not too large but instead we check if it is smaller than the max. The only function affected would be regmap_i2c_smbus_i2c_write() in drivers/base/regmap/regmap-i2c.c. Strangely that function has its own limit check which returns an error if (count >= I2C_SMBUS_BLOCK_MAX) so it doesn't look like it has ever been able to do anything except return an error. Fixes: c335931ed9d2 ("regmap: Add raw_write/read checks for max_raw_write/read sizes") Signed-off-by: Dan Carpenter Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 666d1084c13d919c1ace5cbcd594eb355e8c4686 Author: Jason Andryuk Date: Wed Feb 28 07:23:23 2018 -0500 xen-netfront: Fix hang on device removal commit c2d2e6738a209f0f9dffa2dc8e7292fc45360d61 upstream. A toolstack may delete the vif frontend and backend xenstore entries while xen-netfront is in the removal code path. In that case, the checks for xenbus_read_driver_state would return XenbusStateUnknown, and xennet_remove would hang indefinitely. This hang prevents system shutdown. xennet_remove must be able to handle XenbusStateUnknown, and netback_changed must also wake up the wake_queue for that state as well. Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module") Signed-off-by: Jason Andryuk Cc: Eduardo Otubo Reviewed-by: Boris Ostrovsky Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman commit e7b00dc28275a1883749d319081484e3306dc265 Author: Jason Andryuk Date: Mon Mar 19 12:58:04 2018 -0400 x86/xen: Delay get_cpu_cap until stack canary is established commit 36104cb9012a82e73c32a3b709257766b16bcd1d upstream. Commit 2cc42bac1c79 ("x86-64/Xen: eliminate W+X mappings") introduced a call to get_cpu_cap, which is fstack-protected. This is works on x86-64 as commit 4f277295e54c ("x86/xen: init %gs very early to avoid page faults with stack protector") ensures the stack protector is configured, but it it did not cover x86-32. Delay calling get_cpu_cap until after xen_setup_gdt has initialized the stack canary. Without this, a 32bit PV machine crashes early in boot. (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.6.6-xc x86_64 debug=n Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e019:[<00000000c10362f8>] And the PV kernel IP corresponds to init_scattered_cpuid_features 0xc10362f8 <+24>: mov %gs:0x14,%eax Fixes 2cc42bac1c79 ("x86-64/Xen: eliminate W+X mappings") Signed-off-by: Jason Andryuk Reviewed-by: Boris Ostrovsky Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman commit fcd054c733cf702f192e01fa7e78596cc07df771 Author: Kieran Bingham Date: Mon Mar 26 09:29:17 2018 -0400 media: vsp1: Fix BRx conditional path in WPF commit 639fa43d59e5a41ca8c55592cd5c1021fea2ab83 upstream. When a BRx is provided by a pipeline, the WPF must determine the master layer. Currently the condition to check this identifies pipe->bru || pipe->num_inputs > 1. The code then moves on to dereference pipe->bru, thus the check fails static analysers on the possibility that pipe->num_inputs could be greater than 1 without pipe->bru being set. The reality is that the pipeline must have a BRx to support more than one input, thus this could never cause a fault - however it also identifies that the num_inputs > 1 check is redundant. Remove the redundant check - and always configure the master layer appropriately when we have a BRx configured in our pipeline. Fixes: 6134148f6098 ("v4l: vsp1: Add support for the BRS entity") Cc: stable@vger.kernel.org Suggested-by: Mauro Carvalho Chehab Signed-off-by: Kieran Bingham Reviewed-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 2fb28b075f80136b17c8aebb2c006a3502c628db Author: Hans Verkuil Date: Mon Feb 12 06:45:32 2018 -0500 media: vivid: check if the cec_adapter is valid commit ed356f110403f6acc64dcbbbfdc38662ab9b06c2 upstream. If CEC is not enabled for the vivid driver, then the adap pointer is NULL and 'adap->phys_addr' will fail. Cc: # for v4.12 and up Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 9864a1ef86791897b0f6c215001deae472c05635 Author: Hans Verkuil Date: Sun Feb 25 06:55:32 2018 -0500 media: atomisp_fops.c: disable atomisp_compat_ioctl32 commit 57e6b6f2303e596a6493078b53be14b789e7b79f upstream. The atomisp_compat_ioctl32() code has problems. This patch disables the compat_ioctl32 support until those issues have been fixed. Contact Sakari or me for more details. Signed-off-by: Hans Verkuil Cc: # for v4.12 and up Signed-off-by: Sakari Ailus Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 9629964f032ca5289a371da9f1288ad2655fb6b7 Author: Jarkko Nikula Date: Tue Mar 20 10:27:50 2018 +0200 spi: Fix unregistration of controller with fixed SPI bus number commit 613bd1ea387bb48b7c9a71a0bb451ac15cfbbc01 upstream. Commit 9b61e302210e (spi: Pick spi bus number from Linux idr or spi alias) ceased to unregister SPI buses with fixed bus numbers. Moreover this is visible only if CONFIG_SPI_DEBUG=y is set or when trying to re-register the same SPI controller. rmmod spi_pxa2xx_platform (with CONFIG_SPI_DEBUG=y): [ 26.788362] spi_master spi1: attempting to delete unregistered controller [spi1] modprobe spi_pxa2xx_platform: [ 37.883137] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:19.0/pxa2xx-spi.12/spi_master/spi1' [ 37.894984] CPU: 1 PID: 1467 Comm: modprobe Not tainted 4.16.0-rc4+ #21 [ 37.902384] Call Trace: ... [ 38.122680] kobject_add_internal failed for spi1 with -EEXIST, don't try to register things with the same name in the same directory. [ 38.136154] WARNING: CPU: 1 PID: 1467 at lib/kobject.c:238 kobject_add_internal+0x2a5/0x2f0 ... [ 38.513817] pxa2xx-spi pxa2xx-spi.12: problem registering spi master [ 38.521036] pxa2xx-spi: probe of pxa2xx-spi.12 failed with error -17 Fix this by not returning immediately from spi_unregister_controller() if idr_find() doesn't find controller with given ID/bus number. It finds only those controllers that were registered with dynamic SPI bus numbers. Only conditional cleanup between dynamic and fixed bus numbers is to remove allocated IDR. Fixes: 9b61e302210e (spi: Pick spi bus number from Linux idr or spi alias) Cc: stable@vger.kernel.org Signed-off-by: Jarkko Nikula Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit e4ff723039dcacff20d834f6fb710ec18074b677 Author: Maxime Chevallier Date: Fri Mar 2 15:55:09 2018 +0100 spi: Fix scatterlist elements size in spi_map_buf commit ce99319a182fe766be67f96338386f3ec73e321c upstream. When SPI transfers can be offloaded using DMA, the SPI core need to build a scatterlist to make sure that the buffer to be transferred is dma-able. This patch fixes the scatterlist entry size computation in the case where the maximum acceptable scatterlist entry supported by the DMA controller is less than PAGE_SIZE, when the buffer is vmalloced. For each entry, the actual size is given by the minimum between the desc_len (which is the max buffer size supported by the DMA controller) and the remaining buffer length until we cross a page boundary. Fixes: 65598c13fd66 ("spi: Fix per-page mapping of unaligned vmalloc-ed buffer") Signed-off-by: Maxime Chevallier Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit faddb17685f9f0507dcd28a3a4af3c63d5765407 Author: Eugen Hristev Date: Tue Feb 27 12:25:07 2018 +0200 spi: atmel: init FIFOs before spi enable commit 9581329eff9db72ab4fbb46a594fd7fdda3c51b0 upstream. The datasheet recommends initializing FIFOs before SPI enable. If we do not do it like this, there may be a strange behavior. We noticed that DMA does not work properly with FIFOs if we do not clear them beforehand or enable them before SPIEN. Signed-off-by: Eugen Hristev Acked-by: Nicolas Ferre Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 38866e866121454b5c6e2fc7eefe82fec97e810e Author: Santiago Esteban Date: Thu Jan 18 15:38:47 2018 +0100 ARM: dts: at91: sama5d4: fix pinctrl compatible string commit 9a06757dcc8509c162ac00488c8c82fc98e04227 upstream. The compatible string is incorrect. Add atmel,sama5d3-pinctrl since it's the appropriate compatible string. Remove the atmel,at91rm9200-pinctrl compatible string, this fallback is useless, there are too many changes. Signed-off-by: Santiago Esteban Signed-off-by: Ludovic Desroches Cc: stable@vger.kernel.org #v3.18 Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit c57b7e1a150e840004b628d2eae3c3d3385fd25a Author: Marek Szyprowski Date: Fri Mar 2 17:07:42 2018 +0100 ARM: dts: exynos: Fix IOMMU support for GScaler devices on Exynos5250 commit 6f4870753f29edf7dc39444246f9e39987b8b158 upstream. The proper name for the property, which assign given device to IOMMU is 'iommus', not 'iommu'. Fix incorrect name and let all GScaler devices to be properly handled when IOMMU support is enabled. Reported-by: Andrzej Hajda Signed-off-by: Marek Szyprowski Fixes: 6cbfdd73a94f ("ARM: dts: add sysmmu nodes for exynos5250") Cc: # v4.8+ Signed-off-by: Krzysztof Kozlowski Signed-off-by: Greg Kroah-Hartman commit 838ea3802e9dbe96ac1c3faa03a93a399d09f8ff Author: Nicolas Ferre Date: Tue Mar 13 16:20:05 2018 +0100 ARM: dts: at91: at91sam9g25: fix mux-mask pinctrl property commit e8fd0adf105e132fd84545997bbef3d5edc2c9c1 upstream. There are only 19 PIOB pins having primary names PB0-PB18. Not all of them have a 'C' function. So the pinctrl property mask ends up being the same as the other SoC of the at91sam9x5 series. Reported-by: Marek Sieranski Signed-off-by: Nicolas Ferre Cc: # v3.8+ Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 958d6e41888f8659db70418a8203c16d021833cc Author: Sean Wang Date: Fri Feb 23 18:16:26 2018 +0800 arm: dts: mt7623: fix USB initialization fails on bananapi-r2 commit 0629a01920c0f8a3f825361b24863d760610884a upstream. Fix that USB initialization fails as below runtime log is present during booting on bananapi-r2 board by adding missing regulators the USB device requires. Current regulators USB device uses are being updated with the correct ones to reflect real configurations which are all from fixed regulators rather than MT6323 one's output. xhci-mtk 1a1c0000.usb: 1a1c0000.usb supply vbus not found, using dummy regulator xhci-mtk 1a240000.usb: 1a240000.usb supply vbus not found, using dummy regulator Cc: stable@vger.kernel.org Fixes: f4ff257cd160 ("arm: dts: mt7623: add support for Bananapi R2 (BPI-R2) board") Signed-off-by: Sean Wang [mb: update kernel log in commit message] Signed-off-by: Matthias Brugger Signed-off-by: Greg Kroah-Hartman commit 2106cd34635e5aa95d759e414e38d5fd216ce8ec Author: Marek Szyprowski Date: Wed Mar 21 10:45:05 2018 +0100 ARM: EXYNOS: Fix coupled CPU idle freeze on Exynos4210 commit a7480dbcf983c31d8111f864c848e8a75116a87d upstream. Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regardless of the changes made in the mentioned commits. It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly. Tested on Exynos4210-based Trats and Origen boards. Signed-off-by: Marek Szyprowski CC: # v4.13+ Acked-by: Marc Zyngier Acked-by: Bartlomiej Zolnierkiewicz Signed-off-by: Krzysztof Kozlowski Signed-off-by: Greg Kroah-Hartman commit 326e61ce9761d71098c7001e51dafec067f50424 Author: David Lechner Date: Sun Dec 3 16:04:53 2017 -0600 ARM: dts: da850-lego-ev3: Fix battery voltage gpio commit c5a88cd2e1c508868922bafa0a5c3365986b98e5 upstream. This fixes the battery voltage monitoring gpio-hog settings. When the gpio is low, it turns off the battery voltage to the ADC chip. However, this needs to be on all of the time so that we can monitor battery voltage. Also, there was a typo that prevented pinmuxing from working correctly. Signed-off-by: David Lechner Signed-off-by: Sekhar Nori Signed-off-by: Greg Kroah-Hartman commit 8f1a2803e4c2163c013816f7b41c85b91c9124c9 Author: Marc Zyngier Date: Fri Mar 23 14:57:09 2018 +0000 KVM: arm/arm64: vgic-its: Fix potential overrun in vgic_copy_lpi_list commit 7d8b44c54e0c7c8f688e3a07f17e6083f849f01f upstream. vgic_copy_lpi_list() parses the LPI list and picks LPIs targeting a given vcpu. We allocate the array containing the intids before taking the lpi_list_lock, which means we can have an array size that is not equal to the number of LPIs. This is particularly obvious when looking at the path coming from vgic_enable_lpis, which is not a command, and thus can run in parallel with commands: vcpu 0: vcpu 1: vgic_enable_lpis its_sync_lpi_pending_table vgic_copy_lpi_list intids = kmalloc_array(irq_count) MAPI(lpi targeting vcpu 0) list_for_each_entry(lpi_list_head) intids[i++] = irq->intid; At that stage, we will happily overrun the intids array. Boo. An easy fix is is to break once the array is full. The MAPI command will update the config anyway, and we won't miss a thing. We also make sure that lpi_list_count is read exactly once, so that further updates of that value will not affect the array bound check. Cc: stable@vger.kernel.org Fixes: ccb1d791ab9e ("KVM: arm64: vgic-its: Fix pending table sync") Reviewed-by: Andre Przywara Reviewed-by: Eric Auger Signed-off-by: Marc Zyngier Signed-off-by: Greg Kroah-Hartman commit 8fdbba69cb3424b8ff470da40980d4edfec19769 Author: Jerome Brunet Date: Fri Mar 2 14:44:36 2018 +0100 ARM64: dts: meson: reduce odroid-c2 eMMC maximum rate commit c04ffa71ff491220cac28f55237c9aad379a8656 upstream. Different modules maybe installed by the user on the eMMC connector of the odroid-c2. While the red modules are working without an issue, it seems some black modules (apparently Samsung based) are having issue at 200MHz While the tuning algorithm introduced in v4.14 enables high speed modes on every other tested designs, it seems a problem remains for this particular combination of board and eMMC module. Lowering the maximum frequency of the eMMC on this board until we can figure out a better solution. Fixes: d341ca88eead ("mmc: meson-gx: rework tuning function") Suggested-by: Ellie Reeves Signed-off-by: Jerome Brunet Cc: stable@vger.kernel.org Signed-off-by: Kevin Hilman Signed-off-by: Greg Kroah-Hartman commit 7732495c599c0b98ff17097637f239e1ef108e69 Author: Felipe Balbi Date: Mon Mar 26 13:14:46 2018 +0300 usb: gadget: udc: core: update usb_ep_queue() documentation commit eaa358c7790338d83bb6a31258bdc077de120414 upstream. Mention that ->complete() should never be called from within usb_ep_queue(). Signed-off-by: Felipe Balbi Cc: stable Signed-off-by: Greg Kroah-Hartman commit aea6c0b4aee80367bd091ebd2132b6db533c1ed1 Author: Chen-Yu Tsai Date: Fri Jan 19 17:25:41 2018 +0800 phy: allwinner: sun4i-usb: poll vbus changes on A23/A33 when driving VBUS commit d7119224bfe6e8efbf821a52db7da9530d790f07 upstream. The AXP223 PMIC, like the AXP221, does not generate VBUS change interrupts when N_VBUSEN is used to drive VBUS for the OTG port on the board. This was not noticed until recently, as most A23/A33 boards use a GPIO pin that does not support interrupts for OTG ID detection. This forces the driver to use polling. However the A33-OlinuXino uses a pin that does support interrupts, so the driver uses them. However the VBUS interrupt never fires, and the driver never gets to update the VBUS status. This results in musb timing out waiting for VBUS to rise. This was worked around for the AXP221 by resorting to polling changes in commit 91d96f06a760 ("phy-sun4i-usb: Add workaround for missing Vbus det interrupts on A31"). This patch adds the A23 and A33 to the list of SoCs that need the workaround. Fixes: fc1f45ed3043 ("phy-sun4i-usb: Add support for the usb-phys on the sun8i-a33 SoC") Fixes: 123dfdbcfaf5 ("phy-sun4i-usb: Add support for the usb-phys on the sun8i-a23 SoC") Cc: # 4.3.x: 68dbc2ce77bb phy-sun4i-usb: Use of_match_node to get model specific config data Cc: # 4.3.x: 5cf700ac9d50 phy: phy-sun4i-usb: Fix optional gpios failing probe Cc: # 4.3.x: 04e59a0211ff phy-sun4i-usb: Fix irq free conditions to match request conditions Cc: # 4.3.x: 91d96f06a760 phy-sun4i-usb: Add workaround for missing Vbus det interrupts on A31 Cc: # 4.3.x Signed-off-by: Chen-Yu Tsai Acked-by: Maxime Ripard Signed-off-by: Kishon Vijay Abraham I Signed-off-by: Greg Kroah-Hartman Signed-off-by: Kishon Vijay Abraham I commit 334d8f201ef5f9c32ab26fbc868952ffb86dc175 Author: Heinrich Schuchardt Date: Thu Mar 29 10:48:28 2018 -0500 usb: musb: gadget: misplaced out of bounds check commit af6f8529098aeb0e56a68671b450cf74e7a64fcd upstream. musb->endpoints[] has array size MUSB_C_NUM_EPS. We must check array bounds before accessing the array and not afterwards. Signed-off-by: Heinrich Schuchardt Signed-off-by: Bin Liu Cc: stable Signed-off-by: Greg Kroah-Hartman commit 20eaa393fcd32fe48f2e13ce649e90a48a446fe6 Author: Vlastimil Babka Date: Fri Apr 13 15:35:38 2018 -0700 mm, slab: reschedule cache_reap() on the same CPU commit a9f2a846f0503e7d729f552e3ccfe2279010fe94 upstream. cache_reap() is initially scheduled in start_cpu_timer() via schedule_delayed_work_on(). But then the next iterations are scheduled via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND. Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") there is no guarantee the future iterations will run on the originally intended cpu, although it's still preferred. I was able to demonstrate this with /sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also happen due to migrating timers in nohz context. As a result, some cpu's would be calling cache_reap() more frequently and others never. This patch uses schedule_delayed_work_on() with the current cpu when scheduling the next iteration. Link: http://lkml.kernel.org/r/20180411070007.32225-1-vbabka@suse.cz Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Signed-off-by: Vlastimil Babka Acked-by: Pekka Enberg Acked-by: Christoph Lameter Cc: Joonsoo Kim Cc: David Rientjes Cc: Tejun Heo Cc: Lai Jiangshan Cc: John Stultz Cc: Thomas Gleixner Cc: Stephen Boyd Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 703eee654360679660c813a591176c7bf4016254 Author: Eric Biggers Date: Fri Apr 13 15:35:30 2018 -0700 ipc/shm: fix use-after-free of shm file via remap_file_pages() commit 3f05317d9889ab75c7190dcd39491d2a97921984 upstream. syzbot reported a use-after-free of shm_file_data(file)->file->f_op in shm_get_unmapped_area(), called via sys_remap_file_pages(). Unfortunately it couldn't generate a reproducer, but I found a bug which I think caused it. When remap_file_pages() is passed a full System V shared memory segment, the memory is first unmapped, then a new map is created using the ->vm_file. Between these steps, the shm ID can be removed and reused for a new shm segment. But, shm_mmap() only checks whether the ID is currently valid before calling the underlying file's ->mmap(); it doesn't check whether it was reused. Thus it can use the wrong underlying file, one that was already freed. Fix this by making the "outer" shm file (the one that gets put in ->vm_file) hold a reference to the real shm file, and by making __shm_open() require that the file associated with the shm ID matches the one associated with the "outer" file. Taking the reference to the real shm file is needed to fully solve the problem, since otherwise sfd->file could point to a freed file, which then could be reallocated for the reused shm ID, causing the wrong shm segment to be mapped (and without the required permission checks). Commit 1ac0b6dec656 ("ipc/shm: handle removed segments gracefully in shm_mmap()") almost fixed this bug, but it didn't go far enough because it didn't consider the case where the shm ID is reused. The following program usually reproduces this bug: #include #include #include #include int main() { int is_parent = (fork() != 0); srand(getpid()); for (;;) { int id = shmget(0xF00F, 4096, IPC_CREAT|0700); if (is_parent) { void *addr = shmat(id, NULL, 0); usleep(rand() % 50); while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0)); } else { usleep(rand() % 50); shmctl(id, IPC_RMID, NULL); } } } It causes the following NULL pointer dereference due to a 'struct file' being used while it's being freed. (I couldn't actually get a KASAN use-after-free splat like in the syzbot report. But I think it's possible with this bug; it would just take a more extraordinary race...) BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16a7c95 #189 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 RIP: 0010:d_inode include/linux/dcache.h:519 [inline] RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724 [...] Call Trace: file_accessed include/linux/fs.h:2063 [inline] shmem_mmap+0x25/0x40 mm/shmem.c:2149 call_mmap include/linux/fs.h:1789 [inline] shm_mmap+0x34/0x80 ipc/shm.c:465 call_mmap include/linux/fs.h:1789 [inline] mmap_region+0x309/0x5b0 mm/mmap.c:1712 do_mmap+0x294/0x4a0 mm/mmap.c:1483 do_mmap_pgoff include/linux/mm.h:2235 [inline] SYSC_remap_file_pages mm/mmap.c:2853 [inline] SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769 do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ebiggers@google.com: add comment] Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation") Signed-off-by: Eric Biggers Acked-by: Kirill A. Shutemov Acked-by: Davidlohr Bueso Cc: Manfred Spraul Cc: "Eric W . Biederman" Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c25ef6220facd0ec7df7f3af0a5f5f24ab9659e8 Author: Takashi Iwai Date: Fri Apr 13 15:35:13 2018 -0700 resource: fix integer overflow at reallocation commit 60bb83b81169820c691fbfa33a6a4aef32aa4b0b upstream. We've got a bug report indicating a kernel panic at booting on an x86-32 system, and it turned out to be the invalid PCI resource assigned after reallocation. __find_resource() first aligns the resource start address and resets the end address with start+size-1 accordingly, then checks whether it's contained. Here the end address may overflow the integer, although resource_contains() still returns true because the function validates only start and end address. So this ends up with returning an invalid resource (start > end). There was already an attempt to cover such a problem in the commit 47ea91b4052d ("Resource: fix wrong resource window calculation"), but this case is an overseen one. This patch adds the validity check of the newly calculated resource for avoiding the integer overflow problem. Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739 Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de Fixes: 23c570a67448 ("resource: ability to resize an allocated resource") Signed-off-by: Takashi Iwai Reported-by: Michael Henders Tested-by: Michael Henders Reviewed-by: Andrew Morton Cc: Ram Pai Cc: Bjorn Helgaas Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit f659e7e79f52657b88ea0f9ed473fe0958d31996 Author: Andrew Morton Date: Tue Apr 10 16:34:41 2018 -0700 fs/reiserfs/journal.c: add missing resierfs_warning() arg commit 9ad553abe66f8be3f4755e9fa0a6ba137ce76341 upstream. One use of the reiserfs_warning() macro in journal_init_dev() is missing a parameter, causing the following warning: REISERFS warning (device loop0): journal_init_dev: Cannot open '%s': %i journal_init_dev: This also causes a WARN_ONCE() warning in the vsprintf code, and then a panic if panic_on_warn is set. Please remove unsupported %/ in format string WARNING: CPU: 1 PID: 4480 at lib/vsprintf.c:2138 format_decode+0x77f/0x830 lib/vsprintf.c:2138 Kernel panic - not syncing: panic_on_warn set ... Just add another string argument to the macro invocation. Addresses https://syzkaller.appspot.com/bug?id=0627d4551fdc39bf1ef5d82cd9eef587047f7718 Link: http://lkml.kernel.org/r/d678ebe1-6f54-8090-df4c-b9affad62293@infradead.org Signed-off-by: Randy Dunlap Reported-by: Tested-by: Randy Dunlap Acked-by: Jeff Mahoney Cc: Alexander Viro Cc: Jan Kara Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6337067b2ab66c91b18b895c2399139b28f13fb7 Author: Kees Cook Date: Tue Apr 10 16:32:44 2018 -0700 task_struct: only use anon struct under randstruct plugin commit 2cfe0d3009418a132b93d78642a8059a38fe5944 upstream. The original intent for always adding the anonymous struct in task_struct was to make sure we had compiler coverage. However, this caused pathological padding of 40 bytes at the start of task_struct. Instead, move the anonymous struct to being only used when struct layout randomization is enabled. Link: http://lkml.kernel.org/r/20180327213609.GA2964@beast Fixes: 29e48ce87f1e ("task_struct: Allow randomized") Signed-off-by: Kees Cook Reported-by: Peter Zijlstra Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 28cb085f1598956b5b29edb7f220ce9e0d827762 Author: Jérôme Glisse Date: Tue Apr 10 16:28:27 2018 -0700 mm/hmm: hmm_pfns_bad() was accessing wrong struct commit c719547f032d4610c7a20900baacae26d0b1ff3e upstream. The private field of mm_walk struct point to an hmm_vma_walk struct and not to the hmm_range struct desired. Fix to get proper struct pointer. Link: http://lkml.kernel.org/r/20180323005527.758-6-jglisse@redhat.com Signed-off-by: Jérôme Glisse Cc: Evgeny Baskakov Cc: Ralph Campbell Cc: Mark Hairgrove Cc: John Hubbard Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 963722d031e506600b66b91738e2007bf55ac3d0 Author: Jérôme Glisse Date: Tue Apr 10 16:28:15 2018 -0700 mm/hmm: fix header file if/else/endif maze commit b28b08de436a638c82d0cf3dcdbdbad055baf1fc upstream. The #if/#else/#endif for IS_ENABLED(CONFIG_HMM) were wrong. Because of this after multiple include there was multiple definition of both hmm_mm_init() and hmm_mm_destroy() leading to build failure if HMM was enabled (CONFIG_HMM set). Link: http://lkml.kernel.org/r/20180323005527.758-3-jglisse@redhat.com Signed-off-by: Jérôme Glisse Acked-by: Balbir Singh Cc: Andrew Morton Cc: Ralph Campbell Cc: John Hubbard Cc: Evgeny Baskakov Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit e84e6914ccb464ce5cbb99fd832a4673bb119adc Author: Claudio Imbrenda Date: Tue Apr 10 16:29:41 2018 -0700 mm/ksm.c: fix inconsistent accounting of zero pages commit a38c015f3156895b07e71d4e4414289f8a3b2745 upstream. When using KSM with use_zero_pages, we replace anonymous pages containing only zeroes with actual zero pages, which are not anonymous. We need to do proper accounting of the mm counters, otherwise we will get wrong values in /proc and a BUG message in dmesg when tearing down the mm. Link: http://lkml.kernel.org/r/1522931274-15552-1-git-send-email-imbrenda@linux.vnet.ibm.com Fixes: e86c59b1b1 ("mm/ksm: improve deduplication of zero pages with colouring") Signed-off-by: Claudio Imbrenda Reviewed-by: Andrew Morton Cc: Andrea Arcangeli Cc: Minchan Kim Cc: Kirill A. Shutemov Cc: Hugh Dickins Cc: Christian Borntraeger Cc: Gerald Schaefer Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 577b4eb23811dfc8e38924dc476dbc866be74253 Author: Richard Weinberger Date: Sat Mar 3 11:45:54 2018 +0100 ubi: Reject MLC NAND commit b5094b7f135be34630e3ea8a98fa215715d0f29d upstream. While UBI and UBIFS seem to work at first sight with MLC NAND, you will most likely lose all your data upon a power-cut or due to read/write disturb. In order to protect users from bad surprises, refuse to attach to MLC NAND. Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger Acked-by: Boris Brezillon Acked-by: Artem Bityutskiy Signed-off-by: Greg Kroah-Hartman commit 7ade852714ded6f3ede2708e036e13ed4c63335d Author: Romain Izard Date: Mon Jan 29 11:18:20 2018 +0100 ubi: Fix error for write access commit 78a8dfbabbece22bee58ac4cb26cab10e7a19c5d upstream. When opening a device with write access, ubiblock_open returns an error code. Currently, this error code is -EPERM, but this is not the right value. The open function for other block devices returns -EROFS when opening read-only devices with FMODE_WRITE set. When used with dm-verity, the veritysetup userspace tool is expecting EROFS, and refuses to use the ubiblock device. Use -EROFS for ubiblock as well. As a result, veritysetup accepts the ubiblock device as valid. Cc: stable@vger.kernel.org Fixes: 9d54c8a33eec (UBI: R/O block driver on top of UBI volumes) Signed-off-by: Romain Izard Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit bf3fbf54a9aec60eac770b2e69b4762c09082bd1 Author: Richard Weinberger Date: Wed Jan 17 23:15:57 2018 +0100 ubi: fastmap: Don't flush fastmap work on detach commit 29b7a6fa1ec07e8480b0d9caf635a4498a438bf4 upstream. At this point UBI volumes have already been free()'ed and fastmap can no longer access these data structures. Reported-by: Martin Townsend Fixes: 74cdaf24004a ("UBI: Fastmap: Fix memory leaks while closing the WL sub-system") Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 09844df060879db7d7fbec2c22c7f3c92f6b6b57 Author: Richard Weinberger Date: Wed Jan 17 19:12:42 2018 +0100 ubifs: Check ubifs_wbuf_sync() return code commit aac17948a7ce01fb60b9ee6cf902967a47b3ce26 upstream. If ubifs_wbuf_sync() fails we must not write a master node with the dirty marker cleared. Otherwise it is possible that in case of an IO error while syncing we mark the filesystem as clean and UBIFS refuses to recover upon next mount. Cc: Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit f1e90bf95e5503fcedaf59ac52cbdc013068eeb4 Author: George Cherian Date: Fri Mar 23 03:30:31 2018 -0700 cpufreq: CPPC: Use transition_delay_us depending transition_latency commit 3d41386d556db9f720e00de3e11e45f39cb5071c upstream. With commit e948bc8fbee0 (cpufreq: Cap the default transition delay value to 10 ms) the cpufreq was not honouring the delay passed via ACPI (PCCT). Due to which on ARM based platforms using CPPC the cpufreq governor tries to change the frequency of CPUs faster than expected. This leads to continuous error messages like the following. " ACPI CPPC: PCC check channel failed. Status=0 " Earlier (without above commit) the default transition delay was taken form the value passed from PCCT. Use the same value provided by PCCT to set the transition_delay_us. Fixes: e948bc8fbee0 (cpufreq: Cap the default transition delay value to 10 ms) Signed-off-by: George Cherian Cc: 4.14+ # 4.14+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 9427a4aecf2330ba70dd59f194416de3e49042e2 Author: Tejun Heo Date: Tue Feb 13 07:38:08 2018 -0800 tty: make n_tty_read() always abort if hangup is in progress commit 28b0f8a6962a24ed21737578f3b1b07424635c9e upstream. A tty is hung up by __tty_hangup() setting file->f_op to hung_up_tty_fops, which is skipped on ttys whose write operation isn't tty_write(). This means that, for example, /dev/console whose write op is redirected_tty_write() is never actually marked hung up. Because n_tty_read() uses the hung up status to decide whether to abort the waiting readers, the lack of hung-up marking can lead to the following scenario. 1. A session contains two processes. The leader and its child. The child ignores SIGHUP. 2. The leader exits and starts disassociating from the controlling terminal (/dev/console). 3. __tty_hangup() skips setting f_op to hung_up_tty_fops. 4. SIGHUP is delivered and ignored. 5. tty_ldisc_hangup() is invoked. It wakes up the waits which should clear the read lockers of tty->ldisc_sem. 6. The reader wakes up but because tty_hung_up_p() is false, it doesn't abort and goes back to sleep while read-holding tty->ldisc_sem. 7. The leader progresses to tty_ldisc_lock() in tty_ldisc_hangup() and is now stuck in D sleep indefinitely waiting for tty->ldisc_sem. The following is Alan's explanation on why some ttys aren't hung up. http://lkml.kernel.org/r/20171101170908.6ad08580@alans-desktop 1. It broke the serial consoles because they would hang up and close down the hardware. With tty_port that *should* be fixable properly for any cases remaining. 2. The console layer was (and still is) completely broken and doens't refcount properly. So if you turn on console hangups it breaks (as indeed does freeing consoles and half a dozen other things). As neither can be fixed quickly, this patch works around the problem by introducing a new flag, TTY_HUPPING, which is used solely to tell n_tty_read() that hang-up is in progress for the console and the readers should be aborted regardless of the hung-up status of the device. The following is a sample hung task warning caused by this issue. INFO: task agetty:2662 blocked for more than 120 seconds. Not tainted 4.11.3-dbg-tty-lockup-02478-gfd6c7ee-dirty #28 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 0 2662 1 0x00000086 Call Trace: __schedule+0x267/0x890 schedule+0x36/0x80 schedule_timeout+0x23c/0x2e0 ldsem_down_write+0xce/0x1f6 tty_ldisc_lock+0x16/0x30 tty_ldisc_hangup+0xb3/0x1b0 __tty_hangup+0x300/0x410 disassociate_ctty+0x6c/0x290 do_exit+0x7ef/0xb00 do_group_exit+0x3f/0xa0 get_signal+0x1b3/0x5d0 do_signal+0x28/0x660 exit_to_usermode_loop+0x46/0x86 do_syscall_64+0x9c/0xb0 entry_SYSCALL64_slow_path+0x25/0x25 The following is the repro. Run "$PROG /dev/console". The parent process hangs in D state. #include #include #include #include #include #include #include #include #include #include #include #include int main(int argc, char **argv) { struct sigaction sact = { .sa_handler = SIG_IGN }; struct timespec ts1s = { .tv_sec = 1 }; pid_t pid; int fd; if (argc < 2) { fprintf(stderr, "test-hung-tty /dev/$TTY\n"); return 1; } /* fork a child to ensure that it isn't already the session leader */ pid = fork(); if (pid < 0) { perror("fork"); return 1; } if (pid > 0) { /* top parent, wait for everyone */ while (waitpid(-1, NULL, 0) >= 0) ; if (errno != ECHILD) perror("waitpid"); return 0; } /* new session, start a new session and set the controlling tty */ if (setsid() < 0) { perror("setsid"); return 1; } fd = open(argv[1], O_RDWR); if (fd < 0) { perror("open"); return 1; } if (ioctl(fd, TIOCSCTTY, 1) < 0) { perror("ioctl"); return 1; } /* fork a child, sleep a bit and exit */ pid = fork(); if (pid < 0) { perror("fork"); return 1; } if (pid > 0) { nanosleep(&ts1s, NULL); printf("Session leader exiting\n"); exit(0); } /* * The child ignores SIGHUP and keeps reading from the controlling * tty. Because SIGHUP is ignored, the child doesn't get killed on * parent exit and the bug in n_tty makes the read(2) block the * parent's control terminal hangup attempt. The parent ends up in * D sleep until the child is explicitly killed. */ sigaction(SIGHUP, &sact, NULL); printf("Child reading tty\n"); while (1) { char buf[1024]; if (read(fd, buf, sizeof(buf)) < 0) { perror("read"); return 1; } } return 0; } Signed-off-by: Tejun Heo Cc: Alan Cox Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman